API Reference
This section provides detailed API documentation for all modules in LangChain ArangoDB.
Vector Stores
- class langchain_arangodb.vectorstores.arangodb_vector.SearchType(*values)
Bases: str, Enum
Enumerator of the search types.
- VECTOR = 'vector'
- HYBRID = 'hybrid'
- class langchain_arangodb.vectorstores.arangodb_vector.ArangoVector(embedding: Embeddings, embedding_dimension: int, database: StandardDatabase, collection_name: str = 'documents', search_type: SearchType = SearchType.VECTOR, embedding_field: str = 'embedding', text_field: str = 'text', vector_index_name: str = 'vector_index', distance_strategy: DistanceStrategy = DistanceStrategy.COSINE, num_centroids: int = 1, relevance_score_fn: Callable[[float], float] | None = None, keyword_index_name: str = 'keyword_index', keyword_analyzer: str = 'text_en', rrf_constant: int = 60, rrf_search_limit: int = 100)[source]
Bases: VectorStore
ArangoDB vector store implementation for LangChain.
This class provides a vector store implementation using ArangoDB as the backend. It supports both vector similarity search and hybrid search (vector + keyword) capabilities.
- Parameters:
embedding (langchain.embeddings.base.Embeddings) – The embedding function to use for converting text to vectors. Must implement the langchain.embeddings.base.Embeddings interface.
embedding_dimension (int) – The dimensionality of the embedding vectors. Must match the output dimension of the embedding function.
database (arango.database.StandardDatabase) – The ArangoDB database instance to use for storage and retrieval.
collection_name (str) – The name of the ArangoDB collection to store documents. Defaults to “documents”.
search_type (SearchType) – The type of search to perform. Can be either SearchType.VECTOR for pure vector similarity search or SearchType.HYBRID for combining vector and keyword search. Defaults to SearchType.VECTOR.
embedding_field (str) – The field name in the document to store the embedding vector. Defaults to “embedding”.
text_field (str) – The field name in the document to store the text content. Defaults to “text”.
vector_index_name (str) – The name of the vector index to create in ArangoDB. This index enables efficient vector similarity search. Defaults to “vector_index”.
distance_strategy (DistanceStrategy) – The distance metric to use for vector similarity. Can be either DistanceStrategy.COSINE or DistanceStrategy.EUCLIDEAN_DISTANCE. Defaults to DistanceStrategy.COSINE.
num_centroids (int) – The number of centroids to use for the vector index. Higher values can improve search accuracy but increase memory usage. Defaults to 1.
relevance_score_fn (Optional[Callable[[float], float]]) – Optional function to normalize the relevance score. If not provided, uses the default normalization for the distance strategy.
keyword_index_name (str) – The name of the ArangoDB View created to enable Full-Text-Search capabilities. Only used if search_type is set to SearchType.HYBRID. Defaults to “keyword_index”.
keyword_analyzer (str) – The text analyzer to use for keyword search. Must be one of the supported analyzers in ArangoDB. Defaults to “text_en”.
rrf_constant (int) – The constant used in Reciprocal Rank Fusion (RRF) for hybrid search. Higher values flatten the score differences between ranks, which gives relatively more weight to lower-ranked results. Defaults to 60.
rrf_search_limit (int) – The maximum number of results to consider in RRF scoring. Defaults to 100.
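To make the roles of rrf_constant and rrf_search_limit concrete, the following is a minimal pure-Python sketch of generic Reciprocal Rank Fusion. It is not the library's implementation (which runs inside the database); the function name `reciprocal_rank_fusion` and the way the two weights multiply each list's contribution are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    vector_ranked: list[str],
    keyword_ranked: list[str],
    rrf_constant: int = 60,
    rrf_search_limit: int = 100,
    vector_weight: float = 1.0,
    keyword_weight: float = 1.0,
) -> list[tuple[str, float]]:
    """Fuse two ranked lists of document keys into one scored ranking."""
    scores: dict[str, float] = defaultdict(float)
    weighted_lists = (
        (vector_weight, vector_ranked),
        (keyword_weight, keyword_ranked),
    )
    for weight, ranking in weighted_lists:
        # Only the first rrf_search_limit entries of each list contribute.
        for rank, key in enumerate(ranking[:rrf_search_limit], start=1):
            # Assumption: each weight scales its list's reciprocal-rank term.
            scores[key] += weight / (rrf_constant + rank)
    # Highest fused score first; ties are broken by key for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fused = reciprocal_rank_fusion(["a", "b", "c"], ["c", "a", "d"])
print([key for key, _ in fused])
```

Documents that appear near the top of both lists accumulate the largest fused score, which is why "a" ranks first here.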
- property embeddings: Embeddings
Access the query embedding object if available.
- retrieve_vector_index() -> dict[str, Any] | None
Retrieve the vector index from the collection.
- retrieve_keyword_index() -> dict[str, Any] | None
Retrieve the keyword index from the collection.
- add_embeddings(texts: Iterable[str], embeddings: List[List[float]], metadatas: List[dict] | None = None, ids: List[str] | None = None, batch_size: int = 500, use_async_db: bool = False, insert_text: bool = True, **kwargs: Any) -> List[str]
Add embeddings to the vector store.
- add_texts(texts: Iterable[str], metadatas: List[dict] | None = None, ids: List[str] | None = None, **kwargs: Any) -> List[str]
Add texts to the vector store.
This method embeds the provided texts using the embedding function and stores them in ArangoDB along with their embeddings and metadata.
- Parameters:
texts (Iterable[str]) – An iterable of text strings to add to the vector store.
metadatas (Optional[List[dict]]) – Optional list of metadata dictionaries to associate with each text. Each dictionary can contain arbitrary key-value pairs that will be stored alongside the text and embedding.
ids (Optional[List[str]]) – Optional list of unique identifiers for each text. If not provided, IDs will be generated using a hash of the text content.
kwargs (Any) – Additional keyword arguments passed to add_embeddings.
- Returns:
List of document IDs that were added to the vector store.
- Return type:
List[str]
    # Add simple texts
    texts = ["hello world", "hello arango", "test document"]
    ids = vector_store.add_texts(texts)
    print(f"Added {len(ids)} documents")

    # Add texts with metadata
    texts = ["Machine learning tutorial", "Python programming guide"]
    metadatas = [
        {"category": "AI", "difficulty": "beginner"},
        {"category": "Programming", "difficulty": "intermediate"},
    ]
    ids = vector_store.add_texts(texts, metadatas=metadatas)

    # Add texts with custom IDs
    texts = ["Document 1", "Document 2"]
    custom_ids = ["doc_001", "doc_002"]
    ids = vector_store.add_texts(texts, ids=custom_ids)
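The note that missing IDs are generated from a hash of the text content implies that IDs are deterministic. The sketch below illustrates that property only; the helper `text_to_id` is hypothetical, and SHA-256 is a stand-in, since the hash function the library actually uses is not stated here.

```python
import hashlib


def text_to_id(text: str) -> str:
    """Derive a deterministic ID from text (SHA-256 is a stand-in hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


texts = ["hello world", "hello arango", "hello world"]
ids = [text_to_id(t) for t in texts]

# Identical texts map to the same ID; different texts map to different IDs.
print(ids[0] == ids[2], ids[0] == ids[1])
```

Because identical texts hash to the same ID, passing explicit ids is the safer choice when duplicate texts must be stored as separate documents.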
- similarity_search(query: str, k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, embedding: List[float] | None = None, filter_clause: str = '', search_type: SearchType | None = None, vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool = True, **kwargs: Any) -> Iterator[Document]
- similarity_search(query: str, k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, embedding: List[float] | None = None, filter_clause: str = '', search_type: SearchType | None = None, vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool | None = None, **kwargs: Any) -> List[Document]
Search for similar documents using vector similarity or hybrid search.
This method performs a similarity search using either pure vector similarity or a hybrid approach combining vector and keyword search. The search type can be overridden for individual queries.
- Parameters:
query (str) – The text query to search for.
k (int) – The number of most similar documents to return. Defaults to 4.
return_fields (set[str]) – Set of additional document fields to return in results. The _key and text fields are always returned.
use_approx (bool) – Whether to use approximate nearest neighbor search. Enables faster but potentially less accurate results. Defaults to True.
embedding (Optional[List[float]]) – Optional pre-computed embedding for the query. If not provided, the query will be embedded using the embedding function.
filter_clause (str) – Optional AQL filter clause to apply to the search. Can be used to filter results based on document properties.
search_type (Optional[SearchType]) – Override the default search type for this query. Can be either SearchType.VECTOR or SearchType.HYBRID.
vector_weight (float) – Weight to apply to vector similarity scores in hybrid search. Only used when search_type is SearchType.HYBRID. Defaults to 1.0.
keyword_weight (float) – Weight to apply to keyword search scores in hybrid search. Only used when search_type is SearchType.HYBRID. Defaults to 1.0.
keyword_search_clause (str) – Optional AQL clause used for the full-text (keyword) part of the search. If empty, a default search clause is used.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved. If specified, the metadata will be added to the Document.metadata field.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
kwargs (Any) – Additional keyword arguments.
- Returns:
List of Document objects if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[Document], Iterator[Document]]
    # Simple vector search (batch mode)
    results = vector_store.similarity_search("hello", k=1)
    print(results[0].page_content)

    # Search with metadata filtering (batch mode)
    results = vector_store.similarity_search(
        "machine learning",
        k=2,
        filter_clause="doc.category == 'AI'",
        return_fields={"category", "difficulty"},
    )

    # Hybrid search with custom weights (batch mode)
    results = vector_store.similarity_search(
        "neural networks",
        k=3,
        search_type=SearchType.HYBRID,
        vector_weight=0.8,
        keyword_weight=0.2,
    )

    # Streaming mode (memory efficient for large k)
    for doc in vector_store.similarity_search("query", k=10000, stream=True):
        process_document(doc)
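The difference between batch and streaming mode comes down to returning a list versus a lazy iterator. The sketch below shows that pattern in isolation; `fetch_results` is a hypothetical stand-in that yields integers instead of Document objects, not the library's code.

```python
from typing import Iterator, Optional, Union


def fetch_results(
    k: int, stream: Optional[bool] = None
) -> Union[list[int], Iterator[int]]:
    """Return k results as a list (batch) or as a lazy iterator (stream)."""

    def generate() -> Iterator[int]:
        for i in range(k):
            yield i  # stand-in for one result read from a database cursor

    # stream=True hands back the generator; None or False materializes it.
    return generate() if stream else list(generate())


batch = fetch_results(3)               # list: all results held in memory
lazy = fetch_results(3, stream=True)   # iterator: results produced on demand
first = next(lazy)                     # consumes only the first result
remaining = list(lazy)                 # drains whatever is left
```

An iterator can be consumed only once and does not support indexing or len(), so code written for batch mode (such as results[0]) needs adjusting when stream=True.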
- similarity_search_with_score(query: str, k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, embedding: List[float] | None = None, filter_clause: str = '', search_type: SearchType | None = None, vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool = True) -> Iterator[tuple[Document, float]]
- similarity_search_with_score(query: str, k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, embedding: List[float] | None = None, filter_clause: str = '', search_type: SearchType | None = None, vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool | None = None) -> List[tuple[Document, float]]
Search for similar documents and return their similarity scores.
Similar to similarity_search but returns a tuple of (Document, score) for each result. The score represents the similarity between the query and the document.
- Parameters:
query (str) – The text query to search for.
k (int) – The number of most similar documents to return. Defaults to 4.
return_fields (set[str]) – Set of additional document fields to return in results. The _key and text fields are always returned.
use_approx (bool) – Whether to use approximate nearest neighbor search. Enables faster but potentially less accurate results. Defaults to True.
embedding (Optional[List[float]]) – Optional pre-computed embedding for the query. If not provided, the query will be embedded using the embedding function.
filter_clause (str) – Optional AQL filter clause to apply to the search. Can be used to filter results based on document properties.
search_type (Optional[SearchType]) – Override the default search type for this query. Can be either SearchType.VECTOR or SearchType.HYBRID.
vector_weight (float) – Weight to apply to vector similarity scores in hybrid search. Only used when search_type is SearchType.HYBRID. Defaults to 1.0.
keyword_weight (float) – Weight to apply to keyword search scores in hybrid search. Only used when search_type is SearchType.HYBRID. Defaults to 1.0.
keyword_search_clause (str) – Optional AQL clause used for the full-text (keyword) part of the search. If empty, a default search clause is used.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
- Returns:
List of tuples containing (Document, score) pairs if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[tuple[Document, float]], Iterator[tuple[Document, float]]]
    # Batch mode (default)
    results = vector_store.similarity_search_with_score("query", k=100)
    for doc, score in results:
        print(f"Score: {score}, Content: {doc.page_content[:50]}")

    # Streaming mode (memory efficient)
    for doc, score in vector_store.similarity_search_with_score(
        "query", k=10000, stream=True
    ):
        process_document(doc, score)
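How a score should be read depends on the distance strategy. As a rough illustration, the sketch below computes the two metrics named for DistanceStrategy (cosine similarity and Euclidean distance) in plain Python. The helper names are hypothetical, and whether the library returns raw or normalized values is not covered here.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
same_direction = [2.0, 0.0]
orthogonal = [0.0, 3.0]

for name, vec in (("same_direction", same_direction), ("orthogonal", orthogonal)):
    print(name, cosine_similarity(query, vec), euclidean_distance(query, vec))
```

With cosine similarity a higher value means a closer match, whereas with Euclidean distance a lower value does, so thresholds are not interchangeable between the two strategies.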
- similarity_search_by_vector(embedding: List[float], k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, filter_clause: str = '', metadata_clause: str = '', stream: bool = True, **kwargs: Any) -> Iterator[Document]
- similarity_search_by_vector(embedding: List[float], k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, filter_clause: str = '', metadata_clause: str = '', stream: bool | None = None, **kwargs: Any) -> List[Document]
Return docs most similar to embedding vector.
- Parameters:
embedding (List[float]) – Embedding to look up documents similar to.
k (int) – Number of Documents to return. Defaults to 4.
return_fields (set[str]) – Fields to return in the result. For example, {"foo", "bar"} returns the foo and bar fields of each document in addition to the _key and text fields. Defaults to an empty set.
use_approx (bool) – Whether to use approximate vector search via ANN. Defaults to True. If False, exact vector search will be used.
filter_clause (str) – Filter clause to apply to the query.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved. If specified, the metadata will be added to the Document.metadata field.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
kwargs (Any) – Additional keyword arguments.
- Returns:
List of Documents if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[Document], Iterator[Document]]
    # Batch mode (default)
    docs = vector_store.similarity_search_by_vector(embedding, k=100)

    # Streaming mode (memory efficient)
    for doc in vector_store.similarity_search_by_vector(
        embedding, k=10000, stream=True
    ):
        process_document(doc)

- similarity_search_by_vector_and_keyword(query: str, embedding: List[float], k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, filter_clause: str = '', vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool = True) -> Iterator[Document]
- similarity_search_by_vector_and_keyword(query: str, embedding: List[float], k: int = 4, return_fields: set[str] = set(), use_approx: bool = True, filter_clause: str = '', vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool | None = None) -> List[Document]
Return docs most similar to query using hybrid search.
- Parameters:
query (str) – Query text to search for.
embedding (List[float]) – Embedding vector for the query.
k (int) – Number of Documents to return. Defaults to 4.
return_fields (set[str]) – Fields to return in the result. For example, {"foo", "bar"} returns the foo and bar fields of each document in addition to the _key and text fields. Defaults to an empty set.
use_approx (bool) – Whether to use approximate vector search via ANN. Defaults to True. If False, exact vector search will be used.
filter_clause (str) – Filter clause to apply to the query.
vector_weight (float) – Weight to apply to vector similarity scores in hybrid search. Defaults to 1.0.
keyword_weight (float) – Weight to apply to keyword search scores in hybrid search. Defaults to 1.0.
keyword_search_clause (str) – Optional AQL clause used for the full-text (keyword) part of the search. If empty, a default search clause is used.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved. If specified, the metadata will be added to the Document.metadata field.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
- Returns:
List of Documents if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[Document], Iterator[Document]]
    # Batch mode (default)
    docs = vector_store.similarity_search_by_vector_and_keyword(
        query, embedding, k=100
    )

    # Streaming mode (memory efficient)
    for doc in vector_store.similarity_search_by_vector_and_keyword(
        query, embedding, k=10000, stream=True
    ):
        process_document(doc)

- similarity_search_by_vector_with_score(embedding: List[float], k: int = 4, return_fields: set[str] = {}, use_approx: bool = True, filter_clause: str = '', metadata_clause: str = '', stream: bool | None = None) -> List[tuple[Document, float]] | Iterator[tuple[Document, float]]
Return docs most similar to embedding vector with scores.
- Parameters:
embedding (List[float]) – Embedding to look up documents similar to.
k (int) – Number of Documents to return. Defaults to 4.
return_fields (set[str]) – Fields to return in the result. For example, {"foo", "bar"} returns the foo and bar fields of each document in addition to the _key and text fields. Defaults to an empty set.
use_approx (bool) – Whether to use approximate vector search via ANN. Defaults to True. If False, exact vector search will be used.
filter_clause (str) – Filter clause to apply to the query.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved. If specified, the metadata will be added to the Document.metadata field.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
- Returns:
List of tuples containing (Document, score) pairs if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[tuple[Document, float]], Iterator[tuple[Document, float]]]
    # Batch mode (default)
    results = vector_store.similarity_search_by_vector_with_score(
        embedding, k=100
    )

    # Streaming mode (memory efficient)
    for doc, score in vector_store.similarity_search_by_vector_with_score(
        embedding, k=10000, stream=True
    ):
        process_document(doc, score)

- similarity_search_by_vector_and_keyword_with_score(query: str, embedding: List[float], k: int = 4, return_fields: set[str] = {}, use_approx: bool = True, filter_clause: str = '', vector_weight: float = 1.0, keyword_weight: float = 1.0, keyword_search_clause: str = '', metadata_clause: str = '', stream: bool | None = None) -> List[tuple[Document, float]] | Iterator[tuple[Document, float]]
Run hybrid similarity search combining vector and keyword search with scores.
- Parameters:
query (str) – Query text to search for.
embedding (List[float]) – Embedding vector for the query.
k (int) – Number of results to return. Defaults to 4.
return_fields (set[str]) – Fields to return in the result. For example, {"foo", "bar"} returns the foo and bar fields of each document in addition to the _key and text fields. Defaults to an empty set.
use_approx (bool) – Whether to use approximate vector search via ANN. Defaults to True. If False, exact vector search will be used.
filter_clause (str) – Filter clause to apply to the query.
vector_weight (float) – Weight to apply to vector similarity scores in hybrid search. Defaults to 1.0.
keyword_weight (float) – Weight to apply to keyword search scores in hybrid search. Defaults to 1.0.
keyword_search_clause (str) – Optional AQL clause used for the full-text (keyword) part of the search. If empty, a default search clause is used.
metadata_clause (str) – Optional AQL clause to return additional metadata once the top k results are retrieved. If specified, the metadata will be added to the Document.metadata field.
stream (Optional[bool]) – If True, returns an iterator that yields results one at a time. This reduces memory usage for large k values. If None or False, returns all results as a list. Defaults to None (batch mode).
- Returns:
List of tuples containing (Document, score) pairs if stream is None or False, Iterator if stream=True.
- Return type:
Union[List[tuple[Document, float]], Iterator[tuple[Document, float]]]
    # Batch mode (default)
    results = vector_store.similarity_search_by_vector_and_keyword_with_score(
        query, embedding, k=100
    )

    # Streaming mode (memory efficient)
    for doc, score in (
        vector_store.similarity_search_by_vector_and_keyword_with_score(
            query, embedding, k=10000, stream=True
        )
    ):
        process_document(doc, score)

- delete(ids: List[str] | None = None, **kwargs: Any) -> bool | None
Delete by vector ID or other criteria.
- Parameters:
ids (Optional[List[str]]) – List of ids to delete.
kwargs (Any) – Other keyword arguments that can be used to delete vectors.
- Returns:
True if deletion is successful, None if no ids are provided, or raises an exception if an error occurs.
- Return type:
Optional[bool]
- get_by_ids(ids: Sequence[str], /) -> list[Document]
Get documents by their IDs.
- Parameters:
ids (Sequence[str]) – List of ids to get.
- Returns:
List of Documents with the given ids.
- Return type:
list[Document]
- max_marginal_relevance_search(query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, return_fields: set[str] = {}, use_approx: bool = True, embedding: List[float] | None = None, **kwargs: Any) -> List[Document]
Search for documents using Maximal Marginal Relevance (MMR).
MMR optimizes for both similarity to the query and diversity among the results. It helps avoid returning redundant or very similar documents by balancing relevance and diversity in the selection process.
- Parameters:
query (str) – The text query to search for.
k (int) – The number of documents to return. Defaults to 4.
fetch_k (int) – The number of documents to fetch for MMR selection. Should be larger than k to allow for diversity selection. Defaults to 20.
lambda_mult (float) – Controls the diversity vs relevance tradeoff. Values between 0 and 1, where 0 = maximum diversity, 1 = maximum relevance. Defaults to 0.5.
return_fields (set[str]) – Set of additional document fields to return in results. The _key and text fields are always returned.
use_approx (bool) – Whether to use approximate nearest neighbor search. Enables faster but potentially less accurate results. Defaults to True.
embedding (Optional[List[float]]) – Optional pre-computed embedding for the query. If not provided, the query will be embedded using the embedding function.
kwargs (Any) – Additional keyword arguments passed to the search methods.
- Returns:
List of Document objects selected by MMR algorithm.
- Return type:
List[Document]
    # Search with balanced diversity and relevance
    results = vector_store.max_marginal_relevance_search(
        "machine learning", k=3, fetch_k=10
    )

    # Emphasize diversity over relevance
    diverse_results = vector_store.max_marginal_relevance_search(
        "neural networks",
        k=5,
        fetch_k=20,
        lambda_mult=0.1,  # More diverse results
    )

    # Emphasize relevance over diversity
    relevant_results = vector_store.max_marginal_relevance_search(
        "deep learning",
        k=3,
        fetch_k=15,
        lambda_mult=0.9,  # More relevant results
    )
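The effect of lambda_mult is easiest to see on a toy example. The following is a minimal sketch of greedy MMR selection over in-memory vectors, assuming cosine similarity; `mmr_select` and `cosine` are hypothetical helpers written for illustration, not the library's implementation.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def mmr_select(
    query_vec: list[float],
    candidates: dict[str, list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[str]:
    """Greedily pick k keys, trading off relevance against redundancy."""
    selected: list[str] = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def mmr_score(key: str) -> float:
            relevance = cosine(query_vec, remaining[key])
            # Redundancy: similarity to the closest already-selected item.
            redundancy = max(
                (cosine(remaining[key], candidates[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        del remaining[best]
    return selected


query = [1.0, 0.0]
fetched = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.6, -0.8]}

relevant = mmr_select(query, fetched, k=2, lambda_mult=0.9)
diverse = mmr_select(query, fetched, k=2, lambda_mult=0.1)
print(relevant, diverse)
```

Here "b" is nearly a duplicate of "a", so a high lambda_mult keeps it as the second pick, while a low lambda_mult skips it in favor of the less similar "c". In the method above, fetch_k plays the role of the candidate pool size.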
- classmethod from_texts(texts: List[str], embedding: Embeddings, metadatas: List[dict] | None = None, database: StandardDatabase | None = None, collection_name: str = 'documents', search_type: SearchType = SearchType.VECTOR, embedding_field: str = 'embedding', text_field: str = 'text', vector_index_name: str = 'vector_index', distance_strategy: DistanceStrategy = DistanceStrategy.COSINE, num_centroids: int = 1, ids: List[str] | None = None, overwrite_index: bool = False, insert_text: bool = True, keyword_index_name: str = 'keyword_index', keyword_analyzer: str = 'text_en', rrf_constant: int = 60, rrf_search_limit: int = 100, **kwargs: Any) -> ArangoVector
Create an ArangoVector instance from a list of texts.
This is a convenience method that creates a new ArangoVector instance, embeds the provided texts, and stores them in ArangoDB.
- Parameters:
texts (List[str]) – List of text strings to add to the vector store.
embedding (langchain.embeddings.base.Embeddings) – The embedding function to use for converting text to vectors.
metadatas (Optional[List[dict]]) – Optional list of metadata dictionaries to associate with each text.
database (Optional[arango.database.StandardDatabase]) – The ArangoDB database instance to use.
collection_name (str) – The name of the ArangoDB collection to use. Defaults to “documents”.
search_type (SearchType) – The type of search to perform. Can be either SearchType.VECTOR or SearchType.HYBRID. Defaults to SearchType.VECTOR.
embedding_field (str) – The field name to store embeddings. Defaults to “embedding”.
text_field (str) – The field name to store text content. Defaults to “text”.
vector_index_name (str) – The name of the vector index. Defaults to “vector_index”.
distance_strategy (DistanceStrategy) – The distance metric to use. Can be DistanceStrategy.COSINE or DistanceStrategy.EUCLIDEAN_DISTANCE. Defaults to DistanceStrategy.COSINE.
num_centroids (int) – Number of centroids for vector index. Defaults to 1.
ids (Optional[List[str]]) – Optional list of unique identifiers for each text.
overwrite_index (bool) – Whether to delete and recreate existing indexes. Defaults to False.
insert_text (bool) – Whether to store the text content in the database. Required for hybrid search. Defaults to True.
keyword_index_name (str) – Name of the keyword search index. Defaults to “keyword_index”.
keyword_analyzer (str) – Text analyzer for keyword search. Defaults to “text_en”.
rrf_constant (int) – Constant for RRF scoring in hybrid search. Defaults to 60.
rrf_search_limit (int) – Maximum results for RRF scoring. Defaults to 100.
kwargs (Any) – Additional keyword arguments passed to the constructor.
- Returns:
A new ArangoVector instance with the texts embedded and stored.
- Return type:
ArangoVector
    from arango import ArangoClient
    from langchain_arangodb.vectorstores import ArangoVector
    from langchain_arangodb.vectorstores.arangodb_vector import SearchType
    from langchain_community.embeddings import OpenAIEmbeddings

    # Connect to ArangoDB
    client = ArangoClient("http://localhost:8529")
    db = client.db("test", username="root", password="openSesame")

    # Create vector store from texts
    texts = ["hello world", "hello arango", "test document"]
    metadatas = [{"source": "doc1"}, {"source": "doc2"}, {"source": "doc3"}]
    vector_store = ArangoVector.from_texts(
        texts=texts,
        embedding=OpenAIEmbeddings(),
        metadatas=metadatas,
        database=db,
        collection_name="test_collection",
    )

    # Create hybrid search store
    hybrid_store = ArangoVector.from_texts(
        texts=["Machine learning algorithms", "Deep neural networks"],
        embedding=OpenAIEmbeddings(),
        database=db,
        search_type=SearchType.HYBRID,
        collection_name="hybrid_docs",
        overwrite_index=True,  # Clean start
    )

- classmethod from_existing_collection(collection_name: str, text_properties_to_embed: List[str], embedding: Embeddings, database: StandardDatabase, embedding_field: str = 'embedding', text_field: str = 'text', vector_index_name: str = 'vector_index', batch_size: int = 1000, aql_return_text_query: str = '', insert_text: bool = False, skip_existing_embeddings: bool = False, search_type: SearchType = SearchType.VECTOR, keyword_index_name: str = 'keyword_index', keyword_analyzer: str = 'text_en', rrf_constant: int = 60, rrf_search_limit: int = 100, **kwargs: Any) -> ArangoVector
Create an ArangoVector instance from an existing ArangoDB collection.
This method reads documents from an existing collection, extracts specified text properties, embeds them, and creates a new vector store.
- Parameters:
collection_name (str) – Name of the existing ArangoDB collection.
text_properties_to_embed (List[str]) – List of document properties containing text to embed. These properties will be concatenated to create the text for embedding.
embedding (Embeddings) – The embedding function to use for converting text to vectors.
database (StandardDatabase) – The ArangoDB database instance to use.
embedding_field (str) – The field name to store embeddings. Defaults to “embedding”.
text_field (str) – The field name to store text content. Defaults to “text”. Only used if insert_text is True.
vector_index_name (str) – The name of the vector index. Defaults to “vector_index”.
batch_size (int) – Number of documents to process in each batch. Defaults to 1000.
aql_return_text_query (str) – Custom AQL query to extract text from properties. Defaults to an empty string, in which case “RETURN doc[p]” is used.
insert_text (bool) – Whether to store the concatenated text in the database. Required for hybrid search. Defaults to False.
skip_existing_embeddings (bool) – Whether to skip documents that already have embeddings. Defaults to False.
search_type (SearchType) – The type of search to perform. Can be either SearchType.VECTOR or SearchType.HYBRID. Defaults to SearchType.VECTOR.
keyword_index_name (str) – Name of the keyword search index. Defaults to “keyword_index”.
keyword_analyzer (str) – Text analyzer for keyword search. Defaults to “text_en”.
rrf_constant (int) – Constant for RRF scoring in hybrid search. Defaults to 60.
rrf_search_limit (int) – Maximum results for RRF scoring. Defaults to 100.
kwargs (Any) – Additional keyword arguments passed to the constructor.
- Returns:
A new ArangoVector instance with embeddings created from the collection.
- Return type:
ArangoVector
- find_entity_clusters(threshold: float = 0.8, k: int = 4, use_approx: bool = True, use_subset_relations: bool = False, merge_similar_entities: bool = False) -> List[Dict[str, Any]] | Dict[str, List[Dict[str, Any]]]
Find similar documents within the collection for entity resolution.
This method compares documents within the collection to each other and returns entities grouped with their most similar documents. Each entity is returned with a list of the top k most similar entities based on the chosen similarity function.
Supported similarity functions: COSINE, EUCLIDEAN_DISTANCE, JACCARD, APPROX_NEAR_COSINE, APPROX_NEAR_L2.
Note: for JACCARD, use_approx is automatically set to False.
- Parameters:
threshold (float) – Minimum similarity score for documents to be considered similar. Defaults to 0.8.
k (int) – Number of similar documents to return for each entity. Defaults to 4.
use_approx (bool) – Whether to use approximate nearest neighbor search. Defaults to True.
use_subset_relations (bool) – Whether to analyze subset relations. Defaults to False.
merge_similar_entities (bool) – Whether to merge similar entities based on subset relationships. Only effective when use_subset_relations=True. When True, merges subset groups into their superset groups to create consolidated, non-overlapping clusters. Defaults to False.
- Returns:
Return format depends on parameters:
Basic clustering (use_subset_relations=False and merge_similar_entities=False): List[Dict] with format: {'entity': entity_key, 'similar': [list_of_keys]}
With subset analysis (use_subset_relations=True, merge_similar_entities=False): Dict with keys: 'similar_entities', 'subset_relationships'
With merging (use_subset_relations=True, merge_similar_entities=True): Dict with keys: 'similar_entities', 'subset_relationships', 'merged_entities'
- Return type:
Union[List[Dict[str, Any]], Dict[str, List[Dict[str, Any]]]]
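To illustrate the basic clustering output format, here is a small in-memory sketch that pairs each entity with its top k neighbors above a threshold, using cosine similarity. The function `find_clusters` is hypothetical, and leaving out entities with no match is a choice made for this sketch rather than documented behavior.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def find_clusters(
    vectors: dict[str, list[float]], threshold: float = 0.8, k: int = 4
) -> list[dict]:
    """Pair each entity key with up to k other keys scoring >= threshold."""
    clusters = []
    for key, vec in vectors.items():
        scored = [
            (cosine(vec, other_vec), other_key)
            for other_key, other_vec in vectors.items()
            if other_key != key
        ]
        # Most similar first, then keep only matches above the threshold.
        similar = [
            other_key
            for score, other_key in sorted(scored, reverse=True)
            if score >= threshold
        ][:k]
        if similar:  # assumption: entities without matches are left out
            clusters.append({"entity": key, "similar": similar})
    return clusters


embeddings = {
    "apple": [1.0, 0.0],
    "apple_inc": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
clusters = find_clusters(embeddings, threshold=0.8, k=4)
print(clusters)
```

Raising the threshold makes clusters smaller and stricter, while k caps how many neighbors are listed per entity regardless of how many pass the threshold.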
Chat Message Histories
- class langchain_arangodb.chat_message_histories.arangodb.ArangoChatMessageHistory(session_id: str | int, db: StandardDatabase, collection_name: str = 'ChatHistory', window: int = 3, *args: Any, **kwargs: Any)[source]
Bases: BaseChatMessageHistory
Chat message history stored in an ArangoDB database.
This class provides persistent storage for chat message histories using ArangoDB as the backend. It supports session-based message storage with automatic collection creation and indexing.
- Parameters:
session_id (Union[str, int]) – Unique identifier for the chat session.
db (arango.database.StandardDatabase) – ArangoDB database instance for storing chat messages.
collection_name (str) – Name of the ArangoDB collection to store messages. Defaults to “ChatHistory”.
window (int) – Maximum number of messages to keep in memory (currently unused). Defaults to 3.
args (Any) – Additional positional arguments passed to BaseChatMessageHistory.
kwargs (Any) – Additional keyword arguments passed to BaseChatMessageHistory.
    from arango import ArangoClient
    from langchain_arangodb.chat_message_histories import ArangoChatMessageHistory

    # Connect to ArangoDB
    client = ArangoClient("http://localhost:8529")
    db = client.db("test", username="root", password="openSesame")

    # Create chat message history
    history = ArangoChatMessageHistory(
        session_id="user_123",
        db=db,
        collection_name="chat_sessions"
    )

    # Add messages
    history.add_user_message("Hello! How are you?")
    history.add_ai_message("I'm doing well, thank you!")

    # Add QA message
    history.add_qa_message(
        user_input="Who is the first character?",
        aql_query="FOR doc IN Characters LIMIT 1 RETURN doc",
        result="The first character is Arya Stark."
    )

    # Retrieve messages
    messages = history.messages
    print(f"Found {len(messages)} messages")

    # Retrieve messages by role
    human_messages = history.get_messages(role="human")
    ai_messages = history.get_messages(role="ai")
    qa_messages = history.get_messages(role="qa")

    # Clear session
    history.clear()
- property messages: List[BaseMessage]
Retrieve the messages from ArangoDB.
Retrieves all messages for the current session from the ArangoDB collection, sorted by timestamp in descending order (most recent first).
- Returns:
List of chat messages for the current session, sorted in reverse chronological order (most recent first).
- Return type:
List[BaseMessage]
    # Get all messages for the session
    messages = history.messages
    for msg in messages:
        print(f"{msg.type}: {msg.content}")

    # Check if session has any messages
    if history.messages:
        print(f"Session has {len(history.messages)} messages")
    else:
        print("No messages in this session")
- get_messages(role: str | None = None, n_messages: int = 10, excluded_fields: list[str] = ['_id', '_key', '_rev', 'session_id', 'time']) list[source]
Retrieve messages from ArangoDB, optionally filtered by role.
- Parameters:
role (Optional[str]) – Optional filter to retrieve messages of a specific role.
n_messages (int) – Number of messages to retrieve.
excluded_fields (list[str]) – Fields to exclude from the returned messages.
    # Get all types of messages, default is 10 messages
    messages = history.get_messages()

    # Get the first 20 human messages
    messages = history.get_messages(role="human", n_messages=20)

    # Get the first 20 AI messages
    messages = history.get_messages(role="ai", n_messages=20)
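To illustrate what excluded_fields amounts to, the sketch below strips the default excluded keys from a stored document. It is a plain-Python stand-in, not the library's implementation; the helper `strip_fields` and the sample document (including its "role" and "content" field names) are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# Default taken from the get_messages signature above.
DEFAULT_EXCLUDED = ["_id", "_key", "_rev", "session_id", "time"]


def strip_fields(
    doc: Dict[str, Any], excluded: Optional[List[str]] = None
) -> Dict[str, Any]:
    """Return a copy of doc without the excluded keys (hypothetical helper)."""
    excluded = DEFAULT_EXCLUDED if excluded is None else excluded
    return {key: value for key, value in doc.items() if key not in excluded}


# Hypothetical stored document; the field names are assumptions.
stored = {
    "_id": "chat_sessions/1",
    "_key": "1",
    "_rev": "_abc",
    "session_id": "user_123",
    "time": "2024-01-01T00:00:00",
    "role": "human",
    "content": "Hello! How are you?",
}

print(strip_fields(stored))
```

Passing an empty list for `excluded` in this sketch keeps every field, which can help when debugging what is actually stored.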
- add_message(message: BaseMessage) None[source]
Append the message to the record in ArangoDB.
Stores a single chat message in the ArangoDB collection associated with the current session. The message is stored with its type, content, and session identifier.
- Parameters:
message (BaseMessage) – The chat message to add to the history.
    from langchain_core.messages import HumanMessage, AIMessage

    # Add user message
    user_msg = HumanMessage(content="What is the weather like?")
    history.add_message(user_msg)

    # Add AI response
    ai_msg = AIMessage(content="I don't have access to current weather data.")
    history.add_message(ai_msg)

    # Or use convenience methods
    history.add_user_message("Hello!")
    history.add_ai_message("Hi there!")
- add_qa_message(user_input: str, aql_query: str, result: str) None[source]
Add a QA message to the chat history.
- Parameters:
user_input (str) – The user’s input.
aql_query (str) – The AQL query to execute.
result (str) – The result of the AQL query.
    history.add_qa_message(
        user_input="Who is the first character?",
        aql_query="FOR doc IN Characters LIMIT 1 RETURN doc",
        result="The first character is Arya Stark."
    )
- clear() None[source]
Clear session memory from ArangoDB.
Removes all messages associated with the current session from the ArangoDB collection. The collection itself is preserved for future use.
    # Add some messages
    history.add_user_message("Hello")
    history.add_ai_message("Hi!")
    print(f"Messages before clear: {len(history.messages)}")  # Output: 2

    # Clear all messages for this session
    history.clear()
    print(f"Messages after clear: {len(history.messages)}")  # Output: 0

    # Collection still exists for future messages
    history.add_user_message("Starting fresh conversation")
    print(f"New message count: {len(history.messages)}")  # Output: 1
Graph Stores
- langchain_arangodb.graphs.arangodb_graph.get_arangodb_client(url: str | None = None, dbname: str | None = None, username: str | None = None, password: str | None = None) Any[source]
Get the Arango DB client from credentials.
- Parameters:
url (str) – ArangoDB URL. Can be passed in as a named argument or set as the environment variable ARANGODB_URL. Defaults to “http://localhost:8529”.
dbname (str) – ArangoDB database name. Can be passed in as a named argument or set as the environment variable ARANGODB_DBNAME. Defaults to “_system”.
username (str) – Can be passed in as a named argument or set as the environment variable ARANGODB_USERNAME. Defaults to “root”.
password (str) – Can be passed in as a named argument or set as the environment variable ARANGODB_PASSWORD. Defaults to “”.
- Returns:
An arango.database.StandardDatabase.
- Return type:
Any
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
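The lookup described above (named argument, then environment variable, then default) can be sketched with the standard library alone. This is an illustration of the documented behavior, not the library's code: the helper `resolve_credentials` is hypothetical, and the precedence of named arguments over environment variables is an assumption.

```python
import os
from typing import Dict, Optional

# Environment variable names and defaults as documented above.
DEFAULTS = {
    "ARANGODB_URL": "http://localhost:8529",
    "ARANGODB_DBNAME": "_system",
    "ARANGODB_USERNAME": "root",
    "ARANGODB_PASSWORD": "",
}


def _resolve(value: Optional[str], env_var: str) -> str:
    """Prefer the named argument, then the environment, then the default."""
    if value is not None:
        return value
    return os.environ.get(env_var, DEFAULTS[env_var])


def resolve_credentials(
    url: Optional[str] = None,
    dbname: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper mirroring the documented parameter handling."""
    return {
        "url": _resolve(url, "ARANGODB_URL"),
        "dbname": _resolve(dbname, "ARANGODB_DBNAME"),
        "username": _resolve(username, "ARANGODB_USERNAME"),
        "password": _resolve(password, "ARANGODB_PASSWORD"),
    }


print(resolve_credentials(username="alice"))
```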
- class langchain_arangodb.graphs.arangodb_graph.ArangoGraph(db: StandardDatabase, generate_schema_on_init: bool = True, schema_sample_ratio: float = 0, schema_graph_name: str | None = None, schema_include_examples: bool = True, schema_list_limit: int = 32, schema_string_limit: int = 256, schema_include_views: bool = False, schema_include_indexes: bool = False)[source]
Bases:
GraphStore
ArangoDB wrapper for graph operations.
- Parameters:
db (StandardDatabase) – The ArangoDB database instance.
generate_schema_on_init (bool) – Whether to generate the graph schema on initialization. Defaults to True.
schema_sample_ratio (float) – The ratio of documents/edges to sample in relation to the Collection size to generate each Collection Schema. If 0, one document/edge is used per Collection. Defaults to 0.
schema_graph_name (Optional[str]) – The name of an existing ArangoDB Graph to specifically use to generate the schema. If None, the entire database will be used. Defaults to None.
schema_include_examples (bool) – Whether to include example values fetched from sample documents as part of the schema. Lists longer than schema_list_limit are excluded from the schema even if schema_include_examples is True. Defaults to True.
schema_list_limit (int) – The maximum list size the schema will include as part of the example values. If the list is longer than this limit, a string describing the list will be used in the schema instead. Default is 32.
schema_string_limit (int) – The maximum number of characters to include in a string. If the string is longer than this limit, a string describing the string will be used in the schema instead. Default is 256.
schema_include_views (bool) – Whether to include ArangoDB Views and Analyzers as part of the schema passed to the AQL Generation prompt. Default is False.
schema_include_indexes (bool) – Whether to include ArangoDB Indexes as part of the collection schema passed to the AQL Generation prompt. Default is False.
- Returns:
None
- Return type:
None
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
- Security note: Make sure that the database connection uses credentials that are narrowly scoped to only include necessary permissions. Failure to do so may result in data corruption or loss, since the calling code may attempt commands that would delete or mutate data if appropriately prompted, or read sensitive data if such data is present in the database. The best way to guard against such negative outcomes is to (as appropriate) limit the permissions granted to the credentials used with this tool.
See https://python.langchain.com/docs/security for more information.
- property db: StandardDatabase
- property schema: Dict[str, Any]
Returns the schema of the Graph Database as a structured object
- property get_structured_schema: Dict[str, Any]
Returns the schema of the Graph Database as a structured object
- property schema_json: str
Returns the schema of the Graph Database as a JSON string
- Returns:
The schema of the Graph Database as a JSON string
- Return type:
str
- property schema_yaml: str
Returns the schema of the Graph Database as a YAML string
- Returns:
The schema of the Graph Database as a YAML string
- Return type:
str
- set_schema(schema: Dict[str, Any]) None[source]
Sets a custom schema for the ArangoDB Database.
- Parameters:
schema (Dict[str, Any]) – The schema to set.
- Returns:
None
- Return type:
None
- refresh_schema(sample_ratio: float = 0, graph_name: str | None = None, include_examples: bool = True, list_limit: int = 32) None[source]
Refresh the graph schema information.
- Parameters:
sample_ratio (float) – A float (0 to 1) to determine the ratio of documents/edges sampled in relation to the Collection size to generate each Collection Schema. If 0, one document/edge is used per Collection. Defaults to 0.
graph_name (Optional[str]) – The name of an existing ArangoDB Graph to specifically use to generate the schema. If None, the entire database will be used. Defaults to None.
include_examples (bool) – Whether to include example values fetched from sample documents as part of the schema. Lists longer than list_limit are excluded from the schema even if include_examples is True. Defaults to True.
list_limit (int) – The maximum list size the schema will include as part of the example values. If the list is longer than this limit, a string describing the list will be used in the schema instead. Default is 32.
- Returns:
None
- Return type:
None
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
- generate_schema(sample_ratio: float = 0, graph_name: str | None = None, include_examples: bool = True, list_limit: int = 32, schema_string_limit: int = 256, schema_include_views: bool = False, schema_include_indexes: bool = False) Dict[str, List[Dict[str, Any]]][source]
Generates the schema of the ArangoDB Database and returns it
- Parameters:
sample_ratio (float) – A ratio (0 to 1) to determine the ratio of documents/edges used (in relation to the Collection size) to render each Collection Schema. If 0, one document/edge is used per Collection.
graph_name (Optional[str]) – The name of the graph to use to generate the schema. If None, the entire database will be used.
include_examples (bool) – A flag whether to scan the database for example values and use them in the graph schema. Default is True.
list_limit (int) – The maximum number of elements to include in a list. If the list is longer than this limit, a string describing the list will be used in the schema instead. Default is 32.
schema_string_limit (int) – The maximum number of characters to include in a string. If the string is longer than this limit, a string describing the string will be used in the schema instead. Default is 256.
schema_include_views (bool) – Whether to include ArangoDB Views and Analyzers as part of the schema passed to the AQL Generation prompt. Default is False.
schema_include_indexes (bool) – Whether to include ArangoDB Indexes as part of the collection schema passed to the AQL Generation prompt. Default is False.
- Returns:
A dictionary containing the graph schema and collection schema.
- Return type:
Dict[str, List[Dict[str, Any]]]
- Raises:
ValueError – If the sample ratio is not between 0 and 1.
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
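The sample_ratio rule can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not the library's code: the helper `sample_size` is hypothetical, and rounding up to a whole document is an assumed choice.

```python
import math


def sample_size(collection_count: int, sample_ratio: float = 0) -> int:
    """Number of documents/edges to sample for one collection (sketch)."""
    if not 0 <= sample_ratio <= 1:
        raise ValueError("sample_ratio must be between 0 and 1")
    if sample_ratio == 0:
        # Documented special case: one document/edge per collection.
        return 1
    # Assumption: round up, and always sample at least one document.
    return max(1, math.ceil(sample_ratio * collection_count))


print(sample_size(1000, 0))     # one document for ratio 0
print(sample_size(1000, 0.25))  # a quarter of the collection
```

A ratio outside the 0 to 1 range raises ValueError in this sketch, matching the documented error condition.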
- query(query: str, params: dict = {}) List[Any][source]
Execute an AQL query and return the results.
- Parameters:
query (str) – The AQL query to execute.
params (dict) – Additional arguments piped to the function. Defaults to an empty dict.
list_limit (Optional[int]) – Removes lists above list_limit size that have been returned from the AQL query.
string_limit (Optional[int]) – Removes strings above string_limit size that have been returned from the AQL query.
remaining_params (Optional[dict]) – Remaining params are passed to the AQL query execution. Defaults to None.
- Returns:
A list of dictionaries containing the query results.
- Return type:
List[Any]
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
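The effect of list_limit and string_limit on returned data can be sketched as a recursive filter over the result rows. This is only an illustration of the "removes oversized values" behavior described above; the helper `sanitize_results` is hypothetical and the library may handle oversized values differently.

```python
from typing import Any, List

_DROP = object()  # sentinel marking a value that should be removed


def _clean(value: Any, list_limit: int, string_limit: int) -> Any:
    """Return value with oversized lists/strings removed, or _DROP."""
    if isinstance(value, str):
        return _DROP if len(value) > string_limit else value
    if isinstance(value, list):
        if len(value) > list_limit:
            return _DROP
        items = (_clean(item, list_limit, string_limit) for item in value)
        return [item for item in items if item is not _DROP]
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            result = _clean(item, list_limit, string_limit)
            if result is not _DROP:
                cleaned[key] = result
        return cleaned
    return value


def sanitize_results(
    rows: List[Any], *, list_limit: int, string_limit: int
) -> List[Any]:
    """Hypothetical helper: drop oversized lists and strings from rows."""
    cleaned = (_clean(row, list_limit, string_limit) for row in rows)
    return [row for row in cleaned if row is not _DROP]


rows = [{"name": "Arya", "bio": "x" * 300, "tags": list(range(50)), "age": None}]
print(sanitize_results(rows, list_limit=32, string_limit=256))
```

A sentinel is used instead of None so that genuine null values in the query result survive the filter.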
- explain(query: str, params: dict = {}) List[Dict[str, Any]][source]
Explain an AQL query without executing it.
- Parameters:
query (str) – The AQL query to explain.
params (dict) – Additional arguments piped to the function. Defaults to an empty dict.
- Returns:
A list of dictionaries containing the query explanation.
- Return type:
List[Dict[str, Any]]
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
- add_graph_documents(graph_documents: List[GraphDocument], include_source: bool = False, graph_name: str | None = None, update_graph_definition_if_exists: bool = False, batch_size: int = 1000, use_one_entity_collection: bool = True, insert_async: bool = False, source_collection_name: str | None = None, source_edge_collection_name: str | None = None, entity_collection_name: str | None = None, entity_edge_collection_name: str | None = None, embeddings: Embeddings | None = None, embedding_field: str = 'embedding', embed_source: bool = False, embed_nodes: bool = False, embed_relationships: bool = False, capitalization_strategy: str = 'none') None[source]
Constructs nodes & relationships in the graph based on the provided GraphDocument objects.
- Parameters:
graph_documents (List[GraphDocument]) – The GraphDocument objects to add to the graph.
include_source (bool) – Whether to include the source document in the graph.
graph_name (Optional[str]) – The name of the graph to add the documents to.
update_graph_definition_if_exists (bool) – Whether to update the graph definition if it already exists.
batch_size (int) – The number of documents to process in each batch.
use_one_entity_collection (bool) – Whether to use one entity collection for all nodes.
insert_async (bool) – Whether to insert the documents asynchronously.
source_collection_name (Union[str, None]) – The name of the source collection.
source_edge_collection_name (Union[str, None]) – The name of the source edge collection.
entity_collection_name (Union[str, None]) – The name of the entity collection.
entity_edge_collection_name (Union[str, None]) – The name of the entity edge collection.
embeddings (Union[Embeddings, None]) – The embeddings model to use.
embedding_field (str) – The field to use for the embedding.
embed_source (bool) – Whether to embed the source document.
embed_nodes (bool) – Whether to embed the nodes.
embed_relationships (bool) – Whether to embed the relationships.
capitalization_strategy (str) – The capitalization strategy to use.
- Returns:
None
- Return type:
None
- Raises:
ValueError – If the capitalization strategy is not ‘lower’, ‘upper’, or ‘none’.
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
ArangoCollectionError – If the collection cannot be created.
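Two of the parameters above, capitalization_strategy and batch_size, lend themselves to a short illustration. The sketch below shows one plausible reading of them in plain Python; the helpers `apply_capitalization` and `batched` are hypothetical and are not the library's internals.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def apply_capitalization(text: str, strategy: str = "none") -> str:
    """Normalize text according to the documented strategy values."""
    if strategy == "lower":
        return text.lower()
    if strategy == "upper":
        return text.upper()
    if strategy == "none":
        return text
    raise ValueError("capitalization_strategy must be 'lower', 'upper', or 'none'")


def batched(items: Sequence[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


names = ["Arya Stark", "Jon Snow", "Sansa Stark"]
normalized = [apply_capitalization(name, "lower") for name in names]
for batch in batched(normalized, batch_size=2):
    print(batch)
```

Normalizing capitalization before insertion helps avoid near-duplicate entities such as "Jon Snow" and "jon snow" ending up as separate nodes.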
- classmethod from_db_credentials(url: str | None = None, dbname: str | None = None, username: str | None = None, password: str | None = None) Any[source]
Convenience constructor that builds Arango DB from credentials.
- Parameters:
url (str) – ArangoDB URL. Can be passed in as a named argument or set as the environment variable ARANGODB_URL. Defaults to “http://localhost:8529”.
dbname (str) – ArangoDB database name. Can be passed in as a named argument or set as the environment variable ARANGODB_DBNAME. Defaults to “_system”.
username (str) – Can be passed in as a named argument or set as the environment variable ARANGODB_USERNAME. Defaults to “root”.
password (str) – Can be passed in as a named argument or set as the environment variable ARANGODB_PASSWORD. Defaults to “”.
- Returns:
An arango.database.StandardDatabase.
- Return type:
Any
- Raises:
ArangoClientError – If the ArangoDB client cannot be created.
ArangoServerError – If the ArangoDB server cannot be reached.
Chains
Question answering over a graph.
- class langchain_arangodb.chains.graph_qa.arangodb.ArangoGraphQAChain(*, name: str | None = None, memory: ~langchain_classic.base_memory.BaseMemory | None = None, callbacks: list[~langchain_core.callbacks.base.BaseCallbackHandler] | ~langchain_core.callbacks.base.BaseCallbackManager | None = None, verbose: bool = <factory>, tags: list[str] | None = None, metadata: dict[str, ~typing.Any] | None = None, callback_manager: ~langchain_core.callbacks.base.BaseCallbackManager | None = None, graph: ~langchain_arangodb.graphs.arangodb_graph.ArangoGraph, embedding: ~langchain_core.embeddings.embeddings.Embeddings | None = None, query_cache_collection_name: str = 'Queries', aql_generation_chain: ~langchain_core.runnables.base.Runnable[~typing.Dict[str, ~typing.Any], ~typing.Any], aql_fix_chain: ~langchain_core.runnables.base.Runnable[~typing.Dict[str, ~typing.Any], ~typing.Any], qa_chain: ~langchain_core.runnables.base.Runnable[~typing.Dict[str, ~typing.Any], ~typing.Any], input_key: str = 'query', output_key: str = 'result', use_query_cache: bool = False, query_cache_similarity_threshold: float = 0.8, include_history: bool = False, max_history_messages: int = 10, chat_history_store: ~langchain_arangodb.chat_message_histories.arangodb.ArangoChatMessageHistory | None = None, top_k: int = 10, aql_examples: str = '', return_aql_query: bool = False, return_aql_result: bool = False, max_aql_generation_attempts: int = 3, execute_aql_query: bool = True, allow_dangerous_requests: bool = False, output_list_limit: int = 32, output_string_limit: int = 256, force_read_only_query: bool = False)[source]
Bases:
Chain
Chain for question-answering against a graph by generating AQL statements.
- Security note: Make sure that the database connection uses credentials that are narrowly scoped to only include necessary permissions. Failure to do so may result in data corruption or loss, since the calling code may attempt commands that would delete or mutate data if appropriately prompted, or read sensitive data if such data is present in the database. The best way to guard against such negative outcomes is to (as appropriate) limit the permissions granted to the credentials used with this tool.
See https://python.langchain.com/docs/security for more information.
- graph: ArangoGraph
The ArangoGraph instance to use for the chain.
- embedding: Embeddings | None
Embedding model to use for the chain.
- query_cache_collection_name: str
Name of the collection for storing queries.
- aql_generation_chain: Runnable[Dict[str, Any], Any]
Chain to use for AQL generation.
- aql_fix_chain: Runnable[Dict[str, Any], Any]
Chain to use for AQL fix.
- qa_chain: Runnable[Dict[str, Any], Any]
Chain to use for QA.
- input_key: str
Key to use for the input.
- output_key: str
Key to use for the output.
- use_query_cache: bool
Whether to use the query cache.
- query_cache_similarity_threshold: float
Similarity threshold for matching cached queries.
- include_history: bool
Whether to include the chat history in the prompt.
- max_history_messages: int
Maximum number of messages to include in the chat history.
- chat_history_store: ArangoChatMessageHistory | None
ArangoChatMessageHistory instance to store chat history.
- top_k: int
Number of results to return from the query
- aql_examples: str
A set of example AQL queries included in the prompt to support few-shot learning.
- return_aql_query: bool
Specify whether to return the AQL Query in the output dictionary
- return_aql_result: bool
Specify whether to return the AQL JSON Result in the output dictionary
- max_aql_generation_attempts: int
The maximum number of AQL generation attempts to make.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- execute_aql_query: bool
If False, the AQL Query is only explained & returned, not executed
- allow_dangerous_requests: bool
Forced user opt-in to acknowledge that the chain can make dangerous requests.
- output_list_limit: int
Maximum list length to include in the response prompt. Truncated if longer.
- output_string_limit: int
Maximum string length to include in the response prompt. Truncated if longer.
- force_read_only_query: bool
If True, the query is checked for write operations and raises an error if a write operation is detected.
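As a rough illustration of what a read-only check involves, the sketch below scans a query for AQL data-modification keywords (INSERT, UPDATE, REPLACE, REMOVE, UPSERT). It is a deliberately naive stand-in, not the chain's actual check; the helpers `find_write_operation` and `ensure_read_only` are hypothetical.

```python
import re
from typing import Optional

# AQL data-modification keywords.
WRITE_KEYWORDS = ("INSERT", "UPDATE", "REPLACE", "REMOVE", "UPSERT")
_WRITE_PATTERN = re.compile(
    r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE
)


def find_write_operation(aql: str) -> Optional[str]:
    """Return the first write keyword found in the query, if any."""
    match = _WRITE_PATTERN.search(aql)
    return match.group(1).upper() if match else None


def ensure_read_only(aql: str) -> None:
    """Raise ValueError when the query appears to modify data."""
    operation = find_write_operation(aql)
    if operation is not None:
        raise ValueError(f"Write operation detected: {operation}")


ensure_read_only("FOR doc IN Characters LIMIT 1 RETURN doc")  # passes silently
print(find_write_operation("FOR doc IN Characters REMOVE doc IN Characters"))
```

A keyword scan like this also flags keywords that appear inside string literals or attribute names, so it errs on the side of rejecting queries. It complements, rather than replaces, narrowly scoped database credentials.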
- property input_keys: List[str]
Get the input keys for the chain.
- property output_keys: List[str]
Get the output keys for the chain.
- classmethod from_llm(llm: BaseLanguageModel, *, qa_prompt: BasePromptTemplate | None = None, aql_generation_prompt: BasePromptTemplate | None = None, aql_fix_prompt: BasePromptTemplate | None = None, **kwargs: Any) ArangoGraphQAChain[source]
Initialize from LLM.
- Parameters:
llm (BaseLanguageModel) – The large language model to use.
embedding (Embeddings) – The embedding model to use.
use_query_cache (bool) – If True, enables reuse of similar past queries from cache.
query_cache_similarity_threshold (float) – The similarity threshold to consider a query as a match in the cache.
query_cache_collection_name (str) – Name of the collection for storing queries.
include_history (bool) – If True, includes recent chat history in the prompt to provide context for query generation.
max_history_messages (int) – The maximum number of messages to include in the chat history.
chat_history_store (ArangoChatMessageHistory) – ArangoChatMessageHistory instance to store chat history.
qa_prompt (BasePromptTemplate) – The prompt to use for the QA chain.
aql_generation_prompt (BasePromptTemplate) – The prompt to use for the AQL generation chain.
aql_fix_prompt (BasePromptTemplate) – The prompt to use for the AQL fix chain.
kwargs (Any) – Additional keyword arguments.
- Returns:
The initialized ArangoGraphQAChain.
- Return type:
ArangoGraphQAChain
- Raises:
ValueError – If the LLM is not provided.
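The idea behind use_query_cache and query_cache_similarity_threshold can be sketched as a nearest-match lookup over embedded questions. The example below is a self-contained illustration under stated assumptions: cosine similarity as the metric, an in-memory list instead of the cache collection, and the hypothetical helper `lookup_cached_query`. The chain's real lookup may differ.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup_cached_query(
    question_vector: Sequence[float],
    cache: List[Tuple[Sequence[float], str]],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return the cached AQL whose question is most similar, if above threshold."""
    best_score, best_aql = threshold, None
    for cached_vector, aql in cache:
        score = cosine_similarity(question_vector, cached_vector)
        if score >= best_score:
            best_score, best_aql = score, aql
    return best_aql


# Hypothetical cache of (question embedding, AQL query) pairs.
cache = [
    ([1.0, 0.0], "FOR doc IN Characters LIMIT 1 RETURN doc"),
    ([0.0, 1.0], "RETURN LENGTH(Characters)"),
]

print(lookup_cached_query([0.9, 0.1], cache))  # close to the first entry
print(lookup_cached_query([1.0, 1.0], cache))  # below the threshold for both
```

A higher threshold reuses cached queries only for near-identical questions, while a lower one trades accuracy for fewer generation calls.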
Query Constructors
Utilities
- class langchain_arangodb.vectorstores.utils.DistanceStrategy(*values)[source]
Bases:
str, Enum
Enumerator of the distance strategies for calculating distances between vectors.
Note that use_approx is not supported for the following distance strategies: JACCARD, MAX_INNER_PRODUCT, DOT_PRODUCT.
- EUCLIDEAN_DISTANCE = 'EUCLIDEAN_DISTANCE'
- MAX_INNER_PRODUCT = 'MAX_INNER_PRODUCT'
- DOT_PRODUCT = 'DOT_PRODUCT'
- JACCARD = 'JACCARD'
- COSINE = 'COSINE'
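For readers unfamiliar with the measures behind these names, the sketch below computes each one in plain Python. It uses the standard textbook definitions and is not the library's or ArangoDB's implementation. Treating JACCARD as a set measure and MAX_INNER_PRODUCT as ranking by dot product are assumptions made for this illustration.

```python
import math
from typing import Sequence, Set


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the normalized vectors, in the range [-1, 1]."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


print(euclidean_distance([0, 0], [3, 4]))
print(dot_product([1, 2, 3], [4, 5, 6]))
print(cosine_similarity([1, 0], [0, 1]))
print(jaccard({1, 2, 3}, {2, 3, 4}))
```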