chroma_get_documents
Retrieve documents from a Chroma collection with optional filtering by IDs, metadata, or content. Supports logical operators, regex, and custom includes for precise querying and response customization.
Instructions
Get documents from a Chroma collection with optional filtering.
Args:
- collection_name: Name of the collection to get documents from
- ids: Optional list of document IDs to retrieve
- where: Optional metadata filters using Chroma's query operators. Examples:
  - Simple equality: {"metadata_field": "value"}
  - Comparison: {"metadata_field": {"$gt": 5}}
  - Logical AND: {"$and": [{"field1": {"$eq": "value1"}}, {"field2": {"$gt": 5}}]}
  - Logical OR: {"$or": [{"field1": {"$eq": "value1"}}, {"field1": {"$eq": "value2"}}]}
- where_document: Optional document content filters. Examples:
  - Contains: {"$contains": "value"}
  - Not contains: {"$not_contains": "value"}
  - Regex: {"$regex": "[a-z]+"}
  - Not regex: {"$not_regex": "[a-z]+"}
  - Logical AND: {"$and": [{"$contains": "value1"}, {"$not_regex": "[a-z]+"}]}
  - Logical OR: {"$or": [{"$regex": "[a-z]+"}, {"$not_contains": "value2"}]}
- include: List of fields to include in the response. Defaults to documents and metadatas.
- limit: Optional maximum number of documents to return
- offset: Optional number of documents to skip before returning results

Returns:
- Dictionary containing the matching documents, their IDs, and the requested includes
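The filter shapes above can be combined into a single arguments payload. The following is a minimal sketch, assuming a hypothetical helper `build_get_arguments` (not part of chroma-mcp) that drops unset options before the call is made. The collection name `articles` and the metadata keys `topic` and `year` are illustrative.

```python
import json
from typing import Any, Dict, List, Optional


def build_get_arguments(
    collection_name: str,
    ids: Optional[List[str]] = None,
    where: Optional[Dict[str, Any]] = None,
    where_document: Optional[Dict[str, Any]] = None,
    include: Optional[List[str]] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble chroma_get_documents arguments, omitting unset options."""
    candidates = {
        "collection_name": collection_name,
        "ids": ids,
        "where": where,
        "where_document": where_document,
        "include": include,
        "limit": limit,
        "offset": offset,
    }
    # Keep only the options the caller actually provided.
    return {name: value for name, value in candidates.items() if value is not None}


arguments = build_get_arguments(
    "articles",
    # Metadata filter: topic equals "databases" AND year greater than 2020.
    where={"$and": [{"topic": {"$eq": "databases"}}, {"year": {"$gt": 2020}}]},
    # Content filter: document text contains the word "vector".
    where_document={"$contains": "vector"},
    limit=10,
)
print(json.dumps(arguments, indent=2))
```

Omitting an option rather than sending it as null leaves the tool's own defaults in place, for example the default value of include.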
Input Schema
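| Name | Required | Description | Default |
|---|---|---|---|
| collection_name | Yes | Name of the collection to get documents from | |
| ids | No | List of document IDs to retrieve | None |
| include | No | List of fields to include in the response | ["documents", "metadatas"] |
| limit | No | Maximum number of documents to return | None |
| offset | No | Number of documents to skip before returning results | None |
| where | No | Metadata filters using Chroma's query operators | None |
| where_document | No | Document content filters | None |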
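The limit and offset parameters together allow paging through a large collection. The sketch below is one possible approach, not part of chroma-mcp: `iter_pages` is a hypothetical helper, and `fake_fetch` is an in-memory stand-in for a real call to chroma_get_documents that returns only an "ids" list.

```python
from typing import Any, Callable, Dict, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    page_size: int,
) -> Iterator[Dict[str, Any]]:
    """Yield result pages by advancing offset until a short or empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        if not page["ids"]:
            break  # nothing left to read
        yield page
        if len(page["ids"]) < page_size:
            break  # a short page means this was the last one
        offset += page_size


# In-memory stand-in for the tool call (assumption, for illustration only).
_ALL_IDS = [f"doc-{i}" for i in range(5)]


def fake_fetch(limit: int, offset: int) -> Dict[str, Any]:
    return {"ids": _ALL_IDS[offset:offset + limit]}


pages = list(iter_pages(fake_fetch, page_size=2))
print([page["ids"] for page in pages])
```

In real use, `fetch_page` would wrap the tool call and pass its two arguments through as limit and offset.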
Implementation Reference
- src/chroma_mcp/server.py:442-490 (handler)

  The handler function for the 'chroma_get_documents' tool. It is decorated with @mcp.tool() for registration and implements the core logic: it retrieves a Chroma collection and calls its .get() method with the provided parameters (ids, where, where_document, include, limit, offset), returning the results or raising an exception on failure.

      @mcp.tool()
      async def chroma_get_documents(
          collection_name: str,
          ids: List[str] | None = None,
          where: Dict | None = None,
          where_document: Dict | None = None,
          include: List[str] = ["documents", "metadatas"],
          limit: int | None = None,
          offset: int | None = None
      ) -> Dict:
          """Get documents from a Chroma collection with optional filtering.

          Args:
              collection_name: Name of the collection to get documents from
              ids: Optional list of document IDs to retrieve
              where: Optional metadata filters using Chroma's query operators
                     Examples:
                     - Simple equality: {"metadata_field": "value"}
                     - Comparison: {"metadata_field": {"$gt": 5}}
                     - Logical AND: {"$and": [{"field1": {"$eq": "value1"}}, {"field2": {"$gt": 5}}]}
                     - Logical OR: {"$or": [{"field1": {"$eq": "value1"}}, {"field1": {"$eq": "value2"}}]}
              where_document: Optional document content filters
                     Examples:
                     - Contains: {"$contains": "value"}
                     - Not contains: {"$not_contains": "value"}
                     - Regex: {"$regex": "[a-z]+"}
                     - Not regex: {"$not_regex": "[a-z]+"}
                     - Logical AND: {"$and": [{"$contains": "value1"}, {"$not_regex": "[a-z]+"}]}
                     - Logical OR: {"$or": [{"$regex": "[a-z]+"}, {"$not_contains": "value2"}]}
              include: List of what to include in response. By default, this will include documents, and metadatas.
              limit: Optional maximum number of documents to return
              offset: Optional number of documents to skip before returning results

          Returns:
              Dictionary containing the matching documents, their IDs, and requested includes
          """
          client = get_chroma_client()
          try:
              collection = client.get_collection(collection_name)
              return collection.get(
                  ids=ids,
                  where=where,
                  where_document=where_document,
                  include=include,
                  limit=limit,
                  offset=offset
              )
          except Exception as e:
              raise Exception(f"Failed to get documents from collection '{collection_name}': {str(e)}") from e
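Because the handler returns the result of collection.get() unchanged, callers receive Chroma's column-oriented layout: parallel lists under keys such as "ids", "documents", and "metadatas". The sketch below shows one way to turn that into per-document records. `to_records` is a hypothetical helper, the sample data is made up, and the code assumes that fields left out of include may be absent or None.

```python
from typing import Any, Dict, List


def to_records(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: zip the parallel result lists into one dict per document."""
    ids = result["ids"]
    # Fields not requested via include are treated as missing for every document.
    documents = result.get("documents") or [None] * len(ids)
    metadatas = result.get("metadatas") or [None] * len(ids)
    return [
        {"id": doc_id, "document": text, "metadata": metadata}
        for doc_id, text, metadata in zip(ids, documents, metadatas)
    ]


# Illustrative result shaped like a get() response (not real data).
sample_result = {
    "ids": ["a", "b"],
    "documents": ["first text", "second text"],
    "metadatas": [{"year": 2021}, None],
}

records = to_records(sample_result)
for record in records:
    print(record["id"], record["document"], record["metadata"])
```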