
ChromaDB MCP Server

A comprehensive Model Context Protocol (MCP) server that exposes ChromaDB vector database operations as MCP tools. This server allows AI assistants and other MCP clients to interact with ChromaDB databases through a standardized interface.

Features

  • Collection Management: Create, list, get, and delete collections

  • Document Operations: Add, query, update, and delete documents

  • Vector Similarity Search: Perform semantic searches using text or embedding queries

  • Metadata Filtering: Filter queries by document metadata and content

  • Collection Statistics: Get information about collections and their contents

  • Async/Await Support: Built with asyncio for high performance

Installation

  1. Create and activate a virtual environment:

# Create virtual environment
python3 -m venv venv

# Activate the virtual environment
source venv/bin/activate

  2. Install the project using Hatch:

# Install in development mode with all dependencies
pip install hatch
hatch env create
hatch run pip install -e .

# Or simply use pip with the project (after installing hatch)
pip install -e ".[dev]"

# Optional: Install additional embedding providers
pip install -e ".[openai]"                 # For OpenAI embeddings
pip install -e ".[cohere]"                 # For Cohere embeddings (LangChain compatible)
pip install -e ".[instructor]"             # For Instructor embeddings (LangChain compatible)
pip install -e ".[sentence-transformers]"  # For Sentence Transformers
pip install -e ".[huggingface]"            # For HuggingFace models

  3. Verify the installation:

hatch run chroma-mcp-server --help
# or
python -m app.chroma_mcp_server --help

  4. Run the server:

# Using Hatch (recommended)
hatch run serve

# Or using Python directly (make sure the virtual environment is activated)
python -m app.chroma_mcp_server

# Or using the script entry point
chroma-mcp-server

# Run with a specific transport (stdio, http, sse)
python -m app.chroma_mcp_server stdio
python -m app.chroma_mcp_server http 127.0.0.1 8000
python -m app.chroma_mcp_server sse 127.0.0.1 8000

  5. Ensure ChromaDB is running (required for HTTP mode). You can start it locally:

# For local persistent storage
chroma run --path ./chroma_data

# Or for in-memory (will be lost on restart)
chroma run

ChromaDB Connection Modes

The server supports multiple ChromaDB connection modes (a client-selection sketch follows the mode descriptions below):

HTTP Mode (Default)

  • Connects to a running ChromaDB server via HTTP

  • Requires CHROMA_HOST and CHROMA_PORT (default: localhost:8000)

  • Best for: Production, shared databases, remote servers

Persistent Mode

  • Uses local ChromaDB persistent client (no network needed)

  • Uses CHROMA_PERSIST_DIRECTORY for the storage path (defaults to ./chroma_db)

  • Best for: Local development, single-user scenarios

Memory Mode

  • Uses in-memory ChromaDB (fastest, data lost on restart)

  • No further configuration needed; just set CHROMA_MODE=memory

  • Best for: Testing, temporary data, quick prototyping
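
Under the hood, these modes correspond to ChromaDB's three client types. The following is a minimal illustrative sketch of how such a selection could be implemented with the chromadb package; it is not necessarily the server's actual code, and the make_client helper name is hypothetical:

```python
import os

import chromadb


def make_client():
    """Pick a ChromaDB client based on CHROMA_MODE (illustrative sketch)."""
    mode = os.getenv("CHROMA_MODE", "http")
    if mode == "http":
        # Talk to a running ChromaDB server over the network
        return chromadb.HttpClient(
            host=os.getenv("CHROMA_HOST", "localhost"),
            port=int(os.getenv("CHROMA_PORT", "8000")),
        )
    if mode == "persistent":
        # Local on-disk database, no server required
        return chromadb.PersistentClient(
            path=os.getenv("CHROMA_PERSIST_DIRECTORY", "./chroma_db")
        )
    # In-memory database; contents vanish when the process exits
    return chromadb.EphemeralClient()
```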

Configuration Precedence and CLI Options

  • CLI flags override environment variables: if an option is passed on the command line, it takes precedence over the corresponding .env value.

Available CLI overrides:

# Backend selection
--chroma-mode [http|persistent|memory]

# HTTP mode parameters (when --chroma-mode=http)
--chroma-host <host>
--chroma-port <port>

# Persistent mode parameter (when --chroma-mode=persistent)
--chroma-persist-directory <path>

Usage Examples

# HTTP mode (default) - connect to a ChromaDB server
hatch run chroma-mcp-server stdio --chroma-mode http --chroma-host localhost --chroma-port 8000

# Persistent mode - local database with persistence
hatch run chroma-mcp-server stdio --chroma-mode persistent --chroma-persist-directory ./my_data

# Memory mode - fast in-memory database
hatch run chroma-mcp-server stdio --chroma-mode memory

# SSE transport with persistent backend and explicit host/port
hatch run chroma-mcp-server sse 0.0.0.0 8091 --chroma-mode persistent --chroma-persist-directory ./chroma_data

# HTTP transport with explicit host/port and HTTP backend
hatch run chroma-mcp-server http 127.0.0.1 8090 --chroma-mode http --chroma-host localhost --chroma-port 8000

# CLI options override .env values
hatch run chroma-mcp-server stdio --chroma-mode http --chroma-host my-chroma-server.com --chroma-port 8080

Environment Variables

The server reads the following variables from the environment or a .env file (an example file follows the list):
  • CHROMA_HOST: ChromaDB host (default: localhost)

  • CHROMA_PORT: ChromaDB port (default: 8000); optional

  • CHROMA_PERSIST_DIRECTORY: Path for persistent storage (optional)

  • CHROMA_MODE: Connection mode - http (default), persistent, or memory

  • DEFAULT_EMBEDDING_PROVIDER: Default embedding provider (sentence-transformers, openai, cohere, instructor, huggingface)

  • DEFAULT_EMBEDDING_MODEL: Default embedding model name

  • OPENAI_API_KEY: OpenAI API key for OpenAI embeddings

  • COHERE_API_KEY: Cohere API key for Cohere embeddings
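
A minimal .env might look like the following (values are illustrative; pick the variables relevant to your connection mode):

```
# .env - illustrative example
CHROMA_MODE=persistent
CHROMA_PERSIST_DIRECTORY=./chroma_db
DEFAULT_EMBEDDING_PROVIDER=sentence-transformers
DEFAULT_EMBEDDING_MODEL=all-MiniLM-L6-v2
```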

Development

Using Hatch Environments

This project uses Hatch for environment management and development workflow:

# Create and enter development environment
hatch shell

# Run tests
hatch run test

# Format code
hatch run format

# Lint code
hatch run lint

# Type check
hatch run type-check

# Run server in development mode
hatch run python -m app.chroma_mcp_server

Virtual Environment Management

# Create virtual environment
python -m venv venv

# Activate the virtual environment
source venv/bin/activate

# Deactivate when done
deactivate

Project Structure

  • src/app/ - Main application package

    • chroma_mcp_server.py - Main MCP server implementation

    • __init__.py - Package initialization

  • pyproject.toml - Project configuration and dependencies

  • example_usage.py - Usage examples and demonstrations

  • Configuration files for MCP clients in JSON format

Available MCP Tools

Collection Management

list_collections()

Lists all collections in the database with their metadata.

Returns: JSON array of collection information

create_collection(name, metadata, embedding_config)

Creates a new collection.

Parameters:

  • name (str): Collection name

  • metadata (dict, optional): Collection metadata

  • embedding_config (dict, optional): Embedding function configuration:

    • provider (str): "sentence-transformers", "openai", "cohere", "instructor", or "huggingface"

    • model (str): Model name (e.g., "all-MiniLM-L6-v2", "text-embedding-ada-002", "embed-english-v3.0")

    • api_key (str, optional): API key for provider services

get_collection(name)

Gets information about a specific collection.

Parameters:

  • name (str): Collection name

Returns: Collection information including document count

delete_collection(name)

Deletes a collection and all its documents.

Parameters:

  • name (str): Collection name
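
Putting the four collection-management tools together, a typical lifecycle might look like the sketch below. It uses the same await-style calls as the Usage Examples later in this document and assumes, per the tool reference above, that list_collections returns a JSON string:

```python
import json

# Create a collection with metadata and a local embedding model
await create_collection(
    "articles",
    metadata={"owner": "demo"},
    embedding_config={
        "provider": "sentence-transformers",
        "model": "all-MiniLM-L6-v2",
    },
)

# Enumerate collections and inspect the new one (includes document count)
collections = json.loads(await list_collections())
info = await get_collection("articles")

# Remove the collection and all of its documents
await delete_collection("articles")
```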

Document Operations

add_documents(collection_name, documents)

Adds documents to a collection.

Parameters:

  • collection_name (str): Target collection name

  • documents (array): Array of document objects with:

    • id (str): Unique document identifier

    • content (str): Document text content

    • metadata (dict, optional): Document metadata

    • embedding (array, optional): Pre-computed embedding vector

query_collection(query)

Performs similarity search on a collection.

Parameters:

  • query (object): Query configuration with:

    • collection_name (str): Target collection

    • query_texts (array, optional): Text queries for semantic search

    • query_embeddings (array, optional): Embedding vectors for direct similarity search

    • n_results (int, default: 10): Number of results to return

    • where (dict, optional): Metadata filters

    • where_document (dict, optional): Document content filters

Returns: Query results with documents, metadata, and similarity scores
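
Metadata and content filters can be combined in a single query. The sketch below assumes ChromaDB's standard document-filter operators (such as $contains) are passed through unchanged by the server:

```python
# Combine a metadata filter with a document-content filter (illustrative)
query = {
    "collection_name": "my_documents",
    "query_texts": ["vector databases"],
    "n_results": 5,
    "where": {"year": 2024},                      # metadata filter
    "where_document": {"$contains": "ChromaDB"},  # content filter
}
results = await query_collection(query)
```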

update_document(update)

Updates an existing document.

Parameters:

  • update (object): Update configuration with:

    • collection_name (str): Target collection

    • document_id (str): Document to update

    • content (str, optional): New content

    • metadata (dict, optional): New metadata

    • embedding (array, optional): New embedding

delete_documents(collection_name, document_ids)

Deletes documents from a collection.

Parameters:

  • collection_name (str): Target collection

  • document_ids (array): Array of document IDs to delete

get_document(collection_name, document_id)

Retrieves a specific document.

Parameters:

  • collection_name (str): Target collection

  • document_id (str): Document ID to retrieve

Utility Tools

collection_stats(collection_name)

Gets statistics for a collection.

Parameters:

  • collection_name (str): Target collection

peek_collection(collection_name, limit)

Peeks at the first few documents in a collection.

Parameters:

  • collection_name (str): Target collection

  • limit (int, default: 10): Maximum documents to return
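
For a quick health check of a collection, the two utility tools can be combined; here is a short sketch in the same await style as the examples below:

```python
# Basic statistics (e.g., document count) for a collection
stats = await collection_stats("my_documents")

# Preview up to five documents without running a similarity query
preview = await peek_collection("my_documents", limit=5)
```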

Usage Examples

LangChain-Compatible Embedding Examples

# Create collection with Cohere embeddings (popular in LangChain)
await create_collection("docs", embedding_config={
    "provider": "cohere",
    "model": "embed-english-v3.0",
    "api_key": "your-cohere-key"
})

# Create collection with Instructor embeddings (optimized for instructions)
await create_collection("instructions", embedding_config={
    "provider": "instructor",
    "model": "hkunlp/instructor-xl"
})

# Create collection with OpenAI embeddings (standard in LangChain)
await create_collection("embeddings", embedding_config={
    "provider": "openai",
    "model": "text-embedding-ada-002"
})

# Environment-based configuration (great for LangChain integration)
import os
os.environ["DEFAULT_EMBEDDING_PROVIDER"] = "cohere"
os.environ["COHERE_API_KEY"] = "your-key"
os.environ["DEFAULT_EMBEDDING_MODEL"] = "embed-english-v3.0"

Adding and Querying Documents

# Add documents
documents = [
    {
        "id": "doc1",
        "content": "This is a sample document about AI",
        "metadata": {"category": "tech", "year": 2024}
    },
    {
        "id": "doc2",
        "content": "Another document about machine learning",
        "metadata": {"category": "tech", "year": 2024}
    }
]
await add_documents("my_documents", documents)

# Query by text
query = {
    "collection_name": "my_documents",
    "query_texts": ["artificial intelligence"],
    "n_results": 5
}
results = await query_collection(query)

# Query with metadata filters
query = {
    "collection_name": "my_documents",
    "query_texts": ["technology"],
    "where": {"category": "tech"},
    "n_results": 10
}
results = await query_collection(query)

Document Management

# Update a document
await update_document({
    "collection_name": "my_documents",
    "document_id": "doc1",
    "content": "Updated content about AI and machine learning",
    "metadata": {"category": "tech", "year": 2024, "updated": True}
})

# Get a specific document
doc = await get_document("my_documents", "doc1")

# Delete documents
await delete_documents("my_documents", ["doc1", "doc2"])

Running the Server

Start the MCP server:

python -m app.chroma_mcp_server

The server will start and expose MCP tools that can be used by compatible MCP clients.

Transport Modes

The server supports multiple transport protocols for different use cases:

STDIO Transport (Default)

  • Best for: Local development, Claude Desktop integration

  • Usage: python -m app.chroma_mcp_server stdio

  • Characteristics: Direct communication via standard input/output

HTTP Transport

  • Best for: Network accessibility, multiple concurrent clients

  • Usage: python -m app.chroma_mcp_server http 127.0.0.1 8000

  • Access: Available at http://127.0.0.1:8000/mcp

SSE Transport (Server-Sent Events)

  • Best for: Real-time streaming, web integration

  • Usage: python -m app.chroma_mcp_server sse 127.0.0.1 8000

  • Note: Legacy transport; HTTP is recommended for new projects

MCP Client Configuration

Claude Desktop Configuration

To use this server with Claude Desktop or other MCP clients that support the stdio protocol, add the following configuration to your MCP client config file:

Basic Configuration (claude-desktop-config.json):

{ "mcpServers": { "chroma-db": { "command": "python", "args": ["-m", "app.chroma_mcp_server", "stdio"], "env": { "CHROMA_HOST": "localhost", "CHROMA_PORT": "8000" } } } }

SSE Configuration for network access:

{ "mcpServers": { "chroma-sse": { "command": "python", "args": ["-m", "app.chroma_mcp_server", "sse", "127.0.0.1", "8091", "--chroma-mode", "persistent", "--chroma-persist-directory", "./chroma_data"], "cwd": "/home/zoomrec/projects/chroma-mcp-server", "env": { "LOG_LEVEL": "INFO" } } } }

Configuration Parameters

  • command: The command to run (usually python)

  • args: Command line arguments (path to server script)

  • cwd: Working directory (optional, useful for relative paths)

  • env: Environment variables for ChromaDB connection:

    • CHROMA_HOST: ChromaDB server hostname

    • CHROMA_PORT: ChromaDB server port

    • CHROMA_PERSIST_DIRECTORY: Local persistent storage path

    • LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
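
As a further illustration, a stdio configuration that runs the server against the in-memory backend (no ChromaDB server required) might look like this; the server name chroma-memory is arbitrary:

```json
{
  "mcpServers": {
    "chroma-memory": {
      "command": "python",
      "args": ["-m", "app.chroma_mcp_server", "stdio"],
      "env": {
        "CHROMA_MODE": "memory",
        "LOG_LEVEL": "DEBUG"
      }
    }
  }
}
```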

Integration with LangChain

This MCP server is designed to work seamlessly with LangChain applications. The embedding providers are specifically chosen to match LangChain's commonly used embeddings:

Supported LangChain Embeddings

  • OpenAI: text-embedding-ada-002 - Most popular in LangChain

  • Cohere: embed-english-v3.0 - High-performance alternative

  • Instructor: hkunlp/instructor-xl - Instruction-tuned embeddings

  • Sentence Transformers: all-MiniLM-L6-v2 - Local, no API required

  • HuggingFace: Any model from the Hub

LangChain Integration Example

# In your LangChain application
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings, CohereEmbeddings

# Use with LangChain's Chroma wrapper
embeddings = OpenAIEmbeddings(
    model="text-embedding-ada-002",
    openai_api_key="your-key"
)

# Create the Chroma collection via the MCP server first,
# then use it in LangChain
vectorstore = Chroma(
    collection_name="my_docs",
    embedding_function=embeddings,
    client=your_chroma_client
)

Integration with AI Assistants

This server is designed to work with AI assistants that support the Model Context Protocol. Once running, the assistant can:

  1. Discover available tools automatically

  2. Use the tools to interact with ChromaDB

  3. Perform vector similarity searches

  4. Manage collections and documents

  5. Build RAG (Retrieval-Augmented Generation) applications

Error Handling

The server includes comprehensive error handling and will return descriptive error messages for:

  • Connection issues with ChromaDB

  • Invalid collection or document IDs

  • Query parameter errors

  • Network timeouts

Performance Considerations

  • Uses async/await for non-blocking operations

  • Supports both local persistent and remote ChromaDB instances

  • Configurable query result limits

  • Efficient batch operations for document management

Troubleshooting

Common Issues

  1. Connection refused: Ensure ChromaDB is running on the specified host/port

  2. Collection not found: Verify collection names are correct

  3. Document not found: Check document IDs exist in the collection

  4. Import errors: Ensure all dependencies are installed from pyproject.toml

Logging

The server logs to stdout at INFO level by default. You can adjust the level by setting the LOG_LEVEL environment variable (see MCP Client Configuration above) or by modifying the logging configuration in the server file.

License

This project is open source and available under the MIT License.
