ChromaDB MCP Server
A comprehensive Model Context Protocol (MCP) server that exposes ChromaDB vector database operations as MCP tools. This server allows AI assistants and other MCP clients to interact with ChromaDB databases through a standardized interface.
Features
Collection Management: Create, list, get, and delete collections
Document Operations: Add, query, update, and delete documents
Vector Similarity Search: Perform semantic searches using text or embedding queries
Metadata Filtering: Filter queries by document metadata and content
Collection Statistics: Get information about collections and their contents
Async/Await Support: Built with asyncio for high performance
Installation
Create and activate a virtual environment:
# Create virtual environment
python3 -m venv venv
# Activate the virtual environment
source venv/bin/activate
Install the project using Hatch:
# Install in development mode with all dependencies
pip install hatch
hatch env create
hatch run pip install -e .
# Or simply use pip with the project (after installing hatch)
pip install -e ".[dev]"
# Optional: Install additional embedding providers
pip install -e ".[openai]" # For OpenAI embeddings
pip install -e ".[cohere]" # For Cohere embeddings (LangChain compatible)
pip install -e ".[instructor]" # For Instructor embeddings (LangChain compatible)
pip install -e ".[sentence-transformers]" # For Sentence Transformers
pip install -e ".[huggingface]" # For HuggingFace models
Verify installation:
hatch run chroma-mcp-server --help
# or
python -m app.chroma_mcp_server --help
Run the server:
# Using Hatch (recommended)
hatch run serve
# Or using Python directly (make sure virtual env is activated)
python -m app.chroma_mcp_server
# Or using the script entry point
chroma-mcp-server
# Run with specific transport (stdio, http, sse)
python -m app.chroma_mcp_server stdio
python -m app.chroma_mcp_server http 127.0.0.1 8000
python -m app.chroma_mcp_server sse 127.0.0.1 8000
Ensure ChromaDB is running. You can start it locally:
# For local persistent storage
chroma run --path ./chroma_data
# Or for in-memory (will be lost on restart)
chroma run
ChromaDB Connection Modes
The server supports multiple ChromaDB connection modes:
HTTP Mode (Default)
Connects to a running ChromaDB server via HTTP
Requires CHROMA_HOST and CHROMA_PORT (default: localhost:8000)
Best for: Production, shared databases, remote servers
Persistent Mode
Uses local ChromaDB persistent client (no network needed)
Requires CHROMA_PERSIST_DIRECTORY, or defaults to ./chroma_db
Best for: Local development, single-user scenarios
Memory Mode
Uses in-memory ChromaDB (fastest, data lost on restart)
No configuration needed - just set CHROMA_MODE=memory
Best for: Testing, temporary data, quick prototyping
Configuration Precedence and CLI Options
CLI flags override environment variables from .env: if a CLI option is provided, it takes precedence over the corresponding .env value.
Available CLI overrides:
# Backend selection
--chroma-mode [http|persistent|memory]
# HTTP mode parameters (when --chroma-mode=http)
--chroma-host <host>
--chroma-port <port>
# Persistent mode parameter (when --chroma-mode=persistent)
--chroma-persist-directory <path>
Usage Examples
# HTTP mode (default) - connect to ChromaDB server
hatch run chroma-mcp-server stdio --chroma-mode http --chroma-host localhost --chroma-port 8000
# Persistent mode - local database with persistence
hatch run chroma-mcp-server stdio --chroma-mode persistent --chroma-persist-directory ./my_data
# Memory mode - fast in-memory database
hatch run chroma-mcp-server stdio --chroma-mode memory
# SSE transport with persistent backend and explicit host/port
hatch run chroma-mcp-server sse 0.0.0.0 8091 --chroma-mode persistent --chroma-persist-directory ./chroma_data
# HTTP transport with explicit host/port and HTTP backend
hatch run chroma-mcp-server http 127.0.0.1 8090 --chroma-mode http --chroma-host localhost --chroma-port 8000
# CLI options override .env values
hatch run chroma-mcp-server stdio --chroma-mode http --chroma-host my-chroma-server.com --chroma-port 8080
Environment Variables
CHROMA_HOST: ChromaDB host (default: localhost)
CHROMA_PORT: ChromaDB port (default: 8000) - optional, can be omitted
CHROMA_PERSIST_DIRECTORY: Path for persistent storage (optional)
CHROMA_MODE: Connection mode - http (default), persistent, or memory
DEFAULT_EMBEDDING_PROVIDER: Default embedding provider (sentence-transformers, openai, cohere, instructor, huggingface)
DEFAULT_EMBEDDING_MODEL: Default embedding model name
OPENAI_API_KEY: OpenAI API key for OpenAI embeddings
COHERE_API_KEY: Cohere API key for Cohere embeddings
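These variables are typically collected in a .env file at the project root. A minimal sketch for persistent mode (the values shown are illustrative, not defaults shipped with this project):

```bash
# .env - illustrative values; variable names as documented above
CHROMA_MODE=persistent
CHROMA_PERSIST_DIRECTORY=./chroma_data
DEFAULT_EMBEDDING_PROVIDER=sentence-transformers
DEFAULT_EMBEDDING_MODEL=all-MiniLM-L6-v2
LOG_LEVEL=INFO
```

Remember that any CLI flag passed at startup overrides the matching value here.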
Development
Using Hatch Environments
This project uses Hatch for environment management and development workflow:
# Create and enter development environment
hatch shell
# Run tests
hatch run test
# Format code
hatch run format
# Lint code
hatch run lint
# Type check
hatch run type-check
# Run server in development mode
hatch run python -m app.chroma_mcp_server
Virtual Environment Management
# Create virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
# Deactivate when done
deactivate
Project Structure
src/app/ - Main application package
  chroma_mcp_server.py - Main MCP server implementation
  __init__.py - Package initialization
pyproject.toml - Project configuration and dependencies
example_usage.py - Usage examples and demonstrations
Configuration files for MCP clients in JSON format
Available MCP Tools
Collection Management
list_collections()
Lists all collections in the database with their metadata.
Returns: JSON array of collection information
create_collection(name, metadata, embedding_config)
Creates a new collection.
Parameters:
name (str): Collection name
metadata (dict, optional): Collection metadata
embedding_config (dict, optional): Embedding function configuration:
  provider (str): "sentence-transformers", "openai", "cohere", "instructor", or "huggingface"
  model (str): Model name (e.g., "all-MiniLM-L6-v2", "text-embedding-ada-002", "embed-english-v3.0")
  api_key (str, optional): API key for provider services
get_collection(name)
Gets information about a specific collection.
Parameters:
name (str): Collection name
Returns: Collection information including document count
delete_collection(name)
Deletes a collection and all its documents.
Parameters:
name (str): Collection name
Document Operations
add_documents(collection_name, documents)
Adds documents to a collection.
Parameters:
collection_name (str): Target collection name
documents (array): Array of document objects with:
  id (str): Unique document identifier
  content (str): Document text content
  metadata (dict, optional): Document metadata
  embedding (array, optional): Pre-computed embedding vector
query_collection(query)
Performs similarity search on a collection.
Parameters:
query (object): Query configuration with:
  collection_name (str): Target collection
  query_texts (array, optional): Text queries for semantic search
  query_embeddings (array, optional): Embedding vectors for direct similarity search
  n_results (int, default: 10): Number of results to return
  where (dict, optional): Metadata filters
  where_document (dict, optional): Document content filters
Returns: Query results with documents, metadata, and similarity scores
update_document(update)
Updates an existing document.
Parameters:
update (object): Update configuration with:
  collection_name (str): Target collection
  document_id (str): Document to update
  content (str, optional): New content
  metadata (dict, optional): New metadata
  embedding (array, optional): New embedding
delete_documents(collection_name, document_ids)
Deletes documents from a collection.
Parameters:
collection_name (str): Target collection
document_ids (array): Array of document IDs to delete
get_document(collection_name, document_id)
Retrieves a specific document.
Parameters:
collection_name (str): Target collection
document_id (str): Document ID to retrieve
Utility Tools
collection_stats(collection_name)
Gets statistics for a collection.
Parameters:
collection_name (str): Target collection
peek_collection(collection_name, limit)
Peeks at the first few documents in a collection.
Parameters:
collection_name (str): Target collection
limit (int, default: 10): Maximum documents to return
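Called in the same style as the document-operation examples in this README, the utility tools look like this (a sketch; tool names and parameters are taken from the descriptions above):

```python
# Get document count and metadata for a collection
stats = await collection_stats("my_documents")

# Preview the first 5 documents without running a query
preview = await peek_collection("my_documents", limit=5)
```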
Usage Examples
LangChain-Compatible Embedding Examples
# Create collection with Cohere embeddings (popular in LangChain)
await create_collection("docs", embedding_config={
"provider": "cohere",
"model": "embed-english-v3.0",
"api_key": "your-cohere-key"
})
# Create collection with Instructor embeddings (optimized for instructions)
await create_collection("instructions", embedding_config={
"provider": "instructor",
"model": "hkunlp/instructor-xl"
})
# Create collection with OpenAI embeddings (standard in LangChain)
await create_collection("embeddings", embedding_config={
"provider": "openai",
"model": "text-embedding-ada-002"
})
# Environment-based configuration (great for LangChain integration)
import os
os.environ["DEFAULT_EMBEDDING_PROVIDER"] = "cohere"
os.environ["COHERE_API_KEY"] = "your-key"
os.environ["DEFAULT_EMBEDDING_MODEL"] = "embed-english-v3.0"
Adding and Querying Documents
# Add documents
documents = [
{
"id": "doc1",
"content": "This is a sample document about AI",
"metadata": {"category": "tech", "year": 2024}
},
{
"id": "doc2",
"content": "Another document about machine learning",
"metadata": {"category": "tech", "year": 2024}
}
]
await add_documents("my_documents", documents)
# Query by text
query = {
"collection_name": "my_documents",
"query_texts": ["artificial intelligence"],
"n_results": 5
}
results = await query_collection(query)
# Query with metadata filters
query = {
"collection_name": "my_documents",
"query_texts": ["technology"],
"where": {"category": "tech"},
"n_results": 10
}
results = await query_collection(query)
Document Management
# Update a document
await update_document({
"collection_name": "my_documents",
"document_id": "doc1",
"content": "Updated content about AI and machine learning",
"metadata": {"category": "tech", "year": 2024, "updated": True}
})
# Get a specific document
doc = await get_document("my_documents", "doc1")
# Delete documents
await delete_documents("my_documents", ["doc1", "doc2"])
Running the Server
Start the MCP server:
python -m app.chroma_mcp_server
The server will start and expose MCP tools that can be used by compatible MCP clients.
Transport Modes
The server supports multiple transport protocols for different use cases:
STDIO Transport (Default)
Best for: Local development, Claude Desktop integration
Usage:
python -m app.chroma_mcp_server stdio
Characteristics: Direct communication via standard input/output
HTTP Transport
Best for: Network accessibility, multiple concurrent clients
Usage:
python -m app.chroma_mcp_server http 127.0.0.1 8000
Access: Available at http://127.0.0.1:8000/mcp
SSE Transport (Server-Sent Events)
Best for: Real-time streaming, web integration
Usage:
python -m app.chroma_mcp_server sse 127.0.0.1 8000
Note: Legacy transport; HTTP is recommended for new projects
MCP Client Configuration
Claude Desktop Configuration
To use this server with Claude Desktop or other MCP clients that support stdio protocol, add the following configuration to your MCP client config file:
Basic Configuration (claude-desktop-config.json):
{
"mcpServers": {
"chroma-db": {
"command": "python",
"args": ["-m", "app.chroma_mcp_server", "stdio"],
"env": {
"CHROMA_HOST": "localhost",
"CHROMA_PORT": "8000"
}
}
}
}
SSE Configuration for network access:
{
"mcpServers": {
"chroma-sse": {
"command": "python",
"args": ["-m", "app.chroma_mcp_server", "sse", "127.0.0.1", "8091", "--chroma-mode", "persistent", "--chroma-persist-directory", "./chroma_data"],
"cwd": "/home/zoomrec/projects/chroma-mcp-server",
"env": {
"LOG_LEVEL": "INFO"
}
}
}
}
Configuration Parameters
command: The command to run (usually python)
args: Command line arguments (path to server script)
cwd: Working directory (optional, useful for relative paths)
env: Environment variables for ChromaDB connection:
  CHROMA_HOST: ChromaDB server hostname
  CHROMA_PORT: ChromaDB server port
  CHROMA_PERSIST_DIRECTORY: Local persistent storage path
  LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR)
Integration with LangChain
This MCP server is designed to work seamlessly with LangChain applications. The embedding providers are specifically chosen to match LangChain's commonly used embeddings:
Supported LangChain Embeddings
OpenAI: text-embedding-ada-002 - Most popular in LangChain
Cohere: embed-english-v3.0 - High-performance alternative
Instructor: hkunlp/instructor-xl - Instruction-tuned embeddings
Sentence Transformers: all-MiniLM-L6-v2 - Local, no API required
HuggingFace: Any model from the Hub
LangChain Integration Example
# In your LangChain application
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings, CohereEmbeddings
# Use with LangChain's Chroma wrapper
embeddings = OpenAIEmbeddings(
model="text-embedding-ada-002",
openai_api_key="your-key"
)
# Create Chroma collection via MCP server first
# Then use it in LangChain
vectorstore = Chroma(
collection_name="my_docs",
embedding_function=embeddings,
client=your_chroma_client
)
Integration with AI Assistants
This server is designed to work with AI assistants that support the Model Context Protocol. Once running, the assistant can:
Discover available tools automatically
Use the tools to interact with ChromaDB
Perform vector similarity searches
Manage collections and documents
Build RAG (Retrieval-Augmented Generation) applications
Error Handling
The server includes comprehensive error handling and will return descriptive error messages for:
Connection issues with ChromaDB
Invalid collection or document IDs
Query parameter errors
Network timeouts
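On the client side, a failed tool call can be handled like any other exception. A sketch in the same style as the earlier examples (the exact exception type and message format are not specified by this README, so a broad except is used):

```python
try:
    results = await query_collection({
        "collection_name": "missing_collection",
        "query_texts": ["test"],
    })
except Exception as err:
    # The server returns a descriptive message, e.g. for an unknown collection
    print(f"Query failed: {err}")
```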
Performance Considerations
Uses async/await for non-blocking operations
Supports both local persistent and remote ChromaDB instances
Configurable query result limits
Efficient batch operations for document management
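The batch-operation point can be applied client-side as well: split a large document list into fixed-size chunks and send each chunk through add_documents. The make_batches helper below is a hypothetical illustration, not part of this server:

```python
def make_batches(items, batch_size=100):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

docs = [{"id": f"doc{i}", "content": f"text {i}"} for i in range(250)]
batches = make_batches(docs)  # 3 batches: 100, 100, 50

# Each batch can then be sent with: await add_documents("my_documents", batch)
```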
Troubleshooting
Common Issues
Connection refused: Ensure ChromaDB is running on the specified host/port
Collection not found: Verify collection names are correct
Document not found: Check document IDs exist in the collection
Import errors: Ensure all dependencies are installed from pyproject.toml
Logging
The server logs to stdout with INFO level by default. You can adjust logging by modifying the logging configuration in the server file.
License
This project is open source and available under the MIT License.