Chroma MCP Server
A Model Context Protocol (MCP) server that provides semantic search and document management capabilities using ChromaDB. This server enables LLMs to perform natural language queries over document collections with intuitive similarity metrics, making it ideal for RAG (Retrieval Augmented Generation) applications.
Features
Semantic Search: Find documents based on meaning using ChromaDB's default embedding model
Intuitive Similarity Metrics: Results include human-friendly similarity scores (0-100%)
Document Management: Full CRUD operations for documents and collections
Rich Metadata Support: Attach and search by custom metadata fields
Persistent Storage: Reliable document storage with SQLite backend
Security: Configurable access controls and input validation
Error Handling: Comprehensive error messages and graceful failure recovery
Requirements
Python 3.12 or higher
ChromaDB 0.4.22 or higher
MCP Python SDK 1.1.2 or higher
uv package manager (recommended) or pip
Quick Start
For Claude Desktop integration, see Installation.
Architecture
The server is built on:
ChromaDB for vector storage and search
MCP Python SDK for server implementation
SQLite for persistent storage
Data Flow
Documents are embedded using ChromaDB's default embedding model
Embeddings and metadata are stored in ChromaDB's SQLite backend
Queries are processed through the same embedding model
Results are normalized to a 0-100% similarity scale
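The last step of the data flow can be sketched as follows. This is a minimal illustration, not the server's actual formula, which this README does not show. It assumes the collection uses cosine distance (range 0 to 2); the function name `distance_to_percent` and the `max_distance` parameter are hypothetical.

```python
def distance_to_percent(distance: float, max_distance: float = 2.0) -> float:
    """Map a distance (0 = identical) onto a 0-100% similarity score.

    Assumption: distances are cosine distances in [0, max_distance].
    Other metrics (for example squared L2) would need a different mapping.
    """
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    # Clamp so that out-of-range distances still yield a score in [0, 100].
    clamped = min(max(distance, 0.0), max_distance)
    # Linear rescale: distance 0 -> 100%, distance max_distance -> 0%.
    return round((1.0 - clamped / max_distance) * 100.0, 1)


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0):
        print(f"distance={d:.1f} -> {distance_to_percent(d)}%")
```

A linear rescale keeps the score easy to reason about; a real implementation might prefer a different curve depending on the embedding model and distance metric.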
Components
Collections and Documents
The server manages two main resource types:
Collections: Containers for related documents with shared embedding settings
Documents: Text content with metadata and automatically generated embeddings
Tools
Collection Management
list-collections: List all available collections
create-collection: Create a new collection with optional settings
delete-collection: Delete a collection and its documents
Document Operations
add-document: Add a new document with content and metadata
get-document: Retrieve a specific document by ID
update-document: Modify document content or metadata
delete-document: Remove a document from a collection
search-documents: Semantic search with normalized similarity scores
Installation
Prerequisites
Python 3.12+
uv package manager (recommended) or pip
Setup
Clone the repository:
Create and activate virtual environment:
Install dependencies:
Claude Desktop Integration
Add the server to your Claude Desktop configuration:
Windows (%APPDATA%/Claude/claude_desktop_config.json):
macOS (~/Library/Application Support/Claude/claude_desktop_config.json):
Usage Examples
Managing Collections
Create a collection:
List collections:
Working with Documents
Add a document:
Get a specific document:
Update a document:
Search documents:
Understanding Similarity Scores
Search results include normalized similarity scores from 0-100%:
90-100%: Nearly identical content or very strong semantic match
70-89%: Highly relevant with strong semantic similarity
50-69%: Moderately related with partial semantic overlap
30-49%: Somewhat related with minimal semantic connection
0-29%: Likely unrelated or very weak semantic connection
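The bands above can be applied in client code when deciding which results to keep. The sketch below is only an illustration of that lookup; the function name `describe_score` and the short labels are hypothetical and not part of the server.

```python
def describe_score(score: float) -> str:
    """Return a short label for a normalized similarity score (0-100)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    # Thresholds follow the bands listed in this section, highest first.
    if score >= 90:
        return "very strong match"
    if score >= 70:
        return "highly relevant"
    if score >= 50:
        return "moderately related"
    if score >= 30:
        return "somewhat related"
    return "likely unrelated"


if __name__ == "__main__":
    for s in (95, 72.5, 50, 31, 10):
        print(s, "->", describe_score(s))
```

For a RAG pipeline, one plausible choice is to drop anything labeled "likely unrelated" before passing results to the model, though the right cutoff depends on the data.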
Troubleshooting
Common Issues
Database Connection Errors
Ensure the database path is writable
Check if another process is using the database
Try deleting the .chroma directory and restarting (this removes all stored collections and documents)
Memory Issues
Large collections may require more RAM
Consider using smaller batch sizes
Monitor memory usage with --log-level DEBUG
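The suggestion to use smaller batch sizes can be sketched with a small helper that splits a list of documents into fixed-size chunks before they are added. The helper name `in_batches` is hypothetical, and how each batch is sent to the server (for example one add-document call per item) is left to the caller.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def in_batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    docs = [f"document {i}" for i in range(5)]
    for batch in in_batches(docs, 2):
        # Each batch would be embedded and stored before the next is loaded.
        print(len(batch), batch)
```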
Slow Search Performance
Large collections may need index optimization
Consider requesting fewer results (a lower n_results)
Check system resource usage
Debug Mode
Run the server in debug mode:
Getting Help
Check ChromaDB Documentation
Open an issue on GitHub
Join MCP Community Discussions
Development
Running Tests
Run the test suite:
Run with coverage:
Debugging
For debugging, use the MCP Inspector:
The inspector provides:
Real-time request/response monitoring
Tool testing interface
Performance metrics
Error tracking
Error Handling
The server provides detailed error messages for common scenarios:
Invalid collection names or IDs
Missing or malformed documents
Database connection issues
Invalid search parameters
Authentication/authorization failures
Security Considerations
Input validation on all parameters
Configurable access controls
Safe handling of file paths
Protection against injection attacks
Rate limiting support
Secure error messages
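As an illustration of input validation, the sketch below checks a collection name before it reaches the database. The rules are an assumption based on the naming constraints documented for the ChromaDB 0.4.x line (3 to 63 characters, alphanumeric at both ends, no consecutive dots, not an IPv4 address); the server's own checks are not shown in this README, and newer ChromaDB versions may differ. The function name `is_valid_collection_name` is hypothetical.

```python
import ipaddress
import re

# Assumed rule: alphanumeric first and last character, with dots, dashes,
# and underscores allowed in between; total length 3 to 63 characters.
_NAME_PATTERN = re.compile(r"[a-zA-Z0-9][a-zA-Z0-9._-]{1,61}[a-zA-Z0-9]")


def is_valid_collection_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed collection naming rules."""
    if not isinstance(name, str) or _NAME_PATTERN.fullmatch(name) is None:
        return False
    if ".." in name:
        return False
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # Not an IPv4 address, so the name is acceptable.
    return False


if __name__ == "__main__":
    for candidate in ("docs", "ab", "my..docs", "127.0.0.1", "my_docs-2024"):
        print(candidate, "->", is_valid_collection_name(candidate))
```

Rejecting bad names up front lets the server return a clear error message instead of surfacing a lower-level database exception.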
Configuration
Database Location
Set custom database path:
Default: .chroma in the server directory
Environment Variables
CHROMA_DB_PATH: Override database location
CHROMA_LOG_LEVEL: Set logging verbosity (default: INFO)
CHROMA_MAX_CONNECTIONS: Database connection pool size (default: 10)
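One way these variables might be read at startup is sketched below. The variable names and defaults come from this section; the `ServerConfig` dataclass and `load_config` function are hypothetical and are not taken from the server's source.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    db_path: str
    log_level: str
    max_connections: int


def load_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Build a config from environment variables, using documented defaults."""
    env = os.environ if env is None else env
    raw_max = env.get("CHROMA_MAX_CONNECTIONS", "10")
    try:
        max_connections = int(raw_max)
    except ValueError:
        raise ValueError(
            f"CHROMA_MAX_CONNECTIONS must be an integer, got {raw_max!r}"
        ) from None
    if max_connections < 1:
        raise ValueError("CHROMA_MAX_CONNECTIONS must be at least 1")
    return ServerConfig(
        db_path=env.get("CHROMA_DB_PATH", ".chroma"),
        log_level=env.get("CHROMA_LOG_LEVEL", "INFO").upper(),
        max_connections=max_connections,
    )


if __name__ == "__main__":
    print(load_config())
```

Passing the environment as a mapping keeps the function easy to test without modifying the real process environment.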
Contributing
Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Submit a pull request
Please read our Contributing Guidelines for more details.
License
MIT License
Copyright (c) 2024 privetin
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.