Chroma MCP Server

by privetin

A Model Context Protocol (MCP) server that provides semantic search and document management capabilities using ChromaDB. This server enables LLMs to perform natural language queries over document collections with intuitive similarity metrics, making it ideal for RAG (Retrieval Augmented Generation) applications.

Features

  • Semantic Search: Find documents based on meaning using state-of-the-art embeddings
  • Intuitive Similarity Metrics: Results include human-friendly similarity scores (0-100%)
  • Document Management: Full CRUD operations for documents and collections
  • Rich Metadata Support: Attach and search by custom metadata fields
  • Persistent Storage: Reliable document storage with SQLite backend
  • Security: Configurable access controls and input validation
  • Error Handling: Comprehensive error messages and graceful failure recovery

Requirements

  • Python 3.12 or higher
  • ChromaDB 0.4.22 or higher
  • MCP Python SDK 1.1.2 or higher
  • uv package manager (recommended) or pip

Quick Start

    # Clone the repository
    git clone https://github.com/privetin/mcp-server-chroma.git
    cd mcp-server-chroma

    # Install with uv (recommended)
    uv venv
    .venv\Scripts\activate     # Windows
    source .venv/bin/activate  # Unix
    uv pip install -e .

    # Or with pip
    python -m venv .venv
    .venv\Scripts\activate     # Windows
    source .venv/bin/activate  # Unix
    pip install -e .

    # Run the server
    mcp-server-chroma

For Claude Desktop integration, see Installation.

Architecture

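The server is built on ChromaDB, which handles embedding and storage, and the MCP Python SDK, which exposes the tools to clients.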

Data Flow

  1. Documents are embedded using ChromaDB's default embedding model
  2. Embeddings and metadata are stored in ChromaDB's SQLite backend
  3. Queries are processed through the same embedding model
  4. Results are normalized to a 0-100% similarity scale

Components

Collections and Documents

The server manages two main resource types:

  • Collections: Containers for related documents with shared embedding settings
  • Documents: Text content with metadata and automatically generated embeddings

Tools

Collection Management

  • list-collections: List all available collections
  • create-collection: Create a new collection with optional settings
  • delete-collection: Delete a collection and its documents

Document Operations

  • add-document: Add a new document with content and metadata
  • get-document: Retrieve a specific document by ID
  • update-document: Modify document content or metadata
  • delete-document: Remove a document from a collection
  • search-documents: Semantic search with normalized similarity scores

Installation

Prerequisites

  • Python 3.12+
  • uv package manager (recommended) or pip

Setup

  1. Clone the repository:

         git clone https://github.com/privetin/mcp-server-chroma.git
         cd mcp-server-chroma

  2. Create and activate virtual environment:

         uv venv
         # On Windows:
         .venv\Scripts\activate
         # On Unix:
         source .venv/bin/activate

  3. Install dependencies:

         uv pip install -e .

Claude Desktop Integration

Add the server to your Claude Desktop configuration:

Windows (%APPDATA%/Claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "chroma": {
          "command": "uv",
          "args": [
            "--directory",
            "C:\\path\\to\\mcp-server-chroma",
            "run",
            "mcp-server-chroma"
          ]
        }
      }
    }

macOS (~/Library/Application Support/Claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "chroma": {
          "command": "uv",
          "args": [
            "--directory",
            "/path/to/mcp-server-chroma",
            "run",
            "mcp-server-chroma"
          ]
        }
      }
    }
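
If the configuration file already lists other servers, editing it by hand risks overwriting them. The following is a minimal sketch of merging the "chroma" entry into an existing config with Python's standard library. The helper names (`build_server_entry`, `merge_into_config`, `update_config_file`) are hypothetical and not part of this project; the command and arguments are the ones shown above.

```python
import json
from pathlib import Path

def build_server_entry(server_dir: str) -> dict:
    """Build the entry shown above for a given checkout directory."""
    return {
        "command": "uv",
        "args": ["--directory", server_dir, "run", "mcp-server-chroma"],
    }

def merge_into_config(config: dict, server_dir: str, name: str = "chroma") -> dict:
    """Return a copy of config with the server added, keeping other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = build_server_entry(server_dir)
    merged["mcpServers"] = servers
    return merged

def update_config_file(config_path: Path, server_dir: str) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    existing = {}
    if config_path.exists():
        existing = json.loads(config_path.read_text(encoding="utf-8"))
    merged = merge_into_config(existing, server_dir)
    config_path.write_text(json.dumps(merged, indent=2), encoding="utf-8")

# Example: preview the merged config without touching any file.
preview = merge_into_config(
    {"mcpServers": {"other": {"command": "other-server"}}},
    "/path/to/mcp-server-chroma",
)
print(json.dumps(preview, indent=2))
```

Using `json.dumps` also takes care of escaping backslashes in Windows paths, which is easy to get wrong by hand.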

Usage Examples

Managing Collections

Create a collection:

    Tool: create-collection
    Args: {"name": "research-papers"}

List collections:

    Tool: list-collections
    Args: {}

Working with Documents

Add a document:

    Tool: add-document
    Args: {
      "collection": "research-papers",
      "content": "Recent advances in transformer architectures have led to significant improvements in natural language processing tasks.",
      "metadata": {
        "title": "Transformer Architectures",
        "year": 2024,
        "category": "ML"
      }
    }

Get a specific document:

    Tool: get-document
    Args: {
      "collection": "research-papers",
      "document_id": "doc_123"
    }

Update a document:

    Tool: update-document
    Args: {
      "collection": "research-papers",
      "document_id": "doc_123",
      "content": "Updated findings on transformer architectures show improvements in both efficiency and accuracy.",
      "metadata": {
        "title": "Transformer Architectures - Updated",
        "year": 2024,
        "category": "ML",
        "status": "updated"
      }
    }

Search documents:

    Tool: search-documents
    Args: {
      "collection": "research-papers",
      "query": "What are the latest developments in transformers?",
      "n_results": 3
    }
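
On the wire, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below shows roughly what the search example looks like in that form. `make_tool_call` is a hypothetical helper written for illustration, and the argument names are the ones used in this README; an MCP client library would normally build and send this message for you.

```python
import json
from itertools import count

# Monotonically increasing request IDs, as JSON-RPC expects a unique id per request.
_request_ids = count(1)

def make_tool_call(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

request = make_tool_call(
    "search-documents",
    {
        "collection": "research-papers",
        "query": "What are the latest developments in transformers?",
        "n_results": 3,
    },
)
print(json.dumps(request, indent=2))
```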

Understanding Similarity Scores

Search results include normalized similarity scores from 0-100%:

  • 90-100%: Nearly identical content or very strong semantic match
  • 70-89%: Highly relevant with strong semantic similarity
  • 50-69%: Moderately related with partial semantic overlap
  • 30-49%: Somewhat related with minimal semantic connection
  • 0-29%: Likely unrelated or very weak semantic connection
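
A client that post-processes results might map scores onto these bands, for example to filter out weak matches. A minimal sketch follows; `describe_similarity` is a hypothetical helper, and treating each band's lower bound as a threshold (so 89.5 falls in the 70-89% band) is an assumption, since the bands above are stated in whole percentages.

```python
def describe_similarity(score: float) -> str:
    """Map a 0-100 similarity score to the band descriptions listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 90:
        return "nearly identical or very strong match"
    if score >= 70:
        return "highly relevant"
    if score >= 50:
        return "moderately related"
    if score >= 30:
        return "somewhat related"
    return "likely unrelated"

# Example: keep only results that are at least moderately related.
results = [("doc_1", 92.4), ("doc_2", 61.0), ("doc_3", 18.7)]
relevant = [(doc_id, score) for doc_id, score in results if score >= 50]
for doc_id, score in relevant:
    print(f"{doc_id}: {score}% ({describe_similarity(score)})")
```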

Troubleshooting

Common Issues

  1. Database Connection Errors
    • Ensure the database path is writable
    • Check if another process is using the database
    • As a last resort, delete the .chroma directory and restart (this removes all stored collections and documents)
  2. Memory Issues
    • Large collections may require more RAM
    • Consider using smaller batch sizes
    • Monitor memory usage with --log-level DEBUG
  3. Slow Search Performance
    • Large collections may need index optimization
    • Consider using fewer n_results
    • Check system resource usage
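
One way to act on the batch-size advice is to feed documents to the server in fixed-size chunks rather than all at once. The sketch below is client-side and illustrative only: `chunked` and `add_in_batches` are hypothetical helpers, and the `add_document` callable is a placeholder for however your client invokes the add-document tool.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

def chunked(items: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch

def add_in_batches(
    docs: Iterable[dict],
    add_document: Callable[[dict], None],
    batch_size: int = 50,
) -> int:
    """Send documents batch by batch; returns how many were sent."""
    total = 0
    for batch in chunked(docs, batch_size):
        for doc in batch:
            add_document(doc)  # placeholder for the real tool call
        total += len(batch)
    return total

# Example with a stand-in sender that just records what it was given.
sent: list[dict] = []
docs = ({"content": f"document {i}"} for i in range(7))
count_sent = add_in_batches(docs, sent.append, batch_size=3)
print(count_sent, "documents sent")
```

Because `chunked` consumes a generator lazily, only one batch of documents needs to be held in memory at a time.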

Debug Mode

Run the server in debug mode:

mcp-server-chroma --log-level DEBUG

Getting Help

Development

Running Tests

Run the test suite:

pytest -v

Run with coverage:

pytest --cov=chroma tests/

Debugging

For debugging, use the MCP Inspector:

    # Install the inspector
    npm install -g @modelcontextprotocol/inspector

    # Run the server with inspector
    mcp-inspector uv --directory /path/to/mcp-server-chroma run mcp-server-chroma

The inspector provides:

  • Real-time request/response monitoring
  • Tool testing interface
  • Performance metrics
  • Error tracking

Error Handling

The server provides detailed error messages for common scenarios:

  • Invalid collection names or IDs
  • Missing or malformed documents
  • Database connection issues
  • Invalid search parameters
  • Authentication/authorization failures

Security Considerations

  • Input validation on all parameters
  • Configurable access controls
  • Safe handling of file paths
  • Protection against injection attacks
  • Rate limiting support
  • Secure error messages

Configuration

Database Location

Set custom database path:

mcp-server-chroma --db-path /path/to/db

Default: .chroma in the server directory

Environment Variables

  • CHROMA_DB_PATH: Override database location
  • CHROMA_LOG_LEVEL: Set logging verbosity (default: INFO)
  • CHROMA_MAX_CONNECTIONS: Database connection pool size (default: 10)
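
The sketch below shows one way these settings could be resolved into a single configuration object. It is an illustration, not the server's implementation: `ServerConfig` and `resolve_config` are hypothetical names, and the precedence (a `--db-path` value wins over `CHROMA_DB_PATH`, which wins over the `.chroma` default) is an assumption, since this README does not say which takes priority. The variable names and defaults are the ones listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass(frozen=True)
class ServerConfig:
    db_path: str
    log_level: str
    max_connections: int

def resolve_config(
    env: Mapping[str, str] = os.environ,
    cli_db_path: Optional[str] = None,
) -> ServerConfig:
    """Resolve settings from a CLI value, environment variables, and defaults."""
    # Assumed precedence: CLI flag, then environment variable, then default.
    db_path = cli_db_path or env.get("CHROMA_DB_PATH") or ".chroma"
    log_level = env.get("CHROMA_LOG_LEVEL", "INFO").upper()
    max_connections = int(env.get("CHROMA_MAX_CONNECTIONS", "10"))
    if max_connections < 1:
        raise ValueError("CHROMA_MAX_CONNECTIONS must be at least 1")
    return ServerConfig(db_path, log_level, max_connections)

# Example: pass an explicit mapping instead of the real environment.
config = resolve_config({"CHROMA_DB_PATH": "/path/to/db", "CHROMA_LOG_LEVEL": "debug"})
print(config)
```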

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

Please read our Contributing Guidelines for more details.

License

MIT License

Copyright (c) 2024 privetin

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
