Allows for the storage and retrieval of Docker configuration patterns and system settings within a persistent vector-searchable knowledge base.
Enables the management and searching of Home Assistant smart home configurations and automation scripts for AI-powered reference.
Supports automatic EXIF metadata extraction from JPEG images, including camera settings, exposure, and GPS location data for advanced searchable photo archives.
Supports the ingestion and retrieval of Kubernetes system and application configurations to provide context for AI-driven infrastructure management.
Facilitates the batch ingestion and parsing of TOML configuration files into searchable vector storage.
Allows for the structured ingestion and search of XML data files within vector collections.
Enables AI assistants to parse and store YAML configuration data for persistent memory and context retrieval.
ChromaDB MCP Server
A Model Context Protocol (MCP) server that gives AI assistants persistent memory through ChromaDB vector storage. Now with EXIF extraction, Watch Folders, and Duplicate Detection - the ultimate tool for creators!
Features
Core
Persistent AI Memory: Your AI assistant remembers past conversations and solutions
Vector Search: Find similar code patterns, configurations, and documentation instantly
Local First: Run everything on your own hardware, no cloud dependencies
Batch Processing
Fast Batch Ingest: Process entire directories in seconds (500+ files)
77 File Types: Photos, CAD, documents, data files, code
Quick Load/Unload: Temporary collections for rapid workflows
Export/Import: Backup and transfer collections as JSON
Photo Features (NEW in v3.0)
EXIF Extraction: Camera, lens, exposure, GPS location, date taken
Search by Camera: "Find photos shot with my Canon 5D"
Search by Location: GPS coordinates embedded and searchable
Search by Date: "Find photos from vacation 2024"
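EXIF stores GPS coordinates as degrees, minutes, and seconds plus a hemisphere reference (N, S, E, or W), so making a location searchable usually means converting it to signed decimal degrees first. The sketch below shows that arithmetic only; the function name `dmsToDecimal` is hypothetical and is not taken from exif-extractor.js.

```javascript
// Hypothetical helper, not the project's actual code.
// Converts an EXIF-style [degrees, minutes, seconds] triple plus a
// hemisphere reference ("N", "S", "E", "W") into signed decimal degrees.
function dmsToDecimal([degrees, minutes, seconds], ref) {
  const value = degrees + minutes / 60 + seconds / 3600;
  // South and West are negative by convention.
  return ref === "S" || ref === "W" ? -value : value;
}

const lat = dmsToDecimal([40, 26, 46], "N");
const lon = dmsToDecimal([79, 58, 56], "W");
console.log(lat.toFixed(4), lon.toFixed(4)); // prints "40.4461 -79.9822"
```

Storing the decimal form as metadata makes simple range filters (for example a bounding box) straightforward.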
Watch Folders (NEW in v3.0)
Auto-Ingest: Drop files in watched folders, auto-add to ChromaDB
Hands-Free: Perfect for incoming photo dumps, downloads
Filter by Type: Watch only for specific file types
Duplicate Detection (NEW in v3.0)
Find Duplicates: Hash-based detection across directories
Reclaim Space: See exactly how much space duplicates waste
Compare Files: Check if two files are identical
Perceptual Hashing: Find similar (not just identical) images
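Hash-based detection generally works by grouping files that share a content hash: any group with more than one member is a duplicate set, and the reclaimable space is the file size times the number of extra copies. The sketch below illustrates that grouping step under the assumption that hashes were computed elsewhere; the record shape `{ path, size, hash }` and the function name are hypothetical, not the API of duplicate-detector.js.

```javascript
// Hypothetical sketch: assumes each file record already carries a content
// hash (for example a SHA-256 hex digest) computed elsewhere.
function findDuplicateGroups(files) {
  const byHash = new Map();
  for (const file of files) {
    const group = byHash.get(file.hash) ?? [];
    group.push(file);
    byHash.set(file.hash, group);
  }

  const groups = [];
  for (const [hash, members] of byHash) {
    if (members.length < 2) continue; // unique file, nothing to report
    groups.push({
      hash,
      paths: members.map((f) => f.path),
      // Keeping one copy, every additional copy is reclaimable space.
      wastedBytes: members[0].size * (members.length - 1),
    });
  }
  return groups;
}

const report = findDuplicateGroups([
  { path: "/photos/a.jpg", size: 2048, hash: "abc" },
  { path: "/backup/a-copy.jpg", size: 2048, hash: "abc" },
  { path: "/photos/b.jpg", size: 4096, hash: "def" },
]);
console.log(report);
```

Perceptual hashing differs in that near-identical images produce similar rather than equal hashes, so exact-match grouping like this would not be enough on its own.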
Quick Start
Prerequisites
Bun (JavaScript runtime)
Docker (for ChromaDB)
Claude Desktop (or any MCP client)
Installation
Clone the repository
git clone https://github.com/vespo92/chromadblocal-mcp-server.git chromadb-mcp-server
cd chromadb-mcp-server

Install dependencies
bun install

Start ChromaDB
docker run -d \
  --name chromadb-local \
  -p 8001:8000 \
  -v ~/chromadb-data:/chroma/chroma \
  -e IS_PERSISTENT=TRUE \
  chromadb/chroma:latest

Initialize collections
bun run setup

Configure Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "chromadb-context": {
      "command": "bun",
      "args": ["run", "/path/to/chromadb-mcp-server/index.js"],
      "env": {
        "CHROMADB_URL": "http://localhost:8001"
      }
    }
  }
}

Restart Claude Desktop and start building your knowledge base!
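If claude_desktop_config.json already lists other MCP servers, the new entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that merge on an in-memory object; the helper name `addChromaServer` and the `other-server` entry are hypothetical, and reading or writing the file itself is left out.

```javascript
// Hypothetical helper: merges the chromadb-context entry into an existing
// Claude Desktop config object without dropping other configured servers.
function addChromaServer(config, indexPath, chromaUrl = "http://localhost:8001") {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "chromadb-context": {
        command: "bun",
        args: ["run", indexPath],
        env: { CHROMADB_URL: chromaUrl },
      },
    },
  };
}

// "other-server" stands in for whatever is already configured.
const existing = { mcpServers: { "other-server": { command: "node", args: ["server.js"] } } };
const updated = addChromaServer(existing, "/path/to/chromadb-mcp-server/index.js");
console.log(JSON.stringify(updated, null, 2));
```

The function returns a new object and leaves its input untouched, which makes it easy to diff the result before writing anything to disk.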
Usage Examples
Once configured, interact naturally with your AI:
Store Knowledge
"Store this Docker configuration in ChromaDB for future reference"
"Save this React component pattern with tags: hooks, authentication"
"Remember this solution for GPU passthrough issues"
Retrieve Information
"Search ChromaDB for Python async examples"
"Find similar component patterns to this one"
"What solutions do we have for Docker networking issues?"
Build Context
"Add this API documentation to the project_docs collection"
"Store these test patterns for our testing suite"
Batch File Processing
The killer feature! Process hundreds of files in seconds for AI-powered search and retrieval.
Quick Load Workflow (Fastest)
Perfect for "load, process, discard" workflows:
You: "Quick load my photos from /home/photos/vacation2024"
AI: Creates temp collection, ingests 500 photos in seconds
You: "Find photos with mountains or beaches"
AI: Returns matching photos with metadata
You: "Unload the collection"
AI: Cleans up, frees memory

Supported File Types
| Category | Extensions | Metadata Extracted |
|---|---|---|
| Images | .jpg, .jpeg, .png, .heic, .raw, .cr2, .nef, .arw, .tiff, .gif, .webp | Dimensions, size, format |
| CAD | .stl, .obj, .dxf, .dwg, .step, .iges, .fbx, .blend, .skp, .scad | Vertices, faces, format |
| Documents | .pdf, .txt, .md, .doc, .docx, .rtf | Full text content |
| Data | .json, .yaml, .xml, .csv, .toml, .ini | Parsed content |
| Code | .js, .ts, .py, .go, .rs, .java, .cpp, .c, .php, .rb + 20 more | Full source code |
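Category filters such as "images" or "cad" come down to mapping a file extension to one of the rows in the table above. The sketch below shows one possible lookup; the names `CATEGORIES`, `categoryFor`, and `filterByCategories` are hypothetical, and the code list holds only the extensions the table names because the remaining 20 are not given.

```javascript
// Extension lists copied from the table above; the Code row is truncated
// there ("+ 20 more"), so only the extensions it names are included.
const CATEGORIES = {
  images: [".jpg", ".jpeg", ".png", ".heic", ".raw", ".cr2", ".nef", ".arw", ".tiff", ".gif", ".webp"],
  cad: [".stl", ".obj", ".dxf", ".dwg", ".step", ".iges", ".fbx", ".blend", ".skp", ".scad"],
  documents: [".pdf", ".txt", ".md", ".doc", ".docx", ".rtf"],
  data: [".json", ".yaml", ".xml", ".csv", ".toml", ".ini"],
  code: [".js", ".ts", ".py", ".go", ".rs", ".java", ".cpp", ".c", ".php", ".rb"],
};

// Hypothetical helper: returns the category name for a file, or null.
function categoryFor(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a bare dotfile such as ".env"
  const ext = filename.slice(dot).toLowerCase();
  for (const [category, extensions] of Object.entries(CATEGORIES)) {
    if (extensions.includes(ext)) return category;
  }
  return null;
}

// Keep only files whose category is in the requested list.
function filterByCategories(filenames, categories) {
  return filenames.filter((name) => categories.includes(categoryFor(name)));
}

console.log(filterByCategories(["IMG_01.JPG", "bracket.stl", "notes.md"], ["images", "cad"]));
```

Lowercasing the extension before the lookup means camera files named like `IMG_01.JPG` are still treated as images.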
Batch Processing Examples
"Scan /projects/cad-files to see what's there"
"Batch ingest all STL files from /3d-prints into the 'print_library' collection"
"Quick load my Downloads folder, find anything mentioning 'invoice'"
"Export the photo_archive collection to backup.json"
"Import backup.json into a new collection called 'restored_photos'"

Processing Speed
Quick Load: ~200 files in 2-3 seconds
Batch Ingest: ~500 files in 5-10 seconds (with full metadata)
Concurrent Processing: 10-20 parallel file operations
No external dependencies: Pure JavaScript/Bun processing
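Bounded concurrency of this kind is commonly implemented as a small worker pool: a fixed number of runners each pull the next item until none remain. The sketch below is one generic way to do that in plain JavaScript; `mapWithLimit` is a hypothetical name and this is not the code in batch-processor.js.

```javascript
// Hypothetical sketch of a bounded worker pool.
// Runs `worker` over `items` with at most `limit` calls in flight at once
// and returns the results in input order.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    // Each runner claims the next unprocessed index until the list is exhausted.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results;
}

// Usage: simulate per-file work with a short delay.
const files = ["a.jpg", "b.stl", "c.pdf", "d.json", "e.py"];
const done = mapWithLimit(files, 2, async (name) => {
  await new Promise((resolve) => setTimeout(resolve, 5));
  return name.toUpperCase();
});
done.then((names) => console.log(names));
```

A limit in the 10-20 range keeps many file reads in flight without opening every file in a large directory at once.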
Available Collections
| Collection | Description | Use Case |
|---|---|---|
| | Smart home configs & automations | Home Assistant, IoT scripts |
| | Reusable code patterns | Functions, hooks, utilities |
| | System & app configs | Docker, Kubernetes, services |
| | Problem solutions | Fixes, workarounds, debugging |
| | Project documentation | APIs, architecture, guides |
| | Learning insights | Tutorials, concepts, notes |
MCP Tools

search_context
Search for relevant information across collections
Parameters:
- query: Search query
- collection: (optional) Specific collection to search
- limit: (optional) Number of results

store_context
Store new information with metadata
Parameters:
- content: The content to store
- metadata: Tags, categories, descriptions
- collection: Target collection

list_collections
List all available collections and their metadata

find_similar_patterns
Find code patterns similar to provided example
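An MCP client normally issues these calls for you, but it can help to see the shape of the message. MCP invokes tools with a JSON-RPC 2.0 `tools/call` request whose params carry the tool name and its arguments. The sketch below builds such a request for `search_context`; the helper name `buildToolCall` and the argument values are made up for illustration.

```javascript
// Builds a JSON-RPC 2.0 request in the shape MCP uses for tool invocation.
// The tool and parameter names come from the list above; the argument
// values are made-up examples.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, "search_context", {
  query: "Python async examples",
  limit: 5, // optional; `collection` is omitted here to search everything
});
console.log(JSON.stringify(request, null, 2));
```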
Batch Processing Tools
scan_directory
Preview files in a directory before ingesting
Parameters:
- path: Directory to scan
- categories: Filter by type (images, cad, documents, data, code)
- extensions: Filter by extension (.jpg, .stl, etc.)
- recursive: Include subdirectories (default: true)

batch_ingest
Bulk ingest files into ChromaDB with full metadata
Parameters:
- path: Source directory
- collection: Target collection name
- categories: File types to include
- max_files: Limit number of files

quick_load
FAST: Rapidly load files for temporary processing
Parameters:
- path: Directory to load
- name: Collection name (auto-generated if omitted)
- categories: File types to include

unload_collection
Delete a collection (cleanup after quick_load)
Parameters:
- collection: Name of collection to delete

export_collection
Export collection to JSON file
Parameters:
- collection: Collection to export
- output_path: File path for JSON output

import_collection
Import collection from JSON file
Parameters:
- input_path: JSON file to import
- collection: Override collection name
- overwrite: Delete existing first (default: false)

get_collection_info
Get detailed stats about a collection
Parameters:
- collection: Collection name

ingest_file
Ingest a single file with metadata extraction
Parameters:
- path: File to ingest
- collection: Target collection

list_file_types
Show all supported file extensions
EXIF & Photo Tools
extract_exif
Extract detailed EXIF metadata from photos
Parameters:
- path: Path to JPEG or TIFF image
Returns: Camera, lens, exposure, GPS, date taken

Watch Folder Tools
watch_folder
Start auto-ingesting new files from a folder
Parameters:
- path: Folder to watch
- collection: Target collection (default: auto_ingest)
- categories: File types to watch
- include_exif: Extract EXIF from photos (default: true)

stop_watch
Stop watching a folder
Parameters:
- path: Folder to stop watching

list_watchers
List all active folder watchers
Duplicate Detection Tools
find_duplicates
Scan directory for duplicate files
Parameters:
- path: Directory to scan
- hash_method: "partial" (fast), "full" (thorough), "perceptual" (images)
- categories: File types to check
Returns: Duplicate groups with wasted space info

compare_files
Check if two files are duplicates
Parameters:
- file1: First file path
- file2: Second file path

find_collection_duplicates
Find duplicate entries in a ChromaDB collection
Parameters:
- collection: Collection name

Configuration
Environment Variables
CHROMADB_URL=http://localhost:8001  # ChromaDB server URL

Custom Collections
Add new collections in setup-home-collections.js:
await createCollection('ml_experiments', {
  description: 'Machine learning experiments and results'
});

Project Structure
chromadb-mcp-server/
├── index.js                    # MCP server with 22 tools
├── batch-processor.js          # Fast batch file processing engine
├── exif-extractor.js           # EXIF metadata extraction for photos
├── watch-folder.js             # Auto-ingest watch folder system
├── duplicate-detector.js       # Duplicate file detection
├── setup-home-collections.js   # Collection initialization
├── test-chromadb.js            # Connection test script
├── test-mcp.js                 # MCP functionality test
├── test-batch-processor.js     # Batch processing tests
├── HOME-AI-SETUP.md            # Detailed setup guide
├── package.json                # Project dependencies
└── README.md                   # This file

Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
See CONTRIBUTING.md for more details.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Anthropic for the MCP specification
Chroma for the excellent vector database
The open-source community for inspiration and support
What's Next?
Export/import collections (DONE!)
Batch file processing (DONE!)
EXIF metadata extraction (DONE in v3.0!)
Watch folders / auto-ingest (DONE in v3.0!)
Duplicate detection (DONE in v3.0!)
Cloud sync capabilities
Multi-user support
Web UI for collection management
AI-powered image descriptions (what's in the photo)
3D print analysis (volume, time estimates)
Built with ❤️ for the Home AI Community