# ickyMCP

RAG MCP Server for Document Search. Built for legal professionals and business users who need to search across large document collections.

## Features

- **Semantic Search**: Find relevant content based on meaning, not just keywords
- **Document Support**: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), Markdown, Text
- **4K Token Chunks**: Large chunks preserve context for legal and business documents
- **Incremental Indexing**: Only re-index changed files
- **Local Embeddings**: Uses nomic-embed-text-v1.5 (no API costs)
- **SQLite Storage**: Single portable database file

## Installation

```bash
# Clone or copy the project
cd ickyMCP

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt

# Or install as package
pip install -e .
```

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `ICKY_CHUNK_SIZE` | 4000 | Tokens per chunk |
| `ICKY_CHUNK_OVERLAP` | 500 | Overlap between chunks |
| `ICKY_DB_PATH` | `./icky.db` | Path to SQLite database |
| `ICKY_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Embedding model |

### Claude Code Configuration

Add to your `claude_desktop_config.json` or MCP settings:

```json
{
  "mcpServers": {
    "ickyMCP": {
      "command": "python",
      "args": ["/path/to/ickyMCP/run.py"],
      "env": {
        "ICKY_CHUNK_SIZE": "4000",
        "ICKY_CHUNK_OVERLAP": "500",
        "ICKY_DB_PATH": "/path/to/icky.db"
      }
    }
  }
}
```

## Usage

### Tools Available

#### `index`

Index documents from a file or directory.

```
index(path="/contracts/2024", patterns=["*.pdf", "*.docx"])
```

#### `search`

Semantic search across indexed documents.

```
search(query="indemnification clause", top_k=10, file_types=["pdf"])
```

#### `similar`

Find chunks similar to a given text.

```
similar(chunk_text="The parties agree to...", top_k=5)
```

#### `refresh`

Re-index only files that have changed.

```
refresh(path="/contracts")
```

#### `list`

List all indexed documents.

```
list(path_filter="/contracts")
```

#### `delete`

Remove documents from the index.

```
delete(path="/contracts/old")
delete(all=true)  # Clear entire index
```

#### `status`

Get server status and statistics.

```
status()
```

## How It Works

1. **Indexing**: Documents are parsed, split into 4K token chunks with 500 token overlap
2. **Embedding**: Each chunk is embedded using nomic-embed-text-v1.5 (768 dimensions)
3. **Storage**: Embeddings stored in SQLite with sqlite-vec for fast vector search
4. **Search**: Query is embedded, compared against all chunks using cosine similarity
5. **Results**: Top-K most similar chunks returned with full text and metadata

## System Requirements

- Python 3.10+
- 4GB RAM (2GB for model + headroom)
- ~1GB disk space (model + database)

## License

MIT
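## Example Sketch: Chunking and Search

The indexing and search steps described under "How It Works" can be illustrated with a small, dependency-free sketch. This is not the project's actual code: the function names `chunk_tokens`, `cosine_similarity`, and `top_k` are hypothetical, tokens are represented as plain list items rather than real tokenizer output, and the vectors are toy values standing in for the 768-dimensional embeddings. The real server stores embeddings in SQLite via sqlite-vec; here a simple in-memory scan is swapped in to show the idea.

```python
import math


def chunk_tokens(tokens, chunk_size=4000, overlap=500):
    """Split a token list into windows of chunk_size that share `overlap` tokens.

    Defaults mirror the README's ICKY_CHUNK_SIZE / ICKY_CHUNK_OVERLAP values.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=10):
    """Return up to k (score, chunk_index) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # A 10,000-"token" document: consecutive chunks share 500 tokens.
    chunks = chunk_tokens(list(range(10_000)))
    print([len(c) for c in chunks])

    # Toy 2-dimensional "embeddings" standing in for real model output.
    results = top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
    print(results)
```

With the default settings each window advances by 3,500 tokens, so the last 500 tokens of one chunk reappear at the start of the next. That repetition is what keeps a clause that straddles a chunk boundary searchable from either side.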