ickyMCP

RAG MCP Server for Document Search. Built for legal professionals and business users who need to search across large document collections.

Features

  • Semantic Search: Find relevant content based on meaning, not just keywords

  • Document Support: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx), Markdown, Text

  • 4K Token Chunks: Large chunks preserve context for legal and business documents

  • Incremental Indexing: Only re-index changed files

  • Local Embeddings: Uses nomic-embed-text-v1.5 (no API costs)

  • SQLite Storage: Single portable database file

Installation

# Clone or copy the project
cd ickyMCP

# Create virtual environment
python -m venv venv
source venv/bin/activate  # or `venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt

# Or install as package
pip install -e .

Configuration

Environment Variables

Variable              Default                         Description
ICKY_CHUNK_SIZE       4000                            Tokens per chunk
ICKY_CHUNK_OVERLAP    500                             Overlap between chunks
ICKY_DB_PATH          ./icky.db                       Path to SQLite database
ICKY_EMBEDDING_MODEL  nomic-ai/nomic-embed-text-v1.5  Embedding model
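
As an illustration of how these variables and their defaults fit together, here is a minimal sketch of a loader. The `IckyConfig` class and `load_config` function are hypothetical names for this example and are not taken from the ickyMCP source; the check that the overlap is smaller than the chunk size is also an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IckyConfig:
    """Hypothetical container for the four documented settings."""
    chunk_size: int
    chunk_overlap: int
    db_path: str
    embedding_model: str


def load_config(env=None) -> IckyConfig:
    """Read the ICKY_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    config = IckyConfig(
        chunk_size=int(env.get("ICKY_CHUNK_SIZE", "4000")),
        chunk_overlap=int(env.get("ICKY_CHUNK_OVERLAP", "500")),
        db_path=env.get("ICKY_DB_PATH", "./icky.db"),
        embedding_model=env.get(
            "ICKY_EMBEDDING_MODEL", "nomic-ai/nomic-embed-text-v1.5"
        ),
    )
    # Assumed sanity check: an overlap as large as the chunk would never advance.
    if not 0 <= config.chunk_overlap < config.chunk_size:
        raise ValueError("ICKY_CHUNK_OVERLAP must be smaller than ICKY_CHUNK_SIZE")
    return config
```

Calling `load_config()` with no arguments reads from the process environment; passing a plain dict makes the behaviour easy to check in isolation.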

Claude Code Configuration

Add the following to claude_desktop_config.json (Claude Desktop) or to your client's MCP settings:

{
  "mcpServers": {
    "ickyMCP": {
      "command": "python",
      "args": ["/path/to/ickyMCP/run.py"],
      "env": {
        "ICKY_CHUNK_SIZE": "4000",
        "ICKY_CHUNK_OVERLAP": "500",
        "ICKY_DB_PATH": "/path/to/icky.db"
      }
    }
  }
}

Usage

Tools Available

index

Index documents from a file or directory.

index(path="/contracts/2024", patterns=["*.pdf", "*.docx"])

search

Semantic search across indexed documents.

search(query="indemnification clause", top_k=10, file_types=["pdf"])

similar

Find chunks similar to a given text.

similar(chunk_text="The parties agree to...", top_k=5)

refresh

Re-index only files that have changed.

refresh(path="/contracts")
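
The README does not say how changed files are detected. One common approach, sketched here as an assumption and not as the project's actual method, is to store a content hash per file and compare it on refresh. The names `file_digest` and `plan_refresh` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            h.update(block)
    return h.hexdigest()


def plan_refresh(root: Path, indexed: dict[str, str]) -> dict[str, list[str]]:
    """Compare files under `root` with a {path: digest} map from the last index run.

    Returns the paths that would need indexing, re-indexing, or removal.
    """
    current = {
        str(p): file_digest(p) for p in sorted(root.rglob("*")) if p.is_file()
    }
    return {
        "added": [p for p in current if p not in indexed],
        "changed": [p for p in current if p in indexed and indexed[p] != current[p]],
        "removed": [p for p in indexed if p not in current],
    }
```

Hashing every file is simple but reads each file in full; comparing size and modification time first would be a cheaper pre-filter.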

list

List all indexed documents.

list(path_filter="/contracts")

delete

Remove documents from the index.

delete(path="/contracts/old")
delete(all=true)  # Clear entire index

status

Get server status and statistics.

status()
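
The call-style examples above are shorthand. Over MCP, a client invokes a tool by sending a JSON-RPC `tools/call` request with the tool name and its arguments. The sketch below builds such a request for the `search` example; the `tool_call` helper is a hypothetical name for this illustration, and real clients normally build these messages for you.

```python
import json
from itertools import count

_ids = count(1)  # simple request id generator for this sketch


def tool_call(name, **arguments):
    """Build an MCP `tools/call` JSON-RPC request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call(
    "search", query="indemnification clause", top_k=10, file_types=["pdf"]
)
print(json.dumps(request, indent=2))
```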

How It Works

  1. Indexing: Documents are parsed and split into 4K-token chunks with a 500-token overlap

  2. Embedding: Each chunk is embedded using nomic-embed-text-v1.5 (768 dimensions)

  3. Storage: Embeddings are stored in SQLite with sqlite-vec for fast vector search

  4. Search: The query is embedded and compared against all chunks using cosine similarity

  5. Results: The top-K most similar chunks are returned with full text and metadata
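
The chunking and ranking steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: it takes an already tokenized list as input, and it ranks by brute-force cosine similarity in memory where the server delegates the vector search to sqlite-vec. The names `chunk_tokens`, `cosine`, and `top_k` are hypothetical.

```python
import math


def chunk_tokens(tokens, size=4000, overlap=500):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=10):
    """Return (score, chunk_index) pairs for the k most similar chunks."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

With the default settings each new chunk starts 3,500 tokens after the previous one, so a 10,000-token document yields three chunks, the last holding the final 3,000 tokens.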

System Requirements

  • Python 3.10+

  • 4GB RAM (2GB for model + headroom)

  • ~1GB disk space (model + database)

License

MIT
