by datarian

cocoindex MCP

An MCP server that incrementally indexes repositories and documents into a Postgres + pgvector store using CocoIndex, and exposes semantic search over them.

The pipeline is source → extract (format registry) → chunk → embed → store:

  • Sources (src/mcp_coco/sources.py) — local filesystem today, as two profiles: repo (code-aware, vendored dirs excluded) and document (markdown/text/pdf, prose chunking).

  • Formats (src/mcp_coco/formats.py) — a registry mapping a file to normalized text. PDF (via pymupdf) is just one handler; add a format by registering one.

  • Indexer (src/mcp_coco/indexer.py) — the CocoIndex app: chunk + embed (sentence-transformers) and declare rows into one doc_embeddings table.

  • Search (src/mcp_coco/db.py) — embeds the query and runs a pgvector similarity search.
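The format registry idea can be illustrated with a small sketch. This is not the code in src/mcp_coco/formats.py; the names register, extract, and plain_text are hypothetical, and the sketch assumes handlers are keyed by file suffix and receive raw bytes.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical registry: file suffix -> handler that returns normalized text.
Handler = Callable[[bytes], str]
_REGISTRY: Dict[str, Handler] = {}


def register(*suffixes: str) -> Callable[[Handler], Handler]:
    """Decorator that registers one handler for one or more file suffixes."""
    def decorator(handler: Handler) -> Handler:
        for suffix in suffixes:
            _REGISTRY[suffix.lower()] = handler
        return handler
    return decorator


@register(".md", ".txt")
def plain_text(data: bytes) -> str:
    # Decode as UTF-8 (replacing undecodable bytes) and normalize line endings.
    return data.decode("utf-8", errors="replace").replace("\r\n", "\n")


def extract(path: str, data: bytes) -> Optional[str]:
    """Return normalized text, or None when no handler matches the suffix."""
    handler = _REGISTRY.get(Path(path).suffix.lower())
    return handler(data) if handler else None
```

Under these assumptions, supporting a new format means writing one function and decorating it; the rest of the pipeline only ever sees normalized text.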

CocoIndex tracks its incremental state in a local LMDB file (COCOINDEX_DB), so re-indexing only reprocesses what changed and removes rows for deleted files.
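To make the incremental behaviour concrete, here is a generic content-hash sketch. It is an illustration only: how CocoIndex actually records state in its LMDB file is not described here, and fingerprint and plan_update are hypothetical names.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(content: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()


def plan_update(
    previous: Dict[str, str], current_files: Dict[str, bytes]
) -> Tuple[List[str], List[str], Dict[str, str]]:
    """Compare stored fingerprints with the files present now.

    Returns (paths to reprocess, paths whose rows should be removed,
    new state to persist for the next run).
    """
    current = {path: fingerprint(data) for path, data in current_files.items()}
    # New or modified files: fingerprint missing or different in the old state.
    to_process = sorted(p for p, h in current.items() if previous.get(p) != h)
    # Deleted files: present in the old state, absent now.
    to_delete = sorted(p for p in previous if p not in current)
    return to_process, to_delete, current
```

On a second run with one file added and one removed, only the added file is returned for processing and only the removed file is returned for deletion; unchanged files are skipped.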

Prerequisites

  • uv (Python package manager)

  • just (task runner, optional but convenient)

  • A Postgres instance with pgvector

  • Docker (if you want to run pgvector via the included compose file)


Quick start (local)

1. Start a pgvector database

If you already have a Postgres instance with pgvector, skip this step and set DATABASE_URL accordingly.

Otherwise, use the included compose file:

docker compose up -d

This starts pgvector on localhost:5432 with user/password/db all set to cocoindex.

2. Install dependencies

uv sync

3. Configure

cp .env.example .env

Edit .env and set DATABASE_URL to point at your Postgres instance. For the Docker-based database:

DATABASE_URL=postgresql://cocoindex:cocoindex@localhost:5432/cocoindex

4. Verify the database connection

just init

5. Index something

just index ./path/to/repo repo
just index ./path/to/docs document

The first run downloads the embedding model (~80 MB) from Hugging Face.

6. Search

just search "how does authentication work"

Using with Coding Agents

Add the MCP server to your Claude Code settings (~/.claude/settings.json for global, or .claude/settings.json in a project):

{
  "mcpServers": {
    "cocoindex": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/cocoindex-mcp", "mcp-coco-server"],
      "env": {
        "DATABASE_URL": "postgresql://cocoindex:cocoindex@localhost:5432/cocoindex",
        "COCOINDEX_DB": "/absolute/path/to/cocoindex-mcp/.cocoindex/state.db"
      }
    }
  }
}

Replace /absolute/path/to/cocoindex-mcp with the actual path to this repository.

If your Postgres instance is elsewhere (e.g. a cloud-hosted database), adjust DATABASE_URL accordingly. For anything beyond the local Docker database, pass credentials through environment variables; do NOT hardcode them into a connection string that gets committed or shared.
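One way to keep credentials out of checked-in files is to assemble DATABASE_URL from separate environment variables at launch time. The sketch below is an assumption, not part of this project: it borrows the standard libpq variable names (PGUSER, PGPASSWORD, PGHOST, PGPORT, PGDATABASE), and database_url_from_env is a hypothetical helper.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def database_url_from_env(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a postgresql:// URL from libpq-style environment variables.

    PGUSER and PGPASSWORD are required; the rest fall back to defaults.
    """
    env = os.environ if env is None else env
    # Percent-encode values so characters like "@" or "/" cannot break the URL.
    user = quote(env["PGUSER"], safe="")
    password = quote(env["PGPASSWORD"], safe="")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = quote(env.get("PGDATABASE", env["PGUSER"]), safe="")
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

The result can be exported as DATABASE_URL in the shell that starts the MCP server, so the secret never appears in settings.json.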

Once configured, Claude Code can use these tools:

Tool                                 Description
index_repo(path)                     Index a code repository
index_documents(path)                Index a document collection
search(query, limit, source_kind)    Semantic search over indexed content
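What the search tool does can be approximated in memory. The sketch below ranks rows by cosine distance, which pgvector exposes as the <=> operator; whether this server uses that metric is an assumption, as are the meanings of limit (maximum number of results) and source_kind (an optional "repo" or "document" filter), the default values, and the row layout.

```python
import math
from typing import List, Optional, Sequence, Tuple

# Hypothetical row layout: (source_kind, chunk text, embedding vector).
Row = Tuple[str, str, Sequence[float]]


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def search(
    rows: Sequence[Row],
    query_embedding: Sequence[float],
    limit: int = 5,
    source_kind: Optional[str] = None,
) -> List[str]:
    """Return the texts of the `limit` rows closest to the query embedding."""
    # Optional filter on source kind, then rank by ascending distance.
    candidates = [r for r in rows if source_kind is None or r[0] == source_kind]
    ranked = sorted(candidates, key=lambda r: cosine_distance(r[2], query_embedding))
    return [text for _, text, _ in ranked[:limit]]
```

In the real server the query embedding comes from the same sentence-transformers model used at index time, and the ranking runs inside Postgres rather than in Python.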

Development (devcontainer)

  1. Open this folder in VS Code and choose "Reopen in Container" (Dev Containers extension). The db service starts automatically alongside the app container.

  2. Run the preflight check:

    just install
    just init

    Copy .env.example to .env to customize settings. Inside the devcontainer the database hostname is db (the default).

just recipes

just index <path> [repo|document|auto]   # index a path
just index-repo <path>                   # index as code repository
just index-docs <path>                   # index as document collection
just search "query" [limit]              # semantic search
just drop <path> [repo|document|auto]    # remove a source from the index
just visualize_index                     # show a map of what's indexed
just serve                               # run the MCP server over stdio
just test                                # run tests
just lint                                # run ruff