
Paparats MCP

License: MIT

Open in GitHub Codespaces to try the full stack in your browser, no install (details under Quick Start).

Paparats-kvetka is a magical flower from Slavic folklore that blooms on Kupala Night and grants whoever finds it the power to see hidden things. Likewise, paparats-mcp helps your agent see the right code across a sea of repositories.

🌿 Works with Claude Code · Cursor · Windsurf · Copilot · Codex · Antigravity · any MCP-compatible agent

Give your AI coding assistant deep, real understanding of your entire workspace. Paparats indexes every repo you care about semantically, with AST-aware chunking and a cross-chunk symbol graph, and exposes it through the Model Context Protocol. Search by meaning, follow who-uses-what through real symbol edges, and see who last touched a chunk and which ticket it came from, all without your code ever leaving your machine.

📊 The built-in /ui operator console: ROI, query quality, cross-project usage, per-user activity, indexer health. Screenshot uses synthetic data (?demo=1): no real queries, users, or project names.

  • ⚡ One install, one config. paparats install → paparats add ~/code/repo → done.

  • 🌳 AST-aware chunking and symbol extraction. Tree-sitter parses every supported file once and feeds both chunking and the cross-chunk symbol graph (calls / called_by / references / referenced_by). 11 languages, including TypeScript, Python, Go, Rust, Java, Ruby, C, C++, C#.

  • 🧠 Architectural memory that the agent maintains itself. A second vector store per group holds components, decisions (ADRs) and lessons learned; your agent writes them as it works and reads them before answering. Bootstrap on day one with the init_arch_memory MCP prompt (the /init of architectural memory). A server-side similarity gate prevents duplicates, supersedes links replace stale decisions, a min_score threshold gates low-confidence reads, every card carries an "updated N ago" stamp, and Prometheus metrics tell you whether your memory is actually being used.

  • 💸 Saves tokens. Returns only the chunks that matter, with token-savings telemetry to prove it (per-query, per-user, per-anchor-project).

  • 🔭 Production-ready observability. Prometheus /metrics, OpenTelemetry traces (Tempo, Jaeger, Honeycomb, Datadog, Grafana Cloud, Elastic APM), local SQLite analytics, and a built-in /ui operator console that visualises ROI, query quality, cross-project usage and indexer health in one screen.

  • 🏠 100% local by default. Qdrant + Ollama on your machine. No cloud, no API keys, no telemetry leaving the box. Bring your own Qdrant Cloud / Ollama URL if you want.



Why Paparats?

AI coding assistants are smart, but they can only see files you open. They don't know your codebase structure, where the authentication logic lives, or how services connect. Paparats fixes that.

What you get

  • Semantic code search: ask "where is the rate limiting logic?" and get exact code ranked by meaning, not grep matches

  • Real-time sync: edit a file and it is re-indexed about 2 seconds later. No manual re-runs

  • Cross-chunk symbol graph: find_usages walks AST-derived edges (calls, called_by, references, referenced_by) so the agent can trace dependencies without re-grepping

  • Token savings: return only relevant chunks instead of full files to reduce context size

  • Multi-project workspaces: search across backend, frontend, infra repos in one query

  • 100% local & private: Qdrant vector database + Ollama embeddings. Nothing leaves your laptop

  • AST-aware chunking: code split by AST nodes (functions/classes) via tree-sitter, not arbitrary character counts (TypeScript, JavaScript, TSX, Python, Go, Rust, Java, Ruby, C, C++, C#; regex fallback for Terraform)

  • Rich metadata: each chunk knows its symbol name (from tree-sitter AST), service, domain context, and tags from directory structure

  • Git history per chunk: see who last modified a chunk, when, and which tickets (Jira, GitHub) are linked to it

  • Architectural memory: a living knowledge base of components, decisions (ADRs) and lessons learned, written by the agent as it learns, deduplicated server-side by vector similarity, and consulted on every support query so the agent stays consistent across sessions

Who benefits

| Use Case | How Paparats Helps |
| --- | --- |
| Solo developers | Quickly navigate unfamiliar codebases, find examples of patterns, reduce context-switching |
| Multi-repo teams | Cross-project search (backend + frontend + infra), consistent patterns, faster onboarding |
| AI agents | Foundation for product support bots, QA automation, dev assistants: any agent that needs code context |
| Legacy modernization | Find all usages of deprecated APIs, identify migration patterns, discover hidden dependencies |
| Contractors/consultants | Accelerate ramp-up on client codebases, reduce "where is X?" questions |


Quick Start

Try it in the browser (no install)


Spin up a full Qdrant + Ollama + paparats stack in a Codespace. A small slice of the repo (packages/shared/src) is auto-indexed on first start so you can run

paparats search -g demo 'gitignore filter'

within a few minutes. The Codespace forwards port 9876 for MCP; point Cursor/Claude Code at it via the URL VS Code shows in the Ports panel.

Note: Codespaces is for demo only. With Ollama embedding on CPU, indexing the full repo would take 15+ minutes and can hit batch timeouts on large files. For real workloads, run locally, or set OPENAI_API_KEY (or VOYAGE_API_KEY) as a Codespaces user secret, which drops indexing to a couple of seconds; see the Embedding providers section below.

Run locally

You need Docker and Docker Compose v2. On macOS, also install Ollama natively: running it inside Docker on macOS is significantly slower because the Docker VM cannot use Apple Silicon GPU acceleration.

# 1. Install the CLI.
npm install -g @paparats/cli

# 2. macOS only: install Ollama natively (Linux uses Docker Ollama by default).
brew install ollama

# 3. One-time bootstrap. Generates ~/.paparats/{docker-compose.yml,projects.yml},
#    starts the stack, downloads the embedding model, wires Cursor/Claude Code MCP.
paparats install

# 4. Add the projects you want indexed. Local paths bind-mount read-only into the
#    indexer; git URLs and owner/repo shorthand get cloned.
paparats add ~/code/my-project
paparats add git@github.com:acme/billing.git
paparats add acme/widgets

# 5. Watch it work.
paparats list

That's it. Your IDE is already wired (~/.cursor/mcp.json, ~/.claude/mcp.json) to http://localhost:9876/mcp. Open Cursor or Claude Code and ask:

"Search this workspace for the auth middleware and show me everything that calls it."

Existing v1 user?

Just run paparats install again. The installer detects the legacy per-project compose, asks once before swapping it for the new global setup, and preserves your indexed data (Qdrant collections, SQLite metadata, embedding cache). Your in-repo .paparats.yml files keep working as per-project overrides.


How the install works

paparats install is the only setup command. It creates a single global home at ~/.paparats/, brings up a Docker stack, and wires your MCP clients. Re-run it any time to reconfigure; it diffs the existing compose and asks before overwriting hand edits.

~/.paparats/
├── docker-compose.yml          generated; hand-editable; install asks before overwriting
├── projects.yml                project list (CLI rewrites it; comments survive your manual edits)
├── install.json                install flags persisted so add/remove can regenerate compose
├── .env                        secrets (Qdrant API key, GitHub token); chmod 600
├── models/                     jina-code-embeddings GGUF + Modelfile
└── data/                       Docker volumes (mounted by name from compose)
    ├── qdrant/                 vector index
    ├── sqlite/                 metadata.db, embeddings.db, analytics.db
    └── repos/                  cloned remote projects

Inside the Docker stack:

| Service | Image | Port | Role |
| --- | --- | --- | --- |
| paparats-mcp | ibaz/paparats-server:latest | 9876 | MCP HTTP/SSE endpoints, search, metadata API |
| paparats-indexer | ibaz/paparats-indexer:latest | 9877 | Cron + on-demand indexing, hot-reload of project list |
| qdrant | qdrant/qdrant:latest | 6333 | Vector DB (skipped when you pass --qdrant-url) |
| ollama | ibaz/paparats-ollama:latest | 11434 | Embedding model (Linux default; macOS uses native Ollama) |

The indexer hot-reloads projects.yml. Edits that change project metadata only (group, language, indexing tweaks) reindex in place. Edits that add or remove local-path projects require a stack restart so Docker picks up the new bind-mount; the CLI does this for you on paparats add and paparats remove.


Install variants

paparats install

On macOS prefers native Ollama and dockerized Qdrant. On Linux defaults to Docker for both.

Bring your own Qdrant

paparats install --qdrant-url https://qdrant.example.com
# Asks for an API key after; stored in ~/.paparats/.env as QDRANT_API_KEY.

When --qdrant-url is set the Qdrant container is omitted from the stack entirely.

Bring your own Ollama

paparats install --ollama-url http://10.0.0.5:11434

Skips both native and Docker Ollama.

You must register the embedding model on the remote Ollama yourself. The installer will not touch a remote instance. On the Ollama host, download the GGUF (jinaai/jina-code-embeddings-1.5b-Q8_0.gguf) and run:

echo "FROM /path/to/jina-code-embeddings-1.5b-Q8_0.gguf" > Modelfile
ollama create jina-code-embeddings -f Modelfile

Then paparats install --ollama-url http://that-host:11434 and Paparats will use it.

Force Docker Ollama on macOS

paparats install --ollama-mode docker

Slower on Apple Silicon (no Metal GPU), but useful for parity testing or laptops without brew.

Scripted / CI

paparats install --non-interactive --force

Fails on any prompt; --force answers Y to compose-overwrite and migration prompts.


Migrating from a v1 install

When paparats install finds a legacy ~/.paparats/docker-compose.yml (the one from the old per-project flow with no paparats-indexer service), it prints a one-screen migration notice and asks before tearing the legacy stack down.

What survives: Qdrant collections, SQLite metadata, indexer repos, and any .paparats.yml files inside your repos (those still take precedence over projects.yml overrides).

What's deleted: the legacy docker-compose.yml and .env. They are regenerated on the spot under the new schema.

No re-indexing is needed: the data volumes are referenced by the same names in the new compose. Add your projects with paparats add and they re-appear in paparats list with their existing chunks.

If your install predates the paparats-indexer.yml → projects.yml rename, the installer migrates the file in place on first run and prints a one-line notice. The indexer also reads the legacy name as a fallback, so nothing breaks if you roll out the indexer before re-running paparats install.

Pass --force to skip the migration prompt in scripts.


Support agent setup

For bots and support teams that consume an existing Paparats server. No Docker or Ollama is needed on this side.

# Connect to a running server (default: localhost:9876)
paparats install --mode support

# Connect to a remote server
paparats install --mode support --server http://prod-server:9876

The installer verifies the server is reachable, then wires Cursor MCP (~/.cursor/mcp.json) and Claude Code MCP (~/.claude/mcp.json) to the support endpoint. Tools available on /support/mcp: search_code, get_chunk, find_usages, list_projects, health_check, get_chunk_meta, search_changes, explain_feature, recent_changes, impact_analysis, arch_context, arch_record_component, arch_record_decision, arch_record_lesson (architectural memory; see Key Features), plus the analytics tools described in Observability below.


How It Works

Your projects                   Paparats                       AI assistant
                                                               (Claude Code / Cursor)
  backend/                 ┌───────────────────────┐
    .paparats.yml ────────►│  Indexer              │
  frontend/                │   - chunks code       │          ┌──────────────┐
    .paparats.yml ────────►│   - embeds via Ollama │─────────►│ MCP search   │
  infra/                   │   - stores in Qdrant  │          │ tool call    │
    .paparats.yml ────────►│   - watches changes   │          └──────────────┘
                           └───────────────────────┘

Indexing Pipeline

During each indexer cycle (cron-driven, on-demand via paparats add, or triggered by the indexer's chokidar file watcher), every file in scope flows through this pipeline:

 Source file
     │
     ▼
 ┌──────────────────┐
 │ 1. File discovery│  Collect files from indexing.paths, apply
 │    & filtering   │  gitignore + exclude patterns, skip binary
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 2. Content hash  │  SHA-256 of file content → compare with
 │    check         │  existing Qdrant chunks → skip unchanged
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 3. AST parsing   │  tree-sitter parses the file once (WASM)
 │    (single pass) │  → reused for chunking AND symbol extraction
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 4. Chunking      │  AST nodes → chunks at function/class
 │                  │  boundaries. Regex fallback for unsupported
 │                  │  languages (brace/indent/block strategies)
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 5. Symbol        │  AST queries extract module-level defines
 │    extraction    │  (function/class/variable names) and uses
 │                  │  (calls, references) per chunk. 11 languages
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 6. Metadata      │  Service name, bounded_context, tags from
 │    enrichment    │  config + auto-detected directory tags
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 7. Embedding     │  Jina Code Embeddings 1.5B via Ollama
 │                  │  SQLite cache (content-hash key) → skip
 │                  │  already-embedded content
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 8. Qdrant upsert │  Vectors + payload (content, file, lines,
 │                  │  symbols, metadata) → batched upsert
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │ 9. Git history   │  git log per file → diff hunks → map
 │    (post-index)  │  commits to chunks by line overlap →
 │                  │  extract ticket refs → store in SQLite
 └────────┬─────────┘
          ▼
 ┌──────────────────┐
 │10. Symbol graph  │  Cross-chunk edges: calls ↔ called_by,
 │    (post-index)  │  references ↔ referenced_by → SQLite
 └──────────────────┘

Step 5's symbol extractor only emits module-level definitions. Locals declared inside function bodies, callback args, and hook closures stay out of the graph because they're not addressable from another chunk anyway.

Search Flow

AI assistant queries via MCP → server detects query type (nl2code / code2code / techqa) → expands query (abbreviations, case variants, plurals) → all variants searched in parallel against Qdrant → results merged by max score → only relevant chunks returned with confidence scores and symbol info.

Watching

The indexer container watches the projects mounted into it via chokidar with debouncing (1s default). On change, only the affected file re-enters the pipeline. Unchanged content is never re-embedded thanks to the content-hash cache. The indexer also hot-reloads ~/.paparats/projects.yml itself: metadata-only edits reindex in place; add/remove of local-path projects triggers a stack restart through the CLI.
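The debouncing behaviour described above can be sketched with a small class that takes the clock as an argument. This is an illustration in Python, not the indexer's chokidar-based implementation; the Debouncer class and its method names are hypothetical.

```python
class Debouncer:
    """Releases a path once it has been quiet for `delay` seconds (hypothetical sketch)."""

    def __init__(self, delay: float = 1.0):
        self.delay = delay      # 1s matches the default quoted in this README
        self._last_seen = {}    # path -> timestamp of the most recent change event

    def record(self, path: str, now: float) -> None:
        # A repeated event for the same path simply restarts its quiet period.
        self._last_seen[path] = now

    def due(self, now: float) -> list:
        # Paths with no new events for at least `delay` seconds are ready to re-index.
        ready = sorted(p for p, t in self._last_seen.items() if now - t >= self.delay)
        for p in ready:
            del self._last_seen[p]
        return ready
```

Passing timestamps explicitly keeps the sketch deterministic; a real watcher would use timers driven by file-system events.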


Key Features

Better Search Quality

Task-specific embeddings: Jina Code Embeddings supports 3 query types (nl2code, code2code, techqa) with different prefixes for better relevance:

  • "find authentication middleware" → nl2code prefix (natural language → code)

  • "function validateUser(req, res)" → code2code prefix (code → similar code)

  • "how does OAuth work in this app?" → techqa prefix (technical questions)

Query expansion: every search generates 2-3 variations server-side:

  • Abbreviations: auth ↔ authentication, db ↔ database

  • Case variants: userAuth → user_auth → UserAuth

  • Plurals: users → user, dependencies → dependency

  • Filler removal: "how does auth work" → "auth"

All variants searched in parallel, results merged by max score.
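A rough sketch of the expand-then-merge idea, in Python for illustration only. The abbreviation table, filler list and function names are assumptions made for this example and not the server's actual rules; plural handling is left out.

```python
import re

# Hypothetical tables; the server's real lists are not shown in this README.
ABBREVIATIONS = {"auth": "authentication", "db": "database"}
FILLER = {"how", "does", "do", "the", "a", "is", "work", "works", "where"}


def expand_query(query: str) -> list:
    """Return the original query plus simple variants, deduplicated in order."""
    variants = [query]
    words = query.split()

    # Abbreviations in both directions (auth <-> authentication).
    both_ways = {**ABBREVIATIONS, **{v: k for k, v in ABBREVIATIONS.items()}}
    variants.append(" ".join(both_ways.get(w.lower(), w) for w in words))

    # Case variant: camelCase -> snake_case.
    variants.append(re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", query).lower())

    # Filler removal: keep only the words that carry meaning.
    kept = [w for w in words if w.lower() not in FILLER]
    if kept:
        variants.append(" ".join(kept))

    return list(dict.fromkeys(variants))


def merge_by_max_score(result_lists):
    """Merge per-variant results, keeping the best score seen for each chunk id."""
    best = {}
    for results in result_lists:
        for chunk_id, score in results:
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

Each variant would be sent to the vector search separately; merge_by_max_score then collapses the per-variant hit lists so a chunk found by several variants appears once, with its highest score.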

Confidence scores: each result includes a percentage score (≥60% high, 40–60% partial, <40% low) to guide the AI's next steps.
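As a small illustration, the bands above map onto a function like the following. The function name is hypothetical, and treating exactly 60% as high and exactly 40% as partial is an assumption about the boundaries.

```python
def confidence_label(score_percent: float) -> str:
    """Map a percentage score to the bands quoted above (boundary handling assumed)."""
    if score_percent >= 60:
        return "high"
    if score_percent >= 40:
        return "partial"
    return "low"
```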

Performance

Embedding cache: SQLite cache with content-hash keys + Float32 vectors. Unchanged code is never re-embedded. LRU cleanup at 100k entries.
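A minimal sketch of such a cache, written in Python with the standard library. The table layout, class name and the logical clock used for LRU ordering are assumptions for illustration; the actual schema is not documented here.

```python
import hashlib
import sqlite3
from array import array


class EmbeddingCache:
    """Content-hash keyed vector cache with LRU eviction (hypothetical schema)."""

    def __init__(self, path: str = ":memory:", max_entries: int = 100_000):
        self.db = sqlite3.connect(path)
        self.max_entries = max_entries
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS embeddings ("
            "hash TEXT PRIMARY KEY, vector BLOB NOT NULL, last_used INTEGER NOT NULL)"
        )
        # Logical clock for recency; resumes from the stored maximum on reopen.
        row = self.db.execute("SELECT COALESCE(MAX(last_used), 0) FROM embeddings").fetchone()
        self._clock = row[0]

    @staticmethod
    def key(content: str) -> str:
        return hashlib.sha256(content.encode("utf-8")).hexdigest()

    def _tick(self) -> int:
        self._clock += 1
        return self._clock

    def get(self, content: str):
        k = self.key(content)
        row = self.db.execute("SELECT vector FROM embeddings WHERE hash = ?", (k,)).fetchone()
        if row is None:
            return None  # cache miss: the caller has to embed this content
        self.db.execute("UPDATE embeddings SET last_used = ? WHERE hash = ?", (self._tick(), k))
        vector = array("f")  # 32-bit floats
        vector.frombytes(row[0])
        return vector.tolist()

    def put(self, content: str, vector) -> None:
        blob = array("f", vector).tobytes()
        self.db.execute(
            "INSERT OR REPLACE INTO embeddings (hash, vector, last_used) VALUES (?, ?, ?)",
            (self.key(content), blob, self._tick()),
        )
        # LRU cleanup: keep only the most recently used max_entries rows.
        self.db.execute(
            "DELETE FROM embeddings WHERE hash NOT IN ("
            "SELECT hash FROM embeddings ORDER BY last_used DESC LIMIT ?)",
            (self.max_entries,),
        )
        self.db.commit()
```

Because the key is derived from the content alone, moving or renaming a file does not invalidate its cached vectors.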

AST-aware chunking: tree-sitter AST nodes define natural chunk boundaries for 11 languages. Falls back to regex strategies (block-based for Ruby, brace-based for JS/TS, indent-based for Python, fixed-size) for unsupported languages.

Real-time watching: the indexer's chokidar watcher reindexes a project on file changes with debouncing (1s default). For local-path projects bind-mounted into the indexer, edits on your host show up in MCP queries within seconds.

Cross-chunk symbol graph

The post-index pass walks every chunk's defines_symbols / uses_symbols lists and materializes edges into SQLite: calls, called_by, references, referenced_by. find_usages returns those edges grouped by direction so the agent can traverse the graph without re-searching. Because extraction is AST-driven, function locals don't pollute the graph.
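The edge-building pass can be pictured roughly as follows. This Python sketch assumes each chunk carries a list of defined symbol names and a list of (symbol, kind) uses; that input shape and the function name are illustrative, not the project's actual data model.

```python
from collections import defaultdict

# Each forward edge kind has an inverse so the graph can be walked in both directions.
INVERSE = {"calls": "called_by", "references": "referenced_by"}


def build_edges(chunks: dict) -> list:
    """chunks: {chunk_id: {"defines": [name, ...], "uses": [(name, kind), ...]}}."""
    defined_in = defaultdict(set)
    for chunk_id, info in chunks.items():
        for symbol in info["defines"]:
            defined_in[symbol].add(chunk_id)

    edges = set()
    for chunk_id, info in chunks.items():
        for symbol, kind in info["uses"]:
            # Symbols with no known definition produce no edge.
            for target in defined_in.get(symbol, ()):
                if target == chunk_id:
                    continue  # a chunk using its own symbol is not a cross-chunk edge
                edges.add((chunk_id, kind, target))
                edges.add((target, INVERSE[kind], chunk_id))
    return sorted(edges)
```

Storing both directions up front is what lets a lookup by direction be a plain filter rather than a fresh search.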

Architectural memory (agent-maintained ADRs, components, lessons)

Code search tells the agent what the code does. Architectural memory tells it why, and the agent maintains that knowledge itself, across sessions, without you authoring a single doc.

Three card kinds, structured by design:

| Kind | Captures | Fields |
| --- | --- | --- |
| Component | A unit with a clear responsibility (service, module, subsystem) | name, summary with Does / Owns / Does not / Touched when |
| Decision | An architectural choice (ADR-style) | title, context, decision, alternatives_rejected, consequences |
| Lesson | A rule learned from an incident, a code review, a bug, or a user correction (Reflexion-style) | rule, why, when |

The agent reads them via arch_context before any architectural answer, and writes them via arch_record_component, arch_record_decision, and arch_record_lesson whenever it discovers something new or learns from a correction. Each card carries an updated N ago stamp in the read tool so the agent can spot stale memory and verify against current code.

Server-side similarity gate (cosine over bge-m3 text embeddings, 1024d):

  • ≥ 0.85 is a duplicate: decisions are refused (the agent must reconcile or supersede); lessons bump updatedAt (Reflexion-style "rule confirmed").

  • 0.70 – 0.85 is similar: surfaced to the agent so it can refine the wording or chain a supersede.

  • < 0.70 is new: accepted as a fresh card.

supersedes links bypass the gate and mark the prior decision as status=superseded so it disappears from default search but remains in history.
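The thresholds above can be sketched for decision cards as follows. This is a Python illustration with toy vectors; the function names are hypothetical, and whether a "similar" card is stored or only reported is not stated in this README, so the sketch only returns a status.

```python
import math

DUPLICATE_THRESHOLD = 0.85
SIMILAR_THRESHOLD = 0.70


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(best_similarity: float) -> str:
    if best_similarity >= DUPLICATE_THRESHOLD:
        return "duplicate"
    if best_similarity >= SIMILAR_THRESHOLD:
        return "similar"
    return "new"


def gate_decision(candidate, existing, supersedes=None) -> str:
    """Return the write status for a new decision card (sketch, not the server code)."""
    if supersedes is not None:
        return "created"  # supersedes links bypass the gate
    best = max((cosine(candidate, vector) for vector in existing), default=0.0)
    return {"duplicate": "duplicate", "similar": "similar", "new": "created"}[classify(best)]
```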

Why this matters:

  • 🧠 Cross-session continuity: what the agent learned last week, today's agent still knows.

  • 📝 ADRs without the ceremony: no markdown files to maintain, no review process, no doc drift. The agent writes when it learns.

  • 🔄 Reflexion built in: corrections become lessons; repeated mistakes get caught.

  • 🚦 No memory rot: the similarity gate kills duplicates, supersedes links replace stale decisions, age stamps trigger verification against code.

Lives in a separate Qdrant collection per group (paparats_<group>_arch). Reading (arch_context) is available on both endpoints, since coding agents need to know about prior decisions before refactoring. Writing (arch_record_*) is support-only: recording belongs to the architectural-review workflow, not to every line edit.

arch_context accepts a min_score parameter (default 0.45, cosine over bge-m3). Lower it to broaden recall on a sparse arch memory; raise it to demand only high-confidence cards. The tool also emits an explicit low-confidence hint when the question matched nothing above the threshold, so the agent knows to either rephrase or lower min_score instead of inventing context.
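The min_score behaviour can be sketched like this. It is a Python illustration only; the function name, the input shape and the hint wording are made up for the example.

```python
def filter_cards(scored_cards, min_score: float = 0.45):
    """scored_cards: [(card, cosine_score), ...] -> (hits above threshold, optional hint)."""
    hits = sorted(
        (pair for pair in scored_cards if pair[1] >= min_score),
        key=lambda pair: pair[1],
        reverse=True,
    )
    hint = None
    if not hits:
        # Hypothetical wording; the point is that an empty result is made explicit.
        hint = f"low confidence: nothing scored >= {min_score}; rephrase or lower min_score"
    return hits, hint
```

Returning the hint alongside an empty list gives the agent something concrete to act on instead of a silent miss.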

Initialise the arch layer on day one. Three purpose-built MCP workflow prompts make the boring scaffolding work disappear:

  • init_arch_memory: the /init of architectural memory. Walks the repo, identifies 8-20 components by domain boundary, writes them, and captures any obvious decisions inferable from comments or README. Run it once per group, right after installing.

  • audit_architecture: sweeps the memory of one group, flags cards older than 90 days, verifies anchors against the live code, and surfaces a punch list of updates / supersedes for your approval.

  • record_lesson_from_correction: converts a user correction into a structured lesson card (rule / why / when) without overrecording typos.

MCP resources for live introspection:

  • arch://schema: the full card-schema reference (fields, similarity-gate thresholds, write semantics). Cite it from the agent when explaining the model.

  • arch://stats/{group}: live counts (total / by kind / by status) and the oldest/newest updatedAt per group. The same numbers are also pushed to Prometheus.

Observability built in. When PAPARATS_METRICS=true, every read/write hits a counter and the cosine score of returned cards lands in a histogram:

  • paparats_arch_context_calls_total{group}: calls per group

  • paparats_arch_write_total{kind, status}: writes by card kind and gate outcome

  • paparats_arch_search_score: histogram of cosine scores in arch_context results (post min_score)

  • paparats_arch_collection_size{group, kind, status}: gauge updated whenever arch://stats/{group} is read

These let you spot a memory that's not being written to, a similarity gate that's too aggressive, or a sparse group where every query returns low-confidence hits.


Use Cases

For Developers (Coding)

Connect via the coding endpoint (/mcp):

| Use Case | How |
| --- | --- |
| Navigate unfamiliar code | search_code "authentication middleware" → exact locations |
| Find similar patterns | search_code "retry with exponential backoff" → examples |
| Trace dependencies | find_usages {chunk_id, direction: "incoming"} → callers via the graph |
| Explore context | get_chunk <chunk_id> --radius_lines 50 → expand around |
| Manage projects | list_projects and delete_project for index hygiene |

For Support Teams

Connect via the support endpoint (/support/mcp):

| Use Case | How |
| --- | --- |
| Explain a feature | explain_feature "rate limiting" → code locations + changes |
| Recent changes | recent_changes "auth" --since 2024-01-01 → timeline with tickets |
| Trace usages | find_usages {chunk_id} → who calls/references this chunk |
| Change history | get_chunk_meta <chunk_id> → authors, dates, linked tickets |
| Blast radius | impact_analysis <chunk_id> → cross-chunk + cross-project impact |
| Architectural Q&A | arch_context "why X" → components / decisions / lessons (with age) |
| Capture decisions & lessons | arch_record_decision / arch_record_lesson: agent writes as it learns, server-side dedup |

Support chatbot example:

User: "How do I configure rate limiting?"

Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references

Configuration

Paparats uses two config files. Both are optional; defaults work for the common case.

~/.paparats/projects.yml: global project list

Lives outside your repos. Edited by paparats add / paparats remove or by hand via paparats edit projects. Every entry has either path: (local bind-mount) or url: (remote git, cloned by the indexer), never both.

defaults:
  cron: '0 */6 * * *' # global indexer schedule
  group: workspace # default group when an entry doesn't specify one

repos:
  - path: /Users/alice/code/billing # local bind-mount
    group: dev
    language: typescript

  - url: org/widgets # remote git, cloned by the indexer
    group: prod
    language: ruby

  - url: git@github.com:acme/billing.git
    name: billing # override the auto-derived name
    group: prod

The indexer hot-reloads this file. Adding/removing local-path entries causes the CLI to restart the stack so Docker picks up the new bind-mount; metadata-only edits reindex in place.
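The "either path: or url:, never both" rule for repo entries can be checked with a few lines of Python. This is a hypothetical validator written for illustration, not part of the Paparats CLI.

```python
def entry_kind(entry: dict) -> str:
    """Return "local" or "remote" for a repos entry; reject entries with both or neither."""
    has_path = "path" in entry
    has_url = "url" in entry
    if has_path == has_url:
        raise ValueError("each repo entry needs exactly one of 'path' or 'url'")
    return "local" if has_path else "remote"
```

The distinction matters operationally: per the text above, only local entries involve a bind-mount and therefore a stack restart when added or removed.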

.paparats.yml in your repo: per-project overrides

Drop one at the project root to override anything from the global file.

group: my-app
language: typescript

# Indexing tuning (all optional)
indexing:
  paths: [src, packages] # restrict to these subdirectories
  exclude: [node_modules, dist, '**/*.test.ts']
  exclude_extra: ['**/__fixtures__/**'] # added on top of language defaults
  chunkSize: 1500 # characters per chunk (default: 1200)
  overlap: 100 # chunk overlap (default: 100)
  concurrency: 4 # parallel embedding requests
  batchSize: 8 # embeddings per Ollama call

# Metadata
metadata:
  service: billing
  bounded_context: payments
  tags: [backend, critical]
  directory_tags:
    src/api: [public-api]
    src/internal: [internal]

  # Git history per chunk (Jira / GitHub ticket extraction included)
  git:
    enabled: true
    maxCommitsPerFile: 50
    ticketPatterns:
      - '\b([A-Z]+-\d+)\b' # Jira-style PROJ-123
      - '#(\d+)' # GitHub-style #123

In-repo .paparats.yml always wins over projects.yml. The CLI never overwrites it.
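One way to picture the precedence is a layered merge, sketched here in Python. The function is hypothetical, and the shallow top-level merge is an assumption; how nested sections such as indexing or metadata are combined is not specified here.

```python
def effective_config(defaults: dict, entry: dict, in_repo: dict) -> dict:
    """Later layers win: global defaults < projects.yml entry < in-repo .paparats.yml."""
    merged = dict(defaults)
    merged.update(entry)
    merged.update(in_repo)  # the in-repo file always has the last word
    return merged
```

For example, an entry with group: dev in projects.yml and group: my-app in the repo's .paparats.yml resolves to my-app.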

Groups

A group is a Qdrant collection (paparats_<group>). Multiple projects can share a group to enable cross-project search; each project is identified by a project: field in the chunk payload. By default, a project's group is its own name (one project, one collection). Set the same group: on multiple entries to consolidate them.

Git history per chunk

When metadata.git.enabled: true (default), the indexer maps each chunk to the commits that touched its line range using diff-hunk overlap. Tickets are extracted from commit messages using metadata.git.ticketPatterns (built-in: Jira PROJ-123, GitHub #42, cross-repo org/repo#99). Surfaced through MCP tools get_chunk_meta, search_changes, recent_changes, explain_feature. Non-fatal: non-git projects index normally.
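Ticket extraction and the line-overlap test can be sketched in Python as follows. The two patterns are the ones from the sample .paparats.yml above; the function names are hypothetical and the cross-repo org/repo#99 form is not covered.

```python
import re

# Patterns taken from the sample config: Jira-style PROJ-123 and GitHub-style #123.
TICKET_PATTERNS = [r"\b([A-Z]+-\d+)\b", r"#(\d+)"]


def extract_tickets(message: str, patterns=TICKET_PATTERNS) -> list:
    """Collect ticket references from a commit message, in pattern order, without repeats."""
    found = []
    for pattern in patterns:
        for match in re.finditer(pattern, message):
            ticket = match.group(1)
            if ticket not in found:
                found.append(ticket)
    return found


def overlaps(chunk_start: int, chunk_end: int, hunk_start: int, hunk_end: int) -> bool:
    """True when a diff hunk touches at least one line of the chunk (inclusive ranges)."""
    return chunk_start <= hunk_end and hunk_start <= chunk_end
```

A commit would then be attached to a chunk whenever any of its hunks overlaps the chunk's line range, and its message scanned with extract_tickets.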


MCP Tools Reference

Paparats serves the Model Context Protocol on two separate endpoints, each with its own tool set and system instructions.

Coding endpoint (/mcp)

For developers using Claude Code, Cursor, etc. Focus: search code, read chunks, follow the cross-chunk symbol graph, manage projects.

| Tool | Description |
| --- | --- |
| search_code | Semantic search across indexed projects. Returns chunks with symbol info and confidence scores. |
| get_chunk | Retrieve a chunk by ID with optional surrounding context. |
| find_usages | Walk the symbol graph from a chunk_id: incoming (callers/references in), outgoing (calls/references out), or both. |
| list_projects | List indexed projects with chunk counts and detected languages. |
| delete_project | Wipe Qdrant chunks + SQLite metadata for a project (CLI's paparats remove calls it). |
| health_check | Indexing status, chunks per group, running jobs. |
| arch_context | Read-only architectural memory. Returns components, decisions, and lessons relevant to the query with updated N ago stamps and a min_score cutoff. |

Support endpoint (/support/mcp)

For support teams and bots without direct code access. Focus: feature explanations, change history, cost reporting, all in plain language.

| Tool | Description |
| --- | --- |
| search_code | Same as coding endpoint. |
| get_chunk | Same. |
| find_usages | Same. |
| list_projects | Same. |
| health_check | Same. |
| get_chunk_meta | Git history and ticket references for a chunk: commits, authors, dates. No code. |
| search_changes | Semantic search filtered by last-commit date. Each result shows when it last changed. |
| explain_feature | Comprehensive feature analysis: locations + recent changes for a question. |
| recent_changes | Timeline grouped by date with commits, tickets, affected files. since filter. |
| impact_analysis | Cross-chunk impact for a chunk_id: symbol graph traversal + cross-project blast radius. |
| arch_context | Read the architectural memory for a group: top-matching components, decisions and lessons, each stamped with "updated N ago" and a cosine score. Accepts a min_score parameter (default 0.45) to gate low-confidence hits. Call before any architectural answer. Also available on /mcp. |
| arch_record_component | Record a component with Does / Owns / Does not / Touched when fields. Idempotent by name. |
| arch_record_decision | Record an ADR-style decision (context / decision / alternatives_rejected / consequences). Server-side similarity gate refuses duplicates and surfaces near-matches; supersedes links replace prior decisions. |
| arch_record_lesson | Record a lesson as rule / why / when. Duplicates bump updatedAt (Reflexion confirmation) instead of overwriting. |
| token_savings_report | Aggregate token-savings stats (naive baseline vs search-only vs actually consumed). |
| top_queries | Most frequent queries by user/session/project anchor. |
| slowest_searches | Top-N slowest searches with timing + chunk counts. |
| cross_project_share | Off-anchor result share per user, an indicator of search noise. |
| retry_rate | Tool-call retry rate per user, an indicator of unhelpful results. |
| failed_chunks | AST parse failures, regex fallbacks, zero-chunk files, binary skips. |

Typical workflows

Drill-down (coding agent):

1. search_code "authentication middleware"           β†’ relevant chunks with symbols
2. get_chunk <chunk_id> --radius_lines 50            β†’ expand context around a hit
3. find_usages {chunk_id, direction: "incoming"}     β†’ who calls / references this chunk

Single-call (support agent):

1. explain_feature "How does authentication work?"   β†’ locations + recent changes
2. recent_changes "auth" --since 2024-01-01          β†’ timeline with tickets
3. token_savings_report                              β†’ cost report for the last 7 days

Architectural memory (support agent):

1. arch_context "why do we use bge-m3 for the arch layer?"
                                                     → top components / decisions / lessons,
                                                       each with an "updated N ago" stamp
2. arch_record_decision { title, context, decision, alternatives_rejected, consequences }
                                                     → status=created | duplicate | similar
                                                       (gate refuses duplicates server-side)
3. arch_record_lesson   { rule, why, when }          → status=created | updated (Reflexion bump)
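
To make the gate's three outcomes concrete, here is a minimal sketch of a cosine-similarity gate. It is not the server's code: the function names are hypothetical, and mapping the 0.85 / 0.70 thresholds listed for arch/store.ts to `duplicate` and `similar` respectively is an assumption.

```typescript
type GateStatus = "created" | "duplicate" | "similar";

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / Math.sqrt(normA * normB);
}

// Compare a candidate embedding with existing ones and pick a status.
// Threshold semantics are assumed: >= 0.85 duplicate, >= 0.70 similar.
function gateStatus(
  candidate: number[],
  existing: number[][],
  duplicateAt = 0.85,
  similarAt = 0.7,
): GateStatus {
  let best = -1;
  for (const vector of existing) {
    best = Math.max(best, cosine(candidate, vector));
  }
  if (best >= duplicateAt) return "duplicate";
  if (best >= similarAt) return "similar";
  return "created";
}
```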

Connecting MCP

paparats install already wires Cursor (~/.cursor/mcp.json) and Claude Code (~/.claude/mcp.json) to http://localhost:9876/mcp. The sections below are for manual setup or for adding the support endpoint alongside the default coding one.

Cursor

Create or edit ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):

{
  "mcpServers": {
    "paparats": {
      "type": "http",
      "url": "http://localhost:9876/mcp"
    }
  }
}

For support use case (feature explanations, change history, impact analysis):

{
  "mcpServers": {
    "paparats-support": {
      "type": "http",
      "url": "http://localhost:9876/support/mcp"
    }
  }
}

Restart Cursor after changing config.
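
If you generate the file from a script, a small builder keeps both entries consistent. This is a sketch under assumptions: `buildMcpConfig` is a hypothetical helper, and registering both endpoints in one file is inferred from the two snippets in this section rather than documented behaviour.

```typescript
interface McpServerEntry {
  type: "http";
  url: string;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build an mcp.json object; baseUrl defaults to the local server used in this README.
function buildMcpConfig(includeSupport: boolean, baseUrl = "http://localhost:9876"): McpConfig {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const mcpServers: Record<string, McpServerEntry> = {
    paparats: { type: "http", url: `${root}/mcp` },
  };
  if (includeSupport) {
    mcpServers["paparats-support"] = { type: "http", url: `${root}/support/mcp` };
  }
  return { mcpServers };
}

// Pretty-printed JSON, ready to write to ~/.cursor/mcp.json or .cursor/mcp.json.
const cursorConfigJson = JSON.stringify(buildMcpConfig(true), null, 2);
```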

Claude Code

# Coding endpoint (default)
claude mcp add --transport http paparats http://localhost:9876/mcp

# Support endpoint (for support bots/agents)
claude mcp add --transport http paparats-support http://localhost:9876/support/mcp

Or add to .mcp.json in project root:

{
  "mcpServers": {
    "paparats": {
      "type": "http",
      "url": "http://localhost:9876/mcp"
    }
  }
}

Verify

  • paparats status: check that the stack is up

  • Coding endpoint (/mcp): search_code, get_chunk, find_usages, list_projects, delete_project, health_check

  • Support endpoint (/support/mcp): search_code, get_chunk, find_usages, health_check, list_projects, plus the support-specific tools get_chunk_meta, search_changes, explain_feature, recent_changes, impact_analysis, and the analytics tools listed in Observability (token_savings_report, top_queries, slowest_searches, cross_project_share, retry_rate, failed_chunks)

  • Ask the AI: "Search this workspace for the auth middleware"


CLI Commands

paparats install [flags]                Bootstrap or reconfigure the global stack.
paparats add <path-or-repo> [flags]     Add a project (local path or git URL/shorthand).
paparats list [--json] [--group g]      Show indexed projects with status from the indexer.
paparats remove <name> [--yes]          Remove a project β€” deletes Qdrant + SQLite data.

paparats start [--logs]                 Start the Docker stack (`--logs` also follows the logs).
paparats stop                           Stop the stack (preserves data volumes).
paparats restart                        Recreate containers (applies new compose changes).
paparats edit compose|projects          Open the file in $EDITOR; on save, validate +
                                          regenerate compose + restart + reindex (projects).

paparats search <query> [flags]         Semantic search from the terminal.
paparats status                         Stack health: Docker, Ollama, server, indexer.
paparats groups [--json]                List groups and their projects.
paparats doctor                         Diagnostic checks (Docker, Ollama, ports, configs).
paparats update                         Update CLI from npm + pull latest Docker images.

The legacy per-project commands (paparats init, paparats index, paparats watch) are gone: adding a project is now paparats add, indexing is automatic in the indexer container, and watching is handled by the chokidar watcher inside the indexer.

Common flags

paparats install

  • --ollama-mode <native|docker>: force Ollama mode (default: native on macOS, docker on Linux)

  • --ollama-url <url>: external Ollama; skips both native and docker Ollama

  • --qdrant-url <url>: external Qdrant; skips the Qdrant container

  • --qdrant-api-key <key>: for authenticated Qdrant (e.g. Qdrant Cloud); written to ~/.paparats/.env

  • --mode support: wire MCP clients only, no Docker stack

  • --server <url>: server URL for support mode (default: http://localhost:9876)

  • --force: skip overwrite/migration prompts

  • --non-interactive: fail on any prompt instead of asking

  • -v, --verbose: stream Docker output

paparats add <path-or-repo>

  • --name <name>: override the auto-derived project name (basename of path / repo)

  • --group <group>: override group (default: project name)

  • --language <lang>: override language (default: auto-detect)

  • --no-restart: skip the Docker restart for local-path adds (useful in scripts)

  • --no-reindex: skip the per-project reindex trigger

  • --force: drop the project's existing chunks before reindexing (destructive, use after schema/config changes)

paparats remove <name>

  • --yes: skip the confirmation prompt

paparats search <query>

  • -n, --limit <n>: max results (default: 5)

  • -p, --project <name>: filter by project

  • -g, --group <name>: restrict to a group

  • --json: machine-readable output

Environment overrides

| Var | Default | What |
| --- | --- | --- |
| PAPARATS_SERVER_URL | http://localhost:9876 | MCP server base URL (used by CLI commands) |
| PAPARATS_INDEXER_URL | http://localhost:9877 | Indexer base URL (add, list, edit) |
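
A script that talks to the same stack can resolve the two URLs the same way. The following is a minimal sketch, not the CLI's own code: `resolveEndpoints` is a hypothetical helper, and treating an empty value as unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

interface Endpoints {
  serverUrl: string;
  indexerUrl: string;
}

// Resolve both base URLs, falling back to the documented defaults.
// Assumption: an empty string counts as "not set".
function resolveEndpoints(env: Env): Endpoints {
  return {
    serverUrl: env["PAPARATS_SERVER_URL"] || "http://localhost:9876",
    indexerUrl: env["PAPARATS_INDEXER_URL"] || "http://localhost:9877",
  };
}
```

In Node you would call `resolveEndpoints(process.env)`; passing the environment in as a parameter keeps the function easy to test.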


Monitoring

Paparats exposes Prometheus metrics for operational visibility. Opt in by setting PAPARATS_METRICS=true in the server's environment:

# In ~/.paparats/docker-compose.yml, under paparats service:
environment:
  PAPARATS_METRICS: 'true'

Metrics endpoint

curl http://localhost:9876/metrics

Key metrics

| Metric | Type | Description |
| --- | --- | --- |
| paparats_search_total | Counter | Search requests by group and method |
| paparats_search_duration_seconds | Histogram | Search latency |
| paparats_index_files_total | Counter | Files indexed |
| paparats_index_chunks_total | Counter | Chunks indexed |
| paparats_query_cache_hit_rate | Gauge | Query result cache hit rate |
| paparats_embedding_cache_hit_rate | Gauge | Embedding cache hit rate |
| paparats_watcher_events_total | Counter | File watcher events |

Prometheus scrape config

scrape_configs:
  - job_name: paparats
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9876']

Query cache

Search results are cached in-memory (LRU, default 1000 entries, 5-minute TTL). The cache is automatically invalidated when files change. Configure via environment variables:

  • QUERY_CACHE_MAX_ENTRIES: max cached queries (default: 1000)

  • QUERY_CACHE_TTL_MS: TTL in milliseconds (default: 300000)

Cache stats are included in GET /api/stats under the queryCache field.
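
For readers unfamiliar with the pattern, here is a minimal LRU cache with a TTL, using the two documented defaults. It illustrates the technique only and is not the code in query-cache.ts; the class name and the injectable clock are assumptions made for the example.

```typescript
// Minimal LRU cache with per-entry TTL. A Map iterates in insertion order,
// so the first key is always the least recently used one.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;
  private now: () => number;

  // Defaults mirror QUERY_CACHE_MAX_ENTRIES and QUERY_CACHE_TTL_MS.
  constructor(maxEntries = 1000, ttlMs = 300000, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for tests
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entry
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in the Map).
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  // Drop everything, e.g. when a file change invalidates cached results.
  clear(): void {
    this.entries.clear();
  }
}
```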


Analytics & Observability

Paparats ships with three observability layers that work together:

  1. Prometheus (PAPARATS_METRICS=true, see above): scrape /metrics.

  2. Local SQLite analytics store at ~/.paparats/analytics.db (default ON): raw search/tool/indexing events. Six MCP tools query it directly: token_savings_report, top_queries, cross_project_share, retry_rate, slowest_searches, failed_chunks.

  3. OpenTelemetry (PAPARATS_OTEL_ENABLED=true + OTEL_EXPORTER_OTLP_ENDPOINT): spans for every search, MCP tool call, embedding, indexing run, and chunking error. Works with Tempo, Jaeger, Honeycomb, Datadog, Grafana Cloud, or anything else that speaks OTLP/HTTP.

Operator console (/ui)

Open http://localhost:9876/ui for a single-screen dashboard (see screenshot at top of README) that visualises the analytics store above: ROI, top / slowest queries, cross-project usage, per-user activity, indexer status, embedding p95/p99, and recent failures. Polls every 5 s, no extra services to run.

  • Protect it (optional): PAPARATS_UI_BASIC_AUTH=user:pass applies to /ui and /api/analytics only; /mcp and /api/search stay open so agents keep working.

  • Show the screenshot view to anyone without touching real data: PAPARATS_UI_DEMO=true (or append ?demo=1 to the URL once).

Pre-built Grafana dashboard

The built-in /ui covers the current snapshot. For history (latency p99 over weeks, GC trends, CPU under indexing bursts), wire /metrics to Prometheus and import docs/grafana/paparats.json: 15 panels across four rows (Traffic & latency, Embeddings, Indexing, Process health).

# 1. Enable the Prometheus surface on the server: set PAPARATS_METRICS: 'true'
#    under the paparats service in ~/.paparats/docker-compose.yml, then recreate
#    the containers.
paparats restart

# 2. Point your Prometheus at http://<server>:9876/metrics.

# 3. In Grafana: Dashboards → Import → upload docs/grafana/paparats.json
#    → pick your Prometheus datasource → Import.

The dashboard uses a ${DS_PROMETHEUS} variable, so it works with any Prometheus instance (local, Grafana Cloud, Mimir, VictoriaMetrics).

Sending traces to Elastic APM (or any OTLP backend)

Elastic APM Server has accepted OpenTelemetry natively since 7.14, with no agent install and no SDK injection. Set four env vars on the paparats container and restart:

PAPARATS_OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-apm-server:8200
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <apm-secret-token>
OTEL_SERVICE_NAME=paparats-mcp

Within a minute a new service paparats-mcp appears in APM → Services. The same env vars work for Tempo, Jaeger, Honeycomb, Datadog, and Grafana Cloud Traces; only the endpoint and auth header change.

What gets recorded: one span per event, with paparats-specific attributes for filtering and grouping.

| Span name | Key attributes | When |
| --- | --- | --- |
| paparats.search | tool, group, query.hash, query.length, search.duration_ms, result_count, cache_hit | every search_code / find_usages |
| paparats.get_chunk | chunk_id, fetch.radius_lines, fetch.duration_ms, fetch.found | every get_chunk call |
| paparats.mcp.tool | tool, tool.duration_ms, tool.ok | every MCP tool invocation |
| paparats.embedding | kind, batch_size, cache_hits, cache_miss, duration_ms, timeout | every embedding request |
| paparats.indexing.run | group, project, trigger, status, files_total, chunks_total, errors_total | every indexer cycle |
| paparats.indexing.chunking_error | group, project, file, language, error_class | per-file chunking failure |

Every span also carries paparats.user, paparats.session, paparats.client, paparats.request_id, and (when present) paparats.anchor_project from the identity headers (see Identity attribution), so you can filter APM by user or correlate spans across a single MCP session.

What this is good for in Elastic APM:

  • Errors view: chunking and embedding failures with stacktrace + file/language/error_class context, aggregated by error class.

  • Transactions: paparats.search becomes a transaction type. Sort by p95/p99/error rate to find the slow workloads. Filter by paparats.tool=search_code or paparats.group=… to slice by repo.

  • Custom queries / metrics: every paparats attribute is indexed. Build APM queries like paparats.embedding.cache_miss:true AND duration_ms>500 to find slow cache-miss embeddings, or aggregate paparats.search.result_count per paparats.group.

  • Log correlation: if you ship paparats stdout to Elastic via Filebeat, the trace.id field links a log line back to its span.

What this is not (honest caveats):

  • Spans are flat (one event = one span), not parented. Service Map will show paparats-mcp as an isolated node; you won't see a "search → embedding → Qdrant" waterfall. Use the per-span duration_ms attributes for stage timing instead.

  • Outbound HTTP to Qdrant / Ollama is not auto-instrumented. To see those as separate dependencies in APM you'd need to enable @opentelemetry/instrumentation-http (planned, not shipped). For now, embedding and search latency live on the existing spans.

  • Per-request token-savings, top queries, and cross-project usage stay in the local SQLite store because they're aggregations, not events. View them in the built-in /ui console, not in APM.

For pure metrics (CPU, GC, RSS, request rates) Elastic Metricbeat or our Prometheus exporter (above) is a better fit than APM.

Identity attribution

Clients (IDE plugins, CLI) can set X-Paparats-User, X-Paparats-Session, X-Paparats-Client, and X-Paparats-Anchor-Project headers. The header name for user is configurable via PAPARATS_IDENTITY_HEADER (default X-Paparats-User). If the header is missing, events are attributed to anonymous. There is no cryptographic verification: this is for attribution, not access control.

GET /api/stats echoes the resolved identity, useful for verifying header propagation:

curl -H 'X-Paparats-User: alice' http://localhost:9876/api/stats | jq .identity
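
A client-side helper for attaching these headers, plus the fallback rule, might look like the sketch below. The function names are hypothetical, and treating a blank user value as missing is an assumption; only the header names, the configurable user header, and the anonymous fallback come from this section.

```typescript
interface Identity {
  user?: string;
  session?: string;
  client?: string;
  anchorProject?: string;
}

// Build the identity headers a client could attach to each request.
function identityHeaders(id: Identity, userHeader = "X-Paparats-User"): Record<string, string> {
  const headers: Record<string, string> = {};
  if (id.user) headers[userHeader] = id.user;
  if (id.session) headers["X-Paparats-Session"] = id.session;
  if (id.client) headers["X-Paparats-Client"] = id.client;
  if (id.anchorProject) headers["X-Paparats-Anchor-Project"] = id.anchorProject;
  return headers;
}

// Resolve the user the way the text describes: header names are matched
// case-insensitively, and a missing (or blank, by assumption) value
// falls back to "anonymous".
function resolveUser(headers: Record<string, string>, userHeader = "X-Paparats-User"): string {
  const wanted = userHeader.toLowerCase();
  for (const name of Object.keys(headers)) {
    const value = headers[name].trim();
    if (name.toLowerCase() === wanted && value !== "") return value;
  }
  return "anonymous";
}
```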

Token-savings estimators

Three levels, computed from raw events at query-time:

  • Naive baseline: what a model would have read if it pulled the whole file for each result.

  • Search-only: tokens actually returned by search_code.

  • Actually consumed: tokens that the client subsequently fetched via get_chunk. The most honest signal, since it discounts noisy results that were never used.

Run token_savings_report from any MCP client connected to /support/mcp.

Cross-project noise

When a client passes X-Paparats-Anchor-Project (or specifies a single project in the search call), the share of results from other projects in the same group is recorded. Use cross_project_share to see how noisy your group's index is for each user.
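
The recorded share is a plain ratio. A minimal sketch, assuming each result carries the name of the project it came from (the function name and input shape are hypothetical):

```typescript
// Share of results that came from projects other than the anchor project.
// Returns a value between 0 (all results on-anchor) and 1 (all off-anchor).
function crossProjectShare(resultProjects: string[], anchorProject: string): number {
  if (resultProjects.length === 0) return 0; // no results, no noise to report
  const offAnchor = resultProjects.filter((project) => project !== anchorProject).length;
  return offAnchor / resultProjects.length;
}
```

For example, four results of which two come from the anchor project give a share of 0.5.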

Indexer-pipeline visibility

failed_chunks aggregates AST parse failures, regex fallbacks, zero-chunk files, and binary skips. slowest_searches ranks individual searches by latency.

Configuration matrix

| Env var | Default | Purpose |
| --- | --- | --- |
| PAPARATS_METRICS | false | Prometheus surface (existing, unchanged) |
| PAPARATS_ANALYTICS_ENABLED | true | Local SQLite analytics writes |
| PAPARATS_ANALYTICS_DB_PATH | ~/.paparats/analytics.db | Analytics DB file |
| PAPARATS_ANALYTICS_RETENTION_DAYS | 90 | Daily prune cutoff |
| PAPARATS_ANALYTICS_RETENTION_RUN_HOUR | 3 | Hour-of-day for prune (local time) |
| PAPARATS_IDENTITY_HEADER | X-Paparats-User | Header name for user attribution |
| PAPARATS_LOG_RESULT_FILES | true | If false, store NULL for search_results.file |
| PAPARATS_LOG_QUERY_TEXT | true | If false, store NULL for search_events.query_text |
| PAPARATS_REFORMULATION_WINDOW_MS | 90000 | Reformulation detection window |
| PAPARATS_TELEMETRY_SAMPLE_RATE | 1.0 | Sampling rate (errors are always kept) |
| PAPARATS_OTEL_ENABLED | false | Enable OTel SDK + OTLP exporter |
| OTEL_EXPORTER_OTLP_ENDPOINT | unset | OTLP HTTP endpoint (e.g. http://localhost:4318/v1/traces) |
| OTEL_EXPORTER_OTLP_HEADERS | unset | OTLP auth headers (key=value,key2=value2) |
| OTEL_SERVICE_NAME | paparats-mcp | OTel resource attribute |
| OTEL_RESOURCE_ATTRIBUTES | unset | Extra resource attrs (key=value,key2=value2) |
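
The key=value,key2=value2 format used by OTEL_EXPORTER_OTLP_HEADERS and OTEL_RESOURCE_ATTRIBUTES can be checked before deploying with a small parser. This is a simplified sketch with a hypothetical function name; it does not handle percent-encoding and is not how the OpenTelemetry SDK itself parses these values.

```typescript
// Parse "key=value,key2=value2" into an object. Only the first "=" splits a
// pair, so a value such as "Bearer abc=" keeps its own "=" characters.
function parseKeyValueList(raw: string | undefined): Record<string, string> {
  const out: Record<string, string> = {};
  if (!raw) return out;
  for (const pair of raw.split(",")) {
    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // skip empty or malformed pairs
    const key = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    if (key) out[key] = value;
  }
  return out;
}
```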

PII guidance

  • File paths and query text are stored locally by default. For shared deployments where paths could leak sensitive info, set PAPARATS_LOG_RESULT_FILES=false and/or PAPARATS_LOG_QUERY_TEXT=false.

  • OTel spans never carry full query text by default, only paparats.query.hash and length.


Architecture

paparats-mcp/
├── packages/
│   ├── server/          # MCP server (Docker image: ibaz/paparats-server)
│   │   ├── src/
│   │   │   ├── lib.ts                # Public library exports (for programmatic use)
│   │   │   ├── index.ts              # HTTP server bootstrap + graceful shutdown
│   │   │   ├── app.ts                # Express app + HTTP API routes
│   │   │   ├── indexer.ts            # Group-aware indexing, single-parse chunkFile()
│   │   │   ├── searcher.ts           # Search with query expansion, cache, metrics
│   │   │   ├── query-expansion.ts    # Abbreviation, case, plural expansion
│   │   │   ├── task-prefixes.ts      # Jina task prefix detection
│   │   │   ├── query-cache.ts        # In-memory LRU search result cache
│   │   │   ├── metrics.ts            # Prometheus metrics (opt-in)
│   │   │   ├── ast-chunker.ts        # AST-based code chunking (tree-sitter, primary strategy)
│   │   │   ├── chunker.ts            # Regex-based code chunking (fallback for unsupported languages)
│   │   │   ├── ast-symbol-extractor.ts # AST-based symbol extraction (module-level only, 11 languages)
│   │   │   ├── ast-queries.ts        # Tree-sitter S-expression queries per language
│   │   │   ├── tree-sitter-parser.ts # WASM tree-sitter manager
│   │   │   ├── symbol-graph.ts       # Cross-chunk symbol edges (calls/called_by/refs)
│   │   │   ├── embeddings.ts         # Ollama provider + SQLite cache
│   │   │   ├── config.ts             # .paparats.yml reader + validation
│   │   │   ├── metadata.ts           # Tag resolution + auto-detection
│   │   │   ├── metadata-db.ts        # SQLite store for git commits + tickets + symbol edges
│   │   │   ├── git-metadata.ts       # Git history extraction + chunk mapping
│   │   │   ├── ticket-extractor.ts   # Jira/GitHub/custom ticket parsing
│   │   │   ├── mcp-handler.ts        # MCP protocol, dual-mode (coding /mcp + support /support/mcp)
│   │   │   ├── watcher.ts            # File watcher (chokidar)
│   │   │   ├── arch/                 # Architectural memory layer (components, decisions, lessons)
│   │   │   │   ├── types.ts          # ArchComponent, ArchDecision, ArchLesson, ArchWriteResult
│   │   │   │   ├── collection.ts     # Per-group Qdrant collection (`paparats_<group>_arch`) lifecycle
│   │   │   │   ├── text-embeddings.ts # bge-m3 text embedder (1024d, mean-pooled, Ollama)
│   │   │   │   ├── store.ts          # CRUD + server-side similarity gate (cosine 0.85 / 0.70)
│   │   │   │   └── context.ts        # `arch_context` query: top-N across kinds with age stamps
│   │   │   └── types.ts              # Shared types
│   │   └── Dockerfile
│   ├── indexer/         # Automated repo indexer (Docker image: ibaz/paparats-indexer)
│   │   ├── src/
│   │   │   ├── index.ts              # Entry: Express mini-server + cron scheduler
│   │   │   ├── config-loader.ts      # projects.yml parser + per-repo overrides
│   │   │   ├── config-watcher.ts     # chokidar watcher for hot-reloading the project list
│   │   │   ├── repo-manager.ts       # parseReposEnv(), cloneOrPull() using simple-git
│   │   │   ├── scheduler.ts          # node-cron wrapper
│   │   │   └── types.ts              # IndexerConfig, RepoConfig, RepoOverrides, IndexerFileConfig
│   │   └── Dockerfile
│   ├── ollama/          # Custom Ollama with pre-baked model (Docker image: ibaz/paparats-ollama)
│   │   └── Dockerfile
│   ├── cli/             # CLI tool (npm package: @paparats/cli)
│   │   └── src/
│   │       ├── index.ts                    # Commander entry
│   │       ├── docker-compose-generator.ts # Programmatic YAML generation
│   │       ├── projects-yml.ts             # projects.yml + install.json read/write
│   │       └── commands/                   # install, projects (add/remove/list), lifecycle, edit, etc.
│   └── shared/          # Shared utilities (npm package: @paparats/shared)
│       └── src/
│           ├── path-validation.ts    # Path validation
│           ├── gitignore.ts          # Gitignore parsing
│           ├── exclude-patterns.ts   # Glob exclude normalization
│           └── language-excludes.ts  # Language-specific exclude defaults
└── examples/
    └── paparats.yml.*   # Config examples per language

Stack

  • Qdrant: vector database (1 collection per group with paparats_ prefix for code, plus a separate paparats_<group>_arch collection per group for the architectural memory layer; cosine similarity, payload filtering)

  • Ollama: local embeddings via Jina Code Embeddings 1.5B for code (task-specific prefixes) and bge-m3 for the architectural memory layer (1024d, mean-pooled, multilingual)

  • SQLite: embedding cache (~/.paparats/cache/embeddings.db) + git metadata + symbol edges store (~/.paparats/metadata.db)

  • MCP: Model Context Protocol (SSE for Cursor, Streamable HTTP for Claude Code). Dual endpoints: /mcp (coding) and /support/mcp (support)

  • TypeScript monorepo with Yarn workspaces


Integration Examples

Support Chatbot

Use paparats as the knowledge backend for a product support bot. Connect the bot to the support endpoint (/support/mcp) for access to explain_feature, recent_changes, find_usages, and other support-oriented tools:

User: "How do I configure rate limiting?"

Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references

CI/CD reindex on push

Indexing lives in the indexer container. To force a reindex of a project from CI, trigger the indexer's HTTP endpoint:

name: Reindex Paparats
on:
  push:
    branches: [main]

jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -X POST http://your-paparats-host:9877/trigger \
            -H 'Content-Type: application/json' \
            -d '{"repos": ["your-org/your-repo"]}'

Pass "force": true in the body to drop existing chunks first (destructive; use after schema/config changes). If the project isn't yet in projects.yml, add it once during your initial setup and the indexer's cron + hot-reload will keep it in sync going forward.
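
If the trigger is sent from a Node script rather than curl, the request can be assembled as in the sketch below. `buildTriggerRequest` is a hypothetical helper; the /trigger path and the `repos` and `force` body fields are taken from this section, and the response format is not assumed.

```typescript
interface TriggerBody {
  repos: string[];
  force?: boolean;
}

// Build the URL and fetch options for the indexer's /trigger endpoint.
// force is only included when true, since it drops existing chunks first.
function buildTriggerRequest(indexerUrl: string, repos: string[], force = false) {
  const body: TriggerBody = force ? { repos, force: true } : { repos };
  return {
    url: `${indexerUrl.replace(/\/+$/, "")}/trigger`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage sketch (Node 18+ ships a global fetch):
//   const { url, init } = buildTriggerRequest("http://your-paparats-host:9877", ["your-org/your-repo"]);
//   const response = await fetch(url, init);
```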

Code-review assistant

Combine multiple tools to analyze the impact of a pull request:

1. explain_feature("the feature being changed")
   → understand what the code does and how it connects
2. find_usages({chunk_id: "<changed chunk>", direction: "both"})
   → blast radius via the symbol graph
3. search_changes("related area", since="2024-01-01")
   → recent changes that might conflict or overlap

Embedding Model Setup

Paparats supports three embedding backends. Pick one: the choice is sticky per Qdrant collection (changing it requires reindexing; the server refuses to mix providers in one collection and surfaces a clear error).

| Provider | Model | Dims | Privacy | Speed (1k chunks) | Cost |
| --- | --- | --- | --- | --- | --- |
| Ollama | jina-code-embeddings 1.5B | 1536 | 100% local | ~10-20 min (CPU) | Free, ~1.7 GB on disk |
| OpenAI | text-embedding-3-small | 1536 | Sent to OpenAI | ~30 s | ~$0.02 / 1 M tokens |
| Voyage | voyage-code-3 | 1024 | Sent to Voyage | ~30 s | ~$0.18 / 1 M tokens |

Selection precedence: explicit EMBEDDING_PROVIDER → OPENAI_API_KEY present → VOYAGE_API_KEY present → Ollama. So setting just your API key in the environment is enough to switch.

# OpenAI: cheapest cloud option
export OPENAI_API_KEY=sk-...
docker compose up -d

# Voyage AI: best quality on code per recent benchmarks
export VOYAGE_API_KEY=pa-...
docker compose up -d

# Force a provider explicitly (overrides auto-detect)
export EMBEDDING_PROVIDER=voyage

Overrides: EMBEDDING_MODEL (defaults: text-embedding-3-small, voyage-code-3, jina-code-embeddings) and EMBEDDING_DIMENSIONS (1536 / 1024 / 1536). Voyage voyage-code-3 supports 256/512/1024/2048 via Matryoshka; set EMBEDDING_DIMENSIONS to opt into a non-default size.
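
The precedence and override rules can be read as the following sketch. It restates the documented order and defaults and is not the server's code: the function names are hypothetical, and falling back to auto-detection on an unrecognised EMBEDDING_PROVIDER value is an assumption (the real server may reject it instead).

```typescript
type Provider = "ollama" | "openai" | "voyage";
type EnvMap = Record<string, string | undefined>;

// Documented default model and dimensions per provider.
const PROVIDER_DEFAULTS: Record<Provider, { model: string; dimensions: number }> = {
  openai: { model: "text-embedding-3-small", dimensions: 1536 },
  voyage: { model: "voyage-code-3", dimensions: 1024 },
  ollama: { model: "jina-code-embeddings", dimensions: 1536 },
};

// Explicit EMBEDDING_PROVIDER wins, then OPENAI_API_KEY, then VOYAGE_API_KEY, then Ollama.
function selectProvider(env: EnvMap): Provider {
  const explicit = (env["EMBEDDING_PROVIDER"] || "").toLowerCase();
  if (explicit === "ollama" || explicit === "openai" || explicit === "voyage") {
    return explicit as Provider;
  }
  if (env["OPENAI_API_KEY"]) return "openai";
  if (env["VOYAGE_API_KEY"]) return "voyage";
  return "ollama";
}

// Apply the EMBEDDING_MODEL and EMBEDDING_DIMENSIONS overrides on top of the defaults.
function resolveEmbeddingConfig(env: EnvMap) {
  const provider = selectProvider(env);
  const dims = Number(env["EMBEDDING_DIMENSIONS"]);
  return {
    provider,
    model: env["EMBEDDING_MODEL"] || PROVIDER_DEFAULTS[provider].model,
    dimensions: Number.isInteger(dims) && dims > 0 ? dims : PROVIDER_DEFAULTS[provider].dimensions,
  };
}
```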

Local (Ollama): defaults below

Default: jinaai/jina-code-embeddings-1.5b-GGUF (code-optimized, 1.5B params, 1536 dims, 32k context). It is not in the Ollama registry, so we create a local alias.

Recommended: paparats install automates this:

  • Native mode (--ollama-mode native, default on macOS): downloads the GGUF (~1.65 GB) to ~/.paparats/models/, creates a Modelfile, and runs ollama create jina-code-embeddings

  • Docker mode (--ollama-mode docker, default on Linux): uses the ibaz/paparats-ollama image with the model pre-baked, so no setup is needed

Manual setup:

# 1. Download GGUF
curl -L -o jina-code-embeddings-1.5b-Q8_0.gguf \
  "https://huggingface.co/jinaai/jina-code-embeddings-1.5b-GGUF/resolve/main/jina-code-embeddings-1.5b-Q8_0.gguf"

# 2. Create Modelfile
cat > Modelfile <<'EOF'
FROM ./jina-code-embeddings-1.5b-Q8_0.gguf
PARAMETER num_ctx 8192
EOF

# 3. Register in Ollama
ollama create jina-code-embeddings -f Modelfile

# 4. Verify
ollama list | grep jina

| Spec | Value |
| --- | --- |
| Parameters | 1.5B |
| Dimensions | 1536 |
| Context | 32,768 tokens (recommended ≤ 8,192) |
| Quantization | Q8_0 (~1.6 GB) |
| Languages | 15+ programming languages |

Task-specific prefixes (nl2code, code2code, techqa) applied automatically.


Comparison with Alternatives

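Feature Matrix

Bracketed numbers refer to the notes below the tables.

Deployment

| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Open source | ✅ MIT | ✅ MIT | ✅ MIT | ❌ | ⚠️ Partial | ❌ | ⚠️ [1] |
| Fully local | ✅ | ✅ | ✅ | ⚠️ No [2] | ❌ | ❌ | ✅ |

Search Quality

| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Code embeddings | ✅ Jina [3] | ⚠️ [4] | ❌ [5] | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ✅ |
| Vector database | ✅ Qdrant | SQLite | ChromaDB | Proprietary | Proprietary | pgvector | Qdrant |
| AST chunking | ✅ | ❌ | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ✅ |
| Query expansion | ✅ [6] | ❌ | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ❌ |

Developer Experience

| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Real-time watching | ✅ Auto | ❌ | ❌ | ⚠️ CI/CD | ✅ | ⚠️ Partial | ⚠️ Partial |
| Embedding cache | ✅ SQLite | ⚠️ Partial | ❌ | ⚠️ Partial | ⚠️ Partial | ⚠️ Partial | ❌ |
| Multi-project | ✅ Groups | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
| One-cmd install | ✅ | ⚠️ Partial | ⚠️ Partial | ❌ | ❌ | ❌ | ❌ |

AI Integration

| Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MCP native | ✅ | ✅ | ❌ | ✅ | ❌ | ⚠️ API | ❌ |
| Symbol graph | ✅ | ❌ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ |
| Token metrics | ✅ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ | ❌ |
| Git history | ✅ | ❌ | ❌ | ❌ | ⚠️ Partial | ❌ | ❌ |
| Ticket extraction | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Architectural memory [7] | ✅ ADRs | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |

| Pricing | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Cost | ✅ Free | ✅ Free | ✅ Free | ❌ Paid | ❌ Paid | ❌ Paid | ⚠️ Archived |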

  1. Bloop archived January 2, 2025

  2. Augment Context Engine indexes locally but stores vectors in cloud

  3. Jina Code Embeddings 1.5B (1536 dims) with task-specific prefixes (nl2code, code2code, techqa)

  4. Vexify supports Ollama models but limited to specific embeddings (jina-embeddings-2-base-code, nomic-embed-text)

  5. SeaGOAT locked to all-MiniLM-L6-v2 (384 dims, general-purpose)

  6. Abbreviations, case variants, plurals, filler word removal

  7. Agent-maintained components / decisions (ADRs) / lessons in a second Qdrant collection per group; server-side similarity gate deduplicates writes, supersedes links replace stale decisions, every card carries an "updated N ago" stamp on read


Token Savings Metrics

What we measure (and what we don't)

Paparats provides estimated token savings to help you understand the order of magnitude of context reduction. These are heuristics, not precise measurements.

Per-search response

{
  "metrics": {
    "tokensReturned": 150,
    "estimatedFullFileTokens": 5000,
    "tokensSaved": 4850,
    "savingsPercent": 97
  }
}

| Field | Calculation | Reality Check |
| --- | --- | --- |
| tokensReturned | ceil(content.length / 4) | Based on actual returned content; /4 is a rough approximation |
| estimatedFullFileTokens | ceil(endLine * 50 / 4) | Heuristic: assumes 50 chars/line, never loads actual files |
| tokensSaved | estimated - returned | Derived: difference between two estimates |
| savingsPercent | (saved / estimated) * 100 | Relative: percentage of heuristic estimate |
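
Written out for a single result, the four calculations look like the sketch below. The function name is hypothetical, and rounding savingsPercent to a whole number is an assumption based on the integer in the sample response; how multiple results are combined is not covered here.

```typescript
interface SearchMetrics {
  tokensReturned: number;
  estimatedFullFileTokens: number;
  tokensSaved: number;
  savingsPercent: number;
}

// Apply the documented heuristics to one returned chunk.
// content: the text actually returned; endLine: last line number of the chunk.
function estimateSavings(content: string, endLine: number): SearchMetrics {
  const tokensReturned = Math.ceil(content.length / 4); // ~4 chars per token
  const estimatedFullFileTokens = Math.ceil((endLine * 50) / 4); // ~50 chars per line
  const tokensSaved = estimatedFullFileTokens - tokensReturned;
  const savingsPercent =
    estimatedFullFileTokens > 0
      ? Math.round((tokensSaved / estimatedFullFileTokens) * 100)
      : 0;
  return { tokensReturned, estimatedFullFileTokens, tokensSaved, savingsPercent };
}
```

A 600-character chunk ending at line 400 reproduces the sample response: 150 tokens returned, 5000 estimated, 4850 saved, 97 percent.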

Cumulative stats

curl -s http://localhost:9876/api/stats | jq '.usage'
{
  "searchCount": 47,
  "totalTokensSaved": 152340,
  "avgTokensSavedPerSearch": 3241
}

These are sums of estimates, not measured token counts from a real tokenizer.


License

MIT


Releasing (maintainers)

Releases are driven by Changesets. Versioning + CHANGELOG generation happen in CI; publishing to npm and tagging happen locally from a maintainer machine that's authenticated with npm. There are no npm credentials in CI.

Authoring a changeset (per PR)

yarn changeset
# Pick affected packages, bump type (patch/minor/major), and write the user-facing summary.
git add .changeset/
git commit -m "chore: changeset"

All four packages (@paparats/shared, @paparats/cli, @paparats/server, @paparats/indexer) are kept on a fixed version: pick any one and the rest are bumped to match.

How a release happens

1. CI opens a release PR (automatic). The Release workflow runs on every push to main. If pending .changeset/*.md files exist, it opens (or updates) a chore: release PR with: version bumps in every package.json, regenerated per-package CHANGELOG.md files, server.json synced via scripts/sync-server-json.js, and the consumed .changeset/*.md files deleted.

2. Maintainer merges the release PR. No further CI publish step runs.

3. Maintainer publishes locally. From a clean checkout of main after the merge:

git checkout main && git pull
yarn release:local         # or `--dry-run` to preview

yarn release:local runs scripts/release-local.sh, which:

  • refuses to run unless you're on main, the tree is clean, and you're in sync with origin/main;

  • refuses if any pending .changeset/*.md are present (means the release PR wasn't merged);

  • reads the new version from packages/cli/package.json;

  • builds, runs yarn changeset publish (skips already-published versions), then tags vX.Y.Z and pushes the tag.

4. Downstream workflows fire on the tag. Pushing vX.Y.Z triggers docker-publish.yml and publish-mcp.yml automatically.

Required credentials

| Where | What | Purpose |
| --- | --- | --- |
| CI | GITHUB_TOKEN (auto) | Open/update the chore: release PR |
| Local | npm login (or NPM_TOKEN in env) | yarn changeset publish to publish @paparats/* |

No npm token lives in GitHub secrets; publishing is intentionally a manual, authenticated step.

Manual / fallback flows

./scripts/release-docker.sh --push still builds and pushes the Docker images by hand if needed (e.g. between official releases). It reads the version from package.json.

Docker images

| Image | Source | Size |
| --- | --- | --- |
| ibaz/paparats-server | packages/server/Dockerfile | ~200 MB |
| ibaz/paparats-indexer | packages/indexer/Dockerfile | ~200 MB |
| ibaz/paparats-ollama | packages/ollama/Dockerfile | ~3 GB (includes model) |


Contributing

Contributions welcome! Areas of interest:

  • Additional language support (PHP, Elixir, Scala, Kotlin, Swift)

  • Alternative embedding providers (OpenAI, Cohere, local GGUF via llama.cpp)

  • Performance optimizations (chunking strategies, cache eviction)

  • Agent use cases (support bots, QA automation, code analytics)

Open an issue or pull request to get started.



Star the repo if Paparats helps you code faster!
