Paparats MCP
Allows sending traces to Datadog via OpenTelemetry.
Allows sending traces to Elastic APM via OpenTelemetry.
Links code chunks to GitHub issues and pull requests via git history, enabling agents to see ticket context for code changes.
Allows sending traces to Grafana Cloud via OpenTelemetry.
Allows sending distributed traces to Jaeger via OpenTelemetry.
Links code chunks to Jira tickets via git history, enabling agents to see which tickets are associated with code changes.
Sends traces to OpenTelemetry-compatible backends for distributed tracing of the MCP server.
Exposes metrics in Prometheus format for monitoring the MCP server's performance and usage.
Try the full stack in your browser, no install (details)
Paparats-kvetka is a magical flower from Slavic folklore that blooms on Kupala Night and grants whoever finds it the power to see hidden things. Likewise, paparats-mcp helps your agent see the right code across a sea of repositories.
Works with Claude Code · Cursor · Windsurf · Copilot · Codex · Antigravity · any MCP-compatible agent
Give your AI coding assistant deep, real understanding of your entire workspace.
Paparats indexes every repo you care about (semantically, with AST-aware chunking and
a cross-chunk symbol graph) and exposes it through the Model Context Protocol. Search
by meaning, follow who-uses-what through real symbol edges, see who last touched a
chunk and which ticket it came from, all without your code ever leaving your machine.
The built-in /ui operator console: ROI, query quality, cross-project usage, per-user activity, indexer health. Screenshot uses synthetic data (?demo=1), with no real queries, users, or project names.
One install, one config. paparats install → paparats add ~/code/repo → done.
AST-aware chunking and symbol extraction. Tree-sitter parses every supported file once and feeds both chunking and the cross-chunk symbol graph (calls / called_by / references / referenced_by). 11 languages including TypeScript, Python, Go, Rust, Java, Ruby, C, C++, C#.
Architectural memory that the agent maintains itself. A second vector store per group holds components, decisions (ADRs) and lessons learned; your agent writes them as it works and reads them before answering. Bootstrap on day one with the init_arch_memory MCP prompt (the /init of architectural memory). Server-side similarity gate prevents duplicates, supersedes links replace stale decisions, a min_score threshold gates low-confidence reads, every card carries an "updated N ago" stamp, and Prometheus metrics tell you whether your memory is actually being used.
Saves tokens. Returns only the chunks that matter, with token-savings telemetry to prove it (per-query, per-user, per-anchor-project).
Production-ready observability. Prometheus /metrics, OpenTelemetry traces (Tempo, Jaeger, Honeycomb, Datadog, Grafana Cloud, Elastic APM), local SQLite analytics, and a built-in /ui operator console that visualises ROI, query quality, cross-project usage and indexer health in one screen.
100% local by default. Qdrant + Ollama on your machine. No cloud, no API keys, no telemetry leaving the box. Bring your own Qdrant Cloud / Ollama URL if you want.
Why Paparats?
AI coding assistants are smart, but they can only see files you open. They don't know your codebase structure, where the authentication logic lives, or how services connect. Paparats fixes that.
What you get
Semantic code search: ask "where is the rate limiting logic?" and get exact code ranked by meaning, not grep matches
Real-time sync: edit a file, and 2 seconds later it's re-indexed. No manual re-runs
Cross-chunk symbol graph: find_usages walks AST-derived edges (calls, called_by, references, referenced_by) so the agent can trace dependencies without re-grepping
Token savings: return only relevant chunks instead of full files to reduce context size
Multi-project workspaces: search across backend, frontend, infra repos in one query
100% local & private: Qdrant vector database + Ollama embeddings. Nothing leaves your laptop
AST-aware chunking: code split by AST nodes (functions/classes) via tree-sitter, not arbitrary character counts (TypeScript, JavaScript, TSX, Python, Go, Rust, Java, Ruby, C, C++, C#; regex fallback for Terraform)
Rich metadata: each chunk knows its symbol name (from tree-sitter AST), service, domain context, and tags from directory structure
Git history per chunk: see who last modified a chunk, when, and which tickets (Jira, GitHub) are linked to it
Architectural memory: a living knowledge base of components, decisions (ADRs) and lessons learned, written by the agent as it learns, deduplicated server-side by vector similarity, and consulted on every support query so the agent stays consistent across sessions
Who benefits
Use Case | How Paparats Helps |
Solo developers | Quickly navigate unfamiliar codebases, find examples of patterns, reduce context-switching |
Multi-repo teams | Cross-project search (backend + frontend + infra), consistent patterns, faster onboarding |
AI agents | Foundation for product support bots, QA automation, dev assistants: any agent that needs code context |
Legacy modernization | Find all usages of deprecated APIs, identify migration patterns, discover hidden dependencies |
Contractors/consultants | Accelerate ramp-up on client codebases, reduce "where is X?" questions |
Quick Start
Try it in the browser (no install)
Spin up a full Qdrant + Ollama + paparats stack in a Codespace.
A small slice of the repo (packages/shared/src) is auto-indexed on first start so
you can run paparats search -g demo 'gitignore filter' within a few minutes. Codespace forwards port 9876 for MCP; point Cursor/Claude Code at it via the URL VS Code shows in the Ports panel.
Note: Codespaces is for demo only. With Ollama-on-CPU embedding, indexing the full repo would take 15+ minutes and can hit batch timeouts on large files. For real workloads run locally, or set OPENAI_API_KEY (or VOYAGE_API_KEY) as a Codespaces user secret and indexing drops to a couple of seconds; see the Embedding providers section below.
Run locally
You need Docker and Docker Compose v2. On macOS, also install Ollama natively: running it inside Docker on macOS is significantly slower because the Docker VM cannot use Apple Silicon GPU acceleration.
# 1. Install the CLI.
npm install -g @paparats/cli
# 2. macOS only: install Ollama natively (Linux uses Docker Ollama by default).
brew install ollama
# 3. One-time bootstrap. Generates ~/.paparats/{docker-compose.yml,projects.yml},
# starts the stack, downloads the embedding model, wires Cursor/Claude Code MCP.
paparats install
# 4. Add the projects you want indexed. Local paths bind-mount read-only into the
# indexer; git URLs and owner/repo shorthand get cloned.
paparats add ~/code/my-project
paparats add git@github.com:acme/billing.git
paparats add acme/widgets
# 5. Watch it work.
paparats list
That's it. Your IDE is already wired (~/.cursor/mcp.json, ~/.claude/mcp.json) to
http://localhost:9876/mcp. Open Cursor or Claude Code and ask:
"Search this workspace for the auth middleware and show me everything that calls it."
Existing v1 user?
Just run paparats install again. The installer detects the legacy per-project
compose, asks once before swapping it for the new global setup, and preserves your
indexed data (Qdrant collections, SQLite metadata, embedding cache). Your in-repo
.paparats.yml files keep working as per-project overrides.
How the install works
paparats install is the only setup command. It creates a single global home at
~/.paparats/, brings up a Docker stack, and wires your MCP clients. Re-run it any time
to reconfigure: it diffs the existing compose and asks before overwriting hand edits.
~/.paparats/
├── docker-compose.yml   generated; hand-editable; install asks before overwriting
├── projects.yml         project list (CLI rewrites it; comments survive your manual edits)
├── install.json         install flags persisted so add/remove can regenerate compose
├── .env                 secrets: Qdrant API key, GitHub token; chmod 600
├── models/              jina-code-embeddings GGUF + Modelfile
└── data/                Docker volumes (mounted by name from compose)
    ├── qdrant/          vector index
    ├── sqlite/          metadata.db, embeddings.db, analytics.db
    └── repos/           cloned remote projects
Inside the Docker stack:
Service | Image | Port | Role |
|
| 9876 | MCP HTTP/SSE endpoints, search, metadata API |
|
| 9877 | Cron + on-demand indexing, hot-reload of project list |
|
| 6333 | Vector DB (skipped when you pass |
|
| 11434 | Embedding model (Linux default; macOS uses native Ollama) |
The indexer hot-reloads projects.yml. Edits that change project metadata
only (group, language, indexing tweaks) reindex in place. Edits that add or remove
local-path projects require a stack restart so Docker picks up the new bind-mount β
the CLI does this for you on paparats add and paparats remove.
Install variants
Default (recommended)
paparats install
On macOS, prefers native Ollama and dockerized Qdrant. On Linux, defaults to Docker for both.
Bring your own Qdrant
paparats install --qdrant-url https://qdrant.example.com
# Asks for an API key after; stored in ~/.paparats/.env as QDRANT_API_KEY.
When --qdrant-url is set, the Qdrant container is omitted from the stack entirely.
Bring your own Ollama
paparats install --ollama-url http://10.0.0.5:11434
Skips both native and Docker Ollama.
You must register the embedding model on the remote Ollama yourself. The installer will not touch a remote instance. On the Ollama host, download the GGUF (jinaai/jina-code-embeddings-1.5b-Q8_0.gguf) and run:
echo "FROM /path/to/jina-code-embeddings-1.5b-Q8_0.gguf" > Modelfile
ollama create jina-code-embeddings -f Modelfile
Then run paparats install --ollama-url http://that-host:11434 and Paparats will use it.
Force Docker Ollama on macOS
paparats install --ollama-mode docker
Slower on Apple Silicon (no Metal GPU), but useful for parity testing or laptops without brew.
Scripted / CI
paparats install --non-interactive --force
Fails on any prompt; --force answers Y to compose-overwrite and migration prompts.
Migrating from a v1 install
When paparats install finds a legacy ~/.paparats/docker-compose.yml (the one from the
old per-project flow with no paparats-indexer service), it prints a one-screen
migration notice and asks before tearing the legacy stack down.
What survives: Qdrant collections, SQLite metadata, indexer repos, and any
.paparats.yml files inside your repos (those still take precedence over
projects.yml overrides).
What's deleted: the legacy docker-compose.yml and .env. They are regenerated on
the spot under the new schema.
No re-indexing needed: the data volumes are referenced by the same names in the new
compose. Add your projects with paparats add and they re-appear in paparats list with
their existing chunks.
If your install predates the paparats-indexer.yml → projects.yml rename, the
installer migrates the file in place on first run and prints a one-line notice.
The indexer also reads the legacy name as a fallback, so nothing breaks if you
roll out the indexer before re-running paparats install.
Pass --force to skip the migration prompt in scripts.
Support agent setup
For bots and support teams that consume an existing Paparats server: no Docker, no Ollama needed on this side.
# Connect to a running server (default: localhost:9876)
paparats install --mode support
# Connect to a remote server
paparats install --mode support --server http://prod-server:9876
The installer verifies the server is reachable, then wires Cursor MCP
(~/.cursor/mcp.json) and Claude Code MCP (~/.claude/mcp.json) to the support
endpoint. Tools available on /support/mcp: search_code, get_chunk, find_usages,
list_projects, health_check, get_chunk_meta, search_changes, explain_feature,
recent_changes, impact_analysis, arch_context, arch_record_component,
arch_record_decision, arch_record_lesson (architectural memory; see
Key Features), plus
the analytics tools described in Observability below.
How It Works
Your projects            Paparats                            AI assistant
                                                             (Claude Code / Cursor)
backend/                 ┌──────────────────────┐
  .paparats.yml ────────►│ Indexer              │
frontend/                │  - chunks code       │          ┌──────────────┐
  .paparats.yml ────────►│  - embeds via Ollama │─────────►│ MCP search   │
infra/                   │  - stores in Qdrant  │          │ tool call    │
  .paparats.yml ────────►│  - watches changes   │          └──────────────┘
                         └──────────────────────┘
Indexing Pipeline
During each indexer cycle (cron-driven, on-demand via paparats add, or triggered by
the indexer's chokidar file watcher), every file in scope flows through this pipeline:
Source file
         │
         ▼
┌──────────────────┐
│ 1. File discovery│  Collect files from indexing.paths, apply
│    & filtering   │  gitignore + exclude patterns, skip binary
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 2. Content hash  │  SHA-256 of file content → compare with
│    check         │  existing Qdrant chunks → skip unchanged
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 3. AST parsing   │  tree-sitter parses the file once (WASM)
│    (single pass) │  → reused for chunking AND symbol extraction
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 4. Chunking      │  AST nodes → chunks at function/class
│                  │  boundaries. Regex fallback for unsupported
│                  │  languages (brace/indent/block strategies)
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 5. Symbol        │  AST queries extract module-level defines
│    extraction    │  (function/class/variable names) and uses
│                  │  (calls, references) per chunk. 11 languages
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 6. Metadata      │  Service name, bounded_context, tags from
│    enrichment    │  config + auto-detected directory tags
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 7. Embedding     │  Jina Code Embeddings 1.5B via Ollama
│                  │  SQLite cache (content-hash key) → skip
│                  │  already-embedded content
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 8. Qdrant upsert │  Vectors + payload (content, file, lines,
│                  │  symbols, metadata) → batched upsert
└────────┬─────────┘
         ▼
┌──────────────────┐
│ 9. Git history   │  git log per file → diff hunks → map
│    (post-index)  │  commits to chunks by line overlap →
│                  │  extract ticket refs → store in SQLite
└────────┬─────────┘
         ▼
┌──────────────────┐
│10. Symbol graph  │  Cross-chunk edges: calls ↔ called_by,
│    (post-index)  │  references ↔ referenced_by → SQLite
└──────────────────┘
Step 5's symbol extractor only emits module-level definitions: locals declared inside function bodies, callback args, and hook closures stay out of the graph because they're not addressable from another chunk anyway.
Search Flow
AI assistant queries via MCP → server detects query type (nl2code / code2code / techqa) → expands query (abbreviations, case variants, plurals) → all variants searched in parallel against Qdrant → results merged by max score → only relevant chunks returned with confidence scores and symbol info.
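The "merged by max score" step can be pictured with a short Python sketch. It assumes each variant search returns (chunk_id, score) pairs; the function name merge_by_max_score, the limit parameter, and the sample IDs are illustrative, not Paparats internals.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical hit shape for illustration: (chunk_id, score).
Hit = Tuple[str, float]


def merge_by_max_score(result_sets: Iterable[List[Hit]], limit: int = 5) -> List[Hit]:
    """Keep the best score seen for each chunk across all query variants."""
    best: Dict[str, float] = {}
    for hits in result_sets:
        for chunk_id, score in hits:
            if score > best.get(chunk_id, float("-inf")):
                best[chunk_id] = score
    # Highest score first; ties broken by chunk id for a stable order.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


variant_a = [("chunk-1", 0.62), ("chunk-2", 0.41)]
variant_b = [("chunk-2", 0.58), ("chunk-3", 0.30)]
merged = merge_by_max_score([variant_a, variant_b])
```

A chunk found by several variants appears once in the merged list, carrying its highest score.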
Watching
The indexer container watches the projects mounted into it via chokidar with debouncing
(1s default). On change, only the affected file re-enters the pipeline. Unchanged content
is never re-embedded thanks to the content-hash cache. The indexer also hot-reloads
~/.paparats/projects.yml itself: metadata-only edits reindex in place;
add/remove of local-path projects triggers a stack restart through the CLI.
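Debouncing itself is easy to sketch. The indexer uses chokidar, which is a JavaScript library; the Python below only illustrates the idea of waiting for a quiet period before re-indexing a file. The Debouncer class and its method names are hypothetical.

```python
from typing import Dict, List


class Debouncer:
    """Collects change events and releases a path once it has been quiet for delay_s."""

    def __init__(self, delay_s: float = 1.0) -> None:
        self.delay_s = delay_s
        self._last_event: Dict[str, float] = {}

    def notify(self, path: str, now: float) -> None:
        # A newer event for the same path restarts its quiet period.
        self._last_event[path] = now

    def due(self, now: float) -> List[str]:
        ready = sorted(
            path for path, seen in self._last_event.items()
            if now - seen >= self.delay_s
        )
        for path in ready:
            del self._last_event[path]
        return ready


debouncer = Debouncer(delay_s=1.0)
debouncer.notify("src/auth.ts", now=0.0)
debouncer.notify("src/auth.ts", now=0.5)   # a second save within the window
still_waiting = debouncer.due(now=1.0)     # only 0.5s of quiet so far
released = debouncer.due(now=2.0)          # quiet for 1.5s, safe to re-index
```

Two quick saves of the same file collapse into a single re-index, which is the effect described above.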
Key Features
Better Search Quality
Task-specific embeddings: Jina Code Embeddings supports 3 query types (nl2code, code2code, techqa) with different prefixes for better relevance:
"find authentication middleware" → nl2code prefix (natural language → code)
"function validateUser(req, res)" → code2code prefix (code → similar code)
"how does OAuth work in this app?" → techqa prefix (technical questions)
Query expansion: every search generates 2-3 variations server-side:
Abbreviations: auth → authentication, db → database
Case variants: userAuth → user_auth → UserAuth
Plurals: users → user, dependencies → dependency
Filler removal: "how does auth work" → "auth"
All variants searched in parallel, results merged by max score.
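A rough Python sketch of this kind of expansion follows. The abbreviation table and filler list are small illustrative stand-ins (the server's real tables are not shown in this README), plural handling is left out, and every function name here is hypothetical.

```python
import re
from typing import List

# Illustrative tables only; assumed, not taken from the server.
ABBREVIATIONS = {"auth": "authentication", "db": "database"}
FILLER = {"how", "does", "do", "the", "a", "is", "work", "works"}


def split_words(identifier: str) -> List[str]:
    """Split camelCase, PascalCase or snake_case into lowercase words."""
    parts = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", identifier.replace("_", " "))
    return [part.lower() for part in parts]


def case_variants(identifier: str) -> List[str]:
    words = split_words(identifier)
    if len(words) < 2:
        return [identifier]
    camel = words[0] + "".join(word.capitalize() for word in words[1:])
    snake = "_".join(words)
    pascal = "".join(word.capitalize() for word in words)
    return [camel, snake, pascal]


def expand_abbreviations(query: str) -> str:
    return " ".join(ABBREVIATIONS.get(word, word) for word in query.split())


def strip_filler(query: str) -> str:
    return " ".join(word for word in query.split() if word.lower() not in FILLER)


def expand_query(query: str) -> List[str]:
    """Return the original query plus its distinct, non-empty variants."""
    variants: List[str] = []
    for candidate in (query, expand_abbreviations(query), strip_filler(query)):
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants


variants = expand_query("how does auth work")
```

Each variant would then be searched separately and the results merged by max score.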
Confidence scores: each result includes a percentage score (≥60% high, 40-60% partial, <40% low) to guide AI next steps.
Performance
Embedding cache: SQLite cache with content-hash keys + Float32 vectors. Unchanged code is never re-embedded. LRU cleanup at 100k entries.
AST-aware chunking: tree-sitter AST nodes define natural chunk boundaries for 11 languages. Falls back to regex strategies (block-based for Ruby, brace-based for JS/TS, indent-based for Python, fixed-size) for unsupported languages.
Real-time watching: the indexer's chokidar watcher reindexes a project on file
changes with debouncing (1s default). For local-path projects bind-mounted into the
indexer, edits on your host show up in MCP queries within seconds.
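The embedding cache idea (content-hash key, Float32 vectors, SQLite storage) can be sketched in Python as below. The table layout, class name, and the fake_embed stand-in are assumptions for illustration; the real schema and the LRU cleanup are not shown in this README and are omitted here.

```python
import hashlib
import sqlite3
from array import array
from typing import Callable, List


def content_key(text: str) -> str:
    """SHA-256 of the chunk content, used as the cache key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class EmbeddingCache:
    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS embeddings (hash TEXT PRIMARY KEY, vector BLOB NOT NULL)"
        )

    def get_or_embed(self, text: str, embed: Callable[[str], List[float]]) -> List[float]:
        key = content_key(text)
        row = self.db.execute(
            "SELECT vector FROM embeddings WHERE hash = ?", (key,)
        ).fetchone()
        if row is not None:
            cached = array("f")          # "f" = 32-bit floats
            cached.frombytes(row[0])
            return list(cached)
        vector = array("f", embed(text))
        self.db.execute(
            "INSERT INTO embeddings (hash, vector) VALUES (?, ?)", (key, vector.tobytes())
        )
        self.db.commit()
        return list(vector)


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    # Stand-in for the real embedding call; records how often it runs.
    calls.append(text)
    return [0.5, 0.25, -1.0]


cache = EmbeddingCache()
first = cache.get_or_embed("def f(): pass", fake_embed)
second = cache.get_or_embed("def f(): pass", fake_embed)
```

The second lookup is served from SQLite, so the embedding function runs only once for identical content.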
Cross-chunk symbol graph
The post-index pass walks every chunk's defines_symbols / uses_symbols lists and
materializes edges into SQLite: calls, called_by, references, referenced_by.
find_usages returns those edges grouped by direction so the agent can traverse the
graph without re-searching. Because extraction is AST-driven, function locals don't
pollute the graph.
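As a minimal sketch of that post-index pass, the Python below turns per-chunk symbol lists into directed edges plus their inverses. It assumes each entry in uses_symbols is tagged with a kind ("calls" or "references"); that tagging, the sample chunk IDs, and the build_edges name are illustrative assumptions, and the result is kept in memory rather than written to SQLite.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

INVERSE = {"calls": "called_by", "references": "referenced_by"}

# Hypothetical chunk payloads for illustration.
chunks = {
    "chunk-a": {
        "defines_symbols": ["validateUser"],
        "uses_symbols": [("calls", "hashPassword"), ("references", "MAX_RETRIES")],
    },
    "chunk-b": {"defines_symbols": ["hashPassword"], "uses_symbols": []},
    "chunk-c": {"defines_symbols": ["MAX_RETRIES"], "uses_symbols": []},
}


def build_edges(chunks: Dict[str, dict]) -> List[Tuple[str, str, str]]:
    """Return sorted (source_chunk, edge_kind, target_chunk) triples."""
    definers: Dict[str, Set[str]] = defaultdict(set)
    for chunk_id, chunk in chunks.items():
        for symbol in chunk["defines_symbols"]:
            definers[symbol].add(chunk_id)

    edges: Set[Tuple[str, str, str]] = set()
    for chunk_id, chunk in chunks.items():
        for kind, symbol in chunk["uses_symbols"]:
            for target in definers.get(symbol, ()):
                if target == chunk_id:
                    continue  # a chunk using its own symbol is not a cross-chunk edge
                edges.add((chunk_id, kind, target))
                edges.add((target, INVERSE[kind], chunk_id))
    return sorted(edges)


edges = build_edges(chunks)
```

Storing both directions up front is what lets a lookup answer "who calls this?" without a fresh search.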
Architectural memory (agent-maintained ADRs, components, lessons)
Code search tells the agent what the code does. Architectural memory tells it why, and the agent maintains that knowledge itself, across sessions, without you authoring a single doc.
Three card kinds, structured by design:
Kind | Captures | Fields |
Component | A unit with a clear responsibility (service, module, subsystem) |
|
Decision | An architectural choice (ADR-style) |
|
Lesson | A rule learned from an incident, a code review, a bug, or a user correction (Reflexion-style) |
|
The agent reads them via arch_context before any architectural answer, and
writes them via arch_record_component, arch_record_decision, and
arch_record_lesson whenever it discovers something new or learns from a
correction. Each card carries an updated N ago stamp in the read tool so the agent
can spot stale memory and verify against current code.
Server-side similarity gate (cosine over bge-m3 text embeddings, 1024d):
≥ 0.85 is a duplicate: decisions are refused (the agent must reconcile or supersede); lessons bump updatedAt (Reflexion-style "rule confirmed").
0.70 - 0.85 is similar: surfaced to the agent so it can refine the wording or chain a supersede.
< 0.70 is new: accepted as a fresh card.
supersedes links bypass the gate and mark the prior decision as status=superseded
so it disappears from default search but remains in history.
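The threshold logic of the gate can be sketched in a few lines of Python. The thresholds come from the text above; the function names and the tiny 2-dimensional vectors are illustrative (real cards use 1024-dimensional bge-m3 embeddings), and the per-kind handling of duplicates and the supersedes bypass are not modelled.

```python
import math
from typing import List, Sequence

DUPLICATE_THRESHOLD = 0.85
SIMILAR_THRESHOLD = 0.70


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(score: float) -> str:
    if score >= DUPLICATE_THRESHOLD:
        return "duplicate"
    if score >= SIMILAR_THRESHOLD:
        return "similar"
    return "new"


def gate(new_vector: Sequence[float], existing: List[Sequence[float]]) -> str:
    """Classify a new card by its closest existing card."""
    best = max((cosine(new_vector, vector) for vector in existing), default=0.0)
    return classify(best)


outcome_same = gate([1.0, 0.0], [[1.0, 0.0]])
outcome_close = gate([1.0, 0.0], [[0.8, 0.6]])
outcome_fresh = gate([1.0, 0.0], [[0.0, 1.0]])
```

An empty memory always yields "new", since there is nothing to compare against.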
Why this matters:
Cross-session continuity: what the agent learned last week, today's agent still knows.
ADRs without the ceremony: no markdown files to maintain, no review process, no doc drift. The agent writes when it learns.
Reflexion built in: corrections become lessons; repeated mistakes get caught.
No memory rot: the similarity gate kills duplicates, supersedes links replace stale decisions, age stamps trigger verification against code.
Lives in a separate Qdrant collection per group (paparats_<group>_arch). Reading
(arch_context) is available on both endpoints: coding agents need to know
about prior decisions before refactoring. Writing (arch_record_*) is support-only:
recording belongs to the architectural-review workflow, not to every line edit.
arch_context accepts a min_score parameter (default 0.45, cosine over
bge-m3). Lower it to broaden recall on a sparse
arch memory; raise it to demand only high-confidence cards. The tool also emits an
explicit low-confidence hint when the question matched nothing above the threshold,
so the agent knows to either rephrase or lower min_score instead of inventing
context.
Initialise the arch layer on day one. Three purpose-built MCP workflow prompts make the boring scaffolding work disappear:
init_arch_memory: the /init of architectural memory. Walks the repo, identifies 8-20 components by domain boundary, writes them, and captures any obvious decisions inferable from comments or README. Run it once per group, right after installing.
audit_architecture: sweeps the memory of one group, flags cards older than 90 days, verifies anchors against the live code, and surfaces a punch list of updates / supersedes for your approval.
record_lesson_from_correction: converts a user correction into a structured lesson card (rule / why / when) without overrecording typos.
MCP resources for live introspection:
arch://schema - the full card-schema reference (fields, similarity-gate thresholds, write semantics). Cite it from the agent when explaining the model.
arch://stats/{group} - live counts (total / by kind / by status) and the oldest/newest updatedAt per group. The same numbers are also pushed to Prometheus.
Observability built in. When PAPARATS_METRICS=true, every read/write hits a
counter and the cosine score of returned cards lands in a histogram:
paparats_arch_context_calls_total{group}: calls per group
paparats_arch_write_total{kind, status}: writes by card kind and gate outcome
paparats_arch_search_score: histogram of cosine scores in arch_context results (post min_score)
paparats_arch_collection_size{group, kind, status}: gauge updated whenever arch://stats/{group} is read
These let you spot a memory that's not being written to, a similarity gate that's too aggressive, or a sparse group where every query returns low-confidence hits.
Use Cases
For Developers (Coding)
Connect via the coding endpoint (/mcp):
Use Case | How |
Navigate unfamiliar code |
|
Find similar patterns |
|
Trace dependencies |
|
Explore context |
|
Manage projects |
|
For Support Teams
Connect via the support endpoint (/support/mcp):
Use Case | How |
Explain a feature |
|
Recent changes |
|
Trace usages |
|
Change history |
|
Blast radius |
|
Architectural Q&A |
|
Capture decisions & lessons |
|
Support chatbot example:
User: "How do I configure rate limiting?"
Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references
Configuration
Paparats uses two config files. Both are optional; defaults work for the common case.
~/.paparats/projects.yml: global project list
Lives outside your repos. Edited by paparats add / paparats remove or by hand via
paparats edit projects. Every entry has either path: (local bind-mount) or url:
(remote git, cloned by the indexer), never both.
defaults:
  cron: '0 */6 * * *'   # global indexer schedule
  group: workspace      # default group when an entry doesn't specify one
repos:
  - path: /Users/alice/code/billing   # local bind-mount
    group: dev
    language: typescript
  - url: org/widgets                  # remote git, cloned by the indexer
    group: prod
    language: ruby
  - url: git@github.com:acme/billing.git
    name: billing                     # override the auto-derived name
    group: prod
The indexer hot-reloads this file. Adding/removing local-path entries causes the CLI to restart the stack so Docker picks up the new bind-mount; metadata-only edits reindex in place.
.paparats.yml in your repo: per-project overrides
Drop one at the project root to override anything from the global file.
group: my-app
language: typescript
# Indexing tuning (all optional)
indexing:
  paths: [src, packages]                  # restrict to these subdirectories
  exclude: [node_modules, dist, '**/*.test.ts']
  exclude_extra: ['**/__fixtures__/**']   # added on top of language defaults
  chunkSize: 1500                         # characters per chunk (default: 1200)
  overlap: 100                            # chunk overlap (default: 100)
  concurrency: 4                          # parallel embedding requests
  batchSize: 8                            # embeddings per Ollama call
# Metadata
metadata:
  service: billing
  bounded_context: payments
  tags: [backend, critical]
  directory_tags:
    src/api: [public-api]
    src/internal: [internal]
  # Git history per chunk (Jira / GitHub ticket extraction included)
  git:
    enabled: true
    maxCommitsPerFile: 50
    ticketPatterns:
      - '\b([A-Z]+-\d+)\b'   # Jira-style PROJ-123
      - '#(\d+)'             # GitHub-style #123
In-repo .paparats.yml always wins over projects.yml. The CLI never
overwrites it.
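The precedence rule can be illustrated with a small Python sketch. It assumes the two sources are combined key by key with nested sections merged recursively; whether Paparats merges deeply or replaces whole sections is not stated here, so treat deep_merge and the sample values as an assumption.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return base with override applied on top; override wins on conflicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical values: an entry from projects.yml and an in-repo .paparats.yml.
global_entry = {
    "group": "dev",
    "language": "typescript",
    "indexing": {"chunkSize": 1200, "overlap": 100},
}
in_repo = {"group": "my-app", "indexing": {"chunkSize": 1500}}

effective = deep_merge(global_entry, in_repo)
```

Under this assumption the in-repo file changes only the keys it names, and everything else falls through from the global entry.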
Groups
A group is a Qdrant collection (paparats_<group>). Multiple projects can share a
group to enable cross-project search; each project lives as a project: field in the
chunk payload. By default group defaults to the project name (one project, one
collection). Set the same group: on multiple entries to consolidate them.
Git history per chunk
When metadata.git.enabled: true (default), the indexer maps each chunk to the commits
that touched its line range using diff-hunk overlap. Tickets are extracted from commit
messages using metadata.git.ticketPatterns (built-in: Jira PROJ-123, GitHub #42,
cross-repo org/repo#99). Surfaced through MCP tools get_chunk_meta, search_changes,
recent_changes, explain_feature. Non-fatal: non-git projects index normally.
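A simplified Python sketch of the two ideas in this section, line-range overlap and ticket extraction, is shown below. The two regexes are the ones from the config example above; the commit dictionaries and function names are hypothetical, and the sketch treats hunk ranges as already expressed in current line numbers, which a real implementation would have to account for.

```python
import re
from typing import Dict, List, Tuple

# Patterns copied from the ticketPatterns example: Jira-style and GitHub-style.
TICKET_PATTERNS = [r"\b([A-Z]+-\d+)\b", r"#(\d+)"]


def extract_tickets(message: str) -> List[str]:
    found: List[str] = []
    for pattern in TICKET_PATTERNS:
        for match in re.findall(pattern, message):
            if match not in found:
                found.append(match)
    return found


def ranges_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """Inclusive line ranges overlap when neither ends before the other starts."""
    return a_start <= b_end and b_start <= a_end


def commits_touching_chunk(chunk_lines: Tuple[int, int], commits: List[Dict]) -> List[Dict]:
    start, end = chunk_lines
    touching = []
    for commit in commits:
        if any(ranges_overlap(start, end, h_start, h_end) for h_start, h_end in commit["hunks"]):
            touching.append(
                {"sha": commit["sha"], "tickets": extract_tickets(commit["message"])}
            )
    return touching


# Hypothetical commit data for illustration.
commits = [
    {"sha": "a1b2c3", "message": "PROJ-123: tighten rate limiter (#42)", "hunks": [(10, 18)]},
    {"sha": "d4e5f6", "message": "docs: update README", "hunks": [(200, 210)]},
]
linked = commits_touching_chunk((15, 40), commits)
```

Only the first commit overlaps lines 15-40, so it is the only one linked to the chunk, along with both ticket references from its message.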
MCP Tools Reference
Paparats serves the Model Context Protocol on two separate endpoints, each with its own tool set and system instructions.
Coding endpoint (/mcp)
For developers using Claude Code, Cursor, etc. Focus: search code, read chunks, follow the cross-chunk symbol graph, manage projects.
Tool | Description |
| Semantic search across indexed projects. Returns chunks with symbol info and confidence scores. |
| Retrieve a chunk by ID with optional surrounding context. |
| Walk the symbol graph from a |
| List indexed projects with chunk counts and detected languages. |
| Wipe Qdrant chunks + SQLite metadata for a project (CLI's |
| Indexing status, chunks per group, running jobs. |
| Read-only architectural memory. Returns components, decisions, and lessons relevant to the query with |
Support endpoint (/support/mcp)
For support teams and bots without direct code access. Focus: feature explanations, change history, cost reporting, all in plain language.
Tool | Description |
| Same as coding endpoint. |
| Same. |
| Same. |
| Same. |
| Same. |
| Git history and ticket references for a chunk: commits, authors, dates. No code. |
| Semantic search filtered by last-commit date. Each result shows when it last changed. |
| Comprehensive feature analysis: locations + recent changes for a question. |
| Timeline grouped by date with commits, tickets, affected files. |
| Cross-chunk impact for a |
| Read the architectural memory for a group β top-matching components, decisions and lessons, each stamped with "updated N ago" and a cosine score. Accepts a |
| Record a component with |
| Record an ADR-style decision ( |
| Record a lesson as |
| Aggregate token-savings stats (naive baseline vs search-only vs actually consumed). |
| Most frequent queries by user/session/project anchor. |
| Top-N slowest searches with timing + chunk counts. |
| Off-anchor result share per user: indicator of search noise. |
| Tool-call retry rate per user: indicator of unhelpful results. |
| AST parse failures, regex fallbacks, zero-chunk files, binary skips. |
Typical workflows
Drill-down (coding agent):
1. search_code "authentication middleware" → relevant chunks with symbols
2. get_chunk <chunk_id> --radius_lines 50 → expand context around a hit
3. find_usages {chunk_id, direction: "incoming"} → who calls / references this chunk
Single-call (support agent):
1. explain_feature "How does authentication work?" → locations + recent changes
2. recent_changes "auth" --since 2024-01-01 → timeline with tickets
3. token_savings_report → cost report for the last 7 days
Architectural memory (support agent):
1. arch_context "why do we use bge-m3 for the arch layer?"
   → top components / decisions / lessons,
     each with an "updated N ago" stamp
2. arch_record_decision { title, context, decision, alternatives_rejected, consequences }
   → status=created | duplicate | similar
     (gate refuses duplicates server-side)
3. arch_record_lesson { rule, why, when } → status=created | updated (Reflexion bump)
Connecting MCP
paparats install already wires Cursor (~/.cursor/mcp.json) and Claude Code
(~/.claude/mcp.json) to http://localhost:9876/mcp. The sections below are for
manual setup or for adding the support endpoint alongside the default coding one.
Cursor
Create or edit ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):
{
  "mcpServers": {
    "paparats": {
      "type": "http",
      "url": "http://localhost:9876/mcp"
    }
  }
}
For the support use case (feature explanations, change history, impact analysis):
{
  "mcpServers": {
    "paparats-support": {
      "type": "http",
      "url": "http://localhost:9876/support/mcp"
    }
  }
}
Restart Cursor after changing config.
Claude Code
# Coding endpoint (default)
claude mcp add --transport http paparats http://localhost:9876/mcp
# Support endpoint (for support bots/agents)
claude mcp add --transport http paparats-support http://localhost:9876/support/mcp
Or add to .mcp.json in project root:
{
  "mcpServers": {
    "paparats": {
      "type": "http",
      "url": "http://localhost:9876/mcp"
    }
  }
}
Verify
paparats status: check stack is up
Coding endpoint (/mcp): search_code, get_chunk, find_usages, list_projects, delete_project, health_check
Support endpoint (/support/mcp): search_code, get_chunk, find_usages, health_check, list_projects, plus the support-specific tools get_chunk_meta, search_changes, explain_feature, recent_changes, impact_analysis, and the analytics tools listed in Observability (token_savings_report, top_queries, slowest_searches, cross_project_share, retry_rate, failed_chunks)
Ask the AI: "Search this workspace for the auth middleware"
CLI Commands
paparats install [flags]              Bootstrap or reconfigure the global stack.
paparats add <path-or-repo> [flags]   Add a project (local path or git URL/shorthand).
paparats list [--json] [--group g]    Show indexed projects with status from the indexer.
paparats remove <name> [--yes]        Remove a project; deletes Qdrant + SQLite data.
paparats start [--logs]               Start the Docker stack (`--logs` follows the logs).
paparats stop                         Stop the stack (preserves data volumes).
paparats restart                      Recreate containers (applies new compose changes).
paparats edit compose|projects        Open the file in $EDITOR; on save, validate +
                                      regenerate compose + restart + reindex (projects).
paparats search <query> [flags]       Semantic search from the terminal.
paparats status                       Stack health: Docker, Ollama, server, indexer.
paparats groups [--json]              List groups and their projects.
paparats doctor                       Diagnostic checks (Docker, Ollama, ports, configs).
paparats update                       Update CLI from npm + pull latest Docker images.
The legacy per-project commands (paparats init, paparats index, paparats watch) are
gone: adding a project is now paparats add, indexing is automatic in the indexer
container, and watching is handled by the chokidar watcher inside the indexer.
Common flags
paparats install
--ollama-mode <native|docker>β force Ollama mode (default: native on macOS, docker on Linux)--ollama-url <url>β external Ollama; skips both native and docker Ollama--qdrant-url <url>β external Qdrant; skips the Qdrant container--qdrant-api-key <key>β for authenticated Qdrant (e.g. Qdrant Cloud); written to~/.paparats/.env--mode supportβ wire MCP clients only, no Docker stack--server <url>β server URL for support mode (default:http://localhost:9876)--forceβ skip overwrite/migration prompts--non-interactiveβ fail on any prompt instead of asking-v, --verboseβ stream Docker output
paparats add <path-or-repo>
--name <name>β override the auto-derived project name (basename of path / repo)--group <group>β override group (default: project name)--language <lang>β override language (default: auto-detect)--no-restartβ skip the Docker restart for local-path adds (useful in scripts)--no-reindexβ skip the per-project reindex trigger--forceβ drop the project's existing chunks before reindexing (destructive, use after schema/config changes)
paparats remove <name>
--yes: skip the confirmation prompt
paparats search <query>
-n, --limit <n>: max results (default: 5)
-p, --project <name>: filter by project
-g, --group <name>: restrict to a group
--json: machine-readable output
Environment overrides
Var | Default | What |
|
| MCP server base URL (used by CLI commands) |
|
| Indexer base URL ( |
Monitoring
Paparats exposes Prometheus metrics for operational visibility. Opt in by setting PAPARATS_METRICS=true in the server's environment:
# In ~/.paparats/docker-compose.yml, under paparats service:
environment:
  PAPARATS_METRICS: 'true'
Metrics endpoint
curl http://localhost:9876/metrics
Key metrics
Metric | Type | Description |
| Counter | Search requests by group and method |
| Histogram | Search latency |
| Counter | Files indexed |
| Counter | Chunks indexed |
| Gauge | Query result cache hit rate |
| Gauge | Embedding cache hit rate |
| Counter | File watcher events |
Prometheus scrape config
scrape_configs:
  - job_name: paparats
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9876']
Query cache
Search results are cached in-memory (LRU, default 1000 entries, 5-minute TTL). The cache is automatically invalidated when files change. Configure via environment variables:
QUERY_CACHE_MAX_ENTRIES: max cached queries (default: 1000)
QUERY_CACHE_TTL_MS: TTL in milliseconds (default: 300000)
Cache stats are included in GET /api/stats under the queryCache field.
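To make the LRU + TTL behaviour concrete, here is a minimal sketch of such a cache. It is not the code in query-cache.ts: the class name, method names, and the injectable clock are assumptions for illustration; only the defaults (1000 entries, 300000 ms) come from the text above.

```typescript
// Minimal LRU cache with a per-entry TTL. Illustrative only: the class and
// method names are hypothetical, not Paparats internals.
class QueryCacheSketch<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private maxEntries = 1000, // mirrors the QUERY_CACHE_MAX_ENTRIES default
    private ttlMs = 300_000, // mirrors the QUERY_CACHE_TTL_MS default
    private now: () => number = () => Date.now(), // injectable clock for tests
  ) {}

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // entry outlived its TTL
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // Map keeps insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  // Would be called when files change; a real cache may invalidate per group.
  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A Map is enough here because it iterates in insertion order, so re-inserting on every read keeps the least recently used key at the front.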
Analytics & Observability
Paparats ships with three observability layers that work together:
Prometheus (PAPARATS_METRICS=true, see above): scrape /metrics.
Local SQLite analytics store at ~/.paparats/analytics.db (default ON): raw search/tool/indexing events. Six MCP tools query it directly: token_savings_report, top_queries, cross_project_share, retry_rate, slowest_searches, failed_chunks.
OpenTelemetry (PAPARATS_OTEL_ENABLED=true + OTEL_EXPORTER_OTLP_ENDPOINT): spans for every search, MCP tool call, embedding, indexing run, and chunking error. Works with Tempo, Jaeger, Honeycomb, Datadog, Grafana Cloud, or anything else that speaks OTLP/HTTP.
Operator console (/ui)
Open http://localhost:9876/ui for a single-screen dashboard (see screenshot at top of README) that visualises the analytics store above: ROI, top / slowest queries, cross-project usage, per-user activity, indexer status, embedding p95/p99, and recent failures. Polls every 5 s, no extra services to run.
Protect it (optional):
PAPARATS_UI_BASIC_AUTH=user:pass applies to /ui and /api/analytics only; /mcp and /api/search stay open so agents keep working.
Show the screenshot view to anyone without touching real data: PAPARATS_UI_DEMO=true (or append ?demo=1 to the URL once).
Pre-built Grafana dashboard
The built-in /ui covers the current snapshot. For history (latency p99 over weeks, GC trends, CPU under indexing bursts), wire /metrics to Prometheus and import docs/grafana/paparats.json, which has 15 panels across four rows: Traffic & latency, Embeddings, Indexing, Process health.
# 1. Enable Prometheus surface on the server.
PAPARATS_METRICS=true paparats up # or set in your docker-compose.yml
# 2. Point your Prometheus at http://<server>:9876/metrics.
# 3. In Grafana: Dashboards → Import → upload docs/grafana/paparats.json
#    → pick your Prometheus datasource → Import.
The dashboard uses a ${DS_PROMETHEUS} variable, so it works with any Prometheus instance (local, Grafana Cloud, Mimir, VictoriaMetrics).
Sending traces to Elastic APM (or any OTLP backend)
Elastic APM Server accepts OpenTelemetry natively since 7.14, with no agent install and no SDK injection. Set four env vars on the paparats container and restart:
PAPARATS_OTEL_ENABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=https://your-apm-server:8200
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <apm-secret-token>
OTEL_SERVICE_NAME=paparats-mcp
Within a minute a new service paparats-mcp appears in APM → Services. The same env vars work for Tempo, Jaeger, Honeycomb, Datadog, and Grafana Cloud Traces; just change the endpoint and auth header.
What gets recorded: one span per event, with paparats-specific attributes for filtering and grouping:
Span name | Key attributes | When |
|
| every |
|
| every |
|
| every MCP tool invocation |
|
| every embedding request |
|
| every indexer cycle |
|
| per-file chunking failure |
Every span also carries paparats.user, paparats.session, paparats.client, paparats.request_id, and (when present) paparats.anchor_project from the identity headers (see Identity attribution below), so you can filter APM by user or correlate spans across a single MCP session.
What this is good for in Elastic APM:
Errors view: chunking and embedding failures with stacktrace + file/language/error_class context, aggregated by error class.
Transactions: paparats.search becomes a transaction type. Sort by p95/p99/error rate to find the slow workloads. Filter by paparats.tool=search_code or paparats.group=… to slice by repo.
Custom queries / metrics: every paparats attribute is indexed. Build APM queries like paparats.embedding.cache_miss:true AND duration_ms>500 to find slow cache-miss embeddings, or aggregate paparats.search.result_count per paparats.group.
Log correlation: if you ship paparats stdout to Elastic via Filebeat, the trace.id field links a log line back to its span.
What this is not (honest caveats):
Spans are flat (one event = one span), not parented. Service Map will show paparats-mcp as an isolated node; you won't see a "search → embedding → Qdrant" waterfall. Use the per-span duration_ms attributes for stage timing instead.
Outbound HTTP to Qdrant / Ollama is not auto-instrumented. To see those as separate dependencies in APM you'd need to enable @opentelemetry/instrumentation-http (planned, not shipped). For now, embedding and search latency live on the existing spans.
Per-request token-savings, top queries, and cross-project usage stay in the local SQLite store because they're aggregations, not events. View them in the built-in /ui console, not in APM.
For pure metrics (CPU, GC, RSS, request rates) Elastic Metricbeat or our Prometheus exporter (above) is a better fit than APM.
Identity attribution
Clients (IDE plugins, CLI) can set X-Paparats-User, X-Paparats-Session, X-Paparats-Client, and X-Paparats-Anchor-Project headers. The header name for user is configurable via PAPARATS_IDENTITY_HEADER (default X-Paparats-User). If the header is missing, events are attributed to anonymous. There is no cryptographic verification: this is for attribution, not access control.
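The header-to-identity mapping can be sketched roughly as follows. This is an illustration under assumptions, not the server's code: the Identity shape, the function name, and the trimming of empty values are hypothetical; the header names and the anonymous fallback come from the paragraph above.

```typescript
// Hypothetical identity shape; the real server may track more fields.
type Identity = {
  user: string;
  session?: string;
  client?: string;
  anchorProject?: string;
};

function resolveIdentity(
  headers: Record<string, string | undefined>,
  userHeader = "X-Paparats-User", // would be read from PAPARATS_IDENTITY_HEADER
): Identity {
  // HTTP header names are case-insensitive, so normalise the keys first.
  const lower: Record<string, string | undefined> = {};
  for (const [key, value] of Object.entries(headers)) {
    lower[key.toLowerCase()] = value;
  }
  // Treat empty or whitespace-only values as missing (an assumption).
  const pick = (name: string): string | undefined =>
    lower[name.toLowerCase()]?.trim() || undefined;

  return {
    user: pick(userHeader) ?? "anonymous", // no verification, attribution only
    session: pick("X-Paparats-Session"),
    client: pick("X-Paparats-Client"),
    anchorProject: pick("X-Paparats-Anchor-Project"),
  };
}
```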
GET /api/stats echoes the resolved identity, useful for verifying header propagation:
curl -H 'X-Paparats-User: alice' http://localhost:9876/api/stats | jq .identity
Token-savings estimators
Three levels, computed from raw events at query-time:
Naive baseline: what a model would have read if it pulled the whole file for each result.
Search-only: tokens actually returned by search_code.
Actually consumed: tokens that the client subsequently fetched via get_chunk. The most honest signal, since it discounts noisy results that were never used.
Run token_savings_report from any MCP client connected to /support/mcp.
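As a rough illustration of how the three levels relate, the sketch below sums them from per-search events. The SearchEvent shape and all field names are hypothetical; the actual analytics schema and the output of token_savings_report may look different.

```typescript
// Hypothetical per-search event; not the real analytics.db schema.
interface SearchEvent {
  fullFileTokens: number; // naive baseline: whole files behind each result
  returnedTokens: number; // search-only: tokens returned by search_code
  consumedTokens: number; // actually consumed: tokens later fetched via get_chunk
}

function tokenSavingsLevels(events: SearchEvent[]) {
  const sum = (field: (e: SearchEvent) => number): number =>
    events.reduce((acc, e) => acc + field(e), 0);

  const naive = sum((e) => e.fullFileTokens);
  const searchOnly = sum((e) => e.returnedTokens);
  const consumed = sum((e) => e.consumedTokens);

  return {
    naive,
    searchOnly,
    consumed,
    // Savings of each level relative to the naive baseline.
    savedBySearchOnly: naive - searchOnly,
    savedByConsumed: naive - consumed,
  };
}
```

A search whose results were never fetched contributes zero consumed tokens, which is how the third level discounts noisy results.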
Cross-project noise
When a client passes X-Paparats-Anchor-Project (or specifies a single project in the search call), the share of results from other projects in the same group is recorded. Use cross_project_share to see how noisy your group's index is for each user.
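The share itself is a simple ratio. A minimal sketch, assuming each search result is tagged with its project name (the function name and input shape are hypothetical):

```typescript
// Fraction of results that came from projects other than the anchor.
// Returns 0 for an empty result set (an assumption, to avoid dividing by zero).
function crossProjectShare(resultProjects: string[], anchorProject: string): number {
  if (resultProjects.length === 0) return 0;
  const foreign = resultProjects.filter((p) => p !== anchorProject).length;
  return foreign / resultProjects.length;
}
```

For example, four results of which two come from the anchor project give a share of 0.5.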
Indexer-pipeline visibility
failed_chunks aggregates AST parse failures, regex fallbacks, zero-chunk files, and binary skips. slowest_searches ranks individual searches by latency.
Configuration matrix
Env var | Default | Purpose |
|
| Prometheus surface (existing, unchanged) |
|
| Local SQLite analytics writes |
|
| Analytics DB file |
|
| Daily prune cutoff |
|
| Hour-of-day for prune (local time) |
|
| Header name for user attribution |
|
| If |
|
| If |
|
| Reformulation detection window |
|
| Sampling rate (errors are always kept) |
|
| Enable OTel SDK + OTLP exporter |
| unset | OTLP HTTP endpoint (e.g. |
| unset | OTLP auth headers ( |
|
| OTel resource attribute |
| unset | Extra resource attrs ( |
PII guidance
File paths and query text are stored locally by default. For shared deployments where paths could leak sensitive info, set PAPARATS_LOG_RESULT_FILES=false and/or PAPARATS_LOG_QUERY_TEXT=false.
OTel spans never carry full query text by default, only paparats.query.hash and length.
Architecture
paparats-mcp/
├── packages/
│   ├── server/  # MCP server (Docker image: ibaz/paparats-server)
│   │   ├── src/
│   │   │   ├── lib.ts  # Public library exports (for programmatic use)
│   │   │   ├── index.ts  # HTTP server bootstrap + graceful shutdown
│   │   │   ├── app.ts  # Express app + HTTP API routes
│   │   │   ├── indexer.ts  # Group-aware indexing, single-parse chunkFile()
│   │   │   ├── searcher.ts  # Search with query expansion, cache, metrics
│   │   │   ├── query-expansion.ts  # Abbreviation, case, plural expansion
│   │   │   ├── task-prefixes.ts  # Jina task prefix detection
│   │   │   ├── query-cache.ts  # In-memory LRU search result cache
│   │   │   ├── metrics.ts  # Prometheus metrics (opt-in)
│   │   │   ├── ast-chunker.ts  # AST-based code chunking (tree-sitter, primary strategy)
│   │   │   ├── chunker.ts  # Regex-based code chunking (fallback for unsupported languages)
│   │   │   ├── ast-symbol-extractor.ts  # AST-based symbol extraction (module-level only, 11 languages)
│   │   │   ├── ast-queries.ts  # Tree-sitter S-expression queries per language
│   │   │   ├── tree-sitter-parser.ts  # WASM tree-sitter manager
│   │   │   ├── symbol-graph.ts  # Cross-chunk symbol edges (calls/called_by/refs)
│   │   │   ├── embeddings.ts  # Ollama provider + SQLite cache
│   │   │   ├── config.ts  # .paparats.yml reader + validation
│   │   │   ├── metadata.ts  # Tag resolution + auto-detection
│   │   │   ├── metadata-db.ts  # SQLite store for git commits + tickets + symbol edges
│   │   │   ├── git-metadata.ts  # Git history extraction + chunk mapping
│   │   │   ├── ticket-extractor.ts  # Jira/GitHub/custom ticket parsing
│   │   │   ├── mcp-handler.ts  # MCP protocol, dual-mode (coding /mcp + support /support/mcp)
│   │   │   ├── watcher.ts  # File watcher (chokidar)
│   │   │   ├── arch/  # Architectural memory layer (components, decisions, lessons)
│   │   │   │   ├── types.ts  # ArchComponent, ArchDecision, ArchLesson, ArchWriteResult
│   │   │   │   ├── collection.ts  # Per-group Qdrant collection (`paparats_<group>_arch`) lifecycle
│   │   │   │   ├── text-embeddings.ts  # bge-m3 text embedder (1024d, mean-pooled, Ollama)
│   │   │   │   ├── store.ts  # CRUD + server-side similarity gate (cosine 0.85 / 0.70)
│   │   │   │   └── context.ts  # `arch_context` query: top-N across kinds with age stamps
│   │   │   └── types.ts  # Shared types
│   │   └── Dockerfile
│   ├── indexer/  # Automated repo indexer (Docker image: ibaz/paparats-indexer)
│   │   ├── src/
│   │   │   ├── index.ts  # Entry: Express mini-server + cron scheduler
│   │   │   ├── config-loader.ts  # projects.yml parser + per-repo overrides
│   │   │   ├── config-watcher.ts  # chokidar watcher for hot-reloading the project list
│   │   │   ├── repo-manager.ts  # parseReposEnv(), cloneOrPull() using simple-git
│   │   │   ├── scheduler.ts  # node-cron wrapper
│   │   │   └── types.ts  # IndexerConfig, RepoConfig, RepoOverrides, IndexerFileConfig
│   │   └── Dockerfile
│   ├── ollama/  # Custom Ollama with pre-baked model (Docker image: ibaz/paparats-ollama)
│   │   └── Dockerfile
│   ├── cli/  # CLI tool (npm package: @paparats/cli)
│   │   └── src/
│   │       ├── index.ts  # Commander entry
│   │       ├── docker-compose-generator.ts  # Programmatic YAML generation
│   │       ├── projects-yml.ts  # projects.yml + install.json read/write
│   │       └── commands/  # install, projects (add/remove/list), lifecycle, edit, etc.
│   └── shared/  # Shared utilities (npm package: @paparats/shared)
│       └── src/
│           ├── path-validation.ts  # Path validation
│           ├── gitignore.ts  # Gitignore parsing
│           ├── exclude-patterns.ts  # Glob exclude normalization
│           └── language-excludes.ts  # Language-specific exclude defaults
└── examples/
    └── paparats.yml.*  # Config examples per language
Stack
Qdrant: vector database (1 collection per group with paparats_ prefix for code, plus a separate paparats_<group>_arch collection per group for the architectural memory layer; cosine similarity, payload filtering)
Ollama: local embeddings via Jina Code Embeddings 1.5B for code (task-specific prefixes) and bge-m3 for the architectural memory layer (1024d, mean-pooled, multilingual)
SQLite: embedding cache (~/.paparats/cache/embeddings.db) + git metadata + symbol edges store (~/.paparats/metadata.db)
MCP: Model Context Protocol (SSE for Cursor, Streamable HTTP for Claude Code). Dual endpoints: /mcp (coding) and /support/mcp (support)
TypeScript monorepo with Yarn workspaces
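For readers unfamiliar with the distance metric mentioned above, this is what cosine similarity computes. Qdrant does this server-side; the sketch is only a reference implementation, and the function name and the zero-vector behaviour are assumptions.

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Returns a value in [-1, 1]; higher means the vectors point the same way.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Cosine is undefined for zero vectors; this sketch treats that as "no match".
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

Because the score ignores vector length, embeddings of different magnitudes can still compare as near-identical if they point in the same direction.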
Integration Examples
Support Chatbot
Use paparats as the knowledge backend for a product support bot. Connect the bot to the support endpoint (/support/mcp) for access to explain_feature, recent_changes, find_usages, and other support-oriented tools:
User: "How do I configure rate limiting?"
Bot workflow (via /support/mcp):
1. explain_feature("rate limiting", group="my-app")
   → returns code locations + recent changes + related modules
2. get_chunk_meta(<chunk_id>)
   → returns who last modified it, when, linked tickets
3. Bot synthesizes response in plain language with ticket references
CI/CD reindex on push
Indexing lives in the indexer container. To force a reindex of a project from CI, trigger the indexer's HTTP endpoint:
name: Reindex Paparats
on:
  push:
    branches: [main]
jobs:
  reindex:
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -X POST http://your-paparats-host:9877/trigger \
            -H 'Content-Type: application/json' \
            -d '{"repos": ["your-org/your-repo"]}'
Pass "force": true in the body to drop existing chunks first (destructive; use after schema/config changes). If the project isn't yet in projects.yml, add it once during your initial setup and the indexer's cron + hot-reload will keep it in sync going forward.
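If you would rather trigger the reindex from a script than from curl, the same request can be built in TypeScript. This is a sketch under assumptions: the helper name and return shape are hypothetical, and only the /trigger path, the repos field, and the force flag come from the text above.

```typescript
// Request body as described above; `force` is only sent when requested.
interface TriggerBody {
  repos: string[];
  force?: boolean;
}

// Builds the URL and fetch options for the indexer trigger endpoint.
function buildTriggerRequest(baseUrl: string, repos: string[], force = false) {
  const body: TriggerBody = force ? { repos, force: true } : { repos };
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/trigger`, // tolerate a trailing slash
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (Node 18+ ships a global fetch):
//   const { url, init } = buildTriggerRequest(
//     "http://your-paparats-host:9877",
//     ["your-org/your-repo"],
//   );
//   const res = await fetch(url, init);
//   if (!res.ok) throw new Error(`reindex trigger failed: ${res.status}`);
```

Keeping request construction separate from the network call makes the destructive force flag easy to check in a unit test before it ever reaches the indexer.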
Code-review assistant
Combine multiple tools to analyze the impact of a pull request:
1. explain_feature("the feature being changed")
   → understand what the code does and how it connects
2. find_usages({chunk_id: "<changed chunk>", direction: "both"})
   → blast radius via the symbol graph
3. search_changes("related area", since="2024-01-01")
   → recent changes that might conflict or overlap
Embedding Model Setup
Paparats supports three embedding backends. Pick one: the choice is sticky per Qdrant collection (changing it requires reindexing; the server refuses to mix providers in one collection and surfaces a clear error).
Provider | Model | Dims | Privacy | Speed (1k chunks) | Cost |
Ollama | jina-code-embeddings | 1536 | 100% local | ~10-20 min (CPU) | Free, ~1.7 GB on disk |
OpenAI | text-embedding-3-small | 1536 | Sent to OpenAI | ~30 s | ~$0.02 / 1 M tokens |
Voyage | voyage-code-3 | 1024 | Sent to Voyage | ~30 s | ~$0.18 / 1 M tokens |
Selection precedence: explicit EMBEDDING_PROVIDER → OPENAI_API_KEY present → VOYAGE_API_KEY present → Ollama. So setting just your API key in the environment is enough to switch.
# OpenAI: cheapest cloud option
export OPENAI_API_KEY=sk-...
docker compose up -d
# Voyage AI: best quality on code per recent benchmarks
export VOYAGE_API_KEY=pa-...
docker compose up -d
# Force a provider explicitly (overrides auto-detect)
export EMBEDDING_PROVIDER=voyage
Overrides: EMBEDDING_MODEL (defaults: text-embedding-3-small, voyage-code-3, jina-code-embeddings) and EMBEDDING_DIMENSIONS (1536 / 1024 / 1536). Voyage voyage-code-3 supports 256/512/1024/2048 via Matryoshka; set EMBEDDING_DIMENSIONS to opt into a non-default size.
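The selection precedence and per-provider defaults can be read as a small decision function. The sketch below is an interpretation of the rules stated in this section, not the server's implementation; the function names are hypothetical, and falling back to auto-detection on an unrecognised EMBEDDING_PROVIDER value is an assumption.

```typescript
type Provider = "ollama" | "openai" | "voyage";

// Defaults as listed in this section.
const PROVIDER_DEFAULTS: Record<Provider, { model: string; dimensions: number }> = {
  openai: { model: "text-embedding-3-small", dimensions: 1536 },
  voyage: { model: "voyage-code-3", dimensions: 1024 },
  ollama: { model: "jina-code-embeddings", dimensions: 1536 },
};

function resolveEmbeddingProvider(env: Record<string, string | undefined>): Provider {
  const known: Provider[] = ["ollama", "openai", "voyage"];
  const explicit = env.EMBEDDING_PROVIDER?.trim().toLowerCase();
  const match = known.find((p) => p === explicit);
  if (match) return match; // 1. explicit choice wins
  if (env.OPENAI_API_KEY) return "openai"; // 2. OpenAI key present
  if (env.VOYAGE_API_KEY) return "voyage"; // 3. Voyage key present
  return "ollama"; // 4. local default
}

function resolveEmbeddingConfig(env: Record<string, string | undefined>) {
  const provider = resolveEmbeddingProvider(env);
  const defaults = PROVIDER_DEFAULTS[provider];
  return {
    provider,
    model: env.EMBEDDING_MODEL || defaults.model,
    dimensions: env.EMBEDDING_DIMENSIONS
      ? Number(env.EMBEDDING_DIMENSIONS)
      : defaults.dimensions,
  };
}
```

Note that with both API keys set and no explicit provider, this ordering picks OpenAI, which matches the precedence listed above.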
Local (Ollama): defaults below
Default: jinaai/jina-code-embeddings-1.5b-GGUF, a code-optimized model with 1.5B params, 1536 dims, and 32k context. It is not in the Ollama registry, so we create a local alias.
Recommended: paparats install automates this:
Native mode (--ollama-mode native, default on macOS): downloads the GGUF (~1.65 GB) to ~/.paparats/models/, creates a Modelfile, and runs ollama create jina-code-embeddings
Docker mode (--ollama-mode docker, default on Linux): uses the ibaz/paparats-ollama image with the model pre-baked; zero setup
Manual setup:
# 1. Download GGUF
curl -L -o jina-code-embeddings-1.5b-Q8_0.gguf \
"https://huggingface.co/jinaai/jina-code-embeddings-1.5b-GGUF/resolve/main/jina-code-embeddings-1.5b-Q8_0.gguf"
# 2. Create Modelfile
cat > Modelfile <<'EOF'
FROM ./jina-code-embeddings-1.5b-Q8_0.gguf
PARAMETER num_ctx 8192
EOF
# 3. Register in Ollama
ollama create jina-code-embeddings -f Modelfile
# 4. Verify
ollama list | grep jina
Spec | Value |
Parameters | 1.5B |
Dimensions | 1536 |
Context | 32,768 tokens (recommended ≤ 8,192) |
Quantization | Q8_0 (~1.6 GB) |
Languages | 15+ programming languages |
Task-specific prefixes (nl2code, code2code, techqa) applied automatically.
Comparison with Alternatives
Feature Matrix
Deployment
Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
Open source | β MIT | β MIT | β MIT | β | β οΈ Partial | β | β οΈ 1 |
Fully local | β | β | β | β οΈ No 2 | β | β | β |
Search Quality
Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
Code embeddings | β Jina 3 | β οΈ 4 | β 5 | β οΈ Partial | β οΈ Partial | β οΈ Partial | β |
Vector database | β Qdrant | SQLite | ChromaDB | Propri. | Propri. | pgvector | Qdrant |
AST chunking | β | β | β | β οΈ Partial | β οΈ Partial | β οΈ Partial | β |
Query expansion | β 6 | β | β | β οΈ Partial | β οΈ Partial | β οΈ Partial | β |
Developer Experience
Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
Real-time watching | β Auto | β | β | β οΈ CI/CD | β | β οΈ Partial | β οΈ Partial |
Embedding cache | β SQLite | β οΈ Partial | β | β οΈ Partial | β οΈ Partial | β οΈ Partial | β |
Multi-project | β Groups | β | β | β | β | β | β |
One-cmd install | β | β οΈ Partial | β οΈ Partial | β | β | β | β |
AI Integration
Feature | Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop |
MCP native | β | β | β | β | β | β οΈ API | β |
Symbol graph | β | β | β | β | β οΈ Partial | β | β |
Token metrics | β | β | β | β οΈ Partial | β | β | β |
Git history | β | β | β | β | β οΈ Partial | β | β |
Ticket extraction | β | β | β | β | β | β | β |
Architectural memory 7 | β ADRs | β | β | β | β | β | β |
Pricing
Paparats | Vexify | SeaGOAT | Augment | Sourcegraph | Greptile | Bloop | |
Cost | β Free | β Free | β Free | β Paid | β Paid | β Paid | β οΈ Archived |
1. Bloop archived January 2, 2025
2. Augment Context Engine indexes locally but stores vectors in cloud
3. Jina Code Embeddings 1.5B (1536 dims) with task-specific prefixes (nl2code, code2code, techqa)
4. Vexify supports Ollama models but limited to specific embeddings (jina-embeddings-2-base-code, nomic-embed-text)
5. SeaGOAT locked to all-MiniLM-L6-v2 (384 dims, general-purpose)
6. Abbreviations, case variants, plurals, filler word removal
7. Agent-maintained components / decisions (ADRs) / lessons in a second Qdrant collection per group; server-side similarity gate deduplicates writes, supersedes links replace stale decisions, every card carries an "updated N ago" stamp on read
Token Savings Metrics
What we measure (and what we don't)
Paparats provides estimated token savings to help you understand the order of magnitude of context reduction. These are heuristics, not precise measurements.
Per-search response
{
  "metrics": {
    "tokensReturned": 150,
    "estimatedFullFileTokens": 5000,
    "tokensSaved": 4850,
    "savingsPercent": 97
  }
}
Field | Calculation | Reality Check |
|
| Based on actual returned content; /4 is rough approximation |
|
| Heuristic: assumes 50 chars/line, never loads actual files |
|
| Derived: difference between two estimates |
|
| Relative: percentage of heuristic estimate |
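A minimal sketch of how such an estimate could be computed, using the two heuristics named in the table (roughly 4 characters per token, 50 characters per line). The function name, inputs, and rounding choices are assumptions; the server's exact formula may differ.

```typescript
const CHARS_PER_TOKEN = 4; // rough approximation noted in the table
const CHARS_PER_LINE = 50; // heuristic line length noted in the table

// Hypothetical helper: returnedChars is the size of the returned chunks,
// fullFileLines is the total line count of the files behind them.
function estimateSavings(returnedChars: number, fullFileLines: number) {
  const tokensReturned = Math.ceil(returnedChars / CHARS_PER_TOKEN);
  const estimatedFullFileTokens = Math.ceil(
    (fullFileLines * CHARS_PER_LINE) / CHARS_PER_TOKEN,
  );
  // Difference between two estimates, clamped so it never goes negative.
  const tokensSaved = Math.max(0, estimatedFullFileTokens - tokensReturned);
  const savingsPercent =
    estimatedFullFileTokens > 0
      ? Math.round((tokensSaved / estimatedFullFileTokens) * 100)
      : 0;
  return { tokensReturned, estimatedFullFileTokens, tokensSaved, savingsPercent };
}
```

Under these assumptions, 600 returned characters from files totalling 400 lines reproduce the sample response: 150 tokens returned, 5000 estimated, 4850 saved, 97 percent.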
Cumulative stats
curl -s http://localhost:9876/api/stats | jq '.usage'
{
  "searchCount": 47,
  "totalTokensSaved": 152340,
  "avgTokensSavedPerSearch": 3241
}
These are sums of estimates, not measured token counts from a real tokenizer.
License
MIT
Releasing (maintainers)
Releases are driven by Changesets. Versioning + CHANGELOG generation happen in CI; publishing to npm and tagging happen locally from a maintainer machine that's authenticated with npm. There are no npm credentials in CI.
Authoring a changeset (per PR)
yarn changeset
# Pick affected packages, bump type (patch/minor/major), and write the user-facing summary.
git add .changeset/
git commit -m "chore: changeset"
All four packages (@paparats/shared, @paparats/cli, @paparats/server, @paparats/indexer) are kept on a fixed version: pick any one and the rest are bumped to match.
How a release happens
1. CI opens a release PR (automatic). The Release workflow runs on every push to main. If pending .changeset/*.md files exist, it opens (or updates) a chore: release PR with: version bumps in every package.json, regenerated per-package CHANGELOG.md files, server.json synced via scripts/sync-server-json.js, and the consumed .changeset/*.md files deleted.
2. Maintainer merges the release PR. No further CI publish step runs.
3. Maintainer publishes locally. From a clean checkout of main after the merge:
git checkout main && git pull
yarn release:local # or `--dry-run` to preview
yarn release:local runs scripts/release-local.sh, which:
refuses to run unless you're on main, the tree is clean, and you're in sync with origin/main;
refuses if any pending .changeset/*.md are present (means the release PR wasn't merged);
reads the new version from packages/cli/package.json;
builds, runs yarn changeset publish (skips already-published versions), then tags vX.Y.Z and pushes the tag.
4. Downstream workflows fire on the tag. Pushing vX.Y.Z triggers docker-publish.yml and publish-mcp.yml automatically.
Required credentials
Where | What | Purpose |
CI |
| Open/update the |
Local |
|
|
No npm token lives in GitHub secrets; publishing is intentionally a manual, authenticated step.
Manual / fallback flows
./scripts/release-docker.sh --push still builds and pushes the Docker images by hand if needed (e.g. between official releases). It reads the version from package.json.
Docker images
Image | Source | Size |
|
| ~200 MB |
|
| ~200 MB |
|
| ~3 GB (includes model) |
Contributing
Contributions welcome! Areas of interest:
Additional language support (PHP, Elixir, Scala, Kotlin, Swift)
Alternative embedding providers (OpenAI, Cohere, local GGUF via llama.cpp)
Performance optimizations (chunking strategies, cache eviction)
Agent use cases (support bots, QA automation, code analytics)
Open an issue or pull request to get started.
Links
Jina Code Embeddings: embedding model
Qdrant: vector database
Ollama: local LLM runtime
MCP: Model Context Protocol
Star the repo if Paparats helps you code faster!