Jama MCP Server

by yyy188

A production-grade Model Context Protocol (MCP) server for the Jama requirements management system, combining high-precision RAG retrieval with native REST API filtering. An LLM client (Claude Desktop, etc.) can autonomously choose between semantic search and structured metadata queries.

Architecture

            ┌──────────────────────── MCP (stdio) ────────────────────────┐
            │                                                              │
  LLM ──────┤  init_jama_project      get_sync_progress                   │
  Client    │  search_jama_semantics  query_jama_native_metadata           │
            │                                                              │
            │   server.py  (FastMCP + APScheduler + thread pool)          │
            │      │                                                       │
            │      ├── rag_pipeline.py  (Multi-Query + Hybrid + RRF +     │
            │      │                       cross-encoder reranker)         │
            │      ├── jama_client.py   (OAuth2 + pagination + HTML clean)│
            │      └── db_setup.py      (SQLite + FTS5 + sqlite-vec)      │
            │                                                              │
            └──────────────────────────────────────────────────────────────┘
                         │                        │
                  Jama REST API            Local CPU embeddings (default:
                  (read-only GET)          bge-small-en-v1.5) + Azure OpenAI
                                           (optional, text-embedding-3-small)

Retrieval pipeline (`search_jama_semantics`)

Multi-Query — the query is expanded into 3-5 sub-queries. The MCP LLM client performs the expansion and passes the variants via the sub_queries parameter; when none are supplied, the server falls back to deterministic lexical variants (stopword-stripped + truncated) so RRF fusion still benefits from multiple recall angles. No server-side chat LLM is configured or called.
Hybrid recall — for each sub-query: vector recall (sqlite-vec, cosine) + keyword recall (FTS5, BM25), each capped at candidate_k.
RRF fusion — Reciprocal Rank Fusion merges all ranked lists into one candidate pool of ≤ candidate_k unique chunks.
Rerank — a local cross-encoder (cross-encoder/ms-marco-MiniLM-L-6-v2, ~80MB, CPU, ONNX via fastembed/onnxruntime) scores (query, chunk) pairs via a sequence-classification head; top top_k returned. It runs on the SAME onnxruntime as the bge embedding model — no torch/transformers dependency, so the Windows c10.dll/WinError 1114 load failure is eliminated. If the model is unavailable, the pipeline gracefully falls back to RRF scores. Model weights are fetched from the HuggingFace China mirror (HF_ENDPOINT=https://hf-mirror.com) on first use, then served from cache.
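The RRF fusion step can be sketched in a few lines. This is an illustrative sketch, not the code in rag_pipeline.py: the function name rrf_fuse, the chunk ids, and the smoothing constant k=60 (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, candidate_k, k=60):
    """Merge ranked lists of chunk ids with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per chunk; k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the output is deterministic.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return fused[:candidate_k]


# Hypothetical recall results for one sub-query.
vector_hits = ["c3", "c1", "c7"]
keyword_hits = ["c1", "c9", "c3"]
pool = rrf_fuse([vector_hits, keyword_hits], candidate_k=3)
print([chunk_id for chunk_id, _ in pool])  # ['c1', 'c3', 'c9']
```

Chunks that appear in both lists (c1, c3) rise above chunks found by only one recall path, which is why running several sub-queries improves the candidate pool even before reranking.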

Reliability & crash recovery

The server is designed to survive crashes without losing data and to come back up consistent on restart:

Atomic per-item indexing — each item's chunks (text + FTS5 + sqlite-vec) are replaced in a single write_txn (BEGIN IMMEDIATE), so a crash mid-sync never leaves a half-written item. done/progress only advance after the commit, so the DB is consistent up to the last flushed batch.
Idempotent re-sync — upserts overwrite (never duplicate), so re-processing already-indexed items on resume is harmless.
Startup recovery — _resume_interrupted_syncs re-queues any project left INITIALIZING by a prior crash, so the server self-heals without manual action.
Concurrency guard — init_jama_project refuses a duplicate concurrent sync for a project that already has a job in flight, returning the existing job_id instead of spawning a racing second worker.
Bounded HTTP retries — 429 rate-limit handling is a bounded loop (not recursion), so a persistent rate-limit fails cleanly instead of overflowing the stack; Retry-After parsing tolerates non-numeric values; a 401 mid-sync refreshes the token and retries the page; malformed JSON bodies are retried.
WAL mode + write lock — SQLite runs in WAL with a process-wide write lock, so the scheduler's writer and MCP reader threads coexist without SQLITE_BUSY failures.
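The atomic per-item replacement and the write lock described above might look roughly like the following. It is a minimal sketch using only the standard sqlite3 module: the chunks table layout, WRITE_LOCK, open_db and replace_item_chunks are hypothetical names, and the real schema also maintains FTS5 and sqlite-vec tables.

```python
import sqlite3
import threading

WRITE_LOCK = threading.Lock()  # hypothetical process-wide write lock


def open_db(path=":memory:"):
    # isolation_level=None turns off implicit transactions, so BEGIN IMMEDIATE is explicit.
    conn = sqlite3.connect(path, isolation_level=None, check_same_thread=False)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on in-memory databases
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks (item_id INTEGER, seq INTEGER, text TEXT)"
    )
    return conn


def replace_item_chunks(conn, item_id, chunks):
    """Replace all chunks of one item inside a single write transaction."""
    with WRITE_LOCK:
        conn.execute("BEGIN IMMEDIATE")
        try:
            conn.execute("DELETE FROM chunks WHERE item_id = ?", (item_id,))
            conn.executemany(
                "INSERT INTO chunks (item_id, seq, text) VALUES (?, ?, ?)",
                [(item_id, seq, text) for seq, text in enumerate(chunks)],
            )
            conn.execute("COMMIT")
        except Exception:
            # Any failure restores the previous chunks for this item.
            conn.execute("ROLLBACK")
            raise
```

Because the delete and the inserts share one transaction, re-running the function for the same item overwrites rather than duplicates, which is what makes a resumed sync harmless.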

Chunking (LlamaIndex)

Jama rich-text (Description / Test Case Steps) is cleaned to plain text with BeautifulSoup before being wrapped in LlamaIndex Document objects. The documents are split into TextNode chunks by LlamaIndex's SentenceSplitter (recursive, sentence-aware; chunk_size=512, chunk_overlap=80 to preserve context for the ~30% long-form items). The item name is prepended to each chunk so the title is always retrievable.
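To make the clean, split, and prepend steps concrete, here is a deliberately simplified stand-in that uses only the standard library. It is not the BeautifulSoup + LlamaIndex SentenceSplitter code the project uses: it measures chunk size in characters rather than tokens, does not split a single over-long sentence, and all function names are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def chunk_sentences(name, text, chunk_size=512, chunk_overlap=80):
    """Greedily pack sentences into chunks, carrying a character tail as overlap."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > chunk_size:
            chunks.append(current)
            # Carry the tail of the finished chunk into the next one.
            current = current[-chunk_overlap:] if chunk_overlap else ""
        current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    # Prepend the item name so the title is retrievable from every chunk.
    return [f"{name}\n{chunk}" for chunk in chunks]
```

For example, chunk_sentences("Volume sync", html_to_text("<p>First sentence.</p><p>Second sentence.</p>"), chunk_size=20, chunk_overlap=5) yields two chunks, both starting with the item name.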

Native API (`query_jama_native_metadata`)

Bypasses the vector store for exact-match questions (specific document key, status, item type). Uses /abstractitems which honours itemType, contains and documentKey server-side; status is refined client-side. Handles pagination internally, returns up to 20 core metadata records.
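The split between server-side filters and client-side status refinement can be sketched as follows. The fetch_page callable, the startAt/maxResults paging parameters, and the data / meta.pageInfo.totalResults / fields.status response layout are assumptions made for the example; check them against your Jama REST API version before relying on them.

```python
def query_native(fetch_page, item_type=None, contains=None, document_key=None,
                 status=None, limit=20, page_size=50):
    """Page through /abstractitems-style results and refine by status locally.

    fetch_page is a hypothetical callable: it takes a params dict and returns
    the decoded JSON body of one page.
    """
    server_filters = {"itemType": item_type, "contains": contains,
                      "documentKey": document_key}
    params = {k: v for k, v in server_filters.items() if v is not None}
    results, start_at = [], 0
    while len(results) < limit:
        page = fetch_page({**params, "startAt": start_at, "maxResults": page_size})
        items = page.get("data", [])
        if not items:
            break
        for item in items:
            # Status is not filtered by the server in this sketch, so check it here.
            if status is None or item.get("fields", {}).get("status") == status:
                results.append(item)
                if len(results) == limit:
                    break
        start_at += len(items)
        total = page.get("meta", {}).get("pageInfo", {}).get("totalResults")
        if total is not None and start_at >= total:
            break
    return results
```

Injecting fetch_page keeps the paging logic testable without a live Jama instance.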

Incremental sync

On startup, APScheduler registers a job (every 2h by default) that reads the projects table for projects in READY status (INITIALIZING is deliberately excluded — those are handled by crash recovery) along with their last_sync_time, then walks Jama items whose modifiedDate > last_sync_time, re-cleans/re-chunks them and updates the FTS5 + sqlite-vec indexes. New items are added; modified items have their old chunks replaced atomically. A project that already has an in-flight job is skipped so a scheduled sync never races a user-initiated one.
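The selection rule (modifiedDate > last_sync_time) is simple enough to show directly. In this sketch the timestamp layout in JAMA_TS and the function name modified_since are assumptions; the real job fetches items from the API rather than filtering an in-memory list.

```python
from datetime import datetime, timezone

# Assumed layout of Jama timestamps, e.g. "2024-05-01T10:00:00.000+0000".
JAMA_TS = "%Y-%m-%dT%H:%M:%S.%f%z"


def modified_since(items, last_sync_time):
    """Return the items whose modifiedDate is strictly after last_sync_time."""
    return [
        item for item in items
        if datetime.strptime(item["modifiedDate"], JAMA_TS) > last_sync_time
    ]


last_sync = datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)
items = [
    {"id": 1, "modifiedDate": "2024-05-01T08:59:59.000+0000"},
    {"id": 2, "modifiedDate": "2024-05-01T10:00:00.000+0000"},
]
print([item["id"] for item in modified_since(items, last_sync)])  # [2]
```

Comparing timezone-aware datetimes avoids off-by-hours errors when the server and the Jama instance sit in different time zones.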

Setup

Get the code

# Direct (if GitHub is reachable)
git clone https://github.com/yyy188/jama-mcp-server.git jama
cd jama

# China mirror (if github.com is slow/blocked)
git clone https://gh-proxy.com/https://github.com/yyy188/jama-mcp-server.git jama
cd jama

Recommended: `uv` (deterministic, reproducible)

uv is a single-binary Python package manager (~20 MB). Its lockfile (uv.lock) pins the entire dependency tree — every package and its transitive deps — so uv sync on a new machine produces the exact same environment, with no version-resolution surprises.

# 1. Install uv (one-time, ~20 MB single binary)
#    Windows:  winget install astral-sh.uv
#    macOS/Linux:  curl -LsSf https://astral.sh/uv/install.sh | sh
#    (or: pip install uv)

# 2. Sync dependencies from the lockfile (creates .venv, installs 116 packages)
uv sync

# 3. Configure
cp .env.example .env        # then edit: fill in JAMA_URL / JAMA_CLIENT_ID / JAMA_CLIENT_SECRET
#    Or run the interactive wizard (also lets you choose the DB storage directory):
#    uv run python setup_wizard.py

# 4. Pre-download models (~150 MB ONNX, one-time)
uv run python bootstrap.py

# 5. Run
uv run python server.py     # stdio (default) — local MCP client spawns it

Alternative: `pip`

pip install -r requirements.txt
cp .env.example .env        # fill in Jama credentials
python bootstrap.py
python server.py

Transports: stdio vs HTTP

The server supports three transports, selected by JAMA_MCP_TRANSPORT:

Transport	Use case	Client connects via
`stdio` (default)	Local MCP client (Claude Desktop) spawns server as subprocess	stdin/stdout
`streamable-http`	Remote client / Docker / shared server	`http://host:8000/mcp`
`sse`	Older MCP clients that only support SSE	`http://host:8000/sse`

For HTTP/SSE mode, set JAMA_MCP_HOST (0.0.0.0 for remote access) and JAMA_MCP_PORT (default 8000):

# streamable-http (MCP new standard), listening on all interfaces
JAMA_MCP_TRANSPORT=streamable-http JAMA_MCP_HOST=0.0.0.0 uv run python server.py
# SSE (older clients)
JAMA_MCP_TRANSPORT=sse JAMA_MCP_HOST=0.0.0.0 uv run python server.py
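As an illustration of how these three variables combine, the sketch below resolves them into a transport and, for the HTTP modes, the URL a client would use. The function name resolve_transport and the 127.0.0.1 fallback host are assumptions; the endpoint paths come from the table above.

```python
import os

VALID_TRANSPORTS = {"stdio", "streamable-http", "sse"}


def resolve_transport(env=None):
    """Turn JAMA_MCP_* environment variables into transport settings."""
    env = os.environ if env is None else env
    transport = env.get("JAMA_MCP_TRANSPORT", "stdio")
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported JAMA_MCP_TRANSPORT: {transport!r}")
    if transport == "stdio":
        return {"transport": "stdio"}
    host = env.get("JAMA_MCP_HOST", "127.0.0.1")  # assumed default host
    port = int(env.get("JAMA_MCP_PORT", "8000"))
    path = "/mcp" if transport == "streamable-http" else "/sse"
    return {"transport": transport, "host": host, "port": port,
            "url": f"http://{host}:{port}{path}"}
```

Rejecting unknown values early gives a clearer error than letting a typo silently fall back to stdio.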

MCP client config (stdio, Claude Desktop example)

{
  "mcpServers": {
    "jama-mcp": {
      "command": "uv",
      "args": ["run", "--directory", "/abs/path/to/jama", "python", "server.py"]
    }
  }
}

MCP client config (streamable-http)

Point your MCP client at http://localhost:8000/mcp (or the remote host:port).

To (re)download just the models later without re-running the wizard:

uv run python bootstrap.py     # or: python bootstrap.py

The models live in user/huggingface/ (project-local, ~150 MB: a ~130 MB ONNX embedding + ~80 MB ONNX cross-encoder reranker). Both run on onnxruntime via fastembed — CPU-only, no torch/transformers. The model files are plain data — portable across machines, so you can copy that folder from another machine to skip the download entirely.

Why pinned onnxruntime / Python 3.12

onnxruntime is pinned to 1.20.1 and Python 3.13+ is not supported (requires-python = ">=3.10,<3.13"). On Windows, onnxruntime ≥1.21 (which Python 3.13 forces, because fastembed requires >1.21 there) depends on the new VC++ Runtime (vcruntime140_1.dll) absent on many machines, causing WinError 1114 DLL load failures. 1.20.1 loads cleanly on Python 3.10–3.12 and satisfies fastembed's constraint. uv sync automatically picks Python 3.12 (the verified stable target) from the lockfile. If you upgrade onnxruntime, re-test on a clean Windows machine without the latest VC++ Redistributable.

Windows: VC++ Runtime (vcruntime140.dll)

onnxruntime is a C++ binary that needs vcruntime140.dll — part of the Microsoft VC++ Redistributable. Most Windows machines already have it (anything with Chrome / Java / VS Code installed does), but a clean Windows install may not.

The server auto-detects this: preflight probes for the DLL and, if missing, reports a clear blocking error with the fix. setup_wizard.py offers to auto-install it (downloads the 24 MB installer from https://aka.ms/vs/16/release/vc_redist.x64.exe — reachable from mainland China at ~420 KB/s — and runs it silently). You can also install it manually:

# From the project directory (after uv sync):
uv run python -c "from preflight import install_vcruntime; install_vcruntime()"
# Or download + run the installer yourself:
#   https://aka.ms/vs/16/release/vc_redist.x64.exe

This is a system-level install (writes vcruntime140.dll to C:\Windows\System32, requires admin/UAC) — it's the one thing this project installs outside its own folder, because the DLL must be in the system path for onnxruntime to find it. Linux/macOS don't need it (onnxruntime bundles the system libs in its wheels there).

After the server starts, the LLM client should call bootstrap_models (and poll get_bootstrap_progress every ~2 min) to pre-download the embedding + reranker models BEFORE the first init_jama_project — see Model bootstrap. On startup the server logs a hint if the models aren't cached yet.

First-run configuration guard

Every MCP tool runs an offline pre-flight check before doing any work: Python dependencies, required env vars (JAMA_URL / JAMA_CLIENT_ID / JAMA_CLIENT_SECRET — plus EMBEDDING_BASE_URL / EMBEDDING_API_KEY only when EMBEDDING_PROVIDER=azure; the default local CPU provider needs no embedding credentials) and the SQLite store. If anything is missing the tool returns a clear error dict with a hint instead of failing midway through a Jama API call. Configure via the wizard, or call the configure_jama / validate_setup tools at runtime.
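The environment-variable part of that check could be sketched like this. The function name check_env, the "local" provider value, and the keys of the returned dict are assumptions for illustration; only the variable names are taken from the description above.

```python
def check_env(env):
    """Return {"ok": True} or an error dict listing missing settings."""
    required = ["JAMA_URL", "JAMA_CLIENT_ID", "JAMA_CLIENT_SECRET"]
    # Embedding credentials are only needed for the Azure provider.
    if env.get("EMBEDDING_PROVIDER", "local") == "azure":
        required += ["EMBEDDING_BASE_URL", "EMBEDDING_API_KEY"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        return {
            "ok": False,
            "error": "missing configuration",
            "missing": missing,
            "hint": "Run setup_wizard.py or call configure_jama to set these values.",
        }
    return {"ok": True}


print(check_env({"JAMA_URL": "https://example.invalid"})["missing"])
# ['JAMA_CLIENT_ID', 'JAMA_CLIENT_SECRET']
```

Returning a dict instead of raising lets the LLM client read the hint and recover without a failed tool call.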

MCP client config (Claude Desktop example)

{
  "mcpServers": {
    "jama-mcp": {
      "command": "python",
      "args": ["/absolute/path/to/jama-mcp-server/server.py"],
      "env": { "JAMA_MCP_DB_PATH": "/absolute/path/to/jama-mcp-server/jama_mcp.db" }
    }
  }
}

The DB directory is selectable at install time via setup_wizard; the filename is fixed.

Usage flow (for the LLM)

bootstrap_models() → pre-download embedding + reranker models (first run only). Returns job_id immediately; poll get_bootstrap_progress(job_id) every ~2 min until DONE. Skip if models are already cached (re-running is a fast no-op).
init_jama_project("20571") → returns job_id immediately (non-blocking).
get_sync_progress(job_id) → poll until status == "DONE", roughly every 2 minutes (syncs index many items and take minutes — don't busy-poll).
search_jama_semantics("20571", "how does volume sync work", top_k=5) → RAG.
query_jama_native_metadata("20314", document_key="SA-TC-7") → exact match.

To re-index a project that is already initialized, use reinit_jama_project("20571") (full re-sync) and poll the same way. Scheduled incremental syncs run automatically (~every 2h); check any project's in-flight job plus its last init/reinit/sync run at any time with get_sync_status("20571").
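The start-then-poll pattern in the flow above can be sketched as a small helper. The get_progress callable stands in for a get_sync_progress or get_bootstrap_progress tool call; the helper name wait_for_job, the timeout, and the injectable sleep are assumptions added to make the example testable.

```python
import time


def wait_for_job(get_progress, job_id, interval_s=120, timeout_s=3600, sleep=time.sleep):
    """Poll a job about every two minutes until it reaches DONE or ERROR."""
    waited = 0
    while True:
        progress = get_progress(job_id)
        if progress.get("status") in ("DONE", "ERROR"):
            return progress
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} still running after {waited}s")
        sleep(interval_s)
        waited += interval_s


# Simulated progress samples for a hypothetical job id.
samples = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "DONE"}])
result = wait_for_job(lambda job_id: next(samples), "job-1", sleep=lambda s: None)
print(result["status"])  # DONE
```

A fixed two-minute interval matches the guidance above; syncs take minutes, so tighter polling adds load without returning results sooner.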

Model bootstrap

The embedding model (~130MB ONNX, bge-small-en-v1.5) and the cross-encoder reranker (~80MB) are not bundled — they download on first use. To keep the first sync from stalling on a model download, call bootstrap_models right after the server is configured. It downloads BOTH models asynchronously (a kind="bootstrap" job in sync_jobs, run on the same thread pool as syncs) and returns a job_id immediately.

bootstrap_models() — start the async pre-download (no-op per model if already cached). Reentrancy-guarded: a second call while one is RUNNING returns the existing job_id.
get_bootstrap_progress(job_id) — poll every ~2 min. Progress is phase-based, not live bytes: the reranker downloads via snapshot_download and the embedding via fastembed, neither of which gives a per-chunk byte callback, so message reports phase transitions (e.g. "Downloading reranker model (...)" → "Reranker model ready") rather than byte counts. status → DONE (both cached) or ERROR.

On startup, if either model isn't cached, the server logs a hint to call bootstrap_models. The sync-time ensure_downloaded calls remain as a fallback so a skipped bootstrap still works (the first sync downloads the models inline).
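The reentrancy guard (a second call returns the existing job_id) is the same idea as the concurrency guard on init_jama_project. A minimal in-memory sketch follows; the JobRegistry class and its method names are hypothetical, and the real server records jobs in the sync_jobs table rather than a dict.

```python
import threading
import uuid


class JobRegistry:
    """Tracks at most one in-flight job per key (e.g. "bootstrap" or a project id)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = {}  # key -> job_id

    def start(self, key):
        """Return (job_id, created); created is False when a job is already running."""
        with self._lock:
            existing = self._running.get(key)
            if existing is not None:
                return existing, False
            job_id = uuid.uuid4().hex
            self._running[key] = job_id
            return job_id, True

    def finish(self, key):
        with self._lock:
            self._running.pop(key, None)


registry = JobRegistry()
first, created_first = registry.start("bootstrap")
second, created_second = registry.start("bootstrap")
print(first == second, created_first, created_second)  # True True False
```

Only the caller that receives created=True should submit work to the thread pool; everyone else just polls the returned job_id.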

Monitoring

get_sync_status(project_id) is the one-call monitor for a project's sync operations. All three operations — init_jama_project, reinit_jama_project and the scheduled incremental sync — run asynchronously as background jobs (recorded in the sync_jobs table with kind = init / reinit / sync), so each is pollable. The tool returns:

active_job — the in-flight job for this project (or null if idle);
recent.{init,reinit,sync} — the most recent job of each kind, terminal or running, so you can see the last result even when nothing is running now;
project_status / last_sync_time / item_count / chunk_count — current project state;
process — lightweight live metrics (RSS, threads, DB size, chunk count) for the server process; null if psutil is unavailable.

After starting an init or reinit, poll get_sync_progress(job_id) (or get_sync_status(project_id)) roughly every 2 minutes, reporting each sample to the user, until the job reaches DONE/ERROR. On startup, any job left RUNNING by a prior crash is reconciled to ERROR (interrupted by restart) so the monitor never shows a phantom in-flight job.
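The startup reconciliation amounts to a single UPDATE. In this sketch the sync_jobs columns (job_id, status, message) and the function name reconcile_interrupted_jobs are assumptions; the actual table has more fields.

```python
import sqlite3


def reconcile_interrupted_jobs(conn):
    """Mark jobs left RUNNING by a previous process as ERROR; return how many."""
    cursor = conn.execute(
        "UPDATE sync_jobs SET status = 'ERROR', message = 'interrupted by restart' "
        "WHERE status = 'RUNNING'"
    )
    conn.commit()
    return cursor.rowcount
```

Running this before the scheduler starts means that any job reported as RUNNING afterwards belongs to the current process.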

Resilience

Jama API: OAuth token auto-refresh on expiry + 401 retry; urllib3 Retry with exponential backoff on 429/5xx; explicit Retry-After handling; SSL connection-reset tolerated (transient on this network).
Embeddings: same retry/backoff session on the embedding endpoint.
SQLite concurrency: WAL mode + busy timeout + a process-level write lock so the APScheduler writer and MCP reader threads coexist without SQLITE_BUSY errors; chunk replacement is atomic per item.
Reranker: lazy-loaded singleton; failure degrades to RRF-only scoring instead of crashing the search.
Read-only: JamaClient only issues GET requests — it cannot create, modify or delete data on the Jama instance.
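The bounded 429 handling with tolerant Retry-After parsing can be sketched without any HTTP library. Here send is a hypothetical callable returning (status, headers, body); the function names, the default delay, the cap, and the attempt limit are assumptions, and the project itself layers this on urllib3 Retry.

```python
import math
import time


def parse_retry_after(value, default=1.0, cap=60.0):
    """Parse a Retry-After header in seconds; fall back on anything non-numeric."""
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        return default  # missing header or an HTTP-date value
    if math.isnan(seconds):
        return default
    return min(max(seconds, 0.0), cap)


def get_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Retry on 429 in a bounded loop instead of recursing."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            sleep(parse_retry_after(headers.get("Retry-After")))
    raise RuntimeError(f"still rate-limited after {max_attempts} attempts")
```

With a loop, a persistent rate limit ends in one clear exception after max_attempts tries, whereas a recursive retry would keep growing the call stack.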

Files

File	Purpose
`requirements.txt`	deps + Aliyun mirror config
`config.py`	env-driven settings (dataclasses) + validation/persistence/reload
`db_setup.py`	SQLite schema, FTS5 + sqlite-vec loading, CRUD
`jama_client.py`	OAuth, paginated fetch, HTML cleaning, native query, browse API
`rag_pipeline.py`	chunking, embeddings, Multi-Query, hybrid recall, RRF, rerank
`server.py`	MCP tools, async jobs, APScheduler incremental sync, pre-flight guards
`preflight.py`	offline dependency + config + storage validation
`net_guard.py`	pre-download bandwidth speed test (`NetworkTooSlowError`)
`bootstrap.py`	foreground model pre-download CLI (`python bootstrap.py`)
`setup_wizard.py`	interactive configuration wizard (`python setup_wizard.py`)
`selftest.py`	end-to-end self-test suite (`python selftest.py`)
`.env.example`	template for environment configuration

Tools

Configuration & validation

validate_setup(live=False) — offline pre-flight (+ optional live Jama/embedding probe).
configure_jama(values) — apply config at runtime, persist to .env, reload.

Jama browse (read-only, gated by pre-flight)

list_jama_projects() — all visible projects.
find_jama_project_by_name(name, exact?) — find projects by name → get id + info.
get_jama_item(item_id) — full single item (cleaned text).
get_jama_item_children(item_id) — decomposition children.
get_jama_item_relationships(item_id) / list_jama_project_relationships(project_id, item_id?) — relationships (cursor-paginated /relationships).
get_jama_item_comments(item_id) — item comments (cleaned body).
get_jama_item_attachments(item_id) — attachment metadata (no binary).
list_jama_releases(project_id) — project releases/versions.
list_jama_test_runs(project_id?, test_cycle_id?) — test runs.
list_jama_item_types() — tenant item types (id → name).
find_jama_item_type_by_name(name, exact?) — find item types by display name → get the id needed by item_type filters.
query_jama_endpoint(path, params?, all_pages?) — generic read-only GET escape hatch.

RAG / retrieval / sync monitoring

bootstrap_models() — async pre-download of embedding + reranker models (returns job_id).
get_bootstrap_progress(job_id) — poll a bootstrap job (every ~2 min) until DONE/ERROR.
init_jama_project(project_id) — async background init (returns job_id).
reinit_jama_project(project_id) — async full re-sync of an already-initialized project.
get_sync_progress(job_id) — poll one init/reinit/sync job's progress.
get_sync_status(project_id) — project monitor: in-flight job + last init/reinit/sync run + process metrics.
search_jama_semantics(project_id, query, ...) — Multi-Query + hybrid + RRF + cross-encoder rerank.
query_jama_native_metadata(project_id, ...) — exact-match metadata via /abstractitems.

Verified

All components self-tested against the live Jama instance and the local CPU embedding backend: OAuth + paginated fetch, HTML→text cleaning, Test Case step rendering, item-type mapping, DB schema (FTS5 + vec0), full RAG search, async init with progress polling, incremental sync (0 new items), concurrent download + batched embed, crash recovery (INITIALIZING → auto-resynced READY), native metadata filters (item_type / status / keyword / document_key), APScheduler startup, MCP stdio handshake, and error paths (bad project id, unknown job, nonexistent project, missing args).

The cross-encoder reranker (ms-marco-MiniLM-L-6-v2, ONNX port via fastembed) was downloaded from the HuggingFace China mirror (hf-mirror.com) and loaded on onnxruntime (no torch); verified it produces non-zero relevance scores with correct ordering (a related document scores significantly higher than an unrelated one) and that the end-to-end RAG search returns strategy=rerank results. Scores are the model's raw logits (may be negative) — only the relative order is meaningful for re-ranking. LlamaIndex is the primary RAG framework: SentenceSplitter + Document/TextNode for chunking. Multi-Query expansion is performed by the MCP LLM client and passed to the pipeline via search(sub_queries=...); when omitted, deterministic lexical variants are used.
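For the fallback path, "deterministic lexical variants" might be produced along these lines. This is a guess at the general shape (stopword-stripped and truncated forms of the query), not the project's implementation: the stopword list, max_terms, and the function name lexical_variants are all assumptions.

```python
import re

# Tiny illustrative stopword list; a real one would be larger.
STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "how", "what", "of", "to", "in"}


def lexical_variants(query, max_terms=4):
    """Return the query plus stopword-stripped and truncated variants, without duplicates."""
    tokens = re.findall(r"\w+", query.lower())
    content = [t for t in tokens if t not in STOPWORDS]
    variants = [query]
    for candidate in (" ".join(content), " ".join(content[:max_terms])):
        if candidate and candidate not in variants:
            variants.append(candidate)
    return variants


print(lexical_variants("how does volume sync work", max_terms=2))
# ['how does volume sync work', 'volume sync work', 'volume sync']
```

Because no randomness or model call is involved, the same query always yields the same variants, so search results stay reproducible when the client supplies no sub_queries.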
