tripitaka-mcp

Survey Corpus (exhaustive)

survey_corpus

Read-onlyIdempotent

Survey the whole Pāli Tipiṭaka for a term with exact counts and per-pitaka breakdown. Guarantees complete coverage and returns distinct matched forms for auditing.

Instructions

Exhaustively survey the WHOLE Tipiṭaka for a term — guaranteed complete.

Use this (not search_by_keyword) when the question is about coverage or counting rather than "show me the best passages":

"How many times does Kusinārā appear in the canon?"
"Every place ānāpānassati is mentioned — don't miss any"
"Which pitakas/how many suttas mention this term?"

Unlike search_by_keyword (ranked, capped at 50, no total), this returns an exact count, a per-pitaka breakdown, the distinct surface forms that matched (so you can audit and discard over-matches), and a paginated enumeration. The lexical result carries complete: true — a hard guarantee that nothing was dropped for the chosen match_scope.

Two layers, two different promises:

lexical — the word and its forms. Deterministic + EXHAUSTIVE.
semantic (mode="thorough", hosted only) — passages teaching the same concept with DIFFERENT vocabulary (e.g. ānāpānassati via assasati/passasati). Approximate, NOT exhaustive — it never claims completeness, it only boosts recall.

Input Schema

TableJSON Schema

Name	Required	Description	Default
`mode`	No	"fast" (default) = lexical only — quick, no server-side ML, works offline. "thorough" = also run the semantic layer (hosted only; this is the heavier part). The lexical guarantee holds in BOTH.	fast
`cursor`	No	Offset into the full lexical result set for pagination.
`pitaka`	No	Restrict to "vinaya" / "sutta" / "abhidhamma", or None for all.
`keyword`	Yes	Term to survey (Romanised Pāli preferred; diacritics optional — matching folds `ā→a`, `ṁ→m`, etc.).
`language`	No	"pali" (default) or "english". Thai is not indexed yet.	pali
`page_size`	No	Lexical results per page (default 20, max 100). Counts/forms cover the WHOLE corpus regardless of this.
`sem_limit`	No	Max semantic hits (default 50, max 200). `capped` flags when reached. Only used when mode="thorough".
`match_scope`	No	"word" (default) matches the exact word/phrase only. "stem" also matches inflections + compounds via prefix (kusinārā → kusinārāyaṁ, kusināravagga …) — higher recall, may over-match (audit via `matched_forms`).	word
`sem_threshold`	No	Max cosine distance for semantic hits (default 0.7; lower = stricter). Only used when mode="thorough".

Output Schema

TableJSON Schema

Name	Required	Description	Default
No arguments

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint, so the description doesn't need to restate those. However, it adds valuable behavioral context: guarantees completeness for lexical mode, explicitly states that semantic mode is approximate and NOT exhaustive, explains that the lexical guarantee holds in both modes, and mentions pagination behavior. The only minor gap is not explicitly stating that it does not modify data, but annotations already cover that.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections, bullet points, and examples. It front-loads the core guarantee and usage guidance. However, it is somewhat verbose, particularly in the second paragraph about layers and promises. Some sentences could be trimmed without loss of clarity, but overall it is still effective and organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, output schema exists), the description is comprehensive. It covers core functionality, usage context, mode differences, guarantees, and limitations. The output schema is referenced indirectly but its existence means return value details are not needed. The sibling tools list confirms differentiation. The description leaves no major gaps for an agent to misuse the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already describes all 9 parameters with 100% coverage, so the description's role is additive. It adds useful context like the lexical guarantee holding in both modes, that the Tp language is not indexed yet, and that audits via matched_forms are possible. While it doesn't add extensive new info for each parameter, it provides enough extra guidance to justify above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool exhaustively surveys the whole Tipiṭaka for a term with a guaranteed complete lexical search. It distinguishes itself from `search_by_keyword` by emphasizing coverage and counting rather than ranked results. The verb 'survey' combined with the resource 'corpus' and the term 'exhaustive' make the purpose very specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('when the question is about coverage or counting') and when not to ('not search_by_keyword for best passages'). It provides concrete examples of appropriate questions and directly names the alternative tool (`search_by_keyword`) with a clear contrast in capabilities (ranked, capped, no total). This makes usage guidance excellent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly
Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
OpenAI
open source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dhamma-seeker/tripitaka-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server