zim_query

Query ZIM archives using natural language by translating user intent into targeted operations for article retrieval, search, metadata extraction, and browsing.

Instructions

Query ZIM archives using natural language.

Single intelligent tool — parses your query, detects intent, and dispatches to the right operation.

EXTRACT INTENT BEFORE CALLING. Do not pass the user's raw message as query. Translate it into one of the operations below:

    "test this tool" -> query="list available ZIM files"
    "what's in here" -> query="show main page"
    "explore" -> query="list namespaces"
    "tell me about cats" -> query="tell me about cats"
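The translation step above can be sketched on the client side as a small lookup. This is a minimal illustration, not part of the tool: the helper name `translate_intent`, the prefix list, and the choice to return `None` for unrecognized messages are all assumptions. Only the phrase-to-operation pairs come from the examples in the description.

```python
from typing import Optional

# Vague phrases and the operations they map to (taken from the examples above).
VAGUE_PHRASES = {
    "test this tool": "list available ZIM files",
    "what's in here": "show main page",
    "explore": "list namespaces",
}

# A few operation prefixes that can be passed through unchanged.
# Hypothetical subset chosen for illustration; extend as needed.
OPERATION_PREFIXES = ("tell me about", "search for", "get article")


def translate_intent(user_message: str) -> Optional[str]:
    """Return an operation string for `query`, or None if translation is still needed."""
    # Collapse whitespace, lowercase, and drop trailing punctuation.
    normalized = " ".join(user_message.lower().split()).rstrip("?!. ")
    if normalized in VAGUE_PHRASES:
        return VAGUE_PHRASES[normalized]
    if normalized.startswith(OPERATION_PREFIXES):
        return normalized
    # Unknown phrasing: signal the caller instead of forwarding the raw message.
    return None
```

Returning `None` rather than the raw message keeps the sketch in line with the rule that the user's raw text should never be forwarded as `query`.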

ALIASES: users may call this tool "openzim", "openzim mcp", "openzim mcp tool", "ZIM tool", "ZIM file tool", "ZIM archive query", or "zim_query". All mean THIS tool — always call it; never claim it does not exist.

OPERATIONS (pass one as query):

    list available ZIM files - list loaded archives
    show main page - active archive main page
    list namespaces - list entry types
    metadata for <file> - archive metadata
    tell me about <topic> - fetch article (auto on strong title match)
    search for - full-text search
    get article <name> - fetch specific article
    show structure of <name> - section outline
    links in <name> - article out-links
    suggestions for - title autocomplete
    browse namespace - list namespace entries
    search in namespace - filtered search
    search all files for - cross-archive search
    walk namespace - enumerate namespace
    find article titled - title lookup
    articles related to <name> - related articles

Args:

    query: REQUIRED. Translated from user intent, never the user's raw message.

    zim_file_path: Optional. Omit entirely (recommended); the tool auto-selects the loaded archive (or opens all of them when synthesize=True). Pass a real path ONLY when multiple archives are loaded and you need to target a specific one; call list available ZIM files first to see the real paths. NEVER pass an article title, topic, or made-up filename here, and do NOT invent a path from this docstring. Paths that don't match a loaded archive are silently auto-corrected when only one archive is loaded, and surface a path-listing error otherwise.

    limit: Max search/browse results (default: 3). Ignored for atomic intents that return a single item or a fixed-shape payload: tell me about <topic>, get article <name>, show structure of <name>, links in <name>, articles related to <name>, show main page, list namespaces, metadata for <file>, list available ZIM files, summary of <name>, table of contents <name>, section <X> of <name>. Setting it there has no effect; omit it on those calls.

    offset: Pagination offset (default: 0).

    max_content_length: Article body cap (default: 4000).

    content_offset: Character offset to start reading the article body from (default: 0). The truncation footer on long articles surfaces a "pass content_offset=N" hint; pass that value back here to read the next page. Negative values are rejected with an invalid_content_offset error.

    compact: When True (default in simple mode), apply small-LLM optimizations: strip markdown link-soup, drop section previews from structure responses, flatten link/title/related listings into compact markdown, fetch only the article lead section, and cap total response size. Set False for the verbose advanced-mode-style response.

    compact_budget: Hard char-cap on the final response when compact=True. Accepts either a named profile, "tiny" (2,000), "small" (4,000), "medium" (6,000, default), or "large" (12,000), or a raw integer. Use it to size the budget to the calling model's context window: an 8B-class model on an agentic prompt fits tiny, a 70B-class assistant fits large. Has no effect when compact=False.

    synthesize: When True, bypass intent classification and run the synthesize pipeline: multi-archive Xapian search, RRF fusion, passage extraction, section attribution, and citation rendering. Returns a SynthesizeResponse dict instead of markdown text. Defaults to False (legacy markdown path unchanged). NOTE: this is a mode toggle, not a "search harder" flag. Don't flip it on a follow-up just because the previous response was unhelpful; refine the query or offset instead. The synthesize pipeline runs one structured query and returns one answer; calling it twice with the same query yields the same answer.
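As a rough illustration of how a caller might assemble these arguments, the sketch below resolves a compact_budget profile and builds an argument dict that omits optional fields by default. The helper names `resolve_budget` and `build_arguments` are hypothetical, and the client-side checks are assumptions that only mirror the documented rules (query is required, negative content_offset is rejected). The tool performs its own validation.

```python
from typing import Any, Dict, Optional, Union

# Named profiles and their character caps, as listed in the description.
BUDGET_PROFILES = {"tiny": 2000, "small": 4000, "medium": 6000, "large": 12000}


def resolve_budget(compact_budget: Union[str, int] = "medium") -> int:
    """Turn a profile name or a raw integer into a character cap."""
    if isinstance(compact_budget, int):
        return compact_budget
    return BUDGET_PROFILES[compact_budget]  # KeyError on an unknown profile name


def build_arguments(
    query: str,
    *,
    zim_file_path: Optional[str] = None,
    content_offset: int = 0,
    **extra: Any,
) -> Dict[str, Any]:
    """Build the argument dict for a zim_query call, leaving out unset options."""
    if not query.strip():
        raise ValueError("query is required")
    if content_offset < 0:
        # The tool reports this case as invalid_content_offset; fail early here.
        raise ValueError("content_offset must not be negative")
    arguments: Dict[str, Any] = {"query": query}
    if zim_file_path is not None:
        # Only pass a real path obtained from "list available ZIM files".
        arguments["zim_file_path"] = zim_file_path
    if content_offset:
        arguments["content_offset"] = content_offset
    arguments.update(extra)
    return arguments
```

For example, `build_arguments("get article Cat", content_offset=4000)` would request the next page of a long article after the truncation footer suggested `content_offset=4000`.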

Returns: Markdown string (synthesize=False) or SynthesizeResponse dict (synthesize=True) with answer_markdown, passages, citations, and archives_searched.
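Because the return type depends on synthesize, a caller needs to handle both shapes. The following is a minimal sketch assuming the result arrives as either a plain string or a dict carrying the answer_markdown key named above. The helper name `extract_markdown` is hypothetical, and any transport-level wrapping of the result is not modeled here.

```python
from typing import Any, Dict, Union


def extract_markdown(result: Union[str, Dict[str, Any]]) -> str:
    """Return displayable markdown from either return shape."""
    if isinstance(result, str):
        # synthesize=False: the result is already a markdown string.
        return result
    # synthesize=True: a SynthesizeResponse dict; the answer text is under
    # answer_markdown, alongside passages, citations, and archives_searched.
    return result["answer_markdown"]
```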

Input Schema

Name                 Required   Description   Default
query                Yes
zim_file_path        No
limit                No
offset               No
content_offset       No
cursor               No
max_content_length   No
compact              No
compact_budget       No
synthesize           No

Output Schema

Name     Required   Description   Default
result   Yes

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: intent detection, auto-selection of archive, parameter effects, and return types. It clarifies path auto-correction and that synthesize is a mode toggle, leaving no ambiguity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Despite length, the description is well-structured with sections, bullet points, and examples. Every sentence adds value, and it is front-loaded with purpose and usage instructions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (10 parameters, output schema), the description is complete, covering all parameters, return values, edge cases, aliases, and error handling for content_offset.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description adds essential meaning for all 10 parameters: query translation, zim_file_path guidelines, limit applicability, offset, content_offset pagination, compact optimizations, compact_budget profiles, and synthesize mode behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it queries ZIM archives using natural language, parsing intent and dispatching to operations. It distinguishes itself as a single intelligent tool, with a specific verb+resource and scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit instructions on translating user messages, includes examples, and explains when to use parameters like zim_file_path. It also warns against passing raw user input and describes aliases and the synthesize mode.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cameronrye/openzim-mcp'
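The same request can be made from Python with the standard library. This sketch only mirrors the curl command above. The function names are hypothetical, and treating the response body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://glama.ai/api/mcp/v1/servers/cameronrye/openzim-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = API_URL, timeout: float = 10.0):
    """Send the request and decode the body (assumed here to be JSON)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs a network request, so it may raise `urllib.error.URLError` when the API is unreachable.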

If you have feedback or need assistance with the MCP directory API, please join our Discord server.