ZIM RAG MCP Server

by gglessner

MCP (Model Context Protocol) server for reading .zim archives and exposing search/content retrieval tools over stdio.

What This Server Provides

  • ZIM file discovery from a configured directory

  • Metadata and article listing tools

  • Title/URL substring search

  • Article content retrieval

  • TF-IDF based RAG retrieval over extracted article chunks

Project Layout

  • server.py - MCP stdio server and tool/resource handlers

  • zim_reader.py - binary ZIM parser and article extraction

  • rag_engine.py - chunking + TF-IDF retrieval engine
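
As an illustration of the kind of work a binary ZIM parser does, the sketch below reads the fixed 80-byte file header. The field layout follows the openZIM file format as commonly documented; it is not taken from zim_reader.py, and the names ZimHeader and parse_zim_header are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumption: header layout per the openZIM format description, little-endian.
ZIM_MAGIC = 0x044D495A  # the bytes "ZIM\x04" read as a little-endian uint32
HEADER_FORMAT = "<IHH16sIIQQQQIIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 80 bytes


class ZimHeader(NamedTuple):
    """Hypothetical container for the fixed-size ZIM header fields."""
    magic: int
    major_version: int
    minor_version: int
    uuid: bytes
    entry_count: int
    cluster_count: int
    url_ptr_pos: int
    title_ptr_pos: int
    cluster_ptr_pos: int
    mime_list_pos: int
    main_page: int
    layout_page: int
    checksum_pos: int


def parse_zim_header(data: bytes) -> ZimHeader:
    """Unpack the header from the first bytes of a file and check the magic number."""
    if len(data) < HEADER_SIZE:
        raise ValueError(f"need {HEADER_SIZE} bytes, got {len(data)}")
    header = ZimHeader(*struct.unpack_from(HEADER_FORMAT, data))
    if header.magic != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")
    return header
```

In practice the bytes would come from something like `open(path, "rb").read(HEADER_SIZE)`; the pointer positions in the header then lead to the URL, title, and cluster tables.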

  • requirements.txt - Python dependencies

Requirements

  • Python 3.10+

  • A directory containing one or more .zim files

Install dependencies:

pip install -r requirements.txt

Run Locally

From this folder (MCPs/ZIM_MCP):

python server.py

Environment variable:

  • ZIM_DIRECTORY (optional): directory containing .zim files

    • default: current working directory

Example (Windows Command Prompt):

set ZIM_DIRECTORY=E:\ZIMs
python server.py
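
The lookup described above (use ZIM_DIRECTORY if set, otherwise the current working directory) can be sketched as follows. This is a minimal illustration, not the code in server.py; the function name discover_zim_files and the case-insensitive extension match are assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def discover_zim_files(directory: Optional[str] = None) -> list[Path]:
    """Return the .zim files in a directory, sorted by path.

    Resolution order: explicit argument, then ZIM_DIRECTORY, then the
    current working directory. A missing directory yields an empty list.
    """
    root = Path(directory or os.environ.get("ZIM_DIRECTORY") or os.getcwd())
    if not root.is_dir():
        return []
    # Assumption: the extension is matched case-insensitively.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".zim"
    )
```

Returning an empty list for a missing directory mirrors the "server starts but no files found" symptom: the process runs, but there is nothing to list.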

MCP Client Configuration

Launch the server by script path rather than with -m MCPs.ZIM_MCP, because the package does not define __main__.py.

Example (Windows / Cline-style JSON)

{
  "mcpServers": {
    "ZIM-MCP": {
      "type": "stdio",
      "command": "C:\\Program Files\\Python310\\python.exe",
      "args": [
        "e:\\ZIM-MCP\\MCPs\\ZIM_MCP\\server.py"
      ],
      "env": {
        "ZIM_DIRECTORY": "e:\\ZIMs"
      },
      "timeout": 60,
      "disabled": false,
      "autoApprove": []
    }
  }
}
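
Hand-written Windows paths in JSON are easy to get wrong because every backslash must be doubled. One way to avoid that is to generate the entry with Python's json module, which handles the escaping. The sketch below reproduces the structure shown above; build_client_config is a hypothetical helper, and the paths are the example values from this README.

```python
import json
import sys


def build_client_config(server_script: str, zim_directory: str,
                        python_exe: str = sys.executable) -> dict:
    """Build an mcpServers entry matching the example configuration above."""
    return {
        "mcpServers": {
            "ZIM-MCP": {
                "type": "stdio",
                "command": python_exe,
                "args": [server_script],
                "env": {"ZIM_DIRECTORY": zim_directory},
                "timeout": 60,
                "disabled": False,
                "autoApprove": [],
            }
        }
    }


if __name__ == "__main__":
    # Raw strings keep single backslashes; json.dumps doubles them on output.
    config = build_client_config(r"e:\ZIM-MCP\MCPs\ZIM_MCP\server.py", r"e:\ZIMs")
    print(json.dumps(config, indent=2))
```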

Tools

  • list_zim_files

    • List .zim files discovered in ZIM_DIRECTORY.

  • zim_info

    • Return metadata and namespace counts for a specific ZIM file.

  • zim_search

    • Search article titles and URLs by substring.

  • zim_get_article

    • Return an article's title, URL, and content.

  • zim_rag_retrieve

    • Return the top-ranked article chunks from TF-IDF retrieval.

  • zim_list_articles

    • Paginated article list with namespace filter.
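
To make the behavior of zim_search and zim_list_articles concrete, here is a minimal sketch of substring search and namespace-filtered pagination over an in-memory entry list. The Entry type, the function names, the parameter names, and the case-insensitive matching are all assumptions for illustration; the actual tool arguments are defined in server.py.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """Hypothetical directory entry: namespace, URL, and title."""
    namespace: str
    url: str
    title: str


def search_entries(entries: list[Entry], query: str, limit: int = 20) -> list[Entry]:
    """Return entries whose title or URL contains the query (case-insensitive)."""
    needle = query.casefold()
    if not needle:
        return []
    hits = [
        e for e in entries
        if needle in e.title.casefold() or needle in e.url.casefold()
    ]
    return hits[:limit]


def list_entries(entries: list[Entry], namespace: Optional[str] = None,
                 offset: int = 0, limit: int = 50) -> list[Entry]:
    """Return one page of entries, optionally restricted to a namespace."""
    selected = [e for e in entries if namespace is None or e.namespace == namespace]
    return selected[offset:offset + limit]
```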

Resource URIs

  • zim://{file}/info

  • zim://{file}/article/{url}

  • zim://{file}/search/{query}

  • zim://{file}/rag/{query}
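
The four URI templates above can be split into their parts with a few lines of string handling. The sketch below is illustrative only: parse_zim_uri and ZimResource are hypothetical names, and percent-decoding of the file and query parts is an assumption about how clients encode them.

```python
from typing import NamedTuple
from urllib.parse import unquote

RESOURCE_KINDS = {"info", "article", "search", "rag"}


class ZimResource(NamedTuple):
    """Hypothetical parsed form of a zim:// resource URI."""
    file: str
    kind: str
    argument: str  # article URL or query; empty for "info"


def parse_zim_uri(uri: str) -> ZimResource:
    """Split zim://{file}/{kind}[/{argument}] into its components."""
    prefix = "zim://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a zim:// URI: {uri!r}")
    file_part, _, rest = uri[len(prefix):].partition("/")
    kind, _, argument = rest.partition("/")
    if not file_part or kind not in RESOURCE_KINDS:
        raise ValueError(f"unrecognized resource URI: {uri!r}")
    if kind == "info" and argument:
        raise ValueError("info takes no argument")
    if kind != "info" and not argument:
        raise ValueError(f"{kind} requires an argument")
    # Assumption: clients percent-encode spaces and other special characters.
    return ZimResource(unquote(file_part), kind, unquote(argument))
```

Only the first two slashes act as separators, so article URLs that themselves contain slashes (for example a namespace prefix) stay intact in the argument.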

Notes

  • RAG indexing now gracefully handles small/stopword-heavy corpora and returns empty results instead of crashing.

  • ZIM cluster parsing supports common compression formats, including Zstandard (via Python zstandard package).
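
The empty-result behavior mentioned in the first note can be illustrated with a small, dependency-free TF-IDF retriever. This is a sketch under stated assumptions, not the code in rag_engine.py: the stopword list, the tokenizer, the smoothed IDF formula, and the name retrieve are all illustrative choices.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stopword list.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "it", "or", "on"})


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def retrieve(chunks: list[str], query: str, top_k: int = 3) -> list[tuple[float, str]]:
    """Rank chunks by TF-IDF cosine similarity to the query.

    Returns an empty list (rather than raising) when the corpus or the
    query has no usable terms, e.g. when everything is a stopword.
    """
    docs = [Counter(tokenize(c)) for c in chunks]
    query_terms = Counter(tokenize(query))
    if not query_terms or not any(docs):
        return []

    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in doc)
    # Smoothed IDF: one common variant, chosen here for illustration.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    def vector(counts: Counter) -> dict[str, float]:
        return {t: c * idf[t] for t, c in counts.items() if t in idf}

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
        return dot / norm if norm else 0.0

    q_vec = vector(query_terms)
    scored = [(cosine(q_vec, vector(doc)), chunk) for doc, chunk in zip(docs, chunks)]
    matches = [pair for pair in scored if pair[0] > 0.0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches[:top_k]
```

Checking for an empty vocabulary before building any vectors is what turns a stopword-only corpus or query into an empty result instead of an error.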

Troubleshooting

  • Server starts but no files found

    • Verify ZIM_DIRECTORY points to the folder that contains .zim files.

  • No module named ...

    • Reinstall the dependencies: pip install -r requirements.txt

  • MCP fails to launch from client

    • Use the full script path in args (...\\server.py), not -m MCPs.ZIM_MCP.

Author

Garland Glessner (gglessner@gmail.com)

License

GNU General Public License v3 (GPLv3)
