contxt-box
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@contxt-box search for 'neural-network' in my research folder".
That's it! The server will respond to your query, and you can continue using it as needed.
What Is It?
ConTXT BOX is a strict, local-first knowledge layer that sits beside any project or document folder. It gives coding agents such as Claude Code, Codex, Cursor, and other MCP clients a fast external memory: indexed filenames, folders, neighbors, summaries, cached document/image context, and durable chat preservation.
The design is intentionally narrow. Documents and images are the core path because they cover most real user context. Heavy extraction uses exactly one configured engine: MarkItDown or Docling. No multi-tool fallback chain is used in core extraction.
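The "exactly one engine, no fallback chain" rule can be illustrated with a small sketch. The function names and the registry below are hypothetical stand-ins, not the project's actual code; the real extractors would call MarkItDown or Docling.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the two supported engines.
def extract_with_markitdown(path: str) -> str:
    return f"[markitdown] {path}"


def extract_with_docling(path: str) -> str:
    return f"[docling] {path}"


ENGINES: Dict[str, Callable[[str], str]] = {
    "markitdown": extract_with_markitdown,
    "docling": extract_with_docling,
}


def pick_engine(name: str) -> Callable[[str], str]:
    """Return exactly one extractor; never fall back to another engine."""
    try:
        return ENGINES[name]
    except KeyError:
        raise ValueError(f"unknown extraction engine: {name!r}") from None
```

Failing loudly on an unknown engine name, instead of quietly trying the other engine, is what keeps the extraction path predictable.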
Features
Lazy indexing with rel_path, filename, folder, mtime, size, type, neighbors, folder summaries, and cheap file summaries.
On-demand extraction only through MarkItDown or Docling.
Permanent Markdown sidecars under .contextbox/history/media/.
MCP tools for coding agents.
Watchdog-based watch command for continuous index updates.
Preview-only smart reorganization.
Auto preservation into .contextbox/CONTEXT.md plus JSONL history.
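As a rough illustration of the JSONL history mentioned in the last feature, the sketch below appends one record per call to .contextbox/preservation.jsonl. The function name and the record fields (timestamp, summary, metadata) are assumptions for illustration; the project's actual schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def preserve(root: Path, summary: str, metadata: Optional[dict] = None) -> dict:
    """Append one record to .contextbox/preservation.jsonl and return it."""
    box = root / ".contextbox"
    box.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed field name
        "summary": summary,
        "metadata": metadata or {},
    }
    # One JSON object per line keeps the history append-only and easy to tail.
    with (box / "preservation.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Append-only JSONL means earlier entries are never rewritten, which suits a durable history file.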
Quick Start
uv sync
uv run contxtbox --help
uv run contxtbox init --root "S:\Papers"
uv run contxtbox config-show --root "S:\Papers"
uv run contxtbox index --root "S:\Papers"
uv run contxtbox health --root "S:\Papers"
uv run contxtbox search "computer vision" --root "S:\Papers"
When commands are run from inside the target workspace, --root can be omitted.
Install the document/image engines:
uv sync --extra media
Extract one file with the strict default engine:
uv run contxtbox extract-media "Computer Vision\paper.pdf" --root "S:\Papers"
Use Docling explicitly:
uv run contxtbox extract-media "Computer Vision\paper.pdf" --root "S:\Papers" --engine docling
Watch a folder:
uv run contxtbox watch --root "S:\Papers"
Run production readiness checks:
uv run contxtbox health --root "S:\Papers" --fail-on-error
Show the effective workspace config:
uv run contxtbox config-show --root "S:\Papers"
How It Works
workspace/
`-- .contextbox/
    |-- index.json
    |-- config.toml
    |-- CONTEXT.md
    |-- preservation.jsonl
    `-- history/
        `-- media/
            `-- sanitized__file__path.context.md
Indexing Rules
index, update_index, and watch always record:
rel_path
filename
folder_path
mtime
size
file_type
neighbors
parent_folder_summary
last_indexed
context_summary
The default summary is cheap and deterministic. It uses filename, folder name, and 5-7 nearby files. It does not open PDFs or images during indexing.
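A cheap, deterministic summary of this kind could look roughly like the sketch below. It uses only names and os-level metadata and never reads file contents. The function name, the neighbor-selection rule (alphabetical siblings), and the summary wording are assumptions for illustration, not the project's implementation.

```python
from pathlib import Path


def cheap_summary(path: Path, max_neighbors: int = 7) -> dict:
    """Build an index record from names and stat data only; the file is never opened."""
    # Sorting makes the result deterministic across runs and platforms.
    siblings = sorted(
        p.name for p in path.parent.iterdir()
        if p.is_file() and p.name != path.name
    )
    neighbors = siblings[:max_neighbors]
    stat = path.stat()
    return {
        "filename": path.name,
        "folder_path": path.parent.name,
        "mtime": stat.st_mtime,
        "size": stat.st_size,
        "file_type": path.suffix.lower().lstrip("."),
        "neighbors": neighbors,
        "context_summary": (
            f"{path.name} in folder '{path.parent.name}' "
            f"near: {', '.join(neighbors) or 'no other files'}"
        ),
    }
```

Because nothing is parsed, indexing cost stays proportional to the number of directory entries, not to document size.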
Configuration
init creates .contextbox/config.toml:
extraction_engine = "markitdown"
max_inline_bytes = 512000
large_file_bytes = 50000000
max_neighbors = 10
debounce_seconds = 2.0
auto_watch = true
ignored_dirs = [
".git",
".venv",
"node_modules",
]
priority_folders = [
"codebases/",
"research/",
"specs/",
"decisions/",
"assets/images/",
]
Use "docling" when you want Docling as the strict extraction engine.
Extraction Rules
Heavy extraction only happens when:
extract-media path is called,
or an MCP client calls get_file(path, depth="full").
The result is cached as Markdown in .contextbox/history/media/, and index.json receives:
extracted_at
context_ref
extraction_method
extraction_status
extraction_warnings
extraction_duration_seconds
Sidecars include the same audit header before extracted content. Status values are conservative:
success, partial, metadata-only, or cached.
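The cache-then-extract flow described above can be sketched as follows. The sidecar naming rule (path separators replaced with a double underscore) is a guess based on the sanitized__file__path.context.md example, and both function names are hypothetical. The extract callable stands in for whichever single engine is configured.

```python
from pathlib import Path
from typing import Callable, Tuple


def sidecar_path(root: Path, rel_path: str) -> Path:
    # Assumed naming: separators become "__", then ".context.md" is appended.
    safe = rel_path.replace("\\", "/").strip("/").replace("/", "__")
    return root / ".contextbox" / "history" / "media" / f"{safe}.context.md"


def get_full(root: Path, rel_path: str,
             extract: Callable[[Path], str]) -> Tuple[str, str]:
    """Return (markdown, status); run `extract` only when no sidecar exists."""
    target = sidecar_path(root, rel_path)
    if target.exists():
        return target.read_text(encoding="utf-8"), "cached"
    source = root / rel_path.replace("\\", "/")
    markdown = extract(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(markdown, encoding="utf-8")
    return markdown, "success"
```

A second request for the same file returns the stored Markdown with status cached, so the heavy engine runs at most once per file unless extraction is forced.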
MCP Tools
update_index()
server_info()
set_root(root, index=true)
health()
search(query, limit=10)
get_file(path, depth="metadata" | "full")
pull_context(task, limit=5)
extract_media(path, force=false)
reorganize(instruction)
auto_preserve_context(summary, metadata=null)
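MCP clients invoke these tools with JSON-RPC 2.0 messages using the tools/call method. The sketch below only builds such a request for the search tool; the helper name is hypothetical, and a real client would normally complete the MCP initialize handshake and use a client library instead of hand-building messages.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Argument names follow the search(query, limit=10) signature listed above.
message = tool_call(1, "search", {"query": "computer vision", "limit": 10})
```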
Start the MCP server:
uv run contxtbox mcp --root "S:\Papers"
Attribution
MarkItDown, MIT.
Docling, MIT.
watchdog, Apache-2.0.
sentence-transformers, Apache-2.0 library with model-specific licenses.
ChromaDB, Apache-2.0.
gstack, MIT, as workflow inspiration.
Ponytail, MIT, as minimal-agent behavior inspiration.
Roadmap
Stronger semantic search over sidecars.
Reorganization scoring based on folder summaries and neighbor cues.
MCP client recipes for Claude Code, Codex, Cursor, and others.
Safe apply/undo flow for reorganization.
Configurable ignore rules and extraction engine policy.
Contributing
New ideas, bug fixes, documentation improvements, integration recipes, and production hardening work are welcome. Open an issue for discussion, or submit a focused pull request with a clear description, tests where relevant, and the verification commands you ran.
Useful contribution areas:
MCP client setup recipes for more coding tools.
Better document/image extraction quality checks.
Faster indexing and retrieval for large workspaces.
Safer reorganization previews and apply/undo flows.
Clearer docs, examples, and real-world testing notes.
See CONTRIBUTING.md for the development checks.
Connect
Email: samarakoonf@gmail.com
LinkedIn: Oshadha Samarakoon
License
MIT. See LICENSE.
Release
PyPI publishing is configured for Trusted Publishing through GitHub Actions. See Production readiness.