Fittok
A standalone MCP server that filters and compresses context before it reaches the LLM, reducing token consumption by 80–90%.
How It Works
[User Query + Files]
        │
        ▼
┌──────────────────────────────────────┐
│  MCP Server: fittok                  │
│                                      │
│  ┌──────────┐                        │
│  │ Graphify │ → parse code into      │
│  └────┬─────┘   knowledge graph      │
│       │                              │
│  ┌────▼─────┐                        │
│  │  slurp   │ → select relevant      │
│  └────┬─────┘   subgraph (budget)    │
│       │                              │
│  ┌────▼─────┐                        │
│  │LLMLingua │ → compress to target   │
│  └──────────┘   token count          │
└──────────────────────────────────────┘
        │
        ▼
[Compressed Context] → Send to LLM
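The middle stage of the diagram, selecting a relevant subgraph under a token budget, can be illustrated with a small greedy sketch. This is not slurp's actual algorithm: the `Node` class, the `relevance` scores, and the word-count token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical graph node: a name, its source text, and a relevance score.
    name: str
    text: str
    relevance: float  # assumed to be in [0, 1]; higher means more relevant


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def select_within_budget(nodes: list[Node], token_budget: int) -> list[Node]:
    """Greedily keep the most relevant nodes whose combined size fits the budget."""
    selected = []
    used = 0
    for node in sorted(nodes, key=lambda n: n.relevance, reverse=True):
        cost = estimate_tokens(node.text)
        if used + cost <= token_budget:
            selected.append(node)
            used += cost
    return selected


nodes = [
    Node("login", "def login(user, password): ...", 0.9),
    Node("render", "def render(template): ...", 0.1),
    Node("verify_token", "def verify_token(token): ...", 0.8),
]
picked = select_within_budget(nodes, token_budget=7)
print([n.name for n in picked])  # ['login', 'verify_token']
```

A greedy pass like this is simple to reason about but can leave budget unused. A real implementation would likely also weigh graph neighbors, as the Batch Boosting feature suggests.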
What's New in v0.2.0
Feature | Description |
Streaming Output | Stage-by-stage progress events via |
Watch Mode | Incremental graph updates with file watcher ( |
Batch Boosting | O(n log n) neighbor selection in slurp (3-phase approach) |
Chunked Parsing | Batch file parsing with progress events for large codebases |
GPU Acceleration | CUDA support with auto-detection for LLMLingua compression |
3-Level Cache | Persistent graph/query/compression cache via diskcache |
Web UI | Interactive graph visualization with Gradio + pyvis |
Graph Diffing | Compare two graphs to see structural changes |
Multi-Query | One parse, many queries with |
PII Scrubbing | Detect and redact secrets, API keys, emails before processing |
Structured Output | JSON output mode with supporting nodes and relevance scores |
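To make the Graph Diffing row concrete, here is a minimal sketch of what comparing two graphs structurally can mean. The graph layout (a dict with `nodes` and `edges` lists) and the `diff_graphs` helper are hypothetical, not the format or API used by `fittok.diff`.

```python
def diff_graphs(old: dict, new: dict) -> dict:
    """Compare two graphs given as {"nodes": [...], "edges": [[src, dst], ...]}."""
    old_nodes, new_nodes = set(old["nodes"]), set(new["nodes"])
    # Edges are converted to tuples so they can be stored in sets.
    old_edges = {tuple(e) for e in old["edges"]}
    new_edges = {tuple(e) for e in new["edges"]}
    return {
        "added_nodes": sorted(new_nodes - old_nodes),
        "removed_nodes": sorted(old_nodes - new_nodes),
        "added_edges": sorted(new_edges - old_edges),
        "removed_edges": sorted(old_edges - new_edges),
    }


before = {"nodes": ["auth", "db"], "edges": [["auth", "db"]]}
after = {
    "nodes": ["auth", "db", "cache"],
    "edges": [["auth", "cache"], ["cache", "db"]],
}
changes = diff_graphs(before, after)
print(changes["added_nodes"])    # ['cache']
print(changes["removed_edges"])  # [('auth', 'db')]
```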
Installation
# From source
pip install -e .
# With all extras
pip install -e ".[dev,gpu,ui]"
Requirements
Python 3.9+
4GB RAM minimum
8GB VRAM optional (for GPU-accelerated compression)
Model Configuration
LLMLingua defaults to microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank (~500MB, CPU-friendly).
Override via environment variable:
export FITTOK_MODEL="microsoft/phi-2" # GPU recommended
export FITTOK_DEVICE=auto # auto | cuda | cpu
python -m fittok
Or programmatically:
from fittok.llmlingua_wrapper import compress_context
result = compress_context(text, "question", target_tokens=500, model="microsoft/phi-2")
Environment Variables
Variable | Default | Description |
| bert-base-multilingual | CPU model name |
| bert-base-multilingual | GPU model name |
| auto | Device: auto, cuda, cpu |
| false | Enable PII scrubbing in pipeline |
| ~/.cache/fittok | Cache directory |
| 500 | Max cache size in MB |
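The `auto` device setting could be resolved along the following lines. This is a sketch under assumptions, not fittok's implementation: `resolve_device` and its `cuda_available` override are hypothetical names, and the only external call used is PyTorch's `torch.cuda.is_available()`, imported lazily so the sketch still runs when PyTorch is absent.

```python
import os


def resolve_device(cuda_available=None):
    """Map FITTOK_DEVICE (auto | cuda | cpu) to a concrete device string."""
    requested = os.environ.get("FITTOK_DEVICE", "auto").strip().lower()
    if requested not in {"auto", "cuda", "cpu"}:
        raise ValueError(f"Unsupported FITTOK_DEVICE value: {requested!r}")
    if requested != "auto":
        # An explicit choice is returned unchanged.
        return requested
    if cuda_available is None:
        try:
            import torch  # optional dependency, only needed for detection

            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False
    return "cuda" if cuda_available else "cpu"


os.environ["FITTOK_DEVICE"] = "cpu"
print(resolve_device())  # cpu
```

The `cuda_available` parameter exists only so the detection result can be injected in tests without a GPU.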
Usage
As an MCP Server (Claude Code, etc.)
{
"mcpServers": {
"fittok": {
"command": "python",
"args": ["-m", "fittok"]
}
}
}
Or run standalone:
python -m fittok
As a Python Library
from fittok.graphify import parse_codebase, save_graph
from fittok.slurp import query_graph
from fittok.llmlingua_wrapper import compress_context
# Step 1: Parse codebase
graph = parse_codebase("/path/to/code")
save_graph(graph, "graph.json")
# Step 2: Query for relevant subgraph
markdown, count, tokens = query_graph(graph, "How does auth work?", token_budget=4000)
# Step 3: Compress
result = compress_context(markdown, "How does auth work?", target_tokens=500)
print(result["compressed"])
One-call Pipeline
from fittok.server import optimize_context_tool
result = optimize_context_tool(
codebase_path="/path/to/code",
query="How does authentication work?",
token_budget=500,
)
print(result["optimized_context"])
Multi-Query Batching
from fittok.server import optimize_context_batch
result = optimize_context_batch(
codebase_path="/path/to/code",
queries=["How does auth work?", "What is the entry point?"],
token_budget=500,
)
for r in result["results"]:
    print(f"Q: {r['query']}\nA: {r['optimized_context']}\n")
PII Scrubbing
from fittok.pii_scrubber import scrub_text
result = scrub_text("Contact admin@company.com with key AKIAIOSFODNN7EXAMPLE")
print(result["scrubbed"])
# "Contact [REDACTED_EMAIL] with key [REDACTED_AWS_ACCESS_KEY]"
MCP Tools (v0.1.0)
Tool | Description |
| Parse code into a knowledge graph |
| Query graph for relevant subgraph |
| Compress text using LLMLingua |
| Full pipeline: parse → query → compress |
MCP Tools (v0.2.0 — new)
Tool | Description |
| Streaming pipeline with stage-by-stage progress |
| One parse, many queries |
| JSON structured output with supporting nodes |
| Chunked parsing with progress events |
| Incremental graph updates via file watcher |
| Graph metadata and type distribution |
| Force full re-parse, ignoring cache |
| Compare two knowledge graphs |
| PII detection and redaction |
| Manage PII patterns |
| Cache management |
| Launch web visualization dashboard |
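The redaction shown in the PII Scrubbing example earlier can be approximated with a couple of regular expressions. This sketch is illustrative only: the `redact` function, its `counts` field, and both patterns are assumptions, and the pattern set that `fittok.pii_scrubber` actually uses may differ.

```python
import re

# Illustrative patterns only; these are not taken from fittok's source.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> dict:
    """Replace each match with a [REDACTED_<LABEL>] placeholder and count the hits."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"[REDACTED_{label}]", text)
        counts[label] = hits
    return {"scrubbed": text, "counts": counts}


result = redact("Contact admin@company.com with key AKIAIOSFODNN7EXAMPLE")
print(result["scrubbed"])
# Contact [REDACTED_EMAIL] with key [REDACTED_AWS_ACCESS_KEY]
```

Pattern-based scrubbing catches well-formed secrets but misses anything that does not match a known shape, so it is best treated as a first line of defense.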
Supported Languages
Python
JavaScript / TypeScript
Java
Go
Rust
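When parsing a mixed codebase, files in these languages need to be routed to the right parser. One simple way to do this is by file extension, sketched below. The extension map and the `detect_language` helper are hypothetical, and fittok may detect languages differently.

```python
from pathlib import Path

# Common file extensions for the supported languages (an assumed mapping).
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".jsx": "javascript",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".java": "java",
    ".go": "go",
    ".rs": "rust",
}


def detect_language(path):
    """Return the language name for a path, or None if the extension is unknown."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())


print(detect_language("src/auth/login.py"))  # python
print(detect_language("README.md"))          # None
```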
Running Tests
pip install -e ".[dev]"
pytest tests/ -v
Architecture
fittok/
├── pyproject.toml
├── src/fittok/
│ ├── __init__.py
│ ├── server.py # MCP server (FastMCP)
│ ├── graphify.py # Code → knowledge graph
│ ├── slurp.py # Graph query engine
│ ├── llmlingua_wrapper.py # Compression wrapper (CPU + GPU)
│ ├── models.py # Pydantic data models
│ ├── tokens.py # Shared token counting
│ ├── cache.py # 3-level persistent cache
│ ├── diff.py # Graph diffing
│ ├── pii_scrubber.py # PII detection & redaction
│ ├── watcher.py # File watcher for incremental updates
│ └── ui.py # Web visualization (Gradio + pyvis)
├── tests/
│ ├── test_graphify.py
│ ├── test_slurp.py
│ ├── test_llmlingua.py
│ ├── test_server.py
│ ├── test_server_v2.py
│ ├── test_cache.py
│ ├── test_diff.py
│ └── test_pii_scrubber.py
├── examples/
│ └── usage.py
├── README.md
└── LICENSE
License
MIT