mcplens
Provides semantic codebase search tools for Windsurf (by Codeium), enabling efficient retrieval of relevant code chunks instead of reading full files.
Semantic codebase search for AI coding assistants — 70-85% token reduction, 100% local, zero cloud dependency.
AI coding assistants like Claude Code, Cursor, and Codex are powerful — but they have a fundamental problem: when you ask a question, they read files by guessing which ones are relevant based on path and filename heuristics. On a medium-sized project, a single query can consume 10,000–20,000 tokens of context just loading files that may not even be relevant.
claude-context-optimizer solves this by giving your AI assistant semantic search over your codebase. Instead of reading files blindly, it calls search_code("how does payment work?") and gets back only the 5 most relevant code chunks. Chunks are indexed locally using embeddings and stored in SQLite, with zero data leaving your machine.
How it works
When you open your AI assistant in a project:
The MCP server starts automatically (spawned via stdio by the assistant)
It compares file hashes against the last index and re-indexes only what changed (delta indexing)
A file watcher keeps the index in sync as you code
Your assistant now has access to 3 semantic search tools instead of reading raw files
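The delta-indexing step above can be illustrated with a minimal sketch. This is not the project's actual implementation: the HashIndex shape, the contentHash helper, and the computeDelta function are hypothetical names introduced here, and the djb2-style hash is only a dependency-free stand-in for whatever hash the server really uses.

```typescript
// Hypothetical shape: relative file path -> content hash.
type HashIndex = Record<string, string>;

interface Delta {
  added: string[];
  changed: string[];
  removed: string[];
}

// djb2-style string hash, used only as a dependency-free stand-in.
// A real indexer would more likely use a cryptographic hash such as SHA-256.
function contentHash(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = (hash * 33 + text.charCodeAt(i)) % 4294967296;
  }
  return hash.toString(16);
}

function has(index: HashIndex, path: string): boolean {
  return Object.prototype.hasOwnProperty.call(index, path);
}

// Compare the hashes stored at the last index run with the hashes on disk now.
function computeDelta(previous: HashIndex, current: HashIndex): Delta {
  const delta: Delta = { added: [], changed: [], removed: [] };
  for (const path of Object.keys(current)) {
    if (!has(previous, path)) {
      delta.added.push(path);
    } else if (previous[path] !== current[path]) {
      delta.changed.push(path);
    }
  }
  for (const path of Object.keys(previous)) {
    if (!has(current, path)) {
      delta.removed.push(path);
    }
  }
  return delta;
}

const lastIndex: HashIndex = {
  "src/a.ts": contentHash("const a = 1;"),
  "src/b.ts": contentHash("const b = 2;"),
};
const onDisk: HashIndex = {
  "src/a.ts": contentHash("const a = 1;"),
  "src/b.ts": contentHash("const b = 3;"),
  "src/c.ts": contentHash("const c = 4;"),
};
const delta = computeDelta(lastIndex, onDisk);
console.log(delta);
```

Only the files listed under added and changed would need new embeddings; files under removed would have their chunks dropped from the index.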
You ask: "how does the Asaas webhook work?"
Without cco:                          With cco:
Read AsaasWebhookController.php       search_code("asaas webhook")
Read AsaasWebhookService.php          → returns 5 relevant chunks
Read PaymentService.php               → ~800 tokens total
Read BillingModule.php
Read ...8 more files
→ ~15,000 tokens total
Under the hood
Embeddings: Ollama with nomic-embed-text (768-dim), 100% local, free, no API key
Vector store: SQLite with cosine similarity computed in-process, no extra infrastructure
Chunking: AST-aware via tree-sitter (splits by function/class) with sliding window fallback
Transport: MCP stdio, where the assistant spawns the process and communicates via pipe
Persistence: Index lives in .claude-context/index.db and survives between sessions
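To make the "cosine similarity computed in-process" idea concrete, here is a minimal sketch of a top-K search over stored embeddings. The Chunk and Hit types and the searchChunks function are hypothetical names, not the server's API; the topK and minScore defaults mirror the values shown in the Configuration section, and the 3-dimensional vectors stand in for real 768-dimensional embeddings.

```typescript
interface Chunk {
  id: string;
  embedding: number[];
}

interface Hit {
  id: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("dimension mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat a zero vector as matching nothing
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk, drop weak matches, and keep the best topK.
function searchChunks(
  queryEmbedding: number[],
  chunks: Chunk[],
  topK: number = 5,
  minScore: number = 0.3
): Hit[] {
  return chunks
    .map((chunk) => ({
      id: chunk.id,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .filter((hit) => hit.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy data: tiny vectors instead of real model output.
const sampleChunks: Chunk[] = [
  { id: "a", embedding: [1, 0, 0] },
  { id: "b", embedding: [0.6, 0.8, 0] },
  { id: "c", embedding: [0, 0, 1] },
];
const hits = searchChunks([1, 0, 0], sampleChunks);
console.log(hits);
```

A brute-force scan like this is linear in the number of chunks, which is one plausible reason a plain SQLite table can be enough at the project sizes discussed here.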
Compatibility
claude-context-optimizer works with any MCP-compatible AI coding assistant. MCP (Model Context Protocol) is an open standard, so the same server works across all clients without modification.
Assistant | Status | Config location |
Claude Code | ✅ | |
Cursor | ✅ | |
Windsurf | ✅ | |
Trae | ✅ | |
Codex | ✅ | MCP config (preview) |
Any MCP client | ✅ | Follows MCP stdio spec |
The init command detects which assistants you use and registers the server automatically in the right place.
Token savings
The index lives locally, and the assistant fetches only what's relevant. Approximate context usage per query:
Project size | Without cco | With cco | Savings |
~200 files | ~5k tokens/query | ~1.2k tokens/query | ~75% |
~1000 files | ~10k tokens/query | ~1.5k tokens/query | ~85% |
~5000 files | ~20k+ tokens/query | ~2k tokens/query | ~90% |
These are context tokens — the portion you control. Savings scale with project size because larger projects trigger more heuristic file reads by default.
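The savings column follows directly from the two token columns. As a quick check, this small sketch recomputes it from the approximate figures in the table; the function and field names are illustrative only.

```typescript
// Percentage of context tokens saved, rounded to the nearest whole percent.
function savingsPercent(baselineTokens: number, optimizedTokens: number): number {
  return Math.round((1 - optimizedTokens / baselineTokens) * 100);
}

// Approximate per-query figures taken from the table above.
const tableRows = [
  { files: 200, baseline: 5000, optimized: 1200 },
  { files: 1000, baseline: 10000, optimized: 1500 },
  { files: 5000, baseline: 20000, optimized: 2000 },
];

const computedSavings = tableRows.map((row) =>
  savingsPercent(row.baseline, row.optimized)
);
console.log(computedSavings); // [76, 85, 90]
```

The results line up with the table's rounded ~75%, ~85%, and ~90%.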
Tools exposed
Tool | When to use |
search_code | Conceptual queries: "how does billing work", "where is authentication handled" |
get_symbol | Exact lookups: "find PaymentService", "where is handleWebhook defined" |
| Debug: how many files and chunks are currently indexed |
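For readers curious what a tool invocation looks like on the wire, here is a rough sketch of the JSON-RPC message an MCP client sends over stdio to call search_code. The tools/call method and the newline-delimited framing come from the MCP specification; the argument name "query" is an assumption, since this README does not document the tool's input schema.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// The stdio transport sends each JSON-RPC message as one line of JSON.
function frameMessage(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

// "query" is a hypothetical argument name used for illustration.
const framedCall = frameMessage(
  buildToolCall(1, "search_code", { query: "how does billing work" })
);
console.log(framedCall);
```

In practice the assistant builds and sends this message itself; the sketch only shows what travels through the pipe.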
Add this to your project's CLAUDE.md (or equivalent) to guide the assistant:
## Context Search
Always use MCP tools before reading files:
- search_code() — for conceptual or natural language queries
- get_symbol() — for exact class/function/method lookups
Only read full files if both tools return insufficient context.
Installation options
Option A — npm (requires Ollama)
Zero overhead. Best for developers who already have Ollama installed.
npm install -g @vmsfigueredo/mcplens
ollama pull nomic-embed-text:latest
cd your-project && mcplens init
See INSTALL.md for full setup instructions.
Option B — Docker
Not available yet. Docker distribution (bundling Node + Ollama + model) is planned but not implemented. Track progress in the Roadmap.
Configuration
.claude-context/config.json is created automatically by init. Edit it to customize behavior:
{
  "embeddings": {
    "provider": "ollama",
    "ollamaUrl": "http://localhost:11434",
    "ollamaModel": "nomic-embed-text:latest"
  },
  "search": {
    "topK": 5,
    "minScore": 0.3
  },
  "ignore": [
    "**/tests/fixtures/**"
  ]
}
To use OpenAI embeddings instead:
{
  "embeddings": {
    "provider": "openai",
    "openaiApiKey": "sk-...",
    "openaiModel": "text-embedding-3-small"
  }
}
What gets indexed
Included by default: .ts .tsx .js .jsx .mjs .php .svelte .vue .py .rb .go .rs .css .scss .json .yaml .yml .md .sql
Ignored by default: node_modules, .git, vendor, dist, build, .next, .claude-context
The .claude-context/ directory is automatically added to .gitignore.
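The default include and ignore rules above could be expressed roughly as follows. This is a sketch under assumptions: shouldIndex is a hypothetical helper, paths are assumed to use forward slashes, and ignored names are assumed to match any directory segment of the path. The custom glob patterns from the "ignore" config key are not handled here.

```typescript
const INCLUDED_EXTENSIONS = [
  ".ts", ".tsx", ".js", ".jsx", ".mjs", ".php", ".svelte", ".vue", ".py",
  ".rb", ".go", ".rs", ".css", ".scss", ".json", ".yaml", ".yml", ".md", ".sql",
];

const IGNORED_DIRS = [
  "node_modules", ".git", "vendor", "dist", "build", ".next", ".claude-context",
];

// Decide whether a project-relative path should be indexed by default.
function shouldIndex(relativePath: string): boolean {
  const segments = relativePath.split("/");
  const fileName = segments[segments.length - 1];
  const dirs = segments.slice(0, -1);

  // Skip anything that lives under an ignored directory.
  if (dirs.some((dir) => IGNORED_DIRS.indexOf(dir) !== -1)) {
    return false;
  }

  // Files without an extension, and dotfiles such as ".gitignore", are skipped.
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) {
    return false;
  }
  return INCLUDED_EXTENSIONS.indexOf(fileName.slice(dot)) !== -1;
}

console.log(shouldIndex("src/app.ts"));
console.log(shouldIndex("node_modules/pkg/index.js"));
```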
Index size reference
Project | Files | Approx size |
Small | ~200 files | ~15 MB |
Medium | ~1000 files | ~70 MB |
Large | ~5000 files | ~350 MB |
Dashboard
A lightweight web dashboard is available on localhost while the server is running:
Overview: files indexed, chunks, index size, Ollama status
Activity: live feed of re-indexing events
Search: test queries manually and see scores (useful for calibrating minScore)
Files: full list of indexed files with chunk counts
The dashboard runs on port 3333 by default. If that port is already taken (e.g. two projects open simultaneously), the port is automatically calculated from the project name. To open:
mcplens dashboard
To disable: add --no-dashboard to the server args in your MCP config.
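The README does not say how the fallback port is derived from the project name. One plausible approach, shown purely as an illustration, is to hash the name into a fixed range above the default port. The range size, the hash, and the function names below are all assumptions, not the server's actual logic.

```typescript
const DEFAULT_PORT = 3333; // default stated above
const PORT_RANGE = 1000; // hypothetical size of the fallback range

// Map a project name to a stable port in (DEFAULT_PORT, DEFAULT_PORT + PORT_RANGE].
function portForProject(projectName: string): number {
  let hash = 0;
  for (let i = 0; i < projectName.length; i++) {
    hash = (hash * 31 + projectName.charCodeAt(i)) % 4294967296;
  }
  return DEFAULT_PORT + 1 + (hash % PORT_RANGE);
}

// Use the default port when it is free, otherwise the name-derived one.
function chooseDashboardPort(projectName: string, defaultPortTaken: boolean): number {
  return defaultPortTaken ? portForProject(projectName) : DEFAULT_PORT;
}

console.log(chooseDashboardPort("billing-api", false));
console.log(chooseDashboardPort("billing-api", true));
```

Deriving the port from the name keeps it stable across sessions, so a bookmarked dashboard URL for a given project would keep working.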
Privacy
Everything runs on your machine:
Embeddings are generated locally via Ollama, so your code never leaves your machine
The index is stored in .claude-context/index.db in your project
No telemetry, no analytics, no accounts
⚠️ If you use the OpenAI embeddings option, chunks are sent to OpenAI's API.
Why not just use existing tools?
Tool | Language | Fully local? | Install friction |
| TypeScript | ❌ requires Zilliz Cloud + OpenAI | Medium |
| Python | ✅ | High (torch, FAISS, pipx) |
| Python | ✅ | Medium (pipx, sentence-transformers) |
| Rust | ✅ | High (must compile Rust) |
@vmsfigueredo/mcplens | Node.js | ✅ | Low ( |
The goal is to be the most accessible option for JS/TS developers — not the most feature-complete. If you already have Node.js, you're one command away.
Roadmap
AST-based chunking via tree-sitter
Delta indexing by file hash
Real-time file watcher
Dashboard
Multi-client init (Claude Code, Cursor, Windsurf, Trae)
Hybrid search (BM25 + semantic)
Docker option with bundled Ollama
Contextual retrieval (LLM-generated chunk summaries)
Token usage analytics via Claude Code hooks
Contributing
PRs welcome. See INSTALL.md for local development setup.
Built with
This project was built using Claude Code — which is exactly why it exists.
License
MIT