Skip to main content
Glama
jmacd867

fastcontext-mcp

by jmacd867

fastcontext-mcp

An MCP server that wraps FastContext-1.0 as a repo-exploration subagent for Claude Code.

Instead of letting Sonnet spend half its context budget grepping around a codebase, you offload that work to a dedicated 4B model trained specifically to explore repos. FastContext issues parallel READ/GLOB/GREP calls, then returns compact file paths + line ranges as grounded citations. Claude Code gets clean context; FastContext does the legwork.

Claude Code (Sonnet) ──explore_repo──▶  fastcontext-mcp  ──READ/GLOB/GREP──▶  your repo
        ▲                                       │
        └──── file:line citations ──────────────┘

Based on Microsoft's FastContext paper: integrating FastContext improves coding agent accuracy by up to 5.5% while reducing main-agent token consumption by up to 60%.


Requirements

  • Python 3.10+

  • A running FastContext inference server (SGLang or vLLM, OpenAI-compatible)

  • Claude Code


Related MCP server: Code Intelligence MCP Server

Setup

1. Serve FastContext locally

You need a GPU with ~6GB VRAM for the 4B model. The 4B-RL variant slightly outperforms 4B-SFT on most benchmarks and is recommended for deployment.

pip install sglang[all]

# SFT variant (default)
./scripts/serve.sh microsoft/FastContext-1.0-4B-SFT

# RL variant (recommended)
./scripts/serve.sh microsoft/FastContext-1.0-4B-RL

Or with vLLM:

pip install vllm
vllm serve microsoft/FastContext-1.0-4B-SFT --tool-call-parser hermes

The server will be available at http://localhost:30000.

2. Install this MCP server

git clone https://github.com/YOUR_USERNAME/fastcontext-mcp
cd fastcontext-mcp
pip install -e .

3. Register with Claude Code

Add to your ~/.claude/claude_desktop_config.json (or project-level .mcp.json):

{
  "mcpServers": {
    "fastcontext": {
      "command": "fastcontext-mcp",
      "env": {
        "FASTCONTEXT_BASE_URL": "http://localhost:30000/v1",
        "FASTCONTEXT_MODEL": "FastContext-1.0-4B-SFT"
      }
    }
  }
}

Restart Claude Code. You should see explore_repo in the available tools.


Usage

Once registered, Claude Code can call explore_repo automatically, or you can invoke it explicitly:

explore_repo("where is the rate limiting middleware defined")
explore_repo("find all places that call the payment API", repo_root="/path/to/repo")

FastContext will issue several parallel read/search calls internally and return something like:

<final_answer>
- src/middleware/ratelimit.py: lines 12-47
- src/middleware/__init__.py: line 8
- tests/test_ratelimit.py: lines 1-30
</final_answer>

Claude Code then uses those citations as focused context rather than reading the whole codebase.


Configuration

Env var

Default

Description

FASTCONTEXT_BASE_URL

http://localhost:30000/v1

SGLang/vLLM server URL

FASTCONTEXT_MODEL

FastContext-1.0-4B-SFT

Model name as registered in the server

FASTCONTEXT_MAX_TURNS

8

Max exploration turns before giving up

FASTCONTEXT_MAX_FILE_LINES

300

Max lines returned per READ call


No GPU? Remote inference

If you don't have a local GPU, you can serve FastContext on a remote machine and point FASTCONTEXT_BASE_URL at it. The MCP server itself is CPU-only and just proxies requests.


Why not just use Claude Code directly?

You can. But FastContext is trained specifically for the locate-relevant-code task, and it's 4B parameters — it's faster and cheaper per exploration call than routing everything through Sonnet. On large codebases the token savings are significant (the paper reports up to 60% reduction in main-agent tokens).


License

MIT. FastContext model weights are also MIT licensed by Microsoft.

A
license - permissive license
-
quality - not tested
C
maintenance

Maintenance

Maintainers
Response time
Release cycle
Releases (12mo)
Commit activity

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jmacd867/fastcontext-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server