# Plan 1: llms-txt-mcp (Lean Documentation MCP)
*Fast, predictable, minimal‑context documentation access for Claude Code via llms.txt, with URL‑only config, human‑readable IDs, and freshness under your control.*
## Problem Statement
- **Real Impact**
- AI SDK exposes its docs via `llms.txt` using repeated YAML frontmatter blocks; many tools either miss sections or dump huge context, making Claude Code outputs noisy and slow. Example: AI SDK `llms.txt` is 25K+ lines and >100 sections; naive listing pollutes the context.
- Third‑party knowledge MCPs (e.g., broad, multi‑source indices) sometimes mix outdated and fresh pages (e.g., AI SDK v4 vs v5 during migration), leading to inconsistent answers and broken code.
- **Root Cause**
- Parsing assumptions (expecting plain markdown headings) don’t fit AI SDK’s YAML‑block format; tools lack structured section boundaries.
- Tool schemas return too much text (TOCs or full files) instead of small, surgical payloads.
- No source allowlist or freshness controls (TTL + ETag) → version selection is effectively random.
- **Business Impact**
- Developer time lost due to context bloat and wrong/outdated guidance.
- Higher token costs; slower iteration; reduced trust in assistant outputs.
### Current State Analysis
- **✅ What’s Working**
- The need is clear: lean, deterministic doc access improves Claude Code productivity.
- Chroma provides simple ephemeral/persistent modes; sentence‑transformers provide fast small embedding models.
- **🔧 What Needs Improvement**
- Robust parser for both AI SDK’s YAML‑block llms.txt and the official llms.txt.
- Search‑first tool flow with tiny results; byte‑capped section retrieval; human‑readable IDs.
- Strict source allowlist, TTL + ETag/Last‑Modified freshness, and unified index with host filtering.
- **📊 Data/Metrics**
- AI SDK `llms.txt` >25K lines; target structure discovery <1s; search <150ms on ~1–5k chunks; `docs_get` payload capped at ~60KB.
## Core Objectives
- **Primary Goal**: Provide Claude Code with fast, minimal‑context, deterministic documentation retrieval from user‑specified `llms.txt` URLs.
- **Secondary Goals**:
- Parse both AI SDK YAML‑frontmatter llms.txt and official llms.txt formats cleanly.
- Unified index across multiple sources; filter by host when needed.
- Human‑readable IDs (canonical URLs + anchors) and small, predictable tool schemas.
- Freshness you control: TTL + ETag/Last‑Modified revalidation.
- **Success Metrics**:
- Search p95 <150ms; single `docs_get` p95 <50ms (local, indexed).
- `docs_get` responses respect default 60KB cap and preserve order.
- No full TOC or full‑file dumps in normal flows; tool schemas stay under ~250 tokens.
### MVP Decision Filter *(30 seconds max)*
- [x] **Serve users/revenue?** Yes — immediate productivity gains, lower token use.
- [x] **Build now or defer?** Build now — core to your daily workflow.
- [x] **Maintenance burden?** Low — small surface; modest deps; TTL/ETag reduce churn.
---
## Phase-Based Implementation
> Each phase ends in a working, committable state. Fewer phases where possible; no time estimates.
### Phase 1: Foundation/Core Implementation
**Goals:**
- Implement a lean MCP server (`llms-txt-mcp`) that indexes user‑provided `llms.txt` URLs and exposes four tiny tools: `docs_sources`, `docs_search`, `docs_get`, `docs_refresh`.
- Ensure search‑first UX and byte‑capped retrieval with human‑readable IDs.
**Phase Scope:**
- **Must Have**
- URL‑only config; strict allowlist; Chroma backend (ephemeral by default); `thenlper/gte-small` embeddings.
- Parsers for AI SDK YAML‑block llms.txt and official llms.txt (H2 sections, H1 fallback).
- Unified collection `docs` with host metadata; `docs_search(q, hosts?, k)`; `docs_get(ids[], max_bytes, merge?)`.
- Freshness: TTL + ETag/Last‑Modified; `docs_refresh` to force reindex.
- Streaming fetch + incremental parse for large files.
- **Can Defer**
- Disk persistence flag (`--store=disk`) and stale doc cleanup on reindex (both picked up in Phase 2); diversity (MMR).
- **Won’t Do**
- Name:url configs; backend switching flags; listing full TOCs; complex multi‑page crawl.
**Implementation:**
```text
# CLI (positional URLs)
llms-txt-mcp URL... --ttl 24h --timeout 30 --embed-model thenlper/gte-small \
  --store memory [--store-path PATH] --max-get-bytes 60000
# Or with the flag form:
llms-txt-mcp --sources URL... --ttl 24h --timeout 30 --embed-model thenlper/gte-small \
  --store memory [--store-path PATH] --max-get-bytes 60000

# Tools (FastMCP)
docs_sources() -> [{ source_url, host, lastIndexed, docCount }]
docs_search(q, hosts?: [str], k?: int=5) -> [{ id, source, title, snippet, score }]
docs_get(ids: [str], max_bytes?: int=60000, merge?: bool=false, depth?: int=0)
docs_refresh(source?: str) -> { refreshed, counts }

# IDs
id = canonical_url (+ '#anchor')

# Parser selection
try AI SDK YAML-frontmatter blocks -> map blocks to sections
else parse official headings -> H2 sections (H1 fallback)
```
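The parser selection above, sketched in Python. This is a minimal sketch under stated assumptions: the AI SDK variant repeats `---`-delimited YAML frontmatter blocks, each followed by that section's body, with `title` among the keys; PyYAML is an assumed extra dependency, and the exact delimiters/keys should be verified against the live file.
```python
# Minimal two-format parser sketch (assumptions noted above).
import re

import yaml

YAML_BLOCK = re.compile(r"^---\s*\n(.*?)\n---\s*\n", re.MULTILINE | re.DOTALL)


def parse_yaml_block_llms_txt(text: str) -> list[dict]:
    """Each frontmatter block plus its trailing body becomes one section."""
    matches = list(YAML_BLOCK.finditer(text))
    sections = []
    for i, m in enumerate(matches):
        meta = yaml.safe_load(m.group(1)) or {}
        body_end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append({
            "title": str(meta.get("title", f"section-{i}")),
            "content": text[m.end():body_end].strip(),
            "meta": meta,
        })
    return sections


def parse_official_llms_txt(text: str) -> list[dict]:
    """Official format: split on H2 headings, falling back to H1."""
    level = "## " if re.search(r"^## ", text, re.MULTILINE) else "# "
    parts = re.split(rf"^{re.escape(level)}", text, flags=re.MULTILINE)
    sections = []
    for part in parts[1:]:  # parts[0] is any preamble before the first heading
        title, _, body = part.partition("\n")
        sections.append({"title": title.strip(), "content": body.strip(), "meta": {}})
    return sections


def parse_llms_txt(text: str) -> list[dict]:
    """Try the YAML-block variant first; fall back to official headings."""
    return parse_yaml_block_llms_txt(text) or parse_official_llms_txt(text)
```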
**Quick Validation:**
- [ ] `docs_search` returns tiny items (no content blobs), `k<=20`.
- [ ] `docs_get` respects `max_bytes` and preserves order.
- [ ] Indexing AI SDK `llms.txt` completes and `docCount>0`.
**Success Criteria:**
- [ ] `uv run llms-txt-mcp --sources https://ai-sdk.dev/llms.txt` produces working tools.
- [ ] Search p95 <150ms (local, indexed); `docs_get` p95 <50ms single section.
- [ ] Schemas stay small and deterministic.
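To make the tool surface concrete, a minimal FastMCP sketch of `docs_search` and `docs_get` (`docs_sources`, `docs_refresh`, and the `depth` parameter are omitted for brevity). The in-memory `SECTIONS` dict and the substring match are placeholders for the Chroma-backed search; the byte-cap loop illustrates the 60KB default and order preservation, not the final implementation.
```python
# Sketch of the Phase 1 tool surface using the FastMCP decorator API from `mcp`.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("llms-txt-mcp")
SECTIONS: dict[str, dict] = {}  # id (canonical URL + anchor) -> {title, content, host}


@mcp.tool()
def docs_search(q: str, hosts: list[str] | None = None, k: int = 5) -> list[dict]:
    """Return small hit objects (id, title, snippet); never full content blobs."""
    k = min(k, 20)
    hits = []
    for section_id, s in SECTIONS.items():
        if hosts and s["host"] not in hosts:
            continue
        if q.lower() in s["content"].lower():  # placeholder for the embedding search
            hits.append({"id": section_id, "source": s["host"], "title": s["title"],
                         "snippet": s["content"][:200], "score": 1.0})
            if len(hits) >= k:
                break
    return hits


@mcp.tool()
def docs_get(ids: list[str], max_bytes: int = 60000, merge: bool = False) -> dict:
    """Return requested sections in order, truncating once max_bytes is spent."""
    out, used, truncated = [], 0, False
    for section_id in ids:
        section = SECTIONS.get(section_id)
        if section is None:
            continue
        remaining = max_bytes - used
        if remaining <= 0:
            truncated = True
            break
        raw = section["content"].encode()
        if len(raw) > remaining:
            truncated = True
        chunk = raw[:remaining].decode(errors="ignore")
        used += len(chunk.encode())
        out.append({"id": section_id, "title": section["title"], "content": chunk})
    if merge:
        return {"merged": "\n\n".join(s["content"] for s in out),
                "ids": [s["id"] for s in out], "truncated": truncated}
    return {"sections": out, "truncated": truncated}


if __name__ == "__main__":
    mcp.run()
```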
---
### Phase 2: Enhancement/Extension
**Goals:**
- Improve reliability and extend storage options without expanding the tool surface.
**Phase Scope:**
- **Must Have**
- Disk persistence: `--store=disk --store-path PATH` via Chroma PersistentClient.
- Stale doc removal on reindex per source (remove ids no longer present).
- Background TTL refresh triggered by tool calls (non‑blocking).
- **Can Defer**
- Search diversity (MMR), per‑host caps in results.
- **Won’t Do**
- Exposing alternate vector backends publicly; the storage abstraction stays internal.
**Implementation:**
- Add Chroma `PersistentClient` wiring; guard writes; delete stale entries by `source_url`.
- Schedule a non-blocking refresh when `now - last_indexed > ttl` on `docs_search`/`docs_get` calls (see the sketch below).
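A minimal sketch of the freshness and stale-removal path, assuming each source carries `etag`, `last_modified`, and `last_indexed` state; the `Source` dataclass, `reindex` callback, and collection handling are illustrative rather than final.
```python
# Sketch of TTL + ETag/Last-Modified revalidation, stale removal, and the
# fire-and-forget background refresh triggered from tool calls.
import asyncio
import time
from dataclasses import dataclass

import httpx


@dataclass
class Source:
    url: str
    etag: str | None = None
    last_modified: str | None = None
    last_indexed: float = 0.0
    ttl_seconds: float = 24 * 3600


async def fetch_if_changed(client: httpx.AsyncClient, src: Source) -> str | None:
    """Return the new body, or None if the server answered 304 Not Modified."""
    headers = {}
    if src.etag:
        headers["If-None-Match"] = src.etag
    if src.last_modified:
        headers["If-Modified-Since"] = src.last_modified
    resp = await client.get(src.url, headers=headers, timeout=30)
    if resp.status_code == 304:
        return None
    resp.raise_for_status()
    src.etag = resp.headers.get("ETag")
    src.last_modified = resp.headers.get("Last-Modified")
    return resp.text


def remove_stale(collection, source_url: str, fresh_ids: set[str]) -> None:
    """Delete ids for this source that the re-parsed file no longer contains."""
    existing = collection.get(where={"source_url": source_url})["ids"]
    stale = [i for i in existing if i not in fresh_ids]
    if stale:
        collection.delete(ids=stale)


def maybe_schedule_refresh(src: Source, reindex) -> None:
    """Called from docs_search/docs_get; runs the refresh without blocking."""
    # Assumes we are already on the server's running event loop.
    if time.time() - src.last_indexed > src.ttl_seconds:
        asyncio.create_task(reindex(src))
```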
**Quick Validation:**
- [ ] Restart preserves index if `--store=disk`.
- [ ] Reindex removes stale content; no orphaned hits.
- [ ] Tool latency unaffected by background refresh.
**Success Criteria:**
- [ ] Persistent runs behave the same as ephemeral with data retained.
- [ ] Stale removal verified by changing source content.
- [ ] Background refresh does not block tool responses.
---
### Phase 3: Polish/Integration (if needed)
**Goals:**
- Add optional capabilities and prepare for broader ingestion.
**Phase Scope:**
- **Must Have**
- None, if P1–P2 already meet needs.
- **Can Defer**
- Doc‑site ingestion (`https://ai-sdk.dev/docs/...`) into same chunk schema.
- Diversity (MMR), per‑host caps, version pinning.
- **Won’t Do**
- Over‑optimization or large tool surface expansion.
**Implementation:**
- Introduce a `site` ingest pipeline that crawls allowed roots and reuses the unified index.
**Quick Validation:**
- [ ] Site‑ingested chunks interoperate seamlessly with `docs_search`/`docs_get`.
- [ ] No regressions to llms.txt flows.
**Success Criteria:**
- [ ] Cross‑source answers (AI SDK + Tailwind + shadcn + Next.js) from one search.
- [ ] Payload caps and schemas remain stable.
---
## Key Technical Decisions
### Architecture Choices
- **Chroma (ephemeral by default)**: Zero‑ops locally; optional persistence later. Simple metadata filtering and a single unified collection.
- **Human‑readable IDs**: Canonical URLs (+ anchors) for transparent, copy‑pasteable references (both choices are sketched below).
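A minimal sketch of these two choices together, assuming sections arrive as `{title, content}` dicts and anchors are slugged from titles; the slugging rule and metadata keys are illustrative.
```python
# Sketch: one unified Chroma "docs" collection, gte-small embeddings, host
# metadata for filtering, and canonical-URL IDs.
import chromadb
from sentence_transformers import SentenceTransformer

client = chromadb.EphemeralClient()            # Phase 2: chromadb.PersistentClient(path=...)
docs = client.get_or_create_collection("docs")
model = SentenceTransformer("thenlper/gte-small")


def index_sections(source_url: str, host: str, sections: list[dict]) -> None:
    """Upsert [{title, content}] sections under human-readable IDs."""
    ids = [f"{source_url}#{s['title'].lower().replace(' ', '-')}" for s in sections]
    embeddings = model.encode([s["content"] for s in sections], normalize_embeddings=True)
    docs.upsert(
        ids=ids,  # canonical URL + anchor, copy-pasteable
        documents=[s["content"] for s in sections],
        embeddings=embeddings.tolist(),
        metadatas=[{"host": host, "source_url": source_url, "title": s["title"]}
                   for s in sections],
    )


def search(q: str, hosts: list[str] | None = None, k: int = 5) -> dict:
    """Query the unified collection, optionally restricted to certain hosts."""
    return docs.query(
        query_embeddings=model.encode([q], normalize_embeddings=True).tolist(),
        n_results=min(k, 20),
        where={"host": {"$in": hosts}} if hosts else None,
    )
```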
### Implementation Patterns
- **Follow Existing Conventions**: Search‑first → get flow; tiny tool schemas; deterministic shapes.
- **Error Handling**: Return `{ "error": "message" }` rather than raising for tool errors; validate inputs; enforce a strict allowlist of URLs ending with `/llms.txt` (sketched after this list).
- **Testing Strategy**: Unit tests for parsers; mocked HTTP (ETag/304); index/retrieval integration; performance smoke tests on large `llms.txt`.
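A small sketch of the allowlist check and error envelope; helper names and messages are illustrative.
```python
# Sketch of the strict allowlist and uniform error-envelope conventions above.
from urllib.parse import urlparse

ALLOWED_SOURCES: set[str] = set()  # filled from positional URLs / --sources at startup


def validate_source(url: str) -> str | None:
    """Return an error message, or None if the URL passes the allowlist rules."""
    if not urlparse(url).path.endswith("/llms.txt"):
        return "source must point at a /llms.txt file"
    if url not in ALLOWED_SOURCES:
        return "source is not on the configured allowlist"
    return None


def tool_error(message: str) -> dict:
    """Uniform error shape returned by tools instead of raising."""
    return {"error": message}
```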
### Dependencies and Integrations
- **External**: `httpx`, `sentence-transformers` (default `thenlper/gte-small`), `chromadb`, `mcp`.
- **Database Changes**: None (Chroma collection only).
- **Frontend Impact**: None; tool consumers are MCP clients (e.g., Claude Code).
## Risk Mitigation
- **Rollback Plan**: Disable server entry or revert CLI flags; ephemeral mode leaves no on‑disk state.
- **Compatibility**: Small, stable schemas; keep tool list minimal; avoid breaking param renames.
- **Testing**: Per‑phase validation; CI for unit/integration tests; perf checks on large fixtures.
- **Monitoring**: Log fetch/index times, chunk counts, query latency, and response sizes (a small helper is sketched below).
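A stdlib-only sketch of the kind of timing/log helper the monitoring bullet implies; names are illustrative.
```python
# Tiny timing/log helper for fetch/index durations and payload counters.
import logging
import time
from contextlib import contextmanager

log = logging.getLogger("llms-txt-mcp")


@contextmanager
def timed(stage: str, **extra):
    """Log a stage's duration plus counters such as chunk count or payload size."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log.info("%s took %.1fms %s", stage, (time.perf_counter() - start) * 1000, extra)


# Usage:
# with timed("index", source=url, chunks=len(sections)):
#     index_sections(url, host, sections)
```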
## Success Metrics
**Technical:**
- [ ] Search p95 <150ms; get p95 <50ms (local, indexed)
- [ ] Payload cap enforced (default 60KB) and configurable
- [ ] Parser handles AI SDK YAML‑block and official llms.txt accurately
**Business:**
- [ ] Reduced token usage vs prior tools (no TOC/full‑file dumps)
- [ ] Faster doc lookup → improved developer throughput
- [ ] Deterministic/fresh results (TTL + ETag)
**Quality:**
- [ ] No regressions in tool outputs
- [ ] All tests pass (unit + integration)
- [ ] README and CLI help are up‑to‑date
---
## Implementation Notes
### Solo Dev Context (for future you)
- **Key files**: `src/llms_txt_mcp/server.py` (server + tools), `README.md`, `docs/implementation-plan.md`.
- **Gotchas**: AI SDK files are huge, so stream the fetch (sketch below); YAML frontmatter varies between sections; keep section slugs stable across reindexes; prefer Chroma `upsert` over `add` when reindexing to avoid duplicate-ID errors; embedding time dominates indexing.
- **Quick wins**: Host filter in search; byte cap in get; preindex on launch for instant first query.
- **Debug shortcuts**: Log counts and durations; test against a local cached `llms.txt` to iterate quickly.
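A streaming-fetch sketch for the huge-file gotcha, using `httpx` streaming. The block-boundary heuristic assumes the YAML-block variant, where a bare `---` line opening the next frontmatter block also closes the previous section; verify against the real file.
```python
# Streaming fetch + incremental parse sketch, so the whole 25K+ line document
# never sits in memory before sections are emitted.
import httpx


async def stream_sections(url: str, emit) -> int:
    """Stream the file and call emit(section_text) per section; return the count."""
    count = 0
    buffer: list[str] = []
    in_frontmatter = False
    async with httpx.AsyncClient(timeout=30) as client:
        async with client.stream("GET", url) as resp:
            resp.raise_for_status()
            async for line in resp.aiter_lines():
                if line.strip() == "---":
                    if in_frontmatter:
                        in_frontmatter = False       # closing delimiter of a block
                    else:
                        if buffer:                   # new block starts: flush previous section
                            emit("\n".join(buffer))
                            count += 1
                            buffer = []
                        in_frontmatter = True
                buffer.append(line)
    if buffer:
        emit("\n".join(buffer))
        count += 1
    return count
```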
### Before Starting
- [ ] Review MCP tool shapes for minimality
- [ ] Confirm CLI flags match README
- [ ] Identify acceptance tests for AI SDK `llms.txt`
### During Implementation
- [ ] Commit at end of each phase
- [ ] Run unit/integration tests locally
- [ ] Update README and examples as you go
- [ ] Monitor logs for fetch/index timings
### Phase Completion Criteria
- ✅ Working end‑to‑end
- ✅ Tested (unit + integration)
- ✅ Documented
- ✅ Committable
### Quality Gates
- [ ] No breaking changes to existing functionality
- [ ] Lint + type checks pass
- [ ] Integration tests pass
- [ ] Performance targets maintained
---
## Template Usage Guidelines (kept for reference)
- Use multiple phases for complex/high‑risk changes; otherwise prefer fewer phases with complete states.
- Each phase must add user value; avoid setup‑only phases.
- Plan for early feedback; keep tool schemas tiny and stable.