
# Documentation MCP Server (RAG-enabled)

An MCP server that provides AI assistants with semantic search access to any documentation. Uses RAG (Retrieval-Augmented Generation) with vector embeddings to find relevant content based on meaning, not just keywords.

## Features

- **Semantic Search** - Find relevant content by meaning, not just keyword matching
- **Recursive Chunking** - Documents are split by headers (H1 → H2 → H3) for precise retrieval
- **Low Token Usage** - Returns ~500-1000 tokens per query instead of a full 15k+ token document
- **Vector Embeddings** - Uses OpenRouter (text-embedding-3-small) via the Apify proxy
- **PostgreSQL + pgvector** - Supabase for scalable vector storage

## Architecture

```
Scrape → Parse HTML → Recursive Chunk → Generate Embeddings → Store in Supabase
                                    ↓
User Query → Embed Query → Vector Similarity Search → Return Relevant Chunks
```
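The ingestion half of that flow can be sketched in TypeScript. This is a minimal sketch, not the Actor's actual code: `chunkByHeaders()` and `embedText()` are hypothetical helpers standing in for the chunker and the OpenRouter embedding call.

```typescript
// Illustrative sketch of the ingestion pipeline. The two declared helpers
// are hypothetical; only the table shape matches the schema in Setup below.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

interface Chunk { heading: string; content: string; }

// Hypothetical: split HTML into header-scoped chunks (see "How Chunking Works").
declare function chunkByHeaders(html: string): Chunk[];
// Hypothetical: call the embedding model (text-embedding-3-small, 1536 dims).
declare function embedText(text: string): Promise<number[]>;

async function ingest(url: string): Promise<void> {
  // Scrape → Parse HTML
  const html = await (await fetch(url)).text();

  // Recursive Chunk → Generate Embeddings → Store in Supabase
  const chunks = chunkByHeaders(html);
  for (const [i, chunk] of chunks.entries()) {
    const embedding = await embedText(chunk.content);
    await supabase.from("doc_chunks").insert({
      doc_id: url,
      heading: chunk.heading,
      content: chunk.content,
      chunk_index: i,
      embedding, // stored in the vector(1536) column
    });
  }
}
```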

## Available Tools

| Tool | Description | Token Usage |
|------|-------------|-------------|
| `list_docs` | List all docs with metadata (id, title, summary, sections) | ~200 |
| `search` | Semantic search across all docs | ~500-1000 |
| `get_doc_overview` | Get document summary and section structure | ~200 |
| `get_section` | Get content of a specific section | ~500-1000 |
| `get_chunks` | Paginated access to all chunks in a doc | ~500-1000 |
| `search_in_doc` | Semantic search within a specific document | ~500-1000 |
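For reference, here is how a client could invoke the `search` tool over the HTTP transport using the official `@modelcontextprotocol/sdk`. A sketch under assumptions: the `query` argument name is a guess, so list the tools first to confirm the real input schema.

```typescript
// Sketch: calling the `search` tool from a TypeScript MCP client.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(
  new URL("https://<YOUR_ACTOR_URL>/mcp"),
  { requestInit: { headers: { Authorization: `Bearer ${process.env.APIFY_TOKEN}` } } }
);

const client = new Client({ name: "docs-client", version: "1.0.0" });
await client.connect(transport);

console.log(await client.listTools()); // confirm tool names and input schemas

const result = await client.callTool({
  name: "search",
  arguments: { query: "authentication best practices" }, // assumed argument shape
});
console.log(result.content);
```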

## Setup

### 1. Create Supabase Project

1. Go to supabase.com and create a project
2. Go to SQL Editor and run the schema below
3. Copy your Project URL and anon key from Settings → API

### 2. Supabase SQL Schema

Run this in your Supabase SQL Editor:

```sql
-- Enable pgvector extension
create extension if not exists vector;

-- Document metadata table
create table doc_metadata (
  id text primary key,
  title text not null,
  url text not null,
  summary text,
  sections text[],
  total_chunks integer,
  type text,
  created_at timestamp default now()
);

-- Chunks table with vector embeddings
create table doc_chunks (
  id uuid primary key default gen_random_uuid(),
  doc_id text references doc_metadata(id) on delete cascade,
  doc_title text,
  doc_url text,
  section_path text[],
  heading text,
  content text not null,
  token_count integer,
  chunk_index integer,
  embedding vector(1536),
  created_at timestamp default now()
);

-- Function for similarity search
create or replace function search_chunks(
  query_embedding vector(1536),
  match_count int default 5,
  filter_doc_id text default null
)
returns table (
  id uuid,
  doc_id text,
  doc_title text,
  section_path text[],
  heading text,
  content text,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    dc.id,
    dc.doc_id,
    dc.doc_title,
    dc.section_path,
    dc.heading,
    dc.content,
    1 - (dc.embedding <=> query_embedding) as similarity
  from doc_chunks dc
  where (filter_doc_id is null or dc.doc_id = filter_doc_id)
  order by dc.embedding <=> query_embedding
  limit match_count;
end;
$$;

-- Optional: Add index for large datasets (1000+ chunks)
-- create index doc_chunks_embedding_idx
--   on doc_chunks using ivfflat (embedding vector_cosine_ops)
--   with (lists = 100);
```
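Once the schema is in place, `search_chunks` can be exercised directly from TypeScript through Supabase's RPC interface. A minimal sketch, assuming the query embedding comes from the same 1536-dimension model used at ingest time:

```typescript
// Sketch: querying search_chunks via Supabase RPC.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function search(queryEmbedding: number[], docId?: string) {
  const { data, error } = await supabase.rpc("search_chunks", {
    query_embedding: queryEmbedding, // 1536-dim vector from the embedding model
    match_count: 5,
    filter_doc_id: docId ?? null,    // null searches across all docs
  });
  if (error) throw error;
  return data; // [{ id, doc_id, doc_title, section_path, heading, content, similarity }]
}
```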

### 3. Configure Environment Variables

Set these in your Apify Actor or use Apify secrets:

```bash
# Add secrets (recommended)
apify secrets add supabaseKey "your-supabase-anon-key"
```

In `.actor/actor.json`:

```json
{
  "environmentVariables": {
    "SUPABASE_URL": "https://your-project.supabase.co",
    "SUPABASE_KEY": "@supabaseKey"
  }
}
```

### 4. Configure Documentation Sources

| Input | Description | Default |
|-------|-------------|---------|
| Start URLs | Documentation pages to scrape | Apify SDK docs |
| Max Pages | Maximum pages to scrape (1-1000) | 100 |
| Force Refresh | Re-scrape even if data exists | `false` |

Example input:

{ "startUrls": [ { "url": "https://docs.example.com/getting-started" }, { "url": "https://docs.example.com/api-reference" } ], "maxPages": 200, "forceRefresh": false }

### 5. Deploy and Connect

```bash
# Push to Apify
apify push

# Add to Claude Code (get URL from Actor output)
claude mcp add api-docs https://<YOUR_ACTOR_URL>/mcp \
  --transport http \
  --header "Authorization: Bearer <YOUR_APIFY_TOKEN>"
```

## Usage Examples

Once connected, your AI assistant can:

"Search the docs for authentication best practices" → Returns relevant chunks from multiple documents "Show me the overview of the API reference doc" → Returns summary and section list "Get the 'Getting Started' section from doc-1" → Returns specific section content "What documentation is available?" → Returns list of all docs with summaries

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Server status and setup instructions |
| `/mcp` | POST | MCP protocol endpoint |
| `/refresh-docs` | POST | Re-scrape and update all documentation |
| `/stats` | GET | Document and chunk statistics |
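The non-MCP endpoints are plain HTTP, so they can be scripted directly. A sketch; whether these endpoints require the same Bearer token as `/mcp` is an assumption.

```typescript
// Sketch: checking index size and forcing a refresh over plain HTTP.
const base = "https://<YOUR_ACTOR_URL>";
const headers = { Authorization: `Bearer ${process.env.APIFY_TOKEN}` }; // assumed auth

const stats = await fetch(`${base}/stats`, { headers });
console.log(await stats.json()); // document and chunk counts

// Trigger a re-scrape of all configured documentation sources.
await fetch(`${base}/refresh-docs`, { method: "POST", headers });
```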

## How Chunking Works

Documents are split recursively by headers:

```
Document (15k tokens)
├── H1: Introduction (chunk 1, ~600 tokens)
├── H2: Getting Started
│   ├── H3: Installation (chunk 2, ~400 tokens)
│   └── H3: Configuration (chunk 3, ~500 tokens)
├── H2: API Reference
│   ├── H3: Methods (chunk 4, ~700 tokens)
│   └── H3: Examples (chunk 5, ~600 tokens)
└── ...
```

- Target chunk size: 500-800 tokens
- Max chunk size: 1000 tokens
- Min chunk size: 100 tokens (smaller sections are merged with their parent)
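A minimal sketch of this splitting strategy in TypeScript (illustrative, not the Actor's implementation; the 4-characters-per-token estimate is an assumption):

```typescript
const MAX_TOKENS = 1000;
const MIN_TOKENS = 100;

interface Section { heading: string; body: string; children: Section[]; }
interface Chunk { heading: string; content: string; tokenCount: number; }

// Rough token estimate (~4 characters per token) -- an assumption.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

// Full text of a section including all nested subsections.
const flatten = (s: Section): string =>
  [s.heading, s.body, ...s.children.map(flatten)].join("\n");

function chunkSection(section: Section, chunks: Chunk[] = []): Chunk[] {
  const full = flatten(section);

  // Fits in one chunk (or has no sub-headers left to split by): emit it.
  if (estimateTokens(full) <= MAX_TOKENS || section.children.length === 0) {
    chunks.push({ heading: section.heading, content: full, tokenCount: estimateTokens(full) });
    return chunks;
  }

  // Too large: recurse H1 → H2 → H3, merging tiny subsections into the parent.
  let parentText = [section.heading, section.body].join("\n");
  for (const child of section.children) {
    if (estimateTokens(flatten(child)) < MIN_TOKENS) {
      parentText += "\n" + flatten(child); // small section merged with parent
    } else {
      chunkSection(child, chunks);
    }
  }
  chunks.push({ heading: section.heading, content: parentText, tokenCount: estimateTokens(parentText) });
  return chunks;
}
```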

## Pricing

This Actor uses pay-per-event pricing through Apify. Costs include:

- **Scraping** - Initial crawl of the documentation
- **Embeddings** - Generated via OpenRouter (charged to your Apify account)
- **Tool calls** - Each MCP tool invocation

## Development

```bash
# Install dependencies
npm install

# Run locally
npm run start:dev

# Build
npm run build

# Push to Apify
apify push
```

## License

ISC
