
# Documentation MCP Server (RAG-enabled)

An MCP server that provides AI assistants with semantic search access to any documentation. Uses RAG (Retrieval-Augmented Generation) with vector embeddings to find relevant content based on meaning, not just keywords.

## Features

- **Semantic Search** - Find relevant content by meaning, not just keyword matching
- **Recursive Chunking** - Documents are split by headers (H1 → H2 → H3) for precise retrieval
- **Low Token Usage** - Returns ~500-1000 tokens per query instead of a full 15k+ token document
- **Vector Embeddings** - Uses OpenRouter (text-embedding-3-small) via the Apify proxy
- **PostgreSQL + pgvector** - Supabase for scalable vector storage

## Architecture

```
Scrape → Parse HTML → Recursive Chunk → Generate Embeddings → Store in Supabase
                                    ↓
User Query → Embed Query → Vector Similarity Search → Return Relevant Chunks
```
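The ingestion half of that flow can be sketched in TypeScript. This is a minimal sketch, not the Actor's actual code: `chunkByHeaders()` and `embedText()` are hypothetical helpers standing in for the chunker and the OpenRouter embedding call.

```typescript
// Illustrative sketch of the ingestion pipeline. The two declared helpers
// are hypothetical; only the table shape matches the schema in Setup below.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

interface Chunk { heading: string; content: string; }

// Hypothetical: split HTML into header-scoped chunks (see "How Chunking Works").
declare function chunkByHeaders(html: string): Chunk[];
// Hypothetical: call the embedding model (text-embedding-3-small, 1536 dims).
declare function embedText(text: string): Promise<number[]>;

async function ingest(url: string): Promise<void> {
  // Scrape → Parse HTML
  const html = await (await fetch(url)).text();

  // Recursive Chunk → Generate Embeddings → Store in Supabase
  const chunks = chunkByHeaders(html);
  for (const [i, chunk] of chunks.entries()) {
    const embedding = await embedText(chunk.content);
    await supabase.from("doc_chunks").insert({
      doc_id: url,
      heading: chunk.heading,
      content: chunk.content,
      chunk_index: i,
      embedding, // stored in the vector(1536) column
    });
  }
}
```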

## Available Tools

| Tool | Description | Token Usage |
|------|-------------|-------------|
| `list_docs` | List all docs with metadata (id, title, summary, sections) | ~200 |
| `search` | Semantic search across all docs | ~500-1000 |
| `get_doc_overview` | Get document summary and section structure | ~200 |
| `get_section` | Get content of a specific section | ~500-1000 |
| `get_chunks` | Paginated access to all chunks in a doc | ~500-1000 |
| `search_in_doc` | Semantic search within a specific document | ~500-1000 |
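For reference, here is how a client could invoke the `search` tool over the HTTP transport using the official `@modelcontextprotocol/sdk`. A sketch under assumptions: the `query` argument name is a guess, so list the tools first to confirm the real input schema.

```typescript
// Sketch: calling the `search` tool from a TypeScript MCP client.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const transport = new StreamableHTTPClientTransport(
  new URL("https://<YOUR_ACTOR_URL>/mcp"),
  { requestInit: { headers: { Authorization: `Bearer ${process.env.APIFY_TOKEN}` } } }
);

const client = new Client({ name: "docs-client", version: "1.0.0" });
await client.connect(transport);

console.log(await client.listTools()); // confirm tool names and input schemas

const result = await client.callTool({
  name: "search",
  arguments: { query: "authentication best practices" }, // assumed argument shape
});
console.log(result.content);
```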

## Setup

### 1. Create Supabase Project

1. Go to supabase.com and create a project
2. Go to SQL Editor and run the schema below
3. Copy your Project URL and anon key from Settings → API

### 2. Supabase SQL Schema

Run this in your Supabase SQL Editor:

```sql
-- Enable pgvector extension
create extension if not exists vector;

-- Document metadata table
create table doc_metadata (
  id text primary key,
  title text not null,
  url text not null,
  summary text,
  sections text[],
  total_chunks integer,
  type text,
  created_at timestamp default now()
);

-- Chunks table with vector embeddings
create table doc_chunks (
  id uuid primary key default gen_random_uuid(),
  doc_id text references doc_metadata(id) on delete cascade,
  doc_title text,
  doc_url text,
  section_path text[],
  heading text,
  content text not null,
  token_count integer,
  chunk_index integer,
  embedding vector(1536),
  created_at timestamp default now()
);

-- Function for similarity search
create or replace function search_chunks(
  query_embedding vector(1536),
  match_count int default 5,
  filter_doc_id text default null
)
returns table (
  id uuid,
  doc_id text,
  doc_title text,
  section_path text[],
  heading text,
  content text,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    dc.id,
    dc.doc_id,
    dc.doc_title,
    dc.section_path,
    dc.heading,
    dc.content,
    1 - (dc.embedding <=> query_embedding) as similarity
  from doc_chunks dc
  where (filter_doc_id is null or dc.doc_id = filter_doc_id)
  order by dc.embedding <=> query_embedding
  limit match_count;
end;
$$;

-- Optional: Add index for large datasets (1000+ chunks)
-- create index doc_chunks_embedding_idx
--   on doc_chunks using ivfflat (embedding vector_cosine_ops)
--   with (lists = 100);
```
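Once the schema is in place, `search_chunks` can be exercised directly from TypeScript through Supabase's RPC interface. A minimal sketch, assuming the query embedding comes from the same 1536-dimension model used at ingest time:

```typescript
// Sketch: querying search_chunks via Supabase RPC.
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

async function search(queryEmbedding: number[], docId?: string) {
  const { data, error } = await supabase.rpc("search_chunks", {
    query_embedding: queryEmbedding, // 1536-dim vector from the embedding model
    match_count: 5,
    filter_doc_id: docId ?? null,    // null searches across all docs
  });
  if (error) throw error;
  return data; // [{ id, doc_id, doc_title, section_path, heading, content, similarity }]
}
```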

### 3. Configure Environment Variables

Set these in your Apify Actor or use Apify secrets:

```bash
# Add secrets (recommended)
apify secrets add supabaseKey "your-supabase-anon-key"
```

In `.actor/actor.json`:

```json
{
  "environmentVariables": {
    "SUPABASE_URL": "https://your-project.supabase.co",
    "SUPABASE_KEY": "@supabaseKey"
  }
}
```

### 4. Configure Documentation Sources

| Input | Description | Default |
|-------|-------------|---------|
| Start URLs | Documentation pages to scrape | Apify SDK docs |
| Max Pages | Maximum pages to scrape (1-1000) | 100 |
| Force Refresh | Re-scrape even if data exists | `false` |

Example input:

{ "startUrls": [ { "url": "https://docs.example.com/getting-started" }, { "url": "https://docs.example.com/api-reference" } ], "maxPages": 200, "forceRefresh": false }

### 5. Deploy and Connect

```bash
# Push to Apify
apify push

# Add to Claude Code (get URL from Actor output)
claude mcp add api-docs https://<YOUR_ACTOR_URL>/mcp \
  --transport http \
  --header "Authorization: Bearer <YOUR_APIFY_TOKEN>"
```

## Usage Examples

Once connected, your AI assistant can:

"Search the docs for authentication best practices" → Returns relevant chunks from multiple documents "Show me the overview of the API reference doc" → Returns summary and section list "Get the 'Getting Started' section from doc-1" → Returns specific section content "What documentation is available?" → Returns list of all docs with summaries

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Server status and setup instructions |
| `/mcp` | POST | MCP protocol endpoint |
| `/refresh-docs` | POST | Re-scrape and update all documentation |
| `/stats` | GET | Document and chunk statistics |
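The non-MCP endpoints are plain HTTP, so they can be scripted directly. A sketch; whether these endpoints require the same Bearer token as `/mcp` is an assumption.

```typescript
// Sketch: checking index size and forcing a refresh over plain HTTP.
const base = "https://<YOUR_ACTOR_URL>";
const headers = { Authorization: `Bearer ${process.env.APIFY_TOKEN}` }; // assumed auth

const stats = await fetch(`${base}/stats`, { headers });
console.log(await stats.json()); // document and chunk counts

// Trigger a re-scrape of all configured documentation sources.
await fetch(`${base}/refresh-docs`, { method: "POST", headers });
```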

## How Chunking Works

Documents are split recursively by headers:

```
Document (15k tokens)
├── H1: Introduction (chunk 1, ~600 tokens)
├── H2: Getting Started
│   ├── H3: Installation (chunk 2, ~400 tokens)
│   └── H3: Configuration (chunk 3, ~500 tokens)
├── H2: API Reference
│   ├── H3: Methods (chunk 4, ~700 tokens)
│   └── H3: Examples (chunk 5, ~600 tokens)
└── ...
```

- Target chunk size: 500-800 tokens
- Max chunk size: 1000 tokens
- Min chunk size: 100 tokens (smaller sections are merged with their parent)
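A minimal sketch of this splitting strategy in TypeScript (illustrative, not the Actor's implementation; the 4-characters-per-token estimate is an assumption):

```typescript
const MAX_TOKENS = 1000;
const MIN_TOKENS = 100;

interface Section { heading: string; body: string; children: Section[]; }
interface Chunk { heading: string; content: string; tokenCount: number; }

// Rough token estimate (~4 characters per token) -- an assumption.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

// Full text of a section including all nested subsections.
const flatten = (s: Section): string =>
  [s.heading, s.body, ...s.children.map(flatten)].join("\n");

function chunkSection(section: Section, chunks: Chunk[] = []): Chunk[] {
  const full = flatten(section);

  // Fits in one chunk (or has no sub-headers left to split by): emit it.
  if (estimateTokens(full) <= MAX_TOKENS || section.children.length === 0) {
    chunks.push({ heading: section.heading, content: full, tokenCount: estimateTokens(full) });
    return chunks;
  }

  // Too large: recurse H1 → H2 → H3, merging tiny subsections into the parent.
  let parentText = [section.heading, section.body].join("\n");
  for (const child of section.children) {
    if (estimateTokens(flatten(child)) < MIN_TOKENS) {
      parentText += "\n" + flatten(child); // small section merged with parent
    } else {
      chunkSection(child, chunks);
    }
  }
  chunks.push({ heading: section.heading, content: parentText, tokenCount: estimateTokens(parentText) });
  return chunks;
}
```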

## Pricing

This Actor uses pay-per-event pricing through Apify. Costs include:

- **Scraping** - Initial crawl of the documentation
- **Embeddings** - Generated via OpenRouter (charged to your Apify account)
- **Tool calls** - Each MCP tool invocation

## Development

```bash
# Install dependencies
npm install

# Run locally
npm run start:dev

# Build
npm run build

# Push to Apify
apify push
```

## License

ISC
