Documentation Fetcher & RAG Search
A modular system for fetching API documentation and enabling semantic search via RAG (Retrieval-Augmented Generation). Designed to give AI coding assistants like Claude access to up-to-date documentation from any project.
Features
Fetch Documentation: Download complete documentation from API providers in markdown format
Semantic Search: Hybrid search combining vector embeddings with keyword matching
MCP Server: Expose search as tools accessible from Claude Code in any project
Modular Design: Easy to add new documentation sources
Supported Documentation Sources
Source | Documents | Description |
Gemini | ~2000 | Google Gemini API - LLM, function calling, embeddings, multimodal |
FastMCP | ~1900 | FastMCP framework - MCP servers, tools, resources, authentication |
Quick Start
Prerequisites
Python 3.12+
Ollama with bge-m3 model
Claude Code (for MCP integration)
Installation
# Clone the repository
git clone <repository-url>
cd documentation
# Create virtual environment
python3.12 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Pull the embedding model
ollama pull bge-m3

Fetch & Index Documentation
# Fetch documentation
python -m src.main fetch gemini
python -m src.main fetch fastmcp
# Index for search (requires Ollama running)
python -m src.rag.index gemini
python -m src.rag.index fastmcp

Search Documentation
# Search Gemini docs
python -m src.main search "function calling"
# Search FastMCP docs
python -m src.main search "how to create a tool" -c fastmcp
# More results
python -m src.main search "rate limits" -n 10

MCP Server Integration
The MCP server exposes documentation search as tools that Claude Code can use from any project.
Install in Claude Code
IMPORTANT: MCP configuration requires absolute paths. The cwd field is NOT supported by Claude Code.
Option 1: Using Claude CLI (recommended)
# Replace /path/to/documentation with your actual absolute path
claude mcp add docs-search --scope user --transport stdio -- \
/path/to/documentation/.venv/bin/python \
/path/to/documentation/src/mcp_server.py

Option 2: Add to ~/.claude.json manually
{
  "mcpServers": {
    "docs-search": {
      "command": "/path/to/documentation/.venv/bin/python",
      "args": ["/path/to/documentation/src/mcp_server.py"]
    }
  }
}

Common mistakes to avoid:
Do NOT use cwd - it is not a valid MCP configuration field
Do NOT use relative paths - they resolve from the caller's directory
Do NOT use -m src.mcp_server - it requires running from the project directory
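If you edit ~/.claude.json by hand, a quick sanity check for the mistakes above might look like this (a hypothetical helper, not shipped with the project; it assumes the mcpServers layout shown in Option 2):

```python
import json
from pathlib import Path

def check_mcp_config(config_text: str, name: str) -> list[str]:
    """Return a list of problems found in one mcpServers entry."""
    entry = json.loads(config_text)["mcpServers"][name]
    problems = []
    if "cwd" in entry:
        problems.append("'cwd' is not a valid MCP configuration field")
    # The command and any file-path args must be absolute
    paths = [entry.get("command", "")] + entry.get("args", [])
    for p in paths:
        if p and not Path(p).is_absolute():
            problems.append(f"relative path: {p}")
    return problems

good = '{"mcpServers": {"docs-search": {"command": "/abs/.venv/bin/python", "args": ["/abs/src/mcp_server.py"]}}}'
bad = '{"mcpServers": {"docs-search": {"command": "python", "args": ["-m", "src.mcp_server"]}}}'
```

Running check_mcp_config on the second entry flags both the relative interpreter path and the -m invocation.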
Verify Installation
# Check server is registered
claude mcp list
# In Claude Code, check connection status
/mcp

Available Tools
Tool | Description |
| Search documentation with hybrid semantic + keyword search |
| List available documentation collections |
Available Resources
Resource URI | Description |
| JSON list of all collections |
| List of all Gemini documentation pages |
| List of all FastMCP documentation pages |
| Search tips for Gemini docs |
| Search tips for FastMCP docs |
Usage from Claude Code
Once installed, you can ask Claude from any project:
"Search the gemini docs for function calling"
"What documentation collections are available?"
"Search fastmcp for how to create tools"
"Find rate limit information in gemini docs"
Project Structure
documentation/
├── src/
│ ├── main.py # CLI entry point
│ ├── mcp_server.py # MCP server for Claude Code
│ ├── core/
│ │ ├── fetcher.py # HTTP/markdown fetching
│ │ └── parser.py # Navigation parsing
│ ├── modules/
│ │ ├── base.py # Abstract base class
│ │ ├── gemini/ # Gemini documentation module
│ │ └── fastmcp/ # FastMCP documentation module
│ └── rag/
│ ├── chunker.py # Markdown-aware chunking
│ ├── embedder.py # Ollama bge-m3 embeddings
│ ├── sqlite_store.py # SQLite + sqlite-vec vector store
│ ├── search.py # Hybrid search with RRF
│ ├── query_expander.py # Multi-query expansion (LLM)
│ ├── reranker.py # Cross-encoder reranking
│ └── index.py # Indexing CLI
├── output/ # Fetched documentation
│ ├── gemini/
│ └── fastmcp/
├── data/
│ └── docs.db # SQLite vector database
├── requirements.txt
└── README.md

Adding New Documentation Sources
Create a new module in src/modules/<name>/:
# src/modules/example/config.py
BASE_URL = "https://docs.example.com"
SITEMAP_URL = "https://docs.example.com/sitemap.xml"
MARKDOWN_SUFFIX = ".md"  # or ".md.txt" for Google sites

# src/modules/example/module.py
from src.modules.base import BaseModule

class ExampleModule(BaseModule):
    @property
    def name(self) -> str:
        return "example"

    def get_doc_urls(self) -> list[NavLink]:
        # Parse sitemap or navigation
        ...

    def fetch_page(self, url: str) -> str:
        # Fetch markdown content
        ...

Register in src/main.py:
from src.modules.example.module import ExampleModule
# In fetch_command():
elif args.module == "example":
    module = ExampleModule()
    module.run(output_dir)

Add to KNOWN_COLLECTIONS in src/mcp_server.py.

Fetch and index:
python -m src.main fetch example
python -m src.rag.index example

How It Works
Fetching
Parse navigation/sitemap to discover documentation pages
Fetch each page in markdown format (using source-specific tricks like the .md.txt suffix)
Save with source URL metadata
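The suffix trick can be sketched as a small helper (to_markdown_url is a hypothetical name for illustration, not a function in the codebase):

```python
from urllib.parse import urlparse

def to_markdown_url(page_url: str, suffix: str = ".md.txt") -> str:
    """Map a documentation page URL to its raw-markdown variant.

    Some sites (e.g. Google's) serve a page's markdown source when a
    suffix such as ".md.txt" is appended to the path.
    """
    parsed = urlparse(page_url)
    path = parsed.path.rstrip("/") or "/index"
    return f"{parsed.scheme}://{parsed.netloc}{path}{suffix}"

print(to_markdown_url("https://ai.google.dev/gemini-api/docs/function-calling"))
```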
Indexing
Chunk markdown by headers (preserving code blocks)
Generate embeddings via Ollama bge-m3 (1024 dimensions)
Store in SQLite with sqlite-vec (vectors) and FTS5 (keywords)
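As an illustration, header-based chunking that never splits inside a code fence might look like this (a simplified sketch, not the project's actual chunker.py):

```python
def chunk_by_headers(markdown: str) -> list[str]:
    """Split markdown into chunks at headers, keeping code blocks intact."""
    chunks: list[list[str]] = [[]]
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        # Start a new chunk at each header, unless inside a fenced code block
        if line.startswith("#") and not in_fence and chunks[-1]:
            chunks.append([])
        chunks[-1].append(line)
    return ["\n".join(c) for c in chunks if any(s.strip() for s in c)]
```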
Searching
Generate query embedding
Perform semantic search (sqlite-vec vector similarity)
Perform keyword search (FTS5 BM25)
Combine with Reciprocal Rank Fusion (RRF)
Optionally expand query with LLM variations
Optionally rerank with cross-encoder
Return ranked results with source URLs
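Reciprocal Rank Fusion scores each document 1/(k + rank) in every list it appears in, so hits ranked well by both semantic and keyword search rise to the top. A minimal sketch, using the common default k = 60 (the project's actual constant may differ):

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked result lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Sum reciprocal-rank contributions across all result lists
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears in both lists, so it outranks every single-list hit
merged = rrf_merge([["a", "b", "c"], ["b", "d"]])
```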
Configuration
Environment Variables
Variable | Description | Default |
| Ollama server URL | |
SQLite Database
Vector database stored in data/docs.db. Each documentation source gets its own collection within the database.
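The keyword half of the store can be sketched with Python's built-in sqlite3 and an FTS5 table (the schema below is illustrative, not the project's actual one; the vector half additionally requires loading the sqlite-vec extension):

```python
import sqlite3

# In-memory demo of the FTS5/BM25 keyword side of the store
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE chunks USING fts5(collection, content)")
db.executemany(
    "INSERT INTO chunks VALUES (?, ?)",
    [
        ("gemini", "Function calling lets the model invoke declared tools."),
        ("gemini", "Embeddings map text to 1024-dimensional vectors."),
    ],
)
# bm25() returns a relevance score; lower means more relevant in SQLite
rows = db.execute(
    "SELECT content FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks)",
    ("function calling",),
).fetchall()
```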
Development
# Run tests
python -m pytest
# Check MCP server
claude mcp list
# Test search functionality
python -m src.rag.search

Troubleshooting
"Ollama connection failed"
# Make sure Ollama is running
ollama serve
# Pull the embedding model
ollama pull bge-m3

"No results found"
# Check if collection is indexed
python -m src.rag.index --status gemini
# Re-index if needed
python -m src.rag.index --clear gemini

MCP server not connecting
# Check server status
claude mcp list
# Reinstall
claude mcp remove docs-search
fastmcp install claude-code src/mcp_server.py --name docs-search

License
MIT
Credits
Ollama - Local LLM and embeddings
sqlite-vec - Vector search for SQLite
FastMCP - MCP server framework