Provides tools for searching and retrieving Apache Spark documentation, enabling full-text keyword searches with section filtering and access to the full content of documentation pages.
MCP Spark Documentation Server
An MCP (Model Context Protocol) server that provides search and retrieval tools for Apache Spark documentation. This server enables AI assistants like Claude to search and read Spark documentation directly.
Features
Full-text search using SQLite FTS5 with BM25 ranking and Porter stemming
Section filtering to narrow search results by documentation category
Sparse checkout for efficient cloning of only the docs directory from apache/spark
Docker support for portable deployment across projects
STDIO transport for seamless MCP client integration
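To make the full-text search feature concrete, here is a minimal sketch of an FTS5 index with Porter stemming, BM25 ordering, and an optional section filter, built with Python's standard `sqlite3` module. The table name `docs`, its columns, the function names, and the sample pages are all assumptions for illustration; the project's actual schema may differ.

```python
import sqlite3


def build_index(pages):
    """Create an in-memory FTS5 index from (path, section, content) rows.

    The schema is hypothetical: only `content` is tokenised, and the
    Porter stemmer wraps the default unicode61 tokenizer.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5("
        "path UNINDEXED, section UNINDEXED, content, "
        "tokenize='porter unicode61')"
    )
    conn.executemany(
        "INSERT INTO docs (path, section, content) VALUES (?, ?, ?)", pages
    )
    return conn


def search(conn, query, section=None, limit=10):
    """Return (path, section, score) rows, best match first.

    bm25() yields lower values for better matches, so ascending
    order puts the most relevant page at the top.
    """
    sql = "SELECT path, section, bm25(docs) FROM docs WHERE docs MATCH ?"
    params = [query]
    if section is not None:
        sql += " AND section = ?"
        params.append(section)
    sql += " ORDER BY bm25(docs) LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    # Illustrative sample rows, not real index contents.
    sample = [
        ("window.md", "sql-ref", "Window functions operate on a group of rows."),
        ("tuning.md", "tuning", "Tuning memory usage and garbage collection."),
    ]
    for row in search(build_index(sample), "windows", section="sql-ref"):
        print(row)
```

Because of stemming, a query for `windows` also matches pages that only contain `window` or `windowed`. The sketch requires a Python build whose SQLite library includes the FTS5 extension.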
Quick Start
Using Docker (Recommended)
# Build the Docker image (includes pre-indexed documentation)
make docker-build
# Test the server
make docker-run
Using uv (Local Development)
# Initialise the environment
make init
# Build the documentation index
make index
# Run the server
make run
Configuration
Claude Code / Claude Desktop
Add to your .mcp.json or global settings:
{
  "mcpServers": {
    "spark-documentation": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "martoc/mcp-spark-documentation:latest"]
    }
  }
}
For a locally built Docker image:
{
  "mcpServers": {
    "spark-documentation": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "mcp-spark-documentation"]
    }
  }
}
For local development without Docker:
{
  "mcpServers": {
    "spark-documentation": {
      "command": "uv",
      "args": ["run", "mcp-spark-documentation"],
      "cwd": "/path/to/mcp-spark-documentation"
    }
  }
}
MCP Tools
Tool | Description |
search_documentation | Search Spark documentation by keyword query with optional section filtering |
read_documentation | Retrieve the full content of a specific documentation page |
search_documentation
Search Apache Spark documentation using full-text search with stemming support.
Parameter | Type | Required | Default | Description |
| string | Yes | - | Search terms (supports stemming) |
| string | No | None | Filter by section (e.g., sql-ref, streaming, mllib) |
| integer | No | 10 | Maximum results (1-50) |
Common Sections: sql-ref, api, streaming, mllib, graphx, structured-streaming, configuration, tuning
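As an illustration of how a client might call this tool, the sketch below builds an MCP `tools/call` JSON-RPC request and checks the documented limit range (1-50, default 10) before sending. The argument names `query`, `section`, and `limit` are assumptions, since the parameter names are not shown in the table above; the function name `search_request` is also hypothetical.

```python
from itertools import count

# Monotonic JSON-RPC request ids for this sketch.
_request_ids = count(1)


def search_request(query, section=None, limit=10):
    """Build a tools/call request for search_documentation.

    Argument names (query, section, limit) are assumed, not confirmed
    by the server's published schema.
    """
    if not query.strip():
        raise ValueError("query must not be empty")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")

    arguments = {"query": query, "limit": limit}
    if section is not None:
        # Only send the filter when one was requested.
        arguments["section"] = section

    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": "search_documentation", "arguments": arguments},
    }


if __name__ == "__main__":
    print(search_request("window functions", section="sql-ref"))
```

In practice an MCP client library handles this step, and the real argument names should be taken from the server's `tools/list` response rather than from this sketch.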
read_documentation
Retrieve the full content of a documentation page.
Parameter | Type | Required | Description |
| string | Yes | Relative path to document (from search results) |
CLI Commands
# Build/rebuild the documentation index
uv run spark-docs-index index
uv run spark-docs-index index --rebuild
uv run spark-docs-index index --branch master
# Show index statistics
uv run spark-docs-index stats
Development
make init # Initialise development environment
make build # Run full build (lint, typecheck, test)
make test # Run tests with coverage
make format # Format code
make lint # Run linter
make typecheck # Run type checker
Documentation
USAGE.md - Detailed usage instructions
CODESTYLE.md - Code style guidelines
CLAUDE.md - Claude Code instructions
Licence
This project is licensed under the MIT Licence - see the LICENSE file for details.