ContextMine
Indexes FastAPI documentation sites, enabling search and retrieval of documentation content.
Indexes GitHub repositories for hybrid search and code intelligence, including symbol extraction and structural navigation.
Allows pushing coverage reports from GitHub Actions workflows to ContextMine for real metrics.
What is ContextMine?
ContextMine indexes your documentation and code repositories, making them searchable via the Model Context Protocol (MCP). Connect it to Claude Desktop, Cursor, or any MCP-compatible AI assistant to provide rich context for code understanding, documentation lookup, and codebase exploration.
Key features:
Hybrid search - Full-text + vector similarity with RRF ranking for accurate retrieval
Deep research agent - Multi-step AI agent with LSP and Tree-sitter for complex codebase questions
Code intelligence - Symbol extraction, code outlines, and structural navigation via Tree-sitter
Architecture Cockpit - Read-only extracted Twin views per collection/scenario (Overview, Topology, Deep Dive, C4 Diff, Exports)
Strict real metrics - File-level LOC/complexity/coupling/coverage for GitHub sources with explicit availability status
Web crawling - Index documentation sites automatically
Git indexing - Index GitHub repositories with incremental updates
Self-hosted - Your data stays on your infrastructure
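To make the "RRF ranking" in the hybrid search feature concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked result lists. The function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default for RRF) are assumptions for illustration; the README does not state how ContextMine implements or tunes the fusion.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores 1 / (k + rank) for every list it appears in.
    k=60 is an assumed default, not a value taken from ContextMine.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a full-text and a vector search.
full_text_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([full_text_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why fusion tends to be more robust than either ranking alone.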
Deep Research Agent
The deep research agent goes beyond simple search to answer complex questions about your codebase. It uses an iterative approach with multiple tools:
Tool | Description |
Hybrid Search | BM25 + vector similarity search with RRF ranking |
LSP Go to Definition | Jump to symbol definitions across files |
LSP Find References | Find all usages of a symbol |
LSP Hover | Get type information and documentation |
Tree-sitter Outline | Extract file structure (classes, functions, methods) |
Tree-sitter Find Symbol | Locate symbols by name pattern |
Graph Traversal | Navigate call graphs and dependencies |
The agent collects evidence from multiple sources, verifies findings, and synthesizes a comprehensive answer with citations.
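As a rough illustration of what an outline tool returns, the sketch below extracts classes, functions, and methods with line numbers from Python source. It uses Python's standard-library `ast` module as a stand-in; ContextMine itself uses Tree-sitter, and the function name `outline` and the output shape are assumptions, not the project's API.

```python
import ast


def outline(source: str):
    """Return (kind, qualified_name, line) tuples for classes and functions.

    Stand-in for a Tree-sitter outline; handles Python source only and
    only descends into class and function bodies.
    """
    entries = []

    def visit(node, prefix=""):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                entries.append(("class", prefix + child.name, child.lineno))
                visit(child, prefix + child.name + ".")
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if isinstance(node, ast.ClassDef) else "function"
                entries.append((kind, prefix + child.name, child.lineno))
                visit(child, prefix + child.name + ".")

    visit(ast.parse(source))
    return entries


SAMPLE = '''class Handler:
    def login(self):
        pass

def main():
    pass
'''

for kind, name, line in outline(SAMPLE):
    print(f"{line:>3}  {kind:<8} {name}")
```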
Quick Start
Choose your deployment method:
Docker Compose (recommended for local development)
Kubernetes (Helm) (recommended for production)
Docker Compose
# Clone the repository
git clone https://github.com/mayflower/contextmine.git
cd contextmine
# Copy environment template and configure
cp .env.example .env
# Edit .env with your API keys (see Configuration section)
# Start all services
docker compose up -d
# Run database migrations
docker compose exec api sh -c "cd /app/packages/core && alembic upgrade head"
Kubernetes (Helm)
For production deployments, use the Helm chart from GHCR:
# Create a values file with your configuration
cat > my-values.yaml << EOF
api:
  image:
    repository: ghcr.io/mayflower/contextmine-api
    tag: latest
worker:
  image:
    repository: ghcr.io/mayflower/contextmine-worker
    tag: latest
config:
  publicBaseUrl: "https://contextmine.example.com"
secrets:
  github:
    clientId: "your-github-client-id"
    clientSecret: "your-github-client-secret"
  sessionSecret: "$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
  tokenEncryptionKey: "$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
  openaiApiKey: "sk-..."
EOF
# Install from OCI registry
helm install contextmine oci://ghcr.io/mayflower/contextmine -f my-values.yaml
# Access the application
kubectl port-forward svc/contextmine-api 8000:8000
See deploy/helm/contextmine/README.md for full configuration options.
2. Create Your First Collection
Open the admin UI at http://localhost:8000
Log in with GitHub OAuth
Create a new Collection (e.g., "My Docs")
Add a Source:
Web: Enter a documentation URL (e.g., https://docs.python.org/3/)
GitHub: Enter owner/repo (e.g., fastapi/fastapi)
Click Sync to start indexing
3. Connect Your AI Assistant
Configure your MCP client to connect to ContextMine. Authentication is handled via GitHub OAuth automatically.
Claude Desktop (~/.config/claude/claude_desktop_config.json on Linux, ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"contextmine": {
"url": "http://localhost:8000/mcp"
}
}
}
When you first connect, your MCP client will redirect to GitHub for authentication.
Cursor: Settings → MCP → Add server with URL http://localhost:8000/mcp
4. Start Using It
In your AI assistant, you can now:
Search the FastAPI docs for information about dependency injection
What authentication methods does this codebase support?
Show me the outline of src/auth/handlers.py
Architecture Cockpit (Extracted Views)
The web app includes an Architecture Cockpit for project/collection-level Twin inspection in the browser.
Views
Overview - City KPIs and hotspot analysis.
Topology - Layered architecture graph view.
Deep Dive - Large graph slices for dependency/control-flow inspection.
Evolution - Investment/Utilization, Knowledge Islands, Temporal Coupling, Fitness Functions.
C4 Diff - AS-IS / TO-BE Mermaid compare with selectable C4 view level.
Exports - Generate cc_json, cx2, jgf, lpg_jsonl, mermaid_c4.
C4 View Controls
GET /api/twin/collections/{collection_id}/views/mermaid supports:
c4_view=context|container|component|code|deployment
c4_scope (optional focus selector for component/code/deployment)
max_nodes (diagram cap for large code/deployment views)
The response includes warnings (and in compare mode as_is_warnings / to_be_warnings) when views are generated in best-effort mode due to sparse source signals.
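The query parameters above can be assembled as in the following sketch, which only builds the request URL and does not send it. The helper name `mermaid_view_url`, the collection ID `abc`, and the `c4_scope` value are hypothetical; the README does not document the format of `c4_scope` or how the request is authenticated.

```python
from urllib.parse import urlencode

# View levels listed for the c4_view parameter.
C4_VIEWS = {"context", "container", "component", "code", "deployment"}


def mermaid_view_url(base_url, collection_id, c4_view, c4_scope=None, max_nodes=None):
    """Build the Mermaid C4 view URL for a collection (hypothetical helper)."""
    if c4_view not in C4_VIEWS:
        raise ValueError(f"unsupported c4_view: {c4_view!r}")
    params = {"c4_view": c4_view}
    if c4_scope is not None:
        params["c4_scope"] = c4_scope  # value format is an assumption
    if max_nodes is not None:
        params["max_nodes"] = max_nodes
    path = f"/api/twin/collections/{collection_id}/views/mermaid"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


url = mermaid_view_url(
    "http://localhost:8000", "abc", "component", c4_scope="src/auth", max_nodes=200
)
print(url)
```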
Real Metrics Semantics
Overview uses GET /api/twin/collections/{collection_id}/views/city and reads:
{
"summary": {
"metric_nodes": 120,
"coverage_avg": 71.4,
"complexity_avg": 9.8,
"coupling_avg": 3.2,
"change_frequency_avg": 4.6,
"churn_avg": 21.3
},
"metrics_status": {
"status": "ready|unavailable",
"reason": "ok|no_real_metrics|awaiting_ci_coverage|coverage_ingest_failed",
"strict_mode": true
},
"hotspots": [
{
"node_natural_key": "file:src/main.py",
"loc": 210,
"symbol_count": 12,
"coverage": 73.2,
"complexity": 16.1,
"coupling": 5,
"change_frequency": 8,
"churn": 46
}
]
}
Rules:
ready: real metric snapshots are available.
unavailable: no valid real metrics for the selected scenario.
reason=awaiting_ci_coverage: structural metrics are ready, coverage has not been ingested yet.
reason=coverage_ingest_failed: the latest coverage ingest job failed or was rejected.
UI shows N/A for unavailable KPI values (not placeholder 0.00).
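A client consuming the city view might apply these rules roughly as sketched below. The helper names are hypothetical, and the sketch assumes that an unavailable KPI arrives as `null` (or is missing) in `summary`, which the README does not confirm.

```python
# Hints paraphrase the documented reason values.
REASON_HINTS = {
    "ok": "Real metric snapshots are available.",
    "no_real_metrics": "No valid real metrics for the selected scenario.",
    "awaiting_ci_coverage": "Structural metrics are ready; coverage has not been ingested yet.",
    "coverage_ingest_failed": "The latest coverage ingest job failed or was rejected.",
}


def format_kpi(value):
    """Render a KPI, showing N/A instead of a placeholder 0.00."""
    return "N/A" if value is None else f"{value:.1f}"


def describe_metrics(city_view):
    """Summarize availability and selected KPIs from a city view response."""
    status = city_view.get("metrics_status", {})
    reason = status.get("reason", "no_real_metrics")
    summary = city_view.get("summary", {})
    return {
        "ready": status.get("status") == "ready",
        "hint": REASON_HINTS.get(reason, f"Unknown reason: {reason}"),
        "coverage_avg": format_kpi(summary.get("coverage_avg")),  # assumes null when unavailable
        "complexity_avg": format_kpi(summary.get("complexity_avg")),
    }


# Hypothetical response while CI has not pushed coverage yet.
example = {
    "summary": {"coverage_avg": None, "complexity_avg": 9.8},
    "metrics_status": {
        "status": "unavailable",
        "reason": "awaiting_ci_coverage",
        "strict_mode": True,
    },
}
print(describe_metrics(example))
```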
GitHub Actions Coverage Ingest (CI Push)
Coverage is no longer discovered from repository files. CI pushes raw coverage reports to ContextMine.
One-time setup
Identify your GitHub source ID (/api/collections/{collection_id}/sources).
Rotate the ingest token once as source owner:
POST /api/sources/{source_id}/metrics/coverage-ingest-token/rotate
Store returned token in GitHub Secrets as CONTEXTMINE_INGEST_TOKEN.
Store source ID in GitHub Secrets as CONTEXTMINE_SOURCE_ID.
GitHub Actions example
- name: Push coverage to ContextMine
  if: always()
  env:
    CONTEXTMINE_URL: https://contextmine.example.com
    CONTEXTMINE_SOURCE_ID: ${{ secrets.CONTEXTMINE_SOURCE_ID }}
    CONTEXTMINE_INGEST_TOKEN: ${{ secrets.CONTEXTMINE_INGEST_TOKEN }}
  run: |
    curl --fail-with-body \
      -X POST "$CONTEXTMINE_URL/api/sources/$CONTEXTMINE_SOURCE_ID/metrics/coverage-ingest" \
      -H "X-ContextMine-Ingest-Token: $CONTEXTMINE_INGEST_TOKEN" \
      -F "commit_sha=${{ github.sha }}" \
      -F "branch=${{ github.ref_name }}" \
      -F "workflow_run_id=${{ github.run_id }}" \
      -F "provider=github_actions" \
      -F "reports=@coverage/lcov.info" \
      -F "reports=@coverage/coverage.xml"
Notes:
commit_sha must exactly match the current source cursor SHA.
Multiple report files are supported and merged by file-level average.
Supported protocols (Core 6): lcov, Cobertura XML, JaCoCo XML, Clover/PHPUnit XML, OpenCover XML, generic-file-coverage-v1 JSON.
Check job status via GET /api/sources/{source_id}/metrics/coverage-ingest/{job_id}.
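One plausible reading of "merged by file-level average" is sketched below: each report contributes a coverage percentage per file, and a file's merged value is the unweighted mean over the reports that mention it. This is an assumption about the semantics, not ContextMine's actual merge code; the function name and the sample paths are hypothetical.

```python
from collections import defaultdict


def merge_by_file_average(reports):
    """Merge per-file coverage percentages from several reports.

    Assumes an unweighted mean over the reports that contain each file;
    the real merge may weight by line counts or treat gaps differently.
    """
    values = defaultdict(list)
    for report in reports:
        for path, percent in report.items():
            values[path].append(percent)
    return {path: sum(vals) / len(vals) for path, vals in values.items()}


# Hypothetical per-file coverage parsed from two uploaded reports.
from_lcov = {"src/main.py": 80.0, "src/auth.py": 50.0}
from_cobertura = {"src/main.py": 60.0}
merged = merge_by_file_average([from_lcov, from_cobertura])
print(merged)
```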
Available MCP Tools
Context Retrieval
Tool | Description |
| Primary search tool. Searches indexed content and returns relevant context as Markdown. Supports filtering by collection. |
| List available documentation collections |
| Browse documents in a collection |
Code Intelligence
Tool | Description |
| List all functions, classes, and methods in a file with line numbers |
| Get the source code of a specific function or class by name |
| Jump to where a symbol is defined (requires LSP) |
| Find all usages of a symbol for impact analysis (requires LSP) |
| Explore code relationships - what a function calls, what calls it, imports, etc. |
Advanced Research
Tool | Description |
| Multi-step AI agent for complex questions. Autonomously searches, reads code, and builds answers with citations. |
Configuration
Copy .env.example to .env and configure these variables:
Required
Variable | Description |
| PostgreSQL connection string (default works with docker compose) |
| GitHub OAuth app client ID |
| GitHub OAuth app secret |
| Secret for session cookies |
| Key for encrypting stored tokens |
| OpenAI API key for embeddings |
Optional
Variable | Description |
| Alternative to OpenAI for embeddings |
| For deep_research agent (uses Claude) |
| CORS origins for MCP in production |
| Docker Compose postgres image platform override (default: |
| Enforce strict real metrics gate for GitHub syncs (default: |
| Metrics language scope (default: |
| Max multipart payload size for CI coverage uploads (default: |
| Prefect flow name for async coverage ingest (default: |
Setting Up GitHub OAuth
Click New OAuth App
Fill in:
Application name: ContextMine (or your preferred name)
Homepage URL: http://localhost:8000
Authorization callback URL: http://localhost:8000/api/auth/callback
Copy the Client ID and Client Secret to your .env
Note: Both the admin UI and MCP clients use the same callback URL. The server automatically routes OAuth flows to the appropriate handler.
Generating Secure Keys
# Generate session secret
python -c "import secrets; print(secrets.token_urlsafe(32))"
# Generate encryption key
python -c "import secrets; print(secrets.token_urlsafe(32))"
Adding Sources
Web Documentation
Best for: API docs, guides, reference documentation
Create a collection in the admin UI
Add a source with type Web
Enter the base URL (e.g., https://docs.example.com/)
The crawler follows links within the same domain
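The same-domain rule can be pictured with the sketch below, which resolves links found on a page and keeps only those on the base URL's host. It is an illustration using Python's standard library, not the crawler ContextMine ships; the function name and the exact matching rule (exact host match, fragments dropped) are assumptions.

```python
from urllib.parse import urljoin, urlparse


def same_domain_links(base_url, page_url, hrefs):
    """Resolve hrefs against page_url and keep those on base_url's host."""
    base_host = urlparse(base_url).netloc.lower()
    kept = []
    for href in hrefs:
        parsed = urlparse(urljoin(page_url, href))
        if parsed.scheme in ("http", "https") and parsed.netloc.lower() == base_host:
            # Drop the fragment so anchors on the same page do not count as new URLs.
            kept.append(parsed._replace(fragment="").geturl())
    return kept


links = same_domain_links(
    "https://docs.example.com/",
    "https://docs.example.com/guide/",
    ["intro.html", "/api/", "https://other.example.org/x", "mailto:team@example.com"],
)
print(links)
```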
GitHub Repositories
Best for: Source code, README files, inline documentation
Add a source with type GitHub
Enter the repository as owner/repo
Optionally specify:
Branch: defaults to the default branch
Path filter: limit to specific directories (e.g., src/, docs/)
Code files are parsed for symbols (functions, classes, methods)
Supported languages for symbol extraction: Python, TypeScript, JavaScript, Go, Rust, Java, C, C++, Ruby, PHP
Supported languages for strict real metrics (Twin/City): Python, TypeScript, JavaScript, Java, PHP
Strict metrics gate behavior for GitHub sources:
Sync computes structural metrics (loc, complexity, coupling) without blocking on coverage.
Coverage is ingested asynchronously from CI and bound to exact commit SHA.
Coverage ingest is strict: invalid token/payload/SHA mismatch/path mismatch fails the ingest job.
City metrics become fully ready only after successful coverage ingest.
Architecture
┌───────────────────────────────┐     ┌─────────────┐
│      FastAPI + React SPA      │────▶│ PostgreSQL  │
│ /api/*  /mcp/*  /* (frontend) │     │    pg4ai    │
└───────────────────────────────┘     └─────────────┘
                │
         ┌──────┴──────┐
         ▼             ▼
    ┌─────────┐   ┌─────────┐
    │ Prefect │   │ spider  │
    │ Worker  │   │   _md   │
    └─────────┘   └─────────┘
API (apps/api): FastAPI serving REST API at /api/*, MCP at /mcp/*, and React frontend at /*
Web (apps/web): React admin console (built and served by API)
Worker (apps/worker): Background sync jobs using Prefect
Core (packages/core): Shared models, database, and utilities
Development
Prerequisites
Python 3.12+
Node.js 20+
uv for Python dependency management
Docker (for pg4ai: PostgreSQL + pgvector + Apache AGE)
Local Development Setup
# Start database
docker compose up -d postgres
# Optional: verify vector + graph capabilities in postgres
./scripts/docker/smoke-pg4ai.sh
# Install Python dependencies
uv sync --all-packages
# Run migrations
cd packages/core
DATABASE_URL=postgresql+asyncpg://contextmine:contextmine@localhost:5432/contextmine \
uv run alembic upgrade head
cd ../..
# Build frontend (one-time, or after frontend changes)
cd apps/web && npm install && npm run build && cd ../..
# Start API server (serves both API and frontend)
STATIC_DIR=apps/web/dist uv run uvicorn apps.api.app.main:app --reload --port 8000
For frontend development with hot reload, run the Vite dev server separately:
# Terminal 1: API server
uv run uvicorn apps.api.app.main:app --reload --port 8000
# Terminal 2: Frontend dev server (proxies API requests to :8000)
cd apps/web && npm run dev
Running Tests
# All tests
uv run pytest -v
# Specific test file
uv run pytest packages/core/tests/test_treesitter.py -v
# With coverage
uv run pytest --cov=contextmine_core --cov-report=term-missing
Code Quality
# Linting
uv run ruff check .
# Type checking
uvx ty check
# Auto-format
uv run ruff format .
# Pre-commit hooks
uv run pre-commit install
uv run pre-commit run --all-files
Container Images
Pre-built images are available from GitHub Container Registry:
docker pull ghcr.io/mayflower/contextmine-api:latest
docker pull ghcr.io/mayflower/contextmine-worker:latest
docker pull ghcr.io/mayflower/contextmine-web:latest
Troubleshooting
"No collections found" in MCP client
Ensure you've created at least one collection in the admin UI
Check that the collection visibility is set to Global (or you're authenticated)
Verify you've completed the GitHub OAuth flow when prompted by your MCP client
Sync not finding documents
Check the Prefect UI at http://localhost:4200 for job status
For GitHub sources, ensure the repository is accessible
For web sources, verify the URL is reachable and returns HTML
Symbols not being extracted
Symbol extraction works for supported languages only. Check that:
The file has a recognized extension (.py, .ts, .js, .go, etc.)
The sync has completed (symbols are extracted during sync)
Cockpit Overview shows N/A metrics
Inspect GET /api/twin/collections/{collection_id}/views/city.
Check metrics_status.reason:
awaiting_ci_coverage: CI has not pushed coverage yet.
coverage_ingest_failed: review ingest job diagnostics.
no_real_metrics: no structural metric snapshots were produced.
Verify latest ingest job:
GET /api/sources/{source_id}/metrics/coverage-ingest/{job_id}
Re-run CI upload with matching commit_sha=${{ github.sha }} and valid reports.
License
MIT License - see LICENSE for details.