Confluence RAG MCP Server
Provides RAG-based context retrieval from Confluence pages, allowing AI agents to search and retrieve relevant documentation from Confluence spaces.
Allows integration with GitHub Copilot to provide relevant context from Confluence documentation during code assistance.
Confluence RAG Data Pipeline with MCP Protocol
A Model Context Protocol (MCP) server that provides relevant context from Confluence pages using RAG (Retrieval Augmented Generation).
Features
Crawls Confluence spaces and pages
Stores document vectors using ChromaDB
Implements MCP protocol for context retrieval
Supports filtering by space, labels, and metadata
Handles attachments and comments
Provides REST API endpoints
Requirements
Python 3.9 or higher
UV for dependency management
Confluence API access token
ChromaDB for vector storage
Installation
Setup Python Environment:
Make sure you have Python 3.9 or higher installed:

    python --version

Install UV if you haven't already:

    curl -LsSf https://astral.sh/uv/install.sh | sh

Clone and Setup Project:

    git clone <repository-url>
    cd confluence-scraper-mcp
    # Create virtual environment
    uv venv .venv
    # Activate virtual environment
    source .venv/bin/activate
    # Install dependencies
    uv pip install -r requirements.txt

Configure Environment:
Create a .env file in the project root:

    touch .env

Add the following configuration (adjust values as needed):

    # Required settings
    CONFLUENCE_BASE_URL=https://your-domain.atlassian.net
    CONFLUENCE_TOKEN=your-api-token
    CONFLUENCE_SPACE_KEY=optional-space-key
    # Optional settings (with defaults)
    INITIAL_CRAWL=false
    CHROMA_PERSIST_DIR=./data/chroma
    EMBEDDING_MODEL="all-MiniLM-L6-v2"
    MAX_PAGES=1000
    INCLUDE_ATTACHMENTS=true
    INCLUDE_COMMENTS=true
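The settings above could be read into a typed object along the following lines. This is a minimal sketch, not the project's actual loader: the `Settings` class, `load_settings`, and `_as_bool` are hypothetical names, and the sketch assumes the variables are already present in the process environment (it does not parse the .env file itself).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings ("true", "1", "yes", "on") as True."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field names mirror the variables in this README.
    confluence_base_url: str
    confluence_token: str
    confluence_space_key: Optional[str]
    initial_crawl: bool
    chroma_persist_dir: str
    embedding_model: str
    max_pages: int
    include_attachments: bool
    include_comments: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables."""
    required = ("CONFLUENCE_BASE_URL", "CONFLUENCE_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return Settings(
        confluence_base_url=env["CONFLUENCE_BASE_URL"].rstrip("/"),
        confluence_token=env["CONFLUENCE_TOKEN"],
        confluence_space_key=env.get("CONFLUENCE_SPACE_KEY") or None,
        initial_crawl=_as_bool(env.get("INITIAL_CRAWL", "false")),
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "./data/chroma"),
        # Strip optional surrounding quotes, as in EMBEDDING_MODEL="all-MiniLM-L6-v2"
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2").strip('"'),
        max_pages=int(env.get("MAX_PAGES", "1000")),
        include_attachments=_as_bool(env.get("INCLUDE_ATTACHMENTS", "true")),
        include_comments=_as_bool(env.get("INCLUDE_COMMENTS", "true")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, and failing early on a missing URL or token avoids a half-configured crawl.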
Usage
Using uvx (Recommended):

    # Development mode with auto-reload
    uvx uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
    # Run tests
    uvx pytest
    # Code formatting and checks
    uvx black .
    uvx isort .
    uvx mypy .

Alternative: Using Virtual Environment:

    # Activate virtual environment
    source .venv/bin/activate
    # Then run commands as usual
    uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Initial Setup:

    # Start initial crawl of Confluence pages
    curl -X POST http://localhost:8000/crawl
    # Verify server health
    curl http://localhost:8000/health

Use the MCP API:

    # Get context for an LLM query
    curl -X POST http://localhost:8000/mcp/context \
      -H "Content-Type: application/json" \
      -d '{
        "messages": [{"role": "user", "content": "Tell me about project X"}],
        "query": "project X documentation",
        "max_context_length": 1000
      }'
    # The response will include relevant context from your Confluence pages

Monitor and Maintain:

    # View logs
    tail -f logs/app.log
    # Re-crawl Confluence (e.g., after updates)
    curl -X POST http://localhost:8000/crawl
API Endpoints
GET /health: Health check endpoint
POST /crawl: Trigger Confluence crawl
POST /mcp/context: Get relevant context for a query
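The POST /mcp/context endpoint can also be called from Python. The sketch below uses only the standard library and mirrors the payload of the curl example in this README; the helper names `build_context_request` and `fetch_context` are hypothetical, and the response is assumed to be a JSON object whose fields are not documented here, so it is returned undecoded beyond JSON parsing.

```python
import json
import urllib.request
from typing import Dict, List

BASE_URL = "http://localhost:8000"  # matches the uvicorn command in this README


def build_context_request(
    query: str,
    messages: List[Dict[str, str]],
    max_context_length: int = 1000,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /mcp/context request."""
    payload = {
        "messages": messages,
        "query": query,
        "max_context_length": max_context_length,
    }
    return urllib.request.Request(
        url=f"{base_url}/mcp/context",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_context(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and parse the body as JSON; needs a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_context_request(
    query="project X documentation",
    messages=[{"role": "user", "content": "Tell me about project X"}],
)
# With the server running: context = fetch_context(request)
```

Separating request construction from sending keeps the payload inspectable without a live server.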
Using with Code Assistants
This MCP server is specialized for Confluence documentation and uses RAG (Retrieval Augmented Generation) with ChromaDB, which makes it different from typical MCP servers in several ways:
Confluence Integration:
Direct integration with Confluence API
Handles Confluence-specific content types (pages, attachments, comments)
Preserves Confluence metadata (space keys, labels, authors)
Vector Search:
Uses ChromaDB for semantic search instead of traditional text search
Embeddings are generated using sentence transformers
More accurate context retrieval based on meaning, not just keywords
Filtering Capabilities:
Can filter by Confluence space keys
Supports label-based filtering
Can include/exclude attachments and comments
Configurable context length per endpoint
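The combination of metadata filtering and similarity ranking described above can be illustrated with a dependency-free toy. This is not how the server calls ChromaDB: `embed` is a word-count stand-in for a sentence-transformer model, the document dictionaries are made up, and matching any one of the requested labels is an assumption about how label filtering behaves.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Sequence


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(
    query: str,
    docs: Sequence[Dict],
    space_key: Optional[str] = None,
    labels: Optional[Sequence[str]] = None,
    top_k: int = 3,
) -> List[Dict]:
    """Filter by metadata first, then rank the survivors by similarity."""
    wanted = set(labels or [])
    candidates = [
        d for d in docs
        if (space_key is None or d["space_key"] == space_key)
        and (not wanted or wanted & set(d["labels"]))
    ]
    q = embed(query)
    ranked = sorted(candidates, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:top_k]


# Made-up documents carrying the kind of metadata the README mentions.
docs = [
    {"id": "1", "space_key": "API", "labels": ["api-reference"], "text": "rate limits for the public api"},
    {"id": "2", "space_key": "ARCH", "labels": ["design"], "text": "service architecture and api gateway design"},
    {"id": "3", "space_key": "API", "labels": ["technical-docs"], "text": "authentication tokens and api keys"},
]
results = search("api rate limits", docs, space_key="API")
print([d["id"] for d in results])  # ['1', '3']
```

Filtering before ranking keeps pages from other spaces out of the result entirely, however similar their text is to the query.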
This MCP server can be integrated with code assistants like GitHub Copilot to provide relevant context from your Confluence documentation. Here's how to set it up:
Start the MCP Server:
    # Make sure the server is running (activate the virtual environment first)
    source .venv/bin/activate
    uvicorn app.main:app --port 8000

Configure Your Code Assistant:
For GitHub Copilot:
Open VS Code settings (Cmd+,)
Search for "copilot chat"
Add a new MCP endpoint under "Copilot Chat: MCP Servers" using either:
Option 1: Direct URL
Use URL: http://localhost:8000/mcp/context
Note: This basic setup won't include filtering capabilities
Option 2: MCP Configuration File (Recommended)
An example configuration file is provided in examples/mcp.json
Supports Confluence-specific filtering
Can configure multiple endpoints for different spaces
Allows fine-tuning of context retrieval

    {
      "endpoints": [
        {
          "name": "API Documentation",
          "url": "http://localhost:8000/mcp/context",
          "options": {
            "max_context_length": 2000,
            "filter": {
              "space_key": "API",
              "labels": ["technical-docs", "api-reference"],
              "include_comments": true,
              "include_attachments": false,
              "semantic_ranking": {
                "weight": 0.7,
                "model": "all-MiniLM-L6-v2"
              }
            }
          },
          "authentication": {
            "type": "none"
          }
        },
        {
          "name": "Architecture Docs",
          "url": "http://localhost:8000/mcp/context",
          "options": {
            "max_context_length": 3000,
            "filter": {
              "space_key": "ARCH",
              "labels": ["architecture", "design"],
              "include_comments": false,
              "include_attachments": true,
              "semantic_ranking": {
                "weight": 0.8,
                "model": "all-MiniLM-L6-v2"
              }
            }
          },
          "authentication": {
            "type": "none"
          }
        }
      ],
      "default_endpoint": "API Documentation"
    }

Add the path to this file in VS Code settings under "Copilot Chat: MCP Configuration File"
See examples/mcp.json for a full example with multiple endpoints and filtering options
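Before pointing an editor at a configuration file like this, it can help to check that the file parses and that default_endpoint names an endpoint that exists. The sketch below is a hypothetical sanity check, not part of the project: `select_endpoint` is an invented helper, the embedded JSON is a trimmed copy of the example in this README, and how the code assistant itself consumes the file is not specified here.

```python
import json
from typing import Any, Dict

# Trimmed copy of the example configuration shown in this README.
EXAMPLE = """
{
  "endpoints": [
    {"name": "API Documentation", "url": "http://localhost:8000/mcp/context",
     "options": {"max_context_length": 2000, "filter": {"space_key": "API"}}},
    {"name": "Architecture Docs", "url": "http://localhost:8000/mcp/context",
     "options": {"max_context_length": 3000, "filter": {"space_key": "ARCH"}}}
  ],
  "default_endpoint": "API Documentation"
}
"""


def select_endpoint(config: Dict[str, Any], name: str = "") -> Dict[str, Any]:
    """Return the named endpoint, falling back to default_endpoint."""
    target = name or config.get("default_endpoint", "")
    by_name = {e["name"]: e for e in config.get("endpoints", [])}
    if target not in by_name:
        raise KeyError(f"Unknown endpoint {target!r}; known: {sorted(by_name)}")
    return by_name[target]


config = json.loads(EXAMPLE)
endpoint = select_endpoint(config)
print(endpoint["name"], endpoint["options"]["filter"]["space_key"])
```

To check a real file, replace `json.loads(EXAMPLE)` with `json.load(open("examples/mcp.json"))`.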
Usage with Copilot:
In VS Code, open Copilot Chat (Cmd+I)
Your queries will now include relevant context from your Confluence pages
Example: "How do I implement feature X?" will include context from related Confluence documentation
You can also use the /doc command in Copilot Chat to explicitly search documentation
Tips for Better Results:
Keep Confluence pages well-organized and up-to-date
Use descriptive titles and labels in Confluence
Re-crawl after significant documentation updates:
curl -X POST http://localhost:8000/crawl
Development
Install Development Dependencies:

    uv pip install -r requirements.txt

Using uvx for Development:
UV installs a command runner called uvx that runs command-line tools in a temporary, isolated environment, so there is no need to activate the virtual environment first. For commands that must import the project's own dependencies (for example uvicorn loading app.main, or pytest), uv run <command> runs them inside the project environment instead.

    # Run the FastAPI server
    uvx uvicorn app.main:app --reload
    # Run tests
    uvx pytest
    # Code formatting
    uvx black .
    uvx isort .
    uvx mypy .

Environment Configuration:
The project uses environment variables for configuration. Copy .env.example to .env and update the values:

    CONFLUENCE_BASE_URL=https://your-domain.atlassian.net
    CONFLUENCE_TOKEN=your-api-token
    CONFLUENCE_SPACE_KEY=your-space-key
    CHROMA_PERSIST_DIR=data/chroma
    CHROMA_COLLECTION_NAME=confluence_docs
    EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
    CHUNK_SIZE=512
    CHUNK_OVERLAP=50
    TOP_K=3
    SIMILARITY_THRESHOLD=0.7
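To give a feel for what CHUNK_SIZE, CHUNK_OVERLAP, TOP_K, and SIMILARITY_THRESHOLD typically control in a RAG pipeline, here is a small sketch. It rests on assumptions the README does not confirm: that chunk sizes are measured in characters (they may well be tokens), and that a higher similarity score means a better match. `chunk_text` and `select_hits` are hypothetical helpers.

```python
from typing import List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into chunk_size-character windows overlapping by chunk_overlap."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window reaches the end, so no chunk is pure overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


def select_hits(
    scored: Sequence[Tuple[str, float]],
    top_k: int = 3,
    threshold: float = 0.7,
) -> List[Tuple[str, float]]:
    """Keep hits scoring at least threshold, best first, capped at top_k."""
    kept = [hit for hit in scored if hit[1] >= threshold]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)[:top_k]


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
print(select_hits([("a", 0.9), ("b", 0.5), ("c", 0.75), ("d", 0.8), ("e", 0.71)]))
# [('a', 0.9), ('d', 0.8), ('c', 0.75)]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.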
Contributing
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Make your changes:
Use uvx black . and uvx isort . to format code
Use uvx mypy . for type checking
Add tests for new features
Update documentation as needed
Run tests (uvx pytest)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
License
MIT License. See LICENSE for more information.