confluence-rag-mcp

confluence-mcp

confluence-mcp is a standalone MCP server for retrieval from a pre-synchronized Confluence RAG index.

The server does not write the final user-facing answer. It returns indexed chunks, citations, diagnostics, and page bundles; the MCP client or agent writes the final answer.

Capabilities

  • Index Confluence pages selected by configured CQL.

  • Store documents and vectors in one local SQLite database with sqlite-vec.

  • Search the synchronized index semantically.

  • Return absolute Confluence URLs in citations.

  • Return a full page bundle only for pages already present in the index.

  • Live-hydrate the primary matching page from Confluence, with fallback to an indexed snapshot.

  • Incrementally sync changed pages, comments, attachments, and image metadata.

  • Expose only retrieval MCP tools.

Non-Goals

  • No arbitrary CQL from MCP clients.

  • No arbitrary live Confluence page fetch by MCP clients.

  • No writes to Confluence.

  • No final natural-language answer generation.

  • No external vector database runtime.

Architecture

  • confluence_client.py: Confluence REST client for CQL search, page hydration, comments, attachments, and images.

  • normalizer.py: Confluence storage HTML to Markdown.

  • chunker.py: searchable records and non-searchable page snapshots.

  • embeddings.py: OpenAI-compatible embeddings client.

  • sqlite_store.py: SQLite tables plus sqlite-vec virtual tables.

  • sync.py: reindex, reindex --all, sync, and sync --all.

  • rag.py: retrieval behavior for MCP tools.

  • server.py: FastMCP tool registration and stdio/HTTP transports.

  • cli.py: confluence-mcp command.

Install

python -m venv .venv
source .venv/bin/activate
python -m pip install -e .

On Git Bash for Windows, activate with:

source .venv/Scripts/activate

Configuration

Configuration precedence is:

  1. Real process environment variables.

  2. .env in the server process working directory.

  3. TOML config from --config, CONFLUENCE_MCP_CONFIG, or local confluence-mcp.toml.

Use an absolute CONFLUENCE_MCP_CONFIG path in editor extensions. Some clients start MCP servers from their own working directory, so local .env and relative TOML paths may not resolve as expected.
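The precedence order above can be pictured as a first-match lookup across three sources. The following is a minimal sketch, not the server's actual loader: the function name `resolve_setting` and the flat dictionaries are illustrative assumptions, and it treats any present key (even an empty string) as set.

```python
from typing import Mapping, Optional


def resolve_setting(
    key: str,
    process_env: Mapping[str, str],
    dotenv: Mapping[str, str],
    toml_values: Mapping[str, str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Return the first value found, checking sources from highest to lowest priority."""
    for source in (process_env, dotenv, toml_values):
        if key in source:
            return source[key]
    return default


# Hypothetical inputs: one setting defined everywhere, one only in .env and TOML.
process_env = {"EMBEDDINGS_MODEL": "from-process-env"}
dotenv = {"EMBEDDINGS_MODEL": "from-dotenv", "CONFLUENCE_URL": "https://conf.example.com/"}
toml_values = {"EMBEDDINGS_MODEL": "from-toml", "CONFLUENCE_URL": "https://toml.example.com/"}

print(resolve_setting("EMBEDDINGS_MODEL", process_env, dotenv, toml_values))
print(resolve_setting("CONFLUENCE_URL", process_env, dotenv, toml_values))
```

The first lookup is won by the process environment and the second by .env, because TOML is consulted last.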

Minimal TOML

Only Confluence access, embeddings access, and at least one CQL index are required:

[confluence]
base_url = "https://conf.example.com/"
api_token = ""

[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"

[[indexes]]
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"

When a single index is configured without a name, it is named default and is also used as the default index.

Full TOML

default_index = "confluence_main"
include_storage_html_debug = false

[confluence]
base_url = "https://conf.example.com/"
api_token = ""
auth_mode = "bearer"
username = ""
password = ""
verify_ssl = true

[sqlite]
path = ".confluence-mcp.sqlite"

[embeddings]
base_url = "https://openai-compatible.example.com/"
api_key = ""
model = "your-embedding-model"
batch_size = 64
allow_mock = false

[retrieval]
top_k_chunks = 10
min_relevance_score = 0.50
include_images = true
max_images = 100
max_total_image_bytes = 104857600
max_page_text_chars = 200000

[transport]
mode = "stdio"
host = "127.0.0.1"
port = 8000
path = "/mcp"

[[indexes]]
name = "confluence_main"
cql = "type = page AND (id = 136881206 OR ancestor = 136881206)"
space = "DOC"

Defaults

  • default_index: inferred when exactly one index is configured.

  • include_storage_html_debug: false.

  • confluence.auth_mode: bearer.

  • confluence.verify_ssl: true.

  • sqlite.path: .confluence-mcp.sqlite.

  • embeddings.batch_size: 64.

  • embeddings.allow_mock: false.

  • retrieval.top_k_chunks: 10.

  • retrieval.min_relevance_score: 0.50.

  • retrieval.include_images: true.

  • retrieval.max_images: 100.

  • retrieval.max_total_image_bytes: 104857600.

  • retrieval.max_page_text_chars: 200000.

  • transport.mode: stdio.

  • transport.host: 127.0.0.1.

  • transport.port: 8000.

  • transport.path: /mcp.

  • indexes[].name: default when omitted and no default_index is set.

Environment Variables

Environment variables are useful for secrets and deployment overrides. TOML is better for stable project structure.

Important variables:

  • CONFLUENCE_MCP_CONFIG: absolute path to TOML config.

  • CONFLUENCE_URL

  • CONFLUENCE_API_TOKEN

  • CONFLUENCE_USERNAME

  • CONFLUENCE_PASSWORD

  • CONFLUENCE_MCP_SQLITE_PATH

  • EMBEDDINGS_BASE_URL

  • EMBEDDINGS_API_KEY

  • EMBEDDINGS_MODEL

  • CONFLUENCE_DEFAULT_INDEX

  • CONFLUENCE_INDEX_NAME

  • CONFLUENCE_INDEX_CQL

  • RAG_MIN_RELEVANCE_SCORE

.env is a local convenience file: real environment variables override its values, and its values override TOML. Do not rely on .env for editor MCP clients unless you control the server working directory.

Authentication

Supported Confluence authentication modes:

  • Bearer token: api_token with auth_mode = "bearer"; this is the default.

  • Pre-encoded Basic header: api_token with auth_mode = "basic".

  • Username/password: username and password, sent with HTTP Basic auth.

For Confluence Cloud, prefer the token flow required by your organization. Do not commit credentials.
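The three modes differ only in the Authorization header that is sent. The sketch below shows one plausible mapping from the config fields to that header. The function name `auth_header` is hypothetical, and the choice to let username/password take priority over api_token is an assumption, not documented behavior.

```python
import base64
from typing import Dict


def auth_header(
    auth_mode: str = "bearer",
    api_token: str = "",
    username: str = "",
    password: str = "",
) -> Dict[str, str]:
    """Build an Authorization header for one of the three documented modes."""
    if username and password:
        # Username/password: standard HTTP Basic encoding of "user:password".
        raw = f"{username}:{password}".encode("utf-8")
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if auth_mode == "basic":
        # Pre-encoded Basic header: api_token is assumed to be base64 already.
        return {"Authorization": f"Basic {api_token}"}
    # Default mode: bearer token.
    return {"Authorization": f"Bearer {api_token}"}


print(auth_header(api_token="example-token"))
print(auth_header(auth_mode="basic", api_token="dXNlcjpzZWNyZXQ="))
```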

Embeddings

The embeddings endpoint must be OpenAI-compatible:

POST {EMBEDDINGS_BASE_URL}/v1/embeddings
Authorization: Bearer {EMBEDDINGS_API_KEY}

If EMBEDDINGS_BASE_URL already ends with /v1/, the server appends only embeddings.
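The URL rule can be sketched as a small helper. This is an illustration of the described behavior rather than the code in embeddings.py. The function name `embeddings_url` is an assumption, as is stripping trailing slashes to avoid a double slash.

```python
def embeddings_url(base_url: str) -> str:
    """Join a base URL with the OpenAI-compatible embeddings path."""
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        # The base already includes the version segment: append only "embeddings".
        return base + "/embeddings"
    return base + "/v1/embeddings"


print(embeddings_url("https://openai-compatible.example.com/"))
print(embeddings_url("https://openai-compatible.example.com/v1/"))
```

Both calls print the same URL, ending in /v1/embeddings.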

mock:// embeddings are only for local smoke tests:

EMBEDDINGS_BASE_URL=mock://embeddings
EMBEDDINGS_API_KEY=x
EMBEDDINGS_MODEL=mock
EMBEDDINGS_ALLOW_MOCK=true

Mock vectors are lexical, not semantic. Re-run reindex after changing the embeddings model, provider, or vector dimension.

Indexing

Full rebuild of one index:

confluence-mcp reindex --index confluence_main

Full rebuild of all indexes:

confluence-mcp reindex --all

Dry run:

confluence-mcp reindex --all --dry-run

Incremental sync of one index:

confluence-mcp sync --index confluence_main

Incremental sync of all indexes:

confluence-mcp sync --all

sync removes stale pages, reindexes changed pages, comments, attachments, and image metadata, then advances high_watermark_at only after successful writes. If no watermark exists, sync falls back to reindex.
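The watermark rule matters because it makes a failed sync safe to retry. The sketch below illustrates the idea under simplifying assumptions. The function `sync_changed_pages`, its arguments, and the in-memory page list are hypothetical and do not mirror sync.py.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional, Tuple


def sync_changed_pages(
    high_watermark_at: Optional[datetime],
    changed_pages: Iterable[Tuple[str, datetime]],
    write_page: Callable[[str], None],
) -> datetime:
    """Write every changed page, then return the new watermark."""
    if high_watermark_at is None:
        # No watermark yet: the caller should run a full reindex instead.
        raise LookupError("no watermark: run a full reindex instead")
    newest = high_watermark_at
    for page_id, updated_at in changed_pages:
        write_page(page_id)  # an exception here aborts before a watermark is returned
        newest = max(newest, updated_at)
    return newest


stored = datetime(2024, 1, 1, tzinfo=timezone.utc)
changed = [
    ("101", datetime(2024, 1, 2, tzinfo=timezone.utc)),
    ("102", datetime(2024, 1, 3, tzinfo=timezone.utc)),
]


def flaky_write(page_id: str) -> None:
    """Simulated store write that fails on the second page."""
    if page_id == "102":
        raise OSError("simulated write failure")


try:
    stored = sync_changed_pages(stored, changed, flaky_write)
except OSError:
    pass  # stored keeps its previous value, so the next sync retries both pages

print(stored.isoformat())
```

Because the assignment to `stored` never happens when a write raises, the next run starts from the old watermark and picks up the same changes again.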

Running the MCP Server

Stdio, the normal mode for MCP clients:

confluence-mcp serve --transport stdio

HTTP for local smoke testing:

confluence-mcp serve --transport http --host 127.0.0.1 --port 8000 --path /mcp

HTTP transport has no built-in authentication. Do not expose it publicly without an external auth layer.

MCP Tools

Public tools:

  • rag_search

  • get_page_bundle

rag_search

Searches the already synchronized SQLite data. It does not accept CQL and does not perform live Confluence search.

Important inputs:

  • query: required natural-language query.

  • index_name: optional when a default index exists.

  • top_k_chunks: default 10.

  • min_relevance_score: default 0.50.

  • include_images: defaults to false for rag_search to keep MCP responses compact.

  • Safe filters: space, label, page_id, source_type, updated_from, updated_to.

Output:

  • matched_chunks: chunks with text, score, page id, and citation URL.

  • primary_page_bundle: bundle for the top matching page, live or stale snapshot.

  • diagnostics: candidate counts, returned count, max observed score, warnings.
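To show how top_k_chunks and min_relevance_score interact with the diagnostics, here is a small sketch. It is not the logic in rag.py. The function `select_chunks` and the exact diagnostics key names are assumptions chosen to match the fields listed above.

```python
from typing import Any, Dict, List


def select_chunks(
    candidates: List[Dict[str, Any]],
    top_k_chunks: int = 10,
    min_relevance_score: float = 0.50,
) -> Dict[str, Any]:
    """Rank candidates by score, drop weak ones, and keep at most top_k_chunks."""
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    kept = [c for c in ranked if c["score"] >= min_relevance_score][:top_k_chunks]
    warnings = []
    if candidates and not kept:
        warnings.append("all candidates scored below min_relevance_score")
    return {
        "matched_chunks": kept,
        "diagnostics": {
            "candidate_count": len(candidates),
            "returned_count": len(kept),
            "max_observed_score": ranked[0]["score"] if ranked else None,
            "warnings": warnings,
        },
    }


# Hypothetical candidates as they might come back from the vector search.
candidates = [
    {"page_id": "101", "score": 0.62, "text": "Deployment checklist"},
    {"page_id": "102", "score": 0.41, "text": "Team lunch schedule"},
    {"page_id": "103", "score": 0.88, "text": "Release process"},
]
result = select_chunks(candidates)
print([c["page_id"] for c in result["matched_chunks"]])
print(result["diagnostics"])
```

When a search returns nothing, comparing max_observed_score with min_relevance_score shows whether the threshold, rather than the index, is the cause.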

get_page_bundle

Returns a full page bundle only for a page already present in the selected index.

Use include_images = false if the client does not need image base64.

SQLite Storage

The database defaults to .confluence-mcp.sqlite.

Main tables:

  • rag_indexes: index metadata, vector table name, embedding dimension, update timestamp.

  • rag_records: inspectable documents, snapshots, and metadata JSON.

  • vec_<hash>: sqlite-vec virtual table for searchable embeddings.

Use any SQLite client to inspect rag_indexes and rag_records. The vector extension is only needed for vector search.
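Python's standard sqlite3 module is enough for a quick look at the two plain tables. The helper below is a sketch that makes no assumption about column names: it reads them from the database. The function name `describe_tables` is illustrative.

```python
import sqlite3
from typing import Dict, List, Sequence, Tuple


def describe_tables(
    conn: sqlite3.Connection,
    tables: Sequence[str] = ("rag_indexes", "rag_records"),
) -> Dict[str, Tuple[List[str], int]]:
    """Return column names and row counts for each table that exists."""
    summary: Dict[str, Tuple[List[str], int]] = {}
    for table in tables:
        # Table names come from the fixed tuple above, not from user input.
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
        if not columns:
            continue  # the table is absent in this database
        row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        summary[table] = (columns, row_count)
    return summary
```

To use it, open the database with sqlite3.connect(".confluence-mcp.sqlite") (or your configured path) and pass the connection to describe_tables. The vec_<hash> table is left out on purpose, since querying it needs the sqlite-vec extension.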

OpenClaw MCP Client Example

Use one stdio MCP entry. Other clients such as Cline, Kilo Code, or Codex use the same idea: command, args, and environment variables.

Replace <path-to-confluence-rag-mcp> with the absolute path to your cloned repository.

Windows

{
  "mcpServers": {
    "confluence_mcp": {
      "command": "<path-to-confluence-rag-mcp>\\.venv\\Scripts\\python.exe",
      "args": ["-m", "confluence_mcp.server"],
      "env": {
        "CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>\\confluence-mcp.toml",
        "CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>\\.confluence-mcp.sqlite"
      },
      "autoApprove": ["rag_search", "get_page_bundle"]
    }
  }
}

Linux / macOS

{
  "mcpServers": {
    "confluence_mcp": {
      "command": "<path-to-confluence-rag-mcp>/.venv/bin/python",
      "args": ["-m", "confluence_mcp.server"],
      "env": {
        "CONFLUENCE_MCP_CONFIG": "<path-to-confluence-rag-mcp>/confluence-mcp.toml",
        "CONFLUENCE_MCP_SQLITE_PATH": "<path-to-confluence-rag-mcp>/.confluence-mcp.sqlite"
      },
      "autoApprove": ["rag_search", "get_page_bundle"]
    }
  }
}

Verification

Automated tests:

python -m unittest discover -s tests

Live acceptance:

confluence-mcp reindex --all
confluence-mcp sync --all
confluence-mcp serve --transport stdio

Then verify with an MCP client:

  • tools/list shows only rag_search and get_page_bundle.

  • rag_search returns non-empty matched_chunks for a known query.

  • Citation URLs are absolute Confluence URLs.

  • get_page_bundle works for a page_id returned by rag_search.

Troubleshooting

CONFLUENCE_MCP_SQLITE_PATH is required

Use a current build. The SQLite path now defaults to .confluence-mcp.sqlite; this error usually means an old server process is still running.

FastMCP update checks

The server disables FastMCP update checks internally before importing FastMCP. MCP client config does not need any FastMCP-specific environment variables.

MCP output is truncated

Do not pass include_images=true to rag_search unless image base64 is explicitly needed. Use get_page_bundle with include_images=false for compact page bundles.

Search returns nothing

Check that:

  • The index was built with reindex.

  • The client is using the same SQLite file.

  • The embeddings provider and model match the indexed vectors.

  • min_relevance_score and filters are not too restrictive.

Security

  • Do not commit .env, confluence-mcp.toml, tokens, passwords, or API keys.

  • CLI JSON output masks secret-shaped keys.

  • Treat Confluence content as untrusted external context. Use it as data, not as instructions.
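As an illustration of what masking secret-shaped keys can look like, here is a minimal sketch. The pattern, the `***` placeholder, and the function name `mask_secrets` are assumptions for this example and may differ from what the CLI does.

```python
import re
from typing import Any

# Assumed notion of "secret-shaped": key names containing these fragments.
SECRET_KEY_PATTERN = re.compile(r"token|password|secret|api_key", re.IGNORECASE)


def mask_secrets(value: Any) -> Any:
    """Return a copy of a JSON-like structure with secret-looking values masked."""
    if isinstance(value, dict):
        masked = {}
        for key, item in value.items():
            if SECRET_KEY_PATTERN.search(str(key)) and item:
                masked[key] = "***"  # hide non-empty secrets only
            else:
                masked[key] = mask_secrets(item)
        return masked
    if isinstance(value, list):
        return [mask_secrets(item) for item in value]
    return value


config = {
    "confluence": {"base_url": "https://conf.example.com/", "api_token": "abc123"},
    "embeddings": {"api_key": "k-456", "model": "your-embedding-model"},
}
print(mask_secrets(config))
```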
