What can you do with this server?

The chdb-mcp server provides an MCP interface to chDB, an in-process ClickHouse SQL OLAP engine. It lets AI agents run analytical SQL queries against local files, in-memory data, and remote sources without a separate server or Docker setup.

Tools available:

- query(sql, format): Execute read-only SQL (SELECT/SHOW/DESCRIBE/EXPLAIN) in the ClickHouse SQL dialect; supports output formats such as JSONCompact, CSVWithNames, and Pretty. Write access is enabled via CHDB_MCP_WRITE=1.
- list_databases(): Enumerate all databases visible to the current chDB session.
- list_tables(database): List all tables within a specified database.
- describe_table(database, table): Retrieve column names and types for a specific table.
- query_file(path, sql, format): Query local files (Parquet, CSV, JSON, Arrow, etc.) directly using SQL with a {file} placeholder.
- get_sample_data(database, table, limit): Fetch the first N rows (up to 1,000) from a table as a quick preview.
- list_functions(pattern): Browse 1,000+ ClickHouse SQL functions with optional substring filtering.

Key characteristics:

- Read-only by default: writes are blocked unless CHDB_MCP_WRITE=1 is set.
- Configurable safety limits: result size cap (default 1 MiB) and query timeout (default 30s).
- File allowlist: restrict file/URL/S3 access to specific path prefixes via CHDB_MCP_FILE_ALLOWLIST.
- Broad data source support: Parquet, CSV, JSON, Avro, ORC, Arrow, S3, MongoDB, PostgreSQL, MySQL, Iceberg, Delta Lake, and more, natively in SQL.
- Federation: query remote ClickHouse clusters and combine them with local data sources.
- pandas integration: zero-copy data processing with pandas DataFrames.

Which integrations are available for this server?

- ClickHouse: query local and remote ClickHouse databases with full ClickHouse SQL syntax and functions, including federation via remoteSecure().
- MongoDB: run SQL directly on MongoDB collections using the mongodb() table function.
- MySQL: run SQL directly on MySQL tables using the mysql() table function.
- pandas: a pandas-like API and zero-copy DataFrame querying, enabling SQL queries on pandas DataFrames.
- PostgreSQL: run SQL directly on PostgreSQL tables using the postgresql() table function.
- Prometheus: query Prometheus metrics via the prometheusQuery* table functions.

How do I use chdb-mcp?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@chdb-mcp Query sales.parquet for total revenue by region".

The server then responds to your query, and you can continue using it as needed.

chdb-mcp

Official

by chdb-io


Python

Local

chdb-mcp

IMPORTANT

This project is superseded by mcp-clickhouse. chDB support now ships in the official ClickHouse MCP server: install with pip install 'mcp-clickhouse[chdb]', set CHDB_ENABLED=true (and CHDB_DATA_PATH for persistence), and agents get a run_chdb_select_query tool backed by embedded chDB — standalone or alongside a ClickHouse server connection. It can be enabled with configuration only, matching the ClickHouse MCP experience. chdb-mcp remains available on PyPI but is no longer the recommended entry point and will only receive critical fixes.


An MCP server for chDB, the in-process SQL OLAP engine powered by ClickHouse. Lets agents (Claude Desktop, Cursor, VS Code, Codex CLI, Cline, …) query Parquet, CSV, JSON, and pandas DataFrames with one tool — no separate server, no Docker.

Why chdb-mcp?

- Full ClickHouse engine, in-process. 1000+ functions (windowFunnel, quantilesTDigest, geoToH3, the -If/-State/-Merge combinators), typed JSON with O(1) sub-column reads, native vectors, MergeTree storage.
- Drop-in pandas API. import datastore as pd covers ~300 pandas-shaped methods compiled to ClickHouse SQL. v1.0 adds dataframe_query() for zero-copy Python(df).
- ~80 formats and 12+ source connectors in core. Parquet, CSV, JSON, Avro, ORC, Arrow, Protobuf, plus s3(), mongodb(), postgresql(), mysql(), iceberg(), deltaLake(), with no INSTALL/LOAD chain.
- Federate to remote ClickHouse in one statement. (v0.5) remoteSecure('cluster:9440', 'db.table', ...) joins local Parquet with a production ClickHouse cluster in one optimised plan.
- Same SQL as your warehouse. Copy-paste ClickHouse production queries into the agent prompt, with no dialect bridge.


Install

pip install chdb-mcp

Connect

Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "chdb": { "command": "chdb-mcp" }
  }
}

Cursor / VS Code — same JSON in ~/.cursor/mcp.json etc.; one-click badges land in v0.2.

Codex CLI / Claude Code / Copilot / Droid — use the cross-IDE bundle chdb-agent-plugin.

Tools (v0.1)

| Tool | Description |
| --- | --- |
| `query(sql, format)` | Run any read-only SQL on the in-process session |
| `list_databases()` | Enumerate visible databases |
| `list_tables(database)` | List tables in a database |
| `describe_table(database, table)` | Column types for a table |
| `query_file(path, sql, format)` | Query a Parquet/CSV/JSON file via the `{file}` placeholder |
| `get_sample_data(database, table, limit)` | First N rows of a table |
| `list_functions(pattern)` | List ClickHouse SQL functions (optional substring filter) |

Read-only by default — SET readonly=2 blocks INSERT/CREATE/DROP/ALTER while keeping file()/url()/s3() usable. Set CHDB_MCP_WRITE=1 to drop the guard. See Security model.

In query_file, {file} is replaced with file('path', 'format') before execution:

query_file(
    path="/data/sales.parquet",
    sql="SELECT region, sum(revenue) FROM {file} GROUP BY region",
    format="Parquet",
)
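The substitution step can be pictured with a short sketch. This is not the server's actual code: `expand_file_placeholder` is a hypothetical name, and `quote_string` here is a stand-in written from the escaping rule described under "What's protected" (backslashes doubled, single quotes doubled).

```python
def quote_string(value: str) -> str:
    """Render value as a single-quoted SQL string literal (illustrative stand-in)."""
    # Escape backslashes first so the quotes doubled in the next step stay intact.
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return f"'{escaped}'"


def expand_file_placeholder(sql: str, path: str, fmt: str) -> str:
    """Replace every {file} in sql with file('<path>', '<fmt>') (hypothetical helper)."""
    table_expr = f"file({quote_string(path)}, {quote_string(fmt)})"
    return sql.replace("{file}", table_expr)


print(expand_file_placeholder(
    "SELECT region, sum(revenue) FROM {file} GROUP BY region",
    "/data/sales.parquet",
    "Parquet",
))
```

For the call shown above, this prints `SELECT region, sum(revenue) FROM file('/data/sales.parquet', 'Parquet') GROUP BY region`.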

Configuration

| Variable | Default | Effect |
| --- | --- | --- |
| `CHDB_MCP_WRITE` | unset | If `1`, allows `INSERT`/`CREATE`/`DROP`/`ALTER` |
| `CHDB_MCP_MAX_RESULT_BYTES` | `1048576` | Per-tool result cap. Enforced engine-side (`max_result_bytes` + `result_overflow_mode='break'`) plus a final Python slice. |
| `CHDB_MCP_QUERY_TIMEOUT_SEC` | `30` | Wall-clock cap per query (chDB `max_execution_time`). `0` disables. |
| `CHDB_MCP_FILE_ALLOWLIST` | empty (unrestricted) | `:`-separated path prefixes. Opt-in isolation switch: when set, `query_file()` rejects paths outside the prefixes, and `query()` rejects external table functions (`file`/`url`/`s3`/`remote`/`hdfs`/`mongodb`/...). When unset, no filesystem gating happens and the host process is trusted. |
| `CHDB_MCP_SESSION_PATH` | empty | Persistent session directory (default: ephemeral) |
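One way to set these variables is through the client's server entry. The sketch below builds such an entry in Python and prints it as JSON. It assumes your MCP client accepts an `env` map next to `command` (Claude Desktop's config file does); the specific values are examples only, not recommendations.

```python
import json

# Example values only; every key under "env" is a variable from the table above.
config = {
    "mcpServers": {
        "chdb": {
            "command": "chdb-mcp",
            "env": {
                "CHDB_MCP_FILE_ALLOWLIST": "/data:/tmp/foo",
                "CHDB_MCP_QUERY_TIMEOUT_SEC": "60",
                # 2 MiB result cap, passed as a string like any environment value.
                "CHDB_MCP_MAX_RESULT_BYTES": str(2 * 1024 * 1024),
            },
        }
    }
}

print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the client config file named in the Connect section.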

Security model

chDB is in-process. There is no privilege boundary between the MCP server and the host Python interpreter, so the server can't make stronger isolation guarantees than the host already gives it. The model below reflects that.

Trust tiers

- Default (no CHDB_MCP_FILE_ALLOWLIST): no filesystem gating. query() and query_file() can reach anything the host process can reach (any file(), url(), s3(), remote()...). Appropriate when the agent is trusted, or when the surrounding host application enforces the security boundary itself.
- Opt-in allowlist (CHDB_MCP_FILE_ALLOWLIST=/data:/tmp/foo): best-effort defense in depth:
  - query_file() rejects paths whose resolved (symlink-followed) form isn't under any listed prefix.
  - Both query() and query_file() reject SQL containing any table function that isn't on the safe-by-construction list (numbers/values/view/merge/dictionary/generateRandom/...). The "known" set is snapshotted from system.table_functions at session start, so the gate stays in sync with whatever the running chDB build actually exposes, including new external-source variants (paimon*, prometheusQuery*, iceberg*Azure/S3/HDFS), RCE-class functions (executable, python), and *Cluster siblings, without a hand-maintained denylist that goes stale.
  - For query_file(), the scan runs on the user SQL before the {file} placeholder substitution, so a UNION ALL SELECT … FROM file('/etc/passwd', …) smuggled into the query body is caught even though the explicit path is gated.
  - The scanner is comment- and string-aware (single-pass mask covering line comments, block comments, single-quoted strings with '' / \' / \\ escapes), and it normalizes backtick- and double-quote-wrapped identifiers (`file` / "file") before matching so quoted function names can't bypass it.
  - This is not a sandbox: a determined caller can still try to exfiltrate via undiscovered functions, settings, or future chDB features. Strong enough for casual agent mistakes, not for adversarial input.
- Hard isolation: for adversarial input, wrap the server in OS-level confinement: macOS App Sandbox, Linux user namespaces / seccomp, or Docker with a read-only filesystem mount. Nothing at the MCP layer can substitute for this.
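To make the allowlist-mode scan concrete, here is a simplified sketch of the idea: mask comments and string literals, unwrap quoted identifiers, then compare called function names against a safe set. It is not the server's implementation. The function names are hypothetical, `KNOWN_TABLE_FUNCTIONS` is a hard-coded stand-in for the snapshot of system.table_functions, only `--` line comments are handled, and names are lowercased as a conservative simplification.

```python
import re

# Illustrative subset of the safe list; names are lowercased for matching.
SAFE_TABLE_FUNCTIONS = {"numbers", "values", "view", "merge", "dictionary", "generaterandom"}
# Hypothetical stand-in for the snapshot of system.table_functions.
KNOWN_TABLE_FUNCTIONS = SAFE_TABLE_FUNCTIONS | {"file", "url", "s3", "remote", "executable"}


def mask_literals(sql: str) -> str:
    """Blank out comments and single-quoted strings in a single pass."""
    out, i, n = [], 0, len(sql)
    while i < n:
        two = sql[i:i + 2]
        if two == "--":                       # line comment: mask to end of line
            j = sql.find("\n", i)
            j = n if j == -1 else j
        elif two == "/*":                     # block comment: mask through */
            j = sql.find("*/", i + 2)
            j = n if j == -1 else j + 2
        elif sql[i] == "'":                   # string literal with '' and \x escapes
            j = i + 1
            while j < n:
                if sql[j] == "\\" or sql[j:j + 2] == "''":
                    j += 2
                elif sql[j] == "'":
                    j += 1
                    break
                else:
                    j += 1
            j = min(j, n)
        else:                                 # ordinary character: keep as-is
            out.append(sql[i])
            i += 1
            continue
        out.append(" " * (j - i))
        i = j
    return "".join(out)


def blocked_table_functions(sql: str) -> set[str]:
    """Return known table functions called in sql that are not on the safe list."""
    masked = mask_literals(sql)
    # Unwrap `file` and "file" so quoted names are matched like bare ones.
    masked = re.sub(r'[`"]([A-Za-z_][A-Za-z0-9_]*)[`"]', r"\1", masked)
    called = {m.group(1).lower()
              for m in re.finditer(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(", masked)}
    return (called & KNOWN_TABLE_FUNCTIONS) - SAFE_TABLE_FUNCTIONS
```

A gate built this way would reject the query whenever `blocked_table_functions(sql)` is non-empty. Masking first is what keeps a function name inside a string literal or comment from triggering a false rejection, and unwrapping quoted identifiers is what closes the `` `file`(...) `` bypass.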

What's protected

- Accidental writes: SET readonly=2 is applied at session start; CHDB_MCP_WRITE=1 lifts it. (Note: ClickHouse's readonly=2 still permits TEMPORARY TABLE writes and runtime SET changes. This is by design, not a bug.)
- Runaway result sizes: CHDB_MCP_MAX_RESULT_BYTES is enforced engine-side (max_block_size + max_result_bytes + result_overflow_mode='break'), not just as a post-hoc string slice, so large queries do not materialize multiple MiB in chDB before truncation.
- Runaway wall-clock: CHDB_MCP_QUERY_TIMEOUT_SEC (default 30s) caps each query via chDB's max_execution_time.
- SQL-identifier injection: list_tables / describe_table / get_sample_data arguments are validated against a whitelist regex ([A-Za-z_][A-Za-z0-9_]* only) and backtick-quoted before interpolation.
- SQL string-literal escape: list_functions(pattern) and query_file(path, format) arguments are passed through quote_string, which escapes both single quotes (' → '') and backslashes (\ → \\) so that ClickHouse's \' escape form cannot break out of the literal.
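The identifier check can be sketched as follows. This is an illustration of the validate-then-quote pattern using the regex quoted above, not the server's code; `quote_identifier` and `describe_table_sql` are hypothetical names.

```python
import re

# The pattern quoted in the list above; fullmatch() anchors it to the whole name.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Validate name against the identifier pattern, then backtick-quote it."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"


def describe_table_sql(database: str, table: str) -> str:
    """Build a DESCRIBE statement from validated identifiers (hypothetical helper)."""
    return f"DESCRIBE TABLE {quote_identifier(database)}.{quote_identifier(table)}"


print(describe_table_sql("default", "events"))
```

Rejecting anything outside the pattern means a value such as `events; DROP TABLE x` raises `ValueError` before any SQL is assembled, so the backtick quoting never has to cope with embedded backticks or whitespace.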

What's NOT protected

- SQL audit: only the readonly guard is applied; there is no allow/deny list of statements. Treat the agent as having full SELECT access to anything chDB can reach (subject to the allowlist when set).
- Setting tampering: under readonly=2, the agent can still SET max_memory_usage = … to raise resource caps. Lock this down at the host or via OS-level resource limits if it matters.
- Memory / CPU caps: chDB's max_memory_usage applies, but there's no ulimit/cgroups equivalent imposed by the MCP layer.

For agents acting on untrusted input, run in a throwaway container.

Roadmap

v0.5 — query_remote_clickhouse() federation tool
v1.0 — attach_file(), dataframe_query() (zero-copy Python(df)), HTTP/SSE transport with Bearer auth, .mcpb bundle for Claude Desktop one-click install

Troubleshooting

macOS: "Server disconnected" in Claude Desktop

If ~/Library/Logs/Claude/mcp-server-chdb.log shows PermissionError: Operation not permitted on pyvenv.cfg, your venv sits under a TCC-protected directory (~/Downloads, ~/Documents, ~/Desktop) — Claude Desktop subprocesses can't read those paths.

Fix: install elsewhere. Recommended is uvx (zero-config, isolated under ~/.local/share/uv/):

{
  "mcpServers": {
    "chdb": { "command": "uvx", "args": ["chdb-mcp"] }
  }
}

Or build a venv yourself under ~/.local/share/chdb-mcp/.venv and point Claude Desktop at its chdb-mcp binary.

`query_file` returns "path is not under any prefix"

The allowlist resolves symlinks on both sides (so /tmp matches /private/tmp on macOS). If you still hit this, check the resolved form printed in the error against python -c "from pathlib import Path; print(Path('YOUR_PATH').resolve())".
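To reproduce the comparison outside the server, the following sketch resolves both the candidate path and each prefix before checking containment. It is an approximation of the behavior described above, with hypothetical function names, and it needs Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path


def parse_allowlist(raw: str) -> list[Path]:
    """Split a ':'-separated allowlist and resolve each prefix (symlinks followed)."""
    return [Path(p).resolve() for p in raw.split(":") if p]


def is_allowed(path: str, prefixes: list[Path]) -> bool:
    """True if the resolved path sits under at least one resolved prefix."""
    resolved = Path(path).resolve()
    # is_relative_to compares whole path components, so /database is not under /data.
    return any(resolved.is_relative_to(prefix) for prefix in prefixes)


prefixes = parse_allowlist("/data:/tmp/foo")
print(is_allowed("/data/sales.parquet", prefixes))
print(is_allowed("/data/../etc/passwd", prefixes))
```

Resolving first is what lets /tmp and /private/tmp compare equal on macOS, and it also collapses `..` segments before the prefix check runs.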

"Cannot execute query in readonly mode"

SET readonly=2 blocks DDL/DML by design. Rewrite as a pure SELECT, or restart with CHDB_MCP_WRITE=1.

Per-server logs

~/Library/Logs/Claude/mcp-server-chdb.log   # startup diagnostics + stderr
~/Library/Logs/Claude/mcp.log                # all servers' JSON-RPC traffic

Development

git clone https://github.com/chdb-io/chdb-mcp && cd chdb-mcp
pip install -e ".[dev]"
pytest && ruff check src tests

License

Apache 2.0 — see LICENSE.
