Iceberg MCP Server
Provides tools for exploring and querying an Apache Iceberg lakehouse, including namespace discovery, table metadata inspection, snapshot history, time travel SQL generation, and partition pruning explanation.
Talk to Your Lakehouse: Iceberg MCP Demo
This repository contains a local demo for a talk on using the Model Context Protocol to make an Apache Iceberg lakehouse queryable through natural language.
The demo shows how an LLM can use typed tools to discover namespaces, inspect Iceberg table metadata, reason over snapshots, generate time travel SQL, and explain partition pruning without directly reading the full catalog metadata payload.
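As a rough illustration of what a "typed tool" looks like from the model's side, the sketch below declares one tool in the OpenAI-compatible function-calling format that Groq's chat API accepts, plus a small argument check. The tool name `describe_table`, its parameters, and the `validate_arguments` helper are hypothetical and chosen for illustration; the names actually used in mcp_server.py may differ.

```python
import json

# Hypothetical tool declaration; the name and parameters are illustrative only.
DESCRIBE_TABLE_TOOL = {
    "type": "function",
    "function": {
        "name": "describe_table",
        "description": "Return trimmed schema, partition, and snapshot metadata for one table.",
        "parameters": {
            "type": "object",
            "properties": {
                "namespace": {"type": "string", "description": "Namespace, e.g. 'sales'"},
                "table": {"type": "string", "description": "Table name, e.g. 'orders'"},
            },
            "required": ["namespace", "table"],
        },
    },
}


def validate_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return a list of problems with model-supplied arguments (empty if none)."""
    params = tool["function"]["parameters"]
    problems = [
        f"missing required argument: {name}"
        for name in params["required"]
        if name not in arguments
    ]
    problems += [
        f"unexpected argument: {name}"
        for name in arguments
        if name not in params["properties"]
    ]
    return problems


if __name__ == "__main__":
    print(json.dumps(DESCRIBE_TABLE_TOOL, indent=2))
    print(validate_arguments(DESCRIBE_TABLE_TOOL, {"namespace": "sales"}))
```

Because the schema names every argument, a malformed call can be rejected before it reaches the catalog, which is harder to do when the model is working from free-form metadata.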
What this demo includes
This repo has three small Python processes:
catalog_server.py: A mock Apache Iceberg REST Catalog server running on port 5001. It serves realistic table metadata for demo namespaces such as sales, analytics, and raw.
mcp_server.py: A lightweight MCP-style tool server running on port 5002. It wraps the catalog API and exposes typed tools for the LLM, including namespace discovery, table listing, table description, snapshot history, time travel SQL generation, and partition explanation.
client.py: An interactive terminal client that connects to Groq, loads the tools from the MCP server, and lets the model call those tools while answering lakehouse questions.
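To make the "wraps the catalog API" step concrete, here is a minimal sketch of how a tool server might build requests against the catalog. It assumes the mock follows the path layout of the Iceberg REST catalog specification without a prefix (`/v1/namespaces`, `/v1/namespaces/{namespace}/tables`, and so on) and accepts a bearer token; the token value and all function names are placeholders, not taken from the repository.

```python
import json
import urllib.request
from urllib.parse import quote

CATALOG_URL = "http://localhost:5001"  # port taken from this README
TOKEN = "demo-token"  # assumption: placeholder bearer token


def build_request(path: str) -> urllib.request.Request:
    """Build an authenticated GET request for a catalog path."""
    return urllib.request.Request(
        f"{CATALOG_URL}{path}",
        headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/json"},
        method="GET",
    )


def list_namespaces_request() -> urllib.request.Request:
    return build_request("/v1/namespaces")


def list_tables_request(namespace: str) -> urllib.request.Request:
    return build_request(f"/v1/namespaces/{quote(namespace, safe='')}/tables")


def load_table_request(namespace: str, table: str) -> urllib.request.Request:
    return build_request(
        f"/v1/namespaces/{quote(namespace, safe='')}/tables/{quote(table, safe='')}"
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs the catalog to be running)."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_json(...) once catalog_server.py is up.
    print(list_tables_request("analytics").full_url)
```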
Architecture
User question
|
v
client.py
|
| Groq tool calling
v
mcp_server.py
|
| Authenticated REST calls
v
catalog_server.py
|
v
Mock Iceberg table metadata
Why this exists
Most lakehouse workflows still expect engineers to manually inspect catalogs, table schemas, snapshots, partitions, and metadata files. This demo explores what changes when an LLM is not asked to guess, but is instead given small, typed tools over the lakehouse control plane.
The goal is not to replace the query engine. The goal is to reduce the friction around discovery, debugging, schema inspection, and query planning.
Demo capabilities
The agent can answer questions such as:
What namespaces and tables are in this lakehouse?
Tell me about the orders table: schema, partitioning, and recent activity.
What changed in the orders table in the last two weeks?
I need to query the orders data as it was last Monday. Give me the time travel SQL.
How should I write efficient Spark SQL against the orders table to avoid full scans?
Repository structure
.
├── catalog_server.py # Mock Iceberg REST Catalog server
├── mcp_server.py # MCP-style tool server over the catalog
├── client.py # Groq-powered terminal client
├── requirements.txt # Python dependencies
└── README.md
Prerequisites
Use Python 3.10 or later.
You also need a Groq API key for the interactive client.
Create a key from the Groq console, then export it before running the client:
export GROQ_API_KEY="your_groq_api_key_here"
Setup
Clone the repo:
git clone https://github.com/<your-username>/<repo-name>.git
cd <repo-name>
Create and activate a virtual environment:
python3 -m venv .venv
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Run the demo
Open three terminal windows.
Terminal 1: start the mock Iceberg REST Catalog.
python catalog_server.py
Expected service:
http://localhost:5001
Terminal 2: start the MCP tool server.
python mcp_server.py
Expected service:
http://localhost:5002
Terminal 3: start the interactive client.
export GROQ_API_KEY="your_groq_api_key_here"
python client.py
The client will show suggested demo questions. You can type one of the numbers or ask your own question.
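If the client cannot reach the tool server, a quick check that both ports are accepting connections can narrow things down. The helper below is a small sketch and not part of the repository; the `is_listening` name is made up here, and the ports are the ones listed above.

```python
import socket

# Ports as documented in this README.
SERVICES = {"catalog_server.py": 5001, "mcp_server.py": 5002}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "up" if is_listening(port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```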
Available MCP tools
The MCP server exposes these tools to the client:
| Tool | Purpose |
| | Lists available catalog namespaces |
| | Lists tables inside a namespace |
| | Returns trimmed schema, partition, property, and snapshot metadata for one table |
| | Returns recent Iceberg snapshot history |
| | Finds the closest snapshot for a target date and returns SQL |
| | Explains partition transforms and query pruning strategy |
Design notes
The demo intentionally keeps the catalog local so the talk can focus on the control-plane pattern instead of cloud setup.
The MCP server trims table metadata before sending it to the LLM. This is important because real Iceberg metadata can be large, noisy, and full of file paths that are not useful for conversational reasoning.
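One way such trimming could look is sketched below. It assumes the input is an Iceberg table metadata object using the field names from the table spec (`schemas`, `partition-specs`, `snapshots`, and so on) and keeps only the current schema, the default partition spec, the table properties, and the last few snapshots. The function name, the output keys, and the `max_snapshots` default are choices made for this example, not the demo's actual behavior.

```python
def trim_table_metadata(metadata: dict, max_snapshots: int = 5) -> dict:
    """Reduce Iceberg table metadata to the parts useful for conversation."""
    # Pick the current schema out of the list of historical schemas.
    schema = next(
        (s for s in metadata.get("schemas", [])
         if s.get("schema-id") == metadata.get("current-schema-id")),
        {},
    )
    # Pick the default partition spec the same way.
    spec = next(
        (p for p in metadata.get("partition-specs", [])
         if p.get("spec-id") == metadata.get("default-spec-id")),
        {},
    )
    # Keep only the newest snapshots, oldest first.
    snapshots = sorted(
        metadata.get("snapshots", []), key=lambda s: s["timestamp-ms"]
    )[-max_snapshots:]

    return {
        "columns": [
            {"name": f["name"], "type": f["type"], "required": f.get("required", False)}
            for f in schema.get("fields", [])
        ],
        "partition_fields": [
            {"name": f["name"], "transform": f["transform"]}
            for f in spec.get("fields", [])
        ],
        "properties": metadata.get("properties", {}),
        # Manifest list paths and other file locations are dropped on purpose.
        "recent_snapshots": [
            {
                "snapshot_id": s["snapshot-id"],
                "timestamp_ms": s["timestamp-ms"],
                "operation": s.get("summary", {}).get("operation"),
            }
            for s in snapshots
        ],
    }
```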
The client uses model tool calling instead of prompting the model with raw metadata. This makes the agent behavior easier to inspect because every tool call and tool result is printed in the terminal.
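The inspectable tool-calling step might look something like this sketch. It assumes tool calls arrive in the OpenAI-compatible shape that Groq's chat API returns (an `id` plus a `function` object whose `arguments` field is a JSON string). The `list_namespaces` stub and the `dispatch` helper are hypothetical stand-ins for whatever client.py does when it forwards a call to the MCP server.

```python
import json


def list_namespaces() -> list[str]:
    """Stub standing in for a call to the tool server; returns the demo namespaces."""
    return ["sales", "analytics", "raw"]


# Hypothetical registry mapping tool names to local callables.
TOOLS = {"list_namespaces": list_namespaces}


def dispatch(tool_call: dict) -> dict:
    """Run one model-requested tool call, print it, and build the tool reply message."""
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"] or "{}")
    print(f"[tool call]   {name}({arguments})")

    if name in TOOLS:
        result = TOOLS[name](**arguments)
    else:
        result = {"error": f"unknown tool: {name}"}
    print(f"[tool result] {json.dumps(result)}")

    # The reply is appended to the conversation so the model can read the result.
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result),
    }


if __name__ == "__main__":
    example_call = {
        "id": "call_1",
        "type": "function",
        "function": {"name": "list_namespaces", "arguments": "{}"},
    }
    dispatch(example_call)
```

Printing both the call and the result at the dispatch point means the terminal transcript doubles as an audit log of what the model actually saw.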
Talk slides
This demo accompanies the talk given at Cloudera, "Talk to Your Lakehouse: Building an MCP Server for Apache Iceberg".
Slides are available here: