What can you do with this server?

sifter-mcp is an MCP server for Sifter, an open-source document intelligence engine that extracts structured data from documents and enables querying and management of the extracted records.

Sift Management
* list_sifts, get_sift, create_sift, update_sift, delete_sift: create and manage sifts (extraction configurations) using natural language instructions (e.g. "client, date, total amount")

Document & Folder Management
* list_folders, get_folder: browse and inspect document folders
* upload_document: upload a Base64-encoded file to a folder (the folder is auto-created; linked sifts automatically process the document)

Extraction
* run_extraction: enqueue a document for extraction against a specific sift
* get_extraction_status: check whether extraction is queued, running, completed, or failed

Querying & Retrieval
* list_records: paginate through extracted structured records (cursor-based)
* find_records: filter records using structured MongoDB-style criteria (e.g. {"total": {"$gt": 1000}})
* query_sift: ask a natural language question over a sift's records; Sifter generates and runs the query automatically
* aggregate_sift: run a raw MongoDB aggregation pipeline directly for custom analytics
* get_record_citations: retrieve per-field citation details (page, bounding box, source text) for any extracted record
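The two tools that take MongoDB-style input can be illustrated by the JSON-RPC requests an MCP client would send for them. This is a minimal sketch, not the server's documented schema: the `tools/call` method and its `name` / `arguments` parameters come from the MCP specification, while the argument names `sift_id`, `filter`, and `pipeline` and the `client` field are assumptions made for illustration. Check the server's tool schemas for the real input names.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names (sift_id, filter, pipeline) are hypothetical.
find_request = tool_call(
    "find_records",
    {"sift_id": "invoices", "filter": {"total": {"$gt": 1000}}},
)

# $match, $group and $sort are standard MongoDB aggregation stages;
# the `client` and `total` field names are illustrative.
aggregate_request = tool_call(
    "aggregate_sift",
    {
        "sift_id": "invoices",
        "pipeline": [
            {"$match": {"total": {"$gt": 1000}}},
            {"$group": {"_id": "$client", "sum_total": {"$sum": "$total"}}},
            {"$sort": {"sum_total": -1}},
        ],
    },
)

print(json.dumps(find_request, indent=2))
print(json.dumps(aggregate_request, indent=2))
```

In practice an MCP client such as Claude Desktop builds these requests itself; the sketch only shows the shape of the filter and the pipeline.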

sifter-mcp

Official

by sifter-ai

Python

Hybrid

Sifter

Your documents are a dark database.

Open-source document intelligence engine — schema-driven extraction, NL query, MCP server, Python and TypeScript SDKs. Self-hostable under MIT.

Why not RAG?

RAG is built for retrieval — find me chunks similar to this query. It breaks on homogeneous collections like invoices, contracts, or receipts where every document looks alike and the question is an aggregation, not a search.

Documents to structured records

Sifter's approach: extract structured fields once (client, date, total), store them as typed records, query with real filters and aggregations. The answer is exact and reproducible — because it's a database query, not a similarity search.
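The difference can be made concrete with plain Python. This is a sketch of the idea only, not Sifter's implementation: once each invoice is a typed record, "how much is unpaid per client?" becomes a deterministic filter plus a sum. The records and the `paid` field are made up for illustration.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extracted records; one dict per invoice.
records = [
    {"client": "Acme Corp", "date": "2024-01-15", "total": Decimal("1500.00"), "paid": False},
    {"client": "Acme Corp", "date": "2024-02-03", "total": Decimal("250.50"), "paid": True},
    {"client": "Globex", "date": "2024-02-20", "total": Decimal("980.00"), "paid": False},
]


def unpaid_by_client(rows):
    """Sum the totals of unpaid invoices, grouped by client."""
    totals = defaultdict(Decimal)  # Decimal() starts each client at 0
    for row in rows:
        if not row["paid"]:
            totals[row["client"]] += row["total"]
    return dict(totals)


result = unpaid_by_client(records)
for client, amount in sorted(result.items()):
    print(f"{client}: {amount}")
```

Running the function twice over the same records gives the same totals, which is the reproducibility that a similarity search over text chunks cannot guarantee.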

Quickstart

git clone https://github.com/sifter-ai/sifter
cd sifter/code
cp server/.env.example server/.env.local    # set SIFTER_DEFAULT_API_KEY (required)
docker compose up -d

Open http://localhost:3000 — create a sift, upload documents, query results.

Python SDK

pip install sifter-ai

from sifter import Sifter

s = Sifter(api_key="sk-...")

sift = s.create_sift("Invoices", "client name, date, total amount")
sift.upload("./invoices/")
sift.wait()

for record in sift.records():
    print(record["extracted_data"])
# {"client": "Acme Corp", "date": "2024-01-15", "total_amount": 1500.0}

TypeScript SDK

npm install @sifter-ai/sdk

import { Sifter } from "@sifter-ai/sdk";

const client = new Sifter({ apiKey: "sk-..." });

const sift = await client.createSift("Invoices", "client, date, total amount");
await sift.upload("./invoices/");
await sift.wait();

const records = await sift.records();
console.log(records);

MCP server (Claude Desktop / Cursor / AI agents)

{
  "mcpServers": {
    "sifter": {
      "command": "uvx",
      "args": ["sifter-mcp", "--base-url", "http://localhost:8000"],
      "env": { "SIFTER_API_KEY": "sk-dev" }
    }
  }
}
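If the MCP client already has other servers configured, the entry above has to be merged into the existing file rather than pasted over it. A minimal sketch of that merge, working on an in-memory dict: the helper name `add_sifter_server` and the pre-existing `other` server are hypothetical, and the location of the config file depends on the client, so it is left out here.

```python
import copy
import json

# The server entry from the configuration above.
SIFTER_ENTRY = {
    "command": "uvx",
    "args": ["sifter-mcp", "--base-url", "http://localhost:8000"],
    "env": {"SIFTER_API_KEY": "sk-dev"},
}


def add_sifter_server(config: dict, entry: dict = SIFTER_ENTRY) -> dict:
    """Return a copy of an MCP client config with the `sifter` server added.

    Other entries under `mcpServers` are preserved; the input is not mutated.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})["sifter"] = copy.deepcopy(entry)
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "npx", "args": ["other-server"]}}}
updated = add_sifter_server(existing)
print(json.dumps(updated, indent=2))
```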

Then ask:

"What's the total unpaid across all invoices from last quarter?"
"Show me all contracts expiring in the next 90 days."
"Which candidates have Python and more than 5 years experience?"

Sifter answers with structured data — exact counts, sums, filtered rows. Not a text blob.

Want a remote MCP URL without running a local server? → Sifter Cloud

Dashboard

Sifter includes a built-in dashboard — no Metabase, no Grafana, no SQL required.

Describe what you want to see in plain language:

# assumes `client` is an already authenticated Sifter client
sift = client.sifts.get("invoices")
sift.create_dashboard("Show total invoiced and unpaid by vendor, monthly trend")

This produces KPI tiles, breakdowns, and time series, all updated automatically on every extraction.

What's included

Schema-driven extraction — describe what to extract in natural language; schema is inferred automatically and exported as Pydantic / TypeScript types
NL query — ask questions in plain language; Sifter generates inspectable MongoDB aggregation pipelines
MCP server — stdio transport, read + write tools, zero custom integration code
REST API + SDKs — full OpenAPI spec, typed clients for Python and TypeScript
Webhooks — HMAC-signed HTTP callbacks on every extraction event
Spec-driven dashboards — short NL spec → auto-generated board (KPI, breakdown, table, time series)
CLI — sifter extract, sifter records, sifter sifts for terminal workflows and CI
Self-hostable — Docker Compose, bring your own MongoDB and LLM API key
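For the Webhooks item above, a receiver would normally recompute the signature over the raw request body and compare it in constant time. The sketch below assumes HMAC-SHA256 with a hex-encoded signature; the list only says "HMAC-signed", so the hash algorithm, the encoding, the header that carries the signature, and the payload shape are all assumptions to check against the Sifter docs.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())


# Hypothetical secret and payload, for illustration only.
secret = b"whsec_example"
body = b'{"event": "extraction.completed", "record_id": "r1"}'

good_signature = sign(secret, body)
print(verify(secret, body, good_signature))         # True
print(verify(secret, body + b" ", good_signature))  # False
```

Verifying against the raw bytes, before any JSON parsing, matters because re-serializing the payload can change whitespace or key order and invalidate the signature.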

Don't want to run infrastructure?

Sifter Cloud is the managed version — no Mongo, no ops, remote MCP endpoint, Google Drive and email ingress. Free tier available.

Docs

Full documentation at docs.sifter.run — quickstart, SDK reference, MCP guide, cookbook, self-hosting.

License

MIT — see LICENSE.

Created by Bruno Fortunato.
