document-intelligence-mcp

Local document intelligence for AI agents — extract text, detect tables, read metadata, analyze structure, search keywords, and detect language from PDF and DOCX files. No cloud API required, no API key needed.

Features

  • 10 MCP Tools for PDF and DOCX processing

  • Local processing — no data leaves your machine

  • No API key required

  • Supports PDF (via PyMuPDF + pdfplumber) and Microsoft Word DOCX (via python-docx)

  • Language detection via langdetect (55+ languages)

Tools

Tool                             Description
tool_extract_text_from_pdf       Extract all text from a PDF, page by page
tool_extract_tables_from_pdf     Detect and extract tables from PDF
tool_get_pdf_metadata            Read PDF metadata: title, author, dates, outline
tool_analyze_document_structure  Detect headings, font sizes, section structure
tool_search_in_pdf               Search for keywords with context in PDF
tool_extract_text_from_docx      Extract all text from a Word DOCX file
tool_extract_tables_from_docx    Extract all tables from a DOCX file
tool_analyze_docx_structure      Analyze headings, styles, and structure of DOCX
tool_count_words_and_stats       Word count, sentence count, reading time, top words
tool_detect_document_language    Detect language of PDF or DOCX (55+ languages)
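
To give a feel for the kind of output tool_count_words_and_stats describes, here is a minimal standard-library sketch that computes similar statistics on text that has already been extracted. It is not the server's implementation: the function name text_stats, the returned field names, the sentence-splitting heuristic, and the 200 words-per-minute reading speed are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed average reading speed; the project may use a different value.
WORDS_PER_MINUTE = 200


def text_stats(text: str, top_n: int = 3) -> dict:
    """Return word count, sentence count, reading time, and top words.

    Hypothetical helper for illustration only.
    """
    # Lowercase so "The" and "the" count as the same word.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    # Rough heuristic: runs of '.', '!' or '?' end a sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "reading_time_minutes": round(len(words) / WORDS_PER_MINUTE, 2),
        "top_words": Counter(words).most_common(top_n),
    }


if __name__ == "__main__":
    print(text_stats("The cat sat. The cat ran! Did the dog run?"))
```

A real implementation would likely need better tokenization for non-English text and stop-word filtering before ranking top words.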

Installation

pip install document-intelligence-mcp

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "document-intelligence": {
      "command": "document-intelligence-mcp"
    }
  }
}
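
If the config file already lists other servers, the entry above has to be merged in rather than pasted over the file. The following sketch shows one way to do that with the standard library. The helper name add_server is hypothetical, and the path to claude_desktop_config.json is left to the caller because its location depends on your operating system.

```python
import json
from pathlib import Path


def add_server(config_path: Path, name: str, command: str) -> dict:
    """Add or update one entry under "mcpServers", keeping everything else.

    Hypothetical helper for illustration; not part of this package.
    """
    config = {}
    if config_path.exists():
        raw = config_path.read_text(encoding="utf-8")
        # An empty file is treated as an empty config.
        config = json.loads(raw) if raw.strip() else {}
    # Create the "mcpServers" section if needed, then set this server.
    config.setdefault("mcpServers", {})[name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Writes to a file in the current directory; point this at your real
    # claude_desktop_config.json once you have backed it up.
    add_server(
        Path("claude_desktop_config.json"),
        "document-intelligence",
        "document-intelligence-mcp",
    )
```

Restart Claude Desktop after editing the file so the new server entry is picked up.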

Usage Examples

Extract text from a PDF:

Extract the text from /path/to/report.pdf

Find tables in a PDF:

Find all tables in /path/to/financial_report.pdf

Search for a keyword:

Search for "revenue" in /path/to/annual_report.pdf

Get document stats:

Count the words and estimate reading time for /path/to/document.docx

Detect language:

What language is /path/to/document.pdf written in?
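
For the keyword search prompt above, "search with context" roughly means returning a snippet of surrounding text for each match. The sketch below illustrates the idea on a plain string using only the standard library. It is an assumption about the general technique, not the code behind tool_search_in_pdf; the function name search_with_context and the width parameter are hypothetical.

```python
def search_with_context(text: str, keyword: str, width: int = 20) -> list[str]:
    """Return a snippet around each case-insensitive match of keyword.

    Hypothetical helper for illustration only.
    """
    snippets = []
    lowered, needle = text.lower(), keyword.lower()
    start = lowered.find(needle)
    # An empty keyword would match everywhere, so it yields no results.
    while needle and start != -1:
        left = max(0, start - width)
        right = min(len(text), start + len(needle) + width)
        snippets.append(text[left:right])
        # Continue searching after the current match.
        start = lowered.find(needle, start + len(needle))
    return snippets


if __name__ == "__main__":
    sample = "Revenue grew. Total revenue was high."
    for snippet in search_with_context(sample, "revenue", width=5):
        print(repr(snippet))
```

For a PDF, the same loop could run over each page's extracted text so that matches are reported with their page number.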

Requirements

  • Python 3.10+

  • PyMuPDF >= 1.24.0

  • pdfplumber >= 0.11.0

  • python-docx >= 1.1.0

  • langdetect >= 1.0.9

License

MIT License — free to use, modify, and distribute.


Built by AiAgentKarl | Part of the AI Agent Economy toolkit
