doc-mcp-server

by jiahuidegit

📄 Document Analyzer MCP Server

English | 简体中文

PyPI version | License: MIT | Python 3.10+ | MCP

Help AI understand complex documents: an MCP server that works around AI context-window limits


🎯 Key Features

  • ✅ Smart Document Analysis - Auto-detect sections, handle merged cells

  • ✅ Multi-format Support - Excel (.xlsx, .xls) today; PDF and Word support in development

  • ✅ Precise Field Mapping - Field mapping table + section-level reading

  • ✅ High Performance - Structured caching + lazy loading
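
To make the "handle merged cells" feature concrete, here is a minimal sketch of one common way to treat merged ranges: copy the value of the top-left (anchor) cell into every cell the range covers, so each cell can be read on its own. This illustrates the general idea only and is not the server's actual implementation; the function name, the range format, and the sample grid are all assumptions.

```python
from typing import Any

# Hypothetical range format: (first_row, first_col, last_row, last_col), 0-based, inclusive.
MergedRange = tuple[int, int, int, int]


def fill_merged_cells(grid: list[list[Any]], merged: list[MergedRange]) -> list[list[Any]]:
    """Return a copy of grid where every cell of a merged range holds the anchor value."""
    filled = [row[:] for row in grid]  # copy rows so the input grid is left untouched
    for r0, c0, r1, c1 in merged:
        anchor = grid[r0][c0]  # spreadsheets store a merged value in the top-left cell
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                filled[r][c] = anchor
    return filled


# Illustrative data: "Company" spans the first two columns of row 0.
sheet = [
    ["Company", None, "Contact"],
    ["Name", "Acme", "Alice"],
]
print(fill_merged_cells(sheet, [(0, 0, 0, 1)]))
```

With a real workbook, the merged ranges would come from the spreadsheet library instead of being written by hand.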

🚀 Quick Start

Installation

macOS / Linux (Recommended with pipx)

# Install pipx
brew install pipx  # macOS
# or sudo apt install pipx  # Ubuntu/Debian

# Install doc-mcp-server
pipx install doc-mcp-server

Windows

pip install doc-mcp-server

For more installation options, see Full Installation Guide

Configure Claude Code

Add the following to ~/.claude.json or to your project's config file:

{
  "mcpServers": {
    "document-analyzer": {
      "command": "doc-mcp-server"
    }
  }
}

For detailed configuration, see Quick Start Guide
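
If the config file already has other settings, hand-editing the JSON can clobber them. The sketch below adds the entry shown above while keeping everything else in the file. It is a convenience script, not part of doc-mcp-server; the helper name `add_mcp_server` is hypothetical, and the demo writes to a scratch file rather than the real ~/.claude.json.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str) -> dict:
    """Add or update one entry under "mcpServers", keeping other settings intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # setdefault creates the "mcpServers" object only if it is missing.
    config.setdefault("mcpServers", {})[name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    import tempfile

    # Demo against a scratch file that already contains an unrelated setting.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "claude.json"
        path.write_text('{"theme": "dark"}', encoding="utf-8")
        updated = add_mcp_server(path, "document-analyzer", "doc-mcp-server")
        print(json.dumps(updated, indent=2))
```

To apply it for real, pass `Path.home() / ".claude.json"` as `config_path`, ideally after backing the file up.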

📚 Full Documentation

💡 Usage Example

# 1. Analyze document structure
analyze_document(file_path="/path/to/document.xlsx")

# 2. Read specific section
read_section(file_path="/path/to/document.xlsx", section_name="Section 1")

# 3. Read single field
read_field(file_path="/path/to/document.xlsx", field_key="Section1_CompanyName")
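
The calls above are MCP tool invocations that the AI client issues, not Python functions you import. On the wire, an MCP client sends each one as a JSON-RPC `tools/call` request. The sketch below builds those requests for the three steps, with argument names taken from the example above; the `tool_call` helper is hypothetical, and in practice the client (for example Claude Code) builds and sends these messages for you.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched to them


def tool_call(name: str, **arguments) -> dict:
    """Build an MCP "tools/call" JSON-RPC request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


path = "/path/to/document.xlsx"
requests = [
    tool_call("analyze_document", file_path=path),
    tool_call("read_section", file_path=path, section_name="Section 1"),
    tool_call("read_field", file_path=path, field_key="Section1_CompanyName"),
]
for request in requests:
    print(json.dumps(request))
```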

🛠️ Available Tools

| Tool | Description |
| --- | --- |
| analyze_document | Analyze document structure and generate metadata |
| get_structure | Get cached document structure |
| read_field | Read specific field value |
| read_section | Read entire section data |
| write_field | Write field value (Excel only) |
| list_sections | List all sections |
| list_fields | List all fields |
| export_structure | Export document structure |

🎯 Why Use This?

Problem: Large Excel files consume a large number of tokens when an AI reads them directly

  • ❌ Traditional: Read entire 323-row Excel → 15000+ tokens → Often fails

  • ✅ Using MCP: Structured reading → 2000 tokens → 90%+ success rate

Performance Improvements:

  • 🚀 Token consumption reduced by 87% (15000 → 2000)

  • ✅ Success rate improved from 30% to 90%+

  • ⚡ Handles 323 rows × 24 columns with 4249 merged cells
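
As a quick check of the arithmetic behind these figures, the snippet below recomputes the token reduction from the two token counts quoted above and the size of the 323 × 24 grid. It only reworks the numbers stated in this README; it does not measure anything.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


# Token counts quoted above: 15000 before, 2000 after.
print(f"Token reduction: {reduction_percent(15000, 2000):.1f}%")  # 86.7%, i.e. about 87%

# Size of the example sheet: 323 rows x 24 columns.
print(f"Cells in grid: {323 * 24}")  # 7752
```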

🤝 Contributing & Feedback


📄 License

MIT License - see LICENSE for details


Made with ❤️ by Yang Jiahui
