MES Document MCP

AI agents need a safe way to work with real manufacturing documents. This project provides a production-oriented MCP server and CLI for Excel, PDF, Word, and Markdown files used in MES workflows.

It does not try to make an LLM directly edit office files. It turns documents into a structured DocumentIR, keeps source provenance, requires approved PatchIR edits, validates output artifacts, and exports back to md, docx, pdf, or xlsx.

What This Is

This repository contains:

  • An MCP server: uv run python -m mes_doc_mcp.server

  • A CLI: uv run mes-doc-mcp ...

  • A Python package: mes_doc_mcp

  • Parsers for Excel, PDF, Word, Markdown, and CSV

  • MES extraction and validation helpers

  • Secure patch/edit/export workflow

  • Tests and CI gates for public GitHub use

Typical users:

  • MES engineers handling production Excel files

  • Manufacturing data teams cleaning mixed office documents

  • AI agent builders who need document tools with citations and guarded edits

  • Teams converting messy spreadsheets into reports and standard JSON

Why It Exists

MES teams often receive business-critical data in Excel, Word, PDF, and Markdown:

  • production daily reports

  • work orders

  • quality inspection sheets

  • defect reports

  • BOM and routing tables

  • LOT traceability documents

LLMs are useful here, but unsafe if they directly mutate files. This server gives the agent controlled tools instead.

source file
→ secure ingest
→ DocumentIR with source refs
→ search / inspect / MES extraction
→ PatchIR proposal
→ dry-run diff
→ human approval
→ apply
→ validate
→ export artifact
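The propose, dry-run, approve, apply portion of this flow can be sketched in a few lines of Python. This is an illustration only, not the project's PatchIR implementation: `CellPatch`, `Doc`, `dry_run`, and `apply_patch` are hypothetical names, and the only details taken from this README are that a dry run produces a diff and that writes require a `revision_id` and an `approval_id`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CellPatch:
    """Hypothetical patch: set one table cell to a new value."""
    row: int
    col: int
    new_value: str


@dataclass
class Doc:
    """Hypothetical stand-in for one revision of a parsed document."""
    revision_id: int
    cells: dict = field(default_factory=dict)  # (row, col) -> value


def dry_run(doc: Doc, patch: CellPatch) -> dict:
    """Describe what the patch would change, without mutating the document."""
    key = (patch.row, patch.col)
    return {"cell": key, "before": doc.cells.get(key), "after": patch.new_value}


def apply_patch(doc: Doc, patch: CellPatch, revision_id: int, approval_id: str) -> Doc:
    """Apply an approved patch and return a new revision; the input is untouched."""
    if revision_id != doc.revision_id:
        raise ValueError("patch targets a stale revision")
    if not approval_id:
        raise PermissionError("patch has not been approved")
    new_cells = dict(doc.cells)
    new_cells[(patch.row, patch.col)] = patch.new_value
    return Doc(revision_id=doc.revision_id + 1, cells=new_cells)
```

Returning a new `Doc` instead of editing in place keeps the earlier revision available for diffing, which is one plausible way to support the dry-run and diff steps.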

Quickstart

git clone https://github.com/sjMun09/MES-mcp.git
cd MES-mcp
uv sync --extra dev
uv run pytest -q

Create sample files:

uv run python scripts/create_examples.py

Try the CLI:

uv run mes-doc-mcp ingest examples/sample_production.xlsx
uv run mes-doc-mcp list
uv run mes-doc-mcp validate DOC_ID
uv run mes-doc-mcp mes DOC_ID
uv run mes-doc-mcp export DOC_ID md examples/out/production_report.md

Run the MCP server:

uv run python -m mes_doc_mcp.server

MCP Client Config

Use this from the repository root:

{
  "mcpServers": {
    "mes-doc-mcp": {
      "command": "uv",
      "args": ["run", "python", "-m", "mes_doc_mcp.server"],
      "cwd": "/absolute/path/to/MES-mcp"
    }
  }
}
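Because `cwd` must be an absolute path, it can be convenient to generate this block rather than type it. The helper below is a small sketch under that assumption; `build_client_config` is a hypothetical name and not part of this project. It only reproduces the JSON shown above with the repository path resolved.

```python
import json
from pathlib import Path


def build_client_config(repo_root: str) -> dict:
    """Build the MCP client config shown above with an absolute cwd."""
    root = Path(repo_root).expanduser().resolve()
    return {
        "mcpServers": {
            "mes-doc-mcp": {
                "command": "uv",
                "args": ["run", "python", "-m", "mes_doc_mcp.server"],
                "cwd": str(root),
            }
        }
    }


if __name__ == "__main__":
    # Run from the repository root and paste the output into the client config.
    print(json.dumps(build_client_config("."), indent=2))
```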

Then ask your AI client:

Ingest examples/sample_production.xlsx.
Inspect sheets, tables, formulas, merged cells, and source ranges.
Extract MES entities and validate quantity consistency.
Propose fixes as PatchIR only. Do not apply changes until I approve.

Supported Formats

| Format | Read | Patch | Export | Notes |
| --- | --- | --- | --- | --- |
| .xlsx | sheets, ranges, tables, formulas, comments, hyperlinks, merged cells, hidden rows/cols, image/chart metadata | table cells, Excel cells | md/docx/pdf/xlsx | formula overwrite is blocked unless explicitly allowed |
| .xlsm | same as xlsx | restricted | md/docx/pdf/xlsx | macros are detected and never executed |
| .csv | rows, encoding fallback | table cells | md/docx/pdf/xlsx | UTF-8, CP949, EUC-KR, Latin-1 fallback |
| .md | front matter, headings, paragraphs, lists, fenced code, tables, image links | blocks/table cells | md/docx/pdf/xlsx | best round-trip format |
| .docx | headings, paragraphs, tables, headers/footers, run style metadata, image metadata | blocks/table cells | md/docx/pdf/xlsx | complex Word XML is not fully round-tripped |
| .pdf | pages, text layer, table candidates, image metadata | IR-level edits then regenerate | md/docx/pdf/xlsx | scanned PDFs require OCR outside this project |
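The CSV encoding fallback listed in the table can be illustrated with the standard library alone. This is a minimal sketch, not the project's parser: `decode_with_fallback` and `read_csv_rows` are hypothetical names, and the assumption is that the encodings are tried in the order the table lists them.

```python
import csv
import io

# Order follows the table above; Latin-1 goes last because it accepts any byte sequence.
FALLBACK_ENCODINGS = ("utf-8", "cp949", "euc-kr", "latin-1")


def decode_with_fallback(raw: bytes, encodings=FALLBACK_ENCODINGS):
    """Return (text, encoding) for the first encoding that decodes without error."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("no listed encoding could decode the input")


def read_csv_rows(raw: bytes):
    """Decode CSV bytes with fallback and return (rows, encoding_used)."""
    text, enc = decode_with_fallback(raw)
    rows = list(csv.reader(io.StringIO(text, newline="")))
    return rows, enc
```

A successful decode does not prove the encoding was the right one, since some byte sequences are valid in more than one encoding. Recording which encoding was used, as the return value does here, makes it possible to flag the guess for review.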

Main Tools

Document:

  • document_ingest

  • document_outline

  • document_search

  • document_read_chunk

  • document_extract_tables

  • document_inspect_capabilities

  • document_propose_edit

  • document_dry_run_edit

  • document_apply_edit

  • document_diff

  • document_validate

  • document_export

Excel:

  • excel_inspect_workbook

  • excel_read_range

  • excel_write_range

  • excel_evaluate_formulas

  • excel_recalculate_workbook

MES:

  • mes_classify_document

  • mes_extract_entities

  • mes_validate_data

  • mes_generate_report

  • mes_export_json

Safety Model

Defaults are intentionally conservative:

  • Original files are copied into immutable-style storage before parsing.

  • Writes require revision_id and approval_id.

  • Patches are schema-checked and dry-run capable.

  • Formula cell overwrite is blocked unless allow_formula_overwrite=true.

  • Export will not overwrite files unless explicitly requested.

  • Export artifacts are opened and hashed before publishing.

  • Macro/OLE/ActiveX/PDF JavaScript/zip traversal/zip bomb risks are detected.

  • Blocked ingest attempts are written to audit logs.

  • Document text is treated as untrusted data, not agent instructions.
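Two of these checks are easy to illustrate, because .xlsx, .xlsm, and .docx files are zip containers. The sketch below is an assumption about how such checks could look, not the contents of security.py or artifacts.py: `find_zip_risks`, `sha256_hex`, and the `MAX_RATIO` threshold are hypothetical. It inspects member names and sizes without extracting anything.

```python
import hashlib
import io
import zipfile
from pathlib import PurePosixPath

MAX_RATIO = 100  # hypothetical threshold for uncompressed/compressed size


def find_zip_risks(data: bytes, max_ratio: int = MAX_RATIO) -> list:
    """Scan zip member metadata for traversal, zip-bomb, and macro indicators."""
    risks = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            name = info.filename
            parts = PurePosixPath(name.replace("\\", "/")).parts
            # Absolute paths or ".." segments could escape the extraction directory.
            if name.startswith("/") or ".." in parts:
                risks.append(f"path traversal: {name}")
            # A very high compression ratio is a common zip-bomb signal.
            if info.compress_size and info.file_size / info.compress_size > max_ratio:
                risks.append(f"suspicious compression ratio: {name}")
            # Macro-enabled Office files store VBA code in vbaProject.bin.
            if name.lower().endswith("vbaproject.bin"):
                risks.append(f"macro project: {name}")
    return risks


def sha256_hex(data: bytes) -> str:
    """Hash artifact bytes so an export can be verified later."""
    return hashlib.sha256(data).hexdigest()
```

A caller would typically refuse ingest and write an audit entry when `find_zip_risks` returns a non-empty list, and store the `sha256_hex` digest alongside each exported artifact.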

Production Gates

Run before publishing changes:

uv sync --extra dev
uv run pytest -q
uv run python -m compileall -q src tests
uv build

Current local status:

14 passed
compileall ok
uv build ok
server import ok

Project Layout

src/mes_doc_mcp/
  server.py        MCP tool/resource/prompt server
  cli.py           local CLI
  models.py        DocumentIR, Block, SourceRef, PatchResult
  parsers/         Excel, CSV, Markdown, Word, PDF parsers
  patches.py       PatchIR validation and application
  renderers.py     md/docx/pdf/xlsx exporters
  validation.py    structural validation gates
  security.py      file risk detection
  artifacts.py     export artifact validation
  mes.py           MES classification/extraction/validation
  office.py        optional LibreOffice recalculation helper
tests/             regression and production-gate tests
docs/              public usage docs
examples/          generated sample documents
scripts/           helper scripts

Honest Limits

  • This is not OCR. Image-only scanned PDFs need an OCR step before useful extraction.

  • PDF body editing is not true round-trip editing. Edits are made to the IR, and the PDF is regenerated from it.

  • .xls and .xlsb need a conversion layer before full support.

  • Complex pivot tables and charts are detected/preserved as metadata, not faithfully regenerated.

  • Real production confidence improves with site-specific golden fixtures.

License

MIT. See LICENSE.
