docuflow
A unified interface for AI agents and scripts to work with office documents: .docx, .xlsx, and .pdf (read-only).
Why
Documents in .docx/.xlsx share the same logic (headings, paragraphs, tables), but each library has its own API — and any non-trivial edit usually breaks the formatting (fonts, line spacing, alignment, table borders, headers/footers, sectPr).
docuflow solves this with three ideas:
Markdown as the lingua franca. read → markdown + outline + tables; create/render → markdown → docx. The agent works with familiar text instead of raw XML.
Patches by stable IDs. apply_patches with target_id="h:0.1.0" / "t:0": replace / insert_after / delete / append without losing formatting.
.docx templates as a Normal-style source. Point at a reference document and the new file inherits the font, size, line spacing, page margins, and headers/footers. The "Times New Roman 14 / 1.5 / justified" template look is no longer lost.
Features
📖 Read documents as markdown + structural outline + tables (.docx, .xlsx, .pdf)
✍️ Create from markdown (with Jinja2 templating and YAML frontmatter)
🔧 In-place patches by stable IDs: replace, insert_after, delete, append
🛡 Run-aware in-place replacement that preserves bold, italic, color, both inside and outside the match
🧬 Inline styles (Pandoc-like): {#id .class key=value} on headings, {color="red" bold}…{/} on paragraphs
📐 Template inheritance: fonts, sizes, margins, sectPr from a reference .docx
🔍 Bulk search/replace (including regex) with run-aware formatting preservation
🔒 Sandbox with a directory whitelist and path-traversal protection
📕 PDF reading via opendataloader-pdf (Java 11+)
🤖 MCP server with 11 tools for Claude Desktop / Cursor / Claude Code
🎓 Claude skill at .claude/skills/docuflow-using/SKILL.md, teaching an AI agent how to pick the right tool
Installation
pip install docuflow
# optional: PDF reading via opendataloader-pdf
pip install 'docuflow[pdf]'

Requirements:
Python 3.11+
Java 11+ (only for PDF reading; everything else works without it)
Quick start
Python API
from pathlib import Path

from docuflow import Sandbox
from docuflow.formats import build_default_registry

sb = Sandbox(roots=[Path.home() / "Documents" / "Work"])
reg = build_default_registry()

# Read
content = reg.get_reader("docx").read(sb.resolve("report.docx"))
print(content.markdown)
for node in content.outline:
    print(node.id, "—", node.title)

# Create from markdown
reg.get_writer("docx").write(
    sb.resolve("new.docx"),
    type(content)(format="docx", markdown="# Title\n\nBody.",
                  outline=[], tables=[], metadata={}),
    template=sb.resolve("reference.docx"),  # ← inherits fonts, margins, sectPr
)

# In-place patches (preserve fonts, alignment, table borders)
from docuflow.core import EditPatch

reg.get_editor("docx").apply(sb.resolve("report.docx"), [
    EditPatch(op="replace", target_id="h:0", content="# Renamed"),
    EditPatch(op="replace", target_id="t:0", content="| A | B |\n|---|---|\n| 1 | 2 |"),
    EditPatch(op="insert_after", target_id="h:0.0", content="## New section"),
])

MCP server
docuflow-mcp --root ~/Documents/Work --root ~/Documents/Templates

Or via environment variable:

export DOCUFLOW_ROOTS="~/Documents/Work:~/Documents/Templates"
docuflow-mcp --env-roots

Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "docuflow": {
      "command": "docuflow-mcp",
      "args": ["--root", "/Users/me/Documents/Work"]
    }
  }
}

MCP tools (11)
Group | Tool | Purpose |
Read | | markdown + outline + tables |
Read | | structure only (cheap for large files) |
Read | | tables only |
Create | | new docx/xlsx from markdown |
Create | render_template | Jinja2 + YAML frontmatter → docx/xlsx |
Create | list_template_variables | which |
Edit | edit_document | rewrite from markdown (legacy) |
Edit | apply_patches | in-place patches by ID (recommended) |
Edit | | in-place search/replace (run-aware) |
Meta | | author, size, pages |
Meta | | available directories |
Edit pattern: outline → patches
[
  {"op": "replace", "target_id": "h:0", "content": "# Renamed"},
  {"op": "insert_after", "target_id": "h:1.0", "content": "## New section\n\nBody."},
  {"op": "replace", "target_id": "t:0", "content": "| A | B |\n|---|---|\n| X | Y |"},
  {"op": "delete", "target_id": "h:2.1"},
  {"op": "append", "content": "# Tail\n\nFinal paragraph."}
]

Identifiers:
h:N.N.N: heading (N is the 0-indexed sibling position; the number of segments equals the level).
t:N: table (N is the global index in the document).
h:0 / h:0.0 / h:0.0.0: 1st, 2nd, 3rd level respectively.
Why not edit_document for patches? A full rewrite loses styles, fonts, headers/footers, and embedded images. Use apply_patches for targeted changes.
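To make the identifier scheme concrete, here is a minimal sketch of parsing auto-generated IDs. `parse_target_id` is a hypothetical helper, not part of docuflow's API, and it only covers the `h:` and `t:` forms described above (custom `{#id}` overrides are ignored).

```python
def parse_target_id(target_id):
    """Split an auto-generated ID into (kind, indices).

    Hypothetical helper for illustration only; docuflow's own
    parsing may differ.
    """
    kind, sep, rest = target_id.partition(":")
    if not sep or kind not in ("h", "t"):
        raise ValueError(f"unrecognised target_id: {target_id!r}")
    parts = rest.split(".")
    if not all(part.isdigit() for part in parts):
        raise ValueError(f"non-numeric segment in {target_id!r}")
    indices = tuple(int(part) for part in parts)
    if kind == "t" and len(indices) != 1:
        raise ValueError("table IDs have exactly one index")
    return ("heading" if kind == "h" else "table", indices)


kind, indices = parse_target_id("h:0.1.0")
# For headings, the number of segments is the heading level.
print(kind, indices, "level", len(indices))
```

Reading the level off the segment count is what lets an agent tell a second-level heading from a third-level one without re-reading the document.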
Templates
Jinja2 + YAML frontmatter
templates/contract.md:

---
template_version: 1
style:
  default_font: Times New Roman
  default_size: 12
  align: justify
---
# Contract No. {{number}}
City of {{city}}, {{date}}
{% for stage in stages %}
- **Stage {{loop.index}}**: {{stage.description}} — {{stage.amount}} ₽
{% endfor %}

render_template(
    template_path="templates/contract.md",
    output_path="contracts/2026-001.docx",
    data={
        "number": "2026-001",
        "city": "Moscow",
        "date": "2026-06-17",
        "stages": [
            {"description": "Prepayment", "amount": "100000"},
            {"description": "Settlement", "amount": "900000"},
        ],
    },
)

render_template is strict-undefined: a missing variable raises TemplateRenderError. For large templates, call list_template_variables first and diff against data.
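The "diff against data" step can be as small as a set difference. The sketch below assumes the variable listing comes back as an iterable of top-level names; the actual return shape of list_template_variables is not documented here, and `missing_variables` is a hypothetical helper.

```python
def missing_variables(required, data):
    """Return template variables that `data` does not provide.

    `required` is assumed to be an iterable of top-level variable names.
    """
    return sorted(set(required) - set(data))


# Names taken from the contract template above.
required = ["number", "city", "date", "stages"]
data = {"number": "2026-001", "city": "Moscow"}
print(missing_variables(required, data))
```

Running this check before rendering turns a TemplateRenderError into a list of names to fill in.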
.docx template as a formatting source
If you need a specific look ("Times New Roman 14, line spacing 1.5, 2 cm margins, table borders"), pass a reference .docx to template=:
writer.write(out, content, template=Path("reference.docx"))

The writer clears the reference body and fills it with markdown, inheriting the Normal style, sectPr (page margins, page size, headers/footers), and table-style definitions. If the reference is minimal (missing Heading 1, List Bullet, Table Grid), those styles are auto-created.
Inline styles (Pandoc-like)
Heading attributes
# Introduction {#intro}
## Methods {#methods color="#1f4e79" align="center" bold}
# Sheet: Data {#hdr bold color="#1f4e79"}

#id: fixes the node ID in the outline.
.class: classes (.foo .bar).
key=value: properties (color, align, bold, size, font, italic, underline).
A bare token (no value) is treated as =true (bold → bold=true).
Color is #rrggbb or #rgb only. For .docx the attributes are applied to all runs of the heading. For .xlsx — to the first header row of the sheet.
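As an illustration of the attribute rules above, the following sketch parses a trailing `{...}` block with the standard library only. `parse_heading` is a hypothetical helper, not docuflow's parser, and it represents a bare token as Python `True`.

```python
import re
import shlex

# A trailing {...} block at the end of a heading line.
ATTR_BLOCK = re.compile(r"\{([^{}]*)\}\s*$")


def parse_heading(line):
    """Return (title, attrs) for a heading line with optional attributes."""
    attrs = {"id": None, "classes": [], "props": {}}
    match = ATTR_BLOCK.search(line)
    if match is None:
        return line.lstrip("#").strip(), attrs
    title = line[: match.start()].lstrip("#").strip()
    # shlex.split keeps quoted values together and drops the quotes.
    for token in shlex.split(match.group(1)):
        if token.startswith("#"):
            attrs["id"] = token[1:]
        elif token.startswith("."):
            attrs["classes"].append(token[1:])
        elif "=" in token:
            key, value = token.split("=", 1)
            attrs["props"][key] = value
        else:
            attrs["props"][token] = True  # bare token, e.g. bold
    return title, attrs


print(parse_heading('## Methods {#methods color="#1f4e79" align="center" bold}'))
```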
Inline spans in paragraphs (.docx only)
Text with {color="red" bold}important{/} part.

Syntax {attrs}text{/}: the text between {attrs} and {/} receives the attributes. The closing {/} is required. Supported attributes: color, bold, italic, underline, size, font.
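A paragraph with inline spans can be thought of as a sequence of (text, attributes) segments. The sketch below splits one that way; `split_spans` is a hypothetical helper that does not handle nested spans and is not docuflow's actual parser.

```python
import re
import shlex

# {attrs}text{/}, where attrs must not start with "/" (that is the closer).
SPAN = re.compile(r"\{(?!/)([^{}]+)\}(.*?)\{/\}", re.DOTALL)


def split_spans(text):
    """Split a paragraph into (text, attrs) segments."""
    segments = []
    pos = 0
    for match in SPAN.finditer(text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], {}))
        attrs = {}
        for token in shlex.split(match.group(1)):
            key, sep, value = token.partition("=")
            attrs[key] = value if sep else True  # bare token means true
        segments.append((match.group(2), attrs))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], {}))
    return segments


for segment in split_spans('Text with {color="red" bold}important{/} part.'):
    print(segment)
```

Each segment maps naturally onto one run in the output paragraph, which is why an unclosed span cannot be rendered.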
Document style (YAML frontmatter)
---
title: Quarterly Report
author: Analytics
style:
  default_font: Arial
  default_size: 11
  align: justify
---

For .docx, default_font / default_size / align apply to the Normal style. For .xlsx, they apply to cells from the 2nd row on (the header row stays as is).
Heading IDs
Auto-generated as 0-indexed hierarchical (h:0, h:0.0, h:0.1.0). {#custom-id} overrides.
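One way such IDs could be derived from a flat list of heading levels is sketched below. `heading_ids` is a hypothetical helper. How docuflow numbers a heading that skips a level (an H3 directly under an H1) is not stated here, so the sketch assumes index 0 for the skipped segment, and it does not model `{#custom-id}` overrides.

```python
def heading_ids(levels):
    """Map heading levels (1 = top level) to hierarchical 0-indexed IDs."""
    counters = []  # counters[i] is the sibling index at depth i + 1
    ids = []
    for level in levels:
        if level < 1:
            raise ValueError("heading levels start at 1")
        if level > len(counters):
            # Going deeper: the first child at each new depth gets index 0.
            counters.extend([0] * (level - len(counters)))
        else:
            # Same depth or shallower: drop deeper counters, move to next sibling.
            del counters[level:]
            counters[-1] += 1
        ids.append("h:" + ".".join(str(n) for n in counters))
    return ids


print(heading_ids([1, 2, 2, 3, 1, 2]))
```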
Architecture
docuflow/
├── core/ ← models, sandbox, markdown, registry
├── formats/
│ ├── docx/ ← Reader, Writer, Editor (in-place)
│ ├── xlsx/ ← Reader, Writer, Editor
│ └── pdf/ ← Reader (opendataloader-pdf)
└── mcp_server.py ← MCP tools
.claude/skills/
└── docuflow-using/ ← Claude skill for agent integration

Two editor modes:
markdown-roundtrip (apply_markdown): rewrite the document from markdown, for "create from scratch".
in-place (apply, find_and_replace_in_place): modify XML elements directly, preserving run fonts, table borders, line spacing, sectPr, headers/footers. Recommended for edits.
In-place find_and_replace runs in run-aware mode: for each match it splits the first and last run at the match boundaries, removes runs that fall entirely inside the match, and inserts a single new run with the replacement text, inheriting the first affected run's formatting. Bold/italic/color outside the matched fragment is preserved. Paragraph-level format (alignment, indent, line-spacing, style) is always preserved.
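The run-splitting steps above can be sketched on a simplified model where a paragraph is a list of (text, formatting) runs. `Run` and `replace_first_run_aware` are hypothetical names for illustration, not docuflow's internals, and this version handles only the first match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    text: str
    fmt: frozenset = frozenset()  # e.g. frozenset({"bold"})


def replace_first_run_aware(runs, needle, replacement):
    """Replace the first match of `needle`, keeping formatting outside it."""
    full_text = "".join(run.text for run in runs)
    start = full_text.find(needle) if needle else -1
    if start < 0:
        return list(runs)
    end = start + len(needle)

    result = []
    inserted = False
    pos = 0
    for run in runs:
        run_start, run_end = pos, pos + len(run.text)
        pos = run_end
        if run_end <= start or run_start >= end:
            result.append(run)  # entirely outside the match: untouched
            continue
        # Split at the match boundaries; text inside the match is dropped.
        before = run.text[: max(0, start - run_start)]
        after = run.text[end - run_start:]
        if before:
            result.append(Run(before, run.fmt))
        if not inserted:
            # The new run inherits the first affected run's formatting.
            result.append(Run(replacement, run.fmt))
            inserted = True
        if after:
            result.append(Run(after, run.fmt))
    return result


runs = [
    Run("Total: ", frozenset({"bold"})),
    Run("10", frozenset({"italic"})),
    Run("0 USD"),
]
for run in replace_first_run_aware(runs, "100", "250"):
    print(repr(run.text), sorted(run.fmt))
```

Here the match "100" straddles two runs. The bold label and the unformatted " USD" tail come through unchanged, and the replacement takes the italic formatting of the first run it touched.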
Security
All paths pass through Sandbox.resolve(). Attempts to escape the allowed roots raise PathSecurityError. Symlinks are resolved; relative paths are resolved against the first root.
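The behaviour described here can be approximated with pathlib alone. The sketch below is an assumption-laden illustration, not docuflow's Sandbox: `resolve_in_roots` and `PathEscapeError` are hypothetical stand-ins for Sandbox.resolve() and PathSecurityError.

```python
from pathlib import Path


class PathEscapeError(Exception):
    """Hypothetical stand-in for docuflow's PathSecurityError."""


def resolve_in_roots(roots, user_path):
    """Resolve `user_path` and require that it stays inside one of `roots`."""
    resolved_roots = [Path(root).resolve() for root in roots]
    candidate = Path(user_path)
    if not candidate.is_absolute():
        # Relative paths are resolved against the first root.
        candidate = resolved_roots[0] / candidate
    # resolve() follows symlinks and collapses ".." before the check.
    candidate = candidate.resolve()
    for root in resolved_roots:
        if candidate.is_relative_to(root):
            return candidate
    raise PathEscapeError(f"{user_path!r} is outside the allowed roots")


root = Path("/tmp/docuflow-sketch-root")  # illustrative path, need not exist
print(resolve_in_roots([root], "report.docx"))
try:
    resolve_in_roots([root], "../outside.docx")
except PathEscapeError as exc:
    print("blocked:", exc)
```

Checking containment after resolution, rather than scanning the raw string for "..", is what also catches symlinks that point outside a root.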
Limitations
Preserved | NOT preserved from |
| Page headers/footers (unless via |
| Section breaks, page numbering |
| Embedded images, OLE objects |
| Track changes, comments |
Headers/footers (when |
For full visual fidelity with a complex template, use template= + apply_patches — the markdown parser keeps most of the template's formatting because you're editing text rather than recreating the document.
Common errors
Path outside sandbox → PathSecurityError. Don't try to bypass via .. (it's blocked). Ask to add a root.
Unknown format → UnsupportedFormatError. Supported: docx, xlsx, pdf (read-only).
Missing template variable → TemplateRenderError. Strict-undefined by design.
edit_document instead of apply_patches and lost formatting → restore from a backup; next time use patches.
PDF without Java → PdfJavaNotFoundError. Install JDK 11+.
Claude skill
A Claude skill ships with the repo at .claude/skills/docuflow-using/SKILL.md. It teaches the agent:
which MCP tool to pick for a given task (read / create / patch / template),
the outline→patches workflow and how to read stable IDs (h:N.N.N, t:N),
inline-styles syntax (Pandoc-like),
the correct workflow for full template fidelity (template= + apply_patches),
common errors and their fixes.
For Russian-speaking agents the same skill is available at .claude/skills/docuflow-using/SKILL.md — it's authored in Russian and references README_RU.md.
Development
# Clone
git clone https://github.com/deja111vu/docuflow.git
cd docuflow
# Create venv
python -m venv .venv
. .venv/bin/activate # Linux/macOS
.venv\Scripts\activate.bat # Windows
# Install editable + dev deps
pip install -e '.[dev,pdf]'
# Tests + coverage (≥85%)
pytest --cov=docuflow
# Lint
ruff check src tests
mypy src/docuflow

Test stack
pytest + pytest-cov
hypothesis (property-based tests for sandbox and outline)
All tests are isolated: tmp_path + Sandbox(roots=[tmp_path])
CI
GitHub Actions runs ruff, mypy --strict, pytest --cov on Python 3.11 / 3.12 / 3.13.
Roadmap
v0.2.0 (current): in-place editor, run-aware replace, template-aware writer, auto-style creation, inline styles, PDF reader.
v0.3.0: pptx (read/write/edit), track changes, comments.
v0.4.0: bulk operations (merge, split), benchmarks, performance budgets.
See CHANGELOG.md for the full version history.
License
MIT. See LICENSE.
Acknowledgements
python-docx — docx engine core.
openpyxl — xlsx engine core.
opendataloader-pdf — PDF reader.
markdown-it-py — markdown parser.
MCP — agent integration protocol.