DOCX-MCP: Word Document MCP Server
A comprehensive Model Context Protocol (MCP) server for Microsoft Word document operations, built with FastMCP and python-docx. Currently focused on advanced table operations with plans to expand into full document manipulation capabilities.
Features
Phase 1 - Core Table Operations (Completed)
Document Management
Open/create Word documents
Save documents with optional rename
Get document information and metadata
Table Structure Operations
Create tables with customizable dimensions
Delete tables by index
Add/remove rows and columns at any position
Support for header rows
Table Data Operations
Set/get individual cell values with optional styling
Bulk table data retrieval (array, object, CSV formats)
List all tables in document with metadata
Comprehensive table structure and style analysis
Phase 2.1 - Cell Formatting (New!)
Text Formatting
Font family, size, and color customization
Bold, italic, underline, strikethrough styling
Subscript and superscript support
Cell Alignment
Horizontal alignment (left, center, right, justify)
Vertical alignment (top, middle, bottom)
Visual Styling
Cell background colors (hex color support)
Cell borders with customizable styles, widths, and colors
Complete formatting (apply all options at once)
Phase 2.8 - Table Structure Analysis (New!)
Comprehensive Table Analysis
Complete table structure analysis with cell-by-cell details
Automatic detection of merged cells and their ranges
Full style extraction (fonts, colors, alignment, borders)
Header row identification using intelligent heuristics
Support for analyzing single tables or all tables in document
LLM-Friendly Formatting
Enhanced cell operations that preserve existing styles
Optional style application when setting cell values
Detailed formatting information when reading cell values
Helps keep document styling consistent during LLM-driven edits
Phase 2 - Advanced Table Features (In Progress)
Table Formatting & Styling
Cell formatting (bold, italic, font, color, alignment)
Table borders and shading
Row height and column width adjustment
Table positioning and text wrapping
Data Import/Export
CSV import to tables
Excel data import
JSON data mapping to tables
Bulk data operations
Table Search & Query
Search cell content across tables
Filter table data by criteria
Sort table rows by column values
Table data validation
Phase 3 - Extended Table Features (Planned)
Table Templates & Automation
Predefined table templates
Table style libraries
Automated table generation from data
Table cloning and duplication
Advanced Operations
Table merging and splitting
Cross-table references
Calculated fields and formulas
Table relationship management
Phase 4 - Document Operations (Future)
Content Management
Text insertion and formatting
Paragraph operations
Heading and outline management
Document structure manipulation
Media & Objects
Image insertion and positioning
Shape and drawing objects
Charts and graphs integration
Hyperlinks and bookmarks
Document Formatting
Page layout and margins
Headers and footers
Styles and themes
Document properties and metadata
Installation
Prerequisites
Python 3.8+
pip package manager
Install from Source
Clone the repository:
Install dependencies:
Install the package in development mode:
Usage
As MCP Server
Run the MCP server with default STDIO transport:
Independent Tool Design: Each MCP tool works independently and does not require a document to be loaded first. Call any tool with a file path and the document is loaded automatically as needed, which makes the tools better suited to AI model integration.
Transport Protocols
The server supports multiple transport protocols:
STDIO (default) - Standard input/output for direct integration:
SSE (Server-Sent Events) - HTTP-based streaming:
Streamable HTTP - HTTP with streaming support:
Command Line Options
Available options:
--transport {stdio,sse,streamable-http} - Transport protocol (default: stdio)
--host HOST - Host to bind to for HTTP/SSE transports (default: localhost)
--port PORT - Port to bind to for HTTP/SSE transports (default: 8000)
--no-banner - Disable startup banner
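The options above map naturally onto an argparse parser. The following is a minimal sketch, not the project's actual entry point: the helper name build_parser and the prog value are assumptions, while the flag names, choices, and defaults are taken from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="docx-mcp")  # prog name is an assumption
    parser.add_argument(
        "--transport",
        choices=["stdio", "sse", "streamable-http"],
        default="stdio",
        help="Transport protocol (default: stdio)",
    )
    parser.add_argument(
        "--host",
        default="localhost",
        help="Host to bind to for HTTP/SSE transports (default: localhost)",
    )
    parser.add_argument(
        "--port",
        type=int,
        default=8000,
        help="Port to bind to for HTTP/SSE transports (default: 8000)",
    )
    parser.add_argument(
        "--no-banner",
        action="store_true",
        help="Disable startup banner",
    )
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--transport", "sse", "--port", "9000"])
print(args.transport, args.host, args.port, args.no_banner)
```

Options that are not passed fall back to the documented defaults, so an empty argument list selects the STDIO transport.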
Direct Usage
Automatic Document Loading: Documents are loaded automatically when needed, so there is no need to call open_document() before using other operations. Documents are cached for performance, and you can still manage document loading manually if preferred.
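The load-on-demand behaviour with caching can be pictured roughly as follows. This is a sketch under stated assumptions, not the server's actual code: DocumentCache and fake_loader are hypothetical names, and the loader is passed in as a plain callable so the example runs without python-docx installed.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List


class DocumentCache:
    """Load a document on first use and reuse it afterwards (hypothetical helper)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader
        self._documents: Dict[str, Any] = {}

    def get(self, file_path: str) -> Any:
        # Normalise the path so different spellings of one file share an entry.
        key = str(Path(file_path).resolve())
        if key not in self._documents:
            self._documents[key] = self._loader(key)
        return self._documents[key]


load_calls: List[str] = []


def fake_loader(path: str) -> dict:
    """Stand-in for a real document loader (assumption for illustration)."""
    load_calls.append(path)
    return {"path": path}


cache = DocumentCache(fake_loader)
first = cache.get("report.docx")
second = cache.get("./report.docx")  # same file, so the cached object is reused
print(len(load_calls))
```

Keying the cache on the resolved path is one possible design choice; it avoids loading the same file twice when callers pass relative and absolute paths interchangeably.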
Available MCP Tools
All tools accept JSON parameters and return JSON responses, making them compatible with language models.
Document Operations
open_document(file_path, create_if_not_exists=True) - Open or create a Word document (optional; tools auto-load)
save_document(file_path, save_as=None) - Save a Word document (auto-loads if needed)
get_document_info(file_path) - Get document information (auto-loads if needed)
Table Structure Operations
create_table(file_path, rows, cols, position="end", paragraph_index=None, headers=None) - Create a new table
delete_table(file_path, table_index) - Delete a table
add_table_rows(file_path, table_index, count=1, position="end", row_index=None) - Add rows to a table
add_table_columns(file_path, table_index, count=1, position="end", column_index=None) - Add columns to a table
delete_table_rows(file_path, table_index, row_indices) - Delete rows from a table
Data Operations
set_cell_value(file_path, table_index, row_index, column_index, value, ...) - Set cell value with optional formatting
get_cell_value(file_path, table_index, row_index, column_index, include_formatting=True) - Get cell value with formatting info
get_table_data(file_path, table_index, include_headers=True, format="array") - Get entire table data
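The array, object, and CSV formats mentioned under Features can be thought of as three views of the same rows. The sketch below illustrates that idea only: convert_rows is a hypothetical helper, the "object" and "csv" format strings are assumed from the feature list, and the exact shape returned by get_table_data is not specified here.

```python
import csv
import io
from typing import Dict, List, Union

Rows = List[List[str]]


def convert_rows(rows: Rows, fmt: str = "array") -> Union[Rows, List[Dict[str, str]], str]:
    """Render rows as a nested list, a list of dicts, or CSV text (hypothetical helper)."""
    if fmt == "array":
        return rows
    if fmt == "object":
        if not rows:
            return []
        headers, *body = rows  # the first row is treated as the header row
        return [dict(zip(headers, row)) for row in body]
    if fmt == "csv":
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        return buffer.getvalue()
    raise ValueError(f"Unsupported format: {fmt!r}")


rows = [["Name", "Qty"], ["Widget", "3"], ["Gadget", "5"]]
as_objects = convert_rows(rows, "object")
as_csv = convert_rows(rows, "csv")
print(as_objects)
print(as_csv)
```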
Query Operations
list_tables(file_path, include_summary=True) - List all tables in document
Table Structure Analysis Operations (New in Phase 2.8!)
analyze_table_structure(file_path, table_index) - Comprehensive analysis of single table structure and styles
analyze_all_tables_structure(file_path) - Analyze all tables in document with complete details
Table Search Operations (New in Phase 2.7!)
search_table_content(file_path, query, search_mode="contains", case_sensitive=False, table_indices=None, max_results=None) - Search for content within table cells
search_table_headers(file_path, query, search_mode="contains", case_sensitive=False) - Search specifically in table headers
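How search_mode and case_sensitive might interact for a single cell is sketched below. This is an illustration rather than the server's implementation: cell_matches is a hypothetical helper, and the mode names "exact" and "regex" are assumed from the roadmap's list of search modes ("contains" appears in the signatures above).

```python
import re


def cell_matches(text: str, query: str, search_mode: str = "contains",
                 case_sensitive: bool = False) -> bool:
    """Return True if one cell's text matches the query (hypothetical helper)."""
    if search_mode == "regex":
        flags = 0 if case_sensitive else re.IGNORECASE
        return re.search(query, text, flags) is not None
    if not case_sensitive:
        # Lower-case both sides so the comparison ignores case.
        text, query = text.lower(), query.lower()
    if search_mode == "exact":
        return text == query
    if search_mode == "contains":
        return query in text
    raise ValueError(f"Unsupported search mode: {search_mode!r}")


cells = ["Total Revenue", "Net revenue", "Costs"]
hits = [cell for cell in cells if cell_matches(cell, "revenue")]
print(hits)
```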
Cell Formatting Operations (New in Phase 2.1!)
format_cell_text(file_path, table_index, row_index, column_index, ...) - Format text in cell
format_cell_alignment(file_path, table_index, row_index, column_index, horizontal, vertical) - Set cell alignment
format_cell_background(file_path, table_index, row_index, column_index, color) - Set cell background color
format_cell_borders(file_path, table_index, row_index, column_index, ...) - Set cell borders
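Since background colors are given as hex values, a caller may want to validate a color string before passing it to a formatting tool. The following sketch shows one way to do that; parse_hex_color is a hypothetical helper, and accepting an optional leading "#" is an assumption, not documented behaviour of the server.

```python
import re
from typing import Tuple

# Six hex digits with an optional leading "#" (the optional "#" is an assumption).
_HEX_COLOR = re.compile(r"^#?([0-9a-fA-F]{6})$")


def parse_hex_color(color: str) -> Tuple[int, int, int]:
    """Convert a hex color string into an (R, G, B) tuple (hypothetical helper)."""
    match = _HEX_COLOR.match(color.strip())
    if match is None:
        raise ValueError(f"Not a 6-digit hex color: {color!r}")
    digits = match.group(1)
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


print(parse_hex_color("#FFCC00"))
```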
Example Language Model Usage
Language models can call these tools with JSON parameters. No pre-loading required - each tool call is independent:
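As a rough illustration of such calls, the sketch below builds JSON payloads for create_table and set_cell_value using the parameter names from the tool list above. The tool_call helper, the file name, and the cell values are hypothetical; the name/arguments envelope follows the general shape of an MCP tool call and may differ from what a particular client sends.

```python
import json
from typing import Any


def tool_call(name: str, **arguments: Any) -> str:
    """Serialise one tool invocation as JSON (hypothetical helper)."""
    return json.dumps({"name": name, "arguments": arguments}, indent=2)


# Each call carries its own file_path, so no earlier open_document call is needed.
create = tool_call(
    "create_table",
    file_path="report.docx",  # illustrative path
    rows=3,
    cols=2,
    headers=["Name", "Qty"],
)
fill = tool_call(
    "set_cell_value",
    file_path="report.docx",
    table_index=0,
    row_index=1,
    column_index=0,
    value="Widget",
)
print(create)
print(fill)
```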
New! Cell Formatting Examples:
New! Enhanced Cell Operations Examples:
New! Table Structure Analysis Examples:
New! Table Search Examples:
Testing
The project uses pytest for comprehensive testing with 93 test cases covering all functionality.
Run all tests:
Run tests with coverage:
Run specific test categories:
Run tests with verbose output:
Development Roadmap
Phase 2: Advanced Table Features (In Progress)
Priority: High - Completing table functionality before expanding scope
Cell Formatting & Styling - COMPLETED
Cell text formatting (bold, italic, underline, font family/size, color)
Cell background colors with hex color support
Cell borders with customizable styles, widths, and colors
Text alignment (horizontal: left, center, right, justify)
Vertical alignment (top, middle, bottom)
Complete formatting (apply all options at once)
Row height and column width controls
Data Import/Export
CSV file import to tables
Excel file data import (.xlsx)
JSON data structure mapping
Enhanced export with formatting preservation
Bulk cell data operations
Table Search & Query - COMPLETED
Search content across all table cells
Search specifically in table headers
Multiple search modes (exact, contains, regex)
Case-sensitive and case-insensitive search
Search specific tables or all tables
Limit search results
Filter table rows by column criteria
Sort table data by column values
Find and replace in table content
Table Structure Analysis - COMPLETED
Comprehensive table structure analysis
Cell-by-cell style and content extraction
Automatic merged cell detection
Header row identification heuristics
Enhanced cell operations with style preservation
LLM-friendly formatting information
Phase 3: Extended Table Features
Priority: Medium - Advanced table manipulation
Table Templates & Automation
Predefined table styles and layouts
Table template library system
Auto-generate tables from data schemas
Table duplication and cloning
Advanced Table Operations
Merge and split table cells
Table-to-table data relationships
Calculated fields and basic formulas
Cross-reference table data
Performance & Optimization
Batch operations for large tables
Memory optimization for big documents
Caching for frequently accessed data
Async operations support
Phase 4: Document Content Operations
Priority: Medium - Expanding beyond tables
Text & Paragraph Management
Insert and format text content
Paragraph styling and spacing
Bullet points and numbering
Text search and replace
Document Structure
Heading hierarchy management
Table of contents generation
Section breaks and page layout
Document outline operations
Phase 5: Media & Advanced Features
Priority: Low - Rich document features
Media Integration
Image insertion and positioning
Charts and graphs from table data
Shape and drawing objects
Embedded object support
Advanced Document Features
Headers and footers
Page numbering and layout
Document properties and metadata
Track changes and comments
Phase 6: Enterprise Features
Priority: Low - Production-ready enhancements
Security & Compliance
Document encryption support
Access control and permissions
Audit logging and tracking
Data validation and sanitization
Integration & Extensibility
Plugin system architecture
External data source connections
API rate limiting and throttling
Multi-document operations
Architecture
Project Structure
Key Design Principles
Modular Architecture: Clear separation between document, table, and future operations
Type Safety: Full type hints and Pydantic models for data validation
Error Handling: Comprehensive exception handling with detailed error messages
Extensibility: Easy to add new operation types and document features
Testing: High test coverage with unit and integration tests
Error Handling
The library includes comprehensive error handling with custom exceptions:
DocumentNotFoundError - Document file not found
TableNotFoundError - Table not found in document
InvalidTableIndexError - Invalid table index
InvalidCellPositionError - Invalid cell position
TableOperationError - Table operation failed
DataFormatError - Invalid data format
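One way such exceptions could be organised and used is sketched below. The class names come from the list above, but the shared base class DocxMcpError, the flat hierarchy, and the get_table helper are assumptions made for illustration and may not match the library's real definitions.

```python
from typing import List


class DocxMcpError(Exception):
    """Hypothetical common base class for the exceptions listed above."""


class DocumentNotFoundError(DocxMcpError):
    """Document file not found."""


class TableNotFoundError(DocxMcpError):
    """Table not found in document."""


class InvalidTableIndexError(DocxMcpError):
    """Invalid table index."""


class InvalidCellPositionError(DocxMcpError):
    """Invalid cell position."""


class TableOperationError(DocxMcpError):
    """Table operation failed."""


class DataFormatError(DocxMcpError):
    """Invalid data format."""


def get_table(tables: List[object], table_index: int) -> object:
    """Return one table or raise a descriptive error (hypothetical helper)."""
    if not 0 <= table_index < len(tables):
        raise InvalidTableIndexError(
            f"Table index {table_index} is out of range; document has {len(tables)} table(s)"
        )
    return tables[table_index]


try:
    get_table([], 0)
except DocxMcpError as exc:  # one handler covers every error type in the sketch
    message = str(exc)
    print(message)
```

With a shared base class, callers can catch every library error in one place while still distinguishing specific failures when needed.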
Contributing
We welcome contributions! Here's how to get started:
Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes following our coding standards
Add tests for new functionality
Ensure all tests pass (pytest)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
Development Guidelines
Follow PEP 8 coding standards
Add type hints to all functions
Write comprehensive tests for new features
Update documentation for API changes
Ensure backward compatibility when possible
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
Issues: Report bugs and request features on GitHub Issues
Documentation: Check the examples/ directory for usage examples
Testing: Run the test suite to verify functionality
Dependencies
Python 3.8+ - Core runtime
python-docx ≥ 1.1.0 - Word document manipulation
fastmcp ≥ 0.4.0 - MCP server framework
pytest - Testing framework (development)
Project Status
Current Capabilities (Phase 1 + 2.1 + 2.7 + 2.8)
Test Coverage: 93/93 tests passing (100%)
MCP Tools: 19 available tools (11 core + 4 formatting + 2 search + 2 analysis)
Modules: 8 core modules with clean architecture
Formatting: Complete cell formatting support with style preservation
Search: Comprehensive table search capabilities
Analysis: Deep table structure and style analysis for LLMs
Documentation: Comprehensive API docs and examples
Recent Additions (Phase 2.8 + Tool Independence)
Independent Tool Design: Each MCP tool works independently without requiring document pre-loading
Automatic Document Loading: Documents are loaded automatically when needed and cached for performance
Table Structure Analysis: Complete analysis of table structure, styles, and merged cells
Enhanced Cell Operations: Set/get cell values with optional formatting preservation
Style Detection: Automatic extraction of fonts, colors, alignment, borders
Merge Detection: Identify and report merged cell regions
LLM Integration: Helps keep document styling consistent during AI operations
Background Color Fix: Resolved background color setting and extraction issues
Next Milestones
Phase 2.2: Data import/export (CSV, Excel, JSON)
Phase 3: Advanced table features and templates
DOCX-MCP - Making Word document automation accessible through the Model Context Protocol!