
DOCX MCP Server

by zeph-gh

A comprehensive Model Context Protocol (MCP) server for processing Microsoft Word (.docx) documents with full formatting support.

Features

This MCP server processes DOCX documents using the mammoth library and provides:

  • Text Extraction: Extract plain text with word count
  • HTML Conversion: Convert to HTML with preserved formatting
  • Structure Analysis: Analyze document structure, headings, and formatting elements
  • Image Extraction: Extract embedded images (as base64 or save to files)
  • Markdown Conversion: Convert to Markdown format
  • Rich Formatting Support: Handles bold, italic, lists, headings, and more

Available Tools

1. extract_text

Extract plain text content from a DOCX file.

Parameters:

  • file_path (string): Path to the .docx file

Returns:

  • Plain text content
  • Processing messages
  • Word count
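
As a rough illustration of the word count returned alongside the text, the sketch below counts whitespace-separated tokens. The helper name countWords and the counting rule are assumptions; the README does not say how the server counts words.

```typescript
// Hypothetical helper (not taken from the server's source): counts words in
// extracted plain text by splitting on runs of whitespace.
function countWords(text: string): number {
  const trimmed = text.trim();
  if (trimmed === "") {
    return 0; // empty or whitespace-only text has no words
  }
  return trimmed.split(/\s+/).length;
}

// Example: text as it might come back from extract_text.
const sampleText = "Hello  world\nfrom a DOCX file";
const sampleWordCount = countWords(sampleText);
```

Splitting on whitespace treats hyphenated terms and punctuation-attached tokens as single words, so the result may differ slightly from the count Word itself reports.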

2. convert_to_html

Convert DOCX file to HTML with formatting preserved.

Parameters:

  • file_path (string): Path to the .docx file
  • include_styles (boolean, optional): Include inline styles (default: true)

Returns:

  • HTML content with formatting
  • Processing messages
  • Warnings and errors

3. analyze_structure

Analyze document structure, headings, and formatting elements.

Parameters:

  • file_path (string): Path to the .docx file

Returns:

  • Document statistics (characters, words, paragraphs, headings)
  • Structure analysis (headings with levels)
  • Formatting analysis (counts of bold, italic, and list elements)
  • Processing messages
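
One plausible way to obtain "headings with levels" is to scan the HTML produced from the document. The sketch below does that with a regular expression; the Heading type and extractHeadings function are hypothetical names, and the server's actual approach may differ.

```typescript
// Hypothetical shape for one entry of the structure analysis.
interface Heading {
  level: number; // 1 for <h1> ... 6 for <h6>
  text: string;  // heading text with inline tags removed
}

// Scans an HTML string for <h1>..<h6> elements, in document order.
function extractHeadings(html: string): Heading[] {
  const headings: Heading[] = [];
  const pattern = /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const inner = match[2] ?? "";
    // Drop nested inline markup such as <strong> or <a>.
    const text = inner.replace(/<[^>]+>/g, "").trim();
    headings.push({ level: Number(match[1]), text });
  }
  return headings;
}
```

A regular expression is enough for the flat, well-formed HTML that a converter emits; arbitrary HTML would call for a real parser.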

4. extract_images

Extract and list images from a DOCX file.

Parameters:

  • file_path (string): Path to the .docx file
  • output_dir (string, optional): Directory to save extracted images

Returns:

  • Total image count
  • Image details (src, alt text, base64 status)
  • Output directory information
  • Processing messages
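
To illustrate what the "base64 status" of an image could mean, the sketch below inspects an image src and reports its content type and decoded size when it is a base64 data URL. The names DataUrlInfo and describeDataUrl are assumptions made for this example, not part of the server's API.

```typescript
// Hypothetical description of an embedded image.
interface DataUrlInfo {
  contentType: string; // e.g. "image/png"
  byteLength: number;  // size of the decoded image in bytes
}

// Returns details for a base64 data URL, or null for any other kind of src.
function describeDataUrl(src: string): DataUrlInfo | null {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/]*)(={0,2})$/.exec(src);
  if (match === null) {
    return null;
  }
  const contentType = match[1] ?? "";
  const payloadLength = (match[2] ?? "").length; // base64 characters without padding
  // Every 4 base64 characters encode 3 bytes; a trailing partial group encodes fewer.
  return { contentType, byteLength: Math.floor((payloadLength * 3) / 4) };
}
```

Computing the size from the payload length avoids decoding the image just to report metadata.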

5. convert_to_markdown

Convert DOCX file to Markdown format.

Parameters:

  • file_path (string): Path to the .docx file

Returns:

  • Markdown content
  • Word count
  • Processing messages

Installation

npm install
npm run build

Usage

The server runs over stdio and communicates using the JSON-RPC 2.0 protocol.

Example Usage with MCP Client

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_structure",
    "arguments": {
      "file_path": "/path/to/document.docx"
    }
  }
}
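
A client that talks to the server directly has to build this request and write it to the server's stdin. The sketch below constructs the same message in TypeScript; buildToolCall and the ToolCallRequest type are illustrative names, not part of any SDK.

```typescript
// Shape of an MCP "tools/call" request as shown in the example above.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that fills in the fixed JSON-RPC fields.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The stdio transport is newline-delimited, so each message is serialized
// on a single line and terminated with "\n".
const requestLine =
  JSON.stringify(
    buildToolCall(1, "analyze_structure", { file_path: "/path/to/document.docx" }),
  ) + "\n";
```

In practice an MCP client library usually handles this framing; building the message by hand is mainly useful for debugging.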

Example Usage with Roo

{ "file_path": "/path/to/document.docx" }

Supported Features

  • Text Extraction: Plain text with word counting
  • Rich Formatting: Bold, italic, underline, strikethrough
  • Document Structure: Headings (H1-H6), paragraphs
  • Lists: Ordered and unordered lists with items
  • Images: Extraction as base64 or file export
  • Tables: Basic table structure (via HTML conversion)
  • Links: Hyperlink preservation
  • Styles: Custom style mapping support
  • Error Handling: Comprehensive error reporting
  • Multiple Formats: HTML, Markdown, plain text output

Advanced Features

Custom Style Mapping

The convert_to_html tool supports custom style mapping for better semantic HTML output:

// Example style mappings
"p[style-name='Heading 1'] => h1:fresh"
"r[style-name='Strong'] => strong"
"r[style-name='Emphasis'] => em"
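
The mappings above use mammoth's style map syntax, which mammoth's convertToHtml accepts through a styleMap option. The README does not document how this server exposes that option, so the sketch below only assembles and sanity-checks the options object; isWellFormedMapping is a hypothetical helper, not a mammoth function.

```typescript
// The three mappings from the example above, one rule per entry.
const styleMap: string[] = [
  "p[style-name='Heading 1'] => h1:fresh",
  "r[style-name='Strong'] => strong",
  "r[style-name='Emphasis'] => em",
];

// Hypothetical sanity check (an assumption for this sketch): a rule needs a
// document matcher on the left of "=>" and an HTML path on the right.
function isWellFormedMapping(rule: string): boolean {
  const parts = rule.split("=>");
  if (parts.length !== 2) {
    return false;
  }
  const [matcher = "", path = ""] = parts;
  return matcher.trim() !== "" && path.trim() !== "";
}

// Options object in the shape mammoth.convertToHtml takes as its second argument.
const convertOptions = { styleMap: styleMap.filter(isWellFormedMapping) };
```

In these rules, p matches paragraphs and r matches runs, while :fresh asks for a new element rather than reusing one that is already open.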

Image Handling

  • Base64 Embedding: Images can be embedded as base64 data URLs
  • File Export: Images can be extracted to a specified directory
  • Metadata: Alt text and content type preservation

Document Analysis

Provides comprehensive document analysis including:

  • Character and word counts
  • Paragraph and heading counts
  • Formatting element statistics
  • Document structure hierarchy
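
The statistics listed above can be approximated from converted HTML alone. The sketch below is a minimal version under that assumption; DocumentStats and analyzeHtml are hypothetical names, and the counts assume bold and italic are emitted as strong and em elements.

```typescript
// Hypothetical result shape mirroring the statistics listed above.
interface DocumentStats {
  characters: number;
  words: number;
  paragraphs: number;
  headings: number;
  bold: number;
  italic: number;
  listItems: number;
}

// Derives counts from an HTML string by counting opening tags and
// measuring the text that remains once tags are stripped.
function analyzeHtml(html: string): DocumentStats {
  const count = (pattern: RegExp): number => (html.match(pattern) ?? []).length;
  const text = html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
  return {
    characters: text.length,
    words: text === "" ? 0 : text.split(" ").length,
    paragraphs: count(/<p[\s>]/gi),
    headings: count(/<h[1-6][\s>]/gi),
    bold: count(/<strong[\s>]/gi),
    italic: count(/<em[\s>]/gi),
    listItems: count(/<li[\s>]/gi),
  };
}
```

Because tags are replaced with spaces before counting, a word split by inline formatting is counted as two words; that trade-off keeps the sketch short.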

Development

Install dependencies:

npm install

Build the server:

npm run build

For development with auto-rebuild:

npm run watch

Installation for Claude Desktop

To use with Claude Desktop, add the server config:

On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
On Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "docx-format-server": {
      "command": "/path/to/docx-format-server/build/index.js"
    }
  }
}
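
If the config file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows that merge on plain objects; addServer and the two interfaces are illustrative, and the "node" plus args variant is an assumption for setups where build/index.js is not directly executable.

```typescript
// Minimal shapes for the parts of claude_desktop_config.json used here.
interface ServerEntry {
  command: string;
  args?: string[];
}

interface DesktopConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown; // other settings are preserved untouched
}

// Returns a copy of the config with one server added or replaced.
function addServer(config: DesktopConfig, name: string, entry: ServerEntry): DesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Assumed variant: launch the built script through node explicitly.
const updatedConfig = addServer(
  { mcpServers: { "other-server": { command: "/path/to/other" } } },
  "docx-format-server",
  { command: "node", args: ["/path/to/docx-format-server/build/index.js"] },
);
```

Writing the result back with JSON.stringify(updatedConfig, null, 2) keeps the file readable.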

Dependencies

  • @modelcontextprotocol/sdk: MCP protocol implementation
  • mammoth: Advanced DOCX processing library
  • zod: Schema validation
  • typescript: TypeScript support

Error Handling

All tools include comprehensive error handling with detailed error messages for:

  • File not found errors
  • Invalid file format
  • Processing errors
  • Permission issues
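
The four categories above map naturally onto Node.js error codes plus a check on the file extension. The sketch below is one possible classification, not the server's actual code; classifyError and the ErrorKind labels are assumptions.

```typescript
// Hypothetical labels for the error categories listed above.
type ErrorKind = "file_not_found" | "permission_denied" | "invalid_format" | "processing_error";

interface ClassifiedError {
  kind: ErrorKind;
  message: string;
}

// Maps a thrown value to a category, using Node.js codes such as ENOENT and EACCES.
function classifyError(filePath: string, error: unknown): ClassifiedError {
  const code =
    typeof error === "object" && error !== null && "code" in error
      ? String((error as { code: unknown }).code)
      : "";
  const detail = error instanceof Error ? error.message : String(error);

  if (code === "ENOENT") {
    return { kind: "file_not_found", message: `File not found: ${filePath}` };
  }
  if (code === "EACCES" || code === "EPERM") {
    return { kind: "permission_denied", message: `Permission denied: ${filePath}` };
  }
  if (!filePath.toLowerCase().endsWith(".docx")) {
    return { kind: "invalid_format", message: `Not a .docx file: ${filePath}` };
  }
  return { kind: "processing_error", message: `Failed to process ${filePath}: ${detail}` };
}
```

File system codes are checked before the extension so that a missing file is reported as missing even when its name is also unexpected.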

Debugging

Since MCP servers communicate over stdio, debugging can be challenging. We recommend using the MCP Inspector, which is available as a package script:

npm run inspector

The Inspector will provide a URL to access debugging tools in your browser.

Version History

  • v0.2.0: Complete rewrite with mammoth library, added 5 comprehensive tools
  • v0.1.0: Basic text extraction with docx-parser (deprecated)

License

ISC License
