docsmith-mcp

Python-powered document processing MCP server with a built-in MCP App. Process Excel, Word, PDF, and PowerPoint documents using Python, and view them through an interactive MCP App.

Features

  • Excel: Read/write .xlsx files with sheet support and pagination

  • Word: Read/write .docx files with paragraph and table support

  • PDF: Read .pdf files with text extraction and pagination

  • PowerPoint: Read .pptx files with slide content extraction

  • Text Files: Read/write .txt, .csv, .md, .json, .yaml, .yml with pagination support

  • Run Python: Execute Python code for flexible file operations and data processing

  • MCP App: Beautiful React + Tailwind CSS app for viewing all document types

  • Flexible Reading Modes: Raw full read or paginated for large files

  • Powered by Pyodide: Runs in a secure WebAssembly sandbox via code-runner-mcp

Quick Start

MCP Configuration

Add to your MCP client configuration (e.g., Claude Desktop or Cline):

Via npx (recommended):

{
  "mcpServers": {
    "docsmith": {
      "command": "npx",
      "args": ["-y", "docsmith-mcp"],
      "env": {
        "DOC_PAGE_SIZE": "100"
      }
    }
  }
}

Via global installation:

npm install -g docsmith-mcp

Then point the client at the installed binary:

{
  "mcpServers": {
    "docsmith": {
      "command": "docsmith-mcp",
      "env": {
        "DOC_PAGE_SIZE": "100"
      }
    }
  }
}

Via local path:

{
  "mcpServers": {
    "docsmith": {
      "command": "node",
      "args": ["/path/to/docsmith-mcp/dist/index.js"]
    }
  }
}

Then use the read_document tool:

{
  "file_path": "/path/to/document.xlsx",
  "mode": "paginated",
  "page": 1,
  "page_size": 50
}

The MCP App will automatically open to display the document content beautifully.

Supported Formats

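| Format     | Extensions  | Read | Write | Notes                           |
|------------|-------------|------|-------|---------------------------------|
| Excel      | .xlsx       | Yes  | Yes   | Multi-sheet support, pagination |
| Word       | .docx       | Yes  | Yes   | Paragraphs and tables           |
| PDF        | .pdf        | Yes  | No    | Text extraction with pagination |
| PowerPoint | .pptx       | Yes  | No    | Slide content extraction        |
| CSV        | .csv        | Yes  | Yes   | -                               |
| Text       | .txt, .md   | Yes  | Yes   | Pagination support              |
| JSON       | .json       | Yes  | Yes   | -                               |
| YAML       | .yaml, .yml | Yes  | Yes   | -                               |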

Tools

read_document

Read document content with automatic format detection.

Parameters:

  • file_path (string, required): Path to the document

  • mode (string, optional): "paginated" or "raw" (default: "paginated")

  • page (number, optional): Page number for paginated mode (default: 1)

  • page_size (number, optional): Items per page (default: 100)

  • sheet_name (string, optional): Sheet name for Excel files

Example:

{
  "file_path": "/path/to/document.xlsx",
  "mode": "paginated",
  "page": 1,
  "page_size": 50,
  "sheet_name": "Sheet1"
}
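
The README does not show how paginated mode selects items, so the following is only a minimal sketch of the usual arithmetic, assuming pages are 1-based (consistent with the default page of 1). The function name paginate and the returned field names are hypothetical and are not the server's actual response format.

```python
from math import ceil


def paginate(items, page=1, page_size=100):
    """Return one 1-based page of items plus simple paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive integers")
    # Page 1 starts at offset 0, page 2 at offset page_size, and so on.
    start = (page - 1) * page_size
    chunk = items[start:start + page_size]
    return {
        "page": page,
        "page_size": page_size,
        "total_items": len(items),
        "total_pages": ceil(len(items) / page_size),
        "items": chunk,  # empty list when the page is past the end
    }


rows = [f"row {i}" for i in range(1, 121)]
second = paginate(rows, page=2, page_size=50)
print(second["items"][0], second["items"][-1], second["total_pages"])
# prints: row 51 row 100 3
```

Under this assumption, requesting a page past the end yields an empty item list rather than an error, which makes it easy for a client to loop until nothing comes back.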

write_document

Write document content.

Parameters:

  • file_path (string, required): Output path

  • format (string, required): "excel", "word", "csv", "txt", "json", "yaml"

  • data (array/object, required): Document content

Example:

{
  "file_path": "/path/to/output.xlsx",
  "format": "excel",
  "data": [
    ["Product", "Q1", "Q2"],
    ["Laptop", 100, 150],
    ["Mouse", 500, 600]
  ]
}
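
Data often starts out as a list of records rather than rows. As a sketch, assuming the "excel" format accepts a header row followed by value rows as in the example above, a small helper (records_to_rows is a hypothetical name, not part of docsmith-mcp) could build the data argument:

```python
import json


def records_to_rows(records):
    """Convert a list of dicts into a header row followed by value rows."""
    if not records:
        return []
    # Column order follows the keys of the first record.
    headers = list(records[0].keys())
    rows = [headers]
    for record in records:
        # Missing keys become None so every row has the same length.
        rows.append([record.get(header) for header in headers])
    return rows


records = [
    {"Product": "Laptop", "Q1": 100, "Q2": 150},
    {"Product": "Mouse", "Q1": 500, "Q2": 600},
]

arguments = {
    "file_path": "/path/to/output.xlsx",
    "format": "excel",
    "data": records_to_rows(records),
}
print(json.dumps(arguments, indent=2))
```

The printed object matches the shape of the write_document example, so it can be passed as the tool arguments unchanged.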

get_document_info

Get document metadata without reading full content.

Parameters:

  • file_path (string, required): Path to the document

Example:

{
  "file_path": "/path/to/document.pdf"
}

run_python

Execute Python code for flexible file operations, data processing, and custom tasks. Supports any file format and Python libraries.

Parameters:

  • code (string, required): Python code to execute

  • packages (object, optional): Package mappings (import_name -> pypi_name) for required dependencies

  • file_paths (array, optional): File paths that the code needs to access

Examples:

Read and process any file:

{
  "code": "import json\nwith open('/path/to/file.json') as f:\n    data = json.load(f)\n    result = len(data)\n    print(json.dumps({'count': result}))",
  "file_paths": ["/path/to/file.json"]
}

Batch rename files with regex:

{
  "code": "import json, os, re\nfolder = '/path/to/files'\nfor name in os.listdir(folder):\n    new_name = re.sub(r'old_', 'new_', name)\n    os.rename(os.path.join(folder, name), os.path.join(folder, new_name))\nprint(json.dumps({'success': True}))",
  "file_paths": ["/path/to/files"]
}

Process data with pandas:

{
  "code": "import json\nimport pandas as pd\ndf = pd.read_csv('/path/to/data.csv')\nsummary = df.describe().to_dict()\nprint(json.dumps(summary))",
  "packages": {"pandas": "pandas"},
  "file_paths": ["/path/to/data.csv"]
}

Extract archive files:

{
  "code": "import json, os, zipfile\nwith zipfile.ZipFile('/path/to/archive.zip', 'r') as z:\n    z.extractall('/path/to/output')\nfiles = os.listdir('/path/to/output')\nprint(json.dumps({'extracted_files': files}))",
  "file_paths": ["/path/to/archive.zip", "/path/to/output"]
}
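
Hand-escaping newlines and quotes in the code string is error-prone. One option, shown here only as a sketch, is to write the Python source as a normal multi-line string and let json.dumps do the escaping. The helper build_run_python_args is hypothetical; it only assembles the three documented parameters (code, packages, file_paths).

```python
import json
import textwrap


def build_run_python_args(code, file_paths=None, packages=None):
    """Assemble run_python arguments from readable multi-line source."""
    # dedent + strip lets the source be indented inside a triple-quoted string.
    args = {"code": textwrap.dedent(code).strip()}
    if packages:
        args["packages"] = packages
    if file_paths:
        args["file_paths"] = file_paths
    return args


source = """
    import json
    import pandas as pd
    df = pd.read_csv('/path/to/data.csv')
    print(json.dumps(df.describe().to_dict()))
"""

args = build_run_python_args(
    source,
    file_paths=["/path/to/data.csv"],
    packages={"pandas": "pandas"},
)
# json.dumps escapes the newlines and quotes in the code string.
print(json.dumps(args, indent=2))
```

Keeping the source readable also makes missing imports easier to spot before the request is sent.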

MCP App

The built-in MCP App provides a beautiful, interactive interface for viewing documents:

  • Excel: Interactive tables with sticky headers

  • PDF: Page-by-page text viewing

  • Word: Paragraph and table rendering

  • PowerPoint: Slide navigation

Built with React 19, Tailwind CSS v4, and Lucide icons.

Configuration

Environment variables for customizing behavior:

| Variable          | Description               | Default |
|-------------------|---------------------------|---------|
| DOC_RAW_FULL_READ | Enable full raw read mode | false   |
| DOC_PAGE_SIZE     | Default items per page    | 100     |
| DOC_MAX_FILE_SIZE | Max file size in MB       | 50      |
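
To illustrate how these variables and their defaults fit together, here is a sketch of reading them from the environment. It is not the server's code: the function load_settings, the returned key names, and the set of strings treated as true are all assumptions made for the example.

```python
import os


def load_settings(env=None):
    """Read the three documented variables, falling back to their defaults."""
    env = os.environ if env is None else env
    raw_flag = env.get("DOC_RAW_FULL_READ", "false").strip().lower()
    return {
        # Assumption: "1", "true" and "yes" all enable raw full read.
        "raw_full_read": raw_flag in ("1", "true", "yes"),
        "page_size": int(env.get("DOC_PAGE_SIZE", "100")),
        "max_file_size_mb": int(env.get("DOC_MAX_FILE_SIZE", "50")),
    }


# Passing a dict instead of os.environ keeps the example reproducible.
print(load_settings({"DOC_PAGE_SIZE": "50"}))
# prints: {'raw_full_read': False, 'page_size': 50, 'max_file_size_mb': 50}
```

Environment values are always strings, which is why the MCP configuration examples set "DOC_PAGE_SIZE" to "100" in quotes.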

Contributing

See CONTRIBUTING.md for development setup and contribution guidelines.

License

MIT
