docsmith-mcp

A Python-powered document processing MCP server with a built-in MCP App. Process Excel, Word, PDF, and PowerPoint documents using Python, and view them through an interactive MCP App.

Features

  • Excel: Read/write .xlsx files with sheet support and pagination

  • Word: Read/write .docx files with paragraph and table support

  • PDF: Read .pdf files with text extraction and pagination

  • PowerPoint: Read .pptx files with slide content extraction

  • Text Files: Read/write .txt, .csv, .md, .json, .yaml, .yml with pagination support

  • Run Python: Execute Python code for flexible file operations and data processing

  • MCP App: Beautiful React + Tailwind CSS app for viewing all document types

  • Flexible Reading Modes: Raw full read or paginated for large files

  • Powered by Pyodide: Runs in a secure WebAssembly sandbox via code-runner-mcp

Quick Start

MCP Configuration

Add the server to your MCP client configuration (for example, Claude Desktop or Cline):

Via npx (recommended):

{ "mcpServers": { "docsmith": { "command": "npx", "args": ["-y", "docsmith-mcp"], "env": { "DOC_PAGE_SIZE": "100" } } } }

Via global installation:

npm install -g docsmith-mcp

{ "mcpServers": { "docsmith": { "command": "docsmith-mcp", "env": { "DOC_PAGE_SIZE": "100" } } } }

Via local path:

{ "mcpServers": { "docsmith": { "command": "node", "args": ["/path/to/docsmith-mcp/dist/index.js"] } } }

Then use the read_document tool:

{ "file_path": "/path/to/document.xlsx", "mode": "paginated", "page": 1, "page_size": 50 }

The MCP App will automatically open to display the document content beautifully.

Supported Formats

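| Format | Extensions | Read | Write | Notes |
|---|---|---|---|---|
| Excel | .xlsx | Yes | Yes | Multi-sheet support, pagination |
| Word | .docx | Yes | Yes | Paragraphs and tables |
| PDF | .pdf | Yes | No | Text extraction with pagination |
| PowerPoint | .pptx | Yes | No | Slide content extraction |
| CSV | .csv | Yes | Yes | - |
| Text | .txt, .md | Yes | Yes | Pagination support |
| JSON | .json | Yes | Yes | - |
| YAML | .yaml, .yml | Yes | Yes | - |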
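The supported formats can be summarized as a lookup keyed by file extension. The sketch below is illustrative only: `SUPPORTED` and `detect_format` are hypothetical names, not part of docsmith-mcp, and the writable flags are inferred from the Features list and the `format` values that write_document accepts (treating .md as "txt" is an assumption).

```python
from pathlib import Path

# Extension -> (format name, writable) lookup. Hypothetical helper, not the
# server's actual detection code; flags are inferred from the README.
SUPPORTED = {
    ".xlsx": ("excel", True),
    ".docx": ("word", True),
    ".pdf": ("pdf", False),
    ".pptx": ("powerpoint", False),
    ".csv": ("csv", True),
    ".txt": ("txt", True),
    ".md": ("txt", True),  # assumption: Markdown is handled as plain text
    ".json": ("json", True),
    ".yaml": ("yaml", True),
    ".yml": ("yaml", True),
}


def detect_format(file_path: str) -> tuple[str, bool]:
    """Return (format, writable) for a path, based only on its extension."""
    ext = Path(file_path).suffix.lower()
    if ext not in SUPPORTED:
        raise ValueError(f"Unsupported extension: {ext or '(none)'}")
    return SUPPORTED[ext]


print(detect_format("/path/to/document.xlsx"))
```

A check like this can be useful on the client side to avoid sending a write_document request for a read-only format such as PDF.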

Tools

read_document

Read document content with automatic format detection.

Parameters:

  • file_path (string, required): Path to the document

  • mode (string, optional): "paginated" or "raw" (default: "paginated")

  • page (number, optional): Page number for paginated mode (default: 1)

  • page_size (number, optional): Items per page (default: 100)

  • sheet_name (string, optional): Sheet name for Excel files

Example:

{ "file_path": "/path/to/document.xlsx", "mode": "paginated", "page": 1, "page_size": 50, "sheet_name": "Sheet1" }
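To make the `page` and `page_size` parameters concrete, here is a minimal sketch of 1-based pagination over a list of rows. It assumes that page 1 is the first page (consistent with the default of 1); the `paginate` helper and the shape of its return value are hypothetical and may differ from what the server actually returns.

```python
import math


def paginate(items, page=1, page_size=100):
    """Return one 1-based page of items plus basic paging metadata."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return {
        "items": items[start:start + page_size],
        "page": page,
        "page_size": page_size,
        "total_pages": math.ceil(len(items) / page_size),
    }


rows = [[i, f"row-{i}"] for i in range(1, 121)]  # 120 sample rows
second = paginate(rows, page=2, page_size=50)
print(len(second["items"]), second["total_pages"])  # 50 3
```

Requesting a page past the end yields an empty `items` list in this sketch rather than an error.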

write_document

Write document content.

Parameters:

  • file_path (string, required): Output path

  • format (string, required): "excel", "word", "csv", "txt", "json", "yaml"

  • data (array/object, required): Document content

Example:

{ "file_path": "/path/to/output.xlsx", "format": "excel", "data": [ ["Product", "Q1", "Q2"], ["Laptop", 100, 150], ["Mouse", 500, 600] ] }
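The Excel example above passes `data` as an array of rows with the header first. If your data starts out as a list of records, a small conversion step like the following produces that layout. `records_to_rows` is a hypothetical helper, and whether write_document also accepts other layouts for `data` is not covered here.

```python
import json


def records_to_rows(records):
    """Convert a list of dicts into [header, *rows], using first-seen key order."""
    header = []
    for record in records:
        for key in record:
            if key not in header:
                header.append(key)
    # Missing keys become None, which serializes to JSON null.
    return [header] + [[record.get(key) for key in header] for record in records]


records = [
    {"Product": "Laptop", "Q1": 100, "Q2": 150},
    {"Product": "Mouse", "Q1": 500, "Q2": 600},
]
request = {
    "file_path": "/path/to/output.xlsx",
    "format": "excel",
    "data": records_to_rows(records),
}
print(json.dumps(request, indent=2))
```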

get_document_info

Get document metadata without reading full content.

Parameters:

  • file_path (string, required): Path to the document

Example:

{ "file_path": "/path/to/document.pdf" }

run_python

Execute Python code for flexible file operations, data processing, and custom tasks. It can handle file formats beyond those listed above and can load additional Python libraries through the packages parameter.

Parameters:

  • code (string, required): Python code to execute

  • packages (object, optional): Package mappings (import_name -> pypi_name) for required dependencies

  • file_paths (array, optional): File paths that the code needs to access

Examples:

Read and process any file:

{ "code": "import json\nwith open('/path/to/file.json') as f:\n    data = json.load(f)\n    result = len(data)\n    print(json.dumps({'count': result}))", "file_paths": ["/path/to/file.json"] }

Batch rename files with regex:

{ "code": "import os, re, json\nfolder = '/path/to/files'\nfor name in os.listdir(folder):\n    new_name = re.sub(r'old_', 'new_', name)\n    os.rename(os.path.join(folder, name), os.path.join(folder, new_name))\nprint(json.dumps({'success': True}))", "file_paths": ["/path/to/files"] }

Process data with pandas:

{ "code": "import json\nimport pandas as pd\ndf = pd.read_csv('/path/to/data.csv')\nsummary = df.describe().to_dict()\nprint(json.dumps(summary))", "packages": {"pandas": "pandas"}, "file_paths": ["/path/to/data.csv"] }

Extract archive files:

{ "code": "import json, os, zipfile\nwith zipfile.ZipFile('/path/to/archive.zip', 'r') as z:\n    z.extractall('/path/to/output')\nfiles = os.listdir('/path/to/output')\nprint(json.dumps({'extracted_files': files}))", "file_paths": ["/path/to/archive.zip", "/path/to/output"] }
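Writing the `code` string by hand means escaping every newline and quote. One way to avoid that is to keep the script as a normal multi-line string and let `json.dumps` do the escaping. The sketch below is an assumption about client-side convenience, not part of docsmith-mcp: `build_run_python_request` is a hypothetical helper. It also shows a `packages` entry where the import name (`yaml`) differs from the PyPI name (`pyyaml`).

```python
import json
import textwrap


def build_run_python_request(source, file_paths=None, packages=None):
    """Assemble run_python arguments; json.dumps handles newline and quote escaping."""
    request = {"code": textwrap.dedent(source).strip()}
    if packages:
        request["packages"] = packages
    if file_paths:
        request["file_paths"] = file_paths
    return request


source = """
    import json
    import yaml

    with open('/path/to/config.yaml') as f:
        data = yaml.safe_load(f)
    print(json.dumps({'keys': sorted(data)}))
"""
request = build_run_python_request(
    source,
    file_paths=["/path/to/config.yaml"],
    packages={"yaml": "pyyaml"},  # import name -> PyPI name
)
print(json.dumps(request))
```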

MCP App

The built-in MCP App provides a beautiful, interactive interface for viewing documents:

  • Excel: Interactive tables with sticky headers

  • PDF: Page-by-page text viewing

  • Word: Paragraph and table rendering

  • PowerPoint: Slide navigation

Built with React 19, Tailwind CSS v4, and Lucide icons.

Configuration

Environment variables for customizing behavior:

| Variable | Description | Default |
|---|---|---|
| DOC_RAW_FULL_READ | Enable full raw read mode | false |
| DOC_PAGE_SIZE | Default items per page | 100 |
| DOC_MAX_FILE_SIZE | Max file size in MB | 50 |
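As a rough illustration of how these three variables and their defaults fit together, the sketch below reads them from an environment mapping. It is not the server's implementation: `load_settings` and `within_size_limit` are hypothetical, and the parsing rules (only "true" enables raw reads, 1 MB = 1024 * 1024 bytes) are assumptions.

```python
import os


def load_settings(env=None):
    """Read the three documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Assumption: only the string "true" (any case) enables raw full reads.
        "raw_full_read": env.get("DOC_RAW_FULL_READ", "false").strip().lower() == "true",
        "page_size": int(env.get("DOC_PAGE_SIZE", "100")),
        "max_file_size_mb": int(env.get("DOC_MAX_FILE_SIZE", "50")),
    }


def within_size_limit(size_bytes, settings):
    """Check a file size against the limit, assuming 1 MB = 1024 * 1024 bytes."""
    return size_bytes <= settings["max_file_size_mb"] * 1024 * 1024


settings = load_settings({"DOC_PAGE_SIZE": "50"})
print(settings)
```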

Contributing

See CONTRIBUTING.md for development setup and contribution guidelines.

License

MIT
