Supports advanced data analysis and manipulation by allowing the use of the pandas library within the server's Python execution environment.
Enables the execution of Python code in a secure WebAssembly sandbox for custom file operations, data processing, and complex automation tasks.
Allows for the reading and writing of YAML and YML files, supporting configuration management and data serialization.
docsmith-mcp
Python-powered document processing MCP with MCP Apps — Process Excel, Word, PDF, PowerPoint documents with ease using Python, and view them beautifully through an interactive MCP App.
Features
Excel: Read/write .xlsx files with sheet support and pagination
Word: Read/write .docx files with paragraph and table support
PDF: Read .pdf files with text extraction and pagination
PowerPoint: Read .pptx files with slide content extraction
Text Files: Read/write .txt, .csv, .md, .json, .yaml, .yml with pagination support
Run Python: Execute Python code for flexible file operations and data processing
MCP App: Beautiful React + Tailwind CSS app for viewing all document types
Flexible Reading Modes: Raw full read or paginated for large files
Powered by Pyodide: Runs in secure WebAssembly sandbox via code-runner-mcp
Quick Start
MCP Configuration
Add to your MCP client configuration (e.g., Claude Desktop, Cline, etc.):
Via npx (recommended):
Via global installation:
Via local path:
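The three options differ only in the command the client launches. As a minimal sketch, the Python below prints a client configuration entry for the npx variant. The `mcpServers` layout follows the convention used by Claude Desktop; the server key `docsmith` and the `npx -y docsmith-mcp` invocation are assumptions based on the package name in the title, not confirmed settings.

```python
import json

# Assumption: the package is published on npm as "docsmith-mcp"
# (the name in this README's title) and can be launched through npx.
config = {
    "mcpServers": {
        "docsmith": {  # hypothetical server key chosen for illustration
            "command": "npx",
            "args": ["-y", "docsmith-mcp"],
        }
    }
}

# Serialize the entry so it can be pasted into the client's config file
config_json = json.dumps(config, indent=2)
print(config_json)
```

For a global installation or a local path, only `command` and `args` would change.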
Then use the read_document tool:
The MCP App will automatically open to display the document content beautifully.
Supported Formats
| Format | Extensions | Read | Write | Notes |
| --- | --- | --- | --- | --- |
| Excel | .xlsx | ✅ | ✅ | Multi-sheet support, pagination |
| Word | .docx | ✅ | ✅ | Paragraphs and tables |
| PDF | .pdf | ✅ | ❌ | Text extraction with pagination |
| PowerPoint | .pptx | ✅ | ❌ | Slide content extraction |
| CSV | .csv | ✅ | ✅ | - |
| Text | .txt, .md | ✅ | ✅ | Pagination support |
| JSON | .json | ✅ | ✅ | - |
| YAML | .yaml, .yml | ✅ | ✅ | - |
Tools
read_document
Read document content with automatic format detection.
Parameters:
file_path (string, required): Path to the document
mode (string, optional): "paginated" or "raw" (default: "paginated")
page (number, optional): Page number for paginated mode (default: 1)
page_size (number, optional): Items per page (default: 100)
sheet_name (string, optional): Sheet name for Excel files
Example:
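A minimal sketch of a read_document call, built in Python as the JSON-RPC `tools/call` request an MCP client would send. The helper `build_tool_call`, the file path, and the sheet name are hypothetical; the argument names come from the parameter list above.

```python
import json

def build_tool_call(name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Request the second page of 50 rows from one sheet of a workbook
request = build_tool_call(
    "read_document",
    {
        "file_path": "/data/report.xlsx",  # hypothetical path
        "mode": "paginated",
        "page": 2,
        "page_size": 50,
        "sheet_name": "Sheet1",  # hypothetical sheet name
    },
)
print(json.dumps(request, indent=2))
```

Omitting `mode`, `page`, and `page_size` falls back to the defaults listed above (paginated, page 1, 100 items).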
write_document
Write document content.
Parameters:
file_path (string, required): Output path
format (string, required): "excel", "word", "csv", "txt", "json", "yaml"
data (array/object, required): Document content
Example:
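A minimal sketch of write_document arguments for a CSV file, expressed as a Python dict and serialized to JSON. The output path is hypothetical, and the list-of-rows shape for `data` is an assumption: the parameter list only says that `data` is an array or object.

```python
import json

# Assumption: tabular data is passed as a list of rows, header row first.
arguments = {
    "file_path": "/data/scores.csv",  # hypothetical output path
    "format": "csv",
    "data": [
        ["name", "score"],
        ["Ada", 95],
        ["Linus", 88],
    ],
}

payload = json.dumps(arguments)
print(payload)
```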
get_document_info
Get document metadata without reading full content.
Parameters:
file_path (string, required): Path to the document
Example:
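A minimal sketch of get_document_info arguments as a Python dict. The path is hypothetical, and the fields of the returned metadata are not specified here, so the sketch covers only the request side.

```python
import json

# The tool takes a single required argument: the document path.
arguments = {"file_path": "/data/monthly_report.pdf"}  # hypothetical path

payload = json.dumps(arguments)
print(payload)
```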
run_python
Execute Python code for flexible file operations, data processing, and custom tasks. The code can work with any file format it knows how to parse, and can import additional Python libraries declared through packages.
Parameters:
code (string, required): Python code to execute
packages (object, optional): Package mappings (import_name -> pypi_name) for required dependencies
file_paths (array, optional): File paths that the code needs to access
Examples:
Read and process any file:
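A minimal sketch of code that could be passed as the `code` argument, assuming the target file is also listed in `file_paths`. The helper `summarize_text_file` and the sample file are hypothetical; the demo writes a temporary file so the snippet runs on its own.

```python
import tempfile
from collections import Counter
from pathlib import Path

def summarize_text_file(path):
    """Return line count, word count, and the three most common words."""
    text = Path(path).read_text(encoding="utf-8")
    words = text.lower().split()
    return {
        "lines": len(text.splitlines()),
        "words": len(words),
        "top_words": Counter(words).most_common(3),
    }

# Demo on a throwaway file; in practice, pass a real path instead
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "notes.txt"
    sample.write_text("alpha beta\nbeta gamma\nbeta\n", encoding="utf-8")
    summary = summarize_text_file(sample)
    print(summary)
```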
Batch rename files with regex:
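A minimal sketch using only the standard library. The helper `batch_rename`, the file names, and the pattern are hypothetical. The sketch skips a rename when the target name already exists, which is one possible policy rather than documented behavior of this server.

```python
import re
import tempfile
from pathlib import Path

def batch_rename(directory, pattern, replacement):
    """Rename files whose names match `pattern`; return (old, new) name pairs."""
    regex = re.compile(pattern)
    renamed = []
    # sorted() snapshots the listing, so renames do not disturb iteration
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        new_name = regex.sub(replacement, path.name)
        if new_name == path.name:
            continue  # pattern did not match this file
        target = path.with_name(new_name)
        if target.exists():
            continue  # skip rather than overwrite an existing file
        path.rename(target)
        renamed.append((path.name, new_name))
    return renamed

with tempfile.TemporaryDirectory() as tmp:
    for name in ("report_2023.txt", "report_2024.txt", "notes.md"):
        (Path(tmp) / name).touch()
    # Move the year to the front: report_2023.txt -> 2023_report.txt
    changes = batch_rename(tmp, r"^report_(\d{4})\.txt$", r"\1_report.txt")
    remaining = sorted(p.name for p in Path(tmp).iterdir())
    print(changes)
```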
Process data with pandas:
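A minimal sketch of a pandas aggregation. The sample data and column names are invented for illustration. When sent through run_python, pandas would presumably need to be declared in `packages` (for example `{"pandas": "pandas"}`), which is an assumption based on the parameter description.

```python
import io

import pandas as pd

# Inline sample data so the snippet runs without any external file
csv_text = """region,product,units,unit_price
north,widget,10,2.50
south,widget,4,2.50
north,gadget,3,10.00
south,gadget,5,10.00
"""

df = pd.read_csv(io.StringIO(csv_text))
df["revenue"] = df["units"] * df["unit_price"]

# Total revenue per region, largest first
revenue_by_region = (
    df.groupby("region")["revenue"].sum().sort_values(ascending=False)
)
print(revenue_by_region)
```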
Extract archive files:
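A minimal sketch for ZIP archives using the standard `zipfile` module. The helper `extract_zip` and the archive contents are hypothetical; the demo builds a small archive first so that the snippet is self-contained. Only extract archives from trusted sources.

```python
import tempfile
import zipfile
from pathlib import Path

def extract_zip(archive_path, dest_dir):
    """Extract a ZIP archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)
        return zf.namelist()

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway archive with two members
    archive = Path(tmp) / "bundle.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("docs/readme.txt", "hello")
        zf.writestr("data.csv", "a,b\n1,2\n")

    members = extract_zip(archive, Path(tmp) / "out")
    extracted_text = (Path(tmp) / "out" / "docs" / "readme.txt").read_text()
    print(members)
```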
MCP App
The built-in MCP App provides a beautiful, interactive interface for viewing documents:
Excel: Interactive tables with sticky headers
PDF: Page-by-page text viewing
Word: Paragraph and table rendering
PowerPoint: Slide navigation
Built with React 19, Tailwind CSS v4, and Lucide icons.
Configuration
Environment variables for customizing behavior:
| Variable | Description | Default |
| --- | --- | --- |
|  | Enable full raw read mode |  |
|  | Default items per page |  |
|  | Max file size in MB |  |
Contributing
See CONTRIBUTING.md for development setup and contribution guidelines.
License
MIT