Mistral OCR MCP Server
A Model Context Protocol (MCP) server that provides tools for extracting text and images from PDF and image files using the Mistral OCR API.
Features
Simple Text Extraction: Extract markdown content from documents without handling images
Full Extraction with Images: Extract markdown and save embedded images to disk with proper relative links
Security Sandbox: Restricts file writes to a configured allowed directory
Zero-Install Deployment: Run with uvx without prior installation
Supported Formats: PDF (.pdf), PNG (.png), JPEG (.jpg, .jpeg), WebP (.webp), GIF (.gif)
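As a rough illustration of the supported-format list, a client could pre-check a file's extension before calling the server. This is a minimal sketch, not the server's own code: the `is_supported` helper and the `SUPPORTED_EXTENSIONS` name are hypothetical, and case-insensitive matching is an assumption.

```python
from pathlib import Path

# Extensions taken from the "Supported Formats" list in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".webp", ".gif"}


def is_supported(file_path: str) -> bool:
    """Return True if the file extension is one of the listed formats.

    Case-insensitive matching is an assumption of this sketch.
    """
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A check like this only avoids an obviously doomed request; the server remains the authority on what it accepts.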
Client Configuration
Claude Desktop
Add this to your claude_desktop_config.json:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "mistral-ocr": {
      "command": "uvx",
      "args": ["mistral-ocr-mcp"],
      "env": {
        "MISTRAL_API_KEY": "your-api-key-here",
        "MISTRAL_OCR_ALLOWED_DIR": "/absolute/path/to/allowed/directory"
      }
    }
  }
}
OpenCode
Add this to the mcp section of your configuration file:
{
  "mcp": {
    "mistral-ocr": {
      "type": "local",
      "command": ["uvx", "mistral-ocr-mcp"],
      "enabled": true,
      "environment": {
        "MISTRAL_API_KEY": "your-api-key-here",
        "MISTRAL_OCR_ALLOWED_DIR": "/absolute/path/to/allowed/directory"
      }
    }
  }
}
Codex
If you use the Codex CLI, you can add the server with:
codex mcp add mistral-ocr -- uvx mistral-ocr-mcp
Make sure the environment variables MISTRAL_API_KEY and MISTRAL_OCR_ALLOWED_DIR are set in your shell environment.
Configuration
Required Environment Variables
Variable | Description | Example |
MISTRAL_API_KEY | Your Mistral API key (never logged) | |
MISTRAL_OCR_ALLOWED_DIR | Absolute path to allowed write directory | |
Security Sandbox
The server enforces a write directory sandbox to prevent unauthorized file writes:
extract_markdown: No write restrictions (read-only operation)
extract_markdown_with_images: The output_dir parameter must be within MISTRAL_OCR_ALLOWED_DIR
Validation Examples:
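output_dir | Result |
Path inside MISTRAL_OCR_ALLOWED_DIR | ✅ Allowed |
MISTRAL_OCR_ALLOWED_DIR itself | ✅ Allowed (exact match) |
Path outside MISTRAL_OCR_ALLOWED_DIR | ❌ Rejected |
Path that resolves outside MISTRAL_OCR_ALLOWED_DIR after canonicalization | ❌ Rejected (resolves outside) |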
Security Notes:
All paths are canonicalized (symlinks resolved, .. eliminated) before validation
Image filenames are sanitized to prevent path traversal attacks
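The two security notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in path_sandbox.py or images.py: the function names `is_within_allowed_dir` and `sanitize_image_name` are hypothetical, and the specific sanitization rule (replace anything outside a small character allowlist) is an assumption.

```python
import re
from pathlib import Path


def is_within_allowed_dir(output_dir: str, allowed_dir: str) -> bool:
    """Check that output_dir canonicalizes to allowed_dir or a descendant."""
    # resolve() follows symlinks and collapses ".." segments.
    candidate = Path(output_dir).resolve()
    allowed = Path(allowed_dir).resolve()
    return candidate == allowed or allowed in candidate.parents


def sanitize_image_name(name: str) -> str:
    """Reduce an image ID to a single safe filename component.

    Assumed rule: keep letters, digits, ".", "_" and "-"; replace the rest
    (including path separators) with "_"; drop leading dots.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", name).lstrip(".")
```

Comparing canonicalized paths, rather than raw strings, is what makes inputs such as `allowed/../elsewhere` fail the check.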
Tool Reference
Tool 1: extract_markdown
Extract markdown content from a document without saving images.
Arguments:
{
  "file_path": "/absolute/path/to/document.pdf"
}
Parameter | Type | Required | Description |
file_path | string | Yes | Absolute path to input file (PDF or image) |
Returns:
"# Document Title\n\nExtracted markdown content from all pages..."
Returns a single string containing concatenated markdown from all pages.
Example:
{
  "tool": "extract_markdown",
  "arguments": {
    "file_path": "/Users/username/documents/report.pdf"
  }
}
Tool 2: extract_markdown_with_images
Extract markdown content and save embedded images to disk.
Arguments:
{
  "file_path": "/absolute/path/to/document.pdf",
  "output_dir": "/absolute/path/to/output/parent"
}
Parameter | Type | Required | Description |
file_path | string | Yes | Absolute path to input file (PDF or image) |
output_dir | string | Yes | Absolute path to output parent directory (must exist and be writable, must be within MISTRAL_OCR_ALLOWED_DIR) |
Returns:
{
  "output_directory": "/absolute/path/to/output/parent/document",
  "markdown_file": "/absolute/path/to/output/parent/document/content.md",
  "images": ["img_abc123.png", "img_def456.jpeg"]
}
Field | Type | Description |
output_directory | string | Absolute path to created subdirectory |
markdown_file | string | Absolute path to content.md |
images | array of strings | List of saved image filenames (not full paths) |
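Because images holds bare filenames rather than full paths, a caller has to join each one with output_directory. A minimal sketch of that step, assuming the tool result arrives as a JSON string shaped like the example above (the `image_paths` helper is hypothetical):

```python
import json
from pathlib import Path


def image_paths(result_text: str) -> list[Path]:
    """Turn the returned image filenames into full paths.

    Assumes result_text is JSON with "output_directory" and "images" keys,
    as in the documented return value.
    """
    result = json.loads(result_text)
    out_dir = Path(result["output_directory"])
    return [out_dir / name for name in result["images"]]
```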
Behavior:
Creates a subdirectory named after the input file stem (e.g., report for report.pdf)
If the subdirectory already exists, appends a timestamp: report_20260102_143022
Saves all extracted images as <sanitized_id>.<ext> (e.g., img_abc123.png)
Saves markdown to content.md with relative image links
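The subdirectory naming rule above could look roughly like this. It is a sketch, not the server's implementation: the `choose_output_subdir` name and the injectable `now` parameter are hypothetical, and the `%Y%m%d_%H%M%S` format is inferred from the report_20260102_143022 example.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def choose_output_subdir(
    file_path: str, output_dir: str, now: Optional[datetime] = None
) -> Path:
    """Pick the subdirectory for one extraction run (does not create it)."""
    stem = Path(file_path).stem  # "report" for "report.pdf"
    target = Path(output_dir) / stem
    if target.exists():
        # Name already taken: append a timestamp instead of overwriting.
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        target = Path(output_dir) / f"{stem}_{stamp}"
    return target
```

Appending a timestamp instead of reusing the directory means an earlier extraction of the same file is never overwritten.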
Example:
{
  "tool": "extract_markdown_with_images",
  "arguments": {
    "file_path": "/Users/username/documents/quarterly-report.pdf",
    "output_dir": "/Users/username/workdir/extracted"
  }
}
Output Structure:
/Users/username/workdir/extracted/
  quarterly-report/
    content.md        # Markdown with relative image links
    img_abc123.png    # First extracted image
    img_def456.jpeg   # Second extracted image
Example Client Usage
Here's a minimal Python example using the MCP SDK to call the tools:
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def extract_document():
    server_params = StdioServerParameters(
        command="mistral-ocr-mcp",
        env={
            "MISTRAL_API_KEY": "your-api-key",
            "MISTRAL_OCR_ALLOWED_DIR": "/Users/username/workdir"
        }
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Simple extraction
            result = await session.call_tool(
                "extract_markdown",
                arguments={"file_path": "/path/to/document.pdf"}
            )
            print(result.content[0].text)

            # Extraction with images
            result = await session.call_tool(
                "extract_markdown_with_images",
                arguments={
                    "file_path": "/path/to/document.pdf",
                    "output_dir": "/Users/username/workdir/output"
                }
            )
            print(result.content[0].text)


asyncio.run(extract_document())
Troubleshooting
Error | Cause | Solution |
| MISTRAL_API_KEY is not set | Set the environment variable before running the server |
| MISTRAL_OCR_ALLOWED_DIR is not set | Set the environment variable to an absolute path |
| Relative path provided | Use an absolute path |
| Directory does not exist on filesystem | Create the directory first |
| Path points to a file, not a directory | Ensure the path is a directory |
| Relative path provided for input file | Use an absolute path |
| Input file does not exist | Check the file path and ensure the file exists |
| File extension not supported | Use a supported format (.pdf, .png, .jpg, .jpeg, .webp, .gif) |
| Output directory does not exist | Create the directory first |
| Path points to a file, not a directory | Ensure the path is a directory |
| Output directory exists but is not writable | Check directory permissions |
| output_dir is outside MISTRAL_OCR_ALLOWED_DIR | Use a path within the allowed directory |
| Invalid API key | Check your MISTRAL_API_KEY |
| Rate limit exceeded | Wait and retry, or check your API quota |
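Several of the configuration causes listed above can be caught before the server is launched. The following pre-flight check is a hypothetical helper, not part of the package: the `check_environment` name and the message wording are assumptions of this sketch.

```python
import os
from pathlib import Path


def check_environment(env=None) -> list[str]:
    """Return a list of configuration problems (empty if none are found)."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("MISTRAL_API_KEY"):
        problems.append("MISTRAL_API_KEY is not set")

    allowed = env.get("MISTRAL_OCR_ALLOWED_DIR")
    if not allowed:
        problems.append("MISTRAL_OCR_ALLOWED_DIR is not set")
    else:
        path = Path(allowed)
        if not path.is_absolute():
            problems.append("MISTRAL_OCR_ALLOWED_DIR must be an absolute path")
        elif not path.exists():
            problems.append("MISTRAL_OCR_ALLOWED_DIR does not exist")
        elif not path.is_dir():
            problems.append("MISTRAL_OCR_ALLOWED_DIR is not a directory")

    return problems
```

Calling `check_environment()` with no arguments inspects the current process environment; passing a dict makes the check easy to exercise in tests.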
Development
Setup
Clone the repository and install with development dependencies:
git clone https://github.com/ORDIS-Co-Ltd/mistral-ocr-mcp
cd mistral-ocr-mcp
pip install -e '.[dev]'
Run the server locally:
MISTRAL_API_KEY="your-key" \
MISTRAL_OCR_ALLOWED_DIR="/path/to/allowed/dir" \
python -m mistral_ocr_mcp
Run Tests
pytest
Project Structure
mistral-ocr-mcp/
├── src/
│   └── mistral_ocr_mcp/
│       ├── __init__.py
│       ├── __main__.py          # Entry point
│       ├── server.py            # MCP server and tool definitions
│       ├── config.py            # Configuration loading and validation
│       ├── extraction.py        # OCR orchestration logic
│       ├── mistral_client.py    # Mistral API client
│       ├── images.py            # Image parsing and saving
│       ├── markdown_rewrite.py  # Markdown link rewriting
│       └── path_sandbox.py      # Path validation and sandbox enforcement
├── tests/                       # Unit tests
├── pyproject.toml               # Package configuration
└── README.md                    # This file
License
MIT
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
Links
GitHub Repository: https://github.com/ORDIS-Co-Ltd/mistral-ocr-mcp
MCP Specification: https://modelcontextprotocol.io
Mistral AI: https://mistral.ai