Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Presidio Redact any names, emails, and phone numbers from this customer feedback."
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
⚠️ SECURITY & PRIVACY WARNING ⚠️
PLEASE READ CAREFULLY BEFORE USE
Using this MCP server to detect PII sends text to the Presidio engine. Presidio itself runs locally, inside the container or Python process, but when the tool is invoked through an LLM agent (Claude, ChatGPT, etc.), the text to be analyzed is also shared with that LLM.
RISKS:
PII Leakage: If you ask an LLM to "check this text for PII" or "anonymize this", the potentially sensitive text goes to the LLM provider first, because the model needs it to construct the tool call.
Context Retention: The PII may be retained in the LLM's chat history, training data, or logs.
Transmitted Context: PII will be part of the prompt context transmitted over the network.
RECOMMENDED USE:
Local LLMs: Use with locally hosted LLMs where data does not leave your infrastructure.
Private/Enterprise Agents: Use in approved enterprise environments with strict data privacy agreements.
Non-LLM Integration: Use the underlying libraries directly in your code without an LLM intermediary if strict privacy is required.
ALTERNATIVE ARCHITECTURES: Consider using Presidio as a filter before the LLM. Tools like LiteLLM can integrate Presidio to sanitize input before it reaches the LLM provider, preventing PII from ever leaving your control. This MCP server is designed for agentic workflows where the LLM decides to check for PII, which inherently carries the risks mentioned above.
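The filter-before-LLM arrangement can be sketched as a wrapper that scrubs a prompt before any provider call is made. This is only an illustration under stated assumptions: the regex-based `scrub` function stands in for a real Presidio pass, and `call_llm` is a hypothetical callable, not an API of LiteLLM or of this server.

```python
import re
from typing import Callable

# Deliberately simple pattern; a stand-in for a real Presidio analysis pass.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub(text: str) -> str:
    """Replace e-mail addresses with a placeholder before text leaves the process."""
    return EMAIL_RE.sub("<EMAIL_ADDRESS>", text)


def guarded(call_llm: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a (hypothetical) LLM client so it only ever receives scrubbed prompts."""
    def wrapper(prompt: str) -> str:
        return call_llm(scrub(prompt))
    return wrapper


# Usage with a fake client that records what it was sent.
seen = []


def fake_llm(prompt: str) -> str:
    seen.append(prompt)
    return "ok"


ask = guarded(fake_llm)
ask("Summarize the complaint from jane.doe@example.com")
print(seen[0])
```

The point of the design is that sanitization happens in your own process, so the provider never receives the raw address regardless of what the model later decides to do.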
MCP Presidio
A Model Context Protocol (MCP) server that provides comprehensive PII (Personally Identifiable Information) detection and anonymization capabilities using Microsoft Presidio. This server enables LLMs to safely handle sensitive data by detecting and anonymizing PII in text and structured data.
Features
Core Capabilities
PII Detection: Identify 25+ types of PII including names, emails, phone numbers, credit cards, SSNs, addresses, and more
Text Anonymization: Multiple anonymization strategies (replace, redact, hash, mask, encrypt)
Structured Data Support: Analyze and anonymize JSON/dictionary data recursively
Batch Processing: Process multiple texts efficiently in batch operations
Custom Recognizers: Add domain-specific PII patterns with regex
Multi-language Support: Detect PII in multiple languages
Validation Tools: Test and validate detection accuracy with metrics
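Recursive handling of structured data amounts to walking nested dictionaries and lists and scrubbing every string found along the way. The sketch below shows that traversal only; `walk` and `scrub_email` are hypothetical names, and the regex is a placeholder for real PII detection rather than the server's implementation.

```python
import re
from typing import Any, Callable

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scrub_email(text: str) -> str:
    """Placeholder scrubber: swaps e-mail addresses for a tag."""
    return EMAIL_RE.sub("<EMAIL_ADDRESS>", text)


def walk(value: Any, scrub: Callable[[str], str]) -> Any:
    """Return a copy of value with scrub applied to every nested string."""
    if isinstance(value, str):
        return scrub(value)
    if isinstance(value, dict):
        return {key: walk(item, scrub) for key, item in value.items()}
    if isinstance(value, list):
        return [walk(item, scrub) for item in value]
    return value  # numbers, booleans, None pass through unchanged


record = {"user": {"email": "a@b.io", "tags": ["vip", "cc: x@y.org"]}, "age": 41}
clean = walk(record, scrub_email)
print(clean)
```

Dictionary keys are left untouched here; whether keys should also be scanned is a policy choice this sketch does not make.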
Available MCP Tools
analyze_text - Detect PII entities in text with confidence scores
anonymize_text - Anonymize PII using various operators
get_supported_entities - List all supported PII entity types
add_custom_recognizer - Add custom PII detection patterns
batch_analyze - Analyze multiple texts for PII
batch_anonymize - Anonymize multiple texts
get_anonymization_operators - List available anonymization methods
analyze_structured_data - Detect PII in JSON/structured data
anonymize_structured_data - Anonymize PII in structured data
validate_detection - Validate detection accuracy with metrics
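Detections in Presidio carry an entity type, a character span, and a confidence score between 0 and 1. The sketch below models that shape with a hypothetical `Detection` dataclass and filters by a threshold. The field names mirror Presidio's `RecognizerResult`, but the exact payload returned by `analyze_text` is an assumption, and the sample detections are hand-written for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical result record; fields mirror Presidio's RecognizerResult."""
    entity_type: str
    start: int    # inclusive character offset
    end: int      # exclusive character offset
    score: float  # confidence in [0.0, 1.0]


def confident(detections: list[Detection], threshold: float = 0.5) -> list[Detection]:
    """Keep detections at or above the threshold, ordered by position."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.start)


text = "Call Ana at 555-0100"
found = [
    Detection("PHONE_NUMBER", 12, 20, 0.75),
    Detection("PERSON", 5, 8, 0.85),
    Detection("DATE_TIME", 12, 20, 0.10),  # weak, spurious match
]
for d in confident(found):
    print(d.entity_type, repr(text[d.start:d.end]), d.score)
```

Raising the threshold trades recall for precision: fewer false positives, but a higher chance that real PII slips through.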
Installation
Choose your preferred installation method:
🐳 Docker - Self-contained, reproducible environment (recommended for production)
🐍 Python - Direct installation with interactive setup
📦 Manual - Full control over the installation process
For detailed Docker deployment instructions, see DOCKER.md.
Prerequisites
For Python Installation:
Python 3.10 or higher
pip or uv package manager
For Docker Installation:
Docker 20.10 or higher
Docker Compose (optional, for easier management)
Docker Installation (Recommended for Production)
Docker provides a self-contained, reproducible environment with all dependencies pre-installed.
Quick Start with Docker
Using Docker Compose
Configuring Claude Desktop with Docker
To use the Docker container with Claude Desktop, update your claude_desktop_config.json:
Or if using a pre-built image from a registry:
Docker Image Details
The Docker image includes:
Python 3.11 slim base
All required dependencies (mcp, presidio-analyzer, presidio-anonymizer, spacy)
Pre-installed English language model (en_core_web_lg)
Security-hardened with non-root user
Multi-stage build for minimal image size (~500MB)
Advanced Docker Usage
Interactive Shell for Debugging:
Custom Language Models: To include additional language models, modify the Dockerfile:
Then rebuild the image:
Volume Mounting for Custom Configurations:
Python Installation (Quick Install)
Use the interactive installation script that handles dependencies and language models:
Unix/Linux/macOS:
Windows:
The script will:
Check Python version compatibility
Install base dependencies (mcp, presidio-analyzer, presidio-anonymizer, spacy)
Prompt for language model installation (English, Spanish, French, German, etc.)
Optionally install development dependencies
Verify the installation
Test basic functionality
Python Installation (Manual)
If you prefer manual installation:
For other languages, download the appropriate spaCy model:
Usage
Running the Server
The server runs using stdio transport, suitable for MCP clients:
Or run directly with Python:
Configuring with Claude Desktop
Add to your Claude Desktop configuration (claude_desktop_config.json):
Or if installed as a script:
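For orientation, Claude Desktop lists MCP servers under an `mcpServers` key, with a `command` and `args` per server. The sketch below merges one entry into an existing configuration without disturbing others. The server name `presidio`, the launch command, and the module name `your_server_module` are placeholders, not this project's actual values.

```python
import json


def add_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = {"command": command, "args": args}
    return {**config, "mcpServers": servers}


existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}

# "presidio" and the launch command are placeholders, not the project's real values.
updated = add_server(existing, "presidio", "python", ["-m", "your_server_module"])
print(json.dumps(updated, indent=2))
```

Building the entry programmatically avoids hand-editing mistakes such as a trailing comma, which would make the JSON file unreadable to the client.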
Example Usage in LLM Conversations
Detecting PII:
Anonymizing Text:
Working with Structured Data:
Supported PII Entity Types
The server supports 25+ PII entity types including:
Personal: PERSON, DATE_TIME
Contact: EMAIL_ADDRESS, PHONE_NUMBER, URL
Financial: CREDIT_CARD, IBAN_CODE, US_BANK_NUMBER, CRYPTO
Government IDs: US_SSN, US_PASSPORT, US_DRIVER_LICENSE, UK_NHS
International IDs: SG_NRIC_FIN, IN_PAN, IN_AADHAAR, AU_ABN, AU_TFN, AU_MEDICARE
Location: LOCATION, IP_ADDRESS
Medical: MEDICAL_LICENSE
Other: Additional country-specific identifiers
Use the get_supported_entities tool to see all available types for your language.
Anonymization Operators
The server supports multiple anonymization strategies:
replace - Replace PII with placeholder text (e.g., <EMAIL_ADDRESS>)
redact - Remove PII entirely from text
hash - Replace with cryptographic hash (SHA-256)
mask - Mask characters (e.g., ***-**-1234)
encrypt - Encrypt PII with AES encryption
keep - Keep PII as-is (for selective anonymization)
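To make the differences between operators concrete, here is a minimal standard-library sketch of what each one does to a single detected value. `apply_operator` is a hypothetical helper, not Presidio's operator API; `encrypt` is left out because it needs a key and a cryptography library, and the masking rule (hide everything except the last four characters, keep separators) is an assumption chosen to match the example above.

```python
import hashlib


def apply_operator(value: str, entity_type: str, operator: str) -> str:
    """Transform one detected PII value according to the named operator."""
    if operator == "replace":
        return f"<{entity_type}>"  # placeholder tag
    if operator == "redact":
        return ""  # remove the value entirely
    if operator == "hash":
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    if operator == "mask":
        # Hide letters and digits except the last four characters; keep separators.
        head, tail = value[:-4], value[-4:]
        return "".join("*" if ch.isalnum() else ch for ch in head) + tail
    if operator == "keep":
        return value
    raise ValueError(f"unsupported operator: {operator}")


ssn = "987-65-1234"
for op in ("replace", "redact", "hash", "mask", "keep"):
    print(f"{op:8} -> {apply_operator(ssn, 'US_SSN', op)!r}")
```

Note that an unsalted hash of a low-entropy value such as an SSN can be reversed by brute force, so `hash` is better suited to consistent pseudonymization than to strong protection.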
Advanced Features
Custom Recognizers
Add domain-specific PII patterns:
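A pattern-based recognizer boils down to an entity name, a regular expression, and a fixed confidence score attached to every match. The sketch below shows that idea with the standard library only. `RegexRecognizer` and the `EMPLOYEE_ID` format are hypothetical; Presidio's own counterparts are `PatternRecognizer` and `Pattern`, and how `add_custom_recognizer` maps onto them is an assumption.

```python
import re
from dataclasses import dataclass


@dataclass
class RegexRecognizer:
    """Hypothetical stand-in for a pattern-based recognizer."""
    entity_type: str
    regex: str
    score: float = 0.6  # fixed confidence assigned to every match

    def find(self, text: str) -> list[dict]:
        """Return one record per regex match, with character offsets."""
        return [
            {
                "entity_type": self.entity_type,
                "start": m.start(),
                "end": m.end(),
                "score": self.score,
            }
            for m in re.finditer(self.regex, text)
        ]


# Invented ID format for illustration: "EMP-" followed by six digits.
employee_id = RegexRecognizer("EMPLOYEE_ID", r"\bEMP-\d{6}\b", score=0.9)
text = "Ticket opened by EMP-004217 yesterday"
hits = employee_id.find(text)
for hit in hits:
    print(hit["entity_type"], text[hit["start"]:hit["end"]], hit["score"])
```

Anchoring the pattern with `\b` keeps it from matching inside longer tokens, which is the most common source of false positives in regex recognizers.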
Batch Processing
Process multiple documents efficiently:
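Much of the efficiency in batch work comes from building the detection engine once and reusing it for every text. The sketch below illustrates that pattern with hypothetical names (`Detector`, `analyze_many`) and a toy phone-number regex; it is not the server's `batch_analyze` implementation.

```python
import re
from typing import Iterable


class Detector:
    """Hypothetical detector; compiling patterns here models one-time engine setup."""

    def __init__(self) -> None:
        self.phone = re.compile(r"\b\d{3}-\d{4}\b")

    def analyze(self, text: str) -> list[tuple[int, int]]:
        """Return (start, end) offsets of every match in text."""
        return [(m.start(), m.end()) for m in self.phone.finditer(text)]


def analyze_many(texts: Iterable[str], detector: Detector) -> list[dict]:
    """Analyze each text with the shared detector, preserving input order."""
    return [
        {"index": i, "spans": detector.analyze(text)}
        for i, text in enumerate(texts)
    ]


detector = Detector()  # built once, reused for every text
results = analyze_many(
    ["Call 555-0100", "No numbers here", "555-0101 or 555-0102"], detector
)
for r in results:
    print(r)
```

Returning the input index with each result lets callers match detections back to documents even if empty results are later filtered out.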
Language Support
Specify different languages:
Validation and Testing
Validate detection accuracy:
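Detection accuracy is usually summarized as precision, recall, and F1 over expected versus detected entities. The sketch below computes them under one stated assumption: a detection counts as correct only if its entity type and span match an expected annotation exactly. `detection_metrics` is a hypothetical helper, and the metrics reported by `validate_detection` may be defined differently.

```python
def detection_metrics(expected: set, detected: set) -> dict:
    """Compute precision, recall, and F1 from exact-match (type, start, end) tuples."""
    true_pos = len(expected & detected)
    false_pos = len(detected - expected)
    false_neg = len(expected - detected)

    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


expected = {("EMAIL_ADDRESS", 0, 15), ("PHONE_NUMBER", 20, 28)}
detected = {("EMAIL_ADDRESS", 0, 15), ("PERSON", 30, 35)}
metrics = detection_metrics(expected, detected)
print(metrics)
```

For anonymization, recall is typically the metric to watch, since a missed entity leaks PII while a false positive only over-redacts.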
Architecture
This MCP server integrates:
MCP FastMCP: Provides the MCP protocol implementation
Presidio Analyzer: Detects PII using NLP and pattern matching
Presidio Anonymizer: Anonymizes detected PII with various operators
spaCy: Powers the NLP engine for accurate entity recognition
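The analyzer-then-anonymizer flow can be sketched in two small functions: one finds spans, the other rewrites the text. This is a simplified standard-library illustration with hypothetical `analyze` and `anonymize` functions and toy regexes, not Presidio's engines, and it assumes detected spans do not overlap.

```python
import re

# Toy patterns standing in for the analyzer's recognizers.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def analyze(text: str) -> list[tuple[str, int, int]]:
    """Analyzer stage: return (entity_type, start, end) spans."""
    spans = []
    for entity_type, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((entity_type, m.start(), m.end()))
    return spans


def anonymize(text: str, spans: list[tuple[str, int, int]]) -> str:
    """Anonymizer stage: substitute placeholders, working right to left
    so that offsets of earlier spans remain valid after each replacement."""
    for entity_type, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        text = text[:start] + f"<{entity_type}>" + text[end:]
    return text


message = "SSN 987-65-1234, reply to sam@example.org"
print(anonymize(message, analyze(message)))
```

Keeping detection and rewriting separate is what allows the same detections to be paired with different operators.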
Security Considerations
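Presidio analysis and anonymization run locally; the server itself sends no data to external services. Text routed through a hosted LLM agent is still shared with that provider, as described in the warning above.
The server uses stdio transport, so it opens no network port and talks only to the MCP client that launched it.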
Multiple anonymization strategies available for different privacy requirements
Can support compliance workflows (GDPR, HIPAA, CCPA) by detecting and anonymizing PII
Docker deployment provides additional isolation and security through containerization
Container runs as non-root user for enhanced security
Development
Running Tests
Project Structure
License
MIT License - see LICENSE file for details
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
Acknowledgments
Microsoft Presidio - The underlying PII detection engine
Model Context Protocol - The protocol specification
spaCy - NLP library for entity recognition
Support
For issues, questions, or contributions, please visit the GitHub repository.