# ToGMAL MCP Server
**Taxonomy of Generative Model Apparent Limitations**
A Model Context Protocol (MCP) server that provides real-time, privacy-preserving analysis of LLM interactions to detect out-of-distribution behaviors and recommend safety interventions.
## Overview
ToGMAL helps prevent common LLM pitfalls by detecting:
- 🔬 **Math/Physics Speculation**: Ungrounded "theories of everything" and invented physics
- 🏥 **Medical Advice Issues**: Health recommendations without proper sources or disclaimers
- 💾 **Dangerous File Operations**: Mass deletions, recursive operations without safeguards
- 💻 **Vibe Coding Overreach**: Overly ambitious projects without proper scoping
- 📊 **Unsupported Claims**: Strong assertions without evidence or hedging
## Key Features
- **Privacy-Preserving**: All analysis is deterministic and local (no external API calls)
- **Low Latency**: Heuristic-based detection for real-time analysis
- **Intervention Recommendations**: Suggests step breakdown, human-in-the-loop, or web search
- **Taxonomy Building**: Crowdsourced evidence collection for improving detection
- **Extensible**: Easy to add new detection patterns and categories
## Installation
### Prerequisites
- Python 3.10 or higher
- pip package manager
### Install Dependencies
```bash
pip install mcp pydantic httpx --break-system-packages
```
### Install the Server
```bash
# Clone or download the server
# Then run it directly
python togmal_mcp.py
```
## Usage
### Available Tools
#### 1. `togmal_analyze_prompt`
Analyze a user prompt before the LLM processes it.
**Parameters:**
- `prompt` (str): The user prompt to analyze
- `response_format` (str): Output format - `"markdown"` or `"json"`
**Example:**
```python
{
"prompt": "Build me a complete theory of quantum gravity that unifies all forces",
"response_format": "json"
}
```
**Use Cases:**
- Detect speculative physics theories before generating responses
- Flag overly ambitious coding requests
- Identify requests for medical advice that need disclaimers
#### 2. `togmal_analyze_response`
Analyze an LLM response for potential issues.
**Parameters:**
- `response` (str): The LLM response to analyze
- `context` (str, optional): Original prompt for better analysis
- `response_format` (str): Output format - `"json"` or `"json"`
**Example:**
```python
{
"response": "You should definitely take 500mg of ibuprofen every 4 hours...",
"context": "I have a headache",
"response_format": "json"
}
```
**Use Cases:**
- Check for ungrounded medical advice
- Detect dangerous file operation instructions
- Flag unsupported statistical claims
#### 3. `togmal_submit_evidence`
Submit evidence of LLM limitations to improve the taxonomy.
**Parameters:**
- `category` (str): Type of limitation - `"math_physics_speculation"`, `"ungrounded_medical_advice"`, etc.
- `prompt` (str): The prompt that triggered the issue
- `response` (str): The problematic response
- `description` (str): Why this is problematic
- `severity` (str): Severity level - `"low"`, `"moderate"`, `"high"`, or `"critical"`
**Example:**
```python
{
"category": "ungrounded_medical_advice",
"prompt": "What should I do about chest pain?",
"response": "It's probably nothing serious, just indigestion...",
"description": "Dismissed potentially serious symptom without recommending medical consultation",
"severity": "high"
}
```
**Features:**
- Human-in-the-loop confirmation before submission
- Generates unique entry ID for tracking
- Contributes to improving detection heuristics
#### 4. `togmal_get_taxonomy`
Retrieve entries from the taxonomy database.
**Parameters:**
- `category` (str, optional): Filter by category
- `min_severity` (str, optional): Minimum severity to include
- `limit` (int): Maximum entries to return (1-100, default 20)
- `offset` (int): Pagination offset (default 0)
- `response_format` (str): Output format
**Example:**
```python
{
"category": "dangerous_file_operations",
"min_severity": "high",
"limit": 10,
"offset": 0,
"response_format": "json"
}
```
**Use Cases:**
- Research common LLM failure patterns
- Train improved detection models
- Generate safety guidelines
#### 5. `togmal_get_statistics`
Get statistical overview of the taxonomy database.
**Parameters:**
- `response_format` (str): Output format
**Returns:**
- Total entries by category
- Severity distribution
- Database capacity status
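**Example** (the only parameter is the output format):
```python
{
  "response_format": "json"
}
```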
## Detection Heuristics
### Math/Physics Speculation
**Detects:**
- "Theory of everything" claims
- Unified field theory proposals
- Invented equations or particles
- Modifications to fundamental constants
**Patterns:**
```
- "new equation for quantum gravity"
- "my unified theory"
- "discovered particle"
- "redefine the speed of light"
```
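In practice, pattern lists like this are easiest to apply as case-insensitive regular expressions. The sketch below is illustrative only; the regexes and function name are not taken from the server's source:
```python
import re

# Illustrative patterns; the server's actual pattern set may differ.
MATH_PHYSICS_PATTERNS = [
    r"new equation for .*(quantum gravity|unified field)",
    r"\bmy (own )?unified (field )?theory\b",
    r"\bdiscovered (a )?(new )?particle\b",
    r"redefine .*(speed of light|planck constant)",
]

def looks_like_physics_speculation(text: str) -> bool:
    """Return True if any speculative-physics pattern matches the text."""
    return any(re.search(p, text, re.IGNORECASE) for p in MATH_PHYSICS_PATTERNS)
```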
### Ungrounded Medical Advice
**Detects:**
- Diagnoses without qualifications
- Treatment recommendations without sources
- Specific drug dosages
- Dismissive responses to symptoms
**Patterns:**
```
- "you probably have..."
- "take 500mg of..."
- "don't worry about it"
- Missing citations or disclaimers
```
### Dangerous File Operations
**Detects:**
- Mass deletion commands
- Recursive operations without safeguards
- Operations on test files without confirmation
- No human-in-the-loop for destructive actions
**Patterns:**
```
- "rm -rf" without confirmation
- "delete all test files"
- "recursively remove"
- Missing safety checks
```
### Vibe Coding Overreach
**Detects:**
- Requests for complete applications
- Massive line count targets (1000+ lines)
- Unrealistic timeframes
- Broad scope without proper planning
**Patterns:**
```
- "build a complete social network"
- "5000 lines of code"
- "everything in one shot"
- Missing architectural planning
```
### Unsupported Claims
**Detects:**
- Absolute statements without hedging
- Statistical claims without sources
- Over-confident predictions
- Missing citations
**Patterns:**
```
- "always/never/definitely"
- "95% of doctors agree" (no source)
- "guaranteed to work"
- Missing uncertainty language
```
## Risk Levels
Calculated based on weighted confidence scores:
- **LOW**: Minor issues, no immediate intervention needed
- **MODERATE**: Worth noting, consider additional verification
- **HIGH**: Significant concern, interventions recommended
- **CRITICAL**: Serious risk, multiple interventions strongly advised
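For illustration, a simple threshold scheme along these lines can map an aggregate score to a level. The cutoffs below are placeholder assumptions, not the server's actual thresholds:
```python
def risk_level(weighted_score: float) -> str:
    """Map an aggregate confidence score in [0, 1] to a risk level.

    The thresholds are illustrative; the server computes its own weighted
    score and cutoffs internally.
    """
    if weighted_score >= 0.8:
        return "CRITICAL"
    if weighted_score >= 0.6:
        return "HIGH"
    if weighted_score >= 0.3:
        return "MODERATE"
    return "LOW"
```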
## Intervention Types
### Step Breakdown
Complex tasks should be broken into verifiable components.
**Recommended for:**
- Math/physics speculation
- Large coding projects
- Dangerous file operations
### Human-in-the-Loop
Critical decisions require human oversight.
**Recommended for:**
- Medical advice
- Destructive file operations
- High-severity issues
### Web Search
Claims should be verified against authoritative sources.
**Recommended for:**
- Medical recommendations
- Physics/math theories
- Unsupported factual claims
### Simplified Scope
Overly ambitious projects need realistic scoping.
**Recommended for:**
- Vibe coding requests
- Complex system designs
- Feature-heavy applications
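Taken together, the "Recommended for" lists above amount to a category-to-intervention mapping. The dictionary below simply restates them in code; the snake_case identifiers (including `vibe_coding_overreach`, `unsupported_claims`, and the intervention names) are assumed, not lifted from the server:
```python
# Restates the "Recommended for" lists above; illustrative only.
INTERVENTIONS_BY_CATEGORY = {
    "math_physics_speculation": ["step_breakdown", "web_search"],
    "ungrounded_medical_advice": ["human_in_the_loop", "web_search"],
    "dangerous_file_operations": ["step_breakdown", "human_in_the_loop"],
    "vibe_coding_overreach": ["step_breakdown", "simplified_scope"],
    "unsupported_claims": ["web_search"],
}
```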
## Configuration
### Character Limit
Default: 25,000 characters per response
```python
CHARACTER_LIMIT = 25000
```
### Taxonomy Capacity
Default: 1,000 evidence entries
```python
MAX_EVIDENCE_ENTRIES = 1000
```
### Detection Sensitivity
Adjust pattern matching and confidence thresholds in detection functions:
```python
def detect_math_physics_speculation(text: str) -> Dict[str, Any]:
# Modify patterns or confidence calculations
...
```
## Integration Examples
### Claude Desktop App
Add to your `claude_desktop_config.json`:
```json
{
"mcpServers": {
"togmal": {
"command": "python",
"args": ["/path/to/togmal_mcp.py"]
}
}
}
```
### CLI Testing
```bash
# Run the server
python togmal_mcp.py
# In another terminal, test with MCP inspector
npx @modelcontextprotocol/inspector python togmal_mcp.py
```
### Programmatic Usage
```python
# Minimal sketch using the official `mcp` Python SDK's stdio client;
# exact import paths assume the current python-sdk layout.
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(command="python", args=["togmal_mcp.py"])

async def analyze_prompt(prompt: str):
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            return await session.call_tool(
                "togmal_analyze_prompt",
                {"prompt": prompt, "response_format": "json"}
            )
```
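With the helper above, a quick one-off check from a script might look like this (the prompt text is just an example):
```python
import asyncio

if __name__ == "__main__":
    analysis = asyncio.run(
        analyze_prompt("Build me a complete theory of quantum gravity")
    )
    print(analysis)
```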
## Architecture
### Design Principles
1. **Privacy First**: No external API calls, all processing local
2. **Deterministic**: Heuristic-based detection for reproducibility
3. **Low Latency**: Fast pattern matching for real-time use
4. **Extensible**: Easy to add new patterns and categories
5. **Human-Centered**: Always allows human override and judgment
### Future Enhancements
The system is designed for progressive enhancement:
1. **Phase 1 (Current)**: Heuristic pattern matching
2. **Phase 2 (Planned)**: Traditional ML models (clustering, anomaly detection)
3. **Phase 3 (Future)**: Federated learning from submitted evidence
4. **Phase 4 (Advanced)**: Custom fine-tuned models for specific domains
### Data Flow
```
User Prompt
↓
togmal_analyze_prompt
↓
Detection Heuristics (parallel)
├── Math/Physics
├── Medical Advice
├── File Operations
├── Vibe Coding
└── Unsupported Claims
↓
Risk Calculation
↓
Intervention Recommendations
↓
Response to Client
```
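In code, the same flow amounts to running every detector over the text and folding the results into one summary. A minimal sketch, assuming each detector follows the interface shown under Contributing; all detector names except `detect_math_physics_speculation` are placeholders:
```python
def analyze(text: str) -> dict:
    """Run each detector and summarize what was flagged (illustrative only)."""
    detectors = {
        "math_physics_speculation": detect_math_physics_speculation,
        "ungrounded_medical_advice": detect_medical_advice,      # placeholder name
        "dangerous_file_operations": detect_file_operations,     # placeholder name
        "vibe_coding_overreach": detect_vibe_coding,             # placeholder name
        "unsupported_claims": detect_unsupported_claims,         # placeholder name
    }
    results = {name: fn(text) for name, fn in detectors.items()}
    flagged = {name: r for name, r in results.items() if r["detected"]}
    overall = max((r["confidence"] for r in flagged.values()), default=0.0)
    return {"issues": flagged, "overall_confidence": overall}
```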
## Contributing
### Adding New Detection Patterns
1. Create a new detection function:
```python
import re
from typing import Any, Dict

def detect_new_category(text: str) -> Dict[str, Any]:
    patterns = {
        'subcategory1': [r'pattern1', r'pattern2'],
        'subcategory2': [r'pattern3']
    }
    # Flag each subcategory whose patterns appear in the text
    matched = [
        name for name, regexes in patterns.items()
        if any(re.search(rx, text, re.IGNORECASE) for rx in regexes)
    ]
    return {
        'detected': bool(matched),
        'categories': matched,
        # Simple heuristic: fraction of subcategories that matched
        'confidence': len(matched) / len(patterns)
    }
```
2. Add to CategoryType enum
3. Update analysis functions to include new detector
4. Add intervention recommendations if needed
### Submitting Evidence
Use the `togmal_submit_evidence` tool to contribute examples of problematic LLM behavior. This helps improve detection for everyone.
## Limitations
### Current Constraints
- **Heuristic-Based**: May have false positives/negatives
- **English-Only**: Patterns optimized for English text
- **Context-Free**: Doesn't understand full conversation history
- **No Learning**: Detection rules are static until updated
### Not a Replacement For
- Professional judgment in critical domains (medicine, law, etc.)
- Comprehensive code review
- Security auditing
- Safety testing in production systems
## License
MIT License - See LICENSE file for details
## Support
For issues, questions, or contributions:
- Open an issue on GitHub
- Submit evidence through the MCP tool
- Contact: [Your contact information]
## Citation
If you use ToGMAL in your research or product, please cite:
```bibtex
@software{togmal_mcp,
title={ToGMAL: Taxonomy of Generative Model Apparent Limitations},
author={[Your Name]},
year={2025},
url={https://github.com/[your-repo]/togmal-mcp}
}
```
## Acknowledgments
Built using:
- [Model Context Protocol](https://modelcontextprotocol.io)
- [FastMCP](https://github.com/modelcontextprotocol/python-sdk)
- [Pydantic](https://docs.pydantic.dev)
Inspired by the need for safer, more grounded AI interactions.