# MCP Database Tools
## Overview
This module exposes PostgreSQL database functions as MCP (Model Context Protocol) tools, allowing LLMs to directly interact with your databases through a safe, structured interface.
## โ
**Enhanced Capabilities**
Your MCP server now has **12 powerful database tools** that LLMs can use to:
### ๐ **Schema Discovery Tools**
- **`get_all_table_names`** - List all tables in the database
- **`get_table_schema`** - Get detailed table structure with columns, types, constraints
- **`get_database_schema`** - Get complete database overview with all tables and relationships
- **`discover_table_semantics`** - Understand table relationships and business context
### ๐ **Data Analysis Tools**
- **`get_table_row_count`** - Count records in specific tables
- **`get_table_size`** - Get storage size and disk usage information
- **`get_database_stats`** - Overall database statistics and performance metrics
### ๐ **Query & Search Tools**
- **`execute_safe_sql`** - Execute SELECT queries with safety validation
- **`search_tables_for_concept`** - Search across tables for business concepts
### ๐ ๏ธ **Utility Tools**
- **`get_available_databases`** - List all configured databases (db1, db2, db3)
- **`get_tool_capabilities`** - Get information about all available tools
## ๐๏ธ **Resources Structure**
```
resources/
โโโ ๐ lists/
โ โโโ mcp_database_tools.json # Tool definitions and metadata
โโโ ๐ templates/ # Future: Query templates
โโโ ๐ schemas/ # Future: API schemas
โโโ ๐ configs/ # Future: Configuration templates
```
## ๐ **Usage Examples**
### LLM Conversation Examples
#### "What tables are in the database?"
```
LLM calls: get_all_table_names(database="db3")
Response: {"tables": ["users", "orders", "products", "categories"]}
```
#### "Show me the structure of the users table"
```
LLM calls: get_table_schema(table_name="users", database="db3")
Response: {
"columns": [
{"name": "id", "type": "integer", "nullable": false},
{"name": "email", "type": "varchar", "nullable": false},
{"name": "created_at", "type": "timestamp", "nullable": true}
],
"constraints": [...],
"relationships": [...]
}
```
#### "How many users do we have?"
```
LLM calls: get_table_row_count(table_name="users", database="db3")
Response: {"row_count": 15420}
```
#### "Find tables related to customer data"
```
LLM calls: search_tables_for_concept(concept="customer", database="db3")
Response: {
"relevant_tables": ["users", "customers", "customer_orders"],
"matches": [...]
}
```
#### "Show me recent orders"
```
LLM calls: execute_safe_sql(
sql_query="SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '7 days'",
database="db3",
limit=50
)
Response: {"results": [...]}
```
## ๐ **Safety Features**
### SQL Injection Protection
- All queries are validated and parameterized
- Only SELECT statements allowed for safety
- Automatic query limits to prevent resource exhaustion
### Resource Protection
- Default row limits (100 rows) on query results
- Timeout protection on long-running operations
- Read-only operations to prevent data modification
### Error Handling
- Comprehensive error messages for debugging
- Graceful degradation on connection issues
- Detailed logging for monitoring
## ๐๏ธ **Architecture**
### Service Layer
```
MCPDatabaseToolsService
โโโ PostgreSQLIntegration instances (db1, db2, db3)
โโโ Tool definition loading from resources
โโโ Safety validation and error handling
โโโ Result formatting for LLM consumption
```
### MCP Integration
```
EnhancedMCPServer
โโโ Tool registration with proper schemas
โโโ Request routing to database service
โโโ Response formatting and error handling
โโโ Logging and monitoring
```
## ๐ **Tool Definitions**
Each tool is defined with:
- **Name**: Unique identifier for the tool
- **Description**: What the tool does and when to use it
- **Parameters**: Required and optional parameters with types
- **Returns**: Expected response format and structure
- **Category**: Grouping for related tools
- **Safety Features**: Built-in protections and limitations
## ๐ฏ **Use Cases**
### Data Exploration
LLMs can now autonomously:
- Discover available data sources
- Understand database structure
- Explore relationships between tables
- Find relevant data for analysis
### Business Intelligence
- Generate reports from multiple tables
- Analyze data patterns and trends
- Answer business questions directly from data
- Provide insights without manual SQL writing
### Database Documentation
- Auto-generate schema documentation
- Understand data relationships
- Identify data quality issues
- Map business concepts to database tables
### Development Support
- Validate database design decisions
- Understand data access patterns
- Generate sample queries for testing
- Assist with migration planning
## ๐ง **Configuration**
### Database Configuration
Uses your existing `config.py` database connections:
```python
SQLALCHEMY_BINDS = {
'db1': 'postgresql://admin:password@192.168.230.101/defaultdb',
'db2': 'postgresql://admin:password@192.168.230.102/defaultdb',
'db3': 'postgresql://postgres:postgres@localhost/postgres'
}
```
### Default Database
- **db3** (local PostgreSQL) is the default for all operations
- LLMs can specify different databases using the `database` parameter
- Available databases can be discovered using `get_available_databases`
## ๐งช **Testing**
### Manual Testing
```bash
# Test the enhanced MCP server
python presentation/enhanced_mcp_server.py
# Test individual tools through HTTP bridge
curl -X POST http://localhost:8000/api/capabilities
```
### Automated Testing
```bash
# Run MCP database tools tests
py testing/core/test_mcp_database_tools.py
# Run integration tests
py testing/integration/test_enhanced_mcp_server.py
```
## ๐ **Performance Considerations**
### Caching
- Database schema information is cached per session
- Repeated tool calls for schema info are optimized
- Connection pooling for database efficiency
### Limits
- Default 100-row limit on query results
- 30-second timeout on database operations
- Maximum 10MB response size
### Monitoring
- Tool usage logging for analytics
- Database performance monitoring
- Error rate tracking
## ๐ **Future Enhancements**
### Planned Features
- **Vector search integration** for semantic queries
- **Query templates** for common business questions
- **Data visualization** tool integration
- **Real-time data streaming** capabilities
### Advanced Tools
- **Data profiling** and quality analysis
- **Schema migration** assistance
- **Performance optimization** recommendations
- **Automated report generation**
## ๐ค **Integration with Existing Systems**
### Smart Search Integration
The MCP database tools complement your existing smart search:
- **Schema tools** inform smart search about available data
- **Query tools** provide direct data access when smart search isn't needed
- **Analysis tools** give context for smart search results
### HTTP Bridge Integration
All tools are accessible through your HTTP bridge:
- `/api/capabilities` endpoint lists all available MCP tools
- Standard HTTP endpoints for tool invocation
- Consistent error handling and response formats
---
## ๐ **Summary**
Your LLMs now have **direct, safe access** to your PostgreSQL databases through 12 comprehensive tools covering:
โ
**Schema Discovery** - Understanding what data exists
โ
**Data Analysis** - Getting insights about data volume and quality
โ
**Safe Querying** - Executing read-only SQL with protection
โ
**Semantic Search** - Finding data by business concepts
โ
**Multi-Database Support** - Working across db1, db2, and db3
This transforms your MCP server from a basic interface into a **powerful database exploration and analysis platform** that LLMs can use autonomously and safely.