ARCHITECTURE.md•11.4 kB
# MCP Standards - System Architecture
**Last Updated**: 2025-10-14
---
## Executive Summary
MCP Standards is a self-learning AI standards system that automatically learns from corrections and updates configuration files:
1. **MindsDB** - Federated data access across 200+ sources via single MCP server
2. **Strata** - Progressive tool discovery and intelligent routing
3. **SQLite Knowledge Engine** - Local-first memory with full-text search
This architecture eliminates the complexity of managing dozens of MCP servers while providing enterprise-grade data access, intelligent tool selection, and persistent knowledge storage.
---
## Core Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Claude Code/Desktop │
│ (User Interface Layer) │
└─────────────────────┬───────────────────────────────────────┘
│
┌───────┼───────┐
│ │ │
▼ ▼ ▼
┌────────┐ ┌─────┐ ┌──────────┐
│MindsDB │ │Strata│ │ Local │
│ MCP │ │ MCP │ │Knowledge │
│Server │ │Router│ │ Engine │
└────┬───┘ └──┬──┘ └────┬─────┘
│ │ │
│ │ ▼
│ │ ┌─────────┐
│ │ │SQLite DB│
│ │ │+FTS5 │
│ │ └─────────┘
│ │
│ ▼
│ ┌─────────────┐
│ │Progressive │
│ │Tool │
│ │Discovery │
│ └─────────────┘
│
▼
┌─────────────────┐
│ Federated Data │
│ • 200+ Sources │
│ • Real-time │
│ • SQL Interface │
└─────────────────┘
```
---
## Component Architecture
### 1. MindsDB MCP Server (Data Federation Layer)
**Purpose**: Single unified gateway for all data access needs
**Capabilities**:
- **200+ Data Sources**: Databases, SaaS apps, APIs, file systems
- **Federated Queries**: Cross-source joins and analytics
- **AI-Native Operations**: Vector search, embeddings, ML models
- **Enterprise Security**: Audit trails, governance, performance optimization
**Key Tools**:
```json
{
"list_databases": "Discover available data sources",
"query": "Execute federated queries across all sources",
"create_model": "Deploy ML models for predictions",
"create_knowledge_base": "Setup RAG with vector search"
}
```
**Configuration**:
```json
{
"mindsdb": {
"command": "docker",
"args": ["run", "-p", "47334:47334", "mindsdb/mindsdb"]
}
}
```
### 2. Strata MCP Router (Tool Discovery Layer)
**Purpose**: Progressive tool discovery without context overload
**Discovery Pattern**:
```
Intent → Integration → Category → Action
```
**Benefits**:
- **Scalable**: Handles thousands of tools without token explosion
- **Progressive**: Step-by-step drill-down prevents overwhelm
- **Intelligent**: Context-aware tool recommendation
- **Multi-App**: Seamless workflows across applications
**Configuration**:
```json
{
"strata": {
"command": "npx",
"args": ["-y", "strata-mcp"]
}
}
```
### 3. Local Knowledge Engine (Memory Layer)
**Purpose**: Persistent knowledge storage with full-text search
**Components**:
- **SQLite Database**: Single-file storage with FTS5 search
- **Automatic Logging**: Captures significant tool executions
- **Semantic Extraction**: Converts actions to knowledge episodes
- **Cross-Session Memory**: Persistent context across interactions
**Schema**:
```sql
CREATE TABLE episodes (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
content TEXT NOT NULL,
source TEXT NOT NULL,
timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
metadata JSON
);
CREATE VIRTUAL TABLE episodes_fts USING fts5(
name, content, source, content=episodes
);
```
---
## Data Flow Architecture
### 1. Knowledge Capture Flow
```
User Action → Tool Execution → Semantic Extractor → SQLite Storage
↓ ↓ ↓ ↓
Claude Code → MCP Tool Call → Knowledge Episode → Local Memory
```
### 2. Query Resolution Flow
```
User Query → Local Search → MindsDB Federation → Strata Discovery
↓ ↓ ↓ ↓
Natural → SQLite FTS → Cross-Source Query → Progressive Tools
Language Results Real-time Data Guided Selection
```
### 3. Tool Orchestration Flow
```
Intent → Strata Router → Tool Selection → MindsDB Execution → Memory Update
↓ ↓ ↓ ↓ ↓
"Research" → Categories → Specific Tool → Data Query → Episode Stored
```
---
## Configuration Strategy
### Unified MCP Configuration
**Claude Desktop** (`~/Library/Application Support/Claude/mcp.json`):
```json
{
"mcpServers": {
"mindsdb": {
"command": "docker",
"args": ["run", "-p", "47334:47334", "mindsdb/mindsdb"]
},
"strata": {
"command": "npx",
"args": ["-y", "strata-mcp"]
},
"local-memory": {
"command": "uv",
"args": ["--directory", "./mcp-servers/local-memory", "run", "server.py"]
}
}
}
```
**Claude Desktop Config** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
```json
{
"projects": {
"/path/to/your/project": {
"mcpServers": {
"mindsdb": {
"type": "stdio",
"command": "docker",
"args": ["run", "-p", "47334:47334", "mindsdb/mindsdb"]
},
"strata": {
"type": "stdio",
"command": "npx",
"args": ["-y", "strata-mcp"]
},
"local-memory": {
"type": "stdio",
"command": "uv",
"args": ["--directory", "./mcp-servers/local-memory", "run", "server.py"]
}
}
}
}
}
```
### Scoped Configuration
**Project-Level** (`.mcp.json`):
```json
{
"mcpServers": {
"local-memory": {
"type": "stdio",
"command": "uv",
"args": ["--directory", "./mcp-servers/local-memory", "run", "server.py"],
"env": {
"DB_PATH": "./knowledge.db"
}
}
}
}
```
---
## Security and Governance
### Data Security
1. **Local-First**: SQLite knowledge stored locally, no cloud dependency
2. **Federated Security**: MindsDB handles data source authentication
3. **Permission-Based**: Strata enforces tool access controls
4. **Audit Trails**: All operations logged for compliance
### Enterprise Controls
```json
{
"permissions": {
"allow": [
"mindsdb:query:read",
"strata:discover:*",
"local-memory:*"
],
"deny": [
"mindsdb:model:create"
],
"ask": [
"mindsdb:database:create"
]
}
}
```
---
## Performance Characteristics
### Scalability Metrics
| Component | Scale | Performance | Memory |
|-----------|-------|-------------|---------|
| MindsDB | 200+ sources | Real-time queries | ~500MB |
| Strata | 1000+ tools | Progressive discovery | ~50MB |
| SQLite | 1M+ episodes | Sub-second FTS | ~100MB |
### Optimization Strategies
1. **Intelligent Caching**: MindsDB caches frequent queries
2. **Progressive Loading**: Strata loads tools on-demand
3. **FTS Optimization**: SQLite indexes for fast search
4. **Connection Pooling**: Reuse database connections
---
## Integration Patterns
### Enterprise Integration
```python
# Python SDK Usage
from research_mcp import UnifiedMCP
mcp = UnifiedMCP()
# Query across all systems
result = await mcp.query(
"What changed in our Salesforce pipeline this week?",
sources=["salesforce", "github", "local-memory"]
)
# Progressive tool discovery
tools = await mcp.discover_tools(
intent="automate sales reporting",
categories=["crm", "analytics"]
)
```
### Agent Orchestration
```python
# Multi-agent coordination
from research_mcp.agents import ResearchAgent, ActionAgent
research = ResearchAgent(knowledge_base="local-memory")
action = ActionAgent(tool_router="strata")
# Coordinated workflow
findings = await research.investigate("competitor analysis")
actions = await action.plan_from_research(findings)
```
---
## Migration Strategy
### Phase 1: Foundation (Week 1)
- [ ] Setup MindsDB MCP server
- [ ] Configure basic Strata routing
- [x] Migrate to SQLite with FTS5
- [ ] Test unified configuration
### Phase 2: Integration (Week 2)
- [ ] Connect primary data sources to MindsDB
- [ ] Configure Strata tool categories
- [ ] Implement automatic knowledge capture
- [ ] Setup cross-system queries
### Phase 3: Optimization (Week 3)
- [ ] Performance tuning and caching
- [ ] Advanced security configuration
- [ ] Agent orchestration patterns
- [ ] Enterprise governance setup
### Phase 4: Scale (Week 4)
- [ ] Multi-user configuration
- [ ] Team knowledge sharing
- [ ] Advanced analytics dashboards
- [ ] Production monitoring
---
## Monitoring and Observability
### Health Checks
```bash
# MCP Server Status
curl http://localhost:47334/health # MindsDB
/mcp list # Strata tools
sqlite3 knowledge.db "SELECT COUNT(*) FROM episodes" # Local memory
```
### Metrics Collection
- **Query Performance**: Response times across data sources
- **Tool Usage**: Most frequently discovered/used tools
- **Knowledge Growth**: Episodes created per day/week
- **Error Rates**: Failed queries and tool executions
---
## Future Roadmap
### Short-term (3 months)
- Agent-to-Agent (A2A) protocol integration
- Advanced RAG with vector embeddings
- Real-time collaboration features
- Mobile/web interface
### Medium-term (6 months)
- Multi-tenant enterprise deployment
- Advanced analytics and insights
- Workflow automation engine
- Third-party MCP server registry
### Long-term (12 months)
- Federated knowledge graphs
- AI-driven process optimization
- Cross-organization knowledge sharing
- Regulatory compliance automation
---
## Conclusion
This unified architecture positions Research MCP as a comprehensive AI knowledge copilot that scales from individual use to enterprise deployment. By combining MindsDB's data federation, Strata's progressive tool discovery, and local SQLite knowledge storage, we create a system that is both powerful and practical.
The architecture eliminates the complexity of managing multiple MCP servers while providing enterprise-grade capabilities for data access, tool discovery, and knowledge management. This foundation enables rapid development of sophisticated AI workflows while maintaining simplicity for everyday use.