Skip to main content
Glama

cBioPortal MCP Server

by pickleton89
ARCHITECTURE_DIAGRAM.md11.8 kB
# cBioPortal MCP Server - System Architecture Diagram ## 🏗️ High-Level System Architecture ```mermaid graph TB %% External Interfaces subgraph "AI Assistants & Clients" Claude[Claude Desktop] VSCode[VS Code MCP] CLI[Command Line] CustomAI[Custom AI Tools] end %% MCP Protocol Layer subgraph "MCP Protocol Layer" MCP[Model Context Protocol] FastMCP[FastMCP Framework] end %% Main Server Components subgraph "cBioPortal MCP Server" direction TB Server[CBioPortalMCPServer] Config[Configuration System] APIClient[APIClient] %% Endpoint Modules subgraph "Endpoint Modules (BaseEndpoint Pattern)" Base[BaseEndpoint<br/>• Pagination Logic<br/>• Error Handling<br/>• Validation Decorators] Studies[StudiesEndpoints<br/>• Cancer Studies<br/>• Study Search] Genes[GenesEndpoints<br/>• Gene Operations<br/>• Mutations<br/>• Batch Processing] Samples[SamplesEndpoints<br/>• Sample Data<br/>• Sample Lists] Molecular[MolecularProfilesEndpoints<br/>• Clinical Data<br/>• Gene Panels] end %% Utility Modules subgraph "Utility Modules" Pagination[Pagination Utils<br/>• Async Generators<br/>• Efficient Streaming] Validation[Input Validation<br/>• Parameter Checking<br/>• Type Safety] Logging[Logging System<br/>• Structured Logs<br/>• Debug Support] end end %% External Data Source subgraph "External Data Sources" cBioPortal[cBioPortal API<br/>• Cancer Studies<br/>• Genomic Data<br/>• Clinical Data] end %% Data Flow Connections Claude --> MCP VSCode --> MCP CLI --> MCP CustomAI --> MCP MCP --> FastMCP FastMCP --> Server Server --> Config Server --> APIClient Server --> Studies Server --> Genes Server --> Samples Server --> Molecular Studies --> Base Genes --> Base Samples --> Base Molecular --> Base Base --> Pagination Base --> Validation Base --> Logging APIClient --> cBioPortal %% Styling classDef serverClass fill:#e1f5fe,stroke:#01579b,stroke-width:2px classDef endpointClass fill:#f3e5f5,stroke:#4a148c,stroke-width:2px classDef utilClass fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px classDef externalClass fill:#fff3e0,stroke:#e65100,stroke-width:2px classDef protocolClass fill:#fce4ec,stroke:#880e4f,stroke-width:2px class Server,Config,APIClient serverClass class Studies,Genes,Samples,Molecular,Base endpointClass class Pagination,Validation,Logging utilClass class Claude,VSCode,CLI,CustomAI,cBioPortal externalClass class MCP,FastMCP protocolClass ``` ## 🔄 Data Flow Architecture ```mermaid sequenceDiagram participant AI as AI Assistant participant MCP as MCP Protocol participant Server as CBioPortalMCPServer participant Endpoint as Endpoint Module participant Base as BaseEndpoint participant API as APIClient participant cBio as cBioPortal API AI->>MCP: Request cancer studies MCP->>Server: Tool invocation Server->>Endpoint: Delegate to StudiesEndpoints Endpoint->>Base: Inherit pagination & validation Base->>Base: Validate parameters Base->>Base: Apply pagination logic Base->>API: Make async HTTP request API->>cBio: GET /studies cBio-->>API: JSON response API-->>Base: Processed data Base->>Base: Apply response formatting Base-->>Endpoint: Paginated response Endpoint-->>Server: Structured data Server-->>MCP: MCP response format MCP-->>AI: Tool result ``` ## 🧩 Component Architecture Details ```mermaid graph LR subgraph "Configuration System" CLI_Args[CLI Arguments<br/>Highest Priority] Env_Vars[Environment Variables<br/>CBIOPORTAL_*] YAML_Config[YAML Config Files<br/>config.yaml] Defaults[Default Values<br/>Lowest Priority] CLI_Args --> Config_Merger[Configuration Merger] Env_Vars --> Config_Merger YAML_Config --> Config_Merger Defaults --> Config_Merger Config_Merger --> Final_Config[Final Configuration] end subgraph "BaseEndpoint Pattern (60% Duplication Reduction)" BaseEndpoint[BaseEndpoint Class] Decorators[Validation Decorators<br/>@validate_paginated_params<br/>@handle_api_errors] Pagination_Logic[Shared Pagination Logic<br/>paginated_request()] Error_Handling[Centralized Error Handling] BaseEndpoint --> Decorators BaseEndpoint --> Pagination_Logic BaseEndpoint --> Error_Handling end subgraph "Async Performance Layer" AsyncClient[httpx.AsyncClient<br/>480s timeout] Concurrent[Concurrent Operations<br/>asyncio.gather()] Batching[Smart Batching<br/>Configurable sizes] Streaming[Async Generators<br/>Memory efficient] AsyncClient --> Concurrent Concurrent --> Batching Batching --> Streaming end ``` ## 🛡️ Error Handling & Validation Flow ```mermaid graph TD Request[Incoming Request] --> Validation{Parameter Validation} Validation -->|Invalid| ValidationError[Validation Error<br/>TypeError/ValueError] Validation -->|Valid| APICall[API Request] APICall --> APIResponse{API Response} APIResponse -->|HTTP Error| HTTPError[HTTP Error Handler<br/>4xx/5xx responses] APIResponse -->|Network Error| NetworkError[Network Error Handler<br/>Timeouts/Connection] APIResponse -->|Success| DataProcessing[Data Processing] DataProcessing --> Pagination[Pagination Logic] Pagination --> Response[Formatted Response] ValidationError --> ErrorResponse[Error Response] HTTPError --> ErrorResponse NetworkError --> ErrorResponse Response --> Client[Return to Client] ErrorResponse --> Client ``` ## 🚀 Performance Optimization Architecture ```mermaid graph TB subgraph "Performance Layer" direction TB subgraph "Async Concurrency" Sequential[Sequential Operations<br/>1.31s for 10 studies] Concurrent[Concurrent Operations<br/>0.29s for 10 studies<br/>4.57x improvement] Sequential -.->|Optimized to| Concurrent end subgraph "Smart Batching" LargeRequest[Large Gene List Request] BatchSplitter[Batch Splitter<br/>Configurable size: 100] ConcurrentBatches[Concurrent Batch Processing<br/>asyncio.gather()] BatchMerger[Result Merger] LargeRequest --> BatchSplitter BatchSplitter --> ConcurrentBatches ConcurrentBatches --> BatchMerger end subgraph "Memory Efficiency" AsyncGenerator[Async Generators<br/>yield results] StreamingPagination[Streaming Pagination<br/>Low memory footprint] LazyLoading[Lazy Loading<br/>On-demand processing] AsyncGenerator --> StreamingPagination StreamingPagination --> LazyLoading end end ``` ## 🧪 Testing Architecture ```mermaid graph LR subgraph "Test Suite (93 Tests)" direction TB Lifecycle[Lifecycle Tests<br/>• Server startup/shutdown<br/>• Tool registration<br/>• Resource management] Pagination[Pagination Tests<br/>• Edge cases<br/>• Async generators<br/>• Memory efficiency] MultiEntity[Multi-Entity Tests<br/>• Concurrent operations<br/>• Batch processing<br/>• Error handling] Validation[Input Validation Tests<br/>• Parameter checking<br/>• Type safety<br/>• Error messages] Snapshot[Snapshot Tests<br/>• API response consistency<br/>• Data integrity<br/>• Regression prevention] CLI[CLI Tests<br/>• Argument parsing<br/>• Configuration<br/>• Error handling] ErrorHandling[Error Handling Tests<br/>• Network failures<br/>• API errors<br/>• Graceful degradation] Configuration[Configuration Tests<br/>• Multi-layer config<br/>• Environment variables<br/>• Validation] end subgraph "Quality Infrastructure" Ruff[Ruff Linting<br/>• Code quality<br/>• Formatting<br/>• Import sorting] TypeChecking[Type Checking<br/>• Static analysis<br/>• Type safety<br/>• IDE support] Coverage[Code Coverage<br/>• pytest-cov<br/>• Comprehensive metrics<br/>• Quality gates] end ``` ## 🔧 Configuration Architecture ```mermaid graph TD subgraph "Configuration Sources (Priority Order)" CLI[1. CLI Arguments<br/>--base-url, --config, etc.] ENV[2. Environment Variables<br/>CBIOPORTAL_BASE_URL<br/>CBIOPORTAL_GENE_BATCH_SIZE] YAML[3. YAML Configuration<br/>config.yaml files] DEFAULT[4. Default Values<br/>Built-in defaults] end subgraph "Configuration Processing" Loader[Configuration Loader] Validator[Configuration Validator<br/>• Type checking<br/>• Range validation<br/>• Required fields] Merger[Configuration Merger<br/>Deep merge with priority] end subgraph "Configuration Categories" Server[Server Settings<br/>• base_url<br/>• transport<br/>• timeout] Logging[Logging Settings<br/>• level<br/>• format<br/>• file output] API[API Settings<br/>• rate limiting<br/>• retry logic<br/>• batch sizes] Performance[Performance Settings<br/>• concurrent limits<br/>• timeout values<br/>• cache settings] end CLI --> Loader ENV --> Loader YAML --> Loader DEFAULT --> Loader Loader --> Validator Validator --> Merger Merger --> Server Merger --> Logging Merger --> API Merger --> Performance ``` ## 📊 Deployment Architecture ```mermaid graph TB subgraph "Development Environment" Dev[Local Development<br/>uv sync<br/>pytest<br/>ruff check] end subgraph "AI Assistant Integration" Claude[Claude Desktop<br/>MCP Configuration] VSCode[VS Code Extension<br/>MCP Integration] Custom[Custom Applications<br/>MCP SDK] end subgraph "cBioPortal MCP Server" Server[Production Server<br/>Configuration-driven<br/>Multi-transport support] end subgraph "External Services" Public[Public cBioPortal<br/>www.cbioportal.org] Private[Private Instances<br/>Custom deployments] end Dev --> Server Claude --> Server VSCode --> Server Custom --> Server Server --> Public Server --> Private ``` --- ## 🎯 Key Architectural Principles 1. **🏗️ BaseEndpoint Pattern**: Inheritance-based architecture eliminating 60% code duplication 2. **⚡ Async-First Design**: Full asynchronous implementation for maximum performance 3. **🔧 Modular Architecture**: Clear separation of concerns with domain-specific modules 4. **🛡️ Robust Error Handling**: Comprehensive validation and error management 5. **🚀 Performance Optimized**: Smart batching, concurrent operations, and memory efficiency 6. **🧪 Test-Driven Quality**: 93 comprehensive tests ensuring reliability 7. **⚙️ Configuration-Driven**: Multi-layer configuration for flexible deployment 8. **📊 Production-Ready**: Enterprise-grade architecture with monitoring and observability This architecture demonstrates a mature, production-ready bioinformatics platform built through innovative human-AI collaboration.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pickleton89/cbioportal-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server