🧬 cBioPortal MCP Server
A high-performance, production-ready Model Context Protocol (MCP) server that enables AI assistants to seamlessly interact with cancer genomics data from cBioPortal. Built with modern async Python architecture, enterprise-grade modular design, and BaseEndpoint pattern for maximum reliability, maintainability, and 4.5x faster performance.
🌟 Overview & Key Features
🚀 Performance & Architecture
- ⚡ 4.5x Performance Boost: Full async implementation with concurrent API operations
- 🏗️ Enterprise Architecture: BaseEndpoint pattern with 60% code duplication elimination
- 📐 Modular Design: Professional structure with 71% code reduction (1,357 → 396 lines)
- 📦 Modern Package Management: uv-based workflow with pyproject.toml
- 🔄 Concurrent Operations: Bulk fetching of studies and genes with automatic batching
🔧 Enterprise Features
- ⚙️ Multi-layer Configuration: CLI args → Environment variables → YAML config → Defaults
- 📋 Comprehensive Testing: 93 tests across 8 organized test suites with full coverage
- 🛡️ Input Validation: Robust parameter validation and error handling
- 📊 Pagination Support: Efficient data retrieval with automatic pagination
- 🔧 Code Quality: Ruff linting, formatting, and comprehensive code quality checks
- ⚡ Configurable Performance: Adjustable batch sizes and performance tuning
🧬 Cancer Genomics Capabilities
- 🔍 Study Management: Browse, search, and analyze cancer studies
- 🧪 Molecular Data: Access mutations, clinical data, and molecular profiles
- 📈 Bulk Operations: Concurrent fetching of multiple entities
- 🔎 Advanced Search: Keyword-based discovery across studies and genes
🎆 Recent Quality & Architecture Improvements
🚀 Major Refactoring Achievements (2025)
- 🏗️ BaseEndpoint Architecture: Eliminated ~60% code duplication through inheritance-based design
- 📝 Code Quality Excellence: Comprehensive external review integration with modern linting (Ruff)
- ⚙️ Enhanced Configurability: Gene batch sizes, retry logic, and performance tuning now configurable
- 🛡️ Robust Validation: Decorator-based parameter validation and error handling
- 🧪 Testing Maturity: 93 comprehensive tests with zero regressions through major refactoring
📈 Production-Ready Status
- ✅ External Code Review: Professional code quality validation and improvements implemented
- 🔧 Modern Python Practices: Type checking, linting, formatting, and best practice adherence
- 🏗️ Enterprise Architecture: Modular design with clear separation of concerns
- 🚀 Performance Optimized: 4.5x async improvements with configurable batch processing
🧠🤖 AI-Collaborative Development
This project demonstrates cutting-edge human-AI collaboration in bioinformatics software development:
- 🧠 Domain Expertise: 20+ years cancer research experience guided architecture and feature requirements
- 🤖 AI Implementation: Advanced code generation, API design, and performance optimization through systematic LLM collaboration
- 🔄 Quality Assurance: Iterative refinement ensuring professional standards and production reliability
- 🏗️ Architectural Evolution: BaseEndpoint pattern and 60% code duplication elimination through AI-guided refactoring
- 📈 Innovation Approach: Showcases how domain experts can effectively leverage AI tools to build enterprise-grade bioinformatics platforms
Recent Achievements: External code review integration with comprehensive quality improvements including Ruff configuration, configurable performance settings, and modern Python best practices.
Methodology: This collaborative approach combines deep biological domain knowledge with AI-powered development capabilities, accelerating innovation while maintaining rigorous code quality and scientific accuracy.
🚀 Quick Start
Prerequisites
- Python 3.10+ 🐍
- uv (modern package manager) - recommended 📦
- Git (optional, for cloning)
⚡ Installation & Launch
That's it! 🎉 Your server is running and ready for AI assistant connections.
📦 Installation Options
🔥 Option 1: uv (Recommended)
Modern, lightning-fast package management with automatic environment handling:
🐍 Option 2: pip (Traditional)
Standard Python package management approach:
⚙️ Configuration
🎛️ Multi-Layer Configuration System
The server supports flexible configuration with priority: CLI args > Environment variables > Config file > Defaults
YAML Configuration 📄
Create config.yaml
for persistent settings:
Environment Variables 🌍
CLI Options 💻
🔌 Usage & Integration
🖥️ Claude Desktop Integration
Configure in your Claude Desktop MCP settings:
Option 1: Direct Script Path (Recommended)
Option 2: uv run (Alternative)
Important Setup Steps:
- Replace
/path/to/your/project/cbioportal_MCP
with your actual project path - Ensure the project is installed in editable mode:
uv pip install -e .
- Restart Claude Desktop after updating the configuration
🔧 VS Code Integration
Add to your workspace settings:
🏃♂️ Command Line Usage
🏗️ Architecture
📁 Modern Project Structure
🎯 Design Principles
- 🔧 Modular: Clear separation of concerns with domain-specific modules
- ⚡ Async-First: Full asynchronous implementation for maximum performance
- 🏗️ BaseEndpoint Pattern: Inheritance-based architecture eliminating 60% code duplication
- 🛡️ Robust: Comprehensive input validation and error handling with decorators
- 🧪 Testable: 93 tests ensuring reliability and preventing regressions
- 🔄 Maintainable: Clean code architecture with 71% reduction in complexity
- 📝 Code Quality: Ruff linting, formatting, and modern Python practices
🛠️ Available Tools
The server provides 12 high-performance tools for AI assistants:
🔧 Tool | 📝 Description | ⚡ Features |
---|---|---|
get_cancer_studies | List all available cancer studies | 📄 Pagination, 🔍 Filtering |
search_studies | Search studies by keyword | 🔎 Full-text search, 📊 Sorting |
get_study_details | Detailed study information | 📈 Comprehensive metadata |
get_samples_in_study | Samples for specific studies | 📄 Paginated results |
get_genes | Gene information by ID/symbol | 🏷️ Flexible identifiers |
search_genes | Search genes by keyword | 🔍 Symbol & name search |
get_mutations_in_gene | Gene mutations in studies | 🧬 Mutation details |
get_clinical_data | Patient clinical information | 👥 Patient-centric data |
get_molecular_profiles | Study molecular profiles | 📊 Profile metadata |
get_multiple_studies | 🚀 Concurrent study fetching | ⚡ Bulk operations |
get_multiple_genes | 🚀 Concurrent gene retrieval | 📦 Automatic batching |
get_gene_panels_for_study | Gene panels in studies | 🧬 Panel information |
🌟 Performance Features
- ⚡ Concurrent Operations:
get_multiple_*
methods useasyncio.gather
for parallel processing - 📦 Smart Batching: Automatic batching for large gene lists
- 📄 Efficient Pagination: Async generators for memory-efficient data streaming
- ⏱️ Performance Metrics: Execution timing and batch count reporting
🚀 Performance
📊 Benchmark Results
Our async implementation delivers significant performance improvements:
🔥 Async Benefits
- 🚀 4.5x Faster: Concurrent API requests vs sequential operations
- 📦 Bulk Processing: Efficient batched operations for multiple entities
- ⏱️ Non-blocking: Asynchronous I/O prevents request blocking
- 🧮 Smart Batching: Automatic optimization for large datasets
💡 Performance Tips
- Use
get_multiple_studies
for fetching multiple studies concurrently - Leverage
get_multiple_genes
with automatic batching for gene lists - Configure
concurrent_batch_size
in config for optimal performance - Monitor execution metrics included in response metadata
👨💻 Development
🔨 Development Workflow
🧪 Testing
Comprehensive test suite with 93 tests across 8 categories:
- 🔄
test_server_lifecycle.py
- Server startup/shutdown & tool registration - 📄
test_pagination.py
- Pagination logic & edge cases - 🚀
test_multiple_entity_apis.py
- Concurrent operations & bulk fetching - ✅
test_input_validation.py
- Parameter validation & error handling - 📸
test_snapshot_responses.py
- API response consistency (syrupy) - 💻
test_cli.py
- Command-line interface & argument parsing - 🛡️
test_error_handling.py
- Error scenarios & network issues - ⚙️
test_configuration.py
- Configuration system validation
🛠️ Development Tools & Quality Infrastructure
- 📦 uv: Modern package management (10-100x faster than pip)
- 🧪 pytest: Testing framework with async support and 93 comprehensive tests
- 📸 syrupy: Snapshot testing for API response consistency
- 🔍 Ruff: Lightning-fast linting, formatting, and code quality enforcement
- 📊 pytest-cov: Code coverage reporting and quality metrics
- 🏗️ BaseEndpoint: Inheritance pattern eliminating 60% code duplication
- ⚙️ Type Checking: Comprehensive type annotations for better code safety
- 🛡️ Validation Decorators: Automatic parameter validation and error handling
🤝 Contributing
- 🍴 Fork the repository
- 🌿 Create a feature branch (
git checkout -b feature/amazing-feature
) - ✅ Test your changes (
uv run pytest
) - 📝 Commit with clear messages (
git commit -m 'Add amazing feature'
) - 🚀 Push to branch (
git push origin feature/amazing-feature
) - 🔄 Create a Pull Request
🔧 Troubleshooting
🚨 Common Issues
Server Fails to Start
Claude Desktop Connection Issues
- ✅ Use direct script path (Option 1) for most reliable connection
- ✅ Verify paths in MCP configuration are absolute (no
~
or relative paths) - ✅ Install in editable mode: Run
uv pip install -e .
in project directory - ✅ Ensure the virtual environment
.venv/bin/cbioportal-mcp
script exists - ✅ For Option 2: Check that
uv
is in your system PATH andcwd
points to project directory - ✅ Review Claude Desktop logs for detailed errors
Performance Issues
- 🔧 Increase
concurrent_batch_size
in config - 🔧 Adjust
max_concurrent_requests
for your system - 🔧 Use
get_multiple_*
methods for bulk operations - 🔧 Monitor network latency to cBioPortal API
Configuration Problems
🌐 API Connectivity
💡 Examples & Use Cases
🔍 Research Queries
🧬 Genomic Analysis
📊 Bulk Operations
📜 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- 🧬 cBioPortal - Open-access cancer genomics data platform
- 🔗 Model Context Protocol - Enabling seamless AI-tool interactions
- ⚡ FastMCP - High-performance MCP server framework
- 📦 uv - Modern Python package management
- 🤖 AI Collaboration - Demonstrating the power of human-AI partnership in scientific software development
🌟 Production-ready bioinformatics platform built through innovative human-AI collaboration! 🧬✨
Demonstrating the power of domain expertise + AI-assisted development for enterprise-grade scientific software.
remote-capable server
The server can be hosted and run remotely because it primarily relies on remote services or has no dependency on the local environment.
Tools
cBioPortal의 암 유전체 데이터와 AI 보조자가 상호 작용할 수 있도록 하는 서버로, 사용자는 암 연구를 탐색하고, 유전체 데이터에 접근하고, 돌연변이와 임상 정보를 검색할 수 있습니다.
Related MCP Servers
- AsecurityAlicenseAqualityProvides comprehensive access to Roam Research's API functionality. This server enables AI assistants like Claude to interact with your Roam Research graph through a standardized interface.Last updated -181160TypeScriptMIT License
- -securityFlicense-qualityA MCP server that allows AI assistants to interact with the browser, including getting page content as markdown, modifying page styles, and searching browser history.Last updated -80TypeScript
- -securityFlicense-qualityHigh-performance server enabling AI assistants to access web scraping, crawling, and deep research capabilities through Model Context Protocol.Last updated -11TypeScript
- -securityFlicense-qualityA custom server that integrates WebDNA documentation with AI assistants by scraping, indexing, and providing searchable documentation through MCP-compatible API endpoints.Last updated -JavaScript