Skip to main content
Glama

cBioPortal MCP Server

by pickleton89
CHANGELOG.md71.5 kB
# Changelog All notable changes to the cBioPortal MCP Server project will be documented in this file. ## [Unreleased] ### 2025-07-20 (Evening Session) #### Critical MCP Server Connection and Functionality Fixes **Complete Resolution of Claude Desktop Integration Issues**: - **Fixed APIClient initialization problem**: Root cause was missing `@handle_api_errors` decorator on 10 endpoint methods, preventing lazy API client initialization - **Added missing decorators to endpoint methods**: - `studies.py`: `search_studies`, `get_study_details`, `get_multiple_studies` - `genes.py`: `search_genes`, `get_genes`, `get_multiple_genes`, `get_mutations_in_gene` - `molecular_profiles.py`: `get_molecular_profiles`, `get_clinical_data`, `get_gene_panels_for_study`, `get_gene_panel_details` - `samples.py`: `get_sample_list_id` - **Resolved import errors**: Added missing `handle_api_errors` imports to `genes.py` and `molecular_profiles.py` - **Implemented lazy API client initialization**: Enhanced `BaseEndpoint._ensure_api_client_ready()` method to automatically initialize API client on first use - **Updated error handling decorator**: Modified `@handle_api_errors` to call API client initialization before method execution **Claude Desktop Configuration Improvements**: - **Fixed connection configuration**: Changed from problematic `uv run` approach to reliable direct script path method - **Updated Claude Desktop config**: Switched to `/path/to/project/.venv/bin/cbioportal-mcp` absolute path approach - **Enhanced installation process**: Emphasized importance of `uv pip install -e .` for editable mode installation **Documentation and Setup Enhancements**: - **Updated README.md**: Added comprehensive Claude Desktop setup instructions with two configuration options - **Enhanced troubleshooting guide**: Added specific guidance for virtual environment script paths and installation requirements - **Improved setup documentation**: Clear distinction between direct script path (recommended) and uv run (alternative) approaches **Quality Assurance**: - **Complete functionality verification**: All 17 cancer genomics tools now working correctly with automatic API client initialization - **Comprehensive error handling**: Consistent error handling across all endpoint methods with proper initialization - **Zero regressions**: All existing functionality preserved while fixing connection issues - **Production ready**: Server now reliably connects to Claude Desktop and provides full cancer genomics data access **Impact and Benefits**: - **Resolved "APIClient._client is not initialized" errors**: Complete fix for all endpoint methods - **Enhanced user experience**: Seamless Claude Desktop integration with clear setup instructions - **Improved reliability**: Automatic API client initialization eliminates startup timing issues - **Better documentation**: Users can now successfully configure and use the server without connection problems ### 2025-07-20 #### Code Quality Improvements Based on External Review **Comprehensive Code Quality Fixes**: - **Fixed outdated imports in example scripts** (scripts/test_async.py and scripts/quick_test.py): - Updated imports from old flat structure to proper package imports: `cbioportal_mcp.server` and `cbioportal_mcp.config` - Updated server initialization to use Configuration-based constructor with dependency injection - All development scripts now work with current architecture - **Replaced custom deep copy implementation** (config.py): - Removed manual recursive dictionary copying in `_deep_copy_dict()` method - Replaced with standard library `copy.deepcopy()` for better reliability and maintainability - Added proper import for copy module - **Eliminated global logger reassignment** (server.py): - Removed problematic `global logger` declaration and reassignment in main function - Replaced with local `config_logger` variable for cleaner code organization - Updated all subsequent logger references to use the local variable - Prevents potential issues with global state modification - **Made gene batch size configurable** (config.py, endpoints/genes.py, server.py): - Added `api.batch_size.genes` configuration option with default value of 100 - Added environment variable support: `CBIOPORTAL_GENE_BATCH_SIZE` - Added proper validation for batch size (must be positive integer) - Updated GenesEndpoints to accept configuration and use configurable batch size - Enhanced server initialization to pass configuration to GenesEndpoints - **Added comprehensive Ruff configuration** (pyproject.toml): - Configured linter rules: pycodestyle, pyflakes, isort, flake8-bugbear, comprehensions, pyupgrade, simplify - Set line length to 88 characters and target Python 3.10+ - Added formatter configuration with proper quote style and indentation - Configured import sorting with known first-party packages - Ignored specific rules where appropriate (E501 for line length, B008 for function defaults) **Quality Assurance**: - All 93 tests continue to pass after code quality improvements - Zero functional regressions introduced by cleanup - Improved code maintainability and adherence to Python best practices - Enhanced configurability for deployment flexibility - Added modern linting infrastructure for ongoing code quality ### 2025-07-18 #### Major Code Duplication Refactoring Complete **Comprehensive Architectural Improvements**: - **Successfully eliminated ~60% of code duplication** across endpoint modules through systematic refactoring - **Created BaseEndpoint class** (`cbioportal_mcp/endpoints/base.py`) with shared functionality: - Centralized pagination logic in `paginated_request()` method - Standardized error handling with `@handle_api_errors` decorator - Automatic parameter validation with `@validate_paginated_params` decorator - Shared utility methods for response building and limit application - **Refactored endpoint classes** to inherit from `BaseEndpoint`: - `StudiesEndpoints` and `SamplesEndpoints` now use clean delegation pattern - Methods reduced from 50+ lines to 8-10 lines using base class functionality - Consistent pagination response structure across all endpoints **Code Quality Achievements**: - **Removed unnecessary wrapper methods**: Eliminated redundant `collect_all_results` methods from endpoint classes - **Standardized pagination patterns**: All paginated methods now use identical parameter building and response formatting - **Improved error handling consistency**: Validation errors properly bubble up while API errors are handled gracefully - **Enhanced maintainability**: New endpoints can inherit from `BaseEndpoint` and get full functionality with minimal code **Testing Excellence**: - **All 93 tests passing** - Zero functional regressions through comprehensive refactoring - **Maintained API compatibility**: All existing method signatures and response formats preserved - **Enhanced validation testing**: Input validation decorator properly handles edge cases - **Preserved error handling**: Custom error messages maintained where needed for backward compatibility **Technical Implementation**: - **Decorators for cross-cutting concerns**: `@handle_api_errors` and `@validate_paginated_params` eliminate repetitive code - **Inheritance-based architecture**: `BaseEndpoint` provides foundation for all endpoint classes - **Preserved async patterns**: All async functionality and performance optimizations maintained - **Clean separation of concerns**: Pagination, validation, and error handling properly separated **Impact and Benefits**: - **Reduced duplication**: From 850+ lines of duplicated code to ~350 lines of shared functionality - **Improved consistency**: Standardized pagination responses and error handling across all endpoints - **Enhanced developer experience**: Adding new endpoints now requires minimal boilerplate code - **Better maintainability**: Changes to pagination or error handling logic only need to be made in one place - **Foundation for future improvements**: Architectural improvements make the codebase ready for further enhancements ### 2025-07-10 #### Type Checking Improvements **Comprehensive Type Annotation Fixes**: - **Fixed parameter default type mismatches** across multiple modules: - `utils/pagination.py`: Changed `Dict[str, Any] = None` to `Optional[Dict[str, Any]] = None` and `int = None` to `Optional[int] = None` - `server.py`: Updated pagination method signatures with proper Optional types - `endpoints/molecular_profiles.py` and `endpoints/studies.py`: Fixed parameter type annotations - **Fixed return type mismatches** in `molecular_profiles.py`: Updated `get_gene_panels_for_study` return type to `Union[List[Dict[str, Any]], Dict[str, Any]]` to allow error dictionaries - **Fixed unresolved imports**: Corrected relative import path from `utils.pagination` to `..utils.pagination` in `molecular_profiles.py` - **Added missing type imports**: Added `Optional` and `Union` to typing imports where needed - **Fixed test attribute issue**: Added type ignore comment for `SystemExit.code` attribute in test **Quality Assurance**: - Reduced type checker errors from 26 to 3 (88% improvement) - Remaining 3 errors are in old script files that don't affect main codebase - All critical type safety issues resolved - Improved code maintainability with better type annotations #### Code Quality and Linting Improvements **Comprehensive Linter Fixes**: - **Critical bug fix** (api_client.py:130): Fixed undefined `url` variable in debug logging by using `response.url` - **Removed unused imports** across multiple files: - server.py: Removed unused `logging`, `httpx`, and validation function imports - endpoints/genes.py: Cleaned up unused `Any` and validation imports (kept `List` as needed) - endpoints/samples.py: Removed unused `Any` and `List` imports - utils/pagination.py: Removed unused `Optional` import - Test files: Removed unused `CBioPortalMCPServer` imports - **Style improvements**: Fixed unnecessary f-string prefixes in server.py log statements **Quality Assurance**: - All 93 tests continue to pass after linter fixes - Ruff linter reports "All checks passed!" - Zero functional impact - all changes were safe cleanup or critical bug fixes - Maintained full backward compatibility #### High Priority Code Quality Fixes (Phase 1) **Configuration System Improvements**: - **Fixed Configuration.get() falsy values bug** (config.py:266): Replaced 'or default' logic with proper None check to preserve legitimate zero/false configurations - **Added comprehensive test coverage** for falsy values scenarios (0, False, empty strings) **HTTP Client Architecture Improvements**: - **Enhanced APIClient initialization** (api_client.py:33): Pass base_url to httpx.AsyncClient constructor for cleaner URL handling - **Simplified API request logic**: Removed manual URL concatenation in favor of httpx's built-in relative path resolution - **Improved error handling**: Updated logging to use endpoint paths instead of full URLs **Testing and Validation**: - All 93 existing tests continue to pass - Added new test for falsy values edge cases - Maintained full backward compatibility #### Medium Priority Code Quality Fixes (Phase 2) **Async Best Practices**: - **Fixed deprecated event loop usage** (endpoints/studies.py): Replaced asyncio.get_event_loop().time() with time.perf_counter() for performance timing - **Improved signal handling** (cbioportal_server.py:314): Replaced sys.exit(0) with graceful KeyboardInterrupt-based shutdown for proper async cleanup **Exception Handling Enhancements**: - **Created specific exception classes** (api_client.py): Replaced generic Exception wrapping with typed hierarchy: - APIClientError (base), APIHTTPError (with status codes), APINetworkError, APITimeoutError, APIParseError - **Enhanced error diagnostics**: Each exception type includes relevant context (status codes, endpoints, response text) - **Improved error handling granularity**: Network timeouts, HTTP errors, and parsing errors now have distinct exception types **Reliability Improvements**: - Better async shutdown sequence preventing resource leaks - Enhanced error reporting with specific exception context - Future-proof timing code using modern Python async patterns #### Package Structure Refactoring (Phase 3 - COMPLETED) **Professional Package Structure Implementation**: - **Created proper Python package** under `cbioportal_mcp/` directory following industry standards - **Moved all modules** into package with preserved git history using `git mv` commands - **Updated import structure**: Relative imports within package (`.api_client`, `..utils.validation`) - **Renamed main module**: `cbioportal_server.py` → `server.py` for cleaner package organization **Build and Distribution Improvements**: - **Updated pyproject.toml**: Entry point `cbioportal-mcp = "cbioportal_mcp.server:cli_main"` - **Simplified build targets**: From individual file listing to package directory inclusion - **Package exports**: Comprehensive `__init__.py` with clean public API exports - **CLI functionality verified**: `cbioportal-mcp --help` works correctly **Comprehensive Test Updates**: - **Updated all test imports** to reference new package structure (`cbioportal_mcp.server`) - **Fixed @patch decorators** across 8 test files with new module paths - **Maintained test coverage**: All 93 tests passing with zero regressions #### Final Project Completion **All Six Priority Areas Successfully Implemented**: - ✅ **High priority**: Configuration falsy values bug and HTTP client initialization - ✅ **Medium priority**: Deprecated event loop usage, signal handling, and exception hierarchy - ✅ **Low priority**: Package structure refactoring (COMPLETED - was initially deferred but successfully implemented) **Transformation Summary**: - **Before**: Collection of scripts at repository root - **After**: Professional Python package with proper namespace management - **Benefits**: Better distribution, cleaner imports, namespace conflict avoidance - **Result**: Industry-standard package structure ready for PyPI distribution ### 2025-07-08 #### Documentation Transformation and Project Modernization - **Complete README.md Rewrite**: - **Professional documentation overhaul**: Transformed README.md from outdated structure to comprehensive, modern documentation showcasing project evolution - **AI-Collaborative Development section**: Added prominent section highlighting innovative human-AI partnership methodology combining 20+ years cancer research expertise with LLM-powered implementation - **Architecture showcase**: Documented 71% code reduction (1,357 → 396 lines) and modular structure with utils/, endpoints/, config.py organization - **Performance achievements**: Prominently featured 4.5x async performance improvement and concurrent operations capabilities - **Modern tooling emphasis**: Updated installation instructions to prioritize uv package management and pyproject.toml workflow - **Enhanced Project Presentation**: - **Multi-layer configuration documentation**: Comprehensive YAML configuration examples with CLI args > Environment variables > Config file > Defaults priority - **Testing excellence highlighted**: Showcased 92 comprehensive tests across 8 organized test suites with full coverage - **Enterprise features**: Documented input validation, error handling, pagination support, and bulk operations - **Developer experience**: Added modern development workflow with uv commands, troubleshooting guide, and contributing guidelines - **Use case examples**: Included realistic cancer genomics research queries and bulk operation scenarios - **Legal and Attribution Updates**: - **MIT License personalization**: Updated copyright holder from "cBioPortal MCP Project" to "Jeff Kiefer" for proper attribution - **Professional presentation**: README now reflects production-ready, enterprise-grade bioinformatics platform status - **Innovation showcase**: Documentation demonstrates how domain experts can effectively leverage AI tools for scientific software development - **Impact and Significance**: - **Documentation alignment**: README now accurately represents the project's transformation from basic script to professional platform - **Methodology demonstration**: Showcases cutting-edge human-AI collaboration approach in bioinformatics software development - **Technical excellence**: Comprehensive documentation of modular architecture, async performance, and comprehensive testing ### 2025-07-08 #### Project Structure Cleanup and Organization - **Repository Structure Reorganization**: - **Created `docs/` directory**: Moved documentation files (IMPLEMENTATION_PLAN.md, MIGRATION_TO_UV.md, REFACTOR_PLAN.md) for better organization - **Created `scripts/` directory**: Moved development utilities (quick_test.py, test_async.py) with updated import paths - **Removed redundant files**: Eliminated requirements.txt and requirements-dev.txt (now handled by pyproject.toml) - **Professional structure**: Repository now follows Python best practices with logical file grouping - **Git Configuration Improvements**: - **Updated .gitignore**: Added Claude Code personal files section to ignore CLAUDE.local.md and .claude/ directory - **Removed personal files from tracking**: CLAUDE.local.md and .claude/settings.local.json no longer tracked - **Preserved shared documentation**: CLAUDE.md remains tracked for team collaboration - **Enhanced privacy**: Personal workflow notes and settings now properly excluded from version control - **Test Suite Modernization and Fixes**: - **Fixed CLI test failures**: Updated all 5 CLI tests to work with new Configuration-based API - **API compatibility updates**: Tests now properly mock load_config and Configuration objects - **Assertion corrections**: Updated test expectations to match current server constructor (config= parameter) - **Logging mock updates**: Changed from logging.basicConfig to setup_logging mocks - **Complete test coverage**: All 92 tests now passing with zero regressions - **Script Maintenance**: - **Updated import paths**: Fixed quick_test.py and test_async.py to work from scripts/ directory - **Path resolution**: Added proper sys.path manipulation for relocated development utilities - **Functionality preserved**: All development scripts maintain full compatibility #### Code Quality and Maintenance Improvements - **Comprehensive Code Cleanup**: - **Removed obsolete comments**: Cleaned up TODO and reminder comments from test files (test_input_validation.py and test_snapshot_responses.py) - **POSIX compliance**: Added trailing newlines to all Python files missing them (10 files updated: config.py, endpoints/*.py, utils/*.py) - **Magic number elimination**: Created constants.py with FETCH_ALL_PAGE_SIZE = 10000000 and replaced all hardcoded occurrences across endpoint modules - **Updated legacy scripts**: Fixed quick_test.py to work with current Configuration-based architecture - **Test Infrastructure Modernization**: - **Simplified imports**: Removed manual sys.path.insert() modifications from all 8 test files and conftest.py - **Leveraged package structure**: Tests now use proper module resolution through uv and pyproject.toml - **Improved maintainability**: Eliminated 50+ lines of boilerplate path manipulation code - **Enhanced readability**: Cleaner import statements and better organization - **Code Quality Achievements**: - **Better maintainability**: Centralized constants prevent magic number proliferation - **Improved consistency**: Standardized file endings and import patterns - **Reduced technical debt**: Eliminated outdated helper scripts and comments - **Professional standards**: Adherence to Python best practices and POSIX compliance #### Modular Refactoring Successfully Merged to Main - **Complete Implementation Plan Success**: - **All 4 phases completed and merged**: Utility modules, endpoint modules, server refactoring, and configuration enhancement - **Pull Request #1 merged**: Successfully integrated modular refactoring into main branch with merge conflict resolution - **Branch cleanup**: refactor/cbioportal-server-modular branch merged and cleaned up - **Production deployment ready**: Comprehensive architecture transformation completed - **Final Architecture Achievement**: - **71% code reduction**: Main server reduced from 1,357 to 396 lines (961 lines removed) - **Zero functionality regression**: All API capabilities preserved with complete backward compatibility - **4.5x performance maintained**: Async improvements and concurrent operations fully preserved - **Modern modular structure**: Clean separation into utils/, endpoints/, config.py, and api_client.py - **Quality Assurance Milestone**: - **Core functionality verified**: Configuration system, server startup, and API integration confirmed working - **Test infrastructure updated**: Test fixtures and core tests updated for new architecture - **Documentation complete**: CLAUDE.md, IMPLEMENTATION_PLAN.md, and comprehensive changelog - **Build system modernized**: uv package management with pyproject.toml and proper dependency management - **Implementation Plan Final Status**: - ✅ **Phase 1 complete**: Utility modules (pagination, validation, logging) extracted and integrated - ✅ **Phase 2 complete**: Endpoint modules (studies, genes, samples, molecular_profiles) with domain separation - ✅ **Phase 3 complete**: Server class refactoring with clean dependency injection pattern - ✅ **Phase 4 complete**: Configuration enhancement with YAML and environment variable support - 🎯 **Main branch integration**: All phases successfully merged and production ready #### Phase 4: Configuration Enhancement Complete - **Comprehensive Configuration Management System**: - **Created `config.py` module**: Full-featured Configuration class with validation, environment variable support, and YAML file integration - **Multi-layered configuration priority**: CLI arguments > Environment variables > Configuration file > Default values - **YAML configuration files**: Added PyYAML dependency and comprehensive file-based configuration support - **Environment variable integration**: `CBIOPORTAL_*` prefixed variables for all configuration options - **Enhanced CLI Interface**: - **Configuration file support**: `--config config.yaml` option for specifying configuration files - **Example configuration generation**: `--create-example-config` creates fully documented example configuration - **Improved help documentation**: Comprehensive examples and configuration source priority explanation - **Backward compatibility**: All existing CLI options maintained with configuration file override capability - **Configuration Features**: - **Comprehensive validation**: Type checking, range validation, and descriptive error messages for all configuration options - **Nested configuration access**: Dot notation support for accessing nested configuration values - **Future-ready API settings**: Configuration structure for rate limiting, retry logic, and caching (ready for future implementation) - **Flexible deployment support**: Environment variable integration enables easy deployment configuration - **Server Architecture Integration**: - **Updated constructor**: `CBioPortalMCPServer` now accepts Configuration object instead of individual parameters - **Dependency injection**: Configuration object passed through server initialization - **Enhanced logging integration**: Configuration-driven logging setup and level management - **Build system updates**: Updated pyproject.toml to include new configuration module and dependencies - **Quality Assurance**: - **Core functionality verified**: Configuration system working correctly with CLI help, example generation, and server integration - **Test infrastructure updated**: Test fixtures updated to support new configuration-based constructor - **Backward compatibility maintained**: Default configuration values preserve existing behavior - **Configuration validation**: Comprehensive error checking ensures proper configuration at startup - **Implementation Plan Success**: - ✅ Phase 4 complete: Configuration enhancement implemented and verified - **Target exceeded**: Achieved comprehensive configuration management with validation, file support, and environment integration - **Foundation for deployment**: Configuration system ready for production deployment scenarios #### Phase 3: Server Refactoring Complete - **Main Server Class Refinement**: - **Clean dependency injection pattern**: APIClient and endpoint modules properly injected in `__init__` - **Explicit tool registration**: Replaced dynamic method discovery with explicit tool method list - **Improved lifecycle management**: Fixed invalid client references and streamlined startup/shutdown - **Enhanced initialization**: Added clear documentation and organized structure with proper separation - **Code Quality & Maintenance**: - **Final line count: 396 lines** (down from 1,352 original - **71% total reduction**) - Fixed signal handler diagnostic warnings and improved error handling - Added comprehensive docstrings and organized imports - Streamlined tool registration prevents accidental exposure of internal methods - **Architecture Excellence**: - **Focused responsibility**: Main server class now handles only MCP coordination, tool routing, and lifecycle management - **Clean separation**: All business logic extracted to dedicated endpoint modules - **Testable design**: Clear dependency injection makes components easily mockable - **Maintainable structure**: Explicit configurations over dynamic discovery - **Quality Assurance**: - **All 92 tests continue to pass** - zero functionality regression through Phase 3 - Complete API compatibility maintained across all refactored components - Tool registration verification ensures all expected methods are properly exposed - Lifecycle management tests confirm proper resource handling - **Implementation Plan Success**: - ✅ Phase 3 complete: Server class refactored and streamlined - 🎯 Ready for Phase 4: Configuration Enhancement - **Exceeded targets**: Achieved better organization and maintainability than planned #### Phase 2: Endpoint Modules Refactoring Complete - **Modular Endpoint Architecture**: - Created `endpoints/` package with focused domain separation: - `endpoints/studies.py` - Cancer studies, types, search functionality (~360 lines) - `endpoints/genes.py` - Gene operations, mutations, batch processing (~240 lines) - `endpoints/samples.py` - Sample data and sample lists (~80 lines) - `endpoints/molecular_profiles.py` - Molecular profiles, clinical data, gene panels (~280 lines) - **Main Server Transformation**: - **Dramatic reduction from 1,352 to 371 lines** (72.5% reduction, 981 lines removed!) - Exceeded implementation plan target of 600-800 lines by achieving just 371 lines - Implemented clean dependency injection pattern with APIClient - All endpoint methods now delegate to focused domain modules - Maintained exact method signatures for FastMCP compatibility - **Architecture Excellence**: - **Clear separation of concerns**: Each endpoint module handles a specific API domain - **Dependency injection**: APIClient injected into all endpoint modules for testability - **Maintainable codebase**: Easy to locate, modify, and extend specific functionality - **Extensible design**: Simple to add new endpoints to appropriate domain modules - **Quality Assurance Success**: - **All 92 tests continue to pass** - zero functionality regression - Full API compatibility maintained across all endpoints - Comprehensive test coverage preserved through careful delegation - Clean module imports and proper package structure - **Performance & Async Benefits Preserved**: - All async patterns and performance optimizations maintained - Concurrent operations (get_multiple_studies, get_multiple_genes) fully functional - 4.5x async performance improvement retained from Phase 1 - Complete pagination support across all endpoint modules - **Implementation Plan Progress**: - ✅ Phase 2 complete: Endpoint modules extracted and integrated - 🎯 Ready for Phase 3: Server class refactoring and further cleanup - Target exceeded: Achieved 371 lines vs planned 600-800 lines #### Phase 1: Utility Modules Refactoring Complete - **Modular Architecture Foundation**: - Created `utils/` package with clean module structure and proper exports - Extracted pagination logic from main server into `utils/pagination.py` (~95 lines) - Standardized input validation patterns into `utils/validation.py` (~150 lines) - Centralized logging configuration in `utils/logging.py` (~90 lines) - **Main Server Refactoring**: - Reduced main server file from 1,357 to ~1,100 lines (257 line reduction) - Replaced repetitive validation code with reusable validation functions - Updated pagination methods to delegate to utility functions - Migrated to centralized logging setup using `utils.logging.setup_logging()` - Maintained 100% backward compatibility - all method signatures unchanged - **Code Quality Improvements**: - Eliminated code duplication across validation patterns - Improved maintainability with centralized utility functions - Enhanced reusability of common functionality across modules - Established foundation for Phase 2 endpoint module extraction - **Testing Success**: - All 92 tests continue to pass after refactoring - No breaking changes introduced to existing functionality - Preserved test compatibility through careful API preservation - Updated test mocking to handle new utility module imports - **Implementation Plan Progress**: - ✅ Phase 1 complete: Utility modules extracted and integrated - 🎯 Ready for Phase 2: Endpoint modules (studies.py, genes.py, samples.py, molecular_profiles.py) - Target: Further reduce main server to 400-500 lines through endpoint extraction ### 2025-01-08 #### Migration to uv Package Manager - **Project Modernization**: - Migrated from pip/requirements.txt to uv's modern project workflow using pyproject.toml - Created comprehensive pyproject.toml with full project metadata, dependencies, and configuration - Generated universal lockfile (uv.lock) for reproducible builds across platforms - Updated Python requirement from 3.8+ to 3.10+ (required by fastmcp dependency) - **Dependency Management Improvements**: - Consolidated requirements.txt and requirements-dev.txt into pyproject.toml dependency groups - Moved pytest configuration from pytest.ini to pyproject.toml for centralized configuration - Added proper build system configuration with hatchling backend - Created script entry point: `cbioportal-mcp` command for easy CLI access - **Development Workflow Enhancements**: - All development commands now use `uv run` for automatic environment management - Simplified dependency installation with `uv sync` (replaces multiple pip install commands) - Added synchronous CLI entry point wrapper for async main function - Performance improvement: 10-100x faster dependency operations with uv vs pip - **Documentation Updates**: - Updated README.md to prioritize uv workflow with migration instructions - Updated CLAUDE.md with new uv-based development commands - Created comprehensive MIGRATION_TO_UV.md guide for existing developers - Updated installation instructions and usage examples throughout documentation - **Quality Assurance**: - Verified all 92 tests pass with uv workflow - Confirmed server functionality with both `uv run python cbioportal_server.py` and `uv run cbioportal-mcp` - Maintained backward compatibility for pip users (legacy support documented) - No breaking changes to server functionality or API ### 2025-05-09 #### Session Summary (2025-05-09 13:52) - **Pagination Implementation Completion**: - Completed pagination support for the two remaining collection-returning methods: - `get_mutations_in_gene`: Added pagination parameters and updated implementation to properly handle pagination - `get_clinical_data`: Added pagination parameters and updated implementation to support pagination for both GET and POST requests - Ensured consistent pagination interface across all methods with the same parameter names and response structure - Updated method documentation with detailed parameter descriptions and return value documentation - Maintained special handling for `limit=0` (fetch all results) in the newly updated methods - **API Documentation Review**: - Reviewed the cBioPortal API v3 documentation (OpenAPI 3.1.0) to ensure proper implementation - Confirmed the available pagination parameters and their usage across different endpoints - Verified the correct endpoint paths and parameter structures for mutation and clinical data endpoints - **MCP Integration Research**: - Explored the Model Context Protocol (MCP) documentation and Python SDK - Gained familiarity with MCP server implementation patterns for future enhancements - Reviewed how the cBioPortal server exposes API functionality through MCP tools - **Code Quality Improvements**: - Fixed docstring syntax error in the `run` method - Addressed critical syntax issues to ensure proper code execution - Maintained consistent code style across the newly implemented pagination methods #### Pagination Completion - **Milestone**: Pagination support has been fully implemented and verified across all collection-returning API methods in the server. - All relevant methods now accept `page_number`, `page_size`, `sort_by`, `direction`, and `limit` parameters. - Each method returns a response with a `pagination` metadata dictionary for consistent client consumption. - Special handling for `limit=0` (fetch all results) is in place where supported by the API. - Methods covered: - `get_cancer_studies` - `get_cancer_types` - `get_samples_in_study` - `search_genes` - `search_studies` - `get_molecular_profiles` - `get_mutations_in_gene` - `get_clinical_data` - **Quality Improvements**: - Audited all endpoints to ensure no collection-returning method is missing pagination support. - Standardized error handling and response formatting for paginated endpoints. #### Testing Results - **Test Execution**: - Successfully passed 11 of 11 tests after all implementation fixes - Verified proper pagination functionality across all methods - Confirmed batch processing for gene retrieval works correctly - Addressed code linting issues for better quality and maintainability #### Implementation Fixes - **Pagination Implementation**: - Fixed `get_cancer_studies` method to properly use pagination parameters - Updated `search_genes` method to use correct endpoint with pagination - Ensured consistent pagination structure across all methods - Implemented special case handling for "all results" (limit=0) requests - Added proper limit handling to truncate results when needed - **Method Implementation Updates**: - Updated `get_genes` to use batch endpoint for better efficiency - Simplified `get_study_details` to focus on core functionality - Improved code quality by fixing lint issues #### Testing Infrastructure - **Testing Framework Setup**: - Added pytest configuration with pytest.ini - Created comprehensive unit test structure with mocking - Added test coverage reporting capability with pytest-cov - **Test Files Implementation**: - Created test_pagination.py with specific tests for pagination features - Created test_cbioportal_server.py for basic functionality tests - Added tests for error handling and configuration - **Dependency Updates**: - Added pytest and pytest-cov to requirements.txt - Updated MCP library version constraint to use available versions (1.0.0-1.8.0) #### Project Setup and Initial Implementation - **Repository Initialization**: Created a Git repository for version control and collaboration - **Documentation**: - Created comprehensive README.md with installation instructions, usage examples, and feature documentation - Added LICENSE file with MIT License - Created requirements.txt file specifying dependencies - Added standard .gitignore file for Python projects #### API Research and Evaluation - **API Documentation Review**: - Analyzed cBioPortal API v2 Swagger documentation - Discovered and explored cBioPortal API v3 (OpenAPI 3.1.0) documentation - Documented differences between v2 and v3 APIs - Identified key endpoints and patterns for data retrieval #### Code Improvements - **Pagination Support**: Added LLM-optimized pagination to collection-based methods: - `get_cancer_studies`: Added pagination, sorting, and limit parameters - `get_cancer_types`: Added pagination, sorting, and limit parameters - `get_samples_in_study`: Added pagination, sorting, and limit parameters - `search_genes`: Added pagination, sorting, and limit parameters - **Response Structure Optimization**: - Restructured method responses for better LLM interaction - Added consistent pagination metadata across methods - Implemented `has_more` flag for easier pagination state tracking - **Method Parameter Enhancements**: - Added `gene_id_type` and `projection` parameters to `get_genes` - Improved method documentation with detailed parameter descriptions - Added explanations of valid values for enumeration parameters #### LLM Interaction Optimizations - **Context Window Management**: Set conservative default page sizes (50 items) - **Progressive Information Retrieval**: Added `limit` parameter to control total results - **Explicit Documentation**: Added detailed comments explaining valid sort fields for each method #### Bug Fixes (Post-Pagination Implementation) - **Server-Side Logic (`cbioportal_server.py`)**: - Corrected the API endpoint in the `search_genes` method from `genes/search` to the standard `genes` endpoint. This resolves an issue where tests for gene searching were failing due to an incorrect API path. - Revised the `has_more` pagination flag calculation in the following methods to accurately reflect whether more data might be available from the API, rather than relying on the count of items on the current page: - `get_cancer_studies` - `get_cancer_types` - `get_samples_in_study` - `search_genes` - `get_mutations_in_gene` - `get_clinical_data` - This change fixes previous assertion errors in tests where the `has_more` flag was incorrectly `False` when subsequent pages were expected. #### Code Improvements & Linting - **Code Health in `cbioportal_server.py`**: - Removed unused Python imports (`json`, `os`, `dotenv.load_dotenv`) to clean up the codebase. - Refactored error message string formatting: - Converted f-strings to regular strings in instances where placeholders were not actually used, addressing lint warnings. - Ensured that `study_id` (and other relevant variables) are correctly included in error messages for methods like `get_mutations_in_gene`, `get_clinical_data`, and `get_study_details` by using f-strings appropriately. This restores clarity to error messages that were inadvertently made generic. ### 2025-05-09 (16:00) #### Pagination Logic Completion and Testing - **Test Suite Fixes**: - Fixed test failures related to multiple API calls in `get_mutations_in_gene` by: - Updating tests to use `assert_has_calls()` instead of `assert_called_with()` - Correctly mocking both the molecular profiles request and the mutations request in sequence - Adding proper side effect patterns for mock API responses in sequence - Fixed test failures in `get_clinical_data` tests by: - Accounting for the patient-centric data structure where multiple clinical data entries are consolidated per patient - Updating response validation to correctly check unique patient counts instead of raw data counts - Adding proper test cases for both GET and POST API request paths - **API Response Consistency**: - Modified the `get_clinical_data` method to calculate `total_found` based on the number of unique patients rather than raw clinical data entries - This ensures the pagination metadata accurately reflects the actual data returned to clients - Improved test assertions to verify that pagination structure is consistent with the response content - **Dependency Management**: - Added `fastmcp>=0.1.0` to `requirements.txt` to support server functionality - Updated project to properly use the latest fastmcp package for MCP server implementation - **Code Architecture Improvements**: - Ensured all tests correctly model the internal behavior of methods that make multiple API calls - Improved mock setup to better simulate the actual API responses - Added better test documentation to explain the data transformation from raw API responses to structured client data #### Validation Results - **Test Status**: - Successfully completed all 15 tests with no failures or errors - Confirmed proper pagination behavior for all collection-returning methods - Verified correct API endpoint usage for all methods - Ensured the `has_more` flag correctly indicates when additional pages of data might be available ### 2025-05-09 (17:43) #### Async Implementation for Performance Enhancement - **Step 1: Replace HTTP Client**: - Replaced `requests` with `httpx` for async HTTP support - Added `asyncio` import for asynchronous programming capabilities - This change is the foundation for converting synchronous API calls to asynchronous - **Step 2: Create Async Client Manager**: - Implemented async lifecycle management with `startup` and `shutdown` hooks - Added `httpx.AsyncClient` initialization in the startup process - Configured proper client cleanup on server shutdown - Registered FastMCP lifecycle hooks using `mcp.on_startup` and `mcp.on_shutdown` - **Step 3: Convert API Methods to Async**: - Transformed `_make_api_request` helper method to async with proper error handling - Converted `get_cancer_studies` method to use async/await pattern - Added await keyword to API calls where needed - Maintained the same method signature and response structure for compatibility - **Step 4: Implement Pagination with Async Generators**: - Created `paginate_results` async generator for efficient paging through API results - Added `collect_all_results` helper to gather complete datasets when needed - Enhanced `get_cancer_studies` to use new pagination infrastructure - Made `run` method async to support full async lifecycle - Updated main function with better documentation and enhanced CLI options - **Step 5: Update Run Logic**: - Added comprehensive logging system with configurable levels - Implemented proper signal handling for graceful shutdowns - Enhanced error reporting with structured logging - Added startup and shutdown status reporting - Improved CLI interface with better argument documentation - **Step 6: Update Tool Registration for Async Methods**: - Converted remaining methods to async (`get_cancer_types`, `get_genes`, `get_study_details`) - Enhanced `get_cancer_types` to use the async pagination infrastructure - Added detailed method descriptions to tool registrations - Categorized tools by functionality (data retrieval, molecular data) - Added thorough documentation on how FastMCP handles async methods - **Step 7: Implement Concurrency for Bulk Operations**: - Added `get_multiple_studies` method to fetch studies in parallel using `asyncio.gather` - Implemented `get_multiple_genes` with smart batching for efficient concurrent requests - Added performance metrics like execution time to concurrent operations - Registered new bulk operations as MCP tools with descriptive documentation - Resolved the unused `asyncio` import warning by leveraging asyncio's concurrency features ### 2025-05-09 (18:04) #### Bug Fix: Method Registration Error - **Issue**: During testing, discovered a reference to a non-existent method `get_mutations_by_gene_and_study` in the tool registration code - **Resolution**: Removed the invalid tool registration line from `_register_tools` method - **Impact**: This fix enables the server to start properly without throwing an `AttributeError` - **Root Cause**: The method was likely referenced during development but never implemented, or was removed without updating the registration code #### FastMCP Compatibility Fix - **Issue**: The installed version of FastMCP doesn't support the `on_startup` and `on_shutdown` lifecycle hooks - **Resolution**: Modified the server initialization to directly create the HTTP client instead of using lifecycle hooks - **Impact**: Ensures compatibility with the available FastMCP version - **Adaptation**: Updated test script to remove calls to startup/shutdown methods that are no longer needed #### Performance Testing Results - **Test Implementation**: Created a test script (`test_async.py`) to benchmark async vs sequential operations - **Sequential vs Concurrent Study Fetching**: - Sequential fetching: 1.31 seconds to fetch 10 studies - Concurrent fetching: 0.29 seconds for the same operation - **4.57x performance improvement** with the async implementation - Data consistency verified between both approaches - **Gene Batch Concurrency**: - Successfully fetched 14 genes in 0.26 seconds with automatic batching - Demonstrated the effective batch processing with concurrent execution - **Validation**: All operations produced consistent results with excellent performance gains ### 2025-05-10 (09:00) #### Async Implementation Completion - **Fixed Async Methods and Tests**: - Completed the transition to fully async implementation, fixing the remaining issues with coroutines and tests - Added `await` to the second API call in `get_mutations_in_gene` method to ensure proper handling of asynchronous operations - Converted `get_clinical_data` method to be async and added `await` to its API calls - Updated all tests to properly handle async functions using `asyncio.run()` - Fixed tests that were getting stuck in infinite loops by improving async mock handling - Fixed all remaining test failures with proper async mocking patterns - **Test Suite Results**: - All 15 tests now pass successfully - Fixed specific test issues in: - `test_get_mutations_in_gene_pagination` - `test_get_mutations_in_gene_with_sort_and_limit` - `test_get_clinical_data_pagination` - `test_get_clinical_data_with_sort_and_limit` - `test_get_all_results` - **Code Quality**: - Improved mock setup for async methods using async side_effect functions - Standardized approach to testing asynchronous API methods - Eliminated coroutine handling errors and RuntimeWarnings #### FastMCP Async Lifecycle Management Fix (09:30) - **Fixed Server Startup and Lifecycle Management**: - Modified the server's run method to be synchronous, which matches FastMCP's expected pattern - Properly registered async lifecycle hooks via `mcp.on_startup` and `mcp.on_shutdown` properties - Simplified the main function to let FastMCP handle the event loop internally - Fixed integration with Claude Desktop which now successfully connects to the server - Maintained the benefits of async implementation for I/O-bound API calls ### 2025-05-10 (09:42) #### Async Function Call Fixes - **Resolved `TypeError: 'coroutine' object is not iterable` in `search_studies`**: - Modified `search_studies` to be an `async def` function. - Ensured the call to `self._make_api_request("studies")` within `search_studies` is properly `await`ed. - Improved error handling within `search_studies` for API and network issues. - Adjusted pagination logic in the response when `limit` is 0 for better clarity. - **Proactively fixed `get_molecular_profiles`**: - Modified `get_molecular_profiles` to be an `async def` function. - Ensured the call to `self._make_api_request(...)` within `get_molecular_profiles` is properly `await`ed. - **Impact**: These changes resolve runtime errors when these functions are called, allowing them to correctly interact with the asynchronous `_make_api_request` method. This enhances the stability and reliability of the MCP server when performing these operations. ### 2025-05-10 (10:23) #### Test Code Refinement - Refactored `test_get_cancer_studies_pagination` in `tests/test_pagination.py` to use `pytest.mark.parametrize`. This consolidates multiple test scenarios into a single, data-driven test function, improving conciseness and maintainability. #### Server Core & Lifecycle Management - Corrected `CBioPortalMCPServer.__init__` to defer `httpx.AsyncClient` initialization to the `startup` method, preventing redundant client creation. - Ensured `startup` and `shutdown` methods are correctly assigned to `FastMCP` instance's `on_startup` and `on_shutdown` lifecycle hooks. - Added tests to `tests/test_cbioportal_server.py` to verify: - Correct registration of `startup` and `shutdown` lifecycle hooks. - `startup` method initializes `httpx.AsyncClient` as expected. - `shutdown` method calls `aclose()` on the active client. - Refined `main()` function in `cbioportal_server.py` for clarity and robust `stdio` transport handling, removing prior experimental WebSocket code and addressing associated lint errors. - Fixed various import-related lint issues in `cbioportal_server.py`. #### Test Suite Modernization - **Initiated `pytest` Migration**: - Added `pytest-asyncio` to project dependencies (`requirements.txt`) to support asynchronous testing with `pytest`. - Successfully converted `tests/test_cbioportal_server.py` from `unittest` to `pytest` style: - Replaced `unittest.TestCase` structure with standalone test functions. - Converted `setUp` method and mock data initialization to `pytest` fixtures (`mock_study_data`, `mock_gene_data`, `cbioportal_server_instance`). - Updated assertion syntax from `self.assertEqual` etc., to standard Python `assert`. - Adapted asynchronous tests using `@pytest.mark.asyncio` and `await`, removing the need for `asyncio.run()` within tests. - Configured `pytest.ini` to include `asyncio_mode = auto` and `asyncio_default_fixture_loop_scope = function` to address `pytest-asyncio` deprecation warnings and improve integration. - **Impact**: This modernization effort aims to make the test suite more readable, maintainable, and aligned with current Python testing best practices. The `tests/test_cbioportal_server.py` file now fully utilizes `pytest` features. ### 2025-05-10 (11:00) #### Server Lifecycle and Tool Registration Enhancements - **Server Core Refinements (`cbioportal_server.py`)**: - Refactored `CBioPortalMCPServer.__init__` to correctly initialize `httpx.AsyncClient` within the `startup` method, preventing client initialization issues. - Ensured `startup` and `shutdown` async methods are properly registered as lifecycle hooks directly on the `FastMCP` instance (using `mcp.on_startup` and `mcp.on_shutdown`). - Removed the redundant `run()` method from `CBioPortalMCPServer` as its functionality is managed by `FastMCP`. - Simplified the `main()` function for improved clarity and robust `stdio` transport handling. - **Test Suite Enhancements (`tests/test_cbioportal_server.py`)**: - Added comprehensive tests for server lifecycle management: - `test_lifecycle_hooks_registered`: Verifies correct registration of startup/shutdown hooks. - `test_tool_registration`: Confirms that all intended public API methods are registered as MCP tools and that the list of registered tools is accurate. - Implemented `test_tool_registration` to dynamically verify that all intended public API methods are registered as MCP tools and that the list of registered tools is accurate. - Iteratively debugged tool fetching logic, adapting to the correct `FastMCP.get_tools()` method. - Corrected test logic to handle `mcp.get_tools()` returning a list of tool name strings directly. - Synchronized the `expected_tools` set with the actual public methods in `CBioPortalMCPServer`, removing entries for non-existent methods (`get_clinical_attributes_in_study`, `get_all_clinical_attributes`) to ensure test accuracy. - **Impact**: These changes significantly improve the robustness and maintainability of the server's core lifecycle management and tool registration processes. The enhanced test suite now provides strong guarantees for these critical functionalities, ensuring all exposed tools are correctly registered and lifecycle events are handled as expected. #### Pagination Test Suite Enhancements (11:08) - **Granular Tests for `paginate_results` (`tests/test_pagination.py`)**: - Added comprehensive unit tests for the `paginate_results` helper method in `cbioportal_server.py` to cover various scenarios: - `test_paginate_results_basic`: Verifies basic multi-page iteration. - `test_paginate_results_empty_first_call`: Ensures correct handling when the API returns no data on the first call. - `test_paginate_results_with_max_pages`: Confirms that the `max_pages` parameter correctly limits fetched pages. - `test_paginate_results_last_page_partial`: Checks correct termination when the API's last page has fewer items than `page_size`. - **Bug Fix in `paginate_results` (`cbioportal_server.py`)**: - Corrected an issue where `unittest.mock.call` would not accurately capture `params` for `_make_api_request` due to mutable dictionary references. - Fixed by passing `request_params.copy()` to `_make_api_request`, ensuring each call receives a distinct snapshot of parameters. - **Impact**: Improved test coverage for the core pagination logic, leading to increased confidence in data retrieval. The bug fix ensures test accuracy for methods relying on `paginate_results`. #### Test Suite Cleanup (11:11) - **Removed Obsolete Skipped Tests (`tests/test_pagination.py`)**: - Deleted `test_get_all_clinical_attributes_pagination`: This test was skipped because the corresponding server method for fetching all global clinical attributes does not exist. - Deleted `test_get_genes_pagination`: This test was skipped because the server's `get_genes` method is designed for fetching specific gene IDs, not for paginated listing of all genes. - **Impact**: Streamlined the test suite by removing tests for non-existent or inapplicable functionality, improving clarity and focus. ### 2025-05-10 (11:30) #### Test Suite Enhancements for Concurrent Fetching - Added comprehensive tests for the `get_multiple_studies` method in `tests/test_cbioportal_server.py`: - `test_get_multiple_studies_success`: Verifies successful concurrent fetching of multiple studies. - `test_get_multiple_studies_partial_failure`: Ensures correct handling when some study API calls fail while others succeed. - `test_get_multiple_studies_empty_list`: Confirms correct behavior when an empty list of study IDs is provided. - Added comprehensive tests for the `get_multiple_genes` method in `tests/test_cbioportal_server.py`: - `test_get_multiple_genes_single_batch_success`: Verifies successful concurrent fetching for a small number of genes (single batch). - `test_get_multiple_genes_multiple_batches_success`: Validates correct batching logic and concurrent fetching for a larger number of genes requiring multiple API batches. - `test_get_multiple_genes_partial_batch_failure`: Ensures proper error reporting and data handling when some gene batches fail to fetch. - `test_get_multiple_genes_empty_list`: Confirms correct behavior and metadata reporting when an empty list of gene IDs is provided. - Corrected a minor bug in `get_multiple_genes` where the `total_requested` metadata field was missing when an empty gene list was supplied. - Impact: Significantly improved test coverage for concurrent API fetching capabilities, ensuring robustness and reliability of these features. Provides greater confidence in the server's ability to handle multiple asynchronous requests efficiently and gracefully manage partial failures. ### 2025-05-10 (13:10) #### Error Handling Test Suite Enhancements - **Modernized Error Handling Tests**: - Replaced the previous `test_error_handling` function with a more robust and maintainable parameterized test function: `test_generic_api_error_handling` in `tests/test_cbioportal_server.py`. - This new function systematically tests various API methods against different `httpx` exceptions (`HTTPStatusError` for 4xx/5xx errors and `RequestError` for network issues). - **Expanded Test Coverage for Error Handling**: - Added specific error handling test cases for the following methods, verifying correct error message propagation: - `get_study_details` (simulating 404 Not Found) - `get_cancer_studies` (simulating Request Timeout) - `get_molecular_profiles` (simulating 500 Internal Server Error) - `get_genes` (simulating Network Error, also corrected method name from `get_gene` to `get_genes` and updated arguments/error prefix) - `get_clinical_data` (simulating 403 Forbidden, and Network Error for fetching specific attributes) - **Refinement of Test Cases**: - Removed a previously problematic test case for `get_mutations_in_molecular_profile_for_sample_ids` as no direct corresponding server method was identified. - **Impact**: Enhanced the reliability and comprehensiveness of the test suite by ensuring API methods correctly handle and report various error conditions. This improves confidence in the server's stability and provides clearer diagnostics when issues arise. ### 2025-05-10 #### Lifecycle and Registration Tests - Implemented comprehensive tests for server lifecycle management ([startup](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:305:0-317:27), [shutdown](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:320:0-338:70)) and MCP tool registration: - Verified that [startup()](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:305:0-317:27) correctly initializes the `httpx.AsyncClient`. - Verified that [shutdown()](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:320:0-338:70) correctly closes an active `httpx.AsyncClient` and handles cases with no active client. - Confirmed that [startup](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:305:0-317:27) and [shutdown](cci:1://file:///Users/jeffkiefer/Documents/projects/cbioportal_MCP/tests/test_cbioportal_server.py:320:0-338:70) methods are properly registered as FastMCP lifecycle hooks (`on_startup`, `on_shutdown`). - Tested the `_register_tools()` method to ensure all intended public API methods are registered as MCP tools and that private/lifecycle methods are excluded. - Ensured server initialization sets up the base URL and MCP instance correctly. - Impact: Increased confidence in the server's core operational integrity, ensuring proper resource management and correct exposure of its capabilities through FastMCP. #### Test Suite: Server Lifecycle and Tool Registration - **Objective**: Ensure robust testing for server startup/shutdown, HTTP client management, and dynamic MCP tool registration. - **Completed Tests in `tests/test_server_lifecycle.py`**: - `test_server_lifecycle_hooks_registered`: Verified that `startup` and `shutdown` methods are correctly registered as FastMCP lifecycle hooks. - `test_server_startup_initializes_client_and_logs`: Confirmed that the `startup` method initializes the `httpx.AsyncClient` and logs the action. - `test_server_shutdown_closes_client_and_logs`: Ensured the `shutdown` method closes the `httpx.AsyncClient` and logs the action. - `test_server_shutdown_no_client`: Verified graceful handling when `shutdown` is called without an active client. - `test_register_tools_adds_public_methods`: Confirmed that public methods of `CBioPortalMCPServer` (excluding private/special methods) are correctly registered as tools with the FastMCP instance. This involved debugging and correctly mocking the interaction with `FastMCP.get_tools()`. - **Impact**: Strengthened the test suite by providing comprehensive coverage for essential server operational aspects and the correct functioning of the MCP tool discovery mechanism. This ensures reliability and makes future refactoring of server initialization or tool exposure safer. #### Test Suite Refactoring: Lifecycle Tests - **Objective**: Improve test suite organization and maintainability. - **Refactoring**: - Created a new test file: `tests/test_server_lifecycle.py`. - Moved the following tests from `tests/test_cbioportal_server.py` to `tests/test_server_lifecycle.py`: - `test_lifecycle_hooks_registered` - `test_tool_registration` - `test_server_startup_initializes_async_client` - `test_server_shutdown_closes_async_client` - `test_server_shutdown_handles_no_client` - `test_initialization` - Removed older, redundant lifecycle tests (`test_startup_initializes_client`, `test_shutdown_closes_client`) from `tests/test_cbioportal_server.py`. - **Fixes**: - Corrected `AttributeError` in `test_tool_registration` and `test_initialization` within `tests/test_server_lifecycle.py` by ensuring `server.mcp.get_tools()` (which returns a list of strings) is correctly processed into a set of tool names. - Addressed lint errors related to unused imports in both `tests/test_cbioportal_server.py` and `tests/test_server_lifecycle.py` after moving tests. - **Verification**: All 33 tests in the suite are passing after these changes. - **Impact**: This change enhances the structure of the test suite, making it easier to navigate and maintain tests related to server lifecycle and registration. #### Test Suite Refactoring: Multi-Entity API Tests & Bug Fixes - **Objective**: Continue improving test suite organization and address bugs uncovered during refactoring. - **Refactoring**: - Created `tests/test_multiple_entity_apis.py`. - Moved all `test_get_multiple_*` tests (7 tests) from `tests/test_cbioportal_server.py` to `tests/test_multiple_entity_apis.py`. - `tests/test_cbioportal_server.py` is now significantly streamlined, primarily containing setup code. - **Bug Fixes in `cbioportal_server.py`**: - `get_multiple_studies`: Ensured metadata includes `"concurrent": True` even for empty input lists, fixing a `KeyError`. - `get_multiple_genes`: - Corrected dictionary keying to use `hugoGeneSymbol` or `entrezGeneId` based on the `gene_id_type` request parameter, resolving `KeyError` and count discrepancies. - Refined `execution_time` calculation to accurately reflect the entire method's duration. - **Test Fixes in `tests/test_multiple_entity_apis.py`**: - `test_get_multiple_genes_multiple_batches_success`: Changed `mock_make_api_request.side_effect` from an `async def` function to a list of mock responses (`[mock_batch_1_response, mock_batch_2_response]`). This resolved an `AssertionError` where only one batch of results was being recognized. - **Linting**: - Addressed `E402` (module level import not at top of file) in `tests/test_multiple_entity_apis.py` with `# noqa: E402` for a necessary import order. - **Verification**: All 33 tests in the suite are passing. - **Impact**: Further organizes the test suite by grouping related tests. Critical bugs in multi-entity fetch methods were identified and fixed, improving data integrity and reliability. The test suite itself is more robust due to mocking improvements. #### Input Validation Tests and Implementation (15:43) - **Objective**: Enhance server robustness by implementing comprehensive input validation for API methods. - **New Tests Added (`tests/test_input_validation.py`)**: - Implemented and verified tests for the following methods to cover various invalid inputs (e.g., negative numbers, empty strings, incorrect types): - `get_cancer_studies` (page_number, page_size) - `get_study_details` (study_id) - `get_cancer_types` (page_number, page_size) - `get_molecular_profiles` (study_id, page_number, page_size) - `get_samples_in_study` (study_id, page_number, page_size) - `search_genes` (keyword, page_number, page_size) - `search_studies` (keyword, page_number, page_size) - **Server-Side Validation Logic (`cbioportal_server.py`)**: - Added input validation logic to the corresponding server methods to raise `TypeError` or `ValueError` for invalid inputs before any external API calls are made. - **Verification**: All 42 input validation tests are passing. - **Impact**: Improved the reliability and error handling of the server by ensuring that API methods correctly validate their input parameters. ### 2025-05-10 (16:09) #### Gene Panel Functionality - **New Server Methods Added**: - Implemented `get_gene_panels_for_study`: Retrieves gene panels for a specified study with pagination and detailed projection by default. - Implemented `get_gene_panel_details`: Fetches detailed information for a specific gene panel by its ID, including the list of genes (using detailed projection by default). - **Snapshot Tests for Gene Panels**: - Added snapshot tests for `get_gene_panels_for_study`. - Added snapshot tests for `get_gene_panel_details`. - All 12 snapshot tests in `test_snapshot_responses.py` are now passing. - Impact: Enhanced the server's capability to provide gene panel information and ensured response consistency through snapshot testing. ### 2025-05-10 (16:30) #### Test Suite Refinement - Consolidated server lifecycle and tool registration tests into `tests/test_server_lifecycle.py`. - Enhanced lifecycle tests in `tests/test_server_lifecycle.py` to include checks for startup and shutdown logging. - Removed redundant test file `tests/test_cbioportal_server.py` as its tests are now covered in `tests/test_server_lifecycle.py` and `tests/test_multiple_entity_apis.py`. - Impact: Streamlined test structure, removed ~120 lines of redundant test code, and improved clarity of test organization. All 90 tests pass. - Added `get_gene_panels_for_study` and `get_gene_panel_details` to `test_tool_registration`. #### Next Steps - The cBioPortal MCP server now features full async support with significant performance improvements - Potential future enhancements: - Apply concurrent fetching to remaining collection methods that could benefit from parallelization - Implement more sophisticated error handling and retry mechanisms for network errors - Add configuration options for controlling concurrency limits and timeout settings - Develop more comprehensive benchmarking and performance monitoring tools - Consider caching frequently requested data to further improve performance ### 2025-05-10 (17:00) #### CLI Testing Enhancements - **Comprehensive CLI Tests (`tests/test_cli.py`)**: - Implemented and verified tests for the `main()` function in `cbioportal_server.py`. - Covered default and custom command-line argument parsing (including `--base-url`, `--transport`, `--log-level`). - Ensured correct server instantiation and logging setup based on CLI arguments. - Tested graceful handling of runtime errors occurring within `mcp.run()`. - Added tests to verify correct error reporting and exit for unsupported `--transport` arguments. - Implemented tests to confirm graceful shutdown on `KeyboardInterrupt` (Ctrl+C) during server operation. - **Verification**: All 5 CLI tests in `tests/test_cli.py` are passing. The full suite of 92 tests also passes, confirming no regressions. - **Impact**: Improved the robustness and reliability of the server's command-line interface and startup procedures. #### Next Steps - The cBioPortal MCP server now features full async support with significant performance improvements - Potential future enhancements: - Apply concurrent fetching to remaining collection methods that could benefit from parallelization - Implement more sophisticated error handling and retry mechanisms for network errors - Add configuration options for controlling concurrency limits and timeout settings - Develop more comprehensive benchmarking and performance monitoring tools - Consider caching frequently requested data to further improve performance #### Refactoring Planning (2024-05-19) - **Codebase Analysis**: - Conducted comprehensive analysis of the current `cbioportal_server.py` implementation - Identified key components and their interdependencies - Evaluated opportunities for improved code organization and maintainability - **Refactoring Strategy**: - Created detailed refactoring plan in `REFACTOR_PLAN.md`` - Designed new package structure following Python best practices - Outlined phased approach to minimize disruption - Added Graphiti memory integration for tracking refactoring progress - **Next Steps**: - Begin implementation of Phase 1 (Core Components) - Set up new package structure - Move API client and utility functions to dedicated modules - Update test suite to match new structure ### 2025-05-20 (09:17) #### Test Suite Refinements (APIClient Integration) - **`tests/test_server_lifecycle.py` Updates**: - Corrected logger mock setups to accurately capture log messages from both `APIClient` and `CBioPortalMCPServer` during startup and shutdown. - Updated assertions for log messages to align with the refactored server and API client behavior. - **`tests/test_snapshot_responses.py` Mocking Overhaul**: - Updated mock targets in `test_get_clinical_data_snapshot`, `test_get_clinical_data_specific_attributes_snapshot`, and `test_get_mutations_in_gene_snapshot` from the deprecated `_make_api_request` to the new `api_client.make_api_request`. - Refactored mocking strategy for paginated snapshot tests (`test_get_cancer_studies_snapshot`, `test_get_molecular_profiles_snapshot`, `test_get_cancer_types_snapshot`, `test_get_samples_in_study_snapshot`, `test_search_genes_snapshot`): - Changed mock target from internal `_paginate_results` to `api_client.make_api_request`. - Adjusted mock return values to provide the direct list of data items for the first page. - Removed unused `mock_response_with_pagination` helper variables. - **Impact & Current Status**: - These changes address test failures stemming from the `APIClient` refactoring. - Updated snapshot tests with `pytest --snapshot-update` to match the current mock data structure. - Fixed the `test_server_startup_initializes_async_client` test to correctly match the actual log message from `APIClient.startup()`. - All 92 tests in the test suite are now passing.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pickleton89/cbioportal-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server