# Architecture
Design and architecture of Scalene-MCP.
## Overview
Scalene-MCP is a FastMCP v2 server that provides LLMs with structured access to Scalene's profiling capabilities. The architecture is designed around clean separation of concerns with minimal coupling.
```
┌──────────────────────────────────────────────────────────────┐
│                      FastMCP MCP Server                      │
│  (Handles protocol, tool registration, streaming responses)  │
└──────────────────────────────┬───────────────────────────────┘
                               │
           ┌───────────────────┼───────────────────┐
           │                   │                   │
      ┌────▼────┐         ┌────▼────┐         ┌────▼────┐
      │ Profile │         │ Analyze │         │ Compare │
      │  Tool   │         │  Tool   │         │  Tool   │
      └────┬────┘         └────┬────┘         └────┬────┘
           │                   │                   │
           └───────────────────┼───────────────────┘
                               │
           ┌───────────────────┼───────────────────┐
           │                   │                   │
    ┌──────▼───────┐     ┌─────▼──────┐     ┌──────▼───────┐
    │   Profiler   │     │  Analyzer  │     │  Comparator  │
    │ (Subprocess) │     │ (Analysis) │     │ (Comparison) │
    └──────┬───────┘     └─────┬──────┘     └──────────────┘
           │                   │
           └───────────────────┤
                               │
           ┌───────────────────┼───────────────────┐
           │                   │                   │
       ┌───▼────┐       ┌──────▼───────┐      ┌────▼────┐
       │ Parser │       │    Models    │      │ Logging │
       │ (JSON) │       │ (Validation) │      │ (Rich)  │
       └───┬────┘       └──────┬───────┘      └─────────┘
           │                   │
           └───────────────────┼───────────────┐
                               │               │
                        ┌──────▼───────┐  ┌────▼────┐
                        │  Scalene CLI │  │  Temp   │
                        │ (subprocess) │  │  Files  │
                        └──────────────┘  └─────────┘
```
## Component Architecture
### 1. FastMCP Server (`server.py`)
**Responsibility**: Expose profiling capabilities through MCP tools
**Key Classes**:
- `FastMCP` instance configured with Scalene Profiler description
- Tool registration for all profiling operations
**Inputs**:
- MCP protocol messages from LLM clients
- Tool requests with parameters
**Outputs**:
- Structured tool responses
- Streaming messages via MCP protocol
**Dependencies**:
- FastMCP framework
- All profiler/analyzer/comparator modules
- Logging infrastructure
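For orientation, here is a minimal sketch of the server wiring, assuming FastMCP's decorator-based tool registration; the tool signature and the `profiler` instance shown here are illustrative, not the exact registered API:
```python
# Minimal sketch, not the actual server.py; the tool signature and
# profiler wiring are illustrative assumptions.
from fastmcp import FastMCP

from scalene_mcp.profiler import ScaleneProfiler  # assumed import path

mcp = FastMCP("Scalene Profiler")
profiler = ScaleneProfiler()

@mcp.tool()
async def profile_script(script_path: str, timeout: int = 300) -> dict:
    """Profile a Python script with Scalene and return structured metrics."""
    result = await profiler.profile_script(script_path, timeout=timeout)
    return result.model_dump()

if __name__ == "__main__":
    mcp.run()
```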
---
### 2. ScaleneProfiler (`profiler.py`)
**Responsibility**: Execute Scalene in a subprocess, manage its lifecycle, and return structured results
**Key Methods**:
- `profile_script()`: Profile a Python script
- `profile_code()`: Profile code snippet via temp file
**Execution Flow**:
1. Validate inputs (file exists, timeout reasonable)
2. Build Scalene command with all options
3. Create temp file for JSON output
4. Execute via `asyncio.create_subprocess_exec()`
5. Parse JSON output via ProfileParser
6. Clean up temp file in finally block
7. Return ProfileResult
**Key Details**:
- Uses `tempfile.NamedTemporaryFile(delete=False)` for JSON output
- Subprocess command: `["python", "-m", "scalene", "run", "--json", "--outfile", path, "--no-browser"]`
- Comprehensive error handling with logging at all levels
- Cleanup happens even on errors
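A condensed sketch of this flow, using the subprocess command shown above; error handling, logging, and option plumbing are elided, and `run_scalene` is an illustrative helper name:
```python
# Condensed sketch of the execution flow; error handling and
# logging are elided, and run_scalene is an illustrative name.
import asyncio
import tempfile
from pathlib import Path

async def run_scalene(script: str) -> str:
    tmp = tempfile.NamedTemporaryFile(suffix=".json", delete=False)
    output_path = Path(tmp.name)
    tmp.close()
    cmd = ["python", "-m", "scalene", "run", "--json",
           "--outfile", str(output_path), "--no-browser", script]
    try:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        await proc.communicate()  # timeout handling shown in Key Design Decisions
        return output_path.read_text()  # raw JSON, handed to ProfileParser
    finally:
        output_path.unlink(missing_ok=True)  # cleanup even on errors
```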
**Dependencies**:
- asyncio for subprocess
- tempfile for JSON output
- ProfileParser for JSON conversion
- Logging infrastructure
---
### 3. ProfileParser (`parser.py`)
**Responsibility**: Convert Scalene JSON to structured Pydantic models
**Key Methods**:
- `parse_file()`: Parse from file path
- `parse_json()`: Parse from JSON string
**Implementation Details**:
- `parse_json()` extracts JSON from mixed output using `find('{')` and `rfind('}')` (sketched below)
- Handles Scalene quirks:
- `scalene_args` comes as list → convert to dict
- Nested file paths → preserve structure
- Comprehensive error messages on parse failures
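A sketch of the `find('{')`/`rfind('}')` extraction described above; the helper name is illustrative:
```python
import json

def extract_json(mixed_output: str) -> dict:
    """Pull the outermost JSON object out of output that may also
    contain the profiled script's own stdout."""
    start = mixed_output.find("{")
    end = mixed_output.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in Scalene output")
    return json.loads(mixed_output[start : end + 1])
```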
**Output Model Hierarchy**:
```
ProfileResult
├── SummaryMetrics (overall stats)
├── CPUMetrics (if cpu=True)
├── MemoryMetrics (if memory=True)
├── GPUMetrics (if gpu=True)
└── errors: list[str]
```
**Dependencies**:
- Pydantic for validation
- JSON module for parsing
- Logging infrastructure
---
### 4. ProfileAnalyzer (`analyzer.py`)
**Responsibility**: Extract insights and identify problems from profiles
**Key Methods**:
- `get_hotspots()`: Find CPU/memory hot spots
- `get_bottlenecks()`: Identify performance issues
- `get_memory_leaks()`: Detect potential leaks
- `get_function_metrics()`: Aggregate by function
**Analysis Strategies**:
- **Hotspots**: Sort by time/memory, apply threshold (see the sketch after this list)
- **Bottlenecks**: Combine multiple metrics (CPU + memory + velocity)
- **Leaks**: Use allocation velocity (MB/s) + confidence scoring
- **Functions**: Aggregate line metrics to function level
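As a sketch of the hotspot strategy (sort, then threshold), operating on a flat `location -> percent` mapping rather than the real model fields:
```python
# Illustrative only: the real analyzer operates on LineMetrics models,
# not a flat dict of percentages.
def top_hotspots(
    line_pct: dict[str, float], threshold: float = 5.0, top_n: int = 10
) -> list[tuple[str, float]]:
    """Return locations whose CPU share exceeds `threshold` percent, hottest first."""
    ranked = sorted(line_pct.items(), key=lambda kv: kv[1], reverse=True)
    return [(loc, pct) for loc, pct in ranked if pct >= threshold][:top_n]
```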
**Output Models**:
- `Hotspot`: Ranked performance locations
- `Bottleneck`: Severity-ranked issues with recommendations
- `MemoryLeak`: Detected leaks with confidence
- `FunctionMetrics`: Aggregated function-level data
**Dependencies**:
- Profile models
- Statistical calculations
---
### 5. ProfileComparator (`comparator.py`)
**Responsibility**: Compare profiles to detect regressions and improvements
**Key Methods**:
- `compare()`: Compare two profiles with tolerance
**Comparison Logic**:
1. Match lines between profiles by file:line
2. Calculate percentage changes in metrics
3. Flag regressions (CPU/memory increase > tolerance)
4. Flag improvements (CPU/memory decrease > tolerance)
5. Mark unchanged lines
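The tolerance check at the heart of steps 3-5 reduces to a percentage-delta classification, roughly:
```python
# Rough sketch of steps 2-5; the real comparator works per metric
# and per line, with severity scoring on top.
def classify_change(before: float, after: float, tolerance_pct: float = 10.0) -> str:
    if before == 0:
        return "new" if after > 0 else "unchanged"
    delta_pct = (after - before) / before * 100
    if delta_pct > tolerance_pct:
        return "regression"
    if delta_pct < -tolerance_pct:
        return "improvement"
    return "unchanged"
```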
**Output**:
- `ComparisonResult` with summary and line-by-line changes
- Severity assessment for regressions
- Recommendations for improvements
**Dependencies**:
- Profile models
- Statistical calculations
---
### 6. Data Models (`models.py`)
**Responsibility**: Define and validate all data structures
**Model Hierarchy**:
```
ProfileResult (Top-level profile data)
├── SummaryMetrics
│   ├── elapsed_time_sec
│   ├── max_footprint_mb
│   └── ...
├── CPUMetrics
│   ├── total_python_time_sec
│   ├── total_c_time_sec
│   └── lines: dict[str, LineMetrics]
├── MemoryMetrics
│   ├── peak_mb
│   ├── average_mb
│   └── lines: dict[str, LineMemory]
└── GPUMetrics
    ├── total_gpu_time_sec
    └── lines: dict[str, LineGPU]

AnalysisResult
├── hotspots: list[Hotspot]
├── bottlenecks: list[Bottleneck]
├── recommendations: list[str]
└── summary: str

ComparisonResult
├── summary: ComparisonSummary
├── regressions: list[Regression]
├── improvements: list[Improvement]
└── unchanged: list[UnchangedLine]
```
**Key Features**:
- All models inherit from `BaseModel` (Pydantic)
- Full type validation
- Serializable to JSON
- Docstrings on all fields
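A hedged sketch of the model style, abbreviated from the hierarchy above (field sets here are incomplete):
```python
from pydantic import BaseModel, Field

class SummaryMetrics(BaseModel):
    """Overall profile statistics (abbreviated)."""
    elapsed_time_sec: float = Field(description="Wall-clock time of the profiled run")
    max_footprint_mb: float = Field(description="Peak memory footprint in MB")

class ProfileResult(BaseModel):
    """Top-level profile data (abbreviated)."""
    summary: SummaryMetrics
    errors: list[str] = []
```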
**Dependencies**:
- Pydantic for validation
- Python typing
---
### 7. Logging Infrastructure (`logging.py`)
**Responsibility**: Centralized, structured logging with rich formatting
**Key Functions**:
- `get_logger(name)`: Get logger with "scalene_mcp." namespace
- `configure_logging()`: Set up logging with rich formatting
**Features**:
- Rich formatting for readable console output
- Rich tracebacks for better error visibility
- Integration with FastMCP settings
- Respects `SCALENE_LOG_LEVEL` environment variable
**Usage**:
```python
from scalene_mcp.logging import get_logger
logger = get_logger(__name__)
logger.debug("Debug message")
logger.info("Info message")
logger.warning("Warning message")
logger.error("Error message")
```
**Dependencies**:
- Python logging module
- rich library for formatting
- FastMCP settings
---
## Data Flow
### Profiling Workflow
```
User Request (script path, options)
        ↓
[FastMCP Server]
        ↓
[ScaleneProfiler]
  ├─ Validate inputs
  ├─ Create temp file
  ├─ Execute Scalene subprocess
  │   └─ Collect stdout/stderr
  ├─ Read temp file
  └─ Clean up (finally block)
        ↓
[ProfileParser]
  ├─ Read/parse JSON
  ├─ Handle Scalene quirks
  └─ Validate with Pydantic
        ↓
[ProfileResult] - Structured data
        ↓
Returns to LLM via MCP protocol
```
### Analysis Workflow
```
[ProfileResult]
        ↓
[ProfileAnalyzer]
  ├─ Extract metrics
  ├─ Calculate percentages
  ├─ Apply thresholds
  └─ Generate recommendations
        ↓
[AnalysisResult] with hotspots, bottlenecks, leaks
        ↓
Format for LLM consumption
        ↓
Returns to LLM via MCP protocol
```
### Comparison Workflow
```
[ProfileResult 1] + [ProfileResult 2]
        ↓
[ProfileComparator]
  ├─ Match lines by location
  ├─ Calculate deltas
  ├─ Compare with tolerance
  └─ Classify changes
        ↓
[ComparisonResult] with regressions/improvements
        ↓
Format for LLM consumption
        ↓
Returns to LLM via MCP protocol
```
---
## Key Design Decisions
### 0. Subprocess-Based Profiling (Scope Decision)
**Decision**: Use subprocess-based profiling (`python -m scalene run`) instead of direct Scalene API
**Rationale**:
- **Isolation**: Script execution isolated from profiler state
- **Reliability**: No Scalene state conflicts or memory leaks
- **Simplicity**: Don't manage Scalene's internal state machine
- **Safety**: Script crashes/hangs don't crash profiler
- **Resource Cleanup**: Automatic process termination
- **LLM Workflows**: Subprocess approach fits "script profiling" use case
**What This Supports**:
- ✅ Profile standalone scripts
- ✅ Profile packages (via entry point)
- ✅ Pass command-line arguments
- ✅ Full profiling feature set
**What This Doesn't Support**:
- ❌ In-process profiling (`Scalene.start()`/`stop()`)
- ❌ Function-level profiling without subprocess
- ❌ Process attachment/PID-based profiling
**Trade-offs**:
- Subprocess overhead (~1-5% extra time per profile)
- Temporary file I/O (mitigated by async)
- But gains isolation, reliability, and simplicity
**Future**: Could add in-process profiling in v1.1+ if needed:
```python
# Hypothetical future method
async def profile_function(func, *args, **kwargs) -> ProfileResult:
    """Profile a function using direct Scalene API."""
    # Would use Scalene.start()/stop() directly
    # Lower latency but more complex error handling
    ...
```
**See Also**: [SCALENE_MODES_ANALYSIS.md](../SCALENE_MODES_ANALYSIS.md) for detailed coverage analysis
---
### 1. Async/Await for Subprocess
**Decision**: Use `asyncio.create_subprocess_exec()` instead of subprocess.run()
**Rationale**:
- Non-blocking execution allows multiple profiles in parallel
- Better integration with async MCP server
- Allows timeouts without blocking a thread
**Trade-offs**:
- More complex code than sync subprocess
- But worth it for production use
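A sketch of what the async choice buys, assuming a generic command list; `run_with_timeout` is an illustrative helper:
```python
import asyncio

async def run_with_timeout(cmd: list[str], timeout: float) -> bytes:
    proc = await asyncio.create_subprocess_exec(*cmd, stdout=asyncio.subprocess.PIPE)
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
        return stdout
    except asyncio.TimeoutError:
        proc.kill()  # terminate the hung script
        await proc.wait()  # reap the process before re-raising
        raise

# Two profiles can run concurrently without blocking the server:
# a, b = await asyncio.gather(run_with_timeout(cmd_a, 300), run_with_timeout(cmd_b, 300))
```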
---
### 2. Temporary Files for Scalene Output
**Decision**: Use temp files instead of stdout/stderr redirection
**Rationale**:
- Scalene's JSON output can be mixed with script output
- `/dev/stdout` and `/dev/stderr` don't work reliably
- Temp files provide clean separation
- Easy cleanup with error handling
**Trade-offs**:
- Slight I/O overhead (mitigated by cleanup in finally block)
- But reliability is worth it
**Implementation**:
```python
import tempfile
from pathlib import Path

# Create the file up front, then close it so Scalene can write to it
output_file = tempfile.NamedTemporaryFile(suffix=".json", delete=False)
output_path = Path(output_file.name)
output_file.close()
try:
    # Run Scalene with --outfile output_path, then parse what it wrote
    result = parse_json_from_file(output_path)
finally:
    output_path.unlink(missing_ok=True)  # Cleanup even on errors
```
---
### 3. Pydantic for Validation
**Decision**: Use Pydantic BaseModel for all data structures
**Rationale**:
- Automatic JSON serialization
- Type validation
- Clear API contracts
- IDE autocomplete support
**Trade-offs**:
- Slight overhead for validation
- But catches errors early
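For example, a malformed profile field fails loudly at parse time rather than deep inside analysis (a minimal illustration, not a real model):
```python
from pydantic import BaseModel, ValidationError

class SummaryMetrics(BaseModel):
    elapsed_time_sec: float

try:
    SummaryMetrics(elapsed_time_sec="not a number")
except ValidationError as exc:
    print(exc)  # names the offending field and the expected type
```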
---
### 4. In-Memory Profile Storage
**Decision**: Store profiles in memory (current session only)
**Rationale**:
- Simple implementation for now
- Suitable for interactive LLM sessions
- Easy to extend with persistent storage later
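In practice this amounts to a session-scoped dict, roughly as follows (names and import path are illustrative):
```python
# Illustrative session-scoped store; contents vanish when the
# server process exits.
from scalene_mcp.models import ProfileResult  # assumed import path

profiles: dict[str, ProfileResult] = {}

def store_profile(profile_id: str, result: ProfileResult) -> None:
    profiles[profile_id] = result

def get_profile(profile_id: str) -> ProfileResult | None:
    return profiles.get(profile_id)
```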
**Future**: Could add:
- SQLite for persistence
- Cloud storage (S3, etc.)
- Redis for distributed caching
---
### 5. Structured Logging
**Decision**: Use Python logging + rich formatting
**Rationale**:
- Standard Python approach
- Rich formatting for better visibility
- Easy to redirect/filter logs
- Integrates with FastMCP settings
**Benefits**:
- Debug issues without changing code
- Professional error messages for users
- Structured data for log aggregation
---
## Extension Points
### Adding a New Analysis Type
1. Add method to `ProfileAnalyzer`
2. Create corresponding Result model in `models.py`
3. Register tool in `server.py`
4. Add tests in `tests/test_analyzer.py`
**Example**:
```python
# In analyzer.py
def get_thread_contention(self, profile: ProfileResult) -> ThreadContention:
    """Analyze thread contention issues."""
    # Implementation
    pass

# In models.py
class ThreadContention(BaseModel):
    """Thread contention metrics."""
    # Fields
    pass

# In server.py
@server.tool()
async def analyze_threads(profile: ProfileResult) -> dict:
    """Analyze thread contention."""
    return analyzer.get_thread_contention(profile).model_dump()
```
---
### Adding a New Profiling Option
1. Add parameter to `ScaleneProfiler.profile_script()`
2. Add to Scalene command construction
3. Update parser if new JSON fields appear
4. Add parameter documentation
5. Update server tool descriptions
6. Add tests
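A hypothetical illustration of steps 1-2; `--cpu-sampling-rate` is used purely as an example flag, so verify it against `scalene --help` before relying on it:
```python
# Sketch of threading a new option into the command construction.
def build_command(script: str, outfile: str, cpu_sampling_rate: float | None = None) -> list[str]:
    cmd = ["python", "-m", "scalene", "run", "--json",
           "--outfile", outfile, "--no-browser"]
    if cpu_sampling_rate is not None:  # step 2: add the new option when set
        cmd += ["--cpu-sampling-rate", str(cpu_sampling_rate)]
    return cmd + [script]
```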
---
### Adding Persistent Storage
1. Create `Storage` interface/ABC
2. Implement concrete storage (SQLite, S3, etc.)
3. Inject into components
4. Add configuration options
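A sketch of step 1's interface; the method names and import path are assumptions:
```python
from abc import ABC, abstractmethod

from scalene_mcp.models import ProfileResult  # assumed import path

class Storage(ABC):
    """Abstract persistence layer; implement with SQLite, S3, etc."""

    @abstractmethod
    def save(self, profile_id: str, result: ProfileResult) -> None: ...

    @abstractmethod
    def load(self, profile_id: str) -> ProfileResult | None: ...
```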
---
## Testing Strategy
### Unit Tests
- `test_models.py`: Validate all data models
- `test_parser.py`: Test JSON parsing and edge cases
- `test_profiler.py`: Test subprocess execution
- `test_analyzer.py`: Test analysis algorithms
- `test_comparator.py`: Test comparison logic
### Integration Tests
- `test_server.py`: Test MCP tools end-to-end
- Mock the profiler to avoid slow real profiling runs
- Test error handling
- Test tool parameter validation
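A sketch of the mocking approach, assuming pytest-asyncio; the fake payload is illustrative:
```python
import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio
async def test_tool_with_mocked_profiler():
    # Replace the slow subprocess call with an instant fake result.
    profiler = AsyncMock()
    profiler.profile_script.return_value = {"elapsed_time_sec": 0.1, "errors": []}

    result = await profiler.profile_script("script.py")  # no real profiling run
    assert result["errors"] == []
    profiler.profile_script.assert_awaited_once_with("script.py")
```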
### Code Coverage
Target: 85%+ coverage
Currently: 85.62% overall
---
## Performance Characteristics
### Profiling Overhead
- CPU profiling: 2-5x slowdown
- Memory profiling: 5-10x slowdown
- GPU profiling: 2-3x slowdown
- Reduced profile: 1-2x slowdown
### Memory Usage
- Per profile: ~1-10MB depending on code size
- Server memory: ~100MB base + profile storage
### Latency
- Parse JSON: <100ms
- Analyze: <50ms
- Compare: <100ms
---
## Security Considerations
### Code Execution
⚠️ **WARNING**: Scalene-MCP runs arbitrary Python code via subprocess
**Security Implications**:
- Only use with trusted code
- Run in isolated environment (container, VM)
- Don't expose publicly without authentication
- Consider sandboxing
### File Access
- Temp files are readable by process owner
- No additional permissions needed
- Cleanup happens automatically
### Resource Limits
- Timeout prevents infinite loops
- Memory/CPU limits external (OS-level)
---
## Deployment Architecture
### Development Mode
```
┌─────────────────┐
│  Claude Desktop │
│    (Client)     │
└────────┬────────┘
         │ MCP
         │ Protocol
         │
┌────────▼────────┐
│   Scalene-MCP   │
│    (Server)     │
└────────┬────────┘
         │ subprocess
         │
┌────────▼────────┐
│   Scalene CLI   │
│   (Profiler)    │
└─────────────────┘
```
### Production Deployment (Future)
```
┌─────────────────────┐
│   LLM Application   │
│    (claude-sdk)     │
└──────────┬──────────┘
           │ MCP
           │ Protocol
           │
      ┌────▼──────┐
      │    MCP    │
      │   Router  │
      └─┬───────┬─┘
        │       │
  ┌─────▼─┐   ┌─▼─────┐
  │Scalene│   │ Other │
  │  MCP  │   │ MCPs  │
  │Server │   │  ...  │
  └───────┘   └───────┘
```
---
## Future Enhancements
### Short Term
- Persistent profile storage (SQLite)
- Profile export formats (CSV, HTML)
- Advanced filtering for hotspots
- Performance history tracking
### Medium Term
- Real-time profiling dashboard
- Automated regression detection
- Integration with CI/CD
- Performance benchmarking suite
### Long Term
- Distributed profiling
- ML-based anomaly detection
- Collaborative profiling
- Advanced visualization
---
## Glossary
| Term | Definition |
|------|-----------|
| **Hotspot** | Code location with high CPU/memory usage |
| **Bottleneck** | Performance issue with recommended fix |
| **Memory Leak** | Allocation velocity pattern suggesting leak |
| **Regression** | Performance degradation in later profile |
| **Improvement** | Performance optimization in later profile |
| **Velocity** | Allocation rate (MB/s) for leak detection |
| **Reduced Profile** | Output filtered to significant lines only |
| **Tolerance** | Percentage threshold for regression detection |
---
## References
- [Scalene GitHub](https://github.com/plasma-umass/scalene)
- [FastMCP Documentation](https://fastmcp.dev)
- [Python asyncio Documentation](https://docs.python.org/3/library/asyncio.html)
- [Pydantic Documentation](https://docs.pydantic.dev)
- [Rich Documentation](https://rich.readthedocs.io)