# Scalene-MCP
A FastMCP v2 server providing LLMs with structured access to [Scalene](https://github.com/plasma-umass/scalene)'s CPU, GPU, and memory profiling for Python code, including time spent in native (C/C++) extensions.
## Installation
### Prerequisites
- Python 3.10+
- uv (recommended) or pip
### From Source
```bash
git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync
```
### As a Package
```bash
pip install scalene-mcp
```
## Quick Start: Running the Server
### Development Mode
```bash
# Using uv
uv run -m scalene_mcp.server
# Using pip
python -m scalene_mcp.server
```
### Production Mode
```bash
python -m scalene_mcp.server
```
## Native Integration with LLM Agents
**Works seamlessly with:**
- ✅ **[GitHub Copilot](SETUP_GITHUB_COPILOT.md)** - Direct integration
- ✅ **[Claude Code](SETUP_CLAUDE.md)** - Claude Code and Claude VSCode extension
- ✅ **[Cursor](SETUP_CURSOR.md)** - All-in-one IDE
- ✅ **Any MCP-compatible LLM client**
### Zero-Friction Setup (3 Steps)
1. **Install**
```bash
pip install scalene-mcp
```
2. **Configure** - Choose one method:
**Automated (Recommended):**
```bash
python scripts/setup_vscode.py
```
The interactive setup script detects your editor and configures it automatically.
**Manual - GitHub Copilot:**
```json
// .vscode/settings.json
{
  "github.copilot.chat.mcp.servers": {
    "scalene": {
      "command": "uv",
      "args": ["run", "-m", "scalene_mcp.server"]
    }
  }
}
```
**Manual - Claude Code / Cursor:** See editor-specific setup guides
3. **Restart VSCode/Cursor** and start profiling!
### Start Profiling Immediately
Open any Python project and ask your LLM:
```
"Profile main.py and show me the bottlenecks"
```
The LLM automatically:
- Detects your project structure
- Finds and profiles your code
- Analyzes CPU, memory, and GPU usage
- Suggests optimizations
**No path wrangling. No manual configuration. Zero friction.**
**Editor-Specific Setup:**
- [GitHub Copilot Setup](SETUP_GITHUB_COPILOT.md)
- [Claude Code Setup](SETUP_CLAUDE.md)
- [Cursor Setup](SETUP_CURSOR.md)
**Full docs:** [SETUP_VSCODE.md](SETUP_VSCODE.md) | [QUICKSTART.md](QUICKSTART.md) | [TOOLS_REFERENCE.md](TOOLS_REFERENCE.md)
### Available Serving Methods (FastMCP)
Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:
#### 1. **Standard Server (Default)**
```bash
# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server
```
#### 2. **With Claude Desktop**
Configure in your `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}
```
Then restart Claude Desktop.
#### 3. **With HTTP/SSE Endpoint**
```bash
# Serve over SSE instead of stdio (exact flags depend on your FastMCP version;
# see the FastMCP documentation for HTTP serving)
fastmcp run src/scalene_mcp/server.py --transport sse --port 8000
```
#### 4. **With Environment Variables**
```bash
# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server
```
#### 5. **Programmatically**
```python
# Assumes the package exposes the factory below; check the actual API surface.
from scalene_mcp.server import create_scalene_server

server = create_scalene_server()
server.run()  # serves over stdio by default
```
## Programmatic Usage
Use Scalene-MCP directly in your Python code:
```python
from scalene_mcp.profiler import ScaleneProfiler
import asyncio

async def main():
    profiler = ScaleneProfiler()

    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False,
    )

    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')}MB")

asyncio.run(main())
```
## Overview
Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.
### What Scalene-MCP Does
- ✅ **Profile Python scripts** with the full Scalene feature set
- ✅ **Analyze profiles** for hotspots, bottlenecks, and memory leaks
- ✅ **Compare profiles** to detect regressions
- ✅ **Pass arguments** to profiled scripts
- ✅ **Structured output** in JSON format for LLMs
- ✅ **Async execution** for non-blocking profiling
### What Scalene-MCP Doesn't Do
- ❌ **In-process profiling** (`Scalene.start()`/`stop()`) - uses a subprocess instead, for isolation
- ❌ **Process attachment** (`--pid`-based profiling) - profiles scripts, not running processes
- ❌ **Single-function profiling** - designed for complete-script analysis
**Note**: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See [SCALENE_MODES_ANALYSIS.md](./SCALENE_MODES_ANALYSIS.md) for detailed scope analysis.
### Key Features
- **Complete CPU profiling**: Line-by-line Python/C time, system time, CPU utilization
- **Memory profiling**: Peak/average memory per line, leak detection with velocity metrics
- **GPU profiling**: NVIDIA and Apple GPU support with per-line attribution
- **Advanced analysis**: Stack traces, bottleneck identification, performance recommendations
- **Profile comparison**: Track performance changes across runs
- **LLM-optimized**: Structured JSON output, summaries before details, context-aware formatting
## Available Tools (7 Consolidated Tools)
Scalene-MCP provides a clean, LLM-optimized set of 7 tools:
### Discovery (3 tools)
- **get_project_root()** - Auto-detect project structure
- **list_project_files(pattern, max_depth)** - Find files by glob pattern
- **set_project_context(project_root)** - Override auto-detection
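To make the discovery semantics concrete, here is a stdlib-only sketch of what `list_project_files(pattern, max_depth)` could do (an illustration of the documented behavior, not the server's actual implementation):

```python
from pathlib import Path

def list_project_files(root: Path, pattern: str = "*.py", max_depth: int = 3) -> list[str]:
    """Glob under root, keeping matches at most max_depth path components deep."""
    matches = []
    for path in root.rglob(pattern):
        rel = path.relative_to(root)
        if len(rel.parts) <= max_depth:
            matches.append(str(rel))
    return sorted(matches)
```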
### Profiling (1 unified tool)
- **profile(type, script_path/code, ...)** - Profile scripts or code snippets
- `type="script"` for script profiling
- `type="code"` for code snippet profiling
### Analysis (1 mega tool)
- **analyze(profile_id, metric_type, ...)** - 9 analysis modes in one tool:
- `metric_type="all"` - Comprehensive analysis
- `metric_type="cpu"` - CPU hotspots
- `metric_type="memory"` - Memory hotspots
- `metric_type="gpu"` - GPU hotspots
- `metric_type="bottlenecks"` - Performance bottlenecks
- `metric_type="leaks"` - Memory leak detection
- `metric_type="file"` - File-level metrics
- `metric_type="functions"` - Function-level metrics
- `metric_type="recommendations"` - Optimization suggestions
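The single-entry-point design can be pictured as a dispatch on `metric_type` (mode names come from the list above; the body is a stub, not the real analyzer):

```python
# Mode names as documented above; the dispatch itself is illustrative.
ANALYSIS_MODES = (
    "all", "cpu", "memory", "gpu", "bottlenecks",
    "leaks", "file", "functions", "recommendations",
)

def analyze(profile_id: str, metric_type: str = "all") -> dict:
    if metric_type not in ANALYSIS_MODES:
        raise ValueError(f"unknown metric_type: {metric_type!r}")
    # A real implementation would route to the corresponding analyzer pass.
    return {"profile_id": profile_id, "metric_type": metric_type}
```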
### Comparison & Storage (2 tools)
- **compare_profiles(before_id, after_id)** - Compare two profiles
- **list_profiles()** - View all captured profiles
**Full reference**: See [TOOLS_REFERENCE.md](TOOLS_REFERENCE.md)
## Configuration
### Profiling Options
The unified `profile()` tool supports these options:
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `type` | str | required | "script" or "code" |
| `script_path` | str | None | Required if type="script" |
| `code` | str | None | Required if type="code" |
| `include_memory` | bool | true | Profile memory |
| `include_gpu` | bool | false | Profile GPU usage |
| `cpu_only` | bool | false | Skip memory/GPU profiling |
| `reduced_profile` | bool | false | Only report high-activity lines |
| `cpu_percent_threshold` | float | 1.0 | Minimum CPU% to report |
| `malloc_threshold` | int | 100 | Minimum allocation size (bytes) |
| `profile_only` | str | "" | Profile only paths containing this |
| `profile_exclude` | str | "" | Exclude paths containing this |
| `use_virtual_time` | bool | false | Use virtual time instead of wall time |
| `script_args` | list | [] | Command-line arguments for the script |
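For example, a run that narrows output to project code and high-activity lines might combine the options like this (the script name and its arguments are placeholders; option names match the table above):

```python
profile_args = {
    "type": "script",
    "script_path": "train.py",         # placeholder script
    "include_memory": True,
    "reduced_profile": True,           # report only high-activity lines
    "cpu_percent_threshold": 2.0,      # skip lines under 2% CPU
    "profile_only": "src/",            # only paths containing "src/"
    "script_args": ["--epochs", "1"],  # forwarded to the script's argv
}
# result = await profiler.profile(**profile_args)
```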
### Environment Variables
- `SCALENE_CPU_PERCENT_THRESHOLD`: Override default CPU threshold
- `SCALENE_MALLOC_THRESHOLD`: Override default malloc threshold
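The usual pattern for such overrides is environment-first with a documented fallback; a minimal sketch (the function names here are hypothetical, only the variable names come from the list above):

```python
import os

# Defaults mirror the profiling-options table; env vars win when set.
DEFAULT_CPU_PERCENT_THRESHOLD = 1.0
DEFAULT_MALLOC_THRESHOLD = 100

def cpu_percent_threshold() -> float:
    return float(os.environ.get("SCALENE_CPU_PERCENT_THRESHOLD", DEFAULT_CPU_PERCENT_THRESHOLD))

def malloc_threshold() -> int:
    return int(os.environ.get("SCALENE_MALLOC_THRESHOLD", DEFAULT_MALLOC_THRESHOLD))
```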
## Architecture
### Components
- **ScaleneProfiler**: Async wrapper around Scalene CLI
- **ProfileParser**: Converts Scalene JSON to structured models
- **ProfileAnalyzer**: Extracts insights and hotspots
- **ProfileComparator**: Compares profiles for regressions
- **FastMCP Server**: Exposes tools via MCP protocol
### Data Flow
```
Python Script
    ↓
ScaleneProfiler (subprocess)
    ↓
Scalene CLI (--json)
    ↓
Temp JSON File
    ↓
ProfileParser
    ↓
Pydantic Models (ProfileResult)
    ↓
Analyzer / Comparator
    ↓
MCP Tools
    ↓
LLM Client
```
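The "ScaleneProfiler → Scalene CLI" step boils down to building a `python -m scalene` command line for the subprocess; a sketch of that command construction (flags per Scalene's CLI: `--cli`, `--json`, `--outfile`):

```python
import sys

def scalene_command(script_path: str, outfile: str) -> list[str]:
    """Build the subprocess argv for one profiling run (JSON written to outfile)."""
    return [
        sys.executable, "-m", "scalene",
        "--cli",               # terminal/JSON output, no browser UI
        "--json",              # emit machine-readable profile data
        "--outfile", outfile,  # where the JSON profile lands
        script_path,
    ]
```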
## Troubleshooting
### GPU Permission Error
If you see `PermissionError` when profiling with GPU:
```python
# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False,
)
```
### Profile Not Found
Profiles are stored in memory during the server session. For persistence, implement the storage interface.
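One simple way to implement that interface is a directory of JSON files keyed by profile ID; a stdlib-only sketch (class and method names are hypothetical, not part of the shipped `storage.py`):

```python
import json
from pathlib import Path

class FileProfileStore:
    """Persist each profile as <root>/<profile_id>.json."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, profile_id: str, profile: dict) -> None:
        (self.root / f"{profile_id}.json").write_text(json.dumps(profile))

    def load(self, profile_id: str) -> dict:
        path = self.root / f"{profile_id}.json"
        if not path.exists():
            raise KeyError(profile_id)  # mirrors the "profile not found" case
        return json.loads(path.read_text())
```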
### Timeout Issues
Adjust the timeout parameter (if using profiler directly):
```python
# Timeout is in seconds (check the profiler API for the exact parameter name)
result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=120,
)
```
## Development
### Running Tests
```bash
# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp
# Specific test file
uv run pytest tests/test_profiler.py -v
# With coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html
```
### Code Quality
```bash
# Type checking
uv run mypy src/
# Linting
uv run ruff check src/
# Formatting
uv run ruff format src/
```
## Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass and coverage ≥ 85%
5. Submit a pull request
## License
MIT License - see LICENSE file for details.
## Citation
If you use Scalene-MCP in research, please cite both this project and Scalene:
```bibtex
@software{scalene_mcp,
  title = {Scalene-MCP: LLM-Friendly Profiling Server},
  year  = {2026}
}

@inproceedings{berger2020scalene,
  title  = {Scalene: Scripting-Language Aware Profiling for Python},
  author = {Berger, Emery},
  year   = {2020}
}
```
## Support
- **Issues**: GitHub Issues for bug reports and feature requests
- **Discussions**: GitHub Discussions for questions and ideas
- **Documentation**: See `docs/` directory
---
Made with ❤️ for the Python performance community.
### Manual Installation
```bash
pip install -e .
```
## Development Workflow
### Setup
```bash
# Install dependencies
uv sync
# Run tests
just test
# Run tests with coverage
just test-cov
# Lint and format
just lint
just format
# Type check
just typecheck
# Full build (sync + lint + typecheck + test)
just build
```
### Project Structure
```
scalene-mcp/
├── src/scalene_mcp/       # Main package
│   ├── server.py          # FastMCP server with tools/resources/prompts
│   ├── models.py          # Pydantic data models
│   ├── profiler.py        # Scalene execution wrapper
│   ├── parser.py          # JSON output parser
│   ├── analyzer.py        # Analysis engine
│   ├── comparator.py      # Profile comparison
│   ├── recommender.py     # Optimization recommendations
│   ├── storage.py         # Profile persistence
│   └── utils.py           # Shared utilities
├── tests/                 # Test suite (100% coverage goal)
│   ├── fixtures/          # Test data
│   │   ├── profiles/      # Sample profile outputs
│   │   └── scripts/       # Test Python scripts
│   └── conftest.py        # Shared test fixtures
├── examples/              # Usage examples
├── docs/                  # Documentation
├── pyproject.toml         # Project configuration
├── justfile               # Task runner commands
└── README.md              # This file
```
## Usage
### Running the Server
```bash
# Development mode (runs with the MCP Inspector)
fastmcp dev src/scalene_mcp/server.py
# Production mode
fastmcp run src/scalene_mcp/server.py
# Install to MCP config
fastmcp install src/scalene_mcp/server.py
```
### Example: Profile a Script
```python
# Through an MCP client (argument names per the profiling-options table)
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    },
)
```
### Example: Analyze Results
```python
# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"]},
)
```
## Testing
The project maintains 100% test coverage with comprehensive test suites:
```bash
# Run all tests
uv run pytest
# Run with coverage report
uv run pytest --cov=src --cov-report=html
# Run specific test file
uv run pytest tests/test_server.py
# Run with verbose output
uv run pytest -v
```
Test fixtures include:
- Sample profiling scripts (fibonacci, memory-intensive, leaky)
- Realistic Scalene JSON outputs
- Edge cases and error conditions
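A CPU-bound fixture like the fibonacci script can be as small as this (an illustrative version; the repository's actual fixture may differ):

```python
# fibonacci.py - deliberately slow recursive version, ideal for CPU profiling
def fib(n: int) -> int:
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    print(fib(25))
```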
## Code Quality
This project follows strict code quality standards:
- **Type Safety:** 100% mypy strict mode compliance
- **Linting:** ruff with comprehensive rules
- **Testing:** 100% coverage requirement
- **Style:** Sleek-modern documentation, minimal functional emoji usage
- **Patterns:** FastMCP best practices throughout
## Development Phases
Current Status: **Phase 1.1 - Project Setup** ✅
## Documentation
**Editor Setup Guides:**
- [GitHub Copilot Setup](SETUP_GITHUB_COPILOT.md) - Using Copilot Chat with VSCode
- [Claude Code Setup](SETUP_CLAUDE.md) - Using Claude Code VSCode extension
- [Cursor Setup](SETUP_CURSOR.md) - Using the Cursor IDE
- [General VSCode Setup](SETUP_VSCODE.md) - General VSCode configuration
**API & Usage:**
- [Tools Reference](TOOLS_REFERENCE.md) - Complete API documentation (7 tools)
- [Quick Start](QUICKSTART.md) - 3-step setup and basic workflows
- [Examples](examples/README.md) - Real-world profiling examples
## Development Roadmap
1. **Phase 1:** Project Setup & Infrastructure ✅
2. **Phase 2:** Core Data Models (In Progress)
3. **Phase 3:** Profiler Integration
4. **Phase 4:** Analysis & Insights
5. **Phase 5:** Comparison Features
6. **Phase 6:** Resources Implementation
7. **Phase 7:** Prompts & Workflows
8. **Phase 8:** Testing & Quality
9. **Phase 9:** Documentation
10. **Phase 10:** Polish & Release
See [development-plan.md](../development-plan.md) for detailed roadmap.
## Contributing
Contributions are welcome! Please ensure:
- All tests pass (`just test`)
- Linting passes (`just lint`)
- Type checking passes (`just typecheck`)
- Code coverage remains at 100%
## Links
- [Scalene Profiler](https://github.com/plasma-umass/scalene)
- [FastMCP Framework](https://github.com/jlowin/fastmcp)
- [Model Context Protocol](https://modelcontextprotocol.io/)