Scalene-MCP

A FastMCP v2 server providing LLMs with structured access to Scalene's comprehensive CPU, GPU, and memory profiling capabilities for Python packages and C/C++ bindings.

Installation

Prerequisites

  • Python 3.10+

  • uv (recommended) or pip

From Source

git clone https://github.com/plasma-umass/scalene-mcp.git
cd scalene-mcp
uv venv
uv sync

As a Package

pip install scalene-mcp

Quick Start: Running the Server

Development Mode

# Using uv
uv run -m scalene_mcp.server

# Using pip
python -m scalene_mcp.server

Production Mode

python -m scalene_mcp.server

🎯 Native Integration with LLM Agents

Works seamlessly with:

  • GitHub Copilot - Direct integration

  • Claude Code - CLI and the Claude VSCode extension

  • Cursor - All-in-one IDE

  • Any MCP-compatible LLM client

Zero-Friction Setup (3 Steps)

  1. Install

    pip install scalene-mcp
  2. Configure - Choose one method:

    Automated (Recommended):

    python scripts/setup_vscode.py

    Interactive setup script auto-finds your editor and configures it.

    Manual - GitHub Copilot:

    // .vscode/settings.json
    {
      "github.copilot.chat.mcp.servers": {
        "scalene": {
          "command": "uv",
          "args": ["run", "-m", "scalene_mcp.server"]
        }
      }
    }

    Manual - Claude Code / Cursor: See editor-specific setup guides

  3. Restart VSCode/Cursor and start profiling!

Start Profiling Immediately

Open any Python project and ask your LLM:

"Profile main.py and show me the bottlenecks"

The LLM automatically:

  • 🔍 Detects your project structure

  • 📄 Finds and profiles your code

  • 📊 Analyzes CPU, memory, GPU usage

  • 💡 Suggests optimizations

No path wrangling. No manual configuration. Zero friction.

📚 Full docs: SETUP_VSCODE.md | QUICKSTART.md | TOOLS_REFERENCE.md

Available Serving Methods (FastMCP)

Scalene-MCP can be served in multiple ways using FastMCP's built-in serving capabilities:

1. Standard Server (Default)

# Starts an MCP-compatible server on stdio
python -m scalene_mcp.server

2. With Claude Desktop

Configure in your claude_desktop_config.json:

{
  "mcpServers": {
    "scalene": {
      "command": "python",
      "args": ["-m", "scalene_mcp.server"]
    }
  }
}

Then restart Claude Desktop.

3. With HTTP/SSE Endpoint

FastMCP also supports HTTP and SSE transports. See the FastMCP documentation for the current serving options and flags.

4. With Environment Variables

# Configure via environment
export SCALENE_PYTHON_EXECUTABLE=python3.11
export SCALENE_TIMEOUT=30
python -m scalene_mcp.server

5. Programmatically

# Create and run the server from your own code.
# create_scalene_server is illustrative; see scalene_mcp/server.py
# for the actual exported entry point.
from scalene_mcp.server import create_scalene_server

server = create_scalene_server()
server.run()

Programmatic Usage

Use Scalene-MCP directly in your Python code:

from scalene_mcp.profiler import ScaleneProfiler
import asyncio

async def main():
    profiler = ScaleneProfiler()
    
    # Profile a script
    result = await profiler.profile(
        type="script",
        script_path="fibonacci.py",
        include_memory=True,
        include_gpu=False
    )
    
    print(f"Profile ID: {result['profile_id']}")
    print(f"Peak memory: {result['summary'].get('total_memory_mb', 'N/A')}MB")
    
asyncio.run(main())

Overview

Scalene-MCP transforms Scalene's powerful profiling output into an LLM-friendly format through a clean, minimal set of well-designed tools. Get detailed performance insights without images or excessive context overhead.

What Scalene-MCP Does

  • Profile Python scripts with full Scalene feature set

  • Analyze profiles for hotspots, bottlenecks, memory leaks

  • Compare profiles to detect regressions

  • Pass arguments to profiled scripts

  • Structured output in JSON format for LLMs

  • Async execution for non-blocking profiling

What Scalene-MCP Doesn't Do

  • In-process profiling (Scalene.start()/stop()) - uses subprocess instead for isolation

  • Process attachment (--pid based profiling) - profiles scripts, not running processes

  • Single-function profiling - designed for complete script analysis

Note: The subprocess-based approach was chosen for reliability and simplicity. LLM workflows typically profile complete scripts, which is a perfect fit. See SCALENE_MODES_ANALYSIS.md for detailed scope analysis.
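The subprocess approach boils down to assembling a Scalene command line, running it, and reading the JSON it writes. A minimal sketch of the command construction; the helper is illustrative and the flag spellings are assumptions based on Scalene's CLI:

```python
def build_scalene_command(script_path, include_memory=True, include_gpu=False,
                          script_args=None, outfile="profile.json"):
    """Assemble an argv list for a Scalene subprocess run (illustrative helper)."""
    cmd = ["python", "-m", "scalene", "--json", "--outfile", outfile]
    if not include_memory:
        cmd.append("--cpu")  # restrict to CPU-only profiling
    if include_gpu:
        cmd.append("--gpu")
    cmd.append(script_path)
    cmd.extend(script_args or [])  # forwarded verbatim to the profiled script
    return cmd

print(build_scalene_command("fibonacci.py", script_args=["--n", "30"]))
```

The returned list can be handed to `subprocess.run` (or an asyncio subprocess) without shell quoting concerns.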

Key Features

  • Complete CPU profiling: Line-by-line Python/C time, system time, CPU utilization

  • Memory profiling: Peak/average memory per line, leak detection with velocity metrics

  • GPU profiling: NVIDIA and Apple GPU support with per-line attribution

  • Advanced analysis: Stack traces, bottleneck identification, performance recommendations

  • Profile comparison: Track performance changes across runs

  • LLM-optimized: Structured JSON output, summaries before details, context-aware formatting

Available Tools (7 Consolidated Tools)

Scalene-MCP provides a clean, LLM-optimized set of 7 tools:

Discovery (3 tools)

  • get_project_root() - Auto-detect project structure

  • list_project_files(pattern, max_depth) - Find files by glob pattern

  • set_project_context(project_root) - Override auto-detection
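A tool like get_project_root() typically walks upward from the working directory until it finds a project marker. A self-contained sketch of that idea (marker list and function name are assumptions, not the server's actual implementation):

```python
import tempfile
from pathlib import Path

MARKERS = ("pyproject.toml", "setup.py", ".git")

def find_project_root(start):
    """Walk upward until a directory containing a project marker is found."""
    path = Path(start).resolve()
    for candidate in (path, *path.parents):
        if any((candidate / m).exists() for m in MARKERS):
            return candidate
    return path  # fall back to the starting directory

# Demo in a throwaway tree: root/pyproject.toml, root/src/pkg/
root = Path(tempfile.mkdtemp()).resolve()
(root / "pyproject.toml").touch()
nested = root / "src" / "pkg"
nested.mkdir(parents=True)
print(find_project_root(nested) == root)
```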

Profiling (1 unified tool)

  • profile(type, script_path/code, ...) - Profile scripts or code snippets

    • type="script" for script profiling

    • type="code" for code snippet profiling

Analysis (1 mega tool)

  • analyze(profile_id, metric_type, ...) - 9 analysis modes in one tool:

    • metric_type="all" - Comprehensive analysis

    • metric_type="cpu" - CPU hotspots

    • metric_type="memory" - Memory hotspots

    • metric_type="gpu" - GPU hotspots

    • metric_type="bottlenecks" - Performance bottlenecks

    • metric_type="leaks" - Memory leak detection

    • metric_type="file" - File-level metrics

    • metric_type="functions" - Function-level metrics

    • metric_type="recommendations" - Optimization suggestions
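A single tool with a metric_type switch is naturally implemented as a dispatch table. A minimal sketch with two hypothetical handlers (the real analyzer registers all nine modes):

```python
def cpu_hotspots(profile):
    # Hypothetical handler: top lines by CPU share
    return sorted(profile["lines"], key=lambda l: l["cpu_percent"], reverse=True)[:3]

def memory_hotspots(profile):
    # Hypothetical handler: top lines by memory use
    return sorted(profile["lines"], key=lambda l: l["memory_mb"], reverse=True)[:3]

ANALYZERS = {"cpu": cpu_hotspots, "memory": memory_hotspots}

def analyze(profile, metric_type="all"):
    """Route one metric_type to its handler; 'all' runs every registered mode."""
    if metric_type == "all":
        return {name: fn(profile) for name, fn in ANALYZERS.items()}
    return {metric_type: ANALYZERS[metric_type](profile)}

profile = {"lines": [
    {"lineno": 3, "cpu_percent": 80.0, "memory_mb": 1.0},
    {"lineno": 7, "cpu_percent": 15.0, "memory_mb": 64.0},
]}
print(analyze(profile, "cpu"))
```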

Comparison & Storage (2 tools)

  • compare_profiles(before_id, after_id) - Compare two profiles

  • list_profiles() - View all captured profiles

Full reference: See TOOLS_REFERENCE.md
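Profile comparison reduces to computing per-metric deltas between two summaries. A sketch of the idea (the field names here are illustrative, not the comparator's exact schema):

```python
def compare_profiles(before, after):
    """Per-metric deltas between two profile summaries (illustrative shape)."""
    return {key: round(after.get(key, 0.0) - before[key], 3) for key in before}

before = {"total_cpu_seconds": 4.2, "total_memory_mb": 150.0}
after = {"total_cpu_seconds": 6.1, "total_memory_mb": 140.0}
print(compare_profiles(before, after))
```

A positive CPU delta flags a regression; a negative memory delta is an improvement.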

Configuration

Profiling Options

The unified profile() tool supports these options:

Option                   Type    Default    Description
type                     str     required   "script" or "code"
script_path              str     None       Required if type="script"
code                     str     None       Required if type="code"
include_memory           bool    true       Profile memory
include_gpu              bool    false      Profile GPU usage
cpu_only                 bool    false      Skip memory/GPU profiling
reduced_profile          bool    false      Only report high-activity lines
cpu_percent_threshold    float   1.0        Minimum CPU% to report
malloc_threshold         int     100        Minimum allocation size (bytes)
profile_only             str     ""         Profile only paths containing this
profile_exclude          str     ""         Exclude paths containing this
use_virtual_time         bool    false      Use virtual time instead of wall time
script_args              list    []         Command-line arguments for the script
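These options map onto Scalene CLI flags roughly one-to-one. A sketch of that translation (flag spellings are assumptions based on Scalene's CLI; defaults are skipped so the command stays short):

```python
def options_to_flags(reduced_profile=False, cpu_percent_threshold=1.0,
                     malloc_threshold=100, profile_only="", use_virtual_time=False):
    """Translate non-default profile() options into Scalene CLI flags."""
    flags = []
    if reduced_profile:
        flags.append("--reduced-profile")
    if cpu_percent_threshold != 1.0:
        flags += ["--cpu-percent-threshold", str(cpu_percent_threshold)]
    if malloc_threshold != 100:
        flags += ["--malloc-threshold", str(malloc_threshold)]
    if profile_only:
        flags += ["--profile-only", profile_only]
    if use_virtual_time:
        flags.append("--use-virtual-time")
    return flags

print(options_to_flags(reduced_profile=True, cpu_percent_threshold=5.0))
```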

Environment Variables

  • SCALENE_CPU_PERCENT_THRESHOLD: Override default CPU threshold

  • SCALENE_MALLOC_THRESHOLD: Override default malloc threshold
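Reading these overrides is a one-liner per variable, falling back to the documented defaults when unset. A self-contained sketch:

```python
import os

def threshold_overrides():
    """Read optional env overrides, falling back to documented defaults."""
    return {
        "cpu_percent_threshold": float(os.environ.get("SCALENE_CPU_PERCENT_THRESHOLD", "1.0")),
        "malloc_threshold": int(os.environ.get("SCALENE_MALLOC_THRESHOLD", "100")),
    }

os.environ["SCALENE_CPU_PERCENT_THRESHOLD"] = "5.0"  # simulate an override
print(threshold_overrides())
```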

Architecture

Components

  • ScaleneProfiler: Async wrapper around Scalene CLI

  • ProfileParser: Converts Scalene JSON to structured models

  • ProfileAnalyzer: Extracts insights and hotspots

  • ProfileComparator: Compares profiles for regressions

  • FastMCP Server: Exposes tools via MCP protocol

Data Flow

Python Script
    ↓
ScaleneProfiler (subprocess)
    ↓
Scalene CLI (--json)
    ↓
Temp JSON File
    ↓
ProfileParser
    ↓
Pydantic Models (ProfileResult)
    ↓
Analyzer / Comparator
    ↓
MCP Tools
    ↓
LLM Client
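The ProfileParser step in the flow above can be sketched against a trimmed, assumed slice of Scalene's `--json` schema (field names like `n_cpu_percent_python` follow Scalene's output, but the exact shape here is simplified for illustration):

```python
import json

# A trimmed, assumed shape of Scalene's --json output.
raw = json.loads("""
{
  "files": {
    "fibonacci.py": {
      "lines": [
        {"lineno": 2, "n_cpu_percent_python": 72.5, "n_malloc_mb": 0.1},
        {"lineno": 5, "n_cpu_percent_python": 12.0, "n_malloc_mb": 48.0}
      ]
    }
  }
}
""")

def summarize(profile):
    """Collapse per-line records into a file-level summary dict."""
    summary = {}
    for path, data in profile["files"].items():
        lines = data["lines"]
        summary[path] = {
            "hottest_line": max(lines, key=lambda l: l["n_cpu_percent_python"])["lineno"],
            "total_malloc_mb": sum(l["n_malloc_mb"] for l in lines),
        }
    return summary

print(summarize(raw))
```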

Troubleshooting

GPU Permission Error

If you see PermissionError when profiling with GPU:

# Disable GPU profiling in test environments
result = await profiler.profile(
    type="script",
    script_path="script.py",
    include_gpu=False
)

Profile Not Found

Profiles are stored in memory during the server session. For persistence, implement the storage interface.
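A persistent backend can be as small as one JSON file per profile ID. A minimal sketch of such a store (class and method names are illustrative, not the server's storage interface):

```python
import json
import tempfile
from pathlib import Path

class FileProfileStorage:
    """Minimal disk-backed store: one JSON file per profile_id."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, profile_id, profile):
        (self.root / f"{profile_id}.json").write_text(json.dumps(profile))

    def load(self, profile_id):
        return json.loads((self.root / f"{profile_id}.json").read_text())

store = FileProfileStorage(tempfile.mkdtemp())
store.save("abc123", {"summary": {"total_memory_mb": 42}})
print(store.load("abc123"))
```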

Timeout Issues

Adjust the timeout parameter (if using the profiler directly):

result = await profiler.profile(
    type="script",
    script_path="slow_script.py",
    timeout=300  # seconds; illustrative value, raise for long-running scripts
)

Development

Running Tests

# All tests with coverage
uv run pytest -v --cov=src/scalene_mcp

# Specific test file
uv run pytest tests/test_profiler.py -v

# With coverage report
uv run pytest --cov=src/scalene_mcp --cov-report=html

Code Quality

# Type checking
uv run mypy src/

# Linting
uv run ruff check src/

# Formatting
uv run ruff format src/

Contributing

Contributions are welcome! Please:

  1. Fork the repository

  2. Create a feature branch

  3. Add tests for new functionality

  4. Ensure all tests pass and coverage ≥ 85%

  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Citation

If you use Scalene-MCP in research, please cite both this project and Scalene:

@software{scalene_mcp,
  title={Scalene-MCP: LLM-Friendly Profiling Server},
  year={2026}
}

@inproceedings{berger2020scalene,
  title={Scalene: Scripting-Language Aware Profiling for Python},
  author={Berger, Emery},
  year={2020}
}

Support

  • Issues: GitHub Issues for bug reports and feature requests

  • Discussions: GitHub Discussions for questions and ideas

  • Documentation: See docs/ directory


Made with ❤️ for the Python performance community.

Manual Installation

pip install -e .

Development

Prerequisites

  • Python 3.10+

  • uv (recommended) or pip

Setup

# Install dependencies
uv sync

# Run tests
just test

# Run tests with coverage
just test-cov

# Lint and format
just lint
just format

# Type check
just typecheck

# Full build (sync + lint + typecheck + test)
just build

Project Structure

scalene-mcp/
├── src/scalene_mcp/     # Main package
│   ├── server.py        # FastMCP server with tools/resources/prompts
│   ├── models.py        # Pydantic data models
│   ├── profiler.py      # Scalene execution wrapper
│   ├── parser.py        # JSON output parser
│   ├── analyzer.py      # Analysis engine
│   ├── comparator.py    # Profile comparison
│   ├── recommender.py   # Optimization recommendations
│   ├── storage.py       # Profile persistence
│   └── utils.py         # Shared utilities
├── tests/               # Test suite (100% coverage goal)
│   ├── fixtures/        # Test data
│   │   ├── profiles/    # Sample profile outputs
│   │   └── scripts/     # Test Python scripts
│   └── conftest.py      # Shared test fixtures
├── examples/            # Usage examples
├── docs/                # Documentation
├── pyproject.toml       # Project configuration
├── justfile             # Task runner commands
└── README.md            # This file

Usage

Running the Server

# Development mode with auto-reload
fastmcp dev src/scalene_mcp/server.py

# Production mode
fastmcp run src/scalene_mcp/server.py

# Install to MCP config
fastmcp install src/scalene_mcp/server.py

Example: Profile a Script

# Through MCP client
result = await client.call_tool(
    "profile",
    arguments={
        "type": "script",
        "script_path": "my_script.py",
        "include_memory": True,
        "include_gpu": False,
    }
)

Example: Analyze Results

# Get analysis and recommendations
analysis = await client.call_tool(
    "analyze",
    arguments={"profile_id": result["profile_id"], "metric_type": "all"}
)

Testing

The project maintains 100% test coverage with comprehensive test suites:

# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=src --cov-report=html

# Run specific test file
uv run pytest tests/test_server.py

# Run with verbose output
uv run pytest -v

Test fixtures include:

  • Sample profiling scripts (fibonacci, memory-intensive, leaky)

  • Realistic Scalene JSON outputs

  • Edge cases and error conditions

Code Quality

This project follows strict code quality standards:

  • Type Safety: 100% mypy strict mode compliance

  • Linting: ruff with comprehensive rules

  • Testing: 100% coverage requirement

  • Style: Sleek-modern documentation, minimal functional emoji usage

  • Patterns: FastMCP best practices throughout

Development Phases

Current Status: Phase 1.1 - Project Setup

Documentation

Editor setup guides and API/usage references live in the docs/ directory.

Development Roadmap

  1. Phase 1: Project Setup & Infrastructure ✓

  2. Phase 2: Core Data Models (In Progress)

  3. Phase 3: Profiler Integration

  4. Phase 4: Analysis & Insights

  5. Phase 5: Comparison Features

  6. Phase 6: Resources Implementation

  7. Phase 7: Prompts & Workflows

  8. Phase 8: Testing & Quality

  9. Phase 9: Documentation

  10. Phase 10: Polish & Release

See development-plan.md for detailed roadmap.

Contributing

Contributions are welcome! Please ensure:

  • All tests pass (just test)

  • Linting passes (just lint)

  • Type checking passes (just typecheck)

  • Code coverage remains at 100%

License

MIT License - see LICENSE file for details.
