metaflow-mcp-server
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": false} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |
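As a rough illustration of how a client might read the capability table above: in MCP, `listChanged` signals whether the server will send a notification when a list changes, so `false` suggests a client can fetch the tool list once and reuse it. The helper name `may_cache_listing` is hypothetical and not part of this server or of any MCP SDK.

```python
# Capabilities copied from the table above.
CAPABILITIES = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def may_cache_listing(capabilities: dict, kind: str) -> bool:
    """Hypothetical helper: True if `kind` is offered and the server
    does not announce changes to its list (listChanged is false)."""
    section = capabilities.get(kind)
    if section is None:
        # Capability not advertised at all, so there is nothing to cache.
        return False
    return not section.get("listChanged", False)


if __name__ == "__main__":
    for kind in ("tools", "prompts", "resources"):
        print(kind, may_cache_listing(CAPABILITIES, kind))
```

Whether caching is actually appropriate is a client-side design choice; this sketch only reads the advertised flags.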
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| get_config | Show current Metaflow configuration. Returns the active metadata provider, datastore, namespace, and profile. Also returns the user's default namespace (e.g. "user:npow") -- pass this as the namespace parameter to list_flows/search_runs/get_latest_failure to scope results to only your own runs. Use this first to understand what backend you're connected to. |
| list_flows | List available Metaflow flows. Returns flow names visible in the given namespace. Use this to discover flows before searching for runs. Args: last_n: Max number of flows to return (default 50). offset: Number of flows to skip for pagination (default 0). namespace: Metaflow namespace to scope results (e.g. "user:npow"). Use get_config to find your default_namespace. If omitted, returns all flows visible globally. |
| search_runs | Find recent runs of a flow with optional filters. Args: flow_name: Name of the flow class (e.g. "MyFlow"). last_n: Max number of matching runs to return (default 5). status: Filter by status: "successful", "failed", or "running". created_after: ISO datetime -- only runs created after this time (e.g. "2024-01-15" or "2024-01-15T10:30:00"). created_before: ISO datetime -- only runs created before this time. tags: Only include runs that have all of these user tags. namespace: Metaflow namespace to scope results (e.g. "user:npow"). Use get_config to find your default_namespace. |
| get_run | Get detailed status of a run including per-step breakdown. By default returns every task in every step; for flows with large foreach fan-outs (hundreds/thousands of tasks per step) this payload can be enormous. Pass summary=True to get per-step counts plus only the failing tasks, which is usually what you want when diagnosing a failure. Args: pathspec: Run pathspec like "FlowName/RunID". summary: If True, return per-step task counts (total/successful/failed/running) and list only failing tasks. Recommended for foreach-heavy flows. Default False preserves full per-task detail. |
| get_task_logs | Get stdout/stderr logs for a specific task. Args: pathspec: Task pathspec like "FlowName/RunID/StepName/TaskID". stdout: Include stdout (default true). stderr: Include stderr (default true). tail: Return only the last N lines of each log. head: Return only the first N lines of each log (ignored if tail is set). pattern: Regex pattern -- return only lines matching this pattern. |
| list_artifacts | List all artifacts produced by a task (or the first task of a step). Returns artifact names, data types, and metadata. Use get_artifact to retrieve actual values. Args: pathspec: Task pathspec like "FlowName/RunID/StepName/TaskID", or step pathspec like "FlowName/RunID/StepName" (uses first task). |
| get_artifact | Get the value of a data artifact from a task. Args: pathspec: Task pathspec like "FlowName/RunID/StepName/TaskID". name: Artifact name (e.g. "model", "result"). |
| list_cards | List cards attached to a run, step, or task. Cards are visual reports (HTML) produced by Metaflow steps, often containing plots, tables, and metrics. Use this to discover available cards before retrieving them with get_card. For a run pathspec, scans all steps (first task per step). For a step pathspec, uses the first task. For a task pathspec, uses that exact task. Args: pathspec: Run ("FlowName/RunID"), step ("FlowName/RunID/StepName"), or task ("FlowName/RunID/StepName/TaskID") pathspec. card_type: Only list cards of this type (e.g. "default"). card_id: Only list cards with this ID. |
| get_card | Get a Metaflow card's content. Returns text content extracted from the card by default. Cards can be multi-megabyte HTML, so set include_html=True only when you actually need to save the card to a file and open it in a browser. Use list_cards first to discover available cards. Args: pathspec: Step ("FlowName/RunID/StepName") or task ("FlowName/RunID/StepName/TaskID") pathspec. card_index: Which card to retrieve if multiple exist (default 0). card_type: Filter cards by type before selecting by index. card_id: Filter cards by ID before selecting by index. include_html: If True, include the full raw HTML in the response. Default False; text_content is usually enough for analysis. |
| compare_cards | Compare Metaflow cards across multiple runs side by side. Creates an HTML comparison page and returns it. Save the html field to a local .html file and open it in your browser to view the cards side by side. Also returns text summaries of each card for analysis. Two ways to specify which cards to compare: Option A -- provide a list of step/task pathspecs directly: pathspecs=["MyFlow/100/validate", "MyFlow/101/validate"]. Option B -- provide flow_name + step_name + run_ids (shorthand): flow_name="MyFlow", step_name="validate", run_ids=["100", "101"]. Resolves each to "MyFlow/{run_id}/{step_name}" (first task). Args: pathspecs: List of step or task pathspecs to compare. flow_name: Flow name (used with step_name + run_ids). step_name: Step name (used with flow_name + run_ids). run_ids: List of run IDs to compare (used with flow_name + step_name). card_type: Filter cards by type before selecting. card_id: Filter cards by ID before selecting. card_index: Which card to use if multiple match (default 0). |
| get_latest_failure | Find failed runs and return error details. Scans recent runs, finds all failures, and returns the failing step/task with exception and stderr for each. Args: flow_name: Name of the flow. last_n_runs: How many recent runs to scan (default 20). namespace: Metaflow namespace to scope results (e.g. "user:npow"). Use get_config to find your default_namespace. |
| search_artifacts | Search for a named artifact across recent runs of a flow. Scans recent runs to find which tasks produced an artifact with the given name. Does not load artifact data. Use get_artifact to retrieve values. Note: for runs with many parallel tasks this may be slow. Use step_name to narrow the search. Args: flow_name: Name of the flow class. artifact_name: Name of the artifact to search for (e.g. "model", "accuracy"). last_n_runs: Number of recent runs to scan (default 5). step_name: Only search within this step (e.g. "train"). Recommended for large flows. |
| get_source_code | Get the source code from a Metaflow run's code package. Every Metaflow run that executes remotely stores a snapshot of the code. Use this to inspect the exact code that was used in a run. Without file_path, returns the main FlowSpec source file and lists all files in the code package. With file_path, returns the content of that specific file from the package. Args: pathspec: Run pathspec like "FlowName/RunID", or task pathspec like "FlowName/RunID/StepName/TaskID". file_path: Optional path of a specific file within the code package. If omitted, returns the main flow file and a listing of all files in the package. |
| diff_runs | Compare two Metaflow runs: source code, parameters, environment, and system metadata. Produces a structured diff showing what changed between two runs of the same flow. Useful for debugging regressions, understanding why a run succeeded when another failed, or auditing parameter/dependency changes. Args: source_pathspec: Run pathspec for the "before" run (e.g. "MyFlow/100"). target_pathspec: Run pathspec for the "after" run (e.g. "MyFlow/101"). |
| get_environment | Get the conda/pypi environment details for a Metaflow task or run. Returns the full list of packages installed, user-requested dependencies, and metadata (who resolved it, when, architecture, environment type). Works with both Netflix (nflx-metaflow) and OSS Metaflow installations. Args: pathspec: Run ("FlowName/RunID"), step ("FlowName/RunID/StepName"), or task ("FlowName/RunID/StepName/TaskID") pathspec. For run pathspecs, scans steps to find the first with an environment. package_type: Filter packages by type: "conda" or "pypi". If omitted, returns all. package_name: Filter packages by name (case-insensitive substring match). Use this to check if a specific package is installed and what version. E.g. "numpy" returns only packages with "numpy" in the name. max_packages: Max number of packages to return. If the environment has more, the list is truncated and packages_truncated=true is set. Useful for large environments (100+ packages). |
| get_recent_runs | Find the most recent runs across all flows in a namespace. Use this when no specific flow name is given and you need to find what the user ran recently. Scans all flows in the namespace and returns runs sorted by creation time (newest first). Args: namespace: Metaflow namespace to scope results (e.g. "user:npow"). Use get_config to find your default_namespace. last_n_flows: How many flows to scan (default 20). last_n_runs_per_flow: How many recent runs to check per flow (default 3). status: Filter by status: "successful", "failed", or "running". |
| add_run_tags | Add user tags to a Metaflow run. Tags are useful for marking runs (e.g. "production", "experiment-v2", "approved") and can be used to filter runs in search_runs. Args: pathspec: Run pathspec like "FlowName/RunID". tags: Tags to add (e.g. ["production", "reviewed"]). |
| remove_run_tags | Remove user tags from a Metaflow run. System tags cannot be removed. Removing a non-existent tag is a no-op. Args: pathspec: Run pathspec like "FlowName/RunID". tags: Tags to remove. |
| list_deployments | List Metaflow flows deployed to the configured orchestrator. Discovers flows deployed via Deployer to Argo Workflows, Maestro, or other supported backends. Args: flow_name: Only list deployments for this flow. If omitted, lists all. impl: Orchestrator backend (e.g. "maestro", "argo_workflows"). Auto-detected if omitted. |
| trigger_run | Trigger a new run for a deployed Metaflow flow. Connects to an existing deployment and triggers a new run. The flow must already be deployed via Deployer. Use list_deployments to discover available deployments and their identifiers. The run is triggered asynchronously -- this returns immediately with tracking identifiers. Use get_triggered_run_status to poll, or get_run with the Metaflow pathspec once the start step completes. Args: identifier: Deployment identifier from list_deployments (e.g. Maestro workflow_id like "myproject.test.staging.TrainFlow"). parameters: Optional flow parameter overrides as key-value pairs (e.g. {"learning_rate": "0.01", "epochs": "10"}). Unspecified parameters use deployed defaults. impl: Orchestrator backend. Auto-detected if omitted. |
| get_triggered_run_status | Check the status of a previously triggered run. Args: identifier: Deployment identifier (same as passed to trigger_run). run_id: Run ID returned by trigger_run (the workflow_run_id or workflow_instance_id field). impl: Orchestrator backend. Auto-detected if omitted. |
| terminate_run | Terminate a running triggered run. Stops the workflow on the orchestrator. This is irreversible. Args: identifier: Deployment identifier (same as passed to trigger_run). run_id: Run ID to terminate. impl: Orchestrator backend. Auto-detected if omitted. |
| run_flow | Run a Metaflow flow from a local source file. Starts execution and returns once the run ID is assigned. The flow continues running as a subprocess. Use get_run to monitor progress. Requires the flow source file on the local filesystem -- only works when the MCP server has access to the flow code (e.g. local dev). Args: flow_file: Path to the flow Python file (e.g. "./myflow.py"). parameters: Optional flow parameter overrides (e.g. {"learning_rate": "0.01"}). Keys are parameter names. tags: Optional tags to apply to the run. max_workers: Max parallel workers for foreach steps. |
| resume_run | Resume a failed Metaflow run from the point of failure. Starts a new run that reuses results from successful steps of the original run and re-executes from the failed step onward. Returns once the new run ID is assigned. The resumed flow continues running as a subprocess. Use get_run to monitor progress. Requires the flow source file on the local filesystem. Args: flow_file: Path to the flow Python file. origin_run_id: Run ID to resume from (e.g. "1715234567890"). Use get_run or get_latest_failure to find this. |
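Several tools in the table accept run, step, or task pathspecs, and every tool is invoked through the MCP `tools/call` JSON-RPC method. The sketch below shows one way a client could classify a pathspec by its number of segments and build a `get_run` request with `summary` enabled. The helper names `pathspec_level` and `build_tool_call` are hypothetical, the segment-count rule is inferred from the pathspec examples in the table, and the transport that would send the request (stdio or HTTP) is left out.

```python
import json

# Pathspec shapes taken from the tool descriptions above:
#   run:  "FlowName/RunID"
#   step: "FlowName/RunID/StepName"
#   task: "FlowName/RunID/StepName/TaskID"
PATHSPEC_LEVELS = {2: "run", 3: "step", 4: "task"}


def pathspec_level(pathspec: str) -> str:
    """Hypothetical helper: classify a pathspec by its segment count."""
    parts = pathspec.split("/")
    if any(not part for part in parts) or len(parts) not in PATHSPEC_LEVELS:
        raise ValueError(f"not a run, step, or task pathspec: {pathspec!r}")
    return PATHSPEC_LEVELS[len(parts)]


def build_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# get_run expects a run pathspec; summary=True asks for per-step counts
# plus only the failing tasks, as described in the table.
run_pathspec = "MyFlow/100"
assert pathspec_level(run_pathspec) == "run"
request = build_tool_call(1, "get_run", {"pathspec": run_pathspec, "summary": True})

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message; the dictionary is shown only to make the argument names from the table concrete.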
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/npow/metaflow-mcp-server'
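The same request can be sketched in Python using only the standard library. This assumes the endpoint returns a JSON body, which the page does not state; the function names `build_request` and `fetch_server_info` are hypothetical.

```python
import json
import urllib.request

# URL copied from the curl command above.
API_URL = "https://glama.ai/api/mcp/v1/servers/npow/metaflow-mcp-server"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body.

    Assumption: the response body is JSON; adjust if the API differs.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network call when run as a script.
    print(json.dumps(fetch_server_info(), indent=2))
```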
If you have feedback or need assistance with the MCP directory API, please join our Discord server.