mlb-api-mcp

get_statcast_pitcher

Retrieve MLB Statcast pitching data for a specific pitcher within a defined date range to analyze performance metrics and trends.

Instructions

Retrieve MLB Statcast data for a single pitcher over a date range.

Parameters

player_id : int
    MLBAM ID of the pitcher.
start_date : str
    The start date in 'YYYY-MM-DD' format. Required.
end_date : str
    The end date in 'YYYY-MM-DD' format. Required.

Returns

dict
    Dictionary with Statcast data for the pitcher. If the result is too large, returns an error message.

Notes

Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details: https://github.com/jldbc/pybaseball/tree/master/docs
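Because the tool reports failures in-band, as a dictionary with an `error` key rather than by raising, a caller has to inspect the result before using it. The following is a minimal sketch of that check, assuming the two shapes the handler returns (`statcast_data` on success, `error` otherwise). The helper name `unpack_statcast_response` and the sample payloads are hypothetical and not part of mlb-api-mcp.

```python
from typing import Any, Dict, List


def unpack_statcast_response(response: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return the pitch records, or raise if the tool reported an error.

    Hypothetical client-side helper; not part of mlb-api-mcp.
    """
    if "error" in response:
        raise RuntimeError(response["error"])
    return response.get("statcast_data", [])


# Illustrative payloads shaped like the tool's two return forms.
ok_response = {"statcast_data": [{"pitch_type": "FF", "release_speed": "95.2"}]}
error_response = {"error": "Invalid date format: ..."}

records = unpack_statcast_response(ok_response)
print(len(records))

try:
    unpack_statcast_response(error_response)
except RuntimeError as exc:
    print(f"tool reported: {exc}")
```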

Input Schema

Name	Required	Description	Default
`player_id`	Yes
`start_date`	Yes
`end_date`	Yes
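
The table lists names only; the descriptions live in the docstring. For orientation, the JSON Schema that an MCP framework would typically derive from the handler's type hints might look like the sketch below. It is an assumption reconstructed from the signature `(player_id: int, start_date: str, end_date: str)`, not the schema the server actually publishes. `missing_arguments` is a hypothetical helper and the `player_id` value is a placeholder.

```python
# Assumed shape of the input schema, inferred from the handler's signature.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "player_id": {"type": "integer"},
        "start_date": {"type": "string"},
        "end_date": {"type": "string"},
    },
    "required": ["player_id", "start_date", "end_date"],
}


def missing_arguments(arguments: dict) -> list:
    """Return the required argument names that are absent from `arguments`."""
    return [name for name in INPUT_SCHEMA["required"] if name not in arguments]


example_call = {
    "player_id": 123456,  # placeholder, not a real MLBAM lookup
    "start_date": "2024-04-01",
    "end_date": "2024-04-07",
}
print(missing_arguments(example_call))
print(missing_arguments({"player_id": 123456}))
```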

Output Schema

Name	Required	Description	Default
No arguments

Implementation Reference

mlb_api.py:796-846 (handler)

The primary handler function for the 'get_statcast_pitcher' tool. It is decorated with @mcp.tool() for automatic registration and schema generation. It validates the date range, calls pybaseball.statcast_pitcher to fetch the data, rejects empty or oversized results, and returns any exception as an error dictionary.

@mcp.tool()
def get_statcast_pitcher(
    player_id: int,
    start_date: str,
    end_date: str,
) -> dict:
    """
    Retrieve MLB Statcast data for a single pitcher over a date range.

    Parameters
    ----------
    player_id : int
        MLBAM ID of the pitcher.
    start_date : str
        The start date in 'YYYY-MM-DD' format. Required.
    end_date : str
        The end date in 'YYYY-MM-DD' format. Required.

    Returns
    -------
    dict
        Dictionary with Statcast data for the pitcher. If the result is too large, returns an error message.

    Notes
    -----
    Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details:
    https://github.com/jldbc/pybaseball/tree/master/docs
    """
    try:
        # Validate date range
        date_error = validate_date_range(start_date, end_date)
        if date_error:
            return date_error
        data = statcast_pitcher(start_date, end_date, player_id)
        # Convert all columns to string to ensure JSON serializability
        data = data.astype(str)
        result = {"statcast_data": data.to_dict(orient="records")}
        if not result["statcast_data"]:
            return {
                "error": (
                    f"No Statcast data found for the given date range ({start_date} to {end_date}). The date "
                    "range may have resulted in nothing being returned."
                )
            }
        size_error = check_result_size(result, "player")
        if size_error:
            return size_error
        return result
    except Exception as e:
        return {"error": str(e)}
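
Since the handler casts every column to `str` before building the records, numeric fields arrive as text, and missing values typically arrive as strings such as 'nan' or 'None' (this is how pandas' `astype(str)` renders them). A consumer that wants to compute on the data has to convert back. The following is a minimal sketch, assuming `release_speed` is the field of interest. `to_float`, `MISSING_MARKERS`, and the sample records are hypothetical.

```python
import math
from typing import Optional

# String forms that a stringified DataFrame commonly uses for missing values.
MISSING_MARKERS = {"nan", "None", "<NA>", "NaT", ""}


def to_float(value: str) -> Optional[float]:
    """Parse a stringified numeric field; return None for missing or non-numeric text."""
    if value in MISSING_MARKERS:
        return None
    try:
        number = float(value)
    except ValueError:
        return None
    return None if math.isnan(number) else number


# Illustrative records shaped like entries of result["statcast_data"].
records = [
    {"pitch_type": "FF", "release_speed": "95.0"},
    {"pitch_type": "SL", "release_speed": "nan"},
    {"pitch_type": "FF", "release_speed": "97.0"},
]

speeds = [s for s in (to_float(r["release_speed"]) for r in records) if s is not None]
average_speed = sum(speeds) / len(speeds) if speeds else None
print(average_speed)
```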

mlb_api.py:206-218 (helper)

Helper function used by get_statcast_pitcher to validate that both dates parse as 'YYYY-MM-DD' and that start_date <= end_date.

def validate_date_range(start_date: str, end_date: str) -> Optional[dict]:
    """
    Utility to check that start_date is before or equal to end_date.
    Returns an error dict if invalid, else None.
    """
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d")
        end = datetime.strptime(end_date, "%Y-%m-%d")
        if start > end:
            return {"error": f"start_date ({start_date}) must be before or equal to end_date ({end_date})"}
    except Exception as e:
        return {"error": f"Invalid date format: {e}"}
    return None
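
To see the three outcomes, the helper can be exercised on its own. The function body below is copied from the excerpt above, with the imports it needs added; the sample dates are arbitrary.

```python
from datetime import datetime
from typing import Optional


def validate_date_range(start_date: str, end_date: str) -> Optional[dict]:
    """Return an error dict if the range is invalid, else None."""
    try:
        start = datetime.strptime(start_date, "%Y-%m-%d")
        end = datetime.strptime(end_date, "%Y-%m-%d")
        if start > end:
            return {"error": f"start_date ({start_date}) must be before or equal to end_date ({end_date})"}
    except Exception as e:
        return {"error": f"Invalid date format: {e}"}
    return None


valid = validate_date_range("2024-04-01", "2024-04-07")     # None: range is fine
reversed_range = validate_date_range("2024-05-01", "2024-04-01")  # start after end
malformed = validate_date_range("04/01/2024", "2024-04-07")  # wrong format

print(valid)
print(reversed_range)
print(malformed)
```

Note that `strptime` treats the leading zero as optional for `%m` and `%d`, so a value such as '2024-4-1' also passes; the check is slightly more lenient than the documented 'YYYY-MM-DD' format.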

mlb_api.py:189-204 (helper)

Helper function used by get_statcast_pitcher to check whether the result dictionary is too large (more than 100,000 whitespace-separated words in its JSON serialization) and to return an error dictionary if so.

def check_result_size(result: dict, context: str) -> Optional[dict]:
    """
    Utility to check the size of a result dictionary (by word count). Returns an error dict if too large, else None.
    """
    import json

    word_count = len(json.dumps(result).split())
    if word_count > 100000:
        return {
            "error": (
                f"Result too large ({word_count} words). Please narrow your query "
                f"(e.g., shorter date range, specific {context})."
            )
        }
    return None
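
The "word" count here is the number of whitespace-separated tokens in the JSON text, which for values without spaces works out to roughly two tokens per field per record. The sketch below reproduces that measure and then shows one way a caller might follow the error's advice by splitting a long range into shorter windows. `count_words` and `split_date_range` are hypothetical helpers, not part of mlb-api-mcp, and the window length is an arbitrary choice.

```python
import json
from datetime import date, timedelta
from typing import Iterator, Tuple


def count_words(result: dict) -> int:
    """Same measure as check_result_size: whitespace-separated JSON tokens."""
    return len(json.dumps(result).split())


def split_date_range(start_date: str, end_date: str, days: int) -> Iterator[Tuple[str, str]]:
    """Yield consecutive (start, end) windows of at most `days` days, inclusive."""
    if days < 1:
        raise ValueError("days must be at least 1")
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    while start <= end:
        window_end = min(start + timedelta(days=days - 1), end)
        yield start.isoformat(), window_end.isoformat()
        start = window_end + timedelta(days=1)


# Three records with two space-free fields each.
sample = {"statcast_data": [{"pitch_type": "FF", "release_speed": "95.2"}] * 3}
print(count_words(sample))  # 13: one leading token plus 2 per field per record

windows = list(split_date_range("2024-04-01", "2024-04-10", days=4))
print(windows)
```

Each window can then be passed to the tool as its own `start_date` and `end_date`, and the per-window records concatenated on the client side.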

tests/test_mlb_api.py:26-32 (registration)

Test fixture that calls mlb_api.setup_mlb_tools(mcp) to register the tools, including get_statcast_pitcher, confirming the registration mechanism.

```
@pytest.fixture
def mcp():
    mcp = MagicMock()
    patch_mcp_tool(mcp)
    mlb_api.setup_mlb_tools(mcp)
    generic_api.setup_generic_tools(mcp)
    return mcp
```
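
The fixture relies on `patch_mcp_tool`, whose body is not shown here. One plausible reason it exists: on a bare `MagicMock`, `mcp.tool()` returns another mock, so decorating a function with it would replace the function with a mock. A hypothetical stand-in that makes `mcp.tool()` a pass-through decorator and records what gets registered could look like the sketch below. `registered_tools` and `setup_demo_tools` are invented for the illustration, and the real helper may work differently.

```python
from unittest.mock import MagicMock


def patch_mcp_tool(mcp) -> None:
    """Hypothetical stand-in: make mcp.tool() return a pass-through decorator."""
    registry = {}

    def tool(*args, **kwargs):
        def decorator(func):
            registry[func.__name__] = func  # remember the tool by name
            return func                     # leave the function itself untouched
        return decorator

    mcp.tool = tool
    mcp.registered_tools = registry


def setup_demo_tools(mcp) -> None:
    """Toy equivalent of a setup function that registers one tool."""

    @mcp.tool()
    def get_statcast_pitcher(player_id: int, start_date: str, end_date: str) -> dict:
        return {"statcast_data": []}


mcp = MagicMock()
patch_mcp_tool(mcp)
setup_demo_tools(mcp)
print(sorted(mcp.registered_tools))
```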

Tool Definition Quality

Grade A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well by disclosing key behavioral traits: data source (MLB Statcast via pybaseball), potential error condition (returns error if result too large), and return type (dictionary). However, it doesn't mention rate limits, authentication needs, or pagination behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections (Parameters, Returns, Notes) and front-loaded purpose statement. The Notes section could be slightly more concise, but overall information density is high with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and 0% schema description coverage, but with an output schema present, the description provides good context: clear purpose, parameter semantics, return behavior, and data source. It could improve by mentioning sibling tool relationships or more behavioral details, but it covers the essentials adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds substantial value beyond the input schema (0% coverage). It provides semantic meaning for all three parameters: clarifies player_id is an 'MLBAM ID', specifies date formats ('YYYY-MM-DD'), and marks both dates as 'Required'. This fully compensates for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verb ('Retrieve') and resource ('MLB Statcast data for a single pitcher over a date range'). It distinguishes itself from siblings like 'get_statcast_batter' (pitcher vs batter) and 'get_statcast_team' (single pitcher vs team).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through parameter requirements (pitcher ID and date range) but doesn't explicitly state when to use this tool versus alternatives like 'get_statcast_batter' or 'get_statcast_team'. No guidance on prerequisites or exclusions is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/guillochon/mlb-api-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.