Glama
guillochon

mlb-api-mcp

get_statcast_pitcher

Retrieve MLB Statcast pitching data for a specific pitcher within a defined date range to analyze performance metrics and trends.

Instructions

Retrieve MLB Statcast data for a single pitcher over a date range.

Parameters

player_id : int
    MLBAM ID of the pitcher.
start_date : str
    The start date in 'YYYY-MM-DD' format. Required.
end_date : str
    The end date in 'YYYY-MM-DD' format. Required.

Returns

dict
    Dictionary with Statcast data for the pitcher. If the result is too large, returns an error message.
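
Judging by the handler in the Implementation Reference, the returned dictionary takes one of two shapes: {"statcast_data": [...]} on success or {"error": "..."} on failure. Every cell is a string, because the handler casts the frame with astype(str). A minimal client-side sketch for unpacking such a result follows. The helper names unwrap_statcast_result and to_float are hypothetical, and release_speed is used only as an example column with made-up values.

```python
import math
from typing import Optional


def unwrap_statcast_result(result: dict) -> list[dict]:
    """Return the list of pitch records, or raise if the tool reported an error."""
    if "error" in result:
        raise ValueError(result["error"])
    return result["statcast_data"]


def to_float(cell: str) -> Optional[float]:
    """Convert a stringified cell back to a float; missing values become None."""
    try:
        value = float(cell)
    except (TypeError, ValueError):
        return None  # non-numeric text such as "None"
    # astype(str) renders NaN as the string "nan", which float() parses as NaN
    return None if math.isnan(value) else value


# Synthetic payload in the shape the handler returns (values are made up).
sample = {"statcast_data": [{"release_speed": "95.1"}, {"release_speed": "nan"}]}
records = unwrap_statcast_result(sample)
speeds = [to_float(r["release_speed"]) for r in records]
```

Raising on the error shape keeps the failure visible to the caller instead of letting an error dictionary pass for an empty result.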

Notes

Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details: https://github.com/jldbc/pybaseball/tree/master/docs

Input Schema

Name        Required  Description  Default
player_id   Yes       -            -
start_date  Yes       -            -
end_date    Yes       -            -
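
As an illustration of the three required inputs, a client could assemble a call like the one below. This is a sketch only: the tools/call envelope follows the MCP JSON-RPC convention, the helper name build_call is hypothetical, and the player_id value is a placeholder rather than a verified MLBAM ID.

```python
import json
from datetime import date


def build_call(player_id: int, start: date, end: date, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for get_statcast_pitcher."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_statcast_pitcher",
            "arguments": {
                "player_id": player_id,
                # isoformat() yields the required 'YYYY-MM-DD' form
                "start_date": start.isoformat(),
                "end_date": end.isoformat(),
            },
        },
    }


# 123456 is a placeholder ID for illustration only.
request = build_call(123456, date(2024, 4, 1), date(2024, 4, 7))
payload = json.dumps(request)
```

Building the dates from date objects rather than raw strings means a malformed date fails on the client before the tool is ever called.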

Output Schema

No arguments

Implementation Reference

  • The primary handler function for the 'get_statcast_pitcher' tool. The @mcp.tool() decorator registers it and generates its schema automatically. The handler validates the date range, calls pybaseball.statcast_pitcher to fetch the data, checks the result size, and returns any exception as an error dictionary.
    @mcp.tool()
    def get_statcast_pitcher(
        player_id: int,
        start_date: str,
        end_date: str,
    ) -> dict:
        """
        Retrieve MLB Statcast data for a single pitcher over a date range.
    
        Parameters
        ----------
        player_id : int
            MLBAM ID of the pitcher.
        start_date : str
            The start date in 'YYYY-MM-DD' format. Required.
        end_date : str
            The end date in 'YYYY-MM-DD' format. Required.
    
        Returns
        -------
        dict
            Dictionary with Statcast data for the pitcher. If the result is too large, returns an error message.
    
        Notes
        -----
        Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details:
        https://github.com/jldbc/pybaseball/tree/master/docs
        """
        try:
            # Validate date range
            date_error = validate_date_range(start_date, end_date)
            if date_error:
                return date_error
            data = statcast_pitcher(start_date, end_date, player_id)
            # Convert all columns to string to ensure JSON serializability
            data = data.astype(str)
            result = {"statcast_data": data.to_dict(orient="records")}
            if not result["statcast_data"]:
                return {
                    "error": (
                        f"No Statcast data found for the given date range ({start_date} to {end_date}). The date "
                        "range may have resulted in nothing being returned."
                    )
                }
            size_error = check_result_size(result, "player")
            if size_error:
                return size_error
            return result
        except Exception as e:
            return {"error": str(e)}
  • Helper function used by get_statcast_pitcher to validate that start_date <= end_date and dates are properly formatted.
    from datetime import datetime
    from typing import Optional
    
    def validate_date_range(start_date: str, end_date: str) -> Optional[dict]:
        """
        Utility to check that start_date is before or equal to end_date.
        Returns an error dict if invalid, else None.
        """
        try:
            start = datetime.strptime(start_date, "%Y-%m-%d")
            end = datetime.strptime(end_date, "%Y-%m-%d")
            if start > end:
                return {"error": f"start_date ({start_date}) must be before or equal to end_date ({end_date})"}
        except Exception as e:
            return {"error": f"Invalid date format: {e}"}
        return None
  • Helper function used by get_statcast_pitcher to check whether the result dictionary is too large (more than 100,000 whitespace-separated tokens in its JSON serialization) and, if so, return an error.
    def check_result_size(result: dict, context: str) -> Optional[dict]:
        """
        Utility to check the size of a result dictionary (by word count). Returns an error dict if too large, else None.
        """
        import json
    
        word_count = len(json.dumps(result).split())
        if word_count > 100000:
            return {
                "error": (
                    f"Result too large ({word_count} words). Please narrow your query "
                    f"(e.g., shorter date range, specific {context})."
                )
            }
        return None
  • Test fixture that calls mlb_api.setup_mlb_tools(mcp) to register the tools, including get_statcast_pitcher, confirming the registration mechanism.
    @pytest.fixture
    def mcp():
        mcp = MagicMock()
        patch_mcp_tool(mcp)
        mlb_api.setup_mlb_tools(mcp)
        generic_api.setup_generic_tools(mcp)
        return mcp
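
To get a feel for how quickly the 100,000-token limit in check_result_size is reached, the same counting rule can be applied to synthetic records. The sketch below reuses the helper's len(json.dumps(result).split()) expression. The helper names, the width of 90 columns per record, and the cell values are assumptions chosen for illustration, not measured Statcast figures.

```python
import json


def count_tokens(result: dict) -> int:
    """Same counting rule as check_result_size: whitespace-split JSON text."""
    return len(json.dumps(result).split())


def synthetic_result(n_rows: int, n_cols: int) -> dict:
    """Build a fake payload with n_rows records of n_cols string cells each."""
    row = {f"col{i}": "0.0" for i in range(n_cols)}
    return {"statcast_data": [dict(row) for _ in range(n_rows)]}


# Assumed width of 90 columns per pitch (illustrative only).
per_row = count_tokens(synthetic_result(2, 90)) - count_tokens(synthetic_result(1, 90))
max_rows = 100_000 // per_row
```

When cell values contain no whitespace, each column contributes two tokens (its key and its value), so under this assumption the limit is reached after a few hundred records. Text cells that contain spaces count for more.
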
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and handles it well, disclosing the key behavioral traits: the data source (MLB Statcast via pybaseball), a potential error condition (an error is returned if the result is too large), and the return type (a dictionary). However, it does not mention rate limits, authentication requirements, or pagination behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
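
Since the handler has no pagination and rejects oversized results, one workaround a caller might use is to split a long date range into shorter windows and issue one call per window. A minimal sketch, assuming a hypothetical helper split_date_range and an arbitrary default window of 7 days:

```python
from datetime import date, timedelta
from typing import Iterator


def split_date_range(start: str, end: str, days: int = 7) -> Iterator[tuple[str, str]]:
    """Yield inclusive (start_date, end_date) windows of at most `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    cursor = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while cursor <= last:
        # Clamp the final window so it never runs past the requested end date.
        window_end = min(cursor + timedelta(days=days - 1), last)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)


windows = list(split_date_range("2024-04-01", "2024-04-17"))
```

Each pair can be passed as start_date and end_date in a separate call. The windows do not overlap, so concatenating the per-window records should not duplicate pitches.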

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections (Parameters, Returns, Notes) and front-loaded purpose statement. The Notes section could be slightly more concise, but overall information density is high with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that there are no annotations and 0% schema description coverage, but that an output schema is present, the description provides good context: a clear purpose, parameter semantics, return behavior, and the data source. It could be improved by mentioning relationships to sibling tools or giving more behavioral detail, but it covers the essentials adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds substantial value beyond the input schema (0% coverage). It provides semantic meaning for all three parameters: clarifies player_id is an 'MLBAM ID', specifies date formats ('YYYY-MM-DD'), and marks both dates as 'Required'. This fully compensates for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verb ('Retrieve') and resource ('MLB Statcast data for a single pitcher over a date range'). It distinguishes itself from siblings like 'get_statcast_batter' (pitcher vs batter) and 'get_statcast_team' (single pitcher vs team).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through parameter requirements (pitcher ID and date range) but doesn't explicitly state when to use this tool versus alternatives like 'get_statcast_batter' or 'get_statcast_team'. No guidance on prerequisites or exclusions is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
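
The missing guidance could be captured in a small selection rule. The sketch below is hypothetical: it only maps a query scope to the sibling tool names mentioned in this review, and it assumes nothing about the parameters of get_statcast_batter or get_statcast_team.

```python
from enum import Enum


class Scope(Enum):
    PITCHER = "pitcher"
    BATTER = "batter"
    TEAM = "team"


# Tool names as listed in this review; the mapping itself is illustrative.
TOOL_FOR_SCOPE = {
    Scope.PITCHER: "get_statcast_pitcher",
    Scope.BATTER: "get_statcast_batter",
    Scope.TEAM: "get_statcast_team",
}


def choose_statcast_tool(scope: str) -> str:
    """Return the tool name for a scope string; unknown scopes raise ValueError."""
    return TOOL_FOR_SCOPE[Scope(scope.lower())]
```

For example, choose_statcast_tool("pitcher") selects get_statcast_pitcher, while an unrecognized scope fails loudly instead of falling back to a default tool.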

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/guillochon/mlb-api-mcp'
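
The same request can be issued from Python with the standard library. This is a sketch: the function names are hypothetical, and decoding the body as JSON is an assumption, since the response format is not shown here.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/guillochon/mlb-api-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build a GET request equivalent to the curl command."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; fetch_server_info() does.
request = build_request()
```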

If you have feedback or need assistance with the MCP directory API, please join our Discord server.