guillochon

mlb-api-mcp

get_statcast_batter

Retrieve MLB Statcast data for a specific batter by providing their MLBAM ID and a date range to analyze performance metrics.

Instructions

Retrieve MLB Statcast data for a single batter over a date range.

Parameters

player_id : int
    MLBAM ID of the batter.
start_date : str
    The start date in 'YYYY-MM-DD' format. Required.
end_date : str
    The end date in 'YYYY-MM-DD' format. Required.

Returns

dict
    Dictionary with Statcast data for the batter. If the result is too large, returns an error message.
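Judging by the implementation listed on this page, a successful call returns a dict with a `statcast_data` key holding a list of records, and a failed call returns a dict with an `error` key. The sketch below shows one way a caller might tell the two apart. `unpack_statcast_result` is a hypothetical client-side helper, not part of the server, and the sample record is hand-written stand-in data.

```python
from typing import Any


def unpack_statcast_result(result: dict[str, Any]) -> list[dict[str, str]]:
    """Return the list of records, or raise if the tool reported an error.

    Hypothetical helper; it assumes the two result shapes ("error" or
    "statcast_data") seen in the tool's implementation.
    """
    if "error" in result:
        raise RuntimeError(f"get_statcast_batter failed: {result['error']}")
    return result.get("statcast_data", [])


# Hand-written stand-in data, not real Statcast output.
ok = {"statcast_data": [{"pitch_type": "FF", "release_speed": "95.1"}]}
rows = unpack_statcast_result(ok)
print(len(rows))  # 1
```

Because the implementation converts every column to a string before returning, numeric fields would need to be parsed back on the client side.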

Notes

Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details: https://github.com/jldbc/pybaseball/tree/master/docs

Input Schema

Name        Required  Description  Default
player_id   Yes
start_date  Yes
end_date    Yes
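The three required arguments can be checked on the client before a call is sent. The sketch below builds an MCP `tools/call` request body for this tool. `build_call` is a hypothetical helper, the player ID is a placeholder, and the date check only mirrors the `strptime`-based validation shown in the implementation; it does not replace it.

```python
import json
from datetime import datetime


def build_call(player_id: int, start_date: str, end_date: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC request for get_statcast_batter (hypothetical helper)."""
    # Same format string the server-side helper uses; raises ValueError on bad input.
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")
    if start > end:
        raise ValueError("start_date must be before or equal to end_date")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_statcast_batter",
            "arguments": {
                "player_id": player_id,
                "start_date": start_date,
                "end_date": end_date,
            },
        },
    }


# The player ID below is a placeholder, not a lookup result.
request = build_call(123456, "2024-04-01", "2024-04-07")
print(json.dumps(request, indent=2))
```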

Output Schema

No arguments

Implementation Reference

  • The @mcp.tool()-decorated function implementing the core logic of the 'get_statcast_batter' tool. It validates the date range, fetches Statcast batter data using pybaseball's statcast_batter function, converts to JSON-serializable format, checks result size, and returns the data or appropriate error.
    @mcp.tool()
    def get_statcast_batter(
        player_id: int,
        start_date: str,
        end_date: str,
    ) -> dict:
        """
        Retrieve MLB Statcast data for a single batter over a date range.
    
        Parameters
        ----------
        player_id : int
            MLBAM ID of the batter.
        start_date : str
            The start date in 'YYYY-MM-DD' format. Required.
        end_date : str
            The end date in 'YYYY-MM-DD' format. Required.
    
        Returns
        -------
        dict
            Dictionary with Statcast data for the batter. If the result is too large, returns an error message.
    
        Notes
        -----
        Data is sourced from MLB Statcast via pybaseball. See the official documentation for more details:
        https://github.com/jldbc/pybaseball/tree/master/docs
        """
        try:
            # Validate date range
            date_error = validate_date_range(start_date, end_date)
            if date_error:
                return date_error
            data = statcast_batter(start_date, end_date, player_id)
            # Convert all columns to string to ensure JSON serializability
            data = data.astype(str)
            result = {"statcast_data": data.to_dict(orient="records")}
            if not result["statcast_data"]:
                return {
                    "error": (
                        f"No Statcast data found for the given date range ({start_date} to {end_date}). The date "
                        "range may have resulted in nothing being returned."
                    )
                }
            size_error = check_result_size(result, "player")
            if size_error:
                return size_error
            return result
        except Exception as e:
            return {"error": str(e)}
  • main.py:21-23 (registration)
    The code block in main.py that calls setup_mlb_tools(mcp), which registers the get_statcast_batter tool (and other MLB tools) with the MCP server instance.
    # Setup all MLB and generic tools
    setup_mlb_tools(mcp)
    setup_generic_tools(mcp)
  • Helper function used by get_statcast_batter to validate the input date range.
    def validate_date_range(start_date: str, end_date: str) -> Optional[dict]:
        """
        Utility to check that start_date is before or equal to end_date.
        Returns an error dict if invalid, else None.
        """
        try:
            start = datetime.strptime(start_date, "%Y-%m-%d")
            end = datetime.strptime(end_date, "%Y-%m-%d")
            if start > end:
                return {"error": f"start_date ({start_date}) must be before or equal to end_date ({end_date})"}
        except Exception as e:
            return {"error": f"Invalid date format: {e}"}
        return None
  • Helper function used by get_statcast_batter to check if the result size exceeds limits.
    def check_result_size(result: dict, context: str) -> Optional[dict]:
        """
        Utility to check the size of a result dictionary (by word count). Returns an error dict if too large, else None.
        """
        import json
    
        word_count = len(json.dumps(result).split())
        if word_count > 100000:
            return {
                "error": (
                    f"Result too large ({word_count} words). Please narrow your query "
                    f"(e.g., shorter date range, specific {context})."
                )
            }
        return None
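`check_result_size` counts whitespace-separated tokens in the serialized JSON, not rows or bytes. The sketch below shows what that measure looks like for record-shaped data. `word_count` is a hypothetical stand-in that repeats the same expression, and the sample rows are invented.

```python
import json


def word_count(result: dict) -> int:
    """Same measure as check_result_size: whitespace-separated tokens in the JSON text."""
    return len(json.dumps(result).split())


# Invented sample: 3 records with 2 string fields each, no spaces inside values.
records = [{"pitch_type": "FF", "release_speed": "95.1"} for _ in range(3)]
result = {"statcast_data": records}

# json.dumps puts a space after every ':' and ',', so each field contributes
# two tokens (key and value) and the top-level key contributes one more.
print(word_count(result))  # 13
```

Under this measure, any value that contains spaces adds extra tokens, so the number of rows that fit under the 100000 limit depends on which columns come back and what they contain.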
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses key behavioral traits: data source (MLB Statcast via pybaseball), potential error condition ('If the result is too large, returns an error message'), and return format. However, it doesn't mention rate limits, authentication needs, or what specific data fields to expect in the dictionary.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections (Parameters, Returns, Notes). The opening sentence efficiently states the purpose. Some redundancy exists (parameters listed in both description and schema), but overall it's appropriately sized with useful information in each section.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters with 0% schema coverage and no annotations, the description provides good coverage: purpose, parameters, return format, error condition, and data source. With an output schema present, it doesn't need to detail return values. The main gap is lack of behavioral constraints like rate limits or authentication requirements.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It provides clear semantic meaning for all 3 parameters: player_id as 'MLBAM ID of the batter', start_date/end_date as date strings in 'YYYY-MM-DD' format with 'Required' indication. This adds substantial value beyond the bare schema, though it doesn't explain format constraints beyond date format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Retrieve MLB Statcast data'), resource ('for a single batter'), and scope ('over a date range'). It distinguishes from siblings like get_statcast_pitcher (different player type) and get_statcast_team (different aggregation level).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the parameters (batter-specific, date range) and distinguishes from siblings by specifying 'single batter' vs team/pitcher tools. However, it doesn't explicitly state when NOT to use this tool or name specific alternatives for different query types.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/guillochon/mlb-api-mcp'
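
The same request can be made from Python with only the standard library. This is a sketch: `server_url` and `fetch_server_info` are hypothetical names, and the response body is returned as text (assumed to be UTF-8) because its format is not described here.

```python
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server (path layout taken from the curl example)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> str:
    """GET the server entry and return the response body as text (UTF-8 assumed)."""
    with urlopen(server_url(owner, name), timeout=timeout) as response:
        return response.read().decode("utf-8")


print(server_url("guillochon", "mlb-api-mcp"))
# https://glama.ai/api/mcp/v1/servers/guillochon/mlb-api-mcp
```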

If you have feedback or need assistance with the MCP directory API, please join our Discord server.