Glama
kdqed
by kdqed

box_plot

Create box plots from SQL query results on CSV or Parquet data sources to visualize statistical distributions and identify outliers in your data.

Instructions

Run a query against the specified source and make a box plot from the result. For both CSV and Parquet sources, use DuckDB SQL syntax. Use 'CSV' as the table name in the SQL query for CSV sources, and 'PARQUET' as the table name for Parquet sources.

This will return an image of the plot.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| source_id | Yes | The data source to run the query on | |
| query | Yes | SQL query to run on the data source | |
| x | Yes | Column name from SQL result to use for x-axis | |
| y | Yes | Column name from SQL result to use for y-axis | |
| color | No | Optional column name from SQL result to show multiple colored bars representing another dimension | None |
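
For illustration, a call to this tool against a CSV source might pass arguments like the following. The source_id "sales_csv" and the column names are hypothetical; note that the query uses 'CSV' as the table name:

```python
# Hypothetical box_plot arguments for a CSV source ("sales_csv" and the
# column names are made up for illustration).
args = {
    "source_id": "sales_csv",
    "query": "SELECT region, revenue FROM CSV WHERE revenue IS NOT NULL",
    "x": "region",    # one box per category on the x-axis
    "y": "revenue",   # numeric distribution shown by each box
    "color": None,    # optional extra grouping dimension
}
```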

Implementation Reference

  • Core handler function for the 'box_plot' MCP tool. It executes the SQL query on the specified data source, generates a box plot via plotly.express.box with x, y, and optional color grouping, and converts it to a base64 PNG ImageContent; on failure it returns the error message as a string.
    def box_plot(self,
        source_id: Annotated[
            str, Field(description='The data source to run the query on')
        ],  
        query: Annotated[
            str, Field(description='SQL query to run on the data source')
        ],
        x: Annotated[
            str, Field(description='Column name from SQL result to use for x-axis')
        ],
        y: Annotated[
            str, Field(description='Column name from SQL result to use for y-axis')
        ],
        color: Annotated[
            str | None, Field(description='Optional column name from SQL result to show multiple colored bars representing another dimension')
        ] = None,
    ) -> str | ImageContent:
        """
        Run query against specified source and make a box plot using result
        For both csv and parquet sources, use DuckDB SQL syntax
        Use 'CSV' as the table name in the SQL query for csv sources.
        Use 'PARQUET' as the table name in the SQL query for parquet sources.
    
        This will return an image of the plot
        """
    
        try:
            df = self._get_df_from_source(source_id, query)
            fig = px.box(df, x=x, y=y, color=color)
            fig.update_xaxes(autotickangles=[0, 45, 60, 90])
    
            return _fig_to_image(fig)
        except Exception as e:
            return str(e)
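  As a reminder of what the resulting plot encodes, the five-number-summary pieces and Tukey-fence outliers a box plot draws can be computed with the standard library alone (a sketch, independent of the plotly code above):

    ```python
    from statistics import quantiles

    def box_stats(values):
        # Quartiles a box plot draws, plus Tukey outliers: points beyond
        # 1.5 * IQR outside the quartiles (whisker limits).
        q1, median, q3 = quantiles(values, n=4)
        iqr = q3 - q1
        lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        return {
            "q1": q1,
            "median": median,
            "q3": q3,
            "outliers": [v for v in values if v < lo or v > hi],
        }

    box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
    # → {'q1': 2.75, 'median': 5.5, 'q3': 8.25, 'outliers': [100]}
    ```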
  • Registration of the box_plot tool (along with other visualization tools) in the Visualizations class's self.tools list, used for MCP tool server registration.
    self.tools = [
        self.scatter_plot,
        self.line_plot,
        self.histogram,
        self.strip_plot,
        self.box_plot,
        self.bar_plot,
        self.density_heatmap,
        self.polar_scatter,
        self.polar_line,
    ]
  • Helper function shared across all plot tools to convert Plotly figure to MCP ImageContent (base64 PNG).
    def _fig_to_image(fig):
        # ImageContent.data carries raw base64; no "data:image/png;base64,"
        # prefix is needed.
        fig_encoded = b64encode(fig.to_image(format='png')).decode()

        return ImageContent(
            type='image',
            data=fig_encoded,
            mimeType='image/png',
            annotations=None,
        )
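  The distinction matters because ImageContent.data expects raw base64, not a data URI. A minimal sketch of the encoding step, using the 8-byte PNG file signature as stand-in bytes (a real call would encode the full PNG from fig.to_image):

    ```python
    from base64 import b64encode

    # PNG file signature as placeholder payload; illustrative only.
    png_bytes = b"\x89PNG\r\n\x1a\n"
    encoded = b64encode(png_bytes).decode()
    print(encoded)  # → iVBORw0KGgo=
    ```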
  • Helper method to fetch and execute SQL query on the specified data source, returning a Pandas DataFrame for plotting.
    def _get_df_from_source(self, source_id, query):
        source = self.data_sources.get(source_id)
        if not source:
            raise Exception(f"Source {source_id} Not Found")
                
        return query_utils.execute_query(source, query)
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: it returns an image of the plot, uses DuckDB SQL syntax, and specifies table naming conventions for different source types. However, it doesn't mention important aspects like error handling, performance characteristics, size limitations, or what happens with invalid queries/columns. For a visualization tool with no annotations, this leaves significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with four sentences that each add value: purpose statement, SQL syntax guidance, table naming rules, and output specification. It's front-loaded with the core purpose and contains zero redundant information. Every sentence earns its place by providing essential operational guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given this is a visualization tool with 5 parameters, no annotations, and no output schema, the description provides adequate but incomplete coverage. It explains the core functionality and SQL context well, but lacks information about the returned image format, error conditions, performance implications, or how it differs from similar visualization tools. For a tool that both queries data and creates visualizations, more behavioral context would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description adds some context about how parameters relate to the visualization (x/y axes, color for additional dimension) and mentions the SQL syntax context, but doesn't provide additional semantic meaning beyond what's in the schema descriptions. This meets the baseline of 3 when schema coverage is high.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Run query against specified source and make a box plot using result' - this is a specific verb ('make a box plot') + resource ('result from query') combination. It distinguishes itself from siblings like 'bar_plot', 'scatter_plot', and 'run_query' by specifying it creates a box plot visualization rather than other plot types or just running queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool: for creating box plots from query results, with specific syntax guidance for different source types (CSV vs parquet). However, it doesn't explicitly state when NOT to use it or mention alternatives among sibling tools (e.g., use 'run_query' if you just need data without visualization, or 'histogram' for distribution plots).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kdqed/zaturn'

If you have feedback or need assistance with the MCP directory API, please join our Discord server