scatter_plot
by kdqed

Create scatter plots from SQL query results on CSV or Parquet data sources to visualize relationships between variables for data analysis.

Instructions

Run the query against the specified source and make a scatter plot from the result. For both CSV and Parquet sources, use DuckDB SQL syntax. Use 'CSV' as the table name in the SQL query for CSV sources, and 'PARQUET' as the table name for Parquet sources.

This will return an image of the plot.

Input Schema

source_id (required): The data source to run the query on
query (required): SQL query to run on the data source
x (required): Column name from SQL result to use for x-axis
y (required): Column name from SQL result to use for y-axis
color (optional, default None): Column name from SQL result to use for coloring the points, with color representing another dimension
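Putting the schema together, a call to this tool might carry arguments like the sketch below. The source id, table contents, and column names are illustrative assumptions; the literal table name 'CSV' follows the instructions above.

```python
# Hypothetical arguments for a scatter_plot call against a CSV source.
# "sales_csv" and the column names are assumptions; note the query selects
# from the literal table name 'CSV', as the instructions require.
arguments = {
    "source_id": "sales_csv",
    "query": "SELECT units, revenue, region FROM CSV WHERE year = 2023",
    "x": "units",
    "y": "revenue",
    "color": "region",  # optional; omit for single-color points
}
```

For a Parquet source, the only change would be querying `FROM PARQUET` instead.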

Implementation Reference

  • The handler function that runs an SQL query on a data source, generates a scatter plot using Plotly Express with specified x, y, and optional color columns, and returns a base64-encoded PNG image or error string.
    def scatter_plot(self,
        source_id: Annotated[
            str, Field(description='The data source to run the query on')
        ],  
        query: Annotated[
            str, Field(description='SQL query to run on the data source')
        ],
        x: Annotated[
            str, Field(description='Column name from SQL result to use for x-axis')
        ],
        y: Annotated[
            str, Field(description='Column name from SQL result to use for y-axis')
        ],
        color: Annotated[
            str | None, Field(description='Optional; column name from SQL result to use for coloring the points, with color representing another dimension')
        ] = None,
    ) -> str | ImageContent:
        """
        Run query against specified source and make a scatter plot using result
        For both csv and parquet sources, use DuckDB SQL syntax
        Use 'CSV' as the table name in the SQL query for csv sources.
        Use 'PARQUET' as the table name in the SQL query for parquet sources.
    
        This will return an image of the plot
        """
    
        try:
            df = self._get_df_from_source(source_id, query)
            fig = px.scatter(df, x=x, y=y, color=color)
            fig.update_xaxes(autotickangles=[0, 45, 60, 90])
    
            return _fig_to_image(fig)
        except Exception as e:
            return str(e)
  • Input schema defined via Annotated types and Field descriptions for source_id, query, x, y, and optional color parameters, as shown in the handler signature above.
  • Registers all tool functions from ZaturnTools, including scatter_plot, into the FastMCP server using Tool.from_function.
    for tool_function in zaturn_tools.tools:
        zaturn_mcp.add_tool(Tool.from_function(tool_function))
  • Adds the scatter_plot method to the list of tools in the Visualizations class.
    self.tools = [
        self.scatter_plot,
        self.line_plot,
        self.histogram,
        self.strip_plot,
        self.box_plot,
        self.bar_plot,
        self.density_heatmap,
        self.polar_scatter,
        self.polar_line,
    ]
  • Helper function to convert a Plotly figure to a base64-encoded ImageContent object for MCP response.
    def _fig_to_image(fig):
        # Encode the rendered PNG as base64 for the MCP ImageContent payload.
        # ImageContent.data takes the bare base64 string, so no data-URL
        # prefix is needed.
        fig_encoded = b64encode(fig.to_image(format='png')).decode()

        return ImageContent(
            type = 'image',
            data = fig_encoded,
            mimeType = 'image/png',
            annotations = None,
        )
  • Helper function to retrieve and execute SQL query on the specified data source, returning a DataFrame.
    def _get_df_from_source(self, source_id, query):
        # Look up the registered source; an unknown id raises here, and the
        # handler's except block converts the exception to an error string
        source = self.data_sources.get(source_id)
        if not source:
            raise Exception(f"Source {source_id} Not Found")

        return query_utils.execute_query(source, query)
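The lookup-then-raise pattern in `_get_df_from_source` can be sketched with a plain dict standing in for the server's source registry; the source id and value here are assumptions, not the server's actual configuration.

```python
# Plain-dict stand-in for the server's data_sources registry (assumed values)
data_sources = {"sales_csv": "path/to/sales.csv"}

def get_source(source_id):
    # Mirrors _get_df_from_source's lookup: unknown ids raise immediately,
    # and the calling handler turns the exception into an error string
    source = data_sources.get(source_id)
    if not source:
        raise Exception(f"Source {source_id} Not Found")
    return source
```

This is why an agent passing a bad source_id gets back the text "Source ... Not Found" rather than an image.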
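The base64 step in `_fig_to_image` can be checked in isolation; this sketch substitutes placeholder bytes for the Plotly PNG output.

```python
from base64 import b64decode, b64encode

# Stand-in for fig.to_image(format='png'); a real figure would yield a
# complete PNG, but the PNG signature bytes suffice to show the encoding
png_bytes = b"\x89PNG\r\n\x1a\n"

# Same encoding _fig_to_image applies before building ImageContent
encoded = b64encode(png_bytes).decode()

# Decoding recovers the original bytes exactly
assert b64decode(encoded) == png_bytes
```

The resulting string is what lands in ImageContent.data, paired with the 'image/png' mimeType.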
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It states the tool returns an image of the plot, which is useful, but lacks critical details: it doesn't mention whether this is a read-only operation (though implied by plotting), error handling for invalid queries or columns, performance characteristics, or any authentication/rate limit considerations. The description adds some context about SQL syntax but misses key behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with four sentences. It's front-loaded with the core purpose, followed by implementation details and output information. While efficient, the second and third sentences about DuckDB syntax and table names could be more integrated or simplified, slightly affecting flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (a 5-parameter tool that queries and visualizes data), no annotations, and no output schema, the description is moderately complete. It covers the basic purpose, SQL syntax details, and output type, but lacks context on error cases, performance, or how it differs from siblings. For a tool with no structured safety or output info, it should provide more behavioral guidance to be fully helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters thoroughly. The description adds minimal value beyond the schema: it implies the query runs against a data source and results are used for plotting, but doesn't clarify parameter interactions (e.g., how 'color' relates to 'x' and 'y') or provide examples. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool runs a query and creates a scatter plot from the results, specifying the verb ('run query' and 'make a scatter plot') and resource ('specified source'). However, it doesn't explicitly differentiate from sibling tools like 'run_query' (which doesn't create plots) or other plot types (e.g., 'line_plot'), leaving some ambiguity about when to choose this specific visualization.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions using DuckDB SQL syntax and table naming conventions for CSV/parquet sources, but these are implementation details rather than usage context. There's no mention of when a scatter plot is appropriate compared to other plot types like 'bar_plot' or 'histogram' from the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
