scatter_plot
Create scatter plots from SQL query results on CSV or Parquet data sources to visualize relationships between variables for data analysis.
Instructions
Run the query against the specified source and make a scatter plot from the result. For both CSV and Parquet sources, use DuckDB SQL syntax. Use 'CSV' as the table name in the SQL query for CSV sources. Use 'PARQUET' as the table name in the SQL query for Parquet sources.
The tool returns an image of the plot.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| source_id | Yes | The data source to run the query on | |
| query | Yes | SQL query to run on the data source | |
| x | Yes | Column name from SQL result to use for x-axis | |
| y | Yes | Column name from SQL result to use for y-axis | |
| color | No | Optional; column name from SQL result to use for coloring the points, with color representing another dimension | |
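
To make the schema concrete, here is a minimal sketch of assembling an argument payload that matches the table: four required strings plus an optional `color`. The function `build_scatter_args`, the source id `sales_csv`, and the column names are hypothetical and only serve the example.

```python
import json

# Required fields, as listed in the input schema table.
REQUIRED = ("source_id", "query", "x", "y")


def build_scatter_args(source_id, query, x, y, color=None):
    """Return an argument dict for scatter_plot, omitting color when unset."""
    args = {"source_id": source_id, "query": query, "x": x, "y": y}
    missing = [name for name in REQUIRED if not args[name]]
    if missing:
        raise ValueError(f"missing required argument(s): {', '.join(missing)}")
    if color is not None:
        args["color"] = color
    return args


args = build_scatter_args(
    source_id="sales_csv",  # hypothetical source id
    query="SELECT price, quantity, region FROM CSV",
    x="price",
    y="quantity",
    color="region",
)
print(json.dumps(args, indent=2))
```

Note that `x`, `y`, and `color` refer to columns of the query result, so each name has to appear in the `SELECT` list (or be produced by an alias).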
Implementation Reference
- zaturn/tools/visualizations.py:51-84 (handler) The handler function that runs an SQL query on a data source, generates a scatter plot using Plotly Express with specified x, y, and optional color columns, and returns a base64-encoded PNG image or error string.

  ```python
  def scatter_plot(self,
      source_id: Annotated[
          str, Field(description='The data source to run the query on')
      ],
      query: Annotated[
          str, Field(description='SQL query to run on the data source')
      ],
      x: Annotated[
          str, Field(description='Column name from SQL result to use for x-axis')
      ],
      y: Annotated[
          str, Field(description='Column name from SQL result to use for y-axis')
      ],
      color: Annotated[
          str | None, Field(description='Optional; column name from SQL result to use for coloring the points, with color representing another dimension')
      ] = None,
  ) -> str | ImageContent:
      """
      Run query against specified source and make a scatter plot using result
      For both csv and parquet sources, use DuckDB SQL syntax
      Use 'CSV' as the table name in the SQL query for csv sources.
      Use 'PARQUET' as the table name in the SQL query for parquet sources.

      This will return an image of the plot
      """

      try:
          df = self._get_df_from_source(source_id, query)
          fig = px.scatter(df, x=x, y=y, color=color)
          fig.update_xaxes(autotickangles=[0, 45, 60, 90])
          return _fig_to_image(fig)
      except Exception as e:
          return str(e)
  ```

- zaturn/tools/visualizations.py:51-67 (schema) Input schema defined via Annotated types and Field descriptions for source_id, query, x, y, and optional color parameters.

  ```python
  def scatter_plot(self,
      source_id: Annotated[
          str, Field(description='The data source to run the query on')
      ],
      query: Annotated[
          str, Field(description='SQL query to run on the data source')
      ],
      x: Annotated[
          str, Field(description='Column name from SQL result to use for x-axis')
      ],
      y: Annotated[
          str, Field(description='Column name from SQL result to use for y-axis')
      ],
      color: Annotated[
          str | None, Field(description='Optional; column name from SQL result to use for coloring the points, with color representing another dimension')
      ] = None,
  ) -> str | ImageContent:
  ```

- zaturn/mcp/__init__.py:92-93 (registration) Registers all tool functions from ZaturnTools, including scatter_plot, into the FastMCP server using Tool.from_function.

  ```python
  for tool_function in zaturn_tools.tools:
      zaturn_mcp.add_tool(Tool.from_function(tool_function))
  ```

- zaturn/tools/visualizations.py:29-40 (registration) Adds the scatter_plot method to the list of tools in the Visualizations class.

  ```python
  self.tools = [
      self.scatter_plot,
      self.line_plot,
      self.histogram,
      self.strip_plot,
      self.box_plot,
      self.bar_plot,
      self.density_heatmap,
      self.polar_scatter,
      self.polar_line,
  ]
  ```

- zaturn/tools/visualizations.py:13-22 (helper) Helper function to convert a Plotly figure to a base64-encoded ImageContent object for MCP response.

  ```python
  def _fig_to_image(fig):
      fig_encoded = b64encode(fig.to_image(format='png')).decode()
      img_b64 = "data:image/png;base64," + fig_encoded
      return ImageContent(
          type = 'image',
          data = fig_encoded,
          mimeType = 'image/png',
          annotations = None,
      )
  ```

- zaturn/tools/visualizations.py:43-49 (helper) Helper function to retrieve and execute SQL query on the specified data source, returning a DataFrame.

  ```python
  def _get_df_from_source(self, source_id, query):
      source = self.data_sources.get(source_id)
      if not source:
          raise Exception(f"Source {source_id} Not Found")
      return query_utils.execute_query(source, query)
  ```
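
Because the handler returns either an image object or a plain error string, a caller has to tell the two apart. The sketch below shows one possible way to do that using only the standard library. `ImageResult` is a stand-in dataclass that mirrors the `type`, `data`, and `mimeType` fields set in `_fig_to_image`; it, `decode_result`, and the fake payload are assumptions for illustration and not Zaturn or MCP APIs.

```python
import base64
from dataclasses import dataclass

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


@dataclass
class ImageResult:
    """Stand-in for the image object built by _fig_to_image (assumed shape)."""
    type: str
    data: str  # base64-encoded image bytes
    mimeType: str


def decode_result(result):
    """Return raw PNG bytes, or raise if the tool returned an error string."""
    if isinstance(result, str):
        # The handler converts any exception into its message text.
        raise RuntimeError(f"scatter_plot failed: {result}")
    png = base64.b64decode(result.data)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return png


# Simulated success: a fake payload that only carries the PNG signature.
fake = ImageResult(
    type="image",
    data=base64.b64encode(PNG_SIGNATURE + b"demo").decode(),
    mimeType="image/png",
)
png_bytes = decode_result(fake)
print(len(png_bytes), "bytes decoded")
```

A string result, such as the "Source ... Not Found" message raised in `_get_df_from_source`, would take the `RuntimeError` branch instead.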