Skip to main content
Glama
K02D

MCP Tabular Data Analysis Server

by K02D

merge_datasets

Combine two tabular datasets by joining them on specified columns to create unified data for analysis.

Instructions

Merge/join two datasets together - essential for combining data sources. Args: file_path_left: Path to left/primary dataset file_path_right: Path to right/secondary dataset on: Column(s) to join on (if same name in both datasets) left_on: Column name in left dataset to join on right_on: Column name in right dataset to join on how: Join type - 'inner', 'left', 'right', 'outer' preview_limit: Number of rows to return in preview Returns: Dictionary containing merged data preview and statistics

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
file_path_leftYes
file_path_rightYes
onNo
left_onNo
right_onNo
howNoinner
preview_limitNo

Implementation Reference

  • The core handler function for the 'merge_datasets' tool. Decorated with @mcp.tool() for automatic registration in the FastMCP server. Loads two datasets using _load_data helper, performs merge using pandas pd.merge with flexible join key specification, and returns merge statistics and a preview of the result.
    @mcp.tool() def merge_datasets( file_path_left: str, file_path_right: str, on: list[str] | None = None, left_on: str | None = None, right_on: str | None = None, how: str = "inner", preview_limit: int = 50, ) -> dict[str, Any]: """ Merge/join two datasets together - essential for combining data sources. Args: file_path_left: Path to left/primary dataset file_path_right: Path to right/secondary dataset on: Column(s) to join on (if same name in both datasets) left_on: Column name in left dataset to join on right_on: Column name in right dataset to join on how: Join type - 'inner', 'left', 'right', 'outer' preview_limit: Number of rows to return in preview Returns: Dictionary containing merged data preview and statistics """ df_left = _load_data(file_path_left) df_right = _load_data(file_path_right) valid_how = ['inner', 'left', 'right', 'outer'] if how not in valid_how: raise ValueError(f"Unknown join type: {how}. Use: {valid_how}") # Determine join keys if on: invalid_left = [c for c in on if c not in df_left.columns] invalid_right = [c for c in on if c not in df_right.columns] if invalid_left: raise ValueError(f"Columns {invalid_left} not found in left dataset") if invalid_right: raise ValueError(f"Columns {invalid_right} not found in right dataset") merged = pd.merge(df_left, df_right, on=on, how=how) elif left_on and right_on: if left_on not in df_left.columns: raise ValueError(f"Column '{left_on}' not found in left dataset") if right_on not in df_right.columns: raise ValueError(f"Column '{right_on}' not found in right dataset") merged = pd.merge(df_left, df_right, left_on=left_on, right_on=right_on, how=how) else: # Try to find common columns common_cols = list(set(df_left.columns) & set(df_right.columns)) if not common_cols: raise ValueError("No common columns found. Specify 'on', or 'left_on' and 'right_on'") on = common_cols[:1] # Use first common column merged = pd.merge(df_left, df_right, on=on, how=how) # Statistics about the merge left_rows = len(df_left) right_rows = len(df_right) merged_rows = len(merged) merge_stats = { "left_rows": left_rows, "right_rows": right_rows, "merged_rows": merged_rows, "join_type": how, "join_keys": on if on else {"left": left_on, "right": right_on}, } if how == "inner": merge_stats["left_match_pct"] = round(merged_rows / left_rows * 100, 1) if left_rows > 0 else 0 merge_stats["right_match_pct"] = round(merged_rows / right_rows * 100, 1) if right_rows > 0 else 0 return { "merge_stats": merge_stats, "merged_columns": merged.columns.tolist(), "merged_shape": {"rows": merged_rows, "columns": len(merged.columns)}, "preview": merged.head(preview_limit).to_dict(orient="records"), "has_more": merged_rows > preview_limit, }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/K02D/mcp-tabular'

If you have feedback or need assistance with the MCP directory API, please join our Discord server