get_data_sample

Extract a data sample from files to preview content and structure. Specify file path and row count to retrieve JSON-formatted sample data for analysis.

Instructions

Get a sample of data from a file.

Args:
    file_path: Path to the data file
    rows: Number of rows to return (default: 10)

Returns:
    Sample data in JSON format
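
Because the tool returns a string in both the success and failure case, a caller has to tell the two apart. The sketch below shows one way to do that on the client side, based on the key names and the error prefix that appear in the implementation listed on this page. The helper name `parse_sample_response` is hypothetical and is not part of the server.

```python
import json


def parse_sample_response(text: str) -> dict:
    """Parse the string returned by get_data_sample (hypothetical client helper)."""
    # On failure the tool returns plain text with this prefix instead of JSON.
    if text.startswith("Error getting data sample:"):
        raise RuntimeError(text.splitlines()[0])
    payload = json.loads(text)
    # Keys the tool writes into its result on success.
    expected = {"filename", "total_rows", "total_columns",
                "sample_rows", "columns", "data"}
    missing = expected - payload.keys()
    if missing:
        raise ValueError(f"unexpected response, missing keys: {sorted(missing)}")
    return payload
```

Only the first line of an error message is surfaced here; the remainder of the string is the traceback text the tool appends.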

Input Schema

Name       Required  Description                Default
file_path  Yes       Path to the data file
rows       No        Number of rows to return   10
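
As a rough illustration of what this schema implies for a caller, the following sketch checks a raw arguments dictionary and applies the default for `rows`. The function name `normalize_arguments` is hypothetical; the tool itself does no such checking in its body, and any validation the MCP framework performs may differ from this.

```python
def normalize_arguments(arguments: dict) -> dict:
    """Check a raw arguments dict against the input schema (hypothetical helper)."""
    if "file_path" not in arguments:
        raise ValueError("file_path is required")
    file_path = arguments["file_path"]
    if not isinstance(file_path, str):
        raise TypeError("file_path must be a string")
    rows = arguments.get("rows", 10)  # default taken from the tool signature
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(rows, bool) or not isinstance(rows, int):
        raise TypeError("rows must be an integer")
    return {"file_path": file_path, "rows": rows}
```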

Implementation Reference

  • Main implementation of get_data_sample tool that loads data using pandas and returns a sample of N rows in JSON format. Handles multiple file types (CSV, JSON, Excel, TSV) and properly converts numpy/pandas types for JSON serialization.
    @mcp.tool()
    def get_data_sample(file_path: str, rows: int = 10) -> str:
        """
        Get a sample of data from a file.
        
        Args:
            file_path: Path to the data file
            rows: Number of rows to return (default: 10)
        
        Returns:
            Sample data in JSON format
        """
        try:
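            # json and traceback are imported first so the except block
            # below still works if importing pandas fails.
            import json
            import traceback
            from pathlib import Path

            import pandas as pd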
            
            file_extension = Path(file_path).suffix.lower()
            
            # Load with pandas
            if file_extension == '.csv':
                df = pd.read_csv(file_path)
            elif file_extension == '.json':
                df = pd.read_json(file_path)
            elif file_extension in ['.xlsx', '.xls']:
                df = pd.read_excel(file_path)
            elif file_extension == '.tsv':
                df = pd.read_csv(file_path, sep='\t')
            else:
                # Unknown extension: fall back to parsing the file as CSV
                df = pd.read_csv(file_path)
            
            # Get sample rows
            sample_df = df.head(rows)
            
            # Convert to records for JSON serialization
            sample_data = []
            for _, row in sample_df.iterrows():
                row_data = {}
                for col in df.columns:
                    value = row[col]
                    # Handle pandas/numpy types for JSON serialization
                    if pd.isna(value):
                        row_data[col] = None
                    elif hasattr(value, 'item'):  # numpy types
                        row_data[col] = value.item()
                    else:
                        row_data[col] = str(value) if value is not None else None
                sample_data.append(row_data)
            
            result = {
                "filename": Path(file_path).name,
                "total_rows": len(df),
                "total_columns": len(df.columns),
                "sample_rows": len(sample_data),
                "columns": list(df.columns),
                "data": sample_data
            }
            
            return json.dumps(result, indent=2)
            
        except Exception as e:
            return f"Error getting data sample: {str(e)}\n{traceback.format_exc()}"
  • Tool registration via the @mcp.tool() decorator, which registers get_data_sample as an MCP tool in the FastMCP server instance.
    @mcp.tool()
  • Function signature and docstring defining the input/output schema: takes file_path (str) and optional rows (int=10), returns JSON string with sample data including filename, row/column counts, and actual data rows.
    def get_data_sample(file_path: str, rows: int = 10) -> str:
        """
        Get a sample of data from a file.
        
        Args:
            file_path: Path to the data file
            rows: Number of rows to return (default: 10)
        
        Returns:
            Sample data in JSON format
        """

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/moeloubani/visidata-mcp'
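
The same request can be sketched in Python using only the standard library. The helper names `server_url` and `fetch_server` are hypothetical, the owner/name URL pattern is inferred from the single URL in the curl command, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the server URL (pattern inferred from the curl example)."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Fetch one server record; assumes the API responds with a JSON body."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("moeloubani", "visidata-mcp")` issues the same GET request as the curl command.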

If you have feedback or need assistance with the MCP directory API, please join our Discord server.