StarRocks MCP Server

Official

db_overview

Retrieve an overview of all tables in a StarRocks database, including columns, sample rows, and row counts. Uses the cache unless a refresh is requested.

Instructions

Get an overview (columns, sample rows, row count) for ALL tables in a database. Uses cache unless refresh=True

Input Schema

Name | Required | Description | Default
db | No | Database name. Optional: uses the default database if not provided. |
refresh | No | Set to true to force refresh, ignoring cache. Defaults to false. |
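
As a rough illustration of the schema above, an MCP client would pass these two parameters in the arguments of a JSON-RPC `tools/call` request. The sketch below only builds the payload; the helper name `build_db_overview_call`, the request id, and the database name "analytics" are placeholders chosen for the example, and whether the call succeeds depends on the server actually registering the tool.

```python
import json


def build_db_overview_call(db=None, refresh=False, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' payload for the db_overview tool."""
    arguments = {"refresh": refresh}
    # 'db' is optional: omit it to let the server fall back to its default database.
    if db is not None:
        arguments["db"] = db
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "db_overview", "arguments": arguments},
    }


# "analytics" is a hypothetical database name.
payload = build_db_overview_call(db="analytics", refresh=True)
print(json.dumps(payload, indent=2))
```

Omitting both arguments yields a request that relies on the default database and the cached overviews.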

Implementation Reference

  • The handler function implementing the 'db_overview' tool. It lists all tables in the specified database (or the default one), obtains an overview for each table from the cache or from the _get_table_details helper, and joins them into a single string response. Note: the @mcp.tool decorator is commented out (line 455), so the tool may not be actively registered.
    def db_overview(
            db: Annotated[str, Field(
                description="Database name. Optional: uses the default database if not provided.")] = None,
            refresh: Annotated[
                bool, Field(description="Set to true to force refresh, ignoring cache. Defaults to false.")] = False
    ) -> str:
        try:
            db_name = db if db else db_client.default_database
            logger.info(f"Getting database overview for: {db_name}, refresh={refresh}")
            if not db_name:
                logger.error("Database overview called without database name")
                return "Error: Database name not provided and no default database is set."
    
            # List tables in the database
            query = f"SHOW TABLES FROM `{db_name}`"
            result = db_client.execute(query, db=db_name)
    
            if not result.success:
                logger.error(f"Failed to list tables in database {db_name}: {result.error_message}")
                return f"Database Error listing tables in '{db_name}': {result.error_message}"
    
            if not result.rows:
                logger.info(f"No tables found in database {db_name}")
                return f"No tables found in database '{db_name}'."
    
            tables = [row[0] for row in result.rows]
            logger.info(f"Found {len(tables)} tables in database {db_name}")
            all_overviews = [f"--- Overview for Database: `{db_name}` ({len(tables)} tables) ---"]
    
            total_length = 0
            limit_per_table = overview_length_limit * (math.log10(len(tables)) + 1) // len(tables)  # Limit per table
            for table_name in tables:
                cache_key = (db_name, table_name)
                overview_text = None
    
                # Check cache first
                if not refresh and cache_key in global_table_overview_cache:
                    logger.debug(f"Using cached overview for {db_name}.{table_name}")
                    overview_text = global_table_overview_cache[cache_key]
                else:
                    logger.debug(f"Fetching fresh overview for {db_name}.{table_name}")
                    # Fetch details for this table (will update cache via _get_table_details)
                    overview_text = _get_table_details(db_name, table_name, limit=limit_per_table)
    
                all_overviews.append(overview_text)
                all_overviews.append("\n")  # Add separator
                total_length += len(overview_text) + 1
    
            logger.info(f"Database overview completed for {db_name}, total length: {total_length}")
            return "\n".join(all_overviews)
    
        except Exception as e:
            # Catch any other unexpected errors during tool execution
            logger.exception(f"Unexpected error in db_overview for database {db}")
            reset_db_connections()
            stack_trace = traceback.format_exc()
            return f"Unexpected Error executing tool 'db_overview': {type(e).__name__}: {e}\nStack Trace:\n{stack_trace}"
  • Supporting helper function called by db_overview (and table_overview) to generate a detailed overview for a single table: row count, DESCRIBE columns, and sample rows (LIMIT 3). Columns and sample rows are fetched only when the row count is greater than zero. Updates the global cache.
    def _get_table_details(db_name, table_name, limit=None):
        """
        Helper function to get description, sample rows, and count for a table.
        Returns a formatted string. Handles DB errors internally and returns error messages.
        """
        global global_table_overview_cache
        logger.debug(f"Fetching table details for {db_name}.{table_name}")
        output_lines = []
    
        full_table_name = f"`{table_name}`"
        if db_name:
            full_table_name = f"`{db_name}`.`{table_name}`"
        else:
            output_lines.append(
                f"Warning: Database name missing for table '{table_name}'. Using potentially incorrect context.")
            logger.warning(f"Database name missing for table '{table_name}'")
    
        count = 0
        output_lines.append(f"--- Overview for {full_table_name} ---")
    
        # 1. Get Row Count
        query = f"SELECT COUNT(*) FROM {full_table_name}"
        count_result = db_client.execute(query, db=db_name)
        if count_result.success and count_result.rows:
            count = count_result.rows[0][0]
            output_lines.append(f"\nTotal rows: {count}")
            logger.debug(f"Table {full_table_name} has {count} rows")
        else:
            output_lines.append(f"\nCould not determine total row count.")
            if not count_result.success:
                output_lines.append(f"Error: {count_result.error_message}")
                logger.error(f"Failed to get row count for {full_table_name}: {count_result.error_message}")
    
        # 2. Get Columns (DESCRIBE)
        if count > 0:
            query = f"DESCRIBE {full_table_name}"
            desc_result = db_client.execute(query, db=db_name)
            if desc_result.success and desc_result.column_names and desc_result.rows:
                output_lines.append(f"\nColumns:")
                output_lines.append(desc_result.to_string(limit=limit))
            else:
                output_lines.append("(Could not retrieve column information or table has no columns).")
                if not desc_result.success:
                    output_lines.append(f"Error getting columns for {full_table_name}: {desc_result.error_message}")
                    return "\n".join(output_lines)
    
            # 3. Get Sample Rows (LIMIT 3)
            query = f"SELECT * FROM {full_table_name} LIMIT 3"
            sample_result = db_client.execute(query, db=db_name)
            if sample_result.success and sample_result.column_names and sample_result.rows:
                output_lines.append(f"\nSample rows (limit 3):")
                output_lines.append(sample_result.to_string(limit=limit))
            else:
                output_lines.append(f"(No rows found in {full_table_name}).")
                if not sample_result.success:
                    output_lines.append(f"Error getting sample rows for {full_table_name}: {sample_result.error_message}")
    
        overview_string = "\n".join(output_lines)
        # Update cache even if there were partial errors, so we cache the error message too
        cache_key = (db_name, table_name)
        global_table_overview_cache[cache_key] = overview_string
        return overview_string
  • Commented-out registration decorator for the db_overview tool using @mcp.tool. This indicates the tool is not actively registered; db_summary appears to be preferred instead.
    #@mcp.tool(description="Get an overview (columns, sample rows, row count) for ALL tables in a database. Uses cache unless refresh=True" + description_suffix)
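
The per-table budget computed in the handler, `overview_length_limit * (math.log10(len(tables)) + 1) // len(tables)`, is easier to follow with numbers. The sketch below reproduces that expression in isolation. The value 20000 for `overview_length_limit` is an assumption made purely for illustration; the real constant is defined elsewhere in the module and is not shown in this excerpt.

```python
import math


def per_table_limit(overview_length_limit, table_count):
    """Same expression as in db_overview; note that it yields a float."""
    return overview_length_limit * (math.log10(table_count) + 1) // table_count


ASSUMED_LIMIT = 20000  # hypothetical value, not taken from the source

for n in (1, 10, 100):
    # The overall budget grows with log10(n), while each table's share shrinks.
    print(n, per_table_limit(ASSUMED_LIMIT, n))
```

Because `math.log10` returns a float, the result is a float even after floor division. Whether `to_string(limit=...)` accepts a float is not visible in this excerpt, so a caller that needs an integer may want to wrap the result in `int()`.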
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: the tool retrieves metadata (columns, sample rows, row count), operates on all tables, and uses caching with an option to force refresh. This covers the core functionality and performance aspect, though it could add more context like response format or error handling.
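
The caching behavior described here can be sketched as a lookup keyed by (database, table) that is bypassed when refresh is true. This is a simplified stand-in rather than the server's code: `fetch_overview`, `get_overview`, and the names "demo" and "orders" are hypothetical, and the stub returns a fixed string instead of querying StarRocks.

```python
overview_cache = {}
fetch_calls = []


def fetch_overview(db_name, table_name):
    """Hypothetical stand-in for the real per-table helper; it also fills the cache."""
    fetch_calls.append((db_name, table_name))
    text = f"--- Overview for `{db_name}`.`{table_name}` ---"
    overview_cache[(db_name, table_name)] = text
    return text


def get_overview(db_name, table_name, refresh=False):
    key = (db_name, table_name)
    # Serve from the cache unless the caller explicitly asks for a refresh.
    if not refresh and key in overview_cache:
        return overview_cache[key]
    return fetch_overview(db_name, table_name)


get_overview("demo", "orders")                # cache miss: fetches
get_overview("demo", "orders")                # cache hit: no fetch
get_overview("demo", "orders", refresh=True)  # forced refresh: fetches again
print(len(fetch_calls))  # 2
```

One consequence worth noting: the helper in the implementation reference caches its output even when a query partially fails, so a cached overview can contain an old error message until a refresh is requested.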

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the main purpose in the first sentence and adds a crucial behavioral note in the second. Both sentences earn their place by providing essential information without redundancy, making it highly efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no annotations, no output schema), the description is mostly complete. It covers what the tool does, its scope, and caching behavior. However, it lacks details on the output format (e.g., structure of the overview data) and potential limitations, which would be helpful since there's no output schema.
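
Since there is no output schema, the shape of the response has to be inferred from the handler: a header line, then each table's overview followed by a separator entry, all joined with newlines. A minimal sketch of that assembly follows, using placeholder overview strings instead of real query output; the function name `assemble_overview` and the database name "demo" are made up for the example.

```python
def assemble_overview(db_name, table_overviews):
    """Mirror the handler's string assembly with placeholder per-table texts."""
    parts = [f"--- Overview for Database: `{db_name}` ({len(table_overviews)} tables) ---"]
    for text in table_overviews:
        parts.append(text)
        parts.append("\n")  # separator entry, as in the handler
    return "\n".join(parts)


report = assemble_overview("demo", ["<overview of t1>", "<overview of t2>"])
print(report)
```

The result is a single plain-text string rather than JSON, so a client should treat it as unstructured text. Each table's block is followed by two blank lines, because the separator entry is itself a newline and is joined with newlines on both sides.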

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the input schema already documents both parameters ('db' and 'refresh') thoroughly. The description adds minimal value beyond the schema by mentioning the cache behavior tied to 'refresh', but it does not provide additional semantic context or usage examples for the parameters. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get an overview'), the resource ('ALL tables in a database'), and the scope of information provided ('columns, sample rows, row count'). It distinguishes itself from sibling tools like 'table_overview' by explicitly covering all tables rather than a single table.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use this tool (to get an overview of all tables) and includes a specific usage note about the cache behavior ('Uses cache unless refresh=True'). However, it does not explicitly state when not to use it or name alternatives among the sibling tools, such as when a user might prefer 'table_overview' for a single table.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/StarRocks/mcp-server-starrocks'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.