
PostgreSQL-Performance-Tuner-Mcp

analyze_table_bloat

Read-only · Idempotent

Identify PostgreSQL table bloat by analyzing dead tuples and free space to determine when VACUUM operations are needed for performance optimization and disk space reclamation.

Instructions

Analyze table bloat using the pgstattuple extension.

Note: This tool analyzes only user/client tables and excludes PostgreSQL system tables (pg_catalog, information_schema, pg_toast). This focuses the analysis on your application's custom tables.

Uses pgstattuple to get accurate tuple-level statistics including:

  • Dead tuple count and percentage

  • Free space within the table

  • Physical vs logical table size

This helps identify tables that:

  • Need VACUUM to make dead-tuple space reusable within the table

  • Need VACUUM FULL to return reclaimed disk space to the operating system

  • Have high bloat affecting performance

Requires the pgstattuple extension to be installed: CREATE EXTENSION IF NOT EXISTS pgstattuple;

Note: pgstattuple performs a full table scan, so use it with caution on large tables; for those, consider pgstattuple_approx (use_approx=true) instead.
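
For orientation, the statistics behind this tool come from plain SQL functions that can also be queried directly. Below is a minimal sketch, assuming asyncpg as the client library and an existing public.orders table; neither the connection string nor the table name comes from this server's code:

    import asyncio
    import asyncpg

    async def main():
        conn = await asyncpg.connect("postgresql://localhost/mydb")
        try:
            # Exact statistics: pgstattuple scans every page of the table.
            exact = await conn.fetchrow(
                "SELECT * FROM pgstattuple('public.orders')"
            )
            print(exact["dead_tuple_percent"], exact["free_percent"])

            # Approximate statistics: pgstattuple_approx skips pages marked
            # all-visible in the visibility map, so it is far cheaper on
            # large, mostly-static tables.
            approx = await conn.fetchrow(
                "SELECT * FROM pgstattuple_approx('public.orders')"
            )
            print(approx["dead_tuple_percent"], approx["approx_free_percent"])
        finally:
            await conn.close()

    asyncio.run(main())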

Input Schema

Name              | Required | Description                                                                               | Default
table_name        | No       | Name of the table to analyze (required if not using schema-wide scan)                    |
schema_name       | No       | Schema name (default: public)                                                            | public
use_approx        | No       | Use pgstattuple_approx for faster but approximate results (recommended for large tables) | false
min_table_size_gb | No       | Minimum table size in GB to include in schema-wide scan (default: 5)                     | 5
include_toast     | No       | Include TOAST table analysis if applicable                                               | false
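
For example, the two calling modes accept argument payloads along these lines (the table and schema names are hypothetical):

    # Single-table analysis, using the cheaper approximate scanner:
    single_table_args = {
        "table_name": "orders",
        "schema_name": "public",
        "use_approx": True,
    }

    # Schema-wide scan: omit table_name and filter by table size instead.
    schema_scan_args = {
        "schema_name": "sales",
        "min_table_size_gb": 2,
    }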

Implementation Reference

  • Core handler logic: checks for the pgstattuple extension, analyzes a single table or an entire schema for bloat using pgstattuple/pgstattuple_approx, and formats the results as JSON
    async def run_tool(self, arguments: dict[str, Any]) -> Sequence[TextContent]:
        try:
            table_name = arguments.get("table_name")
            schema_name = arguments.get("schema_name", "public")
            use_approx = arguments.get("use_approx", False)
            min_size_gb = arguments.get("min_table_size_gb", 5)
            include_toast = arguments.get("include_toast", False)
    
            # Check if pgstattuple extension is available
            ext_check = await self._check_extension()
            if not ext_check["available"]:
                return self.format_result(
                    "Error: pgstattuple extension is not installed.\n"
                    "Install it with: CREATE EXTENSION IF NOT EXISTS pgstattuple;\n\n"
                    "Note: You may need superuser privileges or the pg_stat_scan_tables role."
                )
    
            if table_name:
                # Analyze specific table
                result = await self._analyze_single_table(
                    schema_name, table_name, use_approx, include_toast
                )
            else:
                # Analyze all tables in schema
                result = await self._analyze_schema_tables(
                    schema_name, use_approx, min_size_gb
                )
    
            return self.format_json_result(result)
    
        except Exception as e:
            return self.format_error(e)
  • Input schema definition, including parameters for table/schema selection, approx mode, and size filtering
    def get_tool_definition(self) -> Tool:
        return Tool(
            name=self.name,
            description=self.description,
            inputSchema={
                "type": "object",
                "properties": {
                    "table_name": {
                        "type": "string",
                        "description": "Name of the table to analyze (required if not using schema-wide scan)"
                    },
                    "schema_name": {
                        "type": "string",
                        "description": "Schema name (default: public)",
                        "default": "public"
                    },
                    "use_approx": {
                        "type": "boolean",
                        "description": "Use pgstattuple_approx for faster but approximate results (recommended for large tables)",
                        "default": False
                    },
                    "min_table_size_gb": {
                        "type": "number",
                        "description": "Minimum table size in GB to include in schema-wide scan (default: 5)",
                        "default": 5
                    },
                    "include_toast": {
                        "type": "boolean",
                        "description": "Include TOAST table analysis if applicable",
                        "default": False
                    }
                },
                "required": []
            },
            annotations=self.get_annotations()
        )
  • Registration of TableBloatToolHandler (analyze_table_bloat) in the register_all_tools() function, which is called on server startup
    # Bloat detection tools (using pgstattuple extension)
    add_tool_handler(TableBloatToolHandler(driver))
    add_tool_handler(IndexBloatToolHandler(driver))
    add_tool_handler(DatabaseBloatSummaryToolHandler(driver))
  • TableBloatToolHandler class definition with tool name, description, and annotations
    class TableBloatToolHandler(ToolHandler):
        """Tool handler for analyzing table bloat using pgstattuple."""
    
        name = "analyze_table_bloat"
  • Helper method to determine bloat severity levels and issues based on pgstattuple metrics (dead tuples, free space, tuple density)
    def _get_bloat_severity(
        self,
        dead_tuple_percent: float,
        free_percent: float,
        tuple_percent: float
    ) -> dict[str, Any]:
        """
        Determine bloat severity based on pgstattuple metrics.
    
        Rules based on pgstattuple best practices:
        - dead_tuple_percent > 10% → Autovacuum is lagging
        - free_percent > 20% → Page fragmentation (consider VACUUM FULL/CLUSTER)
        - tuple_percent < 70% → Heavy table bloat (VACUUM FULL likely needed)
        """
        severity_result = {
            "overall_severity": "minimal",
            "dead_tuple_status": "normal",
            "free_space_status": "normal",
            "tuple_density_status": "normal",
            "issues": []
        }
    
        severity_score = 0
    
        # Rule 1: Dead tuple percentage check (autovacuum lag indicator)
        if dead_tuple_percent > 30:
            severity_result["dead_tuple_status"] = "critical"
            severity_result["issues"].append(
                f"Dead tuple percent ({dead_tuple_percent:.1f}%) is critical (>30%). "
                "Manual VACUUM recommended."
            )
            severity_score += 3
        elif dead_tuple_percent > 10:
            severity_result["dead_tuple_status"] = "warning"
            severity_result["issues"].append(
                f"Dead tuple percent ({dead_tuple_percent:.1f}%) indicates autovacuum lag (>10%). "
                "Tune autovacuum settings."
            )
            severity_score += 2
    
        # Rule 2: Free space percentage check (fragmentation indicator)
        if free_percent > 30:
            severity_result["free_space_status"] = "critical"
            severity_result["issues"].append(
                f"Free space ({free_percent:.1f}%) is very high (>30%). "
                "Consider VACUUM FULL, CLUSTER, or pg_repack."
            )
            severity_score += 3
        elif free_percent > 20:
            severity_result["free_space_status"] = "warning"
            severity_result["issues"].append(
                f"Free space ({free_percent:.1f}%) indicates page fragmentation (>20%). "
                "Consider VACUUM FULL or CLUSTER."
            )
            severity_score += 2
    
        # Rule 3: Tuple percent check (live data density)
        if tuple_percent < 50:
            severity_result["tuple_density_status"] = "critical"
            severity_result["issues"].append(
                f"Tuple density ({tuple_percent:.1f}%) is critically low (<50%). "
                "Only {:.1f}% of table contains real data. VACUUM FULL strongly recommended.".format(tuple_percent)
            )
            severity_score += 3
        elif tuple_percent < 70:
            severity_result["tuple_density_status"] = "warning"
            severity_result["issues"].append(
                f"Tuple density ({tuple_percent:.1f}%) is low (<70%). "
                "Heavy bloat detected. VACUUM FULL likely needed."
            )
            severity_score += 2
    
        # Determine overall severity
        if severity_score >= 6:
            severity_result["overall_severity"] = "critical"
        elif severity_score >= 4:
            severity_result["overall_severity"] = "high"
        elif severity_score >= 2:
            severity_result["overall_severity"] = "moderate"
        elif severity_score >= 1:
            severity_result["overall_severity"] = "low"
        else:
            severity_result["overall_severity"] = "minimal"
    
        return severity_result
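
As a worked example of the scoring above (the metric values are hypothetical, and handler is assumed to be an instantiated TableBloatToolHandler), three simultaneous warning-level readings are enough to cross the critical threshold:

    # Each metric lands in its warning band, contributing +2 to the score:
    #   dead_tuple_percent = 15.0 -> "warning" (> 10%)
    #   free_percent       = 25.0 -> "warning" (> 20%)
    #   tuple_percent      = 65.0 -> "warning" (< 70%)
    # The total score of 6 meets the >= 6 cutoff, so the overall severity
    # is reported as "critical" even though no single metric is critical.
    result = handler._get_bloat_severity(
        dead_tuple_percent=15.0,
        free_percent=25.0,
        tuple_percent=65.0,
    )
    assert result["overall_severity"] == "critical"
    assert len(result["issues"]) == 3
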
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, destructiveHint=false, and idempotentHint=true, covering safety and idempotency. The description adds valuable context beyond this: it warns about performance impact (full table scan on large tables), mentions the need for extension installation, and clarifies scope (excludes system tables). No contradictions with annotations exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, scope, statistics, use cases, prerequisites, and cautions). It's appropriately detailed for a complex tool, if slightly verbose; even so, every sentence adds value, such as the notes on system-table exclusion and performance impact.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, no output schema) and rich annotations, the description is largely complete. It covers purpose, usage, behavioral traits, and context. However, it doesn't detail the output format or example results, which could be helpful since there's no output schema. This minor gap prevents a perfect score.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the input schema fully documents all 5 parameters. The description adds some semantic context by explaining the purpose of use_approx (for large tables) and the focus on user tables, but it doesn't provide additional parameter details beyond what's in the schema. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes table bloat using the pgstattuple extension, specifying it focuses on user/client tables while excluding system tables. It distinguishes itself from sibling tools like 'get_bloat_summary' or 'get_table_stats' by emphasizing tuple-level statistics and bloat analysis for maintenance decisions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool (for identifying tables needing VACUUM or VACUUM FULL due to bloat) and when to consider alternatives (using pgstattuple_approx for large tables via use_approx=true). It also mentions prerequisites (pgstattuple extension installation) and cautions about full table scans on large tables.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
