Glama

data_quality_report

Analyze data quality across schemas and tables to identify nulls, empty tables, outliers, soft deletes, and type inconsistencies.

Instructions

Comprehensive data quality analysis - nulls, types, empty tables, outliers, soft deletes.

LEVEL: Database ↔ Schema ↔ Table ↔ Column (multi-level tool)

  • schema='all': Database level - quality analysis for ALL schemas

  • schema='<name>': Schema level - all tables in that schema (supports ANY schema name: 'sales', 'billing', 'auth', 'analytics', etc.)

  • table='users': Table level - specific table analysis

  • outlier_column='age': Column level - outlier detection for specific column

REQUIRED: Specify schema explicitly - use 'all' for all schemas or a specific schema name.

USE FOR: data quality, data profiling, finding nulls, empty tables, outliers, soft deletes, cardinality analysis, "which columns have too many nulls?", data validation.

DO NOT USE FOR: finding duplicates (use duplicate_detection), security/PII scan (use sensitive_data_scan), schema structure (use get_schema).

INCLUDE OPTIONS:

  • 'all': Everything (default)

  • 'nulls': Null analysis - columns with high NULL percentages

  • 'cardinality': Cardinality analysis - unique value counts

  • 'empty': Empty tables - tables with zero rows

  • 'outliers': Outlier detection (requires table and outlier_column params)

  • 'soft_delete': Soft delete patterns - finds deleted_at, is_deleted columns

  • 'types': Data type consistency recommendations
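The listing does not say how these checks are computed. As a rough illustration only, the sketch below shows one plausible reading of the 'nulls' and 'outliers' options: a NULL percentage per column, and the common 1.5 x IQR rule for flagging outliers. The function names, the 1.5 multiplier, and the sample data are all assumptions, not the tool's actual implementation.

```python
import statistics


def null_percentages(rows):
    """Return {column: percentage of rows whose value is None}.

    `rows` is a list of dicts that share the same keys (hypothetical input shape).
    """
    if not rows:
        return {}
    total = len(rows)
    return {
        col: 100.0 * sum(1 for r in rows if r.get(col) is None) / total
        for col in rows[0].keys()
    }


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring None.

    The IQR rule is an assumed method; the tool may use a different one.
    """
    data = sorted(v for v in values if v is not None)
    if len(data) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in data if v < low or v > high]


# Made-up sample rows for a hypothetical 'users' table.
rows = [
    {"id": 1, "age": 31, "deleted_at": None},
    {"id": 2, "age": None, "deleted_at": None},
    {"id": 3, "age": 29, "deleted_at": "2024-01-01"},
    {"id": 4, "age": 35, "deleted_at": None},
]
print(null_percentages(rows))
print(iqr_outliers([29, 30, 31, 32, 33, 34, 35, 240, None]))
```

A high NULL percentage on a column such as deleted_at is expected for soft-delete columns, which is one reason the report treats soft deletes as a separate option from null analysis.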

Examples:

  • data_quality_report() - All tables in public schema (default)

  • data_quality_report(schema='all') - Database-wide analysis

  • data_quality_report(schema='billing') - All tables in billing schema

  • data_quality_report(include='nulls') - Only null analysis

  • data_quality_report(include='empty') - Only empty tables

  • data_quality_report(include='soft_delete') - Find soft delete patterns

  • data_quality_report(table='users', include='outliers', outlier_column='age') - Outlier detection
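An MCP client would normally send a call like these as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request body for the outlier example. The helper name `build_tool_call` is hypothetical, and the transport and client library are outside what this page describes.

```python
import json


def build_tool_call(arguments, request_id=1):
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "data_quality_report",
            "arguments": arguments,
        },
    }


# Equivalent of:
# data_quality_report(table='users', include='outliers', outlier_column='age')
request = build_tool_call(
    {"table": "users", "include": "outliers", "outlier_column": "age"}
)
print(json.dumps(request, indent=2))
```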

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| schema | No | Schema to analyze. Omit for all schemas, or specify one. Use get_schema() to list available. | |
| table | No | Optional table name | |
| include | No | What to include: 'all', 'nulls', 'cardinality', 'empty', 'outliers', 'soft_delete', 'types' | all |
| outlier_column | No | For outlier detection: column name | |
| format | No | Output format: 'json' or 'markdown' | json |
| url | No | Database URL for auto-connection | |
| summary_only | No | Return only summary counts and issues, not detailed lists | |
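The parameter constraints listed above can be checked on the client side before a call is sent. The following is a minimal sketch under stated assumptions: `normalize_arguments` is a hypothetical helper, the allowed values and the two defaults come from the table, and the rule that outlier detection needs both table and outlier_column comes from the INCLUDE OPTIONS list. Whether the server enforces the same rules is not stated on this page.

```python
VALID_INCLUDE = {
    "all", "nulls", "cardinality", "empty", "outliers", "soft_delete", "types",
}
VALID_FORMAT = {"json", "markdown"}


def normalize_arguments(schema=None, table=None, include="all",
                        outlier_column=None, format="json",
                        url=None, summary_only=None):
    """Validate arguments and return a dict with unset parameters dropped."""
    if include not in VALID_INCLUDE:
        raise ValueError(f"unknown include option: {include!r}")
    if format not in VALID_FORMAT:
        raise ValueError(f"unknown format: {format!r}")
    # Outlier detection is documented as requiring both parameters.
    if include == "outliers" and not (table and outlier_column):
        raise ValueError("include='outliers' requires table and outlier_column")
    args = {
        "schema": schema,
        "table": table,
        "include": include,
        "outlier_column": outlier_column,
        "format": format,
        "url": url,
        "summary_only": summary_only,
    }
    # Leave out anything the caller did not set.
    return {key: value for key, value in args.items() if value is not None}


print(normalize_arguments(schema="billing", include="nulls"))
```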

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses behavioral traits such as multi-level analysis, required schema specification, and include options. However, it does not state whether the tool is read-only, and it does not mention authentication or rate limits; covering these would strengthen transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with sections (LEVEL, REQUIRED, USE FOR, DO NOT USE FOR, INCLUDE OPTIONS, Examples) and front-loaded with the main purpose. It is slightly verbose but every sentence adds value; minor conciseness gains could be made by consolidating some repeated information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, multi-level, multiple include options), the description covers all necessary aspects: purpose, usage guidelines, parameter behavior, examples, and sibling differentiation. An output schema exists, so explanation of return values is not needed. The description is comprehensive enough for correct selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema description coverage, the tool description adds significant extra meaning: it explains the multi-level behavior of the 'schema' parameter, details each include option, and notes that outlier detection requires both the table and outlier_column parameters. This goes well beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs comprehensive data quality analysis covering nulls, types, empty tables, outliers, soft deletes, and more. It specifies the multi-level scope (database, schema, table, column) and explicitly distinguishes from sibling tools like duplicate_detection and sensitive_data_scan, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: USE FOR lists appropriate scenarios, DO NOT USE FOR lists exclusions with references to alternatives. It includes a REQUIRED section and examples demonstrating various parameter combinations, giving clear context for when and how to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/snss10/DBeast'
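The same request can be made from Python with only the standard library. This is a sketch: the helper names are hypothetical, and both the `Accept` header and the assumption that the endpoint returns JSON are guesses not confirmed by this page.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/snss10/DBeast"


def build_request(url=SERVER_URL):
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},  # assumed, not documented here
    )


def fetch_server_info(url=SERVER_URL, timeout=10):
    """Send the request and parse the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone is useful for inspecting the URL and method first.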

If you have feedback or need assistance with the MCP directory API, please join our Discord server