datosgobdo-mcp

by alcastaro

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools
{
  "listChanged": false
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

Name | Description
search_datasets (A)

Searches datasets on datos.gob.do (the open data portal of the Dominican Republic).

Filters by keyword, organization, tag, or thematic group. Returns summarized metadata: title, organization, available formats, and URL.

get_dataset (A)

Gets the full metadata of a dataset, including all of its downloadable resources.

Returns: title, description, organization, license, and the complete list of resources (CSV/XLSX/PDF/etc. files) with direct download URLs.

list_recent_datasets (A)

Lists the most recently modified datasets on datos.gob.do.

Useful for monitoring updates to the government portal. Returns hydrated metadata, not raw activity records.

get_resource (A)

Metadata for a specific resource (file): download URL, format, and size.

search_resources (A)

Searches resources (individual files) by name. Returns download URLs.

download_resource_preview (A)

Download a resource and return N rows with their column headers.

The datos.gob.do portal has no DataStore (no SQL), so this tool downloads the file and parses it client-side. Downloads are capped at 5 MB to avoid huge files. Useful for inspecting the structure of the data before deciding how to query it. For analytical queries on big files, use get_resource_schema + summarize_resource (v0.2) or aggregate_resource (v0.3+).
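As a rough illustration of client-side previewing, the sketch below parses an already-downloaded CSV payload and returns the header plus the first N rows. The function name `preview_csv`, its parameters, and the returned dict shape are hypothetical; only the 5 MB cap comes from the description above, and the server's actual parser may differ.

```python
import csv
import io

MAX_BYTES = 5 * 1024 * 1024  # the 5 MB cap mentioned in the tool description


def preview_csv(raw: bytes, n_rows: int = 5, max_bytes: int = MAX_BYTES) -> dict:
    """Return the header and the first n_rows data rows of a CSV payload.

    Hypothetical helper: the real tool's name, arguments, and output differ.
    """
    if len(raw) > max_bytes:
        raise ValueError(f"file is {len(raw)} bytes; cap is {max_bytes}")
    # utf-8-sig drops a byte-order mark if one is present.
    text = raw.decode("utf-8-sig")
    reader = csv.reader(io.StringIO(text, newline=""))
    header = next(reader, [])
    rows = []
    for row in reader:
        if len(rows) >= n_rows:
            break
        rows.append(row)
    return {"columns": header, "rows": rows}
```

Rejecting oversized payloads before parsing keeps the preview cheap; larger files are the case the description routes to the DuckDB-backed tools.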

get_resource_schema (A)

Return column names, inferred types, and sample values for a resource.

Cheap reconnaissance step. Downloads the file (up to 100 MB), opens it in DuckDB, and runs DESCRIBE plus per-column DISTINCT sampling. Does NOT return raw rows. Use this before summarize_resource or aggregate_resource so the model knows the column names and types.

summarize_resource (A)

Auto-generated profile: row count, types, nulls, distinct, min/max/mean, top values.

Downloads the file (up to 100 MB) and runs DuckDB COUNT/DISTINCT/aggregate queries per column. Returns one compact dict of stats per column. The model uses this to decide which filters and aggregations to apply next, without any raw rows in its context. For columns with many distinct values (e.g. names), 'top_values' is omitted; only counts are returned.
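The per-column profile can be pictured with a small pure-Python sketch. This is not the server's DuckDB implementation; `profile_column`, the `max_distinct` threshold of 20, and the key names other than 'top_values' are assumptions made for illustration.

```python
from collections import Counter
from statistics import fmean


def profile_column(values, top_n=3, max_distinct=20):
    """Summarize one column: row count, nulls, distinct, numeric stats, top values.

    Hypothetical helper; the threshold and key names are illustrative only.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    profile = {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(counts),
    }
    # min/max/mean only make sense for numeric columns (bool is excluded).
    numeric = present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if numeric:
        profile.update(min=min(present), max=max(present), mean=fmean(present))
    # Mirror the documented behavior: omit 'top_values' for high-cardinality columns.
    if len(counts) <= max_distinct:
        profile["top_values"] = counts.most_common(top_n)
    return profile
```

Omitting 'top_values' for columns such as names keeps the summary compact, which is the stated goal of keeping raw rows out of the model's context.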

filter_resource (A)

Run a typed WHERE/SELECT/ORDER BY/LIMIT against a cached resource.

The first call downloads the file (up to 100 MB) and caches it as Parquet at ~/.cache/datosgobdo-mcp/. Subsequent calls hit the cache (<1s). Returns the requested columns and matching rows (capped at limit) plus the total count of matching rows. Use this when you need actual records, not aggregates.
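One plausible way to key such a cache is to hash the resource URL into a file name under the cache directory. The sketch below assumes that scheme; `cache_path`, `is_cached`, and the 16-character SHA-256 prefix are hypothetical, and only the directory ~/.cache/datosgobdo-mcp/ and the Parquet format come from the description.

```python
import hashlib
from pathlib import Path

# Directory named in the tool description.
CACHE_DIR = Path.home() / ".cache" / "datosgobdo-mcp"


def cache_path(resource_url: str, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a resource URL to a stable Parquet path (assumed naming scheme)."""
    digest = hashlib.sha256(resource_url.encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"{digest}.parquet"


def is_cached(resource_url: str, cache_dir: Path = CACHE_DIR) -> bool:
    """True when a previous call already materialized this resource."""
    return cache_path(resource_url, cache_dir).exists()
```

A content-independent key like this is cheap to compute but does not notice when the upstream file changes, so a real cache would likely also track a modification date or size.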

aggregate_resource (A)

Run GROUP BY + aggregations against a cached resource without writing SQL.

Typed wrapper that builds safe DuckDB queries from JSON. Example usage: "How many employees by status in April 2026?" → aggregations=[{col: null, fn: count, alias: empleados}], group_by=["Estatus"], filters=[{col:"Año",op:"=",val:2026},{col:"Mes",op:"=",val:"Abril"}], order_by=[{col:"empleados",dir:"desc"}].

First call downloads + caches the file. Subsequent calls reuse the cache. Returns one row per group with the aggregation values.
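To make the "typed wrapper" idea concrete, here is a minimal sketch of how a JSON spec like the example above could be turned into a parameterized query. It is an assumption about the approach, not the server's code: `build_aggregate_sql`, the allow-lists (apart from median, which the listing mentions), and the table name `data` (borrowed from the query_resource description) are illustrative.

```python
ALLOWED_FNS = {"count", "sum", "avg", "min", "max", "median"}  # assumed allow-list
ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}  # assumed allow-list


def quote_ident(name: str) -> str:
    """Double-quote an identifier so column names are never parsed as SQL."""
    return '"' + name.replace('"', '""') + '"'


def build_aggregate_sql(aggregations, group_by=(), filters=(), order_by=()):
    """Return (sql, params); filter values travel as bound parameters."""
    params = []
    select = [quote_ident(c) for c in group_by]
    for agg in aggregations:
        fn = agg["fn"].lower()
        if fn not in ALLOWED_FNS:
            raise ValueError(f"unsupported aggregation: {fn}")
        if agg["col"] is None and fn != "count":
            raise ValueError("col may be null only for count")
        target = "*" if agg["col"] is None else quote_ident(agg["col"])
        select.append(f"{fn}({target}) AS {quote_ident(agg['alias'])}")
    sql = f"SELECT {', '.join(select)} FROM data"
    if filters:
        clauses = []
        for flt in filters:
            if flt["op"] not in ALLOWED_OPS:
                raise ValueError(f"unsupported operator: {flt['op']}")
            clauses.append(f"{quote_ident(flt['col'])} {flt['op']} ?")
            params.append(flt["val"])
        sql += " WHERE " + " AND ".join(clauses)
    if group_by:
        sql += " GROUP BY " + ", ".join(quote_ident(c) for c in group_by)
    if order_by:
        parts = []
        for order in order_by:
            direction = order.get("dir", "asc").upper()
            if direction not in {"ASC", "DESC"}:
                raise ValueError(f"bad sort direction: {direction}")
            parts.append(f"{quote_ident(order['col'])} {direction}")
        sql += " ORDER BY " + ", ".join(parts)
    return sql, params
```

Because function names, operators, and sort directions are checked against fixed sets, and values are bound rather than interpolated, the caller never supplies raw SQL text.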

quantiles_resource (A)

Percentile distribution (p25/p50/p75/p90/p95/p99) of numeric columns.

Fills the gap left by aggregate_resource, which only exposes median. First call downloads + caches the file. Subsequent calls reuse the cache. Useful for salary analysis, budget distributions, and statistical profiling.
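For intuition about what a percentile profile contains, the sketch below computes the same six percentiles with Python's standard library. The function name `percentile_profile` and the choice of linear interpolation (`method="inclusive"`) are assumptions; the tool itself runs inside DuckDB and may interpolate differently.

```python
from statistics import quantiles

PERCENTILES = (25, 50, 75, 90, 95, 99)  # the cut points listed in the description


def percentile_profile(values):
    """Return {'p25': ..., ..., 'p99': ...} for a numeric column, ignoring nulls."""
    data = sorted(v for v in values if v is not None)
    if len(data) < 2:
        raise ValueError("need at least two non-null values")
    # n=100 yields 99 cut points; cuts[i - 1] is the i-th percentile.
    cuts = quantiles(data, n=100, method="inclusive")
    return {f"p{p}": cuts[p - 1] for p in PERCENTILES}
```

Comparing p50 with p95 and p99 is a quick way to see how long the upper tail of a salary or budget column is.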

find_duplicates_resource (A)

Find rows that appear more than once on the given columns (or all columns).

Returns duplicate groups sorted by frequency descending. Useful for detecting data-quality issues in payroll, census, and registry datasets. First call downloads + caches. Subsequent calls reuse the cache.
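The duplicate check amounts to grouping rows by a key and keeping groups with more than one member. A minimal sketch over in-memory rows follows; `find_duplicates` and its return shape are hypothetical, and the toy rows are made up for illustration.

```python
from collections import Counter


def find_duplicates(rows, columns=None):
    """Return [(key_dict, count), ...] for keys seen more than once, most frequent first.

    rows is a list of dicts; columns=None compares on every column.
    """
    def key(row):
        cols = columns if columns is not None else sorted(row)
        return tuple((c, row.get(c)) for c in cols)

    counts = Counter(key(r) for r in rows)
    dupes = [(dict(k), n) for k, n in counts.items() if n > 1]
    dupes.sort(key=lambda item: item[1], reverse=True)
    return dupes


rows = [
    {"id": 1, "nombre": "Ana"},
    {"id": 2, "nombre": "Luis"},
    {"id": 3, "nombre": "Ana"},
    {"id": 4, "nombre": "Ana"},
    {"id": 5, "nombre": "Luis"},
    {"id": 6, "nombre": "Eva"},
]
by_name = find_duplicates(rows, ["nombre"])
```

Restricting the comparison to a few identifying columns usually finds more duplicates than comparing whole rows, since an auto-incremented id makes every full row unique.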

detect_outliers_resource (A)

Find rows where a numeric column falls outside the IQR fence.

Uses the standard IQR method: outliers are values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 - Q1. Returns rows sorted by distance from the median. Useful for detecting data-entry errors in salary, budget, or census data. First call downloads + caches. Subsequent calls reuse the cache.
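The IQR fence is easy to check by hand on a small sample. The sketch below is a standard-library version under stated assumptions: `iqr_outliers` is a hypothetical name, quartiles use linear interpolation (`method="inclusive"`), and outliers are ordered farthest-from-median first, since the description does not say which direction the sort runs.

```python
from statistics import median, quantiles


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    data = sorted(values)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    mid = median(data)
    flagged = [v for v in data if v < low or v > high]
    # Assumed ordering: largest distance from the median first.
    flagged.sort(key=lambda v: abs(v - mid), reverse=True)
    return {"low_fence": low, "high_fence": high, "outliers": flagged}


result = iqr_outliers([10, 11, 12, 12, 12, 13, 13, 14, 15, 102])
# Here Q1 = 12 and Q3 = 13.75, so IQR = 1.75 and the fences are 9.375 and 16.375.
```

Only 102 falls outside the fences in this sample, which is the kind of value a misplaced decimal point or an extra digit would produce.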

save_query_to_csv (A)

Write a query or filter result to a local CSV file.

Export endpoint for analysis workflows — run your filter or SQL, then save the result to open in Excel or another tool. Returns the file path and row count. First call downloads + caches the source file. Subsequent calls reuse the cache.

query_resource (A)

Run an ad-hoc read-only SQL query against a cached resource via DuckDB.

Power-user escape hatch when filter_resource / aggregate_resource don't cover the case. The cached resource is exposed as the in-memory table 'data'. SQL is DuckDB dialect — see https://duckdb.org/docs/sql/introduction. Supports CSV, TSV, XLSX, XLS, JSON, and ODS (auto-converted to CSV).

Safety:

  • Only SELECT/WITH statements (CTEs allowed); multi-statement blocked.

  • DDL/DML keywords (INSERT/UPDATE/DELETE/DROP/CREATE/ALTER/COPY/EXPORT/IMPORT/TRUNCATE/GRANT/REVOKE/PRAGMA/SET/LOAD/INSTALL/ATTACH/DETACH/VACUUM/ANALYZE) rejected outright.

  • Sandboxed: the resource is materialized in memory and external access is disabled, so table functions (read_text/read_csv/glob/...) cannot read local files or reach the network.

  • Row cap always applied via outer wrapper.
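The first, second, and fourth safety rules above can be approximated with a small validator. This is a deliberately conservative sketch, not the server's guard: `check_readonly`, `wrap_with_row_cap`, and the default cap of 1000 rows are hypothetical, and the crude keyword scan would also reject harmless queries that mention a blocked word in a string literal or column name. Sandboxing (the third rule) happens in the database engine and is not shown.

```python
import re

# Keyword list copied from the safety notes above.
BLOCKED = {
    "INSERT", "UPDATE", "DELETE", "DROP", "CREATE", "ALTER", "COPY", "EXPORT",
    "IMPORT", "TRUNCATE", "GRANT", "REVOKE", "PRAGMA", "SET", "LOAD", "INSTALL",
    "ATTACH", "DETACH", "VACUUM", "ANALYZE",
}


def check_readonly(sql: str) -> str:
    """Return the statement without trailing semicolons, or raise ValueError."""
    stmt = sql.strip().rstrip(";").strip()
    if ";" in stmt:
        raise ValueError("multi-statement SQL is not allowed")
    if not re.match(r"(?i)(select|with)\b", stmt):
        raise ValueError("only SELECT/WITH statements are allowed")
    words = set(re.findall(r"[A-Za-z_]+", stmt.upper()))
    hit = words & BLOCKED
    if hit:
        raise ValueError(f"blocked keyword(s): {', '.join(sorted(hit))}")
    return stmt


def wrap_with_row_cap(sql: str, max_rows: int = 1000) -> str:
    """Wrap a validated query in an outer SELECT so the cap always applies."""
    return f"SELECT * FROM ({check_readonly(sql)}) AS q LIMIT {int(max_rows)}"
```

Applying the cap in an outer wrapper means a LIMIT inside the user's query cannot raise the ceiling; the smaller of the two limits wins.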

get_cache_stats (A)

Return on-disk Parquet cache stats: entry count, total bytes, max bytes.

clear_cache (A)

Remove all cached Parquet files. Returns the count removed.

list_organizations (A)

Lists the government institutions that publish on datos.gob.do.

Returns ministries, autonomous agencies, municipalities, etc., with a dataset count per institution. Long descriptions are omitted.

get_organization (B)

Detailed information about an institution: description, number of datasets, URL.

list_groups (A)

Thematic categories on datos.gob.do (economy, health, public administration, etc.).

list_tags (A)

Lists available tags, optionally filtered by prefix.

autocomplete (A)

Autocompletes names of datasets / organizations / groups / tags.

Useful for resolving slugs when the user gives only a partial name. E.g.: kind='organization', query='hacienda' → suggests 'ministerio-de-hacienda'.
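For readers calling the server without an MCP client library, a tool invocation is a JSON-RPC 2.0 message with method `tools/call`. The sketch below builds such a message for the autocomplete example; the helper name `tools_call_request` is hypothetical, and the argument names `kind` and `query` are taken from the example above rather than from a published schema.

```python
import json


def tools_call_request(request_id, tool, arguments):
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        },
        ensure_ascii=False,
    )


message = tools_call_request(
    1, "autocomplete", {"kind": "organization", "query": "hacienda"}
)
```

In practice an MCP client sends this over the transport it negotiated with the server; the message body stays the same either way.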

get_site_stats (A)

General statistics for the datos.gob.do portal.

Returns: total number of datasets, organizations, groups, and tags.

Prompts

Interactive templates invoked by user choice

NameDescription

No prompts

Resources

Contextual data attached and managed by the client

NameDescription

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/alcastaro/datos.gob.do-MCP-server'
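The same request can be issued from Python with the standard library. This sketch assumes the endpoint returns a JSON body; `build_request` and `fetch_server_info` are illustrative names, and the `Accept` header is an addition not present in the curl command.

```python
import json
import urllib.request

URL = "https://glama.ai/api/mcp/v1/servers/alcastaro/datos.gob.do-MCP-server"


def build_request(url: str = URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = URL, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone is enough to inspect the URL and headers offline.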