Stata MCP Server

SKILL.md


---
name: stata-mcp
description: Run or debug Stata workflows through the local io.github.tmonk/mcp-stata server. Use when users mention Stata commands, .do files, r()/e() results, dataset inspection, Stata graph exports, or data browsing with sorting/filtering.
---

# Stata MCP Skill

## Instructions

1. Ensure the `stata` MCP server is registered (see project README for config) and request it if not already active.
2. When the user asks for Stata work:
   - Use `run_command` for ad-hoc syntax (`trace=True` for call stacks, `raw=True` for plain output).
   - Use `load_data` before analyses that require datasets.
   - Use `get_data`, `describe`, `codebook`, or `get_variable_list` to inspect data.
   - Use `run_do_file` for provided `.do` scripts.
   - Use `export_graph`/`export_graphs_all` for visualization requests.
   - Use `get_help` when the user wants Stata documentation.
   - Use `get_stored_results` to return `r()`/`e()` scalars/macros after commands for validation.
   - Use `read_log` to tail or retrieve output from long-running commands.
   - Use `get_ui_channel` to obtain a localhost HTTP endpoint for high-volume data browsing.
3. Surface `rc`/`stderr` info back to the user, referencing `r()`/`e()` codes.
4. If Stata isn't auto-discovered, remind the user to set `STATA_PATH` (examples in README).

## Tool quick reference

### Command Execution

- `run_command(code, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Run Stata syntax.
  - `code`: The Stata command(s) to execute.
  - `echo`: Include the command itself in output (default: True).
  - `as_json`: Return JSON envelope with rc/stdout/stderr/error (default: True).
  - `trace`: Enable `set trace on` for deeper error diagnostics (default: False).
  - `raw`: Return plain stdout/error message instead of JSON (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None for no truncation).
  - Note: Always writes output to a temporary log file and emits a `notifications/logMessage` with `{"event":"log_path","path":"..."}` so the client can tail it locally.
- `run_do_file(path, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Execute .do files.
  - `path`: Path to the .do file.
  - `echo`: Include commands in output (default: True).
  - `as_json`: Return JSON envelope (default: True).
  - `trace`: Enable trace mode for debugging (default: False).
  - `raw`: Return plain output instead of JSON (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).
  - Note: Always writes output to a temporary log file and emits incremental `notifications/progress` when the client provides a progress token/callback.
- `read_log(path, offset=0, max_bytes=65536)`: Read a slice of a previously-provided log file.
  - `path`: Path to the log file (from `notifications/logMessage`).
  - `offset`: Byte offset to start reading from (default: 0).
  - `max_bytes`: Maximum bytes to read (default: 65536).
  - Returns JSON: `path`, `offset`, `next_offset`, `data`.

### Data Loading & Inspection

- `load_data(source, clear=True, as_json=True, raw=False, max_output_lines=None)`: Load data using sysuse/webuse/use heuristics.
  - `source`: Dataset name, URL, or file path (e.g., "auto", "webuse nlsw88", "/path/to/file.dta").
  - `clear`: Append `, clear` to replace existing data (default: True).
  - `as_json`: Return JSON envelope (default: True).
  - `raw`: Return plain output (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).
  - Note: After loading, use UI channel for advanced filtering/sorting at scale.
- `get_data(start=0, count=50)`: Retrieve a slice of the active dataset as JSON.
  - `start`: Zero-based index of first observation (default: 0).
  - `count`: Number of observations to retrieve (default: 50, max: 500).
  - Note: For advanced sorting/filtering at scale, use the UI channel endpoints (see `get_ui_channel()`).
- `describe()`: Return variable descriptions, storage types, and labels.
- `get_variable_list()`: Return JSON list of all variables with names, labels, and types.
- `codebook(variable, as_json=True, trace=False, raw=False, max_output_lines=None)`: Return codebook/summary for a specific variable.
  - `variable`: Variable name to describe.
  - `as_json`: Return JSON envelope (default: True).
  - `trace`: Enable trace mode (default: False).
  - `raw`: Return plain output (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).

### Graph Management

- `list_graphs()`: List all graphs in Stata's memory with active graph marked.
  - Note: Graphs are automatically cached during command execution for instant exports.
- `export_graph(graph_name=None, format="pdf")`: Export a stored graph to file.
  - `graph_name`: Name of graph to export (from `list_graphs`); if None, exports active graph.
  - `format`: Output format, either "pdf" (default) or "png". Use "png" to view plots directly.
- `export_graphs_all()`: Export all graphs in memory. Returns file paths by default.

### Help & Results

- `get_help(topic, plain_text=False)`: Return Stata help text.
  - `topic`: Command or help topic (e.g., "regress", "graph").
  - `plain_text`: Return plain text instead of Markdown (default: False).
- `get_stored_results()`: Return current `r()` and `e()` results as JSON after a command.

### UI Data Browser

- `get_ui_channel()`: Return a short-lived localhost HTTP endpoint + bearer token for the UI-only data browser.
  - Returns JSON with `baseUrl`, `token`, `expiresAt`, and `capabilities`.
  - Intended for VS Code extension UI to browse data at high volume (paging, filtering, sorting) without sending large payloads over MCP.
  - Loopback only (binds to `127.0.0.1`), requires bearer auth.
  - **Key endpoints** (all require `Authorization: Bearer <token>` header):
    - `GET /v1/dataset`: Dataset identity and state
    - `GET /v1/vars`: Variable metadata
    - `POST /v1/page`: Page data with optional sorting (`sortBy` parameter)
    - `POST /v1/arrow`: Binary Arrow IPC stream
    - `POST /v1/views`: Create filtered view
    - `POST /v1/views/:viewId/page`: Page within filtered view (supports sorting)
    - `POST /v1/views/:viewId/arrow`: Arrow stream from filtered view
    - `DELETE /v1/views/:viewId`: Delete view
    - `POST /v1/filters/validate`: Validate filter expression
  - **Sorting**: Use `sortBy` array in page requests (e.g., `["price"]` for ascending, `["-price"]` for descending, `["foreign", "-price"]` for multi-level)
  - **Filtering**: Filter expressions use Python boolean operators (`==`, `!=`, `<`, `>`, `and`, `or`); Stata-style `&`/`|` also accepted
  - **Server limits**: maxLimit=500, maxVars=32767, maxChars=500, maxRequestBytes=1000000, maxArrowLimit=1000000
  - **Dataset tracking**: `datasetId` used for cache invalidation; changing dataset invalidates view handles

## Cancellation

- Clients may cancel an in-flight request by sending the MCP notification `notifications/cancelled` with `params.requestId` set to the original tool call ID.
- Pass a `_meta.progressToken` when invoking the tool if you want progress updates (optional).
- Cancellation is best-effort and depends on Stata surfacing `BreakError`.
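
The cancellation notification described above is an ordinary JSON-RPC notification. A minimal sketch of building one in Python follows. The helper name `build_cancel_notification` and the request ID `42` are illustrative, and the optional `reason` field comes from the MCP specification rather than from this skill file. How the message is sent depends on your MCP client's transport, which is not shown.

```python
import json


def build_cancel_notification(request_id, reason=None):
    """Build an MCP `notifications/cancelled` message for an in-flight request.

    `request_id` is the ID of the original tool call. `reason` is optional
    free text (an MCP spec field, not something this skill file documents).
    """
    params = {"requestId": request_id}
    if reason is not None:
        params["reason"] = reason
    # Notifications carry no "id" field, so the server sends no response.
    return {"jsonrpc": "2.0", "method": "notifications/cancelled", "params": params}


if __name__ == "__main__":
    # 42 is a hypothetical tool-call ID used only for illustration.
    print(json.dumps(build_cancel_notification(42, reason="user aborted")))
```

Because cancellation is best-effort, a client should still be prepared to receive a normal result for the cancelled call.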

## Error Reporting

- All tools executing Stata commands support JSON envelopes (`as_json=true`) containing:
  - `rc`: Return code from r()/c(rc)
  - `stdout`: Standard output
  - `stderr`: Standard error (captures "red text")
  - `message`: Error message
  - `line`: Line number (when Stata reports it)
  - `command`: The command that was executed
  - `log_path`: Path to log file for streaming (when applicable)
  - `snippet`: Excerpt of error output
- Stata-specific error codes (`r(XXX)`) are parsed and preserved
- Use `trace=true` to enable `set trace on` for detailed program-defined error diagnostics
- Set `MCP_STATA_LOGLEVEL` environment variable (e.g., `DEBUG`, `INFO`) to control server logging

## MCP Resources

The server exposes these resources for MCP clients:

- `stata://data/summary` → `summarize`
- `stata://data/metadata` → `describe`
- `stata://graphs/list` → graph list
- `stata://variables/list` → variable list
- `stata://results/stored` → stored r()/e() results

## Graph review workflow

1. Call `list_graphs()` to see available plots and identify the active graph.
2. Use `export_graphs_all()` to fetch file paths for every graph; view them directly in the client.
3. For a single plot, call `export_graph(graph_name="GraphName", format="png")` to get a viewable file.
4. Compare the rendered PNGs to the user spec (titles, axes labels, legends, colors, filters); state whether the graph matches and what to change.
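
To make the JSON envelope fields listed under Error Reporting concrete, here is a minimal Python sketch that condenses an envelope into a one-line summary suitable for showing to the user. The helper name `summarize_envelope` and the sample envelope values are illustrative; only the field names come from this document. It assumes the usual Stata convention that a return code of 0 means success.

```python
def summarize_envelope(envelope):
    """Condense a tool's JSON envelope (already parsed to a dict) into one line."""
    rc = envelope.get("rc") or 0
    if rc == 0:
        # Assumption: rc == 0 means the command succeeded.
        return "ok"
    parts = [f"r({rc})"]
    # Prefer the parsed message, then fall back to raw error text.
    detail = envelope.get("message") or envelope.get("stderr") or envelope.get("snippet")
    if detail:
        parts.append(detail.strip())
    if envelope.get("line") is not None:
        parts.append(f"line {envelope['line']}")
    if envelope.get("command"):
        parts.append(f"command: {envelope['command']}")
    if envelope.get("log_path"):
        parts.append(f"log: {envelope['log_path']}")
    return "; ".join(parts)


if __name__ == "__main__":
    # Hypothetical envelope, shaped after the field list in this document.
    sample = {"rc": 111, "message": "variable foo not found",
              "line": 3, "command": "regress price foo"}
    print(summarize_envelope(sample))
```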

## Examples

### Run a regression

```
# Load sample data and run regression
load_data("auto")
run_command("regress price mpg")
get_stored_results()  # Retrieve coefficients and statistics
```

### Export a histogram

```
# Create and export a graph
run_command("histogram price")
list_graphs()  # Confirm graph exists
export_graph(graph_name="Graph", format="png")  # Export for viewing
```

### Debug a do-file

```
run_do_file("/path/to/analysis.do", trace=True)
```

### Inspect data structure

```
load_data("nlsw88", clear=True)
describe()
get_variable_list()
codebook("wage")
get_data(start=0, count=10)
```

### Read log output from long-running command

```
# After run_command emits a log_path notification
read_log("/tmp/stata_log_abc123.log", offset=0)
# Continue reading with next_offset for incremental output
read_log("/tmp/stata_log_abc123.log", offset=4096)
```

### Advanced data browsing with sorting and filtering

```
# Get UI channel for high-volume data operations
get_ui_channel()  # Returns baseUrl, token, expiresAt

# Example UI channel usage (requires HTTP client):
# POST {baseUrl}/v1/page with Authorization: Bearer {token}
# Body: {"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}

# Create filtered view for price < 5000
# POST {baseUrl}/v1/views
# Body: {"datasetId":"...","frame":"default","filterExpr":"price < 5000"}

# Page through filtered view with sorting
# POST {baseUrl}/v1/views/{viewId}/page
# Body: {"offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}
```
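
The commented `POST /v1/page` call in the last example can be sketched with Python's standard library. This is an illustration under assumptions, not the server's client code: the helper name `build_page_request`, the port, the token, and the dataset ID are all hypothetical, and the `Content-Type: application/json` header is assumed rather than documented here. The body fields and the 500-row cap mirror the example body and the `maxLimit` server limit listed earlier.

```python
import json
import urllib.request

MAX_LIMIT = 500  # maxLimit from the server limits listed in this document


def build_page_request(channel, dataset_id, variables, sort_by=(), offset=0, limit=50):
    """Build (but do not send) a POST /v1/page request for the UI channel.

    `channel` is the dict returned by get_ui_channel(); only `baseUrl`
    and `token` are used here.
    """
    body = {
        "datasetId": dataset_id,
        "offset": offset,
        "limit": min(limit, MAX_LIMIT),  # stay within the server's page cap
        "vars": list(variables),
        "sortBy": list(sort_by),  # e.g. ["-price"] for descending price
    }
    return urllib.request.Request(
        channel["baseUrl"].rstrip("/") + "/v1/page",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {channel['token']}",
            "Content-Type": "application/json",  # assumed, not documented here
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical channel values; real ones come from get_ui_channel().
    channel = {"baseUrl": "http://127.0.0.1:8765", "token": "example-token"}
    request = build_page_request(channel, "example-dataset-id",
                                 ["price", "mpg"], sort_by=["-price"])
    print(request.get_method(), request.full_url)
    # To send it while the channel is live: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and body before the short-lived token expires.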
