What can you do with this server?

The NYC Open Data Capital Projects MCP Server lets you query, analyze, and explore NYC capital projects data — covering schedules, budgets, history, and variances — using 16 domain-aware tools backed by a local DuckDB database. Query & Explore Data * Run custom SQL queries against the local DuckDB, including tables for project state, schedule/budget history, funding, variance, and agency/category dimensions * Check data freshness — view per-dataset row counts, current reporting period, and key caveats (data is refreshed ~3×/year: Jan/May/Sep) * Look up field definitions — access the official NYC Open Data data dictionary for any column or dataset Browse & Resolve * List agencies and categories — see active agencies, their schedule-executor roles, and budget/line counts by program type * Resolve project references — map any project name, PID, FMS ID, or partial identifier to matching schedule and budget entities Drill Into Projects * Get project schedules — retrieve phase, lifecycle status, signed schedule variance, delay reasons, and linked budget lines * Get project budgets — retrieve total budget, spend-to-date, variance, and linked schedules for a given FMS line Aggregate & Analyze * Break down schedules or budgets — aggregate counts, averages, or totals by agency, sponsor, borough, phase, or category with role-aware attribution * Track schedule changes — identify projects newly completed or newly delayed between any two reporting periods * Analyze delay reasons — view the distribution of delay drivers for the current period or across all history * Measure budget changes — compute budget/spend deltas for an agency or specific FMS line between two reporting periods * Rank projects — rank schedules or budget lines by total budget, spend %, variance, or cumulative change, with optional filters * Analyze project durations — compute duration distributions between actual milestones, grouped by agency, borough, or lifecycle status * Browse the project portfolio — filter and list projects by category, borough, community board, lifecycle status, and/or agency Domain Rules Built In * PID ↔ FMS many-to-many relationships handled correctly * Role-aware agency attribution: sponsor (owner) view for normal agencies; managing (builder) view for DDC/DCAS/EDC * Signed, neutral variance reporting (e.g., "moved 45 days later", "budget grew $2.1M") * 3×-a-year reporting cadence acknowledged in all period-based tools

How do I use NYC Open Data Capital Projects MCP Server?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@NYC Open Data Capital Projects MCP Server Show me the capital projects with largest schedule delays" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

NYC Open Data Capital Projects MCP Server

by WillHsiaoNYC

Overview Schema Related Servers Score Discussions

Python

Local

The NYC Open Data Capital Projects MCP Server lets you query, analyze, and explore NYC capital projects data — covering schedules, budgets, history, and variances — using 16 domain-aware tools backed by a local DuckDB database.

Query & Explore Data

Run custom SQL queries against the local DuckDB, including tables for project state, schedule/budget history, funding, variance, and agency/category dimensions
Check data freshness — view per-dataset row counts, current reporting period, and key caveats (data is refreshed ~3×/year: Jan/May/Sep)
Look up field definitions — access the official NYC Open Data data dictionary for any column or dataset

Browse & Resolve

List agencies and categories — see active agencies, their schedule-executor roles, and budget/line counts by program type
Resolve project references — map any project name, PID, FMS ID, or partial identifier to matching schedule and budget entities

Drill Into Projects

Get project schedules — retrieve phase, lifecycle status, signed schedule variance, delay reasons, and linked budget lines
Get project budgets — retrieve total budget, spend-to-date, variance, and linked schedules for a given FMS line

Aggregate & Analyze

Break down schedules or budgets — aggregate counts, averages, or totals by agency, sponsor, borough, phase, or category with role-aware attribution
Track schedule changes — identify projects newly completed or newly delayed between any two reporting periods
Analyze delay reasons — view the distribution of delay drivers for the current period or across all history
Measure budget changes — compute budget/spend deltas for an agency or specific FMS line between two reporting periods
Rank projects — rank schedules or budget lines by total budget, spend %, variance, or cumulative change, with optional filters
Analyze project durations — compute duration distributions between actual milestones, grouped by agency, borough, or lifecycle status
Browse the project portfolio — filter and list projects by category, borough, community board, lifecycle status, and/or agency

Domain Rules Built In

PID ↔ FMS many-to-many relationships handled correctly
Role-aware agency attribution: sponsor (owner) view for normal agencies; managing (builder) view for DDC/DCAS/EDC
Signed, neutral variance reporting (e.g., "moved 45 days later", "budget grew $2.1M")
3×-a-year reporting cadence acknowledged in all period-based tools

NYC Open Data — Capital Projects MCP Server

A local MCP server over the NYC Capital Projects Dashboard (CPD) datasets on NYC Open Data. It ingests four public Socrata datasets into a single local DuckDB and exposes 16 tools so an AI assistant can answer schedule, budget, and lifecycle questions about NYC capital projects — with the domain rules (PID↔FMS many-to-many, role-aware agency attribution, signed variance reporting) baked into the tools instead of left for the caller to rediscover.

Source datasets (Socrata)

ID	Dataset
`fb86-vt7u`	Citywide Capital Project List Detail (the schedule↔budget edge)
`gyhf-rsr3`	Citywide Budget & Spend by FY
`qj5n-h5qp`	Citywide Budget Spend History & Variance
`95tx-snak`	Citywide Schedule History & Variance

Related MCP server: CodeXRay

What's an MCP — and why use one?

MCP (Model Context Protocol) is a standard way to give an AI assistant a new, trusted skill. Instead of pasting a spreadsheet into a chat and hoping the model reads it right, you hand it a set of well-defined tools it can call — with the domain rules already baked in. It's the difference between telling an analyst "here's a spreadsheet, good luck" and hiring one who already knows the data cold.

Why not just ChatGPT + a CSV? Public data is messy in ways a generic chatbot can't see. Ask a raw LLM "what's NYC's biggest library project?" and it'll happily double-count a budget line shared by several projects, call a long-finished branch "still under construction" because its funding line is still open, or — seeing only three reporting periods a year (Jan / May / Sep) — assume months of data have gone missing. It sounds confident — and it's wrong. This server encodes the guardrails once — the PID↔FMS many-to-many, role-aware agency attribution, the 3×-a-year reporting cadence, signed reporting — so every answer is consistent, sourced, and reproducible.

What one prompt can build

This isn't only a query tool. Point an AI agent at it and a single prompt produces a polished, self-contained interactive HTML report — with the domain rules already applied. Three real examples (one prompt → one file; click to open the live report):

1 · Schedule ↔ Budget topology — the many-to-many anatomy of the portfolio

Analyze the schedule and budget many-to-many relationship across NYC capital projects and build a single interactive HTML report — the 1:1-vs-fan-out split, the outlier extremes, a per-agency breakdown, and budget concentration.

▶ Open the report — fan-out rings, a bipartite diagram, the "tangled few" outliers (hover to see the real schedules and budget lines), an agency scatter, and a budget concentration curve.

2 · Parks projects over $50M — every big build, and what funds it

Build an interactive one-file HTML report on NYC Parks projects over $50M. For each budget line, show every schedule associated with it, with phase and forecast completion.

▶ Open the report — 23 budget lines; hover any to reveal its linked schedules. Quietly applies the category taxonomy, so the $1.9B "Park Pedestrian Bridges" route to Bridges, not Parks.

3 · Budget & schedule change monitor — what moved this period, by agency

Build an interactive one-file HTML monitor of NYC capital projects' budget and schedule changes by managing agency, with a click-through detail view for each project's schedule and budget history.

▶ Open the report — KPIs, a trend chart, a sortable watchlist, and a per-project popup with schedule-variance bars and a stacked budget-vs-spend chart.

Each report was generated from the prompt shown, then lightly polished. The figures are a snapshot of reporting period 202601 — browse all three in the report gallery.

🚀 Quick Start

Want the data without the setup? If your AI can run commands on your computer, just ask it to install everything for you.

✅ Let your AI install it (easiest)

Works with AI agents that can run terminal commands — Claude Code, Claude cowork (Claude Desktop's local-agent mode), Codex CLI, or another coding agent like Cursor.

Start your AI agent on this computer.
Paste the message below.
Approve each step (Allow, or press y) as it clones, installs, and connects the server.

Message to paste:

Install the MCP server at
https://github.com/WillHsiaoNYC/NYC-Opendata-Capital-Projects-MCP on this
machine — follow its README to clone the repo, install it with uv, run
`od-cpd init` to download the four NYC Open Data datasets into a local
database, and wire it into my MCP client config. Then run a verification
query to confirm it works.

What "done" looks like: your AI reports the loaded reporting period (e.g. 202601), confirms od-cpd is connected with its 16 tools, and answers a test question like "What's the biggest NYC capital project right now?" Takes a few minutes, mostly the dataset download.

🖥️ Claude Desktop (chat)

Claude Desktop can use a local server but can't install one itself. Run the Manual install, then add od-cpd to its own config (with the absolute path to uv) and fully restart — see Connect an MCP client.

☁️ claude.ai or ChatGPT (web)

These connect only to remote MCP servers, not a local one like this — so they can't run od-cpd directly. Use one of the options above.

Manual install

Requires Python ≥ 3.12 and uv.

git clone https://github.com/WillHsiaoNYC/NYC-Opendata-Capital-Projects-MCP.git
cd NYC-Opendata-Capital-Projects-MCP

uv sync
uv run od-cpd init        # download + materialize all 4 datasets → ./var/cpd.duckdb
uv run od-cpd status      # confirm the loaded reporting period

Optional: set OD_CPD_SOCRATA_APP_TOKEN to a free Socrata app token to avoid anonymous rate limits during ingest.

Connect an MCP client

The server speaks stdio. Use the absolute path to uv (run which uv to find it) — GUI apps like Claude Desktop don't inherit your shell PATH, so a bare uv command fails silently.

Claude Code — from inside the repo folder:

claude mcp add od-cpd --env PYTHONPATH="$(pwd)/src" -- \
  "$(which uv)" run --directory "$(pwd)" od-cpd-server

Claude Desktop — edit its config (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json), add the block below, then fully quit (⌘Q) and reopen Claude Desktop:

{
  "mcpServers": {
    "od-cpd": {
      "command": "/absolute/path/to/uv",
      "args": ["run", "--directory", "/absolute/path/to/repo", "od-cpd-server"],
      "env": { "PYTHONPATH": "/absolute/path/to/repo/src" }
    }
  }
}

(PYTHONPATH keeps the launch robust when uv's editable install is flaky — e.g. on iCloud-synced paths.)

Keeping data fresh

The source datasets report on a Jan / May / Sep cycle, and Socrata typically publishes each period ~2.5–3 months later (so new data usually lands around April, August, and December). od-cpd update is a no-op when nothing is newer, so it's safe to run any time — check around those months:

uv run od-cpd status      # what period is loaded now
uv run od-cpd update      # re-ingest only if Socrata is newer

To keep it fresh automatically, schedule update (e.g. monthly via cron):

# 9am on the 1st of each month
0 9 1 * * cd /path/to/repo && uv run od-cpd update

What's inside

docs/FEATURES.md — the canonical inventory: all 16 tools and every domain rule the server encodes. Start here.

The headline domain rules, briefly:

"Project" is ambiguous. A PID identifies a schedule; an FMS ID identifies a budget line. They are many-to-many (~3% fan out), so the tools list all counterparts rather than silently picking one.
Agency attribution is role-aware. "Agency X's projects" means the sponsor (owner) view for normal agencies, but the managing (builder) view for the three construction-manager agencies (DDC/DCAS/EDC).
Values are reported signed and neutral ("moved 45 days later", "budget grew $2.1M") rather than only surfacing one direction.

Layout

src/od_cpd/ — ingest, materialization, and the MCP server + tools
data/ — curated agency/category dictionaries (YAML, tracked)
tests/ — unit tests + golden evals (uv run pytest)
var/, exports/ — runtime DuckDB + exports (gitignored, regenerable)

Develop

uv run pytest             # fallback: PYTHONPATH=src python -m pytest

Classification is dictionary-driven: edit data/agencies.yaml / data/categories.yaml (not Python) to adjust agency or category mappings, then re-materialize. See CLAUDE.md for the atomic-swap pattern that applies materialization changes without re-downloading.

Data caveats

This is an independent project, not affiliated with the City of New York. Figures reflect whatever reporting period the underlying Socrata datasets carry at ingest time; always check dataset_info for the current period and per-dataset caveats.

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

–Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

View all tools

Latest Blog Posts

Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly
Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
OpenAI
open source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/WillHsiaoNYC/NYC-Opendata-Capital-Projects-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server