# Jana Data Reference for MCP Server
**Date:** 2026-01-22
**Purpose:** Data model reference for MCP tool implementation
**Source:** Jana backend data dictionary and OpenAPI specification

---
## Overview
The Jana backend serves environmental data from three sources: OpenAQ (air quality), Climate TRACE (asset-level GHG emissions), and EDGAR (national and gridded emissions). This document describes the data models and key fields relevant to MCP tool design.

---
## Data Sources & Scale
| Source | Domain | Tables | Key Metrics |
|--------|--------|--------|-------------|
| **OpenAQ** | Air Quality | 4 | ~53K locations, ~300M+ measurements |
| **Climate TRACE** | GHG Emissions | 4 | ~1.3M assets, ~53M emission records |
| **EDGAR** | National Emissions | 4 | 190+ countries, gridded global coverage |
---
## OpenAQ (Air Quality)
Real-time and historical air quality measurements from global monitoring stations.
### Tables
#### `openaq_locations` (~53K records)
Monitoring station locations worldwide.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `openaq_id` | integer | Original OpenAQ identifier |
| `name` | varchar | Station name |
| `country_code` | varchar(2) | ISO-2 country code |
| `latitude` | numeric(10,7) | Latitude coordinate |
| `longitude` | numeric(10,7) | Longitude coordinate |
| `timezone` | varchar | IANA timezone |
| `owner_name` | varchar | Owning organization |
| `provider_name` | varchar | Data provider |
| `is_mobile` | boolean | Mobile station flag |
| `is_monitor` | boolean | Reference-grade monitor flag |
| `datetime_first` | timestamptz | First measurement timestamp |
| `datetime_last` | timestamptz | Most recent measurement |
#### `openaq_parameters` (~10 records)
Air quality parameter definitions.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `name` | varchar | Parameter code (e.g., `pm25`, `o3`) |
| `display_name` | varchar | Human name (e.g., "PM2.5", "Ozone") |
| `description` | text | Health/environmental significance |
| `units` | varchar | Standard unit (µg/m³, ppm, ppb) |
| `preferred_unit` | varchar | Display unit preference |
**Common Parameters:**
- `pm25` - Fine Particulate Matter (≤2.5µm)
- `pm10` - Inhalable Particulate Matter (≤10µm)
- `o3` - Ozone
- `no2` - Nitrogen Dioxide
- `so2` - Sulfur Dioxide
- `co` - Carbon Monoxide
#### `openaq_sensors` (~150K records)
Individual sensors at monitoring locations.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `openaq_id` | integer | Original OpenAQ identifier |
| `location_id` | bigint | FK to `openaq_locations` |
| `parameter_id` | bigint | FK to `openaq_parameters` |
| `datetime_first` | timestamptz | First measurement |
| `datetime_last` | timestamptz | Most recent measurement |
| `latest_value` | numeric | Most recent reading |
| `latest_datetime` | timestamptz | Timestamp of latest reading |
#### `openaq_measurements` (~300M+ records)
Time-series pollutant measurements. **Primary fact table.**
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `sensor_id` | bigint | FK to `openaq_sensors` |
| `location_id` | bigint | FK to `openaq_locations` (denormalized) |
| `parameter_name` | varchar(50) | Pollutant code (e.g., `pm25`) |
| `value` | numeric(15,6) | Measured value |
| `unit` | varchar(20) | Unit of measurement |
| `latitude` | numeric(10,7) | Latitude (denormalized) |
| `longitude` | numeric(10,7) | Longitude (denormalized) |
| `measured_at` | timestamptz | Measurement timestamp |
| `period_label` | varchar(20) | Period type (hour, minute, raw) |
| `period_interval` | varchar(20) | Duration (e.g., PT1H for 1 hour) |
| `flag` | varchar(50) | Quality indicator |
| `flag_description` | text | Quality flag explanation |
| `ingested_at` | timestamptz | When ingested into Jana |
**Key Indexes:**
- `(sensor_id, measured_at, parameter_name)` - UNIQUE
- `(location_id)` - for location-based queries
- `(measured_at)` - for time-based queries
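
The unique index on `(sensor_id, measured_at, parameter_name)` implies that a tool merging results from several overlapping queries can use the same triple as a deduplication key. A minimal sketch, assuming rows arrive as dictionaries keyed by the column names above; `dedupe_measurements` is a hypothetical helper, not part of the Jana codebase.

```python
from typing import Iterable


def dedupe_measurements(rows: Iterable[dict]) -> list[dict]:
    """Keep the last-seen row for each (sensor_id, measured_at, parameter_name)."""
    unique: dict[tuple, dict] = {}
    for row in rows:
        key = (row["sensor_id"], row["measured_at"], row["parameter_name"])
        unique[key] = row  # a later duplicate overwrites the earlier one
    return list(unique.values())


rows = [
    {"sensor_id": 1, "measured_at": "2024-01-01T00:00:00Z", "parameter_name": "pm25", "value": 12.0},
    {"sensor_id": 1, "measured_at": "2024-01-01T00:00:00Z", "parameter_name": "pm25", "value": 12.5},
    {"sensor_id": 2, "measured_at": "2024-01-01T00:00:00Z", "parameter_name": "pm25", "value": 8.1},
]
deduped = dedupe_measurements(rows)
```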
---
## Climate TRACE (GHG Emissions)
Asset-level greenhouse gas emissions from industrial facilities worldwide.
### Tables
#### `climatetrace_sectors` (~50 records)
Industry sector classifications.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `climatetrace_id` | varchar | Original sector ID |
| `name` | varchar | Sector code (e.g., `electricity-generation`) |
| `display_name` | varchar | Human name |
| `description` | text | Sector description |
| `parent_sector_id` | integer | Self-referencing FK to the parent sector (hierarchy) |
| `methodology_version` | varchar | Calculation methodology version |
| `emission_factors` | jsonb | Emission factors metadata |
**Common Sectors:**
- `electricity-generation` - Power plants
- `oil-and-gas-production-and-transport` - Oil & gas
- `steel` - Steel manufacturing
- `cement` - Cement production
- `aluminum` - Aluminum smelting
- `road-transportation` - Vehicles
#### `climatetrace_countries` (~200 records)
Country reference data.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `iso3` | varchar(3) | ISO-3 country code |
| `name` | varchar | Country name |
| `region` | varchar | Geographic region |
| `income_group` | varchar | World Bank classification |
| `total_assets_count` | integer | Assets in this country |
#### `climatetrace_assets` (~1.3M records)
Emission-producing facilities.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `climatetrace_id` | varchar | Original Climate TRACE ID |
| `asset_name` | varchar | Facility name |
| `sector_id` | integer | FK to `climatetrace_sectors` |
| `country_id` | integer | FK to `climatetrace_countries` |
| `latitude` | numeric(10,7) | Latitude |
| `longitude` | numeric(10,7) | Longitude |
| `asset_type` | varchar | Facility category |
| `owner_name` | varchar | Owner organization |
#### `climatetrace_emissions` (~53M records)
Asset-level GHG emissions. **Primary fact table.**
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `asset_id` | integer | FK to `climatetrace_assets` |
| `climatetrace_asset_id` | varchar | Original asset ID |
| `sector_name` | varchar | Sector label |
| `country_iso3` | varchar(3) | ISO-3 country code |
| `temporal_granularity` | varchar | Granularity (annual, monthly) |
| `start_time` | timestamptz | Period start |
| `end_time` | timestamptz | Period end |
| `co2e_tonnes` | numeric | Emissions in tonnes CO2e |
---
## EDGAR (National/Gridded Emissions)
Global emission inventories from the European Commission Joint Research Centre.
### Tables
#### `edgar_country_totals`
Annual national GHG emission totals.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `country_code` | varchar(3) | ISO-3 country code |
| `year` | integer | Reporting year |
| `gas` | varchar | Gas type (CO2, CH4, N2O) |
| `sector` | varchar | Economic sector |
| `value` | numeric | Emissions in tonnes |
| `source_version` | varchar | EDGAR release version |
| `provisional_flag` | boolean | Provisional data indicator |
#### `edgar_air_pollutant_totals`
Non-GHG air pollutant emissions.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `country_code` | varchar(3) | ISO-3 country code |
| `year` | integer | Reporting year |
| `gas` | varchar | Pollutant (PM2.5, PM10, NOx, SO2, CO, NMVOC, NH3) |
| `sector` | varchar | Economic sector |
| `value` | numeric | Emission value |
| `unit` | varchar | Unit (kt, Gg) |
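
Because `edgar_air_pollutant_totals` carries a per-row `unit` while `edgar_country_totals` reports tonnes, a tool that combines the two may want to normalize values first. The sketch below relies only on the standard definitions (1 kt = 1 Gg = 1,000 tonnes); `to_tonnes` and the `"t"` entry are illustrative additions, not names from the Jana schema.

```python
# Conversion factors to metric tonnes. "kt" and "Gg" are the units listed
# for edgar_air_pollutant_totals; "t" is included only for convenience.
TONNES_PER_UNIT = {"t": 1.0, "kt": 1_000.0, "Gg": 1_000.0}


def to_tonnes(value: float, unit: str) -> float:
    """Convert an emission value to metric tonnes."""
    try:
        return value * TONNES_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"Unsupported unit: {unit!r}") from None
```

Since a kilotonne and a gigagram are the same mass, rows in either unit can be summed after conversion without further adjustment.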
#### `edgar_fasttrack_emissions`
Provisional recent-year estimates.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `country_code` | varchar(3) | ISO-3 country code |
| `year` | integer | Reporting year |
| `gas` | varchar | Gas type |
| `sector` | varchar | Economic sector |
| `value` | numeric | Provisional emission value |
| `provisional` | boolean | Always true |
#### `edgar_grid_emissions`
Spatially gridded emissions at 0.1° × 0.1° resolution.
| Field | Type | Description |
|-------|------|-------------|
| `id` | bigint | Primary key |
| `cell_id` | varchar | Grid cell identifier |
| `year` | integer | Reporting year |
| `gas` | varchar | Gas type |
| `sector` | varchar | Economic sector |
| `value` | numeric | Emission in tonnes |
| `longitude` | numeric | Cell centroid longitude |
| `latitude` | numeric | Cell centroid latitude |
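
To join a point (for example, an OpenAQ station) to a grid cell, the point can be snapped to the centroid of the 0.1° cell that contains it. This is a sketch under the assumption that cell edges fall on multiples of 0.1°, so centroids end in .05; the actual `cell_id` format is not specified here, and `cell_centroid` is a hypothetical helper.

```python
import math


def cell_centroid(longitude: float, latitude: float) -> tuple[float, float]:
    """Return the (longitude, latitude) centroid of the 0.1-degree cell containing a point.

    Assumes cell edges are aligned to multiples of 0.1 degrees.
    """
    lon_index = math.floor(longitude * 10)  # index of the cell's western edge
    lat_index = math.floor(latitude * 10)   # index of the cell's southern edge
    return (
        round((lon_index + 0.5) / 10, 2),
        round((lat_index + 0.5) / 10, 2),
    )


centroid = cell_centroid(-73.935, 40.730)
```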
---
## Query Patterns for MCP Tools
### Geographic Filtering
**Bounding Box:**
```python
# API parameter, ordered [min_lon, min_lat, max_lon, max_lat] (longitude first)
# Example: NYC area
location_bbox = [-74.1, 40.6, -73.9, 40.8]
```
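
An MCP tool will usually want to check a bounding box before forwarding it. The following is a minimal sketch of such a check; `validate_bbox` is a hypothetical helper, and the backend's own validation rules may differ.

```python
def validate_bbox(bbox: list[float]) -> tuple[float, float, float, float]:
    """Validate a [min_lon, min_lat, max_lon, max_lat] bounding box."""
    if len(bbox) != 4:
        raise ValueError("bbox must be [min_lon, min_lat, max_lon, max_lat]")
    min_lon, min_lat, max_lon, max_lat = (float(v) for v in bbox)
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("longitude must be within [-180, 180]")
    if not (-90 <= min_lat <= 90 and -90 <= max_lat <= 90):
        raise ValueError("latitude must be within [-90, 90]")
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("min values must be smaller than max values")
    return (min_lon, min_lat, max_lon, max_lat)


nyc = validate_bbox([-74.1, 40.6, -73.9, 40.8])
```

This sketch rejects boxes that cross the antimeridian (where `min_lon` would exceed `max_lon`); whether the API accepts those is not covered in this document.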
**Point + Radius:**
```python
# API parameters: location_point is [longitude, latitude]; radius_km is in kilometers
# Example: 25 km around Manhattan
location_point = [-73.935, 40.730]
radius_km = 25
```
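
When results need to be re-checked or ranked by distance on the client side, the great-circle distance can be computed with the haversine formula. This is an illustrative sketch, not a description of how the backend implements the radius filter; `haversine_km` and `within_radius` are hypothetical names.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in kilometers between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(point: list[float], candidates: list[list[float]], radius_km: float) -> list[list[float]]:
    """Return the [lon, lat] candidates that lie within radius_km of point."""
    lon, lat = point
    return [c for c in candidates if haversine_km(lon, lat, c[0], c[1]) <= radius_km]


location_point = [-73.935, 40.730]
candidates = [[-74.0060, 40.7128], [-0.1276, 51.5072]]  # lower Manhattan, London
nearby = within_radius(location_point, candidates, radius_km=25)
```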
**Country Filter:**
```python
# API parameter (ISO-3 codes; note that openaq_locations.country_code stores ISO-2)
country_codes = ["USA", "CAN", "MEX"]
```
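
The schemas above mix code formats: `openaq_locations.country_code` is ISO-2, while the Climate TRACE and EDGAR tables use ISO-3. A tool that joins across sources therefore needs a translation step. The sketch below uses a deliberately tiny lookup covering only the three example countries; a real implementation would need a complete ISO 3166-1 table, and `to_iso3` / `to_iso2` are hypothetical helpers.

```python
# Partial mapping for illustration only; extend with a full ISO 3166-1 table.
ISO2_TO_ISO3 = {"US": "USA", "CA": "CAN", "MX": "MEX"}
ISO3_TO_ISO2 = {iso3: iso2 for iso2, iso3 in ISO2_TO_ISO3.items()}


def to_iso3(code: str) -> str:
    """Normalize a country code to ISO-3; three-letter codes pass through unchanged."""
    code = code.strip().upper()
    if len(code) == 3:
        return code
    try:
        return ISO2_TO_ISO3[code]
    except KeyError:
        raise ValueError(f"Unknown ISO-2 code: {code!r}") from None


def to_iso2(code: str) -> str:
    """Normalize a country code to ISO-2; two-letter codes pass through unchanged."""
    code = code.strip().upper()
    if len(code) == 2:
        return code
    try:
        return ISO3_TO_ISO2[code]
    except KeyError:
        raise ValueError(f"Unknown ISO-3 code: {code!r}") from None
```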
### Temporal Filtering
```python
# ISO 8601 format
date_from = "2024-01-01T00:00:00Z"
date_to = "2024-01-31T23:59:59Z"
```
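
Relative ranges such as "the last 7 days" have to be turned into this timestamp format before a request is made. A minimal sketch using only the standard library; `iso_utc` and `last_n_days` are hypothetical helper names, and timezone-aware datetimes are assumed.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def iso_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string with a Z suffix."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def last_n_days(days: int, now: Optional[datetime] = None) -> tuple[str, str]:
    """Return (date_from, date_to) covering the last `days` days up to `now`."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    return iso_utc(start), iso_utc(end)


date_from, date_to = last_n_days(7, now=datetime(2024, 1, 31, 12, 0, 0, tzinfo=timezone.utc))
```

Passing `now` explicitly keeps the helper deterministic in tests; in production it can be omitted.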
### Parameter Filtering
**Air Quality (OpenAQ):**
```python
parameters = ["pm25", "pm10", "o3", "no2", "so2", "co"]
```
**Emissions (Climate TRACE, EDGAR):**
```python
# Sectors
sectors = ["electricity-generation", "steel", "cement"]
# Gases (EDGAR)
gases = ["CO2", "CH4", "N2O"]
```
### Cross-Source Queries
The unified ESG API supports querying multiple sources simultaneously:
```python
# eko-client-python; `client` is an already-configured client instance
client.get_data(
    sources=["openaq", "climatetrace", "edgar"],
    location_bbox=[-74.1, 40.6, -73.9, 40.8],
    date_from="2024-01-01T00:00:00Z",
    date_to="2024-01-31T23:59:59Z",
    include_correlations=True,
)
```
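
For callers that use the HTTP API directly instead of the client, the same filters can be serialized into a query string. This is a sketch under assumptions: the `/api/v1/esg/data/` path and the comma-joined `sources` form appear elsewhere in this document, but comma-joining `location_bbox` is a guess, and `build_data_query` is a hypothetical helper rather than part of eko-client-python.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_data_query(
    sources: Sequence[str],
    location_bbox: Optional[Sequence[float]] = None,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
) -> str:
    """Build a relative URL for the unified data endpoint."""
    params = {"sources": ",".join(sources)}
    if location_bbox is not None:
        # Assumption: list values are sent as comma-separated strings.
        params["location_bbox"] = ",".join(str(v) for v in location_bbox)
    if date_from is not None:
        params["date_from"] = date_from
    if date_to is not None:
        params["date_to"] = date_to
    # Keep commas and colons readable instead of percent-encoding them.
    return "/api/v1/esg/data/?" + urlencode(params, safe=",:")


url = build_data_query(
    ["openaq", "edgar"],
    location_bbox=[-74.1, 40.6, -73.9, 40.8],
    date_from="2024-01-01T00:00:00Z",
)
```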
---
## MCP Tool Design Considerations
### Tool Granularity Options
| Approach | Example Tools | Pros | Cons |
|----------|---------------|------|------|
| **Source-specific** | `get_openaq_data`, `get_climatetrace_emissions` | Direct mapping to APIs | More tools to manage |
| **Unified** | `get_environmental_data` | Simpler interface | Complex parameters |
| **Query-type** | `get_air_quality`, `get_emissions`, `find_nearby` | Natural language aligned | May need multiple calls |
### Recommended Tool Set (MVP)
| Tool | Purpose | Primary API |
|------|---------|-------------|
| `get_air_quality` | Air quality measurements | `/api/v1/esg/data/?sources=openaq` |
| `get_emissions` | GHG emissions data | `/api/v1/esg/data/?sources=climatetrace,edgar` |
| `find_nearby` | Proximity search | `/api/v1/esg/data/?location_point=...&radius_km=...` |
| `get_trends` | Temporal analysis | `/api/v1/esg/trends/` |
| `compare_locations` | Location comparison | Multiple calls + synthesis |
| `get_data_summary` | Platform overview | `/api/v1/esg/summary/` |
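
One way to wire this table into an MCP server is a small routing map from tool name to endpoint path plus any fixed query parameters. The sketch below is illustrative: `TOOL_ROUTES` and `resolve_tool` are hypothetical names, and `compare_locations` is left out because it is composed from multiple calls rather than a single endpoint.

```python
# Tool name -> (endpoint path, query parameters fixed by the tool).
TOOL_ROUTES = {
    "get_air_quality": ("/api/v1/esg/data/", {"sources": "openaq"}),
    "get_emissions": ("/api/v1/esg/data/", {"sources": "climatetrace,edgar"}),
    "find_nearby": ("/api/v1/esg/data/", {}),
    "get_trends": ("/api/v1/esg/trends/", {}),
    "get_data_summary": ("/api/v1/esg/summary/", {}),
}


def resolve_tool(name: str, arguments: dict) -> tuple[str, dict]:
    """Return (path, params) for a tool call; fixed parameters override caller input."""
    if name not in TOOL_ROUTES:
        raise ValueError(f"Unknown tool: {name!r}")
    path, preset = TOOL_ROUTES[name]
    return path, {**arguments, **preset}


path, params = resolve_tool("get_air_quality", {"date_from": "2024-01-01T00:00:00Z"})
```

Letting the preset win over caller arguments means `get_air_quality` cannot be redirected to another source by passing `sources` explicitly.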
### Field Mapping for Natural Language
| User Says | Field/Parameter |
|-----------|-----------------|
| "PM2.5" | `parameter_name = "pm25"` |
| "fine particles" | `parameter_name = "pm25"` |
| "ozone" | `parameter_name = "o3"` |
| "carbon dioxide" / "CO2" | `gas = "CO2"` (EDGAR); Climate TRACE reports CO2e via `co2e_tonnes` |
| "power plants" | `sector = "electricity-generation"` |
| "last week" | `date_from = 7 days ago`, `date_to = now` |
| "near me" / "nearby" | `location_point` + `radius_km` |
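
The non-temporal rows of this table can be expressed as a simple phrase lookup. This is a naive substring-matching sketch for illustration, not a proposal for the final natural-language layer; `PHRASE_FILTERS` and `filters_from_text` are hypothetical names.

```python
# Lowercase phrase -> filter fields, taken from the mapping table above.
PHRASE_FILTERS = {
    "pm2.5": {"parameter_name": "pm25"},
    "fine particles": {"parameter_name": "pm25"},
    "ozone": {"parameter_name": "o3"},
    "carbon dioxide": {"gas": "CO2"},
    "co2": {"gas": "CO2"},
    "power plants": {"sector": "electricity-generation"},
}


def filters_from_text(text: str) -> dict:
    """Collect filters for every known phrase found in the text.

    If two phrases set the same field, the one listed later in
    PHRASE_FILTERS wins; a real tool would need to handle multiple values.
    """
    lowered = text.lower()
    filters: dict = {}
    for phrase, mapping in PHRASE_FILTERS.items():
        if phrase in lowered:
            filters.update(mapping)
    return filters


example = filters_from_text("Show ozone levels near power plants")
```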
---
## API Response Formats
### Unified Data Response
```json
{
  "metadata": {
    "total_records": 1500,
    "sources_queried": ["openaq", "climatetrace"],
    "query_time_ms": 245
  },
  "data": {
    "air_quality": [...],
    "emissions": [...],
    "correlations": [...]
  },
  "attribution": {
    "openaq": "Data provided by OpenAQ (https://openaq.org)",
    "climatetrace": "Data provided by Climate TRACE (https://climatetrace.org)"
  }
}
```
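
A tool handler typically condenses this payload before returning it to the model, keeping record counts and the attribution strings. A minimal sketch, assuming the response has already been parsed into a dictionary with the shape shown above; `summarize_response` is a hypothetical helper.

```python
def summarize_response(payload: dict) -> dict:
    """Reduce a unified data response to counts per section plus attribution lines."""
    data = payload.get("data", {})
    return {
        "total_records": payload.get("metadata", {}).get("total_records", 0),
        "records_per_section": {name: len(items) for name, items in data.items()},
        "attribution": list(payload.get("attribution", {}).values()),
    }


payload = {
    "metadata": {"total_records": 3, "sources_queried": ["openaq", "climatetrace"], "query_time_ms": 245},
    "data": {"air_quality": [{"value": 12.5}, {"value": 8.1}], "emissions": [{"co2e_tonnes": 100}], "correlations": []},
    "attribution": {"openaq": "Data provided by OpenAQ (https://openaq.org)"},
}
summary = summarize_response(payload)
```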
### Pagination
- Default page size: 100 records
- Maximum: 10,000 records per request

```python
# Parameters
limit = 1000
offset = 0
```
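
Result sets larger than one page can be walked by advancing `offset` until a short page comes back. The sketch below takes the fetch function as a parameter so that it makes no assumption about the client API; `paginate` and `fetch_page` are hypothetical names.

```python
from typing import Callable, Iterator

MAX_LIMIT = 10_000  # documented per-request maximum


def paginate(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator:
    """Yield records page by page; fetch_page(limit, offset) returns one page as a list."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:  # a short (or empty) page means no more data
            break
        offset += limit


# Usage with an in-memory stand-in for the API
dataset = list(range(250))
records = list(paginate(lambda limit, offset: dataset[offset:offset + limit], limit=100))
```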
---
## Reference Documents
### Jana Repository
| Document | Path | Description |
|----------|------|-------------|
| OpenAPI Spec | `docs/architecture_docs/jana_openapi.yaml` | Full API specification (6,859 lines) |
| Data Dictionary | `docs/data_source_docs/data_dictionary_inventory.json` | Complete schema definitions |
| API Endpoints | `docs/architecture_docs/API_ENDPOINTS.md` | Endpoint catalog with examples |
| API Architecture | `docs/architecture_docs/API_ARCHITECTURE.md` | Design and data flow |
### eko-client-python
| Document | Path | Description |
|----------|------|-------------|
| README | `eko-client-python/README.md` | Client library documentation |
| Source | `eko-client-python/eko_client/` | Client implementation |
---
## Changelog
| Date | Change |
|------|--------|
| 2026-01-22 | Initial data reference created |