Skip to main content
Glama
baller-coder

mcp-federated-data

mcp-federated-data

Federated MCP server that joins relational metadata (MySQL) with timeseries values (InfluxDB) behind a single, LLM-friendly entity layer.

License: MIT Version Node MCP

Languages: English · 中文


The problem

When an LLM needs to answer a question like:

"show me last-7-day temperature trends for active sensors in zone A"

it has to traverse two stores:

  • Relational metadata (what sensors exist, where, what kind, what units) — MySQL

  • Timeseries values (the readings over time) — InfluxDB

Plugging in two independent MCP servers forces the LLM to:

  1. Query MySQL → find matching entities → extract their ids

  2. Push those ids into an InfluxDB query

  3. Stitch the two result sets in its own context

LLMs reliably stumble at step 3 — especially when the join key is composite (e.g. an InfluxDB tag is the concatenation of two relational fields).

This server collapses all three steps into one tool call.


What the LLM sees

Five tools, no SQL or Flux in the prompt:

Tool

Purpose

list_entities

Filter business entities by their fields

get_entity

Look up a single entity by primary key

list_related

Walk a configured relation between entities

get_entity_timeseries

Federated — metadata + timeseries in one call

compare_timeseries

Compare 2–20 specific entities over the same window

Every tool ships with rich JSON-Schema descriptions so the LLM picks the right arguments without prompt tricks.


How it differs from naive two-server setups

Scenario

Two independent MCP servers

mcp-federated-data

"trends for sensors in zone A"

LLM: query MySQL → extract ids → push into Flux → join in its head

one get_entity_timeseries call

Composite tag like {deviceId}.{signalId}

LLM composes strings in-prompt — error-prone

composer: "{deviceId}.{signalId}" in YAML, server does it

Metadata ↔ timeseries alignment

LLM does the join, frequently mis-pairs

server joins by configured key

Data-volume blow-up

unprotected

enforced max entities + auto-downsampling + per-entity point cap

Business semantics for the LLM

raw CREATE TABLE only

YAML fields with description

Audit trail

none

structured per-call audit log


Quickstart

git clone https://github.com/baller-coder/mcp-federated-data.git
cd mcp-federated-data
pnpm install

# Sample environment (MySQL + InfluxDB in Docker, with seeded data)
docker compose -f examples/industrial-monitoring/docker-compose.yml up -d
pnpm seed

# Start the MCP server over stdio
pnpm dev -- --config examples/industrial-monitoring/config.yaml

Connect from any MCP client (Claude Desktop, Cursor, mcp-inspector).

Try it

// list active sites
{
  "name": "list_entities",
  "arguments": {
    "entity": "site",
    "filters": [{ "field": "status", "op": "eq", "value": "active" }]
  }
}

// list sensors attached to site 1
{
  "name": "list_related",
  "arguments": {
    "source_entity": "site",
    "source_id": 1,
    "target_entity": "sensor"
  }
}

// federated query — metadata + timeseries in ONE call
{
  "name": "get_entity_timeseries",
  "arguments": {
    "entity": "sensor",
    "filters": [
      { "field": "site_id", "op": "eq", "value": 1 },
      { "field": "kind",    "op": "eq", "value": "temperature" }
    ],
    "time_range":  { "start": "-7d" },
    "aggregation": { "window": "1h", "fn": "mean" }
  }
}

// compare 3 specific sensors over the same window
{
  "name": "compare_timeseries",
  "arguments": {
    "entity": "sensor",
    "ids":    [101, 201, 301],
    "time_range":  { "start": "-24h" },
    "aggregation": { "window": "10m", "fn": "mean" }
  }
}

Configuration

The whole server behavior is defined in a single YAML file. Three sections.

1. Datasources

datasources:
  - name: business
    type: mysql
    host: localhost
    port: 3306
    database: my_db
    username: readonly_user
    password: secret

  - name: timeseries
    type: influxdb
    url: http://localhost:8086
    token: my-token
    org: my_org
    bucket: my_bucket

2. Entities

Each entity binds to a relational table or view, with optional relations and an optional timeseries binding.

entities:
  - name: site
    description: Physical monitoring location.
    source:
      datasource: business
      table: sites
      primary_key: id
      fields:
        - { name: id,     type: number }
        - { name: name,   type: string, description: Display name }
        - { name: region, type: string }
        - { name: status, type: string, description: "active / inactive / maintenance" }

  - name: sensor
    description: A sensor attached to a site.
    source:
      datasource: business
      table: sensors
      primary_key: id
      fields:
        - { name: id,      type: number }
        - { name: site_id, type: number }
        - { name: name,    type: string }
        - { name: kind,    type: string, description: "temperature / humidity / voltage / ..." }
        - { name: unit,    type: string }
    relations:
      - target: site
        type: many-to-one
        local_key: site_id
        foreign_key: id
    timeseries:
      datasource: timeseries
      measurement: sensor_data
      value_field: value
      join_key:
        local: id
        remote_tag: sensor_id

3. Defaults (guard rails)

defaults:
  max_entities_per_query: 50
  max_points_per_entity: 500
  query_timeout_ms: 15000

Composite join keys (v0.2 highlight)

When the InfluxDB tag value is a composite of multiple relational fields — common in IoT / industrial systems where a tag like 400001240.438000066 encodes {deviceId}.{signalId} — declare it like this:

timeseries:
  measurement: sensor_data
  value_field: value
  join_key:
    local: [device_id, signal_id]
    remote_tag: signal_id
    composer: "{device_id}.{signal_id}"

The server:

  1. Pulls metadata from MySQL (the composer's local fields are auto-included in SELECT).

  2. Composes each row's tag value using the template.

  3. Pushes the composed list into the InfluxDB tag filter.

  4. Joins results back by the same template.

Single-field bindings (the v0.1 form) keep working unchanged — the server treats them as a one-element composite, so all paths stay uniform.


How it works — the 50-line core

Every federated tool follows the same three steps:

  1. Fetch metadata — relational query against the business store, with safe-identifier checks and parameterised WHERE. Allowed filter fields are restricted to those declared in the entity config.

  2. Fetch timeseries — pull join-key values from step 1, push them into a tag filter against the timeseries store, with optional aggregation and per-call point cap.

  3. Merge — group timeseries points by the remote tag value, then stitch each metadata row with its sorted series.

No SQL parser. No cross-store query planner. By design.


Architecture

┌──────────────────────────────────┐
│  MCP client (Claude / Cursor)    │
└────────────┬─────────────────────┘
             │ stdio  (JSON-RPC)
             ▼
┌──────────────────────────────────┐
│       mcp-federated-data         │
│  ┌────────────────────────────┐  │
│  │  Tools  (5 tools)          │  │
│  ├────────────────────────────┤  │
│  │  Entity registry           │  │
│  │  Join-key normalizer       │  │
│  │  Composer engine           │  │
│  │  Guards (limits/timeout)   │  │
│  │  Audit logger              │  │
│  ├────────────────────────────┤  │
│  │  Datasource adapters       │  │
│  └─────────┬────────┬─────────┘  │
└────────────┼────────┼────────────┘
             ▼        ▼
        ┌──────┐ ┌──────────┐
        │MySQL │ │InfluxDB  │
        └──────┘ └──────────┘

Where this fits

mcp-federated-data is schema-driven, not domain-specific. It applies anywhere business metadata lives in a relational store and observed values live in a timeseries store:

  • IoT device telemetry

  • Industrial process monitoring

  • Asset performance management

  • Building automation

  • Energy / power monitoring

  • Environmental sensor networks

  • Network device monitoring

If your stack is MySQL + InfluxDB and you want LLMs to reason over it — this server is for you.


mcp-federated-data

mcp-server-mysql

mcp-server-influxdb

Wren AI / Vanna

Cross-store join

partial (own engine)

Composite tag composer

n/a

LLM-safe SQL guards

varies

n/a

Schema as semantic layer

Configuration

YAML

env / args

env / args

dedicated DSL

Scope

narrow but deep

thin wrapper

thin wrapper

full BI platform


Contributing

Issues and PRs welcome. Before opening one:

  • For new data-source adapters — open an issue first so we can align on the interface.

  • For new tools — match the existing JSON-Schema + audit-log pattern.

  • Public APIs must keep backward compatibility within minor versions.


Acknowledgments


License

MIT

A
license - permissive license
-
quality - not tested
C
maintenance

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/baller-coder/mcp-federated-data'

If you have feedback or need assistance with the MCP directory API, please join our Discord server