expert-suggestion.md•12.1 kB
You’ll get the cleanest result if you treat your Langfuse MCP server as:
**“A thin, typed, multi-project analytics facade over the Langfuse Metrics + Traces APIs.”**
I’ll outline a concrete architecture and then give you a TypeScript-style MCP skeleton you can pretty much drop in.
**Confidence:** 0.84 (APIs/docs referenced but I can’t see your exact `openapi.yml` here; adjust names to match it.)
---
## 1. Core design decisions
### (a) Use Langfuse’s JS/TS client or generated client from the OpenAPI
Don’t hand-roll HTTP calls from the MCP tools if you can avoid it.
Options:
1. **Use `@langfuse/client`** (recommended)
* It’s auto-generated from their Public API OpenAPI spec and exposes `langfuse.api.trace.list`, `langfuse.api.metrics.metrics`, etc. ([Langfuse][1])
* URL: `https://langfuse.com/docs/api-and-data-platform/features/query-via-sdk`
2. **Or generate your own client** from `https://cloud.langfuse.com/generated/api/openapi.yml`
* Use `openapi-typescript-codegen` or `openapi-generator`.
* This makes your MCP server strongly typed and resilient to refactors.
Either way: MCP server code should call a **typed `LangfuseClient` layer**, never raw `fetch` scattered in tools.
---
### (b) Multi-project strategy
On Langfuse Cloud each **API key pair is project-scoped**. Practically, that means:
* Multi-project == **multiple Basic Auth credentials**.
* You won’t (on Cloud) list “all projects” via the same public API key; instead you maintain a config.
Define a config like:
```ts
// langfuse.config.ts
export interface LangfuseProjectConfig {
id: string; // logical name: "anyteam-prod"
baseUrl: string; // e.g. "https://cloud.langfuse.com"
username: string; // pk-lf-...
password: string; // sk-lf-...
}
export const projects: LangfuseProjectConfig[] = [
{
id: "anyteam-prod",
baseUrl: "https://cloud.langfuse.com",
username: process.env.LF_ANYTEAM_PROD_PK!,
password: process.env.LF_ANYTEAM_PROD_SK!,
},
{
id: "anyteam-staging",
baseUrl: "https://cloud.langfuse.com",
username: process.env.LF_ANYTEAM_STG_PK!,
password: process.env.LF_ANYTEAM_STG_SK!,
},
];
```
Rules:
* **Secrets only in env**, never in MCP config returned to the client.
* Every MCP tool that hits Langfuse takes a `projectId` argument and looks up the right credentials.
* Have a `list_projects` tool so the client (Claude / VS Code / your agent) can discover allowed IDs.
---
### (c) Which Langfuse endpoints to expose
For “cost & usage of services” you want **aggregates first**, details on demand.
Recommended:
1. **Metrics API** – `/api/public/metrics`
* Used for: cost, tokens, latency, volume, grouped/sliced by model, tag, environment, user, release, etc. ([Langfuse][2])
* URL: `https://langfuse.com/docs/metrics/features/metrics-api`
2. **Traces & Observations** – e.g.:
* `trace.list`, `trace.get`
* `observations.getMany`
* Use for: “show top N expensive traces for service X”, debugging anomalies. ([Langfuse][1])
3. **Optional**:
* Scores / evals: for quality + cost views.
* Daily metrics / legacy endpoints only if you need simpler reports; otherwise stick to Metrics API.
Your MCP server should:
* Use **Metrics API** for all aggregation.
* Use **trace/observation APIs** only to drill into specific entities returned by metrics queries.
* Never try to manually aggregate over thousands of traces inside the MCP server unless absolutely necessary.
---
## 2. Tool design (MCP surface)
Design tools so that an LLM (or you in a REPL) can ask high-level questions without knowing Langfuse internals.
Here’s a practical set.
### 1. `list_projects`
* **Input:** none
* **Output:** array of `{ id }`
* Implementation: read from local config.
### 2. `project_overview`
High-level health & spend.
**Input:**
```ts
{
projectId: string;
from: string; // ISO
to: string; // ISO
environment?: string; // prod/stage/etc
}
```
**Behavior:**
* Calls `/api/public/metrics` with:
* `view: "traces"`
* `metrics`: `totalCost sum`, `totalTokens sum`, `count count`
* `dimensions`: `environment` (and maybe `model` if you like)
* Returns JSON like:
```json
{
"projectId": "anyteam-prod",
"totalCostUsd": 123.45,
"totalTokens": 567890,
"totalTraces": 3210,
"byEnvironment": [...],
"byModel": [...]
}
```
### 3. `usage_by_model`
For PLG/billing.
**Input:**
```ts
{
projectId: string;
from: string;
to: string;
groupBy?: "model" | "name" | "tag";
environment?: string;
}
```
**Behavior (example):**
* Metrics query:
* `view: "observations"`
* `dimensions`: `providedModelName`
* `metrics`: `totalTokens sum`, `totalCost sum`, `count count`
* Use tags/metadata to approximate “service” if you standardize that.
### 4. `usage_by_service`
You likely tag traces/observations per feature or microservice (e.g. `service:oraclee`, `feature:live-notes`).
**Input:**
```ts
{
projectId: string;
from: string;
to: string;
serviceTagKey: string; // e.g. "service"
environment?: string;
}
```
**Behavior:**
* Metrics:
* `view: "traces"`
* `dimensions`: `tags`
* `filters`: metadata/tag contains `serviceTagKey:<value>` pattern as per your convention.
* MCP handler post-processes to a `{ service: {cost, tokens, traces} }` map the LLM can reason over.
### 5. `top_expensive_traces`
For debugging cost spikes.
**Input:**
```ts
{
projectId: string;
from: string;
to: string;
limit?: number;
environment?: string;
}
```
**Behavior:**
* Use metrics *or* `trace.list` (if supported with sort).
* Return:
* `traceId`, `totalCost`, `totalTokens`, `name`, `userId`, `tags`.
### 6. `get_trace_detail`
**Input:**
```ts
{
projectId: string;
traceId: string;
}
```
**Behavior:**
* `trace.get` + `observations.getMany(trace_id=...)`
* Return minimal but rich JSON (no HTML noise):
* High-level summary is perfect for MCP clients.
You can add more (user summary, score analytics) the same way; the key is: each tool is **one well-defined query** over Langfuse.
---
## 3. Implementation sketch (TypeScript MCP + Langfuse)
Here’s a concrete skeleton using `@modelcontextprotocol/sdk` + `@langfuse/client`. Adjust imports to match the current packages.
```ts
// src/server.ts
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { LangfuseClient } from "@langfuse/client";
import { projects, LangfuseProjectConfig } from "./langfuse.config";
type ClientMap = Record<string, LangfuseClient>;
function createClient(p: LangfuseProjectConfig): LangfuseClient {
return new LangfuseClient({
baseUrl: p.baseUrl,
publicKey: p.username,
secretKey: p.password,
});
}
const clients: ClientMap = Object.fromEntries(
projects.map((p) => [p.id, createClient(p)])
);
function getClient(projectId: string): LangfuseClient {
const client = clients[projectId];
if (!client) {
throw new Error(`Unknown projectId: ${projectId}`);
}
return client;
}
const server = new McpServer({
name: "langfuse-analytics",
version: "0.1.0",
});
/**
* list_projects
*/
server.registerTool(
"list_projects",
{
title: "List Langfuse projects",
description: "List configured Langfuse projects available to this MCP server.",
inputSchema: z.object({}),
},
async () => {
const ids = projects.map((p) => p.id);
return {
content: [{ type: "text", text: JSON.stringify(ids, null, 2) }],
structuredContent: ids,
};
}
);
/**
* project_overview
*/
server.registerTool(
"project_overview",
{
title: "Project usage & cost overview",
description:
"Summarize total cost, tokens, and traces for a project over a time window.",
inputSchema: z.object({
projectId: z.string(),
from: z.string().datetime(),
to: z.string().datetime(),
environment: z.string().optional(),
}),
},
async ({ projectId, from, to, environment }) => {
const client = getClient(projectId);
const filters: any[] = [];
if (environment) {
filters.push({
column: "environment",
operator: "equals",
value: environment,
type: "string",
});
}
const query = JSON.stringify({
view: "traces",
metrics: [
{ measure: "totalCost", aggregation: "sum" },
{ measure: "totalTokens", aggregation: "sum" },
{ measure: "count", aggregation: "count" },
],
dimensions: [{ field: "environment" }],
filters,
fromTimestamp: from,
toTimestamp: to,
timeDimension: { granularity: "day" },
});
const res = await client.api.metrics.metrics({ query });
const output = {
projectId,
from,
to,
raw: res.data,
};
return {
content: [
{
type: "text",
text: JSON.stringify(output, null, 2),
},
],
structuredContent: output,
};
}
);
// Similar pattern for usage_by_model, usage_by_service, top_expensive_traces, get_trace_detail...
// Transport: stdio (for Claude Desktop, Cursor, etc)
const transport = new StdioServerTransport();
await server.connect(transport);
```
Notes:
* `client.api.metrics.metrics` usage is based on Langfuse docs; confirm exact method names from `@langfuse/client` reference. ([Langfuse][1])
* Each tool returns both a human-readable JSON string and `structuredContent` so the MCP client/LLM can parse.
---
## 4. Auth handling (Basic Auth)
Langfuse’s public API uses **Basic Auth with `pk:sk`**. ([Langfuse][2])
If you don’t use `@langfuse/client` and instead hit the OpenAPI-generated client:
* For each project config compute header:
```ts
const authHeader = "Basic " + Buffer
.from(`${username}:${password}`)
.toString("base64");
```
* Set:
```ts
fetch(url, {
headers: { Authorization: authHeader }
});
```
**Important design choice:**
All auth is strictly server-side. MCP clients **never** see your keys, only structured results.
---
## 5. Modeling “services” cleanly
To get per-service cost/usage in a sane way:
1. Standardize on **trace/observation tags or metadata**:
* e.g. `tags: ["service:anyteam-oraclee", "env:prod"]`
* or `metadata.service = "oraclee"`
2. Your MCP tools (`usage_by_service`, `top_expensive_traces`) should:
* Accept `service` as input.
* Translate to a Metrics API filter: by tag or metadata field.
* This keeps all “what is a service?” logic **outside** Langfuse and **inside** config/policy.
That gives you:
* Per-service daily cost
* Per-service per-model split
* Mapping back to traces when you need detailed debugging.
---
## 6. Why this is a “good way” (vs alternatives)
* **Typed & thin**: All Langfuse surface is behind a single typed client; less drift vs OpenAPI.
* **Multi-project safe**: Credentials are local & scoped; MCP consumers choose `projectId`, but never see secrets.
* **LLM-friendly**: Tools expose semantic intents (`usage_by_model`, `project_overview`) instead of raw low-level endpoints.
* **Composable**: You can drop this into Claude Desktop, Cursor, a custom agent infra, etc, and immediately ask:
* “Show last 7 days cost by model for anyteam-prod”
* “Find the top 20 traces by cost for oraclee service in prod this week”
* **No duplication of Langfuse features**: You delegate aggregation to Metrics API which is built exactly for this.
---
If you’d like, next step I can give you:
* A full `usage_by_model` + `get_trace_detail` implementation wired to the exact `@langfuse/client` signatures.
* A suggested tag/metadata schema for AnyTeam so this MCP server gives you per-microservice and per-feature cost with almost zero extra work.
[1]: https://langfuse.com/docs/api-and-data-platform/features/query-via-sdk "Query Data via SDKs - Langfuse"
[2]: https://langfuse.com/docs/metrics/features/metrics-api "Metrics API - Langfuse"