Langfuse MCP Server

expert-suggestion.md•12.1 kB

You’ll get the cleanest result if you treat your Langfuse MCP server as: **“A thin, typed, multi-project analytics facade over the Langfuse Metrics + Traces APIs.”** I’ll outline a concrete architecture and then give you a TypeScript-style MCP skeleton you can pretty much drop in. **Confidence:** 0.84 (APIs/docs referenced but I can’t see your exact `openapi.yml` here; adjust names to match it.) --- ## 1. Core design decisions ### (a) Use Langfuse’s JS/TS client or generated client from the OpenAPI Don’t hand-roll HTTP calls from the MCP tools if you can avoid it. Options: 1. **Use `@langfuse/client`** (recommended) * It’s auto-generated from their Public API OpenAPI spec and exposes `langfuse.api.trace.list`, `langfuse.api.metrics.metrics`, etc. ([Langfuse][1]) * URL: `https://langfuse.com/docs/api-and-data-platform/features/query-via-sdk` 2. **Or generate your own client** from `https://cloud.langfuse.com/generated/api/openapi.yml` * Use `openapi-typescript-codegen` or `openapi-generator`. * This makes your MCP server strongly typed and resilient to refactors. Either way: MCP server code should call a **typed `LangfuseClient` layer**, never raw `fetch` scattered in tools. --- ### (b) Multi-project strategy On Langfuse Cloud each **API key pair is project-scoped**. Practically, that means: * Multi-project == **multiple Basic Auth credentials**. * You won’t (on Cloud) list “all projects” via the same public API key; instead you maintain a config. Define a config like: ```ts // langfuse.config.ts export interface LangfuseProjectConfig { id: string; // logical name: "anyteam-prod" baseUrl: string; // e.g. "https://cloud.langfuse.com" username: string; // pk-lf-... password: string; // sk-lf-... } export const projects: LangfuseProjectConfig[] = [ { id: "anyteam-prod", baseUrl: "https://cloud.langfuse.com", username: process.env.LF_ANYTEAM_PROD_PK!, password: process.env.LF_ANYTEAM_PROD_SK!, }, { id: "anyteam-staging", baseUrl: "https://cloud.langfuse.com", username: process.env.LF_ANYTEAM_STG_PK!, password: process.env.LF_ANYTEAM_STG_SK!, }, ]; ``` Rules: * **Secrets only in env**, never in MCP config returned to the client. * Every MCP tool that hits Langfuse takes a `projectId` argument and looks up the right credentials. * Have a `list_projects` tool so the client (Claude / VS Code / your agent) can discover allowed IDs. --- ### (c) Which Langfuse endpoints to expose For “cost & usage of services” you want **aggregates first**, details on demand. Recommended: 1. **Metrics API** – `/api/public/metrics` * Used for: cost, tokens, latency, volume, grouped/sliced by model, tag, environment, user, release, etc. ([Langfuse][2]) * URL: `https://langfuse.com/docs/metrics/features/metrics-api` 2. **Traces & Observations** – e.g.: * `trace.list`, `trace.get` * `observations.getMany` * Use for: “show top N expensive traces for service X”, debugging anomalies. ([Langfuse][1]) 3. **Optional**: * Scores / evals: for quality + cost views. * Daily metrics / legacy endpoints only if you need simpler reports; otherwise stick to Metrics API. Your MCP server should: * Use **Metrics API** for all aggregation. * Use **trace/observation APIs** only to drill into specific entities returned by metrics queries. * Never try to manually aggregate over thousands of traces inside the MCP server unless absolutely necessary. --- ## 2. Tool design (MCP surface) Design tools so that an LLM (or you in a REPL) can ask high-level questions without knowing Langfuse internals. Here’s a practical set. ### 1. `list_projects` * **Input:** none * **Output:** array of `{ id }` * Implementation: read from local config. ### 2. `project_overview` High-level health & spend. **Input:** ```ts { projectId: string; from: string; // ISO to: string; // ISO environment?: string; // prod/stage/etc } ``` **Behavior:** * Calls `/api/public/metrics` with: * `view: "traces"` * `metrics`: `totalCost sum`, `totalTokens sum`, `count count` * `dimensions`: `environment` (and maybe `model` if you like) * Returns JSON like: ```json { "projectId": "anyteam-prod", "totalCostUsd": 123.45, "totalTokens": 567890, "totalTraces": 3210, "byEnvironment": [...], "byModel": [...] } ``` ### 3. `usage_by_model` For PLG/billing. **Input:** ```ts { projectId: string; from: string; to: string; groupBy?: "model" | "name" | "tag"; environment?: string; } ``` **Behavior (example):** * Metrics query: * `view: "observations"` * `dimensions`: `providedModelName` * `metrics`: `totalTokens sum`, `totalCost sum`, `count count` * Use tags/metadata to approximate “service” if you standardize that. ### 4. `usage_by_service` You likely tag traces/observations per feature or microservice (e.g. `service:oraclee`, `feature:live-notes`). **Input:** ```ts { projectId: string; from: string; to: string; serviceTagKey: string; // e.g. "service" environment?: string; } ``` **Behavior:** * Metrics: * `view: "traces"` * `dimensions`: `tags` * `filters`: metadata/tag contains `serviceTagKey:<value>` pattern as per your convention. * MCP handler post-processes to a `{ service: {cost, tokens, traces} }` map the LLM can reason over. ### 5. `top_expensive_traces` For debugging cost spikes. **Input:** ```ts { projectId: string; from: string; to: string; limit?: number; environment?: string; } ``` **Behavior:** * Use metrics *or* `trace.list` (if supported with sort). * Return: * `traceId`, `totalCost`, `totalTokens`, `name`, `userId`, `tags`. ### 6. `get_trace_detail` **Input:** ```ts { projectId: string; traceId: string; } ``` **Behavior:** * `trace.get` + `observations.getMany(trace_id=...)` * Return minimal but rich JSON (no HTML noise): * High-level summary is perfect for MCP clients. You can add more (user summary, score analytics) the same way; the key is: each tool is **one well-defined query** over Langfuse. --- ## 3. Implementation sketch (TypeScript MCP + Langfuse) Here’s a concrete skeleton using `@modelcontextprotocol/sdk` + `@langfuse/client`. Adjust imports to match the current packages. ```ts // src/server.ts import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js"; import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"; import { z } from "zod"; import { LangfuseClient } from "@langfuse/client"; import { projects, LangfuseProjectConfig } from "./langfuse.config"; type ClientMap = Record<string, LangfuseClient>; function createClient(p: LangfuseProjectConfig): LangfuseClient { return new LangfuseClient({ baseUrl: p.baseUrl, publicKey: p.username, secretKey: p.password, }); } const clients: ClientMap = Object.fromEntries( projects.map((p) => [p.id, createClient(p)]) ); function getClient(projectId: string): LangfuseClient { const client = clients[projectId]; if (!client) { throw new Error(`Unknown projectId: ${projectId}`); } return client; } const server = new McpServer({ name: "langfuse-analytics", version: "0.1.0", }); /** * list_projects */ server.registerTool( "list_projects", { title: "List Langfuse projects", description: "List configured Langfuse projects available to this MCP server.", inputSchema: z.object({}), }, async () => { const ids = projects.map((p) => p.id); return { content: [{ type: "text", text: JSON.stringify(ids, null, 2) }], structuredContent: ids, }; } ); /** * project_overview */ server.registerTool( "project_overview", { title: "Project usage & cost overview", description: "Summarize total cost, tokens, and traces for a project over a time window.", inputSchema: z.object({ projectId: z.string(), from: z.string().datetime(), to: z.string().datetime(), environment: z.string().optional(), }), }, async ({ projectId, from, to, environment }) => { const client = getClient(projectId); const filters: any[] = []; if (environment) { filters.push({ column: "environment", operator: "equals", value: environment, type: "string", }); } const query = JSON.stringify({ view: "traces", metrics: [ { measure: "totalCost", aggregation: "sum" }, { measure: "totalTokens", aggregation: "sum" }, { measure: "count", aggregation: "count" }, ], dimensions: [{ field: "environment" }], filters, fromTimestamp: from, toTimestamp: to, timeDimension: { granularity: "day" }, }); const res = await client.api.metrics.metrics({ query }); const output = { projectId, from, to, raw: res.data, }; return { content: [ { type: "text", text: JSON.stringify(output, null, 2), }, ], structuredContent: output, }; } ); // Similar pattern for usage_by_model, usage_by_service, top_expensive_traces, get_trace_detail... // Transport: stdio (for Claude Desktop, Cursor, etc) const transport = new StdioServerTransport(); await server.connect(transport); ``` Notes: * `client.api.metrics.metrics` usage is based on Langfuse docs; confirm exact method names from `@langfuse/client` reference. ([Langfuse][1]) * Each tool returns both a human-readable JSON string and `structuredContent` so the MCP client/LLM can parse. --- ## 4. Auth handling (Basic Auth) Langfuse’s public API uses **Basic Auth with `pk:sk`**. ([Langfuse][2]) If you don’t use `@langfuse/client` and instead hit the OpenAPI-generated client: * For each project config compute header: ```ts const authHeader = "Basic " + Buffer .from(`${username}:${password}`) .toString("base64"); ``` * Set: ```ts fetch(url, { headers: { Authorization: authHeader } }); ``` **Important design choice:** All auth is strictly server-side. MCP clients **never** see your keys, only structured results. --- ## 5. Modeling “services” cleanly To get per-service cost/usage in a sane way: 1. Standardize on **trace/observation tags or metadata**: * e.g. `tags: ["service:anyteam-oraclee", "env:prod"]` * or `metadata.service = "oraclee"` 2. Your MCP tools (`usage_by_service`, `top_expensive_traces`) should: * Accept `service` as input. * Translate to a Metrics API filter: by tag or metadata field. * This keeps all “what is a service?” logic **outside** Langfuse and **inside** config/policy. That gives you: * Per-service daily cost * Per-service per-model split * Mapping back to traces when you need detailed debugging. --- ## 6. Why this is a “good way” (vs alternatives) * **Typed & thin**: All Langfuse surface is behind a single typed client; less drift vs OpenAPI. * **Multi-project safe**: Credentials are local & scoped; MCP consumers choose `projectId`, but never see secrets. * **LLM-friendly**: Tools expose semantic intents (`usage_by_model`, `project_overview`) instead of raw low-level endpoints. * **Composable**: You can drop this into Claude Desktop, Cursor, a custom agent infra, etc, and immediately ask: * “Show last 7 days cost by model for anyteam-prod” * “Find the top 20 traces by cost for oraclee service in prod this week” * **No duplication of Langfuse features**: You delegate aggregation to Metrics API which is built exactly for this. --- If you’d like, next step I can give you: * A full `usage_by_model` + `get_trace_detail` implementation wired to the exact `@langfuse/client` signatures. * A suggested tag/metadata schema for AnyTeam so this MCP server gives you per-microservice and per-feature cost with almost zero extra work. [1]: https://langfuse.com/docs/api-and-data-platform/features/query-via-sdk "Query Data via SDKs - Langfuse" [2]: https://langfuse.com/docs/metrics/features/metrics-api "Metrics API - Langfuse"

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/therealsachin/langfuse-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server