therealsachin

Langfuse MCP Server

get_observations

Retrieve and filter LLM generations, spans, and events with timestamps, trace IDs, models, and log levels for analysis.

Instructions

Get LLM generations/spans with details and filtering.

Input Schema

Name      Required   Description                                                 Default
traceId   No         Filter by specific trace ID                                 -
from      No         Start timestamp (ISO 8601)                                  -
to        No         End timestamp (ISO 8601)                                    -
limit     No         Maximum number of observations to return (default: 25)      -
type      No         Filter by observation type                                  -
model     No         Filter by model name (substring match)                      -
name      No         Filter by observation name (substring match)                -
level     No         Filter by log level                                         -
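
As a rough illustration of how these parameters combine, the sketch below builds an arguments object for a get_observations call. The `GetObservationsArgs` type, the `buildArgs` helper, and the `'gpt'` model substring are hypothetical names introduced here; the field names and enum values come from the schemas on this page. The two schemas quoted under Implementation Reference disagree on `limit` (the JSON schema allows up to 100 and describes a default of 25, while the Zod schema caps it at 50 with a default of 10), so the sketch clamps to the stricter bound.

```typescript
// Hypothetical type mirroring the eight documented parameters; all are optional.
type ObservationType = 'GENERATION' | 'SPAN' | 'EVENT';
type ObservationLevel = 'DEBUG' | 'DEFAULT' | 'WARNING' | 'ERROR';

interface GetObservationsArgs {
  traceId?: string;
  from?: string; // ISO 8601 timestamp
  to?: string;   // ISO 8601 timestamp
  limit?: number;
  type?: ObservationType;
  model?: string; // substring match
  name?: string;  // substring match
  level?: ObservationLevel;
}

// Hypothetical helper: converts dates to ISO 8601 strings and clamps `limit`
// to 1..50, the stricter of the two bounds shown in the implementation.
function buildArgs(
  filters: Omit<GetObservationsArgs, 'from' | 'to'>,
  from?: Date,
  to?: Date
): GetObservationsArgs {
  const args: GetObservationsArgs = { ...filters };
  if (from) args.from = from.toISOString();
  if (to) args.to = to.toISOString();
  if (args.limit !== undefined) {
    args.limit = Math.min(50, Math.max(1, Math.floor(args.limit)));
  }
  return args;
}

// Example: ERROR-level generations from a one-day window.
const exampleArgs = buildArgs(
  { type: 'GENERATION', level: 'ERROR', model: 'gpt', limit: 75 },
  new Date('2024-01-01T00:00:00Z'),
  new Date('2024-01-02T00:00:00Z')
);
```

Clamping on the caller's side avoids a validation error when a value such as 75 passes the advertised JSON schema but would be rejected by the Zod schema.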

Implementation Reference

  • Core implementation of the get_observations tool handler. Fetches observations from Langfuse, processes them (including truncation of input/output), applies client-side filters and pagination, and returns a JSON-formatted response.
    export async function getObservations(
      client: LangfuseAnalyticsClient,
      args: z.infer<typeof getObservationsSchema>
    ) {
      const response = await client.listObservations({
        fromStartTime: args.from,
        toStartTime: args.to,
        limit: args.limit,
        page: args.page,
        name: args.name,
        userId: args.userId,
        type: args.type,
        traceId: args.traceId,
        level: args.level,
      });
    
      let observations = response.data || [];
    
      // Helper function to truncate content
      const truncateContent = (content: any, maxLength: number): any => {
        if (!content) return content;
        if (typeof content === 'string') {
          return content.length > maxLength
            ? content.substring(0, maxLength) + '...[truncated]'
            : content;
        }
        if (typeof content === 'object') {
          const jsonStr = JSON.stringify(content);
          return jsonStr.length > maxLength
            ? jsonStr.substring(0, maxLength) + '...[truncated]'
            : content;
        }
        return content;
      };
    
      // Process and filter observations with content size control
      let processedObservations = observations.map((obs: any) => {
        const baseObs: any = {
          id: obs.id,
          traceId: obs.traceId,
          type: obs.type || 'SPAN',
          name: obs.name || 'Unnamed observation',
          startTime: obs.startTime,
          endTime: obs.endTime,
          model: obs.model,
          usage: {
            input: obs.usage?.input || obs.inputTokens,
            output: obs.usage?.output || obs.outputTokens,
            total: obs.usage?.total || obs.totalTokens,
          },
          cost: obs.calculatedTotalCost || obs.cost,
          level: obs.level || 'DEFAULT',
        };
    
        // Only include input/output if requested, and truncate if necessary
        if (args.includeInputOutput) {
          baseObs.input = truncateContent(obs.input, args.truncateContent);
          baseObs.output = truncateContent(obs.output, args.truncateContent);
          baseObs.modelParameters = obs.modelParameters;
        }
    
        return baseObs;
      });
    
      // Apply filters
      if (args.type) {
        processedObservations = processedObservations.filter((obs: any) => obs.type === args.type);
      }
      if (args.model) {
        processedObservations = processedObservations.filter((obs: any) =>
          obs.model && obs.model.toLowerCase().includes(args.model!.toLowerCase())
        );
      }
      if (args.name) {
        processedObservations = processedObservations.filter((obs: any) =>
          obs.name.toLowerCase().includes(args.name!.toLowerCase())
        );
      }
      if (args.level) {
        processedObservations = processedObservations.filter((obs: any) => obs.level === args.level);
      }
      if (args.minCost !== undefined) {
        processedObservations = processedObservations.filter((obs: any) =>
          (obs.cost || 0) >= args.minCost!
        );
      }
      if (args.maxCost !== undefined) {
        processedObservations = processedObservations.filter((obs: any) =>
          (obs.cost || 0) <= args.maxCost!
        );
      }
    
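      // Apply pagination. Note: page and limit were already passed to listObservations above;
      // if that call paginates server-side, this slice returns an empty list for page > 1.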
      const startIndex = (args.page - 1) * args.limit;
      const paginatedObservations = processedObservations.slice(startIndex, startIndex + args.limit);
    
      const result: ObservationsResponse = {
        projectId: client.getProjectId(),
        observations: paginatedObservations,
        pagination: {
          page: args.page,
          limit: args.limit,
          total: processedObservations.length,
        },
      };
    
      return {
        content: [
          {
            type: 'text' as const,
            text: JSON.stringify(result, null, 2),
          },
        ],
      };
    }
  • Zod schema defining input parameters for the get_observations tool, including filters, pagination, and content options.
    export const getObservationsSchema = z.object({
      traceId: z.string().optional(),
      from: z.string().datetime().optional(),
      to: z.string().datetime().optional(),
      limit: z.number().min(1).max(50).default(10), // Reduced default limit
      page: z.number().min(1).default(1),
      type: z.enum(['GENERATION', 'SPAN', 'EVENT']).optional(),
      model: z.string().optional(),
      name: z.string().optional(),
      userId: z.string().optional(),
      level: z.enum(['DEBUG', 'DEFAULT', 'WARNING', 'ERROR']).optional(),
      minCost: z.number().optional(),
      maxCost: z.number().optional(),
      includeInputOutput: z.boolean().default(false), // New option to include full content
      truncateContent: z.number().min(100).max(2000).default(500), // Max chars for input/output
    });
  • src/index.ts:408-454 (registration)
    MCP tool registration in the server's listTools handler, providing the tool name, description, and JSON input schema.
    {
      name: 'get_observations',
      description: 'Get LLM generations/spans with details and filtering.',
      inputSchema: {
        type: 'object',
        properties: {
          traceId: {
            type: 'string',
            description: 'Filter by specific trace ID',
          },
          from: {
            type: 'string',
            format: 'date-time',
            description: 'Start timestamp (ISO 8601)',
          },
          to: {
            type: 'string',
            format: 'date-time',
            description: 'End timestamp (ISO 8601)',
          },
          limit: {
            type: 'number',
            minimum: 1,
            maximum: 100,
            description: 'Maximum number of observations to return (default: 25)',
          },
          type: {
            type: 'string',
            enum: ['GENERATION', 'SPAN', 'EVENT'],
            description: 'Filter by observation type',
          },
          model: {
            type: 'string',
            description: 'Filter by model name (substring match)',
          },
          name: {
            type: 'string',
            description: 'Filter by observation name (substring match)',
          },
          level: {
            type: 'string',
            enum: ['DEBUG', 'DEFAULT', 'WARNING', 'ERROR'],
            description: 'Filter by log level',
          },
        },
      },
    },
  • src/index.ts:1046-1049 (registration)
    Dispatch handler in the MCP server's callTool request handler that parses arguments using the schema and invokes the getObservations handler function.
    case 'get_observations': {
      const args = getObservationsSchema.parse(request.params.arguments);
      return await getObservations(this.client, args);
    }
  • src/index.ts:61-61 (registration)
    Import statement bringing in the handler function and schema from the tool module.
    import { getObservations, getObservationsSchema } from './tools/get-observations.js';
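
To make the handler's post-processing concrete, here is a reduced, self-contained sketch of the truncation and client-side filtering steps. The `Obs` type, the `filterObs` function, and the sample data are invented for illustration; the logic follows the `truncateContent` helper and the filter chain quoted above but is not the server's code. One behavior worth noticing: an oversized object comes back as a truncated JSON string, not as an object.

```typescript
// Illustrative record shape (hypothetical); the real handler works on `any`.
interface Obs {
  name: string;
  type: 'GENERATION' | 'SPAN' | 'EVENT';
  model?: string;
  cost?: number;
}

// Same idea as the handler's helper: strings are cut at maxLength,
// objects are serialized and cut only if their JSON form is too long.
function truncateContent(content: unknown, maxLength: number): unknown {
  if (!content) return content;
  const marker = '...[truncated]';
  if (typeof content === 'string') {
    return content.length > maxLength ? content.substring(0, maxLength) + marker : content;
  }
  if (typeof content === 'object') {
    const json = JSON.stringify(content);
    return json.length > maxLength ? json.substring(0, maxLength) + marker : content;
  }
  return content;
}

// Filters combine with AND; `model` is a case-insensitive substring match,
// and observations without a model never match a model filter.
function filterObs(
  items: Obs[],
  f: { type?: Obs['type']; model?: string; minCost?: number }
): Obs[] {
  return items.filter((o) => {
    if (f.type && o.type !== f.type) return false;
    if (f.model && !(o.model ?? '').toLowerCase().includes(f.model.toLowerCase())) return false;
    if (f.minCost !== undefined && (o.cost ?? 0) < f.minCost) return false;
    return true;
  });
}

// Invented sample data.
const sample: Obs[] = [
  { name: 'chat', type: 'GENERATION', model: 'GPT-4o', cost: 0.02 },
  { name: 'retrieve', type: 'SPAN' },
  { name: 'summarize', type: 'GENERATION', model: 'claude-3', cost: 0.001 },
];

const matched = filterObs(sample, { type: 'GENERATION', model: 'gpt' });
const short = truncateContent('abcdefghij', 4);
const big = truncateContent({ text: 'x'.repeat(50) }, 10);
```

Because the filters run after the fetch, they only narrow the observations already returned by the API call; they do not search the whole project.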
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'details and filtering' but lacks critical information: it doesn't specify if this is a read-only operation (implied by 'Get' but not explicit), describe pagination behavior (though 'limit' parameter hints at it), or explain return format. For a tool with 8 parameters and no annotations, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
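
One way to close this gap, offered as a suggestion and not as the server's current code, is to extend the tool registration with MCP tool annotations and a fuller description. The `readOnlyHint` and `openWorldHint` fields come from the MCP specification's tool annotations; the `ToolEntry` type, the extra description sentences, and the assumption that the tool is read-only (inferred from the handler only calling `listObservations`) are introduced here for illustration.

```typescript
// Annotation fields as named in the MCP specification (all optional hints).
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Hypothetical, reduced entry type; inputSchema is omitted for brevity.
interface ToolEntry {
  name: string;
  description: string;
  annotations?: ToolAnnotations;
}

// Hypothetical revised registration entry. Only `name` and the first
// description sentence are taken from the server's source.
const getObservationsTool: ToolEntry = {
  name: 'get_observations',
  description: [
    'Get LLM generations/spans with details and filtering.',
    'Read-only: lists observations and does not modify any data.', // assumption
    'Filters combine with AND; model and name are case-insensitive substring matches.',
    'Returns JSON text with projectId, observations, and pagination (page, limit, total).',
  ].join(' '),
  annotations: {
    readOnlyHint: true,  // assumed: the handler only lists observations
    openWorldHint: true, // the tool talks to an external service (Langfuse)
  },
};
```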

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence: 'Get LLM generations/spans with details and filtering.' It's front-loaded with the core purpose and includes key capabilities without waste. Every word earns its place, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, no annotations, no output schema), the description is adequate but has clear gaps. It covers the basic purpose and filtering aspect, but without annotations or output schema, it misses behavioral details like safety, response format, or error handling. This leaves the agent with incomplete context for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, meaning all parameters are documented in the schema. The description adds minimal value beyond the schema—it mentions 'filtering' which aligns with the parameters but doesn't provide additional context like how filters combine or default behaviors. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get LLM generations/spans with details and filtering.' It specifies the verb ('Get') and resource ('LLM generations/spans'), and mentions filtering capabilities. However, it doesn't explicitly differentiate from sibling tools like 'get_traces' or 'get_observation_detail', which appear related but have different scopes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_traces' (which might retrieve broader trace data) or 'get_observation_detail' (which might fetch a single observation), leaving the agent without context for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/therealsachin/langfuse-mcp'
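
The same request can be issued from TypeScript. This is a minimal sketch assuming a runtime with a global `fetch` (Node 18+ or a browser) and assuming the endpoint returns JSON; `serverUrl` and `fetchServer` are hypothetical helper names, and the response is typed as `unknown` because its shape is not documented on this page.

```typescript
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Hypothetical helper: builds the directory URL for one server.
function serverUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: GETs the server record and returns the parsed body.
async function fetchServer(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(serverUrl(owner, repo));
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}

// Usage (performs a network request):
// fetchServer('therealsachin', 'langfuse-mcp').then(console.log);
```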

If you have feedback or need assistance with the MCP directory API, please join our Discord server.