Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| LANGFUSE_HOST | Yes | Langfuse base URL | |
| LANGFUSE_TIMEOUT | No | HTTP request timeout (Spring Duration format, e.g. 30s, 1m) | 30s |
| LANGFUSE_PUBLIC_KEY | Yes | Langfuse project public key | |
| LANGFUSE_SECRET_KEY | Yes | Langfuse project secret key | |
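
A minimal sketch of how a launcher script might check these variables before starting the server. The helper names `parse_simple_duration` and `load_config` are hypothetical and not part of the server. The parser covers only the `<number><unit>` subset of the Spring Duration format (`ms`, `s`, `m`, `h`); the server performs its own parsing and may accept more forms.

```python
import re

# Variables marked "Yes" in the Required column of the table above.
REQUIRED = ("LANGFUSE_HOST", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")

# Seconds per unit for the subset of suffixes handled by this sketch.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_simple_duration(text):
    """Convert a value such as '30s' or '1m' into seconds (float)."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def load_config(env):
    """Validate a mapping of environment variables and apply the default timeout."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    return {
        "host": env["LANGFUSE_HOST"],
        "public_key": env["LANGFUSE_PUBLIC_KEY"],
        "secret_key": env["LANGFUSE_SECRET_KEY"],
        # LANGFUSE_TIMEOUT is optional; the table lists 30s as its default.
        "timeout_seconds": parse_simple_duration(env.get("LANGFUSE_TIMEOUT", "30s")),
    }

# Typical use: config = load_config(os.environ)
```

Failing fast on a missing key gives a clearer error than a rejected request at the first tool call.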
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": true} |
| logging | {} |
| prompts | {"listChanged": true} |
| resources | {"subscribe": false, "listChanged": true} |
| completions | {} |
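
As a rough illustration of how a client could read the capability table above, the sketch below copies the advertised values into a dictionary and derives two things from it: which list-changed notifications the server may send, and whether per-resource subscriptions are offered. The function names are hypothetical. The notification method names follow the Model Context Protocol naming pattern `notifications/<feature>/list_changed`.

```python
# Capability object as advertised in the table above.
CAPABILITIES = {
    "tools": {"listChanged": True},
    "logging": {},
    "prompts": {"listChanged": True},
    "resources": {"subscribe": False, "listChanged": True},
    "completions": {},
}


def list_changed_notifications(capabilities):
    """Return the notification methods a client should be ready to handle."""
    return sorted(
        f"notifications/{feature}/list_changed"
        for feature, options in capabilities.items()
        if options.get("listChanged")
    )


def can_subscribe_to_resources(capabilities):
    """True only when the server advertises resources.subscribe."""
    return bool(capabilities.get("resources", {}).get("subscribe", False))
```

With the values listed here, a client would expect list-changed notifications for tools, prompts, and resources, and would not attempt resource subscriptions.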
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| create_annotation_queue | Creates a new annotation queue for human review workflows. Returns the created queue with its assigned ID. name is required. description and scoreConfigId are optional. |
| create_annotation_queue_item | Adds an item to an annotation queue for human review. queueId and objectId are required. objectType can be SESSION, TRACE, or OBSERVATION. status is optional and accepts PENDING or COMPLETED. |
| delete_annotation_queue_item | Removes an item from an annotation queue. This action is irreversible. Both queueId and itemId are required. |
| get_annotation_queue | Returns a single annotation queue by its ID. Returns: id, name, description, scoreConfigId, projectId, createdAt, updatedAt. queueId is required. |
| get_annotation_queue_item | Returns a specific item from an annotation queue by queue ID and item ID. Returns: id, queueId, traceId, observationId, status, annotatorUserId, completedAt. Both queueId and itemId are required. |
| list_annotation_queue_items | Returns items in a specific annotation queue, optionally filtered by status. status values: PENDING or COMPLETED. Omit status to return all items regardless of status. Each item contains: id, queueId, traceId, observationId, status, annotatorUserId, completedAt. queueId is required. |
| list_annotation_queues | Returns a paginated list of annotation queues in the Langfuse project. Each queue contains: id, name, description, scoreConfigId, projectId, createdAt, updatedAt. Annotation queues are used for human-in-the-loop review workflows. Pagination: page is 1-based (default 1), limit controls page size (default 20). |
| update_annotation_queue_item | Updates the status of an annotation queue item. status values: PENDING or COMPLETED. Both queueId and itemId are required. |
| create_comment | Creates a comment attached to a trace, observation, session, or prompt. objectType values: TRACE, OBSERVATION, SESSION, or PROMPT. Both objectType and objectId are required along with content. Returns the created comment with its assigned ID. |
| get_comment | Returns a single comment by its ID. Returns: id, objectType, objectId, content, authorUserId, createdAt, updatedAt. commentId is required. |
| get_comments | Lists comments filtered by objectType and objectId. objectType values: TRACE or OBSERVATION. Returns: id, objectType, objectId, content, authorUserId, createdAt. Read-only. |
| create_dataset_run_item | Creates a dataset run item and creates or updates the dataset run if needed. runName and datasetItemId are required. traceId is strongly recommended and observationId is optional. metadataJson must be valid JSON when provided. |
| delete_dataset_run | Deletes a dataset run and all its run items. This action is irreversible. Use this to clean up experiment runs you no longer need. Both datasetName and runName are required. |
| get_dataset_run | Returns a single dataset run including all its run items. Each run item links a dataset item to a trace and optional observation. Returns: id, name, datasetName, metadata, createdAt, updatedAt, datasetRunItems[]. Both datasetName and runName are required. |
| list_dataset_run_items | Returns a paginated list of items in a specific dataset run. Each run item links a dataset item to a trace and optional observation for evaluation. Returns: id, datasetRunId, datasetRunName, datasetItemId, traceId, observationId, createdAt. Both datasetId and runName are required. |
| list_dataset_runs | Returns a paginated list of runs for a specific dataset. Each run represents one experiment executed against a dataset. Returns: id, name, datasetId, datasetName, metadata, createdAt, updatedAt. datasetName is required. Pagination: page 1-based (default 1), limit (default 20). |
| create_dataset | Creates a new dataset in Langfuse. name is required. description is optional. metadataJson, inputSchemaJson, and expectedOutputSchemaJson must be valid JSON when provided. Returns the created dataset definition. |
| create_dataset_item | Creates or upserts a dataset item in an existing dataset. datasetName is required. inputJson, expectedOutputJson, and metadataJson must be valid JSON when provided. Optional sourceTraceId or sourceObservationId can link the item back to Langfuse data. |
| delete_dataset_item | Deletes a dataset item by its ID. This action is irreversible. |
| get_dataset | Get a Langfuse dataset by name. Read-only. |
| get_dataset_item | Get a single dataset item by ID. Read-only. |
| list_dataset_items | List items in a dataset with pagination. Read-only. |
| list_datasets | List all evaluation datasets in the Langfuse project. Read-only. |
| list_llm_connections | Returns a paginated list of LLM provider connections configured in the Langfuse project. Each connection contains: id, provider, displaySecretKey (masked), baseURL, config. LLM connections define the provider credentials used by Langfuse for evaluations and playground. Pagination: page is 1-based (default 1), limit controls page size (default 20). |
| upsert_llm_connection | Creates or updates an LLM provider connection (upserted by provider name). If a connection for the given provider already exists, it is updated. provider and secretKey are required. provider examples: openai, anthropic, azure, google. |
| create_model | Creates a custom model definition for cost tracking and token pricing. modelName, matchPattern, and unit are required. unit values: TOKENS, CHARACTERS, MILLISECONDS, SECONDS, IMAGES, or REQUESTS. Prices are per unit in USD (e.g. inputPrice=0.000001 means $1 per million tokens). Omit prices for models where you do not want cost tracking. startDate format: ISO-8601 date, e.g. 2025-01-01T00:00:00Z. |
| delete_model | Deletes a custom model definition by ID. Note: Langfuse-managed models cannot be deleted. Only custom models you created can be deleted. To override a Langfuse-managed model, create a new custom model with the same modelName instead. modelId is required. This action is irreversible. |
| get_model | Returns a single model definition by its ID. Returns: id, modelName, matchPattern, unit, inputPrice, outputPrice, totalPrice, startDate, tokenizerId, isLangfuseManaged, projectId. modelId is required. |
| list_models | Returns a paginated list of all models in the Langfuse project, including both Langfuse-managed models and custom models you have defined. Each model contains: id, modelName, matchPattern, unit, inputPrice, outputPrice, totalPrice, startDate, tokenizerId, isLangfuseManaged. Pagination: page is 1-based (default 1), limit controls page size (default 20). |
| get_projects_for_api_key | Returns the project or projects visible to the currently configured API key. With a project-scoped key this normally returns one project. With broader credentials, use this to confirm which project metadata is available. |
| get_prompt | Fetch a specific prompt by name. Optionally pin to a version number or a label (e.g. 'production', 'staging'). Returns: name, version, type (text or chat), prompt content, labels, tags, config. Read-only. |
| list_prompts | List all prompts in the Langfuse project with pagination. Read-only. |
| create_prompt | Creates a new version of a prompt. If the prompt name does not exist, a new prompt is created. If it does exist, a new version is appended. type values: text (plain string prompt) or chat (array of message objects). For text prompts, provide prompt as a plain string. For chat prompts, provide prompt as a JSON array of message objects with role and content fields. labels examples: production, staging, latest. The 'latest' label is managed by Langfuse automatically. Returns the created prompt version with its assigned version number. name, type, and prompt are required. |
| delete_prompt | Deletes prompt versions by name. Behaviour depends on which filters are supplied. |
| update_prompt_labels | Replaces the labels on a specific prompt version. newLabels completely replaces the existing label set on that version. The 'latest' label is reserved and managed by Langfuse — do not include it. Both promptName and version are required. newLabels is required — supply an empty string to remove all labels from this version. |
| get_data_schema | Returns the Langfuse data model schema: all entity types, fields, and valid enum values. Call this first before running any query to understand the available data structures. Read-only. |
| create_score_config | Creates a score config definition used to validate or structure future scores. name and dataType are required. For categorical configs, categoriesJson should be a JSON array of {label,value} objects. For numeric configs, minValue and maxValue are optional bounds. |
| get_score | Fetch a single evaluation score by ID. Read-only. |
| get_score_config | Get a specific score config schema by ID. Read-only. |
| get_score_configs | List all score config schemas. Configs define constraints for NUMERIC (min/max), CATEGORICAL (allowed categories), or BOOLEAN scores. Read-only. |
| get_scores | List evaluation scores with optional filters. dataType values: NUMERIC, CATEGORICAL, or BOOLEAN. Returns: id, traceId, observationId, name, value, dataType, comment, source. Read-only. |
| update_score_config | Updates an existing score config. configId is required. Provide only the fields you want to change. categoriesJson must be a JSON array of {label,value} objects when supplied. |
| fetch_sessions | Paginated list of all sessions with optional time range filter. Read-only. |
| get_session_details | Full details of one session including all its traces. Read-only. |
| get_user_sessions | All sessions for a specific user with pagination. Read-only. |
| delete_trace | Deletes a single trace by ID. This action is irreversible. traceId is required. |
| delete_traces | Deletes multiple traces in one request. Pass a comma-separated list of trace IDs. This action is irreversible. |
| fetch_trace | Returns the full detail of a single Langfuse trace identified by its ID. The response includes all observations (spans, generations, events) nested under the trace, as well as input/output payloads, metadata, tags, latency, and token usage. Use this after fetch_traces to drill into a specific trace. The traceId is required. |
| fetch_traces | Returns a paginated list of Langfuse traces. Each trace represents one end-to-end LLM pipeline execution. The response includes: id, name, userId, sessionId, level (DEFAULT, DEBUG, WARNING, or ERROR), latency (seconds), totalTokens, totalCost (USD), tags, timestamp. All filter parameters are optional. Omit any filter you do not need; omitted filters are ignored and do not narrow the result set. Pagination: page is 1-based (default 1), limit controls page size (default 20, max 100). To page through results, increment page while keeping limit fixed. |
| find_exceptions | Returns only traces whose level field equals ERROR. Filtering is performed on the server before the response is returned — the result set contains error traces only, never a mix of levels. Useful for surfacing pipeline failures and debugging production errors. Both time range parameters are optional. Omit them to search across all time. Pagination works the same way as fetch_traces. |
| find_exceptions_in_file | Returns ERROR-level traces whose metadata contains the given file name as a substring. Both conditions must be true for a trace to appear in the result: (1) the trace level is ERROR, and (2) the trace metadata JSON contains the fileName string anywhere inside it. Use this to isolate errors originating from a specific source file. fileName is required. Both time range parameters are optional — omit them to search across the full project history. |
| get_error_count | Returns the total count of ERROR-level traces within the specified time range. The server scans up to 500 traces (5 pages of 100) and counts those with level=ERROR. The response contains errorCount, fromTimestamp, and toTimestamp. Both time range parameters are optional. Omit them to count errors across all time. Use this for a quick health signal before drilling into individual traces with find_exceptions. |
| get_exception_details | Returns the full detail of a single ERROR-level trace identified by its ID. Equivalent to fetch_trace but semantically scoped to error traces. Use this after find_exceptions to inspect a specific failure in depth — the response includes all nested observations, input/output, metadata, and timing. The traceId is required. |
| get_user_traces | All traces for a specific user with pagination. Read-only. |
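
The pagination rule described for fetch_traces (1-based page, fixed limit, default 20, max 100) can be sketched as follows. `build_tool_call` assembles a JSON-RPC `tools/call` request in the shape the Model Context Protocol defines. `iter_pages` and its `fetch_page` callback are hypothetical: the callback is assumed to send the request and return one page of items as a list, since the exact response layout is not documented here. Stopping on a short page is likewise an assumption, not documented server behaviour.

```python
from itertools import count


def build_tool_call(request_id, name, arguments):
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def iter_pages(fetch_page, limit=20, max_limit=100):
    """Yield items page by page, incrementing page while keeping limit fixed.

    fetch_page(page, limit) is supplied by the caller and must return a list.
    Iteration stops at the first page that holds fewer than `limit` items.
    """
    if not 1 <= limit <= max_limit:
        raise ValueError(f"limit must be between 1 and {max_limit}")
    for page in count(1):  # pages are 1-based
        items = fetch_page(page, limit)
        yield from items
        if len(items) < limit:
            break


# Example request for the second page of traces, 50 per page.
request = build_tool_call(1, "fetch_traces", {"page": 2, "limit": 50})
```

Keeping the transport behind `fetch_page` lets the same loop serve the other paginated tools in the table, such as list_annotation_queues or list_models.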
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |