
LangSmith MCP Server

Official
by langchain-ai

Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| LANGSMITH_API_KEY | Yes | Your LangSmith API key for authentication | |
| LANGSMITH_ENDPOINT | No | Custom API endpoint URL (for self-hosted or EU region) | https://api.smith.langchain.com |
| LANGSMITH_WORKSPACE_ID | No | Workspace ID for API keys scoped to multiple workspaces | |
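As a quick sketch of how these variables are typically consumed, the snippet below reads them from the environment, falling back to the documented default endpoint. The placeholder key value is illustrative only; use your real key in practice.

```python
import os

# Illustrative placeholder value, not a real key.
os.environ.setdefault("LANGSMITH_API_KEY", "lsv2_pt_example_key")

api_key = os.environ["LANGSMITH_API_KEY"]  # required
# Optional; falls back to the documented default endpoint.
endpoint = os.environ.get("LANGSMITH_ENDPOINT", "https://api.smith.langchain.com")
workspace_id = os.environ.get("LANGSMITH_WORKSPACE_ID")  # optional, may be None
```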

Capabilities

Features and capabilities supported by this server

| Capability | Details |
|------------|---------|
| tools | `{"listChanged": true}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |

Tools

Functions exposed to the LLM to take actions

list_prompts

Fetch prompts from LangSmith with optional filtering.

Args:
- is_public (str): Filter by prompt visibility: "true" for public prompts, "false" for private prompts (default: "false")
- limit (int): Maximum number of prompts to return (default: 20)

Returns: Dict[str, Any]: Dictionary containing the prompts and metadata

get_prompt_by_name

Get a specific prompt by its exact name.

Args:
- prompt_name (str): The exact name of the prompt to retrieve
- ctx: FastMCP context (automatically provided)

Returns: Dict[str, Any]: Dictionary containing the prompt details and template, or an error message if the prompt cannot be found

push_prompt

Call this tool when you need to understand how to create and push prompts to LangSmith.

fetch_runs

Fetch LangSmith runs (traces, tools, chains, etc.) from one or more projects using flexible filters, query language expressions, and trace-level constraints.


🧩 PURPOSE

This is a general-purpose LangSmith run fetcher designed for analytics, trace export, and automated exploration.

It wraps client.list_runs() with complete support for:

  • Multiple project names or IDs

  • The Filter Query Language (FQL) for precise queries

  • Hierarchical filtering across trace trees

  • Sorting and result limiting

It returns raw run data suitable for further analysis or export.


⚙️ PARAMETERS

project_name : str The project name to fetch runs from. For multiple projects, use JSON array string (e.g., '["project1", "project2"]').

trace_id : str, optional Return only runs that belong to a specific trace tree. It is a UUID string, e.g. "123e4567-e89b-12d3-a456-426614174000".

run_type : str, optional Filter runs by type (e.g. "llm", "chain", "tool", "retriever").

error : str, optional Filter by error status: "true" for errored runs, "false" for successful runs.

is_root : str, optional Filter root traces: "true" for only top-level traces, "false" to exclude roots. If not provided, returns all runs.

filter : str, optional A Filter Query Language (FQL) expression that filters runs by fields, metadata, tags, feedback, latency, or time.

─── Common field names ───
- `id`, `name`, `run_type`
- `start_time`, `end_time`
- `latency`
- `total_tokens`
- `error`
- `tags`
- `feedback_key`, `feedback_score`
- `metadata_key`, `metadata_value`
- `execution_order`

─── Supported comparators ───
- `eq`, `neq` → equal / not equal
- `gt`, `gte`, `lt`, `lte` → numeric or time comparisons
- `has` → tag or metadata contains value
- `search` → substring or full-text match
- `and`, `or`, `not` → logical operators

─── Examples ───
```python
'gt(latency, "5s")'        # took longer than 5 seconds
'neq(error, null)'         # errored runs
'has(tags, "beta")'        # runs tagged "beta"
'and(eq(name,"ChatOpenAI"), eq(run_type,"llm"))'  # named & typed runs
'search("image classification")'  # full-text search
```

trace_filter : str, optional Filter applied to the root run in each trace tree. Lets you select child runs based on root attributes or feedback.

Example:

```python
'and(eq(feedback_key,"user_score"), eq(feedback_score,1))'
```

→ return runs whose root trace has a user_score of 1.

tree_filter : str, optional Filter applied to any run in the trace tree (including siblings or children). Example:

```python
'eq(name,"ExpandQuery")'
```

→ return runs if any run in their trace had that name.

order_by : str, default "-start_time" Sort field; prefix with "-" for descending order.

limit : int, default 50 Maximum number of runs to return.

reference_example_id : str, optional Filter runs by reference example ID. Returns only runs associated with the specified dataset example ID.

format_type : str, default "pretty" Output format for extracted messages. Options:
- "pretty" (default): Human-readable formatted text focusing on human/AI/tool message exchanges
- "json": Pretty-printed JSON format
- "raw": Compact single-line JSON format

When format_type is set, the tool extracts messages from runs and formats them, making it ideal for conversational AI agents that care about message exchanges rather than full trace details. In that case the response returns only the formatted output:
- `formatted`: Formatted string representation of messages

When format_type is not set, the response returns:
- `runs`: Full run data

📤 RETURNS

Dict[str, Any] Dictionary containing:
- If format_type is set: {"formatted": str}, a formatted string representation of messages
- If format_type is not set: {"runs": List[Dict]}, a list of LangSmith run dictionaries


🧪 EXAMPLES

1️⃣ Get latest 10 root runs

```python
runs = fetch_runs("alpha-project", is_root="true", limit=10)
```

2️⃣ Get all tool runs that errored

```python
runs = fetch_runs("alpha-project", run_type="tool", error="true")
```

3️⃣ Get all runs that took >5s and have tag "experimental"

```python
runs = fetch_runs("alpha-project", filter='and(gt(latency,"5s"), has(tags,"experimental"))')
```

4️⃣ Get all runs in a specific conversation thread

```python
thread_id = "abc-123"
fql = f'and(in(metadata_key, ["session_id","conversation_id","thread_id"]), eq(metadata_value, "{thread_id}"))'
runs = fetch_runs("alpha-project", is_root="true", filter=fql)
```

5️⃣ List all runs called "extractor" whose root trace has feedback user_score=1

```python
runs = fetch_runs(
    "alpha-project",
    filter='eq(name,"extractor")',
    trace_filter='and(eq(feedback_key,"user_score"), eq(feedback_score,1))',
)
```

6️⃣ List all runs that started after a timestamp and either errored or got low feedback

```python
fql = 'and(gt(start_time,"2023-07-15T12:34:56Z"), or(neq(error,null), and(eq(feedback_key,"Correctness"), eq(feedback_score,0.0))))'
runs = fetch_runs("alpha-project", filter=fql)
```
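FQL strings like the ones in these examples can also be assembled with small helper functions. The helpers below are a sketch for composing expressions programmatically; they are not part of the LangSmith SDK.

```python
# Hypothetical helpers for building FQL strings; not part of any LangSmith API.
def fql_eq(field, value):
    return f'eq({field},"{value}")'

def fql_gt(field, value):
    return f'gt({field},"{value}")'

def fql_and(*clauses):
    return f"and({', '.join(clauses)})"

expr = fql_and(fql_eq("run_type", "llm"), fql_gt("latency", "5s"))
# expr is 'and(eq(run_type,"llm"), gt(latency,"5s"))'
```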

7️⃣ Get formatted messages for conversational AI (default: pretty format)

```python
# Returns formatted messages focusing on human/AI/tool exchanges
result = fetch_runs("alpha-project", limit=10, format_type="pretty")
# result["formatted"] contains human-readable formatted messages
```

8️⃣ Get messages in JSON format

```python
result = fetch_runs("alpha-project", limit=10, format_type="json")
# result["formatted"] contains a pretty-printed JSON string of the messages
```

🧠 NOTES FOR AGENTS

  • Use this to query LangSmith data sources dynamically.

  • Compose FQL strings programmatically based on your intent.

  • Combine filter, trace_filter, and tree_filter for hierarchical logic.

  • Always verify that project_name matches an existing LangSmith project.

  • Returned dict objects have fields like: id, name, run_type, inputs, outputs, error, start_time, end_time, latency, metadata, feedback, etc.

  • If the trace is big, save it to a file (if you have this ability) and analyze it locally.

  • For conversational AI agents: Use format_type="pretty" (default) to get human-readable message exchanges focusing on human/AI/tool messages rather than full trace details.
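For the "save a big trace to a file" note above, a minimal sketch: the file name and the stand-in run data are illustrative, not produced by a real fetch.

```python
import json

# Stand-in for the {"runs": [...]} payload returned by fetch_runs without format_type.
result = {"runs": [{"id": "run-1", "name": "ChatOpenAI", "run_type": "llm"}]}

# Persist the runs for local analysis.
with open("runs_dump.json", "w") as f:
    json.dump(result["runs"], f, indent=2)

# Load them back later.
with open("runs_dump.json") as f:
    runs = json.load(f)
```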

list_projects

List LangSmith projects with optional filtering and detail level control.

Fetches projects from LangSmith, optionally filtering by name and controlling the level of detail returned. Can return either simplified project information or full project details. If a dataset ID or name is provided, you don't need to provide a project name.


🧩 PURPOSE

This function provides a convenient way to list and explore LangSmith projects. It supports:

  • Filtering projects by name (partial match)

  • Limiting the number of results

  • Choosing between simplified or full project information

  • Automatically extracting deployment IDs from nested project data


⚙️ PARAMETERS

limit : int, default 5 Maximum number of projects to return (may also be passed as a string, e.g., "5"). This can be adjusted by agents or users based on their needs.

project_name : str, optional Filter projects by name using partial matching. If provided, only projects whose names contain this string will be returned. Example: project_name="Chat" will match "Chat-LangChain", "ChatBot", etc.

more_info : str, default "false" Controls the level of detail returned:
- "false" (default): Returns simplified project information with only essential fields: name, project_id, and agent_deployment_id (if available)
- "true": Returns full project details as returned by the LangSmith API

reference_dataset_id : str, optional The ID of the reference dataset to filter projects by. Either this OR reference_dataset_name must be provided (but not both).

reference_dataset_name : str, optional The name of the reference dataset to filter projects by. Either this OR reference_dataset_id must be provided (but not both).


📤 RETURNS

List[dict] A list of project dictionaries. The structure depends on more_info:

**When `more_info=False` (simplified):**

```python
[
    {
        "name": "Chat-LangChain",
        "project_id": "787d5165-f110-43ff-a3fb-66ea1a70c971",
        "agent_deployment_id": "deployment-123"  # Only if available
    },
    ...
]
```

**When `more_info=True` (full details):** Returns complete project objects with all fields from the LangSmith API, including metadata, settings, statistics, and nested structures.

🧪 EXAMPLES

1️⃣ List first 5 projects (simplified)

```python
projects = list_projects(limit="5")
```

2️⃣ Search for projects with "Chat" in the name

```python
projects = list_projects(project_name="Chat", limit="10")
```

3️⃣ Get full project details

```python
projects = list_projects(limit="3", more_info="true")
```

4️⃣ Find a specific project with full details

```python
projects = list_projects(project_name="MyProject", more_info="true", limit="1")
```

🧠 NOTES FOR AGENTS

  • Use more_info="false" for quick project discovery and listing

  • Use more_info="true" when you need detailed project information

  • The agent_deployment_id field is automatically extracted from nested project data when available, making it easy to identify agent deployments

  • Projects are filtered to exclude reference projects by default

  • The function uses name_contains for filtering, so partial matches work
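The name_contains behavior described above amounts to a substring match. A tiny sketch of the filtering semantics, using made-up project names:

```python
# Hypothetical project names, for illustrating partial matching only.
names = ["Chat-LangChain", "ChatBot", "Eval-Harness"]

# name_contains-style filtering: keep names containing the query substring.
matches = [n for n in names if "Chat" in n]
# matches == ["Chat-LangChain", "ChatBot"]
```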

list_experiments

List LangSmith experiment projects (reference projects) with mandatory dataset filtering.

Fetches experiment projects from LangSmith that are associated with a specific dataset. These are projects used for model evaluation and comparison. Requires either a dataset ID or dataset name to filter experiments.


🧩 PURPOSE

This function provides a convenient way to list and explore LangSmith experiment projects. It supports:

  • Filtering experiments by reference dataset (mandatory)

  • Filtering projects by name (partial match)

  • Limiting the number of results

  • Automatically extracting deployment IDs from nested project data

  • Returning simplified project information with key metrics (latency, cost, feedback stats)


⚙️ PARAMETERS

reference_dataset_id : str, optional The ID of the reference dataset to filter experiments by. Either this OR reference_dataset_name must be provided (but not both).

reference_dataset_name : str, optional The name of the reference dataset to filter experiments by. Either this OR reference_dataset_id must be provided (but not both).

limit : int, default 5 Maximum number of experiments to return. This can be adjusted by agents or users based on their needs.

project_name : str, optional Filter projects by name using partial matching. If provided, only projects whose names contain this string will be returned. Example: project_name="Chat" will match "Chat-LangChain", "ChatBot", etc.


📤 RETURNS

Dict[str, Any] A dictionary containing an "experiments" key with a list of simplified experiment project dictionaries:

```python
{
    "experiments": [
        {
            "name": "Experiment-Chat-LangChain",
            "experiment_id": "787d5165-f110-43ff-a3fb-66ea1a70c971",
            "feedback_stats": {...},          # Feedback statistics if available
            "latency_p50_seconds": 1.626,     # 50th percentile latency in seconds
            "latency_p99_seconds": 2.390,     # 99th percentile latency in seconds
            "total_cost": 0.00013005,         # Total cost in dollars
            "prompt_cost": 0.00002085,        # Prompt cost in dollars
            "completion_cost": 0.0001092,     # Completion cost in dollars
            "agent_deployment_id": "deployment-123"  # Only if available
        },
        ...
    ]
}
```
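As a sanity check on the cost fields in the sample payload, total_cost is the sum of prompt_cost and completion_cost:

```python
# Cost figures from the sample payload above.
prompt_cost = 0.00002085
completion_cost = 0.0001092

total_cost = prompt_cost + completion_cost
# Matches the total_cost of 0.00013005 shown above (up to floating-point rounding).
```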

🧪 EXAMPLES

1️⃣ List experiments for a dataset by ID

```python
experiments = list_experiments(reference_dataset_id="f5ca13c6-96ad-48ba-a432-ebb6bf94528f")
```

2️⃣ List experiments for a dataset by name

```python
experiments = list_experiments(reference_dataset_name="my-dataset", limit=10)
```

3️⃣ Find experiments with specific name pattern

```python
experiments = list_experiments(
    reference_dataset_id="f5ca13c6-96ad-48ba-a432-ebb6bf94528f",
    project_name="Chat",
    limit=1,
)
```

🧠 NOTES FOR AGENTS

  • Returns simplified experiment information with key metrics (latency, cost, feedback stats)

  • The agent_deployment_id field is automatically extracted from nested project data when available, making it easy to identify agent deployments

  • Experiments are filtered to include only reference projects (associated with datasets)

  • The function uses name_contains for filtering, so partial matches work

  • You must provide either reference_dataset_id OR reference_dataset_name, but not both

  • Experiment projects are used for model evaluation and comparison across different runs

list_datasets

Fetch LangSmith datasets.

Note: If no arguments are provided, all datasets will be returned.

Args:
- dataset_ids (Optional[str]): Dataset IDs to filter by as JSON array string (e.g., '["id1", "id2"]') or single ID
- data_type (Optional[str]): Filter by dataset data type (e.g., 'chat', 'kv')
- dataset_name (Optional[str]): Filter by exact dataset name
- dataset_name_contains (Optional[str]): Filter by substring in dataset name
- metadata (Optional[str]): Filter by metadata as JSON object string (e.g., '{"key": "value"}')
- limit (int): Max number of datasets to return (default: 20)
- ctx: FastMCP context (automatically provided)

Returns: Dict[str, Any]: Dictionary containing the datasets and metadata, or an error message if the datasets cannot be retrieved
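Several of the string parameters above expect JSON encoded as a string. A sketch of building them with the standard library; the IDs and metadata values are placeholders:

```python
import json

# Placeholder values for illustration only.
dataset_ids = json.dumps(["id1", "id2"])   # -> '["id1", "id2"]'
metadata = json.dumps({"team": "search"})  # -> '{"team": "search"}'

# These strings would then be passed as the dataset_ids / metadata arguments.
```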

list_examples

Fetch examples from a LangSmith dataset with advanced filtering options.

Note: Either dataset_id, dataset_name, or example_ids must be provided. If multiple are provided, they are used in order of precedence: example_ids, dataset_id, dataset_name.

Args:
- dataset_id (Optional[str]): Dataset ID to retrieve examples from
- dataset_name (Optional[str]): Dataset name to retrieve examples from
- example_ids (Optional[str]): Specific example IDs as JSON array string (e.g., '["id1", "id2"]') or single ID
- limit (int): Maximum number of examples to return (default: 10)
- offset (int): Number of examples to skip (default: 0)
- filter (Optional[str]): Filter string using LangSmith query syntax (e.g., 'has(metadata, {"key": "value"})')
- metadata (Optional[str]): Metadata to filter by as JSON object string (e.g., '{"key": "value"}')
- splits (Optional[str]): Dataset splits as JSON array string (e.g., '["train", "test"]') or single split
- inline_s3_urls (Optional[str]): Whether to inline S3 URLs: "true" or "false" (default: SDK default if not specified)
- include_attachments (Optional[str]): Whether to include attachments: "true" or "false" (default: SDK default if not specified)
- as_of (Optional[str]): Dataset version tag OR ISO timestamp to retrieve examples as of that version/time
- ctx: FastMCP context (automatically provided)

Returns: Dict[str, Any]: Dictionary containing the examples and metadata, or an error message if the examples cannot be retrieved
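The limit/offset pair above supports simple paging. The arithmetic can be sketched as follows; the total of 23 examples is hypothetical:

```python
# Page through 23 hypothetical examples, 10 at a time.
total, limit = 23, 10
offsets = list(range(0, total, limit))
# offsets == [0, 10, 20]; each offset would go to a separate list_examples call
```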

read_dataset

Read a specific dataset from LangSmith.

Note: Either dataset_id or dataset_name must be provided to identify the dataset. If both are provided, dataset_id takes precedence.

Args: dataset_id (Optional[str]): Dataset ID to retrieve dataset_name (Optional[str]): Dataset name to retrieve ctx: FastMCP context (automatically provided)

Returns: Dict[str, Any]: Dictionary containing the dataset details, or an error message if the dataset cannot be retrieved

Example in case you need to create a separate Python script to read a dataset:

```python
from langsmith import Client

client = Client()
dataset = client.read_dataset(dataset_name="My Dataset")
# Or by ID:
# dataset = client.read_dataset(dataset_id="dataset-id-here")
```
read_example

Read a specific example from LangSmith.

Args: example_id (str): Example ID to retrieve as_of (Optional[str]): Dataset version tag OR ISO timestamp to retrieve the example as of that version/time ctx: FastMCP context (automatically provided)

Returns: Dict[str, Any]: Dictionary containing the example details, or an error message if the example cannot be retrieved

Example in case you need to create a separate Python script to read an example:

```python
from langsmith import Client

client = Client()
example = client.read_example(example_id="example-id-here")
# Or with version:
# example = client.read_example(example_id="example-id-here", as_of="v1.0")
```
create_dataset

Call this tool when you need to understand how to create datasets in LangSmith.

update_examples

Call this tool when you need to understand how to update dataset examples in LangSmith.

run_experiment

Call this tool when you need to understand how to run experiments and evaluations in LangSmith.

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client


No resources
