How do I use sampling-mcp?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@sampling-mcp summarize the article on quantum computing" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

sampling-mcp

by srod0010

Overview Schema Related Servers Score Discussions

Python

Local

Sampling in MCP — Demo

A minimal FastMCP example that demonstrates sampling: the mechanism by which an MCP server asks the client to run an LLM completion on its behalf, instead of calling an LLM itself.

The server exposes a summarize_document tool. When called, the tool doesn't talk to any LLM directly — it requests a completion from the client, which runs the model (GPT-4o via LiteLLM) and returns the text.

Sampling

The request direction is inverted from a normal tool call:

The server holds no API keys and no model SDK. It just declares what it wants generated.
The client owns the credentials, the model choice, and the LLM SDK. It decides how the generation actually happens (and can apply its own policy, fallbacks, cost controls, etc.).

This keeps secrets on the client side and lets a single server work with whatever model the client is willing to provide.

Related MCP server: Basic MCP Server

Flow

sequenceDiagram
    autonumber
    participant Main as client.py (main)
    participant Client as FastMCP Client
    participant Handler as sampling_handler
    participant LLM as GPT-4o (LiteLLM → OpenAI)
    participant Server as server.py (subprocess)

    Note over Main,Server: stdio transport — Client spawns server.py as a child process
    Main->>Client: async with client (start + handshake)
    Client->>Server: launch server.py, open stdio pipes
    Main->>Client: call_tool("summarize_document", {document_text})
    Client->>Server: tools/call request
    Server->>Server: summarize_document() runs
    Server-->>Client: ctx.sample(messages, system_prompt,<br/>temperature, max_tokens, model_preferences)
    Note right of Server: Server requests generation —<br/>it does NOT call the LLM itself
    Client->>Handler: invoke sampling_handler(messages, params, ctx)
    Handler->>Handler: build chat messages,<br/>read OPENAI_API_KEY from .env
    Handler->>LLM: acompletion(model, messages, temperature, max_tokens)
    LLM-->>Handler: generated summary text
    Handler-->>Server: return text (sampling result)
    Server->>Server: format "Summary:\n..."
    Server-->>Client: tool result
    Client-->>Main: CallToolResult
    Main->>Client: exit async with → connection closed

Components

flowchart LR
    subgraph ClientProc["Client process (holds the secrets)"]
        Main["main()<br/>reads sample.txt,<br/>calls the tool"]
        Client["FastMCP Client<br/>stdio transport"]
        Handler["sampling_handler<br/>OPENAI_API_KEY + LiteLLM"]
    end
    subgraph ServerProc["Server subprocess (no keys, no LLM SDK)"]
        Tool["summarize_document tool<br/>ctx.sample(...)"]
    end
    LLM[("OpenAI GPT-4o")]

    Main --> Client
    Client -- "tools/call (stdio)" --> Tool
    Tool -- "ctx.sample request (stdio)" --> Handler
    Handler -- "HTTPS" --> LLM
    LLM -- "completion" --> Handler
    Handler -- "result" --> Tool

Important parts of the code

Where	What to notice
server.py:10-16	`ctx.sample(...)` is the whole point — the server requests a completion (passing `system_prompt`, `temperature`, `max_tokens`, `model_preferences`) rather than calling an LLM.
client.py:13-45	`sampling_handler` is where the client fulfills that request: it builds chat messages, reads `OPENAI_API_KEY`, and calls the real model via LiteLLM. This is the client-side LLM policy.
client.py:47	`Client("server.py", sampling_handler=...)` — passing a `.py` path selects the stdio transport, so the client spawns the server as a subprocess. No separate terminal needed.
client.py:53-59	`async with client:` manages the full lifecycle — start, handshake, and graceful shutdown. `is_connected()` is `False` afterwards by design.
client.py:29	`modelPreferences.hints[0].name` — this implementation just takes the first hint. A real handler would validate/fallback across hints.

Setup

This project uses uv.

# 1. Install dependencies (creates .venv)
uv sync

# 2. Add your key — copy the template and fill it in
cp .env.example .env
#   then set OPENAI_API_KEY="sk-..." in .env

# 3. Run — the client launches the server automatically
uv run client.py

open-me.ipynb contains a guided, step-by-step walkthrough of the same setup.

Expected output

[..] INFO  Starting MCP server 'Document Assistant' with transport 'stdio'
gpt-4o          # \
0.7             #  } printed by sampling_handler (model / temperature / max_tokens)
300             # /
CallToolResult(content=[TextContent(... text="Summary:\n...")], is_error=False)
Connected?: False   # connection closed when the `async with` block exits — expected

Project layout

server.py        # FastMCP server — exposes summarize_document, uses ctx.sample()
client.py        # FastMCP client — runs the sampling_handler (the actual LLM call)
sample.txt       # input document fed to the tool
open-me.ipynb    # guided walkthrough notebook
.env.example     # template for OPENAI_API_KEY (copy to .env)
pyproject.toml   # uv project + dependencies (fastmcp, litellm, python-dotenv)

This server cannot be installed

license - not found

quality - not tested

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

–Release cycle

–Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Related MCP Servers

MCP Server Sample
Developer Tools Code Execution Autonomous Agents
antonscap
A
license
B
quality
D
maintenance
An educational implementation of a Model Context Protocol server that demonstrates how to build a functional MCP server integrating with various LLM clients.
Last updated 2025-05-16
2
MIT
Basic MCP Server
Developer Tools
gurdasnijor
A
license
-
quality
D
maintenance
A minimal demonstration server showcasing MCP protocol capabilities including tools, resources, and prompts with basic examples like hello world functionality.
Last updated 2025-10-28
18
MIT
MCP Sampling Demo
Developer Tools Code Execution Autonomous Agents
0GiS0
F
license
-
quality
D
maintenance
Demonstrates how to implement sampling in MCP servers, allowing tools to request LLM content generation from the client without requiring external API integrations or credentials.
Last updated 2025-10-27
1
MCP 101 Example
AI & Machine Learning Developer Tools
behavioral-ds
F
license
-
quality
D
maintenance
A foundational implementation of a Model Context Protocol (MCP) server designed for educational purposes. It demonstrates the complete interaction between an LLM, an inference engine, and a client during an agentic call.
Last updated 2026-03-26

View all related MCP servers

Related MCP Connectors

TokenOracle
Hosted MCP server for LLM cost estimation, model comparison, and budget-aware routing.
mcp-aichat
MCP server for AI dialogue using various LLM models via AceDataCloud
mcp
MCP server providing access to the Scorecard API to evaluate and optimize LLM systems.

View all MCP Connectors

Latest Blog Posts

Who's Calling? MCP Hosts Are an Identity Blind Spot (And the Spec Knows It)
By Om-Shree-0709 on July 25, 2026.
mcp
Agent Identity
OAuth 2.1
Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/srod0010/sampling-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server