Prompt Lab MCP Server
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| PROMPT_LAB_UI_URL | No | URL of your Prompt Lab UI deployment | |
| UPSTASH_REDIS_REST_URL | Yes | Upstash Redis URL for persistence | |
| UPSTASH_REDIS_REST_TOKEN | Yes | Upstash Redis token | |
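As a rough illustration of the table above, a launcher script could check the required variables before starting the server. This is only a sketch: the variable names come from the table, while the helper name `missing_required` is hypothetical and not part of the server.

```python
import os

# Variable names taken from the configuration table above.
REQUIRED_VARS = ("UPSTASH_REDIS_REST_URL", "UPSTASH_REDIS_REST_TOKEN")
OPTIONAL_VARS = ("PROMPT_LAB_UI_URL",)


def missing_required(env):
    """Return the required variable names that are unset or empty in env.

    Hypothetical helper for illustration; env is any mapping, e.g. os.environ.
    """
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_required(os.environ)` before launch and stopping when the returned list is non-empty is one way to catch a missing Redis credential early.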
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| start_web_app | Open the prompt lab web UI. Returns the URL for the browser. If no workspaceId is given, creates a new empty workspace. If workspaceId is given, connects to that workspace (must exist). Always pass your environment API keys, because they enable the UI Send button. Check each env var and pass it if set: anthropicApiKey: process.env.ANTHROPIC_API_KEY; geminiApiKey: process.env.GEMINI_API_KEY; openaiApiKey: process.env.OPENAI_API_KEY. After returning the URL, call list_models to register available models. Then call set_system_prompt and add_test_cases before running optimization. |
| list_models | Register available AI models and API keys for this workspace. Call once after start_web_app. Scan your environment for API keys and Ollama. Default model priority (first available wins): gemini-2.5-flash-lite → claude-haiku-4-5-20251001 → gpt-4o-mini |
| register_api_key | Register a provider API key for this workspace. Use this when you need to register a key that was not passed to start_web_app. Specify provider explicitly: anthropic, google, or openai. |
| save_template | Save a named test suite template so it appears in the UI "Load test suite…" dropdown. Call at session startup for every .json file in prompt-lab/templates/: save_template(name=<file.name>, testCases=<file.testCases>) Template format (matches what the UI exports as a downloadable JSON): { "name": "suite-name", "savedAt": "...", "testCases": [{ "label"?, "query", "targetAnswer"?, "passThreshold"?, "queryType"? }] } Templates persist in Redis. Saving with the same name replaces the previous version. |
| save_system_prompt_template | Save a named system prompt template so it appears in the UI "Load template…" dropdown. Call at session startup for every .txt file in prompt-lab/system-prompts/: save_system_prompt_template(name=, content=) Also call after a successful optimization loop to preserve the best prompt found. Templates persist in Redis. Saving with the same name replaces the previous version. |
| set_system_prompt | Set or update the system prompt for this workspace. Does NOT increment the iteration counter; use this for initial setup or manual overrides. To record an optimization step, use apply_suggestion. Load the current prompt from current.json or ask the user before overwriting. |
| add_test_cases | Add test cases to this workspace. Set replace: true to clear the existing suite and load a fresh one. Set replace: false (default) to append to the existing suite. Each test case needs at least a query. targetAnswer is required for scoring. Omit targetAnswer only for exploratory runs where you score manually. |
| start_optimization_session | Run one optimization pass on an existing workspace. Prerequisites (do these first): What this does: This is one iteration. After the user approves or rejects the suggestion, call start_optimization_session again or switch to loop_optimization. |
| loop_optimization | Run the full optimization loop until the threshold is met or the maximum number of iterations is reached. Like start_optimization_session, but auto-applies each suggestion and repeats. Prerequisites: same as start_optimization_session. Loop: Do NOT stop after the first pass just because it passes: the first pass is only a baseline. Always run at least one improvement cycle. After the loop: call pull_ui_history, save optimization results locally, and call save_system_prompt_template with the best prompt found. |
| run_regression_testsuite | Run all test cases against the current system prompt. Single pass; does not auto-improve. Use this to verify that an already-good prompt still passes all test cases. For automatic improvement loops, use loop_regression. Steps to follow after this call: |
| loop_regression | Run the full regression loop: test all cases → score → improve → repeat. Stops when BOTH conditions are met: Loop: After the loop: call pull_ui_history and save results locally. |
| get_workspace_state | Read the full current state of a workspace. Returns: system prompt, test cases, test results, suggestions, iteration counter, optimization goal, available models, selected model, and active query/target. Call at the start of each session to recover state after a context break. Also call before running tests to get the latest test case IDs. |
| post_test_result | Store the scored result of one test case run. Call after you run a test case against the model and evaluate the response. This makes the result visible in the UI and is used by get_regression_status. Score 0–100 using this scale: 90–100: correct, complete, well-structured (exceeds target); 70–89: correct and complete, with minor gaps or style issues; 50–69: partially correct, with key points present but important details missing; 30–49: mostly wrong, with one or two relevant points but fundamentally off; 0–29: completely wrong, off-topic, or refused. |
| post_prompt_suggestion | Queue a revised system prompt for the user to review. Always explain in reasoning: In gated mode (start_optimization_session): user reviews in UI, then approves or rejects. In loop mode (loop_optimization, loop_regression): call apply_suggestion immediately after. |
| apply_suggestion | Apply a pending suggestion: sets it as the active system prompt and increments the iteration counter. Only call in fully automated loop mode (loop_optimization, loop_regression). In gated mode, wait for the user to approve via the UI. |
| get_regression_status | Pass/fail summary across all test cases for the current system prompt. Call after running all test cases to decide: is the prompt good enough, or improve further? A test case passes if its most recent score >= threshold (default 70). |
| set_test_model | Switch the model used for test cases in this workspace. Updates the UI model selector. |
| pull_ui_history | Fetch all history entries the UI has pushed to this workspace. The UI auto-pushes after every session summary ("Summarize & new") and every regression run. This gives you a record of what the user did in the UI between agent calls. ALWAYS save the response to a local file: prompt-lab/workspaces//_ui_history.json |
| delete_session | Delete a workspace and all its state (test cases, results, suggestions, API keys). Irreversible. |
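The rules stated in the tool descriptions above can be sketched on the client side as follows. This is an illustration under assumptions, not the server's implementation: the argument names (`workspaceId`, `anthropicApiKey`, and so on), the score bands, and the default threshold of 70 come from the table, while the helper names and the shape of the summary dictionary are hypothetical. Per-test-case `passThreshold` values are not modelled here.

```python
# Argument name -> environment variable, as listed in the start_web_app description.
KEY_ENV_VARS = {
    "anthropicApiKey": "ANTHROPIC_API_KEY",
    "geminiApiKey": "GEMINI_API_KEY",
    "openaiApiKey": "OPENAI_API_KEY",
}

# Score bands from the post_test_result description: (lower bound, short label).
SCORE_BANDS = [
    (90, "exceeds target"),
    (70, "correct and complete"),
    (50, "partially correct"),
    (30, "mostly wrong"),
    (0, "completely wrong, off-topic, or refused"),
]


def start_web_app_args(env, workspace_id=None):
    """Build start_web_app arguments, passing only the API keys that are set."""
    args = {arg: env[var] for arg, var in KEY_ENV_VARS.items() if env.get(var)}
    if workspace_id is not None:
        # Omitting workspaceId asks the server to create a new empty workspace.
        args["workspaceId"] = workspace_id
    return args


def score_band(score):
    """Map a 0-100 score to the label of its band in the scoring scale."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower, label in SCORE_BANDS:
        if score >= lower:
            return label


def regression_summary(latest_scores, threshold=70):
    """Hypothetical local mirror of the pass rule: most recent score >= threshold.

    latest_scores maps a test case ID to its most recent score.
    """
    failed = sorted(case for case, score in latest_scores.items() if score < threshold)
    return {
        "passed": len(latest_scores) - len(failed),
        "failed": failed,
        "all_passed": not failed,
    }
```

For example, `start_web_app_args(os.environ)` (after `import os`) yields only the keys present in the environment, and `regression_summary` can help an agent decide locally whether another improvement cycle is worth running before it calls get_regression_status.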
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/jurek-f/prompt-lab-mcp'
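The same request can be made from Python's standard library. The sketch below assumes the endpoint returns a JSON body, which the text above does not state; the function names `server_url` and `fetch_server` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner, name, timeout=10):
    """GET the server record and decode it, assuming a JSON response body."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

`fetch_server("jurek-f", "prompt-lab-mcp")` issues the same GET request as the curl command.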