Prompt Lab MCP Server

Server Configuration

Environment variables used to configure the server. Those marked as required must be set for the server to run.

Name	Required	Description
`PROMPT_LAB_UI_URL`	No	URL of your Prompt Lab UI deployment
`UPSTASH_REDIS_REST_URL`	Yes	Upstash Redis URL for persistence
`UPSTASH_REDIS_REST_TOKEN`	Yes	Upstash Redis token
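
As an illustration of the table above, here is a minimal sketch of a startup check for these variables. The `ServerConfig` shape and the `readServerConfig` name are hypothetical and not part of the server; only the three variable names and their required/optional status come from the table.

```typescript
// Hypothetical config shape; field names are illustrative assumptions.
interface ServerConfig {
  redisUrl: string;
  redisToken: string;
  uiUrl: string | undefined; // PROMPT_LAB_UI_URL is optional per the table
}

type Env = Record<string, string | undefined>;

// Fail fast when a variable marked "Yes" under Required is missing or empty.
function readServerConfig(env: Env): ServerConfig {
  const required = ["UPSTASH_REDIS_REST_URL", "UPSTASH_REDIS_REST_TOKEN"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing required environment variables: " + missing.join(", "));
  }
  return {
    redisUrl: env["UPSTASH_REDIS_REST_URL"] as string,
    redisToken: env["UPSTASH_REDIS_REST_TOKEN"] as string,
    uiUrl: env["PROMPT_LAB_UI_URL"],
  };
}
```

In a Node.js process this would typically be called as `readServerConfig(process.env)`.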

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }

Tools

Functions exposed to the LLM to take actions

Name	Description
start_web_app	Open the Prompt Lab web UI and return the URL for the browser. If no workspaceId is given, a new empty workspace is created. If workspaceId is given, the call connects to that workspace (it must already exist). Always pass your environment API keys, since they enable the UI Send button. Check each environment variable and pass it if set: anthropicApiKey: process.env.ANTHROPIC_API_KEY; geminiApiKey: process.env.GEMINI_API_KEY; openaiApiKey: process.env.OPENAI_API_KEY. After returning the URL, call list_models to register available models, then set_system_prompt and add_test_cases before running optimization.
list_models	Register available AI models and API keys for this workspace. Call once after start_web_app. Scan your environment for API keys and Ollama: process.env.ANTHROPIC_API_KEY → add claude-haiku-4-5-20251001, claude-sonnet-4-6, claude-opus-4-8; process.env.GEMINI_API_KEY → add gemini-2.5-flash-lite, gemini-2.5-flash, gemini-2.5-pro; process.env.OPENAI_API_KEY → add gpt-4o-mini, gpt-4o; Ollama: fetch (process.env.OLLAMA_URL ?? "http://localhost:11434") + "/api/tags" → add each model.name, catching errors silently. Default model priority (first available wins): gemini-2.5-flash-lite → claude-haiku-4-5-20251001 → gpt-4o-mini.
register_api_key	Register a provider API key for this workspace. Use this when you need to register a key that was not passed to start_web_app. Specify the provider explicitly: anthropic | google | openai.
save_template	Save a named test suite template so it appears in the UI "Load test suite…" dropdown. Call at session startup for every .json file in prompt-lab/templates/: save_template(name=<file.name>, testCases=<file.testCases>). Template format (matches what the UI exports as a downloadable JSON): { "name": "suite-name", "savedAt": "...", "testCases": [{ "label"?, "query", "targetAnswer"?, "passThreshold"?, "queryType"? }] }. Templates persist in Redis. Saving with the same name replaces the previous version.
save_system_prompt_template	Save a named system prompt template so it appears in the UI "Load template…" dropdown. Call at session startup for every .txt file in prompt-lab/system-prompts/: save_system_prompt_template(name=, content=). Also call after a successful optimization loop to preserve the best prompt found. Templates persist in Redis. Saving with the same name replaces the previous version.
set_system_prompt	Set or update the system prompt for this workspace. Does NOT increment the iteration counter; use this for initial setup or manual overrides. To record an optimization step, use apply_suggestion. Load the current prompt from current.json or ask the user before overwriting.
add_test_cases	Add test cases to this workspace. Set replace: true to clear the existing suite and load a fresh one. Set replace: false (default) to append to the existing suite. Each test case needs at least a query. targetAnswer is required for scoring. Omit targetAnswer only for exploratory runs where you score manually.
start_optimization_session	Run one optimization pass on an existing workspace. Prerequisites (do these first): start_web_app → workspace URL + ID; set_system_prompt → starting prompt; add_test_cases → at least one case with targetAnswer. What this does: (1) Read the system prompt and test cases from get_workspace_state. (2) Run each test case against the model (write and execute a temporary Node.js script). (3) Score each response against targetAnswer (LLM-as-judge, 0–100) and call post_test_result. (4) Analyse failures, write an improved prompt, and call post_prompt_suggestion. (5) Present the suggestion; do NOT auto-apply. The user reviews it in the UI. This is one iteration. After the user approves or rejects the suggestion, call start_optimization_session again or switch to loop_optimization.
loop_optimization	Run the full optimization loop until the threshold is met or max iterations are reached. Like start_optimization_session, but auto-applies each suggestion and repeats. Prerequisites: same as start_optimization_session. Loop: (1) Run all test cases, score responses, and call post_test_result for each. (2) Call get_regression_status. (3) If ALL scores >= threshold AND iteration >= 1 → SUCCESS. (4) If iteration >= maxIterations → EXHAUSTED. Report the best result. (5) Analyse failures and write an improved prompt (targeted: fix the failing pattern, keep what works). (6) Call post_prompt_suggestion, then apply_suggestion (auto-authorised in loop mode). (7) Go to 1. Do NOT stop after the first pass just because it is passing: the first pass is a baseline. Always run at least one improvement cycle. After the loop: call pull_ui_history, save optimization results locally, and call save_system_prompt_template with the best prompt found.
run_regression_testsuite	Run all test cases against the current system prompt. Single pass; does not auto-improve. Use this to verify that an already-good prompt still passes all test cases. For automatic improvement loops, use loop_regression. Steps to follow after this call: (1) Run each test case against the model, score the response, and call post_test_result. (2) Call get_regression_status to see the pass/fail summary. (3) Optionally, call post_prompt_suggestion with an improvement (the user reviews it).
loop_regression	Run the full regression loop: test all cases → score → improve → repeat. Stops when BOTH conditions are met: overall pass rate >= threshold, and every individual test case score >= threshold; or when max iterations are exhausted. Loop: (1) Run all test cases, score responses, and call post_test_result for each. (2) Call get_regression_status. (3) If pass rate >= threshold AND all individual scores >= threshold → SUCCESS. (4) If iteration >= maxIterations → EXHAUSTED. Report the best result. (5) Analyse failures, write an improved prompt, and call post_prompt_suggestion + apply_suggestion. (6) Go to 1. After the loop: call pull_ui_history and save results locally.
get_workspace_state	Read the full current state of a workspace. Returns: system prompt, test cases, test results, suggestions, iteration counter, optimization goal, available models, selected model, and active query/target. Call at the start of each session to recover state after a context break. Also call before running tests to get the latest test case IDs.
post_test_result	Store the scored result of one test case run. Call after you run a test case against the model and evaluate the response. This makes the result visible in the UI and is used by get_regression_status. Score 0–100 using this scale: 90–100: correct, complete, well-structured (exceeds target); 70–89: correct and complete, with minor gaps or style issues; 50–69: partially correct, with key points present but important details missing; 30–49: mostly wrong, with one or two relevant points but fundamentally off; 0–29: completely wrong, off-topic, or refused.
post_prompt_suggestion	Queue a revised system prompt for the user to review. Always explain in reasoning: which test cases were failing and why; what specific change you made to the prompt; why you expect this change to fix those cases. In gated mode (start_optimization_session): the user reviews in the UI, then approves or rejects. In loop mode (loop_optimization, loop_regression): call apply_suggestion immediately after.
apply_suggestion	Apply a pending suggestion: sets it as the active system prompt and increments the iteration counter. Only call in fully automated loop mode (loop_optimization, loop_regression). In gated mode, wait for the user to approve via the UI.
get_regression_status	Pass/fail summary across all test cases for the current system prompt. Call after running all test cases to decide whether the prompt is good enough or needs further improvement. A test case passes if its most recent score >= threshold (default 70).
set_test_model	Switch the model used for test cases in this workspace. Updates the UI model selector.
pull_ui_history	Fetch all history entries the UI has pushed to this workspace. The UI auto-pushes after every session summary ("Summarize & new") and every regression run. This gives you a record of what the user did in the UI between agent calls. ALWAYS save the response to a local file: prompt-lab/workspaces//_ui_history.json
delete_session	Delete a workspace and all its state (test cases, results, suggestions, API keys). Irreversible.
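
The model discovery described for list_models can be sketched as a pure function. This is an illustrative sketch only: the `collectModels` name and its return shape are hypothetical, the Ollama model names are assumed to have been fetched already from the "/api/tags" endpoint, and returning no default when none of the three priority models is available is an assumption, since the description does not cover that case.

```typescript
type Env = Record<string, string | undefined>;

// Env var → model IDs, as listed in the list_models description.
const MODELS_BY_ENV_VAR: ReadonlyArray<readonly [string, readonly string[]]> = [
  ["ANTHROPIC_API_KEY", ["claude-haiku-4-5-20251001", "claude-sonnet-4-6", "claude-opus-4-8"]],
  ["GEMINI_API_KEY", ["gemini-2.5-flash-lite", "gemini-2.5-flash", "gemini-2.5-pro"]],
  ["OPENAI_API_KEY", ["gpt-4o-mini", "gpt-4o"]],
];

// Default model priority: first available wins.
const DEFAULT_PRIORITY: readonly string[] = [
  "gemini-2.5-flash-lite",
  "claude-haiku-4-5-20251001",
  "gpt-4o-mini",
];

// Hypothetical helper: builds the model list and picks the default.
function collectModels(
  env: Env,
  ollamaModels: readonly string[] = [],
): { models: string[]; defaultModel: string | undefined } {
  const models: string[] = [];
  for (const [envVar, ids] of MODELS_BY_ENV_VAR) {
    if (env[envVar]) models.push(...ids); // unset or empty keys contribute nothing
  }
  models.push(...ollamaModels);

  let defaultModel: string | undefined;
  for (const id of DEFAULT_PRIORITY) {
    if (models.indexOf(id) !== -1) {
      defaultModel = id;
      break;
    }
  }
  return { models, defaultModel };
}
```

With only an OpenAI key set, the default falls through to gpt-4o-mini because neither higher-priority model is available.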

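The stop conditions described for loop_optimization can be sketched as a small decision function. This is a sketch under stated assumptions: `decideLoopStatus` and the `"CONTINUE"` value are hypothetical names, and the default threshold of 70 is borrowed from the get_regression_status description rather than stated for the loop itself.

```typescript
type LoopStatus = "SUCCESS" | "EXHAUSTED" | "CONTINUE";

// Hypothetical helper mirroring the loop_optimization checks:
// SUCCESS needs every score at or above the threshold AND at least one
// completed improvement iteration; otherwise stop when iterations run out.
function decideLoopStatus(
  scores: readonly number[],
  iteration: number,
  maxIterations: number,
  threshold: number = 70, // assumed default, taken from get_regression_status
): LoopStatus {
  const allPass = scores.length > 0 && scores.every((s) => s >= threshold);
  if (allPass && iteration >= 1) return "SUCCESS";
  if (iteration >= maxIterations) return "EXHAUSTED";
  return "CONTINUE"; // analyse failures, suggest, apply, and run again
}
```

A passing baseline at iteration 0 yields `"CONTINUE"`, which matches the instruction to always run at least one improvement cycle.
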
Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jurek-f/prompt-lab-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.