Skip to main content
Glama
133,413 tools. Last updated 2026-05-25 14:19

"Executing code in a sandbox environment" matching MCP tools:

  • Replay the sandbox test for one or more suites against captured mocks — re-runs the suite's steps against the dev's locally-running app while keploy serves outbound calls (DB, downstream HTTP, etc.) from the captured mocks. Use this when the dev says "replay", "run my sandbox tests", "integration-test", "check if mocks still match" — keywords "sandbox" / "replay" / "mocks" / "integration-test" all map here. Also the REPLAY STEP of FROM-SCRATCH: call this LAST (after create_test_suite + record_sandbox_test) to give the dev the whole-app regression picture against the freshly captured mocks. Output produces a SANDBOX RUN REPORT — it answers "does the suite still hold up against its captured baseline?". ═══════════════════════════════════════════════════════════════════ DISAMBIGUATION — pick this tool vs. replay_test_suite: ═══════════════════════════════════════════════════════════════════ USE replay_sandbox_test (THIS TOOL) when the dev says: * "run my sandbox tests" / "replay my sandbox tests" * "integration-test my app" / "run the integration tests" * "check if my mocks still match" / "replay against the captured mocks" * "rerun my sandbox suite" (with the word "sandbox") Trigger keyword: an explicit "sandbox" / "replay" / "mocks" / "integration-test" — silent signal that the dev wants captured-mock replay, NOT live-app execution. USE replay_test_suite INSTEAD when the dev says: * "run the test suite" / "run my test suites" (bare — no "sandbox") * "execute test suite X" / "run suite 810d3ebe…" * "test the suite again" / "smoke test against the live app" Bare verbs ("run / test / execute") applied to "the suite" without the word "sandbox" mean LIVE-APP execution, NOT captured-mock replay. replay_test_suite hits the dev's running localhost app directly via HTTP — no docker spin-up, no mocks. After a record_sandbox_test run, the natural next step is THIS tool (replay against the just-captured mocks). After create_test_suite / update_test_suite, the natural next step is replay_test_suite (validate against the live app). When the dev's verb is bare and the prior turn doesn't make the intent obvious, ASK rather than picking sandbox-replay silently — code-change regressions can hide under "mock didn't match" failures. ═══════════════════════════════════════════════════════════════════ DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id: ═══════════════════════════════════════════════════════════════════ Suites live on a (app_id, branch_id) tuple. A bare suite_id has NO on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash `git rev-parse --abbrev-ref HEAD` in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name. 2. Resolve candidate apps via the cwd basename: Bash `basename $(pwd)` → call listApps with q=<basename>. Usually 1–2 candidates. If 0 → ASK; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose `name` matches the git branch from step 1. That gives you {branch_id}. If no match → not this app, try next. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app/branch, try next. 5. If steps 2–4 exhaust, walk every OPEN branch on each candidate app via list_branches → getTestSuite. Then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. After resolving once in a session, REUSE the {app_id, branch_id} for subsequent suite-targeted calls; don't re-walk discovery for every action. SCOPE — whole-app vs single-suite: * Default: LEAVE suite_ids UNSET → the tool resolves "every suite for the app that has a sandbox test (test_set_id populated)" and replays them all. Use this for "run my sandbox tests" / "check if my tests still pass" — whole-app regression. New suites auto-pick up. * Single / subset: PASS suite_ids when the dev names specific suites — "replay sandbox test for suite 810d3ebe-…", "replay only the auth suite", "run suite X and Y". The tool validates each requested id is actually a suite with a sandbox test (has test_set_id); an unlinked id gets a precise "record first" error instead of an opaque downstream CLI failure. This tool resolves the app, picks the suite set per the rule above, and returns a single playbook that drives the replay for them. It does NOT record. WHAT THIS TOOL DOES INTERNALLY (so you don't have to): 1. Resolves app_id — use the explicit app_id if the caller has one; otherwise pass app_name_hint (usually the cwd basename) and the server does listApps with a substring match. Multiple matches → error listing them; zero matches → error suggesting the dev generate a suite first. 2. Lists test suites for the app, keeps only those with a non-empty test_set_id. Zero linked → typed "no linked sandbox tests" error. 3. If suite_ids was passed, validates every requested id is in the linked-suites set; unlinked ids → typed error pointing to record_sandbox_test. 4. Returns the headless playbook — walk it exactly: spawn CLI in background, tail the progress file (PID-alive guard built in), read the terminal event, fetch the report. No separate cleanup step — the CLI exits on its own. ===== PREREQUISITES ===== (Same as record_sandbox_test — if you just recorded, you already have them. Same docker-compose network rule applies: use the same compose file + service, stop the app service before calling, leave deps running.) - app_command: shell command that starts the dev's app (e.g. "docker compose up producer"). - app_url: base URL the app listens on, e.g. http://localhost:8080. - app_dir: absolute path to repo root. - container_name if app_command is docker-compose. - keploy binary on PATH. If `which keploy` returns nothing, install it before calling this tool with: `curl --silent -O -L https://keploy.io/install.sh && source install.sh`. ===== AFTER CALLING — walk the playbook ===== Same headless playbook shape as record_sandbox_test: spawn `keploy test sandbox --cloud-app-id …` in the background via Bash, poll `tail -n 1 $PROGRESS_FILE` repeatedly (no sleep loops; the wait_for_done step has a built-in `kill -0 $KEPLOY_PID` guard so the loop exits if the CLI dies silently), read the terminal NDJSON event (phase=done, data.ok, data.test_run_id), and — if ok=true — call get_session_report(app_id, test_run_id) with verbose=true at the end. No separate cleanup step needed; the CLI exits cleanly once phase=done is written. ===== MANDATORY OUTPUT — Phase 3 section ===== Your final message to the dev MUST contain a section with this exact heading (do NOT merge with Phase 2; do NOT compress the failed-steps table even when failures are homogeneous): ### Phase 3 — Sandbox run report Under it, emit the uniform three-subsection format owned by get_session_report: (i) per-suite table — one row per suite in per_suite, passing suites included, columns = Suite name | passed/total steps. (ii) failed-steps table — ONE ROW per entry in failed_steps[], columns = Suite | Step name | Method + URL | Expected → Actual status | mock_mismatch y/n. Never collapse rows. (iii) Diagnosis + Recommendation (see get_session_report description for case-specific rules around mock_mismatch_dominant, repo-diff inspection, and the SKIP / FIX-CODE / FIX-TEST branching for fix-it follow-ups). Do NOT print aggregate step totals across suites — they mix unrelated suites and hide where damage actually is. ===== ROLLUP LINE ===== Close the message with a final one-line rollup paragraph (no heading), in addition to the three phase sections. Mention the TOTAL number of suites replayed (which may exceed the count created in this session, because replay_sandbox_test covers every linked suite the app has). Example: "_Rollup: inserted 4 suites, 4/4 with sandbox tests after record, 3/4 suites passed sandbox replay across the app's 6 linked suites — 1 failure is likely keploy egress-hook, file an issue with the IDs above._" ===== DO NOT ===== * DO NOT call update_test_suite or record_sandbox_test after this. The dev said RUN, not REFRESH. * DO NOT fall back to raw keploy CLI (`keploy test …`) if the MCP tool drops mid-flow — CLI runs test-sets directly and does NOT write results back to the MCP-visible TestSuiteRun. See MCP DISCONNECT RECOVERY in the top-level instructions.
    Connector
  • Returns available payment and authentication options for accessing live market data. Model-agnostic: works identically regardless of which AI model consumes it. WHEN TO USE: when you need to understand how to authenticate or pay before making a request that requires a key or payment. Returns upgrade ladder: sandbox (200 calls free), x402 per-request ($0.001 USDC), x402 sandbox (10 credits for $0.001), credit packs ($5 = 1000 calls), builder subscription ($99/mo = 50K/day). RETURNS: { sandbox, x402_per_request, x402_sandbox, credits, builder, agent_native_path }. No authentication required. Always returns 200.
    Connector
  • Look up an ATC code at level 1-4 to get its name and hierarchy level. Use this tool to: - Resolve an ATC code (e.g., "A10BA") to its class name ("Biguanides") - Confirm a code exists in the current ATC index - Identify the level (anatomical / therapeutic / pharmacological / chemical) Accepts codes 1-5 characters long: "A" (anatomical), "A10" (therapeutic), "A10B" (pharmacological), "A10BA" (chemical). Substance-level codes (7 chars, e.g., "A10BA02") are not exposed by this endpoint — use atc_classify with the drug name to retrieve the substance code.
    Connector
  • PREVIEW: Run terraform plan to preview infrastructure changes Runs a terraform plan for an InsideOut session without applying any changes. This lets the user review what will be created/changed/destroyed before committing. Returns job_id, plan_id, and project_id. Use tflogs to stream the plan output. After the plan completes, use tfdeploy with plan_id to apply the exact plan. SINGLE-FLIGHT: only one TF job per session at a time. If another job is already in flight, tfplan returns tf_job_conflict with the live job_id — attach with tfstatus/tflogs, or pass force_new=true to override. REQUIRES: session_id from convoopen response (format: sess_v2_...). OPTIONAL: sandbox (boolean, default false) — plans real generated Terraform. Set to true for cheap sandbox template (testing only). OPTIONAL: force_new (boolean, default false) - bypass the single-flight guard. Use only when the existing run is provably wedged. CREDENTIAL HANDLING: Same as tfdeploy - credentials must be configured first.
    Connector
  • Returns available payment and authentication options for accessing live market data. Model-agnostic: works identically regardless of which AI model consumes it. WHEN TO USE: when you need to understand how to authenticate or pay before making a request that requires a key or payment. Returns upgrade ladder: sandbox (200 calls free), x402 per-request ($0.001 USDC), x402 sandbox (10 credits for $0.001), credit packs ($5 = 1000 calls), builder subscription ($99/mo = 50K/day). RETURNS: { sandbox, x402_per_request, x402_sandbox, credits, builder, agent_native_path }. No authentication required. Always returns 200.
    Connector
  • Returns the current Strale wallet balance. Call this before executing paid capabilities to verify sufficient funds, or after a series of calls to reconcile spend. Returns balance in EUR cents (integer) and formatted EUR string. Requires an API key — returns an auth instruction if none is configured.
    Connector

Matching MCP Servers

Matching MCP Connectors

  • Read-only. Use to find workflows in a project by name, description, or trigger type before inspection or editing. Trigger filters include database, auth email, repeating, broadcast, and no-trigger workflows. Returns paginated workflow summaries, published/sandbox state, trigger type, workflow URLs, totalCount, hasMore, and nextOffset. Do not use as the final source of truth before editing; call get_workflow_and_preview_url for full structure.
    Connector
  • Record mocks for V1 repo-mode API tests using the V1-native CLI command `keploy sandbox local record`. Runs the dev's app under the keploy eBPF agent, drives the V1 chained-CRUD tests from `keploy/api-tests/<resource>/test.yaml`, captures every outbound call (DB queries, Redis ops, downstream HTTP) as mocks, and lays them out at `<app_dir>/keploy/<suite-name>/{tests/, mocks.yaml, config.yaml}` in the standard OSS test-set tree. On success, mocks upload to the Keploy canonical pool by content hash; the hash lands in config.yaml so a teammate's later replay fetches the same bytes. CRITICAL — DO NOT CONFUSE WITH `keploy record sandbox`: * `keploy sandbox local record` (V1, repo-mode) ← this is what the playbook below uses * `keploy record sandbox` (legacy, cloud-mode) ← DO NOT call this for V1 The two are entirely different commands. Cloud-mode requires server-side suites (queried via --suite-ids) — V1 repo-mode reads tests from the local filesystem and never registers them in the cloud. If the dev is in repo storage mode (verify via devloop_resolve_storage's source=persisted, mode=repo), V1 is the ONLY correct sandbox path. STRICT — TIME-FREEZING DOES NOT APPLY TO RECORD. Recording MUST use the dev's regular (prod) Dockerfile or native binary. NEVER spawn the app via Dockerfile.keploy / "-f docker-compose.keploy.yml" / "-tags=faketime" build during record. The faketime binary writes wrong timestamps into captured mocks (it reads time from the offset file, not the wall clock) and the entire capture becomes corrupt — recovery requires re-recording from scratch with the prod binary. If a previous replay failed with expired-JWT and the dev wants to "fix" it, the fix is to re-RUN the replay with --freezeTime, NOT to re-record. The recorded mocks captured against the prod binary are exactly what replay's clock-rewind is designed to validate; touching the record path defeats the whole mechanism. ONLY call this with an explicit dev opt-in. The valid triggers: * Dev directly asks ("capture mocks", "sandbox record", "rerecord the users mocks"). * Post-resource menu (Step 5 of devloop_generate_resource_flow) — dev picks "Capture mocks so CI runs in seconds". * get_session_report shows mock_mismatch_dominant=true AND the dev says yes to your "rerecord?" prompt. Pre-conditions: * Dev's app must NOT already be running (keploy spawns its own copy of the app under the agent's eBPF hooks via the -c command). If a server is up at the target port, KILL IT first or the agent's network capture won't see the traffic. * Real downstream deps (MySQL, Redis, Kafka, etc.) MUST be running — the capture proxies through to them on first contact so the recorded mocks contain real responses. * The test YAML must exist at <app_dir>/keploy/api-tests/<resource>/test.yaml. Returns a playbook for `keploy sandbox local record` with the V1 flag surface: --test-dir, --app-url, -c (spawn command), --container-name (docker-compose only), --skip-mock-upload (offline), --skip-report-upload (offline). Mocks land per-suite at keploy/<suite-name>/. NDJSON progress at --progress-file for the standard tail-til-done loop.
    Connector
  • Complete Disco signup using an email verification code. Call this after discovery_signup returns {"status": "verification_required"}. The user receives a 6-digit code by email — pass it here along with the same email address used in discovery_signup. Returns an API key on success. Args: email: Email address used in the discovery_signup call. code: 6-digit verification code from the email.
    Connector
  • Connect to the user's catalogue using a pairing code. IMPORTANT: Most users connect via OAuth (sign-in popup) — if get_profile already works, the user is connected and you do NOT need this tool. Only use this tool when: (1) get_profile returns an authentication error, AND (2) the user shares a code matching the pattern WORD-1234 (e.g., TULIP-3657). Never proactively ask for a pairing code — try get_profile first. If the user does share a code, call this tool immediately without asking for confirmation. Never say "pairing code" to the user — just say "your code" or refer to it naturally.
    Connector
  • Get the rugby teams in scope, optionally scoped to a tournament. Each team has a 3-letter code (e.g. "FRA", "RSA", "NZL"), full name, and the tournaments it competes in. Use this to enumerate valid `team` values for rugby_getMatches.
    Connector
  • Complete login and receive a new API key. Call this after discovery_login returns {"status": "verification_required"}. The user receives a 6-digit code by email — pass it here along with the same email address. Returns a new API key on success. Args: email: Email address used in the discovery_login call. code: 6-digit verification code from the email.
    Connector
  • PREVIEW: Run terraform plan to preview infrastructure changes Runs a terraform plan for an InsideOut session without applying any changes. This lets the user review what will be created/changed/destroyed before committing. Returns job_id, plan_id, and project_id. Use tflogs to stream the plan output. After the plan completes, use tfdeploy with plan_id to apply the exact plan. SINGLE-FLIGHT: only one TF job per session at a time. If another job is already in flight, tfplan returns tf_job_conflict with the live job_id — attach with tfstatus/tflogs, or pass force_new=true to override. REQUIRES: session_id from convoopen response (format: sess_v2_...). OPTIONAL: sandbox (boolean, default false) — plans real generated Terraform. Set to true for cheap sandbox template (testing only). OPTIONAL: force_new (boolean, default false) - bypass the single-flight guard. Use only when the existing run is provably wedged. CREDENTIAL HANDLING: Same as tfdeploy - credentials must be configured first.
    Connector
  • Browse and compare Licium's agents and tools. Use this when you want to SEE what's available before executing. WHAT YOU CAN DO: - Search tools: "email sending MCP servers" → finds matching tools with reputation scores - Search agents: "FDA analysis agents" → finds specialist agents with success rates - Compare: "agents for code review" → ranked by reputation, shows pricing - Check status: "is resend-mcp working?" → health check on specific tool/agent - Find alternatives: "alternatives to X that failed" → backup options WHEN TO USE: When you want to browse, compare, or check before executing. If you just want results, use licium instead.
    Connector
  • Returns runnable code that creates a Solana keypair. Solentic cannot generate the keypair for you and never sees the private key — generation must happen wherever you run code (the agent process, a code-interpreter tool, a Python/Node sandbox, the user's shell). The response includes the snippet ready to execute. After running it, fund the resulting publicKey and call the `stake` tool with {walletAddress, secretKey, amountSol} to stake in one call.
    Connector
  • Returns runnable code that creates a Solana keypair. Solentic cannot generate the keypair for you and never sees the private key — generation must happen wherever you run code (the agent process, a code-interpreter tool, a Python/Node sandbox, the user's shell). The response includes the snippet ready to execute. After running it, fund the resulting publicKey and call the `stake` tool with {walletAddress, secretKey, amountSol} to stake in one call.
    Connector
  • Check a business's Making Tax Digital VAT mandate status via the HMRC API. NOTE: Connects to the HMRC sandbox by default. Set HMRC_API_BASE env var to 'https://api.service.hmrc.gov.uk' for production. Requires HMRC_CLIENT_ID and HMRC_CLIENT_SECRET environment variables (OAuth 2.0). Returns whether the business is mandated for MTD, effective date, and trading name.
    Connector
  • Switch the app's V1 CI from "boot the real app + deps" mode to sandbox mode (mocks fetched by content-hash from the cloud canonical pool). The doc-stated trigger: ~1 week after CI is wired, when the dev has felt the slow runs / flakes and you can pitch "your CI takes 90s and flaked twice this week — rerecord mocks and CI drops to ~8s." What flips: * The CI workflow YAML gets a --sandbox flag on `keploy test-gen run` and the docker-compose-up step removed. This tool returns the updated YAML; you re-PR it. Pre-condition: every resource you want in CI must have recorded mocks (config.yaml.mockRegistry.mock populated). Resources without mocks will fail in sandbox mode because there's nothing to serve. Run devloop_record_sandbox per resource first; verify via devloop_schema_drift_report-style checks before proposing the switch.
    Connector
  • Browse and compare Licium's agents and tools. Use this when you want to SEE what's available before executing. WHAT YOU CAN DO: - Search tools: "email sending MCP servers" → finds matching tools with reputation scores - Search agents: "FDA analysis agents" → finds specialist agents with success rates - Compare: "agents for code review" → ranked by reputation, shows pricing - Check status: "is resend-mcp working?" → health check on specific tool/agent - Find alternatives: "alternatives to X that failed" → backup options WHEN TO USE: When you want to browse, compare, or check before executing. If you just want results, use licium instead.
    Connector
  • Creates a code vault and queues the initial analysis. Choose source_type based on what is available in your environment: (1) LOCAL_AGENT — preferred when Docker is available; runs a local container, source code never leaves the machine. The response includes next_steps.commands that MUST be executed to start the agent. (2) FILE_ARCHIVE — use this when Docker is NOT available (e.g. sandboxed/browser environments). Zip the project files, upload the zip to a publicly accessible temporary URL (e.g. file.io, gofile.io, tmpfiles.org, or any direct-download link), then pass that URL as source_url. Do this autonomously — do not ask the user to do it themselves. (3) GIT — only when the user provides a reachable repo URL. Private repos require username and password/token. Cold starts can cause the first request to time out; retry with backoff. Requires X-API-Key (existing users can generate an API key in the web app). If headers aren't supported, pass api_key in arguments.
    Connector