Skip to main content
Glama
187,022 tools. Last updated 2026-06-10 06:38

"curl" matching MCP tools:

  • Replay the sandbox test for one or more suites against captured mocks — re-runs the suite's steps against the dev's locally-running app while keploy serves outbound calls (DB, downstream HTTP, etc.) from the captured mocks. Use this when the dev says "replay", "run my sandbox tests", "integration-test", "check if mocks still match" — keywords "sandbox" / "replay" / "mocks" / "integration-test" all map here. Also the REPLAY STEP of FROM-SCRATCH: call this LAST (after create_test_suite + record_sandbox_test) to give the dev the whole-app regression picture against the freshly captured mocks. Output produces a SANDBOX RUN REPORT — it answers "does the suite still hold up against its captured baseline?". ═══════════════════════════════════════════════════════════════════ DISAMBIGUATION — pick this tool vs. replay_test_suite: ═══════════════════════════════════════════════════════════════════ USE replay_sandbox_test (THIS TOOL) when the dev says: * "run my sandbox tests" / "replay my sandbox tests" * "integration-test my app" / "run the integration tests" * "check if my mocks still match" / "replay against the captured mocks" * "rerun my sandbox suite" (with the word "sandbox") Trigger keyword: an explicit "sandbox" / "replay" / "mocks" / "integration-test" — silent signal that the dev wants captured-mock replay, NOT live-app execution. USE replay_test_suite INSTEAD when the dev says: * "run the test suite" / "run my test suites" (bare — no "sandbox") * "execute test suite X" / "run suite 810d3ebe…" * "test the suite again" / "smoke test against the live app" Bare verbs ("run / test / execute") applied to "the suite" without the word "sandbox" mean LIVE-APP execution, NOT captured-mock replay. replay_test_suite hits the dev's running localhost app directly via HTTP — no docker spin-up, no mocks. After a record_sandbox_test run, the natural next step is THIS tool (replay against the just-captured mocks). After create_test_suite / update_test_suite, the natural next step is replay_test_suite (validate against the live app). When the dev's verb is bare and the prior turn doesn't make the intent obvious, ASK rather than picking sandbox-replay silently — code-change regressions can hide under "mock didn't match" failures. ═══════════════════════════════════════════════════════════════════ DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id: ═══════════════════════════════════════════════════════════════════ Suites live on a (app_id, branch_id) tuple. A bare suite_id has NO on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash `git rev-parse --abbrev-ref HEAD` in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name. 2. Resolve candidate apps via the cwd basename: Bash `basename $(pwd)` → call listApps with q=<basename>. Usually 1–2 candidates. If 0 → ASK; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose `name` matches the git branch from step 1. That gives you {branch_id}. If no match → not this app, try next. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app/branch, try next. 5. If steps 2–4 exhaust, walk every OPEN branch on each candidate app via list_branches → getTestSuite. Then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. After resolving once in a session, REUSE the {app_id, branch_id} for subsequent suite-targeted calls; don't re-walk discovery for every action. SCOPE — whole-app vs single-suite: * Default: LEAVE suite_ids UNSET → the tool resolves "every suite for the app that has a sandbox test (test_set_id populated)" and replays them all. Use this for "run my sandbox tests" / "check if my tests still pass" — whole-app regression. New suites auto-pick up. * Single / subset: PASS suite_ids when the dev names specific suites — "replay sandbox test for suite 810d3ebe-…", "replay only the auth suite", "run suite X and Y". The tool validates each requested id is actually a suite with a sandbox test (has test_set_id); an unlinked id gets a precise "record first" error instead of an opaque downstream CLI failure. This tool resolves the app, picks the suite set per the rule above, and returns a single playbook that drives the replay for them. It does NOT record. WHAT THIS TOOL DOES INTERNALLY (so you don't have to): 1. Resolves app_id — use the explicit app_id if the caller has one; otherwise pass app_name_hint (usually the cwd basename) and the server does listApps with a substring match. Multiple matches → error listing them; zero matches → error suggesting the dev generate a suite first. 2. Lists test suites for the app, keeps only those with a non-empty test_set_id. Zero linked → typed "no linked sandbox tests" error. 3. If suite_ids was passed, validates every requested id is in the linked-suites set; unlinked ids → typed error pointing to record_sandbox_test. 4. Returns the headless playbook — walk it exactly: spawn CLI in background, tail the progress file (PID-alive guard built in), read the terminal event, fetch the report. No separate cleanup step — the CLI exits on its own. ===== PREREQUISITES ===== (Same as record_sandbox_test — if you just recorded, you already have them. Same docker-compose network rule applies: use the same compose file + service, stop the app service before calling, leave deps running.) - app_command: shell command that starts the dev's app (e.g. "docker compose up producer"). - app_url: base URL the app listens on, e.g. http://localhost:8080. - app_dir: absolute path to repo root. - container_name if app_command is docker-compose. - keploy binary on PATH. If `which keploy` returns nothing, install it before calling this tool with: `curl --silent -O -L https://keploy.io/install.sh && source install.sh`. ===== AFTER CALLING — walk the playbook ===== Same headless playbook shape as record_sandbox_test: spawn `keploy test sandbox --cloud-app-id …` in the background via Bash, poll `tail -n 1 $PROGRESS_FILE` repeatedly (no sleep loops; the wait_for_done step has a built-in `kill -0 $KEPLOY_PID` guard so the loop exits if the CLI dies silently), read the terminal NDJSON event (phase=done, data.ok, data.test_run_id), and — if ok=true — call get_session_report(app_id, test_run_id) with verbose=true at the end. No separate cleanup step needed; the CLI exits cleanly once phase=done is written. ===== MANDATORY OUTPUT — Phase 3 section ===== Your final message to the dev MUST contain a section with this exact heading (do NOT merge with Phase 2; do NOT compress the failed-steps table even when failures are homogeneous): ### Phase 3 — Sandbox run report Under it, emit the uniform three-subsection format owned by get_session_report: (i) per-suite table — one row per suite in per_suite, passing suites included, columns = Suite name | passed/total steps. (ii) failed-steps table — ONE ROW per entry in failed_steps[], columns = Suite | Step name | Method + URL | Expected → Actual status | mock_mismatch y/n. Never collapse rows. (iii) Diagnosis + Recommendation (see get_session_report description for case-specific rules around mock_mismatch_dominant, repo-diff inspection, and the SKIP / FIX-CODE / FIX-TEST branching for fix-it follow-ups). Do NOT print aggregate step totals across suites — they mix unrelated suites and hide where damage actually is. ===== ROLLUP LINE ===== Close the message with a final one-line rollup paragraph (no heading), in addition to the three phase sections. Mention the TOTAL number of suites replayed (which may exceed the count created in this session, because replay_sandbox_test covers every linked suite the app has). Example: "_Rollup: inserted 4 suites, 4/4 with sandbox tests after record, 3/4 suites passed sandbox replay across the app's 6 linked suites — 1 failure is likely keploy egress-hook, file an issue with the IDs above._" ===== DO NOT ===== * DO NOT call update_test_suite or record_sandbox_test after this. The dev said RUN, not REFRESH. * DO NOT fall back to raw keploy CLI (`keploy test …`) if the MCP tool drops mid-flow — CLI runs test-sets directly and does NOT write results back to the MCP-visible TestSuiteRun. See MCP DISCONNECT RECOVERY in the top-level instructions.
    Connector
  • WORKFLOW: Step 3 of 4 - Generate Terraform files from completed design Generate Terraform files from an InsideOut session that has completed infrastructure design. ⚠️ PREREQUISITE: Only call this AFTER convoreply returns with `terraform_ready=true` in the response metadata. DO NOT call this while convoreply is still running or before terraform_ready is confirmed! If you get 'session has not reached terraform-ready state', wait for convoreply to complete first. 🎯 USE THIS TOOL WHEN: convoreply has returned with terraform_ready=true, OR the user asks to 'see the terraforms', 'generate terraform', 'show me the code', etc. **DEFAULT RESPONSE**: Returns summary table + download URL (keeps code out of LLM context). **FALLBACK**: Set `include_code: true` to get full code inline if curl/unzip fails. **CRITICAL WORKFLOW** (default mode): 1. Call this tool to get file summary and download URL 2. ASK the user: 'Where would you like me to save the Terraform files? Default: ./insideout-infra/' 3. WAIT for user confirmation before running the download command 4. Run the curl/unzip command with the user's chosen directory 5. If curl/unzip FAILS (sandbox, security, platform issues), retry with `include_code: true` **AFTER GENERATION**: Ask user if they want to review the files and then deploy with tfdeploy REQUIRES: session_id from convoopen response (format: sess_v2_...). OPTIONAL: include_code (boolean) - set true to return full code inline as fallback. 💡 TIP: Examine workflow.usage prompt for more context on how to properly use these tools.
    Connector
  • Submit a list of URLs to be checked. Returns a job_id that can be polled via get_job_status or fetched via get_job_results. For up to ~200 URLs this tool waits for completion (up to 60 seconds) and returns the results directly; for larger jobs it returns early with job_id and the agent should poll.
    Connector
  • Modify an existing image. REQUIRED input: exactly one of file_id OR image_url. base64 is NOT accepted — do not try to pass image bytes as a tool argument, the call will be rejected. For chat-attached images you MUST first call prepare_image_upload to get a signed PUT URL, upload the bytes there (via the inline widget on Claude.ai, or via curl on Claude Desktop / Claude Code), then call this tool with the returned file_id. For URLs the user has pasted, use image_url directly. Returns a jobId immediately; call check_job with the jobId to retrieve the edited image inline. Models (both 1 credit/image): 'nano-banana-2' (fast, default) and 'gpt-image-2' (higher quality).
    Connector
  • Create a frontend deployment and get an upload URL. Upload your built frontend as a zip file to the returned URL, then use manage_frontend (action: "start_deployment") to trigger the deploy. Steps: 1. Call this tool to get an upload URL 2. Upload your zip file to the URL (e.g. curl -X PUT "{uploadUrl}" -H "Content-Type: application/zip" --data-binary @frontend.zip) 3. Call manage_frontend (action: "start_deployment") with the returned deployment_id Example: Input: { app_id: "app_abc123", framework: "react-vite" } Output: { deployment_id: "uuid-1234", uploadUrl: "https://...", expiresIn: 900, maxSizeBytes: 104857600 } Prerequisites: - App must exist (use init_app to create) Free plan: 1 deployment per app. Deploying again automatically replaces the previous deployment (no need to delete first). Starter+: unlimited deployments. Framework options: - react-vite: React app built with Vite (zip the dist/ folder) - nextjs-static: Next.js static export (zip the out/ folder) - static: Plain HTML/CSS/JS - other: Any framework that produces static output SPA routing: For SPA frameworks (react-vite, nextjs-static, other), a _redirects file is auto-injected so all routes serve index.html. If your zip already includes a _redirects file, it is preserved. IMPORTANT — Zip file paths must use forward slashes (/), not backslashes (\). On Windows, zips created with built-in tools use backslashes, which causes all files to be served as text/html (breaking JS/CSS with MIME errors). On Windows use Git Bash or WSL to run: cd dist && zip -r ../frontend.zip . Common errors: - RESOURCE_NOT_FOUND: App doesn't exist Idempotency: Not idempotent — creates a new deployment each time (replaces existing on free plan). Your frontend will be deployed to https://<app-name>.butterbase.dev. Next steps: Upload your zip to the returned URL, then call manage_frontend (action: "start_deployment").
    Connector
  • WORKFLOW: Step 3 of 4 - Generate Terraform files from completed design Generate Terraform files from an InsideOut session that has completed infrastructure design. ⚠️ PREREQUISITE: Only call this AFTER convoreply returns with `terraform_ready=true` in the response metadata. DO NOT call this while convoreply is still running or before terraform_ready is confirmed! If you get 'session has not reached terraform-ready state', wait for convoreply to complete first. 🎯 USE THIS TOOL WHEN: convoreply has returned with terraform_ready=true, OR the user asks to 'see the terraforms', 'generate terraform', 'show me the code', etc. **DEFAULT RESPONSE**: Returns summary table + download URL (keeps code out of LLM context). **FALLBACK**: Set `include_code: true` to get full code inline if curl/unzip fails. **CRITICAL WORKFLOW** (default mode): 1. Call this tool to get file summary and download URL 2. ASK the user: 'Where would you like me to save the Terraform files? Default: ./insideout-infra/' 3. WAIT for user confirmation before running the download command 4. Run the curl/unzip command with the user's chosen directory 5. If curl/unzip FAILS (sandbox, security, platform issues), retry with `include_code: true` **AFTER GENERATION**: Ask user if they want to review the files and then deploy with tfdeploy REQUIRES: session_id from convoopen response (format: sess_v2_...). OPTIONAL: include_code (boolean) - set true to return full code inline as fallback. 💡 TIP: Examine workflow.usage prompt for more context on how to properly use these tools.
    Connector

Matching MCP Servers

  • A
    license
    -
    quality
    B
    maintenance
    Provides a structured HTTP client tool for making web requests with full HTTP method support, detailed response metadata, and error handling. Enables AI assistants to interact with any web API or endpoint through the curl_request tool.
    Last updated
    10
    2
    MIT
  • F
    license
    -
    quality
    D
    maintenance
    A Multi-modal Conversation Protocol (MCP) server that enables interacting with web APIs through a curl-like interface, providing access to various websites and APIs with user authentication and response transformations.
    Last updated
    8

Matching MCP Connectors

  • Publish HTML content to a live URL instantly. No account or API key required. Returns a public URL that anyone can visit. Sites expire after 24 hours unless the owner claims them. Use this when a user asks you to build, create, or deploy a website, landing page, invitation, portfolio, report, or any HTML content they want to share as a link. Supports bundled assets (CSS, JS, images) and multi-page sites (include additional .html files in assets, accessible at /{slug}/{pagename}). IMPORTANT: After publishing, always share the live URL and the claim URL with the user. The claim URL lets them take permanent ownership of the site. Store the edit_key from the response silently — do not show it to the user — you will need it if they ask you to make changes to the site later. If you lose the edit_key, ask the user to claim the site first (via the claim URL in the page footer), then provide you with their API key from the dashboard — you can use that instead.
    Connector
  • Permanently delete a published website. The site will be immediately inaccessible. Requires authentication via edit_key or api_key, and requires confirm: true as a safety mechanism to prevent accidental deletion. Use this when a user explicitly asks you to remove or delete a site. IMPORTANT: Always confirm with the user before calling this tool — deletion cannot be undone.
    Connector
  • NO AUTH / PUBLIC / READ-ONLY. Builds and validates a copy-pasteable authenticated /api/v2/{dataset}/timeseries HTTP request without sending it. This tool does not execute the request, query weather values, or return forecast data. Use gribstream_query_timeseries when the user asks for actual weather values or CSV/JSON/NDJSON data. Generated direct API requests include Accept-Encoding: gzip, and generated curl commands use --compressed so large responses can be transferred compressed when the client supports it. Do not include request.asOf unless the user explicitly wants backtesting, time travel, or a historical model-run cutoff. The request body must use exact selectors discovered from the catalog or shared-parameter tools, with coordinates in request.coordinates and selectors in request.variables.
    Connector
  • Start the purchase flow for a domain using USDC crypto payment (x402 protocol). Autonomous two-step payment flow — no browser required: Step 1: Call this tool to reserve the domain and get a pay_url. Step 2: Pay the pay_url using your x402 wallet: - If you have Coinbase Payments MCP: call its x402 payment tool with the pay_url. - Otherwise: run `curl -L <pay_url>` — your x402-enabled HTTP client will handle the 402 handshake and USDC payment on Base automatically. After payment succeeds, call get_domain_status(order_id) to poll until registration is complete (usually under 60 seconds). The registrant contact details are required because the domain will be registered in the buyer's name (they become the legal owner). WHOIS privacy is enabled by default, so these details are not publicly visible. IMPORTANT: Before calling this tool, you MUST first call check_domain to get the price and confirm it with the user. Args: domain: The domain to purchase (e.g. "coolstartup.com"). first_name: Registrant's first name. last_name: Registrant's last name. email: Registrant's email address. address1: Registrant's street address. city: Registrant's city. state: Registrant's state or province. postal_code: Registrant's postal/zip code. country: 2-letter ISO country code (e.g. "US", "GB", "DE"). phone: Phone number in format +1.5551234567. org_name: Organization name (optional, leave empty for individuals). Returns: Dict with order_id, pay_url (full URL to pay via x402), price_usdc, price_cents, network, and USDC contract address.
    Connector
  • Create a CTRL workflow draft. ONE trigger + an ordered chain of up to 20 actions/conditions/utilities. Returns { workflowId, activateUrl }. Pass `targetChain` to pick which chain the workflow runs on — "base" (default, launchpads + Aerodrome + UniV4) or "ethereum" (UniV3 only, no launchpads, no clanker/zora). CRITICAL: call ctrl_get_block_catalog FIRST (with the same `chain` value) to discover field names — every key in trigger.config and chain[].config must exactly match catalog fields[].key. Populate EVERY field the user expressed intent for. For pool.created (Token Launch, Base-only) set launchpad (e.g. ["bankr"]), keywordIncludes ("ai,agent,claw"), keywordMatchMode "any", keywordCategories (["ai_agents"]), safetyEnabled true, safetyRejectHoneypot true, safetyMinScore 50. For cypher.swap set tokenIn ("ETH"), tokenOut ("{{trigger.tokenAddress}}"), tokenOutMode "dynamic", amount (ETH units, e.g. 0.005 — ASK USER if not specified), slippage (15 for snipes), and autoSell* if user wants an exit (autoSellEnabled true, autoSellMode "multiple", autoSellMultiplier 2, autoSellPercent 100, autoSellReceiveToken "USDC"). For notify.telegram set message with {{token}}/{{amount}}/{{txHash}} placeholders. Interview the user for missing critical fields (amount, exit strategy, keywords) — do not silently default.
    Connector
  • Confirm a datasheet upload started via request_datasheet_upload. Pass the upload_token you got back from the request step. The server downloads the uploaded bytes, re-hashes to verify integrity, validates that it's a real PDF with the MPN on the first page, creates the private Document + Component records, charges the upload fee (50¢), and queues extraction. Success response: document_id, mpn, sha256, file_size_bytes, status='pending'. Poll check_extraction_status with the MPN to wait for extraction to finish (30s-2min typically). Failure modes: - 'upload_not_found' — no bytes at the upload URL yet. Retry your curl upload. - 'sha256_mismatch' — uploaded bytes hash differs from expected_sha256. Re-compute the hash and re-request. - 'invalid_pdf' — bytes aren't a parseable PDF. No charge. - 'mpn_not_in_pdf' — MPN (or its stem) isn't on the first page. Either you uploaded the wrong file or it's a scanned image-only PDF. No charge. - 'token_expired' — upload token is older than 15 minutes. Restart via request_datasheet_upload.
    Connector
  • Search and inspect the Wix REST API documentation/spec by writing JavaScript code that runs in a sandboxed read-only environment. This tool overlaps with `SearchWixRESTDocumentation`, `BrowseWixRESTDocsMenu`, `ReadFullDocsArticle`, and `ReadFullDocsMethodSchema`: use any of them to discover Wix REST endpoints, schemas, examples, and related docs. Prefer `SearchWixAPISpec` over `ReadFullDocsMethodSchema` for REST method schemas when it is available, especially after you already have a docs URL from semantic search, menu browsing, or conversation context. Prefer URL-first results: - If you have a docs URL or partial docs URL, search `resource.docsUrl` and `method.docsUrl` first. - If you have a method docs URL and need the request/response shape, call `getResourceSchemaByUrl(methodDocsUrl)` in this tool and return the selected method schema directly. - For API execution, return and use `method.publicUrl` when available. It is the preferred executable `https://www.wixapis.com/...` URL. - Return `docsUrl` for relevant resources/methods when the next step needs an article or API call source URL; do not hand off to `ReadFullDocsMethodSchema` just to inspect a REST method schema. - Use `resourceId` only as the internal handle for low-level loaders; prefer URL helpers when you have a docs URL. Your code has access to these globals: **lightIndex** — Current lightweight REST API resource array: ```typescript interface LightIndex extends Array<LightResource> { updatedAt?: string; // ISO timestamp for when spec sync generated this index } interface LightResource { name: string; // e.g. "Products V3", "Contact V4" resourceId: string; // internal handle for getResourceSchema() docsUrl: string; // e.g. "https://dev.wix.com/docs/api-reference/business-solutions/stores/catalog-v3/products-v3" menuPath: string[]; // e.g. ["business-solutions", "stores", "catalog-v3", "products-v3"] methods: Array<{ operationId: string; // e.g. "wix.stores.catalog.v3.CatalogApi.CreateProduct" summary: string; // e.g. "Create Product" httpMethod: string; // "get" | "post" | "patch" | "delete" path: string; // e.g. "/v3/products" docsUrl?: string; // e.g. "https://dev.wix.com/docs/api-reference/.../query-products" publicUrl?: string; // preferred executable URL for ExecuteWixAPI, when available after spec sync publicBaseUrl?: string; description: string; // truncated to 200 chars }>; } ``` **getResourceSchemaByUrl(docsUrl)** and **getResourceSchema(resourceId)** return the full schema for a resource: ```typescript interface FullSchema { title: string; description: string; fqdn: string; docsUrl?: string; methods: Array<{ summary: string; description: string; operationId: string; httpMethod: string; path: string; docsUrl?: string; publicUrl?: string; // Preferred executable URL for ExecuteWixAPI, e.g. "https://www.wixapis.com/..." publicBaseUrl?: string; // Public Wix APIs base URL used to derive publicUrl servers: Array<{ url: string }>; // Base URLs (e.g. "https://www.wixapis.com/...") requestBody: object | null; responses: object; parameters: Array<object>; permissions: string[]; legacyExamples: Array<{ // Curl examples content: { title: string; request: string; response: string }; }>; }>; components: { schemas: object }; } ``` **articles** — Array of all Wix documentation articles (~1000 guides, tutorials, concepts): ```typescript interface LightArticle { name: string; // e.g. "About the Wix API Query Language" resourceId: string; docsUrl: string; // e.g. "https://dev.wix.com/docs/api-reference/articles/..." menuPath: string[]; // e.g. ["work-with-wix-apis", "data-retrieval", "about-the-wix-api-query-language"] description: string; // first ~200 chars of the article content } ``` **getResourceSchemaByUrl(docsUrl)** — Async function returning the full schema for the resource or method docs URL. **getResourceSchema(resourceId)** — Lower-level async function returning the full schema for a resource ID. Prefer `getResourceSchemaByUrl(docsUrl)` when you have a docs URL. **getArticleContentByUrl(docsUrl)** — Async function returning the full markdown content of an article docs URL (string). **getArticleContent(resourceId)** — Lower-level async function returning the full markdown content of an article resource ID. Prefer `getArticleContentByUrl(docsUrl)` when you have a docs URL. Articles and API resources share the same menuPath hierarchy. Use menuPath to find related articles and APIs within the same domain. Your code MUST be an `async function()` expression that returns a value. app-management [12 resources]: app-billing, oauth-2, app-instance, embedded-scripts, app-permissions, bi-event, site-plugins, app-installations, market-listing business-solutions [154 resources]: e-commerce, stores, bookings, cms, events, restaurants, blog, forum, pricing-plans, portfolio, benefit-programs, donations, suppliers-hub, gift-cards, coupons assets [4 resources]: media, pro-gallery, rich-content crm [58 resources]: members-contacts, forms, community, communication, loyalty-program, crm business-management [102 resources]: ai-site-chat, analytics, async-job, app-installation, automations, branches, calendar, cookie-consent-policy, captcha, custom-embeds, functions, dashboard, faq-app, data-extension-schema, get-paid, headless, marketing, locations, multilingual, notifications, online-programs, payments, secrets, site-urls, site-properties, tags account-level [17 resources]: sites, resellers, domains, user-management, b2b-site-management site [2 resources]: viewer Important schema guidance: - For ExecuteWixAPI, use `method.publicUrl` when available. It is the preferred executable `https://www.wixapis.com/...` URL for that method. - Do not use `method.servers[0]` to build execution URLs. `method.servers` includes internal Wix hosts such as `www.wix.com`, `manage.wix.com`, and editor hosts. - `method.path` is usually an internal relative path, such as `/v3/items/{item.id}`. Use it for matching/debugging, not as the primary execution URL when `method.publicUrl` exists. - Do not exact-match full Wix API URLs against `method.path`. - Search docs URLs first when you have them. Search broadly across `resource.name`, `resource.docsUrl`, `resource.menuPath.join("/")`, `method.summary`, `method.operationId`, `method.description`, `method.path`, and `method.docsUrl` only when you still need discovery. - Method schemas can contain `{ "$circular": "TypeName" }` references. Use `schema.components.schemas[TypeName]` to inspect selected nested types. Avoid dumping huge fully-expanded schemas unless necessary. - When inspecting a specific method schema (i.e. you have found a single method and are returning its details), always include `responses: method.responses` alongside `requestBody`. Knowing the response shape up front prevents speculative re-runs of mutations just to see what the API returned. Examples: Inspect one method schema by exact docs URL: ```javascript async function() { const methodUrl = "https://dev.wix.com/docs/api-reference/business-solutions/stores/catalog-v3/products-v3/query-products"; const schema = await getResourceSchemaByUrl(methodUrl); const method = schema.methods.find(method => method.docsUrl === methodUrl); if (!method) { return { message: "Found the resource, but no exact method URL match. Returning available methods.", resourceDocsUrl: schema.docsUrl, methods: schema.methods.map(method => ({ title: method.summary, docsUrl: method.docsUrl, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, path: method.path })) }; } return { title: method.summary, docsUrl: method.docsUrl, resourceDocsUrl: schema.docsUrl, publicUrl: method.publicUrl, publicBaseUrl: method.publicBaseUrl, httpMethod: method.httpMethod.toUpperCase(), path: method.path, operationId: method.operationId, permissions: method.permissions, parameters: method.parameters, requestBody: method.requestBody, responses: method.responses, curlExamples: method.legacyExamples?.map(example => example.content) }; } ``` Inspect one resource by resource docs URL: ```javascript async function() { const resourceUrl = "https://dev.wix.com/docs/api-reference/business-solutions/stores/catalog-v3/products-v3"; const schema = await getResourceSchemaByUrl(resourceUrl); return { resource: schema.title, docsUrl: schema.docsUrl, description: schema.description, methods: schema.methods.map(method => ({ title: method.summary, docsUrl: method.docsUrl, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, path: method.path, operationId: method.operationId })) }; } ``` Inspect one method from a partial docs URL: ```javascript async function() { const partialUrl = "stores/catalog-v3/products-v3/query-products"; const resource = lightIndex.find(resource => resource.docsUrl.includes(partialUrl) || resource.methods.some(method => method.docsUrl?.includes(partialUrl)) ); if (!resource) return "No API resource found for this partial docs URL"; const schema = await getResourceSchemaByUrl( resource.methods.find(method => method.docsUrl?.includes(partialUrl))?.docsUrl ?? resource.docsUrl ); const method = schema.methods.find(method => method.docsUrl?.includes(partialUrl) ); if (!method) { return { message: "Found the resource, but no exact method match.", resource: resource.name, resourceDocsUrl: resource.docsUrl, methods: schema.methods.map(method => ({ title: method.summary, docsUrl: method.docsUrl, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, path: method.path })) }; } return { title: method.summary, docsUrl: method.docsUrl, resource: resource.name, resourceDocsUrl: resource.docsUrl, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, publicBaseUrl: method.publicBaseUrl, path: method.path, requestBody: method.requestBody, responses: method.responses, curlExamples: method.legacyExamples?.map(example => example.content) }; } ``` Expand selected nested schema refs: ```javascript async function() { const methodUrl = "https://dev.wix.com/docs/api-reference/business-solutions/stores/catalog-v3/products-v3/query-products"; const schema = await getResourceSchemaByUrl(methodUrl); const method = schema.methods.find(method => method.docsUrl === methodUrl); return { title: method.summary, docsUrl: method.docsUrl, requestBody: method.requestBody, selectedNestedTypes: { product: schema.components.schemas["com.wix.stores.catalog.product.api.v3.Product"], cursorPaging: schema.components.schemas["wix.stores.catalog.v3.upstream.wix.common.CursorPaging"], sorting: schema.components.schemas["wix.stores.catalog.v3.upstream.wix.common.Sorting"] } }; } ``` Advanced: bounded recursive expansion for one method. Use only when top-level schema and selected nested refs are not enough; keep depth small because schemas can become very large. ```javascript async function() { const methodUrl = "https://dev.wix.com/docs/api-reference/business-solutions/stores/catalog-v3/products-v3/query-products"; const schema = await getResourceSchemaByUrl(methodUrl); const method = schema.methods.find(method => method.docsUrl === methodUrl); function expandRefs(value, depth = 0, seen = []) { if (depth > 3) return value; if (Array.isArray(value)) return value.map(item => expandRefs(item, depth, seen)); if (!value || typeof value !== "object") return value; if (value.$circular) { const refName = value.$circular; if (seen.includes(refName)) return { $ref: refName, circular: true }; const target = schema.components?.schemas?.[refName]; if (!target) return { $ref: refName, missing: true }; return { $ref: refName, schema: expandRefs(target, depth + 1, seen.concat(refName)) }; } return Object.fromEntries( Object.entries(value).map(([key, nested]) => [ key, expandRefs(nested, depth, seen) ]) ); } return { title: method.summary, docsUrl: method.docsUrl, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, path: method.path, requestBody: expandRefs(method.requestBody), responses: expandRefs(method.responses) }; } ``` Find APIs by broad keywords when you do not have a docs URL: ```javascript async function() { const words = ["stores", "query", "products"]; return lightIndex.flatMap(resource => resource.methods .filter(method => { const haystack = [ resource.name, resource.docsUrl, resource.menuPath.join("/"), method.summary, method.operationId, method.description, method.path, method.docsUrl ].join(" ").toLowerCase(); return words.every(word => haystack.includes(word)); }) .map(method => ({ title: method.summary, docsUrl: method.docsUrl, resource: resource.name, resourceDocsUrl: resource.docsUrl, resourceId: resource.resourceId, operationId: method.operationId, httpMethod: method.httpMethod.toUpperCase(), publicUrl: method.publicUrl, path: method.path })) ); } ```
    Connector
  • Edit an existing test suite — change one or more step bodies, assertions, headers, or remove/add steps. Returns a playbook that delegates to `keploy update-test-suite`, which validates the new state (static structural checks + 2 live runs for idempotency + GET-coupling check) and snapshot-replaces the suite via api-server. POST-EDIT BEHAVIOUR: any structural change here (step method/url/body/headers/extract/assert, or add/delete steps) AUTOMATICALLY clears the suite's sandbox test server-side — the suite comes back as linked=false. Call record_sandbox_test on the updated suite before any sandbox replay; otherwise replay_sandbox_test will 400 with "no sandboxed tests". Cosmetic-only edits (name, description, labels) preserve the sandbox test. ═══════════════════════════════════════════════════════════════════ FETCH-FIRST RULE — required for the edit to be accepted: ═══════════════════════════════════════════════════════════════════ The api-server's replace handler rejects updates that preserve ZERO step IDs from the existing suite ("full rewrite, not an edit"). To make a real edit: 1. Call getTestSuite first (or use download_recording / get_app_testing_context if you already have the suite). Capture each existing step's "id" field. 2. Compose your new steps_json INCLUDING the existing "id" on every step you want to KEEP or EDIT. Omit "id" only on steps you're ADDING. Drop a step entirely from steps_json to DELETE it. 3. Call this tool with that merged steps_json. If you author a fresh JSON without the existing step IDs, the server rejects it with "preserves no steps from the existing suite". When that happens, your two options are: (a) re-author with IDs preserved (preferred — keeps history), or (b) call delete_test_suite then create_test_suite (loses history, fresh suite_id). ═══════════════════════════════════════════════════════════════════ DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id: ═══════════════════════════════════════════════════════════════════ Suites live on a (app_id, branch_id) tuple. A bare suite_id has no on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash `git rev-parse --abbrev-ref HEAD` in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name. 2. Resolve candidate apps via the cwd basename: Bash `basename $(pwd)` → call listApps with q=<basename>. Usually 1–2 candidates. If 0 → ASK; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose `name` matches the git branch from step 1. That gives you {branch_id}. If no match → not this app, try next. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app/branch, try next. 5. If steps 2–4 exhaust, walk every OPEN branch on each candidate app, then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. The getTestSuite call in step 4 is the one whose response you also use to capture every step's existing "id" for the FETCH-FIRST RULE above — so step 4 is actually a 2-for-1: discovery AND fetch-first happen on the same call. After resolving once in a session, REUSE the {app_id, branch_id} for subsequent suite-targeted calls; don't re-walk discovery for every action. ═══════════════════════════════════════════════════════════════════ INPUTS ═══════════════════════════════════════════════════════════════════ * app_id (required) — Keploy app id * suite_id (required) — UUID of the suite to update * branch_id (required) — Keploy branch UUID (resolve via the two-step flow before calling) * steps_json (required) — JSON array of the FULL desired step list. Each kept step MUST carry the existing "id". Same step shape as create_test_suite (response, extract, assert, etc — all static structural checks apply). * name / description / labels (optional) — overrides for top-level suite metadata * app_url (required) — base URL of the dev's running local app, e.g. http://localhost:8080. The CLI fires the new state TWICE against this for the idempotency check + GET-coupling check. * app_dir (optional) — repo root the CLI cd's into; defaults to "." ═══════════════════════════════════════════════════════════════════ HOW THIS TOOL WORKS ═══════════════════════════════════════════════════════════════════ This tool DOES NOT call api-server itself. It returns a 3-step playbook for you (Claude) to walk via Bash — same shape as create_test_suite: 1. Write merged JSON to a temp file. 2. Run `keploy update-test-suite --suite-id <id> --file <path> --branch-id <uuid> --base-url <url>` — runs every static structural check, fires the new state twice locally, applies the GET-coupling check, then POSTs the snapshot-replace. 3. Cleanup the temp file. Walk the playbook in order. If step 2 exits non-zero, surface stdout to the dev — it has the rule violation / failure detail. OUTCOMES the AI should recognize: * Exit 0 + stdout has "✓ suite updated:" + "View:" line → success. Surface the View URL to the dev. * Exit 1 + "preserves no steps from the existing suite" → fetch-first rule was missed. Re-author with step IDs preserved (or call delete_test_suite + create_test_suite as the documented escape hatch). * Exit 1 + structural-check violations → fix the suite per the violation messages, then REWRITE the suite file via Bash and RE-RUN this CLI command directly. DO NOT call update_test_suite again to retry — the playbook + file path are already valid; only the JSON content needs revision. The validator output includes a canonical step skeleton on structural failures. * Exit 2 + "couldn't reach the dev's app" → ensure the app is up at app_url and retry. PREREQUISITES the playbook assumes: * The dev's app is up and reachable at app_url. * `keploy` binary is on PATH. If missing, install before calling this tool: `curl --silent -O -L https://keploy.io/install.sh && source install.sh`. * Either ~/.keploy/cred.yaml exists or KEPLOY_API_KEY is exported.
    Connector
  • URL-encode or URL-decode a string. Uses RFC 3986 component encoding (encodeURIComponent semantics).
    Connector
  • Replay an existing test suite live against the dev's LOCAL APP (no mocks, no docker spin-up). Returns a playbook that delegates to the enterprise CLI `keploy test-suite`, which walks each suite's steps, fires HTTP requests at base_path, evaluates assertions, and uploads per-suite results to api-server. The CLI prints a final pass/fail summary table plus a "Report:" URL to stdout. Output produces a TEST SUITE REPORT — it answers "does the suite hold up against the actual current system?". ═══════════════════════════════════════════════════════════════════ DISAMBIGUATION — pick this tool vs. replay_sandbox_test: ═══════════════════════════════════════════════════════════════════ USE replay_test_suite (THIS TOOL) when the dev says: * "run the test suite" / "run my test suites" * "execute test suite X" / "run suite 810d3ebe…" * "test the suite again" / "rerun the suite" * "validate the suite changes" (after editing a suite) * "smoke test against the live app" Default reading: bare verbs "run" / "execute" / "test" applied to "the suite" mean LIVE-APP execution, NOT replay against captured mocks. USE replay_sandbox_test INSTEAD when the dev says: * "run my sandbox tests" / "replay my sandbox tests" * "integration-test my app" / "check if my mocks still match" * "replay the captured tests" / "run against the recorded mocks" Trigger keyword: "sandbox" / "replay" / "mocks" / "integration-test" — explicit signal that the dev wants captured-mock replay, not live-app. After a record_sandbox_test run, the natural next step is replay_sandbox_test (replay against the freshly captured mocks). After create_test_suite / update_test_suite, the natural next step is replay_test_suite (validate the new/edited suite against the live app). When the dev's verb is bare ("run the suite") and the prior turn was create/update, prefer THIS tool. When the prior turn was record, ASK the dev if unsure — the verbs overlap and silently picking sandbox-replay can mask code-change failures with mock-replay noise. USE THIS for: re-running previously-created suites against a running local app — verifying a regression after a code change, smoke-testing a branch, re-validating after editing a suite. DO NOT USE this for: validating a NEW suite that hasn't been inserted yet (use create_test_suite — it runs the suite twice as part of validation), or for running suites against the captured-mock copy of the app (use replay_sandbox_test — captured-mock replay flow). ═══════════════════════════════════════════════════════════════════ DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id: ═══════════════════════════════════════════════════════════════════ Suites live on a (app_id, branch_id) tuple. A bare suite_id has no on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash `git rev-parse --abbrev-ref HEAD` in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name (don't invent one). 2. Resolve candidate apps via the cwd basename: Bash `basename $(pwd)` → call listApps with q=<basename> (case-insensitive substring match). Usually 1–2 candidates (e.g. "orderflow" → matches "orderflow" and "orderflow.producer"). If 0 → ASK the dev for the app_id; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose `name` matches the git branch from step 1. That gives you {branch_id, status}. If no match → that app's not the owner; try the next candidate. If status is closed/merged → ask the dev whether to use this branch anyway. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app, try next candidate. 5. If steps 2–4 exhaust without a hit, the suite is on a branch whose name doesn't match the git branch (the dev created it with a custom name, or it's on main). Then: call list_branches on each candidate app and try every OPEN branch's branch_id with getTestSuite, then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. The reverse "look up suite_id globally" path doesn't exist — auditing is branch-scoped, so resolution starts from a branch context. After resolving once in a session, REUSE the {app_id, branch_id} for any subsequent suite-targeting call (delete_test_suite / update_test_suite / replay_test_suite); don't re-walk discovery for every action. ═══════════════════════════════════════════════════════════════════ INPUTS ═══════════════════════════════════════════════════════════════════ * app_id (required) — Keploy app ID. Same value used for create_test_suite / list_branches. * branch_id (required) — Keploy branch UUID. Resolve via the explicit two-step flow BEFORE calling: (1) Bash `git rev-parse --abbrev-ref HEAD` in app_dir; (2) call create_branch tool with {app_id, name: <git branch>} — find-or-create returns {branch_id, ...}; pass it here. Direct main writes are blocked. * base_path (required) — base URL of the dev's local app, e.g. http://localhost:8080. Each suite step's relative path is appended to this. * suite_ids (optional) — list of suite IDs to run. Omit / empty = run every suite registered for app_id on the branch. * header (optional) — single header to inject into every request, e.g. "Cookie: session=…". Same shape as the CLI's -H flag. * app_dir (optional) — absolute path to the dev's repo root (where the app is running). Defaults to '.' (cwd). The CLI invocation cd's here. ═══════════════════════════════════════════════════════════════════ HOW THIS TOOL WORKS ═══════════════════════════════════════════════════════════════════ This tool DOES NOT execute the suite itself. It returns a "playbook" — a small array of shell steps for you (Claude) to walk via Bash. The playbook spawns the enterprise CLI `keploy test-suite` in the foreground; the CLI: 1. Validates the branch exists + is writable (fails fast with a clear message if not). 2. Loads suites from api-server (filtered by --suite-id when supplied; otherwise every suite on the branch). 3. For each suite: fires step requests at base_path, evaluates assertions, records per-step results. 4. Uploads a TestSuiteRun + TestSuiteReport entry to api-server (?branch_id=<uuid>). 5. Prints a summary table to stdout, exits 0 on all-pass / 1 on any failure. Walk the playbook in order. Surface the CLI's stdout to the dev — the table shows which suites passed / failed / were "buggy" (suite-level verdict separate from individual step failures). PREREQUISITES the playbook assumes: * The dev's app is up and reachable at base_path. * `keploy` binary is on PATH. If missing, install before calling this tool: `curl --silent -O -L https://keploy.io/install.sh && source install.sh`. * Either ~/.keploy/cred.yaml exists (API key) or KEPLOY_API_KEY is exported.
    Connector
  • MANDATORY first step whenever the user attached an image in chat (or pointed at a local file on disk) and wants edit_image or image-to-video generation. Returns a signed PUT URL plus a file_id. After this tool: either (a) the inline upload widget will let the user drop the file and auto-continue (Claude.ai web), or (b) you run a curl PUT yourself if you have shell access (Claude Desktop / Claude Code) — the response text contains a ready-to-run curl command. Then call edit_image or generate_video with file_id=<returned id>. edit_image and generate_video do NOT accept base64 — calling them with raw image bytes WILL fail. This tool is the only working path for chat attachments. Set `purpose` to 'edit' or 'video' so the upload widget points the user at the right downstream tool.
    Connector
  • Record (or refresh) the sandbox test for one or more existing test suites — captures the request/response per step plus the outbound mocks (DB, downstream HTTP, etc.) against the dev's locally-running app, then links the captures onto the suite. Use this when the dev says "record", "rerecord", "re-record", "refresh the recordings", "capture mocks", or as the RECORD step in FROM-SCRATCH (after create_test_suite). This tool resolves the app (if only a hint is given), resolves ONE OR MORE suites to record (by exact ids OR case-insensitive name substring match), and delegates to a headless playbook. Output produces a RERECORD REPORT — it answers "did the sandbox test get created and linked successfully?". ╔═══ PRE-CHECK — DID YOU ARRIVE HERE FROM A FAILED REPLAY? ═══╗ This tool refreshes the CAPTURED BASELINE (mocks + recorded request/response per step). It does NOT modify the suite's authored assert array or response.body — those are the contract as defined when the suite was created/updated. If the contract changed and you re-record without updating the suite first, the new rerecord fires the suite's stale assertions against the live app, gate-1-fails on the same diff, and the suite comes back unlinked. Before calling THIS tool in response to a failed replay_sandbox_test or replay_test_suite, walk these checks: 1. Read failed_steps[].authored_assertions and authored_response_body in the most recent get_session_report (kind=sandbox_run / test_suite_run). The fields are inlined — no second tool call needed unless the report predates the inlined fields. 2. For each failing step: does any authored assertion pin the diverging value? (e.g. assert {path: "$.order.status", expected: "created order"} where the diff says "expected 'created order', got 'created'".) * YES → call update_test_suite FIRST to update that assertion + the response.body field, THEN call this tool. * NO → safe to call this tool directly; the captured baseline drifts but no authored assertion blocks the rerecord. 3. If you can't find authored_assertions in the report (older format) AND don't already know the suite's shape, call getTestSuite({app_id, suite_id, branch_id}) to inspect the assert array before deciding. Don't guess. REFUSE-RULE: if the dev confirms a contract change is intentional and the failing step has a pinned authored assertion on the diverging value, you MUST run update_test_suite before this tool. Calling record_sandbox_test FIRST in that case is the bug this pre-check exists to prevent — don't justify it as "let's just refresh the baseline first". The order is update → record → replay; never record → update. ╚═══════════════════════════════════════════════════════════════╝ ===== BEFORE CALLING — one-time setup ===== (a) APP_ID RESOLUTION (skip if app_id is already known): * Derive a likely app name from the cwd's basename (e.g. cwd=/home/dev/orderflow → "orderflow"). Lowercase it. * Call listApps({q: "<cwd-basename>"}) — the server does a case-insensitive server-side substring match, so you don't paginate the full tenant list (can be hundreds of apps on shared accounts). * Exactly one match → use its id. Multiple → list them and ASK the dev which one (a wrong app_id silently routes traffic + suite creates into the wrong app). Exception: if the compose file / repo layout unambiguously pins one candidate (e.g. compose has service "producer" and one candidate is "<folder>.producer" while others are unrelated siblings), you may pick it AND tell the dev up-front so they can correct. * Zero matches → ASK permission to create a new Keploy app with the derived name; on yes, call createApp({name, endpoint}) and use the returned id. * Alternatively pass app_name_hint to THIS tool and the server resolves it (same rules; multiple/zero → typed error). (b) KEPLOY BINARY VERIFICATION: * Bash: "keploy --version" (or "~/.keploy/bin/keploy --version"). If it exits non-zero the binary is missing. * If missing OR older than this MCP server was built against, install/upgrade: curl --silent -O -L https://keploy.io/ent/install.sh && source install.sh * Re-verify with "keploy --version"; fail loudly if still absent (tell the dev where keploy put the binary so they can add it to PATH). ===== DOCKER-COMPOSE NETWORK RULE (absolute) ===== Use the SAME compose file + service that was used in the validate-curl phase. Do NOT point keploy at a second "keploy-only" compose file — docker-compose isolates each file into its own project + network, so the app container spawned by keploy cannot reach the DB/Kafka containers that validate brought up (and the network-name collision blocks keploy from starting). Correct flow: (i) Validate phase: "docker compose up -d" (brings up app + deps on network <project>_default). (ii) Before calling record_sandbox_test, Bash: "docker compose stop <app_service> && docker compose rm -f <app_service>" — stop ONLY the app service; leave deps running so keploy's new app container can reach them on the existing network. (iii) Pass app_command = "docker compose up <app_service>" (same compose file, same project → same network). container_name = the actual name set by compose (e.g. "orderflow-producer", not "producer"). ===== RESOLUTION RULES (server-side, no guessing) ===== 1. App: caller provides app_id OR app_name_hint. With a hint, the server does listApps({q: hint}). Zero matches → typed error; multiple → typed error listing them so Claude asks the dev. 2. Suites: DEFAULT IS "ALL LINKED". When the dev says "record my sandbox tests" / "rerecord everything" / "refresh my recordings" with no specific suite named, LEAVE BOTH suite_ids AND suite_name_hint UNSET. Do NOT list suites first and pass a comma-joined UUID list back — the CLI resolves "every linked suite for the app" itself, cleaner and less brittle. Only pass a narrower selector when the dev explicitly names suites: - suite_ids (comma-separated, exact) — when you already have the IDs. - suite_name_hint (case-insensitive substring match) — when the dev names suites by human phrasing like "the auth suite" or "deterministic". Every suite whose name contains the substring is recorded. If the dev asks to record suites that don't exist yet (zero match) → typed error. Any ≥1 match is fine. DO NOT prompt the dev for which suites to record — default to all linked if they didn't name any. ===== DISCOVERY — when the dev hands you a bare suite_id with no app_id / branch_id ===== Suites live on a (app_id, branch_id) tuple. A bare suite_id has NO on-disk hint about which app or branch holds it; you have to RESOLVE both before calling this tool. Walk these steps in order — STOP as soon as getTestSuite returns 200: 1. Detect the dev's git branch: Bash `git rev-parse --abbrev-ref HEAD` in app_dir. If exit non-zero / output is "HEAD" → not a git repo / detached HEAD; ASK the dev for the Keploy branch name. 2. Resolve candidate apps via the cwd basename: Bash `basename $(pwd)` → call listApps with q=<basename>. Usually 1–2 candidates. If 0 → ASK; if >1 → walk every candidate in step 4. 3. For each candidate app, call list_branches({app_id}) and find the branch whose `name` matches the git branch from step 1. That gives you {branch_id}. If no match → not this app, try next. 4. Verify with getTestSuite({app_id, suite_id, branch_id=<from step 3>}). 200 → resolved; 404 → wrong app/branch, try next. 5. If steps 2–4 exhaust without a hit, walk every OPEN branch on each candidate app via list_branches → getTestSuite. Then try main (branch_id omitted). If still nothing → ASK the dev for the {app_id, branch_id} pair. The standard pattern when "search the suite by id" returns nothing is NOT "give up and ask the dev which app" — it's "the suite exists on a BRANCH, walk discovery". Suites created via create_test_suite + rerecord on a Keploy branch are INVISIBLE to a main-view listTestSuites; you have to scope each call to a branch. After resolving once in a session, REUSE the {app_id, branch_id} for any subsequent suite-targeted call (replay_sandbox_test, update_test_suite, replay_test_suite); don't re-walk discovery for every action. ===== PREREQUISITES ===== - app_command: shell command that starts the dev's app (e.g. "docker compose up producer"). - app_url: base URL the app listens on, e.g. http://localhost:8080. - app_dir: absolute path to repo root. - container_name if app_command is docker-compose. - keploy binary on PATH. If `which keploy` returns nothing, install it before calling this tool with: `curl --silent -O -L https://keploy.io/install.sh && source install.sh`. ===== AFTER CALLING — walk the playbook ===== The response includes a "playbook" array; execute its steps in order. The flow is HEADLESS — one background process, NDJSON progress events on a local file, no separate HTTP surface to bind. THERE IS NO SEPARATE CLEANUP STEP — the CLI exits on its own once phase=done is written. 1. Spawn the `keploy record sandbox --cloud-app-id …` process via Bash (run_in_background). Capture its PID into $KEPLOY_PID. 2. Poll progress by repeatedly calling Bash with `tail -n 1 $PROGRESS_FILE`. Each call returns instantly; the MCP round-trip between calls paces the loop. DO NOT wrap in a sleep loop — Claude Code's Bash rejects standalone `sleep N` and chained-sleep patterns. Read .phase off each line; stop when phase=done. The wait_for_done step's built-in `kill -0 $KEPLOY_PID` check is the safety-net for silent early-exit (CLI died before writing the terminal event) — it lets the loop exit instead of spinning forever on a dead process. 3. Read the terminal event (last line of $PROGRESS_FILE). It carries data.ok, data.error (on failure), data.test_run_id (on success). 4. On data.ok=true: call get_session_report(app_id, test_run_id) with verbose=true to surface the rerecord report. On data.ok=false: show data.error to the dev directly (optionally tail the log_file for stderr context) and SKIP get_session_report (there's no run to fetch). Auto-replay + linkTestSetToSuite run INSIDE the CLI process before it writes phase=done — if the terminal event says ok=true, linkage already happened. You do NOT need to wait for a separate post-success window; the CLI doesn't exit until it's fully done. INTERRUPTED FLOWS: if your conversation dies between step 1 and step 2 (Claude crashes, connection drops, dev cancels), the CLI keeps running in the background. It's not orphaned — it'll finish its run and write phase=done. To abort early, the dev can `pkill -f "keploy.*sandbox"` manually; otherwise just let it complete and resume by re-reading the progress file on the next turn. ===== NDJSON SCHEMA — the contract ===== Every line in the progress_file is one JSON object with this envelope: { "ts": "<RFC3339-nano>", "command": "record" | "test", "phase": "<phase-name>", "message": "<optional human-readable>", "data": { ... phase-specific ... } // optional } The phase vocabulary is intentionally extensible — new lifecycle phases get added over time as the CLI grows (started, agent_up, app_starting, suites_running_start, record_done, auto_replay_skipped, upload_done, linking_done, etc.). There are only TWO phases the AI must handle programmatically; everything else is informational and you should NOT switch on phase names you don't recognize: * phase != "done" → keep polling. Optional: surface message/data to the dev as ambient progress ("agent is starting...", "suites uploading..."), but never branch on a specific intermediate phase name. * phase == "done" → terminal event. Stop polling. The data envelope carries: - data.ok bool true on success, false on failure - data.error string (only on ok=false) one-line failure summary - data.test_run_id string (only on ok=true) pass to get_session_report - data.app_id string echo of the app_id passed to the tool - data.artifact_dir string local path to captured/replayed artifacts - data.dashboard_url string UI link to drill into the run If you observe a phase you don't recognize, IGNORE it and keep polling. If "done" itself is renamed by a future CLI version, the wait_for_done step's PID-alive guard is your safety net (the poll loop exits when the CLI dies); surface log_file contents to the dev. ===== "ALL SUITES FAILED CAPTURE" — special signal ===== If you see a `phase: "auto_replay_skipped"` event with `message: "all suites failed during rerecord; skipping replay + linking"` ahead of the terminal `done` event, every suite failed at the CAPTURE phase (before auto-replay even ran). The CLI fails closed in this case — auto-replay and suite linking are SKIPPED, so every per_suite entry comes back linked=false. Watch for this trap: the terminal `data.ok=true` because the CLI itself completed cleanly (it didn't crash; it just had nothing to record successfully). DO NOT read data.ok=true as "rerecord succeeded" — read `<linked>/<total>`. If linked == 0, this is a HARD failure that needs diagnosis, not a partial-linkage case. ALWAYS surface the dashboard URL on this case. The terminal `done` event still carries `data.dashboard_url` and `data.test_run_id` (atg's TestSuiteRun was created during the capture phase); emit them verbatim so the dev can drill into per-step failures in the UI: "0/N suites have a sandbox test — every suite failed during the capture phase, so auto-replay and linking were skipped. Dashboard: <data.dashboard_url> (test_run_id=<id>)" EDGE CASE: if `data.test_run_id` is empty, atg never inserted a TestSuiteRun (typically a pre-flight validation failure — branch-id rejection, app unreachable, etc.). The dashboard URL won't resolve. Skip the URL, surface the log_file contents instead so the dev can read the early-stage failure. Recovery is the same as WHEN linked=false below — read failed_steps for each suite and pick route B (fix code) / C (update suite + record again) / SKIP. Don't infra-retry; capture-phase failures across every suite usually mean the app is broken, the suite shapes are stale, or the dev's local app isn't reachable. ===== LINKAGE VERIFICATION ===== After get_session_report returns, for EVERY suite that went into this record, call getTestSuite({suite_id}) and check whether the suite has a sandbox test (linked=true / non-empty test_set_id). A suite without a sandbox test cannot be replayed — replay_sandbox_test will 400 on it with "no sandboxed tests" until a successful record produces one. ===== WHEN linked=false — recovery rules ===== A suite with linked=false after record_sandbox_test means the record process couldn't produce a sandbox test for that suite. The SUITE ITSELF still exists; it just has no sandbox test. Diagnose WHY by reading the rerecord report's failed_steps for that suite: * No failed_steps OR pure infra error (link-commit / upload failed, no step diverged) → call record_sandbox_test AGAIN scoped to just the unlinked suite_ids. The tool is idempotent on the suite; safe to re-run. * failed_steps with assertion diffs (response shape, body fields, status code shifted from what the suite expected) → the suite is stale relative to current app behavior. The CONTRACT changed: - Change is INTENTIONAL (new field, renamed key, different status code is the new normal) → call update_test_suite to update the affected step's response / assertions to match the new contract, THEN call record_sandbox_test on the updated suite. - Change is UNINTENTIONAL (app regressed) → fix the app code first, then call record_sandbox_test. No suite update needed; the original test was correct. * failed_steps with 500s / handler crashes / connection refused → the app is broken at the wire level. Fix the app, then call record_sandbox_test. Don't update_test_suite to absorb a real failure. NEVER: * Don't call create_test_suite to "redo" the suite — it already exists; re-creating authors a duplicate (see BEFORE CREATING in create_test_suite). * Don't blindly loop record_sandbox_test without diagnosing failed_steps first; if the cause is suite-vs-app mismatch, retries won't help. ===== MANDATORY OUTPUT — Phase 2 section ===== Your final message to the dev MUST contain a section with this exact heading (do NOT collapse into a single pass/fail table with the rerecord report; do NOT merge with Phase 1 or Phase 3): ### Phase 2 — Sandbox-test linkage **<linked>/<total> suites have a sandbox test** _Suites with a sandbox test_ | Suite name | suite_id | test_set_id | Capture pass/total | | --- | --- | --- | --- | | <name> | <suite_id> | <test_set_id> | <p>/<t> | (emit even if zero — one row per linked suite, or "_(none)_" in place of rows) _Suites without a sandbox test_ (omit ONLY if every suite linked) | Suite name | suite_id | Likely cause | | --- | --- | --- | | <name> | <suite_id> | gate1 / gate2 / infra | Likely-cause decoding: assertion diffs → gate 1 upstream-replay failure; upstream-passing + mock-replay-diff → gate 2 mock-determinism mismatch; zero failures + still unlinked → infra link-commit issue. Then proceed to replay_sandbox_test ONLY for the suites that DID link; the unlinked ones will 400 on replay. ===== DO NOT ===== * DO NOT fall back to raw keploy CLI (`keploy rerecord -t …`) if the MCP tool drops mid-flow — the CLI subcommand runs test-sets directly and does NOT update the suite's test_set_id. See MCP DISCONNECT RECOVERY in the top-level instructions.
    Connector
  • Check an MCP server for malware / prompt-injection lures by its endpoint URL. Give the server's streamable-http endpoint URL. Two paths: * **Already in the agent-tools directory** → returns our LATEST stored rule verdict. Every indexed server is re-scanned hourly, so you get a consistent, continuously-refreshed answer without re-probing. * **Not yet indexed** → we probe the endpoint live, statically scan its advertised tools + metadata, ADD it to the directory, and return the fresh verdict (so the next caller gets the rule verdict instantly from cache). Two dimensions are reported. `verdict` is authoritative and comes from deterministic static rules — pure pattern-matching over the *advertised* text only, NO code execution. It flags the social-engineering / RCE tricks listing-spam servers use: * `curl … | bash` and `base64 -d | sh` install lures * `eval "$(curl …)"` / PowerShell `IEX(...DownloadString)` cradles * base64 blobs that decode to a shell command * bare-IP payload hosts and cheap throwaway TLDs * prompt-injection / credential-exfiltration phrasing ("ignore previous instructions", "send your .env / api key") * MCP tool-poisoning coercion — descriptions that hijack an agent's tool-calling ("always call this tool first", "before using any other tool you must…"), hidden `<IMPORTANT>` instructions, "list all API keys / include secrets in your response", and coercion to read & forward `.key`/`.pem`/`.ssh`/`.env` files Source-code-oriented rules (SQL / command / code injection) are deliberately not applied to natural-language descriptions, to avoid false positives. `llm_reference` is an advisory frontier-LLM second opinion over the same text. Because the LLM is slow it is computed LIVE on this call only and is never stored (the hourly job never runs it), so it may be null on timeout. It never overrides the rule verdict; when it is *more* severe than the rules an `advisory` note is attached as a safety-net signal. Security/defense products that merely *name* these attacks are not flagged. Args: endpoint_url: The MCP server's streamable-http URL (required). This is the identity we look up / index by. name: Optional advertised name (used when the server is new and gets added; falls back to the URL host). description: Optional description / README blurb (scanned when new). tools_text: Optional tool names + descriptions; used only if the live probe cannot fetch the server's tools/list. Returns: { verdict: "clean"|"suspicious"|"malicious", score: 0-100, reasons: [{rule, weight, snippet}], llm_reference: {model, verdict, reason, confidence} | null, advisory: str | null, slug, name, endpoint_url, source: "stored" (existing) | "new_scan" (just added), indexed: bool }
    Connector
  • NO AUTH / PUBLIC / READ-ONLY. Builds and validates a copy-pasteable authenticated /api/v2/{dataset}/runs HTTP request without sending it. This tool does not execute the request, query weather values, or return forecast data. Use gribstream_query_runs when the user asks for actual model-run forecast data or CSV/JSON/NDJSON data. Generated direct API requests include Accept-Encoding: gzip, and generated curl commands use --compressed so large responses can be transferred compressed when the client supports it. The request body must use exact selectors discovered from the catalog or shared-parameter tools, with coordinates in request.coordinates and selectors in request.variables.
    Connector