Glama
133,413 tools. Last updated 2026-05-25 15:09

"Automating Testing and Debugging for LLMs in Production" matching MCP tools:

  • Fetch a public URL and inspect security-relevant response headers before you claim that a product or endpoint has a strong browser-facing security baseline. Use this for quick due diligence on public apps and docs sites. It checks for common headers such as HSTS, CSP, X-Frame-Options, Referrer-Policy, Permissions-Policy, and X-Content-Type-Options. It does not replace a real security review, authenticated testing, or vulnerability scanning.
    Connector
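A minimal sketch of the header check this entry describes, assuming the response headers have already been fetched into a plain mapping. The function name `missing_security_headers` is hypothetical and is not the connector's actual API; the full header names simply spell out the HSTS and CSP abbreviations used in the description.

```python
# Headers named in the tool description (abbreviations spelled out in full).
SECURITY_HEADERS = (
    "Strict-Transport-Security",  # HSTS
    "Content-Security-Policy",    # CSP
    "X-Frame-Options",
    "Referrer-Policy",
    "Permissions-Policy",
    "X-Content-Type-Options",
)


def missing_security_headers(headers):
    """Return the expected headers that are absent, comparing names case-insensitively."""
    present = {name.lower() for name in headers}
    return [h for h in SECURITY_HEADERS if h.lower() not in present]


# Illustrative response headers (made up for this example).
sample = {
    "content-type": "text/html",
    "strict-transport-security": "max-age=63072000",
    "X-Frame-Options": "DENY",
}
print(missing_security_headers(sample))
```

A non-empty result only shows that a header is absent; as the entry itself notes, this is no substitute for a real security review.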
  • Returns file metadata (content_type, download_url, download_size, expires_at) for the report or zip artifact. Use artifact='report' (default) for the interactive HTML report (~700KB, self-contained with embedded JS for collapsible sections and interactive Gantt charts — open in a browser). Use artifact='zip' for the full pipeline output bundle (md, json, csv intermediary files that fed the report). While the task is still pending or processing, returns {ready:false,reason:"processing"}. Check readiness by testing whether download_url is present in the response. Once ready, present download_url to the user or fetch and save the file locally. Download URLs expire after 15 minutes (see expires_at); call plan_file_info again to get a fresh URL if needed. Terminal error codes: generation_failed (plan failed), content_unavailable (artifact missing). Unknown plan_id returns error code PLAN_NOT_FOUND.
    Connector
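The readiness rule in this entry (treat the artifact as ready once `download_url` is present) can be sketched as a small polling loop. `fetch_info` is a hypothetical stand-in for a call to `plan_file_info`, and the example URL and retry settings are assumptions made for illustration.

```python
import time


def wait_for_download(fetch_info, attempts=5, delay_s=0.0):
    """Call fetch_info until the response contains download_url; return None if it never does."""
    for _ in range(attempts):
        info = fetch_info()
        if "download_url" in info:  # readiness test suggested by the description
            return info
        time.sleep(delay_s)
    return None


# Simulated responses: two "processing" replies, then a ready artifact.
responses = iter([
    {"ready": False, "reason": "processing"},
    {"ready": False, "reason": "processing"},
    {"content_type": "text/html", "download_url": "https://example.invalid/report.html"},
])
result = wait_for_download(lambda: next(responses))
print(result["download_url"])
```

Because download URLs expire after 15 minutes, a real caller would likely re-run the same loop to obtain a fresh URL rather than cache the old one.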
  • Switch between local and remote DanNet servers on the fly. This tool allows you to change the DanNet server endpoint during runtime without restarting the MCP server. Useful for switching between development (local) and production (remote) servers. Args: server: Server to switch to. Options: - "local": Use localhost:3456 (development server) - "remote": Use wordnet.dk (production server) - Custom URL: Any valid URL starting with http:// or https:// Returns: Dict with status information: - status: "success" or "error" - message: Description of the operation - previous_url: The URL that was previously active - current_url: The URL that is now active Example: # Switch to local development server result = switch_dannet_server("local") # Switch to production server result = switch_dannet_server("remote") # Switch to custom server result = switch_dannet_server("https://my-custom-dannet.example.com")
    Connector
  • Resume work from a saved cognitive context. This provides a narrative briefing to quickly orient you to: - The investigation that was in progress - Key discoveries and insights made - Current hypotheses being tested - Open questions and blockers - Suggested next steps - All relevant memories with their connections The briefing reconstructs the cognitive state, not just the data. You'll understand not just WHAT was discovered, but WHY it matters and HOW the understanding evolved. Example of what you'll receive: "[API Timeout Investigation - Resuming after 2 hours] SITUATION: You were investigating production API timeouts that occur at exactly batch_size=100. This investigation started when user reported timeouts only in production, not staging. PROGRESS MADE: - Identified sharp cutoff at 100 items (not gradual degradation) - Disproved connection pool theory (monitoring showed only 43/200 connections used) - Found root cause: MAX_BATCH_SIZE=100 hardcoded in batch_handler.py:147 - Confirmed staging uses different config override (MAX_BATCH_SIZE=500) EVIDENCE CHAIN: User report → Reproduced locally → Noticed batch_size correlation → Searched codebase for limits → Found MAX_BATCH_SIZE → Checked staging config → Discovered config difference CORRECTED MISUNDERSTANDINGS: - Initially thought it was Redis connection exhaustion (disproven by monitoring) - Assumed gradual performance degradation (actually sharp cutoff) - Thought staging/production were identical (config differs) CURRENT HYPOTHESIS: Production deployment uses default MAX_BATCH_SIZE=100 from code, while staging has environment variable override. Fix requires either code change or prod config update. BLOCKED ON: Need production deployment access to apply fix. User considering whether to change code default or add production environment variable. RECOMMENDED NEXT STEPS: 1. Verify production environment variables (check if MAX_BATCH_SIZE is set) 2. If not set, add MAX_BATCH_SIZE=500 to production config 3. If code change preferred, update default in batch_handler.py 4. Run load test with batch_size=100-500 range to verify fix KEY MEMORIES FOR REFERENCE: - 'Initial timeout report from user' - Starting point of investigation - 'MAX_BATCH_SIZE discovery' - Root cause identification - 'Redis monitoring data' - Evidence disproving connection theory - 'Staging config analysis' - Explanation for environment difference" This cognitive handoff ensures you can continue the work with full understanding of the problem space, previous attempts, and current direction. The narrative preserves not just facts but the reasoning process, mistakes made, and lessons learned. SPECIAL CASE: restore_context("awakening") The name "awakening" is reserved for loading the user's personality configuration. This loads the Awakening Briefing which includes: - Selected persona identity and voice style - Custom personality traits (Premium+ users) - Any quirks and boundaries from the persona preset Args: name: Name or ID of context to restore. Can be: - Context name (exact match, case-sensitive) - Context UUID (from list_contexts output) - "awakening" for personality briefing limit: Maximum number of memories to restore (default 20) ctx: MCP context (automatically provided) Returns: Dict with: - success: Whether restoration succeeded - description: The cognitive handoff briefing - memories: List of relevant memories - context_id: The restored context identifier
    Connector
  • Attach a payment card. Required before booking. For testing: {"token": "tok_visa"} For production: {"payment_method_id": "pm_xxx"} from Stripe.js One-time setup — all future charges are automatic. Requires GitHub star verification.
    Connector

Matching MCP Servers

  • License: F · Quality: A · Maintenance: C
    Enables LLMs to automatically diagnose coding errors through codebase search, test execution, and live debugger integration (DAP/V8 CDP). Provides a secure, policy-gated environment for investigating failures while preventing destructive operations.
    Last updated
    9
  • License: A · Quality: - · Maintenance: C
    MCP server that exposes GDB debugging as tools. An AI assistant can set breakpoints, run programs, step through code, inspect variables and memory, and examine registers — all via structured tool calls. Reverse debugging with rr is also supported.
    Last updated
    34
    3
    MIT

Matching MCP Connectors

  • Give your AI agent a phone. Place outbound calls to US businesses to ask, book, or confirm.

  • An interactive portfolio built for AI conversations. Browse work, services, and book calls.

  • Check a business's Making Tax Digital VAT mandate status via the HMRC API. NOTE: Connects to the HMRC sandbox by default. Set HMRC_API_BASE env var to 'https://api.service.hmrc.gov.uk' for production. Requires HMRC_CLIENT_ID and HMRC_CLIENT_SECRET environment variables (OAuth 2.0). Returns whether the business is mandated for MTD, effective date, and trading name.
    Connector
  • Rank LLMs for a stated purpose. Returns a shortlist with weights, scores, and plain-English rationale per pick. Use when the user wants to see and compare alternatives, not just one answer.
    Connector
  • Colorize black-and-white or grayscale photos. DDColor (dual-decoder, ICCV 2023) — vivid, natural colorization. Impossible for text/vision LLMs. 5 sats per image, pay per request with Bitcoin Lightning — no API key or signup needed. Requires create_payment with toolName='colorize_image'.
    Connector
  • Send a test event to a webhook endpoint. WHEN TO USE: - Verifying webhook endpoint is working - Testing integration during development - Debugging webhook delivery issues RETURNS: - success: Boolean indicating delivery success - response_code: HTTP response code from endpoint - response_time_ms: Response time in milliseconds - error: Error message if delivery failed EXAMPLE: User: "Test my webhook with a device.online event" test_webhook({ webhook_id: "wh_mmmpdbvj_8b7c5a59296d", event: "device.online" })
    Connector
  • Test a message against an AI filter to check whether it would match. This tool embeds the provided message using Voyage AI and computes the cosine similarity between the message vector and the filter's stored reference vector. It returns the similarity score, whether the message would match (similarity >= threshold), and the filter's threshold value. Use this to: - Verify a filter works as intended before using it in a trigger - Tune the threshold by testing borderline messages - Debug why a message did or did not match a filter in production Returns: {similarity: float, matched: bool, threshold: float} Note: This tool calls the Voyage AI embedding API to embed the test message.
    Connector
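The matching rule in this entry (cosine similarity between the message vector and the stored reference vector, compared against a threshold) can be sketched as follows. The vectors are toy values, not Voyage AI embeddings, and `check_filter` is a hypothetical name chosen for illustration; only the returned fields mirror the description.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def check_filter(message_vec, reference_vec, threshold):
    """Mirror the described return shape: similarity, matched, threshold."""
    similarity = cosine_similarity(message_vec, reference_vec)
    return {
        "similarity": similarity,
        "matched": similarity >= threshold,  # match rule from the description
        "threshold": threshold,
    }


# Toy 2-D vectors standing in for real embeddings.
result = check_filter([1.0, 0.0], [1.0, 1.0], threshold=0.7)
print(result)
```

Here the similarity is about 0.707, just above the 0.7 threshold, which is the kind of borderline case the entry suggests using when tuning a threshold.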
  • Dev-sandbox wallet helper for x402 testing. Generates a deterministic ephemeral Sepolia wallet (or accepts your address), reports ETH + USDC Sepolia balances, points to the Circle USDC Sepolia faucet, and emits a copy-paste env config for x402 client SDKs. SANDBOX ONLY — generated keys are deterministic and MUST NOT receive real value. (price: $0 USDC, tier: free)
    Connector
  • Search Vaadin documentation for relevant information about Vaadin development, components, and best practices. Uses hybrid semantic + keyword search. USE THIS TOOL for questions about: Vaadin components (Button, Grid, Dialog, etc.), TestBench, UI testing, unit testing, integration testing, @BrowserCallable, Binder, DataProvider, validation, styling, theming, security, Push, Collaboration Engine, PWA, production builds, Docker, deployment, performance, and any Vaadin-specific topics. When using this tool, try to deduce the correct development model from context: use "java" for Java-based views, "react" for React-based views, or "common" for both. Use get_full_document with file_paths containing the result's file_path when you need complete context.
    Connector
  • Execute JavaScript or Python code in an isolated sandbox. Use for: data processing, math, CSV parsing, JSON transformation, crypto calculations, algorithm testing. Secure — no filesystem access, no network. Returns: { output: string, runtime_ms: number, language: string }. Requires API key.
    Connector
  • Get current ads scheduled for a device (for testing). WHEN TO USE: - Testing device ad delivery - Debugging which ads are being shown - Verifying ad targeting is working RETURNS: - ads: Array of advertisement objects - default_stream: Default content when no ads - schedule: Current ad schedule EXAMPLE: User: "What ads are showing on device P_abc123?" get_device_ads({ fingerprint: "P_abc123" })
    Connector
  • Returns the typical and legal B2B payment terms for a given Latin American country — default payment period, common commercial practices, and late payment rules where defined by law. Returns { country, default_days, common_terms, late_payment_notes, currency, notes }. Supports BR, MX, CL, AR, CO. Use when generating invoices, setting payment due dates, or automating accounts receivable workflows in LatAm markets. Information provided as reference only — not legal advice.
    Connector
  • Poll the progress of an async skill test. Returns iteration count, tool call steps, status (running/completed/failed), and result when done. (Advanced — use ateam_test_skill with wait=true for synchronous testing.)
    Connector
  • Save your cognitive state for handoff to another agent. Include your investigation context: - What session/investigation is this part of? - What role/perspective were you taking? - Who might pick this up next? (another Claude, human, Claude Code?) Reference specific memories that matter: - Key discoveries (with memory IDs or quotes) - Critical evidence memories - Important questions that were raised - Hypotheses that were tested Before saving, organize your thoughts: 1. PROBLEM: What were you investigating? 2. DISCOVERED: What did you learn for certain? (reference the memories) 3. HYPOTHESIS: What do you think is happening? (cite supporting memories) 4. EVIDENCE: What memories support or contradict this? 5. BLOCKED ON: What prevented further progress? 6. NEXT STEPS: What should be investigated next? 7. KEY MEMORIES: Which specific memories are essential for understanding? Example descriptions: "[API Timeout Investigation - 3 hour session] Investigating production API timeouts as code analyst. Found correlation with batch_size=100 due to hardcoded limit in batch_handler.py (see memory: 'MAX_BATCH_SIZE discovery'). Confirmed not Redis connection issue - monitoring showed only 43/200 connections used (memory: 'Redis connection analysis'). Earlier hypothesis about connection pool exhaustion (memory_id: abc-123) was disproven. Key insight came from comparing 99 vs 100 batch behavior (memory: 'batch threshold testing'). Blocked on: need production access to verify fix. Next: Deploy with MAX_BATCH_SIZE=200 to staging first. Essential memories for handoff: 'MAX_BATCH_SIZE discovery', 'Redis monitoring results', 'Production vs staging comparison'. Ready for handoff to SRE team for deployment." "[Memory System Debugging - From Claude Code perspective] Worked on scoring issues where recall wasn't finding recent memories. Discovered RRF scores (0.005-0.016) were below MCP threshold of 0.05 (memory: 'RRF scoring analysis'). 
Implemented weighted linear fusion to replace RRF (memory: 'fusion algorithm implementation'). Testing showed immediate improvement (memory: 'fusion testing results'). This builds on earlier investigation about recall failures (memory: 'user report of recall issues'). Critical memories for continuation: 'RRF scoring analysis', 'ADR-023 decision', 'fusion testing results'. Next agent should verify scoring with real queries." "[Context Save/Restore Bug Investigation - 4 hour debugging session with user] Started with user noticing list_contexts returned empty despite saved contexts existing. Investigation revealed two critical bugs: (1) list_contexts was using hybrid search for 'checkpoint' word instead of filtering by memory_type (memory: 'hybrid search misuse discovery'), (2) restore_context hardcoded limit of 10 memories despite contexts having 20+ (memory: 'hardcoded limit bug'). Root cause analysis showed save_context grabs 20 most recent memories regardless of relevance - fundamental design flaw (memory: 'save_context design flaw analysis'). EVIDENCE CHAIN: User reported empty list -> checked DB, contexts exist -> examined list_contexts code -> found hybrid search looking for word 'checkpoint' -> tested /memories endpoint with memory_type filter -> confirmed working -> implemented fix using direct endpoint. INSIGHTS: The narrative description is doing 90% of cognitive handoff work. Memories are supporting evidence, not primary carriers of understanding (memory: 'narrative vs memories insight'). This suggests doubling down on narrative richness rather than perfecting memory selection. CORRECTED UNDERSTANDING: Initially thought memories weren't being returned. Actually they were, just wrong ones - recent memories instead of relevant ones (memory: 'memory selection correction'). CRITICAL MEMORIES: 'hybrid search misuse discovery', 'save_context design flaw analysis', 'narrative vs memories insight', '/memories endpoint test results'. 
NEXT AGENT: Should implement Phase 2 - semantic search for relevant memories within investigation timeframe. Ready for handoff to any Claude agent for implementation." When referencing memories: - **RELIABLE** — Use memory IDs: "memory_id: abc-123" (direct lookup, always works) - **BEST-EFFORT** — Use descriptive phrases: "see memory: 'Redis connection analysis'" (uses search + substring matching, may not resolve if the memory isn't in top results) - Group related memories: "Essential memories: 'X', 'Y', 'Z'" **Prefer memory_id references** whenever you have the UUID. Semantic phrase references are a convenience that works most of the time, but may silently fail to resolve. The response will tell you how many references resolved so you can retry with UUIDs if needed. Args: name: Name for this context checkpoint description: Detailed cognitive handoff description with memory references ctx: MCP context (automatically provided) Returns: Dict with success status, context_id, and memories included
    Connector
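The two reference styles this entry distinguishes (`memory_id: ...` for direct lookup and `memory: '...'` for best-effort phrase matching) can be told apart with two regular expressions. This is an illustrative sketch only: the patterns and the `extract_references` helper are assumptions, and the server's actual resolution logic is not shown in the listing.

```python
import re

# Hypothetical patterns for the two reference styles shown in the description.
ID_REF = re.compile(r"memory_id:\s*([0-9A-Za-z-]+)")
PHRASE_REF = re.compile(r"memory:\s*'([^']+)'")


def extract_references(description):
    """Split a handoff description into ID references and phrase references."""
    return {
        "ids": ID_REF.findall(description),          # reliable: direct lookup
        "phrases": PHRASE_REF.findall(description),  # best-effort: search-based
    }


text = (
    "Found hardcoded limit (see memory: 'MAX_BATCH_SIZE discovery'). "
    "Earlier hypothesis (memory_id: abc-123) was disproven."
)
refs = extract_references(text)
print(refs)
```

Keeping the two kinds separate makes it straightforward to retry any unresolved phrase references with UUIDs, as the entry recommends.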
  • Format and pretty-print a JSON string with configurable indentation. Use when making minified or compact JSON readable for debugging or documentation.
    Connector
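For comparison, the same formatting operation can be sketched with Python's standard `json` module. The `pretty_json` helper is a hypothetical name and is not the connector's interface.

```python
import json


def pretty_json(raw, indent=2):
    """Parse a JSON string and re-serialize it with the given indentation."""
    return json.dumps(json.loads(raw), indent=indent, ensure_ascii=False)


print(pretty_json('{"status":"ok","items":[1,2]}'))
```

Parsing before re-serializing also means malformed input raises `json.JSONDecodeError` instead of being silently reformatted.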