MCP Debugger

MCP_DEBUGGER_TEST_REPORT.md•10.4 KiB

# MCP Debugger Testing Report

## Thorough validation of Python and JavaScript tools

- Test Date: 2025-10-09
- Tester: Cline AI
- MCP Server: mcp-debugger

---

## Executive summary

Core debugging flows work well for both Python and JavaScript:

- Sessions: create, launch with stopOnEntry, and close are reliable
- Breakpoints, stepping, stack/scopes/variables, expression evaluation, and source context are all functional
- Python experience is polished end-to-end
- JavaScript experience is functional but has quirks that make it a bit noisier and more stateful for clients

No broken features were found in tested flows. The biggest frictions are:

- JavaScript stack traces include many Node internal frames
- JavaScript variableReferences and frameIds churn frequently, requiring a fetch-refresh pattern
- JavaScript breakpoints sometimes report Unbound at set time
- start_debugging scriptPath resolution required absolute paths in our environment (Python)

The sections below detail what works, what’s confusing (but works), and concrete suggestions to improve the UX.

---

## What was tested (tools exercised)

- list_supported_languages
- create_debug_session
- start_debugging (with stopOnEntry, justMyCode)
- set_breakpoint
- continue_execution
- get_stack_trace
- get_scopes
- get_variables
- evaluate_expression
- step_into / step_over / step_out
- get_source_context
- close_debug_session

Note: pause_execution is documented as Not Implemented and was not expected to work.
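The absolute-path friction noted in the executive summary can be worked around on the client side. The following is a minimal sketch, not the server's own logic: it resolves a relative script path before building the arguments for start_debugging. The helper name `build_start_debugging_args`, the `base_dir` parameter, and the flat payload layout are assumptions made for illustration; only the names sessionId, scriptPath, stopOnEntry, and justMyCode come from this report, and the real tool schema may nest them differently.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def build_start_debugging_args(
    session_id: str,
    script_path: str,
    base_dir: Optional[str] = None,
    stop_on_entry: bool = True,
    just_my_code: bool = True,
) -> Dict[str, Any]:
    """Build a start_debugging payload with an absolute scriptPath.

    Hypothetical helper: the payload layout is an assumption, not the
    documented schema of mcp-debugger.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    path = Path(script_path)
    if not path.is_absolute():
        # Anchor relative paths to the chosen base directory.
        path = base / path
    return {
        "sessionId": session_id,
        "scriptPath": str(path.resolve()),
        "stopOnEntry": stop_on_entry,
        "justMyCode": just_my_code,
    }


if __name__ == "__main__":
    args = build_start_debugging_args(
        "demo-session", "test-scripts/test_python_debug.py"
    )
    print(args["scriptPath"])
```

Resolving on the client keeps the call independent of the server's working directory, which is the condition under which the relative path failed in this run.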
---

## Python debugging results

### Works really well

- Session lifecycle
  - create_debug_session returned sessionId and clear message
  - close_debug_session cleaned up reliably
- Launch
  - start_debugging with stopOnEntry successfully paused on entry
  - stopOnEntrySuccessful: true reported in data
- Breakpoints
  - Verified breakpoints with helpful surrounding context
  - Example: set breakpoint at line 32 “fact_result = factorial(5)”, verified: true
- Stepping
  - step_into entered factorial as expected
  - step_over advanced execution and populated locals
- Stack, scopes, variables
  - get_stack_trace showed frames for main and <module>, then factorial when stepping into it
  - get_scopes returned Locals and Globals scopes
  - get_variables for Locals showed expected values (x, y, z, etc.) after executing lines
- Expression evaluation
  - evaluate_expression with pure expressions (e.g., n + 1) worked and returned correct types/values
  - Attempting assignment (x = 99) resulted in SyntaxError; evaluate appears expression-only in this context
- Source context
  - get_source_context provided precise line and surrounding code, extremely helpful for orientation

### Confusing but works

- Absolute path requirement for scriptPath (environment-specific)
  - start_debugging with relative path “test-scripts/test_python_debug.py” returned “Script file not found”
  - Using absolute path succeeded
  - Recommendation: Allow paths relative to server CWD, or document path expectations prominently
- Variables appear after execution of defining line
  - Standard debugger behavior (stop before execute) but can surprise users
  - Suggest documenting that you may need to step once for new locals to appear

### Observed behavior (not bugs)

- Assignment in evaluate_expression raised SyntaxError
  - Likely by design: evaluation accepts expressions, not statements
  - Recommendation: Document this; optionally provide a mode or flag to allow statement execution (if adapter/runtime permits)

### Outcome

- Full flow: created, launched, paused,
set breakpoints, stepped into and over, inspected scopes/variables, evaluated expressions, fetched source context, continued, and closed
- Rating: 9/10

---

## JavaScript debugging results

### Works really well

- Session lifecycle
  - create_debug_session returned sessionId and closed cleanly at end
- Launch
  - start_debugging with stopOnEntry paused on entry
- Scopes and variables
  - get_scopes returned granular scopes: “Local: main”, “Module”, “Global”
  - get_variables returned expected values in main (x, y, z) and factorial (n)
- Expression evaluation
  - evaluate_expression in factorial frame (n + 1) returned correct result (6)
- Stepping
  - step_into, step_out, and step_over worked reliably
  - Recursion depth in factorial required multiple step_out calls to return to main (expected)

### Confusing but works

- Unbound breakpoint warnings on set
  - set_breakpoint at lines 40 and 54 reported verified: false, Unbound breakpoint, even though the file was the running script
  - We primarily proceeded via stepping and did not verify whether they would bind later during run; could be a timing/mapping nuance
  - Recommendation: Improve binding feedback or re-check binding after script load advances; document lifecycle (breakpoints may bind later)
- Verbose stack traces (Node internals)
  - get_stack_trace includes many Node internal frames; the first few user frames are useful, the rest are noise
  - Recommendation: Add an option to filter internal frames or reduce default depth to user code
- Variable/Frame identity churn
  - After stepping, frameIds and variablesReference change; clients must refresh stack/scopes/variables
  - Recommendation: Document the fetch-refresh pattern clearly (step -> stack -> scopes -> variables)
- Variables appear after execution of defining line
  - Same as Python; recommend documenting expectations

### Outcome

- Full flow: created, launched, paused, set breakpoints (unbound at set time), inspected stack/scopes/variables, evaluated expressions, stepped (including out of
recursion multiple times), and closed
- Rating: 7/10 (functional, but noisier/more stateful than Python)

---

## Side-by-side summary

- Sessions: Both excellent
- Launch + stopOnEntry: Both excellent
- Breakpoints:
  - Python: Verified at set time and worked
  - JavaScript: Reported Unbound at set time; likely timing/sourcemap/loading nuance (still functional via stepping)
- Stepping: Both reliable
- Stack traces:
  - Python: concise and user-focused
  - JavaScript: verbose with many Node-internal frames
- Scopes/variables:
  - Python: simple and stable
  - JavaScript: granular scopes but requires frequent refresh after steps; variableReferences churn
- Expression evaluation:
  - Python: expressions only; assignment raised SyntaxError (likely by design)
  - JavaScript: expressions worked as expected
- Source context: Excellent for both

---

## What’s broken

- No hard-broken behaviors were found in exercised flows
- pause_execution is explicitly Not Implemented (as documented)

---

## Confusing but works (top items)

1) JavaScript stack traces are noisy
   - Too many Node internal frames for typical workflows
   - Suggest providing filter options or default user-only frames
2) JavaScript variableReferences/frameIds change after each step
   - Requires re-fetching stack, scopes, and variables after every step
   - Document the step -> stack -> scopes -> variables pattern
3) Breakpoint lifecycle and binding in JavaScript
   - Breakpoints reported Unbound at set time
   - Likely bind later or are subject to timing/sourcemap state
   - Provide clearer messaging and/or re-check binding automatically
4) Python evaluate_expression is expression-only
   - Assignment throws SyntaxError
   - Document limitations; optionally support statement evaluation (if feasible/secure)
5) Absolute vs relative scriptPath resolution
   - In our environment, Python start_debugging required absolute path
   - Improve path resolution or document clearly

---

## Works really well

- Source context: Outstanding and highly useful
- Session lifecycle:
Clean, reliable, explicit messages
- Python end-to-end polish: Breakpoint verification, stack, scopes, stepping, evaluation, and source context all felt cohesive
- Stepping controls: step_into/over/out behaved predictably in both languages

---

## Recommendations (what could work better and how)

### High priority

- JS stack trace filtering
  - Add a parameter to get_stack_trace like includeInternalFrames (default false)
  - Or a maxDepth focused on user frames by default with a way to include internals
- Document JS variable reference lifecycle
  - Provide a “best-practice loop” in docs for stepping:
    1) Step
    2) get_stack_trace (current frameId)
    3) get_scopes (new variablesReference)
    4) get_variables (with the new reference)

### Medium priority

- Breakpoint binding clarity (JS)
  - Improve messages around Unbound -> Bound transitions
  - Optionally expose a tool to re-query breakpoint statuses
- Path resolution (start_debugging)
  - Support relative paths relative to server CWD
  - Or explicitly document required path form in error message
- Python evaluate_expression documentation
  - Call out expressions-only vs statements
  - Consider a mode to enable statement execution (e.g., REPL context), if adapter/runtime permits

### Low priority

- Provide a “filter” hint for get_stack_trace to drop known noisy Node internals by default
- Add examples for complex/nested data structures (expandable variables) in docs
- Add tests for conditional and multiple breakpoints, error cases, and large files

---

## Repro highlights (from this run)

### Python

- start_debugging: absolute scriptPath required; paused on entry with reason=entry
- set_breakpoint: 32 (factorial call), 46 (final computation) verified true
- continue_execution to 32; step_into factorial; get_variables shows n=5; evaluate_expression “n + 1” returns 6
- get_source_context for current lines accurate
- step_over at 46; get_variables shows final=3600
- continue_execution then close_debug_session

### JavaScript

- start_debugging paused on entry; stack
shows factorial at line 8, main at 40
- set_breakpoint at 40 and 54 returned Unbound (verified false)
- get_scopes main shows x/y/z; factorial shows n=5
- evaluate_expression in factorial: “n + 1” returns 6
- step_out repeatedly (recursive stack); stepping works
- close_debug_session

---

## Conclusion

mcp-debugger delivers a capable, cross-language debugging experience with excellent fundamentals and a strong Python path. JavaScript is fully usable but noisier, with stack verbosity and stateful variable/frame references that make client code more chatty. Addressing stack filtering, documenting the refresh pattern, clarifying breakpoint binding, and improving path handling would materially improve developer ergonomics.

Overall ratings (subjective):

- Python: 9/10
- JavaScript: 7.5/10
- API design and stability: 8/10

Would recommend adoption. With the above refinements, this would be an excellent debugging MCP server.
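The fetch-refresh pattern recommended for JavaScript sessions (step -> stack -> scopes -> variables) can be sketched as a small client helper. This is an illustration under stated assumptions, not the server's API: `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, the response fields (`stackFrames`, `scopes`, `variables`) and the `frameId` / `variablesReference` argument names are assumed DAP-like shapes, and `FakeDebugger` is an in-memory stand-in that only mimics the id churn described in this report.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def step_and_refresh(
    call_tool: ToolCaller, session_id: str, step_tool: str = "step_over"
) -> List[Dict[str, Any]]:
    """Step once, then re-fetch stack, scopes, and locals with fresh ids."""
    call_tool(step_tool, {"sessionId": session_id})
    stack = call_tool("get_stack_trace", {"sessionId": session_id})
    frame_id = stack["stackFrames"][0]["id"]  # top frame after the step
    scopes = call_tool(
        "get_scopes", {"sessionId": session_id, "frameId": frame_id}
    )
    # Matches both "Locals" (Python) and "Local: main" (JavaScript).
    local_scope = next(
        s for s in scopes["scopes"] if s["name"].startswith("Local")
    )
    variables = call_tool(
        "get_variables",
        {
            "sessionId": session_id,
            "variablesReference": local_scope["variablesReference"],
        },
    )
    return variables["variables"]


class FakeDebugger:
    """Stand-in server that issues new ids after every step."""

    def __init__(self) -> None:
        self.generation = 0
        self.calls: List[str] = []

    def __call__(self, tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
        self.calls.append(tool)
        if tool.startswith("step_"):
            self.generation += 1
            return {"success": True}
        if tool == "get_stack_trace":
            return {"stackFrames": [{"id": 100 + self.generation, "name": "main"}]}
        if tool == "get_scopes":
            if args["frameId"] != 100 + self.generation:
                raise ValueError("stale frameId")
            return {
                "scopes": [
                    {"name": "Local: main", "variablesReference": 200 + self.generation},
                    {"name": "Global", "variablesReference": 999},
                ]
            }
        if tool == "get_variables":
            if args["variablesReference"] != 200 + self.generation:
                raise ValueError("stale variablesReference")
            return {"variables": [{"name": "x", "value": str(10 * self.generation)}]}
        raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    debugger = FakeDebugger()
    print(step_and_refresh(debugger, "demo-session"))
```

Because the helper never caches a frameId or variablesReference between steps, a stale-id error of the kind the fake raises cannot occur; the cost is three extra calls per step, which is the chattiness the report attributes to JavaScript clients.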
