# Testing MCP Tools
This guide covers best practices for testing MCP tools in AIDB, including E2E tests, integration tests, and the DebugInterface abstraction.
## Testing Philosophy
MCP tools require comprehensive testing to ensure:
1. **Correctness**: Tools return accurate data (line numbers, values, state)
1. **Structure**: Responses match MCP protocol expectations
1. **Content**: Data reflects actual debug state
1. **Efficiency**: Responses don't include bloat
**Priority:** E2E > Integration > Unit (highest ROI first)
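To make these dimensions concrete, the assertions below check each one against a hypothetical `aidb.inspect` response (the payload here is illustrative, not captured from a real session; field names follow the examples later in this guide):

```python
# Hypothetical response from an "aidb.inspect" call, used only to illustrate the dimensions above
response = {
    "success": True,
    "summary": "2 local variables",
    "data": {"locals": [{"name": "x", "value": "10"}, {"name": "message", "value": "'hello'"}]},
}

assert response["success"] is True  # Structure: MCP envelope fields are present
assert response["data"]["locals"][0]["value"] == "10"  # Correctness: values match the paused frame
assert {v["name"] for v in response["data"]["locals"]} == {"x", "message"}  # Content: reflects debug state
assert len(response["data"]) <= 3  # Efficiency: no bloat fields
```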
## DebugInterface Abstraction
The `DebugInterface` abstraction allows tests to work with both direct API and MCP tools interchangeably.
### Location
```python
from tests._helpers.debug_interface import DebugInterface
```
The `DebugInterface` is located in `src/tests/_helpers/debug_interface/` (base class in `base.py`).
### Why Use DebugInterface?
1. **Flexibility**: Tests work with both API and MCP implementations
1. **Consistency**: Same test validates both paths
1. **Clarity**: Higher-level operations vs low-level API calls
1. **Coverage**: Ensures API and MCP stay in sync
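The methods that tests rely on look roughly like the sketch below. This is a conceptual reconstruction from the calls used throughout this guide, not the real base class; see `base.py` for the actual signatures.

```python
from abc import ABC, abstractmethod
from typing import Any


class DebugInterface(ABC):
    """Conceptual sketch of the shared test interface (the actual definition lives in base.py)."""

    @abstractmethod
    async def start_session(self, target: str, breakpoints: list[dict] | None = None) -> dict: ...

    @abstractmethod
    async def continue_execution(self) -> dict: ...

    @abstractmethod
    async def step(self, kind: str) -> dict: ...

    @abstractmethod
    async def call_tool(self, name: str, arguments: dict[str, Any]) -> dict: ...

    @abstractmethod
    async def cleanup(self) -> None: ...
```

Each concrete implementation (`mcp_interface.py`, `api_interface.py`) routes these calls through the MCP server or the direct API, respectively.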
### Basic Usage
```python
import pytest
from tests._helpers.debug_interface import DebugInterface


@pytest.mark.e2e
async def test_your_tool(debug_interface: DebugInterface):
    """Test tool with DebugInterface abstraction."""
    # Start session
    await debug_interface.start_session(
        target="test_app.py",
        breakpoints=[{"file": "test_app.py", "line": 10}],
    )

    # Continue to breakpoint
    await debug_interface.continue_execution()

    # Use your tool (works for both MCP and direct API)
    response = await debug_interface.call_tool(
        "aidb.your_tool",
        {"action": "your_action"},
    )

    # Validate response
    assert response["success"] is True
    assert "data" in response
```
### Parameterized Testing (MCP + Direct API)
Test both implementations automatically:
```python
@pytest.mark.e2e
@pytest.mark.parametrize("interface_type", ["mcp", "direct"])
async def test_step_operation(debug_interface: DebugInterface, interface_type: str):
    """Test stepping works via MCP and direct API."""
    # DebugInterface automatically routes to correct implementation
    await debug_interface.start_session(target="test_app.py")

    # Step (uses MCP or API based on interface_type fixture)
    response = await debug_interface.step("into")

    # Validate (same assertions for both)
    assert response["success"] is True
    assert "execution_state" in response["data"]
```
## E2E Test Patterns
### Pattern 1: Basic Tool Operation
Test the happy path for your tool:
```python
@pytest.mark.e2e
async def test_tool_basic_operation(debug_interface: DebugInterface):
    """Test basic tool operation."""
    # Setup
    await debug_interface.start_session(
        target="examples/simple_app.py",
        breakpoints=[{"file": "examples/simple_app.py", "line": 5}],
    )

    # Execute
    await debug_interface.continue_execution()
    response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )

    # Validate structure
    assert response["success"] is True
    assert "data" in response
    assert "locals" in response["data"]

    # Validate content accuracy
    assert isinstance(response["data"]["locals"], list)
    assert len(response["data"]["locals"]) > 0

    # Validate efficiency (no bloat)
    assert len(response["data"]) <= 3  # Only essential fields
```
### Pattern 2: Error Handling
Test tool behavior with invalid inputs:
```python
@pytest.mark.e2e
async def test_tool_invalid_action(debug_interface: DebugInterface):
    """Test tool error handling with invalid action."""
    await debug_interface.start_session(target="examples/simple_app.py")

    # Invalid action
    response = await debug_interface.call_tool(
        "aidb.your_tool",
        {"action": "invalid_action"},
    )

    # Should return error response
    assert response["success"] is False
    assert "error_code" in response
    assert "VALIDATION" in response["error_code"]
    assert "suggested_actions" in response  # User guidance
```
### Pattern 3: State Validation
Test tool behavior in different execution states:
```python
@pytest.mark.e2e
async def test_tool_requires_paused_state(debug_interface: DebugInterface):
    """Test tool that requires program to be paused."""
    await debug_interface.start_session(target="examples/simple_app.py")

    # Tool requires paused state but program is running
    response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )

    # Should return state error
    assert response["success"] is False
    assert "STATE" in response.get("error_code", "")
    assert "paused" in response.get("error_message", "").lower()
```
### Pattern 4: Multi-Step Workflow
Test tools in realistic debugging workflows:
```python
@pytest.mark.e2e
async def test_step_and_inspect_workflow(debug_interface: DebugInterface):
    """Test stepping and inspecting variables."""
    # Start session
    await debug_interface.start_session(
        target="examples/calculator.py",
        breakpoints=[{"file": "examples/calculator.py", "line": 10}],
    )

    # Continue to breakpoint
    await debug_interface.continue_execution()

    # Inspect initial state
    locals_response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )
    initial_vars = {v["name"]: v["value"] for v in locals_response["data"]["locals"]}

    # Step to next line
    await debug_interface.step("over")

    # Inspect updated state
    locals_response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )
    updated_vars = {v["name"]: v["value"] for v in locals_response["data"]["locals"]}

    # Validate state changed correctly
    assert initial_vars != updated_vars
    assert "result" in updated_vars  # New variable appeared
```
## MCP Response Validation
### Structure Validation
Ensure the response matches the MCP protocol:
```python
def validate_mcp_response_structure(response: dict) -> None:
    """Validate MCP response structure."""
    assert "success" in response, "Missing 'success' field"
    assert isinstance(response["success"], bool), "'success' must be boolean"

    if response["success"]:
        assert "data" in response, "Success response must have 'data'"
        assert "summary" in response, "Success response must have 'summary'"
    else:
        assert "error_code" in response, "Error response must have 'error_code'"
        assert "error_message" in response, "Error response must have 'error_message'"


@pytest.mark.e2e
async def test_response_structure(debug_interface: DebugInterface):
    """Test tool returns valid MCP response structure."""
    await debug_interface.start_session(target="examples/simple_app.py")

    response = await debug_interface.call_tool("aidb.session_status", {})

    # Validate structure
    validate_mcp_response_structure(response)
```
### Content Accuracy Validation
Verify response data reflects actual debug state:
```python
@pytest.mark.e2e
async def test_locals_content_accuracy(debug_interface: DebugInterface):
    """Test inspect.locals returns accurate variable values."""
    await debug_interface.start_session(
        target="examples/variables.py",
        breakpoints=[{"file": "examples/variables.py", "line": 15}],
    )
    await debug_interface.continue_execution()

    # Inspect locals
    response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )

    # Validate specific variables exist with correct values
    locals_dict = {v["name"]: v["value"] for v in response["data"]["locals"]}
    assert "x" in locals_dict, "Variable 'x' should exist"
    assert locals_dict["x"] == "10", f"Expected x=10, got {locals_dict['x']}"
    assert "message" in locals_dict, "Variable 'message' should exist"
    assert locals_dict["message"] == "'hello'", "Message should be 'hello'"
```
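The assertions above presuppose a particular fixture program. A hypothetical `examples/variables.py` that would satisfy them might look like the following (the real test asset and its line numbers may differ):

```python
# Hypothetical examples/variables.py sketch; pause around the print() call to see x and message in locals
def main():
    x = 10
    message = "hello"
    print(message, x)


if __name__ == "__main__":
    main()
```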
### Efficiency Validation (No Bloat)
Ensure responses don't include unnecessary data:
```python
@pytest.mark.e2e
async def test_response_efficiency(debug_interface: DebugInterface):
    """Test tool returns only necessary data (no bloat)."""
    await debug_interface.start_session(
        target="examples/simple_app.py",
        breakpoints=[{"file": "examples/simple_app.py", "line": 5}],
    )
    await debug_interface.continue_execution()

    response = await debug_interface.call_tool(
        "aidb.inspect",
        {"target": "locals"},
    )

    # Response should be focused
    data_keys = set(response["data"].keys())

    # Should have locals, maybe location
    assert len(data_keys) <= 3, f"Too many fields: {data_keys}"

    # Should NOT have unnecessary fields
    bloat_fields = {"globals", "timestamp", "correlation_id", "metadata"}
    assert not (data_keys & bloat_fields), f"Found bloat: {data_keys & bloat_fields}"
```
## Integration Tests
Integration tests focus on handler logic without full session setup.
### Testing Handlers Directly
**⚠️ CONCEPTUAL EXAMPLE** - Adapt to actual handler structure:
```python
import pytest

from aidb_mcp.handlers.inspection.state_inspection import handle_inspect


@pytest.mark.integration
async def test_inspect_handler_validation(mock_session_context):
    """Test inspect handler parameter validation."""
    # Missing required parameter
    response = await handle_inspect({
        "_api": mock_session_context.api,
        "_context": mock_session_context,
    })

    # Should return validation error
    assert response["success"] is False
    assert "VALIDATION" in response["error_code"]
```
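The `mock_session_context` fixture is not shown in this guide. A minimal sketch using `unittest.mock` might look like the following; the `api` attribute name is assumed from how the handler reads it above, and the real fixture may carry more state:

```python
from unittest.mock import AsyncMock, MagicMock

import pytest


@pytest.fixture
def mock_session_context():
    """Hypothetical stand-in session context exposing an async `api` attribute."""
    context = MagicMock()
    context.api = AsyncMock()  # handler awaits calls into the debug API through this
    return context
```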
### Testing Response Builders
```python
import pytest

from aidb_mcp.responses.builders import ExecutionStateBuilder
from aidb_mcp.core.constants import DetailedExecutionStatus


@pytest.mark.unit
def test_execution_state_builder():
    """Test ExecutionStateBuilder creates correct structure."""
    state = ExecutionStateBuilder.build(
        detailed_status=DetailedExecutionStatus.STOPPED_AT_BREAKPOINT,
        location="example.py:10",
        has_breakpoints=True,
        stop_reason="breakpoint",
    )

    assert state["status"] == "stopped"
    assert state["reason"] == "breakpoint"
    assert state["location"] == "example.py:10"
    assert state["has_breakpoints"] is True
```
## Test Organization
### Directory Structure
**⚠️ SIMPLIFIED EXAMPLE** - Actual structure is more detailed:
```
src/tests/
├── aidb_mcp/
│   ├── core/
│   │   ├── integration/
│   │   └── unit/
│   ├── handlers/
│   ├── responses/
│   ├── server/
│   └── utils/
└── _helpers/
    └── debug_interface/
        ├── base.py
        ├── api_interface.py
        └── mcp_interface.py
```
### Test Naming Conventions
- **E2E**: `test_<tool_name>_<scenario>` (e.g., `test_step_into_function`)
- **Integration**: `test_<component>_<behavior>` (e.g., `test_handler_validation`)
- **Unit**: `test_<function>_<case>` (e.g., `test_validate_action_invalid_enum`)
## Common Test Fixtures
### debug_interface Fixture
The fixture selects the MCP or direct-API implementation based on the `interface_type` parameter:
```python
import pytest

# Import paths assume the directory layout shown above; adjust to the actual module names.
from tests._helpers.debug_interface.api_interface import DirectAPIDebugInterface
from tests._helpers.debug_interface.mcp_interface import MCPDebugInterface


@pytest.fixture
async def debug_interface(interface_type: str):
    """Provide DebugInterface for tests."""
    if interface_type == "mcp":
        interface = MCPDebugInterface()
    else:
        interface = DirectAPIDebugInterface()

    yield interface

    # Cleanup
    await interface.cleanup()
```
### Sample Applications
**⚠️ CONCEPTUAL EXAMPLES** - Create test programs as needed. Useful test programs for debugging workflows typically cover:
- Basic sequential execution program
- Function calls and variable tracking
- Control flow and loops
- Exception handling scenarios
See `src/tests/_assets/test_programs/` for actual test program examples.
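For example, a minimal program for the step-and-inspect workflow in Pattern 4 might look like this sketch (hypothetical; the real programs live in the assets directory above):

```python
# Hypothetical calculator-style test program for step/inspect workflows
def add(a: int, b: int) -> int:
    return a + b


def main():
    a = 2
    b = 3
    result = add(a, b)  # stepping over this line makes `result` appear in locals
    print(result)


if __name__ == "__main__":
    main()
```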
## Best Practices
### DO
- **Test all three dimensions**: Structure, content accuracy, efficiency
- **Use DebugInterface**: Ensures MCP and API stay in sync
- **Test error paths**: Invalid inputs, wrong state, edge cases
- **Validate actual values**: Not just structure, but correct data
- **Check for bloat**: Responses should be minimal
- **Test workflows**: Multi-step realistic scenarios
### DON'T
- **Skip content validation**: Structure tests alone aren't enough
- **Ignore efficiency**: Bloated responses hurt agent performance
- **Test implementation details**: Focus on behavior, not internals
- **Mock unnecessarily**: Use real debug sessions when possible
- **Write flaky tests**: Ensure deterministic outcomes
## Debugging Failed Tests
### Enable Verbose Logging
```bash
AIDB_LOG_LEVEL=DEBUG AIDB_ADAPTER_TRACE=1 \
pytest src/tests/aidb_mcp/e2e/test_your_tool.py -v -s
```
### Inspect MCP Responses
```python
import json


@pytest.mark.e2e
async def test_debug_response(debug_interface: DebugInterface):
    """Debug test to inspect full response."""
    response = await debug_interface.call_tool("aidb.your_tool", {})

    # Print full response for debugging
    print(json.dumps(response, indent=2))

    # Add assertions after inspecting output
```
### Check Execution State
```python
@pytest.mark.e2e
async def test_debug_state(debug_interface: DebugInterface):
    """Debug test to check execution state."""
    await debug_interface.start_session(target="examples/simple_app.py")

    # Check state before operation
    status = await debug_interface.call_tool("aidb.session_status", {})
    print(f"Before: {status['data']['execution_state']}")

    # Perform operation
    await debug_interface.step("over")

    # Check state after operation
    status = await debug_interface.call_tool("aidb.session_status", {})
    print(f"After: {status['data']['execution_state']}")
```
## Summary
Testing MCP tools requires:
1. **DebugInterface abstraction** - Test MCP and API together
1. **Three-dimensional validation** - Structure + Content + Efficiency
1. **E2E-first approach** - Highest ROI testing
1. **Real debug sessions** - Avoid excessive mocking
1. **Content accuracy checks** - Verify actual values, not just structure
Always test the agent experience: Is the response accurate, clear, and minimal?