de en es ja ko ru zh

IDA Pro MCP

by mrexodia

Overview Schema Related Servers Score Discussions

Python

Local

ida-pro-mcp
devdocs

test-framework.md

test-framework.md•16.6 KiB

# IDA Pro MCP Test Framework This document provides detailed documentation for the test framework used in ida-pro-mcp. ## Overview The test framework is a custom, lightweight testing solution designed specifically for testing IDA Pro MCP tools. It supports: - **Inline tests** - Tests defined right after the functions they test in `api_*.py` files - **IDA console runner** - Run tests interactively inside IDA Pro - **Standalone runner** - Run tests headlessly using idalib - **Code coverage** - Measure code coverage of API implementations - **Binary-agnostic assertions** - Tests that work on any binary ## Architecture ``` src/ida_pro_mcp/ ├── ida_mcp/ │ ├── framework.py # Test framework core │ │ ├── @test decorator # Register test functions │ │ ├── run_tests() # Test runner │ │ ├── TestResults # Result aggregation │ │ ├── assert_* # Assertion helpers │ │ └── get_any_* # Test data helpers │ ├── tests/ # Separate test files │ │ ├── test_api_core.py │ │ ├── test_api_analysis.py │ │ ├── test_api_memory.py │ │ ├── test_api_modify.py │ │ ├── test_api_types.py │ │ ├── test_api_stack.py │ │ └── test_api_resources.py │ ├── api_core.py # API implementations │ ├── api_analysis.py │ └── ... ├── test.py # Standalone runner (idalib) └── ... tests/ ├── crackme03.elf # Test binary └── binary_info.py # Binary info extraction ``` ## Test Registration ### The `@test` Decorator ```python from .tests import test @test() def test_idb_meta(): """Test description""" meta = idb_meta() assert_has_keys(meta, "path", "module", "base") ``` #### Parameters | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `binary` | `str` | `""` | Specific binary this test runs for | | `skip` | `bool` | `False` | Skip this test | #### Examples ```python # Basic test @test() def test_basic(): pass # Skipped test @test(skip=True) def test_broken(): pass ``` ### Test Placement Tests should be placed **immediately after** the function they test: ```python @tool @idasync def idb_meta() -> Metadata: """Get IDB metadata""" return Metadata(...) @test() def test_idb_meta(): """idb_meta returns valid metadata""" meta = idb_meta() assert_has_keys(meta, "path", "module") ``` This keeps tests close to implementation and makes it easy to see test coverage. ### Test Auto-Discovery Tests are automatically discovered and categorized by their module: ```python # In api_core.py @test() def test_idb_meta(): # Registered as category="api_core" pass # In api_analysis.py @test() def test_decompile(): # Registered as category="api_analysis" pass ``` ## Assertion Helpers Import helpers from `tests.py`: ```python from .tests import ( test, assert_has_keys, assert_valid_address, assert_non_empty, assert_is_list, assert_all_have_keys, ) ``` ### `assert_has_keys(d, *keys)` Assert a dict contains all specified keys: ```python meta = idb_meta() assert_has_keys(meta, "path", "module", "base", "size") ``` ### `assert_valid_address(addr)` Assert a string is a valid hex address: ```python assert_valid_address("0x401000") # OK assert_valid_address("-0x10") # OK (negative) assert_valid_address("401000") # FAIL - no 0x prefix assert_valid_address("not_addr") # FAIL - not hex ``` ### `assert_non_empty(value)` Assert value is not None and not empty: ```python assert_non_empty("hello") # OK assert_non_empty([1, 2, 3]) # OK assert_non_empty("") # FAIL assert_non_empty([]) # FAIL assert_non_empty(None) # FAIL ``` ### `assert_is_list(value, min_length=0)` Assert value is a list with minimum length: ```python assert_is_list([1, 2, 3]) # OK assert_is_list([1, 2], min_length=2) # OK assert_is_list([1], min_length=2) # FAIL - too short assert_is_list("string") # FAIL - not a list ``` ### `assert_all_have_keys(items, *keys)` Assert all dicts in a list have specified keys: ```python funcs = [{"addr": "0x1", "name": "a"}, {"addr": "0x2", "name": "b"}] assert_all_have_keys(funcs, "addr", "name") # OK assert_all_have_keys(funcs, "addr", "size") # FAIL - missing "size" ``` ## Test Data Helpers ### `get_any_function() -> Optional[str]` Get address of first function (for tests needing a valid function): ```python fn_addr = get_any_function() if not fn_addr: return # Skip test if no functions result = decompile(fn_addr) assert result[0]["code"] is not None ``` ### `get_any_string() -> Optional[str]` Get address of first string: ```python str_addr = get_any_string() if not str_addr: return # Skip if no strings result = get_string(str_addr) assert result[0]["value"] is not None ``` ### `get_first_segment() -> Optional[tuple[str, str]]` Get first segment's (start, end) addresses: ```python seg = get_first_segment() if not seg: return start_addr, end_addr = seg result = get_bytes({"addr": start_addr, "size": 16}) ``` ## Running Tests ### From IDA Console ```python from ida_mcp.tests import run_tests # Run all tests run_tests() # Run specific category run_tests(category="api_core") # Run tests matching pattern run_tests(pattern="*decompile*") # Combine filters run_tests(pattern="test_list_*", category="api_core") # Stop on first failure run_tests(stop_on_failure=True) # Quiet mode (just summary) run_tests(verbose=False) ``` ### From Command Line ```bash # Basic usage ida-mcp-test tests/crackme03.elf # Filter by category ida-mcp-test tests/crackme03.elf --category api_core # Filter by pattern ida-mcp-test tests/crackme03.elf --pattern "*meta*" # List available tests ida-mcp-test tests/crackme03.elf --list # Stop on first failure ida-mcp-test tests/crackme03.elf --stop-on-failure # With code coverage ida-mcp-test tests/crackme03.elf --coverage # HTML coverage report ida-mcp-test tests/crackme03.elf --coverage --coverage-report html ``` ## Code Coverage Code coverage is measured using the `coverage` package with `uv run coverage run`. ### Running with Coverage ```bash # Run tests with coverage uv run coverage run -m ida_pro_mcp.test crackme03.elf # Show coverage report in terminal uv run coverage report --show-missing # Generate HTML report uv run coverage html open htmlcov/index.html # Generate XML report (for CI) uv run coverage xml # Generate JSON report uv run coverage json # Combine multiple coverage runs uv run coverage combine uv run coverage report ``` ### Coverage Configuration Create a `.coveragerc` file or add to `pyproject.toml`: ```toml [tool.coverage.run] source = ["ida_pro_mcp.ida_mcp"] omit = [ "*/tests.py", "*/zeromcp/*", "*/__init__.py", ] [tool.coverage.report] exclude_lines = [ "pragma: no cover", "if TYPE_CHECKING:", "raise NotImplementedError", ] ``` ## Test Patterns ### 1. Schema Validation Verify return types match expected structure: ```python @test() def test_idb_meta(): """idb_meta returns properly structured metadata""" meta = idb_meta() assert_has_keys(meta, "path", "module", "base", "size", "md5", "sha256") assert_valid_address(meta["base"]) ``` ### 2. Non-Empty Results Verify APIs return data on valid binaries: ```python @test() def test_list_funcs(): """list_funcs returns at least one function""" result = list_funcs({}) assert_is_list(result, min_length=1) assert_is_list(result[0]["data"], min_length=1) ``` ### 3. Error Handling Verify invalid inputs raise `IDAError`: ```python from .sync import IDAError @test() def test_decompile_invalid(): """decompile raises IDAError for invalid address""" try: decompile("0xDEADBEEFDEADBEEF") assert False, "Expected IDAError" except IDAError: pass # Expected ``` ### 4. Referential Integrity Verify cross-API consistency: ```python @test() def test_functions_are_decompilable(): """Functions from list_funcs can be decompiled""" result = list_funcs({"count": 3}) for fn in result[0]["data"]: dec = decompile(fn["addr"]) # Should either have code or specific error, not crash assert "code" in dec[0] or "error" in dec[0] ``` ### 5. Round-Trip with Cleanup Test modifications and restore original state: ```python @test() def test_rename_roundtrip(): """rename function works and can be undone""" fn_addr = get_any_function() if not fn_addr: return # Get original name original = lookup_funcs(fn_addr)[0]["fn"]["name"] try: # Rename rename({"func": [{"addr": fn_addr, "name": "__test__"}]}) # Verify new_name = lookup_funcs(fn_addr)[0]["fn"]["name"] assert new_name == "__test__" finally: # Restore rename({"func": [{"addr": fn_addr, "name": original}]}) ``` ### 6. Batch Operations Verify batch input handling: ```python @test() def test_batch_operations(): """Operations handle multiple inputs""" result = list_funcs({"count": 3}) addrs = [fn["addr"] for fn in result[0]["data"]] # Batch lookup results = lookup_funcs(addrs) assert len(results) == len(addrs) ``` ## Test Results ### `TestResult` Structure ```python @dataclass class TestResult: name: str # Test function name category: str # Module category (api_core, etc.) status: Literal["passed", "failed", "skipped"] duration: float # Execution time in seconds error: Optional[str] # Error message if failed traceback: Optional[str] # Full traceback if failed ``` ### `TestResults` Aggregate ```python @dataclass class TestResults: passed: int failed: int skipped: int total_time: float results: list[TestResult] ``` ### Programmatic Access ```python results = run_tests(verbose=False) print(f"Passed: {results.passed}") print(f"Failed: {results.failed}") for r in results.results: if r.status == "failed": print(f"{r.name}: {r.error}") print(r.traceback) ``` ## Adding New Tests ### Step-by-Step 1. **Identify the function to test** in the appropriate `api_*.py` file 2. **Add imports** at the top of the file: ```python from .tests import ( test, assert_has_keys, assert_valid_address, # ... other helpers you need ) ``` 3. **Write the test** immediately after the function: ```python @tool @idasync def my_function(...): ... @test() def test_my_function(): """Description of what the test verifies""" result = my_function(...) assert_has_keys(result, "key1", "key2") ``` 4. **Run the test**: ```bash ida-mcp-test tests/crackme03.elf --pattern "test_my_function" ``` ### Best Practices 1. **Test name**: Use `test_<function_name>` or `test_<function_name>_<scenario>` 2. **Docstring**: First line describes what the test verifies 3. **Binary-agnostic**: Don't hardcode addresses or expected values 4. **Skip gracefully**: Return early if required data isn't available ```python fn_addr = get_any_function() if not fn_addr: return # Skip, don't fail ``` 5. **Cleanup**: Restore state for tests that modify the IDB ```python try: # Modify finally: # Restore ``` 6. **Error handling**: Use `try/except IDAError` for expected errors ## Coverage-Guided Test Development Tests should be **coverage-guided**: use coverage data to identify gaps in our wrapper code, then write tests that exercise those specific code paths. ### Methodology **Key Principle**: We're testing *our wrapper code*, not IDA's correctness. Don't write tests that verify IDA returns expected values - IDA works. Instead, write tests that trigger edge cases and error paths in our code. #### Workflow 1. **Run coverage**: ```bash uv run coverage run -m ida_pro_mcp.test tests/crackme03.elf uv run coverage report --show-missing --include="src/ida_pro_mcp/ida_mcp/api_*.py" ``` 2. **Identify uncovered lines** and understand what conditions trigger them 3. **Determine test type**: - **Generic test**: If the gap can be triggered with invalid/edge-case inputs (preferred) - **Binary-specific test**: Only if the gap requires specific binary properties 4. **Write minimal test** that triggers the specific code path 5. **Re-run coverage** to verify the gap is closed #### Example: Coverage-Guided Test Development Coverage shows lines 252-256 in `api_core.py` are uncovered: ```python except ValueError: results.append( {"input": text, "result": None, "error": f"Invalid number: {text}"} ) ``` This is an error path for invalid input. Write a generic test: ```python @test() def test_int_convert_invalid_text(): """int_convert handles invalid number text (covers lines 252-256)""" result = int_convert({"text": "not_a_number"}) assert result[0]["result"] is None assert "Invalid number" in result[0]["error"] ``` ### Deterministic Sampling For broader test coverage, test on multiple items rather than just one. The framework provides helpers that select a deterministic sample based on binary name. #### Sample Size Configuration Sample size is configured via command line: ```bash # Default: 5 items uv run ida-mcp-test tests/crackme03.elf # More thorough testing with 10 items uv run ida-mcp-test tests/crackme03.elf --sample-size 10 # Quick smoke test with 2 items uv run ida-mcp-test tests/crackme03.elf --sample-size 2 ``` #### Sampling Helpers ```python from .tests import get_n_functions, get_n_strings, get_functions_with_calls @test() def test_decompile_multiple(): """decompile works on multiple functions (sampling test)""" addrs = get_n_functions() # Uses configured sample size if len(addrs) < 2: return results = decompile(addrs) assert len(results) == len(addrs) # At least some should succeed successes = [r for r in results if r.get("code")] assert len(successes) > 0 ``` #### Available Helpers | Helper | Description | |--------|-------------| | `get_n_functions(n=None)` | N deterministically-selected function addresses | | `get_n_strings(n=None)` | N deterministically-selected string addresses | | `get_functions_with_calls()` | Functions that contain call instructions | | `get_functions_with_callers()` | Functions that are called by others | | `get_data_address()` | Address in a data (non-code) segment | | `get_unmapped_address()` | Invalid address for error path testing | ### Test Categories #### 1. Schema Tests Verify return structure matches expected schema: ```python @test() def test_idb_meta_schema(): result = idb_meta() assert_has_keys(result, "path", "module", "base", "size") ``` #### 2. Error Path Tests Verify error handling for invalid inputs: ```python @test() def test_decompile_invalid_address(): """Covers error handling path""" result = decompile(get_unmapped_address()) assert result[0].get("error") is not None ``` #### 3. Edge Case Tests Cover specific code paths identified by coverage: ```python @test() def test_lookup_funcs_wildcard(): """lookup_funcs with '*' returns all functions (covers lines 132-134)""" result = lookup_funcs("*") assert len(result) > 0 assert result[0]["query"] == "*" ``` #### 4. Sampling Tests Test on multiple items for broader coverage: ```python @test() def test_callees_multiple(): """callees works on multiple functions (sampling test)""" addrs = get_n_functions() result = callees(addrs) assert len(result) == len(addrs) ``` ### Binary-Specific Tests (Use Sparingly) Binary-specific tests should only be used when: 1. A code path requires specific binary properties (e.g., C++ mangled names) 2. The edge case cannot be triggered with generic inputs ```python @test(binary="crackme03.elf") def test_crackme_specific_edge_case(): """Tests a code path that requires specific binary structure""" # Only runs when crackme03.elf is loaded ... ``` **Do NOT use binary-specific tests to verify IDA correctness** (e.g., "main is at 0x123e"). ### Coverage Targets | Module | Target | Notes | |--------|--------|-------| | api_core | 90%+ | Core functionality | | api_analysis | 85%+ | Complex decompilation paths | | api_memory | 85%+ | Read/write operations | | api_types | 80%+ | Type system edge cases | | api_modify | 70%+ | Modification operations | | api_resources | 85%+ | Resource URIs | | api_stack | 80%+ | Stack frame operations | | api_debug | Skip | Requires active debugger | | api_python | Skip | Requires special setup |

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mrexodia/ida-pro-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

test-framework.md•16.6 KiB