pwno-mcp
Stateful system for autonomous pwn and binary research, designed for LLM agents.
Overview
PwnoMCP is designed to run in scalable containerized environments (k8s) and exposes stateful interaction and debugging capabilities through the Model Context Protocol (MCP). Each container runs an isolated instance with its own self-sufficient research environment.
Where did it come from?: This version is the result of around 6 months of iterating, redesigning, thinking, and researching on the problem of "making a pwn MCP for LLMs":
I first tried writing a GDB plugin and executing via the GDB Python API (autogdb.io), integrating MCP on a single backend, with authorization done by rewriting a fairly low-level implementation of early MCP's SSE (this was around March, see this post). It didn't work out well: capturing program stdio was a problem for the GDB APIs (we did try delimiters, but there's another story about timing that I'll get to later), and stopping a multi-threaded binary is a bit problematic (which makes the actual execution part pretty much unusable), although this version was quite scalable, since a single command backend was enough. (autogdb was only solving the problem of connecting your debugging/research machine to the agent client; it sounds easy, but it was tangled up with jumping between frontend, auth, and various compatibility problems.)
After realizing the scalability problem of autogdb.io, I started on the idea of bringing the entire research environment into the cloud, with scalable pre-configured environments. Tons of time went into learning and making mistakes in k8s, specifically GKE, pretty much learning everything from thin air. We got a working MVP in around 2 weeks of diving into this (back then I still had my AP exams). The backend still had the major problem of "how do we start an environment for everyone, and how do we let everyone access their own environment?" We stuck with the original centralized MCP backend approach, but this time we assigned a k8s stream channel to each user and used those IO channels to interact natively with GDB (with delimiters); this was still meant to solve the problem of capturing program IO, which is a tricky problem. I then figured users should also be able to see their GDB session in the cloud, so I came up with the approach of duplicating the stdio channel back into the frontend via k8s streams and WebSockets. After around 2 months of development we got pwno.io up and running, but there were still tons of problems that ate an incredible amount of time I haven't mentioned, from GKE integration to network issues.
pwno.io was working, I can't say well, but at a working level; there were still asynchronization problems and GKE-native problems, but we managed to solve the most pain-in-the-ass issues, scalability and interactive IO, which we had spent by far the longest, around 3 months, on. This is when I started working on pwnuous, our cooperation with GGML, which needed something like the previous version of pwno-mcp but with more stable support. Since the previous version plugged into GDB via a direct IO stream, the asynchronization problem I mentioned was another huge pain: some IO slipped away, and it just wasn't stable enough to use. This is when I started thinking about rewriting everything, and throwing some parts away purely for usability for LLMs and full agentic compatibility. I was working on my Black Hat talk back then, so I thought a little about statefulness and learned about this wonderful thing that seems to have been born just for us: GDB/MI (Debugging with GDB). I spent a few days rewriting the entire thing by reading the docs. I definitely spent less time conceptualizing the backend architecture for this version of pwno-mcp than for pwno.io (around 2 days, mainly on GKE gateway things). It's definitely not a very elaborate or sophisticated framework by any means, but it did come from a shit-ton of experience trial-and-erroring myself while thinking about how to make something that can scale (multi-agent, researchers using it), so I'd say it's by far my best conceptualization and work for the purpose of LLMs using it stably and scalably. And I do think it's the best time, or the now-or-never time, to open-source it, or this project, or Pwno, will die from a lack of feedback loop, even though pwno-mcp is only a small part of what we're doing.
Can I use it?:
non-profit: yes, feel free to use it
commercial: oss@pwno.io
Features
pwndbg machine interface, stateful execution with debugging (designed deliberately for LLMs):
Execute Tool: Run arbitrary GDB/pwndbg commands
Launch Tool: Load and run binaries with full control
Step Control: Support for all stepping commands (run, c, n, s, ni, si)
Context Retrieval: Get registers, stack, disassembly, code, and backtrace
Breakpoint Management: Set conditional breakpoints
Memory Operations: Read memory in various formats
Session State: Track debugging session state and history
Subprocess Tools:
Compile binaries with sanitizers (ASAN, MSAN, etc.)
Spawn and manage background processes
Track process status and resource usage
RetDec: RetDec decompiler service
note: to use pwno's retdec server, you need to
Auth: Authentication via a nonce
Installation
Using Docker
Build and run with Docker:
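A minimal sketch, assuming an image tag of pwno-mcp (the tag and the ptrace-related flags below are illustrative, not the project's published invocation; GDB inside a container typically needs the SYS_PTRACE capability):
docker build -t pwno-mcp .
docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined pwno-mcp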
Or use Docker Compose:
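Again a sketch, assuming a docker-compose.yml is provided at the repository root:
docker compose up --build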
The Docker image includes:
Ubuntu 24.04 LTS base
GDB with pwndbg pre-installed
Build tools (gcc, g++, clang, make, cmake)
Address sanitizer libraries
Python with uv package manager
All required dependencies
Prerequisites
GDB with Python support
pwndbg installed and configured in ~/.gdbinit
Python 3.8+
Usage
Running the Server
Docker Deployment
MCP Tools
1. Execute
Execute any GDB/pwndbg command:
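For example (the exact tool and parameter names here are illustrative assumptions):
{"tool": "execute", "arguments": {"command": "vmmap"}}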
2. Set File
Load a binary file for debugging:
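For example, mirroring the invocation shown in the workflow below:
{"tool": "set_file", "arguments": {"binary_path": "/path/to/binary"}}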
3. Run
Run the loaded binary (requires at least one enabled breakpoint set beforehand):
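For example (an empty args string runs the binary with no arguments):
{"tool": "run", "arguments": {"args": ""}}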
Note: You must set at least one enabled breakpoint before running.
4. Step Control
Control program execution:
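For example, stepping over one source line:
{"tool": "step_control", "arguments": {"command": "n"}}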
5. Get Context
Retrieve debugging context:
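For example, fetching everything at once:
{"tool": "get_context", "arguments": {"context_type": "all"}}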
6. Set Breakpoint
Set breakpoints with optional conditions:
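For example; the condition parameter name in the second call is an assumption for illustration:
{"tool": "set_breakpoint", "arguments": {"location": "main"}}
{"tool": "set_breakpoint", "arguments": {"location": "vuln.c:42", "condition": "i == 7"}}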
7. Get Memory
Read memory at specific addresses:
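A sketch; the tool and parameter names (address, count, format) are assumptions for illustration:
{"tool": "get_memory", "arguments": {"address": "0x7fffffffe000", "count": 64, "format": "x"}}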
8. Get Session Info
Get current debugging session information:
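For example (assuming the tool is exposed as get_session_info and takes no arguments):
{"tool": "get_session_info", "arguments": {}}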
9. Run Command
Execute system commands (primarily for compilation):
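For example, in the same form as the compilation workflow below:
{"tool": "run_command", "arguments": {"command": "gcc -g -o test test.c"}}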
10. Spawn Process
Start a background process and get its PID:
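For example:
{"tool": "spawn_process", "arguments": {"command": "./vulnerable_server 8080"}}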
11. Get Process Status
Check status of a spawned process:
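For example, using the PID returned by spawn_process:
{"tool": "get_process_status", "arguments": {"pid": 12345}}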
12. Kill Process
Terminate a process:
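A sketch, assuming the tool takes the same pid argument as get_process_status:
{"tool": "kill_process", "arguments": {"pid": 12345}}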
13. List Processes
List all tracked background processes:
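For example (assuming the tool is exposed as list_processes and takes no arguments):
{"tool": "list_processes", "arguments": {}}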
Typical Workflow
Load a binary:
{"tool": "set_file", "arguments": {"binary_path": "/path/to/binary"}}Prepare breakpoints, then run:
{"tool": "set_breakpoint", "arguments": {"location": "main"}} {"tool": "run", "arguments": {"args": ""}}Use stepping commands and examine state:
{"tool": "step_control", "arguments": {"command": "n"}} {"tool": "get_context", "arguments": {"context_type": "all"}}
Compilation Workflow Example
Compile with AddressSanitizer:
{"tool": "run_command", "arguments": {"command": "gcc -g -fsanitize=address -fno-omit-frame-pointer vuln.c -o vuln"}}Load and debug the compiled binary:
{"tool": "set_file", "arguments": {"binary_path": "./vuln"}} {"tool": "set_breakpoint", "arguments": {"location": "main"}} {"tool": "run", "arguments": {"args": ""}}If running a server for exploitation:
{"tool": "spawn_process", "arguments": {"command": "./vulnerable_server 8080"}}Then check its status:
{"tool": "get_process_status", "arguments": {"pid": 12345}}
Development
Project Structure
Key Design Decisions
Synchronous Tool Execution: Unlike pwndbg-gui, each MCP tool invocation returns complete results immediately, suitable for LLM interaction.
State Management: The server maintains session state including binary info, breakpoints, watches, and command history.
GDB/MI Native Commands: Leverages GDB Machine Interface commands for structured output, as recommended in the pygdbmi documentation:
-file-exec-and-symbols instead of file for loading binaries
-break-insert instead of break for structured breakpoint data
-exec-run, -exec-continue, -exec-next, etc. for execution control
-data-evaluate-expression for expression evaluation
-break-list, -break-delete, -break-enable/disable for breakpoint management
This provides structured JSON-like responses instead of parsing text output, making the server more reliable and efficient.
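As a quick illustration (abbreviated; the exact fields vary by GDB version and binary), the MI form of setting a breakpoint returns a structured result record rather than free-form text:
-break-insert main
^done,bkpt={number="1",type="breakpoint",disp="keep",enabled="y",func="main",...}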
Per-Container Isolation: Each container runs its own GDB instance, ensuring complete isolation between debugging sessions.
Future Enhancements
WebSocket endpoint for streaming I/O
Advanced memory analysis tools
Heap exploitation helpers
ROP chain generation
Symbolic execution integration
License
This project is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).
Non-commercial use only
No derivatives or modifications may be distributed
Attribution required
See the LICENSE file for the full legal text or visit the license page: https://creativecommons.org/licenses/by-nc-nd/4.0/
Contributing
Contributions are welcome! Please submit pull requests or open issues on GitHub.