get_unpacked_files
Extract and classify binaries that were unpacked in memory during a sandbox analysis. The results identify runtime-decrypted payloads and memory-resident code, and tag each file with its process ID and the point in execution at which it was captured.
Instructions
Retrieve and classify in-memory unpacked binaries from a sandbox analysis.
This tool extracts executable artifacts that were unpacked in memory during the dynamic execution of the submitted sample. These binaries typically reflect runtime-decrypted payloads or memory-resident code generated by the sample or its child processes.
Each extracted file is associated with:
- The process ID (pid) responsible for its memory region.
- A classification that indicates **when** during execution the memory snapshot was taken.
If a custom `save_path` is provided, the files are saved under `{save_path}/{webid}-{run}`. If that path is invalid or inaccessible, the fallback directory `unpacked_files/{webid}-{run}` is used instead, and the returned `note` reports this.
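The directory selection described above can be sketched as follows. This is a minimal illustration, not the server's code: `resolve_output_dir` and its `fallback_root` parameter are hypothetical names introduced here so the example can run inside a temporary directory.

```python
import os
import tempfile
from typing import Optional, Tuple


def resolve_output_dir(
    webid: str,
    run: int,
    save_path: Optional[str] = None,
    fallback_root: str = "unpacked_files",  # hypothetical parameter, for testability
) -> Tuple[str, bool]:
    """Return (absolute output directory, whether the fallback replaced save_path)."""
    subdir = f"{webid}-{run}"
    if save_path:
        candidate = os.path.join(save_path, subdir)
        try:
            os.makedirs(candidate, exist_ok=True)
            return os.path.abspath(candidate), False
        except OSError:
            # Invalid or inaccessible base directory: fall through to the default.
            pass
    fallback_dir = os.path.join(fallback_root, subdir)
    os.makedirs(fallback_dir, exist_ok=True)
    # The fallback flag is only set when a save_path was given and rejected.
    return os.path.abspath(fallback_dir), bool(save_path)


with tempfile.TemporaryDirectory() as tmp:
    fallback_root = os.path.join(tmp, "unpacked_files")

    # A writable base directory is used as given.
    ok_dir, ok_fallback = resolve_output_dir("100001", 0, tmp, fallback_root)

    # A regular file cannot serve as a base directory, so the fallback applies.
    blocker = os.path.join(tmp, "not_a_directory")
    with open(blocker, "w") as fh:
        fh.write("x")
    bad_dir, bad_fallback = resolve_output_dir("100001", 0, blocker, fallback_root)
```

Returning the flag alongside the path lets the caller build a `note` message without inspecting the filesystem a second time.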
Snapshot types:
- "Snapshot at beginning of execution": Memory captured at process start.
- "Snapshot taken on unpacking (modifying executable sections or adding new ones)": Captured at runtime after self-modifying code or section manipulation.
- "Snapshot at the end of execution": Captured near process termination.
- "Snapshot taken when memory gets freed": Captured when memory regions were released.
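Because the `type` field carries these labels as plain strings, a caller can filter on them directly. The sketch below assumes file entries shaped like the ones this tool returns; the sample entries, paths, and the `select_by_type` helper are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List

# The four labels listed above, copied verbatim.
AT_START = "Snapshot at beginning of execution"
ON_UNPACKING = "Snapshot taken on unpacking (modifying executable sections or adding new ones)"
AT_END = "Snapshot at the end of execution"
ON_FREE = "Snapshot taken when memory gets freed"


def select_by_type(files: List[Dict[str, Any]], label: str) -> List[Dict[str, Any]]:
    """Keep only the entries whose snapshot label matches exactly."""
    return [entry for entry in files if entry.get("type") == label]


# Hypothetical entries, invented for illustration.
sample_files = [
    {"unpacked_file": "/tmp/out/1.0.sample.exe.a.unpack", "pid": "4321", "type": AT_START},
    {"unpacked_file": "/tmp/out/1.1.sample.exe.b.unpack", "pid": "4321", "type": ON_UNPACKING},
    {"unpacked_file": "/tmp/out/3.2.child.exe.c.unpack", "pid": "5012", "type": AT_END},
]

unpacking_hits = select_by_type(sample_files, ON_UNPACKING)
type_counts = Counter(entry["type"] for entry in sample_files)
```

Snapshots taken on unpacking are often a natural first stop when looking for a decrypted payload, though which snapshot type matters most depends on the sample.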
Args:
webid (required): The submission ID of the analysis.
run (optional, default = 0): Index of the sandbox run to process (typically 0 for the first run).
save_path (optional): Base directory in which to store the unpacked files. If it is not valid, the default directory is used.
Returns:
A dictionary containing:
- output_directory: Absolute path where the files were saved.
- files: A list of unpacked file entries, each with:
  - unpacked_file: Absolute path to the file on disk.
  - pid: ID of the process associated with the memory region.
  - type: A human-readable label describing when the snapshot was taken.
- note: A message indicating whether the fallback directory was used.
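A caller will usually want to group the returned entries per process. The sketch below assumes the dictionary shape described above, plus the `error` key that the handler returns when the download fails; `summarize_result` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def summarize_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Group unpacked file paths by pid, or surface the error message."""
    if "error" in result:
        return {"ok": False, "message": result["error"], "by_pid": {}}
    by_pid: Dict[str, List[str]] = defaultdict(list)
    for entry in result.get("files", []):
        by_pid[str(entry["pid"])].append(entry["unpacked_file"])
    return {"ok": True, "message": result.get("note", ""), "by_pid": dict(by_pid)}


# Hypothetical successful result, invented for illustration.
sample_result = {
    "output_directory": "/tmp/out/100001-0",
    "files": [
        {"unpacked_file": "/tmp/out/100001-0/1.1.a.exe.x.unpack", "pid": "4321",
         "type": "Snapshot taken on unpacking (modifying executable sections or adding new ones)"},
        {"unpacked_file": "/tmp/out/100001-0/1.2.a.exe.y.unpack", "pid": "4321",
         "type": "Snapshot at the end of execution"},
        {"unpacked_file": "/tmp/out/100001-0/9.0.b.exe.z.unpack", "pid": "unknown",
         "type": "Snapshot at beginning of execution"},
    ],
    "note": "Extraction completed successfully.",
}

summary = summarize_result(sample_result)
failure = summarize_result({"error": "Failed to download unpacked files."})
```

Note that `pid` can be the string `"unknown"` when a file's target ID is not found in the process tree, so it is safer to treat it as a string than as an integer.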
Input Schema
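| Name | Required | Description | Default |
|---|---|---|---|
| webid | Yes | The submission ID of the analysis. | |
| run | No | Index of the sandbox run to process (typically 0 for the first run). | 0 |
| save_path | No | Base directory in which to store the unpacked files. | |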
Implementation Reference
- jbxmcp/tools.py:703-743 (handler) The primary MCP tool handler for 'get_unpacked_files', decorated with @mcp.tool(). It delegates to the core download_unpacked_files function and converts any exception into an error dictionary.

      @mcp.tool()
      async def get_unpacked_files(webid: str, run: int = 0, save_path: Optional[str] = None) -> Dict[str, Any]:
          """
          Retrieve and classify in-memory unpacked binaries from a sandbox analysis.

          This tool extracts executable artifacts that were unpacked in memory during the dynamic execution of the submitted sample. These binaries typically reflect runtime-decrypted payloads or memory-resident code generated by the sample or its child processes.

          Each extracted file is associated with:
          - The process ID (pid) responsible for its memory region.
          - A classification that indicates **when** during execution the memory snapshot was taken.

          If a custom `save_path` is provided, the files are saved under `{save_path}/{webid}-{run}`. If the path is invalid or inaccessible, a fallback directory under `unpacked_files/{webid}-{run}` is used instead.

          Snapshot types:
          - "Snapshot at beginning of execution": Memory captured at process start.
          - "Snapshot taken on unpacking (modifying executable sections or adding new ones)": Captured at runtime after self-modifying code or section manipulation.
          - "Snapshot at the end of execution": Captured near process termination.
          - "Snapshot taken when memory gets freed": Captured when memory regions were released.

          Args:
              webid (required): The submission ID of the analysis.
              run (optional, default = 0): Index of the sandbox run to process (typically 0 for the first run).
              save_path (optional): Optional base directory to store the unpacked files. If not valid, a default directory is used.

          Returns:
              A dictionary containing:
              - output_directory: Absolute path where the files were saved.
              - files: A list of unpacked file entries, each with:
                  - unpacked_file: Absolute path to the file on disk.
                  - pid: ID of the process associated with the memory region.
                  - type: A human-readable label describing when the snapshot was taken.
              - note: A message indicating whether the fallback directory was used.
          """
          try:
              return await download_unpacked_files(webid, run, save_path)
          except Exception as e:
              return {
                  "error": f"Failed to download unpacked files for submission ID '{webid}' run {run}. "
                           f"Reason: {str(e)}"
              }
- jbxmcp/core.py:325-380 (helper) Core implementation that downloads the unpackpe ZIP from the Joe Sandbox API, extracts the files, parses metadata from the filenames to associate them with PIDs and snapshot types using process tree data, and saves them to disk.

      async def download_unpacked_files(webid: str, run: Optional[int] = 0, save_path: Optional[str] = None) -> Dict[str, Any]:
          jbx_client = get_client()
          _, data = jbx_client.analysis_download(webid=webid, run=run, type='unpackpe')

          default_output_dir = os.path.join("unpacked_files", f"{webid}-{run}")
          output_dir = default_output_dir
          used_default_path = False

          root = await get_or_fetch_report(webid, run)
          proc_tree = extract_process_tree(root)
          targetid_to_pid = flatten_process_tree(proc_tree)

          if save_path:
              try:
                  output_dir = os.path.join(save_path, f"{webid}-{run}")
                  os.makedirs(output_dir, exist_ok=True)
              except (OSError, FileNotFoundError):
                  output_dir = default_output_dir
                  os.makedirs(output_dir, exist_ok=True)
                  used_default_path = True
          else:
              os.makedirs(output_dir, exist_ok=True)

          # Extract files and associate them with process IDs and frame stages
          unpacked_files_info = []
          with zipfile.ZipFile(io.BytesIO(data)) as zf:
              zf.extractall(path=output_dir, pwd=b"infected")
              for name in zf.namelist():
                  if name.endswith('/') or '.raw.' in name:
                      continue
                  base = os.path.basename(name)
                  metadata = extract_unpack_filename_metadata(base)
                  if metadata is None:
                      continue
                  targetid = metadata["targetid"]
                  frame_label = metadata["frame_label"]
                  pid = targetid_to_pid.get(targetid, "unknown")
                  full_path = os.path.abspath(os.path.join(output_dir, name))
                  unpacked_files_info.append({
                      "unpacked_file": full_path,
                      "pid": pid,
                      "type": frame_label
                  })

          note = (
              "User-provided save_path was invalid. Default directory was used."
              if used_default_path
              else "Extraction completed successfully."
          )
          return {
              "output_directory": os.path.abspath(output_dir),
              "files": unpacked_files_info,
              "note": note
          }
- jbxmcp/core.py:303-323 (helper) Supporting helper that parses an unpacked file's name, extracting the process targetid and determining the snapshot type (e.g., 'Snapshot at the end of execution').

      def extract_unpack_filename_metadata(filename: str) -> Optional[Dict[str, Any]]:
          """
          Extract the targetid and frame id from the filename pattern:
          e.g., '1.2.filename.exe.abc.unpack' → targetid='1', frame_id=2
          """
          frame_map = {
              -1: "UNKNOWN",
              0: "Snapshot at beginning of execution",
              1: "Snapshot taken on unpacking (modifying executable sections or adding new ones)",
              2: "Snapshot at the end of execution",
              3: "Snapshot taken when memory gets freed"
          }
          match = re.match(r'^(\d+)\.(\d+)\..+\.unpack$', filename)
          if not match:
              return None
          targetid, frame_id = match.groups()
          frame_id = int(frame_id)
          return {
              "targetid": targetid,
              "frame_label": frame_map.get(frame_id, "UNKNOWN")
          }
- jbxmcp/core.py:289-302 (helper) Supporting helper that traverses the process tree and creates a targetid-to-PID mapping used for associating unpacked files with processes.

      def flatten_process_tree(proc_tree: List[Dict[str, Any]]) -> Dict[str, str]:
          """
          Flatten the process tree and return a mapping from targetid to process ID (pid).
          """
          targetid_to_pid = {}
          queue = deque(proc_tree)
          while queue:
              node = queue.popleft()
              if "targetid" in node and "pid" in node:
                  targetid_to_pid[str(node["targetid"])] = str(node["pid"])
              if "children" in node:
                  queue.extend(node["children"])
          return targetid_to_pid
- jbxmcp/server.py:19-20 (registration) Import of the tools module in the server entrypoint, which loads and registers all @mcp.tool()-decorated functions, including get_unpacked_files.

      import jbxmcp.tools as tools
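To see how the filename parsing and the process tree mapping combine, the following condensed sketch restates the two helpers and runs them on sample data. The function names `map_targetids` and `classify`, the process tree, and the ZIP member names are all hypothetical; only the regular expression, the label table, and the skip rules are taken from the code referenced above.

```python
import re
from collections import deque
from typing import Any, Dict, List

FRAME_LABELS = {
    0: "Snapshot at beginning of execution",
    1: "Snapshot taken on unpacking (modifying executable sections or adding new ones)",
    2: "Snapshot at the end of execution",
    3: "Snapshot taken when memory gets freed",
}
UNPACK_NAME = re.compile(r"^(\d+)\.(\d+)\..+\.unpack$")


def map_targetids(tree: List[Dict[str, Any]]) -> Dict[str, str]:
    """Breadth-first walk that maps each targetid to its pid, both as strings."""
    mapping: Dict[str, str] = {}
    queue = deque(tree)
    while queue:
        node = queue.popleft()
        if "targetid" in node and "pid" in node:
            mapping[str(node["targetid"])] = str(node["pid"])
        queue.extend(node.get("children", []))
    return mapping


def classify(names: List[str], tree: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Attach a pid and a snapshot label to every matching member name."""
    pids = map_targetids(tree)
    rows = []
    for name in names:
        if name.endswith("/") or ".raw." in name:
            continue  # skip directory entries and raw dumps
        match = UNPACK_NAME.match(name.rsplit("/", 1)[-1])
        if match is None:
            continue  # not an unpacked-file name
        targetid, frame_id = match.group(1), int(match.group(2))
        rows.append({
            "name": name,
            "pid": pids.get(targetid, "unknown"),
            "type": FRAME_LABELS.get(frame_id, "UNKNOWN"),
        })
    return rows


# Hypothetical process tree and archive listing, invented for illustration.
sample_tree = [{"targetid": 1, "pid": 4321,
                "children": [{"targetid": 3, "pid": 5012, "children": []}]}]
sample_names = [
    "dumps/",
    "dumps/1.1.sample.exe.400000.unpack",
    "dumps/3.0.child.exe.abc.unpack",
    "dumps/1.2.sample.exe.raw.unpack",
    "dumps/7.3.ghost.exe.def.unpack",
    "dumps/notes.txt",
]
rows = classify(sample_names, sample_tree)
```

In this sample, the member with target ID 7 has no node in the tree, so it is reported with the pid `"unknown"` rather than dropped.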