
mitmproxy-mcp MCP Server

by lucasoeth

extract_json_fields

Extract specific data fields from JSON content in HTTP flows using JSONPath expressions to isolate and retrieve targeted information.

Instructions

Extract specific fields from JSON content in a flow using JSONPath expressions

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| session_id | Yes | The ID of the session | |
| flow_index | Yes | The index of the flow | |
| content_type | Yes | Whether to extract from request or response content | |
| json_paths | Yes | JSONPath expressions to extract (e.g. ['$.data.users', '$.metadata.timestamp']) | |
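
As an illustration of the schema, the arguments for one call might look like the sketch below. The session ID, flow index, and paths are placeholder values rather than data from a real capture, and `missing_arguments` is a hypothetical helper that only mirrors the presence checks at the top of the handler.

```python
# Hypothetical example arguments for extract_json_fields.
# All values are placeholders; only the key names come from the input schema.
arguments = {
    "session_id": "example-session",   # placeholder session ID
    "flow_index": 0,                   # index of the flow within the session
    "content_type": "response",        # must be "request" or "response"
    "json_paths": ["$.data.users", "$.metadata.timestamp"],
}

REQUIRED = ("session_id", "flow_index", "content_type", "json_paths")


def missing_arguments(args: dict) -> list[str]:
    """Return the required keys that are absent or empty.

    flow_index is compared against None only, because 0 is a valid index.
    """
    missing = []
    for key in REQUIRED:
        value = args.get(key)
        if key == "flow_index":
            if value is None:
                missing.append(key)
        elif not value:
            missing.append(key)
    return missing
```

Note that an empty `json_paths` list counts as missing under these checks, just as it does in the handler's `if not json_paths` test.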

Implementation Reference

  • The core handler function that executes the 'extract_json_fields' tool. It retrieves the flow, parses JSON content from request or response, and extracts specified fields using JSONPath expressions via extract_with_jsonpath.
    async def extract_json_fields(arguments: dict) -> list[types.TextContent]:
        """
        Extract specific fields from JSON content in a flow using JSONPath expressions.
        """
        session_id = arguments.get("session_id")
        flow_index = arguments.get("flow_index")
        content_type = arguments.get("content_type")
        json_paths = arguments.get("json_paths")
    
        if not session_id:
            return [types.TextContent(type="text", text="Error: Missing session_id")]
        if flow_index is None:
            return [types.TextContent(type="text", text="Error: Missing flow_index")]
        if not content_type:
            return [types.TextContent(type="text", text="Error: Missing content_type")]
        if not json_paths:
            return [types.TextContent(type="text", text="Error: Missing json_paths")]
    
        try:
            flows = await get_flows_from_dump(session_id)
            
            try:
                flow = flows[flow_index]
                
                if flow.type != "http":
                    return [types.TextContent(type="text", text=f"Error: Flow {flow_index} is not an HTTP flow")]
                
                request = flow.request
                response = flow.response
                
                # Determine which content to extract from
                content = None
                headers = None
                if content_type == "request":
                    content = request.content
                    headers = dict(request.headers)
                elif content_type == "response":
                    if not response:
                        return [types.TextContent(type="text", text=f"Error: Flow {flow_index} has no response")]
                    content = response.content
                    headers = dict(response.headers)
                else:
                    return [types.TextContent(type="text", text="Error: Invalid content_type. Must be 'request' or 'response'")]
                
                # Parse the content
                json_content = parse_json_content(content, headers)
                
                # Only extract from JSON content
                if not isinstance(json_content, (dict, list)):
                    return [types.TextContent(type="text", text=f"Error: The {content_type} content is not valid JSON")]
                
                # Extract fields
                result = {}
                for path in json_paths:
                    try:
                        extracted = extract_with_jsonpath(json_content, path)
                        result[path] = extracted
                    except Exception as e:
                        result[path] = f"Error extracting path: {str(e)}"
                
                return [types.TextContent(type="text", text=json.dumps(result, indent=2))]
                
            except IndexError:
                return [types.TextContent(type="text", text=f"Error: Flow index {flow_index} out of range")]
                
        except FileNotFoundError:
            return [types.TextContent(type="text", text="Error: Session not found")]
        except Exception as e:
            return [types.TextContent(type="text", text=f"Error extracting JSON fields: {str(e)}")]
  • Registration of the 'extract_json_fields' tool in the list_tools handler, including the input schema definition.
    types.Tool(
        name="extract_json_fields",
        description="Extract specific fields from JSON content in a flow using JSONPath expressions",
        inputSchema={
            "type": "object",
            "properties": {
                "session_id": {
                    "type": "string",
                    "description": "The ID of the session"
                },
                "flow_index": {
                    "type": "integer",
                    "description": "The index of the flow"
                },
                "content_type": {
                    "type": "string",
                    "enum": ["request", "response"],
                    "description": "Whether to extract from request or response content"
                },
                "json_paths": {
                    "type": "array",
                    "items": {
                        "type": "string"
                    },
                    "description": "JSONPath expressions to extract (e.g. ['$.data.users', '$.metadata.timestamp'])"
                }
            },
            "required": ["session_id", "flow_index", "content_type", "json_paths"]
        }
    ),
  • Dispatcher in the call_tool handler that routes calls to the extract_json_fields function.
    elif name == "extract_json_fields":
        return await extract_json_fields(arguments)
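
The `extract_with_jsonpath` helper is not shown in the excerpts above, so the per-path loop can only be sketched under assumptions. The version below swaps in `extract_simple`, a hypothetical stand-in that understands a small JSONPath subset (`$`, `.key`, `['key']`, `[index]`, `[*]`) and returns a list of matches. The real helper may rely on a full JSONPath library and may shape its results differently. What the sketch does take from the handler is the result layout: one entry per path, with a failure recorded as an error string under that path instead of aborting the whole call.

```python
import json
import re

# One path step: .name, [0], [*] or ['name']
_STEP = re.compile(r"\.(\w+)|\[(\d+)\]|\[\*\]|\['([^']+)'\]")


def extract_simple(data, path: str) -> list:
    """Hypothetical stand-in for extract_with_jsonpath (small subset only)."""
    if not path.startswith("$"):
        raise ValueError(f"path must start with '$': {path!r}")
    current = [data]
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at offset {pos}: {path!r}")
        key, index, quoted = match.group(1), match.group(2), match.group(3)
        matched = []
        for node in current:
            if index is not None:
                # [index]: keep the element if the node is a long enough list
                if isinstance(node, list) and int(index) < len(node):
                    matched.append(node[int(index)])
            elif key is not None or quoted is not None:
                # .key or ['key']: keep the member if the node is a dict
                name = key if key is not None else quoted
                if isinstance(node, dict) and name in node:
                    matched.append(node[name])
            elif isinstance(node, list):
                # [*] on a list: every element
                matched.extend(node)
            elif isinstance(node, dict):
                # [*] on a dict: every value
                matched.extend(node.values())
        current = matched
        pos = match.end()
    return current


def extract_fields(json_content, json_paths: list[str]) -> str:
    """Mirror the handler's loop: one result entry per path, errors inline."""
    result = {}
    for path in json_paths:
        try:
            result[path] = extract_simple(json_content, path)
        except Exception as exc:
            result[path] = f"Error extracting path: {exc}"
    return json.dumps(result, indent=2)
```

Calling `extract_fields(body, ["$.data.users[*].name", "data"])` on a parsed body would return a JSON object with a list of names under the first key and an `Error extracting path: ...` string under the second, since `data` does not start with `$`.
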
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states that the tool extracts fields but does not say what happens when the JSON content is missing or invalid, or when a JSONPath expression fails. It also omits permissions, rate limits, and the output format, leaving key behavioral traits unspecified for a tool that reads HTTP flow content.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose ('extract specific fields from JSON content in a flow') and method ('using JSONPath expressions'). There is no wasted text, and it directly communicates the tool's function without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a 4-parameter tool with no annotations and no output schema, the description is incomplete. It doesn't explain the return values, error handling, or behavioral nuances like what happens with invalid inputs. For a tool that processes JSON data, more context on output format and limitations is needed to be fully helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
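
The return values and error handling that the description leaves out can be read off the handler shown in the Implementation Reference: tool-level failures come back as a plain string beginning with "Error", while a success is a JSON object keyed by JSONPath expression, with any per-path failure stored as an "Error extracting path: ..." string. A client could separate these cases roughly as follows. `interpret_result` is a hypothetical helper, not part of the server, and it would misclassify a genuinely extracted string value that happens to start with the same prefix.

```python
import json

PATH_ERROR_PREFIX = "Error extracting path:"


def interpret_result(text: str):
    """Split the tool's text output into (ok, payload).

    Assumes the conventions visible in the handler: top-level failures are
    plain "Error..." strings, successes are a JSON object keyed by path.
    """
    if text.startswith("Error"):
        return False, text
    fields = json.loads(text)
    # A path that failed on its own is reported inline as an error string.
    failed = {
        path: value
        for path, value in fields.items()
        if isinstance(value, str) and value.startswith(PATH_ERROR_PREFIX)
    }
    extracted = {path: value for path, value in fields.items() if path not in failed}
    return True, {"extracted": extracted, "failed": failed}
```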

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all four parameters. The description adds minimal value beyond the schema by mentioning 'JSON content in a flow' and 'JSONPath expressions', which align with parameters like 'content_type' and 'json_paths'. However, it doesn't provide additional syntax, examples, or constraints beyond what's in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('extract specific fields') and resource ('JSON content in a flow'), specifying the method ('using JSONPath expressions'). It distinguishes from sibling tools like 'get_flow_details' or 'list_flows' by focusing on field extraction rather than retrieval or analysis. However, it doesn't explicitly contrast with 'analyze_protection', which might involve similar data handling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid session or flow), exclusions, or compare to siblings like 'analyze_protection' for JSON analysis. Usage is implied only by the action, with no explicit context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/lucasoeth/mitmproxy-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.