
analyze_protection

Detect and analyze bot protection mechanisms in captured web flows, extracting challenge details and, optionally, embedded JavaScript for security insights.

Instructions

Analyze flow for bot protection mechanisms and extract challenge details

Input Schema

Name             Required  Default  Description
extract_scripts  No        true     Whether to extract and analyze JavaScript from the response
flow_index       Yes       -        The index of the flow to analyze
session_id       Yes       -        The ID of the session
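Under this schema, the arguments for a call might look like the following sketch (the session ID shown is an invented placeholder, not a real session):

```python
import json

# Hypothetical arguments for an analyze_protection call; "sess-1234"
# is a made-up placeholder value.
arguments = {
    "session_id": "sess-1234",   # required, string
    "flow_index": 0,             # required, integer
    "extract_scripts": True,     # optional, defaults to true
}

# Minimal check mirroring the schema's "required" list
missing = [k for k in ("session_id", "flow_index") if k not in arguments]
print(json.dumps(arguments), "missing:", missing)
```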

Implementation Reference

  • The primary handler function for the 'analyze_protection' tool. It retrieves the specified flow from a mitmproxy session, analyzes request/response headers, cookies, and body content for protection signatures, optionally extracts and analyzes embedded JavaScript, and generates remediation suggestions via imported helper functions.
    async def analyze_protection(arguments: dict) -> list[types.TextContent]:
        """
        Analyze a flow for bot protection mechanisms and extract challenge details.
        """
        session_id = arguments.get("session_id")
        flow_index = arguments.get("flow_index")
        extract_scripts = arguments.get("extract_scripts", True)

        if not session_id:
            return [types.TextContent(type="text", text="Error: Missing session_id")]
        if flow_index is None:
            return [types.TextContent(type="text", text="Error: Missing flow_index")]

        try:
            flows = await get_flows_from_dump(session_id)
            try:
                flow = flows[flow_index]
                if flow.type != "http":
                    return [types.TextContent(type="text", text=f"Error: Flow {flow_index} is not an HTTP flow")]

                # Analyze the flow for protection mechanisms
                analysis = {
                    "flow_index": flow_index,
                    "method": flow.request.method,
                    "url": flow.request.url,
                    "protection_systems": identify_protection_system(flow),
                    "request_cookies": analyze_cookies(dict(flow.request.headers)),
                    "has_response": flow.response is not None,
                }

                if flow.response:
                    # Add response analysis
                    content_type = flow.response.headers.get("Content-Type", "")
                    is_html = "text/html" in content_type
                    analysis.update({
                        "status_code": flow.response.status_code,
                        "response_cookies": analyze_cookies(dict(flow.response.headers)),
                        "challenge_analysis": analyze_response_for_challenge(flow),
                        "content_type": content_type,
                        "is_html": is_html,
                    })

                    # If HTML and script extraction is requested, extract and analyze JavaScript
                    if is_html and extract_scripts:
                        try:
                            html_content = flow.response.content.decode('utf-8', errors='ignore')
                            analysis["scripts"] = extract_javascript(html_content)
                        except Exception as e:
                            analysis["script_extraction_error"] = str(e)

                # Add remediation suggestions based on findings
                analysis["suggestions"] = generate_suggestions(analysis)

                return [types.TextContent(type="text", text=json.dumps(analysis, indent=2))]
            except IndexError:
                return [types.TextContent(type="text", text=f"Error: Flow index {flow_index} out of range")]
        except FileNotFoundError:
            return [types.TextContent(type="text", text="Error: Session not found")]
        except Exception as e:
            return [types.TextContent(type="text", text=f"Error analyzing protection: {str(e)}")]
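On success the handler returns a single TextContent item whose text is a JSON document. A sketch of that document's shape, with illustrative, invented field values (not output from a real session):

```python
import json

# Illustrative output shape only; every value below is a made-up example,
# not data captured from a real flow.
analysis = {
    "flow_index": 0,
    "method": "GET",
    "url": "https://example.com/",
    "protection_systems": [
        {"vendor": "Cloudflare", "confidence": 66.7,
         "matching_signatures": ["cf_clearance"]},
    ],
    "has_response": True,
    "status_code": 403,
    "content_type": "text/html; charset=utf-8",
    "is_html": True,
    "suggestions": ["Detected Cloudflare with 66.7% confidence."],
}

# The tool serializes the dict exactly like this before returning it
text = json.dumps(analysis, indent=2)
print(text)
```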
  • Registration of the 'analyze_protection' tool in the list_tools() method, including its name, description, and JSON schema for input validation (session_id, flow_index, optional extract_scripts).
    types.Tool(
        name="analyze_protection",
        description="Analyze flow for bot protection mechanisms and extract challenge details",
        inputSchema={
            "type": "object",
            "properties": {
                "session_id": {
                    "type": "string",
                    "description": "The ID of the session"
                },
                "flow_index": {
                    "type": "integer",
                    "description": "The index of the flow to analyze"
                },
                "extract_scripts": {
                    "type": "boolean",
                    "description": "Whether to extract and analyze JavaScript from the response (default: true)",
                    "default": True
                }
            },
            "required": ["session_id", "flow_index"]
        }
    )
  • Dispatch/registration logic within the @server.call_tool() handler that routes calls to the analyze_protection function.
    elif name == "analyze_protection":
        return await analyze_protection(arguments)
  • Helper function called by the handler to identify bot protection vendors (Cloudflare, Akamai, etc.) by scanning flow headers and content against predefined regex signatures, returning a ranked list with confidence scores.
    def identify_protection_system(flow) -> List[Dict[str, Any]]:
        """
        Identify potential bot protection systems based on signatures.
        """
        protections = []

        # Combine all searchable content
        searchable_content = ""

        # Add request headers
        for k, v in flow.request.headers.items():
            searchable_content += f"{k}: {v}\n"

        # Check response if available
        if flow.response:
            # Add response headers
            for k, v in flow.response.headers.items():
                searchable_content += f"{k}: {v}\n"

            # Add response content if it's text
            content_type = flow.response.headers.get("Content-Type", "")
            if "text" in content_type or "javascript" in content_type or "json" in content_type:
                try:
                    searchable_content += flow.response.content.decode('utf-8', errors='ignore')
                except Exception:
                    pass

        # Check for protection signatures
        for vendor, signatures in BOT_PROTECTION_SIGNATURES.items():
            matches = []
            for sig in signatures:
                if re.search(sig, searchable_content, re.IGNORECASE):
                    matches.append(sig)
            if matches:
                protections.append({
                    "vendor": vendor,
                    "confidence": len(matches) / len(signatures) * 100,
                    "matching_signatures": matches
                })

        return sorted(protections, key=lambda x: x["confidence"], reverse=True)
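The BOT_PROTECTION_SIGNATURES constant referenced above is not shown on this page. A plausible sketch of its shape, with the same matching and confidence logic applied to invented data (the vendors and regex patterns here are assumptions, not the server's real table):

```python
import re

# Hypothetical sketch of the signature table; the real constant in the
# server may use different vendors and patterns.
BOT_PROTECTION_SIGNATURES = {
    "Cloudflare": [r"cf_clearance", r"__cf_bm", r"challenge-platform"],
    "DataDome": [r"datadome", r"dd_cookie"],
}

# Invented sample content standing in for concatenated headers + body
searchable_content = "Set-Cookie: cf_clearance=abc\nsrc=/cdn-cgi/challenge-platform/x.js"

protections = []
for vendor, signatures in BOT_PROTECTION_SIGNATURES.items():
    matches = [sig for sig in signatures
               if re.search(sig, searchable_content, re.IGNORECASE)]
    if matches:
        protections.append({
            "vendor": vendor,
            # fraction of the vendor's signatures that matched, as a percentage
            "confidence": len(matches) / len(signatures) * 100,
            "matching_signatures": matches,
        })

protections.sort(key=lambda x: x["confidence"], reverse=True)
print(protections)
```

With this sample, two of Cloudflare's three patterns match, so it is reported at roughly 66.7% confidence, while DataDome (zero matches) is omitted entirely.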
  • Helper function that generates vendor-specific and general remediation suggestions based on the analysis results, including handling for challenges, cookies, and script analysis.
    def generate_suggestions(analysis: Dict[str, Any]) -> List[str]:
        """
        Generate remediation suggestions based on the protection analysis.
        """
        suggestions = []

        # Check if any protection system was detected
        if analysis.get("protection_systems"):
            top_system = analysis["protection_systems"][0]["vendor"]
            confidence = analysis["protection_systems"][0]["confidence"]
            if confidence > 50:
                suggestions.append(f"Detected {top_system} with {confidence:.1f}% confidence.")

                # Add vendor-specific suggestions
                if "Cloudflare" in top_system:
                    suggestions.append("Cloudflare often uses JavaScript challenges. Check for cf_clearance cookie.")
                    suggestions.append("Consider using proven techniques like cfscrape or cloudscraper libraries.")
                elif "Akamai" in top_system:
                    suggestions.append("Akamai uses sensor_data for browser fingerprinting.")
                    suggestions.append("Focus on _abck cookie which contains browser verification data.")
                elif "PerimeterX" in top_system:
                    suggestions.append("PerimeterX relies on JavaScript execution and browser fingerprinting.")
                    suggestions.append("Look for _px cookies which are essential for session validation.")
                elif "DataDome" in top_system:
                    suggestions.append("DataDome uses advanced behavioral and fingerprinting techniques.")
                    suggestions.append("The datadome cookie is critical for maintaining sessions.")
                elif "CAPTCHA" in top_system:
                    suggestions.append("This site uses CAPTCHA challenges which may require manual solving or specialized services.")

        # Add suggestions based on challenge type
        if analysis.get("challenge_analysis", {}).get("is_challenge", False):
            challenge_type = analysis.get("challenge_analysis", {}).get("challenge_type", "unknown")
            if challenge_type == "javascript":
                suggestions.append("This response contains a JavaScript challenge that must be solved.")
                suggestions.append("Consider using a headless browser to execute the challenge JavaScript.")

                # If we have script analysis, add more specific suggestions
                if "scripts" in analysis:
                    obfuscated_scripts = [
                        s for s in analysis["scripts"]
                        if s.get("summary", {}).get("obfuscation_level") in ["medium", "high"]
                    ]
                    if obfuscated_scripts:
                        suggestions.append(f"Found {len(obfuscated_scripts)} obfuscated script(s) that likely contain challenge logic.")

                    fingerprinting_scripts = [
                        s for s in analysis["scripts"]
                        if s.get("summary", {}).get("fingerprinting_indicators")
                    ]
                    if fingerprinting_scripts:
                        techniques = set()
                        for script in fingerprinting_scripts:
                            techniques.update(script.get("summary", {}).get("fingerprinting_indicators", []))
                        suggestions.append(f"Detected browser fingerprinting techniques: {', '.join(techniques)}.")
            elif challenge_type == "captcha":
                suggestions.append("This response contains a CAPTCHA challenge.")
                suggestions.append("Consider using a CAPTCHA solving service or manual intervention.")

        # Check for important cookies
        protection_cookies = [c for c in analysis.get("response_cookies", []) if c.get("protection_related")]
        if protection_cookies:
            cookie_names = [c["name"] for c in protection_cookies]
            suggestions.append(f"Important protection cookies to maintain: {', '.join(cookie_names)}.")

        # General suggestions
        if analysis.get("protection_systems") or analysis.get("challenge_analysis", {}).get("is_challenge", False):
            suggestions.append("General recommendations:")
            suggestions.append("- Maintain consistent User-Agent between requests")
            suggestions.append("- Preserve all cookies from the session")
            suggestions.append("- Add appropriate referer and origin headers")
            suggestions.append("- Consider adding delays between requests to avoid rate limiting")
            suggestions.append("- Use rotating IP addresses if available")

        return suggestions
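The cookie step above filters the analysis's "response_cookies" entries for those flagged protection_related. Isolated with invented sample data (the real analyze_cookies() output may carry additional fields):

```python
# Invented sample of the "response_cookies" entries the function expects;
# only the two fields the filter reads are shown.
response_cookies = [
    {"name": "cf_clearance", "protection_related": True},
    {"name": "language", "protection_related": False},
    {"name": "_abck", "protection_related": True},
]

protection_cookies = [c for c in response_cookies if c.get("protection_related")]
cookie_names = [c["name"] for c in protection_cookies]
message = f"Important protection cookies to maintain: {', '.join(cookie_names)}."
print(message)
```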
