Glama

xpathwithurl

Extract specific data from XML/HTML content using XPath queries. Provide a URL and XPath expression to fetch and filter content from web pages or documents.

Instructions

Fetch content from a URL and select query it using XPath

Input Schema

Name | Required | Description | Default
mimeType | No | The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml) | text/html
query | Yes | The XPath query to execute | -
url | Yes | The URL to fetch XML/HTML content from | -
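
As a rough illustration of how the three parameters fit together, the sketch below builds an arguments object and applies the documented default for mimeType. The interface name, the withDefaults helper, and the example URL and query are hypothetical and not part of the server; the surrounding request shape assumes the standard MCP tools/call method.

```typescript
// Hypothetical type mirroring the three documented parameters.
interface XPathWithUrlArgs {
    url: string;
    query: string;
    mimeType?: string;
}

// Applies the documented default ("text/html") when mimeType is omitted.
function withDefaults(args: XPathWithUrlArgs): Required<XPathWithUrlArgs> {
    return { ...args, mimeType: args.mimeType ?? "text/html" };
}

// Example arguments; the URL and query are placeholders.
const exampleArgs = withDefaults({
    url: "https://example.com/",
    query: "//h1/text()",
});

// Method and params of an MCP tools/call request carrying those arguments.
const exampleCall = {
    method: "tools/call",
    params: { name: "xpathwithurl", arguments: exampleArgs },
};
```

For an XML feed rather than a web page, a caller would presumably pass mimeType explicitly (for example text/xml) instead of relying on the default.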

Implementation Reference

  • Handler for the 'xpathwithurl' tool. It parses the arguments, uses Puppeteer to fetch and render the page at the given URL, parses the rendered content as XML/HTML, runs the XPath query with the xpath library, handles errors and empty results, and serializes the output with resultToString.
    } else if (name === "xpathwithurl") {
        const { url, query, mimeType } = XPathWithUrlArgumentsSchema.parse(args);
        
        // Launch puppeteer browser
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        
        try {
            // Navigate to the URL and wait until network is idle
            await page.goto(url, { waitUntil: 'networkidle0' });
            
            // Get the rendered HTML
            const xml = await page.content();
            
            // Parse XML
            const parsedXml = parser.parseFromString(xml, mimeType);
            
            // Check for parsing errors
            const errors = xpath.select('//parsererror', parsedXml);
            if (Array.isArray(errors) && errors.length > 0) {
                return {
                    content: [{ type: "text", text: "XML parsing error: " + resultToString(errors[0]) }]
                };
            }
            
            const result = xpath.select(query, parsedXml);
            
            // If result is an empty array, provide more information
            if (Array.isArray(result) && result.length === 0) {
                return {
                    content: [{ type: "text", text: "No nodes matched the query." }]
                };
            }
            
            return {
                content: [{ type: "text", text: resultToString(result) }]
            };
        } catch (error: unknown) {
            const errorMessage = error instanceof Error ? error.message : String(error);
            return {
                content: [{ type: "text", text: `Error processing XPath query: ${errorMessage}` }]
            };
        } finally {
            // Make sure to close the browser
            await browser.close();
        }
  • Zod schema for validating input arguments to the xpathwithurl tool: url (required string URL), query (required string), mimeType (optional string, default 'text/html').
    const XPathWithUrlArgumentsSchema = z.object({
        url: z.string().url().describe("The URL to fetch XML/HTML content from"),
        query: z.string().describe("The XPath query to execute"),
        mimeType: z.string()
        .describe("The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)")
        .default("text/html")
    });
  • index.ts:100-122 (registration)
    Tool registration in ListToolsRequestHandler: defines name, description, and JSON inputSchema matching the Zod schema.
    {
        name: "xpathwithurl",
        description: "Fetch content from a URL and select query it using XPath",
        inputSchema: {
            type: "object",
            properties: {
                url: {
                    type: "string",
                    description: "The URL to fetch XML/HTML content from",
                },
                query: {
                    type: "string",
                    description: "The XPath query to execute",
                },
                mimeType: {
                    type: "string",
                    description: "The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)",
                    default: "text/html"
                }
            },
            required: ["url", "query"],
        },
    }
  • Utility function to convert XPath results (strings, numbers, booleans, DOM nodes/arrays) to a readable string format, used in both xpath and xpathwithurl handlers for output.
    function resultToString(result: string | number | boolean | Node | Node[] | null): string {
        if (result === null) {
            return "null";
        } else if (Array.isArray(result)) {
            return result.map(resultToString).join("\n");
        } else if (typeof result === 'object' && result.nodeType !== undefined) {
            // Handle DOM nodes
            if (result.nodeType === 1) { // Element node
                const serializer = new XMLSerializer();
                return serializer.serializeToString(result);
            } else if (result.nodeType === 2) { // Attribute node
                return `${result.nodeName}="${result.nodeValue}"`;
            } else if (result.nodeType === 3) { // Text node
                return result.nodeValue || "";
            } else {
                // Default fallback for other node types
                try {
                    const serializer = new XMLSerializer();
                    return serializer.serializeToString(result);
                } catch (e) {
                    return String(result);
                }
            }
        } else {
            return String(result);
        }
    }
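
To make the output format of resultToString easier to picture, here is a simplified, self-contained variant. It is only a sketch: NodeLike and formatResult are hypothetical names, and element serialization is replaced by a placeholder because XMLSerializer needs a DOM implementation that this snippet does not load.

```typescript
// Minimal stand-in for a DOM node; only the fields the formatter reads.
interface NodeLike {
    nodeType: number;
    nodeName: string;
    nodeValue: string | null;
}

type XPathValue = string | number | boolean | NodeLike | NodeLike[] | null;

// Simplified variant of resultToString (assumption: no real DOM available).
function formatResult(result: XPathValue): string {
    if (result === null) return "null";
    // Arrays are formatted item by item and joined with newlines.
    if (Array.isArray(result)) return result.map((item) => formatResult(item)).join("\n");
    if (typeof result === "object") {
        // Attribute node: name="value"
        if (result.nodeType === 2) return `${result.nodeName}="${result.nodeValue}"`;
        // Text node: raw text, or an empty string when the value is null
        if (result.nodeType === 3) return result.nodeValue || "";
        // Placeholder where the original serializes the node's markup.
        return `<${result.nodeName}/>`;
    }
    // Strings, numbers, and booleans fall through to String().
    return String(result);
}

const sample = formatResult([
    { nodeType: 2, nodeName: "href", nodeValue: "/docs" },
    { nodeType: 3, nodeName: "#text", nodeValue: "Docs" },
]);
```

Each matched node ends up on its own line, so a query that selects several attributes or text nodes yields a newline-separated list.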
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions fetching content and querying with XPath, but doesn't disclose important behavioral traits such as error handling, network timeouts, authentication needs, rate limits, or what happens with invalid URLs or queries. For a tool that performs network operations and data processing, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, consisting of a single sentence: 'Fetch content from a URL and select query it using XPath'. Every word contributes directly to explaining the tool's purpose, with no unnecessary information or redundancy. It efficiently communicates the core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (network fetching and XPath querying), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., query results, errors), behavioral aspects like performance or limitations, or how to interpret results. For a tool with no structured safety or output information, the description should provide more context to be fully helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
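
On the question of what the tool returns: the handler shown in the Implementation Reference always responds with a single text item, and three of its messages have fixed wording. A caller could tell them apart roughly as sketched below. The Outcome type and classifyOutcome function are hypothetical, and the prefix matching is a heuristic, since a genuine query result could in principle begin with the same words.

```typescript
// Hypothetical labels for the four kinds of text the handler can return.
type Outcome = "parse-error" | "no-match" | "handler-error" | "result";

// Classifies the text of a response using the literal messages
// that appear in the handler source.
function classifyOutcome(text: string): Outcome {
    if (text.startsWith("XML parsing error: ")) return "parse-error";
    if (text === "No nodes matched the query.") return "no-match";
    if (text.startsWith("Error processing XPath query: ")) return "handler-error";
    // Anything else is treated as serialized query output.
    return "result";
}
```

Because failures come back as ordinary text content in the code shown, a check along these lines is one way for an agent to avoid treating an error message as extracted data.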

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, clearly documenting all three parameters (url, query, mimeType) with their purposes and types. The description adds minimal value beyond the schema, as it only implies the parameters indirectly ('Fetch content from a URL' hints at 'url', 'select query it using XPath' hints at 'query'). It doesn't provide additional context like examples or constraints, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Fetch content from a URL and select query it using XPath'. It specifies the verb ('fetch', 'select query'), resource ('content from a URL'), and method ('using XPath'), making it easy to understand what the tool does. However, it doesn't explicitly differentiate from its sibling tool 'xpath', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention the sibling tool 'xpath' or any other tools, nor does it specify scenarios where this tool is preferred or should be avoided. The description only states what the tool does, not when to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/thirdstrandstudio/mcp-xpath'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.