xpathwithurl

Extract specific data from XML/HTML content using XPath queries. Provide a URL and XPath expression to fetch and filter content from web pages or documents.

Instructions

Fetch content from a URL and query it using XPath.

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| mimeType | No | The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml) | text/html |
| query | Yes | The XPath query to execute | |
| url | Yes | The URL to fetch XML/HTML content from | |
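
As an illustration of how these inputs fit together, a client could wrap them in an MCP `tools/call` request shaped roughly like the one below. This is a sketch only: `buildCall` and `XPathWithUrlArgs` are hypothetical names that are not part of the server, and the URL and XPath expression are placeholder values.

```typescript
// Arguments accepted by the tool, per the input schema above.
interface XPathWithUrlArgs {
    url: string;        // required
    query: string;      // required
    mimeType?: string;  // optional; the server falls back to "text/html"
}

// Hypothetical helper: wraps the arguments in an MCP "tools/call" request.
function buildCall(id: number, args: XPathWithUrlArgs) {
    return {
        jsonrpc: "2.0" as const,
        id,
        method: "tools/call" as const,
        params: { name: "xpathwithurl", arguments: args },
    };
}

// Sample values only: example.com and the query are placeholders.
const request = buildCall(1, {
    url: "https://example.com/",
    query: "//h1/text()",
});

console.log(JSON.stringify(request, null, 2));
```

Omitting `mimeType` here leaves the default to the server, which is why the field is optional in the interface.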

Implementation Reference

  • Handler for the 'xpathwithurl' tool. It parses the arguments, uses Puppeteer to fetch and render the page at the URL, parses the rendered content as XML/HTML using the given MIME type, runs the XPath query with the xpath library, and serializes the result with resultToString. Parse errors, empty results, and thrown exceptions each produce a plain text message.
    } else if (name === "xpathwithurl") {
        const { url, query, mimeType } = XPathWithUrlArgumentsSchema.parse(args);
        
        // Launch puppeteer browser
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        
        try {
            // Navigate to the URL and wait until network is idle
            await page.goto(url, { waitUntil: 'networkidle0' });
            
            // Get the rendered HTML
            const xml = await page.content();
            
            // Parse XML
            const parsedXml = parser.parseFromString(xml, mimeType);
            
            // Check for parsing errors
            const errors = xpath.select('//parsererror', parsedXml);
            if (Array.isArray(errors) && errors.length > 0) {
                return {
                    content: [{ type: "text", text: "XML parsing error: " + resultToString(errors[0]) }]
                };
            }
            
            const result = xpath.select(query, parsedXml);
            
            // If result is an empty array, provide more information
            if (Array.isArray(result) && result.length === 0) {
                return {
                    content: [{ type: "text", text: "No nodes matched the query." }]
                };
            }
            
            return {
                content: [{ type: "text", text: resultToString(result) }]
            };
        } catch (error: unknown) {
            const errorMessage = error instanceof Error ? error.message : String(error);
            return {
                content: [{ type: "text", text: `Error processing XPath query: ${errorMessage}` }]
            };
        } finally {
            // Make sure to close the browser
            await browser.close();
        }
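
    The control flow of the handler above can be sketched without Puppeteer or an XML parser by injecting stand-ins for both. This is an assumption-laden simplification, not the server's code: `PageSource`, `Select`, and `runQuery` are hypothetical names, the parser-error check is left out, and results are joined as plain strings instead of going through resultToString.

```typescript
type ToolResponse = { content: { type: "text"; text: string }[] };

// Hypothetical stand-ins for the browser and the XPath engine, so the flow
// can be exercised without Puppeteer or an XML parser installed.
interface PageSource {
    load(url: string): Promise<string>; // returns rendered markup
    close(): Promise<void>;
}
type Select = (query: string, markup: string) => unknown;

async function runQuery(
    source: PageSource,
    select: Select,
    url: string,
    query: string,
): Promise<ToolResponse> {
    const text = (t: string): ToolResponse => ({ content: [{ type: "text", text: t }] });
    try {
        const markup = await source.load(url);
        const result = select(query, markup);
        // An empty array gets an explicit message instead of an empty string.
        if (Array.isArray(result) && result.length === 0) {
            return text("No nodes matched the query.");
        }
        return text(Array.isArray(result) ? result.join("\n") : String(result));
    } catch (error: unknown) {
        // Failures are reported as tool output rather than rethrown.
        const message = error instanceof Error ? error.message : String(error);
        return text(`Error processing XPath query: ${message}`);
    } finally {
        // Runs on success, empty result, and error alike.
        await source.close();
    }
}
```

    The point of the sketch is the `finally` clause: whichever branch returns, the resource is released exactly once per call, which mirrors the `browser.close()` call in the handler.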
  • Zod schema for validating input arguments to the xpathwithurl tool: url (required string URL), query (required string), mimeType (optional string, default 'text/html').
    const XPathWithUrlArgumentsSchema = z.object({
        url: z.string().url().describe("The URL to fetch XML/HTML content from"),
        query: z.string().describe("The XPath query to execute"),
        mimeType: z.string()
        .describe("The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)")
        .default("text/html")
    });
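
    To make the schema's effect concrete, here is a dependency-free sketch of roughly what validation does: it rejects a missing or malformed url, requires query, and fills in mimeType with "text/html" when absent. `parseArgs` and `ParsedArgs` are hypothetical names; the server itself calls `XPathWithUrlArgumentsSchema.parse`, and Zod's error messages and exact URL rules will differ from this approximation.

```typescript
interface ParsedArgs {
    url: string;
    query: string;
    mimeType: string;
}

// Hypothetical, hand-written approximation of the Zod schema's behaviour.
function parseArgs(input: unknown): ParsedArgs {
    if (typeof input !== "object" || input === null) {
        throw new Error("arguments must be an object");
    }
    const { url, query, mimeType } = input as Record<string, unknown>;

    if (typeof url !== "string") throw new Error("url is required");
    new URL(url); // throws a TypeError on a malformed URL

    if (typeof query !== "string") throw new Error("query is required");

    // Apply the default only when the caller omitted the field.
    let resolvedMimeType = "text/html";
    if (mimeType !== undefined) {
        if (typeof mimeType !== "string") throw new Error("mimeType must be a string");
        resolvedMimeType = mimeType;
    }
    return { url, query, mimeType: resolvedMimeType };
}
```

    After parsing, `mimeType` is always a string, so the handler can pass it to the parser without a further check.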
  • index.ts:100-122 (registration)
    Tool registration in ListToolsRequestHandler: defines name, description, and JSON inputSchema matching the Zod schema.
    {
        name: "xpathwithurl",
        description: "Fetch content from a URL and select query it using XPath",
        inputSchema: {
            type: "object",
            properties: {
                url: {
                    type: "string",
                    description: "The URL to fetch XML/HTML content from",
                },
                query: {
                    type: "string",
                    description: "The XPath query to execute",
                },
                mimeType: {
                    type: "string",
                    description: "The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)",
                    default: "text/html"
                }
            },
            required: ["url", "query"],
        },
    }
  • Utility function to convert XPath results (strings, numbers, booleans, DOM nodes/arrays) to a readable string format, used in both xpath and xpathwithurl handlers for output.
    function resultToString(result: string | number | boolean | Node | Node[] | null): string {
        if (result === null) {
            return "null";
        } else if (Array.isArray(result)) {
            return result.map(resultToString).join("\n");
        } else if (typeof result === 'object' && result.nodeType !== undefined) {
            // Handle DOM nodes
            if (result.nodeType === 1) { // Element node
                const serializer = new XMLSerializer();
                return serializer.serializeToString(result);
            } else if (result.nodeType === 2) { // Attribute node
                return `${result.nodeName}="${result.nodeValue}"`;
            } else if (result.nodeType === 3) { // Text node
                return result.nodeValue || "";
            } else {
                // Default fallback for other node types
                try {
                    const serializer = new XMLSerializer();
                    return serializer.serializeToString(result);
                } catch (e) {
                    return String(result);
                }
            }
        } else {
            return String(result);
        }
    }
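
The branching in resultToString can be tried out on plain objects with the simplified sketch below. It is an approximation under stated assumptions: `NodeLike` and `describeResult` are hypothetical names, and element nodes are rendered as a placeholder tag because the real function serializes them with XMLSerializer, which this sketch does not depend on.

```typescript
// Minimal structural stand-in for a DOM node (hypothetical).
interface NodeLike {
    nodeType: number;
    nodeName: string;
    nodeValue: string | null;
}
type XPathValue = string | number | boolean | NodeLike | NodeLike[] | null;

function describeResult(result: XPathValue): string {
    if (result === null) return "null";
    // Node lists are formatted item by item, one per line.
    if (Array.isArray(result)) return result.map((node) => describeResult(node)).join("\n");
    if (typeof result === "object") {
        if (result.nodeType === 2) return `${result.nodeName}="${result.nodeValue}"`; // attribute
        if (result.nodeType === 3) return result.nodeValue || "";                     // text
        return `<${result.nodeName}/>`; // placeholder for XMLSerializer output
    }
    // Strings, numbers, and booleans from XPath functions such as count().
    return String(result);
}

const sample: NodeLike[] = [
    { nodeType: 2, nodeName: "href", nodeValue: "/docs" },
    { nodeType: 3, nodeName: "#text", nodeValue: "Docs" },
];
console.log(describeResult(sample));
```

Attribute nodes come out as `name="value"` pairs and text nodes as their raw value, so a query such as `//a/@href` yields one attribute per line.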

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/thirdstrandstudio/mcp-xpath'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.