xpathwithurl

Extract specific data from XML/HTML content using XPath queries. Provide a URL and XPath expression to fetch and filter content from web pages or documents.

Instructions

Fetch content from a URL and query it using XPath.
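As an illustration, a client could invoke this tool with an MCP `tools/call` request along the lines of the sketch below. The helper name `buildXPathWithUrlCall`, the request id, and the example URL and query are assumptions made for this sketch and are not part of the server.

```typescript
// Hypothetical argument type; field names follow the tool's input schema.
interface XPathWithUrlArgs {
  url: string;
  query: string;
  mimeType?: string; // the server falls back to "text/html" when omitted
}

// Build a JSON-RPC 2.0 "tools/call" request targeting the xpathwithurl tool.
function buildXPathWithUrlCall(id: number, args: XPathWithUrlArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "xpathwithurl", arguments: args },
  };
}

// Example: ask for every link target on a page (placeholder URL).
const request = buildXPathWithUrlCall(1, {
  url: "https://example.com/",
  query: "//a/@href",
});

console.log(JSON.stringify(request, null, 2));
```

How the request is transported (stdio, HTTP, or an SDK client) depends on the MCP client in use and is left out here.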

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| mimeType | No | The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml) | text/html |
| query | Yes | The XPath query to execute | |
| url | Yes | The URL to fetch XML/HTML content from | |
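The required and default columns above can be sketched as a small dependency-free normalizer. This is only an approximation of the server's schema validation: the function name `normalizeArgs`, the error messages, and the `http(s)` prefix check are assumptions made for this sketch.

```typescript
// Hypothetical shape of the arguments after validation.
interface XPathWithUrlInput {
  url: string;
  query: string;
  mimeType: string;
}

// Check the two required fields and fill in the documented default
// ("text/html") when mimeType is not supplied.
function normalizeArgs(raw: Record<string, unknown>): XPathWithUrlInput {
  const { url, query, mimeType } = raw;
  // Simplified URL check (assumption): only http:// and https:// are accepted here.
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    throw new Error("url is required and must be an http(s) URL");
  }
  if (typeof query !== "string") {
    throw new Error("query is required and must be a string");
  }
  const mime = mimeType === undefined ? "text/html" : mimeType;
  if (typeof mime !== "string") {
    throw new Error("mimeType must be a string when provided");
  }
  return { url, query, mimeType: mime };
}

// Omitting mimeType yields the default; supplying it keeps the given value.
const htmlArgs = normalizeArgs({
  url: "https://example.com/",
  query: "//title/text()",
});
const xmlArgs = normalizeArgs({
  url: "https://example.com/feed.xml",
  query: "//item/title/text()",
  mimeType: "text/xml",
});
```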

Implementation Reference

  • Handler for 'xpathwithurl' tool: parses arguments, uses Puppeteer to fetch and render page content from URL, parses as XML/HTML, executes XPath query using xpath library, handles errors and empty results, serializes output using resultToString.
    } else if (name === "xpathwithurl") {
      const { url, query, mimeType } = XPathWithUrlArgumentsSchema.parse(args);

      // Launch puppeteer browser
      const browser = await puppeteer.launch({ headless: true });
      const page = await browser.newPage();

      try {
        // Navigate to the URL and wait until network is idle
        await page.goto(url, { waitUntil: 'networkidle0' });

        // Get the rendered HTML
        const xml = await page.content();

        // Parse XML
        const parsedXml = parser.parseFromString(xml, mimeType);

        // Check for parsing errors
        const errors = xpath.select('//parsererror', parsedXml);
        if (Array.isArray(errors) && errors.length > 0) {
          return {
            content: [{ type: "text", text: "XML parsing error: " + resultToString(errors[0]) }]
          };
        }

        const result = xpath.select(query, parsedXml);

        // If result is an empty array, provide more information
        if (Array.isArray(result) && result.length === 0) {
          return {
            content: [{ type: "text", text: "No nodes matched the query." }]
          };
        }

        return {
          content: [{ type: "text", text: resultToString(result) }]
        };
      } catch (error: unknown) {
        const errorMessage = error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: "text", text: `Error processing XPath query: ${errorMessage}` }]
        };
      } finally {
        // Make sure to close the browser
        await browser.close();
      }
  • Zod schema for validating input arguments to the xpathwithurl tool: url (required string URL), query (required string), mimeType (optional string, default 'text/html').
    const XPathWithUrlArgumentsSchema = z.object({
      url: z.string().url().describe("The URL to fetch XML/HTML content from"),
      query: z.string().describe("The XPath query to execute"),
      mimeType: z.string()
        .describe("The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)")
        .default("text/html")
    });
  • index.ts:100-122 (registration)
    Tool registration in ListToolsRequestHandler: defines name, description, and JSON inputSchema matching the Zod schema.
    {
      name: "xpathwithurl",
      description: "Fetch content from a URL and select query it using XPath",
      inputSchema: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "The URL to fetch XML/HTML content from",
          },
          query: {
            type: "string",
            description: "The XPath query to execute",
          },
          mimeType: {
            type: "string",
            description: "The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)",
            default: "text/html"
          }
        },
        required: ["url", "query"],
      },
    }
  • Utility function to convert XPath results (strings, numbers, booleans, DOM nodes/arrays) to a readable string format, used in both xpath and xpathwithurl handlers for output.
    function resultToString(result: string | number | boolean | Node | Node[] | null): string {
      if (result === null) {
        return "null";
      } else if (Array.isArray(result)) {
        return result.map(resultToString).join("\n");
      } else if (typeof result === 'object' && result.nodeType !== undefined) {
        // Handle DOM nodes
        if (result.nodeType === 1) {
          // Element node
          const serializer = new XMLSerializer();
          return serializer.serializeToString(result);
        } else if (result.nodeType === 2) {
          // Attribute node
          return `${result.nodeName}="${result.nodeValue}"`;
        } else if (result.nodeType === 3) {
          // Text node
          return result.nodeValue || "";
        } else {
          // Default fallback for other node types
          try {
            const serializer = new XMLSerializer();
            return serializer.serializeToString(result);
          } catch (e) {
            return String(result);
          }
        }
      } else {
        return String(result);
      }
    }
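
To show roughly what the tool's text output looks like for different result types, the sketch below mirrors the branching of `resultToString` using plain objects in place of real DOM nodes. The names `NodeLike` and `formatResult` are hypothetical, and element serialization is replaced by a placeholder string because no `XMLSerializer` is assumed to be available here.

```typescript
// Minimal stand-in for a DOM node; only the fields the formatter reads.
interface NodeLike {
  nodeType: number; // 1 = element, 2 = attribute, 3 = text
  nodeName: string;
  nodeValue: string | null;
}

type XPathValue = string | number | boolean | NodeLike | NodeLike[] | null;

// Simplified formatter following the same branches as resultToString.
function formatResult(result: XPathValue): string {
  if (result === null) return "null";
  // Node lists are formatted item by item, one per line.
  if (Array.isArray(result)) return result.map((n) => formatResult(n)).join("\n");
  if (typeof result === "object") {
    if (result.nodeType === 2) return `${result.nodeName}="${result.nodeValue}"`;
    if (result.nodeType === 3) return result.nodeValue || "";
    // Placeholder (assumption): the real handler serializes the node to markup.
    return `<${result.nodeName}/>`;
  }
  // Strings, numbers, and booleans (e.g. from count() or boolean()).
  return String(result);
}

const nodes: NodeLike[] = [
  { nodeType: 2, nodeName: "href", nodeValue: "/about" },
  { nodeType: 3, nodeName: "#text", nodeValue: "About us" },
];
const formatted = formatResult(nodes);
const counted = formatResult(3);
```

With these inputs, `formatted` holds `href="/about"` and `About us` on separate lines, and `counted` is the string `3`.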

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/thirdstrandstudio/mcp-xpath'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.