mcp-xpath

xpathwithurl

Extract specific data from XML/HTML content using XPath queries. Provide a URL and XPath expression to fetch and filter content from web pages or documents.

Instructions

Fetch content from a URL and select query it using XPath

Input Schema

Name	Required	Description	Default
`mimeType`	No	The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)	text/html
`query`	Yes	The XPath query to execute
`url`	Yes	The URL to fetch XML/HTML content from
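To make the default concrete, here is a minimal sketch of how a caller might assemble arguments matching the table above. The `XPathWithUrlArgs` type and the `withDefaults` helper are hypothetical names used only for illustration; they are not part of mcp-xpath, which validates input with its own Zod schema.

```typescript
// Hypothetical argument type mirroring the table above (not part of mcp-xpath).
interface XPathWithUrlArgs {
    url: string;        // required
    query: string;      // required
    mimeType?: string;  // optional, defaults to "text/html"
}

// Hypothetical helper: rejects malformed URLs and fills in the default MIME type.
function withDefaults(args: XPathWithUrlArgs): Required<XPathWithUrlArgs> {
    // new URL() throws a TypeError when the string is not a valid absolute URL.
    new URL(args.url);
    return { ...args, mimeType: args.mimeType ?? "text/html" };
}

const example = withDefaults({
    url: "https://example.com/",
    query: "//title/text()",
});
console.log(example.mimeType); // prints "text/html"
```

An explicit `mimeType` such as `text/xml` is passed through unchanged; only a missing value is replaced by the default.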

Implementation Reference

index.ts:168-213 (handler)

Handler for the 'xpathwithurl' tool. It parses the arguments, uses Puppeteer to fetch and render the page at the given URL, parses the rendered content as XML/HTML, and executes the XPath query with the xpath library. Parse errors, empty results, and exceptions each produce a distinct text message; successful results are serialized with resultToString.

} else if (name === "xpathwithurl") {
    const { url, query, mimeType } = XPathWithUrlArgumentsSchema.parse(args);
    
    // Launch puppeteer browser
    const browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    
    try {
        // Navigate to the URL and wait until network is idle
        await page.goto(url, { waitUntil: 'networkidle0' });
        
        // Get the rendered HTML
        const xml = await page.content();
        
        // Parse XML
        const parsedXml = parser.parseFromString(xml, mimeType);
        
        // Check for parsing errors
        const errors = xpath.select('//parsererror', parsedXml);
        if (Array.isArray(errors) && errors.length > 0) {
            return {
                content: [{ type: "text", text: "XML parsing error: " + resultToString(errors[0]) }]
            };
        }
        
        const result = xpath.select(query, parsedXml);
        
        // If result is an empty array, provide more information
        if (Array.isArray(result) && result.length === 0) {
            return {
                content: [{ type: "text", text: "No nodes matched the query." }]
            };
        }
        
        return {
            content: [{ type: "text", text: resultToString(result) }]
        };
    } catch (error: unknown) {
        const errorMessage = error instanceof Error ? error.message : String(error);
        return {
            content: [{ type: "text", text: `Error processing XPath query: ${errorMessage}` }]
        };
    } finally {
        // Make sure to close the browser
        await browser.close();
    }
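The handler above returns every outcome, including failures, as plain text content, so a caller has to inspect the text to tell them apart. The following is a minimal client-side sketch of that check. `classifyResponse` and the `Outcome` type are hypothetical and not part of mcp-xpath; the message strings are copied from the handler.

```typescript
// Hypothetical client-side helper (not part of mcp-xpath): classifies the
// text returned by the handler using the literal messages it emits.
type Outcome = "parse-error" | "no-match" | "failure" | "match";

function classifyResponse(text: string): Outcome {
    if (text.startsWith("XML parsing error: ")) return "parse-error";
    if (text === "No nodes matched the query.") return "no-match";
    if (text.startsWith("Error processing XPath query: ")) return "failure";
    // Anything else is treated as serialized query results.
    return "match";
}

console.log(classifyResponse("No nodes matched the query."));   // prints "no-match"
console.log(classifyResponse("<title>Example Domain</title>")); // prints "match"
```

This check is heuristic: a matched text node whose content happens to begin with one of these prefixes would be misclassified.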

index.ts:24-30 (schema)

Zod schema for validating input arguments to the xpathwithurl tool: url (required string URL), query (required string), mimeType (optional string, default 'text/html').

const XPathWithUrlArgumentsSchema = z.object({
    url: z.string().url().describe("The URL to fetch XML/HTML content from"),
    query: z.string().describe("The XPath query to execute"),
    mimeType: z.string()
    .describe("The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)")
    .default("text/html")
});

index.ts:100-122 (registration)

Tool registration in ListToolsRequestHandler: defines name, description, and JSON inputSchema matching the Zod schema.

{
    name: "xpathwithurl",
    description: "Fetch content from a URL and select query it using XPath",
    inputSchema: {
        type: "object",
        properties: {
            url: {
                type: "string",
                description: "The URL to fetch XML/HTML content from",
            },
            query: {
                type: "string",
                description: "The XPath query to execute",
            },
            mimeType: {
                type: "string",
                description: "The MIME type (e.g. text/xml, application/xml, text/html, application/xhtml+xml)",
                default: "text/html"
            }
        },
        required: ["url", "query"],
    },
}
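As an illustration of how a client might invoke the registered tool, the sketch below builds a JSON-RPC 2.0 request for MCP's `tools/call` method. The `id`, URL, and query values are placeholders chosen for the example.

```typescript
// A JSON-RPC 2.0 request invoking the tool through MCP's "tools/call" method.
// The id, URL, and query values are illustrative placeholders.
const request = {
    jsonrpc: "2.0",
    id: 1,
    method: "tools/call",
    params: {
        name: "xpathwithurl",
        arguments: {
            url: "https://example.com/",
            query: "//h1/text()",
            // mimeType is omitted, so the server falls back to "text/html".
        },
    },
};

console.log(JSON.stringify(request, null, 2));
```

Only `url` and `query` appear in the schema's `required` list, so the request is valid without `mimeType`.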

index.ts:46-72 (helper)

Utility function to convert XPath results (strings, numbers, booleans, DOM nodes/arrays) to a readable string format, used in both xpath and xpathwithurl handlers for output.

function resultToString(result: string | number | boolean | Node | Node[] | null): string {
    if (result === null) {
        return "null";
    } else if (Array.isArray(result)) {
        return result.map(resultToString).join("\n");
    } else if (typeof result === 'object' && result.nodeType !== undefined) {
        // Handle DOM nodes
        if (result.nodeType === 1) { // Element node
            const serializer = new XMLSerializer();
            return serializer.serializeToString(result);
        } else if (result.nodeType === 2) { // Attribute node
            return `${result.nodeName}="${result.nodeValue}"`;
        } else if (result.nodeType === 3) { // Text node
            return result.nodeValue || "";
        } else {
            // Default fallback for other node types
            try {
                const serializer = new XMLSerializer();
                return serializer.serializeToString(result);
            } catch (e) {
                return String(result);
            }
        }
    } else {
        return String(result);
    }
}
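To show what the output looks like for mixed results, here is a simplified stand-in for resultToString that runs without a DOM. `NodeLike` and `formatResult` are hypothetical names, and element nodes are rendered with a placeholder rather than real XMLSerializer output, so this is a sketch of the branching logic only.

```typescript
// Hypothetical minimal node shape; real results are DOM nodes.
interface NodeLike {
    nodeType: number;
    nodeName: string;
    nodeValue: string | null;
}

type Result = string | number | boolean | NodeLike | Result[] | null;

// Simplified stand-in for resultToString (not the server's implementation).
function formatResult(result: Result): string {
    if (result === null) return "null";
    if (Array.isArray(result)) return result.map((item) => formatResult(item)).join("\n");
    if (typeof result === "object") {
        if (result.nodeType === 2) return `${result.nodeName}="${result.nodeValue}"`; // attribute
        if (result.nodeType === 3) return result.nodeValue || "";                     // text
        return `<${result.nodeName}/>`; // placeholder where XMLSerializer would be used
    }
    return String(result); // strings, numbers, booleans
}

const href: NodeLike = { nodeType: 2, nodeName: "href", nodeValue: "/about" };
const text: NodeLike = { nodeType: 3, nodeName: "#text", nodeValue: "About us" };
console.log(formatResult([href, text, 3, true]));
// prints:
// href="/about"
// About us
// 3
// true
```

Array results are joined with newlines, so a query that matches several nodes yields one entry per line unless a serialized element itself contains line breaks.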

Tool Definition Quality

Grade: C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions fetching content and querying with XPath, but doesn't disclose important behavioral traits such as error handling, network timeouts, authentication needs, rate limits, or what happens with invalid URLs or queries. For a tool that performs network operations and data processing, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, consisting of a single sentence: 'Fetch content from a URL and select query it using XPath'. Every word contributes directly to explaining the tool's purpose, with no unnecessary information or redundancy. It efficiently communicates the core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (network fetching and XPath querying), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., query results, errors), behavioral aspects like performance or limitations, or how to interpret results. For a tool with no structured safety or output information, the description should provide more context to be fully helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, clearly documenting all three parameters (url, query, mimeType) with their purposes and types. The description adds minimal value beyond the schema, as it only implies the parameters indirectly ('Fetch content from a URL' hints at 'url', 'select query it using XPath' hints at 'query'). It doesn't provide additional context like examples or constraints, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Fetch content from a URL and select query it using XPath'. It specifies the verb ('fetch', 'select query'), resource ('content from a URL'), and method ('using XPath'), making it easy to understand what the tool does. However, it doesn't explicitly differentiate from its sibling tool 'xpath', which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention the sibling tool 'xpath' or any other tools, nor does it specify scenarios where this tool is preferred or should be avoided. The description only states what the tool does, not when to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
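As a sketch of how the gaps noted in the Behavior, Completeness, and Usage Guidelines dimensions might be addressed, the constant below drafts a longer description. It is a hypothetical rewrite, not the server's actual text. Each behavioral sentence reflects the handler code in the Implementation Reference; the guidance about the sibling 'xpath' tool assumes that tool accepts content already available as a string, which this page does not confirm.

```typescript
// Hypothetical rewrite of the tool description (not the server's actual text).
const proposedDescription = [
    "Fetch a page in a headless browser (Puppeteer), wait until the network is idle,",
    "and run an XPath query against the rendered content.",
    "Returns matching nodes serialized as text, separated by newlines.",
    'Returns "No nodes matched the query." when nothing matches, and a message starting with',
    '"Error processing XPath query:" when navigation or evaluation fails.',
    "No cookies or credentials are sent, so pages behind a login are not supported.",
    // Assumption: the sibling 'xpath' tool takes XML/HTML content directly.
    "If the XML/HTML is already available as a string, prefer the 'xpath' tool.",
].join(" ");

console.log(proposedDescription);
```

A description of this length would likely score lower on Conciseness, so the trade-off against token cost is a judgment call.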

Other Tools

xpath (C)
xpathwithurl (C)

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/thirdstrandstudio/mcp-xpath'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.