Glama
yantrix-ai

@praveen030686/data-apis-mcp

Extract Structured Data

web_extract_structured
Read-only, Idempotent

Extract structured data (text, tables, JSON-LD, and metadata) from any URL. Each request is paid for with a USDC micropayment on the Base network.

Instructions

Full structured extraction: text, tables, JSON-LD, metadata from any URL. Costs $0.05 USDC per request via x402 on Base.
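As a rough budgeting aid, the per-request price stated above can be turned into a total for a batch of calls. The sketch below assumes amounts are tracked in USDC's six-decimal base units; `estimateCostBaseUnits` and `formatUsdc` are hypothetical helper names and are not part of this server.

```typescript
// Price stated in the tool description: $0.05 USDC per request.
// USDC uses 6 decimals, so 0.05 USDC = 50000 base units.
const PRICE_PER_REQUEST_BASE_UNITS = 50000;
const BASE_UNITS_PER_USDC = 1000000;

// Hypothetical helper: total cost in base units for a number of requests.
function estimateCostBaseUnits(requestCount: number): number {
  if (!Number.isInteger(requestCount) || requestCount < 0) {
    throw new RangeError("requestCount must be a non-negative integer");
  }
  return requestCount * PRICE_PER_REQUEST_BASE_UNITS;
}

// Hypothetical helper: render base units as a USDC amount with two decimals.
function formatUsdc(baseUnits: number): string {
  return (baseUnits / BASE_UNITS_PER_USDC).toFixed(2);
}

// Usage: budget for extracting 120 pages.
console.log(formatUsdc(estimateCostBaseUnits(120)));
```

Keeping the arithmetic in integer base units until the final formatting step avoids accumulating floating-point error across many requests.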

Input Schema

Name    Required    Description                            Default
url     Yes         URL to extract structured data from
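Because every call is billed, a caller may want to reject obviously malformed input before invoking the tool. The following is a minimal sketch of such a pre-flight check using the standard `URL` class; `checkExtractUrl` is a hypothetical name, and the restriction to HTTP(S) is an assumption rather than a documented constraint of this tool.

```typescript
type UrlCheck =
  | { ok: true; url: string }
  | { ok: false; reason: string };

// Hypothetical client-side check; not part of the server's code.
function checkExtractUrl(input: string): UrlCheck {
  let parsed: URL;
  try {
    // Throws a TypeError when the input is not an absolute URL.
    parsed = new URL(input);
  } catch {
    return { ok: false, reason: "not an absolute URL" };
  }
  // Assumption: the extractor only fetches pages over HTTP or HTTPS.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, reason: "unsupported protocol " + parsed.protocol };
  }
  return { ok: true, url: parsed.href };
}

// Usage: only pass the URL on when the check succeeds.
const check = checkExtractUrl("https://example.com/page");
console.log(check.ok ? check.url : check.reason);
```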

Implementation Reference

  • The handler function for web_extract_structured which calls the structured extraction API.
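    async ({ url }) => {
      // POST the target URL to the structured-extraction endpoint.
      const data = await apiPost(`${WEB_EXTRACT_API}/api/v1/extract/structured`, { url });
      // Return the API response as pretty-printed JSON in a single text content item.
      return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
    }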
  • src/index.ts:336-346 (registration)
    Registration of the web_extract_structured tool with its schema and description.
    server.registerTool(
      "web_extract_structured",
      {
        title: "Extract Structured Data",
        description: `Full structured extraction: text, tables, JSON-LD, metadata from any URL.
    Costs $0.05 USDC per request via x402 on Base.`,
        inputSchema: {
          url: z.string().url().describe("URL to extract structured data from"),
        },
        annotations: { readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true },
      },
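The handler shown above returns the API response as a JSON string inside a single text content item, so a caller has to parse that string to get the data back. The sketch below illustrates this round trip under that assumption; `ToolResult`, `wrapAsTextResult`, and `parseTextResult` are hypothetical names, and the parsed value is typed `unknown` because the page documents no output schema.

```typescript
interface TextContent {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextContent[];
}

// Mirrors the wrapping step in the handler: pretty-printed JSON in one text item.
function wrapAsTextResult(data: unknown): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
}

// Hypothetical client-side helper: recover the data from the first text item.
function parseTextResult(result: ToolResult): unknown {
  const first = result.content[0];
  if (first === undefined || first.type !== "text") {
    throw new Error("tool result has no text content");
  }
  // JSON.parse throws a SyntaxError if the text is not valid JSON.
  return JSON.parse(first.text);
}
```

Since the response fields are undocumented, a caller would still need to validate the parsed value before relying on any particular property.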
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations cover read-only, open-world, idempotent, and non-destructive hints, so the bar is lower. The description adds cost information ('Costs $0.05 USDC per request via x402 on Base'), which is useful context beyond annotations. However, it doesn't detail rate limits, error handling, or output format, limiting its value. No contradiction with annotations exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and front-loaded, with two sentences that efficiently convey purpose and cost. Every sentence adds value without waste, making it appropriately sized and well-structured for quick understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (extraction from URLs), rich annotations, and no output schema, the description is somewhat complete but has gaps. It covers purpose and cost but lacks details on output format, error cases, or when to use it versus its siblings. It is adequate but leaves clear room for improvement, hence the score of 3.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its single parameter (url), so the baseline is 3. The description adds no additional parameter semantics beyond what the schema provides, such as URL format constraints or examples, keeping it at the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Full structured extraction: text, tables, JSON-LD, metadata from any URL.' This specifies the verb ('extract') and resource ('structured data'), though it doesn't explicitly differentiate from siblings like web_extract_text or web_extract_metadata. It's clear but lacks sibling differentiation, warranting a 4.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions cost but not context, prerequisites, or comparisons to siblings like web_extract_text or web_extract_metadata. This absence of usage guidance results in a score of 2.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yantrix-ai/x402-apis-mcp-server'
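The same request can be made from TypeScript. This is a minimal sketch that assumes a runtime with a global `fetch` (for example Node 18 or later) and that the endpoint returns JSON; `mcpServerUrl` and `fetchServerInfo` are hypothetical names, and the response is typed `unknown` because its shape is not described here.

```typescript
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: build the server URL from its two path segments.
function mcpServerUrl(owner: string, server: string): string {
  return MCP_API_BASE + "/" + encodeURIComponent(owner) + "/" + encodeURIComponent(server);
}

// Hypothetical helper: GET the server entry and parse the body as JSON.
async function fetchServerInfo(owner: string, server: string): Promise<unknown> {
  const response = await fetch(mcpServerUrl(owner, server), { method: "GET" });
  if (!response.ok) {
    throw new Error("MCP directory API returned HTTP " + response.status);
  }
  return response.json();
}

// Usage (performs a network request when called):
// fetchServerInfo("yantrix-ai", "x402-apis-mcp-server").then(console.log);
```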

If you have feedback or need assistance with the MCP directory API, please join our Discord server.