extract-web-data
Extract structured JSON data from any public webpage by describing in natural language what data you need.
Instructions
Extracts structured data as JSON from a web page given a URL using a Natural Language description of the data.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL of the public webpage to extract data from | |
| prompt | Yes | Natural Language description of the data to extract from the page | |
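To make the two required fields concrete, here is a minimal sketch of a type and a runtime check for the tool's arguments. The names `ExtractWebDataArgs` and `parseExtractArgs` are hypothetical and do not come from the server's source, and the URL and prompt values are placeholders. The check tests the field types explicitly because coercing a missing field with `String(undefined)` yields the non-empty string `'undefined'`, which a plain truthiness test would accept.

```typescript
// Hypothetical type mirroring the two required fields of the input schema.
interface ExtractWebDataArgs {
  url: string;
  prompt: string;
}

// Hypothetical helper: returns typed arguments, or throws if a field is
// missing, not a string, or blank.
function parseExtractArgs(args: unknown): ExtractWebDataArgs {
  if (typeof args !== 'object' || args === null) {
    throw new Error("Both 'url' and 'prompt' are required");
  }
  const { url, prompt } = args as Record<string, unknown>;
  if (
    typeof url !== 'string' || url.trim() === '' ||
    typeof prompt !== 'string' || prompt.trim() === ''
  ) {
    throw new Error("Both 'url' and 'prompt' are required");
  }
  return { url, prompt };
}

// Example arguments as a client might send them (placeholder values).
const exampleArgs = parseExtractArgs({
  url: 'https://example.com/products',
  prompt: 'The name and price of each product listed on the page',
});
```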
Implementation Reference
- src/index.ts:60-103 (handler) Handler function for the 'extract-web-data' tool. Calls the AgentQL API to extract structured data from a web page using a URL and natural language prompt.

      // Handler for the 'extract-web-data' tool.
      server.setRequestHandler(CallToolRequestSchema, async (request) => {
        switch (request.params.name) {
          case EXTRACT_TOOL_NAME: {
            const url = String(request.params.arguments?.url);
            const prompt = String(request.params.arguments?.prompt);

            if (!url || !prompt) {
              throw new Error("Both 'url' and 'prompt' are required");
            }

            const endpoint = 'https://api.agentql.com/v1/query-data';
            const response = await fetch(endpoint, {
              method: 'POST',
              headers: {
                'X-API-Key': `${AGENTQL_API_KEY}`,
                'X-TF-Request-Origin': 'mcp-server',
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                url: url,
                prompt: prompt,
                params: {
                  wait_for: 0,
                  is_scroll_to_bottom_enabled: false,
                  mode: 'fast',
                  is_screenshot_enabled: false,
                },
              }),
            });

            if (!response.ok) {
              throw new Error(`AgentQL API error: ${response.statusText}\n${await response.text()}`);
            }

            const json = (await response.json()) as AqlResponse;

            return {
              content: [
                {
                  type: 'text',
                  text: JSON.stringify(json.data, null, 2),
                },
              ],
            };

- src/index.ts:41-54 (schema) Input schema for the 'extract-web-data' tool: requires 'url' (string) and 'prompt' (string) parameters.

      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL of the public webpage to extract data from',
          },
          prompt: {
            type: 'string',
            description: 'Natural Language description of the data to extract from the page',
          },
        },
        required: ['url', 'prompt'],
      },

- src/index.ts:25-25 (registration) Tool name constant definition for 'extract-web-data', used in registration.

      const EXTRACT_TOOL_NAME = 'extract-web-data';

- src/index.ts:33-56 (registration) Registration of the 'extract-web-data' tool in the list of available tools.

      // Handler that lists available tools.
      server.setRequestHandler(ListToolsRequestSchema, async () => {
        return {
          tools: [
            {
              name: EXTRACT_TOOL_NAME,
              description:
                'Extracts structured data as JSON from a web page given a URL using a Natural Language description of the data.',
              inputSchema: {
                type: 'object',
                properties: {
                  url: {
                    type: 'string',
                    description: 'The URL of the public webpage to extract data from',
                  },
                  prompt: {
                    type: 'string',
                    description: 'Natural Language description of the data to extract from the page',
                  },
                },
                required: ['url', 'prompt'],
              },
            },
          ],

- src/index.ts:8-10 (helper) Helper interface defining the AgentQL API response shape, used in the handler.

      interface AqlResponse {
        data: object;
      }
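The request the handler sends and the result it returns can be separated into two small pure functions, which makes them easy to inspect without a network call. This is a sketch based on the handler excerpt above, not code from the repository: `buildQueryDataRequest`, `toToolResult`, and the `QueryDataRequest` type are hypothetical names, while the endpoint, headers, and `params` values are copied from the excerpt.

```typescript
// Hypothetical shape for the arguments passed to fetch().
interface QueryDataRequest {
  endpoint: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the same POST request the handler sends.
function buildQueryDataRequest(url: string, prompt: string, apiKey: string): QueryDataRequest {
  return {
    endpoint: 'https://api.agentql.com/v1/query-data',
    init: {
      method: 'POST',
      headers: {
        'X-API-Key': apiKey,
        'X-TF-Request-Origin': 'mcp-server',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        url,
        prompt,
        // Same fixed parameters as in the handler excerpt.
        params: {
          wait_for: 0,
          is_scroll_to_bottom_enabled: false,
          mode: 'fast',
          is_screenshot_enabled: false,
        },
      }),
    },
  };
}

// Hypothetical helper: wraps the response's `data` field as a single text
// content item containing pretty-printed JSON, as the handler does.
function toToolResult(data: object) {
  return {
    content: [{ type: 'text' as const, text: JSON.stringify(data, null, 2) }],
  };
}
```

Under these assumptions, a caller would pass the two parts to `fetch(request.endpoint, request.init)`, check `response.ok`, and hand the parsed `data` field to `toToolResult`.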