extract
Fetch and extract structured data from web pages using a JSON schema. Returns clean JSON with the specified fields, bypassing manual parsing.
Instructions
Fetch a webpage and extract structured data from it according to a JSON Schema. The MCP server fetches the page, then asks a local Ollama model to populate the schema from the page contents. The caller gets clean JSON back without having to read or parse the snapshot itself.
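The two-stage flow described above can be sketched roughly as follows. This is only an illustration of the idea, not the server's actual code: `extract`, `fetch_page`, and `ask_model` are hypothetical names, and the stubs stand in for a real HTTP fetch and a real call to a local model.

```python
import json
from typing import Callable


def extract(url: str, schema: dict,
            fetch_page: Callable[[str], str],
            ask_model: Callable[[str, dict], dict]) -> str:
    """Fetch a page, let a model fill the schema, and return a JSON string."""
    page_text = fetch_page(url)            # stage 1: get the page contents
    filled = ask_model(page_text, schema)  # stage 2: model populates the schema
    # Any schema property the model did not return is reported as null.
    result = {name: filled.get(name) for name in schema.get("properties", {})}
    return json.dumps(result)


# Stubs (assumptions) so the sketch runs offline.
def fake_fetch(url: str) -> str:
    return "Registrar: Example Registrar, Inc."


def fake_model(page_text: str, schema: dict) -> dict:
    return {"registrar": page_text.split(": ", 1)[1]}


schema = {
    "type": "object",
    "properties": {
        "registrar": {"type": "string", "description": "Domain registrar name"},
        "expiration_date": {"type": "string",
                            "description": "Registrar Registration Expiration Date"},
    },
}

output = extract("https://example.com/whois", schema, fake_fetch, fake_model)
print(output)
# → {"registrar": "Example Registrar, Inc.", "expiration_date": null}
```

Passing the fetcher and the model in as functions keeps the sketch testable without network access; a real deployment would wire in its own implementations.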
Use this tool when the user asks for specific fields that you can name in advance, especially on pages with structured content (WHOIS lookups, product pages, GitHub repos, recipes, tables of data). Arrays and nested objects are supported, since the work is done by an LLM and not by a constrained server-side extractor.
Prefer fetch when the user asks an open-ended question or wants a free-form summary.
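To illustrate the array and nested-object support mentioned above, here is a schema for a product page built as a Python dict. The field names (`name`, `price`, `reviews`, and so on) are invented for this example and are not fields the tool requires.

```python
import json

# Hypothetical schema for a product page; all field names are illustrative.
product_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Product title"},
        "price": {
            "type": "object",
            "description": "Listed price",
            "properties": {
                "amount": {"type": "number", "description": "Numeric price"},
                "currency": {"type": "string", "description": "Currency code, e.g. USD"},
            },
        },
        "reviews": {
            "type": "array",
            "description": "Customer reviews shown on the page",
            "items": {
                "type": "object",
                "properties": {
                    "author": {"type": "string", "description": "Reviewer name"},
                    "rating": {"type": "number", "description": "Star rating"},
                },
            },
        },
    },
}

# Serializing confirms the dict is plain JSON before it is used as `schema`.
schema_json = json.dumps(product_schema, indent=2)
print(schema_json)
```

Giving each property its own description matters more as nesting deepens, since the descriptions are what guide the extraction model to the right part of the page.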
Args:
    url: The full URL to extract from.
    schema: A JSON Schema describing the fields to extract. Property
        descriptions guide the extraction model in finding the right
        page content. Example:
            {
              "type": "object",
              "properties": {
                "registrar": {"type": "string",
                              "description": "Domain registrar name"},
                "expiration_date": {"type": "string",
                                    "description": "Registrar Registration Expiration Date"},
                "nameservers": {"type": "array",
                                "items": {"type": "string"},
                                "description": "Name Server entries"}
              }
            }
    user_id: Optional, same semantics as for fetch.
Returns: A JSON string with the extracted fields. If a field cannot be found, it is set to null.
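Because missing fields come back as null, callers will usually want to separate found fields from missing ones after parsing. A minimal sketch, assuming the tool's return value is available as a string: the `raw` value below is made-up sample data, and `split_found_missing` is a hypothetical helper, not part of the tool.

```python
import json


def split_found_missing(raw: str) -> tuple[dict, list[str]]:
    """Parse the tool's JSON string into found fields and names of null fields."""
    data = json.loads(raw)
    found = {k: v for k, v in data.items() if v is not None}
    missing = [k for k, v in data.items() if v is None]
    return found, missing


# Made-up sample of what a response could look like for the WHOIS schema above.
raw = (
    '{"registrar": "Example Registrar, Inc.", '
    '"expiration_date": null, '
    '"nameservers": ["ns1.example.com", "ns2.example.com"]}'
)

found, missing = split_found_missing(raw)
print(found)
print(missing)  # → ['expiration_date']
```

Checking for null explicitly, instead of testing truthiness, keeps legitimate empty values such as an empty array or an empty string from being treated as missing.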
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | | |
| schema | Yes | | |
| user_id | No | | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |