tavily-extract
Extract and process web content from specified URLs for data collection, content analysis, and research tasks, with options for extraction depth, image inclusion, and output format.
Instructions
Retrieves and processes raw content from the specified URLs. Suited to data collection, content analysis, and research tasks.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
extract_depth | No | Depth of extraction: 'basic' or 'advanced'. Use 'advanced' for LinkedIn URLs or when explicitly told to. | basic |
format | No | Format of the extracted web page content. markdown returns content in markdown format; text returns plain text and may increase latency. | markdown |
include_images | No | Include a list of images extracted from the URLs in the response | false |
urls | Yes | List of URLs to extract content from | |
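The parameters in the table can be assembled into an arguments object before the tool is called. The following is a minimal sketch only: `build_extract_args` is a hypothetical helper, not part of the tool, and how the arguments are sent to the tool depends on the client in use. Treating any host under `linkedin.com` as a LinkedIn URL is an assumption about how to apply the rule in the `extract_depth` description.

```python
from urllib.parse import urlparse


def build_extract_args(urls, extract_depth=None, format="markdown", include_images=False):
    """Hypothetical helper: build an arguments dict for tavily-extract.

    Defaults mirror the table above. When extract_depth is not given,
    it is set to 'advanced' if any URL points at LinkedIn, else 'basic'.
    """
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        raise TypeError("urls must be a list of strings")

    if extract_depth is None:
        hosts = [(urlparse(u).hostname or "").lower() for u in urls]
        # Assumption: a LinkedIn URL is any host equal to or under linkedin.com.
        is_linkedin = any(h == "linkedin.com" or h.endswith(".linkedin.com") for h in hosts)
        extract_depth = "advanced" if is_linkedin else "basic"

    if extract_depth not in ("basic", "advanced"):
        raise ValueError("extract_depth must be 'basic' or 'advanced'")
    if format not in ("markdown", "text"):
        raise ValueError("format must be 'markdown' or 'text'")

    return {
        "urls": urls,
        "extract_depth": extract_depth,
        "format": format,
        "include_images": bool(include_images),
    }
```

Calling `build_extract_args(["https://example.com"])` yields the defaults from the table, while a LinkedIn profile URL switches `extract_depth` to `advanced` unless a depth is passed explicitly.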
Input Schema (JSON Schema)
{
  "properties": {
    "extract_depth": {
      "default": "basic",
      "description": "Depth of extraction - 'basic' or 'advanced', if urls are linkedin use 'advanced' or if explicitly told to use advanced",
      "enum": [
        "basic",
        "advanced"
      ],
      "title": "Extract Depth",
      "type": "string"
    },
    "format": {
      "default": "markdown",
      "description": "The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency.",
      "enum": [
        "markdown",
        "text"
      ],
      "title": "Format",
      "type": "string"
    },
    "include_images": {
      "default": false,
      "description": "Include a list of images extracted from the urls in the response",
      "title": "Include Images",
      "type": "boolean"
    },
    "urls": {
      "description": "List of URLs to extract content from",
      "items": {
        "type": "string"
      },
      "title": "Urls",
      "type": "array"
    }
  },
  "required": [
    "urls"
  ],
  "type": "object"
}
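An arguments object can be checked against this schema before it is sent. The sketch below uses only the Python standard library and covers just the keywords that appear in the schema (`required`, `type`, `enum`, `items`); it is not a general JSON Schema validator. `SCHEMA` is a trimmed copy of the schema above with titles and descriptions omitted, and `check_args` is a hypothetical name.

```python
# Trimmed copy of the schema above (titles and descriptions omitted).
SCHEMA = {
    "properties": {
        "extract_depth": {"default": "basic", "enum": ["basic", "advanced"], "type": "string"},
        "format": {"default": "markdown", "enum": ["markdown", "text"], "type": "string"},
        "include_images": {"default": False, "type": "boolean"},
        "urls": {"items": {"type": "string"}, "type": "array"},
    },
    "required": ["urls"],
    "type": "object",
}

# JSON Schema type names used above, mapped to Python types.
PY_TYPES = {"string": str, "boolean": bool, "array": list}


def check_args(args, schema=SCHEMA):
    """Hypothetical helper: return a list of error strings (empty if valid)."""
    errors = []
    for name in schema["required"]:
        if name not in args:
            errors.append(f"missing required field: {name}")

    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            continue  # the schema does not forbid additional properties
        if not isinstance(value, PY_TYPES[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
        if spec["type"] == "array":
            item_type = PY_TYPES[spec["items"]["type"]]
            if not all(isinstance(item, item_type) for item in value):
                errors.append(f"{name}: every item must be a {spec['items']['type']}")
    return errors
```

For example, `check_args({"urls": ["https://example.com"]})` returns an empty list, and `check_args({})` reports the missing `urls` field.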