tavily-extract
Extract and process raw web content from URLs for data collection, content analysis, and research tasks with configurable depth and image inclusion.
Instructions
A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to extract content from | |
| extract_depth | No | Depth of extraction: 'basic' or 'advanced'. Use 'advanced' for LinkedIn URLs or when explicitly asked to | basic |
| include_images | No | Include a list of images extracted from the URLs in the response | false |
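
For example, a call that extracts a LinkedIn page alongside a regular article could pass arguments like the following (the URLs are illustrative placeholders, not taken from the source):

```typescript
// Example arguments conforming to the tavily-extract input schema.
// URLs are placeholders; extract_depth is raised to 'advanced' because
// one of them is a LinkedIn page.
const extractArgs = {
  urls: [
    "https://example.com/article",
    "https://www.linkedin.com/company/example"
  ],
  extract_depth: "advanced",
  include_images: false // optional; defaults to false
};
```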
Implementation Reference
- src/index.ts:479-494 (handler) — The core handler for the tavily-extract tool: it makes a POST request to Tavily's extract API endpoint via axios, forwards parameters such as urls and extract_depth, and converts error responses (invalid API key, rate limit) into descriptive errors. A sketch of how the underlying axios instance might be configured follows the reference list.

  ```typescript
  async extract(params: any): Promise<TavilyResponse> {
    try {
      const response = await this.axiosInstance.post(this.baseURLs.extract, {
        ...params,
        api_key: API_KEY
      });
      return response.data;
    } catch (error: any) {
      if (error.response?.status === 401) {
        throw new Error('Invalid API key');
      } else if (error.response?.status === 429) {
        throw new Error('Usage limit exceeded');
      }
      throw error;
    }
  }
  ```
- src/index.ts:194-215 (schema) — Defines the input schema used to validate tavily-extract arguments: a required 'urls' array plus the optional 'extract_depth' and 'include_images' parameters.

  ```typescript
  inputSchema: {
    type: "object",
    properties: {
      urls: {
        type: "array",
        items: { type: "string" },
        description: "List of URLs to extract content from"
      },
      extract_depth: {
        type: "string",
        enum: ["basic", "advanced"],
        description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
        default: "basic"
      },
      include_images: {
        type: "boolean",
        description: "Include a list of images extracted from the urls in the response",
        default: false,
      }
    },
    required: ["urls"]
  }
  ```
- src/index.ts:191-216 (registration) — Registers the tavily-extract tool in the ListTools response with its name, description, and input schema.

  ```typescript
  {
    name: "tavily-extract",
    description: "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.",
    inputSchema: {
      type: "object",
      properties: {
        urls: {
          type: "array",
          items: { type: "string" },
          description: "List of URLs to extract content from"
        },
        extract_depth: {
          type: "string",
          enum: ["basic", "advanced"],
          description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
          default: "basic"
        },
        include_images: {
          type: "boolean",
          description: "Include a list of images extracted from the urls in the response",
          default: false,
        }
      },
      required: ["urls"]
    }
  },
  ```
- src/index.ts:373-379 (handler) — Dispatch case in the CallToolRequest handler that routes tavily-extract calls to the extract method with the parsed arguments; an end-to-end client invocation sketch follows the reference list.

  ```typescript
  case "tavily-extract":
    response = await this.extract({
      urls: args.urls,
      extract_depth: args.extract_depth,
      include_images: args.include_images
    });
    break;
  ```
- src/index.ts:59-63 (helper) — Defines the API base URLs, including the extract endpoint used by the tavily-extract tool.

  ```typescript
  private baseURLs = {
    search: 'https://api.tavily.com/search',
    extract: 'https://api.tavily.com/extract',
    crawl: 'https://api.tavily.com/crawl',
    map: 'https://api.tavily.com/map'
  ```
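
The extract handler above relies on a private this.axiosInstance that is not part of the cited ranges. Below is a minimal sketch of how such an instance could be created, assuming a plain axios.create call with JSON headers; the header values and timeout are illustrative assumptions, not taken from the source.

```typescript
import axios, { AxiosInstance } from "axios";

// Sketch only: a shared axios instance the extract handler could use.
// Header values and the timeout are assumptions for illustration.
const axiosInstance: AxiosInstance = axios.create({
  headers: {
    "Content-Type": "application/json",
    Accept: "application/json"
  },
  timeout: 30000
});
```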
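Putting the pieces together, an MCP client calls the tool by name with arguments matching the schema; the dispatch case then forwards them to extract, which POSTs to baseURLs.extract. The sketch below uses the @modelcontextprotocol/sdk client over stdio; the server command, client name, and URL are placeholders, not values from the source.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Placeholder command/args for launching the MCP server locally.
  const transport = new StdioClientTransport({
    command: "node",
    args: ["dist/index.js"]
  });

  const client = new Client(
    { name: "example-client", version: "1.0.0" },
    { capabilities: {} }
  );
  await client.connect(transport);

  // Call tavily-extract with arguments that satisfy the input schema.
  const result = await client.callTool({
    name: "tavily-extract",
    arguments: {
      urls: ["https://example.com/article"],
      extract_depth: "basic",
      include_images: true
    }
  });

  console.log(result); // tool output as returned by the server
  await client.close();
}

main().catch(console.error);
```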