tavily-extract

Extract and process web content from URLs for data collection, content analysis, and research tasks, supporting multiple formats and extraction depths.

Instructions

A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.

Input Schema


Name	Required	Description	Default
`urls`	Yes	List of URLs to extract content from
`extract_depth`	No	Depth of extraction: 'basic' or 'advanced'. Use 'advanced' for LinkedIn URLs or when explicitly told to.	basic
`include_images`	No	Include a list of images extracted from the URLs in the response	false
`format`	No	The format of the extracted web page content. `markdown` returns content in markdown format. `text` returns plain text and may increase latency.	markdown
`include_favicon`	No	Whether to include the favicon URL for each result	false
`query`	No	User intent query for reranking extracted chunks based on relevance
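To make the schema concrete, the arguments can be modelled as a TypeScript type with the documented defaults filled in. This is a minimal sketch: `TavilyExtractArgs` and `withDefaults` are hypothetical names used for illustration and do not appear in the server's source.

```typescript
// Hypothetical types mirroring the input schema table above.
type ExtractDepth = "basic" | "advanced";
type ExtractFormat = "markdown" | "text";

interface TavilyExtractArgs {
  urls: string[];                // required
  extract_depth?: ExtractDepth;  // default: "basic"
  include_images?: boolean;      // default: false
  format?: ExtractFormat;        // default: "markdown"
  include_favicon?: boolean;     // default: false
  query?: string;                // no default
}

// Hypothetical helper: checks the one required field and applies the
// documented defaults. Caller-supplied values win over the defaults.
function withDefaults(args: TavilyExtractArgs): TavilyExtractArgs {
  if (!Array.isArray(args.urls)) {
    throw new Error("'urls' is required and must be an array of strings");
  }
  return {
    extract_depth: "basic",
    include_images: false,
    format: "markdown",
    include_favicon: false,
    ...args,
  };
}

const example = withDefaults({
  urls: ["https://example.com/article"],
  query: "pricing details",
});
```

Here `example` carries the caller's `urls` and `query` plus the four defaults from the table.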

Implementation Reference

src/index.ts:593-608 (handler)
The core handler function implementing the 'tavily-extract' tool logic. Makes a POST request to Tavily's extract API endpoint (https://api.tavily.com/extract) using axios with user-provided parameters and API key, returns the response data, and handles specific errors like 401 (invalid key) and 429 (rate limit).
    async extract(params: any): Promise<TavilyResponse> {
      try {
        const response = await this.axiosInstance.post(this.baseURLs.extract, {
          ...params,
          api_key: API_KEY
        });
        return response.data;
      } catch (error: any) {
        if (error.response?.status === 401) {
          throw new Error('Invalid API key');
        } else if (error.response?.status === 429) {
          throw new Error('Usage limit exceeded');
        }
        throw error;
      }
    }
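The error handling in this handler can be isolated as a pure function, which makes the status-to-message mapping easy to see and to test without a network call. This is a sketch only: `mapExtractError` and `HttpLikeError` are hypothetical names, not part of the server's source.

```typescript
// Hypothetical minimal shape of an HTTP client error (axios errors
// expose the status at error.response.status).
interface HttpLikeError {
  response?: { status?: number };
}

// Hypothetical standalone version of the handler's catch block:
// 401 and 429 get friendly messages, everything else passes through.
function mapExtractError(error: unknown): Error {
  const status = (error as HttpLikeError | undefined)?.response?.status;
  if (status === 401) return new Error("Invalid API key");
  if (status === 429) return new Error("Usage limit exceeded");
  return error instanceof Error ? error : new Error(String(error));
}

const unauthorized = mapExtractError({ response: { status: 401 } });
const rateLimited = mapExtractError({ response: { status: 429 } });
```

Any other error, such as a timeout or a 500 response, is returned unchanged, matching the handler's final `throw error`.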
src/index.ts:244-284 (schema)
Input schema definition for the 'tavily-extract' tool, including required 'urls' array, optional parameters like extract_depth (basic/advanced), include_images, format (markdown/text), include_favicon, and query for reranking.
    {
      name: "tavily-extract",
      description: "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.",
      inputSchema: {
        type: "object",
        properties: {
          urls: {
            type: "array",
            items: { type: "string" },
            description: "List of URLs to extract content from"
          },
          extract_depth: {
            type: "string",
            enum: ["basic","advanced"],
            description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
            default: "basic"
          },
          include_images: {
            type: "boolean",
            description: "Include a list of images extracted from the urls in the response",
            default: false,
          },
          format: {
            type: "string",
            enum: ["markdown","text"],
            description: "The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency.",
            default: "markdown"
          },
          include_favicon: {
            type: "boolean",
            description: "Whether to include the favicon URL for each result",
            default: false,
          },
          query: {
            type: "string",
            description: "User intent query for reranking extracted chunks based on relevance"
          },
        },
        required: ["urls"]
      }
    },
src/index.ts:443-452 (registration)
Registration and dispatch logic for 'tavily-extract' in the CallToolRequestSchema switch statement. Parses arguments and invokes the extract handler method.
    case "tavily-extract":
      response = await this.extract({
        urls: args.urls,
        extract_depth: args.extract_depth,
        include_images: args.include_images,
        format: args.format,
        include_favicon: args.include_favicon,
        query: args.query,
      });
      break;
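The dispatch code copies every schema field explicitly, so arguments the caller omitted are passed along as `undefined`. The sketch below shows what that means for the request body, assuming the HTTP client serializes it with `JSON.stringify` (axios does this by default for plain objects). The `args` value is an illustrative input, not taken from the source.

```typescript
// Illustrative tool-call arguments: only the required field is set.
const args: Record<string, unknown> = { urls: ["https://example.com"] };

// Same field-by-field copy as the dispatch case above.
const payload = {
  urls: args.urls,
  extract_depth: args.extract_depth,
  include_images: args.include_images,
  format: args.format,
  include_favicon: args.include_favicon,
  query: args.query,
};

// JSON.stringify omits properties whose value is undefined, so the
// unset options never reach the wire.
const body = JSON.stringify(payload);
```

Under that assumption, unset options are simply absent from the request rather than being sent as explicit values.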
src/index.ts:645-684 (helper)
Helper function used to format the TavilyResponse from tavily-extract (and other tools) into a human-readable string with sections for answer, detailed results (title, URL, content, raw_content, favicon), and images.
    function formatResults(response: TavilyResponse): string {
      // Format API response into human-readable text
      const output: string[] = [];

      // Include answer if available
      if (response.answer) {
        output.push(`Answer: ${response.answer}`);
      }

      // Format detailed search results
      output.push('Detailed Results:');
      response.results.forEach(result => {
        output.push(`\nTitle: ${result.title}`);
        output.push(`URL: ${result.url}`);
        output.push(`Content: ${result.content}`);
        if (result.raw_content) {
          output.push(`Raw Content: ${result.raw_content}`);
        }
        if (result.favicon) {
          output.push(`Favicon: ${result.favicon}`);
        }
      });

      // Add images section if available
      if (response.images && response.images.length > 0) {
        output.push('\nImages:');
        response.images.forEach((image, index) => {
          if (typeof image === 'string') {
            output.push(`\n[${index + 1}] URL: ${image}`);
          } else {
            output.push(`\n[${index + 1}] URL: ${image.url}`);
            if (image.description) {
              output.push(`   Description: ${image.description}`);
            }
          }
        });
      }

      return output.join('\n');
    }
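The `TavilyResponse` type is not shown in these excerpts. The sketch below infers a minimal shape purely from the properties `formatResults` reads. The `Inferred*` names and the sample data are hypothetical, and the real type in `src/index.ts` may declare more fields or different optionality.

```typescript
// Inferred from property accesses in formatResults only.
interface InferredImage {
  url: string;
  description?: string;
}

interface InferredResult {
  title: string;
  url: string;
  content: string;
  raw_content?: string;
  favicon?: string;
}

interface InferredTavilyResponse {
  answer?: string;
  results: InferredResult[];
  images?: Array<string | InferredImage>;
}

// Made-up sample data for illustration.
const sample: InferredTavilyResponse = {
  results: [
    {
      title: "Example Domain",
      url: "https://example.com",
      content: "Example body text",
      favicon: "https://example.com/favicon.ico",
    },
  ],
  images: [
    "https://example.com/a.png",
    { url: "https://example.com/b.png", description: "Logo" },
  ],
};

// Images may be bare URL strings or objects; the helper's typeof check
// handles both, and the same check normalizes them here.
const imageUrls = (sample.images ?? []).map((image) =>
  typeof image === "string" ? image : image.url
);
```

The union type on `images` is why the helper branches on `typeof image === 'string'` before reading `image.url`.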
