tavily-extract
Extract and process web content from URLs for data collection, content analysis, and research tasks with configurable depth and image inclusion options.
Instructions
A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| extract_depth | No | Depth of extraction: 'basic' or 'advanced'. Use 'advanced' if the URLs are LinkedIn pages or if explicitly told to use advanced. | basic |
| include_images | No | Include a list of images extracted from the URLs in the response | false |
| urls | Yes | List of URLs to extract content from | |
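
As a rough illustration of the table above, the arguments could be modelled in TypeScript as follows. The `ExtractArgs` interface and the `normalizeExtractArgs` helper are hypothetical names introduced for this sketch and are not part of the server source; the defaults mirror the ones declared in the tool's input schema.

```typescript
// Hypothetical types mirroring the input schema table; not taken from the server source.
type ExtractDepth = "basic" | "advanced";

interface ExtractArgs {
  urls: string[];                // required
  extract_depth?: ExtractDepth;  // optional, schema default "basic"
  include_images?: boolean;      // optional, schema default false
}

// Hypothetical helper: checks the required field and fills in the schema defaults.
function normalizeExtractArgs(args: ExtractArgs): Required<ExtractArgs> {
  if (!Array.isArray(args.urls) || !args.urls.every((u) => typeof u === "string")) {
    throw new Error("urls must be an array of strings");
  }
  return {
    urls: args.urls,
    extract_depth: args.extract_depth ?? "basic",
    include_images: args.include_images ?? false,
  };
}

// Only the required field is supplied; the optional fields fall back to their defaults.
const normalized = normalizeExtractArgs({ urls: ["https://example.com/article"] });
console.log(normalized);
```

Applying the defaults on the caller's side is a design choice for this sketch only; the dispatch code in the reference below forwards the arguments as received.
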
Implementation Reference
- src/index.ts:290-305 (handler) The core handler function that executes the tavily-extract tool logic by making a POST request to Tavily's extract API endpoint with the provided parameters.

      async extract(params: any): Promise<TavilyResponse> {
        try {
          const response = await this.axiosInstance.post(this.baseURLs.extract, {
            ...params,
            api_key: API_KEY
          });
          return response.data;
        } catch (error: any) {
          if (error.response?.status === 401) {
            throw new Error('Invalid API key');
          } else if (error.response?.status === 429) {
            throw new Error('Usage limit exceeded');
          }
          throw error;
        }
      }
- src/index.ts:171-196 (registration) Registration of the tavily-extract tool in the ListToolsRequestSchema handler, including name, description, and input schema.

      {
        name: "tavily-extract",
        description: "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.",
        inputSchema: {
          type: "object",
          properties: {
            urls: {
              type: "array",
              items: { type: "string" },
              description: "List of URLs to extract content from"
            },
            extract_depth: {
              type: "string",
              enum: ["basic","advanced"],
              description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
              default: "basic"
            },
            include_images: {
              type: "boolean",
              description: "Include a list of images extracted from the urls in the response",
              default: false,
            }
          },
          required: ["urls"]
        }
      },
- src/index.ts:223-229 (handler) Dispatch handler in CallToolRequestSchema that invokes the extract method for tavily-extract tool calls.

      case "tavily-extract":
        response = await this.extract({
          urls: args.urls,
          extract_depth: args.extract_depth,
          include_images: args.include_images
        });
        break;
- src/index.ts:174-195 (schema) Input schema definition for the tavily-extract tool, specifying the required urls array and the optional parameters.

      inputSchema: {
        type: "object",
        properties: {
          urls: {
            type: "array",
            items: { type: "string" },
            description: "List of URLs to extract content from"
          },
          extract_depth: {
            type: "string",
            enum: ["basic","advanced"],
            description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
            default: "basic"
          },
          include_images: {
            type: "boolean",
            description: "Include a list of images extracted from the urls in the response",
            default: false,
          }
        },
        required: ["urls"]
      }
- src/index.ts:308-334 (helper) Helper function used to format the Tavily API response (including the one from extract) into readable text output for the tool response.

      function formatResults(response: TavilyResponse): string {
        // Format API response into human-readable text
        const output: string[] = [];

        // Include answer if available
        if (response.answer) {
          output.push(`Answer: ${response.answer}`);
          output.push('\nSources:');
          response.results.forEach(result => {
            output.push(`- ${result.title}: ${result.url}`);
          });
          output.push('');
        }

        // Format detailed search results
        output.push('Detailed Results:');
        response.results.forEach(result => {
          output.push(`\nTitle: ${result.title}`);
          output.push(`URL: ${result.url}`);
          output.push(`Content: ${result.content}`);
          if (result.raw_content) {
            output.push(`Raw Content: ${result.raw_content}`);
          }
        });

        return output.join('\n');
      }
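
The status-code handling in the extract handler can be pulled out into a small pure function, which makes the mapping easy to exercise without network access. This is only a sketch: `mapExtractError` and the `HttpLikeError` shape are hypothetical names introduced here, and only the two status codes and their messages come from the handler listed above.

```typescript
// Hypothetical minimal error shape; axios errors expose the HTTP status as error.response.status.
interface HttpLikeError {
  response?: { status?: number };
  message?: string;
}

// Hypothetical helper mirroring the handler's checks:
// 401 becomes 'Invalid API key', 429 becomes 'Usage limit exceeded',
// and any other error is passed through.
function mapExtractError(error: HttpLikeError): Error {
  const status = error.response?.status;
  if (status === 401) {
    return new Error("Invalid API key");
  }
  if (status === 429) {
    return new Error("Usage limit exceeded");
  }
  // Assumption for this sketch: non-Error values are wrapped so callers always receive an Error.
  return error instanceof Error ? error : new Error(error.message ?? "Unknown error");
}

console.log(mapExtractError({ response: { status: 401 } }).message);
console.log(mapExtractError({ response: { status: 429 } }).message);
```

In a handler shaped like the one above, the `catch` block could then be reduced to `throw mapExtractError(error)`, assuming the caught value matches the `HttpLikeError` shape.
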