tavily-extract
Extract and process raw web content from specified URLs for data collection, content analysis, and research tasks with configurable depth and image inclusion options.
Instructions
A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to extract content from | |
| extract_depth | No | Depth of extraction: 'basic' or 'advanced'. Use 'advanced' for LinkedIn URLs or when explicitly told to. | basic |
| include_images | No | Include a list of images extracted from the URLs in the response | false |
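
The parameters in the table above can be sketched as a TypeScript type with a small normalizer that applies the documented defaults (`basic` and `false`). This is a minimal illustration, not code from the server: the names `TavilyExtractArgs`, `isExtractDepth`, and `normalizeExtractArgs` are hypothetical, and the strictness of the checks is an assumption.

```typescript
// Hypothetical types mirroring the input schema; names are illustrative only.
type ExtractDepth = "basic" | "advanced";

interface TavilyExtractArgs {
  urls: string[];
  extract_depth?: ExtractDepth;
  include_images?: boolean;
}

// Type guard for the two allowed extract_depth values.
function isExtractDepth(value: unknown): value is ExtractDepth {
  return value === "basic" || value === "advanced";
}

// Validate raw tool arguments and fill in the documented defaults.
function normalizeExtractArgs(raw: unknown): Required<TavilyExtractArgs> {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("arguments must be an object");
  }
  const args = raw as Record<string, unknown>;

  // urls is the only required field: an array of strings.
  const urls = args.urls;
  if (!Array.isArray(urls) || !urls.every((u) => typeof u === "string")) {
    throw new Error("urls must be an array of strings");
  }

  // extract_depth defaults to "basic" when omitted.
  const depth = args.extract_depth ?? "basic";
  if (!isExtractDepth(depth)) {
    throw new Error("extract_depth must be 'basic' or 'advanced'");
  }

  // include_images defaults to false when omitted.
  const images = args.include_images ?? false;
  if (typeof images !== "boolean") {
    throw new Error("include_images must be a boolean");
  }

  return { urls: urls as string[], extract_depth: depth, include_images: images };
}
```

Calling `normalizeExtractArgs({ urls: ["https://example.com"] })` under these assumptions yields an object with `extract_depth` set to `"basic"` and `include_images` set to `false`.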
Implementation Reference
- src/index.ts:478-493 (handler) Core handler function that executes the tavily-extract tool by making a POST request to Tavily's /extract endpoint with user parameters and the API key, handling common errors like invalid key or rate limits.

  ```typescript
  async extract(params: any): Promise<TavilyResponse> {
    try {
      const response = await this.axiosInstance.post(this.baseURLs.extract, {
        ...params,
        api_key: API_KEY
      });
      return response.data;
    } catch (error: any) {
      if (error.response?.status === 401) {
        throw new Error('Invalid API key');
      } else if (error.response?.status === 429) {
        throw new Error('Usage limit exceeded');
      }
      throw error;
    }
  }
  ```
- src/index.ts:193-214 (schema) Defines the input schema for the tavily-extract tool, requiring a list of URLs and allowing optional extraction depth and image inclusion.

  ```typescript
  inputSchema: {
    type: "object",
    properties: {
      urls: {
        type: "array",
        items: { type: "string" },
        description: "List of URLs to extract content from"
      },
      extract_depth: {
        type: "string",
        enum: ["basic","advanced"],
        description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
        default: "basic"
      },
      include_images: {
        type: "boolean",
        description: "Include a list of images extracted from the urls in the response",
        default: false,
      }
    },
    required: ["urls"]
  }
  ```
- src/index.ts:190-215 (registration) Registers the tavily-extract tool in the MCP ListTools response, providing name, description, and full input schema.

  ```typescript
  {
    name: "tavily-extract",
    description: "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.",
    inputSchema: {
      type: "object",
      properties: {
        urls: {
          type: "array",
          items: { type: "string" },
          description: "List of URLs to extract content from"
        },
        extract_depth: {
          type: "string",
          enum: ["basic","advanced"],
          description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
          default: "basic"
        },
        include_images: {
          type: "boolean",
          description: "Include a list of images extracted from the urls in the response",
          default: false,
        }
      },
      required: ["urls"]
    }
  },
  ```
- src/index.ts:372-378 (handler) Switch case in the CallToolRequest handler that routes tavily-extract calls to the extract method with parsed arguments.

  ```typescript
  case "tavily-extract":
    response = await this.extract({
      urls: args.urls,
      extract_depth: args.extract_depth,
      include_images: args.include_images
    });
    break;
  ```
- src/index.ts:60-66 (helper) API endpoint URL for the tavily-extract tool, used in the axios POST request.

  ```typescript
    extract: 'https://api.tavily.com/extract',
    crawl: 'https://api.tavily.com/crawl',
    map: 'https://api.tavily.com/map'
  };

  constructor() {
    this.server = new Server(
  ```
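
The request flow referenced above (merge the API key into the parameters, POST to the extract endpoint, translate 401 and 429 into readable errors) can be sketched without the server class. This is a simplified illustration under stated assumptions, not the server's implementation: `PostJson`, `HttpResult`, `ExtractParams`, and `describeExtractError` are hypothetical names, the transport is injected instead of using axios, and other failures are wrapped in a generic error where the original rethrows the axios error unchanged.

```typescript
// Endpoint URL as listed in the implementation reference.
const EXTRACT_URL = "https://api.tavily.com/extract";

// Hypothetical parameter shape based on the input schema.
interface ExtractParams {
  urls: string[];
  extract_depth?: "basic" | "advanced";
  include_images?: boolean;
}

// Hypothetical minimal transport so the sketch does not depend on axios.
interface HttpResult {
  status: number;
  json(): Promise<unknown>;
}
type PostJson = (url: string, body: unknown) => Promise<HttpResult>;

// Map the two status codes the handler special-cases to its error messages.
function describeExtractError(status: number): string | null {
  if (status === 401) return "Invalid API key";
  if (status === 429) return "Usage limit exceeded";
  return null;
}

// POST the parameters plus api_key, mirroring the request body the handler builds.
async function extract(
  params: ExtractParams,
  apiKey: string,
  postJson: PostJson
): Promise<unknown> {
  const result = await postJson(EXTRACT_URL, { ...params, api_key: apiKey });
  if (result.status >= 200 && result.status < 300) {
    return result.json();
  }
  // Assumption: any other non-2xx status becomes a generic error here.
  const known = describeExtractError(result.status);
  throw new Error(known ?? `Request failed with status ${result.status}`);
}
```

Injecting the transport keeps the sketch testable without network access. In a real client the `PostJson` function would wrap an HTTP library such as the axios instance the source uses.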