Tavily MCP Server

Official
by tavily-ai

tavily-extract

Extract and process web content from URLs for data collection, content analysis, and research tasks, supporting multiple formats and extraction depths.

Instructions

A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| urls | Yes | List of URLs to extract content from | |
| extract_depth | No | Depth of extraction: 'basic' or 'advanced'. Use 'advanced' if the URLs are LinkedIn pages or if explicitly told to use advanced. | basic |
| include_images | No | Include a list of images extracted from the URLs in the response | false |
| format | No | The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency. | markdown |
| include_favicon | No | Whether to include the favicon URL for each result | false |
| query | No | User intent query for reranking extracted chunks based on relevance | |
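
As an illustration of how the parameters above fit together, the following sketch builds an arguments object for `tavily-extract` and fills in the documented defaults. The `ExtractArgs` interface and the `withExtractDefaults` helper are hypothetical names introduced here, not part of the server, and the non-empty check on `urls` is an assumption (the schema only marks `urls` as required).

```typescript
// Argument shape mirroring the input schema table (hypothetical type name).
interface ExtractArgs {
  urls: string[];
  extract_depth?: "basic" | "advanced";
  include_images?: boolean;
  format?: "markdown" | "text";
  include_favicon?: boolean;
  query?: string;
}

// Hypothetical helper: applies the schema's documented defaults.
// Assumption: an empty `urls` array is treated as invalid input.
function withExtractDefaults(args: ExtractArgs): ExtractArgs {
  if (!Array.isArray(args.urls) || args.urls.length === 0) {
    throw new Error("'urls' is required and must contain at least one URL");
  }
  return {
    extract_depth: "basic",
    include_images: false,
    format: "markdown",
    include_favicon: false,
    ...args, // caller-supplied values override the defaults
  };
}

// Example: only `urls` and `query` are supplied; the rest fall back to defaults.
const exampleArgs = withExtractDefaults({
  urls: ["https://example.com/article"],
  query: "pricing details",
});
console.log(JSON.stringify(exampleArgs, null, 2));
```

Note that a key passed explicitly as `undefined` would override the default in this sketch, so callers should omit unused keys rather than set them to `undefined`.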

Implementation Reference

  • The core handler function implementing the 'tavily-extract' tool logic. Makes a POST request to Tavily's extract API endpoint (https://api.tavily.com/extract) using axios with user-provided parameters and API key, returns the response data, and handles specific errors like 401 (invalid key) and 429 (rate limit).
    async extract(params: any): Promise<TavilyResponse> {
      try {
        const response = await this.axiosInstance.post(this.baseURLs.extract, {
          ...params,
          api_key: API_KEY
        });
        return response.data;
      } catch (error: any) {
        if (error.response?.status === 401) {
          throw new Error('Invalid API key');
        } else if (error.response?.status === 429) {
          throw new Error('Usage limit exceeded');
        }
        throw error;
      }
    }
  • Input schema definition for the 'tavily-extract' tool, including required 'urls' array, optional parameters like extract_depth (basic/advanced), include_images, format (markdown/text), include_favicon, and query for reranking.
    {
      name: "tavily-extract",
      description: "A powerful web content extraction tool that retrieves and processes raw content from specified URLs, ideal for data collection, content analysis, and research tasks.",
      inputSchema: {
        type: "object",
        properties: {
          urls: {
            type: "array",
            items: { type: "string" },
            description: "List of URLs to extract content from"
          },
          extract_depth: {
            type: "string",
            enum: ["basic","advanced"],
            description: "Depth of extraction - 'basic' or 'advanced', if usrls are linkedin use 'advanced' or if explicitly told to use advanced",
            default: "basic"
          },
          include_images: {
            type: "boolean",
            description: "Include a list of images extracted from the urls in the response",
            default: false,
          },
          format: {
            type: "string",
            enum: ["markdown","text"],
            description: "The format of the extracted web page content. markdown returns content in markdown format. text returns plain text and may increase latency.",
            default: "markdown"
          },
          include_favicon: {
            type: "boolean",
            description: "Whether to include the favicon URL for each result",
            default: false,
          },
          query: {
            type: "string",
            description: "User intent query for reranking extracted chunks based on relevance"
          },
        },
        required: ["urls"]
      }
    },
  • src/index.ts:443-452 (registration)
    Registration and dispatch logic for 'tavily-extract' in the CallToolRequestSchema switch statement. Parses arguments and invokes the extract handler method.
    case "tavily-extract":
      response = await this.extract({
        urls: args.urls,
        extract_depth: args.extract_depth,
        include_images: args.include_images,
        format: args.format,
        include_favicon: args.include_favicon,
        query: args.query,
      });
      break;
  • Helper function used to format the TavilyResponse from tavily-extract (and other tools) into a human-readable string with sections for answer, detailed results (title, URL, content, raw_content, favicon), and images.
    function formatResults(response: TavilyResponse): string {
      // Format API response into human-readable text
      const output: string[] = [];

      // Include answer if available
      if (response.answer) {
        output.push(`Answer: ${response.answer}`);
      }

      // Format detailed search results
      output.push('Detailed Results:');
      response.results.forEach(result => {
        output.push(`\nTitle: ${result.title}`);
        output.push(`URL: ${result.url}`);
        output.push(`Content: ${result.content}`);
        if (result.raw_content) {
          output.push(`Raw Content: ${result.raw_content}`);
        }
        if (result.favicon) {
          output.push(`Favicon: ${result.favicon}`);
        }
      });

      // Add images section if available
      if (response.images && response.images.length > 0) {
        output.push('\nImages:');
        response.images.forEach((image, index) => {
          if (typeof image === 'string') {
            output.push(`\n[${index + 1}] URL: ${image}`);
          } else {
            output.push(`\n[${index + 1}] URL: ${image.url}`);
            if (image.description) {
              output.push(` Description: ${image.description}`);
            }
          }
        });
      }

      return output.join('\n');
    }
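
The error handling described for the core handler (401 mapped to an invalid-key error, 429 to a usage-limit error, everything else rethrown) can be isolated into a small function that is easy to test without a network call. This is a minimal sketch under stated assumptions, not the server's code: `HttpLikeError` and `toExtractError` are hypothetical names, and the sketch only assumes that a failed request exposes `response.status`, as the handler excerpt above does.

```typescript
// Subset of an axios-style error that the handler inspects (hypothetical type).
interface HttpLikeError {
  response?: { status?: number };
}

// Hypothetical helper: maps a failed request to the error the caller should see.
// 401 and 429 get the same messages as the handler; anything else passes through.
function toExtractError(error: unknown): Error {
  const status = (error as HttpLikeError | undefined)?.response?.status;
  if (status === 401) {
    return new Error("Invalid API key");
  }
  if (status === 429) {
    return new Error("Usage limit exceeded");
  }
  // Preserve the original error when it is already an Error instance.
  return error instanceof Error ? error : new Error(String(error));
}

// Example: simulate the three branches with hand-built error objects.
console.log(toExtractError({ response: { status: 401 } }).message);
console.log(toExtractError({ response: { status: 429 } }).message);
console.log(toExtractError(new Error("socket hang up")).message);
```

Returning the error instead of throwing it keeps the mapping a pure function; a handler could then write `throw toExtractError(error)` inside its `catch` block.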

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tavily-ai/tavily-mcp'