
MCP Server for Crawl4AI

by omgwtfwow

extract_with_llm

Extract specific information from webpages using AI by asking questions about content, such as summarizing key points or finding product details.

Instructions

[STATELESS] Ask questions about webpage content using AI. Returns natural language answers. Crawls fresh each time. For dynamic content or sessions, use crawl with session_id first.

Input Schema

Name	Required	Description
query	Yes	Your question about the webpage content. Examples: "What is the main topic?", "List all product prices", "Summarize the key points", "What contact information is available?"
url	Yes	The URL to extract data from
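A valid arguments object for this tool therefore has exactly two string fields. The values below are illustrative, not taken from the source:

```typescript
// Illustrative extract_with_llm arguments matching the schema above.
// The URL and query values are examples only.
const args = {
  url: "https://example.com/pricing",
  query: "List all product prices",
};

console.log(JSON.stringify(args));
```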

Implementation Reference

  • Core handler function that performs LLM extraction by making a GET request to the backend /llm endpoint with URL and query parameters. Includes URL validation and specific error handling for timeouts and missing LLM config.
    async extractWithLLM(options: LLMEndpointOptions): Promise<LLMEndpointResponse> {
      // Validate URL
      if (!validateURL(options.url)) {
        throw new Error('Invalid URL format');
      }
      try {
        const encodedUrl = encodeURIComponent(options.url);
        const encodedQuery = encodeURIComponent(options.query);
        const response = await this.axiosClient.get(`/llm/${encodedUrl}?q=${encodedQuery}`);
        return response.data;
      } catch (error) {
        // Special handling for LLM-specific errors
        if (axios.isAxiosError(error)) {
          const axiosError = error as AxiosError;
          if (axiosError.code === 'ECONNABORTED' || axiosError.response?.status === 504) {
            throw new Error('LLM extraction timed out. Try a simpler query or different URL.');
          }
          if (axiosError.response?.status === 401) {
            throw new Error(
              'LLM extraction failed: No LLM provider configured on server. Please ensure the server has an API key set.',
            );
          }
        }
        return handleAxiosError(error);
      }
    }
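    The path construction in the handler can be sketched in isolation; `buildLlmPath` below is a hypothetical helper, not part of the source, but it percent-encodes both components the same way:

```typescript
// Hypothetical standalone helper mirroring how the handler builds the /llm request path.
function buildLlmPath(url: string, query: string): string {
  // Both the target URL and the question are percent-encoded before being
  // placed into the GET path and query string, as in the handler above.
  return `/llm/${encodeURIComponent(url)}?q=${encodeURIComponent(query)}`;
}

const path = buildLlmPath("https://example.com/pricing", "What is the main topic?");
// → "/llm/https%3A%2F%2Fexample.com%2Fpricing?q=What%20is%20the%20main%20topic%3F"
```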
  • Wrapper handler in ContentHandlers that calls the service's extractWithLLM and formats the MCP response with the LLM answer as text content.
    async extractWithLLM(options: { url: string; query: string }) {
      try {
        const result = await this.service.extractWithLLM(options);
        return {
          content: [
            {
              type: 'text',
              text: result.answer,
            },
          ],
        };
      } catch (error) {
        throw this.formatError(error, 'extract with LLM');
      }
    }
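    The response shape the wrapper produces can be reproduced with a minimal sketch; `toMcpTextResult` is a hypothetical name used here for illustration:

```typescript
// Hypothetical helper producing the same MCP text-content shape as the wrapper above.
function toMcpTextResult(answer: string) {
  return {
    content: [{ type: "text" as const, text: answer }],
  };
}

const result = toMcpTextResult("The page lists three pricing tiers.");
console.log(result.content[0].text);
```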
  • Zod schema for validating input parameters: url (string URL) and query (string), using createStatelessSchema with tool name 'extract_with_llm'.
    export const ExtractWithLlmSchema = createStatelessSchema(
      z.object({
        url: z.string().url(),
        query: z.string(),
      }),
      'extract_with_llm',
    );
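    For illustration, the two validation rules can be approximated without Zod; `isValidArgs` below is a hypothetical, dependency-free stand-in (the real schema additionally passes through `createStatelessSchema`):

```typescript
// Hypothetical stand-in for the Zod schema: url must parse as a URL, query must be a string.
function isValidArgs(input: unknown): input is { url: string; query: string } {
  if (typeof input !== "object" || input === null) return false;
  const o = input as Record<string, unknown>;
  if (typeof o.url !== "string" || typeof o.query !== "string") return false;
  try {
    new URL(o.url); // roughly what z.string().url() enforces
    return true;
  } catch {
    return false;
  }
}
```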
  • src/server.ts:890-896 (registration)
    Tool registration in the server switch statement: validates args with ExtractWithLlmSchema and delegates to contentHandlers.extractWithLLM.
    case 'extract_with_llm':
      return await this.validateAndExecute(
        'extract_with_llm',
        args,
        ExtractWithLlmSchema,
        async (validatedArgs) => this.contentHandlers.extractWithLLM(validatedArgs),
      );
  • src/server.ts:791-811 (registration)
    Tool metadata registration in listTools response, including name, description, and inputSchema matching the Zod schema.
    {
      name: 'extract_with_llm',
      description:
        '[STATELESS] Ask questions about webpage content using AI. Returns natural language answers. ' +
        'Crawls fresh each time. For dynamic content or sessions, use crawl with session_id first.',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to extract data from',
          },
          query: {
            type: 'string',
            description:
              'Your question about the webpage content. Examples: "What is the main topic?", ' +
              '"List all product prices", "Summarize the key points", "What contact information is available?"',
          },
        },
        required: ['url', 'query'],
      },
    },
