
MCP Server for Crawl4AI

by omgwtfwow

extract_with_llm

Extracts specific information from webpages by asking an LLM questions about their content. Crawls fresh content on each call to answer queries about topics, prices, summaries, or contact details.

Instructions

[STATELESS] Ask questions about webpage content using AI. Returns natural language answers. Crawls fresh each time. For dynamic content or sessions, use crawl with session_id first.

Input Schema

Parameters (both required, no defaults):

  • url (required): The URL to extract data from.
  • query (required): Your question about the webpage content. Examples: "What is the main topic?", "List all product prices", "Summarize the key points", "What contact information is available?"
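For example, a call to this tool might pass arguments like the following (the URL and question are illustrative):

```json
{
  "url": "https://example.com/products",
  "query": "List all product prices"
}
```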

Implementation Reference

  • Handler function that executes the tool logic: calls the service, extracts the answer, and formats the MCP response with content type 'text'.
    async extractWithLLM(options: { url: string; query: string }) {
      try {
        const result = await this.service.extractWithLLM(options);
    
        return {
          content: [
            {
              type: 'text',
              text: result.answer,
            },
          ],
        };
      } catch (error) {
        throw this.formatError(error, 'extract with LLM');
      }
    }
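The response shape the handler builds can be sketched in isolation. The interface and function names below are illustrative, not from the project; only the `{ content: [{ type: 'text', text }] }` structure comes from the handler above.

```typescript
// Minimal sketch of the MCP tool-result shape the handler returns.
// `McpTextContent` and `toMcpResult` are illustrative names, not project code.
interface McpTextContent {
  type: "text";
  text: string;
}

function toMcpResult(answer: string): { content: McpTextContent[] } {
  return { content: [{ type: "text", text: answer }] };
}

// Example: wrapping an LLM answer string
const result = toMcpResult("The page's main topic is pricing.");
console.log(result.content[0].text); // → The page's main topic is pricing.
```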
  • Core service method that performs the actual HTTP request to the Crawl4AI backend's /llm endpoint with the URL and query parameters, and handles specific errors such as timeouts and missing authentication.
    async extractWithLLM(options: LLMEndpointOptions): Promise<LLMEndpointResponse> {
      // Validate URL
      if (!validateURL(options.url)) {
        throw new Error('Invalid URL format');
      }
    
      try {
        const encodedUrl = encodeURIComponent(options.url);
        const encodedQuery = encodeURIComponent(options.query);
        const response = await this.axiosClient.get(`/llm/${encodedUrl}?q=${encodedQuery}`);
        return response.data;
      } catch (error) {
        // Special handling for LLM-specific errors
        if (axios.isAxiosError(error)) {
          const axiosError = error as AxiosError;
          if (axiosError.code === 'ECONNABORTED' || axiosError.response?.status === 504) {
            throw new Error('LLM extraction timed out. Try a simpler query or different URL.');
          }
          if (axiosError.response?.status === 401) {
            throw new Error(
              'LLM extraction failed: No LLM provider configured on server. Please ensure the server has an API key set.',
            );
          }
        }
        return handleAxiosError(error);
      }
    }
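The request path built by this method URL-encodes both the target URL and the query before interpolating them into the `/llm/` route. The encoding step can be verified in isolation (the example values are illustrative):

```typescript
// How the /llm request path is assembled, mirroring the encoding logic above.
const targetUrl = "https://example.com/pricing";
const query = "List all product prices";

const path = `/llm/${encodeURIComponent(targetUrl)}?q=${encodeURIComponent(query)}`;
console.log(path);
// → /llm/https%3A%2F%2Fexample.com%2Fpricing?q=List%20all%20product%20prices
```

Encoding the URL is what lets it travel safely as a single path segment: `:` and `/` become `%3A` and `%2F`, so the backend can recover the original URL.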
  • Zod schema for input validation: requires url (valid URL) and query (string). Uses createStatelessSchema helper.
    export const ExtractWithLlmSchema = createStatelessSchema(
      z.object({
        url: z.string().url(),
        query: z.string(),
      }),
      'extract_with_llm',
    );
  • src/server.ts:890-896 (registration)
    Tool registration in the CallToolRequestSchema handler switch statement: uses ExtractWithLlmSchema for validation and delegates to contentHandlers.extractWithLLM.
    case 'extract_with_llm':
      return await this.validateAndExecute(
        'extract_with_llm',
        args,
        ExtractWithLlmSchema,
        async (validatedArgs) => this.contentHandlers.extractWithLLM(validatedArgs),
      );
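Once registered, the tool is reachable through a standard MCP `tools/call` request. A minimal JSON-RPC message invoking it looks like this (the `id` and argument values are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "extract_with_llm",
    "arguments": {
      "url": "https://example.com",
      "query": "What is the main topic?"
    }
  }
}
```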
