crawl_web

Extract webpage content in Markdown, raw HTML, or AI-enhanced formats for analysis and processing.

Instructions

Crawl a specific webpage and extract its content in various formats including Markdown, raw HTML, and AI-enhanced HTML.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | URL to crawl and extract content from | |
| markdown | No | Return content in Markdown format | true |
| raw_html | No | Return original, unprocessed HTML | false |
| enhanced_html | No | Return AI-enhanced, cleaned HTML | true |
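
For illustration, here is a minimal sketch of calling the tool from a Model Context Protocol client with the official TypeScript SDK. The launch command, the example URL, and the client name are assumptions rather than part of this server's documentation; flags that are omitted fall back to the schema defaults shown above.

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Assumed launch command for the server; adjust to your installation.
    const transport = new StdioClientTransport({
      command: "node",
      args: ["dist/index.js"],
    });

    const client = new Client({ name: "example-client", version: "1.0.0" });
    await client.connect(transport);

    // Invoke crawl_web; markdown defaults to true, raw_html to false,
    // and enhanced_html to true when not supplied.
    const result = await client.callTool({
      name: "crawl_web",
      arguments: {
        url: "https://example.com",
        markdown: true,
        raw_html: false,
      },
    });

    console.log(result.content);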

Implementation Reference

  • The handler function for the 'crawl_web' tool. It calls makeCrawlRequest with the provided arguments and returns the JSON result as text, or an error message if the request fails.
    async (args) => {
      try {
        const result = await makeCrawlRequest<Record<string, unknown>>({
          url: args.url,
          markdown: args.markdown,
          raw_html: args.raw_html,
          enhanced_html: args.enhanced_html,
        });
        return {
          content: [
            {
              type: "text" as const,
              text: JSON.stringify(result, null, 2),
            },
          ],
        };
      } catch (error) {
        const errorMessage =
          error instanceof Error ? error.message : "Unknown error occurred";
        return {
          content: [
            {
              type: "text" as const,
              text: `Error crawling URL: ${errorMessage}`,
            },
          ],
          isError: true,
        };
      }
    }
  • Zod schema defining the input parameters for the 'crawl_web' tool, including URL and format options.
    const WebCrawlSchema = z.object({
      url: z.string().describe("URL to crawl and extract content from"),
      markdown: z.boolean().optional().default(true).describe("Return content in Markdown format"),
      raw_html: z.boolean().optional().default(false).describe("Return original, unprocessed HTML"),
      enhanced_html: z.boolean().optional().default(true).describe("Return AI-enhanced, cleaned HTML"),
    });
  • src/index.ts:146-180 (registration)
    Registration of the 'crawl_web' tool using server.tool(), including name, description, schema, and inline handler.
    server.tool( "crawl_web", "Crawl a specific webpage and extract its content in various formats including Markdown, raw HTML, and AI-enhanced HTML.", WebCrawlSchema.shape, async (args) => { try { const result = await makeCrawlRequest<Record<string, unknown>>({ url: args.url, markdown: args.markdown, raw_html: args.raw_html, enhanced_html: args.enhanced_html, }); return { content: [ { type: "text" as const, text: JSON.stringify(result, null, 2), }, ], }; } catch (error) { const errorMessage = error instanceof Error ? error.message : "Unknown error occurred"; return { content: [ { type: "text" as const, text: `Error crawling URL: ${errorMessage}`, }, ], isError: true, }; } } );
  • Helper function that performs the POST request to the Crawleo /crawl API endpoint using the provided body and API key.
    async function makeCrawlRequest<T>(
      body: Record<string, unknown>
    ): Promise<T> {
      const apiKey = getApiKey();

      const response = await fetch(`${API_BASE_URL}/crawl`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": apiKey,
        },
        body: JSON.stringify(body),
      });

      if (!response.ok) {
        const errorText = await response.text();
        throw new Error(`API request failed: ${response.status} - ${errorText}`);
      }

      return response.json() as Promise<T>;
    }
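
The helper above references API_BASE_URL and getApiKey(), which are not shown in these snippets. A minimal sketch of what they might look like, assuming the key is read from a CRAWLEO_API_KEY environment variable (both the base URL and the variable name are assumptions, not taken from the server source):

    // Assumed base URL for the Crawleo API; the actual value lives in the server source.
    const API_BASE_URL = "https://api.crawleo.com/v1";

    // Reads the API key from the environment and fails fast if it is missing.
    // The variable name CRAWLEO_API_KEY is an assumption for illustration.
    function getApiKey(): string {
      const apiKey = process.env.CRAWLEO_API_KEY;
      if (!apiKey) {
        throw new Error("CRAWLEO_API_KEY environment variable is not set");
      }
      return apiKey;
    }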
