fetch_website_single

Fetch content from a single webpage and convert it to clean markdown format for structured data extraction and processing.

Instructions

Fetch content from a single webpage and convert to clean markdown

Input Schema

Name     Required  Description                      Default
url      Yes       The URL to fetch                 -
timeout  No        Request timeout in milliseconds  10000
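
A client invokes this tool with a standard MCP tools/call request. The payload below is only an illustration; the URL and timeout values are made up, not taken from the repository:

    // Illustrative tools/call payload (values are examples, not defaults).
    const request = {
      jsonrpc: "2.0",
      id: 1,
      method: "tools/call",
      params: {
        name: "fetch_website_single",
        arguments: {
          url: "https://example.com/docs",
          timeout: 15000, // optional; the schema default is 10000 ms
        },
      },
    };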

Implementation Reference

  • src/server.ts:345-363 (registration)
    Registration of the 'fetch_website_single' tool in the TOOLS array, including name, description, and input schema definition. A sketch of how this array is typically wired into an MCP server appears after these references.
    {
      name: "fetch_website_single",
      description: "Fetch content from a single webpage and convert to clean markdown",
      inputSchema: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "The URL to fetch",
          },
          timeout: {
            type: "number",
            description: "Request timeout in milliseconds (default: 10000)",
            default: 10000,
          },
        },
        required: ["url"],
      },
    },
  • Handler function for executing the 'fetch_website_single' tool. Validates input, configures single-page scraping options, invokes the scraper, and returns the markdown content. The FetchOptions shape it relies on is reconstructed below.
    case "fetch_website_single": {
      const { url, timeout = 10000 } = args as any;
    
      if (!url) {
        throw new Error("URL is required");
      }
    
      try {
        const options: FetchOptions = {
          maxDepth: 0,
          maxPages: 1,
          timeout,
        };
    
        const markdown = await scraper.scrapeWebsite(url, options);
    
        return {
          content: [
            {
              type: "text",
              text: markdown,
            },
          ],
        };
      } catch (error) {
        throw new Error(`Failed to fetch single page: ${error}`);
      }
    }
  • Core scraping logic used by the tool (configured with maxDepth=0 for a single page). Fetches page content, processes links based on depth, and formats the result as markdown; a hypothetical sketch of the fetchPageContent helper it calls also appears below.
    async scrapeWebsite(startUrl: string, options: FetchOptions = {}): Promise<string> {
      const {
        maxDepth = 2,
        maxPages = 50,
        sameDomainOnly = true,
        timeout = 10000
      } = options;
    
      this.baseUrl = startUrl;
      this.visitedUrls.clear();
    
      const allContent: PageContent[] = [];
      const urlsToProcess: Array<{ url: string; depth: number }> = [{ url: startUrl, depth: 0 }];
    
      while (urlsToProcess.length > 0 && allContent.length < maxPages) {
        const { url, depth } = urlsToProcess.shift()!;
    
        if (depth > maxDepth || this.visitedUrls.has(url)) {
          continue;
        }
    
        const pageContent = await this.fetchPageContent(url, depth, options);
        
        if (pageContent) {
          allContent.push(pageContent);
    
          // Add child URLs for processing
          if (depth < maxDepth) {
            for (const link of pageContent.links) {
              if (!this.visitedUrls.has(link)) {
                urlsToProcess.push({ url: link, depth: depth + 1 });
              }
            }
          }
        }
    
        // Small delay to be respectful
        await new Promise(resolve => setTimeout(resolve, 500));
      }
    
      return this.formatAsMarkdown(allContent, startUrl);
    }
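
Neither FetchOptions nor PageContent is shown on this page. From the defaults destructured in scrapeWebsite and the fields its call sites touch, they can be reconstructed roughly as follows; the actual declarations in the repository may differ:

    // Reconstructed from the destructuring in scrapeWebsite; not copied from src/server.ts.
    interface FetchOptions {
      maxDepth?: number;        // link-following depth; 0 fetches only the start page
      maxPages?: number;        // upper bound on pages collected
      sameDomainOnly?: boolean; // restrict crawling to the start URL's domain
      timeout?: number;         // per-request timeout in milliseconds
    }

    // Shape implied by the call sites above; only `links` is directly evidenced.
    interface PageContent {
      url: string;       // assumed field
      depth: number;     // assumed field
      markdown: string;  // assumed name for the converted content
      links: string[];   // iterated in scrapeWebsite when queueing child URLs
    }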
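
fetchPageContent itself is not included in this reference. Purely as a hypothetical illustration of the contract its call site implies (mark the URL visited, return null on failure, expose outbound links), a standalone sketch using Node 18's built-in fetch could look like the following; the real method in the repository may work quite differently:

    // Hypothetical sketch only; not the repository's implementation.
    async function fetchPageContent(
      url: string,
      depth: number,
      options: FetchOptions,
      visited: Set<string>
    ): Promise<PageContent | null> {
      const { timeout = 10000, sameDomainOnly = true } = options;
      visited.add(url); // scrapeWebsite relies on visited URLs being recorded

      try {
        const res = await fetch(url, { signal: AbortSignal.timeout(timeout) });
        if (!res.ok) return null;
        const html = await res.text();

        // Naive link extraction; a real implementation would use an HTML parser.
        const links: string[] = [];
        for (const match of html.matchAll(/href="([^"#]+)"/g)) {
          try {
            const resolved = new URL(match[1], url);
            if (!sameDomainOnly || resolved.hostname === new URL(url).hostname) {
              links.push(resolved.href);
            }
          } catch {
            // ignore malformed hrefs
          }
        }

        // Stand-in for the project's HTML-to-markdown conversion.
        const markdown = html.replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim();
        return { url, depth, markdown, links };
      } catch {
        return null; // treated as a skipped page by the null check in scrapeWebsite
      }
    }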
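
The TOOLS array and the case statement above follow the usual @modelcontextprotocol/sdk layout for TypeScript servers. Assuming the repository uses that standard SDK, the wiring would look roughly like this (the server metadata is illustrative):

    import { Server } from "@modelcontextprotocol/sdk/server/index.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import {
      CallToolRequestSchema,
      ListToolsRequestSchema,
      type Tool,
    } from "@modelcontextprotocol/sdk/types.js";

    // TOOLS is the array shown in the registration reference above.
    declare const TOOLS: Tool[];

    const server = new Server(
      { name: "better-fetch", version: "1.0.0" }, // illustrative metadata
      { capabilities: { tools: {} } }
    );

    // tools/list: advertise every tool, including fetch_website_single.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: TOOLS }));

    // tools/call: dispatch to the switch shown in the handler reference above.
    server.setRequestHandler(CallToolRequestSchema, async (request) => {
      const { name } = request.params;
      switch (name) {
        // case "fetch_website_single": ... (handler body shown above)
        default:
          throw new Error(`Unknown tool: ${name}`);
      }
    });

    await server.connect(new StdioServerTransport());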
