
crawl

Extract content from websites by crawling multiple pages from a starting URL, with configurable depth and page limits for structured data collection.

Instructions

Crawls a website starting from the specified URL and extracts content from multiple pages.

Args:
- url: The complete URL of the web page to start crawling from
- maxDepth: The maximum depth level for crawling linked pages
- limit: The maximum number of pages to crawl

Returns:
- Content extracted from the crawled pages in markdown and HTML format

Input Schema

Name      Required  Description                                               Default
url       Yes       The complete URL of the web page to start crawling from  —
maxDepth  Yes       The maximum depth level for crawling linked pages        —
limit     Yes       The maximum number of pages to crawl                     —
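
For reference, below is a minimal sketch of invoking this tool from an MCP client. It assumes the server is launched as a local stdio process with "python main.py"; the launch command and the example argument values are assumptions, not details taken from this page.

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main() -> None:
        # Launch command is an assumption; adjust to how the server is installed.
        server = StdioServerParameters(command="python", args=["main.py"])
        async with stdio_client(server) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                # Call the 'crawl' tool with its three required arguments.
                result = await session.call_tool(
                    "crawl",
                    arguments={"url": "https://example.com", "maxDepth": 2, "limit": 10},
                )
                print(result.content)

    asyncio.run(main())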

Implementation Reference

  • main.py:39-54 (registration)
    Registration and handler wrapper for the 'crawl' MCP tool, which delegates to the WebTools.crawl implementation.
    @mcp.tool()
    async def crawl(url: str, maxDepth: int, limit: int) -> str:
        """Crawls a website starting from the specified URL and extracts content from multiple pages.
        Args:
        - url: The complete URL of the web page to start crawling from
        - maxDepth: The maximum depth level for crawling linked pages
        - limit: The maximum number of pages to crawl
    
        Returns:
        - Content extracted from the crawled pages in markdown and HTML format
        """
        try:
            crawl_results = webtools.crawl(url, maxDepth, limit)
            return crawl_results
        except Exception as e:
            return f"Error crawling pages: {str(e)}"
  • Core implementation of the crawl functionality using FirecrawlApp.crawl_url, handling parameters for limit, maxDepth, and formats.
    def crawl(self, url: str, maxDepth: int, limit: int):
        try:
            # Start a Firecrawl crawl job and poll every 30 seconds until it finishes.
            crawl_page = self.firecrawl.crawl_url(
                url,
                params={
                    "limit": limit,  # maximum number of pages to crawl
                    "maxDepth": maxDepth,  # maximum link depth from the start URL
                    "scrapeOptions": {"formats": ["markdown", "html"]},
                },
                poll_interval=30,
            )
            return crawl_page
        except Exception as e:
            return f"Error crawling pages: {str(e)}"

MCP directory API

We provide all the information about MCP servers via our MCP directory API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/josemartinrodriguezmortaloni/webSearch-Tools'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.