crawl

Extract content from websites by crawling multiple pages from a starting URL, with configurable depth and page limits for structured data collection.

Instructions

Crawls a website starting from the specified URL and extracts content from multiple pages.

Args:
- url: The complete URL of the web page to start crawling from
- maxDepth: The maximum depth level for crawling linked pages
- limit: The maximum number of pages to crawl

Returns:
- Content extracted from the crawled pages in markdown and HTML format

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | The complete URL of the web page to start crawling from | |
| maxDepth | Yes | The maximum depth level for crawling linked pages | |
| limit | Yes | The maximum number of pages to crawl | |
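All three fields are required, so a client should check a call's arguments before dispatching it. A minimal sketch of validating such a payload against this schema (the `args` dict and the `validate` helper are illustrative, not part of the server):

```python
# Illustrative arguments payload for the 'crawl' tool; all three fields are required.
args = {
    "url": "https://example.com",   # starting page (example value)
    "maxDepth": 2,                  # follow links up to 2 levels deep
    "limit": 10,                    # stop after 10 pages
}

# Required field names mapped to their expected Python types.
REQUIRED = {"url": str, "maxDepth": int, "limit": int}

def validate(payload: dict) -> list:
    """Return a list of schema problems; an empty list means the payload is valid."""
    problems = []
    for name, typ in REQUIRED.items():
        if name not in payload:
            problems.append(f"missing required field: {name}")
        elif not isinstance(payload[name], typ):
            problems.append(f"{name} must be {typ.__name__}")
    return problems

print(validate(args))  # → []
```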

Implementation Reference

  • main.py:39-54 (registration)
    Registration and handler wrapper for the 'crawl' MCP tool, which delegates to WebTools.crawl implementation.
    @mcp.tool()
    async def crawl(url: str, maxDepth: int, limit: int) -> str:
        """Crawls a website starting from the specified URL and extracts content from multiple pages.
        Args:
        - url: The complete URL of the web page to start crawling from
        - maxDepth: The maximum depth level for crawling linked pages
        - limit: The maximum number of pages to crawl
    
        Returns:
        - Content extracted from the crawled pages in markdown and HTML format
        """
        try:
            crawl_results = webtools.crawl(url, maxDepth, limit)
            return crawl_results
        except Exception as e:
            return f"Error crawling pages: {str(e)}"
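The handler above is a thin wrapper: it delegates to WebTools.crawl and converts any exception into an error string instead of letting it propagate to the MCP client. A stdlib-only sketch of that pattern, with the FastMCP decorator omitted and WebTools replaced by a stub class (both the stub and its URL check are illustrative):

```python
import asyncio

class StubWebTools:
    """Stand-in for the real WebTools; raises on bad input to show the error path."""
    def crawl(self, url: str, maxDepth: int, limit: int) -> str:
        if not url.startswith("http"):
            raise ValueError(f"invalid URL: {url}")
        return f"crawled {url} (depth={maxDepth}, limit={limit})"

webtools = StubWebTools()

async def crawl(url: str, maxDepth: int, limit: int) -> str:
    try:
        return webtools.crawl(url, maxDepth, limit)
    except Exception as e:
        # Errors become plain strings, so the MCP client always receives text.
        return f"Error crawling pages: {str(e)}"

print(asyncio.run(crawl("ftp://x", 1, 5)))
# Error crawling pages: invalid URL: ftp://x
```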
  • Core implementation of the crawl functionality using FirecrawlApp.crawl_url, handling parameters for limit, maxDepth, and formats.
    def crawl(self, url: str, maxDepth: int, limit: int):
        try:
            crawl_page = self.firecrawl.crawl_url(
                url,
                params={
                    "limit": limit,
                    "maxDepth": maxDepth,
                    "scrapeOptions": {"formats": ["markdown", "html"]},
                },
                poll_interval=30,
            )
            return crawl_page
        except Exception as e:
            return f"Error crawling pages: {str(e)}"


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/josemartinrodriguezmortaloni/webSearch-Tools'
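The same request can be built with Python's standard library. This sketch only constructs the request object against the endpoint shown above; it does not perform the network call, and any authentication the API may require is not covered here:

```python
from urllib.request import Request

# Same endpoint as the curl example above.
url = "https://glama.ai/api/mcp/v1/servers/josemartinrodriguezmortaloni/webSearch-Tools"
req = Request(url, method="GET")

print(req.full_url)
print(req.get_method())  # GET
```

To actually send it, pass `req` to `urllib.request.urlopen` (or use any HTTP client).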

If you have feedback or need assistance with the MCP directory API, please join our Discord server.