extract

Extract specific information from web pages using custom prompts to retrieve targeted data without manual searching.

Instructions

Extracts specific information from one or more web pages based on a prompt.

Args:
- url: The complete URL(s) of the web page(s) to extract information from (accepts a list of URLs)
- prompt: Instructions specifying what information to extract from the page
- enabaleWebSearch: Whether to allow web searches to supplement the extraction (note: the parameter name is misspelled in the source and must be passed exactly as shown)
- showSources: Whether to include source references in the response

Returns:
- Extracted information from the web page based on the prompt

Input Schema

Name              Required  Description                                                  Default
url               Yes       List of complete URLs of the web pages to extract from       —
prompt            Yes       Instructions specifying what information to extract          —
enabaleWebSearch  Yes       Whether to allow web searches to supplement the extraction   —
showSources       Yes       Whether to include source references in the response         —
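For reference, a tool call's arguments must match this schema. A hypothetical payload might look like the following; the URL and prompt values are examples only, and note that `enabaleWebSearch` must be spelled exactly as the tool declares it:

```python
import json

# Hypothetical arguments for an 'extract' tool call; URL and prompt are examples.
arguments = {
    "url": ["https://example.com/pricing"],  # list of pages to extract from
    "prompt": "List each plan name and its monthly price.",
    "enabaleWebSearch": False,               # no supplemental web search
    "showSources": True,                     # include source references
}

# MCP clients send tool arguments as JSON; confirm the payload round-trips cleanly.
payload = json.dumps(arguments)
```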

Implementation Reference

  • main.py:57-77 (handler)
    The primary handler function for the 'extract' MCP tool, registered with @mcp.tool(). It receives input parameters, calls the WebTools.extract_info helper, and handles errors.
    @mcp.tool()
    async def extract(
        url: list[str], prompt: str, enabaleWebSearch: bool, showSources: bool
    ) -> str:
        """Extracts specific information from a web page based on a prompt.
        Args:
        - url: The complete URL of the web page to extract information from
        - prompt: Instructions specifying what information to extract from the page
        - enabaleWebSearch: Whether to allow web searches to supplement the extraction
        - showSources: Whether to include source references in the response
    
        Returns:
        - Extracted information from the web page based on the prompt
        """
        try:
            info_extracted = webtools.extract_info(
                url, enabaleWebSearch, prompt, showSources
            )
            return info_extracted
        except Exception as e:
            return f"Error extracting information: {str(e)}"
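Because of the try/except above, callers always receive a string, even on failure. A minimal sketch (using an illustrative stand-in for WebTools that always raises, not the real class) shows the error format a client would see:

```python
import asyncio

# Illustrative stand-in whose extract_info always raises, to exercise the
# handler's error path; this is not the real WebTools implementation.
class FailingWebTools:
    def extract_info(self, url, enableWebSearch, prompt, showSources):
        raise RuntimeError("network unreachable")

webtools = FailingWebTools()

# Mirrors the handler body above, minus the @mcp.tool() decorator.
async def extract(url, prompt, enabaleWebSearch, showSources):
    try:
        return webtools.extract_info(url, enabaleWebSearch, prompt, showSources)
    except Exception as e:
        return f"Error extracting information: {str(e)}"

result = asyncio.run(extract(["https://example.com"], "summarize", False, True))
print(result)  # Error extracting information: network unreachable
```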
  • main.py:58-70 (schema)
    Input schema defined by the function signature and docstring (shown in the handler snippet above) describing args and return type for the 'extract' tool.
  • Supporting helper method in the WebTools class that implements the core extraction logic by calling FirecrawlApp.extract with the configured options.
    def extract_info(
        self, url: list[str], enableWebSearch: bool, prompt: str, showSources: bool
    ):
        try:
            info_extracted = self.firecrawl.extract(
                url,
                {
                    "prompt": prompt,
                    "enableWebSearch": enableWebSearch,
                    "showSources": showSources,
                    "scrapeOptions": {
                        "formats": ["markdown"],
                        "blockAds": True,
                    },
                },
            )
            return info_extracted
        except Exception as e:
            return f"Error extracting information from page {url}: {str(e)}"
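The options dict that extract_info passes to FirecrawlApp.extract can be factored out for clarity. In this sketch, build_extract_options is a hypothetical helper (not part of the source); the scrapeOptions values are the fixed settings shown above:

```python
# Hypothetical helper that assembles the same options dict extract_info passes
# to FirecrawlApp.extract; only the three flags vary, the scrape settings are fixed.
def build_extract_options(prompt: str, enable_web_search: bool, show_sources: bool) -> dict:
    return {
        "prompt": prompt,
        "enableWebSearch": enable_web_search,
        "showSources": show_sources,
        "scrapeOptions": {
            "formats": ["markdown"],  # always request markdown output
            "blockAds": True,         # strip ads before extraction
        },
    }

options = build_extract_options("List all product names", False, True)
```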
  • main.py:57-57 (registration)
    The @mcp.tool() decorator registers the 'extract' function as an MCP tool.
    @mcp.tool()
