LLM Inference Pricing Research Server

by Fadi88

extract_scraped_info

Extract structured pricing data from scraped LLM inference provider websites to analyze and compare costs across services like CloudRift, DeepInfra, Fireworks, and Groq.

Instructions

Extract information about a scraped website.

Args:
    identifier: The provider name, full URL, or domain to look for
    
Returns:
    Formatted JSON string with the scraped information
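
For illustration, here is one way a client could invoke this tool over stdio using the MCP Python SDK. The server entry point (server.py) and the example identifier are assumptions for this sketch, not details taken from the listing.

    import asyncio

    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    async def main() -> None:
        # Launch the MCP server as a subprocess; "server.py" is a
        # hypothetical entry point for this server.
        params = StdioServerParameters(command="python", args=["server.py"])
        async with stdio_client(params) as (read, write):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.call_tool(
                    "extract_scraped_info",
                    {"identifier": "groq.com"},
                )
                # The tool returns its formatted JSON string as text content.
                print(result.content[0].text)

    asyncio.run(main())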

Input Schema

Name        Required  Description                                          Default
identifier  Yes       The provider name, full URL, or domain to look for   (none)
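
The listing's JSON Schema view did not survive extraction. A sketch of the input schema that FastMCP would generate from the tool's signature might look like the following Python dict; the exact field titles are an assumption.

    # Approximate input schema derived from the tool's type hints; the
    # exact output FastMCP emits may differ slightly.
    input_schema = {
        "type": "object",
        "properties": {
            "identifier": {"type": "string", "title": "Identifier"},
        },
        "required": ["identifier"],
    }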

Implementation Reference

  • The handler function for the 'extract_scraped_info' tool, decorated with @mcp.tool() for registration. It retrieves scraped metadata and content based on the provided identifier (provider name, URL, or domain) and returns it as JSON.
    import json
    import os

    # `mcp` (the FastMCP server instance), `logger`, and SCRAPE_DIR (the
    # directory that holds the scraper's output) are assumed to be defined
    # at module level elsewhere in the server.
    @mcp.tool()
    def extract_scraped_info(identifier: str) -> str:
        """
        Extract information about a scraped website.
        
        Args:
            identifier: The provider name, full URL, or domain to look for
            
        Returns:
            Formatted JSON string with the scraped information
        """
        
        logger.info(f"Extracting information for identifier: {identifier}")
        
        if not os.path.exists(SCRAPE_DIR):
            return json.dumps({"error": "No scraped content found."})
    
        metadata_file = os.path.join(SCRAPE_DIR, "scraped_metadata.json")
        if not os.path.exists(metadata_file):
            return json.dumps({"error": "Metadata file not found."})
    
        try:
            with open(metadata_file, "r") as f:
                metadata = json.load(f)
        except json.JSONDecodeError:
            return json.dumps({"error": "Invalid metadata file."})
    
        # Search for the identifier
        found_data = None
        
        # Check if identifier matches a provider key directly
        if identifier in metadata:
            found_data = metadata[identifier]
        else:
            # Search by url or domain
            for provider, data in metadata.items():
                if identifier == data.get("url") or identifier == data.get("domain"):
                    found_data = data
                    break
        
        if found_data:
            # Read content files
            content_data = {}
            if "content_files" in found_data:
                for fmt, filename in found_data["content_files"].items():
                    file_path = os.path.join(SCRAPE_DIR, filename)
                    if os.path.exists(file_path):
                        with open(file_path, "r", encoding="utf-8") as f:
                            content_data[fmt] = f.read()
            
            found_data["content"] = content_data
            return json.dumps(found_data, indent=2)
        else:
            return json.dumps({"error": f"No information found for {identifier}"})

MCP directory API

We provide all information about MCP servers via our MCP directory API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Fadi88/UDACITY_MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.