
web_content_qna

Extract answers from web pages by analyzing content with AI. Provide a URL and question to get specific information from the page.

Instructions

Answer questions about web page content using RAG.

This tool combines web scraping with RAG (Retrieval Augmented Generation) to answer specific questions about web page content. It extracts relevant content sections and uses AI to provide accurate answers.

Args:

  • url: The web page URL to analyze
  • question: The question to answer based on the page content

Returns:

  • str: AI-generated answer based on the web page content

Input Schema

| Name | Required | Description | Default |
|----------|----------|-------------|---------|
| url | Yes | | |
| question | Yes | | |

Output Schema

| Name | Required | Description | Default |
|--------|----------|-------------|---------|
| result | Yes | | |

Implementation Reference

  • Handler function for the 'web_content_qna' tool. Decorated with @mcp.tool() for registration in FastMCP. Takes URL and question, delegates to RAGProcessor.process_web_qna for execution.
    @mcp.tool()
    def web_content_qna(url: str, question: str) -> str:
        """
        Answer questions about web page content using RAG.
        
        This tool combines web scraping with RAG (Retrieval Augmented Generation)
        to answer specific questions about web page content. It extracts relevant
        content sections and uses AI to provide accurate answers.
        
        Args:
            url: The web page URL to analyze
            question: The question to answer based on the page content
            
        Returns:
            str: AI-generated answer based on the web page content
        """
        return rag_processor.process_web_qna(url, question)
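The `mcp` and `rag_processor` objects the handler relies on are constructed elsewhere in the server module and are not shown on this page. A minimal sketch of the delegation pattern, exercised with a stand-in processor (the stub class and its behavior are assumptions for illustration, not the real `RAGProcessor`):

```python
# Stand-in for the RAGProcessor instance the real server constructs.
class StubRAGProcessor:
    def process_web_qna(self, url: str, question: str) -> str:
        # The real method runs the full scrape -> chunk -> retrieve -> answer
        # pipeline; the stub simply echoes its inputs.
        return f"Stub answer for {question!r} about {url}"

rag_processor = StubRAGProcessor()

def web_content_qna(url: str, question: str) -> str:
    """Answer questions about web page content using RAG (stubbed)."""
    return rag_processor.process_web_qna(url, question)

print(web_content_qna("https://example.com", "What is this page about?"))
```

In the actual server the same function carries the `@mcp.tool()` decorator, so FastMCP derives the input schema (two required strings) directly from the type hints.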
  • Core helper method implementing the RAG pipeline for web content Q&A. Extracts markdown from URL using url_to_markdown, chunks content, retrieves relevant chunks based on question similarity, and generates answer using OpenAI GPT.
    def process_web_qna(self, url: str, question: str) -> str:
        """
        Process a URL and answer a question about its content.
        
        This is the main RAG function that combines web extraction and QA.
        
        Args:
            url: The URL to analyze
            question: The question to answer based on the URL content
            
        Returns:
            str: The answer to the question based on the web content
        """
        try:
            # Extract content from URL
            markdown_content = url_to_markdown(url)
            
            if markdown_content.startswith("Error"):
                return f"Could not process the URL: {markdown_content}"
            
            # Chunk the content
            chunks = self.chunk_content(markdown_content)
            
            if not chunks:
                return "No content could be extracted from the URL to answer your question."
            
            # Select relevant chunks
            relevant_chunks = self.select_relevant_chunks(question, chunks)
            
            if not relevant_chunks:
                return f"The content from {url} doesn't seem to contain information relevant to your question: '{question}'"
            
            # Generate answer
            answer = self.generate_answer(question, relevant_chunks)
            
            return answer
            
        except Exception as e:
            return f"Error processing question about {url}: {str(e)}"
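The helper methods `chunk_content`, `select_relevant_chunks`, and `generate_answer` belong to `RAGProcessor` but are not reproduced on this page. A rough sketch of what the first two stages might look like, using naive paragraph packing and word-overlap scoring in place of the real retrieval (every detail here is an assumption, not the actual implementation, which likely uses embeddings):

```python
import re

def _words(text: str) -> set[str]:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))

def chunk_content(markdown: str, max_chars: int = 1000) -> list[str]:
    """Naive chunker: split on blank lines, then pack paragraphs into
    chunks of at most max_chars characters."""
    paragraphs = [p.strip() for p in markdown.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def select_relevant_chunks(question: str, chunks: list[str],
                           top_k: int = 3) -> list[str]:
    """Rank chunks by word overlap with the question and keep the top_k
    that share at least one word with it."""
    q_words = _words(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & _words(c)), reverse=True)
    return [c for c in ranked[:top_k] if q_words & _words(c)]
```

Returning an empty list from the selection step is what triggers the "doesn't seem to contain information relevant to your question" message in `process_web_qna` above.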
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses key behavioral traits: it performs web scraping and uses AI (RAG) to generate answers, which implies external API calls and potential rate limits or latency. However, it lacks details on error handling, authentication needs, or specific limitations (e.g., website compatibility, content size). The description does not contradict any annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded: the first sentence states the core purpose, followed by brief elaboration on the method, and ends with clear Arg/Return sections. Every sentence adds value without redundancy, and the structure is logical and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (web scraping plus AI), the absence of annotations, and the presence of an output schema (a single string result), the description is mostly complete. It covers purpose, method, parameters, and return type, but could benefit from more behavioral context (e.g., limitations, errors). The output schema reduces the need to explain return values, but the description still lacks some operational details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaningful semantics beyond the schema by explaining that 'url' is for 'the web page URL to analyze' and 'question' is 'the question to answer based on the page content', clarifying their roles. However, it does not provide format examples or constraints (e.g., URL validation, question phrasing). With 0% coverage and 2 parameters, this is above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Answer questions about web page content using RAG' with specific verbs ('answer questions', 'combines web scraping with RAG', 'extracts relevant content sections') and distinguishes it from the sibling tool 'url_to_markdown_tool' by focusing on Q&A rather than conversion. It explicitly mentions the resource (web page content) and method (RAG).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to answer specific questions about web page content'), but does not explicitly state when not to use it or name alternatives. It implies usage for Q&A tasks involving web content, though it lacks explicit exclusions or comparisons to the sibling tool beyond their differing functions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kimdonghwi94/web-analyzer-mcp'
