web_content_qna
Extract answers from web pages by analyzing content with AI. Provide a URL and question to get specific information from the page.
Instructions
Answer questions about web page content using RAG.
This tool combines web scraping with RAG (Retrieval Augmented Generation) to answer specific questions about web page content. It extracts relevant content sections and uses AI to provide accurate answers.
Args:
- url: The web page URL to analyze
- question: The question to answer based on the page content

Returns:
- str: AI-generated answer based on the web page content
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The web page URL to analyze | |
| question | Yes | The question to answer based on the page content | |
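
As a usage sketch, the tool can be invoked from any MCP client; the example below uses the official MCP Python SDK. The launch command (`uvx web-analyzer-mcp`) and the sample URL and question are assumptions for illustration, not taken from the project.

```python
# Hypothetical client-side invocation; adjust the server launch command
# to however web-analyzer-mcp is installed in your environment.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Assumed launch command for the server.
    server = StdioServerParameters(command="uvx", args=["web-analyzer-mcp"])

    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            result = await session.call_tool(
                "web_content_qna",
                {
                    "url": "https://example.com/blog/post",
                    "question": "What is the main argument of this post?",
                },
            )
            # The answer comes back as text content on the tool result.
            print(result.content[0].text)


if __name__ == "__main__":
    asyncio.run(main())
```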
Implementation Reference
- web_analyzer_mcp/server.py:38-54 (handler)

  Handler function for the 'web_content_qna' tool. Decorated with @mcp.tool() for registration in FastMCP. Takes the URL and question and delegates to RAGProcessor.process_web_qna for execution.

  ```python
  @mcp.tool()
  def web_content_qna(url: str, question: str) -> str:
      """
      Answer questions about web page content using RAG.

      This tool combines web scraping with RAG (Retrieval Augmented Generation)
      to answer specific questions about web page content. It extracts relevant
      content sections and uses AI to provide accurate answers.

      Args:
          url: The web page URL to analyze
          question: The question to answer based on the page content

      Returns:
          str: AI-generated answer based on the web page content
      """
      return rag_processor.process_web_qna(url, question)
  ```
- RAGProcessor.process_web_qna (core helper)

  Core helper method implementing the RAG pipeline for web content Q&A. Extracts markdown from the URL using url_to_markdown, chunks the content, retrieves relevant chunks based on question similarity, and generates an answer using OpenAI GPT.

  ```python
  def process_web_qna(self, url: str, question: str) -> str:
      """
      Process a URL and answer a question about its content.

      This is the main RAG function that combines web extraction and QA.

      Args:
          url: The URL to analyze
          question: The question to answer based on the URL content

      Returns:
          str: The answer to the question based on the web content
      """
      try:
          # Extract content from URL
          markdown_content = url_to_markdown(url)
          if markdown_content.startswith("Error"):
              return f"Could not process the URL: {markdown_content}"

          # Chunk the content
          chunks = self.chunk_content(markdown_content)
          if not chunks:
              return "No content could be extracted from the URL to answer your question."

          # Select relevant chunks
          relevant_chunks = self.select_relevant_chunks(question, chunks)
          if not relevant_chunks:
              return f"The content from {url} doesn't seem to contain information relevant to your question: '{question}'"

          # Generate answer
          answer = self.generate_answer(question, relevant_chunks)
          return answer

      except Exception as e:
          return f"Error processing question about {url}: {str(e)}"
  ```
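
The helper methods referenced above (chunk_content, select_relevant_chunks, generate_answer) are not reproduced in this reference. The sketch below shows one plausible way to implement them with OpenAI embeddings and chat completions; the constructor signature, model names, and chunking strategy are illustrative assumptions rather than the project's actual code.

```python
# Illustrative sketch only -- the real RAGProcessor may differ substantially.
from openai import OpenAI


class RAGProcessor:
    def __init__(self, api_key: str, chunk_size: int = 1500, top_k: int = 4):
        self.client = OpenAI(api_key=api_key)
        self.chunk_size = chunk_size
        self.top_k = top_k

    def chunk_content(self, markdown: str) -> list[str]:
        # Naive fixed-size chunking; real implementations often split on headings.
        return [
            markdown[i : i + self.chunk_size]
            for i in range(0, len(markdown), self.chunk_size)
            if markdown[i : i + self.chunk_size].strip()
        ]

    def select_relevant_chunks(self, question: str, chunks: list[str]) -> list[str]:
        # Embed the question and all chunks, then keep the top_k by cosine similarity.
        resp = self.client.embeddings.create(
            model="text-embedding-3-small", input=[question, *chunks]
        )
        q_vec, *chunk_vecs = [d.embedding for d in resp.data]

        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
            return dot / norm if norm else 0.0

        scored = sorted(
            zip(chunks, chunk_vecs), key=lambda cv: cosine(q_vec, cv[1]), reverse=True
        )
        return [chunk for chunk, _ in scored[: self.top_k]]

    def generate_answer(self, question: str, relevant_chunks: list[str]) -> str:
        # Ask the model to answer strictly from the retrieved context.
        context = "\n\n---\n\n".join(relevant_chunks)
        completion = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return completion.choices[0].message.content or ""
```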