# FAQ RAG + MCP Tool (Starter Skeleton)

Uses OpenAI's API for embedding generation and language-model completions to power RAG-based FAQ retrieval and answer generation.

This is a minimal starting point for the MCP option.
## Contents

- `rag_core.py` — RAG core
- `mcp_server.py` — MCP server exposing `ask_faq`
- `faqs/` — tiny sample corpus
- `requirements.txt` — Python dependencies
## Quick Start
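The commands below are an assumed setup flow for this skeleton (the `OPENAI_API_KEY` variable is the standard OpenAI env var; adjust the run command if your entry point differs):

```bash
# Install dependencies from the included requirements file.
pip install -r requirements.txt

# The OpenAI key used for embeddings and answer generation.
export OPENAI_API_KEY="sk-..."

# Start the MCP server exposing the ask_faq tool.
python mcp_server.py
```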
## Design Principles (Evaluation Criteria)
This implementation prioritizes Simplicity, Practicality, and Interface Correctness.
### 1. Simplicity over Over-Engineering
- **No Vector Database:** Instead of adding heavy dependencies like Chroma or Pinecone, we use `numpy` for in-memory cosine similarity. For a small FAQ list, this is faster, easier to debug, and removes deployment complexity (see the sketch after this list).
- **FastMCP:** We use the high-level `FastMCP` interface to reduce boilerplate, keeping the server code focused on logic rather than protocol details.
- **Global State:** We preload the corpus at import time for simplicity in this specific "server" context, avoiding complex dependency-injection containers.
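For reference, the whole retrieval step fits in a few lines of `numpy`; the function name below is illustrative, not the actual `rag_core.py` API:

```python
import numpy as np

def top_k_indices(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 3) -> list[int]:
    """Indices of the k FAQ entries most similar to the query, by cosine similarity."""
    # Normalize rows so a plain dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    scores = d @ q  # shape: (n_docs,)
    return np.argsort(scores)[::-1][:k].tolist()
```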
### 2. Practicality
- **Robust Error Handling:** The server uses structured logging and catches API errors (e.g., rate limits) to prevent crashes, returning user-friendly error messages to the LLM (see the sketch after this list).
- **Exposed Resources:** The `faq://` resource allows the LLM (and developers) to inspect the raw content of any FAQ file, which is crucial for verifying answers or debugging retrieval issues.
- **Pre-defined Prompts:** The `ask_faq_expert` prompt helps users/LLMs start with the right context immediately.
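A sketch of the error-handling pattern, assuming the `openai` v1 SDK (the logger name, model choice, and reply strings are illustrative):

```python
import logging

import openai

logger = logging.getLogger("faq_mcp")

def safe_answer(question: str) -> str:
    """Call the LLM, logging failures and returning a friendly message instead of crashing."""
    try:
        resp = openai.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; substitute your own
            messages=[{"role": "user", "content": question}],
        )
        return resp.choices[0].message.content
    except openai.RateLimitError:
        logger.warning("OpenAI rate limit hit for question: %r", question)
        return "The FAQ service is temporarily rate-limited; please retry shortly."
    except openai.APIError as exc:
        logger.error("OpenAI API error: %s", exc)
        return "The FAQ service hit an upstream error; please retry."
```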
### 3. Interface Correctness
- **Standard MCP Patterns:** We strictly follow MCP standards by exposing Tools (actions), Resources (data), and Prompts (context).
- **Type Safety:** All tools use Python type hints (`str`, `int`), which `FastMCP` automatically converts to JSON Schema for the LLM to understand (see the sketch after this list).
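The shape of all three MCP surfaces, using the official MCP Python SDK's `FastMCP` decorators. This is a self-contained sketch, not the repo's actual code: the tool body is a stub, and the `.md` file extension is an assumption.

```python
from pathlib import Path

from mcp.server.fastmcp import FastMCP

FAQ_DIR = Path("faqs")  # the sample corpus directory from this repo
mcp = FastMCP("faq")

@mcp.tool()
def ask_faq(question: str, top_k: int = 3) -> str:
    """Retrieve the top_k most relevant FAQ entries and answer the question."""
    # Real logic lives in rag_core; a stub keeps this sketch self-contained.
    return f"(answer to {question!r} from the top {top_k} FAQ matches)"

@mcp.resource("faq://{name}")
def read_faq(name: str) -> str:
    """Expose the raw text of one FAQ file so answers can be verified."""
    return (FAQ_DIR / f"{name}.md").read_text()

@mcp.prompt()
def ask_faq_expert() -> str:
    """Prime the client LLM to ground its answers in the FAQ corpus."""
    return "You are an FAQ expert. Use the ask_faq tool to ground every answer."
```

The type hints on `ask_faq` are what `FastMCP` turns into the tool's JSON Schema, so the client LLM sees `question` as a required string and `top_k` as an optional integer without any hand-written schema.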