Learning Coach MCP Server

---
stepsCompleted: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
inputDocuments: ['details.txt']
workflowType: 'prd'
lastStep: 10
project_name: 'newsletter_mcp'
user_name: 'Pevanshgulati'
date: '2025-12-01'
---

# Product Requirements Document - newsletter_mcp

**Author:** Pevanshgulati

**Date:** 2025-12-01

## Executive Summary

**newsletter_mcp** is a personalized learning companion MCP server that helps self-directed learners retain knowledge as they progress through structured curricula. Instead of generic daily newsletters or rigid learning platforms, it provides intelligent, context-aware insights generated from the learner's own curated content sources.

The product addresses the core challenge of self-directed learning: **information overload without personalized guidance leads to poor retention and abandoned goals.** Learners start with motivation but struggle to maintain momentum when they're buried in content that isn't contextually relevant to their current learning stage.

**Target Users:** Self-directed learners following week-by-week curricula - bootcamp students, developers learning new stacks, career changers, anyone on a structured learning path who wants to actually retain what they learn.

**Core Value Proposition:** Through RAG-powered daily digests, learners receive 5-7 high-quality insights that are:

- **Personalized** to their current week and topics
- **Filtered** using RAGAS evaluation for quality and relevance
- **Customizable** based on their own content sources
- **Intelligent** about timing and context for optimal retention

The MCP architecture means this works across any AI client (Claude, GPT, etc.), making it learning infrastructure rather than just another app.

### What Makes This Special

**Personalized, not prescriptive:** Unlike traditional learning platforms that push generic content, newsletter_mcp works with content the learner already trusts. They curate their sources (favorite blogs, Twitter follows, Reddit communities), and the system intelligently synthesizes insights matched to their learning stage.

**Retention-focused, not completion-focused:** The goal isn't just to finish Week 12 of a bootcamp - it's to retain and recall concepts weeks later. Daily digests provide spaced repetition of key concepts at contextually optimal moments.

**Infrastructure, not siloed app:** Built as an MCP server, this becomes universal learning infrastructure. Any LLM can become a personalized learning coach, not locked into one ecosystem.

**Quality-controlled intelligence:** RAGAS evaluation ensures insights meet minimum quality thresholds (faithfulness, relevance, context precision). It's not just content summarization - it's intelligent curation.

**User feedback drives improvement:** The system learns from user feedback, improving insight quality and relevance over time. The more you use it, the better it understands your learning style.

**The "Aha!" moment:** When a learner realizes their scattered content sources are being intelligently synthesized into perfectly-timed insights that actually help them understand and retain concepts - that's when they know this is different.

## Project Classification

**Technical Type:** api_backend

**Domain:** edtech

**Complexity:** medium

This is fundamentally a backend service exposing tools via the MCP protocol. The MCP server provides four core tools (generate_daily_digest, add_content_source, update_progress, search_insights) that any MCP-compatible AI client can invoke.

**Technical Architecture:** FastMCP server with RAG pipeline, Supabase (PostgreSQL + pgvector) for storage and semantic search, HuggingFace embeddings for vector operations, RAGAS for insight evaluation.

**Domain Considerations:** As an edtech product focused on students/learners, we must consider content moderation and age-appropriate filtering, though this is V1 focused on adult self-learners. Privacy around learning progress and content sources is important but not at COPPA/FERPA compliance levels.

**Complexity Rationale:** Medium complexity due to RAG implementation, vector search optimization, and quality evaluation pipelines. Not high complexity because V1 targets adult learners without stringent regulatory requirements.

## Success Criteria

### User Success

**Primary Success Indicators:**

- **Read rates**: Users consistently engage with daily digests (not just another ignored notification)
- **Retention**: Users can recall and apply concepts from previous weeks of their learning journey

**Observable Success Behaviors:**

- Users return to the digest daily or multiple times per week
- Users actively use `search_insights` to recall past learning
- Users continue updating their progress week-over-week (indicating sustained engagement)
- Users provide positive feedback on insight relevance and quality

**The Success Moment:** When a learner realizes their curated content sources are being intelligently synthesized into perfectly-timed insights that actually help them understand and retain concepts.
### Business Success

**V1 Focus:** Prove the concept works with real learners

- Validate that RAG + RAGAS produces high-quality, relevant insights
- Confirm users find value in personalized daily digests
- Demonstrate MCP infrastructure model is viable across different AI clients
- Gather user feedback to inform post-MVP iterations

**Measured Through:**

- User engagement patterns (digest generation frequency, search usage)
- Qualitative feedback from early adopters
- RAGAS quality scores trending above 0.7 threshold
- Successful MCP integration with multiple AI clients

### Technical Success

**Infrastructure:**

- MCP server successfully connects to Claude and other MCP-compatible clients
- All four core tools (`generate_daily_digest`, `add_content_source`, `update_progress`, `search_insights`) are functional and reliable
- Supabase vector search performs efficiently at initial scale

**RAG Pipeline:**

- Content chunking and embedding pipeline processes sources correctly
- Vector similarity search returns relevant results
- RAGAS evaluation accurately filters low-quality insights (minimum threshold: 0.7)
- Digest generation completes within reasonable timeframe (< 30 seconds for 5-7 insights)

**Data Quality:**

- Embeddings are generated correctly and stored in pgvector
- Content metadata is extracted and stored accurately
- User profile and progress tracking persists reliably

### Measurable Outcomes

**For V1, success means:**

- A working MCP server that any compatible AI client can connect to
- Daily digests that consistently score above 0.7 on RAGAS metrics
- Users can curate their own content sources and see them reflected in personalized insights
- Search functionality returns semantically relevant past insights
- The system works end-to-end: add sources → update progress → generate digest → search insights

## Product Scope

### MVP - Minimum Viable Product (V1)

**Core MCP Tools (All Required):**

1. **`generate_daily_digest`**
   - Fetches user profile (current week, topics, preferences)
   - Generates semantic queries based on current learning context
   - Retrieves top-k relevant content chunks via vector search
   - Uses LLM to generate 5-7 candidate insights from retrieved context
   - Evaluates insights with RAGAS (faithfulness, relevance, context precision)
   - Filters insights with minimum score threshold (0.7)
   - Returns formatted digest to user
2. **`add_content_source`**
   - Accepts source URL and type (blog, article, manual input)
   - Fetches content from URL
   - Extracts and stores metadata
   - Chunks content intelligently (512 tokens, 50 token overlap)
   - Generates embeddings using HuggingFace models
   - Stores in Supabase with vector index
3. **`update_progress`**
   - Updates user profile with current week number
   - Updates current topics being studied
   - Records completed topics
   - Updates learning preferences
   - Returns updated profile summary
4. **`search_insights`**
   - Accepts semantic query from user
   - Generates query embedding
   - Performs vector similarity search across past insights
   - Applies optional filters (date range, topics, week number)
   - Ranks by relevance and recency
   - Returns formatted results with context

**Infrastructure Requirements:**

- FastMCP server with MCP protocol support
- Supabase database with PostgreSQL + pgvector extension
- HuggingFace embeddings service (sentence-transformers model)
- RAGAS evaluation pipeline
- Basic error handling and logging

**Data Schema:**

- User profile and learning state
- Content sources and raw content storage
- Vector embeddings (content + insights)
- Daily digests and individual insights
- Progress tracking

**Quality Requirements:**

- RAGAS minimum threshold: 0.7 composite score
- Digest generation: 5-7 insights per request
- Vector search: returns relevant results
- Content chunking: preserves context with overlap

### Growth Features (Post-MVP)

**Automated Content Fetching:**

- Twitter integration for auto-fetching tweets from followed accounts
- Reddit integration for subreddit monitoring
- RSS feed support for blog auto-updates
- Scheduled content refresh

**Advanced Personalization:**

- Feedback loop: users rate insights to improve future recommendations
- Adaptive difficulty based on user performance
- Learning pattern detection and optimization
- Spaced repetition scheduling based on retention curves

**Multi-User & Collaboration:**

- Support for multiple user profiles
- Shared content sources across learning cohorts
- Collaborative insights and discussions
- Team/cohort progress tracking

**Analytics & Insights:**

- User engagement dashboard
- Learning progress visualization
- Content source effectiveness metrics
- RAGAS quality trend analysis

**Enhanced Search:**

- Cross-reference between related insights
- Concept mapping and knowledge graphs
- "Similar insights" recommendations
- Export/save favorite insights

### Vision (Future)

**Universal Learning Infrastructure:**

- MCP protocol becomes standard for AI learning assistants
- Integration with major learning platforms and LMS systems
- Open ecosystem of curated content sources
- Community-driven insight quality improvements

**Advanced AI Capabilities:**

- Multi-modal learning (video, audio, code snippets)
- Interactive practice problems generated from insights
- Adaptive curriculum recommendations
- Real-time concept clarification via AI tutor

**Scale & Platform:**

- Support for diverse learning domains beyond tech/bootcamps
- Enterprise learning & development platform integration
- Mobile-optimized digest delivery
- Offline-first capabilities

## Functional Requirements

### MCP Server Infrastructure

- **FR1**: The system can accept connections from MCP-compatible AI clients (Claude, GPT, etc.)
- **FR2**: The system can expose tools via MCP protocol for client invocation
- **FR3**: The system can handle concurrent tool invocations from multiple clients
- **FR4**: The system can return structured responses to MCP tool calls

### User Profile & Learning Progress

- **FR5**: Users can set their current learning week number
- **FR6**: Users can define their current learning topics
- **FR7**: Users can mark topics as completed
- **FR8**: Users can update their learning preferences (difficulty level, content types)
- **FR9**: The system can persist user profile data across sessions
- **FR10**: Users can view their updated profile summary after changes

### Content Source Management

- **FR11**: Users can add content sources by providing a URL
- **FR12**: Users can specify the source type (blog, article, manual input)
- **FR13**: The system can fetch content from provided URLs
- **FR14**: The system can extract metadata from content sources (title, author, publish date)
- **FR15**: The system can chunk content into semantically meaningful segments
- **FR16**: The system can generate vector embeddings for content chunks
- **FR17**: The system can store content and embeddings in the database with vector index

### Daily Digest Generation

- **FR18**: Users can generate a personalized daily learning digest
- **FR19**: The system can retrieve user profile context (current week, topics, preferences) for digest generation
- **FR20**: The system can generate semantic queries based on user's current learning context
- **FR21**: The system can perform vector similarity search to retrieve relevant content chunks
- **FR22**: The system can use LLM to generate candidate insights from retrieved content
- **FR23**: The system can generate 5-7 insights per digest request
- **FR24**: The system can evaluate each insight using RAGAS metrics (faithfulness, relevance, context precision)
- **FR25**: The system can filter insights below quality threshold (0.7 composite score)
- **FR26**: The system can return formatted digest with high-quality insights to the user
- **FR27**: The system can complete digest generation within 30 seconds

### Insight Search & Retrieval

- **FR28**: Users can search past insights using semantic queries
- **FR29**: The system can generate query embeddings for search requests
- **FR30**: The system can perform vector similarity search across stored insights
- **FR31**: Users can apply filters to search results (date range, topics, week number)
- **FR32**: The system can rank search results by relevance and recency
- **FR33**: The system can return formatted search results with contextual information

### Quality Assurance & Evaluation

- **FR34**: The system can measure faithfulness of generated insights to source content
- **FR35**: The system can measure answer relevancy against user's learning context
- **FR36**: The system can measure context precision of retrieved content
- **FR37**: The system can compute composite RAGAS scores for each insight
- **FR38**: The system can reject insights that don't meet minimum quality standards

### Data Persistence & Management

- **FR39**: The system can store user profiles and learning state persistently
- **FR40**: The system can store content sources and raw content
- **FR41**: The system can store vector embeddings with efficient retrieval capabilities
- **FR42**: The system can store daily digests and individual insights
- **FR43**: The system can track progress history over time
- **FR44**: The system can maintain referential integrity between related data (users, content, insights)

### Content Processing Pipeline

- **FR45**: The system can process content in configurable chunk sizes (512 tokens default)
- **FR46**: The system can apply overlapping windows for context preservation (50 tokens default)
- **FR47**: The system can handle various content formats from different source types
- **FR48**: The system can extract and preserve content structure and formatting

### Personalization & Adaptation

- **FR49**: The system can tailor digest insights to user's current learning week
- **FR50**: The system can tailor digest insights to user's specified topics
- **FR51**: The system can adjust content selection based on user preferences
- **FR52**: The system can prioritize recent and relevant content for current learning stage

### Error Handling & Logging

- **FR53**: The system can handle errors gracefully during content fetching
- **FR54**: The system can handle errors gracefully during embedding generation
- **FR55**: The system can handle errors gracefully during digest generation
- **FR56**: The system can log errors and system events for debugging
- **FR57**: The system can return meaningful error messages to users when operations fail

## Non-Functional Requirements

### Performance

**Response Time Requirements:**

- **NFR1**: MCP tool invocations return responses within 30 seconds for digest generation
- **NFR2**: Simple tool calls (`update_progress`, `add_content_source`) complete within 5 seconds
- **NFR3**: Search queries return results within 3 seconds for up to 1000 stored insights

**Throughput Requirements:**

- **NFR4**: The system can handle at least 10 concurrent digest generation requests without performance degradation
- **NFR5**: Vector similarity search maintains sub-3-second response times with up to 100,000 embedded content chunks

**Processing Efficiency:**

- **NFR6**: Content chunking and embedding generation processes at least 10 articles (average 2000 words each) per minute
- **NFR7**: RAGAS evaluation processes 20 candidate insights in parallel within the 30-second digest generation window

### Security

**Data Protection:**

- **NFR8**: All data at rest in Supabase is encrypted using AES-256 encryption
- **NFR9**: All data in transit uses TLS 1.2 or higher for connections between components
- **NFR10**: Database credentials and API keys are stored securely using environment variables or secret management systems
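
As an illustration of NFR10, the sketch below reads credentials from environment variables at startup and fails fast when one is missing. It is a minimal sketch, not the project's actual configuration code: the variable names (`SUPABASE_URL`, `SUPABASE_KEY`, `HF_API_TOKEN`), the `Settings` class, and `load_settings` are all hypothetical and do not come from this PRD.

```python
import os
from dataclasses import dataclass

# Hypothetical variable names; the PRD does not specify them.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_KEY", "HF_API_TOKEN")


@dataclass(frozen=True, repr=False)
class Settings:
    """Secrets the server needs at startup (illustrative only)."""

    supabase_url: str
    supabase_key: str
    hf_api_token: str

    def __repr__(self) -> str:
        # Mask secret values so they never reach logs or tracebacks.
        return (
            f"Settings(supabase_url={self.supabase_url!r}, "
            "supabase_key='***', hf_api_token='***')"
        )


def load_settings(env=None) -> Settings:
    """Read required secrets from the environment; fail fast if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        supabase_url=env["SUPABASE_URL"],
        supabase_key=env["SUPABASE_KEY"],
        hf_api_token=env["HF_API_TOKEN"],
    )
```

Passing a plain dictionary as `env` keeps the loader testable without touching the real process environment. Masking the values in `__repr__` is one way to stay in line with the requirement that logs carry no sensitive data.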

**Access Control:**

- **NFR11**: User profile data is isolated per user with proper access controls preventing cross-user data access
- **NFR12**: MCP client connections are authenticated before tool invocation is permitted
- **NFR13**: Content source URLs are validated and sanitized to prevent injection attacks

**Privacy:**

- **NFR14**: User learning progress, topics, and content sources are treated as private data and not shared without explicit consent
- **NFR15**: System logs do not contain sensitive user information (learning topics, personal preferences)

### Scalability

**User Growth:**

- **NFR16**: The system architecture supports scaling from 10 initial users to 1,000 users with minimal configuration changes
- **NFR17**: Database schema and vector index design accommodate 10x growth in stored content without requiring migration

**Data Growth:**

- **NFR18**: Vector search performance remains acceptable (< 5 second queries) with up to 1 million embedded content chunks
- **NFR19**: Storage architecture supports at least 100 GB of content and embeddings for V1

**Concurrent Load:**

- **NFR20**: The system handles at least 50 concurrent MCP client connections without dropped connections
- **NFR21**: Background processing (embedding generation, RAGAS evaluation) scales horizontally to handle increased load

### Reliability

**Availability:**

- **NFR22**: The MCP server maintains 95% uptime during business hours (9 AM - 6 PM user timezone)
- **NFR23**: Database connections automatically retry on transient failures with exponential backoff

**Data Integrity:**

- **NFR24**: User profile updates are atomic - partial updates do not corrupt user state
- **NFR25**: Content embeddings and source content maintain referential integrity - no orphaned embeddings
- **NFR26**: Failed digest generation attempts do not leave incomplete data in the database

**Error Recovery:**

- **NFR27**: Content fetching failures return meaningful error messages and do not crash the server
- **NFR28**: Embedding service failures are logged and retry automatically up to 3 times before reporting failure
- **NFR29**: MCP tool calls that fail return structured error responses to clients for proper error handling

**Monitoring & Observability:**

- **NFR30**: System logs capture error events with sufficient detail for debugging (error type, stack trace, context)
- **NFR31**: Critical failures (database connection loss, embedding service unavailable) trigger alerts
- **NFR32**: Performance metrics (digest generation time, search latency, RAGAS scores) are logged for analysis
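
The retry behaviour in NFR23 and NFR28 (retry on transient failures with exponential backoff, up to 3 retries before reporting failure) could be sketched as follows. This is one possible shape under stated assumptions, not the project's implementation: `TransientError`, `retry_with_backoff`, and the 0.5-second base delay are hypothetical, since the PRD gives only the retry count.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,            # NFR28: retry up to 3 times
    base_delay: float = 0.5,         # assumed value; not specified in the PRD
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation; on TransientError, retry with a doubling delay."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure to the caller
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable when max_retries >= 0")
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without real waiting. A production version would likely also log each failed attempt, as NFR28 requires.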
