
MCP RSS

by ronnycoding
[![MseeP.ai Security Assessment Badge](https://mseep.net/pr/buhe-mcp-rss-badge.png)](https://mseep.ai/app/buhe-mcp-rss)

# MCP RSS

MCP RSS is a Model Context Protocol (MCP) server for intelligent RSS feed management with advanced search capabilities, semantic search using AI embeddings, and a comprehensive reading workflow.

## Features

- 📰 **RSS Feed Management** - Parse OPML files and automatically fetch articles from RSS feeds
- 🔍 **Advanced Search** - Keyword search with date range, category, and status filtering
- 🤖 **Semantic Search** - AI-powered natural language search using OpenAI embeddings (optional)
- 📊 **Smart Organization** - Four-status workflow (unread/read/favorite/archived)
- 📅 **Daily Digest** - Get today's unread articles grouped by category
- 🚀 **High Performance** - PostgreSQL with pgvector for efficient vector similarity search
- 🔄 **Auto-Deduplication** - Prevents duplicate articles and wasted API calls
- ⚡ **Token-Efficient** - Browse titles/excerpts first, fetch full content only when needed
- 📑 **Pagination Support** - Handle large feed collections (500+) with efficient pagination

## Installation

### Prerequisites

- Node.js (v18 or higher)
- Docker & Docker Compose (for PostgreSQL with pgvector)
- OpenAI API Key (optional, only for semantic search)

### Quick Start with Docker Compose

1. **Clone or install the package:**

   ```bash
   npm install -g mcp_rss
   # OR for local development
   git clone <repository-url>
   cd mcp_rss
   npm install
   ```

2. **Start PostgreSQL with pgvector:**

   ```bash
   docker-compose up -d
   ```

3. **Configure environment variables:**

   ```bash
   cp .env.example .env
   # Edit .env with your settings
   ```

4. **Build the project:**

   ```bash
   npm run build
   ```

### Database Setup

The project uses PostgreSQL 17 with the pgvector extension for vector similarity search.

**Using Docker Compose (Recommended):**

```bash
docker-compose up -d              # Start PostgreSQL
docker-compose down               # Stop PostgreSQL
docker-compose down -v            # Stop and remove volumes (fresh start)
docker-compose logs -f postgres   # View PostgreSQL logs
```

**Manual PostgreSQL Setup:**

```bash
docker run -d \
  --name mcp-rss-postgres \
  -p 5433:5432 \
  -e POSTGRES_USER=mcp_user \
  -e POSTGRES_PASSWORD=123456 \
  -e POSTGRES_DB=mcp_rss \
  pgvector/pgvector:pg17
```

## Configuration

### Environment Variables

Create a `.env` file with the following configuration:

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| **Database Configuration** | | | |
| `DB_HOST` | PostgreSQL host | `localhost` | No |
| `DB_PORT` | PostgreSQL port | `5433` | No |
| `DB_USER` / `DB_USERNAME` | Database username | `mcp_user` | No |
| `DB_PASSWORD` | Database password | `123456` | No |
| `DB_NAME` / `DB_DATABASE` | Database name | `mcp_rss` | No |
| **RSS Configuration** | | | |
| `OPML_FILE_PATH` | Path to OPML file with RSS feeds | `./feeds.opml` | Yes |
| `RSS_UPDATE_INTERVAL` | Feed update interval (minutes) | `1` | No |
| **OpenAI Configuration** | | | |
| `OPENAI_API_KEY` | OpenAI API key for embeddings | - | No* |

\* *Only required for the semantic search feature. All other features work without it.*

### Claude Desktop Configuration

For local development, use the built dist folder:

```json
{
  "mcpServers": {
    "rss": {
      "command": "node",
      "args": ["/absolute/path/to/mcp_rss/dist/index.js"],
      "env": {
        "OPML_FILE_PATH": "/path/to/your/feeds.opml",
        "DB_HOST": "localhost",
        "DB_PORT": "5433",
        "DB_USER": "mcp_user",
        "DB_PASSWORD": "123456",
        "DB_NAME": "mcp_rss",
        "RSS_UPDATE_INTERVAL": "60",
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}
```

For global installation via npm:

```json
{
  "mcpServers": {
    "rss": {
      "command": "npx",
      "args": ["mcp_rss"],
      "env": {
        "OPML_FILE_PATH": "/path/to/your/feeds.opml",
        "OPENAI_API_KEY": "sk-your-key-here"
      }
    }
  }
}
```

## MCP Tools Reference

The server exposes 8 powerful tools for RSS feed management:

### Token Efficiency Guide

All list/search tools now support ultra-efficient token usage:

- **Default behavior**: Returns ONLY titles and metadata (no excerpts, no content)
- **Optional excerpts**: Set `includeExcerpt: true` for content previews (moderate token usage)
- **Full content**: Set `includeContent: true` for complete article text (high token usage)
- **On-demand content**: Use `get_article_full` to fetch specific articles by ID (most efficient)

**Recommended workflow (90%+ token savings):**

1. Browse titles only with `get_content` or `search_articles` (default settings)
2. Identify interesting articles from titles alone
3. Optionally fetch excerpts for borderline cases with `includeExcerpt: true`
4. Fetch full content with `get_article_full` for selected articles only

**Token Usage Comparison:**

- Titles only: ~50-100 tokens per article
- Titles + excerpts: ~150-300 tokens per article
- Titles + full content: ~1,000-5,000 tokens per article

### 1. get_content

Get articles with basic filtering and pagination. **Returns latest articles first** (sorted by pubDate DESC).

**Use this for:**

- Browsing recent articles
- Checking unread articles
- Simple filtering by status or source
- Date range filtering for specific time periods

**Token Efficiency:**

- By default, returns ONLY titles and metadata (most token-efficient)
- Set `includeExcerpt: true` to add content previews
- Set `includeContent: true` to get full article text
- For best efficiency: browse titles only, then use `get_article_full` for specific articles

**Parameters:**

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `statuses` | `string[]` | Filter by statuses: `"unread"`, `"read"`, `"favorite"`, `"archived"` | All statuses |
| `source` | `string` | Filter by feed source title | All sources |
| `limit` | `number` | Number of articles to return | `10` |
| `offset` | `number` | Offset for pagination | `0` |
| `favoriteBlogsOnly` | `boolean` | Only show articles from favorite blogs | `false` |
| `prioritizeFavoriteBlogs` | `boolean` | Show favorite blog articles first | `false` |
| `includeContent` | `boolean` | Include full article content (uses more tokens) | `false` |
| `includeExcerpt` | `boolean` | Include article excerpt/preview | `false` |
| `startDate` | `string` | Start date (ISO: YYYY-MM-DD or YYYY-MM-DDTHH:mm:ssZ) | - |
| `endDate` | `string` | End date (ISO format) | - |

**Example (titles only - most efficient):**

```json
{
  "statuses": ["unread"],
  "limit": 20
}
```

**Example (with date range and excerpts):**

```json
{
  "startDate": "2025-10-01",
  "endDate": "2025-10-25",
  "includeExcerpt": true,
  "limit": 15
}
```

**Example (favorite blogs with full content):**

```json
{
  "favoriteBlogsOnly": true,
  "limit": 5,
  "includeContent": true
}
```

**Response (default - titles only, no excerpt/content):**

```json
{
  "articles": [
    {
      "id": 123,
      "title": "Article Title",
      "link": "https://example.com/article",
      "pubDate": "2024-01-15T10:30:00Z",
      "fetchDate": "2024-01-15T11:00:00Z",
      "status": "unread",
      "feedTitle": "Engineering Blog",
      "feedCategory": "Technology"
    }
  ],
  "total": 150,
  "success": true
}
```

---

### 2. search_articles

Advanced search with keyword matching, date ranges, categories, and status filters. **Searches both title and content**.

**Use this for:**

- Finding articles on specific topics
- Date-based filtering
- Complex multi-criteria searches
- Category-specific searches

**Parameters:**

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `keyword` | `string` | Search term (case-insensitive, searches title + content) | - |
| `category` | `string` | Filter by feed category | - |
| `statuses` | `string[]` | Filter by article statuses | All |
| `startDate` | `string` | Start date (ISO format: `YYYY-MM-DD` or `YYYY-MM-DDTHH:mm:ssZ`) | - |
| `endDate` | `string` | End date (ISO format) | - |
| `limit` | `number` | Number of results | `20` |
| `offset` | `number` | Offset for pagination | `0` |
| `includeContent` | `boolean` | Include full article content (uses more tokens) | `false` |

**Example:**

```json
{
  "keyword": "kubernetes",
  "category": "Engineering",
  "startDate": "2024-01-01",
  "endDate": "2024-12-31",
  "statuses": ["unread"],
  "limit": 10
}
```

---

### 3. semantic_search

**AI-powered semantic search** using OpenAI embeddings. Finds conceptually similar articles even without exact keyword matches.

**Use this for:**

- Natural language queries
- Finding related concepts
- Research and discovery
- Topic exploration

**Requirements:**

- `OPENAI_API_KEY` must be set
- Only works for articles from 2020 onwards
- Automatically disabled if API key is missing (fails gracefully)

**Parameters:**

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `query` | `string` | Natural language search query (required) | - |
| `includeContent` | `boolean` | Include full article content (uses more tokens) | `false` |
| `limit` | `number` | Number of results | `10` |
| `statuses` | `string[]` | Filter by article statuses | All |
| `category` | `string` | Filter by feed category | - |

**Example:**

```json
{
  "query": "how to optimize database performance and reduce query latency",
  "limit": 5,
  "statuses": ["unread"]
}
```

**How it works:**

1. Converts your query into a 1536-dimensional vector using OpenAI
2. Compares against article embeddings using pgvector cosine similarity
3. Returns semantically similar articles ranked by relevance

---

### 4. get_daily_digest

Get **today's unread articles** grouped by category. Perfect for daily reading workflows. Filters by **publication date** (pubDate), not fetch date.

**Use this for:**

- Morning briefings
- Daily catch-up
- Category-organized reading
- Articles published today (based on pubDate)

**Parameters:**

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `limit` | `number` | Max articles per category | `5` |
| `includeContent` | `boolean` | Include full article content (uses more tokens) | `false` |

**Example:**

```json
{
  "limit": 5
}
```

**Response:** Articles grouped by category, with up to `limit` articles per category published today.

---

### 5. get_weekly_favorites

**NEW:** Get favorite articles from the last 7 days (titles only). Perfect for weekly review of bookmarked content.

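As a rough illustration of the selection this tool describes (status `favorite`, published within the last 7 days, newest first), the sketch below applies the same rule to an in-memory list. The `ArticleSummary` type and `weeklyFavorites` function are hypothetical; the server presumably does this in a database query, and its exact cutoff handling is not documented here.

```typescript
// Hypothetical shape, reduced to the fields this filter needs.
interface ArticleSummary {
  id: number;
  status: string;
  pubDate: string; // ISO 8601 timestamp
}

// Keep favorites published in the 7 days before `now`, newest first.
// Assumption: the window is exactly 7 * 24 hours measured from `now`.
function weeklyFavorites(articles: ArticleSummary[], now: Date): ArticleSummary[] {
  const cutoff = now.getTime() - 7 * 24 * 60 * 60 * 1000;
  return articles
    .filter((a) => a.status === "favorite" && Date.parse(a.pubDate) >= cutoff)
    .sort((a, b) => Date.parse(b.pubDate) - Date.parse(a.pubDate));
}

const sampleArticles: ArticleSummary[] = [
  { id: 654, status: "favorite", pubDate: "2025-10-20T09:15:00Z" },
  { id: 111, status: "favorite", pubDate: "2025-10-10T08:00:00Z" }, // older than 7 days
  { id: 222, status: "unread", pubDate: "2025-10-23T12:00:00Z" }, // not a favorite
  { id: 789, status: "favorite", pubDate: "2025-10-22T14:30:00Z" },
];

const weekly = weeklyFavorites(sampleArticles, new Date("2025-10-25T00:00:00Z"));
console.log(weekly.map((a) => a.id)); // [ 789, 654 ]
```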
**Use this for:**

- Weekly reading lists
- Reviewing saved articles from the past week
- Tracking important bookmarked content
- Quick overview of what you found valuable recently

**Parameters:** None

**Example:** No parameters needed - simply call the tool.

**Response:**

```json
{
  "articles": [
    {
      "id": 789,
      "title": "Optimizing PostgreSQL for High Write Throughput",
      "link": "https://engineering.example.com/postgres-optimization",
      "pubDate": "2025-10-22T14:30:00Z",
      "fetchDate": "2025-10-22T15:00:00Z",
      "status": "favorite",
      "feedTitle": "Engineering at Example",
      "feedCategory": "Database"
    },
    {
      "id": 654,
      "title": "Building Resilient Microservices with Circuit Breakers",
      "link": "https://blog.example.com/circuit-breakers",
      "pubDate": "2025-10-20T09:15:00Z",
      "fetchDate": "2025-10-20T10:00:00Z",
      "status": "favorite",
      "feedTitle": "Tech Blog",
      "feedCategory": "Architecture"
    }
  ],
  "total": 2,
  "success": true
}
```

**Features:**

- Returns articles marked as "favorite" published in last 7 days
- Sorted by publication date (newest first)
- Ultra token-efficient - titles and metadata only
- No excerpts or content by default
- Use `get_article_full` to read full content of any article

---

### 6. get_article_full

Get full article content by ID. Use this for token-efficient reading: browse titles first, then fetch complete content only for articles you want to read.

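A back-of-the-envelope sketch of why this workflow saves tokens, using the per-article estimates from the Token Efficiency Guide (~50-100 tokens for a title, ~1,000-5,000 for full content). The scenario (browse 20 titles, read 2 in full) and all names in the code are illustrative assumptions, not measurements.

```typescript
// A low/high token estimate.
interface TokenRange {
  low: number;
  high: number;
}

// Per-article estimates quoted in the Token Efficiency Guide.
const TITLE_ONLY: TokenRange = { low: 50, high: 100 };
const FULL_CONTENT: TokenRange = { low: 1000, high: 5000 };

function scale(range: TokenRange, count: number): TokenRange {
  return { low: range.low * count, high: range.high * count };
}

function add(a: TokenRange, b: TokenRange): TokenRange {
  return { low: a.low + b.low, high: a.high + b.high };
}

// Assumed scenario: browse 20 titles, then read 2 articles in full.
const browsedCount = 20;
const readCount = 2;

const selective = add(scale(TITLE_ONLY, browsedCount), scale(FULL_CONTENT, readCount));
const everything = scale(FULL_CONTENT, browsedCount);

console.log(selective);  // { low: 3000, high: 12000 }
console.log(everything); // { low: 20000, high: 100000 }
```

The fewer articles you end up reading in full, the closer the selective total gets to the cost of the titles alone.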
**Use this for:**

- Reading full articles after browsing titles
- Getting complete content for specific interesting articles
- Token-efficient workflow (browse → select → read)

**Parameters:**

| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `articleId` | `number` | Article ID from get_content/search_articles | Yes |

**Example:**

```json
{
  "articleId": 123
}
```

**Response:**

```json
{
  "articles": [
    {
      "id": 123,
      "title": "Complete Article Title",
      "content": "Full article content with all HTML and formatting...",
      "link": "https://example.com/article",
      "pubDate": "2024-01-15T10:30:00Z",
      "fetchDate": "2024-01-15T11:00:00Z",
      "status": "unread",
      "feedTitle": "Engineering Blog",
      "feedCategory": "Technology",
      "excerpt": "First 200 characters..."
    }
  ],
  "success": true
}
```

**Token-Efficient Workflow:**

```
1. get_content(limit=20) → Browse 20 titles (add includeExcerpt=true for previews)
2. Find interesting article with id=456
3. get_article_full(articleId=456) → Read full content
4. set_tag(articleId=456, status="favorite") → Save for later
```

---

### 7. get_sources

Get RSS feed sources with pagination and filtering. With hundreds of feeds, pagination is essential to avoid token limits.

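The offset arithmetic behind paging can be sketched as follows. `pageRequests` is a hypothetical client-side helper that only builds the `limit`/`offset` argument objects; the total of 518 sources is an illustrative figure.

```typescript
// Arguments for one paged call, matching the `limit` and `offset` parameters.
interface PageRequest {
  limit: number;
  offset: number;
}

// Build one request per page needed to cover `total` items.
function pageRequests(total: number, limit: number): PageRequest[] {
  if (limit <= 0 || Math.floor(limit) !== limit) {
    throw new Error("limit must be a positive integer");
  }
  const pages: PageRequest[] = [];
  for (let offset = 0; offset < total; offset += limit) {
    pages.push({ limit, offset });
  }
  return pages;
}

const sourcePages = pageRequests(518, 50);
console.log(sourcePages.length); // 11
console.log(sourcePages[0]);     // { limit: 50, offset: 0 }
console.log(sourcePages[10]);    // { limit: 50, offset: 500 }
```

In practice the `total` field of the first response tells you how many further pages to request.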
**Use this for:**

- Discovering available sources
- Finding valid source names for filtering
- Exploring feed categories
- Browsing favorite blogs

**Parameters:**

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `limit` | `number` | Number of sources to return (max recommended: 100) | `50` |
| `offset` | `number` | Offset for pagination (e.g., 50 for page 2) | `0` |
| `favoritesOnly` | `boolean` | Only show favorite blogs | `false` |
| `category` | `string` | Filter by category (case-insensitive, partial match) | All categories |

**Example (first page):**

```json
{
  "limit": 50,
  "offset": 0
}
```

**Example (favorites only):**

```json
{
  "favoritesOnly": true,
  "limit": 20
}
```

**Example (filter by category):**

```json
{
  "category": "Engineering",
  "limit": 30
}
```

**Response:**

```json
{
  "sources": [
    {
      "id": 1,
      "title": "Engineering at Meta",
      "category": "Engineering Blogs",
      "url": "https://engineering.fb.com/feed/",
      "isFavorite": true
    },
    {
      "id": 2,
      "title": "Netflix Tech Blog",
      "category": "Engineering Blogs",
      "url": "https://netflixtechblog.com/feed",
      "isFavorite": false
    }
  ],
  "total": 518,
  "success": true
}
```

**Pagination Example:**

```
Page 1: offset=0, limit=50   → Sources 1-50 of 518
Page 2: offset=50, limit=50  → Sources 51-100 of 518
Page 3: offset=100, limit=50 → Sources 101-150 of 518
```

---

### 8. set_tag

Update article status to manage your reading workflow.

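On the client side, the four statuses can be modelled as a string union so that an invalid value is rejected before the tool is called. This is a sketch under that assumption; `ARTICLE_STATUSES`, `isArticleStatus`, and `buildSetTagArgs` are hypothetical names, and the server's own validation may differ.

```typescript
// The four statuses the workflow uses.
const ARTICLE_STATUSES = ["unread", "read", "favorite", "archived"] as const;
type ArticleStatus = (typeof ARTICLE_STATUSES)[number];

// Type guard: narrows an arbitrary string to ArticleStatus.
function isArticleStatus(value: string): value is ArticleStatus {
  return ARTICLE_STATUSES.some((status) => status === value);
}

// Build the argument object for a set_tag call, rejecting bad input early.
function buildSetTagArgs(
  articleId: number,
  status: string
): { articleId: number; status: ArticleStatus } {
  if (Math.floor(articleId) !== articleId || articleId < 0) {
    throw new Error("articleId must be a non-negative integer");
  }
  if (!isArticleStatus(status)) {
    throw new Error(`invalid status: ${status}`);
  }
  return { articleId, status };
}

const setTagArgs = buildSetTagArgs(123, "favorite");
console.log(JSON.stringify(setTagArgs)); // {"articleId":123,"status":"favorite"}
```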
**Use this for:**

- Marking articles as read
- Saving favorites
- Archiving old articles
- Managing reading queue

**Parameters:**

| Parameter | Type | Description | Required |
|-----------|------|-------------|----------|
| `articleId` | `number` | Article ID to update | Yes |
| `status` | `string` | New status: `"unread"`, `"read"`, `"favorite"`, `"archived"` | Yes |

**Example:**

```json
{
  "articleId": 123,
  "status": "favorite"
}
```

## Article Status Workflow

The server supports a comprehensive 4-status workflow:

```
┌─────────┐
│ unread  │ ← New articles start here
└────┬────┘
     │
     ├──→ read (marked as read)
     ├──→ favorite (important/bookmarked)
     └──→ archived (old/irrelevant)
```

## Vector Search & Embeddings

### How Embeddings Work

1. **Automatic Generation**: When fetching RSS articles, the server automatically generates embeddings for articles from **2020 onwards**
2. **OpenAI Integration**: Uses `text-embedding-3-small` model (1536 dimensions)
3. **Deduplication**: Embeddings are only generated once per article (checked by URL)
4. **Graceful Degradation**: If `OPENAI_API_KEY` is missing or invalid, the server continues to work normally (embeddings skipped)

### Storage

- Embeddings stored as `vector(1536)` in PostgreSQL using pgvector extension
- Enables fast cosine similarity search: `ORDER BY embedding <=> query_vector`

### Cost Optimization

- Only articles from 2020+ get embeddings (configurable in `RssService.shouldGenerateEmbedding()`)
- Duplicate articles are skipped (no redundant API calls)
- Embedding generation failures don't block article saving

## Development

### Project Structure

```
mcp_rss/
├── src/
│   ├── entities/                # TypeORM entities
│   │   ├── Article.ts           # Article entity with vector embeddings
│   │   └── Feed.ts              # RSS feed source entity
│   ├── services/
│   │   ├── OpmlService.ts       # OPML parsing
│   │   ├── RssService.ts        # RSS fetching + embedding generation
│   │   ├── McpService.ts        # MCP tool implementations
│   │   └── EmbeddingService.ts  # OpenAI embedding wrapper
│   ├── config/
│   │   └── database.ts          # TypeORM + pgvector setup
│   └── index.ts                 # MCP server entry point
├── docker-compose.yml           # PostgreSQL with pgvector
├── .env.example                 # Environment template
└── package.json
```

### Building

```bash
npm run build   # Compile TypeScript
npm run watch   # Watch mode for development
```

### Testing

```bash
# Test database connection
docker-compose ps

# Test MCP server locally
node dist/index.js

# Debug with MCP inspector
npx @modelcontextprotocol/inspector node dist/index.js
```

## Troubleshooting

### Database Connection Issues

**Error: `connect ETIMEDOUT`**

- Ensure PostgreSQL is running: `docker-compose ps`
- Check port 5433 is available: `lsof -i :5433`
- Verify environment variables match docker-compose settings

### OpenAI API Errors

**Error: `401 Incorrect API key`**

- Verify your API key at https://platform.openai.com/api-keys
- Ensure you have available credits
- Check the key isn't expired

**Embeddings not being generated:**

- Server works fine without API key (embeddings skipped)
- Check article dates (only 2020+ articles get embeddings)
- Look for errors in console logs

### MCP Server Issues

**Server not appearing in Claude Desktop:**

1. Check Claude Desktop config path is correct
2. Verify `dist/index.js` exists (run `npm run build`)
3. Restart Claude Desktop after config changes
4. Check Claude Desktop logs for errors

## Performance Tips

1. **Adjust Update Interval**: Set `RSS_UPDATE_INTERVAL` to 60+ minutes for production
2. **Limit Embedding Generation**: Embeddings are only for articles from 2020+
3. **Use Pagination**: Always use `offset` and `limit` for large result sets
4. **Database Indexing**: PostgreSQL automatically indexes the vector column

## License

MIT

## Contributing

Contributions welcome! Please ensure:

- TypeScript compiles without errors (`npm run build`)
- Environment variables are documented
- New features include appropriate error handling
