# News Aggregator Module

A comprehensive news aggregation module for the MCP server that provides real-time news by parsing RSS feeds and scraping selected websites. Built with Node.js, featuring efficient caching, keyword filtering, and structured JSON output.

## Features

- **RSS Feed Parsing**: Direct parsing of RSS feeds from reputable sources
- **Website Scraping**: HTML scraping using Cheerio for sites without RSS
- **Multi-Source Aggregation**: Combine news from multiple sources
- **Keyword Filtering**: Filter articles by keywords across all sources
- **Caching**: 5-minute cache to improve performance and reduce load
- **Category Support**: Filter by news categories (Technology, World, Business, etc.)
- **Search Functionality**: Search across multiple sources with keyword matching
- **Rate Limiting**: Built-in rate limiting for API calls
- **Error Handling**: Graceful error handling with detailed logging

## Supported Sources

### RSS Sources

- **Google News**: Technology, World, Business categories
- **Hacker News**: Technology focus
- **BBC News**: World, Technology, Business, Science categories
- **NPR**: General news

### Scraped Sources

- **TechCrunch**: Technology news
- **CNN**: World news

## Installation

1. Install dependencies:

   ```bash
   cd mcp_modules/news-aggregator
   npm install
   ```

2. The module will be automatically loaded by the MCP server.

## API Endpoints

### Basic Module Info

```
GET /news-aggregator
```

Returns module information and status.

### RSS News

```
GET /news-aggregator/rss?source=google&category=technology
GET /news-aggregator/rss/hackernews
```

Fetch news from RSS sources.

**Parameters:**

- `source`: RSS source (google, hackernews, bbc, npr)
- `category`: News category (technology, world, business, science, general)

### Scraped News

```
GET /news-aggregator/scrape?source=techcrunch
GET /news-aggregator/scrape/cnn
```

Fetch news from scraped websites.

**Parameters:**

- `source`: Scrape source (techcrunch, cnn)

### Aggregated News

```
GET /news-aggregator/aggregate?sources=google,hackernews,bbc&category=technology&keywords=AI,machine learning
```

Get news from multiple sources with optional filtering.

**Parameters:**

- `sources`: Comma-separated list of sources
- `category`: News category filter
- `keywords`: Comma-separated keywords for filtering

### Search News

```
GET /news-aggregator/search?keywords=artificial intelligence&sources=google,hackernews&limit=20
```

Search for news articles by keywords.

**Parameters:**

- `keywords`: Comma-separated search keywords (required)
- `sources`: Comma-separated list of sources
- `category`: News category filter
- `limit`: Maximum number of results

### Available Sources

```
GET /news-aggregator/sources
```

Get list of all available news sources and their capabilities.

### Health Check

```
GET /news-aggregator/health
```

Check if the news aggregator service is operational.

### Cache Management

```
DELETE /news-aggregator/cache
```

Clear the news cache.

## MCP Tool Usage

The module provides an MCP tool that can be used programmatically:

```javascript
// Tool info
GET /tools/news-aggregator/info

// Tool execution
POST /tools/news-aggregator
{
  "action": "aggregate",
  "sources": "google,hackernews,bbc",
  "category": "technology",
  "keywords": "AI,machine learning",
  "limit": 10
}
```

### Tool Actions

1. **rss**: Fetch from RSS source
   - Required: `source`
   - Optional: `category`
2. **scrape**: Fetch from scraped source
   - Required: `source`
3. **aggregate**: Get news from multiple sources
   - Optional: `sources`, `category`, `keywords`
4. **search**: Search by keywords
   - Required: `keywords`
   - Optional: `sources`, `category`, `limit`
5. **sources**: Get available sources

## Response Format

All endpoints return JSON in the following format:

### Single Source Response

```json
{
  "source": "Google News",
  "category": "technology",
  "articles": [
    {
      "title": "Article Title",
      "description": "Article description or excerpt",
      "link": "https://example.com/article",
      "publishedAt": "2024-01-01T12:00:00.000Z",
      "source": "Google News"
    }
  ],
  "fetchedAt": "2024-01-01T12:00:00.000Z"
}
```

### Aggregated Response

```json
{
  "sources": [
    {
      "source": "Google News",
      "category": "technology",
      "articles": [...],
      "fetchedAt": "2024-01-01T12:00:00.000Z"
    }
  ],
  "errors": [],
  "totalArticles": 25,
  "fetchedAt": "2024-01-01T12:00:00.000Z"
}
```

### Search Response

```json
{
  "keywords": ["AI", "machine learning"],
  "sources": ["google", "hackernews"],
  "articles": [...],
  "totalResults": 15,
  "searchedAt": "2024-01-01T12:00:00.000Z"
}
```

## Usage Examples

### Get Technology News from Multiple Sources

```bash
curl "http://localhost:3000/news-aggregator/aggregate?sources=google,hackernews,techcrunch&category=technology"
```

### Search for AI-related Articles

Spaces inside keywords must be URL-encoded as `%20` when passed to curl:

```bash
curl "http://localhost:3000/news-aggregator/search?keywords=artificial%20intelligence,machine%20learning&limit=10"
```

### Get BBC World News

```bash
curl "http://localhost:3000/news-aggregator/rss/bbc?category=world"
```

### Using the MCP Tool

```bash
curl -X POST "http://localhost:3000/tools/news-aggregator" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "search",
    "keywords": "climate change,environment",
    "sources": "bbc,npr",
    "limit": 5
  }'
```

## Configuration

The module uses the following default settings:

- **Cache Timeout**: 5 minutes
- **Rate Limiting**: 10 requests per minute per source
- **Default Sources**: google, hackernews, bbc
- **User Agent**: Mozilla/5.0 (compatible; NewsAggregator/1.0)

## Error Handling

The module includes comprehensive error handling:

- Network failures are logged and return appropriate HTTP status codes
- Invalid sources return 400 Bad Request
- Parsing errors are caught and logged
- Rate limiting prevents overwhelming news sources
- Graceful fallbacks for missing or malformed data

## Caching Strategy

- **Cache Duration**: 5 minutes per source
- **Cache Key**: Based on source, category, and keywords
- **Memory Storage**: In-memory cache for fast access
- **Automatic Cleanup**: Expired entries are automatically removed

## Development

### Running Tests

```bash
npm test
```

### Adding New Sources

To add a new RSS source:

1. Add the source configuration to `src/service.js`
2. Update the `getAggregatedNews` method
3. Add the source to the available sources list

To add a new scraped source:

1. Define CSS selectors for the site structure
2. Add a new method in `src/service.js`
3. Update the controller and main index file

## Dependencies

- **cheerio**: HTML parsing and manipulation
- **rss-parser**: RSS feed parsing
- **node-fetch**: HTTP requests
- **mocha**: Testing framework
- **chai**: Assertion library
- **sinon**: Test spies and mocks

## License

ISC License

## Contributing

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request

## Support

For issues and questions:

1. Check the logs for error details
2. Verify network connectivity to news sources
3. Test with the health endpoint
4. Clear cache if experiencing stale data
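
As an illustration of the Caching Strategy section above (5-minute duration, keys based on source, category, and keywords, in-memory storage, removal of expired entries), here is a minimal sketch of how such a cache could look. It is not taken from the module's source: the names `NewsCache`, `buildCacheKey`, and `CACHE_TTL_MS`, the `|` key separator, the keyword normalization, and the injectable clock are all assumptions made for this example.

```javascript
// Hypothetical sketch only; none of these names come from src/service.js.

const CACHE_TTL_MS = 5 * 60 * 1000; // 5-minute cache duration, as stated in the README

// Build a cache key from source, category, and keywords.
// Assumption: keywords are trimmed, lower-cased, and sorted so that
// "AI,climate" and "climate, ai" map to the same entry.
function buildCacheKey(source, category = '', keywords = []) {
  const normalized = keywords
    .map((k) => k.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  return [source, category, normalized.join(',')].join('|');
}

class NewsCache {
  constructor(ttlMs = CACHE_TTL_MS, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry easy to test
    this.entries = new Map(); // in-memory storage
  }

  // Return the cached value, or undefined if it is missing or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(key); // drop the expired entry on access
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, storedAt: this.now() });
  }

  // Remove every expired entry; this could be run on a timer.
  cleanup() {
    for (const [key, entry] of this.entries) {
      if (this.now() - entry.storedAt >= this.ttlMs) this.entries.delete(key);
    }
  }

  // Empty the cache, comparable in spirit to DELETE /news-aggregator/cache.
  clear() {
    this.entries.clear();
  }
}
```

A caller would check `cache.get(buildCacheKey('bbc', 'world'))` before fetching a feed and call `cache.set(...)` with the parsed result afterwards. Normalizing the keywords is one possible design choice for avoiding duplicate entries for equivalent queries; the actual module may build its keys differently.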