Telegram MCP Server

by DLHellMe
PROJECT_PLAN.md (5.36 kB)
# Telegram MCP Server Project Plan

## Project Overview

Build an MCP (Model Context Protocol) server that scrapes Telegram public channels and groups using Puppeteer/Chromium and provides the data to Claude in markdown format for analysis.

## Key Requirements

- No Telegram API usage (web scraping only)
- Use Chromium browser automation
- Scroll to parse historical messages
- Extract comprehensive data (posts, reactions, views, metadata)
- Return data in markdown format
- Follow MCP protocol standards

## Proposed Task List

### Phase 1: Project Setup and Architecture

1. **Initialize TypeScript project**
    - Set up package.json with dependencies
    - Configure TypeScript (tsconfig.json)
    - Set up build scripts
    - Configure linting and formatting

2. **Install core dependencies**
    - @modelcontextprotocol/sdk
    - puppeteer (includes Chromium)
    - cheerio (for HTML parsing)
    - date-fns (for date handling)
    - dotenv (for configuration)

3. **Create project structure**

    ```
    tgmcp/
    ├── src/
    │   ├── index.ts (MCP server entry)
    │   ├── server.ts (MCP server implementation)
    │   ├── scraper/
    │   │   ├── telegram-scraper.ts
    │   │   ├── browser-manager.ts
    │   │   └── data-parser.ts
    │   ├── formatters/
    │   │   └── markdown-formatter.ts
    │   ├── types/
    │   │   └── telegram.types.ts
    │   └── utils/
    │       ├── logger.ts
    │       └── config.ts
    ├── dist/
    ├── tests/
    ├── .env.example
    ├── README.md
    └── package.json
    ```

### Phase 2: MCP Server Implementation

4. **Implement base MCP server**
    - Create server class extending MCP SDK
    - Set up JSON-RPC message handling
    - Implement initialization handshake
    - Configure stdio transport

5. **Define MCP tools**
    - `scrape_channel`: Scrape a Telegram channel
    - `scrape_group`: Scrape a public Telegram group
    - `get_channel_info`: Get channel metadata only
    - `scrape_date_range`: Scrape posts within date range

6. **Implement error handling**
    - MCP protocol error responses
    - Scraping failure handling
    - Rate limiting errors
    - Network timeout handling

### Phase 3: Telegram Scraper Implementation

7. **Browser automation setup**
    - Puppeteer configuration
    - Headless/headful mode options
    - User agent and viewport settings
    - Cookie and session management

8. **Navigation logic**
    - URL validation for Telegram links
    - Page load waiting strategies
    - Dynamic content detection
    - Error page handling

9. **Scrolling mechanism**
    - Implement infinite scroll detection
    - Date-based stopping condition
    - Memory-efficient batch processing
    - Progress tracking

10. **Data extraction**
    - Channel/group metadata (name, description, member count)
    - Post content (text, media indicators)
    - Post metadata (date, views, forwards)
    - Reactions (emoji types and counts)
    - Reply/comment counts
    - Media presence indicators

### Phase 4: Data Processing and Formatting

11. **Data parsing**
    - HTML element selectors
    - Text content extraction
    - Date parsing and normalization
    - Number formatting (views, reactions)

12. **Markdown formatter**
    - Channel/group header section
    - Posts organized by date
    - Reaction summaries
    - Media indicators
    - Structured metadata tables

13. **Data models**
    - TypeScript interfaces for all data types
    - Validation schemas
    - Error types

### Phase 5: Testing and Optimization

14. **Unit tests**
    - Parser functions
    - Formatter logic
    - Data validation

15. **Integration tests**
    - MCP server communication
    - Scraper functionality
    - Error scenarios

16. **Performance optimization**
    - Memory usage monitoring
    - Batch processing for large channels
    - Caching strategies
    - Resource cleanup

### Phase 6: Documentation and Deployment

17. **Documentation**
    - README with installation guide
    - API documentation
    - Usage examples
    - Troubleshooting guide

18. **Deployment preparation**
    - Build scripts
    - Distribution packaging
    - Claude Desktop configuration example
    - Security guidelines

## Technical Decisions

### Architecture

- **Language**: TypeScript for type safety and MCP SDK compatibility
- **Browser Automation**: Puppeteer (includes Chromium, well-documented)
- **Transport**: stdio (standard for local MCP servers)
- **Data Format**: Markdown for Claude compatibility

### Security Considerations

- No credential storage
- Read-only operations
- Rate limiting implementation
- Input validation for URLs
- Sandboxed browser execution

### Limitations

- Public channels/groups only
- No member list access
- Subject to Telegram's anti-bot measures
- Performance depends on channel size

## Success Criteria

- Successfully connects to Claude Desktop
- Can scrape public Telegram channels/groups
- Returns well-formatted markdown
- Handles errors gracefully
- Performs within reasonable time limits
- Follows MCP protocol standards

## Estimated Timeline

- Phase 1-2: 2-3 days (Setup and MCP implementation)
- Phase 3-4: 3-4 days (Scraper and formatting)
- Phase 5-6: 2-3 days (Testing and documentation)
- Total: ~8-10 days for production-ready implementation
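
## Example Sketch: URL Validation, Count Parsing, and Markdown Output

The plan names URL validation for Telegram links (task 8), number formatting for views and reactions (task 11), and a markdown formatter (task 12) without showing what they might look like. The following is a minimal sketch of those three pieces as pure functions, not the project's actual code. The `Post` interface, the function names `parseChannelName`, `parseCount`, and `formatPost`, the URL pattern, and the username length rule are all assumptions made for illustration.

```typescript
// Hypothetical shape for one scraped post; field names are illustrative only.
interface Post {
  date: string; // ISO 8601 timestamp
  text: string;
  views: number | null;
}

// Assumed pattern for public links: t.me/<name> or the t.me/s/<name> web preview.
// The username rule (letter first, 4-32 characters) is an approximation.
const TELEGRAM_URL =
  /^https?:\/\/t\.me\/(?:s\/)?([A-Za-z][A-Za-z0-9_]{3,31})\/?$/;

// Returns the channel or group username, or null if the link is not recognized.
function parseChannelName(url: string): string | null {
  const match = TELEGRAM_URL.exec(url.trim());
  return match ? match[1] : null;
}

// Converts abbreviated counters such as "1.2K" or "3M" into plain numbers.
// Returns null for text that does not look like a counter.
function parseCount(raw: string): number | null {
  const match = /^(\d+(?:\.\d+)?)\s*([KkMm])?$/.exec(raw.trim());
  if (!match) return null;
  const suffix = (match[2] ?? "").toUpperCase();
  const scale = suffix === "K" ? 1000 : suffix === "M" ? 1000000 : 1;
  return Math.round(parseFloat(match[1]) * scale);
}

// Renders one post as a small markdown section with a views line.
function formatPost(post: Post): string {
  const views = post.views === null ? "n/a" : String(post.views);
  return `### ${post.date}\n\n${post.text}\n\n- Views: ${views}`;
}
```

Keeping these steps free of browser or network calls means the unit tests in Phase 5 could cover them without launching Chromium. For example, `parseChannelName("https://example.com/durov")` returns `null`, so an MCP tool could reject the input before any navigation happens.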
