Glama

scrape_manual

Manually scrape Telegram channels by opening a browser for login, navigating to desired channels, and extracting posts. Save results in MD and JSON formats for easy access and analysis.

Instructions

Manual scraping mode: Opens browser for you to login and navigate to any channel, then scrapes it

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `limit` | No | Maximum number of posts to scrape (optional) | |
| `save_to_file` | No | Save results to MD and JSON files | `true` |
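
The handler normalizes these arguments before scraping. A minimal sketch of that normalization, mirroring the `args.limit || args.max_posts || 0` and `args.save_to_file !== false` checks in the handler below (the `ScrapeManualArgs` interface name is illustrative; the server types `args` as `any`):

```typescript
// Illustrative interface; the server's handler types args as `any`.
interface ScrapeManualArgs {
  limit?: number;
  max_posts?: number;      // accepted as a fallback by the handler, though not in the schema
  save_to_file?: boolean;
}

// Mirrors the handler: `args.limit || args.max_posts || 0` (0 means "no limit")
// and `args.save_to_file !== false` (saving defaults to true).
function normalizeArgs(args: ScrapeManualArgs) {
  return {
    maxPosts: args.limit || args.max_posts || 0,
    saveToFile: args.save_to_file !== false,
  };
}
```

Note that because the handler uses `||` rather than `??`, an explicit `limit: 0` falls through to "no limit", which matches the source's behavior.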

Implementation Reference

  • Main handler function for the 'scrape_manual' MCP tool. Dynamically imports and uses ManualTelegramScraper to open browser, wait for user navigation, scrape channel, save files, and return formatted response.
```typescript
private async handleManualScrape(args: any): Promise<any> {
  try {
    logger.info('Starting manual scrape mode...');

    // Import manual scraper and required modules
    const { ManualTelegramScraper } = await import('./scraper/manual-scraper.js');
    const { join } = await import('path');
    const { writeFile, mkdir } = await import('fs/promises');

    const manualScraper = new ManualTelegramScraper();

    // Open browser and wait for user to navigate
    const { browser, page } = await manualScraper.loginAndWaitForChannel();

    // Scrape the current channel
    const options = { maxPosts: args.limit || args.max_posts || 0 };
    const result = await manualScraper.scrapeCurrentChannel(page, options);

    // Save to file
    if (args.save_to_file !== false) {
      const timestamp = new Date().toISOString().replace(/[:.]/g, '-').slice(0, -5);
      const filename = `${result.channel.username}_${timestamp}_manual.md`;
      const formatter = new MarkdownFormatter();
      const markdown = formatter.format(result);
      const basePath = 'C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data';
      const filepath = join(basePath, filename);
      await mkdir(basePath, { recursive: true });
      await writeFile(filepath, markdown, 'utf8');

      // Also save JSON
      const jsonFilename = `${result.channel.username}_${timestamp}_manual.json`;
      const jsonFilepath = join(basePath, jsonFilename);
      await writeFile(jsonFilepath, JSON.stringify(result, null, 2), 'utf8');

      logger.info(`Saved to: ${filepath}`);
    }

    // Close browser
    await manualScraper.close(browser);

    // Format response
    const summary = result.posts.slice(0, 5).map(post => ({
      date: post.date.toISOString(),
      content: post.content.substring(0, 100) + (post.content.length > 100 ? '...' : ''),
      views: post.views
    }));

    return {
      content: [
        {
          type: 'text',
          text: `✅ Successfully scraped ${result.posts.length} posts from ${result.channel.name}
Channel: @${result.channel.username}
Total posts scraped: ${result.posts.length}
${args.save_to_file !== false ? `Files saved to:
- Markdown: C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data\\${result.channel.username}_*_manual.md
- JSON: C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data\\${result.channel.username}_*_manual.json
` : ''}Sample of first 5 posts:
${summary.map(post => `\n📅 ${post.date}\n${post.content}\n👁 ${post.views} views`).join('\n---\n')}`
        }
      ]
    };
  } catch (error) {
    logger.error('Manual scrape failed:', error);
    return {
      content: [
        {
          type: 'text',
          text: `❌ Manual scrape failed: ${error instanceof Error ? error.message : 'Unknown error'}`
        }
      ]
    };
  }
}
```
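The filename timestamp used by the handler comes from a single string transform. Isolated as a standalone function (the `fileTimestamp` name is mine, not from the source), it works like this:

```typescript
// Same expression as in the handler: ISO-8601 with ':' and '.' replaced by '-',
// then the trailing "-mmmZ" (milliseconds plus zone marker) sliced off.
function fileTimestamp(d: Date): string {
  return d.toISOString().replace(/[:.]/g, '-').slice(0, -5);
}

// fileTimestamp(new Date('2024-01-15T10:30:00Z')) → '2024-01-15T10-30-00'
```

The result is filesystem-safe on Windows, where `:` is not allowed in filenames, which matters given the hard-coded `C:\Users\...` base path above.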
  • src/server.ts:272-291 (registration)
    Registration of the 'scrape_manual' tool in the MCP server's tool list, including name, description, and input schema definition.
```typescript
{
  name: 'scrape_manual',
  description: 'Manual scraping mode: Opens browser for you to login and navigate to any channel, then scrapes it',
  inputSchema: {
    type: 'object',
    properties: {
      limit: {
        type: 'number',
        description: 'Maximum number of posts to scrape (optional)'
      },
      save_to_file: {
        type: 'boolean',
        description: 'Save results to MD and JSON files',
        default: true
      }
    },
    required: []
  }
},
```
  • Helper method that launches a visible browser, navigates to Telegram Web, provides user instructions, and waits for user to navigate to the target channel.
```typescript
async loginAndWaitForChannel(): Promise<{ browser: Browser; page: Page }> {
  logger.info('Opening browser for manual login and navigation...');

  // Launch browser in non-headless mode
  const browser = await this.browserManager.launch(false);
  const page = await browser.newPage();

  // Set viewport
  await page.setViewport({ width: 1280, height: 800 });

  // Load cookies if available
  await this.cookieManager.loadCookies(page);

  // Navigate to Telegram Web
  logger.info('Navigating to Telegram Web...');
  await page.goto('https://web.telegram.org/a/', {
    waitUntil: 'networkidle2',
    timeout: 60000
  });

  // Instructions for user
  logger.info('='.repeat(60));
  logger.info('MANUAL NAVIGATION REQUIRED');
  logger.info('1. Log in to Telegram if needed');
  logger.info('2. Navigate to the channel you want to scrape');
  logger.info('3. Make sure the channel messages are visible');
  logger.info('4. Press Enter here when ready to start scraping');
  logger.info('='.repeat(60));

  // Wait for user to press Enter
  await new Promise<void>(resolve => {
    process.stdin.once('data', () => {
      resolve();
    });
  });

  // Save cookies for future use
  await this.cookieManager.saveCookies(page);

  return { browser, page };
}
```
  • Core scraping handler in ManualTelegramScraper that extracts channel information from the current page and collects posts by scrolling and parsing.
```typescript
async scrapeCurrentChannel(page: Page, options: Partial<ScrapeOptions> = {}): Promise<ScrapeResult> {
  logger.info('Starting to scrape current channel...');
  try {
    // Get current URL to extract channel info
    const currentUrl = page.url();
    logger.info(`Current URL: ${currentUrl}`);

    // Get channel info from the page
    const channelHtml = await page.content();
    const parser = new DataParser(channelHtml);
    let channel = parser.parseChannelInfo();

    // Try to extract channel name from URL or page title
    if (channel.name === 'Unknown Channel') {
      const pageTitle = await page.title();
      if (pageTitle && pageTitle !== 'Telegram') {
        channel.name = pageTitle;
      }
    }

    // Extract username from URL if possible
    const urlMatch = currentUrl.match(/#@?([^/?]+)$/);
    if (urlMatch && urlMatch[1] && channel.username === 'unknown') {
      channel.username = urlMatch[1].replace('-', '');
    }

    logger.info(`Scraping channel: ${channel.name} (@${channel.username})`);

    // Scroll and collect posts
    const posts = await this.scrollAndCollectPosts(page, options);

    logger.info(`Scraping complete. Total posts: ${posts.length}`);

    return {
      channel,
      posts,
      scrapedAt: new Date(),
      totalPosts: posts.length
    };
  } catch (error) {
    logger.error('Scraping failed:', error);
    return {
      channel: { name: 'Unknown', username: 'unknown', description: '' },
      posts: [],
      scrapedAt: new Date(),
      totalPosts: 0,
      error: error instanceof Error ? error.message : 'Unknown error'
    };
  }
}
```
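The username extraction in `scrapeCurrentChannel` hinges on one regex against the end of the Telegram Web URL. Pulled out as a standalone helper (the `usernameFromUrl` name is mine, not from the source), it behaves like this:

```typescript
// Same regex as in scrapeCurrentChannel: after a '#', an optional '@', then the
// final run of characters containing no '/' or '?'. Note that .replace('-', '')
// strips only the first hyphen, matching the source exactly.
function usernameFromUrl(url: string): string | null {
  const m = url.match(/#@?([^/?]+)$/);
  return m && m[1] ? m[1].replace('-', '') : null;
}

// usernameFromUrl('https://web.telegram.org/a/#@durov') → 'durov'
```

If the page URL has no matching fragment (e.g. the chat list at `https://web.telegram.org/a/`), the helper returns `null`, which is why the source additionally guards on `channel.username === 'unknown'` before overwriting.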

MCP directory API

We provide all the information about MCP servers via our MCP API.

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/DLHellMe/telegram-mcp-server'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.