
scrape_manual

Manually scrape Telegram channel content by opening a browser for login and navigation, then extracting posts with optional file saving.

Instructions

Manual scraping mode: Opens browser for you to login and navigate to any channel, then scrapes it

Input Schema

Name          Required  Description                                   Default
limit         No        Maximum number of posts to scrape (optional)
save_to_file  No        Save results to MD and JSON files             true
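As a sketch of how these parameters travel over the wire, the following shows the shape of the arguments an MCP client might send in a `tools/call` request for this tool (the interface and variable names are illustrative, not part of the server's source):

```typescript
// Hypothetical argument shape for a tools/call request to 'scrape_manual'.
// Both fields are optional; save_to_file defaults to true per the schema.
interface ScrapeManualArgs {
  limit?: number;         // maximum number of posts to scrape
  save_to_file?: boolean; // write MD + JSON files (default true)
}

const args: ScrapeManualArgs = {
  limit: 100,
  save_to_file: true,
};

// Shape of the JSON-RPC params an MCP client would send.
const params = {
  name: "scrape_manual",
  arguments: args,
};

console.log(JSON.stringify(params));
```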

Implementation Reference

  • src/server.ts:272-290 (registration)
    Registration of the 'scrape_manual' tool in getTools(), including name, description, and input schema definition.
    {
      name: 'scrape_manual',
      description: 'Manual scraping mode: Opens browser for you to login and navigate to any channel, then scrapes it',
      inputSchema: {
        type: 'object',
        properties: {
          limit: {
            type: 'number',
            description: 'Maximum number of posts to scrape (optional)'
          },
          save_to_file: {
            type: 'boolean',
            description: 'Save results to MD and JSON files',
            default: true
          }
        },
        required: []
      }
    },
  • The primary handler function for the 'scrape_manual' tool. It dynamically imports ManualTelegramScraper, launches a browser for manual user navigation and login, performs the scrape, optionally saves results to files, and returns a formatted response with a sample.
      private async handleManualScrape(args: any): Promise<any> {
        try {
          logger.info('Starting manual scrape mode...');
          
          // Import manual scraper and required modules
          const { ManualTelegramScraper } = await import('./scraper/manual-scraper.js');
          const { join } = await import('path');
          const { writeFile, mkdir } = await import('fs/promises');
          
          const manualScraper = new ManualTelegramScraper();
          
          // Open browser and wait for user to navigate
          const { browser, page } = await manualScraper.loginAndWaitForChannel();
          
          // Scrape the current channel
          const options = {
            maxPosts: args.limit || args.max_posts || 0
          };
          
          const result = await manualScraper.scrapeCurrentChannel(page, options);
          
          // Save to file
          if (args.save_to_file !== false) {
            const timestamp = new Date().toISOString().replace(/[:.]/g, '-').slice(0, -5);
            const filename = `${result.channel.username}_${timestamp}_manual.md`;
            
            const formatter = new MarkdownFormatter();
            const markdown = formatter.format(result);
            
            const basePath = 'C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data';
            const filepath = join(basePath, filename);
            
            await mkdir(basePath, { recursive: true });
            await writeFile(filepath, markdown, 'utf8');
            
            // Also save JSON
            const jsonFilename = `${result.channel.username}_${timestamp}_manual.json`;
            const jsonFilepath = join(basePath, jsonFilename);
            await writeFile(jsonFilepath, JSON.stringify(result, null, 2), 'utf8');
            
            logger.info(`Saved to: ${filepath}`);
          }
          
          // Close browser
          await manualScraper.close(browser);
          
          // Format response
          const summary = result.posts.slice(0, 5).map(post => ({
            date: post.date.toISOString(),
            content: post.content.substring(0, 100) + (post.content.length > 100 ? '...' : ''),
            views: post.views
          }));
          
          return {
            content: [
              {
                type: 'text',
            text: `✅ Successfully scraped ${result.posts.length} posts from ${result.channel.name}
    
    Channel: @${result.channel.username}
    Total posts scraped: ${result.posts.length}
    
    ${args.save_to_file !== false ? `Files saved to:
    - Markdown: C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data\\${result.channel.username}_*_manual.md
    - JSON: C:\\Users\\User\\AppData\\Roaming\\Claude\\telegram_scraped_data\\${result.channel.username}_*_manual.json
    
    ` : ''}Sample of first 5 posts:
    ${summary.map(post => `\n📅 ${post.date}\n${post.content}\n👁 ${post.views} views`).join('\n---\n')}`
              }
            ]
          };
          
        } catch (error) {
          logger.error('Manual scrape failed:', error);
          return {
            content: [
              {
                type: 'text',
                text: `āŒ Manual scrape failed: ${error instanceof Error ? error.message : 'Unknown error'}`
              }
            ]
          };
        }
      }
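The handler's file-naming logic can be exercised in isolation. This sketch (the channel username is illustrative) reproduces the timestamp transform: the ISO string has `:` and `.` replaced with `-`, and the trailing milliseconds-plus-`Z` trimmed off:

```typescript
// Reproduces the handler's filename construction for the saved Markdown file,
// e.g. 2024-01-15T10:30:45.123Z -> 2024-01-15T10-30-45.
function manualScrapeFilename(username: string, now: Date): string {
  const timestamp = now.toISOString().replace(/[:.]/g, '-').slice(0, -5);
  return `${username}_${timestamp}_manual.md`;
}

const name = manualScrapeFilename(
  'examplechannel',
  new Date('2024-01-15T10:30:45.123Z')
);
console.log(name); // examplechannel_2024-01-15T10-30-45_manual.md
```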
  • Helper class ManualTelegramScraper providing the core implementation for manual scraping: browser launch, user-guided navigation, channel scraping via scrolling and parsing, post collection, and cleanup.
    export class ManualTelegramScraper {
      private browserManager: BrowserManager;
      private cookieManager: CookieManager;
    
      constructor() {
        this.browserManager = new BrowserManager();
        this.cookieManager = new CookieManager();
      }
    
      async loginAndWaitForChannel(): Promise<{ browser: Browser; page: Page }> {
        logger.info('Opening browser for manual login and navigation...');
        
        // Launch browser in non-headless mode
        const browser = await this.browserManager.launch(false);
        const page = await browser.newPage();
        
        // Set viewport
        await page.setViewport({ width: 1280, height: 800 });
        
        // Load cookies if available
        await this.cookieManager.loadCookies(page);
        
        // Navigate to Telegram Web
        logger.info('Navigating to Telegram Web...');
        await page.goto('https://web.telegram.org/a/', {
          waitUntil: 'networkidle2',
          timeout: 60000
        });
        
        // Instructions for user
        logger.info('='.repeat(60));
        logger.info('MANUAL NAVIGATION REQUIRED');
        logger.info('1. Log in to Telegram if needed');
        logger.info('2. Navigate to the channel you want to scrape');
        logger.info('3. Make sure the channel messages are visible');
        logger.info('4. Press Enter here when ready to start scraping');
        logger.info('='.repeat(60));
        
        // Wait for user to press Enter
        await new Promise<void>(resolve => {
          process.stdin.once('data', () => {
            resolve();
          });
        });
        
        // Save cookies for future use
        await this.cookieManager.saveCookies(page);
        
        return { browser, page };
      }
    
      async scrapeCurrentChannel(page: Page, options: Partial<ScrapeOptions> = {}): Promise<ScrapeResult> {
        logger.info('Starting to scrape current channel...');
        
        try {
          // Get current URL to extract channel info
          const currentUrl = page.url();
          logger.info(`Current URL: ${currentUrl}`);
          
          // Get channel info from the page
          const channelHtml = await page.content();
          const parser = new DataParser(channelHtml);
          let channel = parser.parseChannelInfo();
          
          // Try to extract channel name from URL or page title
          if (channel.name === 'Unknown Channel') {
            const pageTitle = await page.title();
            if (pageTitle && pageTitle !== 'Telegram') {
              channel.name = pageTitle;
            }
          }
          
          // Extract username from URL if possible
          const urlMatch = currentUrl.match(/#@?([^/?]+)$/);
          if (urlMatch && urlMatch[1] && channel.username === 'unknown') {
            channel.username = urlMatch[1].replace('-', '');
          }
          
          logger.info(`Scraping channel: ${channel.name} (@${channel.username})`);
          
          // Scroll and collect posts
          const posts = await this.scrollAndCollectPosts(page, options);
          
          logger.info(`Scraping complete. Total posts: ${posts.length}`);
          
          return {
            channel,
            posts,
            scrapedAt: new Date(),
            totalPosts: posts.length
          };
          
        } catch (error) {
          logger.error('Scraping failed:', error);
          return {
            channel: {
              name: 'Unknown',
              username: 'unknown',
              description: ''
            },
            posts: [],
            scrapedAt: new Date(),
            totalPosts: 0,
            error: error instanceof Error ? error.message : 'Unknown error'
          };
        }
      }
    
      private async scrollAndCollectPosts(page: Page, options: Partial<ScrapeOptions>): Promise<any[]> {
        logger.info('Starting to scroll and collect posts');
        
        const posts: Map<string, any> = new Map();
        let scrollAttempts = 0;
        let lastPostCount = 0;
        let noNewPostsCount = 0;
        const maxScrollAttempts = options.maxPosts ? Math.min(50, Math.ceil(options.maxPosts / 20)) : 50;
        
        while (scrollAttempts < maxScrollAttempts) {
          // Parse current posts
          const html = await page.content();
          const parser = new DataParser(html);
          const currentPosts = parser.parsePosts();
          
          // Add new posts to map (deduplication)
          for (const post of currentPosts) {
            if (!posts.has(post.id)) {
              posts.set(post.id, post);
              
              // Log progress
              if (posts.size % 20 === 0) {
                logger.info(`Collected ${posts.size} posts so far...`);
              }
            }
          }
          
          // Check if we've reached max posts
          if (options.maxPosts && posts.size >= options.maxPosts) {
            logger.info(`Reached maxPosts limit: ${options.maxPosts}`);
            break;
          }
          
          // Check if we're getting new posts
          if (posts.size === lastPostCount) {
            noNewPostsCount++;
            if (noNewPostsCount >= 3) {
              logger.info('No new posts found after 3 attempts, stopping');
              break;
            }
          } else {
            noNewPostsCount = 0;
            lastPostCount = posts.size;
          }
          
          // Scroll to load more messages
          await this.scrollUp(page);
          
          // Wait for new content
          await new Promise(resolve => setTimeout(resolve, 2000));
          
          scrollAttempts++;
        }
        
        logger.info(`Scrolling complete. Total posts collected: ${posts.size}`);
        
        // Sort posts by date (newest first) and limit if needed
        let sortedPosts = Array.from(posts.values()).sort((a, b) => b.date.getTime() - a.date.getTime());
        
        if (options.maxPosts && sortedPosts.length > options.maxPosts) {
          sortedPosts = sortedPosts.slice(0, options.maxPosts);
        }
        
        return sortedPosts;
      }
    
      private async scrollUp(page: Page): Promise<void> {
        // Scroll within the messages container to load older messages
        await page.evaluate(() => {
          const container = document.querySelector('.bubbles-inner, .messages-container, .bubbles, .im_history_scrollable');
          if (container) {
            // Scroll to top of the container to load older messages
            container.scrollTop = 0;
          } else {
            // Try to find any scrollable container
            const scrollables = document.querySelectorAll('[class*="scroll"], [class*="messages"], [class*="chat"]');
            for (let i = 0; i < scrollables.length; i++) {
              const el = scrollables[i] as HTMLElement;
              if (el.scrollHeight > el.clientHeight) {
                el.scrollTop = 0;
                break;
              }
            }
          }
        });
      }
    
      async close(browser: Browser): Promise<void> {
        await browser.close();
      }
    }
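The username extraction in `scrapeCurrentChannel` hinges on a regex over the page URL fragment. A quick sketch (the URLs are illustrative) shows what that pattern captures:

```typescript
// Same pattern as scrapeCurrentChannel: capture everything after the '#'
// (and an optional '@') up to the end of the URL.
function usernameFromUrl(url: string): string | null {
  const m = url.match(/#@?([^/?]+)$/);
  return m ? m[1] : null;
}

console.log(usernameFromUrl('https://web.telegram.org/a/#@examplechannel')); // examplechannel
console.log(usernameFromUrl('https://web.telegram.org/a/#-1001234567890'));  // -1001234567890
console.log(usernameFromUrl('https://web.telegram.org/a/'));                 // null
```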
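The scroll loop deduplicates posts by id using a Map, then sorts newest-first and applies the optional cap. A minimal sketch of that accumulation step (the post shape is simplified for illustration):

```typescript
interface PostLike { id: string; date: Date; }

// Mirrors scrollAndCollectPosts: keep the first occurrence of each post id
// across scroll batches, sort newest-first, then cap at maxPosts if set.
function collect(batches: PostLike[][], maxPosts?: number): PostLike[] {
  const posts = new Map<string, PostLike>();
  for (const batch of batches) {
    for (const post of batch) {
      if (!posts.has(post.id)) posts.set(post.id, post);
    }
  }
  let sorted = [...posts.values()].sort(
    (a, b) => b.date.getTime() - a.date.getTime()
  );
  if (maxPosts && sorted.length > maxPosts) sorted = sorted.slice(0, maxPosts);
  return sorted;
}

const older = { id: '1', date: new Date('2024-01-01') };
const newer = { id: '2', date: new Date('2024-01-02') };
// Second batch re-sees post '1'; the duplicate is ignored, then cap = 1.
const result = collect([[older, newer], [older]], 1);
console.log(result.map(p => p.id)); // ['2']
```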
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions 'Opens browser' and 'login,' hinting at interactive behavior, but it doesn't disclose critical traits: whether the tool is read-only or destructive, that it requires user input at runtime, or how it handles rate limits and errors. For a tool with manual interaction and scraping, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence: 'Manual scraping mode: Opens browser for you to login and navigate to any channel, then scrapes it.' It's front-loaded with the key concept ('Manual scraping mode') and wastes no words, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a manual scraping tool with browser interaction and no annotations or output schema, the description is incomplete. It doesn't explain what 'scrapes' entails (e.g., data types returned, success/failure states), prerequisites, or behavioral details. This leaves the agent with insufficient information for reliable tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters ('limit' and 'save_to_file') with clear descriptions. The tool description adds no parameter-specific information beyond what's in the schema, such as default values or usage context. Baseline 3 is appropriate as the schema handles the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Opens browser for you to login and navigate to any channel, then scrapes it.' It specifies the verb ('scrapes') and resource ('any channel'), and distinguishes itself from siblings like 'api_scrape_channel' by emphasizing manual browser interaction. However, it doesn't explicitly differentiate from 'scrape_channel_authenticated' or 'scrape_channel_full' in terms of scope or depth.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'Opens browser for you to login,' suggesting it's for manual authentication scenarios. It doesn't provide explicit when-to-use vs. when-not-to-use guidance or name alternatives like 'api_scrape_channel' for automated cases. The context is clear but lacks detailed exclusions or comparative advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DLHellMe/telegram-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server