Skip to main content
Glama
DLHellMe
by DLHellMe

scrape_channel

Extract Telegram channel posts in markdown format for content analysis or archiving. Specify URL and post limit to gather channel data with optional reaction information.

Instructions

Scrape a Telegram channel and return posts in markdown format. Uses authenticated session if logged in.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesThe Telegram channel URL (e.g., https://t.me/channelname)
max_postsNoMaximum number of posts to scrape (default: 100)
include_reactionsNoInclude reaction data in the output

Implementation Reference

  • The primary handler function for the 'scrape_channel' tool. It checks authentication status, selects the appropriate scraper (authenticated or unauthenticated), prepares scrape options from tool arguments, invokes the scraper's scrape method, formats the result as markdown, and returns the MCP-standard content response with an authentication indicator if applicable.
    private async handleScrapeChannel(args: any): Promise<any> {
      // Check if authenticated and use authenticated scraper by default
      const isAuthenticated = await this.auth.isAuthenticated();
      const scraperToUse = isAuthenticated ? this.authScraper : this.scraper;
      
      if (isAuthenticated) {
        logger.info('Using authenticated scraper (logged in)');
      } else {
        logger.info('Using unauthenticated scraper (not logged in)');
      }
    
      const options: ScrapeOptions = {
        url: args.url,
        maxPosts: args.max_posts === undefined ? 0 : args.max_posts, // 0 means no limit
        includeReactions: args.include_reactions !== false
      };
    
      const result = await scraperToUse.scrape(options);
      const markdown = this.formatter.format(result);
    
      return {
        content: [
          {
            type: 'text',
            text: isAuthenticated 
              ? `${markdown}\n\n✅ *Scraped using authenticated session*`
              : markdown
          }
        ]
      };
    }
  • src/server.ts:124-146 (registration)
    Registers the 'scrape_channel' tool with the MCP server via the getTools() method, providing name, description, and input schema for validation.
      name: 'scrape_channel',
      description: 'Scrape a Telegram channel and return posts in markdown format. Uses authenticated session if logged in.',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The Telegram channel URL (e.g., https://t.me/channelname)'
          },
          max_posts: {
            type: 'number',
            description: 'Maximum number of posts to scrape (default: 100)',
            default: 100
          },
          include_reactions: {
            type: 'boolean',
            description: 'Include reaction data in the output',
            default: true
          }
        },
        required: ['url']
      }
    },
  • Input schema definition for the 'scrape_channel' tool, specifying parameters like url (required), max_posts, and include_reactions with types, descriptions, and defaults.
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'The Telegram channel URL (e.g., https://t.me/channelname)'
        },
        max_posts: {
          type: 'number',
          description: 'Maximum number of posts to scrape (default: 100)',
          default: 100
        },
        include_reactions: {
          type: 'boolean',
          description: 'Include reaction data in the output',
          default: true
        }
      },
      required: ['url']
  • Core helper function implementing the scraping logic called by the handler. Handles browser page creation, URL validation/navigation (auth/unauth modes), channel info parsing, infinite scrolling to collect posts with deduplication and limits, error handling with screenshots, file saving, and returns structured ScrapeResult.
    async scrape(options: ScrapeOptions): Promise<ScrapeResult> {
      logger.info(`Starting scrape for: ${options.url}`);
      
      let page: Page | null = null;
      
      try {
        // Validate URL
        if (!this.isValidTelegramUrl(options.url)) {
          throw new Error('Invalid Telegram URL. Must be a t.me link.');
        }
    
        // Create page
        page = await this.browserManager.createPage();
        
        // Navigate to channel/group
        await this.navigateToChannel(page, options.url);
        
        // Get channel info BEFORE scrolling
        const channelHtml = await page.content();
        const parser = new DataParser(channelHtml);
        let channel = parser.parseChannelInfo();
        
        // Try to get channel name and username from URL if parsing failed
        const urlMatch = options.url.match(/t\.me\/s?\/([^/?]+)/);
        if (urlMatch && urlMatch[1]) {
          if (channel.username === 'unknown') {
            channel.username = urlMatch[1];
          }
          if (channel.name === 'Unknown Channel') {
            channel.name = urlMatch[1];
          }
        }
        
        // Scroll and collect posts
        const posts = await this.scrollAndCollectPosts(page, options);
        
        // Get total post count from collected posts
        const totalPosts = posts.length;
        
        logger.info(`Scraping complete. Total posts: ${totalPosts}`);
        
        const result = {
          channel,
          posts,
          scrapedAt: new Date(),
          totalPosts
        };
        
        // Save to file
        await this.saveToFile(result, channel.username);
        
        return result;
        
      } catch (error) {
        logger.error('Scraping failed:', error);
        
        // Take screenshot on error
        if (page && config.debug.saveScreenshots) {
          await this.browserManager.screenshot(page, 'error');
        }
        
        return {
          channel: {
            name: 'Unknown',
            username: 'unknown',
            description: ''
          },
          posts: [],
          scrapedAt: new Date(),
          totalPosts: 0,
          error: error instanceof Error ? error.message : 'Unknown error'
        };
        
      } finally {
        if (page) {
          await page.close();
        }
      }
    }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DLHellMe/telegram-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server