Skip to main content
Glama

fetchpage

Fetch web pages with full JavaScript rendering and automatic cookie management. Extract specific content using CSS selectors for dynamic websites and authenticated sessions.

Instructions

Fetch web pages using browser automation with full JavaScript rendering. Supports automatic cookie management, localStorage, CSS selectors, and dynamic content. Cookies are automatically loaded from local storage if available.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesThe URL to fetch
waitForNoCSS selector to extract specific content only (optional, extracts only content within this selector)
headlessNoRun browser in headless mode (optional, default: true)
timeoutNoTimeout in milliseconds (default: 30000)

Implementation Reference

  • The handler logic for the 'fetchpage' tool, which delegates to 'handleFetchSpaWithCookies'.
    if (toolName === 'fetchpage') {
      try {
        return await handleFetchSpaWithCookies(request.params.arguments, sendProgress);
  • The core implementation of 'handleFetchSpaWithCookies' which uses Puppeteer to fetch the page. (Captured range of the function start)
    async function handleFetchSpaWithCookies(args, sendProgress = null, shouldSaveFile = true) {
      const { url, waitFor, timeout = 30000, headless = true } = args;
    
      if (!url) {
        return {
          content: [
            {
              type: 'text',
              text: 'Error: URL parameter is required'
            }
          ]
        };
      }
    
      let browser = null;
    
    
        // 解析域名
        const urlObj = new URL(url);
        const domain = urlObj.hostname;
    
        // 获取cookie数据 - 自动从文件加载所有cookies
        let cookieData = null;
    
        // 自动合并所有cookie文件,解决短链/跨域跳转漏cookie问题
        const merged = cookieManager.loadAndMergeAllCookies();
        if (merged) {
          const hasExpired = cookieManager.isCookieExpiredForDomain(merged, domain);
          cookieData = merged;
          if (sendProgress) await sendProgress(0, 1, `已读取Cookie(合并 ${cookieData.cookies?.length || 0} 个${hasExpired ? ',包含过期项' : ''})`);
        } else {
          cookieData = null;
          if (sendProgress) await sendProgress(0, 1, '无Cookie');
        }
        
        // 启动Puppeteer浏览器,使用系统 Chrome(避免下载受管浏览器)
        const launchOptions = {
          headless: headless,
          defaultViewport: null, // 允许浏览器使用默认视口
          args: [
            '--no-sandbox',
            '--disable-setuid-sandbox',
            '--disable-blink-features=AutomationControlled',
            '--disable-features=VizDisplayCompositor',
            '--disable-extensions',
            '--disable-plugins',
            '--disable-sync',
            '--disable-translate',
            '--disable-default-apps',
            '--no-first-run',
            '--no-default-browser-check',
            '--disable-dev-shm-usage',
            '--disable-gpu',
            '--disable-web-security' // 有助于cookie设置
            // 移除 --no-zygote 和 --single-process 参数,这些会导致 frame detached 错误
          ]
        };
        
        // 直接写死系统 Chrome 路径(若存在),否则尝试使用 channel: 'chrome'
        const systemChrome = resolveSystemChromePath();
        if (systemChrome) {
          launchOptions.executablePath = systemChrome;
        } else {
          // 在 macOS/Windows 上,Puppeteer 可通过 channel 使用系统浏览器
          // 若仍未找到,将回退到默认行为(可能报未安装受管浏览器的错误)
          launchOptions.channel = 'chrome';
        }
    
        browser = await puppeteer.launch(launchOptions);
        
        const page = await browser.newPage();
        
        // 只在无头模式下设置视口大小
        if (headless) {
          await page.setViewport({
            width: 1366,
            height: 768,
            deviceScaleFactor: 1,
            hasTouch: false,
            isLandscape: true,
            isMobile: false,
          });
        }
        
        // 设置随机用户代理
        const userAgents = [
          'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
          'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
          'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36',
          'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15'
        ];
        const randomUserAgent = userAgents[Math.floor(Math.random() * userAgents.length)];
        await page.setUserAgent(randomUserAgent);
        
        // 禁用自动化检测标志
        await page.evaluateOnNewDocument(() => {
          // 删除webdriver属性
          Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined,
          });
          
          // 修改plugins长度
          Object.defineProperty(navigator, 'plugins', {
            get: () => [1, 2, 3, 4, 5],
          });
          
          // 修改语言设置
          Object.defineProperty(navigator, 'languages', {
            get: () => ['en-US', 'en'],
          });
          
          // 删除自动化控制标志
          delete Object.getPrototypeOf(navigator).webdriver;
          
          // 覆盖权限查询
          const originalQuery = window.navigator.permissions.query;
          window.navigator.permissions.query = (parameters) => (
            parameters.name === 'notifications' ?
              Promise.resolve({ state: Notification.permission }) :
              originalQuery(parameters)
          );
          
          // 模拟真实的Chrome运行时
          Object.defineProperty(window, 'chrome', {
            get: () => ({
              runtime: {},
              loadTimes: function() {},
              csi: function() {},
              app: {}
            }),
          });
        });
        
        // 使用正确的browser.setCookie API设置cookies(带SameSite映射与健壮性)
        if (cookieData && cookieData.cookies && cookieData.cookies.length > 0) {
          try {
            const mapSameSite = (val) => {
              if (!val) return null;
              const lower = String(val).toLowerCase();
              if (lower === 'lax') return 'Lax';
              if (lower === 'strict') return 'Strict';
              if (lower === 'none' || lower === 'no_restriction') return 'None';
              if (lower === 'unspecified' || lower === 'default') return null;
              return null;
            };
            
            const context = page.browserContext();
            const cookiesToSet = [];
            for (const cookie of cookieData.cookies) {
              if (!cookie || !cookie.name || !cookie.value || !cookie.domain) continue;
              const entry = {
                name: cookie.name,
                value: cookie.value,
                domain: cookie.domain,
                path: cookie.path || '/',
                secure: !!cookie.secure,
                httpOnly: !!cookie.httpOnly
              };
              const mapped = mapSameSite(cookie.sameSite);
              if (mapped) entry.sameSite = mapped;
              if (cookie.expirationDate) entry.expires = cookie.expirationDate;
              cookiesToSet.push(entry);
            }
            if (cookiesToSet.length > 0) {
              await context.setCookie(...cookiesToSet);
              if (sendProgress) await sendProgress(1, 1, `已设置 ${cookiesToSet.length} 个Cookie`);
            }
          } catch (error) {
            // 静默处理cookie设置错误(避免泄露敏感信息),但保留简要计数
          }
        } else {
        }
        
        // 在导航之前设置localStorage(如果有的话)
        // 在导航之前设置localStorage(按域名作用域写入,避免污染其他域)
        if (cookieData && cookieData.localStorageByDomain && Object.keys(cookieData.localStorageByDomain).length > 0) {
          await page.evaluateOnNewDocument((byDomain) => {
            try {
              const host = (location.hostname || '').replace(/^www\./, '');
              const candidates = [];
              for (const domain of Object.keys(byDomain)) {
                const d = String(domain).replace(/^www\./, '');
                if (host === d || host.endsWith('.' + d)) {
                  candidates.push(d);
                }
              }
              for (const d of candidates) {
                const bucket = byDomain[d] || {};
                for (const [k, v] of Object.entries(bucket)) {
                  try { window.localStorage.setItem(k, v); } catch (e) {}
                }
              }
            } catch (e) {
              // 忽略localStorage错误
            }
          }, cookieData.localStorageByDomain);
        }
        
        // 发送进度通知:设置完成,开始导航
        if (sendProgress) await sendProgress(4, 10, "开始页面导航");
        
        // 导航到目标页面(添加更多错误处理)
        let response;
        let finalUrl = url;
        try {
          response = await page.goto(url, { 
            waitUntil: 'domcontentloaded',
            timeout: timeout 
          });
          finalUrl = response?.url?.() || page.url() || url;
          
          // 检查页面是否正常加载
          if (response.status() >= 400) {
            throw new Error(`HTTP ${response.status()}: ${response.statusText()}`);
          }
          
        } catch (error) {
          throw new Error(`页面导航失败: ${error.message}`);
        }
        
        // 等待JavaScript执行完成
        try {
          await new Promise(r => setTimeout(r, 500));
          if (!page.isClosed()) {
            const readyState = await page.evaluate(() => document.readyState).catch(() => 'unknown');
            if (readyState !== 'complete') {
              await page.waitForFunction(() => document.readyState === 'complete', { timeout: 10000 }).catch(() => {});
            }
          }
        } catch (error) {
          // 继续执行,不抛出异常
        }
        
        // 提取目标规则:优先用户参数,其次域名预设
        const domainRule = getDomainRuleForUrl(url);
        const targetSelector = waitFor || domainRule.selector;
    
        // 等待动态内容渲染
        await new Promise(r => setTimeout(r, 800));
        
        // 如果有目标选择器,先等待元素出现
        if (targetSelector) {
          try {
  • Registration of the 'fetchpage' tool definition in the MCP server.
    server.setRequestHandler(ListToolsRequestSchema, async () => {
      return {
        tools: [
          {
            name: 'fetchpage',
            description: 'Fetch web pages using browser automation with full JavaScript rendering. Supports automatic cookie management, localStorage, CSS selectors, and dynamic content. Cookies are automatically loaded from local storage if available.',
            inputSchema: {
              type: 'object',
              properties: {
                url: {
                  type: 'string',
                  description: 'The URL to fetch'
                },
                waitFor: {
                  type: 'string',
                  description: 'CSS selector to extract specific content only (optional, extracts only content within this selector)'
                },
                headless: {
                  type: 'boolean',
                  description: 'Run browser in headless mode (optional, default: true)'
                },
                timeout: {
                  type: 'number',
                  description: 'Timeout in milliseconds (default: 30000)',
                  default: 30000
                }
              },
              required: ['url']
            }
          }
        ]
      };
    });
Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kaiye/mcp-fetch-page'

If you have feedback or need assistance with the MCP directory API, please join our Discord server