MCP Google Server

by adenot

read_webpage

Extract text content from any webpage by providing its URL. This tool fetches and processes web content for analysis or data collection.
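As the handler under Implementation Reference shows, the extracted page comes back as a JSON string inside a single text content item, and fetch failures come back as plain text with `isError` set. The following is a minimal client-side sketch of decoding that result. `ToolResult`, `parseReadWebpageResult`, and `normalizeText` are hypothetical names introduced for illustration; they are not part of the server or of any SDK.

```typescript
// Shape of the JSON payload the handler serializes into the text item.
interface WebpageContent {
  title: string;
  text: string;
  url: string;
}

// Assumed minimal shape of a tool result: text items plus an optional error flag.
interface ToolResult {
  content: { type: string; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: decode a read_webpage result, or throw if the
// server reported a fetch error.
function parseReadWebpageResult(result: ToolResult): WebpageContent {
  const first = result.content[0];
  if (first === undefined) {
    throw new Error('Tool result has no content items');
  }
  if (result.isError) {
    // On Axios failures the handler returns "Webpage fetch error: ..." as plain text.
    throw new Error(first.text);
  }
  return JSON.parse(first.text) as WebpageContent;
}

// Mirrors the handler's clean-up of the body text: trim, then collapse
// every run of whitespace into a single space.
function normalizeText(raw: string): string {
  return raw.trim().replace(/\s+/g, ' ');
}

// Sample result, built the same way the handler builds its response.
const sample: ToolResult = {
  content: [
    {
      type: 'text',
      text: JSON.stringify(
        {
          title: 'Example Domain',
          text: normalizeText('  Example\n\n  Domain   page '),
          url: 'https://example.com',
        },
        null,
        2
      ),
    },
  ],
};

const page = parseReadWebpageResult(sample);
console.log(page.title, '|', page.text);
```

Because the payload is a string rather than structured data, a caller has to check `isError` before attempting `JSON.parse`; the error text is not valid JSON.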

Instructions

Fetch and extract text content from a webpage

Input Schema

Name    Required    Description                   Default
url     Yes         URL of the webpage to read    (none)
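The schema only requires that `url` is a string. A short sketch of what that means in practice is given below: the type guard reproduces the server's check (typed with `unknown` instead of `any`), while `isHttpUrl` is a hypothetical stricter check that is not part of the original server and is shown only as one way a caller might pre-validate input.

```typescript
// Same check as the server's isValidWebpageArgs: a non-null object
// whose `url` property is a string. Nothing more is verified.
const isValidWebpageArgs = (args: unknown): args is { url: string } =>
  typeof args === 'object' &&
  args !== null &&
  typeof (args as { url?: unknown }).url === 'string';

// Hypothetical stricter check (an assumption, not in the server):
// the string must also parse as an http or https URL.
function isHttpUrl(value: string): boolean {
  try {
    const parsed = new URL(value);
    return parsed.protocol === 'http:' || parsed.protocol === 'https:';
  } catch {
    // new URL() throws on strings it cannot parse.
    return false;
  }
}

const good = { url: 'https://example.com/docs' };
const notAUrl = { url: 'not a url' };
const wrongType = { url: 42 };

for (const args of [good, notAUrl, wrongType]) {
  const passesGuard = isValidWebpageArgs(args);
  const passesStrict = passesGuard && isHttpUrl(args.url);
  console.log(JSON.stringify(args), { passesGuard, passesStrict });
}
```

Arguments such as `{ url: 'not a url' }` pass the guard, so with the server as shown they would only fail later, when the HTTP request is attempted.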

Implementation Reference

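  • The main handler logic for the 'read_webpage' tool. It validates the input arguments, fetches the page with axios (through a proxy when one is configured), parses the HTML with cheerio, removes script and style elements, and returns the page title, whitespace-normalized body text, and URL as a JSON-serialized WebpageContent. Axios errors are returned as an error result; any other error is rethrown.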
    } else if (request.params.name === 'read_webpage') {
      if (!isValidWebpageArgs(request.params.arguments)) {
        throw new McpError(
          ErrorCode.InvalidParams,
          'Invalid webpage arguments'
        );
      }
    
      const { url } = request.params.arguments;
    
      try {
        const proxyConfig = createProxyConfig();
        const response = await axios.get(url, {
          proxy: proxyConfig,
        });
        const $ = cheerio.load(response.data);
    
        // Remove script and style elements
        $('script, style').remove();
    
        const content: WebpageContent = {
          title: $('title').text().trim(),
          text: $('body').text().trim().replace(/\s+/g, ' '),
          url: url,
        };
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(content, null, 2),
            },
          ],
        };
      } catch (error) {
        if (axios.isAxiosError(error)) {
          return {
            content: [
              {
                type: 'text',
                text: `Webpage fetch error: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
        throw error;
      }
    }
  • src/index.ts:140-153 (registration)
    Registration of the 'read_webpage' tool in the MCP server's ListTools response, defining its name, description, and input schema.
    {
      name: 'read_webpage',
      description: 'Fetch and extract text content from a webpage',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'URL of the webpage to read',
          },
        },
        required: ['url'],
      },
    },
  • Input validation type guard for the 'read_webpage' tool arguments. It checks that the arguments are a non-null object with a string 'url' property; it does not check that the string is a well-formed URL.
    const isValidWebpageArgs = (
      args: any
    ): args is { url: string } =>
      typeof args === 'object' &&
      args !== null &&
      typeof args.url === 'string';
  • TypeScript interface defining the structure of the webpage content output from the 'read_webpage' tool.
    interface WebpageContent {
      title: string;
      text: string;
      url: string;
    }
  • Helper function to create Axios proxy configuration from environment variables, used in the 'read_webpage' handler for HTTP requests.
    function createProxyConfig(): AxiosProxyConfig | false {
      const httpsProxy = process.env.HTTPS_PROXY || process.env.https_proxy;
      const httpProxy = process.env.HTTP_PROXY || process.env.http_proxy;
    
      const proxyUrl = httpsProxy || httpProxy;
    
      if (!proxyUrl) {
        return false;
      }
    
      try {
        const url = new URL(proxyUrl);
        return {
          protocol: url.protocol.replace(':', ''),
          host: url.hostname,
          port: parseInt(url.port, 10) || (url.protocol === 'https:' ? 443 : 80),
          auth: url.username && url.password ? {
            username: url.username,
            password: url.password
          } : undefined
        };
      } catch (error) {
        console.warn(`Invalid proxy URL: ${proxyUrl}`);
        return false;
      }
    }
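To make the proxy helper's behavior concrete, here is a dependency-free sketch of the same resolution logic. It assumes two changes made purely for illustration: the environment is passed in as an argument instead of being read from `process.env`, and a local `ProxyConfig` interface stands in for Axios's proxy type. The name `proxyFromEnv` and the example hosts are hypothetical.

```typescript
// Local stand-in for the Axios proxy config type, so the sketch has no dependencies.
interface ProxyConfig {
  protocol: string;
  host: string;
  port: number;
  auth?: { username: string; password: string };
}

// Same precedence as createProxyConfig: HTTPS_PROXY, https_proxy,
// HTTP_PROXY, then http_proxy. Returns false when no usable proxy is set.
function proxyFromEnv(
  env: Record<string, string | undefined>
): ProxyConfig | false {
  const proxyUrl =
    env['HTTPS_PROXY'] || env['https_proxy'] ||
    env['HTTP_PROXY'] || env['http_proxy'];
  if (!proxyUrl) {
    return false;
  }
  try {
    const url = new URL(proxyUrl);
    return {
      protocol: url.protocol.replace(':', ''),
      host: url.hostname,
      // url.port is '' when the URL omits the port (or uses the scheme's
      // default), so fall back to 443 for https and 80 otherwise.
      port: parseInt(url.port, 10) || (url.protocol === 'https:' ? 443 : 80),
      auth: url.username && url.password
        ? { username: url.username, password: url.password }
        : undefined,
    };
  } catch {
    // An unparseable proxy URL disables proxying instead of failing the request.
    return false;
  }
}

const noProxy = proxyFromEnv({});
const plain = proxyFromEnv({ HTTP_PROXY: 'http://proxy.example.com:3128' });
const secure = proxyFromEnv({
  HTTPS_PROXY: 'https://user:secret@secure.example.com',
  HTTP_PROXY: 'http://proxy.example.com:3128',
});
const invalid = proxyFromEnv({ HTTP_PROXY: 'not a url' });

console.log({ noProxy, plain, secure, invalid });
```

Two details follow from this logic: the HTTPS variable wins whenever both are set, regardless of the scheme of the page being fetched, and credentials are only forwarded when both a username and a password are present in the proxy URL.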
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/adenot/mcp-google-search'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.