MCP Google Server

by pgzhang

read_webpage

Extract readable text content from any webpage URL for analysis or processing.

Instructions

Fetch and extract text content from a webpage

Input Schema

Name | Required | Description                | Default
url  | Yes      | URL of the webpage to read |
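
As an illustration of the schema above, the following sketch builds the JSON-RPC 2.0 message an MCP client would send to invoke the tool through the `tools/call` method. The helper name `buildReadWebpageRequest`, the `ToolCallRequest` type, and the example URL are assumptions made for this example; they are not part of the server.

```typescript
// Shape of a JSON-RPC 2.0 request for the MCP `tools/call` method,
// narrowed to the single `url` argument that read_webpage requires.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: { url: string } };
}

// Hypothetical helper (not part of the server): builds a read_webpage call.
function buildReadWebpageRequest(url: string, id = 1): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'read_webpage', arguments: { url } },
  };
}

// Example usage with a placeholder URL.
const exampleRequest = buildReadWebpageRequest('https://example.com');
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library would normally construct this message for you; the sketch only shows where the `url` argument ends up.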

Implementation Reference

  • Handler for the 'read_webpage' tool: validates the input arguments, fetches the page with axios, and parses the HTML with cheerio to extract the title and whitespace-normalized body text. It returns the result as a JSON string in a text content item, or an error result (isError: true) if the axios request fails.
    } else if (request.params.name === 'read_webpage') {
      if (!isValidWebpageArgs(request.params.arguments)) {
        throw new McpError(
          ErrorCode.InvalidParams,
          'Invalid webpage arguments'
        );
      }
    
      const { url } = request.params.arguments;
    
      try {
        const response = await axios.get(url);
        const $ = cheerio.load(response.data);
    
        // Remove script and style elements
        $('script, style').remove();
    
        const content: WebpageContent = {
          title: $('title').text().trim(),
          text: $('body').text().trim().replace(/\s+/g, ' '),
          url: url,
        };
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(content, null, 2),
            },
          ],
        };
      } catch (error) {
        if (axios.isAxiosError(error)) {
          return {
            content: [
              {
                type: 'text',
                text: `Webpage fetch error: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
        throw error;
      }
    }
  • src/index.ts:109-122 (registration)
    Registration of the 'read_webpage' tool in the ListToolsRequestHandler, including name, description, and input schema definition.
    {
      name: 'read_webpage',
      description: 'Fetch and extract text content from a webpage',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'URL of the webpage to read',
          },
        },
        required: ['url'],
      },
    },
  • Helper function to validate input arguments for the 'read_webpage' tool.
    const isValidWebpageArgs = (
      args: any
    ): args is { url: string } =>
      typeof args === 'object' &&
      args !== null &&
      typeof args.url === 'string';
  • TypeScript interface defining the structure of the webpage content returned by the tool.
    interface WebpageContent {
      title: string;
      text: string;
      url: string;
    }
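
Because the handler above returns the `WebpageContent` object serialized as JSON inside a text content item, a caller has to parse it back. The following is a minimal client-side sketch of that step, assuming the result shape shown in the handler; `ToolResult` and `parseReadWebpageResult` are hypothetical names introduced for this example and are not part of the server.

```typescript
// Mirrors the interface returned by the tool (copied from the reference above).
interface WebpageContent {
  title: string;
  text: string;
  url: string;
}

// Assumed shape of the tool result, based on the handler's return values.
interface ToolResult {
  content: { type: string; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: turns a read_webpage result into a WebpageContent,
// throwing if the tool reported an error or the payload is malformed.
function parseReadWebpageResult(result: ToolResult): WebpageContent {
  const first = result.content[0];
  if (!first || first.type !== 'text') {
    throw new Error('Expected a text content item');
  }
  if (result.isError) {
    // On failure the handler puts the error message in the text field.
    throw new Error(first.text);
  }
  const parsed: unknown = JSON.parse(first.text);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('Expected a JSON object');
  }
  const { title, text, url } = parsed as Record<string, unknown>;
  if (typeof title !== 'string' || typeof text !== 'string' || typeof url !== 'string') {
    throw new Error('Result does not match WebpageContent');
  }
  return { title, text, url };
}

// Example usage with a hand-written result (placeholder values).
const sampleResult: ToolResult = {
  content: [
    {
      type: 'text',
      text: JSON.stringify({ title: 'Example', text: 'Hello world', url: 'https://example.com' }),
    },
  ],
};
console.log(parseReadWebpageResult(sampleResult).title);
```

Checking `isError` before parsing matters here: on a fetch failure the text field holds a plain message such as `Webpage fetch error: ...`, which is not valid JSON.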

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pgzhang/mcp2'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.