Glama

extract_readable

Extracts clean, readable text from web pages by removing ads, navigation, and other clutter to deliver focused content for analysis or reading.

Input Schema

Name    Required    Description    Default
url     Yes
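
The schema above lists `url` as the only, required, input. As a rough illustration of how a client might invoke the tool, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The helper name `buildExtractReadableCall`, the request `id`, and the example URL are placeholders chosen for illustration and do not come from this page.

```javascript
// Hypothetical helper: build an MCP "tools/call" request for extract_readable.
// The default id and the example URL below are illustrative placeholders.
function buildExtractReadableCall(url, id = 1) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "extract_readable",
      // The tool takes a single required argument: the page URL.
      arguments: { url },
    },
  };
}

const request = buildExtractReadableCall("https://example.com/article");
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would normally construct and send this message for you; the sketch only shows the shape of the arguments the tool expects.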

Implementation Reference

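  • The handler function for the 'extract_readable' tool. It fetches the HTML from the given URL, parses it with JSDOM, and applies Mozilla's Readability library to extract the main article content (title, byline, excerpt, text). The title is emitted as a Markdown heading, followed by the byline, excerpt, and plain text, all joined by blank lines into a single text block. If Readability finds no article, the handler returns "No readable content found."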
    async (input) => {
      const res = await fetch(input.url, {
        headers: {
          "User-Agent": "Mozilla/5.0 (compatible; MCP-Web-Tools/0.1; +https://example.com)",
        },
      });
      const html = await res.text();
      const dom = new JSDOM(html, { url: input.url });
      const reader = new Readability(dom.window.document);
      const article = reader.parse();
      if (!article) {
        return { content: [{ type: "text", text: "No readable content found." }] };
      }
      const textBlocks = [];
      if (article.title) textBlocks.push(`# ${article.title}`);
      if (article.byline) textBlocks.push(`by ${article.byline}`);
      if (article.excerpt) textBlocks.push(article.excerpt);
      if (article.textContent) textBlocks.push(article.textContent);
      return { content: [{ type: "text", text: textBlocks.join("\n\n") }] };
    }
  • The input schema for the 'extract_readable' tool, validating that the input contains a valid URL string.
    { url: z.string().url() },
  • src/server.js:135-158 (registration)
    The registration of the 'extract_readable' tool on the McpServer instance, specifying name, input schema, and handler function.
    server.tool(
      "extract_readable",
      { url: z.string().url() },
      async (input) => {
        const res = await fetch(input.url, {
          headers: {
            "User-Agent": "Mozilla/5.0 (compatible; MCP-Web-Tools/0.1; +https://example.com)",
          },
        });
        const html = await res.text();
        const dom = new JSDOM(html, { url: input.url });
        const reader = new Readability(dom.window.document);
        const article = reader.parse();
        if (!article) {
          return { content: [{ type: "text", text: "No readable content found." }] };
        }
        const textBlocks = [];
        if (article.title) textBlocks.push(`# ${article.title}`);
        if (article.byline) textBlocks.push(`by ${article.byline}`);
        if (article.excerpt) textBlocks.push(article.excerpt);
        if (article.textContent) textBlocks.push(article.textContent);
        return { content: [{ type: "text", text: textBlocks.join("\n\n") }] };
      }
    );
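
The snippets above rely on the server's own dependencies (the `z` schema builder, JSDOM, and Readability), so they cannot be run in isolation. As a minimal sketch, the dependency-free parts of the flow might look like the following. The helper names `isParsableUrl`, `ensureOk`, and `formatArticle` are hypothetical and do not appear in `src/server.js`. `isParsableUrl` only approximates the schema check by using the built-in `URL` constructor, and `ensureOk` is an addition: the handler as shown reads the response body without checking the HTTP status. The sample article object is made up for illustration.

```javascript
// Hypothetical helpers; none of these names appear in src/server.js.

// Rough stand-in for the schema check (an assumption, not the schema's exact
// rules): accept only strings the built-in URL constructor can parse.
function isParsableUrl(value) {
  if (typeof value !== "string") return false;
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Addition not present in the original handler: reject non-2xx responses
// instead of passing an error page on to the article extractor.
function ensureOk(res, url) {
  if (!res.ok) {
    throw new Error(`Request to ${url} failed with HTTP ${res.status}`);
  }
  return res;
}

// Same block-assembly logic as the handler: each non-empty field becomes
// one block, and blocks are separated by a blank line.
function formatArticle(article) {
  const textBlocks = [];
  if (article.title) textBlocks.push(`# ${article.title}`);
  if (article.byline) textBlocks.push(`by ${article.byline}`);
  if (article.excerpt) textBlocks.push(article.excerpt);
  if (article.textContent) textBlocks.push(article.textContent);
  return textBlocks.join("\n\n");
}

// Made-up object standing in for the result of reader.parse().
const sample = {
  title: "Hello",
  byline: null,
  excerpt: "A short summary.",
  textContent: "Body text.",
};

console.log(isParsableUrl("https://example.com/post")); // true
console.log(isParsableUrl("not a url")); // false
console.log(formatArticle(sample));
```

Because the byline in the sample is `null`, the output contains only the heading, the excerpt, and the body, which mirrors how the handler skips missing fields.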

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/JoaoPedroLanca/mcp-web-tools'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.