Glama

extract_readable

Extract clean, readable text content from web pages by providing a URL, removing navigation elements and ads to focus on the main article text.

Input Schema

Name | Required | Description | Default
url  | Yes      |             |
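
To make the input shape concrete, here is a minimal sketch of the arguments object a caller would pass, together with a dependency-free check that approximates the "valid URL string" requirement. The helper name `isValidUrlArgument` is hypothetical and not part of the server; the real tool validates with Zod, and the built-in `URL` constructor used here is only assumed to be a close stand-in for that check.

```javascript
// Hypothetical helper (not from the server): approximates the schema's
// "url must be a valid URL string" rule using only built-in APIs.
function isValidUrlArgument(input) {
  if (typeof input !== "object" || input === null) return false;
  if (typeof input.url !== "string") return false;
  try {
    new URL(input.url); // throws a TypeError when the string cannot be parsed
    return true;
  } catch {
    return false;
  }
}

// Example arguments for a call to extract_readable.
const exampleArguments = { url: "https://example.com/article" };

console.log(isValidUrlArgument(exampleArguments));      // true
console.log(isValidUrlArgument({ url: "not a url" }));  // false
console.log(isValidUrlArgument({}));                    // false
```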

Implementation Reference

  • The main handler function for the 'extract_readable' tool. Fetches the HTML from the provided URL, uses JSDOM to parse it, applies Mozilla's Readability library to extract the main article content (title, byline, excerpt, text), and formats it into a markdown-like text response.
    async (input) => {
      const res = await fetch(input.url, {
        headers: {
          "User-Agent": "Mozilla/5.0 (compatible; MCP-Web-Tools/0.1; +https://example.com)",
        },
      });
      const html = await res.text();
      const dom = new JSDOM(html, { url: input.url });
      const reader = new Readability(dom.window.document);
      const article = reader.parse();
      if (!article) {
        return { content: [{ type: "text", text: "No readable content found." }] };
      }
      const textBlocks = [];
      if (article.title) textBlocks.push(`# ${article.title}`);
      if (article.byline) textBlocks.push(`by ${article.byline}`);
      if (article.excerpt) textBlocks.push(article.excerpt);
      if (article.textContent) textBlocks.push(article.textContent);
      return { content: [{ type: "text", text: textBlocks.join("\n\n") }] };
    }
  • Input schema validation using Zod: requires a single 'url' parameter that must be a valid URL string.
    { url: z.string().url() },
  • src/server.js:135-158 (registration)
    Registration of the 'extract_readable' tool on the MCP server using server.tool(), including schema and handler.
    server.tool(
      "extract_readable",
      { url: z.string().url() },
      async (input) => {
        const res = await fetch(input.url, {
          headers: {
            "User-Agent": "Mozilla/5.0 (compatible; MCP-Web-Tools/0.1; +https://example.com)",
          },
        });
        const html = await res.text();
        const dom = new JSDOM(html, { url: input.url });
        const reader = new Readability(dom.window.document);
        const article = reader.parse();
        if (!article) {
          return { content: [{ type: "text", text: "No readable content found." }] };
        }
        const textBlocks = [];
        if (article.title) textBlocks.push(`# ${article.title}`);
        if (article.byline) textBlocks.push(`by ${article.byline}`);
        if (article.excerpt) textBlocks.push(article.excerpt);
        if (article.textContent) textBlocks.push(article.textContent);
        return { content: [{ type: "text", text: textBlocks.join("\n\n") }] };
      }
    );
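
The formatting step of the handler can be looked at in isolation. The sketch below copies those steps into a standalone function so the response shape is easy to see without fetching a page or loading JSDOM and Readability. The name `formatArticle` and the sample article object are hypothetical; the server itself performs these steps inline.

```javascript
// Hypothetical helper: the handler's formatting steps, separated from
// fetch/JSDOM/Readability so they can be run on a plain object.
function formatArticle(article) {
  // Readability's parse() result is falsy when no article was found.
  if (!article) {
    return { content: [{ type: "text", text: "No readable content found." }] };
  }
  const textBlocks = [];
  if (article.title) textBlocks.push(`# ${article.title}`);
  if (article.byline) textBlocks.push(`by ${article.byline}`);
  if (article.excerpt) textBlocks.push(article.excerpt);
  if (article.textContent) textBlocks.push(article.textContent);
  // Blocks are separated by a blank line; empty fields are skipped entirely.
  return { content: [{ type: "text", text: textBlocks.join("\n\n") }] };
}

// Made-up article object with an empty excerpt, for illustration only.
const sample = { title: "Hello", byline: "Ada", excerpt: "", textContent: "Body text." };
console.log(formatArticle(sample).content[0].text);
// # Hello
//
// by Ada
//
// Body text.
```

Note that the handler as shown does not inspect the HTTP status before parsing, so an error page may be passed to Readability like any other HTML. Keeping the formatting in its own function would make it straightforward to add such a check ahead of it.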

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/JoaoPedroLanca/mcp-web-tools'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.