What can you do with this server?

The Lean Reader server fetches URLs and returns token-minimized clean text plus a token-savings receipt, optimized for LLM consumption.

* Fetch and clean web pages: Strips nav bars, cookie banners, scripts, SVGs, and boilerplate, returning only article-focused content.
* Choose output format: Returns either markdown (default) or text.
* Token-savings receipt: Every response includes before/after token counts, percentage saved, compression ratio, and estimated cost savings (e.g., 231,276 → 15,735 tokens · 93% saved · ~$0.54 on gpt-4o).
* Dual-extractor content recovery: Runs both Defuddle and Mozilla Readability, keeping whichever recovers more body content, which reduces silent content loss.
* Partial-result flagging: JS-rendered or client-side (SPA) pages are flagged as partial rather than returning empty or misleading content.
* Optimized for static HTML: Best results on static pages; JS-rendered pages may return limited content.

How do I use Lean Reader?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Lean Reader read and minimize https://example.com/long-article".

That's it! The server will respond to your query, and you can continue using it as needed. A step-by-step guide with screenshots is also available.

Lean Reader

by AIMento


JavaScript

Remote

Lean Reader

Turn any URL into token-minimized clean text for LLMs, with a token-savings receipt on every call. MCP server + library.

LLMs don't need your nav bar, your cookie banner, your <script> tags, or 200 KB of inlined SVG — but raw page HTML makes them pay for all of it. Lean Reader strips a page down to the article and tells you exactly how many tokens (and dollars) you just saved.

231,276 → 15,735 tokens (93% saved · 14.7× vs raw HTML · ~$0.54 on gpt-4o) · cleaned by lean reader
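The numbers in that sample receipt can be reproduced with a few lines of arithmetic. The sketch below is not Lean Reader's own code: `buildReceipt` is a hypothetical helper, and the price of $2.50 per million input tokens is an assumption chosen because it matches the ~$0.54 shown for gpt-4o.

```javascript
// Hypothetical helper: derives receipt-style figures from two token counts.
// The per-million-token price is an assumed input, not something Lean Reader exposes.
function buildReceipt(beforeTokens, afterTokens, usdPerMillionInputTokens) {
  const saved = beforeTokens - afterTokens;
  return {
    beforeTokens,
    afterTokens,
    // Share of the raw-HTML tokens that were removed, as a whole percent.
    savedPct: Math.round((saved / beforeTokens) * 100),
    // Compression ratio versus raw HTML, rounded to one decimal place.
    ratio: Math.round((beforeTokens / afterTokens) * 10) / 10,
    // Input cost avoided, rounded to cents.
    estCostSavedUsd: Math.round((saved / 1e6) * usdPerMillionInputTokens * 100) / 100,
  };
}

const receipt = buildReceipt(231276, 15735, 2.5);
console.log(receipt.savedPct, receipt.ratio, receipt.estCostSavedUsd);
```

Under those assumptions the helper yields 93% saved, a 14.7× ratio, and $0.54, the same figures as the sample line.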

Use as an MCP server

Add to your client's MCP config (Claude Desktop/Code, Cursor, …):

{
  "mcpServers": {
    "lean-reader": { "command": "npx", "args": ["-y", "lean-reader"] }
  }
}

Then the lean_read(url, format?) tool returns clean text plus the receipt.
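Most clients build the tool call for you, but it can help to see what goes over the wire. The sketch below builds an MCP `tools/call` JSON-RPC request for `lean_read`. The helper name `buildLeanReadCall` is hypothetical, and the argument names `url` and `format` are taken from the tool signature above; the exact input schema is an assumption.

```javascript
// Formats the tool is documented to return.
const ALLOWED_FORMATS = ['markdown', 'text'];

// Hypothetical helper: builds a JSON-RPC 2.0 "tools/call" request for lean_read.
function buildLeanReadCall(id, url, format = 'markdown') {
  if (!ALLOWED_FORMATS.includes(format)) {
    throw new RangeError(`format must be one of: ${ALLOWED_FORMATS.join(', ')}`);
  }
  new URL(url); // throws a TypeError if the URL is not absolute and well-formed
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'lean_read', arguments: { url, format } },
  };
}

const request = buildLeanReadCall(1, 'https://example.com/long-article');
console.log(JSON.stringify(request, null, 2));
```

Validating the format and URL before sending keeps obviously bad calls from costing a round trip to the server.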


Use as a library

import { leanRead } from 'lean-reader/lib/core.js';

const r = await leanRead('https://example.com/article', { format: 'markdown' });
console.log(r.content);   // token-minimized text
console.log(r.receipt);   // { beforeTokens, afterTokens, savedPct, ratio, estCostSavedUsd, ... }
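If you want to log the receipt as a one-line summary, a small formatter is enough. This is a sketch, not part of the library: `formatReceipt` is a hypothetical helper, it assumes `savedPct` is a whole percent and `ratio` a plain number, and it takes the model name as a parameter because the receipt's remaining fields are not listed here.

```javascript
// Hypothetical formatter for a receipt-shaped object.
// Assumes savedPct is a whole percent (93, not 0.93) and ratio is a plain number.
function formatReceipt(receipt, model) {
  const n = (x) => x.toLocaleString('en-US'); // adds thousands separators
  return (
    `${n(receipt.beforeTokens)} → ${n(receipt.afterTokens)} tokens ` +
    `(${receipt.savedPct}% saved · ${receipt.ratio}× vs raw HTML · ` +
    `~$${receipt.estCostSavedUsd.toFixed(2)} on ${model})`
  );
}

// Sample values copied from the receipt line shown earlier in this README.
const sample = {
  beforeTokens: 231276,
  afterTokens: 15735,
  savedPct: 93,
  ratio: 14.7,
  estCostSavedUsd: 0.54,
};
const line = formatReceipt(sample, 'gpt-4o');
console.log(line);
```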

How much does it save?

Measured, not marketed — the open benchmark ships the corpus, the tokenizer, and every raw output, and flags the cases where Lean Reader loses:

* ~29% fewer tokens than Mozilla Readability (the standard extractor) at the median, while keeping ~99% of the body text. Be honest about where that edge comes from: it's the minimize post-pass (link/image/footnote/whitespace strip), not smarter extraction. Run both through minimize and they're roughly on par. Lean actually runs Readability as one of its two extractors (see Honest limits), so it doesn't lose to it.
* Versus raw page HTML the multiple is much larger (median ~8.7×, down to ~3.1× on already-clean blog prose, 100×+ on script-heavy docs), but that's HTML nobody feeds an LLM, so read it as "don't dump raw pages," not as a competitive claim.
* Versus Jina Reader (measured, anonymous tier): ~1.6× fewer tokens on a like-for-like body, ~4.3× if you count the nav and reference dumps Jina also returns. Firecrawl is not yet measured (needs an API key).
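To make the "minimize post-pass" concrete, here is a rough sketch of what a link/image/footnote/whitespace strip over markdown could look like. It is an illustration built on simple regular expressions, not Lean Reader's actual implementation; `minimizeMarkdown` is a hypothetical name, and the patterns ignore edge cases such as nested brackets and fenced code blocks.

```javascript
// Illustrative post-pass: removes markup that costs tokens but carries little body text.
// Note: collapsing spaces would also flatten indentation inside code blocks.
function minimizeMarkdown(md) {
  return md
    .replace(/!\[[^\]]*\]\([^)]*\)/g, '')     // drop images entirely
    .replace(/^\[\^[^\]]+\]:.*$/gm, '')       // drop footnote definitions
    .replace(/\[\^[^\]]+\]/g, '')             // drop inline footnote markers
    .replace(/\[([^\]]+)\]\([^)]*\)/g, '$1')  // keep link text, drop the URL
    .replace(/[ \t]+/g, ' ')                  // collapse runs of spaces and tabs
    .replace(/ *\n */g, '\n')                 // trim spaces around line breaks
    .replace(/\n{3,}/g, '\n\n')               // cap blank lines at one
    .trim();
}

const input =
  'See [the docs](https://example.com/docs)[^1] now.\n\n\n\n' +
  '![logo](logo.svg)\n\n[^1]: A footnote.\n';
const minimized = minimizeMarkdown(input);
console.log(JSON.stringify(minimized));
```

Because the pass works on already-extracted text, it can be applied to the output of any extractor, which is why running both Readability and Lean through it narrows the gap.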

The receipt uses the o200k_base tokenizer (GPT-4o/4.1 class); the model and tokenizer are always shown, and counts are vs the raw page HTML so you can check the math.

Honest limits

* Static HTML only (v1). Pages whose body is client-rendered (some SPAs, GitHub repo landing pages) return little; Lean Reader flags the result as partial instead of emitting empty text. Jina/Firecrawl render JS and will beat us there.
* Two extractors, body-max selection. Defuddle and Mozilla Readability each silently drop the body on different pages (Defuddle on some large Wikipedia articles, Readability on some docs/SPAs). Lean runs both and keeps whichever recovers more body, so neither's blind spot becomes a silent content drop. A ROUGE-L ground-truth pass on a 14-page hand-labeled sample has been run: reference-body recall is 0.99, equal to Readability on the same ground truth, so the word-count gap is noise removal, not body loss (see the bench repo).
* Token counts are o200k_base; Claude/Gemini tokenize differently.
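The body-max selection and partial flag described above can be sketched as follows. This is an assumption-heavy illustration, not the project's code: `pickBody`, the word-count metric, and the `minWords` threshold are all hypothetical, and the two extractors are stand-in functions rather than real Defuddle or Readability calls.

```javascript
// Counts whitespace-separated words as a rough proxy for "how much body was recovered".
const countWords = (text) => (text.trim().match(/\S+/g) || []).length;

// Hypothetical selector: runs every extractor, keeps the largest body,
// and flags the result as partial when even the best body is very short.
function pickBody(html, extractors, minWords = 50) {
  let best = { name: null, text: '', words: 0 };
  for (const [name, extract] of Object.entries(extractors)) {
    let text = '';
    try {
      text = extract(html) || '';
    } catch {
      text = ''; // a crashing extractor counts as recovering nothing
    }
    const words = countWords(text);
    if (words > best.words) best = { name, text, words };
  }
  return { ...best, partial: best.words < minWords };
}

// Stand-in extractors: one drops the body, the other recovers four words.
const result = pickBody(
  '<html></html>',
  { defuddleStub: () => '', readabilityStub: () => 'one two three four' },
  3
);
console.log(result.name, result.words, result.partial);
```

Returning a `partial` flag alongside the text lets the caller decide whether to fall back to a JS-rendering reader instead of trusting a near-empty result.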

Open-core

The extraction + token-minimization core (lib/) and the MCP server (src/) are MIT. Hosted service, sharing UI, and metering are separate.


Tools

lean_read



MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AIMento/lean-reader'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.
