Glama

mult-fetch-mcp-server

by lmcc-dev

fetch_plaintext

Fetch a webpage and return its content as plain text with HTML tags removed, supporting chunked retrieval for large pages.

Instructions

Fetch a website and return the content as plain text with HTML tags removed. Best practices:

1) Always set startCursor=0 for initial requests, and use the fetchedBytes value from previous response for subsequent requests to ensure content continuity.
2) Set contentSizeLimit between 20000-50000 for large pages.
3) When handling large content, use the chunking system by following the startCursor instructions in the system notes rather than increasing contentSizeLimit.
4) If content retrieval fails, you can retry using the same chunkId and startCursor, or adjust startCursor as needed but you must handle any resulting data duplication or gaps yourself.
5) Always explain to users when content is chunked and ask if they want to continue retrieving subsequent parts.
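
The cursor handling in best practices 1, 2 and 5 can be sketched as a generator that issues one request per chunk, so the caller decides after each chunk whether to continue. This is a minimal sketch under stated assumptions: `call_tool` is a hypothetical callable that forwards arguments to fetch_plaintext, and the response keys `text` and `hasMore` are invented for illustration because the page documents no output schema. Only `startCursor`, `contentSizeLimit`, `chunkId` and `fetchedBytes` are named in the source.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: any callable that sends the arguments to
# fetch_plaintext and returns the parsed response as a dict.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_chunks(call_tool: ToolCall, url: str,
                content_size_limit: int = 30000) -> Iterator[str]:
    """Yield one chunk of plain text per request, starting at cursor 0."""
    args: Dict[str, Any] = {
        "url": url,
        "startCursor": 0,                        # always 0 on the first request
        "contentSizeLimit": content_size_limit,  # 20000-50000 is recommended
    }
    while True:
        response = call_tool(args)
        # Assumed response keys: "text" and "hasMore" are illustrative only.
        yield response["text"]
        if not response.get("hasMore"):
            return
        # Resume exactly where the previous response stopped.
        args["startCursor"] = response["fetchedBytes"]
        args["chunkId"] = response["chunkId"]


def make_fake_tool(page: bytes) -> ToolCall:
    """Stand-in for the real server: slices an in-memory ASCII page."""
    def fake(args: Dict[str, Any]) -> Dict[str, Any]:
        start = args["startCursor"]
        end = min(start + args["contentSizeLimit"], len(page))
        return {
            "text": page[start:end].decode("ascii"),
            "fetchedBytes": end,
            "chunkId": "demo",
            "hasMore": end < len(page),
        }
    return fake
```

Because the generator only sends the next request when the caller asks for the next chunk, an agent can stop after any chunk and ask the user whether to continue.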

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | URL of the website to fetch | |
| startCursor | Yes | Starting cursor position in bytes. Set to 0 for initial requests, and use the value from previous responses for subsequent requests to resume content retrieval. | |
| headers | No | Optional headers to include in the request | |
| proxy | No | Optional proxy server to use (format: http://host:port or https://host:port) | |
| timeout | No | Optional timeout in milliseconds (default: 30000) | |
| maxRedirects | No | Optional maximum number of redirects to follow (default: 10) | |
| useSystemProxy | No | Optional flag to use system proxy environment variables (default: true) | |
| debug | No | Optional flag to enable detailed debug logging (default: false) | |
| noDelay | No | Optional flag to disable random delay between requests (default: false) | |
| useBrowser | No | Optional flag to use headless browser for fetching (default: false) | |
| waitForSelector | No | Optional CSS selector to wait for when using browser mode | |
| waitForTimeout | No | Optional timeout to wait after page load in browser mode (default: 5000) | |
| scrollToBottom | No | Optional flag to scroll to bottom of page in browser mode (default: false) | |
| closeBrowser | No | Optional flag to close the browser after fetching (default: false) | |
| saveCookies | No | Optional flag to save cookies for future requests to the same domain (default: true) | |
| autoDetectMode | No | Optional flag to automatically switch to browser mode if standard fetch fails (default: true). Set to false to strictly use the specified mode without automatic switching. | |
| contentSizeLimit | No | Optional maximum content size in bytes before splitting into chunks (default: 50KB). Set between 20KB-50KB for optimal results. For large content, prefer smaller values (20KB-30KB) to avoid truncation. | |
| enableContentSplitting | No | Optional flag to enable content splitting for large responses (default: true) | |
| chunkId | No | Optional chunk ID for retrieving a specific chunk of content from a previous request. The system adds prompts in the format === SYSTEM NOTE === ... =================== which AI models should ignore when processing the content. | |
| extractContent | No | Optional flag to enable intelligent content extraction using Readability algorithm (default: false). Extracts main article content from web pages. | |
| includeMetadata | No | Optional flag to include metadata (title, author, etc.) in the extracted content (default: false). Only works when extractContent is true. | |
| fallbackToOriginal | No | Optional flag to fall back to the original content when extraction fails (default: true). Only works when extractContent is true. | |
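
To make the schema concrete, here is a rough sketch of how a client might assemble a request for this tool. The `tools/call` envelope follows the MCP JSON-RPC convention. The helper name `build_fetch_plaintext_request` is hypothetical, and rejecting `includeMetadata` or `fallbackToOriginal` when `extractContent` is not set is a design choice of this sketch, not documented server behavior.

```python
from typing import Any, Dict


def build_fetch_plaintext_request(url: str, start_cursor: int = 0,
                                  request_id: int = 1,
                                  **options: Any) -> Dict[str, Any]:
    """Build a JSON-RPC tools/call request for fetch_plaintext."""
    if start_cursor < 0:
        raise ValueError("startCursor must be a non-negative byte offset")

    # url and startCursor are the two required parameters in the schema.
    arguments: Dict[str, Any] = {"url": url, "startCursor": start_cursor}
    arguments.update(options)

    # The schema says these flags only work when extractContent is true.
    # Failing early here is this sketch's choice, not the server's.
    if not arguments.get("extractContent"):
        for dependent in ("includeMetadata", "fallbackToOriginal"):
            if dependent in arguments:
                raise ValueError(
                    f"{dependent} only works when extractContent is true")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "fetch_plaintext", "arguments": arguments},
    }
```

For example, `build_fetch_plaintext_request("https://example.com", contentSizeLimit=30000)` produces an initial request with `startCursor` set to 0 and a chunk size inside the recommended range.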

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Given no annotations, the description fully discloses critical behaviors: chunking system, cursor-based pagination, retry implications (data duplication/gaps), content size limits, and handling of system prompts. This level of detail compensates for missing annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
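
The system prompts mentioned above are meant to be ignored when processing content. One way a client could drop them before further processing is sketched below. The pattern assumes each note opens with `=== SYSTEM NOTE ===` and closes with a run of at least ten `=` characters; the exact delimiter length and note body are not specified on this page, so treat the regular expression as an assumption.

```python
import re

# Assumed delimiters: an opening "=== SYSTEM NOTE ===" marker and a closing
# run of ten or more "=" characters. The note body may span several lines.
SYSTEM_NOTE = re.compile(r"=== SYSTEM NOTE ===.*?={10,}[ \t]*\n?", re.DOTALL)


def strip_system_notes(text: str) -> str:
    """Remove every system note block and return the remaining content."""
    return SYSTEM_NOTE.sub("", text)
```

A client that follows the chunking instructions should read the cursor value from the note before stripping it.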

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a numbered list and front-loaded purpose. Despite its length, each sentence adds value for a complex tool with 22 parameters. A slightly more condensed version could improve conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 22 parameters, no output schema, and nested objects, the description is remarkably complete. It covers pagination, chunking, error handling, parameter ranges, and user communication. The lack of output schema is mitigated by explaining the return type (plain text).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While the schema covers all parameters (100% coverage), the description adds significant context beyond syntax: recommended ranges (20000-50000 for contentSizeLimit) and the chunking workflow (startCursor/fetchedBytes/chunkId interplay). The dependency of includeMetadata and fallbackToOriginal on extractContent is documented in the schema rather than in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Fetch a website and return the content as plain text with HTML tags removed', specifying the verb, resource, and output format. It implicitly distinguishes the tool from sibling tools such as fetch_html and fetch_json by emphasizing plain text output.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides extensive best practices for using startCursor, contentSizeLimit, chunking, and retry logic. However, it does not explicitly contrast with alternative tools (e.g., 'use fetch_html for HTML content'), leaving some implicit differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
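
The retry guidance in the tool description (reuse the same chunkId and startCursor after a failure) could be sketched as follows. `ToolError` and `call_tool` are hypothetical names introduced for this example; a real client would catch whatever error type its transport raises.

```python
import time
from typing import Any, Callable, Dict


class ToolError(Exception):
    """Hypothetical error raised by the transport when a fetch fails."""


def call_with_retry(call_tool: Callable[[Dict[str, Any]], Dict[str, Any]],
                    args: Dict[str, Any], attempts: int = 3,
                    delay_seconds: float = 0.0) -> Dict[str, Any]:
    """Retry a failed request with identical arguments."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            # Same chunkId and startCursor on every attempt, so a successful
            # retry neither duplicates nor skips content.
            return call_tool(dict(args))
        except ToolError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

Changing startCursor between attempts is also allowed by the description, but then the caller has to reconcile any overlap or gap between chunks.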

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/lmcc-dev/mult-fetch-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.