mult-fetch-mcp-server

by lmcc-dev

fetch_html

Fetch a website and return its HTML content. Supports chunking for large pages, browser automation for dynamic content, and intelligent extraction for main article text.

Instructions

Fetch a website and return the content as HTML. Best practices: 1) Always set startCursor=0 for initial requests, and use the fetchedBytes value from the previous response for subsequent requests to ensure content continuity. 2) Set contentSizeLimit between 20000 and 50000 for large pages. 3) When handling large content, use the chunking system by following the startCursor instructions in the system notes rather than increasing contentSizeLimit. 4) If content retrieval fails, you can retry using the same chunkId and startCursor, or adjust startCursor as needed, but you must then handle any resulting data duplication or gaps yourself. 5) Always explain to users when content is chunked and ask if they want to continue retrieving subsequent parts.
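
The cursor handling in best practices 1, 3 and 5 can be sketched as a small loop. This is only an illustration under stated assumptions: the page does not document the response structure, so the keys `content`, `fetchedBytes`, `chunkId` and `hasMore`, as well as the `call_tool` and `should_continue` callables, are hypothetical stand-ins for whatever your MCP client and the server's system notes actually provide.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature of an MCP client call: (tool name, arguments) -> result.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def fetch_all_html(
    call_tool: ToolCall,
    url: str,
    should_continue: Callable[[int], bool],
    content_size_limit: int = 30000,
    max_chunks: int = 20,
) -> str:
    """Collect a page chunk by chunk, resuming from the previous fetchedBytes."""
    parts: List[str] = []
    # Initial request: startCursor must be 0.
    args: Dict[str, Any] = {
        "url": url,
        "startCursor": 0,
        "contentSizeLimit": content_size_limit,
    }
    for chunk_number in range(1, max_chunks + 1):
        result = call_tool("fetch_html", args)
        parts.append(result["content"])  # assumed key
        if not result.get("hasMore"):  # assumed key
            break
        # Best practice 5: ask before retrieving the next part.
        if not should_continue(chunk_number):
            break
        # Resume from the previous response instead of raising contentSizeLimit.
        args = {
            **args,
            "startCursor": result["fetchedBytes"],
            "chunkId": result["chunkId"],
        }
    return "".join(parts)
```

In an interactive agent, `should_continue` would be the point where the user is told that the content is chunked and asked whether to fetch the next part.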

Input Schema

Name | Required | Description | Default
--- | --- | --- | ---
url | Yes | URL of the website to fetch |
startCursor | Yes | Starting cursor position in bytes. Set to 0 for initial requests, and use the value from previous responses for subsequent requests to resume content retrieval. |
headers | No | Optional headers to include in the request |
proxy | No | Optional proxy server to use (format: http://host:port or https://host:port) |
timeout | No | Optional timeout in milliseconds | 30000
maxRedirects | No | Optional maximum number of redirects to follow | 10
useSystemProxy | No | Optional flag to use system proxy environment variables | true
debug | No | Optional flag to enable detailed debug logging | false
noDelay | No | Optional flag to disable random delay between requests | false
useBrowser | No | Optional flag to use headless browser for fetching | false
waitForSelector | No | Optional CSS selector to wait for when using browser mode |
waitForTimeout | No | Optional timeout to wait after page load in browser mode | 5000
scrollToBottom | No | Optional flag to scroll to bottom of page in browser mode | false
closeBrowser | No | Optional flag to close the browser after fetching | false
saveCookies | No | Optional flag to save cookies for future requests to the same domain | true
autoDetectMode | No | Optional flag to automatically switch to browser mode if standard fetch fails. Set to false to strictly use the specified mode without automatic switching. | true
contentSizeLimit | No | Optional maximum content size in bytes before splitting into chunks. Set between 20KB-50KB for optimal results. For large content, prefer smaller values (20KB-30KB) to avoid truncation. | 50KB
enableContentSplitting | No | Optional flag to enable content splitting for large responses | true
chunkId | No | Optional chunk ID for retrieving a specific chunk of content from a previous request. The system adds prompts in the format === SYSTEM NOTE === ... =================== which AI models should ignore when processing the content. |
extractContent | No | Optional flag to enable intelligent content extraction using Readability algorithm. Extracts main article content from web pages. | false
includeMetadata | No | Optional flag to include metadata (title, author, etc.) in the extracted content. Only works when extractContent is true. | false
fallbackToOriginal | No | Optional flag to fall back to the original content when extraction fails. Only works when extractContent is true. | true
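
As an illustration of how these parameters combine, the sketch below assembles an argument object for a browser-mode request with article extraction. The helper `build_fetch_html_args` is hypothetical and not part of the server. It only encodes two rules stated in the schema: `url` and `startCursor` are required, and `includeMetadata` only works when `extractContent` is true.

```python
from typing import Any, Dict


def build_fetch_html_args(url: str, start_cursor: int = 0, **options: Any) -> Dict[str, Any]:
    """Assemble fetch_html arguments and check the documented dependencies."""
    if not url:
        raise ValueError("url is required")
    if start_cursor < 0:
        raise ValueError("startCursor is a byte offset and cannot be negative")
    args: Dict[str, Any] = {"url": url, "startCursor": start_cursor, **options}
    # The schema says includeMetadata only works when extractContent is true.
    if args.get("includeMetadata") and not args.get("extractContent"):
        raise ValueError("includeMetadata requires extractContent=True")
    return args


# Example: render a dynamic page in the headless browser, wait for the
# <article> element, then extract the main text together with its metadata.
example_args = build_fetch_html_args(
    "https://example.com/article",
    useBrowser=True,
    waitForSelector="article",
    extractContent=True,
    includeMetadata=True,
)
```

Options that are left out fall back to the server defaults listed in the schema, so only values that differ from those defaults need to be passed.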
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses chunking behavior, cursor management, retry semantics, data duplication risks, and instruction to ignore system notes. Comprehensive coverage.
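
A client that post-processes the returned HTML may want to separate the system notes from the page content. The sketch below assumes each note is delimited exactly as the schema shows, `=== SYSTEM NOTE ===` followed by a closing run of equals signs. The threshold of at least ten equals signs and the function name `strip_system_notes` are assumptions made for this example.

```python
import re
from typing import List, Tuple

# Opening marker, lazily matched note body, then a closing run of "=".
# The minimum run length of 10 is an assumption, not a documented value.
SYSTEM_NOTE = re.compile(r"=== SYSTEM NOTE ===(.*?)={10,}", re.DOTALL)


def strip_system_notes(text: str) -> Tuple[str, List[str]]:
    """Return the content without system notes, plus the note bodies."""
    notes = [body.strip() for body in SYSTEM_NOTE.findall(text)]
    cleaned = SYSTEM_NOTE.sub("", text)
    return cleaned, notes
```

Keeping the note bodies is useful because, according to the instructions, they carry the startCursor guidance for the next request.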

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single paragraph with five numbered best practices. Front-loaded with purpose; each sentence provides unique value. No redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 22 parameters and no output schema, the description covers core usage patterns well but could briefly mention response structure or error codes. Still highly informative.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

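Schema coverage is 100%, so the baseline score is 3. The description adds value beyond the schema descriptions by providing usage context for key parameters such as startCursor (set to 0 for the initial request, then to the fetchedBytes value from the previous response) and contentSizeLimit (20000-50000 bytes).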

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Fetch a website and return the content as HTML', specifying the verb, resource, and output format. This distinguishes it from siblings like fetch_json, fetch_markdown, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Five explicit best practices guide when and how to use the tool, including initial cursor setting, content size limits, chunking, retry handling, and user communication. No exclusions needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/lmcc-dev/mult-fetch-mcp-server'
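
The same request can be made from a script. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and because this page does not describe the response format, the body is returned as plain text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server, escaping both path parts."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server_info(owner: str, name: str, timeout: float = 30.0) -> str:
    """Send the GET request shown in the curl example and return the raw body."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_server_info("lmcc-dev", "mult-fetch-mcp-server")` requests the same URL as the curl command.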

If you have feedback or need assistance with the MCP directory API, please join our Discord server.