Skip to main content
Glama
sisilet

Wayback Machine MCP Server

by sisilet

get_archived_page

Retrieve archived webpage content from the Wayback Machine using a specific timestamp to access historical website versions and data.

Instructions

Retrieve content of an archived webpage from the Wayback Machine using YYYYMMDDHHMMSS timestamp. If original=true, request id_ mode.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
originalNo
timestampYes
urlYes

Output Schema

TableJSON Schema
NameRequiredDescriptionDefault
resultYes

Implementation Reference

  • Implements the core logic for retrieving an archived webpage from the Wayback Machine. Constructs the archived URL based on timestamp and optional 'original' mode, fetches the response using _fetch_text helper, and returns structured data including status, headers, and page text.
    async def get_archived_page(
    	url: str,
    	timestamp: str,
    	original: bool = False,
    ) -> Dict[str, Any]:
    	"""
    	Fetch the archived page content. Returns status, headers, and text content.
    	If `original` is True, uses the `id_` mode to minimize Wayback rewriting/banners.
    	"""
    	mode = "id_/" if original else ""
    	archived_url = f"{WAYBACK_ENDPOINT}/{timestamp}/{mode}{url}"
    	resp = await _fetch_text(archived_url)
    
    	text: Optional[str]
    	try:
    		text = resp.text
    	except Exception:
    		text = None
    
    	return {
    		"url": url,
    		"timestamp": timestamp,
    		"archived_url": archived_url,
    		"status_code": resp.status_code,
    		"headers": dict(resp.headers),
    		"text": text,
    	}
  • Registers the 'get_archived_page' tool with the FastMCP app, specifying the name and description.
    @app.tool(
    	name="get_archived_page",
    	description=(
    		"Retrieve content of an archived webpage from the Wayback Machine "
    		"using YYYYMMDDHHMMSS timestamp. If original=true, request id_ mode."
    	),
    )
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It hints at a retrieval operation ('Retrieve content') but lacks critical details: authentication requirements, rate limits, error handling, or what 'id_ mode' entails. The mention of timestamp format is useful but insufficient for a tool interacting with an external service like Wayback Machine.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded with the main purpose. Both sentences are relevant: the first states the action and key parameter detail, the second adds a conditional parameter behavior. There's no wasted text, though the ambiguous 'id_ mode' could be clarified for better efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters with 0% schema coverage, no annotations, and an output schema (which reduces need to describe returns), the description is moderately complete. It covers timestamp format and a parameter condition but misses explanations for 'url' and 'id_ mode', and lacks behavioral context like errors or limits. This is adequate but has clear gaps for a tool with external dependencies.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the schema provides no parameter documentation. The description adds some semantics: it explains the timestamp format (YYYYMMDDHHMMSS) and clarifies that 'original' parameter triggers 'id_ mode'. However, it doesn't explain the 'url' parameter or what 'id_ mode' means, leaving gaps for 3 parameters. This partial compensation justifies a baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Retrieve content') and resource ('archived webpage from the Wayback Machine'), making the purpose understandable. However, it doesn't explicitly differentiate this tool from its siblings (get_snapshots, search_items), which would require a 5. The mention of 'using YYYYMMDDHHMMSS timestamp' adds specificity but not sibling distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like get_snapshots or search_items. It mentions a conditional ('If original=true, request id_ mode') but this is a parameter usage note, not a contextual guideline for tool selection. Without any when-to-use or when-not-to-use information, the score is minimal.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sisilet/wayback-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server