osint-mcp-server

by badchars

wayback_snapshots

Retrieve Wayback Machine snapshot history for any URL. Returns timestamps, status codes, and direct archive links.

Instructions

Get Wayback Machine snapshot history for a specific URL. Returns timestamps, status codes, and direct archive links. Shows first/last seen dates.
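The timestamps in the output are the Wayback Machine's usual 14-digit `YYYYMMDDhhmmss` strings in UTC, and each archive link follows the pattern `https://web.archive.org/web/<timestamp>/<original URL>` used by the implementation. The sketch below shows one way a caller might consume those fields. The helper names `parseWaybackTimestamp` and `buildArchiveUrl` are hypothetical and not part of the server.

```typescript
// Hypothetical helpers for consuming the tool's output; not part of osint-mcp-server.

/** Convert a 14-digit Wayback timestamp (YYYYMMDDhhmmss, UTC) into a Date. */
function parseWaybackTimestamp(ts: string): Date {
  const m = /^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$/.exec(ts);
  if (!m) throw new Error(`Unexpected Wayback timestamp: ${ts}`);
  // Date.UTC expects a zero-based month, hence the "- 1".
  return new Date(
    Date.UTC(
      Number(m[1]),
      Number(m[2]) - 1,
      Number(m[3]),
      Number(m[4]),
      Number(m[5]),
      Number(m[6]),
    ),
  );
}

/** Build a direct archive link the same way the tool does. */
function buildArchiveUrl(timestamp: string, originalUrl: string): string {
  return `https://web.archive.org/web/${timestamp}/${originalUrl}`;
}

const seen = parseWaybackTimestamp("20240115093000");
console.log(seen.toISOString()); // 2024-01-15T09:30:00.000Z
console.log(buildArchiveUrl("20240115093000", "https://example.com/"));
```

Throwing on malformed timestamps is a design choice of this sketch; the tool itself passes the timestamp strings through unchanged.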

Input Schema

Name | Required | Description | Default
url | Yes | URL to get snapshot history for | (none)
limit | No | Maximum snapshots to return | 100
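As an illustration of the schema, the sketch below shows two argument objects and how the optional `limit` falls back to 100. The `WaybackSnapshotsArgs` type and `resolveArgs` function are hypothetical names introduced here; the server applies the default through a function parameter instead.

```typescript
// Hypothetical argument shape mirroring the input schema; not taken from the server.
interface WaybackSnapshotsArgs {
  url: string;
  limit?: number;
}

// Default stated in the schema description.
const DEFAULT_LIMIT = 100;

/** Fill in the default limit when the caller omits it. */
function resolveArgs(args: WaybackSnapshotsArgs): { url: string; limit: number } {
  return { url: args.url, limit: args.limit ?? DEFAULT_LIMIT };
}

console.log(resolveArgs({ url: "example.com" }));
console.log(resolveArgs({ url: "example.com", limit: 10 }));
```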

Implementation Reference

  • The 'waybackSnapshots' function that executes the tool logic. It queries the Wayback Machine CDX API, parses the JSON response, and returns snapshot history including timestamps, status codes, MIME types, and archive URLs.
    export async function waybackSnapshots(url: string, limit = 100): Promise<WaybackSnapshotsResult> {
      await limiter.acquire();
    
      const params = new URLSearchParams({
        url,
        output: "json",
        fl: "timestamp,original,statuscode,mimetype",
        limit: String(limit),
      });
    
      const controller = new AbortController();
      const timeout = setTimeout(() => controller.abort(), 30000);
    
      try {
        const res = await fetch(`https://web.archive.org/cdx/search/cdx?${params}`, { signal: controller.signal });
        if (!res.ok) throw new Error(`Wayback CDX returned ${res.status}`);
    
        const data: string[][] = await res.json();
        const rows = data.slice(1);
    
        const snapshots: WaybackSnapshot[] = rows.map((row) => ({
          timestamp: row[0] ?? "",
          url: row[1] ?? "",
          statusCode: row[2] ?? "",
          mimeType: row[3] ?? "",
          archiveUrl: `https://web.archive.org/web/${row[0]}/${row[1]}`,
        }));
    
        const firstSeen = snapshots.length > 0 ? snapshots[0].timestamp : undefined;
        const lastSeen = snapshots.length > 0 ? snapshots[snapshots.length - 1].timestamp : undefined;
    
        return { url, totalSnapshots: snapshots.length, firstSeen, lastSeen, snapshots };
      } finally {
        clearTimeout(timeout);
      }
    }
  • Type definitions: WaybackSnapshot (individual snapshot with timestamp, url, statusCode, mimeType, archiveUrl) and WaybackSnapshotsResult (overall result with url, totalSnapshots, firstSeen, lastSeen, snapshots array).
    interface WaybackSnapshot {
      timestamp: string;
      url: string;
      statusCode: string;
      mimeType: string;
      archiveUrl: string;
    }
    
    interface WaybackSnapshotsResult {
      url: string;
      totalSnapshots: number;
      firstSeen?: string;
      lastSeen?: string;
      snapshots: WaybackSnapshot[];
    }
  • Tool registration as a ToolDef constant 'waybackSnapshotsTool', with name 'wayback_snapshots', description, Zod schema (url required string, limit optional number), and execute handler that calls waybackSnapshots().
    const waybackSnapshotsTool: ToolDef = {
      name: "wayback_snapshots",
      description: "Get Wayback Machine snapshot history for a specific URL. Returns timestamps, status codes, and direct archive links. Shows first/last seen dates.",
      schema: {
        url: z.string().describe("URL to get snapshot history for"),
        limit: z.number().optional().describe("Maximum snapshots to return (default: 100)"),
      },
      execute: async (args) =>
        json(await waybackSnapshots(args.url as string, args.limit as number | undefined)),
    };
  • Import statement bringing 'waybackSnapshots' from '../wayback/index.js' into the tools registration module.
    import { waybackUrls, waybackSnapshots } from "../wayback/index.js";
  • src/index.ts:32-32 (registration)
    Tool listed in the top-level tool registry under 'Wayback Machine' category with tools ['wayback_urls', 'wayback_snapshots'].
    { label: "Wayback Machine", env: null, tools: ["wayback_urls", "wayback_snapshots"] },
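The parsing step of `waybackSnapshots` can be exercised without network access. The sketch below reproduces it as a pure function over the JSON array form the implementation expects, where the first row is a header. The name `summarizeCdx` and the sample rows are made up for illustration. Note that `firstSeen` and `lastSeen` come from the first and last returned rows, so they describe only the window that `limit` allows, and they assume the rows arrive in ascending timestamp order, which the source code also relies on.

```typescript
// A sketch of the parsing step, under the assumptions stated above.
// `summarizeCdx` and the sample data are hypothetical, not part of the server.

interface Snapshot {
  timestamp: string;
  url: string;
  statusCode: string;
  mimeType: string;
  archiveUrl: string;
}

interface SnapshotSummary {
  url: string;
  totalSnapshots: number;
  firstSeen: string | undefined;
  lastSeen: string | undefined;
  snapshots: Snapshot[];
}

function summarizeCdx(url: string, data: string[][]): SnapshotSummary {
  // The first row holds the field names, so it is skipped.
  const rows = data.slice(1);

  const snapshots: Snapshot[] = rows.map((row) => ({
    timestamp: row[0] ?? "",
    url: row[1] ?? "",
    statusCode: row[2] ?? "",
    mimeType: row[3] ?? "",
    archiveUrl: `https://web.archive.org/web/${row[0] ?? ""}/${row[1] ?? ""}`,
  }));

  return {
    url,
    totalSnapshots: snapshots.length,
    // Optional chaining yields undefined when there are no rows.
    firstSeen: snapshots[0]?.timestamp,
    lastSeen: snapshots[snapshots.length - 1]?.timestamp,
    snapshots,
  };
}

// Made-up sample in the header-plus-rows shape.
const sample: string[][] = [
  ["timestamp", "original", "statuscode", "mimetype"],
  ["20200101000000", "http://example.com/", "200", "text/html"],
  ["20210601120000", "http://example.com/", "301", "text/html"],
];

const summary = summarizeCdx("example.com", sample);
console.log(summary.totalSnapshots, summary.firstSeen, summary.lastSeen);

// An empty response produces zero snapshots and undefined first/last dates.
const empty = summarizeCdx("example.com", []);
console.log(empty.totalSnapshots, empty.firstSeen, empty.lastSeen);
```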
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It explains the output (timestamps, status codes, links) but omits behavioral details such as rate limits, pagination, and error handling. A safe read operation is implied but not explicitly stated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences without unnecessary words. Front-loaded with core purpose, then additional output details. Efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a simple tool (2 parameters, no output schema), the description adequately covers inputs and outputs. It lacks differentiation from sibling tools but is sufficient for basic use. It omits details such as sort order and how URLs with no snapshots are handled.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds no parameter details beyond what is already in the schema. A baseline score of 3 is appropriate because the description does not compensate further.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the verb 'Get' and the resource 'Wayback Machine snapshot history' with the specific input (URL). Distinguishes itself from the sibling tool 'wayback_urls', which likely retrieves archived URLs rather than snapshot history.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for retrieving snapshot history but provides no explicit guidance on when to use this tool vs alternatives like 'wayback_urls' or other history tools. No 'when not to use' or prerequisite context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/badchars/osint-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server