batch_extract_timestamps

Extract creation, modification, and publication timestamps from multiple webpages simultaneously using HTML meta tags, HTTP headers, and structured data.

Instructions

Extract timestamps from multiple webpages in batch

Input Schema

Name	Required	Description	Default
`urls`	Yes	Array of URLs to extract timestamps from
`config`	No	Optional configuration for the extraction
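As an illustration of the arguments this tool accepts, the sketch below types the input and repeats the handler's non-empty check on `urls`. The names `ExtractionConfig`, `BatchExtractArgs`, `hasUrls`, and `exampleArgs` are hypothetical and do not come from the project; the field names and defaults are taken from the input schema listed under Implementation Reference.

```typescript
// Hypothetical type names; the fields mirror the tool's published input schema.
interface ExtractionConfig {
  timeout?: number;           // request timeout in milliseconds (schema default: 10000)
  userAgent?: string;         // user agent string to use for requests
  followRedirects?: boolean;  // schema default: true
  maxRedirects?: number;      // schema default: 5
  enableHeuristics?: boolean; // schema default: true
}

interface BatchExtractArgs {
  urls: string[];
  config?: ExtractionConfig;
}

// Mirrors the handler's precondition: `urls` must be a non-empty array.
function hasUrls(args: { urls?: unknown }): boolean {
  return Array.isArray(args.urls) && args.urls.length > 0;
}

// Example arguments: two placeholder URLs and a partial config override.
const exampleArgs: BatchExtractArgs = {
  urls: ['https://example.com/a', 'https://example.com/b'],
  config: { timeout: 5000, enableHeuristics: false },
};

console.log(hasUrls(exampleArgs));
```

Only `urls` is required; any `config` field left out falls back to the default stated in the schema description.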

Implementation Reference

src/index.ts:159-209 (handler)
Handler for the 'batch_extract_timestamps' tool. Parses arguments, validates URLs array, instantiates TimestampExtractor with optional config, concurrently extracts timestamps from each URL using Promise.allSettled, processes results including errors, and returns JSON-formatted array of results.
if (name === 'batch_extract_timestamps') {
  const { urls, config } = args as {
    urls: string[];
    config?: {
      timeout?: number;
      userAgent?: string;
      followRedirects?: boolean;
      maxRedirects?: number;
      enableHeuristics?: boolean;
    };
  };
  if (!Array.isArray(urls) || urls.length === 0) {
    return {
      content: [
        {
          type: 'text',
          text: 'Error: URLs array is required and must not be empty',
        },
      ],
      isError: true,
    };
  }
  const timestampExtractor = config ? new TimestampExtractor(config) : extractor;
  const results = await Promise.allSettled(
    urls.map(url => timestampExtractor.extractTimestamps(url))
  );
  const processedResults = results.map((result, index) => {
    if (result.status === 'fulfilled') {
      return result.value;
    } else {
      return {
        url: urls[index],
        sources: [],
        confidence: 'low' as const,
        errors: [`Failed to extract timestamps: ${result.reason}`],
      };
    }
  });
  return {
    content: [
      {
        type: 'text',
        text: JSON.stringify(processedResults, null, 2),
      },
    ],
  };
}
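The handler's use of `Promise.allSettled` means one failing URL does not abort the batch. The sketch below isolates that pattern so it can be run without network access. `BatchEntry`, `fakeExtract`, and `extractAll` are hypothetical names, and `fakeExtract` is only a stand-in for `TimestampExtractor.extractTimestamps`; the fallback object copies the shape used in the handler.

```typescript
// Result shape reduced to the fields visible in the handler's fallback object.
interface BatchEntry {
  url: string;
  sources: unknown[];
  confidence: string;
  errors?: string[];
}

// Hypothetical stand-in for extractTimestamps: rejects for any URL containing
// "bad" so the failure path can be observed offline.
async function fakeExtract(url: string): Promise<BatchEntry> {
  if (url.includes('bad')) {
    throw new Error('simulated network failure');
  }
  return { url, sources: [], confidence: 'low' };
}

async function extractAll(
  urls: string[],
  extract: (url: string) => Promise<BatchEntry>,
): Promise<BatchEntry[]> {
  // allSettled never rejects, so a single failure cannot abort the batch.
  const settled = await Promise.allSettled(urls.map(url => extract(url)));
  // Results keep input order, so index i always corresponds to urls[i].
  return settled.map((result, index) =>
    result.status === 'fulfilled'
      ? result.value
      : {
          url: urls[index],
          sources: [],
          confidence: 'low',
          errors: [`Failed to extract timestamps: ${result.reason}`],
        },
  );
}

extractAll(['https://example.com/ok', 'https://example.com/bad'], fakeExtract)
  .then(entries => console.log(JSON.stringify(entries, null, 2)));
```

With `Promise.all` instead, the first rejection would discard every result in the batch, which is presumably why the handler uses `allSettled`.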
src/index.ts:67-109 (registration)
Registration of the 'batch_extract_timestamps' tool in the tools array returned by ListToolsRequestHandler, including detailed input schema for batch processing.
{
  name: 'batch_extract_timestamps',
  description: 'Extract timestamps from multiple webpages in batch',
  inputSchema: {
    type: 'object',
    properties: {
      urls: {
        type: 'array',
        items: {
          type: 'string',
        },
        description: 'Array of URLs to extract timestamps from',
      },
      config: {
        type: 'object',
        description: 'Optional configuration for the extraction',
        properties: {
          timeout: {
            type: 'number',
            description: 'Request timeout in milliseconds (default: 10000)',
          },
          userAgent: {
            type: 'string',
            description: 'User agent string to use for requests',
          },
          followRedirects: {
            type: 'boolean',
            description: 'Whether to follow HTTP redirects (default: true)',
          },
          maxRedirects: {
            type: 'number',
            description: 'Maximum number of redirects to follow (default: 5)',
          },
          enableHeuristics: {
            type: 'boolean',
            description: 'Whether to enable heuristic timestamp detection (default: true)',
          },
        },
      },
    },
    required: ['urls'],
  },
},
src/index.ts:70-108 (schema)
Input schema definition for the 'batch_extract_timestamps' tool, specifying required 'urls' array and optional 'config' object.
inputSchema: {
  type: 'object',
  properties: {
    urls: {
      type: 'array',
      items: {
        type: 'string',
      },
      description: 'Array of URLs to extract timestamps from',
    },
    config: {
      type: 'object',
      description: 'Optional configuration for the extraction',
      properties: {
        timeout: {
          type: 'number',
          description: 'Request timeout in milliseconds (default: 10000)',
        },
        userAgent: {
          type: 'string',
          description: 'User agent string to use for requests',
        },
        followRedirects: {
          type: 'boolean',
          description: 'Whether to follow HTTP redirects (default: true)',
        },
        maxRedirects: {
          type: 'number',
          description: 'Maximum number of redirects to follow (default: 5)',
        },
        enableHeuristics: {
          type: 'boolean',
          description: 'Whether to enable heuristic timestamp detection (default: true)',
        },
      },
    },
  },
  required: ['urls'],
},
src/extractor.ts:19-56 (helper)
Helper method in TimestampExtractor class that implements the core logic for extracting timestamps from a single webpage, used by both 'extract_timestamps' and 'batch_extract_timestamps' tools.
async extractTimestamps(url: string): Promise<TimestampResult> {
  const errors: string[] = [];
  let fetchResult: FetchResult;
  try {
    fetchResult = await this.fetchPage(url);
  } catch (error) {
    return {
      url,
      sources: [],
      confidence: 'low',
      errors: [`Failed to fetch page: ${error instanceof Error ? error.message : String(error)}`],
    };
  }
  const $ = cheerio.load(fetchResult.html);
  const sources: TimestampSource[] = [];
  // Extract timestamps from various sources
  sources.push(...this.extractFromHtmlMeta($));
  sources.push(...this.extractFromHttpHeaders(fetchResult.headers));
  sources.push(...this.extractFromJsonLd($));
  sources.push(...this.extractFromMicrodata($));
  sources.push(...this.extractFromOpenGraph($));
  sources.push(...this.extractFromTwitterCards($));
  if (this.config.enableHeuristics) {
    sources.push(...this.extractFromHeuristics($));
  }
  const result = this.consolidateTimestamps(url, sources);
  if (errors.length > 0) {
    result.errors = errors;
  }
  return result;
}
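Because both the fetch-failure path and the batch handler's fallback report problems through an `errors` array, a caller can separate usable results from failed ones after parsing the tool's JSON text output. The following is a minimal sketch of that idea. `ResultLike`, `partitionByErrors`, and the sample data are hypothetical, and only the `url` and `errors` fields seen in the code above are assumed.

```typescript
// Only the fields this sketch needs; the result objects shown above also
// carry `sources` and `confidence`.
interface ResultLike {
  url: string;
  errors?: string[];
}

// Splits results into those with no reported errors and those with at least one.
function partitionByErrors<T extends ResultLike>(results: T[]): { ok: T[]; failed: T[] } {
  const ok: T[] = [];
  const failed: T[] = [];
  for (const result of results) {
    if (result.errors && result.errors.length > 0) {
      failed.push(result);
    } else {
      ok.push(result);
    }
  }
  return { ok, failed };
}

// Made-up sample standing in for the JSON text the tool returns.
const sampleText = JSON.stringify([
  { url: 'https://example.com/a', sources: [], confidence: 'low' },
  {
    url: 'https://example.com/b',
    sources: [],
    confidence: 'low',
    errors: ['Failed to fetch page: simulated timeout'],
  },
]);

const partitioned = partitionByErrors(JSON.parse(sampleText) as ResultLike[]);
console.log(partitioned.failed.map(r => r.url));
```

Checking `errors` rather than `confidence` is a deliberate choice in this sketch, since a page that was fetched successfully may still yield a low-confidence result.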
