Skip to main content
Glama

extract_timestamps

Extract webpage creation, modification, and publication timestamps from HTML meta tags, HTTP headers, and structured data using an input URL. Configurable options include timeout, user agent, redirect handling, and heuristic detection.

Instructions

Extract creation, modification, and publication timestamps from a webpage

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
configNoOptional configuration for the extraction
urlYesThe URL of the webpage to extract timestamps from

Implementation Reference

  • Core handler function that implements the timestamp extraction logic: fetches webpage, parses HTML, extracts from multiple sources (meta, headers, structured data, heuristics), consolidates timestamps.
    async extractTimestamps(url: string): Promise<TimestampResult> { const errors: string[] = []; let fetchResult: FetchResult; try { fetchResult = await this.fetchPage(url); } catch (error) { return { url, sources: [], confidence: 'low', errors: [`Failed to fetch page: ${error instanceof Error ? error.message : String(error)}`], }; } const $ = cheerio.load(fetchResult.html); const sources: TimestampSource[] = []; // Extract timestamps from various sources sources.push(...this.extractFromHtmlMeta($)); sources.push(...this.extractFromHttpHeaders(fetchResult.headers)); sources.push(...this.extractFromJsonLd($)); sources.push(...this.extractFromMicrodata($)); sources.push(...this.extractFromOpenGraph($)); sources.push(...this.extractFromTwitterCards($)); if (this.config.enableHeuristics) { sources.push(...this.extractFromHeuristics($)); } const result = this.consolidateTimestamps(url, sources); if (errors.length > 0) { result.errors = errors; } return result; }
  • MCP tool dispatch handler branch for 'extract_timestamps': validates input, instantiates extractor if config provided, calls extractTimestamps, returns JSON result.
    if (name === 'extract_timestamps') { const { url, config } = args as { url: string; config?: { timeout?: number; userAgent?: string; followRedirects?: boolean; maxRedirects?: number; enableHeuristics?: boolean; }; }; if (!url) { return { content: [ { type: 'text', text: 'Error: URL is required', }, ], isError: true, }; } const timestampExtractor = config ? new TimestampExtractor(config) : extractor; const result = await timestampExtractor.extractTimestamps(url); return { content: [ { type: 'text', text: JSON.stringify(result, null, 2), }, ], }; }
  • src/index.ts:27-66 (registration)
    Tool registration in the MCP tools list, including name, description, and input schema.
    { name: 'extract_timestamps', description: 'Extract creation, modification, and publication timestamps from a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL of the webpage to extract timestamps from', }, config: { type: 'object', description: 'Optional configuration for the extraction', properties: { timeout: { type: 'number', description: 'Request timeout in milliseconds (default: 10000)', }, userAgent: { type: 'string', description: 'User agent string to use for requests', }, followRedirects: { type: 'boolean', description: 'Whether to follow HTTP redirects (default: true)', }, maxRedirects: { type: 'number', description: 'Maximum number of redirects to follow (default: 5)', }, enableHeuristics: { type: 'boolean', description: 'Whether to enable heuristic timestamp detection (default: true)', }, }, }, }, required: ['url'], }, },
  • Type definition for the output TimestampResult returned by the tool.
    export interface TimestampResult { url: string; createdAt?: Date; modifiedAt?: Date; publishedAt?: Date; sources: TimestampSource[]; confidence: 'high' | 'medium' | 'low'; errors?: string[]; }
  • Input schema for the extract_timestamps tool, defining required 'url' and optional 'config' parameters.
    inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL of the webpage to extract timestamps from', }, config: { type: 'object', description: 'Optional configuration for the extraction', properties: { timeout: { type: 'number', description: 'Request timeout in milliseconds (default: 10000)', }, userAgent: { type: 'string', description: 'User agent string to use for requests', }, followRedirects: { type: 'boolean', description: 'Whether to follow HTTP redirects (default: true)', }, maxRedirects: { type: 'number', description: 'Maximum number of redirects to follow (default: 5)', }, enableHeuristics: { type: 'boolean', description: 'Whether to enable heuristic timestamp detection (default: true)', }, }, }, }, required: ['url'], },

Other Tools

Related Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Fabien-desablens/mcp-webpage-timestamps'

If you have feedback or need assistance with the MCP directory API, please join our Discord server