Skip to main content
Glama

extract_timestamps

Extract creation, modification, and publication timestamps from webpage URLs using HTML meta tags, HTTP headers, and structured data.

Instructions

Extract creation, modification, and publication timestamps from a webpage

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesThe URL of the webpage to extract timestamps from
configNoOptional configuration for the extraction

Implementation Reference

  • Core handler function implementing the timestamp extraction logic: fetches webpage, parses with Cheerio, extracts from meta tags, headers, JSON-LD, microdata, OpenGraph, Twitter cards, heuristics, and consolidates results.
    async extractTimestamps(url: string): Promise<TimestampResult> { const errors: string[] = []; let fetchResult: FetchResult; try { fetchResult = await this.fetchPage(url); } catch (error) { return { url, sources: [], confidence: 'low', errors: [`Failed to fetch page: ${error instanceof Error ? error.message : String(error)}`], }; } const $ = cheerio.load(fetchResult.html); const sources: TimestampSource[] = []; // Extract timestamps from various sources sources.push(...this.extractFromHtmlMeta($)); sources.push(...this.extractFromHttpHeaders(fetchResult.headers)); sources.push(...this.extractFromJsonLd($)); sources.push(...this.extractFromMicrodata($)); sources.push(...this.extractFromOpenGraph($)); sources.push(...this.extractFromTwitterCards($)); if (this.config.enableHeuristics) { sources.push(...this.extractFromHeuristics($)); } const result = this.consolidateTimestamps(url, sources); if (errors.length > 0) { result.errors = errors; } return result; }
  • src/index.ts:28-66 (registration)
    MCP tool registration defining name, description, and input schema for 'extract_timestamps'.
    name: 'extract_timestamps', description: 'Extract creation, modification, and publication timestamps from a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL of the webpage to extract timestamps from', }, config: { type: 'object', description: 'Optional configuration for the extraction', properties: { timeout: { type: 'number', description: 'Request timeout in milliseconds (default: 10000)', }, userAgent: { type: 'string', description: 'User agent string to use for requests', }, followRedirects: { type: 'boolean', description: 'Whether to follow HTTP redirects (default: true)', }, maxRedirects: { type: 'number', description: 'Maximum number of redirects to follow (default: 5)', }, enableHeuristics: { type: 'boolean', description: 'Whether to enable heuristic timestamp detection (default: true)', }, }, }, }, required: ['url'], }, },
  • MCP CallToolRequestSchema handler branch for 'extract_timestamps': validates input, instantiates extractor if config provided, calls extractTimestamps, and returns JSON result.
    if (name === 'extract_timestamps') { const { url, config } = args as { url: string; config?: { timeout?: number; userAgent?: string; followRedirects?: boolean; maxRedirects?: number; enableHeuristics?: boolean; }; }; if (!url) { return { content: [ { type: 'text', text: 'Error: URL is required', }, ], isError: true, }; } const timestampExtractor = config ? new TimestampExtractor(config) : extractor; const result = await timestampExtractor.extractTimestamps(url); return { content: [ { type: 'text', text: JSON.stringify(result, null, 2), }, ], }; }
  • TypeScript interfaces defining the output structure (TimestampResult) and sources (TimestampSource) used by the tool.
    export interface TimestampResult { url: string; createdAt?: Date; modifiedAt?: Date; publishedAt?: Date; sources: TimestampSource[]; confidence: 'high' | 'medium' | 'low'; errors?: string[]; } export interface TimestampSource { type: 'html-meta' | 'http-header' | 'json-ld' | 'microdata' | 'opengraph' | 'twitter' | 'heuristic'; field: string; value: string; confidence: 'high' | 'medium' | 'low'; }
  • Type definition for optional config input parameter.
    export interface ExtractorConfig { timeout?: number; userAgent?: string; followRedirects?: boolean; maxRedirects?: number; enableHeuristics?: boolean; }

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Fabien-desablens/mcp-webpage-timestamps'

If you have feedback or need assistance with the MCP directory API, please join our Discord server