MCP Webpage Timestamps

A Model Context Protocol (MCP) server that extracts webpage creation, modification, and publication timestamps. It is designed for content freshness evaluation, web scraping, and temporal analysis of web content.

Features

  • Comprehensive Timestamp Extraction: Extracts creation, modification, and publication timestamps from webpages
  • Multiple Data Sources: Supports HTML meta tags, HTTP headers, JSON-LD, microdata, OpenGraph, Twitter cards, and heuristic analysis
  • Confidence Scoring: Provides confidence levels (high/medium/low) for extracted timestamps
  • Batch Processing: Extract timestamps from multiple URLs simultaneously
  • Configurable: Customizable timeout, user agent, redirect handling, and heuristic options
  • Production Ready: Robust error handling, comprehensive logging, and TypeScript support

Installation

Quick Install

npm install -g mcp-webpage-timestamps

Usage with npx

npx mcp-webpage-timestamps

Prerequisites

  • Node.js 18.0.0 or higher
  • npm or yarn

Development Install

git clone https://github.com/Fabien-desablens/mcp-webpage-timestamps.git
cd mcp-webpage-timestamps
npm install
npm run build

Usage

As MCP Server

The server can be used with any MCP-compatible client. Here's how to configure it:

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "webpage-timestamps": {
      "command": "npx",
      "args": ["mcp-webpage-timestamps"],
      "env": {}
    }
  }
}

Cline Configuration

Add to your MCP settings:

{
  "mcpServers": {
    "webpage-timestamps": {
      "command": "npx",
      "args": ["mcp-webpage-timestamps"]
    }
  }
}

Direct Usage

# Start the server
npm start

# Or run in development mode
npm run dev

API Reference

Tools

extract_timestamps

Extract timestamps from a single webpage.

Parameters:

  • url (string, required): The URL of the webpage to extract timestamps from
  • config (object, optional): Configuration options

Configuration Options:

  • timeout (number): Request timeout in milliseconds (default: 10000)
  • userAgent (string): User agent string for requests
  • followRedirects (boolean): Whether to follow HTTP redirects (default: true)
  • maxRedirects (number): Maximum number of redirects to follow (default: 5)
  • enableHeuristics (boolean): Enable heuristic timestamp detection (default: true)

Example:

{
  "name": "extract_timestamps",
  "arguments": {
    "url": "https://example.com/article",
    "config": {
      "timeout": 15000,
      "enableHeuristics": true
    }
  }
}

batch_extract_timestamps

Extract timestamps from multiple webpages in batch.

Parameters:

  • urls (array of strings, required): Array of URLs to extract timestamps from
  • config (object, optional): Same configuration options as extract_timestamps

Example:

{
  "name": "batch_extract_timestamps",
  "arguments": {
    "urls": [
      "https://example.com/article1",
      "https://example.com/article2",
      "https://example.com/article3"
    ],
    "config": {
      "timeout": 10000
    }
  }
}

Response Format

Both tools return a JSON object with the following structure:

{
  url: string;
  createdAt?: Date;
  modifiedAt?: Date;
  publishedAt?: Date;
  sources: TimestampSource[];
  confidence: 'high' | 'medium' | 'low';
  errors?: string[];
}

TimestampSource:

{
  type: 'html-meta' | 'http-header' | 'json-ld' | 'microdata' | 'opengraph' | 'twitter' | 'heuristic';
  field: string;
  value: string;
  confidence: 'high' | 'medium' | 'low';
}
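Because JSON has no date type, a client that receives this structure over MCP will most likely see the `Date` fields as strings. The sketch below assumes they arrive as ISO 8601 strings (an assumption, not something this README states) and converts them back into `Date` objects. `RawTimestampResult`, `toDate`, and `reviveResult` are hypothetical names.

```typescript
type Confidence = "high" | "medium" | "low";

interface TimestampSource {
  type: "html-meta" | "http-header" | "json-ld" | "microdata" | "opengraph" | "twitter" | "heuristic";
  field: string;
  value: string;
  confidence: Confidence;
}

// Shape after JSON serialization (assumed: dates become strings).
interface RawTimestampResult {
  url: string;
  createdAt?: string;
  modifiedAt?: string;
  publishedAt?: string;
  sources: TimestampSource[];
  confidence: Confidence;
  errors?: string[];
}

// Same shape with the date fields restored to Date objects.
interface TimestampResult extends Omit<RawTimestampResult, "createdAt" | "modifiedAt" | "publishedAt"> {
  createdAt?: Date;
  modifiedAt?: Date;
  publishedAt?: Date;
}

// Parse a date string, returning undefined for missing or unparseable input.
function toDate(value?: string): Date | undefined {
  if (!value) return undefined;
  const parsed = new Date(value);
  return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}

function reviveResult(raw: RawTimestampResult): TimestampResult {
  return {
    ...raw,
    createdAt: toDate(raw.createdAt),
    modifiedAt: toDate(raw.modifiedAt),
    publishedAt: toDate(raw.publishedAt),
  };
}

const revived = reviveResult(JSON.parse(
  '{"url":"https://example.com/article","publishedAt":"2024-01-15T10:30:00.000Z","sources":[],"confidence":"high"}'
));
console.log(revived.publishedAt?.toISOString());
```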

Supported Timestamp Sources

HTML Meta Tags

  • article:published_time
  • article:modified_time
  • date
  • pubdate
  • publishdate
  • last-modified
  • dc.date.created
  • dc.date.modified
  • dcterms.created
  • dcterms.modified

HTTP Headers

  • Last-Modified
  • Date
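As an illustration of the header source, the sketch below reads `Last-Modified` and `Date` from a plain object of response headers. It is not the package's implementation: `parseHttpDate` and `timestampsFromHeaders` are hypothetical helpers, and treating `Date` as the time the response was generated (rather than when the page was written) is my reading of HTTP semantics, which may explain why such a value would deserve lower confidence.

```typescript
interface HeaderTimestamps {
  lastModified?: Date; // when the origin server believes the resource last changed
  responseDate?: Date; // when the response was generated
}

// Parse an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT".
function parseHttpDate(value: string | undefined): Date | undefined {
  if (!value) return undefined;
  const ms = Date.parse(value);
  return Number.isNaN(ms) ? undefined : new Date(ms);
}

function timestampsFromHeaders(headers: Record<string, string>): HeaderTimestamps {
  // Header names are case-insensitive, so normalise them before lookup.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }
  return {
    lastModified: parseHttpDate(lower["last-modified"]),
    responseDate: parseHttpDate(lower["date"]),
  };
}

const fromHeaders = timestampsFromHeaders({ "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT" });
console.log(fromHeaders.lastModified?.toISOString());
```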

JSON-LD Structured Data

  • datePublished
  • dateModified
  • dateCreated
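To show what reading these three properties might involve, here is a minimal sketch that scans raw HTML for `application/ld+json` script blocks and collects the first value of each property. It is not the package's code: `datesFromJsonLd` and `collect` are hypothetical, the regular expression stands in for a proper HTML parser, and descending only into arrays and `@graph` containers is a simplification.

```typescript
type JsonLdDates = { datePublished?: string; dateModified?: string; dateCreated?: string };

const DATE_KEYS = ["datePublished", "dateModified", "dateCreated"] as const;

// Walk top-level objects, arrays, and "@graph" containers; the first value seen for each key wins.
function collect(node: unknown, found: JsonLdDates): void {
  if (Array.isArray(node)) {
    node.forEach((item) => collect(item, found));
    return;
  }
  if (typeof node !== "object" || node === null) return;
  const record = node as Record<string, unknown>;
  for (const key of DATE_KEYS) {
    const value = record[key];
    if (typeof value === "string" && found[key] === undefined) found[key] = value;
  }
  collect(record["@graph"], found);
}

function datesFromJsonLd(html: string): JsonLdDates {
  const found: JsonLdDates = {};
  const scriptPattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptPattern.exec(html)) !== null) {
    let data: unknown;
    try {
      data = JSON.parse(match[1]);
    } catch (error) {
      continue; // malformed JSON-LD is skipped rather than treated as fatal
    }
    collect(data, found);
  }
  return found;
}

const page = `<html><head><script type="application/ld+json">
{"@type":"Article","datePublished":"2024-01-15T10:30:00Z","dateModified":"2024-02-01T08:00:00Z"}
</script></head></html>`;
console.log(datesFromJsonLd(page));
```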

Microdata

  • datePublished
  • dateModified

OpenGraph

  • og:article:published_time
  • og:article:modified_time
  • og:updated_time

Twitter Cards

  • twitter:data1 (when containing date information)

Heuristic Analysis

  • Time elements with datetime attributes
  • Common date patterns in text
  • Date-related CSS classes
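The first two heuristics above can be sketched with regular expressions. This is an illustration under stated assumptions, not the package's algorithm: `datetimeAttributes` and `isoDatesInText` are hypothetical helpers, only the `YYYY-MM-DD` text pattern is handled, and a real implementation would likely use an HTML parser and recognise more date formats.

```typescript
// Collect parseable values from <time datetime="..."> elements.
function datetimeAttributes(html: string): Date[] {
  const pattern = /<time\b[^>]*\bdatetime\s*=\s*["']([^"']+)["'][^>]*>/gi;
  const dates: Date[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const parsed = new Date(match[1]);
    if (!Number.isNaN(parsed.getTime())) dates.push(parsed);
  }
  return dates;
}

// Find YYYY-MM-DD dates in visible text and interpret them as UTC midnight.
function isoDatesInText(text: string): Date[] {
  const pattern = /\b(\d{4})-(\d{2})-(\d{2})\b/g;
  const dates: Date[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const year = Number(match[1]);
    const month = Number(match[2]);
    const day = Number(match[3]);
    const candidate = new Date(Date.UTC(year, month - 1, day));
    // Reject impossible dates such as 2024-02-31, which Date.UTC would silently roll over.
    if (candidate.getUTCMonth() === month - 1 && candidate.getUTCDate() === day) {
      dates.push(candidate);
    }
  }
  return dates;
}

console.log(datetimeAttributes('<time datetime="2024-01-15T10:30:00Z">Jan 15</time>'));
console.log(isoDatesInText("Last updated 2024-01-15."));
```

Matches of this kind are guesses about page content, which fits the README's separate `heuristic` source type and the `enableHeuristics` switch.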

Development

Scripts

# Development with hot reload
npm run dev

# Build the project
npm run build

# Run tests
npm test

# Run tests in watch mode
npm run test:watch

# Lint code
npm run lint

# Fix linting issues
npm run lint:fix

# Format code
npm run format

Testing

The project includes comprehensive tests:

# Run all tests
npm test

# Run tests with coverage
npm test -- --coverage

# Run specific test file
npm test -- extractor.test.ts

Code Quality

  • TypeScript: Full TypeScript support with strict type checking
  • ESLint: Code linting with recommended rules
  • Prettier: Code formatting
  • Jest: Unit and integration testing
  • 95%+ Test Coverage: Comprehensive test suite

Examples

Basic Usage

import { TimestampExtractor } from './src/extractor.js';

const extractor = new TimestampExtractor();
const result = await extractor.extractTimestamps('https://example.com/article');

console.log('Published:', result.publishedAt);
console.log('Modified:', result.modifiedAt);
console.log('Confidence:', result.confidence);
console.log('Sources:', result.sources.length);

Custom Configuration

const extractor = new TimestampExtractor({
  timeout: 15000,
  userAgent: 'MyBot/1.0',
  enableHeuristics: false,
  maxRedirects: 3
});

const result = await extractor.extractTimestamps('https://example.com');

Batch Processing

const urls = [
  'https://example.com/article1',
  'https://example.com/article2',
  'https://example.com/article3'
];

const results = await Promise.all(
  urls.map(url => extractor.extractTimestamps(url))
);
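`Promise.all` rejects as soon as any one promise rejects. The README does not say whether `extractTimestamps` throws or only fills `errors`, so if it can throw, one bad URL would discard the whole batch. A hedged alternative is to catch failures per URL and cap how many requests run at once. In the sketch below, `extractAll`, `Outcome`, and `fakeExtract` are hypothetical; in practice the callback would be `url => extractor.extractTimestamps(url)`.

```typescript
type Outcome<T> =
  | { url: string; ok: true; value: T }
  | { url: string; ok: false; error: string };

// Process URLs in chunks of `limit`, so at most `limit` requests are in flight at once.
async function extractAll<T>(
  urls: string[],
  extract: (url: string) => Promise<T>,
  limit = 3,
): Promise<Outcome<T>[]> {
  const size = Math.max(1, Math.floor(limit)); // guard against a zero or negative limit
  const outcomes: Outcome<T>[] = [];
  for (let start = 0; start < urls.length; start += size) {
    const chunk = urls.slice(start, start + size);
    const settled = await Promise.all(
      chunk.map(async (url): Promise<Outcome<T>> => {
        try {
          return { url, ok: true, value: await extract(url) };
        } catch (error) {
          // A failure is recorded for this URL only; the rest of the batch continues.
          return { url, ok: false, error: String(error) };
        }
      }),
    );
    outcomes.push(...settled);
  }
  return outcomes;
}

// Stand-in extractor for demonstration only; it fails for one URL on purpose.
async function fakeExtract(url: string): Promise<{ url: string }> {
  if (url.endsWith("/broken")) throw new Error("HTTP 404");
  return { url };
}

extractAll(["https://example.com/a", "https://example.com/broken"], fakeExtract)
  .then((outcomes) => console.log(outcomes));
```

Results come back in the same order as the input URLs, so they can be matched up by index as well as by the `url` field.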

Use Cases

  • Content Freshness Analysis: Evaluate how recent web content is
  • Web Scraping: Extract temporal metadata from scraped pages
  • SEO Analysis: Analyze publication and modification patterns
  • Research: Study temporal aspects of web content
  • Content Management: Track content lifecycle and updates

Error Handling

The extractor handles various error conditions gracefully:

  • Network Errors: Timeout, connection refused, DNS resolution failures
  • HTTP Errors: 404, 500, and other HTTP status codes
  • Parsing Errors: Invalid HTML, malformed JSON-LD, unparseable dates
  • Configuration Errors: Invalid URLs, timeout values, etc.

All errors are captured in the errors array of the response, allowing for robust error handling and debugging.
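Since failures are reported in the `errors` array, callers can inspect a result before trusting it. The sketch below is illustrative only: `assessResult`, `Assessment`, and `maxAgeDays` are hypothetical, the `ExtractionResult` type restates just the fields it needs from the response format, and preferring `modifiedAt` over `publishedAt` over `createdAt` is my choice rather than documented behaviour.

```typescript
interface ExtractionResult {
  url: string;
  createdAt?: Date;
  modifiedAt?: Date;
  publishedAt?: Date;
  confidence: "high" | "medium" | "low";
  errors?: string[];
}

type Assessment =
  | { status: "failed"; reasons: string[] } // no timestamp and at least one error
  | { status: "unknown" } // no errors, but no timestamp found either
  | { status: "fresh" | "stale"; ageDays: number; lowConfidence: boolean };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function assessResult(result: ExtractionResult, maxAgeDays: number, now: Date = new Date()): Assessment {
  const latest = result.modifiedAt ?? result.publishedAt ?? result.createdAt;
  if (!latest) {
    const errors = result.errors ?? [];
    if (errors.length > 0) return { status: "failed", reasons: errors };
    return { status: "unknown" };
  }
  const ageDays = Math.floor((now.getTime() - latest.getTime()) / MS_PER_DAY);
  return {
    status: ageDays <= maxAgeDays ? "fresh" : "stale",
    ageDays,
    lowConfidence: result.confidence === "low",
  };
}

console.log(
  assessResult(
    { url: "https://example.com/article", modifiedAt: new Date("2024-01-15T00:00:00Z"), confidence: "high" },
    30,
    new Date("2024-02-01T00:00:00Z"),
  ),
);
```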

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

  1. Fork the repository
  2. Clone your fork: git clone https://github.com/<your-username>/mcp-webpage-timestamps.git
  3. Install dependencies: npm install
  4. Create a branch: git checkout -b feature/your-feature
  5. Make your changes
  6. Run tests: npm test
  7. Commit your changes: git commit -m 'Add some feature'
  8. Push to the branch: git push origin feature/your-feature
  9. Submit a pull request

Code Style

  • Follow the existing code style
  • Use TypeScript for all new code
  • Add tests for new functionality
  • Update documentation as needed

License

MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a detailed history of changes.


