MCP Webpage Timestamps
A Model Context Protocol (MCP) server that extracts webpage creation, modification, and publication timestamps. It is designed for content freshness evaluation, web scraping, and temporal analysis of web content.
Features
- Comprehensive Timestamp Extraction: Extracts creation, modification, and publication timestamps from webpages
- Multiple Data Sources: Supports HTML meta tags, HTTP headers, JSON-LD, microdata, OpenGraph, Twitter cards, and heuristic analysis
- Confidence Scoring: Provides confidence levels (high/medium/low) for extracted timestamps
- Batch Processing: Extract timestamps from multiple URLs simultaneously
- Configurable: Customizable timeout, user agent, redirect handling, and heuristic options
- Production Ready: Robust error handling, comprehensive logging, and TypeScript support
Installation
Quick Install
Usage with npx
Prerequisites
- Node.js 18.0.0 or higher
- npm or yarn
Development Install
Usage
As MCP Server
The server can be used with any MCP-compatible client. Here's how to configure it:
Claude Desktop Configuration
Add to your claude_desktop_config.json:
Cline Configuration
Add to your MCP settings:
Direct Usage
API Reference
Tools
extract_timestamps
Extract timestamps from a single webpage.
Parameters:
- url (string, required): The URL of the webpage to extract timestamps from
- config (object, optional): Configuration options
Configuration Options:
- timeout (number): Request timeout in milliseconds (default: 10000)
- userAgent (string): User agent string for requests
- followRedirects (boolean): Whether to follow HTTP redirects (default: true)
- maxRedirects (number): Maximum number of redirects to follow (default: 5)
- enableHeuristics (boolean): Enable heuristic timestamp detection (default: true)
Example:
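A minimal sketch of the arguments an MCP client might pass to extract_timestamps. The parameter names and defaults come from the list above; the interface names, the URL, and the user agent string are placeholders chosen for illustration, and treating every config field as optional is an assumption.

```typescript
// Argument shape for extract_timestamps, built from the documented parameters.
interface ExtractConfig {
  timeout?: number;           // request timeout in ms (documented default: 10000)
  userAgent?: string;         // User-Agent string for requests
  followRedirects?: boolean;  // documented default: true
  maxRedirects?: number;      // documented default: 5
  enableHeuristics?: boolean; // documented default: true
}

interface ExtractTimestampsArgs {
  url: string;
  config?: ExtractConfig;
}

// Hypothetical example values; the URL and user agent are placeholders.
const extractArgs: ExtractTimestampsArgs = {
  url: "https://example.com/article",
  config: {
    timeout: 15000,
    userAgent: "my-freshness-checker/1.0",
    followRedirects: true,
    maxRedirects: 3,
    enableHeuristics: false,
  },
};

// JSON form of the arguments, as a client would serialize them.
const extractPayload = JSON.stringify(extractArgs, null, 2);
```

Omitting config entirely should fall back to the documented defaults.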
batch_extract_timestamps
Extract timestamps from multiple webpages in batch.
Parameters:
- urls (array of strings, required): Array of URLs to extract timestamps from
- config (object, optional): Same configuration options as extract_timestamps
Example:
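A minimal sketch of arguments for batch_extract_timestamps, assuming the same optional config fields as the single-URL tool. The helper buildBatchArgs and the example URLs are hypothetical; removing duplicates before the call is a convenience on the client side, not something the server is known to require.

```typescript
// Argument shape for batch_extract_timestamps, following the documented parameters.
interface BatchExtractTimestampsArgs {
  urls: string[];
  config?: {
    timeout?: number;
    userAgent?: string;
    followRedirects?: boolean;
    maxRedirects?: number;
    enableHeuristics?: boolean;
  };
}

// Hypothetical helper: drop duplicate URLs so each page is requested once.
function buildBatchArgs(urls: string[], timeout: number): BatchExtractTimestampsArgs {
  const unique = urls.filter((url, index) => urls.indexOf(url) === index);
  return { urls: unique, config: { timeout } };
}

const batchArgs = buildBatchArgs(
  ["https://example.com/a", "https://example.com/b", "https://example.com/a"],
  20000
);
```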
Response Format
Both tools return a JSON object with the following structure:
TimestampSource:
Supported Timestamp Sources
HTML Meta Tags
article:published_time
article:modified_time
date
pubdate
publishdate
last-modified
dc.date.created
dc.date.modified
dcterms.created
dcterms.modified
HTTP Headers
Last-Modified
Date
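As an illustration of how these two headers can be turned into timestamps, here is a minimal sketch. It is not the server's actual implementation: the function names are hypothetical, header keys are assumed to be lowercased, and preferring Last-Modified over Date is an assumption based on the fact that Date only records when the response was generated.

```typescript
// Parse an HTTP date header value into ISO 8601, or return null if it cannot be parsed.
function parseHttpDate(value: string | undefined): string | null {
  if (!value) return null;
  const ms = Date.parse(value);
  return isNaN(ms) ? null : new Date(ms).toISOString();
}

// Pick the best available header timestamp (assumes lowercased header names).
function timestampFromHeaders(
  headers: Record<string, string | undefined>
): { source: string; iso: string } | null {
  const lastModified = parseHttpDate(headers["last-modified"]);
  if (lastModified) return { source: "Last-Modified", iso: lastModified };
  const date = parseHttpDate(headers["date"]);
  return date ? { source: "Date", iso: date } : null;
}
```

For example, a Last-Modified value of "Wed, 21 Oct 2015 07:28:00 GMT" parses to 2015-10-21T07:28:00.000Z.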
JSON-LD Structured Data
datePublished
dateModified
dateCreated
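The sketch below shows one way these three properties could be read from JSON-LD script blocks. It is a simplified illustration using a regular expression rather than an HTML parser, so it should not be taken as the project's own code. The function name is hypothetical, and only top-level objects and arrays are inspected (nested structures such as @graph are ignored).

```typescript
const JSON_LD_DATE_KEYS = ["datePublished", "dateModified", "dateCreated"];

// Collect the first value seen for each date property across all JSON-LD blocks.
function extractJsonLdDates(html: string): Record<string, string> {
  const found: Record<string, string> = {};
  const scriptPattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptPattern.exec(html)) !== null) {
    const raw = match[1] ?? "";
    let data: unknown;
    try {
      data = JSON.parse(raw);
    } catch {
      continue; // malformed JSON-LD: skip this block and keep scanning
    }
    const nodes = Array.isArray(data) ? data : [data];
    for (const node of nodes) {
      if (typeof node !== "object" || node === null) continue;
      for (const key of JSON_LD_DATE_KEYS) {
        const value = (node as Record<string, unknown>)[key];
        if (typeof value === "string" && !(key in found)) found[key] = value;
      }
    }
  }
  return found;
}
```

Skipping a malformed block instead of throwing lets one broken script tag coexist with valid ones on the same page.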
Microdata
datePublished
dateModified
OpenGraph
og:article:published_time
og:article:modified_time
og:updated_time
Twitter Cards
twitter:data1 (when containing date information)
Heuristic Analysis
- Time elements with datetime attributes
- Common date patterns in text
- Date-related CSS classes
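To make the first heuristic concrete, here is a minimal sketch that collects datetime attribute values from time elements and normalizes them to ISO 8601. The function name is hypothetical, and the regular expression stands in for real HTML parsing, so treat it as an illustration of the idea rather than the server's implementation.

```typescript
// Find <time datetime="..."> values and return those that parse as dates.
function extractTimeElementDates(html: string): string[] {
  const pattern = /<time\b[^>]*\sdatetime=["']([^"']+)["'][^>]*>/gi;
  const results: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const ms = Date.parse(match[1] ?? "");
    // Unparseable values are dropped rather than reported as timestamps.
    if (!isNaN(ms)) results.push(new Date(ms).toISOString());
  }
  return results;
}
```

Timestamps found this way are weaker evidence than explicit metadata, since a time element may refer to a comment, an event, or any other date on the page.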
Development
Scripts
Testing
The project includes comprehensive tests:
Code Quality
- TypeScript: Full TypeScript support with strict type checking
- ESLint: Code linting with recommended rules
- Prettier: Code formatting
- Jest: Unit and integration testing
- 95%+ Test Coverage: Comprehensive test suite
Examples
Basic Usage
Custom Configuration
Batch Processing
Use Cases
- Content Freshness Analysis: Evaluate how recent web content is
- Web Scraping: Extract temporal metadata from scraped pages
- SEO Analysis: Analyze publication and modification patterns
- Research: Study temporal aspects of web content
- Content Management: Track content lifecycle and updates
Error Handling
The extractor handles various error conditions gracefully:
- Network Errors: Timeout, connection refused, DNS resolution failures
- HTTP Errors: 404, 500, and other HTTP status codes
- Parsing Errors: Invalid HTML, malformed JSON-LD, unparseable dates
- Configuration Errors: Invalid URLs, timeout values, etc.
All errors are captured in the errors array of the response, allowing for robust error handling and debugging.
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
- Fork the repository
- Clone your fork: git clone https://github.com/Fabien-desablens/mcp-webpage-timestamps.git
- Install dependencies: npm install
- Create a branch: git checkout -b feature/your-feature
- Make your changes
- Run tests: npm test
- Commit your changes: git commit -m 'Add some feature'
- Push to the branch: git push origin feature/your-feature
- Submit a pull request
Code Style
- Follow the existing code style
- Use TypeScript for all new code
- Add tests for new functionality
- Update documentation as needed
License
MIT License - see the LICENSE file for details.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Wiki
Changelog
See CHANGELOG.md for a detailed history of changes.
Acknowledgments
- Model Context Protocol for the excellent MCP framework
- Cheerio for HTML parsing
- Axios for HTTP requests
- date-fns for date parsing and manipulation