Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Webpage Timestamps extract timestamps from https://news.example.com/article"
That's it! The server will respond to your query, and you can continue using it as needed.
MCP Webpage Timestamps
A powerful Model Context Protocol (MCP) server for extracting webpage creation, modification, and publication timestamps. This tool is designed for web scraping and temporal analysis of web content.
Features
Comprehensive Timestamp Extraction: Extracts creation, modification, and publication timestamps from webpages
Multiple Data Sources: Supports HTML meta tags, HTTP headers, JSON-LD, microdata, OpenGraph, Twitter cards, and heuristic analysis
Confidence Scoring: Provides confidence levels (high/medium/low) for extracted timestamps
Batch Processing: Extract timestamps from multiple URLs simultaneously
Configurable: Customizable timeout, user agent, redirect handling, and heuristic options
Production Ready: Robust error handling, comprehensive logging, and TypeScript support
Installation
Quick Install
npm install -g mcp-webpage-timestamps
Usage with npx
npx mcp-webpage-timestamps
Installing via Smithery
To install mcp-webpage-timestamps for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @Fabien-desablens/mcp-webpage-timestamps --client claude
Prerequisites
Node.js 18.0.0 or higher
npm or yarn
Development Install
git clone https://github.com/Fabien-desablens/mcp-webpage-timestamps.git
cd mcp-webpage-timestamps
npm install
npm run build
Usage
As MCP Server
The server can be used with any MCP-compatible client. Here's how to configure it:
Claude Desktop Configuration
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "webpage-timestamps": {
      "command": "npx",
      "args": ["mcp-webpage-timestamps"],
      "env": {}
    }
  }
}
Cline Configuration
Add to your MCP settings:
{
  "mcpServers": {
    "webpage-timestamps": {
      "command": "npx",
      "args": ["mcp-webpage-timestamps"]
    }
  }
}
Direct Usage
# Start the server
npm start
# Or run in development mode
npm run dev
API Reference
Tools
extract_timestamps
Extract timestamps from a single webpage.
Parameters:
url (string, required): The URL of the webpage to extract timestamps from
config (object, optional): Configuration options
Configuration Options:
timeout (number): Request timeout in milliseconds (default: 10000)
userAgent (string): User agent string for requests
followRedirects (boolean): Whether to follow HTTP redirects (default: true)
maxRedirects (number): Maximum number of redirects to follow (default: 5)
enableHeuristics (boolean): Enable heuristic timestamp detection (default: true)
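The documented defaults can be captured in a small typed helper. This is a minimal sketch, not the package's own code: the names ExtractionConfig, DEFAULT_CONFIG, and resolveConfig are hypothetical, and the validation rules are assumptions. Since no default is documented for userAgent, it is left optional.

```typescript
// Hypothetical type mirroring the documented options; the package's real type may differ.
interface ExtractionConfig {
  timeout: number; // request timeout in milliseconds
  userAgent?: string; // no documented default, so optional here
  followRedirects: boolean;
  maxRedirects: number;
  enableHeuristics: boolean;
}

// Defaults as documented in the options list.
const DEFAULT_CONFIG: ExtractionConfig = {
  timeout: 10000,
  followRedirects: true,
  maxRedirects: 5,
  enableHeuristics: true,
};

// Merge caller overrides over the defaults and reject obviously invalid values.
// The validation rules are an assumption, not documented behavior.
function resolveConfig(overrides: Partial<ExtractionConfig> = {}): ExtractionConfig {
  const config = { ...DEFAULT_CONFIG, ...overrides };
  if (!Number.isFinite(config.timeout) || config.timeout <= 0) {
    throw new RangeError("timeout must be a positive number of milliseconds");
  }
  if (!Number.isInteger(config.maxRedirects) || config.maxRedirects < 0) {
    throw new RangeError("maxRedirects must be a non-negative integer");
  }
  return config;
}

const resolved = resolveConfig({ timeout: 15000, enableHeuristics: false });
console.log(resolved);
```

Options that the caller omits keep their documented defaults, so a config object only needs to name the values it changes.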
Example:
{
  "name": "extract_timestamps",
  "arguments": {
    "url": "https://example.com/article",
    "config": {
      "timeout": 15000,
      "enableHeuristics": true
    }
  }
}
batch_extract_timestamps
Extract timestamps from multiple webpages in batch.
Parameters:
urls (array of strings, required): Array of URLs to extract timestamps from
config (object, optional): Same configuration options as extract_timestamps
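On the wire, an MCP client sends a tool name and its arguments inside a JSON-RPC 2.0 tools/call request. The sketch below builds such requests for both tools. buildToolCall and the id counter are hypothetical helpers, not part of this package; in practice an MCP client library normally does this framing for you.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1; // hypothetical request-id counter

// Wrap a tool name and its arguments in a tools/call envelope.
function buildToolCall(toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const single = buildToolCall("extract_timestamps", {
  url: "https://example.com/article",
});

const batch = buildToolCall("batch_extract_timestamps", {
  urls: ["https://example.com/article1", "https://example.com/article2"],
  config: { timeout: 10000 },
});

console.log(JSON.stringify(batch, null, 2));
```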
Example:
{
  "name": "batch_extract_timestamps",
  "arguments": {
    "urls": [
      "https://example.com/article1",
      "https://example.com/article2",
      "https://example.com/article3"
    ],
    "config": {
      "timeout": 10000
    }
  }
}
Response Format
Both tools return a JSON object with the following structure:
{
  url: string;
  createdAt?: Date;
  modifiedAt?: Date;
  publishedAt?: Date;
  sources: TimestampSource[];
  confidence: 'high' | 'medium' | 'low';
  errors?: string[];
}
TimestampSource:
{
  type: 'html-meta' | 'http-header' | 'json-ld' | 'microdata' | 'opengraph' | 'twitter' | 'heuristic';
  field: string;
  value: string;
  confidence: 'high' | 'medium' | 'low';
}
Supported Timestamp Sources
HTML Meta Tags
article:published_time
article:modified_time
date
pubdate
publishdate
last-modified
dc.date.created
dc.date.modified
dcterms.created
dcterms.modified
HTTP Headers
Last-Modified
Date
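As an illustration of the HTTP header source, the sketch below reads a date-valued header from a plain header map. parseHttpDateHeader is a hypothetical helper and the header values are made up; the package's actual implementation is not shown in this document. Note that the Date header records when the response was generated, so it is at best a weak hint about the content itself.

```typescript
// Hypothetical helper: look up a header case-insensitively and parse it as a date.
function parseHttpDateHeader(
  headerMap: Record<string, string>,
  headerName: string
): Date | undefined {
  const wanted = headerName.toLowerCase();
  const entry = Object.entries(headerMap).find(([key]) => key.toLowerCase() === wanted);
  if (!entry) return undefined;
  const ms = Date.parse(entry[1]); // HTTP dates use the same format as Date.prototype.toUTCString()
  return Number.isNaN(ms) ? undefined : new Date(ms);
}

// Illustrative header values, not taken from a real response.
const responseHeaders = {
  "Last-Modified": "Wed, 21 Oct 2015 07:28:00 GMT",
  "Date": "Thu, 22 Oct 2015 10:00:00 GMT",
};

const lastModified = parseHttpDateHeader(responseHeaders, "last-modified");
console.log(lastModified?.toISOString()); // 2015-10-21T07:28:00.000Z
```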
JSON-LD Structured Data
datePublished
dateModified
dateCreated
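To show roughly what reading these JSON-LD properties involves, here is a simplified sketch that pulls them out of script blocks with a regular expression. It is an assumption-laden illustration, not the package's code: extractJsonLdDates is a hypothetical name, the sample HTML is made up, nested structures such as @graph are ignored, and a real implementation would more likely use an HTML parser such as Cheerio.

```typescript
interface JsonLdDates {
  datePublished?: string;
  dateModified?: string;
  dateCreated?: string;
}

const DATE_KEYS = ["datePublished", "dateModified", "dateCreated"] as const;

// Collect the first value seen for each date property across all JSON-LD blocks.
function extractJsonLdDates(html: string): JsonLdDates {
  const found: JsonLdDates = {};
  const scriptRe =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptRe.exec(html)) !== null) {
    let data: unknown;
    try {
      data = JSON.parse(match[1] ?? "");
    } catch {
      continue; // skip malformed JSON-LD instead of failing the whole page
    }
    const nodes: unknown[] = Array.isArray(data) ? data : [data];
    for (const node of nodes) {
      if (typeof node !== "object" || node === null) continue;
      const record = node as Record<string, unknown>;
      for (const key of DATE_KEYS) {
        const value = record[key];
        if (typeof value === "string" && found[key] === undefined) {
          found[key] = value;
        }
      }
    }
  }
  return found;
}

// Made-up page: one valid JSON-LD block and one malformed block.
const sampleHtml = `
<script type="application/ld+json">
{"@type":"Article","datePublished":"2024-01-15T08:00:00Z","dateModified":"2024-02-01T12:30:00Z"}
</script>
<script type="application/ld+json">{broken</script>`;

const jsonLdDates = extractJsonLdDates(sampleHtml);
console.log(jsonLdDates);
```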
Microdata
datePublished
dateModified
OpenGraph
og:article:published_time
og:article:modified_time
og:updated_time
Twitter Cards
twitter:data1 (when containing date information)
Heuristic Analysis
Time elements with datetime attributes
Common date patterns in text
Date-related CSS classes
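The first two heuristics can be sketched with plain regular expressions. This is a simplified illustration under stated assumptions: findTimeElementDates and findIsoDatesInText are hypothetical helpers, only the YYYY-MM-DD pattern is recognized, and the package's real heuristics are not documented here and may differ substantially.

```typescript
// Collect datetime attribute values from <time> elements.
function findTimeElementDates(html: string): string[] {
  const timeRe = /<time\b[^>]*\bdatetime=["']([^"']+)["'][^>]*>/gi;
  const values: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = timeRe.exec(html)) !== null) {
    if (match[1]) values.push(match[1]);
  }
  return values;
}

// Find ISO-style dates (YYYY-MM-DD) in the text that remains after stripping tags.
function findIsoDatesInText(html: string): string[] {
  const text = html.replace(/<[^>]*>/g, " ");
  return text.match(/\b\d{4}-\d{2}-\d{2}\b/g) ?? [];
}

// Made-up markup for illustration.
const articleHtml =
  '<article><time datetime="2024-03-05T09:00:00Z">March 5, 2024</time>' +
  '<p class="post-date">Updated 2024-03-07</p></article>';

console.log(findTimeElementDates(articleHtml)); // [ '2024-03-05T09:00:00Z' ]
console.log(findIsoDatesInText(articleHtml)); // [ '2024-03-07' ]
```

Dates found this way are guesses about page content rather than declared metadata, which is presumably why heuristic results carry their own confidence level.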
Development
Scripts
# Development with hot reload
npm run dev
# Build the project
npm run build
# Run tests
npm test
# Run tests in watch mode
npm run test:watch
# Lint code
npm run lint
# Fix linting issues
npm run lint:fix
# Format code
npm run format
Testing
The project includes comprehensive tests:
# Run all tests
npm test
# Run tests with coverage
npm test -- --coverage
# Run specific test file
npm test -- extractor.test.ts
Code Quality
TypeScript: Full TypeScript support with strict type checking
ESLint: Code linting with recommended rules
Prettier: Code formatting
Jest: Unit and integration testing
95%+ Test Coverage: Comprehensive test suite
Examples
Basic Usage
import { TimestampExtractor } from './src/extractor.js';
const extractor = new TimestampExtractor();
const result = await extractor.extractTimestamps('https://example.com/article');
console.log('Published:', result.publishedAt);
console.log('Modified:', result.modifiedAt);
console.log('Confidence:', result.confidence);
console.log('Sources:', result.sources.length);
Custom Configuration
const extractor = new TimestampExtractor({
  timeout: 15000,
  userAgent: 'MyBot/1.0',
  enableHeuristics: false,
  maxRedirects: 3
});
const result = await extractor.extractTimestamps('https://example.com');
Batch Processing
const urls = [
  'https://example.com/article1',
  'https://example.com/article2',
  'https://example.com/article3'
];
const results = await Promise.all(
  urls.map(url => extractor.extractTimestamps(url))
);
Use Cases
Content Analysis: Analyze temporal aspects of web content
Web Scraping: Extract temporal metadata from scraped pages
SEO Analysis: Analyze publication and modification patterns
Research: Study temporal aspects of web content
Content Management: Track content lifecycle and updates
Error Handling
The extractor handles various error conditions gracefully:
Network Errors: Timeout, connection refused, DNS resolution failures
HTTP Errors: 404, 500, and other HTTP status codes
Parsing Errors: Invalid HTML, malformed JSON-LD, unparseable dates
Configuration Errors: Invalid URLs, timeout values, etc.
All errors are captured in the errors array of the response, allowing for robust error handling and debugging.
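A caller might interpret a result along the following lines. This is a sketch under assumptions: it treats dates as ISO 8601 strings, since JSON has no Date type, and the names RawResult, Outcome, and interpretResult, the preference order among the three dates, and the sample error text are all illustrative rather than taken from the package.

```typescript
type Confidence = "high" | "medium" | "low";

// Assumed wire form of a result, with dates serialized as ISO 8601 strings.
interface RawResult {
  url: string;
  createdAt?: string;
  modifiedAt?: string;
  publishedAt?: string;
  confidence: Confidence;
  errors?: string[];
}

interface Outcome {
  url: string;
  ok: boolean;
  bestDate: Date | undefined;
  problems: string[];
}

// Pick one usable date and gather every reported or detected problem.
function interpretResult(raw: RawResult): Outcome {
  const problems = [...(raw.errors ?? [])];
  // Illustrative preference: publication, then modification, then creation.
  const candidate = raw.publishedAt ?? raw.modifiedAt ?? raw.createdAt;
  let bestDate: Date | undefined;
  if (candidate !== undefined) {
    const ms = Date.parse(candidate);
    if (Number.isNaN(ms)) {
      problems.push(`unparseable date: ${candidate}`);
    } else {
      bestDate = new Date(ms);
    }
  }
  return { url: raw.url, ok: bestDate !== undefined, bestDate, problems };
}

const good = interpretResult({
  url: "https://example.com/article",
  publishedAt: "2024-01-15T08:00:00Z",
  confidence: "high",
});

const failed = interpretResult({
  url: "https://example.com/missing",
  confidence: "low",
  errors: ["HTTP 404"], // made-up error text
});

console.log(good.ok, good.bestDate?.toISOString());
console.log(failed.ok, failed.problems);
```

Checking the errors array alongside the dates lets a caller distinguish a page that has no timestamps from a page that could not be fetched or parsed.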
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
Fork the repository
Clone your fork:
git clone https://github.com/Fabien-desablens/mcp-webpage-timestamps.git
Install dependencies:
npm install
Create a branch:
git checkout -b feature/your-feature
Make your changes
Run tests:
npm test
Commit your changes:
git commit -m 'Add some feature'
Push to the branch:
git push origin feature/your-feature
Submit a pull request
Code Style
Follow the existing code style
Use TypeScript for all new code
Add tests for new functionality
Update documentation as needed
License
MIT License - see the LICENSE file for details.
Support
Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: Wiki
Changelog
See CHANGELOG.md for a detailed history of changes.
Acknowledgments
Model Context Protocol for the excellent MCP framework
Cheerio for HTML parsing
Axios for HTTP requests
date-fns for date parsing and manipulation