# MCP Metaculus Server
MCP server providing historical prediction data from the Metaculus forecasting platform with backtesting support.
## Overview
This server retrieves question metadata, community predictions, and user comments from Metaculus questions, with strict filtering to prevent future information leakage. It is intended for forecasting applications that need to analyze historical community forecasts as they stood on a given date.
## Features
- **Backtesting Compliant**: All predictions and comments filtered by cutoff date
- **Comprehensive Data**: Question details, background, resolution criteria, and fine print
- **Historical Predictions**: Community forecast history with timestamps and forecaster counts
- **User Comments**: Optional comment scraping via Firecrawl API with date filtering
- **Multi-Format Support**: Binary, numeric/date, and multiple-choice questions
## Tools
### `get_metaculus_question_info`
Get Metaculus question information with historical predictions filtered by cutoff date.
**Parameters:**
- `question_url` (str): The full Metaculus question URL (e.g., 'https://www.metaculus.com/questions/39771/')
- `cutoff_date` (str): ISO format date (YYYY-MM-DD) - only return predictions made before this date
**Returns:**
- Formatted string containing:
  - Question title, ID, post ID, type
  - Background information
  - Resolution criteria
  - Fine print (if any)
  - User comments (if Firecrawl API available, filtered by date)
  - Community prediction history (filtered by `cutoff_date`)
    - Last 15 historical entries with timestamps, predictions, and forecaster counts
    - Total number of historical entries
**Example:**
```python
result = await get_metaculus_question_info(
question_url="https://www.metaculus.com/questions/39771/will-there-be-a-stronger-hurricane/",
cutoff_date="2024-06-01"
)
# Returns only predictions made before June 1, 2024
```
## Environment Variables
**Optional:**
- `FIRECRAWL_API_KEY`: API key for Firecrawl comment scraping (gracefully degrades if not provided)
Get your API key at: https://www.firecrawl.dev/
## Installation
```bash
cd mcp-metaculus
uv sync
```
## Usage
### Testing Locally
```bash
mcp run -t sse metaculus_server.py:mcp
```
### As Git Submodule
```bash
git submodule add <repo-url> mcp-servers/mcp-metaculus
```
## Backtesting Compliance
This server is designed for strict backtesting requirements:
### Prediction History Filtering
1. **Timestamp-Based**: Keeps only history entries where `end_time <= cutoff_timestamp`
2. **No Future Data**: Only includes predictions made before the cutoff date
3. **Unix Timestamps**: Uses precise timestamp comparisons for accuracy
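
The filtering rule above can be sketched as follows. This is a minimal illustration, not the server's actual code: the helper names `cutoff_to_timestamp` and `filter_history` are hypothetical, and it assumes the cutoff is interpreted as midnight UTC and that a still-open entry carries `end_time: None`.

```python
from datetime import datetime, timezone


def cutoff_to_timestamp(cutoff_date: str) -> float:
    """Convert a YYYY-MM-DD string to a Unix timestamp (midnight UTC is an assumption)."""
    dt = datetime.strptime(cutoff_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return dt.timestamp()


def filter_history(history: list, cutoff_date: str) -> list:
    """Keep only entries whose end_time is at or before the cutoff timestamp."""
    cutoff_ts = cutoff_to_timestamp(cutoff_date)
    return [
        entry
        for entry in history
        # Entries without an end_time (assumed to be still open) are dropped.
        if entry.get("end_time") is not None and entry["end_time"] <= cutoff_ts
    ]


history = [
    {"start_time": 1714521600, "end_time": 1714608000, "centers": [0.70]},  # ends 2024-05-02
    {"start_time": 1717200000, "end_time": 1717286400, "centers": [0.75]},  # ends 2024-06-02
    {"start_time": 1717286400, "end_time": None, "centers": [0.80]},        # still open
]
kept = filter_history(history, "2024-06-01")
```

With a cutoff of `2024-06-01`, only the first entry survives, because the other two end after the cutoff or have not ended at all.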
### Comment Filtering
1. **Date Parsing**: Parses `time_posted` field from Firecrawl results
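2. **Unparseable Dates**: If a comment's date can't be parsed, the comment is included. This avoids silently dropping comments, but a comment with a malformed `time_posted` can slip past the cutoff, so treat comments as less strictly filtered than predictions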
3. **Optional Feature**: Comments require Firecrawl API, gracefully degrades without it
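
A rough sketch of the comment filter described above, under stated assumptions: `time_posted` is an ISO 8601 string as in the comment structure shown later in this README, the cutoff is midnight UTC, and the function names `parse_time_posted` and `filter_comments` are hypothetical. Comments whose date cannot be parsed are kept, mirroring the behavior described in step 2.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_time_posted(value) -> Optional[datetime]:
    """Parse an ISO 8601 string; return None when it cannot be parsed."""
    try:
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (ValueError, AttributeError):
        return None
    # Treat naive datetimes as UTC (an assumption).
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)


def filter_comments(comments: list, cutoff_date: str) -> list:
    """Keep comments posted at or before the cutoff, plus those with unparseable dates."""
    cutoff = datetime.strptime(cutoff_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    kept = []
    for comment in comments:
        posted = parse_time_posted(comment.get("time_posted"))
        if posted is None or posted <= cutoff:
            kept.append(comment)
    return kept


comments = [
    {"content": "early", "time_posted": "2024-05-15T10:30:00Z"},
    {"content": "late", "time_posted": "2024-07-01T00:00:00Z"},
    {"content": "unknown", "time_posted": "3 days ago"},
]
kept_comments = filter_comments(comments, "2024-06-01")
```

Here the "late" comment is dropped, while "unknown" is kept because its date cannot be parsed.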
### Question Type Support
**Binary Questions:**
- Returns probability as percentage (e.g., "72.5%")
- Shows most recent community forecast
**Numeric/Date Questions:**
- Returns median value
- Includes confidence interval range (lower and upper bounds)
- Handles both numeric ranges and date predictions
**Multiple Choice Questions:**
- Returns probabilities for all options
- Format: `options: ['45.0%', '30.0%', '25.0%']`
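
The per-type formatting above might look roughly like this. It is an illustrative sketch only: the function name `format_prediction`, the type strings `"binary"` and `"multiple_choice"`, and the wording of the numeric output are assumptions, while the field names come from the prediction history entry documented below.

```python
def format_prediction(entry: dict, question_type: str) -> str:
    """Render one history entry as text, depending on the question type."""
    centers = entry.get("centers") or []
    if not centers:
        return "no prediction"
    if question_type == "binary":
        # Binary centers are probabilities in [0, 1].
        return f"{centers[0] * 100:.1f}%"
    if question_type == "multiple_choice":
        # One probability per option, in option order.
        return "options: " + str([f"{c * 100:.1f}%" for c in centers])
    # Numeric/date: median plus the interval bounds, when present.
    lower = entry.get("interval_lower_bounds") or [None]
    upper = entry.get("interval_upper_bounds") or [None]
    return f"median: {centers[0]} (range: {lower[0]} to {upper[0]})"


binary_text = format_prediction({"centers": [0.725]}, "binary")
choice_text = format_prediction({"centers": [0.45, 0.30, 0.25]}, "multiple_choice")
```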
## API Details
### Metaculus API
- **Endpoint**: `https://www.metaculus.com/api/posts/{post_id}/`
- **Authentication**: None required (public API)
- **Data**: Question metadata and prediction history with timestamps
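
As a sketch of how a question URL could map to this endpoint, the snippet below pulls the numeric ID out of the URL path and fills in the template. The helper names and the regular expression are assumptions for illustration; the server's own parsing may differ.

```python
import re
from typing import Optional

API_TEMPLATE = "https://www.metaculus.com/api/posts/{post_id}/"


def extract_post_id(question_url: str) -> Optional[int]:
    """Return the numeric ID following /questions/ in the URL, or None if absent."""
    match = re.search(r"/questions/(\d+)", question_url)
    return int(match.group(1)) if match else None


def build_api_url(question_url: str) -> Optional[str]:
    """Build the API URL for a question page URL, or None for a malformed URL."""
    post_id = extract_post_id(question_url)
    if post_id is None:
        return None
    return API_TEMPLATE.format(post_id=post_id)


api_url = build_api_url(
    "https://www.metaculus.com/questions/39771/will-there-be-a-stronger-hurricane/"
)
```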
### Firecrawl API (Optional)
- **Endpoint**: `https://api.firecrawl.dev/v2/scrape`
- **Authentication**: Bearer token via `FIRECRAWL_API_KEY`
- **Structured Extraction**: Uses JSON schema to extract comment data
- **Caching**: 48-hour cache (`maxAge: 172800000`)
## Error Handling
The tool returns user-friendly error messages for common issues:
- Invalid date format (not YYYY-MM-DD)
- Malformed question URL (can't extract post ID)
- API request failures with status codes
- No prediction history found
- No predictions before cutoff date
All errors are returned as strings rather than raising exceptions.
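
The errors-as-strings pattern can be illustrated with a small input check. This is a hypothetical sketch: `check_inputs` and the exact message wording are not taken from the server, only the idea of returning a message instead of raising.

```python
import re
from datetime import datetime
from typing import Optional


def check_inputs(question_url: str, cutoff_date: str) -> Optional[str]:
    """Return an error message string, or None when both inputs look valid."""
    try:
        datetime.strptime(cutoff_date, "%Y-%m-%d")
    except (ValueError, TypeError):
        return f"Error: invalid cutoff_date {cutoff_date!r}, expected YYYY-MM-DD"
    if not re.search(r"/questions/\d+", question_url):
        return f"Error: could not extract a post ID from {question_url!r}"
    return None


ok = check_inputs("https://www.metaculus.com/questions/39771/", "2024-06-01")
bad_date = check_inputs("https://www.metaculus.com/questions/39771/", "06/01/2024")
bad_url = check_inputs("https://www.metaculus.com/about/", "2024-06-01")
```

Because the tool returns a string in every case, callers should inspect the returned text rather than rely on a `try`/`except` block.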
## Data Structure
### Prediction History Entry
```python
{
"start_time": 1234567890, # Unix timestamp
"end_time": 1234567890, # Unix timestamp
"centers": [0.725], # Prediction values (binary: 0-1, numeric: actual values)
"forecaster_count": 42, # Number of forecasters
"interval_lower_bounds": [...], # For numeric questions
"interval_upper_bounds": [...], # For numeric questions
}
```
### Comment Structure (from Firecrawl)
```python
{
"content": "Comment text...",
"time_posted": "2024-05-15T10:30:00Z",
"upvotes": 5,
"downvotes": 1,
"changed_my_mind_votes": 2,
"author": "username"
}
```
## Limitations
- **Comment Date Filtering**: Relies on the accuracy of Firecrawl's structured extraction; a mis-extracted `time_posted` can cause a comment to be wrongly kept or dropped
- **API Rate Limits**: Subject to Metaculus API rate limits (typically generous)
- **Firecrawl Costs**: Comment scraping requires paid Firecrawl subscription
- **Historical Data**: Only available for questions with prediction history
- **URL Format**: Requires standard Metaculus question URL format
## Dependencies
- **fastmcp**: MCP server framework
- **httpx**: Async HTTP client for API requests
- **python-dotenv**: Environment variable management