# MCP Metaculus Server
MCP server providing historical prediction data from the Metaculus forecasting platform with backtesting support.
## Overview
This server retrieves question metadata, community predictions, and user comments from Metaculus questions, with strict filtering to prevent future information leakage. Perfect for forecasting applications that need to analyze historical prediction markets.
## Features
- **Backtesting Compliant**: All predictions and comments filtered by cutoff date
- **Comprehensive Data**: Question details, background, resolution criteria, and fine print
- **Historical Predictions**: Community forecast history with timestamps and forecaster counts
- **User Comments**: Optional comment scraping via Firecrawl API with date filtering
- **Multi-Format Support**: Binary, numeric/date, and multiple-choice questions
## Tools
### `get_metaculus_question_info`
Get Metaculus question information with historical predictions filtered by cutoff date.
**Parameters:**
- `question_url` (str): The full Metaculus question URL (e.g., 'https://www.metaculus.com/questions/39771/')
- `cutoff_date` (str): ISO format date (YYYY-MM-DD) - only return predictions made before this date
**Returns:**
- Formatted string containing:
- Question title, ID, post ID, type
- Background information
- Resolution criteria
- Fine print (if any)
- User comments (if Firecrawl API available, filtered by date)
- Community prediction history (filtered by cutoff_date)
- Last 15 historical entries with timestamps, predictions, and forecaster counts
- Total number of historical entries
**Example:**
```python
result = await get_metaculus_question_info(
question_url="https://www.metaculus.com/questions/39771/will-there-be-a-stronger-hurricane/",
cutoff_date="2024-06-01"
)
# Returns only predictions made before June 1, 2024
```
## Environment Variables
**Optional:**
- `FIRECRAWL_API_KEY`: API key for Firecrawl comment scraping (gracefully degrades if not provided)
Get your API key at: https://www.firecrawl.dev/
## Installation
```bash
cd mcp-metaculus
uv sync
```
## Usage
### Testing Locally
```bash
mcp run -t sse metaculus_server.py:mcp
```
### As Git Submodule
```bash
git submodule add <repo-url> mcp-servers/mcp-metaculus
```
## Backtesting Compliance
This server is designed for strict backtesting requirements:
### Prediction History Filtering
1. **Timestamp-Based**: Filters predictions where `end_time <= cutoff_timestamp`
2. **No Future Data**: Only includes predictions made before the cutoff date
3. **Unix Timestamps**: Uses precise timestamp comparisons for accuracy
### Comment Filtering
1. **Date Parsing**: Parses `time_posted` field from Firecrawl results
2. **Conservative Approach**: If date can't be parsed, includes the comment (safe for backtesting)
3. **Optional Feature**: Comments require Firecrawl API, gracefully degrades without it
### Question Type Support
**Binary Questions:**
- Returns probability as percentage (e.g., "72.5%")
- Shows most recent community forecast
**Numeric/Date Questions:**
- Returns median value
- Includes confidence interval range (lower and upper bounds)
- Handles both numeric ranges and date predictions
**Multiple Choice Questions:**
- Returns probabilities for all options
- Format: `options: ['45.0%', '30.0%', '25.0%']`
## API Details
### Metaculus API
- **Endpoint**: `https://www.metaculus.com/api/posts/{post_id}/`
- **Authentication**: None required (public API)
- **Data**: Question metadata and prediction history with timestamps
### Firecrawl API (Optional)
- **Endpoint**: `https://api.firecrawl.dev/v2/scrape`
- **Authentication**: Bearer token via `FIRECRAWL_API_KEY`
- **Structured Extraction**: Uses JSON schema to extract comment data
- **Caching**: 48-hour cache (`maxAge: 172800000`)
## Error Handling
The tool returns user-friendly error messages for common issues:
- Invalid date format (not YYYY-MM-DD)
- Malformed question URL (can't extract post ID)
- API request failures with status codes
- No prediction history found
- No predictions before cutoff date
All errors are returned as strings rather than raising exceptions.
## Data Structure
### Prediction History Entry
```python
{
"start_time": 1234567890, # Unix timestamp
"end_time": 1234567890, # Unix timestamp
"centers": [0.725], # Prediction values (binary: 0-1, numeric: actual values)
"forecaster_count": 42, # Number of forecasters
"interval_lower_bounds": [...], # For numeric questions
"interval_upper_bounds": [...], # For numeric questions
}
```
### Comment Structure (from Firecrawl)
```python
{
"content": "Comment text...",
"time_posted": "2024-05-15T10:30:00Z",
"upvotes": 5,
"downvotes": 1,
"changed_my_mind_votes": 2,
"author": "username"
}
```
## Limitations
- **Comment Date Filtering**: Relies on Firecrawl's structured extraction accuracy
- **API Rate Limits**: Subject to Metaculus API rate limits (typically generous)
- **Firecrawl Costs**: Comment scraping requires paid Firecrawl subscription
- **Historical Data**: Only available for questions with prediction history
- **URL Format**: Requires standard Metaculus question URL format
## Testing
### Setup
1. Install test dependencies:
```bash
uv pip install -e ".[test]"
```
2. (Optional) Configure Firecrawl API key by copying `.env.example` to `.env`:
```bash
cp .env.example .env
```
3. (Optional) Add your Firecrawl API key to `.env`:
- `FIRECRAWL_API_KEY` - Optional from https://firecrawl.dev/
- Tests work without this key, but comments won't be fetched
### Running Tests
Run all tests:
```bash
pytest
```
Run with verbose output:
```bash
pytest -v
```
Run specific test file:
```bash
pytest tests/test_metaculus_server.py
```
Run specific test:
```bash
pytest tests/test_metaculus_server.py::test_get_metaculus_question_basic -v
```
### Test Coverage
The test suite covers:
- **Basic functionality**: Question retrieval with various cutoff dates
- **Output format**: Validation of all expected sections (title, background, predictions, etc.)
- **Prediction history**: Historical data filtering by cutoff date
- **URL handling**: Valid URLs, invalid formats, nonexistent questions
- **Date validation**: Valid dates, invalid formats, malformed dates, very early dates
- **Prediction details**: Forecaster counts, most recent markers, total entry counts
- **Post ID extraction**: Correct parsing of question URLs
**Note**: Tests make real API calls to Metaculus (no API key required). The Metaculus API is public and free. Firecrawl API key is optional - tests work without it.
## Dependencies
- **fastmcp**: MCP server framework
- **httpx**: Async HTTP client for API requests
- **python-dotenv**: Environment variable management