# Data Commons MCP Server
A fully functional **Model Context Protocol (MCP)** server for accessing public statistical data from [Data Commons](https://datacommons.org). This server is optimized for deployment on **Railway.app** and can be accessed remotely by MCP clients like Manus, Claude Desktop, and other MCP-enabled applications.
## Overview
**Data Commons** is an open knowledge repository providing a unified view across multiple public datasets and statistics. This MCP server enables AI agents and applications to query the Data Commons knowledge graph through a standardized protocol.
### Key Features
- **MCP-Compliant**: Implements the Model Context Protocol for seamless agent integration
- **Data Commons Access**: Fetches public statistics from the datacommons.org knowledge graph
- **Custom Instance Support**: Can be configured to work with custom Data Commons instances
- **Railway-Ready**: Pre-configured for one-click deployment on Railway.app
- **Remote Access**: Accessible via HTTP for remote MCP clients
- **Comprehensive Tools**: Includes tools for searching indicators and fetching observations
## Architecture
The server provides two main MCP tools:
1. **`search_indicators`**: Search and discover statistical variables (indicators) available in Data Commons
2. **`get_observations`**: Fetch actual statistical data for specific variables and places
## Quick Start
### Prerequisites
1. **Data Commons API Key**: Create one at [apikeys.datacommons.org](https://apikeys.datacommons.org/)
2. **Python 3.11+**: Required for local development
3. **Railway Account**: For deployment (optional)
### Local Development
1. **Clone the repository**:
   ```bash
   git clone https://github.com/ARJ999/Data-Commons-mcp-server.git
   cd Data-Commons-mcp-server
   ```
2. **Set up environment**:
   ```bash
   cp .env.example .env
   # Edit .env and add your DC_API_KEY
   ```
3. **Install dependencies**:
   ```bash
   pip install -r requirements.txt
   ```
4. **Run the server**:
   ```bash
   python -m datacommons_mcp.cli serve http --host 0.0.0.0 --port 8080
   ```
5. **Access the MCP endpoint**:
   ```
   http://localhost:8080/mcp
   ```
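To confirm that the local endpoint speaks MCP, a client can POST a JSON-RPC `initialize` message to it. The sketch below uses only the standard library and builds the request without sending it. The protocol revision string and the client name are assumptions made for illustration, not values taken from this repository.

```python
import json
import urllib.request

MCP_URL = "http://localhost:8080/mcp"  # local endpoint from step 5


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build (but do not send) an MCP `initialize` request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed protocol revision; use the one your client targets.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            # Hypothetical client identity for this sketch.
            "clientInfo": {"name": "readme-smoke-test", "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_initialize_request()
print(request.get_method(), request.full_url)
# With the server running, call: print(send(request))
```

The response may arrive as plain JSON or as a server-sent event stream, so `send` returns the raw text rather than parsing it.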
## Railway Deployment
### One-Click Deploy
[Deploy on Railway](https://railway.app/new/template)
### Manual Deployment
1. **Create a new Railway project**:
   - Go to [railway.app](https://railway.app)
   - Click "New Project" → "Deploy from GitHub repo"
   - Select this repository
2. **Configure environment variables**:
   - Add `DC_API_KEY` with your Data Commons API key
   - Railway automatically sets `PORT`
3. **Deploy**:
   - Railway will automatically detect the configuration and deploy
   - Your MCP server will be available at: `https://your-app.railway.app/mcp`
### Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `DC_API_KEY` | Yes | Your Data Commons API key from [apikeys.datacommons.org](https://apikeys.datacommons.org/) |
| `DC_API_ROOT` | No | Custom Data Commons instance URL (defaults to datacommons.org) |
| `PORT` | No | Server port (Railway sets this automatically) |
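The table above can be read as a small validation routine: one required value and two optional ones. The following is a minimal sketch of that logic, not the project's actual `settings.py`. The `ServerConfig` and `load_config` names are hypothetical, and the `8080` fallback is an assumption borrowed from the local run command.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    api_key: str
    api_root: Optional[str]  # None means "use the default datacommons.org"
    port: int


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the three documented variables from a mapping of env values."""
    api_key = env.get("DC_API_KEY", "").strip()
    if not api_key:
        raise ValueError("DC_API_KEY is required")
    # An empty or missing DC_API_ROOT falls back to the default instance.
    api_root = env.get("DC_API_ROOT", "").strip() or None
    # 8080 is an assumed fallback, matching the local run command.
    port = int(env.get("PORT", "8080"))
    return ServerConfig(api_key=api_key, api_root=api_root, port=port)


config = load_config({"DC_API_KEY": "example-key", "PORT": "3000"})
print(config)
```

Passing the environment as a mapping keeps the function easy to exercise without touching the real process environment.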
## Using with MCP Clients
### Manus
Configure Manus to connect to your deployed MCP server:
```json
{
  "mcpServers": {
    "datacommons": {
      "url": "https://your-app.railway.app/mcp",
      "transport": "http"
    }
  }
}
```
### Claude Desktop
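Claude Desktop's MCP settings file launches servers as local commands that speak MCP over stdio, so a remote HTTP endpoint needs a bridge process (a plain `curl` call cannot act as one). One commonly used option is the `mcp-remote` npm package, which requires Node.js; treat the invocation below as an example to verify against that package's documentation:
```json
{
  "mcpServers": {
    "datacommons": {
      "command": "npx",
      "args": ["mcp-remote", "https://your-app.railway.app/mcp"]
    }
  }
}
```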
### Other MCP Clients
Any MCP-enabled client can connect using the HTTP endpoint:
- **Endpoint**: `https://your-app.railway.app/mcp`
- **Transport**: Streamable HTTP
- **Protocol**: MCP (Model Context Protocol)
## Available Tools
### 1. search_indicators
Search for statistical variables (indicators) in Data Commons.
**Parameters**:
- `query` (string): Natural language search query
- `place_dcids` (list, optional): Filter by specific place DCIDs
- `topic_dcids` (list, optional): Filter by topic DCIDs
**Example**:
```python
search_indicators(
    query="population growth rate",
    place_dcids=["country/USA"]
)
```
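The example above is written as a plain function call. On the wire, an MCP client sends a JSON-RPC `tools/call` request that names the tool and carries its arguments. A minimal sketch of that envelope follows; the `tools_call` helper is hypothetical and exists only for illustration.

```python
import json
from typing import Any


def tools_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


message = tools_call(
    "search_indicators",
    {"query": "population growth rate", "place_dcids": ["country/USA"]},
)
print(message)
```

Most MCP client libraries build this message for you; the sketch only shows what they send.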
### 2. get_observations
Fetch statistical observations for a variable and place.
**Parameters**:
- `variable_dcid` (string): Variable identifier from search_indicators
- `place_dcid` (string): Place identifier
- `child_place_type` (string, optional): Get data for child places
- `date` (string, optional): Date filter ('latest', 'all', or specific date)
- `date_range_start` (string, optional): Start of date range
- `date_range_end` (string, optional): End of date range
**Example**:
```python
get_observations(
    variable_dcid="Count_Person",
    place_dcid="country/USA",
    date="latest"
)
```
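Because four of the six parameters are optional, a caller may want to send only the ones it actually sets. The sketch below assembles such an argument dictionary. The `observation_arguments` helper is hypothetical, and the year-only date strings are an assumed format, so check the tool's own description for the formats it accepts.

```python
from typing import Any, Optional


def observation_arguments(
    variable_dcid: str,
    place_dcid: str,
    child_place_type: Optional[str] = None,
    date: Optional[str] = None,
    date_range_start: Optional[str] = None,
    date_range_end: Optional[str] = None,
) -> dict[str, Any]:
    """Build get_observations arguments, omitting optional values left unset."""
    optional = {
        "child_place_type": child_place_type,
        "date": date,
        "date_range_start": date_range_start,
        "date_range_end": date_range_end,
    }
    arguments: dict[str, Any] = {
        "variable_dcid": variable_dcid,
        "place_dcid": place_dcid,
    }
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments


arguments = observation_arguments(
    "Count_Person",
    "country/USA",
    date_range_start="2010",  # assumed year-only format
    date_range_end="2020",
)
print(arguments)
```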
## Project Structure
```
Data-Commons-mcp-server/
├── datacommons_mcp/          # Main package
│   ├── __init__.py
│   ├── server.py             # MCP server implementation
│   ├── cli.py                # Command-line interface
│   ├── clients.py            # Data Commons API client
│   ├── services.py           # Business logic
│   ├── settings.py           # Configuration
│   ├── data_models/          # Pydantic models
│   └── ...
├── requirements.txt          # Python dependencies
├── pyproject.toml           # Project metadata
├── Procfile                 # Railway start command
├── railway.json             # Railway configuration
├── runtime.txt              # Python version
├── .env.example             # Environment template
├── .gitignore              # Git ignore rules
└── README.md               # This file
```
## Technical Details
### Dependencies
- **FastAPI**: Web framework for HTTP server
- **FastMCP**: MCP protocol implementation
- **Uvicorn**: ASGI server
- **datacommons-client**: Official Data Commons Python client
- **Pydantic**: Data validation and settings management
### Transport Modes
The server supports two transport modes:
1. **Streamable HTTP** (default for Railway):
   - Accessible via HTTP/HTTPS
   - Suitable for remote clients
   - Endpoint: `/mcp`
2. **stdio** (for local integrations):
   - Communicates via standard input/output
   - Used by local MCP clients like Gemini CLI
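Both transports carry the same JSON-RPC messages and differ only in framing. In the MCP stdio transport, each message is a single line of JSON terminated by a newline. The sketch below illustrates that framing with an in-memory buffer standing in for the process's standard streams; the helper names are hypothetical and this is not the server's own implementation.

```python
import io
import json
from typing import Any, Iterator, TextIO


def write_message(stream: TextIO, message: dict[str, Any]) -> None:
    """Write one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_messages(stream: TextIO) -> Iterator[dict[str, Any]]:
    """Yield one parsed message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


buffer = io.StringIO()  # stands in for stdin/stdout in this sketch
write_message(buffer, {"jsonrpc": "2.0", "id": 1, "method": "ping"})
write_message(buffer, {"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
buffer.seek(0)
messages = list(read_messages(buffer))
print([m["method"] for m in messages])
```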
## Troubleshooting
### Server won't start
- **Check API Key**: Ensure `DC_API_KEY` is set correctly
- **Check Python Version**: Must be 3.11 or 3.12; `runtime.txt` records the version used for deployment
- **Check Dependencies**: Run `pip install -r requirements.txt`
### Can't connect from MCP client
- **Verify URL**: Ensure you're using the correct Railway URL
- **Check Endpoint**: URL should end with `/mcp`
- **Check Deployment**: Verify the Railway deployment is successful
### API Errors
- **Invalid API Key**: Get a new key from [apikeys.datacommons.org](https://apikeys.datacommons.org/)
- **Rate Limits**: Data Commons may have rate limits; check their documentation
## Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
## License
This project is based on the [Data Commons Agent Toolkit](https://github.com/datacommonsorg/agent-toolkit) and is licensed under the Apache License 2.0.
## Resources
- **Data Commons**: [datacommons.org](https://datacommons.org)
- **MCP Specification**: [Model Context Protocol](https://modelcontextprotocol.io)
- **Railway Documentation**: [docs.railway.app](https://docs.railway.app)
- **Original Repository**: [datacommonsorg/agent-toolkit](https://github.com/datacommonsorg/agent-toolkit)
## Support
For issues related to:
- **This deployment**: Open an issue on this repository
- **Data Commons API**: Visit [datacommons.org/support](https://datacommons.org/support)
- **Railway platform**: Check [Railway documentation](https://docs.railway.app)
---
**Built with ❤️ for the MCP ecosystem**