# Wikidata MCP Server - Optimized Hybrid Architecture
A Model Context Protocol (MCP) server with Server-Sent Events (SSE) transport that connects Large Language Models to Wikidata's structured knowledge base. It features an **optimized hybrid architecture** that balances speed, accuracy, and verifiability by using fast basic tools for simple queries and advanced orchestration only for complex temporal/relational queries.
## Architecture Highlights
- **🚀 Fast Basic Tools**: 140-250ms for simple entity/property searches
- **🧠 Advanced Orchestration**: 1-11s for complex temporal queries (when needed)
- **⚡ 50x Performance Difference**: Empirically measured and optimized
- **🔄 Hybrid Approach**: Right tool for each query type
- **🛡️ Graceful Degradation**: Works with or without Vector DB API key
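The routing idea behind the hybrid approach can be illustrated with a small sketch: simple lookups go straight to the fast basic tools, while questions with temporal or ordinal language are escalated to the orchestrator. The keyword heuristic below is an illustrative assumption, not the server's actual routing logic.
```python
import re

# Illustrative only: words that hint at a temporal/relational question.
TEMPORAL_HINTS = re.compile(
    r"\b(last|latest|recent|current|former|previous|first|since|before|after)\b",
    re.IGNORECASE,
)

def route_query(question: str) -> str:
    """Return the MCP tool a client might pick for this question.

    Cheap path (140-250 ms) unless the question looks temporal/relational,
    in which case the advanced orchestrator (1-11 s) is used.
    """
    if TEMPORAL_HINTS.search(question):
        return "query_wikidata_complex"
    return "search_wikidata_entity"

print(route_query("Who were the last 3 popes?"))  # query_wikidata_complex
print(route_query("Douglas Adams"))               # search_wikidata_entity
```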
## MCP Tools
### Basic Tools (Fast & Reliable)
- **`search_wikidata_entity`**: Find entities by name (140-250ms)
- **`search_wikidata_property`**: Find properties by name (~200ms)
- **`get_wikidata_metadata`**: Entity labels, descriptions (~200ms)
- **`get_wikidata_properties`**: All entity properties (~200ms)
- **`execute_wikidata_sparql`**: Direct SPARQL queries (~200ms)
### Advanced Tool (Complex Queries)
- **`query_wikidata_complex`**: Temporal/relational queries (1-11s)
- ✅ "last 3 popes", "recent presidents of France"
- ❌ Simple entity searches (use basic tools instead)
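For reference, an MCP client invokes these tools through the standard `tools/call` JSON-RPC method. The sketch below shows what such requests might look like; the argument names (`query`, `sparql`) are assumptions for illustration and may differ from the server's actual tool schemas.
```python
import json

def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request as used by MCP clients."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# Fast path: simple entity lookup (argument name is illustrative).
print(tool_call(1, "search_wikidata_entity", {"query": "Douglas Adams"}))

# Fast path: raw SPARQL (argument name is illustrative).
print(tool_call(2, "execute_wikidata_sparql", {
    "sparql": "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 3"
}))

# Slow path: temporal/relational question handled by the orchestrator.
print(tool_call(3, "query_wikidata_complex", {"query": "last 3 popes"}))
```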
## Live Demo
The server is deployed and accessible at:
- **URL**: [https://wikidata-mcp-mirror.onrender.com](https://wikidata-mcp-mirror.onrender.com)
- **MCP Endpoint**: [https://wikidata-mcp-mirror.onrender.com/mcp](https://wikidata-mcp-mirror.onrender.com/mcp)
- **Health Check**: [https://wikidata-mcp-mirror.onrender.com/health](https://wikidata-mcp-mirror.onrender.com/health)
## Usage with Claude Desktop
To use this server with Claude Desktop:
1. **Install mcp-remote** (if not already installed):
```bash
npm install -g mcp-remote
```
2. Edit the Claude Desktop configuration file. On macOS it is located at:
```
~/Library/Application Support/Claude/claude_desktop_config.json
```
(On Windows: `%APPDATA%\Claude\claude_desktop_config.json`.)
3. Configure it to use the remote MCP server:
```json
{
  "mcpServers": {
    "Wikidata MCP": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://wikidata-mcp-mirror.onrender.com/mcp"
      ]
    }
  }
}
```
4. Restart Claude Desktop
5. Claude can now access Wikidata knowledge through the configured MCP server.
## Deployment
### Deploying to Render
1. **Create a new Web Service** in your Render dashboard
2. **Connect your GitHub repository**
3. **Configure the service**:
- **Build Command**: `pip install -e .`
- **Start Command**: `python -m wikidata_mcp.api`
4. **Set Environment Variables**:
- Add all variables from `.env.example`
- For production, set `DEBUG=false`
- Make sure to set a proper `WIKIDATA_VECTORDB_API_KEY`
5. **Deploy**
The service will be available at `https://your-service-name.onrender.com`
## Environment Setup
### Prerequisites
- Python 3.10+
- Virtual environment tool (venv, conda, etc.)
- Vector DB API key (for enhanced semantic search)
### Environment Variables
Create a `.env` file in the project root with the following variables:
```bash
# Required for Vector DB integration
WIKIDATA_VECTORDB_API_KEY=your_api_key_here
```
### Local Development
1. Clone the repository:
```bash
git clone https://github.com/yourusername/wikidata-mcp-mirror.git
cd wikidata-mcp-mirror
```
2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # On Windows: .\venv\Scripts\activate
```
3. Install the required dependencies:
```bash
pip install -e .
```
4. Create a `.env` file based on `.env.example` and configure your environment variables:
```bash
cp .env.example .env
# Edit .env with your configuration
```
5. Run the application:
```bash
# Development
python -m wikidata_mcp.api
# Production (with Gunicorn)
gunicorn --bind 0.0.0.0:8000 --workers 4 --timeout 120 --keep-alive 5 --worker-class uvicorn.workers.UvicornWorker wikidata_mcp.api:app
```
The server will start on `http://localhost:8000` by default with the following endpoints:
- `GET /health` - Health check
- `GET /messages/` - SSE endpoint for MCP communication
- `GET /docs` - Interactive API documentation (if enabled)
- `GET /metrics` - Prometheus metrics (if enabled)
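As a quick smoke test, the sketch below checks the health endpoint and confirms the SSE endpoint answers with an event stream. It assumes the server is running locally on port 8000 and uses the `requests` package.
```python
import requests

BASE = "http://localhost:8000"

# Health check: expect HTTP 200 once the server is up.
health = requests.get(f"{BASE}/health", timeout=5)
print("health:", health.status_code, health.text)

# SSE endpoint: only inspect the response headers, then close the stream.
with requests.get(
    f"{BASE}/messages/",
    headers={"Accept": "text/event-stream"},
    stream=True,
    timeout=5,
) as sse:
    print("sse content-type:", sse.headers.get("content-type"))
```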
### Environment Variable Reference
| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | 8000 | Port to run the server on |
| `WORKERS` | 4 | Number of worker processes |
| `TIMEOUT` | 120 | Worker timeout in seconds |
| `KEEPALIVE` | 5 | Keep-alive timeout in seconds |
| `DEBUG` | false | Enable debug mode |
| `LOG_LEVEL` | INFO | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) |
| `USE_VECTOR_DB` | true | Enable/disable vector DB integration |
| `USE_CACHE` | true | Enable/disable caching system |
| `USE_FEEDBACK` | true | Enable/disable feedback system |
| `CACHE_TTL_SECONDS` | 3600 | Cache time-to-live in seconds |
| `CACHE_MAX_SIZE` | 1000 | Maximum number of items in cache |
| `WIKIDATA_VECTORDB_API_KEY` | | API key for the vector DB service |
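These variables map naturally onto a small settings object. The sketch below shows one way the defaults from the table could be read with the standard library; it is illustrative and not the server's actual configuration code.
```python
import os
from dataclasses import dataclass

def _bool(name: str, default: bool) -> bool:
    """Parse a boolean-style environment variable ('true'/'false')."""
    return os.getenv(name, str(default)).strip().lower() == "true"

@dataclass(frozen=True)
class Settings:
    port: int = int(os.getenv("PORT", "8000"))
    workers: int = int(os.getenv("WORKERS", "4"))
    timeout: int = int(os.getenv("TIMEOUT", "120"))
    keepalive: int = int(os.getenv("KEEPALIVE", "5"))
    debug: bool = _bool("DEBUG", False)
    log_level: str = os.getenv("LOG_LEVEL", "INFO")
    use_vector_db: bool = _bool("USE_VECTOR_DB", True)
    use_cache: bool = _bool("USE_CACHE", True)
    use_feedback: bool = _bool("USE_FEEDBACK", True)
    cache_ttl_seconds: int = int(os.getenv("CACHE_TTL_SECONDS", "3600"))
    cache_max_size: int = int(os.getenv("CACHE_MAX_SIZE", "1000"))
    vectordb_api_key: str = os.getenv("WIKIDATA_VECTORDB_API_KEY", "")

settings = Settings()
print(settings)
```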
### Running with Docker
1. Build the Docker image:
```bash
docker build -t wikidata-mcp .
```
2. Run the container:
```bash
docker run -p 8000:8000 --env-file .env wikidata-mcp
```
### Running with Docker Compose
1. Start the application:
```bash
docker-compose up --build
```
2. For production, use the production compose file:
```bash
docker-compose -f docker-compose.prod.yml up --build -d
```
## Monitoring
The service exposes Prometheus metrics at `/metrics` when the `PROMETHEUS_METRICS` environment variable is set to `true`.
### Health Check
```bash
curl http://localhost:8000/health
```
### Metrics
```bash
curl http://localhost:8000/metrics
```
## Testing
### Running Tests
Run the test suite with:
```bash
# Run all tests
pytest
# Run specific test file
pytest tests/orchestration/test_query_orchestrator.py -v
# Run with coverage report
pytest --cov=wikidata_mcp tests/
```
### Integration Tests
To test the Vector DB integration, you'll need to set the `WIKIDATA_VECTORDB_API_KEY` environment variable:
```bash
WIKIDATA_VECTORDB_API_KEY=your_key_here pytest tests/orchestration/test_vectordb_integration.py -v
```
### Test Client
You can also test the server using the included test client:
```bash
python test_mcp_client.py
```
Or manually with curl:
```bash
# Connect to SSE endpoint
curl -N -H "Accept: text/event-stream" https://wikidata-mcp-mirror.onrender.com/messages/
# Send a message (replace YOUR_SESSION_ID with the session ID received from the SSE endpoint)
curl -X POST "https://wikidata-mcp-mirror.onrender.com/messages/?session_id=YOUR_SESSION_ID" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test-client","version":"0.1.0"}},"id":0}'
```
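The same handshake can be scripted in Python. The sketch below assumes the server follows the standard MCP SSE flow, in which the first event on the stream carries the session-specific POST URL and subsequent JSON-RPC responses arrive as events on that same stream; it uses the `requests` package.
```python
import requests

BASE = "https://wikidata-mcp-mirror.onrender.com"

init = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "test-client", "version": "0.1.0"},
    },
}

# Keep the SSE stream open for the whole exchange: responses come back as
# events on this stream, while the POST itself is normally just acknowledged.
with requests.get(f"{BASE}/messages/", headers={"Accept": "text/event-stream"},
                  stream=True, timeout=60) as stream:
    lines = stream.iter_lines(decode_unicode=True)

    # The first `data:` line carries the session-specific POST URL.
    post_path = next(l[len("data:"):].strip() for l in lines
                     if l and l.startswith("data:"))

    ack = requests.post(f"{BASE}{post_path}", json=init, timeout=30)
    print("POST status:", ack.status_code)

    # Print the next few events, which should include the initialize response.
    for _ in range(5):
        print(next(lines, ""))
```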
## Deployment on Render.com
This server is configured for deployment on Render.com using the `render.yaml` file.
### Deployment Configuration
- **Build Command**: `pip install -r requirements.txt`
- **Start Command**: `gunicorn -k uvicorn.workers.UvicornWorker server_sse:app`
- **Environment Variables**:
- `PORT`: 10000
- **Health Check Path**: `/health`
### Docker Support
The repository includes a Dockerfile that's used by Render.com for containerized deployment. This allows the server to run in a consistent environment with all dependencies properly installed.
### How to Deploy
1. Fork or clone this repository to your GitHub account
2. Create a new Web Service on Render.com
3. Connect your GitHub repository
4. Render will automatically detect the `render.yaml` file and configure the deployment
5. Click "Create Web Service"
After deployment, you can access your server at the URL provided by Render.com.
## Architecture
The server is built using:
- **FastAPI**: For handling HTTP requests and routing
- **SSE Transport**: For bidirectional communication with clients
- **MCP Framework**: For implementing the Model Context Protocol
- **Wikidata API**: For accessing Wikidata's knowledge base
### Key Components
- `server_sse.py`: Main server implementation with SSE transport
- `wikidata_api.py`: Functions for interacting with Wikidata's API and SPARQL endpoint
- `requirements.txt`: Dependencies for the project
- `Dockerfile`: Container configuration for Docker deployment on Render
- `render.yaml`: Configuration for deployment on Render.com
- `test_mcp_client.py`: Test client for verifying server functionality
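For orientation, the sketch below shows the kind of calls `wikidata_api.py` likely wraps: the public `wbsearchentities` action of the MediaWiki API and the public Wikidata Query Service SPARQL endpoint. The function names are illustrative, not the module's actual interface.
```python
import requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

def search_entity(name: str, entity_type: str = "item") -> list[dict]:
    """Search entities (or properties) by label via wbsearchentities."""
    resp = requests.get(WIKIDATA_API, params={
        "action": "wbsearchentities",
        "search": name,
        "language": "en",
        "type": entity_type,   # "item" for Q-ids, "property" for P-ids
        "format": "json",
    }, timeout=10)
    resp.raise_for_status()
    return resp.json().get("search", [])

def run_sparql(query: str) -> dict:
    """Execute a SPARQL query against the Wikidata Query Service."""
    resp = requests.get(SPARQL_ENDPOINT,
                        params={"query": query, "format": "json"},
                        headers={"User-Agent": "wikidata-mcp-example/0.1"},
                        timeout=30)
    resp.raise_for_status()
    return resp.json()

print(search_entity("Douglas Adams")[0]["id"])             # e.g. Q42
print(run_sparql("SELECT ?p WHERE { wd:Q42 ?p ?o } LIMIT 3"))
```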
## Available MCP Tools
The server provides the following MCP tools:
- `search_wikidata_entity`: Search for entities by name
- `search_wikidata_property`: Search for properties by name
- `get_wikidata_metadata`: Get entity metadata (label, description)
- `get_wikidata_properties`: Get all properties for an entity
- `execute_wikidata_sparql`: Execute a SPARQL query
- `find_entity_facts`: Search for an entity and find its facts
- `get_related_entities`: Find entities related to a given entity
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Based on the Model Context Protocol (MCP) specification
- Uses Wikidata as the knowledge source
- Inspired by the MCP examples from the official documentation