# Wikipedia MCP Server - MVP Specification
## Overview
This MVP focuses on the **four essential tools** that make Claude effective at Wikipedia research while demonstrating core MCP concepts. The goals are simplicity, learning, and immediate value.
## Core Philosophy
- **Simple but Powerful**: Each tool has a clear, focused purpose
- **Learning-Focused**: Demonstrates key MCP patterns and best practices
- **Research-Optimized**: Tools designed for how Claude actually researches
- **Minimal Dependencies**: Keep external requirements to a minimum
## Essential MCP Tools
### 1. `search_wikipedia` - Discovery Tool
**Purpose**: Find relevant articles when you don't know exact titles
```python
@mcp.tool()
async def search_wikipedia(
    query: str,
    limit: int = 5,
    language: str = "en"
) -> WikipediaSearchResult:
    """Search Wikipedia for articles matching the query.

    This is the starting point for research - use when you need to find
    articles about a topic but don't know exact titles.
    """
```
**Why Essential**: Research always starts with discovery. Claude needs to find the right articles before it can read them.
**Input Schema**:
```json
{
  "type": "object",
  "properties": {
    "query": {
      "type": "string",
      "description": "Search terms (e.g. 'quantum computing', 'Napoleon Bonaparte')"
    },
    "limit": {
      "type": "integer",
      "default": 5,
      "minimum": 1,
      "maximum": 10,
      "description": "Number of results to return"
    },
    "language": {
      "type": "string",
      "default": "en",
      "description": "Wikipedia language code"
    }
  },
  "required": ["query"]
}
```
**Output**:
```python
from pydantic import BaseModel


class SearchResult(BaseModel):
    title: str
    snippet: str  # Brief description from search
    url: str
    page_id: int


class WikipediaSearchResult(BaseModel):
    results: list[SearchResult]
    query: str
    total_found: int
```
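As a minimal sketch of how a raw search response could be mapped onto this output shape, the example below assumes the MediaWiki Action API (`action=query&list=search`) is used for search; the spec itself does not say which endpoint backs the tool. The helper names `build_search_params`, `clean_snippet`, and `parse_search_response` are hypothetical, and dataclasses stand in for the Pydantic models so the sketch runs with the standard library only.

```python
import html
import re
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class SearchResult:
    title: str
    snippet: str
    url: str
    page_id: int


@dataclass
class WikipediaSearchResult:
    results: list[SearchResult]
    query: str
    total_found: int


def build_search_params(query: str, limit: int = 5) -> dict[str, str]:
    """Query parameters for a full-text search (assumed Action API usage)."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srlimit": str(limit),
        "format": "json",
    }


def clean_snippet(raw: str) -> str:
    """Remove highlight markup and decode HTML entities in a snippet."""
    return html.unescape(re.sub(r"<[^>]+>", "", raw)).strip()


def parse_search_response(
    payload: dict, query: str, language: str = "en"
) -> WikipediaSearchResult:
    """Map a decoded JSON search response onto the output model."""
    body = payload.get("query", {})
    results = []
    for hit in body.get("search", []):
        # Article URLs use underscores in place of spaces
        path = quote(hit["title"].replace(" ", "_"))
        results.append(
            SearchResult(
                title=hit["title"],
                snippet=clean_snippet(hit.get("snippet", "")),
                url=f"https://{language}.wikipedia.org/wiki/{path}",
                page_id=hit["pageid"],
            )
        )
    # Fall back to the number of returned hits if the total is missing
    total = body.get("searchinfo", {}).get("totalhits", len(results))
    return WikipediaSearchResult(results=results, query=query, total_found=total)
```

Keeping the parsing separate from the HTTP call makes the mapping easy to test with a canned response, without touching the network.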
### 2. `get_article` - Content Retrieval Tool
**Purpose**: Get the full content of a specific Wikipedia article
```python
@mcp.tool()
async def get_article(
    title: str,
    language: str = "en"
) -> WikipediaArticle:
    """Retrieve the full content of a Wikipedia article.

    Use this when you know the article title and need the complete content
    for detailed analysis or to answer specific questions.
    """
```
**Why Essential**: This is the core research tool - getting the actual content Claude needs to work with.
**Input Schema**:
```json
{
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "description": "Exact article title (e.g. 'Artificial Intelligence')"
    },
    "language": {
      "type": "string",
      "default": "en",
      "description": "Wikipedia language code"
    }
  },
  "required": ["title"]
}
```
**Output**:
```python
from pydantic import BaseModel


class WikipediaArticle(BaseModel):
    title: str
    content: str  # Full article text (cleaned)
    url: str
    last_modified: str  # ISO format
    page_id: int
    word_count: int
    sections: list[str]  # Section headings for reference
```
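The `word_count` and `sections` fields have to be derived from the article text. The sketch below shows one way to do that, assuming the cleaned text still marks headings in wiki style (for example `== History ==`), which is an assumption about the cleaning step rather than something the spec states. The helper names are hypothetical.

```python
import re

# Matches heading lines such as "== History ==" or "=== Early work ===".
# The closing run of "=" must have the same length as the opening run.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def extract_sections(text: str) -> list[str]:
    """Return section headings in document order."""
    return [match.group(2) for match in HEADING_RE.finditer(text)]


def count_words(text: str) -> int:
    """Count whitespace-delimited words, excluding heading lines."""
    body = HEADING_RE.sub("", text)
    return len(body.split())
```

For example, a text with an intro followed by `== History ==` and `=== Early work ===` yields the two headings in order, and the heading lines do not inflate the word count.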
### 3. `get_summary` - Quick Understanding Tool
**Purpose**: Get a concise summary without the full article
```python
@mcp.tool()
async def get_summary(
    title: str,
    language: str = "en"
) -> WikipediaSummary:
    """Get a concise summary of a Wikipedia article.

    Use this for quick understanding or when you need just the key facts
    without the full article content.
    """
```
**Why Essential**: Often Claude needs just the key facts, not 10,000 words. This saves time and context window space.
**Input Schema**:
```json
{
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "description": "Article title to summarize"
    },
    "language": {
      "type": "string",
      "default": "en",
      "description": "Wikipedia language code"
    }
  },
  "required": ["title"]
}
```
**Output**:
```python
from pydantic import BaseModel


class WikipediaSummary(BaseModel):
    title: str
    summary: str  # First paragraph + key points (max ~500 words)
    url: str
    key_facts: list[str]  # 3-5 bullet points of key information
    page_id: int
```
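The spec does not say how `summary` is capped or how `key_facts` are chosen. The sketch below is one naive possibility, offered only as an illustration: it truncates the text at a word budget and takes the leading sentences as key facts. Both helpers (`truncate_words`, `first_sentences`) and the sentence-splitting heuristic are assumptions, not the intended implementation.

```python
import re

MAX_SUMMARY_WORDS = 500  # mirrors the "max ~500 words" note on the model


def truncate_words(text: str, max_words: int = MAX_SUMMARY_WORDS) -> str:
    """Cap text at max_words words, marking the cut with an ellipsis."""
    words = text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " ..."


def first_sentences(text: str, count: int = 3) -> list[str]:
    """Naive key-fact heuristic: return the first `count` sentences.

    Splits after ., ! or ? followed by whitespace, so abbreviations
    such as "e.g." will produce spurious breaks.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s][:count]
```

A production version would likely need a better sentence splitter or a different source for key facts; the point here is only that both fields can be computed from the same extract.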
### 4. `find_related` - Discovery Extension Tool
**Purpose**: Find articles related to current research topic
```python
@mcp.tool()
async def find_related(
    title: str,
    limit: int = 5,
    language: str = "en"
) -> RelatedArticles:
    """Find articles related to the given article.

    Use this to expand research, find connected topics, or discover
    relevant context around your main research subject.
    """
```
**Why Essential**: Research is interconnected. This helps Claude discover the broader context and related topics.
**Input Schema**:
```json
{
  "type": "object",
  "properties": {
    "title": {
      "type": "string",
      "description": "Article title to find related articles for"
    },
    "limit": {
      "type": "integer",
      "default": 5,
      "minimum": 1,
      "maximum": 10,
      "description": "Number of related articles to return"
    },
    "language": {
      "type": "string",
      "default": "en",
      "description": "Wikipedia language code"
    }
  },
  "required": ["title"]
}
```
**Output**:
```python
from pydantic import BaseModel


class RelatedArticle(BaseModel):
    title: str
    snippet: str
    url: str
    page_id: int
    relation_type: str  # "linked_from", "linked_to", "category", "similar"


class RelatedArticles(BaseModel):
    source_title: str
    related: list[RelatedArticle]
    total_found: int
```
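Because `relation_type` can take several values, candidates from different sources have to be combined into one list that respects `limit`. The sketch below shows one possible merge strategy, assuming candidates arrive as plain dicts grouped by relation type: it interleaves the groups and skips duplicate pages. The function `merge_related` and the input shape are hypothetical, and a dataclass stands in for the Pydantic model.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class RelatedArticle:
    title: str
    snippet: str
    url: str
    page_id: int
    relation_type: str


def merge_related(groups: dict[str, list[dict]], limit: int = 5) -> list[RelatedArticle]:
    """Interleave candidates from each relation type, dropping duplicates.

    `groups` maps a relation type (e.g. "linked_to") to candidate dicts
    with at least "title" and "page_id" keys.
    """
    # Tag every candidate with the relation type it came from
    tagged = [
        [(relation, item) for item in items]
        for relation, items in groups.items()
    ]
    merged: list[RelatedArticle] = []
    seen: set[int] = set()
    # Take one candidate from each group per round so no group dominates
    for round_ in zip_longest(*tagged):
        for entry in round_:
            if entry is None:
                continue
            relation, item = entry
            if item["page_id"] in seen:
                continue
            seen.add(item["page_id"])
            merged.append(
                RelatedArticle(
                    title=item["title"],
                    snippet=item.get("snippet", ""),
                    url=item.get("url", ""),
                    page_id=item["page_id"],
                    relation_type=relation,
                )
            )
            if len(merged) >= limit:
                return merged
    return merged
```

When the same page appears under two relation types, this version keeps the first one encountered; ranking by relevance instead would be an equally reasonable choice.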
## Research Workflow Examples
### Typical Research Flow
1. **Start**: `search_wikipedia("quantum computing")` → Find relevant articles
2. **Explore**: `get_summary("Quantum computing")` → Quick understanding
3. **Deep Dive**: `get_article("Quantum computing")` → Full content if needed
4. **Expand**: `find_related("Quantum computing")` → Discover related topics
### Alternative Flow
1. **Direct**: `get_summary("Marie Curie")` → Quick facts about known topic
2. **Context**: `find_related("Marie Curie")` → Related scientists, discoveries
3. **Detail**: `get_article("Radium")` → Deep dive on specific discovery
## Technical Implementation
### MCP Server Structure
```python
from mcp.server.fastmcp import FastMCP
from pydantic import BaseModel
import httpx
import asyncio

# Initialize MCP server
mcp = FastMCP("Wikipedia Research Assistant")


# Wikipedia API client
class WikipediaClient:
    def __init__(self, language: str = "en"):
        # Each language edition is served from its own subdomain
        self.base_url = f"https://{language}.wikipedia.org/api/rest_v1"
        self.session = httpx.AsyncClient()

    async def search(self, query: str, limit: int = 5) -> dict:
        # Implementation here
        raise NotImplementedError

    async def get_page(self, title: str) -> dict:
        # Implementation here
        raise NotImplementedError


# Tool implementations would go here...
```
### Key MCP Concepts Demonstrated
1. **Tool Registration**: Using `@mcp.tool()` decorator
2. **Structured Input**: JSON schemas for validation
3. **Structured Output**: Pydantic models for type safety
4. **Async Operations**: Non-blocking Wikipedia API calls
5. **Error Handling**: Graceful failures with helpful messages
6. **Documentation**: Clear descriptions for Claude's understanding
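To make the structured-input and error-handling points concrete, here is a minimal sketch of argument checks that mirror the input schemas above (a `limit` between 1 and 10, a non-empty `query` or `title`). The `ToolInputError` class, the helper names, and the language-code pattern are assumptions made for illustration; the pattern is a loose shape check, not an authoritative list of Wikipedia editions.

```python
import re

# Loose shape check for codes such as "en", "simple", or "zh-min-nan".
# It also keeps dots and slashes out of the hostname built from the code.
LANGUAGE_RE = re.compile(r"^[a-z]{2,}(?:-[a-z0-9]+)*$")


class ToolInputError(ValueError):
    """Raised with a message that is safe to show to the caller."""


def validate_limit(limit: int, minimum: int = 1, maximum: int = 10) -> int:
    """Enforce the schema's minimum/maximum for `limit`."""
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise ToolInputError(f"limit must be an integer, got {type(limit).__name__}")
    if not minimum <= limit <= maximum:
        raise ToolInputError(
            f"limit must be between {minimum} and {maximum}, got {limit}"
        )
    return limit


def validate_language(language: str) -> str:
    """Normalize a language code and reject anything oddly shaped."""
    code = language.strip().lower()
    if not LANGUAGE_RE.match(code):
        raise ToolInputError(
            f"{language!r} does not look like a Wikipedia language code "
            "(expected something like 'en' or 'de')"
        )
    return code


def validate_text(value: str, field: str) -> str:
    """Reject empty or whitespace-only `query` and `title` values."""
    text = value.strip()
    if not text:
        raise ToolInputError(f"{field} must not be empty")
    return text
```

Calling these at the top of each tool gives Claude a specific message to act on (for example, retrying with a smaller `limit`) instead of an opaque upstream failure.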
### Dependencies (Minimal)
```toml
[tool.poetry.dependencies]
mcp = "^1.0.0"
httpx = "^0.27.0"
pydantic = "^2.0.0"
beautifulsoup4 = "^4.12.0" # For HTML cleaning
```
## Why These 4 Tools?
### Research Coverage
- **Discovery**: `search_wikipedia` finds unknown articles
- **Quick Facts**: `get_summary` provides rapid understanding
- **Deep Analysis**: `get_article` gives complete information
- **Context Expansion**: `find_related` broadens research scope
### MCP Learning Value
- **Simple Patterns**: Each tool demonstrates core MCP concepts
- **Real Value**: Immediately useful for actual research
- **Structured Data**: Shows proper input/output modeling
- **Async Design**: Demonstrates modern Python/MCP patterns
### Claude Effectiveness
- **Natural Workflow**: Mirrors how humans research
- **Context Efficiency**: Right amount of info for each need
- **Interconnected**: Tools work together seamlessly
- **Flexible**: Can adapt to many research styles
## Success Metrics
### Functional
- [ ] All 4 tools work reliably
- [ ] Proper structured input/output
- [ ] Clear error messages
- [ ] Fast response times (< 3 seconds)
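One way to hold tools to the response-time target and still return a clear error is to wrap each upstream call in a time budget. The sketch below uses the standard library's `asyncio.wait_for`; the wrapper name `with_budget`, the `ToolTimeoutError` class, and the wording of the message are hypothetical.

```python
import asyncio
from typing import Awaitable, TypeVar

T = TypeVar("T")

RESPONSE_BUDGET_SECONDS = 3.0  # mirrors the "< 3 seconds" target


class ToolTimeoutError(RuntimeError):
    """Raised when an upstream call exceeds its time budget."""


async def with_budget(
    call: Awaitable[T], what: str, seconds: float = RESPONSE_BUDGET_SECONDS
) -> T:
    """Await `call`, cancelling it if it runs longer than `seconds`."""
    try:
        return await asyncio.wait_for(call, timeout=seconds)
    except asyncio.TimeoutError:
        raise ToolTimeoutError(
            f"{what} did not finish within {seconds:g} seconds; "
            "try again or narrow the request"
        ) from None
```

A tool body could then await something like `with_budget(client.get_page(title), "Fetching the article")`, where `client` is whatever API client the server uses.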
### Learning
- [ ] Demonstrates core MCP patterns
- [ ] Easy to understand and extend
- [ ] Well-documented with examples
- [ ] Shows best practices
### Research Quality
- [ ] Claude can effectively research any topic that Wikipedia covers
- [ ] Natural discovery → summary → detail → related flow
- [ ] Handles both known and unknown topics
- [ ] Provides actionable, structured information
## Future Expansion Ideas
*Not in MVP, but easy to add later:*
- Categories and tags
- Image and media access
- Historical versions
- Cross-language support
- Citation extraction
- Export formats
---
This MVP balances simplicity with power, providing Claude with everything needed for effective Wikipedia research while demonstrating core MCP concepts clearly.