MCP Web Tools Server
by surya-madhav
- docs
# Extending the Repository with New Tools
This guide explains how to add new tools to the MCP repository. You'll learn best practices for tool design, implementation strategies, and integration techniques that maintain the repository's modular structure.
## Understanding the Repository Structure
Before adding new tools, it's important to understand the existing structure:
```
/MCP/
├── LICENSE
├── README.md
├── requirements.txt
├── server.py
├── streamlit_app.py
├── run.sh
├── run.bat
├── tools/
│ ├── __init__.py
│ └── web_scrape.py
└── docs/
└── *.md
```
Key components:
1. **server.py**: The main MCP server that registers and exposes tools
2. **tools/**: Directory containing individual tool implementations
3. **streamlit_app.py**: UI for interacting with MCP servers
4. **requirements.txt**: Python dependencies
5. **run.sh/run.bat**: Convenience scripts for running the server or UI
## Planning Your New Tool
Before implementation, plan your tool carefully:
### 1. Define the Purpose
Clearly define what your tool will do:
- What problem does it solve?
- How does it extend the capabilities of an LLM?
- Does it retrieve information, process data, or perform actions?
### 2. Choose a Tool Type
MCP supports different types of tools:
- **Information retrieval tools**: Fetch information from external sources
- **Processing tools**: Transform or analyze data
- **Action tools**: Perform operations with side effects
- **Integration tools**: Connect to external services or APIs
### 3. Design the Interface
Consider the tool's interface:
- What parameters does it need?
- What will it return?
- How will it handle errors?
- What schema will describe it?
Example interface design:
```
Tool: search_news
Purpose: Search for recent news articles by keyword
Parameters:
- query (string): Search query
- days (int, optional): How recent the news should be (default: 7)
- limit (int, optional): Maximum number of results (default: 5)
Returns:
- List of articles with titles, sources, and summaries
Errors:
- Handle API timeouts
- Handle rate limiting
- Handle empty results
```
## Implementing Your Tool
Now that you've planned your tool, it's time to implement it.
### 1. Create a New Tool Module
Create a new Python file in the `tools` directory:
```bash
touch tools/my_new_tool.py
```
### 2. Implement the Tool Function
Write the core functionality in your new tool file:
```python
# tools/my_new_tool.py
"""
MCP tool for [description of your tool].
"""
import httpx
import asyncio
import json
from typing import List, Dict, Any, Optional
async def search_news(query: str, days: int = 7, limit: int = 5) -> List[Dict[str, Any]]:
"""
Search for recent news articles based on a query.
Args:
query: Search terms
days: How recent the news should be (in days)
limit: Maximum number of results to return
Returns:
List of news articles with title, source, and summary
"""
# Implementation details
try:
# API call
async with httpx.AsyncClient() as client:
response = await client.get(
"https://newsapi.example.com/v2/everything",
params={
"q": query,
"from": f"-{days}d",
"pageSize": limit,
"apiKey": "YOUR_API_KEY" # In production, use environment variables
}
)
response.raise_for_status()
data = response.json()
# Process and return results
articles = data.get("articles", [])
results = []
for article in articles[:limit]:
results.append({
"title": article.get("title", "No title"),
"source": article.get("source", {}).get("name", "Unknown source"),
"url": article.get("url", ""),
"summary": article.get("description", "No description")
})
return results
except httpx.HTTPStatusError as e:
# Handle API errors
return [{"error": f"API error: {e.response.status_code}"}]
except httpx.RequestError as e:
# Handle connection errors
return [{"error": f"Connection error: {str(e)}"}]
except Exception as e:
# Handle unexpected errors
return [{"error": f"Unexpected error: {str(e)}"}]
# For testing outside of MCP
if __name__ == "__main__":
async def test():
results = await search_news("python programming")
print(json.dumps(results, indent=2))
asyncio.run(test())
```
### 3. Add Required Dependencies
If your tool needs additional dependencies, add them to the requirements.txt file:
```bash
# Add to requirements.txt
httpx>=0.24.0
dateutil>=2.8.2
```
### 4. Register the Tool in the Server
Update the main server.py file to import and register your new tool:
```python
# server.py
from mcp.server.fastmcp import FastMCP
# Import existing tools
from tools.web_scrape import fetch_url_as_markdown
# Import your new tool
from tools.my_new_tool import search_news
# Create an MCP server
mcp = FastMCP("Web Tools")
# Register existing tools
@mcp.tool()
async def web_scrape(url: str) -> str:
"""
Convert a URL to use r.jina.ai as a prefix and fetch the markdown content.
Args:
url (str): The URL to convert and fetch.
Returns:
str: The markdown content if successful, or an error message if not.
"""
return await fetch_url_as_markdown(url)
# Register your new tool
@mcp.tool()
async def news_search(query: str, days: int = 7, limit: int = 5) -> str:
"""
Search for recent news articles based on a query.
Args:
query: Search terms
days: How recent the news should be (in days, default: 7)
limit: Maximum number of results to return (default: 5)
Returns:
Formatted text with news article information
"""
articles = await search_news(query, days, limit)
# Format the results as text
if articles and "error" in articles[0]:
return articles[0]["error"]
if not articles:
return "No news articles found for the given query."
results = []
for i, article in enumerate(articles, 1):
results.append(f"## {i}. {article['title']}")
results.append(f"Source: {article['source']}")
results.append(f"URL: {article['url']}")
results.append(f"\n{article['summary']}\n")
return "\n".join(results)
if __name__ == "__main__":
mcp.run()
```
## Best Practices for Tool Implementation
### Error Handling
Robust error handling is essential for reliable tools:
```python
try:
# Operation that might fail
result = await perform_operation()
return result
except SpecificError as e:
# Handle specific error cases
return f"Operation failed: {str(e)}"
except Exception as e:
# Catch-all for unexpected errors
logging.error(f"Unexpected error: {str(e)}")
return "An unexpected error occurred. Please try again later."
```
### Input Validation
Validate inputs before processing:
```python
def validate_search_params(query: str, days: int, limit: int) -> Optional[str]:
"""Validate search parameters and return error message if invalid."""
if not query or len(query.strip()) == 0:
return "Search query cannot be empty"
if days < 1 or days > 30:
return "Days must be between 1 and 30"
if limit < 1 or limit > 100:
return "Limit must be between 1 and 100"
return None
# In the tool function
error = validate_search_params(query, days, limit)
if error:
return error
```
### Security Considerations
Implement security best practices:
```python
# Sanitize inputs
def sanitize_query(query: str) -> str:
"""Remove potentially dangerous characters from query."""
import re
return re.sub(r'[^\w\s\-.,?!]', '', query)
# Use environment variables for secrets
import os
api_key = os.environ.get("NEWS_API_KEY")
if not api_key:
return "API key not configured. Please set the NEWS_API_KEY environment variable."
# Implement rate limiting
from functools import lru_cache
import time
@lru_cache(maxsize=100)
def get_last_call_time():
return time.time()
def respect_rate_limit(min_interval=1.0):
"""Ensure minimum time between API calls."""
last_call = get_last_call_time()
now = time.time()
if now - last_call < min_interval:
time.sleep(min_interval - (now - last_call))
get_last_call_time.cache_clear()
get_last_call_time()
```
### Docstrings and Comments
Write clear documentation:
```python
async def translate_text(text: str, target_language: str) -> str:
"""
Translate text to another language.
This tool uses an external API to translate text from one language to another.
It automatically detects the source language and translates to the specified
target language.
Args:
text: The text to translate
target_language: ISO 639-1 language code (e.g., 'es' for Spanish)
Returns:
Translated text in the target language
Raises:
ValueError: If the target language is not supported
"""
# Implementation
```
### Testing
Include tests for your tools:
```python
# tools/tests/test_my_new_tool.py
import pytest
import asyncio
from tools.my_new_tool import search_news
@pytest.mark.asyncio
async def test_search_news_valid_query():
"""Test search_news with a valid query."""
results = await search_news("test query")
assert isinstance(results, list)
assert len(results) > 0
@pytest.mark.asyncio
async def test_search_news_empty_query():
"""Test search_news with an empty query."""
results = await search_news("")
assert isinstance(results, list)
assert "error" in results[0]
# Run tests
if __name__ == "__main__":
asyncio.run(pytest.main(["-xvs", "test_my_new_tool.py"]))
```
## Managing Tool Configurations
For tools that require configuration, follow these practices:
### Environment Variables
Use environment variables for configuration:
```python
# tools/my_new_tool.py
import os
API_KEY = os.environ.get("MY_TOOL_API_KEY")
BASE_URL = os.environ.get("MY_TOOL_BASE_URL", "https://api.default.com")
```
### Configuration Files
For more complex configurations, use configuration files:
```python
# tools/config.py
import json
import os
from pathlib import Path
def load_config(tool_name):
"""Load tool-specific configuration."""
config_dir = Path(os.environ.get("MCP_CONFIG_DIR", "~/.mcp")).expanduser()
config_path = config_dir / f"{tool_name}.json"
if not config_path.exists():
return {}
try:
with open(config_path, "r") as f:
return json.load(f)
except Exception as e:
print(f"Error loading config: {str(e)}")
return {}
# In your tool file
from tools.config import load_config
config = load_config("my_new_tool")
api_key = config.get("api_key", os.environ.get("MY_TOOL_API_KEY", ""))
```
## Advanced Tool Patterns
### Composition
Compose multiple tools for complex functionality:
```python
async def search_and_summarize(query: str) -> str:
"""Search for news and summarize the results."""
# First search for news
articles = await search_news(query, days=3, limit=3)
if not articles or "error" in articles[0]:
return "Failed to find news articles."
# Then summarize each article
summaries = []
for article in articles:
summary = await summarize_text(article["summary"])
summaries.append(f"Title: {article['title']}\nSummary: {summary}")
return "\n\n".join(summaries)
```
### Stateful Tools
For tools that need to maintain state:
```python
# tools/stateful_tool.py
from typing import Dict, Any
import json
import os
from pathlib import Path
class SessionStore:
"""Simple file-based session store."""
def __init__(self, tool_name):
self.storage_dir = Path(os.environ.get("MCP_STORAGE_DIR", "~/.mcp/storage")).expanduser()
self.storage_dir.mkdir(parents=True, exist_ok=True)
self.tool_name = tool_name
self.sessions: Dict[str, Dict[str, Any]] = {}
self._load()
def _get_storage_path(self):
return self.storage_dir / f"{self.tool_name}_sessions.json"
def _load(self):
path = self._get_storage_path()
if path.exists():
try:
with open(path, "r") as f:
self.sessions = json.load(f)
except Exception:
self.sessions = {}
def _save(self):
with open(self._get_storage_path(), "w") as f:
json.dump(self.sessions, f, indent=2)
def get(self, session_id, key, default=None):
session = self.sessions.get(session_id, {})
return session.get(key, default)
def set(self, session_id, key, value):
if session_id not in self.sessions:
self.sessions[session_id] = {}
self.sessions[session_id][key] = value
self._save()
def clear(self, session_id):
if session_id in self.sessions:
del self.sessions[session_id]
self._save()
# Usage in a tool
from tools.stateful_tool import SessionStore
# Initialize store
session_store = SessionStore("conversation")
async def remember_fact(session_id: str, fact: str) -> str:
"""Remember a fact for later recall."""
facts = session_store.get(session_id, "facts", [])
facts.append(fact)
session_store.set(session_id, "facts", facts)
return f"I'll remember that: {fact}"
async def recall_facts(session_id: str) -> str:
"""Recall previously stored facts."""
facts = session_store.get(session_id, "facts", [])
if not facts:
return "I don't have any facts stored for this session."
return "Here are the facts I remember:\n- " + "\n- ".join(facts)
```
### Long-Running Operations
For tools that take time to complete:
```python
from mcp.server.fastmcp import FastMCP, Context
@mcp.tool()
async def process_large_dataset(dataset_url: str, ctx: Context) -> str:
"""Process a large dataset with progress reporting."""
try:
# Download dataset
ctx.info(f"Downloading dataset from {dataset_url}")
await ctx.report_progress(10)
# Process in chunks
total_chunks = 10
for i in range(total_chunks):
ctx.info(f"Processing chunk {i+1}/{total_chunks}")
# Process chunk
await asyncio.sleep(1) # Simulate work
await ctx.report_progress(10 + (i+1) * 80 // total_chunks)
# Finalize
ctx.info("Finalizing results")
await ctx.report_progress(90)
await asyncio.sleep(1) # Simulate work
# Complete
await ctx.report_progress(100)
return "Dataset processing complete. Found 42 insights."
except Exception as e:
ctx.info(f"Error: {str(e)}")
return f"Processing failed: {str(e)}"
```
## Adding a Resource
In addition to tools, you might want to add a resource to your MCP server:
```python
# server.py
@mcp.resource("weather://{location}")
async def get_weather(location: str) -> str:
"""
Get weather information for a location.
Args:
location: City name or coordinates
Returns:
Weather information as text
"""
try:
# Fetch weather data
async with httpx.AsyncClient() as client:
response = await client.get(
f"https://api.weatherapi.com/v1/current.json",
params={
"q": location,
"key": os.environ.get("WEATHER_API_KEY", "")
}
)
response.raise_for_status()
data = response.json()
# Format weather data
location_data = data.get("location", {})
current_data = data.get("current", {})
weather_info = f"""
Weather for {location_data.get('name', location)}, {location_data.get('country', '')}
Temperature: {current_data.get('temp_c', 'N/A')}°C / {current_data.get('temp_f', 'N/A')}°F
Condition: {current_data.get('condition', {}).get('text', 'N/A')}
Wind: {current_data.get('wind_kph', 'N/A')} kph, {current_data.get('wind_dir', 'N/A')}
Humidity: {current_data.get('humidity', 'N/A')}%
Updated: {current_data.get('last_updated', 'N/A')}
"""
return weather_info
except Exception as e:
return f"Error fetching weather: {str(e)}"
```
## Adding a Prompt
You can also add a prompt to your MCP server:
```python
# server.py
@mcp.prompt()
def analyze_sentiment(text: str) -> str:
"""
Create a prompt for sentiment analysis.
Args:
text: The text to analyze
Returns:
A prompt for sentiment analysis
"""
return f"""
Please analyze the sentiment of the following text and categorize it as positive, negative, or neutral.
Provide a brief explanation for your categorization and highlight key phrases that indicate the sentiment.
Text to analyze:
{text}
Your analysis:
"""
```
## Conclusion
Extending the MCP repository with new tools is a powerful way to enhance the capabilities of LLMs. By following the patterns and practices outlined in this guide, you can create robust, reusable tools that integrate seamlessly with the existing repository structure.
Remember these key principles:
1. **Plan before coding**: Define the purpose and interface of your tool
2. **Follow best practices**: Implement proper error handling, input validation, and security
3. **Document thoroughly**: Write clear docstrings and comments
4. **Test rigorously**: Create tests for your tools
5. **Consider configurations**: Use environment variables or configuration files
6. **Explore advanced patterns**: Implement composition, state, and long-running operations as needed
In the next document, we'll explore example use cases for your MCP server and tools.