MCP Web Tools Server

MIT License
# Extending the Repository with New Tools

This guide explains how to add new tools to the MCP repository. You'll learn best practices for tool design, implementation strategies, and integration techniques that maintain the repository's modular structure.

## Understanding the Repository Structure

Before adding new tools, it's important to understand the existing structure:

```
/MCP/
├── LICENSE
├── README.md
├── requirements.txt
├── server.py
├── streamlit_app.py
├── run.sh
├── run.bat
├── tools/
│   ├── __init__.py
│   └── web_scrape.py
└── docs/
    └── *.md
```

Key components:

1. **server.py**: The main MCP server that registers and exposes tools
2. **tools/**: Directory containing individual tool implementations
3. **streamlit_app.py**: UI for interacting with MCP servers
4. **requirements.txt**: Python dependencies
5. **run.sh/run.bat**: Convenience scripts for running the server or UI

## Planning Your New Tool

Before implementation, plan your tool carefully:

### 1. Define the Purpose

Clearly define what your tool will do:

- What problem does it solve?
- How does it extend the capabilities of an LLM?
- Does it retrieve information, process data, or perform actions?

### 2. Choose a Tool Type

MCP does not formally distinguish between kinds of tools, but most tools fall into one of these categories:

- **Information retrieval tools**: Fetch information from external sources
- **Processing tools**: Transform or analyze data
- **Action tools**: Perform operations with side effects
- **Integration tools**: Connect to external services or APIs

### 3. Design the Interface

Consider the tool's interface:

- What parameters does it need?
- What will it return?
- How will it handle errors?
- What schema will describe it?

Example interface design:

```
Tool: search_news
Purpose: Search for recent news articles by keyword
Parameters:
  - query (string): Search query
  - days (int, optional): How recent the news should be (default: 7)
  - limit (int, optional): Maximum number of results (default: 5)
Returns:
  - List of articles with titles, sources, and summaries
Errors:
  - Handle API timeouts
  - Handle rate limiting
  - Handle empty results
```
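To make the schema question concrete, it can help to write the input schema by hand before coding. The sketch below expresses the `search_news` design as a JSON Schema-style dictionary and adds a small helper that applies the defaults. It is a planning aid only: FastMCP generates the real schema from the function's type hints, and the `apply_defaults` helper is a hypothetical name introduced for illustration.

```python
import json

# Hand-written planning schema for the search_news design (illustrative only).
SEARCH_NEWS_SCHEMA = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Search query"},
        "days": {
            "type": "integer",
            "default": 7,
            "description": "How recent the news should be",
        },
        "limit": {
            "type": "integer",
            "default": 5,
            "description": "Maximum number of results",
        },
    },
    "required": ["query"],
}


def apply_defaults(arguments: dict) -> dict:
    """Fill in defaults for optional parameters and reject missing required ones."""
    missing = [name for name in SEARCH_NEWS_SCHEMA["required"] if name not in arguments]
    if missing:
        raise ValueError(f"Missing required parameter(s): {', '.join(missing)}")
    # Start from the declared defaults, then overlay what the caller supplied.
    filled = {
        name: spec["default"]
        for name, spec in SEARCH_NEWS_SCHEMA["properties"].items()
        if "default" in spec
    }
    filled.update(arguments)
    return filled


if __name__ == "__main__":
    print(json.dumps(apply_defaults({"query": "python"}), indent=2))
```

Writing the schema first tends to surface open questions early, such as which parameters are truly required and what the defaults should be.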

## Implementing Your Tool

Now that you've planned your tool, it's time to implement it.

### 1. Create a New Tool Module

Create a new Python file in the `tools` directory:

```bash
touch tools/my_new_tool.py
```

### 2. Implement the Tool Function

Write the core functionality in your new tool file:

```python
# tools/my_new_tool.py
"""
MCP tool for [description of your tool].
"""

import httpx
import asyncio
import json
from typing import List, Dict, Any, Optional


async def search_news(query: str, days: int = 7, limit: int = 5) -> List[Dict[str, Any]]:
    """
    Search for recent news articles based on a query.
    
    Args:
        query: Search terms
        days: How recent the news should be (in days)
        limit: Maximum number of results to return
        
    Returns:
        List of news articles with title, source, and summary
    """
    # Implementation details
    try:
        # API call
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://newsapi.example.com/v2/everything",
                params={
                    "q": query,
                    "from": f"-{days}d",
                    "pageSize": limit,
                    "apiKey": "YOUR_API_KEY"  # In production, use environment variables
                }
            )
            response.raise_for_status()
            data = response.json()
            
            # Process and return results
            articles = data.get("articles", [])
            results = []
            
            for article in articles[:limit]:
                results.append({
                    "title": article.get("title", "No title"),
                    "source": article.get("source", {}).get("name", "Unknown source"),
                    "url": article.get("url", ""),
                    "summary": article.get("description", "No description")
                })
                
            return results
            
    except httpx.HTTPStatusError as e:
        # Handle API errors
        return [{"error": f"API error: {e.response.status_code}"}]
    except httpx.RequestError as e:
        # Handle connection errors
        return [{"error": f"Connection error: {str(e)}"}]
    except Exception as e:
        # Handle unexpected errors
        return [{"error": f"Unexpected error: {str(e)}"}]


# For testing outside of MCP
if __name__ == "__main__":
    async def test():
        results = await search_news("python programming")
        print(json.dumps(results, indent=2))
    
    asyncio.run(test())
```

### 3. Add Required Dependencies

If your tool needs additional dependencies, add them to the requirements.txt file:

```bash
# Add to requirements.txt
httpx>=0.24.0
python-dateutil>=2.8.2
```

### 4. Register the Tool in the Server

Update the main server.py file to import and register your new tool:

```python
# server.py
from mcp.server.fastmcp import FastMCP

# Import existing tools
from tools.web_scrape import fetch_url_as_markdown

# Import your new tool
from tools.my_new_tool import search_news

# Create an MCP server
mcp = FastMCP("Web Tools")

# Register existing tools
@mcp.tool()
async def web_scrape(url: str) -> str:
    """
    Convert a URL to use r.jina.ai as a prefix and fetch the markdown content.
    
    Args:
        url (str): The URL to convert and fetch.
        
    Returns:
        str: The markdown content if successful, or an error message if not.
    """
    return await fetch_url_as_markdown(url)

# Register your new tool
@mcp.tool()
async def news_search(query: str, days: int = 7, limit: int = 5) -> str:
    """
    Search for recent news articles based on a query.
    
    Args:
        query: Search terms
        days: How recent the news should be (in days, default: 7)
        limit: Maximum number of results to return (default: 5)
        
    Returns:
        Formatted text with news article information
    """
    articles = await search_news(query, days, limit)
    
    # Format the results as text
    if articles and "error" in articles[0]:
        return articles[0]["error"]
    
    if not articles:
        return "No news articles found for the given query."
    
    results = []
    for i, article in enumerate(articles, 1):
        results.append(f"## {i}. {article['title']}")
        results.append(f"Source: {article['source']}")
        results.append(f"URL: {article['url']}")
        results.append(f"\n{article['summary']}\n")
    
    return "\n".join(results)

if __name__ == "__main__":
    mcp.run()
```

## Best Practices for Tool Implementation

### Error Handling

Robust error handling is essential for reliable tools:

```python
import logging

# Fragment from inside a tool function.
# perform_operation and SpecificError are placeholders for your own code.
try:
    # Operation that might fail
    result = await perform_operation()
    return result
except SpecificError as e:
    # Handle specific error cases
    return f"Operation failed: {str(e)}"
except Exception as e:
    # Catch-all for unexpected errors
    logging.error(f"Unexpected error: {str(e)}")
    return "An unexpected error occurred. Please try again later."
```
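If several tools share the same error-handling policy, the try/except pattern can be factored into a decorator. The following is a minimal sketch, not part of the repository: `safe_tool` and `divide` are hypothetical names, and treating `ValueError` as the "expected" error class is an assumption made for the example.

```python
import asyncio
import functools
import logging

logger = logging.getLogger(__name__)


def safe_tool(func):
    """Wrap an async tool so exceptions become readable error strings."""

    @functools.wraps(func)  # keep the original name and docstring
    async def wrapper(*args, **kwargs):
        try:
            return await func(*args, **kwargs)
        except ValueError as e:
            # Expected, user-correctable problems: report the message directly.
            return f"Invalid input: {e}"
        except Exception:
            # Unexpected problems: log the traceback, return a generic message.
            logger.exception("Unexpected error in %s", func.__name__)
            return "An unexpected error occurred. Please try again later."

    return wrapper


@safe_tool
async def divide(a: float, b: float) -> str:
    """Toy tool used only to exercise the decorator."""
    if b == 0:
        raise ValueError("b must not be zero")
    return str(a / b)


if __name__ == "__main__":
    print(asyncio.run(divide(6, 3)))
    print(asyncio.run(divide(1, 0)))
```

The decorator keeps each tool body focused on its own logic while guaranteeing that the caller always receives a string rather than an unhandled exception.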

### Input Validation

Validate inputs before processing:

```python
from typing import Optional

def validate_search_params(query: str, days: int, limit: int) -> Optional[str]:
    """Validate search parameters and return error message if invalid."""
    if not query or not query.strip():
        return "Search query cannot be empty"
    
    if days < 1 or days > 30:
        return "Days must be between 1 and 30"
    
    if limit < 1 or limit > 100:
        return "Limit must be between 1 and 100"
    
    return None

# In the tool function
error = validate_search_params(query, days, limit)
if error:
    return error
```

### Security Considerations

Implement security best practices:

```python
import os
import re
import time

# Sanitize inputs
def sanitize_query(query: str) -> str:
    """Remove potentially dangerous characters from query."""
    return re.sub(r'[^\w\s\-.,?!]', '', query)

# Use environment variables for secrets
def get_api_key() -> str:
    """Read the API key from the environment and fail clearly if it is missing."""
    api_key = os.environ.get("NEWS_API_KEY")
    if not api_key:
        raise RuntimeError(
            "API key not configured. Please set the NEWS_API_KEY environment variable."
        )
    return api_key

# Implement rate limiting
_last_call_time = None  # monotonic timestamp of the previous API call

def respect_rate_limit(min_interval: float = 1.0) -> None:
    """Ensure minimum time between API calls."""
    global _last_call_time
    now = time.monotonic()
    if _last_call_time is not None and now - _last_call_time < min_interval:
        # Note: time.sleep blocks the calling thread, including an event loop
        time.sleep(min_interval - (now - _last_call_time))
    _last_call_time = time.monotonic()
```
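The `respect_rate_limit` helper above relies on `time.sleep`, which blocks the event loop when called from an `async` tool. A non-blocking variant might look like the sketch below. `AsyncRateLimiter` is a hypothetical class, not part of the repository, and it only limits calls within a single process.

```python
import asyncio
import time
from typing import Optional


class AsyncRateLimiter:
    """Allow at most one call per `min_interval` seconds without blocking the event loop."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._lock = asyncio.Lock()  # serializes concurrent callers
        self._last_call: Optional[float] = None  # monotonic time of the previous call

    async def wait(self) -> None:
        async with self._lock:
            if self._last_call is not None:
                remaining = self.min_interval - (time.monotonic() - self._last_call)
                if remaining > 0:
                    # Yields control to other tasks instead of blocking the thread.
                    await asyncio.sleep(remaining)
            self._last_call = time.monotonic()


async def demo() -> float:
    # Create the limiter inside the running loop so the lock binds to it.
    limiter = AsyncRateLimiter(min_interval=0.2)
    start = time.monotonic()
    for _ in range(3):
        await limiter.wait()  # call this right before each API request
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"3 calls took {asyncio.run(demo()):.2f}s")
```

The first call passes immediately; each later call waits only for whatever part of the interval has not yet elapsed.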

### Docstrings and Comments

Write clear documentation:

```python
async def translate_text(text: str, target_language: str) -> str:
    """
    Translate text to another language.
    
    This tool uses an external API to translate text from one language to another.
    It automatically detects the source language and translates to the specified
    target language.
    
    Args:
        text: The text to translate
        target_language: ISO 639-1 language code (e.g., 'es' for Spanish)
        
    Returns:
        Translated text in the target language
        
    Raises:
        ValueError: If the target language is not supported
    """
    # Implementation
```

### Testing

Include tests for your tools:

```python
# tools/tests/test_my_new_tool.py
# Requires the pytest-asyncio plugin for @pytest.mark.asyncio.
import pytest
from tools.my_new_tool import search_news

@pytest.mark.asyncio
async def test_search_news_valid_query():
    """Test search_news with a valid query (needs network access and a working API key)."""
    results = await search_news("test query")
    assert isinstance(results, list)
    assert len(results) > 0

@pytest.mark.asyncio
async def test_search_news_empty_query():
    """Test search_news with an empty query."""
    results = await search_news("")
    assert isinstance(results, list)
    assert "error" in results[0]

# Run tests
if __name__ == "__main__":
    # pytest.main() is synchronous and returns an exit code; do not wrap it in asyncio.run()
    raise SystemExit(pytest.main(["-xvs", __file__]))
```
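The tests above call the real API, so they depend on network access and a valid key. One way to get fast, deterministic tests is to move the response parsing into a pure function and test that separately. The sketch below assumes such a refactoring: `parse_articles` is a hypothetical helper that mirrors the parsing loop in `search_news`, not a function that exists in the repository.

```python
from typing import Any, Dict, List


def parse_articles(data: Dict[str, Any], limit: int) -> List[Dict[str, str]]:
    """Turn a raw API payload into the simplified article dictionaries."""
    results = []
    for article in data.get("articles", [])[:limit]:
        source = article.get("source") or {}
        results.append({
            "title": article.get("title") or "No title",
            "source": source.get("name") or "Unknown source",
            "url": article.get("url") or "",
            "summary": article.get("description") or "No description",
        })
    return results


def test_parse_articles_fills_defaults():
    """Missing fields fall back to placeholder values."""
    payload = {"articles": [{"title": "A", "source": {"name": "S"}}, {}]}
    assert parse_articles(payload, limit=5) == [
        {"title": "A", "source": "S", "url": "", "summary": "No description"},
        {"title": "No title", "source": "Unknown source", "url": "",
         "summary": "No description"},
    ]


def test_parse_articles_respects_limit():
    """No more than `limit` articles are returned."""
    payload = {"articles": [{"title": str(i)} for i in range(10)]}
    assert len(parse_articles(payload, limit=3)) == 3
```

With the parsing covered this way, only a thin layer of network code remains that needs an integration test.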

## Managing Tool Configurations

For tools that require configuration, follow these practices:

### Environment Variables

Use environment variables for configuration:

```python
# tools/my_new_tool.py
import os

API_KEY = os.environ.get("MY_TOOL_API_KEY")
BASE_URL = os.environ.get("MY_TOOL_BASE_URL", "https://api.default.com")
```
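Reading variables with `os.environ.get` silently yields `None` when a setting is missing, and every value arrives as a string. A small pair of helpers can make both cases explicit. This is a sketch under assumptions: `ConfigError`, `env_str`, `env_int`, and the `MY_TOOL_TIMEOUT` variable are hypothetical names introduced for illustration.

```python
import os
from typing import Optional


class ConfigError(RuntimeError):
    """Raised when a required setting is missing or malformed."""


def env_str(name: str, default: Optional[str] = None) -> str:
    """Read a string setting; raise if it is unset or empty and has no default."""
    value = os.environ.get(name, default)
    if not value:
        raise ConfigError(f"Environment variable {name} is not set or empty")
    return value


def env_int(name: str, default: int) -> int:
    """Read an integer setting, falling back to `default` when unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    try:
        return int(raw)
    except ValueError:
        raise ConfigError(f"{name} must be an integer, got {raw!r}") from None


if __name__ == "__main__":
    os.environ["MY_TOOL_TIMEOUT"] = "15"  # hypothetical variable for the demo
    print(env_int("MY_TOOL_TIMEOUT", default=10))
    print(env_str("MY_TOOL_BASE_URL", "https://api.default.com"))
```

Failing at startup with a named variable is usually easier to debug than a `None` that surfaces later inside an HTTP call.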

### Configuration Files

For more complex configurations, use configuration files:

```python
# tools/config.py
import json
import os
from pathlib import Path

def load_config(tool_name):
    """Load tool-specific configuration."""
    config_dir = Path(os.environ.get("MCP_CONFIG_DIR", "~/.mcp")).expanduser()
    config_path = config_dir / f"{tool_name}.json"
    
    if not config_path.exists():
        return {}
    
    try:
        with open(config_path, "r") as f:
            return json.load(f)
    except Exception as e:
        print(f"Error loading config: {str(e)}")
        return {}

# In your tool file
import os

from tools.config import load_config

config = load_config("my_new_tool")
api_key = config.get("api_key", os.environ.get("MY_TOOL_API_KEY", ""))
```
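When a tool has several settings, the per-key lookup above can be generalized into one function that layers the sources. The sketch below mirrors the same precedence (config file first, then environment variable, then built-in default). `resolve_settings` and the `MY_TOOL_` prefix convention are assumptions made for the example, not part of the repository.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Dict


def resolve_settings(
    defaults: Dict[str, Any], config_path: Path, env_prefix: str
) -> Dict[str, Any]:
    """Merge settings: built-in defaults < environment variables < config file."""
    settings = dict(defaults)

    # Environment variables override defaults, e.g. MY_TOOL_API_KEY for "api_key".
    for key in defaults:
        env_name = f"{env_prefix}{key.upper()}"
        if env_name in os.environ:
            settings[key] = os.environ[env_name]

    # Values from the config file take precedence over everything else.
    if config_path.exists():
        with open(config_path, "r", encoding="utf-8") as f:
            settings.update(json.load(f))

    return settings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "my_new_tool.json"
        path.write_text(json.dumps({"base_url": "https://file.example"}), encoding="utf-8")
        os.environ["MY_TOOL_API_KEY"] = "from-env"
        defaults = {"api_key": "", "base_url": "https://api.default.com", "timeout": 10}
        print(resolve_settings(defaults, path, "MY_TOOL_"))
```

Note that values read from the environment stay strings; convert them explicitly if a setting such as `timeout` must be numeric.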

## Advanced Tool Patterns

### Composition

Compose multiple tools for complex functionality:

```python
# summarize_text is assumed to be another tool function defined elsewhere
async def search_and_summarize(query: str) -> str:
    """Search for news and summarize the results."""
    # First search for news
    articles = await search_news(query, days=3, limit=3)
    
    if not articles or "error" in articles[0]:
        return "Failed to find news articles."
    
    # Then summarize each article
    summaries = []
    for article in articles:
        summary = await summarize_text(article["summary"])
        summaries.append(f"Title: {article['title']}\nSummary: {summary}")
    
    return "\n\n".join(summaries)
```
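The loop above awaits each summary in turn. When the inner calls are independent, they can run concurrently with `asyncio.gather`. The following sketch uses stand-in functions (`fake_search_news` and `fake_summarize_text`, both hypothetical) so it runs without any external service.

```python
import asyncio
from typing import Dict, List


async def fake_search_news(query: str, limit: int = 3) -> List[Dict[str, str]]:
    """Stand-in for search_news that returns canned articles."""
    return [
        {"title": f"{query} story {i}", "summary": f"Details about {query} #{i}. More text."}
        for i in range(1, limit + 1)
    ]


async def fake_summarize_text(text: str) -> str:
    """Stand-in summarizer: keeps only the first sentence."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return text.split(". ")[0].rstrip(".") + "."


async def search_and_summarize_concurrently(query: str) -> str:
    """Search once, then summarize all articles at the same time."""
    articles = await fake_search_news(query)
    # gather() runs the summarizer calls concurrently and preserves input order.
    summaries = await asyncio.gather(
        *(fake_summarize_text(article["summary"]) for article in articles)
    )
    return "\n\n".join(
        f"Title: {article['title']}\nSummary: {summary}"
        for article, summary in zip(articles, summaries)
    )


if __name__ == "__main__":
    print(asyncio.run(search_and_summarize_concurrently("mcp")))
```

Concurrency helps most when each inner call is dominated by network latency; if the downstream service is rate limited, the sequential loop may be the safer choice.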

### Stateful Tools

For tools that need to maintain state:

```python
# tools/stateful_tool.py
from typing import Dict, Any
import json
import os
from pathlib import Path

class SessionStore:
    """Simple file-based session store."""
    
    def __init__(self, tool_name):
        self.storage_dir = Path(os.environ.get("MCP_STORAGE_DIR", "~/.mcp/storage")).expanduser()
        self.storage_dir.mkdir(parents=True, exist_ok=True)
        self.tool_name = tool_name
        self.sessions: Dict[str, Dict[str, Any]] = {}
        self._load()
    
    def _get_storage_path(self):
        return self.storage_dir / f"{self.tool_name}_sessions.json"
    
    def _load(self):
        path = self._get_storage_path()
        if path.exists():
            try:
                with open(path, "r") as f:
                    self.sessions = json.load(f)
            except Exception:
                self.sessions = {}
    
    def _save(self):
        with open(self._get_storage_path(), "w") as f:
            json.dump(self.sessions, f, indent=2)
    
    def get(self, session_id, key, default=None):
        session = self.sessions.get(session_id, {})
        return session.get(key, default)
    
    def set(self, session_id, key, value):
        if session_id not in self.sessions:
            self.sessions[session_id] = {}
        self.sessions[session_id][key] = value
        self._save()
    
    def clear(self, session_id):
        if session_id in self.sessions:
            del self.sessions[session_id]
            self._save()

# Usage in a tool
from tools.stateful_tool import SessionStore

# Initialize store
session_store = SessionStore("conversation")

async def remember_fact(session_id: str, fact: str) -> str:
    """Remember a fact for later recall."""
    facts = session_store.get(session_id, "facts", [])
    facts.append(fact)
    session_store.set(session_id, "facts", facts)
    return f"I'll remember that: {fact}"

async def recall_facts(session_id: str) -> str:
    """Recall previously stored facts."""
    facts = session_store.get(session_id, "facts", [])
    if not facts:
        return "I don't have any facts stored for this session."
    
    return "Here are the facts I remember:\n- " + "\n- ".join(facts)
```
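`SessionStore._save` rewrites the JSON file in place, so a crash in the middle of a write could leave a truncated file that `_load` then discards. A common mitigation is to write to a temporary file and rename it into place. The helper below is a sketch of that technique; `atomic_write_json` is a hypothetical function, not part of the repository.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


def atomic_write_json(path: Path, data: Any) -> None:
    """Write JSON to a temp file in the same directory, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
        os.replace(tmp_name, path)  # replaces any existing file in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "conversation_sessions.json"
        atomic_write_json(target, {"session-1": {"facts": ["MCP tools are functions"]}})
        print(target.read_text(encoding="utf-8"))
```

Readers of the file then see either the old contents or the new contents, never a partial write. This does not protect against two processes updating the store at the same time; that would need file locking or a real database.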

### Long-Running Operations

For tools that take time to complete:

```python
import asyncio

from mcp.server.fastmcp import FastMCP, Context

# Assumes `mcp` is the FastMCP instance created in server.py
@mcp.tool()
async def process_large_dataset(dataset_url: str, ctx: Context) -> str:
    """Process a large dataset with progress reporting."""
    try:
        # Download dataset (Context logging methods are coroutines and must be awaited)
        await ctx.info(f"Downloading dataset from {dataset_url}")
        await ctx.report_progress(10, 100)
        
        # Process in chunks
        total_chunks = 10
        for i in range(total_chunks):
            await ctx.info(f"Processing chunk {i+1}/{total_chunks}")
            # Process chunk
            await asyncio.sleep(1)  # Simulate work
            await ctx.report_progress(10 + (i+1) * 80 // total_chunks, 100)
        
        # Finalize (progress is already at 90 after the last chunk)
        await ctx.info("Finalizing results")
        await asyncio.sleep(1)  # Simulate work
        
        # Complete
        await ctx.report_progress(100, 100)
        return "Dataset processing complete. Found 42 insights."
        
    except Exception as e:
        await ctx.error(f"Error: {str(e)}")
        return f"Processing failed: {str(e)}"
```
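Progress reporting tells the client that work is ongoing, but it does not bound how long a tool may run. One option is to wrap the slow part in `asyncio.wait_for` so the tool returns a clear message instead of hanging. This is a minimal sketch using only the standard library; `run_with_timeout` and `slow_job` are hypothetical names.

```python
import asyncio
from typing import Awaitable


async def run_with_timeout(
    work: Awaitable[str], seconds: float, label: str = "operation"
) -> str:
    """Await `work`, but give up and return a message after `seconds`."""
    try:
        return await asyncio.wait_for(work, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying task before raising.
        return f"{label} timed out after {seconds} seconds"


async def slow_job(duration: float) -> str:
    """Stand-in for a long-running processing step."""
    await asyncio.sleep(duration)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(slow_job(0.01), seconds=1.0)))
    print(asyncio.run(run_with_timeout(slow_job(1.0), seconds=0.05, label="slow job")))
```

Pick the timeout with the client in mind: many MCP clients apply their own request timeout, so a tool-side limit slightly below it lets the tool report the failure in its own words.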

## Adding a Resource

In addition to tools, you might want to add a resource to your MCP server:

```python
# server.py
import os

import httpx

@mcp.resource("weather://{location}")
async def get_weather(location: str) -> str:
    """
    Get weather information for a location.
    
    Args:
        location: City name or coordinates
    
    Returns:
        Weather information as text
    """
    try:
        # Fetch weather data
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.weatherapi.com/v1/current.json",
                params={
                    "q": location,
                    "key": os.environ.get("WEATHER_API_KEY", "")
                }
            )
            response.raise_for_status()
            data = response.json()
        
        # Format weather data
        location_data = data.get("location", {})
        current_data = data.get("current", {})
        
        weather_info = f"""
        Weather for {location_data.get('name', location)}, {location_data.get('country', '')}
        
        Temperature: {current_data.get('temp_c', 'N/A')}°C / {current_data.get('temp_f', 'N/A')}°F
        Condition: {current_data.get('condition', {}).get('text', 'N/A')}
        Wind: {current_data.get('wind_kph', 'N/A')} kph, {current_data.get('wind_dir', 'N/A')}
        Humidity: {current_data.get('humidity', 'N/A')}%
        Updated: {current_data.get('last_updated', 'N/A')}
        """
        
        return weather_info
        
    except Exception as e:
        return f"Error fetching weather: {str(e)}"
```

## Adding a Prompt

You can also add a prompt to your MCP server:

```python
# server.py
@mcp.prompt()
def analyze_sentiment(text: str) -> str:
    """
    Create a prompt for sentiment analysis.
    
    Args:
        text: The text to analyze
    
    Returns:
        A prompt for sentiment analysis
    """
    return f"""
    Please analyze the sentiment of the following text and categorize it as positive, negative, or neutral. 
    Provide a brief explanation for your categorization and highlight key phrases that indicate the sentiment.
    
    Text to analyze:
    
    {text}
    
    Your analysis:
    """
```

## Conclusion

Extending the MCP repository with new tools is a powerful way to enhance the capabilities of LLMs. By following the patterns and practices outlined in this guide, you can create robust, reusable tools that integrate seamlessly with the existing repository structure.

Remember these key principles:

1. **Plan before coding**: Define the purpose and interface of your tool
2. **Follow best practices**: Implement proper error handling, input validation, and security
3. **Document thoroughly**: Write clear docstrings and comments
4. **Test rigorously**: Create tests for your tools
5. **Consider configurations**: Use environment variables or configuration files
6. **Explore advanced patterns**: Implement composition, state, and long-running operations as needed

In the next document, we'll explore example use cases for your MCP server and tools.