The MCP AI Hub server provides unified access to over 100 AI models from various providers through a single interface using the Model Context Protocol (MCP).
• Chat with AI models: Send messages (plain text or OpenAI-style message arrays) to configured models such as GPT-4 or Claude and receive responses
• Discover and manage models: List all available models with list_models and get detailed configuration information with get_model_info
• Access diverse providers: Connect to OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Ollama, and 100+ other providers via LiteLLM
• Flexible configuration: Customize models with API keys, parameters (max_tokens, temperature), and system prompts through YAML files
• Multiple connection methods: Support stdio (for MCP clients like Claude Desktop), Server-Sent Events (web apps), and HTTP API with configurable host/port settings
MCP AI Hub
A Model Context Protocol (MCP) server that provides unified access to various AI providers through LiteLLM. Chat with OpenAI, Anthropic, and 100+ other AI models using a single, consistent interface.
🌟 Overview
MCP AI Hub acts as a bridge between MCP clients (like Claude Desktop/Code) and multiple AI providers. It leverages LiteLLM's unified API to provide seamless access to 100+ AI models without requiring separate integrations for each provider.
Key Benefits:
Unified Interface: Single API for all AI providers
100+ Providers: OpenAI, Anthropic, Google, Azure, AWS Bedrock, and more
MCP Protocol: Native integration with Claude Desktop and Claude Code
Flexible Configuration: YAML-based configuration with Pydantic validation
Multiple Transports: stdio, SSE, and HTTP transport options
Custom Endpoints: Support for proxy servers and local deployments
Quick Start
1. Install
Choose your preferred installation method:
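A sketch of common install paths; the PyPI package name `mcp-ai-hub` is assumed from the command name used elsewhere in this README:

```bash
# Option A: install as a CLI tool with uv (package name assumed)
uv tool install mcp-ai-hub

# Option B: install with pip (package name assumed)
pip install mcp-ai-hub

# Option C: install from source
git clone <repo-url>    # replace with the project repository URL
cd mcp-ai-hub
uv sync
```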
Installation Notes:
`uv` is a fast Python package installer and resolver
The package requires Python 3.10 or higher
All dependencies are automatically resolved and installed
2. Configure
Create a configuration file at `~/.ai_hub.yaml` with your API keys and model configurations:
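A minimal single-model sketch. The top-level `model_list` key follows LiteLLM's conventional schema and is an assumption here; `model_name` and `litellm_params` are the required fields named later in this README:

```yaml
# ~/.ai_hub.yaml -- minimal sketch (top-level key assumed)
model_list:
  - model_name: gpt-4                  # the name you will reference when chatting
    litellm_params:
      model: openai/gpt-4              # LiteLLM provider/model format
      api_key: "sk-your-openai-key"    # replace with your real key
```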
Configuration Guidelines:
API Keys: Replace placeholder keys with your actual API keys
Model Names: Use descriptive names you'll remember (e.g., `gpt-4`, `claude-sonnet`)
LiteLLM Models: Use LiteLLM's provider/model format (e.g., `openai/gpt-4`, `anthropic/claude-3-5-sonnet-20241022`)
Parameters: Configure `max_tokens`, `temperature`, and other LiteLLM-supported parameters
Security: Keep your config file secure with appropriate file permissions (chmod 600)
3. Connect to Claude Desktop
Configure Claude Desktop to use MCP AI Hub by editing your configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
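A sketch of the relevant `claude_desktop_config.json` entry, assuming `mcp-ai-hub` is installed and on your PATH (the server name "ai-hub" is arbitrary):

```json
{
  "mcpServers": {
    "ai-hub": {
      "command": "mcp-ai-hub",
      "args": []
    }
  }
}
```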
4. Connect to Claude Code
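For Claude Code, the `claude mcp add` command is the usual way to register a stdio server; the invocation below is a sketch that assumes `mcp-ai-hub` is on your PATH:

```bash
# Register the server under the name "ai-hub" (name is arbitrary)
claude mcp add ai-hub -- mcp-ai-hub
```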
Advanced Usage
CLI Options and Transport Types
MCP AI Hub supports multiple transport mechanisms for different use cases:
Command Line Options:
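Example invocations using the flags documented below:

```bash
# Default: stdio transport for MCP clients such as Claude Desktop/Code
mcp-ai-hub

# SSE transport for web apps on the default localhost:3001
mcp-ai-hub --transport sse

# HTTP transport on a custom host/port with a custom config file and verbose logging
mcp-ai-hub --transport http --host 0.0.0.0 --port 8080 --config /path/to/ai_hub.yaml --log-level DEBUG
```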
Transport Type Details:
| Transport | Use Case | Default Host:Port | Description |
|-----------|----------|-------------------|-------------|
| stdio | MCP clients (Claude Desktop/Code) | N/A | Standard input/output, default for MCP |
| sse | Web applications | localhost:3001 | Server-Sent Events for real-time web apps |
| http | Direct API calls | localhost:3001 (override with `--host`/`--port`) | HTTP transport with streaming support |
CLI Arguments:
`--transport {stdio,sse,http}`: Transport protocol (default: stdio)
`--host HOST`: Host address for SSE/HTTP (default: localhost)
`--port PORT`: Port number for SSE/HTTP (default: 3001; override if you need a different port)
`--config CONFIG`: Custom config file path (default: ~/.ai_hub.yaml)
`--log-level {DEBUG,INFO,WARNING,ERROR}`: Logging verbosity (default: INFO)
Usage
Once MCP AI Hub is connected to your MCP client, you can interact with AI models using these tools:
MCP Tool Reference
Primary Chat Tool:
model_name: Name of the configured model (e.g., "gpt-4", "claude-sonnet")
message: String message or OpenAI-style message list
Returns: AI model response as string
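The `message` parameter accepts either a plain string or an OpenAI-style message list; the list form looks like this:

```json
[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Summarize the Model Context Protocol in one sentence."}
]
```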
Model Discovery Tools:
Returns: List of all configured model names
model_name: Name of the configured model
Returns: Model configuration details including provider, parameters, etc.
Configuration
MCP AI Hub supports 100+ AI providers through LiteLLM. Configure your models in `~/.ai_hub.yaml` with API keys and custom parameters.
System Prompts
You can define system prompts at two levels:
`global_system_prompt`: Applied to all models by default
Per-model `system_prompt`: Overrides the global prompt for that model
Precedence: model-specific prompt > global prompt. If a model's `system_prompt` is set to an empty string, it disables the global prompt for that model.
Notes:
The server prepends the configured system prompt to the message list it sends to providers.
If you pass an explicit message list that already contains a `system` message, both system messages will be included in order (configured prompt first).
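A sketch showing both prompt levels. The key names `global_system_prompt` and `system_prompt` come from this section; their exact placement in the file is an assumption:

```yaml
global_system_prompt: "You are a concise, helpful assistant."

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: "sk-your-openai-key"
    # no system_prompt here, so the global prompt applies

  - model_name: claude-sonnet
    system_prompt: ""                  # empty string disables the global prompt for this model
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: "sk-ant-your-anthropic-key"
```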
Supported Providers
Major AI Providers:
OpenAI: GPT-4, GPT-3.5-turbo, GPT-4-turbo, etc.
Anthropic: Claude 3.5 Sonnet, Claude 3 Haiku, Claude 3 Opus
Google: Gemini Pro, Gemini Pro Vision, Gemini Ultra
Azure OpenAI: Azure-hosted OpenAI models
AWS Bedrock: Claude, Llama, Jurassic, and more
Together AI: Llama, Mistral, Falcon, and open-source models
Hugging Face: Various open-source models
Local Models: Ollama, LM Studio, and other local deployments
Configuration Parameters:
api_key: Your provider API key (required)
max_tokens: Maximum response tokens (optional)
temperature: Response creativity 0.0-1.0 (optional)
api_base: Custom endpoint URL (for proxies/local servers)
Additional: All LiteLLM-supported parameters
Configuration Examples
Basic Configuration:
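A basic two-model sketch (same schema assumptions as in Quick Start step 2):

```yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: "sk-your-openai-key"
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: "sk-ant-your-anthropic-key"
```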
Custom Parameters:
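A sketch adding the optional parameters listed above; the model name `gpt-4-creative` is just an illustrative label:

```yaml
model_list:
  - model_name: gpt-4-creative
    litellm_params:
      model: openai/gpt-4
      api_key: "sk-your-openai-key"
      max_tokens: 2048                 # cap the response length
      temperature: 0.9                 # higher values = more creative output
```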
Local LLM Server Configuration:
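A sketch pointing LiteLLM at a local Ollama server via `api_base`; `ollama/llama3` and port 11434 follow common Ollama defaults, so adjust to your local setup:

```yaml
model_list:
  - model_name: local-llama
    litellm_params:
      model: ollama/llama3             # LiteLLM's Ollama provider prefix
      api_base: http://localhost:11434 # default Ollama endpoint
      api_key: "not-needed"            # most local servers ignore the key
```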
For more providers, please refer to the LiteLLM docs: https://docs.litellm.ai/docs/providers.
Development
Setup:
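A sketch of a typical source setup with uv (the repository URL placeholder is intentional):

```bash
git clone <repo-url>    # replace with the project repository URL
cd mcp-ai-hub
uv sync                 # install runtime and development dependencies
```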
Running and Testing:
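Hedged sketches for running the server and tests from a source checkout; pytest as the test runner is an assumption:

```bash
uv run mcp-ai-hub --log-level DEBUG   # run the server locally over stdio
uv run pytest                         # run the test suite (assumes pytest)
```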
Code Quality:
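The project calls for formatting, linting, and type checking; the specific tools below (ruff, mypy) are assumptions, so substitute whatever the repository actually configures:

```bash
uv run ruff format .    # code formatting (assumed tool)
uv run ruff check .     # linting (assumed tool)
uv run mypy .           # static type checking (assumed tool)
```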
Troubleshooting
Configuration Issues
Configuration File Problems:
File Location: Ensure `~/.ai_hub.yaml` exists in your home directory
YAML Validity: Validate YAML syntax using an online validator or `python -c "import yaml, os; yaml.safe_load(open(os.path.expanduser('~/.ai_hub.yaml')))"`
File Permissions: Set secure permissions with `chmod 600 ~/.ai_hub.yaml`
Path Resolution: Use absolute paths in custom config locations
Configuration Validation:
Required Fields: Each model must have `model_name` and `litellm_params`
API Keys: Verify API keys are properly quoted and not expired
Model Formats: Use LiteLLM-compatible model identifiers (e.g., `openai/gpt-4`, `anthropic/claude-3-5-sonnet-20241022`)
API and Authentication Errors
Authentication Issues:
Invalid API Keys: Check for typos, extra spaces, or expired keys
Insufficient Permissions: Verify API keys have necessary model access permissions
Rate Limiting: Monitor API usage and implement retry logic if needed
Regional Restrictions: Some models may not be available in all regions
API-Specific Troubleshooting:
OpenAI: Check organization settings and model availability
Anthropic: Verify Claude model access and usage limits
Azure OpenAI: Ensure proper resource deployment and endpoint configuration
Google Gemini: Check project setup and API enablement
MCP Connection Issues
Server Startup Problems:
Port Conflicts: Use different ports for SSE/HTTP transports if defaults are in use
Permission Errors: Ensure executable permissions for the `mcp-ai-hub` command
Python Path: Verify Python environment and package installation
Client Configuration Issues:
Command Path: Ensure `mcp-ai-hub` is in PATH or use the full absolute path
Working Directory: Some MCP clients require specific working directory settings
Transport Mismatch: Use stdio transport for Claude Desktop/Code
Performance and Reliability
Response Time Issues:
Network Latency: Use geographically closer API endpoints when possible
Model Selection: Some models are faster than others (e.g., GPT-3.5 vs GPT-4)
Token Limits: Large `max_tokens` values can increase response time
Reliability Improvements:
Retry Logic: Implement exponential backoff for transient failures
Timeout Configuration: Set appropriate timeouts for your use case
Health Checks: Monitor server status and restart if needed
Load Balancing: Use multiple model configurations for redundancy
License
MIT License - see LICENSE file for details.
Contributing
We welcome contributions! Please follow these guidelines:
Development Workflow
Fork and Clone: Fork the repository and clone your fork
Create Branch: Create a feature branch (`git checkout -b feature/amazing-feature`)
Development Setup: Install dependencies with `uv sync`
Make Changes: Implement your feature or fix
Testing: Add tests and ensure all tests pass
Code Quality: Run formatting, linting, and type checking
Documentation: Update documentation if needed
Submit PR: Create a pull request with detailed description
Code Standards
Python Style:
Follow PEP 8 style guidelines
Use type hints for all functions
Add docstrings for public functions and classes
Keep functions focused and small
Testing Requirements:
Write tests for new functionality
Ensure existing tests continue to pass
Aim for good test coverage
Test edge cases and error conditions
Documentation:
Update README.md for user-facing changes
Add inline comments for complex logic
Update configuration examples if needed
Document breaking changes clearly
Quality Checks
Before submitting a PR, ensure:
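All tests pass locally
Code is formatted, linted, and type-checked (see Code Quality above)
New functionality is covered by tests
Documentation and configuration examples are updated for user-facing changes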
Issues and Feature Requests
Use GitHub Issues for bug reports and feature requests
Provide detailed reproduction steps for bugs
Include configuration examples when relevant
Check existing issues before creating new ones
Label issues appropriately