# Ollama MCP Server
[License: MIT](https://opensource.org/licenses/MIT)
[Python 3.10+](https://www.python.org/downloads/)
[Model Context Protocol](https://modelcontextprotocol.io/)
A self-contained **Model Context Protocol (MCP) server** for local Ollama management, developed with Claude AI assistance. Features include listing local models, chatting with them, starting and monitoring the Ollama server, and a 'local model advisor' that suggests the best installed model for a given task. The server is designed to be a robust, dependency-free, cross-platform tool for managing a local Ollama instance.
## ⚠️ Current Testing Status
**Currently tested on**: Windows 11 with NVIDIA RTX 4090
**Status**: Beta on Windows, Other Platforms Need Testing
**Cross-platform code**: Ready for Linux and macOS but requires community testing
**GPU support**: NVIDIA fully tested, AMD/Intel/Apple Silicon implemented but needs validation
We welcome testers on different platforms and hardware configurations! Please report your experience via GitHub Issues.
## 🎯 Key Features
### 🔧 **Self-Contained Architecture**
- **Zero External Dependencies**: No external MCP servers required
- **MIT License Ready**: All code internally developed and properly licensed
- **Enterprise-Grade**: Professional error handling with actionable troubleshooting
### 🌍 **Universal Compatibility**
- **Cross-Platform**: Windows, Linux, macOS with automatic platform detection
- **Multi-GPU Support**: NVIDIA, AMD, Intel detection with vendor-specific optimizations
- **Smart Installation Discovery**: Automatic Ollama detection across platforms
### ⚡ **Complete Local Ollama Management**
- **Model Operations**: List, suggest, and remove local models.
- **Server Control**: Start and monitor the Ollama server with intelligent process management.
- **Direct Chat**: Communicate with any locally installed model.
- **System Analysis**: Assess hardware compatibility and monitor resources.
## 🚀 Quick Start
### Installation
```bash
git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server
pip install -e .
```
### Configuration
Add to your MCP client configuration (e.g., Claude Desktop `config.json`):
```json
{
"mcpServers": {
"ollama-mcp": {
"command": "python",
"args": [
"X:\\PATH_TO\\ollama-mcp-server\\src\\ollama_mcp\\server.py"
],
"env": {}
}
}
}
```
**Note**: Adjust the path to match your installation directory. On Linux/macOS, use forward slashes: `/path/to/ollama-mcp-server/src/ollama_mcp/server.py`
### Requirements
- **Python 3.10+** (required by MCP SDK dependency)
- **Ollama installed** and accessible in PATH
- **MCP-compatible client** (Claude Desktop, etc.)
### Ollama Configuration Compatibility
This MCP server automatically respects your Ollama configuration. If you have customized your Ollama setup (e.g., changed the models folder via `OLLAMA_MODELS` environment variable), the MCP server will work seamlessly without any additional configuration.
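For example, the server only needs to read Ollama's own environment variables to follow a customized setup. Below is a minimal sketch of what that lookup can look like (the helper name is illustrative; `OLLAMA_HOST` and `OLLAMA_MODELS` are standard Ollama settings):

```python
import os

def resolve_ollama_settings() -> dict:
    """Illustrative helper: inherit standard Ollama settings from the environment."""
    return {
        # Default Ollama endpoint when OLLAMA_HOST is not set
        "host": os.environ.get("OLLAMA_HOST", "http://localhost:11434"),
        # Custom models folder, or None to use Ollama's default location
        "models_dir": os.environ.get("OLLAMA_MODELS"),
    }

if __name__ == "__main__":
    print(resolve_ollama_settings())
```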
## 🛠️ Available Tools
### **Model Management**
- `list_local_models` - List all locally installed models with their details (see the API sketch after this list).
- `local_llm_chat` - Chat directly with any locally installed model.
- `remove_model` - Safely remove a model from local storage.
- `suggest_models` - Recommends the best **locally installed** model for a specific task (e.g., "suggest a model for coding").
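Under the hood, these tools talk to the local Ollama HTTP API. As a point of reference, here is a minimal sketch of the request that `list_local_models` wraps, using Ollama's standard `/api/tags` endpoint (the project's actual client in `client.py` may be implemented differently):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local Ollama endpoint

def list_local_models() -> list[dict]:
    """Return the raw model entries reported by the local Ollama server."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=10) as resp:
        payload = json.load(resp)
    # Each entry includes at least: name, size, modified_at
    return payload.get("models", [])

if __name__ == "__main__":
    for model in list_local_models():
        print(model["name"])
```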
### **Server and System Operations**
- `start_ollama_server` - Starts the Ollama server if it's not already running.
- `ollama_health_check` - Performs a comprehensive health check of the Ollama server.
- `system_resource_check` - Analyzes system hardware and resource availability.
### **Diagnostics**
- `test_model_responsiveness` - Checks the responsiveness of a specific local model by sending a test prompt, helping to diagnose performance issues.
- `select_chat_model` - Presents a list of available local models to choose from before starting a chat.
## 💬 How to Interact with Ollama-MCP
Ollama-MCP works **through your MCP client** (like Claude Desktop) - you don't interact with it directly. Instead, you communicate with your MCP client using **natural language**, and the client translates your requests into tool calls.
### **Basic Interaction Pattern**
You speak to your MCP client in natural language, and it automatically uses the appropriate ollama-mcp tools:
```
You: "List my installed Ollama models"
→ Client calls: list_local_models
→ You get: Formatted list of your models

You: "Chat with llama3.2: explain machine learning"
→ Client calls: local_llm_chat with model="llama3.2" and message="explain machine learning"
→ You get: AI response from your local model

You: "Check if Ollama is running"
→ Client calls: ollama_health_check
→ You get: Server status and troubleshooting if needed
```
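Behind the scenes, each request becomes a standard MCP tool call over stdio. If you want to exercise the server outside a chat client (for development or testing), a minimal sketch using the official MCP Python SDK could look like this (the server path is a placeholder for your own installation):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Placeholder path: point this at your checkout of ollama-mcp-server
SERVER = StdioServerParameters(
    command="python",
    args=["/path/to/ollama-mcp-server/src/ollama_mcp/server.py"],
)

async def main() -> None:
    async with stdio_client(SERVER) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Equivalent of asking your client: "List my installed Ollama models"
            result = await session.call_tool("list_local_models", arguments={})
            for item in result.content:
                print(item)

if __name__ == "__main__":
    asyncio.run(main())
```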
### **Example Interactions**
#### **Model Management**
- *"What models do I have installed?"* β `list_local_models`
- *"I need a model for creative writing, which of my models is best?"* β `suggest_models`
- *"Remove the old mistral model to save space"* β `remove_model`
#### **System Operations**
- *"Start Ollama server"* β `start_ollama_server`
- *"Is my system capable of running large AI models?"* β `system_resource_check`
#### **AI Chat**
- *"Chat with llama3.2: write a Python function to sort a list"* β `local_llm_chat`
- *"Use deepseek-coder to debug this code: [code snippet]"* β `local_llm_chat`
- *"Ask phi3.5 to explain quantum computing simply"* β `local_llm_chat`
### **Key Points**
- **No Direct Commands**: You never call `ollama_health_check()` directly
- **Natural Language**: Speak normally to your MCP client
- **Automatic Tool Selection**: The client chooses the right tool based on your request
- **Conversational**: You can ask follow-up questions and the client maintains context
## 🎯 Real-World Use Cases
### **Daily Development Workflow**
*"I need to work on a coding project. Which of my local models is best for coding? Let's check its performance and then ask it a question."*
This could trigger:
1. `suggest_models` - Recommends the best local model for "coding".
2. `test_model_responsiveness` - Checks if the recommended model is responsive.
3. `local_llm_chat` - Starts a chat with the model.
### **Model Management Session**
*"Show me what models I have and recommend one for writing a story. Then let's clean up any old models I don't need."*
Triggers:
1. `list_local_models` - Current inventory
2. `suggest_models` - Recommends a local model for "writing a story".
3. `remove_model` - Clean up unwanted models.
### **Troubleshooting Session**
*"Ollama isn't working. Check what's wrong, try to fix it, and test with a simple chat."*
Triggers:
1. `ollama_health_check` - Diagnose issues
2. `start_ollama_server` - Attempt to start server
3. `local_llm_chat` - Verify it works with a test message
## 🏗️ Architecture
### **Design Principles**
- **Self-Contained**: Zero external MCP server dependencies
- **Fail-Safe**: Comprehensive error handling with actionable guidance
- **Cross-Platform First**: Universal Windows/Linux/macOS compatibility
- **Enterprise Ready**: Professional-grade implementation and documentation
### **Technical Highlights**
- **Internal Process Management**: Advanced subprocess handling with timeout control
- **Multi-GPU Detection**: Platform-specific GPU identification without confusing metrics
- **Intelligent Model Selection**: Falls back to the first available model when none is specified
- **Progressive Health Monitoring**: Smart server startup detection with detailed feedback (a sketch of this startup check follows below)
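As an illustration of the process-management and health-monitoring points above, here is a simplified sketch of starting `ollama serve` and polling until it responds (the real implementation adds richer error reporting and platform-specific handling):

```python
import subprocess
import time
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def ollama_is_healthy(timeout: float = 2.0) -> bool:
    """A running Ollama server answers a plain GET on its root URL."""
    try:
        with urllib.request.urlopen(OLLAMA_URL, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

def start_ollama_server(startup_timeout: float = 30.0) -> bool:
    """Start the server in the background and wait for it to become healthy."""
    if ollama_is_healthy():
        return True  # already running
    subprocess.Popen(
        ["ollama", "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    deadline = time.monotonic() + startup_timeout
    while time.monotonic() < deadline:
        if ollama_is_healthy():
            return True
        time.sleep(0.5)  # poll until the timeout expires
    return False
```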
## 📊 System Compatibility
### **Operating Systems**
- **Windows**: Full support with auto-detection in Program Files and AppData ✅ **Tested**
- **Linux**: XDG configuration support with package manager integration ⚠️ **Needs Testing**
- **macOS**: Homebrew detection with Apple Silicon GPU support ⚠️ **Needs Testing**
### **GPU Support**
- **NVIDIA**: Full detection via nvidia-smi with memory and utilization info ✅ **Tested RTX 4090**
- **AMD**: ROCm support via vendor-specific tools ⚠️ **Needs Testing**
- **Intel**: Basic detection via system tools ⚠️ **Needs Testing**
- **Apple Silicon**: M1/M2/M3 detection with unified memory handling ⚠️ **Needs Testing**
### **Hardware Requirements**
- **Minimum**: 4GB RAM, 2GB free disk space
- **Recommended**: 8GB+ RAM, 10GB+ free disk space
- **GPU**: Optional but recommended for model acceleration
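A check along the lines of `system_resource_check` can verify these minimums. Here is a minimal sketch using `psutil` for RAM and the standard library for disk space (an illustration only; the project's `hardware_checker.py` may use different methods):

```python
import shutil
import psutil  # third-party: pip install psutil

MIN_RAM_GB = 4    # minimum from the requirements above
MIN_DISK_GB = 2

def check_resources(path: str = ".") -> dict:
    """Compare installed RAM and free disk space against the documented minimums."""
    ram_gb = psutil.virtual_memory().total / 1024**3
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "ram_gb": round(ram_gb, 1),
        "free_disk_gb": round(free_gb, 1),
        "meets_minimum": ram_gb >= MIN_RAM_GB and free_gb >= MIN_DISK_GB,
    }

if __name__ == "__main__":
    print(check_resources())
```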
## 🔧 Development
### **Project Structure**
```
ollama-mcp-server/
├── src/
│   ├── __init__.py              # Defines the package version
│   └── ollama_mcp/
│       ├── __init__.py          # Makes 'ollama_mcp' a package
│       ├── server.py            # Main MCP server implementation
│       ├── client.py            # Ollama API client
│       ├── config.py            # Configuration management
│       ├── model_manager.py     # Local model operations
│       ├── hardware_checker.py  # System hardware analysis
│       └── ...                  # (and other modules)
├── tests/
│   ├── test_client.py           # Unit tests for the client
│   └── test_tools.py            # Integration tests for tools
├── .gitignore                   # Specifies intentionally untracked files
└── pyproject.toml               # Project configuration and dependencies
```
### **Key Technical Achievements**
#### **Self-Contained Implementation**
- **Challenge**: Eliminate the external `desktop-commander` dependency
- **Solution**: Internal process management with advanced subprocess handling
- **Result**: Zero external MCP dependencies, MIT license compatible
#### **Intelligent GPU Detection**
- **Challenge**: Complex VRAM reporting caused user confusion
- **Solution**: Simplified to GPU name display only
- **Result**: Clean, reliable hardware identification (see the detection sketch below)
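On NVIDIA systems, for example, the GPU name can be read with a single `nvidia-smi` query. A simplified detection sketch (the project also covers AMD, Intel, and Apple Silicon through other vendor tools):

```python
import subprocess

def nvidia_gpu_names() -> list[str]:
    """Return GPU names reported by nvidia-smi, or an empty list if unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (FileNotFoundError, subprocess.SubprocessError):
        return []  # no NVIDIA driver or tooling present
    return [line.strip() for line in out.stdout.splitlines() if line.strip()]

if __name__ == "__main__":
    print(nvidia_gpu_names() or "No NVIDIA GPU detected")
```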
#### **Enterprise Error Handling**
- **Implementation**: 6-level exception framework with specific error types (sketched below)
- **Coverage**: Platform-specific errors, process failures, network issues
- **UX**: Actionable troubleshooting steps for every error scenario
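A hypothetical sketch of what such an exception hierarchy can look like (class names are illustrative; the actual ones in the codebase may differ):

```python
class OllamaMCPError(Exception):
    """Base class: every error carries actionable troubleshooting guidance."""
    def __init__(self, message: str, troubleshooting: str = ""):
        super().__init__(message)
        self.troubleshooting = troubleshooting

class OllamaNotInstalledError(OllamaMCPError):
    """Ollama binary not found on PATH or in known install locations."""

class ServerStartupError(OllamaMCPError):
    """`ollama serve` did not become healthy within the timeout."""

class ModelNotFoundError(OllamaMCPError):
    """The requested model is not installed locally."""

class OllamaConnectionError(OllamaMCPError):
    """The Ollama HTTP API could not be reached."""

class PlatformError(OllamaMCPError):
    """An OS-specific operation (e.g., process management) failed."""
```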
## 🤝 Contributing
We welcome contributions! Areas where help is especially appreciated:
- **Platform Testing**: Different OS and hardware configurations ⭐ **High Priority**
- **GPU Vendor Support**: Additional vendor-specific detection
- **Performance Optimization**: Startup time and resource usage improvements
- **Documentation**: Usage examples and integration guides
- **Testing**: Edge cases and error condition validation
### **Immediate Testing Needs**
- **Linux**: Ubuntu, Fedora, Arch with various GPU configurations
- **macOS**: Intel and Apple Silicon Macs with different Ollama installations
- **GPU Vendors**: AMD ROCm, Intel Arc, Apple unified memory
- **Edge Cases**: Different Python versions, various Ollama installation methods
### **Development Setup**
```bash
git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Code formatting
black src/
isort src/
# Type checking
mypy src/
```
## 🔍 Troubleshooting
### **Common Issues**
#### **Ollama Not Found**
```bash
# Verify Ollama installation
ollama --version
# Check PATH configuration
which ollama # Linux/macOS
where ollama # Windows
```
#### **Server Startup Failures**
```bash
# Check port availability (Linux/macOS)
netstat -an | grep 11434
# Check port availability (Windows)
netstat -an | findstr 11434

# Manual server start for debugging
ollama serve
```
#### **Permission Issues**
- **Windows**: Run as Administrator if needed
- **Linux/macOS**: Check user permissions for service management
### **Platform-Specific Issues**
If you encounter issues on Linux or macOS, please report them via GitHub Issues with:
- Operating system and version
- Python version
- Ollama version and installation method
- GPU hardware (if applicable)
- Complete error output
## 📈 Performance
### **Typical Response Times** *(Windows RTX 4090)*
- **Health Check**: <500ms
- **Model List**: <1 second
- **Server Start**: 1-15 seconds (hardware dependent)
- **Model Chat**: 2-30 seconds (model and prompt dependent)
### **Resource Usage**
- **Memory**: <50MB for MCP server process
- **CPU**: Minimal when idle, scales with operations
- **Storage**: Configuration files and logs only
## 🔒 Security
- **Data Flow**: User → MCP Client (Claude) → ollama-mcp-server → Local Ollama → back through the chain
## 👨‍💻 About This Project
This is my first MCP server, created by adapting a personal tool I had developed for my own Ollama management needs.
### **The Problem I Faced**
I started using Claude to interact with Ollama because it allows me to use natural language instead of command-line interfaces. Claude also provides capabilities that Ollama alone doesn't have, particularly intelligent model suggestions based on both my system capabilities and specific needs.
### **My Solution**
I built this MCP server to streamline my own workflow, and then refined it into a stable tool that others might find useful. The design reflects real usage patterns:
- **Self-contained**: No external dependencies that can break
- **Intelligent error handling**: Clear guidance when things go wrong
- **Cross-platform**: Works consistently across different environments
- **Practical tools**: Features I actually use in daily work
### **Design Philosophy**
I initially developed this for my personal use to manage Ollama models more efficiently. When the MCP protocol became available, I transformed my personal tool into an MCP server to share it with others who might find it useful.
**Development Approach**: This project was developed with Claude using "vibe coding" - an iterative, conversational development process where AI assistance helped refine both the technical implementation and user experience. It's a practical example of AI-assisted development creating tools for AI management. Jules was also involved in the final refactoring phase.
## 📄 License
MIT License - see [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- **Ollama Team**: For the excellent local AI platform
- **MCP Project**: For the Model Context Protocol specification
- **Claude Desktop/Code by Anthropic**: Used as tools for the MCP client implementation, testing, and refactoring
- **Jules by Google**: Used as a tool during the final refactoring
## 📞 Support
- **Bug Reports**: [GitHub Issues](https://github.com/paolodalprato/ollama-mcp-server/issues)
- **Feature Requests**: [GitHub Issues](https://github.com/paolodalprato/ollama-mcp-server/issues)
- **Community Discussion**: [GitHub Discussions](https://github.com/paolodalprato/ollama-mcp-server/discussions)
---
## Changelog
* **v0.9.0 (August 17, 2025):** Critical bugfix release - Fixed datetime serialization issue that prevented model listing from working with Claude Desktop. All 9 tools now verified working correctly.
* **August 2025:** Project refactoring and enhancements. Overhauled the architecture for modularity, implemented a fully asynchronous client, added a test suite, and refined the tool logic based on a "local-first" philosophy.
* **July 2025:** Initial version created by Paolo Dalprato with Claude AI assistance.
For detailed changes, see [CHANGELOG.md](CHANGELOG.md).
---
**Status**: Beta on Windows, Other Platforms Need Testing
**Testing**: Windows 11 + RTX 4090 validated, Linux/macOS require community validation
**License**: MIT
**Dependencies**: Zero external MCP servers required