Provides a bridge between Ollama and the Model Context Protocol, enabling access to Ollama's local LLM capabilities including model management (pull, push, list, create), model execution with customizable parameters, vision/multimodal support, and advanced reasoning via the 'think' parameter.
Offers an OpenAI-compatible chat completion API interface, allowing the server to function as a drop-in replacement for OpenAI's chat completion functionality while using Ollama's local LLM models.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Ollama MCP Server explain quantum computing in simple terms using llama2"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Ollama MCP Server
This is a rebooted and actively maintained fork.
Original project: NightTrek/Ollama-mcp
This repository (hyzhak/ollama-mcp-server) is a fresh continuation with improved maintenance, metadata, and publishing automation.
See NightTrek/Ollama-mcp for project history and prior releases.
A powerful bridge between Ollama and the Model Context Protocol (MCP), enabling seamless integration of Ollama's local LLM capabilities into your MCP-powered applications.
Features
Complete Ollama Integration
Full API Coverage: Access all essential Ollama functionality through a clean MCP interface
OpenAI-Compatible Chat: Drop-in replacement for OpenAI's chat completion API
Local LLM Power: Run AI models locally with full control and privacy
Core Capabilities
Model Management
Pull models from registries
Push models to registries
List available models (see the sketch after this list)
Create custom models from Modelfiles
Copy and remove models
Model Execution
Run models with customizable prompts (response is returned only after completion; streaming is not supported in stdio mode)
Vision/multimodal support: pass images to compatible models
Chat completion API with system/user/assistant roles
Configurable parameters (temperature, timeout)
NEW: think parameter for advanced reasoning and transparency (see below)
Raw mode support for direct responses
Server Control
Start and manage Ollama server
View detailed model information
Error handling and timeout management
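These management operations use the same MCP tool-call shape as the usage examples later in this README. Here is a minimal sketch of listing locally available models; note that the tool name list is an assumption, since only pull, run, chat_completion, and create appear in the examples below:

```javascript
// Sketch: enumerate locally available models.
// The tool name "list" is an assumption; adjust to the server's actual schema.
const models = await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "list",
  arguments: {}
});
console.log(models);
```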
Quick Start
Prerequisites
Ollama installed on your system
Node.js and npm (the configuration below runs the server via npx)
Configuration
Add the server to your MCP configuration:
For Claude Desktop:
MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json
```json
{
  "mcpServers": {
    "ollama": {
      "command": "npx",
      "args": ["ollama-mcp-server"],
      "env": {
        "OLLAMA_HOST": "http://127.0.0.1:11434"
      }
    }
  }
}
```

The env block is optional; set OLLAMA_HOST to point at a custom Ollama API endpoint.

Developer Setup
Prerequisites
Ollama installed on your system
Node.js and npm
Installation
Install dependencies:
```bash
npm install
```

Build the server:

```bash
npm run build
```
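To point an MCP client at your local build instead of the published package, a configuration along these lines should work; this is a sketch, and the build output path (build/index.js) is an assumption to verify against the project's build script:

```json
{
  "mcpServers": {
    "ollama": {
      "command": "node",
      "args": ["/absolute/path/to/ollama-mcp-server/build/index.js"]
    }
  }
}
```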
Usage Examples
Pull and Run a Model
```javascript
// Pull a model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "pull",
  arguments: {
    name: "llama2"
  }
});

// Run the model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "llama2",
    prompt: "Explain quantum computing in simple terms"
  }
});
```

Run a Vision/Multimodal Model
```javascript
// Run a model with an image (for vision/multimodal models)
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "gemma3:4b",
    prompt: "Describe the contents of this image.",
    imagePath: "./path/to/image.jpg"
  }
});
```

Chat Completion (OpenAI-compatible)
```javascript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "llama2",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant."
      },
      {
        role: "user",
        content: "What is the meaning of life?"
      }
    ],
    temperature: 0.7
  }
});

// Chat with images (for vision/multimodal models)
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "gemma3:4b",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant."
      },
      {
        role: "user",
        content: "Describe the contents of this image.",
        images: ["./path/to/image.jpg"]
      }
    ]
  }
});
```

Note: The images field is optional and only supported by vision/multimodal models.
Create Custom Model
```javascript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "create",
  arguments: {
    name: "custom-model",
    modelfile: "./path/to/Modelfile"
  }
});
```

Advanced Reasoning with the think Parameter
Both the run and chat_completion tools now support an optional think parameter:
think: true: Requests the model to provide step-by-step reasoning or a "thought process" in addition to the final answer (if supported by the model).
think: false (default): Only the final answer is returned.
Example (run tool):
```javascript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "deepseek-r1:32b",
    prompt: "how many r's are in strawberry?",
    think: true
  }
});
```

If the model supports it, the response will include a <think>...</think> block with detailed reasoning before the final answer.
Example (chat_completion tool):
```javascript
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "deepseek-r1:32b",
    messages: [
      { role: "user", content: "how many r's are in strawberry?" }
    ],
    think: true
  }
});
```

The model's reasoning (if provided) will be included in the message content.
Note: Not all models support the think parameter. Advanced models (e.g., "deepseek-r1:32b", "magistral") may provide more detailed and accurate reasoning when think is enabled.
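Because the reasoning arrives inline in the response text rather than as a separate field, callers may want to split it from the final answer. A minimal sketch, assuming the tool result has already been read into a plain string (the helper name splitThinking is hypothetical):

```javascript
// Split a response into its <think> reasoning block and the final answer.
// Assumes at most one <think>...</think> block, as described above.
function splitThinking(responseText) {
  const match = responseText.match(/<think>([\s\S]*?)<\/think>/);
  if (!match) {
    return { reasoning: null, answer: responseText.trim() };
  }
  return {
    reasoning: match[1].trim(),
    answer: responseText.replace(match[0], "").trim()
  };
}

const { reasoning, answer } = splitThinking(
  "<think>s-t-r-a-w-b-e-r-r-y: r appears at positions 3, 8, and 9.</think>There are 3 r's."
);
console.log(reasoning); // the model's step-by-step reasoning
console.log(answer);    // "There are 3 r's."
```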
Advanced Configuration
OLLAMA_HOST: Configure a custom Ollama API endpoint (default: http://127.0.0.1:11434)
Timeout settings for model execution (default: 60 seconds)
Temperature control for response randomness (0-2 range)
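These settings can also be combined per call. Here is a sketch assuming the run tool accepts a timeout override alongside temperature; the timeout argument name and millisecond units are assumptions, so check the tool's actual schema:

```javascript
// Sketch: override temperature and timeout for a single run call.
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "llama2",
    prompt: "Summarize the plot of Hamlet in two sentences.",
    temperature: 0.2, // low randomness for a focused answer (0-2 range)
    timeout: 120000   // assumption: extends the default 60-second timeout, in ms
  }
});
```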
Contributing
Contributions are welcome! Feel free to:
Report bugs
Suggest new features
Submit pull requests
License
MIT License - feel free to use in your own projects!
Built with ❤️ for the MCP ecosystem