Ollama MCP Server

by hyzhak

This is a rebooted and actively maintained fork.
Original project: NightTrek/Ollama-mcp

This repository (hyzhak/ollama-mcp-server) is the new upstream, with improved maintenance, metadata, and publishing automation.

See NightTrek/Ollama-mcp for project history and prior releases.

🚀 A powerful bridge between Ollama and the Model Context Protocol (MCP), enabling seamless integration of Ollama's local LLM capabilities into your MCP-powered applications.

🌟 Features

Complete Ollama Integration

  • Full API Coverage: Access all essential Ollama functionality through a clean MCP interface
  • OpenAI-Compatible Chat: Drop-in replacement for OpenAI's chat completion API
  • Local LLM Power: Run AI models locally with full control and privacy

Core Capabilities

  • 🔄 Model Management
    • Pull models from registries
    • Push models to registries
    • List available models (see the sketch after this list)
    • Create custom models from Modelfiles
    • Copy and remove models
  • 🤖 Model Execution
    • Run models with customizable prompts (response is returned only after completion; streaming is not supported in stdio mode)
    • Vision/multimodal support: pass images to compatible models
    • Chat completion API with system/user/assistant roles
    • Configurable parameters (temperature, timeout)
    • NEW: think parameter for advanced reasoning and transparency (see below)
    • Raw mode support for direct responses
  • 🛠 Server Control
    • Start and manage the Ollama server
    • View detailed model information
    • Error handling and timeout management
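
The management tools above are invoked the same way as the usage examples later in this README. Here is a minimal sketch of listing the locally available models; the tool name "list" mirrors the Ollama CLI and is an assumption, not confirmed by this document:

// List locally available models (tool name "list" is assumed)
const models = await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "list",
  arguments: {}
});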

🚀 Quick Start

Prerequisites

  • Ollama installed on your system
  • Node.js (with npx, included with npm)

Configuration

Add the server to your MCP configuration:

For Claude Desktop:

MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{ "mcpServers": { "ollama": { "command": "npx", "args": ["ollama-mcp-server"], "env": { "OLLAMA_HOST": "http://127.0.0.1:11434" // Optional: customize Ollama API endpoint } } } }

🛠 Developer Setup

Prerequisites

  • Ollama installed on your system
  • Node.js and npm

Installation

  1. Install dependencies:
npm install
  2. Build the server:
npm run build
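
To point an MCP client at your local build instead of the published npm package, you can reference the built entry point directly. A minimal sketch, assuming the build output lands in build/index.js (the actual output path may differ in this repository):

{
  "mcpServers": {
    "ollama": {
      "command": "node",
      "args": ["/absolute/path/to/ollama-mcp-server/build/index.js"],
      "env": {
        "OLLAMA_HOST": "http://127.0.0.1:11434"
      }
    }
  }
}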

🛠 Usage Examples

Pull and Run a Model

// Pull a model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "pull",
  arguments: { name: "llama2" }
});

// Run the model
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "llama2",
    prompt: "Explain quantum computing in simple terms"
  }
});

Run a Vision/Multimodal Model

// Run a model with an image (for vision/multimodal models)
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "gemma3:4b",
    prompt: "Describe the contents of this image.",
    imagePath: "./path/to/image.jpg"
  }
});

Chat Completion (OpenAI-compatible)

await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "llama2",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "What is the meaning of life?" }
    ],
    temperature: 0.7
  }
});

// Chat with images (for vision/multimodal models)
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "gemma3:4b",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      {
        role: "user",
        content: "Describe the contents of this image.",
        images: ["./path/to/image.jpg"]
      }
    ]
  }
});

Note: The images field is optional and only supported by vision/multimodal models.

Create Custom Model

await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "create",
  arguments: {
    name: "custom-model",
    modelfile: "./path/to/Modelfile"
  }
});

🧠 Advanced Reasoning with the think Parameter

Both the run and chat_completion tools now support an optional think parameter:

  • think: true: Requests the model to provide step-by-step reasoning or "thought process" in addition to the final answer (if supported by the model).
  • think: false (default): Only the final answer is returned.
Example (run tool):
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "deepseek-r1:32b",
    prompt: "how many r's are in strawberry?",
    think: true
  }
});
  • If the model supports it, the response will include a :::thinking ... ::: block with detailed reasoning before the final answer.
Example (chat_completion tool):
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "chat_completion",
  arguments: {
    model: "deepseek-r1:32b",
    messages: [
      { role: "user", content: "how many r's are in strawberry?" }
    ],
    think: true
  }
});
  • The model's reasoning (if provided) will be included in the message content.

Note: Not all models support the think parameter. Advanced models (e.g., "deepseek-r1:32b", "magistral") may provide more detailed and accurate reasoning when think is enabled.

🔧 Advanced Configuration

  • OLLAMA_HOST: Configure custom Ollama API endpoint (default: http://127.0.0.1:11434)
  • Timeout settings for model execution (default: 60 seconds)
  • Temperature control for response randomness (0-2 range)
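
The settings above can be combined on a single call. A sketch of a run invocation with a custom temperature and timeout; the argument names follow the parameters listed in this README, and the timeout unit shown (milliseconds) is an assumption:

// Run with a custom temperature and timeout
// ("temperature" is shown in the chat_completion example above;
//  the "timeout" argument name and millisecond unit are assumptions)
await mcp.use_mcp_tool({
  server_name: "ollama",
  tool_name: "run",
  arguments: {
    name: "llama2",
    prompt: "Summarize the Model Context Protocol in one paragraph.",
    temperature: 0.5,
    timeout: 120000
  }
});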

🤝 Contributing

Contributions are welcome! Feel free to:

  • Report bugs
  • Suggest new features
  • Submit pull requests

📝 License

MIT License - feel free to use in your own projects!


Built with ❤️ for the MCP ecosystem


