What can you do with this server?

mcp-mlx-launcher is an MCP server that lets AI agents autonomously manage local LLM instances powered by mlx-lm on Apple Silicon Macs.

* Check system environment: Diagnose available unified memory and confirm Apple Silicon architecture readiness.
* Search MLX models: Search Hugging Face for MLX-format models by keyword, with results including download counts and model IDs.
* Download a model: Pre-download and cache a specific MLX model from Hugging Face locally before launching.
* Launch an LLM server: Start an mlx_lm.server subprocess in the background for a given model and port, with an optional memory requirement guard.
* Restart an LLM server: Gracefully stop and restart a server on a given port, optionally switching to a different model.
* Shut down an LLM server: Safely terminate a running LLM server process on a specified port to free resources.
* Check server status: Verify whether a server is currently running and listening on a specified port.
* List running servers: Get all background LLM server processes with their ports and loaded models.
* Auto-cleanup: Automatically shut down all managed LLM processes when the MCP server disconnects.

mcp-mlx-launcher

by globalpocket

Python

Local

An MCP (Model Context Protocol) server designed to autonomously launch, manage, and shut down local mlx-lm instances on Apple Silicon Macs.

This tool lets AI agents (such as Cline or Claude Desktop) start local LLM servers on demand, check their status, prepare environments, and gracefully shut them down when no longer needed, freeing system resources.

💡 Proven in Production: This server was extracted as a general-purpose, reusable module from the cingulater project, where it is actively used and running in production.

Features

System Environment Check: Verify system memory and architecture (Apple Silicon) to ensure readiness.
Model Search & Download: Search Hugging Face for available MLX models and download them locally to cache before launching.
Launch & Manage Local LLMs: Start, stop, and restart an mlx-lm server with any supported model in the background.
Status Check: Verify if a specific port is currently active and listening.
Apple Silicon Optimized: Built specifically to manage MLX-based local models.
Auto Cleanup: Automatically cleans up and shuts down all managed LLM processes when the MCP server disconnects or shuts down, preventing resource leaks.
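
The Auto Cleanup feature can be pictured as a small process registry. The following is a minimal sketch, not the project's actual implementation: the `ProcessRegistry` class, its method names, and the module-level `registry` are hypothetical, and the only assumption is that each managed server is a child process started with `subprocess.Popen`.

```python
import atexit
import subprocess


class ProcessRegistry:
    """Hypothetical registry mapping ports to child server processes."""

    def __init__(self) -> None:
        self._procs: dict[int, subprocess.Popen] = {}

    def register(self, port: int, proc: subprocess.Popen) -> None:
        """Remember the process that serves the given port."""
        self._procs[port] = proc

    def ports(self) -> list[int]:
        """Return the ports of all tracked processes."""
        return sorted(self._procs)

    def shutdown_all(self, timeout: float = 5.0) -> None:
        """Terminate every tracked process, escalating to kill on timeout."""
        for port, proc in list(self._procs.items()):
            if proc.poll() is None:  # process is still running
                proc.terminate()
                try:
                    proc.wait(timeout=timeout)
                except subprocess.TimeoutExpired:
                    proc.kill()
                    proc.wait()
            del self._procs[port]


registry = ProcessRegistry()
# Run cleanup on normal interpreter exit so tracked children are not left behind.
atexit.register(registry.shutdown_all)
```

Note that `atexit` handlers run only on a normal interpreter exit; a parent that is killed abruptly would need a different mechanism, which this sketch does not cover.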

Prerequisites

macOS (Apple Silicon M1/M2/M3/M4)
Python 3.10 or higher
mlx-lm installed in your environment (pip install mlx-lm)
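
The prerequisites above can be checked from Python before installing. This is a rough sketch using only the standard library; the function name `check_prerequisites` and the result keys are made up for illustration and are unrelated to the server's own check_system_environment tool.

```python
import importlib.util
import platform
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which of the listed prerequisites hold on this machine."""
    return {
        # macOS reports itself as "Darwin".
        "macos": platform.system() == "Darwin",
        # Native Apple Silicon Python reports "arm64"; an interpreter
        # running under Rosetta reports "x86_64" instead.
        "apple_silicon": platform.machine() == "arm64",
        "python_3_10_plus": sys.version_info >= (3, 10),
        # find_spec returns None when the top-level package is not importable.
        "mlx_lm_installed": importlib.util.find_spec("mlx_lm") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```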

Installation

# Clone the repository
git clone https://github.com/YOUR_USERNAME/mcp-mlx-launcher.git
cd mcp-mlx-launcher

# Install dependencies
pip install -e .

Usage (MCP Configuration)

To use this server with your MCP client (e.g., Claude Desktop or Cline), add the following to your MCP configuration file:

{
  "mcpServers": {
    "mcp-mlx-launcher": {
      "command": "python",
      "args": [
        "-m",
        "mcp_mlx_launcher.server"
      ]
    }
  }
}
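
If the configuration file already lists other servers, the entry has to be merged rather than pasted over the whole file. Here is a minimal sketch of that merge, assuming the file is plain JSON with a top-level "mcpServers" object as shown above. The helper name `add_server_entry` is hypothetical, and the file location depends on your MCP client, so it is passed in as an argument.

```python
import json
from pathlib import Path

# The entry from the configuration shown above.
SERVER_ENTRY = {
    "command": "python",
    "args": ["-m", "mcp_mlx_launcher.server"],
}


def add_server_entry(config_path: Path, name: str = "mcp-mlx-launcher") -> dict:
    """Merge the launcher entry into an MCP config file, keeping other servers."""
    # Start from the existing config if there is one, else from an empty object.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = SERVER_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

The sketch does not handle an empty or malformed file; in that case `json.loads` raises and the file is left untouched.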

Available Tools

Once connected, the MCP server provides the following tools to the AI agent:

check_system_environment(): Diagnoses the current system environment, returning available unified memory (GB) and architecture details.
check_llm_status(port: int): Returns true if a server is currently running on the specified port.
list_running_servers(): Retrieves a list of all local LLM servers (ports and models) currently running in the background.
search_mlx_models(search_query: str = "", limit: int = 10): Searches Hugging Face for available MLX format models and lists their details (like download count and model ID).
download_model(model_name: str): Pre-downloads a specified MLX model from Hugging Face and caches it locally. Useful for preparing large models before launching.
launch_llm_server(model_name: str, port: int, memory_requirement_gb: float = 4.0): Launches an mlx_lm.server instance in the background. Includes an optional memory requirement check to prevent out-of-memory errors.
restart_llm_server(port: int, model_name: str = None, memory_requirement_gb: float = 4.0): Gracefully stops the running server on the given port and restarts it. If model_name is omitted, it restarts with the currently loaded model.
shutdown_llm_server(port: int): Gracefully terminates the running LLM server on the given port.
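
To give a feel for what the status, launch, and memory-guard tools involve, here is an illustrative sketch using only the standard library. It is not the server's code: all three function names are hypothetical, the `--model` and `--port` flags are assumed to match the mlx-lm version you have installed, and the memory guard simply compares two numbers supplied by the caller.

```python
import socket
import sys


def is_port_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def build_launch_command(model_name: str, port: int) -> list[str]:
    """Build an argv list for mlx-lm's HTTP server (flag names are an assumption)."""
    return [sys.executable, "-m", "mlx_lm.server",
            "--model", model_name, "--port", str(port)]


def memory_guard(available_gb: float, memory_requirement_gb: float = 4.0) -> bool:
    """Return True if the available memory meets the stated requirement."""
    return available_gb >= memory_requirement_gb
```

A launcher along these lines would check `memory_guard` and `is_port_listening` first, then pass the result of `build_launch_command` to `subprocess.Popen` so the server runs in the background.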
