
llama-diffusion-mcp

by hkbu-kennycheng


Python

Local

Llama Diffusion MCP Bridge

A robust, bidirectional Model Context Protocol (MCP) server that allows Large Language Models (like Claude) to seamlessly interact with diffusion-based LLMs (e.g., DiffusionGemma, LLaDA, RND1) via llama-diffusion-cli.

✨ Features

Bidirectional Interactive Chat: Spawns and manages a persistent background instance of llama-diffusion-cli to maintain conversation context and avoid reloading heavy GGUF weights on every turn.
Graceful Lifecycle Management: Includes tools for the LLM to cleanly terminate (/exit) and restart the background process when you ask to start a new chat session.
Zero-Setup Execution: Configured with uv and pyproject.toml so it can be run directly from the repository without manually managing virtual environments.
Fully Configurable: Supports all standard llama.cpp diffusion parameters (steps, algorithms, temperature, batch sizing) directly through initialization arguments.


🛠️ Prerequisites

Python 3.10+
uv (Recommended package manager)
llama-diffusion-cli: Must be compiled from the llama.cpp repository.

🚀 Quick Start & Installation

uv run --with git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git llama-diffusion-mcp -hf unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M -ngl 99 -n 128000

🔌 Connecting to Claude Desktop

To use this bridge with Claude Desktop (or any other MCP Client), add the server to your configuration file.

Path:

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Example Configuration (DiffusionGemma 26B)

{
  "mcpServers": {
    "llama-diffusion-chat": {
      "command": "uv",
      "args": [
        "run",
        "--with", "git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git",
        "llama-diffusion-mcp",
        "-hf", "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
        "-ngl", "99",
        "-n", "128000"
      ],
      "env": {
        "LLAMA_DIFFUSION_CLI_PATH": "/absolute/path/to/llama.cpp/build/bin/llama-diffusion-cli"
      }
    }
  }
}

Note: Restart Claude Desktop after updating the config.
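If the config file already lists other servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of this project: the helper name `add_mcp_server` is hypothetical, and the entry simply mirrors the example configuration above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into a Claude Desktop config file.

    Hypothetical helper: existing keys and other servers are preserved.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" section if missing, then add or replace our entry.
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Same values as the example configuration above.
ENTRY = {
    "command": "uv",
    "args": [
        "run",
        "--with", "git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git",
        "llama-diffusion-mcp",
        "-hf", "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
        "-ngl", "99",
        "-n", "128000",
    ],
    "env": {
        "LLAMA_DIFFUSION_CLI_PATH": "/absolute/path/to/llama.cpp/build/bin/llama-diffusion-cli"
    },
}

# Example call on macOS (path taken from the list above):
# add_mcp_server(
#     Path.home() / "Library/Application Support/Claude/claude_desktop_config.json",
#     "llama-diffusion-chat",
#     ENTRY,
# )
```

Back up the existing file before running something like this, and replace the placeholder CLI path with the real one on your machine.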

⚙️ Configuration Options

The MCP server accepts standard llama-diffusion-cli arguments:

Argument	Description
`-m`, `--model`	Path to the GGUF model file. Required unless the model is supplied another way, such as the `-hf` option used in the examples above.
`-i`, `--interactive`	Run in interactive mode (Highly recommended for this bridge).
`-c`, `--ctx-size`	Context size.
`-ub`, `--ubatch-size`	Maximum sequence length (ubatch size).
`--diffusion-steps`	Number of diffusion steps (e.g., 256).
`--diffusion-algorithm`	Algorithm for token selection (0-4).
`--temp`	Temperature for sampling.
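To show how the options in this table combine, here is a small sketch that assembles an argument list from keyword values. The function `build_cli_args` is hypothetical and not part of this project; it uses only the flags listed above and omits any option left unset.

```python
def build_cli_args(model, *, interactive=True, ctx_size=None, ubatch_size=None,
                   diffusion_steps=None, diffusion_algorithm=None, temp=None):
    """Build an argument list from the options in the table (illustrative only)."""
    # The table gives 0-4 as the valid range for the algorithm selector.
    if diffusion_algorithm is not None and diffusion_algorithm not in range(5):
        raise ValueError("diffusion_algorithm must be between 0 and 4")

    args = ["-m", str(model)]
    if interactive:
        args.append("-i")

    optional = [
        ("-c", ctx_size),
        ("-ub", ubatch_size),
        ("--diffusion-steps", diffusion_steps),
        ("--diffusion-algorithm", diffusion_algorithm),
        ("--temp", temp),
    ]
    for flag, value in optional:
        if value is not None:  # skip options the caller did not set
            args += [flag, str(value)]
    return args


# Example: 256 diffusion steps at a low temperature.
print(build_cli_args("model.gguf", diffusion_steps=256, temp=0.2))
```

The resulting list can be appended after `llama-diffusion-mcp` in the `args` array of the client configuration.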

Advanced MCP Settings

Setting	Description
`--mcp-prompt-marker`	The string the CLI prints when it is waiting for input (Default: `>`). The server reads the output stream until it sees this marker, then returns the reply.
`LLAMA_DIFFUSION_CLI_PATH`	Environment variable pointing to your CLI executable. Defaults to `llama-diffusion-cli`, which must then be on your system PATH.
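The two settings above can be illustrated with a short sketch. This is an assumption about how such a bridge could work, not the project's actual code: `resolve_cli_path` and `read_until_marker` are hypothetical names, and the marker is taken to be a bare `>` (whether the real default includes trailing whitespace is not clear from the table).

```python
import io
import os


def resolve_cli_path(env=None) -> str:
    """Return the CLI executable: the env var if set, else the bare command name."""
    env = os.environ if env is None else env
    return env.get("LLAMA_DIFFUSION_CLI_PATH") or "llama-diffusion-cli"


def read_until_marker(stream, marker: str = ">") -> str:
    """Read a text stream one character at a time until it ends with `marker`.

    Returns the text before the marker. If the stream ends first,
    returns everything that was read.
    """
    text = ""
    while True:
        ch = stream.read(1)
        if ch == "":  # end of stream: the process closed its output
            return text
        text += ch
        if text.endswith(marker):
            return text[: -len(marker)]


# Simulated CLI output: a reply followed by the input prompt.
fake_output = io.StringIO("Dreams may consolidate memory.\n>")
print(read_until_marker(fake_output))
```

A limitation of any marker-based approach is that a reply containing the marker string itself would be cut short, which is one reason to choose a marker that is unlikely to appear in generated text.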

🛠️ Exposed MCP Tools

Once connected, your LLM will have access to the following tools:

chat_with_diffusion(prompt: str): Sends a message to the persistently running Diffusion LLM and returns the generated text.
restart_chat_session(): Gracefully exits the current chat process using the /exit command and spins up a fresh session. The LLM will use this if you ask it to clear context or start over.
