llama-diffusion-mcp

by hkbu-kennycheng


Python

Local

Llama Diffusion MCP Bridge

A robust, bidirectional Model Context Protocol (MCP) server that allows Large Language Models (like Claude) to seamlessly interact with diffusion-based LLMs (e.g., DiffusionGemma, LLaDA, RND1) via llama-diffusion-cli.

✨ Features

Bidirectional Interactive Chat: Spawns and manages a persistent background instance of llama-diffusion-cli to maintain conversation context and avoid reloading heavy GGUF weights on every turn.
Graceful Lifecycle Management: Includes tools for the LLM to cleanly terminate (/exit) and restart the background process when you ask to start a new chat session.
Zero-Setup Execution: Configured with uv and pyproject.toml so it can be run directly from the repository without manually managing virtual environments.
Fully Configurable: Supports all standard llama.cpp diffusion parameters (steps, algorithms, temperature, batch sizing) directly through initialization arguments.


🛠️ Prerequisites

Python 3.10+
uv (Recommended package manager)
llama-diffusion-cli: Must be compiled from the llama.cpp repository.

🚀 Quick Start & Installation

uv run --with git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git llama-diffusion-mcp -hf unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M -ngl 99 -n 128000

🔌 Connecting to Claude Desktop

To use this bridge with Claude Desktop (or any other MCP Client), add the server to your configuration file.

Path:

Mac: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

Example Configuration (DiffusionGemma 26B-A4B)

{
  "mcpServers": {
    "llama-diffusion-chat": {
      "command": "uv",
      "args": [
        "run",
        "--with", "git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git",
        "llama-diffusion-mcp",
        "-hf", "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
        "-ngl", "99",
        "-n", "128000"
      ],
      "env": {
        "LLAMA_DIFFUSION_CLI_PATH": "/absolute/path/to/llama.cpp/build/bin/llama-diffusion-cli"
      }
    }
  }
}

Note: Restart Claude Desktop after updating the config.
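If you would rather not edit the JSON by hand, the merge can be scripted. The following is a minimal sketch, not part of this project: the helper names `default_config_path`, `add_server`, and `install` are hypothetical, and the entry simply mirrors the example configuration above. It keeps any servers already present in the file.

```python
import json
import os
import sys
from pathlib import Path

# Mirrors the example configuration above; adjust the CLI path for your machine.
ENTRY = {
    "command": "uv",
    "args": [
        "run",
        "--with", "git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git",
        "llama-diffusion-mcp",
        "-hf", "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
        "-ngl", "99",
        "-n", "128000",
    ],
    "env": {
        "LLAMA_DIFFUSION_CLI_PATH": "/absolute/path/to/llama.cpp/build/bin/llama-diffusion-cli"
    },
}


def default_config_path() -> Path:
    """Return the Claude Desktop config path listed above for this platform."""
    if sys.platform == "darwin":
        return (Path.home() / "Library" / "Application Support"
                / "Claude" / "claude_desktop_config.json")
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    raise RuntimeError("Only the Mac and Windows paths are documented here")


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def install(path: Path, name: str = "llama-diffusion-chat", entry: dict = ENTRY) -> None:
    """Merge the entry into the config file at path, creating the file if needed."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = add_server(existing, name, entry)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling `install(default_config_path())` would update the real file, so back it up first. The restart note above still applies afterwards.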

⚙️ Configuration Options

The MCP server accepts standard llama-diffusion-cli arguments:

| Argument | Description |
| --- | --- |
| `-m`, `--model` | Path to the GGUF model file. Required unless the model is supplied another way, such as the `-hf` reference used in the examples above. |
| `-i`, `--interactive` | Run in interactive mode (highly recommended for this bridge). |
| `-c`, `--ctx-size` | Context size. |
| `-ub`, `--ubatch-size` | Maximum sequence length (ubatch size). |
| `--diffusion-steps` | Number of diffusion steps (e.g., 256). |
| `--diffusion-algorithm` | Algorithm for token selection (0-4). |
| `--temp` | Temperature for sampling. |
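The options in this table end up as plain strings in the `args` array of the client configuration. As a rough illustration, the sketch below assembles that array from a dictionary. The `build_args` helper is hypothetical and not part of the project; it only uses flags that appear in this README, and the option values in the usage line are arbitrary examples.

```python
REPO = "git+https://github.com/hkbu-kennycheng/llama-diffusion-mcp.git"


def build_args(model_ref: str, options: dict[str, object]) -> list[str]:
    """Build the `args` list for the MCP client config.

    A value of True emits the flag alone (a switch such as -i);
    None or False skips the flag; anything else is emitted as "flag value".
    """
    args = ["run", "--with", REPO, "llama-diffusion-mcp", "-hf", model_ref]
    for flag, value in options.items():
        if value is True:
            args.append(flag)
        elif value is not None and value is not False:
            args.extend([flag, str(value)])
    return args


example = build_args(
    "unsloth/diffusiongemma-26B-A4B-it-GGUF:Q4_K_M",
    {"-i": True, "--diffusion-steps": 256, "--temp": 0.7},
)
```

Every element is converted to a string because the JSON configuration shown earlier passes numeric values such as `"99"` as strings.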

Advanced MCP Settings

| Setting | Description |
| --- | --- |
| `--mcp-prompt-marker` | The string the CLI prints when it is waiting for input (default: `>`). The server stops reading the output stream when it sees this marker. |
| `LLAMA_DIFFUSION_CLI_PATH` | Environment variable pointing to your CLI executable. Defaults to `llama-diffusion-cli`, which must then be on your system PATH. |
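To make these two settings concrete, here is a minimal sketch of how a bridge could resolve the executable and detect the end of a reply. It is an illustration under assumptions, not the project's actual code: `resolve_cli_path` and `read_until_marker` are hypothetical names, and the default marker is assumed to be the bare `>` character.

```python
import io
import os
from typing import Mapping, TextIO


def resolve_cli_path(env: Mapping[str, str] = os.environ) -> str:
    """Use LLAMA_DIFFUSION_CLI_PATH if set, else rely on the system PATH."""
    return env.get("LLAMA_DIFFUSION_CLI_PATH", "llama-diffusion-cli")


def read_until_marker(stream: TextIO, marker: str = ">") -> str:
    """Read characters until the text ends with marker, then return the text
    before it. At end of stream, return whatever was read so far."""
    if not marker:
        raise ValueError("marker must be a non-empty string")
    buf = ""
    while not buf.endswith(marker):
        ch = stream.read(1)
        if ch == "":  # end of stream: the process closed its output
            return buf
        buf += ch
    return buf[: -len(marker)]


# Simulated CLI output: one reply followed by the prompt marker.
reply = read_until_marker(io.StringIO("Dreams replay the day.\n>"))
```

A scheme like this stops at the first occurrence of the marker, so a reply that itself contains `>` would be cut short. That is one reason to choose a marker string that is unlikely to appear in generated text.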

🛠️ Exposed MCP Tools

Once connected, your LLM will have access to the following tools:

chat_with_diffusion(prompt: str) Sends a message to the persistently running Diffusion LLM and returns the generated text.
restart_chat_session() Gracefully exits the current chat process using the /exit command and spins up a fresh session. The LLM will use this if you ask it to clear context or start over.
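The behaviour of these two tools can be pictured as a thin wrapper around one long-lived child process. The sketch below is a simplified illustration, not the project's implementation: the `DiffusionSession` class is hypothetical, and `FAKE_CLI` is a small stand-in script that echoes its input so the example runs without a model. A real setup would launch `llama-diffusion-cli` instead.

```python
import subprocess
import sys

# Stand-in for llama-diffusion-cli: prints "> ", echoes each line, quits on /exit.
FAKE_CLI = [sys.executable, "-u", "-c", (
    "import sys\n"
    "sys.stdout.write('> '); sys.stdout.flush()\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line or line.strip() == '/exit':\n"
    "        break\n"
    "    sys.stdout.write('echo: ' + line + '> '); sys.stdout.flush()\n"
)]


class DiffusionSession:
    """Keep one child process alive across turns so context is preserved."""

    def __init__(self, command: list[str], marker: str = "> ") -> None:
        self.command = command
        self.marker = marker
        self._start()

    def _start(self) -> None:
        self.proc = subprocess.Popen(
            self.command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self._read_reply()  # discard startup output up to the first prompt

    def _read_reply(self) -> str:
        buf = ""
        while not buf.endswith(self.marker):
            ch = self.proc.stdout.read(1)
            if ch == "":  # child exited
                break
            buf += ch
        return buf.removesuffix(self.marker).strip()

    def chat(self, prompt: str) -> str:
        """Counterpart of chat_with_diffusion: send one line, return the reply."""
        self.proc.stdin.write(prompt + "\n")
        self.proc.stdin.flush()
        return self._read_reply()

    def close(self) -> None:
        """Ask the child to exit with /exit, and kill it if it does not."""
        try:
            self.proc.stdin.write("/exit\n")
            self.proc.stdin.flush()
            self.proc.wait(timeout=5)
        except (OSError, subprocess.TimeoutExpired):
            self.proc.kill()
            self.proc.wait()
        self.proc.stdin.close()
        self.proc.stdout.close()

    def restart(self) -> None:
        """Counterpart of restart_chat_session: exit, then start a fresh child."""
        self.close()
        self._start()
```

The sketch sends single-line prompts only. A prompt containing newlines would reach the child as several separate inputs, so a real bridge needs its own handling for that case.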


license - permissive license
