Local Image Gen

MCP server for local image generation. Designed to run on a Windows GPU box (RTX 3060 12GB at your parents' house) and be called remotely by Claude Cowork on a MacBook over Tailscale.

  • Backend: black-forest-labs/FLUX.2-klein-4B via Hugging Face diffusers

  • Server: FastMCP over HTTP (streamable-http transport)

  • Tool: generate_image (one tool, that's it for the MVP)

Setup on the Windows GPU host

Prereqs: Windows 10/11, NVIDIA RTX 3060 (12GB) with up-to-date driver, Python 3.11+ available.

git clone <this-repo>
cd Local-Image-Gen
.\scripts\setup_windows.ps1
.venv\Scripts\Activate.ps1
huggingface-cli login   # needed if the model is gated
copy .env.example .env
uv run main.py

The first call to generate_image will download the model to ./models/ (~10 GB, one-time). Watch for [pipeline] ready in the server output before invoking tools from Cowork.

Adding to Claude Cowork (MacBook)

In Cowork's MCP config:

{
  "mcpServers": {
    "local-image-gen": {
      "url": "http://<windows-pc-tailscale-ip>:8765/mcp"
    }
  }
}

The Windows PC's Tailscale IP looks like 100.x.y.z — get it with tailscale ip -4 on the Windows box.
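
A quick sanity check before editing the config: Tailscale assigns IPv4 addresses from the 100.64.0.0/10 range, so an address outside it is probably the PC's LAN address rather than its Tailscale one. The sketch below is not part of this repo; build_mcp_config is a hypothetical helper that uses only the Python standard library, and the IP in the example is a placeholder.

```python
import ipaddress
import json

# Tailscale hands out IPv4 addresses from the CGNAT block 100.64.0.0/10.
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")


def build_mcp_config(tailscale_ip: str, port: int = 8765) -> dict:
    """Return the Cowork MCP config entry for a Tailscale IPv4 address.

    Hypothetical helper, not part of this repo.
    """
    addr = ipaddress.ip_address(tailscale_ip)
    if addr not in TAILSCALE_RANGE:
        raise ValueError(f"{tailscale_ip} is not in {TAILSCALE_RANGE}")
    return {
        "mcpServers": {
            "local-image-gen": {"url": f"http://{addr}:{port}/mcp"}
        }
    }


if __name__ == "__main__":
    # Placeholder address; use the output of `tailscale ip -4` instead.
    print(json.dumps(build_mcp_config("100.101.102.103"), indent=2))
```

Passing a LAN address such as 192.168.1.20 raises ValueError, which catches the most common copy-paste mistake.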

The generate_image tool

| Param | Type | Default | Notes |
|---|---|---|---|
| prompt | string | required | What to draw |
| width | int | 1024 | Multiple of 8 |
| height | int | 1024 | Multiple of 8 |
| num_inference_steps | int | 4 | Distilled models: 4. Non-distilled: 20-30 |
| guidance_scale | float | 1.0 | Distilled: 1.0 or 0.0. Non-distilled: ~3.5 |
| seed | int or null | random | Use the same seed across carousel slides for style consistency |
| save_to_disk | bool | true | Saves PNG to IMG_OUTPUT_DIR |
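
The constraints listed for these parameters can be checked on the client before a request goes out, which avoids a round trip for an obviously bad size. This is a minimal sketch that assumes the parameter names and defaults listed above; build_generate_args is a hypothetical helper, not something the server exposes.

```python
from typing import Optional


def build_generate_args(
    prompt: str,
    width: int = 1024,
    height: int = 1024,
    num_inference_steps: int = 4,
    guidance_scale: float = 1.0,
    seed: Optional[int] = None,
    save_to_disk: bool = True,
) -> dict:
    """Build the argument dict for a generate_image call (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % 8 != 0:
            raise ValueError(f"{name} must be a positive multiple of 8, got {value}")
    if num_inference_steps < 1:
        raise ValueError("num_inference_steps must be at least 1")
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
        "seed": seed,  # None is assumed to mean "server picks a random seed"
        "save_to_disk": save_to_disk,
    }


# Reusing one seed across carousel slides, as the seed note suggests.
slides = ["sunrise over mountains", "the same mountains at dusk"]
carousel = [build_generate_args(p, seed=1234567890) for p in slides]
```

Each dict in carousel can then be passed as the tool arguments for one slide.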

Returns:

{
  "image_b64": "<base64 PNG>",
  "path": "C:\\...\\generated\\1234567890_abc123.png",
  "seed_used": 1234567890,
  "width": 1024,
  "height": 1024,
  "elapsed_seconds": 3.42
}

On error:

{ "error": "CUDA out of memory...", "error_type": "OOM" }
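
On the client side, the two response shapes above can be told apart by the presence of the error key. The following is a rough sketch of that handling, assuming the response has already been parsed into a Python dict. save_result and GenerationError are hypothetical names, and the PNG signature check is an extra sanity test rather than something the server requires.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


class GenerationError(RuntimeError):
    """Raised when the tool returns an error payload (hypothetical name)."""


def save_result(result: dict, out_dir: str = ".") -> Path:
    """Decode image_b64 from a generate_image result and write it locally."""
    if "error" in result:
        kind = result.get("error_type", "unknown")
        raise GenerationError(f"[{kind}] {result['error']}")
    data = base64.b64decode(result["image_b64"])
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded payload is not a PNG")
    # Name the local copy after the seed so a slide can be regenerated later.
    out = Path(out_dir) / f"{result['seed_used']}.png"
    out.write_bytes(data)
    return out
```

A caller that catches GenerationError with an OOM error_type could retry with a smaller width and height, though whether that is enough depends on the model and the card.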

Config (env vars, prefix IMG_)

| Var | Default | Notes |
|---|---|---|
| IMG_MODEL_ID | black-forest-labs/FLUX.2-klein-4B | HF repo id |
| IMG_DEVICE | cuda | |
| IMG_LOW_VRAM | true | VAE tiling + attention slicing. Leave on for 12GB cards |
| IMG_HOST | 0.0.0.0 | |
| IMG_PORT | 8765 | |
| IMG_OUTPUT_DIR | ./generated | Where PNGs land |
| IMG_CACHE_DIR | ./models | Where the model is downloaded |
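
To make the prefix convention concrete, here is a standard-library sketch of how IMG_-prefixed variables could map onto a settings object. It is an assumption about the shape of the config, not the repo's actual loader (which may well use a settings library instead); the Settings class and load_settings function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Defaults copied from the config list above."""
    model_id: str = "black-forest-labs/FLUX.2-klein-4B"
    device: str = "cuda"
    low_vram: bool = True
    host: str = "0.0.0.0"
    port: int = 8765
    output_dir: str = "./generated"
    cache_dir: str = "./models"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Overlay IMG_* variables on the defaults (hypothetical loader)."""
    env = os.environ if env is None else env
    values = {}
    for name, default in vars(Settings()).items():
        raw = env.get(f"IMG_{name.upper()}")
        if raw is None:
            continue  # variable not set, keep the default
        if isinstance(default, bool):  # check bool first: bool is a subclass of int
            values[name] = raw.strip().lower() in ("1", "true", "yes", "on")
        elif isinstance(default, int):
            values[name] = int(raw)
        else:
            values[name] = raw
    return Settings(**values)
```

Passing a plain dict instead of reading os.environ keeps the loader easy to test without touching the real environment.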

Networking: MacBook ↔ Windows PC

Use Tailscale — free for personal use, no port forwarding on the parents' router, encrypted.

  1. Install Tailscale on both machines, sign in to the same account

  2. Note the Windows PC's Tailscale IP (100.x.y.z)

  3. Use that IP in the Cowork MCP config above

For wake-on-LAN (so the PC doesn't have to run 24/7):

  • Enable "Wake on LAN" in BIOS and in the NIC's advanced power settings

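  • Send the wake-up packet from an always-on device on the same LAN as the PC (for example, a small machine that is also on the tailnet). A powered-off PC is offline from Tailscale, and tailscale wake <hostname> does not appear to be a command in the stock Tailscale CLI, so verify it before relying on it from the Mac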
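
For reference, a Wake-on-LAN "magic packet" is 6 bytes of 0xFF followed by the target MAC address repeated 16 times, usually sent as a UDP broadcast to port 9. The sketch below builds and sends one with the standard library. The function names are hypothetical and the MAC address in the usage note is a placeholder. It has to run on a device that shares the Windows PC's LAN, since a local broadcast is not routed across the internet.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload for a MAC like 'AA:BB:CC:DD:EE:FF'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"invalid MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Calling send_magic_packet("AA:BB:CC:DD:EE:FF") with the NIC's real MAC address should power the PC on, provided Wake on LAN is enabled as described above.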

Dev on the MacBook (no GPU)

You can iterate on the server code without a GPU by switching to a small model:

IMG_MODEL_ID=stable-diffusion-v1-5/stable-diffusion-v1-5 IMG_DEVICE=cpu uv run main.py

CPU generation is slow (several minutes per image), but the round trip works.
