Which integrations are available for this server?

Allows querying local Ollama models (e.g., gemma3) as a tool, routing low-stakes work off the API and onto a homelab.

How do I use MCP-Demo?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP-Demo ask gemma3:4b to explain MCP in simple terms".

That's it! The server will respond to your query, and you can continue using it as needed.

mcp-demo

A minimal Python MCP server, built in phases so each commit teaches one concept. End state: Claude Code can call local Ollama models (gemma3:4b or gemma3:12b) as a tool, routing low-stakes work off the API and onto the homelab.

What is MCP, really?

Model Context Protocol is a standardized JSON-RPC 2.0 protocol that lets an LLM client (Claude Code, Claude Desktop, Cursor) discover and invoke tools, fetch resources, and load prompt templates from separate processes called MCP servers. MCP is to LLM tooling what LSP is to IDE language support: one protocol, many interoperable implementations.
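To make "discover and invoke tools" concrete, here is a rough sketch of the JSON-RPC 2.0 envelopes involved. The method names `tools/list` and `tools/call` come from the MCP specification; the helper functions and the `text` argument name for `echo` are assumptions made for illustration, not code from this repo.

```python
import json
from itertools import count

_ids = count(1)  # request ids only need to be unique within one session


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request; the id lets the reply be matched to it."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def make_result(request, result):
    """Build the success response a server would send back for `request`."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


# Discovery: ask the server which tools it offers.
list_req = make_request("tools/list")

# Invocation: call the echo tool (the argument name "text" is assumed).
call_req = make_request("tools/call", {"name": "echo", "arguments": {"text": "hi"}})
call_res = make_result(call_req, {"content": [{"type": "text", "text": "hi"}]})

print(json.dumps(call_req))
print(json.dumps(call_res))
```

The response carries the same `id` as the request, which is how a client pairs replies with calls when several are in flight.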

The moving pieces

    ┌───────────────────┐  stdio / HTTP   ┌──────────────────┐
    │  MCP Client       │ ◄──JSON-RPC──► │  MCP Server      │
    │  (Claude Code)    │                │  (this repo)     │
    └───────────────────┘                └──────────────────┘
            │                                     │
            │ spawns as child                     │ hits
            │ process (stdio)                     │ localhost:11434
            ▼                                     ▼
     your shell env                        Ollama / gemma3

Client — embedded in the LLM app
Server — any process that speaks MCP
Transport — stdio (client spawns server as child, pipes JSON-RPC) or streamable-http (server is a web service). This repo uses stdio.
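The stdio transport can be sketched with nothing but the standard library: the client spawns a child process and exchanges one JSON message per line over its pipes. The child below is a stand-in written for this example, not this repo's server, and `roundtrip` is a hypothetical helper name.

```python
import json
import subprocess
import sys

# Stand-in "server": reads JSON-RPC lines from stdin, answers on stdout.
# It only illustrates the plumbing and does not implement MCP.
CHILD_SOURCE = r'''
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    res = {"jsonrpc": "2.0", "id": req["id"], "result": {"echo": req["method"]}}
    sys.stdout.write(json.dumps(res) + "\n")
    sys.stdout.flush()
'''


def roundtrip(request):
    """Spawn the child, send one newline-delimited message, read one reply."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        proc.stdin.write(json.dumps(request) + "\n")  # one message per line
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    # Leaving the with-block closes the pipes; EOF ends the child's loop.


reply = roundtrip({"jsonrpc": "2.0", "id": 1, "method": "ping"})
print(reply)
```

Because stdout is the message channel, anything else a stdio server prints there would corrupt the stream, which is why logging usually goes to stderr.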

Three primitives an MCP server exposes

| Primitive | What it is | This repo's use |
| --- | --- | --- |
| Tools | Functions the LLM can call | `echo`, `ollama_ask` |
| Resources | Read-only blobs the client can fetch | `ollama://models` (list pulled models) |
| Prompts | Pre-canned prompt templates | (not used; kept minimal) |
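As a rough illustration of how tools and resources differ on the wire, the toy dispatcher below answers `tools/list`, `tools/call`, and `resources/read` from plain dictionaries. It is a sketch, not this repo's implementation: a real server would normally let an MCP SDK handle this, and the `text` parameter and the canned model list are assumptions.

```python
def echo(text: str) -> str:
    """The simplest possible tool: return the input unchanged."""
    return text


# Toy registries standing in for what an MCP SDK would manage.
TOOLS = {
    "echo": {
        "fn": echo,
        "description": "Return the input unchanged.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    }
}
# Canned placeholder data; the real resource would query Ollama.
RESOURCES = {"ollama://models": lambda: "gemma3:4b\ngemma3:12b"}


def handle(method, params=None):
    """Map an MCP method name to a result payload."""
    params = params or {}
    if method == "tools/list":
        return {
            "tools": [
                {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
                for n, t in TOOLS.items()
            ]
        }
    if method == "tools/call":  # tools take arguments and do work
        fn = TOOLS[params["name"]]["fn"]
        return {"content": [{"type": "text", "text": fn(**params.get("arguments", {}))}]}
    if method == "resources/read":  # resources are addressed by URI, read-only
        uri = params["uri"]
        return {"contents": [{"uri": uri, "text": RESOURCES[uri]()}]}
    raise ValueError(f"unsupported method: {method}")


print(handle("tools/call", {"name": "echo", "arguments": {"text": "hi"}}))
print(handle("resources/read", {"uri": "ollama://models"}))
```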

Phase progression

Each phase is one commit — git log shows the evolution.

✅ scaffold — pyproject, README, .gitignore.
✅ hello-world server — one echo tool + smoke-test client. Proves the full lifecycle: client spawn → handshake → tool discovery → tool call.
✅ ollama_ask tool — async tool that POSTs to localhost:11434/api/generate and returns Gemma's reply.
✅ polish — adds system parameter to ollama_ask and exposes a resource at ollama://models (list of pulled models).
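The Ollama side of phases 3 and 4 might look roughly like the sketch below. It uses the synchronous standard-library `urllib` for brevity, whereas the repo's tool is described as async, so treat it as an approximation. The function names, default model, and timeout are assumptions; the `/api/generate` and `/api/tags` endpoints and their `response` and `models` fields are from Ollama's REST API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"


def build_generate_payload(prompt, model="gemma3:4b", system=None):
    """Body for POST /api/generate; stream=False asks for a single JSON reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if system:
        payload["system"] = system  # optional system prompt (phase 4)
    return payload


def ollama_ask(prompt, model="gemma3:4b", system=None, timeout=120):
    """Send the prompt to a local Ollama instance and return the model's reply."""
    body = json.dumps(build_generate_payload(prompt, model, system)).encode("utf-8")
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


def model_names(tags_response):
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


print(build_generate_payload("Explain MCP briefly.", system="Be concise."))
```

With Ollama running, `ollama_ask("Explain MCP in simple terms")` would return the generated text; without it, the call raises a `urllib.error.URLError`.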

Install & run

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e .
    # Phase 2+ only: run the server by hand to smoke-test
    python -m mcp_demo.server

The server reads JSON-RPC off stdin and writes to stdout — if you run it directly in a terminal it will just sit there waiting for input. That's expected. The client (Claude Code) is what actually feeds it.
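For a manual smoke test, the lines a client would feed the waiting server can be generated as below. This is a sketch of the opening MCP handshake (`initialize`, then the `notifications/initialized` notification, then `tools/list`); the protocol version string and the `clientInfo` values are assumptions and may need adjusting to whatever the installed SDK negotiates.

```python
import json


def handshake_lines(protocol_version="2024-11-05"):
    """Return the first three client messages, one JSON document per line."""
    messages = [
        {
            "jsonrpc": "2.0",
            "id": 1,
            "method": "initialize",
            "params": {
                "protocolVersion": protocol_version,
                "capabilities": {},
                "clientInfo": {"name": "manual-smoke-test", "version": "0.0.1"},
            },
        },
        # Notifications carry no id, so the server sends no reply to this one.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    ]
    return [json.dumps(m) for m in messages]


for line in handshake_lines():
    print(line)
```

Pasting these lines one at a time into the terminal where the server is waiting should, if the handshake is accepted, produce a JSON reply after the first and third line.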

Registering with Claude Code

Add to ~/.claude/settings.json:

    {
      "mcpServers": {
        "mcp-demo": {
          "command": "/home/booty/mcp-demo/.venv/bin/python",
          "args": ["-m", "mcp_demo.server"]
        }
      }
    }

Restart Claude Code. Tools appear as mcp__mcp-demo__<tool_name>.
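Since the interpreter path differs per machine, a small helper can generate the entry and show the resulting tool names. This is only a convenience sketch: `server_entry` and `claude_tool_name` are hypothetical helpers, and the repo path is the one from the example above.

```python
import json
from pathlib import PurePosixPath


def server_entry(repo_dir, module="mcp_demo.server"):
    """Build the mcpServers entry that points at the repo's venv interpreter."""
    python = PurePosixPath(repo_dir) / ".venv" / "bin" / "python"
    return {"command": str(python), "args": ["-m", module]}


def claude_tool_name(server, tool):
    """Name under which Claude Code lists a tool from a registered server."""
    return f"mcp__{server}__{tool}"


config = {"mcpServers": {"mcp-demo": server_entry("/home/booty/mcp-demo")}}
print(json.dumps(config, indent=2))
print(claude_tool_name("mcp-demo", "ollama_ask"))
```

Using the venv's own interpreter as `command` means the server starts with its dependencies available even though Claude Code does not activate the virtual environment.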
