
MCP 101

Calling a tool

  1. Make sure that nothing is listening on ports 8000 and 8080. Open 3 generously sized terminals on your screen.

  2. Download a sensible model. Qwen 3.5 4B is sensible.

  3. Compile fresh llama.cpp:

    git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp
    cmake -B build && cmake --build build --config Release -j 6
  4. Launch the llama in terminal #1:

    ./llama-server -m ~/Downloads/Qwen3.5-4B-Q8_0.gguf --ctx-size 4096 --temp 1.0 --top-p 0.95 --top-k 20 --min-p 0.00 --verbose --webui-mcp-proxy
  5. Clone this repository:

    git clone https://github.com/behavioral-ds/mcp-example && cd mcp-example
  6. Install deps: poetry install && poetry shell

  7. Launch MCP in terminal #2: python mcp_serve.py

  8. Execute the Agentic Call™ in terminal #3: python call.py

  9. Observe the dance between LLM <-> Inference engine <-> MCP <-> Client.
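Under the hood, that dance is JSON-RPC 2.0 messages flowing over the MCP transport. A minimal sketch of what a single tool call looks like on the wire — the tool name `get_weather` and its arguments are invented for illustration (check `mcp_serve.py` for the tools this repo actually registers):

```python
import json

# Hypothetical MCP tool-call exchange, following the JSON-RPC 2.0 shape
# the MCP spec uses. The tool name and arguments are made up.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_weather",
        "arguments": {"city": "Sydney"},
    },
}

# A typical server reply: a list of content blocks plus an error flag.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "18°C, partly cloudy"}],
        "isError": False,
    },
}

wire = json.dumps(request)   # what the client sends to the MCP server
parsed = json.loads(wire)    # what the server decodes on the other side
print(parsed["method"])      # → tools/call
```

The inference engine never sees this layer directly: the client turns the model's tool-call token output into a `tools/call` request, and feeds the `result.content` text back into the context.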

Using MCP prompts

  1. Open the llama.cpp web UI at http://localhost:8080/, go to Settings, and add a new MCP server.

  2. Select "MCP prompt" when drafting a new message.

  3. That's your @mcp.prompt() parsed into a UI element; click it.

  4. ...and supply some meaningful content.

  5. Then click "Use prompt" and rejoice.
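The form the UI renders comes from the server's prompt metadata; clicking "Use prompt" triggers a `prompts/get` request and pastes the returned messages into your draft. A sketch of that exchange — the prompt name `summarize` and its `topic` argument are hypothetical, not taken from this repo:

```python
import json

# Hypothetical prompts/get exchange (JSON-RPC 2.0). The prompt name
# "summarize" and the "topic" argument are invented for illustration;
# see mcp_serve.py for the prompt this repo actually registers.
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "prompts/get",
    "params": {"name": "summarize", "arguments": {"topic": "MCP 101"}},
}

# The server renders its @mcp.prompt() function into chat messages:
response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {
        "messages": [
            {
                "role": "user",
                "content": {"type": "text", "text": "Summarize: MCP 101"},
            }
        ]
    },
}

first = response["result"]["messages"][0]
print(first["role"], "-", first["content"]["text"])
```

In other words, prompts are just server-side templates: the UI collects the arguments, the server fills them in, and the expanded messages land in your chat box.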
