Skip to main content
Glama
aris-md

aris-md/mcp

by aris-md

MCP + LLM — Learning Project

A minimal, well-structured implementation of the Model Context Protocol (MCP) connected to OpenAI. Built to understand how MCP works in practice, with clean separation between the tool layer, the transport layer, and the LLM layer.


What is MCP?

MCP (Model Context Protocol) is an open standard created by Anthropic that lets you expose tools to any AI agent in a provider-agnostic way.

Without MCP, tools are hardcoded inside each LLM client. With MCP, tools live on a dedicated server — any client that speaks the protocol can discover and call them, regardless of which LLM it uses.

Without MCP                          With MCP
-----------                          --------
Client A  →  tools (copy)            MCP Server  ←─  Client A (OpenAI)
Client B  →  tools (copy)                        ←─  Client B (Claude)
Client C  →  tools (copy)                        ←─  Claude Desktop
                                                 ←─  Cursor

The tools are the value. The MCP server is just the standardized container that exposes them.


How it works — exact flow

1. CLIENT  ──►  MCP SERVER     "list your tools"
2. MCP SERVER  ──►  CLIENT     [{name, description, inputSchema}, ...]

3. CLIENT  translates MCP schema → LLM-specific format   ← only LLM-specific line

4. CLIENT  ──►  LLM            user prompt + translated tools
5. LLM  ──►  CLIENT            [{tool_name, arguments}, ...]

6. CLIENT  ──►  MCP SERVER     execute(tool_name, arguments)
7. MCP SERVER  ──►  CLIENT     result

8. CLIENT  ──►  LLM            result appended to conversation
   [repeat 5→8 until LLM returns text instead of tool calls]

9. LLM  ──►  CLIENT            final answer

The MCP server never knows which LLM is being used. It only receives (tool_name, arguments) and returns a result.


Project structure

mcp/
│
├── server/                         MCP Server (run on any machine)
│   ├── tools/
│   │   ├── base.py                 BaseTool — abstract class (definition + run)
│   │   ├── google_search.py        GoogleSearchTool
│   │   ├── api_search.py           ApiSearchTool
│   │   └── client_id_sum.py        ClientIdSumTool
│   ├── tool_registry.py            ToolRegistry — registers and dispatches tools
│   └── server.py                   MCPServer — HTTP transport only
│
├── client/                         MCP Client + LLM (run on your machine)
│   ├── mcp_client.py               MCPClient — connects to the MCP server
│   ├── llm/
│   │   ├── base.py                 BaseLLM — abstract class (to_tools + run_conversation)
│   │   └── openai_llm.py           OpenAILLM — concrete OpenAI implementation
│   └── client.py                   App — wires MCP and LLM, interactive terminal
│
└── .env                            API keys and config

Quickstart

1. Install dependencies

For testing, you can use the same machine, otherwise: Server machine:

pip install -r server/requirements.txt

Client machine:

pip install -r client/requirements.txt

2. Configure .env

# MCP Server
MCP_HOST=0.0.0.0
MCP_PORT=8000
MCP_API_KEY=               # leave empty to disable auth

# MCP Client
MCP_SERVER_URL=http://localhost:8000/mcp   # change host when server is remote

# OpenAI
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o

3. Start the server

python server/server.py

4. Start the client (interactive terminal)

python client/client.py

The 3 tools

Tool

What it does

Implementation

google_search

Web search via DuckDuckGo

asyncio.to_thread (DuckDuckGo has no async API)

api_search

Product search via REST API (DummyJSON)

httpx.AsyncClient — truly async

client_id_sum

Extracts digits from a client ID and sums them

Pure computation


Transport: Streamable HTTP (MCP spec 2025-03-26)

This project uses the current MCP transport, not the deprecated SSE transport.

Transport

MCP spec

Status

Endpoints

SSE

2024-11-05

Deprecated

/sse + /messages/

Streamable HTTP

2025-03-26

Current

/mcp (single)

Streamable HTTP is stateless-friendly (works behind load balancers), CDN-compatible, and doesn't require a persistent connection.


Adapting to a different LLM

The only two things that change when switching LLM providers are in client/llm/:

  1. Schema translation (to_tools) — how you describe tools to the LLM

  2. Response parsing (run_conversation) — how the LLM requests tool calls back

LLM

to_tools format

Tool call parsing

OpenAI

{"type":"function","function":{...}}

msg.tool_calls[i].function.name/arguments

Claude

{"name":..., "input_schema":{...}}

block.type == "tool_use"block.name/input

Gemini

{"function_declarations":[{...}]}

part.function_call.name/args

Mistral

identical to OpenAI

identical to OpenAI

Ollama (local)

identical to OpenAI

identical to OpenAI

To swap providers, create a new file client/llm/claude_llm.py (or similar) with a class extending BaseLLM, then change one import line in client/client.py:

from llm.openai_llm import OpenAILLM   # change this line only

The server, tools, and MCPClient are untouched.


Adding a new tool

  1. Create server/tools/your_tool.py:

from mcp import types
from .base import BaseTool

class YourTool(BaseTool):

    @property
    def definition(self) -> types.Tool:
        return types.Tool(
            name="your_tool",
            description="What this tool does.",
            inputSchema={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "..."},
                },
                "required": ["param"],
            },
        )

    async def run(self, param: str) -> dict:
        return {"result": param}
  1. Register it in server/tool_registry.py — add one import and one instance:

from tools.your_tool import YourTool

registry = ToolRegistry([
    GoogleSearchTool(),
    ApiSearchTool(),
    ClientIdSumTool(),
    YourTool(),            # ← add here
])

That's it. All connected clients see the new tool automatically on next connect.


Token cost and latency

Tool definitions are sent as input tokens on every request, even if no tool is called. Each tool call adds one extra LLM round trip — the context grows with each turn:

Turn 1 input:  schema + user prompt
Turn 2 input:  schema + user prompt + tool result #1        ← prefix paid again
Turn 3 input:  schema + user prompt + tool result #1 + #2   ← prefix paid again

If the LLM calls 2 tools, you pay for 3 separate API calls with a growing context. A poorly described tool that causes retries is disproportionately expensive.

Latency per tool call: ~5ms MCP round trip (local) / ~50–200ms (remote) + one extra LLM inference (~300ms–2s depending on model).


Native MCP support (no client code needed)

Some consumers connect to your MCP server directly without any translation layer:

Consumer

How

OpenAI Responses API

Pass {"type": "mcp", "server_url": "..."} — OpenAI connects from their side (server must be public)

Claude Desktop

Add server URL in claude_desktop_config.json

Claude Code

Add server URL in settings

Cursor / Windsurf / Continue

Add server URL in their MCP settings

Your server/server.py works with all of the above without modification.


Why MCP instead of direct function calling?

For a single client + single LLM, you don't need MCP. Direct function calling is simpler.

MCP pays off when you have multiple consumers:

MCP Server (tools defined once)
    ├── your Python client
    ├── Claude Desktop        (zero extra code)
    ├── Cursor                (zero extra code)
    ├── a Node.js agent       (different language, same tools)
    └── a teammate's script   (different team, same tools)

Direct function calling

MCP

Setup

Simple

Requires a server

Reuse across clients

Copy-paste

Point at the server URL

Multi-language

No

Yes

Ecosystem apps (Cursor, Claude)

No

Yes

Credentials security

In client

Stay on server

MCP is microservices for AI tools — it only pays off when multiple things consume the same tools.


Finding existing tools

Instead of building from scratch, you can point your client at existing MCP tool servers:

Registry

URL

Official Anthropic servers

github.com/modelcontextprotocol/servers

Community marketplace

smithery.ai

Curated list (500+)

github.com/punkpeye/awesome-mcp-servers

Notable ready-made tools: brave-search, fetch (any URL → markdown), postgres, github, slack, puppeteer (browser control), filesystem, google-maps.

F
license - not found
-
quality - not tested
-
maintenance - not tested

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aris-md/mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server