Mahnooramjad05

MCP Agent Toolkit

An agent built on the Model Context Protocol (MCP) that discovers and calls external tools for multi-step research, data lookup, and code-execution tasks, returning reliable structured JSON.

Overview

This project implements a real, runnable MCP server (using the official mcp Python SDK) exposing four distinct tool groups (seven callable tools in total): a read-only SQL tool over a seeded SQLite database, a REST API tool, a sandboxed Python code interpreter, and a set of four file-resource lifecycle tools. An MCP client/agent connects to the server over stdio, dynamically discovers the available tools (nothing is hardcoded), and drives a plan-act loop with a pluggable LLM (OpenAI, Anthropic, or a deterministic offline mock) to satisfy multi-step natural-language tasks such as "look up customer X's orders, sum their total, and tell me if it exceeds $500." The agent's final answer is validated against a Pydantic schema before being returned, and every tool call made along the way is recorded in a trace. A small FastAPI layer exposes this agent over HTTP as POST /agent/run. The SQL tool also demonstrates MCP "context negotiation": callers request a page size, the server clamps it to a safe maximum and reports whether more data is available.
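
The page-size clamping described above can be sketched with nothing but the standard library. This is a minimal illustration, not the project's sql_query implementation: the function name paginated_select and the in-memory orders table are made up for the example, while the 25-row cap and the metadata field names follow the values shown elsewhere in this README.

```python
import sqlite3

MAX_PAGE_SIZE = 25  # cap taken from this README (page_size is clamped to [1, 25])


def paginated_select(conn, sql, page_size=10, offset=0):
    """Run a SELECT with a server-clamped page size and report paging metadata.

    Hypothetical helper: assumes `sql` is a single SELECT with no trailing semicolon.
    """
    effective = max(1, min(int(page_size), MAX_PAGE_SIZE))
    offset = max(0, int(offset))
    # Count all matching rows so the caller knows whether more pages exist.
    total = conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]
    cur = conn.execute(f"SELECT * FROM ({sql}) LIMIT ? OFFSET ?", (effective, offset))
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    end = offset + len(rows)
    has_more = end < total
    return {
        "columns": columns,
        "rows": rows,
        "row_count_returned": len(rows),
        "total_rows_matched": total,
        "offset": offset,
        "requested_page_size": page_size,
        "effective_page_size": effective,
        "has_more": has_more,
        "next_offset": end if has_more else None,
    }


# Demo data: 60 rows in an in-memory table (not the bundled sample.db).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, quantity INTEGER)")
conn.executemany("INSERT INTO orders (quantity) VALUES (?)", [(i,) for i in range(1, 61)])

QUERY = "SELECT order_id, quantity FROM orders ORDER BY order_id"
page = paginated_select(conn, QUERY, page_size=1000)
print(page["effective_page_size"], page["has_more"], page["next_offset"])  # 25 True 25
```

A caller that asks for 1000 rows still receives at most 25, and can keep calling with offset=next_offset until has_more is false.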

Key Features

  • MCP server with 4 distinct tool groups (7 callable tools), built on the official mcp Python SDK (FastMCP):

    • sql_query — read-only SQL access to a bundled SQLite sample database (customers, products, orders), with context-negotiated pagination (page_size/offset, server-clamped, has_more/next_offset reported back).

    • rest_exchange_rate — calls a REST API (a bundled local mock FastAPI server, see "REST tool: real API vs. mock" below) to fetch currency exchange rates.

    • run_python — a sandboxed Python code interpreter: snippets are statically checked against an import/name allowlist-denylist, then executed in an isolated subprocess with a wall-clock timeout (and, on POSIX, CPU/memory rlimits).

    • list_resources / open_resource / read_resource / release_resource — explicit MCP resource lifecycle management (list available file resources, open a handle, read through that handle one or more times, then explicitly release it; reads after release fail).

  • MCP client/agent (agent/client.py) that performs live tool discovery via tools/list (no hardcoded tool list) and runs a bounded plan-act loop, calling MCP tools and observing results until it produces a final answer.

  • Pluggable LLM (agent/llm.py): LLM_PROVIDER=openai|anthropic|mock. Defaults to a deterministic, rule-based mock LLM that requires no API key and no network access, so the whole project (including all tests) runs fully offline.

  • Structured (JSON-mode) output: the agent's final answer is always validated against the AgentAnswer Pydantic schema (agent/schemas.py), including a typed tool-call trace.

  • Context negotiation: the SQL tool clamps requested page sizes to a server-side maximum and reports has_more/next_offset, demonstrating negotiated, paginated tool results instead of unbounded responses.

  • FastAPI wrapper (api/main.py) exposing POST /agent/run, which runs the full client-server-agent interaction per request and returns the structured result plus the discovered tool names.

  • pytest suite (18 tests) covering tool discovery, SQL correctness and pagination, sandbox accept/reject behaviour, the REST tool against the bundled mock server, resource lifecycle enforcement, and end-to-end structured agent output — all offline, no API key required.

  • Docker image that seeds the database at build time and runs the mock REST server plus the FastAPI app.
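
The open/read/release contract of the resource lifecycle tools listed above can be illustrated with a small in-memory handle registry. This is only a sketch under assumptions: the ResourceRegistry class, its use of uuid for handle ids, and the KeyError on a released handle are choices made for the example, not a description of how mcp_server/server.py is written.

```python
import tempfile
import uuid
from pathlib import Path


class ResourceRegistry:
    """Hypothetical handle registry: reads are valid only between open and release."""

    def __init__(self, root):
        self.root = Path(root)
        self._handles = {}  # handle_id -> file path

    def list_resources(self):
        return sorted(p.name for p in self.root.iterdir() if p.is_file())

    def open_resource(self, filename):
        # Only bare names returned by list_resources() are accepted,
        # which also rules out path traversal such as "../secret".
        if filename not in self.list_resources():
            raise FileNotFoundError(filename)
        handle_id = uuid.uuid4().hex
        self._handles[handle_id] = self.root / filename
        return handle_id

    def read_resource(self, handle_id):
        path = self._handles.get(handle_id)
        if path is None:
            raise KeyError(f"unknown or released handle: {handle_id}")
        return path.read_text(encoding="utf-8")

    def release_resource(self, handle_id):
        # Returns True if the handle was open, False if it was already gone.
        return self._handles.pop(handle_id, None) is not None


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "notes.txt").write_text("hello", encoding="utf-8")
    registry = ResourceRegistry(tmp)
    handle = registry.open_resource("notes.txt")
    print(registry.read_resource(handle))  # hello
    registry.release_resource(handle)
    try:
        registry.read_resource(handle)
    except KeyError:
        print("read after release failed")
```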

Tech Stack

  • Python 3.11+

  • mcp (official Model Context Protocol Python SDK, FastMCP high-level server API, stdio transport)

  • FastAPI + Uvicorn

  • Pydantic v2 (schemas / structured output)

  • SQLite (stdlib sqlite3, no external DB server)

  • httpx (REST calls)

  • OpenAI SDK / Anthropic SDK (optional, pluggable)

  • pytest

  • Docker

Architecture

                     ┌───────────────────────────────┐
  HTTP POST          │        FastAPI (api/main.py)   │
  /agent/run  ─────▶  │   POST /agent/run              │
                     └───────────────┬────────────────┘
                                     │ run_agent_task(task)
                                     ▼
                     ┌───────────────────────────────┐
                     │   Agent / MCP client            │
                     │   (agent/client.py)             │
                     │                                 │
                     │  1. spawn MCP server (stdio)     │
                     │  2. tools/list -> dynamic         │
                     │     tool discovery                │
                     │  3. loop:                        │
                     │       LLMClient.plan_step(...)    │
                     │       -> call_tool | final_answer │
                     │  4. validate AgentAnswer (Pydantic)│
                     └───────┬───────────────┬─────────┘
                             │               │
                    MCP stdio│               │pluggable LLM
                    protocol │               │(agent/llm.py)
                             ▼               ▼
              ┌─────────────────────┐   OpenAI / Anthropic /
              │  MCP server          │   deterministic MockLLM
              │  (mcp_server/server.py, FastMCP)
              │                       │
              │  sql_query  ─────────┼──▶ data/sample.db (SQLite)
              │  rest_exchange_rate ─┼──▶ mock_rest_server.py (FastAPI, localhost)
              │  run_python ─────────┼──▶ sandbox.py (subprocess, allowlist, timeout)
              │  list/open/read/     │
              │  release_resource ───┼──▶ data/resources/*.txt|*.md
              └─────────────────────┘

  • MCP server (mcp_server/server.py): a single FastMCP instance registering the four tool groups below plus one MCP resource template (resource://mcp-agent-toolkit/{filename}), served over stdio. The agent spawns it as a subprocess (python -m mcp_server.server) per run, so each HTTP request gets an isolated server process.

  • The four tool groups:

    1. sql_query opens a fresh read-only sqlite3 connection (PRAGMA query_only = ON) per call against data/sample.db, rejects anything that isn't a single SELECT, and implements context negotiation: it accepts a requested page_size/offset, clamps page_size to [1, 25] server-side regardless of what was asked for, and always reports back effective_page_size, total_rows_matched, has_more, and next_offset so a caller can page through a large result set safely.

    2. rest_exchange_rate makes a real HTTP call (via httpx) to a REST API for currency exchange rates.

    3. run_python statically validates a Python snippet with ast (import allowlist + denylisted names such as open/eval/exec/dunder-attribute access), then runs it in a brand-new subprocess (never exec()-ed in-process) with a wall-clock timeout and, on POSIX, CPU-time/address-space rlimits.

    4. list_resources / open_resource / read_resource / release_resource model explicit MCP resource lifecycle: list available files, open a named handle (handle_id), read through it any number of times, then release it — after release, further reads on that handle fail. The same files are also exposed as standard MCP resources via resource://mcp-agent-toolkit/{filename} for resources/list/resources/read.

  • MCP client/agent (agent/client.py): connects over stdio with mcp.ClientSession, calls session.list_tools() to discover tools dynamically (the tool list is never hardcoded in the agent), then loops: ask the LLMClient for a PlanStep (call a tool, or produce a final answer) given the task, the discovered tools, and the history of calls/results so far; execute the chosen tool via session.call_tool(...); repeat (bounded by MAX_AGENT_STEPS). The final answer dict from the LLM is validated into an AgentAnswer Pydantic model, with a ToolCallRecord per tool call made.

  • Pluggable LLM (agent/llm.py): build_llm_client() reads LLM_PROVIDER and returns an OpenAILLMClient, AnthropicLLMClient, or the default MockLLMClient. All three implement the same plan_step(task, available_tools, history) -> PlanStep interface. The real providers are prompted to return one JSON object matching the call_tool / final_answer shape described in SYSTEM_INSTRUCTIONS. The MockLLMClient is a small deterministic rule-based planner: it recognizes the reference task shape ("customer X's orders... exceeds $N"), issues one sql_query call, then computes the sum/threshold comparison itself and returns a final_answer — enough to drive the full agent loop, including the FastAPI layer and the entire pytest suite, with zero network access and no API key.

  • FastAPI layer (api/main.py): POST /agent/run takes {"task": "..."}, calls run_agent_task(task) (which owns the full spawn-discover-loop-validate lifecycle above), and returns {"result": AgentAnswer, "available_tools": [...]}.
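
The bounded plan-act loop described above can be sketched without the MCP SDK or an LLM. Everything below is a simplified stand-in: the tools dict replaces live discovery via session.list_tools(), the planner is a plain function instead of an LLMClient, the PlanStep fields are guesses at a reasonable shape, and both the MAX_AGENT_STEPS value and the "max_steps_exceeded" status string are assumptions made for the example.

```python
from dataclasses import dataclass, field

MAX_AGENT_STEPS = 6  # illustrative value; the real bound lives in the agent code


@dataclass
class PlanStep:
    action: str  # "call_tool" or "final_answer"
    tool_name: str = ""
    arguments: dict = field(default_factory=dict)
    answer: dict = field(default_factory=dict)


def run_loop(task, tools, planner):
    """Ask the planner for steps until it returns a final answer or the bound is hit."""
    history = []
    for step_no in range(1, MAX_AGENT_STEPS + 1):
        step = planner(task, sorted(tools), history)
        if step.action == "final_answer":
            return {"task": task, "status": "completed", **step.answer, "tool_calls": history}
        if step.tool_name not in tools:
            result = {"error": f"unknown tool: {step.tool_name}"}
        else:
            result = tools[step.tool_name](**step.arguments)
        # Record every call so the final answer carries a full trace.
        history.append({"step": step_no, "tool_name": step.tool_name,
                        "arguments": step.arguments, "raw_result": result})
    return {"task": task, "status": "max_steps_exceeded", "tool_calls": history}


def stub_planner(task, available_tools, history):
    """Deterministic planner: one tool call, then a final answer built from its result."""
    if not history:
        return PlanStep("call_tool", "sum_totals", {"totals": [10.0, 20.5]})
    total = history[-1]["raw_result"]["total"]
    return PlanStep("final_answer", answer={"answer": f"Total is {total:.2f}"})


tools = {"sum_totals": lambda totals: {"total": sum(totals)}}
out = run_loop("sum the orders", tools, stub_planner)
print(out["answer"], len(out["tool_calls"]))  # Total is 30.50 1
```

Keeping the planner behind a single callable is what makes the LLM pluggable: a rule-based mock and a hosted model can both be dropped in as long as they return the same step shape.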

REST tool: real API vs. mock (read this)

The rest_exchange_rate tool calls a bundled local mock REST server (mcp_server/mock_rest_server.py, a small FastAPI app), not an external public API. This was a deliberate choice per the project brief: the goal is a demo (and a pytest suite) that is fully self-contained and deterministic and does not depend on outbound internet access being available in whatever environment this repo is cloned, tested, or CI-run in. The mock server is exercised over real HTTP (via httpx, over 127.0.0.1) exactly as a call to a real public API would be — the tool code has no special-casing for "mock vs. real"; pointing MOCK_REST_BASE_URL at a real exchange-rate API with a compatible response shape would work with no code changes to the tool itself.
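
The "no special-casing" point comes down to resolving the base URL from configuration and building the same request either way. The sketch below shows only that step. MOCK_REST_BASE_URL and its default come from this README, but the /rates path and the base/quote query parameter names are hypothetical placeholders, not the mock server's actual route.

```python
import os
from urllib.parse import urlencode

DEFAULT_BASE_URL = "http://127.0.0.1:8811"  # default listed under Environment Variables


def exchange_rate_url(base, quote, env=None):
    """Build the request URL from MOCK_REST_BASE_URL (or the default)."""
    env = os.environ if env is None else env
    root = env.get("MOCK_REST_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    # "/rates", "base" and "quote" are hypothetical names used for illustration only.
    query = urlencode({"base": base.upper(), "quote": quote.upper()})
    return f"{root}/rates?{query}"


print(exchange_rate_url("usd", "eur", env={}))
print(exchange_rate_url("usd", "eur", env={"MOCK_REST_BASE_URL": "https://rates.example.com/"}))
```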

Sandboxing approach and its real limits (read this)

The run_python tool is a portfolio-grade sandbox, explicitly not a hardened, production-grade isolation boundary. What it actually does:

  1. Static AST check (mcp_server/sandbox.py::check_code_safety): parses the snippet with Python's ast module and rejects it before execution if it imports anything outside a small allowlist (math, statistics, json, itertools, collections, datetime, re, random, textwrap), or references denylisted names (open, exec, eval, compile, __import__, getattr/setattr/delattr, globals/locals/vars, etc.), or accesses any dunder attribute (a common sandbox-escape vector, e.g. ().__class__).

  2. Process isolation: the snippet is written to a temp file and run via subprocess.run([sys.executable, "-I", "-S", script]) — a brand-new OS process, never exec()-ed in the server's own process.

  3. Resource limits: a wall-clock timeout (subprocess.run(timeout=...), default 5s, max 10s) always applies. On POSIX, a preexec_fn additionally applies resource.setrlimit for CPU time and address space. On Windows there is no OS-level CPU/memory rlimit — only the wall-clock timeout and the static allowlist apply; this is a real, documented gap, not an oversight.
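
Step 1 can be approximated in a few lines with the ast module. This is a simplified sketch, not the project's check_code_safety: the function name find_violations and the message strings are invented here, and the denylist contains only the names spelled out above (the real list is longer, per the "etc.").

```python
import ast

ALLOWED_IMPORTS = {"math", "statistics", "json", "itertools", "collections",
                   "datetime", "re", "random", "textwrap"}
DENIED_NAMES = {"open", "exec", "eval", "compile", "__import__", "getattr",
                "setattr", "delattr", "globals", "locals", "vars"}


def find_violations(code):
    """Return the reasons a snippet is rejected (an empty list means accepted)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    problems = []
    for node in ast.walk(tree):
        # Collect imported module names from both import forms.
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]  # relative imports have no module name
        else:
            modules = []
        for mod in modules:
            if mod.split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import not allowed: {mod}")
        if isinstance(node, ast.Name) and node.id in DENIED_NAMES:
            problems.append(f"name not allowed: {node.id}")
        if (isinstance(node, ast.Attribute)
                and node.attr.startswith("__") and node.attr.endswith("__")):
            problems.append(f"dunder attribute not allowed: {node.attr}")
    return problems


print(find_violations("import math\nprint(math.sqrt(16))"))  # []
print(find_violations("import os"))                           # ['import not allowed: os']
print(find_violations("().__class__"))
```

A check like this only inspects syntax, which is why it is paired with process isolation and resource limits rather than trusted on its own.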

What this does not protect against: a sufficiently creative snippet may find static-analysis bypasses that this simple allowlist/denylist does not catch (this is a known-hard problem for a dynamic language like Python); there is no OS-level container/seccomp/VM boundary (no gVisor/Firecracker/Docker-level isolation for the interpreter itself — the Docker image in this repo containerizes the whole app, not each snippet individually); and on Windows, memory/CPU exhaustion inside the timeout window is not prevented. For genuinely untrusted/adversarial code, use a real container- or VM-based sandbox instead of this module.
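
For reference, the process-isolation and resource-limit steps can be sketched as follows. The interpreter flags and the 5s default / 10s maximum timeout come from this README; the function names, the returned dict shape, and the specific rlimit values (5 CPU seconds, 1 GiB of address space) are assumptions for illustration. The sketch silently skips a limit the platform refuses to set, which is a simplification.

```python
import os
import subprocess
import sys
import tempfile


def _limit_resources():
    """POSIX only: cap CPU time and address space in the child before it runs."""
    import resource
    for limit, value in ((resource.RLIMIT_CPU, 5), (resource.RLIMIT_AS, 1024 ** 3)):
        try:
            resource.setrlimit(limit, (value, value))
        except (ValueError, OSError):
            pass  # some platforms reject certain limits; the timeout still applies


def run_snippet(code, timeout=5.0):
    """Write the snippet to a temp file and run it in a fresh interpreter process."""
    timeout = min(float(timeout), 10.0)  # wall-clock cap
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "snippet.py")
        with open(script, "w", encoding="utf-8") as fh:
            fh.write(code)
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-S", script],  # isolated mode, no site import
                capture_output=True,
                text=True,
                timeout=timeout,
                preexec_fn=_limit_resources if os.name == "posix" else None,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "error": "timeout"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


print(run_snippet("print(2 + 3)")["stdout"].strip())          # 5
print(run_snippet("while True:\n    pass", timeout=0.5))      # {'ok': False, 'error': 'timeout'}
```

None of this changes the limits listed above: the child still runs with the same user, filesystem, and network access as the server process.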

Setup / Installation

# 1. Create and activate a virtual environment (recommended)
python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
# source venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Seed the sample SQLite database (creates data/sample.db)
python scripts/seed_db.py

# 4. Copy the env template (optional -- defaults work with no keys set)
# Windows:
copy .env.example .env
# macOS/Linux:
# cp .env.example .env

Usage

Run the FastAPI server

uvicorn api.main:app --reload

(Optional) Run the bundled mock REST server, for the REST tool

uvicorn mcp_server.mock_rest_server:app --port 8811

If this is not running, rest_exchange_rate calls will fail with a connection error — everything else (SQL tool, sandbox tool, resource tools, the reference agent task) works without it.

Example request/response

POST /agent/run
Content-Type: application/json

{
  "task": "Look up customer Alice Nguyen's orders, sum their total, and tell me if it exceeds $500."
}

Response:

{
  "result": {
    "task": "Look up customer Alice Nguyen's orders, sum their total, and tell me if it exceeds $500.",
    "status": "completed",
    "answer": "Alice Nguyen's completed orders total $334.00, which does not exceed $500.00.",
    "key_facts": {
      "customer_name": "Alice Nguyen",
      "order_total": 334.0,
      "threshold": 500.0,
      "exceeds_threshold": false,
      "num_completed_orders": 3
    },
    "tool_calls": [
      {
        "step": 1,
        "tool_name": "sql_query",
        "arguments": {
          "sql": "SELECT c.name AS customer_name, o.order_id, o.quantity, p.unit_price, (o.quantity * p.unit_price) AS line_total, o.status FROM orders o JOIN customers c ON c.customer_id = o.customer_id JOIN products p ON p.product_id = o.product_id WHERE c.name = 'Alice Nguyen'",
          "page_size": 25
        },
        "result_summary": "{\"columns\": [...], \"rows\": [...], ...}",
        "raw_result": { "columns": ["customer_name", "order_id", "quantity", "unit_price", "line_total", "status"], "rows": [ { "customer_name": "Alice Nguyen", "order_id": 1, "quantity": 1, "unit_price": 59.5, "line_total": 59.5, "status": "completed" }, { "..." : "..." } ], "row_count_returned": 3, "total_rows_matched": 3, "offset": 0, "requested_page_size": 25, "effective_page_size": 25, "has_more": false, "next_offset": null }
      }
    ]
  },
  "available_tools": ["sql_query", "rest_exchange_rate", "run_python", "list_resources", "open_resource", "read_resource", "release_resource"]
}

This example runs with the default LLM_PROVIDER=mock and requires no API key.
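
The same request can be sent from Python using only the standard library. This is a small sketch that assumes the API is running locally on uvicorn's default port 8000 (as started by the command above); the helper names build_run_request and run_task are illustrative.

```python
import json
import urllib.request


def build_run_request(task, base_url="http://127.0.0.1:8000"):
    """Build the POST /agent/run request without sending it."""
    body = json.dumps({"task": task}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/agent/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_task(task, base_url="http://127.0.0.1:8000", timeout=60):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(build_run_request(task, base_url), timeout=timeout) as resp:
        return json.load(resp)


req = build_run_request("Look up customer Alice Nguyen's orders, sum their total, "
                        "and tell me if it exceeds $500.")
print(req.get_method(), req.full_url)  # POST http://127.0.0.1:8000/agent/run
```

With the server running, run_task(...)["result"]["answer"] would hold the agent's final answer and ["available_tools"] the discovered tool names.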

Running the seed script and tests

python scripts/seed_db.py
pytest -v

The test suite (18 tests) seeds the database automatically via a session fixture if it does not already exist, spawns the real MCP server as a subprocess for discovery/end-to-end tests, and starts the bundled mock REST server in-process for the REST tool test. No test requires a real API key or outbound network access.

Environment Variables

Variable             Required                         Default                    Description
LLM_PROVIDER         No                               mock                       One of mock, openai, anthropic. Selects the agent's planning LLM.
OPENAI_API_KEY       Only if LLM_PROVIDER=openai      (none)                     OpenAI API key.
OPENAI_MODEL         No                               gpt-4o-mini                OpenAI model name.
ANTHROPIC_API_KEY    Only if LLM_PROVIDER=anthropic   (none)                     Anthropic API key.
ANTHROPIC_MODEL      No                               claude-3-5-haiku-latest    Anthropic model name.
MOCK_REST_BASE_URL   No                               http://127.0.0.1:8811      Base URL the rest_exchange_rate tool calls.

See .env.example for a copyable template (placeholder values only — no real secrets are stored in this repo).

Project Layout

mcp_server/
  server.py             MCP server (FastMCP): tool + resource definitions
  sandbox.py            Sandboxed code interpreter (AST check + subprocess)
  mock_rest_server.py   Bundled mock REST API (FastAPI)
agent/
  client.py             MCP client / agent loop, tool discovery
  llm.py                Pluggable LLM client (openai/anthropic/mock)
  schemas.py            Pydantic structured-output schemas
api/
  main.py               FastAPI wrapper: POST /agent/run
scripts/
  seed_db.py             SQLite seed script
data/
  resources/             Sample files exposed via the resource lifecycle tools
  sample.db              Generated by seed_db.py (not committed, see .gitignore)
tests/                   pytest suite

License

MIT License, Copyright (c) 2026 Mahnoor Amjad. See LICENSE.
