Genkit MCP

Official

Overview Schema Related Servers Score Discussions

architecture.md•7.5 KiB

# Architecture ## Module structure The tool follows a clean dependency graph with no circular imports: ``` types.py ← Shared types (Status, PluginResult, Runtime Protocol) ↑ config.py ← TOML config loader, RuntimeConfig (satisfies Runtime) ↑ paths.py ← Path constants (PACKAGE_DIR, TOOL_DIR) ↑ plugins.py ← Plugin discovery, env-var checking ↑ display.py ← Rich tables, inline progress bars, Rust-style messages ↑ log_redact.py ← Structlog processor to truncate data URIs in logs ↑ ┌──────────────┬──────────────────┬──────────────────────┐ │ checker.py │ runner.py │ util_test_model.py │ │ (check-plugin)│ (--use-cli) │ (check-model default)│ │ │ legacy CLI runner│ ActionRunner Protocol │ └──────────────┴──────────────────┴──────────────────────┘ ↑ ↑ │ ┌────────────┬──────────┘ │ │ │ │ reflection.py test_cases.py │ (httpx client) (12 built-in) │ │ │ validators/ │ ├── __init__.py ← Protocol + @register decorator │ ├── helpers.py ← Response parsing utilities │ ├── json.py ← valid-json │ ├── media.py ← valid-media │ ├── reasoning.py ← reasoning │ ├── streaming.py ← stream-text-includes, stream-has-tool-request, stream-valid-json │ ├── text.py ← text-includes, text-starts-with, text-not-empty │ └── tool.py ← has-tool-request ↑ cli.py ← CLI entry point, argparse, dispatch ↑ __main__.py ← python -m conform support ``` ## Key design decisions ### Types in a separate module `Status`, `PluginResult`, and the `Runtime` Protocol live in `types.py` which has **zero internal dependencies**. This breaks what would otherwise be a circular import between `display.py` (needs `PluginResult` for type signatures) and `runner.py` (needs display functions for output). ### ActionRunner Protocol ```python class ActionRunner(Protocol): async def run_action( self, key: str, input_data: dict, *, stream: bool = False, ) -> tuple[dict, list[dict]]: ... async def close(self) -> None: ... ``` Three implementations: | Runner | When | How | |--------|------|-----| | `InProcessRunner` | Python runtime (default) | Imports entry point, calls `action.arun_raw()` via Genkit SDK | | `ReflectionRunner` | JS/Go/other runtimes | Subprocess + async HTTP to reflection server | | genkit CLI (legacy) | `--use-cli` flag | Delegates to `genkit dev:test-model` via `runner.py` | ### Runtime Protocol ```python @runtime_checkable class Runtime(Protocol): @property def name(self) -> str: ... @property def specs_dir(self) -> Path: ... @property def plugins_dir(self) -> Path: ... @property def entry_command(self) -> list[str]: ... ``` The `RuntimeConfig` dataclass in `config.py` satisfies this protocol. New runtimes can be added by: 1. Adding a `[tool.conform.runtimes.<name>]` section to `pyproject.toml` 2. Or creating a new class that satisfies the protocol ### Concurrency model (check-model) ```text ┌─────────────┐ │ cli.py │ asyncio.run() └──────┬──────┘ ▼ ┌─────────────┐ │ runner.py │ asyncio.Queue + N workers └──────┬──────┘ ├──► Worker 1: pulls from queue, runs test ├──► Worker 2: pulls from queue, runs test └──► Worker N: (bounded by config.concurrency) ``` The runner creates an `asyncio.Queue` of plugins to test and a pool of `N` worker coroutines. Each worker pulls a plugin from the queue and runs it as a subprocess using `asyncio.create_subprocess_exec`. The number of workers limits concurrency (default: 8, configurable via `-j` or TOML). ### Retry strategy Conformance tests hit external LLM APIs that are subject to transient errors (rate limits, timeouts, server errors). A single failure should not mark a plugin as broken if the cause is a momentary API hiccup. **Algorithm:** Exponential backoff with **full jitter** (AWS-recommended). ```text delay = random() × min(base × 2^k, 60) ``` where `k` is the zero-indexed attempt number, `base` is configurable (default 1.0s), and 60s is the hard cap. **Why full jitter over other strategies:** | Strategy | Formula | Trade-off | |---|---|---| | Full Jitter ✅ | `random() × min(base × 2^k, cap)` | Maximum spread; best for reducing contention | | Equal Jitter | `max/2 + random() × max/2` | Guarantees ≥50% backoff; less spread | | Decorrelated | `min(cap, random() × 3 × prev)` | Good for correlated failures; harder to reason about | | No Jitter | `min(base × 2^k, cap)` | Deterministic; risks thundering herd | Full jitter is optimal here because conformance tests run against shared API endpoints where contention is the primary concern. The uniform `[0, max]` distribution maximizes the spread of retry times across concurrent workers. **Serial fallback:** When tests run in parallel (`test-concurrency > 1`) and any fail, the failures are automatically re-run **serially** before being reported. This prevents concurrent retry storms from compounding rate-limit pressure. The user sees a note: ``` ⚠ 2 test(s) failed — re-running serially with retries to rule out flakes. ``` **Configuration:** `max-retries` (default 2) and `retry-base-delay` (default 1.0s) are set in `conform.toml` and overridable via CLI (`--max-retries`, `--retry-base-delay`). Set `--max-retries 0` to disable retries entirely. ### CLI flag architecture All shared flags (`--config`, `--runtime`, `--specs-dir`, `--plugins-dir`) are added directly to each subcommand parser via `_add_common_args()` rather than using argparse's `parents=` mechanism. This avoids a known argparse bug where parent-parser defaults silently overwrite values parsed by the main parser, and prevents `uv run` from intercepting flags intended for the `conform` command. Common arg extraction is handled by `_extract_common_args()` in `main()`, keeping the dispatch logic DRY. ### Progress bar width invariant The inline progress bar in `display.py` uses `round()` to convert pass/fail counts to block characters. Because `round()` can independently round both segments up, the sum `p + f` may exceed `bar_width`. The fix clamps each segment: ```python p = min(round(passed / total * bar_width), bar_width) f = min(round(failed / total * bar_width), bar_width - p) r = bar_width - p - f # always >= 0 ``` This guarantees `p + f + r == bar_width` for all inputs. A brute-force regression test in `display_test.py` verifies every combination of `passed`, `failed`, `total` across multiple bar widths. ### Validator registry Validators use a Protocol + `@register` decorator pattern: ```python @register('has-tool-request') def has_tool_request( response: dict, arg: str | None = None, chunks: list[dict] | None = None, ) -> None: ... # raise ValidationError on failure ``` 10 validators are ported 1:1 from the canonical JS source (`genkit-tools/cli/src/commands/dev-test-model.ts`).

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/firebase/genkit'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

architecture.md•7.5 KiB