de en es ja ko ru zh

Genkit MCP

Official

by firebase

Overview Schema Related Servers Score Discussions

Python

Hybrid

genkit
py
samples
web-endpoints-hello

roadmap.md•23.6 KiB

# Roadmap Planned improvements for the web-endpoints-hello sample. Items are roughly ordered by priority within each category. --- ## Migrate production modules into Genkit core The sample currently bundles ~20 production-readiness modules that every Genkit Python app would need. The long-term goal is to move the framework-agnostic ones into `genkit` core so that the sample shrinks to flows + schemas + config only. ### Module dependency graph ``` ┌──────────────────────────────────────────────────────────────┐ │ APPLICATION LAYER │ │ │ │ main.py ──────────┬──── config.py (Settings, CLI args) │ │ │ │ │ │ ├── asgi.py ├──── sentry_init.py │ │ │ (app │ │ │ │ factory) ├──── telemetry.py │ │ │ │ │ │ ├── server.py ├──── logging.py │ │ │ (granian, │ │ │ │ uvicorn, └──── grpc_server.py │ │ │ hypercorn) │ │ │ │ │ │ │ └── flows.py ─────────┼── schemas.py (Pydantic models) │ │ │ │ └───────────────────────────┼──────────────────────────────────┘ │ ┌───────────────────────────┼──────────────────────────────────┐ │ PRODUCTION MIDDLEWARE LAYER │ │ │ │ │ security.py ────────────┤ RequestIdMiddleware │ │ (headers, CORS, │ SecurityHeadersMiddleware │ │ body-size, │ MaxBodySizeMiddleware │ │ trusted-host) │ │ │ │ │ │ rate_limit.py ──────────┤ RateLimitMiddleware (ASGI) │ │ (token bucket) │ GrpcRateLimitInterceptor │ │ │ │ │ cache.py ───────────────┤ FlowCache (TTL + LRU) │ │ │ │ │ circuit_breaker.py ─────┤ CircuitBreaker │ │ │ │ │ connection.py ──────────┤ HTTP pool + keep-alive tuning │ │ │ │ │ resilience.py ──────────┤ Global cache + breaker singletons│ │ │ │ └───────────────────────────┼──────────────────────────────────┘ │ ┌───────────────────────────┼──────────────────────────────────┐ │ UTILITY LAYER (zero app deps) │ │ │ │ │ util/asgi.py ───────────┤ send_json_error, get_client_ip │ │ util/date.py ───────────┤ utc_now_str, format_utc │ │ util/hash.py ───────────┤ make_cache_key │ │ util/parse.py ──────────┤ parse_rate, split_comma_list │ │ │ │ └──────────────────────────────────────────────────────────────┘ │ ┌───────────────────────────┼──────────────────────────────────┐ │ GENKIT CORE (today) │ │ │ │ genkit.web.manager ─────┤ ServerManager, adapters, ports │ │ genkit.web.typing ──────┤ ASGI type aliases │ │ genkit.core.flows ──────┤ /__health, flow execution │ │ genkit.core.http_client ┤ Per-loop httpx client pool │ │ genkit.core.logging ────┤ structlog typed wrapper │ │ genkit.core.tracing ────┤ OpenTelemetry spans │ │ genkit.core.error ──────┤ GenkitError, status codes │ │ │ └──────────────────────────────────────────────────────────────┘ ``` ### Classification: what stays vs. what moves The table below classifies every sample module by where it should live long-term. "Core" means `genkit` package. "Plugin" means a separate `genkit-plugin-*` package. "Sample" means it stays here. | Module | Current | Target | Rationale | |--------|---------|--------|-----------| | `security.py` | Sample | **Core** | Every ASGI Genkit app needs request-ID, security headers, body-size limits. Generic, framework-agnostic. | | `rate_limit.py` | Sample | **Core** | Rate limiting is table-stakes for any public API. The ASGI middleware + gRPC interceptor pair is reusable. | | `cache.py` | Sample | **Core** | Flow-level response caching is Genkit-specific (keyed on flow name + input). Belongs next to `ai.flow()`. | | `circuit_breaker.py` | Sample | **Core** | LLM APIs fail; every Genkit app needs a breaker. Wrapping `ai.generate()` calls is Genkit-specific. | | `connection.py` | Sample | **Core** | HTTP pool tuning and `HttpOptions` for the Google GenAI SDK should be framework defaults, not boilerplate. | | `logging.py` | Sample | **Core** | Production (JSON) vs. dev (Rich) logging is a universal need. Core already has a structlog wrapper but lacks the prod/dev auto-switch. | | `telemetry.py` | Sample | **Plugin** | Platform-specific OTEL setup belongs in `genkit-plugin-google-cloud`, `genkit-plugin-aws`, etc. The generic OTLP export could be in core. | | `sentry_init.py` | Sample | **Plugin** | Error-tracker integration is optional. Ship as `genkit-plugin-sentry`. | | `server.py` | Sample | **Core** | Server helpers for granian/uvicorn/hypercorn duplicate what `genkit.web.manager` partially provides. Merge. | | `config.py` | Sample | Sample | App-specific settings (API keys, feature flags) stay in the app. Core could provide a base `GenkitSettings` class. | | `flows.py` | Sample | Sample | Application-specific LLM flows are always user code. | | `schemas.py` | Sample | Sample | Application-specific Pydantic schemas are always user code. | | `grpc_server.py` | Sample | **Core** | gRPC flow serving is generic: map `ai.flow()` to unary/streaming RPCs. Core should provide `serve_grpc()`. | | `asgi.py` | Sample | Sample | App factory wiring is app-specific, but becomes trivial once middleware and server are in core. | | `main.py` | Sample | Sample | CLI entry point is app-specific. | | `resilience.py` | Sample | **Core** | If cache + breaker move to core, the wiring singletons go with them. | | `util/asgi.py` | Sample | **Core** | Pure ASGI helpers (error responses, header extraction) are generic. Merge into `genkit.web`. | | `util/date.py` | Sample | Sample | Trivial; not Genkit-specific. | | `util/hash.py` | Sample | **Core** | Deterministic cache-key generation is tied to `FlowCache`. Moves with it. | | `util/parse.py` | Sample | **Core** | `parse_rate` is tied to rate-limiter config. Moves with it. | ### What the sample looks like after migration Once the above modules move to core/plugins, the sample reduces to: ``` src/ __init__.py __main__.py main.py <-- ~30 lines: parse args, ai.serve() config.py <-- app-specific settings flows.py <-- LLM flows (user code) schemas.py <-- Pydantic models (user code) frameworks/ <-- 3 one-file adapters (FastAPI, Litestar, Quart) ``` Everything else comes from `genkit` core or plugins: ```python from genkit.web.security import apply_security_middleware from genkit.web.rate_limit import RateLimitMiddleware from genkit.cache import FlowCache from genkit.resilience import CircuitBreaker ``` ### Existing open-source libraries (avoid duplicating) Before building into core, evaluate whether wrapping an existing library is better than reimplementing. The table below maps each module to established OSS alternatives. | Module | OSS library | PyPI | Notes | |--------|-------------|------|-------| | **Rate limiting** | [SlowAPI](https://slowapi.readthedocs.io/) | `slowapi` | FastAPI/Starlette decorator-based. Uses `limits` under the hood with Redis/memcached backends. Well-maintained. | | | [asgi-ratelimit](https://github.com/abersheeran/asgi-ratelimit) | `asgi-ratelimit` | Pure ASGI middleware with regex rules and Redis backend. More generic than SlowAPI. Last updated 2022. | | | [limits](https://limits.readthedocs.io/) | `limits` | Backend-agnostic rate limit strategies (fixed-window, sliding-window, token-bucket). SlowAPI uses this internally. | | **Circuit breaker** | [PyBreaker](https://github.com/danielfm/pybreaker) | `pybreaker` | Mature (v1.4, 2025). Configurable thresholds, listeners, Redis-backed state. Thread-safe. | | | [Tenacity](https://tenacity.readthedocs.io/) | `tenacity` | Retry library with exponential backoff, jitter, custom predicates. Complements (not replaces) a breaker. | | | [resilient-circuit](https://resilient-circuit.readthedocs.io/) | `resilient-circuit` | Newer (2025). Composable breaker + retry policies. PostgreSQL-backed distributed state. | | **Caching** | [aiocache](https://github.com/aio-libs/aiocache) | `aiocache` | aio-libs maintained. Memory, Redis, Memcached backends. TTL support. Serializers. | | | [cashews](https://github.com/krukas/cashews) | `cashews` | Decorator-based async cache. TTL strings ("2h5m"), Redis + disk backends. Active (2025). | | **Security headers** | [secure.py](https://secure.readthedocs.io/) | `secure` | Lightweight, multi-framework. HSTS, CSP, X-Frame, Referrer-Policy, Permissions-Policy. | | | [Secweb](https://github.com/tmotagam/Secweb) | `Secweb` | 16 OWASP-aligned security middlewares for Starlette/FastAPI. Active (Jan 2026). No external deps. | | **Request ID** | [asgi-correlation-id](https://github.com/snok/asgi-correlation-id) | `asgi-correlation-id` | Reads/generates X-Request-ID, injects into structlog context. 630+ stars, production-stable. | | **Error tracking** | [sentry-sdk](https://docs.sentry.io/platforms/python/) | `sentry-sdk` | Official SDK with built-in ASGI, FastAPI, gRPC integrations. Auto-discovers frameworks. | | **Logging** | [structlog](https://www.structlog.org/) | `structlog` | Already used. Provides JSON renderer, dev console, context vars. Core should ship a pre-configured setup. | | **HTTP resilience** | [httpx](https://www.python-httpx.org/) | `httpx` | Already used by Google GenAI SDK. Built-in connection pooling, timeouts, retries. | ### Recommended approach per module | Module | Recommendation | Status | |--------|---------------|--------| | `rate_limit.py` | Wrap **`limits`** (strategy layer) in a Genkit-specific ASGI middleware + gRPC interceptor. Supports in-memory + Redis out of the box. Drop custom `TokenBucket`. | **Done** — Migrated to `limits.FixedWindowRateLimiter` with `MemoryStorage`. Custom `TokenBucket` removed. | | `circuit_breaker.py` | Wrap **`pybreaker`**. It already supports listeners (for metrics), Redis state (for multi-instance), and configurable thresholds. Add a `genkit.resilience.circuit_breaker()` helper that returns a configured `CircuitBreaker`. | **Done** — Wrapped `pybreaker.CircuitBreaker` with async-aware adapter (pybreaker's `call()` is sync-only; `CircuitOpenState.before_call()` invokes it internally). Manual state check + `_handle_error`/`_handle_success` delegation. | | `cache.py` | Wrap **`aiocache`** or **`cashews`**. Provide a `FlowCache` adapter that handles Genkit-specific cache-key generation (flow name + Pydantic input hashing) on top of the pluggable backend. | **Done** — Wrapped `aiocache.SimpleMemoryCache` in `FlowCache` adapter. TTL managed by aiocache; LRU eviction deferred to Redis eviction policies for production (in-memory relies on TTL). | | `security.py` | Wrap **`secure.py`** for security headers (tiny, no deps). Keep custom `MaxBodySizeMiddleware` and `RequestIdMiddleware` (or adopt **`asgi-correlation-id`** for the latter). Bundle as `genkit.web.security`. | **Done** — Security headers generated by `secure.Secure()` with OWASP-aligned defaults. `MaxBodySizeMiddleware` and `RequestIdMiddleware` kept (small, tightly integrated with structlog). | | `sentry_init.py` | Thin wrapper around **`sentry-sdk`** auto-discovery. Ship as `genkit-plugin-sentry` with a `setup_sentry(dsn=..., genkit_instance=ai)` one-liner. | Pending — already using `sentry-sdk` directly; plugin extraction is a Genkit-core concern. | | `logging.py` | Extend `genkit.core.logging` with a `setup_logging(env="auto")` that auto-detects TTY vs production and configures **`structlog`** with JSON or Rich accordingly. | Pending — Genkit-core enhancement. | | `connection.py` | Merge into core's `genkit.core.http_client`. Add `HttpOptions` defaults and `HTTPX_*` env-var tuning as part of `Genkit.__init__()`. | Pending — Genkit-core enhancement. | | `server.py` | Merge into `genkit.web.manager`. Add Hypercorn adapter alongside existing Uvicorn + Granian adapters. | Pending — Genkit-core enhancement. | | `grpc_server.py` | Add `genkit.web.grpc` module. Auto-generate servicer from registered flows. Provide `ai.serve_grpc(port=50051)` alongside existing `ai.serve()`. | Pending — Genkit-core enhancement. | --- ## Build systems - [ ] **Bazel support** — Add `BUILD.bazel` files for hermetic, reproducible builds. Useful for monorepo integration and CI caching. Includes `py_binary`, `py_library`, `py_test` targets for the Python code, and `proto_library` / `grpc_py_library` for protobuf codegen. Would replace `scripts/generate_proto.sh` with a Bazel rule. - [ ] **Makefile** — Evaluate whether a `Makefile` is needed alongside `justfile`. Current assessment: **not needed**. The `justfile` already covers all workflows (dev, test, build, deploy, lint, audit, security). A Makefile would duplicate functionality. Reconsider only if consumers strongly prefer Make over just. ## gRPC - [ ] **Streaming TellJoke RPC** — The REST side has `/tell-joke/stream` (SSE) but the gRPC service only exposes `TellJoke` as a unary RPC. Add a `TellJokeStream` server-streaming RPC to the proto definition and implement it in `grpc_server.py`. - [ ] **gRPC-Web proxy** — Add an Envoy or grpc-web proxy configuration so browser clients can call gRPC endpoints directly. ## Security ### Completed All core security hardening is implemented and tested (92% branch coverage). The sample follows a **secure-by-default** philosophy — production settings are restrictive out of the box; debug mode relaxes them for local development. | Feature | Module | Notes | |---------|--------|-------| | OWASP security headers | `security.py` | Via `secure.py` library; CSP, X-Frame-Options, Referrer-Policy, Permissions-Policy, COOP | | Content-Security-Policy | `security.py` | Strict `default-src none` in production; relaxed for Swagger UI in debug mode | | CORS (same-origin default) | `security.py` | Empty allowlist = same-origin; wildcard only in debug mode | | CORS explicit header allowlist | `security.py` | `Content-Type`, `Authorization`, `X-Request-ID` (no wildcard) | | Trusted host validation | `security.py` | Warns in production if `TRUSTED_HOSTS` is not set | | Per-client-IP rate limiting | `rate_limit.py` | REST (ASGI middleware) + gRPC (interceptor); health endpoints exempt | | Request body size limit | `security.py` | REST (`MaxBodySizeMiddleware`) + gRPC (`grpc.max_receive_message_length`) | | Per-request timeout | `security.py` | `TimeoutMiddleware` returns 504 on expiry; configurable via settings/CLI | | Global exception handler | `security.py` | `ExceptionMiddleware` returns JSON 500; no tracebacks to clients | | Secret masking in logs | `log_config.py` | `structlog` processor redacts API keys, tokens, passwords, DSNs | | Request ID / correlation | `security.py` | `RequestIdMiddleware` generates or propagates `X-Request-ID`; bound to structlog context | | Server header suppression | `security.py` | Removes upstream `Server` header to prevent version fingerprinting | | Cache-Control: no-store | `security.py` | Prevents intermediaries/browsers from caching API responses | | HSTS (conditional on HTTPS) | `security.py` | Configurable `max-age`; only sent over HTTPS | | GZip response compression | `security.py` | Via Starlette `GZipMiddleware`; configurable minimum size | | HTTP access logging | `security.py` | `AccessLogMiddleware` logs method, path, status, duration | | Circuit breaker for LLM calls | `circuit_breaker.py` | Async-safe; wraps `pybreaker` with stampede protection | | Response cache (stampede-safe) | `cache.py` | TTL + LRU via `aiocache`; single-flight dedup prevents thundering herd | | gRPC logging interceptor | `grpc_server.py` | Logs method, duration, status for every RPC | | gRPC rate limiting interceptor | `rate_limit.py` | Token-bucket per client; returns `RESOURCE_EXHAUSTED` | | gRPC reflection gated | `grpc_server.py` | Only enabled in debug mode | | Swagger UI / OpenAPI gated | framework adapters | Only enabled in debug mode | | Readiness probe with checks | framework adapters | `/ready` verifies `GEMINI_API_KEY`; returns 503 if missing | | Sentry error tracking | `sentry_init.py` | Optional; activated via `SENTRY_DSN` env var | | Platform telemetry auto-detection | `app_init.py` | GCP, AWS, Azure, generic OTLP | | Distroless container | `Dockerfile` | Minimal attack surface; no shell, no package manager | | Dependency auditing | `justfile` | `pysentry-rs` (vulnerabilities), `liccheck` (licenses), `addlicense` (headers) | | Configurable settings + CLI | `config.py` | All security parameters (timeouts, body size, rate limit, CORS, HSTS, gzip) configurable via env vars and CLI flags | ### Pending | # | Feature | Priority | Complexity | Description | |---|---------|----------|------------|-------------| | 1 | **Redis-backed rate limiting** | Medium | Medium | Current in-memory token bucket is per-process. Add optional Redis backend via `RATE_LIMIT_REDIS_URL` for multi-instance deployments. The `limits` library already supports this. | | 2 | **mTLS for gRPC** | Medium | Medium | Mutual TLS on the gRPC server for service-to-service authentication in zero-trust environments. | | 3 | **API key authentication** | Medium | Low-Medium | Optional API key middleware for REST + gRPC interceptor, configurable via `API_KEY` env var. | | 4 | **Google Checks integration** | Low | High | Middleware integrating with [Google Checks](https://checks.google.com/) for AI Safety (input/output policy enforcement), Code Compliance (CI/CD privacy monitoring), and App Compliance (regulatory tracking). Implement as optional REST middleware + gRPC interceptor gated on Checks policy evaluation. | | 5 | **TensorFlow-based content filtering** | Low | High | Optional input/output filtering using TensorFlow models for content safety: [Jigsaw Perspective API](https://perspectiveapi.com/) (cloud toxicity scoring), TF Lite text classifier (offline), or custom `SavedModel`. ASGI middleware + gRPC interceptor with configurable `CONTENT_FILTER_THRESHOLD` (default: `0.8`). Install via optional `[safety]` extra. | ## Performance - [ ] **Redis-backed response cache** — The current flow cache is in-memory (per-process). Add an optional Redis backend via `CACHE_REDIS_URL` for shared caching across multi-instance deployments. If wrapping `aiocache` or `cashews`, this comes for free. - [ ] **Adaptive circuit breaker** — The current circuit breaker uses a fixed failure threshold. Add sliding-window failure rate tracking and adaptive thresholds based on error percentage rather than absolute count. `pybreaker` supports listeners for custom metrics. - [ ] **Response streaming cache** — Cache streamed responses by collecting chunks and storing the assembled result for subsequent identical requests. ## Observability - [ ] **Prometheus metrics endpoint** — Expose `/metrics` with request count, latency histograms, and rate-limit rejection counts. - [ ] **Structured audit logging** — Log all request metadata (client IP, method, path, status, duration) in a machine-parseable format suitable for SIEM ingestion. ## Testing - [ ] **Load testing with Locust** — Add a `locustfile.py` for performance benchmarking of REST and gRPC endpoints. - [ ] **Contract tests** — Add proto-based contract tests that verify the gRPC service matches the `.proto` definition at test time. ## Deployment - [ ] **Kubernetes manifests** — Add `k8s/` directory with Deployment, Service, HPA, and NetworkPolicy manifests. - [ ] **Terraform / Pulumi** — Infrastructure-as-code for Cloud Run, App Runner, or Container Apps deployment. - [x] **GitHub Actions CI** — `.github/workflows/` with lint, test, build, and deploy pipelines (6 cloud platforms + CI).

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/firebase/genkit'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

roadmap.md•23.6 KiB