Genkit MCP


security.md•15.3 KiB

# Security & Hardening

This sample follows a **secure-by-default** philosophy. Every configuration default is chosen so that a fresh deployment with zero configuration is locked down. Development convenience (Swagger UI, colored logs, open CORS, gRPC reflection) requires *explicit* opt-in.

!!! tip "Design principle"
    _"If someone forgets to configure this, should the system be open or closed?" Choose closed._

---

## Secure-by-default design

| Principle | How it's enforced |
|-----------|-------------------|
| Locked down on deploy | All defaults are restrictive; dev features require `--debug` or `DEBUG=true` |
| Debug is explicit | A single flag gates Swagger UI, gRPC reflection, relaxed CSP, open CORS |
| Defense in depth | Multiple independent layers; any single bypass still leaves the others active |
| Framework-agnostic | All middleware is pure ASGI (no FastAPI/Litestar/Quart dependency) |
| Fail closed | Missing config → deny; not "missing config → allow" |

---

## Debug mode

A single `debug` flag (via `--debug` CLI, `DEBUG=true` env var, or `Settings.debug`) controls all development-only features:

| Feature | `debug=false` (production default) | `debug=true` (development) |
|---------|------------------------------------|---------------------------|
| Swagger UI (`/docs`, `/redoc`) | Disabled (`docs_url=None`) | Enabled |
| OpenAPI schema (`/openapi.json`) | Disabled (`openapi_url=None`) | Enabled |
| gRPC reflection | Disabled | Enabled (for `grpcui` / `grpcurl`) |
| Content-Security-Policy | `default-src none` (strict) | Allows `cdn.jsdelivr.net`, `fastapi.tiangolo.com`, inline scripts |
| CORS (when unconfigured) | Same-origin only (`[]`) | Wildcard (`["*"]`) |
| Trusted hosts warning | Logs a warning at startup | Suppressed |
| Log format (when unconfigured) | `json` (structured) | `console` (colored) |

Activate debug mode:

```bash
# CLI flag (used by run.sh automatically)
python -m src --debug

# Environment variable
DEBUG=true python -m src

# In .local.env
DEBUG=true
```

!!! danger "Never use `--debug` in production"
    Debug mode disables critical security controls. The `run.sh` script passes `--debug` automatically for local development; production deployments (gunicorn, Cloud Run, Kubernetes) should **never** set it.

---

## Middleware stack

Security middleware is applied as pure ASGI wrappers. The order for an incoming request:

```
AccessLog → GZip → CORS → TrustedHost → Timeout → MaxBodySize
    → ExceptionHandler → SecurityHeaders → RequestId → App
```

Each layer is independent: disabling one doesn't affect the others. The response passes through the same layers in reverse.

### Security headers (OWASP)

`SecurityHeadersMiddleware` (in `src/security.py`) uses the [`secure`](https://secure.readthedocs.io/) library to inject OWASP-recommended headers on every HTTP response:

| Header | Value | Purpose |
|--------|-------|---------|
| `Content-Security-Policy` | `default-src none` | Block all resource loading (API-only server) |
| `X-Content-Type-Options` | `nosniff` | Prevent MIME-type sniffing |
| `X-Frame-Options` | `DENY` | Block clickjacking via iframes |
| `Referrer-Policy` | `strict-origin-when-cross-origin` | Limit referrer leakage |
| `Permissions-Policy` | `geolocation=(), camera=(), microphone=()` | Disable unnecessary browser APIs |
| `Cross-Origin-Opener-Policy` | `same-origin` | Isolate browsing context |
| `Strict-Transport-Security` | `max-age=31536000; includeSubDomains` | Force HTTPS (only added over HTTPS) |

!!! note "X-XSS-Protection omitted intentionally"
    The browser XSS auditor it controlled has been removed from all modern browsers, and setting it can *introduce* XSS in older browsers (OWASP recommendation since 2023). The `secure` library dropped it for this reason.

**Debug mode CSP** allows Swagger UI to function by permitting CDN resources from `cdn.jsdelivr.net`, the FastAPI favicon, and inline scripts.

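
The header table above can be illustrated with a minimal pure ASGI wrapper. This is only a sketch: it does not use the `secure` library, and the names `HeaderInjector`, `STRICT_HEADERS`, `demo_app`, and `collect_headers` are hypothetical rather than taken from `src/security.py`. It assumes the production (non-debug) header values and adds HSTS only when the request scheme is `https`.

```python
import asyncio

# Header values copied from the table above (production, non-debug values).
STRICT_HEADERS = [
    (b"content-security-policy", b"default-src none"),
    (b"x-content-type-options", b"nosniff"),
    (b"x-frame-options", b"DENY"),
    (b"referrer-policy", b"strict-origin-when-cross-origin"),
    (b"permissions-policy", b"geolocation=(), camera=(), microphone=()"),
    (b"cross-origin-opener-policy", b"same-origin"),
]
HSTS = (b"strict-transport-security", b"max-age=31536000; includeSubDomains")


class HeaderInjector:
    """Hypothetical pure ASGI middleware: wraps `send` to add headers."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass websocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return
        extra = list(STRICT_HEADERS)
        if scope.get("scheme") == "https":
            extra.append(HSTS)  # HSTS only makes sense over HTTPS

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                names = {name for name, _ in extra}
                # Drop any app-supplied duplicates, then append ours.
                kept = [(k, v) for k, v in message.get("headers", [])
                        if k.lower() not in names]
                message = {**message, "headers": kept + extra}
            await send(message)

        await self.app(scope, receive, send_with_headers)


async def demo_app(scope, receive, send):
    """Stand-in for the real application."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def collect_headers(scheme):
    """Run one fake request through the middleware; return response headers."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    scope = {"type": "http", "scheme": scheme, "method": "GET", "path": "/"}
    asyncio.run(HeaderInjector(demo_app)(scope, receive, send))
    return dict(sent[0]["headers"])
```

Because the wrapper only touches the `http.response.start` message, it needs no framework imports, which is the property the "framework-agnostic" principle relies on.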
### CORS

Starlette's `CORSMiddleware` is configured from `CORS_ALLOWED_ORIGINS`:

| Scenario | `CORS_ALLOWED_ORIGINS` | Effective behavior |
|----------|----------------------|-------------------|
| Production (default) | `""` (empty) | Same-origin only; all cross-origin requests denied |
| Production (explicit) | `"https://app.example.com"` | Only listed origins allowed |
| Development (debug, unconfigured) | `""` (empty) | Falls back to `*` (wildcard) |

Additional CORS settings (hardcoded for security):

- **Allowed methods**: `GET`, `POST`, `OPTIONS`
- **Allowed headers**: `Content-Type`, `Authorization`, `X-Request-ID`
- **Credentials**: `False` (cookies/auth headers not forwarded)

!!! warning "Why not `allow_headers=["*"]`?"
    Wildcard allowed headers let any custom header through CORS preflight, which can be exploited for cache poisoning or header injection. The explicit list only permits headers the API actually uses.

### Request ID / correlation

`RequestIdMiddleware` assigns a unique ID to every HTTP request:

1. If the client sends `X-Request-ID`, it is reused (for end-to-end tracing)
2. Otherwise, a UUID4 is generated
3. The ID is bound to `structlog` context vars, so every log line includes `request_id`
4. The ID is echoed in the `X-Request-ID` response header
5. The ID is stored in `scope["state"]["request_id"]` for framework access

### Body size limit

`MaxBodySizeMiddleware` checks `Content-Length` **before** the framework parses the body, preventing memory exhaustion:

- Default: 1 MB (1,048,576 bytes)
- Override: `MAX_BODY_SIZE=2097152` (2 MB)
- Response: `413 Payload Too Large` with JSON body

The gRPC server applies the same limit via `grpc.max_receive_message_length`.

### Trusted host validation

When `TRUSTED_HOSTS` is set, Starlette's `TrustedHostMiddleware` rejects requests with spoofed `Host` headers (returns 400).

```bash
TRUSTED_HOSTS=api.example.com,admin.example.com
```

If `TRUSTED_HOSTS` is empty in production (non-debug) mode, a **warning** is logged at startup:

> No TRUSTED_HOSTS configured — Host-header validation is disabled.
> Set TRUSTED_HOSTS to your domain(s) in production to prevent
> host-header poisoning attacks.

---

## Rate limiting

Token-bucket rate limiting is applied per client IP at both protocol layers using the same algorithm:

| Protocol | Component | Over-limit response | Headers |
|----------|-----------|-------------------|---------|
| REST | `RateLimitMiddleware` | `429 Too Many Requests` | `Retry-After` |
| gRPC | `GrpcRateLimitInterceptor` | `RESOURCE_EXHAUSTED` | None |

Configuration:

```bash
RATE_LIMIT_DEFAULT=60/minute    # Default
RATE_LIMIT_DEFAULT=100/second   # High-traffic API
RATE_LIMIT_DEFAULT=10/minute    # Restrictive
```

Health endpoints (`/health`, `/healthz`, `/ready`, `/readyz`) are exempt from rate limiting so orchestration platforms can always probe.

---

## Input validation

All input models in `src/schemas.py` use Pydantic `Field` constraints to reject malformed input before it reaches any Genkit flow or LLM call:

| Constraint | Example | Purpose |
|-----------|---------|---------|
| `max_length` | Name ≤ 200, text ≤ 10,000, code ≤ 50,000 | Prevent oversized strings |
| `min_length` | text ≥ 1 (no empty strings) | Reject empty inputs |
| `ge` / `le` | 0 ≤ skill ≤ 100 | Numeric range validation |
| `pattern` | `^[a-zA-Z#+]+$` for language | Prevent injection in freeform fields |

Invalid input is rejected with a `422 Unprocessable Entity` response that carries Pydantic's detailed validation errors, so no custom error handling is needed.

Additional sanitization in `src/flows.py`:

- `text.strip()[:2000]`: normalize and truncate freeform text before passing it to the LLM

---

## Resilience

### Circuit breaker

`CircuitBreaker` (in `src/circuit_breaker.py`) protects against cascading failures when the LLM API is degraded.
After consecutive failures, it fails fast without making API calls, then probes with a single request before closing the circuit again.

| Setting | Env Var | Default | Description |
|---------|---------|---------|-------------|
| Enabled | `CB_ENABLED` | `true` | Enable/disable |
| Failure threshold | `CB_FAILURE_THRESHOLD` | `5` | Consecutive failures to trip |
| Recovery timeout | `CB_RECOVERY_TIMEOUT` | `30.0` | Seconds before half-open probe |

States: **Closed** (normal) → **Open** (fail fast) → **Half-open** (probe); a successful probe returns the breaker to **Closed**. Uses `time.monotonic()` so timing is unaffected by wall-clock (NTP) adjustments, and `asyncio.Lock` to guard state across concurrent tasks (it is not a thread-level lock).

### Response cache (stampede protection)

`FlowCache` (in `src/cache.py`) provides in-memory TTL + LRU caching for idempotent flows with **per-key request coalescing** to prevent cache stampedes (thundering herd):

| Setting | Env Var | Default | Description |
|---------|---------|---------|-------------|
| Enabled | `CACHE_ENABLED` | `true` | Enable/disable |
| TTL | `CACHE_TTL` | `300` | Time-to-live in seconds |
| Max entries | `CACHE_MAX_SIZE` | `1024` | LRU eviction after this count |

- Uses SHA-256 hashed cache keys (via `src/util/hash.py`)
- Per-key `asyncio.Lock` prevents concurrent identical LLM calls
- Non-idempotent flows (chat, joke) and streaming flows bypass the cache

---

## Connection tuning

| Setting | Env Var | Default | Purpose |
|---------|---------|---------|---------|
| Server keep-alive | `KEEP_ALIVE_TIMEOUT` | `75s` | Above typical 60s LB idle timeout to prevent premature disconnects |
| LLM API timeout | `LLM_TIMEOUT` | `120000ms` | 2-minute hard timeout for LLM calls |
| Connection pool max | `HTTPX_POOL_MAX` | `100` | Max concurrent outbound connections |
| Pool keepalive | `HTTPX_POOL_MAX_KEEPALIVE` | `20` | Max idle connections kept alive |

Configured in `src/connection.py` via `configure_httpx_defaults()`.

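
The circuit-breaker state machine described under Resilience can be sketched with the standard library alone. This is an illustrative approximation, not the contents of `src/circuit_breaker.py`: the names `SketchBreaker`, `CircuitOpenError`, and `demo` are hypothetical, and the injectable `clock` parameter is an assumption added so the example can be exercised without waiting for the recovery timeout.

```python
import asyncio
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without running it."""


class SketchBreaker:
    """Hypothetical breaker: closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=5, recovery_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock          # monotonic: immune to wall-clock jumps
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed
        self.probing = False        # True while a half-open probe is in flight
        self.lock = asyncio.Lock()  # guards state across concurrent tasks

    async def call(self, fn):
        async with self.lock:
            if self.opened_at is not None:
                waited = self.clock() - self.opened_at
                if waited < self.recovery_timeout or self.probing:
                    raise CircuitOpenError("failing fast")
                self.probing = True  # half-open: allow exactly one probe
        try:
            result = await fn()
        except Exception:
            async with self.lock:
                self.failures += 1
                if self.probing or self.failures >= self.failure_threshold:
                    self.opened_at = self.clock()  # trip or re-open
                self.probing = False
            raise
        async with self.lock:
            self.failures = 0
            self.opened_at = None    # success closes the circuit
            self.probing = False
        return result


async def demo():
    now = [0.0]  # fake clock so the example does not need to sleep
    breaker = SketchBreaker(failure_threshold=2, recovery_timeout=30.0,
                            clock=lambda: now[0])

    async def failing():
        raise ValueError("LLM API down")

    async def healthy():
        return "ok"

    outcomes = []
    for fn in (failing, failing, healthy):  # third call hits an open circuit
        try:
            outcomes.append(await breaker.call(fn))
        except CircuitOpenError:
            outcomes.append("rejected")
        except ValueError:
            outcomes.append("failed")
    now[0] = 31.0                                 # past the recovery timeout
    outcomes.append(await breaker.call(healthy))  # half-open probe succeeds
    return outcomes
```

The lock is held only while state is read or updated, never while the wrapped call runs, so a slow LLM request does not block other tasks from being rejected quickly.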

---

## Graceful shutdown

SIGTERM is handled with a configurable grace period:

- **Default**: 10 seconds (matches Cloud Run's SIGTERM window)
- **Override**: `SHUTDOWN_GRACE=30` (seconds)
- **gRPC**: `server.stop(grace=shutdown_grace)` drains in-flight RPCs
- **ASGI**: Server-native shutdown (granian/uvicorn/hypercorn)

---

## gRPC security

| Feature | Configuration | Default |
|---------|---------------|---------|
| Max message size | `grpc.max_receive_message_length` | 1 MB (matches REST) |
| Rate limiting | `GrpcRateLimitInterceptor` | `60/minute` per peer |
| Logging | `GrpcLoggingInterceptor` | Logs method, duration, status |
| Reflection | Debug-only | Disabled in production |

!!! warning "gRPC reflection disabled in production"
    Reflection exposes the full API schema (service names, method signatures, message types) to unauthenticated clients. It is only enabled when `debug=true`.

---

## Structured logging

| Mode | `LOG_FORMAT` | Output |
|------|-------------|--------|
| Production (default) | `json` | Machine-parseable, no ANSI codes, suitable for log aggregation |
| Development | `console` | Colored, human-friendly with Rich tracebacks |

All log entries include `request_id` from `RequestIdMiddleware` for request-level correlation.

Set `LOG_FORMAT=console` in your `.local.env` for development.

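
To illustrate what "JSON output with a `request_id` on every entry" means in practice, here is a small sketch that uses only the standard library's `logging` and `contextvars` as a stand-in for `structlog`. The names `request_id_var`, `JsonFormatter`, `make_logger`, and `demo` are hypothetical, and the field names in the payload are assumptions, not the sample's actual log schema.

```python
import contextvars
import io
import json
import logging

# Hypothetical stand-in for the context var bound by RequestIdMiddleware.
request_id_var = contextvars.ContextVar("request_id", default=None)


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line, without ANSI codes."""

    def format(self, record):
        payload = {
            "level": record.levelname.lower(),
            "event": record.getMessage(),
            "request_id": request_id_var.get(),  # picked up from the context
        }
        return json.dumps(payload)


def make_logger(stream):
    """Build an isolated logger that writes JSON lines to `stream`."""
    logger = logging.getLogger("sketch.json")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def demo():
    stream = io.StringIO()
    logger = make_logger(stream)
    token = request_id_var.set("req-123")  # what a middleware would do per request
    try:
        logger.info("flow started")
    finally:
        request_id_var.reset(token)        # avoid leaking the ID to later requests
    return json.loads(stream.getvalue())
```

Binding the ID in a context variable, rather than passing it to every log call, is what lets unrelated code paths emit correlated entries without knowing about the request.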

---

## Error tracking (Sentry)

Optional integration, only active when `SENTRY_DSN` is set:

```bash
SENTRY_DSN=https://examplePublicKey@o0.ingest.sentry.io/0
SENTRY_TRACES_SAMPLE_RATE=0.1   # 10% of transactions
SENTRY_ENVIRONMENT=production
```

- Auto-detects active framework (FastAPI, Litestar, Quart) + gRPC
- PII stripped by default (`send_default_pii=False`)
- Install: `uv sync --extra sentry` or `pip install "sentry-sdk[fastapi,litestar,quart,grpc]"`

---

## Platform telemetry auto-detection

`src/app_init.py` automatically detects the cloud platform at startup and enables the matching telemetry plugin (if installed):

| Platform | Detection signal | Plugin (optional dep) |
|----------|-----------------|----------------------|
| GCP (Cloud Run) | `K_SERVICE` | `genkit-plugin-google-cloud` (`[gcp]` extra) |
| GCP (GCE/GKE) | `GCE_METADATA_HOST` | `genkit-plugin-google-cloud` (`[gcp]` extra) |
| AWS (ECS/App Runner) | `AWS_EXECUTION_ENV` | `genkit-plugin-amazon-bedrock` (`[aws]` extra) |
| Azure (Container Apps) | `CONTAINER_APP_NAME` | `genkit-plugin-microsoft-foundry` (`[azure]` extra) |
| Generic OTLP | `OTEL_EXPORTER_OTLP_ENDPOINT` | `genkit-plugin-observability` (`[observability]` extra) |

!!! note "GOOGLE_CLOUD_PROJECT alone doesn't trigger GCP telemetry"
    It's commonly set on dev machines for the gcloud CLI. To force GCP telemetry locally, also set `GENKIT_TELEMETRY_GCP=1`.

Disable all telemetry: `GENKIT_TELEMETRY_DISABLED=1` or `--no-telemetry`.

---

## Dependency auditing

```bash
just audit      # pip-audit: checks against PyPA advisory database
just security   # pysentry-rs + pip-audit + liccheck (all checks)
just licenses   # License compliance against allowlist
just lint       # Includes all of the above plus linters and type checkers
```

**License allowlist**: Apache-2.0, MIT, BSD-3-Clause, BSD-2-Clause, PSF-2.0, ISC, Python-2.0, MPL-2.0.

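
The detection table in the Platform telemetry auto-detection section can be read as a simple lookup over environment variables. The sketch below is one way to express it; `detect_telemetry_extra` and `PLATFORM_SIGNALS` are hypothetical names, and two behaviors are assumptions rather than documented facts: signals are checked in table order, and any non-empty value counts as "set". It does not reproduce `src/app_init.py` or check whether a plugin is installed.

```python
import os

# Detection signals copied from the table above, checked in table order
# (the precedence between signals is an assumption of this sketch).
PLATFORM_SIGNALS = [
    ("K_SERVICE", "gcp"),
    ("GCE_METADATA_HOST", "gcp"),
    ("AWS_EXECUTION_ENV", "aws"),
    ("CONTAINER_APP_NAME", "azure"),
    ("OTEL_EXPORTER_OTLP_ENDPOINT", "observability"),
]


def detect_telemetry_extra(env=None):
    """Return the name of the optional extra to enable, or None."""
    env = os.environ if env is None else env
    if env.get("GENKIT_TELEMETRY_DISABLED"):
        return None  # explicit opt-out wins over every signal
    for variable, extra in PLATFORM_SIGNALS:
        if env.get(variable):
            return extra
    # GOOGLE_CLOUD_PROJECT alone is not a signal; it needs the explicit opt-in.
    if env.get("GOOGLE_CLOUD_PROJECT") and env.get("GENKIT_TELEMETRY_GCP"):
        return "gcp"
    return None
```

Passing a plain dictionary instead of `os.environ` makes the rule easy to check, for example `detect_telemetry_extra({"GOOGLE_CLOUD_PROJECT": "demo"})` returns `None` because the opt-in flag is missing.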

---

## Container security

The `Containerfile` produces a hardened image using `gcr.io/distroless/python3-debian13:nonroot`:

| Property | Value |
|----------|-------|
| Shell | None (no shell to `exec` into) |
| Package manager | None (no `apt install` attack vector) |
| User | uid 65534 (`nonroot`) |
| Base size | ~50 MB (vs ~150 MB for `python:3.13-slim`) |
| `setuid` binaries | None |

---

## Health check endpoints

| Endpoint | Purpose | Rate limited |
|----------|---------|-------------|
| `GET /health` | Liveness: process is running | No |
| `GET /ready` | Readiness: app can serve traffic | No |

Both return `{"status": "ok"}` with minimal overhead.

---

## Production hardening checklist

| Item | How | Secure default |
|------|-----|----------------|
| Debug mode | `DEBUG=false` | Off: Swagger, reflection, relaxed CSP disabled |
| TLS termination | Load balancer / reverse proxy | Not included (use Cloud Run, nginx, etc.) |
| Trusted hosts | `TRUSTED_HOSTS=api.example.com` | Disabled (warns at startup) |
| CORS | `CORS_ALLOWED_ORIGINS=https://app.example.com` | Same-origin only |
| Rate limiting | `RATE_LIMIT_DEFAULT=100/minute` | `60/minute` |
| Body size limit | `MAX_BODY_SIZE=524288` | 1 MB |
| Log format | `LOG_FORMAT=json` | JSON (structured) |
| Secrets management | Cloud secrets manager (not `.env`) | `.env` files (dev only) |
| Error tracking | `SENTRY_DSN=...` | Disabled |
| Container image | `Containerfile` with distroless + nonroot | Included |
| Dependency audit | `just security` in CI | Manual |
| License compliance | `just licenses` in CI | Manual |

---

## Security environment variables

| Variable | Description | Secure default |
|----------|-------------|----------------|
| `DEBUG` | Enable dev-only features (Swagger, reflection, relaxed CSP) | `false` |
| `CORS_ALLOWED_ORIGINS` | Comma-separated allowed CORS origins | `""` (same-origin) |
| `TRUSTED_HOSTS` | Comma-separated allowed Host headers | `""` (disabled, warns) |
| `RATE_LIMIT_DEFAULT` | Rate limit in `<count>/<period>` format | `60/minute` |
| `MAX_BODY_SIZE` | Max request body in bytes | `1048576` (1 MB) |
| `LOG_FORMAT` | `json` (production) or `console` (dev) | `json` |
| `SHUTDOWN_GRACE` | Graceful shutdown grace period in seconds | `10.0` |
| `SENTRY_DSN` | Sentry Data Source Name | `""` (disabled) |
| `SENTRY_TRACES_SAMPLE_RATE` | Fraction of transactions to sample | `0.1` |
| `SENTRY_ENVIRONMENT` | Sentry environment tag | (auto from `--env`) |
| `GENKIT_TELEMETRY_DISABLED` | Disable all platform telemetry | `""` (enabled) |
| `GENKIT_TELEMETRY_GCP` | Force GCP telemetry with `GOOGLE_CLOUD_PROJECT` | `""` (disabled) |
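
As a rough illustration of how the defaults in the table above resolve when nothing is configured, the following sketch loads a handful of these variables into a frozen dataclass. It is not the sample's `Settings` class: `SketchSettings`, `load_settings`, `parse_rate`, and `split_csv` are hypothetical names, the `hour` period is an assumption (the document only shows `second` and `minute`), and treating only the literal `true` as enabling debug is this sketch's reading of "fail closed".

```python
import os
from dataclasses import dataclass

# "second" and "minute" appear in this document; "hour" is an assumption.
PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_rate(value):
    """Parse '<count>/<period>' into (count, seconds); raise on bad input."""
    count, _, period = value.partition("/")
    if not count.isdigit() or int(count) < 1 or period not in PERIOD_SECONDS:
        raise ValueError(f"invalid rate limit: {value!r}")
    return int(count), PERIOD_SECONDS[period]


def split_csv(value):
    """Split a comma-separated value, dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass(frozen=True)
class SketchSettings:
    debug: bool
    cors_allowed_origins: list
    trusted_hosts: list
    rate_limit: tuple
    max_body_size: int
    log_format: str


def load_settings(env=None):
    env = os.environ if env is None else env
    # Fail closed: anything other than the literal "true" leaves debug off.
    debug = env.get("DEBUG", "false").strip().lower() == "true"
    origins = split_csv(env.get("CORS_ALLOWED_ORIGINS", ""))
    if not origins and debug:
        origins = ["*"]  # wildcard only as the unconfigured debug fallback
    return SketchSettings(
        debug=debug,
        cors_allowed_origins=origins,
        trusted_hosts=split_csv(env.get("TRUSTED_HOSTS", "")),
        rate_limit=parse_rate(env.get("RATE_LIMIT_DEFAULT", "60/minute")),
        max_body_size=int(env.get("MAX_BODY_SIZE", "1048576")),
        log_format=env.get("LOG_FORMAT") or ("console" if debug else "json"),
    )
```

With an empty environment, `load_settings({})` yields debug off, no allowed cross-origin origins, JSON logs, and the `60/minute` limit, which matches the "Secure default" column; a malformed `RATE_LIMIT_DEFAULT` raises instead of silently disabling the limit.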
