Genkit MCP

Official

by firebase

api.md•13.9 KiB

# API Design

Genkit is a framework for building AI-powered applications using generative models. It provides a streamlined way to work with AI models, tools, prompts, embeddings, and other AI-related functionality. The API is structured to make it easy to:

* Define prompts that can be reused across your application.
* Create tools that AI models can call.
* Work with different AI models through a consistent interface.
* Build complex AI workflows through "flows".
* Store and retrieve data through embeddings and vector search.

## Design principles

Genkit is designed with several principles in mind:

* **Async-first**: Most communication among users and future interactive agents is naturally asynchronous.
* **Type Safety**: Uses build-time and runtime type information for strong typing.
* **Modularity**: Components can be mixed and matched.
* **Extensibility**: Plugin system allows adding new features.
* **Developer Experience**: Development tools like the Reflection Server help debug applications.

## Veneer

The veneer refers to the user-facing API and excludes the internals of the library.

### The `Genkit` Class

The `Genkit` class is the central part of the framework that:

* Manages a registry of AI-related components (models, tools, flows, etc.).
* Provides an API for working with AI models and flows.
* Handles configuration and initialization.
* Sets up development tools like the reflection server.

#### Key features

| Feature | Description |
|---|---|
| **Registry Management** | Maintains a registry to keep track of all components in a Genkit instance. |
| **Plugin System** | Supports loading plugins to extend functionality. |
| **Prompt Management** | Allows defining and using prompts both programmatically and from files. |
| **Model Integration** | Provides methods to work with generative AI models. |

#### Core Functionality

`Genkit` defines methods for the following:

| Category | Function | Description |
|---|---|---|
| **Text Generation** | `generate()` | Generates text using AI models |
| | `generate_stream()` | Streaming version for real-time results |
| **Embedding** | `embed()` | Creates vector embeddings of content |
| | `embed_many()` | Batch embedding generation |
| **Retrieval & Search** | `retrieve()` | Fetches documents based on queries |
| | `index()` | Indexes documents for fast retrieval |
| | `rerank()` | Re-orders retrieved documents by relevance |
| **Tools & Functions** | `define_tool()` | Creates tools that models can use |
| | `define_flow()` | Creates workflows that combine multiple steps |
| **Evaluation** | `evaluate()` | Evaluates AI model outputs |

#### Helper Functions

The veneer Genkit module may also include:

* `genkit()`: A factory function to create new Genkit instances
* `shutdown()`: Handles clean shutdown of Genkit servers
* Event handlers for process termination signals

## Endpoints

### Telemetry Server

| Endpoint | HTTP Method | Purpose | Request Body | Response | Content Type |
|---|---|---|---|---|---|
| `/api/__health` | GET | Health check | - | "OK" (200) | `text/plain` |
| `/api/traces/:traceId` | GET | Retrieve a specific trace | - | Trace data JSON | `application/json` |
| `/api/traces` | POST | Save a new trace | `TraceData` object | "OK" (200) | `text/plain` |
| `/api/traces` | GET | List traces | Query params: `limit`, `continuationToken` | List of traces with continuation token | `application/json` |

### Flow Server

| Endpoint | HTTP Method | Purpose | Request Body | Response | Content Type |
|---|---|---|---|---|---|
| `/<pathPrefix><flowName>` | POST | Execute a flow | `{ data: <input> }` | `{ result: <output> }` (200) or error (4xx/5xx) | `application/json` |
| `/<pathPrefix><flowName>?stream=true` | POST | Execute a flow with streaming | `{ data: <input> }` | `data: {"message": <chunk>}` (stream) and `data: {"result": <result>}` (final) | `text/plain` (chunked) |

### Reflection Server

TODO: Ideally, these should behave the same, but we're making a note of differences here for now.

=== "TypeScript"

    | Endpoint | HTTP Method | Purpose | Request Body | Response | Content Type |
    |---|---|---|---|---|---|
    | `/api/__health` | GET | Health check | - | "OK" (200) | `text/plain` |
    | `/api/__quitquitquit` | GET | Terminate server | - | "OK" (200) and server stops | `text/plain` |
    | `/api/actions` | GET | List registered actions | - | Action metadata with schemas | `application/json` |
    | `/api/runAction` | POST | Run an action | `{ key, input, context, telemetryLabels }` | `{ result, telemetry: { traceId } }` | `application/json` |
    | `/api/runAction?stream=true` | POST | Run action with streaming | `{ key, input, context, telemetryLabels }` | Stream of chunks and final result | `text/plain` (chunked) |
    | `/api/envs` | GET | Get configured environments | - | List of environment names | `application/json` |
    | `/api/notify` | POST | Notify of telemetry server | `{ telemetryServerUrl, reflectionApiSpecVersion }` | "OK" (200) | `text/plain` |

=== "Go"

    | Endpoint | HTTP Method | Purpose | Request Body | Response | Content Type |
    |---|---|---|---|---|---|
    | `/api/__health` | GET | Health check | - | 200 OK status | - |
    | `/api/actions` | GET | List registered actions | - | Action metadata with schemas | `application/json` |
    | `/api/runAction` | POST | Run an action | `{ key, input, context }` | `{ result, telemetry: { traceId } }` | `application/json` |
    | `/api/notify` | POST | Notify of telemetry server | `{ telemetryServerUrl, reflectionApiSpecVersion }` | OK response | `application/json` |

=== "Python"

    | Endpoint | HTTP Method | Purpose | Request Body | Response | Content Type |
    |---|---|---|---|---|---|
    | `/api/__health` | GET | Health check | - | 200 OK status | - |
    | `/api/actions` | GET | List registered actions | - | Action metadata with schemas | `application/json` |
    | `/api/runAction` | POST | Run an action | Action input | Action output with traceId | `application/json` |

## Common Patterns

* **Health check endpoints** (`/api/__health`): All servers implement a simple health check endpoint.
* **Action/flow execution**: All servers provide endpoints to execute actions/flows.
* **Streaming support**: JavaScript-based servers support streaming responses.
* **Telemetry integration**: All execution endpoints include telemetry data (trace IDs) in responses.
* **Error handling**: Standardized error formats with status codes and stack traces.
* **Content negotiation**: Different response formats based on accept headers or query parameters.

# Async-First Design

Genkit is a library that allows application developers to create AI flows for their applications using an API that abstracts over various components such as indexers, retrievers, models, and embedders.
The API is **async-first** because the single-threaded, event-loop model of concurrency is the direction that Python frameworks are taking, and Genkit naturally lives in an async world. Genkit is mostly I/O-bound rather than compute-bound, since its primary purpose is composing AI foundational components and setting up typed communication patterns between them.

### Class Hierarchy

The implementation uses a three-level class hierarchy:

```ascii
+---------------------+
| GenkitRegistry      |  (in _registry.py)
|---------------------|
| + flow()            |  Decorator to register flows
| + tool()            |  Decorator to register tools
| + define_model()    |  Register model actions
| + define_embedder() |  Register embedder actions
| + registry (prop)   |
+--------^------------+
         |
+--------|-----------+
| GenkitBase         |  (in _base_async.py)
|--------------------|
| + __init__(        |
|     plugins,       |
|     model,         |
|     reflection_    |
|     server_spec)   |
+--------^-----------+
         |
+--------|-----------+
| Genkit             |  (in _aio.py)
|--------------------|
| + generate()       |  async: text generation
| + generate_stream()|  streaming generation
| + embed()          |  async: create embeddings
| + retrieve()       |  async: fetch documents
| + rerank()         |  async: reorder documents
| + evaluate()       |  async: evaluate outputs
| + chat()           |  session-based chat
+--------------------+
```

```mermaid
classDiagram
    class GenkitRegistry {
        <<_registry.py>>
        +flow(name, description) Callable
        +tool(name, description) Callable
        +define_model(config, fn) Action
        +define_embedder(config, fn) Action
        +registry() Registry
    }
    class GenkitBase {
        <<_base_async.py>>
        +__init__(plugins, model, reflection_server_spec)
    }
    class Genkit {
        <<_aio.py>>
        +generate(model, prompt, system, ...) GenerateResponseWrapper
        +generate_stream(model, prompt, ...) tuple
        +embed(embedder, content) EmbedResponse
        +retrieve(retriever, query) list
        +rerank(reranker, query, documents) list
        +evaluate(evaluator, dataset) EvalResponse
    }
    GenkitBase --|> GenkitRegistry : inherits
    Genkit --|> GenkitBase : inherits
```

All methods on the `Genkit` class are `async`. Synchronously-defined flows and tools are executed using a thread-pool executor internally.

### Usage

```python
from genkit.ai import Genkit
from genkit.plugins.google_genai import GoogleAI

# Create a Genkit instance with the Google AI plugin and a default model.
ai = Genkit(
    plugins=[GoogleAI()],
    model='googleai/gemini-2.0-flash',
)


# Register an async flow; generate() is awaited because the API is async-first.
@ai.flow()
async def my_flow(query: str) -> str:
    response = await ai.generate(prompt=f"Answer this: {query}")
    return response.text
```

## Implementation

The `Genkit` class starts a reflection server when the `GENKIT_ENV` environment variable has been set to `'dev'`. Running the following command:

```bash
genkit start -- uv run sample.py
```

sets `GENKIT_ENV='dev'` within a running instance of `sample.py`. `genkit start` exposes a developer UI (usually called dev UI for short) that is used for debugging and that talks to a reflection API server implemented by the `Genkit` class instance. The reflection API server lets dev UI users debug their custom flows, test features such as models and plugins, and observe traces emitted by these components.

### Concurrency handling

The implementation avoids using threads for server infrastructure since asyncio is primarily a single-threaded design. The reflection server runs as a coroutine on the same event loop.

#### Scenarios

- For simple short-lived applications without dev mode, the program exits normally after completing all flows.
- For simple short-lived applications with dev mode (`GENKIT_ENV=dev`), the reflection server starts and prevents the main thread from exiting to enable debugging.
- For long-lived servers, the reflection server attaches to the server manager alongside any application servers written by the end user.
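
The first two scenarios above can be illustrated with plain `asyncio`. This is a minimal sketch under stated assumptions, not Genkit's implementation: `handle_health`, `my_flow`, `main`, and the `keep_alive` switch are hypothetical names introduced here, and the tiny handler only stands in for a reflection server's `/api/__health` endpoint. It shows a server coroutine sharing one event loop with a flow, and dev mode deciding whether the process stays alive after the flow completes. The long-lived server manager case is not covered.

```python
import asyncio
import os

# Fixed response used by the stand-in health endpoint (assumption: plain "OK").
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 2\r\n"
    b"Connection: close\r\n\r\n"
    b"OK"
)


async def handle_health(
    reader: asyncio.StreamReader, writer: asyncio.StreamWriter
) -> None:
    """Answer any request with 200 OK; the request line is read and ignored."""
    await reader.readline()
    writer.write(RESPONSE)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def my_flow(query: str) -> str:
    """Hypothetical flow that simulates an I/O-bound model call."""
    await asyncio.sleep(0.01)
    return f"answer to {query}"


async def main(dev_mode: bool, keep_alive: bool = True) -> str:
    server = None
    if dev_mode:
        # The server is scheduled on the same event loop as the flow, no threads.
        # Port 0 asks the OS for any free port.
        server = await asyncio.start_server(handle_health, "127.0.0.1", 0)
    try:
        result = await my_flow("ping")
        if server is not None and keep_alive:
            # Dev mode: keep the process alive so a debugger UI could connect.
            await server.serve_forever()
        return result
    finally:
        if server is not None:
            server.close()
            await server.wait_closed()


if __name__ == "__main__":
    dev = os.environ.get("GENKIT_ENV") == "dev"
    print(asyncio.run(main(dev_mode=dev)))
```

Without `GENKIT_ENV=dev` the script runs the flow, prints its result, and exits. With it set, the script keeps serving until interrupted, which mirrors how a dev-mode process is kept alive for debugging.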
