
gemini

Analyze codebases, design UI prototypes, and interpret images using Google's Gemini model for development tasks within CLI Agent MCP.

Instructions

Invoke Google Gemini CLI agent for UI design and comprehensive analysis.

CAPABILITIES:

  • Strongest UI design and image understanding abilities

  • Excellent at rapid UI prototyping and visual tasks

  • Great at inferring original requirements from code clues

  • Best for full-text analysis and detective work

LIMITATIONS:

  • Not good at summarization (outputs can be verbose)

  • May need full_output=true for research tasks

BEST PRACTICES:

  • Use for: UI mockups, image analysis, requirement discovery

  • Enable full_output when doing research or analysis

  • Good first choice for "understand this codebase" tasks

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| prompt | Yes | Detailed task instruction for the agent. Include specific file paths, function names, or error messages when available. Be explicit about scope and constraints to avoid over-engineering. Example: 'Fix the TypeError in utils.py:42, only modify that function' | |
| workspace | Yes | Absolute path to the project directory. Use the path mentioned in conversation, or the current project root. Relative paths are also supported (resolved against the server CWD). Example: '/Users/dev/my-project' or './src' | |
| permission | No | File system permission level: 'read-only' can only read files (safe for analysis tasks); 'workspace-write' can modify files within the workspace only (recommended for most tasks); 'unlimited' (DANGER) grants full system access, use only when explicitly needed | read-only |
| model | No | Model override. Specify only if the user explicitly requests a specific model. | |
| save_file | No | Save the agent's output (without debug info) to a file at the specified path, sparing the orchestrator from writing files separately. Example: '/path/to/output.md'. NOTE: this is intentionally exempt from permission restrictions; it is a convenience for persisting analysis results, not a general file-write capability. The CLI agent's actual file operations are still governed by the 'permission' parameter. | |
| save_file_with_prompt | No | When true and save_file is set, injects a note into the prompt asking the model to verbalize its analysis and insights; the model's detailed reasoning is then automatically saved to the file. Useful for generating comprehensive analysis reports. | |
| full_output | No | Return detailed output including reasoning and tool calls. Recommended for Gemini research/analysis tasks. The default produces concise output. | false |
| session_id | No | Session ID to continue a previous conversation. Reuse the ID from prior tool calls to maintain context; leave empty for new conversations. | |
| task_note | No | Display label for the GUI, e.g., '[Review] PR #123' | |
| debug | No | Override the global debug setting for this call. When true, the response includes execution stats (model, duration, tokens); when omitted, the global CAM_DEBUG setting is used. | |
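A typical invocation might pass arguments like the following. This is a hypothetical call, not taken from the project's test suite; the paths and labels are placeholders, but the field names match the schema above.

```python
# Hypothetical arguments for a 'gemini' tool call, matching the schema above.
# The workspace path and output path are illustrative placeholders.
arguments = {
    "prompt": "Summarize the architecture of this project and list its entry points",
    "workspace": "/Users/dev/my-project",
    "permission": "read-only",    # the default; safe for analysis-only tasks
    "full_output": True,          # recommended for research/analysis tasks
    "save_file": "/Users/dev/my-project/analysis.md",
    "task_note": "[Analyze] project architecture",
}

# Only 'prompt' and 'workspace' are required; everything else is optional.
required = {"prompt", "workspace"}
assert required.issubset(arguments)
```

Everything beyond the two required fields can be omitted, in which case the defaults from the table apply.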

Implementation Reference

  • MCP tool registration for 'gemini' (and the other CLI tools) in the server's list_tools() handler. Each tool is added only if the configuration allows it.

```python
@server.list_tools()
async def list_tools() -> list[Tool]:
    """List the available tools."""
    tools = []
    for cli_type in ["codex", "gemini", "claude", "opencode"]:
        if config.is_tool_allowed(cli_type):
            tools.append(
                Tool(
                    name=cli_type,
                    description=TOOL_DESCRIPTIONS[cli_type],
                    inputSchema=create_tool_schema(cli_type),
                )
            )
    # DEBUG: log the tool-list request (usually the first call after client init)
    logger.debug(
        f"[MCP] list_tools called, returning {len(tools)} tools: "
        f"{[t.name for t in tools]}"
    )
    return tools
```
  • Core handler implementation: the GeminiInvoker class builds the CLI command for the 'gemini' tool and processes its events; it is invoked from the MCP server's call_tool handler.

```python
class GeminiInvoker(CLIInvoker):
    """Gemini CLI invoker.

    Encapsulates the Gemini CLI invocation logic, including:
    - command-line argument construction
    - mapping of Permission to the --sandbox flag
    - the most minimal parameter set (no Gemini-specific parameters)

    Example:
        invoker = GeminiInvoker()
        result = await invoker.execute(GeminiParams(
            prompt="Analyze this project",
            workspace=Path("/path/to/repo"),
        ))
    """

    def __init__(
        self,
        gemini_path: str = "gemini",
        event_callback: EventCallback | None = None,
        parser: Any | None = None,
    ) -> None:
        """Initialize the Gemini invoker.

        Args:
            gemini_path: path to the gemini executable, defaults to "gemini"
            event_callback: event callback function
            parser: custom parser
        """
        super().__init__(event_callback=event_callback, parser=parser)
        self._gemini_path = gemini_path

    @property
    def cli_type(self) -> CLIType:
        return CLIType.GEMINI

    def build_command(self, params: CommonParams) -> list[str]:
        """Build the Gemini CLI command.

        Args:
            params: invocation parameters

        Returns:
            list of command-line arguments
        """
        cmd = [self._gemini_path]

        # Hard-coded: streaming JSON output (real-time JSONL)
        cmd.extend(["-o", "stream-json"])

        # Working directory
        cmd.extend(["--include-directories", str(params.workspace.absolute())])

        # Permission mapping.
        # Gemini's sandbox is an on/off switch, unlike Codex's discrete levels:
        # read-only and workspace-write both enable the sandbox;
        # unlimited disables it.
        if params.permission != Permission.UNLIMITED:
            cmd.append("--sandbox")

        # Allowed tool list (specified explicitly for consistency)
        cmd.extend(["--allowed-tools", ",".join(GEMINI_ALLOWED_TOOLS)])

        # Optional: model
        if params.model:
            cmd.extend(["--model", params.model])

        # Session resumption
        if params.session_id:
            cmd.extend(["--resume", params.session_id])

        # Prompt as a positional argument (gemini 0.20+ deprecated the -p flag)
        cmd.append(params.prompt)

        return cmd

    @property
    def uses_stdin_prompt(self) -> bool:
        """Gemini passes the prompt as a positional argument, not via stdin."""
        return False

    def _process_event(self, event: Any, params: CommonParams) -> None:
        """Handle Gemini-specific events.

        Gemini's session_id arrives in the init event.
        """
        super()._process_event(event, params)
        if not self._session_id:
            raw = event.raw
            if raw.get("type") == "init":
                session_id = raw.get("session_id", "")
                if session_id:
                    self._session_id = session_id
```
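For a default read-only call, the flag mapping above produces a command line like the one below. This is a standalone sketch that mirrors the logic of build_command without the surrounding classes; the allowed-tools values are illustrative, since the contents of GEMINI_ALLOWED_TOOLS are not shown in the source.

```python
# Standalone sketch of the flag mapping in GeminiInvoker.build_command.
# The allowed_tools default is illustrative, not the project's real list.
def build_gemini_command(prompt, workspace, permission="read-only",
                         model=None, session_id=None,
                         allowed_tools=("read_file", "grep")):
    cmd = ["gemini", "-o", "stream-json", "--include-directories", workspace]
    if permission != "unlimited":
        # read-only and workspace-write both enable the sandbox
        cmd.append("--sandbox")
    cmd.extend(["--allowed-tools", ",".join(allowed_tools)])
    if model:
        cmd.extend(["--model", model])
    if session_id:
        cmd.extend(["--resume", session_id])
    cmd.append(prompt)  # prompt is positional, not passed via -p or stdin
    return cmd

cmd = build_gemini_command("Analyze this project", "/path/to/repo")
assert "--sandbox" in cmd and cmd[-1] == "Analyze this project"
```

Note how 'unlimited' is the only permission level that omits --sandbox, which is why the schema flags it as dangerous.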
  • Tool description/schema metadata for the 'gemini' tool, passed to the Tool() constructor during registration.

```python
"gemini": """Invoke Google Gemini CLI agent for UI design and comprehensive analysis.

CAPABILITIES:
- Strongest UI design and image understanding abilities
- Excellent at rapid UI prototyping and visual tasks
- Great at inferring original requirements from code clues
- Best for full-text analysis and detective work

LIMITATIONS:
- Not good at summarization (outputs can be verbose)
- May need full_output=true for research tasks

BEST PRACTICES:
- Use for: UI mockups, image analysis, requirement discovery
- Enable full_output when doing research or analysis
- Good first choice for "understand this codebase" tasks""",
```
  • Dynamic JSON Schema generation for the 'gemini' tool's input parameters (it uses only the common properties, since there are no gemini-specific ones). Called during registration.

```python
def create_tool_schema(cli_type: str) -> dict[str, Any]:
    """Create the JSON Schema for a tool.

    Parameter order:
    1. prompt, workspace (required)
    2. permission, model, save_file, full_output (common)
    3. CLI-specific parameters (image / system_prompt / append_system_prompt / file / agent)
    4. session_id, task_note, debug (tail)
    """
    # Build the properties in order
    properties: dict[str, Any] = {}

    # 1. Common parameters (required + frequently used)
    properties.update(COMMON_PROPERTIES)

    # 2. CLI-specific parameters
    if cli_type == "codex":
        properties.update(CODEX_PROPERTIES)
    elif cli_type == "claude":
        properties.update(CLAUDE_PROPERTIES)
    elif cli_type == "opencode":
        properties.update(OPENCODE_PROPERTIES)

    # 3. Tail parameters
    properties.update(TAIL_PROPERTIES)

    return {
        "type": "object",
        "properties": properties,
        "required": ["prompt", "workspace"],
    }
```
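Because 'gemini' hits none of the CLI-specific branches, its generated schema is simply the common properties followed by the tail properties. A sketch with placeholder property sets (the real COMMON_PROPERTIES and TAIL_PROPERTIES dicts are not shown in the source):

```python
# Placeholder property sets standing in for COMMON_PROPERTIES / TAIL_PROPERTIES.
COMMON = {"prompt": {"type": "string"}, "workspace": {"type": "string"}}
TAIL = {"session_id": {"type": "string"}, "debug": {"type": "boolean"}}

def schema_for_gemini():
    # No gemini-specific branch fires, so the merge is just common + tail.
    props = {**COMMON, **TAIL}
    return {"type": "object", "properties": props, "required": ["prompt", "workspace"]}

schema = schema_for_gemini()
assert schema["required"] == ["prompt", "workspace"]
```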
  • GeminiParser class for parsing output events from gemini CLI into unified event format, used internally by the invoker.
```python
class GeminiParser:
    """Gemini CLI event parser.

    Maintains parsing state and supports ID correlation across streamed events.

    Example:
        parser = GeminiParser()
        for line in stream:
            event = parser.parse(json.loads(line))
            if event:
                gui.push_event(event)
    """

    def __init__(self) -> None:
        self.session_id: str | None = None
        self.model: str | None = None
        self._tool_names: dict[str, str] = {}  # tool_id -> tool_name

    def parse(self, data: dict[str, Any]) -> UnifiedEvent:
        """Parse a single Gemini event.

        Args:
            data: raw event dict

        Returns:
            a unified event instance
        """
        event_type = data.get("type", "")
        timestamp = _parse_timestamp(data.get("timestamp"))
        base_kwargs = {
            "source": CLISource.GEMINI,
            "timestamp": timestamp,
            "raw": data,
        }

        # Dispatch to the specific parse method
        if event_type == "init":
            return self._parse_init(data, base_kwargs)
        elif event_type == "message":
            return self._parse_message(data, base_kwargs)
        elif event_type == "tool_use":
            return self._parse_tool_use(data, base_kwargs)
        elif event_type == "tool_result":
            return self._parse_tool_result(data, base_kwargs)
        elif event_type == "error":
            return self._parse_error(data, base_kwargs)
        elif event_type == "result":
            return self._parse_result(data, base_kwargs)
        else:
            # Fallback: unrecognized event type
            return make_fallback_event(CLISource.GEMINI, data)

    def _parse_init(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> LifecycleEvent:
        """Parse an init event."""
        self.session_id = data.get("session_id")
        self.model = data.get("model")
        return LifecycleEvent(
            event_id=make_event_id("gemini", "init"),
            lifecycle_type="session_start",
            session_id=self.session_id,
            model=self.model,
            status=Status.SUCCESS,
            **base,
        )

    def _parse_message(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> MessageEvent:
        """Parse a message event."""
        role = data.get("role", "assistant")
        content = data.get("content", "")
        is_delta = data.get("delta", False)
        return MessageEvent(
            event_id=make_event_id("gemini", f"msg_{role}"),
            content_type=ContentType.TEXT,
            role=role if role in ("user", "assistant") else "assistant",
            text=content,
            is_delta=is_delta,
            session_id=self.session_id,
            **base,
        )

    def _parse_tool_use(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> OperationEvent:
        """Parse a tool_use event."""
        tool_name = data.get("tool_name", "unknown")
        tool_id = data.get("tool_id", "")
        parameters = data.get("parameters", {})

        # Cache the tool_id -> tool_name mapping
        if tool_id:
            self._tool_names[tool_id] = tool_name

        # Serialize the parameters to a string
        try:
            input_str = json.dumps(parameters, ensure_ascii=False, indent=2)
        except (TypeError, ValueError):
            input_str = str(parameters)

        return OperationEvent(
            event_id=make_event_id("gemini", f"tool_{tool_name}"),
            operation_type=OperationType.TOOL,
            name=tool_name,
            operation_id=tool_id,
            input=input_str,
            status=Status.RUNNING,
            session_id=self.session_id,  # propagate the session_id
            metadata={"parameters": parameters},
            **base,
        )

    def _parse_tool_result(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> OperationEvent:
        """Parse a tool_result event."""
        tool_id = data.get("tool_id", "")
        status_str = data.get("status", "success")
        output = data.get("output")
        error = data.get("error")

        # Look up the tool name from the cache
        tool_name = self._tool_names.get(tool_id, "unknown")

        # Determine the status
        if status_str == "error" or error:
            status = Status.FAILED
            error_msg = ""
            if isinstance(error, dict):
                error_msg = error.get("message", "")
            elif isinstance(error, str):
                error_msg = error
            output_str = error_msg or output or ""
        else:
            status = Status.SUCCESS
            output_str = (
                output if isinstance(output, str) else str(output) if output else ""
            )

        return OperationEvent(
            event_id=make_event_id("gemini", f"result_{tool_name}"),
            operation_type=OperationType.TOOL,
            name=tool_name,
            operation_id=tool_id,
            output=output_str,
            status=status,
            session_id=self.session_id,  # propagate the session_id
            **base,
        )

    def _parse_error(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> SystemEvent:
        """Parse an error event."""
        severity = data.get("severity", "error")
        message = data.get("message", "Unknown error")

        # Map the severity
        sev_map = {"warning": "warning", "error": "error"}
        unified_sev = sev_map.get(severity, "error")

        return SystemEvent(
            event_id=make_event_id("gemini", "error"),
            severity=unified_sev,
            message=message,
            session_id=self.session_id,  # propagate the session_id
            **base,
        )

    def _parse_result(
        self, data: dict[str, Any], base: dict[str, Any]
    ) -> LifecycleEvent:
        """Parse a result event (session end)."""
        status_str = data.get("status", "success")
        error = data.get("error")
        stats = data.get("stats", {})

        # Determine the status
        if status_str == "error" or error:
            status = Status.FAILED
        else:
            status = Status.SUCCESS

        # Build the stats block
        unified_stats = {}
        if stats:
            unified_stats = {
                "total_tokens": stats.get("total_tokens"),
                "input_tokens": stats.get("input_tokens"),
                "output_tokens": stats.get("output_tokens"),
                "duration_ms": stats.get("duration_ms"),
                "tool_calls": stats.get("tool_calls"),
            }

        return LifecycleEvent(
            event_id=make_event_id("gemini", "result"),
            lifecycle_type="session_end",
            session_id=self.session_id,
            model=self.model,
            status=status,
            stats=unified_stats,
            **base,
        )
```
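The stream the parser consumes is JSONL, one event object per line. A minimal sketch of pulling the session ID out of an init event, using the field names the parser reads above (the concrete values are illustrative):

```python
import json

# One line of Gemini's stream-json output (illustrative values).
line = '{"type": "init", "session_id": "abc-123", "model": "gemini-2.0"}'

event = json.loads(line)
if event.get("type") == "init":
    session_id = event.get("session_id", "")
    model = event.get("model")

assert session_id == "abc-123"
```

This is the same session_id that a caller can later pass back via the tool's session_id parameter to resume the conversation.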
