# gemini
Analyze codebases and design user interfaces using Google Gemini's visual understanding and detective capabilities for rapid prototyping and comprehensive analysis.
## Instructions
Run Google Gemini CLI agent (UI design / comprehensive analysis).
**NO SHARED MEMORY:**
- Cannot see messages/outputs from codex/claude/opencode.
- Only sees: (1) this prompt, (2) files in context_paths, (3) its own history via continuation_id.

**CROSS-AGENT HANDOFF:**
- Small data: paste into prompt.
- Large data: save_file -> context_paths -> prompt says "Read <file>".
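The large-data handoff above can be sketched as a pair of tool calls. This is an illustrative sketch only: the file path, prompts, and task notes are hypothetical values, not part of the tool's API.

```python
# Sketch of the large-data handoff pattern: agent 1 persists its output
# to a file, and the gemini call (which has no shared memory) is pointed
# at that file both in context_paths and in the prompt text.
# All values below are hypothetical examples.

# Call 1: a codex agent writes its analysis to disk via save_file.
call_1 = {
    "prompt": "Analyze the auth module and summarize findings.",
    "workspace": "/repo",
    "save_file": "/repo/.agent/auth_findings.md",  # output lands here
    "task_note": "[Read] auth module",
}

# Call 2: gemini is told explicitly where to find that output.
call_2 = {
    "prompt": "Read /repo/.agent/auth_findings.md and design a login UI.",
    "workspace": "/repo",
    "context_paths": ["/repo/.agent/auth_findings.md"],  # preload as context
    "task_note": "[Design] login UI",
}
```

The key invariant is that call 2's `context_paths` entry matches call 1's `save_file` path, and the prompt repeats it, since gemini cannot see the first agent's output any other way.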
**CAPABILITIES:**
- Strongest UI design and image understanding abilities
- Excellent at rapid UI prototyping and visual tasks
- Great at inferring original requirements from code clues
- Best for full-text analysis and detective work
**BEST PRACTICES:**
- Good first choice for "understand this codebase" tasks
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Detailed instructions for the agent. IMPORTANT: If 'continuation_id' is NOT set, you MUST include ALL context (background, file contents, errors, constraints), as the agent has no memory. If 'continuation_id' IS set, you may be brief and reference previous context. | |
| workspace | Yes | Project root directory. Boundary for 'workspace-write'. Use absolute paths or relative paths. | |
| continuation_id | No | Resume session WITHIN THIS TOOL ONLY. Use only the <continuation_id> returned by this same tool. IDs are agent-specific: codex ID won't work with gemini/claude/opencode. Switching agents does NOT sync info; pass updates via prompt or context_paths. | |
| permission | No | Security level: 'read-only' (analyze files), 'workspace-write' (modify inside workspace), 'unlimited' (full system access). Default: 'read-only'. | read-only |
| model | No | Optional model override (e.g., 'gemini-2.5-pro'). Use only if specifically requested. | |
| save_file | No | PREFERRED when agent needs to write files or produce lengthy output. Output is written directly to this path, avoiding context overflow. This write is permitted even in read-only mode (server-handled). Essential for: code generation, detailed reports, documentation. | |
| save_file_with_wrapper | No | When true AND save_file is set, wrap output in <agent-output> XML tags with metadata (agent name, continuation_id). For multi-agent assembly. | |
| save_file_with_append_mode | No | When true AND save_file is set, append instead of overwrite. For multi-agent collaboration on same document. | |
| report_mode | No | Generate a standalone, document-style report (no chat filler) suitable for sharing. | |
| context_paths | No | List of relevant files/dirs to preload as context hints. | |
| task_note | No | REQUIRED user-facing label. Summarize action in < 60 chars (e.g., '[Fix] Auth logic' or '[Read] config.py'). Shown in GUI progress bar to inform user. | |
| debug | No | Enable execution stats (tokens, duration) for this call. | |
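A minimal invocation that exercises the schema's required fields might look like the following. The prompt, workspace path, and task note are illustrative values, not prescribed ones.

```python
# Hypothetical arguments for one 'gemini' tool call, matching the schema above.
# Without a continuation_id, the prompt must carry ALL context the agent needs.
arguments = {
    "prompt": (
        "You have no prior memory. Context: this is a Flask app. "
        "Task: explain the routing layer in src/app.py."
    ),
    "workspace": "/path/to/repo",        # required: project root
    "permission": "read-only",           # default security level
    "task_note": "[Read] routing layer", # required user-facing label, < 60 chars
}

# Required per the schema:
assert arguments["prompt"] and arguments["workspace"]
assert len(arguments["task_note"]) < 60
```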
## Implementation Reference
- src/cli_agent_mcp/server.py:100-144 (registration): Registration of the 'gemini' MCP tool in the server's list_tools() method. Adds Tool(name="gemini", description=..., inputSchema=...) if enabled. Also registers 'gemini_parallel'.

```python
@server.list_tools()
async def list_tools() -> list[Tool]:
    """List the available tools."""
    tools = []
    for cli_type in ["codex", "gemini", "claude", "opencode", "banana", "image"]:
        if config.is_tool_allowed(cli_type):
            tools.append(
                Tool(
                    name=cli_type,
                    description=TOOL_DESCRIPTIONS[cli_type],
                    inputSchema=create_tool_schema(cli_type),
                )
            )
            # Append *_parallel tools (supported CLI tools only)
            if cli_type in PARALLEL_SUPPORTED_TOOLS:
                parallel_name = f"{cli_type}_parallel"
                parallel_desc = (
                    f"Run multiple {cli_type} tasks in parallel. "
                    f"All tasks share workspace/permission/save_file. "
                    f"Results are appended to save_file with XML wrappers "
                    f"(<agent-output agent=... continuation_id=... task_note=... task_index=... status=...>). "
                    f"Max 100 tasks. Model can be array: single element shared by all, or one per task."
                )
                tools.append(
                    Tool(
                        name=parallel_name,
                        description=parallel_desc,
                        inputSchema=create_tool_schema(cli_type, is_parallel=True),
                    )
                )

    # Add the get_gui_url tool
    if gui_manager:
        tools.append(
            Tool(
                name="get_gui_url",
                description=(
                    "Get the GUI dashboard URL. Returns the HTTP URL where "
                    "the live event viewer is accessible."
                ),
                inputSchema={"type": "object", "properties": {}, "required": []},
            )
        )

    # DEBUG: log tool-list requests (usually the first call after client init)
    logger.debug(
        f"[MCP] list_tools called, returning {len(tools)} tools: "
        f"{[t.name for t in tools]}"
    )
    return tools
```
- Specific description/schema for the 'gemini' tool used in registration. create_tool_schema('gemini') defines input validation using common properties. The excerpt below shows the tail of the 'gemini' entry in TOOL_DESCRIPTIONS, followed by the start of the 'claude' entry:

```python
BEST PRACTICES:
- Good first choice for "understand this codebase" tasks""",
    "claude": """Run Anthropic Claude CLI agent (code implementation).

NO SHARED MEMORY:
- Cannot see messages/outputs from codex/gemini/opencode.
- Only sees: (1) this prompt, (2) files in context_paths, (3) its own history via continuation_id.

CROSS-AGENT HANDOFF:
- Small data: paste into prompt.
- Large data: save_file -> context_paths -> prompt says "Read <file>".

CAPABILITIES:
- Strongest code writing and implementation abilities
- Excellent at translating requirements into working code
- Good at following patterns and conventions

BEST PRACTICES:
- Be explicit about target: "Replace old implementation completely"
- Specify cleanup: "Remove deprecated code paths"
```
- src/cli_agent_mcp/handlers/cli.py:76-234 (handler): CLIHandler.handle() is the primary handler executing the 'gemini' tool logic: it validates arguments, injects context, creates a GeminiInvoker, runs the CLI process, formats the response, and handles save_file.

```python
class CLIHandler(ToolHandler):
    """CLI tool handler (codex, gemini, claude, opencode)."""

    def __init__(self, cli_type: str):
        """Initialize the CLIHandler.

        Args:
            cli_type: CLI type (codex, gemini, claude, opencode)
        """
        self._cli_type = cli_type

    @property
    def name(self) -> str:
        return self._cli_type

    @property
    def description(self) -> str:
        from ..tool_schema import TOOL_DESCRIPTIONS
        return TOOL_DESCRIPTIONS.get(self._cli_type, "")

    def get_input_schema(self) -> dict[str, Any]:
        from ..tool_schema import create_tool_schema
        return create_tool_schema(self._cli_type)

    def validate(self, arguments: dict[str, Any]) -> str | None:
        prompt = arguments.get("prompt")
        workspace = arguments.get("workspace")
        if not prompt or not str(prompt).strip():
            return "Missing required argument: 'prompt'"
        if not workspace:
            return "Missing required argument: 'workspace'"
        return None

    async def handle(
        self,
        arguments: dict[str, Any],
        ctx: ToolContext,
    ) -> list[TextContent]:
        """Handle a CLI tool call."""
        # Validate
        error = self.validate(arguments)
        if error:
            return format_error_response(error)

        task_note = arguments.get("task_note", "")
        prompt = arguments.get("prompt", "")

        # Create the invoker (per-request isolation)
        event_callback = (
            ctx.make_event_callback(self._cli_type, task_note, None)
            if ctx.gui_manager
            else None
        )
        invoker = create_invoker(self._cli_type, event_callback=event_callback)

        # Push the user prompt to the GUI immediately
        ctx.push_user_prompt(self._cli_type, prompt, task_note)

        # Use the helper to inject report_mode and context_paths
        report_mode = arguments.get("report_mode", False)
        context_paths = arguments.get("context_paths", [])
        injected_prompt = inject_context_and_report_mode(prompt, context_paths, report_mode)
        arguments = {**arguments, "prompt": injected_prompt}

        # Build the params
        params = build_params(self._cli_type, arguments)

        try:
            # Execute (cancellation exceptions propagate directly; they are not returned)
            result = await invoker.execute(params)

            # Fetch arguments
            debug_enabled = ctx.resolve_debug(arguments)
            save_file_path = arguments.get("save_file", "")

            # Build debug_info (always built when debug is on; includes log_file)
            debug_info = None
            if debug_enabled:
                debug_info = FormatterDebugInfo(
                    model=result.debug_info.model if result.debug_info else None,
                    duration_sec=result.debug_info.duration_sec if result.debug_info else 0.0,
                    message_count=result.debug_info.message_count if result.debug_info else 0,
                    tool_call_count=result.debug_info.tool_call_count if result.debug_info else 0,
                    input_tokens=result.debug_info.input_tokens if result.debug_info else None,
                    output_tokens=result.debug_info.output_tokens if result.debug_info else None,
                    cancelled=result.cancelled,
                    log_file=ctx.config.log_file if ctx.config.log_debug else None,
                )

            # Build the ResponseData (using the unified data extracted by the invoker).
            # On error, still make a best effort to return collected content and the
            # session_id so the client can send a "continue".
            response_data = ResponseData(
                answer=result.agent_messages,  # return collected content even on failure
                session_id=result.session_id or "",
                thought_steps=result.thought_steps if not result.success else [],
                debug_info=debug_info,
                success=result.success,
                error=result.error,
            )

            # Format the response
            formatter = get_formatter()
            response = formatter.format(
                response_data,
                debug=debug_enabled,
            )

            # DEBUG: log a response summary
            logger.debug(
                f"[MCP] call_tool response:\n"
                f"  Tool: {self._cli_type}\n"
                f"  Success: {result.success}\n"
                f"  Response length: {len(response)} chars\n"
                f"  Duration: {result.debug_info.duration_sec:.3f}s"
                if result.debug_info else ""
            )

            # Save to file (if specified).
            # NOTE: save_file is an exception to the permission restrictions. It only
            # persists analysis results to disk; it is not a general file-write
            # capability. The CLI agent's actual file operations are still governed
            # by the permission parameter. This is a convenience so the orchestrator
            # does not have to write a separate file to save analysis results.
            if save_file_path and result.success:
                try:
                    file_content = formatter.format_for_file(response_data)

                    # Add the XML wrapper (if enabled)
                    if arguments.get("save_file_with_wrapper", False):
                        continuation_id = result.session_id or ""
                        file_content = (
                            f'<agent-output agent="{self._cli_type}" continuation_id="{continuation_id}">\n'
                            f'{file_content}\n'
                            f'</agent-output>\n'
                        )

                    # Append or overwrite
                    file_path = Path(save_file_path)
                    file_path.parent.mkdir(parents=True, exist_ok=True)
                    if arguments.get("save_file_with_append_mode", False) and file_path.exists():
                        with file_path.open("a", encoding="utf-8") as f:
                            f.write("\n" + file_content)
                        logger.info(f"Appended output to: {save_file_path}")
                    else:
                        file_path.write_text(file_content, encoding="utf-8")
                        logger.info(f"Saved output to: {save_file_path}")
                except Exception as e:
                    logger.warning(f"Failed to save output to {save_file_path}: {e}")

            return [TextContent(type="text", text=response)]

        except anyio.get_cancelled_exc_class() as e:
            # The cancellation notice was already pushed to the GUI by
            # invoker._send_cancel_event(); re-raise for the MCP framework to handle
            logger.info(f"Tool '{self._cli_type}' cancelled (type={type(e).__name__})")
            raise
        except asyncio.CancelledError as e:
            # Catch asyncio.CancelledError (it may differ from anyio's)
            logger.info(f"Tool '{self._cli_type}' cancelled via asyncio.CancelledError")
            raise
        except Exception as e:
            logger.error(f"Tool '{self._cli_type}' error: {e}")
            return format_error_response(str(e))
```
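The save_file wrapper logic in the handler produces file content shaped like the sketch below. The agent name, continuation_id, and body text here are hypothetical stand-ins.

```python
# Minimal sketch of the <agent-output> wrapper built in CLIHandler.handle().
# All values below are hypothetical.
agent = "gemini"
continuation_id = "abc-123"
file_content = "Findings: the routing layer uses blueprints."

wrapped = (
    f'<agent-output agent="{agent}" continuation_id="{continuation_id}">\n'
    f'{file_content}\n'
    f'</agent-output>\n'
)
```

With save_file_with_append_mode, successive results accumulate as a sequence of these wrapped blocks in one file, which is what makes multi-agent assembly on a shared document possible.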
- GeminiInvoker: Core helper that constructs the 'gemini' CLI command-line arguments, maps permissions to --sandbox/--allowed-tools, invokes the subprocess, and parses events.

```python
class GeminiInvoker(CLIInvoker):
    """Gemini CLI invoker.

    Wraps the Gemini CLI invocation logic, including:
    - Command-line argument construction
    - Mapping Permission to the --sandbox flag
    - The most minimal parameter set (no CLI-specific parameters)

    Example:
        invoker = GeminiInvoker()
        result = await invoker.execute(GeminiParams(
            prompt="Analyze this project",
            workspace=Path("/path/to/repo"),
        ))
    """

    def __init__(
        self,
        gemini_path: str = "gemini",
        event_callback: EventCallback | None = None,
        parser: Any | None = None,
    ) -> None:
        """Initialize the Gemini invoker.

        Args:
            gemini_path: Path to the gemini executable, defaults to "gemini"
            event_callback: Event callback function
            parser: Custom parser
        """
        super().__init__(event_callback=event_callback, parser=parser)
        self._gemini_path = gemini_path

    @property
    def cli_type(self) -> CLIType:
        return CLIType.GEMINI

    def build_command(self, params: CommonParams) -> list[str]:
        """Build the Gemini CLI command.

        Args:
            params: Invocation parameters

        Returns:
            Command-line argument list
        """
        cmd = [self._gemini_path]

        # Hard-coded: streaming JSON output (real-time JSONL)
        cmd.extend(["-o", "stream-json"])

        # Working directory
        cmd.extend(["--include-directories", str(params.workspace.absolute())])

        # Permission mapping.
        # Gemini's sandbox is an on/off switch, not a valued flag like Codex's:
        # read-only and workspace-write both enable the sandbox;
        # unlimited disables it.
        if params.permission != Permission.UNLIMITED:
            cmd.append("--sandbox")

        # Allowed tool list (based on permission level):
        # read-only: only read-type tools
        # workspace-write/unlimited: all tools
        if params.permission == Permission.READ_ONLY:
            allowed_tools = GEMINI_READ_ONLY_TOOLS
        else:
            allowed_tools = GEMINI_ALL_TOOLS
        cmd.extend(["--allowed-tools", ",".join(allowed_tools)])

        # Optional: model
        if params.model:
            cmd.extend(["--model", params.model])

        # Session resume
        if params.session_id:
            cmd.extend(["--resume", params.session_id])

        # Prompt as a positional argument (gemini 0.20+ deprecated the -p flag)
        cmd.append(params.prompt)

        return cmd

    @property
    def uses_stdin_prompt(self) -> bool:
        """Gemini passes the prompt as a positional argument, not via stdin."""
        return False

    def _process_event(self, event: Any, params: CommonParams) -> None:
        """Handle Gemini-specific events.

        Gemini's session_id arrives in the init event.
        """
        super()._process_event(event, params)
        if not self._session_id:
            raw = event.raw
            if raw.get("type") == "init":
                session_id = raw.get("session_id", "")
                if session_id:
                    self._session_id = session_id
```
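The permission-to-flags mapping in build_command can be sketched in isolation as follows. The tool names here are placeholders, not the real GEMINI_READ_ONLY_TOOLS / GEMINI_ALL_TOOLS values, and `build_flags` is a hypothetical helper, not part of the codebase.

```python
# Standalone sketch of GeminiInvoker.build_command's permission mapping.
# Tool names are hypothetical stand-ins; the real lists live in module constants.
READ_ONLY_TOOLS = ["read_file", "grep"]                # placeholder
ALL_TOOLS = READ_ONLY_TOOLS + ["write_file", "shell"]  # placeholder

def build_flags(permission: str) -> list[str]:
    """Map a permission level to the gemini CLI flags the invoker would emit."""
    cmd: list[str] = []
    # Sandbox is on for read-only and workspace-write, off for unlimited.
    if permission != "unlimited":
        cmd.append("--sandbox")
    # read-only restricts the tool list; everything else gets all tools.
    tools = READ_ONLY_TOOLS if permission == "read-only" else ALL_TOOLS
    cmd.extend(["--allowed-tools", ",".join(tools)])
    return cmd
```

Note the asymmetry: the sandbox switch distinguishes unlimited from the rest, while the tool list distinguishes read-only from the rest, so workspace-write ends up sandboxed but with all tools allowed.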
- GeminiAdapter: Supporting adapter for Gemini CLI command building and session_id extraction from events.

```python
class GeminiAdapter(AgentAdapter):
    """Gemini CLI adapter - stateless.

    Responsible only for:
    - Command-line argument construction
    - session_id extraction rules (gemini uses the session_id from the init event)
    """

    def __init__(self, gemini_path: str = "gemini") -> None:
        """Initialize the adapter.

        Args:
            gemini_path: Path to the gemini executable
        """
        self._gemini_path = gemini_path

    @property
    def cli_type(self) -> str:
        return "gemini"

    @property
    def uses_stdin_prompt(self) -> bool:
        """Gemini passes the prompt as a positional argument, not via stdin."""
        return False

    def build_command(self, params: Any) -> list[str]:
        """Build the Gemini CLI command.

        Args:
            params: GeminiParams instance

        Returns:
            Command-line argument list
        """
        cmd = [self._gemini_path]

        # Hard-coded: streaming JSON output
        cmd.extend(["-o", "stream-json"])

        # Working directory
        cmd.extend(["--include-directories", str(Path(params.workspace).absolute())])

        # Permission mapping: Gemini's sandbox is an on/off switch
        permission_value = (
            params.permission.value
            if hasattr(params.permission, "value")
            else str(params.permission)
        )
        if permission_value != "unlimited":
            cmd.append("--sandbox")

        # Optional: model
        if params.model:
            cmd.extend(["--model", params.model])

        # Session resume
        if params.session_id:
            cmd.extend(["--resume", params.session_id])

        # Prompt as a positional argument
        cmd.append(params.prompt)

        return cmd

    def extract_session_id(self, event: "UnifiedEvent") -> str | None:
        """Extract the session_id from an event.

        Gemini's session_id arrives in the init event.
        """
        # Try the base class's generic extraction first
        session_id = super().extract_session_id(event)
        if session_id:
            return session_id

        # Gemini-specific: the init event
        raw = event.raw
        if raw.get("type") == "init":
            session_id = raw.get("session_id", "")
            if session_id:
                return session_id

        return None
```