Skip to main content
Glama
ARCHITECTURE.mdβ€’12.2 kB
--- summary: 'Review Peekaboo Architecture Overview guidance' read_when: - 'planning work related to peekaboo architecture overview' - 'debugging or extending features described here' --- # Peekaboo Architecture Overview This document provides a high-level overview of how Tachikoma and PeekabooCore work together to provide AI-powered macOS automation capabilities. ## System Architecture ### Core Components ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Tachikoma β”‚ AI models + streaming β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ PeekabooAutomation│◄───►│ PeekabooAgentRuntime │◄───►│ PeekabooVisualizer β”‚ β”‚ UI/system servicesβ”‚ β”‚ Agent + MCP runtime β”‚ β”‚ Visual feedback stack β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β–Ό β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ PeekabooCoreβ”‚ β”‚ Apps / CLI β”‚ β”‚ (umbrella) β”‚ β”‚ consumers β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` - **PeekabooAutomation** – houses *all* automation-facing code (configuration, capture, application/menu/window services, session management, typed models). Anything that touches Accessibility, ScreenCaptureKit, or on-host configuration lives here. - **PeekabooVisualizer** – standalone visual feedback layer (`VisualizationClient`, event store, presets) used by automation and apps. - **PeekabooAgentRuntime** – MCP tools, ToolRegistry/formatters, and the agent service itself. Depends on `PeekabooAutomation` for services/data models and on `PeekabooVisualizer` for status tokens. - **PeekabooCore** – thin umbrella (`_exported` imports + `PeekabooServices` convenience container). Apps/CLI keep importing `PeekabooCore`, but large features can now link the more focused products directly. Whoever instantiates `PeekabooServices` is responsible for calling `installAgentRuntimeDefaults()` so MCP tools and the ToolRegistry share that instance. - **Tachikoma** – still the AI provider surface (OpenAI/Anthropic/Grok/Ollama) that the runtime modules call through. ### Dependency Flow **Tachikoma** (AI Model Management) - Provides `AIModelProvider` for dependency injection - Manages OpenAI, Anthropic, Grok, and Ollama models - Handles API configuration and credential management **PeekabooAutomation** - Depends on Tachikoma for provider metadata and `PeekabooVisualizer` for optional UI feedback. - Exposes pure Swift protocols (`ApplicationServiceProtocol`, `LoggingServiceProtocol`, etc.) plus concrete implementations (MenuService, ScreenCaptureService, ProcessService, etc.). - Owns persisted models such as `CaptureTarget`, `AutomationAction`, `UIElement`, `SessionInfo`, and shared helper utilities. **PeekabooAgentRuntime** - Imports `PeekabooAutomation` for services/models and hosts MCP/agent tooling (`PeekabooAgentService`, `MCPToolContext`, `ToolRegistry`, CLI/MCP formatters). - Provides a clean `PeekabooServiceProviding` protocol so higher layers (CLI, macOS app, Tachikoma MCP host) can swap concrete service collections without touching globals. **PeekabooVisualizer** - Stays decoupled from automation; only consumes `PeekabooProtocols` data (`DetectedElement`, `LogLevel`) so it can be embedded in other contexts later. - `VisualizationClient` is still accessed via `PeekabooAutomation` convenience wrappers, but the module boundary keeps visual dependencies out of headless hosts. ## Tachikoma: AI Model Management ### Architecture Pattern: Dependency Injection Tachikoma has migrated from a singleton pattern to dependency injection for better testability and flexibility: ```swift // Old (deprecated) let model = try await Tachikoma.shared.getModel("gpt-4.1") // New (recommended) let provider = try AIConfiguration.fromEnvironment() let model = try provider.getModel("gpt-4.1") ``` ### Key Components #### AIModelProvider - **Role**: Central registry for AI model instances - **Pattern**: Immutable collection with functional updates - **Thread Safety**: Full concurrent access support #### AIModelFactory - **Role**: Factory methods for creating model instances - **Supported Providers**: OpenAI, Anthropic, Grok (xAI), Ollama - **Configuration**: Handles API keys, base URLs, and model-specific parameters #### AIConfiguration - **Role**: Environment-based automatic configuration - **Sources**: Environment variables and `~/.tachikoma/credentials` file - **Auto-Discovery**: Automatically registers all available models ## PeekabooCore: Automation Engine ### Architecture Pattern: Service Orchestration PeekabooCore uses a service locator pattern with specialized service delegation: ```swift let services = PeekabooServices() let automation = services.automation // UIAutomationService let screenCapture = services.screenCapture // ScreenCaptureService let applications = services.applications // ApplicationService ``` ### Service Hierarchy #### PeekabooServices (Service Locator) - **Role**: Central registry for all automation services - **Pattern**: Service locator with dependency injection support - **Lifecycle**: Manages service initialization and coordination ##### Installing a services instance `PeekabooServices` no longer registers itself globally. Whoever constructs an instance (CLI runtime, macOS app, integration test, etc.) **must** call `services.installAgentRuntimeDefaults()` immediately after initialization. This wires the container into `MCPToolContext` and `ToolRegistry` so downstream tooling (MCP server, CLI `peekaboo tools`, agent service) can resolve the exact same services without touching singletons. Skipping the install step will cause MCP and ToolRegistry code to fatal because no default factory is configured. #### UIAutomationService (Orchestrator) - **Role**: Primary automation interface delegating to specialized services - **Delegation**: Routes operations to appropriate specialized services - **Session Management**: Maintains state across automation workflows #### Specialized Services Each service handles a specific aspect of automation: - **ClickService**: Mouse interaction and element targeting - **TypeService**: Keyboard input and text manipulation - **ScreenCaptureService**: Display and window capture - **ApplicationService**: Application discovery and management - **WindowManagementService**: Window positioning and state control - **MenuService**: Menu bar navigation and interaction - **SessionManager**: State persistence and element caching ### Threading Model **Main Thread Requirement**: All UI automation operations run on MainActor due to macOS requirements: ```swift @MainActor public final class UIAutomationService: UIAutomationServiceProtocol { // All operations are main-thread bound } ``` ### Integration Points #### AI Integration PeekabooCore integrates with Tachikoma through `PeekabooAgentService`: ```swift let modelProvider = try AIConfiguration.fromEnvironment() let agent = PeekabooAgentService( services: PeekabooServices(), modelProvider: modelProvider ) ``` #### Visual Feedback Integration Services automatically connect to PeekabooVisualizer when available: ```swift // Automatic visualizer integration let visualizerClient = VisualizationClient.shared _ = await visualizerClient.showClickFeedback(at: clickPoint, type: clickType) ``` Behind the scenes the client serializes a `VisualizerEvent` into `~/Library/Application Support/PeekabooShared/VisualizerEvents/<uuid>.json` and posts `boo.peekaboo.visualizer.event` via `NSDistributedNotificationCenter`. When Peekaboo.app is alive its `VisualizerEventReceiver` loads the payload and hands it to `VisualizerCoordinator`; otherwise the event is silently dropped and execution continues. ## Data Flow Architecture ### Automation Workflow 1. **Input**: Natural language task or direct API call 2. **AI Processing**: `PeekabooAgentService` uses Tachikoma models 3. **Service Orchestration**: `UIAutomationService` delegates to specialized services 4. **Platform Integration**: Services use macOS APIs (Accessibility, ScreenCaptureKit) 5. **Visual Feedback**: Operations trigger visualizer animations 6. **Session Management**: State cached for subsequent operations ### Example Flow: "Click the Submit button" ``` User Input ("Click Submit") ↓ PeekabooAgentService (AI interpretation) ↓ UIAutomationService.detectElements() β†’ ElementDetectionService ↓ UIAutomationService.click() β†’ ClickService ↓ macOS Accessibility APIs ↓ VisualizationClient (click animation) ``` ## Performance Characteristics ### Service Performance Ranges - **Element Detection**: 200-800ms (AI analysis + accessibility correlation) - **Click Operations**: 10-50ms (accessibility API optimization) - **Screen Capture**: 20-100ms (ScreenCaptureKit acceleration) - **Application Discovery**: 20-200ms (depending on system load) - **Window Management**: 10-200ms (depending on operation complexity) ### Optimization Strategies - **Session Caching**: Element detection results cached per session - **Accessibility Timeouts**: Reduced from 6s to 2s to prevent hangs - **Dual APIs**: Modern ScreenCaptureKit with CGWindowList fallback - **Visual Feedback**: Async animations don't block automation operations ## Error Handling Strategy ### Layered Error Handling 1. **Service Level**: Individual services handle API-specific errors 2. **Orchestration Level**: UIAutomationService provides unified error handling 3. **Agent Level**: AI agent handles retry logic and error recovery 4. **Client Level**: Applications receive structured error information ### Defensive Programming - **Permission Validation**: Automatic checks for Screen Recording and Accessibility permissions - **Timeout Protection**: Configurable timeouts prevent system hangs - **Graceful Degradation**: Fallback strategies for problematic applications - **State Validation**: Element existence and accessibility verification ## Configuration Management ### Multi-Source Configuration 1. **Environment Variables**: `PEEKABOO_AI_PROVIDERS`, `OPENAI_API_KEY`, etc. 2. **Credential Files**: `~/.peekaboo/config.json`, `~/.tachikoma/credentials` 3. **Runtime Parameters**: Method-level configuration overrides 4. **Feature Flags**: `PEEKABOO_USE_MODERN_CAPTURE`, etc. ### Configuration Precedence ``` CLI Arguments > Environment Variables > Credential Files > Config Files > Defaults ``` ## Future Architecture Considerations ### Scalability - Service architecture supports horizontal scaling through additional specialized services - AI model provider supports multiple concurrent model instances - Session management designed for multi-user and multi-process scenarios ### Extensibility - Plugin architecture possible through service locator pattern - AI model provider supports custom model implementations - Visual feedback system can be extended with additional visualization types ### Cross-Platform Potential - Service interfaces abstract platform-specific implementations - Threading model adaptable to other platforms - AI integration remains platform-agnostic --- *This architecture has been designed to be "really easy for other people to understand" while providing the performance and reliability needed for production automation workflows.*

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/steipete/Peekaboo'

If you have feedback or need assistance with the MCP directory API, please join our Discord server