Peekaboo Architecture Overview
This document provides a high-level overview of how Tachikoma and PeekabooCore work together to provide AI-powered macOS automation capabilities.
#System Architecture
#Core Components
┌────────────────────┐
│     Tachikoma      │  AI models + streaming
└─────────┬──────────┘
          │
┌─────────▼──────────┐     ┌──────────────────────┐     ┌───────────────────────┐
│ PeekabooAutomation │◄───►│ PeekabooAgentRuntime │◄───►│  PeekabooVisualizer   │
│ UI/system services │     │ Agent + MCP runtime  │     │ Visual feedback stack │
└─────────┬──────────┘     └──────────┬───────────┘     └───────────┬───────────┘
          │                           │                             │
          └─────────────┬─────────────┴───────────────┬─────────────┘
                        ▼                             ▼
                 ┌──────────────┐              ┌──────────────┐
                 │ PeekabooCore │              │  Apps / CLI  │
                 │  (umbrella)  │              │  consumers   │
                 └──────────────┘              └──────────────┘
- PeekabooAutomation – houses all automation-facing code (configuration, capture, application/menu/window services, snapshot management, typed models). Anything that touches Accessibility, ScreenCaptureKit, or on-host configuration lives here.
- PeekabooVisualizer – standalone visual feedback layer (`VisualizationClient`, event store, presets) used by automation and apps.
- PeekabooAgentRuntime – MCP tools, ToolRegistry/formatters, and the agent service itself. Depends on `PeekabooAutomation` for services/data models and on `PeekabooVisualizer` for status tokens.
- PeekabooCore – thin umbrella (`@_exported` imports + `PeekabooServices` convenience container). Apps/CLI keep importing `PeekabooCore`, but large features can now link the more focused products directly. Whoever instantiates `PeekabooServices` is responsible for calling `installAgentRuntimeDefaults()` so MCP tools and the ToolRegistry share that instance.
- Tachikoma – still the AI provider surface (OpenAI/Anthropic/Grok/Ollama) that the runtime modules call through.
#Dependency Flow
Tachikoma (AI Model Management)
- Provides `AIModelProvider` for dependency injection
- Manages OpenAI, Anthropic, Grok, and Ollama models
- Handles API configuration and credential management
PeekabooAutomation
- Depends on Tachikoma for provider metadata and `PeekabooVisualizer` for optional UI feedback.
- Exposes pure Swift protocols (`ApplicationServiceProtocol`, `LoggingServiceProtocol`, etc.) plus concrete implementations (MenuService, ScreenCaptureService, ProcessService, etc.).
- Owns persisted models such as `CaptureTarget`, `AutomationAction`, `UIElement`, `SnapshotInfo`, and shared helper utilities.
PeekabooAgentRuntime
- Imports `PeekabooAutomation` for services/models and hosts MCP/agent tooling (`PeekabooAgentService`, `MCPToolContext`, `ToolRegistry`, CLI/MCP formatters).
- Provides a clean `PeekabooServiceProviding` protocol so higher layers (CLI, macOS app, and the MCP server entrypoints) can swap concrete service collections without touching globals.
PeekabooVisualizer
- Stays decoupled from automation; only consumes `PeekabooProtocols` data (`DetectedElement`, `LogLevel`) so it can be embedded in other contexts later.
- `VisualizationClient` is still accessed via `PeekabooAutomation` convenience wrappers, but the module boundary keeps visual dependencies out of headless hosts.
#Tachikoma: AI Model Management
#Architecture Pattern: Dependency Injection
Tachikoma has migrated from a singleton pattern to dependency injection for better testability and flexibility:
// Old (deprecated)
let model = try await Tachikoma.shared.getModel("gpt-4.1")
// New (recommended)
let provider = try AIConfiguration.fromEnvironment()
let model = try provider.getModel("gpt-4.1")
#Key Components
#AIModelProvider
- Role: Central registry for AI model instances
- Pattern: Immutable collection with functional updates
- Thread Safety: Full concurrent access support
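The "immutable collection with functional updates" pattern can be sketched as follows. This is a minimal illustration, not Tachikoma's actual implementation: `SketchModel`, `StubModel`, `SketchModelProvider`, and `adding(_:)` are hypothetical names chosen for the example.

```swift
// Hypothetical stand-ins; the real Tachikoma types may look different.
protocol SketchModel: Sendable {
    var name: String { get }
}

struct StubModel: SketchModel {
    let name: String
}

enum ModelLookupError: Error, Equatable {
    case notRegistered(String)
}

/// Immutable registry: every "update" returns a new value instead of mutating,
/// so a provider can be shared across tasks without locking.
struct SketchModelProvider: Sendable {
    private let models: [String: any SketchModel]

    init(models: [String: any SketchModel] = [:]) {
        self.models = models
    }

    /// Functional update: returns a copy that also contains `model`.
    func adding(_ model: any SketchModel) -> SketchModelProvider {
        var copy = models
        copy[model.name] = model
        return SketchModelProvider(models: copy)
    }

    func getModel(_ name: String) throws -> any SketchModel {
        guard let model = models[name] else {
            throw ModelLookupError.notRegistered(name)
        }
        return model
    }

    var availableModels: [String] { models.keys.sorted() }
}

let emptyProvider = SketchModelProvider()
let provider = emptyProvider.adding(StubModel(name: "gpt-4.1"))
```

Because `adding(_:)` leaves `emptyProvider` untouched, tests can build a provider per case without any shared state to reset.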
#AIModelFactory
- Role: Factory methods for creating model instances
- Supported Providers: OpenAI, Anthropic, Grok (xAI), Ollama
- Configuration: Handles API keys, base URLs, and model-specific parameters
#AIConfiguration
- Role: Environment-based automatic configuration
- Sources: Environment variables and `~/.tachikoma/credentials` file
- Auto-Discovery: Automatically registers all available models
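A rough sketch of environment-based configuration with a credentials-file fallback is shown below. The `KEY=VALUE` file format, the `ANTHROPIC_API_KEY` entry, the key-to-provider mapping, and all function names are assumptions made for illustration; the document does not specify how `AIConfiguration` parses its sources.

```swift
import Foundation

/// Parses `KEY=VALUE` lines; blank lines and `#` comments are skipped.
/// The file format is an assumption made for this sketch.
func parseCredentials(_ contents: String) -> [String: String] {
    var result: [String: String] = [:]
    for rawLine in contents.split(whereSeparator: \.isNewline) {
        let line = rawLine.trimmingCharacters(in: .whitespaces)
        guard !line.isEmpty, !line.hasPrefix("#"),
              let separator = line.firstIndex(of: "=") else { continue }
        let key = line[..<separator].trimmingCharacters(in: .whitespaces)
        let value = line[line.index(after: separator)...]
            .trimmingCharacters(in: .whitespaces)
        if !key.isEmpty { result[key] = value }
    }
    return result
}

/// Merges both sources; environment variables win over the file.
func resolveCredentials(environment: [String: String],
                        fileContents: String?) -> [String: String] {
    let fromFile = fileContents.map(parseCredentials) ?? [:]
    return fromFile.merging(environment) { _, fromEnvironment in fromEnvironment }
}

/// Hypothetical mapping from credential key to provider name.
let providerKeys = ["OPENAI_API_KEY": "openai", "ANTHROPIC_API_KEY": "anthropic"]

/// Auto-discovery: a provider counts as available when its key is non-empty.
func discoveredProviders(in credentials: [String: String]) -> [String] {
    providerKeys.compactMap { key, provider in
        (credentials[key]?.isEmpty == false) ? provider : nil
    }.sorted()
}

let sampleFile = """
# sample credentials file
OPENAI_API_KEY=from-file
ANTHROPIC_API_KEY = file-anthropic
"""
let credentials = resolveCredentials(
    environment: ["OPENAI_API_KEY": "from-env"],
    fileContents: sampleFile
)
```

In real use the `environment` argument would come from `ProcessInfo.processInfo.environment`; passing it in explicitly keeps the resolution logic testable.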
#PeekabooCore: Automation Engine
#Architecture Pattern: Service Orchestration
PeekabooCore uses a service locator pattern with specialized service delegation:
let services = PeekabooServices()
let automation = services.automation // UIAutomationService
let screenCapture = services.screenCapture // ScreenCaptureService
let applications = services.applications // ApplicationService
#Service Hierarchy
#PeekabooServices (Service Locator)
- Role: Central registry for all automation services
- Pattern: Service locator with dependency injection support
- Lifecycle: Manages service initialization and coordination
##### Installing a services instance
`PeekabooServices` no longer registers itself globally. Whoever constructs an instance (CLI runtime, macOS app, integration test, etc.) must call `services.installAgentRuntimeDefaults()` immediately after initialization. This wires the container into `MCPToolContext` and `ToolRegistry` so downstream tooling (MCP server, CLI `peekaboo tools`, agent service) can resolve the exact same services without touching singletons. Skipping the install step causes MCP and ToolRegistry code to hit a fatal error because no default factory is configured.
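The install step can be pictured with the simplified sketch below. All types here (`SketchServices`, `SketchToolContext`) and the `installDefaults(into:)` signature are hypothetical stand-ins; the real `installAgentRuntimeDefaults()` takes no arguments and its internals are not shown in this document.

```swift
/// Hypothetical stand-in for the services container.
final class SketchServices {
    let label: String
    init(label: String) { self.label = label }
}

/// Hypothetical stand-in for a consumer such as MCPToolContext or ToolRegistry:
/// it only knows how to ask an installed factory for services.
final class SketchToolContext {
    private var defaultFactory: (() -> SketchServices)?

    var isConfigured: Bool { defaultFactory != nil }

    func configureDefault(_ factory: @escaping () -> SketchServices) {
        defaultFactory = factory
    }

    /// Traps when nothing was installed, mirroring the failure described above.
    func makeDefault() -> SketchServices {
        guard let factory = defaultFactory else {
            fatalError("No default services factory installed")
        }
        return factory()
    }
}

extension SketchServices {
    /// Install step: every context passed in will resolve this exact instance.
    func installDefaults(into contexts: [SketchToolContext]) {
        for context in contexts {
            context.configureDefault { self }
        }
    }
}

let toolContext = SketchToolContext()
let toolRegistry = SketchToolContext()
let sharedServices = SketchServices(label: "cli")
sharedServices.installDefaults(into: [toolContext, toolRegistry])
```

The point of the pattern is that both consumers hand back the same object, so state such as snapshots is shared rather than duplicated.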
#UIAutomationService (Orchestrator)
- Role: Primary automation interface delegating to specialized services
- Delegation: Routes operations to appropriate specialized services
- Snapshot Management: Maintains state across automation workflows
#Specialized Services
Each service handles a specific aspect of automation:
- ClickService: Mouse interaction and element targeting
- TypeService: Keyboard input and text manipulation
- ScreenCaptureService: Display and window capture
- ApplicationService: Application discovery and management
- WindowManagementService: Window positioning and state control
- MenuService: Menu bar navigation and interaction
- SnapshotManager: State persistence and element caching
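The delegation described above might look roughly like this. The protocol and type names are invented for the sketch and the services only return strings; the real services call macOS APIs and have richer signatures.

```swift
// Hypothetical stand-ins for illustration; not the real PeekabooCore types.
struct ScreenPoint: Equatable {
    let x: Int
    let y: Int
}

protocol ClickHandling {
    func click(at point: ScreenPoint) -> String
}

protocol TypeHandling {
    func typeText(_ text: String) -> String
}

struct SketchClickService: ClickHandling {
    func click(at point: ScreenPoint) -> String {
        "click(\(point.x), \(point.y))"
    }
}

struct SketchTypeService: TypeHandling {
    func typeText(_ text: String) -> String {
        "type(\(text))"
    }
}

/// Orchestrator: contains no input logic itself. It routes each call to the
/// matching specialized service and records the outcome.
final class SketchAutomationService {
    private let clickService: any ClickHandling
    private let typeService: any TypeHandling
    private(set) var history: [String] = []

    init(clickService: any ClickHandling, typeService: any TypeHandling) {
        self.clickService = clickService
        self.typeService = typeService
    }

    func click(at point: ScreenPoint) {
        history.append(clickService.click(at: point))
    }

    func typeText(_ text: String) {
        history.append(typeService.typeText(text))
    }
}

let automation = SketchAutomationService(
    clickService: SketchClickService(),
    typeService: SketchTypeService()
)
automation.click(at: ScreenPoint(x: 10, y: 20))
automation.typeText("hello")
```

Injecting the services through protocols is what lets a test swap in a fake click service without touching the orchestrator.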
#Threading Model
Main Thread Requirement: All UI automation operations run on MainActor due to macOS requirements:
@MainActor
public final class UIAutomationService: UIAutomationServiceProtocol {
// All operations are main-thread bound
}
#Integration Points
#AI Integration
PeekabooCore integrates with Tachikoma through PeekabooAgentService:
let modelProvider = try AIConfiguration.fromEnvironment()
let agent = PeekabooAgentService(
services: PeekabooServices(),
modelProvider: modelProvider
)
#Visual Feedback Integration
Services automatically connect to PeekabooVisualizer when available:
// Automatic visualizer integration
let visualizerClient = VisualizationClient.shared
_ = await visualizerClient.showClickFeedback(at: clickPoint, type: clickType)
Behind the scenes the client serializes a `VisualizerEvent` into `~/Library/Application Support/PeekabooShared/VisualizerEvents/<uuid>.json` and posts `boo.peekaboo.visualizer.event` via `NSDistributedNotificationCenter`. When Peekaboo.app is running, its `VisualizerEventReceiver` loads the payload and hands it to `VisualizerCoordinator`; otherwise the event is silently dropped and execution continues.
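The file-plus-notification handoff can be approximated as below. The payload fields, the helper names, and the choice to pass the event ID as the notification's `object` are assumptions; the real `VisualizerEvent` schema is not shown in this document. The sketch writes to a temporary directory rather than the shared Application Support path.

```swift
import Foundation

/// Hypothetical payload; the real VisualizerEvent schema may differ.
struct SketchVisualizerEvent: Codable, Equatable {
    let id: UUID
    let kind: String
    let x: Double
    let y: Double
}

/// Writes the event as `<uuid>.json` inside `directory` and returns the file URL.
func persist(_ event: SketchVisualizerEvent, in directory: URL) throws -> URL {
    try FileManager.default.createDirectory(
        at: directory, withIntermediateDirectories: true)
    let fileURL = directory.appendingPathComponent("\(event.id.uuidString).json")
    try JSONEncoder().encode(event).write(to: fileURL, options: .atomic)
    return fileURL
}

/// Announces the event to other processes. Distributed notifications exist on
/// macOS only, so this is a no-op elsewhere. If no receiver is listening, the
/// notification is simply never observed.
func announce(_ event: SketchVisualizerEvent) {
    #if os(macOS)
    DistributedNotificationCenter.default().postNotificationName(
        Notification.Name("boo.peekaboo.visualizer.event"),
        object: event.id.uuidString,  // assumption: receiver locates the file by ID
        userInfo: nil,
        deliverImmediately: true
    )
    #endif
}

let eventDirectory = FileManager.default.temporaryDirectory
    .appendingPathComponent("VisualizerEventsSketch")
let event = SketchVisualizerEvent(id: UUID(), kind: "click", x: 120, y: 48)
let eventFile = try persist(event, in: eventDirectory)
announce(event)

// A receiver would load the payload back from disk like this.
let decodedEvent = try JSONDecoder().decode(
    SketchVisualizerEvent.self, from: Data(contentsOf: eventFile))
```

Keeping the payload on disk and sending only a small notification avoids size limits on notification contents and lets the sender continue without waiting for a reply.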
#Data Flow Architecture
#Automation Workflow
- Input: Natural language task or direct API call
- AI Processing: `PeekabooAgentService` uses Tachikoma models
- Service Orchestration: `UIAutomationService` delegates to specialized services
- Platform Integration: Services use macOS APIs (Accessibility, ScreenCaptureKit)
- Visual Feedback: Operations trigger visualizer animations
- Snapshot Management: State cached for subsequent operations
#Example Flow: "Click the Submit button"
User Input ("Click Submit")
↓
PeekabooAgentService (AI interpretation)
↓
UIAutomationService.detectElements() → ElementDetectionService
↓
UIAutomationService.click() → ClickService
↓
macOS Accessibility APIs
↓
VisualizationClient (click animation)
#Performance Characteristics
#Service Performance Ranges
- Element Detection: 200-800ms (AI analysis + accessibility correlation)
- Click Operations: 10-50ms (accessibility API optimization)
- Screen Capture: 20-100ms (ScreenCaptureKit acceleration)
- Application Discovery: 20-200ms (depending on system load)
- Window Management: 10-200ms (depending on operation complexity)
#Optimization Strategies
- Snapshot Caching: Element detection results cached per snapshot
- Accessibility Timeouts: Reduced from 6s to 2s to prevent hangs
- Dual APIs: Modern ScreenCaptureKit with CGWindowList fallback
- Visual Feedback: Async animations don't block automation operations
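Per-snapshot caching, the first strategy above, can be illustrated with a small memoizing cache. `SketchElement` and `SnapshotCache` are hypothetical names, and the detection step is a closure standing in for the real element detection.

```swift
/// Hypothetical element record for the sketch.
struct SketchElement: Equatable {
    let id: String
    let label: String
}

/// Caches detection results by snapshot ID so repeated lookups within the
/// same snapshot do not re-run the expensive detection step.
final class SnapshotCache {
    private var storage: [String: [SketchElement]] = [:]
    private(set) var detectionRuns = 0

    func elements(forSnapshot id: String,
                  detect: () -> [SketchElement]) -> [SketchElement] {
        if let cached = storage[id] {
            return cached
        }
        detectionRuns += 1
        let fresh = detect()
        storage[id] = fresh
        return fresh
    }

    /// Drops a snapshot, for example after the UI is known to have changed.
    func invalidate(_ id: String) {
        storage[id] = nil
    }
}

let cache = SnapshotCache()
let detectSubmit = { [SketchElement(id: "B1", label: "Submit")] }

let firstLookup = cache.elements(forSnapshot: "snap-1", detect: detectSubmit)
let secondLookup = cache.elements(forSnapshot: "snap-1", detect: detectSubmit)
let runsBeforeInvalidate = cache.detectionRuns

cache.invalidate("snap-1")
_ = cache.elements(forSnapshot: "snap-1", detect: detectSubmit)
```

Given the 200-800ms range quoted for element detection, skipping even one repeat detection per workflow step is a meaningful saving.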
#Error Handling Strategy
#Layered Error Handling
- Service Level: Individual services handle API-specific errors
- Orchestration Level: UIAutomationService provides unified error handling
- Agent Level: AI agent handles retry logic and error recovery
- Client Level: Applications receive structured error information
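One way to picture these layers is the sketch below: a service-level error is wrapped with context at the orchestration level, and an agent-level helper retries only failures marked retryable. The error types, `orchestrate`, and `withRetry` are hypothetical and not taken from the Peekaboo codebase.

```swift
/// Service level: API-specific failures.
enum ServiceError: Error, Equatable {
    case elementNotFound(String)
    case timedOut
}

/// Orchestration level: one unified error type that carries context.
struct AutomationError: Error, Equatable {
    let operation: String
    let underlying: ServiceError
    var isRetryable: Bool { underlying == .timedOut }
}

/// Wraps service errors so callers see a single structured error type.
func orchestrate<T>(_ operation: String, _ body: () throws -> T) throws -> T {
    do {
        return try body()
    } catch let error as ServiceError {
        throw AutomationError(operation: operation, underlying: error)
    }
}

/// Agent level: retry while the failure is retryable and attempts remain.
/// Any other error propagates to the client unchanged.
func withRetry<T>(maxAttempts: Int, _ body: (Int) throws -> T) throws -> T {
    precondition(maxAttempts >= 1, "maxAttempts must be at least 1")
    var attempt = 1
    while true {
        do {
            return try body(attempt)
        } catch let error as AutomationError
            where error.isRetryable && attempt < maxAttempts {
            attempt += 1
        }
    }
}

// A click that times out once, then succeeds on the second attempt.
var attemptsUsed = 0
let clickResult = try withRetry(maxAttempts: 3) { attempt -> String in
    attemptsUsed = attempt
    return try orchestrate("click") { () -> String in
        if attempt < 2 { throw ServiceError.timedOut }
        return "clicked"
    }
}

// A missing element is not retryable, so the client receives it immediately.
var clientError: AutomationError?
do {
    _ = try withRetry(maxAttempts: 3) { _ -> String in
        try orchestrate("click") { () -> String in
            throw ServiceError.elementNotFound("Submit")
        }
    }
} catch let error as AutomationError {
    clientError = error
}
```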
#Defensive Programming
- Permission Validation: Automatic checks for Screen Recording and Accessibility permissions
- Timeout Protection: Configurable timeouts prevent system hangs
- Graceful Degradation: Fallback strategies for problematic applications
- State Validation: Element existence and accessibility verification
#Configuration Management
#Multi-Source Configuration
- Environment Variables: `PEEKABOO_AI_PROVIDERS`, `OPENAI_API_KEY`, etc.
- Credential Files: `~/.peekaboo/config.json`, `~/.tachikoma/credentials`
- Runtime Parameters: Method-level configuration overrides
- Feature Flags: `PEEKABOO_USE_MODERN_CAPTURE`, etc.
#Configuration Precedence
CLI Arguments > Environment Variables > Credential Files > Config Files > Defaults
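The precedence chain can be expressed as a lookup that walks the sources from highest to lowest priority. `ConfigSource`, `LayeredConfiguration`, and the sample values are hypothetical; only the ordering comes from the line above.

```swift
/// Sources in declaration order, from highest to lowest priority.
enum ConfigSource: Int, CaseIterable {
    case cliArguments
    case environment
    case credentialFile
    case configFile
    case defaults
}

struct LayeredConfiguration {
    var layers: [ConfigSource: [String: String]] = [:]

    /// Returns the first value found, together with the source that supplied it.
    func resolve(_ key: String) -> (value: String, source: ConfigSource)? {
        for source in ConfigSource.allCases {
            if let value = layers[source]?[key] {
                return (value, source)
            }
        }
        return nil
    }
}

// Sample values are placeholders, not real Peekaboo settings.
let configuration = LayeredConfiguration(layers: [
    .defaults: ["PEEKABOO_AI_PROVIDERS": "default-provider", "timeout": "2"],
    .configFile: ["PEEKABOO_AI_PROVIDERS": "file-provider"],
    .environment: ["PEEKABOO_AI_PROVIDERS": "env-provider"],
])

let providers = configuration.resolve("PEEKABOO_AI_PROVIDERS")
let timeout = configuration.resolve("timeout")
let missing = configuration.resolve("unknown-key")
```

Returning the winning source alongside the value makes it easier to debug why a setting did not take effect.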
#Future Architecture Considerations
#Scalability
- Service architecture supports horizontal scaling through additional specialized services
- AI model provider supports multiple concurrent model instances
- Snapshot management designed for multi-user and multi-process scenarios
#Extensibility
- Plugin architecture possible through service locator pattern
- AI model provider supports custom model implementations
- Visual feedback system can be extended with additional visualization types
#Cross-Platform Potential
- Service interfaces abstract platform-specific implementations
- Threading model adaptable to other platforms
- AI integration remains platform-agnostic
This architecture is designed to be easy for newcomers to understand while providing the performance and reliability needed for production automation workflows.