Automation
Peekaboo's automation surface is small but covers the whole macOS UI graph. Each command is documented separately under commands/; this page is the map.
#Targeting model
Every input command accepts one of three target shapes:
- Element ID: --id E12 (from peekaboo see); the most reliable.
- Label / role / app: positional query text such as peekaboo click "Send" --app Mail; resolved via the AX tree.
- Coordinates: --coords 480,120; target-relative when paired with --app, --pid, or --window-*, global otherwise. Add --global-coords to force screen coordinates with a target.
Prefer IDs when you can capture them, labels when you can't, and coordinates only as a last resort. The agent and MCP tooling default to the first two.
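The preference order above can be sketched as a small shell helper. This is a minimal illustration rather than anything Peekaboo ships: click_target is a hypothetical function name, and it relies only on the --id, --app, and --coords forms listed above.

```shell
# Hypothetical helper (not part of Peekaboo): click using the most reliable
# target that is available. Arguments: element ID, label, coordinates, app.
# Pass an empty string for any target shape you do not have.
click_target() {
  id="${1:-}"; label="${2:-}"; coords="${3:-}"; app="${4:-}"
  if [ -n "$id" ]; then
    # Element ID captured earlier with `peekaboo see`.
    peekaboo click --id "$id"
  elif [ -n "$label" ]; then
    # Positional query text, resolved via the AX tree.
    peekaboo click "$label" --app "$app"
  elif [ -n "$coords" ]; then
    # Last resort: coordinates, target-relative because --app is given.
    peekaboo click --coords "$coords" --app "$app"
  else
    echo "click_target: no target given" >&2
    return 2
  fi
}
```

For example, click_target E12 "" "" Mail clicks by ID, while click_target "" "Send" "" Mail falls back to the label query.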
#Delivery modes
Peekaboo has two input delivery modes:
- Background (default when a target process is known) posts process-targeted input without activating the app. click, type, press, hotkey, and paste use this mode when you pass --app, --pid, --window-id, or a snapshot with process metadata.
- Foreground focuses the target first, then sends normal/global input to the active key window or mouse focus. Add --foreground when an app ignores background input, when a text field only accepts key-window input, or when you want focus/Space switching to be part of the action.
Focus flags such as --space-switch, --bring-to-current-space, and --no-auto-focus belong to foreground delivery; using them implies --foreground. Background element/query clicks can complete through Accessibility alone. Keyboard input, coordinate clicks, and synthetic click fallback require Event Synthesizing for the sender shown by peekaboo permissions status; request it with peekaboo permissions request-event-synthesizing.
Examples:
# Background: target Safari without activating it
peekaboo hotkey cmd,l --app Safari
peekaboo type "github.com/openclaw/Peekaboo" --app Safari --return
# Foreground: activate/focus first for apps that require a key window
peekaboo hotkey cmd,l --app Safari --foreground --space-switch
peekaboo type "github.com/openclaw/Peekaboo" --app Safari --return --foreground
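One way to keep background delivery as the default while opting specific apps into foreground delivery is a per-app list. The sketch below is an illustration under assumptions: type_into and the FOREGROUND_APPS variable are hypothetical names, not Peekaboo features; only type, --app, and --foreground come from this page.

```shell
# Hypothetical colon-separated list of apps that need key-window input,
# e.g. FOREGROUND_APPS="Terminal:Notes". Empty means background everywhere.
FOREGROUND_APPS="${FOREGROUND_APPS:-}"

# Type text into an app, adding --foreground only for apps on the list.
type_into() {
  text="${1:-}"; app="${2:-}"
  # Build the argument list in the positional parameters.
  set -- type "$text" --app "$app"
  case ":$FOREGROUND_APPS:" in
    *":$app:"*) set -- "$@" --foreground ;;
  esac
  peekaboo "$@"
}
```

With FOREGROUND_APPS="Notes", type_into "hello" Safari stays in the background, while type_into "hello" Notes focuses Notes first.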
#Input primitives
| Command | Use it for |
|---|---|
| click | mouse clicks, double/triple, right/middle, hold |
| type | typing strings into targeted fields |
| press | individual key presses (return, escape, arrows, etc.) |
| hotkey | shortcut combos, including background apps |
| scroll | wheel scrolling at a point or on a target |
| drag | press, move, release — files, sliders, selections |
| swipe | trackpad-style multi-finger gestures |
| move | warp the mouse without clicking |
| set-value | write to text fields without typing |
| perform-action | trigger any AX action (AXPress, AXShowMenu, …) |
| sleep | wait between steps with deterministic timing |
For UX parity with humans (jitter, easing, dwell), see human-typing.md and human-mouse-move.md.
#Surfaces
| Surface | Command | Notes |
|---|---|---|
| App lifecycle | app | launch, quit, focus, hide |
| Windows | window | move, resize, focus, minimize, fullscreen |
| Spaces & Stage Manager | space | enumerate and switch Spaces |
| Menus | menu | walk app menus by path |
| Menu bar / status items | menubar | extra-fiddly popovers |
| Dialogs | dialog | sheets, alerts, save panels |
| Dock | dock | inspect/click dock items |
| Clipboard | clipboard | read/write pasteboard contents |
| Open files / URLs | open | with focus controls |
| Visual feedback | visualizer | overlay so a human can follow what the agent is doing |
#Recipe: click a button by label
# 1. Inspect first to find a stable label.
peekaboo see --app Safari --annotate --path safari.png
# 2. Click it.
peekaboo click "Reload" --app Safari
#Recipe: a small flow
peekaboo window focus --app "Notes"
peekaboo hotkey cmd+n
peekaboo type "Standup notes\n\n- Shipped Peekaboo docs\n- Reviewed PR #42\n"
peekaboo hotkey cmd+s
Three commands, four lines. The agent does the same thing under the hood; it just plans the sequence for you.
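The same flow can be wrapped so that it stops at the first failing step and pauses briefly between steps. This is a sketch under assumptions: step and notes_flow are hypothetical names, the typed text is shortened for the example, and it assumes peekaboo exits non-zero when a step fails.

```shell
# Hypothetical wrapper: run one peekaboo step, then pause briefly.
# Returns the step's own failure status so callers can stop early.
step() {
  peekaboo "$@" || return $?
  peekaboo sleep 0.2
}

# The Notes flow, chained with && so a failed step aborts the rest.
notes_flow() {
  step window focus --app "Notes" &&
  step hotkey cmd+n &&
  step type "Standup notes" &&
  step hotkey cmd+s
}
```

Calling notes_flow runs the four steps in order; its exit status is that of the first step that failed, or zero if all succeeded.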
#Resilience tips
- Always run peekaboo see when an element is unreachable. The AX tree refreshes after focus changes; capture again if a click fails.
- Use focus and application-resolving for tricky cases (multiple windows, helper apps, processes that hide on activation).
- Wrap risky sequences with peekaboo sleep 0.2; humans don't fire ten clicks in a single frame, and neither should you.
- Prefer background delivery for routine app-specific input so automations do not steal focus.
- Add --foreground only when an app needs a focused key window, Space switch, or foreground mouse event.
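The re-capture and pause tips can be combined in a single retry helper. The following is a minimal sketch, not a Peekaboo feature: click_with_recapture is a hypothetical name, and the retry depends on the assumption that peekaboo exits non-zero when a click fails.

```shell
# Hypothetical helper: click a label in an app, re-capturing the AX tree
# once if the first attempt fails.
click_with_recapture() {
  label="${1:-}"; app="${2:-}"
  if peekaboo click "$label" --app "$app"; then
    return 0
  fi
  # First attempt failed: refresh the snapshot, pause, then retry once.
  peekaboo see --app "$app" >/dev/null || return 1
  peekaboo sleep 0.2
  peekaboo click "$label" --app "$app"
}
```

A call such as click_with_recapture "Reload" Safari makes at most two click attempts and returns the status of the last one.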
#Going further
- Agent overview — let Peekaboo plan input sequences from a goal.
- MCP — expose all of the above to Codex, Claude Code, and Cursor.
- Architecture — how the input pipeline routes through Bridge and Daemon.