agent-device is a CLI for UI automation on iOS, tvOS, macOS, Android, and Android TV. It is designed for agent-driven workflows: inspect the UI, act on it deterministically, and keep that work session-aware and replayable.
If you know Vercel's agent-browser, this project applies the same broad idea to mobile apps and devices.
- Give agents a practical way to understand mobile UI state through structured snapshots.
- Keep automation flows token-efficient enough for real agent loops.
- Make common interactions reliable enough for repeated automation runs.
- Keep automation grounded in sessions, selectors, and replayable flows instead of one-off scripts.
- Sessions: open a target once, interact within that session, then close it cleanly.
- Snapshots: inspect the current accessibility tree in a compact form and get stable refs for exploration.
- Refs vs selectors: use refs for discovery, use selectors for durable replay and assertions.
- Human docs vs agent skills: docs explain the system for people; skills provide compact operating guidance for agents.
The canonical loop is:
agent-device open SampleApp --platform ios
agent-device snapshot -i
agent-device press @e3
agent-device diff snapshot -i
agent-device fill @e5 "test"
agent-device close
In practice, most work follows the same pattern:
1. `open` a target app or URL.
2. `snapshot -i` to inspect the current screen.
3. `press`, `fill`, `scroll`, `get`, or `wait` using refs or selectors.
4. `diff snapshot` or re-snapshot after UI changes.
5. `close` when the session is finished.
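The pattern above can be wrapped in a small shell function so that the session is closed even when a step fails. This is a minimal sketch, not part of agent-device itself: the function name `run_canonical_loop` is hypothetical, it uses only the subcommands shown in the canonical loop, and the refs `@e3` and `@e5` are placeholders that would have to come from a real snapshot of your app.

```shell
#!/usr/bin/env bash
# Sketch only: run the canonical loop and always close the session.
# Assumption: `agent-device` is on PATH and each call reuses the session
# created by `open`, as in the canonical loop above.
# The refs @e3 and @e5 are placeholders; take real refs from `snapshot -i`.

run_canonical_loop() {
  local app="$1" platform="$2"
  local status=0

  # If the target cannot be opened there is no session to clean up.
  agent-device open "$app" --platform "$platform" || return 1

  # Stop at the first failing step, but remember its exit status
  # instead of aborting, so that `close` below still runs.
  agent-device snapshot -i &&
    agent-device press @e3 &&
    agent-device diff snapshot -i &&
    agent-device fill @e5 "test" ||
    status=$?

  agent-device close
  return "$status"
}
```

After sourcing the file, call it as `run_canonical_loop SampleApp ios`. The function returns the exit status of the first failing step, or 0 if every step succeeded, so it composes with ordinary shell error handling.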
For people:
For agents:
npm install -g agent-device
See CONTRIBUTING.md.
agent-device is an open source project and will always remain free to use. Callstack is a group of React and React Native geeks. Contact us at hello@callstack.com if you need any help with these technologies or just want to say hi.