35 changes: 13 additions & 22 deletions .agents/tools/browser/agent-browser.md
@@ -25,6 +25,7 @@ tools:
- **Limitations**: No proxy, no extensions, no Chrome DevTools MCP pairing
- **Performance** (warm): navigate+screenshot 1.9s, form fill 1.4s, reliability 0.6s. Cold-start ~3-5s.
- **iOS** (macOS only): `-p ios --device "iPhone 16 Pro"` — Mobile Safari via Appium
- **License**: Apache-2.0 | TypeScript (74%), Rust (22%)

**Core workflow** — use refs from `snapshot -i` for deterministic targeting:

@@ -46,7 +47,6 @@ agent-browser screenshot page.png && agent-browser close
```bash
# Navigate
agent-browser open <url> | back | forward | reload

# Interact
agent-browser click <sel>
agent-browser fill <sel> <text> # clear+fill
@@ -56,12 +56,10 @@ agent-browser select <sel> <val> # dropdown
agent-browser check/uncheck <sel>
agent-browser scroll <dir> [px] # up/down/left/right
agent-browser drag <src> <tgt> | upload <sel> <files> | hover <sel>

# Read
agent-browser get text/html/value/title/url <sel>
agent-browser get attr <sel> <attr> | get count/box <sel>
agent-browser is visible/enabled/checked <sel>

# Output
agent-browser screenshot [path] [--full] | pdf <path> | eval <js> | close
```
@@ -78,19 +76,14 @@ agent-browser find label "Email" fill "test@test.com"
agent-browser find first ".item" click | find nth 2 "a" text
```

## Sessions
## Sessions, Wait, Storage, Network

Each session has isolated browser instance, cookies, storage, history, and auth state. Parallel sessions supported (`--session s1/s2/s3`; 3 parallel tested in 2.0s).
Isolated browser per session (cookies, storage, history, auth). Parallel: `--session s1/s2/s3` (3 parallel tested in 2.0s).

```bash
agent-browser --session agent1 open site-a.com
AGENT_BROWSER_SESSION=agent1 agent-browser click "#btn"
agent-browser --session agent1 open site-a.com # named session
AGENT_BROWSER_SESSION=agent1 agent-browser click "#btn" # env var
agent-browser session list
```

## Wait, Cookies, Storage, Network

```bash
agent-browser wait <selector> | <ms> | --text "Welcome" | --url "**/dash" | --load networkidle
agent-browser wait --fn "window.ready === true"
agent-browser cookies | cookies set <name> <val> | cookies clear
@@ -114,13 +107,13 @@ agent-browser trace start/stop [path] | console [--clear] | errors [--clear]
agent-browser highlight <sel> | state save/load <path>
```

## iOS Simulator

- **Env vars**: `AGENT_BROWSER_PROVIDER=ios`, `AGENT_BROWSER_IOS_DEVICE="iPhone 16 Pro"`, `AGENT_BROWSER_IOS_UDID=<udid>`
- **First launch**: ~30-60s to boot simulator; subsequent commands are fast
- **Real device**: UDID via `xcrun xctrace list devices`, sign WebDriverAgent in Xcode (free Apple Developer account)
## iOS Simulator (macOS only)

```bash
# Env vars: AGENT_BROWSER_PROVIDER=ios, AGENT_BROWSER_IOS_DEVICE="iPhone 16 Pro",
# AGENT_BROWSER_IOS_UDID=<udid>
# First launch ~30-60s (simulator boot); subsequent commands fast
# Real device: UDID via `xcrun xctrace list devices`, sign WebDriverAgent in Xcode (free Apple Developer account)
Comment on lines +113 to +116

medium

While the goal of making the document more concise is understood, moving the descriptive information about the iOS simulator from a bulleted list into comments inside a code block harms readability and structure.

The previous format was clearer for a few reasons:

  • It separated prose/metadata from the code examples, which is a good practice for documentation.
  • Bulleted lists are a more standard and scannable way to present key facts.
  • Using backticks in the prose (e.g., `AGENT_BROWSER_PROVIDER=ios`) correctly marked up variable names as code, improving visual distinction.

The current format mixes concerns and makes the information less discoverable. I'd recommend reverting this specific change to use the bulleted list format again, even if it adds a few lines back. The improved clarity would be worth it.

References
  1. For important operational details, format them as bullet points for clarity and readability.

agent-browser device list
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
agent-browser -p ios snapshot -i | tap @e1 | swipe up/down/left/right [px]
@@ -148,12 +141,12 @@ agent-browser -p ios screenshot mobile.png | close
## Common Patterns

```bash
# Login flow
# Login flow — fill, submit, save auth state
agent-browser open https://app.example.com/login && agent-browser snapshot -i
agent-browser fill @e3 "user@example.com" && agent-browser fill @e4 "password"
agent-browser click @e5 && agent-browser wait --url "**/dashboard" && agent-browser state save auth.json

# Form submission
# Form submission — fill fields, select, check, submit
agent-browser open https://example.com/form && agent-browser snapshot -i
agent-browser fill @e1 "John Doe" && agent-browser fill @e2 "john@example.com"
agent-browser select @e3 "US" && agent-browser check @e4
@@ -162,9 +155,7 @@ agent-browser click @e5 && agent-browser wait --text "Success"
# Data extraction
agent-browser open https://example.com/products && agent-browser snapshot --json > products.json

# Multi-session parallel
# Multi-session parallel — load saved auth per session
agent-browser --session s1 open https://site-a.com && agent-browser --session s1 state load auth-a.json
agent-browser --session s2 open https://site-b.com && agent-browser --session s2 state load auth-b.json
```

**License**: Apache-2.0 | TypeScript (74%), Rust (22%)