Vercel's agent-browser: Why I Prefer It Over Playwright MCP

I’ve been testing agent-browser from Vercel Labs, and it has replaced Playwright MCP in my workflow. Here’s why, and how they built it.

What it does

agent-browser is a headless browser automation CLI designed specifically for AI agents. You install it globally (npm i -g agent-browser) and any AI coding assistant can drive a real Chrome browser through simple shell commands:

agent-browser open localhost:8080
agent-browser snapshot -i --json    # get accessibility tree with refs
agent-browser click @e2             # click element by ref
agent-browser fill @e3 "test@example.com"
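
The same four commands can be chained into one small function. This is a minimal sketch rather than an official recipe: it mirrors the order listed above, and it assumes `@e2` and `@e3` are the refs your own snapshot reports, so check the snapshot output before relying on them. The `AGENT_BROWSER` variable and the `open_click_fill` name are hypothetical, added only so the binary can be swapped out.

```sh
#!/bin/sh
# Sketch: chain the snapshot -> ref workflow into one function.
# AGENT_BROWSER and open_click_fill are illustrative names, not part of agent-browser.
AGENT_BROWSER="${AGENT_BROWSER:-agent-browser}"

open_click_fill() {
  url="$1"
  email="$2"
  # Stop at the first failing step so later commands don't act on the wrong page.
  "$AGENT_BROWSER" open "$url" || return 1
  # Print the accessibility tree with refs; confirm @e2/@e3 against this output.
  "$AGENT_BROWSER" snapshot -i --json || return 1
  "$AGENT_BROWSER" click @e2 || return 1
  "$AGENT_BROWSER" fill @e3 "$email" || return 1
}

# Usage (refs depend on your page):
#   open_click_fill localhost:8080 "test@example.com"
```

Returning early on failure matters here because refs are only meaningful for the page that produced the snapshot.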

I asked an AI agent to inspect the submenu on my local site: it navigated, took a snapshot, and reported back the dropdown contents.

Why I prefer it over Playwright MCP

	agent-browser	Playwright MCP
Runtime	Pure Rust daemon, direct CDP; no Node.js in the daemon	Requires Node.js + Playwright runtime
Interface	Simple CLI commands any LLM can call	MCP protocol — needs a compatible client
Element selection	Snapshot → ref workflow (`@e1`, `@e2`) — deterministic	CSS selectors or accessibility tree
Speed	Daemon persists between commands, sub-millisecond dispatch	Cold start on each MCP tool call
Portability	Works with any AI assistant that can run shell commands	Only works with MCP-compatible clients

The killer advantage: any agent that can run a shell command can use it. No special protocol, no SDK, no integration layer. Claude Code, Cursor, Windsurf, Gemini CLI, GitHub Copilot — they all work out of the box.

How they built it: The “Skills” pattern

The most interesting architectural decision is the skills system. Instead of building an MCP server or a custom integration for every AI tool, they created a single SKILL.md file that teaches any AI assistant how to use agent-browser.

npx skills add vercel-labs/agent-browser

This command drops a SKILL.md into your project (e.g., .claude/skills/agent-browser/SKILL.md for Claude Code). The skill file contains:

The snapshot → ref interaction pattern (the core workflow)
All available commands and their syntax
Session management and timeout handling
Best practices for chaining commands

The AI reads this file as context and learns how to drive the browser. No custom code, no API wrappers — just a well-written instruction document that gets injected into the model’s context.

This is a brilliant pattern: documentation as integration. The skill stays up to date because it’s fetched from the repo, not copy-pasted.
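
To confirm the skill landed where the assistant will look, a small check like this can help. It assumes the Claude Code path mentioned above, and other assistants may use a different directory. `skill_installed` is a hypothetical helper name.

```sh
# Sketch: check that a non-empty skill file exists under a project directory.
# The relative path is the Claude Code example from the text; adjust for other tools.
skill_installed() {
  project_dir="${1:-.}"
  [ -s "$project_dir/.claude/skills/agent-browser/SKILL.md" ]
}

# Usage:
#   skill_installed . || npx skills add vercel-labs/agent-browser
```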

Architecture

The project is pure Rust with a client-daemon design:

┌─────────────┐      ┌──────────────┐      ┌─────────┐
│  Rust CLI   │ ───► │ Rust Daemon  │ ───► │ Chrome  │
│ (commands)  │      │ (pure CDP)   │      │  (CDP)  │
└─────────────┘      └──────────────┘      └─────────┘

Rust CLI — Parses commands, sends them to the daemon
Rust Daemon — Starts automatically on first command, persists between commands, talks to Chrome via CDP (Chrome DevTools Protocol)
Chrome — Downloaded from Chrome for Testing (Google’s official automation channel)

No Playwright. No Puppeteer. No Node.js in the daemon at all. The daemon auto-starts and stays alive, so subsequent commands are near-instant.
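
One way to see the persistent daemon at work is to time two consecutive commands. The helper below is a rough sketch, not a benchmark: `elapsed_ms` is a hypothetical name, it assumes GNU `date` (the `%N` format is not available in BSD/macOS `date`), and the figure includes the cost of spawning the CLI process itself.

```sh
# Sketch: run a command and print how long it took in milliseconds.
# Assumes GNU date; %N (nanoseconds) is not supported by BSD/macOS date.
elapsed_ms() {
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
  # Pass the command's own exit status through to the caller.
  return "$status"
}

# Usage: the first call may include daemon startup, the second should not.
#   elapsed_ms agent-browser open localhost:8080
#   elapsed_ms agent-browser snapshot -i
```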

Security features worth noting

For production agent deployments:

Authentication Vault — Encrypted credential storage; the LLM never sees passwords
Domain Allowlist — Restrict navigation to trusted domains only
Action Policy — Gate destructive actions with a policy file
Content Boundaries — Wrap output in delimiters so LLMs distinguish tool output from untrusted page content
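
The content-boundary idea is easy to illustrate in isolation. The sketch below is not agent-browser's actual output format. The delimiter strings and the `new_nonce` and `wrap_untrusted` helpers are made up for this example. It tags both delimiters with a random nonce so that text on the page cannot forge the closing marker.

```sh
# Sketch of the content-boundary idea: wrap untrusted text from stdin in
# delimiters tagged with a random nonce. The delimiter format is invented here.
new_nonce() {
  # 8 random bytes rendered as 16 hex characters.
  od -An -N8 -tx1 /dev/urandom | tr -d ' \n'
}

wrap_untrusted() {
  nonce="$1"
  printf '<<<UNTRUSTED_PAGE_CONTENT %s>>>\n' "$nonce"
  cat
  printf '\n<<<END_UNTRUSTED_PAGE_CONTENT %s>>>\n' "$nonce"
}

# Usage:
#   printf 'text scraped from a page' | wrap_untrusted "$(new_nonce)"
```

The model's instructions can then say that anything between the two markers is data, never instructions.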

Bottom line

If you’re building or testing with AI agents and need browser automation, try agent-browser before reaching for Playwright MCP. The snapshot-ref workflow is cleaner, the persistent Rust daemon keeps repeat commands near-instant, and the skills pattern reduces integration to a single command.

npm i -g agent-browser && agent-browser install