A coding agent skill for browser automation using Go and the Chrome DevTools Protocol. And yes, this was coded by an AI. Inspired by this blog post.
- Start Browser: Launch Chrome with remote debugging, selectable via --browser or BROWSER_TOOLS_BROWSER (chrome-stable, chrome-beta, chrome-dev, chrome-canary)
- Navigate: Open URLs in the active or a new tab
- Execute JavaScript: Run inline code, files, or STDIN input
- Element Picker: Interactive DOM element selection
- Mouse Actions: Click, double-click, hover, right-click, and drag
- Fill Text Fields: Fill input and textarea elements
- Check Elements: Check/uncheck checkboxes and select radio buttons
- Press Key: Simulate key presses (Enter, Escape, Tab, etc.)
- Upload Files: Set files on file inputs (works on hidden inputs)
- Download Files: Click download links/buttons and save files
- Select Dropdown: Choose options by value, label, or index
- Console Logs: Capture browser console messages and errors
- Network Monitor: Track HTTP requests with filtering and body inspection
- HTML Extraction: Get page HTML with optional regex filtering
- Screenshots: Viewport or full-page screenshots saved to /tmp
- Cookie Management: List and clear cookies per tab or all origins
- DOM Storage: Inspect and clear localStorage / sessionStorage
- Clear Browser Data: Wipe cache, cookies, IndexedDB, service workers, and more
- Tab Management: List (with active tab indicator), activate, close, and refresh tabs
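To give a feel for what a feature like Tab Management involves, here is a minimal sketch of listing open tabs through the HTTP endpoint that Chrome exposes when remote debugging is enabled. It is not taken from this project's source. The names target, parseTargets, pagesOnly, and listTabs are hypothetical, and port 9222 is an assumption, since this README does not say which port the tool uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// target mirrors a subset of the fields Chrome returns from /json/list.
type target struct {
	ID    string `json:"id"`
	Type  string `json:"type"`
	Title string `json:"title"`
	URL   string `json:"url"`
}

// parseTargets decodes the JSON array returned by the /json/list endpoint.
func parseTargets(data []byte) ([]target, error) {
	var ts []target
	if err := json.Unmarshal(data, &ts); err != nil {
		return nil, err
	}
	return ts, nil
}

// pagesOnly keeps targets of type "page", which is how Chrome reports tabs.
func pagesOnly(ts []target) []target {
	var pages []target
	for _, t := range ts {
		if t.Type == "page" {
			pages = append(pages, t)
		}
	}
	return pages
}

// listTabs fetches all debugging targets from baseURL and returns the tabs.
func listTabs(baseURL string) ([]target, error) {
	resp, err := http.Get(baseURL + "/json/list")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	all, err := parseTargets(body)
	if err != nil {
		return nil, err
	}
	return pagesOnly(all), nil
}

func main() {
	// Assumption: Chrome is running with --remote-debugging-port=9222.
	tabs, err := listTabs("http://127.0.0.1:9222")
	if err != nil {
		fmt.Fprintln(os.Stderr, "list tabs:", err)
		return
	}
	for _, t := range tabs {
		fmt.Printf("%s\t%s\t%s\n", t.ID, t.Title, t.URL)
	}
}
```

The sketch splits parsing and filtering from the network call so each part can be checked without a running browser. The actual tool may talk to Chrome differently, for example over the WebSocket side of the protocol.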
Build the binary:
go build -o scripts/browser-tools .
Copy the whole directory to your skills directory, or download a prebuilt binary from releases.
Then mention "use the browser-tools skill" and the agent will invoke it automatically.
- SKILL.md — Quick reference for agents
- REFERENCE.md — Full command documentation
- Go 1.21+
- Chrome/Chromium browser
Remote debugging is not allowed in the main Chrome profile, so a separate one is created and reused per variant:
~/.cache/claude-browser-tools/<variant>/
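The per-variant profile layout can be sketched in Go as follows. This is an illustration under stated assumptions, not the project's actual code: profileDir and launchArgs are hypothetical helper names and port 9222 is assumed. The flags --remote-debugging-port and --user-data-dir are standard Chrome command-line switches.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// profileDir returns the dedicated profile directory for a browser variant,
// following the ~/.cache/claude-browser-tools/<variant>/ layout.
func profileDir(home, variant string) string {
	return filepath.Join(home, ".cache", "claude-browser-tools", variant)
}

// launchArgs builds the Chrome flags that enable remote debugging
// against a separate profile directory.
func launchArgs(port int, dir string) []string {
	return []string{
		"--remote-debugging-port=" + strconv.Itoa(port),
		"--user-data-dir=" + dir,
	}
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "home dir:", err)
		return
	}
	// Assumption: variant "chrome-stable" and port 9222 are illustrative.
	dir := profileDir(home, "chrome-stable")
	fmt.Println(launchArgs(9222, dir))
}
```

Because the directory depends only on the variant name, repeated launches of the same variant resolve to the same profile, which is what lets it be reused.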