Merged
1 change: 1 addition & 0 deletions UPSTREAM.md
@@ -92,6 +92,7 @@ Each upstream has its own append-only table. Add a row every time you pull.
| 2026-04-29 | `2125cea` | `997ee45` | bcode | 6 upstream commits (PRs #241, #244, #245). `src/browser_harness/_ipc.py`: when `BH_TMP_DIR` is set, drop the `bu-<NAME>` filename prefix (caller-isolated dir means no shared-tmpdir disambiguation needed); without `BH_TMP_DIR` the original `bu-<NAME>` scheme is unchanged. `src/browser_harness/admin.py`: `_daemon_endpoint_names` short-circuits to the local NAME when `BH_TMP_DIR` is set (no glob); plus catch `SystemError` from `os.kill` on Windows during `restart_daemon`. `src/browser_harness/daemon.py`: discover DevToolsActivePort in Comet and Arc profiles on macOS. `tests/unit/test_admin.py`: 2 new tests for the `BH_TMP_DIR` discovery path. All in protected `src/browser_harness/*.py` + tests — taken verbatim. Smoke test + 12 admin unit tests pass. The `_ipc` filename change pairs with our recent per-session BH_TMP_DIR work (browsercode PR #22) — caller isolation now extends to filenames as well as the dir. Divergences touched: none. |
| 2026-04-30 | `997ee45` | `660827d` | bcode | 11 upstream commits (PRs #246, #247, #251, #254, #256, #260). `src/browser_harness/daemon.py`: resolve WS via `/json/version` to avoid stale `DevToolsActivePort` path (PR #260) + report `cdp_disconnected` on stale CDP probe in `connection_status` (PR #254) + cleanup remote browser when daemon startup fails (PR #251). `src/browser_harness/admin.py`: companion changes for the daemon fixes. `tests/unit/test_admin.py`: 7 new tests. New domain skills: `agent-workspace/domain-skills/xiaohongshu/scraping.md` (PR #246), and a top-level `domain-skills/shopify-admin/` tree (PR #247: README, embedded-apps, knowledge-base, polaris-inputs). Note: PR #247 added skills at the top-level `domain-skills/` path, not under `agent-workspace/domain-skills/` as the post-#229 layout would suggest — vendored verbatim to match upstream layout. Doc updates: README operator framing (PR #255), install.md heredoc → `-c` flag (PR #256), profile-sync.md same. All files outside divergences — taken verbatim. Smoke test + 19 admin unit tests pass. Divergences touched: none. |
| 2026-05-01 | `660827d` | `013097a` | bcode | 8 upstream commits (PRs #261, #265, #266). `src/browser_harness/daemon.py` (PR #265): split `DevToolsActivePort` into port + ws-path lines and fall back to `ws://127.0.0.1:<port><ws_path>` when `/json/version` returns 404 (Chrome 147+ disables `/json/*` HTTP discovery on the default user-data-dir). `src/browser_harness/run.py` (PR #266): when no daemon is alive, no local Chrome is listening on 9222/9223 (probed via `/json/version`, not bare TCP), and `BROWSER_USE_API_KEY` is set, auto-bootstrap a cloud daemon. `tests/unit/test_run.py`: 2 new tests for the cloud bootstrap path. PR #261 moved `domain-skills/shopify-admin/` → `agent-workspace/domain-skills/shopify-admin/` upstream — both paths are excluded from the vendored tree per §3, so this rename is a no-op for browsercode (`script/check-harness-diff.sh` filters both via `IGNORED_PATHS_REGEX`). All in protected `src/browser_harness/*.py` + tests — taken verbatim. Smoke test + 23 unit tests pass. Divergences touched: none. |
| 2026-05-03 | `013097a` | `59a166f` | bcode | 62 upstream commits. **Helper additions** (PRs #258, #279): `helpers.py` adds `fill_input` (raises on missing element, optional timeout for SPA rendering, dispatches select-all without char event so Cmd/Ctrl+A fires on macOS), `wait_for_element` (prefers `checkVisibility`, falls back to computed style), `wait_for_network_idle`. `tests/unit/test_helpers.py`: +253 lines covering the new helpers. `daemon.py`: discover Dia browser profile on macOS. **Windows IPC hardening** (PR #276): `_ipc.py` adds ping handshake, token auth, atomic port file. **Domain-skills opt-in** (PR #274): `helpers.py` gates auto-injected domain skills behind `BH_DOMAIN_SKILLS=1` (default off). Aligns upstream default with browsercode's exclusion policy — no behavior change for us, but the `BH_DOMAIN_SKILLS` env name is now the canonical knob if we ever decide to ship a curated set. **Cloud bootstrap opt-in** (PR #277): `run.py` makes cloud auto-bootstrap opt-in via `BU_AUTOSPAWN` instead of triggering on any `BROWSER_USE_API_KEY` presence. Plus admin tweaks (`tests/unit/test_admin.py` +10 lines), doc canonicalization (`README.md`, `SKILL.md`, `install.md`, `interaction-skills/profile-sync.md` PR #280), and new top-level scaffolding: `AGENTS.md` (repo orientation for coding agents), `.github/ISSUE_TEMPLATE/{bug-report,feature-request,config}.yml`, `.github/VOUCHED.td`, `docs/allow-remote-debugging.png`. All non-excluded paths taken verbatim. **Excluded paths** (per §3): 14 new domain-skills directories added upstream (aa, alaska, articulate-rise, bigbang-hr, bilibili, BOSS-zhipin, claude-ai, ctrip, flipkart, ly-com, manus, perplexity, wehotel, plus amazon under top-level `domain-skills/`) — skipped. **Divergence update**: `.gitignore` now also includes upstream's new `.idea/` and `.claude/` entries while preserving our `.venv/`. Smoke test (imports + `--version`) clean. Divergences touched: `.gitignore` (extended, same intent). |

---
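
The 2026-05-01 row describes splitting `DevToolsActivePort` into a port line and a ws-path line, then falling back to `ws://127.0.0.1:<port><ws_path>` when `/json/version` returns 404. A minimal sketch of that fallback follows, assuming the file holds the port on its first line and the WebSocket path on its second. The function names are hypothetical and are not taken from `daemon.py`.

```python
def parse_devtools_active_port(text: str) -> tuple[int, str]:
    """Split DevToolsActivePort contents into (port, ws_path).

    Assumption: line 1 is the port, line 2 (if present) is the ws path.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("DevToolsActivePort is empty")
    port = int(lines[0])
    ws_path = lines[1] if len(lines) > 1 else ""
    return port, ws_path


def fallback_ws_url(text: str, host: str = "127.0.0.1") -> str:
    """Build the WebSocket URL used when HTTP discovery is unavailable."""
    port, ws_path = parse_devtools_active_port(text)
    return f"ws://{host}:{port}{ws_path}"


print(fallback_ws_url("9222\n/devtools/browser/abc123\n"))
# prints ws://127.0.0.1:9222/devtools/browser/abc123
```

The sketch only builds the URL; deciding when to use it (for example after a 404 from `/json/version`) is left to the caller.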

@@ -0,0 +1,49 @@
name: Bug report
description: Report a reproducible bug in browser-harness.
labels: [bug]
body:
  - type: checkboxes
    id: preflight
    attributes:
      label: Before submitting
      options:
        - label: I searched existing issues for duplicates.
          required: true
        - label: I ran `browser-harness --doctor` and read the output.
          required: true
        - label: I read the troubleshooting section of `install.md`.
          required: true
        - label: This is a reproducible bug in browser-harness — not a question, feature request, or `cloud.browser-use.com` issue.
          required: true

  - type: textarea
    id: summary
    attributes:
      label: Summary
      description: What's broken, in one or two sentences.
    validations:
      required: true

  - type: textarea
    id: repro
    attributes:
      label: Repro
      description: Numbered steps. Include the exact command and the output you saw.
      placeholder: |
        1. Chrome 147 on default profile, remote debugging on
        2. browser-harness -c 'print(page_info())'
        3. RuntimeError: DevTools is not live yet on 127.0.0.1:9222
    validations:
      required: true

  - type: textarea
    id: environment
    attributes:
      label: Environment
      placeholder: |
        OS:
        Chrome version:
        browser-harness --version:
        browser-harness --doctor output:
    validations:
      required: true
@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
  - name: Question or how-to
    url: https://github.com/browser-use/browser-harness/discussions/categories/q-a
    about: Ask in Discussions Q&A, not Issues.
  - name: Install or setup troubleshooting
    url: https://github.com/browser-use/browser-harness/blob/main/install.md
    about: Most install and "DevTools not live" errors are covered here.
@@ -0,0 +1,37 @@
name: Feature request
description: Propose a new feature or change.
labels: [feature-request]
body:
  - type: checkboxes
    id: preflight
    attributes:
      label: Before submitting
      options:
        - label: I searched existing issues and discussions.
          required: true
        - label: This is a feature request, not a bug.
          required: true

  - type: textarea
    id: problem
    attributes:
      label: Problem
      description: What user pain or limitation motivates this?
    validations:
      required: true

  - type: textarea
    id: proposal
    attributes:
      label: Proposal
      description: What you'd like to happen.
    validations:
      required: true

  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives considered
      description: What else you tried, or why other approaches fall short.
    validations:
      required: true
13 changes: 13 additions & 0 deletions packages/bcode-browser/harness/.github/VOUCHED.td
@@ -0,0 +1,13 @@
# Vouched (or denounced) users for browser-harness.
#
# See https://github.com/mitchellh/vouch for details.
#
# Syntax:
# - One handle per line (without @), sorted alphabetically.
# - Optional platform prefix: platform:username (e.g., github:user).
# - Denounce by prefixing with minus: -username
# - Optional reason after a space following the handle.

molesza
rohitdutt108
shaunandrewjackson1977
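
The syntax comment at the top of `VOUCHED.td` can be read mechanically. The sketch below is one possible parser written only from those comment lines; it is not the vouch tool's own implementation. It assumes that a denounce minus comes before any platform prefix (`-github:user`), which the file does not spell out, and `VouchEntry` and `parse_vouched` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VouchEntry:
    handle: str
    platform: Optional[str]  # None when no "platform:" prefix is given
    denounced: bool
    reason: str


def parse_vouched(text: str) -> list[VouchEntry]:
    """Parse VOUCHED.td-style text: one handle per line, '#' comments skipped."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Optional reason follows the first space after the handle.
        token, _, reason = line.partition(" ")
        # Assumption: the minus sign precedes the platform prefix.
        denounced = token.startswith("-")
        if denounced:
            token = token[1:]
        platform, sep, handle = token.partition(":")
        if not sep:
            platform, handle = None, token
        entries.append(VouchEntry(handle, platform, denounced, reason.strip()))
    return entries
```

Calling `parse_vouched` on the three handles above would yield three vouched entries with no platform and an empty reason.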
2 changes: 2 additions & 0 deletions packages/bcode-browser/harness/.gitignore
@@ -5,3 +5,5 @@ __pycache__/
.venv/
uv.lock
*.egg-info/
.idea/
.claude/
24 changes: 24 additions & 0 deletions packages/bcode-browser/harness/AGENTS.md
@@ -0,0 +1,24 @@
browser-harness is a thin layer that connects agents to browsers via an editable CDP harness.

# Code priorities
- Clarity
- Precision
- Low verbosity
- Versatility

# Overview
Core code lives in `src/browser_harness/`:
- `admin.py` — daemon lifecycle, diagnostics, updates, profile management
- `daemon.py` — the long-lived middleman process between the browser and the agent
- `helpers.py` — CDP wrapper and core browser primitives auto-imported into `-c` scripts
- `run.py` — the `browser-harness` CLI

`SKILL.md` tells agents how to use the harness and CLI.
`install.md` tells agents how to install it, attach a browser, and troubleshoot.

An agent operating the harness only edits inside `agent-workspace/`:
- `agent_helpers.py` — task-specific browser helpers the agent adds
- `domain-skills/` — skills the agent writes and reads

# Contributing
Consider what is really needed. Prefer the smallest diff that fixes the bug.
20 changes: 14 additions & 6 deletions packages/bcode-browser/harness/README.md
@@ -25,24 +25,28 @@ Paste into Claude Code or Codex:
```text
Set up https://github.com/browser-use/browser-harness for me.

Read `install.md` first to install and connect this repo to my real browser. Then read `SKILL.md` for normal usage. Use `agent-workspace/agent_helpers.py` and `agent-workspace/domain-skills/` for task-specific edits. When you open a setup or verification tab, activate it so I can see the active browser tab. After it is installed, open this repository in my browser and, if I am logged in to GitHub, ask me whether you should star it for me as a quick demo that the interaction works — only click the star if I say yes. If I am not logged in, just go to browser-use.com.
Read `install.md` and follow the steps to install browser-harness and connect it to my browser.
```

When this page appears, tick the checkbox so the agent can connect to your browser:
The agent will open `chrome://inspect/#remote-debugging`. Tick the checkbox so the agent can connect to your browser:

<img src="docs/setup-remote-debugging.png" alt="Remote debugging setup" width="520" style="border-radius: 12px;" />

Click Allow when the per-attach popup appears (Chrome 144+):

<img src="docs/allow-remote-debugging.png" alt="Allow remote debugging popup" width="520" style="border-radius: 12px;" />

See [agent-workspace/domain-skills/](agent-workspace/domain-skills/) for example tasks.

## Free remote browsers
## Free Browser Use Cloud browsers

Useful for stealth, sub-agents, or deployment.<br>
**Free tier: 3 concurrent browsers, proxies, captcha solving, and more. No card required.**
Stealth, sub-agents, or headless deployment.<br>
**Browser Use Cloud free tier: 3 concurrent browsers, proxies, captcha solving, and more. No card required.**

- Grab a key at [cloud.browser-use.com/new-api-key](https://cloud.browser-use.com/new-api-key)
- Or let the agent sign up itself via [docs.browser-use.com/llms.txt](https://docs.browser-use.com/llms.txt) (setup flow + challenge context included).

## How simple is it? (~592 lines of Python)
## Architecture (~1k lines across 4 core files)

- `install.md` — first-time install and browser bootstrap
- `SKILL.md` — day-to-day usage
@@ -61,6 +65,10 @@ PRs and improvements welcome. The best way to help: **contribute a new domain sk

If you're not sure where to start, open an issue and we'll point you somewhere useful.

## Domain skills

Set `BH_DOMAIN_SKILLS=1` to enable [agent-workspace/domain-skills/](agent-workspace/domain-skills/) — community-contributed per-site playbooks `goto_url` surfaces by domain. Contribute via PR.

---

[The Bitter Lesson of Agent Harnesses](https://browser-use.com/posts/bitter-lesson-agent-harnesses) · [Web Agents That Actually Learn](https://browser-use.com/posts/web-agents-that-actually-learn)
65 changes: 12 additions & 53 deletions packages/bcode-browser/harness/SKILL.md
@@ -5,7 +5,9 @@ description: Direct browser control via CDP. Use when the user wants to automate

# browser-harness

Direct browser control via CDP. For task-specific edits, use `agent-workspace/agent_helpers.py` and `agent-workspace/domain-skills/`. For setup, install, or connection problems, read install.md.
Direct browser control via CDP. For task-specific edits, use `agent-workspace/agent_helpers.py`. For setup, install, or connection problems, read install.md.

Domain skills (community-contributed per-site playbooks under `agent-workspace/domain-skills/`) are off by default. Set `BH_DOMAIN_SKILLS=1` to enable them; see the bottom section.

## Usage

@@ -20,13 +22,6 @@ print(page_info())
- Invoke as browser-harness — it's on $PATH. No cd, no uv run.
- First navigation is new_tab(url), not goto_url(url) — goto runs in the user's active tab and clobbers their work.

Available interaction skills:
- interaction-skills/connection.md — startup sequence, tab visibility, omnibox popup fix

Available domain skills:
- tiktok/upload.md
- polymarket/scraping.md

## Tool call shape

```bash
@@ -60,11 +55,10 @@ start_remote_daemon prints liveUrl and auto-opens it in the local browser (if a

Profiles (cookies-only login state) live in interaction-skills/profile-sync.md — covers list_cloud_profiles(), the chat-driven "which profile?" pattern, and sync_local_profile() for uploading a local Chrome profile.

## Search first
## Interaction skills

Search `agent-workspace/domain-skills/` first for the domain you are working on before inventing a new approach.

Only if you start struggling with a specific mechanic while navigating, look in interaction-skills/ for helpers. The available interaction skills are:
If you start struggling with a specific mechanic while navigating, look in interaction-skills/ for helpers. They cover reusable UI mechanics like dialogs, tabs, dropdowns, iframes, and uploads. The available interaction skills are:
- connection.md
- cookies.md
- cross-origin-iframes.md
- dialogs.md
@@ -82,44 +76,6 @@ Only if you start struggling with a specific mechanic while navigating, look in
- uploads.md
- viewport.md

Useful commands:

```bash
rg --files agent-workspace/domain-skills
rg -n "tiktok|upload" agent-workspace/domain-skills
```

## Always contribute back

If you learned anything non-obvious about how a site works, open a PR to `agent-workspace/domain-skills/<site>/` before you finish. Default to contributing. The harness gets better only because agents file what they learn. If figuring something out cost you a few steps, the next run should not pay the same tax.

Examples of what's worth a PR:

- A private API the page calls (XHR/fetch endpoint, request shape, auth) — often 10× faster than DOM scraping.
- A stable selector that beats the obvious one, or an obfuscated CSS-module class to avoid.
- A framework quirk — "the dropdown is a React combobox that only commits on Escape", "this Vue list only renders rows inside its own scroll container, so scrollIntoView on the row doesn't work — you have to scroll the container".
- A URL pattern — direct route, required query params (?lang=en, ?th=1), a variant that skips a loader.
- A wait that wait_for_load() misses, with the reason.
- A trap — stale drafts, legacy IDs that now return null, unicode quirks, beforeunload dialogs, CAPTCHA surfaces.

### What a domain skill should capture

The *durable* shape of the site — the map, not the diary. Focus on what the next agent on this site needs to know before it starts:

- URL patterns and query params.
- Private APIs and their payload shape.
- Stable selectors (data-*, aria-*, role, semantic classes).
- Site structure — containers, items per page, framework, where state lives.
- Framework/interaction quirks unique to this site.
- Waits and the reasons they're needed.
- Traps and the selectors that *don't* work.

### Do not write

- Raw pixel coordinates. They break on viewport, zoom, and layout changes. Describe how to *locate* the target (selector, scrollIntoView, aria-label, visible text) — never where it happened to be on your screen.
- Run narration or step-by-step of the specific task you just did.
- Secrets, cookies, session tokens, user-specific state. `agent-workspace/domain-skills/` is shared and public.

## What actually works

- Screenshots first: use capture_screenshot() to understand the current page quickly, find visible targets, and decide whether you need a click, a selector, or more navigation.
@@ -155,7 +111,10 @@ The *durable* shape of the site — the map, not the diary. Focus on what the ne
- Prefer compositor-level actions over framework hacks. Try screenshots, coordinate clicks, and raw key input before adding DOM-specific workarounds.
- If you need framework-specific DOM tricks, check interaction-skills/ first. That is where dropdown, dialog, iframe, shadow DOM, and form-specific guidance belongs.

## Interaction notes
## Domain skills (opt-in)

Only applies when `BH_DOMAIN_SKILLS=1`. Otherwise ignore — `agent-workspace/domain-skills/` is dormant and `goto_url` won't surface skill files.

When enabled, search `agent-workspace/domain-skills/<host>/` before inventing an approach. `goto_url` returns up to 10 skill filenames for the navigated host.

- interaction-skills/ holds reusable UI mechanics such as dialogs, tabs, dropdowns, iframes, and uploads.
- `agent-workspace/domain-skills/` holds site-specific workflows and should be updated when you discover reusable patterns for a website.
If you learn anything non-obvious — a private API, stable selector, framework quirk, URL pattern, hidden wait, or site-specific trap — open a PR to `agent-workspace/domain-skills/<site>/`. Capture the durable shape of the site (the map, not the diary). Don't write pixel coordinates (break on layout), task narration, or secrets — the directory is public.
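
The opt-in behavior described in this section (nothing surfaced unless `BH_DOMAIN_SKILLS=1`, and at most 10 skill filenames per host) can be sketched as follows. This is an illustration and not the code in `helpers.py`: `domain_skill_files` is a hypothetical name, and it assumes each host maps to a directory with exactly that name and that filenames come back in sorted order.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

MAX_SKILLS = 10  # the section says goto_url returns up to 10 skill filenames


def domain_skill_files(
    host: str,
    root: Path,
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return up to MAX_SKILLS filenames under root/<host>, or [] when disabled."""
    env = os.environ if env is None else env
    # Domain skills stay dormant unless the opt-in flag is exactly "1".
    if env.get("BH_DOMAIN_SKILLS") != "1":
        return []
    host_dir = root / host  # assumption: directory is named after the host
    if not host_dir.is_dir():
        return []
    names = sorted(p.name for p in host_dir.iterdir() if p.is_file())
    return names[:MAX_SKILLS]
```

Passing `env` explicitly keeps the gate easy to test without touching the process environment.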