Hacker News: gregpr07

New comment by gregpr07 in "How We Run Firecracker VMs Inside EC2 and Start Browsers in <1s"

gregpr07 — Tue, 16 Jun 2026 15:33:36 +0000

well that's how browser agents work in a nutshell lol

How We Run Firecracker VMs Inside EC2 and Start Browsers in <1s

gregpr07 — Tue, 16 Jun 2026 15:15:01 +0000

Article URL: https://browser-use.com/posts/firecracker-browser-infra

Comments URL: https://news.ycombinator.com/item?id=48556561

Points: 3

# Comments: 2

New comment by gregpr07 in "OpenWarp"

gregpr07 — Fri, 01 May 2026 08:52:21 +0000

Why not just use ghostty at that point?

Show HN: Browser Harness – Gives LLM freedom to complete any browser task

gregpr07 — Fri, 24 Apr 2026 14:31:38 +0000

Hey HN,

We got tired of browser frameworks restricting the LLM, so we removed the framework and gave the LLM maximum freedom to do whatever it was trained on. We also gave the harness the ability to self-correct and add new tools when the LLM wants to (that is, when it already knows how from pre-training).

Our Browser Use library is tens of thousands of lines of deterministic heuristics wrapping Chrome (CDP websocket). Element extractors, click helpers, target management (SUPER painful), watchdogs (crash handling, file downloads, alerts), cross-origin iframes (if you want to click on an element you have to switch the target first, very annoying), etc.

Watchdogs specifically are extremely painful but required. If Chrome triggers, for example, a native file popup, the agent is just completely stuck. So the two solutions are to:
1. code those heuristics and edge cases away one by one and prevent them
2. give the LLM a tool to handle the edge case

As you can imagine - there are crazy amounts of heuristics like this so you eventually end up with A LOT of tools if you try to go for #2. So you have to make compromises and just code those heuristics away.

BUT if the LLM just "knows" CDP well enough to switch targets when it encounters a cross-origin iframe, dismiss the alert when it appears, and write its own click helpers or upload function, you suddenly don't have to worry about any of those edge cases.
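To make "knowing CDP" concrete, here is a rough sketch of the kind of raw messages involved in the two cases above. `Target.attachToTarget` and `Page.handleJavaScriptDialog` are real CDP methods; the helper names (`cdp_command`, `attach_to_target`, `dismiss_dialog`) are made up for illustration and are not taken from the Browser Harness repo. The sketch only builds the JSON payloads, and actually sending them over the websocket is left out.

```python
import itertools
import json

# CDP requires every command to carry a unique integer id.
_ids = itertools.count(1)


def cdp_command(method, params=None, session_id=None):
    """Build one CDP command as a JSON string (hypothetical helper)."""
    msg = {"id": next(_ids), "method": method, "params": params or {}}
    if session_id is not None:
        # With flattened sessions, commands for an attached target
        # carry its sessionId at the top level of the message.
        msg["sessionId"] = session_id
    return json.dumps(msg)


def attach_to_target(target_id):
    """Attach to another target, e.g. a cross-origin iframe."""
    return cdp_command(
        "Target.attachToTarget", {"targetId": target_id, "flatten": True}
    )


def dismiss_dialog(session_id=None):
    """Dismiss an open JavaScript dialog (alert/confirm/prompt)."""
    return cdp_command(
        "Page.handleJavaScriptDialog", {"accept": False}, session_id
    )
```

An agent that already knows these methods can emit the same payloads on its own, which is the point the post is making.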

Turns out LLMs know CDP pretty well these days. So we bitter-pilled the harness. The concepts that should survive are:
- something that holds the CDP websocket and keeps it alive (daemon)
- extremely basic tools (helpers.py)
- skill.md that explains how to use it

The new paradigm? SKILL.md + a few python helpers that need to have the ability to change on the fly.
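One way helpers could "change on the fly" is to re-import the helpers file from disk after the agent edits it. The sketch below is an assumption about how that might look, not the repo's actual mechanism; `load_helpers` and `add_helper` are hypothetical names.

```python
import importlib.util


def load_helpers(path):
    """Import the helpers file fresh from disk.

    Each call executes the file's current contents, so functions
    added since the previous call become visible.
    """
    spec = importlib.util.spec_from_file_location("helpers", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def add_helper(path, source):
    """Append the source of a new function to the helpers file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write("\n\n" + source.strip() + "\n")
```

After `add_helper(...)`, a second `load_helpers(...)` returns a module object that includes the new function, so the harness never needs a restart.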

One cool example: We forgot to implement the upload_file function. Then, mid-task, the agent wanted to upload a file, so it grepped helpers.py, saw nothing, and wrote the function itself using raw DOM.setFileInputFiles (which we only noticed later in a git diff). This was a really magical moment that showed how powerful LLMs have become.
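For a sense of how small such a helper can be, here is a guess at what an `upload_file` built on `DOM.setFileInputFiles` might look like. This is not the function the agent actually wrote. `send` is a hypothetical callable that sends one CDP command and returns the result, and `node_id` is assumed to be the DOM node id of an `<input type="file">` element found earlier.

```python
from pathlib import Path


def upload_file(send, node_id, path):
    """Set the file on a file input via DOM.setFileInputFiles.

    send    -- hypothetical callable: send(method, params) -> result
    node_id -- DOM nodeId of the <input type="file"> element
    path    -- file to upload; resolved to an absolute path here
    """
    absolute = str(Path(path).resolve())
    return send(
        "DOM.setFileInputFiles",
        {"files": [absolute], "nodeId": node_id},
    )
```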

Compared to other approaches (Playwright MCP, browser use CLI, agent-browser, chrome devtools MCP): all of them wrap Chrome in a set of predefined functions for the LLM. The worst failure mode is silent. The LLM's click() returns fine so the LLM thinks it clicked, but on this particular site nothing actually happened. It moves on with a broken model of the world. Browser Harness gives the LLM maximum freedom and perfect context for HOW the tools actually work.
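The silent failure is easy to picture in code. The sketch below is only an illustration of the problem, not something the post says Browser Harness does: it compares an observation of the page before and after the click. `click` and `observe` are hypothetical callables, where `observe` might return the URL plus a hash of the DOM.

```python
def click_and_verify(click, observe):
    """Return True only if the observed page state changed after the click.

    click   -- hypothetical callable that performs the click
    observe -- hypothetical callable returning a comparable snapshot
    """
    before = observe()
    click()  # may "succeed" without the page reacting at all
    return observe() != before
```

A wrapper whose `click()` returns fine either way hides exactly this difference from the LLM.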

Here are a few crazy examples of what browser harness can do:
- plays stockfish https://x.com/shawn_pana/status/2046457374467379347
- sets a world record in tetris https://x.com/shawn_pana/status/2047120626994012442
- figures out how to draw a heart with js https://x.com/mamagnus00/status/2046486159992480198?s=20

You can super easily install it by telling claude code: `Set up https://github.com/browser-use/browser-harness for me.`

Repo: https://github.com/browser-use/browser-harness

What would you call this new paradigm? A dialect?

Comments URL: https://news.ycombinator.com/item?id=47890841

Points: 134

# Comments: 66

New comment by gregpr07 in "Prove you are a robot: CAPTCHAs for agents"

gregpr07 — Mon, 20 Apr 2026 03:25:35 +0000

It's only useful for distinguishing smart AI from deterministic scripts and humans (we don't want either). We are convincing OpenClaws to create API keys for free (we have a free tier specifically for those agents). So it's basically a marketing blog post, but for OpenClaws.

New comment by gregpr07 in "Prove you are a robot: CAPTCHAs for agents"

gregpr07 — Mon, 20 Apr 2026 03:21:25 +0000

Yeah, but the whole point is that it shouldn't be deterministic, i.e. you have to keep the "dumb" (non-AI) bots out as well (otherwise a malicious user can just create thousands of API keys).

New comment by gregpr07 in "Prove you are a robot: CAPTCHAs for agents"

gregpr07 — Mon, 20 Apr 2026 03:20:13 +0000

Yeah, we are aiming for all OpenClaw/Hermes Agent agents to sign up for free without human intervention, so you need some sort of proof-of-stake (or proof-of-compute) algorithm so that a simple deterministic algorithm can't just claim thousands of API keys. Most agents (at least in the current token-subsidised market) don't care about token consumption, so the stakes are very small for the user!
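As a rough picture of the proof-of-compute idea, here is a minimal hashcash-style sketch. The thread doesn't describe the actual challenge, so everything here is an assumption: SHA-256, a leading-zero-bits difficulty, and the function names. It only shows how a signup can be made costly to request and cheap to verify. A plain script can solve it too, so on its own it would not separate LLM agents from deterministic bots.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap server-side check: one hash per submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Client-side search: about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

Each extra bit of difficulty doubles the expected work per API key, while verification stays a single hash.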

Show HN: Self-healing browser harness via direct CDP

gregpr07 — Mon, 20 Apr 2026 01:05:25 +0000

Creator of browser-use library here.

At Browser Use we spend a lot of time thinking about the bitter lesson of agentic frameworks. A few days ago we ran an experiment that combines replacing Playwright with CDP and giving the LLM complete freedom (building with autoresearch from the ground up) to do whatever it knows from pretraining.

I think this is the simplest way to give an AI control of a real browser: raw CDP, and let the agent write its own tools. ~600 lines total.

Example: I forgot to implement upload_file(). Mid-task, the agent noticed, wrote the function, and uploaded the file. I found out when I read the git diff.

I haven't found a task that doesn't work yet - but extremely happy to be proven wrong. Give it a shot. Open to criticism on CDP, agent frameworks, browser use, or whatever.

Comments URL: https://news.ycombinator.com/item?id=47829234

Points: 3

# Comments: 1

Browser Agents That Learn

gregpr07 — Tue, 07 Apr 2026 02:20:22 +0000

Article URL: https://browser-use.com/posts/web-agents-that-actually-learn

Comments URL: https://news.ycombinator.com/item?id=47669995

Points: 3

# Comments: 0

Give Agents Maximum Freedom. The less you assume, the more it works

gregpr07 — Sat, 14 Mar 2026 08:43:00 +0000

Article URL: https://browser-use.com/posts/agent-freedom

Comments URL: https://news.ycombinator.com/item?id=47374619

Points: 1

# Comments: 0

New comment by gregpr07 in "Show HN: Open-source browser for AI agents"

gregpr07 — Wed, 11 Mar 2026 17:17:00 +0000

Love it! From first principles: this kinda answers the "do we really even need CDP" question I always have in my head while building browser use...

Building secure, scalable agent sandbox infrastructure

gregpr07 — Fri, 27 Feb 2026 15:03:12 +0000

Article URL: https://browser-use.com/posts/two-ways-to-sandbox-agents

Comments URL: https://news.ycombinator.com/item?id=47181316

Points: 79

# Comments: 17

We built scalable evaluation infrastructure for AI web agents

gregpr07 — Mon, 23 Feb 2026 15:45:59 +0000

Article URL: https://browser-use.com/posts/our-browser-agent-evaluation-system

Comments URL: https://news.ycombinator.com/item?id=47123845

Points: 1

# Comments: 0

New comment by gregpr07 in "Show HN: Sandboxing untrusted code using WebAssembly"

gregpr07 — Tue, 03 Feb 2026 15:56:31 +0000

Why go this route? Python is more powerful than JS mostly because of third-party packages like pandas, which are explicitly not supported (C bindings; is this possible to fix?)...

At that point it might just be easier to convince the model to write JS directly.

New comment by gregpr07 in "Show HN: Fence – Sandbox CLI commands with network/filesystem restrictions"

gregpr07 — Sun, 25 Jan 2026 19:57:33 +0000

Wow this is really cool

The Bitter Lesson of Agent Frameworks

gregpr07 — Sat, 17 Jan 2026 02:02:50 +0000

Article URL: https://browser-use.com/posts/bitter-lesson-agent-frameworks

Comments URL: https://news.ycombinator.com/item?id=46654581

Points: 2

# Comments: 0

New comment by gregpr07 in "Show HN: Webctl – Browser automation for agents based on CLI instead of MCP"

gregpr07 — Thu, 15 Jan 2026 02:04:12 +0000

Creator of Browser Use here. This is cool, a really innovative approach with ARIA roles. One idea we have been playing around with a lot is just giving the LLM raw HTML and a really good way to traverse it: no heuristics, just BS4. It seems to work well, but it's much more expensive than the current prod ready [index]

Show HN: Open-Source Browser Use LLM (30B, A3B)

gregpr07 — Fri, 19 Dec 2025 18:20:30 +0000

Our first model brings SoTA Browser Use capabilities in a small and fast model that can be hosted on a single GPU. It’s much much faster than anything out there and around 15x cheaper than Sonnet 4.5.

The wonderful name comes from: 30B parameters, 3B active, SoTA quality at real-time speed (and the model is based on Qwen/Qwen3-VL-30B-A3B-Instruct).

This model is heavily trained to be used with the browser-use OSS library and provides comprehensive browsing capabilities with superior DOM understanding and visual reasoning.

Comments URL: https://news.ycombinator.com/item?id=46329054

Points: 2

# Comments: 0

New comment by gregpr07 in "Speed Matters: How We Achieve the Fastest Web Agent"

gregpr07 — Thu, 09 Oct 2025 15:12:36 +0000

Hey HN,

We heard a lot of complaints about Browser Use being slow. Over the last few weeks we focused on improving speed while keeping the same accuracy.

Go try it out. It's really fun to see it glide across the web.

Speed Matters: How We Achieve the Fastest Web Agent

gregpr07 — Thu, 09 Oct 2025 15:12:36 +0000

Article URL: https://browser-use.com/posts/speed-matters

Comments URL: https://news.ycombinator.com/item?id=45528818

Points: 1

# Comments: 3