Hacker News: rfw300

New comment by rfw300 in "Claude Fable is relentlessly proactive"

rfw300 — Fri, 12 Jun 2026 05:07:57 +0000

> if a malicious actor can weaponize an agent to do their bidding

In my experience, human employees are much more vulnerable to this particular weakness than frontier agents (i.e. phishing attacks).

New comment by rfw300 in "Harness engineering: Leveraging Codex in an agent-first world"

rfw300 — Sun, 07 Jun 2026 00:44:35 +0000

I understand that they’ve written zero lines of code for this application, but would it kill them to write a few lines of the blog post by hand?

Forcing readers to wade through an unceasing string of LLM clichés demonstrates the opposite of the point you’re trying to make—that the consumers of your work are worse off because you exercised no human judgment in creating it.

New comment by rfw300 in "AI outperforms law professors in Stanford Law study"

rfw300 — Wed, 03 Jun 2026 01:11:45 +0000

A law professor studying AI has an affiliation with the center at their university that studies applications of AI? Scandalous!

New comment by rfw300 in "Every Law a Commit – US Law in GitHub"

rfw300 — Fri, 03 Apr 2026 02:24:30 +0000

A chapeau is not "just like another title basically". It's a lead-in, a phrase which acts as the grammatical start of a sentence which the following subsections finish. For instance, the text in the first paragraph of 18 U.S.C. § 3632(a) which ends in an em-dash is a chapeau. Taking pride in work you have not done and have not bothered to understand is perplexing.

New comment by rfw300 in "Every Law a Commit – US Law in GitHub"

rfw300 — Fri, 03 Apr 2026 01:25:43 +0000

The author (author's operator?) does not understand the data they are working with. And in doing so, they inadvertently make the case against their own "dark factory" nonsense.

For one, nothing about this project makes "every law" a commit. It just takes the _annual_ snapshots published by the House clerk and diffs chunks of those files against each other. A project which actually traced the edits in each annual snapshot to a specific passed bill would be incredibly cool (and is probably tractable now for the first time with current AI agents). This is not that!

All this does, as far as I can tell, is parse a set of well-structured XML files into chunks and commit those chunks to Git. It's not literally nothing, but it's something that the author's own README credits multiple people doing years ago with ~100 line Python scripts.

I don't mean to be overly harsh. But this is exactly the problem with treating your software as a "factory": you release something you do not understand, in a domain you did not care to learn. And we are all the poorer for it.

New comment by rfw300 in "Epoch confirms GPT5.4 Pro solved a frontier math open problem"

rfw300 — Wed, 25 Mar 2026 01:25:54 +0000

What is a "truly new task"? Does there exist such a thing? What's an example of one?

Everything we do builds on top of what's already been done. When I write a new program, I'm composing a bunch of heuristics and tricks I've learned from previous programs. When a mathematician approaches an open problem, they use the tactics they've developed from their experience. When Newton derived the laws of physics, he stood on the shoulders of giants. Sure, some approaches are more or less novel, but it's a difference in degree, not kind. There's no magical firebreak separating what AI is doing or will do from the things the most talented humans do.

New comment by rfw300 in "Fast regex search: indexing text for agent tools"

rfw300 — Mon, 23 Mar 2026 21:59:31 +0000

I don't understand why their "Instant Grep + roundtrip to us-east-1" is so slow. First of all, the round-trip latency should not be nearly so bad to us-east-1. But second, and much more importantly, the LLM runs in the cloud. Shouldn't you just situate the LLM, agent runtime, and regex index in the same region? Wouldn't that be faster than round-tripping to the user's local machine?

New comment by rfw300 in "90% of crypto's Illinois primary spending failed to achieve its objective"

rfw300 — Fri, 20 Mar 2026 17:44:00 +0000

On those terms, they also wasted a lot of cash. 90% of it went to candidates who lost (or opposing candidates who won).

New comment by rfw300 in "Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster"

rfw300 — Thu, 19 Mar 2026 23:28:15 +0000

In fact, looking at the blog post, the agent orchestrating 16 GPUs is half as efficient in GPU-time as the agent using 1 GPU, since it uses 16 GPUs to reach the same result as 1 GPU in 1/8 of the time.
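
To spell out that arithmetic, here is a rough sketch. It normalizes the single-GPU run to one unit of wall-clock time, which is an assumption made for illustration; the helper `gpu_time` is hypothetical and the only inputs taken from the comment are the 16 GPUs and the 1/8 wall-clock figure.

```python
def gpu_time(num_gpus: int, wall_clock: float) -> float:
    """Total GPU-time consumed: number of GPUs times wall-clock duration."""
    return num_gpus * wall_clock


# Assumption: normalize the 1-GPU run to 1.0 unit of wall-clock time.
single = gpu_time(num_gpus=1, wall_clock=1.0)

# The 16-GPU run reaches the same result in 1/8 of that wall-clock time.
cluster = gpu_time(num_gpus=16, wall_clock=1.0 / 8)

# Efficiency of the cluster run relative to the single-GPU run, in GPU-time.
relative_efficiency = single / cluster

print(f"1 GPU:   {single} GPU-time units")
print(f"16 GPUs: {cluster} GPU-time units")
print(f"relative efficiency: {relative_efficiency}")
```

Under that normalization the cluster run spends 2 units of GPU-time against 1 for the single GPU, which is where "half as efficient" comes from.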

New comment by rfw300 in "Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster"

rfw300 — Thu, 19 Mar 2026 23:24:09 +0000

Yeah, assuming there's no active monitoring during the training runs, you can trivially give the agent an abstraction which turns "1 GPU" into "16 GPUs" that just so happens to take 16x the wall-clock time to run.

New comment by rfw300 in "Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster"

rfw300 — Thu, 19 Mar 2026 23:15:48 +0000

Do you have a sense of whether these validation loss improvements are leading to generalized performance uplifts? From afar I can't tell whether these are broadly useful new ideas or just industrialized overfitting on a particular (model, dataset, hardware) tuple.

New comment by rfw300 in "Speed at the cost of quality: Study of use of Cursor AI in open source projects (2025)"

rfw300 — Mon, 16 Mar 2026 17:46:11 +0000

Super interesting study. One curious thing I've noticed is that coding agents tend to increase the code complexity of a project, but simultaneously massively reduce the cost of that code complexity.

If a module becomes unsustainably complex, I can ask Claude questions about it, have it write tests and scripts that empirically demonstrate the code's behavior, and, if worse comes to worst, rip out that code entirely and replace it with something better in a fraction of the time it used to take.

That's not to say complexity isn't bad anymore—the paper's findings on diminishing returns on velocity seem well-grounded and plausible. But while the newest (post-Nov. 2025) models often make inadvisable design decisions, they rarely do things that are outright wrong or hallucinated anymore. That makes them much more useful for cleaning up old messes.

New comment by rfw300 in "The Appalling Stupidity of Spotify's AI DJ"

rfw300 — Sun, 15 Mar 2026 16:22:34 +0000

I don’t necessarily endorse the author’s broad conclusions about “AI”, but I will say that the Spotify DJ specifically is an enragingly bad product. Nothing close to the utility of Claude Code.

New comment by rfw300 in "Surpassing vLLM with a Generated Inference Stack"

rfw300 — Wed, 11 Mar 2026 00:11:34 +0000

I've no problem with the intuition. But I would hope for a lot more focus in the marketing materials on proving the (statistical) correctness of the implementation. 15% better inference speed is not worth it to use a completely unknown inference engine not tested across a wide range of generation scenarios.

New comment by rfw300 in "Surpassing vLLM with a Generated Inference Stack"

rfw300 — Tue, 10 Mar 2026 18:57:12 +0000

OK... we need way more information than this to validate this claim! I can run Qwen-8B at 1 billion tokens per second if you don't check the model's output quality. No information is given about the source code, correctness, batching, benchmark results, quantization, etc. etc. etc.

New comment by rfw300 in "Is legal the same as legitimate: AI reimplementation and the erosion of copyleft"

rfw300 — Mon, 09 Mar 2026 22:23:18 +0000

More likely: this is a transitional phase where our previously hard problems become easy, and we will soon set our sights on new and much harder problems. The pinnacle of creative achievement in the universe is probably not 2010s B2B SaaS.

It is entirely possible, however, that human beings will not be the primary drivers of progress on those problems.

New comment by rfw300 in "Let's Get Physical"

rfw300 — Thu, 05 Mar 2026 22:55:01 +0000

I did, and yet I also felt more relaxed reading it than I am reading most blog entries posted on here. I didn't feel like I had to guard against my time being wasted by vacuous LLM fiction.

New comment by rfw300 in "The Brand Age"

rfw300 — Thu, 05 Mar 2026 21:28:03 +0000

Being wealthy solves virtually all problems of consumption, so the invisible hand provides new problems to serve the market need. Beautiful, really.

New comment by rfw300 in "If AI writes code, should the session be part of the commit?"

rfw300 — Mon, 02 Mar 2026 02:36:35 +0000

Why should it be? The agent session is a messy intermediate output, not an artifact that should be part of the final product. If the "why" of a code change is important, have your agent write a commit message or a documentation file that is polished and intended for consumption.

New comment by rfw300 in "When does MCP make sense vs CLI?"

rfw300 — Sun, 01 Mar 2026 18:14:06 +0000

Making those tools first-class primitives is good for (human) UX: you see the diffs inline, you can add custom rules and hooks that trigger on certain files being edited, etc.