<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ag8</title><link>https://news.ycombinator.com/user?id=ag8</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 22:15:53 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ag8" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ag8 in "Thank HN: You helped save 33k lives"]]></title><description><![CDATA[
<p>You're right; I should've been more precise. However, we have tools for dealing with this—that's what quality-adjusted life-years are for! I don't contest that surgeries often significantly increase QALYs, and may do so pretty cost-effectively.</p>
]]></description><pubDate>Wed, 18 Feb 2026 00:36:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47055509</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=47055509</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47055509</guid></item><item><title><![CDATA[New comment by ag8 in "Thank HN: You helped save 33k lives"]]></title><description><![CDATA[
<p>Lol, I just care a lot about saving as many lives as I can; the most effective charities I've been able to find good evidence on save one life for $6–8k. If Watsi had a credible claim to being able to save lives 10x cheaper, I would redirect my entire donation budget to them!<p>That said, once again, Watsi is great. I really appreciate all the hard work they've put into making this happen—this is orders of magnitude more impressive and impactful than most projects I've ever seen!</p>
]]></description><pubDate>Wed, 18 Feb 2026 00:31:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47055475</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=47055475</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47055475</guid></item><item><title><![CDATA[New comment by ag8 in "Thank HN: You helped save 33k lives"]]></title><description><![CDATA[
<p>Watsi seems to be doing great work, but the title—"you helped save 33k lives"—reads as misleading to me. I guess "helped" could be doing a lot of heavy lifting here, but I would be incredibly surprised if the counterfactual number of lives saved was more than 3000. (But don't let this dissuade you from donating; concretely improving someone's life is totally a worthwhile goal, and Watsi seems very good at effecting this.)</p>
]]></description><pubDate>Tue, 17 Feb 2026 23:30:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47054967</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=47054967</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47054967</guid></item><item><title><![CDATA[Gourmand Syndrome]]></title><description><![CDATA[
<p>Article URL: <a href="https://en.wikipedia.org/wiki/Gourmand_syndrome">https://en.wikipedia.org/wiki/Gourmand_syndrome</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46814828">https://news.ycombinator.com/item?id=46814828</a></p>
<p>Points: 27</p>
<p># Comments: 9</p>
]]></description><pubDate>Thu, 29 Jan 2026 19:01:02 +0000</pubDate><link>https://en.wikipedia.org/wiki/Gourmand_syndrome</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46814828</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46814828</guid></item><item><title><![CDATA[New comment by ag8 in "Ask HN: Share your personal website"]]></title><description><![CDATA[
<p><a href="https://andrew.gr" rel="nofollow">https://andrew.gr</a></p>
]]></description><pubDate>Thu, 15 Jan 2026 08:02:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=46629515</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46629515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46629515</guid></item><item><title><![CDATA[guys why does armenian completely break Claude]]></title><description><![CDATA[
<p><a href="https://xcancel.com/dyushag/status/1993143599286886525" rel="nofollow">https://xcancel.com/dyushag/status/1993143599286886525</a><p><a href="https://claude.ai/share/e368b733-71a4-4211-99f5-6b6cc717b575" rel="nofollow">https://claude.ai/share/e368b733-71a4-4211-99f5-6b6cc717b575</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46579397">https://news.ycombinator.com/item?id=46579397</a></p>
<p>Points: 99</p>
<p># Comments: 65</p>
]]></description><pubDate>Sun, 11 Jan 2026 20:03:20 +0000</pubDate><link>https://twitter.com/dyushag/status/1993143599286886525</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46579397</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46579397</guid></item><item><title><![CDATA[Sampling at negative temperature]]></title><description><![CDATA[
<p>Article URL: <a href="https://cavendishlabs.org/blog/negative-temperature/">https://cavendishlabs.org/blog/negative-temperature/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46579374">https://news.ycombinator.com/item?id=46579374</a></p>
<p>Points: 203</p>
<p># Comments: 60</p>
]]></description><pubDate>Sun, 11 Jan 2026 20:01:14 +0000</pubDate><link>https://cavendishlabs.org/blog/negative-temperature/</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46579374</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46579374</guid></item><item><title><![CDATA[Perfectly Replicating Coca Cola [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=TDkH3EbWTYc">https://www.youtube.com/watch?v=TDkH3EbWTYc</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46572924">https://news.ycombinator.com/item?id=46572924</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Sun, 11 Jan 2026 05:24:19 +0000</pubDate><link>https://www.youtube.com/watch?v=TDkH3EbWTYc</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46572924</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46572924</guid></item><item><title><![CDATA[New comment by ag8 in "Size of Life"]]></title><description><![CDATA[
<p>Not 13?</p>
]]></description><pubDate>Wed, 10 Dec 2025 22:46:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=46225065</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46225065</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46225065</guid></item><item><title><![CDATA[New comment by ag8 in "We collected 10k hours of neuro-language data in our basement"]]></title><description><![CDATA[
<p>This is a cool setup, but naively it feels like it would require hundreds of thousands of hours of data to train a decent generalizable model that would be useful for consumers. Are there plans to scale this up, or is there reason to believe that tens of thousands of hours are enough?</p>
]]></description><pubDate>Mon, 08 Dec 2025 18:02:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=46195528</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=46195528</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46195528</guid></item><item><title><![CDATA[Po.ta.to]]></title><description><![CDATA[
<p>Article URL: <a href="https://po.ta.to/">https://po.ta.to/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45842568">https://news.ycombinator.com/item?id=45842568</a></p>
<p>Points: 4</p>
<p># Comments: 2</p>
]]></description><pubDate>Fri, 07 Nov 2025 01:13:01 +0000</pubDate><link>https://po.ta.to/</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45842568</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45842568</guid></item><item><title><![CDATA[Scaling pretraining affects RL sample efficiency]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.runrl.com/blog/warm-start-rl">https://www.runrl.com/blog/warm-start-rl</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45700340">https://news.ycombinator.com/item?id=45700340</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 25 Oct 2025 00:09:05 +0000</pubDate><link>https://www.runrl.com/blog/warm-start-rl</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45700340</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45700340</guid></item><item><title><![CDATA[Systematically generating tests that would have caught Anthropic's top‑K bug]]></title><description><![CDATA[
<p>Article URL: <a href="https://theorem.dev/blog/anthropic-bug-test/">https://theorem.dev/blog/anthropic-bug-test/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45534347">https://news.ycombinator.com/item?id=45534347</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 10 Oct 2025 00:17:44 +0000</pubDate><link>https://theorem.dev/blog/anthropic-bug-test/</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45534347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45534347</guid></item><item><title><![CDATA[New comment by ag8 in "Tinker"]]></title><description><![CDATA[
<p>Yeah, not sure why the HN backend changed it...</p>
]]></description><pubDate>Wed, 01 Oct 2025 18:53:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45441718</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45441718</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45441718</guid></item><item><title><![CDATA[Tinker]]></title><description><![CDATA[
<p>Article URL: <a href="https://2b4fdb18.connectionism.pages.dev/blog/announcing-tinker/">https://2b4fdb18.connectionism.pages.dev/blog/announcing-tinker/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45440952">https://news.ycombinator.com/item?id=45440952</a></p>
<p>Points: 4</p>
<p># Comments: 2</p>
]]></description><pubDate>Wed, 01 Oct 2025 18:03:04 +0000</pubDate><link>https://2b4fdb18.connectionism.pages.dev/blog/announcing-tinker/</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45440952</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45440952</guid></item><item><title><![CDATA[Training Qwen to answer briefly yet intelligently using feedback control]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.runrl.com/blog/feedback-control">https://www.runrl.com/blog/feedback-control</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45341448">https://news.ycombinator.com/item?id=45341448</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 23 Sep 2025 00:32:22 +0000</pubDate><link>https://www.runrl.com/blog/feedback-control</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45341448</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45341448</guid></item><item><title><![CDATA[New comment by ag8 in "Launch HN: RunRL (YC X25) – Reinforcement learning as a service"]]></title><description><![CDATA[
<p>A) You could add a field to each line of the JSONL file that names the rubric to use; your reward function could then read it via `kwargs["rubric"]` and score against that example's preferred rubric.<p>B) Currently the deployed API is free, but startup takes a few minutes and it runs on a small GPU node, so it isn't especially fast. If you'd like more production-level inference, email us at founders@runrl.com and we can set you up with something much faster (charged per token, depending on model size).</p>
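<p>A minimal sketch of the per-example-rubric idea in point A. The reward-function signature, the `kwargs` plumbing, and the rubric names here are assumptions for illustration, not RunRL's documented API:</p>
<pre><code>
```python
# Hypothetical sketch: each JSONL example carries a "rubric" field that
# selects how its completion is scored. Rubric names and the reward()
# signature are illustrative assumptions, not RunRL's actual API.

RUBRICS = {
    # Reward short answers.
    "concise": lambda text: 1.0 if len(text.split()) <= 50 else 0.0,
    # Reward answers that link to a source.
    "cites_source": lambda text: 1.0 if "http" in text else 0.0,
}

def reward(completion: str, **kwargs) -> float:
    # The per-example "rubric" field from the JSONL line arrives via kwargs.
    rubric_name = kwargs.get("rubric", "concise")
    return RUBRICS[rubric_name](completion)
```
</code></pre>
<p>Each training example would then look like `{"prompt": "...", "rubric": "cites_source"}`, and the same reward function serves every rubric.</p>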
]]></description><pubDate>Thu, 18 Sep 2025 19:16:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45293780</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45293780</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45293780</guid></item><item><title><![CDATA[New comment by ag8 in "Launch HN: RunRL (YC X25) – Reinforcement learning as a service"]]></title><description><![CDATA[
<p>Having an RL agent that's really good at search across some space sounds very powerful in general; the "proofs-as-search" framing makes this an appealing target. Back in the day, when I did more fundamental RL research, we worked on an extension of SoRB [0] where an additional meta-level target was learning improved heuristics to explore the search space faster; it would be exciting to figure out what a good setup for doing things like this looks like in today's LLM-policy-gradient world!<p>[0]: <a href="https://arxiv.org/abs/1906.05253" rel="nofollow">https://arxiv.org/abs/1906.05253</a></p>
]]></description><pubDate>Thu, 18 Sep 2025 19:09:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45293708</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45293708</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45293708</guid></item><item><title><![CDATA[New comment by ag8 in "Launch HN: RunRL (YC X25) – Reinforcement learning as a service"]]></title><description><![CDATA[
<p>We should publish some; the first-order effect seems to be that LoRAs significantly hurt small-model performance vs. FFT, with less of an effect for large models. This may be because large models have more built-in skills, so a LoRA suffices to elicit an existing skill, whereas for small models you need to do more actual learning (holding the number of parameter updates constant). In general I think it's better to get a performant small model with FFT than a performant large model with a large LoRA, which is why we default to FFT, but I agree that we should publish more details here.</p>
]]></description><pubDate>Thu, 18 Sep 2025 17:22:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=45292384</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45292384</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45292384</guid></item><item><title><![CDATA[New comment by ag8 in "Launch HN: RunRL (YC X25) – Reinforcement learning as a service"]]></title><description><![CDATA[
<p>Thanks! Our goal is to make RL "just work" with completely automated GPU provisioning/algorithm selection/SFT warm-up, while giving people the ability to switch away from the defaults if they want to.<p>The way tools currently work in the beta is that you add tools to the configuration via MCP, and they get passed in as additional context for the model; the model may then choose to use a tool during inference, and the tool is automatically called with its output returned as a tool message. If you really want to, you could parse the tool output as part of reward calculation, but I expect you'd usually base the reward just on the model's completion. I could give more details if there's a specific tool setup you're envisioning!</p>
]]></description><pubDate>Thu, 18 Sep 2025 00:52:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=45283496</link><dc:creator>ag8</dc:creator><comments>https://news.ycombinator.com/item?id=45283496</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45283496</guid></item></channel></rss>