Hacker News: ehtbanton

LLM is a compiler, not a runtime

ehtbanton — Mon, 13 Apr 2026 14:12:53 +0000

Article URL: https://getpocketbot.com/blog/llm-compiler-not-runtime

Comments URL: https://news.ycombinator.com/item?id=47752276

Points: 3

# Comments: 1

New comment by ehtbanton in "Claude Code v2.1.100 silently adds ~20k invisible tokens to every request"

ehtbanton — Mon, 13 Apr 2026 14:05:31 +0000

I just don't trust Anthropic's Claude Code team at all any more. Their tools are vibe-coded and their behaviour is anti-consumer.

They shouldn't be surprised at the thousands moving to Codex every day.

New comment by ehtbanton in "Claude Opus 4.6 accuracy on BridgeBench hallucination test drops from 83% to 68%"

ehtbanton — Mon, 13 Apr 2026 03:34:11 +0000

Benchmarks like this one are designed to thoroughly test the model across several iterations. A drop of 15 percentage points is a MASSIVE discrepancy.

Come on Anthropic, admit what you're doing already and let us access your best models unhindered, even if it costs us more. At the moment we just all feel short-changed.

New comment by ehtbanton in "I ran Gemma 4 as a local model in Codex CLI"

ehtbanton — Mon, 13 Apr 2026 01:46:30 +0000

This is genuinely very helpful. I'm planning a MacBook Pro purchase with local inference in mind and now see I'll have to aim for a slightly higher memory option because the Gemma A4 26B MoE is not all that!

New comment by ehtbanton in "Doom, Played over Curl"

ehtbanton — Mon, 13 Apr 2026 01:22:37 +0000

This is very impressive; I've tried it out.

If only everyone was as good at making performant terminal applications (cough cough Anthropic)

New comment by ehtbanton in "Surelock: Deadlock-Free Mutexes for Rust"

ehtbanton — Sun, 12 Apr 2026 23:59:47 +0000

I've had this thought myself too. Going off on a slight tangent: I think there's also loads of useful stuff in domains like these that maps amazingly well to AI agent system design, but there's such a huge gap between the knowledge bases of the fields that no benefit ever really surfaces.

(Speaking from the perspective of someone who simultaneously loves high-performance compute and agentic AI haha)

New comment by ehtbanton in "Exploiting the most prominent AI agent benchmarks"

ehtbanton — Sun, 12 Apr 2026 23:52:08 +0000

I will always maintain that the best benchmark is just trying it out for yourself. The most practical parallel for me is all the people posting about how some open-source model has "achieved X on Y benchmark - beating out Opus 4.6!" It's all show and everyone cheats.

New comment by ehtbanton in "Small models also found the vulnerabilities that Mythos found"

ehtbanton — Sun, 12 Apr 2026 23:46:44 +0000

Wake me up when Anthropic does something right again...