Hacker News: pavpanchekha

New comment by pavpanchekha in "Advancing the price-performance frontier with GPT‑5.6"

pavpanchekha — Thu, 30 Jul 2026 17:31:13 +0000

Making Luna, which was already very cheap and extremely capable, 5x cheaper is crazy. I use Sol at work but Luna at home, and while there's definitely a difference, it doesn't feel like night-and-day. After a year of ever-increasing prices it suddenly feels (between this, Kimi K3, GLM 5.2) that prices are falling again.

New comment by pavpanchekha in "GPT-5.6 Sol, along with Terra and Luna, will launch publicly this Thursday"

pavpanchekha — Wed, 08 Jul 2026 05:25:45 +0000

For compiler work I found that Sol is noticeably better than 5.5 (and I generally use OAI models because I like the Codex app), but Fable was still obviously better.

New comment by pavpanchekha in "Notes from the Mistral AI Now Summit"

pavpanchekha — Fri, 29 May 2026 18:54:44 +0000

OpenAI used to make Codex-specific models, but they stopped. What I've gathered from interviews and similar is that training two models isn't worth the (small) lift from having a coding-specific model. You're pre-training on everything anyway, and coding RL is reasonably useful for general-purpose models too.

Can LLMs accelerate science? An experiment

pavpanchekha — Thu, 09 Apr 2026 04:52:23 +0000

Article URL: https://pavpanchekha.com/blog/llk.html

Comments URL: https://news.ycombinator.com/item?id=47699408

Points: 2

# Comments: 1

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 18:09:31 +0000

University of Washington Programming Languages and Software Engineering (research group).

I'm not at UW any more, I'm now at Utah, but some of the Herbie team is at UW and they provide the infrastructure.

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 14:04:09 +0000

Documented here, but yes, it's an average of something similar to, but not exactly the same as, relative error: https://herbie.uwplse.org/doc/latest/error.html

It's true that averages can be misleading, but we encourage users to think about it instead as a percentage of inputs. In practice the error distribution is very bimodal, the two modes being "basically fine" (a few ulps of error) and "garbage" (usually 0 instead of some actual value).
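To make the two modes concrete, here is a minimal Python sketch. It assumes an ulp-based "bits of error" measure (log2 of one plus the ulp distance between the computed and reference results), which is only an approximation of what the linked documentation describes. The helper names `ordinal` and `bits_of_error`, and the example expression sqrt(x+1) - sqrt(x), are illustrative choices and are not taken from Herbie itself.

```python
import math
import struct


def ordinal(x: float) -> int:
    """Map a double to an integer so that adjacent doubles get adjacent integers."""
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # Negative doubles have the sign bit set; mirror them below zero.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFFFFFFFFFF)


def bits_of_error(approx: float, reference: float) -> float:
    """Assumed measure: log2 of (1 + ulp distance between the two values)."""
    return math.log2(abs(ordinal(approx) - ordinal(reference)) + 1)


def naive(x: float) -> float:
    # Suffers catastrophic cancellation once x + 1 rounds to x.
    return math.sqrt(x + 1) - math.sqrt(x)


def rewritten(x: float) -> float:
    # Algebraically equal, without the cancellation; used as the reference here.
    return 1 / (math.sqrt(x + 1) + math.sqrt(x))


for x in (0.5, 1.0, 1e17, 1e20):
    print(x, round(bits_of_error(naive(x), rewritten(x)), 1))
```

For small inputs the two versions agree to within a few ulps; for large inputs the naive version returns exactly 0 and nearly every bit is wrong, with little in between for these inputs.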

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 14:01:27 +0000

Author here. The speedup is modeled throughput, though the model is relatively naive. It's possible to disable branches by turning off the regimes flag; see https://herbie.uwplse.org/doc/1.0/options.html

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 13:59:55 +0000

Author here. I've got a few papers about this problem (including one in submission), but it is very very hard to do, especially with acceptable overhead. The state of the art is maybe 100x overhead.

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 13:58:44 +0000

It is; there's a page in the documentation about how errors are defined. Let me also add: Herbie generally gives the most accurate option it found first, and then the other stuff might be useful for speed (0.5x is way faster than two square roots and a divide!) but it's not as accurate.

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 13:56:17 +0000

Author here! Yes, the float distribution isn't what you want in practice, but a distribution selector isn't really the right thing either, because a low-probability bad result can still be pretty bad! Hence the range selector; the float distribution is good at picking extreme values that trigger FP error.

We usually recommend looking for 90%+ accuracy or carefully examining the accuracy plot.
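As a rough illustration of why sampling by float representation reaches extreme values, here is a small Python sketch that draws random 64-bit patterns and reinterprets them as doubles. This is an assumption about what "the float distribution" means, not Herbie's actual sampler, and `sample_float` is a hypothetical helper.

```python
import math
import random
import struct


def sample_float(rng: random.Random) -> float:
    """Draw a double uniformly over bit patterns, rejecting NaN and infinity."""
    while True:
        bits = rng.getrandbits(64)
        x = struct.unpack("<d", struct.pack("<Q", bits))[0]
        if math.isfinite(x):
            return x


rng = random.Random(0)
samples = [sample_float(rng) for _ in range(10_000)]

# Fraction of samples with magnitude in an "everyday" range.
everyday = sum(1e-3 <= abs(x) <= 1e3 for x in samples) / len(samples)
# Fraction of samples that are astronomically large or tiny.
extreme = sum(abs(x) > 1e100 or abs(x) < 1e-100 for x in samples) / len(samples)
print(f"everyday: {everyday:.3f}, extreme: {extreme:.3f}")
```

Because every exponent is about equally likely, roughly two thirds of the samples land outside [1e-100, 1e100] and only about one percent fall in [1e-3, 1e3], which is why this kind of sampling is good at finding inputs that trigger floating-point error and poor at modeling realistic inputs.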

New comment by pavpanchekha in "Herbie: Automatically improve imprecise floating point formulas"

pavpanchekha — Sat, 04 Apr 2026 13:49:56 +0000

It was me. Damn it you're right! Will fix!

New comment by pavpanchekha in "In math, rigor is vital, but are digitized proofs taking it too far?"

pavpanchekha — Mon, 30 Mar 2026 14:56:15 +0000

In calculus the core issue is that the concept of a "function" was undefined but generally understood to be something like what we'd call today an "expression" in a programming language. So, for example, "x^2 + 1" was widely agreed to be a function, but "if x < 0 then x else 0" was controversial. What's nice about the "function as expression" idea is that generally speaking these functions are continuous, analytic [1], etc., and the set of such functions is closed under differentiation and integration [2]. There's a good chance that if you took AP Calculus you basically learned this definition.

The formal definition of "function" is totally different! This is typically a big confusion in Calculus 2 or 3! Today, a function is defined as literally any input→output mapping, and the "rule" by which this mapping is defined is irrelevant. This definition is much worse for basic calculus, since most mappings are not continuous or differentiable. But it has benefits for more advanced calculus; the initial application was Fourier series. And it is generally much easier to formalize because it is "canonical" in a certain sense: it doesn't depend on questions like "which exact expressions are allowed".

This is exactly what the article is complaining about. The non-rigorous intuition preferred for basic calculus and the non-rigorous intuition required for more advanced calculus are different. If you formalize, you'll end up with one rigorous definition, which necessarily will have to incorporate a lot of complexity required for advanced calculus but confusing to beginners.

Programming languages are like this too. Compare C and Python. Some things must be written in C, but most things can be more easily written in Python. If the whole development must be one language, the more basic code will suffer. In programming we fix this by developing software as assemblages of different programs written in different languages, but mechanisms for this kind of modularity in formal systems are still under-studied and, today, come with significant untrusted pieces or annoying boilerplate, so this solution isn't yet available.

[1] Later it was discovered that in fact this set isn't analytic, but that wasn't known for a long time.

[2] I am being imprecise; integrating and solving various differential equations often yields functions that are nice but aren't defined by combinations of named functions. The solution at the time was to name these newly discovered functions.

New comment by pavpanchekha in "Python: The Optimization Ladder"

pavpanchekha — Sat, 14 Mar 2026 17:02:06 +0000

They're cheap but not free, especially at the front end of the CPU where it's just a lot more instructions to churn through. What the branch predictor gets you is that branches, which would normally cause a pipeline bubble, execute like straight-line code when they're predicted right. It's a bit like a tracing JIT. But you will still have a bunch of extra instructions to, like, compute the branch predicate.

New comment by pavpanchekha in "Faster asin() was hiding in plain sight"

pavpanchekha — Wed, 11 Mar 2026 18:54:18 +0000

Horner's form is typically also more accurate, or at least, it is not bit-identical, so the compiler won't do it unless you pass -funsafe-math-optimizations, and maybe not even then.
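A small Python sketch of why the two forms are not interchangeable: the same polynomial evaluated term by term and in Horner's form can round differently. The coefficients and the names `eval_naive` and `eval_horner` are illustrative, and this only shows that the results are not bit-identical, not which one is more accurate.

```python
def eval_naive(coeffs: list[float], x: float) -> float:
    """Evaluate sum(coeffs[i] * x**i), accumulating one term at a time."""
    total = 0.0
    power = 1.0
    for c in coeffs:
        total += c * power
        power *= x
    return total


def eval_horner(coeffs: list[float], x: float) -> float:
    """Evaluate the same polynomial as c0 + x*(c1 + x*(c2 + ...))."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


coeffs = [0.1, 0.2, 0.3]  # 0.1 + 0.2*x + 0.3*x^2
a = eval_naive(coeffs, 1.0)
b = eval_horner(coeffs, 1.0)
print(repr(a), repr(b), a == b)
```

At x = 1.0 the two functions add the same three numbers in a different order, and the results differ in the last bit. That is enough to stop a compiler from substituting one form for the other under strict floating-point semantics.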

New comment by pavpanchekha in "Deterministic Programming with LLMs"

pavpanchekha — Sun, 01 Mar 2026 01:32:18 +0000

Deterministic output is incompatible with batching, which in turn is critical to high utilization on GPUs, which in turn is necessary to keep costs low.

The Token Production System

pavpanchekha — Fri, 20 Feb 2026 15:02:10 +0000

Article URL: https://pavpanchekha.com/blog/tps.html

Comments URL: https://news.ycombinator.com/item?id=47088858

Points: 1

# Comments: 0

New comment by pavpanchekha in "Challenges and Research Directions for Large Language Model Inference Hardware"

pavpanchekha — Sun, 25 Jan 2026 17:23:40 +0000

Frontier models are now much bigger than an individual query, hence batching, MoE, etc. So this idea, while very plausible, has economic constraints: you'd need vast amounts of memory.

New comment by pavpanchekha in "Should CSS be constraints?"

pavpanchekha — Thu, 11 Dec 2025 00:03:32 +0000

That pretty much is how CSS works! At the most basic level, Flow layout is about widths down, heights up. But this basic model doesn't let you do a lot of things some people want to do, like distributing left-over space in a container equally among children (imagine a table). So then CSS added more stuff, like Flexbox, which also fundamentally works like this, though it adds a second pass.
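The "widths down, heights up" idea can be sketched as a single recursive pass in Python. This is a toy model under heavy assumptions (block children stacked vertically, uniform padding, no margins, floats, or text), not the algorithm from the CSS specifications; `Box`, `content_height`, and `padding` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    content_height: float = 0.0  # height of the box's own content (toy stand-in for text)
    padding: float = 0.0
    children: list["Box"] = field(default_factory=list)
    width: float = 0.0   # filled in by layout()
    height: float = 0.0  # filled in by layout()


def layout(box: Box, available_width: float) -> None:
    # Widths flow down: a box takes the width its parent offers...
    box.width = available_width
    inner_width = available_width - 2 * box.padding
    # ...and offers what is left inside its padding to each child.
    for child in box.children:
        layout(child, inner_width)
    # Heights flow up: a box is as tall as its content plus its stacked children.
    stacked = sum(child.height for child in box.children)
    box.height = box.content_height + stacked + 2 * box.padding


root = Box(padding=10, children=[Box(content_height=20),
                                 Box(content_height=30, padding=5)])
layout(root, 200)
print(root.width, root.height, [(c.width, c.height) for c in root.children])
```

Each box's width is known before its children are visited, and its height is known only after they return, so one traversal suffices. Distributing left-over space among children needs information from all siblings first, which is where an extra pass comes in.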

New comment by pavpanchekha in "Should CSS be constraints?"

pavpanchekha — Thu, 11 Dec 2025 00:01:13 +0000

Author here—it is from Tufte CSS. I have a blog post [1] about how floats work. It is a nice example of there being unintuitive and also more-intuitive ways to achieve things in CSS. These days I believe CSS Anchor Positioning provides a simpler way to do this, but I haven't used it yet.

[1]: https://pavpanchekha.com/blog/css-floats.html

New comment by pavpanchekha in "Should CSS be constraints?"

pavpanchekha — Wed, 10 Dec 2025 23:58:26 +0000

Author here. You're right that a lot of CSS's edge cases and implicit rules stem from other choices and implicit rules that maybe need to be reconsidered. But take this logic a step further. The way text with mixed font sizes is laid out is kinda weird; should we just get rid of that? Mixed Chinese-Latin text is weird (search "ideographic baseline"); should we get rid of that? In fact, variable-size characters are weird, maybe just stick to all-Chinese? I'm joking, of course, but my point isn't that a simpler system is inconceivable, just that it would be inconvenient.