Hacker News: pyrex41

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 21:04:46 +0000

I mean yes, that's a risk, and you are correct. In practice, is your spec about the shape of the app you want to build really going to be that complicated?

But I mentioned its Turing completeness as a lazy shorthand to illustrate that it is far more flexible and expressive than what people think of as a "type system". https://shenlanguage.org/OSM/Recursive.html

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 20:06:34 +0000

Shen has some really unique properties that are under-developed here. It's type system itself is Turing complete and very flexible / expressive. Also, the Shen kernel is extremely compact, and easy to port into a wide variety of runtime languages (C, Lisp, Ruby, Python, JS, Go, etc https://shen-language.github.io/#downloads). What I discussed about using it as a compile-time gate + codegen is just scratching the surface, I think.

Now, a lot of the ports haven't been maintained. But the underlying Shen kernel is only 4-5k lines of code...remains extremely portable. More discussion here https://news.ycombinator.com/item?id=39602472

I didn't focus a ton on Shen in the blog post, because the underlying principles aren't really about the implementation. Shen is very cool tho.

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 19:54:59 +0000

I would add, re: Shen -- it's sequent calculus and unique type system (type system itself is Turing complete) give you a lot of flexibility here.

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 19:51:36 +0000

If you are the kind of person that immediately reaches for this solution -- then I agree, yes you should. You could even do it in Shen! (https://news.ycombinator.com/item?id=39602472, https://news.ycombinator.com/item?id=9297665)

But, for everyone else: even if you skipped the sidecar entirely, didn't use the codegen, you just had the AI interpret the spec'd application into a short Shen proof, iterate until it's internally consistent / compiles...now you have a spec that is internally consistent, straightforward for human to understand, and much stronger context for the LLM than English language spec alone.

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 18:52:00 +0000

I think you're right on the substance. A production-grade spec (or guard type) needs stronger assertions than the toy example in the post — predicates for signature verification, claim-binding, and expiry-from-token, at minimum. The example is only illustrating the proof-chain shape, and isn't a good example of a full-fledged JWT validator.

Your underlying point, that calling the constructor is the assertion so AI passing `true` can "prove" whatever — is true of any smart-constructor pattern, including your own `newUnverified` approach. The trust still has to live somewhere. In your pattern it lives in the small set of audited callers; in shengen's case it lives in the same place — the wrappers (like `CheckTenantAccess`) that actually establish the premise via a DB query or a JWT parse. Structurally the two approaches are doing the same thing. To harden it, you'd keep the raw constructors package-private and export only the wrappers, so the handler code the LLM is writing physically cannot call NewTenantAccess(..., true) — only CheckTenantAccess.

On the deeper question about "why codegen": the short answer is "obviously, you don't have to." But if we assume that we're using AI to write at least some of the code, now you have to either (1) describe the constructor in very precise English and have the LLM generate it, (2) inject yourself into the loop closely with the LLM, or (3) not use an LLM for this part. My proposition is that writing the core invariants as proofs that can be deterministically checked for internal consistency and written declaratively is (1) more efficient, (2) less lossy, and (3) easier for the developer to read and reason about than writing the constructor from scratch. This puts a lot of trust in the codegen, as you point out; but as a practical matter, having a formal representation of what you want plus an English prompt is stronger context to the LLM anyway.

The other reason I started down this path, which I didn't get into in the post because I haven't figured out yet if it's truly practical, comes from a property specific to Shen: it has a very small kernel that has been ported into a lot of runtimes — Lisp, C, JS, Go, Python, Erlang, Scheme, Java, etc (https://shen-language.github.io/#downloads). That opens up the possibility of writing specs whose predicates run as runtime gates from the same Shen expression, no translation step — and even mixing compile-time and runtime assertions into the same spec. I find this very interesting conceptually, but I'm not sure yet whether it's practically useful for anything.

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 18:16:16 +0000

The distinction worth keeping clean is between the spec (here, written as proofs in Shen) being formally rigorous and the entire codebase being formally verified. Shen-Backpressure does the first: the spec is a sequent-calculus statement of invariants, and shengen lowers it into guard types the target compiler refuses to violate, so within the target language's type discipline you cannot construct a tenant-access (or any other witness) without discharging its premises.

It does not do the second (formally verify the entire codebase). Outside the guard types your Go or TypeScript is just code. It can panic, race, have bugs in unrelated logic, use reflection to forge values inside the guard package, get a wrong answer from the SQL query that fed the predicate, and so on. The proof ends at the projection boundary.

Why not go further? Not really "too complex to implement," in theory; it has been done before. But verifying the whole program is much higher engineering cost, and the trades-offs to do it make sense in a much narrower set of cases than what I'm trying to target: teams shipping production code with AI in the loop, in the language they already ship.

The pragmatic choice is to spend the verification budget on the small set of invariants that genuinely matter and leave the rest as ordinary code with ordinary review and tests. That is why the claim is phrased as "practically impossible to accidentally bypass, not categorically impossible to bypass." Over-claiming "verified" when the host language is unverified would be misleading.

New comment by pyrex41 in "Formal Verification Gates for AI Coding Loops"

pyrex41 — Wed, 20 May 2026 15:27:48 +0000

Author here. The TL;DR: move rules from prompts into types the compiler refuses to violate, then bounce the AI coding loop off those refusals. The repo is github.com/pyrex41/Shen-Backpressure. Builds a lot on Geoff Huntley's backpressure idea -- none of this is rocket science, just an effort to apply sound programming principles in a world of LLM coding agents.

Formal Verification Gates for AI Coding Loops

pyrex41 — Wed, 20 May 2026 15:25:45 +0000

Article URL: https://reubenbrooks.dev/blog/structural-backpressure-beats-smarter-agents/

Comments URL: https://news.ycombinator.com/item?id=48209323

Points: 108

# Comments: 24

Real Decision Making vs. Expectation Value

pyrex41 — Sat, 01 Dec 2018 05:43:36 +0000

Article URL: https://aip.scitation.org/doi/10.1063/1.4940236

Comments URL: https://news.ycombinator.com/item?id=18575426

Points: 3

# Comments: 0

New comment by pyrex41 in "Kelly Criterion (2007)"

pyrex41 — Mon, 19 Nov 2018 16:14:47 +0000

And, given need to allow for errors in estimation and eventual deterioration of many alpha-generating strategies, adjusting investment fraction downward (a sort Bayesian prior, I suppose) is sensible on its own, sans vol sensitivity considerations.

New comment by pyrex41 in "Kelly Criterion (2007)"

pyrex41 — Mon, 19 Nov 2018 15:21:03 +0000

This is also an argument for finding investments that allow you to better define the downside risk of investments. I think that this is why static investments or hedges have such value; they may not change the expected value, they might even reduce it (based on mean estimates), but they reduce / eliminate the estimation error in your downside risk, allowing you a much more certain calculation of leverage.

New comment by pyrex41 in "Kelly Criterion (2007)"

pyrex41 — Mon, 19 Nov 2018 15:16:43 +0000

Most of the examples of Kelly criterion application are either concrete bets with discrete payoff/loss odds and values, or assumed to be normally distributed. This paper discusses how extremely skewed outcomes (eg, stock options) should affect the Kelly calculation: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2956161

New comment by pyrex41 in "Kelly Criterion (2007)"

pyrex41 — Mon, 19 Nov 2018 15:14:03 +0000

The reasoning behind the Kelly Criterion was explored recently in a more broad context, showing that the logarithmic utility is not required: https://aip.scitation.org/doi/10.1063/1.4940236

Taleb has a good discussion here: https://medium.com/incerto/the-logic-of-risk-taking-107bf410...