<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ptnpzwqd</title><link>https://news.ycombinator.com/user?id=ptnpzwqd</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 04 May 2026 20:10:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ptnpzwqd" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ptnpzwqd in "A desktop made for one"]]></title><description><![CDATA[
<p>I think a notable difference is that the AI portrayed in most sci-fi (that I have read/watched, anyway) tends to be a "logical machine" that acts deterministically based on the data available to it.<p>What we got are "statistical machines" that tend to do the right thing under the right conditions, but can go completely off the rails every now and then.<p>The former is more akin to a generalization of computers as we typically think of them, whereas the latter is something else. Maybe that something else is closer to human behavior in some ways, but it is also very different - unlike with humans, where you get to know people, build relationships, learn who to trust in what ways, and so forth, you can never really trust an LLM with any critical task without close supervision.</p>
]]></description><pubDate>Mon, 04 May 2026 09:47:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48006484</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=48006484</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48006484</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "The Claude Code Leak"]]></title><description><![CDATA[
<p>I feel the conclusions here are a bit thin.<p>Code quality tends to have an impact on more than just aesthetics - and Claude Code certainly feels like a buggy mess from an end user's perspective.<p>Of course people still use Claude Code, but that is surely because of the underlying models first and foremost. Most products don't have such a moat and would not see nearly as much tolerance from end users. If the Max subscriptions could be used with other harnesses, I am sure Anthropic would have to compete harder on the quality of the harness (to be fair, most AI-based tooling seems pretty alpha these days, but eventually things will stabilize).<p>Polish is not everything, clearly, but it is a factor, and I feel Claude Code is maybe the worst example to use here, as it doesn't generalize at all to most other products.</p>
]]></description><pubDate>Thu, 02 Apr 2026 07:17:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47611036</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47611036</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47611036</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Diverse perspectives on AI from Rust contributors and maintainers"]]></title><description><![CDATA[
<p>On the "falling behind" point:<p>I strongly doubt that is going to be the case - picking up these tools is not rocket science, even if you want to use them fairly effectively. In addition, there is so much churn in AI tooling these days that an early investment might not be worth much in the longer run.<p>On the other hand, hands-on experience in programming and architecture is currently a must-have for using the tools effectively - and continuing without AI in the short term might just buy an inexperienced engineer some time to learn, and postpone skill atrophy for an experienced engineer.<p>Of course, no one knows what the future holds, but I doubt a "wait and see" approach is that dangerous to anyone's career.</p>
]]></description><pubDate>Sun, 22 Mar 2026 23:34:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47483512</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47483512</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47483512</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "10x Is the New Floor"]]></title><description><![CDATA[
<p>I live in two realities too.<p>One where articles like this talk about a 10x increase in productivity, where the sentiment on HN is that all software can now be vibe coded without review, where AGI is right around the corner.<p>And another reality where we don't really see any measurable difference in productivity between those that have gone all in on LLMs and those that haven't, where the current SOTA models still produce poor-quality code that needs a lot of guidance and revision, where mediocre engineers who never cared about quality in the first place now use these tools to produce even more mediocre code, where the most talented engineers get bogged down reviewing low-quality LLM-generated PRs, and so on.<p>Maybe a bit exaggerated, but OP definitely lives in a very different reality from the one I am seeing. So different that I find it extremely hard to believe.</p>
]]></description><pubDate>Tue, 10 Mar 2026 16:56:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47325847</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47325847</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47325847</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>It was maybe not quite clear enough in my comment, but this is more of a hypothetical future scenario - not at all where I assess LLMs are today or will get to in the foreseeable future.<p>So it becomes a bit theoretical, but I guess if we had a future where LLMs could consistently write perfect code, it would not be too far-fetched to also think they could perfectly review code, true enough. But either way the maintainer would still spend some time ensuring a contribution aligns with their vision and so forth, and there would still be close to zero incentive to allow outside contributors in that scenario. No matter what, that scenario is a bit of a fairytale at this point.</p>
]]></description><pubDate>Tue, 10 Mar 2026 13:45:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47323171</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47323171</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47323171</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>Yes, but LLM-based reviews are nowhere near a substitute for human review, so it doesn't change much.</p>
]]></description><pubDate>Tue, 10 Mar 2026 12:52:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47322601</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47322601</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47322601</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>Yes, I agree. It was just me playing with a hypothetical (but in my view not imminent) future where vibe-coding without review would somehow be good enough.</p>
]]></description><pubDate>Tue, 10 Mar 2026 10:14:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321272</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47321272</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321272</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>At the moment verification at scale is an unsolved problem, though. As mentioned, I think this will act as a rough filter for now, but probably not work forever - and denying contributions from non-vetted contributors will likely end up being the new default.<p>Once outside contributions are rejected by default, the maintainers can of course choose whether or not to use LLMs themselves.<p>I do think it is a misconception that OSS software needs to be "viable". OSS maintainers can have many motivations to build something, and just shipping a product might not be at the top of that list at all - they certainly don't have that obligation. Personally, I use OSS as a way to build and design software with a level of gold plating that is not possible in most work settings, for the feeling that _I_ built something, and for the pure joy of coding - using LLMs to write code would work directly against those goals. Whether LLMs are essential in more competitive environments is also something on which opinions are mixed, but in those settings being dogmatic is certainly riskier.</p>
]]></description><pubDate>Tue, 10 Mar 2026 10:07:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321229</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47321229</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321229</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>Sure - and I suspect we will see that soon enough. But it has downsides too, and finding the right way to vet potential contributors is tricky.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:49:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321068</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47321068</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321068</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>Even if we assume that LLMs become good enough for this to be true (some might feel that is the case already - I disagree, but that is beside the point), there is no reason why OSS maintainers should accept outside contributions that they would need to carefully review, as they come from an untrusted source, when they could just use the tools themselves directly. Low-effort drive-by PRs are a burden with no upside.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:46:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321052</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47321052</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321052</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>The problem is the increasing review burden - with LLMs it is possible to create superficially valid-looking (but potentially incorrect) code without much effort, which will still take a lot of effort to review. So outright rejecting code that can be identified as LLM-generated at a glance is a rough filter to remove the lowest-effort PRs.<p>Over time this might not be enough, though, so I suspect we will see default-deny policies popping up soon enough.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:39:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47320989</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47320989</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47320989</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>Not necessarily a bad idea, but I think the bigger issue here and now is the increasing asymmetry in effort between code submitter and reviewer, and the unsustainable review burden on the maintainers if nothing is done.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:22:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47320859</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47320859</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47320859</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>I suspect this is for now just a rough filter to remove the lowest-effort PRs. It likely will not be enough for long, though, so I suspect we will see default-deny policies soon enough, and various approaches to screening potential contributors.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:19:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47320836</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47320836</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47320836</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Redox OS has adopted a Certificate of Origin policy and a strict no-LLM policy"]]></title><description><![CDATA[
<p>I think this is a reasonable decision (although maybe increasingly insufficient).<p>It doesn't really matter what your stance on AI is; the problem is the increased review burden on OSS maintainers.<p>In the past, the code itself was a sort of proof of effort - you would need to invest some time and effort in your PRs, otherwise they would be easily dismissed at a glance. That is no longer the case, as LLMs can quickly generate PRs that might look superficially correct. Effort can still have been put into those PRs, but there is no way to tell without spending time reviewing in more detail.<p>Policies like this help decrease that review burden by outright rejecting what can be identified as LLM-generated code at a glance. That is probably a fair bit today, but it might get harder over time, so I suspect eventually we will see a shift towards more trust-based models, where you cannot submit PRs if you haven't been approved in advance somehow.<p>Even if we assume LLMs would consistently generate code of good enough quality, code submitted by someone untrusted would still need detailed review for many reasons - so even in that case it would likely be faster for the maintainers to just use the tools themselves, rather than reviewing someone else's use of the same tools.</p>
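<p>As a rough illustration of what such a pre-approval gate could look like, here is a minimal sketch of a CI check that rejects PRs from authors who are not on an allowlist. This is purely hypothetical: the PR_AUTHOR variable and CONTRIBUTORS.allow file are assumed names, not any real project's setup.<pre><code>#!/usr/bin/env python3
# Hypothetical CI gate: reject PRs from authors who have not been
# pre-approved. A sketch under assumed names, not a real project's setup.
import os
import sys

ALLOWLIST_FILE = "CONTRIBUTORS.allow"  # assumed: one username per line

def main() -> int:
    # Assumed: the CI system exposes the PR author via this variable.
    author = os.environ.get("PR_AUTHOR", "").strip().lower()
    if not author:
        print("No PR author provided; rejecting by default.")
        return 1
    try:
        with open(ALLOWLIST_FILE) as f:
            allowed = {line.strip().lower() for line in f if line.strip()}
    except FileNotFoundError:
        print(f"{ALLOWLIST_FILE} not found; rejecting by default.")
        return 1
    if author not in allowed:
        print(f"{author} is not pre-approved; this PR would be closed automatically.")
        return 1
    print(f"{author} is pre-approved; proceed to normal human review.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
</code></pre>The point is only that the default flips: an unvetted submission never reaches human review at all, and getting onto the allowlist becomes the actual contribution process.</p>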
]]></description><pubDate>Tue, 10 Mar 2026 09:13:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47320789</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47320789</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47320789</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "When AI writes the software, who verifies it?"]]></title><description><![CDATA[
<p>Correct, but that has always been, and probably always will be, the case.<p>You spend the time on whatever is needed for you to move ahead - if code review is now the most time-consuming part, that is where you will spend your time. If that ever stops being a problem, defining requirements will maybe be the next bottleneck and the place where you spend your time, and so forth.<p>Of course it would be great to get rid of the review bottleneck as well, but I at least don't have an answer to that - I don't think the current generation of LLMs is good enough to allow us to bypass that step.</p>
]]></description><pubDate>Tue, 03 Mar 2026 23:02:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47240350</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47240350</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47240350</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "When AI writes the software, who verifies it?"]]></title><description><![CDATA[
<p>I of course cannot say what the future holds, but current frontier models are - in my experience - nowhere near good enough for such autonomy.<p>Even with other agents reviewing the code, good test coverage, etc., both smaller - and every now and then larger - mistakes make their way through, and the existence of such mistakes in the codebase tends to breed even more of them.<p>It for sure depends on many factors, but I have seen enough to feel confident that we are not there yet.</p>
]]></description><pubDate>Tue, 03 Mar 2026 22:26:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47239977</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47239977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47239977</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "When AI writes the software, who verifies it?"]]></title><description><![CDATA[
<p>If reviewing has become the bottleneck, the obvious - albeit slightly boring - solution is to slow down spitting out new code, and spend relatively more time reviewing.<p>Just going ahead and piling up PRs or skipping the review process is of course not recommended.</p>
]]></description><pubDate>Tue, 03 Mar 2026 22:10:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47239782</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47239782</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47239782</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Switch to Claude without starting over"]]></title><description><![CDATA[
<p>You can use this: hnthrowaway.outboard407@passmail.net</p>
]]></description><pubDate>Sun, 01 Mar 2026 20:32:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47210361</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47210361</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47210361</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Switch to Claude without starting over"]]></title><description><![CDATA[
<p>I don't expect that - I am merely responding to the parent comment's claim that Claude consistently one-shots production-ready code (which does not at all match my observations).</p>
]]></description><pubDate>Sun, 01 Mar 2026 15:01:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47207322</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47207322</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47207322</guid></item><item><title><![CDATA[New comment by ptnpzwqd in "Switch to Claude without starting over"]]></title><description><![CDATA[
<p>Feel free to share it; I would be very curious - ideally alongside the prompts.</p>
]]></description><pubDate>Sun, 01 Mar 2026 11:20:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47205729</link><dc:creator>ptnpzwqd</dc:creator><comments>https://news.ycombinator.com/item?id=47205729</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47205729</guid></item></channel></rss>