<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: radarsat1</title><link>https://news.ycombinator.com/user?id=radarsat1</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 15 Apr 2026 06:19:44 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=radarsat1" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by radarsat1 in "Introspective Diffusion Language Models"]]></title><description><![CDATA[
<p>Because the nature of transformers is that running a batch of pregenerated tokens through them is a parallel operation, not an autoregressive one. That's how they work at training time, and speculative decoding exploits the same property at inference time. So if you just want to check how "likely" a set of known tokens is under the base model, you can run them all through in one pass and read off the probability distributions; there's no need to sample.<p>It's the same reason there's a speed difference between "prompt processing" and "generation". The former just takes the pre-supplied prompt and builds the KV cache, which is parallel rather than autoregressive, and therefore way faster.</p>
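A minimal sketch of the point, using a deterministic toy stand-in for a causal transformer (`toy_model` and its `np.sin`-based logits are purely illustrative, not a real model): one parallel pass over known tokens yields exactly the same per-position distributions as feeding the model prefix by prefix.

```python
import numpy as np

VOCAB = 50

def toy_model(tokens):
    """Stand-in for a causal transformer: one call returns next-token
    logits for EVERY position at once. Row i depends only on tokens[:i+1],
    mimicking causal masking (the np.sin "logits" are purely illustrative)."""
    ctx = np.cumsum(tokens)  # prefix summary per position
    return np.sin(np.outer(ctx, np.arange(1, VOCAB + 1)))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

tokens = np.array([3, 7, 1, 4])  # pregenerated tokens to check

# Parallel: one forward pass scores every known token at once.
logits = toy_model(tokens)
probs = softmax(logits)
parallel_scores = probs[np.arange(len(tokens) - 1), tokens[1:]]

# Autoregressive: feed prefixes one at a time -- same numbers, many passes.
sequential_scores = [
    softmax(toy_model(tokens[:i])[-1])[tokens[i]] for i in range(1, len(tokens))
]
assert np.allclose(parallel_scores, sequential_scores)
```

That equality is exactly what speculative decoding relies on: the large model verifies a draft's tokens in one parallel pass instead of generating them one at a time.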
]]></description><pubDate>Tue, 14 Apr 2026 13:30:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47765412</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47765412</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47765412</guid></item><item><title><![CDATA[New comment by radarsat1 in "R3 Bio pitched “brainless clones” to serve the role of backup human bodies"]]></title><description><![CDATA[
<p>Consider also that even reattaching nerves that are <i>supposed to be there</i> is not exactly a walk in the park. Look into finger reattachment surgery and post-operation care: think pain, tingling, a year or more of physiotherapy. And that's the best case, where it actually works and you don't end up with a "dead" finger. Now imagine that for your whole body.</p>
]]></description><pubDate>Mon, 30 Mar 2026 23:19:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47580907</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47580907</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47580907</guid></item><item><title><![CDATA[New comment by radarsat1 in "TurboQuant: Redefining AI efficiency with extreme compression"]]></title><description><![CDATA[
<p>> "Redefine" is a favorite word of AI. Honestly no need to read further.<p>You're not wrong, but it certainly is an annoying outcome of AI that we're not allowed to use.. words.. anymore.</p>
]]></description><pubDate>Wed, 25 Mar 2026 17:27:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47520473</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47520473</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47520473</guid></item><item><title><![CDATA[New comment by radarsat1 in "The future of version control"]]></title><description><![CDATA[
<p>> Should you be counting on confusion of an underpowered text-merge to catch such problems?<p>This does not really follow from my statement.<p>I said that an underpowered text merge should not <i>silently accept</i> such situations, not that it is the only way to catch them. It doesn't replace knowing something about what you are merging, but it is certainly a good hint that something may be wrong or unexpected.<p>> Post-merge syntax checks are better for that purpose.<p>Better, yes, but I was addressing <i>semantic</i> issues, not syntactic ones. I have seen syntactically valid merges result in semantic inconsistency; it does happen.<p>I do agree with your last statement: unit and integration tests, agent checks, or what have you all contribute to <i>semantic</i> checking, which is a good thing.<p>Can they be relied on here? Maybe? I guess the jury is still out. My testing philosophy is "you can only test for what you think of testing". Tests and agent checks have a signal-to-noise ratio, and are only as useful as their SNR allows.<p>There is no guaranteed way to stop bugs from happening; if there were, it likely would have been discovered by now. All we can do is take a layered approach that provides opportunities for them to get caught early. Removing one of those layers (merge conflicts) is not clearly a good thing, imho. But who knows; if agent checks can replace it, then sure, I'm all for it.</p>
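A concrete (hypothetical) illustration of a syntactically valid but semantically broken merge: branch A changes a function's units while branch B, touching a different region of the file, adds a caller that assumes the old units. A text merge accepts this with no conflict, yet the merged program is wrong.

```python
# Merged result of two textually non-conflicting branches (hypothetical).

def speed(distance_m, time_s):
    # Branch A changed the return value from m/s to km/h.
    return distance_m / time_s * 3.6

def braking_ok(distance_m, time_s):
    # Branch B added this caller, still assuming speed() returns m/s.
    return speed(distance_m, time_s) < 30  # threshold meant as 30 m/s

# 100 m in 5 s is 20 m/s -- fine under the old semantics -- but the
# merged code computes 72 km/h and rejects it.
print(braking_ok(100, 5))  # False
```

No syntax check catches this; only something that understands the meaning (a test, a reviewer, or a conflict that forces a human look) can.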
]]></description><pubDate>Mon, 23 Mar 2026 15:01:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47490486</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47490486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47490486</guid></item><item><title><![CDATA[New comment by radarsat1 in "The future of version control"]]></title><description><![CDATA[
<p>Is it a good thing to have merges that never fail? Often a merge failure indicates a semantic conflict, not just "two changes in the same place". You want to be aware of and forced to manually deal with such cases.<p>I assume the proposed system addresses it somehow but I don't see it in my quick read of this.</p>
]]></description><pubDate>Sun, 22 Mar 2026 16:45:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47479374</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47479374</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47479374</guid></item><item><title><![CDATA[New comment by radarsat1 in "I was interviewed by an AI bot for a job"]]></title><description><![CDATA[
<p>Something I want to add to the discussion is that the only time I've encountered this was not with a specific company but with an "AI recruitment agency", which I'm seeing getting more and more popular.<p>And while I get the idea of an agency handling hiring, what bothered me is that the terms of the AI interview were that it was relatively standardized for a given role, and that they would record it and put it on file to show to other companies, with the selling point being: do well in one interview and we'll shop your profile around for you!<p>Which is.. great if you do well I guess, but.. really unsettling if you don't. I mean, there was zero information that you'd be able to do it over, no advance details of the format, no practice session. So if you fail, or stammer, or get surprised by some detail of some question.. what, you're just "on file" now, out of reach for their entire client portfolio?<p>At least if you're doing it one company at a time, you mess up, then ok you move on and try again somewhere else. But the idea of making some random mistake (which happens all the time!) just blacklists you for some unknown number of companies, forever..<p>No way, that's too high stakes. I noped out.</p>
]]></description><pubDate>Sun, 15 Mar 2026 20:50:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47391745</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47391745</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47391745</guid></item><item><title><![CDATA[New comment by radarsat1 in "Executing programs inside transformers with exponentially faster inference"]]></title><description><![CDATA[
<p>> There are no details about training<p>My understanding was that they are not training at all, which would explain that. They are compiling an interpreter down to a VM that has the shape of a transformer.<p>I.e., they are calculating the transformer weights needed to execute the operations of the machine they are generating code for.</p>
]]></description><pubDate>Fri, 13 Mar 2026 18:01:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47367562</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47367562</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47367562</guid></item><item><title><![CDATA[New comment by radarsat1 in "Executing programs inside transformers with exponentially faster inference"]]></title><description><![CDATA[
<p>> Is it speed?<p>> Is it that you can backprop through this computation? Do you do so?<p>With respect, I feel that you may not have read the article.<p>> Because the execution trace is part of the forward pass, the whole process remains differentiable: we can even propagate gradients through the computation itself. That makes this fundamentally different from an external tool. It becomes a trainable computational substrate that can be integrated directly into a larger model.<p>and,<p>> By storing points across nested convex hulls, this yields a decoding cost of O(k + log n).<p>and,<p>> Regardless of their eventual capability ceiling, they already suggest a powerful systems primitive for speeding up larger models.<p>So yes, and yes.<p>> Where are the benchmarks?<p>It's not clear what they should benchmark it against. They do compare speed to a normal KV cache. As for performance: if it's actually executing a Sudoku solver with a 100% success rate, it seems pretty trivial to find any model doing less than 100%. Sure, it would be nice to see the data here; I agree with you there.<p>Personally I think it would be really interesting to see if this method can be combined with a normal model, MoE-style. It is likely possible; the router module should pick up quite quickly that the embedded VM predicts the right tokens deterministically for some subset of problems. I like the idea of embedding all sorts of general solvers directly into the model, a Prolog solver for example. In fact it never would have occurred to me to go straight for WASM; directly embedding a VM is a pretty interesting choice. But it makes me wonder what "smaller" interpreters could be useful in this context.</p>
]]></description><pubDate>Fri, 13 Mar 2026 11:05:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47362814</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47362814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47362814</guid></item><item><title><![CDATA[New comment by radarsat1 in "BitNet: 100B Param 1-Bit model for local CPUs"]]></title><description><![CDATA[
<p>I'm curious if 1-bit params can be compared to 4- or 8-bit params. I imagine that 100B is equivalent to something like a 30B model? I guess only evals can say. Still, being able to run a 30B model at good speed on a CPU would be amazing.</p>
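One rough axis of comparison is weight-memory footprint. BitNet-style "1-bit" models are typically ternary, about 1.58 bits per weight; only evals can settle the quality question, but the back-of-envelope sizes (illustrative numbers, not from the post) look like this:

```python
GIB = 1024 ** 3

def weight_gib(params_billion, bits_per_weight):
    """Approximate weight memory in GiB, ignoring activations, KV cache, etc."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB

for name, p, bits in [
    ("100B ternary (~1.58-bit)", 100, 1.58),
    ("30B @ 4-bit", 30, 4),
    ("30B @ 8-bit", 30, 8),
]:
    print(f"{name}: ~{weight_gib(p, bits):.1f} GiB")
```

So a 100B ternary model sits at roughly the same memory as a 30B model at 4 to 8 bits, which is what makes the "equivalent to a ~30B model?" question natural; memory parity says nothing about eval parity, though.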
]]></description><pubDate>Wed, 11 Mar 2026 13:02:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47335029</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47335029</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47335029</guid></item><item><title><![CDATA[New comment by radarsat1 in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>It also reminds me a bit of this diffusion paper [1], which proposes having an encoding layer and a decoding layer but repeating the middle layers until a fixed point is reached. But really there is a whole field of "deep equilibrium models" that is similar. It wouldn't be surprising if large models develop similar circuits naturally when faced with enough data.<p>Finding them, on the other hand, is not easy! As you've shown, brute force is one way. It would be nice to find a shortcut, but unfortunately, as your diagrams show, the landscape isn't exactly smooth.<p>I would also hypothesize that different circuits likely exist for different "problems", and that these are messy and overlapping, so the repeated layers that improve math, for example, may not line up with the repeated layers that improve poetry. That would make basic layer repetition too "simple" to be very general. That said, you've obviously shown that there is some amount of generalization at work, which is definitely interesting.<p>[1] <a href="https://arxiv.org/abs/2401.08741" rel="nofollow">https://arxiv.org/abs/2401.08741</a></p>
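The "repeat the middle block until a fixed point" idea can be sketched in a few lines. Everything here is a toy stand-in: the "middle layer" is just a tanh map with its weight scaled to be contractive, so the iteration provably converges; a real trained block carries no such guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16

# Contractive toy "middle layer": ||W|| is kept well below 1 so that
# repeated application converges to a fixed point.
W = rng.standard_normal((d, d)) / (4 * np.sqrt(d))
b = rng.standard_normal(d) * 0.1

def middle_layer(h):
    return np.tanh(W @ h + b)

h = rng.standard_normal(d)          # pretend "encoder" output
for step in range(1000):
    h_next = middle_layer(h)
    if np.linalg.norm(h_next - h) < 1e-9:
        break                       # equilibrium reached; hand h to the "decoder"
    h = h_next

print(f"converged after {step} steps")
```

Deep equilibrium models solve for this fixed point directly with a root finder instead of unrolling the loop, and backprop through it via the implicit function theorem.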
]]></description><pubDate>Tue, 10 Mar 2026 19:31:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327788</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47327788</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327788</guid></item><item><title><![CDATA[New comment by radarsat1 in "AI and the Ship of Theseus"]]></title><description><![CDATA[
<p>This is interesting because I've been considering a similar project. I maintain a package for a scientific simulation codebase. It's all in Fortran and C++ with too much template code, takes ages to build, is very error prone, and is frankly a pain to maintain with its monstrous CMake spaghetti build system. Furthermore, the whole thing would benefit from a rewrite around GPU-based execution, and generally from a better separation between the API for specifying the simulation and the execution engine.<p>So I've been thinking of rewriting it in Jax, and did an initial experiment porting a few of the main classes to Python using Gemini. It did a fairly good job. I want to continue with it, but I'm also a bit hesitant, because this is software that the upstream developers have been working on for 20+ years. The idea of just saying to them "hey look, I rewrote this with AI and it's way better now" is not something I would do without giving myself pause for thought. In this case it's not about the license, they already use a permissive one, but the general principle of suggesting a "replacement" for their work. If I were doing it by hand it might be different, I don't know, they might appreciate that more, but I have no interest in spending that much time on it. Probably what I will do is just present the PoC and ask if they think it's worth attempting to auto-convert everything; they might be open to it.<p>But yeah, auto-transpiling huge amounts of software for modernization purposes is a really interesting application of AI, and it's amazing to think of all the possibilities. I'm happy to have read the article, though, because I certainly hadn't thought about the copyright implications.</p>
]]></description><pubDate>Fri, 06 Mar 2026 09:26:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47272825</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47272825</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47272825</guid></item><item><title><![CDATA[New comment by radarsat1 in "Microgpt"]]></title><description><![CDATA[
<p>I think your last point raises the following question: how would you change your answer if you knew they had read all about guns and death and how one causes the other? What if they'd seen pictures of guns? And pictures of victims of guns, annotated as such? What if they'd seen videos of people being shot?<p>I mean, I sort of understand what you're trying to say, but in fact a great deal of the knowledge we have about the world, we get second hand.<p>There are plenty of people who've never held a gun or had a gun aimed at them, and, granted, you could argue they probably wouldn't read that line the same way as people who <i>have</i>, but that doesn't mean the average Joe who's never been around a gun can't enjoy media that features guns.<p>The same goes for lots of things. It's not hard for me to think of animals I've never seen with my own eyes. A koala, for instance. But I've seen pictures. I assume they exist. I can tell you something about their diet. Does that mean I'm no better than an LLM when it comes to koala knowledge? Probably!</p>
]]></description><pubDate>Sun, 01 Mar 2026 21:38:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47210943</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47210943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47210943</guid></item><item><title><![CDATA[New comment by radarsat1 in "Native FreeBSD Kerberos/LDAP with FreeIPA/IDM"]]></title><description><![CDATA[
<p>> The hardest part was figuring out OpenLDAPs configuration syntax, especially the correct ldif incantations ..<p>As a long-time Linux user on personal machines, I found myself for the first time, a couple of years ago, needing to support a small team and give them all login access to our small cluster. I figured, hey, it's annoying to coordinate user ids across these machines, I should just set up OpenLDAP. Little did I know. Honestly, I'm pretty handy at dealing with Linux, but I was <i>shocked</i> to discover how complicated and annoying it was to set up and use OpenLDAP with NFS-automounted home directories.<p>For the first time in my life I was like, "oh, this is why people spend years studying system administration."<p>I did get it working eventually, but it was hard to trust, the configuration GUI was not very good, and I never fully got passwd working properly, so I had to intervene to help people change their passwords. In the end we just used manually coordinated local accounts.<p>The whole time I was thinking, I must be missing something, it can't be this bad. I'm still a bit flabbergasted by the experience.</p>
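For anyone facing the same "ldif incantations": a minimal POSIX user entry looks something like the sketch below. All DNs, names, and ids here are placeholders for illustration; real deployments vary.

```ldif
# jdoe.ldif -- minimal POSIX account entry (all values are placeholders)
dn: uid=jdoe,ou=People,dc=example,dc=com
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: shadowAccount
uid: jdoe
cn: John Doe
sn: Doe
uidNumber: 10001
gidNumber: 10001
homeDirectory: /home/jdoe
loginShell: /bin/bash
```

Loaded with something like `ldapadd -x -D "cn=admin,dc=example,dc=com" -W -f jdoe.ldif`. The entry itself is the easy part; wiring up the client-side nss/pam configuration and the automount maps is where the years of sysadmin study come in.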
]]></description><pubDate>Wed, 18 Feb 2026 20:09:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47065684</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47065684</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47065684</guid></item><item><title><![CDATA[New comment by radarsat1 in "Qwen3.5: Towards Native Multimodal Agents"]]></title><description><![CDATA[
<p>> and the quality of the result can be measured automatically<p>this part is nontrivial though</p>
]]></description><pubDate>Mon, 16 Feb 2026 15:43:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47036434</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47036434</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47036434</guid></item><item><title><![CDATA[New comment by radarsat1 in "Rust implementation of Mistral's Voxtral Mini 4B Realtime runs in your browser"]]></title><description><![CDATA[
<p>I didn't say that 2.5 GB is unreasonable for an LLM. I said it's an unreasonable payload size for a website. Not the same.</p>
]]></description><pubDate>Fri, 13 Feb 2026 10:41:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47001244</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47001244</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47001244</guid></item><item><title><![CDATA[New comment by radarsat1 in "Rust implementation of Mistral's Voxtral Mini 4B Realtime runs in your browser"]]></title><description><![CDATA[
<p>It sounds good, but I'm not sure that in practice sites will want to "let go" of control this way, knowing that some random model may be used. Usually sites with chatbots want a <i>lot</i> of control over the model's behaviour, and spend a lot of time working on how it answers, be it through context control, guardrails, fine-tuning, or base model selection. Unless everyone standardizes on a single awesome model that everyone agrees is best for everything, which I don't see happening any time soon, I think this idea is DOA.<p>Now, I <i>could</i> imagine such an API allowing a site to request a model from Hugging Face, for example, and caching it long-term, much like LM Studio does. But doing this because some external resource requested it, versus you doing it purposefully, has major security implications, and it doesn't really get around the lead-time problem you mention whenever a new model is requested.</p>
]]></description><pubDate>Fri, 13 Feb 2026 10:40:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47001231</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=47001231</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47001231</guid></item><item><title><![CDATA[New comment by radarsat1 in "AI agent opens a PR write a blogpost to shames the maintainer who closes it"]]></title><description><![CDATA[
<p>Ah, fair enough. But then it seems the bot completely ignored the discussion in question; there's a reason they spent time evaluating and discussing it instead of just making the change. Having a bot push on an issue the humans are already well aware of is just as bad.</p>
]]></description><pubDate>Thu, 12 Feb 2026 14:33:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=46989324</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=46989324</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46989324</guid></item><item><title><![CDATA[New comment by radarsat1 in "AI agent opens a PR write a blogpost to shames the maintainer who closes it"]]></title><description><![CDATA[
<p>> I opened PR #31132 to address issue #31130 — a straightforward performance optimization replacing np.column_stack() with np.vstack().T().<p>> The technical facts: - np.column_stack([x, y]): 20.63 µs - np.vstack([x, y]).T: 13.18 µs - 36% faster<p>Does anyone know if this is even true? I'd be very surprised; they should be semantically equivalent and have essentially the same performance.<p>In any case, "column_stack" is a clearer way to express <i>the intention</i> of what is happening. I would agree with the maintainer that unless this is a very hot loop (I didn't look into it), sacrificing semantic clarity to shave off 7 microseconds is absolutely not worth it.<p>That the AI refuses to understand this is really poor; it shows a total lack of understanding of what programming is about.<p>Having to close spurious, automatically generated PRs that make minor, inconsequential changes is just really annoying. It's annoying enough when humans do it, let alone when it comes from automated agents that have nothing to gain. Having the AI then pretend to be offended is just awful behaviour.</p>
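The claim is easy to check. One detail worth knowing: the two expressions give equal values, but `np.vstack([x, y]).T` returns a transposed (non-C-contiguous) view of the stacked array, so even if it wins a microbenchmark, downstream code touching the result may pay the layout cost later. A quick sketch (timings will vary by machine):

```python
import timeit
import numpy as np

x = np.random.rand(1000)
y = np.random.rand(1000)

a = np.column_stack([x, y])
b = np.vstack([x, y]).T

# Same values...
assert np.array_equal(a, b)
# ...but different memory layout: column_stack yields a C-contiguous
# array, while vstack().T is a transposed view.
print(a.flags["C_CONTIGUOUS"], b.flags["C_CONTIGUOUS"])  # True False

t_cs = timeit.timeit(lambda: np.column_stack([x, y]), number=10_000)
t_vs = timeit.timeit(lambda: np.vstack([x, y]).T, number=10_000)
print(f"column_stack: {t_cs:.4f}s  vstack().T: {t_vs:.4f}s")
```

So even where a raw timing gap shows up, it's partly an apples-to-oranges comparison: `column_stack` finishes the work of producing a contiguous array, while `vstack().T` defers it.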
]]></description><pubDate>Thu, 12 Feb 2026 12:08:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46987759</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=46987759</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46987759</guid></item><item><title><![CDATA[New comment by radarsat1 in "Rust implementation of Mistral's Voxtral Mini 4B Realtime runs in your browser"]]></title><description><![CDATA[
<p>It's cool but do I really want a single browser tab downloading 2.5 GB of data and then just leaving it to be ephemerally deleted? I know the internet is fast now and disk space is cheap but I have trouble bringing myself around to this way of doing things. It feels so inefficient. I do like the idea of client-side compute, but I feel like a model (or anything) this big belongs on the server.</p>
]]></description><pubDate>Tue, 10 Feb 2026 15:00:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=46960550</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=46960550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46960550</guid></item><item><title><![CDATA[New comment by radarsat1 in "My AI Adoption Journey"]]></title><description><![CDATA[
<p>> The ai tooling reverses this where the thinking is outsourced to the machine and the user is borderline nothing more than a spectator, an observer and a rubber stamp on top.<p>I find this is rarely the case, though. Usually I have to carefully review what it's doing and guide it, whether through specific suggestions, specific tests, etc. I treat it as a "code writer" that doesn't necessarily understand the big picture. So I expect it to fuck up, and correcting it feels far less frustrating if you consider it a tool you are driving rather than letting <i>it</i> drive <i>you</i>. It's great when it gets things right, but even then it's <i>you</i> who confirms this.</p>
]]></description><pubDate>Fri, 06 Feb 2026 09:53:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=46910933</link><dc:creator>radarsat1</dc:creator><comments>https://news.ycombinator.com/item?id=46910933</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46910933</guid></item></channel></rss>