Hacker News: 1dom

New comment by 1dom in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"

1dom — Wed, 20 May 2026 17:38:53 +0000

Hi! Thanks for the response. Like I mentioned, I only skimmed, and it sounds like there's more to it than I understand, so I'll take a deeper look and see how it feels in practice.

> Where timing gets interesting is that forge will slow down workflows because the retries mean you don't error right away. Bare runs were failing fast in my experience. But on a per-call basis there's very little overhead.

> I haven't detailed it simply because the order of magnitude of a single LLM call is so much higher than all the overhead put together.

Yeah, that makes sense and seems fair. Those sorts of delays are almost an inevitability: you're not trying to improve speed, but by improving reliability, it can obviously increase overall throughput.
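To put rough numbers on that trade-off, here's a minimal sketch. Everything in it is an assumption for illustration: the 90% per-call success rate, the 5-step workflow and the single retry per step are made up rather than taken from Forge's measurements, and `run_stats` is a hypothetical helper, not part of Forge.

```python
def run_stats(p_step: float, n_steps: int, retries: int) -> tuple[float, float]:
    """Return (probability a workflow completes, expected LLM calls per run).

    Assumes each call succeeds independently with probability p_step, and
    that a run is abandoned as soon as one step uses up all its retries.
    """
    attempts = retries + 1
    p_fail = 1.0 - p_step
    step_ok = 1.0 - p_fail ** attempts
    # Attempt k of a step only happens if the k attempts before it failed.
    calls_per_step = sum(p_fail ** k for k in range(attempts))
    # Step i is only reached if all i steps before it succeeded.
    expected_calls = sum(step_ok ** i * calls_per_step for i in range(n_steps))
    return step_ok ** n_steps, expected_calls


# Hypothetical numbers: 5-step workflow, 90% per-call success.
bare = run_stats(0.9, 5, retries=0)     # fail fast
guarded = run_stats(0.9, 5, retries=1)  # one retry per step

for name, (p_done, calls) in (("bare", bare), ("guarded", guarded)):
    print(f"{name}: completes {p_done:.1%}, {calls:.2f} calls/run, "
          f"{p_done / calls:.3f} completed runs per call")
```

Under those assumptions the guarded run makes more calls per run on average, so it is slower, but it finishes more workflows per call spent, which is the throughput point.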

Having watched the demo video too now, automating retries etc would be helpful for me. It's impressive to see how quick the models run on better hardware, and the performance improvements are impressive, even if the overall run takes longer sometimes because it does more correct things. Thanks again!

New comment by 1dom in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"

1dom — Wed, 20 May 2026 15:42:33 +0000

I agree accuracy maybe isn't the best word here. I used it as it was used in the original post, mainly as a catchall for "everything but speed", so fidelity, perplexity, etc.

I also agree that if I spent more time using cloud based LLMs, I would very much find local LLMs less capable and useful. Comparison is the thief of joy though, and I'd rather feel blissful ignorance towards SOTA LLMs than a dependence on them.

Before taking a local focus approach, LLMs increasingly left me feeling a mixture of FOMO, sadness and futility towards the future of software and tech. I assume it's 100% a me problem, but it has its benefits :)

New comment by 1dom in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"

1dom — Wed, 20 May 2026 11:52:05 +0000

Yup, confirming what pamcake said, 30b with 3b active.

I have a laptop with a broken screen and an RTX2060 at my disposal. I can run 12b - 14b dense usably, just, although I think 4b - 8b dense models give me the best tradeoff of speed and usefulness.

Larger MOE models with more parameters (20b+) but fewer active (2 - 3b) are sometimes a little bit slower, but are often far more capable.
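A back-of-the-envelope sketch of why that happens, under stated assumptions: weights take memory in proportion to total parameters, while per-token compute is roughly 2 FLOPs per active parameter. The 4-bit quantization is my assumption, `footprint` is a hypothetical helper, and this ignores attention cost, KV cache and any CPU offloading (which is plausibly where the "a little bit slower" comes from when the weights don't fit in VRAM).

```python
def footprint(total_params: float, active_params: float,
              bits_per_weight: float = 4.0) -> tuple[float, float]:
    """Return (weight size in GB, rough FLOPs per generated token)."""
    weights_gb = total_params * bits_per_weight / 8 / 1e9
    # Rule of thumb: one multiply and one add per active weight per token.
    flops_per_token = 2 * active_params
    return weights_gb, flops_per_token


dense_8b = footprint(total_params=8e9, active_params=8e9)
moe_30b_a3b = footprint(total_params=30e9, active_params=3e9)

for name, (gb, flops) in (("8b dense", dense_8b), ("30b-a3b MoE", moe_30b_a3b)):
    print(f"{name}: ~{gb:.1f} GB of weights, ~{flops:.1e} FLOPs/token")
```

So the MoE needs a lot more memory for its weights but does less arithmetic per token than the 8b dense model, which fits with it being more capable without being dramatically slower.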

New comment by 1dom in "I believe there are entire companies right now under AI psychosis"

1dom — Wed, 20 May 2026 08:20:36 +0000

Good point. I guess I feel they're still getting into position there, and haven't really had their opportunity to blossom yet. The average western citizen's experience of war is still just slightly increased food and fuel prices.

New comment by 1dom in "Ploopy Bean: a trackpoint for every computer"

1dom — Wed, 20 May 2026 08:15:11 +0000

That's a good question and made me think. I always thought the trackpoint nubs were binary too, basically just a stick in the middle of up/down/left/right mousekey buttons, but it turns out they're not!

For original trackpoints, it's basically a stick in the middle of an up/down/left/right resistive strain gauge.

For the ploopy beans here, they use hall effect sensors instead of resistive strain to get a bit more movement.

As soon as you have non-binary up/down/left/right values, the mouse direction and speed can be interpolated to so many values that mousekey accidental squares become impossible.
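A minimal sketch of the difference, not taken from any real firmware: `mousekey_vector`, `analog_vector`, the deadzone and the max speed are all hypothetical names and values I've picked for illustration.

```python
import math
from itertools import product


def mousekey_vector(up: bool, down: bool, left: bool, right: bool) -> tuple[int, int]:
    """Binary keys: each axis can only be -1, 0 or +1."""
    return (int(right) - int(left), int(down) - int(up))


def analog_vector(x: float, y: float, max_speed: float = 10.0,
                  deadzone: float = 0.05) -> tuple[float, float]:
    """Analog stick: x and y are deflections in [-1, 1].

    Direction follows the deflection exactly, and speed scales with how
    hard the stick is pushed (clamped to max_speed).
    """
    magnitude = math.hypot(x, y)
    if magnitude < deadzone:
        return (0.0, 0.0)
    scale = max_speed * min(magnitude, 1.0) / magnitude
    return (x * scale, y * scale)


# Every direction the binary keys can ever produce, excluding "not moving".
key_directions = {
    mousekey_vector(*keys) for keys in product([False, True], repeat=4)
} - {(0, 0)}

print(len(key_directions), "possible mouse key directions")
print("analog example:", analog_vector(0.3, 0.1))
```

With only horizontal, vertical and diagonal moves available, hitting a small target means walking a staircase around it. With analog values you can just point at it.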

New comment by 1dom in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"

1dom — Wed, 20 May 2026 08:03:29 +0000

I like work in this area, and this is really helpful, thanks. I actively avoid cloud based LLMs and mainly use 4b - 30a3b param local models. This means I don't really have a good grasp of SOTA LLM performance or accuracy, but I know what to expect when dealing with local models, and where the pain points are.

I've only skimmed the post and read the abstract and in some places you make a nod to how simple tweaks can make something 10x faster/slower, but then all of your metrics and data seem to focus 100% on accuracy. You need to address speed.

Specifically for agentic workflows and local models, accuracy around function/tool calling hasn't been a problem for me now for about 6 - 12 months, personally, since around QwenCoder3. The main issue is context management and the impact on timing, since agents will often swap prompts and break prompt caching and similar timing improvements.
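As a toy sketch of the caching issue, assuming a simple prefix cache where only the longest shared run of leading tokens can be reused: real inference servers are more involved than this, the whitespace "tokens" are a stand-in for a real tokenizer, and the helper names and prompts are hypothetical.

```python
def shared_prefix_len(cached: list[str], new: list[str]) -> int:
    """Number of leading tokens the new prompt shares with the cached one."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def tokens_to_recompute(cached: list[str], new: list[str]) -> int:
    """Tokens that must be processed again because they fall outside the shared prefix."""
    return len(new) - shared_prefix_len(cached, new)


system_a = "write code for the task".split()
system_b = "plan the next step".split()
history = "user asked to fix the failing test".split()

cached = system_a + history
appended = system_a + history + ["tool", "result"]  # same agent, one more turn
swapped = system_b + history + ["tool", "result"]   # agent swaps its system prompt

print("appended turn recomputes", tokens_to_recompute(cached, appended), "tokens")
print("swapped prompt recomputes", tokens_to_recompute(cached, swapped), "tokens")
```

Appending to the same conversation only costs the new tokens, while swapping the prompt at the front throws away the whole shared prefix, so the full context gets processed again. On slow local hardware that prompt processing is where the time goes.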

It looks like your work adds layers and wrappers like guardrails and retries. This would make my local model experience - specifically for agents - unusable because of the delays it would add.

I really appreciate and respect the work you've done, and apologies if you have already addressed this head on, but with so little talk about the impact on timing here, I feel like you're hiding something or overinflating the actual real world improvements here - what are your thoughts?

It's also mildly concerning to me that nobody else has raised this - am I doing something wrong here, or is everyone else just not actually using local models in real life?! Talk to me about your speed experiences!

New comment by 1dom in "Incident Report: Railway Blocked by Google Cloud [resolved]"

1dom — Wed, 20 May 2026 07:06:03 +0000

Personally, I don't see this as people punching someone who's down. This is the sort of real life experience and necessary context from actual technical users that I come to HN comments for.

Someone is just asking to get Google's side and explaining why they want that, which seems reasonable since we're in a post where Google is being punched/blamed for this, and it sounds like it isn't Railway's first questionable outage.

New comment by 1dom in "Where Are the Vibecoded Photoshops?"

1dom — Mon, 18 May 2026 11:07:58 +0000

You seem to be arguing that vibecoding photoshop wasn't possible up until 2 months ago, with GPT 5.4/5.5.

That's a very, very weird take on many, many levels. Could you elaborate a bit about where that view came from, how often you use AI, what's your career etc.?

New comment by 1dom in "Fecal transplants for autism deliver success in clinical trials (2019)"

1dom — Sat, 16 May 2026 16:44:11 +0000

No embarrassment, I thought the same. Feels like a far more sensible thought than fecal transplant via the mouth, for many reasons. Vindicating edit!

New comment by 1dom in "I believe there are entire companies right now under AI psychosis"

1dom — Sat, 16 May 2026 16:13:47 +0000

Paypal Mafia -> Crypto Mafia -> AI/LLM Mafia -> I'm calling Biotech/BCI Mafia, WW3 Military Industrial Mafia, or Energy Mafia next.

New comment by 1dom in "Ploopy Bean: a trackpoint for every computer"

1dom — Sat, 16 May 2026 14:12:52 +0000

I went the opposite way: I started with UHK, then went for a ZSA moonlander, but settled on a kbdcraft Israfel, which is a relatively cheap, split ortholinear.

I felt most of the extra functionality and polish that I guess makes up the massive costs of UHK and ZSA wasn't actually necessary. It was cool and fun and useful to try a bunch of different stuff, but then over time, I wanted things to be simple and small which UHK and ZSA Moonlander aren't (ZSA voyager wasn't at the time).

All I'm saying is if you've got comfortable with a cheap Corne, I think you might feel underwhelmed if you spend a lot on something a lot fancier.

New comment by 1dom in "Ploopy Bean: a trackpoint for every computer"

1dom — Sat, 16 May 2026 14:06:57 +0000

I had a UHK for a few months before refunding it because the shielding wasn't good enough to stop it becoming unreliable when my mobile phone was within about 30cm of my keyboard. I contacted support and the solution was to move the phone away from the keyboard, which is kind of irritating for such an expensive piece of kit.

But, just wanted to share that I was similarly surprised to land on mouse keys as a preference. I tried most of the UHK modules which were also pretty good and have since tried various other trackballs and pads, but since trying UHK mouse keys, they're what I keep coming back to most, even since switching to new keyboards.

One issue I have with mouse keys is fear of using them in front of others though: every so often, if I need to click something particularly small and don't have a keyboard shortcut memorised (vscode panel resizing is one) it can sometimes take me a fair few embarrassing seconds drawing small squares around my target before I resort to actual mouse hardware.

For the amount of time and thought and effort people have put into alternative mice, I feel mouse keys are massively overlooked and probably have a lot of room for software/firmware innovation without hardware costs.

New comment by 1dom in "LFM2-24B-A2B: Scaling Up the LFM2 Architecture"

1dom — Sat, 02 May 2026 08:50:58 +0000

I think context length is important to consider here.

I find Gemmas really good for a short conversation with maybe 3 or 4 exchanges of a few paragraphs each, which covers a surprisingly large amount of interactions.

For anything longer form though, particularly with larger code contexts, Qwen is far more useful for me personally.

I'm not an expert in this field, but my understanding is that Qwen uses a hybrid gated attention mechanism, whereas Gemma is a hybrid that includes a sliding window attention mechanism, which makes it look like it favours the most recent tokens a little too much at times.

This is all in the context of local quantized models, I'm aware both have larger cloud variants that wouldn't suffer as much.

New comment by 1dom in "American Dads Became the Parents Their Fathers Never Were"

1dom — Fri, 01 May 2026 17:49:10 +0000

You're not stating your position clearly - instead you're "just asking questions", and it's rubbing some people up the wrong way (including me, sorry). It looks like you're not apologising for that because "it wasn't your intention".

If you're sincerely trying to engage in good faith, I feel you should be apologising for your role in sending it in the wrong direction unintentionally.

To be clear, I'm not taking a position in the debate here, just commenting that the way you're engaging is legitimately a bit annoying, in case you're not aware. The other person getting really angry isn't the best look either, but I'm sure they already know that.

New comment by 1dom in "I am building a cloud"

1dom — Thu, 23 Apr 2026 07:59:24 +0000

I think this comment and replies capture the problem with Kubernetes. Nobody gets fired for choosing Kubernetes now.

It's obvious to you, me and the other 2 presumably techie people who've responded within 15 mins that you shouldn't have been using Kubernetes. But you probably work in a company full of techie people, who ended up using Kubernetes.

We have HN, an environment full of techie people here who immediately recognise not to use k8s in 99% of cases, yet in actually paid professional environments, in 99% of cases, the same techie people will tolerate, support and converge on the idea they should use k8s.

I feel like there's an element of the emperor's new clothes here.

New comment by 1dom in "Qwen3.6-Max-Preview: Smarter, Sharper, Still Evolving"

1dom — Tue, 21 Apr 2026 07:40:48 +0000

It's all openwashing. All of the ones you listed have at some point expressed how important and valuable open weights and locally usable models are. Every single one of them has then increasingly focused on and pushed closed, proprietary or cloud-only options since saying/doing that.

I'm annoyed at myself, because I thought/hoped/praised Chinese AI when they were opening up as Llama was closing, but Qwen looks to be following the same playbook here as Llama/Meta, Gemma/Google and OpenAI/gpt-oss.

New comment by 1dom in "Claude Opus 4.7"

1dom — Fri, 17 Apr 2026 11:17:47 +0000

The issue is business and transparency. Transparency is often in the customer's interest at the individual business's expense.

There are very, very few things that can be completely transparent without giving competitors an advantage. The nice solution to this is to be better and faster than your competitors, but sometimes it's easier just to remove transparency.

New comment by 1dom in "Most people can't juggle one ball"

1dom — Mon, 13 Apr 2026 12:09:19 +0000

Juggling is something I've done on and off all my life and every time I ever see any writing about it, it doesn't at all capture how juggling fits with me. Every time I read about it, I feel like I'm doing something wrong. I know I'm not, but I thought I'd share my views in case it helps someone want to pick up juggling in a different way.

Most writing on juggling talks about moves and repetition. You'll see the same terminology and approaches everywhere: site swap notation, cascades, messes, 1 hand, 2 hand, claws etc.

This can sometimes give the impression that when you're juggling, you're basically doing a move, or repeating a motion, or following a script, like being able to perform the specific move is the skill and achievement itself, but it's really, really not.

I think it's better to think of the balls as instruments. When you're reading and learning about juggling moves like cascades and messes and clawing, you're reading about chords on that instrument. If all you know is how to play individual chords, and the names of the chords and theoretical and technical ways to represent the chords, you're missing the point of knowing how to play an instrument.

Most of the time when I'm juggling, I'm in the same creative and expressive mindset and headspace as I am if I'm soloing on drums. Every move is a different move between a bunch of different moves based on wherever my mood takes me. It feels good and allegedly looks impressive too.

I've tried to learn keyboard/piano so, so, so many times and it just doesn't stick with me, I struggle to retain the theory, I can't get the intuition, I never find myself in the zone after years and years of attempts and lessons. Drumming was the opposite though, I felt comfortable on it within months of having my kit with very, very little theoretical background. I consider myself a theoretical person, but juggling - like drumming - seems to be one of those few things for me where the theory makes it worse/harder/less intuitive/less fun for me.

New comment by 1dom in "I ran Gemma 4 as a local model in Codex CLI"

1dom — Mon, 13 Apr 2026 11:38:06 +0000

Sorry, I somehow didn't see the comment above yours, but it makes a lot more sense now.

The sentiment still applies to the parent comment of yours though.

New comment by 1dom in "I ran Gemma 4 as a local model in Codex CLI"

1dom — Mon, 13 Apr 2026 11:03:23 +0000

I feel that's a little bit misleading.

That link doesn't have much affiliation with Qwen or anyone who produces/trained the Qwen models. That doesn't mean it's not good or safe, but it seems quite subjective to suggest it's the latest or greatest Qwen iteration.

I can see huggingface turning into the same poisoned watering-hole as NPM if people fall into the same habits of dropping links and context like that.