<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: thot_experiment</title><link>https://news.ycombinator.com/user?id=thot_experiment</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 13 May 2026 15:10:10 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=thot_experiment" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by thot_experiment in "Restore full BambuNetwork support for Bambu Lab printers"]]></title><description><![CDATA[
<p>idk, my 10 year old makerbot 2 has been pretty reliable. Ever since Prusa Slicer came out and I tuned a profile for it maybe 6 years ago, it's been spitting out quick, dimensionally accurate prints. I use it all the time, probably go through a spool every month or two, and all I've had to replace is the extruder cooling fan, once.<p>I'm mostly printing small mechanical parts and I can't say I have any complaints. I assume a modern Prusa would be much better, and surely there are other FDM printers that are good?</p>
]]></description><pubDate>Tue, 12 May 2026 23:34:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48115971</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48115971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48115971</guid></item><item><title><![CDATA[New comment by thot_experiment in "GM just laid off IT workers to hire those with stronger AI skills"]]></title><description><![CDATA[
<p>I can't tell if this is sarcastic or not, but it seems insane to let the AI write the tests.<p>AI can't be held accountable, so it shouldn't be writing the tests that determine whether car systems function correctly.</p>
]]></description><pubDate>Tue, 12 May 2026 00:15:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48102520</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48102520</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48102520</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Yo, MTP for Qwen is sick, thank you! Your work is invaluable.</p>
]]></description><pubDate>Mon, 11 May 2026 18:09:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=48098492</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48098492</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48098492</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Also, you can feed it ALL of your data willy-nilly without ever worrying about safety, because you can just do it with the LAN cable unplugged. For applications that demand data hygiene, it's a cheat code that guarantees safety without any sort of data sanitization.</p>
]]></description><pubDate>Mon, 11 May 2026 18:06:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48098441</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48098441</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48098441</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Yeah, thanks, though I think local models are at least a Cessna, which while being nothing like an F-35 <i>can</i> fly.</p>
]]></description><pubDate>Mon, 11 May 2026 18:02:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48098383</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48098383</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48098383</guid></item><item><title><![CDATA[New comment by thot_experiment in "Running local models on an M4 with 24GB memory"]]></title><description><![CDATA[
<p>Maybe a skill issue, but they both feel about the same, and the MoE is 3x faster, so I barely use the dense model.</p>
]]></description><pubDate>Mon, 11 May 2026 17:59:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48098349</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48098349</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48098349</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>This is probably a precision thing; I think there's a really big difference between q4 and q6 on long-running tasks.</p>
]]></description><pubDate>Mon, 11 May 2026 17:58:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48098331</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48098331</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48098331</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Sorry, "essentially useless in the context of local model availability". It's a fine model, but its tier of inference is fully fungible.</p>
]]></description><pubDate>Mon, 11 May 2026 09:23:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092799</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092799</guid></item><item><title><![CDATA[New comment by thot_experiment in "I'm going back to writing code by hand"]]></title><description><![CDATA[
<p>It's more complex than that. I think the reality is that there's a lot of code that's just not that deep, bro. I have some purely personal projects with components that I don't understand anymore; I wrote that shit by hand, they still work, but I haven't touched them in years. There's a lot of code AI can write that's like that, and it helps me: the stuff I would forget about even if I wrote it by hand. I think you have to have discipline in its use; it's a tool like any other.<p>AI, and especially agentic AI, can make you lose situational awareness over a codebase, and when you're doing deep work that SUUUUCKS. But it's not useless, you just have to play to its strengths. My favorite hill to die on is telling people not to underestimate its value as autocomplete. Turns out 40 gigabytes of autocomplete makes for a fucking amazing autocomplete. Try it with llama.vim + Qwen Coder 30B; it feels like the editor is reading your mind sometimes, and the latency is so low.</p>
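<p>For anyone who wants to try it, the setup is roughly the following. The exact model repo and flags are from memory, so treat them as an assumption rather than a verified recipe:

```shell
# Serve a Qwen coder GGUF locally for fill-in-the-middle completion.
# llama.vim talks to llama-server's /infill endpoint (port 8012 by default).
# The -hf repo below is an assumption -- substitute whichever Qwen coder
# quant you actually have.
llama-server \
  -hf ggml-org/Qwen2.5-Coder-7B-GGUF \
  --port 8012 -ngl 99 -c 8192

# Then in your vimrc, llama.vim's endpoint setting (this is its default,
# shown here for clarity):
# let g:llama_config = { 'endpoint': 'http://127.0.0.1:8012/infill' }
```

The point is just that the whole thing is one local server plus one plugin setting; there's no cloud round-trip, which is where the low latency comes from.</p>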
]]></description><pubDate>Mon, 11 May 2026 08:55:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092608</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092608</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092608</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Overall, using screen time as the metric (derived from some imperfect logging and vibes), it's about 50% OpenCode, 15% Continue, 15% my homebrew bullshit, 13% Claude Code, and 7% Cline. I've been deep on agentic stuff lately (1.3wks aka 3 months of AI time); there are only so many hours in the day to duplicate work and A/B test. In the past I've sworn by Qwen Coder + llama.vim, and I still enjoy that workflow for deep work far more than I like prompting agents, but there's a lot of dross I'm learning to delegate.</p>
]]></description><pubDate>Mon, 11 May 2026 08:45:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092563</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092563</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092563</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>No, it isn't. I am saying that the set of tasks that can be completed by Opus 4.7 has a surprisingly large overlap with the set of tasks that can be completed by Gemma 31B. It is meaningfully equivalent in many cases.<p>(Of course, if I'm being honest, 640KB is fine; I'm sure tons of the world's commerce is handled by less. The delta between a system with 640KB of RAM and a modern one is near nil for many people: the UX on a PoS terminal doesn't require more than that, for example, and the Hacker News UX could be roughly the same.)</p>
]]></description><pubDate>Mon, 11 May 2026 08:00:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092265</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092265</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092265</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Benchmarks only give you the roughest idea of how models compare in real-world use. They're essentially useless beyond maybe classifying models into a few buckets. The only way you gain an understanding of something as complex as how an LLM integrates with your workflow is by doing it and measuring across many trials. I've been running Opus 4.7 in Claude Code and Gemma 4 31b in parallel on projects for hours a day this past week. Opus 4.7 is <i>definitely</i> better, but for many things they are roughly equivalent: there are some things on the edge that are just up to chance, where either model may stumble across the solution, and there are some areas of my work that reliably trip up both models, where I get better mileage out of writing code the old-fashioned way. I understand that I'm just one data point, but I'm not writing CRUD apps here; I'm doing DSP and weird color math in shaders. I don't think any of it is hard, and the stuff that I think is hard none of the models are good at yet, but idk, they just don't seem that extremely disparate from one another.<p>FWIW I think Gemma 4 31b is more likely to be of use to me than Sonnet. idfk, maybe it's a skill issue, but I love Opus 4.7, undisputed king, while Sonnet seems borderline useless and I basically think of it as on the same level as Qwen 35b MoE.</p>
]]></description><pubDate>Mon, 11 May 2026 07:47:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092170</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092170</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092170</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>I 100% agree with your philosophy, but I wanna note that I genuinely find Gemma 4 31b to be better than Sonnet. To be clear, this makes <i>NO</i> sense to me, so I'm probably just high and making stuff up, or just biased by a small sample size since I don't use Sonnet that often. I find that Gemma 4 makes the sort of "dumb AI" mistakes less often than Sonnet does, especially in agentic mode. I genuinely don't know how that can be true, but Sonnet feels much more like "autocomplete" and Gemma 4 feels like "some facsimile of thought".</p>
]]></description><pubDate>Mon, 11 May 2026 07:36:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092096</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092096</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092096</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>False. The absolute capability is irrelevant; with the proper harness, 31b is more than adequate for a very large portion of the tasks I ask AI to do. The metric isn't how good the model is at Erdos problems, it's how reliably it can remove drudgery from my life. It just autonomously reverse-engineered a Bluetooth protocol with minimal intervention, and its ability to react to data and ground itself constantly impresses me. I do a ton of testing with these models; today I had Gemma answer a physics problem that Opus 4.7 gave up on. With a decent harness and context, the set of tasks where both of their capabilities are good enough is surprisingly large. The tasks I have that stump Gemma often also stump Opus 4.7.</p>
]]></description><pubDate>Mon, 11 May 2026 07:27:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=48092033</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48092033</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48092033</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>No, exactly the opposite actually. Qwen 3.6 is too imprecise for long-running agentic tasks; it doesn't have the same ability to check itself as Gemma does in my testing. I keep Qwen MoE in VRAM by default because there are tons of tasks I trust it to oneshot and its 90 tok/sec is unparalleled, but for anything where I don't want to have to intervene too much, it can't be trusted.</p>
]]></description><pubDate>Mon, 11 May 2026 07:17:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091979</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48091979</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091979</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Flat wrong. Q6 Gemma 31b feels a lot like Opus 4.5 to me when run in a harness so it can retrieve information and ground itself. The gap is not that big for a lot of use cases, and Qwen MoE is fast as fuck locally for things that are oneshottable. I have subscriptions to all the major providers right now, and since Gemma 4 and Qwen 3.6 came out I haven't hit limits a single time. I'm actually super surprised by the number of things I try with Gemma 4 with the intent of seeing how it fails, and then having Claude do it, only to come away with something perfectly usable from the local model.</p>
]]></description><pubDate>Mon, 11 May 2026 05:47:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091424</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48091424</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091424</guid></item><item><title><![CDATA[New comment by thot_experiment in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>Depending on your laptop: if it's a Strix Halo or a MacBook with a decent amount of RAM, that day arrived about 6 months ago, and <i>today</i>, if you can run Gemma 31b, you're golden for your basic workslop code. You can do most of it with local models. Heck, for a lot of the tier of programming you might encounter in the average job, Qwen 35b MoE is good enough, and it can hit 100 tok/s on decent hardware.</p>
]]></description><pubDate>Mon, 11 May 2026 05:41:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091394</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48091394</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091394</guid></item><item><title><![CDATA[New comment by thot_experiment in "Running local models on an M4 with 24GB memory"]]></title><description><![CDATA[
<p>Re-posting this from a buried comment for visibility, because it's just so fucking impressive to me.<p>I went to the store to buy mixers, and while I was out, Gemma 4 31b got pretty far along with reverse engineering the Bluetooth protocol of a desk thermometer I have. I forgot to turn on the web search tool, so it just went at it, writing more and more specific diagnostic logging/probing tools over the course of like 8 turns. It connected to the thermometer, scanned the characteristics, and had made a dump of the Bluetooth notification data. When I got back, it was theorizing about how the data might be encoded in the characteristics, and it had gotten into an infinite loop. (Local models aren't perfect, and I never said they were.) I turned on the web search tool and told it to "pick up the project where it left off"; it read the directory, did a couple of googles, and had a working script to print temperature, humidity, and battery state in like 3 turns. Reading back through its chain of thought, I'm pretty sure it would have been able to get it eventually without googling.<p>idk, I thought I was a cool and smart engineer type for being able to do stuff like this; if my GPUs being able to do this more or less unsupervised isn't impressive, I guess fuck me lol.</p>
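<p>For flavor: the final decode step in a project like this usually boils down to unpacking a few little-endian fields from the notification payload. A minimal sketch of what that kind of script looks like; the byte layout here is hypothetical, not my thermometer's actual protocol:

```python
import struct

def decode_notification(payload: bytes) -> dict:
    """Decode a hypothetical BLE thermometer notification.

    Assumed layout (little-endian, illustrative only):
      int16  temperature in hundredths of a degree C
      uint8  relative humidity, percent
      uint8  battery level, percent
    """
    temp_raw, humidity, battery = struct.unpack("<hBB", payload[:4])
    return {
        "temperature_c": temp_raw / 100,
        "humidity_pct": humidity,
        "battery_pct": battery,
    }

# Example payload: 23.45 C, 51% RH, 88% battery
sample = struct.pack("<hBB", 2345, 51, 88)
print(decode_notification(sample))
# -> {'temperature_c': 23.45, 'humidity_pct': 51, 'battery_pct': 88}
```

The hard part the model actually did was figuring out <i>which</i> characteristic carries the data and what the field layout is; once you know that, the decode itself is a one-liner like the above.</p>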
]]></description><pubDate>Mon, 11 May 2026 05:25:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091308</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48091308</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091308</guid></item><item><title><![CDATA[New comment by thot_experiment in "Running local models on an M4 with 24GB memory"]]></title><description><![CDATA[
<p>It may surprise you, but over thousands of hours I have actually gathered more than one sample.<p>EDIT: Here's another sample for ya. I went to the store to buy mixers, and while I was out, Gemma 4 31b got pretty far along with reverse engineering the Bluetooth protocol of a desk thermometer I have. I forgot to turn on the web search tool, so it just went at it, writing more and more specific diagnostic logging/probing tools over the course of like 8 turns. It connected to the thermometer, scanned the characteristics, and had made a dump of the Bluetooth notification data. When I got back, it was theorizing about how the data might be encoded in the characteristics, and it had gotten into an infinite loop. (Local models aren't perfect, and I never said they were.) I turned on the web search tool and told it to "pick up the project where it left off"; it read the directory, did a couple of googles, and had a working script to print temperature, humidity, and battery state in like 3 turns. Reading back through its chain of thought, I'm pretty sure it would have been able to get it eventually without googling.<p>idk, I thought I was a cool and smart engineer type for being able to do stuff like this; if my GPUs being able to do this more or less unsupervised isn't impressive, I guess fuck me lol.</p>
]]></description><pubDate>Mon, 11 May 2026 04:22:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091032</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48091032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091032</guid></item><item><title><![CDATA[New comment by thot_experiment in "Running local models on an M4 with 24GB memory"]]></title><description><![CDATA[
<p>Very different from my experience; Gemma 31b just solved a physics problem Opus 4.7 gave up on. I definitely don't think they're equivalent in general. Opus for sure is way smarter and way more likely to get things right on the edge, but it's still quite likely to get things wrong too, which doesn't make it <i>that</i> useful for a lot of stuff. Conversely, there are so many things you would use an LLM for that they will both reliably oneshot. Especially in agentic mode, where you have ground-truth feedback between turns, the difference gets quite small for a lot of tasks.<p>That all being said, I've spent hundreds (maybe thousands?) of hours on this stuff over the past few years, so I don't see a lot of the rough edges. I really believe the capability is there: Gemma 4 31B is a useful agent for all sorts of stuff, and anything you can reasonably expect an LLM to oneshot, Qwen 3.6 35b MoE will handle at like 90 tok/sec, absolutely fantastic for tasks that don't require a huge amount of precision.</p>
]]></description><pubDate>Mon, 11 May 2026 03:29:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48090778</link><dc:creator>thot_experiment</dc:creator><comments>https://news.ycombinator.com/item?id=48090778</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48090778</guid></item></channel></rss>