Hacker News: intothemild

New comment by intothemild in "US and Iran reach cease fire agreement"

intothemild — Mon, 15 Jun 2026 06:22:17 +0000

It's that anyone who is keeping count can see that the number of Palestinian civilians who have been killed in this war is very, very large.

New comment by intothemild in "US and Iran reach cease fire agreement"

intothemild — Mon, 15 Jun 2026 05:52:21 +0000

> to reduce civilian casualties.

I am in shock you wrote this.

New comment by intothemild in "Open source AI must win"

intothemild — Sat, 13 Jun 2026 09:14:01 +0000

Well. Right now, buying hardware to run your own models tops out at about 32gb VRAM at any price point that's not insane. Sure, you can get a Mac mini or a PC equivalent, but the problem is RAM.

More RAM means bigger models, which means smarter models.

Which is why Qwen and Gemma have been so interesting to a lot of us who run our own... Now 32gb VRAM isn't so bad, as these models can be run on that with decent results.
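To put rough numbers on the "more RAM means bigger models" point, here is a back-of-envelope sketch. It counts the weights only; the bits-per-weight values and the 4 GB allowance for KV cache and runtime buffers are illustrative assumptions, not measurements of any particular quant.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal GB."""
    total_bits = params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 4.0) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM.

    overhead_gb is a hypothetical allowance for KV cache and buffers;
    real usage depends on context length and cache quantization.
    """
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 5.5, 4):
        size = weight_memory_gb(27, bits)
        print(f"27B at {bits} bits/weight: {size:.1f} GB, fits in 32 GB: {fits(27, bits, 32)}")
```

Under these assumptions, a 27B model at 16 bits per weight needs about 54 GB for weights alone, while the same model at around 5.5 bits per weight needs under 19 GB, which is why quantized models of this size are workable on a 32gb card.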

Where this gets interesting is in a couple of years, when the A100s and the rest of the enterprise hardware hit eBay.

New comment by intothemild in "Liquid AI reveals 8B-A1B MoE trained on 38T"

intothemild — Sat, 30 May 2026 14:32:00 +0000

I get 50-60 t/s token generation on my R9700 with the dense model, using the unsloth MTP quant UD-Q5_K_XL, with the K cache at q8_0, the V cache at q4_0, and 256k context.

Using Vulkan backend.

```
llama-server -fa on -t 7 -ngl 999 --mlock --fit off --kv-offload --no-webui --metrics \
  --chat-template-kwargs '{"preserve_thinking": true}' \
  -b 2048 -ub 1024 \
  -m /mnt/models/unsloth/Qwen3.6-27B-MTP-GGUF/Qwen3.6-27B-UD-Q5_K_XL.gguf \
  --mmproj /mnt/models/unsloth/Qwen3.6-27B-MTP-GGUF/mmproj-F16.gguf \
  -c 262144 --kv-unified -ctk q8_0 -ctv q4_0 \
  --spec-type draft-mtp --spec-draft-n-max 3 --spec-draft-ngl 99 \
  --alias unsloth/Qwen3.6-27B-MTP-GGUF \
  --temp 0.60 --top-k 20 --top-p 0.95 --min-p 0.00 --presence-penalty 0.00 --repeat-penalty 1.00
```

New comment by intothemild in "Liquid AI reveals 8B-A1B MoE trained on 38T"

intothemild — Sat, 30 May 2026 09:59:49 +0000

You should enable MTP now that it's available.

llama.cpp has had some massive updates in the last week or so.

New comment by intothemild in "Local LLMs perform better when you teach them to ask before they answer"

intothemild — Sun, 24 May 2026 22:30:25 +0000

Sure. It's just an old i7-8700 (non-K) with 64gb RAM, running Proxmox. But recently I put an AMD R9700 AI Pro in there, which is a 32gb inference-focused card; think of it as a 32gb version of a 9070 XT.

All the inference happens on that card, so the CPU/RAM is there for the other containers.

I'll eventually swap the motherboard and CPU for something better, so I can fit 1 or 3 more of those cards.

Why not NVIDIA? 32gb on team green means spending crazy money. And I can get 4 R9700s for the cost of one 32gb 5090.

128gb ... Vs 32gb.

New comment by intothemild in "Local LLMs perform better when you teach them to ask before they answer"

intothemild — Sun, 24 May 2026 08:38:10 +0000

Yes. I run local models (Qwen3.6-27B), and IMHO the massive level-up was the agents and skills files that I've worked on.

Basically I run a flow

Brainstorming > Create Spec > Review Spec* > Create Plans > Review Plan* > Execute Plan (in subagents) > Review Against Plan > Code Review* > Open PR > Finish Plan (marks plan files done)

* Each review step marked with an asterisk uses a larger paid LLM, right now DeepSeek V4 Pro. Having it do this catches a lot of small things, and now I'm effectively one-shotting any task I give it.
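For anyone who wants to picture the routing, the flow above could be written down roughly like this. It is only a sketch: the `Stage` class, the `route` helper, and the `"local"` / `"reviewer"` labels are hypothetical names for illustration, not how my agents and skills files are actually structured.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    external_review: bool = False  # True for the asterisked review steps


FLOW: List[Stage] = [
    Stage("Brainstorming"),
    Stage("Create Spec"),
    Stage("Review Spec", external_review=True),
    Stage("Create Plans"),
    Stage("Review Plan", external_review=True),
    Stage("Execute Plan (in subagents)"),
    Stage("Review Against Plan"),
    Stage("Code Review", external_review=True),
    Stage("Open PR"),
    Stage("Finish Plan"),
]


def route(stage: Stage, local_model: str = "local", review_model: str = "reviewer") -> str:
    """Pick which model handles a stage (placeholder model labels)."""
    return review_model if stage.external_review else local_model


def plan_routing(flow: List[Stage] = FLOW) -> List[Tuple[str, str]]:
    """Return (stage name, model label) pairs in execution order."""
    return [(stage.name, route(stage)) for stage in flow]


if __name__ == "__main__":
    for name, model in plan_routing():
        print(f"{name:30s} -> {model}")
```

Only three of the ten stages go to the paid model, which is where the low cost comes from.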

And it's not costing me much at all, just those three reviews. I could use a free model like Gemini but I'm happy with what I've got.

New comment by intothemild in "AI subscriptions are a ticking time bomb for enterprise"

intothemild — Sun, 17 May 2026 16:15:59 +0000

I've spent the last month bringing in a small demo of what the future could be like: running Qwen, Gemma, and DeepSeek behind LiteLLM so we can monitor token usage. Instead of some dumb ass "tokenmaxxing", we're actively trying to get the cost of inference both down and in-house.

Boss is happy, very happy. We're rolling it out more widely now.

But this is the future.
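As a rough idea of what the token monitoring boils down to, here is a minimal sketch that totals usage per model. It assumes responses shaped like OpenAI-style chat completions (a `model` field and a `usage` object with `prompt_tokens` and `completion_tokens`); the `summarize_usage` helper is hypothetical and not part of LiteLLM.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def summarize_usage(responses: Iterable[Mapping]) -> Dict[str, Dict[str, int]]:
    """Total prompt and completion tokens per model.

    Each response is assumed to be a dict in the OpenAI-style shape:
    {"model": "...", "usage": {"prompt_tokens": n, "completion_tokens": m}}.
    Missing fields are counted as zero.
    """
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for response in responses:
        model = response.get("model", "unknown")
        usage = response.get("usage") or {}
        totals[model]["prompt_tokens"] += usage.get("prompt_tokens", 0)
        totals[model]["completion_tokens"] += usage.get("completion_tokens", 0)
    return {model: dict(counts) for model, counts in totals.items()}


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the output.
    sample = [
        {"model": "qwen", "usage": {"prompt_tokens": 100, "completion_tokens": 40}},
        {"model": "gemma", "usage": {"prompt_tokens": 50, "completion_tokens": 10}},
        {"model": "qwen", "usage": {"prompt_tokens": 20, "completion_tokens": 5}},
    ]
    print(summarize_usage(sample))
```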

New comment by intothemild in "Halt and Catch Fire"

intothemild — Sat, 16 May 2026 23:42:26 +0000

Same. Having experienced the growth of computing in those eras, the show itself had a very well researched yet very nostalgic sense of "oh yes. I'd forgotten about that".

New comment by intothemild in "Halt and Catch Fire"

intothemild — Sat, 16 May 2026 23:40:50 +0000

The best part of Silicon Valley was that it had a very South Park quality to it, in that things that were actually happening at the time were parodied on the show.

New comment by intothemild in "Ploopy Bean: a trackpoint for every computer"

intothemild — Sat, 16 May 2026 09:26:46 +0000

Exactly. People love trackpoint because it's right there in the middle of the keyboard, and you don't have to move your hands.

Any variation of trackpoint where you have to move your hand away from the keyboard is a failure IMHO.

New comment by intothemild in "What's in a GGUF, besides the weights – and what's still missing?"

intothemild — Thu, 14 May 2026 23:08:18 +0000

Well, MTP support is being developed right now, and there was a conversation in that work that threw around the idea of separating the MTP model out of the main GGUF, like with mmproj. This was rejected.

Which I'm happy about. So given that decision, I don't think it's unreasonable to think that they might be open to including mmproj files in the GGUF.

Only issue I can think of is, which one? BF16, F16? Etc

New comment by intothemild in "Local AI needs to be the norm"

intothemild — Mon, 11 May 2026 06:06:57 +0000

There's a percentage of people who love to question how the open models were trained.. they are almost always going to try and make some argument about using the closed frontier models for distillation as some form of theft.

Just totally forgetting that the frontier models themselves stole an insane amount to get to where they are.

It's theft all the way across the board, and when someone tries to make the argument that open models' theft is bad, but Altman's or Amodei's theft is good.. they are revealing a lot about themselves.

New comment by intothemild in "Local AI needs to be the norm"

intothemild — Sun, 10 May 2026 20:24:09 +0000

That's already happening. Qwen3.6 and Gemma4.

Basically small and medium models that are crazy well trained for their sizes.

Then we have a lot of speculative decoding stuff like MTP and others coming to speed up responses, and finally better quantisation to use less memory.

Local LLM is the future, and the larger labs know that the open models will eat their lunch once people realise that the gap is only a few months. If we were good with LLMs a couple months ago, we're good with the open models now.
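For anyone unfamiliar with why speculative decoding speeds things up, here is a toy sketch of the greedy draft-and-verify loop. It is a simplification under stated assumptions: the "models" are plain functions that return the next token, the separate `draft` function stands in for whatever produces the guesses (MTP included), and the verification loop stands in for what a real engine would do in one batched pass. The function names are hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[int], k: int = 3) -> List[int]:
    """One draft-and-verify round; returns the tokens accepted this round."""
    # 1. The cheap draft proposes k tokens, one after another.
    proposed: List[int] = []
    ctx = list(context)
    for _ in range(k):
        token = draft(ctx)
        proposed.append(token)
        ctx.append(token)

    # 2. The target checks each proposal in order.
    accepted: List[int] = []
    ctx = list(context)
    for token in proposed:
        expected = target(ctx)
        if token != expected:
            # First mismatch: keep the target's token and stop this round.
            accepted.append(expected)
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target(ctx))
    return accepted


def generate(target: NextToken, draft: NextToken,
             prompt: List[int], n_tokens: int, k: int = 3) -> List[int]:
    """Generate n_tokens after the prompt using repeated speculative rounds."""
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(target, draft, out, k))
    return out[: len(prompt) + n_tokens]
```

The output is always what the target alone would have produced greedily; the gain is that a good draft lets the target confirm several tokens per round instead of one.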

New comment by intothemild in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"

intothemild — Wed, 06 May 2026 13:32:39 +0000

Don't forget to update the GGUF you have too. The templates in them were updated recently.

New comment by intothemild in "A report on burnout in open source software communities (2025) [pdf]"

intothemild — Sat, 02 May 2026 12:34:54 +0000

I like it, only one problem.. the "fix it now" types are also the same ones who didn't read anything.

New comment by intothemild in "Spirit Airlines Is Winding Down All Operations"

intothemild — Sat, 02 May 2026 07:36:49 +0000

If only they flapped. Maybe they'd still be in the air.

New comment by intothemild in "A report on burnout in open source software communities (2025) [pdf]"

intothemild — Sat, 02 May 2026 07:28:43 +0000

> We're talking about code that users can modify themselves to solve their own problems. That's it. I don't need to hear about the struggle.

That's exactly the kind of attitude that this discusses.

You create something that solves your problems, you put it up on GitHub, free, and open... Suddenly it turns out others have the same problems you did, your software solves them.

It starts ok. People are nice. But as it gains traction, a certain kind of toxic person becomes more and more common. The "YOU FIX IT NOW! I DONT KNOW" kind of person.

You wake in the morning, look at your email, and it's a stream of being screamed at. That takes a toll.

All because you had an idea one time to build something that solved your problem, and you thought "hey I might just open source this".

> That's it. I don't need to hear about the struggle.

New comment by intothemild in "Anthropic's Champion Kit for engineers pushing Claude Code at their company"

intothemild — Wed, 29 Apr 2026 11:07:36 +0000

How many pieces of flair is the minimum?

New comment by intothemild in "Making RAM at Home [video]"

intothemild — Wed, 22 Apr 2026 05:58:56 +0000

I only have raw RAM, pastured RAM is wrong.

I get my DRAM needs at the RAM ranch.