<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: intothemild</title><link>https://news.ycombinator.com/user?id=intothemild</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 14:28:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=intothemild" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by intothemild in "US and Iran reach cease fire agreement"]]></title><description><![CDATA[
<p>It's that anyone who is keeping count can see that the number of Palestinian civilians killed in this war is very, very large.</p>
]]></description><pubDate>Mon, 15 Jun 2026 06:22:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48537317</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48537317</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48537317</guid></item><item><title><![CDATA[New comment by intothemild in "US and Iran reach cease fire agreement"]]></title><description><![CDATA[
<p>> to reduce civilian casualties.<p>I am in shock you wrote this.</p>
]]></description><pubDate>Mon, 15 Jun 2026 05:52:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48537095</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48537095</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48537095</guid></item><item><title><![CDATA[New comment by intothemild in "Open source AI must win"]]></title><description><![CDATA[
<p>Well. Right now buying hardware to run your own models tops out at about 32gb VRAM at any price point that's not insane. Sure you can get a Mac mini, or a PC equivalent. But the problem is RAM.<p>More RAM means bigger models, which means smarter models.<p>Which is why Qwen and Gemma have been so interesting to a lot of us who run our own... Now 32gb VRAM isn't so bad, as these models can be run on that with decent results.<p>Where this gets interesting is in a couple of years, when all the A100s and the rest of the enterprise hardware hit eBay.</p>
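<p>To put rough numbers on "more RAM means bigger models", here is a back-of-the-envelope sketch. It only counts the weights (no KV cache, no runtime overhead), and the bits-per-weight values are illustrative assumptions, not measurements of any particular quant. The function name is made up for the example.</p>
```python
def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB.

    Ignores KV cache, activations and runtime overhead, so real usage is higher.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Illustrative bits-per-weight values (assumptions, not measured quant sizes).
for bits in (16, 8, 5.5, 4):
    size = weight_memory_gib(27, bits)
    verdict = "fits" if size < 32 else "does not fit"
    print(f"27B at {bits} bits/weight: ~{size:.1f} GiB of weights ({verdict} in 32 GiB)")
```
<p>Whatever is left over after the weights is what you have for context, which is why quantisation matters so much on a 32gb card.</p>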
]]></description><pubDate>Sat, 13 Jun 2026 09:14:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48515180</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48515180</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48515180</guid></item><item><title><![CDATA[New comment by intothemild in "Liquid AI reveals 8B-A1B MoE trained on 38T"]]></title><description><![CDATA[
<p>I get 50-60t/s tg on my r9700 with the dense, unsloth MTP quant UD-Q5_K_XL, K@8/V@4 256k context.<p>Using Vulkan backend.<p>```
llama-server -fa on -t 7 -ngl 999 --mlock --fit off --kv-offload --no-webui --metrics --chat-template-kwargs '{"preserve_thinking": true}' -b 2048 -ub 1024 -m /mnt/models/unsloth/Qwen3.6-27B-MTP-GGUF/Qwen3.6-27B-UD-Q5_K_XL.gguf --mmproj /mnt/models/unsloth/Qwen3.6-27B-MTP-GGUF/mmproj-F16.gguf -c 262144 --kv-unified -ctk q8_0 -ctv q4_0 --spec-type draft-mtp --spec-draft-n-max 3 --spec-draft-ngl 99 --alias unsloth/Qwen3.6-27B-MTP-GGUF --temp 0.60 --top-k 20 --top-p 0.95 --min-p 0.00 --presence-penalty 0.00 --repeat-penalty 1.00
```</p>
]]></description><pubDate>Sat, 30 May 2026 14:32:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48336642</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48336642</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48336642</guid></item><item><title><![CDATA[New comment by intothemild in "Liquid AI reveals 8B-A1B MoE trained on 38T"]]></title><description><![CDATA[
<p>You should enable MTP now that it's available.<p>llama.cpp has had some massive updates in the last week or so.</p>
]]></description><pubDate>Sat, 30 May 2026 09:59:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48334544</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48334544</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48334544</guid></item><item><title><![CDATA[New comment by intothemild in "Local LLMs perform better when you teach them to ask before they answer"]]></title><description><![CDATA[
<p>Sure. It's just an old i7 8700 (non-K), 64gb RAM. Running Proxmox. But recently I put an AMD R9700 AI Pro in there, which is a 32gb inference-focused card, think of it as a 32gb version of a 9070xt.<p>All the inference happens on that card, so the CPU/RAM is there for the other containers.<p>I'll eventually swap the motherboard and CPU for something better, so I can fit 1 or 3 more of those cards.<p>Why not NVIDIA? 32gb on team green means spending crazy money. And I can get 4 R9700s for the cost of one 32gb 5090.<p>128gb ... Vs 32gb.</p>
]]></description><pubDate>Sun, 24 May 2026 22:30:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48261666</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48261666</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48261666</guid></item><item><title><![CDATA[New comment by intothemild in "Local LLMs perform better when you teach them to ask before they answer"]]></title><description><![CDATA[
<p>Yes. I run local models, Qwen3.6-27B and IMHO the massive level up was the agents and skills files that I've worked on.<p>Basically I run a flow<p>Brainstorming > Create Spec > Review Spec* > Create Plans > Review Plan* > Execute Plan (in subagents) > Review Against Plan > Code Review* > Open PR > Finish Plan (marks plan files done)<p>* Each review step marked with an asterisk uses a paid larger LLM, right now Deepseek V4 Pro. Having it do this catches a lot of small things, and now I'm effectively one shotting any task I give it.<p>And it's not costing me much at all, just those three reviews. I could use a free model like Gemini but I'm happy with what I've got.</p>
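<p>As a rough sketch of that flow (not my actual agent or skills files): each stage is tagged with whether it goes to the paid reviewer or stays on the local model. The stage names come from the list above; the model labels, class and function names are placeholders invented for the example.</p>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    paid_review: bool = False  # True for the steps marked with an asterisk


# Stage order taken from the flow described above.
PIPELINE = [
    Stage("Brainstorming"),
    Stage("Create Spec"),
    Stage("Review Spec", paid_review=True),
    Stage("Create Plans"),
    Stage("Review Plan", paid_review=True),
    Stage("Execute Plan (in subagents)"),
    Stage("Review Against Plan"),
    Stage("Code Review", paid_review=True),
    Stage("Open PR"),
    Stage("Finish Plan"),
]


def model_for(stage: Stage, local: str = "local-model", paid: str = "paid-reviewer") -> str:
    """Pick which model handles a stage. Both labels are hypothetical placeholders."""
    return paid if stage.paid_review else local


def plan_run(pipeline=PIPELINE):
    """Return (stage name, model label) pairs in execution order."""
    return [(stage.name, model_for(stage)) for stage in pipeline]


for name, model in plan_run():
    print(f"{name:30s} -> {model}")
```
<p>Only three of the ten stages hit the paid model, which is why the cost stays low.</p>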
]]></description><pubDate>Sun, 24 May 2026 08:38:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=48255629</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48255629</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48255629</guid></item><item><title><![CDATA[New comment by intothemild in "AI subscriptions are a ticking time bomb for enterprise"]]></title><description><![CDATA[
<p>I've spent the last month bringing in a small demo of what the future could be like, running Qwen, Gemma, and Deepseek, behind LiteLLM so we can monitor token usage, and instead of some dumb ass "tokenmaxxing" we're actively trying to get the cost of inference both down, and in-house.<p>Boss is happy, very happy. We're rolling it out more widely now.<p>But this is the future.</p>
]]></description><pubDate>Sun, 17 May 2026 16:15:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48170258</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48170258</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48170258</guid></item><item><title><![CDATA[New comment by intothemild in "Halt and Catch Fire"]]></title><description><![CDATA[
<p>Same. Having experienced the growth of computing in those eras, the show itself had a very well researched yet very nostalgic sense of "oh yes. I'd forgotten about that".</p>
]]></description><pubDate>Sat, 16 May 2026 23:42:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48164789</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48164789</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48164789</guid></item><item><title><![CDATA[New comment by intothemild in "Halt and Catch Fire"]]></title><description><![CDATA[
<p>The best part of Silicon Valley was that it had a very south park quality to it.. in that things that were actually happening at the time were parodied on the show.</p>
]]></description><pubDate>Sat, 16 May 2026 23:40:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48164777</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48164777</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48164777</guid></item><item><title><![CDATA[New comment by intothemild in "Ploopy Bean: a trackpoint for every computer"]]></title><description><![CDATA[
<p>Exactly. People love trackpoint because it's right there in the middle of the keyboard, and you don't have to move your hands.<p>Any variation of trackpoint where you have to move your hand away from the keyboard, is a failure IMHO</p>
]]></description><pubDate>Sat, 16 May 2026 09:26:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48158482</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48158482</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48158482</guid></item><item><title><![CDATA[New comment by intothemild in "What's in a GGUF, besides the weights – and what's still missing?"]]></title><description><![CDATA[
<p>Well, considering MTP support is being developed right now: there was a conversation in that work that threw around the idea of separating the MTP model out of the main GGUF, like with Mmproj. This was rejected.<p>Which I'm happy about. So given that decision, I don't think it's unreasonable to think that they might be open to including Mmproj files in the GGUF.<p>Only issue I can think of is, which one? BF16, F16? Etc</p>
]]></description><pubDate>Thu, 14 May 2026 23:08:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=48142435</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48142435</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48142435</guid></item><item><title><![CDATA[New comment by intothemild in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>There's a percentage of people who love to question how the open models were trained.. they are almost always going to try and make some argument about using the closed frontier models for distillation as some form of theft.<p>Just totally forgetting that the frontier models themselves stole an insane amount to get to where they are.<p>It's theft all the way across the board, and when someone tries to make the argument that the open models' theft is bad, but Altman's or Amodei's theft is good.. they are revealing a lot about themselves.</p>
]]></description><pubDate>Mon, 11 May 2026 06:06:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48091525</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48091525</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48091525</guid></item><item><title><![CDATA[New comment by intothemild in "Local AI needs to be the norm"]]></title><description><![CDATA[
<p>That's already happening. Qwen3.6 and Gemma4.<p>Basically small and medium models that are crazy well trained for their sizes.<p>Then we have a lot of speculative decoding stuff like MTP and others coming to speed up responses, and finally better quantisation to use less memory.<p>Local LLM is the future, and the larger labs know that the open models will eat their lunch once people realise that the gap is only a few months. If we were good with LLMs a couple of months ago, we're good with the open models now.</p>
]]></description><pubDate>Sun, 10 May 2026 20:24:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48087559</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48087559</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48087559</guid></item><item><title><![CDATA[New comment by intothemild in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"]]></title><description><![CDATA[
<p>Don't forget to update the GGUF you have as well. The templates in them were updated recently.</p>
]]></description><pubDate>Wed, 06 May 2026 13:32:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48036049</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=48036049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48036049</guid></item><item><title><![CDATA[New comment by intothemild in "A report on burnout in open source software communities (2025) [pdf]"]]></title><description><![CDATA[
<p>I like it, only one problem.. the "fix it now" types are also the same ones who didn't read anything.</p>
]]></description><pubDate>Sat, 02 May 2026 12:34:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47985867</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=47985867</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47985867</guid></item><item><title><![CDATA[New comment by intothemild in "Spirit Airlines Is Winding Down All Operations"]]></title><description><![CDATA[
<p>If only they flapped. Maybe they'd still be in the air.</p>
]]></description><pubDate>Sat, 02 May 2026 07:36:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47984248</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=47984248</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47984248</guid></item><item><title><![CDATA[New comment by intothemild in "A report on burnout in open source software communities (2025) [pdf]"]]></title><description><![CDATA[
<p>> We're talking about code that users can modify themselves to solve their own problems. That's it. I don't need to hear about the struggle.<p>That's exactly the kind of attitude that this discusses.<p>You create something that solves your problems, you put it up on GitHub, free, and open... Suddenly it turns out others have the same problems you did, and your software solves them.<p>It starts ok. People are nice. But as it gains traction, a certain kind of toxic person becomes more and more common. The "YOU FIX IT NOW! I DONT KNOW" kind of person.<p>You wake in the morning, look at your email, and it's a stream of being screamed at. That takes a toll.<p>All because one time you had an idea to build something that solved your problem, and you thought "hey I might just open source this".<p>> That's it. I don't need to hear about the struggle.</p>
]]></description><pubDate>Sat, 02 May 2026 07:28:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47984198</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=47984198</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47984198</guid></item><item><title><![CDATA[New comment by intothemild in "Anthropic's Champion Kit for engineers pushing Claude Code at their company"]]></title><description><![CDATA[
<p>How many pieces of flair is the minimum?</p>
]]></description><pubDate>Wed, 29 Apr 2026 11:07:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47946682</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=47946682</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47946682</guid></item><item><title><![CDATA[New comment by intothemild in "Making RAM at Home [video]"]]></title><description><![CDATA[
<p>I only have raw RAM, pastured RAM is wrong.<p>I get my DRAM needs at the RAM ranch.</p>
]]></description><pubDate>Wed, 22 Apr 2026 05:58:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47859630</link><dc:creator>intothemild</dc:creator><comments>https://news.ycombinator.com/item?id=47859630</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47859630</guid></item></channel></rss>