<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mungoman2</title><link>https://news.ycombinator.com/user?id=mungoman2</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 13 Jun 2026 02:29:24 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mungoman2" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by mungoman2 in "A €0.01 bank transfer could compromise a banking AI agent"]]></title><description><![CDATA[
<p>In this case it could be solved by not letting the LLM consume the transaction message. It is effectively the same as preventing user-supplied input from reaching the first argument of printf().<p>The transaction in question can remain opaque to the LLM, and a %transaction.message% string is resolved in the layer between the LLM and the user.</p>
]]></description><pubDate>Thu, 11 Jun 2026 04:41:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48486299</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48486299</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48486299</guid></item><item><title><![CDATA[New comment by mungoman2 in "1-Bit Bonsai Image 4B Image Generation for Local Devices"]]></title><description><![CDATA[
<p>Curious about this take, how do you mean?<p>I understand the point about distorted facts, but I’m not sure how things are improved by basically having no trust in any facts.</p>
]]></description><pubDate>Mon, 01 Jun 2026 04:43:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48352690</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48352690</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48352690</guid></item><item><title><![CDATA[New comment by mungoman2 in "Real-time LLM Inference on Standard GPUs: 3k tokens/s per request"]]></title><description><![CDATA[
<p>This looks very interesting. It is possible to get those rates without exotic hardware.<p>But I have to say that the comparison is not really fair. It pits a 2B model against frontier models that are likely hundreds of times larger. Also, Taalas with their 15000 tok/s inference are suspiciously missing from the comparison.<p>We need to see a comparison of this framework running useful models, which at present seems to mean ~30B.</p>
]]></description><pubDate>Fri, 29 May 2026 10:36:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48321384</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48321384</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48321384</guid></item><item><title><![CDATA[New comment by mungoman2 in "DeepSeek-V4-Flash means LLM steering is interesting again"]]></title><description><![CDATA[
<p>Wow, this is really fascinating. And it reads like the intro of a sci-fi short.</p>
]]></description><pubDate>Sun, 17 May 2026 06:44:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48166584</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48166584</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48166584</guid></item><item><title><![CDATA[New comment by mungoman2 in "Seeing Birdsong"]]></title><description><![CDATA[
<p>This looks very cool, but it's not clear what it means.<p>I wonder if it is captivating simply because it syncs cool graphics to audio, like those Winamp visualization filters in the old days.</p>
]]></description><pubDate>Mon, 11 May 2026 13:39:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=48094864</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48094864</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48094864</guid></item><item><title><![CDATA[New comment by mungoman2 in "PortalVR Motion – use any VR content in 2D with 3D tracked Joy-Cons"]]></title><description><![CDATA[
<p>Very interesting! Will definitely try this.<p>Feedback: I think it would be beneficial to clarify at the top of the page that there is a trial. Currently the download and buy buttons sit next to each other, which I assumed meant that the software can be downloaded first but will require a key you get from “Buy”.<p>The images show Switch 1 controllers. Since the Switch 2 is already out, does this imply that the tracking doesn’t work on Switch 2 controllers?<p>I don’t like the 3-PC activation thing. It implies that the software needs to call home and will stop working if the company goes under. Or, at least, that it will not be possible to install it anew.</p>
]]></description><pubDate>Sat, 09 May 2026 06:30:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48072396</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48072396</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48072396</guid></item><item><title><![CDATA[New comment by mungoman2 in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"]]></title><description><![CDATA[
<p>Naively it seems odd that running multiple checks in parallel is faster than just running the autoregressive model multiple times in series. It’s the same amount of compute, right?<p>But I think the key is that in the standard autoregressive case we are memory-bandwidth bound, so there are tons of idle compute resources. Checking multiple tokens is therefore cheap, because we can batch and thus reuse the weights we have read for multiple tokens.<p>The verification step is similar to a prefill with a small batch size. The difference is what we do with the generated logits.</p>
]]></description><pubDate>Wed, 06 May 2026 05:36:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48032628</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=48032628</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48032628</guid></item><item><title><![CDATA[New comment by mungoman2 in "Show HN: I built a toy that plays grandma's stories when my daughter hugs it"]]></title><description><![CDATA[
<p>This is an amazing idea, and congrats on getting so far with it.<p>I personally would be wary about fire. Custom electronics built without experience, and then (I'm assuming here) high-energy-density batteries put in a soft toy handled by little kids. Any accident will be absolutely disastrous. And when you scale up, those very-low-probability failures are bound to happen.<p>How do you think about this? Is it handled already?</p>
]]></description><pubDate>Fri, 24 Apr 2026 05:41:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47886033</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47886033</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47886033</guid></item><item><title><![CDATA[New comment by mungoman2 in "Ternary Bonsai: Top Intelligence at 1.58 Bits"]]></title><description><![CDATA[
<p>This is very interesting and exciting, but IMHO the comparisons read as a bit disingenuous with the other models at 16 bit weights. 
The 16 bit releases of the other models are not optimized for size, making it difficult to take the comparison seriously.<p>It would be interesting to see a comparison to quantized versions of the other models. If this model also beats the others in a fair comparison, that gives it more credibility.</p>
]]></description><pubDate>Tue, 21 Apr 2026 09:13:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47846469</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47846469</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47846469</guid></item><item><title><![CDATA[New comment by mungoman2 in "The Clock"]]></title><description><![CDATA[
<p>Instead of anchoring the sun, and thus noon, at the top, it would be interesting to have the sun move around the clock face as the year progresses. Noon would then move around with it.
“Up” could be said to point towards the center of the galaxy instead.</p>
]]></description><pubDate>Wed, 08 Apr 2026 05:39:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47685783</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47685783</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47685783</guid></item><item><title><![CDATA[New comment by mungoman2 in "TurboQuant: Redefining AI efficiency with extreme compression"]]></title><description><![CDATA[
<p>What they're saying is that the error for a vector increases with r, which is true.<p>Trivially, with r=0, the error is 0, regardless of how heavily the direction is quantized. Larger r means larger absolute error in the reconstructed vector.</p>
]]></description><pubDate>Wed, 25 Mar 2026 08:12:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47514637</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47514637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47514637</guid></item><item><title><![CDATA[New comment by mungoman2 in "Apply video compression on KV cache to 10,000x less error at Q4 quant"]]></title><description><![CDATA[
<p>This is cool. It makes storage of the KV cache much smaller, making it possible to keep more of it in fast memory.<p>Bandwidth-wise it is worse (more bytes accessed) to generate and do random recall on than the vanilla approach, and significantly worse than a quantized approach. That’s because the reference needs to be accessed.<p>I guess the implication is that since the KV cache is smaller, the probability is higher that the parts of it that are needed are in fast memory, so the bandwidth requirements on slow links are reduced and performance goes up.<p>It would be interesting to see a discussion of the benefits and drawbacks of the approach, ideally backed by data.</p>
]]></description><pubDate>Mon, 23 Mar 2026 06:48:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47486166</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47486166</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47486166</guid></item><item><title><![CDATA[New comment by mungoman2 in "A sufficiently detailed spec is code"]]></title><description><![CDATA[
<p>Well, the spec can of course define constraints on how the function is implemented.</p>
]]></description><pubDate>Thu, 19 Mar 2026 06:36:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47435704</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47435704</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47435704</guid></item><item><title><![CDATA[New comment by mungoman2 in "Claude Tips for 3D Work"]]></title><description><![CDATA[
<p>Really good. I’ve struggled with the same thing.<p>> Instead of expecting it to understand my requests, I almost always build tooling first to give us a shared language to discuss the project.<p>This is probably the key. I’ve found this to be true in general. Building simple tools that the model can use helps frame the problem in a very useful way.</p>
]]></description><pubDate>Tue, 17 Mar 2026 05:26:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47408985</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47408985</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47408985</guid></item><item><title><![CDATA[New comment by mungoman2 in "Okmain: How to pick an OK main colour of an image"]]></title><description><![CDATA[
<p>Tbh shrinking the image is probably the cheapest operation you can do that still lets every pixel influence the result. It’s just the average of all pixels, after suitable color conversion.</p>
]]></description><pubDate>Fri, 13 Mar 2026 14:14:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47364774</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47364774</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47364774</guid></item><item><title><![CDATA[New comment by mungoman2 in "Show HN: Moongate – Ultima Online server emulator in .NET 10 with Lua scripting"]]></title><description><![CDATA[
<p>This is a very fun idea. It would also be very interesting to see if one could have a system where talking to an NPC could alter the world.<p>One maybe obvious way would be that asking for rumors actually creates the scenario that the NPC describes.</p>
]]></description><pubDate>Fri, 06 Mar 2026 20:09:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47280427</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47280427</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47280427</guid></item><item><title><![CDATA[New comment by mungoman2 in "Something is afoot in the land of Qwen"]]></title><description><![CDATA[
<p>Not sure what the uptime is meant to signal. People have quite low uptime as well…</p>
]]></description><pubDate>Wed, 04 Mar 2026 18:49:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47252027</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47252027</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47252027</guid></item><item><title><![CDATA[New comment by mungoman2 in "Show HN: I built a sub-500ms latency voice agent from scratch"]]></title><description><![CDATA[
<p>I think you’re implying that it would be useful to have the LLM predict the end of the speaker’s speech, and continue with its reply based on that.<p>If, when the speaker actually stops speaking, the actual speech matches the prediction, the response can be played without any latency.<p>Seems like an awesome approach! One could imagine doing this prediction for the K most likely threads simultaneously, subject to the compute power available, and pruning or branching as some threads become inaccurate.</p>
]]></description><pubDate>Tue, 03 Mar 2026 06:44:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47228977</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=47228977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47228977</guid></item><item><title><![CDATA[New comment by mungoman2 in "Show HN: I created a Mars colony RPG based on Kim Stanley Robinson’s Mars books"]]></title><description><![CDATA[
<p>But renewables are already cheaper than fossil fuels. Why don't we see this already?</p>
]]></description><pubDate>Mon, 09 Feb 2026 14:51:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46945841</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=46945841</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46945841</guid></item><item><title><![CDATA[New comment by mungoman2 in "AI fatigue is real and nobody talks about it"]]></title><description><![CDATA[
<p>IMHO, this is not really about AI; it's about setting boundaries and not overworking yourself.</p>
]]></description><pubDate>Sun, 08 Feb 2026 15:42:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=46935251</link><dc:creator>mungoman2</dc:creator><comments>https://news.ycombinator.com/item?id=46935251</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46935251</guid></item></channel></rss>