<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: naasking</title><link>https://news.ycombinator.com/user?id=naasking</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 09 Apr 2026 20:00:30 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=naasking" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by naasking in "Muse Spark: Scaling towards personal superintelligence"]]></title><description><![CDATA[
<p>> Based on what? A lot of this is vibes and FOMO; just like any economic bubble.<p>You're in a bubble.<p><a href="https://www.helpnetsecurity.com/2026/04/07/google-llm-content-moderation/" rel="nofollow">https://www.helpnetsecurity.com/2026/04/07/google-llm-conten...</a></p>
]]></description><pubDate>Thu, 09 Apr 2026 01:35:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47698345</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47698345</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47698345</guid></item><item><title><![CDATA[New comment by naasking in "System Card: Claude Mythos Preview [pdf]"]]></title><description><![CDATA[
<p>> The best you'll see is an improvised post-hoc rationalization story.<p>Funny, because "post-hoc rationalization" is how many neuroscientists think humans operate.<p>That LLMs are stochastic inference engines is obvious by construction, but you skipped the step where you proved that human thoughts, self-awareness and metacognition are not reducible to stochastic inference.</p>
]]></description><pubDate>Wed, 08 Apr 2026 19:17:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47694922</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47694922</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47694922</guid></item><item><title><![CDATA[New comment by naasking in "System Card: Claude Mythos Preview [pdf]"]]></title><description><![CDATA[
<p>I think many humans engage in metacognitive reasoning, and that this might not be strongly represented in training data, so it probably isn't common in LLMs yet. They can still do it when prompted, though.</p>
]]></description><pubDate>Wed, 08 Apr 2026 16:01:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47692040</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47692040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47692040</guid></item><item><title><![CDATA[New comment by naasking in "System Card: Claude Mythos Preview [pdf]"]]></title><description><![CDATA[
<p>> Conversely: in humans, intelligence is inversely correlated with crime.<p>Inversely correlated with crime that's <i>caught and successfully prosecuted</i>, you mean, because that's what makes up the stats on crime. I think people too often forget that we consider most criminals "dumb" because those who are caught are mostly dumb. Smart "criminals" either don't get caught or have made their unethical actions legal.</p>
]]></description><pubDate>Wed, 08 Apr 2026 15:46:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47691813</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47691813</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47691813</guid></item><item><title><![CDATA[New comment by naasking in "System Card: Claude Mythos Preview [pdf]"]]></title><description><![CDATA[
<p>I'm curious whether frontier labs use any form of compression on their models to improve performance. The small accuracy drop from Q8 or FP8 would still leave it ahead of Opus, while roughly doubling token throughput. Maybe then interactive use would feel like an improvement.</p>
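<p>For reference, a minimal sketch of what Q8-style compression does, assuming symmetric per-tensor scaling (real serving stacks typically scale per-channel or per-block). Int8 weights take half the bytes of fp16, which is where the throughput doubling would come from in bandwidth-bound decoding:</p>
<pre><code>
import numpy as np

def quantize_q8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)
q, scale = quantize_q8(w)
err = np.abs(w - dequantize(q, scale)).mean()
# int8 stores half the bytes of fp16, so a memory-bandwidth-bound decode
# loop can stream roughly twice the parameters per second
print(f"bytes: {w.astype(np.float16).nbytes} -> {q.nbytes}, mean abs err: {err:.5f}")
</code></pre>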
]]></description><pubDate>Wed, 08 Apr 2026 15:40:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47691740</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47691740</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47691740</guid></item><item><title><![CDATA[New comment by naasking in "GLM-5.1: Towards Long-Horizon Tasks"]]></title><description><![CDATA[
<p>I used GLM5 quite a bit, and I'd say it was roughly on par with Sonnet for most simple to medium tasks, though definitely not with Opus. I didn't test very long-context tasks, and that's where I'd expect it to break down. A recent study on software maintainability still showed Sonnet and Opus as peerless on that metric, although the GLM series has been making impressive gains.</p>
]]></description><pubDate>Wed, 08 Apr 2026 15:22:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47691482</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47691482</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47691482</guid></item><item><title><![CDATA[New comment by naasking in "Issue: Claude Code is unusable for complex engineering tasks with Feb updates"]]></title><description><![CDATA[
<p>Very interesting. I run Claude Code in VS Code, and unfortunately there doesn't seem to be an equivalent to "cli.js"; it's all bundled into the "claude.exe" I found under the VS Code extensions folder (confirmed via hex editor that the prompts are in there).<p>Edit: tried patching with revised strings of equivalent length, informed by this gist; now we'll see how it goes!</p>
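<p>For anyone trying the same thing, a minimal sketch of an equal-length in-place patch. The OLD/NEW strings below are made up for illustration (the real prompt text comes from inspecting the binary yourself); the equal-length constraint is what keeps all file offsets valid:</p>
<pre><code>
from pathlib import Path

# Hypothetical strings for illustration only; find the actual prompt
# text with a hex editor or `strings` first.
OLD = b"You should be concise."
NEW = b"Respond with detail.  "  # padded to the same byte length

assert len(OLD) == len(NEW), "in-place patch must not change file size"

path = Path("claude.exe")
data = path.read_bytes()
count = data.count(OLD)
if count:
    path.with_suffix(".exe.bak").write_bytes(data)  # keep a backup
    path.write_bytes(data.replace(OLD, NEW))
print(f"patched {count} occurrence(s)")
</code></pre>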
]]></description><pubDate>Tue, 07 Apr 2026 16:14:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47677562</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47677562</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47677562</guid></item><item><title><![CDATA[New comment by naasking in "Issue: Claude Code is unusable for complex engineering tasks with Feb updates"]]></title><description><![CDATA[
<p>They're a business. The alternative to keep costs in check would be to ask you for more money, and you'd likely be even more upset by that.</p>
]]></description><pubDate>Tue, 07 Apr 2026 14:55:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47676398</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47676398</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47676398</guid></item><item><title><![CDATA[New comment by naasking in "Embarrassingly simple self-distillation improves code generation"]]></title><description><![CDATA[
<p>It's interesting that LLMs improve at skills, especially on harder problems, just by practicing them. That's effectively what's going on.</p>
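<p>Structurally, the loop is simple. A minimal sketch of one self-distillation round; generate, passes_tests, and finetune are stand-ins of my own, not the paper's actual API:</p>
<pre><code>
from typing import Callable, List, Tuple

def self_distill_round(
    problems: List[str],
    generate: Callable[[str, int], List[str]],   # prompt, k -> k candidate solutions
    passes_tests: Callable[[str, str], bool],    # problem, solution -> pass/fail
    finetune: Callable[[List[Tuple[str, str]]], None],
    k: int = 8,
) -> int:
    """Sample k solutions per problem, keep the ones the model got right,
    and fine-tune the model on its own verified outputs."""
    kept = []
    for p in problems:
        for sol in generate(p, k):
            if passes_tests(p, sol):
                kept.append((p, sol))
                break  # one verified solution per problem is enough
    if kept:
        finetune(kept)
    return len(kept)
</code></pre>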
]]></description><pubDate>Sun, 05 Apr 2026 01:12:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47645200</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47645200</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47645200</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>> I only ask because I've been running local models (using Ollama) on my RX 7900 XTX for the last year and a half or so and haven't had a single problem that was ROCm specific that I can think of.<p>It's probably using the Vulkan backend, which is pretty stable and performs well.</p>
]]></description><pubDate>Fri, 03 Apr 2026 21:37:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47632640</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47632640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47632640</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>Routing in a MoE model might fit.</p>
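<p>The router itself is tiny: a matmul over the gate weights plus a top-k, which is the kind of small always-on computation that could plausibly live on an NPU. A minimal numpy sketch, with illustrative shapes and top_k:</p>
<pre><code>
import numpy as np

def route_tokens(x: np.ndarray, w_gate: np.ndarray, top_k: int = 2):
    """Top-k MoE router: pick the k highest-scoring experts per token.
    x: (tokens, d_model), w_gate: (d_model, n_experts)."""
    logits = x @ w_gate                      # (tokens, n_experts)
    top = np.argsort(-logits, axis=-1)[:, :top_k]
    # softmax over just the selected experts' logits
    sel = np.take_along_axis(logits, top, axis=-1)
    weights = np.exp(sel - sel.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return top, weights  # expert ids and mixing weights per token

x = np.random.randn(4, 512).astype(np.float32)
w_gate = np.random.randn(512, 8).astype(np.float32)
experts, mix = route_tokens(x, w_gate)
print(experts, mix)
</code></pre>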
]]></description><pubDate>Fri, 03 Apr 2026 11:46:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47625584</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47625584</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47625584</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>It's just an example where it fits perfectly, and it's exactly what something like Alexa or Google Home needs for low-power machine learning, e.g. when sitting idle it needs to consume as little power as possible while waiting for a trigger word.<p>Any context that needs some limited intelligence while consuming little power would benefit from this.</p>
]]></description><pubDate>Fri, 03 Apr 2026 11:46:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47625579</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47625579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47625579</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>Small models aren't entirely useless, and from what I've seen the NPU can run LLMs up to around 8B parameters. So one way they could be useful: Qwen3 text-to-speech models are all under 2B parameters, and OpenAI's whisper-small speech-to-text model is under 1B. You could have an AI agent that you could talk to and that could talk back, where, in theory, all audio-to-text and text-to-audio processing is offloaded to the low-power NPU, leaving the GPU to do all of the LLM processing.</p>
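<p>A minimal sketch of that split, using a transformers ASR pipeline for the speech side; npu_device and ask_llm are placeholders of mine, since actual NPU offload depends on the runtime (e.g. whatever execution provider Lemonade exposes):</p>
<pre><code>
from transformers import pipeline

npu_device = "cpu"  # placeholder; swap in the NPU execution provider

# whisper-small (~244M params) is small enough for an NPU-class accelerator
asr = pipeline("automatic-speech-recognition",
               model="openai/whisper-small", device=npu_device)

def ask_llm(prompt: str) -> str:
    raise NotImplementedError  # the GPU-hosted LLM goes here

def voice_turn(wav_path: str) -> str:
    text = asr(wav_path)["text"]   # audio -> text on the small model
    return ask_llm(text)           # heavy lifting stays on the GPU
</code></pre>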
]]></description><pubDate>Thu, 02 Apr 2026 18:02:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47617932</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47617932</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47617932</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>Yes, Vulkan is currently faster due to some ROCm regressions: <a href="https://github.com/ROCm/ROCm/issues/5805#issuecomment-4141615579" rel="nofollow">https://github.com/ROCm/ROCm/issues/5805#issuecomment-414161...</a><p>ROCm should be faster in the end, if they ever fix those issues.</p>
]]></description><pubDate>Thu, 02 Apr 2026 17:39:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47617579</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47617579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47617579</guid></item><item><title><![CDATA[New comment by naasking in "Lemonade by AMD: a fast and open source local LLM server using GPU and NPU"]]></title><description><![CDATA[
<p>From what I understand, ROCm is a lot buggier than Vulkan and has some performance regressions on a lot of GPUs in the 7.x series. Vulkan performance for LLMs is apparently not far behind ROCm's and is far more stable and predictable at this time.</p>
]]></description><pubDate>Thu, 02 Apr 2026 17:36:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47617530</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47617530</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47617530</guid></item><item><title><![CDATA[New comment by naasking in "Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs"]]></title><description><![CDATA[
<p>Great! I hope the era of 1-bit LLMs really gets going.</p>
]]></description><pubDate>Wed, 01 Apr 2026 14:19:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47601314</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47601314</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601314</guid></item><item><title><![CDATA[New comment by naasking in "Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs"]]></title><description><![CDATA[
<p>Similar in spirit but different in execution as far as I can tell.</p>
]]></description><pubDate>Wed, 01 Apr 2026 14:18:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47601278</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47601278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601278</guid></item><item><title><![CDATA[New comment by naasking in "Mathematical methods and human thought in the age of AI"]]></title><description><![CDATA[
<p>No organization can ever rival a real government like the US, due to the latter's monopoly on the use of force.</p>
]]></description><pubDate>Tue, 31 Mar 2026 12:56:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47586698</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47586698</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47586698</guid></item><item><title><![CDATA[New comment by naasking in "Mathematical methods and human thought in the age of AI"]]></title><description><![CDATA[
<p>You can only truly stop competition by government intervention.</p>
]]></description><pubDate>Mon, 30 Mar 2026 20:16:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47579196</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47579196</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47579196</guid></item><item><title><![CDATA[New comment by naasking in "Mathematical methods and human thought in the age of AI"]]></title><description><![CDATA[
<p>Open source vs. Microsoft is a great example.</p>
]]></description><pubDate>Mon, 30 Mar 2026 20:15:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47579183</link><dc:creator>naasking</dc:creator><comments>https://news.ycombinator.com/item?id=47579183</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47579183</guid></item></channel></rss>