<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: coder543</title><link>https://news.ycombinator.com/user?id=coder543</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 26 May 2026 19:57:23 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=coder543" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by coder543 in "Google's Antigravity bait and switch"]]></title><description><![CDATA[
<p>Closed source?<p>IntelliJ: <a href="https://github.com/JetBrains/intellij-community" rel="nofollow">https://github.com/JetBrains/intellij-community</a><p>PyCharm: <a href="https://github.com/JetBrains/intellij-community/tree/master/python#intellij-platform-open-source-repository-pycharm" rel="nofollow">https://github.com/JetBrains/intellij-community/tree/master/...</a><p>Android Studio: <a href="https://android.googlesource.com/platform/tools/adt/idea/+/refs/heads/mirror-goog-studio-master-dev#android-studio" rel="nofollow">https://android.googlesource.com/platform/tools/adt/idea/+/r...</a><p>Yes, they might offer extended proprietary editions/plugins <i>in addition</i>, but the IDEs themselves are open source.</p>
]]></description><pubDate>Thu, 21 May 2026 16:11:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48225111</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=48225111</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48225111</guid></item><item><title><![CDATA[New comment by coder543 in "MacBook Neo Deep Dive: Benchmarks, Wafer Economics, and the 8GB Gamble"]]></title><description><![CDATA[
<p>I have never ever seen Windows provide this warning even once just because there is a faster port on the machine and the user plugged the device into the wrong one. Please provide a source for this claim that you are making. Citation absolutely needed.<p>In the unlikely case that this feature exists thanks to Microsoft, I <i>would like</i> to say that is great, because it is much more user friendly than only having tiny labels. But since I’ve never seen this feature work before, it seems to me that it must be broken, if it exists at all.</p>
]]></description><pubDate>Wed, 13 May 2026 23:35:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48129111</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=48129111</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48129111</guid></item><item><title><![CDATA[New comment by coder543 in "MacBook Neo Deep Dive: Benchmarks, Wafer Economics, and the 8GB Gamble"]]></title><description><![CDATA[
<p>The computer pops up a warning if you plug a fast device into the slow port, which is a lot more informative for the average user than a tiny label that most users wouldn’t even read.<p>Labels would be nice, I guess, but their absence is hardly a dealbreaker.</p>
]]></description><pubDate>Wed, 13 May 2026 22:04:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=48128219</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=48128219</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48128219</guid></item><item><title><![CDATA[New comment by coder543 in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"]]></title><description><![CDATA[
<p>That's great news. That has not been the case with other MTP implementations like Qwen3.5, but I see the section in the article saying Google introduced some architectural optimizations to make this possible.</p>
]]></description><pubDate>Tue, 05 May 2026 18:09:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48026278</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=48026278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026278</guid></item><item><title><![CDATA[New comment by coder543 in "Accelerating Gemma 4: faster inference with multi-token prediction drafters"]]></title><description><![CDATA[
<p>MTP requires a separate KV cache, so there is more memory overhead than just the weights of the MTP model, but it's a manageable amount.</p>
]]></description><pubDate>Tue, 05 May 2026 17:43:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48025902</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=48025902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48025902</guid></item><item><title><![CDATA[New comment by coder543 in "DeepSeek v4"]]></title><description><![CDATA[
<p>Your “benchmark” is invalid. Penalizing the model because the hosting environment is being DDoSed by users a few hours after launch is utter nonsense.<p>I see that you tried to justify this lower in the thread, but no… it completely invalidates your benchmark. You are not testing the model. You are conflating one specific model host and model performance, and then claiming you are benchmarking the model. All major models are hosted by multiple different services.<p>In the real world, clients will just retry if there is a server error, and that will not impact response quality at all, and the workflow the model is being used in will not fail. If a workflow is so poorly coded that it doesn’t even have retry logic, then that workflow is doomed no matter which host you use. But again, reliability of the host is separate from the model.<p>You <i>can</i> make your benchmark valid by having separate leaderboards for model quality and host reliability. I’m not saying to throw the whole thing away. But the current claim is not valid.<p>And you’re also making an unsourced claim that everyone else has already determined this model sucks? Nah. The first result from Artificial Analysis shows good things: <a href="https://x.com/ArtificialAnlys/status/2047547434809880611" rel="nofollow">https://x.com/ArtificialAnlys/status/2047547434809880611</a><p>But I am still waiting to see the results from the full suite of AA benchmarks.</p>
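<p>A minimal sketch of the retry point above, not any particular client's implementation. The names <code>TransientServerError</code> and <code>call_with_retries</code> are hypothetical, and the exponential backoff schedule is an assumption chosen for illustration. The idea is that a transient host error gets retried and never reaches the part of the workflow that judges model output.</p>

```python
import time


class TransientServerError(Exception):
    """Hypothetical stand-in for a 5xx or overload error from a model host."""


def call_with_retries(request, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() until it succeeds or the attempts run out.

    Waits base_delay * 2**attempt seconds between tries (assumed schedule).
    """
    for attempt in range(attempts):
        try:
            return request()
        except TransientServerError:
            if attempt == attempts - 1:
                raise  # out of retries: report a host failure, not a model failure
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_request():
        # Simulated host that is overloaded for the first two calls.
        state["calls"] += 1
        if state["calls"] in (1, 2):
            raise TransientServerError("overloaded")
        return "model response"

    # sleep is stubbed out so the demo runs instantly.
    print(call_with_retries(flaky_request, sleep=lambda seconds: None))
```

<p>A benchmark harness built this way could log exhausted retries to a host-reliability leaderboard and score only successful responses on the model-quality one.</p>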
]]></description><pubDate>Fri, 24 Apr 2026 14:21:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47890702</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47890702</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47890702</guid></item><item><title><![CDATA[New comment by coder543 in "Kimi K2.6: Advancing open-source coding"]]></title><description><![CDATA[
<p>The description specifically says:<p>"Kimi-K2.6 adopts the same native int4 quantization method as Kimi-K2-Thinking."</p>
]]></description><pubDate>Mon, 20 Apr 2026 17:34:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47837766</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47837766</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47837766</guid></item><item><title><![CDATA[New comment by coder543 in "Claude Design"]]></title><description><![CDATA[
<p>From the page:<p>> Import from anywhere. Start from a text prompt, upload images and documents (DOCX, PPTX, XLSX), or point Claude at your codebase. You can also use the web capture tool to grab elements directly from your website so prototypes look like the real product.</p>
]]></description><pubDate>Fri, 17 Apr 2026 15:33:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47807061</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47807061</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47807061</guid></item><item><title><![CDATA[New comment by coder543 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>No… seriously. Every model release is accused. Including Opus, GPT-5.4, whatever. And yes, including smaller models that are not the top in every benchmark.<p>My own experiences with Gemma 4 have been quite mediocre: <a href="https://www.reddit.com/r/LocalLLaMA/comments/1sn3izh/comment/ogjsfjv/?context=3" rel="nofollow">https://www.reddit.com/r/LocalLLaMA/comments/1sn3izh/comment...</a><p>I would almost be tempted to call it benchmaxed if that term weren’t such a joke at this point. It is a deeply unserious term these days.<p>Gemma 4 is worse than its benchmarks show in terms of agentic workflows. The Qwen3.x models are much better; not benchmaxed. I have tested this extensively for my own workflows. Google really needs to release Gemma 4.1 ASAP. I really hope they’re not planning to just wait another calendar year like they did for Gemma 3 -> 4 with no intermediate updates.<p>And the lead author on the paper replied to that tweet to say that the scores would need to be greater than 80 to show actual contamination: <a href="https://x.com/MiZawalski/status/2043990236317851944?s=20" rel="nofollow">https://x.com/MiZawalski/status/2043990236317851944?s=20</a></p>
]]></description><pubDate>Fri, 17 Apr 2026 12:40:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47805306</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47805306</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47805306</guid></item><item><title><![CDATA[New comment by coder543 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>Every model release gets accused of that, including the flagship models.</p>
]]></description><pubDate>Fri, 17 Apr 2026 11:41:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47804829</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47804829</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47804829</guid></item><item><title><![CDATA[New comment by coder543 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>Artificial Analysis hasn't posted their independent analysis of Qwen3.6 35B A3B yet, but Alibaba's benchmarks paint it as being on par with Qwen3.5 27B (or better in some cases).<p>Even Qwen3.5 35B A3B benchmarks roughly on par with Haiku 4.5, so Qwen3.6 should be a noticeable step up.<p><a href="https://artificialanalysis.ai/models?models=gpt-oss-120b%2Cgpt-5-4%2Cgemini-3-1-pro-preview%2Cgemma-4-31b%2Cclaude-sonnet-4-6-adaptive%2Cclaude-opus-4-6-adaptive%2Cclaude-4-5-haiku-reasoning%2Cglm-5-1%2Cqwen3-5-27b%2Cqwen3-5-35b-a3b" rel="nofollow">https://artificialanalysis.ai/models?models=gpt-oss-120b%2Cg...</a><p>No, these benchmarks are not perfect, but short of trying it yourself, this is the best we've got.<p>Compared to the frontier coding models like Opus 4.7 and GPT 5.4, Qwen3.6 35B A3B is not going to feel smart at all, but for something that can run quickly at home... it is impressive how far this stuff has come.</p>
]]></description><pubDate>Thu, 16 Apr 2026 15:34:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47794824</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47794824</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47794824</guid></item><item><title><![CDATA[New comment by coder543 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>Not true. With a MoE, you can offload quite a bit of the model to CPU without losing a ton of performance. 16GB should be fine to run the 4-bit (or larger) model at speeds that are decent. The --n-cpu-moe parameter is the key one on llama-server, if you're not just using -fit on.</p>
]]></description><pubDate>Thu, 16 Apr 2026 15:17:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47794460</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47794460</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47794460</guid></item><item><title><![CDATA[New comment by coder543 in "AI Cybersecurity After Mythos: The Jagged Frontier"]]></title><description><![CDATA[
<p>That is an extremely strange article, in my opinion. They test Gemma 4 31B, but they use Qwen3 32B, DeepSeek R1, and Kimi K2, which are all outdated models whose replacements were released long before Gemma 4? Qwen3.5 27B would have done far better on these tests than Qwen3 32B, and the same goes for DeepSeek V3.2 and Kimi K2.5. Not to mention the obvious absence of GLM-5.1, which is the leading open-weight model right now.<p>The article also seems to gloss over the discovery phase, which seems very important. If it were as easy as they say, then the models should have been let loose and we would see if they actually found these bugs, and how many false positives they marked as critical. Instead, they pointed the models at the flawed code directly.</p>
]]></description><pubDate>Thu, 09 Apr 2026 16:03:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47705405</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47705405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47705405</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>Gemma 4 31B has knocked several of those models off the Pareto frontier now that it has pricing. Gemma 4 26B A4B has an Elo, but no pricing, so it still isn't on that chart. The Gemma 4 E2B/E4B models still aren't on the arena at all, but I expect them to move the Pareto frontier as well if they're ever added, based on how well they've performed in general.</p>
]]></description><pubDate>Tue, 07 Apr 2026 03:28:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47670401</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47670401</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47670401</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>If you search the model card[0], there is a section titled "Code for processing Audio", which you can probably use to test things out. But, the model card makes the audio support seem disappointing:<p>> Audio supports a maximum length of 30 seconds.<p>[0]: <a href="https://huggingface.co/google/gemma-4-26B-A4B-it#getting-started" rel="nofollow">https://huggingface.co/google/gemma-4-26B-A4B-it#getting-sta...</a></p>
]]></description><pubDate>Fri, 03 Apr 2026 01:20:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47622271</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47622271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47622271</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>The E2B and E4B models support 128k context, not 256k, and even with the 128k... it could take a long time to process that much context on most phones, even with the processor running full tilt. It's hard to say without benchmarks, but 128k supported isn't the same as 128k practical. It will be interesting to see.</p>
]]></description><pubDate>Fri, 03 Apr 2026 01:07:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47622215</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47622215</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47622215</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>Reducing the expert count after training causes catastrophic loss of knowledge and skills. Cerebras does this with their REAP models (although it is applied to the total set of experts, not just routing to fewer experts each time), and it can be okay for very specific use cases if you measure which experts are needed for your use case and carefully choose to delete the least used ones, but it doesn't really provide any general insight into how a higher sparsity model would behave if trained that way from scratch.</p>
]]></description><pubDate>Thu, 02 Apr 2026 22:55:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47621247</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47621247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47621247</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>That rule of thumb was invented years ago, and I don’t think it is relevant anymore, despite how frequently it is quoted on Reddit. It is certainly not the "current" rule of thumb.<p>For the sake of argument, even if we take that old rule of thumb at face value, you can see how the MoE still wins:<p>- (DGX Spark) 273GB/s of memory bandwidth with 3B active parameters at Q4 = 273 / 1.5 = 182 tokens per second as the theoretical maximum.<p>- (RTX 3090) 936GB/s with 24B parameters at Q4 = 936 / 12 = 78 tokens per second. Or 39 tokens per second if you wanted to run at Q8 to maximize the memory usage on the 24GB card.<p>The "slow" DGX Spark is now more than twice as fast as the RTX 3090, thanks to an appropriate MoE architecture. Even with two RTX 3090s, you would still be slower. All else being equal, I would take 182 tokens per second over 78 any day of the week. Yes, an RTX 5090 would close that gap significantly, but you mentioned RTX 3090s, and I also have an RTX 3090-based AI desktop.<p>(The above calculation is dramatically oversimplified, but the end result holds, even if the absolute numbers would probably be less for both scenarios. Token generation is fundamentally bandwidth limited with current autoregressive models. Diffusion LLMs could change that.)<p>The mid-size frontier models are rumored to be extremely sparse like that, but 10x larger on both total and active. No one has ever released an open model that sparse for us to try out.<p>As I said, I wanted to see what it is possible for Google to achieve.<p>> Qwen 3.5 uses 122B-A10B and still is neck and neck with the 27B dense model.<p>From what I've seen, having used both, I would anecdotally report that the 122B model is better in ways that aren't reflected in benchmarks, with more inherent knowledge and more adaptability. But, I agree those two models are quite close, and that's why I want to see greater sparsity and greater total parameters: to push the limits and see what happens, for science.</p>
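<p>A minimal sketch of the back-of-the-envelope arithmetic above, under the same oversimplified assumption that every active weight is read from memory exactly once per generated token. The function name is hypothetical, and the figures are the ones quoted in the comment.</p>

```python
def max_tokens_per_second(bandwidth_gb_s: float,
                          active_params_billions: float,
                          bits_per_weight: float) -> float:
    """Theoretical decode ceiling for a bandwidth-bound model.

    Assumption: each token requires reading all active weights once,
    ignoring KV cache traffic, compute limits, and other overhead.
    """
    gb_read_per_token = active_params_billions * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    spark_moe_q4 = max_tokens_per_second(273, 3, 4)    # DGX Spark, 3B active, Q4
    rtx3090_q4 = max_tokens_per_second(936, 24, 4)     # RTX 3090, 24B dense, Q4
    rtx3090_q8 = max_tokens_per_second(936, 24, 8)     # RTX 3090, 24B dense, Q8
    print(round(spark_moe_q4), round(rtx3090_q4), round(rtx3090_q8))  # 182 78 39
```

<p>Real throughput would land below these ceilings on both machines, but the ratio between them is what the comparison relies on.</p>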
]]></description><pubDate>Thu, 02 Apr 2026 21:46:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47620599</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47620599</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47620599</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>It was just an example of a bug, not that it was the <i>only</i> bug. I’ve personally reported at least one other for Gemma 4 on llama.cpp already.<p>In a few days, I imagine that Gemma 4 support should be in better shape.</p>
]]></description><pubDate>Thu, 02 Apr 2026 21:36:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47620512</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47620512</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47620512</guid></item><item><title><![CDATA[New comment by coder543 in "Google releases Gemma 4 open models"]]></title><description><![CDATA[
<p>That Pareto plot doesn't seem to include the Gemma 4 models <i>anywhere</i> (not just not at the frontier), likely because pricing wasn't available when the chart was generated. At least, I can't find the Gemma 4 models there. So, not particularly relevant until it is updated for the models released today.</p>
]]></description><pubDate>Thu, 02 Apr 2026 20:30:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47619771</link><dc:creator>coder543</dc:creator><comments>https://news.ycombinator.com/item?id=47619771</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47619771</guid></item></channel></rss>