<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: anon373839</title><link>https://news.ycombinator.com/user?id=anon373839</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 23 May 2026 00:12:59 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=anon373839" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by anon373839 in "Was my $48K GPU server worth it?"]]></title><description><![CDATA[
<p>> our infosec department doesn't buy the "zero retention" promise<p>They are wise to be skeptical! It is neither a promise nor zero data retention.<p>Look at Anthropic's Zero Data Retention policy -- and remember, this is the policy that applies only to the select enterprise partners that qualify for a ZDR agreement with Anthropic:<p>> When ZDR is enabled, prompts and model responses generated during Claude Code sessions are processed in real time and not stored by Anthropic after the response is returned, *except where needed to comply with law or combat misuse*.<p>> Even with ZDR enabled, Anthropic may retain data where required by law or to address Usage Policy violations. If a session is flagged for a policy violation, *Anthropic may retain the associated inputs and outputs for up to 2 years*....<p>This means that Anthropic is actively inspecting all of your data with machine learning classifiers. When usage is flagged, for whatever reason, as violating any aspect of Anthropic's Usage Policy, they get to keep your data for 2 years, with no apparent limitation on what they can then use it for.<p>Crucially, you have ZERO guarantees about the sensitivity or specificity of these classifiers. For all anyone knows, Anthropic is silently flagging 75% of queries and retaining the data.<p><a href="https://code.claude.com/docs/en/zero-data-retention" rel="nofollow">https://code.claude.com/docs/en/zero-data-retention</a></p>
]]></description><pubDate>Fri, 22 May 2026 02:50:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48231400</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48231400</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48231400</guid></item><item><title><![CDATA[New comment by anon373839 in "Running local models on an M4 with 24GB memory"]]></title><description><![CDATA[
<p>Hm. I think there is a bit of a shifting goalpost dynamic at play here. Those April releases, even the fast MoE versions, are better than big cloud models from 18 months ago. I remember when everyone was gushing about Sonnet 3.7 and what a transformative experience development was using it. So was it useful or wasn’t it? A tool doesn’t lose its usability just because a better one comes along.<p>To me, these small local LLMs are highly useful (and thus “usable”) even though they don’t match the output of <i>today’s</i> frontier models.</p>
]]></description><pubDate>Mon, 11 May 2026 15:37:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48096433</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48096433</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48096433</guid></item><item><title><![CDATA[New comment by anon373839 in "Agents for financial services and insurance"]]></title><description><![CDATA[
<p>This was their play all along with their unethical data collection practices: let others use the APIs to discover the applications, then use the data against them to offer integrated solutions in every vertical of interest. Cursor, once Anthropic’s biggest customer, was one of the early ones they screwed.<p>They are also fighting for their lives because these insane valuations simply aren’t justified by being dumb pipes. Fortunately, open-weights models are widely available and have crossed a threshold of usefulness that cements their place as good substitutes.</p>
]]></description><pubDate>Tue, 05 May 2026 16:13:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=48024531</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48024531</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48024531</guid></item><item><title><![CDATA[New comment by anon373839 in "The Road to a Billion-Token Context"]]></title><description><![CDATA[
<p>When you read technical papers on various models, you’ll find that they often did most of the pretraining and even the supervised fine-tuning using relatively short-context data; then they “extended” the context window by training on a little bit of long-context data. I think this is what is meant by not being trained uniformly.<p>However, now that RL environments and long-horizon agentic performance have taken such a prominent role in model development, I wonder if that practice still holds. I know that the most recent Gemma and Qwen models are incomparably more reliable at long contexts than their predecessors, even though, e.g., Qwen already had a 256k context. It just didn’t work like it does now.</p>
]]></description><pubDate>Mon, 04 May 2026 12:55:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=48008108</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48008108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48008108</guid></item><item><title><![CDATA[New comment by anon373839 in "New statue in London, attributed to Banksy, of a suited man, blinded by a flag"]]></title><description><![CDATA[
<p>One can’t say that proposition is obvious to the population at large. Else, “we” (as in Earth in 2026) would have very different political dynamics. So maybe Banksy felt inclined to do a public service announcement.</p>
]]></description><pubDate>Mon, 04 May 2026 05:50:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=48005080</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48005080</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48005080</guid></item><item><title><![CDATA[New comment by anon373839 in "LLMs Are Not a Higher Level of Abstraction"]]></title><description><![CDATA[
<p>The model outputs a probability distribution for the next token, given the sequence of all previous tokens in the context window. It’s just a list of floats, one per entry in the tokenizer’s vocabulary and in the same order.<p>After that, a piece of software that is NOT the LLM chooses the next token. This is called the sampler. There are different sampling parameters and strategies available, but if you want repeatable* outputs, just take the token with the highest probability.<p>* Perfect determinism in this sense is difficult to achieve because GPU calculations naturally have a small amount of nondeterminism (floating-point addition is not associative, and parallel kernels may sum in a different order from run to run). But you can get very close.</p>
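<p>A minimal sketch of the split described above, in plain Python. The four-token vocabulary and the scores are made up for illustration, and the function names are hypothetical rather than taken from any library. In practice the model's raw outputs are logits, and a softmax turns them into the probability list.</p>

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw model scores (logits) into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_sample(probs):
    """Repeatable choice: index of the highest-probability token."""
    return max(range(len(probs)), key=lambda i: probs[i])

def random_sample(probs, rng):
    """Stochastic choice: draw an index in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical vocabulary and scores; a real model emits one score per token.
vocab = ["the", "cat", "sat", "mat"]
logits = [1.0, 3.0, 0.5, 2.0]
probs = softmax(logits)

print(vocab[greedy_sample(probs)])                    # always "cat"
print(vocab[random_sample(probs, random.Random(0))])  # depends on the seed
```

<p>The sampler here is ordinary code that only sees the list of floats, which is the point: swapping greedy for random sampling changes nothing about the model itself.</p>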
]]></description><pubDate>Mon, 04 May 2026 00:21:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48003099</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=48003099</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48003099</guid></item><item><title><![CDATA[New comment by anon373839 in "Mike: open-source legal AI"]]></title><description><![CDATA[
<p>Hm, I don't think this looks like Anthropic's design style. Anthropic is doing a sort of Chobanicore + Corporate Memphis design system that I personally find kind of creepy. But the website here just feels fresh and pleasant.</p>
]]></description><pubDate>Thu, 30 Apr 2026 04:38:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47958164</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47958164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47958164</guid></item><item><title><![CDATA[New comment by anon373839 in "Mike: open-source legal AI"]]></title><description><![CDATA[
<p>Agreed; that's a beautiful site. The main design style apart from minimalism that I notice is glassmorphism. Well, that and a very well chosen Monet to set the tone.</p>
]]></description><pubDate>Thu, 30 Apr 2026 03:27:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47957726</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47957726</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47957726</guid></item><item><title><![CDATA[New comment by anon373839 in "Which one is more important: more parameters or more computation? (2021)"]]></title><description><![CDATA[
<p>Well, both can’t be “more important”, since that’s illogical. I think recent strides in high-performance small LLMs have shown that the tasks LLMs are useful for may not require the level of representational capacity that trillion-parameter models offer.<p>However: the labs releasing these high-intelligence-density models are getting them by first training much larger models and then distilling down. So the most interesting question to me is, how can we accelerate learning in small networks to avoid the necessity of training huge teacher networks?</p>
]]></description><pubDate>Sun, 26 Apr 2026 01:13:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47906357</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47906357</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47906357</guid></item><item><title><![CDATA[New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>This is just blind belief. The model discussed in this topic already outperforms “well made” frontier LLMs of 12-18 months ago. If what you wrote is true, that wouldn’t have been possible.</p>
]]></description><pubDate>Wed, 22 Apr 2026 22:56:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47870368</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47870368</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47870368</guid></item><item><title><![CDATA[New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>Absolutely. Plus, as these companies become hungrier for revenue and try to escape the commodity market they are in, they are only going to get more aggressive in their (ab)use of customer data.</p>
]]></description><pubDate>Wed, 22 Apr 2026 22:51:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47870324</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47870324</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47870324</guid></item><item><title><![CDATA[New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"]]></title><description><![CDATA[
<p>I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.</p>
]]></description><pubDate>Wed, 22 Apr 2026 22:47:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47870291</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47870291</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47870291</guid></item><item><title><![CDATA[New comment by anon373839 in "The RAM shortage could last years"]]></title><description><![CDATA[
<p>That's not what consumes the most memory at scale. The KV caches are per-user.</p>
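<p>A rough back-of-the-envelope sketch of why per-user caches add up, in contrast to weights, which are loaded once and shared across requests. The model shape below (48 layers, 8 KV heads, head dimension 128, fp16 values) is a hypothetical configuration chosen for illustration, and the formula assumes a standard full-attention transformer with no cache compression.</p>

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Bytes of KV cache for one sequence: a key and a value vector
    per token, per layer, per KV head."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens

GIB = 1024 ** 3

# Hypothetical model shape and context length, not any specific product.
per_user = kv_cache_bytes(layers=48, kv_heads=8, head_dim=128,
                          context_tokens=32_768)

print(per_user / GIB)        # 6.0 GiB for one user at full context
print(100 * per_user / GIB)  # 600.0 GiB for 100 concurrent users
```

<p>Under these assumptions the cache grows linearly with both context length and the number of concurrent users, while the weights stay a fixed cost.</p>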
]]></description><pubDate>Sun, 19 Apr 2026 08:43:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47822790</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47822790</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47822790</guid></item><item><title><![CDATA[New comment by anon373839 in "The Gemini app is now on Mac"]]></title><description><![CDATA[
<p>It was always possible to store it in the browser’s localStorage, so…</p>
]]></description><pubDate>Wed, 15 Apr 2026 23:31:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47786738</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47786738</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47786738</guid></item><item><title><![CDATA[New comment by anon373839 in "Apple's accidental moat: How the "AI Loser" may end up winning"]]></title><description><![CDATA[
<p>That amount of RAM won’t be necessary. Gemma 4 and comparably sized Qwen 3.5 models are already <i>better</i> than the very best, biggest frontier models were just 12-18 months ago. Now in an 18-36GB footprint, depending on quantization.</p>
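<p>For a sense of where a range like that can come from, here is a hedged bit of arithmetic. The 35B parameter count is an assumption for illustration, and the function name is hypothetical. Real quantized files run somewhat larger because of scale factors and tensors kept at higher precision.</p>

```python
def weight_footprint_gb(params_billions, bits_per_weight):
    """Approximate weight size in GB: parameters times bits, over 8 bits per byte."""
    return params_billions * bits_per_weight / 8

# Hypothetical 35B-parameter model at two common quantization levels.
for bits in (4, 8):
    print(bits, weight_footprint_gb(35, bits))
# 4 17.5
# 8 35.0
```

<p>This counts weights only; the KV cache and runtime overhead come on top.</p>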
]]></description><pubDate>Mon, 13 Apr 2026 10:43:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47750186</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47750186</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47750186</guid></item><item><title><![CDATA[New comment by anon373839 in "How We Broke Top AI Agent Benchmarks: And What Comes Next"]]></title><description><![CDATA[
<p>That is Anthropic’s shtick to a tee.</p>
]]></description><pubDate>Sun, 12 Apr 2026 00:13:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47735077</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47735077</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47735077</guid></item><item><title><![CDATA[New comment by anon373839 in "System Card: Claude Mythos Preview [pdf]"]]></title><description><![CDATA[
<p>That’s not what they are doing. They are just hyping up the product - and, no doubt, trying to foster a climate of awe so that when they ask their friends in Washington to legislate on their behalf, the environment is more receptive.</p>
]]></description><pubDate>Tue, 07 Apr 2026 23:20:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47682585</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47682585</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47682585</guid></item><item><title><![CDATA[New comment by anon373839 in "Trinity Large Thinking"]]></title><description><![CDATA[
<p>Thanks for the tip! Hadn't seen that one.</p>
]]></description><pubDate>Thu, 02 Apr 2026 05:41:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47610380</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47610380</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47610380</guid></item><item><title><![CDATA[New comment by anon373839 in "Trinity Large Thinking"]]></title><description><![CDATA[
<p>Bit of a tangent, but I'm pleased to see that Qwen 3.5 35B is tied with GPT-5.4 and just 2 points behind 4.6 Opus. That little model is so impressively capable and fast! I'm frequently still surprised that I have that level of capability and speed running locally on my laptop.</p>
]]></description><pubDate>Thu, 02 Apr 2026 05:30:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47610323</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47610323</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47610323</guid></item><item><title><![CDATA[New comment by anon373839 in "Ollama is now powered by MLX on Apple Silicon in preview"]]></title><description><![CDATA[
<p>They’re not far behind, unless you mean for “vibe coding”. And for probably 85% of queries that people use LLMs for, you can’t even really perceive the difference between frontier and local.</p>
]]></description><pubDate>Tue, 31 Mar 2026 20:52:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47593330</link><dc:creator>anon373839</dc:creator><comments>https://news.ycombinator.com/item?id=47593330</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47593330</guid></item></channel></rss>