Hacker News: anon373839

New comment by anon373839 in "Was my $48K GPU server worth it?"

anon373839 — Fri, 22 May 2026 02:50:45 +0000

> our infosec department doesn't buy the "zero retention" promise

They are wise to be skeptical! It is neither a promise nor zero data retention.

Look at Anthropic's Zero Data Retention policy, and remember that this is the policy for the select enterprise partners who can even qualify for a ZDR agreement with Anthropic:

> When ZDR is enabled, prompts and model responses generated during Claude Code sessions are processed in real time and not stored by Anthropic after the response is returned, *except where needed to comply with law or combat misuse*.

> Even with ZDR enabled, Anthropic may retain data where required by law or to address Usage Policy violations. If a session is flagged for a policy violation, *Anthropic may retain the associated inputs and outputs for up to 2 years*....

This means that Anthropic is actively inspecting all of your data with machine learning classifiers. When the usage is flagged for whatever reason as violating any aspect of Anthropic's Usage Policy, then they get to keep your data for 2 years, with no apparent limitation on what they can then use it for.

Crucially, you have ZERO guarantees about the sensitivity or specificity of these classifiers. For all anyone knows, Anthropic is silently flagging 75% of queries and retaining the data.

https://code.claude.com/docs/en/zero-data-retention
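
To make the sensitivity/specificity point concrete, here is a rough sketch of how a classifier's error rates translate into the share of sessions that get flagged. Every number below is made up for illustration; none of them is a measurement of any real classifier, and the function name is hypothetical.

```python
def flag_rates(prevalence, sensitivity, specificity):
    """Return (fraction of all sessions flagged, share of flagged sessions that are benign)."""
    true_pos = sensitivity * prevalence                 # real violations that get caught
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # benign sessions flagged anyway
    flagged = true_pos + false_pos
    return flagged, false_pos / flagged

# ASSUMED numbers: 0.1% of sessions truly violate policy,
# the classifier catches 95% of those and wrongly flags 1% of benign sessions.
flagged, benign_share = flag_rates(prevalence=0.001, sensitivity=0.95, specificity=0.99)
print(f"flagged: {flagged:.2%} of sessions, of which {benign_share:.0%} are benign")
```

Under these assumed numbers, most flagged sessions are false positives, simply because benign sessions vastly outnumber violating ones. Without published error rates there is no way to know where the real figures sit.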

New comment by anon373839 in "Running local models on an M4 with 24GB memory"

anon373839 — Mon, 11 May 2026 15:37:06 +0000

Hm. I think there is a bit of a shifting goalpost dynamic at play here. Those April releases, even the fast MoE versions, are better than big cloud models from 18 months ago. I remember when everyone was gushing about Sonnet 3.7 and what a transformative experience development was using it. So was it useful or wasn’t it? A tool doesn’t lose its usability just because a better one comes along.

To me, these small local LLMs are highly useful (and thus “usable”) even though they don’t match the output of today’s frontier models.

New comment by anon373839 in "Agents for financial services and insurance"

anon373839 — Tue, 05 May 2026 16:13:56 +0000

This was their play all along with their unethical data collection practices: let others use the APIs to discover the applications, then use the data against them to offer integrated solutions in every vertical of interest. Cursor, once Anthropic’s biggest customer, was one of the early ones they screwed.

They are also fighting for their lives because these insane valuations simply aren’t justified by being dumb pipes. Fortunately, open weights models are widely available and have crossed a threshold of usefulness that cements their place as good substitutes.

New comment by anon373839 in "The Road to a Billion-Token Context"

anon373839 — Mon, 04 May 2026 12:55:29 +0000

When you read technical papers on various models, you’ll find that they often did most of the pretraining and even the supervised fine tuning using relatively short context data; then they “extended” the context window by training on a little bit of long context data. I think this is what is meant by not being trained uniformly.

However, now that RL environments and long-horizon agentic performance have taken such a prominent role in model development, I wonder if that practice still holds. I know that the most recent Gemma and Qwen models are incomparably more reliable at long contexts than their predecessors, even though Qwen, for example, already had a 256k context. It just didn’t work like it does now.

New comment by anon373839 in "New statue in London, attributed to Banksy, of a suited man, blinded by a flag"

anon373839 — Mon, 04 May 2026 05:50:02 +0000

One can’t say that proposition is obvious to the population at large. Otherwise, “we” (as in Earth in 2026) would have very different political dynamics. So maybe Banksy felt inclined to do a public service announcement.

New comment by anon373839 in "LLMs Are Not a Higher Level of Abstraction"

anon373839 — Mon, 04 May 2026 00:21:52 +0000

The model outputs a probability distribution for the next token, given the sequence of all previous tokens in the context window. It’s just a list of floats in the same order as the list of tokens that the tokenizer uses.

After that, a piece of software that is NOT the LLM chooses the next token. This is called the sampler. There are different sampling parameters and strategies available, but if you want repeatable* outputs, just take the token with the highest probability number.

* Perfect determinism in this sense is difficult to achieve because GPU calculations naturally have a minor bit of nondeterminism. But you can get very close.
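
A minimal sketch of that split between model and sampler, in plain Python. The 4-token vocabulary and the scores are invented for illustration, and the function names are hypothetical rather than any library's API. The model's job ends at the list of scores; everything after that is ordinary code.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw model scores (logits) into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_sample(probs):
    """Repeatable choice: index of the highest-probability token."""
    return max(range(len(probs)), key=probs.__getitem__)

def random_sample(probs, rng):
    """Stochastic choice: draw a token index in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

# Hypothetical vocabulary and made-up scores, in the same order.
vocab = ["the", "cat", "sat", "mat"]
logits = [1.0, 3.0, 0.5, 2.0]
probs = softmax(logits)

greedy_token = vocab[greedy_sample(probs)]                     # same answer every run
sampled_token = vocab[random_sample(probs, random.Random(0))]  # depends on the seed
```

Greedy selection always picks the same index for the same distribution, which is where the repeatability comes from; the random sampler only repeats if you also fix its seed.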

New comment by anon373839 in "Mike: open-source legal AI"

anon373839 — Thu, 30 Apr 2026 04:38:42 +0000

Hm, I don't think this looks like Anthropic's design style. Anthropic is kind of doing a Chobanicore + Corporate Memphis design system that I personally find kind of creepy. But the website here just feels fresh and pleasant.

New comment by anon373839 in "Mike: open-source legal AI"

anon373839 — Thu, 30 Apr 2026 03:27:07 +0000

Agreed; that's a beautiful site. The main design style apart from minimalism that I notice is glassmorphism. Well, that and a very well chosen Monet to set the tone.

New comment by anon373839 in "Which one is more important: more parameters or more computation? (2021)"

anon373839 — Sun, 26 Apr 2026 01:13:28 +0000

Well, they can’t both be “more important”; that’s illogical. I think recent strides in high performance small LLMs have shown that the tasks LLMs are useful for may not require the level of representational capacity that trillion-parameter models offer.

However: the labs releasing these high-intelligence-density models are getting them by first training much larger models and then distilling down. So the most interesting question to me is, how can we accelerate learning in small networks to avoid the necessity of training huge teacher networks?

New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

anon373839 — Wed, 22 Apr 2026 22:56:53 +0000

This is just blind belief. The model discussed in this topic already outperforms “well made” frontier LLMs of 12-18 months ago. If what you wrote is true, that wouldn’t have been possible.

New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

anon373839 — Wed, 22 Apr 2026 22:51:48 +0000

Absolutely. Plus, as these companies grow hungrier for revenue and try to escape the commodity market they are in, they are only going to get more aggressive in their (ab)use of customer data.

New comment by anon373839 in "Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model"

anon373839 — Wed, 22 Apr 2026 22:47:36 +0000

I would recommend trying oMLX, which is much more performant and efficient than LM Studio. It has block-level KV context caching that makes long chats and agentic/tool calling scenarios MUCH faster.
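
For anyone wondering why block-level caching helps with long chats: here is a toy sketch of the general idea of prefix caching by block. This is NOT oMLX's implementation (I haven't described its internals here); the class, the block size, and the placeholder string standing in for KV tensors are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per block; tiny on purpose, for illustration

class BlockPrefixCache:
    def __init__(self):
        self._blocks = {}  # chained prefix hash -> stand-in for cached KV data

    @staticmethod
    def _chain_keys(tokens):
        """Yield one key per full block; each key depends on the entire prefix so far."""
        h = hashlib.sha256()
        full = len(tokens) - len(tokens) % BLOCK_SIZE
        for start in range(0, full, BLOCK_SIZE):
            block = tokens[start:start + BLOCK_SIZE]
            h.update(",".join(map(str, block)).encode() + b";")
            yield h.hexdigest()

    def lookup_and_store(self, tokens):
        """Return how many leading tokens were already cached, then cache the new blocks."""
        reused = 0
        still_matching = True
        for i, key in enumerate(self._chain_keys(tokens)):
            if still_matching and key in self._blocks:
                reused = (i + 1) * BLOCK_SIZE
            else:
                still_matching = False
                self._blocks[key] = "kv-for-block"  # placeholder, not real tensors
        return reused

cache = BlockPrefixCache()
turn1 = list(range(10))                            # first request: 10 token ids
first = cache.lookup_and_store(turn1)              # nothing cached yet
turn2 = turn1 + [100, 101, 102, 103, 104, 105]     # follow-up extends the same chat
second = cache.lookup_and_store(turn2)             # the shared leading blocks are reused
```

In a chat or tool-calling loop, each request repeats the whole earlier conversation, so only the blocks past the shared prefix need fresh computation.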

New comment by anon373839 in "The RAM shortage could last years"

anon373839 — Sun, 19 Apr 2026 08:43:37 +0000

That's not what consumes the most memory at scale. The KV caches are per-user.
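
Some back-of-the-envelope arithmetic, assuming the comparison is against model weights, which are loaded once and shared by everyone. The model dimensions below are hypothetical and chosen only to keep the numbers round; they don't describe any particular model.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_value=2):
    """Size of one sequence's KV cache; the 2x is one key plus one value per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_value

GIB = 1024 ** 3

# ASSUMED shape: 48 layers, 8 KV heads of dimension 128, fp16 values, 32k-token context.
per_user = kv_cache_bytes(n_layers=48, n_kv_heads=8, head_dim=128, n_tokens=32768)
per_user_gib = per_user / GIB       # cache for a single long conversation
fleet_gib = 100 * per_user_gib      # 100 concurrent users, each with their own cache
print(f"{per_user_gib:.1f} GiB per user, {fleet_gib:.0f} GiB for 100 users")
```

The weights are a fixed cost, but this term grows with both context length and the number of concurrent users.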

New comment by anon373839 in "The Gemini app is now on Mac"

anon373839 — Wed, 15 Apr 2026 23:31:01 +0000

It was always possible to store it in the browser’s localStorage, so…

New comment by anon373839 in "Apple's accidental moat: How the "AI Loser" may end up winning"

anon373839 — Mon, 13 Apr 2026 10:43:02 +0000

That amount of RAM won’t be necessary. Gemma 4 and comparably sized Qwen 3.5 models are already better than the very best, biggest frontier models were just 12-18 months ago. Now in an 18-36GB footprint, depending on quantization.
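
Rough arithmetic behind a range like that, as a sketch: weight memory is roughly parameter count times bits per weight. The 35-billion-parameter figure is an assumed example size, and real quantized files run somewhat larger because of scales and layers kept at higher precision.

```python
def weight_footprint_gb(n_params, bits_per_weight):
    """Approximate weight memory in decimal GB, ignoring quantization overhead."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 35_000_000_000  # ASSUMED example size, not a spec for any named model

four_bit = weight_footprint_gb(N_PARAMS, 4)
eight_bit = weight_footprint_gb(N_PARAMS, 8)
print(f"4-bit: ~{four_bit:.1f} GB, 8-bit: ~{eight_bit:.1f} GB")
```

On top of the weights you still need room for the KV cache and the runtime itself.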

New comment by anon373839 in "How We Broke Top AI Agent Benchmarks: And What Comes Next"

anon373839 — Sun, 12 Apr 2026 00:13:40 +0000

That is Anthropic’s shtick to a tee.

New comment by anon373839 in "System Card: Claude Mythos Preview [pdf]"

anon373839 — Tue, 07 Apr 2026 23:20:18 +0000

That’s not what they are doing. They are just hyping up the product - and, no doubt, trying to foster a climate of awe so that when they ask their friends in Washington to legislate on their behalf, the environment is more receptive.

New comment by anon373839 in "Trinity Large Thinking"

anon373839 — Thu, 02 Apr 2026 05:41:12 +0000

Thanks for the tip! Hadn't seen that one.

New comment by anon373839 in "Trinity Large Thinking"

anon373839 — Thu, 02 Apr 2026 05:30:27 +0000

Bit of a tangent, but I'm pleased to see that Qwen 3.5 35B is tied with GPT-5.4 and just 2 points behind 4.6 Opus. That little model is so impressively capable and fast! I'm frequently still surprised that I have that level of capability and speed running locally on my laptop.

New comment by anon373839 in "Ollama is now powered by MLX on Apple Silicon in preview"

anon373839 — Tue, 31 Mar 2026 20:52:40 +0000

They’re not far behind, unless you mean for “vibe coding”. And for probably 85% of queries that people use LLMs for, you can’t even really perceive the difference between frontier and local.