Hacker News: aesthesia

New comment by aesthesia in "Feds freaked over Fable 5 after 'fix this code', not jailbreak, say researchers"

aesthesia — Tue, 16 Jun 2026 15:28:00 +0000

You can see their general approach to guardrail classifiers in these posts:

https://www.anthropic.com/research/constitutional-classifier...

https://www.anthropic.com/research/next-generation-constitut...

It's not just keyword matching, but I'm sure they tuned the Fable classifiers pretty hard to avoid false negatives.

New comment by aesthesia in "SubQ 1.1 Small"

aesthesia — Tue, 16 Jun 2026 15:12:39 +0000

Disappointing they don't actually say how their sparse attention mechanism works.

New comment by aesthesia in "Apple Foundation Models"

aesthesia — Mon, 15 Jun 2026 16:46:38 +0000

From the linked docs page:

> Requests go directly from your app to the Claude API; Apple is not in the request path and does not see prompts or responses. Usage is billed to your Anthropic account at standard API pricing. Your app decides when to use Claude and when to use Apple's on-device model: pass whichever model you want to each session.

New comment by aesthesia in "There is a shadow hanging over this Fable thing"

aesthesia — Sat, 13 Jun 2026 16:35:58 +0000

LLM-isms are much less prevalent in base models, which is what GPT-2 was. It had significant problems with maintaining coherence, but GPT-2-generated text did not have the obvious tells of today's LLMs.

New comment by aesthesia in "Noise infusion banned from statistical products published by Census Bureau"

aesthesia — Sat, 13 Jun 2026 16:29:51 +0000

They can certainly enforce that you answer the survey. But it's very difficult to enforce a requirement that people answer questions accurately, particularly when they perceive that doing so will expose them to danger.

New comment by aesthesia in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

aesthesia — Sat, 13 Jun 2026 06:10:15 +0000

I'm skeptical that you're going to be able to reliably exfiltrate ~10TB of model weights using TEMPEST. Which is not to say weights are secure, just that this isn't the threat model I would be concerned about.

New comment by aesthesia in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

aesthesia — Sat, 13 Jun 2026 03:15:14 +0000

This is not legislation.

New comment by aesthesia in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

aesthesia — Sat, 13 Jun 2026 03:09:03 +0000

Come on, no one was worried that GPT-2 would help people engineer viruses. The concern was generating misinformation and spam.

New comment by aesthesia in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"

aesthesia — Sat, 13 Jun 2026 03:02:41 +0000

Moolenaar's quote: "The AI models these companies use are trained by China’s censorship regime and introduce hidden vulnerabilities that put Americans’ data and businesses at risk." That is, Americans using Chinese-trained AI models are exposed to some form of cybersecurity risk.

That's not really a threat model described in either of the Anthropic posts you shared, which mainly talk about the risks of allowing authoritarian regimes to use powerful US-trained models, and the geopolitical risks of authoritarian countries developing strong AI before democratic/liberal countries do.

New comment by aesthesia in "macOS 27 Beta breaks the ability to boot Asahi Linux"

aesthesia — Fri, 12 Jun 2026 15:21:58 +0000

Word was originally released for the Mac in 1985, so the deal was not that Office would be ported, just that MS would keep developing Office for the Mac.

New comment by aesthesia in "Don't let the LLM speak, just probe it"

aesthesia — Fri, 12 Jun 2026 04:39:57 +0000

This is a neat little trick, but I wonder if you could do substantially the same thing by just prompting/LoRA finetuning the model to produce a single-token output ("yes" or "no"). This only requires a single model forward pass, you can use the same KV caching strategy for shared parts of the prompt, and isotonic regression should work just as well to calibrate the output logits. I guess if you use this method and probe on an internal layer you can skip all the remaining layers, which could be a nice inference speedup.
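
To make the calibration step concrete, here is a rough sketch of isotonic regression (pool-adjacent-violators) applied to a single-token yes/no output. It assumes the score is the margin logit("yes") minus logit("no") from one forward pass; the function names and the toy data are made up for illustration and are not taken from the linked post.

```python
from bisect import bisect_left

def fit_isotonic(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step map from score to P(label = 1).

    Tied scores are not grouped specially; this is only a sketch.
    """
    blocks = []  # each block is [sum of labels, count, largest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the newest one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values

def calibrate(score, thresholds, values):
    """Look up the calibrated probability for a new score (step function)."""
    i = bisect_left(thresholds, score)
    return values[min(i, len(values) - 1)]

# Hypothetical held-out data: yes-minus-no logit margins and true labels.
margins = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
truth = [0, 1, 0, 0, 1, 1]
thresholds, values = fit_isotonic(margins, truth)
```

Calling `calibrate(margin, thresholds, values)` on a new margin then gives a probability that is monotone in the raw logit gap, which is the same property you would want from calibrating a probe's output.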

New comment by aesthesia in "Ear Training Practice"

aesthesia — Fri, 12 Jun 2026 04:10:47 +0000

I appreciate the extremely low fuss interface, but I'm always a little disappointed by chord progression ear training that just plays triads one after another with no thought for voice leading. Generating a nice voice leading for an arbitrary chord progression is a little tricky to do automatically but far from impossible, and might be a fun exercise either for you or your favorite LLM.
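
As a starting point, here is a minimal sketch of one common heuristic: keep the number of voices fixed and move each voice to the nearest note of the next chord, choosing the assignment with the least total motion. It assumes chords are given as pitch classes (0 = C) and voicings as MIDI note numbers; the function names are made up for illustration, and it ignores voice crossing, doubling, and range limits.

```python
from itertools import permutations

def nearest_pitch(current, pitch_class):
    """Return the MIDI note with the given pitch class that is closest to `current`."""
    diff = (pitch_class - current) % 12  # upward distance, 0..11 semitones
    if diff > 6:
        diff -= 12  # moving down is shorter
    return current + diff

def voice_lead(voicing, next_chord):
    """Pick the voicing of `next_chord` that minimizes total semitone movement."""
    best_cost, best_voicing = None, None
    for assignment in permutations(next_chord):
        candidate = [nearest_pitch(n, pc) for n, pc in zip(voicing, assignment)]
        cost = sum(abs(a - b) for a, b in zip(voicing, candidate))
        if best_cost is None or cost < best_cost:
            best_cost, best_voicing = cost, candidate
    return best_voicing

# C major (C4, E4, G4) moving to F major: C stays, E rises to F, G rises to A.
c_major = [60, 64, 67]
f_major = voice_lead(c_major, [5, 9, 0])
```

Brute force over permutations is fine for triads and seventh chords; a real implementation would probably add penalties for crossed voices and parallel fifths.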

New comment by aesthesia in "Ear Training Practice"

aesthesia — Fri, 12 Jun 2026 04:04:10 +0000

Using only 3/2 ratios can sound pretty bad in just intonation as well. Major thirds tuned to 81/64 are off (by a ratio of 81/80) compared with the standard 5/4 tuning, and they don't sound great. This difference is called the syntonic comma and it's been a major issue in the history of tuning.
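
The arithmetic is easy to check with exact fractions. This sketch assumes the 81/64 third is built the usual Pythagorean way, by stacking four pure fifths and dropping two octaves; the variable names are just for illustration.

```python
import math
from fractions import Fraction

fifth = Fraction(3, 2)

# Four pure fifths up (C-G-D-A-E), then two octaves down, gives the Pythagorean major third.
pythagorean_third = fifth**4 / 4

# The just major third from the harmonic series.
just_third = Fraction(5, 4)

# The gap between the two thirds is the syntonic comma.
syntonic_comma = pythagorean_third / just_third

# Size of the comma in cents (1200 cents per octave).
comma_cents = 1200 * math.log2(float(syntonic_comma))

print(pythagorean_third, syntonic_comma, round(comma_cents, 1))
```

That works out to a bit over a fifth of an equal-tempered semitone, which is plenty to hear in a sustained major third.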

New comment by aesthesia in "Open Reproduction of DeepSeek-R1"

aesthesia — Thu, 11 Jun 2026 15:09:13 +0000

If you really want to see fully open training pipelines for modern LLMs, Olmo and to a lesser extent Nemotron are what you should look at.

https://github.com/allenai/OLMo

https://github.com/NVIDIA-NeMo/Nemotron

New comment by aesthesia in "Policy on the AI Exponential"

aesthesia — Wed, 10 Jun 2026 21:03:43 +0000

> They are asking for FAA style preclearance and third party audits. That literally means no new AI startup can emerge. Do they not know that audits cost money?

Training frontier AI models costs money, orders of magnitude more than third-party audits. If you can afford to build the model, you can afford to have it audited.

New comment by aesthesia in "AI profitability is mathematically impossible"

aesthesia — Tue, 09 Jun 2026 21:08:00 +0000

Yep, in their analysis depreciation meant "get no useful work out of the GPU after this point," though.

New comment by aesthesia in "GPT-2: Too Dangerous To Release (2019)"

aesthesia — Tue, 09 Jun 2026 20:59:32 +0000

One of the main purposes of model cards, from the beginning, has been to outline the ways that a model could be harmful or dangerous, and mitigations that can be or have been taken to reduce those risks. How do you expect labs to publish model cards without talking about this rationale?

New comment by aesthesia in "AI profitability is mathematically impossible"

aesthesia — Tue, 09 Jun 2026 19:06:00 +0000

Oh, just noticed one other very significant error: they evaluate revenue using input token pricing while counting capacity using generated tokens per second. There's a big gap between input and output token pricing, and between prefill TPS and generation TPS.

New comment by aesthesia in "AI profitability is mathematically impossible"

aesthesia — Tue, 09 Jun 2026 18:58:06 +0000

There are some glaring local errors that make this analysis less than trustworthy. For instance, an assumption that corporate income tax applies directly to revenue, or a supposedly generous assumption that GPUs will fully depreciate after 3 years (6-year-old A100s are still in very high demand!). I would love to read a really well-thought-through investigation of inference costs and how they relate to token pricing, but I have low confidence that this is it.

New comment by aesthesia in "Claude Fable 5"

aesthesia — Tue, 09 Jun 2026 18:23:13 +0000

I mean, they do actually describe what that extra work was, and people elsewhere in this thread are complaining about the effects of those safeguards. So it's not like this is purely empty rhetoric.