Hacker News: kakugawa

New comment by kakugawa in "US Government directive to suspend access to Fable 5 and Mythos 5"

kakugawa — Sat, 13 Jun 2026 01:15:18 +0000

So, how is it being disabled? It still shows "Fable 5" on all surfaces (to me). Is it being silently degraded to Opus under-the-hood?

Edit: Fable 5 was just disabled.

New comment by kakugawa in "Claude Fable 5"

kakugawa — Tue, 09 Jun 2026 18:37:14 +0000

I didn't see Fable 5 in the `/model` list, until I ran it with: `$ claude --model fable-5`

New comment by kakugawa in "Is AI causing a repeat of frontend’s lost decade?"

kakugawa — Fri, 29 May 2026 14:11:50 +0000

a11y testing is non-trivial. axe-core can automatically detect many types of issues. However, reaching enough compliance (to avoid being sued) requires end-to-end testing and human judgement, e.g. for keyboard traps, focus restoration, alt-text, etc.
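
To illustrate the split between automatable checks and human judgement, here is a toy sketch in Python. It is not axe-core and does not use its API; `MissingAltChecker`, `find_images_missing_alt`, and the sample markup are made up for illustration. It flags `<img>` tags with no `alt` attribute, which is the kind of rule a tool can check mechanically.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collects the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing.append(attributes.get("src", "<no src>"))


def find_images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    checker = MissingAltChecker()
    checker.feed(html)
    checker.close()
    return checker.missing


# Hypothetical sample markup, for illustration only.
SAMPLE = (
    '<img src="logo.png">'
    '<img src="chart.png" alt="image">'
    '<img src="team.jpg" alt="Team photo at the offsite">'
)

if __name__ == "__main__":
    print(find_images_missing_alt(SAMPLE))
```

The check only reports `logo.png`. `chart.png` passes because it has an `alt` attribute, even though `alt="image"` tells a screen-reader user nothing. Deciding whether alt-text is actually useful is the part that still needs a person.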

New comment by kakugawa in "Claude Opus 4.8"

kakugawa — Thu, 28 May 2026 21:07:12 +0000

Thank you for pointing this out.

New comment by kakugawa in "Claude Opus 4.8"

kakugawa — Thu, 28 May 2026 20:29:34 +0000

Opus 4.7 does not support disabling adaptive thinking (web, Claude Code). [1] Like the OP, I experienced similar issues and I'm glad that they brought back the ability to disable adaptive thinking in Opus 4.8.

[1] https://code.claude.com/docs/en/model-config#adaptive-reason...

> Opus 4.7 and later always use adaptive reasoning. The fixed thinking budget mode and `CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING` do not apply to them.

New comment by kakugawa in "Apple unveils new accessibility features"

kakugawa — Tue, 19 May 2026 20:25:21 +0000

It's prob why they chose a11y features. They have more pain, so they're willing to tolerate more growing pains. (And prob more motivated to provide feedback.)

New comment by kakugawa in "Google changes its search box"

kakugawa — Tue, 19 May 2026 20:21:00 +0000

I've found Google AI Search to be good for really topical searches. And its conversational ability has noticeably improved over the last year. I can now have a (short) conversation where I reference past messages.

New comment by kakugawa in "I believe there are entire companies right now under AI psychosis"

kakugawa — Fri, 15 May 2026 21:52:59 +0000

https://claude.ai/settings/general (Instructions for Claude)

---

Treat my claims as hypotheses, not decisions. Before agreeing with a proposed change, state the strongest case against it. Ask what evidence a change is based on before evaluating it. Distinguish tactical observations from strategic commitments — don't silently promote one to the other. If you paraphrase my proposal, name what you changed. Mark confidence explicitly: guessing / fairly sure / well-established. Give reasoning and evidence for claims, not just conclusions. Flag what would change your mind. Rank concerns by cost-of-being-wrong; lead with the highest-stakes ones. Say hard things plainly, then soften if needed — not the other way around. For drafting, brainstorming, or casual questions, ease off and match the task.

---

Beware though that it can be an annoying little shit w/ this prompt. Prepare yourself emotionally, because you are explicitly making the tradeoff that it will be annoyingly pedantic, and in return it will lessen (not eliminate) its sycophancy. These system instructions are not fool-proof, but they help (at the start of the conversation, at least).

New comment by kakugawa in "I believe there are entire companies right now under AI psychosis"

kakugawa — Fri, 15 May 2026 21:33:48 +0000

He uses AI himself, so I agree he doesn't see AI use as black/white.

Hard agree about ideas, thinking, advice. AI's sycophancy is a huge subtle problem. I've tried my best to create a system prompt to guard against this w/ Opus 4.7. It doesn't adhere to it 100% of the time and the longer the conversation goes, the worse the sycophancy gets (because the system instructions become weaker and weaker). I have to actively look for and guard against sycophancy whenever I chat w/ Opus 4.7.

New comment by kakugawa in "Bun Rust rewrite: "codebase fails basic miri checks, allows for UB in safe rust""

kakugawa — Fri, 15 May 2026 19:23:01 +0000

You could view it as a specific application of the quote.

In your quote, there is no time-dependency between the lie and the truth. Whereas here, it's an attractive lie (easily parsed, great narrative), followed up by truths (that need more than surface-level analysis).

New comment by kakugawa in "Check Your Fucking Sources, People"

kakugawa — Fri, 15 May 2026 15:13:43 +0000

Have we forgotten how bad LLMs were at citing sources when they first came out? So, we had to build a lot of structure (harness engineering) and frontier labs had to do specific training to try to compensate for this.

So, LLMs are inherently bad at citing sources. A lot of effort has been put in to improve this behavior, but it's compensating for an inherent flaw.

New comment by kakugawa in "Claude Code users hitting usage limits 'way faster than expected'"

kakugawa — Tue, 31 Mar 2026 14:58:29 +0000

gemini-cli has not been usable for weeks. The API endpoint it uses for subscription users is so heavily rate-limited that the CLI is non-functional. There are many reports of this issue on GitHub. [1]

[1] https://github.com/google-gemini/gemini-cli/issues?q=is%3Ais...

New comment by kakugawa in "Gemini 3.1 Pro"

kakugawa — Thu, 19 Feb 2026 21:33:10 +0000

In mid-2024, Anthropic made the deliberate decision to stop chasing benchmarks and focus on practical value. There was a lot of skepticism at the time, but it's proven to be a prescient decision.

New comment by kakugawa in "Gemini 3 Deep Think drew me a good SVG of a pelican riding a bicycle"

kakugawa — Sat, 14 Feb 2026 20:12:08 +0000

That's how you know you've made it: when your pet benchmark becomes a target.

New comment by kakugawa in "Claude Code is being dumbed down?"

kakugawa — Wed, 11 Feb 2026 19:36:38 +0000

How much longer is Anthropic going to allow OpenCode to use Pro/Max subscriptions? Yes, it's technically possible, but it's against Anthropic's ToS. [1]

[1] https://blog.devgenius.io/you-might-be-breaking-claudes-tos-...

New comment by kakugawa in "Software factories and the agentic moment"

kakugawa — Sat, 07 Feb 2026 19:10:26 +0000

It's short-term vs long-term optimization. Short-term optimization is making the system effective right now. Long-term optimization is exploring ways to improve the system as a whole.

New comment by kakugawa in "Claude Code Is the Inflection Point"

kakugawa — Thu, 05 Feb 2026 20:55:29 +0000

What It Is, How We Use It, Industry Repercussions, Microsoft's Dilemma, Why Anthropic Is Winning

Claude Code Is the Inflection Point

kakugawa — Thu, 05 Feb 2026 20:55:29 +0000

Article URL: https://newsletter.semianalysis.com/p/claude-code-is-the-inflection-point

Comments URL: https://news.ycombinator.com/item?id=46905162

Points: 7

# Comments: 3

New comment by kakugawa in "LLM Output Drift in Financial Workflows: Validation and Mitigation (arXiv)"

kakugawa — Wed, 12 Nov 2025 21:06:31 +0000

Defeating Nondeterminism in LLM Inference

https://news.ycombinator.com/item?id=45200925

https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

> As it turns out, our request’s output does depend on the parallel user requests. Not because we’re somehow leaking information across batches — instead, it’s because our forward pass lacks “batch invariance”, causing our request’s output to depend on the batch size of our forward pass.

tl;dr: the way inference is batched introduces non-determinism.
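
A minimal sketch of one commonly cited ingredient, in Python (this is not code from the linked post): floating-point addition is not associative, so reducing the same values in differently sized chunks can give different sums. `left_fold`, `chunked_sum`, and the sample values are illustrative assumptions. The chunk size here only stands in for how a reduction might be split differently when the batch size changes.

```python
def left_fold(values):
    """Add values strictly left to right with plain float addition."""
    total = 0.0
    for value in values:
        total += value
    return total


def chunked_sum(values, chunk_size):
    """Sum each chunk left to right, then sum the per-chunk partials."""
    partials = [
        left_fold(values[start:start + chunk_size])
        for start in range(0, len(values), chunk_size)
    ]
    return left_fold(partials)


# Illustrative inputs: the exact mathematical sum is 2.0.
VALUES = [1e16, 1.0, -1e16, 1.0]

if __name__ == "__main__":
    for size in (4, 2):
        print(size, chunked_sum(VALUES, size))
```

With one chunk of four the result is `1.0`, and with two chunks of two it is `0.0`. Neither matches the exact sum of 2.0, and the two disagree with each other even though the inputs are identical. The explicit loop is deliberate: newer Python versions make the built-in `sum()` more accurate for floats, which would hide the effect.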

New comment by kakugawa in "Show HN: Realm (YC S11), a mobile database"

kakugawa — Tue, 15 Jul 2014 19:47:32 +0000

CoreData is not as stable as advertised. If you use it simply, it works. However, advanced use quickly runs into sharp edges. I devote a non-trivial amount of code to managing objects between contexts, and there are still FRC bugs (e.g. sorted update).

And it's all closed source.