Hacker News: joshuaisaact

New comment by joshuaisaact in "Embarrassingly simple self-distillation improves code generation"

joshuaisaact — Sun, 05 Apr 2026 07:50:33 +0000

This was a really interesting paper, but there's a massive gap in what they tried: inference-time temperature changes based on the fork/lock distinction.

Maybe I'll try that myself, because it feels like it could be a great source of improvements. It would be really useful to see adaptive per-token sampling as an additional decode-only baseline.
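
A minimal sketch of what adaptive per-token sampling could look like as a decode-only baseline. It is not from the paper. It assumes a "fork" is a position where several continuations are plausible and a "lock" is one where a single token dominates, and it uses normalized entropy as a stand-in for that distinction; the paper's own definition may differ. The function names and the `low`, `high`, and `threshold` values are hypothetical.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over temperature-scaled logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0)

def adaptive_temperature(logits, low=0.2, high=1.0, threshold=0.5):
    """Assumed proxy: high normalized entropy = fork, low = lock."""
    probs = softmax(logits)
    max_entropy = math.log(len(logits))
    normalized = entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    # Explore at forks, stay near-greedy at locks.
    return high if normalized >= threshold else low

def sample_token(logits, rng=random):
    """Sample one token index using a temperature chosen for this position."""
    t = adaptive_temperature(logits)
    probs = softmax(logits, t)
    index = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return index, t
```

In a real decoder, `logits` would come from the model at each step. A hard threshold is the simplest choice; a smooth mapping from entropy to temperature would be an obvious variant to compare.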

New comment by joshuaisaact in "European Tech Alternatives"

joshuaisaact — Thu, 19 Feb 2026 06:34:50 +0000

ASML is European and is arguably the most strategically important company in the entire semiconductor supply chain.

New comment by joshuaisaact in "European Tech Alternatives"

joshuaisaact — Thu, 19 Feb 2026 06:32:32 +0000

Have you heard of a little company called Arm Holdings?

It was a travesty that the UK government let it be sold, admittedly.

New comment by joshuaisaact in "I proved my AI agent can't skip the approval step (196 states, zero bypasses)"

joshuaisaact — Mon, 16 Feb 2026 14:15:47 +0000

I've been exploring Petri nets as a formalism for AI agent safety: specifically, proving properties like termination and human-gate enforcement exhaustively across every reachable state, rather than testing them on sample inputs. This post benchmarks the approach against n8n and ReAct on the same agent. Tomorrow I'm open-sourcing the engine as a declarative rules DSL.
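
A toy illustration of checking a property across every reachable state. This is not the engine or the DSL mentioned above. The net, its place and transition names, and the `approval_seen` bookkeeping place are all hypothetical, and the search only terminates because this particular net is bounded.

```python
from collections import deque

PLACES = ("pending", "approved", "rejected", "executed", "approval_seen")

# Each transition maps to (tokens consumed, tokens produced) by place name.
# Hypothetical net: a request must pass through "approve" before "execute".
TRANSITIONS = {
    "approve": ({"pending": 1}, {"approved": 1, "approval_seen": 1}),
    "reject": ({"pending": 1}, {"rejected": 1}),
    "execute": ({"approved": 1}, {"executed": 1}),
}

def enabled(marking, consume):
    # A transition is enabled when every input place holds enough tokens.
    return all(marking[PLACES.index(p)] >= n for p, n in consume.items())

def fire(marking, consume, produce):
    # Return the new marking after removing inputs and adding outputs.
    counts = list(marking)
    for p, n in consume.items():
        counts[PLACES.index(p)] -= n
    for p, n in produce.items():
        counts[PLACES.index(p)] += n
    return tuple(counts)

def reachable(initial):
    """Breadth-first search over every marking reachable from `initial`."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for consume, produce in TRANSITIONS.values():
            if enabled(marking, consume):
                nxt = fire(marking, consume, produce)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen

def gate_holds(marking):
    # Safety property: nothing has executed without a recorded approval.
    m = dict(zip(PLACES, marking))
    return m["executed"] <= m["approval_seen"]

initial = (2, 0, 0, 0, 0)  # two pending requests, nothing else
states = reachable(initial)
assert all(gate_holds(s) for s in states)
```

The point of the sketch is the shape of the argument: the assertion runs over the full set of reachable markings, so a pass is a proof for this net rather than a result on sampled inputs.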

I proved my AI agent can't skip the approval step (196 states, zero bypasses)

joshuaisaact — Mon, 16 Feb 2026 14:15:47 +0000

Article URL: https://joshtuddenham.dev/blog/agent-safety/

Comments URL: https://news.ycombinator.com/item?id=47035215

Points: 1

# Comments: 1

New comment by joshuaisaact in "Ask HN: Are you using an agent orchestrator to write code?"

joshuaisaact — Fri, 13 Feb 2026 05:48:50 +0000

I don't think you need two separate models for this - I get similarly good results re-prompting with Claude. Well, not re-prompting, I just have a skill that wipes the context then gets Claude to review the current PR and make improvements before I review it.

You're Building Petri Nets. You're Just Building Them Badly

joshuaisaact — Fri, 13 Feb 2026 05:47:30 +0000

Article URL: https://joshtuddenham.dev/blog/petri-nets/

Comments URL: https://news.ycombinator.com/item?id=46999339

Points: 2

# Comments: 1

Show HN: FizzBuzz Enterprise Edition 2026. AI-powered divisibility detection

joshuaisaact — Thu, 05 Feb 2026 15:20:52 +0000

Article URL: https://github.com/joshuaisaact/fizzbuzz-enterprise-edition-2026

Comments URL: https://news.ycombinator.com/item?id=46900632

Points: 2

# Comments: 0

New comment by joshuaisaact in "The future of software engineering is SRE"

joshuaisaact — Mon, 26 Jan 2026 06:00:21 +0000

Couldn't disagree with this article more. I think the future of software engineering is more T-shaped.

Look at the 'Product Engineer' roles we are seeing spreading in forward-thinking startups and scaleups.

That's the future of SWE I think. SWEs take on more PM and design responsibilities as part of the existing role.

New comment by joshuaisaact in "Claude Code's new hidden feature: Swarms"

joshuaisaact — Sun, 25 Jan 2026 12:36:15 +0000

Fair pushback. The distinction I'm drawing is between:

A. Using a role prompt to configure a single function's scope ("you are a code reviewer, focus on X") - totally reasonable, leverages training

B. Building an elaborate multi-agent orchestration layer with hand-offs, coordination protocols, and framework abstractions on top of that

I'm not arguing against A. I'm arguing that B often adds complexity without proportional benefit, especially as models get better at long-context reasoning.

Fairly recent research (arXiv May 2025: "Single-agent or Multi-agent Systems?" - https://arxiv.org/abs/2505.18286) found that MAS benefits over single-agent diminish as LLM capabilities improve. The constraints that motivated swarm architectures are being outpaced by model improvements. I admit the field is moving fast, but the direction of travel appears to be that the better the models get, the simpler your abstractions need to be.

So yes, use roles. But maybe don't reach for a framework to orchestrate a PM handing off to an Engineer handing off to QA when a single context with scoped instructions would do.

New comment by joshuaisaact in "Claude Code's new hidden feature: Swarms"

joshuaisaact — Sun, 25 Jan 2026 12:31:27 +0000

Fair point on the date - the paper was updated October 2024 with Llama-3 and Qwen2.5 (up to 72B), same findings. The v1 to v3 revision is interesting. They initially found personas helped, then reversed their conclusion after expanding to more models.

"Comprehensively disproven" was too strong - should have said "evidence suggests the effect is largely random." There's also Gupta et al. 2024 (arxiv.org/abs/2408.08631) with similar findings if you want more data points.

New comment by joshuaisaact in "Claude Code's new hidden feature: Swarms"

joshuaisaact — Sun, 25 Jan 2026 09:00:05 +0000

This has been pretty comprehensively disproven:

https://arxiv.org/abs/2311.10054

Key findings:

- Tested 162 personas across 6 types of interpersonal relationships and 8 domains of expertise, with 4 LLM families and 2,410 factual questions

- Adding personas in system prompts does not improve model performance compared to the control setting where no persona is added

- Automatically identifying the best persona is challenging, with predictions often performing no better than random selection

- While adding a persona may lead to performance gains in certain settings, the effect of each persona can be largely random

Fun piece of trivia - the paper was originally designed to prove the opposite result (that personas make LLMs better). They revised it when they saw the data completely disproved their original hypothesis.

New comment by joshuaisaact in "Claude Code's new hidden feature: Swarms"

joshuaisaact — Sun, 25 Jan 2026 08:45:05 +0000

This feels like massively overengineering something very simple.

Agents are stateless functions with a limited heap (context window) that degrades in quality as it fills. Once you see it that way, the whole swarm paradigm is just function scoping and memory management cosplaying as an org chart:

Agent = function

Role = scope constraints

Context window = local memory

Shared state file = global state

Orchestration = control flow

The solution isn't assigning human-like roles to stateless functions. It's shared state (a markdown file) and clear constraints.
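
A rough sketch of the mapping above, under stated assumptions: `Agent`, `run_agent`, `orchestrate`, and `fake_model` are hypothetical names, the model call is a stub, and the shared state is kept as an in-memory markdown string rather than a file so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Agent:
    name: str          # label used for the section heading in shared state
    instructions: str  # the "role": nothing more than scope constraints

def run_agent(agent: Agent, shared_state: str,
              call_model: Callable[[str], str]) -> str:
    """Stateless call: the prompt is rebuilt from scratch every time."""
    prompt = f"{agent.instructions}\n\n# Shared state\n{shared_state}"
    return call_model(prompt)

def orchestrate(agents: List[Agent], shared_state: str,
                call_model: Callable[[str], str]) -> str:
    """Plain control flow: run each agent in order, append to shared state."""
    for agent in agents:
        output = run_agent(agent, shared_state, call_model)
        shared_state += f"\n## {agent.name}\n{output}\n"
    return shared_state

def fake_model(prompt: str) -> str:
    # Stand-in for a real model call; reports how much context it was given.
    return f"saw {len(prompt.splitlines())} lines of context"

agents = [
    Agent("review", "Review the diff. Only comment on correctness."),
    Agent("tests", "Propose tests for the issues listed in shared state."),
]
final_state = orchestrate(agents, "Task: fix the login bug.", fake_model)
```

Each agent only ever sees its own instructions plus the shared state, so nothing carries over between calls except what was written down.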

New comment by joshuaisaact in "Software engineers can no longer neglect their soft skills"

joshuaisaact — Sun, 18 Jan 2026 15:04:41 +0000

Fair. I'll retire 'bringing people along with you' before it ends up on a motivational poster with a stock photo of a rowing team.

Though you're right that there's no I in team. There is one in AI though, which probably tells us something.

Speed Vertigo: A New Kind of Engineering Debt

joshuaisaact — Sun, 18 Jan 2026 14:53:16 +0000

Article URL: https://joshtuddenham.dev/blog/vertigo/

Comments URL: https://news.ycombinator.com/item?id=46668237

Points: 1

# Comments: 0

New comment by joshuaisaact in "Iconify: Library of Open Source Icons"

joshuaisaact — Sun, 18 Jan 2026 14:46:52 +0000

This is a brilliant library, thanks so much for sharing it.

New comment by joshuaisaact in "Software engineers can no longer neglect their soft skills"

joshuaisaact — Sun, 18 Jan 2026 14:45:20 +0000

I may have misread your comment, but I don't think soft skills are a 'narrow thing' at all. Effective communication, building trust, bringing people along with you - these are fundamental to being an effective human, not some niche pivot.

New comment by joshuaisaact in "Software engineers can no longer neglect their soft skills"

joshuaisaact — Sun, 18 Jan 2026 14:39:58 +0000

This couldn't ring more true to me - I think one of the consequences of the rapid change we are seeing in the profession is that skills that were typically required only at more senior levels are now required further down the stack.

If I were a junior today, I'd be studying business impact, effective communication, and project management: skills you could previously get away with under-indexing on until senior+.

New comment by joshuaisaact in "Show HN: KeelTest – AI-driven VS Code unit test generator with bug discovery"

joshuaisaact — Wed, 07 Jan 2026 14:51:49 +0000

I notice one of the things you don't really talk about in the blog post (or if you did, I missed it) is unnecessary tests, which is one of the key problems LLMs have with test writing.

In my experience, if you just ask an LLM to write tests, it'll write you a ton of boilerplate happy-path tests that aren't wrong per se, just pointless (one fun one in React is 'the component renders').

How do you plan to handle this?