Hacker News: aszen

New comment by aszen in "Rethinking search as code generation"

aszen — Tue, 02 Jun 2026 19:14:52 +0000

Claude Code already fans out and sandboxes context by calling subagents, so I'm not sure this approach brings much benefit there. A complex search strategy only makes sense if the search is slow and compute-intensive.

New comment by aszen in "Rethinking search as code generation"

aszen — Tue, 02 Jun 2026 19:08:53 +0000

Coding agents prefer iterative search; I have yet to see them create a complex search script. They try different search commands in parallel, evaluate the results, and then refine or dive deeper.

This approach usually works great, but I can see many use cases where a smarter search strategy may make sense, especially to optimize context.
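To make the loop concrete, here is a minimal sketch of that iterate-and-refine pattern. Everything in it is hypothetical: the in-memory `FILES` dict stands in for a repo, and `search`, `iterative_search`, and `follow_calls` are made-up helpers, not what any real agent runs.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "repo": path -> file contents.
FILES = {
    "auth/login.py": "def login(user): return check_token(user)\n",
    "auth/token.py": "def check_token(user): return user.token is not None\n",
    "docs/readme.md": "Sign in from the start page.\n",
}

def search(pattern):
    """Return sorted (path, line_no, line) hits for one regex pattern."""
    rx = re.compile(pattern)
    return sorted(
        (path, n, line)
        for path, text in FILES.items()
        for n, line in enumerate(text.splitlines(), start=1)
        if rx.search(line)
    )

def iterative_search(patterns, refine, max_rounds=3):
    """Run patterns in parallel, keep new hits, then ask `refine` for follow-ups."""
    seen, hits = set(), []
    for _ in range(max_rounds):
        patterns = [p for p in patterns if p not in seen]
        if not patterns:
            break
        seen.update(patterns)
        with ThreadPoolExecutor() as pool:
            results = list(pool.map(search, patterns))
        new_hits = []
        for result in results:
            for hit in result:
                if hit not in hits:
                    hits.append(hit)
                    new_hits.append(hit)
        patterns = refine(new_hits)
    return hits

def follow_calls(new_hits):
    """Toy refinement: look up the definition of every function named in a hit."""
    names = {name for _, _, line in new_hits for name in re.findall(r"(\w+)\(", line)}
    return [rf"def {name}\(" for name in sorted(names)]

hits = iterative_search([r"def login\("], follow_calls)
for path, line_no, line in hits:
    print(f"{path}:{line_no}: {line}")
```

The first round finds `login`, the refinement step notices the call to `check_token`, and the second round dives into its definition. A "smarter" strategy would mostly change what `refine` returns and how much of each hit is kept in context.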

New comment by aszen in "Ask HN: How are you handling QA being bottlenecked with more AI-generated PRs?"

aszen — Fri, 08 May 2026 07:59:01 +0000

By slowing down engineers with AI agents adding multiple code reviews on top. Also encouraging engineers to engage in manual testing themselves to better understand the product.

New comment by aszen in "An update on recent Claude Code quality reports"

aszen — Thu, 23 Apr 2026 18:16:45 +0000

Claude Code is not infra; the model is the infra. They changed settings to make their models faster and probably cheaper to run too. Honestly, with adaptive thinking it no longer matters what model it is if you can dynamically make it do less or more work.

New comment by aszen in "Parallel agents in Zed"

aszen — Thu, 23 Apr 2026 12:48:18 +0000

Same here. Reviewing gets harder too, and multitasking kills any kind of productivity if you need to review the code then.

My approach these days is to do one change at a time, until I can fully merge it with confidence.

New comment by aszen in "Show HN: Mdarena – Benchmark your Claude.md against your own PRs"

aszen — Mon, 06 Apr 2026 09:03:35 +0000

This is quite interesting, will try it. I kind of expect this to be done continuously as the code base changes.

New comment by aszen in "Closed Source vs. Open Source AI: A Cage Fight Few People Understand"

aszen — Thu, 26 Mar 2026 19:21:48 +0000

This article doesn't mention the moat of data gathering: frontier AI labs have a huge advantage in curating proprietary datasets from actual usage of their platforms.

This in turn allows them to optimize their models for the long tail of tasks where open-weight models can't compete.

Another factor is that pure intelligence isn't enough; how the model communicates is a huge plus. An enterprise used to talking to Claude all day won't switch to another model easily.

New comment by aszen in "Ask HN: How are people doing AI evals these days?"

aszen — Wed, 11 Mar 2026 07:54:58 +0000

Seems like you are testing LLMs' generic abilities rather than your actual agent logic.

LLMs are like vendor code: you don't need to test them yourself, people have already created benchmarks for that.
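One way to test the agent logic on its own is to swap the model for a scripted stub, the same way you would stub any vendor dependency. This is a minimal sketch under my own assumptions: `run_agent`, `ScriptedLLM`, the reply dict shape, and the `lookup` tool are all hypothetical, not any framework's API.

```python
def run_agent(llm, question, tools, max_steps=5):
    """Tiny agent loop: the model either requests a tool or gives a final answer."""
    transcript = [("user", question)]
    for _ in range(max_steps):
        reply = llm(transcript)
        if reply["type"] == "final":
            return reply["text"]
        tool = tools.get(reply["tool"])
        result = tool(reply["arg"]) if tool else f"unknown tool: {reply['tool']}"
        transcript.append(("tool", result))
    return "gave up"

class ScriptedLLM:
    """Stub model that replays canned replies and records what it was shown."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.calls = []

    def __call__(self, transcript):
        self.calls.append(list(transcript))  # snapshot of the transcript so far
        return self.replies.pop(0)

llm = ScriptedLLM([
    {"type": "tool", "tool": "lookup", "arg": "order-42"},
    {"type": "final", "text": "Order 42 has shipped."},
])
tools = {"lookup": lambda key: f"{key}: shipped"}
answer = run_agent(llm, "Where is order 42?", tools)

# The assertions target the loop, not the model: was the tool result fed back in?
assert answer == "Order 42 has shipped."
assert llm.calls[1][-1] == ("tool", "order-42: shipped")
```

The checks here cover tool dispatch and transcript handling, which is the code you actually own. Model quality is left to the public benchmarks.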

New comment by aszen in "The L in "LLM" Stands for Lying"

aszen — Thu, 05 Mar 2026 15:41:18 +0000

If you buy real handcrafted scarves, they are both thinner and warmer than anything factory-made because of their choice of pashmina wool.

New comment by aszen in "Improving 15 LLMs at Coding in One Afternoon. Only the Harness Changed"

aszen — Thu, 12 Feb 2026 16:08:17 +0000

So the new implementation always operates at the line level, replacing one or more lines. That's not ideal for some refactorings like rename where search and replace is faster.

Edit:

Checking ohmypi, the model has access to str replace too, so this is just an edit tool.
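A rough sketch of why rename is awkward at the line level. This is not the harness's actual code: `replace_lines` and `rename` are hypothetical helpers I made up to compare the two edit styles.

```python
import re

def replace_lines(text, start, end, new_lines):
    """Line-level edit: replace lines start..end (1-indexed, inclusive)."""
    lines = text.split("\n")
    lines[start - 1:end] = new_lines
    return "\n".join(lines)

def rename(text, old, new):
    """Search and replace on whole-word matches of an identifier."""
    return re.sub(rf"\b{re.escape(old)}\b", lambda _: new, text)

SOURCE = "total = 0\nfor x in items:\n    total += x\nprint(total)"

# One call covers every occurrence.
renamed = rename(SOURCE, "total", "subtotal")

# The same rename as line-level edits: every touched line is re-emitted in full.
by_lines = replace_lines(SOURCE, 1, 1, ["subtotal = 0"])
by_lines = replace_lines(by_lines, 3, 4, ["    subtotal += x", "print(subtotal)"])

assert by_lines == renamed
```

Both routes end at the same text, but the line-level one has to locate and rewrite each affected line, which grows with the number of occurrences.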

New comment by aszen in "After two years of vibecoding, I'm back to writing by hand [video]"

aszen — Sat, 24 Jan 2026 16:51:05 +0000

I bet writing the code directly could have been even faster; LLMs aren't magically fast.

New comment by aszen in "Bubblewrap: A nimble way to prevent agents from accessing your .env files"

aszen — Fri, 16 Jan 2026 18:49:17 +0000

https://devenv.sh/integrations/secretspec/

New comment by aszen in "Bubblewrap: A nimble way to prevent agents from accessing your .env files"

aszen — Thu, 15 Jan 2026 09:14:00 +0000

I wonder why we are even storing secrets in .env files in plain text

New comment by aszen in "AI is a business model stress test"

aszen — Sun, 11 Jan 2026 09:52:30 +0000

How? You don't know what the LLM was trained on and don't know if it has any bias. IMO LLMs are a disaster for knowledge work because they act like a black box.

New comment by aszen in "Fly's Sprites.dev addresses dev environment sandboxes and API sandboxes together"

aszen — Sun, 11 Jan 2026 01:44:47 +0000

Stupid question, but why not use a local sandbox for yolo mode instead of a remote machine?

Is there a similar service that runs locally?

New comment by aszen in "How to code Claude Code in 200 lines of code"

aszen — Thu, 08 Jan 2026 23:20:37 +0000

Agreed, it probably contributes to the model improving for all agents, but crucially it is verifiably better against their own agent. So they get a good feedback loop to improve both.

New comment by aszen in "How to code Claude Code in 200 lines of code"

aszen — Thu, 08 Jan 2026 21:54:01 +0000

They nailed the UX, I would say, and the models themselves are a lot better even outside of CC.

New comment by aszen in "How to code Claude Code in 200 lines of code"

aszen — Thu, 08 Jan 2026 21:52:26 +0000

Yeah, that's one example, but I suspect they train the model on entire sequences of tool calls, so unless you prompt the model exactly as they do you won't get the same results.

There's a reason they won the agent race: their models are trained to use their own tools.

New comment by aszen in "OpenAPI Isn't Enough"

aszen — Thu, 08 Jan 2026 21:40:30 +0000

Seems odd to not mention other semantic standards that standardize resource operations like pagination, sorting etc.

JSON-LD, JSON:API

New comment by aszen in "How to code Claude Code in 200 lines of code"

aszen — Thu, 08 Jan 2026 21:26:23 +0000

The most important part is editing code; to do that reliably, Claude models are trained on their own str replace tool schema, I think. Models find it hard to modify existing code, and they also can't just rewrite whole files because that's expensive and doesn't scale.
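For illustration, a minimal sketch of what a str-replace style edit can look like. The function and its `old_str`/`new_str` parameters are my own illustrative naming, not a confirmed copy of any vendor's tool schema, and the uniqueness check is an assumption about how such a tool would stay safe.

```python
def str_replace(text, old_str, new_str):
    """Replace exactly one occurrence of old_str; refuse missing or ambiguous matches."""
    count = text.count(old_str)
    if count == 0:
        raise ValueError("old_str not found")
    if count > 1:
        raise ValueError(f"old_str matches {count} places; add more context")
    return text.replace(old_str, new_str, 1)

SOURCE = (
    "def add(a, b):\n"
    "    return a + b\n"
    "\n"
    "def sub(a, b):\n"
    "    return a - b\n"
)

# The model only emits the snippet to change, not the whole file.
patched = str_replace(SOURCE, "return a - b", "return b - a")

# An ambiguous snippet is rejected instead of silently editing the wrong spot.
try:
    str_replace(SOURCE, "(a, b):", "(x, y):")
    ambiguous_rejected = False
except ValueError:
    ambiguous_rejected = True
```

The output cost scales with the size of the change rather than the size of the file, which is the point about whole-file rewrites not scaling.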