Hacker News: adamthegoalie

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Wed, 13 May 2026 05:41:42 +0000

Fair point. But since Claude Code I guess? :D

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Tue, 12 May 2026 22:16:00 +0000

Your review is ready, sir.

https://github.com/Vija02/TheOpenPresenter/pull/170#issuecom...

This was claude-only, no codex / --ensemble mode. It ran against the branch's base, which is 1 behind main.

Would love your feedback when you have the chance!

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Tue, 12 May 2026 21:02:10 +0000

I will be trying these and will report back!

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Tue, 12 May 2026 21:01:29 +0000

Well, it doesn’t give you an opinion on whether you should merge or not. It gives you a list of issues along with details about those issues, such as fix hints, and whether you need human attention before fixing.

It’s all given to you in a structured file, in your chat, and as a nicely formatted PR comment.

Still up to the human to decide whether and when to merge based on the output!
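For anyone wondering what a structured list of issues could look like, here is a minimal sketch. It is not adamsreview's actual file format: the `Issue` class, its field names, and `split_by_attention` are all hypothetical, chosen only to mirror the description above (issue details, a fix hint, and a flag for whether a human should look first).

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Issue:
    # All field names are assumptions for illustration, not the tool's schema.
    title: str
    details: str
    fix_hint: str
    needs_human: bool


def split_by_attention(issues):
    """Separate issues an agent could fix directly from those needing a human."""
    auto = [i for i in issues if not i.needs_human]
    human = [i for i in issues if i.needs_human]
    return auto, human


def to_json(issues):
    """Serialize the issue list so it can be written to a structured file."""
    return json.dumps([asdict(i) for i in issues], indent=2)


if __name__ == "__main__":
    issues = [
        Issue("Unchecked null", "Value may be None here", "Add a guard", False),
        Issue("Schema change", "Alters a public contract", "Discuss first", True),
    ]
    auto, human = split_by_attention(issues)
    print(len(auto), "auto-fixable;", len(human), "need human attention")
    print(to_json(issues))
```

The merge decision itself stays outside anything like this; the sketch only organizes findings.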

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Tue, 12 May 2026 20:57:28 +0000

Yesss! On it now! You’ll get a comment in 1-2 hours.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 18:44:07 +0000

Here's an example: https://github.com/adamjgmiller/worktreehq/pull/145#issuecom...

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 18:43:54 +0000

Meanwhile here's what it looks like - just ran it on my own other repo: https://github.com/adamjgmiller/worktreehq/pull/145#issuecom...

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 16:00:03 +0000

That is such a great point. We do need evals for this - and not just ones that the model companies use themselves. They have to be public and sharable and easy to use ourselves.

And in terms of sharing, I agree. On one hand, so many of us are already doing this ourselves. On the other hand, when I was first learning CC and agentic engineering (vibe coding at the time :) ), I did find some of these random people's templates useful.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:58:18 +0000

That is such a good point.

I recently opened a PR against this AI personal finance tool Ray https://github.com/cdinnison/ray-finance/pull/8 to add an Apple Card import feature, since Apple Card is not supported by Plaid.

I built the manual import feature, opened the PR, and then ran a code review.

What I hadn't thought about when I built the feature was how many implications importing data from Apple would have for the rest of the app, and how each one would have to be considered and integrated for the manual import to be a first-class feature, not "just a manual import" of data.

I ended up running adamsreview against it like 5-10 times, before considering it complete, as I learned that there was much more to the integration than I realized.

Now is that necessarily a problem? Maybe not. I should have realized from the start that the import feature was going to be much more than just a small feature. But at least, thanks to the review loop, I got it completely right before the PR was merged.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:26:37 +0000

Does anyone have an open PR on a public repo?

I'll run this against your PR for you with my CC credits as a sort-of benchmark! Send me your PR link :)

I'm going to create one on one of my other repos meanwhile and add a link to the review when it's ready.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:15:27 +0000

No, not at all! I haven't used it to review itself actually, for the most part, because adamsreview is mostly English language, not much code.

Here's a comment from adamsreview, but even this was 3 weeks ago, and I've worked on it a lot since then: https://github.com/cdinnison/ray-finance/pull/8#issuecomment...

I'll try to find a good public PR to review sometime soon so I can share that and add it to the README. This is really good feedback. I should have had something like this ready before posting to Show HN.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:11:37 +0000

It burns a lot of tokens, that's for sure.

Friction - maybe? Depending on what you mean.

But it's extremely useful and effective compared to everything else out there that I've tried, if you're looking for an AI code review. Let me know if you try it - or find anything else that might work too without the bazillion prompts :)

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:10:35 +0000

That's awesome to hear and I'd love to see it when you're ready.

I actually think having something like adamsreview orchestrated by deterministic code - instead of simply having AI agents use deterministic code occasionally as this app does - could be even better!

The problem I ran into is that if you build a deterministic app that happens to use LLMs instead of the other way around, I don't think there's any way to get it to use your Claude Code subscription credits. It has to use the API. And something like adamsreview would end up being so expensive if not subsidized by Anthropic along with the rest of our CC usage.

Curious to hear about your experience.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:08:13 +0000

I totally see where you guys are coming from.

It sometimes feels silly to me to have AI reviewing AI reviewing AI all the way down - see my above comment https://news.ycombinator.com/item?id=48095831

But with human judgment inserted into the right steps, it's really LLMs leveraging human thought at key stages and then, to your point, LLMs fighting LLMs fighting LLMs all the way down until...

Someone like me who has loved software his whole life and never been able to build anything more than a front-end website himself is building entire applications. So maybe it's worth the complexity!

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:03:22 +0000

Do you mean that, even though the tooling keeps getting better, people aren't putting the effort into using it?

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:02:09 +0000

Definitely!

adamsreview is mostly English language, since it is instructing your CC agents. While there are a number of Python scripts and JSON storage, you - and your agent - would easily be able to add your own rules to this. It also respects your CLAUDE.md, and one of the lenses it already uses is checking for CLAUDE.md compliance etc.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 15:00:59 +0000

On small PRs (small features / changes ~hundreds to thousands of lines), I'd say around 500,000 total tokens.

On large PRs (new feature sets for apps ~10,000-30,000 lines), around 2-3 million total tokens.

By the way, I should have mentioned in my original post, adamsreview counts tokens used by sub-agents across the stages, and tells you at the end of each stage the total used so far.
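To illustrate the idea of a running total reported at the end of each stage, here is a small sketch. It is an assumption about how such counting could work, not adamsreview's code: the `TokenTally` class, the stage names, and the numbers are all made up for illustration.

```python
from collections import defaultdict


class TokenTally:
    """Hypothetical running tally of sub-agent token usage, grouped by stage."""

    def __init__(self):
        self.by_stage = defaultdict(int)

    def record(self, stage, tokens):
        """Add one sub-agent's token usage to the given stage."""
        self.by_stage[stage] += tokens

    def total_so_far(self):
        """Total tokens used across every stage recorded so far."""
        return sum(self.by_stage.values())

    def stage_report(self, stage):
        """One-line summary to show when a stage finishes."""
        return (f"{stage}: {self.by_stage[stage]:,} tokens "
                f"(total so far: {self.total_so_far():,})")


if __name__ == "__main__":
    tally = TokenTally()
    # Illustrative numbers only, not measurements.
    tally.record("review", 1_000)
    tally.record("review", 2_500)
    print(tally.stage_report("review"))
    tally.record("fix", 500)
    print(tally.stage_report("fix"))
```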

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 14:58:25 +0000

That's a great idea. I had trouble finding anything like this, a benchmark made for (AI) code reviewers.

I had expected to find something like an eval harness available on GitHub, but couldn't find it.

Any suggestions? Or maybe we/I/someone should build something like this?

I suppose one challenge is that if it's going to be publicly available, it would also be easy to cheat, but it still seems it would be useful if people agreed it's a good benchmark and could easily re-test tools themselves.

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 14:55:02 +0000

Hey thanks for the comment and the question.

I would say my workflow for any meaningful amount of work is (all in Claude Code):

- PRD: I discuss and brainstorm with Claude Code using something like the Grill Me skill https://github.com/mattpocock/skills/tree/main/skills/produc... but that I've modified a bit for my own style, until I have a good PRD (what the goals / design decisions are for what I'm building)

--- I run this PRD through multiple AI reviews (sometimes ChatGPT Pro for really important PRDs, because it seems to have some of the best critical feedback)

--- I read the PRD myself in detail before finalizing.

- PLAN: I have Claude Code develop the plan for implementing the PRD. Again, I have this reviewed several times by CC and sometimes by other tools for effectiveness, consistency with the PRD, consistency with the codebase, and internal consistency.

- EXECUTE: I have an orchestration command I made that has CC execute the PLAN and use a build journal, using sub-agents whenever possible to save context, so that it can operate for up to several hours.

- QUICK REVIEWS: I have two commands, /review-fix-loop and /quick-dual-review, which loop through running a Claude+Codex sub-agent review and then fixing anything critical (deferring items that need human judgment)

- CODE REVIEWS: This is when I run between one and several of the adamsreview reviews, starting with /review --ensemble, then /walkthrough, then /fix, until I am satisfied.

Would it be useful if I packaged all this stuff into a GH repo to share with you and others?

New comment by adamthegoalie in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

adamthegoalie — Mon, 11 May 2026 14:48:18 +0000

I was thinking about building a GitHub repo made for evaluating Code Reviews. Something like a complex app (or perhaps a few branches with different options), and then PRs on each branch with varying types and degrees of bugs for a Code Review to find.

I suppose this would not be a 'real' benchmark because it would be public and so you couldn't necessarily trust scores people share about how their own tool did, but it would at least allow anyone to try out code review tools on their own and report relative effectiveness and characteristics.
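As a rough sketch of how such a repo could score a tool, assume each PR comes with a list of seeded bugs and the review under test reports findings as file and line pairs. Everything here is hypothetical: the `Finding` type, the `score_review` function, and the line tolerance of 3 are illustrative choices, not an existing benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    line: int


def score_review(seeded, reported, tolerance=3):
    """Return (precision, recall) for a review against the seeded bugs.

    A reported finding counts as a hit if it names the same file as a
    not-yet-matched seeded bug and lands within `tolerance` lines of it.
    """
    matched = set()
    true_positives = 0
    for r in reported:
        for s in seeded:
            same_place = s.file == r.file and abs(s.line - r.line) <= tolerance
            if s not in matched and same_place:
                matched.add(s)
                true_positives += 1
                break
    precision = true_positives / len(reported) if reported else 0.0
    recall = len(matched) / len(seeded) if seeded else 0.0
    return precision, recall


if __name__ == "__main__":
    seeded = [Finding("a.py", 10), Finding("b.py", 20)]
    reported = [Finding("a.py", 12), Finding("c.py", 5)]
    print(score_review(seeded, reported))
```

Matching by location alone says nothing about whether the review described the bug correctly, so a real harness would likely need a judgment step on top of this.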

I'll post again if I end up finding or building something like that. I couldn't find anything when I looked previously.

I'll also keep in mind your question as I continue testing this, because you are right that it would be useful to be able to describe what is different, not just the magnitude of bugs found.