Hacker News: lmeyerov

New comment by lmeyerov in "Copper transport drug restores memory and clears toxic Alzheimer's proteins"

lmeyerov — Tue, 16 Jun 2026 05:15:24 +0000

I don't have a horse in this race, but for anyone who has worked in it, "science advances one funeral at a time" comes to mind here

New comment by lmeyerov in "Claude Fable is relentlessly proactive"

lmeyerov — Fri, 12 Jun 2026 05:59:16 +0000

This is a funny one because it seems less about what Fable is doing cleverly and more about the bitter lesson and data flywheels

Our UX agentic engineering flow, like many others', is Playwright doing things and, as part of the UX review skill, taking screenshots and verifying them against the written specs. Likewise, like many others, we vibe coded the flows to set all that up and tweak them over time. When we hit prod issues or scraping tasks, we sometimes do something similar. In some of our envs we don't have Playwright, so we do it other ways.

Now imagine a million developers using Claude Code, how many of them are doing web & frontend stuff, and what the data flywheel looks like there. So how much is really needed for this use case to be native?

New comment by lmeyerov in "Google to pay SpaceX $920M a month for compute capacity at xAI data centers"

lmeyerov — Fri, 05 Jun 2026 20:57:12 +0000

tesla not paying bills: https://www.cnn.com/2025/07/31/us/elon-musk-company-unpaid-l...

x not paying bills: https://www.cnbc.com/2023/02/24/musks-twitter-has-been-sued-...

spacex not paying bills: https://www.fastcompany.com/91124157/spacex-contractors-texa...

New comment by lmeyerov in "Anthropic confidentially submits draft S-1 to the SEC"

lmeyerov — Tue, 02 Jun 2026 03:33:19 +0000

? Very much agreed, the IPO pop is a manufactured pricing event focused on investor dynamics rather than direct fair market pricing, making it more of a gamble than normal. Including gambles in index funds defeats the point.

Maybe the confusing point was my involvement is (discounted) pre-IPO shares, which almost by definition, is not an activity accessible to retail investors.

New comment by lmeyerov in "Anthropic confidentially submits draft S-1 to the SEC"

lmeyerov — Mon, 01 Jun 2026 21:12:28 +0000

Mostly by having a pulse for the last 10-20 years as someone in the bay area, seeing it play out repeatedly as tech IPOs get dumped onto retail investors, including the 'good' ones. Being lucky enough to participate in IPOs makes you check these when deciding how to balance an IPO pop exit (weeks/months) vs the long-term tax benefits of holding (2yr+).

- The initial pop is known to be manufactured by banks, so mostly benefits insiders, so good time to diversify. I'm conservative so sold to cover effective basis or whatever risk strategy :)

- The lockup period (6mo) is a similarly known artificial event, and studies show that

- Tech companies take ~8 quarters of prep for the IPO as they do financial engineering to transition from VC growth-at-all-costs to public $, and I'd expect about the same time for whatever nonsense they pulled to juice numbers to shake out. And that's not including oddballs like the Musk alternate universe, just normal tech companies covering up EBITDA and low interest rate madness.

- Tech is especially volatile as an industry, so even more skepticism here. Eg, the latest IPO I was involved in was a successful professional social network play, and chatgpt killed it.

Most/all of these are googleable things

New comment by lmeyerov in "Anthropic confidentially submits draft S-1 to the SEC"

lmeyerov — Mon, 01 Jun 2026 19:29:48 +0000

4-8 quarters for most tech IPOs to settle. IPOs are manufactured for the good times around young co's, so not surprising, and economic stability isn't a question of days/weeks/months.

And yes often a falling knife

This is pretty predictably wall street & federal regulators scamming normal people, retirement funds, etc, taking their fees and exit window at everyone else's expense

New comment by lmeyerov in "WH proposes rules giving political appointees final approval on research grants"

lmeyerov — Sat, 30 May 2026 23:02:59 +0000

R1 work generally doesn't have a replication crisis, and generally incrementalism is the bigger issue there, which is in turn tied to penny pinching

The bigger issue is failure to significantly increase r&d funding, vs last decade+ shrinkages and Trump-era eating of the young, and focuses like you now propose suggest a continuation of such economy-inhibiting thinking. Also, note how your post was goalpost moving. This in turn is classic trolling with asymmetric effort, so I don't see your response in good faith.

New comment by lmeyerov in "WH proposes rules giving political appointees final approval on research grants"

lmeyerov — Sat, 30 May 2026 14:55:10 +0000

Useless russian-troll-style argument:

- With no workers working, no worker fraud problem, sure. If you cut core scientific processes, politicize science, and destabilize paycheck predictability enough to chase everyone good out of science, then yes any small amount of waste is also caught in the cuts.

- This seems to increase what you call bad "fun": Increases abuse of tax funding being corruptly given to projects advocated by political appointees despite rejection by scientific peer review. Vicious feedback loop.

New comment by lmeyerov in "Outsourcing plus local AI will soon become more economical vs. frontier labs"

lmeyerov — Tue, 26 May 2026 15:35:39 +0000

Fwiw, the cost per answer, which is what ultimately matters, is going down. In a competitive market with oss and multiple frontier labs, it is hard to maintain a premium long-term.

The big question is how subsidies vs technology improvement will play out. As we saw with Uber, selling at a loss can happen for a very long time, and technology improves relentlessly.

For reference, we publish https://botsbench.com/ that shows time and cost per answer are going down while quality is going up.

New comment by lmeyerov in "The current AI pricing was always going to go away"

lmeyerov — Fri, 22 May 2026 16:59:56 +0000

oss models don't directly matter when multiple at-scale frontier API providers have to compete on price: they are limited in defensible margin

They do matter in that oss researchers enable faster cross-pollination of good inferencing efficiency improvements to help the big boys adapt ideas from the community

Long-term local ai may matter more, but imo not there until models + hw get way better (1-2 years?). Reasoning grade quality at speed is still $$$: we need fast opus, not slow sonnet.

New comment by lmeyerov in "Every AI Subscription Is a Ticking Time Bomb for Enterprise"

lmeyerov — Sun, 17 May 2026 15:32:00 +0000

Not really. Claude Code harness with Sonnet 4.5 model showed you don't really need bigger GPU rollouts, and it's only a matter of time for OSS combos to hit that. Over time, this will only get better, and the set of enterprise tasks smaller deployments can handle will only go up.

New comment by lmeyerov in "I don't think AI will make your processes go faster"

lmeyerov — Sun, 17 May 2026 15:26:50 +0000

For a while now, it's felt similar to what we see in parallel computing:

- shift towards throughput-oriented vs latency-oriented. Can juggle more tasks, but increasingly hard to speed up individual ones.

- strong scaling is tough. Might even see slowdowns for individual tasks, so reliable benefits come from being able to juggle more and eat the per-task inefficiency

- amdahl's law: we can't speed up tasks beyond their longest sequential (human) unit, so our work becomes identifying those bits and working on them. Related: you can buy bandwidth, but you can't buy latency
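
To put rough numbers on the Amdahl's law point, here is a minimal sketch. The 25% sequential (human) share and the speedup factors are made-up illustration values, not measurements, and the function names are hypothetical.

```python
def amdahl_speedup(sequential_fraction: float, speedup_of_rest: float) -> float:
    """Overall speedup when only the non-sequential part gets faster."""
    if not 0.0 <= sequential_fraction <= 1.0:
        raise ValueError("sequential_fraction must be in [0, 1]")
    if speedup_of_rest <= 0:
        raise ValueError("speedup_of_rest must be positive")
    accelerated_fraction = 1.0 - sequential_fraction
    return 1.0 / (sequential_fraction + accelerated_fraction / speedup_of_rest)


def speedup_ceiling(sequential_fraction: float) -> float:
    """Limit of the overall speedup as the accelerated part becomes free."""
    if sequential_fraction == 0:
        return float("inf")
    return 1.0 / sequential_fraction


# Assumption: 25% of a task is a sequential human step (review, decisions).
HUMAN_FRACTION = 0.25

for factor in (2, 10, 100):
    overall = amdahl_speedup(HUMAN_FRACTION, factor)
    print(f"rest is {factor}x faster -> task is {overall:.2f}x faster")

print(f"ceiling: {speedup_ceiling(HUMAN_FRACTION):.1f}x")
```

Under those assumptions, making everything else 100x faster still leaves the task under 4x faster overall, which is why the work shifts to finding and shrinking the sequential bits.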

New comment by lmeyerov in "Frontier AI has broken the open CTF format"

lmeyerov — Sat, 16 May 2026 21:15:50 +0000

I think that ship has sailed as well --

botsbench.com shows Sonnet 4.5+ with Claude Code harness does pretty well, and Sonnet roughly tracks the edge of what self-hosted models do on the upper tier of affordable GPUs, like running 1-2 DGX Sparks and waiting 6mo for oss to catch up a bit

New comment by lmeyerov in "Frontier AI has broken the open CTF format"

lmeyerov — Sat, 16 May 2026 17:01:51 +0000

It's tough. We run botsbench.com, which tracks AI progress on a top CTF, and I gave a talk at CCC a few months ago on our own results doing AI speed runs, so I think about this a lot.

In the trainings we give (AI agents for security, and a graph masterclass), we ended up leaning into it. For example, we ship with a skills bundle. There are plus sides, like less code-forward participants can go further and are appreciating that, and there is less of a gap between high-level concepts and successful hands-on work. But at the same time, manual work does build a lot of intuition & knowledge that gets missed in auto modes.

New comment by lmeyerov in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

lmeyerov — Tue, 12 May 2026 00:43:41 +0000

Yep, a few views here:

- one wave is code reduction via DRY removals and architectural fixes, and another is adversarial to get rid of false additions, so this helps AI bloat either way

- as the other comment says, underspecification is a problem, so this ends up finding when the implementation, tests, docs, quality guide, and spec are out of sync, whichever is to blame.

- Usable, well-designed, secure, and well-typed code ends up being bigger, so this helps cut to the chase. Ultimately, either you get there or you don't, and this helps cut review burden so you can do your part of it faster and at a higher level.

Funny enough, I'm now playing with gardening agents whose job it is to reduce code. But I wouldn't want to slow PRs on that, so I view those as separate PRs.

New comment by lmeyerov in "Show HN: adamsreview – better multi-agent PR reviews for Claude Code"

lmeyerov — Mon, 11 May 2026 14:40:21 +0000

Yes, being comprehensive, so early or blatant cheapo findings do not distract from other ones. That's important for base results. Splitting in both file and task is (currently) important.

Additionally, we run in a loop until it stops finding things, and as part of that, do test amplification when it does find any. We regularly see 3-8 rounds yielding valid results.
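
As a rough illustration of that loop shape, here is a minimal sketch, not our actual tooling. `run_review_round` and `amplify_tests` are hypothetical callbacks standing in for the agent calls, and the scripted findings are made up so the example runs on its own.

```python
from __future__ import annotations

from typing import Callable, Iterable

Finding = str  # assumption: a finding is just a description string here


def review_until_converged(
    run_review_round: Callable[[set[Finding]], Iterable[Finding]],
    amplify_tests: Callable[[set[Finding]], None],
    max_rounds: int = 10,
) -> tuple[set[Finding], int]:
    """Repeat review rounds until a round reports nothing new.

    Returns all findings plus the number of rounds that were run.
    """
    known: set[Finding] = set()
    for round_number in range(1, max_rounds + 1):
        new_findings = set(run_review_round(known)) - known
        if not new_findings:
            return known, round_number  # converged: nothing new this round
        amplify_tests(new_findings)  # add tests around what was just found
        known |= new_findings
    return known, max_rounds  # safety cap reached without converging


# Stubbed reviewer: scripted output per round instead of a real agent call.
scripted_rounds = iter([
    ["missing null check", "untested error branch"],
    ["stale docstring"],
    [],
])
amplified: list[list[Finding]] = []

findings, rounds = review_until_converged(
    run_review_round=lambda known: next(scripted_rounds),
    amplify_tests=lambda new: amplified.append(sorted(new)),
)
print(rounds, sorted(findings))
```

The `max_rounds` cap is a design choice for the sketch: a reviewer that keeps rephrasing the same issue would otherwise never converge.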

IMO half the value is customization to your repo, so copying these and specializing to your repo is super quick and pays off almost immediately. How to find style guides, how to run tests, what dimensions of correctness to look for, etc.

This kind of thing makes me question how important Mythos is for security bug finding - doing a High effort loop with a frontier model in code reviews until convergence has already outperformed human review for us. (Doesn't replace, but does find things we miss, and catches many we do see earlier).

New comment by lmeyerov in "Vibe coding and agentic engineering are getting closer than I'd like"

lmeyerov — Thu, 07 May 2026 19:53:18 +0000

That feels true in theory, but in practice, we see the reverse for advanced projects where AI is helping us a lot. A decent chunk of our core IP falls into the bucket you're describing:

We have been building a GPU-accelerated graph investigation platform that has grown over 10+ years with fancy stuff all over the place - think accelerated query languages, layout kernels, distribution, etc. R&D-grade high performance engineering projects and kernels end up needing a lot of iterations to make a prototype and initial release. Likewise, they're more devilish to maintain when they need a small tweak later because of the sophistication and bus factor. Both phases benefit.

AI coding helps automate investigation, testing, measurement, patching, etc. The immediate effect is we can squeeze in many more experimental iterations with more fidelity and reach. Having an AI help automatically explore the design space and the details helps a LOT. And later, maintaining a wide surface area of code here that is delicate to touch and infrequently edited is traditionally stressful for teammates, and AI editing + AI-generated automation is helping destress that a LOT. We very much invest in upgrading our team, processes, and tooling here.

New comment by lmeyerov in "Vibe coding and agentic engineering are getting closer than I'd like"

lmeyerov — Thu, 07 May 2026 15:41:52 +0000

Yes and no

I've seen productivity surveys of senior programmers that share the reverse, and that matches our experience. A common finding is that gardening projects are a lot cheaper now when they're just a few extra terminal tabs running in parallel - security, refactoring, more testing, etc. Non-feature backlog items that senior developers value around tech debt are less of a discussion now. They're often essential now: to make AI coding work well, there is an effective automation poverty line around verification, testing, and specification that needs to be reached.

The understanding code thing is tough. Eg, when a non-senior fullstack developer manually edits frontend css code and didn't start from pixel-perfect designs across all form-factors, do they really understand what they did? I wrote the first formal mechanized specification of the CSS standard, and would claim 95%+ of web developers do not understand core CSS layout rules to begin with: it was a struggle to semantically formalize even a tiny core of the box model as soon as you have floats. If the AI generates live storybooks and in-tool screenshots of all these things as part of the review process, and the code review "looks good", what's the difference?

I don't truly think this way - my point is to challenge the basic claim that manual coding is good to begin with, and to ask whether AI coding is being held to an artificial standard. What I see in commercial and defense software is a joke compared to what we do in the verification world. AI coding automating review iteration fixes in areas like security engineering and test coverage+amplification has been a blessing for quality improvement.

More fundamentally, we require developers by default to be responsible for knowing what the code does and having tested it. Every case of relaxing that rule has to be explicit, eg, clear that something is a prototype, or an area is vibed with what alternate review/test flow, and we are learning as a team what that means in different situations. In practice, our senior ai coders are doing more quality engineering work than the manual coders, both per-pr and in broader gardening contributions.

New comment by lmeyerov in "Vibe coding and agentic engineering are getting closer than I'd like"

lmeyerov — Thu, 07 May 2026 15:03:27 +0000

1. Probably most of https://github.com/simonw , but take care to separate adopted / semi-professional from exploratory personal work

2. That sounds like your company has a weak engineering culture and is early on its upskilling journey. We explicitly separate projects into prototypes vs production, where vibes are fine for the former, eg, demos by designers / data scientists / sales engineers, but traditional code review standards apply to whatever is going into production. That mirrors my qualifier in #1.

I find that success here is a combination of engineering seniority, prompting experience, and domain experience. Anything lacking breaks the automation loop, like not knowing how and what to automate. Ex: All of our team finds value in ai coding, but junior engineers struggle on these dimensions, so are not running the 3+ agents that senior ones are.

New comment by lmeyerov in "Vibe coding and agentic engineering are getting closer than I'd like"

lmeyerov — Thu, 07 May 2026 05:52:01 +0000

That's the failure to automate. The AI isn't telepathic, so agentic engineers not automating this stuff are skipping out on the engineering part.

You set up the environment and then you do the work. Unless you are switching employers every week, you invest in writing that stuff down so the generation is right-ish, and you generate validation tooling so it auto-detects the mistakes and self-repairs.
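
A minimal sketch of what "auto-detects the mistakes and self-repairs" can look like, assuming a generator that accepts validator feedback. Everything here is hypothetical: `generate` stands in for the agent call, `validate` for your lint/test tooling, and the two scripted attempts replace a real model so the example runs as-is.

```python
from __future__ import annotations

from typing import Callable


def generate_with_self_repair(
    generate: Callable[[list[str]], str],
    validate: Callable[[str], list[str]],
    max_attempts: int = 3,
) -> tuple[str, int]:
    """Regenerate until validation passes, feeding errors back each time."""
    feedback: list[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = generate(feedback)
        errors = validate(candidate)
        if not errors:
            return candidate, attempt
        feedback = errors  # the next attempt sees what went wrong
    raise RuntimeError(f"still failing after {max_attempts} attempts: {feedback}")


def validate(source: str) -> list[str]:
    """Toy validator: the code must compile and add(2, 3) must return 5."""
    try:
        code = compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    namespace: dict = {}
    exec(code, namespace)
    add = namespace.get("add")
    if not callable(add):
        return ["expected a function named add"]
    if add(2, 3) != 5:
        return ["add(2, 3) should return 5"]
    return []


# Stubbed generator: a wrong first draft, then a corrected one.
scripted_attempts = iter([
    "def add(a, b): return a - b",
    "def add(a, b): return a + b",
])

fixed_source, attempts_used = generate_with_self_repair(
    generate=lambda feedback: next(scripted_attempts),
    validate=validate,
)
print(attempts_used, fixed_source)
```

The point of the sketch is the split: the written-down checks live in `validate`, so the loop catches the bad draft without a human reading it first.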