<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: aspenmartin</title><link>https://news.ycombinator.com/user?id=aspenmartin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 06 May 2026 21:41:58 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=aspenmartin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by aspenmartin in "The bottleneck was never the code"]]></title><description><![CDATA[
<p>What better practices do you mean? Are you saying we just need different, more agentic-friendly practices that ensure reliability at a scale beyond what we can manually check? If so, I totally agree.<p>AI is fundamentally capable of creating new processes. Look, it’s not like I think Opus 4.7 is all you need, but how can you argue with the fact that adoption since 4.5 has been an inflection point? That’s proof that reliability has reached a level where serious usage is possible, and over a period of <i>months</i>. Zoom out further and you see this was extremely predictable even a few years ago, despite the absolute hissy fits thrown on HN when CEOs began saying it.<p>Agentic coding is verifiable, and that implies there are very few practical limits to what it can do. Combine that with insanely active research on the remaining issues (hallucinations, which are not a fundamentally unsolvable problem at a practical level; context rot; continual learning; etc.).</p>
]]></description><pubDate>Wed, 06 May 2026 19:37:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48040673</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48040673</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48040673</guid></item><item><title><![CDATA[New comment by aspenmartin in "The bottleneck was never the code"]]></title><description><![CDATA[
<p>How do you mean?</p>
]]></description><pubDate>Wed, 06 May 2026 19:32:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48040615</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48040615</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48040615</guid></item><item><title><![CDATA[New comment by aspenmartin in "The bottleneck was never the code"]]></title><description><![CDATA[
<p>You are not wrong about anything you’re saying, but like I said, this misses the forest for the trees. I’m talking about the next ~2 years. There is a common idea that we don’t understand this technology or what will happen performance-wise. We know a lot more about what’s going to happen than people think, because none of this is new. We’ve known about neural nets since the 40s; we know how RL works on a fundamental level, and it has been an active and beautiful field of research for at least 30-40 years; we know what happens when you combine RL with verifiable rewards and throw a lot of compute at it.<p>One big misconception is that these models are trained to mimic humans and are therefore limited by the quality of the human training data. That is not true, and it’s basically the <i>entire</i> reason so many people dismiss the bullishness and adoption of agentic coding tools as premature.<p>Coding agents use human traces as a <i>starting point</i>. Technically you don’t have to do this at all, but that’s an academic point; practically (today) you can’t skip it. The early training stages on human traces (and on verified synthetic traces from your last model) get you to a point where RL is stable and efficient, and RL pushes you the rest of the way. It’s <i>synthetic</i> data that really powers this, via rejection sampling: you generate a bunch of traces, figure out which ones pass verification, and keep those as training examples.<p>So because<p>- we know how this works on a fundamental level, and have for some time<p>- human training data is a bootstrap, not a fundamental limitation<p>- your observations are absolutely right, yet look at where we are today versus, say, Claude Sonnet 3.x: an entire world away in about a year<p>- our benchmarks are imperfect, each with various weaknesses, yet all of them tell the same compelling story, and adoption numbers and walled-garden data are the proof in the pudding<p>the onus is on people who say “this is plateauing” or “this has some fundamental limitation that we will not get past fairly quickly” to make their case.</p>
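That generate-verify-keep loop can be sketched in a few lines. This is a toy illustration only; `generate_trace`, `rejection_sample`, and the verifier are hypothetical names for the purposes of this sketch, not any lab’s actual pipeline:

```python
def generate_trace(model, prompt):
    # Hypothetical stand-in: sample one candidate solution trace from the model.
    return model(prompt)

def rejection_sample(model, prompt, verifier, n_samples=16):
    """Keep only the sampled traces that pass verification; the surviving
    (prompt, trace) pairs become synthetic training examples for the next
    round of fine-tuning."""
    kept = []
    for _ in range(n_samples):
        trace = generate_trace(model, prompt)
        if verifier(prompt, trace):  # e.g. the code runs and its tests pass
            kept.append((prompt, trace))
    return kept
```

The point of the sketch: the model’s own outputs, filtered by a verifier, become the training data, which is why the quality ceiling isn’t set by the human traces you started from.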
]]></description><pubDate>Wed, 06 May 2026 19:31:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48040600</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48040600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48040600</guid></item><item><title><![CDATA[New comment by aspenmartin in "The bottleneck was never the code"]]></title><description><![CDATA[
<p>The solution truly is more AI, yes.<p>> AI craze isn't going to produce the boon some people think it will.<p>What’s the boon you don’t think it will produce?</p>
]]></description><pubDate>Wed, 06 May 2026 18:39:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48039916</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48039916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48039916</guid></item><item><title><![CDATA[New comment by aspenmartin in "The bottleneck was never the code"]]></title><description><![CDATA[
<p>- Systemic tech debt is now addressable at scale with LLMs, and future models will be good enough to sustain this. If people don’t believe that, I’d challenge them to explain why; first make sure you understand what scaling laws like Chinchilla are and how RL with verification works fundamentally.<p>- I completely agree with you that the fundamental limitation is the business being able to coherently articulate itself and its strategy.<p>- BUT the benefit now is that you can basically prototype for free. Before, we had to be extremely careful with engineer headcount investment. Now we can try many more things under the same time constraints.</p>
]]></description><pubDate>Wed, 06 May 2026 18:38:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48039902</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48039902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48039902</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>You are not at all wrong. Also get ready for the insidious advertising!</p>
]]></description><pubDate>Tue, 05 May 2026 18:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48026843</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48026843</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026843</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>The internet has been around since the 80s; ChatGPT came out 4 years ago. The internet took decades to build out its infrastructure, and inflation-adjusted capex for AI infrastructure already far surpasses it. You’re talking about a technology that doesn’t just make things easier; it replaces entire swaths of work. Under some weird measurement you may be right, but I mean, c’mon.</p>
]]></description><pubDate>Tue, 05 May 2026 18:52:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48026835</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48026835</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026835</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>What exactly do you find satirical?<p>- Obviously LLMs are not a religion; I’m using that to illustrate a point.<p>- 5-6 months ago is when agent performance hit a meaningful inflection point and adoption exploded. It’s why people in this thread reference “the past 6 months”, whether or not they realize we’ve been on the same path for years now.<p>So, to overextend the metaphor, Opus 4.5 really was kind of the right fit for the rapture.<p>No need to take any of this seriously, but I have worked on benchmarks and measurement at an AI lab professionally for over 4 years now, in software and data science for 8, and before that got a PhD in astro; I’m not some armchair person with no understanding of this field. Though I do find it entertaining when my background at an AI lab is people’s favorite reason to dismiss this :)<p>I find that when people find stuff like this satirical, they often don’t really know the industry or the underlying mechanics that well. Not saying that’s you, but as ridiculous as I apparently sound to you, do consider that it is even more ridiculous not to understand the tsunami that is coming right for you…</p>
]]></description><pubDate>Tue, 05 May 2026 18:41:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48026690</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48026690</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026690</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Are you saying AI is tackling the wrong bottlenecks? I’m not sure what you mean by “AI changes profitable software”. Maybe you mean: AI will not create anything new, only do the existing things we already do?<p>I agree the foundations (git, GitHub, compilers, etc.) are arguably “a fraction of the price”, and <i>today</i> they arguably have more impact (though I’m not sure by which measure). But literally since January we have been rolling out our replacements for them; I don’t see how that wouldn’t be an earth-shattering impact. You talk about GitHub, and that’s fine, but it ignores the fact that huge swaths of the profession aren’t even directly using any of these tools anymore.<p>I’m not sure what you imagine the promise of AI to be, and without that I can’t be specific in any refutation. I’d just say coding is only the beginning: it is the most powerful thing and also the easiest to solve first. Improved coding performance also improves generalization and performance on non-coding tasks, which is a nice bonus, and we’re maybe 5 years away from decent embodied systems, which after an inflection point of consumer adoption will quickly get better via data flywheels and on-policy learning. Basically, there are very few bottlenecks that will not be touched.</p>
]]></description><pubDate>Tue, 05 May 2026 14:42:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48023164</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48023164</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48023164</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>All I can say is that I accept your skepticism, but consider that we’re not talking about the most important numbers and topics in this conversation. We have a lot of mileage left in the current stack. Nothing is plateauing, though you wouldn’t know it from reading HN.</p>
]]></description><pubDate>Tue, 05 May 2026 13:20:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48022185</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48022185</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48022185</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Yeah, that’s what I just can’t wrap my mind around. It’s a cacophony of engineers with authoritative-sounding blog posts explaining a subject they seem to have only a tenuous grasp on. It’s hard to watch a population of tech people I used to really revere getting things so wrong. I thought “surely once we’re <literally where we are today, which is what you describe>, no one with any self-respect will still claim AI is a useless fad or that it shouldn’t be used”, and yet, to my disappointment, that’s where we seem to be.</p>
]]></description><pubDate>Tue, 05 May 2026 12:31:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48021595</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48021595</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021595</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Well, I’m not offended, but it sounds like you may not be paying attention. Do you know the capital outlay that has gone into infra buildouts? Several people here have described “6 months” of AI mania, and the fact that people are saying 6 months is exactly the point. Development has been going on since the 2010s. All of the “boosters”, as HN likes to say, have been saying “hey, this thing is huge and the performance trends are startling, get ready”, and people reply “that’s psychotic, I can’t even get Siri to understand my name”. Sure enough, 6 months ago we hit a performance inflection point where the “madness” began. That’s just when you started paying attention; the rate of change has not stopped. Pretty easy to predict what happens next…</p>
]]></description><pubDate>Tue, 05 May 2026 12:23:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48021519</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48021519</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021519</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Why deflect from the conversation and attempt to insult someone? What I’m saying is literally canonical, extremely well-known literature.</p>
]]></description><pubDate>Tue, 05 May 2026 12:16:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48021448</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48021448</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021448</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>But I don’t really understand: the ask is for evidence that AI is generating meaningful returns, and it demonstrably is, even though we have only partially integrated these tools. “Just software evolving”? Yes, I agree, except that it now happens faster and more efficiently. It is also more than that: the models that power advertising and content recommendation at TikTok, Google, Facebook, Instagram, etc. are not just “software evolving”; they are meaningful model improvements that are only possible with good AI.</p>
]]></description><pubDate>Tue, 05 May 2026 12:14:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48021418</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48021418</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021418</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Oh also! Two dashes on my phone convert to an EN dash I think (not an em dash!)</p>
]]></description><pubDate>Tue, 05 May 2026 01:31:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48017023</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48017023</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48017023</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>See the Chinchilla scaling laws; we have the functional form of the curve and know the constants (though they change and are domain- and model-specific):<p>L(N, D) ~= 1.69 + 406 / N^0.339 + 411 / D^0.285<p>L is the pre-training test loss, N is the number of model parameters, and D is the number of training tokens.</p>
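That functional form is easy to evaluate directly. A minimal sketch, using the constants exactly as quoted above (the published fit’s values differ slightly, and `chinchilla_loss` is just an illustrative name):

```python
def chinchilla_loss(n_params, n_tokens):
    """Approximate pre-training test loss from the Chinchilla-style fit
    L(N, D) ~= E + A / N^alpha + B / D^beta, constants as quoted above."""
    E, A, alpha, B, beta = 1.69, 406.0, 0.339, 411.0, 0.285
    # Both correction terms shrink as parameters and data grow,
    # so the loss falls monotonically toward the irreducible term E = 1.69.
    return E + A / n_params**alpha + B / n_tokens**beta
```

Plugging in a 70B-parameter model trained on 1.4T tokens, for example, gives a loss a bit above 1.9; the point is that the curve lets you predict roughly where more compute takes you before you spend it.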
]]></description><pubDate>Tue, 05 May 2026 01:24:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016980</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48016980</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016980</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Always amazes me how we’re on a platform with “ycombinator” in the url and people don’t understand how private companies scale to capture market share. You’re right Uber was that company that ran at a loss for so long and collapsed, another YOLO business strategy. Or maybe it was Amazon or…hmm I forget</p>
]]></description><pubDate>Tue, 05 May 2026 01:21:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016960</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48016960</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016960</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>Way more than six months. You may be talking about how the world looks from your vantage point, as well you should. But there’s a reason the world doesn’t allocate trillions of dollars of capital based on that.<p>I really value skeptical people and skepticism generally. But what skeptical people would prefer to consider themselves is rational and reasonable, with their beliefs well calibrated.<p>You’re not the only one who thinks literally nothing major or significant has happened with AI, but that’s simply wrong. Every major tech company (the ones poised to reap the first and best rewards) has already gotten good incremental revenue from AI via ads ranking and recommendations (Google, Meta, etc.), plus good productivity increases given the scale of their workforces and advanced in-house tooling. You won’t see these numbers and you don’t have to believe them. But I have seen them and I believe them, and I, like you, hate bullshit.</p>
]]></description><pubDate>Tue, 05 May 2026 01:19:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016946</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48016946</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016946</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>No. The time horizon I’m talking about spans years. “We don’t know” is just wrong; we’ve had scaling laws for many years and they continue to hold up. Benchmarks, in all their ugliness, tell a consistent story.</p>
]]></description><pubDate>Tue, 05 May 2026 01:10:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016894</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48016894</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016894</guid></item><item><title><![CDATA[New comment by aspenmartin in "Let's talk about LLMs"]]></title><description><![CDATA[
<p>I typed this out, character by painful human character, on an iPhone. It is indeed me!</p>
]]></description><pubDate>Tue, 05 May 2026 01:08:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016884</link><dc:creator>aspenmartin</dc:creator><comments>https://news.ycombinator.com/item?id=48016884</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016884</guid></item></channel></rss>