Hacker News: sothatsit

New comment by sothatsit in "I think Anthropic and OpenAI have found product-market fit"

sothatsit — Thu, 28 May 2026 05:39:35 +0000

I don’t remember ever hearing Dario or Sam recommend replacing people. Rather they say that smaller groups of people can do more work, so hiring will slow because small teams can do more.

The only times when people talk about actual full replacement of people is always when they are talking about some “future AGI” that is far more capable than the tools we have today.

New comment by sothatsit in "Is AI Profitable Yet?"

sothatsit — Sat, 23 May 2026 03:30:22 +0000

Or, tokens are more like energy and prices will drop over time until they reach some equilibrium.

The big labs are actively moving into the application layer, where they’ll have more pricing power. Maybe that layer will end up with a Mac (Anthropic) vs Windows (OpenAI) vs Linux (open-source) dynamic as well if they can create a moat. But so far it’s pretty easy to move between providers.

New comment by sothatsit in "Vibe coding and agentic engineering are getting closer than I'd like"

sothatsit — Thu, 07 May 2026 16:13:55 +0000

I think the distinction is that for experiments and prototypes the behaviour of the final system is what we are trying to design. We can experiment and see the tradeoffs and explore the design space before committing to a direction. And then we can sit down and produce the final code to a quality we are happy with. If you are serious about this process, there is no way you are producing 1000s of lines of code a day, unless it is trivial boilerplate.

In terms of higher-level abstractions, I agree this is one particularly treacherous rung on the ladder of abstractions. Previous abstractions like compilers or garbage collectors have at least had more structure/rules to rely upon. I don't know exactly how that will look but I don't think we will solely be relying on banging on the output, we will also be spot-checking the source code, using profilers or other tools to inspect the behaviour of systems, and asking the agent to explain the architectural decisions made. I'm not sure exactly how this will look, but I do believe that people who care will still find ways to do good work.

New comment by sothatsit in "Vibe coding and agentic engineering are getting closer than I'd like"

sothatsit — Thu, 07 May 2026 15:00:50 +0000

Well you have obviously already made up your mind, so have fun with your confirmation bias. We'll all be over here having a good time, getting more work done. Feel free to come over when you put down your grudge.

New comment by sothatsit in "Vibe coding and agentic engineering are getting closer than I'd like"

sothatsit — Thu, 07 May 2026 06:17:45 +0000

The entire mistake you are making is comparing using AI to skimming textbooks, or taking shortcuts. Your entire premise is wrong.

People who care about craft will care about the quality of what they produce whether they use AI or not.

The code I ship now is better tested and better thought through now than before I used AI because I can do a lot more. That extra time goes into additional experiments, jumping down more rabbit holes, and trying out ideas I previously couldn’t due to time constraints. It’s freeing to be able to spend more time to improve quality because the ROI on time spent experimenting has gone up dramatically.

New comment by sothatsit in "Vibe coding and agentic engineering are getting closer than I'd like"

sothatsit — Thu, 07 May 2026 02:44:25 +0000

You don’t need to write code by hand to learn from iterations and experiments. I run more experiments and try out more different solutions than I ever could before, and that leads to better decisions. I still read all the code that gets shipped, and don’t want to give that up, but the idea that all craft and learning is lost when you don’t is a bit silly. The craft/learning just moves.

New comment by sothatsit in "Agents for financial services and insurance"

sothatsit — Tue, 05 May 2026 20:27:42 +0000

It still surprises me how effective the /simplify skill is.

I’ve also had some great results with a /reflect skill that asks the agent to look at the work in the broader context of the project. But those are the only two skills I use regularly that aren’t specific to our company, codebase, or tools.

New comment by sothatsit in "AI should elevate your thinking, not replace it"

sothatsit — Mon, 27 Apr 2026 09:09:30 +0000

> I'd say that by purging stuff from the brain we are losing thinking itself

The idea that there will be less to think about seems a bit short-sighted. Humans are very good at moving to higher levels of abstraction, often with more complexity to deal with, not less.

New comment by sothatsit in "Claude Opus 4.7"

sothatsit — Fri, 17 Apr 2026 01:33:19 +0000

You can’t make up your mind about a model by using it on one task. Especially to say it’s such a bad downgrade after that is ludicrous. I’ve had great experiences with it this morning.

New comment by sothatsit in "Issue: Claude Code is unusable for complex engineering tasks with Feb updates"

sothatsit — Tue, 07 Apr 2026 06:37:25 +0000

They provide thinking summaries, so I assume they have to call Haiku or some other model to summarise the thinking blocks.

New comment by sothatsit in "I used AI. It worked. I hated it"

sothatsit — Sun, 05 Apr 2026 06:37:23 +0000

People will accept it as a way to build good software.

Many are still in denial that you can do work that is as good as before, quicker, using coding agents. A lot of people think there has to be some catch, but there really doesn’t have to be. If you continue to put effort in, reviewing results, caring about testing and architecture, working to understand your codebase, then you can do better work. You can think through more edge cases, run more experiments, and iterate faster to a better end result.

New comment by sothatsit in "Claude Code Found a Linux Vulnerability Hidden for 23 Years"

sothatsit — Sun, 05 Apr 2026 00:23:31 +0000

I think the anti-AI stance has been reversing on HN as tooling improves and people try it. It’s only been a little over a year since Claude Code was released, and 3 or 4 months since the models got really capable. People need time to adjust, even if I would expect devs to be more up-to-date than most.

People’s willingness to argue about technology they’ve barely used is always bewildering to me though.

New comment by sothatsit in "I'm Too Lazy to Check Datadog Every Morning, So I Made AI Do It"

sothatsit — Mon, 16 Mar 2026 01:48:04 +0000

Generally I think this happens when people don’t monitor for errors on a regular basis. People only notice if things are actively broken for customers, and tons of small non-fatal bugs slip through and build up over time.

New comment by sothatsit in "The AI coding divide: craft lovers vs. result chasers"

sothatsit — Thu, 12 Mar 2026 23:46:57 +0000

It is not just startups or small companies embracing agentic engineering… Stripe published blog posts about their autonomous coding agents. Amazon is blowing up production because they gave their agents access to prod. Google and Microsoft develop their own agentic engineering tools. It’s not just tech companies either, massive companies are frequently announcing their partnerships with OpenAI or Anthropic.

You can’t just pretend it’s startups doing all the agentic engineering. They’re just the ones pushing the boundaries on best practices the most aggressively.

New comment by sothatsit in "ATMs didn't kill bank teller jobs, but the iPhone did"

sothatsit — Thu, 12 Mar 2026 22:55:23 +0000

The benchmark is AI making less mistakes than humans, not making no mistakes. Just like autonomous vehicles.

And yes, presumably there would be a person who set the firm up, or else our legal system would need to change quite fundamentally.

New comment by sothatsit in "ATMs didn't kill bank Teller jobs, but the iPhone did"

sothatsit — Thu, 12 Mar 2026 17:10:46 +0000

That is why a fully automated firm would be a paradigm shift. Instead of requiring someone to be responsible and to QA things, you just let AI systems be responsible internally, and the company responsible as a whole for legal concerns.

This idea of an automated firm relies on the premise that AI will become more capable and reliable than people.

New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"

sothatsit — Thu, 12 Mar 2026 16:14:17 +0000

You laid out the theoretical limitations well, and I tend to agree with them.

I just get frustrated when people downplay how big of an impact filling in the gaps at the frontier of knowledge would have. 99.9% of researchers will never have an idea that adds a new spike to the knowledge frontier (rather than filling in holes), and 99.99% of research is just filling in gaps by combining existing ideas (numbers made up). In this realm, autoresearch may not be groundbreaking, but it can do the job. AlphaEvolve is similar.

If LLMs can actually get closer to something like that, it leaves human researchers a whole lot more time to focus on new ideas that could move entire fields forward. And their iteration speed can be a lot faster if AI agents can help with the implementation and testing of them.

New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"

sothatsit — Thu, 12 Mar 2026 16:02:39 +0000

Fundamentally, I’m more optimistic on how far current approaches can scale. I see no reason why RL could not be used to train models to use memory, and fine-tuning already works, it’s just expensive.

The continual learning we get may be a bit hamfisted, and not fit into a neat architecture, but I think we could actually see it work at scale in the next few years. Whereas new techniques like what Yann Lecun have demonstrated still live heavily in the realm of research. Cool, but not useful yet.

Fine tuning is also not so limited as you suggest. For one, we don’t need to fine tune the same model over and over, you can just start with a frontier model each time. And two, modern models are much better at generating synthetic data or environments for RL. This could definitely work, but it might require a lot of work in data collection and curation, and the ROI is not clear. But if large companies continue to allocate more and more resources to AI in the next few years, I could see this happening.

OpenAI already has a custom model service, and labs have stated they already have custom models built for the military (although how custom those models are is unclear). It doesn’t seem like a huge leap to also fine-tune models over a companies internal codebases and tooling. Especially for large companies like Google, Amazon, or Stripe that employ tens of thousands of software engineers.

New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"

sothatsit — Thu, 12 Mar 2026 08:25:26 +0000

Memory systems built on top of LLMs could provide continual learning. I do not agree that it is some fundamental limitation.

Claude Code already writes its own memory files. And people already finetune models. There is clear potential to use the former as a form of short-term memory and the latter for long-term “learning”.

The main blockers to this are that models aren’t good enough at managing their own memory, and finetuning is expensive and difficult. But both of these seem like solvable engineering problems.

New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"

sothatsit — Thu, 12 Mar 2026 01:29:31 +0000

This is what you said:

> they are still predicting training set continuations

But this is underselling what they do. Probably a large part of what they predict is learnt from their training set, but RL has added a layer on top that does not just come from just mimicry.

Again, I doubt this is enough for “AGI” but I think that term is not very well-defined to begin with. These models have now shown they are capable of novel reasoning, they just have to be prodded in the right way.

It’s not clear to me that there isn’t scaffolding that can use LLMs to search for novel improvements, like Katpathy’s recent autoresearch. The models, with the help of RL, seem to be getting to the point where this actually works to some extent, and I would expect this to happen in other fields in the next few years as well.