<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: sothatsit</title><link>https://news.ycombinator.com/user?id=sothatsit</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 12 Apr 2026 11:47:20 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=sothatsit" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by sothatsit in "Issue: Claude Code is unusable for complex engineering tasks with Feb updates"]]></title><description><![CDATA[
<p>They provide thinking summaries, so I assume they have to call Haiku or some other model to summarise the thinking blocks.</p>
]]></description><pubDate>Tue, 07 Apr 2026 06:37:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47671479</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47671479</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47671479</guid></item><item><title><![CDATA[New comment by sothatsit in "I used AI. It worked. I hated it"]]></title><description><![CDATA[
<p>People will accept it as a way to build good software.<p>Many are still in denial that, using coding agents, you can do work that is as good as before, only faster. A lot of people think there has to be some catch, but there really doesn’t have to be. If you continue to put effort in, reviewing results, caring about testing and architecture, and working to understand your codebase, then you can do better work. You can think through more edge cases, run more experiments, and iterate faster to a better end result.</p>
]]></description><pubDate>Sun, 05 Apr 2026 06:37:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47646735</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47646735</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47646735</guid></item><item><title><![CDATA[New comment by sothatsit in "Claude Code Found a Linux Vulnerability Hidden for 23 Years"]]></title><description><![CDATA[
<p>I think the anti-AI stance has been reversing on HN as tooling improves and people try it. It’s only been a little over a year since Claude Code was released, and 3 or 4 months since the models got really capable. People need time to adjust, even if I would expect devs to be more up-to-date than most.<p>People’s willingness to argue about technology they’ve barely used is always bewildering to me though.</p>
]]></description><pubDate>Sun, 05 Apr 2026 00:23:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47644927</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47644927</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47644927</guid></item><item><title><![CDATA[New comment by sothatsit in "I'm Too Lazy to Check Datadog Every Morning, So I Made AI Do It"]]></title><description><![CDATA[
<p>Generally I think this happens when people don’t monitor for errors on a regular basis. People only notice if things are actively broken for customers, and tons of small non-fatal bugs slip through and build up over time.</p>
]]></description><pubDate>Mon, 16 Mar 2026 01:48:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47394176</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47394176</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47394176</guid></item><item><title><![CDATA[New comment by sothatsit in "The AI coding divide: craft lovers vs. result chasers"]]></title><description><![CDATA[
<p>It is not just startups or small companies embracing agentic engineering… Stripe published blog posts about their autonomous coding agents. Amazon is blowing up production because they gave their agents access to prod. Google and Microsoft develop their own agentic engineering tools. It’s not just tech companies either; massive companies are frequently announcing their partnerships with OpenAI or Anthropic.<p>You can’t just pretend it’s startups doing all the agentic engineering. They’re just the ones pushing the boundaries on best practices the most aggressively.</p>
]]></description><pubDate>Thu, 12 Mar 2026 23:46:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47358890</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47358890</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47358890</guid></item><item><title><![CDATA[New comment by sothatsit in "ATMs didn't kill bank teller jobs, but the iPhone did"]]></title><description><![CDATA[
<p>The benchmark is AI making fewer mistakes than humans, not making no mistakes. Just like autonomous vehicles.<p>And yes, presumably there would be a person who set the firm up, or else our legal system would need to change quite fundamentally.</p>
]]></description><pubDate>Thu, 12 Mar 2026 22:55:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47358382</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47358382</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47358382</guid></item><item><title><![CDATA[New comment by sothatsit in "ATMs didn't kill bank Teller jobs, but the iPhone did"]]></title><description><![CDATA[
<p>That is why a fully automated firm would be a paradigm shift. Instead of requiring someone to be responsible and to QA things, you let AI systems be responsible internally, and hold the company as a whole responsible for legal concerns.<p>This idea of an automated firm relies on the premise that AI will become more capable and reliable than people.</p>
]]></description><pubDate>Thu, 12 Mar 2026 17:10:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47354054</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47354054</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47354054</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>You laid out the theoretical limitations well, and I tend to agree with them.<p>I just get frustrated when people downplay how big of an impact filling in the gaps at the frontier of knowledge would have. 99.9% of researchers will never have an idea that adds a new spike to the knowledge frontier (rather than filling in holes), and 99.99% of research is just filling in gaps by combining existing ideas (numbers made up). In this realm, autoresearch may not be groundbreaking, but it can do the job. AlphaEvolve is similar.<p>If LLMs can actually get closer to something like that, it leaves human researchers a whole lot more time to focus on new ideas that could move entire fields forward. And researchers’ iteration speed can be a lot faster if AI agents can help with implementing and testing those ideas.</p>
]]></description><pubDate>Thu, 12 Mar 2026 16:14:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47353063</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47353063</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47353063</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>Fundamentally, I’m more optimistic about how far current approaches can scale. I see no reason why RL could not be used to train models to use memory, and fine-tuning already works; it’s just expensive.<p>The continual learning we get may be a bit hamfisted, and not fit into a neat architecture, but I think we could actually see it work at scale in the next few years. Whereas new techniques like what Yann LeCun has demonstrated still live heavily in the realm of research. Cool, but not useful yet.<p>Fine-tuning is also not as limited as you suggest. For one, we don’t need to fine-tune the same model over and over; you can just start with a frontier model each time. For another, modern models are much better at generating synthetic data or environments for RL. This could definitely work, but it might require a lot of work in data collection and curation, and the ROI is not clear. But if large companies continue to allocate more and more resources to AI in the next few years, I could see this happening.<p>OpenAI already has a custom model service, and labs have stated they already have custom models built for the military (although how custom those models are is unclear). It doesn’t seem like a huge leap to also fine-tune models over a company’s internal codebases and tooling. Especially for large companies like Google, Amazon, or Stripe that employ tens of thousands of software engineers.</p>
]]></description><pubDate>Thu, 12 Mar 2026 16:02:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47352833</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47352833</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47352833</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>Memory systems built on top of LLMs could provide continual learning. I do not agree that it is some fundamental limitation.<p>Claude Code already writes its own memory files. And people already finetune models. There is clear potential to use the former as a form of short-term memory and the latter for long-term “learning”.<p>The main blockers to this are that models aren’t good enough at managing their own memory, and finetuning is expensive and difficult. But both of these seem like solvable engineering problems.</p>
]]></description><pubDate>Thu, 12 Mar 2026 08:25:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47347938</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47347938</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47347938</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>This is what you said:<p>> they are still predicting training set continuations<p>But this is underselling what they do. Probably a large part of what they predict is learnt from their training set, but RL has added a layer on top that does not come from just mimicry.<p>Again, I doubt this is enough for “AGI”, but I think that term is not very well-defined to begin with. These models have now shown they are capable of novel reasoning; they just have to be prodded in the right way.<p>It’s not clear to me that there isn’t scaffolding that can use LLMs to search for novel improvements, like Karpathy’s recent autoresearch. The models, with the help of RL, seem to be getting to the point where this actually works to some extent, and I would expect this to happen in other fields in the next few years as well.</p>
]]></description><pubDate>Thu, 12 Mar 2026 01:29:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47345102</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47345102</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47345102</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>You can’t really say it is just predicting continuations when it is learning to write proofs for Erdős problems, formalise significant math results, or perform automated AI research. Those are far beyond what you get by just being a copying and re-forming machine; a lot of these problems require sophisticated application of logic.<p>I don’t know if this can reach AGI, or if that term makes any sense to begin with. But to say these models have not learnt from their RL seems a bit ludicrous. What is training to predict when to use different continuations, if not learning?<p>I would say LLMs’ failure cases, like failing at riddles, are more akin to our own optical illusions and blind spots than indicative of the nature of LLMs as a whole.</p>
]]></description><pubDate>Wed, 11 Mar 2026 02:37:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47331206</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47331206</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47331206</guid></item><item><title><![CDATA[New comment by sothatsit in "Debian decides not to decide on AI-generated contributions"]]></title><description><![CDATA[
<p>I’d argue this social angle is not very nuanced or effective. Not all people who use Claude Code will be submitting low-effort patches, and bad-faith actors will just lie about their AI use.<p>For example, someone might have done a lot of investigation to find the root cause of an issue, followed by getting Claude Code to implement the fix, which they then tested. That has a good chance of being a good contribution.<p>I think tackling this from the trust side is likely to be a better solution. One approach would be to only allow new contributors to make small patches. Once those are accepted, then allow them to make larger contributions. That would help with the real problem, which is higher volumes of low-effort contributions overwhelming maintainers.</p>
]]></description><pubDate>Tue, 10 Mar 2026 20:08:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328177</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47328177</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328177</guid></item><item><title><![CDATA[New comment by sothatsit in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>RL on LLMs has changed things. LLMs are not stuck in continuation-predicting territory any more.<p>Models build up this big knowledge base by predicting continuations. But then their RL stage gives rewards for completing problems successfully. This requires learning and generalisation to do well, and indeed RL marked a turning point in LLM performance.<p>A year after RL was made to work, LLMs can now operate in agent harnesses over 100s of tool calls to complete non-trivial tasks. They can recover from their own mistakes. They can write 1000s of lines of code that works. I think it’s no longer fair to categorise LLMs as just continuation-predictors.</p>
]]></description><pubDate>Tue, 10 Mar 2026 18:54:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327376</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47327376</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327376</guid></item><item><title><![CDATA[New comment by sothatsit in "Debian decides not to decide on AI-generated contributions"]]></title><description><![CDATA[
<p>I quite like this direction. Limit new contributors to small contributions, and then relax restrictions as more of their contributions are accepted.</p>
]]></description><pubDate>Tue, 10 Mar 2026 17:51:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47326572</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47326572</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47326572</guid></item><item><title><![CDATA[New comment by sothatsit in "Debian decides not to decide on AI-generated contributions"]]></title><description><![CDATA[
<p>The people likely to submit low-effort contributions are also the people most likely to ignore policies restricting AI usage.<p>The people following the policies are the most likely to use AI responsibly and not submit low-effort contributions.<p>I’m more interested in how we might allow people to build trust so that reviewers can productively spend time on their contributions, whilst avoiding wasting reviewers’ time on drive-by contributors. This seems like a hard problem.</p>
]]></description><pubDate>Tue, 10 Mar 2026 17:40:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47326423</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47326423</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47326423</guid></item><item><title><![CDATA[New comment by sothatsit in "Debian decides not to decide on AI-generated contributions"]]></title><description><![CDATA[
<p>Trusted contributors using LLMs do not cause this problem though. It is the larger volume of low-effort contributions causing this problem, and those contributors are the most likely to ignore the policies.<p>Therefore, policies restricting AI use on the basis of avoiding low-quality contributions are probably hurting more than they’re helping.</p>
]]></description><pubDate>Tue, 10 Mar 2026 17:32:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47326333</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47326333</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47326333</guid></item><item><title><![CDATA[New comment by sothatsit in "Debian decides not to decide on AI-generated contributions"]]></title><description><![CDATA[
<p>Concerns about wasting maintainers’ time, onboarding, or copyright are of great interest to me from a policy perspective. But I find some of the debate around the quality of AI contributions to be odd.<p>Quality should always be the responsibility of the person submitting changes. Whether a person used LLMs should not be a large concern if they are acting in good faith. If they submitted bad code, having used AI is not a valid excuse.<p>Policies restricting AI use might hurt good contributors while bad contributors ignore the restrictions. That said, restrictions for non-quality reasons, like copyright concerns, might still make sense.</p>
]]></description><pubDate>Tue, 10 Mar 2026 16:40:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47325648</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47325648</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47325648</guid></item><item><title><![CDATA[New comment by sothatsit in "GPT-5.4"]]></title><description><![CDATA[
<p>I much prefer this: we can choose based on our use-cases, and people who don’t care can still use Auto.</p>
]]></description><pubDate>Thu, 05 Mar 2026 22:50:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47268356</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47268356</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47268356</guid></item><item><title><![CDATA[New comment by sothatsit in "Google Workspace CLI"]]></title><description><![CDATA[
<p>This is all manual, so people ask their agent to load Jira issues, edit Confluence pages, etc. Users sign in to the CLIs with their own accounts, so the agents inherit their permissions. Then we have the permissions in Claude Code set up so that any write commands are in Ask, meaning it always prompts the user before running them.</p>
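<p>As a rough sketch only (assuming Claude Code’s settings.json permission rules, and using illustrative placeholder CLI names rather than the actual commands described above), a config along these lines keeps read commands unattended while putting writes behind a prompt:<p><pre><code>{
  "permissions": {
    "allow": [
      "Bash(jira-cli issue view:*)",
      "Bash(confluence-cli page get:*)"
    ],
    "ask": [
      "Bash(jira-cli issue create:*)",
      "Bash(confluence-cli page update:*)"
    ]
  }
}
</code></pre>
<p>Commands matching the allow rules run without interruption, while anything matching an ask rule triggers a confirmation prompt before it executes.</p>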
]]></description><pubDate>Thu, 05 Mar 2026 22:33:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47268208</link><dc:creator>sothatsit</dc:creator><comments>https://news.ycombinator.com/item?id=47268208</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47268208</guid></item></channel></rss>