<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ertgbnm</title><link>https://news.ycombinator.com/user?id=ertgbnm</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 30 Jun 2026 00:06:03 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ertgbnm" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ertgbnm in "Mag 7 starting to underperform [pdf]"]]></title><description><![CDATA[
<p>Once you lose, you have lost. Ok, but how does that help us predict when something will lose?</p>
]]></description><pubDate>Mon, 29 Jun 2026 15:27:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48720552</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48720552</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48720552</guid></item><item><title><![CDATA[New comment by ertgbnm in "MAI-Thinking-1"]]></title><description><![CDATA[
<p>> with AI-generated content excluded from pre-training.<p>> without distillation from third-party models<p>sounds like zero unless they are lying.</p>
]]></description><pubDate>Tue, 02 Jun 2026 20:19:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48375674</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48375674</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48375674</guid></item><item><title><![CDATA[New comment by ertgbnm in "Claude Opus 4.8"]]></title><description><![CDATA[
<p>My point is that if I made someone "smarter" they wouldn't suddenly know "What day, month, and year was Carrie Underwood’s album “CryPretty” certified Gold by the RIAA?" which is an example of a question in the SimpleQA benchmark.<p>So (in my opinion) knowledge benchmarks stagnating for small models is not evidence that small model agentic coding performance improvement will stagnate soon. Small models do not struggle with syntax; the barrier is not knowledge. The barrier is long context coherence and problem solving, and I don't see a bottleneck on improving those for small models in the near horizon as we get more and more high quality reasoning traces to train upon.</p>
]]></description><pubDate>Fri, 29 May 2026 16:22:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48325357</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48325357</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48325357</guid></item><item><title><![CDATA[New comment by ertgbnm in "Claude Opus 4.8"]]></title><description><![CDATA[
<p>Knowledge benchmarks can't really be improved upon via distillation or RL. Improving them requires those facts to be added to the training corpus and the model to memorize them better. Neither distillation nor RL really does that, and thus we shouldn't expect improvements on SimpleQA unless some other interventions are being made.<p>Model intelligence and knowledge aren't necessarily directly related. If we can pack in greater intelligence and agency at the cost of the model forgetting factoids, that would actually be a good thing. We don't need LLMs to memorize facts, we need them to learn how to interact with the world such that they can find the facts that are necessary and surface them to the user.<p>If we could distill all of the knowledge out of an LLM and just be left with a very agentic model that only knows facts in its context, I think some very interesting stuff would happen.</p>
]]></description><pubDate>Thu, 28 May 2026 17:41:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48312583</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48312583</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48312583</guid></item><item><title><![CDATA[New comment by ertgbnm in "Five frontier LLMs disagree on 67% of 1k real-world fact-check claims"]]></title><description><![CDATA[
<p>"Shark attacks correlate strongly with ice cream sales" is an entirely true statement that some would argue is also misleading.<p>Misleading should be removed as a category and replaced with a better hedge like "not sure"</p>
]]></description><pubDate>Thu, 28 May 2026 15:43:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48310621</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48310621</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48310621</guid></item><item><title><![CDATA[New comment by ertgbnm in "I'm Tired of Talking to AI"]]></title><description><![CDATA[
<p>My threshold for asking for help might be a little higher than the median, but I like this operating style personally. Maybe it's just how I was raised, but the thought of not trying to figure something out for myself first is unthinkable. I feel like I get plenty of human connection at work and collaborate with peers and other disciplines plenty.<p>I'm not in tech, I'm in civil engineering, so maybe it's just a difference in the types of problems we have in different industries.<p>I do find it very frustrating when an EIT asks me how to do something and it's clear that they haven't even read the instructions page of the Excel sheet that they literally have open. I have time to mentor peers and subordinates but I want them to treat my time with the same respect that they treat their own.</p>
]]></description><pubDate>Wed, 27 May 2026 20:15:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=48299952</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48299952</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48299952</guid></item><item><title><![CDATA[New comment by ertgbnm in "Why AI Agents Cannot Change Software Systems"]]></title><description><![CDATA[
<p>The null hypothesis isn't just the opposite of whatever your opposition believes.<p>For LLMs the null hypothesis would be that there is no relationship between the input and output tokens. That is so obviously not true that it's not even worth calculating the number of sigmas away from the null hypothesis that LLMs are.<p>So clearly we discarded the null hypothesis sometime in 2017. Now we have a system that is really really good at pattern matching and seems to understand consequences. Is that "seeming" just a ruse or does it really understand stuff? A proper scientist would look at that evidence and put forward the hypothesis that maybe it really does understand stuff and begin working on experiments that would disprove that alternative hypothesis, moving forward with the assumption that the hypothesis is true until disproven or a better hypothesis is proposed that explains previous evidence more accurately. Naysayers saying "you haven't proven that pattern matching becomes understanding to my satisfaction" is not a rebuttal. They need an alternative hypothesis that can make predictions that better fit the model and can be tested.<p>The only rebuttals I've heard are "AI can't actually understand stuff and therefore can't do X" which is a testable hypothesis at least. But invariably AI eventually does X, just in a different way than anyone really expected.</p>
]]></description><pubDate>Wed, 27 May 2026 15:11:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48295550</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48295550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48295550</guid></item><item><title><![CDATA[New comment by ertgbnm in "I'm Tired of Talking to AI"]]></title><description><![CDATA[
<p>Sending an AI response to a question that someone asks you is insulting because it's a bit like sending them a link to letmegooglethat where it just animates typing the question you have into Google.<p>I think it's only appropriate when you are trying to insult the asker. Like if an employee asks a really dumb question that indicates that they didn't even bother googling the question or asking AI first, then sending them back an AI response is appropriate specifically because it's a bit insulting to do.<p>In fact it does exist for GPTs: <a href="https://letmegpt.com/" rel="nofollow">https://letmegpt.com/</a><p>Personally, if I'm asking for help it's because I've surely exhausted other avenues of approach like googling it or asking ChatGPT. I've come to the person because I need their input specifically. The people I work with are professional enough and I've developed such a relationship with them that I don't have the problem the OP is discussing very much.</p>
]]></description><pubDate>Wed, 27 May 2026 14:24:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=48294872</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48294872</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48294872</guid></item><item><title><![CDATA[New comment by ertgbnm in "Alignment pretraining: AI discourse creates self-fulfilling (mis)alignment"]]></title><description><![CDATA[
<p>If your AI alignment strategy is so fickle that it breaks if people simply discuss potential problems with the strategy then you didn't really have an alignment strategy to begin with.</p>
]]></description><pubDate>Tue, 19 May 2026 02:42:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=48188594</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48188594</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48188594</guid></item><item><title><![CDATA[New comment by ertgbnm in "Claude for Small Business"]]></title><description><![CDATA[
<p>It makes scams like that scalable. Once you discover one vector of scamming an AI bookkeeper, you can scam all of the users of that AI, using your own AI to scale it for you.</p>
]]></description><pubDate>Thu, 14 May 2026 13:10:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48134900</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=48134900</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48134900</guid></item><item><title><![CDATA[New comment by ertgbnm in "I bought Friendster for $30k – Here's what I'm doing with it"]]></title><description><![CDATA[
<p>Well, did they sell the website too? Or just the domain? Because the domain doesn't generate ad revenue, the original website did. Like, just because I sell the domain name for my blog doesn't mean you also get the content of my blog.</p>
]]></description><pubDate>Mon, 27 Apr 2026 14:12:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47921887</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47921887</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47921887</guid></item><item><title><![CDATA[New comment by ertgbnm in "If more than 50% press blue, everyone survives. Red pressers always survive"]]></title><description><![CDATA[
<p>I think most of the people who pick blue would be empathic, loving people that are just kind of bad at game theory.<p>I don't think I want to live in a world in which they all died out.</p>
]]></description><pubDate>Sun, 26 Apr 2026 19:57:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47913592</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47913592</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47913592</guid></item><item><title><![CDATA[New comment by ertgbnm in "If more than 50% press blue, everyone survives. Red pressers always survive"]]></title><description><![CDATA[
<p>The downside of redding is that some portion of the world probably dies, and you now have to live in that worse world, which would not have happened if you and 50% of the rest of the world had just blued.</p>
]]></description><pubDate>Sun, 26 Apr 2026 19:43:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47913387</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47913387</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47913387</guid></item><item><title><![CDATA[New comment by ertgbnm in "GPT-5.5"]]></title><description><![CDATA[
<p>can't wait for "our worst and dumbest model yet"</p>
]]></description><pubDate>Thu, 23 Apr 2026 19:22:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47880392</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47880392</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47880392</guid></item><item><title><![CDATA[New comment by ertgbnm in "GitHub's Fake Star Economy"]]></title><description><![CDATA[
<p>Instagram follower counts are not a good way to hire football players, but they're probably a good way to hire Instagram influencers. The football analogy is a little unfair because VCs are investing in more than just a company's ability to "play football"; they are investing in the brand, the marketing, and the vision. GitHub stars are at least an indication of a startup having a promising brand or some ability to market themselves.<p>Nevertheless, VCs are in fact pretty dumb sometimes and it'd be stupid to invest solely based on stars.</p>
]]></description><pubDate>Mon, 20 Apr 2026 13:46:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47834256</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47834256</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47834256</guid></item><item><title><![CDATA[New comment by ertgbnm in "The future of everything is lies, I guess: Where do we go from here?"]]></title><description><![CDATA[
<p>Agreed. I think the starting comparison actually works here. It's a bit like the automobile. The advice of "just don't" doesn't work for cars. It takes a deliberate effort on every scale of society to accomplish; it's not something an individual can just do and succeed at. An American can't just <i>not</i> have a car the same way someone from the Netherlands might be able to.</p>
]]></description><pubDate>Thu, 16 Apr 2026 20:03:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47798741</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47798741</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47798741</guid></item><item><title><![CDATA[New comment by ertgbnm in "The buns in McDonald's Japan's burger photos are all slightly askew"]]></title><description><![CDATA[
<p>It's going for a rendition of the leaning tower of Lire.</p>
]]></description><pubDate>Wed, 15 Apr 2026 22:06:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47785953</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47785953</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47785953</guid></item><item><title><![CDATA[New comment by ertgbnm in "TurboQuant: Redefining AI efficiency with extreme compression"]]></title><description><![CDATA[
<p>Most breakthroughs that are published are for efficiency because most breakthroughs that are published are for open source.<p>All the foundation model breakthroughs are hoarded by the labs doing the pretraining. That being said, RL reasoning training is the obvious and largest breakthrough for intelligence in recent years.</p>
]]></description><pubDate>Wed, 25 Mar 2026 12:59:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47516757</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47516757</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47516757</guid></item><item><title><![CDATA[New comment by ertgbnm in "So where are all the AI apps?"]]></title><description><![CDATA[
<p>Does the data not support a 2X increase in packages?<p>Pre-ChatGPT, in ~2020, there were about 5,000 new packages per month. Starting in 2025 (the actual year agents took off), there is a clear uptick in packages that is consistently about 10,000 per month, or 2X the pre-ChatGPT era.<p>In general, the rate of increase is on a clear exponential. So while we might not see a step change in productivity, there comes a point where the average developer is in fact 10X more productive than before. It just doesn't feel so crazy because it came about in discrete 5% boosts.<p>I also disagree with the dataset being a good indicator of productivity. I wouldn't actually expect the number of packages or the frequency of updates to track closely with productivity. My first order guess would be that AI would actually be deflationary. Why spend the time to open source something that AI can gen up for anyone on a case by case basis specific to the project? It takes a certain level of dedication and passion for a person to open source a project, and if the AI just made it for them, then they haven't actually made the investment of their time and effort to make them feel justified in publishing the package.<p>The metrics I would expect to go up are actually the size of codebases, the number of forks of projects that create hyper customized versions of tools and libraries, and other metrics like that.<p>Overall, I'd predict AI is deflationary on the number of products that exist. If AI removes the friction involved with just making a custom solution, then the amount of demand for middleman software should actually fall as products vertically integrate and reduce dependencies.</p>
]]></description><pubDate>Tue, 24 Mar 2026 15:27:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47504092</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47504092</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47504092</guid></item><item><title><![CDATA[New comment by ertgbnm in "Pandas Exercises for Data Analysis (Interactive)"]]></title><description><![CDATA[
<p>It's less about performance and more about ecosystem lock-in. It's a bit like imperial vs metric units. Why would you ever <i>choose</i> to learn imperial if you had the option to only ever use metric to begin with?</p>
]]></description><pubDate>Wed, 18 Mar 2026 13:37:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47425671</link><dc:creator>ertgbnm</dc:creator><comments>https://news.ycombinator.com/item?id=47425671</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47425671</guid></item></channel></rss>