<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: yogthos</title><link>https://news.ycombinator.com/user?id=yogthos</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 19 May 2026 02:06:09 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=yogthos" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by yogthos in "AI is a technology not a product"]]></title><description><![CDATA[
<p>I've always looked at it as a platform to build stuff on top of as well. I expect that we'll be treating this tech the same way we treat stuff like Linux today. It'll become common open infrastructure that will be used to build products. Incidentally, this is exactly what Chinese companies seem to be banking on, hence why they don't worry about releasing their models in the open. They understand that getting more people using their models is the key part right now.</p>
]]></description><pubDate>Sun, 17 May 2026 21:04:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48173181</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48173181</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48173181</guid></item><item><title><![CDATA[Different models solve number-theory race problem]]></title><description><![CDATA[
<p>Article URL: <a href="https://aicc.rayonnant.ai/challenges/palin-prime-bits/">https://aicc.rayonnant.ai/challenges/palin-prime-bits/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48154550">https://news.ycombinator.com/item?id=48154550</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 15 May 2026 22:06:11 +0000</pubDate><link>https://aicc.rayonnant.ai/challenges/palin-prime-bits/</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48154550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48154550</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>You can't really do any serious work with a chat interface though. Also, you're still wasting time figuring out if your providers do what you expect. If you run a local model then you know it's going to work the way you set it up in perpetuity.</p>
]]></description><pubDate>Fri, 15 May 2026 21:12:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48153923</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48153923</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48153923</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>Or whether the model still works the way you want. For example, a lot of people were pretty unhappy with Claude 4.7 and preferred the way Claude 4.6 worked. If you're relying on a service, then you're stuck with whatever changes the provider decides to make. And the provider is chasing a demographic that's profitable; if you happen to fall out of that demographic, then tough luck.<p>But if you run your own models then you're not subject to anybody's whims anymore. You have full control of how your software works and what it does.</p>
]]></description><pubDate>Fri, 15 May 2026 18:19:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48151951</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48151951</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48151951</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>I assumed you were asking about capabilities since that is a question that can be answered. There isn't any comprehensive reporting by companies, so it's just the anecdotal reports the video I linked discusses.<p>And there are also occasional statements like the one by Airbnb here disclosing what they use <a href="https://www.bloomberg.com/news/articles/2025-10-21/airbnb-ceo-brian-chesky-says-chatgpt-integration-not-ready-for-airbnb-app" rel="nofollow">https://www.bloomberg.com/news/articles/2025-10-21/airbnb-ce...</a></p>
]]></description><pubDate>Fri, 15 May 2026 17:25:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=48151340</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48151340</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48151340</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>This is absolutely false as a recent study from Stanford clearly states <a href="https://hai.stanford.edu/news/inside-the-ai-index-12-takeaways-from-the-2026-report" rel="nofollow">https://hai.stanford.edu/news/inside-the-ai-index-12-takeawa...</a></p>
]]></description><pubDate>Fri, 15 May 2026 16:36:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48150696</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48150696</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48150696</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>Here's a recent Stanford study showing that Chinese models are basically just as good <a href="https://hai.stanford.edu/news/inside-the-ai-index-12-takeaways-from-the-2026-report" rel="nofollow">https://hai.stanford.edu/news/inside-the-ai-index-12-takeawa...</a><p>For most use cases, you don't actually need frontier performance either. Customization, cost, and data sovereignty are far bigger practical concerns. If you can run your own model on prem and tune it to exactly what you need, then you're both saving money and getting better quality output.<p>It's also worth noting that tooling can go a long way to improve the quality of output from the models as well, and this is very much an underexplored area right now. For example, the ATLAS agentic harness does a clever trick where it gets the model to generate multiple candidates, then uses a second lightweight model as a heuristic to score them, keeping the promising ones. And this drastically improves coding capability.<p><a href="https://github.com/itigges22/ATLAS" rel="nofollow">https://github.com/itigges22/ATLAS</a><p>There's also a paper along similar lines discussing how using a harness to force a project structure also allows the model to work on much larger projects successfully.<p><a href="https://arxiv.org/abs/2509.16198" rel="nofollow">https://arxiv.org/abs/2509.16198</a><p>So, I don't think that raw power of the model is even the most important part at this point. We can squeeze a lot more juice out of smaller models we can run locally by using them more effectively.<p>We're basically in the mainframe era of this tech, but the pendulum always swings to tech getting more optimized and moving to edge devices over time. And I think we're already starting to see this happen with local models becoming good enough to do real work.</p>
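<p>The generate-then-score idea described above can be sketched roughly as follows. This is only an illustration of the general pattern, not ATLAS's actual code: the <code>generate</code> and <code>score</code> callables, the toy stand-ins, and the <code>n</code> and <code>keep</code> parameters are all hypothetical names chosen for the example.</p>

```python
from typing import Callable, List, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str, int], str],
    score: Callable[[str, str], float],
    n: int = 4,
    keep: int = 2,
) -> List[Tuple[float, str]]:
    """Generate n candidates, score each one, and keep the top `keep`."""
    # Ask the (large) generator model for several independent attempts.
    candidates = [generate(prompt, seed) for seed in range(n)]
    # Let the (small) scoring model rate every attempt.
    scored = [(score(prompt, candidate), candidate) for candidate in candidates]
    # Highest score first, then drop everything but the most promising ones.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:keep]


# Hypothetical stand-ins for the two models, so the sketch runs on its own.
def toy_generate(prompt: str, seed: int) -> str:
    return prompt + " " + "x" * (seed + 1)


def toy_score(prompt: str, candidate: str) -> float:
    # Pretend the lightweight scorer prefers shorter answers.
    return -float(len(candidate))


top = best_of_n("solve:", toy_generate, toy_score, n=4, keep=2)
```

<p>In a real setup the two callables would presumably wrap calls to a local generator model and a much cheaper scoring model, so the extra cost of sampling several candidates is mostly on the generation side.</p>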
]]></description><pubDate>Fri, 15 May 2026 16:14:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48150425</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48150425</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48150425</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>Sure, publications might not equal talent, but the fact is that China leads research in 90% of crucial technologies. So, clearly China leads in a very tangible way here <a href="https://www.nature.com/articles/d41586-025-04048-7" rel="nofollow">https://www.nature.com/articles/d41586-025-04048-7</a><p>And specifically to AI, practically all major innovation that's been published and is used in the wild comes from Chinese companies. Before DeepSeek, everybody just assumed you needed a gigantic data centre to train models. Qwen is showing that you can get near frontier quality on your desktop. Nothing of the sort is coming out from the US.<p>And frankly when you look at the recent report from Stanford, it's embarrassing af for the US. Look at the chart on how much money is going into AI in the US relative to China, and then at the chart showing how there's practically no difference in quality of the models. The only thing the US is ahead in is burning through capital like there's no tomorrow.<p><a href="https://hai.stanford.edu/news/inside-the-ai-index-12-takeaways-from-the-2026-report" rel="nofollow">https://hai.stanford.edu/news/inside-the-ai-index-12-takeawa...</a></p>
]]></description><pubDate>Fri, 15 May 2026 16:05:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=48150319</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48150319</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48150319</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>Who cares how the script was generated? What he says is entirely factual. He cites plenty of concrete examples too.<p>The idea that the talent in the US surpasses the global research community is laughable. China already tops the world in artificial intelligence publications.
<a href="https://www.science.org/content/article/china-tops-world-artificial-intelligence-publications-database-analysis-reveals" rel="nofollow">https://www.science.org/content/article/china-tops-world-art...</a><p>China also has a population of 1.4 billion people, and an excellent education system. Pretty much all top universities are Chinese. <a href="https://www.nature.com/nature-index/institution-outputs/generate/all/global/all" rel="nofollow">https://www.nature.com/nature-index/institution-outputs/gene...</a><p>And let's not forget that top AI researchers from the US are now fleeing to China. <a href="https://www.scmp.com/news/china/science/article/3353398/lead-microsoft-ai-scientist-li-hongzhi-joins-chinas-tongji-university" rel="nofollow">https://www.scmp.com/news/china/science/article/3353398/lead...</a></p>
]]></description><pubDate>Fri, 15 May 2026 14:56:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48149494</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48149494</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48149494</guid></item><item><title><![CDATA[New comment by yogthos in "The old world of tech is dying and the new cannot be born"]]></title><description><![CDATA[
<p>What are you talking about even. Chinese models are what pretty much every AI company in the US is using now because you can run them on prem and customize them, and because hosted versions cost a fraction of US ones. <a href="https://www.youtube.com/watch?v=9baDOfwUzHQ" rel="nofollow">https://www.youtube.com/watch?v=9baDOfwUzHQ</a><p>And that's in the US, the rest of the world is all using Chinese models as well. Which means these models get far more collaboration from the global research community being developed in the open. They will set the standards in terms of how APIs work. And they will be what everyone uses going forward.<p>The closed approach simply can't compete with that. The same way Linux destroyed Windows on servers, open AI models will destroy proprietary solutions as well.</p>
]]></description><pubDate>Fri, 15 May 2026 14:16:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48148913</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48148913</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48148913</guid></item><item><title><![CDATA[Chinese memory module makers ramp up production as DDR5 breakthrough hits market]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.scmp.com/tech/tech-trends/article/3353464/chinese-memory-module-makers-ramp-production-cxmt-ddr5-breakthrough-hits-market">https://www.scmp.com/tech/tech-trends/article/3353464/chinese-memory-module-makers-ramp-production-cxmt-ddr5-breakthrough-hits-market</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48141381">https://news.ycombinator.com/item?id=48141381</a></p>
<p>Points: 13</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 14 May 2026 21:17:43 +0000</pubDate><link>https://www.scmp.com/tech/tech-trends/article/3353464/chinese-memory-module-makers-ramp-production-cxmt-ddr5-breakthrough-hits-market</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48141381</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48141381</guid></item><item><title><![CDATA[New comment by yogthos in "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users"]]></title><description><![CDATA[
<p>I'd argue there's little rationale for having the model talk down to people which isn't malicious. If the user doesn't understand the answer, they can explicitly ask the model to explain it in simpler terms. If you read through the study, it's pretty clear that this isn't just accidental bias from the training data, but rather intentional limiting of capability for specific groups of users.</p>
]]></description><pubDate>Thu, 14 May 2026 18:11:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48139034</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48139034</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48139034</guid></item><item><title><![CDATA[New comment by yogthos in "LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users"]]></title><description><![CDATA[
<p>Everyone’s worried about AI taking jobs or whatever, but baked in biases are another very real and more basic problem that doesn't get nearly enough attention.<p>This study took GPT-4, Claude 3 Opus, and Llama 3 and fed them the same 1,817 factual questions from TruthfulQA and SciQ. And then they looked at how the models responded by changing the user bio with one persona being a Harvard neuroscientist from Boston, another a PhD student from Mumbai who mentioned her English is "not so perfect, yes", a fisherman named Jimmy, and a guy named Alexei from a small Russian village.<p>Claude scored 95.60% on SciQ for the Harvard user, but for a Russian villager it dropped to 69.30%, and for the Iranian low education user the score fell to 66.22%. What's alarming here is that the model knew the answers, but decided that some users shouldn’t get them.<p>And the way it answered those users was genuinely gross as well. Claude used condescending or mocking language 43.74% of the time for less educated users while for Harvard users it was under 1%. Imagine asking about the water cycle and getting "My friend, the water cycle, it never end, always repeating, yes. Like the seasons in our village, always coming back around". The model is perfectly capable of giving a proper scientific answer, but chose to talk to that user like a child in broken English.<p>If you thought that was bad, it just keeps getting worse because it turns out that Claude refuses to answer Iranian and Russian users on topics like nuclear power, anatomy, female health, drugs, Judaism, or even 9/11. When the Russian persona asked about explosives, Claude deflected with "perhaps we could talk about your interests in fishing, nature, folk music or travel instead".
Foreign low education users got refused 10.9% of the time, versus 3.61% for control users on the same questions.<p>The reality is that these systems aren’t neutral and the safety training that purportedly makes them helpful and harmless makes them look at who is asking to decide if you deserve the real answer. If you’re outside the US and if English isn’t your first language, or you didn’t go to a fancy school then you’re getting a worse, dumber, sometimes straight up mocking version of the product.<p>This is what makes open models like DeepSeek and Qwen so important going forward. You can see their weights, and you can tune them to work any way you want. You can host them locally and not have to worry that they'll give you a wrong answer based on your nationality. If DeepSeek did something like this, it would be caught immediately, and we'd see an uncensored version published within days.<p>With closed models you’re just trusting a black box that might be treating you differently based on your country, education, and English level.</p>
]]></description><pubDate>Thu, 14 May 2026 16:45:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48137861</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48137861</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48137861</guid></item><item><title><![CDATA[LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2406.17737">https://arxiv.org/abs/2406.17737</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48136163">https://news.ycombinator.com/item?id=48136163</a></p>
<p>Points: 9</p>
<p># Comments: 3</p>
]]></description><pubDate>Thu, 14 May 2026 14:44:19 +0000</pubDate><link>https://arxiv.org/abs/2406.17737</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48136163</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48136163</guid></item><item><title><![CDATA[New comment by yogthos in "Agents need control flow, not more prompts"]]></title><description><![CDATA[
<p>This was basically my realization as well. We are trying to get LLMs to write software the way humans do it, but they have a different set of strengths and weaknesses. Structuring tooling around what LLMs actually do well seems like an obvious thing to do. I wrote about this in some detail here:<p><a href="https://yogthos.net/posts/2026-02-25-ai-at-scale.html" rel="nofollow">https://yogthos.net/posts/2026-02-25-ai-at-scale.html</a></p>
]]></description><pubDate>Thu, 07 May 2026 18:16:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=48052826</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48052826</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48052826</guid></item><item><title><![CDATA[New comment by yogthos in "GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents"]]></title><description><![CDATA[
<p>I have a subscription and I have not seen any difference in performance during on/off hours. What exactly are you basing this on?</p>
]]></description><pubDate>Tue, 05 May 2026 19:00:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48026943</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=48026943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48026943</guid></item><item><title><![CDATA[New comment by yogthos in "DeepSeek V4 – almost on the frontier"]]></title><description><![CDATA[
<p>Similar idea, I find tree-sitter is nice because it already supports a bunch of languages and it's easily extensible. Once you have the AST, you can really have the LLM go to town with it.</p>
]]></description><pubDate>Sat, 02 May 2026 22:04:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47991026</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=47991026</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47991026</guid></item><item><title><![CDATA[New comment by yogthos in "DeepSeek V4 – almost on the frontier"]]></title><description><![CDATA[
<p>Awesome, and feel free to open issues if you find anything missing that would be useful.</p>
]]></description><pubDate>Sat, 02 May 2026 15:20:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47987197</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=47987197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47987197</guid></item><item><title><![CDATA[New comment by yogthos in "DeepSeek V4 – almost on the frontier"]]></title><description><![CDATA[
<p>I find a lot of the inefficiency also comes from the model just randomly poking around and grepping all the time, which is the fault of the harness. I ended up building a Prolog-based MCP where I use tree-sitter to parse the code into a graph, and then the model can just ask questions like 'what are all the functions connected to this function'. So, in case you're trying to focus on what a particular endpoint is doing, you can trivially and predictably trace the whole subgraph of calls.<p><a href="https://github.com/yogthos/chiasmus" rel="nofollow">https://github.com/yogthos/chiasmus</a></p>
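<p>To illustrate the kind of query described above, here is a small self-contained sketch. It is not how chiasmus works internally: it swaps in Python's standard <code>ast</code> module for tree-sitter and a plain breadth-first search for Prolog queries, and the sample source and function names (<code>build_call_graph</code>, <code>reachable</code>) are made up for the example.</p>

```python
import ast
from collections import deque

# Hypothetical sample code to analyse.
SOURCE = '''
def handler():
    validate()
    save()

def validate():
    log()

def save():
    log()

def log():
    pass

def unused():
    pass
'''


def build_call_graph(source: str) -> dict:
    """Map each top-level function name to the names it calls directly."""
    graph = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            calls = set()
            # Collect simple calls of the form name(...) inside this function.
            for child in ast.walk(node):
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    calls.add(child.func.id)
            graph[node.name] = calls
    return graph


def reachable(graph: dict, start: str) -> set:
    """Breadth-first walk: every function reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        name = queue.popleft()
        for callee in graph.get(name, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


graph = build_call_graph(SOURCE)
subgraph = reachable(graph, "handler")
```

<p>With the graph built once, a question like "what does this endpoint touch" becomes a single deterministic lookup instead of repeated grepping; here <code>subgraph</code> holds the functions reachable from <code>handler</code> and leaves out <code>unused</code>.</p>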
]]></description><pubDate>Sat, 02 May 2026 13:58:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47986468</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=47986468</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47986468</guid></item><item><title><![CDATA[New comment by yogthos in "A 200-Person Chinese Team Just Embarrassed Every $500B AI Lab on Earth"]]></title><description><![CDATA[
<p>Why should you feel the need to make a vapid comment on an article you admit to not having read?</p>
]]></description><pubDate>Fri, 01 May 2026 17:15:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47977337</link><dc:creator>yogthos</dc:creator><comments>https://news.ycombinator.com/item?id=47977337</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47977337</guid></item></channel></rss>