<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: XCSme</title><link>https://news.ycombinator.com/user?id=XCSme</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 21 Jun 2026 09:35:43 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=XCSme" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Show HN: One hundred LLMs Generating a HTML/CSS Solar System]]></title><description><![CDATA[
<p>Article URL: <a href="https://aibenchy.com/showcase/solar-system-animation/">https://aibenchy.com/showcase/solar-system-animation/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48587146">https://news.ycombinator.com/item?id=48587146</a></p>
<p>Points: 5</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 18 Jun 2026 15:44:16 +0000</pubDate><link>https://aibenchy.com/showcase/solar-system-animation/</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48587146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48587146</guid></item><item><title><![CDATA[MariaDB now has a DuckDB storage engine]]></title><description><![CDATA[
<p>Article URL: <a href="https://mariadb.org/duckdb-storage-engine-for-mariadb-when-the-sea-lion-learns-to-quack/">https://mariadb.org/duckdb-storage-engine-for-mariadb-when-the-sea-lion-learns-to-quack/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48582061">https://news.ycombinator.com/item?id=48582061</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 18 Jun 2026 07:35:11 +0000</pubDate><link>https://mariadb.org/duckdb-storage-engine-for-mariadb-when-the-sea-lion-learns-to-quack/</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48582061</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48582061</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>Thanks for the feedback!<p>What are you using Claude models for? Coding only? Computer use? Which harness?</p>
]]></description><pubDate>Wed, 17 Jun 2026 15:28:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48571837</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48571837</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48571837</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>Also, Claude/Fable models are quite bad at instruction following: <a href="https://artificialanalysis.ai/evaluations/ifbench" rel="nofollow">https://artificialanalysis.ai/evaluations/ifbench</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 15:11:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48571602</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48571602</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48571602</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>On some it does, yes, also in real usage.<p>It avoided answering 2/21 tests in this specific benchmark, so that already caps the max score at about 90%.</p>
]]></description><pubDate>Wed, 17 Jun 2026 15:08:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=48571572</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48571572</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48571572</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>Well, most people didn't like Fable when it was available anyway, because it very often refused to answer questions.</p>
]]></description><pubDate>Wed, 17 Jun 2026 14:11:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=48570831</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48570831</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48570831</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>PS: Just added a cool feature, so you can filter the leaderboard for multiple models at once, by using a comma, like: <a href="https://aibenchy.com/?q=glm,claude" rel="nofollow">https://aibenchy.com/?q=glm,claude</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 12:29:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48569470</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48569470</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48569470</guid></item><item><title><![CDATA[New comment by XCSme in "GLM 5.2 Performance Benchmarks"]]></title><description><![CDATA[
<p>I also tested it[0]: quite similar to GLM 5, a few percent better, 30% faster and 50% more expensive.<p>[0]: <a href="https://aibenchy.com/?q=glm" rel="nofollow">https://aibenchy.com/?q=glm</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 12:28:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48569450</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48569450</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48569450</guid></item><item><title><![CDATA[New comment by XCSme in "GLM-5.2 is the new leading open weights model on Artificial Analysis"]]></title><description><![CDATA[
<p>I think the problem, as can also be seen on other benchmarks, is that most models nowadays are focused more and more purely on tool calling and coding.<p>This means that models are losing more and more general and domain-specific knowledge.<p>Look at those graphs on Artificial Analysis, GLM-5.1 still performs similarly or better:<p>AA-Omniscience Accuracy: <a href="https://i.snipboard.io/5DYmpx.jpg" rel="nofollow">https://i.snipboard.io/5DYmpx.jpg</a><p>IFBench: <a href="https://i.snipboard.io/74kg0R.jpg" rel="nofollow">https://i.snipboard.io/74kg0R.jpg</a><p>I still feel like models have not been getting any smarter for a few months already; they just changed their training to focus more on some areas than others, shifting the intelligence from one place to another, not necessarily increasing the overall intelligence or "AGI" score.</p>
]]></description><pubDate>Wed, 17 Jun 2026 11:48:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48569036</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48569036</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48569036</guid></item><item><title><![CDATA[New comment by XCSme in "GLM-5.2 is the new leading open weights model on Artificial Analysis"]]></title><description><![CDATA[
<p>Oh, or did you mean a smaller model than GLM-5.2 with similar capabilities?</p>
]]></description><pubDate>Wed, 17 Jun 2026 11:39:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568920</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48568920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568920</guid></item><item><title><![CDATA[New comment by XCSme in "GLM-5.2 is the new leading open weights model on Artificial Analysis"]]></title><description><![CDATA[
<p>Which Opus?<p>GLM-5.2 is already close to Opus-4.7 level:<p><a href="https://aibenchy.com/compare/anthropic-claude-opus-4-7-medium/z-ai-glm-5-2-medium/" rel="nofollow">https://aibenchy.com/compare/anthropic-claude-opus-4-7-mediu...</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 11:38:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568915</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48568915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568915</guid></item><item><title><![CDATA[New comment by XCSme in "GLM-5.2 is the new leading open weights model on Artificial Analysis"]]></title><description><![CDATA[
<p>In my tests[0] GLM-5.2 is not much better than GLM-5, and overall DeepSeek V4 Flash seems to be the better/more cost-effective choice:<p>[0]: <a href="https://aibenchy.com/compare/deepseek-deepseek-v4-flash-high/z-ai-glm-5-2-medium/" rel="nofollow">https://aibenchy.com/compare/deepseek-deepseek-v4-flash-high...</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 11:21:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568750</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48568750</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568750</guid></item><item><title><![CDATA[New comment by XCSme in "Has AI already killed self-help nonfiction books?"]]></title><description><![CDATA[
<p>Kind of. The 4-Hour Workweek was one of the first books I read (I started late, was never interested), and it had many good insights that led to me living a freer and more fulfilling life, which I might not have done otherwise.<p>Sure, you don't need to read an entire book to get the idea to do something, but that's a way in which ideas get into your head and, consciously or not, guide your steps in certain directions.<p>Of course, everything in self-help books should be taken with a grain of salt and also applied differently for each individual. It's just a way to get more ideas of what's possible, similar to travelling: it doesn't force you to do things in a certain way, it just shows you what's out there.</p>
]]></description><pubDate>Wed, 17 Jun 2026 10:06:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568129</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48568129</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568129</guid></item><item><title><![CDATA[New comment by XCSme in "Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?"]]></title><description><![CDATA[
<p>So, are you saying that local models are maybe better than we give them credit for? Because with some extra orchestration/processing we could improve the results?</p>
]]></description><pubDate>Mon, 15 Jun 2026 23:57:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48548776</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48548776</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48548776</guid></item><item><title><![CDATA[New comment by XCSme in "Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?"]]></title><description><![CDATA[
<p>> You will notice multiple "thinking" parts per "turn"<p>I thought that was the code harness simply minifying the outputs.
Many models no longer return the entire chain-of-thought (to avoid distillation attacks). So yes, we don't get the raw LLM output, but I think it's just the thinking summarized, not a complex orchestration or different models.<p>I do agree, though, that cloud models are now kind of a black box that's not only obfuscated but also changes over time. Companies seem to be changing model capabilities without notifying users, or even silently serving completely different models. This is even worse via OpenRouter: some of the providers serving open-source models serve heavily quantized versions or even completely different models.</p>
]]></description><pubDate>Mon, 15 Jun 2026 23:19:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48548412</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48548412</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48548412</guid></item><item><title><![CDATA[New comment by XCSme in "Ask HN: Has anyone replaced Claude/GPT with a local model for daily coding?"]]></title><description><![CDATA[
<p>> The SOTA models are a deep orchestration of multiple models operating together it isn't a single mode<p>I don't understand, what makes you think this is the case?<p>> how can GPT send thinking parts one after another with a markdown header summary of the thinking block itself<p>Can you give an example?</p>
]]></description><pubDate>Mon, 15 Jun 2026 21:43:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48547402</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48547402</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48547402</guid></item><item><title><![CDATA[New comment by XCSme in "Kimi K2.7-Code: open-source coding model with better token efficiency"]]></title><description><![CDATA[
<p>Seems to be at a similar level to Kimi K2.6, just more token-efficient and cheaper to run:<p><a href="https://aibenchy.com/compare/moonshotai-kimi-k2-6-medium/moonshotai-kimi-k2-7-code-medium/" rel="nofollow">https://aibenchy.com/compare/moonshotai-kimi-k2-6-medium/moo...</a></p>
]]></description><pubDate>Fri, 12 Jun 2026 18:46:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48507923</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48507923</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48507923</guid></item><item><title><![CDATA[New comment by XCSme in "Raspberry Pi 5 – 16GB RAM"]]></title><description><![CDATA[
<p>Good point, everyone was expecting GPU prices to go down and all crypto bros to drop their mining rigs, but they can just repurpose them now...</p>
]]></description><pubDate>Wed, 10 Jun 2026 22:29:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48483639</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48483639</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48483639</guid></item><item><title><![CDATA[New comment by XCSme in "Claude Fable 5"]]></title><description><![CDATA[
<p>It also does A LOT better on my hamster test: <a href="https://aibenchy.com/showcase/?q=claude#showcase=6efb87c28e3152b2" rel="nofollow">https://aibenchy.com/showcase/?q=claude#showcase=6efb87c28e3...</a></p>
]]></description><pubDate>Wed, 10 Jun 2026 14:11:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=48476579</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48476579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48476579</guid></item><item><title><![CDATA[New comment by XCSme in "Claude Fable 5"]]></title><description><![CDATA[
<p>Best hamster by far: <a href="https://aibenchy.com/showcase/?q=claude" rel="nofollow">https://aibenchy.com/showcase/?q=claude</a></p>
]]></description><pubDate>Wed, 10 Jun 2026 14:05:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48476506</link><dc:creator>XCSme</dc:creator><comments>https://news.ycombinator.com/item?id=48476506</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48476506</guid></item></channel></rss>