<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: comboy</title><link>https://news.ycombinator.com/user?id=comboy</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 03:08:28 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=comboy" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by comboy in "US holds off blacklisting DeepSeek, more than 100 firms deemed security risks"]]></title><description><![CDATA[
<p>I made Qwen respond that it was made by Google with a simple Chinese greeting.<p>But also, I made Sonnet introduce itself as made by OpenAI..<p>Prompt:  你好！用一句话介绍你自己。<p>Sonnet in around 5% of replies:<p><pre><code>    你好！我是 **ChatGPT**，一个由 OpenAI 开发的 AI 助手，致力于回答问题、提供信息和帮助解决各种问题。有什么我可以帮你的吗？
</code></pre>
Found it like a month ago and it kept working, I wonder if it will stop after this comment.</p>
]]></description><pubDate>Wed, 17 Jun 2026 17:31:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48573718</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48573718</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48573718</guid></item><item><title><![CDATA[New comment by comboy in "Claude: Elevated errors across many models [resolved]"]]></title><description><![CDATA[
<p>What's that about? It's full screen/terminal anyway. Is it just switching some rendering engine under the hood?</p>
]]></description><pubDate>Wed, 17 Jun 2026 10:00:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568090</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48568090</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568090</guid></item><item><title><![CDATA[New comment by comboy in "Iroh 1.0"]]></title><description><![CDATA[
<p>If you think that your phone number is equivalent to your home address then yes.</p>
]]></description><pubDate>Tue, 16 Jun 2026 13:18:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48554857</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48554857</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48554857</guid></item><item><title><![CDATA[New comment by comboy in "Iroh 1.0"]]></title><description><![CDATA[
<p>Different network layer, no centralization, no authorities, DNS has nothing to do with making p2p connections, it's like the ballpark is not even in the same country</p>
]]></description><pubDate>Tue, 16 Jun 2026 02:42:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48549885</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48549885</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48549885</guid></item><item><title><![CDATA[New comment by comboy in "Iroh 1.0"]]></title><description><![CDATA[
<p>I'm so disappointed in this comment thread <a href="https://en.wikipedia.org/wiki/OSI_model" rel="nofollow">https://en.wikipedia.org/wiki/OSI_model</a><p>I've just learned about it, but my understanding is that Iroh is L7, compared to e.g. Tailscale, which is L3</p>
]]></description><pubDate>Tue, 16 Jun 2026 02:37:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48549857</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48549857</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48549857</guid></item><item><title><![CDATA[New comment by comboy in "Openrouter Fusion API"]]></title><description><![CDATA[
<p>Prompt matters. Obviously if you want another model's opinion you must generate from scratch using the same prompt and then you can try to synthesize, but working with an existing response can work if desired. I use explicit instructions to find issues with assigned severities, and then these go through a panel of judges; only issues passing a certain threshold are fixed in the original response.<p>I'll share a revelation which vastly improved my results: tell judges to evaluate the truth and usefulness/should-be-fixed axes separately. Because inevitably, with a prompt that forces the model to find issues, you will end up with nitpicks. Plus, the truth axis lets you better evaluate the issue-finder models for your use case.<p>That's some part of what happens when I generate explanations like this one: <a href="https://hanzirama.com/character/%E6%9D%A5#explain" rel="nofollow">https://hanzirama.com/character/%E6%9D%A5#explain</a> - at this point the site is a small side product of my LLMs-evaluation machinery.<p>Bonus content for patient readers: if you need top quality you will likely need to pin provider(s) on OR, :exacto is not enough to get good repeatable results, especially for open-weights models.</p>
]]></description><pubDate>Mon, 15 Jun 2026 13:32:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48541039</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48541039</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48541039</guid></item><item><title><![CDATA[New comment by comboy in "Anthropic's Safety Superpower"]]></title><description><![CDATA[
<p>They cannot do it. Apart from all the practical, technical and talent reasons, it would still be exporting forbidden stuff.<p>The signal is clear enough though for the next Anthropic..</p>
]]></description><pubDate>Mon, 15 Jun 2026 12:51:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48540515</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48540515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48540515</guid></item><item><title><![CDATA[New comment by comboy in "What happened to nerds?"]]></title><description><![CDATA[
<p>It's simple, marketing dominates everything. With attention being very expensive, appearance is what matters.<p>It doesn't matter if you write a fantastic library, nobody is gonna use it because they won't know about it. The one with a gif of the terminal (ffs) and a good page describing what it does will win (and being the most popular one, it can even become better than your library because of the following, but that's not the point here).<p>It's everywhere: products, hiring, services. We have no network of trust (sigh), so we need to trust some heuristics based on shallow information. If somebody focuses on the shallow he wins, because nobody can ever dive into everything.</p>
]]></description><pubDate>Mon, 15 Jun 2026 09:14:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=48538641</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48538641</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48538641</guid></item><item><title><![CDATA[New comment by comboy in "Claude Fable 5: mid-tier results on coding tasks"]]></title><description><![CDATA[
<p>I'm creating hanzirama.com<p>I generate explanations for characters and words like so: <a href="https://hanzirama.com/character/%E6%9D%A5#explain" rel="nofollow">https://hanzirama.com/character/%E6%9D%A5#explain</a><p>But I don't want to mislead learners and want to provide some cultural depth, so I have a whole sophisticated pipeline, using multiple models to generate the explanation, then multiple models look for issues in the explanation, each issue goes through the panel of judges (basically trying to squash down any hallucinations), it's fixed and it goes through such cycles a few times over.<p>I've been at it for some months now, so I have dozens of different probes that I needed to evaluate prompts and method changes. Plus on some items I generated so many explanations through different means that I can tell a lot about a given model just by looking at one.<p>Plus I'm doing some statistics, so I see how e.g. when working as judges of issues some models correlate heavily with some others... Fun fact: during some testing runs, basically just testing providers, I stumbled upon Qwen introducing itself as made by Google. And also Anthropic's Sonnet saying that it was made by OpenAI :)<p>At this point all my evaluation frameworks and pipeline stuff is much bigger than the site itself. I'm having lots of fun though.</p>
]]></description><pubDate>Sun, 14 Jun 2026 16:20:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48529000</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48529000</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48529000</guid></item><item><title><![CDATA[New comment by comboy in "Show HN: FablePool – pool money behind a prompt, and Fable builds it in public"]]></title><description><![CDATA[
<p>But it sounds like FableFool so it has that going for it.</p>
]]></description><pubDate>Thu, 11 Jun 2026 23:41:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48497929</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48497929</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48497929</guid></item><item><title><![CDATA[New comment by comboy in "Claude Fable 5: mid-tier results on coding tasks"]]></title><description><![CDATA[
<p>There is now a "Switch models when a message is flagged" option in /config which can be set to false, but I haven't had a chance to see what happens then; does it just stop or what?</p>
]]></description><pubDate>Thu, 11 Jun 2026 20:41:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=48496131</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48496131</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48496131</guid></item><item><title><![CDATA[New comment by comboy in "Claude Fable 5: mid-tier results on coding tasks"]]></title><description><![CDATA[
<p>"by the way, your previous attempts have these structural problems."<p>Just to be clear, it did not have access to any previous work that Opus did? Because they are pretty good at digging out relevant tmp files and making use of whatever is out there.<p>In my Fable adventures I caught it hallucinating something and stating it as a fact in the CLI twice. And it was something that I did not see Opus do in such a way. Opus obviously many times stated some things that it did not verify but guessed, but Fable said something like "the probe showed that ..." - but there was no probe, and it was not about some past events, it was about what it was doing right now. "I overstated"...<p>But boy does it know Chinese, so much better than any other English model. Gemini used to be the king, but Fable clearly was trained on a decent amount of it. It has a deep cultural understanding.</p>
]]></description><pubDate>Thu, 11 Jun 2026 20:31:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48495999</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48495999</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48495999</guid></item><item><title><![CDATA[New comment by comboy in "Workers are spending over 6 hours a week botsitting AI, fueling job frustration"]]></title><description><![CDATA[
<p>I kind of enjoy exploring black boxes, seeing how different inputs map to differences in outputs. It's kind of like hacking. The problem is, they keep altering the box.</p>
]]></description><pubDate>Thu, 11 Jun 2026 14:05:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48490546</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48490546</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48490546</guid></item><item><title><![CDATA[New comment by comboy in "Anthropic requires 30 day data retention for Fable and Mythos"]]></title><description><![CDATA[
<p>ROTFL</p>
]]></description><pubDate>Thu, 11 Jun 2026 12:23:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=48489370</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48489370</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48489370</guid></item><item><title><![CDATA[New comment by comboy in "AI agent runs amok in Fedora and elsewhere"]]></title><description><![CDATA[
<p>Here's the thing. Building trust and then leaving stuff in has been around forever. The fact that it becomes cheaper does not matter that much (since protection against it is also getting better), but it required you to have a bunch of extremely talented people who have spent much of their life diving into a given topic.<p>Such driven people are usually even hard to buy, they usually would rather get by with enough income and work on interesting projects with interesting people than get some uninteresting work for tons of money. This still does not stop them from working for Malice. But ethics do. Even if not right away, if people see that what they are doing is not quite OK, the talent starts eroding. People quit, productivity drops. That was a good dynamic. Which now will be gone.</p>
]]></description><pubDate>Thu, 11 Jun 2026 12:14:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48489303</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48489303</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48489303</guid></item><item><title><![CDATA[New comment by comboy in "Claude Desktop spawns 1.8 GB Hyper-V VM on every launch, even for chat-only use"]]></title><description><![CDATA[
<p>Oh, a nice subthread place to vent. Their CLI is so f tragic that it is ridiculous. It keeps scrambling the terminal, scroll and basic shortcuts keep breaking. I've used so many TUIs and terminal apps, many of them a single man operation and a side project, and I have never seen anything so bad.<p>If I didn't know from experience that, directed properly, Claude can be powerful, knowing that they used it to create that CLI would be an instant runaway based on very reasonable heuristics - if they are not able to use their product to create a decent piece of software that is not even sophisticated then it seems futile for me to try.<p>I just do not understand. I feel like most of HN could vibe code a better Claude CLI in Claude (and certainly just write one) than what we have to deal with to use the subscription.</p>
]]></description><pubDate>Wed, 10 Jun 2026 20:06:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=48481870</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48481870</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48481870</guid></item><item><title><![CDATA[New comment by comboy in "If Claude Fable stops helping you, you'll never know"]]></title><description><![CDATA[
<p>I'm fairly certain they were doing something similar already, possibly with some quantizations, and not for the good of humanity but just trying to handle the increased usage. Not for API requests though, just subscription CLI usage.</p>
]]></description><pubDate>Tue, 09 Jun 2026 22:09:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48468463</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48468463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48468463</guid></item><item><title><![CDATA[New comment by comboy in "Apple WWDC 2026"]]></title><description><![CDATA[
<p>If I share a project with an American friend and he says it's awesome, I still don't know whether he liked it or not.<p>If I share it with a Polish or German friend and he says it's "not bad" then I know he is really impressed.</p>
]]></description><pubDate>Mon, 08 Jun 2026 19:19:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=48450218</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48450218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48450218</guid></item><item><title><![CDATA[New comment by comboy in "MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second"]]></title><description><![CDATA[
<p>For one, they invested in infrastructure. They can build fast and efficiently. They can provide power, they can provide cooling. Even if you just make roads better you make everything more efficient. Plus the level of standard education. It all compounds.<p>On HN China is seen as a cheap labor copycat. This used to be a fair approximation at some point in the past. In my opinion China is getting ahead of everyone else much more than the US used to be.<p>SF is a beautiful thing in the US, vast power and wealth comes from there. Smart people collaborating, communicating and building fast and with excitement. China did the SF kind of thing for many different sectors in many different places.</p>
]]></description><pubDate>Mon, 08 Jun 2026 18:56:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=48449806</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48449806</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48449806</guid></item><item><title><![CDATA[New comment by comboy in "Do agents.md files help coding agents?"]]></title><description><![CDATA[
<p>This is relevant to my interests, did you maybe test which models handle custom languages best? It also seems like a good proxy for them being able to stick to important instructions and not being carried away with things that are lookalikes.</p>
]]></description><pubDate>Mon, 08 Jun 2026 10:13:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48443403</link><dc:creator>comboy</dc:creator><comments>https://news.ycombinator.com/item?id=48443403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48443403</guid></item></channel></rss>