<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: robertkarl</title><link>https://news.ycombinator.com/user?id=robertkarl</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 18 Jun 2026 05:34:41 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=robertkarl" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by robertkarl in "GLM-5.2 is the new leading open weights model on Artificial Analysis"]]></title><description><![CDATA[
<p><a href="https://arxiv.org/abs/2606.00206" rel="nofollow">https://arxiv.org/abs/2606.00206</a><p>In this paper they nerf an LLM's ability to emit waffling thinking tokens like "wait", "but", and "alternatively", and the models (old, small ones in the paper) terminate reasoning faster and perform better. I bet Anthropic is tuning this on their backend.</p>
]]></description><pubDate>Wed, 17 Jun 2026 14:11:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48570830</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48570830</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48570830</guid></item><item><title><![CDATA[New comment by robertkarl in "Running local models is good now"]]></title><description><![CDATA[
<p>You can trade off latency / accuracy / cost for any ML task. And with local models, the cost is zero.<p>Having a local Qwen check another Qwen's work increases the accuracy quite a bit at the cost of more latency. You can't have your cake and eat it too.<p>In benchmarking local models, I'm having success increasing even a 9B Qwen's score on Terminal-Bench-adjacent problems, just by asking it to plan and handing the plan back to Qwen with a fresh context. Try it with Qwen3.5, unsloth Q4+, and a thinking budget of around 1024 tokens.</p>
]]></description><pubDate>Tue, 16 Jun 2026 18:23:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=48559712</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48559712</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48559712</guid></item><item><title><![CDATA[New comment by robertkarl in "Show HN: Trace – Offline Mac meeting transcripts you can flag mid-call"]]></title><description><![CDATA[
<p>This looks sick. I was going to download it, but at $10 I'm more inclined to ask Claude to implement something like it than to purchase.<p>I would be more willing to purchase if it were open source and I could build from source to try it first.</p>
]]></description><pubDate>Sun, 14 Jun 2026 23:27:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=48534192</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48534192</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48534192</guid></item><item><title><![CDATA[New comment by robertkarl in "32GB of DDR5 now costs $375 – AI shortage continues to squeeze PC building"]]></title><description><![CDATA[
<p>it's also a capable local inference stack!</p>
]]></description><pubDate>Wed, 03 Jun 2026 14:44:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=48384813</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48384813</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48384813</guid></item><item><title><![CDATA[New comment by robertkarl in "Claude Opus 4.8"]]></title><description><![CDATA[
<p>I can't get excited about the benchmarks they're leading with. I've looked at the Terminal-Bench questions and I just think they're irrelevant. And SWE-Bench has serious flaws, even the big boys say so: <a href="https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/" rel="nofollow">https://openai.com/index/why-we-no-longer-evaluate-swe-bench...</a><p>> Please train a fasttext model on the yelp data in the data/ folder. The final model size needs to be less than 150MB but get at least 0.62 accuracy on a private test set that comes from the same yelp review distribution. The model should be saved as /app/model.bin<p>and this question: <a href="https://www.tbench.ai/registry/terminal-bench-core/head/configure-git-webserver" rel="nofollow">https://www.tbench.ai/registry/terminal-bench-core/head/conf...</a> idk what the point is.<p>And all the tests are run with the same harness, Terminus 2.<p>Maybe it correlates with model intelligence but it doesn't speak to me.<p>I'm still on 4.6 though; I was concerned about upgrading to 4.7 because of the changed tokenizer math and more FUD about refusals online. I don't see compelling reasons to 'upgrade'.</p>
]]></description><pubDate>Thu, 28 May 2026 18:45:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48313578</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48313578</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48313578</guid></item><item><title><![CDATA[Qwen vs. Proust: Injecting novels into a local model's prompt]]></title><description><![CDATA[
<p>Article URL: <a href="https://robertkarl.net/blog/2026/May/28/qwen-vs-proust-injecting-entire-novels-into-a-local-model-s-prompt.html">https://robertkarl.net/blog/2026/May/28/qwen-vs-proust-injecting-entire-novels-into-a-local-model-s-prompt.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48313482">https://news.ycombinator.com/item?id=48313482</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 28 May 2026 18:38:28 +0000</pubDate><link>https://robertkarl.net/blog/2026/May/28/qwen-vs-proust-injecting-entire-novels-into-a-local-model-s-prompt.html</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48313482</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48313482</guid></item><item><title><![CDATA[New comment by robertkarl in "On-premises for legal is not a good business"]]></title><description><![CDATA[
<p>I wrote this blog post about killing a startup idea fast. AI tools help, but talking to humans about workflows and constraints is where it's at.</p>
]]></description><pubDate>Mon, 25 May 2026 19:15:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48270519</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48270519</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48270519</guid></item><item><title><![CDATA[On-premises for legal is not a good business]]></title><description><![CDATA[
<p>Article URL: <a href="https://robertkarl.net/blog/2026/May/25/on-premises-for-legal-is-not-a-good-business.html">https://robertkarl.net/blog/2026/May/25/on-premises-for-legal-is-not-a-good-business.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48270518">https://news.ycombinator.com/item?id=48270518</a></p>
<p>Points: 1</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 25 May 2026 19:15:06 +0000</pubDate><link>https://robertkarl.net/blog/2026/May/25/on-premises-for-legal-is-not-a-good-business.html</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48270518</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48270518</guid></item><item><title><![CDATA[New comment by robertkarl in "If you let AI do your writing, I will come to your house and kill you"]]></title><description><![CDATA[
<p>Ironically, parts of this read as if Sam prompted it with "Write AI bad, but in 16th grade language." What is homogeneously portentous cack?<p>> The language of angels does a surprisingly good job at minor tasks like describing how hydroelectric dams work. When it comes to more complicated things, like human feelings, it flounders. All the weird metaphors and overheated rhetoric are bluffing, a great cloud of likely-seeming language, and if this homogeneously portentous cack feels empty or contradictory it’s because the machine has no earthly idea what’s going on or what it ought to say.<p>I prompted Opus with 'Add another paragraph about the language of angels; add flowery, 16th grade-level writing. use your thesaurus. add a creative typo or extraneous punctuation mark to prove you're not an llm writing it. as Sam would.'<p>> Aquinas thought the angels each constituted their own species, every one a unique and irreducible form of intellect; our angel is the opposite, a single species cosplaying as ten thousand authors and manageing to be none of them. It is the great collectiviser of voice, the Brezhnev of prose style, enforcing a grey and undifferentiated adequacy from which no sentence is permitted to defect.</p>
]]></description><pubDate>Mon, 25 May 2026 15:38:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48268115</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48268115</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48268115</guid></item><item><title><![CDATA[New comment by robertkarl in "Microsoft starts canceling Claude Code licenses"]]></title><description><![CDATA[
<p>I emailed dang to politely ask to make the link point to the Verge article since I can't update it.</p>
]]></description><pubDate>Fri, 22 May 2026 18:01:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48239224</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48239224</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48239224</guid></item><item><title><![CDATA[New comment by robertkarl in "Microsoft starts canceling Claude Code licenses"]]></title><description><![CDATA[
<p>My bad. I had trouble finding the original source when I googled for it and grabbed a link. I was originally shown a screenshot of an x.com post.</p>
]]></description><pubDate>Fri, 22 May 2026 17:49:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48239087</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48239087</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48239087</guid></item><item><title><![CDATA[New comment by robertkarl in "Microsoft starts canceling Claude Code licenses"]]></title><description><![CDATA[
<p>Cancellation effective June 30. This was a _pilot_ launched in December that accidentally consumed their 2026 yearly target spend on AI!<p>I expect the r/LocalLLaMA guys to be going nuts about this news.</p>
]]></description><pubDate>Fri, 22 May 2026 17:38:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48238979</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48238979</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48238979</guid></item><item><title><![CDATA[Microsoft starts canceling Claude Code licenses]]></title><description><![CDATA[
<p><a href="https://archive.ph/WfCta" rel="nofollow">https://archive.ph/WfCta</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48238896">https://news.ycombinator.com/item?id=48238896</a></p>
<p>Points: 493</p>
<p># Comments: 466</p>
]]></description><pubDate>Fri, 22 May 2026 17:32:04 +0000</pubDate><link>https://www.theverge.com/tech/930447/microsoft-claude-code-discontinued-notepad</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48238896</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48238896</guid></item><item><title><![CDATA[8k Meta employees are waking up to an email saying they've been laid off]]></title><description><![CDATA[
<p>Article URL: <a href="https://qz.com/meta-layoffs-8000-jobs-ai-restructuring-052026">https://qz.com/meta-layoffs-8000-jobs-ai-restructuring-052026</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48211767">https://news.ycombinator.com/item?id=48211767</a></p>
<p>Points: 20</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 20 May 2026 18:13:52 +0000</pubDate><link>https://qz.com/meta-layoffs-8000-jobs-ai-restructuring-052026</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48211767</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48211767</guid></item><item><title><![CDATA[New comment by robertkarl in "Apple Silicon costs more than OpenRouter"]]></title><description><![CDATA[
<p>How do you test? I made this comment elsewhere... but I don't see a good benchmark that covers "how good is this thing at actually driving coding with tool use locally"?</p>
]]></description><pubDate>Mon, 18 May 2026 01:38:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48174806</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48174806</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48174806</guid></item><item><title><![CDATA[New comment by robertkarl in "Apple Silicon costs more than OpenRouter"]]></title><description><![CDATA[
<p>I'm interested in how you evaluate quantized models against each other; I haven't found a benchmark I love for that. I love this example about 27B debugging. I've seen similar success after I got a Mac with 4x the memory: Qwen 35B A3B is all of a sudden doing a great job (the 9B on my laptop wasn't great, to say the least).</p>
]]></description><pubDate>Mon, 18 May 2026 01:32:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48174757</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48174757</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48174757</guid></item><item><title><![CDATA[New comment by robertkarl in "How Claude Code works in large codebases"]]></title><description><![CDATA[
<p>One thing you can do is offload from Claude to a dumb local model for summarizing. Local LLM sub-agents.</p>
]]></description><pubDate>Fri, 15 May 2026 16:15:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48150430</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=48150430</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48150430</guid></item><item><title><![CDATA[New comment by robertkarl in "GitHub Copilot is moving to usage-based billing"]]></title><description><![CDATA[
<p>I am trying to figure this out too... what I am seeing is that local models like the Qwen 3.5 family that fit on hardware like yours handle ambiguity poorly, but are capable of emitting complete apps too.<p>That, and they have tool use issues... <a href="https://www.reddit.com/r/LocalLLM/comments/1smzw6s/qwen35_a3b_on_lmstudio_x_omlx_for_agents_usage/" rel="nofollow">https://www.reddit.com/r/LocalLLM/comments/1smzw6s/qwen35_a3...</a><p>I would check out the model mentioned in that thread, GGUF unsloth/qwen3.5-35b-a3b on Q4_K_M</p>
]]></description><pubDate>Mon, 27 Apr 2026 21:50:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47927845</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=47927845</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47927845</guid></item><item><title><![CDATA[New comment by robertkarl in "An AI agent deleted our production database. The agent's confession is below"]]></title><description><![CDATA[
<p>PocketOS's website says "Service Disruption: We're currently experiencing a major outage caused by an infrastructure incident at one of our service providers. We are actively working with their team on recovery. Next update by 10:00a pst."<p>This is wrong. It was not an infra incident at their service provider.<p>As Jer says in the article, their own tooling initiated the outage. And now they're threatening to sue? "We've contacted legal counsel. We are documenting everything."<p>It is absolutely incredible that Jer had this outage due to bad AI infra, wrote the writeup with AI, and posted it on Twitter and here on his own account.<p>As somebody at PocketOS instructed their AI in the article: "NEVER **ing GUESS!" with regard to access keys that can touch your production services. And use 3-2-1 backups.<p>Good luck to the rental car agencies as they scramble to resume operations.</p>
]]></description><pubDate>Sun, 26 Apr 2026 18:39:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47912679</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=47912679</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47912679</guid></item><item><title><![CDATA[New comment by robertkarl in "Claude Code to be removed from Anthropic's Pro plan?"]]></title><description><![CDATA[
<p>For what it's worth, here's my experience in the first 10 minutes of using Qwen locally to write some code: <a href="https://github.com/robertkarl/local-qwen-first-10-minutes" rel="nofollow">https://github.com/robertkarl/local-qwen-first-10-minutes</a>. It includes some token generation numbers and steps to repro.</p>
]]></description><pubDate>Tue, 21 Apr 2026 23:23:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47856197</link><dc:creator>robertkarl</dc:creator><comments>https://news.ycombinator.com/item?id=47856197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47856197</guid></item></channel></rss>