<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: criemen</title><link>https://news.ycombinator.com/user?id=criemen</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 24 May 2026 20:52:34 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=criemen" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by criemen in "DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost"]]></title><description><![CDATA[
<p>> Ah, reminds me of good old "There are only 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors."<p>You quip, but LLM KV caching (from the harness side) is quite easy: You get a cache hit on stable prompt prefixes, period. That means you want to keep the prefix stable, and only append at the end of the conversation.
Made-up example: Don't put the git branch name into the system prompt (which comes first), because whenever the branch name changes, that'd invalidate the cache for the entire prompt.<p>Getting this right basically requires some care not to modify the prefix by accident, and some design around communicating the things that can change (user configuration, working dir, git information, ...).</p>
]]></description><pubDate>Sun, 24 May 2026 16:27:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=48258625</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=48258625</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48258625</guid></item><item><title><![CDATA[New comment by criemen in "Project Glasswing: An Initial Update"]]></title><description><![CDATA[
<p>> There's also a runaway effect of model improvement from the discovery, triage and fix data. This is likely already the most potent corpus of curated offensive data ever assembled and will only get better.<p>But that corpus of data is accessible to all competitors, American or not.
I don't believe that this can't be replicated. I'd posit that there's enough annotated data out there (CVE+patch), only increasing thanks to Mythos, that if you specifically RL for this scenario, you can improve your model's performance on finding vulnerabilities without access to Mythos.</p>
]]></description><pubDate>Fri, 22 May 2026 22:48:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48242583</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=48242583</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48242583</guid></item><item><title><![CDATA[New comment by criemen in "Cursor Introduces Composer 2.5"]]></title><description><![CDATA[
<p>Well, is that a statement about the quality of Opus 4.7 or about Composer 2.5? :P</p>
]]></description><pubDate>Mon, 18 May 2026 18:20:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48183442</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=48183442</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48183442</guid></item><item><title><![CDATA[New comment by criemen in "Measuring Claude 4.7's tokenizer costs"]]></title><description><![CDATA[
<p>> Is there an equivalent ultra-high-end LLM you can have if you’re willing to pay? Or does it not exist because it would cost too much to train?<p>I guess at the time that was GPT-4.5. I don't think people used it a lot because it was crazy expensive, and not that much better than the rest of the crop.</p>
]]></description><pubDate>Fri, 17 Apr 2026 21:04:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47810552</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47810552</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47810552</guid></item><item><title><![CDATA[New comment by criemen in "Measuring Claude 4.7's tokenizer costs"]]></title><description><![CDATA[
<p>> it's them trying to push the models to burn less compute<p>I'm curious, how does using more tokens save compute?</p>
]]></description><pubDate>Fri, 17 Apr 2026 21:03:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47810538</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47810538</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47810538</guid></item><item><title><![CDATA[New comment by criemen in "Škoda DuoBell: A bicycle bell that penetrates noise-cancelling headphones"]]></title><description><![CDATA[
<p>Pretty cool if true!</p>
]]></description><pubDate>Wed, 08 Apr 2026 09:29:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47687616</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47687616</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47687616</guid></item><item><title><![CDATA[New comment by criemen in "No, it doesn't cost Anthropic $5k per Claude Code user"]]></title><description><![CDATA[
<p>> What people don't realize is that cache is <i>free</i><p>I'm incredibly salty about this: they're essentially heavily monetizing something that allows them to sell their inference at premium prices to more users. Without any caching, they'd have much less capacity available.</p>
]]></description><pubDate>Tue, 10 Mar 2026 09:08:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47320758</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47320758</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47320758</guid></item><item><title><![CDATA[New comment by criemen in "Giving LLMs a personality is just good engineering"]]></title><description><![CDATA[
<p>I tried ChatGPT over the holidays (paid) vs. claude.ai (paid).
After trying some prompts that worked well on Claude in ChatGPT, I understand why people are so annoyed about AI slop. The speech patterns in text output for ChatGPT are both obvious and annoying, and impossible to unsee when people use them in written communication.<p>Claude isn't without problems ("You're absolutely right"), but I feel that some of the perception there is around the limited set of phrases the coding agent uses regularly, and comes less from the multi-paragraph responses from the chatbot.</p>
]]></description><pubDate>Wed, 04 Mar 2026 08:09:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47244564</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47244564</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47244564</guid></item><item><title><![CDATA[New comment by criemen in "MacBook Air with M5"]]></title><description><![CDATA[
<p>> Out of curiosity, what are some good use cases for a MBP now with the MBAs being so powerful?<p>Local software development (node/TS). When opus-4.6-fast launched, it felt like part of the bottleneck in turnaround time moved from inference to the validation steps, i.e. executing tests, running the linter, etc. Granted, that's with endpoint management slowing down I/O, and hopefully tsgo and some eslint replacement will speed things up significantly over there.</p>
]]></description><pubDate>Tue, 03 Mar 2026 19:06:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47237153</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47237153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47237153</guid></item><item><title><![CDATA[New comment by criemen in "Building SQLite with a small swarm"]]></title><description><![CDATA[
<p>Even if it was copying sqlite code over, wouldn't the ability to automatically rewrite sqlite in Rust be a valuable asset?</p>
]]></description><pubDate>Mon, 16 Feb 2026 11:29:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47033778</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47033778</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47033778</guid></item><item><title><![CDATA[New comment by criemen in "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"]]></title><description><![CDATA[
<p>> I had assumed that reasoning models should easily be able to answer this correctly.<p>I thought so too, yet Opus 4.6 with extended thinking (on claude.ai) gives me
> Walk. At 50 meters you'd spend more time parking and maneuvering at the car wash than the walk itself takes. Drive the car over only if the wash requires the car to be there (like a drive-through wash), then walk home and back to pick it up.<p>which is still pretty bad.</p>
]]></description><pubDate>Mon, 16 Feb 2026 11:21:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47033724</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47033724</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47033724</guid></item><item><title><![CDATA[New comment by criemen in "I fixed Windows native development"]]></title><description><![CDATA[
<p>This is amazing.<p>At $workplace, we have a script that extracts a toolchain from a GitHub Actions Windows runner, packages it up, and stuffs it into git LFS, from where it is pulled by bazel as a C++ toolchain.<p>This is the more scalable way, and I assume it could still somewhat easily be integrated into a bazel build.</p>
]]></description><pubDate>Sun, 15 Feb 2026 12:50:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47023282</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47023282</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47023282</guid></item><item><title><![CDATA[New comment by criemen in "Two different tricks for fast LLM inference"]]></title><description><![CDATA[
<p>One other thing I'd assume Anthropic is doing is routing all fast requests to the latest-gen hardware. They most certainly have a diverse fleet of inference hardware (TPUs, GPUs of different generations), and fast will only be served by whatever is fastest, whereas the general inference workload will be more spread out.</p>
]]></description><pubDate>Sun, 15 Feb 2026 10:04:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47022513</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47022513</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47022513</guid></item><item><title><![CDATA[Hamming, "You and Your Research" (1995) [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=a1zDuOPkMSw">https://www.youtube.com/watch?v=a1zDuOPkMSw</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47016112">https://news.ycombinator.com/item?id=47016112</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 14 Feb 2026 17:01:08 +0000</pubDate><link>https://www.youtube.com/watch?v=a1zDuOPkMSw</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=47016112</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47016112</guid></item><item><title><![CDATA[New comment by criemen in "Two Weeks Until Tapeout"]]></title><description><![CDATA[
<p>> aka: For those not living in 2026, we have uncovered a new clue to the mystery of where all the low-power DRAM chips have suddenly vanished to!<p>I love the writing style!</p>
]]></description><pubDate>Sun, 25 Jan 2026 12:03:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46753353</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46753353</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46753353</guid></item><item><title><![CDATA[New comment by criemen in "Nvidia Kicks Off the Next Generation of AI with Rubin"]]></title><description><![CDATA[
<p>What's the power hookup to just boot one rack?
I'd imagine that's more than you get anywhere in residential areas for a single house.</p>
]]></description><pubDate>Thu, 08 Jan 2026 20:41:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46546168</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46546168</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46546168</guid></item><item><title><![CDATA[New comment by criemen in "I switched from VSCode to Zed"]]></title><description><![CDATA[
<p>> I’m currently using a mix of Zed, Sublime, and VS Code.<p>Can you elaborate on when you use which editor?
I'd have imagined that there's value in learning and using one editor in-depth, instead of switching around based on use-case, so I'd love to learn more about your approach.</p>
]]></description><pubDate>Mon, 05 Jan 2026 23:05:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=46506414</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46506414</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46506414</guid></item><item><title><![CDATA[Is AI actually a Bubble?]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.newyorker.com/culture/open-questions/is-ai-actually-a-bubble">https://www.newyorker.com/culture/open-questions/is-ai-actually-a-bubble</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46256759">https://news.ycombinator.com/item?id=46256759</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 13 Dec 2025 18:32:57 +0000</pubDate><link>https://www.newyorker.com/culture/open-questions/is-ai-actually-a-bubble</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46256759</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46256759</guid></item><item><title><![CDATA[New comment by criemen in "Has the cost of building software dropped 90%?"]]></title><description><![CDATA[
<p>> This takes a fairly large mindset shift, but the hard work is the conceptual thinking, not the typing.<p>But the hard work always was the conceptual thinking? At least at and beyond the Senior level, for me it was always the thinking that's the hard work, not converting the thoughts into code.</p>
]]></description><pubDate>Mon, 08 Dec 2025 22:25:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=46198484</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46198484</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46198484</guid></item><item><title><![CDATA[New comment by criemen in "Has the cost of building software dropped 90%?"]]></title><description><![CDATA[
<p>The large open-weights models aren't really usable for local running (even with current hardware), but multiple providers compete on running inference for you, so it's reasonable to assume that there is and will be a functioning marketplace.</p>
]]></description><pubDate>Mon, 08 Dec 2025 22:14:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46198364</link><dc:creator>criemen</dc:creator><comments>https://news.ycombinator.com/item?id=46198364</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46198364</guid></item></channel></rss>