<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Bolwin</title><link>https://news.ycombinator.com/user?id=Bolwin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 02:14:43 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Bolwin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Bolwin in "Don't trust large context windows"]]></title><description><![CDATA[
<p>It's pretty hard to measure because most context rot comes from <i>related</i> context and the model has to be able to figure out which parts are truly relevant, which ones are relevant but stale, which ones to ignore, etc.<p>Each relevant thing is basically a rule. Trying to do something with 500 rules is what's hard.<p>If you take a standard benchmark and just prepend a random book to it, it will not capture that.</p>
]]></description><pubDate>Sun, 14 Jun 2026 15:02:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48527877</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48527877</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48527877</guid></item><item><title><![CDATA[New comment by Bolwin in "Don't trust large context windows"]]></title><description><![CDATA[
<p>I don't use Claude Code. I use my own handwritten agent (formerly using Pi) and know every token that goes into it. There are zero memories to confuse it. The system prompt is 200 tokens and completely self-consistent.<p>Plus I've found that the only time models go above 100k tokens is when they've started looping, at which point it's much better to go back anyway.<p>Anecdotally, most models know their recall is terrible (or have been trained to act as such); that's why they constantly reread files before editing or while reasoning.</p>
]]></description><pubDate>Sun, 14 Jun 2026 14:46:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=48527724</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48527724</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48527724</guid></item><item><title><![CDATA[New comment by Bolwin in "Don't trust large context windows"]]></title><description><![CDATA[
<p>I see this said often and find it insane given how many times I find opus models making basic recall mistakes at <100k tokens.<p>Personally I consider <60k to be the smart zone for opus. This is worse for opus 4.7 and 4.8 because of the more granular tokenizer.</p>
]]></description><pubDate>Sun, 14 Jun 2026 08:01:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48525166</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48525166</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48525166</guid></item><item><title><![CDATA[New comment by Bolwin in "Kimi K2.7-Code: open-source coding model with better token efficiency"]]></title><description><![CDATA[
<p>What do you mean by custom format? Non-json?</p>
]]></description><pubDate>Fri, 12 Jun 2026 16:47:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=48506405</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48506405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48506405</guid></item><item><title><![CDATA[New comment by Bolwin in "FrontierCode"]]></title><description><![CDATA[
<p>What is the "house" harness for minimax? They haven't released any</p>
]]></description><pubDate>Tue, 09 Jun 2026 01:34:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48455040</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48455040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48455040</guid></item><item><title><![CDATA[New comment by Bolwin in "The OnlyFans Economy of American AI"]]></title><description><![CDATA[
<p>If this is what we get without editors, I want everything I read to be without editors</p>
]]></description><pubDate>Sun, 07 Jun 2026 16:48:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48436549</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48436549</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48436549</guid></item><item><title><![CDATA[New comment by Bolwin in "Hacker News, Sans AI"]]></title><description><![CDATA[
<p>Probably not as fast as a simple regex, but static embedding models can get stupid fast, e.g. <a href="https://www.flowercomputer.com/news/fast-static-embedding/" rel="nofollow">https://www.flowercomputer.com/news/fast-static-embedding/</a></p>
]]></description><pubDate>Sat, 06 Jun 2026 05:53:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48421814</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48421814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48421814</guid></item><item><title><![CDATA[New comment by Bolwin in "Hacker News, Sans AI"]]></title><description><![CDATA[
<p>It is a little ironic coming from the most prolific AI poster here</p>
]]></description><pubDate>Sat, 06 Jun 2026 05:49:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=48421799</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48421799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48421799</guid></item><item><title><![CDATA[New comment by Bolwin in "The back cover of C++: The Language raises questions not answered by front cover"]]></title><description><![CDATA[
<p>Also is this an official Microsoft dev blog?<p>Probably not a good look back at publishing hq</p>
]]></description><pubDate>Sat, 06 Jun 2026 05:43:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48421772</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48421772</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48421772</guid></item><item><title><![CDATA[New comment by Bolwin in "Hacker News, Sans AI"]]></title><description><![CDATA[
<p>You can get the best of both worlds with a small embedding model</p>
]]></description><pubDate>Fri, 05 Jun 2026 21:54:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=48418866</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48418866</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48418866</guid></item><item><title><![CDATA[New comment by Bolwin in "Am I Unc?"]]></title><description><![CDATA[
<p>The social section is also just selecting for introverts or asocial people<p>But they expect a few wrong</p>
]]></description><pubDate>Fri, 05 Jun 2026 17:34:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48415716</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48415716</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48415716</guid></item><item><title><![CDATA[New comment by Bolwin in "MAI-Thinking-1"]]></title><description><![CDATA[
<p>In my experience above 60k quality noticeably drops.<p>30k for open source models</p>
]]></description><pubDate>Wed, 03 Jun 2026 15:34:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48385500</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48385500</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48385500</guid></item><item><title><![CDATA[New comment by Bolwin in "Step 3.7 Flash"]]></title><description><![CDATA[
<p>What's wrong with the name? The step function is a pretty well known one</p>
]]></description><pubDate>Fri, 29 May 2026 17:03:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48326013</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48326013</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48326013</guid></item><item><title><![CDATA[New comment by Bolwin in "Xiaomi MiMo-v2.5 Series API Permanent Price Reduction Up to 99%"]]></title><description><![CDATA[
<p>Here's another: <a href="https://xcancel.com/FireworksAI_HQ/status/2060103886028046739#m" rel="nofollow">https://xcancel.com/FireworksAI_HQ/status/206010388602804673...</a><p>Fireworks is processing 30T tokens a day on open models, or about 210T a week. So about 40% of gemini? I'd say that's pretty good.<p>Two more points: 
1. <a href="https://openrouter.ai/provider/fireworks" rel="nofollow">https://openrouter.ai/provider/fireworks</a> ~5B tokens average daily on openrouter from fireworks, which is a ratio of ~1:6000
2. <a href="https://openrouter.ai/rankings" rel="nofollow">https://openrouter.ai/rankings</a> total tokens on openrouter, ~4T daily, and more than half seems to be open. Say 2T.<p>If other providers' ratio is anywhere close to fireworks, that's on the order of 10 quadrillion open tokens daily.<p>That said, I'd guess the ratio is probably not nearly as high for most providers.</p>
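<p>A quick sanity check of the arithmetic above, as a minimal Python sketch. The inputs are only the rough figures quoted in this comment (30T/day, ~5B/day, ~2T/day), not measured data, and the variable names are made up for illustration.</p>

```python
# Back-of-envelope check of the rough figures quoted in the comment.
# Every input is an approximation taken from the comment, not measured data.
T = 10**12  # one trillion tokens
B = 10**9   # one billion tokens

fireworks_total_daily = 30 * T      # Fireworks, open-model tokens per day
fireworks_openrouter_daily = 5 * B  # Fireworks tokens per day routed via OpenRouter
openrouter_open_daily = 2 * T       # assumed open-model share of OpenRouter's ~4T per day

# Weekly Fireworks volume.
weekly = fireworks_total_daily * 7

# How much larger Fireworks' total volume is than its OpenRouter volume.
ratio = fireworks_total_daily // fireworks_openrouter_daily

# Extrapolation: assume every provider has the same ratio as Fireworks.
extrapolated = openrouter_open_daily * ratio

print(f"weekly: {weekly // T}T")
print(f"ratio: {ratio}:1")
print(f"extrapolated: {extrapolated // 10**15} quadrillion tokens/day")
```

<p>The extrapolation stands or falls on the assumption that other providers share Fireworks' ratio, which is probably too generous for most of them.</p>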
]]></description><pubDate>Thu, 28 May 2026 21:31:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=48315789</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48315789</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48315789</guid></item><item><title><![CDATA[New comment by Bolwin in "Unicode 18.0.0 Beta"]]></title><description><![CDATA[
<p>I've never seen emoji used for subtext. Usually they just repeat or emphasize what's in the text</p>
]]></description><pubDate>Thu, 28 May 2026 05:19:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48304888</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48304888</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48304888</guid></item><item><title><![CDATA[New comment by Bolwin in "Xiaomi MiMo-v2.5 Series API Permanent Price Reduction Up to 99%"]]></title><description><![CDATA[
<p>OpenRouter is not indicative of volume. Most high volume clients will go to the providers directly. There's no point in paying the 5% OR cut if you know what you want.</p>
]]></description><pubDate>Wed, 27 May 2026 03:19:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48289155</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48289155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48289155</guid></item><item><title><![CDATA[New comment by Bolwin in "Xiaomi MiMo-v2.5 Series API Permanent Price Reduction Up to 99%"]]></title><description><![CDATA[
<p>I mean there is a minor moat. Most people don't enjoy switching providers or models. If you can get people to trust you'll stay near frontier, they'll stick around even when you aren't the best. Claude is a prime example of this</p>
]]></description><pubDate>Wed, 27 May 2026 03:14:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48289127</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48289127</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48289127</guid></item><item><title><![CDATA[New comment by Bolwin in "Xiaomi MiMo-v2.5 Series API Permanent Price Reduction Up to 99%"]]></title><description><![CDATA[
<p>No one is producing one output token though.<p>And using up gpus for that cache is a pretty big opportunity cost. I highly doubt it's done in vram. That would be insane for the one hour caches.<p>So it's memory + the time it takes to unload/load into vram + the extra cost per output token<p>Is it a scam? Idk</p>
]]></description><pubDate>Wed, 27 May 2026 03:10:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48289093</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48289093</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48289093</guid></item><item><title><![CDATA[New comment by Bolwin in "YAML? That's Norway Problem"]]></title><description><![CDATA[
<p>Cause it's very verbose. A lot more syntax to break.<p>I personally think the best is one of the humanized json ones like <a href="https://maml.dev/" rel="nofollow">https://maml.dev/</a></p>
]]></description><pubDate>Sat, 23 May 2026 01:22:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48243592</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48243592</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48243592</guid></item><item><title><![CDATA[New comment by Bolwin in "Google Declaring War on the Web"]]></title><description><![CDATA[
<p>I don't see how being decentralized helps search. Makes it quite a bit harder if the fediverse is any indication</p>
]]></description><pubDate>Wed, 20 May 2026 22:34:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=48215213</link><dc:creator>Bolwin</dc:creator><comments>https://news.ycombinator.com/item?id=48215213</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48215213</guid></item></channel></rss>