<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: puppystench</title><link>https://news.ycombinator.com/user?id=puppystench</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 24 Apr 2026 17:15:18 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=puppystench" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by puppystench in "An update on recent Claude Code quality reports"]]></title><description><![CDATA[
<p>The Claude UI still only has "adaptive" reasoning for Opus 4.7, making it functionally useless for scientific/coding work compared to older models (as Opus 4.7 will randomly stop reasoning after a few turns, even when prompted otherwise). There's no way this is just a bug and not a choice to save tokens.</p>
]]></description><pubDate>Thu, 23 Apr 2026 19:14:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47880253</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47880253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47880253</guid></item><item><title><![CDATA[New comment by puppystench in "GPT-5.5"]]></title><description><![CDATA[
<p>From the announcement page:<p>>For API developers, gpt-5.5 will soon be available in the Responses and Chat Completions APIs at $5 per 1M input tokens and $30 per 1M output tokens, with a 1M context window.</p>
]]></description><pubDate>Thu, 23 Apr 2026 19:09:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47880186</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47880186</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47880186</guid></item><item><title><![CDATA[New comment by puppystench in "GPT-5.5"]]></title><description><![CDATA[
<p>For API usage, GPT-5.5 is 2x the price of GPT-5.4, ~4x the price of GPT-5.1, and ~10x the price of Kimi-2.6.<p>Unfortunately I think the lesson they took from Anthropic is that devs get really reliant on, and even addicted to, coding agents, and they'll happily pay any amount for even small benefits.</p>
]]></description><pubDate>Thu, 23 Apr 2026 19:03:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47880106</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47880106</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47880106</guid></item><item><title><![CDATA[New comment by puppystench in "Claude Opus 4.7"]]></title><description><![CDATA[
<p>Does this mean Claude no longer outputs the full raw reasoning, only summaries? At one point, exposing the LLM's full CoT was considered a core safety tenet.</p>
]]></description><pubDate>Thu, 16 Apr 2026 16:27:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47795812</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47795812</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47795812</guid></item><item><title><![CDATA[New comment by puppystench in "Claude mixes up who said what"]]></title><description><![CDATA[
<p>I believe you're right: it's an issue of the model misinterpreting text that sounds like a user message as an actual user message. It's a known phenomenon: <a href="https://arxiv.org/abs/2603.12277" rel="nofollow">https://arxiv.org/abs/2603.12277</a></p>
]]></description><pubDate>Thu, 09 Apr 2026 16:41:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47705894</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47705894</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47705894</guid></item><item><title><![CDATA[New comment by puppystench in "Claude mixes up who said what"]]></title><description><![CDATA[
<p>>Several people questioned whether this is actually a harness bug like I assumed, as people have reported similar issues using other interfaces and models, including chatgpt.com. One pattern does seem to be that it happens in the so-called “Dumb Zone” once a conversation starts approaching the limits of the context window.<p>I also don't think this is a harness bug. There's research* showing that models infer the source of text from how it sounds, not from the actual role labels the harness would provide. The messages from Claude here sound like user messages ("Please deploy") rather than typical Claude output, which tricks its later self into thinking they came from the user.<p>*<a href="https://arxiv.org/abs/2603.12277" rel="nofollow">https://arxiv.org/abs/2603.12277</a><p>Presumably this is also why prompt injection works at all.</p>
]]></description><pubDate>Thu, 09 Apr 2026 15:49:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47705269</link><dc:creator>puppystench</dc:creator><comments>https://news.ycombinator.com/item?id=47705269</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47705269</guid></item></channel></rss>