<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Jweb_Guru</title><link>https://news.ycombinator.com/user?id=Jweb_Guru</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 05 May 2026 08:25:45 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Jweb_Guru" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Jweb_Guru in "Roblox shares plummet 18% as child safety measures weigh on bookings"]]></title><description><![CDATA[
<p>God forbid people want to work on video game stuff instead of for an advertising company.</p>
]]></description><pubDate>Sat, 02 May 2026 19:06:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47989396</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47989396</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47989396</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Anonymous request-token comparisons from Opus 4.6 and Opus 4.7"]]></title><description><![CDATA[
<p>For some reason people are perfectly able to understand this in the context of, say, cursive, calculator use, etc., but when it comes to their own skillset somehow it's going to be really different.</p>
]]></description><pubDate>Sat, 18 Apr 2026 20:20:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47819208</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47819208</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47819208</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Anonymous request-token comparisons from Opus 4.6 and Opus 4.7"]]></title><description><![CDATA[
<p>No, it hasn't.  I did not have a problem before AI with people sending in gigantic pull requests that made absolutely no sense, justified with generated responses that they clearly did not understand.  This is not a thing that used to happen.  That's not to say people wouldn't have done it had it been this easy, but there was a barrier to submitting a pull request that no longer exists.</p>
]]></description><pubDate>Sat, 18 Apr 2026 20:18:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47819189</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47819189</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47819189</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Measuring Claude 4.7's tokenizer costs"]]></title><description><![CDATA[
<p>I'm mostly surprised that people found the output quality of Opus 4.6 good enough... 4.7 so far is a pretty sizable improvement for the stuff I care about.  I don't really care how cheap 4.6 was per task when 90% of the tasks weren't actually being done correctly.  Or maybe it's that people like the LLM agreeing with them blindly while sneakily doing something else under the hood?  Did people enjoy Claude routinely disregarding their instructions?  Not really sure I understand; I truly found 4.6 immensely frustrating (from the get-go, not just the "pre-nerf" version, whatever that means).  4.7 is a buggy mess, it's slow, and it costs a lot per token.  It's also a huge breath of fresh air because it actually seems to make a good-faith effort at doing the thing you asked it to do, and doesn't waste your time with irrelevant nonsense just to look busy or because it thinks you want that nonsense (I mean, it still does all of these things to some extent, but so far it seems like it does them much less than 4.6 did).<p>Disclaimer: I'm always running on max and don't really have token limits, so I am in a position not to care about cost per token.  But I am not surprised by the improved benchmark results at all; 4.6 was really not nearly as strong a model as people seem to remember it being.</p>
]]></description><pubDate>Sat, 18 Apr 2026 03:43:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47812958</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47812958</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47812958</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Issue: Claude Code is unusable for complex engineering tasks with Feb updates"]]></title><description><![CDATA[
<p>Yup.  Every single time it's about to do the dumbest thing I've seen in my life.</p>
]]></description><pubDate>Tue, 07 Apr 2026 08:10:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47672112</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47672112</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47672112</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "LLMs work best when the user defines their acceptance criteria first"]]></title><description><![CDATA[
<p>You may have had <i>one</i>.  It clearly made a pretty negative impression on you because you are still complaining about them years later.  I find it pretty misanthropic when people ascribe this kind of antisocial behavior to <i>all</i> of their coworkers.</p>
]]></description><pubDate>Sat, 07 Mar 2026 17:30:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47289596</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47289596</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47289596</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "LLMs work best when the user defines their acceptance criteria first"]]></title><description><![CDATA[
<p>In the long run, good code makes everyone much happier than code that is bad because people are being "nice" and letting things slide in code review to avoid confrontation.</p>
]]></description><pubDate>Sat, 07 Mar 2026 17:15:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47289480</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47289480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47289480</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "LLMs work best when the user defines their acceptance criteria first"]]></title><description><![CDATA[
<p>It's not reality.  I'm really not a fan of the way that people excuse the really terrible code LLMs write by claiming that people write code just as bad.  Even if that were true, it is <i>not</i> true that when you ask those people to do otherwise they simply pretend to have done it and forget you asked later.</p>
]]></description><pubDate>Sat, 07 Mar 2026 06:51:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47285135</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47285135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47285135</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Claude's Cycles [pdf]"]]></title><description><![CDATA[
<p>I assure you that LLM thinking also has a speed limit.</p>
]]></description><pubDate>Tue, 03 Mar 2026 16:04:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47234459</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47234459</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47234459</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "A16z partner says that the theory that we’ll vibe code everything is wrong"]]></title><description><![CDATA[
<p>My point is that a lot of people think it'd be really easy to build the next Salesforce until they actually try to compete with Salesforce in the market.  Like it or not, if you want to build a Salesforce competitor (or try to get your company to build its own) you're going to be compared to <i>actual</i> Salesforce, not the version of Salesforce that existed when the market was new.</p>
]]></description><pubDate>Wed, 25 Feb 2026 15:48:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47153133</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47153133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47153133</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "A16z partner says that the theory that we’ll vibe code everything is wrong"]]></title><description><![CDATA[
<p>Salesforce literally has its own query optimizer, you are vastly underestimating the complexity of its software.</p>
]]></description><pubDate>Sun, 22 Feb 2026 16:05:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47112103</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47112103</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47112103</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "How I use Claude Code: Separation of planning and execution"]]></title><description><![CDATA[
<p>> But the aha moment for me was what’s maintainable by AI vs by me by hand are on different realms<p>I don't find that LLMs are any more likely than humans to remember to update all of the places where they wrote redundant functions.  Generally far less likely, actually.  So forgive me for treating this claim with a massive grain of salt.</p>
]]></description><pubDate>Sun, 22 Feb 2026 08:28:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47109351</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47109351</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47109351</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Asahi Linux Progress Report: Linux 6.19"]]></title><description><![CDATA[
<p>This comment expresses how it feels to work in a corporate environment better than anything I've ever seen on this site.</p>
]]></description><pubDate>Thu, 19 Feb 2026 12:34:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47073106</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47073106</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47073106</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "What every compiler writer should know about programmers (2015) [pdf]"]]></title><description><![CDATA[
<p>It's ironic that I have to tell you of all people this, but many users of C (or at least, backends of compilers targeted by C) do actually want the compiler to aggressively optimize around UB.</p>
]]></description><pubDate>Tue, 17 Feb 2026 06:56:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47044484</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47044484</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47044484</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "I want to wash my car. The car wash is 50 meters away. Should I walk or drive?"]]></title><description><![CDATA[
<p>Yeah, people are always like "these are just trick questions!" as though the <i>correct</i> mode of use for an LLM is quizzing it on things where the answer is already available.  Where LLMs have the greatest potential to steer you wrong is when you ask something where the answer is <i>not</i> obvious, the question might be ill-formed, or the user is incorrectly convinced that something should be possible (or easy) when it isn't.  Such cases have a lot more in common with these "nonsensical riddles" than they do with any possible frontier benchmark.<p>This is especially obvious when viewing the reasoning traces of models like Claude, which often spend a lot of time speculating about the user's "hints" and trying to parse out the intent behind the question.  Essentially, the mental model I use for LLMs these days is to treat them as very good "test takers" with limited open-book access to a large swathe of the internet.  They are trying to ace the test by any means necessary and love to take shortcuts that don't require actual "reasoning" (which burns tokens and grows the context window, decreasing accuracy overall).  For example, when asked to read a <i>full</i> paper, focusing on the implications for some particular problem, Claude agents will try to cheat by skimming until they get to a section that feels relevant, then searching directly for some words they read in that section.  They will do this even if told explicitly that they must read the whole paper.  I assume this is because, for the kinds of questions they are trained on, this behavior maximizes their reward function the vast majority of the time (though I'm sure I'm getting lots of details wrong about how frontier models are trained, I find it very unlikely that the prompts these agents get closely resemble data found in the wild on the internet pre-LLMs).</p>
]]></description><pubDate>Mon, 16 Feb 2026 18:56:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47038765</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=47038765</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47038765</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Claude Code's new hidden feature: Swarms"]]></title><description><![CDATA[
<p>It affects it very heavily IME.  People need to make sure they are getting a good mix of writing from other sources.</p>
]]></description><pubDate>Sat, 24 Jan 2026 18:16:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=46746018</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=46746018</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46746018</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Letting Claude play text adventures"]]></title><description><![CDATA[
<p>Yeah, I do not find performances like this very impressive.</p>
]]></description><pubDate>Thu, 22 Jan 2026 03:38:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46714991</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=46714991</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46714991</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Lightpanda migrate DOM implementation to Zig"]]></title><description><![CDATA[
<p>Believe it or not, using arenas does not provide free memory safety.  You need to statically bound allocations to make sure they don't escape the arena (which is exactly how arenas work in Rust, but not in Zig).  There are also quite a lot of ways to generate memory-unsafe code beyond use-after-free or out-of-bounds array accesses in a language like Zig, especially in the context of things like DOM nodes, where one frequently needs to swap pointers between elements of different types.</p>
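The "statically bound so they can't escape" point can be sketched in Rust with a toy arena (hypothetical code, not from Lightpanda or any real crate): the reference returned by <code>alloc</code> borrows the arena itself, so the compiler rejects any attempt to use it after the arena is gone.

```rust
use std::cell::RefCell;

/// Toy arena: every allocated value lives exactly as long as the arena.
struct Arena {
    // Box gives each value a stable heap address, so pushing more
    // values (which may reallocate the Vec) never moves old ones.
    slots: RefCell<Vec<Box<u32>>>,
}

impl Arena {
    fn new() -> Arena {
        Arena { slots: RefCell::new(Vec::new()) }
    }

    // The returned reference borrows `self`, so the borrow checker
    // statically prevents it from outliving (escaping) the arena.
    fn alloc(&self, v: u32) -> &u32 {
        let boxed = Box::new(v);
        let ptr: *const u32 = &*boxed;
        self.slots.borrow_mut().push(boxed);
        // SAFETY: the Box stays alive inside `slots` for the arena's
        // whole lifetime, and its heap allocation never moves.
        unsafe { &*ptr }
    }
}

fn demo() -> u32 {
    let arena = Arena::new();
    let a = arena.alloc(1);
    let b = arena.alloc(2);
    a + b
    // Returning `a` instead would not compile: its lifetime is
    // statically bound to `arena`, which is dropped right here.
}

fn main() {
    assert_eq!(demo(), 3);
}
```

A Zig-style arena gives the same guarantee only dynamically: nothing stops a pointer from being stored somewhere that outlives the arena's <code>deinit</code>.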
]]></description><pubDate>Thu, 15 Jan 2026 14:54:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=46633379</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=46633379</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46633379</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "Lightpanda migrate DOM implementation to Zig"]]></title><description><![CDATA[
<p>Respectfully, for browser-based work, simplicity is absolutely not a good enough reason to use a memory-unsafe language.  Your claim that Zig is in some way safer than Rust for something like this is flat out untrue.</p>
]]></description><pubDate>Mon, 12 Jan 2026 14:37:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46589108</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=46589108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46589108</guid></item><item><title><![CDATA[New comment by Jweb_Guru in "“Erdos problem #728 was solved more or less autonomously by AI”"]]></title><description><![CDATA[
<p>Yeah, people dramatically overestimate the difficulty of getting your definitions correct for most problems, especially when you are doing an end-to-end proof rather than just axiomatizing some system.  The definitions are still worth looking at carefully, especially for AI-generated proofs, where you don't get the immediate feedback a human does when something you expect to be hard goes through easily.  But contrary to what seems to be popular belief here, they are generally much easier to verify than the corresponding proof (in the case of formally verified software, the analogy is verifying that the spec is what you want vs. verifying that the program matches the spec; the former is generally <i>much</i> easier).</p>
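The spec-vs-proof asymmetry can be made concrete with a small hypothetical Lean 4 sketch (<code>SortSpec</code> and everything in it is illustrative, not from the Erdős formalization): the entire spec for a sorting function fits in a few readable lines, while a proof that any concrete implementation satisfies it would run far longer.

```lean
-- Hypothetical sketch: the *spec* of a sort is short and auditable.
-- `Pairwise (· ≤ ·)` says the output is ordered; the `count` clause
-- says the output is a permutation of the input.
def SortSpec (f : List Nat → List Nat) : Prop :=
  ∀ l : List Nat,
    (f l).Pairwise (· ≤ ·) ∧ ∀ x : Nat, (f l).count x = l.count x

-- Checking that `SortSpec` says what you meant takes a minute;
-- proving `SortSpec mySort` for a real implementation is the long part.
```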
]]></description><pubDate>Sat, 10 Jan 2026 17:03:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=46567480</link><dc:creator>Jweb_Guru</dc:creator><comments>https://news.ycombinator.com/item?id=46567480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46567480</guid></item></channel></rss>