<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: Calavar</title><link>https://news.ycombinator.com/user?id=Calavar</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 29 May 2026 19:06:40 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=Calavar" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by Calavar in "Green card seekers must leave U.S. to apply, Trump administration says"]]></title><description><![CDATA[
<p>Your sarcasm is misplaced because yes, this has unironically been true for large chunks of Latin American history.<p>- Argentinians in particular are over 60% of Italian descent.<p>- The richest man in Mexico was born to Lebanese immigrants.<p>- The chief military leader of the Chilean war of independence was born to an Irish immigrant.<p>- Peru had a president who was born to Japanese immigrants.<p>These countries have all, at various times, had an influx of overseas immigrants whose birthright citizen children rose to high stations in society.</p>
]]></description><pubDate>Sun, 24 May 2026 20:57:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48260936</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48260936</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48260936</guid></item><item><title><![CDATA[New comment by Calavar in "Green card seekers must leave U.S. to apply, Trump administration says"]]></title><description><![CDATA[
<p>If you're looking for international precedent, this is an old vs. new world issue. Birthright citizenship is rare in the old world, but it is the default for the Americas. Canada, most of Latin America, and a decent part of the Caribbean have birthright citizenship.</p>
]]></description><pubDate>Sun, 24 May 2026 13:31:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=48257156</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48257156</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48257156</guid></item><item><title><![CDATA[New comment by Calavar in "Elon Musk has lost his lawsuit against Sam Altman and OpenAI"]]></title><description><![CDATA[
<p>I did look up numbers before I made that claim:<p>From Yahoo Finance<p>GME Jan 1, 2016: $7.09, $5.49 adjusted (accounting for dividend disbursements)<p>GME Jan 1, 2026: $20.09<p>266% or 365% return depending on how you count dividends. 365% for GME vs. 306% for S&P 500 over the same period (also using adjusted for dividend numbers).</p>
]]></description><pubDate>Mon, 18 May 2026 19:19:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=48184201</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48184201</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48184201</guid></item><item><title><![CDATA[New comment by Calavar in "Elon Musk has lost his lawsuit against Sam Altman and OpenAI"]]></title><description><![CDATA[
<p>GME also beat the S&P 500 over the past 10 years. Is this evidence that Ryan Cohen is a business genius?<p>Tesla has been a meme stock for about five years now, maybe more. Its valuation correlates with Musk's abilities as a showman and media figure, not as a businessman.</p>
]]></description><pubDate>Mon, 18 May 2026 18:31:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48183584</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48183584</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48183584</guid></item><item><title><![CDATA[New comment by Calavar in "Fake building: Claude wrote 3k lines instead of import pywikibot"]]></title><description><![CDATA[
<p>I consider myself AI skeptical-ish and I detest when people defend LLMs with "it's user error, prompt better," but in this case it actually <i>is</i> user error.<p>If you want a particular implementation approach, you need to specify not only the features you want, but the implementation strategy at least at a high level. This could be as simple as adding "use pywikibot" or "use relevant packages from PyPI" to the end of your prompt. Or you could seed your project with some manually written scaffolding, including a pyproject.toml.<p>While LLMs do tend to have NIH syndrome by default, I think this is a good default. I'd much rather have tight control over when and how to include external dependencies as opposed to letting a prompt fire for 40 minutes, and coming back to find 2 GB of newly installed node packages with a dependency tree 300 levels deep.</p>
]]></description><pubDate>Tue, 12 May 2026 03:42:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=48103951</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48103951</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48103951</guid></item><item><title><![CDATA[New comment by Calavar in "OpenAI’s o1 correctly diagnosed 67% of ER patients vs. 50-55% by triage doctors"]]></title><description><![CDATA[
<p>> But it’s getting harder and harder to define a task that humans beat LLMs on. On pretty much any easily quantifiable test of knowledge or reasoning, the machines win.<p>Quite to the contrary, I think it's extremely trivial to find a task where humans beat LLMs.<p>For all the money that's been thrown at agentic coding, LLMs still produce substantially worse code than a senior dev. See my own prior comments on this for a concrete example [1].<p>These trivial failure cases show that there are dimensions to task proficiency - significant ones - that benchmarks fail to capture.<p>> Is medical diagnosis one of these high judgement tasks?<p>Situational. I would break diagnosis into three types:<p>1. The diagnosis comes from objective criteria - laboratory values, vital signs, visual findings, family history. I think LLMs are likely already superior to humans in this case.<p>2. The diagnosis comes from "chart lore" - reading notes from prior physicians and realizing that there is new context that now points to a different diagnosis. (That new context can be the benefit of hindsight into what they already tried and failed and/or new objective data). LLMs do pretty well at this when you point them at datasets where all the prior notes were written by humans, which means that those humans did a nontrivial part of the diagnostic work. What if the prior notes were written by LLMs as well? Will they propagate their own mistakes forward? Yet to be studied in depth.<p>3. The diagnosis comes from human interaction - knowing the difference between a patient who's high as a bat on crack and one who's delirious from infection; noticing that a patient hesitates slightly before they assure you that they've been taking all their meds as prescribed; etc.
I doubt that LLMs will ever beat humans at this, but if LLMs can be proven to be good at point 2, then point 3 alone will not save human physicians.<p>[1] <a href="https://news.ycombinator.com/threads?id=Calavar#47891432">https://news.ycombinator.com/threads?id=Calavar#47891432</a></p>
]]></description><pubDate>Sun, 03 May 2026 23:12:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48002610</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=48002610</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48002610</guid></item><item><title><![CDATA[New comment by Calavar in "Spinel: Ruby AOT Native Compiler"]]></title><description><![CDATA[
<p>I disagree, I use metaprogramming in application code quite regularly, although I tend to limit myself to a single construct (instance_eval) because I find that makes things more manageable.<p>In my opinion the main draw of Ruby is that it's kind of Lisp-y in the way you can quickly build a metalanguage tailored to your specific problem domain. For problems where I don't need metaprogramming, I'd rather use a language that is statically typed.</p>
]]></description><pubDate>Fri, 24 Apr 2026 21:27:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47896017</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47896017</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47896017</guid></item><item><title><![CDATA[New comment by Calavar in "Over-editing refers to a model modifying code beyond what is necessary"]]></title><description><![CDATA[
<p>I'm writing a compiler. When I have Claude write a new feature, I have it validate that feature against a test suite of ~200 tiny programs.<p>I have a shell script that automates this. If all tests pass, the shell script prints "200/200 passing" with very little token spend. If only 190/200 pass, the shell script reports the names of every test that failed, and now Claude does a process of<p>1) run the compiler binary -> 2) get assembly output and inspect for obvious errors -> 3) assemble -> 4) verify that the assembler did not report errors -> 5) run test binary, connect with gdb, and find the issue -> 6) edit the compiler source -> 7) recompile the compiler -> 8) back to 1<p>multiplied by 10 for the 10 failing tests. This eats up tokens very quickly. I realize that not every use case is going to look like this. But if I didn't have Claude verify against the test suite, then I'd be getting regressions left and right, and then what's the point?<p>The whole codebase (tests included) is less than 15k lines, so I don't think that's the issue. No MCPs. CLAUDE.md about 1.5k lines.</p>
]]></description><pubDate>Fri, 24 Apr 2026 16:38:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47892561</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47892561</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47892561</guid></item><item><title><![CDATA[New comment by Calavar in "Spinel: Ruby AOT Native Compiler"]]></title><description><![CDATA[
<p>I'm skeptical of that reasoning because the original C wasn't too clean or performant either. For example, take emit.c from an earlier commit [1].<p>It writes a separate call to emit_raw for each line, even though there are many successive calls to emit_raw before it runs into any branching or other dynamic logic. What if you change this<p><pre><code>    emit_raw(ctx, "#include <stdio.h>\n");
    emit_raw(ctx, "#include <stdlib.h>\n");
    emit_raw(ctx, "#include <string.h>\n");
    emit_raw(ctx, "#include <math.h>\n");
    // And on for dozens more lines
</code></pre>
to this<p><pre><code>    emit_raw(ctx,
        "#include <stdio.h>\n"
        "#include <stdlib.h>\n"
        "#include <string.h>\n"
        "#include <math.h>\n"
        // And on for dozens more lines
    );
</code></pre>
That would leave you with code that is just as readable, but only calls the emit function once, leading to a smaller and faster binary. Again, this is a trivial change to the code, but Claude struggles to get there.<p>[1] <a href="https://github.com/matz/spinel/blob/aba17d8266d72fae3555ec916688fe1edcfa9858/src/emit.c#L892" rel="nofollow">https://github.com/matz/spinel/blob/aba17d8266d72fae3555ec91...</a></p>
]]></description><pubDate>Fri, 24 Apr 2026 16:02:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47892049</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47892049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47892049</guid></item><item><title><![CDATA[New comment by Calavar in "US special forces soldier arrested after allegedly winning $400k on Maduro raid"]]></title><description><![CDATA[
<p>I know this is tangential to your overall point, but did they really murder everyone in the room? I was under the impression that a few Venezuelan generals kidnapped Maduro themselves, left him at a predetermined point for US forces to pick up, and had their soldiers fire some small arms into the air to make a token show of resistance. There's no way the US would have flown a slow-moving convoy of helicopters into a hostile city unless they knew a priori that Venezuelan air defense missile batteries would be ordered to stand down.</p>
]]></description><pubDate>Fri, 24 Apr 2026 15:51:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=47891914</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47891914</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47891914</guid></item><item><title><![CDATA[New comment by Calavar in "Spinel: Ruby AOT Native Compiler"]]></title><description><![CDATA[
<p>spinel_codegen.rb is an eldritch horror. I always get spaghetti code like this when using Claude, and I've been wondering if I'm doing something wrong. Now I see an application that looks genuinely interesting (not trivial slop) written by someone I consider to be a top notch programmer, and the code quality is still pretty garbage in some places.<p>For example infer_comparison_type() [1]. This is far from the worst offender - it's not that hard to read - but what's striking here is that there is a better implementation that's so simple and obvious and Claude still fails to get there. Why not replace this with<p><pre><code>    COMPARISON_TYPES = Set.new(["<", ">", "<=", ">=", "==", "!=", "!"])

    def infer_comparison_type(mname)
      if COMPARISON_TYPES.include?(mname)
        "bool"
      else
        ""
      end
      # Or even better, strip the else case
      # (Which would return nil for anything not in the set)
    end
</code></pre>
This would be shorter, faster, more readable, and more easily maintainable, but Claude always defaults to an if-return, if-return, if-return pattern. (Even if-else seems to be somewhat alien to Claude.) My own Claude codebases are full of that if-return crap, and now I know I'm not alone.<p>Other files have much better code quality though. For example, most of the lib directory, which seems to correspond to the ext directory in the mainline Ruby repo. The API is clearly inspired by MRI ruby, even though the implementation differs substantially. I would guess that Matz prompted Claude to mirror parts of the original API and this had a bit of a regularizing effect on the output.<p>[1] <a href="https://github.com/matz/spinel/blob/98d1179670e4d6486bbd15473a68ecdb1c4309cb/spinel_codegen.rb#L1600" rel="nofollow">https://github.com/matz/spinel/blob/98d1179670e4d6486bbd1547...</a></p>
]]></description><pubDate>Fri, 24 Apr 2026 15:16:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47891432</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47891432</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47891432</guid></item><item><title><![CDATA[New comment by Calavar in "Over-editing refers to a model modifying code beyond what is necessary"]]></title><description><![CDATA[
<p>What sorts of instructions?</p>
]]></description><pubDate>Thu, 23 Apr 2026 07:51:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47873199</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47873199</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47873199</guid></item><item><title><![CDATA[New comment by Calavar in "Over-editing refers to a model modifying code beyond what is necessary"]]></title><description><![CDATA[
<p>It's interesting how variable people's experiences seem to be.<p>Personally, I tend to get crap quality code out of Claude. Very branchy. Very un-DRY. Consistently fails to understand the conventions of my codebase (e.g. keeps hallucinating that my arena allocator zero initializes memory - it does not). And sometimes after a context compaction it goes haywire and starts creating new regressions everywhere. And while you can prompt to fix these things, it can take an entire afternoon of whack-a-mole prompting to fix the fallout of one bad initial run. I've also tried dumping lessons into a project specific skill file, which sometimes helps, but also sometimes hurts - the skill file can turn into a footgun if it gets out of sync with an evolving codebase.<p>In terms of limits, I usually find myself hitting the rate limit after two or three requests. On bad days, only one. This has made Claude borderline unusable over the past couple weeks, so I've started hand coding again and using Claude as a code search and debugging tool rather than a code generator.</p>
]]></description><pubDate>Thu, 23 Apr 2026 07:42:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47873146</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47873146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47873146</guid></item><item><title><![CDATA[New comment by Calavar in "The abandoned war: Why no one is stopping the genocide in Sudan"]]></title><description><![CDATA[
<p>It's really hard to cry victim about others misrepresenting Trump's motives for the Iran war as oil, oil, oil when the US did in fact launch a military attack on a country - within the last six months - where the subsequent negotiated agreement on oil rights was quite literally described by the White House press secretary as "the president’s control of Venezuela’s oil" [1] and just a few weeks later the president held a public, televised conference with Chevron and ExxonMobil executives in the White House where he pitched them on investing in the Venezuelan oil industry [2]<p>[1] <a href="https://www.wsj.com/business/energy-oil/trump-venezuela-oil-us-control-plan-265a39c1" rel="nofollow">https://www.wsj.com/business/energy-oil/trump-venezuela-oil-...</a><p>[2] <a href="https://youtu.be/sD4x6T-u4XY" rel="nofollow">https://youtu.be/sD4x6T-u4XY</a></p>
]]></description><pubDate>Tue, 21 Apr 2026 14:24:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47849289</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47849289</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47849289</guid></item><item><title><![CDATA[New comment by Calavar in "The creative software industry has declared war on Adobe"]]></title><description><![CDATA[
<p>> we're seeing an explosion of brand-new high-polish OSS apps this year<p>Do you mind sharing a few examples?</p>
]]></description><pubDate>Sun, 19 Apr 2026 15:29:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47825011</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47825011</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47825011</guid></item><item><title><![CDATA[New comment by Calavar in "Claude Code Found a Linux Vulnerability Hidden for 23 Years"]]></title><description><![CDATA[
<p>It's not an insightful statement right now, but it was at the peak of cloud hype ca. 2010, when "the cloud" was often used in a metaphorical sense. You'd hear things like "it's scalable because it's in the cloud" or "our clients want a cloud based solution." Replacing "the cloud" in those sorts of claims with "another person's computer" showed just how inane those claims were.</p>
]]></description><pubDate>Sat, 04 Apr 2026 19:54:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47642713</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47642713</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47642713</guid></item><item><title><![CDATA[New comment by Calavar in "MoD sources warn Palantir role at heart of government is threat to UK security"]]></title><description><![CDATA[
<p>I still don't understand why Thiel and Karp decided to name their surveillance tech company after a device that is best known for being used by an evil dark lord to deceive and corrupt. It's like the Mitchell and Webb skit "are we the baddies" except they're the ones who designed the uniforms with skulls on them.</p>
]]></description><pubDate>Mon, 16 Mar 2026 18:20:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47402735</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47402735</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47402735</guid></item><item><title><![CDATA[New comment by Calavar in "Entities enabling scientific fraud at scale (2025)"]]></title><description><![CDATA[
<p>This is a good point. It is not humanly possible to verify every claim you read from every source.<p>Ideally, you should independently verify claims that appear to be particularly consequential or particularly questionable on the surface. But at some point you have to rely on heuristics like chain of trust (it was peer reviewed, it was published in a reputable textbook), or you will never make forward progress on anything.</p>
]]></description><pubDate>Wed, 11 Mar 2026 18:23:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47339241</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47339241</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47339241</guid></item><item><title><![CDATA[New comment by Calavar in "Judge orders government to begin refunding more than $130B in tariffs"]]></title><description><![CDATA[
<p>> If they couldn't do anything that gave an appearance of a conflict<p>This time I won't say maybe - that's a straw man.<p>I never said Cantor shouldn't be able to do <i>anything</i> that even gives the <i>appearance</i> of a conflict. Or anything even close to that really.<p>As you said yourself further up the thread, investments of investment bank employees are highly regulated. And not only employees themselves, but also their immediate family members.<p>Yet that same level of legal regulation doesn't apply to immediate relatives of government officials. We've seen this frequently with spouses and children of congressmen, and now we're seeing it with the son of a cabinet member. Yes, this may technically be legal, but legal does not equate to just and desirable. This reads to me like a serious loophole in the law that needs to be closed.</p>
]]></description><pubDate>Thu, 05 Mar 2026 22:23:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47268135</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47268135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47268135</guid></item><item><title><![CDATA[New comment by Calavar in "Judge orders government to begin refunding more than $130B in tariffs"]]></title><description><![CDATA[
<p>> In this case, the idea that Cantor can't do something because the former head is now in a government job is crazy. No one "in the business" thinks Cantor is suddenly hobbled.<p>That's not the idea, and it almost seems like a straw man to be honest. The actual idea is that the current head of Cantor can't do something because he's a direct relative of a high ranking government official whose powers and job duties present a conflict of interest for this specific set of transactions.</p>
]]></description><pubDate>Thu, 05 Mar 2026 17:50:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47264834</link><dc:creator>Calavar</dc:creator><comments>https://news.ycombinator.com/item?id=47264834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47264834</guid></item></channel></rss>