<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: amirhirsch</title><link>https://news.ycombinator.com/user?id=amirhirsch</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 05 Apr 2026 13:07:15 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=amirhirsch" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by amirhirsch in "F-15E jet shot down over Iran"]]></title><description><![CDATA[
<p>GP was referring to the AN/TPY-2, which is the THAAD radar.<p>Iron Dome has nothing to do with that system.</p>
]]></description><pubDate>Sat, 04 Apr 2026 12:08:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47638331</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=47638331</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47638331</guid></item><item><title><![CDATA[New comment by amirhirsch in "Full network of clitoral nerves mapped out for first time"]]></title><description><![CDATA[
<p>Any sufficiently advanced incompetence is indistinguishable from malice. If you are the editor of Gray’s Anatomy, incompetence is malice.</p>
]]></description><pubDate>Mon, 30 Mar 2026 01:28:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47569417</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=47569417</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47569417</guid></item><item><title><![CDATA[New comment by amirhirsch in "Personal Computer by Perplexity"]]></title><description><![CDATA[
<p>Critically, Amdahl’s Law remains a conjecture until we have the maths to disjoin complexity class P from NC.</p>
]]></description><pubDate>Thu, 12 Mar 2026 12:51:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47349869</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=47349869</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47349869</guid></item><item><title><![CDATA[New comment by amirhirsch in "Intelligence is a commodity. Context is the real AI Moat"]]></title><description><![CDATA[
<p>Not sure about the conclusion regarding Nvidia value capture. I imagine the context for many applications will come from a physical simulation environment running on dramatically more GPUs than the AI part.</p>
]]></description><pubDate>Thu, 05 Mar 2026 16:16:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47263495</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=47263495</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47263495</guid></item><item><title><![CDATA[New comment by amirhirsch in "Amazon accused of widespread scheme to inflate prices across the economy"]]></title><description><![CDATA[
<p>Beat me to it. Now I have to delete my reply.</p>
]]></description><pubDate>Wed, 25 Feb 2026 14:14:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47151751</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=47151751</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47151751</guid></item><item><title><![CDATA[New comment by amirhirsch in "The time I didn't meet Jeffrey Epstein"]]></title><description><![CDATA[
<p>> S&S Deli in Cambridge<p>Good lunch spot for a nudnik</p>
]]></description><pubDate>Fri, 06 Feb 2026 14:13:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=46913049</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46913049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46913049</guid></item><item><title><![CDATA[New comment by amirhirsch in "Anthropic's original take home assignment open sourced"]]></title><description><![CDATA[
<p>Here are some other hints:
combine hash stages 2 and 3; it can be two muladds and a XOR.<p>For the first several
rounds (when every tree value is in use), combine the stage 5 XOR with the subsequent round’s tree XORs. You can determine even/odd in hash stage 5 starting with a ^ (a>>16) without XORing the constant, so you only need one XOR; this saves you a ton of XORs.<p>Create separate instruction bundles for the first round, rounds 1-5 (combining the hash stage 5 XOR with the next round’s tree XORs), rounds 6-9 (not every tree node is used anymore), round 10, rounds 11-14, and round 15, and combine them.<p>You can use add_imm in parallel to load consts.
In stage 0 you have to load the tree first and the vals; by later stages, when everything is in scratch, you could use 12 scalar XORs and 6 vector XORs on scratch.
Once you vload vals, you can start to do XORs but can only advance so much at a time, so I’m starting to work on getting hash stages moving to different rounds faster, to hide the initial vloads, get to the heavy load section sooner, and spread the load pain.</p>
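<p>A minimal sketch of why a stage can collapse into a single muladd, assuming the stage has the shape (a + C) + (a << s) on 32-bit words. The function names are made up for illustration, and the stage shape is an assumption rather than something confirmed in this thread.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit wraparound arithmetic


def stage_add_shift(a: int, c: int, s: int) -> int:
    """Hypothetical stage shape: (a + c) + (a << s), truncated to 32 bits."""
    return (((a + c) & MASK32) + ((a << s) & MASK32)) & MASK32


def stage_muladd(a: int, c: int, s: int) -> int:
    """Same stage as one multiply-add: a * (1 + 2**s) + c, mod 2**32."""
    return (a * (1 + (1 << s)) + c) & MASK32
```

<p>Both functions agree for every 32-bit input, because a + (a << s) is just a * (1 + 2**s) modulo 2**32.</p>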
]]></description><pubDate>Fri, 23 Jan 2026 17:24:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=46735107</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46735107</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46735107</guid></item><item><title><![CDATA[New comment by amirhirsch in "Anthropic's original take home assignment open sourced"]]></title><description><![CDATA[
<p>I think I can hit #1 (current #1 is 1000). Sub-900 is not possible, though.<p>Let me put down my thought process: you have to start by designing a 6-slot x 8-len vector pipeline doing 48 hashes in parallel, which needs at least 10 steps (if you convert three stages to multiply-adds and do parallel XORs for the other three). The problem with 10-cycle hashing is that you need to cram 96 scalar XORs alongside your vector pipeline, so that will use all 12 ALUs for 8 of those cycles, leaving you only 24 more scalar ops per hash cycle, which isn’t enough for the 48 tree value XORs.<p>So you must use at least 11 steps per hash, with 96 XORs (including the tree value XOR) done in the scalar ALUs using 8 steps, giving 3*12 ALU ops per hash cycle. You need 12 more ops per hash to do odd/even, so you must use 12 steps, and just do all of the hash ops in valu: 4 cycles of 12 ALUs doing modulo, 8 cycles x 12 ALUs free.<p>With 12 steps and 48 in parallel, your absolute minimum could be 4096/48 x 12 = 1,024 cycles. Stage 10 can be optimized (you don’t need the odd/even modulo cycle, and can use some of those extra scalar cycles to pre-XOR the constant), which can save you ~10 cycles. 1024 is gonna be real hard, but I can imagine shenanigans to get it down to 1014, with sub-1000 possible by throwing more XORs to the scalar ALUs.</p>
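<p>The arithmetic above as a rough sketch. The helper names are mine, and the model is only the back-of-envelope one from this comment: 48 hashes in flight, 12 scalar ALU slots per cycle, 8 of the cycles fully loaded with XORs.

```python
from fractions import Fraction

TOTAL_HASHES = 4096    # 256 items x 16 rounds, per the counts in this thread
HASHES_IN_FLIGHT = 48  # 6 vector slots x 8 lanes
ALUS = 12              # scalar ALU slots per cycle


def min_cycles(steps_per_hash: int) -> Fraction:
    """Throughput bound: batches of 48 hashes, each taking steps_per_hash cycles."""
    return Fraction(TOTAL_HASHES, HASHES_IN_FLIGHT) * steps_per_hash


def spare_scalar_ops(steps_per_hash: int, xor_cycles: int = 8) -> int:
    """Scalar ALU ops left per batch after xor_cycles fully loaded cycles."""
    return (steps_per_hash - xor_cycles) * ALUS


for steps in (10, 11, 12):
    print(steps, float(min_cycles(steps)), spare_scalar_ops(steps))
```

<p>At 10 steps only 24 scalar ops are spare, at 11 steps 36, and at 12 steps 48, which is where the 1,024-cycle figure comes from.</p>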
]]></description><pubDate>Thu, 22 Jan 2026 14:48:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=46719908</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46719908</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46719908</guid></item><item><title><![CDATA[New comment by amirhirsch in "Anthropic's original take home assignment open sourced"]]></title><description><![CDATA[
<p><pre><code>======================================================================
BROADCAST LOAD SCHEDULE
======================================================================
Round | Unique | Load Strategy
------|--------|------------------------------------------
   0  |    1   | 1 broadcast → all 256 items
   1  |    2   | 2 broadcasts → groups
   2  |    4   | 4 broadcasts → groups
   3  |    8   | 8 broadcasts → groups
   4  |   16   | 16 broadcasts → groups
   5  |   32   | 32 broadcasts → groups
   6  |   63   | 63 loads (sparse, use indirection)
   7  |  108   | 108 loads (sparse, use indirection)
   8  |  159   | 159 loads (sparse, use indirection)
   9  |  191   | 191 loads (sparse, use indirection)
  10  |  224   | 224 loads (sparse, use indirection)
  11  |    1   | 1 broadcast → all 256 items
  12  |    2   | 2 broadcasts → groups
  13  |    4   | 4 broadcasts → groups
  14  |    8   | 8 broadcasts → groups
  15  |   16   | 16 broadcasts → groups

Total loads with grouping: 839
Total loads naive:         4096
Load reduction:            4.9x
</code></pre>
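<p>A quick check of the totals, as a sketch: the per-round counts are copied from the schedule, and the naive figure assumes one load per item per round. The variable names are mine.

```python
# Unique tree nodes touched per round, copied from the schedule.
UNIQUE_PER_ROUND = [1, 2, 4, 8, 16, 32, 63, 108, 159, 191, 224, 1, 2, 4, 8, 16]
ITEMS = 256  # batch size shown in the round 0 row

grouped_loads = sum(UNIQUE_PER_ROUND)
naive_loads = ITEMS * len(UNIQUE_PER_ROUND)  # assumption: one load per item per round
reduction = naive_loads / grouped_loads

print(f"grouped={grouped_loads} naive={naive_loads} reduction={reduction:.1f}x")
```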
]]></description><pubDate>Wed, 21 Jan 2026 21:25:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=46711772</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46711772</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46711772</guid></item><item><title><![CDATA[New comment by amirhirsch in "Anthropic's original take home assignment open sourced"]]></title><description><![CDATA[
<p>Take advantage of index collisions, optimize rounds 0 and 11, and use speculative pre-loading and the early branch predictor (which I am now doing by looking at bits output at stage 3).</p>
]]></description><pubDate>Wed, 21 Jan 2026 18:31:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46709546</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46709546</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46709546</guid></item><item><title><![CDATA[New comment by amirhirsch in "Anthropic's original take home assignment open sourced"]]></title><description><![CDATA[
<p>I'm at 1137 after one hour with Opus now...
Pipelined vectorized hash, speculation, static code for each stage, epilogues and prologues for each stage-to-stage...<p>I think I'm going to get sub-900, since I just realized I can compute in parallel whether stage 5 of the hash is odd just by looking at bits 16 and 0 of stage 4, with less delay...</p>
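<p>A minimal sketch of the parity trick, assuming the last hash stage has the shape (a ^ C) ^ (a >> 16) on 32-bit words. The function names are hypothetical and the constant is left as a parameter, since only its low bit matters.

```python
def last_stage(a: int, c: int) -> int:
    """Assumed shape of the final hash stage: (a ^ c) ^ (a >> 16)."""
    return (a ^ c) ^ (a >> 16)


def is_odd_early(a: int, c: int) -> int:
    """Low bit of last_stage(a, c), read only from bits 0 and 16 of the input."""
    bit0 = a & 1
    bit16 = (a >> 16) & 1
    return bit0 ^ bit16 ^ (c & 1)
```

<p>Since XOR works bit by bit, the odd/even answer is known as soon as the previous stage's output exists, without waiting for the full-width XORs.</p>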
]]></description><pubDate>Wed, 21 Jan 2026 18:05:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=46709173</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46709173</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46709173</guid></item><item><title><![CDATA[New comment by amirhirsch in "Show HN: ChartGPU – WebGPU-powered charting library (1M points at 60fps)"]]></title><description><![CDATA[
<p>Very nice. There is an issue with panning on the million-point demo: it currently does not redraw until the dragging velocity is below some threshold, but it should seem like the points are just panned into frame. It is probably enough to just get rid of the dragging velocity threshold, but it sometimes helps to cache an entire frame around the visible range.</p>
]]></description><pubDate>Wed, 21 Jan 2026 16:32:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=46707943</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46707943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46707943</guid></item><item><title><![CDATA[New comment by amirhirsch in "San Francisco to offer free childcare to people making up to $230k"]]></title><description><![CDATA[
<p>I gave your reply the most generous interpretation and read it in the ironic way as you point out in the edit</p>
]]></description><pubDate>Fri, 16 Jan 2026 14:01:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=46646412</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46646412</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46646412</guid></item><item><title><![CDATA[New comment by amirhirsch in "San Francisco to offer free childcare to people making up to $230k"]]></title><description><![CDATA[
<p>this won't cost the city too much, there's only like a hundred kids under 6 in this city and 3% of them are mine.</p>
]]></description><pubDate>Fri, 16 Jan 2026 06:54:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=46643755</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46643755</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46643755</guid></item><item><title><![CDATA[New comment by amirhirsch in "FPGAs Need a New Future"]]></title><description><![CDATA[
<p>I did write this 20 years ago <a href="https://fpgacomputing.blogspot.com/2006/05/methods-for-reconfigurable-computing.html" rel="nofollow">https://fpgacomputing.blogspot.com/2006/05/methods-for-recon...</a><p>The vendor tools are still a barrier to the high-end FPGA's hardened IP</p>
]]></description><pubDate>Tue, 23 Dec 2025 05:37:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=46362733</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46362733</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46362733</guid></item><item><title><![CDATA[New comment by amirhirsch in "Ask HN: Why Did Python Win?"]]></title><description><![CDATA[
<p>Python won because of enforced whitespace. It solved a social problem that other languages punted to linters, baking readability into the spec.</p>
]]></description><pubDate>Mon, 22 Dec 2025 16:06:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46355243</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46355243</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46355243</guid></item><item><title><![CDATA[New comment by amirhirsch in "Ask HN: How can I get better at using AI for programming?"]]></title><description><![CDATA[
<p>Probably also end of ZIRP and some “AI washing” to give the illusion of progress</p>
]]></description><pubDate>Sun, 14 Dec 2025 04:28:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46260757</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46260757</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46260757</guid></item><item><title><![CDATA[New comment by amirhirsch in "Ask HN: How can I get better at using AI for programming?"]]></title><description><![CDATA[
<p>The effect of these tools is people losing their software jobs (down 35% since 2020). Unemployed devs aren’t clamoring to go use AI on OSS.</p>
]]></description><pubDate>Sat, 13 Dec 2025 23:53:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=46259409</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46259409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46259409</guid></item><item><title><![CDATA[New comment by amirhirsch in "Show HN: I made a spreadsheet where formulas also update backwards"]]></title><description><![CDATA[
<p>Cool!<p>Constraint propagation from SICP is a great reference here:<p><a href="https://sicp.sourceacademy.org/chapters/3.3.5.html" rel="nofollow">https://sicp.sourceacademy.org/chapters/3.3.5.html</a></p>
]]></description><pubDate>Sat, 13 Dec 2025 03:09:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=46251634</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46251634</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46251634</guid></item><item><title><![CDATA[New comment by amirhirsch in "I got an Nvidia GH200 server for €7.5k on Reddit and converted it to a desktop"]]></title><description><![CDATA[
<p># Tell the driver to completely ignore the NVLINK and it should allow the GPUs to initialise independently over PCIe !!!!   This took a week of work to find, thanks Reddit!<p>I needed this info, thanks for putting it up. Can this really be an issue for every data center?</p>
]]></description><pubDate>Wed, 10 Dec 2025 22:11:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46224633</link><dc:creator>amirhirsch</dc:creator><comments>https://news.ycombinator.com/item?id=46224633</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46224633</guid></item></channel></rss>