<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: clausecker</title><link>https://news.ycombinator.com/user?id=clausecker</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 22:49:10 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=clausecker" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by clausecker in "A SIMD coding challenge: First non-space character after newline"]]></title><description><![CDATA[
<p>You can find the first set bit in an integer with a machine instruction; it's completely branch-free.  gcc has __builtin_ctz() for this.  To get all matches, you'll either need to iterate over all set bits (so one branch per set bit) or use a compression instruction (requiring AVX-512) to turn the bit set into a set of integers.<p>That said, as you seem to actually want to do something with the results, you'll take a branch per match anyway, so I don't see the problem.</p>
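<p>A minimal sketch of the "iterate over all set bits" variant, assuming gcc or clang and a 32-bit match mask. The helper name set_bit_positions and the output-array convention are made up for illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store the index of every set bit of mask in out
   (room for 32 entries) and return how many indices were stored. */
static size_t set_bit_positions(uint32_t mask, unsigned out[32])
{
    size_t n = 0;

    assert(out != NULL);
    while (mask != 0) {                 /* one branch per set bit */
        /* __builtin_ctz is undefined for 0; the loop test rules that out */
        out[n++] = (unsigned)__builtin_ctz(mask);
        mask &= mask - 1;               /* clear the lowest set bit */
    }
    return n;
}
```

<p>Finding each index is branch-free; the only branch left is the loop test, which runs once per match.</p>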
]]></description><pubDate>Tue, 30 Dec 2025 01:48:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=46428553</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46428553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46428553</guid></item><item><title><![CDATA[Hacking Washing Machines [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://media.ccc.de/v/39c3-hacking-washing-machines">https://media.ccc.de/v/39c3-hacking-washing-machines</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46428496">https://news.ycombinator.com/item?id=46428496</a></p>
<p>Points: 222</p>
<p># Comments: 49</p>
]]></description><pubDate>Tue, 30 Dec 2025 01:40:49 +0000</pubDate><link>https://media.ccc.de/v/39c3-hacking-washing-machines</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46428496</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46428496</guid></item><item><title><![CDATA[New comment by clausecker in "A SIMD coding challenge: First non-space character after newline"]]></title><description><![CDATA[
<p>So I've thought about it and I don't really feel like spending more time to convince you that this works.  If you have questions I am happy to answer them, but please write your own code.</p>
]]></description><pubDate>Sun, 28 Dec 2025 18:24:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=46413236</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46413236</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46413236</guid></item><item><title><![CDATA[New comment by clausecker in "A SIMD coding challenge: First non-space character after newline"]]></title><description><![CDATA[
<p>This does track the state.  If you want to track it across multiple vectors of input, you'll need to carry it over manually.</p>
]]></description><pubDate>Sun, 28 Dec 2025 18:23:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=46413228</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46413228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46413228</guid></item><item><title><![CDATA[New comment by clausecker in "A SIMD coding challenge: First non-space character after newline"]]></title><description><![CDATA[
<p>You can do it like this, assuming A is the mask of newlines and B is the mask of non-spaces.<p>1. Compute M1 = ~A & ~B, which is the mask of all spaces that are not newlines
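2. Compute M2 = M1 + (A << 1) + 1, which has a bit set at the first non-space or newline after each newline (the carry ripples through the leading spaces), plus junk bits for spaces that do not directly follow a newline.  The + 1 treats the start of the input as the start of a line.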
3. Compute M3 = M2 & ~M1, which removes the junk bits, leaving only the first match in each section<p>Here is what it looks like:<p><pre><code>    10010000 = A
    01100110 = B
    00001001 = M1 = ~A & ~B
    00101010 = M2 = M1 + (A << 1) + 1
    00100010 = M3 = M2 & ~M1
</code></pre>
Note that this code treats newlines as non-spaces, meaning if a line comprises only spaces, the terminating NL character is returned. You can have it treat newlines as spaces (meaning a line of all spaces is not a match) by computing M4 = M3 & ~A.</p>
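<p>A minimal sketch of the three steps in C, assuming 8-bit masks as in the example, with bit 0 standing for the first character. The function names are made up for illustration, and the + 1 assumes the chunk starts at the beginning of a line.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a is the newline mask, b the non-space mask;
   bit i of each mask describes character i of an 8-character chunk. */
static uint8_t first_nonspace_mask(uint8_t a, uint8_t b)
{
    uint8_t m1 = (uint8_t)(~a & ~b);                     /* spaces that are not newlines */
    uint8_t m2 = (uint8_t)(m1 + (uint8_t)(a << 1) + 1);  /* carries ripple through leading spaces */
    uint8_t m3 = (uint8_t)(m2 & ~m1);                    /* remove the junk bits */
    return m3;
}

/* Variant that ignores lines made up only of spaces (M4 = M3 & ~A). */
static uint8_t first_nonspace_mask_skip_blank(uint8_t a, uint8_t b)
{
    return (uint8_t)(first_nonspace_mask(a, b) & ~a);
}

/* Reproduces the worked example: A = 10010000, B = 01100110, M3 = 00100010. */
static void check_example(void)
{
    assert(first_nonspace_mask(0x90, 0x66) == 0x22);
}
```

<p>The casts back to uint8_t are needed because C promotes the operands to int; with wider masks the same expressions work on uint32_t or uint64_t directly.</p>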
]]></description><pubDate>Fri, 26 Dec 2025 00:24:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=46388013</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46388013</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46388013</guid></item><item><title><![CDATA[New comment by clausecker in "Package managers keep using Git as a database, it never works out"]]></title><description><![CDATA[
<p>We do this with FreeBSD ports, but users don't have to clone the ports tree unless they want to modify ports or compile them with custom options.</p>
]]></description><pubDate>Fri, 26 Dec 2025 00:01:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46387896</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46387896</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46387896</guid></item><item><title><![CDATA[New comment by clausecker in "Uninitialized garbage on ia64 can be deadly (2004)"]]></title><description><![CDATA[
<p>And the same reason NVRAM was dead on arrival.  No affordable dev systems meant that only enterprise software supported it.</p>
]]></description><pubDate>Mon, 08 Dec 2025 13:50:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46192202</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46192202</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46192202</guid></item><item><title><![CDATA[New comment by clausecker in "Addressing the adding situation"]]></title><description><![CDATA[
<p>That's more "load store architecture" than RISC.  And by that measure, S/360 could be considered a RISC.</p>
]]></description><pubDate>Wed, 03 Dec 2025 12:40:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=46133815</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46133815</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46133815</guid></item><item><title><![CDATA[New comment by clausecker in "Addressing the adding situation"]]></title><description><![CDATA[
<p>You may enjoy the RISC deprogrammer: <a href="https://blog.erratasec.com/2022/10/the-risc-deprogrammer.html" rel="nofollow">https://blog.erratasec.com/2022/10/the-risc-deprogrammer.htm...</a></p>
]]></description><pubDate>Wed, 03 Dec 2025 12:38:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=46133809</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46133809</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46133809</guid></item><item><title><![CDATA[New comment by clausecker in "Poll HN: What operating system do you primarily develop on?"]]></title><description><![CDATA[
<p>FreeBSD</p>
]]></description><pubDate>Fri, 28 Nov 2025 18:16:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46081194</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46081194</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46081194</guid></item><item><title><![CDATA[New comment by clausecker in "FEX-emu – Run x86 applications on ARM64 Linux devices"]]></title><description><![CDATA[
<p>ARM already has most stuff required for this on board.  Two proprietary extensions are used by Rosetta: one emulates the parity (rarely used) and half-carry (obsolete) flags, which can also be emulated conventionally.  The other implements TSO memory ordering, which can either be ignored or implemented with explicit barriers; some other chips apparently have a similar setting.<p>The other stuff is all present in ARMv8.5 I think.</p>
]]></description><pubDate>Fri, 21 Nov 2025 19:35:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=46008062</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=46008062</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46008062</guid></item><item><title><![CDATA[New comment by clausecker in "Fortran Outsmarted Our Billion-Dollar AI Chips"]]></title><description><![CDATA[
<p>It's very likely that there is some serious autovectorisation going on behind the scenes.</p>
]]></description><pubDate>Mon, 10 Nov 2025 22:16:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=45881684</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45881684</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45881684</guid></item><item><title><![CDATA[New comment by clausecker in "End of Japanese community"]]></title><description><![CDATA[
<p>Best one was when gedit had the option to syntax highlight for a language named “Los.”</p>
]]></description><pubDate>Thu, 06 Nov 2025 11:46:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45834171</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45834171</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45834171</guid></item><item><title><![CDATA[New comment by clausecker in "Why every Rust crate feels like a research paper on abstraction"]]></title><description><![CDATA[
<p>Had the same feeling browsing through the Haskell package collection.  Felt like an amalgamation of PhD theses, none of which were maintained after the author got his degree.  Every single one a work of art, but most engineered so badly that you would only use them begrudgingly.</p>
]]></description><pubDate>Sun, 19 Oct 2025 22:40:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=45638676</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45638676</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45638676</guid></item><item><title><![CDATA[New comment by clausecker in "Mysterious Intrigue Around an x86 "Corporate Entity Other Than Intel/AMD""]]></title><description><![CDATA[
<p>Such as?</p>
]]></description><pubDate>Fri, 17 Oct 2025 00:26:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=45612218</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45612218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45612218</guid></item><item><title><![CDATA[New comment by clausecker in "Comparing a RISC and a CISC with similar hardware organization (1991)"]]></title><description><![CDATA[
<p>I'm talking about how they are able to integrate stuff that normally wouldn't fit into 32 bits (such as 3-operand SIMD with masking), not about making the instruction set more compact.  ARM knows how to do this (Thumb being the most compact mainstream ISA is evidence of that), they just have decided to waste a bit more space to make decoding simpler, while also adding more quality-of-life features.</p>
]]></description><pubDate>Thu, 09 Oct 2025 13:27:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=45527403</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45527403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45527403</guid></item><item><title><![CDATA[New comment by clausecker in "Comparing a RISC and a CISC with similar hardware organization (1991)"]]></title><description><![CDATA[
<p>Also, the VAX instruction encoding is a class of horror above that of x86.</p>
]]></description><pubDate>Mon, 06 Oct 2025 11:20:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=45490159</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45490159</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45490159</guid></item><item><title><![CDATA[New comment by clausecker in "Comparing a RISC and a CISC with similar hardware organization (1991)"]]></title><description><![CDATA[
<p>ARM64 has a trick up its sleeve: many instructions that would be longer on other architectures are instead split into easily recognisable pairs on ARM64.  This allows simple implementations to pretend it's fixed length while more complex ones can pretend it's variable length.  SVE takes this one step further with MOVPRFX, which can be placed before almost all SVE instructions to supply masking and a third operand.</p>
]]></description><pubDate>Mon, 06 Oct 2025 11:17:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45490134</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45490134</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45490134</guid></item><item><title><![CDATA[New comment by clausecker in "Comparing a RISC and a CISC with similar hardware organization (1991)"]]></title><description><![CDATA[
<p>LL/SC is performant, it just doesn't scale to high core counts.<p>The VEX encoding is actually only rarely longer than the legacy one, and frequently it is shorter.</p>
]]></description><pubDate>Mon, 06 Oct 2025 11:11:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=45490094</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45490094</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45490094</guid></item><item><title><![CDATA[New comment by clausecker in "Processing Strings 109x Faster Than Nvidia on H100"]]></title><description><![CDATA[
<p>When I implemented SIMD-accelerated string functions for FreeBSD's libc, I briefly looked at Stringzilla, but the code didn't look particularly interesting or fast.  So no surprise here.</p>
]]></description><pubDate>Tue, 23 Sep 2025 13:51:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=45347040</link><dc:creator>clausecker</dc:creator><comments>https://news.ycombinator.com/item?id=45347040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45347040</guid></item></channel></rss>