<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: SUPERCILEX</title><link>https://news.ycombinator.com/user?id=SUPERCILEX</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 30 Apr 2026 10:13:27 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=SUPERCILEX" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by SUPERCILEX in "Lockless MPSC/SPMC/MPMC queues are not queues"]]></title><description><![CDATA[
<p>Fair point, though I'm not sure I agree. MPMC channels underpin pretty much every general task scheduler (take a peek inside tokio or rayon, for example). And SPSCs are quite useful for designing custom pipelines. Though I agree that MPMC channels are silly.</p>
]]></description><pubDate>Mon, 29 Sep 2025 13:48:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=45413782</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45413782</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45413782</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Lockless MPSC/SPMC/MPMC queues are not queues"]]></title><description><![CDATA[
<p>As noted by other commenters, the point I was trying to get across is that the way we implement lockless channels is suboptimal and could be made faster from a theoretical standpoint.<p>In my benchmarks[1], the average processing time for an element is 250ns with 4 producers and 4 consumers contending heavily. That's terrible! Even if your numbers are correct, 100ns is a bit faster than two round trips to RAM, while 33ns is about three round trips to L3 and still ~100x slower than spamming a core with add operations. That's slow.<p>[1]: <a href="https://github.com/SUPERCILEX/lockness/blob/master/bags/benches/atomic_bag.rs" rel="nofollow">https://github.com/SUPERCILEX/lockness/blob/master/bags/benc...</a>
$ cargo bench -- 8_threads/std_mpmc</p>
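For a rough sense of what per-element cost looks like under producer contention, here's a standalone sketch (not the linked benchmark). It uses std::sync::mpsc, which is single-consumer, so it only exercises the multi-producer side:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Instant;

// Rough per-element timing with several contending producers. std's
// stable channel is MPSC, so the consumer side is uncontended here;
// the linked benchmark is the real measurement.
fn run(producers: usize, per_producer: usize) -> (usize, u128) {
    let (tx, rx) = mpsc::channel::<u64>();
    let start = Instant::now();
    let handles: Vec<_> = (0..producers)
        .map(|p| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..per_producer {
                    tx.send((p * per_producer + i) as u64).unwrap();
                }
            })
        })
        .collect();
    drop(tx); // recv() returns Err once every producer clone hangs up
    let mut received: usize = 0;
    while rx.recv().is_ok() {
        received += 1;
    }
    for h in handles {
        h.join().unwrap();
    }
    let ns_per_elem = start.elapsed().as_nanos() / (producers * per_producer) as u128;
    (received, ns_per_elem)
}

fn main() {
    let (received, ns) = run(4, 100_000);
    println!("{received} elements, ~{ns} ns/element");
}
```

The numbers you get will depend heavily on core count and whether producers share a socket, so treat the output as an order-of-magnitude hint, not a benchmark.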
]]></description><pubDate>Mon, 29 Sep 2025 13:39:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=45413653</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45413653</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45413653</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Lockless MPSC/SPMC/MPMC queues are not queues"]]></title><description><![CDATA[
<p>Thanks for sharing; I had not! It sounds like "processor sharing" would be the expected mode of operation for lockless queues. But see my comment to the parent: this is not how they work.</p>
]]></description><pubDate>Mon, 29 Sep 2025 13:16:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45413370</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45413370</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45413370</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Lockless MPSC/SPMC/MPMC queues are not queues"]]></title><description><![CDATA[
<p>This is actually a great analogy because it exemplifies the misconceptions people have about lockless queues.<p>In the example with multiple counters, in real life each counter could shout out a number and have people approach their respective counters in parallel. But this is not how lockless queues work. Instead, the person at the head of the queue holds a baton, and when multiple numbers are called, everybody waiting in the queue goes up to the counter of the person holding the baton. Once that head-of-the-queue person has made it to the counter, they give the baton to the person behind them, who then drags everybody along to their counter. And so on.<p>The article was arguing for a lockless channel implementation akin to your interpretation of a queue with parallel access to the counters.</p>
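A minimal sketch of the baton effect (simplified ring-buffer model of my own, not any particular library's code): writers claim slots in parallel, but readers can only advance over a contiguous prefix of published slots, so one slow early writer blocks everything behind it:

```rust
// Writers claim slots in parallel (a fetch_add in real code), but the
// readable region only grows over a contiguous prefix of published
// slots: the slowest early writer holds the "baton".
fn visible_prefix(published: &[bool]) -> usize {
    published.iter().take_while(|&&p| p).count()
}

fn main() {
    let mut published = [false; 4];
    // Writers 1 and 3 finish first...
    published[1] = true;
    published[3] = true;
    // ...yet nothing is readable until writer 0 publishes.
    assert_eq!(visible_prefix(&published), 0);
    published[0] = true;
    // Slots 0 and 1 become readable together; slot 3 still waits on 2.
    assert_eq!(visible_prefix(&published), 2);
    println!("ok");
}
```

This is the queue-shaped serialization the article objects to: parallelism in claiming slots, a baton in publishing them.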
]]></description><pubDate>Mon, 29 Sep 2025 13:05:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45413207</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45413207</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45413207</guid></item><item><title><![CDATA[Lockless MPSC/SPMC/MPMC queues are not queues]]></title><description><![CDATA[
<p>Article URL: <a href="https://alexsaveau.dev/blog/opinions/performance/lockness/lockless-queues-are-not-queues">https://alexsaveau.dev/blog/opinions/performance/lockness/lockless-queues-are-not-queues</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45409258">https://news.ycombinator.com/item?id=45409258</a></p>
<p>Points: 39</p>
<p># Comments: 31</p>
]]></description><pubDate>Mon, 29 Sep 2025 00:21:01 +0000</pubDate><link>https://alexsaveau.dev/blog/opinions/performance/lockness/lockless-queues-are-not-queues</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45409258</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45409258</guid></item><item><title><![CDATA[The need for new instructions: atomic bit fill and drain]]></title><description><![CDATA[
<p>Article URL: <a href="https://alexsaveau.dev/blog/opinions/performance/lockness/atomic-bit-fill">https://alexsaveau.dev/blog/opinions/performance/lockness/atomic-bit-fill</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45404843">https://news.ycombinator.com/item?id=45404843</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 28 Sep 2025 15:02:42 +0000</pubDate><link>https://alexsaveau.dev/blog/opinions/performance/lockness/atomic-bit-fill</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=45404843</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45404843</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Ask HN: Is trusted client compute possible?"]]></title><description><![CDATA[
<p>So you have to use probabilistic methods; makes sense, thanks!</p>
]]></description><pubDate>Fri, 27 Sep 2024 13:31:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=41670231</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=41670231</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41670231</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Ask HN: Is trusted client compute possible?"]]></title><description><![CDATA[
<p>Thanks for the links!</p>
]]></description><pubDate>Fri, 27 Sep 2024 13:31:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=41670227</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=41670227</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41670227</guid></item><item><title><![CDATA[Ask HN: Is trusted client compute possible?]]></title><description><![CDATA[
<p>I'm wondering if I can have a client build some artifact and upload the artifact to a cache server that redistributes it. Of course the problem is that a malicious client could upload something evil, so I would need some way of proving that the client built the thing it was supposed to. Is it possible to trust client computation?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41614238">https://news.ycombinator.com/item?id=41614238</a></p>
<p>Points: 3</p>
<p># Comments: 4</p>
]]></description><pubDate>Sun, 22 Sep 2024 02:37:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=41614238</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=41614238</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41614238</guid></item><item><title><![CDATA[Show HN: A scalable clipboard manager for Linux]]></title><description><![CDATA[
<p>This is incredibly over-engineered, but it was quite a fun process. :)</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41206988">https://news.ycombinator.com/item?id=41206988</a></p>
<p>Points: 6</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 10 Aug 2024 02:57:30 +0000</pubDate><link>https://alexsaveau.dev/blog/projects/performance/clipboard/ringboard/ringboard</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=41206988</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41206988</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Metric Time"]]></title><description><![CDATA[
<p>I made something similar a while back where you can set your own start and end time for the "day": <a href="https://alexsaveau.dev/10hrday" rel="nofollow noreferrer">https://alexsaveau.dev/10hrday</a><p>The point was to be able to divide the day into nicely sized bites for getting stuff done. I chose 10 minutes in an hour for that reason: you can get a small task done in one "minute."</p>
]]></description><pubDate>Fri, 13 Oct 2023 03:36:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=37866298</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=37866298</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=37866298</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "Ask HN: Could you share your personal blog here?"]]></title><description><![CDATA[
<p><a href="https://alexsaveau.dev/blog" rel="nofollow noreferrer">https://alexsaveau.dev/blog</a><p>Mostly about performance and project internals.</p>
]]></description><pubDate>Sat, 08 Jul 2023 06:23:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=36641874</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=36641874</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36641874</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>Except they are, and your claims are trivial to disprove: simply run the benchmarks under perf. You'll find that most of the time is spent on the rwsem, which is described from here onwards: <a href="https://www.kernel.org/doc/html/latest/filesystems/path-lookup.html#inode-i-rwsem" rel="nofollow">https://www.kernel.org/doc/html/latest/filesystems/path-look...</a></p>
]]></description><pubDate>Thu, 30 Mar 2023 21:19:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=35378955</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35378955</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35378955</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>No. Run the benchmark on a tmpfs:<p>$ hyperfine --warmup 3 -N "./test /dev/shm 8 zip" "./test /dev/shm 8 chain"
Benchmark 1: ./test /dev/shm 8 zip
  Time (mean ± σ):     118.5 ms ±  11.6 ms    [User: 92.9 ms, System: 726.6 ms]
  Range (min … max):   103.6 ms … 143.4 ms    23 runs<p>Benchmark 2: ./test /dev/shm 8 chain
  Time (mean ± σ):     235.7 ms ±  11.0 ms    [User: 116.4 ms, System: 1537.7 ms]
  Range (min … max):   220.1 ms … 258.3 ms    13 runs<p>Summary
  './test /dev/shm 8 zip' ran
    1.99 ± 0.22 times faster than './test /dev/shm 8 chain'</p>
]]></description><pubDate>Tue, 28 Mar 2023 06:10:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=35336602</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35336602</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35336602</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>> is disk IO bottlenecked by NVMe/PCIe limits, or by disk iops limits?<p>Note that I'm out of my depth here, so this is all speculation.
Until we hit hardware limitations (which will be PCIe 6 if I had to guess), I'm pretty sure those are the same thing. One read/write iop = 4KiB. If your PCIe bandwidth is limited to N GB/s, then there are only so many iops you can physically send/receive to/from the SSD, regardless of how many iops the SSD could be capable of processing. So currently we're bottlenecked by PCIe, but I doubt that will continue to be the case.<p>> again, you're handwaving on what "interfere" means<p>It depends on how the file system is implemented, but my guess would be that a lock on the inode or block cache entry is acquired.<p>> do "inode mappings" represent resources with no shared resource constraints?<p>Those are the contents of the directory. The problem is not reading them, but changing them.</p>
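To put rough numbers on the bandwidth-limits-iops point (my figures: the 4KiB-per-iop assumption from above, plus a ballpark ~16GB/s for a PCIe 5.0 x4 link):

```rust
fn main() {
    // Assumption from the comment above: one read/write iop moves 4 KiB.
    let iop_bytes: u64 = 4 * 1024;
    // Ballpark usable bandwidth of a PCIe 5.0 x4 link (~16 GB/s).
    let link_bytes_per_sec: u64 = 16_000_000_000;
    let max_iops = link_bytes_per_sec / iop_bytes;
    println!("link ceiling: ~{:.1}M 4KiB iops/s", max_iops as f64 / 1e6);
}
```

So a ~16GB/s link caps out around ~3.9M 4KiB iops no matter what the SSD's controller could theoretically sustain; double the link speed and the ceiling doubles with it.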
]]></description><pubDate>Tue, 28 Mar 2023 06:07:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=35336584</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35336584</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35336584</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>A directory is a file like anything else that contains a map of names to inodes. If you're trying to add or remove mappings (create or delete files), then clearly some synchronization must occur or the contents of the file will contain garbage. In theory you could get away with a very small critical section that says "lock bytes N through M" of the file, but then how do you deal with disk block alignment (i.e. two n-m byte ranges falling on the same disk block, so they need to take turns anyway), and how do you deal with I/O errors (the first n-m bytes fail but the second n-m succeed, leaving a hole of garbage)?<p>Also, no need to theorize: run the benchmark I linked for yourself. It clearly shows a massive advantage to having each thread work with its own directory.</p>
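Not the linked benchmark, but a self-contained sketch of the same experiment shape (paths and sizes are mine; absolute timings will vary wildly by filesystem and hardware): create the same number of files from 8 threads, first all in one shared directory, then with each thread in its own directory:

```rust
use std::fs;
use std::path::PathBuf;
use std::thread;
use std::time::{Duration, Instant};

// Create `threads * per_thread` files, routing each thread round-robin
// over `dirs` — one shared directory vs. one directory per thread.
fn create_files(dirs: &[PathBuf], threads: usize, per_thread: usize) -> Duration {
    let start = Instant::now();
    thread::scope(|s| {
        for t in 0..threads {
            let dir = dirs[t % dirs.len()].clone();
            s.spawn(move || {
                for i in 0..per_thread {
                    fs::File::create(dir.join(format!("{t}-{i}"))).unwrap();
                }
            });
        }
    });
    start.elapsed()
}

fn main() {
    let root = std::env::temp_dir().join("dir-contention-demo");
    let shared = vec![root.join("shared")];
    let private: Vec<_> = (0..8).map(|t| root.join(format!("private-{t}"))).collect();
    for d in shared.iter().chain(&private) {
        fs::create_dir_all(d).unwrap();
    }
    let one = create_files(&shared, 8, 1_000);
    let many = create_files(&private, 8, 1_000);
    println!("shared dir: {one:?}, per-thread dirs: {many:?}");
    fs::remove_dir_all(&root).unwrap();
}
```

On a tmpfs or a hot page cache the shared-directory run should show the i_rwsem contention; on slow media the device may dominate instead.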
]]></description><pubDate>Tue, 28 Mar 2023 05:48:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=35336437</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35336437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35336437</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>Added a small clarification:
"The intuition here is that directories are a shared resource for their direct children and must therefore serialize concurrent directory-modifying operations, causing contention. In brief, file creation or deletion cannot occur at the same time within one directory."<p>Plus accompanying benchmark: <a href="https://alexsaveau.dev/blog/projects/performance/files/fuc/fast-unix-commands#contention" rel="nofollow">https://alexsaveau.dev/blog/projects/performance/files/fuc/f...</a><p>---<p>> file operations are (almost always) io bound<p>This is a common misconception. It was presumably true a decade ago, but PCIe is getting exponentially faster every 3 years: <a href="https://arstechnica.com/gadgets/2022/06/months-after-finalizing-pcie-6-0-pci-sig-looks-to-double-speeds-again-with-pcie-7-0/" rel="nofollow">https://arstechnica.com/gadgets/2022/06/months-after-finaliz...</a><p>The NVMe protocol has extremely deep queues [1] and leaving them empty means leaving performance on the table. I think you'll find it surprisingly difficult to saturate the PCIe bus with just one core: PCIe 7 will support 512GB/s. Assuming a single core can produce 64 bytes (an entire cache line!) per cycle running at 5GHz, you're still only at 320/512=62.5% saturation. This napkin math is a little BS, but my point is that individual cores are quickly going to be outpaced by bandwidth availability.<p>> and totally unclear how directories represent an "interference" boundary<p>To add a bit more color here, it depends on how your file system is implemented. I believe Windows stores every file's metadata in a global database, so segmenting operations by directory yields no benefits.
On the other hand, Unix FSs tend to store file_name to inode mappings per directory, so creating a new mapping in one directory doesn't interfere with another directory.<p>[1]: <a href="https://en.wikipedia.org/wiki/NVM_Express#Comparison_with_AHCI" rel="nofollow">https://en.wikipedia.org/wiki/NVM_Express#Comparison_with_AH...</a></p>
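Spelling out the napkin math above under the same assumptions (one 64-byte cache line per cycle at 5GHz, against the claimed 512GB/s PCIe 7 link):

```rust
fn main() {
    // One cache line (64 bytes) per cycle at 5 GHz.
    let core_bytes_per_sec = 64u64 * 5_000_000_000;
    // Claimed PCIe 7 bandwidth from the comment above.
    let pcie7_bytes_per_sec = 512_000_000_000u64;
    let saturation = core_bytes_per_sec as f64 / pcie7_bytes_per_sec as f64;
    println!("single-core saturation: {:.1}%", saturation * 100.0);
}
```

That's 320GB/s of produced data against a 512GB/s pipe, i.e. 62.5%, even granting the core an unrealistically perfect cache line per cycle.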
]]></description><pubDate>Tue, 28 Mar 2023 00:31:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=35334172</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35334172</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35334172</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>We're working on this! <a href="https://github.com/axboe/liburing/issues/830">https://github.com/axboe/liburing/issues/830</a></p>
]]></description><pubDate>Mon, 27 Mar 2023 21:10:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=35332689</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35332689</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35332689</guid></item><item><title><![CDATA[New comment by SUPERCILEX in "The fastest rm command and one of the fastest cp commands"]]></title><description><![CDATA[
<p>I added a clarification to the benchmarks section:<p>"The macOS/Windows implementations are currently equivalent to the *_rayon implementations shown in the benchmarks."<p>Rayon is pretty good, but clearly suboptimal as evidenced by the benchmarks.</p>
]]></description><pubDate>Mon, 27 Mar 2023 21:07:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=35332640</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35332640</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35332640</guid></item><item><title><![CDATA[The fastest rm command and one of the fastest cp commands]]></title><description><![CDATA[
<p>Article URL: <a href="https://alexsaveau.dev/blog/projects/performance/files/fuc/fast-unix-commands">https://alexsaveau.dev/blog/projects/performance/files/fuc/fast-unix-commands</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=35307538">https://news.ycombinator.com/item?id=35307538</a></p>
<p>Points: 86</p>
<p># Comments: 81</p>
]]></description><pubDate>Sat, 25 Mar 2023 21:22:22 +0000</pubDate><link>https://alexsaveau.dev/blog/projects/performance/files/fuc/fast-unix-commands</link><dc:creator>SUPERCILEX</dc:creator><comments>https://news.ycombinator.com/item?id=35307538</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35307538</guid></item></channel></rss>