<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ibraheemdev</title><link>https://news.ycombinator.com/user?id=ibraheemdev</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 08:17:38 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ibraheemdev" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ibraheemdev in "What async promised and what it delivered"]]></title><description><![CDATA[
<p>Not the runtime per se, but cooperative scheduling has the advantage that tasks do not yield at adverse code points, e.g., right before releasing a lock or issuing an I/O request. Of course, the lack of preemption has its own downsides, but with thread-per-request you tend to run into tail latency issues long before context switching overhead becomes a problem.</p>
]]></description><pubDate>Sun, 26 Apr 2026 05:39:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47907661</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=47907661</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47907661</guid></item><item><title><![CDATA[New comment by ibraheemdev in "What async promised and what it delivered"]]></title><description><![CDATA[
<p>> OS threads are expensive: an operating system thread typically reserves a megabyte of stack space<p>Why is reserving a megabyte of stack space "expensive"?<p>> and takes roughly a millisecond to create<p>I'm not sure where this number is from; it seems off by a few orders of magnitude. On Linux, thread creation is closer to 10 microseconds.</p>
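A rough, machine-dependent way to sanity-check the claim yourself (a sketch, not a benchmark — numbers will vary by OS, hardware, and load):

```rust
use std::time::{Duration, Instant};

// Time only thread creation; joining is excluded from the measurement.
// On a typical Linux box this reports on the order of tens of
// microseconds per thread, nowhere near a millisecond.
fn spawn_threads(n: u32) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..n).map(|_| std::thread::spawn(|| ())).collect();
    let elapsed = start.elapsed(); // creation cost only
    for h in handles {
        h.join().unwrap();
    }
    elapsed
}

fn main() {
    let n = 100;
    let elapsed = spawn_threads(n);
    println!("spawned {n} threads in {elapsed:?} (~{:?}/thread)", elapsed / n);
}
```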
]]></description><pubDate>Sat, 25 Apr 2026 20:24:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47904287</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=47904287</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47904287</guid></item><item><title><![CDATA[Astral to Join OpenAI]]></title><description><![CDATA[
<p><a href="https://openai.com/index/openai-to-acquire-astral/" rel="nofollow">https://openai.com/index/openai-to-acquire-astral/</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47438723">https://news.ycombinator.com/item?id=47438723</a></p>
<p>Points: 1489</p>
<p># Comments: 901</p>
]]></description><pubDate>Thu, 19 Mar 2026 13:05:50 +0000</pubDate><link>https://astral.sh/blog/openai</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=47438723</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47438723</guid></item><item><title><![CDATA[Cuckoo hashing improves SIMD hash tables (and other hash table tradeoffs)]]></title><description><![CDATA[
<p>Article URL: <a href="https://reiner.org/cuckoo-hashing">https://reiner.org/cuckoo-hashing</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45475623">https://news.ycombinator.com/item?id=45475623</a></p>
<p>Points: 79</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 04 Oct 2025 18:45:55 +0000</pubDate><link>https://reiner.org/cuckoo-hashing</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45475623</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45475623</guid></item><item><title><![CDATA[New comment by ibraheemdev in "From Rust to reality: The hidden journey of fetch_max"]]></title><description><![CDATA[
<p>It does make a difference, of course, if you're running fetch_max from multiple threads: adding a load fast-path introduces a race condition.</p>
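A hypothetical sketch of the point (not code from the article), with the racy fast-path shown in a comment:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Correct: the compare-and-update happens in a single atomic RMW.
fn concurrent_max(values: Vec<u64>) -> u64 {
    let max = Arc::new(AtomicU64::new(0));
    let handles: Vec<_> = values
        .into_iter()
        .map(|v| {
            let max = Arc::clone(&max);
            thread::spawn(move || {
                // A "fast path" like:
                //   if max.load(Relaxed) >= v { return; }
                //   max.store(v, Relaxed); // BUG: racy check-then-act —
                // another thread can store a larger value between the
                // load and the store, and this store overwrites it.
                max.fetch_max(v, Ordering::Relaxed);
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    max.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(concurrent_max(vec![3, 9, 1, 7]), 9);
}
```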
]]></description><pubDate>Wed, 24 Sep 2025 07:49:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=45357480</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45357480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45357480</guid></item><item><title><![CDATA[New comment by ibraheemdev in "PEP 751: Pylock.toml"]]></title><description><![CDATA[
<p>pip, PDM, and uv already support PEP 751 [0] and were involved in the design process.<p>[0]: <a href="https://discuss.python.org/t/community-adoption-of-pylock-toml-pep-751/89778" rel="nofollow">https://discuss.python.org/t/community-adoption-of-pylock-to...</a></p>
]]></description><pubDate>Fri, 12 Sep 2025 01:55:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=45217912</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45217912</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45217912</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Shared_ptr<T>: the (not always) atomic reference counted smart pointer (2019)"]]></title><description><![CDATA[
<p>I did not claim that x86 provides sequential consistency in general, I made that claim only for RMW operations. Sequentially consistent stores are typically lowered to an XCHG instruction on x86 without an explicit barrier.<p>From the Intel SDM:<p>> Synchronization mechanisms in multiple-processor systems may depend upon a strong memory-ordering model. Here, a program can use a locking instruction such as the XCHG instruction or the LOCK prefix to ensure that a read-modify-write operation on memory is carried out atomically. Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory (see Section 8.1.2, “Bus Locking”).</p>
]]></description><pubDate>Mon, 01 Sep 2025 01:31:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=45088632</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45088632</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45088632</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Shared_ptr<T>: the (not always) atomic reference counted smart pointer (2019)"]]></title><description><![CDATA[
<p>Yes, what I meant was that the compiler generates the same instruction regardless of whether the RMW operation is performed with relaxed or sequentially consistent ordering, because that instruction is strong enough in terms of hardware semantics to enforce C++'s definition of sequential consistency.<p>There is a pretty clear mapping from C++ atomic operations to hardware instructions, and while the C++ memory model is not defined in terms of instruction reordering, that mapping is still useful for talking about performance. Sequential consistency is also a pretty broadly accepted concept outside of the C++ memory model; I think you're being a little too nitpicky about terminology.</p>
]]></description><pubDate>Sun, 31 Aug 2025 23:50:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45088148</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45088148</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45088148</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Shared_ptr<T>: the (not always) atomic reference counted smart pointer (2019)"]]></title><description><![CDATA[
<p>I'm referring to the performance implications of the hardware instruction, not the programming language semantics. Incrementing or decrementing the reference count is going to require an RMW instruction, which is expensive on x86 regardless of the ordering.</p>
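To illustrate in Rust (a sketch of the general point, not shared_ptr's actual implementation): the two increments below differ only in the ordering argument, yet on x86-64 both compile to the same lock-prefixed RMW (`lock xadd`), which is where the cost lives.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Both functions lower to `lock xadd` on x86-64; the ordering parameter
// constrains compiler reordering, not the instruction chosen. The RMW
// itself is the expensive part regardless of the ordering.
fn bump_relaxed(c: &AtomicUsize) -> usize {
    c.fetch_add(1, Ordering::Relaxed)
}

fn bump_seqcst(c: &AtomicUsize) -> usize {
    c.fetch_add(1, Ordering::SeqCst)
}

fn main() {
    let c = AtomicUsize::new(0);
    bump_relaxed(&c); // e.g. a refcount increment on clone
    bump_seqcst(&c);
    assert_eq!(c.load(Ordering::Relaxed), 2);
}
```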
]]></description><pubDate>Sun, 31 Aug 2025 22:11:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45087574</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45087574</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45087574</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Shared_ptr<T>: the (not always) atomic reference counted smart pointer (2019)"]]></title><description><![CDATA[
<p>> There is no way the shared_ptr<T> is using the expensive sequentially consistent atomic operations.<p>All RMW operations have sequentially consistent semantics on x86.<p>It's not exactly a store buffer flush, but any subsequent loads in the pipeline will stall until the store has completed.</p>
]]></description><pubDate>Sun, 31 Aug 2025 17:20:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=45084954</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=45084954</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45084954</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Without the futex, it's futile"]]></title><description><![CDATA[
<p>> ParkingLot just uses pthread mutex and cond.<p>That's interesting; I'm more familiar with the Rust parking_lot implementation, which uses futex on Linux [0].<p>> Sure that uses futex under the hood, but the point is, you use futexes on Linux because that’s just what Linux gives you<p>It's a little more than that, though: using a pthread_mutex or even thread.park() on the slow path is less efficient than using a futex directly. A futex lets you manage the atomic condition yourself, while generic parking utilities encode that state internally. A mutex implementation generally already has a built-in atomic condition with simpler state transitions for each thread in the queue, and so can avoid the additional overhead by making the futex call directly.<p>[0]: <a href="https://github.com/Amanieu/parking_lot/blob/739d370a809878e45021f6de21b32a0dba4520de/core/src/thread_parker/linux.rs#L64" rel="nofollow">https://github.com/Amanieu/parking_lot/blob/739d370a809878e4...</a></p>
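A minimal, hypothetical sketch of "the lock's state is one atomic word" (stdlib-only; the comments mark where a real futex-based lock would sleep and wake instead of yielding):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// The entire lock state lives in one atomic word. A futex-based slow
// path sleeps on that word directly (futex_wait(&locked, LOCKED)); a
// generic thread parker keeps a second, per-thread flag internally,
// duplicating the state — that's the extra overhead described above.
struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        while self
            .locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            thread::yield_now(); // stand-in for futex_wait / park
        }
        let r = f(unsafe { &mut *self.value.get() });
        self.locked.store(false, Ordering::Release); // futex_wake goes here
        r
    }
}

fn main() {
    let lock = Arc::new(SpinLock {
        locked: AtomicBool::new(false),
        value: UnsafeCell::new(0u32),
    });
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let lock = Arc::clone(&lock);
            thread::spawn(move || {
                for _ in 0..1000 {
                    lock.with(|v| *v += 1);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(lock.with(|v| *v), 4000);
}
```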
]]></description><pubDate>Wed, 20 Aug 2025 22:52:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=44967347</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=44967347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44967347</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Without the futex, it's futile"]]></title><description><![CDATA[
<p>> And futexes aren’t the only way to get there. Alternatives:<p>> - thin locks (what JVMs use)<p>> - ParkingLot (a futex-like primitive that works entirely in userland and doesn’t require that the OS have futexes)<p>Worth noting that somewhere under the hood, any modern lock is going to be using a futex (if supported). futex is the most efficient way to park on Linux, so you want to be using it even on the slow path. Your language's thread.park() primitive is almost certainly using a futex.</p>
]]></description><pubDate>Wed, 20 Aug 2025 22:02:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=44966992</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=44966992</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44966992</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Zig; what I think after months of using it"]]></title><description><![CDATA[
<p>> The message has some weird mentions in (alloc565), but the actual useful information is there: a pointer is dangling.<p>The allocation ID is actually very useful for debugging. You can use the flags `-Zmiri-track-alloc-id=alloc565 -Zmiri-track-alloc-accesses` to track the allocation, deallocation, and any reads/writes to/from this location.</p>
]]></description><pubDate>Wed, 05 Feb 2025 08:56:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=42945862</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=42945862</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42945862</guid></item><item><title><![CDATA[David Heinemeier Hansson joins Shopify's board]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.shopify.com/news/david-heinemeier-hansson-board">https://www.shopify.com/news/david-heinemeier-hansson-board</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42183868">https://news.ycombinator.com/item?id=42183868</a></p>
<p>Points: 12</p>
<p># Comments: 2</p>
]]></description><pubDate>Tue, 19 Nov 2024 14:35:53 +0000</pubDate><link>https://www.shopify.com/news/david-heinemeier-hansson-board</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=42183868</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42183868</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Show HN: Whirlwind – Async concurrent hashmap for Rust"]]></title><description><![CDATA[
<p>> Every single Future you look at will look like this,<p>That's not true. A Future is supposed to schedule itself to be woken up again <i>when it's ready</i>. This Future schedules itself to be woken immediately. Most runtimes, like Tokio, will put a Future that acts like this at the end of the run queue, so in practice it's not as egregious. However, it's unquestionably a spin lock, equivalent to backing off with thread::yield.</p>
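A self-contained illustration of the pattern being described (hypothetical names; polled by hand with a no-op waker, no runtime required): a future that calls wake_by_ref() and returns Pending asks to be rescheduled immediately, so polling it in a loop is a spin loop dressed up as async.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// The pattern in question: wake immediately, then return Pending.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // "poll me again right away"
            Poll::Pending
        }
    }
}

// A no-op waker, just enough to poll by hand without a real runtime.
fn noop_waker() -> Waker {
    fn noop(_: *const ()) {}
    fn clone(p: *const ()) -> RawWaker {
        RawWaker::new(p, &VTABLE)
    }
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Count how many polls it takes to finish — the executor's spin loop.
fn poll_count(mut fut: Pin<&mut YieldNow>) -> u32 {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if fut.as_mut().poll(&mut cx).is_ready() {
            return polls;
        }
    }
}

fn main() {
    let mut fut = YieldNow { yielded: false };
    // One Pending poll (with an immediate wake), then Ready.
    assert_eq!(poll_count(Pin::new(&mut fut)), 2);
}
```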
]]></description><pubDate>Tue, 05 Nov 2024 23:30:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=42056132</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=42056132</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42056132</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Designing a Fast Concurrent Hash Table"]]></title><description><![CDATA[
<p>It's quite common for concurrent algorithms to only implement a subset of operations, for example forgoing removal or iteration. It's also common to put limitations on the data structure, such as limiting keys and values to 64 bits. Papaya being feature-complete means that it has none of these limitations compared to std::collections::HashMap.</p>
]]></description><pubDate>Fri, 11 Oct 2024 17:55:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=41811639</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=41811639</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41811639</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Designing a Fast Concurrent Hash Table"]]></title><description><![CDATA[
<p>Looks very interesting, but seems to serve a pretty different use case:<p>> This is an ordered data structure, and supports very high throughput iteration over lexicographically sorted ranges of values. If you are looking for simple point operation performance, you may find a better option among one of the many concurrent hashmap implementations that are floating around. Pay for what you actually use :)</p>
]]></description><pubDate>Thu, 10 Oct 2024 20:21:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=41803131</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=41803131</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41803131</guid></item><item><title><![CDATA[Uv: Unified Python Packaging]]></title><description><![CDATA[
<p>Article URL: <a href="https://astral.sh/blog/uv-unified-python-packaging">https://astral.sh/blog/uv-unified-python-packaging</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=41302523">https://news.ycombinator.com/item?id=41302523</a></p>
<p>Points: 49</p>
<p># Comments: 7</p>
]]></description><pubDate>Tue, 20 Aug 2024 18:19:46 +0000</pubDate><link>https://astral.sh/blog/uv-unified-python-packaging</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=41302523</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41302523</guid></item><item><title><![CDATA[New comment by ibraheemdev in "Rust Atomics and Locks (2023)"]]></title><description><![CDATA[
<p>Java atomics are actually sequentially consistent; C# relaxes this to acquire/release. The general concept of happens-before is still immensely useful for learning atomics, though, since sequential consistency is a superset of acquire/release.</p>
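In Rust terms (since the book under discussion is Rust Atomics and Locks), the happens-before relation looks like this — a hypothetical sketch, with names of my own choosing:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// A plain (non-atomic) value, published through an atomic flag.
static mut DATA: u32 = 0;
static READY: AtomicBool = AtomicBool::new(false);

// The Release store on READY synchronizes-with the Acquire load, so the
// write to DATA happens-before the read — no data race, no SeqCst needed
// for this pattern.
fn publish_and_read() -> u32 {
    let writer = thread::spawn(|| {
        unsafe { DATA = 42 }; // plain write
        READY.store(true, Ordering::Release); // publish
    });
    let reader = thread::spawn(|| {
        while !READY.load(Ordering::Acquire) {} // wait for publication
        unsafe { DATA } // visible: happens-before established
    });
    writer.join().unwrap();
    reader.join().unwrap()
}

fn main() {
    assert_eq!(publish_and_read(), 42);
}
```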
]]></description><pubDate>Tue, 13 Aug 2024 23:40:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=41241012</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=41241012</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41241012</guid></item><item><title><![CDATA[A fast and ergonomic concurrent hash-table for read-heavy workloads]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/ibraheemdev/papaya">https://github.com/ibraheemdev/papaya</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=40950059">https://news.ycombinator.com/item?id=40950059</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 12 Jul 2024 22:30:55 +0000</pubDate><link>https://github.com/ibraheemdev/papaya</link><dc:creator>ibraheemdev</dc:creator><comments>https://news.ycombinator.com/item?id=40950059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40950059</guid></item></channel></rss>