<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: attractivechaos</title><link>https://news.ycombinator.com/user?id=attractivechaos</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 12 Apr 2026 10:43:26 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=attractivechaos" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by attractivechaos in "Don't post generated/AI-edited comments. HN is for conversation between humans."]]></title><description><![CDATA[
<p>In the age of AI, thinking becomes a privilege.</p>
]]></description><pubDate>Wed, 11 Mar 2026 22:44:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47343376</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=47343376</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47343376</guid></item><item><title><![CDATA[New comment by attractivechaos in "Bun v1.3.9"]]></title><description><![CDATA[
<p>Their C compiler project proves the opposite.</p>
]]></description><pubDate>Sun, 08 Feb 2026 21:45:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46938830</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46938830</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46938830</guid></item><item><title><![CDATA[New comment by attractivechaos in "Deep dive into Turso, the “SQLite rewrite in Rust”"]]></title><description><![CDATA[
<p>Yeah, I was expecting performance benchmarks, detailed feature comparisons, analysis of binary/extension compatibility, etc.</p>
]]></description><pubDate>Thu, 29 Jan 2026 23:25:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46818358</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46818358</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46818358</guid></item><item><title><![CDATA[New comment by attractivechaos in "Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)"]]></title><description><![CDATA[
<p>On the contrary, the key message from the blog post is <i>not</i> to load the entire dataset into RAM unless necessary. The trick is to stream when the access pattern allows it. This is how our field routinely works with files over 100GB.</p>
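The streaming pattern can be sketched in a few lines of Python (the two-column record format here is hypothetical, standing in for the blog post's dataset):

```python
import io

def count_results(stream):
    """Stream a line-oriented file, keeping only O(1) state
    (running counts) instead of loading everything into RAM."""
    counts = {}
    for line in stream:
        key = line.split()[0]  # first column is the category
        counts[key] = counts.get(key, 0) + 1
    return counts

# A small in-memory sample standing in for a 100GB+ file on disk.
sample = io.StringIO("win a\nloss b\nwin c\n")
print(count_results(sample))  # {'win': 2, 'loss': 1}
```

The same loop works unchanged whether the stream is a 3-line test file or a pipe from a multi-terabyte source; memory stays bounded by the number of distinct keys.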
]]></description><pubDate>Sun, 18 Jan 2026 15:03:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46668322</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46668322</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46668322</guid></item><item><title><![CDATA[New comment by attractivechaos in "Lessons from Hash Table Merging"]]></title><description><![CDATA[
<p>First of all, as khuey pointed out, the current implementation accumulates values, while extend() replaces them instead, so it wouldn't achieve the same functionality.<p>I tried extend() anyway. It didn't work well. Based on your description, extend() implements a variation of preallocation (i.e. Solution II). However, because it doesn't always reserve enough space to hold the merged hash table, clustering still happens depending on N. I have updated the Rust implementation (with the help of an LLM, as I am not a good Rust programmer). You can try it yourself with "ht-merge-rust 1 -e -n14m", or point out any mistakes I made.<p>> <i>HashMap will by default be randomly seeded in Rust</i><p>Yes, and so is Abseil. The default Rust hash functions, SipHash in the standard library and foldhash in hashbrown, are ~3X slower than simple hash functions on a pure insertion load. When performance matters, we will use faster hash functions, at least for small keys, and will then need a solution like the one in my post.<p>> <i>In a new enough C++ in theory you might find the same functionality supported, but Quality of Implementation tends to be pretty frightful.</i><p>That is not necessarily the case. The Rust libraries are ports of Abseil, a C++ library, and Boost is as fast as Rust. Languages/libraries should learn from each other, not fight each other.</p>
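The semantic difference can be shown with plain dicts standing in for the hash tables: an extend()/update()-style merge replaces values on key collision, while the merge discussed in the post accumulates them.

```python
def merge_accumulate(dst, src):
    """The semantics the post needs: sum values on duplicate keys."""
    for k, v in src.items():
        dst[k] = dst.get(k, 0) + v
    return dst

a = {"x": 1, "y": 2}
b = {"y": 10, "z": 3}

extended = dict(a)
extended.update(b)  # extend()-like merge: replaces on collision
accumulated = merge_accumulate(dict(a), b)

print(extended)     # {'x': 1, 'y': 10, 'z': 3}
print(accumulated)  # {'x': 1, 'y': 12, 'z': 3}
```

Only key "y" collides, and the two merges visibly disagree on it.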
]]></description><pubDate>Thu, 08 Jan 2026 19:47:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46545536</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46545536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46545536</guid></item><item><title><![CDATA[New comment by attractivechaos in "Lessons from Hash Table Merging"]]></title><description><![CDATA[
<p>These three and Boost are all based on Swiss tables, which are indeed more robust than plain linear probing. khashl is the only one here using basic linear probing; without salting, its curve goes through the roof, much worse than Swiss tables.</p>
]]></description><pubDate>Thu, 08 Jan 2026 14:26:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46541338</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46541338</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46541338</guid></item><item><title><![CDATA[New comment by attractivechaos in "Lessons from Hash Table Merging"]]></title><description><![CDATA[
<p>Exactly. And khashl uses Fibonacci hashing. Without salting, it has the same problem.</p>
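A sketch of Fibonacci hashing with an optional per-table salt (a 64-bit variant for illustration with the usual golden-ratio multiplier; the salting mechanism here is illustrative, not khashl's exact code):

```python
GOLDEN = 0x9E3779B97F4A7C15  # 2^64 / golden ratio, the Fibonacci-hashing multiplier
MASK64 = (1 << 64) - 1

def fib_bucket(key, bits, salt=0):
    """Map a key to one of 2**bits buckets via Fibonacci hashing.
    Without a salt, every table of the same size sends a given key
    to the same bucket, so merging preserves collision clusters."""
    h = ((key ^ salt) * GOLDEN) & MASK64
    return h >> (64 - bits)

keys = range(100)
plain = [fib_bucket(k, 10) for k in keys]
salted = [fib_bucket(k, 10, salt=0xDEADBEEF) for k in keys]
print(plain == salted)  # the salt changes the bucket mapping
```

Giving each table its own salt breaks the shared mapping, which is the point of salting in this context.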
]]></description><pubDate>Thu, 08 Jan 2026 14:17:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46541228</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46541228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46541228</guid></item><item><title><![CDATA[New comment by attractivechaos in "JavaScript engines zoo – Compare every JavaScript engine"]]></title><description><![CDATA[
<p>I am not sure how "long term" you have in mind. Making big architectural changes does not necessarily lead to better performance. JSC has been the fastest JS engine for the past 5-10 years as far as I remember. If Google had a solution, they would have shipped it by now.</p>
]]></description><pubDate>Sun, 04 Jan 2026 15:53:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46489087</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46489087</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46489087</guid></item><item><title><![CDATA[Lessons from Hash Table Merging]]></title><description><![CDATA[
<p>Article URL: <a href="https://gist.github.com/attractivechaos/d2efc77cc1db56bbd5fc597987e73338">https://gist.github.com/attractivechaos/d2efc77cc1db56bbd5fc597987e73338</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46464270">https://news.ycombinator.com/item?id=46464270</a></p>
<p>Points: 85</p>
<p># Comments: 21</p>
]]></description><pubDate>Fri, 02 Jan 2026 12:51:55 +0000</pubDate><link>https://gist.github.com/attractivechaos/d2efc77cc1db56bbd5fc597987e73338</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46464270</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46464270</guid></item><item><title><![CDATA[New comment by attractivechaos in "Fabrice Bellard: Biography (2009) [pdf]"]]></title><description><![CDATA[
<p>I have been thinking about what talent means in programming, and a case from the past comes to mind. The task was to parse a text file format. One programmer used ~1000 lines of code (LOC) with complex logic. The other used fewer than 200 LOC with a straightforward solution that ran several times faster and would probably be more extensible and easier to maintain. That was a small task; the difference gets greatly amplified in complex projects of the kind Fabrice is famous for. The first programmer in my story may be able to write a JavaScript runtime given time and obsession, but it would take him much longer and the quality would be much lower in comparison to quickjs or mqjs.</p>
]]></description><pubDate>Thu, 25 Dec 2025 15:26:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=46384939</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46384939</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46384939</guid></item><item><title><![CDATA[New comment by attractivechaos in "Fabrice Bellard: Biography (2009) [pdf]"]]></title><description><![CDATA[
<p>This doesn't explain why so few people of Fabrice's generation have reached his level. Think about violin playing: many players can become professionals if they have the obsession, but 99% of them won't reach the Heifetz/Hadelich/Ehnes level no matter how hard they try. Talent matters. Programming is not much different from the performing arts.</p>
]]></description><pubDate>Thu, 25 Dec 2025 06:07:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=46382498</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46382498</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46382498</guid></item><item><title><![CDATA[New comment by attractivechaos in "Fabrice Bellard Releases MicroQuickJS"]]></title><description><![CDATA[
<p>You can ask 1000 average programmers whether they could write MicroQuickJS in the same amount of time, or ask one average programmer whether he/she could write MicroQuickJS to the same quality in his/her lifetime. 10X, 100X or 1000X measures the productivity of us mortals, not of someone like Fabrice Bellard.</p>
]]></description><pubDate>Tue, 23 Dec 2025 21:59:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=46369969</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46369969</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46369969</guid></item><item><title><![CDATA[New comment by attractivechaos in "Show HN: Autograd.c – A tiny ML framework built from scratch"]]></title><description><![CDATA[
<p>> <i>Is there a compiler-autograd "library"?</i><p>Do you mean the method Theano uses? Anyway, the performance bottleneck often lies in matrix multiplication or 2D CNNs (which can be reduced to matmul). Compiler autograd wouldn't save much time.</p>
]]></description><pubDate>Mon, 22 Dec 2025 00:38:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=46350137</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46350137</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46350137</guid></item><item><title><![CDATA[New comment by attractivechaos in "Deprecations via warnings don't work for Python libraries"]]></title><description><![CDATA[
<p>Even if getHeaders() has security/performance concerns, the better solution in this case is to make it an alias of the newer headers.get(). Keeping the old API is a small hassle for a handful of developers, but breaking existing code puts a much bigger burden on many more users.</p>
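The alias approach can be as small as this (the class and method names follow the hypothetical getHeaders/headers example, not any specific library's code):

```python
import warnings

class Response:
    def __init__(self, headers):
        self.headers = headers  # the new-style mapping API

    def getHeaders(self):
        """Deprecated alias: keeps old callers working while
        steering new code toward the .headers attribute."""
        warnings.warn("getHeaders() is deprecated; use .headers instead",
                      DeprecationWarning, stacklevel=2)
        return self.headers

r = Response({"content-type": "text/plain"})
print(r.getHeaders()["content-type"])  # old code still works
```

Old callers keep working and get a nudge; nothing breaks until the maintainer actually removes the alias, which can happen years later.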
]]></description><pubDate>Wed, 10 Dec 2025 21:11:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=46223911</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46223911</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46223911</guid></item><item><title><![CDATA[New comment by attractivechaos in "My favourite small hash table"]]></title><description><![CDATA[
<p>This is a smart implementation of Robin Hood hashing that I was not aware of. In my understanding, a standard implementation keeps the probe length of each entry; this one avoids that thanks to its extra constraints. I don't quite understand the following strategy, though:<p>> <i>To meet property (3) [if the key 0 is present, its value is not 0] ... "array index plus one" can be stored rather than "array index".</i><p>If the hash code can take any value in [0,2^32), how do you define a special value for empty buckets? The more common solution is to have a special key, not a special hash code, for empty slots, which is easier to achieve. In addition, as the author points out, supporting generic keys requires storing 32-bit hash values. With the extra 4 bytes per bucket, it is not clear whether this implementation is better than plain linear probing (my favorite). The fastest hash table implementations, like Boost and Abseil, don't often use Robin Hood hashing.</p>
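For comparison, the "special key for empty slots" approach can be sketched in a few lines (a Python stand-in with an arbitrary sentinel object; no resizing or deletion, for brevity):

```python
EMPTY = object()  # sentinel key marking unused slots; no hash code is reserved

class LinearProbing:
    """Minimal open-addressing table with plain linear probing.
    Slots hold (key, value) pairs; an EMPTY key marks a free slot."""
    def __init__(self, bits=4):
        self.mask = (1 << bits) - 1
        self.slots = [(EMPTY, None)] * (self.mask + 1)

    def _probe(self, key):
        i = hash(key) & self.mask
        while self.slots[i][0] is not EMPTY and self.slots[i][0] != key:
            i = (i + 1) & self.mask  # step to the next slot, wrapping around
        return i

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key, default=None):
        k, v = self.slots[self._probe(key)]
        return default if k is EMPTY else v

t = LinearProbing()
t.put(42, "a"); t.put(7, "b")
print(t.get(42), t.get(7), t.get(99))  # a b None
```

Because emptiness lives in the key slot, any 32-bit hash value remains usable, which is the contrast with reserving a special hash code.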
]]></description><pubDate>Wed, 10 Dec 2025 02:15:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=46213260</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46213260</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46213260</guid></item><item><title><![CDATA[New comment by attractivechaos in "Thoughts on Go vs. Rust vs. Zig"]]></title><description><![CDATA[
<p>> <i>But you only need about 5% of the concepts in that comment to be productive in Rust.</i><p>The similar argument against C++ applies here: another programmer may be using 10% (or a different 5%) of the concepts, and you will have to learn that fraction when working with him/her. This can also happen when you read the source code of random projects. C programmers seldom have this problem. Complexity matters.</p>
]]></description><pubDate>Fri, 05 Dec 2025 05:43:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=46157165</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=46157165</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46157165</guid></item><item><title><![CDATA[New comment by attractivechaos in "Removing newlines in FASTA file increases ZSTD compression ratio by 10x"]]></title><description><![CDATA[
<p>FASTA was invented in the late 1980s. At that time, unix tools often limited line length. Even in the early 2000s, some unix tools (on AIX, as I remember) still had this limit.</p>
]]></description><pubDate>Mon, 15 Sep 2025 16:38:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=45251818</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=45251818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45251818</guid></item><item><title><![CDATA[New comment by attractivechaos in "I prefer human-readable file formats"]]></title><description><![CDATA[
<p>> <i>Another thing is human readable is typically synonymous with unindexed</i><p>Indexing is not directly related to binary vs text: many text formats in bioinformatics are indexed, and many binary formats are not, when they were not designed with indexing in mind.<p>> <i>a human-eye check is tedious/impossible because you have to scroll through gigabytes to find what you want.</i><p>Yes, indexing is better, but without it you can use command-line tools to extract the portion you want to look at and then pipe it to "more" or "less".</p>
]]></description><pubDate>Sat, 09 Aug 2025 15:40:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=44847357</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=44847357</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44847357</guid></item><item><title><![CDATA[New comment by attractivechaos in "I prefer human-readable file formats"]]></title><description><![CDATA[
<p>> <i>human-readable files are ridiculously inefficient on every axis you can think of (space, parsing, searching, processing, etc.).</i><p>In bioinformatics, most large text files are gzip'd. Decompression is a few times slower than proper file parsing in C/C++/Rust. Some pure Python parsers can be "ridiculously inefficient", but that is not the fault of human readability. Binary files are compressed with existing libraries, and compressed binary files are not noticeably faster to parse than compressed text files. Binary formats can indeed be smaller, but space-efficient formats take years to develop and tend to have more compatibility issues. You can't skip the text-format phase.<p>> <i>And at that scale, "readable" has no value, since it would take you longer to read the file than 10 lifetimes.</i><p>You can't read the whole file by eye, but you can (and often should) eyeball small sections of a huge file. For that, you need a human-readable format. A problem with this field, IMHO, is that not many people literally look at the data by eye.</p>
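Parsing a gzip'd text file in a streaming fashion, without ever holding the decompressed data in memory, is straightforward (a sketch with a small in-memory buffer standing in for a multi-GB file on disk; the tab-separated record format is made up):

```python
import gzip
import io

# Build a small gzip'd "file" in memory to stand in for a huge one.
raw = "\n".join(f"record{i}\t{i}" for i in range(1000)).encode()
buf = io.BytesIO(gzip.compress(raw))

total = 0
with gzip.open(buf, "rt") as fh:  # decompresses incrementally as we read
    for line in fh:
        total += int(line.rstrip("\n").split("\t")[1])
print(total)  # 499500, the sum of 0..999
```

The decompressor feeds the parser line by line, so memory use is independent of the file size, compressed or not.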
]]></description><pubDate>Sat, 09 Aug 2025 13:24:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=44846276</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=44846276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44846276</guid></item><item><title><![CDATA[New comment by attractivechaos in "Implementing Generic Types in C"]]></title><description><![CDATA[
<p>For linked lists and binary trees, intrusive data structures are better.<p>> <i>Well, except the first one, template macros, where I can’t really find any pro, only cons.</i><p>For toy examples, the first approach (expanding a huge macro) has mostly cons, but it is more flexible when you want to instantiate different parts of the header. The second approach can work, but becomes clumsy in that case because the whole header is treated as one unit.</p>
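The intrusive idea, sketched in Python for brevity (in C the link would be a struct member and the list code needs no generics machinery at all):

```python
class Task:
    """An 'intrusive' node: the object embeds its own link field,
    so the list needs no separate wrapper node per element."""
    def __init__(self, name):
        self.name = name
        self.next = None  # the intrusive link

def push(head, item):
    """Prepend item to the list and return the new head."""
    item.next = head
    return item

head = None
for n in ("a", "b", "c"):
    head = push(head, Task(n))

names = []
node = head
while node:           # walk the embedded links
    names.append(node.name)
    node = node.next
print(names)  # ['c', 'b', 'a']
```

Because the link lives inside the element itself, there is one allocation per element and the list code never needs to know the element type, which is why intrusive lists sidestep the generics problem in C.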
]]></description><pubDate>Wed, 19 Mar 2025 23:43:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=43418452</link><dc:creator>attractivechaos</dc:creator><comments>https://news.ycombinator.com/item?id=43418452</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43418452</guid></item></channel></rss>