Hacker News: attractivechaos

New comment by attractivechaos in "I am worried about Bun"

attractivechaos — Mon, 04 May 2026 22:51:40 +0000

The best feature Bun delivered recently is portable binary. That portability is a huge deal to me as my users are often on ancient Linux distros. Thank you. Both node and deno require recent Linux, more exactly, recent glibc.

New comment by attractivechaos in "You can beat the binary search"

attractivechaos — Thu, 30 Apr 2026 20:13:16 +0000

See also: Static search trees: 40x faster than binary search

- https://curiouscoding.nl/slides/p99-text/

- https://curiouscoding.nl/posts/static-search-tree/

- https://news.ycombinator.com/item?id=42562847 (656 points; 232 comments)

New comment by attractivechaos in "Don't post generated/AI-edited comments. HN is for conversation between humans."

attractivechaos — Wed, 11 Mar 2026 22:44:26 +0000

In the age of AI, thinking becomes a privilege.

New comment by attractivechaos in "Bun v1.3.9"

attractivechaos — Sun, 08 Feb 2026 21:45:01 +0000

Their C compiler project proves the opposite.

New comment by attractivechaos in "Deep dive into Turso, the “SQLite rewrite in Rust”"

attractivechaos — Thu, 29 Jan 2026 23:25:50 +0000

Yeah, I was expecting performance benchmarks, detailed feature comparisons, analysis of binary/extension compatibility, etc.

New comment by attractivechaos in "Command-line Tools can be 235x Faster than your Hadoop Cluster (2014)"

attractivechaos — Sun, 18 Jan 2026 15:03:07 +0000

On the contrary, the key message from the blog post is not to load the entire dataset to RAM unless necessary. The trick is to stream when the pattern works. This is how our field routinely works with files over 100GB.

New comment by attractivechaos in "Lessons from Hash Table Merging"

attractivechaos — Thu, 08 Jan 2026 19:47:20 +0000

First of all, as khuey pointed out, the current implementation accumulates values. extend() replaces values instead. It wouldn't achieve the same functionality.

I tried extend() anyway. It didn't work well. Based on your description, extend() implements a variation of preallocation (i.e. Solution II). However, because it doesn't always reserve enough space to hold the merged hash table, clustering still happens depending on N. I have updated the rust implementation (with the help of LLM as I am not a good rust programmer). You can try it yourself with "ht-merge-rust 1 -e -n14m" or point out if I made mistakes.

> HashMap will by default be randomly seeded in Rust

Yes, so it is with Abseil. The default rust hash functions, siphash in the standard library and foldhash in hashbrown, are ~3X as slow in comparison to simple hash functions on pure insertion load. When performance matters, we will use faster hash functions at least for small keys and will need a solution from my post.

> In a new enough C++ in theory you might find the same functionality supported, but Quality of Implementation tends to be pretty frightful.

This is not necessary. The rust libraries are a port of Abseil, a C++ library. Boost is as fast as Rust. Languages/libraries should learn from each other, not fight each other.

New comment by attractivechaos in "Lessons from Hash Table Merging"

attractivechaos — Thu, 08 Jan 2026 14:26:05 +0000

These three and boost are all based on swiss tables. They are indeed more robust than plain linear probing. khashl is the only one here using basic linear probing. Without salting, its curve is through the roof, much worse than swiss tables.

New comment by attractivechaos in "Lessons from Hash Table Merging"

attractivechaos — Thu, 08 Jan 2026 14:17:41 +0000

Exactly. And khashl uses Fibonacci hashing. Without salting, it has the same problem.

New comment by attractivechaos in "JavaScript engines zoo – Compare every JavaScript engine"

attractivechaos — Sun, 04 Jan 2026 15:53:13 +0000

I am not sure how "long term" you are thinking about. Making big architectural changes does not necessarily lead to better performance. JSC has been the fastest JS engine in the past 5-10 years as I remember. If google had a solution, they would have it now.

Lessons from Hash Table Merging

attractivechaos — Fri, 02 Jan 2026 12:51:55 +0000

Article URL: https://gist.github.com/attractivechaos/d2efc77cc1db56bbd5fc597987e73338

Comments URL: https://news.ycombinator.com/item?id=46464270

Points: 85

# Comments: 21

New comment by attractivechaos in "Fabrice Bellard: Biography (2009) [pdf]"

attractivechaos — Thu, 25 Dec 2025 15:26:28 +0000

I have been thinking what talent means in programming and thought of a case in the past. The task was to parse a text file format. One programmer used ~1000 lines of code (LOC) with complex logic. The other used <200 LOC with a straightforward solution that ran times faster and would probably be more extensible and easier to maintain in future. This is a small task. The difference will be exponentially amplified for complex projects that Fabrice is famous for. The first programmer in my story may be able to write a javascript runtime if he has time + obsession, but it will take him much longer and the quality will be much lower in comparison to quickjs or mqjs.

New comment by attractivechaos in "Fabrice Bellard: Biography (2009) [pdf]"

attractivechaos — Thu, 25 Dec 2025 06:07:25 +0000

This doesn't explain why so few people of Fabrice's generation have reached his level. Think about violin playing. Many players can become professionals if they have the obsession, but 99% of them won't reach the Heifetz/Hadelich/Ehnes level no matter how hard they try. Talent matters. Programming is not much different from performing art.

New comment by attractivechaos in "Fabrice Bellard Releases MicroQuickJS"

attractivechaos — Tue, 23 Dec 2025 21:59:23 +0000

You can call 1000 averaged programmers and see if they can write MicroQuickJS using the same amount of time, or call one averaged programmer and see if he/she can write MicroQuickJS to the same quality in his/her life time. 10X, 100X or 1000X measures the productivity of us mortals, not someone like Fabrice Bellard.

New comment by attractivechaos in "Show HN: Autograd.c – A tiny ML framework built from scratch"

attractivechaos — Mon, 22 Dec 2025 00:38:08 +0000

> Is there a compiler-autograd "library"?

Do you mean the method theano is using? Anyway, the performance bottleneck often lies in matrix multiplication or 2D-CNN (which can be reduced to matmul). Compiler autograd wouldn't save much time.

New comment by attractivechaos in "Deprecations via warnings don't work for Python libraries"

attractivechaos — Wed, 10 Dec 2025 21:11:11 +0000

Even if getHeaders() has security/performance concerns, the better solution is to make it an alias to the newer headers.get() in this case. Keeping the old API is a small hassle to a handful of developers but breaking existing code puts a much bigger burden on a lot more users.

New comment by attractivechaos in "My favourite small hash table"

attractivechaos — Wed, 10 Dec 2025 02:15:31 +0000

This is a smart implementation of Robin Hood hashing I am not aware of. In my understanding, a standard implementation keeps the probe length of each entry. This one avoids that due to its extra constraints. I don't quite understand the following strategy, though

> To meet property (3) [if the key 0 is present, its value is not 0] ... "array index plus one" can be stored rather than "array index".

If hash code can take any value in [0,2^32), how to define a special value for empty buckets? The more common solution is to have a special key, not a special hash code, for empty slots, which is easier to achieve. In addition, as the author points out, supporting generic keys requires to store 32-bit hash values. With the extra 4 bytes per bucket, it is not clear if this implementation is better than plain linear probing (my favorite). The fastest hash table implementations like boost and abseil don't often use Robin Hood hashing.

New comment by attractivechaos in "Thoughts on Go vs. Rust vs. Zig"

attractivechaos — Fri, 05 Dec 2025 05:43:31 +0000

> But you only need about 5% of the concepts in that comment to be productive in Rust.

The similar argument against C++ is applicable here: another programmer may be using 10% (or a different 5%) of the concepts. You will have to learn that fraction when working with him/her. This may also happen when you read the source code of some random projects. C programmers seldom have this problem. Complexity matters.

New comment by attractivechaos in "Removing newlines in FASTA file increases ZSTD compression ratio by 10x"

attractivechaos — Mon, 15 Sep 2025 16:38:10 +0000

FASTA was invented in late 1980s. At that time, unix tools often limited line length. Even in early 2000s, some unix tools (on AIX as I remember) still had this limit.

New comment by attractivechaos in "I prefer human-readable file formats"

attractivechaos — Sat, 09 Aug 2025 15:40:02 +0000

> Another thing is human readable is typically synonymous with unindexed

Indexing is not directly related to binary vs text. Many text formats in bioinformatics are indexed and many binary formats are not when they are not designed with indexing in mind.

> a human-eye check is tedious/impossible because you have to scroll through gigabytes to find what you want.

Yes, indexing is better but without indexing, you can use command line tools to extract the portion you want to look at and then pipe to "more" or "less".