<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: EdSchouten</title><link>https://news.ycombinator.com/user?id=EdSchouten</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 25 Apr 2026 11:48:39 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=EdSchouten" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by EdSchouten in "I made my own Git"]]></title><description><![CDATA[
<p>You can always use a divide and conquer strategy to compute the chunks. Chunk both halves of the file independently. Once that’s done, you redo the chunking from just before the midpoint of the file forward, until the boundaries start to match the chunks obtained previously.</p>
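<p>A minimal sketch of that idea in Python, not taken from any particular implementation. The Gear-style cut function, the table seed and the tiny size limits are all assumptions chosen for illustration; the only property that matters is that each cut depends solely on the bytes after the previous cut.</p>

```python
import random

MASK64 = (1 << 64) - 1
MIN_SIZE, MAX_SIZE = 16, 256  # hypothetical limits, kept tiny for the demo
_rng = random.Random(0)  # fixed seed so every run builds the same table
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def next_cut(data, start, end):
    """Return the end offset of the chunk that begins at `start`."""
    limit = min(end, start + MAX_SIZE)
    h = 0
    for i in range(start + MIN_SIZE, limit):
        h = ((h << 1) + GEAR[data[i]]) & MASK64
        if h >> 58 == 0:  # top 6 bits are zero: roughly 1 in 64 positions
            return i + 1
    return limit


def chunk_range(data, start, end):
    """Chunk data[start:end] sequentially; returns absolute cut offsets."""
    cuts = []
    while start < end:
        start = next_cut(data, start, end)
        cuts.append(start)
    return cuts


def chunk_divide_and_conquer(data):
    n = len(data)
    mid = n // 2
    left = chunk_range(data, 0, mid)   # these two calls are independent
    right = chunk_range(data, mid, n)  # and could run on separate cores
    right_cuts = set(right)
    # The last chunk of the left half was cut short by the artificial end
    # at `mid`, so drop it and redo the chunking from where it started.
    cuts = left[:-1]
    pos = cuts[-1] if cuts else 0
    while pos < n:
        pos = next_cut(data, pos, n)
        cuts.append(pos)
        if pos in right_cuts:
            # Back in sync: every later cut matches the right half's cuts.
            cuts.extend(c for c in right if c > pos)
            break
    return cuts
```

<p>In this sketch the result is identical to chunking the whole input sequentially with chunk_range(data, 0, len(data)). Each half could in turn be split the same way.</p>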
]]></description><pubDate>Wed, 28 Jan 2026 05:24:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=46791405</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=46791405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46791405</guid></item><item><title><![CDATA[New comment by EdSchouten in "I made my own Git"]]></title><description><![CDATA[
<p>It depends on the architecture. On ARM64, SHA-256 tends to be faster than BLAKE3. The reason is that most modern ARM64 CPUs have native SHA-256 instructions and lack an equivalent of AVX-512.<p>Furthermore, if your input files are large enough that parallelizing across multiple cores makes sense, then it's generally better to change your data model to eliminate the existence of the large inputs altogether.<p>For example, Git is somewhat primitive in that every file is a single object. In retrospect it would have been smarter to decompose large files into chunks using a Content Defined Chunking (CDC) algorithm, and model large files as a manifest of chunks. That way you get better deduplication. The resulting chunks can then be hashed in parallel, using a single-threaded algorithm.</p>
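<p>A rough sketch of the manifest idea in Python. This is not how Git or any existing tool stores data: the fixed-size splitter is only a stand-in for a real CDC algorithm, and the names and the 64 KiB chunk size are assumptions.</p>

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 64 * 1024  # fixed-size stand-in; a real system would use CDC cuts


def split_into_chunks(data):
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def build_manifest(data):
    """Return (manifest, object_id) for one large file."""
    chunks = split_into_chunks(data)
    # Every chunk is hashed with plain single-threaded SHA-256; the
    # parallelism comes from hashing many chunks at the same time.
    with ThreadPoolExecutor() as pool:
        manifest = list(pool.map(lambda c: hashlib.sha256(c).digest(), chunks))
    # The file is identified by the hash of its list of chunk hashes.
    object_id = hashlib.sha256(b"".join(manifest)).hexdigest()
    return manifest, object_id
```

<p>Changing a single byte of the input changes one manifest entry and the object ID; all other chunks keep their hashes and can be deduplicated.</p>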
]]></description><pubDate>Tue, 27 Jan 2026 14:09:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46780099</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=46780099</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46780099</guid></item><item><title><![CDATA[New comment by EdSchouten in "Germany Forces Lexus to Remotely Kill Car Heating in Dead of Winter"]]></title><description><![CDATA[
<p>That’s great! People who do that are often inconsiderate of how it affects others. First of all, it generates unnecessary noise, which is annoying for neighbors who are still trying to sleep. Pedestrians/cyclists also need to breathe those exhaust gases.</p>
]]></description><pubDate>Wed, 21 Jan 2026 05:52:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=46701633</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=46701633</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46701633</guid></item><item><title><![CDATA[New comment by EdSchouten in "Spinlocks vs. Mutexes: When to Spin and When to Sleep"]]></title><description><![CDATA[
<p>I don’t understand why I would need to care about this. Can’t my operating system and/or pthread library sort this out by itself?</p>
]]></description><pubDate>Mon, 08 Dec 2025 06:24:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46189037</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=46189037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46189037</guid></item><item><title><![CDATA[New comment by EdSchouten in "CDC File Transfer"]]></title><description><![CDATA[
<p>Yeah, that's true. Having some kind of chunking algorithm that's content/file format aware could make it work even better. For example, it makes a lot of sense to chunk source files at function/scope boundaries.<p>In my case I need to ensure that all producers of data use exactly the same algorithm, as I need to look up build cache results based on Merkle tree hashes. That's why I'm intentionally focusing on having algorithms that are not only easy to implement, but also easy to implement <i>consistently</i>. I think the MaxCDC implementation that I shared strikes a good balance in that regard.</p>
]]></description><pubDate>Wed, 01 Oct 2025 17:48:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45440750</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=45440750</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45440750</guid></item><item><title><![CDATA[New comment by EdSchouten in "CDC File Transfer"]]></title><description><![CDATA[
<p>In my case I observed a ~2% reduction in data storage when attempting to store and deduplicate various versions of the Linux kernel source tree (see link above). But that also includes the space needed to store the original version.<p>If we take that out of the equation and only measure the size of the additional chunks being transferred, it's a reduction of about 3.4%. So it's not an order of magnitude difference, but not bad for a relatively small change.</p>
]]></description><pubDate>Wed, 01 Oct 2025 10:03:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=45436048</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=45436048</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45436048</guid></item><item><title><![CDATA[New comment by EdSchouten in "CDC File Transfer"]]></title><description><![CDATA[
<p>Yeah, GEAR hashing is simple enough that I haven't considered using anything else.<p>Regarding the RNG used to seed the GEAR table: I don't think it actually makes that much of a difference. You only use it once to generate 2 KB of data (256 64-bit constants). My suspicion is that using some nothing-up-my-sleeve numbers (e.g., the first 16,384 binary digits of π) would work as well.</p>
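<p>For reference, a small Python sketch of a GEAR-style rolling hash. Deriving the table from SHA-256 of each byte value is purely an assumption made here to get reproducible constants; it is not what any particular CDC library does.</p>

```python
import hashlib

MASK64 = (1 << 64) - 1

# 256 constants of 64 bits each: 256 * 8 bytes = 2 KB in total.
# Hypothetical derivation: the first 8 bytes of SHA-256 over the byte value.
GEAR_TABLE = [
    int.from_bytes(hashlib.sha256(bytes([b])).digest()[:8], "big")
    for b in range(256)
]


def gear_hash(data):
    """Shift left by one bit and add the table entry for each input byte."""
    h = 0
    for byte in data:
        h = ((h << 1) + GEAR_TABLE[byte]) & MASK64
    return h
```

<p>Because every step shifts the state left by one bit, a byte stops influencing the 64-bit hash after 64 further bytes, so the hash of any input equals the hash of its last 64 bytes.</p>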
]]></description><pubDate>Wed, 01 Oct 2025 07:06:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=45435130</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=45435130</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45435130</guid></item><item><title><![CDATA[New comment by EdSchouten in "CDC File Transfer"]]></title><description><![CDATA[
<p>I’ve also been doing lots of experimenting with Content Defined Chunking since last year (for <a href="https://bonanza.build/" rel="nofollow">https://bonanza.build/</a>). One of the things I discovered is that the most commonly used algorithm FastCDC (also used by this project) can be improved significantly by looking ahead. An implementation of that can be found here:<p><a href="https://github.com/buildbarn/go-cdc" rel="nofollow">https://github.com/buildbarn/go-cdc</a></p>
]]></description><pubDate>Wed, 01 Oct 2025 05:35:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=45434599</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=45434599</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45434599</guid></item><item><title><![CDATA[New comment by EdSchouten in "Which NPM package has the largest version number?"]]></title><description><![CDATA[
<p>So 19494 is the largest? That's far lower than I expected. There's nobody out there that has put a date in a version number (e.g., 20250915)?</p>
]]></description><pubDate>Mon, 15 Sep 2025 07:37:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=45247110</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=45247110</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45247110</guid></item><item><title><![CDATA[New comment by EdSchouten in "Show HN: What country you would hit if you went straight where you're pointing"]]></title><description><![CDATA[
<p>Soon available for people in the Vatican as well?</p>
]]></description><pubDate>Wed, 20 Aug 2025 20:12:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=44965920</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44965920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44965920</guid></item><item><title><![CDATA[New comment by EdSchouten in "Without the futex, it's futile"]]></title><description><![CDATA[
<p>Exactly! At the same time you also don't want to call into the kernel's internal malloc() whenever a thread ends up blocking on a lock to allocate the data structures that are needed to keep track of queues of blocked threads for a given lock.<p>To prevent that, many operating systems allocate one of these 'queue objects' whenever a thread is created and attach a pointer to it to the thread object. Whenever a thread then stumbles upon a contended lock, it will effectively 'donate' this queue object to that lock, meaning that every lock having one or more waiters will have a linked list of 'queue objects' attached to it. When threads are woken up, they will each take one of those objects with them on the way out. But there's no guarantee that they will get their own queue object back; they may get shuffled! So by the time a thread terminates, it will free one of those objects, but that may not necessarily be the one it created.<p>I think the first operating system to use this method was Solaris. There they called these 'queue objects' turnstiles. The BSDs adopted the same approach, and kept the same name.<p><a href="https://www.oreilly.com/library/view/solaristm-internals-core/0130224960/0130224960_ch03lev1sec7.html" rel="nofollow">https://www.oreilly.com/library/view/solaristm-internals-cor...</a><p><a href="https://www.bsdcan.org/2012/schedule/attachments/195_locking.pdf" rel="nofollow">https://www.bsdcan.org/2012/schedule/attachments/195_locking...</a></p>
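<p>A toy model of that donation scheme in Python. It is heavily simplified and is not how Solaris or the BSDs implement turnstiles; the class and method names are made up, and the only goal is to show why a thread may leave with a different queue object than the one it arrived with.</p>

```python
from collections import deque


class Turnstile:
    def __init__(self, label):
        self.label = label


class Thread:
    def __init__(self, name):
        self.name = name
        # Allocated at thread creation, so blocking never has to allocate.
        self.turnstile = Turnstile("turnstile-of-" + name)


class Lock:
    def __init__(self):
        self.waiters = deque()  # blocked threads, woken in FIFO order
        self.turnstiles = []    # queue objects donated by the waiters

    def block(self, thread):
        # The blocking thread donates its queue object to the lock.
        self.turnstiles.append(thread.turnstile)
        thread.turnstile = None
        self.waiters.append(thread)

    def wake_one(self):
        # The woken thread takes *a* queue object, not necessarily its own.
        thread = self.waiters.popleft()
        thread.turnstile = self.turnstiles.pop()
        return thread
```

<p>If threads a and b block in that order and are then woken in the same order, a leaves with the object that b allocated and vice versa. The total number of queue objects always equals the number of threads.</p>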
]]></description><pubDate>Tue, 19 Aug 2025 15:09:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=44952458</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44952458</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44952458</guid></item><item><title><![CDATA[New comment by EdSchouten in "QUIC for the kernel"]]></title><description><![CDATA[
<p>> Calls to bind(), connect(), listen(), and accept() can be used to initiate and accept connections in much the same way as with TCP, but then things diverge a bit. [...] The sendmsg() and recvmsg() system calls are used to carry out that setup<p>I wish the article explained why this approach was chosen, as opposed to adding a dedicated system call API that matches the semantics of QUIC.</p>
]]></description><pubDate>Fri, 01 Aug 2025 14:34:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=44757391</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44757391</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44757391</guid></item><item><title><![CDATA[New comment by EdSchouten in "From XML to JSON to CBOR"]]></title><description><![CDATA[
<p>That's not entirely true: with CBOR you can add custom data types through custom tags. A central registry of them is here:<p><a href="https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml" rel="nofollow">https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml</a><p>This is, for example, used by IPLD (<a href="https://ipld.io" rel="nofollow">https://ipld.io</a>) to express references between objects through native types (<a href="https://github.com/ipld/cid-cbor/">https://github.com/ipld/cid-cbor/</a>).</p>
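<p>To illustrate what a tag looks like on the wire, here is a hand-rolled Python sketch of the head encoding from RFC 8949. The helper names are made up for this example, and the payload below is a placeholder rather than a real CID.</p>

```python
import struct


def encode_head(major_type, value):
    """Encode a CBOR initial byte plus its argument (RFC 8949, section 3)."""
    prefix = major_type << 5
    if value < 24:
        return bytes([prefix | value])
    for additional_info, fmt in ((24, ">B"), (25, ">H"), (26, ">I"), (27, ">Q")):
        if value < 1 << (8 * struct.calcsize(fmt)):
            return bytes([prefix | additional_info]) + struct.pack(fmt, value)
    raise ValueError("value does not fit in 64 bits")


def encode_bytes(payload):
    return encode_head(2, len(payload)) + payload  # major type 2: byte string


def encode_tagged(tag, encoded_item):
    return encode_head(6, tag) + encoded_item  # major type 6: tag
```

<p>Tag 42 is the number the IANA registry lists for IPLD content identifiers. With the placeholder payload, encode_tagged(42, encode_bytes(b"\x00abc")) yields the bytes d8 2a 44 00 61 62 63: the tag head, the byte string head, then the payload.</p>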
]]></description><pubDate>Wed, 30 Jul 2025 12:47:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=44733521</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44733521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44733521</guid></item><item><title><![CDATA[New comment by EdSchouten in "SF may soon ban natural gas in homes and businesses undergoing major renovations"]]></title><description><![CDATA[
<p>Those plants likely have a higher efficiency than a gas powered stove, so may be worth it regardless?</p>
]]></description><pubDate>Mon, 28 Jul 2025 14:52:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44711463</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44711463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44711463</guid></item><item><title><![CDATA[New comment by EdSchouten in "3-JSON"]]></title><description><![CDATA[
<p>If only there was a variant of execve() / posix_spawn() that simply took a literal array of which file descriptors would need to be present in the new process. So that you can say:<p><pre><code>    int subprocess_stdin = open("/dev/null", O_RDONLY);
    int subprocess_stdout = open("some_output", O_WRONLY);
    int subprocess_stderr = STDERR_FILENO; // Let the subprocess use the same stderr as me.
    int subprocess_fds[] = {subprocess_stdin, subprocess_stdout, subprocess_stderr};
    posix_spawn_with_fds("my process", [...], subprocess_fds, 3);
</code></pre>
Never understood why POSIX makes all of this so hard.</p>
]]></description><pubDate>Fri, 25 Jul 2025 13:11:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=44682760</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44682760</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44682760</guid></item><item><title><![CDATA[New comment by EdSchouten in "Fixing a Direct3D9 bug in Far Cry (2018)"]]></title><description><![CDATA[
<p>It looks like @HoussemNasri forked @CookiePLMonster's website repo:<p><a href="https://github.com/HoussemNasri/HoussemNasri.github.io">https://github.com/HoussemNasri/HoussemNasri.github.io</a><p>Maybe that person simply wanted to have a nice template to work with, but forgot to scrub all the old content?</p>
]]></description><pubDate>Fri, 18 Jul 2025 16:03:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=44606287</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44606287</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44606287</guid></item><item><title><![CDATA[New comment by EdSchouten in "Phrase origin: Why do we "call" functions?"]]></title><description><![CDATA[
<p>Same in Dutch. “Oproepen” means “to summon”. We would use “aanroepen”.</p>
]]></description><pubDate>Thu, 10 Jul 2025 04:32:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=44517194</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=44517194</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44517194</guid></item><item><title><![CDATA[New comment by EdSchouten in "Show HN: memEx, a personal knowledge base inspired by zettlekasten and org-mode"]]></title><description><![CDATA[
<p>zettELkasten</p>
]]></description><pubDate>Sun, 13 Apr 2025 05:07:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=43670229</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=43670229</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43670229</guid></item><item><title><![CDATA[New comment by EdSchouten in "A curious case of O(N^2) behavior which should be O(N) (2023)"]]></title><description><![CDATA[
<p>Just like “this entire meeting could have been an email”, all this effort writing this gist could have been spent creating a PR.</p>
]]></description><pubDate>Sun, 22 Dec 2024 05:14:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=42484516</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=42484516</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42484516</guid></item><item><title><![CDATA[New comment by EdSchouten in "ICC issues warrants for Netanyahu, Gallant, and Hamas officials"]]></title><description><![CDATA[
<p>> The Dutch have a very lackadaisical attitude to law<p>What do you mean by this specifically?</p>
]]></description><pubDate>Fri, 22 Nov 2024 05:30:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=42211421</link><dc:creator>EdSchouten</dc:creator><comments>https://news.ycombinator.com/item?id=42211421</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42211421</guid></item></channel></rss>