Hacker News: charleshn

New comment by charleshn in "Open sourcing Dicer: Databricks's auto-sharder"

charleshn — Wed, 14 Jan 2026 00:01:55 +0000

> Application pods learn the current assignment through a library called the Slicelet (S for server side). The Slicelet maintains a local cache of the latest assignment by fetching it from the Dicer service and watching for updates. When it receives an updated assignment, the Slicelet notifies the application via a listener API.

For a critical control plane component like this, I tend to prefer a constant work pattern [0], to avoid metastable failures [1], e.g. periodically pull the data instead of relying on notifications.

[0] https://aws.amazon.com/builders-library/reliability-and-cons...

[1] https://brooker.co.za/blog/2021/05/24/metastable.html
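A minimal sketch of the constant work idea, not Dicer's or the Slicelet's actual API: the poller does the same full fetch and full apply on every tick, whether or not anything changed. The names `ConstantWorkPoller`, `fetch`, `apply` and the `Assignment` shape are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional

# Hypothetical shape: key range -> pod name.
Assignment = Dict[str, str]


class ConstantWorkPoller:
    """Fetches the full assignment on every tick, whether or not it changed."""

    def __init__(self, fetch: Callable[[], Assignment],
                 apply: Callable[[Assignment], None]) -> None:
        self._fetch = fetch
        self._apply = apply
        self.current: Optional[Assignment] = None

    def tick(self) -> None:
        # Same amount of work on every tick: one full fetch, one full apply.
        # If the fetch fails, keep serving the last known assignment.
        try:
            latest = self._fetch()
        except Exception:
            return
        self.current = latest
        self._apply(latest)

    def run(self, interval_s: float, ticks: int) -> None:
        # Fixed cadence; load on the control plane does not depend on churn.
        for _ in range(ticks):
            self.tick()
            time.sleep(interval_s)
```

The load this puts on the assignment service depends only on the number of pods and the interval, so a burst of assignment changes does not turn into a burst of notifications.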

What Does a Database for SSDs Look Like?

charleshn — Sat, 20 Dec 2025 10:13:33 +0000

Article URL: https://brooker.co.za/blog/2025/12/15/database-for-ssd.html

Comments URL: https://news.ycombinator.com/item?id=46334990

Points: 148

# Comments: 121

New comment by charleshn in "AMD officially confirms fresh next-gen Zen 6 CPU details"

charleshn — Fri, 19 Dec 2025 18:39:22 +0000

They should reintroduce the 3D V-Cache [0] variants (X) in EPYC, with a higher cache/core ratio, which were present in EPYC4 (e.g. 9684X [1]) but for some reason weren't available in EPYC5.

It makes a massive difference at high density and utilisation; with the standard cache/core ratio, performance can really degrade under load.

[0] https://www.amd.com/en/products/processors/technologies/3d-v...

[1] https://www.amd.com/en/products/processors/server/epyc/4th-g...

New comment by charleshn in "The highest quality codebase"

charleshn — Sat, 13 Dec 2025 09:12:46 +0000

It's fundamentally because of verifier's law [0].

Current AI, in particular RL-based AI, already achieves or will soon achieve superhuman performance on problems that can be quickly verified and measured.

So maths, algorithms, etc. and well-defined bugs fall into that category.

However, architectural decisions, design, and long-term planning, where there is little data, no model allowing synthetic data generation, and long iteration cycles, are not so amenable to it.

[0] https://www.jasonwei.net/blog/asymmetry-of-verification-and-...
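As an illustration of a problem that can be quickly verified, here is a sketch using integer factorisation, which is my own choice of example and not one taken from the linked essay. Checking a proposed answer is one multiplication, while finding it is much harder, so a binary reward is cheap to compute. The function names are hypothetical.

```python
from math import prod
from typing import Sequence


def verify_factorisation(n: int, factors: Sequence[int]) -> bool:
    """Cheap check: at least two non-trivial factors whose product equals n."""
    return (
        len(factors) >= 2
        and all(1 < f < n for f in factors)
        and prod(factors) == n
    )


def reward(n: int, proposed: Sequence[int]) -> float:
    """Binary reward of the kind an RL loop could optimise against."""
    return 1.0 if verify_factorisation(n, proposed) else 0.0
```

Architectural decisions have no equivalent of `verify_factorisation`: there is no cheap function that returns a trustworthy score for a design.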

New comment by charleshn in "Spinlocks vs. Mutexes: When to Spin and When to Sleep"

charleshn — Mon, 08 Dec 2025 04:04:16 +0000

> std::hardware_destructive_interference_size Exists so you don't have to guess, although in practice it'll basically always be 64.

Unfortunately it's not quite true, due to e.g. spatial prefetching [0]. See e.g. Folly's definition [1].

[0] https://community.intel.com/t5/Intel-Moderncode-for-Parallel...

[1] https://github.com/facebook/folly/blob/d2e6fe65dfd6b30a9d504...
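A rough sketch of what padding beyond 64 bytes looks like, written in Python with `ctypes` rather than C++. The value 128 (two adjacent 64-byte lines) is an assumption chosen for illustration; check the target architecture before relying on it. `PaddedCounter` is a hypothetical name.

```python
import ctypes

# Assumption: pad to 128 bytes (two adjacent 64-byte lines), not 64.
DESTRUCTIVE_INTERFERENCE_SIZE = 128


class PaddedCounter(ctypes.Structure):
    """One 8-byte counter followed by padding up to the chosen size."""
    _fields_ = [
        ("value", ctypes.c_uint64),
        ("_pad", ctypes.c_char
         * (DESTRUCTIVE_INTERFERENCE_SIZE - ctypes.sizeof(ctypes.c_uint64))),
    ]


# An array of counters, e.g. one per worker process in shared memory.
# This guarantees the spacing between counters, not the alignment of the base.
counters = (PaddedCounter * 4)()
stride = ctypes.addressof(counters[1]) - ctypes.addressof(counters[0])
```

With a 128-byte stride, two neighbouring counters can never land in the same aligned 128-byte block, which is what matters if the prefetcher pulls in adjacent line pairs.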

New comment by charleshn in "TPUs vs. GPUs and why Google is positioned to win AI race in the long term"

charleshn — Sat, 29 Nov 2025 00:14:26 +0000

> There's a good reason so much research is done on Nvidia clusters and not TPU clusters.

You are aware that Gemini was trained on TPUs, and that most research at DeepMind is done on TPUs?

Collective Communication for 100k+ GPUs

charleshn — Fri, 31 Oct 2025 00:11:03 +0000

Article URL: https://arxiv.org/abs/2510.20171

Comments URL: https://news.ycombinator.com/item?id=45766956

Points: 1

# Comments: 0

New comment by charleshn in "Tinnitus Neuromodulator"

charleshn — Sat, 18 Oct 2025 18:48:46 +0000

I can relate.

I had tinnitus for over 10 years. My tinnitus was not the usual ringing type, it was some sort of humming, low frequency noise. The frequency was not constant, it could vary. It could sometimes stop for 5-10 minutes, e.g. after a hot bath.

Went to see many specialists, tried everything, to no avail.

One day I started experiencing recurrent tension and light pain in my neck and shoulder blades, so I started doing some neck and shoulder blades stretches several times a day.

After a few weeks, the pain was gone, and I realised the tinnitus had stopped. This was maybe 2 years ago (I am still doing those exercises multiple times a day).

New comment by charleshn in "TernFS – An exabyte scale, multi-region distributed filesystem"

charleshn — Thu, 18 Sep 2025 22:50:14 +0000

A few questions if the authors are around!

> Is hardware agnostic and uses TCP/IP to communicate.

So no RDMA? It's very hard to make effective use of modern NVMe drives' bandwidth over TCP/IP.

> A logical shard is further split into five physical instances, one leader and four followers, in a typical distributed consensus setup. The distributed consensus engine is provided by a purpose-built Raft-like implementation, which we call LogsDB

Raft-like, so not Raft but a custom algorithm? Implementing distributed consensus correctly from scratch is very hard - why not use a battle-tested implementation?

> Read/write access to the block service is provided using a simple TCP API currently implemented by a Go process. This process is hardware agnostic and uses the Go standard library to read and write blocks to a conventional local file system. We originally planned to rewrite the Go process in C++, and possibly write to block devices directly, but the idiomatic Go implementation has proven performant enough for our needs so far.

The document mentions it's designed to reach TB/s, though, which means that for an IO-intensive workload one would end up wasting a lot of drive bandwidth and requiring a huge number of nodes.

Modern parallel filesystems can reach 80-90GB/s per node, using RDMA, DPDK etc.
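To make the node-count point concrete, here is a back-of-the-envelope sketch. The 85 GB/s figure is the mid-point of the 80-90GB/s range above; the 10 GB/s figure is purely hypothetical, not a measured TernFS number.

```python
import math


def nodes_needed(target_gb_s: float, per_node_gb_s: float) -> int:
    """Minimum node count to reach an aggregate throughput target."""
    return math.ceil(target_gb_s / per_node_gb_s)


TARGET_GB_S = 1000.0  # 1 TB/s aggregate

# Mid-point of the 80-90 GB/s per-node range quoted above.
fast = nodes_needed(TARGET_GB_S, 85.0)

# Hypothetical per-node figure, for illustration only.
slow = nodes_needed(TARGET_GB_S, 10.0)
```

Under these assumptions the faster nodes reach 1 TB/s with 12 machines, while the slower ones need 100.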

> This is in contrast to protocols like NFS, whereby each connection is very stateful, holding resources such as open files, locks, and so on.

This is not true for NFSv3 and older, which tend to be stateless (no notion of an open file).

No mention of the way this was developed and tested - did it use formal methods, a simulator, chaos engineering, etc.?

New comment by charleshn in "Strong Eventual Consistency – The Big Idea Behind CRDTs"

charleshn — Wed, 10 Sep 2025 11:02:11 +0000

Interesting that neither the article nor the comments mention the CALM theorem [0], which gives a framework to explain when coordination-free consistency is possible, and is arguably the big idea behind SEC.

[0] https://arxiv.org/abs/1901.01930
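The CALM theorem says a program has a consistent, coordination-free distributed implementation if and only if it is monotonic. A minimal sketch of one monotonic case, a grow-only set whose merge is set union, shows replicas converging regardless of delivery order. The function names are hypothetical.

```python
from itertools import permutations
from typing import FrozenSet, Iterable


def merge(a: FrozenSet[str], b: FrozenSet[str]) -> FrozenSet[str]:
    """Grow-only set merge: union is commutative, associative and idempotent."""
    return a | b


def converge(updates: Iterable[FrozenSet[str]]) -> FrozenSet[str]:
    """Fold a sequence of updates into a replica state, in the order given."""
    state: FrozenSet[str] = frozenset()
    for update in updates:
        state = merge(state, update)
    return state


updates = [frozenset({"x"}), frozenset({"y"}), frozenset({"x", "z"})]

# Apply the same updates in every possible order and collect the end states.
results = {converge(order) for order in permutations(updates)}
```

Every ordering ends in the same state because the set only ever grows; adding removal breaks monotonicity, which is where coordination or extra metadata comes in.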

New comment by charleshn in "The Bitter Lesson Is Misunderstood"

charleshn — Wed, 03 Sep 2025 23:16:18 +0000

You can have a look at the DeepSeek paper, in particular section "2.2 DeepSeek-R1-Zero: Reinforcement Learning on the Base Model".

But generally the idea is that you need some notion of reward, verifiers, etc.

Works really well for maths, algorithms, and many things actually.

See also this very short essay/introduction: https://www.jasonwei.net/blog/asymmetry-of-verification-and-...

That's why we have IMO gold-level models now, and I'm pretty confident we'll have superhuman models for mathematics, algorithms, etc. before long.

Now domains which are very hard to verify - think e.g. theoretical physics etc - that's another story.

New comment by charleshn in "The Bitter Lesson Is Misunderstood"

charleshn — Wed, 03 Sep 2025 22:53:44 +0000

> We cannot add more compute to a given compute budget C without increasing data D to maintain the relationship. > We must either (1) discover new architectures with different scaling laws, and/or (2) compute new synthetic data that can contribute to learning (akin to dreams).

Of course we can, this is a non-issue.

See e.g. AlphaZero [0] that's 8 years old at this point, and any modern RL training using synthetic data, e.g. DeepSeek-R1-Zero [1].

[0] https://en.m.wikipedia.org/wiki/AlphaZero

[1] https://arxiv.org/abs/2501.12948

New comment by charleshn in "How to Think About GPUs"

charleshn — Wed, 20 Aug 2025 14:31:17 +0000

Yes, 450GB/s is the per-GPU bandwidth in the NVLink domain. 3.2Tbps is the per-host bandwidth in the scale-out IB/Ethernet domain.
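The two figures use different units (bytes vs bits) and different scopes (per GPU vs per host), so a quick conversion helps. This sketch assumes 8 GPUs per host, which is not stated above.

```python
NVLINK_GB_S_PER_GPU = 450.0      # from the comment: per GPU, NVLink domain
SCALE_OUT_GBIT_S_PER_HOST = 3200  # from the comment: 3.2 Tbps per host
GPUS_PER_HOST = 8                 # assumption, not stated in the comment

# Convert bits to bytes, then split the host bandwidth across its GPUs.
scale_out_gb_s_per_host = SCALE_OUT_GBIT_S_PER_HOST / 8
scale_out_gb_s_per_gpu = scale_out_gb_s_per_host / GPUS_PER_HOST

# How much faster the scale-up domain is than scale-out, per GPU.
ratio = NVLINK_GB_S_PER_GPU / scale_out_gb_s_per_gpu
```

Under that assumption the scale-out network gives 400 GB/s per host, or 50 GB/s per GPU, so NVLink is about 9x faster per GPU.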

Demystifying NCCL: An In-Depth Analysis of GPU Communication Protocols and Algos

charleshn — Wed, 13 Aug 2025 22:04:27 +0000

Article URL: https://arxiv.org/abs/2507.04786

Comments URL: https://news.ycombinator.com/item?id=44894413

Points: 1

# Comments: 0

New comment by charleshn in "The Surprising gRPC Client Bottleneck in Low-Latency Networks"

charleshn — Wed, 23 Jul 2025 23:47:46 +0000

Could you check the value of your kernel's net.ipv4.tcp_slow_start_after_idle sysctl, and if it's non-zero, set it to 0?
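A small sketch of that check, reading the sysctl through its procfs path on Linux. The helper names are hypothetical, and the write needs root and does not persist across reboots.

```python
from pathlib import Path

# Standard procfs location of this sysctl on Linux.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_slow_start_after_idle")


def slow_start_after_idle_enabled(path: Path = SYSCTL_PATH) -> bool:
    """True when the kernel shrinks the congestion window after idle periods."""
    return int(path.read_text().strip()) != 0


def disable_slow_start_after_idle(path: Path = SYSCTL_PATH) -> None:
    """Write 0 to the sysctl; needs root and is lost on reboot."""
    path.write_text("0\n")
```

The shell equivalent is `sysctl net.ipv4.tcp_slow_start_after_idle` to read and `sysctl -w net.ipv4.tcp_slow_start_after_idle=0` to set.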

New comment by charleshn in "AI capex is so big that it's affecting economic statistics"

charleshn — Sat, 19 Jul 2025 10:26:54 +0000

You can now add getting gold at IMO [0] to the above list.

[0] https://x.com/alexwei_/status/1946477742855532918

New comment by charleshn in "AI capex is so big that it's affecting economic statistics"

charleshn — Sat, 19 Jul 2025 10:23:29 +0000

Frontier models went from not being able to count the number of 'r's in "strawberry" to getting gold at IMO in under 2 years [0], and people keep repeating the same clichés such as "LLMs can't reason" or "they're just next token predictors".

At this point, I think it can only be explained by ignorance, bad faith, or fear of becoming irrelevant.

[0] https://x.com/alexwei_/status/1946477742855532918

New comment by charleshn in "AI capex is so big that it's affecting economic statistics"

charleshn — Fri, 18 Jul 2025 22:04:28 +0000

I mentioned algorithms, not software engineering, precisely for that reason.

But the next step is obviously increased formalism via formal methods, deterministic simulators, etc., basically so that one could define an environment for an RL agent.

New comment by charleshn in "AI capex is so big that it's affecting economic statistics"

charleshn — Fri, 18 Jul 2025 21:53:51 +0000

I'm always surprised by the number of people posting here that are dismissive of AI and the obvious unstoppable progress.

Just looking at what happened with chess, Go, strategy games, protein folding, etc., it's obvious that pretty much any field/problem that can be formalised and cheaply verified - e.g. mathematics, algorithms, etc. - will be solved, and that it's only a matter of time before we have domain-specific ASI.

I strongly encourage everyone to read about the bitter lesson [0] and verifier's law [1].

[0] http://www.incompleteideas.net/IncIdeas/BitterLesson.html

[1] https://www.jasonwei.net/blog/asymmetry-of-verification-and-...

New comment by charleshn in "AI CapEx Is Eating the Economy"

charleshn — Fri, 18 Jul 2025 21:39:57 +0000

We do already have ASICs, see Google's TPU to get some cost estimates.

HBM is also very expensive.