Hacker News: trishume

New comment by trishume in "Anthropic's original take home assignment open sourced"

trishume — Wed, 21 Jan 2026 17:06:37 +0000

Author of the take-home here: That's quite a good cycle count, substantially better than Claude's. You should email it to performance-recruiting@anthropic.com.

New comment by trishume in "All my favorite tracing tools"

trishume — Wed, 06 Dec 2023 08:27:11 +0000

I haven't actually used bpftrace myself, only BCC. I can totally imagine it being more janky than DTrace; BCC is pretty janky even if I also think it's cool. In my eBPF tracing framework I had to add special handling counters to alert you if it ever lost any events, and it's plausible bpftrace didn't do that.
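
A minimal sketch of the lost-event counter idea, in plain Python rather than eBPF. This is not the framework mentioned above, nor how BCC or bpftrace work internally; the class name `BoundedEventBuffer` and its methods are hypothetical. It only illustrates the pattern: when a bounded buffer is full, count the drop instead of losing the event silently, and report the count to the consumer.

```python
from collections import deque


class BoundedEventBuffer:
    """Hypothetical bounded buffer that counts events it had to drop."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.lost = 0  # events dropped since the last drain

    def push(self, event):
        """Store the event, or count it as lost if the buffer is full."""
        if len(self.events) >= self.capacity:
            self.lost += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Return (buffered events, number lost) and reset both."""
        drained = list(self.events)
        self.events.clear()
        lost, self.lost = self.lost, 0
        return drained, lost


buf = BoundedEventBuffer(capacity=2)
for i in range(3):
    buf.push(i)
events, lost = buf.drain()
if lost:
    print(f"warning: {lost} event(s) lost, trace is incomplete")
```

The consumer checks the counter on every drain, so a gap in the trace is always visible instead of looking like a quiet period.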

New comment by trishume in "All my favorite tracing tools"

trishume — Wed, 06 Dec 2023 06:51:35 +0000

Do you know of anyone who's built that kind of time travel debugging with a trace visualization in the open outside of JavaScript? I know about rr and Pernosco but don't know of trace visualization integration for either of them, that would indeed be very cool. I definitely dream of having systems like this.

All my favorite tracing tools

trishume — Tue, 05 Dec 2023 22:56:29 +0000

Article URL: https://thume.ca/2023/12/02/tracing-methods/

Comments URL: https://news.ycombinator.com/item?id=38538111

Points: 355

# Comments: 40

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 23:56:40 +0000

To me most interesting are factors I didn't consider in features I did cover. Next most interesting are features I didn't cover which are kinda core to Twitter being good, and also pose interesting performance problems, like the person who mentioned spam/abuse detection. After that are non-core features which pose interesting performance problems that are different from problems I already covered.

The comments that I think aren't contributing much are ones that mention features that I didn't cover but make no attempt to argue that they're actually hard to implement efficiently, or that assert that because I didn't implement something it isn't feasible to make as fast as I calculate, without arguing what would actually stop me from implementing something that efficient. Or ones that repeat that this isn't practical, which I say at length in the post.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 23:28:18 +0000

That's really cool! Each year of historical images I estimate at 2.8PB, so it would need to scale quite far to handle multiple years. How would you actually connect all those external drive chassis, is there some kind of chainable SAS or PCIe that can scale arbitrarily far? I consider NVMe-over-fabrics to be cheating and just using multiple machines and calling it one machine, but "one machine" is kinda an arbitrary stunt metric.
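
As a rough sketch of the napkin math behind that question: the 2.8PB-per-year figure is from the comment above, while the 20TB drive capacity is a hypothetical assumption chosen only for illustration, and the function names are made up. Replication and filesystem overhead are ignored.

```python
# 2.8 PB per year of historical images (figure from the comment),
# expressed in decimal terabytes to keep the arithmetic in integers.
IMAGE_TB_PER_YEAR = 2800


def image_storage_tb(years):
    """Raw storage needed for `years` of historical images, in TB."""
    return years * IMAGE_TB_PER_YEAR


def drives_needed(years, drive_tb=20):
    """Drives required, assuming a hypothetical `drive_tb` TB per drive."""
    total = image_storage_tb(years)
    return -(-total // drive_tb)  # ceiling division on integers


for years in (1, 3, 5):
    print(years, "years:", image_storage_tb(years), "TB,",
          drives_needed(years), "drives")
```

Under that assumed drive size, one year already needs on the order of a hundred drives, which is why the number of drive bays you can attach becomes the limiting factor.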

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 23:23:05 +0000

I have fantasized about doing this as a startup, basically doing cache coherency protocols at the page table level with RDMA. There are some academic systems that do something like it but without the hypervisor part.

My joke fantasy startup is a cloud provider called one.computer where you just have a slider for the number of cores on your single instance, and it gives you a standard Linux system that appears to have 10k cores. Most multithreaded software would absolutely thrash the cache-coherency protocols and have poor performance, but it might be useful to easily turn embarrassingly parallel threaded map-reduces into multi-machine ones.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 22:04:27 +0000

I think I'm pretty careful to say that this is a simplified version of Twitter. Of the features you list:

- spam detection: I agree this is a reasonably core feature and a good point. I think you could fit something here but you'd have to architect your entire spam detection approach around being able to fit, which is a pretty tricky constraint and probably would make it perform worse than a less constrained solution. Similar to ML timelines.

- ad relevance: Not a core feature if your costs are low enough. But see the ML estimates for how much throughput A100s have at dot producting ML embeddings.

- web previews: I'd do this by making it the client's responsibility. You'd lose trustworthiness though, so users with hacked clients could make troll web previews. They can already do that for a site they control, but not for a general site.

- blocks/mutes: Not a concern for the main timeline other than when using ML. When looking at replies you will need to fetch blocks/mutes and filter. Whether this costs too much depends on how frequently people look at replies.

I'm fully aware that real Twitter has bajillions of features that I don't investigate, and you couldn't fit all of them on one machine. Many of them make up such a small fraction of load that you could still fit them. Others do indeed pose challenges, but ones similar to features I'd already discussed.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 21:48:57 +0000

Which I think I'm perfectly clear about in the blog post. The post is mostly about napkin math systems analysis, which does cover HTTP and HTTPS.

I'm now somewhat confident I could implement this if I tried, but it would take many years. The prototype and math are to check whether there's anything that would stop me if I tried, and to be a fun blog post about what systems are capable of.

I've worked on a team building a system to handle millions of messages per second per machine, and spending weeks doing math and building performance prototypes like this is exactly what we did before we built it for real.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 21:34:28 +0000

I agree most HTTP server benchmarks are highly misleading in that way, and mention in my post how disappointed I am at the lack of good benchmarks. I also agree that typical HTTP servers would fall over at much lower new connection loads.

I'm talking about a hypothetical HTTPS server that used optimized kernel-bypass networking. Here's a kernel-bypass HTTP server benchmarked doing 50k new connections per core second while re-using nginx code: https://github.com/F-Stack/f-stack. But I don't know of anyone who's done something similar with HTTPS support.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 21:27:58 +0000

Yah like I say in the post, the exactly-one-machine thing is just for fun and as an illustration of how far vertical scaling can go. Practically I'd definitely scale storage with many sharded smaller storage servers.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 21:25:11 +0000

Quote tweets I'd do as a reference and they'd basically have the cost of loading 2 tweets instead of one, so increasing the delivery rate by the fraction of tweets that are quote tweets.

Hashtags are a search feature and basically need the same posting lists as for search, but if you only support hashtags the posting lists are smaller. I already have an estimate saying probably search wouldn't fit. But I think hashtag-only search might fit, mainly because my impression is people doing hashtag searches are a small fraction of traffic nowadays so the main cost is disk, not sure though.

I did run the post by 5 ex-Twitter engineers and none of them said any of my estimates were super wrong, mainly just brought up additional features and things I didn't discuss (which I edited into the post before publishing). Still possible that they just didn't divulge, or didn't know, some number that I estimated very wrong.
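
The quote-tweet estimate in this comment can be written as a one-line calculation. This is only a sketch of that reasoning: the function name and the example numbers are hypothetical, not figures from the post. If a quote tweet costs two tweet loads instead of one, the load rate grows by the fraction of delivered tweets that are quote tweets.

```python
def effective_delivery_rate(base_rate, quote_fraction):
    """Tweet loads per second once quote tweets are counted as two loads.

    base_rate: delivered tweets per second (hypothetical input).
    quote_fraction: fraction of delivered tweets that are quote tweets.
    """
    if not 0.0 <= quote_fraction <= 1.0:
        raise ValueError("quote_fraction must be between 0 and 1")
    return base_rate * (1.0 + quote_fraction)


# Example with made-up numbers: a quarter of tweets being quote tweets
# raises the load rate by 25%.
print(effective_delivery_rate(1000.0, 0.25))
```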

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 20:54:26 +0000

My friend mentioned this just before I published and I think that probably is the fastest, largest thing you can get which would in some sense count as one machine. I haven't looked into it, but I wouldn't be surprised if they could get around the trickiest constraint, which is how many hard drives you can plug into a non-mainframe machine for historical image storage. Definitely more expensive than just networking a few standard machines though.

I also bet that mainframes have software solutions to a lot of the multi-tenancy and fault tolerance challenges with running systems on one machine that I mention.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 20:46:30 +0000

I specifically assumed a max tweet size based on the maximum number of UTF-8 bytes a tweet can contain (560), with a link to an analysis of that, and discussion of how you could optimize for the common case of tweets that contain way fewer UTF-8 bytes than that. Everything in my post assumes unicode.

New comment by trishume in "Production Twitter on one machine? 100Gbps NICs and NVMe are fast"

trishume — Sat, 07 Jan 2023 20:18:49 +0000

As the author, this sounds good to me! I'll probably even change the actual title to match. I originally was going to make it a question mark and the only reason I didn't is https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headline... when I think the answer is probably "could probably be somewhat done" rather than "no".

Production Twitter on one machine? 100Gbps NICs and NVMe are fast

trishume — Sat, 07 Jan 2023 18:46:51 +0000

Article URL: https://thume.ca/2023/01/02/one-machine-twitter/

Comments URL: https://news.ycombinator.com/item?id=34291191

Points: 776

# Comments: 477

New comment by trishume in "My DIY ergonomic travel workstation with aluminum and magnets"

trishume — Wed, 09 Nov 2022 17:17:23 +0000

Oooh nice! Your Kyria posts are actually where I first learned about how awesome and cheap SendCutSend is, and got some of the inspiration for the magnets.

I actually ordered a plain steel plate first, but I realized that given that I needed them in a different position to travel compactly than the wide position I like for typing, I wanted them to snap in consistent positions so it wasn't finicky to line up exactly how I liked them.

My DIY ergonomic travel workstation with aluminum and magnets

trishume — Wed, 09 Nov 2022 03:55:44 +0000

Article URL: https://thume.ca/2022/11/06/diy-travel-work-setup/

Comments URL: https://news.ycombinator.com/item?id=33527356

Points: 215

# Comments: 89

New comment by trishume in "Aquila: A unified, low-latency fabric for datacenter networks"

trishume — Wed, 19 Oct 2022 03:06:36 +0000

The latency numbers they state seem achievable or beatable with Infiniband, Amazon's EFA, or TCPDirect. 2us round-trip is achievable for very simple systems. If this kind of networking sounds good to you, you can buy it today! It's even available on AWS, Azure and Oracle Cloud (but not GCP yet AFAIK).

New comment by trishume in "Difftastic, the fantastic diff"

trishume — Wed, 07 Sep 2022 06:13:54 +0000

This is a really cool example of tree diffing via path finding. I noticed that this was the approach I used when I did tree diffing, and sure enough looks like this was inspired by autochrome which was inspired by my post (https://thume.ca/2017/06/17/tree-diffing/).

I'm curious exactly why A* failed here. It worked great for me, as long as you design a good heuristic. I imagine it might have been complicated to design a good heuristic with an expanded move set. I see autochrome had to abandon A* and has an explanation of why, but that explanation shouldn't apply to difftastic I think.
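
To make "diffing via path finding" concrete, here is a minimal sketch on flat sequences rather than trees. It is not the algorithm from difftastic, autochrome, or the linked post; the function name and the heuristic are assumptions chosen for illustration. States are positions `(i, j)` in the two inputs, matching elements advance both for free, and a deletion or insertion costs 1. The heuristic is the difference in remaining lengths, which never overestimates because at least that many insertions or deletions are still required.

```python
import heapq


def astar_edit_distance(a, b):
    """Minimum number of insertions plus deletions turning a into b."""
    n, m = len(a), len(b)

    def heuristic(i, j):
        # Lower bound: the leftover lengths must be equalised somehow.
        return abs((n - i) - (m - j))

    best = {(0, 0): 0}  # cheapest known cost to reach each state
    frontier = [(heuristic(0, 0), 0, 0, 0)]  # entries are (f, g, i, j)
    while frontier:
        _, g, i, j = heapq.heappop(frontier)
        if (i, j) == (n, m):
            return g
        if g > best.get((i, j), float("inf")):
            continue  # stale queue entry, a cheaper path was found
        moves = []
        if i < n and j < m and a[i] == b[j]:
            moves.append((i + 1, j + 1, 0))  # keep a matching element
        if i < n:
            moves.append((i + 1, j, 1))  # delete a[i]
        if j < m:
            moves.append((i, j + 1, 1))  # insert b[j]
        for ni, nj, cost in moves:
            ng = g + cost
            if ng < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = ng
                heapq.heappush(frontier, (ng + heuristic(ni, nj), ng, ni, nj))
    return None


print(astar_edit_distance("abc", "abd"))
```

A tree differ following the same pattern would enlarge the move set (for example entering or leaving a subtree), and the heuristic has to stay a lower bound under those extra moves, which is where the design difficulty mentioned above comes in.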