Hacker News: nsteel

New comment by nsteel in "Our eighth generation TPUs: two chips for the agentic era"

nsteel — Fri, 24 Apr 2026 10:45:42 +0000

19.2Tb is 2 x 9.6Tbps. Some (most) companies count Tx and Rx separately despite it making no sense in the context of serdes lanes. Stupid marketing in my opinion.

I don't think they are meaningfully ahead; it's more to do with what's available at the time. 200/224G is only just becoming available this year. The others will have the same in their next product announcements.

New comment by nsteel in "Our eighth generation TPUs: two chips for the agentic era"

nsteel — Wed, 22 Apr 2026 16:38:35 +0000

Isn't it 6x 200Gb octals? An octal being 8x 200Gb lanes. So 9.6Tbps?
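
The arithmetic in the comment above can be checked quickly. This is a minimal sketch that takes the figures as stated in the comment (6 octals, 8 lanes per octal, 200Gb per lane); the variable names are illustrative and the octal count is the commenter's guess, not a confirmed spec.

```python
LANE_GBPS = 200        # per-lane rate, as quoted in the comment
LANES_PER_OCTAL = 8    # an octal is 8 lanes
OCTALS = 6             # count suggested in the comment, not confirmed

total_lanes = OCTALS * LANES_PER_OCTAL
total_gbps = total_lanes * LANE_GBPS
total_tbps = total_gbps / 1000

print(f"{total_lanes} lanes -> {total_tbps} Tbps")
```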

New comment by nsteel in "Our eighth generation TPUs: two chips for the agentic era"

nsteel — Wed, 22 Apr 2026 16:24:09 +0000

Apparently it's Broadcom for 8t and Mediatek for 8i.

https://wccftech.com/google-splits-tpuv8-strategy-two-chips-...

New comment by nsteel in "Our eighth generation TPUs: two chips for the agentic era"

nsteel — Wed, 22 Apr 2026 13:16:00 +0000

This link has more on the architecture: https://cloud.google.com/blog/products/compute/tpu-8t-and-tp...

New comment by nsteel in "Our eighth generation TPUs: two chips for the agentic era"

nsteel — Wed, 22 Apr 2026 13:14:46 +0000

It's HBM3e

New comment by nsteel in "Too much discussion of the XOR swap trick"

nsteel — Thu, 16 Apr 2026 10:36:42 +0000

It's similar to RAID schemes but instead of drive failure it's port unavailability. There's a reference at [1] or an FPGA-centric one at [2], though it applies anywhere dual/single-port RAMs are readily available but anything more exotic isn't.

  [1] Achieving Multi-Port Memory Performance on Single-Port Memory with Coding Techniques - https://arxiv.org/abs/2001.09599
  [2] https://people.csail.mit.edu/ml/pubs/fpga12_xor.pdf
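
A minimal software sketch of the RAID-like idea, assuming two single-port data banks plus one XOR parity bank. The class and method names are hypothetical, and this is a simplified illustration rather than the exact scheme from either reference: when two reads target the same bank in one cycle, one uses the bank's port and the other is rebuilt from the remaining bank and the parity bank.

```python
class XorCodedMemory:
    """Toy model: data banks A and B plus parity bank P, where P[i] == A[i] ^ B[i]."""

    def __init__(self, depth):
        self.a = [0] * depth
        self.b = [0] * depth
        self.p = [0] * depth

    def write(self, bank, addr, value):
        # Simplification: a write updates the data bank and the parity together.
        if bank == "a":
            self.a[addr] = value
        elif bank == "b":
            self.b[addr] = value
        else:
            raise ValueError("bank must be 'a' or 'b'")
        self.p[addr] = self.a[addr] ^ self.b[addr]

    def read_two_from_a(self, addr0, addr1):
        # Bank A's single port serves addr0 directly.
        direct = self.a[addr0]
        # Bank A's port is now busy, so addr1 is rebuilt from B and P,
        # each of which is touched only once in this "cycle".
        rebuilt = self.b[addr1] ^ self.p[addr1]
        return direct, rebuilt


mem = XorCodedMemory(depth=4)
mem.write("a", 1, 0x12)
mem.write("a", 2, 0x34)
mem.write("b", 2, 0xFF)
print(mem.read_two_from_a(1, 2))
```

Each bank is accessed at most once per pair of reads, which is the property that lets single-port RAMs stand in for a dual-read memory at the cost of one extra bank.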

New comment by nsteel in "Too much discussion of the XOR swap trick"

nsteel — Thu, 16 Apr 2026 08:52:21 +0000

Also to cheaply (area) create multi-port RAMs.

New comment by nsteel in "Anthropic expands partnership with Google and Broadcom for next-gen compute"

nsteel — Tue, 07 Apr 2026 07:53:09 +0000

And Broadcom designs a huge part of the chip. They take Google's (mostly) logical design and provide everything TSMC need to physically make the chip (including IP such as serdes, PLLs, and test).

New comment by nsteel in "Apple discontinues the Mac Pro"

nsteel — Tue, 31 Mar 2026 20:47:05 +0000

OK, but that's significantly slower and larger. It's a worse solution. Am I missing something?

New comment by nsteel in "VHDL's Crown Jewel"

nsteel — Mon, 30 Mar 2026 14:29:04 +0000

I disagree. We've produced numerous complex chips with VHDL over the last 30 years. Most of the vendor models we have to integrate with are Verilog, so perhaps it is more popular, but that's no problem for us. We've found plenty of bugs for both VHDL and Verilog in the commercial tooling we use; neither is particularly worse (provided you're happy to steer clear of the more recent VHDL language features).

New comment by nsteel in "Apple discontinues the Mac Pro"

nsteel — Fri, 27 Mar 2026 16:11:42 +0000

Except they don't use DDR5. LPDDR5 is always soldered. LPDDR5 requires short point-to-point connections to give you good SI at high speeds and low voltages. To get the same with DDR5 DIMMs, you'd have something physically much bigger, with way worse SI, with higher power, and with higher latency. That would be a much worse solution. GDDR is much higher power, so the solution would end up bigger. Plus it's useless for system memory so now you need two memory types. LPDDR5 is the only sensible choice.

New comment by nsteel in "The Road Not Taken: A World Where IPv4 Evolved"

nsteel — Fri, 13 Mar 2026 09:37:41 +0000

> but it's always getting updated,

I entirely disagree. It isn't, due to a combination of ISPs sticking with what they know and refusing to update (because of the huge time/cost in validating it), and vendors minimising their workloads/risk exposure and only updating what they "have to". The vendors have a lot of power here and these big new protocols are just more work.

In addition, smaller ISPs have virtually no say in what software/features they get. They can ask all they want, but they have little power. It takes a big customer to move the needle and get new features into these expensive boxes. It really only happens when there's another vendor offering something new, and therefore a business requirement to maintain feature parity or else lose big-customer revenue. So yeah, if a new protocol magically becomes standard, only then would anyone bother implementing and supporting it.

I think it's much easier to update consumer edge equipment. The ISP dictates all aspects of this relationship, and the boxes are cheap and just plug and play. They're relatively simple and easy to validate for 99% of use cases. If your internet stops working (because you didn't get the new hw/sw), they ship you a replacement and 2 days later it's fixed.

But I will just say, and slightly off topic of this thread, the lack of multiple extension headers in this proposed protocol instantly makes it more attractive to implement compared to v6.

New comment by nsteel in "DDR4 Sdram – Initialization, Training and Calibration"

nsteel — Fri, 13 Mar 2026 08:13:40 +0000

I don't know if this is still the case, but back then the likes of Synopsys charged a lot of money for what was very limited controller functionality; you were stuck with their frustrating support channels and generally dumpster-fire firmware. Our controller was fully custom to our needs, supporting more optimal refresh schemes tightly integrated with our application, and multiple memory protocols (not just DDR3), and I don't remember what else.

At least we were able to modify the training algorithms and find the improvements, rather than being stuck with the usual vendor "works for us" response. Especially with something like commodity DDR, where our quantities don't command much clout. But it was a bit of an ordeal and may have contributed to us buying in a controller for our next gen (not DDRx). But I think we're going the other way again after that experience..!

New comment by nsteel in "The Road Not Taken: A World Where IPv4 Evolved"

nsteel — Fri, 13 Mar 2026 00:52:20 +0000

> the easiest part of the system to get changed... the core routers and associated infrastructure.

Is that really the easy bit to change? ISPs spend years trialling new hardware and software in their core. You go through numerous cheapo home routers over the lifetime of one of their chassis. You'll use whatever no-name box they send you, and you'll accept their regular OTA updates too, else you're on your own.

New comment by nsteel in "DDR4 Sdram – Initialization, Training and Calibration"

nsteel — Fri, 13 Mar 2026 00:12:47 +0000

Implementing DDR3 training for our packet queuing chip (custom memory controller) was my first project at work. We had originally hoped to use the same training params for all parts. That wasn't reliable even over a small number of test systems in the chamber. DDR3 RAM parts were super cheap compared to what we had used in previous generations, and you get what you pay for, with a huge amount of device variation. So we implemented a relatively long training process to be run on each device during our board testing, and saved those per-lane skews. But we found the effects of temperature, and particularly system noise, were too great once the system was sending full-rate traffic. (The training had to be done one interface at a time, with pedestrian data rates.) We then ended up with a quick re-training pass to re-center the eyes. It still wasn't perfect: slower RAM chips (with smaller eyes) would report ECC correctables when all interfaces were doing worst-case patterns at temperature extremes. We spent a lot of time making those interfaces robust, and ended up relying more on ECC than we had intended. But those chips have been shipping ever since and will have seen traffic from most of us.
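
One common way to "re-center the eyes" is to sweep a delay setting, record which taps pass a pattern test, and pick the middle of the widest passing window. The sketch below assumes that approach; it is not the algorithm used on the chip described above, and `lane_passes` is a hypothetical stand-in for a real per-lane hardware test.

```python
def sweep(lane_passes, num_taps):
    """Run the (hypothetical) pass/fail test at every delay tap."""
    return [bool(lane_passes(tap)) for tap in range(num_taps)]


def center_of_widest_eye(results):
    """Return the tap at the middle of the longest run of passes, or None."""
    best_start, best_len = None, 0
    run_start = None
    # A trailing False acts as a sentinel so a run ending at the last tap is closed.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        return None  # no passing tap at all: training failed for this lane
    return best_start + (best_len - 1) // 2


# Simulated lane whose eye is open for taps 5..11 inclusive.
results = sweep(lambda tap: 5 <= tap <= 11, 16)
best_tap = center_of_widest_eye(results)
print(best_tap)
```

A quick re-training pass could reuse the saved per-lane result as a starting point and sweep only a few taps either side, which keeps it short enough to run while the system is live.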

New comment by nsteel in "MacBook Neo"

nsteel — Thu, 05 Mar 2026 22:40:39 +0000

https://news.ycombinator.com/item?id=47256032

M1 Air single core is 2347 and multicore is 8342. Plus, the A18 Pro chip in the Neo will likely perform better with the improved thermal environment.

New comment by nsteel in "Intel's make-or-break 18A process node debuts for data center with 288-core Xeon"

nsteel — Thu, 05 Mar 2026 22:31:05 +0000

And that's why it's got to be a big company that takes this on: you need deep pockets to successfully sue a company like Intel. It's not realistic for most. Plus there's the huge opportunity cost of missing your market and wasting years having to start over. Again, a bigger company can survive that with multiple projects in parallel.

New comment by nsteel in "MacBook Neo"

nsteel — Thu, 05 Mar 2026 12:38:22 +0000

Doesn't that M1 Air have a much, much slower CPU?

New comment by nsteel in "Intel's make-or-break 18A process node debuts for data center with 288-core Xeon"

nsteel — Thu, 05 Mar 2026 11:03:04 +0000

Worthless. Just look at how IFS worked out the previous two times they gave it a go. If you're not in the industry you may not even be aware it was a thing. And then not. Twice.

New comment by nsteel in "Intel's make-or-break 18A process node debuts for data center with 288-core Xeon"

nsteel — Wed, 04 Mar 2026 10:15:34 +0000

Agree entirely with your take. The packaging story is awesome, I wish there were more details on the stacking used on this one.

But I am at a loss as to how Intel are really going to get any traction with IFS. How can anyone trust Intel as a long-term foundry partner? Even if they priced it more aggressively, the opportunity cost in picking a supplier who decides to quit next year would be catastrophic for many. The only way this works is if they practically give their services away to someone big, who can afford to take that risk and can also make it worth Intel's continued investment. Any ideas who that would be? I've got nothing.