Hacker News: gdiamos

New comment by gdiamos in "Forcing Flash Attention onto a TPU and Learning the Hard Way"

gdiamos — Thu, 12 Mar 2026 23:11:09 +0000

I personally don't mind letting Claude write about work.

You could spend 80% doing the work and 20% writing about it, or 99% doing the work and 1% copy-pasting Claude's writeup about it into a blog.

There is nothing wrong with writing if you are into it, and yes you can probably do better than Claude, but I can relate to engineers who just want to build.

New comment by gdiamos in "Forcing Flash Attention onto a TPU and Learning the Hard Way"

gdiamos — Thu, 12 Mar 2026 21:22:38 +0000

One of my lessons from using different accelerators, whether different NVIDIA versions, GPU->TPU, etc., is that someone needs to do the work of indexing, partitioning, mapping, scheduling, and benchmarking. That work is labor intensive.

In this case, Google has already done it, and that will be true for well-resourced accelerator companies like Google working with the most popular operations, like attention.

As long as you use those operations, you are okay. But if you do something different, you need to be prepared to do all of this yourself.

New comment by gdiamos in "Something is afoot in the land of Qwen"

gdiamos — Thu, 05 Mar 2026 09:24:43 +0000

Results as good as Qwen has been posting would seem to trigger a power struggle.

I think companies that don’t navigate these correctly eventually lose.

New comment by gdiamos in "Claude becomes number one app on the U.S. App Store"

gdiamos — Sun, 01 Mar 2026 00:15:17 +0000

It was inevitable.

New comment by gdiamos in "Statement from Dario Amodei on our discussions with the Department of War"

gdiamos — Fri, 27 Feb 2026 02:13:00 +0000

This is why I like Dario as a CEO - he has a system of ethics that is not just about who writes the largest check.

You may not agree with it, but I appreciate that it exists.

New comment by gdiamos in "Fast KV Compaction via Attention Matching"

gdiamos — Fri, 20 Feb 2026 17:55:59 +0000

I know the frontier “labs” are holding back publications.

I don’t think it will last among researchers who think beyond production LLMs.

New comment by gdiamos in "15 years of FP64 segmentation, and why the Blackwell Ultra breaks the pattern"

gdiamos — Fri, 20 Feb 2026 04:12:00 +0000

I remember it being less profitable than graphics for a long time.

It did make money that would be interesting to a startup, but not to a public company.

New comment by gdiamos in "15 years of FP64 segmentation, and why the Blackwell Ultra breaks the pattern"

gdiamos — Thu, 19 Feb 2026 07:07:41 +0000

I remember it differently. CUDA was built with the intention of finding/enabling something like deep learning. I thought it was unrealistic too and took it on faith in people more experienced than me, until I saw deep learning work.

Some of the near misses I remember included bitcoin. Many of the other attempts didn't ever see the light of day.

Luck in English often means success by chance rather than one's own efforts or abilities. I don't think that characterizes CUDA. I think it was eventual success in the face of extreme difficulty, many failures, and sacrifices. In hindsight, I'm still surprised that Jensen kept funding it as long as he did. I've never met a leader since who I think would have done that.

New comment by gdiamos in "15 years of FP64 segmentation, and why the Blackwell Ultra breaks the pattern"

gdiamos — Thu, 19 Feb 2026 04:11:39 +0000

Most people don't appreciate how many dead end applications NVIDIA explored before finding deep learning. It took a very long time, and it wasn't luck.

New comment by gdiamos in "15 years of FP64 segmentation, and why the Blackwell Ultra breaks the pattern"

gdiamos — Thu, 19 Feb 2026 04:09:54 +0000

I'm not sure why the article dismisses cost.

Let's say X=10% of the GPU area (~75mm^2) is dedicated to FP32 SIMD units. Assume FP64 units are ~2-4x bigger. That would be 150-300mm^2, a huge amount of area that would increase the price per GPU. You may not agree with these assumptions. Feel free to change them. It is an overhead that is replicated per core. Why would gamers want to pay for any features they don't use?

Not to say there isn't market segmentation going on, but FP64 cost is higher for massively parallel processors than it was in the days of high frequency single core CPUs.
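The back-of-envelope estimate above can be sketched in a few lines so the assumptions are easy to change. This is only a rough sketch: the 750 mm^2 die area is an assumption implied by "10% is ~75mm^2", the linear scaling of FP64 area from FP32 area is the same simplification used in the comment, and the function and constant names are made up for illustration.

```python
def fp64_area_mm2(die_area_mm2, fp32_fraction, fp64_to_fp32_ratio):
    """Estimate FP64 unit area by scaling the FP32 SIMD area linearly."""
    fp32_area = die_area_mm2 * fp32_fraction
    return fp32_area * fp64_to_fp32_ratio


# Assumptions taken from the estimate above; change them freely.
DIE_AREA_MM2 = 750.0   # implied by "10% of the GPU area (~75mm^2)"
FP32_FRACTION = 0.10   # share of the die spent on FP32 SIMD units

if __name__ == "__main__":
    for ratio in (2.0, 4.0):  # FP64 units assumed ~2-4x bigger than FP32
        area = fp64_area_mm2(DIE_AREA_MM2, FP32_FRACTION, ratio)
        share = area / DIE_AREA_MM2
        print(f"FP64 at {ratio:.0f}x FP32 size: {area:.0f} mm^2 ({share:.0%} of the die)")
```

Under these assumptions, full-rate FP64 would take roughly 20-40% of the die, which is the overhead a gaming part would be paying for.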

New comment by gdiamos in "Vouch"

gdiamos — Mon, 09 Feb 2026 01:38:09 +0000

I feel like a lot of software engineering problems come from people who refuse to talk to each other except through comments in VCS.

It makes sense if you are collaborating over IRC, but I feel the need to face palm when people sitting next to each other do it.

What is your preferred way to talk to your team?

No English, only code

Slack

Zoom

In a meeting room

Over lunch

On a walk

One thing I’ve learned over time is that the highest bandwidth way of talking is face to face because you can read body language in addition to words. Video chat is okay, but an artificial and often overly formal setting. Phone is faster than text. Text drops the audio/visual/emotional signal completely. Code is precise but requires reverse engineering intent.

I personally like a walk, and then pair programming on a shared screen.

New comment by gdiamos in "Clawdbot - open source personal AI assistant"

gdiamos — Mon, 26 Jan 2026 04:37:32 +0000

It sounds like lack of security is the biggest feature and risk of this clawd thing.

I also tried using Siri to tell me the weather forecast while I was driving to the park. It asked me to auth into my phone. Then it asked me to approve location access. I guess it was secure but I never figured out what the weather forecast was.

Thankfully it didn't rain on my picnic. Some of the parents there asked me if their investors should be interested in clawd.

New comment by gdiamos in "Skip is now free and open source"

gdiamos — Thu, 22 Jan 2026 11:22:16 +0000

Here’s another perspective. Developers aren’t budget approvers in engineering organizations by choice.

If you are a budget approver then your inbox and calendar are full of sales teams.

I find that experience is too distracting to concentrate on writing good code.

New comment by gdiamos in "On the slow death of scaling"

gdiamos — Wed, 07 Jan 2026 07:50:49 +0000

It was an interesting read Sara, thanks for sharing it.

I especially agree with your point that scaling laws really killed open research. That's a shame and I personally think we could benefit from more research.

I originally didn't like calling them scaling laws.

In addition to the law part seeming a bit much, I've found that researchers often overemphasize the scale part. If scaling is predictable, then you don't need to do most experiments at very large scale. However, that doesn't seem to stop researchers from starting there.

Once you find something good, and you understand how it scales, then you can pour system resources into it. So I originally thought it would encourage research. I find it sad that it seems to have had the opposite effect.
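To make the "you don't need to start at very large scale" point concrete, here is a minimal sketch of fitting a power law to a few small runs and extrapolating. Everything in it is an assumption for illustration: the data is synthetic, generated from a made-up law rather than measured, the pure power-law form ignores things like an irreducible loss term, and the function names are hypothetical.

```python
import math


def fit_power_law(sizes, losses):
    """Least-squares fit of loss = a * size**b, done as a line in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space = exponent
    a = math.exp(mean_y - b * mean_x)   # intercept in log-log space = log(a)
    return a, b


def predict(a, b, size):
    """Evaluate the fitted power law at a given size."""
    return a * size ** b


# Synthetic "small scale" runs from a made-up law (not real measurements).
TRUE_A, TRUE_B = 10.0, -0.1
small_sizes = [1e6, 1e7, 1e8]
small_losses = [TRUE_A * s ** TRUE_B for s in small_sizes]

if __name__ == "__main__":
    a, b = fit_power_law(small_sizes, small_losses)
    print(f"fitted exponent: {b:.3f}")
    print(f"extrapolated loss at size 1e10: {predict(a, b, 1e10):.3f}")
```

With real, noisy measurements the fit would need more points and some check that the trend actually holds, but the workflow is the same: explore cheaply, then spend on the big run once the curve is understood.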

New comment by gdiamos in "John Giannandrea to retire from Apple"

gdiamos — Tue, 02 Dec 2025 01:32:21 +0000

The iPhone is so useful they can probably ride it for a couple more decades at least. I would still buy it.

From a technology or engineering perspective, I have no idea how to work with Apple.

New comment by gdiamos in "Ilya Sutskever, Yann LeCun and the End of “Just Add GPUs”"

gdiamos — Thu, 27 Nov 2025 05:13:00 +0000

A Dyson sphere brain?

New comment by gdiamos in "Ilya Sutskever, Yann LeCun and the End of “Just Add GPUs”"

gdiamos — Thu, 27 Nov 2025 01:09:16 +0000

I personally don’t think the scaling hypothesis is wrong, but it is running up against real limits.

What high quality data sources are not already tapped?

Where does the next 1000x flops come from?

New comment by gdiamos in "TiDAR: Think in Diffusion, Talk in Autoregression"

gdiamos — Sat, 22 Nov 2025 16:53:44 +0000

Diffusion is favored by current GPUs.

Over time we seem to have a tendency to build models that are well matched to our machines.

New comment by gdiamos in "Jeff Bezos creates A.I. startup where he will be co-chief executive"

gdiamos — Tue, 18 Nov 2025 04:05:18 +0000

How do they break ties?

New comment by gdiamos in "Show HN: Tiny Diffusion – A character-level text diffusion model from scratch"

gdiamos — Sat, 15 Nov 2025 07:37:34 +0000

Did I miss something? https://github.com/NVlabs/Fast-dLLM/blob/main/llada/chat.py

That’s inference code, but where is the high perf web server?