Hacker News: mezark

What happens when you run a CUDA kernel?

mezark — Mon, 29 Jun 2026 13:11:08 +0000

Article URL: https://fergusfinn.com/blog/what-happens-when-you-run-a-gpu-kernel/

Comments URL: https://news.ycombinator.com/item?id=48718863

Points: 166

# Comments: 14

A running list of reasons to move to open source

mezark — Mon, 22 Jun 2026 15:42:39 +0000

Article URL: https://whyopensource.ai/

Comments URL: https://news.ycombinator.com/item?id=48631791

Points: 6

# Comments: 0

New comment by mezark in "Anatomy of a high-performance EP kernel"

mezark — Wed, 10 Jun 2026 18:15:31 +0000

I love this blog

New comment by mezark in "Artificial intelligence is not conscious – Ted Chiang"

mezark — Thu, 04 Jun 2026 14:18:32 +0000

(As someone who cares a lot about philosophy of consciousness and cogsci)

The whole point of consciousness being a 'hard problem' is that we just cannot make claims like 'X is not conscious'

New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"

mezark — Wed, 03 Jun 2026 20:14:34 +0000

we think so - but haven't tested it ourselves

New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"

mezark — Wed, 03 Jun 2026 20:14:17 +0000

Hi! Co-founder of Doubleword here - we've hugely increased the number of models that we offer (partly thanks to work that we've done on hotswapping: https://blog.doubleword.ai/fast-sglang-starts).

We're kind of known for our low prices - our prices (most of our usage is on our high-throughput API, the async tier) are significantly below average OpenRouter prices - and cached pricing is coming soon, which will lower them even more :)

New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"

mezark — Tue, 02 Jun 2026 19:31:26 +0000

We at Doubleword are bullish on AMD for low-interactivity inference - it does just take a bigger lift on the software side...

MoE inference optimizations: 15% lower expert load by request reordering

mezark — Wed, 20 May 2026 23:05:25 +0000

Article URL: https://blog.doubleword.ai/moe-expert-coactivations

Comments URL: https://news.ycombinator.com/item?id=48215546

Points: 3

# Comments: 0

New comment by mezark in "UK sovereign LLM inference"

mezark — Fri, 15 May 2026 14:08:27 +0000

If you're talking about UK sovereign LLM inference, you need to mention Doubleword... a very serious inference optimization lab in London with public endpoints for open-source models

Tensor Network Attention

mezark — Thu, 07 May 2026 12:14:12 +0000

Article URL: https://mainlymatmul.com/blog/tensor-network-attention/

Comments URL: https://news.ycombinator.com/item?id=48048439

Points: 2

# Comments: 0

Redundant Information in LLM Weights

mezark — Tue, 05 May 2026 11:38:10 +0000

Article URL: https://fergusfinn.com/blog/weight-entropy/

Comments URL: https://news.ycombinator.com/item?id=48021077

Points: 5

# Comments: 0

Tans: Precomputing RANS

mezark — Thu, 30 Apr 2026 13:39:12 +0000

Article URL: https://fergusfinn.com/blog/understanding-tans/

Comments URL: https://news.ycombinator.com/item?id=47962281

Points: 3

# Comments: 0

Also-RANS: Asymmetric Numeral Systems for Entropy Coding

mezark — Thu, 30 Apr 2026 13:38:45 +0000

Article URL: https://fergusfinn.com/blog/understanding-rans/

Comments URL: https://news.ycombinator.com/item?id=47962271

Points: 25

# Comments: 0

70x faster cold(ish) starts for SGLang

mezark — Fri, 24 Apr 2026 15:02:19 +0000

Article URL: https://fergusfinn.com/blog/fast-sglang-starts/

Comments URL: https://news.ycombinator.com/item?id=47891224

Points: 4

# Comments: 0

QueueSpec – drafting speculation tokens while a request queues

mezark — Mon, 26 Jan 2026 12:49:46 +0000

Article URL: https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait

Comments URL: https://news.ycombinator.com/item?id=46765015

Points: 1

# Comments: 0

ZeroDP: Just-in-Time Weight Offloading over NVLink for Data Parallelism

mezark — Mon, 19 Jan 2026 12:37:58 +0000

Article URL: https://mainlymatmul.com/blog/zerodp/

Comments URL: https://news.ycombinator.com/item?id=46678316

Points: 1

# Comments: 0

Parallel Primitives for Multi-Agent Workflows

mezark — Wed, 14 Jan 2026 12:15:19 +0000

Article URL: https://fergusfinn.com/blog/parallel-primitives-blog/

Comments URL: https://news.ycombinator.com/item?id=46615169

Points: 1

# Comments: 0

New fastest AI Model Gateway – 450x less overhead than LiteLLM

mezark — Tue, 21 Oct 2025 13:23:58 +0000

Article URL: https://github.com/doublewordai/control-layer

Comments URL: https://news.ycombinator.com/item?id=45655480

Points: 2

# Comments: 0

New comment by mezark in "Should GPUs Make Free Trade Agreements?"

mezark — Fri, 19 Sep 2025 17:11:52 +0000

We look at how comparative advantage from economics applies to LLM inference - some GPUs are relatively better at FLOPs, others at memory bandwidth. What happens if you let each do what it’s best at?
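To make the comparative-advantage idea concrete, here is a rough sketch. It is not taken from the linked article: the GPU names, the throughput numbers, and the `assign_roles` helper are all hypothetical, and the only assumption is that work can be split into a compute-bound part and a memory-bandwidth-bound part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    tflops: float         # compute throughput (hypothetical figure)
    bandwidth_tbs: float  # memory bandwidth in TB/s (hypothetical figure)


def flops_per_byte(gpu: Gpu) -> float:
    """Compute-to-bandwidth ratio: how much compute a GPU has per unit of bandwidth."""
    return gpu.tflops / gpu.bandwidth_tbs


def assign_roles(a: Gpu, b: Gpu) -> dict:
    """Assign compute-bound work to the GPU with the higher compute-to-bandwidth
    ratio and memory-bound work to the other one.

    This compares ratios, not absolute numbers, which is the point of
    comparative advantage.
    """
    if flops_per_byte(a) >= flops_per_byte(b):
        return {"compute_bound": a.name, "memory_bound": b.name}
    return {"compute_bound": b.name, "memory_bound": a.name}


# Illustrative numbers only; they are not measurements of any real GPU.
gpu_a = Gpu("gpu_a", tflops=1000.0, bandwidth_tbs=3.0)
gpu_b = Gpu("gpu_b", tflops=400.0, bandwidth_tbs=2.0)

roles = assign_roles(gpu_a, gpu_b)
print(roles)
```

In this made-up example gpu_a is better than gpu_b at both compute and bandwidth in absolute terms, but gpu_b still gets the memory-bound work because its bandwidth is relatively stronger compared with its compute.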

Should GPUs Make Free Trade Agreements?

mezark — Fri, 19 Sep 2025 17:11:52 +0000

Article URL: https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements

Comments URL: https://news.ycombinator.com/item?id=45304031

Points: 3

# Comments: 1