<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mezark</title><link>https://news.ycombinator.com/user?id=mezark</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 29 Jun 2026 20:07:25 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mezark" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[What happens when you run a CUDA kernel?]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/what-happens-when-you-run-a-gpu-kernel/">https://fergusfinn.com/blog/what-happens-when-you-run-a-gpu-kernel/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48718863">https://news.ycombinator.com/item?id=48718863</a></p>
<p>Points: 166</p>
<p># Comments: 14</p>
]]></description><pubDate>Mon, 29 Jun 2026 13:11:08 +0000</pubDate><link>https://fergusfinn.com/blog/what-happens-when-you-run-a-gpu-kernel/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48718863</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48718863</guid></item><item><title><![CDATA[A running list of reasons to move to open source]]></title><description><![CDATA[
<p>Article URL: <a href="https://whyopensource.ai/">https://whyopensource.ai/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48631791">https://news.ycombinator.com/item?id=48631791</a></p>
<p>Points: 6</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 22 Jun 2026 15:42:39 +0000</pubDate><link>https://whyopensource.ai/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48631791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48631791</guid></item><item><title><![CDATA[New comment by mezark in "Anatomy of a high-performance EP kernel"]]></title><description><![CDATA[
<p>I love this blog</p>
]]></description><pubDate>Wed, 10 Jun 2026 18:15:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=48480410</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48480410</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48480410</guid></item><item><title><![CDATA[New comment by mezark in "Artificial intelligence is not conscious – Ted Chiang"]]></title><description><![CDATA[
<p>(As someone who cares a lot about philosophy of consciousness and cogsci)<p>The whole point of consciousness being a 'hard problem' is that we just cannot make claims like 'X is not conscious'</p>
]]></description><pubDate>Thu, 04 Jun 2026 14:18:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48399041</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48399041</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48399041</guid></item><item><title><![CDATA[New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"]]></title><description><![CDATA[
<p>we think so - but haven't tested it ourselves</p>
]]></description><pubDate>Wed, 03 Jun 2026 20:14:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48389354</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48389354</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48389354</guid></item><item><title><![CDATA[New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"]]></title><description><![CDATA[
<p>Hi! Co-founder of Doubleword here - we've hugely increased the number of models that we offer (partly thanks to work that we've done on hotswapping: <a href="https://blog.doubleword.ai/fast-sglang-starts" rel="nofollow">https://blog.doubleword.ai/fast-sglang-starts</a>).<p>We're kind of known for our low prices - our prices (our main usage is for our high-throughput API, the async tier) are significantly below average OpenRouter prices - and cached pricing is coming soon, which will lower them even more :)</p>
]]></description><pubDate>Wed, 03 Jun 2026 20:14:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=48389349</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48389349</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48389349</guid></item><item><title><![CDATA[New comment by mezark in "Bringing Up DeepSeek-V4-Flash on AMD MI300X"]]></title><description><![CDATA[
<p>We at Doubleword are bullish on AMD for low-interactivity inference - it does just take a bigger lift on the software side...</p>
]]></description><pubDate>Tue, 02 Jun 2026 19:31:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48375069</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48375069</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48375069</guid></item><item><title><![CDATA[MoE inference optimizations: 15% lower expert load by request reordering]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.doubleword.ai/moe-expert-coactivations">https://blog.doubleword.ai/moe-expert-coactivations</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48215546">https://news.ycombinator.com/item?id=48215546</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 20 May 2026 23:05:25 +0000</pubDate><link>https://blog.doubleword.ai/moe-expert-coactivations</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48215546</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48215546</guid></item><item><title><![CDATA[New comment by mezark in "UK sovereign LLM inference"]]></title><description><![CDATA[
<p>If you're talking about UK sovereign LLM inference you need to mention Doubleword... a very serious inference optimization lab in London with public endpoints for open-source models</p>
]]></description><pubDate>Fri, 15 May 2026 14:08:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48148801</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48148801</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48148801</guid></item><item><title><![CDATA[Tensor Network Attention]]></title><description><![CDATA[
<p>Article URL: <a href="https://mainlymatmul.com/blog/tensor-network-attention/">https://mainlymatmul.com/blog/tensor-network-attention/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48048439">https://news.ycombinator.com/item?id=48048439</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 07 May 2026 12:14:12 +0000</pubDate><link>https://mainlymatmul.com/blog/tensor-network-attention/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48048439</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48048439</guid></item><item><title><![CDATA[Redundant Information in LLM Weights]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/weight-entropy/">https://fergusfinn.com/blog/weight-entropy/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48021077">https://news.ycombinator.com/item?id=48021077</a></p>
<p>Points: 5</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 05 May 2026 11:38:10 +0000</pubDate><link>https://fergusfinn.com/blog/weight-entropy/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=48021077</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48021077</guid></item><item><title><![CDATA[Tans: Precomputing RANS]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/understanding-tans/">https://fergusfinn.com/blog/understanding-tans/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47962281">https://news.ycombinator.com/item?id=47962281</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 30 Apr 2026 13:39:12 +0000</pubDate><link>https://fergusfinn.com/blog/understanding-tans/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47962281</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47962281</guid></item><item><title><![CDATA[Also-RANS: Asymmetric Numeral Systems for Entropy Coding]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/understanding-rans/">https://fergusfinn.com/blog/understanding-rans/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47962271">https://news.ycombinator.com/item?id=47962271</a></p>
<p>Points: 25</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 30 Apr 2026 13:38:45 +0000</pubDate><link>https://fergusfinn.com/blog/understanding-rans/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47962271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47962271</guid></item><item><title><![CDATA[70x faster cold(ish) starts for SGLang]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/fast-sglang-starts/">https://fergusfinn.com/blog/fast-sglang-starts/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47891224">https://news.ycombinator.com/item?id=47891224</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 24 Apr 2026 15:02:19 +0000</pubDate><link>https://fergusfinn.com/blog/fast-sglang-starts/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47891224</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47891224</guid></item><item><title><![CDATA[QueueSpec – drafting speculation tokens while a request queues]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait">https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46765015">https://news.ycombinator.com/item?id=46765015</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 26 Jan 2026 12:49:46 +0000</pubDate><link>https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46765015</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46765015</guid></item><item><title><![CDATA[ZeroDP: Just-in-Time Weight Offloading over NVLink for Data Parallelism]]></title><description><![CDATA[
<p>Article URL: <a href="https://mainlymatmul.com/blog/zerodp/">https://mainlymatmul.com/blog/zerodp/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46678316">https://news.ycombinator.com/item?id=46678316</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 19 Jan 2026 12:37:58 +0000</pubDate><link>https://mainlymatmul.com/blog/zerodp/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46678316</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46678316</guid></item><item><title><![CDATA[Parallel Primitives for Multi-Agent Workflows]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/parallel-primitives-blog/">https://fergusfinn.com/blog/parallel-primitives-blog/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46615169">https://news.ycombinator.com/item?id=46615169</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 14 Jan 2026 12:15:19 +0000</pubDate><link>https://fergusfinn.com/blog/parallel-primitives-blog/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46615169</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46615169</guid></item><item><title><![CDATA[New fastest AI Model Gateway – 450x less overhead than LiteLLM]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/doublewordai/control-layer">https://github.com/doublewordai/control-layer</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45655480">https://news.ycombinator.com/item?id=45655480</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 21 Oct 2025 13:23:58 +0000</pubDate><link>https://github.com/doublewordai/control-layer</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45655480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45655480</guid></item><item><title><![CDATA[New comment by mezark in "Should GPUs Make Free Trade Agreements?"]]></title><description><![CDATA[
<p>We look at how comparative advantage from economics applies to LLM inference - some GPUs are relatively better at FLOPs, others at memory bandwidth. What happens if you let each do what it’s best at?</p>
]]></description><pubDate>Fri, 19 Sep 2025 17:11:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45304032</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45304032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45304032</guid></item><item><title><![CDATA[Should GPUs Make Free Trade Agreements?]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements">https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45304031">https://news.ycombinator.com/item?id=45304031</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 19 Sep 2025 17:11:52 +0000</pubDate><link>https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45304031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45304031</guid></item></channel></rss>