<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mezark</title><link>https://news.ycombinator.com/user?id=mezark</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 05 May 2026 08:47:45 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mezark" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Tans: Precomputing RANS]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/understanding-tans/">https://fergusfinn.com/blog/understanding-tans/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47962281">https://news.ycombinator.com/item?id=47962281</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 30 Apr 2026 13:39:12 +0000</pubDate><link>https://fergusfinn.com/blog/understanding-tans/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47962281</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47962281</guid></item><item><title><![CDATA[Also-RANS: Asymmetric Numeral Systems for Entropy Coding]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/understanding-rans/">https://fergusfinn.com/blog/understanding-rans/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47962271">https://news.ycombinator.com/item?id=47962271</a></p>
<p>Points: 24</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 30 Apr 2026 13:38:45 +0000</pubDate><link>https://fergusfinn.com/blog/understanding-rans/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47962271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47962271</guid></item><item><title><![CDATA[70x faster cold(ish) starts for SGLang]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/fast-sglang-starts/">https://fergusfinn.com/blog/fast-sglang-starts/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47891224">https://news.ycombinator.com/item?id=47891224</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 24 Apr 2026 15:02:19 +0000</pubDate><link>https://fergusfinn.com/blog/fast-sglang-starts/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=47891224</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47891224</guid></item><item><title><![CDATA[QueueSpec – drafting speculation tokens while a request queues]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait">https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46765015">https://news.ycombinator.com/item?id=46765015</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 26 Jan 2026 12:49:46 +0000</pubDate><link>https://blog.doubleword.ai/queue-speculation-drafting-while-you-wait</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46765015</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46765015</guid></item><item><title><![CDATA[ZeroDP: Just-in-Time Weight Offloading over NVLink for Data Parallelism]]></title><description><![CDATA[
<p>Article URL: <a href="https://mainlymatmul.com/blog/zerodp/">https://mainlymatmul.com/blog/zerodp/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46678316">https://news.ycombinator.com/item?id=46678316</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 19 Jan 2026 12:37:58 +0000</pubDate><link>https://mainlymatmul.com/blog/zerodp/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46678316</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46678316</guid></item><item><title><![CDATA[Parallel Primitives for Multi-Agent Workflows]]></title><description><![CDATA[
<p>Article URL: <a href="https://fergusfinn.com/blog/parallel-primitives-blog/">https://fergusfinn.com/blog/parallel-primitives-blog/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46615169">https://news.ycombinator.com/item?id=46615169</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 14 Jan 2026 12:15:19 +0000</pubDate><link>https://fergusfinn.com/blog/parallel-primitives-blog/</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=46615169</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46615169</guid></item><item><title><![CDATA[New fastest AI Model Gateway – 450x less overhead than LiteLLM]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/doublewordai/control-layer">https://github.com/doublewordai/control-layer</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45655480">https://news.ycombinator.com/item?id=45655480</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 21 Oct 2025 13:23:58 +0000</pubDate><link>https://github.com/doublewordai/control-layer</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45655480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45655480</guid></item><item><title><![CDATA[New comment by mezark in "Should GPUs Make Free Trade Agreements?"]]></title><description><![CDATA[
<p>We look at how comparative advantage from economics applies to LLM inference - some GPUs are relatively better at FLOPs, others at memory bandwidth. What happens if you let each do what it’s best at?</p>
]]></description><pubDate>Fri, 19 Sep 2025 17:11:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45304032</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45304032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45304032</guid></item><item><title><![CDATA[Should GPUs Make Free Trade Agreements?]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements">https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45304031">https://news.ycombinator.com/item?id=45304031</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 19 Sep 2025 17:11:52 +0000</pubDate><link>https://www.doubleword.ai/resources/should-gpus-make-free-trade-agreements</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=45304031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45304031</guid></item><item><title><![CDATA[New comment by mezark in "Our Small ML Team Beat OpenAI and Anthropic in a Specialized Domain [pdf]"]]></title><description><![CDATA[
<p>Huge congrats - and when you look at the latency graphs as well it really shows the value of these specialised systems!</p>
]]></description><pubDate>Tue, 11 Mar 2025 22:27:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=43337776</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=43337776</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43337776</guid></item><item><title><![CDATA[New comment by mezark in "Controlled generation of OS LLMs – without impacting latency"]]></title><description><![CDATA[
<p>TitanML Takeoff Inference Server demonstrating controlled generation</p>
]]></description><pubDate>Sat, 28 Oct 2023 01:03:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=38045957</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=38045957</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38045957</guid></item><item><title><![CDATA[Controlled generation of OS LLMs – without impacting latency]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=ih2SX2UsZQ0">https://www.youtube.com/watch?v=ih2SX2UsZQ0</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=38045956">https://news.ycombinator.com/item?id=38045956</a></p>
<p>Points: 7</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 28 Oct 2023 01:03:28 +0000</pubDate><link>https://www.youtube.com/watch?v=ih2SX2UsZQ0</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=38045956</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=38045956</guid></item><item><title><![CDATA[New comment by mezark in "Takeoff Inference Server Is Now Open Source"]]></title><description><![CDATA[
<p>Drop-in replacement for HF's TGI server. The fastest and easiest way to run LLM inference locally.<p>GitHub: <a href="https://github.com/titanml/takeoff">https://github.com/titanml/takeoff</a>
Docs: <a href="https://docs.titanml.co/docs/titan-takeoff/getting-started" rel="nofollow noreferrer">https://docs.titanml.co/docs/titan-takeoff/getting-started</a>
Discord: <a href="https://discord.gg/83RmHTjZgf" rel="nofollow noreferrer">https://discord.gg/83RmHTjZgf</a></p>
]]></description><pubDate>Tue, 01 Aug 2023 13:09:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=36955486</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=36955486</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36955486</guid></item><item><title><![CDATA[Takeoff Inference Server Is Now Open Source]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/titanml/takeoff">https://github.com/titanml/takeoff</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=36955485">https://news.ycombinator.com/item?id=36955485</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 01 Aug 2023 13:09:53 +0000</pubDate><link>https://github.com/titanml/takeoff</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=36955485</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36955485</guid></item><item><title><![CDATA[New comment by mezark in "Falcon 7B running real time on CPU"]]></title><description><![CDATA[
<p>Hey there - TitanML is these guys: <a href="https://www.titanml.co/" rel="nofollow noreferrer">https://www.titanml.co/</a>. I think the impressive thing isn't actually whether the model is good (although it is a good model, especially when fine-tuned), but how much faster this model runs on CPU with the TitanML server than it did before.</p>
]]></description><pubDate>Thu, 06 Jul 2023 13:08:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=36615161</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=36615161</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36615161</guid></item><item><title><![CDATA[New comment by mezark in "Falcon 7B running real time on CPU"]]></title><description><![CDATA[
<p>Falcon 7B running real time on CPU</p>
]]></description><pubDate>Wed, 05 Jul 2023 19:37:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=36605948</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=36605948</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36605948</guid></item><item><title><![CDATA[Falcon 7B running real time on CPU]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=LvrEO_lNjcA">https://www.youtube.com/watch?v=LvrEO_lNjcA</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=36605947">https://news.ycombinator.com/item?id=36605947</a></p>
<p>Points: 11</p>
<p># Comments: 3</p>
]]></description><pubDate>Wed, 05 Jul 2023 19:37:00 +0000</pubDate><link>https://www.youtube.com/watch?v=LvrEO_lNjcA</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=36605947</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36605947</guid></item><item><title><![CDATA[New comment by mezark in "Amazon Titan"]]></title><description><![CDATA[
<p>Annoying because they stole my company's name (TitanML - <a href="https://www.titanml.co/" rel="nofollow">https://www.titanml.co/</a>).
Fortunately they haven't trademarked it, but it's still not ideal.</p>
]]></description><pubDate>Fri, 14 Apr 2023 09:10:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=35567210</link><dc:creator>mezark</dc:creator><comments>https://news.ycombinator.com/item?id=35567210</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=35567210</guid></item></channel></rss>