<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: charles_irl</title><link>https://news.ycombinator.com/user?id=charles_irl</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 18 May 2026 23:53:45 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=charles_irl" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by charles_irl in "Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint"]]></title><description><![CDATA[
<p>To clarify: we do content-based hashing, and when we say "shared bytes aren’t guaranteed to be in the exact same container image layer", what we mean is that<p><pre><code>FROM some/image
RUN pip install torch==2.7.1</code></pre><p>and<p><pre><code>FROM another/image
RUN pip install torch==2.7.1</code></pre><p>will produce images with very high overlap in contents, which will be shared by a content-based cache, but those images' final layers are disjoint from the perspective of a layerwise cache.</p>
]]></description><pubDate>Mon, 18 May 2026 22:55:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48186971</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48186971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48186971</guid></item><item><title><![CDATA[New comment by charles_irl in "Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint"]]></title><description><![CDATA[
<p>You're absolutely right!</p>
]]></description><pubDate>Mon, 18 May 2026 20:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48185228</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48185228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48185228</guid></item><item><title><![CDATA[New comment by charles_irl in "Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint"]]></title><description><![CDATA[
<p>Yep! That should start in ten seconds or so -- about a second per gigabyte of weights, plus a second to start the container and a few seconds to load the memory snapshot.<p>There are a few limitations with snapshotting, e.g. it generally fails when using multiple GPUs, which we document here: <a href="https://modal.com/docs/guide/memory-snapshots" rel="nofollow">https://modal.com/docs/guide/memory-snapshots</a>.</p>
]]></description><pubDate>Mon, 18 May 2026 20:08:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=48184870</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48184870</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48184870</guid></item><item><title><![CDATA[New comment by charles_irl in "Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint"]]></title><description><![CDATA[
<p>Cutting latencies by 40x! Unfortunately couldn't fit the whole title in the character limit :<</p>
]]></description><pubDate>Mon, 18 May 2026 19:05:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=48184090</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48184090</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48184090</guid></item><item><title><![CDATA[Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint]]></title><description><![CDATA[
<p>Article URL: <a href="https://modal.com/blog/truly-serverless-gpus">https://modal.com/blog/truly-serverless-gpus</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48183038">https://news.ycombinator.com/item?id=48183038</a></p>
<p>Points: 65</p>
<p># Comments: 15</p>
]]></description><pubDate>Mon, 18 May 2026 17:56:26 +0000</pubDate><link>https://modal.com/blog/truly-serverless-gpus</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48183038</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48183038</guid></item><item><title><![CDATA[How to Achieve Serverless GPUs]]></title><description><![CDATA[
<p>Article URL: <a href="https://modal.com/blog/truly-serverless-gpus">https://modal.com/blog/truly-serverless-gpus</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48110438">https://news.ycombinator.com/item?id=48110438</a></p>
<p>Points: 8</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 12 May 2026 16:23:16 +0000</pubDate><link>https://modal.com/blog/truly-serverless-gpus</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=48110438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48110438</guid></item><item><title><![CDATA[New comment by charles_irl in "Keeping 20k GPUs healthy"]]></title><description><![CDATA[
<p>> Ed Zitron also called out the business model of GPU-as-a-service middleman companies like modal deeply unsustainable, and I also don't see how they can make a profit if they are only reselling public clouds.<p>You got a link for that? I work on Modal and would be interested in seeing the argument!<p>We think building a proper software layer for multitenant demand aggregation on top of the public clouds is sufficient value-add to be a sustainable business (cf. Databricks and Snowflake).</p>
]]></description><pubDate>Thu, 22 Jan 2026 21:03:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=46725108</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=46725108</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46725108</guid></item><item><title><![CDATA[New comment by charles_irl in "Three types of LLM workloads and how to serve them"]]></title><description><![CDATA[
<p>Sorry to lead with a bunch of jargon! Wanted to make it obvious that we'd give concrete recommendations instead of palaver.<p>The technical terms there are later explained and diagrammed, and the recommendations derived from something close to first principles (e.g. roofline analysis).</p>
]]></description><pubDate>Thu, 22 Jan 2026 03:46:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=46715047</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=46715047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46715047</guid></item><item><title><![CDATA[New comment by charles_irl in "Three types of LLM workloads and how to serve them"]]></title><description><![CDATA[
<p>oof ty, willfix</p>
]]></description><pubDate>Thu, 22 Jan 2026 03:14:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=46714829</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=46714829</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46714829</guid></item><item><title><![CDATA[Three types of LLM workloads and how to serve them]]></title><description><![CDATA[
<p>Article URL: <a href="https://modal.com/llm-almanac/workloads">https://modal.com/llm-almanac/workloads</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46707708">https://news.ycombinator.com/item?id=46707708</a></p>
<p>Points: 75</p>
<p># Comments: 5</p>
]]></description><pubDate>Wed, 21 Jan 2026 16:15:06 +0000</pubDate><link>https://modal.com/llm-almanac/workloads</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=46707708</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46707708</guid></item><item><title><![CDATA[Host overhead is killing your inference efficiency]]></title><description><![CDATA[
<p>Article URL: <a href="https://modal.com/blog/host-overhead-inference-efficiency">https://modal.com/blog/host-overhead-inference-efficiency</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45970964">https://news.ycombinator.com/item?id=45970964</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 18 Nov 2025 19:37:50 +0000</pubDate><link>https://modal.com/blog/host-overhead-inference-efficiency</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45970964</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45970964</guid></item><item><title><![CDATA[New comment by charles_irl in "Quantized Float Exposed"]]></title><description><![CDATA[
<p>Inspired by <a href="https://float.exposed" rel="nofollow">https://float.exposed</a>, which was on the front page recently, I put together this visualizer for lower precision/quantized floating point numbers -- specifically, all of the formats in the Open Compute Project's Microscaling Formats Spec (<a href="https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf" rel="nofollow">https://www.opencompute.org/documents/ocp-microscaling-forma...</a>).</p>
]]></description><pubDate>Mon, 03 Nov 2025 20:09:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=45803793</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45803793</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45803793</guid></item><item><title><![CDATA[Quantized Float Exposed]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.quant.exposed">https://www.quant.exposed</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45803768">https://news.ycombinator.com/item?id=45803768</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Mon, 03 Nov 2025 20:07:13 +0000</pubDate><link>https://www.quant.exposed</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45803768</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45803768</guid></item><item><title><![CDATA[Against SQL (2021)]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.scattered-thoughts.net/writing/against-sql/">https://www.scattered-thoughts.net/writing/against-sql/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45704419">https://news.ycombinator.com/item?id=45704419</a></p>
<p>Points: 82</p>
<p># Comments: 77</p>
]]></description><pubDate>Sat, 25 Oct 2025 15:00:06 +0000</pubDate><link>https://www.scattered-thoughts.net/writing/against-sql/</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45704419</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45704419</guid></item><item><title><![CDATA[Length-extension attacks are still a thing]]></title><description><![CDATA[
<p>Article URL: <a href="https://00f.net/2025/10/23/length-extension-attacks/">https://00f.net/2025/10/23/length-extension-attacks/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45695719">https://news.ycombinator.com/item?id=45695719</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 24 Oct 2025 15:37:37 +0000</pubDate><link>https://00f.net/2025/10/23/length-extension-attacks/</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45695719</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45695719</guid></item><item><title><![CDATA[The future of Python web services looks GIL-free]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.baro.dev/p/the-future-of-python-web-services-looks-gil-free">https://blog.baro.dev/p/the-future-of-python-web-services-looks-gil-free</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45643007">https://news.ycombinator.com/item?id=45643007</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 20 Oct 2025 12:13:33 +0000</pubDate><link>https://blog.baro.dev/p/the-future-of-python-web-services-looks-gil-free</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45643007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45643007</guid></item><item><title><![CDATA[Lexical differential highlighting instead of syntax highlighting]]></title><description><![CDATA[
<p>Article URL: <a href="https://wordsandbuttons.online/lexical_differential_highlighting_instead_of_syntax_highlighting.html">https://wordsandbuttons.online/lexical_differential_highlighting_instead_of_syntax_highlighting.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45626265">https://news.ycombinator.com/item?id=45626265</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 18 Oct 2025 10:30:18 +0000</pubDate><link>https://wordsandbuttons.online/lexical_differential_highlighting_instead_of_syntax_highlighting.html</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45626265</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45626265</guid></item><item><title><![CDATA[CReact – JSX for the Cloud]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/creact-labs/creact">https://github.com/creact-labs/creact</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45546024">https://news.ycombinator.com/item?id=45546024</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 11 Oct 2025 02:20:38 +0000</pubDate><link>https://github.com/creact-labs/creact</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45546024</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45546024</guid></item><item><title><![CDATA[QUIC and the end of TCP sockets]]></title><description><![CDATA[
<p>Article URL: <a href="https://codemia.io/blog/path/QUIC-and-the-End-of-TCP-Sockets-How-User-Space-Transport-Rewrites-Flow-Control">https://codemia.io/blog/path/QUIC-and-the-End-of-TCP-Sockets-How-User-Space-Transport-Rewrites-Flow-Control</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45525303">https://news.ycombinator.com/item?id=45525303</a></p>
<p>Points: 62</p>
<p># Comments: 82</p>
]]></description><pubDate>Thu, 09 Oct 2025 09:14:58 +0000</pubDate><link>https://codemia.io/blog/path/QUIC-and-the-End-of-TCP-Sockets-How-User-Space-Transport-Rewrites-Flow-Control</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45525303</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45525303</guid></item><item><title><![CDATA[In C++ modules globally unique module names seem to be unavoidable]]></title><description><![CDATA[
<p>Article URL: <a href="https://nibblestew.blogspot.com/2025/09/in-c-modules-globally-unique-module.html">https://nibblestew.blogspot.com/2025/09/in-c-modules-globally-unique-module.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45441745">https://news.ycombinator.com/item?id=45441745</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 01 Oct 2025 18:56:02 +0000</pubDate><link>https://nibblestew.blogspot.com/2025/09/in-c-modules-globally-unique-module.html</link><dc:creator>charles_irl</dc:creator><comments>https://news.ycombinator.com/item?id=45441745</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45441745</guid></item></channel></rss>