<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: mzl</title><link>https://news.ycombinator.com/user?id=mzl</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 26 Apr 2026 10:22:26 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=mzl" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by mzl in "DeepSeek v4"]]></title><description><![CDATA[
<p>Kimi K2.5 and K2.6 are both >1T</p>
]]></description><pubDate>Fri, 24 Apr 2026 07:38:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47886920</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47886920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47886920</guid></item><item><title><![CDATA[New comment by mzl in "DeepSeek v4"]]></title><description><![CDATA[
<p>It is tricky to build good infrastructure for prompt caching.</p>
]]></description><pubDate>Fri, 24 Apr 2026 07:21:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47886759</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47886759</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47886759</guid></item><item><title><![CDATA[New comment by mzl in "SpaceX says it has agreement to acquire Cursor for $60B"]]></title><description><![CDATA[
<p>Which version of Kimi and served from where?</p>
]]></description><pubDate>Wed, 22 Apr 2026 09:25:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47861114</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47861114</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47861114</guid></item><item><title><![CDATA[New comment by mzl in "SpaceX says it has agreement to acquire Cursor for $60B"]]></title><description><![CDATA[
<p>Composer-2 is based on Kimi K2.5, but with extensive RL. Cursor estimated 3x more compute on their RL than the original K2.5 training run (some details in <a href="https://cursor.com/blog/composer-2-technical-report" rel="nofollow">https://cursor.com/blog/composer-2-technical-report</a>).<p>Composer-2 seems very useful in Cursor, while K2.6 according to AA seems to be a really useful general model: <a href="https://artificialanalysis.ai/articles/kimi-k2-6-the-new-leading-open-weights-model" rel="nofollow">https://artificialanalysis.ai/articles/kimi-k2-6-the-new-lea...</a></p>
]]></description><pubDate>Wed, 22 Apr 2026 09:25:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47861109</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47861109</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47861109</guid></item><item><title><![CDATA[New comment by mzl in "Acetaminophen vs. ibuprofen"]]></title><description><![CDATA[
<p>I've been prescribed slightly more than 5g per day (2 x 650mg tablets every 6 hours), together with ibuprofen, for pain after an operation, which is scarily close to the limit.</p>
]]></description><pubDate>Wed, 22 Apr 2026 07:26:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47860251</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47860251</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47860251</guid></item><item><title><![CDATA[New comment by mzl in "Tesla concealed fatal accidents to continue testing autonomous driving"]]></title><description><![CDATA[
<p>I've heard people say the study is bad, but whenever I've asked why, the answers have been pretty weak. Do you have a good source for why we should disregard it?</p>
]]></description><pubDate>Mon, 20 Apr 2026 12:56:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47833621</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47833621</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47833621</guid></item><item><title><![CDATA[New comment by mzl in "Tesla concealed fatal accidents to continue testing autonomous driving"]]></title><description><![CDATA[
<p>Dan Luu had some interesting analysis of car safety, comparing how different automakers fared on newly introduced crash tests: <a href="https://danluu.com/car-safety/" rel="nofollow">https://danluu.com/car-safety/</a><p>The main take-away for me from that page is that very few manufacturers seem to design for actual safety (only Volvo had good results), and Tesla was angry that a new test had been introduced, which feels indicative of a bad safety culture.</p>
]]></description><pubDate>Mon, 20 Apr 2026 12:53:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47833596</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47833596</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47833596</guid></item><item><title><![CDATA[New comment by mzl in "Oracle slashes 30k jobs"]]></title><description><![CDATA[
<p>There was an interesting scandal in Sweden where Oracle managed to sell the Millennium system to a region's hospitals even though it did not fulfill the requirements, and when it inevitably crashed and burned they had to do an emergency rollback to the previous system after just a few days.<p>Here is an article in English: <a href="https://www.heise.de/en/news/Scrapping-the-millennium-introduction-of-a-health-record-in-Sweden-fails-10323142.html" rel="nofollow">https://www.heise.de/en/news/Scrapping-the-millennium-introd...</a></p>
]]></description><pubDate>Wed, 01 Apr 2026 07:54:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47598089</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47598089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47598089</guid></item><item><title><![CDATA[New comment by mzl in "Don't post generated/AI-edited comments. HN is for conversation between humans"]]></title><description><![CDATA[
<p>No, but a lot of AI-adjusted wordings have the very idiosyncratic AI style that is prevalent in the AI slop that is everywhere, and that style has quickly become associated with writing that is generally devoid of content and insight. So it is natural to have gut reactions to the typical phrasings that have become associated with AI.</p>
]]></description><pubDate>Thu, 12 Mar 2026 10:02:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47348540</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47348540</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47348540</guid></item><item><title><![CDATA[New comment by mzl in "Building a Procedural Hex Map with Wave Function Collapse"]]></title><description><![CDATA[
<p>As others have said, this is more of a constraint programming system than Wave Function Collapse. Whatever one wants to call it, I liked it.<p>For guiding the search, you might want to consider search steps that select only one feature, for example that a pair of adjacent tiles should be connected by a road, and just propagate that information. That could be used as a way to guide the search on high-level features first, and then later realize the plans by doing the normal search.</p>
]]></description><pubDate>Tue, 10 Mar 2026 16:14:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47325236</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47325236</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47325236</guid></item><item><title><![CDATA[New comment by mzl in "Building a Procedural Hex Map with Wave Function Collapse"]]></title><description><![CDATA[
<p>I have a (very slight) beef with the name Algorithm X, as it is more of a data structure for managing undo information in the backtracking than an algorithm. It is a very fun, useful, and interesting data structure, but it doesn't really change what steps are performed in the backtracking search.</p>
]]></description><pubDate>Tue, 10 Mar 2026 07:07:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47319960</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47319960</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47319960</guid></item><item><title><![CDATA[New comment by mzl in "My spicy take on vibe coding for PMs"]]></title><description><![CDATA[
<p>In my view, Scrum is a way to force dysfunctional teams to have some process; it is not useful for a team that is already delivering and working in a small-a agile manner.</p>
]]></description><pubDate>Wed, 04 Mar 2026 07:45:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47244388</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47244388</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47244388</guid></item><item><title><![CDATA[New comment by mzl in "Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts"]]></title><description><![CDATA[
<p>Are you using model GPU memory snapshotting for this?</p>
]]></description><pubDate>Thu, 26 Feb 2026 09:10:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47163718</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47163718</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47163718</guid></item><item><title><![CDATA[SambaNova Unveils Fastest Chip for Agentic AI, and Raises $350M+]]></title><description><![CDATA[
<p>Article URL: <a href="https://sambanova.ai/press/sambanova-unveils-fastest-chip-for-agentic-ai-collaborates-with-intel-and-raises-350m">https://sambanova.ai/press/sambanova-unveils-fastest-chip-for-agentic-ai-collaborates-with-intel-and-raises-350m</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47136375">https://news.ycombinator.com/item?id=47136375</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Feb 2026 12:41:53 +0000</pubDate><link>https://sambanova.ai/press/sambanova-unveils-fastest-chip-for-agentic-ai-collaborates-with-intel-and-raises-350m</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47136375</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47136375</guid></item><item><title><![CDATA[New comment by mzl in "Step 3.5 Flash – Open-source foundation model, supports deep reasoning at speed"]]></title><description><![CDATA[
<p>I like the intelligence-per-watt and intelligence-per-joule framing in <a href="https://arxiv.org/abs/2511.07885" rel="nofollow">https://arxiv.org/abs/2511.07885</a>. It feels like a very useful measure for thinking about long-term sustainable variants of AI build-outs.</p>
]]></description><pubDate>Thu, 19 Feb 2026 12:05:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47072926</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47072926</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47072926</guid></item><item><title><![CDATA[New comment by mzl in "Expensively Quadratic: The LLM Agent Cost Curve"]]></title><description><![CDATA[
<p>The cost of running something like prompt caching is determined by the implementation, since that is what sets the infrastructure costs.</p>
]]></description><pubDate>Tue, 17 Feb 2026 06:04:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47044228</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47044228</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47044228</guid></item><item><title><![CDATA[New comment by mzl in "Expensively Quadratic: The LLM Agent Cost Curve"]]></title><description><![CDATA[
<p>Saying that it is just an index from string prefixes into the KV cache misses all the fun, interesting, and complicated parts of it. While technically the prompt pointers are tiny compared with the data they point into, the massive scale of managing this over all users and requests, and routing inside the compute cluster, makes it an expensive thing to implement and tune. Keeping the prompt cache sufficiently responsive and storing the large KV caches somewhere also costs a lot in resources.<p>I think that the OpenAI docs are pretty useful for an API-level understanding of how it can work (<a href="https://developers.openai.com/api/docs/guides/prompt-caching/" rel="nofollow">https://developers.openai.com/api/docs/guides/prompt-caching...</a>). The vLLM docs (<a href="https://docs.vllm.ai/en/stable/design/prefix_caching/" rel="nofollow">https://docs.vllm.ai/en/stable/design/prefix_caching/</a>) and SGLang's radix hashing (<a href="https://lmsys.org/blog/2024-01-17-sglang/" rel="nofollow">https://lmsys.org/blog/2024-01-17-sglang/</a>) are useful for insight into how to implement it locally for one compute node.</p>
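The lookup side of that index can be sketched as a toy prefix trie over token sequences. This is a hypothetical, simplified illustration: real engines work at block or radix-tree granularity and manage GPU memory, eviction, and routing, which is where the cost lives.

```python
# Toy index from token-sequence prefixes to cached KV-state ids.
# Hypothetical sketch only: real systems (vLLM, SGLang) cache at
# block/radix granularity and must manage memory and eviction.

class PrefixCache:
    def __init__(self):
        self.root = {}  # token -> child node; "_kv" marks a cached state

    def insert(self, tokens, kv_id):
        node = self.root
        for t in tokens:
            node = node.setdefault(t, {})
        node["_kv"] = kv_id  # id of the stored KV state for this prefix

    def longest_prefix(self, tokens):
        """Return (matched_length, kv_id) for the longest cached prefix."""
        node, best = self.root, (0, None)
        for i, t in enumerate(tokens):
            if t not in node:
                break
            node = node[t]
            if "_kv" in node:
                best = (i + 1, node["_kv"])
        return best

cache = PrefixCache()
cache.insert([1, 2, 3], kv_id="sys-prompt")
cache.insert([1, 2, 3, 4, 5], kv_id="sys+history")

# A new request sharing only the first three tokens reuses that state.
print(cache.longest_prefix([1, 2, 3, 4, 9]))  # -> (3, 'sys-prompt')
```

The trie itself is small; the expense the comment describes is in everything around it.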
]]></description><pubDate>Mon, 16 Feb 2026 20:28:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47039895</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47039895</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47039895</guid></item><item><title><![CDATA[New comment by mzl in "Expensively Quadratic: The LLM Agent Cost Curve"]]></title><description><![CDATA[
<p>The prompt cache caches KV cache states based on prefixes of previous prompts and conversations. For a particular coding-agent conversation, caching might be more involved (with cache handles and so on); I'm talking about the general case here. This is a way to avoid repeating the same quadratic cost of computing over the prompt. Typically, LLM providers charge much less for reading from this cache than for computing again.<p>Since the prompt cache key is (by necessity, this is how LLMs work) a prefix of the prompt, if you make repeated API calls in some service, there are large savings possible from organizing queries so that rarely varying content comes first and frequently varying content comes later. For example, if you included the current date and time as the first data point in your call, that would force a recomputation every time.</p>
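That ordering point can be sketched as follows (hypothetical prompt layout, not any specific provider's API): put the stable parts first so the shared prefix, and thus the cache hit, is as long as possible.

```python
from datetime import datetime, timezone

SYSTEM = "You are a helpful assistant for ACME support."  # stable across calls
DOCS = "...long shared reference material..."             # stable across calls

def bad_prompt(question):
    # Timestamp first: the prefix differs on every call, so the
    # provider's prompt cache can never reuse earlier computation.
    now = datetime.now(timezone.utc).isoformat()
    return f"Current time: {now}\n{SYSTEM}\n{DOCS}\nQ: {question}"

def good_prompt(question):
    # Stable content first: every call shares a long common prefix,
    # and only the tail (time + question) must be recomputed.
    now = datetime.now(timezone.utc).isoformat()
    return f"{SYSTEM}\n{DOCS}\nCurrent time: {now}\nQ: {question}"

def common_prefix_len(a, b):
    # Length of the shared prefix, a rough proxy for cacheable work.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

With the good layout, two different questions still share at least the system text and reference material as a prefix; with the bad layout they diverge at the first character of the timestamp.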
]]></description><pubDate>Mon, 16 Feb 2026 13:59:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47035009</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47035009</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47035009</guid></item><item><title><![CDATA[New comment by mzl in "Expensively Quadratic: The LLM Agent Cost Curve"]]></title><description><![CDATA[
<p>Depends on which cache you mean. The KV Cache gets read on every token generated, but the prompt cache (which is what incurs the cache read cost) is read on conversation starts.</p>
]]></description><pubDate>Mon, 16 Feb 2026 12:23:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47034155</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=47034155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47034155</guid></item><item><title><![CDATA[New comment by mzl in "GPT‑5.3‑Codex‑Spark"]]></title><description><![CDATA[
<p>Technically, Cerebras' solution is really cool. However, I am skeptical that it will be economically useful for larger models, as the number of racks required scales with the size of the model in order to fit the weights in SRAM.</p>
]]></description><pubDate>Fri, 13 Feb 2026 07:38:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=46999992</link><dc:creator>mzl</dc:creator><comments>https://news.ycombinator.com/item?id=46999992</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46999992</guid></item></channel></rss>