<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: lukebechtel</title><link>https://news.ycombinator.com/user?id=lukebechtel</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 30 Jun 2026 23:45:35 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=lukebechtel" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by lukebechtel in "An update on recent Claude Code quality reports"]]></title><description><![CDATA[
<p>Some people seem to be suggesting these are coverups for quantization...<p>Those who work on agent harnesses for a living realize how sensitive models can be to even minor changes in the prompt.<p>I would not suspect quantization before I would suspect harness changes.</p>
]]></description><pubDate>Thu, 23 Apr 2026 18:36:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47879675</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47879675</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47879675</guid></item><item><title><![CDATA[New comment by lukebechtel in "Show HN: A game where you build a GPU"]]></title><description><![CDATA[
<p>really fun :) thanks!</p>
]]></description><pubDate>Sat, 04 Apr 2026 20:46:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47643189</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47643189</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47643189</guid></item><item><title><![CDATA[New comment by lukebechtel in "Cursor 3"]]></title><description><![CDATA[
<p>it sounds like you described it pretty well!</p>
]]></description><pubDate>Thu, 02 Apr 2026 19:07:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47618800</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47618800</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47618800</guid></item><item><title><![CDATA[New comment by lukebechtel in "Anatomy of the .claude/ folder"]]></title><description><![CDATA[
<p>~/.claude/projects is where the real fun is :)</p>
]]></description><pubDate>Sat, 28 Mar 2026 00:26:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47550184</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47550184</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47550184</guid></item><item><title><![CDATA[New comment by lukebechtel in "Autoresearch on an old research idea"]]></title><description><![CDATA[
<p>What is your domain?</p>
]]></description><pubDate>Mon, 23 Mar 2026 23:47:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47496745</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47496745</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47496745</guid></item><item><title><![CDATA[New comment by lukebechtel in "Reports of code's death are greatly exaggerated"]]></title><description><![CDATA[
<p>so we need to make some crazy llms...</p>
]]></description><pubDate>Mon, 23 Mar 2026 06:53:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47486206</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47486206</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47486206</guid></item><item><title><![CDATA[New comment by lukebechtel in "IMG_0416 (2024)"]]></title><description><![CDATA[
<p>there used to be <a href="https://default-filename-tv.neocities.org/" rel="nofollow">https://default-filename-tv.neocities.org/</a> but it got taken down :/</p>
]]></description><pubDate>Fri, 13 Mar 2026 05:35:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47361049</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47361049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47361049</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>The bitter lesson strikes again, I suppose!</p>
]]></description><pubDate>Wed, 11 Mar 2026 16:43:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47337912</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47337912</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47337912</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Good questions! It's clear I need to gather more metrics from our next generated inference library.</p>
]]></description><pubDate>Wed, 11 Mar 2026 16:42:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47337904</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47337904</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47337904</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Unfortunately it hasn't been open sourced. We're debating how / when to do this right now.</p>
]]></description><pubDate>Wed, 11 Mar 2026 16:41:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47337888</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47337888</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47337888</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>This is a fair critique! We plan to use our system to generate many more inference libraries of this nature, and I'll make it a point to release better, broader correctness measures when we do so.</p>
]]></description><pubDate>Wed, 11 Mar 2026 03:00:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47331305</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47331305</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47331305</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Yes, great question!<p>The system started without paged attention, and recreated its own paged attention implementation automatically once it realized it was a bottleneck.<p>Pretty cool!</p>
]]></description><pubDate>Tue, 10 Mar 2026 21:49:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47329203</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47329203</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47329203</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Unfortunately, not at present; we went for FP8 because we believed it was generally the best tradeoff of quality and speed. It also allowed faster iteration.<p>We believe our improvements would hold on BF16, but let me check.</p>
]]></description><pubDate>Tue, 10 Mar 2026 20:23:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328359</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47328359</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328359</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>Yes, speculative decoding will make both us <i>and</i> vLLM faster, but we believe it would be a relatively even bump on both sides, so we didn't include it in this comparison. Worth another test!</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:03:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327497</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47327497</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327497</guid></item><item><title><![CDATA[New comment by lukebechtel in "Surpassing vLLM with a Generated Inference Stack"]]></title><description><![CDATA[
<p>We currently validate with MMLU and HellaSwag, and are getting this independently verified by a third party.<p>We have considered open-sourcing some of our optimized inference libraries in the future, but have not yet come to a decision on this.<p>Also, if you need a rough intuition as to why this is possible: it's because this entire inference stack was built for exactly one model, and thus we can really tune the entire framework accordingly.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:02:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327480</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47327480</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327480</guid></item><item><title><![CDATA[Surpassing vLLM with a Generated Inference Stack]]></title><description><![CDATA[
<p>Article URL: <a href="https://infinity.inc/case-studies/qwen3-optimization">https://infinity.inc/case-studies/qwen3-optimization</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47324364">https://news.ycombinator.com/item?id=47324364</a></p>
<p>Points: 62</p>
<p># Comments: 22</p>
]]></description><pubDate>Tue, 10 Mar 2026 15:12:52 +0000</pubDate><link>https://infinity.inc/case-studies/qwen3-optimization</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47324364</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47324364</guid></item><item><title><![CDATA[New comment by lukebechtel in "Claude Code Remote Control"]]></title><description><![CDATA[
<p>I also do this!</p>
]]></description><pubDate>Wed, 25 Feb 2026 16:09:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47153451</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47153451</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47153451</guid></item><item><title><![CDATA[New comment by lukebechtel in "Ggml.ai joins Hugging Face to ensure the long-term progress of Local AI"]]></title><description><![CDATA[
<p>Thank you Georgi <3</p>
]]></description><pubDate>Fri, 20 Feb 2026 20:44:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47093682</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47093682</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47093682</guid></item><item><title><![CDATA[New comment by lukebechtel in "Gemini 3.1 Pro"]]></title><description><![CDATA[
<p>Sonnet 4.6 is a third, and equivalent to Opus 4.5, which is usually enough for me :)<p>EDIT: Gemini does have 1M context for "free" though, so that's great.</p>
]]></description><pubDate>Fri, 20 Feb 2026 03:00:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47083117</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=47083117</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47083117</guid></item><item><title><![CDATA[New comment by lukebechtel in "Gemini 3 Deep Think"]]></title><description><![CDATA[
<p>ARC-AGI-2: 84.6% (vs 68.8% for Opus 4.6)<p>Wow.<p><a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-deep-think/" rel="nofollow">https://blog.google/innovation-and-ai/models-and-research/ge...</a></p>
]]></description><pubDate>Thu, 12 Feb 2026 17:06:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46991443</link><dc:creator>lukebechtel</dc:creator><comments>https://news.ycombinator.com/item?id=46991443</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46991443</guid></item></channel></rss>