<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: martinloretz</title><link>https://news.ycombinator.com/user?id=martinloretz</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 02 May 2026 19:29:06 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=martinloretz" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by martinloretz in "ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models (2023)"]]></title><description><![CDATA[
<p>I think this paper is the key to the next speedup in local LLM inference. By making the model's activations sparse (using the ReLU activation), we can skip around 80% of the memory accesses and computation in the feed-forward layers. ReLU sets an activation to zero whenever it is negative, and since anything multiplied by zero is zero, the next layer doesn't need to load the weight rows or columns that would only ever be multiplied by those zeros.</p><p>Unfortunately, there aren't many models currently trained with ReLU activations.</p>
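<p>A minimal numpy sketch of the argument above (an illustration, not the paper's implementation; the layer sizes and random weights are made up for the demo): after ReLU, only the nonzero activations contribute to the next matmul, so the matching columns of the down-projection can be gathered and the rest never loaded.</p>
<pre><code>import numpy as np

# Toy feed-forward block: h = ReLU(W_up @ x), y = W_down @ h.
# Shapes and weights are arbitrary demo values.
rng = np.random.default_rng(0)
d_model, d_ff = 512, 2048
W_up = rng.standard_normal((d_ff, d_model))
W_down = rng.standard_normal((d_model, d_ff))
x = rng.standard_normal(d_model)

h = np.maximum(W_up @ x, 0.0)  # ReLU: negative activations become exactly 0
active = np.nonzero(h)[0]      # neurons that survived

# Dense path reads all of W_down; the sparse path only reads the columns
# paired with nonzero activations -- same result, far fewer weight loads.
y_dense = W_down @ h
y_sparse = W_down[:, active] @ h[active]

assert np.allclose(y_dense, y_sparse)
# Random weights give ~50% zeros; trained ReLU LLMs are reported to be
# much sparser, which is where the claimed savings come from.
print(f"active: {len(active)}/{d_ff} neurons")
</code></pre>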
]]></description><pubDate>Thu, 20 Feb 2025 14:27:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=43115032</link><dc:creator>martinloretz</dc:creator><comments>https://news.ycombinator.com/item?id=43115032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43115032</guid></item><item><title><![CDATA[ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models (2023)]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2310.04564">https://arxiv.org/abs/2310.04564</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43114770">https://news.ycombinator.com/item?id=43114770</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Thu, 20 Feb 2025 14:05:59 +0000</pubDate><link>https://arxiv.org/abs/2310.04564</link><dc:creator>martinloretz</dc:creator><comments>https://news.ycombinator.com/item?id=43114770</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43114770</guid></item><item><title><![CDATA[New comment by martinloretz in "Show HN: A GPU-accelerated binary vector index"]]></title><description><![CDATA[
<p>Great work. Can you elaborate on how the radix selection works, and how you got it working with floats and inner-product distance? I had a quick look at the code; I'm not familiar with radix selection, but I'm really interested in building extremely fast GPU indices.</p>
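<p>For what it's worth, one common trick for radix selection over floats (a sketch under that assumption, not a description of this repo's code) is to reinterpret each IEEE-754 float as an unsigned integer with the same ordering, then select one digit at a time from the most significant byte; for inner-product search the scores are dot products and you keep the k largest:</p>
<pre><code>import numpy as np

def float_to_ordered_uint(f):
    """Monotonic float32 -> uint32 map: a < b iff map(a) < map(b).
    Positives get the sign bit set; negatives are fully bit-flipped.
    (Assumes no NaNs among the scores.)"""
    u = np.asarray(f, dtype=np.float32).view(np.uint32)
    return np.where(u & 0x80000000, ~u, u | 0x80000000)

def radix_select_topk(scores, k):
    """Indices of the k largest scores, one 8-bit digit at a time."""
    keys = float_to_ordered_uint(scores)
    idx = np.arange(len(keys))
    picked = []
    for shift in (24, 16, 8, 0):
        digits = ((keys >> shift) & 0xFF).astype(np.intp)
        hist = np.bincount(digits, minlength=256)
        above = 0                          # keys strictly above the pivot digit
        for d in range(255, -1, -1):       # scan buckets from the top
            if above + hist[d] >= k:
                pivot = d
                break
            above += hist[d]
        picked.append(idx[digits > pivot]) # definitely in the top-k
        k -= above
        keep = digits == pivot             # recurse into the pivot bucket
        keys, idx = keys[keep], idx[keep]
    picked.append(idx[:k])                 # leftovers are all equal keys
    return np.concatenate(picked)

# E.g. for max inner product search: scores = database @ query.
scores = np.random.default_rng(1).standard_normal(100_000).astype(np.float32)
assert set(radix_select_topk(scores, 10)) == set(np.argsort(scores)[-10:])
</code></pre>
<p>On a GPU, the per-digit histogram and the compaction into the pivot bucket would be the parallel primitives; the sketch above only shows the control flow.</p>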
]]></description><pubDate>Tue, 18 Feb 2025 22:48:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=43096113</link><dc:creator>martinloretz</dc:creator><comments>https://news.ycombinator.com/item?id=43096113</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43096113</guid></item><item><title><![CDATA[A three body problem simulator]]></title><description><![CDATA[
<p>Article URL: <a href="https://three-body-problem.martinloretz.com/">https://three-body-problem.martinloretz.com/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42655911">https://news.ycombinator.com/item?id=42655911</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 10 Jan 2025 14:39:05 +0000</pubDate><link>https://three-body-problem.martinloretz.com/</link><dc:creator>martinloretz</dc:creator><comments>https://news.ycombinator.com/item?id=42655911</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42655911</guid></item></channel></rss>