<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: terafo</title><link>https://news.ycombinator.com/user?id=terafo</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 21:03:03 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=terafo" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by terafo in "Brave overhauled its Rust adblock engine with FlatBuffers, cutting memory 75%"]]></title><description><![CDATA[
<p>Dynamic libraries, as currently implemented, are a dumpster fire, and I'd really prefer everything to be statically linked. Ideally, though, I'd like to see exploration of a hybrid solution, where library code is tagged inside a binary, so if the OS detects that multiple applications are using the same version of a library, it isn't duplicated in RAM. Such a design would also allow libraries to be updated if absolutely necessary, either by the runtime or by some kind of package manager.</p>
]]></description><pubDate>Tue, 06 Jan 2026 02:36:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46508089</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46508089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46508089</guid></item><item><title><![CDATA[New comment by terafo in "Local AI is driving the biggest change in laptops in decades"]]></title><description><![CDATA[
<p>This article specifically talks about PC laptops and discusses changes in them.</p>
]]></description><pubDate>Wed, 24 Dec 2025 04:19:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46372409</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46372409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46372409</guid></item><item><title><![CDATA[New comment by terafo in "New Kindle feature uses AI to answer questions about books"]]></title><description><![CDATA[
<p>Having access to the text and being trained on the text are two different things.</p>
]]></description><pubDate>Fri, 12 Dec 2025 20:57:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46248857</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46248857</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46248857</guid></item><item><title><![CDATA[New comment by terafo in "New Kindle feature uses AI to answer questions about books"]]></title><description><![CDATA[
<p>There are LLMs that can process a 1-million-token context window. Amazon Nova 2, for one, even though it's definitely not the highest-quality model. You just put the whole book in context and have the LLM answer questions about it. And since the domain is pretty limited, you can store the KV cache for the most popular books on SSD, eliminating quite a bit of the cost.</p>
]]></description><pubDate>Fri, 12 Dec 2025 20:56:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46248836</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46248836</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46248836</guid></item><item><title><![CDATA[New comment by terafo in "DeepSeek R2 launch stalled as CEO balks at progress"]]></title><description><![CDATA[
<p>Yes</p>
]]></description><pubDate>Sat, 28 Jun 2025 12:08:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44404090</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=44404090</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44404090</guid></item><item><title><![CDATA[New comment by terafo in "DeepSeek R2 launch stalled as CEO balks at progress"]]></title><description><![CDATA[
<p>MLA uses way more flops in order to conserve memory bandwidth; the H20 has plenty of memory bandwidth and almost no flops. MLA makes sense on H100/H800, but on H20, GQA-based models are a far better option.</p>
]]></description><pubDate>Sat, 28 Jun 2025 11:56:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44404026</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=44404026</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44404026</guid></item><item><title><![CDATA[New comment by terafo in "MongoDB acquires Voyage AI"]]></title><description><![CDATA[
<p><a href="https://www.youtube.com/watch?v=b2F-DItXtZs" rel="nofollow">https://www.youtube.com/watch?v=b2F-DItXtZs</a></p>
]]></description><pubDate>Mon, 24 Feb 2025 23:40:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=43166275</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=43166275</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43166275</guid></item><item><title><![CDATA[New comment by terafo in "ChatGPT Pro"]]></title><description><![CDATA[
<p>Because you have to do inference distributed across multiple nodes at this point: for prefill, because prefill is actually quadratic, but also for memory reasons. The KV cache for 405B at 10M context length would take more than 5 terabytes (at bf16). That's 36 H200s just for the KV cache, but you would need roughly 48 GPUs to serve the bf16 version of the model. Generation speed on that setup would be roughly 30 tokens per second, 100k tokens per hour, and you can serve only a single user because batching doesn't make sense at these kinds of context lengths. If you pay 3 dollars per hour per GPU, that's $1,440 per million tokens. For the fp8 version the numbers are a bit better: you need only 24 GPUs and generation speed stays roughly the same, so it's only 700 dollars per million tokens. There are architectural modifications that will bring that down significantly, but it's still really, really expensive, and also quite hard to get working.</p>
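<p>The KV-cache arithmetic in this comment can be sketched in a few lines. The model shape below (126 layers, 8 GQA key/value heads, head dim 128) is an assumed Llama-405B-like configuration, not something stated in the comment:</p>

```python
# Back-of-envelope KV-cache size for a 405B-class dense model at long context.
# Model shape is an assumption (Llama-3.1-405B-like), not taken from the comment.
LAYERS = 126
KV_HEADS = 8          # GQA key/value heads
HEAD_DIM = 128
BYTES_PER_VALUE = 2   # bf16

context_tokens = 10_000_000  # 10M-token context

# K and V each store KV_HEADS * HEAD_DIM values per layer per token
bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
total_tb = bytes_per_token * context_tokens / 1e12

print(f"{bytes_per_token / 1e6:.2f} MB per token, {total_tb:.2f} TB total")
# -> 0.52 MB per token, 5.16 TB total
```

<p>Under these assumptions the cache comes out to roughly 5.2 TB, consistent with the "more than 5 terabytes" figure above.</p>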
]]></description><pubDate>Fri, 06 Dec 2024 05:33:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=42336669</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=42336669</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42336669</guid></item><item><title><![CDATA[New comment by terafo in "Nearly half of Nvidia's revenue comes from four mystery whales each buying $3B+"]]></title><description><![CDATA[
<p>Why mention Microsoft twice?</p>
]]></description><pubDate>Sat, 31 Aug 2024 20:15:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=41411624</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=41411624</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41411624</guid></item><item><title><![CDATA[New comment by terafo in "$50 2GB Raspberry Pi 5 comes with a lower price and a tweaked, cheaper CPU"]]></title><description><![CDATA[
<p>There was. Now the second gen of that goes for $15.</p>
]]></description><pubDate>Tue, 20 Aug 2024 09:03:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=41298210</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=41298210</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41298210</guid></item><item><title><![CDATA[New comment by terafo in "New exponent functions that make SiLU and SoftMax 2x faster, at full accuracy"]]></title><description><![CDATA[
<p>The overwhelming majority of flops is indeed spent on matmuls, but softmax disproportionately uses memory bandwidth, so it generally takes much longer than you'd expect from looking at flops alone.</p>
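<p>A rough arithmetic-intensity comparison illustrates the point. The matrix size and per-element flop count below are illustrative assumptions, not figures from the comment:</p>

```python
# Arithmetic intensity (flops per byte moved) of a square matmul vs. a row
# softmax, to show why softmax is memory-bandwidth-bound. Sizes are illustrative.
n = 4096
bytes_el = 2  # bf16

# Square matmul: 2*n^3 flops over roughly 3*n^2 elements moved (A, B, C)
matmul_flops = 2 * n**3
matmul_bytes = 3 * n**2 * bytes_el
matmul_intensity = matmul_flops / matmul_bytes

# Softmax over n^2 score elements: ~5 flops each (max, subtract, exp, sum,
# divide), with roughly one read and one write per element
softmax_flops = 5 * n**2
softmax_bytes = 2 * n**2 * bytes_el
softmax_intensity = softmax_flops / softmax_bytes

print(f"matmul: {matmul_intensity:.0f} flops/byte, "
      f"softmax: {softmax_intensity:.2f} flops/byte")
# -> matmul: 1365 flops/byte, softmax: 1.25 flops/byte
```

<p>With intensity this low, softmax runtime is set by how fast memory can feed the chip, not by flops — hence it takes far longer than its flop count suggests.</p>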
]]></description><pubDate>Wed, 15 May 2024 23:01:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=40373403</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=40373403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40373403</guid></item><item><title><![CDATA[New comment by terafo in "Maxtext: A simple, performant and scalable Jax LLM"]]></title><description><![CDATA[
<p>t5 is an architecture; t5x is a framework for training models that was created with that architecture in mind, but it can be used to train other architectures, including decoder-only ones (there is one in the examples).</p>
]]></description><pubDate>Wed, 24 Apr 2024 11:37:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=40143104</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=40143104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40143104</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>To quote their official response: "If the WSE weren't rectangular, the complexity of power delivery, I/O, mechanical integrity and cooling become much more difficult, to the point of impracticality."</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:59:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697438</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697438</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>Not right now.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:52:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697345</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697345</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697345</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>Because SRAM stopped getting smaller with recent nodes.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697341</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697341</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697341</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>This thing targets training, which isn't affected by tiny accelerators inside CPUs.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697339</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697339</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697339</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>No, it's comparable to the 230 MB of SRAM on a Groq chip, since both are SRAM-only chips that can't really use external memory.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697337</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697337</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697337</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>I would say the Bradley is actually more valuable, since it can serve a wider range of missions while having a higher crew survival rate and being more maneuverable.</p>
]]></description><pubDate>Fri, 16 Feb 2024 16:57:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=39399652</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39399652</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39399652</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>> <i>they have 24,700,000 left of fighting age</i><p>Without the equipment, logistics, and ammo to support it, that's dead weight. It's also very interesting that you omitted the Gulf War, which would be the most similar conflict in terms of power dynamics: the 4th-strongest military in the world at war against a large coalition of countries led by the USA.</p>
]]></description><pubDate>Fri, 16 Feb 2024 16:18:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=39399007</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39399007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39399007</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>Wrong. Shells, artillery, drone components, engineering vehicles, tanks, APCs, jets, long-range missiles, anti-air defenses. 10x that and Ukraine starts winning again. 10x manpower won't do that.</p>
]]></description><pubDate>Fri, 16 Feb 2024 15:52:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=39398609</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39398609</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39398609</guid></item></channel></rss>