<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: terafo</title><link>https://news.ycombinator.com/user?id=terafo</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 28 Apr 2026 21:03:03 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=terafo" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by terafo in "Brave overhauled its Rust adblock engine with FlatBuffers, cutting memory 75%"]]></title><description><![CDATA[
<p>Dynamic libraries, as currently implemented, are a dumpster fire, and I'd really prefer everything to be statically linked. Ideally, though, I'd like to see exploration of a hybrid solution, where library code is tagged inside a binary, so if the OS detects that multiple applications are using the same version of a library, it isn't duplicated in RAM. Such a design would also allow libraries to be updated if absolutely necessary, either by the runtime or by some kind of package manager.</p>
]]></description><pubDate>Tue, 06 Jan 2026 02:36:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=46508089</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46508089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46508089</guid></item><item><title><![CDATA[New comment by terafo in "Local AI is driving the biggest change in laptops in decades"]]></title><description><![CDATA[
<p>This article specifically talks about PC laptops and discusses changes in them.</p>
]]></description><pubDate>Wed, 24 Dec 2025 04:19:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=46372409</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46372409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46372409</guid></item><item><title><![CDATA[New comment by terafo in "New Kindle feature uses AI to answer questions about books"]]></title><description><![CDATA[
<p>Having access to the text and being trained on the text are two different things.</p>
]]></description><pubDate>Fri, 12 Dec 2025 20:57:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46248857</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46248857</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46248857</guid></item><item><title><![CDATA[New comment by terafo in "New Kindle feature uses AI to answer questions about books"]]></title><description><![CDATA[
<p>There are LLMs that can process a 1-million-token context window. Amazon Nova 2, for one, even though it's definitely not the highest-quality model. You just put the whole book in context and have the LLM answer questions about it. And since the domain is pretty limited, you can store the KV cache for the most popular books on SSD, eliminating quite a bit of the cost.</p>
]]></description><pubDate>Fri, 12 Dec 2025 20:56:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=46248836</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=46248836</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46248836</guid></item><item><title><![CDATA[New comment by terafo in "DeepSeek R2 launch stalled as CEO balks at progress"]]></title><description><![CDATA[
<p>Yes</p>
]]></description><pubDate>Sat, 28 Jun 2025 12:08:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44404090</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=44404090</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44404090</guid></item><item><title><![CDATA[New comment by terafo in "DeepSeek R2 launch stalled as CEO balks at progress"]]></title><description><![CDATA[
<p>MLA uses way more flops in order to conserve memory bandwidth; the H20 has plenty of memory bandwidth and almost no flops. MLA makes sense on H100/H800, but on H20, GQA-based models are a far better option.</p>
]]></description><pubDate>Sat, 28 Jun 2025 11:56:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44404026</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=44404026</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44404026</guid></item><item><title><![CDATA[New comment by terafo in "MongoDB acquires Voyage AI"]]></title><description><![CDATA[
<p><a href="https://www.youtube.com/watch?v=b2F-DItXtZs" rel="nofollow">https://www.youtube.com/watch?v=b2F-DItXtZs</a></p>
]]></description><pubDate>Mon, 24 Feb 2025 23:40:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=43166275</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=43166275</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43166275</guid></item><item><title><![CDATA[New comment by terafo in "ChatGPT Pro"]]></title><description><![CDATA[
<p>Because you have to do inference distributed across multiple nodes at this point: for prefill, because prefill is actually quadratic, but also for memory reasons. The KV cache for 405B at 10M context length would take more than 5 terabytes (at bf16). That's 36 H200s just for the KV cache, but you would need roughly 48 GPUs to serve the bf16 version of the model. Generation speed on that setup would be roughly 30 tokens per second, 100k tokens per hour, and you can serve only a single user because batching doesn't make sense at these kinds of context lengths. If you pay 3 dollars per hour per GPU, that's $1,440 per million tokens. For the fp8 version the numbers are a bit better: you need only 24 GPUs and generation speed stays roughly the same, so it's only 700 dollars per million tokens. There are architectural modifications that will bring that down significantly, but it's still really, really expensive, and also quite hard to get working.</p>
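<p>The KV-cache arithmetic in this comment can be sketched in a few lines. The model shape below (126 layers, 8 GQA key/value heads, head dim 128) is an assumed Llama-405B-like configuration, not something stated in the comment:</p>

```python
# Back-of-envelope KV-cache size for a 405B-class dense model at long context.
# Model shape is an assumption (Llama-3.1-405B-like), not taken from the comment.
LAYERS = 126
KV_HEADS = 8          # GQA key/value heads
HEAD_DIM = 128
BYTES_PER_VALUE = 2   # bf16

context_tokens = 10_000_000  # 10M-token context

# K and V each store KV_HEADS * HEAD_DIM values per layer per token
bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
total_tb = bytes_per_token * context_tokens / 1e12

print(f"{bytes_per_token / 1e6:.2f} MB per token, {total_tb:.2f} TB total")
# -> 0.52 MB per token, 5.16 TB total
```

<p>Under these assumptions the cache comes out to roughly 5.2 TB, consistent with the "more than 5 terabytes" figure above.</p>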
]]></description><pubDate>Fri, 06 Dec 2024 05:33:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=42336669</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=42336669</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42336669</guid></item><item><title><![CDATA[New comment by terafo in "Nearly half of Nvidia's revenue comes from four mystery whales each buying $3B+"]]></title><description><![CDATA[
<p>Why mention Microsoft twice?</p>
]]></description><pubDate>Sat, 31 Aug 2024 20:15:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=41411624</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=41411624</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41411624</guid></item><item><title><![CDATA[New comment by terafo in "$50 2GB Raspberry Pi 5 comes with a lower price and a tweaked, cheaper CPU"]]></title><description><![CDATA[
<p>There was. Now the second gen of that goes for $15.</p>
]]></description><pubDate>Tue, 20 Aug 2024 09:03:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=41298210</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=41298210</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41298210</guid></item><item><title><![CDATA[New comment by terafo in "New exponent functions that make SiLU and SoftMax 2x faster, at full accuracy"]]></title><description><![CDATA[
<p>The overwhelming majority of flops is indeed spent on matmuls, but softmax disproportionately uses memory bandwidth, so it generally takes much longer than you'd expect from looking at flops alone.</p>
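<p>A rough arithmetic-intensity comparison illustrates the point. The matrix size and per-element flop count below are illustrative assumptions, not figures from the comment:</p>

```python
# Arithmetic intensity (flops per byte moved) of a square matmul vs. a row
# softmax, to show why softmax is memory-bandwidth-bound. Sizes are illustrative.
n = 4096
bytes_el = 2  # bf16

# Square matmul: 2*n^3 flops over roughly 3*n^2 elements moved (A, B, C)
matmul_flops = 2 * n**3
matmul_bytes = 3 * n**2 * bytes_el
matmul_intensity = matmul_flops / matmul_bytes

# Softmax over n^2 score elements: ~5 flops each (max, subtract, exp, sum,
# divide), with roughly one read and one write per element
softmax_flops = 5 * n**2
softmax_bytes = 2 * n**2 * bytes_el
softmax_intensity = softmax_flops / softmax_bytes

print(f"matmul: {matmul_intensity:.0f} flops/byte, "
      f"softmax: {softmax_intensity:.2f} flops/byte")
# -> matmul: 1365 flops/byte, softmax: 1.25 flops/byte
```

<p>With intensity this low, softmax runtime is set by how fast memory can feed the chip, not by flops — hence it takes far longer than its flop count suggests.</p>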
]]></description><pubDate>Wed, 15 May 2024 23:01:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=40373403</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=40373403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40373403</guid></item><item><title><![CDATA[New comment by terafo in "Maxtext: A simple, performant and scalable Jax LLM"]]></title><description><![CDATA[
<p>t5 is an architecture; t5x is a framework for training models that was created with that architecture in mind, but it can be used to train other architectures, including decoder-only ones (there is one in the examples).</p>
]]></description><pubDate>Wed, 24 Apr 2024 11:37:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=40143104</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=40143104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40143104</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>To quote their official response: "If the WSE weren't rectangular, the complexity of power delivery, I/O, mechanical integrity and cooling become much more difficult, to the point of impracticality."</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:59:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697438</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697438</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>Not right now.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:52:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697345</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697345</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697345</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>Because SRAM stopped getting smaller with recent nodes.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697341</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697341</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697341</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>This thing targets training, which isn't affected by tiny accelerators inside CPUs.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697339</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697339</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697339</guid></item><item><title><![CDATA[New comment by terafo in "4T transistors, one giant chip (Cerebras WSE-3) [video]"]]></title><description><![CDATA[
<p>No, it's comparable to the 230 MB of SRAM on a Groq chip, since both are SRAM-only chips that can't really use external memory.</p>
]]></description><pubDate>Wed, 13 Mar 2024 20:51:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=39697337</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39697337</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39697337</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>I would say the Bradley is actually more valuable, since it can serve a wider range of missions while having a higher crew survival rate and being more maneuverable.</p>
]]></description><pubDate>Fri, 16 Feb 2024 16:57:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=39399652</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39399652</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39399652</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>> <i>they have 24,700,000 left of fighting age</i><p>Without the equipment, logistics, and ammo to support it, that's dead weight. It's also very interesting that you omitted the Gulf War, which would be the most similar conflict in terms of power dynamics: the 4th-strongest military in the world at war against a large coalition of countries led by the USA.</p>
]]></description><pubDate>Fri, 16 Feb 2024 16:18:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=39399007</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39399007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39399007</guid></item><item><title><![CDATA[New comment by terafo in "Alexei Navalny has died"]]></title><description><![CDATA[
<p>Wrong. Shells, artillery, drone components, engineering vehicles, tanks, APCs, jets, long-range missiles, anti-air defenses. 10x that and Ukraine starts winning again. 10x manpower won't do that.</p>
]]></description><pubDate>Fri, 16 Feb 2024 15:52:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=39398609</link><dc:creator>terafo</dc:creator><comments>https://news.ycombinator.com/item?id=39398609</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39398609</guid></item></channel></rss>