<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: derbaum</title><link>https://news.ycombinator.com/user?id=derbaum</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 01 May 2026 12:48:50 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=derbaum" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by derbaum in "Puerto Rico's Solar Microgrids Beat Blackout"]]></title><description><![CDATA[
<p>Now I'm curious... Is your last suggestion correct? Wouldn't the time to cool down between pause intervals be proportionally longer due to the higher thermal mass and cancel out any savings gained by the long pause? Maybe the overall energy draw is even higher because the heat losses are higher when you spend a longer time with a high dT.</p>
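<p>A toy numeric sketch of the question above, assuming plain Newton's-law-of-cooling losses (loss proportional to dT) with completely made-up constants: over a closed heat-pause cycle the heater has to replace exactly what was lost, so the comparison comes down to the average dT each pause length produces.</p>

```python
# Toy model: heater power P_in while on, heat loss k * (T - T_amb) always
# (Newton's law of cooling). All constants below are made up for illustration.
def heater_energy(pause_s, total_s=24 * 3600, dt=1.0):
    T_amb, T_set = 20.0, 60.0      # ambient / setpoint, deg C
    C = 2.0e5                      # thermal mass, J/K (assumed)
    P_in, k = 3000.0, 25.0         # heater power (W), loss coefficient (W/K)
    T, energy, pause_left = T_set, 0.0, 0.0
    for _ in range(int(total_s / dt)):
        heating = pause_left <= 0.0 and T < T_set
        if T >= T_set and pause_left <= 0.0:
            pause_left = pause_s   # start a new pause at the setpoint
        power = P_in if heating else 0.0
        T += (power - k * (T - T_amb)) / C * dt
        energy += power * dt
        pause_left -= dt
    return energy / 3.6e6          # kWh over the simulated day

# Longer pauses let T sag further below the setpoint, lowering the average dT,
# so in this toy model total input drops rather than cancelling out:
print(heater_energy(pause_s=600), heater_energy(pause_s=3600))
```

<p>Of course, a real system with a comfort constraint (a minimum allowed temperature) would look different; this only illustrates the "time spent at high dT" part of the argument.</p>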
]]></description><pubDate>Fri, 27 Jun 2025 06:08:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=44394091</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=44394091</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44394091</guid></item><item><title><![CDATA[New comment by derbaum in "0.9999 ≊ 1"]]></title><description><![CDATA[
<p>Whether you multiply by 10 or 2, the same "counter" argument from the article stands. Only now, instead of a trailing zero after infinitely many nines, you have a trailing 8.</p>
]]></description><pubDate>Mon, 02 Jun 2025 08:56:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=44156919</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=44156919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44156919</guid></item><item><title><![CDATA[New comment by derbaum in "Qwen3: Think deeper, act faster"]]></title><description><![CDATA[
<p>Very rough (!) napkin math: for a q8 model (almost lossless), the parameter count in billions roughly equals the VRAM requirement in GB. For q4, with some performance loss, it's roughly half. Then you add a little for the context window and overhead. So a 32B model at q4 should run comfortably on 20-24 GB.<p>Again, these are very rough numbers; there are calculators online.</p>
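<p>The napkin math above, as a sketch. The bytes-per-parameter figures follow from the quantization bit width; the flat 2 GB allowance for context and overhead is an assumed placeholder, not a rule.</p>

```python
def vram_estimate_gb(params_billions, quant_bits, overhead_gb=2.0):
    """Very rough VRAM estimate: quant_bits / 8 bytes per parameter,
    plus a flat allowance for context window and overhead (assumed)."""
    return params_billions * quant_bits / 8 + overhead_gb

# q8 ~= 1 byte per parameter, q4 ~= half that:
print(vram_estimate_gb(32, 8))   # 32B model at q8
print(vram_estimate_gb(32, 4))   # 32B model at q4, fits in 20-24 GB
```

<p>As the comment says: rough numbers only; real usage depends on context length, KV cache, and the runtime.</p>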
]]></description><pubDate>Mon, 28 Apr 2025 21:42:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=43826448</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=43826448</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43826448</guid></item><item><title><![CDATA[New comment by derbaum in "Gemma 3 Technical Report [pdf]"]]></title><description><![CDATA[
<p>The ollama page shows Gemma 27B beating Deepseek v3 and o3-mini on lmarena. I'm very excited to try it out.</p>
]]></description><pubDate>Wed, 12 Mar 2025 07:26:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=43340680</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=43340680</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43340680</guid></item><item><title><![CDATA[New comment by derbaum in "Has LLM killed traditional NLP?"]]></title><description><![CDATA[
<p>Wouldn't a simple comparison of the word frequencies in my text against a table of typical word frequencies do the trick here without an LLM? Sort of like BM25?</p>
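<p>One way the frequency comparison could look, as a minimal sketch: score each word by how over-represented it is in the text relative to a background frequency table. The tiny background table here is made up; in practice it would come from a large reference corpus.</p>

```python
from collections import Counter

def distinctive_words(text, background_freq, top_n=5):
    """Score each word by how over-represented it is in `text` relative to
    a background frequency table (fraction of all corpus words, assumed)."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    # Unseen words get a tiny floor frequency, so rare words score high.
    scores = {w: (c / total) / background_freq.get(w, 1e-6)
              for w, c in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Tiny made-up background table (fraction of all words in a reference corpus):
bg = {"the": 0.05, "a": 0.03, "is": 0.02, "reactor": 1e-5, "cooling": 2e-5}
print(distinctive_words("the reactor cooling loop is the main risk", bg))
```

<p>This is the same idea BM25's IDF term captures: common function words score low, corpus-rare content words float to the top.</p>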
]]></description><pubDate>Sat, 18 Jan 2025 22:02:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=42751775</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42751775</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42751775</guid></item><item><title><![CDATA[New comment by derbaum in "Has LLM killed traditional NLP?"]]></title><description><![CDATA[
<p>That's interesting, it sounds a bit like those cluster graph visualisation techniques. Unfortunately, my texts seem to fall into clusters that really don't match the ones that I had hoped to get out of these methods. I guess it's just a matter of fine-tuning now.</p>
]]></description><pubDate>Sat, 18 Jan 2025 21:57:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=42751744</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42751744</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42751744</guid></item><item><title><![CDATA[New comment by derbaum in "Has LLM killed traditional NLP?"]]></title><description><![CDATA[
<p>One of the things I'm still struggling with when using LLMs instead of traditional NLP is classification against a large corpus of data. If I get a new text and want to find the most semantically similar text out of a million others, how would I do that with an LLM? Apart from choosing certain pre-defined categories (such as "friendly", "political", ...) and then letting the LLM rate each text on each category, I can't see a simple solution yet, except using embeddings (which I think could just be done with BERT, and arguably doesn't count as LLM usage?).</p>
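<p>For the embeddings route mentioned above, the retrieval step itself is small: cosine similarity between a query vector and the precomputed corpus vectors. The vectors here are toy 3-dimensional stand-ins; real ones would come from any sentence encoder (a BERT-style model suffices, no generative LLM needed), and at a million texts you'd likely use an ANN index rather than brute force.</p>

```python
import numpy as np

def most_similar(query_vec, corpus_vecs):
    """Return the index of the corpus embedding closest to the query
    by cosine similarity (brute force; fine for small corpora)."""
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    return int(np.argmax(c @ q))

# Toy 3-dimensional "embeddings" standing in for a million real ones:
corpus = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.7, 0.7, 0.0]])
print(most_similar(np.array([0.9, 0.1, 0.0]), corpus))
```

<p>Everything after the encoder is plain linear algebra, which is part of why it feels more like "traditional NLP" than LLM usage.</p>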
]]></description><pubDate>Sat, 18 Jan 2025 20:38:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=42751231</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42751231</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42751231</guid></item><item><title><![CDATA[New comment by derbaum in "Fluid Simulation Pendant"]]></title><description><![CDATA[
<p>The "issue" with saying an LLM can't do this is that CFD simulations are not actually that niche. Many university courses ask their students to write these kinds of algorithms for their course projects. All of this knowledge is freely available on the internet (as is evident from the YouTube videos the author mentioned), and as such can be learned by an LLM. The article is, of course, still very impressive.</p>
]]></description><pubDate>Tue, 14 Jan 2025 12:25:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=42696402</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42696402</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42696402</guid></item><item><title><![CDATA[New comment by derbaum in "Nvidia's Project Digits is a 'personal AI supercomputer'"]]></title><description><![CDATA[
<p>I'm a bit surprised by the number of comments comparing the cost to (often cheap) cloud solutions. Nvidia's value proposition is completely different, in my opinion. Say I have a startup in the EU that handles personal data or company secrets and wants to use an LLM to analyse them (e.g. via RAG). Having that data never leave your basement can easily be worth more than $3,000 if performance is not a bottleneck.</p>
]]></description><pubDate>Tue, 07 Jan 2025 08:58:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=42620643</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42620643</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42620643</guid></item><item><title><![CDATA[New comment by derbaum in "A Replacement for BERT"]]></title><description><![CDATA[
<p>Hey Jeremy, very exciting release! I'm currently building my first product with RoBERTa as one central component, and I'm very excited to see how ModernBERT compares. Quick question: when do you think the first multilingual versions will show up? Any plans to train your own?</p>
]]></description><pubDate>Thu, 19 Dec 2024 19:08:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=42464692</link><dc:creator>derbaum</dc:creator><comments>https://news.ycombinator.com/item?id=42464692</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42464692</guid></item></channel></rss>