<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ipieter</title><link>https://news.ycombinator.com/user?id=ipieter</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 24 Apr 2026 11:42:06 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ipieter" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Canada's AI Startup Cohere Buys Germany's Aleph Alpha to Expand in Europe]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.reuters.com/legal/transactional/canadas-cohere-germanys-aleph-alpha-announce-merger-handelsblatt-reports-2026-04-24/">https://www.reuters.com/legal/transactional/canadas-cohere-germanys-aleph-alpha-announce-merger-handelsblatt-reports-2026-04-24/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47887383">https://news.ycombinator.com/item?id=47887383</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 24 Apr 2026 08:35:12 +0000</pubDate><link>https://www.reuters.com/legal/transactional/canadas-cohere-germanys-aleph-alpha-announce-merger-handelsblatt-reports-2026-04-24/</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=47887383</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47887383</guid></item><item><title><![CDATA[New comment by ipieter in "Claude Token Counter, now with model comparisons"]]></title><description><![CDATA[
<p>There is currently very little evidence that morphological tokenizers improve model performance [1]. For languages like German (where words get glued together into compounds) there is a bit more evidence (e.g. a paper I worked on [2]), but overall I am starting to suspect the bitter lesson also holds for tokenization.<p>[1] <a href="https://arxiv.org/pdf/2507.06378" rel="nofollow">https://arxiv.org/pdf/2507.06378</a><p>[2] <a href="https://pieter.ai/bpe-knockout/" rel="nofollow">https://pieter.ai/bpe-knockout/</a></p>
]]></description><pubDate>Mon, 20 Apr 2026 06:04:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47830894</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=47830894</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47830894</guid></item><item><title><![CDATA[New comment by ipieter in "Why DeepSeek is cheap at scale but expensive to run locally"]]></title><description><![CDATA[
<p>Distributing inference per layer, instead of splitting each layer across GPUs, is indeed another approach, called pipeline parallelism. However, per batch there is less compute available (only one GPU works at a time), so inference is slower. In addition, orchestrating the start of the next batch on GPU #0 while GPU #1 is still processing the previous one is quite tricky. For these reasons, tensor parallelism as I described is far more common in LLM inference.</p>
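The "only one GPU works at a time" cost can be made concrete with a small sketch. Assuming a simple GPipe-style schedule (my assumption, not something from the comment above), a pipeline of P stages fed B micro-batches takes B + P - 1 stage-steps, so the fraction of GPU time doing useful work is B/(B + P - 1):

```python
# Utilization of a simple GPipe-style pipeline schedule: P stages,
# B micro-batches, B + P - 1 stage-steps total. The remainder is the
# pipeline "bubble" where some GPUs sit idle.
def pipeline_utilization(stages: int, microbatches: int) -> float:
    return microbatches / (microbatches + stages - 1)

# A single batch on 8 GPUs keeps only one GPU busy at a time:
print(pipeline_utilization(8, 1))    # 0.125
# Many micro-batches amortize the bubble:
print(pipeline_utilization(8, 64))   # ~0.90
```

This is why pipeline parallelism only pays off with many micro-batches in flight, which is exactly the orchestration headache described above.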
]]></description><pubDate>Mon, 02 Jun 2025 22:44:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44164040</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=44164040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44164040</guid></item><item><title><![CDATA[New comment by ipieter in "Why DeepSeek is cheap at scale but expensive to run locally"]]></title><description><![CDATA[
<p>This is an interesting blogpost. While the general conclusion ("We need batching") is true, inference of mixture-of-experts (MoE) models is actually a bit more nuanced.<p>The main reason we want big batches is that LLM inference is limited not by compute, but by loading every single weight out of VRAM. Just compare the number of TFLOPS of an H100 with its memory bandwidth: there is room for roughly 300 FLOPs per byte loaded. That's why we want big batches: we can perform a lot of operations per parameter/weight that we load from memory. This kind of analysis is often referred to as the "roofline model".<p>As models become bigger, this stops scaling, because the model weights no longer fit into the memory of a single GPU and you need to distribute them across GPUs or across nodes. Even with NVLink and InfiniBand, these communications are slower than loading from VRAM. NVLink is still fine for tensor parallelism, but communication across nodes is quite slow.<p>What MoE enables is expert parallelism, where different nodes keep different experts in memory and don't need to communicate as much between nodes. This only works if there are enough nodes to keep all experts in VRAM with enough headroom for other things (KV cache, other weights, etc.). So the possible batch size naturally becomes quite large. And of course you want to maximize it to make sure all GPUs are actually working.</p>
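The ~300 FLOPs-per-byte figure can be checked with back-of-the-envelope arithmetic. Using approximate H100 SXM datasheet numbers (~989 TFLOPS dense BF16, ~3.35 TB/s HBM3 bandwidth; these specs are my assumption, not from the blogpost):

```python
# Roofline back-of-the-envelope for an H100 SXM (approximate specs).
peak_flops = 989e12        # dense BF16, FLOP/s
mem_bandwidth = 3.35e12    # HBM3, bytes/s

# Machine balance: FLOPs available per byte loaded from memory.
flops_per_byte = peak_flops / mem_bandwidth
print(round(flops_per_byte))  # 295, i.e. roughly 300

# At batch size 1, a matmul does ~2 FLOPs (multiply + add) per 2-byte
# BF16 weight, i.e. ~1 FLOP per byte loaded: heavily memory-bound.
# You need a batch of roughly 300 tokens per weight load to reach
# the compute roofline.
```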
]]></description><pubDate>Sun, 01 Jun 2025 12:41:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=44150452</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=44150452</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44150452</guid></item><item><title><![CDATA[New comment by ipieter in "Debugging my wife's alarm clock"]]></title><description><![CDATA[
<p>The mains frequency is literally how fast the generators in power plants are turning. If the load on the grid increases, those generators slow down slightly and more natural gas/coal/heat has to be added to bring the frequency back up. This whole process is quite complicated because not every plant can react at the same speed: some plants always run at 100% capacity, while others are dedicated to regulating the frequency.<p>So there are small fluctuations, typically within 0.2 Hz of the base frequency, but the average is very close to the nominal 50/60 Hz. And for an alarm clock that is good enough.</p>
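Why the average matters more than the instantaneous frequency: a mains-synchronized clock just counts cycles and divides by the nominal frequency, so short-term wobble cancels out. A minimal sketch of that idea (illustrative only; a real clock counts zero crossings in hardware):

```python
# A mains-synchronized clock counts cycles of the 50 Hz mains and
# divides by the nominal frequency to get elapsed time.
NOMINAL_HZ = 50

def elapsed_seconds(cycles_counted: int) -> float:
    """Seconds shown by the clock after counting this many cycles."""
    return cycles_counted / NOMINAL_HZ

# One day of mains at exactly 50 Hz is 50 * 86400 cycles:
assert elapsed_seconds(50 * 86_400) == 86_400.0

# If the grid delivered 0.1% fewer cycles over a day, the clock would
# lag by about 86 seconds -- which is why operators steer the *average*
# frequency back to nominal, not just the instantaneous value.
slow_day_cycles = 50 * 86_400 - 4_320   # 0.1% fewer cycles
print(round(86_400 - elapsed_seconds(slow_day_cycles), 1))  # 86.4
```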
]]></description><pubDate>Sun, 27 Oct 2024 15:02:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=41963066</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=41963066</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41963066</guid></item><item><title><![CDATA[New comment by ipieter in "Set Up a $4/Mo Hetzner VM to Skip the Serverless Tax"]]></title><description><![CDATA[
<p>I have both DO and Hetzner VMs and I find them comparable, with Hetzner being a bit cheaper. Judging from my logs and fail2ban, DO seems to filter a bit more abusive traffic, but that is basically the only difference.<p>However, the DO docs are on a different level and of high quality. They are also accessible if you are not a customer.</p>
]]></description><pubDate>Sun, 01 Sep 2024 18:29:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=41419099</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=41419099</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41419099</guid></item><item><title><![CDATA[New comment by ipieter in "Show HN: Interactive Graph by LLM (GPT-4o)"]]></title><description><![CDATA[
<p>Yeah, I just tried it with numbers I know reasonably well and the result seems totally made up. The generated chart showed a linear downward trend while in reality there isn't one, and the numbers seem way off.</p>
]]></description><pubDate>Sun, 19 May 2024 10:44:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=40405925</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=40405925</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40405925</guid></item><item><title><![CDATA[New comment by ipieter in "How to Cite ChatGPT"]]></title><description><![CDATA[
<p>Our university recently recommended the same thing, and I think this is a very bad idea for two reasons.<p>1. Not every sequence of words deserves to be cited. GPT-3 and ChatGPT are often confidently wrong about facts, so why would you want to add a citation to that? When writing a paper, this needs to be fact-checked anyway, so why not cite the original (actual) source?<p>2. It also breaks the citation graph. Imagine all papers now pointing to a catch-all reference, OpenAI (2023). Adding a citation is about saying where you got certain information, and this format doesn't give you enough to do that; it just points to the catch-all. With any other citation you can either look up the paper or, in the rare case of a personal-communication citation, ask the cited source directly. You can't ask ChatGPT "hey, why did you say this in paper X" and expect a meaningful answer.</p>
]]></description><pubDate>Sun, 11 Jun 2023 15:09:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=36282098</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=36282098</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=36282098</guid></item><item><title><![CDATA[New comment by ipieter in "Abandoned Motorola Headquarters (2020)"]]></title><description><![CDATA[
<p>No, it's not. Philips Lighting was split off into a new brand: Signify. They pay to use the Philips name (e.g. for Philips Hue).</p>
]]></description><pubDate>Fri, 13 Aug 2021 20:56:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=28174405</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=28174405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=28174405</guid></item><item><title><![CDATA[New comment by ipieter in "My Hunt for the Original McDonald’s French-Fry Recipe"]]></title><description><![CDATA[
<p>As a Belgian, I was looking at that recipe in horror. Why would people put sugar on their fries?!<p>So I'm glad you like Belgian fries. What I've always been taught is the key step: we fry the cut potatoes twice, once at 140°C, then wait for them to cool down, and once more at 175-180°C. Then you add salt, and that's it.<p>And the mayo: I don't think many people like it that way when they eat fries at home, but it's easy ¯\_(ツ)_/¯</p>
]]></description><pubDate>Mon, 30 Nov 2020 15:15:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=25254553</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=25254553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=25254553</guid></item><item><title><![CDATA[New comment by ipieter in "Show HN: Generate a static website from any back end"]]></title><description><![CDATA[
<p>I can answer that, since I also made a small static site generator from scratch in Python.<p>I was using another static site generator (Hexo), but at some point I wanted to change some things and add custom HTML to my posts. Since the documentation was ... well ... minimalistic on some aspects, I also spent some time reading the source code. At that point I really started to wonder what benefit I still got from using the generator.<p>In the end, all a static site generator does is collect some Markdown or RST files, convert them to HTML, and put the result into an HTML template, plus generate some lists (index page, RSS, ...) and some metadata for SEO. So it took me a single Saturday to make a working static site generator, and now I can do anything I want without looking up documentation or source code, since it's my own dumpster fire :-)</p>
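The collect/convert/template loop described above fits in a few dozen lines. A minimal sketch, not the actual generator from this comment; the file layout and the trivial stand-in "Markdown" conversion are placeholders:

```python
# Minimal static site generator: gather *.md posts, wrap each in a
# template, and emit an index page linking them all.
from pathlib import Path

TEMPLATE = "<html><body><h1>{title}</h1>{body}</body></html>"

def md_to_html(text: str) -> str:
    # Stand-in for a real Markdown converter: paragraphs only.
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "".join(f"<p>{p}</p>" for p in paras)

def build(src: Path, out: Path) -> list[str]:
    out.mkdir(exist_ok=True)
    pages = []
    for post in sorted(src.glob("*.md")):
        # First block is the title, the rest is the body.
        title, _, body = post.read_text().partition("\n\n")
        html = TEMPLATE.format(title=title.lstrip("# "), body=md_to_html(body))
        (out / f"{post.stem}.html").write_text(html)
        pages.append(post.stem)
    # Index page listing every post.
    links = "".join(f'<li><a href="{p}.html">{p}</a></li>' for p in pages)
    (out / "index.html").write_text(
        TEMPLATE.format(title="Index", body=f"<ul>{links}</ul>"))
    return pages
```

A real version would swap `md_to_html` for a proper Markdown library and add an RSS template, but the overall shape stays this small.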
]]></description><pubDate>Sat, 02 May 2020 09:12:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=23050479</link><dc:creator>ipieter</dc:creator><comments>https://news.ycombinator.com/item?id=23050479</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=23050479</guid></item></channel></rss>