<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dhruvdh</title><link>https://news.ycombinator.com/user?id=dhruvdh</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 23 Apr 2026 15:32:23 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dhruvdh" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by dhruvdh in "Pocket TTS: A high quality TTS that gives your CPU a voice"]]></title><description><![CDATA[
<p>Try `uvx pocket-tts serve`</p>
]]></description><pubDate>Fri, 16 Jan 2026 14:09:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46646472</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=46646472</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46646472</guid></item><item><title><![CDATA[Accelerating LLM Inference with Parallel Draft Models (PARD)]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.amd.com/en/developer/resources/technical-articles/accelerating-generative-llms-interface-with-parallel-draft-model-pard.html">https://www.amd.com/en/developer/resources/technical-articles/accelerating-generative-llms-interface-with-parallel-draft-model-pard.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43656675">https://news.ycombinator.com/item?id=43656675</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 11 Apr 2025 18:10:02 +0000</pubDate><link>https://www.amd.com/en/developer/resources/technical-articles/accelerating-generative-llms-interface-with-parallel-draft-model-pard.html</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=43656675</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43656675</guid></item><item><title><![CDATA[New comment by dhruvdh in "Aiter: AI Tensor Engine for ROCm"]]></title><description><![CDATA[
<p>El Capitan can also do FP8. HPC generally requires double precision, but people are trying to make low precision work.</p>
]]></description><pubDate>Mon, 24 Mar 2025 18:08:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=43463796</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=43463796</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43463796</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD RDNA 4 – AMD Radeon RX 9000 Series Graphics Cards"]]></title><description><![CDATA[
<p>To be fair, you can buy ~3 of these for the price Nvidia charges for 24GB/32GB models.</p>
]]></description><pubDate>Fri, 28 Feb 2025 13:53:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=43205571</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=43205571</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43205571</guid></item><item><title><![CDATA[New comment by dhruvdh in "ROCm Device Support Wishlist"]]></title><description><![CDATA[
<p>To add, AMD only makes _parts_ of an MI300X server.<p>It's like asking a tire manufacturer to give you a car for free.</p>
]]></description><pubDate>Mon, 20 Jan 2025 22:17:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=42773748</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42773748</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42773748</guid></item><item><title><![CDATA[New comment by dhruvdh in "My failed attempt at AGI on the Tokio Runtime"]]></title><description><![CDATA[
<p>I wish more people would try things just like this and blog about their failures.<p>> The published version of a proof is always condensed. And even if you take all the math that has been published in the history of mankind, it’s still small compared to what these models are trained on.<p>> And people only publish the success stories. The data that are really precious are from when someone tries something, and it doesn’t quite work, but they know how to fix it. But they only publish the successful thing, not the process.<p>- Terence Tao (<a href="https://www.scientificamerican.com/article/ai-will-become-mathematicians-co-pilot/" rel="nofollow">https://www.scientificamerican.com/article/ai-will-become-ma...</a>)<p>Personally, I think failures on their own are valuable. Others can come in and branch off from a decision you made in a way that instead leads to success. Maybe the idea can be applied to a different domain. Maybe your failure clarified something for someone.</p>
]]></description><pubDate>Thu, 26 Dec 2024 16:52:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=42516218</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42516218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42516218</guid></item><item><title><![CDATA[New comment by dhruvdh in "MI300X vs. H100 vs. H200 Benchmark Part 1: Training – CUDA Moat Still Alive"]]></title><description><![CDATA[
<p>Disappointed that there wasn’t anything on inference performance in the article at all. That’s what the major customers have announced they use it for.</p>
]]></description><pubDate>Mon, 23 Dec 2024 04:26:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=42491771</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42491771</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42491771</guid></item><item><title><![CDATA[New comment by dhruvdh in "CUDA Moat Still Alive"]]></title><description><![CDATA[
<p>Which algorithm to pick for a given matrix shape differs, and it's not straightforward to figure out. AMD currently wants you to “tune” ops, likely searching for the right algorithm for your shapes, while Nvidia has accurate heuristics for picking the right algorithm.</p>
]]></description><pubDate>Mon, 23 Dec 2024 04:21:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=42491739</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42491739</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42491739</guid></item><item><title><![CDATA[Open-sourcing Three EXAONE 3.5 Models: 2.4B, 7.8B, 32B]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.lgresearch.ai/blog/view?seq=507">https://www.lgresearch.ai/blog/view?seq=507</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=42367356">https://news.ycombinator.com/item?id=42367356</a></p>
<p>Points: 13</p>
<p># Comments: 4</p>
]]></description><pubDate>Mon, 09 Dec 2024 15:55:24 +0000</pubDate><link>https://www.lgresearch.ai/blog/view?seq=507</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42367356</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42367356</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD outsells Intel in the datacenter space"]]></title><description><![CDATA[
<p>> despite them being fabless<p>That's not how it works. You need to pump money into fabs to get them working, and Intel doesn't have money. If AMD had fabs burning their cash, they would also have a much lower valuation.<p>The market is completely irrational on AMD. Their 52-week high is ~$225 and 52-week low is ~$90. $225 was hit when AMD was guiding ~$3.5B in datacenter GPU revenue. Now they're guiding to end the year at $5B+ in datacenter GPU revenue, but the stock is ~$140?<p>I think it's because of how early Nvidia announced Blackwell (it isn't shipping in any meaningful volume yet), and the market thinks AMD needs to compete with GB200 when they're actually competing with H200 this quarter. And for whatever reason the market thinks AMD will get zero AI growth next year? I don't know how to explain the stock price.<p>Anyway, they hit record quarterly revenue this Q3 and are guiding to beat that record by ~$1B next quarter. The price might move a lot based on how AMD guides for Q1 2025.</p>
]]></description><pubDate>Wed, 06 Nov 2024 00:01:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=42056312</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42056312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42056312</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD outsells Intel in the datacenter space"]]></title><description><![CDATA[
<p>> Performance per watt was better for Intel<p>No, it's not even close. AMD is miles ahead.<p>This is a Phoronix review for Turin (the current generation): <a href="https://www.phoronix.com/review/amd-epyc-9965-9755-benchmarks/14" rel="nofollow">https://www.phoronix.com/review/amd-epyc-9965-9755-benchmark...</a><p>You can similarly find Phoronix reviews for the Genoa, Bergamo, and Milan generations (the last two generations).</p>
]]></description><pubDate>Tue, 05 Nov 2024 21:17:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=42055249</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42055249</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42055249</guid></item><item><title><![CDATA[New comment by dhruvdh in "Nvidia (NVDA) to Replace Intel in the Dow Jones Industrial Average"]]></title><description><![CDATA[
<p>Is AMD behind hyperscalers' in-house efforts? Outside of Google, I don't think so.</p>
]]></description><pubDate>Sat, 02 Nov 2024 00:06:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=42022873</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=42022873</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42022873</guid></item><item><title><![CDATA[New comment by dhruvdh in "AI PCs Aren't Good at AI: The CPU Beats the NPU"]]></title><description><![CDATA[
<p>Oh, maybe also change the title? I flagged it because of the title/url not matching.</p>
]]></description><pubDate>Thu, 17 Oct 2024 00:28:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=41865253</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41865253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41865253</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD Instinct MI325X to Feature 256GB HBM3E Memory, CDNA4-Based MI355X with 288GB"]]></title><description><![CDATA[
<p>I don't think having a common ancestry for the ISA means much, or even having the same ISA.<p>Anyway, I don't understand what you want from me or what you're arguing about. They were trying to win the datacenter CPU market, not the GPU market, and they did well at that. They've recently started trying to win the GPU market as well, because now they can afford to. They seem to be doing well now.</p>
]]></description><pubDate>Fri, 11 Oct 2024 13:42:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=41809422</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41809422</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41809422</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD Instinct MI325X to Feature 256GB HBM3E Memory, CDNA4-Based MI355X with 288GB"]]></title><description><![CDATA[
<p>Those are Vega, not CDNA. It wouldn't surprise me if those are rebranded consumer chips, though I haven't checked.</p>
]]></description><pubDate>Fri, 11 Oct 2024 13:17:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=41809225</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41809225</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41809225</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD Instinct MI325X to Feature 256GB HBM3E Memory, CDNA4-Based MI355X with 288GB"]]></title><description><![CDATA[
<p>And yet Meta is using MI300X exclusively for all live inference on Llama 405B.<p>Clearly there are workloads AMD wins at, and just going Nvidia by default for everything without considering AMD is suboptimal.</p>
]]></description><pubDate>Fri, 11 Oct 2024 13:00:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=41809099</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41809099</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41809099</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD Instinct MI325X to Feature 256GB HBM3E Memory, CDNA4-Based MI355X with 288GB"]]></title><description><![CDATA[
<p>You know AMD primarily sells CPUs, right?<p>For datacenter GPUs, they're going from ~$500M-750M for full-year 2023 (can't find proper numbers) to $4.5B+ for full-year 2024. In GPUs, it's almost like they're entering a new market.<p>The current Instinct line of products is relatively new too; I found this article [1] on the MI100 launch in Nov 2020. That's basically the start of 2021.<p>To go from MI100 in 2021 to $4.5B+ of MI300X + MI250X in 2024 is great. They are doing just fine.<p>On MI355X, I can't find endnotes for the slides they show, so it is not clear whether the 9.2PF of FP6 and FP4 is sparse or not (all the other numbers on that slide were non-sparse). If it isn't, they're exceeding GB200's sparse FP6/4 numbers with non-sparse FLOPS (!). They both have the same memory bandwidth, though. AMD is doing just fine.<p>[1] <a href="https://www.servethehome.com/amd-radeon-instinct-mi100-32gb-cdna-gpu-launched/" rel="nofollow">https://www.servethehome.com/amd-radeon-instinct-mi100-32gb-...</a></p>
]]></description><pubDate>Fri, 11 Oct 2024 12:56:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=41809066</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41809066</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41809066</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD GPU Inference"]]></title><description><![CDATA[
<p>Batching is how you get ~350 tokens/sec on Qwen 14b with vLLM on a 7900 XTX: by running 15 requests at once.<p>Also, there is a Dockerfile.rocm at the root of vLLM's repo. How is it a pain?</p>
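<p>The throughput win from batching can be sketched with a toy client-side simulation. This is illustrative only: it does not use vLLM's actual API, and the 0.1s per-request latency is a made-up stand-in for server-side generation time.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Simulated request: with continuous batching, each request's wall time
# stays roughly constant whether the server handles one request or many,
# because the GPU processes them together.
def fake_generate(prompt: str) -> str:
    time.sleep(0.1)  # stand-in for server-side generation latency
    return f"completion for {prompt!r}"

prompts = [f"prompt {i}" for i in range(15)]

# Sequential client: total time ~ 15 * latency
start = time.perf_counter()
for p in prompts:
    fake_generate(p)
sequential = time.perf_counter() - start

# Concurrent client: the 15 requests overlap, total time ~ 1 * latency,
# so aggregate tokens/sec scales with the number of in-flight requests.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=15) as pool:
    list(pool.map(fake_generate, prompts))
concurrent = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

<p>Against a real vLLM server the same effect comes from simply issuing many requests at once and letting the scheduler batch them.</p>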
]]></description><pubDate>Wed, 02 Oct 2024 20:56:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=41724901</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41724901</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41724901</guid></item><item><title><![CDATA[New comment by dhruvdh in "AMD GPU Inference"]]></title><description><![CDATA[
<p>Why would you use this over vLLM?</p>
]]></description><pubDate>Wed, 02 Oct 2024 18:02:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=41723376</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=41723376</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41723376</guid></item><item><title><![CDATA[New comment by dhruvdh in "Tinygrad 0.9.0"]]></title><description><![CDATA[
<p>What's the point of the 8000 LOC limit? Has anyone worked on a project with a LOC limit? Why was the limit in place?</p>
]]></description><pubDate>Tue, 28 May 2024 19:58:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=40504914</link><dc:creator>dhruvdh</dc:creator><comments>https://news.ycombinator.com/item?id=40504914</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40504914</guid></item></channel></rss>