<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: timmyd</title><link>https://news.ycombinator.com/user?id=timmyd</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 26 Jun 2026 03:49:28 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=timmyd" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[Qualcomm to Acquire Modular]]></title><description><![CDATA[
<p><a href="https://investor.qualcomm.com/news-events/press-releases/news-details/2026/Qualcomm-to-Acquire-Modular/default.aspx" rel="nofollow">https://investor.qualcomm.com/news-events/press-releases/new...</a><p><a href="https://www.modular.com/blog/qualcomm-to-acquire-modular" rel="nofollow">https://www.modular.com/blog/qualcomm-to-acquire-modular</a><p><a href="https://x.com/clattner_llvm/status/2069769232477192354" rel="nofollow">https://x.com/clattner_llvm/status/2069769232477192354</a>, <a href="https://xcancel.com/clattner_llvm/status/2069769232477192354" rel="nofollow">https://xcancel.com/clattner_llvm/status/2069769232477192354</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48659798">https://news.ycombinator.com/item?id=48659798</a></p>
<p>Points: 236</p>
<p># Comments: 114</p>
]]></description><pubDate>Wed, 24 Jun 2026 13:49:16 +0000</pubDate><link>https://www.reuters.com/business/qualcomm-buy-ai-startup-modular-2026-06-24/</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=48659798</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48659798</guid></item><item><title><![CDATA[Boost Game – retro game where terrain is generated by SIMD kernel computation]]></title><description><![CDATA[
<p>Article URL: <a href="https://boost.modular.com/">https://boost.modular.com/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48545824">https://news.ycombinator.com/item?id=48545824</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 15 Jun 2026 19:21:53 +0000</pubDate><link>https://boost.modular.com/</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=48545824</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48545824</guid></item><item><title><![CDATA[The Five Eras of KVCache]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.modular.com/blog/the-five-eras-of-kvcache">https://www.modular.com/blog/the-five-eras-of-kvcache</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46908555">https://news.ycombinator.com/item?id=46908555</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 06 Feb 2026 03:06:29 +0000</pubDate><link>https://www.modular.com/blog/the-five-eras-of-kvcache</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=46908555</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46908555</guid></item><item><title><![CDATA[New comment by timmyd in "How to Beat Unsloth's CUDA Kernel Using Mojo–With Zero GPU Experience"]]></title><description><![CDATA[
<p>David Robertson took a quantization challenge designed for CUDA experts, solved it in Mojo with AI assistance, and ended up 1.07x to 1.84x faster than the state-of-the-art C++/CUDA implementation.</p>
]]></description><pubDate>Wed, 14 Jan 2026 17:56:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=46619518</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=46619518</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46619518</guid></item><item><title><![CDATA[How to Beat Unsloth's CUDA Kernel Using Mojo–With Zero GPU Experience]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.modular.com/blog/how-to-beat-unsloth-s-cuda-kernel-using-mojo-with-zero-gpu-experience">https://www.modular.com/blog/how-to-beat-unsloth-s-cuda-kernel-using-mojo-with-zero-gpu-experience</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46619517">https://news.ycombinator.com/item?id=46619517</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 14 Jan 2026 17:56:46 +0000</pubDate><link>https://www.modular.com/blog/how-to-beat-unsloth-s-cuda-kernel-using-mojo-with-zero-gpu-experience</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=46619517</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46619517</guid></item><item><title><![CDATA[New comment by timmyd in "Apple Silicon GPU Support in Mojo"]]></title><description><![CDATA[
<p>Co-founder here. There isn't any signup - that was 2+ years ago, and we've been iterating a lot with the community and listening to feedback, which has been wonderful. Feel free to install with pip, uv, Pixi, etc. -> <a href="https://docs.modular.com/mojo/manual/install" rel="nofollow">https://docs.modular.com/mojo/manual/install</a></p>
]]></description><pubDate>Mon, 22 Sep 2025 04:49:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=45329175</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=45329175</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45329175</guid></item><item><title><![CDATA[New comment by timmyd in "ML needs a new programming language – Interview with Chris Lattner"]]></title><description><![CDATA[
<p>You might also enjoy this series: <a href="https://www.modular.com/democratizing-ai-compute" rel="nofollow">https://www.modular.com/democratizing-ai-compute</a> which goes into a lot of the details.</p>
]]></description><pubDate>Fri, 05 Sep 2025 22:02:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=45144168</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=45144168</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45144168</guid></item><item><title><![CDATA[Matrix Multiplication on Nvidia's Blackwell: Part 1 – Introduction]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.modular.com/blog/matrix-multiplication-on-nvidias-blackwell-part-1-introduction">https://www.modular.com/blog/matrix-multiplication-on-nvidias-blackwell-part-1-introduction</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45068040">https://news.ycombinator.com/item?id=45068040</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 29 Aug 2025 19:00:13 +0000</pubDate><link>https://www.modular.com/blog/matrix-multiplication-on-nvidias-blackwell-part-1-introduction</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=45068040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45068040</guid></item><item><title><![CDATA[Inworld TTS: 20x cheaper, state-of-the-art, text-to-speech]]></title><description><![CDATA[
<p>Article URL: <a href="https://inworld.ai/blog/introducing-inworld-tts">https://inworld.ai/blog/introducing-inworld-tts</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44379611">https://news.ycombinator.com/item?id=44379611</a></p>
<p>Points: 27</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 25 Jun 2025 17:09:04 +0000</pubDate><link>https://inworld.ai/blog/introducing-inworld-tts</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44379611</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44379611</guid></item><item><title><![CDATA[GPU Comic: When GPU programming hurts so much, all we can do is laugh]]></title><description><![CDATA[
<p>Article URL: <a href="https://comic.modular.com/">https://comic.modular.com/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44330672">https://news.ycombinator.com/item?id=44330672</a></p>
<p>Points: 15</p>
<p># Comments: 3</p>
]]></description><pubDate>Fri, 20 Jun 2025 18:44:00 +0000</pubDate><link>https://comic.modular.com/</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44330672</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44330672</guid></item><item><title><![CDATA[Scale or Surrender: When watts determine freedom]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.timdavis.com/blog/scale-or-surrender-when-watts-determine-freedom">https://www.timdavis.com/blog/scale-or-surrender-when-watts-determine-freedom</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44224914">https://news.ycombinator.com/item?id=44224914</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 09 Jun 2025 14:33:03 +0000</pubDate><link>https://www.timdavis.com/blog/scale-or-surrender-when-watts-determine-freedom</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44224914</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44224914</guid></item><item><title><![CDATA[New comment by timmyd in "Highly efficient matrix transpose in Mojo"]]></title><description><![CDATA[
<p>Thanks jsnell - I did, and they appreciated the comment above and unflagged it. I appreciate it!</p>
]]></description><pubDate>Sat, 07 Jun 2025 00:23:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=44206364</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44206364</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44206364</guid></item><item><title><![CDATA[New comment by timmyd in "Highly efficient matrix transpose in Mojo"]]></title><description><![CDATA[
<p>FWIW I didn't take the blog as a dunk on CUDA, just as an impressive outcome from the blog writer in Mojo. It's awesome to see this on Hopper - if it makes it go faster, that's awesome.</p>
]]></description><pubDate>Fri, 06 Jun 2025 23:45:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=44206156</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44206156</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44206156</guid></item><item><title><![CDATA[New comment by timmyd in "Highly efficient matrix transpose in Mojo"]]></title><description><![CDATA[
<p>[op here] To be clear: Yes, there are 3 kernels - you can see them in the GitHub repo linked at the end of the article. These are:<p>transpose_naive - Basic implementation with TMA transfers<p>transpose_swizzle - Adds swizzling optimization for better memory access patterns<p>transpose_swizzle_batched - Adds thread coarsening (batch processing) on top of swizzling<p>Performance comparison with CUDA: The Mojo implementations achieve bandwidths of:<p>transpose_naive: 1056.08 GB/s (32.0025% of max)<p>transpose_swizzle: 1437.55 GB/s (43.5622% of max)<p>transpose_swizzle_batched: 2775.49 GB/s (84.1056% of max)<p>via the GitHub repo simveit/efficient_transpose_mojo<p>Comparing to the CUDA implementations mentioned in the article:<p>Naive kernel: Mojo achieves 1056.08 GB/s vs CUDA's 875.46 GB/s<p>Swizzle kernel: Mojo achieves 1437.55 GB/s vs CUDA's 1251.76 GB/s<p>Batched swizzle kernel: Mojo achieves 2775.49 GB/s vs CUDA's 2771.35 GB/s<p>So there is a highly efficient matrix transpose in Mojo<p>All three Mojo kernels outperform their CUDA counterparts, with the naive and swizzle kernels showing significant improvements (20.6% and 14.8% faster respectively), while the final optimized kernel achieves essentially identical performance (slightly better by 4.14 GB/s).<p>The "flag" here seemed inappropriate given that it's true this implementation is indeed faster, and certainly the final iteration could be improved on further. It wasn't wrong to say 14% or even 20%.</p>
]]></description><pubDate>Fri, 06 Jun 2025 23:20:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=44206045</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44206045</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44206045</guid></item><item><title><![CDATA[New comment by timmyd in "Highly efficient matrix transpose in Mojo"]]></title><description><![CDATA[
<p>Updated the title to the original. I did base the numbers on<p>"This kernel archives 1437.55 GB/s compared to the 1251.76 GB/s we get in CUDA" (14.8%) which is still impressive</p>
]]></description><pubDate>Fri, 06 Jun 2025 20:53:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=44204855</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44204855</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44204855</guid></item><item><title><![CDATA[Highly efficient matrix transpose in Mojo]]></title><description><![CDATA[
<p>Article URL: <a href="https://veitner.bearblog.dev/highly-efficient-matrix-transpose-in-mojo/">https://veitner.bearblog.dev/highly-efficient-matrix-transpose-in-mojo/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44204155">https://news.ycombinator.com/item?id=44204155</a></p>
<p>Points: 125</p>
<p># Comments: 64</p>
]]></description><pubDate>Fri, 06 Jun 2025 19:28:29 +0000</pubDate><link>https://veitner.bearblog.dev/highly-efficient-matrix-transpose-in-mojo/</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44204155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44204155</guid></item><item><title><![CDATA[Fast vector sum without CUDA]]></title><description><![CDATA[
<p>Article URL: <a href="https://veitner.bearblog.dev/very-fast-vector-sum-without-cuda/">https://veitner.bearblog.dev/very-fast-vector-sum-without-cuda/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44090626">https://news.ycombinator.com/item?id=44090626</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 25 May 2025 20:06:56 +0000</pubDate><link>https://veitner.bearblog.dev/very-fast-vector-sum-without-cuda/</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=44090626</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44090626</guid></item><item><title><![CDATA[Modular Platform 25.3: 450K+ Lines of Open Source Code and Pip Packaging]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.modular.com/blog/modular-platform-25-3-450k-lines-of-open-source-code-and-pip-install-modular">https://www.modular.com/blog/modular-platform-25-3-450k-lines-of-open-source-code-and-pip-install-modular</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=43907970">https://news.ycombinator.com/item?id=43907970</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 06 May 2025 18:06:30 +0000</pubDate><link>https://www.modular.com/blog/modular-platform-25-3-450k-lines-of-open-source-code-and-pip-install-modular</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=43907970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43907970</guid></item><item><title><![CDATA[The Case for a Next-Generation AI Developer Platform]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.modular.com/blog/the-case-for-a-next-generation-ai-developer-platform">https://www.modular.com/blog/the-case-for-a-next-generation-ai-developer-platform</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=31952044">https://news.ycombinator.com/item?id=31952044</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 01 Jul 2022 20:03:31 +0000</pubDate><link>https://www.modular.com/blog/the-case-for-a-next-generation-ai-developer-platform</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=31952044</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=31952044</guid></item><item><title><![CDATA[Supportive Confrontation: High Performance Leadership]]></title><description><![CDATA[
<p>Article URL: <a href="https://timdavis.com/support-confrontation-high-performance-leadership-afd403c05e85">https://timdavis.com/support-confrontation-high-performance-leadership-afd403c05e85</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=15496941">https://news.ycombinator.com/item?id=15496941</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 18 Oct 2017 04:45:31 +0000</pubDate><link>https://timdavis.com/support-confrontation-high-performance-leadership-afd403c05e85</link><dc:creator>timmyd</dc:creator><comments>https://news.ycombinator.com/item?id=15496941</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=15496941</guid></item></channel></rss>