<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: aviinuo</title><link>https://news.ycombinator.com/user?id=aviinuo</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 15 May 2026 21:03:36 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=aviinuo" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by aviinuo in "RTX 5090 and M4 MacBook Air: Can It Game?"]]></title><description><![CDATA[
<p>Pro Blackwell 6000 is just a 5090 with more VRAM.
It does not have the tcgen05 (5th gen tensor core) instructions despite the "5th gen tensor core" branding, and thus does not support any optimized Blackwell (sm100) kernels.<p>Every Blackwell card other than the (G)B100, (G)B200, (G)B300 and Jetson Thor uses the Ampere tensor core instruction (mma.sync), but with fp4/6/8 added on.
Beyond that, the DGX Spark (which is advertised as having the same architecture as B200) has especially weak (not tcgen05) tensor cores that have a very narrow operating window and low utilization.</p>
]]></description><pubDate>Thu, 14 May 2026 21:58:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48141816</link><dc:creator>aviinuo</dc:creator><comments>https://news.ycombinator.com/item?id=48141816</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48141816</guid></item><item><title><![CDATA[New comment by aviinuo in "AutoKernel: Autoresearch for GPU Kernels"]]></title><description><![CDATA[
<p>Something seems off.
For the 4kx4kx4k fp16 GEMM, CUTLASS is about 3x faster than this.</p>
]]></description><pubDate>Wed, 11 Mar 2026 11:27:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47334226</link><dc:creator>aviinuo</dc:creator><comments>https://news.ycombinator.com/item?id=47334226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47334226</guid></item></channel></rss>