<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: pseudollm</title><link>https://news.ycombinator.com/user?id=pseudollm</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 04 Jun 2026 03:43:54 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=pseudollm" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by pseudollm in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>No, there isn't. Read the paper. It's just 40 ms of raw audio samples, multiplied by a single matrix to project them into a 3800-dimensional input vector. That's it. The next 40 ms are fed in at the next transformer input step, without any positional encoding. Repeat ad infinitum.</p>
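<p>A toy sketch of that framing-plus-projection step, not the paper's actual code. The 16 kHz sample rate, the function names, and the tiny demo matrix are my assumptions; only the 40 ms frame length, the single matrix multiply, and the lack of positional encoding come from the description above.</p>

```python
SAMPLE_RATE_HZ = 16_000  # assumption: the sample rate is not stated above
FRAME_MS = 40            # frame length described above
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 640 samples under that assumption


def frame_to_input_vector(frame, projection):
    """Multiply one raw-audio frame by a projection matrix.

    `projection` has one row per sample and one column per model dimension.
    """
    if len(frame) != len(projection):
        raise ValueError("frame length must match the number of projection rows")
    out = [0.0] * len(projection[0])
    for sample, row in zip(frame, projection):
        for j, weight in enumerate(row):
            out[j] += sample * weight
    return out


def audio_to_inputs(samples, projection, frame_len=FRAME_LEN):
    """Yield one input vector per consecutive, non-overlapping frame.

    No positional encoding is added; a trailing partial frame is dropped.
    """
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        yield frame_to_input_vector(frame, projection)


# Toy demo: 2-sample frames projected to 2 dimensions with an identity matrix.
toy_projection = [[1, 0], [0, 1]]
toy_inputs = list(audio_to_inputs([1, 2, 3, 4], toy_projection, frame_len=2))
print(len(toy_inputs))  # one vector per frame
```

<p>In the real model the matrix would have 3800 columns and be learned; the toy sizes here just keep the example quick to run.</p>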
]]></description><pubDate>Thu, 04 Jun 2026 00:01:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48391855</link><dc:creator>pseudollm</dc:creator><comments>https://news.ycombinator.com/item?id=48391855</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48391855</guid></item><item><title><![CDATA[New comment by pseudollm in "Gemma 4 12B: A unified, encoder-free multimodal model"]]></title><description><![CDATA[
<p>> usefulness of the RTX Spark<p>Not really. There's a reason the announcement didn't include ANY benchmark (!) and didn't state EXACTLY what the memory bandwidth is. It's going to be dog-slow and unusable for large models, since decode tok/sec is basically memory bandwidth divided by the size of the active weights. Rumoured 300 GB/s / 30 GB of active weights (a decent model) = 10 tokens per second, which is really slow.</p>
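<p>The estimate above as a quick sketch. It assumes decoding is purely memory-bandwidth-bound, with every generated token reading all active weights once, and it ignores KV-cache traffic, compute limits, and batching. The function name is made up for illustration, and 300 GB/s is the rumoured figure, not a confirmed spec.</p>

```python
def decode_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound model.

    Each token requires streaming all active weights from memory once,
    so throughput is bandwidth divided by active weight size.
    """
    if active_weights_gb <= 0:
        raise ValueError("active_weights_gb must be positive")
    return bandwidth_gb_s / active_weights_gb


# Rumoured 300 GB/s with 30 GB of active weights, as in the comment above.
print(decode_tokens_per_second(300, 30))  # 10.0
```

<p>Halve the active weights (smaller model or more aggressive quantization) and the bound doubles, which is why the missing bandwidth number matters so much.</p>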
]]></description><pubDate>Wed, 03 Jun 2026 23:52:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=48391775</link><dc:creator>pseudollm</dc:creator><comments>https://news.ycombinator.com/item?id=48391775</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48391775</guid></item></channel></rss>