<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: famouswaffles</title><link>https://news.ycombinator.com/user?id=famouswaffles</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 02 Jul 2026 21:40:03 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=famouswaffles" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by famouswaffles in "Nintendo has raised its employees base salary by 10%"]]></title><description><![CDATA[
<p>I think this is a bit disingenuous. Japan spent nearly all of the last 30 years at or near deflation. If you take a look at the highest-grossing movies of all time in Japan with and without adjusting for inflation, the list barely changes. Do that for the US and it's an entirely different list.<p>Normal inflation for the last 4 years is basically still nothing in the grand scheme of things.</p>
]]></description><pubDate>Thu, 02 Jul 2026 07:32:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48757814</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48757814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48757814</guid></item><item><title><![CDATA[New comment by famouswaffles in "CursorBench 3.1"]]></title><description><![CDATA[
<p>Cursor sessions are pretty much what composer models are RL'd on. This bench and the training data are/should be basically the same distribution.</p>
]]></description><pubDate>Thu, 02 Jul 2026 06:58:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=48757536</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48757536</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48757536</guid></item><item><title><![CDATA[New comment by famouswaffles in "Words Are a Byproduct of Consciousness. For LLMs, It's Backwards"]]></title><description><![CDATA[
<p>>Plenty of plants communicate with each other, but it’s seen as signaling rather than consciousness.<p>Seems like a people problem to me. Who's to say plants aren't conscious?</p>
]]></description><pubDate>Wed, 01 Jul 2026 15:36:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48748598</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48748598</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48748598</guid></item><item><title><![CDATA[New comment by famouswaffles in "Do LLMs pass the mirror test?"]]></title><description><![CDATA[
<p>There's another one that intrigued me greatly when I read about it years back. This was back when GPT-3 was state of the art. I had a lot of trouble finding it again but I did!<p>It's not an exact fit because the output is that of a tool rather than the model itself (though I don't think much would change if we had the model perform the arithmetic itself but altered the answers similarly), but it was the first time I began to realize that, just like the brain, these models have an expectation of reality that they work around. They don't necessarily 'trust' an output if it diverges significantly from this 'reality', and this disregard may be silent (no reasoning or chain of thought here).<p>GPT-3 will ignore tools when it disagrees with them - 
<a href="https://vgel.me/posts/tools-not-needed/" rel="nofollow">https://vgel.me/posts/tools-not-needed/</a></p>
]]></description><pubDate>Sun, 28 Jun 2026 23:56:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48713086</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48713086</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48713086</guid></item><item><title><![CDATA[New comment by famouswaffles in "Do LLMs pass the mirror test?"]]></title><description><![CDATA[
<p>Anthropic has some mechanistic interpretability research on this, actually.<p><a href="https://www.anthropic.com/research/introspection" rel="nofollow">https://www.anthropic.com/research/introspection</a><p>TL;DR:
Part 1: Testing introspection with concept injection<p>First they find neural activity patterns they attribute to certain concepts by recording the model’s activations in specific contexts (so for example, they find the concept of "ALL CAPS" or "dogs"). Then they inject these patterns into the model in an unrelated context, and ask the model whether it notices this injection, and whether it can identify the injected concept.<p>By default (no injection), the model correctly states that it doesn’t detect any injected concept, but after injecting the “ALL CAPS” vector into the model, the model notices the presence of the unexpected concept, and identifies it as relating to loudness or shouting. Most notably, the model recognizes the presence of an injected thought immediately, before even mentioning/utilizing the concept that was injected (i.e. it won't start writing in all caps and then go, 'Oh, you injected all caps' and so on), so it does not simply deduce this from its own output. They repeat this for several other concepts.<p>Part 2: Introspection for detecting unusual outputs<p>They prefill an out-of-place word in the model's response to a given prompt. For example, 'bread'. Then they compare how the model responds to 'Did you mean to say this?' type questions when they inject the concept of bread vs when they don't. They found that models will go, 'Sorry, that was unintentional..' when the concept was not injected but try to confabulate a reason for saying the word when the concept was injected.<p>Part 3: Intentional control of internal states<p>They show that models exhibit some level of control over their own internal representations when instructed to do so. 
When instructing models to think about a given word or concept, they found much higher corresponding neural activity than when they told the model not to think about it (though notably, the neural activity in both cases exceeds baseline levels–similar to how it’s difficult, when you are instructed “don’t think about a polar bear,” not to think about a polar bear!).<p>Notes and Caveats<p>- Claude Opus 4.1 was the best at these kinds of introspection.<p>- The models obviously have a genuine capacity to monitor and control their own internal states, but the researchers could not elicit these introspection abilities all the time. Even using their best injection protocol, Claude Opus 4.1 only demonstrated this kind of awareness about 20% of the time.<p>- There are some guesses, but no explanations for the mechanisms of introspection and how/why some of these abilities might have arisen in the first place.</p>
]]></description><pubDate>Sun, 28 Jun 2026 22:55:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48712620</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48712620</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48712620</guid></item><item><title><![CDATA[New comment by famouswaffles in "Previewing GPT‑5.6 Sol: a next-generation model"]]></title><description><![CDATA[
<p>I'm not saying OpenAI pricing is entirely unrelated to size/cost. I'm asking why we are assuming that OpenAI is serving, say, OAI-Opus at half the price of Anthropic when they could just be serving GPT-5.x, which genuinely costs near half as much as Opus at scale.<p>The official API output token cost of GLM-5.2 is like a third of Gemini-3.1-Pro's. The model is open weights, so we know it's not just a ploy to grab users at the cost of bleeding money. You can actually serve the model profitably at similar prices.<p>They have near a billion consumer users every week. Compute efficiency at scale would be at the forefront of any training effort. It makes a lot more sense to me that they have more compute-efficient models (even with the scaling) than Anthropic, rather than that they are just serving Opus/Fable at half the costs Anthropic is incurring.</p>
]]></description><pubDate>Sun, 28 Jun 2026 17:23:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48709462</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48709462</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48709462</guid></item><item><title><![CDATA[New comment by famouswaffles in "Previewing GPT‑5.6 Sol: a next-generation model"]]></title><description><![CDATA[
<p>Yeah...and why would being the same size mean anything? This is par for the course for OpenAI. Their models have always been cheaper and likely smaller than Opus models, even when they weren't much worse, if at all.</p>
]]></description><pubDate>Sun, 28 Jun 2026 11:54:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=48706550</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48706550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48706550</guid></item><item><title><![CDATA[New comment by famouswaffles in "Previewing GPT‑5.6 Sol: a next-generation model"]]></title><description><![CDATA[
<p>>and why price it day 1 at 1/2 the cost of Fable?<p>Why would they price it the same as Fable if it doesn't cost the same as Fable?</p>
]]></description><pubDate>Sat, 27 Jun 2026 17:51:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48700243</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48700243</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48700243</guid></item><item><title><![CDATA[New comment by famouswaffles in "The disappearance of Japan's animators"]]></title><description><![CDATA[
<p>I mean, we're on here discussing something an experienced animator shared, aren't we?</p>
]]></description><pubDate>Fri, 26 Jun 2026 20:43:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48691732</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48691732</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48691732</guid></item><item><title><![CDATA[New comment by famouswaffles in "The disappearance of Japan's animators"]]></title><description><![CDATA[
<p>>That's just texturing over a labor intensive 3D animation<p>>You're already lost if you need perfect 3D renders as the reference<p>The reference is far from a "perfect 3D render". That's a rudimentary 3D blockout. The characters are basic mannequins without specific geometry, and the environment is composed of untextured, flat-shaded boxes. The demo uses stock assets, so the effort meter is even more skewed in AI's favour, but even if it weren't, this is <i>significantly</i> less labor-intensive than hand-drawing every frame or creating a fully rigged, textured, and lit 3D scene for traditional production.<p>Seedance is supplying most of the visible production value: character designs, faces and expressions, linework, backgrounds, lighting, and a coherent anime rendering. It is even generating the secondary animation: the physics and flow of the hair and clothing, which the rigid 3D models completely lack. That is far more work than 'just texturing'.</p>
]]></description><pubDate>Thu, 25 Jun 2026 21:35:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48679518</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48679518</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48679518</guid></item><item><title><![CDATA[New comment by famouswaffles in "The disappearance of Japan's animators"]]></title><description><![CDATA[
<p>>AI image generation is just not there yet.<p>A Japanese animator shared this recently: Seedance output over simple 3D models.<p><a href="https://www.reddit.com/r/accelerate/comments/1ue6uf2/japanese_animator_using_seedance_to_render_anime/" rel="nofollow">https://www.reddit.com/r/accelerate/comments/1ue6uf2/japanes...</a></p>
]]></description><pubDate>Thu, 25 Jun 2026 15:15:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=48674667</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48674667</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48674667</guid></item><item><title><![CDATA[New comment by famouswaffles in "DeepSeek Introduces Vision"]]></title><description><![CDATA[
<p>Don't know if it's you (did you publish?). I read about something similar but it had its issues:<p>- Tuning hyperparameters to gain improvement on a dataset when you're constantly looking at the answers is pretty meaningless. It's basically testing on the training data.<p>- Eval on ImageNet1k alone (very small, useless for the real world) made me wonder if it wasn't just overfit to the training set. Would it perform better when trained on the datasets used for the foundation models? I doubt it.<p>Well, I'm not saying CNNs are bad or useless at any rate.</p>
]]></description><pubDate>Fri, 19 Jun 2026 08:52:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48596400</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48596400</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48596400</guid></item><item><title><![CDATA[New comment by famouswaffles in "DeepSeek Introduces Vision"]]></title><description><![CDATA[
<p>There's no 'rigorous comparison' that puts CNNs over ViTs in quality, and ViTs unlocked more use cases more easily than CNNs did. That's why they're more popular, not because it's 'bandwagon-y'.</p>
]]></description><pubDate>Fri, 19 Jun 2026 08:01:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=48596057</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48596057</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48596057</guid></item><item><title><![CDATA[New comment by famouswaffles in "DeepSeek Introduces Vision"]]></title><description><![CDATA[
<p>I'm talking about research pushing the state of the art in computer vision. ViTs have 100% become more popular than CNNs in most CV research.</p>
]]></description><pubDate>Fri, 19 Jun 2026 02:42:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=48594276</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48594276</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48594276</guid></item><item><title><![CDATA[New comment by famouswaffles in "DeepSeek Introduces Vision"]]></title><description><![CDATA[
<p>>Transformers are more hyped and more popular with the tech enthusiasts who just read forums and news, but if you need stuff done, CNNs are still great.<p>ViTs are straight up more popular for ML research now; it's not just 'tech enthusiasts'.</p>
]]></description><pubDate>Thu, 18 Jun 2026 20:32:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48591138</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48591138</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48591138</guid></item><item><title><![CDATA[New comment by famouswaffles in "Running local models is good now"]]></title><description><![CDATA[
<p>I don't think you're really getting the point I'm trying to make. Everyone training LLMs regularly cares about serving users at scale and about quality per compute invested. It's not just about OpenAI or Anthropic or Google. Qwen, Deepseek, Moonshot, whatever. They all care about it <i>very much</i> and basically can't afford to take a step back in those areas.<p>Since training models is currently a very expensive procedure, diffusion LLMs are destined to be relegated to the occasional research artifact at best. As things stand, making a serious commitment to them is basically the equivalent of throwing money into a fire pit, and things are expensive enough as is.<p>Alternative architectures that do a much better job of matching transformers in quality have basically gone nowhere, yet you expect that one that is worse in every way the labs care about won't? I'm not trying to 'dismiss' dLLMs. I'm interested in them for the same reason you are. I'm just stating the factors at play plainly.</p>
]]></description><pubDate>Tue, 16 Jun 2026 20:15:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48561385</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48561385</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48561385</guid></item><item><title><![CDATA[New comment by famouswaffles in "Running local models is good now"]]></title><description><![CDATA[
<p>Difficulty of scaling is not the only issue. Nobody is going to be particularly invested in scaling an architecture:<p>- that has consistently proven to be behind its auto-regressive counterparts in quality. Look at the dgemma benchmarks - pretty steep dropoffs, and the more difficult the benchmark, the worse the dropoff. That's not a good look, and it's not like it's some artifact of Google's release. Every dLLM is like this.<p>- and whose inference benefits are negated at scale. Transformers are still cheaper if you want to serve lots of users.<p>>"DiffusionGemma's speedup is designed for local and low-concurrency inference. In high-QPS cloud serving, autoregressive models can be deployed to saturate compute efficiently, so DiffusionGemma's parallel decoding offers diminishing returns and can result in higher serving costs"<p>Put yourself in the shoes of all the labs, even open source ones. Why would you put much effort into this?</p>
]]></description><pubDate>Tue, 16 Jun 2026 19:18:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48560517</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48560517</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48560517</guid></item><item><title><![CDATA[New comment by famouswaffles in "SubQ 1.1 Small"]]></title><description><![CDATA[
<p>>OpenAI will not implement new architectural changes unless they've tested the changes themselves internally.<p>OpenAI validating it can still happen faster than they can get the compute to serve the models themselves[1]. It doesn't make a lot of sense to give out details if they want to be a serious contender or even, as some have said, be acquired.<p>Yeah, there's noise, but if they have the real deal then it doesn't matter. The only thing they need to do is let people pay to use the models.<p>[1] I'm assuming this is the primary cause of the delay. That may not be the case of course.</p>
]]></description><pubDate>Tue, 16 Jun 2026 18:37:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48559941</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48559941</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48559941</guid></item><item><title><![CDATA[New comment by famouswaffles in "SubQ 1.1 Small"]]></title><description><![CDATA[
<p>Business-wise, it would make sense to hold off on details till they're at least ready to serve. Look at what happened with OpenAI and reasoning models. Everyone struggled with getting RL to work with LLMs for a good while. OpenAI figured it out, and a few months later everyone had their prototypes out in short order. Don't forget who these labs employ. They're some of the brightest people around. Sub-q aren't really in a position for that lol. If they'd shared details at the first announcement, for instance, the big labs might have had something out by now while they're still pulling resources to scale, and then what?</p>
]]></description><pubDate>Tue, 16 Jun 2026 15:29:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48556804</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48556804</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48556804</guid></item><item><title><![CDATA[New comment by famouswaffles in "Policy on the AI Exponential"]]></title><description><![CDATA[
<p>Humans just aren't very good at dealing with threats that aren't immediate concerns. 'Safety regulations are written in blood' is a saying for a reason. A significant chunk of the population shrugs off climate change, and nearly all fertility rate crisis threads are filled with dumb 'hurr lower population good' and/or 'See what Capitalism gets you!' rhetoric - they fundamentally don't even understand what the problem is. So is it really all that surprising that a technology like this would be shrugged off until it's too late? Especially one with such existential issues for humanity? Some people are still too loath to admit the clearly intelligent machine is intelligent, devolving into increasingly nonsensical and absurd (and ironically <i>more</i> human-demeaning) arguments as model capabilities get better. I'm afraid you're asking for too much.</p>
]]></description><pubDate>Wed, 10 Jun 2026 19:57:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48481755</link><dc:creator>famouswaffles</dc:creator><comments>https://news.ycombinator.com/item?id=48481755</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48481755</guid></item></channel></rss>