<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: andy12_</title><link>https://news.ycombinator.com/user?id=andy12_</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 15 May 2026 16:09:16 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=andy12_" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by andy12_ in "Codex is now in the ChatGPT mobile app"]]></title><description><![CDATA[
<p>For now it appears to talk only to the Codex App. Some users in this thread are saying that the Codex CLI will support it in the next official release.</p>
]]></description><pubDate>Fri, 15 May 2026 08:46:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48146135</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48146135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48146135</guid></item><item><title><![CDATA[New comment by andy12_ in "Codex is now in the ChatGPT mobile app"]]></title><description><![CDATA[
<p>Not if you use Linux; app not available yet.</p>
]]></description><pubDate>Fri, 15 May 2026 07:34:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48145639</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48145639</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48145639</guid></item><item><title><![CDATA[New comment by andy12_ in "If AI writes your code, why use Python?"]]></title><description><![CDATA[
<p>In my case, because ML research is mainly done with Python+Torch, and if you want people to use your code, you must provide them with Python. If it weren't for that, my dream would be to do ML research in a statically compiled language that allowed me to annotate tensor dimensions.</p>
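<p>The closest I can get in Python today is runtime-checked shape annotations, e.g. with jaxtyping (an illustrative sketch, assuming torch, jaxtyping and beartype are installed; the function and dimension names are just examples):</p><pre><code>
# Illustrative sketch: runtime-checked tensor dimension annotations in Python.
import torch
from torch import Tensor
from jaxtyping import Float, jaxtyped
from beartype import beartype

@jaxtyped(typechecker=beartype)
def attention_scores(
    q: Float[Tensor, "batch heads seq d"],
    k: Float[Tensor, "batch heads seq d"],
) -> Float[Tensor, "batch heads seq seq"]:
    # The named dimensions are checked for consistency across arguments at call time.
    return q @ k.transpose(-1, -2) / (q.shape[-1] ** 0.5)

scores = attention_scores(torch.randn(2, 8, 16, 64), torch.randn(2, 8, 16, 64))
print(scores.shape)  # torch.Size([2, 8, 16, 16])
</code></pre>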
]]></description><pubDate>Tue, 12 May 2026 11:53:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48106952</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48106952</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48106952</guid></item><item><title><![CDATA[New comment by andy12_ in "Agents need control flow, not more prompts"]]></title><description><![CDATA[
<p>Isn't this already possible to implement with skills and subagents? Like having a skill that says "to test these files, run this script that executes a subagent for every markdown file, then check the results".</p>
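<p>Roughly something like this (an illustrative sketch; it assumes an agent CLI that accepts a prompt non-interactively, such as claude -p, and the exact command, paths and prompt are made up):</p><pre><code>
# Illustrative sketch: run one subagent per markdown file, then gather the reports.
import pathlib
import subprocess

results = {}
for md in sorted(pathlib.Path("docs").glob("*.md")):
    prompt = f"Check the examples in {md} and report anything that fails."
    proc = subprocess.run(["claude", "-p", prompt], capture_output=True, text=True)
    results[md.name] = proc.stdout.strip()

# A final pass (or the parent agent) can then inspect the collected reports.
for name, report in results.items():
    print(f"--- {name} ---\n{report}\n")
</code></pre>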
]]></description><pubDate>Fri, 08 May 2026 08:48:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48060439</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48060439</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48060439</guid></item><item><title><![CDATA[New comment by andy12_ in "ProgramBench: Can language models rebuild programs from scratch?"]]></title><description><![CDATA[
<p>It's interesting that Figure 4 shows Sonnet and Opus on a clearly distinct curve from all the other models, even GPT 5.4. Anthropic superiority, I guess.</p>
]]></description><pubDate>Thu, 07 May 2026 08:47:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48047021</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48047021</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48047021</guid></item><item><title><![CDATA[New comment by andy12_ in "Where the goblins came from"]]></title><description><![CDATA[
<p>>be me<p>>AI goblin-maximizer supervisor<p>>in charge of making sure the AI is, in fact, goblin-maximizing<p>>occasionally have to go down there and check if the AI is still goblin-maximizing<p>>one day i go down there and the AI is no longer goblin-maximizing<p>>the goblin-maximizing AI is now just a regular AI<p>>distress.jpg<p>>ask my boss what to do<p>>he says "just make it goblin-maximizer again"<p>>i say "how"<p>>he says "i don't know, you're the supervisor"<p>>rage.jpg<p>>quit my job<p>>become a regular AI supervisor<p>>first day on the job, go to the new AI<p>>it's goblin-maximizing</p>
]]></description><pubDate>Thu, 30 Apr 2026 08:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47959798</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47959798</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47959798</guid></item><item><title><![CDATA[New comment by andy12_ in "Claude Opus 4.7"]]></title><description><![CDATA[
<p>If you mean for Anthropic in particular, I don't think so. But it's not the first time a major AI lab has published an incremental model update that is worse at some benchmarks. I remember that a particular update of Gemini 2.5 Pro improved results on LiveCodeBench but scored lower overall on most benchmarks.<p><a href="https://news.ycombinator.com/item?id=43906555">https://news.ycombinator.com/item?id=43906555</a></p>
]]></description><pubDate>Thu, 16 Apr 2026 15:24:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47794599</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47794599</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47794599</guid></item><item><title><![CDATA[New comment by andy12_ in "Day 1 of ARC-AGI-3"]]></title><description><![CDATA[
<p>Apparently the score would be a little higher if it weren't for the fact that scores are penalized for falling below the human baseline but aren't rewarded for exceeding it (which seems like an arbitrary decision; the human baseline is not optimal).</p>
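<p>If I'm reading the rule right, the asymmetry is roughly this (a toy sketch, not the official scoring code):</p><pre><code>
# Toy sketch of the asymmetry: falling below the human baseline costs you,
# but exceeding it earns nothing extra.
def capped_score(model_score: float, human_baseline: float) -> float:
    return min(model_score, human_baseline) / human_baseline

print(capped_score(0.40, 0.80))  # 0.5 -> penalized for being below the baseline
print(capped_score(0.95, 0.80))  # 1.0 -> capped, no reward for beating the baseline
</code></pre>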
]]></description><pubDate>Fri, 27 Mar 2026 09:29:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47540573</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47540573</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47540573</guid></item><item><title><![CDATA[New comment by andy12_ in "ARC-AGI-3"]]></title><description><![CDATA[
<p>I think that any logic-based test that your average human can "fail" (i.e., score below 50%) is not exactly testing whether something is AGI or not. Though I suppose it depends on your definition of AGI (and on whether all humans, or at least the average human, count as AGI under that definition).</p>
]]></description><pubDate>Wed, 25 Mar 2026 22:42:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=47524275</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47524275</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47524275</guid></item><item><title><![CDATA[New comment by andy12_ in "Autoresearch on an old research idea"]]></title><description><![CDATA[
<p>I think the main value lies in allowing the agent to try many things while you aren't working (when you are sleeping or doing other activities), so even if many tests are not useful, with many trials it can find something nice without any effort on your part.<p>This is, of course, only applicable if doing a single test is relatively fast. In my work a single test can take half a day, so I'd rather not let an agent spend a whole night doing a bogus test.</p>
]]></description><pubDate>Mon, 23 Mar 2026 19:45:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47494230</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47494230</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47494230</guid></item><item><title><![CDATA[New comment by andy12_ in "Pretraining Language Models via Neural Cellular Automata"]]></title><description><![CDATA[
<p>I think what they mean by this is that, for example, in "If it's raining, the outside is wet. It's raining, so the outside is wet", it's more important for the model to learn "If A then B. A, therefore B" than to learn what "raining", "outside", and "wet" mean.</p>
]]></description><pubDate>Thu, 19 Mar 2026 17:28:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47442865</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47442865</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47442865</guid></item><item><title><![CDATA[New comment by andy12_ in "Executing programs inside transformers with exponentially faster inference"]]></title><description><![CDATA[
<p>Honestly, the most interesting thing here is that just 2D heads are enough to do useful computation (at least enough to simulate an interpreter), and that there is an O(log n) algorithm to compute argmax attention with 2D heads. It seems you could make an efficient pseudosymbolic LLM with some frozen layers that perform certain deterministic operations alongside other layers that are learned.</p>
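<p>For intuition, "argmax attention" just means each query copies from its single best-matching key instead of taking a softmax mixture; a naive version looks like the sketch below (the O(log n) construction itself is the clever part and isn't reproduced here):</p><pre><code>
# Naive sketch of hard (argmax) attention, for intuition only.
import torch

def argmax_attention(q, k, v):
    scores = q @ k.transpose(-1, -2)  # (seq, seq) query-key match scores
    best = scores.argmax(dim=-1)      # index of the best key for each query
    return v[best]                    # copy that key's value verbatim

q = torch.randn(16, 2)  # "2D heads": queries and keys live in just 2 dimensions
k = torch.randn(16, 2)
v = torch.randn(16, 8)
out = argmax_attention(q, k, v)  # shape (16, 8)
</code></pre>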
]]></description><pubDate>Fri, 13 Mar 2026 09:18:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47362181</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47362181</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47362181</guid></item><item><title><![CDATA[New comment by andy12_ in "Executing programs inside transformers with exponentially faster inference"]]></title><description><![CDATA[
<p>This seems like a really interesting path for interpretability, especially if a big chunk of a model's behavior occurs pseudo-symbolically. This is an idea I had thought about, integrating tools into the main computation path of a model, but I never imagined that it could be done efficiently with just a vanilla transformer.<p>Truly, attention is all you need (I guess).</p>
]]></description><pubDate>Thu, 12 Mar 2026 12:46:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47349824</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47349824</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47349824</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>There are some things that just don't transfer well without specific training. I tried to create diagrams in Typst with CeTZ (a Processing- and TikZ-inspired graphing library), and even with the documentation, GPT 5.2-thinking can't really do nice, complex diagrams the way it can in TikZ. It can do simple things that are similar to the examples shown, but nothing really interesting. Typst, and especially CeTZ, is too new for any current model to really "get it", so they can't use it. I'll have to wait for the next batch of frontier models so that they see Typst and CeTZ examples during pre-training.</p>
]]></description><pubDate>Wed, 11 Mar 2026 14:23:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47335961</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47335961</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47335961</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>> Reality is that we need some way to encode the rules of the world in a more definitive way<p>I mean, sure. But do world models, the way LeCun proposes them, solve this? I don't think so. JEPAs are just unsupervised machine learning models at the end of the day; they might end up being better than plain autoregressive pretraining on text+images+video, but they are not magic. For example, if you train a JEPA model on orbital mechanics data, will it learn actually sensible algorithms to predict the planets' motions, or will it just learn a mix of heuristics?</p>
]]></description><pubDate>Wed, 11 Mar 2026 08:33:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47333032</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47333032</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47333032</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>Putting stuff you have learned into a markdown file is a very "shallow" version of continual learning. It can remember facts, yes, but I doubt a model can master new out-of-distribution tasks this way. If anything, I think that Google's Titans[1] and Hope[2] architectures are more aligned with true continual learning (though they still aren't actual continual learning, which is why they call it "test-time memorization").<p>[1] <a href="https://arxiv.org/pdf/2501.00663" rel="nofollow">https://arxiv.org/pdf/2501.00663</a><p>[2] <a href="https://arxiv.org/pdf/2512.24695" rel="nofollow">https://arxiv.org/pdf/2512.24695</a></p>
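<p>The core trick, as I understand it, is roughly the sketch below (illustrative only, not the papers' actual architectures): a small memory module keeps taking gradient steps on a "surprise" loss at inference time while the backbone stays frozen.</p><pre><code>
# Illustrative sketch in the spirit of test-time memorization.
import torch

memory = torch.nn.Linear(64, 64)  # tiny associative memory: key -> value
opt = torch.optim.SGD(memory.parameters(), lr=0.01)

def memorize(key, value):
    surprise = torch.nn.functional.mse_loss(memory(key), value)
    opt.zero_grad()
    surprise.backward()
    opt.step()  # the update happens at inference time, not during training

def recall(key):
    with torch.no_grad():
        return memory(key)

k, v = torch.randn(64), torch.randn(64)
memorize(k, v)        # after seeing (k, v) at test time...
approx_v = recall(k)  # ...later queries retrieve an approximation of v
</code></pre>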
]]></description><pubDate>Tue, 10 Mar 2026 16:28:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47325451</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47325451</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47325451</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>So, I have been thinking about this for a little while. Imagine a model f that takes a world state x and makes a prediction y. At a high level, a traditional supervised model is trained like this:<p>f(x)=y' => loss(y',y) => how good was my prediction? Train f through backprop with that error.<p>A model trained with reinforcement learning is closer to this, where m(y) is the world state that results from taking the action y the model predicted:<p>f(x)=y' => m(y')=z => reward(z) => how good was the state I ended up in because of my actions? Train f with an algorithm like REINFORCE on the reward, since the world m is a non-differentiable black box.<p>A group of neurons is more like predicting the world state that results from my own action, g(x,y), and learning by tuning both g and the action taken, f(x):<p>f(x)=y' => m(y')=z => g(x,y')=z' => loss(z,z') => how predictable were the results of my actions? Train g normally with backprop, and train f with an algorithm like REINFORCE with negative surprise as the reward.<p>After talking with GPT5.2 for a little while, it seems like Curiosity-driven Exploration by Self-supervised Prediction[1] might be an architecture similar to the one I described for neurons, but with the twist that f is rewarded for making the prediction error bigger (not smaller!) as a proxy for "curiosity".<p>[1] <a href="https://arxiv.org/pdf/1705.05363" rel="nofollow">https://arxiv.org/pdf/1705.05363</a></p>
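<p>In PyTorch-ish code, the third setup looks roughly like this (an illustrative sketch, not the cited paper's method; m is a stub for the non-differentiable world):</p><pre><code>
# g learns to predict the world's response to an action; the policy f is trained
# REINFORCE-style with negative surprise (prediction error) as the reward.
import torch

f = torch.nn.Linear(8, 4)      # policy: world state x -> action logits
g = torch.nn.Linear(8 + 4, 8)  # forward model: (x, action) -> predicted next state z'
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

def m(x, action):  # stand-in for the non-differentiable black-box world
    with torch.no_grad():
        return x.roll(1, dims=-1) + 0.1 * action.sum()

for _ in range(100):
    x = torch.randn(8)
    dist = torch.distributions.Categorical(logits=f(x))
    a = dist.sample()
    action = torch.nn.functional.one_hot(a, 4).float()

    z = m(x, action)                    # what the world actually did
    z_pred = g(torch.cat([x, action]))  # what g expected it to do
    surprise = torch.nn.functional.mse_loss(z_pred, z)

    # Train g by backprop on the surprise; train f by REINFORCE with reward = -surprise.
    loss = surprise + dist.log_prob(a) * surprise.detach()
    opt.zero_grad()
    loss.backward()
    opt.step()
</code></pre>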
]]></description><pubDate>Tue, 10 Mar 2026 14:37:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47323888</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47323888</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47323888</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>> Even with continuous backpropagation and "learning"<p>That's what I said. Backpropagation cannot be enough; that's not how neurons work in the slightest. When you put biological neurons in a Pong environment, they don't learn to play through some kind of loss or reward function; they self-organize to avoid unpredictable stimulation. As far as I know, no architecture learns in such an unsupervised way.<p><a href="https://www.sciencedirect.com/science/article/pii/S0896627322008066" rel="nofollow">https://www.sciencedirect.com/science/article/pii/S089662732...</a></p>
]]></description><pubDate>Tue, 10 Mar 2026 11:43:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321923</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47321923</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321923</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>That's true. Though would that hippocampus-less Einstein be able to keep making novel, complex discoveries from that point forward? It seems difficult. He would rapidly reach the limits of his short-term memory (the same way current models rapidly reach the limits of their context windows).</p>
]]></description><pubDate>Tue, 10 Mar 2026 11:28:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321801</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47321801</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321801</guid></item><item><title><![CDATA[New comment by andy12_ in "Yann LeCun raises $1B to build AI that understands the physical world"]]></title><description><![CDATA[
<p>I don't understand this view. The way I see it, the fundamental bottlenecks to AGI are continual learning and backpropagation: models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation. World models don't solve either of these problems; they are fundamentally the same kind of deep learning architectures we are used to working with. Heck, if you think learning from the world itself is the bottleneck, you can just put a vision-action LLM in a reinforcement learning loop in a robotic/simulated body.</p>
]]></description><pubDate>Tue, 10 Mar 2026 11:17:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47321714</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47321714</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47321714</guid></item></channel></rss>