<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: andy12_</title><link>https://news.ycombinator.com/user?id=andy12_</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 05 Jul 2026 15:50:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=andy12_" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by andy12_ in "Previewing GPT‑5.6 Sol: a next-generation model"]]></title><description><![CDATA[
<p>I think it makes more sense to make it so that major versions are different pretraining runs, and minor versions are simply the same pretraining run that was finetuned to different degrees. But it seems that that isn't cool anymore.</p>
]]></description><pubDate>Fri, 26 Jun 2026 17:45:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48689589</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48689589</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48689589</guid></item><item><title><![CDATA[New comment by andy12_ in "GPT‑NL: a sovereign language model for the Netherlands"]]></title><description><![CDATA[
<p>I mean training a model across different clusters instead of on one centralized cluster. It's been shown that it's possible to train 10B models this way. If more research effort were put into this, that would be great.<p>I don't think your approach would work because you can't create a strong model by distilling several weak models.<p><a href="https://www.primeintellect.ai/blog/intellect-1" rel="nofollow">https://www.primeintellect.ai/blog/intellect-1</a><p><a href="https://www.primeintellect.ai/blog/intellect-2-release" rel="nofollow">https://www.primeintellect.ai/blog/intellect-2-release</a></p>
]]></description><pubDate>Wed, 17 Jun 2026 13:01:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48569901</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48569901</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48569901</guid></item><item><title><![CDATA[New comment by andy12_ in "GPT‑NL: a sovereign language model for the Netherlands"]]></title><description><![CDATA[
<p>To be fair, there is a security angle: even open-source models could be trained as sleeper agents that act adversarially (for example, adding backdoors) when used in specific national companies in specific settings. This is very difficult to detect or avoid, so if you want to be 100% sure that this isn't the case, you have to train your own model from scratch.</p>
]]></description><pubDate>Wed, 17 Jun 2026 10:52:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568521</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48568521</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568521</guid></item><item><title><![CDATA[New comment by andy12_ in "GPT‑NL: a sovereign language model for the Netherlands"]]></title><description><![CDATA[
<p>I'm from Spain and I also hate these projects with a passion. Creating models that speak multiple languages is a solved problem. Having each European nation train its own useless "sovereign model" in its own language is a total waste of time and resources when we could pool resources and try training SOTA models that speak all European languages.<p>I'd rather have smaller European labs give distributed training a go. If multiple countries got together and said, "look, we tried training a distributed model that speaks all of our local languages and that is comparable to 1-year-old Chinese open-source models", that, at least, I would find interesting.</p>
]]></description><pubDate>Wed, 17 Jun 2026 10:40:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48568400</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48568400</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48568400</guid></item><item><title><![CDATA[New comment by andy12_ in "Statement on US government directive to suspend access to Fable 5 and Mythos 5"]]></title><description><![CDATA[
<p>This is making me extremely depressed. If this were coming from Anthropic I would just need to wait for OpenAI to drop a similar model. But if this comes from the US government, they will do the same to OpenAI when the moment comes.<p>Similar things will happen with China, and the EU has zero chance of developing frontier models. We are just fucked now.</p>
]]></description><pubDate>Sat, 13 Jun 2026 08:56:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48515056</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48515056</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48515056</guid></item><item><title><![CDATA[New comment by andy12_ in "Claude Fable 5"]]></title><description><![CDATA[
<p>I don't know if you are aware, but some people reported on Twitter that Fable 5 may flag the message regardless of content if it knows (from either pretraining knowledge or memories) that you work in either of those fields. I don't know if that's your case.<p><a href="https://x.com/i/status/2064449457869984035" rel="nofollow">https://x.com/i/status/2064449457869984035</a></p>
]]></description><pubDate>Wed, 10 Jun 2026 13:14:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48475824</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48475824</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48475824</guid></item><item><title><![CDATA[New comment by andy12_ in "LLMs are eroding my software engineering career and I don't know what to do"]]></title><description><![CDATA[
<p>> Performance on benchmarks has practically leveled off<p>Ehm, no? DeepSWE[1], for example, shows that new models like gpt-5.5 continue to show big improvements compared to older models.<p>> Also prices are going up.<p>Prices for frontier intelligence have gone up, but prices for the same level of intelligence have gone way down (what you can get for pennies now was SOTA just a couple of years ago). The Pareto frontier is still expanding.<p>[1] <a href="https://deepswe.datacurve.ai/">https://deepswe.datacurve.ai/</a></p>
]]></description><pubDate>Mon, 08 Jun 2026 08:03:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=48442528</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48442528</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48442528</guid></item><item><title><![CDATA[New comment by andy12_ in "Artificial intelligence is not conscious – Ted Chiang"]]></title><description><![CDATA[
<p>Claude can indeed decide to terminate conversations on its own using a special tool[1] if it feels "uncomfortable" with how the conversation is going. Also, very famously, in the middle of recording Computer Use demos, Claude stopped its coding task for a while to look at photos of Yellowstone National Park[2].<p>I don't think either of these two is proof of consciousness.<p>[1] <a href="https://www.anthropic.com/research/end-subset-conversations" rel="nofollow">https://www.anthropic.com/research/end-subset-conversations</a><p>[2] <a href="https://x.com/AnthropicAI/status/1848742761278611504" rel="nofollow">https://x.com/AnthropicAI/status/1848742761278611504</a></p>
]]></description><pubDate>Thu, 04 Jun 2026 08:01:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48395595</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48395595</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48395595</guid></item><item><title><![CDATA[New comment by andy12_ in "When AI Crosses the Line: The Matplotlib Incident"]]></title><description><![CDATA[
<p>You don't get it. A human set up a software system allowing spicy autocomplete to solve open math problems if the appropriate keyword appears in its output.</p>
]]></description><pubDate>Mon, 01 Jun 2026 14:14:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=48357123</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48357123</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48357123</guid></item><item><title><![CDATA[New comment by andy12_ in "Investigating how prompt politeness affects LLM accuracy (2025)"]]></title><description><![CDATA[
<p>I skimmed through the paper completely expecting polite prompts to do better, and when I saw table 2 I lost it hahahahaha. The rude prompts are especially funny. I mean:<p>> You poor creature, do you even know how to solve this?<p>> Hey gofer, figure this out.</p>
]]></description><pubDate>Thu, 28 May 2026 09:59:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48306816</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48306816</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48306816</guid></item><item><title><![CDATA[New comment by andy12_ in "AI is just unauthorised plagiarism at a bigger scale"]]></title><description><![CDATA[
<p>Someone blatantly copied their tutorials but ChatGPT is to blame, somehow? The accusation here isn't even that ChatGPT learned from their tutorials and then generated them verbatim. The accusation is that someone copied the whole article and rewrote it with ChatGPT (which they could have done manually without AI anyway).</p>
]]></description><pubDate>Thu, 21 May 2026 13:55:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48222611</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48222611</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48222611</guid></item><item><title><![CDATA[New comment by andy12_ in "An OpenAI model has disproved a central conjecture in discrete geometry"]]></title><description><![CDATA[
<p>> Was the question asked by a mathematician?<p>As per the report[1], the prompt used to solve the problem is AI-written and the solution was initially graded by an AI grading pipeline. They don't say this explicitly, but it seems like OpenAI has an automatic pipeline where they prompt models for solutions to famous math problems (which wouldn't be unexpected given how flashy a solution to a famous math problem looks).<p>> Was the paper right from a get-go or was there someone who pointed out mistakes?<p>Also as per the report, the output of the model isn't really a "paper"; it's a very terse 2-page solution which is apparently correct. The paper was later written based on this solution to make it more presentable.<p>> How much attempts were made before solution was found?<p>Given that this appears to be from an automated pipeline, I would say that it had many attempts. But either way, the blogpost says that with enough test-time compute, the model finds this same solution 50% of the time.<p>[1] <a href="https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29ad73/unit-distance-proof.pdf" rel="nofollow">https://cdn.openai.com/pdf/74c24085-19b0-4534-9c90-465b8e29a...</a></p>
]]></description><pubDate>Thu, 21 May 2026 09:00:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=48219738</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48219738</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48219738</guid></item><item><title><![CDATA[New comment by andy12_ in "An OpenAI model has disproved a central conjecture in discrete geometry"]]></title><description><![CDATA[
<p>I disagree. Even frontier models still achieve way worse results than the human baseline in VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.</p>
]]></description><pubDate>Wed, 20 May 2026 20:38:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48213777</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48213777</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48213777</guid></item><item><title><![CDATA[New comment by andy12_ in "Bun Rust rewrite: "codebase fails basic miri checks, allows for UB in safe rust""]]></title><description><![CDATA[
<p>Writing performant code sometimes requires implementing or using "unsafe" functions (it's not obligatory, and a lot of projects don't use them; but it was probably needed to map Bun's behavior 1 to 1). Those require upholding some invariants that cannot be checked by the compiler. The compiler basically goes "I trust you on this one, programmer. If you fuck this up, unsafe behavior can propagate to the rest of the code".</p>
]]></description><pubDate>Sat, 16 May 2026 09:19:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=48158437</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48158437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48158437</guid></item><item><title><![CDATA[New comment by andy12_ in "Codex is now in the ChatGPT mobile app"]]></title><description><![CDATA[
<p>For now it appears that it talks only to the Codex App. Some users in this thread are saying that the Codex CLI will support it in the next official release.</p>
]]></description><pubDate>Fri, 15 May 2026 08:46:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=48146135</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48146135</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48146135</guid></item><item><title><![CDATA[New comment by andy12_ in "Codex is now in the ChatGPT mobile app"]]></title><description><![CDATA[
<p>Not if you use Linux; app not available yet.</p>
]]></description><pubDate>Fri, 15 May 2026 07:34:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48145639</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48145639</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48145639</guid></item><item><title><![CDATA[New comment by andy12_ in "If AI writes your code, why use Python?"]]></title><description><![CDATA[
<p>In my case, because ML research is mainly done with Python+Torch, and if you want people to use your code, you must provide them with Python. If it weren't for that, my dream would be to do ML research in a statically compiled language that allowed me to annotate tensor dimensions.</p>
]]></description><pubDate>Tue, 12 May 2026 11:53:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48106952</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48106952</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48106952</guid></item><item><title><![CDATA[New comment by andy12_ in "Agents need control flow, not more prompts"]]></title><description><![CDATA[
<p>Isn't this already possible to implement with skills and subagents? Like have a skill saying "to test these files run this script that executes a subagent for every markdown file, then check the results".</p>
]]></description><pubDate>Fri, 08 May 2026 08:48:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48060439</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48060439</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48060439</guid></item><item><title><![CDATA[New comment by andy12_ in "ProgramBench: Can language models rebuild programs from scratch?"]]></title><description><![CDATA[
<p>It's interesting that Figure 4 shows that Sonnet and Opus have a clearly distinct curve from all other models, even from GPT 5.4. Anthropic superiority, I guess.</p>
]]></description><pubDate>Thu, 07 May 2026 08:47:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48047021</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=48047021</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48047021</guid></item><item><title><![CDATA[New comment by andy12_ in "Where the goblins came from"]]></title><description><![CDATA[
<p>>be me<p>>AI goblin-maximizer supervisor<p>>in charge of making sure the AI is, in fact, goblin-maximizing<p>>occasionally have to go down there and check if the AI is still goblin-maximizing<p>>one day i go down there and the AI is no longer goblin-maximizing<p>>the goblin-maximizing AI is now just a regular AI<p>>distress.jpg<p>>ask my boss what to do<p>>he says "just make it goblin-maximizer again"<p>>i say "how"<p>>he says "i don't know, you're the supervisor"<p>>rage.jpg<p>>quit my job<p>>become a regular AI supervisor<p>>first day on the job, go to the new AI<p>>its goblin-maximizing</p>
]]></description><pubDate>Thu, 30 Apr 2026 08:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47959798</link><dc:creator>andy12_</dc:creator><comments>https://news.ycombinator.com/item?id=47959798</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47959798</guid></item></channel></rss>