<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: gizmodo59</title><link>https://news.ycombinator.com/user?id=gizmodo59</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 09 Apr 2026 11:13:21 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=gizmodo59" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by gizmodo59 in "SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via CI"]]></title><description><![CDATA[
<p>Unfortunately the paper doesn’t include GPT 5.3, which was released around the same time as Opus 4.6, or GPT 5.4, which came out a few days back. Both are available via the API.<p><a href="https://developers.openai.com/api/docs/models/gpt-5.3-codex" rel="nofollow">https://developers.openai.com/api/docs/models/gpt-5.3-codex</a><p>IMHO the vendor’s own harness must be used when running these experiments. The model vendors know best how to build a harness for their models (Codex with GPT 5.4, or Claude Code with Opus 4.6), and that makes a big difference if you are running any kind of agentic coding task.<p>I see Claude and GPT as neck and neck in coding. Every other model+harness is definitely 3-6 months behind. Right now Codex seems to be the best at solving complex bugs and long-running tasks, with much higher limits and even better speed, while Claude seems to do well in front end and its CLI UX seems nice! The Codex app is very good though (wish it wasn’t Electron, which is a memory hog, but it’s good)</p>
]]></description><pubDate>Sun, 08 Mar 2026 12:53:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47296920</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47296920</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47296920</guid></item><item><title><![CDATA[New comment by gizmodo59 in "GPT-5.4"]]></title><description><![CDATA[
<p>ChatGPT has given me more for my $20 than any other vendor. And that’s not even considering Codex, which is so good and has much, much higher limits.</p>
]]></description><pubDate>Fri, 06 Mar 2026 00:04:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47268957</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47268957</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47268957</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Switch to Claude without starting over"]]></title><description><![CDATA[
<p>It’s disgusting how they have successfully fooled people into thinking they are the good guys. They partnered with Palantir, let them freely do the dirty work, and once they realized they could make money directly, they spun the PR and are now just trying to get more users. Well played.<p>I wish OSS models were good enough that we didn’t have to deal with either of the leading companies!</p>
]]></description><pubDate>Mon, 02 Mar 2026 00:12:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47212274</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47212274</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47212274</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Statement from Dario Amodei on our discussions with the Department of War"]]></title><description><![CDATA[
<p>They are playing a good PR game for sure. Their recent track record doesn’t show that they can be trusted. A few million is nothing against their current revenue, and saying they sacrificed is a big stretch here.</p>
]]></description><pubDate>Thu, 26 Feb 2026 23:00:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47173370</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47173370</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47173370</guid></item><item><title><![CDATA[New comment by gizmodo59 in "How will OpenAI compete?"]]></title><description><![CDATA[
<p>5.1 is 100 years old in the AI world.</p>
]]></description><pubDate>Thu, 26 Feb 2026 22:58:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47173327</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47173327</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47173327</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Anthropic drops flagship safety pledge"]]></title><description><![CDATA[
<p>That’s their excuse to still appeal to people who can be tricked by their safety-first pitch. It’s easy to have a constitution and all the crap when you are not battle-tested. They just showed their true colors.</p>
]]></description><pubDate>Thu, 26 Feb 2026 15:46:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47167635</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47167635</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47167635</guid></item><item><title><![CDATA[New comment by gizmodo59 in "How will OpenAI compete?"]]></title><description><![CDATA[
<p>If you haven’t used Codex with gpt-5.3-codex (high or xhigh) you are missing out. Claude is still good at conversations, but boy, I can have Codex go at a problem and it does better than Claude almost all the time. For front end and product UX Claude is slightly better, but given Codex’s very, very generous limits, it is the best bang for the buck.</p>
]]></description><pubDate>Thu, 26 Feb 2026 12:36:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47165202</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47165202</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47165202</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Claude Code Remote Control"]]></title><description><![CDATA[
<p>At this point if one lab comes up with a feature it’s a matter of time before another does the same!</p>
]]></description><pubDate>Wed, 25 Feb 2026 13:11:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47151037</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47151037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47151037</guid></item><item><title><![CDATA[New comment by gizmodo59 in "GPT‑5.3‑Codex‑Spark"]]></title><description><![CDATA[
<p>Codex 5.3 is hands down the best model for coding as of today</p>
]]></description><pubDate>Fri, 13 Feb 2026 12:30:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47001959</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47001959</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47001959</guid></item><item><title><![CDATA[New comment by gizmodo59 in "GPT‑5.3‑Codex‑Spark"]]></title><description><![CDATA[
<p>That’s gpt-5.3-codex released last week</p>
]]></description><pubDate>Fri, 13 Feb 2026 12:30:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47001955</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=47001955</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47001955</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Why I Joined OpenAI"]]></title><description><![CDATA[
<p>It’s not a crime if you do something for money. Those who comment are likely doing the same, and they couldn’t get into a company like OpenAI, hence the hatred! Keep doing the great work you always did! Excited to see what you’ll do with all the resources in the world.</p>
]]></description><pubDate>Sat, 07 Feb 2026 12:25:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=46923292</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46923292</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46923292</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>It’s SWE-bench Pro, not SWE-bench Verified. The Verified benchmark has stagnated.</p>
]]></description><pubDate>Thu, 05 Feb 2026 18:28:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902968</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46902968</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902968</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>5.3 Codex <a href="https://openai.com/index/introducing-gpt-5-3-codex/" rel="nofollow">https://openai.com/index/introducing-gpt-5-3-codex/</a> crushes it with 77.3% on Terminal Bench. The shortest-lived lead: gone in less than 35 minutes. What a time to be alive!</p>
]]></description><pubDate>Thu, 05 Feb 2026 18:14:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902729</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46902729</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902729</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Claude Code daily benchmarks for degradation tracking"]]></title><description><![CDATA[
<p>Codex seems to give compensation tokens whenever this happens! Hope Claude does too.</p>
]]></description><pubDate>Fri, 30 Jan 2026 02:50:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=46820003</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46820003</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46820003</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Please don't say mean things about the AI I just invested a billion dollars in"]]></title><description><![CDATA[
<p>NVDA is not the only exception. The big private names are losing money, but there are so many public companies having the time of their lives: power, materials, DRAM, and storage, to name a few. The demand is truly high.<p>What we can argue about is whether AI is truly transforming everyone’s lives; the answer is no. There is a massive exaggeration of benefits. The value is not ZERO. It’s not 100. It’s somewhere in between.</p>
]]></description><pubDate>Thu, 29 Jan 2026 02:28:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=46804961</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46804961</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46804961</guid></item><item><title><![CDATA[New comment by gizmodo59 in "I let ChatGPT analyze a decade of my Apple Watch data, then I called my doctor"]]></title><description><![CDATA[
<p>For every sensational article about AI being useless, there are plenty of examples where using ChatGPT to find out what else could be happening, and then having a conversation with a doctor, has helped people. I know of many anecdotally, and there are many such reports online as well.<p>At the end of the day, it’s yet another tool that people can use to help their lives. They have to use their brain. The culture of seeing the doctor as a god doesn’t hold up anymore. So many people have had bad experiences when the entire health care industry, at least in the US, is primarily a business rather than a way to help society get healthy.</p>
]]></description><pubDate>Tue, 27 Jan 2026 07:41:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46776669</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46776669</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46776669</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Cowork: Claude Code for the rest of your work"]]></title><description><![CDATA[
<p>Not sure if this is correct. Codex was one of the first research projects, long before Anthropic was started as a company. Maybe they did not see it as a path to AGI. It seems like coding is seen by a few companies as the path to general intelligence (almost like The Matrix, where everything is code).</p>
]]></description><pubDate>Tue, 13 Jan 2026 00:03:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=46595807</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46595807</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46595807</guid></item><item><title><![CDATA[Poker Solver]]></title><description><![CDATA[
<p>Article URL: <a href="https://github.com/noambrown/poker_solver">https://github.com/noambrown/poker_solver</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46507489">https://news.ycombinator.com/item?id=46507489</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 06 Jan 2026 01:05:28 +0000</pubDate><link>https://github.com/noambrown/poker_solver</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46507489</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46507489</guid></item><item><title><![CDATA[New comment by gizmodo59 in "OpenAI's cash burn will be one of the big bubble questions of 2026"]]></title><description><![CDATA[
<p>Only on HN and some Reddit subs do I even see the name Claude. In many countries AI=ChatGPT.</p>
]]></description><pubDate>Wed, 31 Dec 2025 02:02:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=46440513</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46440513</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46440513</guid></item><item><title><![CDATA[New comment by gizmodo59 in "Measuring AI Ability to Complete Long Tasks"]]></title><description><![CDATA[
<p>Yeah. Throwing away expensive tokens and limits 50% of the time is not ideal. But I bet by this time next year OSS models will be at that capability!</p>
]]></description><pubDate>Sun, 21 Dec 2025 07:00:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=46342840</link><dc:creator>gizmodo59</dc:creator><comments>https://news.ycombinator.com/item?id=46342840</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46342840</guid></item></channel></rss>