<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: alextheparrot</title><link>https://news.ycombinator.com/user?id=alextheparrot</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 26 Apr 2026 11:55:10 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=alextheparrot" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by alextheparrot in "ChatGPT Images 2.0"]]></title><description><![CDATA[
<p>The paper they published last year goes over some of these transformations: <a href="https://arxiv.org/pdf/2510.09263" rel="nofollow">https://arxiv.org/pdf/2510.09263</a></p>
]]></description><pubDate>Tue, 21 Apr 2026 23:23:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47856195</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=47856195</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47856195</guid></item><item><title><![CDATA[New comment by alextheparrot in "ChatGPT Images 2.0"]]></title><description><![CDATA[
<p>> Integrating an imperceptible, robust, and content-specific watermark<p>From the system card someone linked elsewhere in the discussion</p>
]]></description><pubDate>Tue, 21 Apr 2026 19:57:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47853725</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=47853725</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47853725</guid></item><item><title><![CDATA[New comment by alextheparrot in "Ask ChatGPT to pick a number from 1-10000, it generally selects from 7200-7500"]]></title><description><![CDATA[
<p>No LLMs are calibrated?</p>
]]></description><pubDate>Sat, 21 Mar 2026 07:26:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47464789</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=47464789</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47464789</guid></item><item><title><![CDATA[New comment by alextheparrot in "Court orders restart of all US offshore wind power construction"]]></title><description><![CDATA[
<p>I mean, you could also frame this as an issue the electorate could actually prioritize instead of just hoping the courts work it out</p>
]]></description><pubDate>Tue, 03 Feb 2026 03:53:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46866264</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=46866264</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46866264</guid></item><item><title><![CDATA[New comment by alextheparrot in "Recursive Language Models"]]></title><description><![CDATA[
<p>The derivative being a grad(ient) student sampling scaffolds against evals + qualitative observations: most prompt-based LLM papers</p>
]]></description><pubDate>Sun, 04 Jan 2026 01:57:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=46484001</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=46484001</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46484001</guid></item><item><title><![CDATA[New comment by alextheparrot in "New research reveals longevity gains slowing, life expectancy of 100 unlikely"]]></title><description><![CDATA[
<p>Cancer is a parasitism that kills the host (or the host dies from other causes and it is not self-sufficient). Just because something is defined by uncontrolled self-replication doesn’t mean it is stable enough to live forever (which is as much a comment on homeostasis as self-renewal).</p>
]]></description><pubDate>Sat, 30 Aug 2025 18:06:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45076769</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=45076769</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45076769</guid></item><item><title><![CDATA[New comment by alextheparrot in "AI agent benchmarks are broken"]]></title><description><![CDATA[
<p>It isn’t actually very wrong. Your example is tangential, as graders in school have multiple roles: teaching the content and grading. That’s an implementation detail, not a counter to the premise.<p>I don’t think we should assume answering a test would be easy for a Scantron machine just because it is very good at grading them, either.</p>
]]></description><pubDate>Fri, 11 Jul 2025 20:25:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=44536481</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44536481</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44536481</guid></item><item><title><![CDATA[New comment by alextheparrot in "AI agent benchmarks are broken"]]></title><description><![CDATA[
<p>> Good evaluations write test sets for the discriminators to show when this is or isn’t true.<p>If they can’t write an evaluation for the discriminator I agree. All the input data issues you highlight also apply to generators.</p>
]]></description><pubDate>Fri, 11 Jul 2025 20:20:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44536438</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44536438</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44536438</guid></item><item><title><![CDATA[New comment by alextheparrot in "AI agent benchmarks are broken"]]></title><description><![CDATA[
<p>I wish the other replies and this would engage with the sentence right after it indicating that you should test this premise empirically.</p>
]]></description><pubDate>Fri, 11 Jul 2025 20:16:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=44536401</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44536401</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44536401</guid></item><item><title><![CDATA[New comment by alextheparrot in "AI agent benchmarks are broken"]]></title><description><![CDATA[
<p>If this sort of error isn’t acceptable, it should be part of an evaluation set for your discriminator.<p>Fundamentally I’m not disagreeing with the article, but I also think most people who care take the above approach, because if you do care you read samples, find the issues, and patch them to hill climb better.</p>
]]></description><pubDate>Fri, 11 Jul 2025 15:37:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=44533390</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44533390</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44533390</guid></item><item><title><![CDATA[New comment by alextheparrot in "AI agent benchmarks are broken"]]></title><description><![CDATA[
<p>LLMs evaluating LLM outputs really isn’t that dire…<p>Discriminating good answers is easier than generating them. Good evaluations write test sets for the discriminators to show when this is or isn’t true. Evaluating the outputs as the user might see them is more representative than having your generator do multiple tasks (e.g. solve a math query and format the output as a multiple choice answer).<p>Also, human labels are good but have problems of their own; it isn’t like by using a “different intelligence architecture” we elide all the possible errors. Good instructions to the evaluation model often translate directly to better human results, showing a correlation between these two sources of sampling intelligence.</p>
]]></description><pubDate>Fri, 11 Jul 2025 14:14:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=44532406</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44532406</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44532406</guid></item><item><title><![CDATA[New comment by alextheparrot in "How we’re responding to The NYT’s data demands in order to protect user privacy"]]></title><description><![CDATA[
<p>In the app: Settings ~> Data Controls ~> Improve the model for everyone</p>
]]></description><pubDate>Fri, 06 Jun 2025 16:21:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44202406</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=44202406</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44202406</guid></item><item><title><![CDATA[New comment by alextheparrot in "Retailers will soon have only about 7 weeks of full inventories left"]]></title><description><![CDATA[
<p>That’s a premise that would make me consider the wisdom of my actions.</p>
]]></description><pubDate>Wed, 30 Apr 2025 15:19:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=43846497</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=43846497</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43846497</guid></item><item><title><![CDATA[New comment by alextheparrot in "Lawmakers are skeptical of Zuckerberg's commitment to free speech"]]></title><description><![CDATA[
<p>Quippy, but off the cuff:<p>- I don’t go to my present town square(s) socially because it is full of antisocial behavior. Same reason to avoid certain bars or clubs, prefer certain parks, or why some are wary of public transit.<p>- I don’t feel a right to decide the vibe of how a business curates its space. My bakery, coffee shop, local library, etc. all curate a space with an opinion. I don’t feel I have standing to assert that my preferences should dominate their choices.<p>As an aside, businesses are also an extension of the people; the best ones just tend not to be mode collapsed.</p>
]]></description><pubDate>Thu, 10 Apr 2025 14:32:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=43644200</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=43644200</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43644200</guid></item><item><title><![CDATA[New comment by alextheparrot in "The hacking of culture and the creation of socio-technical debt"]]></title><description><![CDATA[
<p>Really enjoyed the piece.<p>A passing thought: the ethe of individuals in the 70s and 80s is important because of the people it informed in subsequent years. While many people still like to hack, code, etc., the relative proportion of people doing this and working in tech continues to diminish as the popularity and importance of the sector grows. I wonder if debt without values / a more cohered zeitgeist is better or worse?</p>
]]></description><pubDate>Wed, 19 Jun 2024 19:37:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=40731692</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=40731692</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40731692</guid></item><item><title><![CDATA[New comment by alextheparrot in "Safe Superintelligence Inc."]]></title><description><![CDATA[
<p>Glibly, I’d also love your definition of the education system writ large.</p>
]]></description><pubDate>Wed, 19 Jun 2024 17:51:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=40730664</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=40730664</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40730664</guid></item><item><title><![CDATA[New comment by alextheparrot in "OpenAI and Apple Announce Partnership"]]></title><description><![CDATA[
<p>Bit of a detail, but where are you deriving “with hundreds of terabytes of unified GPU memory” from?</p>
]]></description><pubDate>Mon, 10 Jun 2024 20:08:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=40638263</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=40638263</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40638263</guid></item><item><title><![CDATA[New comment by alextheparrot in "σ-GPTs: A new approach to autoregressive models"]]></title><description><![CDATA[
<p>No, but it makes more conceptual sense given the model can consider what was said before it</p>
]]></description><pubDate>Fri, 07 Jun 2024 19:32:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=40612094</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=40612094</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40612094</guid></item><item><title><![CDATA[New comment by alextheparrot in "I should have loved biology (2020)"]]></title><description><![CDATA[
<p>I love the romance of this piece, but in my experience he’s just describing the difference in expectations of learning biology at a high school versus advanced undergraduate to graduate level.<p>Romance is for those who care, and most don’t.  But it is so, so beautiful once you do.</p>
]]></description><pubDate>Mon, 22 Apr 2024 03:33:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=40111211</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=40111211</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40111211</guid></item><item><title><![CDATA[New comment by alextheparrot in "Gemma: New Open Models"]]></title><description><![CDATA[
<p>> “Our best shot at making the quarter is if we get an injection of at least [redacted]% , queries ASAP from Chrome.” (Google Exec)<p>Isn’t there a whole anti-trust case going on around this?<p>[0] <a href="https://www.nytimes.com/interactive/2023/10/24/business/google-trial-jerry-dischler-email.html" rel="nofollow">https://www.nytimes.com/interactive/2023/10/24/business/goog...</a></p>
]]></description><pubDate>Wed, 21 Feb 2024 15:06:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=39454685</link><dc:creator>alextheparrot</dc:creator><comments>https://news.ycombinator.com/item?id=39454685</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39454685</guid></item></channel></rss>