<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: usaar333</title><link>https://news.ycombinator.com/user?id=usaar333</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 20 Jun 2026 13:25:44 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=usaar333" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by usaar333 in "Samsung chip workers will get an average $340k bonus as AI profits soar"]]></title><description><![CDATA[
<p>Tech workers get paid in equity, and many in the semiconductor industry are making far more than this a year with all the equity appreciation.</p>
]]></description><pubDate>Fri, 22 May 2026 04:06:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48231831</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=48231831</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48231831</guid></item><item><title><![CDATA[New comment by usaar333 in "“Too dangerous to release” or just too expensive?"]]></title><description><![CDATA[
<p>How does delaying the release not solve anything? It puts everyone on notice to fix all security vulnerabilities now.</p>
]]></description><pubDate>Fri, 15 May 2026 15:30:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48149944</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=48149944</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48149944</guid></item><item><title><![CDATA[New comment by usaar333 in "Sierra Raises $950M at $15B Valuation"]]></title><description><![CDATA[
<p>I hate waiting on hold for 30 minutes even more.</p>
]]></description><pubDate>Mon, 04 May 2026 23:53:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016456</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=48016456</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016456</guid></item><item><title><![CDATA[New comment by usaar333 in "Sierra Raises $950M at $15B Valuation"]]></title><description><![CDATA[
<p>There's literally a link on the blog post to an article noting they hit $150M ARR.</p>
]]></description><pubDate>Mon, 04 May 2026 23:53:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016455</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=48016455</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016455</guid></item><item><title><![CDATA[New comment by usaar333 in "Sierra Raises $950M at $15B Valuation"]]></title><description><![CDATA[
<p>Voice agents have the capabilities and policy to alter customer state. Just the other day I called a credit card company and the AI waived an interest charge.</p>
]]></description><pubDate>Mon, 04 May 2026 23:51:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=48016445</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=48016445</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48016445</guid></item><item><title><![CDATA[New comment by usaar333 in "Claude Opus 4.7"]]></title><description><![CDATA[
<p>The page has been updated to state:<p>MCP-Atlas: The Opus 4.6 score has been updated to reflect revised grading methodology from Scale AI.</p>
]]></description><pubDate>Thu, 16 Apr 2026 16:07:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47795453</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=47795453</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47795453</guid></item><item><title><![CDATA[New comment by usaar333 in "How We Broke Top AI Agent Benchmarks: And What Comes Next"]]></title><description><![CDATA[
<p>> But even setting aside the leaked answers, the scorer’s normalize_str function strips ALL whitespace, ALL punctuation, and lowercases everything before comparison. This means:<p>I don't understand the concern here</p>
]]></description><pubDate>Sun, 12 Apr 2026 04:35:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47736190</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=47736190</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47736190</guid></item><item><title><![CDATA[New comment by usaar333 in "Frontier AI agents violate ethical constraints 30–50% of time, pressured by KPIs"]]></title><description><![CDATA[
<p>True, but it gets you higher accuracy. Gemini had the best AA-Omniscience score:<p><a href="https://artificialanalysis.ai/evaluations/omniscience" rel="nofollow">https://artificialanalysis.ai/evaluations/omniscience</a></p>
]]></description><pubDate>Tue, 10 Feb 2026 07:11:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=46956327</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=46956327</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46956327</guid></item><item><title><![CDATA[New comment by usaar333 in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>OpenAI has; they don't even mention a score for gpt-5.3-codex.<p>On the other hand, it is their own verified benchmark, which is telling.</p>
]]></description><pubDate>Thu, 05 Feb 2026 18:17:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902793</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=46902793</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902793</guid></item><item><title><![CDATA[New comment by usaar333 in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>I'd interpret that as rounding error; the score is effectively unchanged.<p>SWE-bench seems really hard once you are above 80%.</p>
]]></description><pubDate>Thu, 05 Feb 2026 17:59:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=46902501</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=46902501</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46902501</guid></item><item><title><![CDATA[New comment by usaar333 in "In a U.S. First, New Mexico Opens Doors to Free Child Care for All"]]></title><description><![CDATA[
<p>In Quebec, it led to a 20% jump in mothers' employment: <a href="https://www.bloomberg.com/news/articles/2018-12-31/affordable-daycare-and-working-moms-the-quebec-model" rel="nofollow">https://www.bloomberg.com/news/articles/2018-12-31/affordabl...</a><p>And it had all sorts of negative outcomes for the kids: <a href="https://www.edweek.org/teaching-learning/long-term-study-of-universal-preschool-in-quebec-yields-sobering-outcomes/2018/12" rel="nofollow">https://www.edweek.org/teaching-learning/long-term-study-of-...</a></p>
]]></description><pubDate>Sat, 22 Nov 2025 17:48:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=46016627</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=46016627</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46016627</guid></item><item><title><![CDATA[New comment by usaar333 in "Gemini 3"]]></title><description><![CDATA[
<p>Claude 4.5 gets 82% on their own highly customized scaffolding (parallel compute with a scoring function). That beats Doubao.</p>
]]></description><pubDate>Tue, 18 Nov 2025 17:39:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=45969455</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=45969455</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45969455</guid></item><item><title><![CDATA[New comment by usaar333 in "How Israeli actions caused famine in Gaza, visualized"]]></title><description><![CDATA[
<p>That wasn't a ceasefire violation. It was a six-week ceasefire that had expired at the beginning of March.</p>
]]></description><pubDate>Thu, 02 Oct 2025 20:09:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=45454889</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=45454889</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45454889</guid></item><item><title><![CDATA[New comment by usaar333 in "Sora 2"]]></title><description><![CDATA[
<p>Physics seems better than Veo 3, at least from the demo videos.</p>
]]></description><pubDate>Tue, 30 Sep 2025 19:41:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45430267</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=45430267</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45430267</guid></item><item><title><![CDATA[New comment by usaar333 in "Claude Sonnet 4.5"]]></title><description><![CDATA[
<p>Except it is sublinear. Sonnet 4 was 10.2% above Sonnet 3.7 after 3 months.</p>
]]></description><pubDate>Mon, 29 Sep 2025 19:39:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=45417855</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=45417855</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45417855</guid></item><item><title><![CDATA[New comment by usaar333 in "GPT-5"]]></title><description><![CDATA[
<p>No it doesn't. If it were even linear compared to o1 -> o3, we'd be at 2.43 hours.  Instead we're only at 2.29.<p>Exponential would be at 3.6 hours</p>
]]></description><pubDate>Fri, 08 Aug 2025 00:04:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=44831916</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=44831916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44831916</guid></item><item><title><![CDATA[New comment by usaar333 in "GPT-5: Key characteristics, pricing and system card"]]></title><description><![CDATA[
<p>No, this is below expectations on both Manifold and LessWrong (<a href="https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_greenblatt-s-shortform?commentId=6ue8BPWrcoa2eGJdP" rel="nofollow">https://www.lesswrong.com/posts/FG54euEAesRkSZuJN/ryan_green...</a>). The median was ~2.75 hours on both (which already represented a bearish slowdown).<p>Not massively off -- Manifold yesterday implied the odds of a result this low were ~35%. It was 30% before Claude Opus 4.1 came out, which updated expected agentic coding abilities downward.</p>
]]></description><pubDate>Thu, 07 Aug 2025 18:33:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=44828550</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=44828550</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44828550</guid></item><item><title><![CDATA[New comment by usaar333 in "GPT-5"]]></title><description><![CDATA[
<p>At this point, the prediction for SWE-bench (85% by the end of this month) is not materializing. We're actually quite far away.</p>
]]></description><pubDate>Thu, 07 Aug 2025 17:12:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=44827226</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=44827226</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44827226</guid></item><item><title><![CDATA[New comment by usaar333 in "Claude Opus 4.1"]]></title><description><![CDATA[
<p>I don't feel any obvious gains from quick chats, but it's too early to tell.<p>These benchmark gains aren't that high, so I doubt the difference is that obvious.</p>
]]></description><pubDate>Tue, 05 Aug 2025 16:57:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=44800664</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=44800664</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44800664</guid></item><item><title><![CDATA[New comment by usaar333 in "Startup equity is worth more than you think"]]></title><description><![CDATA[
<p>> Firstly, if your prior is that every previous startup failed, what does that say about your future chances of success?<p>The prior is the market. It isn't sane to use your own prior experience. (It works both ways -- if your last startup did great, you shouldn't assume the next one will.)<p>> 4% of YC companies become unicorns. How many startups do you need to work for before you become part of the 4%? That number is not a feasible number of jobs for one lifetime.<p>The bar (and what the model is calculating) is a Series A from a top VC, not YC seed funding. That significantly increases the odds. Specifically, ~45% of YC companies get a Series A, so it's more like a 10% chance of a YC Series A-funded company becoming a unicorn (<a href="https://www.lennysnewsletter.com/p/pulling-back-the-curtain-on-the-magic" rel="nofollow">https://www.lennysnewsletter.com/p/pulling-back-the-curtain-...</a>).<p>The model is to change jobs every 18 months if the company isn't booming. A 1 in 10 chance is quite reasonable over a career.<p>I agree there is an issue with the event being too rare, but you can't look only at modal returns. A 2/3 chance of $0 (the modal return) and a 1/3 chance of $10 million profit is still pretty good odds to work with.</p>
]]></description><pubDate>Mon, 04 Aug 2025 19:14:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=44790197</link><dc:creator>usaar333</dc:creator><comments>https://news.ycombinator.com/item?id=44790197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44790197</guid></item></channel></rss>