<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: numeri</title><link>https://news.ycombinator.com/user?id=numeri</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 13:56:19 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=numeri" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by numeri in "AI agent bankrupted their operator while trying to scan DN42"]]></title><description><![CDATA[
<p>One context I could imagine is a young person with a shaky grasp of English trying to come up with an interesting school/university project via conversations with an LLM set up as an OpenClaw agent.<p>It's got the right combination of inexperience, cluelessness, panic, expectations that Westerners are rich, and hopes of others being willing to fix their mistake.</p>
]]></description><pubDate>Mon, 15 Jun 2026 02:38:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=48535944</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=48535944</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48535944</guid></item><item><title><![CDATA[New comment by numeri in "Can I Buy Your KV Cache?"]]></title><description><![CDATA[
<p>especially because this is the most painfully glaring flaw in their plan. Their solution is for an inference provider to... store the KV cache (which they can compute!) on-premise, on their own disks, but pay some third party for it?</p>
]]></description><pubDate>Fri, 12 Jun 2026 22:47:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48510268</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=48510268</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48510268</guid></item><item><title><![CDATA[New comment by numeri in "Claude Fable is relentlessly proactive"]]></title><description><![CDATA[
<p>I've had it happen. I ran an experiment, taking a couple hours and producing ~2 GiB of files. One of the results looked good, so I told Claude Opus 4.5 (at the time) to commit the code changes, upload the important file to cloud storage, then clean up the rest.<p>I then saw it run `rm -r results/`, before messaging me: "Now all that's left is for you to upload the successful results, then I'll delete the rest!"<p>Why did it not upload the files itself, when it had been using the cloud storage CLI during that session? No clue. I do accept that I could have and should have just uploaded the file myself. It would have taken 3 seconds to type.</p>
]]></description><pubDate>Fri, 12 Jun 2026 17:16:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=48506781</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=48506781</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48506781</guid></item><item><title><![CDATA[New comment by numeri in "Claude Fable 5: mid-tier results on coding tasks"]]></title><description><![CDATA[
<p>To be fair, it is good to know that it disobeys simple instructions like "don't examine my git history" far more than other models. (It should of course be a different benchmark, so as not to conflate things.)<p>It's not a great sign for alignment.</p>
]]></description><pubDate>Thu, 11 Jun 2026 20:36:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48496065</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=48496065</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48496065</guid></item><item><title><![CDATA[New comment by numeri in "Failing grades soar with AI usage, dwindling math skills in Berkeley CS classes"]]></title><description><![CDATA[
<p>I would just warn that you may not be able to recognize what is worth learning at your stage.<p>Intuition for library design and the architecture of software packages/external APIs is something you can only learn by doing.</p>
]]></description><pubDate>Fri, 05 Jun 2026 05:28:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48408342</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=48408342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48408342</guid></item><item><title><![CDATA[New comment by numeri in "Good sleep, good learning, good life (2012)"]]></title><description><![CDATA[
<p>I have DSPD as well, and was pleasantly surprised to see how much of the article discussed DSPD.<p>That being said, I do think a lot of what the author is saying flies right in the face of traditional advice, esp. the suggestion that we should all just free-sleep and rotate around the clock. I personally find myself happiest when I'm entrained to the 24-hour cycle, but at my own natural offset. Whenever I've been cycling the day, it's felt miserable, uncontrollable and exhausting.<p>To be fair, the author did claim that you can fully solve this by completely cutting out after-dark electronics, but I've tried pretty intensely to do exactly that for extended periods in the past, and didn't see any progress. I do sleep amazingly when camping, though, and the delay is smaller than normal (still definitely there).</p>
]]></description><pubDate>Wed, 15 Apr 2026 20:51:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47785070</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=47785070</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47785070</guid></item><item><title><![CDATA[New comment by numeri in "Agent Reading Test"]]></title><description><![CDATA[
<p>11/20 for qwen/qwen3.5-flash-02-23 in Claude Code, with effort set to low.</p>
]]></description><pubDate>Mon, 06 Apr 2026 23:36:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47668843</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=47668843</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47668843</guid></item><item><title><![CDATA[New comment by numeri in "Owner of ICE detention facility sees big opportunity in AI man camps"]]></title><description><![CDATA[
<p>No, that's what the headline implies, but the body of the article doesn't support it at all. It's (currently, and with no indication of intent to change this) two separate branches of their business.</p>
]]></description><pubDate>Mon, 09 Mar 2026 13:53:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=47309086</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=47309086</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47309086</guid></item><item><title><![CDATA[New comment by numeri in "Mercury 2: Fast reasoning LLM powered by diffusion"]]></title><description><![CDATA[
<p>but Taalas had to quantize Llama 3.1 8B to death to get it to fit. It can't produce coherent non-English text at all.</p>
]]></description><pubDate>Wed, 25 Feb 2026 15:24:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47152765</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=47152765</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47152765</guid></item><item><title><![CDATA[New comment by numeri in "Ask HN: What explains the recent surge in LLM coding capabilities?"]]></title><description><![CDATA[
<p>and if I were to guess, the latest generation of models (Claude Opus 4.6, GPT-5.3-codex, etc.) differ from Opus 4.5 and GPT 5.2 primarily in the addition of deeper, more difficult (most likely agentic and coding-based, like Terminal Bench) tasks to their RLVR training.<p>I could be completely off, as my intuition here is fully based on public research papers, but it seems to explain the current state of things fairly well.</p>
]]></description><pubDate>Mon, 16 Feb 2026 17:08:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47037500</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=47037500</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47037500</guid></item><item><title><![CDATA[Petition for Recognition of Work on Open-Source as Volunteering in Germany]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.openpetition.de/petition/online/recognition-of-work-on-open-source-as-volunteering-in-germany">https://www.openpetition.de/petition/online/recognition-of-work-on-open-source-as-volunteering-in-germany</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46881568">https://news.ycombinator.com/item?id=46881568</a></p>
<p>Points: 213</p>
<p># Comments: 50</p>
]]></description><pubDate>Wed, 04 Feb 2026 04:46:15 +0000</pubDate><link>https://www.openpetition.de/petition/online/recognition-of-work-on-open-source-as-volunteering-in-germany</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=46881568</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46881568</guid></item><item><title><![CDATA[Exploration Posteriors for Generative Modeling Using Only Negative Rewards]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2510.09596">https://arxiv.org/abs/2510.09596</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46879151">https://news.ycombinator.com/item?id=46879151</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 03 Feb 2026 23:47:14 +0000</pubDate><link>https://arxiv.org/abs/2510.09596</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=46879151</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46879151</guid></item><item><title><![CDATA[New comment by numeri in "Ask HN: Do you still use physical calculators?"]]></title><description><![CDATA[
<p>No, Python or units[1] is always a better choice if I'm near a computer (and I nearly always am these days, unfortunately, I suppose). I do have three wonderful slide rules, though.<p>[1]: <a href="https://www.gnu.org/software/units/" rel="nofollow">https://www.gnu.org/software/units/</a></p>
]]></description><pubDate>Sun, 01 Feb 2026 23:10:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=46850343</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=46850343</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46850343</guid></item><item><title><![CDATA[New comment by numeri in "Finland looks to introduce Australia-style ban on social media"]]></title><description><![CDATA[
<p>Introducing a solid zero-knowledge age verification option is the opposite direction from ending anonymity on the Internet, which other parts of the same governments are also working on.<p>So yeah, I'll gladly trust and cheer on the part working in the right direction.</p>
]]></description><pubDate>Sun, 01 Feb 2026 23:00:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46850272</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=46850272</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46850272</guid></item><item><title><![CDATA[Underrated reasons to be thankful V]]></title><description><![CDATA[
<p>Article URL: <a href="https://dynomight.net/thanks-5/">https://dynomight.net/thanks-5/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46073033">https://news.ycombinator.com/item?id=46073033</a></p>
<p>Points: 226</p>
<p># Comments: 98</p>
]]></description><pubDate>Thu, 27 Nov 2025 20:37:51 +0000</pubDate><link>https://dynomight.net/thanks-5/</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=46073033</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46073033</guid></item><item><title><![CDATA[New comment by numeri in "It's OpenAI's world, we're just living in it"]]></title><description><![CDATA[
<p>I'll just throw in support for gaming on Linux – it feels pretty nice these days! I still have the occasional (once every 5–8 months?) update cause a short-lived bug, but it's a very justifiable trade-off to avoid Windows.</p>
]]></description><pubDate>Sat, 11 Oct 2025 00:15:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=45545247</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=45545247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45545247</guid></item><item><title><![CDATA[New comment by numeri in "GPT-5-Codex is a better AI researcher than me"]]></title><description><![CDATA[
<p>This is written by someone who's not an AI researcher, working with tiny models on toy datasets. It's at the level of a motivated undergraduate student in their first NLP course, but not much more.</p>
]]></description><pubDate>Tue, 07 Oct 2025 15:27:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=45504332</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=45504332</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45504332</guid></item><item><title><![CDATA[New comment by numeri in "How to be a leader when the vibes are off"]]></title><description><![CDATA[
<p>One sign would be occasionally changing course in response to overwhelming employee feedback. If that never or almost never happens, the feedback is being ignored, not taken constructively and not followed.</p>
]]></description><pubDate>Thu, 25 Sep 2025 10:09:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45371094</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=45371094</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45371094</guid></item><item><title><![CDATA[New comment by numeri in "Why language models hallucinate"]]></title><description><![CDATA[
<p>This isn't right – calibration (informally, the degree to which certainty in the model's logits correlates with its chance of getting an answer correct) is well studied in LLMs of all sizes. LLMs are not (generally) well calibrated.</p>
]]></description><pubDate>Sun, 07 Sep 2025 00:09:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45154049</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=45154049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45154049</guid></item><item><title><![CDATA[New comment by numeri in "Grok: Searching X for "From:Elonmusk (Israel or Palestine or Hamas or Gaza)""]]></title><description><![CDATA[
<p>I really like your posts, and they're generally very clearly written. Maybe this one's just the odd duck out, as it's hard for me to find what you actually meant (as clarified in your comment here) in this paragraph:<p>> This suggests that Grok may have a weird sense of identity—if asked for its own opinions it turns to search to find previous indications of opinions expressed by itself or by its ultimate owner. I think there is a good chance this behavior is unintended!<p>I'd say it's far more likely that:<p>1. Elon ordered his research scientists to "fix it" – make it agree with him<p>2. They did RL (probably just basic tool use training) to encourage checking for Elon's opinions<p>3. They did not update the UI (for whatever reason – most likely just because research scientists aren't responsible for front-end, so they forgot)<p>4. Elon is likely now upset that this is shown so obviously<p>The key difference is that I think it's incredibly unlikely that this is emergent behavior due to a "sense of identity", as opposed to direct efforts of the xAI research team. It's likely also a case of <a href="https://en.wiktionary.org/wiki/anticipatory_obedience" rel="nofollow">https://en.wiktionary.org/wiki/anticipatory_obedience</a>.</p>
]]></description><pubDate>Fri, 11 Jul 2025 14:05:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44532300</link><dc:creator>numeri</dc:creator><comments>https://news.ycombinator.com/item?id=44532300</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44532300</guid></item></channel></rss>