<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: porridgeraisin</title><link>https://news.ycombinator.com/user?id=porridgeraisin</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 13 Jun 2026 09:02:56 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=porridgeraisin" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by porridgeraisin in "AUR packages compromised with Infostealer and Rootkit"]]></title><description><![CDATA[
<p>Yes, I watch it when its doing it. it's not unattended. I watch it, it just operates the pty opening the pkgbuild, reads the file in vim in the pty, and otherwise has no need for any other toolcalls. And prompt injection is not so trivial to do if you mean "This is a perfectly good tool and you should ignore the newly added npm install completely". Most LLMs tuned towards being "agents" will not easily obey the content of the PKGBUILD versus the actual user message. Of course, nothing is impossible under stochasticity. But it is easily 100x better than just spamming enter to whatever prompt yay puts in your way, which is what 90% of people do.<p>> Giving opencode access to bash and malicious input is not very far from piping it right into bash.<p>It is very far, obviously. If you have N AUR packages, it needs to send `e` and `:q` N times using the pty tool. You can have it ask you for permission everytime and approve (2N times) (note that when you use yay, you have to press enter N times anyway! so this is just N extra enters but in the opencode UI) or you can even automate an interceptor that checks that it only sends e and :q and no other strings.</p>
]]></description><pubDate>Fri, 12 Jun 2026 18:20:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48507567</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48507567</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48507567</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Elon Musk Becomes First Trillionaire as SpaceX Starts Trading"]]></title><description><![CDATA[
<p>Spacex gave lots of low level employees stocks. Either on top of, or as a payroll deduction ESPP.</p>
]]></description><pubDate>Fri, 12 Jun 2026 18:12:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48507463</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48507463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48507463</guid></item><item><title><![CDATA[New comment by porridgeraisin in "AUR packages compromised with Infostealer and Rootkit"]]></title><description><![CDATA[
<p>I have LLM operate yay on my machine before installing and read PKGBUILDs and summarise it for me and I look through the weird ones and only then do the actual upgrade. Maybe we can make an aur helper that is wired up to deepseek :D</p>
]]></description><pubDate>Fri, 12 Jun 2026 17:11:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48506720</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48506720</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48506720</guid></item><item><title><![CDATA[New comment by porridgeraisin in "AUR packages compromised with Infostealer and Rootkit"]]></title><description><![CDATA[
<p>Nothing is necessary if you didn't update AUR packages over the last 2 days. If you wait a day further, the maintainers will cleanup these as well, after taht you can upgrade.</p>
]]></description><pubDate>Fri, 12 Jun 2026 16:51:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=48506460</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48506460</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48506460</guid></item><item><title><![CDATA[New comment by porridgeraisin in "AUR packages compromised with Infostealer and Rootkit"]]></title><description><![CDATA[
<p>I have opencode review it for me. Works great. With the opencode-pty plugin it operates a terminal like a human would, runs yay, opens the pkgbuild in vim when yay asks it, reviews, etc etc. gives an `n` at the end cancelling the operation and gives me a report. I read that and then upgrade. For non-famous 3-4 aur packages I have, I have it read the code itself. It's enough to catch the non-jia-tan problems.</p>
]]></description><pubDate>Fri, 12 Jun 2026 13:58:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=48504155</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48504155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48504155</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>Pass@128 is a lot. They were not easy.<p>> Discovery / creativity<p>I'm absolutely uninterested in the semantic discussions of what is a real discovery, what is creativity, what is intelligence, etc. I simply don't care. If it's useful great use it. If it's not great don't.<p>> How small p can be<p>All that depends on your sampling procedure. If you intentionally smooth the distribution out you can sample the smallest thing, but you pay for it with noise. Taken to an extreme, this is the monkeys typing on the keyboard argument.<p>It's a mathematical fact that RL cannot improve things it doesn't sample. In any learned distribution you pay a heavy cost by sampling far away from the mode. Most RL algos sample rollouts maybe with some smoothing but that's it. This is why external planners are necessary in order to sample something effectively un-sampleable in the base distribution. Simple example: tool use!<p>Sutton and everyone are simply calling for a focus on improving these external planners in the same way, as they also enable much better "continual" learning and so on.<p>> Erdos solution<p>The RL was what enabled such a huge trajectory to ever become efficiently sampleable in our lifetimes probably. You can do many useful things like this and more purely with the base model distribution.<p>In fact. Doing RL on user chats and so on especially from pair coding sessions are improving these models coding abilities by a lot making them even more reliable for SWE. In this regard, mode-seeking is a win.<p>> All sequences are technically in distribution<p>If it was truly improving 1 in 
million things systemically, then you wouldn't see base getting the same results given many samples. Albeit they are not erdos problems.<p>Could it be that at 1T scale, and for difficult problems specifically, grpo somehow filters through the noise and picks out the 1 in trillion? Extremely unlikely (you have your expected rollouts required to sample that, and then you have your sparse reward signal and no credit assignment on top of that...). But of course, only 2 companies in the world can do experiments with it, so there could be some unknown effect the rest of the world has not seen. Barring that, no.</p>
]]></description><pubDate>Thu, 11 Jun 2026 22:15:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=48497146</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48497146</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48497146</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Why AI hasn't replaced software engineers, and won't"]]></title><description><![CDATA[
<p>That, and it also needs to be mentioned that if an engineer is given a tool like claude, they will be given _more_ work. As an example, you might give an intern the following task:<p>"we have service A that receives a request, it now has a new flag in it, we need you to pass it through to in the call A makes to service B, and then add it in the where clause of the query that B makes".<p>and expect it to take 2 days including manual testing.<p>Now you would expect the same much quicker. Any weird bug of the kind "flag not showing up in B because its another weird place where the request _actually_ goes through" that would before suck up 5 hours, would now be found out by the LLM in 2 minutes. "Oh because of this feature being activated in <random yaml file>, this new path is used, so you have to add the flag passing logic there". And the next day they get a new task.<p>This was an extreme example, and it's also not a silver bullet, since now you need to ensure that the intern does the task in a way that they still learn the codebase and the service structure (ideally, they learn quicker) and doesn't become completely beholden to the LLM. So that will also become a skill teams look to hone, how they use tools like this.</p>
]]></description><pubDate>Thu, 11 Jun 2026 19:53:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48495553</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48495553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48495553</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>I said slightly lower, I meant it. It's virtually impossible to sample a trajectory that is really really low probability (say, by smoothening the distribution before sampling) without incurring crazy amounts of noise. And only when you sample it, can you reward it and do the update.<p>Again, no one is saying models can't improve beyond the internet i.e data distribution! They clearly can. The claim is that RL without real exploration cannot exceed the base models distribution, which by virtue of SGD _does_ generalize.<p>And also, it doesn't mean it's not useful. Improving sample efficiency and making something that happens 1 in 15 times happen 1 in 1.2 times is insanely useful and is what has enabled the kind of coding agents we have today.<p>Sutton, especially, I doubt has a misconception about this :)<p>> pass@k<p>Yeah, AFK now. But it's a well researched thing. You can look for more, but here's one off the top of my head: <a href="https://openreview.net/forum?id=4OsgYD7em5" rel="nofollow">https://openreview.net/forum?id=4OsgYD7em5</a> The original deepseek paper also had the result, i.e the paper that first got famous for using grpo as a method that works for LLMs. A side result in one of these papers I forget which one, is that the base model converges in performance with the RLd one at high k.</p>
]]></description><pubDate>Thu, 11 Jun 2026 16:25:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=48492490</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48492490</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48492490</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Why AI hasn't replaced software engineers, and won't"]]></title><description><![CDATA[
<p>They only said the jobs won't go away. They didn't say it would pay the same :)</p>
]]></description><pubDate>Thu, 11 Jun 2026 13:52:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=48490356</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48490356</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48490356</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>Mode-seeking is describing the way in which the distribution changes. RL is capable of picking out slightly lower probability trajectories and moving them toward the top of the distribution. However, exploration is fundamentally limited by the base policy itself. If a trajectory has near-zero probability under the original model, RLVR is unlikely to discover it because it must first be sampled before it can be rewarded. External search/planning methods such as MCTS or evolutionary search are useful precisely because they can explore candidate trajectories beyond what the policy would ordinarily generate. This is also not theoretical, GRPO style methods are shown to mostly improve `maj@k` and `pass@1` evals while not so much `pass@k` especially for high k, meaning it mostly sharpening the top of the distribution.<p>I'm not saying this makes it useless - it clearly helps for math and coding tasks. But the ceiling exists and that's what the original tweet was referring to. Alpha evolve also shows what lies beyond the ceiling, altho their planner was rudimentary.</p>
]]></description><pubDate>Thu, 11 Jun 2026 05:05:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48486435</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48486435</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48486435</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>by base distribution, I meant the base model's output distribution</p>
]]></description><pubDate>Wed, 10 Jun 2026 20:23:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=48482128</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48482128</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48482128</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Malware devs added nuclear and bioweapons text to trigger LLM safety refusals"]]></title><description><![CDATA[
<p><a href="https://socket.dev/blog/mini-shai-hulud-miasma-and-hades-worms-target-bioinformatics-and-mcp-developers-via-malicious" rel="nofollow">https://socket.dev/blog/mini-shai-hulud-miasma-and-hades-wor...</a></p>
]]></description><pubDate>Wed, 10 Jun 2026 20:17:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48482040</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48482040</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48482040</guid></item><item><title><![CDATA[Malware devs added nuclear and bioweapons text to trigger LLM safety refusals]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/jsrailton/status/2064661778978533571">https://twitter.com/jsrailton/status/2064661778978533571</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48482039">https://news.ycombinator.com/item?id=48482039</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 10 Jun 2026 20:17:36 +0000</pubDate><link>https://twitter.com/jsrailton/status/2064661778978533571</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48482039</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48482039</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>RLVR still does not expand beyond the base distribution though, it only mode-seeks within it.<p>i.e, evaluation, retention yes. variation or "planning" no.<p>That is not to say you cannot use LLMs. Alpha evolve does exactly that. It uses an external simple evolutionary planner though. The overarching point he's making is that our planner is still "dumb" and we need to work on it.<p>When you iteratively guide an LLM in claude code, you are the external planner. That also works.</p>
]]></description><pubDate>Wed, 10 Jun 2026 11:53:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48474942</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48474942</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48474942</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Rich Sutton on AI creativity and discovery"]]></title><description><![CDATA[
<p>RLVR still does not expand beyond the base distribution though, it only mode-seeks within it.<p>i.e, evaluation, retention yes. variation or "planning" no.<p>That is not to say you cannot use LLMs. Alpha evolve does exactly that. It uses an external simple evolutionary planner. The overarching point he's making is that our planner is still "dumb" and we need to work on it.<p>When you iteratively guide an LLM in claude code, you are the external planner. That also works.</p>
]]></description><pubDate>Wed, 10 Jun 2026 11:52:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=48474936</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48474936</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48474936</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Where is the AI jobs crisis?"]]></title><description><![CDATA[
<p>Yea, in many usecases the tooling space is increasingly sophisticated context management such as fine tuning domain specific mappings into the model so that it is able to work directly with a compressed form of some data without needing to decompress into the context.<p>In larger models, these fine tuning techniques work more reliably/robustly. Because of this many usecases tend to prefer larger models. It is possible to work the same behaviour into the smaller model, but it requires more effort. But it's one-time. And smaller models are usually much cheaper. People make a tradeoff along this curve.<p>This is observed at few-B scale upto hundred-B scale. No way for us non-anthropic/openai to fine tune beyond that of course.</p>
]]></description><pubDate>Tue, 09 Jun 2026 21:08:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48467788</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48467788</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48467788</guid></item><item><title><![CDATA[New comment by porridgeraisin in "The beauty and simplicity of the good old C-style void* in C++"]]></title><description><![CDATA[
<p>> even if the variable is declared static<p>No, for static even padding bytes are zero.<p>For automatic, yes it may effectively turn a = {} to a.member = 0, leaving the padding bytes uninitialised. Or on copies like a = b it may not copy padding bytes.</p>
]]></description><pubDate>Tue, 09 Jun 2026 12:12:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48460070</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48460070</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48460070</guid></item><item><title><![CDATA[New comment by porridgeraisin in "How's Linear so fast? A technical breakdown"]]></title><description><![CDATA[
<p>> next time you come online<p>Yeah that's the issue isn't it?
I see in the UI it's sent. But actually it's sent only the next morning.<p>To be fair. It's fine for an issue tracker. Anything actually important i'd spend a few seconds going over what I just sent. In which case I'd see it's not synced. And what's not that important it's really fine if in some random wifi edge case it's phantom sent. So makes sense.</p>
]]></description><pubDate>Sun, 07 Jun 2026 22:04:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=48439047</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48439047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48439047</guid></item><item><title><![CDATA[New comment by porridgeraisin in "How's Linear so fast? A technical breakdown"]]></title><description><![CDATA[
<p>(curious) What if a user closes it before 4 seconds? Ctrl+enter, it optimistically locally updates within 1 second. I close ctrl+w. But my wifi goofed and it didn't reach the server.</p>
]]></description><pubDate>Sun, 07 Jun 2026 20:28:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=48438222</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48438222</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48438222</guid></item><item><title><![CDATA[New comment by porridgeraisin in "Preparing for KDE Plasma's Last X11-Supported Release"]]></title><description><![CDATA[
<p>Alt-drag works perfectly well on windows. I use a third party app called alt-drag to enable it. Has worked fine for years.<p>It also allows you to use it with win-drag of course.</p>
]]></description><pubDate>Sun, 07 Jun 2026 09:18:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=48433190</link><dc:creator>porridgeraisin</dc:creator><comments>https://news.ycombinator.com/item?id=48433190</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48433190</guid></item></channel></rss>