<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dmagog</title><link>https://news.ycombinator.com/user?id=dmagog</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 28 Jun 2026 02:55:20 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dmagog" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by dmagog in "What happened after 2k people tried to hack my AI assistant"]]></title><description><![CDATA[
<p>Nice experiment, but I'd temper the optimism. "Zero breaches in 6k attempts" is an estimate of a success rate, not a guarantee. The model is nondeterministic, so a failed jailbreak doesn't prove the attack is blocked, only that it didn't fire on that sample. And 6k different prompts isn't 6k tries of the worst one: an attack with even a 0.1% success rate will usually show zero hits in a handful of attempts, and that tail is what bites in production. Also, this is direct user injection, which is the easy case. The channel people actually lose to is indirect: untrusted content arriving via a tool result or a fetched doc, which Fiu never had in the loop.</p>
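<p>To put rough numbers on that, here is a minimal sketch. It assumes each retry of the same attack is an independent trial with a fixed per-try success rate, which is a simplification, and the function names are made up for illustration. The first function gives the chance of seeing zero hits in n tries. The second is the usual "rule of three" approximation for the 95% upper bound on the rate after n clean tries.</p>

```python
def p_zero_hits(p, n):
    """Chance that n independent tries all fail, given per-try success rate p.

    Assumption: tries are independent and p is fixed across tries.
    """
    return (1.0 - p) ** n


def rule_of_three_upper_bound(n):
    """Approximate 95% upper bound on the success rate after 0 hits in n tries."""
    return 3.0 / n


if __name__ == "__main__":
    # Hypothetical attack with a 0.1% per-try success rate.
    for n in (10, 100, 1000, 6000):
        print(f"n={n}: P(zero hits) = {p_zero_hits(0.001, n):.4f}")
    print(f"bound after 6000 clean tries: {rule_of_three_upper_bound(6000):.4%}")
```

<p>Under those assumptions, ten tries at a 0.1% attack come back clean about 99% of the time. The 3/n bound only applies to n repeats of the same attack, so 6k different prompts tried a few times each say much less about any single one of them.</p>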
]]></description><pubDate>Fri, 26 Jun 2026 04:31:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=48682331</link><dc:creator>dmagog</dc:creator><comments>https://news.ycombinator.com/item?id=48682331</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48682331</guid></item></channel></rss>