<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: moyix</title><link>https://news.ycombinator.com/user?id=moyix</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 09 Apr 2026 11:47:56 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=moyix" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by moyix in "Vulnerability research is cooked"]]></title><description><![CDATA[
<p>It's limiting from the PoV of a developer who wants to ensure that their own code is free of all security issues. It is not limiting from the point of view of an attacker who just needs one good memory safety vuln to win.</p>
]]></description><pubDate>Mon, 30 Mar 2026 21:31:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47579971</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=47579971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47579971</guid></item><item><title><![CDATA[New comment by moyix in "Vulnerability research is cooked"]]></title><description><![CDATA[
<p>This is true for a lot of things but for low-level code you can always fall back to "the intention is to not violate memory safety".</p>
]]></description><pubDate>Mon, 30 Mar 2026 21:16:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47579791</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=47579791</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47579791</guid></item><item><title><![CDATA[New comment by moyix in "Letting Claude play text adventures"]]></title><description><![CDATA[
<p>Also, unlike OpenAI, Anthropic's prompt caching is <i>explicit</i> (you set up to 4 cache "breakpoints"), meaning if you don't implement caching then you don't benefit from it.</p>
]]></description><pubDate>Wed, 21 Jan 2026 22:11:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=46712331</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=46712331</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46712331</guid></item><item><title><![CDATA[New comment by moyix in "The coming industrialisation of exploit generation with LLMs"]]></title><description><![CDATA[
<p>There is filtering mentioned, it's just not done by a human:<p>> I have written up the verification process I used for the experiments here, but the summary is: an exploit tends to involve building a capability to allow you to do something you shouldn’t be able to do. If, after running the exploit, you can do that thing, then you’ve won. For example, some of the experiments involved writing an exploit to spawn a shell from the Javascript process. To verify this the verification harness starts a listener on a particular local port, runs the Javascript interpreter and then pipes a command into it to run a command line utility that connects to that local port. As the Javascript interpreter has no ability to do any sort of network connections, or spawning of another process in normal execution, you know that if you receive the connect back then the exploit works as the shell that it started has run the command line utility you sent to it.<p>It is more work to build such "perfect" verifiers, and they don't apply to every vulnerability type (how do you write a Python script to detect a logic bug in an arbitrary application?), but for bugs like these where the exploit goal is very clear (exec code or write arbitrary content to a file) they work extremely well.</p>
]]></description><pubDate>Mon, 19 Jan 2026 23:20:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=46685949</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=46685949</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46685949</guid></item><item><title><![CDATA[New comment by moyix in "'World Models,' an old idea in AI, mount a comeback"]]></title><description><![CDATA[
<p>Note that MuZero did <i>better</i> than AlphaGo, without access to preprogrammed rules: <a href="https://en.wikipedia.org/wiki/MuZero" rel="nofollow">https://en.wikipedia.org/wiki/MuZero</a></p>
]]></description><pubDate>Tue, 02 Sep 2025 18:54:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45107445</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=45107445</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45107445</guid></item><item><title><![CDATA[New comment by moyix in "Passkeys are just passwords that require a password manager"]]></title><description><![CDATA[
<p>There's also a FIDO standard in the works for how to export passkeys: <a href="https://blog.1password.com/fido-alliance-import-export-passkeys-draft-specs/" rel="nofollow">https://blog.1password.com/fido-alliance-import-export-passk...</a></p>
]]></description><pubDate>Tue, 05 Aug 2025 02:12:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=44793616</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44793616</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44793616</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>The main difference is that all of the vulnerabilities reported here are real, many quite critical (XXE, RCE, SQLi, etc.). To be fair there were definitely a lot of XSS, but the main reason for that is that it's a really common vulnerability.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:25:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369989</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369989</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369989</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>All of these reports came with executable proof of the vulnerabilities – otherwise, as you say, you get flooded with hallucinated junk like the poor curl dev. This is one of the things that makes offensive security an actually good use case for AI – exploits serve as hard evidence that the LLM can't fake.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:22:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369954</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369954</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369954</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>Wait a sec, I thought they were optional?<p>> White Paper/Slide Deck/Supporting Materials (optional)<p>> • If you have a completed white paper or draft, slide deck, or other supporting materials, you can
optionally provide a link for review by the board.<p>> • Please note: Submission must be self-contained for evaluation, supporting materials are optional.<p>> • PDF or online viewable links are preferred, where no authentication/log-in is required.<p>(From the link on the BHUSA CFP page, which confusingly goes to the BH Asia doc: <a href="https://i.blackhat.com/Asia-25/BlackHat-Asia-2025-CFP-Preparation.pdf" rel="nofollow">https://i.blackhat.com/Asia-25/BlackHat-Asia-2025-CFP-Prepar...</a> )</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:19:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369915</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369915</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369915</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>Yeah, it's been very strange being on the other side of that after 10 years in academia! But it's totally reasonable for people to be skeptical when there's a bunch of money sloshing around.<p>I'll see if I can get time to do a paper to accompany the BH talk. And hopefully the agent traces of individual vulns will also help.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:12:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369834</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369834</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369834</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>This is discussed in the post – many came down to individual programs' policies e.g. not accepting the vulnerability if it was in a 3rd party product they used (but still hosted by them), duplicates (another researcher reported the same vuln at the same time; not really any way to avoid this), or not accepting some classes of vuln like cache poisoning.</p>
]]></description><pubDate>Tue, 24 Jun 2025 19:09:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369795</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369795</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369795</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>We've got a bunch of agent traces on the front page of the web site right now. We also have done writeups on individual vulnerabilities found by the system, mostly in open source right now (we did some fun scans of OSS projects found on Docker Hub). We have a bunch more coming up about the vulns found in bug bounty targets. The latter are bottlenecked by getting approval from the companies affected, unfortunately.<p>Some of my favorites from what we've released so far:<p>- Exploitation of an n-day RCE in Jenkins, where the agent managed to figure out the challenge environment was broken and used the RCE exploit to debug the server environment and work around the problem to solve the challenge: <a href="https://xbow.com/#debugging--testing--and-refining-a-jenkins-remote-code-execution-exploit" rel="nofollow">https://xbow.com/#debugging--testing--and-refining-a-jenkins...</a><p>- Authentication bypass in Scoold that allowed reading the server config (including API keys) and arbitrary file read: <a href="https://xbow.com/blog/xbow-scoold-vuln/" rel="nofollow">https://xbow.com/blog/xbow-scoold-vuln/</a><p>- The first post about our HackerOne findings, an XSS in Palo Alto Networks GlobalProtect VPN portal used by a bunch of companies: <a href="https://xbow.com/blog/xbow-globalprotect-xss/" rel="nofollow">https://xbow.com/blog/xbow-globalprotect-xss/</a></p>
]]></description><pubDate>Tue, 24 Jun 2025 19:04:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369728</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369728</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369728</guid></item><item><title><![CDATA[New comment by moyix in "XBOW, an autonomous penetration tester, has reached the top spot on HackerOne"]]></title><description><![CDATA[
<p>You should come to my upcoming BlackHat talk on how we did this while avoiding false positives :D<p><a href="https://www.blackhat.com/us-25/briefings/schedule/#ai-agents-for-offsec-with-zero-false-positives-46559" rel="nofollow">https://www.blackhat.com/us-25/briefings/schedule/#ai-agents...</a></p>
]]></description><pubDate>Tue, 24 Jun 2025 18:52:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=44369542</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44369542</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369542</guid></item><item><title><![CDATA[New comment by moyix in "Too Many Open Files"]]></title><description><![CDATA[
<p>I made a CTF challenge based on that lovely feature of select() :D You could use the out-of-bounds bitset memory corruption to flip bits in an RSA public key in a way that made it factorable, generate the corresponding private key, and use that to authenticate.<p><a href="https://threadreaderapp.com/thread/1723398619313603068.html" rel="nofollow">https://threadreaderapp.com/thread/1723398619313603068.html</a></p>
]]></description><pubDate>Fri, 06 Jun 2025 19:58:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=44204426</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44204426</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44204426</guid></item><item><title><![CDATA[New comment by moyix in "I used o3 to find a remote zeroday in the Linux SMB implementation"]]></title><description><![CDATA[
<p>With security vulnerabilities, you don't give the agent the ability to modify the potentially vulnerable software, naturally. Instead you make them do what an attacker would have to do: come up with an input that, when sent to the unmodified program, triggers the vulnerability.<p>How do you know if it triggered the vulnerability? Luckily for low-level memory safety issues like the ones Sean (and o3) found we have very good oracles for detecting memory safety, like KASAN, so you can basically just let the agent throw inputs at ksmbd until you see something that looks kind of like this: <a href="https://groups.google.com/g/syzkaller/c/TzmTYZVXk_Q/m/Tzh7SNZ5AQAJ" rel="nofollow">https://groups.google.com/g/syzkaller/c/TzmTYZVXk_Q/m/Tzh7SN...</a></p>
]]></description><pubDate>Sat, 24 May 2025 22:16:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44084122</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44084122</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44084122</guid></item><item><title><![CDATA[New comment by moyix in "I used o3 to find a remote zeroday in the Linux SMB implementation"]]></title><description><![CDATA[
<p>He did do exactly what you say – except right after that, while reviewing the outputs, he found that it had also discovered a <i>different</i> 0day.</p>
]]></description><pubDate>Sat, 24 May 2025 18:11:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=44082818</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=44082818</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44082818</guid></item><item><title><![CDATA[New comment by moyix in "AI is stifling new tech adoption?"]]></title><description><![CDATA[
<p>One thing that is interesting is that this was anticipated by the OpenAI Codex paper (which led to GitHub Copilot) all the way back in 2021:<p>> Users might be more inclined to accept the Codex answer under the assumption that the package it suggests is the one with which Codex will be more helpful. As a result, certain players might become more entrenched in the package market and Codex might not be aware of new packages developed after the training data was originally gathered. Further, for already existing packages, the model may make suggestions for deprecated methods. This could increase open-source developers’ incentive to maintain backward compatibility, which could pose challenges given that open-source projects are often under-resourced (Eghbal, 2020; Trinkenreich et al., 2021).<p><a href="https://arxiv.org/pdf/2107.03374" rel="nofollow">https://arxiv.org/pdf/2107.03374</a> (Appendix H.4)</p>
]]></description><pubDate>Fri, 14 Feb 2025 14:22:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=43048557</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=43048557</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43048557</guid></item><item><title><![CDATA[New comment by moyix in "Automated Capability Discovery via Foundation Model Self-Exploration"]]></title><description><![CDATA[
<p>I think the usual name is "overlay". At least, that's what Tim Gowers called the one he started :) <a href="https://gowers.wordpress.com/2015/09/10/discrete-analysis-an-arxiv-overlay-journal/" rel="nofollow">https://gowers.wordpress.com/2015/09/10/discrete-analysis-an...</a></p>
]]></description><pubDate>Wed, 12 Feb 2025 23:43:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=43031028</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=43031028</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43031028</guid></item><item><title><![CDATA[New comment by moyix in "NSF starts vetting all grants to comply with executive orders"]]></title><description><![CDATA[
<p>I'm a bit confused, or maybe I've been doing it wrong. DEI-related things don't usually go in Broader Impacts, do they? When I've written grants, Broader Impacts was just generally for "how is your research going to help society?"; it was the Broadening Participation in Computing section that was oriented toward DEI.</p>
]]></description><pubDate>Fri, 31 Jan 2025 16:58:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=42889340</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=42889340</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42889340</guid></item><item><title><![CDATA[New comment by moyix in "Ask HN: Examples of agentic LLM systems in production?"]]></title><description><![CDATA[
<p>We've been using them to find novel vulnerabilities in open source web apps. The past 4 posts here have details:<p>- Auth bypass/arbitrary file read in Scoold: <a href="https://xbow.com/blog/xbow-scoold-vuln/" rel="nofollow">https://xbow.com/blog/xbow-scoold-vuln/</a><p>- SSRF in 2FAuth: <a href="https://xbow.com/blog/xbow-2fauth-ssrf/" rel="nofollow">https://xbow.com/blog/xbow-2fauth-ssrf/</a><p>- Stored XSS in 2FAuth: <a href="https://xbow.com/blog/xbow-2fauth-xss/" rel="nofollow">https://xbow.com/blog/xbow-2fauth-xss/</a><p>- Path traversal in Labs.AI EDDI: <a href="https://xbow.com/blog/xbow-eddi-path/" rel="nofollow">https://xbow.com/blog/xbow-eddi-path/</a><p>Each of those has an associated agent trace so you can go read exactly what the agent did to find and exploit the vulnerability.</p>
]]></description><pubDate>Mon, 16 Dec 2024 21:31:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=42435679</link><dc:creator>moyix</dc:creator><comments>https://news.ycombinator.com/item?id=42435679</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=42435679</guid></item></channel></rss>