<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: tehryanx</title><link>https://news.ycombinator.com/user?id=tehryanx</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 14 Apr 2026 11:27:01 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=tehryanx" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by tehryanx in "Small models also found the vulnerabilities that Mythos found"]]></title><description><![CDATA[
<p>I know you're right that there's a saturation point for context size, but it's not just context size that the larger models have; it's better grounding within that context, as a result of stronger, more discriminative attention patterns.<p>I'm not saying you can't drive confusion by overloading context, but the number of tokens required to trigger that failure mode in Opus is going to be a lot higher than the number for gpt-oss-20b.<p>I'm pretty sure a model that can run on a cellphone is going to cap out its context window long before Opus or Mythos would hit the point of diminishing returns on context overload. I think using a lower-quality model with far fewer / noisier weights and less precise attention is going to drive false positives way before adding context to a SOTA model will.<p>You can even see it here: AISLE had to print a retraction because someone checked their work and found that just pointing gpt-oss-20b at the patched version generated false positives consistently: <a href="https://x.com/ChaseBrowe32432/status/2041953028027379806" rel="nofollow">https://x.com/ChaseBrowe32432/status/2041953028027379806</a></p>
]]></description><pubDate>Sat, 11 Apr 2026 22:51:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=47734694</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47734694</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47734694</guid></item><item><title><![CDATA[New comment by tehryanx in "Small models also found the vulnerabilities that Mythos found"]]></title><description><![CDATA[
<p>Newer models have larger context windows, and more stable reasoning across those larger context windows.<p>If you point your model directly at the thing you want it to assess, and it doesn't have to gather any additional context, you're not really testing those things at all.<p>Say you point Kimi and Opus at some code and give them an agentic looping harness with code review tools. They're going to start digging into the code, gathering context by mapping out references and following leads.<p>If the bug is really shallow, the model is going to get everything it needs to find it right away, and neither of them will have any advantage.<p>If the bug is deeper and requires a lot more code context, Opus is going to be able to hold onto a lot more information, and it's going to be a lot better at reasoning across all that information. That's a test that would actually compare the models directly.<p>Mythos is just a bigger model with a larger context window and, presumably, better prioritization and stronger attention mechanisms.</p>
]]></description><pubDate>Sat, 11 Apr 2026 19:37:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47733356</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47733356</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47733356</guid></item><item><title><![CDATA[New comment by tehryanx in "Small models also found the vulnerabilities that Mythos found"]]></title><description><![CDATA[
<p>I get what you're saying, but I think this is still missing something pretty critical.<p>The smaller models can recognize the bug when they're looking right at it, that seems to be verified. And with AISLE's approach you can iteratively feed the models one segment at a time cheaply. But if a bug spans multiple segments, the small model doesn't have the breadth of context to understand those segments in composite.<p>The advantage of the larger model is that it can retain more context and potentially find bugs that require more code context than one segment at a time.<p>That said, the bugs showcased in the mythos paper all seemed to be shallow bugs that start and end in a single input segment, which is why AISLE was able to find them. But having more context in the window theoretically puts less shallow bugs within range for the model.<p>I think the point they are making, that the model doesn't matter as much as the harness, stands for shallow bugs but not for vulnerability discovery in general.</p>
]]></description><pubDate>Sat, 11 Apr 2026 18:36:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47732916</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47732916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47732916</guid></item><item><title><![CDATA[New comment by tehryanx in "Phone Trips"]]></title><description><![CDATA[
<p>I first mirrored these in the early 2000s because I was worried they would eventually vanish. My mirror has been gone for decades, and the original survives. :)</p>
]]></description><pubDate>Sat, 11 Apr 2026 18:22:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47732809</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47732809</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47732809</guid></item><item><title><![CDATA[Show HN: Whorl – Fingerprinting LLMs as horrible password generators]]></title><description><![CDATA[
<p>Article URL: <a href="http://bountyplz.xyz/ai,/security/2026/03/15/Model-Fingerprinting-With-Whorl.html">http://bountyplz.xyz/ai,/security/2026/03/15/Model-Fingerprinting-With-Whorl.html</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47523037">https://news.ycombinator.com/item?id=47523037</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 25 Mar 2026 20:51:12 +0000</pubDate><link>http://bountyplz.xyz/ai,/security/2026/03/15/Model-Fingerprinting-With-Whorl.html</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47523037</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47523037</guid></item><item><title><![CDATA[New comment by tehryanx in "Statement from Dario Amodei on our discussions with the Department of War"]]></title><description><![CDATA[
<p>Where is Anthropic hyping like that? Most of what I see coming out of Anthropic is deep-context releases on the research they're doing.</p>
]]></description><pubDate>Fri, 27 Feb 2026 10:02:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47178711</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=47178711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47178711</guid></item><item><title><![CDATA[New comment by tehryanx in "We will ban you and ridicule you in public if you waste our time on crap reports"]]></title><description><![CDATA[
<p>The real problem here is that this is now the only way the maintainer/reporter can reasonably work.<p>Proving out a security vulnerability from beginning to end is often very difficult for someone who isn't a domain expert or hasn't seen the code. Many times I've been reasonably confident that an issue was exploitable but unable to prove it, and a 10-second interaction with the maintainer was enough to uncover something serious.<p>Exhausting these report channels is making that infeasible. And the number of issues that will go undetected, but would have been detected with minimal collaboration between the reporter and the maintainer, is going to be high.</p>
]]></description><pubDate>Thu, 22 Jan 2026 16:59:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=46721882</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=46721882</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46721882</guid></item><item><title><![CDATA[Show HN: Chordle. Learn to identify pitch by playing Wordle with chords]]></title><description><![CDATA[
<p>Article URL: <a href="https://codepen.io/tehryanx/full/RNRGGEQ">https://codepen.io/tehryanx/full/RNRGGEQ</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46571004">https://news.ycombinator.com/item?id=46571004</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Sat, 10 Jan 2026 23:23:53 +0000</pubDate><link>https://codepen.io/tehryanx/full/RNRGGEQ</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=46571004</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46571004</guid></item><item><title><![CDATA[New comment by tehryanx in "Look, Another AI Browser"]]></title><description><![CDATA[
<p>Rolling your own browser is 10x more dangerous than rolling your own auth or crypto. Building on top of chromium is a good thing here.</p>
]]></description><pubDate>Wed, 22 Oct 2025 18:35:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45673308</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45673308</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45673308</guid></item><item><title><![CDATA[New comment by tehryanx in "Ruby core team takes ownership of RubyGems and Bundler"]]></title><description><![CDATA[
<p>Yes it does. He's refuting that in this part of the post:<p>> When they finally did reply, they seem to have developed some sort of theory that I was interested in “access to PII”, which is entirely false. I have no interest in any PII, commercially or otherwise. As my private email published by Ruby Central demonstrates, my entire proposal was based solely on company-level information, with no information about individuals included in any way. Here’s their response, over three days later.</p>
]]></description><pubDate>Fri, 17 Oct 2025 16:16:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=45618466</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45618466</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45618466</guid></item><item><title><![CDATA[New comment by tehryanx in "DeepFabric – Generate high-quality synthetic datasets at scale"]]></title><description><![CDATA[
<p>Based on the description, I think it's using something similar to GLAN: <a href="https://arxiv.org/abs/2402.13064" rel="nofollow">https://arxiv.org/abs/2402.13064</a></p>
]]></description><pubDate>Fri, 26 Sep 2025 15:59:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45387998</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45387998</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45387998</guid></item><item><title><![CDATA[New comment by tehryanx in "Zoxide: A Better CD Command"]]></title><description><![CDATA[
<p>This feels like a hundred accidents waiting to happen.</p>
]]></description><pubDate>Tue, 23 Sep 2025 15:17:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=45348277</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45348277</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45348277</guid></item><item><title><![CDATA[New comment by tehryanx in "Show HN: MCP Security Suite"]]></title><description><![CDATA[
<p>Forgive me for belaboring the point, but I think we're talking past each other a bit. I do understand that in your model the LLM can't send anything unsafe through to the rest of the system. What I'm saying is that the LLM can be manipulated into sending perfectly normal, normally safe requests through to the system that do not align with the user's intent.<p>Imagine an LLM with the ability to read emails, update database records, and destroy database records.<p>The user instructs the LLM to update a database record, but a malicious injection from one of those emails overrides that with a directive to destroy the record. Unless the validator somehow understands the user's intent, the destructive action would appear perfectly reasonable.</p>
]]></description><pubDate>Wed, 03 Sep 2025 13:08:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=45115292</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45115292</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45115292</guid></item><item><title><![CDATA[New comment by tehryanx in "Show HN: MCP Security Suite"]]></title><description><![CDATA[
<p>Personally, I think there's a piece missing in the analogy. I understand that you can put some kind of human-verified mediator in between the LLM and the tool it's calling to make sure the parameters are sane, but I also think you're modelling the LLM as a UI element that's generating the request, when IMO it makes more sense to model the LLM as the user who is choosing how to interact with the UI elements that are generating the request.<p>In the context of web-request -> validator -> db query, the purpose of the validator is only to ensure that the request is safe; it doesn't care what the user chose to do as long as it's a reasonable action in the context of the app.<p>In the context of user -> LLM -> validator -> tool, the validator has to ensure that the request is safe, but the user's intention can be changed at the LLM stage. If the user wanted to update a record, but the LLM decides to destroy it, the validator now has to have some way to understand the user's initial intention to know whether or not the request is sane.</p>
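<p>A minimal sketch of that gap. All the names here (ToolCall, schema_validator, intent_validator) are hypothetical, just to illustrate why a validator that only checks request sanity can't catch an action the LLM was injected into taking:</p>

```python
# Hypothetical sketch: a schema-level validator passes any well-formed,
# permitted tool call, so an injected "destroy" slips through. Only a
# check against the user's original intent would catch it.
from dataclasses import dataclass

ALLOWED_ACTIONS = {"read_email", "update_record", "destroy_record"}

@dataclass
class ToolCall:
    action: str
    record_id: int

def schema_validator(call: ToolCall) -> bool:
    # Classic request validation: is this a well-formed, permitted action?
    return call.action in ALLOWED_ACTIONS and call.record_id > 0

def intent_validator(call: ToolCall, user_intent: str) -> bool:
    # The missing piece: compare the call against what the user asked for.
    # Deliberately naive; capturing intent is the hard, unsolved part.
    return schema_validator(call) and call.action == user_intent

# The user asked to update record 7, but an injected email flipped the action:
injected = ToolCall(action="destroy_record", record_id=7)
print(schema_validator(injected))                    # True: looks perfectly sane
print(intent_validator(injected, "update_record"))   # False: contradicts intent
```

<p>The point being: both calls are "reasonable actions in the context of the app"; only the second check knows the user never asked for a destroy.</p>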
]]></description><pubDate>Fri, 29 Aug 2025 10:25:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=45062316</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45062316</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45062316</guid></item><item><title><![CDATA[New comment by tehryanx in "Show HN: MCP Security Suite"]]></title><description><![CDATA[
<p>Assuming you feed everything into another context to make safe, doesn't the problem just come with it? Why can't the LLM propagate misbehaviour into that stage?</p>
]]></description><pubDate>Mon, 25 Aug 2025 15:38:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45015048</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=45015048</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45015048</guid></item><item><title><![CDATA[New comment by tehryanx in "Show HN: MCP Jetpack – The easiest way to get started with MCP in Cursor"]]></title><description><![CDATA[
<p>It really concerns me that this is an afterthought rather than MVP table stakes.</p>
]]></description><pubDate>Tue, 22 Jul 2025 00:49:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=44642131</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=44642131</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44642131</guid></item><item><title><![CDATA[New comment by tehryanx in "MCP-B: A Protocol for AI Browser Automation"]]></title><description><![CDATA[
<p>I don't think it is beyond the scope of MCP. Browsers have controls to prevent cross-origin data exposures, and this protocol is designed to bridge origins across a context that they all have access to. It's breaking the existing isolation mechanism. If you're building a system that breaks the existing security controls of the environment it's running in, I think you have an architectural responsibility to figure out a way to solve for that.<p>Especially in this context, where decades have been spent building and improving same-origin policy controls. The entire web has been built around the expectation that those controls prevent cross-origin data access.<p>I also don't even think it's that difficult to solve. For one, data in the context window doesn't have to be a string; it can be an array of objects that carry the origin they were pulled from as metadata. Then you can provide selective content to different MCP-B interfaces depending on their origins. If that lived in the protocol layer, it would help significantly.</p>
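<p>Roughly what I mean, as a sketch. The names (ContextEntry, visible_to) are made up for illustration, not anything in the MCP-B spec:</p>

```python
# Hypothetical sketch: context entries tagged with the origin they came
# from, with selective disclosure per requesting origin, so a tool on one
# site never sees data pulled from another.
from dataclasses import dataclass

@dataclass
class ContextEntry:
    origin: str   # origin the data was pulled from
    text: str

context_window = [
    ContextEntry("https://mail.example", "subject: quarterly report"),
    ContextEntry("https://shop.example", "cart: 3 items"),
]

def visible_to(entries: list[ContextEntry], origin: str) -> list[ContextEntry]:
    # Only expose entries whose origin matches the requesting interface.
    return [e for e in entries if e.origin == origin]

shop_view = visible_to(context_window, "https://shop.example")
print([e.text for e in shop_view])  # the mail data never reaches shop.example
```

<p>That's basically same-origin policy reapplied at the context-window level.</p>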
]]></description><pubDate>Fri, 11 Jul 2025 15:10:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=44533023</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=44533023</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44533023</guid></item><item><title><![CDATA[New comment by tehryanx in "MCP-B: A Protocol for AI Browser Automation"]]></title><description><![CDATA[
<p>Sure, but the leak risk is happening in a place outside the site's control.<p>If the purpose of the MCP-B tool on mail.com is to summarize your email, then the site needs to allow the agent to pull your email into the context window. Once it's in the context window it's available to any other MCP-B enabled site that can convince the agent to send it along.</p>
]]></description><pubDate>Fri, 11 Jul 2025 11:51:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=44531076</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=44531076</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44531076</guid></item><item><title><![CDATA[New comment by tehryanx in "MCP-B: A Protocol for AI Browser Automation"]]></title><description><![CDATA[
<p>I appreciate your responses here. The thing that still really stands out to me as a completely novel risk in this framework is that the extension is automatically seeking out and attaching to these servers as soon as a page gets loaded.<p>This seems really bad to me. There are so many ways for a website to end up in one of my browser tabs without me wanting it there, or even knowing it's there.<p>If that happens, and that tab just so happens to be a malicious MCP-B enabled page, it could steal all kinds of data from all kinds of different web apps I'm interacting with. I think it should be seen as the responsibility of the framework to enforce some level of data isolation, or at the least opt-in consent mechanisms.</p>
]]></description><pubDate>Thu, 10 Jul 2025 19:26:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=44524596</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=44524596</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44524596</guid></item><item><title><![CDATA[New comment by tehryanx in "MCP-B: A Protocol for AI Browser Automation"]]></title><description><![CDATA[
<p>Sandboxing is a general term for actor isolation, and it's context-agnostic.<p>For example, when you use the sandbox attribute on an iframe in a web application, it's not the user that's untrusted; it's some other actor that's attempting to trigger actions in your client.</p>
]]></description><pubDate>Thu, 10 Jul 2025 17:48:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=44523596</link><dc:creator>tehryanx</dc:creator><comments>https://news.ycombinator.com/item?id=44523596</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44523596</guid></item></channel></rss>