<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: stratos123</title><link>https://news.ycombinator.com/user?id=stratos123</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 24 Apr 2026 21:34:53 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=stratos123" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by stratos123 in "OpenAI model for masking personally identifiable information (PII) in text"]]></title><description><![CDATA[
<p>There are some interesting technical details in this release:<p>> Privacy Filter is a bidirectional token-classification model with span decoding. It begins from an autoregressive pretrained checkpoint and is then adapted into a token classifier over a fixed taxonomy of privacy labels. Instead of generating text token by token, it labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.<p>> The released model has 1.5B total parameters with 50M active parameters.<p>> [To build it] we converted a pretrained language model into a bidirectional token classifier by replacing the language modeling head with a token-classification head and post-training it with a supervised classification objective.</p>
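For the curious, here's a minimal sketch of what "constrained Viterbi" span decoding could look like. The label names, BIO tagging scheme, and scores are my assumptions for illustration, not details from the release: the idea is that per-token classifier scores get decoded into the best label path that respects hard transition constraints (e.g. an inside tag must continue a span), which is what makes the output spans coherent.

```python
import math

LABELS = ["O", "B-EMAIL", "I-EMAIL"]  # hypothetical BIO tags for one privacy label

def allowed(prev: str, cur: str) -> bool:
    # The one hard constraint: an inside tag must continue a span of its type.
    if cur == "I-EMAIL":
        return prev in ("B-EMAIL", "I-EMAIL")
    return True

def viterbi(scores: list[dict[str, float]]) -> list[str]:
    # scores[t][label]: per-token log-score from the classification head.
    # best[t][label] = (best path score ending in `label` at t, previous label)
    best = [{lab: ((scores[0][lab] if allowed("O", lab) else -math.inf), None)
             for lab in LABELS}]
    for t in range(1, len(scores)):
        row = {}
        for lab in LABELS:
            cands = [(best[t - 1][p][0] + scores[t][lab], p)
                     for p in LABELS
                     if allowed(p, lab) and best[t - 1][p][0] > -math.inf]
            row[lab] = max(cands) if cands else (-math.inf, None)
        best.append(row)
    # Backtrack from the best final label.
    lab = max(LABELS, key=lambda l: best[-1][l][0])
    path = [lab]
    for t in range(len(scores) - 1, 0, -1):
        lab = best[t][lab][1]
        path.append(lab)
    return path[::-1]

def spans(path: list[str]) -> list[tuple[int, int]]:
    # Collapse the BIO path into [start, end) spans.
    out, start = [], None
    for i, lab in enumerate(path):
        if start is not None and lab != "I-EMAIL":
            out.append((start, i))
            start = None
        if lab == "B-EMAIL":
            start = i
    if start is not None:
        out.append((start, len(path)))
    return out

scores = [
    {"O": -0.3, "B-EMAIL": -0.4, "I-EMAIL": -0.2},   # greedy argmax would pick an invalid I here
    {"O": -2.0, "B-EMAIL": -2.0, "I-EMAIL": -0.05},
    {"O": -0.1, "B-EMAIL": -3.0, "I-EMAIL": -2.0},
]
path = viterbi(scores)
print(path, spans(path))  # one coherent span instead of a stray inside-tag
```

Note that greedy per-token argmax on these scores would emit "I-EMAIL" at position 0 with no opening tag; the transition constraints force the decoder onto a globally coherent path instead.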
]]></description><pubDate>Thu, 23 Apr 2026 09:30:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47873741</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47873741</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47873741</guid></item><item><title><![CDATA[New comment by stratos123 in "Polymarket weather bet manipulated with a hairdryer"]]></title><description><![CDATA[
<p>I've seen the markets and they sure did move wildly, but what confirms that this was manipulated? (The alternative I was thinking of is "this twitter poster saw those two market resolutions and made up a narrative connecting them").</p>
]]></description><pubDate>Thu, 23 Apr 2026 09:24:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47873713</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47873713</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47873713</guid></item><item><title><![CDATA[New comment by stratos123 in "Polymarket weather bet manipulated with a hairdryer"]]></title><description><![CDATA[
<p>Is there a source confirming that this in fact happened other than this tweet?</p>
]]></description><pubDate>Wed, 22 Apr 2026 23:45:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47870698</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47870698</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47870698</guid></item><item><title><![CDATA[New comment by stratos123 in "We found a stable Firefox identifier linking all your private Tor identities"]]></title><description><![CDATA[
<p>No, it does allow identification across different websites (the article says "both cross-origin and same-origin tracking"). Both websites just need to create databases with the same names. Since the databases are origin-scoped, these <i>aren't</i> the same databases, so you can't just write some data into one and read it on another website. But it turns out that if two websites use the same names for all these databases, the order in which the list of databases is returned is random per user but identical regardless of website.</p>
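To make the mechanism concrete, here's a toy simulation (plain Python, not browser code; the seed and names are made up) of why an origin-independent but per-user-random enumeration order is a cross-site identifier: two sites that create the same N database names observe the same permutation, which carries up to log2(N!) bits.

```python
import math
import random

DB_NAMES = [f"db{i}" for i in range(10)]  # names both sites create

def enumerate_dbs(user_seed: int, origin: str) -> list[str]:
    # The hypothesised bug: the order depends on the user (seed),
    # while `origin` is never consulted.
    order = DB_NAMES[:]
    random.Random(user_seed).shuffle(order)
    return order

alice_on_a = enumerate_dbs(user_seed=42, origin="https://a.example")
alice_on_b = enumerate_dbs(user_seed=42, origin="https://b.example")

assert alice_on_a == alice_on_b  # same user: same order on every site
# Different users get (almost certainly) different permutations, so the
# order itself is the tracking identifier.
print(f"{math.log2(math.factorial(len(DB_NAMES))):.1f} bits")  # ~21.8 bits for 10 names
```

With only 10 shared names that's already far more entropy than needed to distinguish users when combined with anything else.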
]]></description><pubDate>Wed, 22 Apr 2026 23:31:02 +0000</pubDate><link>https://news.ycombinator.com/item?id=47870585</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47870585</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47870585</guid></item><item><title><![CDATA[New comment by stratos123 in "Kernel code removals driven by LLM-created security reports"]]></title><description><![CDATA[
<p>A 0% false-positive rate is not necessary for LLM-powered security review to be a big deal. It was worthless a few months ago, when the models were terrible at actually finding vulnerabilities and so basically all the reports were confabulated, with a false positive rate of >95%. Nowadays things are much better - see e.g. [1] by a kernel maintainer.<p>Another way to see this is that you mentioned "LLM found this serious bug in Firefox", but the actual number in that Mozilla report [2] was 14 high-severity bugs and 90 minor ones. However you look at it, it's an impressive result for a security audit, and I doubt that the Anthropic team had to manually filter out hundreds to thousands of false positives to produce it.<p>They did have to manually write minimal exploits for each bug, because Opus was bad at it [3]. This is a problem that Mythos doesn't have. With access to Mythos, to repeat the same audit, you'd likely just need to make the model itself write all the exploits, which incidentally would also filter out a lot of the false positives. I think the hype is mostly justified.<p>[1] <a href="https://lwn.net/Articles/1065620/" rel="nofollow">https://lwn.net/Articles/1065620/</a><p>[2] <a href="https://blog.mozilla.org/en/firefox/hardening-firefox-anthropic-red-team/" rel="nofollow">https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...</a><p>[3] <a href="https://www.anthropic.com/news/mozilla-firefox-security" rel="nofollow">https://www.anthropic.com/news/mozilla-firefox-security</a></p>
]]></description><pubDate>Wed, 22 Apr 2026 14:10:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47863933</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47863933</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47863933</guid></item><item><title><![CDATA[New comment by stratos123 in "Kernel code removals driven by LLM-created security reports"]]></title><description><![CDATA[
<p>In terms of quantity, definitely yes (a single person managing a swarm of Opusi can already find many more real bugs than a security researcher, hence the rise in reports).<p>In terms of quality ("are there bugs that professional humans can't see at any budget but LLMs can?") - it's not very clear, because Opus is still worse than a human specialist, but Mythos might be comparable. We'll just have to wait and see what results Project Glasswing gets.<p>Either way, cybersecurity is going to get real weird real soon, because even slightly-dumb models can have a large effect if they are cheap and fast enough.<p>EDIT: Mozilla thinks "no" to the second question, by the way: "Encouragingly, we also haven’t seen any bugs that couldn’t have been found by an elite human researcher.", when talking about the 271 vulnerabilities recently found by Mythos. <a href="https://blog.mozilla.org/en/firefox/ai-security-zero-day-vulnerabilities/" rel="nofollow">https://blog.mozilla.org/en/firefox/ai-security-zero-day-vul...</a></p>
]]></description><pubDate>Wed, 22 Apr 2026 13:50:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47863630</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47863630</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47863630</guid></item><item><title><![CDATA[New comment by stratos123 in "Claude Opus 4.7"]]></title><description><![CDATA[
<p>Supposedly that's because they stopped optimizing for MRCR and use GraphWalks as their measure of long context now: <a href="https://twitter.com/bcherny/status/2044821690920980626" rel="nofollow">https://twitter.com/bcherny/status/2044821690920980626</a></p>
]]></description><pubDate>Fri, 17 Apr 2026 20:48:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47810411</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47810411</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47810411</guid></item><item><title><![CDATA[New comment by stratos123 in "Tennessee is about to make building chatbots a Class A felony"]]></title><description><![CDATA[
<p>Is this a joke? This bill does not sound like it was written by a person who even knows x-risks are a thing; it seems to be trying to fight parasocial attachments to LLMs.<p>And even if it were otherwise, it'd be nearly useless since it's a law of one particular state. From the very post you linked: "If an ASI ban is to accomplish anything at all, it has to be effective everywhere. [...]  Driving an AI company out of just your own city will not protect your family from death. It won't even protect your city from job losses, earlier in the timeline."</p>
]]></description><pubDate>Fri, 17 Apr 2026 19:10:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47809450</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47809450</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47809450</guid></item><item><title><![CDATA[New comment by stratos123 in "Five men control AI. Who should control them?"]]></title><description><![CDATA[
<p>These arguments have been going on for more than a decade and have been silly the whole time.<p>> It reminds me of the 'Einstein's superintelligent cat' refutation to such fallacies.<p>One (of the many) problem(s) with this "refutation" is that in reality not only does nobody bother to lock the superintelligent cat in a room and leave it no available actions, but you're lucky if they don't hook the cat up directly to the internet. It doesn't matter whether you could maybe control a superintelligence, if you were very careful and treating it very seriously, when nobody is even trying, much less being very careful.</p>
]]></description><pubDate>Fri, 17 Apr 2026 18:50:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47809257</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47809257</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47809257</guid></item><item><title><![CDATA[New comment by stratos123 in "Five men control AI. Who should control them?"]]></title><description><![CDATA[
<p>Chinese companies are lagging behind and always have. They are fast-followers but don't decide the frontier.</p>
]]></description><pubDate>Fri, 17 Apr 2026 18:45:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47809218</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47809218</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47809218</guid></item><item><title><![CDATA[New comment by stratos123 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>Why not just download the binaries from github releases?</p>
]]></description><pubDate>Thu, 16 Apr 2026 21:10:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47799539</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47799539</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47799539</guid></item><item><title><![CDATA[New comment by stratos123 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>It pretty much just works. Run the unsloth quant in llama.cpp and hook it up to pi. There are a bunch of minor annoyances, like no support for thinking effort. It also defaults to "interleaved thinking" (thinking blocks get stripped from context); set `"chat_template_kwargs": {"preserve_thinking": True},` if you interrupt the model often and don't want it to forget what it was thinking.</p>
]]></description><pubDate>Thu, 16 Apr 2026 21:07:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47799519</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47799519</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47799519</guid></item><item><title><![CDATA[New comment by stratos123 in "Qwen3.6-35B-A3B: Agentic coding power, now open to all"]]></title><description><![CDATA[
<p>One interesting thing about Qwen3 is that looking at the benchmarks, the 35B-A3B models seem to be only a bit worse than the dense 27B ones. This is very different from Gemma 4, where the 26B-A4B model is much worse on several benchmarks (e.g. Codeforces, HLE) than 31B.</p>
]]></description><pubDate>Thu, 16 Apr 2026 20:50:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47799326</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47799326</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47799326</guid></item><item><title><![CDATA[New comment by stratos123 in "I ran Gemma 4 as a local model in Codex CLI"]]></title><description><![CDATA[
<p>> I suspect a possible future of local models is extreme specialisation - you load a Python-expert model for Python coding, do your shopping with a model focused just on this task, have a model specialised in speech-to-text plus automation to run your smart home, and so on.<p>I'd find this very surprising, since a lot of cognitive skills are general. At least on the scale of "being trained on a lot of non-Python code improves a model's capabilities in Python", but maybe even "being trained on a lot of unrelated tasks that require perseverance improves a model's capabilities in agentic coding".<p>For this reason there are currently very few specialist models - training on specialized datasets just doesn't work all that well. For example, there are the tiny Jetbrains Mellum models meant for in-editor autocomplete, but even those are AFAIK merely <i>fine-tuned</i> on specific languages, while their pretraining dataset is mixed-language.</p>
]]></description><pubDate>Tue, 14 Apr 2026 16:24:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47767704</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47767704</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47767704</guid></item><item><title><![CDATA[New comment by stratos123 in "They See Your Photos"]]></title><description><![CDATA[
<p>I'd be surprised if that was required. OpenAI's o3 was already professional-human-level at guessing location from a photo, so unless the companies intentionally stopped training on those datasets, modern models should be too.</p>
]]></description><pubDate>Tue, 14 Apr 2026 13:04:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47765153</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47765153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47765153</guid></item><item><title><![CDATA[New comment by stratos123 in "Doom, Played over Curl"]]></title><description><![CDATA[
<p>This idea has the rather severe problem that it requires piping an untrusted remote script directly into bash.</p>
]]></description><pubDate>Sun, 12 Apr 2026 22:57:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47745403</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47745403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47745403</guid></item><item><title><![CDATA[New comment by stratos123 in "Google removes "Doki Doki Literature Club" from Google Play"]]></title><description><![CDATA[
<p>Gore in shooters is culturally treated as much less "violent" than e.g. graphic scenes of suicide. You could make an argument that it <i>shouldn't</i> be, but it is.</p>
]]></description><pubDate>Sun, 12 Apr 2026 22:31:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47745235</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47745235</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47745235</guid></item><item><title><![CDATA[New comment by stratos123 in "High-Level Rust: Getting 80% of the Benefits with 20% of the Pain"]]></title><description><![CDATA[
<p>Woah, that's quite an issue. The equivalent code in Python doesn't typecheck, since `list` is invariant and hence `list[str]` doesn't fit `list[str|int]`. Unusual for TS to handle types worse than Python.</p>
]]></description><pubDate>Sun, 12 Apr 2026 15:27:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47740848</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47740848</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47740848</guid></item><item><title><![CDATA[New comment by stratos123 in "Installing every* Firefox extension"]]></title><description><![CDATA[
<p>My favorite part was the metal pipe sound effect. I wish the author had investigated which extension does that.</p>
]]></description><pubDate>Fri, 10 Apr 2026 23:39:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47725294</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47725294</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47725294</guid></item><item><title><![CDATA[New comment by stratos123 in "The effects of caffeine consumption do not decay with a ~5 hour half-life"]]></title><description><![CDATA[
<p>There are lots of valid critiques of HPMOR (I recently reread it and the early chapters are painfully obnoxious), but I think "no meaningful plot" and "doesn't do anything interesting" are objectively false. It has like a dozen interacting plotlines, and is massively different from any other HP fanfic and most media in general. It is popular for a reason. If you dropped it early, I'd encourage you to try again.</p>
]]></description><pubDate>Fri, 10 Apr 2026 17:32:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=47721271</link><dc:creator>stratos123</dc:creator><comments>https://news.ycombinator.com/item?id=47721271</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47721271</guid></item></channel></rss>