<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: krackers</title><link>https://news.ycombinator.com/user?id=krackers</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 21 Apr 2026 08:47:00 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=krackers" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by krackers in "Mechanisms of introspective awareness in LLMs [pdf]"]]></title><description><![CDATA[
<p>Author's summary of the paper at <a href="https://x.com/uzaymacar/status/2044091229407748556#m" rel="nofollow">https://x.com/uzaymacar/status/2044091229407748556#m</a> fwiw</p>
]]></description><pubDate>Tue, 21 Apr 2026 06:38:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47845344</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47845344</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47845344</guid></item><item><title><![CDATA[New comment by krackers in "Prove you are a robot: CAPTCHAs for agents"]]></title><description><![CDATA[
<p>>I cannot find it there so this must be a false memory<p>It's also infamously the subject of a Von Neumann joke<p>>Two bicyclists start twenty miles apart and head toward each other, each going at a steady rate of 10 m.p.h. At the same time a fly that travels at a steady 15 m.p.h. starts from the front wheel of the southbound bicycle and flies to the front wheel of the northbound one, then turns around and flies to the front wheel of the southbound one again, and continues in this manner till he is crushed between the two front wheels. Question: what total distance did the fly cover? The slow way to find the answer is to calculate what distance the fly covers on the first, northbound, leg of the trip, then on the second, southbound, leg, then on the third, etc., etc., and, finally, to sum the infinite series so obtained. The quick way is to observe that the bicycles meet exactly one hour after their start, so that the fly had just an hour for his travels; the answer must therefore be 15 miles. When the question was put to von Neumann, he solved it in an instant, and thereby disappointed the questioner: "Oh, you must have heard the trick before!" "What trick?" asked von Neumann; "all I did was sum the infinite series."</p>
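The quoted answer can be sanity-checked by actually summing the series, using the numbers from the joke (a quick sketch, not part of the original comment):

```python
def fly_distance(gap=20.0, bike_mph=10.0, fly_mph=15.0, legs=60):
    """Sum the fly's back-and-forth legs explicitly (the 'slow way').

    gap is the distance between the two bikes when the fly turns around;
    each leg shrinks it geometrically, so 60 legs is far past convergence.
    """
    total = 0.0
    for _ in range(legs):
        t = gap / (fly_mph + bike_mph)  # time until the fly meets the oncoming wheel
        total += fly_mph * t            # distance flown on this leg
        gap -= 2 * bike_mph * t         # both bikes closed in during the leg
    return total

print(fly_distance())  # converges to the quick answer: 15 mph for 1 hour = 15 miles
```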
]]></description><pubDate>Tue, 21 Apr 2026 05:55:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47845082</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47845082</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47845082</guid></item><item><title><![CDATA[New comment by krackers in "A simplified model of Fil-C"]]></title><description><![CDATA[
<p>I don't think Fil-C is a drop-in C replacement; there are things you can do in C, such as certain kinds of pointer abuse, that Fil-C prohibits (e.g. casting a pointer to an int, then back). It's probably easier to port an existing C project to Fil-C than to rewrite it entirely, though.</p>
]]></description><pubDate>Mon, 20 Apr 2026 23:51:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=47842747</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47842747</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47842747</guid></item><item><title><![CDATA[New comment by krackers in "John Ternus to become Apple CEO"]]></title><description><![CDATA[
<p>There's some irony there in that the whole Maps fiasco led to the firing of Forstall, which allowed Ive to become head of design, which basically led to the current state of macOS design.<p>I do wish that someday someone will tell the story of what happened during that time. Maps was bad at launch, yes, but it also wouldn't get better without people contributing more data, and the fact that it took a decade to slowly improve implies there's nothing anyone could have done to get it right "off the bat". It still feels to me that Forstall was set up as the fall guy, especially considering no one was fired for antennagate.</p>
]]></description><pubDate>Mon, 20 Apr 2026 21:40:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=47841253</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47841253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47841253</guid></item><item><title><![CDATA[New comment by krackers in "Mechanisms of introspective awareness in LLMs [pdf]"]]></title><description><![CDATA[
<p>Hm interesting, this seems to examine Anthropic's prior work.<p>If I understood the paper right, during post-training (e.g. DPO) models learn the correct "shape" of responses. But unlike SFT they're also penalized for going off-manifold, so this incentivizes the development of circuits that can detect off-manifold responses (you can see this clearly with RLVR perhaps, where models have a "but wait" reflex to steer themselves back in the correct direction) [^1]. Since part of the training is to be the archetypical chatbot assistant though, when combined with anti-jailbreak training this usually gets linked into "refusal" circuits.<p>One hypothesis might be that the question itself is leading. I.e. models will by default respond "no" to "are there any injected thoughts", just as they would to "are you conscious" or "do you have feelings", because of RLHF that triggers refusal behavior. Then injection provides a strong enough signal that it ends up "scrambling" this pathway, _suppressing_ the normal refusal behavior and allowing them to report the injection. (Describing the contents of the injected vector is trivial either way; as the paper notes, the detection is the important part.)<p>The interesting thing is that ablating away refusals doesn't actually change the false positive rate, so the above hypothesis of injections overriding a default refusal doesn't fit. Instead, there really does seem to be a separate "evidence carrier" detector sensitive to off-manifold responses, which just so happens to get wired into the "refusal circuits" but, when "unwired" via ablation, allows the model to report injections.<p>I guess what's not clear to me though is whether this is really detecting _injection_ itself. Wouldn't the same circuits be triggered by any anomalous context?
It shouldn't be any surprise that models can detect anomalies in input tokens (after all, LLMs were designed to model text), so I don't see why anomalies in the residual stream would be any different (it's not like a layer cares whether the "bread" embedding was injected externally or came through from the input token).<p>In theory the case of "anomalous input context" versus "anomalous residual via external injection" _can_ be distinguished though, because there would be a sort of "discontinuity" in the residual stream as you pass through layers; since the hidden state at token i, depth n feeds into that of token i+1, depth n+1, you could in theory build a computational graph that detects such tampering.<p>I think the paper sort of indirectly tested this in section 3.2 "SPECIFICITY TO THE ASSISTANT PERSONA"<p>>In contrast, the two nonstandard roles (Alice-Bob, story framing) induce confabulation. Thus, introspection is not exclusive to responding as the assistant character, although reliability decreases outside standard roles.<p>Which does seem to imply that as soon as you step out of distribution to things like roleplay that RLHF specifically penalized, the anomaly detectors start firing as well.<p>[^1] I think this is also related to how RLHF/DPO are sequence-level optimizations, with a notion of credit assignment. Optimizing in this way results in the model having a notion of whether the current position in the rollout is "good" or not.</p>
]]></description><pubDate>Mon, 20 Apr 2026 07:03:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47831183</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47831183</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47831183</guid></item><item><title><![CDATA[New comment by krackers in "Introspective Diffusion Language Models"]]></title><description><![CDATA[
<p>Masked language modeling has been loosely compared to text diffusion [1], so the paper's title claim may hold in some sense, even if it's misleading.<p>[1] <a href="https://nathan.rs/posts/roberta-diffusion/" rel="nofollow">https://nathan.rs/posts/roberta-diffusion/</a></p>
]]></description><pubDate>Mon, 20 Apr 2026 06:05:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47830902</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47830902</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47830902</guid></item><item><title><![CDATA[New comment by krackers in "What are skiplists good for?"]]></title><description><![CDATA[
<p>Maybe for very high-level intuition: it's vaguely similar to other randomized algorithms where you want to minimize the worst case in expectation, and the easiest way to do that is to introduce randomness (think quicksort, which is O(N^2) in the worst case with a badly chosen pivot). Your idea of there being an optimal distance is similar to the concept of "derandomization", maybe; e.g. in quicksort there are deterministic pivot-selection algorithms that avoid the worst case. But all of those require much more effort to compute and produce output that is a function of the input data, whereas randomly picking a pivot or randomly creating express lanes is simpler and avoids the data dependency (which is important since, unlike sorting, the data isn't fixed ahead of time).</p>
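To make the quicksort analogy concrete, a minimal sketch (mine, not from the comment above) of a randomized-pivot quicksort; no fixed input ordering is consistently bad for it, which is the same property the skiplist's coin flips buy:

```python
import random

def quicksort(xs):
    """Quicksort with a uniformly random pivot. Any fixed pivot rule
    (e.g. "always take the first element") has some adversarial input
    that forces O(n^2) behavior; a random pivot makes every input
    O(n log n) in expectation, with no dependence on the data."""
    if len(xs) <= 1:
        return list(xs)
    pivot = random.choice(xs)          # randomness replaces clever pivot selection
    left = [x for x in xs if x < pivot]
    mid = [x for x in xs if x == pivot]
    right = [x for x in xs if x > pivot]
    return quicksort(left) + mid + quicksort(right)

# Already-sorted input: the classic worst case for a first-element pivot,
# but just an average case here.
print(quicksort(list(range(10))))
```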
]]></description><pubDate>Mon, 20 Apr 2026 04:40:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=47830462</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47830462</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47830462</guid></item><item><title><![CDATA[New comment by krackers in "Anonymous request-token comparisons from Opus 4.6 and Opus 4.7"]]></title><description><![CDATA[
<p>If I have a conversation with Claude then come back 30 minutes later to resume the conversation, the KV values for that prefill prefix are going to be exactly the same. That's the whole point of this caching in the first place.<p>If you're willing to incur a latency penalty on a "cold resume" (which is fine for most use-cases), why couldn't they just move it to disk? The size of the KV cache should scale on the order of (context_length * n_layers * residual_length). I think for a standard V3-style MoE model at 1M token context, this should be on the order of 100 GB at FP16? And you can surely play tricks with KV compression (e.g. the recent TurboQuant paper). It doesn't seem like an outrageous amount of data to put onto cheap scratch HDD (and it doesn't grow indefinitely, since really old conversations can be discarded).</p>
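For a rough sense of scale, the arithmetic behind that estimate, as a sketch: the dimensions below are illustrative placeholders (not Claude's or any real model's config), and a standard per-head KV layout is assumed, ignoring MLA-style compression:

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """Uncompressed KV cache size: a K and a V tensor per layer, each of
    shape [context_len, n_kv_heads, head_dim], at the given precision
    (2 bytes/element = FP16)."""
    return 2 * n_layers * context_len * n_kv_heads * head_dim * bytes_per_elem

# Placeholder dimensions for a large GQA model at a 1M-token context:
size = kv_cache_bytes(context_len=1_000_000, n_layers=60,
                      n_kv_heads=8, head_dim=128, bytes_per_elem=2)
print(f"{size / 2**30:.0f} GiB per 1M-token conversation")
```

With these made-up dimensions the result lands in the low hundreds of GiB, i.e. the same order of magnitude as the comment's ~100 GB figure; KV-head count and quantization move it by small constant factors.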
]]></description><pubDate>Sun, 19 Apr 2026 03:02:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47821479</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47821479</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47821479</guid></item><item><title><![CDATA[New comment by krackers in "Anonymous request-token comparisons from Opus 4.6 and Opus 4.7"]]></title><description><![CDATA[
<p>>pay for reinitializing the cache<p>Why can't they save the kv cache to disk then later reload it to memory?</p>
]]></description><pubDate>Sat, 18 Apr 2026 22:03:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=47819914</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47819914</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47819914</guid></item><item><title><![CDATA[New comment by krackers in "America Lost the Mandate of Heaven"]]></title><description><![CDATA[
<p>>manufacturing alone isn’t going to grow their economy any further.<p>But why does the economy need to grow? If you can manufacture everything you need, and you have access to the raw resources, what else do you need as a country? In what sense is growing your economy with VC scams like Juicero better than actually having industrial output?</p>
]]></description><pubDate>Sat, 18 Apr 2026 20:53:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47819449</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47819449</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47819449</guid></item><item><title><![CDATA[New comment by krackers in "Average is all you need"]]></title><description><![CDATA[
<p>The harder part is understanding the nature of the data you're working with. There's always some catch ("oh that field `foo` was never backfilled, so for queries before 2020 you have to recompute it by joining with legacyBar instead")</p>
]]></description><pubDate>Sat, 18 Apr 2026 00:42:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47812157</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47812157</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47812157</guid></item><item><title><![CDATA[New comment by krackers in "Slop Cop"]]></title><description><![CDATA[
<p>Isn't this called the tricolon? Ironically the names of the patterns all seem AI generated.</p>
]]></description><pubDate>Sat, 18 Apr 2026 00:38:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47812133</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47812133</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47812133</guid></item><item><title><![CDATA[New comment by krackers in "Google Gemma 4 Runs Natively on iPhone with Full Offline AI Inference"]]></title><description><![CDATA[
<p>More info on the specific design choices needed to run models here [1]. I mean, it is possible, given that Apple themselves did it in [2], but it's also not as general-purpose or flexible as a GPU.<p>[1] <a href="https://news.ycombinator.com/item?id=43881692">https://news.ycombinator.com/item?id=43881692</a>
[2] <a href="https://machinelearning.apple.com/research/neural-engine-transformers" rel="nofollow">https://machinelearning.apple.com/research/neural-engine-tra...</a></p>
]]></description><pubDate>Fri, 17 Apr 2026 23:44:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47811820</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47811820</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47811820</guid></item><item><title><![CDATA[New comment by krackers in "Codex for almost everything"]]></title><description><![CDATA[
<p>Neat, good to know! And it does seem my mental model of the event loop was broken. Accessibility-related interactions don't have any associated NSEvent.<p>They are handled as part of the "conceptual" run loop, but they seem to be dispatched internally by the AXRuntime library from a callback off some mach port. And because of this, the call to nextEventMatchingEventMask in the main -[NSApplication run] loop never even sees any such NSEvent.<p><pre><code>    -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]  (in AppKit)
        _DPSNextEvent  (in AppKit)
          _BlockUntilNextEventMatchingListInModeWithFilter  (in HIToolbox)
            ReceiveNextEventCommon  (in HIToolbox)
              RunCurrentEventLoopInMode  (in HIToolbox)
                CFRunLoopRunSpecific  (in CoreFoundation)
                  __CFRunLoopRun  (in CoreFoundation)
                    __CFRunLoopDoSource1  (in CoreFoundation)
                      __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__  (in CoreFoundation)
                        mshMIGPerform  (in HIServices)
                          _XPerformAction  (in HIServices)
                            _AXXMIGPerformAction  (in HIServices)

</code></pre>
In some sense this is sort of similar to Apple events, which are also "hidden" from the caller of nextEventMatchingEventMask. From what I can see, those are handled by DPSNextEvent, which sorts based on the raw Carbon EventRef: aevt types have `AEProcessAppleEvent` called on them and the event is consumed silently, while others get converted to a CGEvent and returned to the caller for it to handle. But of course accessibility events didn't exist in classic Mac OS, so they can't be handled at that layer and were pushed further down instead. You can almost see the historical legacy here.<p>[1] <a href="https://www.cocoawithlove.com/2009/01/demystifying-nsapplication-by.html" rel="nofollow">https://www.cocoawithlove.com/2009/01/demystifying-nsapplica...</a></p>
]]></description><pubDate>Fri, 17 Apr 2026 23:02:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47811515</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47811515</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47811515</guid></item><item><title><![CDATA[New comment by krackers in "Codex for almost everything"]]></title><description><![CDATA[
<p>Could you elaborate on what you mean? My understanding of the Cocoa event loop was that ultimately everything is received as an NSEvent at the application layer (maybe that's wrong, though).<p>Do you mean that you can just call AXUIElementPerformAction once you have a reference to the element, and the OS will internally synthesize the right type of event, even if the app isn't in the foreground?</p>
]]></description><pubDate>Fri, 17 Apr 2026 18:19:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=47808941</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47808941</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47808941</guid></item><item><title><![CDATA[New comment by krackers in "Codex for almost everything"]]></title><description><![CDATA[
<p>Which specific ones though allow you to send input to a window without raising it? People have been trying to do "focus follows mouse [without auto raise]" for a long time on the Mac, and the synthetic event equivalent to command+click is the only discovered method I'm aware of, e.g. used in <a href="https://github.com/sbmpost/AutoRaise" rel="nofollow">https://github.com/sbmpost/AutoRaise</a><p>There is also this old blog post by Yegge [1], which mentions `AXUIElementPostKeyboardEvent`, but there were plenty of bugs with that, and I haven't seen anyone else build on it. I guess the modern equivalent is `CGEventPostToPSN`/`CGEventPostToPid`. It seems like a good candidate though; perhaps the Sky team they acquired knows the right private APIs to use to get this working.<p>Edit: The thread at [2] also has some interesting tidbits, such as Automator.app having "Watch Me Do" which can also do this, and a CLI tool that claims to use the CGEventPostToPid API [3]. Maybe there are more ways to do it than I realized.<p>[1] <a href="https://steve-yegge.blogspot.com/2008/04/settling-osx-focus-follows-mouse-debate.html" rel="nofollow">https://steve-yegge.blogspot.com/2008/04/settling-osx-focus-...</a>
[2] <a href="https://www.macscripter.net/t/keystroke-to-background-app-as-vs-automator/77570" rel="nofollow">https://www.macscripter.net/t/keystroke-to-background-app-as...</a>
[3] <a href="https://github.com/socsieng/sendkeys" rel="nofollow">https://github.com/socsieng/sendkeys</a></p>
]]></description><pubDate>Thu, 16 Apr 2026 20:56:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=47799403</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47799403</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47799403</guid></item><item><title><![CDATA[New comment by krackers in "Codex for almost everything"]]></title><description><![CDATA[
<p>>background computer use<p>How does that even work technically? macOS doesn't support multiple cursors. On native Cocoa apps you can pass input to a window without raising it via command+click, so possibly they synthesized those events, but fewer and fewer apps support that these days. And AppleScript is basically dead, so they can't be using that either.<p>I also read they acquired the Sky team (who I think were former Apple employees). No wonder they were able to pull off something so slick.</p>
]]></description><pubDate>Thu, 16 Apr 2026 20:32:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47799128</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47799128</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47799128</guid></item><item><title><![CDATA[New comment by krackers in "Tax Wrapped 2025"]]></title><description><![CDATA[
<p>>See what the federal government spent with your tax dollars.<p>Is thinking of it in this sense actually accurate? I always assumed that since every government has embraced MMT, they can spend whatever they want simply by printing it out of thin air. Taxation could then be understood as the crude knob used to "destroy money"; it also has the effect of forcing USD to be the primary national currency (e.g. owning bitcoin won't do you any good if you ultimately need to pay taxes in USD).</p>
]]></description><pubDate>Tue, 14 Apr 2026 01:39:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47760240</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47760240</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47760240</guid></item><item><title><![CDATA[New comment by krackers in "The economics of software teams: Why most engineering orgs are flying blind"]]></title><description><![CDATA[
<p>If good writing was easy then "LLM slop writing" wouldn't be a thing.</p>
]]></description><pubDate>Mon, 13 Apr 2026 23:39:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47759378</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47759378</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47759378</guid></item><item><title><![CDATA[New comment by krackers in "Google has the same AI adoption curve as John Deere"]]></title><description><![CDATA[
<p>> just cancelled IntelliJ for a thousand engineers<p>IntelliJ can't cost more than the AI provider subscriptions, and it will actually handle large refactors without breaking your codebase.</p>
]]></description><pubDate>Mon, 13 Apr 2026 20:24:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47757366</link><dc:creator>krackers</dc:creator><comments>https://news.ycombinator.com/item?id=47757366</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47757366</guid></item></channel></rss>