<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: azakai</title><link>https://news.ycombinator.com/user?id=azakai</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Thu, 21 May 2026 02:04:57 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=azakai" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by azakai in "An OpenAI model has disproved a central conjecture in discrete geometry"]]></title><description><![CDATA[
<p>There was a lot new in calculus, but it also didn't come out of nowhere.<p>That Newton and Leibniz came up with similar ideas in parallel, independently, around the same time (what are the odds?), supports that.<p><a href="https://en.wikipedia.org/wiki/Leibniz%E2%80%93Newton_calculus_controversy" rel="nofollow">https://en.wikipedia.org/wiki/Leibniz%E2%80%93Newton_calculu...</a></p>
]]></description><pubDate>Wed, 20 May 2026 22:42:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48215312</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=48215312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48215312</guid></item><item><title><![CDATA[New comment by azakai in "Natural Language Autoencoders: Turning Claude's Thoughts into Text"]]></title><description><![CDATA[
<p>Thanks! I missed that part before.</p>
]]></description><pubDate>Fri, 08 May 2026 00:36:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=48056977</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=48056977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48056977</guid></item><item><title><![CDATA[New comment by azakai in "Natural Language Autoencoders: Turning Claude's Thoughts into Text"]]></title><description><![CDATA[
<p>I had the same question. I think that could be answered by <i>using</i> the predicted activation, but I don't see that in the paper.<p>That is, rather than just translate activation to text, then text to activation, that final activation could then be applied to the neural network, and it would be allowed to continue running from there.<p>If it kept running in a similar way, that would show that the predicted activation is close enough to the original one. Which would add some confidence here.<p>But a lot better would be to then do experiments with <i>altered</i> text. That is, if the text said "this is true" and it was changed to "this is false", and that intervention led to the final output implying it was false, that would be very interesting.<p>This seems obvious but I don't see it mentioned as a future direction there, so maybe there is an obvious reason it can't work.</p>
]]></description><pubDate>Thu, 07 May 2026 23:28:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48056460</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=48056460</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48056460</guid></item><item><title><![CDATA[New comment by azakai in "He asked AI to count carbs 27000 times. It couldn't give the same answer twice"]]></title><description><![CDATA[
<p>The hardware can also add nondeterminism. GPUs reorder operations, leading to different results.<p>Vendors might also be running A/B testing or who knows what, even when you ask for a temperature of 0.<p>But, if you run a fixed model with temperature 0 on your local CPU, it will be deterministic (unless there are bugs).</p>
]]></description><pubDate>Wed, 29 Apr 2026 17:54:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47951903</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47951903</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47951903</guid></item><item><title><![CDATA[New comment by azakai in "He asked AI to count carbs 27000 times. It couldn't give the same answer twice"]]></title><description><![CDATA[
<p>A carb counting app might use API calls to these frontier models and then do some kind of analysis. It could see if different models agree or not, or multiple calls, and with how much variance.<p>So it would be more accurate to test the apps rather than the APIs, unless the goal is to warn people who just open ChatGPT and ask there.</p>
]]></description><pubDate>Wed, 29 Apr 2026 15:55:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47950178</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47950178</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47950178</guid></item><item><title><![CDATA[New comment by azakai in "Talkie: a 13B vintage language model from 1930"]]></title><description><![CDATA[
<p>fwiw, asking the model directly, "who is the ruler of England at present?" returns "Queen Victoria is the reigning sovereign of England."</p>
]]></description><pubDate>Tue, 28 Apr 2026 02:36:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47929916</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47929916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47929916</guid></item><item><title><![CDATA[New comment by azakai in "Graphs that explain the state of AI in 2026"]]></title><description><![CDATA[
<p>Another way to put it: if training a model cost 72,000 tons of carbon, and it then gets used by 100 million people (typical of major models), the cost per person is 0.00072 tons.<p>Per the article, the average human uses over 5 tons per year (Americans: 18). Adding 0.00072 to 5 is not really noticeable.<p>(There is also the cost of inference, of course.)</p>
]]></description><pubDate>Sat, 18 Apr 2026 23:51:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47820563</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47820563</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47820563</guid></item><item><title><![CDATA[New comment by azakai in "The case for zero-error horizons in trustworthy LLMs"]]></title><description><![CDATA[
<p>It is academically interesting what pure neural networks can do, of course. But when someone goes to Claude and tries to do something, they don't care if it solves the problem using a neural network or a call out to Python. So long as the result is right.<p>More generally, the ability to use tools is a form of intelligence, just like when humans and crows do it. Being able to craft the right Python script and use the result is non-trivial.</p>
]]></description><pubDate>Fri, 03 Apr 2026 14:56:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=47627377</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47627377</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47627377</guid></item><item><title><![CDATA[New comment by azakai in "The case for zero-error horizons in trustworthy LLMs"]]></title><description><![CDATA[
<p>An "elaborate harness" that can break down a problem into sub-tasks, write Python scripts for the ones it can't solve itself, and then combine the results, seems able to solve a wide range of cognitive tasks?<p>At least in theory.</p>
]]></description><pubDate>Thu, 02 Apr 2026 19:16:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47618911</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47618911</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47618911</guid></item><item><title><![CDATA[New comment by azakai in "Even GPT-5.2 Can't Count to Five: Zero-Error Horizons in Trustworthy LLMs"]]></title><description><![CDATA[
<p>You are trying it on a production model. The paper is using models with tool calls disabled.</p>
]]></description><pubDate>Thu, 02 Apr 2026 18:29:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47618306</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47618306</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47618306</guid></item><item><title><![CDATA[New comment by azakai in "Even GPT-5.2 Can't Count to Five: Zero-Error Horizons in Trustworthy LLMs"]]></title><description><![CDATA[
<p>It has "outsourced" it to another component, sure, but does that matter?<p>What the user sees is the total behavior of the entire system, not whether the system has internal divisions and separations.</p>
]]></description><pubDate>Thu, 02 Apr 2026 18:27:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=47618266</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47618266</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47618266</guid></item><item><title><![CDATA[New comment by azakai in "The case for zero-error horizons in trustworthy LLMs"]]></title><description><![CDATA[
<p>I do think this is a tool issue. Here is what the article says:<p>> For the multiplication task, note that agents that make external calls to a calculator tool may have ZEH = ∞. While ZEH = ∞ does have meaning, in this paper we primarily evaluate the LLM itself without external tool calls<p>The models can count to infinity if you give them access to tools. The production models do this.<p>Not that the paper is wrong, it is still interesting to measure the core neural network of a model. But modern models use tools.</p>
]]></description><pubDate>Thu, 02 Apr 2026 18:25:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47618242</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47618242</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47618242</guid></item><item><title><![CDATA[New comment by azakai in "Epoch confirms GPT5.4 Pro solved a frontier math open problem"]]></title><description><![CDATA[
<p>There are many examples of current limitations, but do you see a reason to think they are fundamental limitations? (I'm not saying they aren't, I'm curious what the evidence is for that.)</p>
]]></description><pubDate>Wed, 25 Mar 2026 15:55:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47519098</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47519098</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47519098</guid></item><item><title><![CDATA[New comment by azakai in "We rewrote our Rust WASM parser in TypeScript and it got faster"]]></title><description><![CDATA[
<p>You're generally right - rewrites let you improve the code - but they do have an actual reason the new language was better: avoiding copies on the boundary.<p>They say they measured that cost, and it was most of the runtime in the old version (though they don't give exact numbers). That cost does not exist at all in the new version, simply because of the language.</p>
]]></description><pubDate>Fri, 20 Mar 2026 23:49:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47462413</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47462413</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47462413</guid></item><item><title><![CDATA[New comment by azakai in "We rewrote our Rust WASM parser in TypeScript and it got faster"]]></title><description><![CDATA[
<p>O(N²) -> O(N) was 3.3x faster, but before that, eliminating the boundary (replacing wasm with JS) led to speedups of 2.2x, 4.6x, 3.0x (see one table back).<p>It looks like neither is the "real win". Both the language and the algorithm made a big difference, as you can see in the first column in the last table - moving from wasm to JS was a big speedup, and improving the algorithm on top of that was another big speedup.</p>
]]></description><pubDate>Fri, 20 Mar 2026 23:46:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47462388</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47462388</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47462388</guid></item><item><title><![CDATA[New comment by azakai in "Making WebAssembly a first-class language on the Web"]]></title><description><![CDATA[
<p>> Figma is one site. There are also a handful of other sites that use wasm. But most of the web does not use wasm.<p>Most of the web also doesn't use the Video element, but it isn't 'a dud' either.<p>Video and wasm are critical for a small subset of the web. That subset includes YouTube and Netflix for Video, and Figma and Photoshop and Unity games for wasm.</p>
]]></description><pubDate>Thu, 12 Mar 2026 03:18:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47346007</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47346007</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47346007</guid></item><item><title><![CDATA[New comment by azakai in "Making WebAssembly a first-class language on the Web"]]></title><description><![CDATA[
<p>Sorry, I'll ask my question in a better way: what applications written in wasm that exist today would benefit from this?<p>Now, maybe there aren't many because of performance - maybe they haven't used wasm because it was too slow. But I would appreciate seeing data on that - an application that tried wasm and gave up after seeing the overhead, at the least. But I would also expect to see apps that use wasm even despite some DOM overhead, because of the speedup on non-DOM code - and I'd like to see data on how much DOM overhead they are currently suffering.<p>I am asking because I'm familiar with a lot of apps ported to wasm, and they don't do this. That may just be because I am seeing one particular slice of the ecosystem! So I am very curious to learn about other parts.</p>
]]></description><pubDate>Thu, 12 Mar 2026 03:08:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47345916</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47345916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47345916</guid></item><item><title><![CDATA[New comment by azakai in "Making WebAssembly a first-class language on the Web"]]></title><description><![CDATA[
<p>> DOM performance is a big deal and bottleneck for a lot of applications<p>What are examples of such applications? Honest question - I'm curious to learn more about issues such applications have in production.<p>> But we really shouldn't be requiring everyone to become an expert to benefit from wasm.<p>If the toolchain does it for them, they don't need to be experts, no more than people need to be DWARF experts to debug native applications.<p>I agree tools could be a lot better here! But as I think you know, my position is that we can move faster and get better results on the tools side.</p>
]]></description><pubDate>Wed, 11 Mar 2026 23:52:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47344249</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47344249</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47344249</guid></item><item><title><![CDATA[New comment by azakai in "Is legal the same as legitimate: AI reimplementation and the erosion of copyleft"]]></title><description><![CDATA[
<p>> LLMs do not encode nor encrypt their training data. The fact they can recite training data is a defect not a default.<p>About this specific point, it is unclear how much of a defect memorization actually is - there are also reasons to see it as necessary for effective learning. This link explains it well:<p><a href="https://infinitefaculty.substack.com/p/memorization-vs-generalization-in" rel="nofollow">https://infinitefaculty.substack.com/p/memorization-vs-gener...</a></p>
]]></description><pubDate>Mon, 09 Mar 2026 21:55:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=47316138</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47316138</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47316138</guid></item><item><title><![CDATA[New comment by azakai in "Notes on writing Rust-based Wasm"]]></title><description><![CDATA[
<p>> There is no solution without tradeoffs here, but the only reason JS-glue-code is winning out is because the complexity is moved from browsers to each language or framework that wants to work with wasm<p>Correct, but this has been one of wasm's guiding principles since the start: move complexity from browsers to toolchains.<p>Wasm is simple to optimize in browsers, far simpler than JavaScript. It does require a lot more toolchain work! But that avoids browser exploits.<p>This is the reason we don't support the wasm text format in browsers, or wasm-ld, or wasm-opt. All those things would make toolchains easier to develop.<p>You are right that this sometimes causes duplicate effort among toolchains, each one needing to do the same thing, and that is annoying. But we could also share that effort, and we already do in things like LLVM, wasm-ld, wasm-opt, etc.<p>Maybe we could share the effort of making JS bindings as well. In fact there is a JS polyfill for the component model, which does exactly that.</p>
]]></description><pubDate>Sun, 08 Mar 2026 23:25:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47302742</link><dc:creator>azakai</dc:creator><comments>https://news.ycombinator.com/item?id=47302742</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47302742</guid></item></channel></rss>