<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: versteegen</title><link>https://news.ycombinator.com/user?id=versteegen</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 10:27:29 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=versteegen" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by versteegen in "Shepherd's Dog: A Game by Fable"]]></title><description><![CDATA[
<p>I think you misunderstand what was meant by "toying with your own idea" here. I interpret it as daydreaming about it.</p>
]]></description><pubDate>Sat, 13 Jun 2026 14:21:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=48517642</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=48517642</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48517642</guid></item><item><title><![CDATA[New comment by versteegen in "Access to frontier AI will soon be limited by economic and security constraints"]]></title><description><![CDATA[
<p>I've also worked extensively on ARC-AGI 1/2, and I mainly agree. Marketing and training. Performance of LLMs on ARC is most importantly a function of training on grid/table-like data. It doesn't have to be specifically synthetic ARC data though. Training an LLM to be better at perceiving grid-like arrangements of data in a spatial way like an image, rather than just tabular, is hugely useful for things outside of ARC benchmarks, though it's a narrow skill. Hence, I'm <i>sure</i> they do it. I <i>want</i> them to do that. I <i>believe</i> the labs when they say they didn't train specifically for ARC-AGI 1/2 (where did Google say otherwise? I don't see it). But it <i>does not mean</i> the models are getting better at general purpose reasoning. They were already plenty good enough at that. You can describe ARC images in words and reason about them using a level of intelligence LLMs have had for years: they're designed to be easy! LLMs just couldn't reason about image-like grids very well.</p>
]]></description><pubDate>Fri, 15 May 2026 09:55:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=48146636</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=48146636</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48146636</guid></item><item><title><![CDATA[New comment by versteegen in "Bun's experimental Rust rewrite hits 99.8% test compatibility on Linux x64 glibc"]]></title><description><![CDATA[
<p>This explains a lot. But you merely need to look into the family of spice forks to realise, given the way that they're strangely limited to certain operating systems and embedded inside certain proprietary IDEs, that there's something very wrong with the code architecture.<p>So, that would be an awesome project!</p>
]]></description><pubDate>Sun, 10 May 2026 11:53:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=48083213</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=48083213</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48083213</guid></item><item><title><![CDATA[New comment by versteegen in "Amateur armed with ChatGPT solves an Erdős problem"]]></title><description><![CDATA[
<p>I agree <i>except</i>: this <i>is</i> creative work. Creativity can be and is being mechanised. True originality is extremely rare. Most novelty is the repurposing of one idea or concept elsewhere in a way we find surprising, but the choice to apply A to B could have been made for any reason including mechanical: very many inventions are accidents. In-depth knowledge / conceptual understanding of something is built on abstraction, and abstractions are portable.<p>If you had a list of N concepts and M ways to apply them you could try all N*M combinations, and get some very interesting results. For a real example, see the theory of inventive problem solving (TRIZ)'s amusing "40 principles of invention" by Soviet inventor Genrich Altshuller. <a href="https://en.wikipedia.org/wiki/TRIZ" rel="nofollow">https://en.wikipedia.org/wiki/TRIZ</a></p>
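<p>A toy sketch of that N*M enumeration. The concept and principle lists are made-up placeholders for illustration, not taken from TRIZ, and the function name is hypothetical:</p>

```python
from itertools import product

# Hypothetical example inputs; any two lists would do.
concepts = ["umbrella", "bicycle", "kettle"]
principles = ["make it foldable", "make it self-powered"]


def enumerate_combinations(concepts, principles):
    """Return every (concept, principle) pairing: N*M candidates in total."""
    return [f"{c}: {p}" for c, p in product(concepts, principles)]


candidates = enumerate_combinations(concepts, principles)
print(len(candidates))  # 3 * 2 = 6
for idea in candidates:
    print(idea)
```

<p>Most of the pairings will be useless; the point is only that generating them is purely mechanical, and the filtering is where the judgement goes.</p>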
]]></description><pubDate>Sun, 26 Apr 2026 09:07:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47908711</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47908711</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47908711</guid></item><item><title><![CDATA[New comment by versteegen in "SDL Now Supports DOS"]]></title><description><![CDATA[
<p>:) SCREEN 13 (VGA Mode 13h) is almost correct, but actually it originally used a 320x200 VGA Mode X assembly graphics library. I believe it used 320x200 instead of 320x240 to be compatible with earlier pure-QB code for SCREEN 13 reused in the engine. (Mode X isn't a single mode, it has some adjustable parameters.)</p>
]]></description><pubDate>Sat, 25 Apr 2026 12:09:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47900797</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47900797</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47900797</guid></item><item><title><![CDATA[New comment by versteegen in "SDL Now Supports DOS"]]></title><description><![CDATA[
<p>I'm going to find out. I've been meaning for years to port the OHRRPGCE back to DOS, where it came from.<p>I'm very surprised to see SDL3 re-gain DOS support, since they've aggressively dropped support for almost every port/OS they had in the SDL 1.2 days.</p>
]]></description><pubDate>Sat, 25 Apr 2026 04:19:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47898600</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47898600</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47898600</guid></item><item><title><![CDATA[New comment by versteegen in "DeepSeek v4"]]></title><description><![CDATA[
<p>Which model's best depends on how you use it. There's a huge difference in behaviour between Claude and GPT and other models which makes some poor substitutes for others in certain use cases. I think the GPT models are a bad substitute for Claude ones for tasks such as pair-programming (where you want to see the CoT and have immediate responses) and writing code that you actually want to read and edit yourself, as opposed to just letting GPT run in the background to produce working code that you won't inspect. Yes, GPT-5.4 is cheap and brilliant but very black-box and often very slow IME. GPT-5.4 still seems to behave the same as 5.1, which includes problems like: doesn't show useful thoughts, can think for half an hour, says "Preparing the patch now" then thinks for another 20 min, gives no impression of what it's doing, reads microscopic parts of source files and misses context, will do anything to pass the tests including patching libraries...</p>
]]></description><pubDate>Fri, 24 Apr 2026 05:17:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47885863</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47885863</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47885863</guid></item><item><title><![CDATA[New comment by versteegen in "GPT-5.5"]]></title><description><![CDATA[
<p>Interesting (would like to hear more), but solving a Rubik's Cube would appear to be a poor way to measure spatial understanding or reasoning. Ordinary human spatial intuition lets you think about how to move a tile to a certain location, but not really how to make consistent progress towards a solution; what's needed is knowledge of solution techniques. I'd say what you're measuring is 'perception' rather than reasoning.</p>
]]></description><pubDate>Fri, 24 Apr 2026 04:43:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47885633</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47885633</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47885633</guid></item><item><title><![CDATA[New comment by versteegen in "How to make a fast dynamic language interpreter"]]></title><description><![CDATA[
<p>You're correct. I neglected that; extension API compatibility is a big (the most important?) difference between PyPy and CPython's JIT. Amongst language features that affect optimisation potential, an extension API can be the worst.<p>Edit: I think what you're alluding to is that tracing JITs can overcome a lot of dynamic language features which make things hopeless for method JITs. Where LuaJIT really shines vs PyPy is outside of JITed loops. (Also memory and compile overheads). I realise this is a bit of a motte and bailey.</p>
]]></description><pubDate>Wed, 22 Apr 2026 03:40:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47858644</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47858644</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47858644</guid></item><item><title><![CDATA[New comment by versteegen in "Changes to GitHub Copilot individual plans"]]></title><description><![CDATA[
<p>The Anthropic Pro plan cost double and gave you, I don't know, a tenth the usage, depending on how efficiently you used Copilot requests, and no access to a large set of models including GPT and Gemini and free ones.</p>
]]></description><pubDate>Wed, 22 Apr 2026 00:24:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47856904</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47856904</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47856904</guid></item><item><title><![CDATA[New comment by versteegen in "Changes to GitHub Copilot individual plans"]]></title><description><![CDATA[
<p>Yes, GitHub's per-request pricing was insane; anyone suggesting using CC instead or asking if any other provider is as cheap just doesn't understand the insanity. They were clearly losing a lot of money on the people making good use of it.<p>I was actually hoping they would change it to something that more closely tracks their actual costs so that they wouldn't have to rug-pull this badly. In particular, what was really bad about it was that sending prompts to agents while they were working (to give them corrections) cost extra, so I stopped doing that (at first OpenCode didn't cause billing for that, until they became official).</p>
]]></description><pubDate>Wed, 22 Apr 2026 00:21:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47856861</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47856861</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47856861</guid></item><item><title><![CDATA[New comment by versteegen in "How to make a fast dynamic language interpreter"]]></title><description><![CDATA[
<p>Yes, language design is a hugely important determinant of interpreter or JIT speed. There are many highly optimised VMs for dynamic languages but LuaJIT is king because Lua is such a small and suitable language, and although it does have a couple of difficult-to-optimise features, they are few enough that you can expend the effort. It's nothing like Python. It's not much of an exaggeration to say Python is designed to minimise the possibility of a fast JIT, with compounding layers of dynamism. After years of work, the CPython 3.15 JIT finally managed ~5% faster than the stock interpreter on x86_64.</p>
]]></description><pubDate>Tue, 21 Apr 2026 09:01:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47846383</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47846383</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47846383</guid></item><item><title><![CDATA[New comment by versteegen in "The seven programming ur-languages (2022)"]]></title><description><![CDATA[
<p>Von Neumann may possibly have been the smartest man to ever live, but giving him credit for all of this is too much, brushing aside many other inventors (oft independent, to his credit).</p>
]]></description><pubDate>Mon, 20 Apr 2026 01:36:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=47829393</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47829393</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47829393</guid></item><item><title><![CDATA[New comment by versteegen in "Thoughts and feelings around Claude Design"]]></title><description><![CDATA[
<p>They're definitely not subsidizing API pricing; I can't believe how prevalent that fallacy is on HN of all places. The question is how profitable Claude Code is. Your example 2 is real and major but your example 1 is ridiculous: almost any new model from any company is better at the same price, and how is increasing the price an example of decreasing prices??<p>BTW, GitHub Copilot is pricing Opus 4.7 at 2.5x the cost of Opus 4.6 at <i>promotional pricing</i> (so maybe it'll be 4-5x). But GitHub's request-based pricing is insane, completely divorced from their actual costs (you can get 1M+ tokens for $0.10 if you give it a large request), so I'd assume they're losing a lot of money.</p>
]]></description><pubDate>Sun, 19 Apr 2026 01:59:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47821203</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47821203</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47821203</guid></item><item><title><![CDATA[New comment by versteegen in "Middle schooler finds coin from Troy in Berlin"]]></title><description><![CDATA[
<p>Even more detail in the DW article:<p>"""
"Fortunately, the boy was very precise and showed me exactly where he found it on a map. Then we went into our findings registration and found that this agricultural site was actually a well-known place," Henker explained.<p>Berlin's Museum for Pre- and Early History has been systematically conducting surveys on empty land in Berlin since the 1950s to determine where possible excavation sites might be.<p>In this particular spot, explains Henker, the upper layers of the soil were surveyed in the 1950s and 70s and again later. "Every time, they discovered a few distinct finds that made them say 'ok, there's probably more in the ground here'."<p>Over the years, fragments of ceramics, Slavonic-era knives and a bronze button have been unearthed on the site, as well as burnt human bones, leading researchers to conclude that this area was used as a burial ground dating as far back as the early Iron Age — and has been in use throughout the centuries.
"""<p><a href="https://www.dw.com/en/teen-discovers-first-ancient-greek-artifact-found-in-berlin/a-76833757" rel="nofollow">https://www.dw.com/en/teen-discovers-first-ancient-greek-art...</a></p>
]]></description><pubDate>Sat, 18 Apr 2026 15:32:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47816683</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47816683</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47816683</guid></item><item><title><![CDATA[New comment by versteegen in "Day 1 of ARC-AGI-3"]]></title><description><![CDATA[
<p>Where do you see that? I only skimmed the prompts but don't see any aspects of any of the games explained in there. There are a few hints which are legitimate prior knowledge about games in general, though some look too inflexible to me. Prior knowledge ("Core priors") is a critical requirement of the ARC series, read the reports.</p>
]]></description><pubDate>Fri, 27 Mar 2026 12:26:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47541876</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47541876</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47541876</guid></item><item><title><![CDATA[New comment by versteegen in "Day 1 of ARC-AGI-3"]]></title><description><![CDATA[
<p>The dataset miscomparison is a big problem. The prompt is super specific to ARC-AGI-3, which is perfectly fine to do, but skimming it I saw nothing that appears specific to the 25 games in the dataset. Especially considering they've only had one day for overfitting. Could be quite subtle leakage though.</p>
]]></description><pubDate>Fri, 27 Mar 2026 12:24:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47541852</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47541852</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47541852</guid></item><item><title><![CDATA[New comment by versteegen in "Day 1 of ARC-AGI-3"]]></title><description><![CDATA[
<p>...Their agent is called "Agentica ARC-AGI-3 agent for Opus 4.6 (120k) High".<p>Yes, it's unfair to compare results for the 25 (easier) public games against scores for the 55 semi-private games (scores for which are taken from <a href="https://arcprize.org/leaderboard">https://arcprize.org/leaderboard</a>).<p>But you're wrong to say that a custom harness invalidates the result. Yes, the official "ARC verified" scoreboard <i>for frontier LLMs</i> requires (<a href="https://arcprize.org/policy">https://arcprize.org/policy</a>):<p>> using extremely generic and minimal LLM testing prompts, no client-side "harnesses", no hand-crafted tools, and no tailored model configuration<p>but these are limitations placed in order to compare LLMs from frontier labs on equal footing, <i>not</i> limitations that apply to submissions in general. It's not as if a solution to ARC-AGI-3 must involve training a custom LLM! This Agentica harness is a completely legitimate approach to ARC-AGI-3, similar to J. Berman's for ARC-AGI-1/2, for example.</p>
]]></description><pubDate>Fri, 27 Mar 2026 12:03:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47541681</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47541681</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47541681</guid></item><item><title><![CDATA[New comment by versteegen in "ARC-AGI-3"]]></title><description><![CDATA[
<p>> An AI that can only perform at the average human level is useless unless it can be trained for the job like humans can.<p>Yes, if you want skilled labour. But that's not at all what ARC-AGI attempts to test for: it's testing for general intelligence as possessed by anyone without a mental incapacity.</p>
]]></description><pubDate>Thu, 26 Mar 2026 00:58:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=47525470</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47525470</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47525470</guid></item><item><title><![CDATA[New comment by versteegen in "No, it doesn't cost Anthropic $5k per Claude Code user"]]></title><description><![CDATA[
<p>> Aren't they losing money on the retail API pricing, too?<p>No, they aren't, and probably neither is anyone else offering API pricing. And Anthropic's API margins may be higher than anyone else's.<p>For example, DeepSeek released numbers showing that R1 was served at approximately "a cost profit margin of 545%" (meaning profit is 5.45x cost, so roughly 84.5% of revenue is profit), see my comment <a href="https://news.ycombinator.com/item?id=46663852">https://news.ycombinator.com/item?id=46663852</a></p>
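<p>A rough sketch of how a margin quoted on cost converts to a share of revenue. It assumes "cost profit margin" means profit divided by cost; the function name is made up for illustration:</p>

```python
def revenue_margin_from_cost_margin(cost_margin: float) -> float:
    """Convert profit/cost into profit/revenue.

    With cost normalised to 1, profit equals cost_margin and
    revenue equals 1 + cost_margin.
    """
    return cost_margin / (1.0 + cost_margin)


# 545% expressed as a ratio of profit to cost.
margin = revenue_margin_from_cost_margin(5.45)
print(f"{margin:.1%}")
```

<p>The same conversion explains why a markup far above 100% on cost can never reach a 100% share of revenue.</p>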
]]></description><pubDate>Tue, 10 Mar 2026 06:07:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47319564</link><dc:creator>versteegen</dc:creator><comments>https://news.ycombinator.com/item?id=47319564</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47319564</guid></item></channel></rss>