<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: spion</title><link>https://news.ycombinator.com/user?id=spion</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 03 May 2026 03:12:52 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=spion" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by spion in "If AI writes code, should the session be part of the commit?"]]></title><description><![CDATA[
<p>A summary of the session should be part of the commit message.</p>
]]></description><pubDate>Mon, 02 Mar 2026 02:45:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47213264</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=47213264</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47213264</guid></item><item><title><![CDATA[New comment by spion in "WebMCP is available for early preview"]]></title><description><![CDATA[
<p>Why aren't we using HATEOAS as a way to expose data and actions to agents?</p>
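<p>For anyone unfamiliar: a HATEOAS response embeds the available actions as links alongside the data, so a client (or an agent) can discover what it can do next without out-of-band documentation. A minimal sketch of what such a response might look like, with field names and URLs invented for illustration:</p><pre><code>// Hypothetical HATEOAS-style resource an agent could navigate;
// the "_links" block advertises the possible actions, not just the data.
use serde_json::json;

fn order_representation() -&gt; serde_json::Value {
    json!({
        "id": "order-1234",
        "status": "pending",
        "total": { "amount": "19.99", "currency": "EUR" },
        "_links": {
            "self":   { "href": "/orders/order-1234" },
            "cancel": { "href": "/orders/order-1234/cancel", "method": "POST" },
            "pay":    { "href": "/orders/order-1234/payments", "method": "POST" }
        }
    })
}</code></pre>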
]]></description><pubDate>Mon, 02 Mar 2026 02:43:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47213255</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=47213255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47213255</guid></item><item><title><![CDATA[New comment by spion in "Choosing learning over autopilot"]]></title><description><![CDATA[
<p>Cold take speculation: the architecture astronautics of the Java era probably destroyed a lot of the desire for better abstractions, for thinking over copy-pasting, and for minimalism and open standards.<p>Hot take speculation: we base a lot of our work on open-source software and libraries, but much of that software is cheaply made, or made for the needs of the company that happens to open-source it. The pull of these low-quality "standardized" open-source foundations is preventing further progress.</p>
]]></description><pubDate>Tue, 13 Jan 2026 22:20:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=46609136</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46609136</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46609136</guid></item><item><title><![CDATA[New comment by spion in "Choosing learning over autopilot"]]></title><description><![CDATA[
<p>Has anyone measured whether doing things with AI leads to any learning? One way would be to measure whether subsequent related tasks show improved time-to-functional-results, as a % improvement, both with and without AI. Two more crossover datapoints could be taken as well: with-AI followed by without-AI, and without-AI followed by with-AI.</p>
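<p>To make the four arms concrete, a minimal sketch of the metric (all numbers invented):</p><pre><code>// % improvement in time-to-functional-result from a first task to a
// subsequent related task; positive means the person got faster.
fn pct_improvement(first_min: f64, second_min: f64) -&gt; f64 {
    (first_min - second_min) / first_min * 100.0
}

fn main() {
    println!("AI -&gt; AI:       {:.1}%", pct_improvement(60.0, 50.0));
    println!("no-AI -&gt; no-AI: {:.1}%", pct_improvement(90.0, 60.0));
    // the crossover arms are where learning (or its absence) shows up:
    println!("AI -&gt; no-AI:    {:.1}%", pct_improvement(60.0, 85.0));
    println!("no-AI -&gt; AI:    {:.1}%", pct_improvement(90.0, 45.0));
}</code></pre>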
]]></description><pubDate>Tue, 13 Jan 2026 22:15:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=46609059</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46609059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46609059</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>Great article. It really advances the thinking on error handling. Rust already has a head start over most other languages with Result, expect and anyhow (well, color_eyre and tracing), but there was indeed a missing piece tying error-handling "actionability" together with "better than a stack trace" context for the programmer.<p>With regard to context for the programmer, I still think tracing and color_eyre (see <a href="https://docs.rs/color-eyre/latest/color_eyre/" rel="nofollow">https://docs.rs/color-eyre/latest/color_eyre/</a>) ultimately form a good-enough pair for service-style applications, with tracing providing the missing additional context. But it's nice to see a simpler approach to actionability.</p>
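<p>A minimal sketch of that pairing, assuming tracing-error's ErrorLayer is the piece that wires span context into the reports (crate wiring per the color_eyre docs; the file path is invented):</p><pre><code>use color_eyre::eyre::{Result, WrapErr};
use tracing::instrument;
use tracing_error::ErrorLayer;
use tracing_subscriber::prelude::*;

#[instrument] // the span (function name + args) becomes error context
fn load_config(path: &amp;str) -&gt; Result&lt;String&gt; {
    std::fs::read_to_string(path)
        .wrap_err_with(|| format!("failed to read config at {path}"))
}

fn main() -&gt; Result&lt;()&gt; {
    // ErrorLayer captures the active span trace when an error occurs
    tracing_subscriber::registry()
        .with(ErrorLayer::default())
        .init();
    color_eyre::install()?; // pretty, actionable reports
    let _cfg = load_config("app.toml")?;
    Ok(())
}</code></pre>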
]]></description><pubDate>Mon, 05 Jan 2026 01:15:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494278</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494278</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>IMO you need both things: culture to make it happen, and technology to make it easy and reasonable-looking. Rust lacks the former to some degree; Go lacks the latter to some degree (see e.g. kustomize error formatting, where everything ends up on a single line).</p>
]]></description><pubDate>Mon, 05 Jan 2026 00:48:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494059</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494059</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>I don't think there is anything in Go (the language) that helps achieve this - it's mostly cultural (the Go creators and community are very outspoken about handling errors).<p>In fact, the easiest thing to do in Go is to ignore the error; the next easiest is to early-return the same error with no additional context.<p>Technically speaking, Rust has far better tools for adding context to errors. See for example <a href="https://docs.rs/color-eyre/latest/color_eyre/" rel="nofollow">https://docs.rs/color-eyre/latest/color_eyre/</a><p>It does expect you to use `wrap_err` to get the benefits, though. Still, that's easier than what Go requires for good contextual errors, and the gap widens further if you also want the Go version's output to look reasonable.</p>
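<p>A sketch of the `wrap_err` pattern, with hypothetical paths and route names; each layer becomes one entry in the rendered error chain:</p><pre><code>use color_eyre::eyre::{Result, WrapErr};

fn read_user(id: u64) -&gt; Result&lt;String&gt; {
    std::fs::read_to_string(format!("/var/lib/app/users/{id}.json"))
        .wrap_err_with(|| format!("failed to load user {id}"))
}

fn handle_request() -&gt; Result&lt;String&gt; {
    // the report reads as a causal chain rather than a bare errno,
    // roughly: "while handling GET /profile" -&gt; "failed to load user 42"
    // -&gt; "No such file or directory (os error 2)"
    read_user(42).wrap_err("while handling GET /profile")
}</code></pre>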
]]></description><pubDate>Mon, 05 Jan 2026 00:45:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494047</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494047</guid></item><item><title><![CDATA[New comment by spion in "Advent of Code 2025: Number of puzzles reduce from 25 to 12 for the first time"]]></title><description><![CDATA[
<p>I wonder if it would've felt more natural if the "part 2s" of the puzzles had become separate days instead. (Still 12 days' worth of puzzles, but spread across 24 days, with maybe one extra, smaller, easier puzzle on the last day to relax.)</p>
]]></description><pubDate>Sun, 26 Oct 2025 16:02:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45712913</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45712913</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45712913</guid></item><item><title><![CDATA[New comment by spion in "Shai-Hulud malware attack: Tinycolor and over 40 NPM packages compromised"]]></title><description><![CDATA[
<p>pnpm just added minimum age for dependencies <a href="https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-dependency-updates" rel="nofollow">https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-...</a></p>
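<p>If I'm reading the release notes right, the new setting is minimumReleaseAge, specified in minutes and configured alongside the other pnpm settings (double-check the linked post for the exact spelling and location), something like:</p><pre><code># pnpm-workspace.yaml -- key name per the pnpm 10.16 release notes;
# value is in minutes (here: roughly one week)
minimumReleaseAge: 10080</code></pre>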
]]></description><pubDate>Tue, 16 Sep 2025 20:38:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45267697</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45267697</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45267697</guid></item><item><title><![CDATA[New comment by spion in "AI coding"]]></title><description><![CDATA[
<p>I don't think that's contrary to the article's claim: the current tools are so bad and tedious to use for repetitive work that AI is helpful with a huge amount of it.</p>
]]></description><pubDate>Sat, 13 Sep 2025 13:01:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45231751</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45231751</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45231751</guid></item><item><title><![CDATA[New comment by spion in "Anthropic agrees to pay $1.5B to settle lawsuit with book authors"]]></title><description><![CDATA[
<p>It's not settled whether AI training is fair use.</p>
]]></description><pubDate>Sat, 06 Sep 2025 18:12:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45151562</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45151562</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45151562</guid></item><item><title><![CDATA[New comment by spion in "We put a coding agent in a while loop"]]></title><description><![CDATA[
<p>No, it still doesn't work. But the only way to realise that is to actually try using it.</p>
]]></description><pubDate>Mon, 25 Aug 2025 15:35:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45015005</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45015005</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45015005</guid></item><item><title><![CDATA[New comment by spion in "We put a coding agent in a while loop"]]></title><description><![CDATA[
<p>Try actually doing it, realise how far the outcome is, the vast majority of the time, from what the blog posts describe, and get dread from the state of (social) media instead.</p>
]]></description><pubDate>Mon, 25 Aug 2025 11:16:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=45012593</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45012593</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45012593</guid></item><item><title><![CDATA[New comment by spion in "AI is a floor raiser, not a ceiling raiser"]]></title><description><![CDATA[
<p>I think agents have a curve: they're kinda bad at bootstrapping a project, very good in a small-to-medium-sized existing project, and from there it slowly goes downhill as size increases.<p>Something about a brand-new project often makes LLMs drop to "example grade" code, the kind you'd never put in production. (An example: Claude implemented per-task file logging in my prototype project by pushing to an array of log lines, serializing the entire thing to JSON, and rewriting the entire file, for every logged event.)</p>
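<p>For contrast, the boring thing it should have reached for, sketched out: append one serialized line per event instead of rewriting the file, so each logged event costs O(1) rather than O(everything logged so far):</p><pre><code>use std::fs::OpenOptions;
use std::io::Write;

// Append-only, line-delimited log: one small write per event,
// no re-serialization of the whole history.
fn log_event(path: &amp;str, event: &amp;str) -&gt; std::io::Result&lt;()&gt; {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{event}")
}</code></pre>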
]]></description><pubDate>Thu, 31 Jul 2025 19:43:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44749328</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44749328</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44749328</guid></item><item><title><![CDATA[New comment by spion in "Use Your Type System"]]></title><description><![CDATA[
<p>There are a few languages where this is not too tedious (although other things tend to be more tedious than needed in those).<p>The main problem with these is how you actually get the verification you need when data comes in from outside the system. Check with the database every time you want to turn a string/UUID into an ID type? That can get prohibitively expensive.</p>
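<p>A sketch of the boundary problem, with invented names: the cheap structural parse is easy, but proving the id actually exists still means a round trip somewhere:</p><pre><code>// Newtype: a UserId is not interchangeable with any other UUID or string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct UserId(uuid::Uuid);

impl UserId {
    // Cheap: is this even a well-formed UUID?
    fn parse(raw: &amp;str) -&gt; Result&lt;UserId, uuid::Error&gt; {
        uuid::Uuid::parse_str(raw).map(UserId)
    }
}

// Expensive: does this id actually EXIST? A database round trip per
// conversion is what gets prohibitive; batching or caching is the usual
// escape hatch. (Db, VerifiedUserId, NotFound are hypothetical.)
// fn verify(id: &amp;UserId, db: &amp;Db) -&gt; Result&lt;VerifiedUserId, NotFound&gt; { ... }</code></pre>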
]]></description><pubDate>Fri, 25 Jul 2025 11:27:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=44681971</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44681971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44681971</guid></item><item><title><![CDATA[New comment by spion in "Use Your Type System"]]></title><description><![CDATA[
<p>The OP is the author of grugbrain.dev</p>
]]></description><pubDate>Fri, 25 Jul 2025 11:23:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=44681946</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44681946</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44681946</guid></item><item><title><![CDATA[New comment by spion in "Builder.ai did not "fake AI with 700 engineers""]]></title><description><![CDATA[
<p>Indeed. Which is why I think the only way to really evaluate the progress of LLMs is to curate your own personal set of example failures that you don't share with anyone else and only use it via APIs that provide some sort of no-data-retention and no-training guarantees.</p>
]]></description><pubDate>Thu, 12 Jun 2025 23:11:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44264208</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44264208</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44264208</guid></item><item><title><![CDATA[New comment by spion in "Builder.ai did not "fake AI with 700 engineers""]]></title><description><![CDATA[
<p>What you think is an absurd question may not be as absurd as it seems, given the trillions of tokens of data on the internet, including its darkest corners.<p>In my experience, it's better to simply try using LLMs in areas where they don't have a lot of training data (e.g. reasoning about the behaviour of terraform plans). It's not a hard cutoff of being _only_ able to reason about already-solved things, but as a first approximation it's not too far off.<p>The researchers took existing known problems and parameterised their difficulty [1]. While most of these are by no means easy for humans, the interesting observation to me was that the failure point N was not proportional to the complexity of the problem, but correlated more with how commonly solution "printouts" for that size of the problem are encountered in the training data. For example, "towers of hanoi", which has printouts of solutions for a variety of sizes, went up to a very large number of steps N, while the river crossing, which is almost entirely absent from the training data for N larger than 3, failed above pretty much that exact number.<p>[1]: <a href="https://machinelearning.apple.com/research/illusion-of-thinking" rel="nofollow">https://machinelearning.apple.com/research/illusion-of-think...</a></p>
]]></description><pubDate>Thu, 12 Jun 2025 21:48:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44263553</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44263553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44263553</guid></item><item><title><![CDATA[New comment by spion in "Human coders are still better than LLMs"]]></title><description><![CDATA[
<p>It's hard to say. Historically, new discoveries in AI have often generated great excitement and high expectations, followed by some progress, then stalling, disillusionment, and an AI winter. Maybe this time it will be different. Either way, what has been achieved so far is already a huge deal.</p>
]]></description><pubDate>Thu, 29 May 2025 20:18:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44129913</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44129913</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44129913</guid></item><item><title><![CDATA[New comment by spion in "Human coders are still better than LLMs"]]></title><description><![CDATA[
<p>Vibe-wise, it seems like progress is slowing down and recent models aren't substantially better than their predecessors. But it would be interesting to take a well-trusted benchmark and plot max_performance_until_date for each month. (Too bad aider changed recently and there aren't many older models: <a href="https://aider.chat/docs/leaderboards/by-release-date.html" rel="nofollow">https://aider.chat/docs/leaderboards/by-release-date.html</a> hasn't been updated in a while with newer models, and the new benchmark doesn't include the classic ones such as 3.5, 3.5 Turbo, 4, or Claude 3 Opus.)</p>
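<p>A sketch of that plot's data prep, with months and scores to be filled in from whichever benchmark you trust:</p><pre><code>// Given (release month, benchmark score) pairs, compute the best score
// available as of each month -- a monotone "frontier" line.
fn max_performance_until_date(mut results: Vec&lt;(&amp;str, f64)&gt;) -&gt; Vec&lt;(String, f64)&gt; {
    results.sort_by(|a, b| a.0.cmp(b.0)); // "YYYY-MM" sorts lexicographically
    let mut best = f64::NEG_INFINITY;
    results
        .into_iter()
        .map(|(month, score)| {
            best = best.max(score);
            (month.to_string(), best)
        })
        .collect()
}</code></pre>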
]]></description><pubDate>Thu, 29 May 2025 18:36:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44128878</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44128878</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44128878</guid></item></channel></rss>