<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: spion</title><link>https://news.ycombinator.com/user?id=spion</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 03 May 2026 03:12:52 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=spion" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by spion in "If AI writes code, should the session be part of the commit?"]]></title><description><![CDATA[
<p>A summary of the session should be part of the commit message.</p>
]]></description><pubDate>Mon, 02 Mar 2026 02:45:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=47213264</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=47213264</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47213264</guid></item><item><title><![CDATA[New comment by spion in "WebMCP is available for early preview"]]></title><description><![CDATA[
<p>Why aren't we using HATEOAS as a way to expose data and actions to agents?</p>
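<p>For anyone unfamiliar: a HATEOAS response embeds the available actions as links alongside the data, so a client (or an agent) can discover what it can do next without out-of-band documentation. A minimal sketch of what such a response might look like, with field names and URLs invented for illustration:</p><pre><code>// Hypothetical HATEOAS-style resource an agent could navigate;
// the "_links" block advertises the possible actions, not just the data.
use serde_json::json;

fn order_representation() -&gt; serde_json::Value {
    json!({
        "id": "order-1234",
        "status": "pending",
        "total": { "amount": "19.99", "currency": "EUR" },
        "_links": {
            "self":   { "href": "/orders/order-1234" },
            "cancel": { "href": "/orders/order-1234/cancel", "method": "POST" },
            "pay":    { "href": "/orders/order-1234/payments", "method": "POST" }
        }
    })
}</code></pre>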
]]></description><pubDate>Mon, 02 Mar 2026 02:43:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=47213255</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=47213255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47213255</guid></item><item><title><![CDATA[New comment by spion in "Choosing learning over autopilot"]]></title><description><![CDATA[
<p>Cold take speculation: the architecture astronautics of the Java era probably destroyed a lot of the desire for better abstractions, for thinking over copy-pasting, and for minimalism and open standards.<p>Hot take speculation: we base a lot of our work on open-source software and libraries, but much of that software is cheaply made, or made for the needs of the company that happens to open-source it. The pull of these low-quality "standardized" open-source foundations is preventing further progress.</p>
]]></description><pubDate>Tue, 13 Jan 2026 22:20:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=46609136</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46609136</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46609136</guid></item><item><title><![CDATA[New comment by spion in "Choosing learning over autopilot"]]></title><description><![CDATA[
<p>Has anyone measured whether doing things with AI leads to any learning? One way would be to measure whether subsequent related tasks show improved time-to-functional-results, as a % improvement, both with and without AI. Two more crossover datapoints could be taken as well: with-AI followed by without-AI, and without-AI followed by with-AI.</p>
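<p>To make the four arms concrete, a minimal sketch of the metric (all numbers invented):</p><pre><code>// % improvement in time-to-functional-result from a first task to a
// subsequent related task; positive means the person got faster.
fn pct_improvement(first_min: f64, second_min: f64) -&gt; f64 {
    (first_min - second_min) / first_min * 100.0
}

fn main() {
    println!("AI -&gt; AI:       {:.1}%", pct_improvement(60.0, 50.0));
    println!("no-AI -&gt; no-AI: {:.1}%", pct_improvement(90.0, 60.0));
    // the crossover arms are where learning (or its absence) shows up:
    println!("AI -&gt; no-AI:    {:.1}%", pct_improvement(60.0, 85.0));
    println!("no-AI -&gt; AI:    {:.1}%", pct_improvement(90.0, 45.0));
}</code></pre>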
]]></description><pubDate>Tue, 13 Jan 2026 22:15:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=46609059</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46609059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46609059</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>Great article. It really advances the thinking on error handling. Rust already has a head start over most other languages with Result, expect and anyhow (well, color_eyre and tracing), but there was indeed a missing piece tying error-handling "actionability" together with "better than a stack trace" context for the programmer.<p>With regard to context for the programmer, I still think tracing and color_eyre (see <a href="https://docs.rs/color-eyre/latest/color_eyre/" rel="nofollow">https://docs.rs/color-eyre/latest/color_eyre/</a>) ultimately form a good-enough pair for service-style applications, with tracing providing the missing additional context. But it's nice to see a simpler approach to actionability.</p>
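<p>A minimal sketch of that pairing, assuming tracing-error's ErrorLayer is the piece that wires span context into the reports (crate wiring per the color_eyre docs; the file path is invented):</p><pre><code>use color_eyre::eyre::{Result, WrapErr};
use tracing::instrument;
use tracing_error::ErrorLayer;
use tracing_subscriber::prelude::*;

#[instrument] // the span (function name + args) becomes error context
fn load_config(path: &amp;str) -&gt; Result&lt;String&gt; {
    std::fs::read_to_string(path)
        .wrap_err_with(|| format!("failed to read config at {path}"))
}

fn main() -&gt; Result&lt;()&gt; {
    // ErrorLayer captures the active span trace when an error occurs
    tracing_subscriber::registry()
        .with(ErrorLayer::default())
        .init();
    color_eyre::install()?; // pretty, actionable reports
    let _cfg = load_config("app.toml")?;
    Ok(())
}</code></pre>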
]]></description><pubDate>Mon, 05 Jan 2026 01:15:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494278</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494278</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>IMO you need both things: culture to make it happen, and technology to make it easy and reasonable-looking. Rust lacks the former to some degree; Go lacks the latter to some degree (see e.g. kustomize error formatting, where everything ends up on a single line).</p>
]]></description><pubDate>Mon, 05 Jan 2026 00:48:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494059</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494059</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494059</guid></item><item><title><![CDATA[New comment by spion in "Stop Forwarding Errors, Start Designing Them"]]></title><description><![CDATA[
<p>I don't think there is anything in Go (the language) that helps achieve this - it's mostly cultural (the Go creators and community are very outspoken about handling errors).<p>In fact, the easiest thing to do in Go is to ignore the error; the next easiest is to early-return the same error with no additional context.<p>Technically speaking, Rust has far better tools for adding context to errors. See for example <a href="https://docs.rs/color-eyre/latest/color_eyre/" rel="nofollow">https://docs.rs/color-eyre/latest/color_eyre/</a><p>It does expect you to use `wrap_err` to get the benefits, though. Still, that's easier than what Go requires for good contextual errors, and the gap widens further if you also want the Go version's output to look reasonable.</p>
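<p>A sketch of the `wrap_err` pattern, with hypothetical paths and route names; each layer becomes one entry in the rendered error chain:</p><pre><code>use color_eyre::eyre::{Result, WrapErr};

fn read_user(id: u64) -&gt; Result&lt;String&gt; {
    std::fs::read_to_string(format!("/var/lib/app/users/{id}.json"))
        .wrap_err_with(|| format!("failed to load user {id}"))
}

fn handle_request() -&gt; Result&lt;String&gt; {
    // the report reads as a causal chain rather than a bare errno,
    // roughly: "while handling GET /profile" -&gt; "failed to load user 42"
    // -&gt; "No such file or directory (os error 2)"
    read_user(42).wrap_err("while handling GET /profile")
}</code></pre>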
]]></description><pubDate>Mon, 05 Jan 2026 00:45:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46494047</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=46494047</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46494047</guid></item><item><title><![CDATA[New comment by spion in "Advent of Code 2025: Number of puzzles reduce from 25 to 12 for the first time"]]></title><description><![CDATA[
<p>I wonder if it would've felt more natural if the "part 2s" of the puzzles had become separate days instead. (Still 12 days' worth of puzzles, but spread across 24 days, with maybe one extra, smaller, easier puzzle on the last day to relax.)</p>
]]></description><pubDate>Sun, 26 Oct 2025 16:02:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=45712913</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45712913</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45712913</guid></item><item><title><![CDATA[New comment by spion in "Shai-Hulud malware attack: Tinycolor and over 40 NPM packages compromised"]]></title><description><![CDATA[
<p>pnpm just added minimum age for dependencies <a href="https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-dependency-updates" rel="nofollow">https://pnpm.io/blog/releases/10.16#new-setting-for-delayed-...</a></p>
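<p>If I'm reading the release notes right, the new setting is minimumReleaseAge, specified in minutes and configured alongside the other pnpm settings (double-check the linked post for the exact spelling and location), something like:</p><pre><code># pnpm-workspace.yaml -- key name per the pnpm 10.16 release notes;
# value is in minutes (here: roughly one week)
minimumReleaseAge: 10080</code></pre>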
]]></description><pubDate>Tue, 16 Sep 2025 20:38:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45267697</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45267697</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45267697</guid></item><item><title><![CDATA[New comment by spion in "AI coding"]]></title><description><![CDATA[
<p>I don't think that's contrary to the article's claim: the current tools are so bad and tedious to use for repetitive work that AI is helpful with a huge amount of it.</p>
]]></description><pubDate>Sat, 13 Sep 2025 13:01:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=45231751</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45231751</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45231751</guid></item><item><title><![CDATA[New comment by spion in "Anthropic agrees to pay $1.5B to settle lawsuit with book authors"]]></title><description><![CDATA[
<p>It's not settled whether AI training is fair use.</p>
]]></description><pubDate>Sat, 06 Sep 2025 18:12:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45151562</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45151562</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45151562</guid></item><item><title><![CDATA[New comment by spion in "We put a coding agent in a while loop"]]></title><description><![CDATA[
<p>No, it still doesn't work. But the only way to realise that is to actually try using it.</p>
]]></description><pubDate>Mon, 25 Aug 2025 15:35:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45015005</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45015005</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45015005</guid></item><item><title><![CDATA[New comment by spion in "We put a coding agent in a while loop"]]></title><description><![CDATA[
<p>Try actually doing it, realise how far the outcome is, the vast majority of the time, from what the blog posts describe, and get dread from the state of (social) media instead.</p>
]]></description><pubDate>Mon, 25 Aug 2025 11:16:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=45012593</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=45012593</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45012593</guid></item><item><title><![CDATA[New comment by spion in "AI is a floor raiser, not a ceiling raiser"]]></title><description><![CDATA[
<p>I think agents have a curve: they're kinda bad at bootstrapping a project, very good in a small-to-medium-sized existing project, and from there it slowly goes downhill as size increases.<p>Something about a brand-new project often makes LLMs drop to "example grade" code, the kind you'd never put in production. (An example: Claude implemented per-task file logging in my prototype project by pushing to an array of log lines, serializing the entire thing to JSON, and rewriting the entire file, for every logged event.)</p>
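<p>For contrast, the boring thing it should have reached for, sketched out: append one serialized line per event instead of rewriting the file, so each logged event costs O(1) rather than O(everything logged so far):</p><pre><code>use std::fs::OpenOptions;
use std::io::Write;

// Append-only, line-delimited log: one small write per event,
// no re-serialization of the whole history.
fn log_event(path: &amp;str, event: &amp;str) -&gt; std::io::Result&lt;()&gt; {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{event}")
}</code></pre>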
]]></description><pubDate>Thu, 31 Jul 2025 19:43:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44749328</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44749328</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44749328</guid></item><item><title><![CDATA[New comment by spion in "Use Your Type System"]]></title><description><![CDATA[
<p>There are a few languages where this is not too tedious (although other things tend to be more tedious than needed in those).<p>The main problem with these is how you actually get the verification you need when data comes in from outside the system. Check with the database every time you want to turn a string/UUID into an ID type? That can get prohibitively expensive.</p>
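<p>A sketch of the boundary problem, with invented names: the cheap structural parse is easy, but proving the id actually exists still means a round trip somewhere:</p><pre><code>// Newtype: a UserId is not interchangeable with any other UUID or string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct UserId(uuid::Uuid);

impl UserId {
    // Cheap: is this even a well-formed UUID?
    fn parse(raw: &amp;str) -&gt; Result&lt;UserId, uuid::Error&gt; {
        uuid::Uuid::parse_str(raw).map(UserId)
    }
}

// Expensive: does this id actually EXIST? A database round trip per
// conversion is what gets prohibitive; batching or caching is the usual
// escape hatch. (Db, VerifiedUserId, NotFound are hypothetical.)
// fn verify(id: &amp;UserId, db: &amp;Db) -&gt; Result&lt;VerifiedUserId, NotFound&gt; { ... }</code></pre>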
]]></description><pubDate>Fri, 25 Jul 2025 11:27:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=44681971</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44681971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44681971</guid></item><item><title><![CDATA[New comment by spion in "Use Your Type System"]]></title><description><![CDATA[
<p>The OP is the author of grugbrain.dev</p>
]]></description><pubDate>Fri, 25 Jul 2025 11:23:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=44681946</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44681946</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44681946</guid></item><item><title><![CDATA[New comment by spion in "Builder.ai did not "fake AI with 700 engineers""]]></title><description><![CDATA[
<p>Indeed. Which is why I think the only way to really evaluate the progress of LLMs is to curate your own personal set of example failures that you don't share with anyone else and only use it via APIs that provide some sort of no-data-retention and no-training guarantees.</p>
]]></description><pubDate>Thu, 12 Jun 2025 23:11:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44264208</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44264208</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44264208</guid></item><item><title><![CDATA[New comment by spion in "Builder.ai did not "fake AI with 700 engineers""]]></title><description><![CDATA[
<p>What you think is an absurd question may not be as absurd as it seems, given the trillions of tokens of data on the internet, including its darkest corners.<p>In my experience, it's better to simply try using LLMs in areas where they don't have a lot of training data (e.g. reasoning about the behaviour of terraform plans). It's not a hard cutoff of being _only_ able to reason about already-solved things, but as a first approximation it's not too far off.<p>The researchers took existing known problems and parameterised their difficulty [1]. While most of these are by no means easy for humans, the interesting observation to me was that the failure point N was not proportional to the complexity of the problem, but correlated more with how commonly solution "printouts" for that size of the problem are encountered in the training data. For example, "towers of hanoi", which has printouts of solutions for a variety of sizes, went up to a very large number of steps N, while the river crossing, which is almost entirely absent from the training data for N larger than 3, failed above pretty much that exact number.<p>[1]: <a href="https://machinelearning.apple.com/research/illusion-of-thinking" rel="nofollow">https://machinelearning.apple.com/research/illusion-of-think...</a></p>
]]></description><pubDate>Thu, 12 Jun 2025 21:48:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=44263553</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44263553</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44263553</guid></item><item><title><![CDATA[New comment by spion in "Human coders are still better than LLMs"]]></title><description><![CDATA[
<p>It's hard to say. Historically, new discoveries in AI have often generated great excitement and high expectations, followed by some progress, then stalling, disillusionment, and an AI winter. Maybe this time it will be different. Either way, what has been achieved so far is already a huge deal.</p>
]]></description><pubDate>Thu, 29 May 2025 20:18:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44129913</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44129913</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44129913</guid></item><item><title><![CDATA[New comment by spion in "Human coders are still better than LLMs"]]></title><description><![CDATA[
<p>Vibe-wise, it seems like progress is slowing down and recent models aren't substantially better than their predecessors. But it would be interesting to take a well-trusted benchmark and plot max_performance_until_date for each month. (Too bad aider changed recently and there aren't many older models: <a href="https://aider.chat/docs/leaderboards/by-release-date.html" rel="nofollow">https://aider.chat/docs/leaderboards/by-release-date.html</a> hasn't been updated in a while with newer models, and the new benchmark doesn't include the classic ones such as 3.5, 3.5 Turbo, 4, or Claude 3 Opus.)</p>
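<p>A sketch of that plot's data prep, with months and scores to be filled in from whichever benchmark you trust:</p><pre><code>// Given (release month, benchmark score) pairs, compute the best score
// available as of each month -- a monotone "frontier" line.
fn max_performance_until_date(mut results: Vec&lt;(&amp;str, f64)&gt;) -&gt; Vec&lt;(String, f64)&gt; {
    results.sort_by(|a, b| a.0.cmp(b.0)); // "YYYY-MM" sorts lexicographically
    let mut best = f64::NEG_INFINITY;
    results
        .into_iter()
        .map(|(month, score)| {
            best = best.max(score);
            (month.to_string(), best)
        })
        .collect()
}</code></pre>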
]]></description><pubDate>Thu, 29 May 2025 18:36:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=44128878</link><dc:creator>spion</dc:creator><comments>https://news.ycombinator.com/item?id=44128878</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44128878</guid></item></channel></rss>