Hacker News: sophiabits

New comment by sophiabits in "Why AI hasn't replaced software engineers, and won't"

sophiabits — Thu, 11 Jun 2026 15:28:16 +0000

> You can't fire Claude if it fucks up

What's the difference between "firing" Claude and moving to a model from a different provider? The latter seems very analogous to firing an employee for performance and backfilling with someone new.

Re the rest, it's just not my experience that models become incapable of making good decisions in cases where input token count > the context window, but ymmv based on domain.

A very extreme example of this: a couple of years ago, when GPT-4 was state of the art and the 32K context variant was gated to design partners, I worked at an EdTech company in the college admissions space that wanted to produce quarterly reports on student progress for parents. That involved crunching a LOT of data: multiple hours of meeting transcripts per week, very detailed notes about student activities, and each student's general profile (UK and US admissions function very differently!).

It was a difficult problem, but we _did_ manage to produce these reports 4K output tokens at a time at a level of quality that exceeded what humans could do internally, and models and the surrounding tooling have only gotten better since then.

New comment by sophiabits in "Now AI agents need what RSS does"

sophiabits — Wed, 03 Jun 2026 00:59:15 +0000

TIL you can get RSS feeds for YouTube channels. Thanks for this!

New comment by sophiabits in "An Engineer's Post Protesting Laptop Surveillance Is Going Viral Inside Meta"

sophiabits — Fri, 15 May 2026 00:44:17 +0000

> Selfishly, I don't want my screen scraped because it feels like an invasion of my privacy, [...] But zooming out, I don't want to live in a world where humans—employees or otherwise—are exploited for their training data.

I wonder whether this Meta employee felt the same way about privacy a month ago, before they were personally impacted by this new initiative.

New comment by sophiabits in "Postmortem: TanStack NPM supply-chain compromise"

sophiabits — Tue, 12 May 2026 01:59:09 +0000

I do not envy the position the npm team are in. They removed the ability to unpublish packages as a response to the left-pad incident[1] because it wasn't desirable for individual developers to break downstream dependencies by pulling their package maliciously.

Of course the side effect is that now it's much harder to pull packages for legitimate reasons :/

[1] https://en.wikipedia.org/wiki/Npm_left-pad_incident

New comment by sophiabits in "Why TUIs are back"

sophiabits — Sun, 03 May 2026 21:48:49 +0000

A browser is an order of magnitude more expensive to run than a terminal app, regardless of what you're running inside said browser.

New comment by sophiabits in "Backpacks got worse on purpose"

sophiabits — Wed, 15 Apr 2026 16:28:54 +0000

Purchasing power is probably a better metric in a vacuum, but it's hard to analyze accurately.

For example, the comment you're citing claims that because minimum wage has increased only 3x over the same period in which inflation has eroded the relative value of a dollar by 6x, wages overall have increased at half the rate of inflation. But minimum wage is a measurement of a minimum, while inflation is a measurement of /average/ price increase, so they can't be compared 1:1 in this way.

The housing argument also seems odd. In New Zealand (where I'm from -- I'm not familiar with the US housing market, so the commenter could be right about that geo!) house prices have increased by far more than 20x since the 70s, but the houses available are of substantially higher quality due to improved regulations (e.g. all newer homes are subject to healthy homes rules which mandate insulation), so just comparing inflation-adjusted home prices vs income doesn't tell the full story.

(Aside from that, a whole heap of items like food, electronics, transportation are all both far cheaper AND higher quality today than in the 70s)

New comment by sophiabits in "When does MCP make sense vs CLI?"

sophiabits — Sun, 01 Mar 2026 19:27:56 +0000

> the MCP server is automatically launched when the Agent loads that skill

The main problem with this approach at the moment is it busts your prompt cache, because LLMs expect all tool definitions to be defined at the beginning of the context window. Input tokens are the main driver of inference costs and a lot of use cases aren't economical without prompt caching.

Hopefully in future LLMs are trained so you can add tool definitions anywhere in the context window. Lots of use cases benefit from this, e.g. in ecommerce there's really no point providing a "clear cart" tool to the LLM upfront, it'd be nice if you could dynamically provide it after item(s) are first added.
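To make the cache-busting point concrete, here is a minimal sketch under a simplifying assumption: a request is modelled as an ordered list of segments, and the cache can only reuse the longest prefix shared with the previous request. Real provider caches operate on tokens and differ in detail; `cachedPrefixLength` and the segment strings are hypothetical names made up for illustration.

```typescript
// Simplified model (an assumption, not any provider's actual cache):
// only the longest prefix shared with the previous request can be reused.
function cachedPrefixLength(previous: string[], next: string[]): number {
  let shared = 0;
  while (
    shared < previous.length &&
    shared < next.length &&
    previous[shared] === next[shared]
  ) {
    shared++;
  }
  return shared;
}

const system = "system:You are a shopping assistant";
const firstUserTurn = "user:Add a blue mug";
const laterTurns = ["assistant:Added.", "user:Actually, start over"];

// Case 1: "clear_cart" is introduced only after an item has been added.
// Tool definitions sit at the front, so the new tool lands mid-prefix.
const lazyBefore = ["tool:search_products", "tool:add_to_cart", system, firstUserTurn];
const lazyAfter = [
  "tool:search_products",
  "tool:add_to_cart",
  "tool:clear_cart",
  system,
  firstUserTurn,
  ...laterTurns,
];

// Case 2: every tool is declared upfront, so later requests only append.
const eagerBefore = [
  "tool:search_products",
  "tool:add_to_cart",
  "tool:clear_cart",
  system,
  firstUserTurn,
];
const eagerAfter = [...eagerBefore, ...laterTurns];

const lazyReuse = cachedPrefixLength(lazyBefore, lazyAfter); // 2 of 4 segments
const eagerReuse = cachedPrefixLength(eagerBefore, eagerAfter); // 5 of 5 segments
```

In this toy model the late-added tool throws away everything after the second segment, while the upfront version keeps the whole previous request reusable.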

New comment by sophiabits in "Claude Code 2.0"

sophiabits — Tue, 30 Sep 2025 05:57:13 +0000

Comments are often the best tool for explaining why a bit of code is formulated how it is, or explaining why a more obvious alternate implementation is a dead end.

An example of this: assume you live in a world where the formula for the circumference of a circle has not been derived. You end up deriving the formula yourself and write a function which returns 2 * pi * radius. This is as simple as it gets, not hacky at all, and you would /definitely/ want to include a comment explaining how you arrived at your weird and arbitrary-looking "3.1415" constant.
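As a rough illustration of that thought experiment, the function might look something like the sketch below. The constant name and the wording of the derivation note are hypothetical; the point is only that the code is trivial while the comment carries the "why".

```typescript
// Derivation note (hypothetical): measuring many circles showed that
// circumference / diameter always comes out at roughly 3.1415, regardless
// of the circle's size. That measured ratio is where this constant comes from.
const CIRCUMFERENCE_TO_DIAMETER_RATIO = 3.1415;

function circumference(radius: number): number {
  // diameter = 2 * radius, so circumference = ratio * diameter.
  return 2 * CIRCUMFERENCE_TO_DIAMETER_RATIO * radius;
}
```

Without the note, a reader has no way to tell whether 3.1415 is a measured value, a tuning parameter, or a typo.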

Agentic checkout in ~100 lines of Python

sophiabits — Fri, 04 Jul 2025 20:10:38 +0000

Article URL: https://sophiabits.com/blog/building-agentic-checkout

Comments URL: https://news.ycombinator.com/item?id=44467503

Points: 2

# Comments: 0

A deep dive into OpenAI's Structured Outputs

sophiabits — Wed, 07 Aug 2024 03:05:08 +0000

Article URL: https://sophiabits.com/blog/openai-structured-outputs-deep-dive

Comments URL: https://news.ycombinator.com/item?id=41177864

Points: 1

# Comments: 0

New comment by sophiabits in "Structured Outputs in the API"

sophiabits — Tue, 06 Aug 2024 22:31:00 +0000

I’ve especially noticed this with gpt-4o-mini [1], and it’s a big problem. My particular use case involves keeping a running summary of a conversation between a user and the LLM, and 4o-mini has a really bad tendency of inventing details in order to hit the desired summary word limit. I didn’t see this with 4o or earlier models

Fwiw my subjective experience has been that non-technical stakeholders tend to be more impressed with / agreeable to longer AI outputs, regardless of underlying quality. I have lost count of the number of times I’ve been asked to make outputs longer. Maybe this is just OpenAI responding to what users want?

[1] https://sophiabits.com/blog/new-llms-arent-always-better#exa...

New comment by sophiabits in "New LLMs aren't always better"

sophiabits — Mon, 05 Aug 2024 20:39:26 +0000

I wanted to document a particular genAI antipattern which I've seen a few times now.

LLMs are theoretically pretty fungible, because you send English and get English back--but in practice you still need to do some amount of technical due diligence before swapping models. These things are benchmarked on tasks which rarely resemble your specific use case. Blindly swap models at your own risk!

Something that has become very clear since the advent of GPT-3.5 is that LLMs are far from magic, and using them does not remove the need for good engineering fundamentals. It's important to have a solid eval suite so you can quickly benchmark your system against different LLMs, because the APIs we're all building on are constantly moving targets.
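A minimal sketch of what such an eval suite could look like, assuming a model can be treated as a function from prompt to output. The `Model` and `EvalCase` types, `runEvals`, and both stub models are hypothetical; a real harness would call provider APIs asynchronously and use many more cases. It is kept synchronous here so it runs without a network client.

```typescript
// Hypothetical interface: a model is anything that maps a prompt to text.
type Model = (prompt: string) => string;

interface EvalCase {
  name: string;
  prompt: string;
  // Returns true when the output is acceptable for this case.
  check: (output: string) => boolean;
}

// Runs every case against one model and returns the fraction that passed.
function runEvals(model: Model, cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  let passed = 0;
  for (const evalCase of cases) {
    if (evalCase.check(model(evalCase.prompt))) passed++;
  }
  return passed / cases.length;
}

const summaryPrompt = "Summarise: 'I want a refund.'";
const cases: EvalCase[] = [
  {
    name: "no invented details",
    prompt: summaryPrompt,
    check: (output) => !output.includes("dog"),
  },
  {
    name: "stays within 8 words",
    prompt: summaryPrompt,
    check: (output) => output.trim().split(/\s+/).length <= 8,
  },
];

// Stub models standing in for two different LLMs.
const conciseModel: Model = () => "User asked for a refund.";
const paddingModel: Model = () =>
  "User asked for a refund and mentioned their dog Rex twice.";

const conciseScore = runEvals(conciseModel, cases);
const paddingScore = runEvals(paddingModel, cases);
```

Swapping a model then becomes a matter of comparing scores on cases drawn from your own use case, rather than trusting a public benchmark.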

New LLMs aren't always better

sophiabits — Mon, 05 Aug 2024 20:39:17 +0000

Article URL: https://sophiabits.com/blog/new-llms-arent-always-better

Comments URL: https://news.ycombinator.com/item?id=41165374

Points: 1

# Comments: 1

New comment by sophiabits in "Anti-patterns in event-driven architecture"

sophiabits — Mon, 10 Jun 2024 20:01:56 +0000

Even with this setup in place you need a heightened level of caution relative to a monolith. In a monolith I can refactor function signatures however I desire because the whole service is an atomically deployed unit. Once you have two independently deployed components that goes out the window and you now need to be a lot more mindful when introducing breaking changes to an endpoint’s types

New comment by sophiabits in "Breaking up is hard to do: Chunking in RAG applications"

sophiabits — Sun, 09 Jun 2024 04:36:37 +0000

Not sure what the situation is like now, but we stopped using LangChain last year because the rate of change in the library was huge. Whenever we needed to upgrade for a new feature or bug fix we’d be ~20 versions behind and need to work through breaking changes. Eventually we decided that it was easier to just write everything ourselves.

This is from the first half of 2023 or so; maybe things are more stable now, but it looks like the Python implementation is still pre-v1.

New comment by sophiabits in "Garbage collect your technical debt (2021)"

sophiabits — Sun, 09 Jun 2024 04:28:50 +0000

The other possibility (which is common in startups) is that often the “right way” is different depending on the scale of the system you need to design for. In cases like this you end up with technical debt a year down the line, but at the time the feature was shipped the engineering decisions made were extremely reasonable.

I’ve seen a few colleagues jump to writing off all technical debt as being inherently bad, but in cases like this it’s a sign of success and something that’s largely impossible to avoid (the EV of building for 10-100x current scale is generally negative, factoring in the risk of the business going bust). There’s a kind of entropy at play here.

Big fan of tidying things up incrementally as you go [1], because it enables teams to at least mitigate this natural degradation over time

[1] https://sophiabits.com/blog/be-a-tidy-kiwi

New comment by sophiabits in "Garbage collect your technical debt (2021)"

sophiabits — Sun, 09 Jun 2024 04:20:09 +0000

Big fan of this. Here in New Zealand we have a slogan “be a tidy kiwi” that encourages people to pick up their litter and be good stewards of our natural environment.

Imo the same mentality is good to have in software, and I’ve always appreciated being in a team that makes codebase improvements alongside feature additions. It makes things a lot more pleasant

New comment by sophiabits in "What can LLMs never do?"

sophiabits — Sun, 28 Apr 2024 07:16:20 +0000

This already exists (in a slightly different prompt format); it's the underlying idea behind ReAct: https://react-lm.github.io

As you say, I'm skeptical this counts as AGI. Although I admit that I don't have a particularly rock solid definition of what _would_ constitute true AGI.

New comment by sophiabits in "TypeScript: Branded Types"

sophiabits — Thu, 25 Apr 2024 09:34:36 +0000

This sounds like an argument against TypeScript in general, no?

e.g. If I am parsing a string to a number via Number.parseInt, I don’t need a “: number” annotation because I can just call the variable “myNumber” and use that.

Branding a string is in many ways an extension of the idea of “branding” my “myNumber” variable as “: number” rather than leaving it as “: any”. Even if the TS type system is easy to bail out of, I still want the type annotations in the first place because they are useful regardless. I like reducing the number of things I need to think about and shoving responsibility off to my tools.

New comment by sophiabits in "TypeScript: Branded Types"

sophiabits — Thu, 25 Apr 2024 09:23:49 +0000

Happens a lot with junction tables ime. e.g. At my last job we had three tables: user, stream, user_stream. user_stream is an N:N junction between a user and a stream

A user is free to leave and rejoin a stream, and we want to retain old data. So each user_stream has columns id, user_id, stream_id (+ others)

Issues occur when people write code like the following:

  streamsService.search({
    withIds: userStreams.map((stream) => stream.id),
  });

The issue is easily noticed if you name the “stream” parameter “userStream” instead, but this particular footgun came up _all_ the time in code review; and it also occurred with other junction tables as well. Branded types on the various id fields completely solve this mistake at design time.
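A minimal sketch of how branded ids catch this, assuming a common intersection-based `Brand` helper. All names here (`Brand`, `StreamId`, `UserStreamId`, `searchStreams`) are hypothetical stand-ins and not the original codebase.

```typescript
// Hypothetical helper: tags a base type with a compile-time-only brand.
type Brand<T, B extends string> = T & { readonly __brand: B };

type StreamId = Brand<string, "StreamId">;
type UserStreamId = Brand<string, "UserStreamId">;

interface UserStream {
  id: UserStreamId; // id of the junction row itself
  userId: string;
  streamId: StreamId; // id of the stream the row points at
}

// Stand-in for streamsService.search: it only accepts stream ids.
function searchStreams(query: { withIds: StreamId[] }): StreamId[] {
  return query.withIds;
}

const userStreams: UserStream[] = [
  { id: "us_1" as UserStreamId, userId: "u_1", streamId: "s_9" as StreamId },
];

// The footgun no longer type-checks, because UserStreamId is not
// assignable to StreamId:
// searchStreams({ withIds: userStreams.map((userStream) => userStream.id) });

// The intended call compiles fine.
const found = searchStreams({
  withIds: userStreams.map((userStream) => userStream.streamId),
});
```

The brand exists only in the type system; at runtime the ids are still plain strings, so there is no serialization or storage cost.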