Hacker News: abathur

New comment by abathur in "Appearing productive in the workplace"

abathur — Wed, 06 May 2026 19:55:18 +0000

I guess, but have you actually encountered a teacher grading an assignment solely based on word count?

I certainly wish more teachers encouraged parsimony and penalized fluff and bullshittery, but I'd be surprised to find them doing it outside of some narrow cases where the point is just to make you write something at all.

Tthey generally want to encourage their students to engage with the topic at a certain level and practice the thinking needed to research, structure, and implement an argument of a certain length. They want you to put at least 5 pounds of idea in the 5-10 pound idea bag.

If you're convinced you've hacked word economy and satisfied the assignment except for this goshdarnpeskyminimumwordcount, you're probably misunderstanding the lesson the instructor is willing to read through a bunch of bad writing to impart and cheating yourself.

New comment by abathur in "CSS as a Query Language"

abathur — Sat, 25 Apr 2026 00:45:48 +0000

I feel like we need a name for css-the-syntax (and maybe -the-semantics) as separate from css-the-body-of-rules/functions/units/etc-defined-by-csswg.

There's juice in it, but it's hard to talk about and survey other uses without just searching GH for code using css parsers and just see what kind of shenanigans people are up to.

I've been playing around with a weird thing that's kinda like a template engine, but driven by a mix of a lightweight node-based markup language, css selectors for expressing what goes into the template, and a css-alike for controlling exactly how all of these parts come together.

New comment by abathur in "12k AI-generated blog posts added in a single commit"

abathur — Sat, 04 Apr 2026 18:26:27 +0000

But can you trust that the things they say aren't just laundered AI blogspam?

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sun, 08 Mar 2026 16:28:08 +0000

I agree with you that a face-to-face q&a is a reasonably good way to detect low-effort cheating, but I'll still quibble a bit:

- I don't think this lowers the cost of detection as much as you imagine. You still need to know the paper better than the student and have to sacrifice already tight instruction/planning/grading time to have all of these conversations. Even if you catch enough to successfully deter most, it likely means not covering something else. It won't be too hard to catch low-effort cheaters who can't be bothered to read the paper, but you're on the low-leverage side of an arms race with the remaining students. You have experience on your side and they can't know what you'll ask, but they outnumber you and can certainly read the paper and use LLMs to quiz them on it. You have to invest your effort without knowing how each student prepared, so you'll spend about as much effort on every low-effort cheat as you do on the highest-effort cheat you are prepared to catch.

- Not sure it is "from the wrong direction" since both approaches raise the cost of cheating and lower the cost of detecting it.

- While this does avoid encouraging students to dumb down their work, it does still raise the cost of not-cheating. Unless you surprise the students with these conversations, the ones that care most will still anxiously prepare.

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sat, 07 Mar 2026 23:08:33 +0000

Grammar and AP style rules, iirc. (I may not. It's been enough years now. I did try and fail to find the syllabus in my box of five-star notebooks. We mostly used reporters notebooks for this class, and I took it over the summer. The materials are probably in a plastic bag somewhere...)

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sat, 07 Mar 2026 21:51:58 +0000

There are many disciplines in which students work on effectively distinct projects.

For example, the life-changingly-well-designed newswriting course I took in college assigned every single student a different story to spend several weeks reporting out so that we wouldn't all be out harassing the same poor people for interviews.

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sat, 07 Mar 2026 21:41:48 +0000

Sure--yes--the student will learn something if they actually wrote a 20-page paper on some given topic. But how are you going to evaluate their ability to compose the 20-page argument?

I would prefer not to be confrontational here, but I am having a hard time imagining that you've deeply considered the pedagogy of how to teach and evaluate students on squishy skills like this.

Knowing a bunch of facts about something is a world apart from structuring a compelling in-depth argument about it.

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sat, 07 Mar 2026 21:26:04 +0000

Does crapping on the average school's deep well of expertise for evaluating how effectively AI software solutions address their problems somehow fix the underlying problem (that the cost of catching cheaters is significantly higher than the cost of cheating)?

(This is roughly the same problem as evaluating software that only does an approximation of what it claims to do.)

(Aside: AI-based variations on this theme are in the early stages of proliferating across our society. They're being developed by many people using this forum and being sold to our schools, businesses, governments, and other organizations with little regard to whether they actually do what they claim.)

New comment by abathur in "Training students to prove they're not robots is pushing them to use more AI"

abathur — Sat, 07 Mar 2026 20:48:15 +0000

I don't disagree with you that a reasonable way to cope with the current problems is to ensure everything that "counts" is done in a controlled environment, but pedagogy and its goals are vast.

There are things you learn from spending several days structuring a 20-page argument that you will not learn (and cannot assess) from oral examination or a 5-paragraph essay written in a blue book.

New comment by abathur in "Reflections on AI at the End of 2025"

abathur — Sun, 21 Dec 2025 00:50:57 +0000

> Granted, but this reads a bit like a headline from The Onion: "'Hard to imagine a more favourable situation than pressing nails into wood' said local man unimpressed with neighbour's new hammer".

Chuffed you picked this example to ~sneer about.

There's a near-infinite list of problems one can solve with a hammer, but there are vanishingly few things one can build with just a hammer.

> You (or the person I was replying to) basically have to make the case that Simon Willison is ignorant about LLMs and programming, is desperate about something, or is deluding himself that the port worked when it actually didn't, to keep the original claim.

I don't have to do any such thing.

I said the experiments were both interesting and illuminating and I meant it. But that doesn't mean they will generalize to less-favorable problems. (Simon's doing great work to help stake out what does and doesn't work for him. I have seen every single one of the posts you're alluding to as they were posted, and I hesitated to reply here because I was leery someone would try to frame it as an attack on him or his work.)

> Is it? I can't use an example where they weren't useful or failed.

  https://en.wiktionary.org/wiki/cherry-pick

  (idiomatic) To pick out the best or most desirable items
  from a list or group, especially to obtain some advantage
  or to present something in the best possible light. 

  (rhetoric, logic, by extension) To select only evidence which supports an argument, 
  and reject or ignore contradictory evidence.

> any number of people failing at plumbing a bathroom sink don't prove that plumbing is impossible or not useful. One success at plumbing a bathroom sink is enough to demonstrate that it is possible and useful - it doesn't need dozens of examples - even if the task is narrowly scoped and well-trodden.

This smells like sleight of hand.

I'm happy to grant this (with a caveat^) if your point is that this success proves LLMs can build an HTML parser in a language with several popular source-available examples and thousands of tests (and probably many near-identical copies of the underlying HTML specs as they evolve) with months of human guidance^ and (with much less guidance) rapidly translate that parser into another language with many popular source-available answers and the same test suite. Yes--sure--one example of each is proof they can do both tasks.

But I take your GP to be suggesting something more like: this success at plumbing a sink inside the framework an existing house with plumbing provides is proof that these things can (or will) build average fully-plumbed houses.

^Simon, who you noted is not ignorant about LLMs and programming, was clear that the initial task of getting an LLM to write the first codebase that passed this test suite took Emil months of work.

> If a Tesla humanoid robot could plumb in a bathroom sink, it might not be good value for money, but it would be a useful task. If it could do it for $30 it might be good value for money as well even if it couldn't do any other tasks at all, right?

The only part of this that appears to have been done for about $30 was the translation of the existing codebase. I wouldn't argue that accomplishing this task for $30 isn't impressive.

But, again, this smells like sleight of hand.

We have probably plumbed billions of sinks (and hopefully have billions or even trillions more to go), so any automation that can do one for $30 has clear value.

A world with a billion well-tested HTML parsers in need of translation is likely one kind of hell or another. Proof an LLM-based workflow can translate a well-tested HTML parser for $30 is interesting and illuminating (I'm particularly interested in whether it'll upend how hard some of us have to fight to justify the time and effort that goes into high-quality test suites), but translating them obviously isn't going to pay the bills by itself.

(If the success doesn't generalize to less favorable situations that do pay the bills, this clearly valuable capability may be repriced to better reflect how much labor and risk it saves relative to a human rewrite.)

New comment by abathur in "Reflections on AI at the End of 2025"

abathur — Sat, 20 Dec 2025 21:35:21 +0000

I think both of those experiments do a good job of demonstrating utility on a certain kind of task.

But this is cherry-picking.

In the grand scheme of the work we all collectively do, very few programming projects entail something even vaguely like generating an Nth HTML parser in a language that already has several wildly popular HTML parsers--or porting that parser into another language that has several wildly popular HTML parsers.

Even fewer tasks come with a library of 9k+ tests to sharpen our solutions against. (Which itself wouldn't exist without experts trodding this ground thoroughly enough to accrue them.)

The experiments are incredibly interesting and illuminating, but I feel like it's verging on gaslighting to frame them as proof of how useful the technology is when it's hard to imagine a more favorable situation.

New comment by abathur in "Your job is to deliver code you have proven to work"

abathur — Thu, 18 Dec 2025 18:31:21 +0000

I've been tasked with doing a very superficial review of a codebase produced by an adult who purports to have decades of database/backend experience with the assistance of a well-known agent.

While skimming tests for the python backend, I spotted the following:

    @patch.dict(os.environ, {"ENVIRONMENT": "production"})
    def test_settings_environment_from_env(self) -> None:
        """Test environment setting from env var."""
        from importlib import reload

        import app.config

        reload(app.config)

        # Settings should use env var
        assert os.environ.get("ENVIRONMENT") == "production"

This isn't an outlier. There are smells everywhere.

New comment by abathur in "Kilauea erupts, destroying webcam [video]"

abathur — Sun, 07 Dec 2025 06:36:59 +0000

Also https://en.wikipedia.org/wiki/Laze_(geology) :)

New comment by abathur in "Markdown is holding you back"

abathur — Sun, 23 Nov 2025 00:45:29 +0000

This is the kind of dismissive sneer the HN guidelines advise against.

You can write dev docs for humans and still want machine readability (without caring about whether some LLM can make sense of the docs).

Machine readability is how you repurpose your own documentation in different contexts. If your documentation it isn't machine readable it might as well be in a .doc(x) file.

New comment by abathur in "UK's first small nuclear power station to be built in north Wales"

abathur — Sun, 16 Nov 2025 17:10:57 +0000

There's quite a lot of pricing data available for the energy market and it might be possible to approximate battery profitability by rerunning normal and long-tail history.

See https://www.ercot.com/mktinfo/prices and https://www.ercot.com/gridmktinfo/dashboards and https://www.ercot.com/gridmktinfo/dashboards/energystoragere... for example.

New comment by abathur in "AGI fantasy is a blocker to actual engineering"

abathur — Fri, 14 Nov 2025 16:10:34 +0000

Yep--I'm agreeing that one's a good comparison to elaborate on.

Exploring how it stacks up against an essential use probably won't persuade people who perceive it as wasteful.

New comment by abathur in "AGI fantasy is a blocker to actual engineering"

abathur — Fri, 14 Nov 2025 15:49:45 +0000

Agriculture feeds people, Simon.

It's fair to be critical of how the ag industry uses that water, but a significant fraction of that activity is effectively essential.

If you're going to minimize people's concern like this, at least compare it to discretionary uses we could ~live without.

The data's about 20 years old, but for example https://www.usga.org/content/dam/usga/pdf/Water%20Resource%2... suggests we were using over 2b gallons a day to water golf courses.

New comment by abathur in "The terminal of the future"

abathur — Wed, 12 Nov 2025 16:29:36 +0000

I'll cop to not reading the whole list before commenting, but I skimmed this and didn't really notice anything about speed or performance.

When using tools that can emit 0 to millions of lines of output, performance seems like table-stakes for a professional tool.

I'm happy to see people experiment with the form, but to be fit for purpose I suspect the features a shell or terminal can support should work backwards from benchmarks and human testing to understand how much headroom they have on the kind of hardware they'd like to support and which features fit inside it.

New comment by abathur in "When the job search becomes impossible"

abathur — Wed, 17 Sep 2025 04:55:03 +0000

This is probably crazy talk, but I have been wondering how requiring people to slap a stamp on an envelope and mail in a résumé would go.

New comment by abathur in "I miss using em dashes"

abathur — Tue, 02 Sep 2025 02:00:25 +0000

The em is more than a synonym for semicolon.