Hacker News: faxmeyourcode

New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"

faxmeyourcode — Fri, 05 Jun 2026 16:44:21 +0000

Somebody else in the thread brought up that snapshotting a database at a point in time stores not only the state of execution but also the code, etc. That is a unique benefit I'd be interested in exploring over storing your orchestration outside of the database.

Not trying to dismiss the project - it looks like a lot of hard work has gone in and somebody has a use for it. I just come from an Airflow-style external orchestrator frame of mind, where the durability state is managed in Postgres but the control flow is kept out of the database. Sorry if I came off as a bit snarky.
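To make that frame of mind concrete, here is a rough sketch of what I mean by keeping control flow in code while the database only holds durability state. It is not how Airflow or pg_durable actually work: sqlite3 stands in for Postgres so it runs anywhere, and the `step_state` table, `run_pipeline`, and the step names are all made up for illustration.

```python
import sqlite3


def run_pipeline(conn, run_id, steps):
    """Run steps in order, skipping any already recorded as done for this run.

    The control flow (ordering, skipping) lives here in code; the database
    only remembers which steps finished. Returns the steps executed this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS step_state ("
        "run_id TEXT, step TEXT, status TEXT, PRIMARY KEY (run_id, step))"
    )
    executed = []
    for name, fn in steps:
        row = conn.execute(
            "SELECT status FROM step_state WHERE run_id = ? AND step = ?",
            (run_id, name),
        ).fetchone()
        if row is not None and row[0] == "done":
            continue  # finished in an earlier attempt, so do not redo it
        fn()  # if this raises, nothing is recorded and a retry reruns the step
        conn.execute(
            "INSERT OR REPLACE INTO step_state (run_id, step, status) "
            "VALUES (?, ?, 'done')",
            (run_id, name),
        )
        conn.commit()
        executed.append(name)
    return executed


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    steps = [
        ("fetch_invoices", lambda: print("fetching invoices")),
        ("call_api", lambda: print("calling api")),
    ]
    print(run_pipeline(conn, "run-1", steps))
    print(run_pipeline(conn, "run-1", steps))  # nothing left to do
```

If a step crashes, rerunning with the same `run_id` picks up at the failed step. The trade-off from the snapshot argument still applies: a database snapshot captures `step_state` but not the Python code that defines the steps.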

New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"

faxmeyourcode — Fri, 05 Jun 2026 16:22:30 +0000

I agree - I'm not understanding the value of the project either, if you look at the example here https://github.com/microsoft/pg_durable/blob/main/examples/i...

It's an interesting technical achievement, I guess, but it's very bizarre to try to read this:

    SELECT df.start(
        @> (
            ($$SELECT ... FROM demo.invoices WHERE status = 'pending'$$ |=> 'inv')
            ~> df.if_rows('inv',
                $$UPDATE ... SET status = 'processing'$$
                ~> (df.http(...) |=> 'resp')
                ~> df.if($$SELECT $r.ok$$,
                    -- classify, branch, wait for signal ...
                ),
                df.sleep(5)
            )
        ),
        'invoice-approval-pipeline'
    );

New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"

faxmeyourcode — Fri, 05 Jun 2026 16:18:14 +0000

This feels like the wrong solution to an age-old problem that DAG schedulers like Apache Airflow have solved for a while now.

Why would I want to store my control flow in the database and not in code? It feels strange.

Not trying to dismiss the project, I'm just not getting it yet I think.

New comment by faxmeyourcode in "Disagreement among frontier LLMs on real-world fact-checks"

faxmeyourcode — Thu, 28 May 2026 14:59:16 +0000

I had a hunch that Opus 4.7 hedged more than other models, and it turns out it's true:

    model                 total_claims  hedged_count  hedged_pct
    claude-opus-4-7       1000          451           45.1
    sonar-pro             1000          391           39.1
    gpt-5.4               1000          277           27.7
    gemini-3-retrieval    1000          129           12.9
    gemini-3-pro          1000          60            6.0

datasette query here

https://lite.datasette.io/?csv=https%3A%2F%2Fstatic.simonwil...
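For anyone who wants to sanity check the percentage column without opening datasette, here is a small sketch that recomputes it from the counts in the table above. The counts are copied from that table; the `hedged_pct` helper and the dict layout are just mine, not the actual query.

```python
# (total_claims, hedged_count) per model, copied from the table above
COUNTS = {
    "claude-opus-4-7": (1000, 451),
    "sonar-pro": (1000, 391),
    "gpt-5.4": (1000, 277),
    "gemini-3-retrieval": (1000, 129),
    "gemini-3-pro": (1000, 60),
}


def hedged_pct(total_claims, hedged_count):
    """Share of claims that were hedged, as a percentage rounded to one decimal."""
    return round(100 * hedged_count / total_claims, 1)


if __name__ == "__main__":
    for model, (total, hedged) in COUNTS.items():
        print(f"{model:<20} {hedged_pct(total, hedged)}")
```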

New comment by faxmeyourcode in "Boris Cherny: TI-83 Plus Basic Programming Tutorial (2004)"

faxmeyourcode — Thu, 07 May 2026 11:15:49 +0000

An HP 50g was my calculator of choice, and the whole RPN style really rubbed off on me. Plus it had more advanced symbolic algebra capabilities than a TI-83 equivalent. I enjoyed learning Common Lisp, Scheme, Racket, etc. through high school and college, and I'm still fond of them today because of this calculator.

New comment by faxmeyourcode in "GPT-5.5"

faxmeyourcode — Thu, 23 Apr 2026 18:34:18 +0000

How does it compare to mythos?

New comment by faxmeyourcode in "25 Years of Eggs"

faxmeyourcode — Mon, 23 Mar 2026 18:52:18 +0000

Wow, I didn't realize some RFID could reach 15 feet out - that's good to know. I naively thought you essentially had to be touching the surface of the tag.

New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"

faxmeyourcode — Tue, 10 Mar 2026 15:15:44 +0000

If you treat LLMs as generic text-to-text transformers, you can fine-tune with a ton of examples of input/output pairs. If you have messy input data and lots of examples already built, this is ideal.

At my day job we have experimented with fine tuned transformers for our receipt processing workflow. We take images of receipts, run them through OCR (this step might not even be necessary, but we do it at scale already anyways), and then take the OCR output text blobs and "transform" them into structured receipts with retailer, details like zip code, transaction timestamps, line items, sales taxes, sales, etc.

I trained a small LLM (mistral-7b) via SFT with 1000 (maybe 10,000? I don't remember) examples from receipts in our database from 2019. When I tested the model on receipts from 2020 it hit something like 98% accuracy.

The key that made this work so well is that we had a ton of data (potentially billions of example input/output pairs) and we could easily evaluate the correctness by unpacking the json output and comparing with our source tables.
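Roughly, the eval looks like the sketch below: parse the model's JSON, compare each field against the row from the source table, and count invalid JSON as a miss. The field names in `FIELDS` and both function names are placeholders, not our real schema, and a real eval would also need to handle line items and fuzzy matches.

```python
import json

# Hypothetical field names, stand-ins for the real receipt schema
FIELDS = ("retailer", "zip_code", "timestamp", "total")


def score_output(model_output, expected):
    """Return {field: bool} correctness; unparseable output fails every field."""
    try:
        predicted = json.loads(model_output)
    except json.JSONDecodeError:
        return {field: False for field in FIELDS}
    if not isinstance(predicted, dict):
        return {field: False for field in FIELDS}
    return {field: predicted.get(field) == expected.get(field) for field in FIELDS}


def field_accuracy(pairs):
    """pairs: iterable of (model_output_json_string, expected_row_dict).

    Returns the fraction of examples where each field matched exactly.
    """
    correct = {field: 0 for field in FIELDS}
    n = 0
    for model_output, expected in pairs:
        n += 1
        for field, ok in score_output(model_output, expected).items():
            correct[field] += ok
    return {field: (correct[field] / n if n else 0.0) for field in FIELDS}
```

Per-field accuracy is more useful than a single number here, since it shows whether the model is failing on, say, timestamps specifically rather than on everything.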

Note that this isn't running in production, it was an experiment. There are edge cases I didn't consider, and there's a lot more to it in terms of accurately evaling, when to re-train, dealing with net new receipt types, retailers, new languages (we're doing global expansion RN so it's top of mind), general diversity of edge cases in your training data, etc.

New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"

faxmeyourcode — Wed, 04 Mar 2026 17:14:20 +0000

Especially for super constrained applications. I don't care if the language model that I use for my extremely specific business domain can solve PhD math or remember the works of Shakespeare. I'd trade all of that for pure task specific accuracy.

New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"

faxmeyourcode — Wed, 04 Mar 2026 17:06:32 +0000

Labeling or categorization tasks like this are the bread and butter of small fine tuned models. Especially if you need outputs in a specific json format or whatever.

I did an experiment where I did very simple SFT on Mistral 7b and it was extremely good at converting receipt images into structured json outputs and I only used 1,000 examples. The difficulty is trying to get a diverse enough set of examples, evaling, etc.

If you have great data with simple input output pairs, you should really give it a shot.
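As a rough idea of how little prep that takes, here is a sketch that turns (input text, structured output) pairs into JSONL records for SFT. It assumes the receipt has already been turned into text (e.g. by OCR), and the `prompt`/`completion` keys and the instruction string are assumptions on my part; check what your training tool actually expects.

```python
import json

# Hypothetical instruction prefix; use whatever matches your inference prompt
INSTRUCTION = "Extract the receipt as JSON."


def to_sft_record(input_text, structured_output):
    """Build one training record from a raw text input and its target dict."""
    return {
        "prompt": INSTRUCTION + "\n\n" + input_text.strip(),
        # sort_keys keeps the target format stable across examples
        "completion": json.dumps(structured_output, sort_keys=True),
    }


def to_jsonl(examples):
    """examples: iterable of (input_text, structured_output_dict) pairs.

    Returns one JSON object per line, ready to be written to a .jsonl file.
    """
    return "\n".join(
        json.dumps(to_sft_record(text, output)) for text, output in examples
    )
```

Keeping the completion format identical across every example (same keys, same order) is a cheap way to make the fine-tuned model's output easy to parse and evaluate.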

New comment by faxmeyourcode in "Claude's Cycles [pdf]"

faxmeyourcode — Tue, 03 Mar 2026 16:56:56 +0000

> Filip also told me that he asked Claude to continue on the even case after the odd case had been resolved. “But there after a while it seemed to get stuck. In the end, it was not even able to write and run explore programs correctly anymore, very weird. So I stopped the search.”

Interesting snippet towards the end. I wonder if they were using claude.ai or claude code. Sounds like they ran out of context and entered the "dumb zone."

New comment by faxmeyourcode in "Google Street View in 2026"

faxmeyourcode — Thu, 26 Feb 2026 19:16:55 +0000

Yea, I agree. The dataset is < 100MB... so duckdb can very easily handle this on an old macbook air.

https://duckdb.org/2025/05/19/the-lost-decade-of-small-data

New comment by faxmeyourcode in "The First Fully General Computer Action Model"

faxmeyourcode — Thu, 26 Feb 2026 04:20:24 +0000

Neel, this is really cool. How long have you been working on this, and where did you guys get inspiration from? Did you work on vlms earlier or something like that? Just curious.

Also, thanks for choosing a technical blog post for presenting this information.

New comment by faxmeyourcode in "Ask HN: What are you working on? (February 2026)"

faxmeyourcode — Mon, 09 Feb 2026 15:43:42 +0000

I love the idea behind MyVisualRoutine as a father with a disabled kiddo, thanks for sharing.

The app is beautiful - much better than I could build - what tech is it using if you don't mind me asking? Is it flutter, react native, something else? Just want to get better at mobile dev.

New comment by faxmeyourcode in "UEFI Bindings for JavaScript"

faxmeyourcode — Mon, 09 Feb 2026 15:00:06 +0000

Love this. An example of complete and total dominion over the machine. Great quote here too lol

> Prometheus stole fire from the gods and gave it to man. For this he was chained to a rock and tortured for eternity.

New comment by faxmeyourcode in "Claude Opus 4.6"

faxmeyourcode — Fri, 06 Feb 2026 14:21:58 +0000

Everybody is different, I simply cannot stand the sight of chatgpt styled writing. Give me paragraphs.

New comment by faxmeyourcode in "Some notes on starting to use Django"

faxmeyourcode — Thu, 29 Jan 2026 05:40:24 +0000

I've lobbied to replace our internal tool with a django admin panel. I prototyped it and it showed that it would reduce our code by > 15k lines.

Any internal webapps I need to build like this will 100% be set up with django in the future due to this. I don't need it to be pretty, I just want the UI, database migrations, users, roles, groups, etc for free

New comment by faxmeyourcode in "Vibe coding kills open source"

faxmeyourcode — Mon, 26 Jan 2026 14:20:12 +0000

One trick to get out of this scenario where you're writing a ton is to ask the model to interview you until you're both in alignment on what is being built. Claude and open code both have an AskUserQuestionTool which is really nice for this and cuts down on explanation a lot. It becomes an iterative interview and clarifies my thinking significantly.

New comment by faxmeyourcode in "Vibe coding kills open source"

faxmeyourcode — Mon, 26 Jan 2026 14:17:16 +0000

> LLMs dont care about the story, they just care about the current state of the code

You have to tell it about the backstory. It does not know unless you write about it somewhere and give it as input to the model.

New comment by faxmeyourcode in "AI is a horse (2024)"

faxmeyourcode — Fri, 23 Jan 2026 15:25:49 +0000

I'm sure the same could be said about tractors when they were coming on the scene.

There was probably initial excitement about not having to manually break the earth, then stories spread about farmers ruining entire crops with one tractor, some farms began touting 10x more efficiency by running multiple tractors at once, some farmers said the maintenance burden of a tractor was not worth it compared to feeding/watering their mule, etc.

Fast forward and now gigantic remote controlled combines are dominating thousands of acres of land with an efficiency greater than that of 100 men with 100 early tractors.