<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: faxmeyourcode</title><link>https://news.ycombinator.com/user?id=faxmeyourcode</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 10 Jun 2026 10:07:32 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=faxmeyourcode" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"]]></title><description><![CDATA[
<p>Somebody else in the thread pointed out that snapshotting a database at a point in time stores not only the state of execution but also the code, etc. That is a unique benefit over storing your orchestration outside of the database, and one I'd be interested in exploring.<p>Not trying to dismiss the project - it looks like a lot of hard work has gone into it and somebody has a use for it. I just come from an Airflow-style external orchestrator frame of mind, where durability state lives in Postgres but the control flow stays out of the database. Sorry if I came off as a bit snarky.</p>
]]></description><pubDate>Fri, 05 Jun 2026 16:44:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=48415053</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=48415053</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48415053</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"]]></title><description><![CDATA[
<p>I agree - I'm not understanding the value of the project either if you look at the example here <a href="https://github.com/microsoft/pg_durable/blob/main/examples/invoice-approval/without-df.md" rel="nofollow">https://github.com/microsoft/pg_durable/blob/main/examples/i...</a><p>It's an interesting technical achievement I guess, but it's very bizarre to try and read this<p><pre><code>    SELECT df.start(
        @> (
            ($$SELECT ... FROM demo.invoices WHERE status = 'pending'$$ |=> 'inv')
            ~> df.if_rows('inv',
                $$UPDATE ... SET status = 'processing'$$
                ~> (df.http(...) |=> 'resp')
                ~> df.if($$SELECT $r.ok$$,
                    -- classify, branch, wait for signal ...
                ),
                df.sleep(5)
            )
        ),
        'invoice-approval-pipeline'
    );</code></pre></p>
]]></description><pubDate>Fri, 05 Jun 2026 16:22:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=48414709</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=48414709</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48414709</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "pg_durable: Microsoft open sources in-database durable execution"]]></title><description><![CDATA[
<p>This feels like the wrong solution to an age-old problem that DAG schedulers like Apache Airflow have solved for a while now.<p>Why would I want to store my control flow in the database and not in code? It feels strange.<p>Not trying to dismiss the project, I'm just not getting it yet, I think.</p>
]]></description><pubDate>Fri, 05 Jun 2026 16:18:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=48414641</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=48414641</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48414641</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Disagreement among frontier LLMs on real-world fact-checks"]]></title><description><![CDATA[
<p>I had a hunch that opus 4.7 hedged more than other models - and it turns out it's true<p><pre><code>    model                 total_claims  hedged_count  hedged_pct
    claude-opus-4-7       1000          451           45.1
    sonar-pro             1000          391           39.1
    gpt-5.4               1000          277           27.7
    gemini-3-retrieval    1000          129           12.9
    gemini-3-pro          1000          60            6.0
</code></pre>
datasette query here<p><a href="https://lite.datasette.io/?csv=https%3A%2F%2Fstatic.simonwillison.net%2Fstatic%2Fcors-allow%2F2026%2Flenz-llm-disagreement.csv#/data?sql=WITH+verdicts+AS+%28%0A++SELECT+claim_id%2C+%27gpt-5.4%27+AS+model%2C+%22gpt-5.4_verdict%22+AS+verdict%0A++FROM+%22lenz-llm-disagreement%22%0A%0A++UNION+ALL%0A++SELECT+claim_id%2C+%27claude-opus-4-7%27%2C+%22claude-opus-4-7_verdict%22%0A++FROM+%22lenz-llm-disagreement%22%0A%0A++UNION+ALL%0A++SELECT+claim_id%2C+%27gemini-3-pro%27%2C+%22gemini-3-pro_verdict%22%0A++FROM+%22lenz-llm-disagreement%22%0A%0A++UNION+ALL%0A++SELECT+claim_id%2C+%27gemini-3-retrieval%27%2C+%22gemini-3-retrieval_verdict%22%0A++FROM+%22lenz-llm-disagreement%22%0A%0A++UNION+ALL%0A++SELECT+claim_id%2C+%27sonar-pro%27%2C+%22sonar-pro_verdict%22%0A++FROM+%22lenz-llm-disagreement%22%0A%29%0ASELECT%0A++model%2C%0A++COUNT%28*%29+AS+total_claims%2C%0A++SUM%28CASE+WHEN+verdict+NOT+IN+%28%27True%27%2C+%27False%27%29+AND+verdict+IS+NOT+NULL+THEN+1+ELSE+0+END%29+AS+hedged_count%2C%0A++ROUND%28%0A++++100.0+*+SUM%28CASE+WHEN+verdict+NOT+IN+%28%27True%27%2C+%27False%27%29+AND+verdict+IS+NOT+NULL+THEN+1+ELSE+0+END%29+%2F+COUNT%28*%29%2C%0A++++1%0A++%29+AS+hedged_pct%0AFROM+verdicts%0AGROUP+BY+model%0AORDER+BY+hedged_count+DESC%3B" rel="nofollow">https://lite.datasette.io/?csv=https%3A%2F%2Fstatic.simonwil...</a></p>
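<p>A minimal Python sketch of the same hedging calculation as the linked query, for readers who would rather not decode the URL. It assumes each row is a (model, verdict) pair and treats any non-null verdict other than 'True' or 'False' as hedged. The function name, model names, and verdict labels in the sample are hypothetical, not taken from the dataset.</p>

```python
from collections import Counter


def hedged_rates(rows):
    """Return {model: hedged percentage} for an iterable of (model, verdict) pairs.

    A verdict counts as hedged when it is not None and is neither "True" nor
    "False". Null verdicts still count toward the total, as COUNT(*) does.
    """
    totals = Counter()
    hedged = Counter()
    for model, verdict in rows:
        totals[model] += 1
        if verdict is not None and verdict not in ("True", "False"):
            hedged[model] += 1
    return {m: round(100.0 * hedged[m] / totals[m], 1) for m in totals}


# Hypothetical sample rows, not real data from the CSV.
sample = [
    ("model-a", "True"),
    ("model-a", "Mixed"),
    ("model-a", None),
    ("model-a", "False"),
    ("model-b", "Unverifiable"),
    ("model-b", "True"),
]
rates = hedged_rates(sample)
```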
]]></description><pubDate>Thu, 28 May 2026 14:59:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=48309906</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=48309906</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48309906</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Boris Cherny: TI-83 Plus Basic Programming Tutorial (2004)"]]></title><description><![CDATA[
<p>An HP 50g was my calculator of choice, and the whole RPN style really rubbed off on me. Plus it had more advanced symbolic algebra capabilities than a TI-83 equivalent. I enjoyed learning Common Lisp, Scheme, Racket, etc. through high school and college and am still fond of them today because of this calculator.</p>
]]></description><pubDate>Thu, 07 May 2026 11:15:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=48048028</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=48048028</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48048028</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "GPT-5.5"]]></title><description><![CDATA[
<p>How does it compare to mythos?</p>
]]></description><pubDate>Thu, 23 Apr 2026 18:34:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47879638</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47879638</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47879638</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "25 Years of Eggs"]]></title><description><![CDATA[
<p>Wow, I didn't realize some RFID could reach 15 feet out - that's good to know. I naively thought you essentially had to be touching the surface of the tag.</p>
]]></description><pubDate>Mon, 23 Mar 2026 18:52:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47493598</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47493598</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47493598</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>If you treat LLMs as generic transformers, you can fine-tune with a ton of examples of input/output pairs. For messy input data with lots of examples already built, this is ideal.<p>At my day job we have experimented with fine-tuned transformers for our receipt processing workflow. We take images of receipts, run them through OCR (this step might not even be necessary, but we already do it at scale anyway), and then take the OCR output text blobs and "transform" them into structured receipts with retailer, details like zip code, transaction timestamps, line items, sales taxes, sales, etc.<p>I trained a small LLM (mistral-7b) via SFT with 1000 (maybe 10,000? I don't remember) examples from receipts in our database from 2019. When I tested the model on receipts from 2020 it hit something like 98% accuracy.<p>The key that made this work so well is that we had a ton of data (potentially billions of example input/output pairs) and we could easily evaluate correctness by unpacking the JSON output and comparing it with our source tables.<p>Note that this isn't running in production, it was an experiment. There are edge cases I didn't consider, and there's a lot more to it in terms of accurately evaluating, deciding when to re-train, dealing with net new receipt types, retailers, new languages (we're doing global expansion right now so it's top of mind), general diversity of edge cases in your training data, etc.</p>
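<p>A minimal sketch of that kind of evaluation step, assuming the model emits one JSON object per receipt and the source table row is available as a dict. The function name and the field names (retailer, zip_code, total) are hypothetical illustrations, not the actual schema or scoring used in the experiment.</p>

```python
import json


def field_accuracy(model_output: str, expected: dict) -> float:
    """Fraction of expected fields that the model's JSON output reproduces exactly.

    Output that is not valid JSON, or is not a JSON object, scores 0.0.
    Assumes `expected` is non-empty.
    """
    try:
        predicted = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(predicted, dict):
        return 0.0
    matches = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return matches / len(expected)


# Hypothetical source row and model output.
expected_row = {"retailer": "Acme", "zip_code": "12345", "total": 10.5}
output = '{"retailer": "Acme", "zip_code": "12345", "total": 11.0}'
score = field_accuracy(output, expected_row)
```

<p>Exact field matching is the strictest option. A real evaluation would probably need tolerance for things like timestamp formats or rounding in totals.</p>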
]]></description><pubDate>Tue, 10 Mar 2026 15:15:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47324409</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47324409</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47324409</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>Especially for super constrained applications. I don't care if the language model that I use for my extremely specific business domain can solve PhD math or remember the works of Shakespeare. I'd trade all of that for pure task specific accuracy.</p>
]]></description><pubDate>Wed, 04 Mar 2026 17:14:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47250644</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47250644</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47250644</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Qwen3.5 Fine-Tuning Guide"]]></title><description><![CDATA[
<p>Labeling or categorization tasks like this are the bread and butter of small fine tuned models. Especially if you need outputs in a specific json format or whatever.<p>I did an experiment where I did very simple SFT on Mistral 7b and it was extremely good at converting receipt images into structured json outputs and I only used 1,000 examples. The difficulty is trying to get a diverse enough set of examples, evaling, etc.<p>If you have great data with simple input output pairs, you should really give it a shot.</p>
]]></description><pubDate>Wed, 04 Mar 2026 17:06:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=47250531</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47250531</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47250531</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Claude's Cycles [pdf]"]]></title><description><![CDATA[
<p>> Filip also told me that he asked Claude to continue on the even case after the odd case had been resolved. “But there after a while it seemed to get stuck. In the end, it was not even able to write and run explore programs correctly anymore, very weird. So I stopped the search.”<p>Interesting snippet towards the end. I wonder if they were using claude.ai or claude code. Sounds like they ran out of context and entered the "dumb zone."</p>
]]></description><pubDate>Tue, 03 Mar 2026 16:56:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47235247</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47235247</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47235247</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Google Street View in 2026"]]></title><description><![CDATA[
<p>Yea, I agree. The dataset is < 100MB... so duckdb can very easily handle this on an old macbook air.<p><a href="https://duckdb.org/2025/05/19/the-lost-decade-of-small-data" rel="nofollow">https://duckdb.org/2025/05/19/the-lost-decade-of-small-data</a></p>
]]></description><pubDate>Thu, 26 Feb 2026 19:16:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47170708</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47170708</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47170708</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "The First Fully General Computer Action Model"]]></title><description><![CDATA[
<p>Neel, this is really cool. How long have you been working on this, and where did you guys get inspiration from? Did you work on vlms earlier or something like that? Just curious.<p>Also, thanks for choosing a technical blog post for presenting this information.</p>
]]></description><pubDate>Thu, 26 Feb 2026 04:20:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47161799</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=47161799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47161799</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Ask HN: What are you working on? (February 2026)"]]></title><description><![CDATA[
<p>I love the idea behind MyVisualRoutine as a father with a disabled kiddo, thanks for sharing.<p>The app is beautiful - much better than I could build - what tech is it using if you don't mind me asking? Is it flutter, react native, something else? Just want to get better at mobile dev.</p>
]]></description><pubDate>Mon, 09 Feb 2026 15:43:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=46946439</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46946439</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46946439</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "UEFI Bindings for JavaScript"]]></title><description><![CDATA[
<p>Love this. An example of complete and total dominion over the machine. Great quote here too lol<p>> Prometheus stole fire from the gods and gave it to man. For this he was chained to a rock and tortured for eternity.</p>
]]></description><pubDate>Mon, 09 Feb 2026 15:00:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=46945942</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46945942</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46945942</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>Everybody is different, I simply cannot stand the sight of chatgpt styled writing. Give me paragraphs.</p>
]]></description><pubDate>Fri, 06 Feb 2026 14:21:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=46913155</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46913155</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46913155</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Some notes on starting to use Django"]]></title><description><![CDATA[
<p>I've lobbied to replace our internal tool with a django admin panel. I prototyped it and it showed that it would reduce our code by > 15k lines.<p>Any internal webapps I need to build like this will 100% be set up with django in the future due to this. I don't need it to be pretty, I just want the UI, database migrations, users, roles, groups, etc for free</p>
]]></description><pubDate>Thu, 29 Jan 2026 05:40:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=46806243</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46806243</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46806243</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Vibe coding kills open source"]]></title><description><![CDATA[
<p>One trick to get out of this scenario where you're writing a ton is to ask the model to interview you until you're both in alignment on what is being built. Claude and open code both have an AskUserQuestionTool which is really nice for this and cuts down on explanation a lot. It becomes an iterative interview and clarifies my thinking significantly.</p>
]]></description><pubDate>Mon, 26 Jan 2026 14:20:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=46765936</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46765936</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46765936</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "Vibe coding kills open source"]]></title><description><![CDATA[
<p>> LLMs dont care about the story, they just care about the current state of the code<p>You have to tell it about the backstory. It does not know unless you write about it somewhere and give it as input to the model.</p>
]]></description><pubDate>Mon, 26 Jan 2026 14:17:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46765896</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46765896</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46765896</guid></item><item><title><![CDATA[New comment by faxmeyourcode in "AI is a horse (2024)"]]></title><description><![CDATA[
<p>I'm sure the same could be said about tractors when they were coming on the scene.<p>There was probably initial excitement about not having to manually break the earth, then stories spread about farmers ruining entire crops with one tractor, some farms began touting 10x more efficiency by running multiple tractors at once, some farmers said the maintenance burden of a tractor was not worth it compared to feeding/watering their mule, etc.<p>Fast forward and now gigantic remote controlled combines are dominating thousands of acres of land with efficiency greater than that of 100 men with 100 early tractors.</p>
]]></description><pubDate>Fri, 23 Jan 2026 15:25:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=46733611</link><dc:creator>faxmeyourcode</dc:creator><comments>https://news.ycombinator.com/item?id=46733611</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46733611</guid></item></channel></rss>