Hacker News: jgraettinger1

New comment by jgraettinger1 in "Just Use Postgres for Durable Workflows"

jgraettinger1 — Thu, 28 May 2026 21:44:04 +0000

At Estuary, we have an in-house Rust crate [1] for building scale-out durable actors / FSMs in Postgres. It powers all async activity in our control plane -- slews of fine-grain scheduled actions, complex change propagation through data-flow topologies, reliable alert and email delivery, and more -- at hundreds to thousands of state transitions per second (today). It's been a wonderful pattern to build on, and is all of three source files.

Here's a an example computing a Fibonacci sequence (very inefficiently, with lots of spawned sub-tasks and message passing) [2]

[1] https://github.com/estuary/flow/tree/master/crates/automatio... [2] https://github.com/estuary/flow/blob/master/crates/automatio...

New comment by jgraettinger1 in "I believe there are entire companies right now under AI psychosis"

jgraettinger1 — Sat, 16 May 2026 05:32:45 +0000

The right metaphor isn't painting, though, it's molding clay. That first pass is slop, but it's raw clay that the agent is very good at molding given a modicum of direction and "not this, do that" comments. The combined first-pass and reshaping time is still far less than writing by hand from scratch. And increasingly, that first pass is ... not bad?

New comment by jgraettinger1 in "Code Review for Claude Code"

jgraettinger1 — Tue, 10 Mar 2026 00:54:54 +0000

I ask Claude or codex to review staged work regularly, as part of my workflow. This is often after I’ve reviewed myself, so I’m asking it to catch issues I missed.

It will _always_ find about 8 issues. The number doesn’t change, but it gets a bit … weird if it can’t really find a defect. Part of the art of using the tool is recognizing this is happening, and understanding it’s scraping the bottom of its barrel.

However, if there _are_ defects, it’s quite good at finding and surfacing them prominently.

New comment by jgraettinger1 in "Agentic Engineering Patterns"

jgraettinger1 — Wed, 04 Mar 2026 21:38:28 +0000

I think there's a Danger Zone when planning is light-weight and iterative, and code is cheap, but reviewing code is expensive: it leads to a kind of local hill-climbing.

Suppose you iterate through many sessions of lightweight planning, implementation, and code review. It _feels_ high velocity, you're cranking through the feature, but you've also invested a lot of your time and energy (planning isn't free, and code review and fit-for-purpose checks, in particular, are expensive). As often happens -- with or without AI -- you get towards the end and realize: there might have been a fundamentally better approach to take.

The tradeoff of that apparent velocity is that _now_ course correction is much more challenging. Those ephemeral plans are now gone. The effort you put into providing context within those plans is gone. You have an locally optimal solution, but you don't have a great way of expressing how to start over from scratch pointed in a slightly different direction.

I think that part can be really valuable, because given a sufficiently specific arrow, the AI can just rip.

Whether it's worth the effort, I suppose, depends on how high-conviction you are on your original chosen approach.

New comment by jgraettinger1 in "Agentic Engineering Patterns"

jgraettinger1 — Wed, 04 Mar 2026 20:24:39 +0000

Maintaining a high-quality requirements / specification document for large features prior to implementation, and then referencing it in "plan mode" prompts, feels like consensus best practice at this stage.

However a thing I'm finding quite valuable in my own workflows, but haven't seen much discussion of, is spending meaningful time with AI doing meta-planning of that document. For example, I'll spend many sessions partnered with AI just iterating on the draft document, asking it to think through details, play contrarian, surface alternatives, poke holes, identify points of confusion, etc. It's been so helpful for rapidly exploring a design space, and I frequently find it makes suggestions that are genuinely surprising or change my perspective about what we should build.

I feel like I know we're "done" when I thoroughly understand it, a fresh AI instance seems to really understand it (as evaluated by interrogating it), and neither of us can find anything meaningful to improve. At that point we move to implementation, and the actual code writing falls out pretty seamlessly. Plus, there's a high quality requirements document as a long-lived artifact.

Obviously this is a heavyweight process, but is suited for my domain and work.

ETA one additional practice: if the agent gets confused during implementation or otherwise, I find it's almost always due to a latent confusion about the requirements. Ask the agent why it did a thing, figure out how to clarify in the requirements, and try again from the top rather than putting effort into steering the current session.

New comment by jgraettinger1 in "The death of thread per core"

jgraettinger1 — Wed, 22 Oct 2025 13:24:10 +0000

As someone with workloads that can benefit from these techniques, but limited resources to put them into practice, my working thesis has been:

* Use a multi-threaded tokio runtime that's allocated a thread-per-core * Focus on application development, so that tasks are well scoped / skewed and don't _need_ stealing in the typical case * Over time, the smart people working on Tokio will apply research to minimize the cost of work-stealing that's not actually needed. * At the limit, where long-lived tasks can be distributed across cores and all cores are busy, the performance will be near-optimal as compared with a true thread-per-core model.

What's your hot take? Are there fundamental optimizations to a modern thread-per-core architecture which seem _impossible_ to capture in a work-stealing architecture like Tokio's?

New comment by jgraettinger1 in "Anthropic raises $13B Series F"

jgraettinger1 — Wed, 03 Sep 2025 02:59:44 +0000

It still doesn't make sense. Cursor undoubtedly has smart engineers who could implement the Anthropic text editing tool interface in their IDE. Why not just do that for one of your most important LLM integrations?

New comment by jgraettinger1 in "AI agent benchmarks are broken"

jgraettinger1 — Fri, 11 Jul 2025 17:09:58 +0000

> You can't do that for LLM output.

That's true if you're just evaluating the final answer. However, wouldn't you evaluate the context -- including internal tokens -- built by the LLM under test ?

In essence, the evaluator's job isn't to do separate fact-finding, but to evaluate whether the under-test LLM made good decisions given the facts at hand.

New comment by jgraettinger1 in "What is Realtalk’s relationship to AI? (2024)"

jgraettinger1 — Fri, 11 Jul 2025 02:14:55 +0000

But that’s not good. You don’t want Bob to be the gate keeper for why a process is the way it is.

In my experience working with agents helps eliminate that crap, because you have to bring the agent along as it reads your code (or process or whatever) for it to be effective. Just like human co-workers need to be brought along, so it’s not all on poor Bob.

New comment by jgraettinger1 in "The drawbridges come up: the dream of a interconnected context ecosystem is over"

jgraettinger1 — Tue, 17 Jun 2025 12:59:39 +0000

Hi, I’m a cofounder / CTO of estuary.dev. Our whole mission is democratizing and enabling use of data within orgs.

Open to a conversation about your work here? Reach me at johnny at estuary dot dev.

New comment by jgraettinger1 in "Why Use Structured Errors in Rust Applications?"

jgraettinger1 — Sun, 01 Jun 2025 16:45:00 +0000

I would recommend the `anyhow` crate and use of anyhow::Context to annotate errors on the return path within applications, like:

  falliable_func().context("failed to frob the peanut")?

Combine that with the `thiserror` crate for implementing errors within a library context. `thiserror` makes it easy to implement structured errors which embed other errors, and plays well with `anyhow`.

New comment by jgraettinger1 in "What's working for YC companies since the AI boom"

jgraettinger1 — Sat, 31 May 2025 19:17:40 +0000

> Microsoft owns GitHub and VSCode yet cursor was able to out execute them

Really? My startup is under 30 people. We develop in the open (source available) and are extremely willing to try new process or tooling if it'll gain us an edge -- but we're also subject to SOC2.

Our own evaluation was Cursor et all isn't worth the headache of the compliance paperwork. Copilot + VSCode is playing rapid catch-up and is a far easier "yes".

How large is the intersection of companies who a) believe Cursor has a substantive edge in capability, and b) have willingness to send Cursor their code (and go through the headaches of various vendor reviews and declarations)?

New comment by jgraettinger1 in "HTAP is Dead"

jgraettinger1 — Thu, 29 May 2025 05:40:43 +0000

> There's so many moving parts here.

Yep. At the scope of a single table, append-only history is nice but you're often after a clone of your source table within Iceberg, materialized from insert/update/delete events with bounded latency.

There are also nuances like Postgres REPLICA IDENTITY and TOAST columns. Enabling REPLICA IDENTITY FULL amplifies you source DB WAL volume, but not having it means your CDC updates will clobber your unchanged TOAST values.

If you're moving multiple tables, ideally your multi-table source transactions map into corresponding Iceberg transactions.

Zooming out, there's the orchestration concern of propagating changes to table schema over time, or handling tables that come and go at the source DB, or adding new data sources, or handling sources without trivially mapped schema (legacy lakes / NoSQL / SaaS).

As an on-topic plug, my company tackles this problem. Postgres => Iceberg is a common use case.

[0] https://docs.estuary.dev/reference/Connectors/materializatio...

New comment by jgraettinger1 in "The copilot delusion"

jgraettinger1 — Fri, 23 May 2025 04:20:02 +0000

> On the other hand, one could argue that AI is just another abstraction

I, as a user of a library abstraction, get a well defined boundary and interface contract — plus assurance it’s been put through paces by others. I can be pretty confident it will honor that contract, freeing me up to not have to know the details myself or second guess the author.

New comment by jgraettinger1 in "Introducing S2"

jgraettinger1 — Sun, 22 Dec 2024 21:01:24 +0000

The market is gobsmackingly huge, it's just the go-to-market entry points which are narrow.

In my opinion, the key is to find a value prop and positioning which lets prospects try your service while spending a minimum of their own risk capital / reputation points within their own org.

That makes it hard to go after core storage, because it's such a widely used, fundamental, and reliable part of most every company's infrastructure. You and I may agree that conventions of incremental files in S3 are a less-than-ideal primitive for representing streams, but plenty of companies are doing it this way just fine and don't feel that it's broken.

WarpStream, on the other hand, leaned in to the perceived complexity of running Kafka and the share of users who wanted a Kafka solution with the operational profile of using S3. Internal champions can sell trying their service because the prospect's existing thing is already understood to be a pain in the butt.

For what it's worth, if I were entering the space anew today I'd be thinking carefully about the Iceberg standard and what I might be able to do with it.

New comment by jgraettinger1 in "Introducing S2"

jgraettinger1 — Sun, 22 Dec 2024 19:18:52 +0000

> To me the best solution seem like combining storing writes on EBS (or even NVMe) initially to minimize the time until writes can be acknowledged, and creating a chunk on S3 standard every second or so.

Yep, this is approximately Gazette's architecture (https://github.com/gazette/core). It buys the latency profile of flash storage, with the unbounded storage and durability of S3.

An addendum is there's no need to flush to S3 quite that frequently, if readers instead tail ACK'd content from local disk. Another neat thing you can do is hand bulk historical readers pre-signed URLs to files in cloud storage, so those bytes don't need to proxy through brokers.

New comment by jgraettinger1 in "Introducing S2"

jgraettinger1 — Sun, 22 Dec 2024 19:01:29 +0000

Roughly ten years ago, I started Gazette [0]. Gazette is in an architectural middle-ground between Kafka and WarpStream (and S2). It offers unbounded byte-oriented log streams which are backed by S3, but brokers use local scratch disks for initial replication / durability guarantees and to lower latency for appends and reads (p99 <5ms as opposed to >500ms), while guaranteeing all files make it to S3 with niceties like configurable target sizes / compression / latency bounds. Clients doing historical reads pull content directly from S3, and then switch to live tailing of very recent appends.

Gazette started as an internal tool in my previous startup (AdTech related). When forming our current business, we very briefly considered offering it as a raw service [1] before moving on to a holistic data movement platform that uses Gazette as an internal detail [2].

My feedback is: the market positioning for a service like this is extremely narrow. You basically have to make it API compatible with a thing that your target customer is already using so that trying it is zero friction (WarpStream nailed this), or you have to move further up to the application stack and more-directly address the problems your target customers are trying to solve (as we have). Good luck!

[0]: https://gazette.readthedocs.io/en/latest/ [1]: https://news.ycombinator.com/item?id=21464300 [2]: https://estuary.dev

New comment by jgraettinger1 in "Scale Ruins Everything"

jgraettinger1 — Mon, 14 Oct 2024 22:07:59 +0000

One labeling is "organised crime".

Another is: civil disobedience with a profit motive.

New comment by jgraettinger1 in "Everything you need to know about Python 3.13 – JIT and GIL went up the hill"

jgraettinger1 — Sat, 28 Sep 2024 18:52:43 +0000

> For example, imagine a high volume, low latency, synchronous computer vision inference service.

I'm not in this space and this is probably too simplistic, but I would think pairing asyncio to do all IO (reading / decoding requests and preparing them for inference) coupled with asyncio.to_thread'd calls to do_inference_in_C_with_the_GIL_released(my_prepared_request), would get you nearly all of the performance benefit using current Python.

New comment by jgraettinger1 in "Chrome is entrenching third-party cookies that will mislead users"

jgraettinger1 — Fri, 30 Aug 2024 04:45:55 +0000

> which is a total joke since it's often trivial to re-identify anonymized data

HIPAA doesn't say ROT13 or anything else in particular counts as "anonymized". It's an after-the-fact assessment. If your "encrypted" data is accidentally released, and there's any reasonable suspicion inside or outside the company that it's crack-able, then it's a YOU problem and you need to notify a bajillion people by mail and per-state press release plus large fines.

I think you're being overly pessimistic on the strengths of US regulations on this with regard to preventing deliberate malfeasance, and that most of the stupid we see in stories is really just by accident or individual actors.