<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ajaystream</title><link>https://news.ycombinator.com/user?id=ajaystream</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 14 Apr 2026 16:36:53 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ajaystream" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ajaystream in "Lean proved this program correct; then I found a bug"]]></title><description><![CDATA[
<p>The spec-completeness problem here is the same one that bites distributed systems verification: the proof holds inside an operating envelope (no adversarial inputs, trusted runtime, bounded sizes), and the interesting failures live at the boundary. TLA+ has the same property - you can prove liveness under a fairness assumption the deployment silently violates, and nothing in the proof tells you when reality drifted outside.</p><p>What I'd actually want from the tooling is a machine-checkable statement of the envelope itself, propagated as a runtime guard rather than a compile-time comment. Then "proof holds" and "we are still inside the proof's domain" are two separate, observable properties, and the unverified-parser / unverified-runtime cases stop being invisible.</p>
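<p>Roughly what I mean, as a minimal sketch - the envelope fields, the attestation flag, and the verified_parse stand-in are all made up for illustration:</p>

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Envelope:
    """Machine-checkable statement of the envelope a proof was carried out under."""
    max_input_bytes: int
    trusted_runtime: bool

    def admits(self, payload: bytes, runtime_attested: bool) -> bool:
        """True iff this call is still inside the proof's domain."""
        return len(payload) <= self.max_input_bytes and (
            not self.trusted_runtime or runtime_attested
        )

def verified_parse(payload: bytes) -> int:
    """Stand-in for the function the proof actually covers."""
    return len(payload)

def guarded(envelope: Envelope, payload: bytes, runtime_attested: bool) -> int:
    # "Proof holds" was established statically; "still inside the envelope"
    # is a separate, observable property checked on every call.
    if not envelope.admits(payload, runtime_attested):
        raise RuntimeError("outside verified envelope: result is unproven")
    return verified_parse(payload)

env = Envelope(max_input_bytes=1024, trusted_runtime=True)
print(guarded(env, b"ok", runtime_attested=True))  # inside the envelope: 2
```

<p>The point is that the envelope violation surfaces as a loud runtime error instead of a silently-unproven result.</p>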
]]></description><pubDate>Tue, 14 Apr 2026 03:12:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=47760799</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47760799</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47760799</guid></item><item><title><![CDATA[New comment by ajaystream in "Ask HN: Human psychology of non-AI-native users"]]></title><description><![CDATA[
<p>I hardly expected the problem could be answered with such poetic and psychological eloquence. I see what you are saying.</p>
]]></description><pubDate>Thu, 19 Mar 2026 21:09:42 +0000</pubDate><link>https://news.ycombinator.com/item?id=47446138</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47446138</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47446138</guid></item><item><title><![CDATA[New comment by ajaystream in "Ask HN: Human psychology of non-AI-native users"]]></title><description><![CDATA[
<p>I am building in finance, where workflows dominate - i.e., if A then B, but if ~A then C then D, etc. Configuration is a major challenge. AI helps get there, but every so often, as you know, LLMs do the wrong thing even though they have been told many times before. Users tend to abandon the interface and make their updates manually. Is there a user-interaction model that keeps the user engaged?</p>
]]></description><pubDate>Wed, 18 Mar 2026 17:01:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47428255</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47428255</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47428255</guid></item><item><title><![CDATA[Ask HN: Human psychology of non-AI-native users]]></title><description><![CDATA[
<p>Many of you are building AI for non-technical solutions ... legal etc. How are you dealing with the human psychology of users who, every once in a while, have to correct behavior they have already described?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47428211">https://news.ycombinator.com/item?id=47428211</a></p>
<p>Points: 1</p>
<p># Comments: 3</p>
]]></description><pubDate>Wed, 18 Mar 2026 16:58:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47428211</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47428211</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47428211</guid></item><item><title><![CDATA[New comment by ajaystream in "Ask HN: How do you handle payments for AI agents?"]]></title><description><![CDATA[
<p>The other challenge we have found is the accuracy and completeness of the fields that need to be updated across use cases. Either we mandate all the fields, or, when we mark them optional in the tool definition, the model sometimes blows right through them - how are you handling that?</p>
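<p>One mitigation we've been sketching is to validate every proposed tool call against the required list before the write ever executes, so an incomplete call is rejected instead of blowing through - the tool name and field names here are hypothetical:</p>

```python
# Hypothetical tool definition: the fields the write genuinely needs
# are listed as required, so an incomplete call fails fast.
TOOL_DEF = {
    "name": "update_subscription",
    "parameters": {
        "type": "object",
        "properties": {
            "subscription_id": {"type": "string"},
            "start_date": {"type": "string"},
            "tax_amount": {"type": "number"},
        },
        "required": ["subscription_id", "start_date"],
    },
}

def check_required(tool_def: dict, args: dict) -> list:
    """Return the required parameters missing from a proposed tool call."""
    required = tool_def["parameters"].get("required", [])
    return [name for name in required if name not in args]

missing = check_required(TOOL_DEF, {"subscription_id": "sub_1"})
print(missing)  # ['start_date'] -> reject the call, re-prompt the model
```

<p>An empty list means the call may proceed; anything else goes back to the model (or the user) before a write happens.</p>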
]]></description><pubDate>Wed, 18 Mar 2026 16:47:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=47428049</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47428049</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47428049</guid></item><item><title><![CDATA[Ask HN: Is anyone building write guarantees for agents working across tool]]></title><description><![CDATA[
<p>We ran into a specific failure mode building production agent workflows. Fields extracted from contracts produced inaccurate subscription updates: dates off by a day. Products created at random when they should have been updated. Tax amounts not written to the tax field but instantiated as entirely new products. Every failure was a plausible-looking write that succeeded technically and was wrong operationally.
HITL helped: processing one contract at a time with user confirmation at every step kept it accurate. But users eventually said "I have explained this to you 30 times, just get it done." The moment we reduced the confirmation steps to let it run, it started failing again.
No errors. No alerts. Just drift that showed up in reconciliation weeks later.
Prompting and mapping tables compensated at the margins but never held. The agent had no verified ground truth for how fields relate across systems - it inferred the mapping every time, and most of the time inferred it inconsistently. Help?</p>
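<p>The closest thing to a fix we've sketched is a pre-commit guard over a hand-verified field mapping: the agent can propose any write, but nothing commits that the mapping doesn't license. All names here are illustrative:</p>

```python
# Hand-verified ground truth: which target field each source field may write.
# This is the part a human maintains; the agent only proposes writes.
FIELD_MAP = {
    "contract.tax_amount": "subscription.tax_amount",
    "contract.start_date": "subscription.start_date",
}

def guard_write(source_field: str, target_field: str, op: str,
                target_exists: bool) -> None:
    """Reject a proposed write the verified mapping does not license."""
    expected = FIELD_MAP.get(source_field)
    if expected != target_field:
        # e.g. a tax amount aimed at a new product record instead of the tax field
        raise ValueError(f"{source_field} may not write {target_field}")
    if op == "create" and target_exists:
        # e.g. creating a product that should have been updated
        raise ValueError("target exists: expected an update, not a create")

# A licensed write passes silently; an unlicensed one raises before commit.
guard_write("contract.tax_amount", "subscription.tax_amount", "update", True)
```

<p>It turns "plausible-looking write that was wrong operationally" into a hard error at propose time, instead of drift found in reconciliation weeks later.</p>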
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47418879">https://news.ycombinator.com/item?id=47418879</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 17 Mar 2026 21:55:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47418879</link><dc:creator>ajaystream</dc:creator><comments>https://news.ycombinator.com/item?id=47418879</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47418879</guid></item></channel></rss>