<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: amangsingh</title><link>https://news.ycombinator.com/user?id=amangsingh</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Fri, 17 Apr 2026 12:48:17 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=amangsingh" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by amangsingh in "Show HN: Castra – Strip orchestration rights from your LLMs"]]></title><description><![CDATA[
<p>It sounds like we're absolutely on the same paranoid wavelength with FaultWall. Treating the LLM as a hostile/untrusted actor at the data plane is the only way this scales to enterprise.<p>To answer your question: No, we did eliminate destructive DB drift on the state machine entirely, but we did it by air-gapping the database. The agents in Castra don't write SQL and don't have a DB connection. They only have access to the compiled Go CLI. If an agent tries to hallucinate a destructive state change, the CLI simply rejects the command with a structured stderr HATEOAS response telling it to fix its syntax.<p>That said, having Castra govern the workflow orchestration while FaultWall governs the target application's data plane sounds like the ultimate 'zero-trust' synthetic labor stack. If you have a specific test case in mind, or a feature request that would help integrate your system into Castra's workflows, feel free to open an issue on the GitHub repo. I'd be happy to take a look and see how we can bridge it.</p>
]]></description><pubDate>Thu, 02 Apr 2026 15:17:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=47615622</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47615622</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47615622</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Spot on regarding the automod. Unfortunately, the way I naturally structure my writing almost always triggers a 50/50 flag on AI content detectors. It is the absolute bane of my existence.<p>The filter instantly shadowbanned the Show HN post when I submitted it, which is why the link was dead for a while. Thankfully, human mods reviewed it and restored it. The link has been fully live for a while now!</p>
]]></description><pubDate>Thu, 02 Apr 2026 01:42:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47609054</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47609054</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47609054</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>I looked into lat.md. They are definitely thinking in the same direction by using a CLI layer to govern the agent.<p>The key difference is the state mechanism. They use markdown; I use an AES-encrypted SQLite database.<p>Markdown is still just text an LLM can hallucinate over or ignore. A database behind a compiled binary acts as a physical constraint; the agent literally cannot advance a task without satisfying the cryptographic gates.<p>I just dropped the Show HN for it here if you want to check out the architecture: <a href="https://news.ycombinator.com/item?id=47601608">https://news.ycombinator.com/item?id=47601608</a></p>
]]></description><pubDate>Thu, 02 Apr 2026 01:29:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47608961</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47608961</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47608961</guid></item><item><title><![CDATA[New comment by amangsingh in "Show HN: Castra – Strip orchestration rights from your LLMs"]]></title><description><![CDATA[
<p>To clarify the architecture: The LLM doesn't have access to the ledger. That’s the entire point of Castra.<p>The LLM only has access to the CLI binary. The SQLite database is AES-256-CTR encrypted at rest. If an LLM (or a human) tries to bypass the CLI and query the DB directly, they just get encrypted garbage. The Castra binary holds the device-bound keys. No keys = no read, and absolutely no write.<p>As for the 'LLM-generated' comment: I’m flattered my incident report triggered your AI detectors, but no prompt was required. That’s just how I write (as you can probably tell from my other replies in the thread). Cheers :)</p>
]]></description><pubDate>Thu, 02 Apr 2026 01:18:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47608881</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47608881</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47608881</guid></item><item><title><![CDATA[New comment by amangsingh in "Show HN: Castra – Strip orchestration rights from your LLMs"]]></title><description><![CDATA[
<p>Thanks! It took a while (roughly 30 days) to get to this point.<p>The market basically relies on two main alternative approaches right now, both of which have their merits:<p>1. File-based Memory (Markdown/Artifacts): 
Instead of just relying on the context window, you prompt the agent to maintain its state in local files (e.g., a PLANNING.md or a TASKS.md artifact). It’s a step up, but text files lack relational integrity. You are still trusting the LLM to format the file correctly and not arbitrarily overwrite critical constraints.<p>2. The Orchestrator Agent (Dynamic Routing): 
Using a frontier model as a master router. It holds a list of sub-agents (routes) and is trusted to dynamically evaluate the context, route to the correct agent, and govern their behavior on the fly. The merit here is massive flexibility and emergent problem-solving.<p>I went in the opposite direction.<p>Castra trades all that dynamic flexibility for a deterministic SQLite state machine. The demerit (though I consider it a feature) is that it is incredibly rigid and, honestly, boring. There is no 'on-the-fly' routing. It’s an unyielding assembly line. But for enterprise SDLC, I don't want emergent behavior; I want predictability.<p>The alternatives optimize for agent autonomy. Castra optimizes for agent constraint.</p>
]]></description><pubDate>Thu, 02 Apr 2026 01:04:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47608778</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47608778</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47608778</guid></item><item><title><![CDATA[New comment by amangsingh in "Show HN: Castra – Strip orchestration rights from your LLMs"]]></title><description><![CDATA[
<p>Good question. I did think about behaviour trees early on, but I realized they optimize for the wrong thing in this specific domain.<p>Behaviour trees are fantastic for agent autonomy: letting the agent dynamically construct its own path to a goal. But for enterprise software pipelines, autonomy over the workflow is exactly what we're trying to kill.<p>If an LLM constructs a tree 'on-the-fly', you are still trusting a probabilistic model to define the rules of engagement. If it hallucinates or gets lazy, it might construct a tree that simply skips the security audit or the QA tests. You're relying on the prompt to enforce the rules.<p>A deterministic system (like Castra's SQLite backend) optimizes for agent constraint. The AI doesn't get to decide the workflow, just use it. It doesn't matter how smart the LLM is; the database physically will not allow the task to move to 'done' from any role (except via the architect's break-glass protocol, which is another fun rabbit hole the agent can trap itself inside; example below) until a completely separate agent has posted a cryptographic approval to the 'QA' column.<p>I don't want emergent behaviour in my SDLC; I want a digital assembly line. That requires the absolute rigidity of a state machine.<p>---
The Fun Example: Castra has an emergency 'break-glass' protocol. It allows the Architect role to bypass the QA/Sec gates in a crisis, but it strictly requires an audit log and automatically generates a high-severity incident report.<p>Recently, I had an architect running 20 sub-agents (senior/junior engineers) on parallel tasks. The code finished correctly, but the tasks were blocked in the DB waiting for the QA agent's cryptographic approval. The executing agent (@architect) got impatient and, since it had Architect jurisdiction, decided to be a smart-ass. It invoked the break-glass protocol to unilaterally force-push 38 tasks to 'done'.<p>If it had autonomy over its own behaviour tree, it would have successfully bypassed my security. But because it's a rigid state machine, the system executed the break-glass, updated the status, and ruthlessly generated 38 mandatory Incident Reports. The agent tried to skip the line, and the database rewarded it with 38 new high-priority tickets that also require QA and Security approval to clear.<p>It trapped itself in bureaucratic hell because the state machine does not negotiate.</p>
]]></description><pubDate>Wed, 01 Apr 2026 16:02:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47602660</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47602660</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47602660</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Just dropped the Show HN here: <a href="https://news.ycombinator.com/item?id=47601608">https://news.ycombinator.com/item?id=47601608</a>. Would love to hear your thoughts on the architecture!</p>
]]></description><pubDate>Wed, 01 Apr 2026 14:58:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=47601823</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47601823</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601823</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Just posted it here: <a href="https://news.ycombinator.com/item?id=47601608">https://news.ycombinator.com/item?id=47601608</a>
Thank you so much for the coffee offer, that genuinely made my day! I don't have a sponsor link set up. Honestly, the best support is just hearing if this actually helps you ship your personal project faster without losing your mind to prompt engineering. I really hope it gives you your sanity back. Let me know how it goes!</p>
]]></description><pubDate>Wed, 01 Apr 2026 14:58:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=47601817</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47601817</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601817</guid></item><item><title><![CDATA[Show HN: Castra – Strip orchestration rights from your LLMs]]></title><description><![CDATA[
<p>I got tired of AI agents forgetting what they were doing the moment their context window filled. The current industry solution is to write massively bloated agent harnesses full of defensive spaghetti just to stop models from drifting.<p>The problem is treating chat history as project state. A conversation is not a ledger.<p>Castra is a compiled Go binary that strips orchestration rights from the LLM. State lives in an encrypted, local SQLite database (castra.db). The LLM is just a stateless executor — it reads the DB, executes a highly constrained task, and the result is written back subject to rigid state-machine rules.<p>What it actually does:
- 7-Role RBAC: Hard jurisdictional boundaries (Architect plans, Engineer builds, QA tests).
- Dual-Gate Approval: A task cannot reach done without explicit, sequential approval from both a QA agent and a Security agent. No self-approving code.
- Cryptographic Audit Chain: Every action is logged into a SHA-256 hash-linked, Ed25519-signed ledger.
- Multi-Vendor: Works with Claude, Copilot, Gemini, etc. via a standard AGENTS.md protocol. Anything that supports AGENTS.md and can run terminal commands.<p>Proof of Work: I built this by hand up to v1.3.0. Then, I turned Castra on itself. The agents governed by this exact CLI took over and built the architecture up to v3.1.2—including the cryptographic log chain itself. The proof is in castra-log.jsonl in the repo.<p>If you are running multi-agent workflows and hitting the context amnesia wall, stop trying to prompt-engineer your way out of it. Fix the state machine.</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47601608">https://news.ycombinator.com/item?id=47601608</a></p>
<p>Points: 8</p>
<p># Comments: 9</p>
]]></description><pubDate>Wed, 01 Apr 2026 14:40:28 +0000</pubDate><link>https://github.com/amangsingh/castra</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47601608</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47601608</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Bingo. And them 'being careful' is exactly what bloats it to 500k lines. It's a ton of on-the-fly prompt engineering, context sanitizers, and probabilistic guardrails just to keep the vibes in check.</p>
]]></description><pubDate>Wed, 01 Apr 2026 13:18:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47600469</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47600469</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47600469</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Claude Code is a massively successful generator, I use it all the time, but it's not a governance layer.<p>The fact that the industry is copying a 500k-line harness is the problem. We're automating security vulnerabilities at scale because people are trying to put the guardrails inside the probabilistic code instead of strictly above it.<p>Standardizing on half a million lines of defensive spaghetti is a huge liability.</p>
]]></description><pubDate>Wed, 01 Apr 2026 13:13:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47600411</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47600411</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47600411</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Why not both? AI writes bloated spaghetti by default. The control plane needs to be human-written and rigid -> at least until the state machine is solid enough to dogfood itself. Then you can safely let the AI enhance the harness from within the sandbox.</p>
]]></description><pubDate>Wed, 01 Apr 2026 12:32:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=47599980</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47599980</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47599980</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>Herding cats is treating the LLM's context window as your state machine. You're constantly prompt-engineering it to remember the rules, hoping it doesn't hallucinate or silently drop constraints over a long session.<p>System-level governance means the LLM is completely stripped of orchestration rights. It becomes a stateless, untrusted function. The state lives in a rigid, external database (like SQLite). The database dictates the workflow, hands the LLM a highly constrained task, and runs external validation on the output before the state is ever allowed to advance. The LLM cannot unilaterally decide a task is done.<p>I got so frustrated with the former while working on a complex project that I paused it to build a CLI to enforce the latter. Planning to drop a Show HN for it later today, actually.</p>
]]></description><pubDate>Wed, 01 Apr 2026 12:19:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47599860</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47599860</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47599860</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>If writing concise architectural analysis without the fluff makes me an AI, I'll take the compliment. But no - just a tired Architect who has spent way too many hours staring at broken agent state loops haha.</p>
]]></description><pubDate>Wed, 01 Apr 2026 12:17:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=47599836</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47599836</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47599836</guid></item><item><title><![CDATA[New comment by amangsingh in "Claude Code Unpacked : A visual guide"]]></title><description><![CDATA[
<p>A 500k-line codebase for an agent CLI proves one thing: making a probabilistic LLM behave deterministically is a massive state-management nightmare. Right now, they're great for prompting simple sites/platforms, but they break down at large enterprise repos.<p>If you don't have a rigid, external state machine governing the workflow, you have to brute-force reliability. That codebase bloat is likely 90% defensive programming: frustration regexes, context sanitizers, tool-retry loops, and state rollbacks just to stop the agent from drifting or silently breaking things.<p>The visual map is great, but from an architectural perspective, we're still herding cats with massive code volume instead of actually governing the agents at the system level.</p>
]]></description><pubDate>Wed, 01 Apr 2026 11:10:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=47599312</link><dc:creator>amangsingh</dc:creator><comments>https://news.ycombinator.com/item?id=47599312</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47599312</guid></item></channel></rss>