Hacker News: luodaint

Show HN: Feedjolt, a feedback tool with unlimited seats that is developer friendly

luodaint — Mon, 01 Jun 2026 06:18:18 +0000

We launched Feedjolt last week. It's a feedback tool like Canny, without any pricing surprises.

Comments URL: https://news.ycombinator.com/item?id=48353228

Points: 3

# Comments: 0

Ask HN: Did agentic coding change the way you think about commit granularity?

luodaint — Mon, 25 May 2026 12:17:09 +0000

Jujutsu is trending on the homepage, and the topic is discipline in version control. Six months into working agentically on a daily basis, something changed for me.

When each coding session has one specific aim set before the agent starts (one migration, one service, one abstraction layer), the diff itself is already the unit you can review. The whole commit granularity issue went away. No more interactive rebases for me.

Conversely, coding sessions with too broad a scope (implementing dark mode, or fixing the authentication flow across three subsystems) will always end up with a diff that is hard to read, regardless of commit granularity. A hard-to-read diff is the result of bad scope, not bad commits.

Comments URL: https://news.ycombinator.com/item?id=48265976

Points: 1

# Comments: 0

New comment by luodaint in "Defeating Git Rigour Fatigue with Jujutsu"

luodaint — Mon, 25 May 2026 12:15:05 +0000

Agentic development transforms the scope of work. Once a session is committed to one topic from the get-go (one move, one service, one abstraction), the diff generated from that session is atomic by construction. Committing happens at the session level, and the commit discipline problem solved by Jujutsu does not come into play.

It is also true in reverse. Scopes set too broadly ("dark mode implementation," "auth flow fixes") lead to unreadable diffs no matter what tool you use for version control. An unreadable diff does not stem from commit discipline; it is a scope problem.

That said, this fact does not diminish the usefulness of Jujutsu. There are valid use cases for the rebase and stacking operations. However, the discussion about commit granularity takes on a whole new context once the constraint of having readable commits is established at the scope setting stage.

New comment by luodaint in "Amazon Web Services – Four Years and Out"

luodaint — Mon, 25 May 2026 06:17:26 +0000

For organizations that make this an operational reality rather than a slogan, there is likely to be at least one individual in each product development cycle who has had recent contact with a customer. Not from reading a support transcript. From contact. That direct, uncensored communication is what the principle truly represents.

The recovery from being an orphaned customer account serves as the litmus test. In this case, it took someone who was unique and non-interchangeable and poked "the right bear" -- and it succeeded. But that's precisely the way that enshittification of the principle occurs.

New comment by luodaint in "The Eternal Sloptember"

luodaint — Mon, 25 May 2026 06:15:40 +0000

Six months of production data from one SaaS codebase gives a narrower answer. Maintainability doesn't depend on the level of AI usage. Maintainability depends on the discipline during diff reviews. Good sessions: one topic per session; scope defined before the agent starts; all diffs read before committing. Poor sessions: broad scope; undefined constraints; rubber-stamped results.

The quality of the codebase decays precisely at the rate you stop reading the results. This is not an issue of AI writing the code. This is an issue of unreviewed code. geohot's issue is entirely valid. This problem does exist. But this isn't dependent on the generation phase.

New comment by luodaint in "Constraint Decay: The Fragility of LLM Agents in Back End Code Generation"

luodaint — Mon, 25 May 2026 06:14:50 +0000

The constraints that survived were not the carefully designed ones I set up from the beginning, but short-term ones I came up with after an agent failed in some way: "Validate JWT at the route level, not the component." "Call workspace provisioning on each user creation." Both came from things the agent had done incorrectly.

Aspiration vs. consequence, in other words. An aspiration constraint describes a desired outcome for the system; a consequence constraint maps to a problem already encountered. The agent ignores the former when faced with the path of least resistance, and obeys the latter because it is brief, unambiguous, and precise about preventing that particular failure mode. That distinction, rather than the harness, is what determines whether a constraint survives session rotation.

New comment by luodaint in "GitHub confirms breach of 3,800 repos via malicious VSCode extension"

luodaint — Thu, 21 May 2026 12:58:29 +0000

Your extensions in VSCode have ambient access to your filesystem, your tokens, and your environment. The MCP servers used by tools like Claude Code or Cursor have that ambient access too. That was justified for Nx Console's purposes, and it is justified for a coding agent's filesystem MCP. It is the exact same trust model: install it, it runs, you trust its scope implicitly.

What I ended up changing after thinking this through: all my MCP servers are scripts in my repository, not npm packages. The scopes these servers can use (certain directories, certain tools) are listed explicitly in my context file. Nothing untrusted reaches my filesystem or tokens.
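
One way to make that kind of directory scoping explicit inside a repo-local server script is a path check before any file access. This is a minimal sketch, not the setup described above: `ALLOWED_DIRS`, the `/repo/...` paths, and `is_path_allowed` are hypothetical names chosen for illustration, and it assumes Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical allowlist: the only directories this server may touch.
ALLOWED_DIRS = [Path("/repo/src"), Path("/repo/docs")]


def is_path_allowed(requested: str, allowed_dirs=ALLOWED_DIRS) -> bool:
    """Return True only if the resolved path sits inside an allowed directory."""
    # resolve() collapses ".." segments and follows symlinks, so a traversal
    # attempt is judged by where it actually lands, not by how it is spelled.
    target = Path(requested).resolve()
    return any(target.is_relative_to(base.resolve()) for base in allowed_dirs)
```

A tool handler would call `is_path_allowed(path)` first and refuse the request when it returns False, so the scope lives in code you can read rather than in an installed package.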

VSCode extensions and MCP servers have the same supply chain problem. Very few of the companies that audit their extensions have started auditing their MCP servers.

New comment by luodaint in "Ask HN: How to enforce engineers to understand the code they are shipping"

luodaint — Wed, 20 May 2026 12:13:17 +0000

In any case, once the agent session finishes, the constraint file for the next session is written by the same individual. If that person didn't read every line the agent produced, they will not know what to include in the new constraint file. The agent then makes wrong assumptions about the system, which eventually causes issues the person cannot diagnose, because their own mental model is incorrect.

Diff reading is not a practice forced on developers from above. On the contrary, it is the only way for a developer to stay competent enough to lead the next session properly.

Instead of discussing how to ensure that developers will understand the importance of diff reading, the question here is whether the developers understand they cannot shift the responsibility of creating a mental model of the system away from themselves and still maintain effective control over the agent's behavior.

New comment by luodaint in "Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks"

luodaint — Wed, 20 May 2026 12:12:51 +0000

When it comes to production business logic, this failure type is less obvious than in benchmark tasks. A benchmark has a known answer, which makes mismatches easy to detect. A business logic pipeline does not. If the LLM returns a well-formed output that happens to be semantically incorrect, the pipeline carries on. There is no error to catch.

I built a dedupe pipeline where an LLM decides whether two feature requests are similar enough to merge. It made occasional false-positive mistakes: valid JSON structure, but incorrectly assessed similarity. Retrying didn't help here. The solution was a deterministic gate that validates the model's output against a semantic similarity score calculated separately.

This makes clear why recovery only works with additional tooling when no error is ever raised: the LLM does not recognize that it made a mistake. The guardrail is what is needed for that; retry is just one way of implementing the guardrail concept.
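
The gate idea can be sketched roughly as follows. The comment does not say how the separate score is computed, so this sketch swaps in plain word-overlap (Jaccard) similarity as a stand-in for whatever semantic score a real pipeline would use. The `{"merge": ...}` output shape, `MIN_SIMILARITY`, and the function names are all assumptions made for illustration.

```python
import json

# Hypothetical threshold; a real value would be tuned on labelled pairs.
MIN_SIMILARITY = 0.5


def jaccard_similarity(a: str, b: str) -> float:
    """Stand-in score: word overlap between two requests (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def gate_merge(llm_output: str, request_a: str, request_b: str) -> bool:
    """Accept the model's merge decision only if the independent score agrees."""
    decision = json.loads(llm_output)  # assumed shape: {"merge": true/false}
    if not decision.get("merge", False):
        return False
    # Valid JSON is not enough: the separately computed score must clear the bar.
    return jaccard_similarity(request_a, request_b) >= MIN_SIMILARITY
```

The point of the design is that the check does not ask the model anything: a well-formed but wrong "merge" is rejected by a score the model had no part in producing.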

Ask HN: Parallel agent code writers, how do you stop them from clashing quietly?

luodaint — Tue, 19 May 2026 11:24:38 +0000

It’s getting easier to run two agent sessions in parallel over the same codebase. Keeping them from making inconsistent assumptions, not so much.

My observations: parallel sessions acting on adjacent subsystems won't stay aligned without a common constraint set. The session that assumes the auth invariant will not know that another session just changed a constraint it relies on. The clash won’t manifest at commit time; it will occur at integration time, when the false assumption has already been propagated to three other files.

No approach feels entirely satisfactory. What works for you?

Comments URL: https://news.ycombinator.com/item?id=48191910

Points: 2

# Comments: 0

New comment by luodaint in "Cursor Introduces Composer 2.5"

luodaint — Tue, 19 May 2026 10:53:55 +0000

Benchmarks measure turn-level capabilities: you feed a task into the system and then grade the result. Capability for production-level usage concerns session-level decision making: does the agent know when to stop editing, retain the right amount of context, or go back and reread the file if the state has changed?

This is not a property of the model, but a property of the discipline; it can be operationalized by what you have documented before the session begins. Without "stop editing where you can no longer follow your changes to the spec" and "go back and read the migration file before changing the schema," there is nothing to halt the process until it fails integration.

Teams that get consistent results regardless of the model in use typically do so because they operationalized their discipline first. Those switching out models monthly tend to expect the model to supply it.

New comment by luodaint in "AI is a technology not a product"

luodaint — Mon, 18 May 2026 12:16:35 +0000

If you don't know what exactly the user needs, the AI feature is the pitch itself. "Powered by AI" is something to say when you do not know how to sell the outcome. It's also something to develop when you have not set up the feedback loop to know which outcomes to optimize for.

If the signal is clear – if you have observed the same person facing the same problem in the same workflow – then the AI feature deserves its place in the product by automating one step that they hate. The outcome does not necessarily need to be AI-powered. The user simply stops facing that problem anymore.

Gruber's logic works at the level of the whole product. But there is also a diagnostic implication here: the louder the product sells its AI capabilities, the less the team understands what exactly the product does.

New comment by luodaint in "Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep"

luodaint — Mon, 18 May 2026 11:02:04 +0000

A metric that measures quality beyond simple token count: correction loop frequency.

When grep does not find a relevant file, the agent does not fail; it continues working with incomplete context. For a single-language codebase, the miss rate is acceptable. In a polyglot codebase (Python backend, TypeScript frontend), problems emerge when querying for cross-file dependencies. Grep returns a route from the backend API, but there is a TypeScript interface that needs to match it. The agent generates a response that does not fit the type. That costs one correction cycle; two if the type conflict is ambiguous.

Combining grep with an understanding of the semantic relations between files is a solution. The token savings are real but understate the actual benefit, since fewer correction cycles are worth more than the tokens themselves.

New comment by luodaint in "Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep"

luodaint — Mon, 18 May 2026 11:01:23 +0000

"The problem is that there are several bottlenecks internally," which include the requirements, specs, and testing. Another one not mentioned in the article.

Before implementation got faster, a feature would take six weeks to build. Feedback from the client about how far off target you were arrived on the same timescale: a help desk ticket, a post-call check-in, a quarter-end review. The price you paid for being off target was proportional to how long it took to find out.

Now, when you can ship features in an afternoon, the customer feedback loop remains the same speed. Surveys, help desk tickets, and churn analysis come back days, even weeks later, by which point you've shipped five new features going the same way.

You can fix the internal bottlenecks easily enough: write better specs, have faster test cycles, deploy continuously. The customer feedback loop bottleneck is built into the system. It won't get any faster just because implementation did.

Today most organizations are busy fixing the internal bottleneck, but not the external one.

New comment by luodaint in "MCP Hello Page"

luodaint — Sun, 17 May 2026 15:37:53 +0000

Bearer tokens with mcp-remote are the pragmatic way out of the authentication mess.

Spec-based implementations of OAuth 2.0/2.1, including dynamic client registration and token exchanges, are absolutely necessary at scale but are an enormous barrier to adoption during early deployments. Cookie tricks ignore the spec and produce special cases when client credentials need to be rotated.

Solution in the real world: configure the server to return a WWW-Authenticate header with the value "Bearer" on unauthorized requests, issue narrowly scoped tokens, and leave the rest to mcp-remote. The client authenticates once and sends the bearer token with its requests; the server verifies it against scopes. No dynamic client registration required.

The authentication solution becomes clearer if you distinguish between "what the spec allows" and "what you need in your deployment." Not many servers really require OAuth on day 1.
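
The server side of the flow described above might look roughly like this. It is a framework-free sketch under stated assumptions: `TOKENS`, the `feedback:read` scope name, and `authorize` are hypothetical, the headers are assumed to arrive as a plain dict keyed by `Authorization`, and mcp-remote's own behavior is not modeled. The 401/403 responses and `error` values follow the Bearer token scheme in RFC 6750.

```python
import hmac

# Hypothetical token store: token -> scopes it was issued with.
TOKENS = {"tok-read-only": {"feedback:read"}}


def authorize(headers: dict, required_scope: str):
    """Return (status, response_headers) for a request's auth check."""
    auth = headers.get("Authorization", "")
    scheme, _, presented = auth.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        # No credentials: advertise the Bearer scheme so the client knows what to send.
        return 401, {"WWW-Authenticate": "Bearer"}
    for token, scopes in TOKENS.items():
        # compare_digest avoids leaking token contents through timing differences.
        if hmac.compare_digest(token.encode(), presented.encode()):
            if required_scope in scopes:
                return 200, {}
            return 403, {"WWW-Authenticate": 'Bearer error="insufficient_scope"'}
    return 401, {"WWW-Authenticate": 'Bearer error="invalid_token"'}
```

For example, `authorize({"Authorization": "Bearer tok-read-only"}, "feedback:read")` passes, while the same token asking for a write scope is refused with 403. Nothing here needs client registration; the token and its scopes are the whole contract.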

New comment by luodaint in "Ask HN: Do you still spend time maintaining Claude.md / AGENTS.md files?"

luodaint — Sun, 17 May 2026 15:37:01 +0000

The difference that affected my approach: CLAUDE.md includes three types of content, which function differently.

Facts (directory organization, commands, references to docs) always work. There’s no discussion here.

Constraints related to regression prevention (specifying certain restrictions based on a particular failure) – work consistently if each individual item doesn’t exceed one sentence and describes a failure I’ve already observed. Example: “Validate JWT at the route level, not the component.” It works since the agent was caught doing it incorrectly. “Always call workspace provisioning when creating a user” – ditto.

Behavior rules (rules regarding comments, naming, dos and don'ts): this is where the OP fits in. My observation: such rules work if they are concise, specific, and based on a failure pattern that already exists. They don’t work if created in anticipation of a failure that hasn't happened yet. Rules made from scratch are usually not followed. The 50-line limit is justified for this category.

New comment by luodaint in "Moving away from Tailwind, and learning to structure my CSS"

luodaint — Sun, 17 May 2026 15:36:10 +0000

When an agent writes a component, the styles come along with the JSX. "flex items-center gap-4 bg-purple-600 rounded-lg" doesn’t require any mental context switch to a separate stylesheet file. Custom classes force you into a separate file for each component; utility classes keep the style information close at hand.

The other thing Tailwind prevents: class name bloat. Without it, agents invent classes such as "card-header-inner", "feature-block-content", "sidebar-item-wrapper", each a separate naming choice. After a few months of development, you accumulate hundreds of classes that nobody owns. The constraint Tailwind imposes is its vocabulary; there are no names to invent. The trade-off Julia describes exists. It's just articulated a bit differently.

New comment by luodaint in "AI is making me dumb"

luodaint — Fri, 15 May 2026 12:38:40 +0000

The sessions that held together had something in common: a single concern. 1 session = 1 feature, 1 bug, 1 migration. As soon as a session covers several subsystems, the diff becomes too large to read properly, and proper reading is the only thing keeping the mental model up to date.

Also useful: writing the constraint before the session, not after the failure. "The auth state should be checked on the route level rather than the component" becomes quite clear once you see an agent applying the same rule in three slightly different ways in two files. Writing down the constraint beforehand allows you to detect the violation; rubber-stamping achieves nothing.

What really multiplies in value is not prompting but understanding your own system enough to prompt the agent properly.

New comment by luodaint in "How Claude Code works in large codebases"

luodaint — Fri, 15 May 2026 12:38:12 +0000

After six months of shipping a working SaaS product with Claude Code in production, the only constraints that survived were all under 50 lines: migration generation vs. execution conventions, authentication flow invariants, internationalization setup rules. Anything longer was either selectively ignored or caused confusion at session start.

The important distinction: CLAUDE.md is not there to explain your architecture to the model. Rather, it prevents certain kinds of regression from happening. "Never create a user without calling the workspace provision step" is the right constraint. "This is how our entire system works" is not; the model learns that from the codebase.

The mistake is writing constraints based on an architecture constructed with slop. The sequence is important here.

New comment by luodaint in "If AI writes your code, why use Python?"

luodaint — Fri, 15 May 2026 06:00:29 +0000

The Pydantic-as-code-smell idea hinges on the objective being type safety throughout the codebase. That isn't the aim when an agent writes the majority of the internal logic.

The winning architectural approach: enforcement at the borders, flexibility within. The agent uses Pydantic to validate FastAPI schemas and database models; those are the contracts that need validation. The internal logic the agent produces is verified by line-by-line review, rather than being inferred from type propagation.

That's the right way to do things. It isn't some sort of a compromise. There is a clear boundary between validated "external input" and internal logic. And you aren't counting on type inference to propagate across the codebase. You catch errors at the border, where they come into or out of your codebase.
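
A rough illustration of that border, with stdlib dataclasses standing in for Pydantic so the sketch runs without dependencies. The `CreateUserRequest` fields, `parse_request`, `monthly_cost`, and the price are hypothetical and not from any real codebase; with Pydantic, the checks in `__post_init__` would be expressed as field types and validators instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateUserRequest:
    """Border contract: validated once, when external input enters."""
    email: str
    seats: int

    def __post_init__(self):
        if not isinstance(self.email, str) or "@" not in self.email:
            raise ValueError("email must contain '@'")
        # bool is a subclass of int, so it is excluded explicitly.
        is_int = isinstance(self.seats, int) and not isinstance(self.seats, bool)
        if not is_int or self.seats < 1:
            raise ValueError("seats must be a positive integer")


def parse_request(payload: dict) -> CreateUserRequest:
    """The border: untrusted dict in, validated object out (or a ValueError)."""
    return CreateUserRequest(email=payload.get("email"), seats=payload.get("seats"))


def monthly_cost(request: CreateUserRequest, price_per_seat=12):
    """Internal logic: no re-validation; it trusts the border and is read in review."""
    return request.seats * price_per_seat
```

`monthly_cost(parse_request({"email": "a@b.co", "seats": 3}))` returns 36, while a payload with `"seats": "3"` is rejected at `parse_request` and never reaches the internal function.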

Your criticism of the type system in Python is spot on. The problem is that it is an add-on. It isn't consistent. And a language developed from the ground up for type annotations will do a far better job. However, this isn't the general case for agent-generated codebases.