Hacker News: rs545837

New comment by rs545837 in "Ratty – A terminal emulator with inline 3D graphics"

rs545837 — Mon, 11 May 2026 14:48:33 +0000

Damn this was really fun to use.

New comment by rs545837 in "Vera: a programming language designed for machines to write"

rs545837 — Thu, 30 Apr 2026 03:23:22 +0000

I agree 100% with this approach; I've been working in this domain for quite a few months now.

The right granularity for agents isn't files or lines, it's entities: functions, classes, methods. That's how both humans and agents actually think about code.

We built sem (Ataraxy-Labs/sem), which extracts entities from 30+ languages via tree-sitter and builds a cross-file dependency graph, giving us semantic version control and semantic diff. weave (same org) takes it further and does git merges at the entity level: it matches functions by name and merges their bodies independently.
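To make "matches functions by name, merges their bodies independently" concrete, here is a minimal sketch of a three-way merge keyed by entity name. It is not weave's actual implementation: the `merge_entities` helper and the toy entity dictionaries are hypothetical, and bodies are treated as opaque strings rather than parsed code.

```python
from typing import Dict, List, Tuple

Entities = Dict[str, str]  # entity name -> body text (hypothetical representation)


def merge_entities(base: Entities, ours: Entities, theirs: Entities) -> Tuple[Entities, List[str]]:
    """Three-way merge per entity name; returns (merged entities, conflicting names)."""
    merged: Entities = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree (same edit, or both deleted)
            result = o
        elif o == b:      # only "theirs" touched this entity
            result = t
        elif t == b:      # only "ours" touched this entity
            result = o
        else:             # both sides changed it differently
            conflicts.append(name)
            result = o    # keep "ours" as a placeholder for manual resolution
        if result is not None:
            merged[name] = result
    return merged, conflicts


base = {"login": "return check(pw)", "logout": "clear()"}
ours = {"login": "return check(hash(pw))", "logout": "clear()"}   # edits login
theirs = {"login": "return check(pw)", "logout": "clear()", "refresh": "renew()"}  # adds refresh

merged, conflicts = merge_entities(base, ours, theirs)
print(merged)
print(conflicts)
```

In this toy case one side edits `login` and the other adds `refresh`, so the merge is clean even if both edits landed in adjacent lines of the same file.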

The dependency graph also answers questions LLMs can't. I love the analysis based on ASTs.

New comment by rs545837 in "A type-safe, realtime collaborative Graph Database in a CRDT"

rs545837 — Thu, 23 Apr 2026 17:37:58 +0000

Yeah, definitely. Plastic SCM has always been my inspiration; I'm just trying to revive it.

New comment by rs545837 in "A type-safe, realtime collaborative Graph Database in a CRDT"

rs545837 — Tue, 21 Apr 2026 16:59:02 +0000

Oh this is cool. The Yjs as storage backend trick is clever, you basically get CRDT sync for free without having to build your own replication layer. And the pluggable storage means you can develop against in-memory and then flip to YGraph for collab mode without touching your queries. That's a nice developer experience.

The live queries also caught my eye. Having traversals auto reexecute when data changes sounds straightforward until you realize the underlying data is being merged from multiple peers concurrently. Getting that right without stale reads or phantom edges is genuinely hard.

I've been researching something like this in a similar space but for source code, and built a tool called Weave (https://github.com/Ataraxy-Labs/weave) for entity-level merges in git. Instead of merging lines of text, it extracts functions, classes, and methods, builds a dependency graph between them, and merges at that level.

Seeing codemix makes me think there might be something interesting here. Right now our entity graph and our CRDT state are two separate things. The graph lives in our analysis engine and the CRDT lives in a different crate. If something like @codemix/graph could unify those, you'd have a single data structure where the entity dependency graph is the CRDT.

New comment by rs545837 in "jj – the CLI for Jujutsu"

rs545837 — Wed, 15 Apr 2026 04:59:58 +0000

Think two agents working on the same codebase at the same time. Agent A is refactoring the auth module, Agent B is adding a new API endpoint that imports from auth. Separate worktrees, separate branches, but they're touching overlapping code.

Single agent per feature works great today. But as agents get faster and cheaper, the bottleneck shifts to how many agents can work on one repo simultaneously without stepping on each other.

New comment by rs545837 in "jj – the CLI for Jujutsu"

rs545837 — Tue, 14 Apr 2026 16:54:24 +0000

I agree that the PR is the meaningful unit for shipping, but I'd push back gently: for agents working in parallel, the commit/changeset level matters more than it used to, because agents don't coordinate the way humans do. Multiple agents touching the same repo need finer-grained units of change than "the whole PR."

New comment by rs545837 in "jj – the CLI for Jujutsu"

rs545837 — Tue, 14 Apr 2026 14:57:21 +0000

jj is genuinely great and I think it deserves way more adoption than it has right now. The mental model is so much cleaner than git, undo actually works the way you'd expect it to, and working with stacked changes feels natural instead of that constant low-grade anxiety of actually breaking something. It's probably the best frontend for version control that exists today.

For the last few months though I've been thinking a lot about what you said at the end there. What if version control actually understood the code it was tracking, not as lines of text but as the actual structures we write and think in, functions, classes, methods, the real building blocks? A rename happening on one branch and an unrelated function addition on another aren't a real conflict in any meaningful sense, they only look like one because every tool we have today treats source code as flat text files.

To build this kind of structural intelligence I started working on https://github.com/ataraxy-labs/sem, which uses tree-sitter to parse code into semantic entities and operates at that level instead of lines. Once you stop thinking of code as text, another dimension opens up: even a lot of compiler-level logic, such as call graphs, becomes useful.

New comment by rs545837 in "GitHub Stacked PRs"

rs545837 — Tue, 14 Apr 2026 05:13:41 +0000

This is awesome, honestly. Stacked PRs are one of those features that feel obvious in hindsight. Breaking an n-line PR into 3 focused layers where each one is independently reviewable is a huge win for both the author and reviewer. The native GitHub UI with the stack navigator is the right call too, and there's no reason this should require a third-party tool.

One thing I keep thinking about in this same direction: even within a single layer of a stack, line-level diffs are still noisy. You rename a function and update x call sites, the diff shows y changed lines. A reviewer has to mentally reconstruct "oh this is just a rename" from raw red/green text.

Semantic diffing (showing which functions, classes, methods were added/modified/deleted/moved) would pair really well with stacks. Each layer of the stack becomes even easier to review when the diff tells you "modified function X, added function Y" instead of just showing changed lines.
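As a rough illustration of what a "this is just a rename" report could look like, here is a sketch that uses Python's standard `ast` module as a stand-in for tree-sitter. It is not how sem works internally: the `function_bodies` and `semantic_diff` helpers are hypothetical, only top-level Python functions are handled, and a rename is assumed whenever a removed and an added function have identical bodies.

```python
import ast
from typing import Dict, List


def function_bodies(source: str) -> Dict[str, str]:
    """Map each top-level function name to a whitespace-insensitive dump of its body."""
    return {
        node.name: "".join(ast.dump(stmt) for stmt in node.body)
        for node in ast.parse(source).body
        if isinstance(node, ast.FunctionDef)
    }


def semantic_diff(old_src: str, new_src: str) -> List[str]:
    """Describe changes as entity-level events instead of changed lines."""
    old, new = function_bodies(old_src), function_bodies(new_src)
    removed, added = set(old) - set(new), set(new) - set(old)
    changes = []
    for name in sorted(removed):
        # Same body under a new name is reported as a rename.
        match = next((a for a in sorted(added) if new[a] == old[name]), None)
        if match is not None:
            changes.append(f"renamed function {name} -> {match}")
            added.discard(match)
        else:
            changes.append(f"deleted function {name}")
    changes += [f"added function {n}" for n in sorted(added)]
    changes += [f"modified function {n}" for n in sorted(set(old) & set(new)) if old[n] != new[n]]
    return changes


old_src = "def get_usr(id):\n    return db[id]\n\ndef total(xs):\n    return sum(xs)\n"
new_src = (
    "def get_user(id):\n    return db[id]\n\n"
    "def total(xs):\n    return sum(xs) + 0\n\n"
    "def mean(xs):\n    return total(xs) / len(xs)\n"
)
changes = semantic_diff(old_src, new_src)
print("\n".join(changes))
```

A real tool would also need to handle methods, classes, moved entities, and renames combined with edits; this only shows the shape of the output a reviewer would see.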

I've been researching something in this direction, https://ataraxy-labs.github.io/sem/. It does entity-level diffs, blame, and impact analysis. Would love to see forges like GitHub move in this direction natively. Stacked PRs solve the "too much at once" problem. Semantic diffs solve the "what actually changed" problem. Together they'd make code review dramatically better.

Structural and semantic component for improving code reviews with local models

rs545837 — Wed, 08 Apr 2026 06:44:42 +0000

I was curious about improving code reviews because they still suck, so I've been researching a triage layer that you can attach to your local LLMs/API calls for better code reviews.

Most review tools dump a PR diff into a model and hope it finds bugs. The model sees added/removed lines, hunk headers, context lines. It has no idea that the function it's looking at is called by x other functions across y files, or that a type change here breaks an interface three directories away.

The triage layer parses source code into ASTs using tree-sitter, extracts semantically meaningful entities (functions, classes, methods, structs), and builds a cross-file dependency graph. It ranks every changed entity by transitive blast radius. That cuts the review surface by 80-90% and significantly increases the attention score on the bug. I'm sure it will be out of distribution occasionally, but for fast code reviews this tradeoff is worth making.
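Ranking by transitive blast radius can be sketched as a breadth-first walk over a reverse dependency graph. This is an illustrative toy, not inspect's scoring: the `dependents` map, the entity names, and the plain count used as a score are all assumptions.

```python
from collections import deque
from typing import Dict, List


def blast_radius(dependents: Dict[str, List[str]], entity: str) -> int:
    """Count the entities that depend on `entity`, directly or transitively."""
    seen = {entity}
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return len(seen) - 1  # exclude the entity itself


# Hypothetical reverse dependency graph: entity -> entities that call/use it.
dependents = {
    "parse_token": ["authenticate"],
    "authenticate": ["login_handler", "api_middleware"],
    "api_middleware": ["list_users"],
    "format_date": [],
}

changed = ["format_date", "parse_token", "api_middleware"]
ranked = sorted(changed, key=lambda e: blast_radius(dependents, e), reverse=True)
print(ranked)  # riskiest first: ['parse_token', 'api_middleware', 'format_date']
```

Only the top of the ranked list would then be handed to the model for review.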

Once you've narrowed the problem to "here are the n riskiest entities in this PR," you don't need a frontier model. You need a model that just knows your code. A 7B fine-tuned on your codebase knows your patterns, your conventions, your common bugs. Structural triage handles the global reasoning, which leaves your model free to handle the judgment call really well.

Commands:

- inspect diff - entity-level diff with risk scoring and blast radius

- inspect predict - show which unchanged entities are at risk of breaking

- inspect review - structural triage + LLM review

- inspect pr - review a GitHub PR

21 language parsers. Written in Rust. Open source.

GitHub: https://github.com/Ataraxy-Labs/inspect

Comments URL: https://news.ycombinator.com/item?id=47686246

Points: 2

# Comments: 0

New comment by rs545837 in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

rs545837 — Thu, 02 Apr 2026 05:49:38 +0000

One cheap optimization for the compile overhead case: skip commits that only touch files unrelated to the failing test. If you know the test's dependency chain, any commit that doesn't touch that chain gets prior weight zero. Equivalent to git bisect skip but automatic. Cuts the search space before you compile anything.
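A minimal sketch of that filter, assuming you already have the set of files in the test's dependency chain. The `prior_weights` helper, the commit tuples, and the file paths are hypothetical; this is not bayesect's API.

```python
from typing import Dict, List, Set, Tuple


def prior_weights(commits: List[Tuple[str, Set[str]]], test_deps: Set[str]) -> Dict[str, float]:
    """Give zero prior to commits that touch nothing in the test's dependency chain."""
    raw = {sha: (1.0 if files & test_deps else 0.0) for sha, files in commits}
    total = sum(raw.values())
    if total == 0:  # nothing matched: fall back to a uniform prior
        return {sha: 1.0 / len(commits) for sha, _ in commits}
    return {sha: w / total for sha, w in raw.items()}


# Hypothetical history: (commit sha, files touched).
commits = [
    ("a1", {"docs/readme.md"}),
    ("b2", {"src/auth.py"}),
    ("c3", {"src/ui.py"}),
    ("d4", {"src/auth.py", "src/db.py"}),
]
test_deps = {"src/auth.py", "src/db.py"}  # files the failing test depends on

priors = prior_weights(commits, test_deps)
print(priors)  # a1 and c3 get 0.0, b2 and d4 get 0.5 each
```

One caveat with a hard zero: if the dependency chain is incomplete, the true culprit can never be recovered, so a small nonzero floor may be safer in practice.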

New comment by rs545837 in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

rs545837 — Thu, 02 Apr 2026 05:45:19 +0000

This is a real pain point. One thing that helps: when an LLM agent makes changes across multiple commits, look at what it actually touched structurally. Often the agent adds a feature in commit 5 but subtly breaks something in commit 3 by changing a shared function it didn't fully understand.

New comment by rs545837 in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

rs545837 — Thu, 02 Apr 2026 05:41:00 +0000

You're right, at 300 tests bayesect converges to ~97-100% across the board. I reran with calibration.py and confirmed.

Went a step further and tested graph-weighted priors (per-commit weight proportional to transitive dependents, Pareto-distributed). The prior helps in the budget-constrained regime:

128 commits, 500 trials:

Budget=50, 70/30: uniform 22% → graph 33%

Budget=50, 80/20: uniform 71% → graph 77%

Budget=100, 70/30: uniform 56% → graph 65%

At 300 tests the gap disappears since there's enough data to converge anyway. The prior is worth a few bits, which matters when bits are scarce.
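For readers who want the shape of a graph-weighted prior, here is a small sketch where each commit's weight is proportional to its number of transitive dependents. The linked script is the authoritative version; the `graph_prior` helper, the `+1` smoothing term, and the counts below are assumptions made for illustration.

```python
from typing import Dict


def graph_prior(dependent_counts: Dict[str, int], smoothing: float = 1.0) -> Dict[str, float]:
    """Normalize per-commit weights proportional to transitive dependents.

    `smoothing` (an assumption, not from the original script) keeps commits
    that touch isolated code from getting exactly zero prior mass.
    """
    weights = {sha: count + smoothing for sha, count in dependent_counts.items()}
    total = sum(weights.values())
    return {sha: w / total for sha, w in weights.items()}


# Hypothetical counts of transitive dependents for the entities each commit changed.
dependent_counts = {"a1": 0, "b2": 3, "c3": 0, "d4": 11}

prior = graph_prior(dependent_counts)
for sha, p in prior.items():
    print(f"{sha}: {p:.3f}")
```

The resulting distribution replaces the uniform prior before any test is run; the posterior updates then proceed as usual.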

Script: https://gist.github.com/rs545837/b3266ecf22e12726f0d55c56466...

New comment by rs545837 in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

rs545837 — Wed, 01 Apr 2026 23:08:32 +0000

Yes, bayesect accuracy increases with more iterations. The comparison I ran was at a fixed budget (300 test runs). Sorry, I should have been clearer about that.

New comment by rs545837 in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

rs545837 — Wed, 01 Apr 2026 22:28:23 +0000

Really fun work, and the writeup on the math is great. The Beta-Bernoulli conjugacy trick making the marginal likelihood closed-form is elegant.
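For reference, the standard Beta-Bernoulli result looks like this (the writeup's exact parameterization may differ). With a Beta(a, b) prior on the failure probability θ and k failures observed in n runs, the marginal likelihood of the observed sequence integrates out in closed form:

```latex
p(x_{1:n})
  = \int_0^1 \theta^{k} (1-\theta)^{n-k}
      \, \frac{\theta^{a-1} (1-\theta)^{b-1}}{B(a, b)} \, d\theta
  = \frac{B(a + k,\; b + n - k)}{B(a, b)}
```

Here B is the Beta function, so no numerical integration over θ is needed when scoring each candidate commit.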

We ran benchmarks comparing bisect vs bayesect across flakiness levels. At 90/10, bisect drops to ~44% accuracy while bayesect holds at ~96%. At 70/30 it's 9% vs 67%. The entropy-minimization selection is key here since naive median splitting converges much slower.

One thing we found: you can squeeze out another 10-15% accuracy by weighting the prior with code structure. Commits that change highly-connected functions (many transitive dependents in the call graph) are more likely culprits than commits touching isolated code. That prior is free, zero test runs needed.

Information-theoretically, the structural prior gives you I_prior bits before running any test, reducing the total tests needed from log2(n)/D_KL to (log2(n) - I_prior)/D_KL. On 1024-commit repos with 80/20 flakiness: 92% accuracy with graph priors vs 85% pure bayesect vs 10% git bisect.
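A quick worked version of that arithmetic, under assumptions that are mine rather than from the benchmark: "80/20" is read as the test failing with probability 0.8 after the bad commit and 0.2 before it, I_prior is taken as log2(n) minus the entropy of the prior, and the prior is a made-up one that spreads all mass evenly over 256 of the 1024 commits.

```python
import math
from typing import List


def kl_bernoulli_bits(p: float, q: float) -> float:
    """KL divergence D(Bernoulli(p) || Bernoulli(q)) in bits."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))


def prior_information_bits(prior: List[float]) -> float:
    """Bits gained over a uniform prior: log2(n) minus the prior's entropy."""
    entropy = -sum(p * math.log2(p) for p in prior if p > 0)
    return math.log2(len(prior)) - entropy


n = 1024
d_kl = kl_bernoulli_bits(0.8, 0.2)            # about 1.2 bits per test run
prior = [1 / 256] * 256 + [0.0] * 768         # hypothetical structural prior
i_prior = prior_information_bits(prior)       # 2 bits

tests_uniform = math.log2(n) / d_kl                 # about 8.3 test runs
tests_with_prior = (math.log2(n) - i_prior) / d_kl  # about 6.7 test runs

print(f"D_KL = {d_kl:.2f} bits, I_prior = {i_prior:.2f} bits")
print(f"tests: {tests_uniform:.2f} -> {tests_with_prior:.2f}")
```

Under these toy numbers a 2-bit prior saves roughly a fifth of the test runs, which is consistent with the prior mattering most when the test budget is tight.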

We're building this into sem (https://github.com/ataraxy-labs/sem), which has an entity dependency graph that provides the structural signal.

New comment by rs545837 in "More on Version Control"

rs545837 — Mon, 30 Mar 2026 14:57:15 +0000

Usually the whole discussion has been around line-level vs commit-level history, but there's a layer nobody's talking about, and I've been exploring it lately with https://github.com/Ataraxy-Labs/sem. It gives you entity-level version control: it parses your code into functions, classes, and methods using tree-sitter (12 languages so far), computes a structural hash for each entity, and builds a cross-file dependency graph. So sem diff HEAD~1 doesn't give you "+3 -2 in tax.py", it gives you "calculate_tax signature changed, 47 dependents, 3 callers will break". The key insight is distinguishing signature changes from body changes.
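The signature-versus-body distinction can be sketched with two separate structural hashes per entity. This uses Python's standard `ast` module in place of tree-sitter and is not sem's hashing scheme: the `entity_hashes` and `classify` helpers are hypothetical and cover only top-level Python functions.

```python
import ast
import hashlib
from typing import Dict, Tuple


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()[:12]


def entity_hashes(source: str) -> Dict[str, Tuple[str, str]]:
    """Map each top-level function to (signature hash, body hash).

    ast.dump omits line/column info by default, so formatting and comments
    do not affect either hash.
    """
    hashes = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            signature = node.name + ast.dump(node.args)
            if node.returns is not None:
                signature += ast.dump(node.returns)
            body = "".join(ast.dump(stmt) for stmt in node.body)
            hashes[node.name] = (_digest(signature), _digest(body))
    return hashes


def classify(old_src: str, new_src: str) -> Dict[str, str]:
    """Label each function present in both versions."""
    old, new = entity_hashes(old_src), entity_hashes(new_src)
    labels = {}
    for name in sorted(set(old) & set(new)):
        if old[name][0] != new[name][0]:
            labels[name] = "signature changed"   # callers may break
        elif old[name][1] != new[name][1]:
            labels[name] = "body changed"        # callers keep compiling
        else:
            labels[name] = "unchanged"
    return labels


old_src = (
    "def calculate_tax(amount):\n    return amount * 0.2\n\n"
    "def fee(x):\n    return x + 1\n\n"
    "def round_up(x):\n    return int(x) + 1\n"
)
new_src = (
    "def calculate_tax(amount, region):\n    return amount * 0.2\n\n"
    "def fee(x):\n    return x + 2\n\n"
    "def round_up(x):\n    # reformatted only\n    return int(x)  +  1\n"
)
changes = classify(old_src, new_src)
print(changes)
```

A signature change is the case where the dependency graph matters most, since every caller has to be checked; a body-only change needs behavioral review but leaves call sites intact.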

New comment by rs545837 in "Improved Git Diffs with Delta, Fzf and a Little Shell Scripting"

rs545837 — Sat, 28 Mar 2026 17:51:04 +0000

You can, but it's slow, expensive, and hallucinates. An LLM looking at a raw diff might miss a renamed function or invent a dependency that doesn't exist. sem does it structurally: parses both sides with tree-sitter, computes structural hashes, walks the real dependency graph. If you want to layer an LLM on top for summarization, you're feeding it 10 entities instead of 500 lines of unified diff.

New comment by rs545837 in "Improved Git Diffs with Delta, Fzf and a Little Shell Scripting"

rs545837 — Sat, 28 Mar 2026 17:49:44 +0000

Right, sem gives you both. sem diff --verbose shows the full before/after body of each changed entity. The entity-level view tells you what changed and what's affected. The line-level detail is still there when you need it.

New comment by rs545837 in "Improved Git Diffs with Delta, Fzf and a Little Shell Scripting"

rs545837 — Sat, 28 Mar 2026 16:45:31 +0000

We've been building an open source tool called sem (https://github.com/ataraxy-labs/sem) that takes this one level further: entity-level diffs instead of AST-level.

Instead of showing you which syntax nodes changed, it shows you which functions, classes, and methods changed, classifies the change (text-only, syntax, functional), and walks a dependency graph to tell you the blast radius.

The delta + difftastic integration problem in that issue is interesting because sem already has the pieces both sides need, before/after content with full context for every changed entity, plus structured JSON output. The blocker in #535 is that difftastic's JSON doesn't include surrounding context. sem's output includes complete entity bodies by default.

Would love to collaborate on a common interchange format if anyone from the delta or difftastic projects is interested. Entity-level granularity sits naturally above AST-level diffs and below file-level diffs, and having a standard way to represent "what changed and what depends on it" would be useful for the whole ecosystem.

What if CRDTs for version control worked at entity granularity instead of lines?

Inspect – Semantic code review. Entity graphs + LLMs