Hacker News: sluongng

New comment by sluongng in "Google copybara: moving code between repositories"

sluongng — Wed, 01 Jul 2026 17:09:14 +0000

Yeah I vibe coded https://github.com/sluongng/capyfun during a hackathon recently to add a generative transformation layer on top of the traditional imperative transformations.

New comment by sluongng in "Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint"

sluongng — Mon, 18 May 2026 20:12:15 +0000

I saw plenty of cool advancements in reducing inference cold start when I was meeting with folks in person at FOSDEM this year. However, I still struggle to understand: why would folks care about this?

Major AI Labs all have secured their own compute in the form of hardware, data center, and power generation. That means their resource pool is fixed, and they can do all sorts of tricks to pre-load, pre-allocate, etc... to improve on inference latency.

Cold start optimization is usually a solution for "cloud" environments, where your pool is flexible and you only pay for what you use. It is less effective in bare-metal settings, where folks do not care as much about scaling up and down.

So my question is: who is this for? AWS and GCP running Anthropic models?

New comment by sluongng in "Show HN: Git bayesect – Bayesian Git bisection for non-deterministic bugs"

sluongng — Thu, 02 Apr 2026 03:35:18 +0000

You can run bisect with --first-parent.

New comment by sluongng in "Ninja is a small build system with a focus on speed"

sluongng — Mon, 30 Mar 2026 13:27:48 +0000

My teammate had a great time reimplementing Ninja (slop-free) in Go here: https://github.com/buildbuddy-io/reninja to make it even faster with Remote Build Execution.

New comment by sluongng in "Parallel coding agents with tmux and Markdown specs"

sluongng — Tue, 03 Mar 2026 08:32:28 +0000

Yeah, I don't disagree with your assessment at all. I think the H2A ratio is still a good metric for the AI adoption rate of an organization. At a higher H2A ratio, you will also start to hear people measuring things in token volumes, which I think is a similar metric (because most models nowadays run at a relatively fixed tokens/second speed).

None of this is a direct signal of a productivity boost. I think at higher volumes, you will need to start accounting for the "yield" rate of the token volumes above: what share of the tokens makes it to the final production deployment? Which stage constrains the yield? Is it the models, the harness, or something else (e.g. Code Review, CI/CD, Security Scans, etc.)? And then it becomes an optimization problem: reduce the Cost of Goods Sold while improving or maintaining Revenues. "Productivity" will then dissolve into multiple separate but more tangible metrics.
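To make the "yield" idea concrete, here is a minimal sketch that treats the path to production as a funnel of stages and finds the stage with the lowest pass-through. The stage names and every token count are made-up illustrations, not measurements from any real organization.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stage:
    name: str
    tokens_in: int   # tokens entering this stage
    tokens_out: int  # tokens that survive this stage


def stage_yields(stages: List[Stage]) -> Dict[str, float]:
    """Per-stage yield: the fraction of incoming tokens that survive."""
    return {s.name: s.tokens_out / s.tokens_in for s in stages}


def overall_yield(stages: List[Stage]) -> float:
    """Fraction of generated tokens that reach production."""
    return stages[-1].tokens_out / stages[0].tokens_in


def bottleneck(stages: List[Stage]) -> str:
    """Name of the stage with the lowest yield."""
    yields = stage_yields(stages)
    return min(yields, key=yields.get)


# Hypothetical numbers, for illustration only.
PIPELINE = [
    Stage("harness", 1_000_000, 600_000),
    Stage("code_review", 600_000, 300_000),
    Stage("ci_cd", 300_000, 270_000),
    Stage("security_scan", 270_000, 243_000),
]

if __name__ == "__main__":
    print(stage_yields(PIPELINE))
    print("overall:", overall_yield(PIPELINE))
    print("bottleneck:", bottleneck(PIPELINE))
```

With numbers like these, the stage to optimize first is the one `bottleneck` returns, which is not necessarily the model itself.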

New comment by sluongng in "Parallel coding agents with tmux and Markdown specs"

sluongng — Mon, 02 Mar 2026 19:43:31 +0000

I do. The reason the current generation of agents is good at coding is that the labs have had sufficient time and compute to generate synthetic chain-of-thought data and feed that data through RL before using it to train the LLMs. This distillation takes time, and the clock starts at the release of the previous generation of models.

So we are just now getting agents which can reliably loop themselves for medium-sized tasks. This generation opens a new door towards agent-managing-agents chain-of-thought data. I think we will only get multi-agents with high reliability sometime between mid and end of 2026, assuming no major geopolitical disruption.

New comment by sluongng in "Parallel coding agents with tmux and Markdown specs"

sluongng — Mon, 02 Mar 2026 18:09:24 +0000

Yeah, the 8-agent limit aligns well with my conversations with folks at the leading labs

https://open.substack.com/pub/sluongng/p/stages-of-coding-ag...

I think we need very different tooling to go beyond a ratio of 1 human to 10 agents. And much, much different tooling to achieve a higher ratio than that.

New comment by sluongng in "Cracking the Python Monorepo"

sluongng — Sun, 01 Mar 2026 08:09:02 +0000

Most of the time, the CI resources in a Python monorepo are not spent on packaging. They are spent on running the tests.

I would love to read more about how the author is tackling the testing problem in their setup.

New comment by sluongng in "Move tests to closed source repo"

sluongng — Fri, 27 Feb 2026 08:06:19 +0000

https://sluongng.substack.com/i/186718212/test-is-king I wrote about this less than a month ago. Things are moving pretty fast in this direction.

New comment by sluongng in "Show HN: I ported Tree-sitter to Go"

sluongng — Wed, 25 Feb 2026 19:01:02 +0000

Oh, this is really neat for the Bazel community: Gazelle is written in Go, so depending on tree-sitter to build a Gazelle language extension requires you to use CGO.

Now perhaps we can get rid of the CGO dependency and make it pure Go instead. I have pinged some folks to take a look at it.

New comment by sluongng in "Putting Gemini to Work in Chrome"

sluongng — Thu, 29 Jan 2026 09:02:16 +0000

Not yet in Linux?

New comment by sluongng in "Rust at Scale: An Added Layer of Security for WhatsApp"

sluongng — Wed, 28 Jan 2026 12:49:12 +0000

I suspect they just use no_std whenever it's applicable

https://github.com/facebook/buck2/commit/4a1ccdd36e0de0b69ee...

https://github.com/facebook/buck2/commit/bee72b29bc9b67b59ba...

Turns out, if you have strong control over the compiler and linker instrumentation, there are a lot of ways to optimize binary size

New comment by sluongng in "I made my own Git"

sluongng — Tue, 27 Jan 2026 12:24:33 +0000

Zstd dictionary compression is essentially how Meta's Mercurial fork (Sapling VCS) stores blobs https://sapling-scm.com/docs/dev/internals/zstdelta. The source code is available on GitHub if folks want to study the tradeoffs vs Git's delta-compressed packfiles.

I think, theoretically, Git delta-compression is still a lot more optimized for smaller repos. But for bigger repos where sharding storage is required, path-based delta dictionary compression does much better. Git recently (in the last year) got something called "path-walk" which is fairly similar, though.
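As a rough illustration of the dictionary idea, the sketch below compresses a new version of a file using the previous version at the same path as a preset dictionary. It uses zlib's `zdict` parameter from the Python standard library as a stand-in for zstd, so it shows the principle only. It is not Sapling's or Git's actual storage format, and the file contents are made up.

```python
import zlib


def compress_plain(data: bytes) -> bytes:
    """Compress a blob on its own, with no knowledge of earlier versions."""
    return zlib.compress(data, 9)


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Compress a blob using an earlier version as a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Reverse compress_with_dict; the same dictionary is required."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical "previous" and "next" versions of one file at the same path.
OLD = b"".join(
    b"def handler_%d(request):\n    return respond(request, %d)\n" % (i, i)
    for i in range(200)
)
NEW = OLD.replace(b"handler_42(", b"handler_forty_two(")

if __name__ == "__main__":
    print("plain:    ", len(compress_plain(NEW)), "bytes")
    print("with dict:", len(compress_with_dict(NEW, OLD)), "bytes")
```

The tradeoff is visible in `decompress_with_dict`: reading the new blob requires having the dictionary at hand, which is why the choice of dictionary (for example, by path) matters for sharded storage.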

New comment by sluongng in "Transfering Files with gRPC"

sluongng — Mon, 26 Jan 2026 15:36:46 +0000

The evolving schema is much more attractive than a bunch of plain text HTTP headers when you want to communicate additional metadata with the file download/upload.

For example, there is common metadata such as the digest (hash) of the blob, the compression algorithm, the base compression dictionary, whether Reed-Solomon is applicable or not, etc...
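A minimal sketch of what such a metadata record could look like, written as a plain Python dataclass rather than a real protobuf message. All field names here are hypothetical. The later fields carry defaults, which loosely mirrors how new fields can be added to a schema without breaking callers that only know the original ones.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlobMetadata:
    # Fields a first version of the schema might have (names are hypothetical).
    digest_sha256: str
    size_bytes: int
    # Fields added later; defaults keep older callers working unchanged.
    compression: str = "identity"
    dictionary_digest: Optional[str] = None
    reed_solomon: bool = False


def describe_blob(data: bytes, **extra) -> BlobMetadata:
    """Build the metadata for a blob; extra keyword args set the newer fields."""
    return BlobMetadata(
        digest_sha256=hashlib.sha256(data).hexdigest(),
        size_bytes=len(data),
        **extra,
    )


if __name__ == "__main__":
    print(describe_blob(b"hello"))
    print(describe_blob(b"hello", compression="zstd", reed_solomon=True))
```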

And like others have pointed out, having existing gRPC infrastructure in place definitely makes using it a lot easier.

But yeah, it's a tradeoff.

New comment by sluongng in "Transfering Files with gRPC"

sluongng — Mon, 26 Jan 2026 13:56:06 +0000

https://github.com/googleapis/googleapis/blob/master/google/... is a more complete version of this. It supports resumable uploads, and the download can start from an offset within a file, allowing you to download only part of the file instead of the whole.
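The offset-based behavior can be sketched without any network code. The simulation below borrows the parameter names `read_offset`, `read_limit`, `write_offset`, and `finish_write` from the ByteStream API, but it is an in-memory stand-in under simplified assumptions, not the gRPC service. The `ResumableUpload` class and the `chunk_size` parameter are hypothetical.

```python
from typing import Iterator


def read_range(blob: bytes, read_offset: int = 0, read_limit: int = 0,
               chunk_size: int = 4) -> Iterator[bytes]:
    """Yield chunks starting at read_offset; read_limit == 0 means no limit."""
    if read_offset < 0 or read_offset > len(blob):
        raise ValueError("read_offset out of range")
    end = len(blob) if read_limit == 0 else min(len(blob), read_offset + read_limit)
    for start in range(read_offset, end, chunk_size):
        yield blob[start:min(start + chunk_size, end)]


class ResumableUpload:
    """Server-side state for one upload: bytes committed so far."""

    def __init__(self) -> None:
        self.committed = bytearray()
        self.finished = False

    def committed_size(self) -> int:
        """What a client asks after a dropped connection, to know where to resume."""
        return len(self.committed)

    def write(self, write_offset: int, data: bytes, finish_write: bool = False) -> None:
        if self.finished:
            raise ValueError("upload already finished")
        if write_offset != len(self.committed):
            raise ValueError("write_offset must equal the committed size")
        self.committed += data
        self.finished = finish_write


if __name__ == "__main__":
    print(b"".join(read_range(b"0123456789", read_offset=3, read_limit=5)))
    upload = ResumableUpload()
    upload.write(0, b"hello ")
    # Pretend the connection dropped here; resume from the committed size.
    upload.write(upload.committed_size(), b"world", finish_write=True)
    print(bytes(upload.committed))
```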

Another version of this is to use gRPC to communicate the "metadata" of a download file, and then "side" load the file using a side channel with HTTP (or some other lightweight copy method). GitLab uses this to transfer Git packfiles and serve git fetch requests iirc https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/sidec...

Post-Agentic Code Forges

sluongng — Sat, 24 Jan 2026 08:10:08 +0000

Article URL: https://sluongng.substack.com/p/post-agentic-code-forges

Comments URL: https://news.ycombinator.com/item?id=46741909

Points: 1

# Comments: 0

New comment by sluongng in "A faster path to container images in Bazel"

sluongng — Thu, 25 Dec 2025 09:28:28 +0000

The underlying problem is that most container images are not cache efficient. Compressed tarballs aren't, and that's what most container images are. And Bazel relies heavily on caching to stay fast.

Most of the hyperscalers actually do not store container images as tarballs at scale. They usually flatten the layers and either cache the entire file system Merkle tree or break it down into even smaller blocks to cache them efficiently. See Alibaba Dragonfly Nydus, AWS Firecracker, etc… There are also various forms of snapshotters that can lazily materialize the layers, like estargz, soci, nix, etc… but none of them are widely adopted.
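A small sketch of why a compressed tarball caches poorly compared with per-file content addressing. It builds a deterministic gzip tarball for two versions of a toy layer that differ in one file, then compares the single tarball digest against per-file digests. The file names and contents are made up, and real systems chunk and index far more carefully than this.

```python
import gzip
import hashlib
import io
import tarfile
from typing import Dict


def build_layer_tarball(files: Dict[str, bytes]) -> bytes:
    """Build a deterministic gzip-compressed tarball from path -> content."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as tar:
        for path in sorted(files):
            info = tarfile.TarInfo(name=path)  # mtime defaults to 0
            info.size = len(files[path])
            tar.addfile(info, io.BytesIO(files[path]))
    return gzip.compress(raw.getvalue(), mtime=0)


def file_digests(files: Dict[str, bytes]) -> Dict[str, str]:
    """Content address of each file, independent of the others."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def cache_hits(old: Dict[str, bytes], new: Dict[str, bytes]) -> int:
    """How many files in `new` are already cached from `old`, by digest."""
    cached = set(file_digests(old).values())
    return sum(1 for digest in file_digests(new).values() if digest in cached)


# Hypothetical layer contents: only app/main.py changes between versions.
V1 = {"app/main.py": b"print('v1')\n", "lib/a.py": b"A = 1\n", "lib/b.py": b"B = 2\n"}
V2 = dict(V1, **{"app/main.py": b"print('v2')\n"})

if __name__ == "__main__":
    same = hashlib.sha256(build_layer_tarball(V1)).digest() == \
        hashlib.sha256(build_layer_tarball(V2)).digest()
    print("tarball digest unchanged:", same)
    print("per-file cache hits:", cache_hits(V1, V2), "of", len(V2))
```

One changed file invalidates the whole tarball as a cache entry, while per-file (or per-block) digests let the unchanged content be reused.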

New comment by sluongng in "Fast trigram based code search"

sluongng — Fri, 05 Dec 2025 14:17:54 +0000

> They use Google's web indexing technology adapted for trigrams, which was mostly developed to support their massive internal monorepo

Do you have a source for this? I would love to read more about it.

In the doc of the Zoekt repo, it says

> What does cs.bazel.build run on?

> Currently, it runs on a single Google Cloud VM with 16 vCPUs, 60G RAM and an attached physical SSD.

https://github.com/sourcegraph/zoekt/blob/main/doc/faq.md#wh...

so at least they were using Zoekt up until a certain point in the past.
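For readers unfamiliar with the technique in the thread title, here is a minimal sketch of a trigram index: each document is split into overlapping 3-character substrings, a query is narrowed to the documents that contain all of its trigrams, and the candidates are then verified with a plain substring check. This is a toy under simplified assumptions, not how Zoekt or Google's code search is implemented. The class and file names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def trigrams(text: str) -> Set[str]:
    """All overlapping 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self) -> None:
        self.postings: Dict[str, Set[str]] = defaultdict(set)  # trigram -> doc names
        self.docs: Dict[str, str] = {}

    def add(self, name: str, content: str) -> None:
        self.docs[name] = content
        for gram in trigrams(content):
            self.postings[gram].add(name)

    def search(self, query: str) -> List[str]:
        grams = trigrams(query)
        if not grams:
            # Queries shorter than 3 characters cannot use the index.
            candidates = set(self.docs)
        else:
            candidates = set.intersection(
                *(self.postings.get(gram, set()) for gram in grams)
            )
        # Sharing all trigrams is necessary but not sufficient, so verify.
        return sorted(name for name in candidates if query in self.docs[name])


if __name__ == "__main__":
    index = TrigramIndex()
    index.add("a.go", 'func main() { fmt.Println("hi") }')
    index.add("b.go", "func helper() {}")
    print(index.search("func"))
    print(index.search("Println"))
```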

New comment by sluongng in "Introducing architecture variants"

sluongng — Fri, 31 Oct 2025 17:08:59 +0000

Nice. This is one of the main reasons why I picked CachyOS recently. Now I can fall back to Ubuntu if CachyOS gets me stuck somewhere.

New comment by sluongng in "Modern CI is too complex and misdirected (2021)"

sluongng — Wed, 20 Aug 2025 11:59:08 +0000

The most concerning part about modern CI to me is how most of it is running on GitHub Actions, and how GitHub itself has been deprioritizing GitHub Actions maintenance and improvements over AI features.

Seriously, take a look at their pinned repo: https://github.com/actions/starter-workflows

> Thank you for your interest in this GitHub repo, however, right now we are not taking contributions.

> We continue to focus our resources on strategic areas that help our customers be successful while making developers' lives easier. While GitHub Actions remains a key part of this vision, we are allocating resources towards other areas of Actions and are not taking contributions to this repository at this time.