<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: stillsut</title><link>https://news.ycombinator.com/user?id=stillsut</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 05:58:16 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=stillsut" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by stillsut in "Microsoft Amplifier"]]></title><description><![CDATA[
<p>Exactly what I was looking for, thanks.<p>I've been doing something similar: aider+gpt-5, claude-code+sonnet, gemini-cli+2.5-pro. I want to try coder-cli next.<p>A main problem with this approach is summarizing the different approaches before drilling down into reviewing the best one.<p>Looking at a `git diff --stat` across all the model outputs can give you a good measure of whether there was an existing common pattern for your requested implementation. If only one of the models adds code to a module that the others do not, that's usually a good jumping-off point for exploring the differing assumptions each of the agents built towards.</p>
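A rough sketch of that triage step in Python (the `main` base branch and the worktree layout here are assumptions, not part of any particular tool): collect the changed files per agent worktree, then flag anything only one agent touched.

```python
import subprocess
from collections import defaultdict

def touched_files(worktree, base="main"):
    """Files changed in `worktree` relative to the assumed base branch."""
    out = subprocess.run(
        ["git", "-C", worktree, "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return set(out.stdout.split())

def outliers(changes):
    """Given {agent: set_of_changed_files}, return the files only one
    agent touched -- good starting points for reviewing assumptions."""
    counts = defaultdict(list)
    for agent, files in changes.items():
        for f in files:
            counts[f].append(agent)
    return {f: agents[0] for f, agents in counts.items() if len(agents) == 1}
```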
]]></description><pubDate>Sat, 11 Oct 2025 22:35:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=45553244</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45553244</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45553244</guid></item><item><title><![CDATA[New comment by stillsut in "Microsoft Amplifier"]]></title><description><![CDATA[
<p>I've actually written my own homebrew framework like this which is a.) cli-coder agnostic and b.) leans heavily on git worktrees [0].<p>The secret weapon of this approach is asking for 2-4 solutions to your prompt, running in parallel. This helps avoid the most time-consuming aspect of ai-coding: reviewing a large commit and ultimately finding that the approach the AI took is hopeless or requires major revision.<p>By generating multiple solutions, you can avoid investing fully in the first solution, instead using quick heuristics to select from the 2-4 candidates and usually applying a small tweak at the end. Anyone else doing something like this?<p>[0]: <a href="https://github.com/sutt/agro" rel="nofollow">https://github.com/sutt/agro</a></p>
]]></description><pubDate>Sat, 11 Oct 2025 20:06:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45552253</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45552253</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45552253</guid></item><item><title><![CDATA[New comment by stillsut in "Sampling and structured outputs in LLMs"]]></title><description><![CDATA[
<p>I think the easiest explanation is to look at the table here: <a href="https://github.com/sutt/innocuous?tab=readme-ov-file#how-it-works" rel="nofollow">https://github.com/sutt/innocuous?tab=readme-ov-file#how-it-...</a><p>Watch how the "Cumulative encoding" row grows each iteration (that's where the BTC address will be encoded) and then look at the other rows for how the algorithm arrives at that.<p>Thanks for checking it out!</p>
]]></description><pubDate>Tue, 30 Sep 2025 14:30:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45425937</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45425937</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45425937</guid></item><item><title><![CDATA[New comment by stillsut in "Claude Sonnet 4.5"]]></title><description><![CDATA[
<p>I've been building something like this: a markdown file that tracks your prompts and the code generated.<p><a href="https://github.com/sutt/innocuous/blob/master/docs/dev-summary.md" rel="nofollow">https://github.com/sutt/innocuous/blob/master/docs/dev-summa...</a><p>Check it out; I'd be curious to hear your feedback.</p>
]]></description><pubDate>Tue, 30 Sep 2025 14:20:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=45425815</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45425815</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45425815</guid></item><item><title><![CDATA[New comment by stillsut in "Ask HN: What are you working on? (September 2025)"]]></title><description><![CDATA[
<p>Encoding / decoding hidden messages in LLM output.<p><a href="https://github.com/sutt/innocuous" rel="nofollow">https://github.com/sutt/innocuous</a><p>The traditional use-case is <i>steganography</i> ("hidden writing"), but I see more potential applications than just spy stuff.<p>I'm using this project as a case study for writing CS-oriented codebases while keeping track of every prompt and generated code line in a markdown file: <a href="https://github.com/sutt/innocuous/blob/master/docs/dev-summary.md" rel="nofollow">https://github.com/sutt/innocuous/blob/master/docs/dev-summa...</a><p>My favorite pattern I've found is to write <i>encode</i> implementations manually; the AI is then pretty easily able to follow that logic and translate it into a <i>decode</i> function.</p>
]]></description><pubDate>Tue, 30 Sep 2025 13:36:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=45425269</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45425269</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45425269</guid></item><item><title><![CDATA[New comment by stillsut in "Sampling and structured outputs in LLMs"]]></title><description><![CDATA[
<p>I'm also working on a library to steer the sampling step of LLMs, but more for steganographic / arbitrary data encoding purposes.<p>It should work with any llama.cpp-compatible model: <a href="https://github.com/sutt/innocuous" rel="nofollow">https://github.com/sutt/innocuous</a></p>
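A toy sketch of the core idea (illustrative names only, not the library's actual API): rank the candidate tokens at each step, and let each payload bit pick between the top two; decoding replays the same ranking with the same model and reads the choices back.

```python
def embed_bits(ranked_tokens_per_step, bits):
    """Pick one token per step: bit 0 -> most likely, bit 1 -> runner-up.
    Steps beyond the payload fall back to greedy (index 0) decoding."""
    out = []
    for i, ranked in enumerate(ranked_tokens_per_step):
        choice = bits[i] if i < len(bits) else 0
        out.append(ranked[choice])
    return out

def extract_bits(tokens, ranked_tokens_per_step, n_bits):
    """Recover the payload: re-rank with the same model and record
    which candidate the emitted token was."""
    return [ranked.index(tok)
            for tok, ranked in zip(tokens[:n_bits], ranked_tokens_per_step)]
```

The hard part in practice is that both sides must reproduce the exact same ranking, which is where deterministic inference matters.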
]]></description><pubDate>Tue, 23 Sep 2025 17:46:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=45350442</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45350442</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45350442</guid></item><item><title><![CDATA[New comment by stillsut in "Nostr"]]></title><description><![CDATA[
<p>You can also earn zaps for pull requests working on Nostr clients.<p>We've been hosting some bounties like this one here: <a href="https://app.lightningbounties.com/issue/615dc5f7-ed91-4ecd-8459-0a14ea0e8b7e" rel="nofollow">https://app.lightningbounties.com/issue/615dc5f7-ed91-4ecd-8...</a></p>
]]></description><pubDate>Fri, 19 Sep 2025 13:15:55 +0000</pubDate><link>https://news.ycombinator.com/item?id=45301345</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45301345</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45301345</guid></item><item><title><![CDATA[New comment by stillsut in "Vibe coding has turned senior devs into 'AI babysitters'"]]></title><description><![CDATA[
<p>I've got some receipts for what I think is good vibe coding.<p>I save every prompt and associated ai-generated diff in a markdown file for a steganography package I'm working on.<p>Check out this document: <a href="https://github.com/sutt/innocuous/blob/master/docs/dev-summary.md#v010" rel="nofollow">https://github.com/sutt/innocuous/blob/master/docs/dev-summa...</a><p>In particular, under v0.1.0 see the `decode-branch.md` prompt and its associated generated diff, which implements memoization for backtracking while performing decoding.<p>It's a tight PR that fits the existing codebase and works well; you just need a motivating example you can reproduce, which helps you quickly determine whether the proposed solution is working. I usually generate 2-3 solutions initially and then filter them quickly based on a test case. And as you can see from the prompt, it's far from well-formatted or comprehensive: just a slap-dash listing of potentially relevant information, similar to what would be discussed at an informal whiteboard session.</p>
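The memoization-for-backtracking idea can be sketched generically. This is a toy stand-in, not the package's actual decoder: `candidates_at` is a hypothetical oracle mapping a position to its candidate tokens, and the search caches positions so failed suffixes aren't re-explored.

```python
from functools import lru_cache

def decode(tokens, candidates_at):
    """Backtracking search for a bit sequence consistent with `tokens`.
    `lru_cache` memoizes each position so the suffix from a given spot
    is only ever searched once, even when multiple bits map to one token."""
    @lru_cache(maxsize=None)
    def search(pos):
        if pos == len(tokens):
            return ()            # consumed everything: success
        for bit, tok in enumerate(candidates_at(pos)):
            if tok == tokens[pos]:
                rest = search(pos + 1)
                if rest is not None:
                    return (bit,) + rest
        return None              # dead end; cached so we never retry it
    return search(0)
```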
]]></description><pubDate>Mon, 15 Sep 2025 15:52:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=45251190</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45251190</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45251190</guid></item><item><title><![CDATA[New comment by stillsut in "Defeating Nondeterminism in LLM Inference"]]></title><description><![CDATA[
<p>I'm actually working on something similar, where you can encode information into the outputs of LLMs via steganography: <a href="https://github.com/sutt/innocuous" rel="nofollow">https://github.com/sutt/innocuous</a><p>Since I'm really only sampling the top ~10 tokens, and I mostly test on CPU-based inference of 8B models, there's probably not much worry about getting a different ordering of the top tokens across hardware implementations. But I'm still going to look into it eventually, and build in guard conditions against any choice that would be changed by an epsilon of precision loss.</p>
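One way such a guard condition might look (a sketch under an assumed tolerance, not the project's actual logic): only let a step carry data when every adjacent pair in the ranked probabilities is separated by more than epsilon, so a hardware-dependent rounding difference can't reorder the candidates.

```python
EPS = 1e-4  # assumed margin below which a ranking could plausibly flip

def safe_to_encode(probs, eps=EPS):
    """True if every adjacent pair in the ranked top-k probabilities is
    separated by more than `eps`, i.e. precision loss can't reorder them."""
    ranked = sorted(probs, reverse=True)
    return all(a - b > eps for a, b in zip(ranked, ranked[1:]))
```

Steps that fail the check would simply fall back to greedy decoding and carry no payload bits.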
]]></description><pubDate>Thu, 11 Sep 2025 14:50:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=45212360</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45212360</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45212360</guid></item><item><title><![CDATA[New comment by stillsut in "AI might yet follow the path of previous technological revolutions"]]></title><description><![CDATA[
<p>I think the "magic" is that we've found a common toolset of methods - embeddings and layers of neural networks - that seems to reveal useful patterns and relationships from a vast corpus of unstructured data, both analog-sensor (pictures, video, point clouds) and symbolic (text, music), and that we can combine these across modalities, as in CLIP.<p>It turns out we didn't need a specialist technique for each domain: there was a reliable method to architect a model that can learn on its own, and we could already use the datasets we had; they didn't need to be generated through surveys or experiments. This might seem like magic to an AI researcher working in the 1990s.</p>
]]></description><pubDate>Mon, 08 Sep 2025 20:38:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=45173656</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45173656</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45173656</guid></item><item><title><![CDATA[New comment by stillsut in "AI might yet follow the path of previous technological revolutions"]]></title><description><![CDATA[
<p>"Unstructured data learners and generators" is probably the most salient distinction for how current systems compare to the previous "AI systems" examples (NLP, if-statements) that OP mentioned.</p>
]]></description><pubDate>Mon, 08 Sep 2025 20:20:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=45173428</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45173428</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45173428</guid></item><item><title><![CDATA[New comment by stillsut in "I am giving up on Intel and have bought an AMD Ryzen 9950X3D"]]></title><description><![CDATA[
<p>Makes sense: M-chips, Falcon-9, and GPTs are product <i>subsets</i> of the incumbents' traditional product capabilities.</p>
]]></description><pubDate>Mon, 08 Sep 2025 16:04:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=45169977</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45169977</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45169977</guid></item><item><title><![CDATA[New comment by stillsut in "I am giving up on Intel and have bought an AMD Ryzen 9950X3D"]]></title><description><![CDATA[
<p>At a meta-level, I wonder if there's an un-talked-about advantage to poaching ambitious talent out of an established incumbent to work on a new product line in a new organization, in this case Apple Silicon disrupting Intel/AMD. We've also seen SpaceX do this to NASA/Boeing, and OpenAI do it to Google's ML departments.<p>It seems like large, unchallenged organizations like Intel (or NASA or Google) collect all the top talent out of school. But changing budgets, shifting business objectives, and frozen product strategies make it difficult for emerging talent to really work on next-generation technology (those projects have already been assigned to mid-career people who "paid their dues").<p>Then someone like Apple with the M-chip or SpaceX with the Falcon-9 comes along and poaches the people most likely to work "hardcore" (not optimizing for work/life balance), while also giving the new product a high degree of risk tolerance and autonomy. Within a few years, the smaller upstart organization has opened up an un-closeable performance gap with the behemoth incumbent.<p>Has anyone written about this pattern (beyond the Innovator's Dilemma)? Does anyone have other good examples of it?</p>
]]></description><pubDate>Mon, 08 Sep 2025 15:05:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=45169244</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45169244</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45169244</guid></item><item><title><![CDATA[New comment by stillsut in "Where's the shovelware? Why AI coding claims don't add up"]]></title><description><![CDATA[
<p>Got your shovelware right here... with receipts.<p>Background: I'm building a Python package side project which allows you to encode/decode messages into LLM output.<p>Receipts: the tool I'm using creates a markdown file that displays every prompt typed and every solution generated, along with summaries of the code diffs. You can check it out here: <a href="https://github.com/sutt/innocuous/blob/master/docs/dev-summary.md" rel="nofollow">https://github.com/sutt/innocuous/blob/master/docs/dev-summa...</a><p>Specific example: I actually used a leetcode-style algorithmic implementation of memoization for branching. This would have taken a couple of days to implement by hand, but it took about 20 minutes to write the spec and 20 minutes to review and merge the generated solution. If you're curious, you can see the generated diff here: <a href="https://github.com/sutt/innocuous/commit/cdabc98" rel="nofollow">https://github.com/sutt/innocuous/commit/cdabc98</a></p>
]]></description><pubDate>Wed, 03 Sep 2025 23:39:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=45121593</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45121593</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45121593</guid></item><item><title><![CDATA[New comment by stillsut in "Python has had async for 10 years – why isn't it more popular?"]]></title><description><![CDATA[
<p>Not an expert, but my chats with ChatGPT led me to believe async + FastAPI can give you 40x the request-handling throughput over non-async code.<p>The essential idea was that I could be processing ~100 requests per vCPU in the async event loop, while threading would max out at 2-4 threads per CPU. Of course, assume for either model that we're waiting 50-2000ms for a DB query or service call to finish before sending the response.<p>Is this not true? And if it is true, why isn't the juice worth the squeeze: more than an order of magnitude more saturation/throughput for the same hardware and same language, just with a new engine at its heart?</p>
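The claim is easy to sanity-check with a toy benchmark: 100 concurrent simulated 100ms "DB calls" on a single event loop finish in roughly the latency of one call, because every task is waiting on I/O rather than the CPU.

```python
import asyncio
import time

async def fake_db_call(ms=100):
    # Stands in for an awaited DB query or service call.
    await asyncio.sleep(ms / 1000)
    return 1

async def main(n=100):
    t0 = time.perf_counter()
    results = await asyncio.gather(*(fake_db_call() for _ in range(n)))
    elapsed = time.perf_counter() - t0
    return sum(results), elapsed

handled, elapsed = asyncio.run(main())
# 100 concurrent waits complete in ~0.1s, not 100 x 0.1s
```

The caveat is that this only holds while the handlers are I/O-bound; any CPU-heavy work in a handler blocks the whole loop, which is one reason the squeeze is harder than the juice suggests.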
]]></description><pubDate>Wed, 03 Sep 2025 16:23:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=45117607</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45117607</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45117607</guid></item><item><title><![CDATA[New comment by stillsut in "SynthID – A tool to watermark and identify content generated through AI"]]></title><description><![CDATA[
<p>Hey, I made an open-source version of this last week (albeit for different purposes). Check it out at: <a href="https://github.com/sutt/innocuous" rel="nofollow">https://github.com/sutt/innocuous</a><p>There's lots of room for contributions here, and I think the "fingerprinting layer" is an under-valued part of the LLM stack, not being explored by enough entrants.</p>
]]></description><pubDate>Sat, 30 Aug 2025 13:04:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=45074289</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45074289</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45074289</guid></item><item><title><![CDATA[New comment by stillsut in "Some thoughts on LLMs and software development"]]></title><description><![CDATA[
<p>> we should always consider asking the LLM the same question more than once, perhaps with some variation in the wording. Then we can compare answers,<p>Yup, this matches my recommended workflow exactly. Why waste time trying to turn an initially bad answer into a passable one, when you could simply re-generate (possibly with different context)?<p>I wrote up an example of this workflow here: <a href="https://github.com/sutt/agro/blob/master/docs/case-studies/aba-vidstr-2.md#results" rel="nofollow">https://github.com/sutt/agro/blob/master/docs/case-studies/a...</a></p>
]]></description><pubDate>Thu, 28 Aug 2025 22:07:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=45057609</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=45057609</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45057609</guid></item><item><title><![CDATA[New comment by stillsut in "My experience creating software with LLM coding agents – Part 2 (Tips)"]]></title><description><![CDATA[
<p>Not OP, but I know it can be really difficult to measure or communicate this to people who aren't familiar with the codebase or the problem being solved.<p>Other than just dumping 10M tokens of chats into a gist and saying <i>read through everything I said back and forth with claude for a week</i>.<p>But I think I've got the start of a useful summary format: it takes every prompt, points to the corresponding code commit produced by the AI, and adds a line-diff count and a summary of the task. Check it out below.<p><a href="https://github.com/sutt/agro/blob/master/docs/dev-summary-v1.md#v017" rel="nofollow">https://github.com/sutt/agro/blob/master/docs/dev-summary-v1...</a><p>(In this case it's a Python CLI ai-coding framework that I'm using to build the package itself.)</p>
]]></description><pubDate>Sat, 23 Aug 2025 18:58:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=44998197</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=44998197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44998197</guid></item><item><title><![CDATA[New comment by stillsut in "AGENTS.md – Open format for guiding coding agents"]]></title><description><![CDATA[
<p>I'm rolling my own like this [0]:<p>.agdocs/<p>├── specs/          # Task specification files<p>├── conf/           # Configuration files<p>├── guides/         # Development guides for agents<p>└── swap/           # Temporary files (gitignored)<p>Every markdown file in guides/ gets copied to any agent (aider, claude, gemini) I kick off.<p>I gitignore this .agdocs directory by default. I find this useful because otherwise you get into <i>please commit or stash your changes</i> when trying to switch branches.<p>But I also run an rsync script before each release to copy .agdocs to a git-tracked mirror directory [1].<p>[0]: <a href="https://github.com/sutt/agro/blob/master/README.md#layout" rel="nofollow">https://github.com/sutt/agro/blob/master/README.md#layout</a><p>[1]: <a href="https://github.com/sutt/vidstr/tree/master/.public-agdocs/guides" rel="nofollow">https://github.com/sutt/vidstr/tree/master/.public-agdocs/gu...</a></p>
]]></description><pubDate>Wed, 20 Aug 2025 13:10:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=44961632</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=44961632</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44961632</guid></item><item><title><![CDATA[New comment by stillsut in "Ask HN: Have any successful startups been made by 'vibe coding'?"]]></title><description><![CDATA[
<p>Yeah, I've been able to stack ~200 AI-generated commits on top of each other. Over 95% of the code is now AI-generated.<p>You can see each prompt and each commit that was generated here: <a href="https://github.com/sutt/agro/blob/master/docs/dev-summary-v1.md" rel="nofollow">https://github.com/sutt/agro/blob/master/docs/dev-summary-v1...</a><p>Since AI-generated code is such a roll of the dice, the key is to roll the dice a lot: usually generate at least 3 potential solutions from at least two separate providers at the outset, and get good at quickly reviewing the offered solutions or iterating on the prompt and regenerating.</p>
]]></description><pubDate>Tue, 19 Aug 2025 23:08:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=44957088</link><dc:creator>stillsut</dc:creator><comments>https://news.ycombinator.com/item?id=44957088</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44957088</guid></item></channel></rss>