<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: suninsight</title><link>https://news.ycombinator.com/user?id=suninsight</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Wed, 22 Apr 2026 09:13:38 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=suninsight" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by suninsight in "Launch HN: Webhound (YC S23) – Research agent that builds datasets from the web"]]></title><description><![CDATA[
<p>Very cool and nicely executed! I definitely see a lot of value in this.<p>I was actually building a version of this using NonBioS.ai, but this is already pretty well done, so I will just use this instead.</p>
]]></description><pubDate>Fri, 26 Sep 2025 05:28:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=45383001</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=45383001</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45383001</guid></item><item><title><![CDATA[New comment by suninsight in "Getting AI to work in complex codebases"]]></title><description><![CDATA[
<p>So I can attest that all of the things proposed in this article actually work, and you can try them out yourself on any arbitrary code base within a few minutes.<p>Here is how: I work for a company called NonBioS.ai - we already implement most of what is mentioned in this article. We actually implemented this about 6 months back, and what we have now is an advanced version of the same flow. Every user in NonBioS gets a full Linux VM with root access. You can ask NonBioS to pull in your source code and implement any feature. The context is all managed automatically through a process we call "Strategic Forgetting", which is in some ways an advanced version of the logic in this article.<p>Strategic Forgetting handles the context automatically - think of it like automatic compaction. It evaluates information retention based on several key factors:<p>1. Relevance Scoring: we assess how directly information contributes to the current objective vs. being tangential noise<p>2. Temporal Decay: information gets weighted by recency and frequency of use - rarely accessed context naturally fades<p>3. Retrievability: if data can be easily reconstructed from system state or documentation, it's a candidate for pruning<p>4. Source Priority: user-provided context gets higher retention weight than inferred or generated content<p>The algorithm runs continuously during coding sessions, creating a dynamic "working memory" that stays lean and focused. Think of it like how you naturally filter out background conversations to focus on what matters.<p>We have tried it out on very complex code bases and it works pretty well. Once you see how well it works, you will have no trouble believing that the days of using IDEs to edit code are probably numbered.<p>Also - you can try it out for yourself very quickly at NonBioS.ai. We have a very generous free tier that will be enough for the biggest code base you can throw at NonBioS. However, big feature implementations or larger refactorings might take longer than what the free tier affords.</p>
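<p>To make the four factors above concrete, here is a toy sketch of how such a retention score and pruning pass could be combined. This is illustrative only - the field names, weights, and half-life are made-up assumptions, not the actual NonBioS internals:</p>

```python
import math
import time

def retention_score(item, now=None, half_life_s=3600.0):
    """Combine the four retention factors into a single keep/prune score.

    `item` is a dict with illustrative (hypothetical) fields:
      relevance     - 0..1, how directly it serves the current objective
      last_access   - unix timestamp of most recent use
      access_count  - how often it has been used
      retrievable   - True if it can be rebuilt from system state or docs
      user_provided - True if the user supplied it directly
    """
    now = time.time() if now is None else now
    # 1. Relevance scoring: tangential noise scores near zero.
    score = item["relevance"]
    # 2. Temporal decay: exponential fade by recency, boosted by frequency.
    age = now - item["last_access"]
    score *= math.exp(-age / half_life_s) * (1 + math.log1p(item["access_count"]))
    # 3. Retrievability: easily reconstructed data is a pruning candidate.
    if item["retrievable"]:
        score *= 0.5
    # 4. Source priority: user-provided context gets higher retention weight.
    if item["user_provided"]:
        score *= 2.0
    return score

def prune(context, budget_tokens):
    """Keep only the highest-scoring items that fit a token budget."""
    ranked = sorted(context, key=retention_score, reverse=True)
    kept, used = [], 0
    for item in ranked:
        if used + item["tokens"] <= budget_tokens:
            kept.append(item)
            used += item["tokens"]
    return kept
```

<p>The real system runs something much richer continuously, but the shape is the same: score each piece of context, then keep a lean working set under a budget.</p>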
]]></description><pubDate>Wed, 24 Sep 2025 14:55:43 +0000</pubDate><link>https://news.ycombinator.com/item?id=45361297</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=45361297</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45361297</guid></item><item><title><![CDATA[What I learned managing an AI developer while seeking enlightenment]]></title><description><![CDATA[
<p>Article URL: <a href="https://pocha.substack.com/p/what-i-learned-managing-an-ai-developer">https://pocha.substack.com/p/what-i-learned-managing-an-ai-developer</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45139437">https://news.ycombinator.com/item?id=45139437</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 05 Sep 2025 15:06:14 +0000</pubDate><link>https://pocha.substack.com/p/what-i-learned-managing-an-ai-developer</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=45139437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45139437</guid></item><item><title><![CDATA[New comment by suninsight in "Launch HN: Halluminate (YC S25) – Simulating the internet to train computer use"]]></title><description><![CDATA[
<p>How about Halarax... Hallucinate and Parallax.</p>
]]></description><pubDate>Mon, 11 Aug 2025 16:44:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=44866369</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44866369</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44866369</guid></item><item><title><![CDATA[New comment by suninsight in "Claude Code is all you need"]]></title><description><![CDATA[
<p>So we tried that route - but the problem is that these interfaces aren't suited for asynchronous updates. If the agent is working for the next hour or so, how do you communicate that in a medium like these? An agent, unlike a human, is only invoked when you give it a task.<p>If you use the interface at nonbios.ai, you will quickly realize that it is hard to reproduce on Slack/Discord, even though it's still technically 'chat'.</p>
]]></description><pubDate>Mon, 11 Aug 2025 15:40:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=44865441</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44865441</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44865441</guid></item><item><title><![CDATA[New comment by suninsight in "Claude Code is all you need"]]></title><description><![CDATA[
<p>I think if you use Cursor, using Claude Code is a huge upgrade. The problem is that Cursor was a huge upgrade from the IDE, so we are still getting used to it.<p>The company I work for builds a similar tool - NonBioS.ai. It is in some ways similar to what the author does above, but packaged as a service. The NonBioS agent has a root VM and can write/build all the software you want. You access/control it through a web chat interface - we take care of all the orchestration behind the scenes.<p>It's also in free beta right now, and signup takes a minute if you want to give it a shot. You can quickly find out whether the Claude Code/NonBioS experience is better than Cursor.</p>
]]></description><pubDate>Mon, 11 Aug 2025 15:06:28 +0000</pubDate><link>https://news.ycombinator.com/item?id=44864989</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44864989</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44864989</guid></item><item><title><![CDATA[New comment by suninsight in "From AI to Agents to Agencies"]]></title><description><![CDATA[
<p>1. Multi-agent means dividing a task into parts and handing each part off to a different agent. This is different in the sense that the task is not divided into parts a priori. When the agent hits a roadblock - let's say it is unable to fix a software issue - it rolls up to a deep-think model to get unblocked. But you might be right that the difference is too subtle to notice.<p>2. "they got me looking at what was built with their product but I can’t see the actual code. Feels scammy" - What do you mean by you can't see the actual code? You can just sign up and use NonBioS to build software, and you can see the code written by NonBioS in multiple ways: ask it to give you a downloadable zip, ask it to check the code into GitHub, or ask it to show you the code on the screen. In fact, the black boxes which scroll up - you can just expand them and see the code it is writing directly.</p>
]]></description><pubDate>Thu, 10 Jul 2025 06:56:39 +0000</pubDate><link>https://news.ycombinator.com/item?id=44518102</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44518102</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44518102</guid></item><item><title><![CDATA[New comment by suninsight in "From AI to Agents to Agencies"]]></title><description><![CDATA[
<p>He is NOT talking about multi-agent systems, which is exactly why he is calling it an Agency. The author goes to great lengths to explain why this is NOT a multi-agent system, because it can easily be misunderstood as one.</p>
]]></description><pubDate>Wed, 09 Jul 2025 13:47:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44510002</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44510002</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44510002</guid></item><item><title><![CDATA[New comment by suninsight in "From AI to Agents to Agencies"]]></title><description><![CDATA[
<p>This isn't multi-agents at all. In fact, if you read the article in detail, you will see that the author explains at length how this system is different from multi-agents. That is exactly why the author calls it an "Agency": it is fundamentally different from multi-agents.<p>I agree that multi-agent doesn't work in practice. But this isn't that.</p>
]]></description><pubDate>Wed, 09 Jul 2025 13:45:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44509985</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44509985</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44509985</guid></item><item><title><![CDATA[From AI to Agents to Agencies]]></title><description><![CDATA[
<p>Article URL: <a href="https://blog.nishantsoni.com/p/from-ai-to-agents-to-agencies-the">https://blog.nishantsoni.com/p/from-ai-to-agents-to-agencies-the</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44507919">https://news.ycombinator.com/item?id=44507919</a></p>
<p>Points: 10</p>
<p># Comments: 10</p>
]]></description><pubDate>Wed, 09 Jul 2025 09:29:16 +0000</pubDate><link>https://blog.nishantsoni.com/p/from-ai-to-agents-to-agencies-the</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44507919</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44507919</guid></item><item><title><![CDATA[New comment by suninsight in "Spending Too Much Money on a Coding Agent"]]></title><description><![CDATA[
<p>So what we do at NonBioS.ai is use a cheaper model for routine tasks, but switch to a higher-thinking model seamlessly if the agent gets stuck. It's the most cost-efficient approach, and we take that switching cost away from the engineer.<p>But I broadly agree with the argument of the post - just spending more might still be worth it.</p>
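<p>The basic shape of that escalation loop is simple. A minimal sketch (the model names and the `is_stuck` heuristic here are made up for illustration - the real switching logic is more involved):</p>

```python
# Hypothetical two-tier escalation: run a cheap model for routine steps,
# and hand the same task to a stronger "thinking" model only when stuck.
CHEAP, STRONG = "cheap-model", "deep-think-model"

def is_stuck(history, window=3):
    """Crude heuristic: the last few attempts all failed."""
    recent = history[-window:]
    return len(recent) == window and all(r == "fail" for r in recent)

def run_step(task, call_model, history):
    """Pick the tier based on recent outcomes, run one step, record it."""
    model = STRONG if is_stuck(history) else CHEAP
    result = call_model(model, task)
    history.append(result)
    return model, result
```

<p>The engineer never picks a model; the loop quietly pays for the expensive tier only when the cheap one has demonstrably stalled.</p>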
]]></description><pubDate>Thu, 03 Jul 2025 16:25:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=44456678</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44456678</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44456678</guid></item><item><title><![CDATA[New comment by suninsight in "Gemini Robotics On-Device brings AI to local robotic devices"]]></title><description><![CDATA[
<p>This will not end well.</p>
]]></description><pubDate>Tue, 24 Jun 2025 15:21:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44367199</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44367199</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44367199</guid></item><item><title><![CDATA[New comment by suninsight in "Building Effective AI Agents"]]></title><description><![CDATA[
<p>Most of our stuff is actually built in-house, simply because everything else is still catching up. You can find a bunch of information on the blog (<a href="https://www.nonbios.ai/blog" rel="nofollow">https://www.nonbios.ai/blog</a>)<p>The only software that we use is Langfuse for observability, and even that was breaking down for us. But they launched a new version - V3 - which might still work out for us.<p>I would suggest just using standard, non-AI-specific Python libraries and building your own systems. If you are migrating from n8n to a self-hosted system, then you can actually use NonBioS to build it out for you directly. If you join our Discord channels, we can get an engineer to help you out too.</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:18:07 +0000</pubDate><link>https://news.ycombinator.com/item?id=44317940</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44317940</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44317940</guid></item><item><title><![CDATA[New comment by suninsight in "Building Effective AI Agents"]]></title><description><![CDATA[
<p>It is an AI software dev called NonBioS.ai</p>
]]></description><pubDate>Thu, 19 Jun 2025 12:12:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=44317906</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44317906</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44317906</guid></item><item><title><![CDATA[New comment by suninsight in "Building Effective AI Agents"]]></title><description><![CDATA[
<p>As someone who works for a company with a real agent in production (not a workflow), I cannot disagree more with the very first statement here: use agent frameworks like LangGraph. We did exactly that, and had to throw everything away just a month down the line. Then we built everything from scratch, and now our system scales pretty well.<p>To be fair, I think there might be a place for agent frameworks, but the agent space is too early for a good enough framework to have emerged. The semi-contrarian take, which I hold to a certain extent, is that the agent space is moving so fast that a good enough framework might NEVER emerge.</p>
]]></description><pubDate>Wed, 18 Jun 2025 08:13:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=44307742</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44307742</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44307742</guid></item><item><title><![CDATA[New comment by suninsight in "The unreasonable effectiveness of an LLM agent loop with tool use"]]></title><description><![CDATA[
<p>Yes, but we don't believe this is a 'fundamental' problem. We have learned to guide their actions a lot better, and they go down the rabbit hole a lot less now than when we started out.</p>
]]></description><pubDate>Fri, 16 May 2025 10:13:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=44003611</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44003611</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44003611</guid></item><item><title><![CDATA[New comment by suninsight in "The unreasonable effectiveness of an LLM agent loop with tool use"]]></title><description><![CDATA[
<p>That matches what we have found very closely. <thinking> models do a lot better, but with huge speed drops. For now, we have chosen accuracy over speed. But the speed drop is like 3-4x, so we might move to an architecture where we 'think' only sporadically.<p>Everything happening in the LLM space is so close to how humans think naturally.</p>
]]></description><pubDate>Fri, 16 May 2025 08:34:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=44003054</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44003054</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44003054</guid></item><item><title><![CDATA[New comment by suninsight in "The unreasonable effectiveness of an LLM agent loop with tool use"]]></title><description><![CDATA[
<p>Yes, it works really well. We do something like that at NonBioS.ai - longer post below. The agent self-reflects on whether it is stuck or confused and calls out to the human for help.</p>
]]></description><pubDate>Fri, 16 May 2025 08:30:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44003031</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44003031</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44003031</guid></item><item><title><![CDATA[New comment by suninsight in "The unreasonable effectiveness of an LLM agent loop with tool use"]]></title><description><![CDATA[
<p>So managing context is what takes the most effort. We use a bunch of strategies to reduce it, including, but not limited to:<p>1. A custom MCP server to work on the Linux command line. This wasn't really an 'MCP' server, because we started working on it before MCP was a thing, but that's the easiest way to explain it now. The MCP server is optimised to reduce context.<p>2. Guardrails to reduce context. Think of them as prompt alterations giving the LLM subtle hints to work with less context. The hints can be at a behavioural level and a task level.<p>3. Continuously pruning the built-up context to make the agent 'forget'. Forgetting what is not important is, we believe, a foundational capability.<p>This is inspired by the science which says humans use sleep to 'forget' unimportant memories, and that this is critical to keeping the brain healthy. It translates directly to LLMs: making them forget is critical to keeping them focussed on the larger task and their actions aligned.</p>
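<p>As a toy illustration of point 1 - trimming command output before it ever enters the context. The truncation rule here is an assumption for the sketch, not our server's actual behaviour:</p>

```python
def compact_output(stdout, head=20, tail=10, max_line=200):
    """Keep the head and tail of long command output, drop the middle.

    Overlong lines are clipped, and an elision marker records how much
    was cut so the agent knows the output is truncated.
    """
    lines = [line[:max_line] for line in stdout.splitlines()]
    if len(lines) <= head + tail:
        return "\n".join(lines)
    dropped = len(lines) - head - tail
    marker = f"... [{dropped} lines elided] ..."
    return "\n".join(lines[:head] + [marker] + lines[-tail:])
```

<p>A plain `find` or test run can dump thousands of lines; a filter like this keeps the parts the LLM usually needs (the start and the end, where errors cluster) at a fixed context cost.</p>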
]]></description><pubDate>Fri, 16 May 2025 07:38:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=44002729</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44002729</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44002729</guid></item><item><title><![CDATA[New comment by suninsight in "The unreasonable effectiveness of an LLM agent loop with tool use"]]></title><description><![CDATA[
<p>It only seems effective until you start using it for actual work. The biggest issue: context. All tool use creates context. Large code bases come with large context right off the bat. LLMs seem to work until they are hit with a sizeable context; anything above 10k and the quality seems to deteriorate.<p>The other issue is that LLMs can go off on a tangent. As context builds up, they forget what their objective was. One wrong turn, and down the rabbit hole they go, never to recover.<p>The reason I know is that we started solving these problems a year back. And we aren't done yet, but we have covered a lot of distance.<p>[Plug]: Try it out at <a href="https://nonbios.ai" rel="nofollow">https://nonbios.ai</a>:<p>- Agentic memory → long-horizon coding<p>- Full Linux box → real runtime, not just toy demos<p>- Transparent → see & control every command<p>- Free beta — no invite needed. Works with throwaway email (mailinator etc.)</p>
]]></description><pubDate>Fri, 16 May 2025 07:02:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=44002526</link><dc:creator>suninsight</dc:creator><comments>https://news.ycombinator.com/item?id=44002526</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44002526</guid></item></channel></rss>