Hacker News: danoandco

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Mon, 13 Apr 2026 02:49:21 +0000

Definitely. Twill is for SWE delegation first, not so much the “general agent on my machine.”

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Mon, 13 Apr 2026 02:47:13 +0000

It's a crowded market. On the CLI-agnostic cloud agent positioning, there are only startups so far. The only incumbent is GitHub Agents, as you mentioned in another thread.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Mon, 13 Apr 2026 02:38:00 +0000

Yes, broadly. The main structural difference is that we’re agent-agnostic, so we can combine lab-native CLIs in one workflow. GitHub will likely struggle there because they have direct partnerships with Anthropic and OpenAI.

On the features themselves, we have a better UX across integrations, and more advanced features like video recording.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Sat, 11 Apr 2026 04:17:11 +0000

On gh-aw: it looks solid for the event-driven automation shape (triage, docs sync, CI fix). We're after a slightly different shape: interactive back-and-forth, steering from Slack or Linear, persistent sandboxes with a booted dev server for live previews. Thanks for the pointer, I'll dig into it more.

On labs eating our lunch: it's definitely a risk. Our bet is that reusing lab-native CLIs is enough to position ourselves in the market.

On behind the firewall: it's something we're looking into. We open-sourced agentbox-sdk in that direction.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Sat, 11 Apr 2026 02:49:12 +0000

Mmh this works on my end. Sending you an email. Ty

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Sat, 11 Apr 2026 02:45:26 +0000

On computer use: Yes. Sandboxes come with a computer-use CLI for driving Linux GUI apps via X11.

On triggers: Cron, GitHub (PRs, issues, @twill mentions in review comments), Slack, Linear, Notion, Asana webhooks, plus CLI and web. In our PR-comment workflow, you tag @twill with an instruction. That said, you can also set up a daily cron on Twill that checks PRs for a specific label like Confidence Score : x/5 and tell it to auto-approve at 5/5, for example.
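To make the label-based cron idea concrete, here is a minimal Python sketch of the decision step only. It is an illustration, not Twill's implementation: the function names are hypothetical, the label format is copied from the example above, and fetching labels or approving the PR is left out.

```python
import re
from typing import List, Optional

# Label format assumed from the example above: "Confidence Score : x/5".
LABEL_RE = re.compile(r"^Confidence Score\s*:\s*(\d)/5$")


def confidence_from_labels(labels: List[str]) -> Optional[int]:
    """Return the score from the first matching label, or None if there is none."""
    for label in labels:
        match = LABEL_RE.match(label.strip())
        if match:
            return int(match.group(1))
    return None


def should_auto_approve(labels: List[str], threshold: int = 5) -> bool:
    """Approve only when a confidence label exists and meets the threshold."""
    score = confidence_from_labels(labels)
    return score is not None and score >= threshold
```

A PR with no confidence label is never auto-approved, which keeps the default on the safe side.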

On setup scripts: Per-repo entrypoint script, env vars, and ports, all accessible in the UI. There is a dedicated Dev Environment agent mode that you start with to set up the infra. You can steer the agent on how to set things up if it gets stuck, so this should be smooth. The agent can also rewrite the entrypoint mid-task.

There is also a Twill skill you can add to your local agents to dispatch tasks to Twill. That means you can research and plan locally using your CLI and delegate the implementation to a sandbox on Twill.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Sat, 11 Apr 2026 01:53:26 +0000

Awesome! Thanks for trying it.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 21:54:40 +0000

Jules is similar to Twill with the following differences:

- Twill is CLI-agnostic, meaning you can use Claude Code, Codex or Gemini. Jules only works with Gemini.

- We focus on the delegation experience: Twill has native integrations with your typical stack like Slack or Linear. PRs come back with proof of work, such as screenshots or videos.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 21:17:29 +0000

On the Twill web app, you can run the same task across different agents and multiple attempts (each in its own sandbox). Then you pick the best result. This is super handy for UI work where you can open the live preview for each attempt and compare. Next step for us is adding a final pass where an agent evaluates the results and combines the best parts into one PR.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 21:02:14 +0000

Similar, but we reuse lab-native CLIs like Claude Code or Codex, which the labs perform RL on. So in the long run, we believe this approach wins over custom harnesses.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 20:36:39 +0000

We’re focused on SWE use cases. Code is nice because there’s already a built-in verification loop: diffs, tests, CI, review, rollback. But you do quickly get to a state where the agent needs to take a risky action (a db migration or an infra operation). This is where the permissions features from the agents are handy: allowlist, automode, etc. So you only have to approve or reject the high-risk actions. And I think this risk model is valid for both technical and non-technical use cases.
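A rough Python sketch of that risk model, assuming glob-style command patterns. The pattern lists and the `decide` function are hypothetical and do not mirror any particular CLI's permission config.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns, for illustration only.
ALLOWLIST = ["git status*", "git diff*", "pytest*", "npm test*"]
HIGH_RISK = ["*db:migrate*", "terraform apply*", "kubectl delete*"]


def decide(command: str) -> str:
    """Return "allow" to run unattended, or "ask" to require human approval."""
    # High-risk patterns are checked first so the allowlist cannot override them.
    if any(fnmatchcase(command, pattern) for pattern in HIGH_RISK):
        return "ask"
    if any(fnmatchcase(command, pattern) for pattern in ALLOWLIST):
        return "allow"
    # Anything unknown defaults to asking.
    return "ask"
```

With these lists, `decide("pytest -q")` runs unattended, while `decide("rails db:migrate")` waits for approval.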

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 20:18:04 +0000

Totally right on the compile time. CI has the same bottleneck, and the ecosystem is working on fixing this (faster CPUs, better caching) in both coding agents and CI to improve overall velocity.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 19:08:51 +0000

For a solo dev running one task at a time, a beefy desktop overnight is totally viable. We see a lot of this with the Mac Mini hype.

Cloud starts to matter when you want to (a) run a swarm of agents on multiple independent tasks in parallel, (b) share agents across a team, or (c) not worry about keeping a machine online.

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 18:43:26 +0000

Yes, this is the pass@k metric from code generation research. The relevant paper is "Evaluating Large Language Models Trained on Code" (Chen et al., 2021), which introduced the metric.
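For reference, with n samples per problem of which c pass, the unbiased estimator from that paper is pass@k = 1 - C(n-c, k) / C(n, k). A minimal Python sketch (the function name and argument checks are illustrative):

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples passes.

    n: total samples generated, c: samples that passed, k: attempt budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when k > n - c, so the estimate becomes 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 attempts of which 5 pass, `pass_at_k(10, 5, 1)` is 0.5.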

New comment by danoandco in "Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs"

danoandco — Fri, 10 Apr 2026 18:36:22 +0000

Claude managed agents is a general-purpose hosted runtime for Claude, while Twill focuses on SWE tasks.

So the SWE workflow is pre-built (research, planning, verification, PR, proof of work). Twill is also agnostic to the agent, so you can use Codex, for instance. Additionally, you have more flexibility on sandbox sizing on Twill.

Launch HN: Twill.ai (YC S25) – Delegate to cloud agents, get back PRs

danoandco — Fri, 10 Apr 2026 16:22:13 +0000

Hey HN, we're Willy and Dan, co-founders of Twill.ai (https://twill.ai/). Twill runs coding CLIs like Claude Code and Codex in isolated cloud sandboxes. You hand it work through Slack, GitHub, Linear, our web app or CLI, and it comes back with a PR, a review, a diagnosis, or a follow-up question. It loops you in when it needs your input, so you stay in control.

Demo: https://www.youtube.com/watch?v=oyfTMXVECbs

Before Twill, building with Claude Code locally, we kept hitting three walls:

1. Parallelization: two tasks that both touch your Docker config or the same infra files are painful to run locally at once, and manual port rebinding and separate build contexts don't scale past a couple of tasks.

2. Persistence: close your laptop and the agent stops. We wanted to kick off a batch of tasks before bed and wake up to PRs.

3. Trust: giving an autonomous agent full access to your local filesystem and processes is a leap, and a sandbox per task felt safer to run unattended.

All three pointed to the same answer: move the agents to the cloud, give each task its own isolated environment.

So we built what we wanted. The first version was pure delegation: describe a task, get back a PR. Then multiplayer, so the whole team can talk to the same agent, each in their own thread. Then memory, so "use the existing logger in lib/log.ts, never console.log" becomes a standing instruction on every future task. Then automation: crons for recurring work, event triggers for things like broken CI.

This space is crowded. AI labs ship their own coding products (Claude Code, Codex), local IDEs wrap models in your editor, and a wave of startups build custom cloud agents on bespoke harnesses. We take the following path: reuse the lab-native CLIs in cloud sandboxes. Labs will keep pouring RL into their own harnesses, so they only get better over time. That way, no vendor lock-in, and you can pick a different CLI per task or combine them.

When you give Twill a task, it spins up a dedicated sandbox, clones your repo, installs dependencies, and invokes the CLI you chose. Each task gets its own filesystem, ports, and process isolation. Secrets are injected at runtime through environment variables. After a task finishes, Twill snapshots the sandbox filesystem so the next run on the same repo starts warm with dependencies already installed. We chose this architecture because every time the labs ship an improvement to their coding harness, Twill picks up the improvement automatically.
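As a toy model of the cold versus warm path described above, the following Python sketch tracks which steps a run would take. It is not Twill's code: `run_task`, the in-memory `snapshots` store, and the step names are all hypothetical, and no real sandbox is created.

```python
from typing import Dict, FrozenSet, List


def run_task(repo: str, cli: str, secrets: Dict[str, str],
             snapshots: Dict[str, FrozenSet[str]]) -> List[str]:
    """Return the ordered steps for one task, updating the snapshot store."""
    steps: List[str] = []
    if repo in snapshots:
        # Warm start: reuse the filesystem saved by a previous run.
        files = set(snapshots[repo])
        steps.append("restore snapshot")
    else:
        # Cold start: clone and install from scratch.
        files = {"checkout", "dependencies"}
        steps += ["clone", "install dependencies"]
    # Secrets exist only as runtime environment variables.
    env = dict(secrets)
    steps.append(f"invoke {cli} with {len(env)} secret(s)")
    # The snapshot holds filesystem state only, never the env.
    snapshots[repo] = frozenset(files)
    return steps
```

The first run for a repo goes through clone and install. A second run on the same repo starts from the snapshot and goes straight to invoking the CLI.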

We’re also open-sourcing agentbox-sdk, https://github.com/TwillAI/agentbox-sdk, an SDK for running and interacting with agent CLIs across sandbox providers.

Here’s an example: a three-person team assigned Twill to a Linear backlog ticket about adding a CSV import feature to their Rails app. Twill cloned the repo, set up the dev environment, implemented the feature, ran the test suite, took screenshots, and attached them to the PR. The PR needed one round of revision, which they requested through GitHub. For more complex tasks, Twill asks clarifying questions before writing code and records a browser session video (using Vercel's Webreel) as proof of work.

Free tier: 10 credits per month (1 credit = $1 of AI compute at cost, no markup), no credit card. Paid plans start at $50/month for 50 credits, with BYOK support on higher tiers. Free pro tier for open-source projects.

We’d love to hear how cloud coding agents fit into your workflow today, and if you try Twill, what worked, what broke, and what’s still missing.

Comments URL: https://news.ycombinator.com/item?id=47720418

Points: 77

# Comments: 95

Clone any web app in minutes

danoandco — Thu, 02 Apr 2026 00:22:01 +0000

Article URL: https://twill.ai/clone

Comments URL: https://news.ycombinator.com/item?id=47608479

Points: 4

# Comments: 0

New comment by danoandco in "Show HN: Score your GitHub repo for AI coding agents"

danoandco — Wed, 01 Apr 2026 05:20:03 +0000

Thanks for running it and the feedback!

For ADR vs AGENTS: CLIs usually load AGENTS.md with a tag saying: "this context may or may not be relevant to your tasks. You should not respond to this context unless it is highly relevant to your task." That's Claude Code, for instance. So an ADR is more likely to be something agents would not question.

Go linting: That's weird, I'll take a look.

Docs vs comments: great point, but I think they serve two purposes. One is global (specs, design docs, etc.) and one is local (how a method works, or the reason for a specific workaround).

Agent skills for desktop automation and video recording

danoandco — Tue, 31 Mar 2026 23:33:05 +0000

Article URL: https://github.com/TwillAI/skills

Comments URL: https://news.ycombinator.com/item?id=47594866

Points: 3

# Comments: 0

New comment by danoandco in "Show HN: Score your GitHub repo for AI coding agents"

danoandco — Tue, 17 Mar 2026 23:56:27 +0000

True, I think the key thing is explaining somewhere in the repo "why" something was done, like the rationale for choosing service X over Y, for instance.

Maybe this record is just the git log, and the agent just needs access to the git log.
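A minimal Python sketch of what that access could look like, assuming the agent can shell out to git inside the repo. The helper names are hypothetical; the git flags themselves are standard.

```python
import subprocess
from typing import List


def git_log_command(path: str, limit: int = 20) -> List[str]:
    """Build a git log command showing subject and body for one file's history."""
    return [
        "git", "log", "--follow", f"--max-count={limit}",
        "--date=short", "--format=%h %ad %s%n%b", "--", path,
    ]


def rationale_for(path: str, repo_dir: str = ".", limit: int = 20) -> str:
    """Run git log for the file and return the raw commit messages."""
    result = subprocess.run(
        git_log_command(path, limit),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

This only helps if commit bodies actually record the "why", which is the open question.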

we'll see how that matures over time