<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: adchurch</title><link>https://news.ycombinator.com/user?id=adchurch</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 30 Jun 2026 21:39:30 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=adchurch" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Effectively yes (based on cost though, not raw token count)</p>
]]></description><pubDate>Mon, 29 Jun 2026 13:46:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=48719245</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48719245</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48719245</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>We trained a model to select which LLM to call at any given turn, based on lots of agent traces</p>
]]></description><pubDate>Sat, 27 Jun 2026 16:41:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48699688</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48699688</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48699688</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Yes, the open source models are very good, and that’s a big part of what makes this router save so much money in practice! There are definitely still some things they don’t handle well, though, where you do want a frontier model</p>
]]></description><pubDate>Sat, 27 Jun 2026 16:39:32 +0000</pubDate><link>https://news.ycombinator.com/item?id=48699680</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48699680</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48699680</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Yes we can route to Gemini models too and we handle all the translation complexity there!</p>
]]></description><pubDate>Sat, 27 Jun 2026 16:37:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=48699661</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48699661</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48699661</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>We welcome the competition :)</p>
]]></description><pubDate>Sat, 27 Jun 2026 16:34:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=48699634</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48699634</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48699634</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Yep exactly</p>
]]></description><pubDate>Sat, 27 Jun 2026 16:33:03 +0000</pubDate><link>https://news.ycombinator.com/item?id=48699627</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48699627</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48699627</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Yes because it's a model explicitly trained to make model selections! Opus probably doesn't have a great idea of when to send a task to DeepSeek vs. to Sonnet, for example.</p>
]]></description><pubDate>Sat, 27 Jun 2026 02:29:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=48694579</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48694579</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48694579</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>We haven't experimented with routing to local LLMs much. Technically they benefit from the cache too although it's more a question of latency than cost. But tbh I haven't seen great results in the wild from working with local LLMs for coding - curious if you've had any success with them?</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:44:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692371</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692371</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692371</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>I think the key detail here is that we use embeddings of the prompt + previous context in order to decide where to route the request, <i>and</i> if one model is getting stuck we can course correct and move to a different model.<p>So: we can reasonably cluster similar problems together and learn how models handle them, and the entire system doesn't fail if the initial decision is off.</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:42:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692347</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692347</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>We consider the cost of missing the cache when making each routing decision after the initial one. Discussed in a bit more depth here: <a href="https://news.ycombinator.com/item?id=48689448">https://news.ycombinator.com/item?id=48689448</a></p>
]]></description><pubDate>Fri, 26 Jun 2026 21:34:52 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692285</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692285</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692285</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Good questions. From what I can tell, vLLM semantic router is optimized more for one-off prompt/response workflows than for agentic coding (I don't think it's cache aware).<p>As another commenter (<a href="https://news.ycombinator.com/item?id=48689994">https://news.ycombinator.com/item?id=48689994</a>) pointed out, for one-off requests, I think it makes more sense to lock to one model whose behavior you understand very well. For dynamic requests like the ones going to a coding agent, I think dynamic routing makes more sense, but it does need to be cache aware.</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:33:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692260</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692260</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692260</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Cool, interested to see your approach when you do launch! I think it's a really interesting problem</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:27:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692205</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692205</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692205</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Great question! Our main product quantifies engineering productivity & quality so I think we're uniquely qualified to answer this - our velocity has only gone up and our quality (bugs introduced, code turnover) has not budged per our own analysis.</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:26:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692193</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692193</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692193</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Oh interesting, didn't know Cursor did that! Totally makes sense though, routing subagents is def the easiest win, no need to have any cache awareness.</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:24:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692165</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692165</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692165</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>If you have a Claude sub with subsidized usage we use that. If not you pay API prices.</p>
]]></description><pubDate>Fri, 26 Jun 2026 21:23:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48692153</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48692153</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48692153</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Really appreciate the thoughtful feedback!<p>1. Agree it's important. Fwiw, the proxy model doesn't blow this up though: it only incurs a one-time cost when switching models, and we're aware of that when making routing decisions<p>2. The agents are model aware, yes, but they are not incentivized to optimize too heavily here (in particular, they don't use open source models even when they would be better). I think that's where this router comes in and brings genuine improvement.<p>3. Two parts here: one is continuing to grow our golden dataset over time, the other is using reward signals from production traffic (on a per-customer basis or, if allowed, across all users)<p>4. Yes, we have these internally, great callout that we should publish! Will do + will link from the repo soon. (Fwiw I think these benchmarks are useful but don't fully capture vibes - you should try it out yourself for that!)</p>
]]></description><pubDate>Fri, 26 Jun 2026 20:07:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=48691347</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48691347</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48691347</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Appreciate the kind words! Lmk if you have any feedback on it from using!</p>
]]></description><pubDate>Fri, 26 Jun 2026 20:02:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=48691288</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48691288</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48691288</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>I would argue they do not have a good incentive to build this and make it better.  Why would Anthropic route Claude Code traffic to DeepSeek (at 20% of the cost)?</p>
]]></description><pubDate>Fri, 26 Jun 2026 20:01:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=48691278</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48691278</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48691278</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>Very important consideration, addressed it in another thread (<a href="https://news.ycombinator.com/item?id=48689448">https://news.ycombinator.com/item?id=48689448</a>). tl;dr we built this to be cache aware for exactly this reason</p>
]]></description><pubDate>Fri, 26 Jun 2026 19:01:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=48690590</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48690590</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48690590</guid></item><item><title><![CDATA[New comment by adchurch in "Show HN: Smart model routing directly in Claude, Codex and Cursor"]]></title><description><![CDATA[
<p>When we started building this, we did it as an experiment, and we thought the same thing might be true (cache misses would make the whole thing pointless). This turned out not to be true! Intuitively, I think there are 3 reasons:<p>1. Small models can carry out a good number of requests end to end
2. In many cases, small model for part of a request + cache miss costs less than big model for the entire request
3. Subagents can be routed without any cache awareness<p>For our own usage we've saved 40% so far (that of course includes the cost of uncached requests when switching models)</p>
]]></description><pubDate>Fri, 26 Jun 2026 19:00:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=48690572</link><dc:creator>adchurch</dc:creator><comments>https://news.ycombinator.com/item?id=48690572</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48690572</guid></item></channel></rss>