<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: vrm</title><link>https://news.ycombinator.com/user?id=vrm</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 15 Jun 2026 12:38:03 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=vrm" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by vrm in "Building durable workflows on Postgres"]]></title><description><![CDATA[
<p>TBH it's intended only for internal use (we don't even publish it as a crate at this point) so I don't particularly mind it being low-key. But I appreciate it!</p>
]]></description><pubDate>Thu, 28 May 2026 21:26:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48315737</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48315737</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48315737</guid></item><item><title><![CDATA[New comment by vrm in "Building durable workflows on Postgres"]]></title><description><![CDATA[
<p>If you don't need a ton of throughput I think `absurd` (and our Rust derivative `durable`) are very nice options that keep the client side extremely simple. It's also lightweight enough that a coding agent can keep the entire thing in its head easily and just run queries to look up state as needed.</p>
]]></description><pubDate>Thu, 28 May 2026 20:27:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=48314984</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48314984</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48314984</guid></item><item><title><![CDATA[New comment by vrm in "Building durable workflows on Postgres"]]></title><description><![CDATA[
<p>Since DBOS doesn't support Rust, we implemented a very minimal Rust version of this at <a href="https://github.com/tensorzero/durable" rel="nofollow">https://github.com/tensorzero/durable</a>. It has been quite stable and extensible but of course you need to be very careful with the SQL implementations. Hope this is interesting to readers here.</p>
]]></description><pubDate>Thu, 28 May 2026 18:59:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=48313781</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48313781</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48313781</guid></item><item><title><![CDATA[ATLAS: Autoformalized Textbook Library At Scale]]></title><description><![CDATA[
<p><a href="https://twitter.com/arnal_charles/status/2060009395107377282" rel="nofollow">https://twitter.com/arnal_charles/status/2060009395107377282</a>, <a href="https://xcancel.com/arnal_charles/status/2060009395107377282" rel="nofollow">https://xcancel.com/arnal_charles/status/2060009395107377282</a></p><p>Paper: <i>Formalizing Mathematics at Scale</i> - <a href="https://arxiv.org/abs/2605.29955" rel="nofollow">https://arxiv.org/abs/2605.29955</a></p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48311485">https://news.ycombinator.com/item?id=48311485</a></p>
<p>Points: 32</p>
<p># Comments: 4</p>
]]></description><pubDate>Thu, 28 May 2026 16:40:18 +0000</pubDate><link>https://github.com/facebookresearch/atlas-lean</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48311485</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48311485</guid></item><item><title><![CDATA[New comment by vrm in "Making deep learning go brrrr from first principles (2022)"]]></title><description><![CDATA[
<p>It’s really not a concept you can express in idiomatic Python very easily. The cost comes from the generated assembly, which copies data from global GPU memory into registers (slow, and bandwidth saturates quickly) and back again between the cosines. If you can avoid the intermediate roundtrip, that cuts the cost approximately in half.</p>
]]></description><pubDate>Sat, 23 May 2026 16:41:06 +0000</pubDate><link>https://news.ycombinator.com/item?id=48249089</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48249089</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48249089</guid></item><item><title><![CDATA[New comment by vrm in "Formal Verification Gates for AI Coding Loops"]]></title><description><![CDATA[
<p>One question I have here: I think this type of thing would be trivial to do in Rust with constructors, private fields, and newtypes. What am I getting on top of it?</p>
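<p>For reference, a minimal sketch of the Rust pattern named above (a newtype with a private field and a validating constructor). The `Percentage` type, its 0..=100 rule, and the `remaining` helper are hypothetical and chosen only for illustration; the sketch does not show what a formal verification gate would add on top.</p>

```rust
// Hypothetical example: `Percentage` and its 0..=100 rule are illustrative only.
mod percentage {
    /// A value guaranteed to lie in 0..=100.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Percentage(u8); // the inner field is private outside this module

    impl Percentage {
        /// The only way to build a `Percentage` from outside the module;
        /// out-of-range input is rejected here, once.
        pub fn new(value: u8) -> Result<Self, String> {
            if value <= 100 {
                Ok(Percentage(value))
            } else {
                Err(format!("{} is not in 0..=100", value))
            }
        }

        /// Read-only access to the validated value.
        pub fn get(self) -> u8 {
            self.0
        }
    }
}

use percentage::Percentage;

/// Callers can rely on the invariant without re-checking it,
/// so this subtraction cannot underflow.
fn remaining(p: Percentage) -> u8 {
    100 - p.get()
}
```

<p>Because the field is private, code outside the `percentage` module cannot write `Percentage(250)` directly; every value has to pass through `new`, so the invariant is enforced by the compiler's visibility rules rather than by convention.</p>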
]]></description><pubDate>Wed, 20 May 2026 21:05:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=48214113</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=48214113</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48214113</guid></item><item><title><![CDATA[Stop comparing price per million tokens: the hidden LLM API costs]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/stop-comparing-price-per-million-tokens-the-hidden-llm-api-costs/">https://www.tensorzero.com/blog/stop-comparing-price-per-million-tokens-the-hidden-llm-api-costs/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47798525">https://news.ycombinator.com/item?id=47798525</a></p>
<p>Points: 3</p>
<p># Comments: 2</p>
]]></description><pubDate>Thu, 16 Apr 2026 19:46:29 +0000</pubDate><link>https://www.tensorzero.com/blog/stop-comparing-price-per-million-tokens-the-hidden-llm-api-costs/</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=47798525</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47798525</guid></item><item><title><![CDATA[Ask HN: What do you recommend for test observability?]]></title><description><![CDATA[
<p>I maintain an OSS project with a very involved CI setup. We're at the point where it is worth having observability into which tests are flaky, especially within intra-test-run retries. An ideal solution would be a managed service that takes junit.xml exports from cargo nextest, vitest, playwright, pytest, and go test.
What do you all recommend?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45222181">https://news.ycombinator.com/item?id=45222181</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 12 Sep 2025 13:52:54 +0000</pubDate><link>https://news.ycombinator.com/item?id=45222181</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=45222181</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45222181</guid></item><item><title><![CDATA[Improving Cursor Tab with RL]]></title><description><![CDATA[
<p>Article URL: <a href="https://cursor.com/en/blog/tab-rl">https://cursor.com/en/blog/tab-rl</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45218365">https://news.ycombinator.com/item?id=45218365</a></p>
<p>Points: 6</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 12 Sep 2025 03:19:08 +0000</pubDate><link>https://cursor.com/en/blog/tab-rl</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=45218365</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45218365</guid></item><item><title><![CDATA[New comment by vrm in "Databricks is raising a Series K Investment at >$100B valuation"]]></title><description><![CDATA[
<p>That is earnings (net income), not revenue (top line), so these are wildly different and incomparable numbers.</p>
]]></description><pubDate>Wed, 20 Aug 2025 14:31:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=44962319</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44962319</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44962319</guid></item><item><title><![CDATA[New comment by vrm in "Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?"]]></title><description><![CDATA[
<p>A 6:1 parameter ratio is too small for specdec (speculative decoding) to have that much of an effect. You'd really want to see 10:1 or even more for this to start to matter.</p>
]]></description><pubDate>Fri, 08 Aug 2025 20:37:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44841454</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44841454</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44841454</guid></item><item><title><![CDATA[New comment by vrm in "Diffsitter – A Tree-sitter based AST difftool to get meaningful semantic diffs"]]></title><description><![CDATA[
<p>This is neat! I think in general there are really deep connections between semantically meaningful diffs (across modalities) and supervision of AI models. You might imagine a human-in-the-loop workflow where the human makes edits to a particular generation and then those edits are used as supervision for a future implementation of that thing. We did some related work here: <a href="https://www.tensorzero.com/blog/automatically-evaluating-ai-coding-assistants-with-each-git-commit/" rel="nofollow">https://www.tensorzero.com/blog/automatically-evaluating-ai-...</a> on the coding use case but I'm interested in all the different approaches to the problem and especially on less structured domains.</p>
]]></description><pubDate>Thu, 10 Jul 2025 19:33:36 +0000</pubDate><link>https://news.ycombinator.com/item?id=44524681</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44524681</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44524681</guid></item><item><title><![CDATA[Automatically Evaluating AI Coding Assistants with Each Git Commit]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/automatically-evaluating-ai-coding-assistants-with-each-git-commit/">https://www.tensorzero.com/blog/automatically-evaluating-ai-coding-assistants-with-each-git-commit/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44475733">https://news.ycombinator.com/item?id=44475733</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Sat, 05 Jul 2025 21:34:46 +0000</pubDate><link>https://www.tensorzero.com/blog/automatically-evaluating-ai-coding-assistants-with-each-git-commit/</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44475733</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44475733</guid></item><item><title><![CDATA[Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Tree Search]]></title><description><![CDATA[
<p>Article URL: <a href="https://arxiv.org/abs/2503.04412">https://arxiv.org/abs/2503.04412</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44439235">https://news.ycombinator.com/item?id=44439235</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Wed, 02 Jul 2025 00:30:12 +0000</pubDate><link>https://arxiv.org/abs/2503.04412</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44439235</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44439235</guid></item><item><title><![CDATA[New comment by vrm in "Reverse Engineering Cursor's LLM Client"]]></title><description><![CDATA[
<p>If you haven't, check out our repo -- it's free, fully self-hosted, production-grade, and designed for precisely this application :)</p><p><a href="https://github.com/TensorZero/tensorzero">https://github.com/TensorZero/tensorzero</a></p>
]]></description><pubDate>Sat, 07 Jun 2025 20:28:22 +0000</pubDate><link>https://news.ycombinator.com/item?id=44212419</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44212419</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44212419</guid></item><item><title><![CDATA[New comment by vrm in "Reverse Engineering Cursor's LLM Client"]]></title><description><![CDATA[
<p>I definitely see different prompts based on what I'm doing in the app. As we mentioned, there are different prompts depending on whether you're asking questions, doing Cmd-K edits, working in the shell, etc. I'd also imagine that they customize the prompt by model (unobserved here, but we can also customize per-model using TensorZero and A/B test).</p>
]]></description><pubDate>Sat, 07 Jun 2025 14:29:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=44209875</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44209875</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44209875</guid></item><item><title><![CDATA[New comment by vrm in "Reverse Engineering Cursor's LLM Client"]]></title><description><![CDATA[
<p>We're doing the latter! Cursor lets you configure the OpenAI base URL, so we were able to have Cursor call Ngrok -> Nginx (for auth) -> TensorZero -> LLMs. We explain in detail in the blog post.</p>
]]></description><pubDate>Sat, 07 Jun 2025 14:27:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=44209862</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44209862</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44209862</guid></item><item><title><![CDATA[New comment by vrm in "Reverse Engineering Cursor's LLM Client"]]></title><description><![CDATA[
<p>Wireshark would work for seeing the requests from the desktop app to Cursor’s servers (which make the actual LLM requests). But if you’re interested in what the actual requests to LLMs look like from Cursor’s servers, you have to set something like this up. Plus, this lets us modify the request and A/B test variations!</p>
]]></description><pubDate>Sat, 07 Jun 2025 12:05:56 +0000</pubDate><link>https://news.ycombinator.com/item?id=44209106</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=44209106</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44209106</guid></item><item><title><![CDATA[New comment by vrm in "AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms"]]></title><description><![CDATA[
<p>We're working on an OSS industrial-grade version of this at TensorZero, but there's a long way to go. I think the easiest out-of-the-box solution today is probably OpenAI RFT, but that's a partial solution with substantial vendor lock-in.</p>
]]></description><pubDate>Wed, 14 May 2025 16:03:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=43986143</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=43986143</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43986143</guid></item><item><title><![CDATA[New comment by vrm in "AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms"]]></title><description><![CDATA[
<p>This is very neat work! Will be interested in how they make this sort of thing available to the public but it is clear from some of the results they mention that search + LLM is one path to the production of net-new knowledge from AI systems.</p>
]]></description><pubDate>Wed, 14 May 2025 15:32:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=43985774</link><dc:creator>vrm</dc:creator><comments>https://news.ycombinator.com/item?id=43985774</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43985774</guid></item></channel></rss>