<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: GabrielBianconi</title><link>https://news.ycombinator.com/user?id=GabrielBianconi</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 13 Jun 2026 04:11:25 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=GabrielBianconi" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by GabrielBianconi in "Even (very) noisy LLM evaluators are useful for improving AI agents"]]></title><description><![CDATA[
<p>An evaluator is any function that scores (i.e. "evaluates") the output of your LLM system (e.g. your agent).<p>For example:<p>- You write a heuristic (regex, code, etc.) that assigns a score to an output<p>- You have another LLM score the output from your system (aka "LLM-as-a-judge")<p>- You have an automated system that verifies the generated outputs (e.g. does the generated code compile or pass tests?)<p>People often talk about "LLM evals" (evaluations), which typically include a set of evaluators, i.e. scoring functions.<p>We'll make this clearer next time!</p>
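<p>A minimal sketch of two of the evaluator kinds listed above (a regex heuristic and an automated verifier), assuming outputs are plain strings and scores are floats in [0, 1]. The function names and the date-matching rule are hypothetical and only for illustration; they are not taken from the article or from any library.</p>

```python
import re
from statistics import mean


def regex_evaluator(output: str) -> float:
    """Heuristic evaluator (hypothetical rule): 1.0 if the output contains an ISO date."""
    return 1.0 if re.search(r"\b\d{4}-\d{2}-\d{2}\b", output) else 0.0


def compiles_evaluator(generated_code: str) -> float:
    """Verifier-style evaluator: 1.0 if the generated Python source compiles."""
    try:
        compile(generated_code, "generated_output", "exec")
        return 1.0
    except SyntaxError:
        return 0.0


def run_eval(outputs: list[str], evaluator) -> float:
    """An 'eval' in the simplest sense: the mean evaluator score over a set of outputs."""
    return mean(evaluator(o) for o in outputs)


if __name__ == "__main__":
    print(run_eval(["Due 2026-05-30", "Due soon"], regex_evaluator))
    print(run_eval(["x = 1", "def f(:"], compiles_evaluator))
```

<p>An LLM-as-a-judge evaluator would have the same shape: a function from output to score, with the score produced by a model call instead of a rule.</p>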
]]></description><pubDate>Sat, 30 May 2026 07:49:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=48333754</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=48333754</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48333754</guid></item><item><title><![CDATA[Even (very) noisy LLM evaluators are useful for improving AI agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/even-very-noisy-llm-evaluators-are-useful-for-improving-ai-agents/">https://www.tensorzero.com/blog/even-very-noisy-llm-evaluators-are-useful-for-improving-ai-agents/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48291016">https://news.ycombinator.com/item?id=48291016</a></p>
<p>Points: 35</p>
<p># Comments: 10</p>
]]></description><pubDate>Wed, 27 May 2026 07:49:56 +0000</pubDate><link>https://www.tensorzero.com/blog/even-very-noisy-llm-evaluators-are-useful-for-improving-ai-agents/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=48291016</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48291016</guid></item><item><title><![CDATA[Designing for Agents]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/teddy_riker/status/2047312986696454584">https://twitter.com/teddy_riker/status/2047312986696454584</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=48109916">https://news.ycombinator.com/item?id=48109916</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 12 May 2026 15:41:26 +0000</pubDate><link>https://twitter.com/teddy_riker/status/2047312986696454584</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=48109916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=48109916</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "Stop comparing price per million tokens: the hidden LLM API costs"]]></title><description><![CDATA[
<p>It's getting more and more challenging to keep track!</p>
]]></description><pubDate>Thu, 16 Apr 2026 20:03:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=47798743</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=47798743</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47798743</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "If DSPy is so great, why isn't anyone using it?"]]></title><description><![CDATA[
<p>TensorZero works with the OpenAI SDK out of the box:<p><pre><code>from openai import OpenAI

# Point the client to the TensorZero Gateway
client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    # Call any model provider (or TensorZero function)
    model="tensorzero::model_name::anthropic::claude-sonnet-4-6",
    messages=[
        {
            "role": "user",
            "content": "Share a fun fact about TensorZero.",
        }
    ],
)
</code></pre>
<p>You can layer additional features only as needed (fallbacks, templates, A/B testing, etc).</p>
]]></description><pubDate>Mon, 23 Mar 2026 16:58:40 +0000</pubDate><link>https://news.ycombinator.com/item?id=47492113</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=47492113</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47492113</guid></item><item><title><![CDATA[We're building an automated AI engineer, and it works]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/automated-ai-engineer/">https://www.tensorzero.com/blog/automated-ai-engineer/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47491580">https://news.ycombinator.com/item?id=47491580</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 23 Mar 2026 16:20:03 +0000</pubDate><link>https://www.tensorzero.com/blog/automated-ai-engineer/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=47491580</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47491580</guid></item><item><title><![CDATA[Mitchell Hashimoto on Feature Design [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/mitchellh/status/2001810354096214059">https://twitter.com/mitchellh/status/2001810354096214059</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46321305">https://news.ycombinator.com/item?id=46321305</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 19 Dec 2025 01:36:01 +0000</pubDate><link>https://twitter.com/mitchellh/status/2001810354096214059</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=46321305</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46321305</guid></item><item><title><![CDATA[Bandits in Your LLM Gateway]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/bandits-in-your-llm-gateway/">https://www.tensorzero.com/blog/bandits-in-your-llm-gateway/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45888437">https://news.ycombinator.com/item?id=45888437</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 11 Nov 2025 15:32:13 +0000</pubDate><link>https://www.tensorzero.com/blog/bandits-in-your-llm-gateway/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45888437</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45888437</guid></item><item><title><![CDATA[Claude Plays Catan [video]]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.youtube.com/watch?v=BER3EhUIyz0">https://www.youtube.com/watch?v=BER3EhUIyz0</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45417147">https://news.ycombinator.com/item?id=45417147</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Mon, 29 Sep 2025 18:31:35 +0000</pubDate><link>https://www.youtube.com/watch?v=BER3EhUIyz0</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45417147</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45417147</guid></item><item><title><![CDATA[Is OpenAI's Reinforcement Fine-Tuning (RFT) Worth It?]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/is-openai-reinforcement-fine-tuning-rft-worth-it/">https://www.tensorzero.com/blog/is-openai-reinforcement-fine-tuning-rft-worth-it/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45375954">https://news.ycombinator.com/item?id=45375954</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 25 Sep 2025 17:27:02 +0000</pubDate><link>https://www.tensorzero.com/blog/is-openai-reinforcement-fine-tuning-rft-worth-it/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45375954</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45375954</guid></item><item><title><![CDATA[How Kimi K2 achieves efficient RL parameter updates]]></title><description><![CDATA[
<p>Article URL: <a href="https://moonshotai.github.io/checkpoint-engine/">https://moonshotai.github.io/checkpoint-engine/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45263188">https://news.ycombinator.com/item?id=45263188</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 16 Sep 2025 14:57:01 +0000</pubDate><link>https://moonshotai.github.io/checkpoint-engine/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45263188</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45263188</guid></item><item><title><![CDATA[Ask HN: How much would it cost to own and operate a personal gTLD?]]></title><description><![CDATA[
<p>I've briefly looked into it before, but the discussion here [1] earlier today made me curious again:<p>How much would it cost to own and operate a personal gTLD? Say, `.gabriel`.<p>ChatGPT claims $250k to start then $100k per year. Is this reasonable? Completely off?<p>[1] https://news.ycombinator.com/item?id=45068215</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45069326">https://news.ycombinator.com/item?id=45069326</a></p>
<p>Points: 3</p>
<p># Comments: 1</p>
]]></description><pubDate>Fri, 29 Aug 2025 20:59:35 +0000</pubDate><link>https://news.ycombinator.com/item?id=45069326</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45069326</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45069326</guid></item><item><title><![CDATA[Deploying DeepSeek on 96 H100 GPUs]]></title><description><![CDATA[
<p>Article URL: <a href="https://lmsys.org/blog/2025-05-05-large-scale-ep/">https://lmsys.org/blog/2025-05-05-large-scale-ep/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45064329">https://news.ycombinator.com/item?id=45064329</a></p>
<p>Points: 285</p>
<p># Comments: 80</p>
]]></description><pubDate>Fri, 29 Aug 2025 14:07:28 +0000</pubDate><link>https://lmsys.org/blog/2025-05-05-large-scale-ep/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45064329</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45064329</guid></item><item><title><![CDATA[Sporks of AGI: why the Real Thing is better than the Next Best Thing]]></title><description><![CDATA[
<p>Article URL: <a href="https://sergeylevine.substack.com/p/sporks-of-agi">https://sergeylevine.substack.com/p/sporks-of-agi</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45054098">https://news.ycombinator.com/item?id=45054098</a></p>
<p>Points: 3</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 28 Aug 2025 16:27:13 +0000</pubDate><link>https://sergeylevine.substack.com/p/sporks-of-agi</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=45054098</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45054098</guid></item><item><title><![CDATA[We raised $7.3M to build an open-source stack for industrial-grade LLM apps]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.tensorzero.com/blog/tensorzero-raises-7-3m-seed-round-to-build-an-open-source-stack-for-industrial-grade-llm-applications/">https://www.tensorzero.com/blog/tensorzero-raises-7-3m-seed-round-to-build-an-open-source-stack-for-industrial-grade-llm-applications/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44948735">https://news.ycombinator.com/item?id=44948735</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 19 Aug 2025 06:13:13 +0000</pubDate><link>https://www.tensorzero.com/blog/tensorzero-raises-7-3m-seed-round-to-build-an-open-source-stack-for-industrial-grade-llm-applications/</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44948735</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44948735</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "Fine-tuned small LLMs can beat large ones with programmatic data curation"]]></title><description><![CDATA[
<p>We set up dataset splits and followed the usual best practices. Of course, if you overdo things, you can still hack benchmarks; our goal isn't to publish SOTA numbers but rather to illustrate results from our methodology. We didn't even tune hyperparameters; we just used the default choices. Definitely a valid concern for teams chasing SOTA though.<p>Thanks!</p>
]]></description><pubDate>Tue, 05 Aug 2025 14:48:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=44798715</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44798715</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44798715</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "Ask HN: How does the Postgres ecosystem compare to Vitess at 1PB+?"]]></title><description><![CDATA[
<p>Thanks, Sam! I'm excited to see what you guys come up with.</p>
]]></description><pubDate>Mon, 04 Aug 2025 22:56:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=44792265</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44792265</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44792265</guid></item><item><title><![CDATA[Ask HN: How does the Postgres ecosystem compare to Vitess at 1PB+?]]></title><description><![CDATA[
<p>I'm not an expert in databases.<p>The MySQL ecosystem has a mature open-source solution for scaling horizontally with Vitess.<p>The Postgres ecosystem seems to have alternatives like Citus, CockroachDB, etc.<p>Are they similarly mature? How do they compare for massive-scale deployments (1PB+ of data, insert-heavy workload)?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44792168">https://news.ycombinator.com/item?id=44792168</a></p>
<p>Points: 4</p>
<p># Comments: 2</p>
]]></description><pubDate>Mon, 04 Aug 2025 22:43:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=44792168</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44792168</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44792168</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "Fine-tuned small LLMs can beat large ones with programmatic data curation"]]></title><description><![CDATA[
<p>With supervised fine-tuning (SFT), you'll often see good results with 100-1000+ datapoints (they can be variations of the same prompt template). If you have more limited data, reinforcement fine-tuning (RFT) can work well in the 10-100 range.<p>Good luck!</p>
]]></description><pubDate>Mon, 04 Aug 2025 19:57:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=44790698</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44790698</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44790698</guid></item><item><title><![CDATA[New comment by GabrielBianconi in "Fine-tuned small LLMs can beat large ones with programmatic data curation"]]></title><description><![CDATA[
<p>AFAIK, distillation typically refers to tuning on the logits of the larger model, so you wouldn't be able to do that with fine-tuning APIs (OpenAI + Google in our blog post). We fine-tune on the outputs themselves.<p>But broadly speaking, yes, we generate data using a large model, curate the best samples using metrics from the environment, and fine-tune on that data. This isn't a novel technique from an academic perspective; our focus is on applying it to different use cases (e.g. agentic RAG, agentic tool use) and models (OpenAI, Google, Qwen).<p>Thanks!</p>
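<p>The generate, curate, fine-tune loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the pipeline from the blog post: the sample fields (prompt, completion, score), the threshold, and the helper names are all hypothetical, and the output uses the chat-style JSONL layout (one "messages" object per line) that hosted fine-tuning APIs commonly accept.</p>

```python
import json


def curate(samples, min_score=0.8, top_k=None):
    """Keep samples whose environment metric meets the threshold, best first."""
    kept = [s for s in samples if s["score"] >= min_score]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return kept if top_k is None else kept[:top_k]


def to_chat_jsonl(samples):
    """Serialize curated samples as one chat-format JSON object per line."""
    rows = []
    for s in samples:
        row = {
            "messages": [
                {"role": "user", "content": s["prompt"]},
                {"role": "assistant", "content": s["completion"]},
            ]
        }
        rows.append(json.dumps(row))
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical outputs from a large model, scored by the environment.
    samples = [
        {"prompt": "q1", "completion": "good answer", "score": 0.95},
        {"prompt": "q2", "completion": "weak answer", "score": 0.40},
        {"prompt": "q3", "completion": "fine answer", "score": 0.85},
    ]
    print(to_chat_jsonl(curate(samples)))
```

<p>Note that this trains on the large model's sampled outputs rather than its logits, which is the distinction drawn above.</p>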
]]></description><pubDate>Mon, 04 Aug 2025 19:42:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=44790542</link><dc:creator>GabrielBianconi</dc:creator><comments>https://news.ycombinator.com/item?id=44790542</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44790542</guid></item></channel></rss>