<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ReDeiPirati</title><link>https://news.ycombinator.com/user?id=ReDeiPirati</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sat, 18 Apr 2026 11:14:16 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ReDeiPirati" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ReDeiPirati in "I still prefer MCP over skills"]]></title><description><![CDATA[
<p>> Don't focus on what you prefer: it does not matter. Focus on what tool the LLM requires to do its work in the best way.<p>I've noticed that LLMs tend to default to CLIs even when there's a connected MCP, likely because a) CLIs are overrepresented in training data, and b) they are more composable and inspectable by design, which makes them a better choice during tool selection.</p>
]]></description><pubDate>Fri, 10 Apr 2026 09:20:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=47715494</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=47715494</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47715494</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Ask HN: Who is hiring? (April 2026)"]]></title><description><![CDATA[
<p>HumanSignal | <a href="https://humansignal.com/" rel="nofollow">https://humansignal.com/</a> | REMOTE North America, South America, Europe | Full-time | Engineering roles<p>We created Label Studio (<a href="https://github.com/HumanSignal/label-studio/" rel="nofollow">https://github.com/HumanSignal/label-studio/</a>), which has quickly become the most popular open source data labeling and AI evaluation platform with 350K+ users around the world and millions of annotations each month, alongside a community of thousands of data scientists and ML engineers sharing knowledge and working to advance AI.<p>We're a remote team full of people passionate about open source and AI. We are very pragmatic and strong team players.<p>We are looking for multiple roles to support the growth of Label Studio:<p>- Senior Backend Engineer: <a href="https://boards.greenhouse.io/humansignal/jobs/4291492004" rel="nofollow">https://boards.greenhouse.io/humansignal/jobs/4291492004</a><p>- Senior Frontend Engineer: <a href="https://boards.greenhouse.io/humansignal/jobs/4630367004" rel="nofollow">https://boards.greenhouse.io/humansignal/jobs/4630367004</a><p>- Senior Full Stack Engineer: <a href="https://boards.greenhouse.io/humansignal/jobs/4803399004" rel="nofollow">https://boards.greenhouse.io/humansignal/jobs/4803399004</a><p>- AI Engineer (GTM): <a href="https://job-boards.greenhouse.io/humansignal/jobs/5828847004" rel="nofollow">https://job-boards.greenhouse.io/humansignal/jobs/5828847004</a><p>See <a href="https://boards.greenhouse.io/humansignal" rel="nofollow">https://boards.greenhouse.io/humansignal</a> for more openings.<p>---<p>HumanSignal Service is looking for AI trainers for the next frontier of AI. Especially of interest:<p>- Graphic Designers<p>- Content Creators (Social Media)<p>- Podcasters/Voice Actors<p>- Medical Experts<p>- Automotive Experts<p>Apply here: <a href="https://join.humansignal.com" rel="nofollow">https://join.humansignal.com</a></p>
]]></description><pubDate>Wed, 01 Apr 2026 17:13:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47603680</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=47603680</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47603680</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Detecting and Preventing Distillation Attacks"]]></title><description><![CDATA[
<p>I think they are exposing how fragile and vulnerable they really are, and I wonder when a group of highly motivated individuals will organize to create truly community-driven distilled models.</p>
]]></description><pubDate>Tue, 24 Feb 2026 10:19:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47135254</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=47135254</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47135254</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Anthropic announces proof of distillation at scale by MiniMax, DeepSeek,Moonshot"]]></title><description><![CDATA[
<p>I think they are exposing how fragile and vulnerable they really are, and I wonder when a group of highly motivated individuals will organize to create truly community-driven distilled models.</p>
]]></description><pubDate>Tue, 24 Feb 2026 10:18:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=47135240</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=47135240</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47135240</guid></item><item><title><![CDATA[Skills in the 21st Century]]></title><description><![CDATA[
<p>Article URL: <a href="https://twitter.com/levie/status/2010055953157357622">https://twitter.com/levie/status/2010055953157357622</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=46575320">https://news.ycombinator.com/item?id=46575320</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 11 Jan 2026 12:52:14 +0000</pubDate><link>https://twitter.com/levie/status/2010055953157357622</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=46575320</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46575320</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Agent design is still hard"]]></title><description><![CDATA[
<p>> We find testing and evals to be the hardest problem here. This is not entirely surprising, but the agentic nature makes it even harder. Unlike prompts, you cannot just do the evals in some external system because there’s too much you need to feed into it. This means you want to do evals based on observability data or instrumenting your actual test runs. So far none of the solutions we have tried have convinced us that they found the right approach here.<p>I'm curious about the solutions the OP has tried so far.</p>
]]></description><pubDate>Sat, 22 Nov 2025 16:54:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=46016124</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=46016124</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46016124</guid></item><item><title><![CDATA[Benchmarking Humans and AI in Contract Drafting]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.legalbenchmarks.ai/research/phase-2-research">https://www.legalbenchmarks.ai/research/phase-2-research</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45528006">https://news.ycombinator.com/item?id=45528006</a></p>
<p>Points: 4</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 09 Oct 2025 14:14:17 +0000</pubDate><link>https://www.legalbenchmarks.ai/research/phase-2-research</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=45528006</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45528006</guid></item><item><title><![CDATA[Benchmarking Humans and AI in Contract Drafting]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.legalbenchmarks.ai/research/phase-2-research">https://www.legalbenchmarks.ai/research/phase-2-research</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45288019">https://news.ycombinator.com/item?id=45288019</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Thu, 18 Sep 2025 10:44:35 +0000</pubDate><link>https://www.legalbenchmarks.ai/research/phase-2-research</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=45288019</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45288019</guid></item><item><title><![CDATA[Why Most LLM Chatbots Never Make It to Production]]></title><description><![CDATA[
<p>Article URL: <a href="https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/">https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45238254">https://news.ycombinator.com/item?id=45238254</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Sun, 14 Sep 2025 07:51:24 +0000</pubDate><link>https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=45238254</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45238254</guid></item><item><title><![CDATA[Why Most LLM Chatbots Never Make It to Production]]></title><description><![CDATA[
<p>Article URL: <a href="https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/">https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=45221639">https://news.ycombinator.com/item?id=45221639</a></p>
<p>Points: 2</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 12 Sep 2025 12:59:11 +0000</pubDate><link>https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=45221639</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45221639</guid></item><item><title><![CDATA[Evaluating the GPT-5 Series on Custom Benchmarks]]></title><description><![CDATA[
<p>Article URL: <a href="https://labelstud.io/blog/evaluating-the-gpt-5-series-on-custom-benchmarks/">https://labelstud.io/blog/evaluating-the-gpt-5-series-on-custom-benchmarks/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44836980">https://news.ycombinator.com/item?id=44836980</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Fri, 08 Aug 2025 13:53:19 +0000</pubDate><link>https://labelstud.io/blog/evaluating-the-gpt-5-series-on-custom-benchmarks/</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44836980</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44836980</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Unsafe and Unpredictable: My Volvo EX90 Experience"]]></title><description><![CDATA[
<p>> And I’m saying this as a Swede. Buy German cars, specifically within the Volkswagen auto group (Audi, VW, Skoda etc) if you want reliable quality.<p>I own a 2020 BMW with an electronic gearbox, which broke at around 80k km just a couple of months after the warranty expired (yeah I know!). It was a bit of a headache going back and forth with BMW to request a free repair. Fortunately, the headquarters agreed to cover the cost, and they installed a refurbished electronic gearbox. I was quite relieved that I didn’t have to pay about €10K out of pocket!<p>All that to say that I wouldn’t call BMW particularly reliable in terms of quality these days, but their customer support was decent, at least in my case.</p>
]]></description><pubDate>Wed, 23 Jul 2025 12:44:15 +0000</pubDate><link>https://news.ycombinator.com/item?id=44658592</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44658592</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44658592</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "An open letter from educators who refuse the call to adopt GenAI in education"]]></title><description><![CDATA[
<p>Ultimately, those are tools, and I think the goal is to educate students to use them properly, especially since I don't expect the knowledge paradox to disappear anytime soon with these models.</p>
]]></description><pubDate>Fri, 11 Jul 2025 08:11:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=44529539</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44529539</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44529539</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "About AI Evals"]]></title><description><![CDATA[
<p>I'd have agreed with you if the principles were different. But what was shown in the content is EXACTLY what those tools are doing today. Actually, those tools are way more powerful and consider and cover way more scenarios.<p>> There’s nothing wrong with starting from scratch or rebuilding an existing tool from the ground up. There’s no reason to blindly build from the status quo.<p>Generally speaking, all the options are OK, but not if you want to have something up as fast as you can or if your team is piloting something. I think the time you spend vibe coding it is greater than the time needed to set up any of those tools.<p>And BTW, you shouldn't vibe code something that proprietary data flows through. At the very least, you should work with co-pilots.</p>
]]></description><pubDate>Thu, 03 Jul 2025 19:16:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=44458301</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44458301</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44458301</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "About AI Evals"]]></title><description><![CDATA[
<p>> Q: What makes a good custom interface for reviewing LLM outputs?
Great interfaces make human review fast, clear, and motivating. We recommend building your own annotation tool customized to your domain ...<p>Ah! This is horrible advice. Why would you recommend reinventing the wheel when there is already great open source software available? Just use <a href="https://github.com/HumanSignal/label-studio/">https://github.com/HumanSignal/label-studio/</a> or any other open source annotation software you want to get started. These tools already cover pretty much all the possible use cases, and if they don't, you can just build on top of them instead of building from zero.</p>
]]></description><pubDate>Thu, 03 Jul 2025 17:52:45 +0000</pubDate><link>https://news.ycombinator.com/item?id=44457557</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44457557</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44457557</guid></item><item><title><![CDATA[Ask HN: How are you evaluating your LLMs in production?]]></title><description><![CDATA[
<p>Hello HN! Which tools do you use to evaluate your LLMs and agents in production?</p>
<hr>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44436590">https://news.ycombinator.com/item?id=44436590</a></p>
<p>Points: 2</p>
<p># Comments: 1</p>
]]></description><pubDate>Tue, 01 Jul 2025 18:09:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=44436590</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44436590</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44436590</guid></item><item><title><![CDATA[Your Data Engine Is the Moat - Here’s How to Own It.]]></title><description><![CDATA[
<p>Article URL: <a href="https://labelstud.io/blog/your-data-engine-is-the-moat-here-s-how-to-own-it/">https://labelstud.io/blog/your-data-engine-is-the-moat-here-s-how-to-own-it/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44369093">https://news.ycombinator.com/item?id=44369093</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Jun 2025 18:17:07 +0000</pubDate><link>https://labelstud.io/blog/your-data-engine-is-the-moat-here-s-how-to-own-it/</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44369093</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44369093</guid></item><item><title><![CDATA[Meta is reportedly making a $15B bet on AGI by purchasing 49% of Scale AI]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.theverge.com/news/684322/meta-scale-ai-15-billion-investment-zuckerberg">https://www.theverge.com/news/684322/meta-scale-ai-15-billion-investment-zuckerberg</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=44245128">https://news.ycombinator.com/item?id=44245128</a></p>
<p>Points: 4</p>
<p># Comments: 1</p>
]]></description><pubDate>Wed, 11 Jun 2025 07:35:04 +0000</pubDate><link>https://www.theverge.com/news/684322/meta-scale-ai-15-billion-investment-zuckerberg</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=44245128</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44245128</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Ask HN: Cursor or Windsurf?"]]></title><description><![CDATA[
<p>I recently started using Cursor to add a new feature to a small codebase for work, after a couple of years without coding. It took me a couple of tries to figure out how to work with the tool effectively, but it worked great! I'm now learning how to use it with TaskMaster; it's such a different way to build and play with software. Oh, one important note: I also went with Cursor because of the pricing, which, despite being confusing in terms of fast vs. slow requests, feels less consumption-based.<p>BTW, there's a new OSS competitor in town that made the front page a couple of days ago - Void: Open-source Cursor alternative <a href="https://news.ycombinator.com/item?id=43927926">https://news.ycombinator.com/item?id=43927926</a></p>
]]></description><pubDate>Mon, 12 May 2025 07:15:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=43960386</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=43960386</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43960386</guid></item><item><title><![CDATA[New comment by ReDeiPirati in "Clair Obscur Metacritic user score"]]></title><description><![CDATA[
<p>Have you played Final Fantasy VII Rebirth?</p>
]]></description><pubDate>Fri, 02 May 2025 14:55:41 +0000</pubDate><link>https://news.ycombinator.com/item?id=43870664</link><dc:creator>ReDeiPirati</dc:creator><comments>https://news.ycombinator.com/item?id=43870664</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=43870664</guid></item></channel></rss>