Hacker News: ReDeiPirati

New comment by ReDeiPirati in "The Problem with the Ferrari Luce EV Offers a Lesson for Every Leader"

ReDeiPirati — Thu, 28 May 2026 12:26:45 +0000

I really like how you put it! This is how I'd have imagined the Apple car, not a Ferrari. The "irony" is that people defending this monstrosity say things like "this is the type of comment made by someone who cannot afford it", but they are completely missing the point. If it doesn't make people dream about it, then it's not worthy of being a real Ferrari product.

Show HN: BetterCallClaude – Open Source AI Legal Agents for Italy

ReDeiPirati — Thu, 28 May 2026 05:16:29 +0000

Article URL: https://bettercallclaude.it/

Comments URL: https://news.ycombinator.com/item?id=48304854

Points: 2

# Comments: 0

New comment by ReDeiPirati in "Ferrari Luce"

ReDeiPirati — Tue, 26 May 2026 06:15:07 +0000

The new Apple car looks so good /s

New comment by ReDeiPirati in "I still prefer MCP over skills"

ReDeiPirati — Fri, 10 Apr 2026 09:20:15 +0000

> Don't focus on what you prefer: it does not matter. Focus on what tool the LLM requires to do its work in the best way.

I've noticed that LLMs tend to default to CLIs even when there's a connected MCP, likely because a) CLIs are overrepresented in the training data, and b) they are more composable and inspectable by design, which makes them a better choice during tool selection.

New comment by ReDeiPirati in "Ask HN: Who is hiring? (April 2026)"

ReDeiPirati — Wed, 01 Apr 2026 17:13:11 +0000

HumanSignal | https://humansignal.com/ | REMOTE North America, South America, Europe | Full-time | Engineering roles

We created Label Studio (https://github.com/HumanSignal/label-studio/), which has quickly become the most popular open source data labeling and AI evaluation platform with 350K+ users around the world and millions of annotations each month, alongside a community of thousands of data scientists and ML engineers sharing knowledge and working to advance AI.

We're a remote team full of people passionate about open source and AI. We are very pragmatic and strong team players.

We are hiring for multiple roles to support the growth of Label Studio:

- Senior Backend Engineer: https://boards.greenhouse.io/humansignal/jobs/4291492004

- Senior Frontend Engineer: https://boards.greenhouse.io/humansignal/jobs/4630367004

- Senior Full Stack Engineer: https://boards.greenhouse.io/humansignal/jobs/4803399004

- AI Engineer (GTM): https://job-boards.greenhouse.io/humansignal/jobs/5828847004

See https://boards.greenhouse.io/humansignal for more openings.

---

HumanSignal Service is looking for AI trainers for the next frontier of AI. Especially of interest:

- Graphic Designers

- Content Creators (Social Media)

- Podcasters/Voice Actors

- Medical Experts

- Automotive Experts

Apply here: https://join.humansignal.com

New comment by ReDeiPirati in "Detecting and Preventing Distillation Attacks"

ReDeiPirati — Tue, 24 Feb 2026 10:19:19 +0000

I think they are exposing how fragile and vulnerable they really are, and I wonder when a group of highly motivated individuals will organize to create truly community-driven distilled models.

New comment by ReDeiPirati in "Anthropic announces proof of distillation at scale by MiniMax, DeepSeek,Moonshot"

ReDeiPirati — Tue, 24 Feb 2026 10:18:09 +0000

Skills in the 21st Century

ReDeiPirati — Sun, 11 Jan 2026 12:52:14 +0000

Article URL: https://twitter.com/levie/status/2010055953157357622

Comments URL: https://news.ycombinator.com/item?id=46575320

Points: 1

# Comments: 0

New comment by ReDeiPirati in "Agent design is still hard"

ReDeiPirati — Sat, 22 Nov 2025 16:54:51 +0000

> We find testing and evals to be the hardest problem here. This is not entirely surprising, but the agentic nature makes it even harder. Unlike prompts, you cannot just do the evals in some external system because there’s too much you need to feed into it. This means you want to do evals based on observability data or instrumenting your actual test runs. So far none of the solutions we have tried have convinced us that they found the right approach here.

I'm curious about the solutions the OP has tried so far here.

Benchmarking Humans and AI in Contract Drafting

ReDeiPirati — Thu, 09 Oct 2025 14:14:17 +0000

Article URL: https://www.legalbenchmarks.ai/research/phase-2-research

Comments URL: https://news.ycombinator.com/item?id=45528006

Points: 4

# Comments: 0

Benchmarking Humans and AI in Contract Drafting

ReDeiPirati — Thu, 18 Sep 2025 10:44:35 +0000

Article URL: https://www.legalbenchmarks.ai/research/phase-2-research

Comments URL: https://news.ycombinator.com/item?id=45288019

Points: 2

# Comments: 0

Why Most LLM Chatbots Never Make It to Production

ReDeiPirati — Sun, 14 Sep 2025 07:51:24 +0000

Article URL: https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/

Comments URL: https://news.ycombinator.com/item?id=45238254

Points: 1

# Comments: 0

Why Most LLM Chatbots Never Make It to Production

ReDeiPirati — Fri, 12 Sep 2025 12:59:11 +0000

Article URL: https://humansignal.com/blog/why-most-llm-chatbots-never-make-it-to-production/

Comments URL: https://news.ycombinator.com/item?id=45221639

Points: 2

# Comments: 0

Evaluating the GPT-5 Series on Custom Benchmarks

ReDeiPirati — Fri, 08 Aug 2025 13:53:19 +0000

Article URL: https://labelstud.io/blog/evaluating-the-gpt-5-series-on-custom-benchmarks/

Comments URL: https://news.ycombinator.com/item?id=44836980

Points: 1

# Comments: 0

New comment by ReDeiPirati in "Unsafe and Unpredictable: My Volvo EX90 Experience"

ReDeiPirati — Wed, 23 Jul 2025 12:44:15 +0000

> And I’m saying this as a Swede. Buy German cars, specifically within the Volkswagen auto group (Audi, VW, Skoda etc) if you want reliable quality.

I own a 2020 BMW with an electronic gearbox, which broke at around 80k km just a couple of months after the warranty expired (yeah I know!). It was a bit of a headache going back and forth with BMW to request a free repair. Fortunately, the headquarters agreed to cover the cost, and they installed a refurbished electronic gearbox. I was quite relieved that I didn’t have to pay about €10K out of pocket!

All that to say that I wouldn’t call BMW particularly reliable in terms of quality these days, but their customer support was decent, at least in my case.

New comment by ReDeiPirati in "An open letter from educators who refuse the call to adopt GenAI in education"

ReDeiPirati — Fri, 11 Jul 2025 08:11:49 +0000

Ultimately these are tools, and I think the goal should be to teach students to use them properly, especially since I don't expect the knowledge paradox to disappear anytime soon with these models.

New comment by ReDeiPirati in "About AI Evals"

ReDeiPirati — Thu, 03 Jul 2025 19:16:29 +0000

I'd have agreed with you if the principles were different. But what was shown in the content is EXACTLY what those tools do today. In fact, those tools are far more powerful and cover many more scenarios.

> There’s nothing wrong with starting from scratch or rebuilding an existing tool from the ground up. There’s no reason to blindly build from the status quo.

Generally speaking, all the options are OK, but not if you want to have something up as fast as you can or if your team is piloting something. I think the time you'd spend vibe coding it is greater than the time it takes to set up any of those tools.

And BTW, you shouldn't vibe code something that proprietary data flows through. At the very least, you should work with co-pilots.

New comment by ReDeiPirati in "About AI Evals"

ReDeiPirati — Thu, 03 Jul 2025 17:52:45 +0000

> Q: What makes a good custom interface for reviewing LLM outputs? Great interfaces make human review fast, clear, and motivating. We recommend building your own annotation tool customized to your domain ...

Ah! This is horrible advice. Why would you recommend reinventing the wheel when there is already great open source software available? Just use https://github.com/HumanSignal/label-studio/ or any other open source annotation software you like to get started. These tools already cover pretty much all the possible use cases, and if they don't cover yours, you can just build on top of them instead of building from zero.

Ask HN: How are you evaluating your LLMs in production?

ReDeiPirati — Tue, 01 Jul 2025 18:09:59 +0000

Hello HN! Which tools do you use to evaluate your LLMs and agents in production?

Comments URL: https://news.ycombinator.com/item?id=44436590

Points: 2

# Comments: 1

Your Data Engine Is the Moat - Here’s How to Own It.

ReDeiPirati — Tue, 24 Jun 2025 18:17:07 +0000

Article URL: https://labelstud.io/blog/your-data-engine-is-the-moat-here-s-how-to-own-it/

Comments URL: https://news.ycombinator.com/item?id=44369093

Points: 1

# Comments: 0