Hacker News: cpard

New comment by cpard in "SWE-bench Verified no longer measures frontier coding capabilities"

cpard — Sun, 26 Apr 2026 18:57:26 +0000

The trust issue might be solved by having standardisation bodies created, similar to W3C or even TPC, although TPC didn’t end that well.

New comment by cpard in "SWE-bench Verified no longer measures frontier coding capabilities"

cpard — Sun, 26 Apr 2026 18:55:38 +0000

That’s true. It also depends heavily on the type of task: not everything is equally represented on the web today, and it remains to be seen whether this is going to change.

New comment by cpard in "SWE-bench Verified no longer measures frontier coding capabilities"

cpard — Sun, 26 Apr 2026 17:43:24 +0000

Benchmarks/evals are really hard, and they become harder when there’s a huge incentive to game them at an industry scale.

ELT-Bench is another recent example. It was the first serious attempt at a benchmark for data engineering workloads, published about a year ago.

A few days ago, a follow-up paper from a group that includes one of the original authors audited the benchmark itself. The team found that the benchmark has structural issues that biased results.

Here’s the paper: https://arxiv.org/abs/2603.29399

None of this is new though. The industry has gone through all of it before, just on a smaller scale, and there’s a lot to learn from that. Here’s a post I wrote on the parallels between what we see today and what happened with the benchmarketing wars of the database systems.

https://www.typedef.ai/blog/from-benchmarketing-to-benchmaxx...

New comment by cpard in "Centuries of selective breeding turned wild cabbage into different vegetables"

cpard — Mon, 16 Mar 2026 18:45:30 +0000

Examples please!

New comment by cpard in "Centuries of selective breeding turned wild cabbage into different vegetables"

cpard — Sun, 15 Mar 2026 06:02:51 +0000

How long ago did this happen?

New comment by cpard in "Centuries of selective breeding turned wild cabbage into different vegetables"

cpard — Sun, 15 Mar 2026 05:44:17 +0000

I think the sprouts trauma is the result of picking the wrong cooking method.

I was so surprised when I tried baked sprouts for the first time (use a really hot cast iron skillet for even better results) that I started to believe that every vegetable can be delicious as long as you bake it!

New comment by cpard in "Show HN: DenchClaw – Local CRM on Top of OpenClaw"

cpard — Tue, 10 Mar 2026 04:46:27 +0000

I get the value of a personal CRM and the potential power of having one locally managed by LLMs, and I'd love to see such a solution because, to your point, outreach is just a small part of what you can do with a personal CRM. But the way you describe and deliver this project is very confusing to me. It's a CRM but also Cursor for your Mac (what does that even mean? I already run Cursor on my Mac), and it also has a file tree view, to use it as a better macOS find I guess?

I think that much cleaner messaging on what this tool is for would help.

Also a question about the implementation, why DuckDB for a CRM?

Something like SQLite feels like a much more natural fit for a CRM, where you primarily create, update and maybe delete records and you really care about the integrity of the data model.

From a quick look at the data model, everything seems to be a VARCHAR. If this is the case, why not just store everything in the file system instead? You do that with the md files and whatever is getting extracted from the SaaS tools.
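To make the integrity point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `contact` and `interaction` tables, their columns and the `try_insert` helper are hypothetical names made up for illustration; this is not DenchClaw's actual schema, just one way a typed, constrained model could look.

```python
import sqlite3

# Hypothetical CRM schema (illustration only, not DenchClaw's data model).
SCHEMA = """
CREATE TABLE contact (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE interaction (
    id          INTEGER PRIMARY KEY,
    contact_id  INTEGER NOT NULL REFERENCES contact(id) ON DELETE CASCADE,
    kind        TEXT NOT NULL CHECK (kind IN ('email', 'call', 'meeting')),
    happened_at TEXT NOT NULL  -- ISO 8601 timestamp stored as text
);
"""

def open_crm(path=":memory:"):
    conn = sqlite3.connect(path)
    # Foreign key enforcement is off by default in SQLite; enable it per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def try_insert(conn, sql, params):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = open_crm()
add_contact = "INSERT INTO contact (name, email) VALUES (?, ?)"
add_interaction = (
    "INSERT INTO interaction (contact_id, kind, happened_at) VALUES (?, ?, ?)"
)

first_contact = try_insert(conn, add_contact, ("Ada", "ada@example.com"))
duplicate_email = try_insert(conn, add_contact, ("Ada again", "ada@example.com"))
valid_call = try_insert(conn, add_interaction, (1, "call", "2026-03-10T04:46:27Z"))
orphan_call = try_insert(conn, add_interaction, (999, "call", "2026-03-10T04:46:27Z"))
unknown_kind = try_insert(conn, add_interaction, (1, "fax", "2026-03-10T04:46:27Z"))
```

With a schema like this the database itself rejects the duplicate email, the interaction pointing at a non-existent contact and the unknown interaction kind, which is the sort of guarantee an all-VARCHAR model leaves to application code.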

New comment by cpard in "Building a TUI is easy now"

cpard — Sun, 15 Feb 2026 05:45:49 +0000

Building TUIs might be easy now, but building a good user experience on a TUI feels harder than it has ever been to me. The modern libraries make a lot of things easy, but we are currently pushing terminals far beyond what they were designed for.

Claude Code et al. are good examples of that. Diffs, user approval flows, non-linear flows in general and a ton of buffered text are all elements that we know really well how to handle in web interfaces but are challenging for the terminal.

New comment by cpard in "Show HN: Data Engineering Book – An open source, community-driven guide"

cpard — Sun, 15 Feb 2026 05:08:38 +0000

It's important in a book treating an emerging field (data eng for LLMs) to mention emerging categories related to it such as storage formats purpose built for the full ML lifecycle.

Lance[1] (the format, not just LanceDB) is a great example, where you have columnar storage optimized for both analytical operations and vector workloads together with built-in versioning for dataset iteration.

Plus (very important) random access, which matters for stuff like sampling and efficient filtering during curation but also for working with multimodal data, e.g. videos.

Lance is not alone, vortex[2] is another one, nimble[3] from Meta yet another one and I might be missing a few more.

[1] https://github.com/lance-format/lance [2] https://vortex.dev [3] https://github.com/facebookincubator/nimble

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 23:34:14 +0000

of course the results were much worse than what was communicated in the media. It was content marketing, not an attempt to build a better C compiler.

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 23:21:21 +0000

there's 90% job loss assuming that this is a zero sum type of thing where humans and agents compete for working on a fixed amount of work.

I'm curious why you think I'm acting like it's all or nothing. What I was trying to communicate is the exact opposite, that it's not all or nothing. Maybe it's the way I articulate things, I'm genuinely interested in what makes it sound like this.

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 21:36:47 +0000

sure, I won't argue on this, although it did manage to deliver the marketing value they were looking for. In the end, their goal was not to replace gcc but to make people talk about AI and Anthropic.

What I said in my original comment is that AI delivers when it's used by experts, in this case there was someone who was definitely not a C compiler expert, what would happen if there was a real expert doing this?

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 21:29:02 +0000

the C compiler results or the physics results this post is about?

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 21:27:24 +0000

The reason there is a marketing opportunity is that, to your point, there is a legitimate concern. Marketing builds and amplifies the concern to create awareness.

When the systems turn into something trivial to manage with the new tooling, humans build more complex systems or add more layers on top of the existing ones.

New comment by cpard in "GPT-5.2 derives a new result in theoretical physics"

cpard — Fri, 13 Feb 2026 20:49:32 +0000

AI can be an amazing productivity multiplier for people who know what they're doing.

This result reminded me of the C compiler case that Anthropic posted recently. Sure, agents wrote the code for hours but there was a human there giving them directions, scoping the problem, finding the test suites needed for the agentic loops to actually work etc etc. In general making sure the output actually works and that it's a story worth sharing with others.

The "AI replaces humans in X" narrative is primarily a tool for driving attention and funding. It works great for creating impressions and building brand value but also does a disservice to the actual researchers, engineers and humans in general, who do the hard work of problem formulation, validation and at the end, solving the problem using another tool in their toolbox.

GraphLite: An Embeddable Graph Database with ISO Graph Query Language Support

cpard — Fri, 21 Nov 2025 19:43:22 +0000

Article URL: https://github.com/GraphLite-AI/GraphLite

Comments URL: https://news.ycombinator.com/item?id=46008137

Points: 6

# Comments: 0

New comment by cpard in "Vortex: An extensible, state of the art columnar file format"

cpard — Thu, 20 Nov 2025 01:32:29 +0000

As others said, Vortex is complementary to the table formats you mentioned.

There are other formats though that it can be compared to.

The Lance columnar format is one: https://github.com/lancedb/lancedb

And Nimble from Meta is another: https://github.com/facebookincubator/nimble

Parquet is so core to data infra and so widespread that removing it from its throne is a really really hard task.

The people behind these projects who are willing to try and do this have my total respect.

New comment by cpard in "A better way to search Hacker News using LLMs"

cpard — Wed, 19 Nov 2025 18:26:21 +0000

Many times I read something on HN and come back to find it a few days or weeks later, and the current keyword-based search has consistently given me a hard time. So I played around with LLMs as an alternative way of searching and finding information on HN.

A better way to search Hacker News using LLMs

cpard — Wed, 19 Nov 2025 18:26:21 +0000

Article URL: https://github.com/typedef-ai/fenic-examples/tree/main/hn_agent

Comments URL: https://news.ycombinator.com/item?id=45983042

Points: 3

# Comments: 1

Nanoimprint Lithography: Stop Saying It Will Replace EUV

cpard — Sun, 26 Oct 2025 19:40:38 +0000

Article URL: https://newsletter.semianalysis.com/p/nanoimprint-lithography-stop-saying

Comments URL: https://news.ycombinator.com/item?id=45714625

Points: 2

# Comments: 0