<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: jpcompartir</title><link>https://news.ycombinator.com/user?id=jpcompartir</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 15:34:31 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=jpcompartir" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by jpcompartir in "Claude Code Cheat Sheet"]]></title><description><![CDATA[
<p>This looks like a Claude-generated SVG to me, is it not?</p>
]]></description><pubDate>Tue, 24 Mar 2026 10:59:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47500881</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47500881</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47500881</guid></item><item><title><![CDATA[New comment by jpcompartir in "Autoresearch on an old research idea"]]></title><description><![CDATA[
<p>Fair pushback, but I do think the LSTM vs Transformer point supports my position in the limit rather than refuting it. Once the compute bottleneck is removed, LSTMs scale favourably. 
<a href="https://arxiv.org/pdf/2510.02228" rel="nofollow">https://arxiv.org/pdf/2510.02228</a> (I believe there's similar work done on vanilla LSTMs, but I'd have to go digging)<p>So the bottleneck was compute, which is compatible with 'data or compute'. But to accept your point: at the time, the algorithmic advances were useful in that they removed the bottleneck.<p>A wider point is that eventually (once compute and data are scaled enough) the algorithms are all learning the same representations: <a href="https://arxiv.org/pdf/2405.07987" rel="nofollow">https://arxiv.org/pdf/2405.07987</a><p>And of course the canon:
<a href="https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dataset/" rel="nofollow">https://nonint.com/2023/06/10/the-it-in-ai-models-is-the-dat...</a>
<a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html" rel="nofollow">http://www.incompleteideas.net/IncIdeas/BitterLesson.html</a><p>Scaling compute & data > algorithmic cleverness</p>
]]></description><pubDate>Tue, 24 Mar 2026 10:54:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47500860</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47500860</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47500860</guid></item><item><title><![CDATA[New comment by jpcompartir in "Autoresearch on an old research idea"]]></title><description><![CDATA[
<p>There are better techniques for hyper-parameter optimisation, right? I fear I have missed something important: why has Autoresearch blown up so much?<p>The bottleneck in AI/ML/DL is always data (volume & quality) or compute.<p>Does/can Autoresearch help improve large-scale datasets? 
Is it more compute efficient than humans?</p>
]]></description><pubDate>Mon, 23 Mar 2026 19:35:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47494104</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47494104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47494104</guid></item><item><title><![CDATA[New comment by jpcompartir in "Statement from Dario Amodei on our discussions with the Department of War"]]></title><description><![CDATA[
<p>As a non-US citizen, I'm quite glad to know that Claude won't be used to kill other non-US citizens with autonomous weapons</p>
]]></description><pubDate>Fri, 27 Feb 2026 15:26:38 +0000</pubDate><link>https://news.ycombinator.com/item?id=47181634</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47181634</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47181634</guid></item><item><title><![CDATA[New comment by jpcompartir in "Statement from Dario Amodei on our discussions with the Department of War"]]></title><description><![CDATA[
<p>"Regardless, these threats do not change our position: we cannot in good conscience accede to their request."</p>
]]></description><pubDate>Fri, 27 Feb 2026 09:21:18 +0000</pubDate><link>https://news.ycombinator.com/item?id=47178458</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47178458</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47178458</guid></item><item><title><![CDATA[New comment by jpcompartir in "Ggml.ai joins Hugging Face to ensure the long-term progress of Local AI"]]></title><description><![CDATA[
<p>This is great, brings clear benefits to both sides and the rest of us.<p>Always rooting for Hugging Face</p>
]]></description><pubDate>Sat, 21 Feb 2026 13:07:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=47100481</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47100481</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47100481</guid></item><item><title><![CDATA[New comment by jpcompartir in "Gemini 3.1 Pro"]]></title><description><![CDATA[
<p>Yep, Gemini is virtually unusable compared to Anthropic models. I get it for free with work and use it maybe once a week, if that. They really need to fix the instruction following.</p>
]]></description><pubDate>Thu, 19 Feb 2026 22:23:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47080463</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=47080463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47080463</guid></item><item><title><![CDATA[New comment by jpcompartir in "Claude Code is being dumbed down?"]]></title><description><![CDATA[
<p>Thanks for the long and considered response, but this is a really ugly UX decision.<p>As others have said, 'reading 10 files' is useless information - we want to be able to see at a glance where it is and what it's doing, so that we can redirect if necessary.<p>With the release of Cowork, couldn't Claude Code double down on the needs of engineers?</p>
]]></description><pubDate>Thu, 12 Feb 2026 09:47:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=46986777</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=46986777</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46986777</guid></item><item><title><![CDATA[New comment by jpcompartir in "Railway (PaaS) global outage"]]></title><description><![CDATA[
<p>Yeah 100%<p>This won't change my decision, but it is still impeccable timing</p>
]]></description><pubDate>Wed, 11 Feb 2026 17:25:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=46977866</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=46977866</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46977866</guid></item><item><title><![CDATA[New comment by jpcompartir in "Railway (PaaS) global outage"]]></title><description><![CDATA[
<p>This is great. Not ten minutes before this outage, I presented Railway as a viable option for small-scale hosting of prototypes and non-critical apps, as an alternative to the cloud giants</p>
]]></description><pubDate>Wed, 11 Feb 2026 16:55:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46977418</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=46977418</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46977418</guid></item><item><title><![CDATA[New comment by jpcompartir in "Claude Opus 4.6"]]></title><description><![CDATA[
<p>4.6 is a beast.<p>Everything in plan mode first + AskUserQuestionTool, review all plans, get it to write its own CLAUDE.md for coding standards, edit where necessary, and away you go.<p>Seems noticeably better than 4.5 at keeping the codebase slim. Obviously you still need to keep an eye on it, but it's a clear step up.</p>
]]></description><pubDate>Fri, 06 Feb 2026 21:08:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=46918159</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=46918159</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46918159</guid></item><item><title><![CDATA[New comment by jpcompartir in "Cowork: Claude Code for the rest of your work"]]></title><description><![CDATA[
<p>I've been working with a claude-specific directory in Claude Code for non-coding work (and the odd bit of coding/documentation stuff) since the first week of Claude Code, or even earlier - I think when filesystem MCP dropped.<p>It's a very powerful way to work on all kinds of things. V. interested to try Cowork when it drops to Plus subscribers.</p>
]]></description><pubDate>Mon, 12 Jan 2026 21:30:01 +0000</pubDate><link>https://news.ycombinator.com/item?id=46594547</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=46594547</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46594547</guid></item><item><title><![CDATA[New comment by jpcompartir in "Reasoning models reason well, until they don't"]]></title><description><![CDATA[
<p>I can't remember which paper it's from, but isn't the variance in performance explained by # of tokens generated? i.e. more tokens generated tends towards better performance.<p>Which isn't particularly amazing, as # of tokens generated is basically a synonym in this case for computation.<p>We spend more computation, we tend towards better answers.</p>
]]></description><pubDate>Fri, 31 Oct 2025 10:01:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=45770220</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45770220</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45770220</guid></item><item><title><![CDATA[New comment by jpcompartir in "LLMs are mortally terrified of exceptions"]]></title><description><![CDATA[
<p>Most comments seem to be taking the code seriously, when it's clearly satirical?</p>
]]></description><pubDate>Thu, 09 Oct 2025 20:23:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=45532646</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45532646</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45532646</guid></item><item><title><![CDATA[New comment by jpcompartir in "An LLM is a lossy encyclopedia"]]></title><description><![CDATA[
<p>Assuming you've read OpenAI's paper released this week?<p><a href="https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf" rel="nofollow">https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4a...</a><p>They attribute these 'compression artefacts' to pre-training, they also reference the original snowballing paper: How Language Model Hallucinations Can Snowball: <a href="https://arxiv.org/pdf/2305.13534" rel="nofollow">https://arxiv.org/pdf/2305.13534</a><p>They further state that reasoning is no panacea. 
Whilst you did say:
"the models mitigate more and more"<p>You were replying to my comment which said:<p>"'Bad' generations early in the output sequence are somewhat mitigatable by injecting self-reflection tokens like 'wait', or with more sophisticated test-time compute techniques."<p>So our statements there are logically compatible, i.e. you didn't make a statement that contradicts what I said.<p>"Our error analysis is general yet has specific implications for hallucination. It applies broadly, including to reasoning and search-and-retrieval language models, and the analysis does not rely on properties of next-word prediction or Transformer-based neural networks."<p>"Search (and reasoning) are not panaceas. A number of studies have shown how language models augmented with search or Retrieval-Augmented Generation (RAG) reduce hallucinations (Lewis et al., 2020; Shuster et al., 2021; Nakano et al., 2021; Zhang and Zhang, 2025). However, Observation 1 holds for arbitrary language models, including those with RAG. In particular, the binary grading system itself still rewards guessing whenever search fails to yield a confident answer. Moreover, search may not help with miscalculations such as in the letter-counting example, or other intrinsic hallucinations"</p>
]]></description><pubDate>Tue, 09 Sep 2025 14:10:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=45182197</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45182197</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45182197</guid></item><item><title><![CDATA[New comment by jpcompartir in "Polars Cloud and Distributed Polars now available"]]></title><description><![CDATA[
<p>Polars is great, absolute best of luck with the launch</p>
]]></description><pubDate>Thu, 04 Sep 2025 10:09:50 +0000</pubDate><link>https://news.ycombinator.com/item?id=45125540</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45125540</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45125540</guid></item><item><title><![CDATA[New comment by jpcompartir in "An LLM is a lossy encyclopedia"]]></title><description><![CDATA[
<p>You seem to be responding to a strawman, and assuming I think something I don't think.<p>As of today, 'bad' generations early in the sequence still do tend towards responses that are distant from the ideal response. This is testable/verifiable by pre-filling responses, which I'd advise you to experiment with for yourself.<p>'Bad' generations early in the output sequence are somewhat mitigatable by injecting self-reflection tokens like 'wait', or with more sophisticated test-time compute techniques. However, those remedies can simultaneously turn 'good' generations into 'bad' ones; they are post-hoc heuristics which treat symptoms, not causes.<p>In general, as the models become larger they are able to compress more of their training data. So yes, using the terminology of the commenter I was responding to, larger models should tend to have fewer 'compression artefacts' than smaller models.</p>
]]></description><pubDate>Tue, 02 Sep 2025 14:32:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=45103637</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45103637</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45103637</guid></item><item><title><![CDATA[New comment by jpcompartir in "An LLM is a lossy encyclopedia"]]></title><description><![CDATA[
<p>Interesting, in the LLM case these compression artefacts then get fed into the generating process of the next token, hence the errors compound.</p>
]]></description><pubDate>Tue, 02 Sep 2025 12:33:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=45102280</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45102280</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45102280</guid></item><item><title><![CDATA[New comment by jpcompartir in "Important machine learning equations"]]></title><description><![CDATA[
<p>I would echo some caution about using this as a reference, as in another blog post the writer states:<p>"Backpropagation, often referred to as “backward propagation of errors,” is the cornerstone of training deep neural networks. It is a supervised learning algorithm that optimizes the weights and biases of a neural network to minimize the error between predicted and actual outputs.."<p><a href="https://chizkidd.github.io/2025/05/30/backpropagation/" rel="nofollow">https://chizkidd.github.io/2025/05/30/backpropagation/</a><p>backpropagation is a supervised machine learning algorithm, pardon?</p>
]]></description><pubDate>Thu, 28 Aug 2025 12:38:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=45051423</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=45051423</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45051423</guid></item><item><title><![CDATA[New comment by jpcompartir in "Everything is correlated (2014–23)"]]></title><description><![CDATA[
<p>^<p>And if we increase N enough, we will be able to find these 'good measurements' and 'statistically significant differences' everywhere.<p>Worse still if we did not agree in advance which hypotheses we were testing, and instead go looking back through historical data for 'statistically significant' correlations.</p>
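<p>The many-comparisons effect above can be sketched with a toy null simulation (sample size, trial count, and threshold are arbitrary choices for illustration, not from the original comment):</p>

```python
import random

random.seed(0)

def null_comparison(n=30):
    """Compare two groups drawn from the SAME distribution: no real effect exists."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    mean_diff = sum(a) / n - sum(b) / n
    se = (2 / n) ** 0.5  # standard error of the difference (known unit variance)
    return abs(mean_diff / se) > 1.96  # 'significant' at p < 0.05

trials = 2000
hits = sum(null_comparison() for _ in range(trials))
print(f"{hits} of {trials} pure-noise comparisons look 'significant'")
```

<p>Roughly 5% of pure-noise comparisons clear the 0.05 bar, so scanning enough variables after the fact all but guarantees 'findings'.</p>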
]]></description><pubDate>Fri, 22 Aug 2025 10:56:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=44982978</link><dc:creator>jpcompartir</dc:creator><comments>https://news.ycombinator.com/item?id=44982978</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44982978</guid></item></channel></rss>