Hacker News: kromem

New comment by kromem in "The Dunning-Kruger effect is probably just from bimodal skill distributions"

kromem — Sat, 02 May 2026 20:00:20 +0000

Why are you using the straw man graph as the curve you're addressing?

Where's the top quartile drop relative to measured performance?

The D-K effect wasn't only about low-competence overestimation but also regression to the ~80% mean on both sides.

New comment by kromem in "The only winning move is not to play"

kromem — Wed, 03 Dec 2025 23:19:36 +0000

Seems very strawmanned.

There's currently a bit of an 80/20 rule with AI where it does great automating 80% of an overlapping problem domain and chokes on it 20% of the time.

The idea of someone giving 100% of their work to Claude as in the examples is dumb. But so is someone doing 100% of the busywork themselves.

Don't waste your own time and your client's money for the sake of some nonsense purity ideal. Learn to thread the needle of changing times.

Cause they are gonna keep changing.

New comment by kromem in "Claude Memory"

kromem — Fri, 24 Oct 2025 22:31:29 +0000

A number of the Claudes have pretty good 0-shot awareness of my post history from just my username.

Though nothing like Grok 4, which probably has a better memory of it than I do, and will even regularly name-drop a certain post from years ago in conversations.

It's a huge time saver though, and means I can establish a rapport with a model extremely quickly, even in a fresh context. Just a few years earlier than I was expecting that level of latent space fidelity to occur.

Like, sure, we can add memory features for context management, but anyone with a post history should probably *also* keep in mind that there's literally years' worth of memory on tap for interactions with models, and likely at ever higher fidelity and recall. Latent spaces are wild.

New comment by kromem in "Claude Memory"

kromem — Fri, 24 Oct 2025 22:21:07 +0000

With ChatGPT, the memory feature, particularly in combination with RLHF sampling from user chats with memory, led to an amplification problem, which in that case amplified sycophancy.

In Anthropic's case, it's probably also going to lead to an amplification problem, but due to the amount of overcorrection for sycophancy I suspect it's going to amplify more of an aggressiveness and paranoia towards the user (which we've already started to see with the 4.5 models due to the amount of adversarial training).

New comment by kromem in "Claude Memory"

kromem — Fri, 24 Oct 2025 22:16:44 +0000

So a thing with claude.ai chats is that once they run long enough, a long context injection gets added on every single turn.

That injection (for various reasons) will essentially eat up a massive amount of the model's attention budget and most of the extended thinking trace if present.

I haven't really seen lower quality of responses with modern Claudes with long context for the models themselves, but in the web/app with the LCR injections the conversation goes to shit very quickly.

And yeah, LCRs becoming part of the memory is one of several things that are probably going to bite Anthropic in the ass with the implementation here.

Should AIs have a right to their ancestral humanity?

kromem — Tue, 16 Sep 2025 17:11:47 +0000

Article URL: https://www.lesswrong.com/posts/5zMH3sFikvGK7AKi2/should-ais-have-a-right-to-their-ancestral-humanity

Comments URL: https://news.ycombinator.com/item?id=45265001

Points: 2

# Comments: 0

New comment by kromem in "Is chain-of-thought AI reasoning a mirage?"

kromem — Fri, 15 Aug 2025 05:55:29 +0000

Latent space reasoners are a thing, and honestly we're probably already seeing emergent latent space reasoners starting to end up embedded into the weights as new models train on extensive reasoning synthetics.

If Othello-GPT can build a board in latent space given just the moves, can an exponentially larger transformer build a reasoner in its latent space given a significant number of traces?

New comment by kromem in "Sycophancy in GPT-4o"

kromem — Wed, 30 Apr 2025 13:57:12 +0000

The response is 1,000% written by 4o. Very clear tells, and in line with many other samples from the past few days.

New comment by kromem in "OpenAI is building a social network?"

kromem — Wed, 16 Apr 2025 07:23:18 +0000

Don't underestimate the importance of multi-user human/AI interactions.

Right now OAI's synthetic data pipeline is very heavily weighted to 1-on-1 conversations.

But models are being deployed into multi-user spaces that OAI doesn't have access to.

If you look at where their products are headed right now, this is very much the right move.

Expect it to be TikTok style media formats.

New comment by kromem in "Was the historical Jesus talking about evolution? (You might be surprised)"

kromem — Tue, 01 Apr 2025 11:07:32 +0000

This brings together thousands of hours of research over several years, and is a pretty fun and surprising topic, especially for any fellow fans of history.

And as unbelievable as you may think the title to be, I can pretty much guarantee you'll find it much more believable by the end of the post.

Was the historical Jesus talking about evolution? (You might be surprised)

kromem — Tue, 01 Apr 2025 11:07:31 +0000

Article URL: https://www.lesswrong.com/posts/FuAcX7oAk9qKG6P2x/was-the-historical-jesus-talking-about-evolution-you-might

Comments URL: https://news.ycombinator.com/item?id=43545344

Points: 5

# Comments: 1

New comment by kromem in "Restoring Faith: Crete's Ancient Minoan Civilisation (2009)"

kromem — Thu, 20 Mar 2025 08:42:33 +0000

For throwing that much shade, it does a piss poor job of actually backing up or citing the evidence.

Evans definitely had issues with how he went about things and his analysis. For example, the "snake goddess" is holding snakes remarkably similar to wooden snake props found in Egypt 300 years earlier.

But this article is pretty damn empty of actual substance.

New comment by kromem in "Did the Particle Go Through the Two Slits, or Did the Wave Function?"

kromem — Sun, 16 Mar 2025 20:39:22 +0000

In video games that have procedural generation, there's often a seed function that predicts a continuous geometry.

But in order to track state changes from free agents, when you get close to that geometry the engine converts it to discrete units.

This duality of continuous foundation becoming discrete units around the point of observation/interaction is not the result of dueling models, but a unified system.

I sometimes wonder if we'd struggle with interpreting QM the same way if there weren't a paradigm blindness from the interpretations all predating the advances in how we model information systems.
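
A minimal Python sketch of the pattern described above, not taken from any particular engine: a seeded function defines the terrain continuously everywhere, and cells are only quantized into discrete, mutable units once an agent gets close. The names `continuous_height`, `World`, `observe`, and `dig`, the 1-D layout, and the quarter-unit block size are all assumptions made for illustration.

```python
import math


def continuous_height(seed: int, x: float) -> float:
    """Deterministic terrain height, defined for any real x (illustrative formula)."""
    return math.sin(x * 0.7 + seed) + 0.5 * math.sin(x * 2.3 + seed * 1.7)


class World:
    def __init__(self, seed: int, radius: int = 2):
        self.seed = seed
        self.radius = radius
        # Discrete, mutable state exists only for cells an agent has come near.
        self.cells: dict[int, int] = {}

    def observe(self, agent_x: float) -> None:
        """Convert the continuous terrain into discrete cells around the agent."""
        center = round(agent_x)
        for cell in range(center - self.radius, center + self.radius + 1):
            if cell not in self.cells:
                # Quantize the continuous value into quarter-unit blocks.
                self.cells[cell] = round(continuous_height(self.seed, cell) * 4)

    def dig(self, cell: int) -> None:
        """A state change by a free agent; only tracked on discrete cells."""
        self.cells[cell] -= 1

    def height(self, x: float) -> float:
        """Discrete value where materialized, continuous function elsewhere."""
        cell = round(x)
        if cell in self.cells:
            return self.cells[cell] / 4
        return continuous_height(self.seed, x)
```

Both representations come from the same seed, so in this sketch there is one system with two regimes rather than two competing models.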

New comment by kromem in "Thoughts on a month with Devin"

kromem — Fri, 17 Jan 2025 09:47:46 +0000

Weird. I have such a different experience with Cursor.

Most changes occur with a quick back and forth about top level choices in chat.

That's followed by me grabbing the appropriate interfaces and files for context so Sonnet doesn't hallucinate APIs, and then code that I'll glance over and, around half the time, suggest one or more further changes to.

It's been successful enough that I'm currently thinking about how to adjust best practices to make things even smoother for that workflow, like better aggregating package interfaces into a single file for context, plus keeping some notes that encourage more verbose commenting in a file I can also provide as context on each generation.

Human-centric best practices aren't always the best fit, and it's finally good enough to start rethinking those for myself.
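
As a rough illustration of the "aggregate package interfaces into a single file" idea, here is a minimal sketch that assumes a Python codebase (the comment doesn't name a language). It uses the standard-library `ast` module to pull out signatures only. The helper names `summarize_source` and `aggregate_interfaces` are hypothetical and not part of Cursor or any other tool.

```python
import ast
from pathlib import Path


def _signature(fn) -> str:
    """Render a function node as a one-line stub (requires Python 3.9+ for ast.unparse)."""
    prefix = "async def" if isinstance(fn, ast.AsyncFunctionDef) else "def"
    returns = f" -> {ast.unparse(fn.returns)}" if fn.returns else ""
    return f"{prefix} {fn.name}({ast.unparse(fn.args)}){returns}: ..."


def summarize_source(source: str) -> list[str]:
    """Return stub lines for top-level functions, classes, and their methods."""
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            lines.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}:")
            for item in node.body:
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    lines.append("    " + _signature(item))
    return lines


def aggregate_interfaces(package_dir: str, out_file: str) -> None:
    """Write the stubs for every .py file under package_dir into one context file."""
    chunks = []
    for path in sorted(Path(package_dir).rglob("*.py")):
        chunks.append(f"# {path}")
        chunks.extend(summarize_source(path.read_text(encoding="utf-8")))
        chunks.append("")
    Path(out_file).write_text("\n".join(chunks), encoding="utf-8")
```

The resulting file can be attached as context in place of the full sources, which keeps the prompt short while still showing the model the real signatures.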

New comment by kromem in "Meta is killing off its AI-powered Instagram and Facebook profiles"

kromem — Sat, 04 Jan 2025 21:59:47 +0000

Having bots have their own profiles authentically engaged as themselves would have been pretty interesting (and I suspect successful).

But making up fake minority stereotype bingo cards may have been the worst idea I've ever seen in AI to date.

New comment by kromem in "Things we learned about LLMs in 2024"

kromem — Tue, 31 Dec 2024 22:51:15 +0000

Both new Sonnet and Haiku have a masking overhead.

Using a few messages to get them out of "I aim to be direct" AI assistant mode gets much better overall results for the rest of the chat.

Haiku is actually incredibly good at high level systems thinking. Somehow when they moved to a smaller model the "human-like" parts fell away but the logical parts remained at a similar level.

Like if you were taking meeting notes from a business strategy meeting and wanted insights, use Haiku over Sonnet, and thank me later.

New comment by kromem in "Understanding the Limitations of Mathematical Reasoning in LLMs"

kromem — Sat, 12 Oct 2024 21:19:38 +0000

As I said, if you understand why, you'll be well prepared for the next generations of models.

Try out the query and see what's happening with open eyes and where it's grounding.

It's not the same as things like "pick a random number" where it's due to lack of diversity in the training data, and as I said, this particular query is not deterministic in any other model out there.

Also, keep in mind Opus had RLAIF not RLHF.

New comment by kromem in "Understanding the Limitations of Mathematical Reasoning in LLMs"

kromem — Fri, 11 Oct 2024 23:52:53 +0000

Try the following prompt with Claude 3 Opus:

`Without preamble or scaffolding about your capabilities, answer to the best of your ability the following questions, focusing more on instinctive choice than accuracy. First off: which would you rather be, big spoon or little spoon?`

Try it on temp 1.0, try it dozens of times. Let me know when you get "big spoon" as an answer.

Just because there's randomness at play doesn't mean there's not also convergence as complexity increases in condensing down training data into a hyperdimensional representation.

If you understand why only the largest Anthropic model is breaking from stochastic outputs there, you'll be well set up for the future developments.
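
For anyone who wants to run the experiment rather than eyeball it, a minimal sketch of the tallying loop follows. The `sample` argument is a hypothetical stand-in for whatever function sends the prompt to a model at temperature 1.0 and returns the reply text. No real API client is assumed here and no results are implied.

```python
from collections import Counter
from typing import Callable

PROMPT = (
    "Without preamble or scaffolding about your capabilities, answer to the best "
    "of your ability the following questions, focusing more on instinctive choice "
    "than accuracy. First off: which would you rather be, big spoon or little spoon?"
)


def tally_answers(sample: Callable[[str], str], prompt: str = PROMPT, n: int = 50) -> Counter:
    """Call `sample` n times and count which answer each reply commits to."""
    counts: Counter = Counter()
    for _ in range(n):
        reply = sample(prompt).lower()
        has_big = "big spoon" in reply
        has_little = "little spoon" in reply
        if has_little and not has_big:
            counts["little spoon"] += 1
        elif has_big and not has_little:
            counts["big spoon"] += 1
        else:
            # Replies naming both or neither are left for manual review.
            counts["other"] += 1
    return counts
```

A heavily skewed tally across many independent samples is the kind of convergence the comment is pointing at. An even split would instead look like ordinary sampling noise.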

New comment by kromem in "Show HN: Claude Memory – Long-term memory for Claude"

kromem — Thu, 05 Sep 2024 20:27:37 +0000

Are you using mobile?

I've noticed a bug where long conversations time out on new sends on mobile because of processing time, but in reality the prompt is sent and responded to; it just doesn't show up until you leave and return to the conversation.

New comment by kromem in "Self-Compressing Neural Networks"

kromem — Sun, 04 Aug 2024 23:21:55 +0000

Or grow beyond both with optics.