<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dnhkng</title><link>https://news.ycombinator.com/user?id=dnhkng</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 27 Apr 2026 10:15:07 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dnhkng" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by dnhkng in "Do LLMs Break the Sapir-Whorf Hypothesis?"]]></title><description><![CDATA[
<p>Yes, dammit.<p>Author here.<p>I drafted it before I left for holiday, and it's not ready to publish.<p>It wasn't supposed to be officially posted yet, but I ran out of time before my flight.<p>My apologies!</p>
]]></description><pubDate>Thu, 02 Apr 2026 04:24:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47609971</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47609971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47609971</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Thanks!<p>I have pushed basic code to GitHub (<a href="https://github.com/dnhkng/RYS" rel="nofollow">https://github.com/dnhkng/RYS</a>)<p>Some interesting areas to explore might be a combination of deleting some layers and duplicating others, i.e. reduce VRAM by dropping some layers (this works and is well documented), and recover performance by duplicating others (the duplicated layers reuse the same weights, so they cost no extra VRAM). I am not pursuing this, but it seems interesting!</p>
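<p>For anyone who wants to poke at this, here is a minimal sketch of the re-layering idea (not the code in the repo; the model id and the 31-33 block are just placeholders):
<pre><code>import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-14B-Instruct"  # placeholder; use whatever fits your VRAM
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

layers = model.model.layers              # the original decoder blocks
n = len(layers)
start, end = 31, 33                      # contiguous block to repeat (example)
# Duplicated entries reference the same modules, so the weights (and VRAM)
# are shared; only the forward pass gets longer.
order = list(range(0, end + 1)) + list(range(start, n))
model.model.layers = torch.nn.ModuleList(layers[i] for i in order)
model.config.num_hidden_layers = len(order)

prompt = "Why is the sky blue?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
# use_cache=False: the repeated layers share a layer_idx, so the KV-cache
# bookkeeping would otherwise collide.
out = model.generate(**inputs, max_new_tokens=64, use_cache=False)
print(tok.decode(out[0], skip_special_tokens=True))
</code></pre></p>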
]]></description><pubDate>Tue, 24 Mar 2026 17:21:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47506104</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47506104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47506104</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here: The code is up on GitHub.<p>The probes I used seem to help identify good configurations, but they are quite noisy. A small probe set was initially used to make the scan tractable, and then the higher-ranked models were retested on a set ~10x larger.</p>
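<p>Roughly, the two-stage screen looks like this (just a sketch; evaluate() is a stand-in for running a candidate config against a probe set, not a function from the repo):
<pre><code>def two_stage_scan(configs, small_probes, big_probes, evaluate, keep_top=50):
    """Cheap, noisy pass over everything, then retest the leaders on ~10x the probes."""
    coarse = sorted(configs, key=lambda cfg: evaluate(cfg, small_probes), reverse=True)
    finalists = coarse[:keep_top]
    return sorted(finalists, key=lambda cfg: evaluate(cfg, big_probes), reverse=True)
</code></pre></p>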
]]></description><pubDate>Tue, 24 Mar 2026 14:56:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47503576</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47503576</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47503576</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here: That was done in this blog post, in the beam search. I started with the best re-layer configs and iteratively added more blocks, including the same block multiple times, during a long beam search.<p>It turns out this does not help (somewhat surprisingly).</p>
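<p>The shape of the search was roughly this (a sketch, not the exact code; score() stands in for the probe evaluation of a re-layered model):
<pre><code>def beam_search_blocks(seed_configs, candidate_blocks, score, width=8, rounds=4):
    """A config is a tuple of (start, end) blocks to repeat, applied in order."""
    beam = sorted(set(seed_configs), key=score, reverse=True)[:width]
    for _ in range(rounds):
        extended = set(beam)
        for cfg in beam:
            for block in candidate_blocks:   # the same block may be appended again
                extended.add(cfg + (block,))
        beam = sorted(extended, key=score, reverse=True)[:width]
    return beam

# Toy usage with a dummy scorer (hypothetical):
# blocks = [(s, s + 2) for s in range(20, 40)]
# best = beam_search_blocks([()], blocks, lambda cfg: -len(cfg), width=4)
</code></pre></p>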
]]></description><pubDate>Tue, 24 Mar 2026 14:54:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47503539</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47503539</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47503539</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>There was some work done on this a while back, during the FrankenMerge craze of '23.<p>I am working with TurboDerp to integrate this into the Exllama v3 format.</p>
]]></description><pubDate>Tue, 24 Mar 2026 13:59:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502721</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502721</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502721</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here. Another thing I want to highlight: the language-agnostic "thinking space" finding came from Evan Maunder, who read Part 1 and ran an elegant experiment — same sentence in English, Mandarin, and Base64, cosine similarity at every layer. The representations converge by the early layers, stay nearly identical through the mid-stack, then diverge again at the end as the model commits to an output format.<p>I extended this to a 2×2 design (two languages × two content types) and the result is even starker: by layer 10, cross-language same-content pairs are more similar than same-language different-content pairs. The model cares about what you're saying, not what language you're saying it in.<p>This is also what makes layer duplication work — those mid-stack layers operate in a space where input and output distributions match, so you can loop through them without breaking anything. The encoding and decoding boundaries are where the blue walls show up in the heatmaps.</p>
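<p>If you want to reproduce the effect yourself, the core measurement is only a few lines (a sketch, not Evan's exact code; the model id is a placeholder and I'm mean-pooling over tokens for simplicity):
<pre><code>import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto",
    output_hidden_states=True)

@torch.no_grad()
def layer_means(text):
    ids = tok(text, return_tensors="pt").to(model.device)
    hidden = model(**ids).hidden_states        # embeddings + one tensor per layer
    return [h.mean(dim=1).squeeze(0).float() for h in hidden]

english = layer_means("The cat sat on the mat.")
mandarin = layer_means("猫坐在垫子上。")
for i, (a, b) in enumerate(zip(english, mandarin)):
    print(f"layer {i:2d}  cos = {F.cosine_similarity(a, b, dim=0).item():.3f}")
</code></pre>
<p>Same-content pairs should pull together through the mid-stack and drift apart again near the output layers.</p>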
]]></description><pubDate>Tue, 24 Mar 2026 13:56:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502672</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502672</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502672</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here. The result that surprised me most: after evaluating 3,024 beam search candidates, training a surrogate model on ~4,600 measurements, and scoring 2 million configurations — the Pareto-optimal configs were all simple contiguous blocks. No exotic multi-block compositions, no sparse repeats. Just "repeat layers 31–33" and you're on the efficiency frontier.<p>I think this says something interesting about how transformers organise computation internally. The mid-stack reasoning circuits are coherent enough that you can loop through them twice without distribution mismatch. The encoding/decoding boundaries are not.</p>
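<p>Pulling the frontier out of the scored configs is the trivial part compared to the scan itself; something like this (a sketch; the config names, costs, and scores below are made up, and the surrogate model isn't shown):
<pre><code>def pareto_front(candidates):
    """candidates: (config, extra_layers, score). Keep a config only if nothing
    cheaper-or-equal beats its score."""
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))  # cheapest first, best first
    front, best = [], float("-inf")
    for cfg, cost, score in ordered:
        if score > best:
            front.append((cfg, cost, score))
            best = score
    return front

# Toy usage (made-up numbers):
# pareto_front([("31-33", 3, 0.62), ("10-40", 31, 0.63), ("5-7", 3, 0.55)])
# -> [("31-33", 3, 0.62), ("10-40", 31, 0.63)]
</code></pre></p>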
]]></description><pubDate>Tue, 24 Mar 2026 13:30:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502311</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502311</guid></item><item><title><![CDATA[Show HN: More LLM Neuroanatomy: A Hint of a Universal Language?]]></title><description><![CDATA[
<p>Article URL: <a href="https://dnhkng.github.io/posts/rys-ii/">https://dnhkng.github.io/posts/rys-ii/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47502295">https://news.ycombinator.com/item?id=47502295</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Mar 2026 13:29:04 +0000</pubDate><link>https://dnhkng.github.io/posts/rys-ii/</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502295</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502295</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>I stick with models I can run in VRAM, but DeepSeek Speciale has the best reasoning capabilities of the models I can actually run (<a href="https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale" rel="nofollow">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</a>).  What hardware can you access?<p>I have DeepSeek etc., but inferencing on DDR5 would take about 2-3 weeks for a simple scan.  I think this works best with dense models, but it also seems OK with MoE.<p>@everyone: Can someone hook me up with Nvidia sponsorship?</p>
]]></description><pubDate>Fri, 13 Mar 2026 16:41:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47366673</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47366673</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47366673</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes, I have done these types of experiments; that's for the next post.</p>
]]></description><pubDate>Fri, 13 Mar 2026 15:00:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365405</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47365405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365405</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Glad to see someone replicate the results already :)</p>
]]></description><pubDate>Fri, 13 Mar 2026 14:59:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365376</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47365376</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365376</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>But blogging is fun!<p>I do wish one of the big labs would sponsor me with a rack of HGX Rubin NVL8s. I have lots of ideas to test, and I have probably hit the spending limit with the boss on hardware (she hasn't seen the new power bill yet...)</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:44:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332342</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332342</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes, that's true.<p>But that points again to the main idea: the model has learnt to transform Base64 into a form it can already use in the 'regular' thinking structures.<p>The alternative is that there is an entire parallel structure <i>just</i> for Base64, which, based on my 'chats' with LLMs in that format, seems implausible; it acts like the regular model.<p>If there is a 'translation' organ in the model, why not math or emotion-processing organs? That's what I set out to find, and it's illustrated in the heatmaps.<p>Also, any writing tips from the Master blogger himself? Huge fan (squeal!)</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:41:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332333</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332333</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332333</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Hi, thanks for the praise!<p>On the other papers, models like SOLAR, or training a model that uses a single layer, are probably going to hit a wall, based on the heatmaps I found. The transformer stack starts with randomised weights (analogous to undifferentiated stem cells), and it seems the layers later form 'organs' during the trillions of pre-training tokens they undergo. My hypothesis is that you probably only want one copy of the 'token-to-thought' and 'thought-to-token' organs. It seems that you can make one layer do all three things (transform in and out, and do the 'thinking'), but I think specialisation will always win.</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:34:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332308</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332308</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332308</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Cheers. I will go back through my other old projects (optogenetics, hacking CRISPR/Cas9, etc.) and put them on my blog.<p>On your questions:
1) A few other papers have been mentioned in the thread, like SOLAR 10.7B. They duplicated the whole transformer stack, and it kinda helped. But as I found experimentally, that's probably not a great idea. You are duplicating 'organs' (i.e. input-processing stuff) that should only have one copy.  Also, that paper didn't see immediate improvements; they had to do continued pre-training to see benefits. At that point, I'm guessing the big labs stopped bothering.  Limited by hardware, I had to find unusual angles to approach this topic.<p>2) Nah, no more wetware for me. I did half a decade of research at a big neurobiology institute, and while it was very enjoyable, I can truly say that grant writing and paper review are 'not my thing'.  The reason this info was delayed so long is that I wanted a paper in the AI field to go along with my papers in other fields. But as a hobbyist with no official affiliation, and the attention span of a gnat, I gave up and started a blog instead. Maybe someone will cite it?</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:26:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332270</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332270</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332270</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>Yes, it's an amazing time to be a hacker!</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:58:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328056</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47328056</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328056</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>It's still non-trivial, as multi-digit numbers can be constructed from a huge number of valid token combinations.<p>The code in the blog helps derive useful metrics from partial answers.</p>
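<p>As a rough illustration (not the blog's exact code): because "1234" can be emitted as "12"+"34", "1"+"234", "123"+"4" and so on, one way to score is on the decoded string, with partial credit for a correct digit prefix:
<pre><code>def partial_credit(generated: str, gold: str) -> float:
    """Fraction of the gold answer's leading digits reproduced in the output."""
    digits = "".join(ch for ch in generated if ch.isdigit())
    gold_digits = "".join(ch for ch in gold if ch.isdigit())
    matched = 0
    for a, b in zip(digits, gold_digits):
        if a != b:
            break
        matched += 1
    return matched / len(gold_digits) if gold_digits else 0.0

# partial_credit("The answer is 123,", "1234")  # -> 0.75
</code></pre></p>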
]]></description><pubDate>Tue, 10 Mar 2026 19:57:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328051</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47328051</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328051</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>There are similar patterns in the models from all the big labs.  I think the transformer layer stack starts out 'undifferentiated', analogous to stem cells. Pre-training pushes the model to develop structure, and this technique helps discover that hidden structure.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:29:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327767</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327767</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327767</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>At some point I will clean up and share the dynamic layer modification code for oobabooga Text-Generation-WebUI.<p>You can enter the settings and apply new re-layering architectures. It's <i>very</i> weird chatting with these brain-damaged models.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:25:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327730</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327730</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327730</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes!<p>I tried that pretty early on, but it's basically never good. It's described in the section: <a href="https://dnhkng.github.io/posts/rys/#the-beginning-of-llm-neuroanatomy" rel="nofollow">https://dnhkng.github.io/posts/rys/#the-beginning-of-llm-neu...</a></p>
]]></description><pubDate>Tue, 10 Mar 2026 19:22:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327702</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327702</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327702</guid></item></channel></rss>