<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: dnhkng</title><link>https://news.ycombinator.com/user?id=dnhkng</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Mon, 27 Apr 2026 10:15:07 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=dnhkng" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by dnhkng in "Do LLMs Break the Sapir-Whorf Hypothesis?"]]></title><description><![CDATA[
<p>Yes, dammit.<p>Author here.<p>I drafted it before I left for holiday, and it's not ready to publish.<p>It wasn't supposed to be officially posted yet, but I ran out of time before my flight.<p>My apologies!</p>
]]></description><pubDate>Thu, 02 Apr 2026 04:24:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47609971</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47609971</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47609971</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Thanks!<p>I have pushed basic code to GitHub (<a href="https://github.com/dnhkng/RYS" rel="nofollow">https://github.com/dnhkng/RYS</a>)<p>Some interesting areas to explore might be a combination of deleting some layers and duplicating others, i.e. reduce VRAM by dropping some layers (this works and is well documented), and recover performance by duplicating others (the duplicated layers reuse the same weights, so they cost no extra VRAM). I am not pursuing this, but it seems interesting!</p>
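<p>For anyone who wants to poke at this, here is a minimal sketch of the re-layering idea (not the code in the repo; the model id and the 31-33 block are just placeholders):
<pre><code>import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-14B-Instruct"  # placeholder; use whatever fits your VRAM
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

layers = model.model.layers              # the original decoder blocks
n = len(layers)
start, end = 31, 33                      # contiguous block to repeat (example)
# Duplicated entries reference the same modules, so the weights (and VRAM)
# are shared; only the forward pass gets longer.
order = list(range(0, end + 1)) + list(range(start, n))
model.model.layers = torch.nn.ModuleList(layers[i] for i in order)
model.config.num_hidden_layers = len(order)

prompt = "Why is the sky blue?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
# use_cache=False: the repeated layers share a layer_idx, so the KV-cache
# bookkeeping would otherwise collide.
out = model.generate(**inputs, max_new_tokens=64, use_cache=False)
print(tok.decode(out[0], skip_special_tokens=True))
</code></pre></p>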
]]></description><pubDate>Tue, 24 Mar 2026 17:21:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=47506104</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47506104</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47506104</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here: The code is up on GitHub.<p>The probes I used seem to help identify good configurations, but they are quite noisy. A small probe set was initially used to make the scan tractable, and then the higher-ranked models were retested on a set ~10x larger.</p>
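<p>Roughly, the two-stage screen looks like this (just a sketch; evaluate() is a stand-in for running a candidate config against a probe set, not a function from the repo):
<pre><code>def two_stage_scan(configs, small_probes, big_probes, evaluate, keep_top=50):
    """Cheap, noisy pass over everything, then retest the leaders on ~10x the probes."""
    coarse = sorted(configs, key=lambda cfg: evaluate(cfg, small_probes), reverse=True)
    finalists = coarse[:keep_top]
    return sorted(finalists, key=lambda cfg: evaluate(cfg, big_probes), reverse=True)
</code></pre></p>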
]]></description><pubDate>Tue, 24 Mar 2026 14:56:24 +0000</pubDate><link>https://news.ycombinator.com/item?id=47503576</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47503576</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47503576</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here: That was done in this blog post, in the beam search. I started with the best re-layer configs and iteratively added more blocks, including the same block multiple times, during a long beam search.<p>It turns out this does not help (somewhat surprisingly).</p>
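<p>The shape of the search was roughly this (a sketch, not the exact code; score() stands in for the probe evaluation of a re-layered model):
<pre><code>def beam_search_blocks(seed_configs, candidate_blocks, score, width=8, rounds=4):
    """A config is a tuple of (start, end) blocks to repeat, applied in order."""
    beam = sorted(set(seed_configs), key=score, reverse=True)[:width]
    for _ in range(rounds):
        extended = set(beam)
        for cfg in beam:
            for block in candidate_blocks:   # the same block may be appended again
                extended.add(cfg + (block,))
        beam = sorted(extended, key=score, reverse=True)[:width]
    return beam

# Toy usage with a dummy scorer (hypothetical):
# blocks = [(s, s + 2) for s in range(20, 40)]
# best = beam_search_blocks([()], blocks, lambda cfg: -len(cfg), width=4)
</code></pre></p>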
]]></description><pubDate>Tue, 24 Mar 2026 14:54:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47503539</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47503539</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47503539</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>There was some work done on this a while back, during the FrankenMerge craze of '23.<p>I am working with TurboDerp to integrate this into the Exllama v3 format.</p>
]]></description><pubDate>Tue, 24 Mar 2026 13:59:04 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502721</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502721</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502721</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here. Another thing I want to highlight: the language-agnostic "thinking space" finding came from Evan Maunder, who read Part 1 and ran an elegant experiment — same sentence in English, Mandarin, and Base64, cosine similarity at every layer. The representations converge by the early layers, stay nearly identical through the mid-stack, then diverge again at the end as the model commits to an output format.<p>I extended this to a 2×2 design (two languages × two content types) and the result is even starker: by layer 10, cross-language same-content pairs are more similar than same-language different-content pairs. The model cares about what you're saying, not what language you're saying it in.<p>This is also what makes layer duplication work — those mid-stack layers operate in a space where input and output distributions match, so you can loop through them without breaking anything. The encoding and decoding boundaries are where the blue walls show up in the heatmaps.</p>
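<p>If you want to reproduce the effect yourself, the core measurement is only a few lines (a sketch, not Evan's exact code; the model id is a placeholder and I'm mean-pooling over tokens for simplicity):
<pre><code>import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto",
    output_hidden_states=True)

@torch.no_grad()
def layer_means(text):
    ids = tok(text, return_tensors="pt").to(model.device)
    hidden = model(**ids).hidden_states        # embeddings + one tensor per layer
    return [h.mean(dim=1).squeeze(0).float() for h in hidden]

english = layer_means("The cat sat on the mat.")
mandarin = layer_means("猫坐在垫子上。")
for i, (a, b) in enumerate(zip(english, mandarin)):
    print(f"layer {i:2d}  cos = {F.cosine_similarity(a, b, dim=0).item():.3f}")
</code></pre>
<p>Same-content pairs should pull together through the mid-stack and drift apart again near the output layers.</p>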
]]></description><pubDate>Tue, 24 Mar 2026 13:56:19 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502672</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502672</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502672</guid></item><item><title><![CDATA[New comment by dnhkng in "LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?"]]></title><description><![CDATA[
<p>Author here. The result that surprised me most: after evaluating 3,024 beam search candidates, training a surrogate model on ~4,600 measurements, and scoring 2 million configurations — the Pareto-optimal configs were all simple contiguous blocks. No exotic multi-block compositions, no sparse repeats. Just "repeat layers 31–33" and you're on the efficiency frontier.<p>I think this says something interesting about how transformers organise computation internally. The mid-stack reasoning circuits are coherent enough that you can loop through them twice without distribution mismatch. The encoding/decoding boundaries are not.</p>
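<p>Pulling the frontier out of the scored configs is the trivial part compared to the scan itself; something like this (a sketch; the config names, costs, and scores below are made up, and the surrogate model isn't shown):
<pre><code>def pareto_front(candidates):
    """candidates: (config, extra_layers, score). Keep a config only if nothing
    cheaper-or-equal beats its score."""
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))  # cheapest first, best first
    front, best = [], float("-inf")
    for cfg, cost, score in ordered:
        if score > best:
            front.append((cfg, cost, score))
            best = score
    return front

# Toy usage (made-up numbers):
# pareto_front([("31-33", 3, 0.62), ("10-40", 31, 0.63), ("5-7", 3, 0.55)])
# -> [("31-33", 3, 0.62), ("10-40", 31, 0.63)]
</code></pre></p>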
]]></description><pubDate>Tue, 24 Mar 2026 13:30:27 +0000</pubDate><link>https://news.ycombinator.com/item?id=47502311</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502311</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502311</guid></item><item><title><![CDATA[Show HN: More LLM Neuroanatomy: A Hint of a Universal Language?]]></title><description><![CDATA[
<p>Article URL: <a href="https://dnhkng.github.io/posts/rys-ii/">https://dnhkng.github.io/posts/rys-ii/</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47502295">https://news.ycombinator.com/item?id=47502295</a></p>
<p>Points: 1</p>
<p># Comments: 0</p>
]]></description><pubDate>Tue, 24 Mar 2026 13:29:04 +0000</pubDate><link>https://dnhkng.github.io/posts/rys-ii/</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47502295</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47502295</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>I stick with models I can run in VRAM, but DeepSeek Speciale has the best reasoning capabilities of the models I can actually run (<a href="https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale" rel="nofollow">https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Speciale</a>).  What hardware can you access?<p>I have DeepSeek etc., but inferencing on DDR5 would take about 2-3 weeks for a simple scan.  I think this works best with dense models, but it also seems OK with MoE.<p>@everyone: Can someone hook me up with Nvidia sponsorship?</p>
]]></description><pubDate>Fri, 13 Mar 2026 16:41:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=47366673</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47366673</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47366673</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes, I have done these types of experiments; that's for the next post.</p>
]]></description><pubDate>Fri, 13 Mar 2026 15:00:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365405</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47365405</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365405</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Glad to see someone replicate the results already :)</p>
]]></description><pubDate>Fri, 13 Mar 2026 14:59:00 +0000</pubDate><link>https://news.ycombinator.com/item?id=47365376</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47365376</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47365376</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>But blogging is fun!<p>I do wish one of the big labs would sponsor me with a rack of HGX Rubin NVL8s. I have lots of ideas to test, and I have probably hit the spending limit with the boss on hardware (she hasn't seen the new power bill yet...)</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:44:10 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332342</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332342</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332342</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes, that's true.<p>But that points again to the main idea: the model has learnt to transform Base64 into a form it can already use in the 'regular' thinking structures.<p>The alternative is that there is an entire parallel structure <i>just</i> for Base64, which, based on my 'chats' with LLMs in that format, seems implausible; it acts like the regular model.<p>If there is a 'translation' organ in the model, why not math or emotion-processing organs? That's what I set out to find, and it's illustrated in the heatmaps.<p>Also, any writing tips from the Master blogger himself? Huge fan (squeal!)</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:41:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332333</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332333</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332333</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Hi, thanks for the praise!<p>On the other papers, models like SOLAR, or training a model that uses a single layer, are probably going to hit a wall, based on the heatmaps I found. The transformer stack starts with randomised weights (analogous to undifferentiated stem cells), and it seems the layers later form 'organs' during the trillions of pre-training tokens they undergo. My hypothesis is that you probably only want one copy of the 'token-to-thought' and 'thought-to-token' organs. It seems that you can make one layer do all three things (transform in and out, and do the 'thinking'), but I think specialisation will always win.</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:34:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332308</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332308</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332308</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Cheers. I will go back through my other old projects (optogenetics, hacking CRISPR/Cas9, etc.) and put them on my blog.<p>On your questions:
1) A few other papers have been mentioned in the thread, like SOLAR 10.7B. They duplicated the whole transformer stack, and it kinda helped. But as I found experimentally, that's probably not a great idea. You are duplicating 'organs' (i.e. input-processing stuff) that should only have one copy.  Also, that paper didn't see immediate improvements; they had to do continued pre-training to see benefits. At that point, I'm guessing the big labs stopped bothering.  Limited by hardware, I had to find unusual angles to approach this topic.<p>2) Nah, no more wetware for me. I did half a decade of research at a big neurobiology institute, and while it was very enjoyable, I can truly say that grant writing and paper review are 'not my thing'.  The reason this info was delayed so long is that I wanted a paper in the AI field to go along with my papers in other fields. But as a hobbyist with no official affiliation, and the attention span of a gnat, I gave up and started a blog instead. Maybe someone will cite it?</p>
]]></description><pubDate>Wed, 11 Mar 2026 06:26:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47332270</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47332270</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47332270</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>Yes, it's an amazing time to be a hacker!</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:58:30 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328056</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47328056</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328056</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>It's still non-trivial, as multi-digit numbers can be constructed from a huge number of valid token combinations.<p>The code in the blog helps derive useful metrics from partial answers.</p>
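<p>As a rough illustration (not the blog's exact code): because "1234" can be emitted as "12"+"34", "1"+"234", "123"+"4" and so on, one way to score is on the decoded string, with partial credit for a correct digit prefix:
<pre><code>def partial_credit(generated: str, gold: str) -> float:
    """Fraction of the gold answer's leading digits reproduced in the output."""
    digits = "".join(ch for ch in generated if ch.isdigit())
    gold_digits = "".join(ch for ch in gold if ch.isdigit())
    matched = 0
    for a, b in zip(digits, gold_digits):
        if a != b:
            break
        matched += 1
    return matched / len(gold_digits) if gold_digits else 0.0

# partial_credit("The answer is 123,", "1234")  # -> 0.75
</code></pre></p>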
]]></description><pubDate>Tue, 10 Mar 2026 19:57:59 +0000</pubDate><link>https://news.ycombinator.com/item?id=47328051</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47328051</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47328051</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I Topped the HuggingFace Open LLM Leaderboard on Two Gaming GPUs"]]></title><description><![CDATA[
<p>There are similar patterns in the models from all the big labs.  I think the transformer layer stack starts out 'undifferentiated', analogous to stem cells. Pre-training pushes the model to develop structure, and this technique helps discover that hidden structure.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:29:13 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327767</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327767</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327767</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>At some point I will clean up and share the dynamic layer modification code for oobabooga Text-Generation-WebUI.<p>You can enter the settings and apply new re-layering architectures. It's <i>very</i> weird chatting with these brain-damaged models.</p>
]]></description><pubDate>Tue, 10 Mar 2026 19:25:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327730</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327730</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327730</guid></item><item><title><![CDATA[New comment by dnhkng in "Show HN: How I topped the HuggingFace open LLM leaderboard on two gaming GPUs"]]></title><description><![CDATA[
<p>Yes!<p>I tried that pretty early on, but it's basically never good. It's described in the section: <a href="https://dnhkng.github.io/posts/rys/#the-beginning-of-llm-neuroanatomy" rel="nofollow">https://dnhkng.github.io/posts/rys/#the-beginning-of-llm-neu...</a></p>
]]></description><pubDate>Tue, 10 Mar 2026 19:22:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=47327702</link><dc:creator>dnhkng</dc:creator><comments>https://news.ycombinator.com/item?id=47327702</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47327702</guid></item></channel></rss>