<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: ashirviskas</title><link>https://news.ycombinator.com/user?id=ashirviskas</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Sun, 12 Apr 2026 16:48:27 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=ashirviskas" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by ashirviskas in "A new trick brings stability to quantum operations"]]></title><description><![CDATA[
<p>Nah, it's a decade away from *now*.</p>
]]></description><pubDate>Fri, 10 Apr 2026 11:06:23 +0000</pubDate><link>https://news.ycombinator.com/item?id=47716274</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47716274</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47716274</guid></item><item><title><![CDATA[New comment by ashirviskas in "Grace Hopper's Revenge"]]></title><description><![CDATA[
<p>I'll admit, my brain was DDoSed by the article, and I thought that posting it here might get us someone with a more DDoS-proof brain to dissect it.</p>
]]></description><pubDate>Tue, 17 Mar 2026 12:51:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=47411965</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47411965</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47411965</guid></item><item><title><![CDATA[New comment by ashirviskas in "Grace Hopper's Revenge"]]></title><description><![CDATA[
<p>What if it is the quality of the data? The internet is full of terrible Python/JS, but probably not Elixir.</p>
]]></description><pubDate>Tue, 17 Mar 2026 12:44:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=47411880</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47411880</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47411880</guid></item><item><title><![CDATA[New comment by ashirviskas in "Grace Hopper's Revenge"]]></title><description><![CDATA[
<p>I found it interesting that Elixir scores so high, but I'm not sure whether I can agree with the cause.</p>
]]></description><pubDate>Tue, 17 Mar 2026 09:26:48 +0000</pubDate><link>https://news.ycombinator.com/item?id=47410350</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47410350</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47410350</guid></item><item><title><![CDATA[Grace Hopper's Revenge]]></title><description><![CDATA[
<p>Article URL: <a href="https://www.thefuriousopposites.com/p/grace-hoppers-revenge">https://www.thefuriousopposites.com/p/grace-hoppers-revenge</a></p>
<p>Comments URL: <a href="https://news.ycombinator.com/item?id=47410349">https://news.ycombinator.com/item?id=47410349</a></p>
<p>Points: 67</p>
<p># Comments: 56</p>
]]></description><pubDate>Tue, 17 Mar 2026 09:26:48 +0000</pubDate><link>https://www.thefuriousopposites.com/p/grace-hoppers-revenge</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47410349</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47410349</guid></item><item><title><![CDATA[New comment by ashirviskas in "Show HN: I solved Claude Code's context drift with persistent Markdown files"]]></title><description><![CDATA[
<p>> The biggest difference: Saturday: build auth with Claude. Sunday: come back, describe the next feature. Claude reads REQUIREMENTS.md, sees the existing auth schema, and builds the new feature without touching auth. Vs. the normal experience of Claude rewriting everything.<p>What do you mean, rewriting everything?<p>When I started properly structuring my projects, it just follows the pattern and doesn't "rewrite everything". It finds things in the places it expects to find them.<p>Your project seems to solve a specific flaw in your own flow. And it's an npm package, which is super suspicious.<p>EDIT: Oh, it's just a useless product looking for problems to solve, for some $$$ a month.</p>
]]></description><pubDate>Tue, 17 Mar 2026 00:22:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47406963</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47406963</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47406963</guid></item><item><title><![CDATA[New comment by ashirviskas in "Why XML tags are so fundamental to Claude"]]></title><description><![CDATA[
<p>The author does not know what they're talking about.<p>> In other words, XML tags have not only a special place at inference level but also during training<p>Their cited source has zero proof of that. XML shows up in training data just like Python/C/HTML; that doesn't make it special. And no, you don't need to format your prompts as Python code just because Python is in the training data, either.<p>> In truth, it does not matter that these tags are XML. Other models use ad hoc delimiters (as explained in a previous article; example: <|begin_of_text|> and <|end_of_text|>) and Claude could have done the same. What matters is what these tags represent.<p>Those strings are just textual representations of the models' special tokens (e.g. EOS). What do they have to do with anything this article pretends to know about?<p>Please don't post such intellectual trash on here :')<p>Claude's analysis of the article:<p>The author is making an interesting philosophical argument: that XML tags in Claude function as metalinguistic delimiters, analogous to quotation marks in natural language, formulaic speech markers in Homer, or recognition sequences in DNA.<p>The core thesis is about first-order vs. second-order expression boundaries, which is a legitimate linguistic/information-theory concept.
But to your actual question — do they understand what tokens are?<p>No, not in the technical sense you're pointing at. The article conflates two very different things:<p>1. Tokenizer-level special tokens — things like <|begin_of_text|>, <|end_of_text|>, <|start_header_id|> etc. These are literal entries in the vocabulary with dedicated token IDs. They're not "learned" through training in the same way — they're hardcoded into the tokenizer and have special roles in the attention mechanism during training. They exist at a fundamentally different layer than XML tags in prompt text.<p>2. XML tags as structured text within the input — these are just regular tokens (<, instructions, >) that Claude learned to attend to during RLHF/training because Anthropic's training data and system prompts heavily use them. They're effective because of training distribution, not because they occupy some special place in the tokenizer.<p>The author notices that other models use <|begin_of_text|> style delimiters and says Claude "could have done the same" but chose XML instead. That's a category error. Claude also has special tokens at the tokenizer level — XML tags in prompts are a completely separate mechanism operating at a different abstraction layer.<p>The philosophical observation about delimiter necessity in communication systems is fine on its own. But grafting it onto a misunderstanding of how tokenization and model architecture actually work weakens the argument. They're essentially pattern-matching on surface-level similarities (both use angle brackets!) without understanding the underlying mechanics.</p>
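<p>A minimal, purely illustrative sketch of the two layers (a toy greedy tokenizer with a hand-picked vocabulary, not any real model's tokenizer): a tokenizer-level special token matches atomically to one dedicated ID, while an XML tag in prompt text decomposes into ordinary tokens.</p>

```python
# Toy illustration of the two layers. Everything here is hypothetical:
# a minimal greedy tokenizer with a hand-picked vocabulary.

SPECIAL_TOKENS = {"<|begin_of_text|>": 0, "<|end_of_text|>": 1}

# Ordinary subword vocabulary: '<', '>', '/' and words are separate entries.
VOCAB = {"<": 10, ">": 11, "/": 12, "instructions": 13}

def tokenize(text):
    """Greedy longest-match tokenization; specials match as a single ID."""
    ids = []
    i = 0
    while i < len(text):
        # Special tokens are matched atomically, before anything else.
        for tok, tid in SPECIAL_TOKENS.items():
            if text.startswith(tok, i):
                ids.append(tid)
                i += len(tok)
                break
        else:
            # Longest ordinary vocab entry starting at position i.
            match = max((t for t in VOCAB if text.startswith(t, i)),
                        key=len, default=None)
            if match is None:
                raise ValueError(f"untokenizable input at {i}")
            ids.append(VOCAB[match])
            i += len(match)
    return ids

# One dedicated ID for the special token...
print(tokenize("<|begin_of_text|>"))  # [0]
# ...but the XML tag is just ordinary tokens: '<', 'instructions', '>'.
print(tokenize("<instructions>"))     # [10, 13, 11]
```

<p>The second sequence is effective only because the model learned, from its training distribution, to attend to that pattern; the first occupies a structurally privileged slot in the vocabulary.</p>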
]]></description><pubDate>Mon, 02 Mar 2026 02:16:58 +0000</pubDate><link>https://news.ycombinator.com/item?id=47213068</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47213068</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47213068</guid></item><item><title><![CDATA[New comment by ashirviskas in "The path to ubiquitous AI (17k tokens/sec)"]]></title><description><![CDATA[
<p>Smaller quant or smaller model?<p>Afaik it can work with anything, but sharing a vocabulary solves a lot of headaches, and the better the token probabilities match, the more efficient it gets.<p>Which is why it is usually done with same-family models, and most often NOT with just different quantizations of the same model.</p>
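<p>For illustration, a toy sketch of the draft-then-verify step in speculative decoding, assuming a shared vocabulary so token IDs compare directly. The "models" are canned stubs, and real implementations use rejection sampling over probabilities rather than this greedy match:</p>

```python
# Hypothetical sketch: a small draft model proposes several tokens,
# and the big target model verifies them. With a shared vocabulary,
# acceptance is a direct token-ID comparison.

def speculative_step(draft_tokens, target_next_token):
    """Accept draft proposals while they match what the target would
    have produced greedily; stop at the first mismatch."""
    accepted = []
    for tok in draft_tokens:
        if tok == target_next_token(accepted):
            accepted.append(tok)
        else:
            # Mismatch: fall back to the target's own token and stop.
            accepted.append(target_next_token(accepted))
            break
    return accepted

TARGET_SEQUENCE = [5, 9, 2, 7]  # what the target would emit greedily

def target_next_token(context):
    return TARGET_SEQUENCE[len(context)]

# Draft guessed the first two tokens right, then diverged.
print(speculative_step([5, 9, 3], target_next_token))  # [5, 9, 2]
```

<p>The win comes from the target verifying all draft tokens in one batched forward pass; the worse the draft's distribution matches, the shorter the accepted runs, which is why same-family pairs work best.</p>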
]]></description><pubDate>Fri, 20 Feb 2026 13:45:49 +0000</pubDate><link>https://news.ycombinator.com/item?id=47087966</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47087966</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47087966</guid></item><item><title><![CDATA[New comment by ashirviskas in "Claude Sonnet 4.6"]]></title><description><![CDATA[
<p>Someone ping me in 5 years, I want to see if this aged like milk or wine</p>
]]></description><pubDate>Tue, 17 Feb 2026 22:58:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=47054635</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=47054635</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47054635</guid></item><item><title><![CDATA[New comment by ashirviskas in "Fedora Asahi Remix is now working on Apple M3"]]></title><description><![CDATA[
<p>Apple made lower than 16GB M3 models? Man, can't wait till the cheapest model is at least 128GB.</p>
]]></description><pubDate>Tue, 27 Jan 2026 00:32:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46773847</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46773847</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46773847</guid></item><item><title><![CDATA[New comment by ashirviskas in "I was banned from Claude for scaffolding a Claude.md file?"]]></title><description><![CDATA[
<p>Another European chiming in; I enjoyed OP's article.</p>
]]></description><pubDate>Thu, 22 Jan 2026 22:18:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=46725851</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46725851</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46725851</guid></item><item><title><![CDATA[New comment by ashirviskas in "Claude Chill: Fix Claude Code's flickering in terminal"]]></title><description><![CDATA[
<p>Do you also write your bytecode by human hands? At which abstraction layer do we draw the line?</p>
]]></description><pubDate>Wed, 21 Jan 2026 02:17:26 +0000</pubDate><link>https://news.ycombinator.com/item?id=46700382</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46700382</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46700382</guid></item><item><title><![CDATA[New comment by ashirviskas in "Claude Chill: Fix Claude Code's flickering in terminal"]]></title><description><![CDATA[
<p>> it has, but python being single threaded (until recently) didn't make it an attractive choice for CLI tools.<p>You probably mean the GIL, as Python has supported multithreading for like 20 years.<p>I don't know if ranger is slow because it is written in Python; more likely it's the specific implementation.</p>
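<p>A quick self-contained illustration: Python spawns real OS threads and has for decades; the GIL only serializes bytecode execution, which throttles CPU-bound (not I/O-bound) work:</p>

```python
# Python threads are real OS threads. The GIL means only one thread
# executes Python bytecode at a time, but all threads genuinely run.
import threading

results = []
lock = threading.Lock()

def worker(n):
    with lock:  # protect the shared list across threads
        results.append(n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 2, 3] -- four genuine threads all ran
```
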
]]></description><pubDate>Wed, 21 Jan 2026 02:15:16 +0000</pubDate><link>https://news.ycombinator.com/item?id=46700370</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46700370</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46700370</guid></item><item><title><![CDATA[New comment by ashirviskas in "I was a top 0.01% Cursor user, then switched to Claude Code 2.0"]]></title><description><![CDATA[
<p>It is only 4 years old</p>
]]></description><pubDate>Mon, 19 Jan 2026 22:33:25 +0000</pubDate><link>https://news.ycombinator.com/item?id=46685434</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46685434</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46685434</guid></item><item><title><![CDATA[New comment by ashirviskas in "I was a top 0.01% Cursor user, then switched to Claude Code 2.0"]]></title><description><![CDATA[
<p>Keyboard autocomplete?</p>
]]></description><pubDate>Mon, 19 Jan 2026 22:32:47 +0000</pubDate><link>https://news.ycombinator.com/item?id=46685424</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46685424</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46685424</guid></item><item><title><![CDATA[New comment by ashirviskas in "Starting from scratch: Training a 30M Topological Transformer"]]></title><description><![CDATA[
<p>I mean, this is exactly what it is: just a wrapper replacing the tokenizer. That is exactly how LLMs can read images.<p>I'm just focusing on different parts.</p>
]]></description><pubDate>Mon, 19 Jan 2026 03:41:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=46674814</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46674814</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46674814</guid></item><item><title><![CDATA[New comment by ashirviskas in "Starting from scratch: Training a 30M Topological Transformer"]]></title><description><![CDATA[
<p>You get the associations relevant to your project only if you can cram in the whole codebase. Otherwise it is making rough estimates, and some of the time that seems to be where the models fail.<p>It can only be fully resolved with either infinite context length, or by doing it the way humans do: adding some LSP "color" to the code tokens.<p>You can get a feel for what LLMs deal with when you try opening 3000 lines of code in a simple text editor and try to do something. It may work for simple fixes, but not whole-codebase refactors. Only ultra-skilled humans can be productive in it (using my subjective definition of "productive").</p>
]]></description><pubDate>Sun, 18 Jan 2026 17:19:34 +0000</pubDate><link>https://news.ycombinator.com/item?id=46669790</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46669790</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46669790</guid></item><item><title><![CDATA[New comment by ashirviskas in "Starting from scratch: Training a 30M Topological Transformer"]]></title><description><![CDATA[
<p>I wonder what would happen if we just crammed more into the "tokens". I am running an experiment replacing discrete tokens with embeddings plus a small byte encoder/decoder. That way you can use the embedding space much more efficiently and have it carry much more nuance.<p>Experiments I want to build on top of it:<p>1. Adding LSP context to the embeddings, so the model could _see_ the syntax better, closer to how we use IDEs, and would not need to read/grep 25k lines just to find where something is used.
2. Experiments with different "compression" ratios: each embedding could encode a different number of bytes, and we would not rely on a huge static token dictionary.<p>I'm aware that papers exploring these ideas exist, but so far no popular/good open-source models employ this. Unless someone can prove me wrong.</p>
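<p>A hypothetical toy of the byte-patch-to-embedding idea (the "encoder" here is a trivial deterministic projection standing in for a small learned byte encoder): the patch size is the compression ratio, and no static token dictionary is involved.</p>

```python
# Hypothetical sketch: raw bytes are grouped into fixed-size patches and
# each patch is mapped to a dense vector, shrinking sequence length by
# the compression ratio. A learned byte encoder would replace embed_patch.

PATCH_SIZE = 4   # compression ratio: 4 bytes per embedding
EMBED_DIM = 8

def embed_patch(patch: bytes) -> list[float]:
    """Stand-in for a learned encoder: deterministic pseudo-projection."""
    return [((patch[i % len(patch)] * 31 + i * 17) % 256) / 255.0
            for i in range(EMBED_DIM)]

def encode(text: str) -> list[list[float]]:
    data = text.encode("utf-8")
    patches = [data[i:i + PATCH_SIZE] for i in range(0, len(data), PATCH_SIZE)]
    return [embed_patch(p) for p in patches]

seq = encode("def main():")   # 11 bytes -> 3 patch embeddings
print(len(seq), len(seq[0]))  # 3 8
```

<p>Varying PATCH_SIZE (or making it dynamic per patch) is exactly the "different compression ratios" experiment: more bytes per embedding trades sequence length against how much each vector must carry.</p>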
]]></description><pubDate>Sun, 18 Jan 2026 13:42:21 +0000</pubDate><link>https://news.ycombinator.com/item?id=46667749</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46667749</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46667749</guid></item><item><title><![CDATA[New comment by ashirviskas in "Anthropic Explicitly Blocking OpenCode"]]></title><description><![CDATA[
<p>Well, using the Claude Pro/Max Claude Code API without Claude Code, instead of the actual API they monetize, goes against their ToS.<p>I don't like it either, but it is what it is.<p>If I gave free water refills when you used my brand-XYZ water bottle, you should not cry that you don't get free refills for your ABC-branded bottle.<p>It may be scummy, but it does make sense.</p>
]]></description><pubDate>Thu, 15 Jan 2026 01:51:46 +0000</pubDate><link>https://news.ycombinator.com/item?id=46626914</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46626914</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46626914</guid></item><item><title><![CDATA[New comment by ashirviskas in "Chromium Has Merged JpegXL"]]></title><description><![CDATA[
<p>You should never use GIF anymore; it is super inefficient. Just use video instead, it is 5x to 10x more efficient.<p><a href="https://web.dev/articles/replace-gifs-with-videos" rel="nofollow">https://web.dev/articles/replace-gifs-with-videos</a></p>
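<p>A typical conversion command, assuming ffmpeg with libx264 is installed; the filenames are placeholders, and the scale filter rounds odd dimensions down to even, which yuv420p requires:</p>

```shell
# Convert an animated GIF to an H.264 MP4.
# yuv420p keeps broad player compatibility; +faststart helps web playback.
ffmpeg -i animation.gif \
  -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" \
  -c:v libx264 -crf 25 -pix_fmt yuv420p \
  -movflags +faststart -an animation.mp4
```

<p>Serve it with a muted, looping &lt;video&gt; element to keep the GIF-like behavior.</p>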
]]></description><pubDate>Tue, 13 Jan 2026 12:01:11 +0000</pubDate><link>https://news.ycombinator.com/item?id=46599903</link><dc:creator>ashirviskas</dc:creator><comments>https://news.ycombinator.com/item?id=46599903</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46599903</guid></item></channel></rss>