<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hacker News: taliesinb</title><link>https://news.ycombinator.com/user?id=taliesinb</link><description>Hacker News RSS</description><docs>https://hnrss.org/</docs><generator>hnrss v2.1.1</generator><lastBuildDate>Tue, 07 Apr 2026 05:36:50 +0000</lastBuildDate><atom:link href="https://hnrss.org/user?id=taliesinb" rel="self" type="application/rss+xml"></atom:link><item><title><![CDATA[New comment by taliesinb in "Two studies in compiler optimisations"]]></title><description><![CDATA[
<p>Interesting. It all seems very brittle, though. And it suggests that something has gone very wrong with our ecosystem of tools, languages, and processes when it becomes advisable to massage source until specific passes in a specific version of LLVM don't mess things up for other passes.<p>Not picking on the author in the slightest; just thinking in terms of holistic system design. If you know what you want to happen, and you are smart enough to introspect the behavior of the tool and decide that it didn't happen, you are more than smart enough to just write it correctly in the first place.<p>Perhaps that is unrealistic; perhaps there is a hidden iceberg of necessary but convoluted optimizations no human could realistically or legibly write. But ok, where do you <i>really</i> need to engage in this kind of optimization golf? Inlined functions?<p>Ok, what about this targeted language feature for a future-day Zig:<p>1. Write an ordinary Zig function.
2. Write an inline-assembly version of that function.
3. Write a "comptime assert" that the first compiles to the second, which only "runs" for the relevant arch.
4. What should that assert mean? That the compiler just uses your assembly version instead, but _also_ uses existing compiler machinery or an external theorem prover to verify they "behave the same up to X", for customizable values of X.<p>That has the right feel, maybe. You are "pinning" specific, vetted optimizations without compromising the intent, readability, or correctness of your code. And easy iteration is possible, because a failing comptime assert will just dump the assembly; you can even start with an empty manual impl.</p>
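Zig aside, the shape of the idea can be sketched in Python. Here is a hypothetical `pinned` decorator (every name here is invented for illustration) that checks at definition time, on randomly sampled inputs, that a hand-optimized implementation agrees with a reference one, then installs the fast version. Sampling is a weak stand-in for the theorem-prover step, of course; it only shows the ergonomics:

```python
import random

def pinned(reference, samples=1000, seed=0, equal=lambda a, b: a == b):
    """Pair a hand-optimized unary function with a reference implementation.

    At decoration time (the "comptime" analogue), check on sampled inputs
    that the two agree up to `equal`; if so, the optimized version is what
    actually runs. A toy sketch, not a real API.
    """
    def wrap(optimized):
        rng = random.Random(seed)
        for _ in range(samples):
            x = rng.randint(-10**6, 10**6)
            r, o = reference(x), optimized(x)
            if not equal(r, o):
                raise AssertionError(f"diverges at {x}: {r} != {o}")
        return optimized  # "pinned": the vetted fast version replaces the reference
    return wrap

def popcount_ref(n):
    # Obviously-correct reference: count set bits of the low 32 bits.
    return bin(n & 0xFFFFFFFF).count("1")

@pinned(popcount_ref)
def popcount(n):
    # Classic SWAR bit-trick version, masked to 32 bits for Python ints.
    n &= 0xFFFFFFFF
    n = n - ((n >> 1) & 0x55555555)
    n = (n & 0x33333333) + ((n >> 2) & 0x33333333)
    return (((n + (n >> 4)) & 0x0F0F0F0F) * 0x01010101 & 0xFFFFFFFF) >> 24
```

A failing check raises immediately with a counterexample, which is the "dump the assembly" iteration loop in miniature.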
]]></description><pubDate>Thu, 26 Mar 2026 06:00:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47527068</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=47527068</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47527068</guid></item><item><title><![CDATA[New comment by taliesinb in "Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training"]]></title><description><![CDATA[
<p>Amusingly, you need only have circuits of prime depth, though you should probably adjust their widths using something principled, perhaps Euler's totient function.</p>
]]></description><pubDate>Thu, 19 Mar 2026 02:09:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=47433960</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=47433960</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47433960</guid></item><item><title><![CDATA[New comment by taliesinb in "Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training"]]></title><description><![CDATA[
<p>And I bet these would be useful in the initial and final parts of the transformer too. Because syntactic parsing and unparsing of brackets, programming-language ASTs, etc. is highly recursive, current models are no doubt painfully learning "unrolled" versions of the relevant recursive circuits, unrolled to some fixed depth that must compete for layers with other circuits, since your total budget is 60 or whatever. Incredibly duplicative and by definition unable to generalize to arbitrary depth!</p>
]]></description><pubDate>Thu, 19 Mar 2026 02:04:05 +0000</pubDate><link>https://news.ycombinator.com/item?id=47433916</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=47433916</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47433916</guid></item><item><title><![CDATA[New comment by taliesinb in "Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training"]]></title><description><![CDATA[
<p>There is an obvious implication: since the initial models were trained <i>without</i> loops, it is exceedingly unlikely that a single stack of N consecutive layers represents <i>only</i> a single, repeatable circuit that can be safely looped. It is much more likely that the loopable circuits are superposed across multiple layers and have different effective depths.<p>That you <i>can</i> profitably loop some, say, 3-layer stack is likely a happy accident, where the performance <i>loss</i> from looping 3/4 of mystery circuit X that partially overlaps that stack is more than outweighed by the performance <i>gain</i> from looping 3/3 of mystery circuit Y that exactly aligns with that stack.<p>So, if you are willing to train from scratch, just build the looping in during training and let each circuit find its place, in disentangled stacks of various depths. The middle of the transformer is:<p>(X₁)ᴹ ⊕ (Y₁∘Y₂)ᴺ ⊕ (Z₁∘Z₂∘Z₃)ᴾ ⊕ …<p>Notation: Xᵢ is a layer (of very small width) in a circuit of depth 1..i..D, ⊕ is parallel composition (which sums the widths presented to the rest of the transformer), ∘ is serial composition (stacking), and ᴹ is looping. The values of ᴹ, ᴺ, ᴾ shouldn't matter as long as they are > 1; the point is to crank them up after training.<p>Ablating these individual circuits will tell you whether you needed them at all, but also roughly what they were <i>for</i> in the first place, which would be very interesting.</p>
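As a toy illustration of the ⊕-of-looped-stacks shape (random numpy maps standing in for trained layers; nothing here is a real model), the middle block could look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(width):
    """One narrow 'circuit layer': a random linear map plus tanh, a stand-in."""
    W = rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
    return lambda x: np.tanh(x @ W)

def circuit(depth, width):
    """A serial stack L_1 ∘ ... ∘ L_depth of narrow layers, loopable as a unit."""
    layers = [layer(width) for _ in range(depth)]
    def run(x, loops):
        for _ in range(loops):       # (L_1 ∘ ... ∘ L_depth)^loops
            for f in layers:
                x = f(x)
        return x
    return run

def middle(x, circuits, widths, loops):
    """Parallel composition ⊕: split the residual stream by width,
    run each looped circuit on its own slice, concatenate the results."""
    outs, i = [], 0
    for run, w, m in zip(circuits, widths, loops):
        outs.append(run(x[..., i:i + w], m))
        i += w
    return np.concatenate(outs, axis=-1)

widths = [4, 8, 16]                    # X has depth 1, Y depth 2, Z depth 3
circuits = [circuit(d, w) for d, w in zip([1, 2, 3], widths)]
x = rng.normal(size=(5, sum(widths)))  # batch of 5 residual-stream vectors
y = middle(x, circuits, widths, loops=[2, 3, 2])
```

Cranking up the `loops` values after training is then just a change of arguments, not of architecture.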
]]></description><pubDate>Thu, 19 Mar 2026 01:57:57 +0000</pubDate><link>https://news.ycombinator.com/item?id=47433868</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=47433868</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=47433868</guid></item><item><title><![CDATA[New comment by taliesinb in "Unison 1.0"]]></title><description><![CDATA[
<p>If you could link to where this is implemented I'd be very grateful!</p>
]]></description><pubDate>Wed, 26 Nov 2025 02:27:51 +0000</pubDate><link>https://news.ycombinator.com/item?id=46053534</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=46053534</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46053534</guid></item><item><title><![CDATA[New comment by taliesinb in "Unison 1.0"]]></title><description><![CDATA[
<p>Hello! Yes I am curious, how does one deal with cycles in the code hash graph? Mutually recursive functions for example?</p>
]]></description><pubDate>Tue, 25 Nov 2025 20:43:44 +0000</pubDate><link>https://news.ycombinator.com/item?id=46050530</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=46050530</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=46050530</guid></item><item><title><![CDATA[New comment by taliesinb in "Mysterious cosmic 'dots' are baffling astronomers. What are they?"]]></title><description><![CDATA[
<p>I’ve been super interested in these kinds of cosmic turduckens. See also <a href="https://en.wikipedia.org/wiki/Thorne%E2%80%93%C5%BBytkow_object" rel="nofollow">https://en.wikipedia.org/wiki/Thorne%E2%80%93%C5%BBytkow_obj...</a> and <a href="https://en.wikipedia.org/wiki/Quasi-star" rel="nofollow">https://en.wikipedia.org/wiki/Quasi-star</a></p>
]]></description><pubDate>Fri, 17 Oct 2025 06:28:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=45613843</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=45613843</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=45613843</guid></item><item><title><![CDATA[New comment by taliesinb in "Implementing a functional language with graph reduction (2021)"]]></title><description><![CDATA[
<p>Given that the whole name-binding thing is ultimately a story of how to describe a graph using a tree, I was primed to look for monoidal-category-ish things, and sure enough the S and K combinators look very much like copy and delete operators; comultiplication and counit for a comonoid. That’s very vibe-based; does anyone know of a formal version of this observation?</p>
]]></description><pubDate>Fri, 25 Jul 2025 19:05:17 +0000</pubDate><link>https://news.ycombinator.com/item?id=44687005</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=44687005</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=44687005</guid></item><item><title><![CDATA[New comment by taliesinb in "The algebra and calculus of algebraic data types (2015)"]]></title><description><![CDATA[
<p>Yes, this is a very cool story.<p>But, fascinatingly, integration does in fact have a meaning. First, recall from the OP that d/dX List(X) = List(X) * List(X). You punched a hole in a list and you got two lists: the list to the left of the hole and the list to the right of the hole.<p>Ok, so now define CrazyList(X) to be the anti-derivative of <i>one</i> list: d/dX CrazyList(X) = List(X). Then notice that punching a hole in a <i>cyclic</i> list does not cause it to fall apart into two lists, since the list to the left and to the right are the <i>same</i> list. CrazyList = CyclicList! Aka a ring buffer.<p>There's a paper on this; apologies, I can't find it right now. Maybe Altenkirch or a student of his.<p>The true extent of this goes far beyond anything I imagined; this is really only the tip of a vast iceberg.</p>
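The hole-punching can be made concrete in a few lines of Python (representing a cyclic list by any of its rotations):

```python
def punch_hole(xs, i):
    """One-hole context of a list: d/dX List(X) = List(X) * List(X).
    Removing element i leaves the prefix and the suffix as two lists."""
    return xs[:i], xs[i + 1:]

def punch_hole_cyclic(xs, i):
    """One-hole context of a *cyclic* list: rotate so the hole sits at the
    end, and the 'left' and 'right' parts merge into a single list,
    i.e. d/dX CyclicList(X) = List(X)."""
    return xs[i + 1:] + xs[:i]

left, right = punch_hole([1, 2, 3, 4, 5], 2)  # two pieces fall out
ring = punch_hole_cyclic([1, 2, 3, 4, 5], 2)  # one piece: the ring stays connected
```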
]]></description><pubDate>Wed, 24 Jul 2024 15:34:20 +0000</pubDate><link>https://news.ycombinator.com/item?id=41058125</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=41058125</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=41058125</guid></item><item><title><![CDATA[New comment by taliesinb in "Show HN: Mandala – Automatically save, query and version Python computations"]]></title><description><![CDATA[
<p>Cool! Looks pretty professional.<p>I explored a similar idea once (also implemented in Python, via decorators) to help speed up some neuroscience research that involved a lot of hyperparameter sweeps. It's named after a Borges story about a man cursed to remember everything: <a href="https://github.com/taliesinb/funes">https://github.com/taliesinb/funes</a><p>Maybe one day we'll have a global version of this, where all non-private computations are cached on a global distributed store somehow via content-based hashing.</p>
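A toy sketch of the content-hashing idea (invented names throughout; this is not how Mandala or funes actually work, and bytecode hashing here is just a cheap stand-in for real content addressing):

```python
import hashlib
import pickle

_store = {}  # stand-in for a persistent (or, one day, global shared) cache

def memoize_by_content(fn):
    """Cache results keyed by a hash of the function's bytecode plus its
    pickled arguments, so editing the function naturally invalidates its
    old cached results."""
    code = fn.__code__.co_code + repr(fn.__code__.co_consts).encode()
    def wrapped(*args, **kwargs):
        blob = code + pickle.dumps((args, sorted(kwargs.items())))
        key = hashlib.sha256(blob).hexdigest()
        if key not in _store:
            _store[key] = fn(*args, **kwargs)
        return _store[key]
    return wrapped

calls = []

@memoize_by_content
def slow_square(x):
    calls.append(x)  # records how often the real computation actually runs
    return x * x

first, second = slow_square(7), slow_square(7)  # second call is a cache hit
```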
]]></description><pubDate>Thu, 11 Jul 2024 23:27:09 +0000</pubDate><link>https://news.ycombinator.com/item?id=40941529</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40941529</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40941529</guid></item><item><title><![CDATA[New comment by taliesinb in "Exploring biphasic programming: a new approach in language design"]]></title><description><![CDATA[
<p>The end-game is just dissolving any distinction between compile-time and run-time. Other examples of dichotomies that could be partially dissolved by similar kinds of universal acid:<p>* dynamic typing vs static typing, a continuum that JIT-ing and compiling attack from either end -- in some sense dynamically typed programs are ALSO statically typed -- with all function types being dependent function types and all value types being sum types. After all, a term of a dependent sum, a dependent pair, <i>is</i> just a boxed value.<p>* monomorphisation vs polymorphism-via-vtables/interfaces/protocols, which trade, roughly speaking, instruction-cache density for data-cache density<p>* RC vs GC vs allocation managed via compiler-assisted proofs of the memory-ownership relationships that say how this is supposed to happen<p>* privileging the stack and instruction pointer rather than making this kind of transient program state a first-class data structure like any other, to enable implementing your own co-routines and whatever else. An analogous situation: Zig deciding that memory allocation should NOT be so privileged as to be an "invisible facility" one assumes is global.<p>* privileging <i>pointers</i> themselves as a global type constructor rather than as typeclasses. We could have pointer-using functions that transparently monomorphize in more efficient ways when you happen to know how many items you need and how they can be accessed, owned, allocated, and de-allocated. Global heap pointers waste <i>so</i> much space.<p>Instead, one would have code for which it makes more or less sense to spend time optimizing in ways that privilege memory usage, execution efficiency, instruction density, clarity of denotational semantics, etc., etc.<p>Currently, we have these weird siloed ways of doing <i>certain</i> kinds of privileging in <i>certain</i> languages, with rather arbitrary boundaries for how far you can go.<p>I hope one day we have languages that just dissolve all of this decision-making and engineering into universal facilities in which the language can be anything you need it to be -- it's just a neutral substrate for expressing computation and how you want to produce machine artifacts that can be run in various ways.<p>Presumably a future language like this, if it ever exists, would descend from one of today's proof assistants.</p>
]]></description><pubDate>Tue, 02 Jul 2024 21:41:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=40860692</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40860692</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40860692</guid></item><item><title><![CDATA[New comment by taliesinb in "Zig Goals"]]></title><description><![CDATA[
<p>An earnest question: can you elaborate on what Zig got wrong in that respect?</p>
]]></description><pubDate>Fri, 07 Jun 2024 20:14:14 +0000</pubDate><link>https://news.ycombinator.com/item?id=40612502</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40612502</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40612502</guid></item><item><title><![CDATA[New comment by taliesinb in "Ask HN: Who is hiring? (June 2024)"]]></title><description><![CDATA[
<p>In another life!</p>
]]></description><pubDate>Tue, 04 Jun 2024 19:40:08 +0000</pubDate><link>https://news.ycombinator.com/item?id=40578187</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40578187</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40578187</guid></item><item><title><![CDATA[New comment by taliesinb in "Ask HN: Who is hiring? (June 2024)"]]></title><description><![CDATA[
<p>Yes, get in touch with me.</p>
]]></description><pubDate>Tue, 04 Jun 2024 19:38:31 +0000</pubDate><link>https://news.ycombinator.com/item?id=40578170</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40578170</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40578170</guid></item><item><title><![CDATA[New comment by taliesinb in "Ask HN: Who is hiring? (June 2024)"]]></title><description><![CDATA[
<p>Symbolica.ai | London, Australia | REMOTE, INTERNS, VISA<p>We're trying to apply the insights of category theory, dependent type theory, and functional programming to deep learning. How do we best equip neural nets with strong inductive biases from these fields to help them reason in a structured way? Our upcoming ICML paper gives some flavor <a href="https://arxiv.org/abs/2402.15332" rel="nofollow">https://arxiv.org/abs/2402.15332</a> ; you can also watch <a href="https://www.youtube.com/watch?v=rie-9AEhYdY&t=387s" rel="nofollow">https://www.youtube.com/watch?v=rie-9AEhYdY&t=387s</a> ; but there is a lot more to say.<p>If you are fluent in 2 or more of { category theory, Haskell (/Idris/Agda/...), deep learning }, you'll probably have a lot of fun with us!<p>Check out our open positions at <a href="https://jobs.gusto.com/boards/symbolica-ai-67195a74-31b4-4052-ba18-e859d461808c" rel="nofollow">https://jobs.gusto.com/boards/symbolica-ai-67195a74-31b4-405...</a></p>
]]></description><pubDate>Mon, 03 Jun 2024 16:54:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=40564574</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40564574</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40564574</guid></item><item><title><![CDATA[New comment by taliesinb in "Einsum for Tensor Manipulation"]]></title><description><![CDATA[
<p>No worries! Yes, exactly. It's also similar to doing arithmetic with and without units. Sure, you can do arithmetic without units, but when you are actually working with real-world quantities, you easily get yourself into a muddle. Unit-carrying quantities and the algebraic systems built on them prevent you from doing silly things, and in fact <i>guide</i> you to getting what you want.</p>
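A minimal sketch of what unit-carrying quantities buy you, as a toy `Quantity` class invented here (real libraries like pint are far richer):

```python
class Quantity:
    """A number carrying a units dict mapping symbol -> integer power,
    e.g. {"m": 1, "s": -1} for metres per second."""

    def __init__(self, value, units):
        self.value = value
        self.units = {u: p for u, p in units.items() if p != 0}

    def __add__(self, other):
        # Prevents silly things: adding metres to seconds is a type error.
        if self.units != other.units:
            raise TypeError(f"cannot add {self.units} and {other.units}")
        return Quantity(self.value + other.value, self.units)

    def __truediv__(self, other):
        # Guides you: dividing quantities subtracts unit powers.
        units = dict(self.units)
        for u, p in other.units.items():
            units[u] = units.get(u, 0) - p
        return Quantity(self.value / other.value, units)

dist = Quantity(100.0, {"m": 1})
time = Quantity(8.0, {"s": 1})
speed = dist / time  # the algebra produces m·s⁻¹ for you
```

Adding `dist + time` raises `TypeError`, exactly the muddle-prevention the analogy is about.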
]]></description><pubDate>Sun, 28 Apr 2024 11:21:53 +0000</pubDate><link>https://news.ycombinator.com/item?id=40187741</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40187741</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40187741</guid></item><item><title><![CDATA[New comment by taliesinb in "Einsum for Tensor Manipulation"]]></title><description><![CDATA[
<p>Looks interesting, but your link is broken; can you try again, or give us a direct PDF or arXiv link?</p>
]]></description><pubDate>Sat, 27 Apr 2024 21:58:12 +0000</pubDate><link>https://news.ycombinator.com/item?id=40183970</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40183970</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40183970</guid></item><item><title><![CDATA[New comment by taliesinb in "Einsum for Tensor Manipulation"]]></title><description><![CDATA[
<p>Thanks for highlighting XArray in your other comments. Yup, XArray is great. As are Dex and the various libraries for named-axis DL programming within PyTorch and JAX. I never said these things don't exist -- I even mention them in the linked blog series!<p>But I do think it's fair to say they are in their infancy, and that there is a missing theoretical framework to explain what is going on.<p>I anticipate that name-free array programming will eventually be considered a historical curiosity for most purposes, and everyone will wonder how we did without it for so long.</p>
]]></description><pubDate>Sat, 27 Apr 2024 21:55:33 +0000</pubDate><link>https://news.ycombinator.com/item?id=40183943</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40183943</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40183943</guid></item><item><title><![CDATA[New comment by taliesinb in "Einsum for Tensor Manipulation"]]></title><description><![CDATA[
<p>While I applaud the OP's exposition, imagine that instead of having axis names live as single-letter variables within einsum, our arrays themselves had these names attached to their axes.<p>It's almost like when we moved from writing machine code to structured programming: instead of being content to document that certain registers correspond to certain semantically meaningful quantities at particular times during program execution, we manipulated <i>only</i> named variables and left compilers to figure out how to allocate these to registers. We're at that point now with array programming.<p><a href="https://nlp.seas.harvard.edu/NamedTensor" rel="nofollow">https://nlp.seas.harvard.edu/NamedTensor</a><p><a href="https://math.tali.link/rainbow-array-algebra/" rel="nofollow">https://math.tali.link/rainbow-array-algebra/</a><p><a href="https://arxiv.org/abs/2102.13196" rel="nofollow">https://arxiv.org/abs/2102.13196</a></p>
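A toy sketch of names attached to axes (an API invented here for illustration, not that of any of the linked libraries), where contraction is specified by which named axes survive, and the single-letter einsum subscripts are generated internally:

```python
import numpy as np

class NamedArray:
    """A numpy array whose axes carry names instead of positions."""
    def __init__(self, data, axes):
        assert data.ndim == len(axes)
        self.data, self.axes = data, tuple(axes)

def contract(x, y, out_axes):
    """Named einsum: any axis name shared by x and y but absent from
    out_axes is summed over; no subscript letters to keep straight."""
    names = sorted(set(x.axes) | set(y.axes))
    letter = {n: chr(ord("a") + i) for i, n in enumerate(names)}
    sub = lambda axes: "".join(letter[a] for a in axes)
    spec = f"{sub(x.axes)},{sub(y.axes)}->{sub(out_axes)}"
    return NamedArray(np.einsum(spec, x.data, y.data), out_axes)

v = NamedArray(np.ones((2, 3)), ("batch", "dim"))
w = NamedArray(np.ones((3, 4)), ("dim", "head"))
out = contract(v, w, ("batch", "head"))  # "dim" is shared and absent: summed
```

The caller never sees positional axes at all; transposing `w` to `("head", "dim")` would change nothing, which is the point.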
]]></description><pubDate>Sat, 27 Apr 2024 20:48:29 +0000</pubDate><link>https://news.ycombinator.com/item?id=40183379</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=40183379</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=40183379</guid></item><item><title><![CDATA[New comment by taliesinb in "Linear Algebra of Types (2019)"]]></title><description><![CDATA[
<p>Hey there, a bit late but I've read your other comments and I'd like to get in touch. I happen to be very focused on type derivatives, in the context of applying category theory to AI, having just discovered Conor's original papers. Your comment was extremely helpful. Please email me at tali@tali.link if you see this.</p>
]]></description><pubDate>Mon, 08 Apr 2024 05:27:37 +0000</pubDate><link>https://news.ycombinator.com/item?id=39966463</link><dc:creator>taliesinb</dc:creator><comments>https://news.ycombinator.com/item?id=39966463</comments><guid isPermaLink="false">https://news.ycombinator.com/item?id=39966463</guid></item></channel></rss>