Hacker News: samsartor

New comment by samsartor in "What happened to nerds?"

samsartor — Mon, 15 Jun 2026 10:20:15 +0000

The premise of this post is that tech founders used to be admirable nerds, but have since changed. I wonder if it isn't the other way around. We're the nerds. Us. Here. We used to admire tech founders because sometimes they were nerds too, but then we changed. We grew up. We got wise to it.

The author wants founders to stop projecting “an obsession with wealth and power” and instead “focus carefully on projecting an obsession with core nerd values”. And maybe it doesn't occur to them (as a fellow nerd) that _wealth and power were the whole point_. The author enjoyed being blind to the greed of it all, and now, unable to unsee it, they are begging the founders “please please just pretend a bit better”.

New comment by samsartor in "Are you expected to run five Python type-checkers now?"

samsartor — Mon, 08 Jun 2026 14:32:50 +0000

Elementwise equality! Given two dataframe columns or ndarrays, users often expect `==` to give out a column or ndarray of bools (like `+`, `-`, `*`, `&`, and just about every other binary operator).
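A minimal sketch of what that looks like, using a toy `Column` class of my own (hypothetical, not any real library's type) so it runs with only the standard library. The point is just that `==` hands back one bool per element rather than a single `bool`, which is what `object.__eq__` is annotated to return.

```python
class Column:
    """Toy stand-in for a dataframe column or ndarray (hypothetical type)."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # Elementwise addition, like most array libraries.
        return Column(a + b for a, b in zip(self.values, other.values))

    def __eq__(self, other):
        # Elementwise equality: returns a Column of bools, not a single bool.
        return Column(a == b for a, b in zip(self.values, other.values))


left = Column([1, 2, 3])
right = Column([1, 5, 3])

mask = left == right
print(mask.values)            # [True, False, True]
print((left + right).values)  # [2, 7, 6]
```

So `left == right` behaves like every other binary operator here, and anything that expects a plain `bool` back from `==` gets a `Column` instead.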

New comment by samsartor in "1-Bit Bonsai Image 4B Image Generation for Local Devices"

samsartor — Sun, 31 May 2026 21:27:02 +0000

Personally I think it's fine to use "diffusion" to refer to the whole family of models

New comment by samsartor in "A sleep-like consolidation mechanism for LLMs"

samsartor — Tue, 26 May 2026 19:19:31 +0000

Yah I think E2E-TTT is a lot more like what people in this comments section are picturing. I can't tell that this method updates model weights at all during the "sleep" period, only the usual SSM state updated by any Mamba model after each token. They just optimized the model to use that SSM state _more_ when an eviction is about to happen.

New comment by samsartor in "Language Models Need Sleep"

samsartor — Tue, 26 May 2026 19:07:48 +0000

The abstract and method sections only mention updating the SSM state during "sleep" (ie the same vectors that change after each token in stock Mamba), not any of the actual weight matrices. AFAICT this is just another attention compaction paper, with a misleading title? It is not very clearly written.

New comment by samsartor in "RISC-V Router"

samsartor — Fri, 15 May 2026 02:50:13 +0000

Under the hood, the StartWRT UI is just another OpenWRT package, and it plays nicely with luci.

New comment by samsartor in "RISC-V Router"

samsartor — Fri, 15 May 2026 02:45:06 +0000

I helped a bit to develop this UI myself. Support for vlans was baked into it from day 1. The idea being good admin/guest/iot/hosted/etc separation without extra access points.

New comment by samsartor in "Landmark ancient-genome study shows surprise acceleration of human evolution"

samsartor — Sat, 18 Apr 2026 05:15:21 +0000

My understanding is that humans have very limited genetic diversity compared to most other animals, because of the population bottlenecks we've been through. And further, that diversity is mostly between individuals, not between groups. The distinction is easy to see in cats vs dogs: they both have similar overall genetic diversity but two Chihuahuas have virtually all the same genes (the small angry ones) while two tabby cats are more distinct. The two cats have different combinations of big/small nice/mean smart/dumb, but the genes average out to the same "typical" kind of cat in both cases.

Because humans get around so much, and because we think interesting-looking people are hot, the diversity is spread pretty broadly across the whole population. The average european person and the average east asian person are a little bit different genetically, but way less different than any two real europeans or two east-asians are to one another.

In short, the distributions of individuals overlap so much that the trendlines are pretty close to useless. And historically speaking, the people who tried to make a hard distinction out of those trendlines had awful motives.

New comment by samsartor in "What is RISC-V and why it matters to Canonical"

samsartor — Fri, 10 Apr 2026 23:09:37 +0000

I think that this is something of a misunderstanding. There isn't a literal RISC processor inside the x86 processor with a tiny little compiler sitting in the middle. It's more that the out-of-order execution model breaks up instructions into μops so that the μops can separately queue at the core's dozens of ALUs, multiple load/store units, virtual->physical address translation units, etc. The units all work together in parallel to chug through the incoming instructions. High-performance RISC-V processors do exactly the same thing, despite already being "RISC".

New comment by samsartor in "Pushing and Pulling: Three reactivity algorithms"

samsartor — Sun, 08 Mar 2026 23:31:54 +0000

I've been working on a reactivity system for rust over the past couple of years, which uses a lot of these ideas! It also tries to make random concurrent modification less of a pain, with transactional memory and CRDT stuff. And gives you free undo/redo.

Still kind of WIP, but it isn't secret. People are welcome to check it out at https://gitlab.com/samsartor/hornpipe

New comment by samsartor in "Building a new Flash"

samsartor — Thu, 05 Mar 2026 08:48:43 +0000

https://ruffle.rs/ is pretty solid

New comment by samsartor in "Elsevier shuts down its finance journal citation cartel"

samsartor — Mon, 23 Feb 2026 14:59:30 +0000

I feel like my papers are better for having gone through peer review, and I'm a better researcher for having had a few rejections. Of course the reviewers can't hover around in your lab watching everything you do. But even if reviewers can't check the validity of the evidence in your paper, they do a pretty good job ensuring that the claims you make are supported by the evidence you present. That's a valuable if imperfect guardrail! What would be the alternative?

New comment by samsartor in "Vitamin D and Omega-3 have a larger effect on depression than antidepressants"

samsartor — Thu, 29 Jan 2026 12:50:07 +0000

Several people in my family have an MTHFR gene mutation that screws stuff up, including causing problems with anxiety+depression. But a simple B12 shot every couple of weeks does wonders.

New comment by samsartor in "Backpropagation is a leaky abstraction (2016)"

samsartor — Sun, 02 Nov 2025 17:10:02 +0000

Yes. Pretraining and fine-tuning use standard Adam optimizers (usually with weight decay). Reinforcement learning has been the odd man out historically, but these days almost all RL algorithms also use backprop and gradient descent.

New comment by samsartor in "'Attention is all you need' coauthor says he's 'sick' of transformers"

samsartor — Fri, 24 Oct 2025 18:21:41 +0000

I'm skeptical that we'll see a big breakthrough in the architecture itself. As sick as we all are of transformers, they are really good universal approximators. You can get some marginal gains, but how much more _universal_ are you realistically going to get? I could be wrong, and I'm glad there are researchers out there looking at alternatives like graphical models, but for my money we need to look further afield. Reconsider the auto-regressive task, cross entropy loss, even gradient descent optimization itself.

New comment by samsartor in "Show HN: Every single torrent is on this website"

samsartor — Mon, 29 Sep 2025 18:34:16 +0000

In a library of all possible strings, this is just text compression (as the other comment observes). But in a finite library it gets even simpler, in a cool way! We can treat each text as a unique symbol and use an entropy encoding (eg Huffman) to assign a length-optimized key to each based on its likelihood (eg from an LLM). Building the library is something like O(n log n), which isn't terrible. But adding new texts would change the IDs for existing texts (which is annoying). There might be a good way to reserve space for future entries probabilistically? Out of my depth at this point!
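A rough sketch of the Huffman version, to make the idea concrete. The function name and the likelihood numbers are made up for illustration (a real library would get likelihoods from somewhere like an LLM), and the dict copying is for brevity rather than speed, so this particular implementation does more work than the O(n log n) heap operations alone.

```python
import heapq
from itertools import count


def huffman_ids(likelihoods):
    """Map each text to a bit-string ID; likelier texts get shorter IDs."""
    tiebreak = count()  # unique counter so equal weights never compare the dicts
    heap = [(p, next(tiebreak), {text: ""}) for text, p in likelihoods.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least likely subtrees, prefixing their IDs with 0 and 1.
        p0, _, zeros = heapq.heappop(heap)
        p1, _, ones = heapq.heappop(heap)
        merged = {text: "0" + code for text, code in zeros.items()}
        merged.update({text: "1" + code for text, code in ones.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Made-up likelihoods, purely illustrative.
library = {"Moby-Dick": 0.5, "Hamlet": 0.25, "Ulysses": 0.125, "Dracula": 0.125}
before = huffman_ids(library)
print(before)
# {'Moby-Dick': '0', 'Hamlet': '10', 'Ulysses': '110', 'Dracula': '111'}

# Adding one text rebuilds the tree, so existing IDs are free to change.
after = huffman_ids({**library, "Emma": 0.25})
changed = [text for text in before if before[text] != after[text]]
print(changed)
```

The second call shows the annoying part: the IDs are a function of the whole library, so inserting a single entry can reshuffle keys that were already handed out.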

New comment by samsartor in "WASM 3.0 Completed"

samsartor — Wed, 17 Sep 2025 19:51:18 +0000

My old team shipped a web port of our 3D modeling software back in 2017. The entire engine is the same as the desktop app, written in C++, and compiled to wasm.

Wasm is not now and will never be a magic "press here to replace JS with a new language" button. But it works really well for bringing systems software into a web environment.

New comment by samsartor in "Starship's Tenth Flight Test"

samsartor — Sun, 24 Aug 2025 22:36:11 +0000

The simulatable stuff is almost perfect. It's the stuff that can't be simulated that fails.

Take the last flight as an example. The booster experienced what was (probably) a structural failure in the propellant fuel lines. Simulating stress in the structure under static conditions is quite straightforward. Simulating the stress as the rocket ascends vertically and the tanks empty is hard, but doable.

Simulating the dynamic loading as the rocket flips? The fuel sloshes around, the sloshing fuel changes the kinematics of the rocket, the kinematics of the rocket change how the fuel sloshes, the engines try to correct adding a new force, the thrust from the engines creates increased force on the fuel increasing the pressure to the pumps, the performance of the engines changes because of the new fuel flow, that alters the acceleration further causing fuel to slosh, gas bubbles are entrained in the fuel from all the sloshing thus altering its flow/sloshing behavior, valves open and close creating pressure waves in the fuel that travel up and down the fuel lines (the water-hammer effect alone being enough to burst the pipes if valve closing is not well-timed), and the rocket itself flexes as all this happens, testing every exact detail of the manufacturing which you have to go out to the factory and physically measure. No simulation software ever imagined can handle all that coupling of systems.

The usual solution is to make some conservative estimates (the center-of-mass of the fuel will move by at most some amount, bubbles will last at most some time, the engines will have so much control authority, etc). But that requires experience. And this is aerospace, so safety margins are tiny.

New comment by samsartor in "Derivatives, Gradients, Jacobians and Hessians"

samsartor — Sun, 17 Aug 2025 16:34:37 +0000

And remember, optimization problems can be _incredibly_ high-dimensional. A 7B parameter LLM is a 7-billion-dimensional optimization landscape. A grid-search with a resolution of 10 (ie 10 samples for each dimension) would require evaluating the loss function 10^(7*10^9) times. That is, the number of evaluations is a number with 7B digits.
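A quick sanity check of that arithmetic, as a sketch. The helper name is mine, and it works in log space because actually building the integer 10^(7*10^9) would take gigabytes of memory.

```python
import math


def log10_grid_evals(num_params: int, resolution: int) -> float:
    """log10 of resolution**num_params, without constructing the huge integer."""
    return num_params * math.log10(resolution)


log_evals = log10_grid_evals(7_000_000_000, 10)

# A positive integer N has floor(log10(N)) + 1 decimal digits.
digits = math.floor(log_evals) + 1
print(digits)  # 7000000001
```

So the evaluation count is a 1 followed by 7 billion zeros, ie a number with roughly 7B digits.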

New comment by samsartor in "AI is a floor raiser, not a ceiling raiser"

samsartor — Thu, 31 Jul 2025 20:18:50 +0000

This also tracks with my experience. Of course, technical progress never looks smooth through the steep part of the s-curve, more a sequence of jagged stair-steps (each their own little s-curve in miniature). We might only be at the top of a stair. But my feeling is that we're exhausting the form-factor of LLMs. If something new and impressive comes along it'll be shaped different and fill a different niche.