Hacker News: rahen

New comment by rahen in "The Smallest Brain You Can Build: A Perceptron in Python"

rahen — Mon, 08 Jun 2026 14:27:24 +0000

The first AI winter was largely triggered by Minsky and Papert's 1969 book Perceptrons, which mathematically proved that single-layer perceptrons can't solve problems that aren't linearly separable, such as XOR. Favorite quote: "Our intuitive judgment is that the extension [to multilayer systems] is sterile."
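
A minimal sketch of that limitation, using the classic perceptron learning rule on two-input gates. The function and variable names are illustrative and not taken from the article or the book. AND is linearly separable, so training settles on a solution. XOR is not, so at least one sample is misclassified in every epoch, however long training runs.

```python
# Illustrative sketch: a single-layer perceptron on AND vs. XOR.
# All names below are hypothetical, chosen for this example only.

def predict(w, b, x):
    # Hard-threshold unit: fires only when the weighted sum is positive.
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_perceptron(samples, epochs=100, lr=1):
    w, b = [0, 0], 0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                errors += 1
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
        if errors == 0:
            # A full pass with no mistakes: the data has been separated.
            return w, b, True
    return w, b, False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and, and_converged = train_perceptron(AND)
w_xor, b_xor, xor_converged = train_perceptron(XOR)
print("AND converged:", and_converged)
print("XOR converged:", xor_converged)
```

Adding a hidden layer removes the limitation, which is where backpropagation comes in.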

Yet we had the computational power to run backpropagation in the 1960s and small Transformers in the 1970s (I'm the author of both):

https://github.com/dbrll/Xortran (backprop on IBM 1130, 60s)

https://github.com/dbrll/ATTN-11 (Transformer on PDP-11, 70s)

What was missing wasn't the raw processing power, but the ideas and algorithms themselves. Because funding and research were completely discouraged during the AI winter, neural network research was left dormant and we lost two decades.

New comment by rahen in "The Smallest Brain You Can Build: A Perceptron in Python"

rahen — Mon, 08 Jun 2026 06:31:31 +0000

In the early days of machine learning (before the first AI winter), networks like this were often implemented and trained in hardware: https://en.wikipedia.org/wiki/ADALINE

That was the first thing that came to mind when I read "the smallest brain you can build". Nowadays, that "small brain" would likely be built on a breadboard using op-amps instead.

New comment by rahen in "Powering up a module from the IBM 604: an electronic calculator from 1948"

rahen — Sun, 07 Jun 2026 19:56:25 +0000

The Gamma 3, which competed with the 604, only had about 400 tubes as far as I remember: https://en.wikipedia.org/wiki/Bull_Gamma_3

Both the 604 and the G3 were bit-serial to save components.
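
To illustrate why a bit-serial design saves components, here is a generic sketch of bit-serial addition: a single one-bit full adder is reused on every clock tick, with the carry held between ticks, instead of one adder per bit position. This does not model the actual circuits or number formats of either machine. The names and the 8-bit width are assumptions for the example.

```python
# Generic bit-serial addition sketch (not a model of the 604 or the Gamma 3).

def full_adder(a, b, carry):
    # The only arithmetic "hardware": one 1-bit full adder.
    total = a + b + carry
    return total & 1, total >> 1

def bit_serial_add(x, y, width=8):
    carry = 0
    result = 0
    for i in range(width):  # one tick per bit, least significant bit first
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry  # final carry signals overflow past `width` bits

print(bit_serial_add(5, 3))
print(bit_serial_add(200, 100))
```

The trade-off is time: an addition takes `width` ticks instead of one.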

New comment by rahen in "Go: Support for Generic Methods"

rahen — Wed, 27 May 2026 14:50:00 +0000

Scheme has a coherent and minimalist design, but its ecosystem and abstraction facilities feel too sparse for large applications.

When I started building a Lisp-based machine learning framework, Guile seemed like the right choice because it provides GOOPS and generic functions, yet I still ended up with a lot of boilerplate to compensate for the lack of a strong type system.

Scheme feels to me like C is to C++: not ergonomic for large-scale application development. Go is one of those languages that has both minimalism and productivity.

New comment by rahen in "A sleep-like consolidation mechanism for LLMs"

rahen — Tue, 26 May 2026 17:05:45 +0000

I was thinking of curated replay buffers, which would act like "dreams". To prevent collapse, the offline dataset would mix the new mid-term data with a baseline of anchor data (the original training distribution) so the model doesn't drift.

Also, we wouldn't train on the whole session. A separate critic module, like a reward model, would filter the KV cache to extract the high-value information, like a garbage collector before the LoRA.
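
A rough sketch of how such an offline dataset could be assembled, assuming session items are plain records rather than actual KV-cache entries. The `score` callable stands in for the critic module, and `threshold` and `anchor_ratio` are hypothetical knobs. None of this comes from an existing system.

```python
import random

def build_sleep_dataset(session_items, anchor_items, score,
                        threshold=0.5, anchor_ratio=0.5, seed=0):
    """Filter session data with a critic, then mix in anchor data.

    anchor_ratio is the fraction of the final mix drawn from the
    original training distribution (must be below 1).
    """
    if not 0 <= anchor_ratio < 1:
        raise ValueError("anchor_ratio must be in [0, 1)")
    rng = random.Random(seed)
    # "Garbage collection": keep only what the critic rates as high-value.
    kept = [item for item in session_items if score(item) >= threshold]
    # Number of anchors needed so they make up anchor_ratio of the mix.
    n_anchor = round(len(kept) * anchor_ratio / (1 - anchor_ratio))
    n_anchor = min(n_anchor, len(anchor_items))
    mix = kept + rng.sample(anchor_items, n_anchor)
    rng.shuffle(mix)
    return mix

# Toy usage: the "critic" here just favors longer strings.
session = ["a", "bb", "cccc", "dddd"]
anchors = [f"anchor-{i}" for i in range(10)]
dataset = build_sleep_dataset(session, anchors, score=lambda s: len(s) / 4)
print(dataset)
```

The resulting list is what a LoRA fine-tuning step would consume. That step is out of scope for this sketch.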

That's just an idea though. Right now most research focuses on changing the architecture itself (Titans, HOPE...) instead.

New comment by rahen in "A sleep-like consolidation mechanism for LLMs"

rahen — Tue, 26 May 2026 16:49:31 +0000

That's an idea I had a few months ago: after going through a compaction once the KV cache is nearing capacity, accumulate this knowledge into a dataset to fine-tune a LoRA during offline hours.

This would create a three-layer memory system:

- Stable long-term memory (initial base weights)

- Mid-term memory built from the compactions and replay buffers

- Short-term memory (KV cache)

Sleeping would just be a fancy term for consolidating and transferring information from one memory layer to another during offline hours. Maybe that's also what the brain does while sleeping.
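
The control flow of that three-layer scheme can be sketched as a toy data structure. The class and its fields are hypothetical. String joining stands in for real compaction, and copying into `adapter_data` stands in for fine-tuning a LoRA. The base weights are the layer this code never touches.

```python
from dataclasses import dataclass, field

@dataclass
class TieredMemory:
    capacity: int = 4
    short_term: list = field(default_factory=list)     # stands in for the KV cache
    replay_buffer: list = field(default_factory=list)  # compactions awaiting "sleep"
    adapter_data: list = field(default_factory=list)   # what the LoRA has absorbed

    def observe(self, item):
        self.short_term.append(item)
        if len(self.short_term) >= self.capacity:
            self.compact()

    def compact(self):
        # Placeholder for a real compaction/summarization step.
        self.replay_buffer.append(" | ".join(self.short_term))
        self.short_term.clear()

    def sleep(self):
        # Offline hours: move mid-term material into the adapter.
        # A real system would fine-tune a LoRA on replay_buffer here.
        self.adapter_data.extend(self.replay_buffer)
        self.replay_buffer.clear()

mem = TieredMemory(capacity=4)
for token in ["a", "b", "c", "d", "e"]:
    mem.observe(token)
print(mem.short_term, mem.replay_buffer)
mem.sleep()
print(mem.adapter_data, mem.replay_buffer)
```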

New comment by rahen in "CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs"

rahen — Fri, 22 May 2026 06:26:25 +0000

Strictly speaking, this is very domain-specific and doesn't enable any performance that Triton couldn't already achieve (eliminating global memory round-trips via epilogue fusion is nothing new). The real takeaway is the design shift for LLM-driven codegen rather than handcrafted kernels.

LLMs are still bad at low-level hardware optimizations, but really good at high-level composition. Designing compiler abstractions with a restricted, composable API so an LLM can easily glue expert-written blocks together is a smart move. I suspect this will eventually become the norm for codegens as we move to agentic development.
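
For readers unfamiliar with epilogue fusion, here is a structural sketch in plain Python. It is not Triton or CODA code and says nothing about real performance. The unfused version materializes the full GEMM result and then makes a second pass over it. The fused version applies the epilogue to each accumulator before it is stored, so the intermediate matrix never exists. All names are made up for the example.

```python
def gemm_unfused(A, B, bias):
    n, k, m = len(A), len(B), len(B[0])
    # Pass 1: write the full intermediate matrix C.
    C = [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)]
         for i in range(n)]
    # Pass 2: read C back to apply bias + ReLU (the extra round-trip).
    return [[max(C[i][j] + bias[j], 0.0) for j in range(m)] for i in range(n)]

def gemm_fused(A, B, epilogue):
    n, k, m = len(A), len(B), len(B[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]
            # Epilogue runs on the accumulator before the single store.
            out[i][j] = epilogue(acc, j)
    return out

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [0.0, -1.0]]
bias = [0.5, 0.5]

def bias_relu(acc, col):
    return max(acc + bias[col], 0.0)

print(gemm_unfused(A, B, bias))
print(gemm_fused(A, B, bias_relu))
```

Passing the epilogue as a small composable callable is the kind of restricted API an LLM can glue together without touching the inner loop.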

New comment by rahen in "Growing Neural Cellular Automata"

rahen — Tue, 19 May 2026 20:53:39 +0000

There is ongoing research on neural cellular automata, as they seem to be a very efficient way to generate pretraining tokens: https://arxiv.org/html/2603.10055v1

New comment by rahen in "Hyperpolyglot Lisp: Common Lisp, Racket, Clojure, Emacs Lisp"

rahen — Mon, 18 May 2026 23:43:40 +0000

Emacs Lisp is a descendant of PDP-10 Maclisp, which makes it one of the oldest Lisp dialects still actively maintained. Whether it's version 24.5 or 30.2 doesn't make much of a difference semantically.

New comment by rahen in "Windows 9x Subsystem for Linux"

rahen — Sat, 16 May 2026 17:41:11 +0000

Previous: https://news.ycombinator.com/item?id=47861270

New comment by rahen in "AI slop is killing online communities"

rahen — Thu, 07 May 2026 21:27:19 +0000

Obvious slop still makes it to the front page of HN, and sometimes farms GitHub stars.

These posts also usually get all these glowing comments from users who clearly haven't checked the code. It's even worse when authors get busted and claim "Okay, Claude wrote it, but the design is mine" despite clearly not understanding the output themselves.

Unfortunately, that makes high-effort projects less visible. The SNR will probably keep getting worse until slop can be flagged on HN.

New comment by rahen in "Why TUIs are back"

rahen — Mon, 04 May 2026 06:54:13 +0000

Ink is the Electron of text-based apps. I tried OpenCode out of curiosity, and it routinely used hundreds of megabytes of memory.

I'll stick with Emacs as my TUI platform of choice, especially for tool-assisted development.

New comment by rahen in "Why TUIs are back"

rahen — Mon, 04 May 2026 06:50:15 +0000

> try making a DAW in it...

It would in fact work pretty well for a tracker.

New comment by rahen in "Why I still reach for Lisp and Scheme instead of Haskell"

rahen — Wed, 29 Apr 2026 20:48:43 +0000

Jank looks promising if you want a typed Lisp. It’s essentially native Clojure without the JVM: https://jank-lang.org/

In case you're into machine learning, I'm also building something similar - a tensor-first, native Clojure-like ML framework.

New comment by rahen in "Greece to ban anonymity on social media"

rahen — Tue, 28 Apr 2026 17:14:02 +0000

I wonder how they plan to implement that on decentralized social networks such as Nostr. I assume targeting big centralized networks such as X and Facebook is good enough.

New comment by rahen in "Show HN: How LLMs Work – Interactive visual guide based on Karpathy's lecture"

rahen — Fri, 24 Apr 2026 12:55:41 +0000

Some subreddits have already taken measures against this: https://www.reddit.com/r/ProgrammingLanguages/comments/1sd66...

Same everywhere: avalanches of AI garbage and intellectual dishonesty. People claiming "I wrote this", then a look at the code shows massive slop and an author with no clue about the topic.

More worrying, this trend is creeping into all domains: "Nearly 75,000 tracks uploaded to Deezer are fully created using AI. That’s 44% of daily uploads, and more than 2 million per month. Back in June, the daily number was around 20,000."

https://www.vice.com/en/article/how-deezer-is-fighting-fraud...

New comment by rahen in "Windows 9x Subsystem for Linux"

rahen — Wed, 22 Apr 2026 10:17:55 +0000

Before WSL, the best ways to run unmodified Linux binaries inside Windows were CoLinux and flinux.

http://www.colinux.org/

https://github.com/wishstudio/flinux

flinux essentially had the architecture of WSL1, while CoLinux was more like WSL2 with a Linux kernel side-loaded.

Cygwin was technically the correct approach: native POSIX binaries on Windows rather than hacking in some foreign Linux plumbing. Since it was merely a lightweight DLL to link to (or a bunch of them), it also kept the cruft low without messing with ring 0.

However, it lacked the convenience of a CLI package manager back then, and I remember being hooked on CoLinux when I had to work on Windows.

New comment by rahen in "Soul Player C64 – A real transformer running on a 1 MHz Commodore 64"

rahen — Tue, 21 Apr 2026 09:19:55 +0000

Yes. The author mentions Claude for testing, but it was obviously used for the README and code as well.

This is a giveaway for AI generation, from the docstring to the terrible opcode dispatch (Claude sucks at assembly or low-level optimization): https://github.com/gizmo64k/soulplayer-c64/blob/main/src/cpu...

A human would use a proper dispatch table and wouldn't make excuses for a sloppy implementation ("Python is fast enough").
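
As a point of reference, here is a minimal sketch of table-driven opcode dispatch. It is not derived from the linked file. It covers only four 6502 opcodes, ignores status flags and cycle counts, and treats BRK as a plain halt, all simplifications made for the example.

```python
class CPU:
    def __init__(self, program):
        self.mem = list(program)
        self.pc = 0
        self.a = 0
        self.x = 0

    def fetch(self):
        byte = self.mem[self.pc]
        self.pc += 1
        return byte

# One small handler per opcode (flags deliberately not modeled).
def op_lda_imm(cpu):
    cpu.a = cpu.fetch()

def op_tax(cpu):
    cpu.x = cpu.a

def op_inx(cpu):
    cpu.x = (cpu.x + 1) & 0xFF

def op_nop(cpu):
    pass

# The dispatch table: opcode byte -> handler, one lookup per instruction
# instead of walking a long if/elif chain.
DISPATCH = {
    0xA9: op_lda_imm,  # LDA #imm
    0xAA: op_tax,      # TAX
    0xE8: op_inx,      # INX
    0xEA: op_nop,      # NOP
}

def run(cpu):
    while cpu.pc < len(cpu.mem):
        opcode = cpu.fetch()
        if opcode == 0x00:  # BRK, used here only to stop the loop
            break
        DISPATCH[opcode](cpu)
    return cpu

cpu = run(CPU([0xA9, 0xFF, 0xAA, 0xE8, 0xEA, 0x00]))
print(cpu.a, cpu.x)
```

A full emulator would typically use a 256-entry list indexed by the opcode byte, but the structure is the same.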

Besides, the author has an art and design background, which doesn't seem to match the deep knowledge of Transformers or assembly required for such a project.

New comment by rahen in "Soul Player C64 – A real transformer running on a 1 MHz Commodore 64"

rahen — Tue, 21 Apr 2026 07:44:30 +0000

A little disappointed to see PyTorch + Claude here. I was hoping for some "demo-scene" hand-crafted 6502 assembly, and hopefully training on the C64.

New comment by rahen in "Want to Write a Compiler? Just Read These Two Papers (2008)"

rahen — Wed, 15 Apr 2026 18:32:36 +0000

I'm also writing a compiler and CS6120 from Cornell has helped me a lot: https://www.cs.cornell.edu/courses/cs6120/2025fa/self-guided...