Hacker News: x1000

New comment by x1000 in "Golden Ratio using an equilateral triangle inscribed in a circle"

x1000 — Wed, 28 Jan 2026 08:43:41 +0000

It’s the negative of the inverse of the golden ratio. (Also 1 minus the golden ratio.) So, good for anything the golden ratio itself is good for.
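
Both identities follow from the defining equation of the golden ratio. A short derivation, assuming the number being discussed is the conjugate root (1 - √5)/2:

```latex
\varphi = \frac{1+\sqrt{5}}{2}, \qquad \varphi^2 = \varphi + 1
\;\Longrightarrow\; \varphi - 1 = \frac{1}{\varphi}
\;\Longrightarrow\; 1 - \varphi = -\frac{1}{\varphi} = \frac{1-\sqrt{5}}{2} \approx -0.618
```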

New comment by x1000 in "Universe expected to decay in 10⁷⁸ years, much sooner than previously thought"

x1000 — Mon, 12 May 2025 15:49:19 +0000

Not a physicist, but I see it this way too. My understanding of Boltzmann brains is that they are a theoretical consequence of infinite time and space in a universe with random quantum fluctuations. And that those random fluctuations would still be present in an otherwise empty universe. So then this article has no bearing on the Boltzmann brain thought experiment or its ramifications.

New comment by x1000 in "Parameter-Free KV Cache Compression for Memory-Efficient Long-Context LLMs"

x1000 — Thu, 27 Mar 2025 20:19:50 +0000

If they had experimented using a newer model (gemma 3, deepseek-1 7b, etc.) and reported better results, would that be because their newer baseline model was better than the llama 2 model used in the previous methods' experiments? A more comprehensive study would include results for as many baseline models as possible. But there are likely other researchers in the lab all waiting to use those expensive GPUs for their experiments as well.

New comment by x1000 in "MIT 6.S184: Introduction to Flow Matching and Diffusion Models"

x1000 — Tue, 04 Mar 2025 01:02:17 +0000

These are the fundamentals that underlie Stable Diffusion, DALL-E, and various other SOTA image, video, and audio generation models. They’ve also started taking off in the field of robotics control [1]. These models are trained to incrementally nudge samples of pure noise onto the distributions of their training data. Because they’re trained on noised versions of the training set, the models are able to better explore, navigate, and make use of the regions near the true data distribution in the denoising process. One of the biggest issues with GANs is a thing called “mode collapse” [2].

[1] https://www.physicalintelligence.company/blog/pi0

[2] https://en.wikipedia.org/wiki/Mode_collapse
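
To make the "incrementally nudge noise onto the data" idea concrete, here is a toy 1-D sketch. It is not how a real model is built: a real model learns the velocity field with a neural network, whereas this sketch assumes a single known data point and uses the closed-form velocity of the straight-line path x_t = (1 - t) * x0 + t * target. The names `conditional_velocity` and `sample` are made up for illustration.

```python
import random


def conditional_velocity(x, t, target):
    """Velocity of the straight-line path x_t = (1 - t) * x0 + t * target.

    Along that path dx/dt = target - x0, which equals (target - x_t) / (1 - t).
    """
    return (target - x) / (1.0 - t)


def sample(target, steps=50, seed=0):
    """Start from Gaussian noise and take small Euler steps toward the data."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)      # pure noise at t = 0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt               # t = 0, dt, ..., 1 - dt, so 1 - t never reaches 0
        x += dt * conditional_velocity(x, t, target)
    return x


print(sample(target=3.0))        # lands (numerically) on the data point
```

With many data points, the learned field is an average of such conditional velocities, which is the part a network has to approximate.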

New comment by x1000 in "Zod: TypeScript-first schema validation with static type inference"

x1000 — Wed, 09 Oct 2024 17:51:03 +0000

I ran into exactly the same pain point, which was enough to nullify the benefits of using zod at all.

New comment by x1000 in "Differential Transformer"

x1000 — Tue, 08 Oct 2024 17:29:42 +0000

Could you help explain how we would achieve an attention score of exactly 0, in practice? Here’s my take:

If we’re subtracting one attention matrix from another, we’d end up with attention scores between -1 and 1, with a probability of effectively 0 for any single entry to exactly equal 0.

What’s more, the learnable parameter \lambda allows for negative values. This would allow the model to learn to actually add the attention scores, making a score of exactly 0 impossible.
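
A small numeric sketch of the argument above, assuming the differential form softmax(A1) - λ · softmax(A2) applied to a single row of scores. The input scores and the λ values are arbitrary illustrations, and `differential_attention_row` is a hypothetical helper name.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def differential_attention_row(scores_1, scores_2, lam):
    """One row of softmax(A1) - lam * softmax(A2)."""
    a1 = softmax(scores_1)
    a2 = softmax(scores_2)
    return [p - lam * q for p, q in zip(a1, a2)]


s1, s2 = [2.0, 0.5, -1.0], [1.0, 1.5, -0.5]

# Positive lambda: entries can be negative or positive, but hitting exactly 0
# would need p == lam * q to the last bit.
row_sub = differential_attention_row(s1, s2, lam=0.8)

# Negative lambda: the two maps are added, so every entry is strictly positive.
row_add = differential_attention_row(s1, s2, lam=-0.8)

print(row_sub)
print(row_add)
```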

New comment by x1000 in "Introdution to Computer architecture explained with Minecraft [video]"

x1000 — Tue, 06 Aug 2024 13:38:52 +0000

My first exposure to computer architecture was through a Minecraft video[1] (which I likely stumbled upon on Digg). In my Linear Algebra lecture the next day, I overheard my classmates discussing the video. I purchased the game later that week.

Seeing the circuitry of a computer in this way helped me to understand that computers operated by means other than pure magic. And, the video I saw was much less descriptive of how a computer works than the one the OP linked. So, although neither video amounts to a full college course on the topic, there’s still a lot of value in their ability to expose people to the topic. It’s inspiring to see how computers are mostly a composition of NAND gates, and to compare the massive structures in the videos with the microprocessors of the real world.

[1] https://youtu.be/LGkkyKZVzug?si=hZRdmablPt15gGqn
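
As a rough illustration of the "composition of NAND gates" point, here is a sketch that builds a one-bit full adder out of nothing but a `nand` function. The gate constructions are the standard textbook ones, not anything taken from either video.

```python
def nand(a, b):
    """The only primitive gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def half_adder(a, b):
    """Return (sum bit, carry bit) for two input bits."""
    return xor(a, b), and_(a, b)


def full_adder(a, b, carry_in):
    """Return (sum bit, carry out) for two input bits plus a carry in."""
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, carry_in)
    return s2, or_(c1, c2)


print(full_adder(1, 1, 1))
```

Chaining full adders bit by bit gives a multi-bit adder, which is the kind of structure the videos lay out block by block.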

New comment by x1000 in ""Attention", "Transformers", in Neural Network "Large Language Models""

x1000 — Mon, 25 Dec 2023 01:17:55 +0000

There’s a video[1] of Karpathy recounting an email correspondence he had with Bahdanau. The email explains that the word “Attention” comes from Bengio who, in one of his final reviews of the paper, determined it to be preferable to Bahdanau’s original idea of calling it “RNNSearch”.

[1] https://youtu.be/XfpMkf4rD6E?t=18m23s

New comment by x1000 in "Can GPT-4 and GPT-3.5 play Wordle?"

x1000 — Tue, 21 Mar 2023 02:41:26 +0000

Imagine you are an LLM and all you see are tokens. Your job is not only to predict the next token in a sequence, but also to create a nice embedding for the token (where two similar words sit next to each other). Given a small enough latent space, you're probably not concerning yourself too much with the "structure inside" the tokens. But given a large enough latent space, and a large enough training corpus, you will encounter certain tokens frequently enough that you will begin to see a pattern. At some point during training, you are fed:

1) An English dictionary as input.

2) A "List of words that start with 'app'" wiki page as input.

3) Other alphabetically sorted pieces of text.

4) Elementary school spelling homework.

5) Papers on glyphs, diphthongs, and other phonetic concepts.

You begin to recognize that the tokens in these lists appear near each other in this strange context. You have hardly ever seen token 11346 ("apple") and token 99015 ("appli") this close to each other before. But you see it frequently enough that you decide to nudge these two tokens' embeddings closer to one another.

Your ability to predict the next token in a sequence has improved. You have no idea why these two tokens are close every ten millionth training example. Your word embeddings start to encode spelling information. Your word embeddings start to encode handwriting information. Your word embeddings start to encode phonic information. You've never seen or heard the actual word, "apple". But, after enough training, your embeddings contain enough information so that if you're asked, ["How do", "you", "spell", "apple"], you are confident as you proclaim ["a", "p", "p", "l", "e", "."] as the obvious answer.
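
A toy picture of what "nudging two embeddings closer" means geometrically. This is only an illustration: in a real model the movement comes from gradient descent on the next-token loss, not from an explicit pull like this, and the 3-dimensional vectors and the `nudge_together` helper are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nudge_together(u, v, rate=0.1):
    """Move each vector a small fraction of the way toward the other."""
    new_u = [a + rate * (b - a) for a, b in zip(u, v)]
    new_v = [b + rate * (a - b) for a, b in zip(u, v)]
    return new_u, new_v


# Hypothetical embeddings for the two tokens from the comment.
apple = [1.0, 0.0, 0.2]
appli = [0.0, 1.0, 0.1]

before = cosine(apple, appli)
for _ in range(5):               # five rare "dictionary-like" training examples
    apple, appli = nudge_together(apple, appli)
after = cosine(apple, appli)

print(before, after)             # similarity rises after the nudges
```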

New comment by x1000 in "BTC Endgame"

x1000 — Sat, 20 Feb 2021 06:36:57 +0000

An interesting aspect of these cryptocurrencies is consensus reached not through the intended mechanisms like PoW, but through societal acceptance. Look at BTC and BTG (bitcoin and bitcoin gold). One has the suffix of "gold" while the other maintains the (arguably?) superior lack of any such embellishments/augmentations. Was it the miners' decision to call it that? Was it the users'?

Look at Ethereum v Ethereum classic. Same deal. We have two chains that share a common history, yet at some point the users of both decided to split and then society had to come to a consensus on what each chain would be called. Again, did the miners sit around and conspire over which chain would be called "Ethereum"? I don't think so. I think the decision was decentralized and emergent.

My point is, even if there was a nefarious actor who attempted a 51% attack, it seems like there would be enough of a societal pressure to ignore their empty blocks. There would exist a chain that would still be valued by the perpetrators, but not so much by the individuals being harmed by such an attack. The attacked chain would be maintained and acquire a new name "Bitcoin Hacked" or something similar, and the chain where society ignores the empty blocks would go on its merry way still being called "bitcoin."

New comment by x1000 in "Intelligent disobedience"

x1000 — Mon, 25 May 2020 19:53:51 +0000

Reminds me of Asimov's Second Law of Robotics[1].

[1] https://en.wikipedia.org/wiki/Three_Laws_of_Robotics

New comment by x1000 in "Generating Ethereum Address from Scratch"

x1000 — Sat, 22 Jun 2019 17:25:15 +0000

I've been studying the bitcoin book lately [1]. Slowly but surely I'm getting a better understanding of all the pieces involved. I was inspired by an article I found here on HN, [2]. I saw that and decided I wanted to implement as much of the bitcoin protocol as possible without using any libraries.

I still haven't made too much progress. It was easy to generate private keys. Then I had to implement ECC. That took me quite a while to nail. But now I've got public keys on the curve mentioned in this article. Now I'm still in the process of mapping this into the public key hash. I got the sha256 hash working pretty quickly using just the pseudocode on Wikipedia. But public key hashes are ripemd160(sha256(publicKey)), and I'm still having trouble with the ripemd160 hash.

After that, I'll base58 encode the hash. Then there's a couple of BIPs I wanna implement, the HD wallets and the 13 word mnemonics ones in particular.
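
For reference, the base58 step can be sketched in a few lines. This is a minimal sketch, not the code described above: it leans on `hashlib` for the double-SHA-256 checksum rather than a hand-written hash, and the function names are my own. The alphabet is the Bitcoin one (no 0, O, I, or l).

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58, keeping each leading zero byte as a '1'."""
    n = int.from_bytes(data, "big")
    digits = []
    while n > 0:
        n, rem = divmod(n, 58)
        digits.append(ALPHABET[rem])
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + "".join(reversed(digits))


def base58check_encode(version: bytes, payload: bytes) -> str:
    """Prepend a version byte and append the first 4 bytes of sha256(sha256(body))."""
    body = version + payload
    checksum = hashlib.sha256(hashlib.sha256(body).digest()).digest()[:4]
    return base58_encode(body + checksum)


# A 20-byte placeholder standing in for ripemd160(sha256(publicKey)).
placeholder_hash = bytes(20)
print(base58check_encode(b"\x00", placeholder_hash))
```

The leading-zero handling matters because the integer conversion would otherwise drop those bytes, and the 0x00 version byte is why this form of address starts with "1".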

Basically, I think this stuff's important for me to know. All these math abstractions are chained together in a certain way that gives us cryptocurrency protocols, internet monies. In the process, I get a strong foundation in cryptography as well as domain knowledge of cryptocurrencies.

[1] https://github.com/bitcoinbook/bitcoinbook

[2] http://www.righto.com/2014/09/mining-bitcoin-with-pencil-and...

New comment by x1000 in "From Design Patterns to Category Theory (2017)"

x1000 — Tue, 11 Jun 2019 00:26:45 +0000

Been following this guy for a while now too. And you're absolutely right. It's been fun watching his evolution. Glad to see this series of articles posted here. I've been sharing this series with my coworkers and we've since gotten into Haskell, Elm, and just taking more functional approaches in general in our primary stack.

New comment by x1000 in "Game Programming Patterns: Double Buffer (2014)"

x1000 — Mon, 27 Aug 2018 16:09:56 +0000

Is there a reason you specified partial derivative with respect to time? Why not just d/dt(state)?

I've been tempted lately to model application state as the integral over time of all events (deltas) that have occurred. For example, imagine a simple game state:

  type GameState = {
    x: float; // x coordinate
    t: float; // current time
    v: float; // velocity variable
  }

If initial game state =

  { x: 0,
    t: 0,
    v: 0, };

and you have some events (deltas):

  [{t: 1, v:1}, {t:2, v:-1 /*this is deltaV, so back to v: 0*/}, {t: 3, v:2}, {t:5, v:-2}]

You could "sum" them up (integrate over time) and get a game state of x = 5, v = 0 where t >= 5.
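
A minimal sketch of that "sum", assuming events are (t, delta_v) pairs sorted by time and that velocity is constant between events. The `integrate` helper is hypothetical and is written in Python rather than the pseudocode above.

```python
def integrate(events, until):
    """Fold velocity deltas into a state at time `until`.

    events: list of (t, delta_v) pairs sorted by t.
    Starts from x = 0, v = 0 at t = 0, matching the initial state above.
    """
    x, v, t = 0.0, 0.0, 0.0
    for event_t, delta_v in events:
        if event_t > until:
            break
        x += v * (event_t - t)   # coast at the current velocity up to the event
        v += delta_v             # then apply the delta
        t = event_t
    x += v * (until - t)         # coast from the last applied event to `until`
    return {"x": x, "t": until, "v": v}


events = [(1, 1), (2, -1), (3, 2), (5, -2)]
print(integrate(events, until=5))   # x is 5.0 and v is 0.0
```

Because the state is a pure function of the event list and a timestamp, any t can be queried without stepping through ticks, as long as no event depends on the state itself.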

I've found this approach kinda sucks when you add things like collision detection. Your application would have to emit velocity deltas whenever objects collide. If you've got a point that bounces around in a box with perfectly elastic collisions, then over time you'd have an infinite number of these collision/velocity delta events. But as you receive new events, all your precomputed collision events are for nothing (if your player logs out). So the traditional update loop, which simulates each tick as it occurs, seems to work best.

Is there a more general way of thinking about this integration? Maybe with respect to another variable? Perhaps it would address this problem I'm having where I feel forced to quantize my game state onto ticks.

New comment by x1000 in "Game Programming Patterns: Double Buffer (2014)"

x1000 — Mon, 27 Aug 2018 08:02:32 +0000

I thought this would only apply to rendering graphics, but after reading the "When To Use It" section, I realized I've done double buffering on entire game states before (in a project from ~2 years ago). At the beginning of my update loop, I'd (deep) copy the current game state into a new object and begin incrementally updating the copy. Then I'd reassign right before Thread.sleeping (or whatever language idiom) until the next game loop "tick".

Wasn't too fond of my C# deep copy solution: var serialized = JsonSerializer.Serialize(this); return JsonSerializer.Deserialize(serialized);
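
The same copy, update, then swap loop can be sketched like this. It is a Python stand-in, not the original C# project; the `Game` class and its state layout are invented for illustration.

```python
import copy


class Game:
    def __init__(self):
        # The "current" buffer: readers only ever see a fully updated state.
        self.state = {"tick": 0, "players": [{"x": 0.0, "v": 1.0}]}

    def update(self, dt):
        # Deep copy the current state into the "next" buffer...
        next_state = copy.deepcopy(self.state)
        # ...mutate only the copy, so half-updated values never leak out...
        for player in next_state["players"]:
            player["x"] += player["v"] * dt
        next_state["tick"] += 1
        # ...then swap by reassigning the reference.
        self.state = next_state


game = Game()
old = game.state
game.update(0.5)
print(old["players"][0]["x"], game.state["players"][0]["x"])   # 0.0 0.5
```

Copying the whole state every tick is the costly part, which is one reason immutable data structures with structural sharing are attractive here.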

I took an interest in functional programming, pure functions, immutability, etc. soon after.

New comment by x1000 in "The International 2018: Results"

x1000 — Sat, 25 Aug 2018 02:44:42 +0000

Idk anything about any of it. But, what if OpenAI is "bad" at ward placement because it plays itself so much, it already knows exactly where it would be if it were the other team?

New comment by x1000 in "Pattern Matching with TypeScript"

x1000 — Wed, 05 Jul 2017 18:05:13 +0000

This article seems relevant, https://www.typescriptlang.org/docs/handbook/advanced-types.... . Particularly the part on discriminated unions. I'm just now learning TypeScript, so I have yet to actually implement anything this way.