Hacker News: soraki_soladead

New comment by soraki_soladead in "Artificial intelligence is not conscious"

soraki_soladead — Thu, 04 Jun 2026 00:29:58 +0000

I think you're misremembering or misunderstanding Picard's argument. It isn't a tangent. Here's the transcript[0].

TL;DR Picard's initial arguments are pretty weak, even admitting that Riker as opposing counsel almost had him convinced. During a recess Picard talks to Guinan where she alludes to the future subjugation of many Datas which Picard connects to slavery. Back in the courtroom Picard calls Maddox as a hostile witness and gets him to define sentience--intelligence, self-awareness, consciousness--then walks him into conceding Data meets the first two. Picard's closing boils down to, "we don't know if he meets the third--you can call Data a toaster and rule he is property--_but what if you're wrong_". The judge rules on the basis of erroring on the side of caution due to that uncertainty. It's really a great scene.

We're not there yet, obviously. No LLM brings Data's level of awareness but it's as relevant a story as ever because it isn't really about AI but othering for the purpose of subjugation.

[0] http://www.chakoteya.net/NextGen/135.htm

New comment by soraki_soladead in "Colored Shadow Penumbra"

soraki_soladead — Thu, 07 May 2026 21:10:39 +0000

The post links another that goes into the theory a little: https://shahriyarshahrabi.medium.com/in-the-valley-of-gods-s...

Apparently a combination of Mie and Rayleigh scattering.

- https://en.wikipedia.org/wiki/Mie_scattering

- https://en.wikipedia.org/wiki/Rayleigh_scattering

New comment by soraki_soladead in "NASA Force"

soraki_soladead — Fri, 17 Apr 2026 17:21:33 +0000

I'm not saying you're wrong but then why do a big website and branding push. If they had someone in mind they'd bury it on a regular job posting.

They specify early to mid career. Imo they're anticipating a ton of applications and bounding it makes reviewing them tractable.

New comment by soraki_soladead in "Claude Design"

soraki_soladead — Fri, 17 Apr 2026 16:19:03 +0000

> There is also no pride.

Is the pride not in solving the users' problems?

> nobody talks about it, treats it with interest, or pays above market rate to work on it.

Definitely needs a citation for this one. For so many products the user isn't paying for standout design. They're paying for insight, leverage, velocity, convenience, whatever. The market definitely supports this by paying above market salaries.

Good design can be a useful differentiator but it isn't the only way for a tool or product to "spark joy" and often _fancy_ design (not good design) is used as a crutch for a subpar product.

New comment by soraki_soladead in "Make tmux pretty and usable (2024)"

soraki_soladead — Mon, 13 Apr 2026 15:51:55 +0000

Context: https://xkcd.com/1053/

Then, if you're like me and read this years ago, play around with the Light Mode dropdown which was new to me. :)

New comment by soraki_soladead in "Eternity in six hours: Intergalactic spreading of intelligent life (2013)"

soraki_soladead — Sun, 12 Apr 2026 16:32:01 +0000

We don't know however "It would take so long" is an anthropomorphic assumption of time scale.

New comment by soraki_soladead in "Mystery jump in oil trading ahead of Trump post draws scrutiny"

soraki_soladead — Tue, 24 Mar 2026 15:43:15 +0000

Why would we settle for anything less than discontinuing both?

New comment by soraki_soladead in "Exploring JEPA for real-time speech translation"

soraki_soladead — Sat, 14 Mar 2026 00:15:42 +0000

Roughly, when you train a model to make its predictions align to its own predictions in some way, you create a scenario where the simplest "correct" solution is to output a single value under diverse inputs, aka representation collapse. This guarantees that your predicted representations agree, which is technically what you want it to do but it's degenerate.

EMA helps because it changes more slowly than the learning network which prevents rapid collapse by forcing the predictions to align to what a historical average would predict. This is a harder and more informative task because the model can't trivially output one value and have it match the EMA target so the model learns more useful representations.

EMA has a long history in deep learning (many GANs use it, TD-learning like DQN, many JEPA papers, etc.) so authors often omit defense of it due to over-familiarity or sometimes cargo culting. :)

New comment by soraki_soladead in "NanoGPT Slowrun: Language Modeling with Limited Data, Infinite Compute"

soraki_soladead — Wed, 04 Mar 2026 19:04:20 +0000

also, BabyLM is more of a conference track / workshop than an open-repo competition which creates a different vibe

New comment by soraki_soladead in "Towards Nyquist Learners"

soraki_soladead — Tue, 19 Nov 2024 02:56:53 +0000

Alias-Free GANs? https://nvlabs.github.io/stylegan3/

New comment by soraki_soladead in "Diffusion for World Modeling"

soraki_soladead — Sun, 13 Oct 2024 15:27:57 +0000

We have lossless memory for models today. That's the training data. You could consider this the offline version of a replay buffer which is also typically lossless.

The online, continuous and lossy version of this problem is more like how our memory works and still largely unsolved.

New comment by soraki_soladead in "A most profound video game: a good cognitive aid for research"

soraki_soladead — Sun, 16 Jun 2024 14:12:50 +0000

You might enjoy this paper[0] which shows that recurrent position encodings recover grid cell representations and maps to path integration found in a popular model of the hippocampus. This isn't terribly surprising since RNNs have shown this before[1, 2] but its an interesting connection.

[0] https://arxiv.org/abs/2112.04035

[1] https://arxiv.org/abs/1803.07770

[2] https://www.nature.com/articles/s41586-018-0102-6

New comment by soraki_soladead in "Reproducing GPT-2 in llm.c"

soraki_soladead — Wed, 29 May 2024 04:15:39 +0000

Fwiw, that's SwiGLU in #3 above. Swi = Swish = silu. GLU is gated linear unit; the gate construction you describe.

New comment by soraki_soladead in "Building a deep learning rig"

soraki_soladead — Sat, 24 Feb 2024 18:11:50 +0000

It's not always about cost. Sometimes the ergonomics of a local machine are nicer.

New comment by soraki_soladead in "Cyclist hit by driverless Waymo car in San Francisco, police say"

soraki_soladead — Thu, 08 Feb 2024 04:35:45 +0000

Agree that's not a great look for the supervisor.

Cyclists have a bad rep in SF because many (not all) ride quite dangerously. It's a common sight to see cyclists running four-way stop signs and lights without even yielding. I live adjacent to a four-way stop and there's an incident where a cyclist fails to yield nearly hourly.

Meanwhile, Waymo has millions of incident-free miles and of all the self-driving car companies generally takes safety seriously, even if they will act to protect their interests here.

Until more evidence comes out I'll be taking Waymo's side here. I want safer vehicles and Waymo is currently the best bet.

New comment by soraki_soladead in "Cyclist hit by driverless Waymo car in San Francisco, police say"

soraki_soladead — Thu, 08 Feb 2024 03:19:08 +0000

Cyclists do this all the time in SF. Afaik an "Idaho stop" is not legal here, despite it being common and often unsafe for obvious reasons.

New comment by soraki_soladead in "Markov Chains are the Original Language Models"

soraki_soladead — Fri, 02 Feb 2024 02:32:46 +0000

FLOPs by perplexity by samples is an interesting way to compare this family of models.

New comment by soraki_soladead in "Learn Datalog Today"

soraki_soladead — Sun, 21 Jan 2024 19:08:38 +0000

https://github.com/cozodb/pycozo/blob/main/pycozo/test_build...

Here's the python version of what I think you're looking for. Shouldn't be too difficult to port to rust.

New comment by soraki_soladead in "Build full “product skills” and you'll probably be fine"

soraki_soladead — Sun, 19 Mar 2023 20:27:45 +0000

Sure. A few below but far from exhaustive:

- https://arxiv.org/abs/1909.07528 - https://arxiv.org/abs/2212.10403 - https://arxiv.org/abs/2201.11903 - https://arxiv.org/abs/2210.13382

There are also literally hundreds of articles and tweet threads about it. Moreover, as I said, you can test many of my claims above directly using readily available LLMs.

GP has a much harder defense. They have to prove that despite all of these capabilities that LLMs are not intelligent. That the mechanisms by which humans possess intelligence is fundamentally distinct from a computer’s ability to exhibit the same behaviors so much that it invalidates any claim that LLMs exhibit intelligence.

Intelligence: “the ability to acquire and apply knowledge and skills”. It is difficult to argue that modern LLMs cannot do this. At best we can quibble about the meaning of individual words like “acquire”, “apply”, “knowledge”, and “skills”. That’s a significant goal post shift from even a year ago.

New comment by soraki_soladead in "Build full “product skills” and you'll probably be fine"

soraki_soladead — Sun, 19 Mar 2023 18:39:03 +0000

> They are not intelligent.

Citation needed. Numerous actual citations have demonstrated hallmarks of intelligence for years. Tool use. Comprehension and generalization of grammars. World modeling with spatial reasoning through language. Many of these are readily testable in GPT. Many people have… and I dare say that LLMs reading comprehension, problem solving and reasoning skills do surpass that of many actual humans.

> They model intelligent behavior

It is not at all clear that modeling intelligent behavior is any different from intelligence. This is an open question. If you have an insight there I would love to read it.

> They don't know or care what language is: they learn whatever patterns are present in text, language or not.

This is identical to how children learn language prior to schooling. They listen and form connections based on the cooccurrence of words. They’re brains are working overtime to predict what sounds follow next. Before anyone says “not from text!” please don’t forget people who can’t see or hear. Before anyone says, “not only from language!” multimodal LLMs are here now too!

I’m not saying they’re perfect or even possess the same type of intelligence. Obviously the mechanisms are different. However far too many people in this debate are either unaware of their capabilities or hold on too strongly to human exceptionalism.

> There is this religious cult surrounding LLMs that bases all of its expectations of what an LLM can become on a personification of the LLM.

Anthropomorphizing LLMs is indeed an issue but is separate from a debate on their intelligence. I would argue there’s a very different religious cult very vocally proclaiming “that’s not really intelligence!” as these models sprint past goal posts.