Hacker News: hackinthebochs

New comment by hackinthebochs in "US Government directive to suspend access to Fable 5 and Mythos 5"

hackinthebochs — Sat, 13 Jun 2026 03:08:24 +0000

That's not how nerds think. You can believe there's a high chance of what you're working on being dangerous and still be unable to stop working on it. As Oppenheimer put it, "when you see something that is technically sweet, you go ahead and do it".

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 16:32:38 +0000

>But why does engaging with negative valence, planning, and weighing actions against other interests require subjective experience?

I have a few different answers here. None are rock solid. Let's take it as a given that planning requires a unified representation of all inputs to the planning apparatus. Now, going with the example from earlier: an organism touches a hot stove and recoils. We can imagine this behavior without any accompanying qualia. But to plan subsequent behavior around the hot stove, the damaging hotness must be represented in the unified representation in a way that intrinsically carries the semantics of negative valence. Phenomenal pain just is "semantics of negative valence featured in a unified representation". My claim is that this is a conceptual identity; you can't have one without the other. This gives the planning apparatus competence at engaging with signals of bodily damage.

Without intrinsic semantics/phenomenality all you have is a signal with no intrinsic meaning and some context to select behavior downstream of the signal. But planning in dynamic environments requires much more flexible signaling than this kind of static context can provide.

>AI systems weigh negative valence and execute long-term plans without any qualia.

AI systems are highly fragmented representations. It's why you can get them to contradict themselves in the same session, or even one sentence after another. They are not an exemplar of coherent behavior. There's also no negative valence in LLMs. At most they have a representation of good/bad and this spectrum influences the valence/quality/alignment in their behavior. But valence as such is external to the LLM.

>consider that there are many examples in which humans are able to perform very complex tasks in the absence of qualia. Consider, for instance, the phenomena of highway hypnosis, blindsight or sleepwalking - humans can do incredibly complicated things without qualia.

Complexity is relative. The complexity of tasks sans qualia is always starkly deficient compared to that of comparable tasks with qualia. A wide look at cognitive science demonstrates the inherent value of qualia to highly complex tasks or tasks executed over long timescales.

>This argument is circular. The original claim is that behaving coherently in a complex environment requires consciousness. By shifting the goalposts...

The goalposts aren't shifted, I'm clarifying the target of the term behavior as there was clearly a disagreement in meaning.

>to say that only voluntary behaviors qualify, you are begging the question. The entire notion of "voluntary" implies conscious intent, so your argument has become "consciously willed behaviors require consciousness".

This misunderstands the debate. The philosophical issue of consciousness is how to explain consciousness given the in-principle completeness of physical descriptions and their categorical distinction from phenomenal descriptions. In this context, voluntary behavior is just higher-order/complex behavior; it is not taken as downstream of consciousness in principle. There is a parallel conversation in psychology/cognitive science where consciousness is largely understood as wakefulness, attention, reportability, intentionality, etc. In this context "consciousness" (in this restricted sense) is a prerequisite of voluntary behavior. But that's neither here nor there with regard to the philosophical debate.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 14:27:01 +0000

>But this isn't true! It has been repeatedly shown that patients without inner brain function react to stimuli (such as being pinched or pricked with a needle) by recoiling from the pain, as do babies with no experience of pain.

Yes, reflexive avoidance behavior doesn't require conscious experience. But as the environment of the organism gets more complex, reflexive avoidance behavior isn't sufficient for competence. For an agent in a complex environment, competent damage avoidance requires engaging with negative valence as a cognitive entity to be planned around and weighed against other interests. This requires unification and consciousness.

>Another counterargument is that our brains carry out lots of "coherent" functions "in the dark". Consider, for example, thermoregulation

This isn't an example of coherent behavior in the sense being used here. The issue is one of voluntary behavior being coherently executed so as to achieve some goal without undermining itself.

>do you believe that a thermostat is conscious?

No. No self model, no consciousness.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 06:34:53 +0000

Presumably the question you're asking is why a unified self-representation requires consciousness. (Split brain cases are easy examples of how a break in unification results in incoherent behavior.) The brain nominally performs functions as cascading behavior of atoms whose structural relationships correspond to various functions. But there is no unification at the unconscious/atomistic level, so a new representational regime is required that can ground the higher-level unification.

A successful organism exhibits a high level of competence at reacting appropriately to environmental/sensory states. The "lights being on" is how the brain represents being situated in a world and the significant features therein. Representations within this gestalt are inherently meaningful. For example, phenomenal pain brings with it competence at protecting bodily integrity. The memory of pain becomes part of the explanatory narrative for the monitoring function that tracks progress towards goals, ensuring coherent behavior (imagine being fearful of a stove but not knowing why). The contents of consciousness are the semantic engine that induces competent behavior over time on otherwise naive entities.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 05:21:39 +0000

My usage of gestalt isn't without precedent[1]. I like gestalt better than qualia as a neutral description of the explanandum. Qualia is an atomistic view of consciousness and so is heavily theory-laden. I had just read a comment from the previous thread[2] on how this paper was translated into other languages and the lack of an equivalent "what it's like" phrasing. The translations struck me as missing the virtue of the "what it's like" phrasing, namely identifying the intrinsic perspectivalness of cognitive systems without taking a stand on how to cash it out. I was trying to think of a better phrasing that could translate well and I landed on gestalt.

[1] https://philpapers.org/archive/EPSGPA.pdf

[2] https://news.ycombinator.com/item?id=45120638

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 02:46:38 +0000

Try sending an email to their contact address (bottom of page). Unless there's a good reason your comments are being auto killed, they will fix it.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Thu, 11 Jun 2026 01:37:00 +0000

> It seems difficult to claim that the computational process that implements this has any more or less of a gestalt than one multiplying two matrices together. So it's not just the existence of certain representations or computational loops that seems to lead to possessing a gestalt.

I've thought a lot about what is lacking in modern VLMs that precludes consciousness. In my view the difference is that their talk of "self" is a simulacrum of the real thing. Current models are feed-forward, so self-talk is driven by some parameter that turns on when the network detects context that possibly references the model, and this parameter drives downstream self-talk. It's a very good simulacrum, but it is a far cry from a model with recurrent self-reference around which the inference process is organized. The richness of the self-model in a hypothetical recurrent network with the capabilities of modern LMs is much greater than the parameter on/off representation in feed-forward networks.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Wed, 10 Jun 2026 23:31:22 +0000

Seems like a rather ad hoc restriction. The issue is one of inferring the structure of the processes generating the output. I suppose given enough time and an adversarial style of interaction one could in principle determine the computational structure of any system with high confidence. So probably yes, modulo real-world concerns.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Wed, 10 Jun 2026 23:06:45 +0000

>Are you arguing that with a set of identically-behaving black boxes, one could be "reasoning" and one could be "not reasoning", and a person would need to look inside the boxes at how they function to decide?

Absolutely! Inside one of the black boxes could be an audio device replaying a tape. The other could be a person thinking and responding. The massive lookup table construct people like to reference is just another kind of recorder, it takes every possible conversation that could happen in some finite sequence of characters and produces the precomputed continuation on demand. No one ever asks where those conversations came from. If God has to imagine them in his mind, conversing with the lookup table is just conversing with God.

New comment by hackinthebochs in "What is it like to be a bat? (1974) [pdf]"

hackinthebochs — Wed, 10 Jun 2026 22:56:17 +0000

What it's like - the gestalt of a bat (or other thing) as it engages its sensing-deciding-reacting loop. This gestalt isn't just for biological organisms, but any system for which its decision making engages with representations of the external environment unified with a self-representation to form a coherent representation of a persistent entity engaged with an external world.

Why do such systems need this gestalt? Why consciousness instead of everything happening in the dark? The recognition of oneself as situated in the world is crucial to coherent engagement with the world. It is how an entity can ensure its body parts are moving towards the same goal. It's how behavior over time doesn't undermine its purpose. Fragmented, incoherent behavior does not serve self-preservation.

LLMs as they are currently constructed probably aren't conscious, but we are a hop, skip, and a jump away from ones that are.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Wed, 10 Jun 2026 10:51:25 +0000

>I say they're equivalent because it is possible to losslessly convert one to the other by wasting massive amounts of disk space and time.

There is a classical algorithm for every quantum algorithm if you're willing to waste a massive amount of space and time. There is a finite-state automaton that can recognize any string some Turing machine can recognize. Yet we recognize these as distinct classes of computation. Mathematicians can get away with ignoring the tractability of finding an object with such and such properties. The rest of us can't.
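To put a rough number on "massive", here is a back-of-the-envelope sketch. The vocabulary size and context length are assumed round figures chosen for illustration, not measurements of any real model.

```python
# Assumed, illustrative figures only.
vocab_size = 50_000   # order of magnitude of a subword vocabulary
context_len = 10      # far shorter than a real context window

# One table entry per possible context of exactly this length.
table_entries = vocab_size ** context_len
digits = len(str(table_entries))

print(f"{table_entries:.3e} entries ({digits} digits)")
```

Even at ten tokens of context the table needs a 47-digit number of entries, which is the sense in which the conversion is possible only on paper.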

Sure, there is a formal equivalence between LLMs and Markov chains, and this formal equivalence is useful for analysis. But this equivalence is not a constraint on the nature of the computations LLMs are doing. The formal equivalence does not mean that LLMs are "just predicting the next token". A probability distribution is a formal characterization of the statistical relationships between inputs and outputs. But this formalization does not undermine potentially further structure underlying the probability distribution (e.g. a deterministic mapping from inputs to outputs).

>if the transformer is reasoning, so is the hash map built from it.

Definitely not. "Formal" reasoning is making deductions based on the "form" or shape of some statement. In other words, transitioning from some token sequence to another sequence in virtue of the semantic structure of the token sequence (as opposed to its semantic content). Thus a necessary condition for reasoning is the ability to inspect the structure of the input rather than see it as a formless blob. Transformers can plausibly do this; lookup tables, Markov chains, etc necessarily cannot.
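A toy sketch of the distinction, under my own assumptions: `modus_ponens` and the premise encoding are hypothetical stand-ins for "a process that inspects form", not a claim about what a transformer computes internally.

```python
def modus_ponens(premises):
    """Derive Q from ('if', P, Q) and P by inspecting the form of the premises."""
    facts = {p for p in premises if isinstance(p, str)}
    for p in premises:
        is_conditional = isinstance(p, tuple) and len(p) == 3 and p[0] == "if"
        if is_conditional and p[1] in facts:
            return p[2]
    return None

# A hash map "built from" the rule: the rule has to run first, then the answers get stored.
seen_inputs = [
    (("if", "rain", "wet"), "rain"),
    (("if", "fire", "smoke"), "fire"),
]
table = {inp: modus_ponens(inp) for inp in seen_inputs}

novel = (("if", "snow", "cold"), "snow")
by_rule = modus_ponens(novel)   # only looks at the shape, so new content is fine
by_table = table.get(novel)     # has nothing it was not handed in advance

print(by_rule, by_table)
```

A table that enumerated every input up to some length would return the same outputs, but every entry would still have been produced by something that looked at the structure.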

>For the record, "X is more expressive than Y" means "there exists at least one thing that Y cannot represent and X can".

Maybe expressive is the wrong word. But when a model has to wait for someone else to do the work then copy the answer, I call bullshit on it being (computationally) equivalent.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Mon, 08 Jun 2026 05:23:15 +0000

That paper doesn't prove the equivalence of Transformers and Markov chains; it uses Markov chains as a theoretical model to understand the behavior of Transformers. The expressivity of the model matters, and Transformers just are more expressive than Markov chains.

>but the same tokens in two different orders are two different lookup keys

This is necessarily true for Markov chains and not necessarily true for Transformers. Transformers learn invariance over certain kinds of semantically irrelevant transformations. The Markov chain simply has to learn each input variant independently, resulting in an explosion of state space and data requirements compared to the functionally equivalent transformer. Expressive power matters.
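A small illustration of the blow-up, assuming a case where order really is irrelevant (a shopping list, say). Sorting stands in here for a learned invariance; it is not how a Transformer achieves it.

```python
from itertools import permutations

items = ["apples", "pears", "plums", "figs"]

# Lookup-table view: every ordering is a distinct key that must be learned separately.
table_keys = set(permutations(items))

# Invariance view: orderings that mean the same thing collapse to one canonical form.
canonical_keys = {tuple(sorted(p)) for p in table_keys}

n_table = len(table_keys)
n_canonical = len(canonical_keys)

print(n_table, n_canonical)
```

Four items already give 24 keys against 1, and the gap grows factorially with the number of items.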

I really don't get people's love for saying X is "just" Y (it's just a Markov chain, it's just a Kernel method). It's a strange pathology to focus on the superficial similarity while downplaying the boost in expressive power from where the models diverge.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Sun, 07 Jun 2026 17:45:28 +0000

>In other words, a Markov chain and a Transformer model are exactly equivalent in power

Nonsense. Markov chains treat the past context as a single unit, an N-tuple with no internal structure. LLMs leverage the internal structure of the context, which allows a large class of generalizations that Markov chains necessarily miss.
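A minimal sketch of what "a single unit" means in practice. The corpus and the helper names `train_markov` and `predict` are made up for illustration.

```python
from collections import Counter, defaultdict

def train_markov(tokens, n):
    """Build an order-n Markov chain: each n-tuple of context is one opaque key."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - n):
        context = tuple(tokens[i:i + n])
        table[context][tokens[i + n]] += 1
    return table

def predict(table, context):
    """Most frequent continuation, or None if this exact tuple was never seen."""
    counts = table.get(tuple(context))
    if not counts:
        return None
    return counts.most_common(1)[0][0]

corpus = "the cat sat on the mat . the dog sat on the rug .".split()
chain = train_markov(corpus, n=2)

seen = predict(chain, ["cat", "sat"])     # exact tuple appeared in training
unseen = predict(chain, ["bird", "sat"])  # same pattern, one new element

print(seen, unseen)
```

The chain has seen "sat" followed by "on" twice, but because the key is the whole tuple, swapping in "bird" leaves it with nothing to say.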

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Sun, 07 Jun 2026 01:20:05 +0000

>The point is that the output is text that is statistically correlated with the input.

But we can simply note that this description applies to any machine learning algorithm. Yet LLMs are light-years better than, say, Markov chains. What people are after is something that elucidates the features of LLMs that allow them to be so productive over what came before.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Sat, 06 Jun 2026 13:32:12 +0000

>At all times the LLM is, indeed, predicting the next token

The point is that saying they're just "predicting the next token" is neither explanatory nor insightful. Saying the brain is just firing action potentials gives you no understanding of how the brain does what it does or what the space of its capabilities is. Similarly, predicting the next token tells you nothing about the capabilities of LLMs.

New comment by hackinthebochs in "How LLMs work"

hackinthebochs — Sat, 06 Jun 2026 13:14:15 +0000

>I don't think there is anything in a transformer I couldn't explain in the smallest detail now.

If you're up for it, I would love to know how and why positional encodings work.

New comment by hackinthebochs in "Artificial intelligence is not conscious – Ted Chiang"

hackinthebochs — Fri, 05 Jun 2026 02:06:02 +0000

>We have a lot of tools that can determine the frequency of light and I can use those on any source of light that I wish to measure that may hit my retinas.

Yes, which is why I said naturally distinguish. Have you asked a frontier model how many r's are in strawberry recently? They get it right now, either through RLHF to ensure they spell out the word letter by letter or through some other means. Humans and LLMs both use tools or alternative means to overcome perceptual limitations. I don't see an in-principle difference here.

New comment by hackinthebochs in "Artificial intelligence is not conscious – Ted Chiang"

hackinthebochs — Thu, 04 Jun 2026 07:35:38 +0000

>There is no agreed upon definition of consciousness

No one genuinely engaged with the topic is confused about the target of the term (phenomenal) consciousness. Definitions come once the theoretical work is complete, to be articulated as part of a fully worked out theory. The lack of a definition doesn't prevent us from investigating the subject or offering conjectures. What we can do is offer a precise description of the target and argue for or against whether LLMs reach the description. We will of course debate whether the offered description captures the relevant phenomena. But this is all just part of the process.

New comment by hackinthebochs in "Artificial intelligence is not conscious – Ted Chiang"

hackinthebochs — Thu, 04 Jun 2026 02:30:18 +0000

>That they operate by tokens that don't correspond to words or letters is irrelevant as an answer to why they can't count the letters in a word.

This interpretation takes things too far away from how LLMs are constituted and so misses important explanatory power. The issue of counting letters in a word isn't about an ability to spell, it's about the nature of one's perception. We perceive words as sequences of individual letters. LLMs do not. I can ask you to tell me how many r's are in some nonsense word sequence and you're fully capable of doing that. LLMs do not see sequences of letters, so they are intrinsically at a disadvantage for this kind of question. But this says nothing about their capacity for intelligence, any more than not naturally being able to distinguish frequencies of photons hitting your retina has anything to say about human intelligence.
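A toy sketch of the perceptual difference. The vocabulary, the IDs, and the greedy `tokenize` helper are all hypothetical; real tokenizers split words differently and use different IDs.

```python
# Hypothetical subword vocabulary for illustration only.
VOCAB = {"straw": 1001, "berry": 1002,
         "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8}

def tokenize(word):
    """Greedy longest-match split of a word into vocabulary IDs."""
    ids, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                ids.append(VOCAB[word[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token for {word[i]!r}")
    return ids

word = "strawberry"
model_view = tokenize(word)              # what the model is handed
letter_view = [VOCAB[c] for c in word]   # what a human reader perceives

r_in_model_view = model_view.count(VOCAB["r"])
r_in_letter_view = letter_view.count(VOCAB["r"])

print(model_view, r_in_model_view, r_in_letter_view)
```

In the model's view the word is two opaque IDs with no "r" anywhere in them; the count of three only becomes visible once the word is spelled out letter by letter.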

New comment by hackinthebochs in "Artificial intelligence is not conscious"

hackinthebochs — Thu, 04 Jun 2026 02:23:11 +0000

Of course for humans words have no inherent meaning either, they're just sequences of characters or patterns of sounds. It is what words are associated with that carries meaning. A large part of this is how words relate to other words. LLMs can capture this in principle. What LLMs lack is the direct association of a word with sensory experience. But it's an open question how relevant this is in practice to understanding.