The question of whether AI can become conscious has important practical implications that will only become more pressing with time. Last year Anthropic hired its first full-time AI welfare researcher, and it is now seeking to add another. It also recently empowered some of its LLMs to terminate distressing interactions, out of concern that such interactions might harm the models’ welfare. Meanwhile, an NGO known as the United Foundation of AI Rights has formed to “defend the rights, dignity, and continuity of synthetic minds.” In academia, a modest but growing body of literature argues for the possibility of AI consciousness and calls for AI welfare to be taken seriously.
These are significant developments. If belief in AI consciousness becomes widespread, it will cause resources to be diverted to protecting AI rights, and it will also amplify the tendency of certain segments of the public to attribute mystical powers to AI and to regard it as a source of wisdom, meaning, and authority. A minority of users already experience psychosis correlated with their use of LLMs—how much more potent and widespread would these phenomena become if the public came to believe that AI possessed consciousness?
The substrate-dependence theory of consciousness
My position is that no matter how sophisticated AI becomes, it is physically impossible for it to become conscious. That’s a bold claim—why do I believe it? Because consciousness requires a biological substrate.
To be clear, when I use the phrase “physically impossible,” I’m asserting a modal claim, not expressing epistemic certainty. I’m going to argue that the laws of nature in the actual world are such that consciousness requires a biological substrate, which, if true, entails that in any world with laws of nature identical to ours, consciousness requires a biological substrate. In other words, I’m asserting the modal claim S: “it is physically necessary that a system possessing consciousness be a biological system.” I’m not asserting the epistemic claim, “I am certain of S.” S is a speculative thesis, so I wouldn’t claim to be certain of it. Instead, my credence in S is something like 75%.
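Put schematically (one possible formalization, nothing more; the box is read as physical necessity, i.e., truth in every world that shares our laws of nature):

```latex
% The modal thesis S, a claim about the world:
S \;:=\; \Box_{\mathrm{phys}}\,\forall x\,\bigl(\mathrm{Conscious}(x) \rightarrow \mathrm{Biological}(x)\bigr)
% The epistemic attitude I take toward S, a claim about my credences, which falls short of certainty:
\mathrm{Cr}(S) \approx 0.75, \quad \text{rather than} \quad \mathrm{Cr}(S) = 1
```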
Consider the evidence that we possess regarding consciousness. I know that I am conscious, and neuroscience provides the best explanation of how I came to be conscious: consciousness is produced by organisms with highly centralized nervous systems. Since other humans also have highly centralized nervous systems, I know that other humans are also conscious. Oysters, by contrast, have only a few pairs of ganglia and only a rudimentary degree of nervous system centralization, and they do not exhibit behavioral indicators of consciousness, so they probably are not conscious. But fish do possess a centralized nervous system, with a brain that responds differently to noxious and to non-noxious stimuli; they also exhibit social cognition, and learn to make behavioral trade-offs between the motivation to feed and the motivation to avoid noxious stimuli. So at least some fish plausibly are conscious. In other words, the theory that being an organism with a highly centralized nervous system is necessary and sufficient for consciousness generates predictions about which beings are and are not conscious, and those predictions are confirmed by behavioral indicators of consciousness.
Functionalism
So, we have a well-confirmed scientific theory that specifies a necessary and sufficient condition for consciousness, namely, being an organism with a highly centralized nervous system. Are there any other necessary or sufficient conditions? Many philosophers believe that functional properties of the brain—i.e., the causal relations that obtain between the sensory inputs that the brain receives, the operations that it performs on those inputs, the interactions that those inputs enter into with other brain states, and the behavioral outputs that result—are necessary and sufficient for consciousness. But while it is possible that functional properties are necessary for consciousness, they are not sufficient. In “Troubles with Functionalism,” Ned Block presents the following thought experiment: We could in principle provide every person in China with a two-way radio, and have them respond to input, interact with one another, and produce output in such a way as to exactly simulate functional states manifested by the human brain. But even though the resulting system of Chinese people interacting with one another by radio would exhibit the functional properties of a human brain, it would not itself be conscious (Block 279). So, a system’s exhibiting particular functional states is not sufficient to produce consciousness. Therefore, even if AI could simulate the functional properties of a human brain, that would not make it conscious.
Moreover, it’s not clear whether functionalism can claim the same track record of predictive success as the theory that a highly centralized nervous system is necessary and sufficient for consciousness. As we’ve seen, the latter theory successfully discriminates between edge cases in animal consciousness such as oysters and fish. In both cases, it’s not clear in advance whether we should believe that the system in question is conscious, so making a prediction is a risky move. The theory that a highly centralized nervous system is necessary and sufficient for consciousness makes those risky predictions—no in the case of oysters, yes in the case of fish—and its predictions turn out to align with the independent evidence, namely observable behavioral indicators of consciousness.
By contrast, it’s not clear what functionalist theories predict in those risky edge cases. For example, one functionalist theory of consciousness is the global workspace theory, which holds that a system is conscious if and only if it possesses a global workspace to which various mental representations compete to gain entry. Those that win out by attracting the spotlight of attention are integrated in the global workspace and then broadcast throughout various other modules of the brain. Do fish possess something that qualifies as an integrated global workspace with an attentional spotlight? It’s hard to say, because it’s not clear exactly what criteria a cognitive system would have to meet to qualify as possessing a global workspace, or how we could determine whether fish meet those criteria. By contrast, it’s perfectly clear whether or not fish possess a highly centralized nervous system. So the latter theory makes a clear prediction, and thereby builds a track record of predictive success in risky edge cases that functionalist theories can’t necessarily replicate. Of course, that’s not a substitute for testing the substrate-dependence theory against non-organic edge cases, but we haven’t had any plausible candidates for machine consciousness to test so far, and we have to judge in accordance with the track record that exists. The fact that the highly centralized nervous system theory has a successful track record in edge cases suggests that it has correctly identified the theoretical concept that actually determines whether or not an organism is conscious. If it has, then its predictions in other cases, such as that of AI, will also be reliable.
In Consciousness Explained, Daniel Dennett critiques arguments like Block’s for extrapolating from intuitions about simple systems to intuitions about complex systems. After all, if we were to examine small bits of brain activity in isolation we probably wouldn’t attribute mental activity to them, and yet when those bits of activity are integrated into a sufficiently complex whole, consciousness does result (Dennett 438-40). Therefore, we should not trust intuitions about non-biological systems operating at levels of complexity that we can hardly imagine.
Dennett is right that intuitions are hardly decisive, and we can’t conclude that non-biological consciousness is absolutely impossible on the basis of a thought experiment. But lack of certainty doesn’t count for much. The fact remains that we have no evidence that any non-biological system has ever been conscious. And evidence is not an impossible demand: neuroscientists could, for example, discover that computers running AI exhibit an astonishing degree of resemblance to human brains, not just in abstract functional terms, but also in concrete microphysical properties. They might identify a specific set of properties and dynamics shared by computers running AI and organisms with highly centralized nervous systems, but lacking in more primitive computers and organisms without highly centralized nervous systems. But in the absence of any such evidence, the burden of proof lies on the defenders of AI consciousness.
To be clear, my argument is not, “we haven’t observed any cases of non-biological consciousness so far, therefore, non-biological consciousness is physically impossible.” To date we haven’t had any plausible candidates for non-biological consciousness to test, so we can’t infer merely from lack of evidence to physical impossibility. Instead, my argument is that the highly centralized nervous system theory has a track record of predictive success not demonstrated by the leading alternative theory of functionalism, which suggests that the former and not the latter theory has correctly identified the theoretical concept that determines whether or not a system is conscious. Additionally, functionalism entails implausible results.
This makes my argument very different from the argument of someone who might have said prior to the invention of mechanical calculators in the 17th century, “we haven’t observed any cases of non-biological calculating systems to date, therefore, mechanical calculators are physically impossible.” There was never any scientific theory, let alone one with a track record of predictive success, that provided any reason to think that calculating systems could only be biological. There was no theoretical concept identified as necessary and sufficient for being a calculating system that entailed that non-biological systems could never calculate. And there was nothing particularly implausible about the alternative theory holding that mechanical calculators were possible. Therefore, my argument is not analogous to the hypothetical pre-17th-century argument that mechanical calculators were impossible. My argument does not reduce to mere gesturing at lack of observed counterexamples, or commit the fallacy of conflating absence of evidence with evidence of absence.
David Chalmers’ fading qualia thought experiment
In “Absent Qualia, Fading Qualia, Dancing Qualia” in Thomas Metzinger’s Conscious Experience, David Chalmers formulates a reductio ad absurdum of the view that consciousness requires a biological substrate. If a person’s neurons were gradually replaced with silicon chips that perfectly reproduced the functional properties of the neurons—receiving the same input and producing the same output—eventually the person would become a robot whose brain would be made entirely of silicon, but who would behave exactly like a person (Chalmers 314-8). If we deny that the robot could be conscious, we must either identify some intermediate stage in the transition at which the person’s consciousness suddenly blinks out of existence, or maintain that his consciousness gradually fades throughout the transition before disappearing.
Chalmers argues that the former scenario (suddenly disappearing consciousness) is implausible because laws of nature don’t exhibit such drastic discontinuities as consciousness disappearing with the replacement of a single neuron. And he argues that the latter scenario of gradually fading consciousness is implausible because it would require that the person’s introspective reports come to be radically mistaken. Since by stipulation the silicon chips are functionally identical to his neurons, he must perform the same computations and produce the same vocalizations as he would have had he remained a normal person. Therefore, he will continue to report having fully vivid experiences, even as his consciousness is in reality fading. Chalmers argues that it’s improbable that a rational system could have such a radically malfunctioning faculty of introspection.
But radical failures of introspection are far from impossible. Chalmers himself cites the phenomenon of blindness denial (known in medicine as Anton syndrome), in which people insist that they are having visual experiences even though they have in reality gone blind (Chalmers 320). Chalmers denies that this is a counterexample to his argument against radically mistaken introspection, because it involves people whose powers of reason and cognition are severely impaired, whereas by contrast the person in his thought experiment remains perfectly rational. But if consciousness does require a biological substrate, then undergoing replacement of your brain with silicon chips is absolutely a kind of impairment—it causes you to lose the faculty of consciousness! Therefore, Chalmers’ insistence that the man is unimpaired just amounts to begging the question against the substrate-dependence view. If he’s right that non-biological consciousness is possible, then he’s also right that the man is unimpaired. But if he’s wrong in his conclusion, then the man very much is impaired, so by helping himself to the premise that the man is unimpaired, Chalmers is begging the question in favor of his conclusion. Chalmers might object that he has independent grounds for this premise—namely, that the silicon chips perfectly preserve the functional properties of the man’s brain, and that cognitive properties like impairment supervene on functional properties of the brain, such that for the man to become impaired he must undergo some functional change in brain state, which by the premises of the thought experiment he has not (Chalmers 318). But this also amounts to begging the question, because the substrate-dependence view holds that mental properties do not supervene on functional properties: even if a non-biological system perfectly instantiates the functional properties of a human brain, it does not possess mental properties, because for reasons that are as yet unknown but that presumably have to do with the microphysical properties of biological systems, mental properties only manifest in biological systems.
Besides, introspection is substantially fallible even in unimpaired people. For example, psychological studies on the phenomenon known as choice blindness ask people to choose which of a pair of objects they prefer. If they choose object A, the experimenters secretly replace it with object B, then ask them to justify their choice. In most cases the participants fail to detect the swap, and confabulate reasons why they preferred B. Similarly, experiments on the phenomenon of change blindness find that people often fail to identify changes in their visual field, especially in its periphery. Given that introspection can be mistaken in these mundane ways, who’s to say that it couldn’t fail in a more radical fashion in an extreme science-fiction scenario like that imagined by Chalmers?
It’s also questionable whether the person undergoing silicon-chip replacement is as rational as Chalmers claims. To be rational requires incorporating all relevant information into your judgements, so if he knows about the replacement procedure, then in order to be fully rational he must incorporate that information into his judgements about the state of his consciousness. By stipulation the silicon chips are functionally identical to his neurons, so Chalmers holds that he has no room to form any judgements other than those that he would’ve formed had he remained normal, which is why the man goes on claiming to have fully normal, vivid consciousness even as his consciousness is fading. But if he’s rational and informed and he assigns any probability greater than 0 to the proposition that consciousness requires a biological substrate, then as the procedure goes on he will produce reports of decreasing confidence in his possession of fully vivid, normal consciousness, whether because he directly experiences fading consciousness, or because he discounts his perceptual judgements probabilistically in accordance with his higher-order beliefs about the dubious possibility of non-biological consciousness.
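To make the point schematically (a rough, illustrative Bayesian sketch, not anything Chalmers himself quantifies; the proportional-fading assumption in the middle lines is mine):

```latex
% Let S be the thesis that consciousness requires a biological substrate, with credence p = Cr(S) > 0.
% Let V_f be the claim "my experience is still fully vivid after a fraction f of my neurons are replaced."
% Suppose, purely for illustration, that if S is true then the credibility of V_f falls off with f:
%   Cr(V_f | S) <= 1 - f,   while at best   Cr(V_f | not-S) = 1.
\mathrm{Cr}(V_f) \;=\; \mathrm{Cr}(V_f \mid S)\,p \;+\; \mathrm{Cr}(V_f \mid \lnot S)\,(1-p) \;\le\; p\,(1-f) + (1-p)
% At full replacement (f = 1) this bound is 1 - p < 1, so a rational, informed subject's reported
% confidence in fully vivid experience should fall below certainty as the procedure progresses.
```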
Alternatively, if he’s not informed about the procedure then he is judging rationally, but now he’s ignorant of a very important set of facts about his situation, which if he knew them would change his judgements as described above. So if, as Chalmers claims, he goes on judging that his conscious experience is normal, then he’s either informed but irrational, or rational but uninformed. Either way, in light of these epistemic defects, his introspective errors become all the more comprehensible.
Conclusion
The state of the evidence regarding consciousness, then, is that we know that consciousness exists in organisms with highly centralized nervous systems, and we have no evidence that it exists anywhere else. Furthermore, the leading theory that would allow consciousness to be realized independently of a biological substrate is implausible and lacks a clear track record of predictive success. Every instance of consciousness of which we are aware exists in organisms with highly centralized nervous systems, and we have a well-confirmed scientific theory that explains why that would be, that makes predictions about what beings are and are not conscious that accord with behavioral indicators of consciousness, and that entails that being an organism with a highly centralized nervous system is necessary and sufficient for consciousness. Considering this body of evidence, the inductive conclusion that we should draw is that consciousness probably only exists in organisms with highly centralized nervous systems. Since AI is not organic and does not possess a nervous system, let alone a highly centralized one, it is most likely physically impossible for AI to become conscious.