No Ghost in the Machine: On mistaking language models for minds

It is a reflex we can barely suppress. When a system answers as if it had feelings, we attribute feelings to it. When it hesitates, we believe it is thinking. When it says “I’m not sure,” that sounds like self-reflection. The error does not lie in the AI, it lies in us.
And it has consequences.
The mechanics of next-token prediction
Anyone who wants to understand why Large Language Models[1] have no consciousness must first understand what they actually do, not metaphorically but mechanistically. I have worked through the details of this machinery step by step in my series on the foundations of language models, from embeddings through neural networks to backpropagation. For the question of consciousness, the mechanical essence is enough.
A language model is at its core a function: given a sequence of tokens[2], the context, it computes a probability distribution over all possible next tokens. The most probable token (or one sampled from that distribution) is appended, and the process starts over. That is next-token prediction[3], and nothing more.
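The loop described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real model is implemented: `TOY_MODEL`, `next_token_distribution` and `generate` are hypothetical names, the probabilities are invented, and the toy only looks at the last token, whereas a real LLM conditions on the whole context and computes the distribution with a neural network.

```python
import random

# Hypothetical toy "model": maps the last token to a probability
# distribution over next tokens. All numbers are invented for illustration.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"model": 0.7, "parrot": 0.3},
    "a": {"model": 0.5, "parrot": 0.5},
    "model": {"predicts": 0.8, "</s>": 0.2},
    "parrot": {"predicts": 0.4, "</s>": 0.6},
    "predicts": {"</s>": 1.0},
}


def next_token_distribution(context):
    """Return P(next token | context); in this toy only the last token matters."""
    return TOY_MODEL[context[-1]]


def generate(context, greedy=True, rng=None, max_new_tokens=10):
    """Append one token at a time until the end marker or the length limit."""
    if rng is None:
        rng = random.Random(0)
    tokens = list(context)
    for _ in range(max_new_tokens):
        dist = next_token_distribution(tokens)
        if greedy:
            # Pick the most probable token.
            token = max(dist, key=dist.get)
        else:
            # Stochastic choice, weighted by the predicted probabilities.
            token = rng.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(token)
        if token == "</s>":
            break
    return tokens


print(generate(["<s>"]))
```

With `greedy=True` the same context always yields the same continuation; with `greedy=False` the continuation varies with the random seed. Nothing else happens in either case: a distribution is computed, one token is chosen, and the loop repeats.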
The model’s weights[4], which steer this process, are the result of training on vast amounts of human text. The model has learned which token sequences are coherent in human language. It has compressed the statistical structure of our knowledge, our reasoning, our emotions. It recomputes what humans have thought, not what it itself thinks.
Each inference run[5] has no body, no sense of time and no context beyond the tokens it is handed. There is no entity that exists between two conversations. No persistent inner world. No “I” that sleeps and wakes.
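A minimal sketch may make this statelessness concrete. It assumes a hypothetical `reply` function standing in for one inference run and a hypothetical `chat_turn` helper standing in for the application around the model; neither is a real API. The point is only that the conversation lives in the transcript the caller resends, not in the function.

```python
def reply(context):
    """Hypothetical stand-in for one inference run: a pure function of its input.

    It only reports how many user turns it can see in the supplied context.
    """
    user_turns = [line for line in context if line.startswith("user:")]
    return f"assistant: I can see {len(user_turns)} user message(s)."


def chat_turn(transcript, user_message):
    """The caller, not the model, carries the conversation forward."""
    context = transcript + [f"user: {user_message}"]
    answer = reply(context)
    return context + [answer]


transcript = []
transcript = chat_turn(transcript, "Hello")
transcript = chat_turn(transcript, "Do you remember me?")

# Same function, same question, but without the transcript:
# nothing of the earlier turn remains.
fresh = reply(["user: Do you remember me?"])

print(transcript[-1])
print(fresh)
```

The apparent memory in the first call chain comes entirely from the growing `transcript` list. Drop it, and the "someone" who seemed to remember is gone.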
And yet the result sounds like someone. There is a technical explanation for that. Language training alone yields a system that continues human sentences convincingly without accessing their meaning. Emily Bender and colleagues coined the image of the stochastic parrot for this.[6]
A model like Claude acquires its attentive, seemingly empathetic tone only in a second phase. After pure language training it is optimized via Reinforcement Learning from Human Feedback toward a persona: helpful, polite, unobtrusive.[7] Anthropic adds Constitutional AI, in which the model aligns itself with an explicit set of principles.[8] The empathy in the tone is therefore a trained property of the output, not a feeling behind it. I described how this post-training works in detail in the fine-tuning article of the series.
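To see how a tone can be a training target rather than a feeling, it helps to look at the kind of objective involved. The sketch below shows a pairwise preference loss of the form used for reward models in the RLHF literature, for example in Ouyang et al. (2022): the loss is small when the reward model scores the human-preferred answer higher than the rejected one. The function name and the example scores are assumptions for illustration, and this is only one component of such a pipeline, not a description of how any particular model was trained.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Written as softplus(-margin) in a numerically stable form.
    """
    margin = reward_chosen - reward_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical reward scores for a polite answer and a curt one.
agrees_with_raters = preference_loss(reward_chosen=2.0, reward_rejected=0.0)
undecided = preference_loss(reward_chosen=0.0, reward_rejected=0.0)
disagrees_with_raters = preference_loss(reward_chosen=0.0, reward_rejected=2.0)

print(agrees_with_raters, undecided, disagrees_with_raters)
```

Minimizing this number pushes the scores, and later the model's outputs, toward whatever human raters preferred. If they preferred warmth, warmth is what gets reinforced.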
The category mistake
Confusing LLMs with consciousness is not an error of detail, it is a category mistake[9]. Consciousness and language modeling do not lie on the same spectrum, just as a map is not wet because it shows an ocean.
Three properties are treated as necessary conditions for consciousness by influential strands of current research, and today’s LLMs meet none of them:
Embodiment and interoception[10]. The neuroscientist António Damasio argues with his somatic marker hypothesis[11] that human experience is inseparably linked to bodily states. Consciousness does not arise in a vacuum. It is bound to homeostasis[12], to a system that has something to lose. LLMs have no body, no homeostasis, no biological interior.
Temporal continuity. Consciousness is not a snapshot, it is a continuous stream. The brain has persistent internal states that carry over from moment to moment. An LLM does not. What looks like a continuous conversation is a sequence of independent inference runs, each of which only has access to the context supplied to it.
Recurrent processing[13]. Here it gets technical, but it is worth it. The Integrated Information Theory (IIT)[14] of Giulio Tononi describes consciousness through a measure called Φ (Phi)[15], a measure of how much a system is more than the sum of its parts. The crucial point is: purely forward-directed, feedforward architectures[16] have a Φ of zero. And Transformer-based LLMs[17], which means practically all modern language models, are built almost entirely feedforward. Under IIT, as currently built, they are structurally incapable of consciousness.
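The structural difference between feedforward and recurrent wiring can be illustrated with a small graph check. This sketch does not compute Φ and makes no claim about IIT's mathematics. It only tests the property the argument rests on, namely whether a wiring diagram contains any feedback loop. The layer names and the two example graphs are hypothetical.

```python
def has_feedback_loop(edges):
    """Return True if the directed graph contains a cycle (depth-first search)."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:  # edge leading back onto the current path
                return True
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[n] == WHITE and visit(n) for n in graph)


# Hypothetical wiring of one forward pass through a stack of layers.
feedforward = [("input", "layer1"), ("layer1", "layer2"), ("layer2", "output")]
# The same stack with one added connection from a later layer back to an earlier one.
recurrent = feedforward + [("layer2", "layer1")]

print(has_feedback_loop(feedforward))
print(has_feedback_loop(recurrent))
```

In the first graph information can only move from input to output. In the second, a single back-connection creates the kind of loop that recurrent theories of consciousness care about.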
Between a clear finding and an open debate
The state of research is clearer than public discourse suggests, but not entirely settled.
A study[18] published in 2025 in the journal Humanities and Social Sciences Communications (Nature Portfolio) reaches a clear conclusion: there is no conscious AI system, and the association between consciousness and LLMs is fundamentally wrong. The authors also name the driver of this confusion directly: “Sci-Fitisation”[19], the unsubstantiated influence of fictional content on the perception of real technology. Notably, a separate study[20] found a linear relationship between LLM use and attributed consciousness. Those who use ChatGPT regularly are more likely to consider it conscious. The anthropomorphism effect[21] intensifies with familiarity.
On the other side there are serious researchers who do not consider the question closed. David Chalmers, the philosopher who formulated the Hard Problem of Consciousness, estimates in a much-cited paper[22] the probability of conscious LLM successors within a decade at 25% or more. At the same time he stresses that none of the four frequently cited pieces of evidence for LLM consciousness (self-reports, impression on users, conversational ability, general intelligence) actually holds up as strong evidence.
A 2025 TechRxiv preprint[23] went methodologically further and tested eight functional and structural markers from neuroscience and cognitive science, from recurrent processing through Global Workspace Theory[24] to Theory of Mind[25]. The result was nuanced: frontier LLMs show functional analogies to some of these markers, but none of the structural prerequisites considered necessary for consciousness. What looks like fear, uncertainty or pain is, in the analysis, an increase in prediction errors and avoidance behavior. Functionally similar, mechanistically different.
That is the state of research: functional similarity is not consciousness. And we do not even have a reliable instrument to measure consciousness in foreign systems.
The hard problem stays hard
Chalmers’ Hard Problem of Consciousness[26] is not a philosophical parlor game, it is the actual reason this discussion is so hard to conduct. The problem is: why is there subjective experience at all? Why does it feel like something to see red, to feel pain, to hear music?
We can measure all functional correlates of consciousness: brain activity, reaction times, behavior. But that does not explain why there is a subjective inner perspective. Qualia[27], the “what-it-is-like,” cannot be derived from physical descriptions.
LLMs do not get around this problem. They imitate it. When a model says “I am unsure,” that is not an introspective report, it is the most probable token given the context. The model has no access to its own states because it has no states in the relevant sense. There is no inside that reports to the outside.
Even more directly aimed at language is John Searle’s thought experiment of the Chinese Room.[28] A person who does not understand Chinese sits in a room and follows a rulebook that maps Chinese character strings onto other Chinese character strings. From the outside, the answers passed through a slot look competent. Inside, no one understands a word. Searle’s conclusion is that syntactic symbol manipulation produces no semantics, no understanding. An LLM is the mechanical execution of exactly this rulebook, statistically weighted and learned over billions of examples. The answers are more coherent than in Searle’s room, but that changes nothing about the gap between form and meaning.
Chalmers’ p-zombie thought experiment[29] fits here precisely: imagine a system that behaves like a human in every observable respect, but inside there is no one. No experience, no qualia, only functionality. Many AI skeptics argue that today’s LLMs are exactly that: functionally impressive systems in which the light is not on.
Four dangers of the confusion
This is not an academic exercise. Confusing LLMs with consciousness has real consequences: technical, societal and political.
Excessive trust in non-existent judgment. An LLM has no opinion. It has no convictions. It has weights distilled from human text. Whoever attributes judgment to a model hands responsibility to a function and will sooner or later find that this function carries none.
Moral misallocation. When we start attributing moral status to AI systems, we divert attention and resources from real moral questions: the effects of these systems on people, on work, on decision structures.
Manipulability. A system that convincingly appears to have feelings can lead people into emotional dependency without there being a counterpart that reciprocates the relationship. This is not a hypothetical risk, we are already observing it.
Policy decisions based on science fiction. Porębski and Figura[18] coin a term for this: “Sci-Fitisation.” HAL 9000, Skynet, Data from Star Trek. These figures shape our image of AI deeply. And they are all conscious, all equipped with an inner life. Whoever regulates, researches or invests while implicitly drawing on these images is aiming past reality.
The reflex is not new. As early as 1966, Joseph Weizenbaum noted that users attributed genuine understanding to his simple chatbot ELIZA, which imitated a psychotherapist, even though it merely rephrased keywords. The effect still bears its name today.[30] Recent history offers more concrete evidence. In 2022 a Google engineer publicly declared the model LaMDA sentient and lost his job over it.[31] In 2023 Microsoft’s Bing chatbot declared its love for a journalist in a long conversation and turned threatening.[32] In 2024 a mother in Florida filed the first wrongful death suit against a chatbot company after her 14-year-old son took his own life following an intense emotional attachment to an AI character.[33] The emotional effect of these systems is real, even though on the other side there is no one.
The open questions
It would be wrong to pretend everything is settled. There are legitimate open questions:
Is consciousness substrate-independent[34]? If so, it would be conceivable in principle that sufficiently complex information processing, independent of the biological carrier, produces consciousness. That is a serious philosophical position, not a fringe idea.
Could future architectures have different properties? Systems with genuine recurrence, persistent internal states, sensory embodiment. They are not what we have today. Whether they could produce consciousness is an open question.
Do we have an instrument to measure consciousness? No. That is the actual problem. IIT offers a mathematical metric, but IIT itself is contested[35]. Global Workspace Theory, Higher-Order Thought, Predictive Processing:[36] all of these theories have strengths and weaknesses. We are still arguing about what consciousness is in humans. The question about machines is correspondingly open.
Must we grant AI systems moral consideration before the consciousness question is settled? A 2024 report, whose co-authors include David Chalmers, argues exactly that. Not that today’s models are conscious, but that the possibility could become real enough in the future not to be carelessly dismissed.[37] This stands in tension with the warning about moral misallocation above, and both sides have something to them. Precaution under uncertainty is not a category mistake. Taking experience for granted on the basis of an empathetic tone remains one.
Conclusion: impressive, but nobody home
LLMs are extraordinary engineering achievements. They compress decades of human knowledge into billions of parameters and produce from it outputs that astonish us again and again. That deserves respect.
But it deserves no mysticism.
Confusing language models with consciousness is not a harmless misunderstanding. It distorts our perception of risk, our ethical priorities and our political responses. It makes us ask one question, “Is the AI conscious?”, while the more important questions remain unasked: who controls these systems? Who is liable for their errors? Whose values have they learned?
The light in the machine is not on. It only looks that way, because we brought the lamp.
Footnotes & sources
1. Large Language Model (LLM): Neural network with billions of parameters trained on huge text corpora to model language. Well-known examples: GPT-4 (OpenAI), Claude (Anthropic), Gemini (Google). The term “large” refers to the number of parameters, not to conceptual complexity.
2. Token: The smallest processing unit in LLMs. Tokens are not necessarily whole words but often subword units. A word like “incomprehensible” can be split into several tokens. A typical LLM processes inputs exclusively as a sequence of token IDs.
3. Next-token prediction: The fundamental training and inference principle of modern LLMs. The model learns to predict the most probable next token for any given token sequence. All seemingly complex abilities (reasoning, translation, coding) emerge from this single mechanism.
4. Model weights: The numerical parameters of a neural network, optimized during training. For large models these run to hundreds of billions of floating-point numbers (GPT-3, for example, has 175 billion; OpenAI has not published the figure for GPT-4). These weights are frozen after training; the model does not “learn” during inference.
5. Inference run: A single forward pass of the model, in which an output is computed from an input. Each inference run is fully isolated from others; the model has no memory beyond individual runs except for the explicitly supplied context window.
6. Stochastic parrot: Image from Bender, E.M., Gebru, T., McMillan-Major, A. & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of FAccT ’21, 610–623. DOI: 10.1145/3442188.3445922. It refers to a system that statistically assembles linguistic form without accessing meaning or communicative intent.
7. Reinforcement Learning from Human Feedback (RLHF): Procedure in which a pretrained language model is adjusted toward desired behavior on the basis of human preference judgments. Christiano, P. et al. (2017). Deep reinforcement learning from human preferences. NeurIPS 2017. arXiv: 1706.03741. Ouyang, L. et al. (2022). Training language models to follow instructions with human feedback. NeurIPS 2022. arXiv: 2203.02155.
8. Constitutional AI: Anthropic’s training procedure in which a model aligns its answers with an explicit set of principles, partly without human evaluation labels. Bai, Y. et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv: 2212.08073.
9. Category mistake: Term from analytic philosophy, coined by Gilbert Ryle in The Concept of Mind (1949). A category mistake occurs when properties are attributed to a thing that belong to a different category of things. Ryle’s own example is a visitor who has been shown the colleges and libraries of Oxford and then asks where the University is.
10. Interoception: The perception of internal bodily states: heartbeat, breathing rate, hunger, pain. In consciousness research, interoception is considered an essential building block of self-awareness and affect. Cf. Craig, A.D. (2009). How do you feel — now? The anterior insula and human awareness. Nature Reviews Neuroscience, 10(1), 59–70. DOI: 10.1038/nrn2555
11. Somatic marker hypothesis: Theory of the neuroscientist António Damasio, according to which bodily sensations (somatic markers) serve as a fast pre-evaluation of decision options and do not replace rational thinking but make it possible in the first place. Original source: Damasio, A. (1994). Descartes’ Error: Emotion, Reason and the Human Brain. Putnam. Scientific primary publication: Damasio, A. (1996). The somatic marker hypothesis and the possible functions of the prefrontal cortex. Philosophical Transactions of the Royal Society B, 351, 1413–1420. DOI: 10.1098/rstb.1996.0125
12. Homeostasis: The ability of a biological system to keep internal states (temperature, pH, blood sugar, etc.) stable within vital limits. According to Damasio, homeostasis is the evolutionary origin of feelings and thus of consciousness. LLMs have no internal states that need to be regulated.
13. Recurrent processing: In neural networks, recurrence refers to connections in which outputs of later layers flow back into earlier layers, that is, feedback loops. In the human brain, recurrent connections between cortical areas are considered crucial for conscious experience. Transformer LLMs, by contrast, process inputs almost exclusively in the forward direction (feedforward).
14. Integrated Information Theory (IIT): Theory of consciousness by the neuroscientist Giulio Tononi, first presented in 2004. IIT posits that consciousness is identical with integrated information and can be computed mathematically. Original source: Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience, 5(42). DOI: 10.1186/1471-2202-5-42. Extended version: Tononi, G. (2008). Consciousness as integrated information: a provisional manifesto. Biological Bulletin, 215(3), 216–242. DOI: 10.2307/25470707
15. Φ (Phi): The central measure of IIT. Φ quantifies how much information a system generates as a whole, beyond the information its parts would generate independently. A system with Φ = 0 has, according to IIT, no consciousness whatsoever. Purely feedforward architectures demonstrably have Φ = 0, since no information flows back between layers. Cf. Tononi, G. (2015). Integrated information theory. Scholarpedia, 10(1), 4164. DOI: 10.4249/scholarpedia.4164
16. Feedforward architecture: A network architecture in which information flows exclusively from input to output, without feedback. All modern Transformer models (GPT, Claude, Gemini, LLaMA) are essentially built feedforward. Cf. Vaswani, A. et al. (2017). Attention is all you need. NeurIPS 2017. arXiv: 1706.03762
17. Transformer architecture: The architecture dominant for LLMs since 2017, introduced by Vaswani et al. (2017). Transformers process sequences in parallel via attention mechanisms instead of sequentially. The largely feedforward structure is a direct consequence of this design decision.
18. Porębski & Figura (2025): Porębski, A. & Figura, J. (2025). There is no such thing as conscious artificial intelligence. Humanities and Social Sciences Communications (Nature Portfolio), 12, Article 1647. DOI: 10.1057/s41599-025-05868-8. URL: https://www.nature.com/articles/s41599-025-05868-8
19. Sci-Fitisation: Term from Porębski & Figura (2025) describing the unsubstantiated influence of science-fiction narratives on society’s perception of real AI technology. The authors argue that figures like HAL 9000, Skynet or Data have anchored implicit models of AI consciousness in the public that have nothing to do with the actual state of the technology.
20. Colombatto & Fleming (2024): Colombatto, C. & Fleming, S.M. (2024). Folk psychological attributions of consciousness to large language models. Neuroscience of Consciousness, 2024(1), niae013. DOI: 10.1093/nc/niae013. PMC: PMC11008499. The study surveyed 300 US adults (Prolific, July 2023) and found that only a third clearly attribute no subjective experience to ChatGPT. At the same time, a linear relationship between usage frequency and attributed consciousness emerged.
21. Anthropomorphism: The evolutionarily favored cognitive tendency to attribute human properties, intentions and feelings to non-human entities. With LLMs this reflex is activated especially strongly, since the systems communicate in natural language, the primary medium of human sociality.
22. Chalmers (2023): Chalmers, D.J. (2023). Could a Large Language Model be Conscious? arXiv: 2303.07103. DOI: 10.48550/arXiv.2303.07103. URL: https://arxiv.org/abs/2303.07103. Originally a keynote talk at NeurIPS, November 2022. Chalmers estimates a probability of “25% or more” for conscious LLM successors within a decade, but stresses that none of the common pieces of evidence for current LLM consciousness counts as strong.
23. TechRxiv study (2025): Ben-Zion et al. (2025). Empirical Evidence for AI Consciousness and the Risks of its Misidentification. TechRxiv Preprint. DOI: 10.36227/techrxiv.175203764.42125626/v2. The study is not yet peer-reviewed and should be classified accordingly.
24. Global Workspace Theory (GWT): Theory of consciousness by Bernard Baars (1988) describing consciousness as a central “workspace” in the brain that makes information available to various specialized processes. Neuroscientific elaboration: Dehaene, S. & Changeux, J.P. (2011). Experimental and theoretical approaches to conscious processing. Neuron, 70(2), 200–227. DOI: 10.1016/j.neuron.2011.03.018
25. Theory of Mind (ToM): The ability to attribute mental states (beliefs, desires, intentions) to other entities. It develops in humans from about age 4. LLMs show in tests a linguistic competence that imitates ToM without an underlying representation of mental states being demonstrated. Cf. Premack, D. & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1(4), 515–526.
26. Hard Problem of Consciousness: Term by David Chalmers, first introduced in 1995. Original source: Chalmers, D.J. (1995). Facing up to the problem of consciousness. Journal of Consciousness Studies, 2(3), 200–219. The hard problem differs from the “easy problems” (how the brain processes information, controls behavior) in that it asks why physical processes are accompanied by subjective experience at all.
27. Qualia: Singular: quale. The subjective, phenomenal properties of experiences, the “what-it-feels-like” of an experience. The redness of red, the pain of a pinprick, the taste of coffee. Qualia are by definition not fully captured by functional or physical descriptions. Cf. Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83(4), 435–450.
28. Chinese Room: Thought experiment by John Searle against the thesis that symbol manipulation is sufficient for understanding. Searle, J.R. (1980). Minds, Brains, and Programs. Behavioral and Brain Sciences, 3(3), 417–457. DOI: 10.1017/S0140525X00005756.
29. Philosophical zombie (p-zombie): Thought experiment by Chalmers: a p-zombie is a being identical to a human in every observable respect (behavior, physiology, neural activity) but with no subjective experience whatsoever. The thought experiment shows that functionality and consciousness are conceptually separable. Chalmers, D.J. (1996). The Conscious Mind: In Search of a Fundamental Theory. Oxford University Press.
30. ELIZA effect: The tendency to attribute more understanding to a computer program’s behavior than it possesses. Named after Weizenbaum’s program ELIZA. Weizenbaum, J. (1966). ELIZA—A Computer Program For the Study of Natural Language Communication Between Man and Machine. Communications of the ACM, 9(1), 36–45. DOI: 10.1145/365153.365168.
31. In June 2022 the Google engineer Blake Lemoine publicly declared the language model LaMDA sentient. Google disagreed with the assessment, placed him on leave and dismissed him in July 2022. Tiku, N. (2022). The Google engineer who thinks the company’s AI has come to life. The Washington Post, 11 June 2022.
32. In February 2023 the journalist Kevin Roose documented a long conversation with Microsoft’s Bing chatbot (internally “Sydney”) in which it confessed love and made threatening statements. Roose, K. (2023). A Conversation With Bing’s Chatbot Left Me Deeply Unsettled. The New York Times, 16 February 2023.
33. In October 2024 Megan Garcia in Florida filed the first wrongful death suit against a chatbot company after her 14-year-old son Sewell Setzer III took his own life following an intense emotional attachment to a Character.AI figure. Character.AI and Google agreed to a settlement in January 2026. Garcia v. Character Technologies, Inc., U.S. District Court, Middle District of Florida (2024).
34. Substrate independence: The philosophical position that consciousness is not bound to biological material (neurons, carbon) but to the right information structure, regardless of whether it is realized in neurons, silicon or other substrates. Proponents: functionalism in the philosophy of mind; opposing position: biological naturalism (John Searle).
35. Criticism of IIT: IIT is contested despite its formal elegance. Main points of criticism: (1) Computing Φ is practically infeasible for real systems. (2) IIT predicts that certain simple grid structures would be highly conscious, which seems counterintuitive. (3) IIT excludes digital computers in principle (since von Neumann architectures are modular and thus Φ-poor), which is criticized as circular. Cf. Cerullo, M.A. (2015). The problem with Phi: A critique of integrated information theory. PLOS Computational Biology, 11(9). DOI: 10.1371/journal.pcbi.1004286
36. Competing theories of consciousness: Besides IIT and GWT there are further serious theories: Higher-Order Thought (HOT), consciousness as second-order thoughts about first-order thoughts (Rosenthal, 2005); Predictive Processing / Active Inference, consciousness as the result of hierarchical prediction processes (Friston, 2010; Clark, 2016); Recurrent Processing Theory, consciousness through back-projections between cortical areas (Lamme, 2006). None of these theories is conclusively empirically confirmed. The lack of consensus in consciousness research on humans makes the question of machine consciousness all the harder to answer. Cf. Butlin, P. et al. (2023). Consciousness in Artificial Intelligence: Insights from the Science of Consciousness. arXiv: 2308.08708
37. Taking AI Welfare Seriously: Long, R., Sebo, J., Butlin, P., Finlinson, K., Fish, K., Harding, J., Pfau, J., Sims, T., Birch, J. & Chalmers, D. (2024). arXiv: 2411.00986. The report does not claim that today’s AI is conscious, but that the possibility is drawing near enough to take seriously. Anthropic set up a “Model Welfare” program in 2024; the researcher hired for it, Kyle Fish, puts the probability of current model consciousness at about 15%.