Geoffrey Hinton Thinks We’re Calling AI’s Biggest Flaw by the Wrong Name

When Neil deGrasse Tyson asked Geoffrey Hinton about AI “hallucinations,” Hinton did not hedge. We should stop calling them hallucinations, he said. We should call them confabulations.

That sounds like a professor’s correction. It is much bigger than that.

The word “hallucination” has done a lot of cultural work for the AI industry. It makes the failure sound freakish, like a visual glitch in an otherwise healthy machine. The model saw something that was never there. Fix the optics, tune the system, move on.

“Confabulation” points somewhere more unsettling. It suggests the model is not malfunctioning in an exotic way. It is producing a plausible account from imperfectly stored traces, which is much closer to how human memory works than most people want to admit.

That shift matters because names set expectations. If these systems hallucinate, then factual perfection feels like a reachable engineering milestone. If they confabulate, then error is tied to how their knowledge is represented in the first place.

The wrong metaphor has been steering the conversation

In ordinary speech, hallucination means perceiving something unreal. A person sees a dog in the room. No dog exists. The experience itself is the problem.

Confabulation is different. In psychology, it describes the unconscious production of a story that feels true to the speaker but contains false details. The person is not lying. They are not trying to deceive anyone. They are assembling a coherent account from fragments, patterns, and gaps.

That is a much better description of what large language models do. They do not typically “see” phantom facts. They generate an answer that fits the prompt, the context, and the patterns stored in their parameters. When the fit is loose or the underlying representation is incomplete, the result can still sound polished. It may even preserve the shape of the truth while getting the specifics wrong.

This is why the common promise that hallucinations will soon disappear has always felt too neat. You can reduce them. You can route around them with retrieval systems, better training data, stronger post-processing, and domain-specific constraints. You can make them rarer and easier to catch. But if the system’s core operation is reconstruction from distributed weights, some degree of confabulation is not a side effect. It comes with the territory.

John Dean showed the pattern long before chatbots did

Hinton used a famous example from human memory to make the point. During the Watergate scandal, John Dean, Nixon’s White House counsel, gave detailed testimony about conversations and meetings around the cover-up. His memory seemed extraordinary. He recalled who said what, how discussions unfolded, and what the atmosphere in the room had been like.

Then the tapes surfaced.

Because Nixon had secretly recorded Oval Office conversations, psychologists later had an unusual chance to compare confident human memory with a near-verbatim record. In a classic 1981 paper, the psychologist Ulric Neisser compared Dean’s testimony against those tapes and found something deeply instructive. Dean had not delivered a faithful replay of events. He got many details wrong. He misplaced people in meetings. He blended separate conversations together. He shifted chronology. He attributed remarks to the wrong individuals.

Yet the testimony was not worthless. In broad structure, Dean had captured the reality of the cover-up. He remembered its logic, its pressure, its intent. The granular details wobbled. The deeper pattern held.

That is confabulation in its most useful and most dangerous form. It is useful because it preserves meaning when exact recall fails. It is dangerous because confidence in the reconstructed version can be very high.

The image most people still carry for memory is a filing cabinet or a hard drive. Something happened, then a record got stored, and later we retrieve it. Neisser’s work, and a great deal of cognitive science after it, makes that image hard to defend. Memory is not passive storage plus playback. It is active rebuilding.

The brain does not keep transcripts

Hinton’s explanation on StarTalk was blunt and elegant. If you remember something that happened recently, he said, it is not because a clean file is sitting somewhere in the brain. The experience changed the strengths of neural connections. When you recall the event, you construct something from those altered connections that resembles what happened.

That word, construct, is the hinge.

Human memory stores dispositions, associations, salience, compressed traces. When you remember your last birthday dinner, you do not pull a perfect recording from a shelf in your skull. You regenerate a scene from distributed changes in synaptic strength. Some parts are vivid because they mattered. Some are blurry because they did not. Some are quietly filled in because the mind prefers continuity over blank space.

Language models work in a related way, even if the analogy should not be pushed too far. They do not keep a neat library of sentences with labels attached. Training adjusts a vast field of numerical weights. Those weights capture statistical regularities, semantic relationships, stylistic patterns, and a surprising amount of world structure. When prompted, the model generates a response from that compressed internal state.

This is why a model can answer in a way that feels informed without actually retrieving a source. It is rebuilding an answer from learned patterns. Sometimes the reconstruction lands cleanly on the facts. Sometimes it produces a detail that is locally plausible and globally wrong. The machine did not pull a fake citation from a secret drawer. It generated something citation-shaped because the prompt demanded one and its internal map lacked a reliable anchor.
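
A toy model makes the distinction concrete. The sketch below is not a transformer; it is a tiny bigram model in Python, and the corpus is invented for illustration. But it shows the core move Hinton is describing: training keeps only compressed statistics, the original sentences are discarded, and generation rebuilds fluent text from those statistics, sometimes splicing fragments into sentences nobody ever wrote.

```python
# A deliberately tiny illustration, not how a real language model works:
# "training" keeps only word-pair counts and discards the sentences, so
# generation must reconstruct text from compressed statistics.
import random
from collections import defaultdict

corpus = [  # invented sentences, for illustration only
    "dean recalled the meeting in march",
    "the president discussed the cover up in june",
    "dean discussed the president in the meeting",
]

follows = defaultdict(list)  # the "weights": which word can follow which
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        follows[a].append(b)

def generate(start, length=8, seed=2):
    random.seed(seed)
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))
    return " ".join(out)

# Fluent and locally plausible, yet possibly a sentence no source contains:
print(generate("dean"))
```

Scale that idea up from word pairs to billions of weights and you get systems whose fluency is real while their anchoring is not guaranteed.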

The resemblance to human recall is not total. People have bodies, goals, emotions, and lived episodes. Models do not remember their childhood because they do not have one. But at the level Hinton is pointing to, the comparison is still powerful. Both systems can store knowledge in connection strengths and later produce outputs that are reconstructive rather than archival.

The deeper problem is confidence, not fabrication alone

Hinton also gave the problem a second, less poetic name: artificial overconfidence.

That may be the most practical phrase in the entire exchange. Humans confabulate, but many real situations contain friction that reveals uncertainty. We hesitate. We qualify. We notice a mismatch in someone else’s face and back up. Memory is flawed, yet social life often exposes the cracks.

Language models have a different presentation layer. The answer arrives in complete sentences, with smooth syntax and no visible sweating. The prose does not signal whether the content came from a strong internal representation, a weak pattern match, or a statistical shrug wearing a tie.

This is why people over-trust them in high-stakes settings. In medicine, law, finance, or customer support, a graceful falsehood can do more damage than a blunt “I don’t know.” A clinician who uses a model to summarize literature does not need the model to be charming. They need it to distinguish between grounded synthesis and confident invention. The same goes for lawyers after the now-famous cases involving fabricated citations. The failure was not only that the models were wrong. It was that they were wrong in a format designed to look settled.

Calling these errors confabulations makes that design problem easier to see. The issue is not just factual accuracy. It is calibration. A system that reconstructs should say when its footing is weak.
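
Calibration is also measurable, which is part of why the framing helps. One standard diagnostic is expected calibration error: bucket a model’s stated confidences, then check how often the answers in each bucket were actually right. The sketch below is a minimal version in Python; the confidence and correctness numbers are made up for illustration.

```python
# Minimal expected calibration error (ECE): compare stated confidence
# with observed accuracy inside confidence buckets, weighted by bucket size.
def expected_calibration_error(confidences, correct, n_bins=10):
    n = len(confidences)
    ece = 0.0
    for i in range(n_bins):
        lo, hi = i / n_bins, (i + 1) / n_bins
        bucket = [(c, ok) for c, ok in zip(confidences, correct) if lo < c <= hi]
        if not bucket:
            continue
        avg_confidence = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_confidence - accuracy)
    return ece

# Hypothetical numbers: a system that says 0.9 but is right half the time
# is overconfident, and the gap shows up directly in the score.
confidences = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
correct     = [1,   0,   1,   0,   1,   0]
print(expected_calibration_error(confidences, correct))  # larger is worse
```

Artificial overconfidence, in this frame, is simply a persistent gap between the confidence a system states and the accuracy it earns.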

Better systems will admit how they know

Once you accept the confabulation frame, some design choices come into focus.

A general-purpose model answering from parametric memory alone will always have a fuzzy edge. If you want reliability, you give it tools that can pin its reconstruction to external evidence. Retrieval helps because it changes the job. The model is no longer inventing an answer from weights alone; it is synthesizing around actual documents. Source-grounded generation helps for the same reason. So do interfaces that expose provenance, confidence bands, and disagreement between candidate answers.
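
In code, the change of job is small but telling. The sketch below assumes two stand-ins that are not any particular library’s API: `search_index` fetches candidate documents and `call_model` sends a prompt to a language model. Everything else is the pattern itself: retrieve first, instruct the model to stay inside the evidence, and return the sources alongside the answer.

```python
# A minimal sketch of retrieval-grounded answering. `search_index` and
# `call_model` are hypothetical stand-ins, not a specific library's API.
def answer_with_sources(question, search_index, call_model, k=3):
    docs = search_index(question, k=k)  # pin the reconstruction to evidence
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    prompt = (
        "Answer using ONLY the numbered sources below, citing them as [n]. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    answer = call_model(prompt)
    return answer, docs  # expose provenance, not just prose
```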

This does not make the problem vanish. Retrieval can fetch the wrong document. Citations can be mechanically attached without true support. Confidence estimates can themselves be poorly calibrated. But these methods at least respect the architecture of the problem. They are not pretending the model is a tape recorder. They are giving a reconstructive system something firmer to reconstruct around.

It also suggests a healthier social contract with AI. In some contexts, these systems are excellent partners for drafting, summarizing, translating, and pattern-finding. In other contexts, they behave like an eloquent witness with an impressive memory for themes and a shaky grip on specifics. You do not dismiss that witness entirely, and you definitely do not let them testify uncorroborated.

The popular debate often swings between awe and contempt. Either the model is a genius in silicon, or it is a stochastic fool in autocomplete cosplay. Confabulation points to a stranger middle ground. A system can capture structure, reason usefully across a problem, and still manufacture details because reconstruction is part of its normal operation.

The name changes the job ahead

Renaming hallucinations as confabulations does not solve anything by itself. It does something more valuable. It gives us a more truthful mental model.

If you think these systems fail because they occasionally glitch, you will wait for a patch that makes them perfectly factual. If you think they reconstruct from distributed traces, you will build products, workflows, and institutions that expect verification. That is how we already treat human memory when it matters. We cross-check witnesses, demand records, compare accounts, and treat confidence as evidence only when it is earned.

Hinton’s point lands because it shrinks the distance between human cognition and machine output in an uncomfortable place. The comparison does not prove that models understand the world as people do. It does show that one of the easiest ways to dismiss them has been too simple. Their most notorious failure is not evidence that they are alien to intelligence. It may be evidence that reconstruction and error are woven together more tightly than we wanted to believe.
