
Are Latent Representations Real? What the Aharonov-Bohm Effect Suggests About AI

For a long time, physics treated potentials like scaffolding. Useful, elegant, disposable.

You wrote down a gravitational or electromagnetic potential because it made equations manageable. Then you differentiated it, recovered forces or fields, and moved on to the things that were supposed to be physically serious. The potential itself looked suspiciously arbitrary. Add a constant, nothing observable changes. Redefine it with a gauge transformation, the world stays put. That sounded less like reality and more like accounting.

AI talks about embeddings and latent spaces in almost exactly that tone.

The coordinates inside a model are often presented as conveniences for optimization. Rotate the basis of a hidden layer and, with the right compensating changes downstream, the model computes the same function. Individual neurons can be remixed. Absolute values in an embedding vector do not carry fixed meaning. What matters, we are told, is the behavior at the output: the next token, the classification, the image. The rest is internal plumbing.

That position is tidy. It may also be too small for what these systems are actually doing.

Physics demoted potentials for two centuries

The original objection to potentials was not stupid. It was clean.

In classical mechanics and electromagnetism, observable effects seemed to come from forces and fields. If you wanted to know how a planet moved, you could compute gravitational attraction directly. If you wanted to know how a charge behaved, electric and magnetic fields did the explanatory work. Potentials helped because they compressed complicated vector bookkeeping into smoother scalar or vector functions. They were a trick with very good taste.

But the arbitrariness would not go away. A gravitational potential can be shifted by a constant without changing the force. In electromagnetism, the vector potential (A) can be altered by the gradient of a scalar function, and the magnetic field remains the same. If multiple mathematical descriptions generate the same measurable world, the natural instinct is to say the descriptive surplus is not physically real.
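In symbols, sticking to the magnetostatic case the paragraph describes: shift the vector potential by the gradient of any scalar function and the magnetic field does not move, because the curl of a gradient vanishes identically.

```latex
\mathbf{A} \;\to\; \mathbf{A} + \nabla\chi
\qquad\Longrightarrow\qquad
\mathbf{B} \;=\; \nabla \times (\mathbf{A} + \nabla\chi)
\;=\; \nabla \times \mathbf{A}.
```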

Most textbook treatments spoke for that tradition. Potentials were useful intermediates. The fields were what nature cared about.

That view held because the experiments seemed to support it. If the field at a point determines the force on a particle at that point, then the field looks like the local bearer of physical influence. Potentials become an elegant detour, the kind theorists love and experimentalists tolerate.

Then quantum mechanics made the detour look a lot like the road.

A phase shift with no local force

In 1959, Yakir Aharonov and David Bohm proposed an effect that still feels slightly illegal when you first meet it.

Take a solenoid carrying magnetic flux. Arrange things so the magnetic field is confined inside it. Send electrons around the outside along two paths, one on each side, and let them interfere. In the region the electrons travel through, the magnetic field is zero. By the ordinary field-first story, nothing about the enclosed magnetism should matter. No field where the electrons are, no influence on their motion.

Quantum theory predicts otherwise.

The electron wavefunction picks up a phase that depends on the line integral of the vector potential along the path. Two paths that enclose different magnetic flux acquire different phases, even though the electrons never pass through a region with nonzero magnetic field. When the beams recombine, the interference fringes shift. The pattern on the screen changes because of something the old picture said was merely auxiliary.
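The phase in question is the line integral of the vector potential around the interference loop; by Stokes' theorem it equals the enclosed flux, which is gauge-invariant even though the potential itself is not:

```latex
\Delta\varphi
\;=\; \frac{q}{\hbar}\oint_{\mathcal{C}} \mathbf{A}\cdot d\boldsymbol{\ell}
\;=\; \frac{q}{\hbar}\,\Phi_{\mathrm{enc}}.
```

So two paths that enclose different flux interfere with a relative phase set by what lies between them, even where the field is zero along both paths.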

Later experiments, especially Tonomura’s 1980s work with a carefully shielded magnetic torus, made the result hard to explain away. The electrons were excluded from the field-bearing region. The phase shift remained.

This did not mean that the raw numerical value of the potential at a point suddenly became sacred. Gauge freedom survives. What became physical was the gauge-invariant structure captured by the potential across a path or around a loop. In modern language, the important object is closer to a holonomy than to a local number scribbled in a notebook. The redundancy in description is real, but it does not erase the thing being described.

That distinction matters for AI.

When people say latent representations are “just coordinates,” they are making an argument from arbitrariness that sounds a lot like the pre-Aharonov-Bohm argument about potentials. And just as in physics, the interesting reply is not that every coordinate must be literally real. The stronger reply is that invariant structure can be real even when any one coordinate system is not.

AI makes the same move with embeddings

Consider a language model’s hidden state. At each layer, each token is represented by a high-dimensional vector. That vector can be transformed in many ways without changing the model’s overall function, provided the surrounding weights are adjusted consistently. Researchers know this well. It is one reason single-neuron stories are so fragile. A neuron that looks like the “France neuron” today can dissolve into a distributed pattern tomorrow after a basis change.
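The basis freedom, and what survives it, is easy to demonstrate on a toy linear sublayer. (With nonlinearities the symmetry group shrinks to permutations and scalings, so the linear case is the cleanest version of the point; all sizes here are invented.)

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(16, 8))    # writes into the hidden space
W_out = rng.normal(size=(4, 16))   # reads out of the hidden space
x = rng.normal(size=8)

y = W_out @ (W_in @ x)

# Rotate the hidden basis by a random orthogonal R and compensate
# downstream with R.T: the computed function is unchanged.
R, _ = np.linalg.qr(rng.normal(size=(16, 16)))
y_rot = (W_out @ R.T) @ (R @ (W_in @ x))
assert np.allclose(y, y_rot)

# But the geometry of the hidden space survives the rotation:
# angles between hidden vectors are exactly preserved.
h1 = W_in @ x
h2 = W_in @ rng.normal(size=8)
cos = lambda a, b: a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(cos(h1, h2), cos(R @ h1, R @ h2))
```

The individual coordinates of the hidden vector mean nothing after the rotation; the angles and distances between hidden vectors mean exactly what they did before.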

From that, a common conclusion follows: hidden representations are not the right level of reality. They are implementation details of a function approximator. If the outputs stay fixed, the latent space is negotiable.

There is truth in that. A particular axis in a 12,288-dimensional space is usually not a natural kind. Treating one coordinate as if it were a tiny homunculus with a label is how you end up shipping a demo and calling it science.

Still, the jump from “coordinates are arbitrary” to “the representation has no reality worth talking about” is too quick. Geometry survives coordinate changes. Relative directions survive. Subspaces survive. The way information becomes linearly available to later computations survives. If two different bases preserve the same task-relevant relations, the arbitrariness lives at the naming layer, not necessarily at the level of structure.

That is already how practitioners behave when they stop talking philosophy and start debugging models.

They train probes to recover syntax, sentiment, factual attributes, speaker identity, or code structure from hidden states. They align representations across models trained from different random seeds. They find that certain high-level distinctions become easier to decode in middle layers, then harder again, as the model transforms them for downstream use. Work on representation engineering and activation steering has shown that shifting activations along particular directions can change behavior in surprisingly semantically coherent ways. Sparse autoencoders, despite all their limitations, often recover recurring features that look less like random coordinates and more like reusable internal factors.
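A linear probe, the workhorse tool above, is a few lines. Everything here is synthetic: we plant a binary attribute along one random direction in fake "hidden states" and check whether a least-squares probe can decode it on held-out examples. Decodability shows the information is linearly available, not that the model actually uses it.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 32, 400

# Synthetic "hidden states": Gaussian noise plus a planted binary
# attribute along one direction the probe does not know in advance.
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=n)
hidden = rng.normal(size=(n, d)) + 1.5 * np.outer(2 * labels - 1, direction)

# Fit a linear probe by least squares on the first 300 examples,
# then decode the attribute on the held-out 100.
w, *_ = np.linalg.lstsq(hidden[:300], 2 * labels[:300] - 1, rcond=None)
preds = (hidden[300:] @ w > 0).astype(int)
accuracy = (preds == labels[300:]).mean()
print(f"probe accuracy: {accuracy:.2f}")
```

Note that the planted direction is an arbitrary axis in an arbitrary basis; the probe recovers the attribute anyway, because what it fits is a direction, not a coordinate.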

None of this proves that latents are “real” in a metaphysical sense. It does something more practical. It suggests that there are stable invariants inside the computation, and those invariants carry explanatory weight.

Invariant geometry is the real candidate

The cleanest version of the claim is smaller than it first appears.

You do not need to believe that an embedding vector corresponds to a little concept-object inside the machine. You only need to believe that some relational structure in representation space is indispensable for explaining what the model can do. Reality, here, means explanatory and intervention-worthy. If changing a structured feature reliably changes behavior, and if that feature recurs across training runs or architectures after suitable alignment, dismissing it as “mere coordinates” starts to sound evasive.

Word embeddings offered an early, almost cartoonishly friendly version of this. Relations like king minus man plus woman landing near queen were oversold in the popular imagination, but they were not hallucinated. They pointed to a geometric regularity: semantic and syntactic differences could become approximately linear in learned spaces. The exact axes were arbitrary. The relations were not.
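The analogy arithmetic can be made concrete with hypothetical toy embeddings (invented for illustration, not taken from any trained model): each vector is a rough "concept" component plus a shared gender offset, so the relation holds approximately by construction, as it does statistically in learned spaces like word2vec or GloVe.

```python
import numpy as np

# Toy embedding table: hypothetical vectors, illustration only.
vocab = {
    "king":   np.array([0.9, 0.1,  0.8]),
    "queen":  np.array([0.9, 0.1, -0.7]),
    "man":    np.array([0.1, 0.2,  0.9]),
    "woman":  np.array([0.1, 0.2, -0.8]),
    "banana": np.array([0.0, 0.9,  0.0]),
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# king - man + woman, then find its nearest neighbor among the
# words not already in the analogy.
target = vocab["king"] - vocab["man"] + vocab["woman"]
candidates = [w for w in vocab if w not in {"king", "man", "woman"}]
best = max(candidates, key=lambda w: cosine(target, vocab[w]))
print(best)  # with these toy vectors: queen
```

Rotate every vector in the table by the same orthogonal matrix and the nearest neighbor does not change: the axes are arbitrary, the offset between the gendered pairs is not.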

Modern deep models complicate the picture because their representations are contextual, layered, and nonlinear. A token does not have one embedding in the old static sense. “Bank” in a finance sentence and “bank” in a river sentence are not roommates. They are distant cousins who only share a name. Even so, contextual models still produce latent geometries with regularities that downstream layers exploit with remarkable efficiency.

This is where the analogy to potentials becomes useful. The question is not whether one number in one basis deserves ontological worship. The question is whether the equivalence class of structures that survive allowed transformations is doing real work. In physics, gauge-dependent descriptions can still encode gauge-invariant effects. In AI, basis-dependent activations can still encode basis-invariant computation.

A model’s hidden state is not meaningful because we can point to coordinate 1847 and declare it the honesty dimension. It is meaningful because families of transformations, distances, projections, and trajectories can remain behaviorally consequential across many redescriptions.

Internal paths may carry meaning

The Aharonov-Bohm effect adds one more wrinkle that feels especially relevant now: path dependence.

The measurable effect is not the local value of the magnetic field where the particle sits. It is the accumulated phase along alternative routes. The global structure of the situation matters. Two journeys through apparently empty space are not equivalent because of what they enclose.

Inside large models, we may need a similar shift in attention. Much interpretability work still hunts for local explanations: this neuron, this attention head, this MLP feature. Sometimes that works. Often it produces the mechanistic equivalent of judging a novel from three highlighted sentences.

A lot of computation in transformers is path-like. Information is injected into the residual stream, reweighted by attention, transformed by MLPs, and passed forward through many layers. What later becomes linearly decodable at one point may depend on a cumulative sequence of small updates that are individually unremarkable. Anyone who has looked at activation patching results has seen versions of this. A behavior can hinge on a route through the network rather than on one glaring activation spike.
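Activation patching itself can be sketched in a few lines on a toy residual network (all names and sizes here are invented): run a clean input and cache the residual stream, then splice one cached layer into a corrupted run and see how much of the clean output comes back.

```python
import numpy as np

rng = np.random.default_rng(1)
layers = [rng.normal(size=(6, 6)) / 3 for _ in range(4)]

def forward(x, cache=None, patch_layer=None, patch_value=None):
    """Toy residual network. Optionally record the residual stream,
    or overwrite it at one layer (the core move of activation patching)."""
    h = x
    for i, W in enumerate(layers):
        h = h + np.tanh(W @ h)      # residual update
        if cache is not None:
            cache.append(h.copy())
        if patch_layer == i:
            h = patch_value         # splice in the cached activation
    return h

x_clean, x_corrupt = rng.normal(size=6), rng.normal(size=6)
cache = []
y_clean = forward(x_clean, cache=cache)
y_corrupt = forward(x_corrupt)

# Patch the clean layer-2 residual into the corrupted run. Because all
# downstream computation flows through the residual stream, the clean
# output is fully restored from that point on.
y_patched = forward(x_corrupt, patch_layer=2, patch_value=cache[2])
assert np.allclose(y_patched, y_clean)
assert not np.allclose(y_corrupt, y_clean)
```

In a real transformer the restoration is rarely total, and where along the depth the patch starts to matter is exactly the path-like information the paragraph above describes.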

That does not give us an AI version of a line integral in the strict mathematical sense. The analogy should not be stretched until it snaps. But it does suggest a better research instinct. Instead of asking only what a hidden state is at one layer, we should ask what invariant quantity accumulates across the computation and which transformations leave that quantity intact. In other words, the right object may be a trajectory through representation space, or a family of such trajectories, rather than a frozen latent snapshot.

This is one reason attention circuits and residual stream analyses feel more promising than neuron mythology. They move from isolated units toward structured transport. The model is not storing meaning in one place like a badly organized garage. It is moving and reshaping constraints across depth.

The analogy has teeth, and also limits

Physics earns its confidence with experimental sharpness. AI rarely does.

The Aharonov-Bohm effect has a precise theoretical description and a decisive empirical signature. Hidden representations in neural networks are messier. Training is contingent. Architectures vary. Interpretability tools can manufacture neat stories out of noise if you let them. Probe performance can reflect information made decodable by the probe rather than explicitly used by the model. Alignment between models is often approximate and brittle. Some apparent features vanish under retraining.

So the analogy should be used as a lens, not a verdict.

There is also a deeper difference. In gauge theory, redundancy is built into a physical formalism that we independently trust. In neural networks, many internal descriptions are underdetermined because we built the system as a flexible function learner. Nature did not hand us a privileged decomposition; gradient descent handed us one of many serviceable settlements. That makes ontological claims about a given latent feature much riskier than ontological claims about electromagnetic structure.

And yet the old objection from arbitrariness still fails in the same way. Underdetermination by coordinates does not imply absence of structure. If anything, modern AI gives that lesson fresh urgency. We already know that models can converge on recurring internal organizations because they need some efficient way to carry information from input to output. The fact that multiple equivalent bases exist does not erase the pressure toward usable geometry.

Aharonov once said he was “very ignorant, fortunately.” There is wisdom in that line. Entire fields can inherit a philosophical posture from old tools and keep it long after the tools have changed. Physics treated potentials as second-class longer than it should have because the old intuition felt natural. AI may be doing something similar when it talks as if only outputs are respectable and everything inside the model is bookkeeping.

Reality without fixed coordinates

The useful question is not whether latent representations are real in the same way rocks are real. That framing invites a sterile fight nobody can win.

The useful question is whether there are internal structures we can identify, align, and intervene on such that behavior changes in regular, reproducible ways. If the answer keeps coming back yes, then “just vectors” will age poorly. It will sound like “just a potential” sounded before phase shifts showed up on the screen.

The most interesting objects inside AI systems may turn out to be neither individual neurons nor raw embeddings, but invariant geometries and cumulative paths through them. That is a more modest claim than saying the latent space is the secret seat of truth. It is also stronger, because it ties meaning to what survives transformation and to what can be experimentally manipulated.

Physics needed nearly two centuries to learn that a mathematical convenience could carry indispensable structure. AI has barely started that argument, and the systems are already weird enough to deserve better philosophy than coordinate nihilism.


Published April 2026