Why Physics Can Mislead You About Physical Reality
If you want to learn about the nature of physical reality, naturally, you would turn to physics. It would seem a bit contradictory to say that physics itself can mislead you about the nature of physical reality. Yet, this can actually happen, and let me explain.
Any physical theory can be formulated in several mathematically equivalent ways. Yet, some formulations make calculations more difficult than others. Naturally, physicists gravitate towards the formalism in which calculations are simplest to perform.
Before wave mechanics, there was matrix mechanics, as developed by Heisenberg. Matrix mechanics is mathematically equivalent to wave mechanics, and so it gives all of the same predictions. When Schrodinger developed what became the standard formulation of quantum mechanics, he referred to it as wave mechanics to distinguish it from Heisenberg’s formulation.
Over time, it became less and less common to use matrix mechanics because it is not as easy to perform calculations in, and thus wave mechanics became ubiquitous. All the discussion around interpreting quantum theory then became centered around interpreting the meaning of the wave function, a feature unique to wave mechanics.
Why was wave mechanics developed, then, if matrix mechanics is mathematically equivalent? What was the motivation to develop a mathematically equivalent formalism? It would seem redundant to spend time trying to formulate the same theory in different mathematical language without any empirical differences in your calculated predictions.
Schrodinger’s motivation was not that he thought matrix mechanics was empirically wrong, nor was it a desire to mathematically simplify the theory. His explicit reasoning was that he believed matrix mechanics gave a misleading physical picture of what is actually going on.
Even though two different formulations of the same theory may be mathematically and empirically equivalent, this does not imply that they are physically equivalent.
In Einstein’s theory of special relativity, he derives that space and time are relative from the assumption that the one-way speed of light is absolute. It is possible to construct a mathematically equivalent model that makes all the same empirical predictions by assuming that the one-way speed of light is relative, and, in such a model, space and time would be absolute. This is known as Lorentz ether theory, and it is mathematically and empirically equivalent to special relativity.
Would you argue that a universe where space and time are absolute is the same as one where they are not? I do not think so. Those two mathematical formulations clearly give you a different physical picture as to what is going on, even if they are mathematically equivalent, and even if they are empirically equivalent.
Schrodinger’s criticism of matrix mechanics was that it is discontinuous. Particles did not continuously evolve from interaction A to interaction B. Particles just have a quantum state at interaction A and then have another at B. Schrodinger derided this, saying, “I cannot believe that the electron hops around like a flea,” and so he sought to give a more continuous account of what the particle is doing between interactions A and B.
However, in his book Science and Humanism, Schrodinger changes his mind. He argues that wave mechanics failed in its quest to build a continuous picture of reality because it merely replaces the discontinuous gap between one interaction and the next with a discontinuous gap between the quantum state and empirical observation: the wave function is spread out over many different possibilities, yet “collapses” into just one when you observe it. Schrodinger thus goes on to argue that he believes the “gap” picture is actually the more physically correct picture.
Clearly, particles hopping from one interaction to the next, and particles spreading out as waves and collapsing when they interact with a measuring device, are not the same physical picture. Yet, matrix mechanics is indeed mathematically and empirically equivalent to wave mechanics, despite the fact that it physically implies particles never spread out as waves, do not even meaningfully exist in between interactions, and just hop from one interaction to the next.
You can’t disprove this with any sort of measurement because a measurement is by definition an interaction, and so measurements only tell you what particles are doing during interactions, not in between them. This ultimately gets us to why I argue that physics can mislead you about physical reality.
Physicists always favor the mathematical model that is the most mathematically simple and the easiest to carry out calculations and predictions in. Different equivalent mathematical formalisms provide different physical pictures of what is going on. There is no convincing philosophical argument that the simplest one necessarily provides the most accurate physical picture of what is really going on.
Imagine that you have a security system that tracks when people enter the doors of the building, and the building has two doors. The state of the computer system at any given moment would be two bits of information, those being 00 for no one at either of the two doors, 01 or 10 for a single person at one of the doors or the other, or 11 for two people accessing both doors at the same time.
The security system frequently logs this two-bit state along with a timestamp, and so you can recover what is really going on at the two doors at any point in time just by looking at the logs.
Now, let’s say the building is in a small town where people rarely access the doors at all because there just aren’t many people, so statistically you would practically never see both doors accessed at once. In such a scenario, the case of 11 would in practice never occur, so we can assume that both doors are never accessed at the same time. You could also simplify the log by dropping 00 and just not recording anything in the log at all when neither door is being accessed.
You could therefore simplify this entire thing by mapping 01 to 0 and 10 to 1 to represent which door is being accessed, and then only log a timestamp if one or the other is being accessed, and not log anything if neither is being accessed. You now have a much simpler encoding of the system that stores much less information, and only describes the doors using 1 bit of information rather than 2.
If a hacker hacked into the company’s database and found the log of the door behavior, but had no idea that this is what the log represents, they would just see a single bit value being logged at random times. They therefore might assume that the single bit value is tied to the state of some single object. It would not be immediately obvious to them that this compressed encoding is actually representing the physical state of two different physical objects, two different doors that are technically independent of one another but are simply assumed, in the given encoding, to be accessed only one at a time.
If they guessed it was a door log, they would likely guess that it records the state of a single door, with 1 meaning it is open and 0 meaning it is closed. Nothing would make it immediately obvious to them that 0 is just an encoding for 01, one door being closed and the other open, and that 1 is an encoding for 10, the reverse, so that the single bit describes the behavior of both doors and not a single door or a single object.
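The compression described above can be sketched in a few lines of Python. This is only an illustration: the function names, the (door A, door B) tuple layout, and the sample readings are assumptions made up for the example, not part of any real security system.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: a reading is (door_a, door_b), each bit 0 or 1.
Reading = Tuple[int, int]


def compress(reading: Reading) -> Optional[int]:
    """Map a two-bit door state to the one-bit log entry (None = log nothing)."""
    if reading == (0, 0):
        return None  # nobody at either door, so no entry is written
    if reading == (0, 1):
        return 0
    if reading == (1, 0):
        return 1
    # The encoding simply assumes 11 never happens; it has no way to store it.
    raise ValueError("11 cannot be represented in the one-bit encoding")


def decompress(bit: int) -> Reading:
    """Recover the two-door state, which requires already knowing the mapping."""
    return (0, 1) if bit == 0 else (1, 0)


def build_log(samples: List[Tuple[int, Reading]]) -> List[Tuple[int, int]]:
    """Keep only (timestamp, bit) pairs for moments when a door was accessed."""
    log = []
    for timestamp, reading in samples:
        bit = compress(reading)
        if bit is not None:
            log.append((timestamp, bit))
    return log


# Made-up sample readings: (timestamp, two-bit state).
samples = [(0, (0, 0)), (1, (0, 1)), (2, (0, 0)), (3, (1, 0))]
log = build_log(samples)
```

Nothing in the resulting log says that two doors exist. Only someone who already has `decompress` in hand can turn a logged bit back into the state of both doors.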
Hence, the mathematical simplification leads to physical confusion if you don’t already know the original mathematics that this was simplified from. This might seem like a contrived example, but there are real examples like this where this confusion becomes apparent.
The Mach-Zehnder interferometer uses two beam splitters to illustrate interference effects whereby the photon is guaranteed to only leave the second beam splitter on a single trajectory, which we will call trajectory A. If you introduce a measuring device to see where the photon is in between the two beam splitters, it alters the outcome such that the photon can leave the second beam splitter on two possible trajectories, which we will call A and B.
Now, imagine that we have a measuring device that may or may not be broken. If it’s broken, it won’t function as a measuring device at all, so the photon will be guaranteed to leave on only the single trajectory A, yet if it’s functional, the photon can leave on either trajectory A or B.
There is a famous paradox called the Elitzur–Vaidman paradox that points out that in the second case, it is not guaranteed that the measuring device will measure the photon. Hence, it is possible that the measuring device does not actually register the presence of the photon, but also that the photon leaves on trajectory B, which is only possible if the measuring device is functional. Hence, you would learn from measuring the photon on trajectory B that the measuring device is functional, despite the photon apparently never having interacted with it, and therefore this seems like a spooky “interaction-free measurement.”
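The two cases can be checked numerically with a minimal sketch. It assumes a real 50/50 beam splitter matrix (a rotation by pi/4) and models a functional device as a non-destructive which-path measurement between the beam splitters. The names, the sign convention, and the choice of index 1 as trajectory A are all assumptions of this example.

```python
import math

c = s = 1 / math.sqrt(2)  # cos(pi/4) and sin(pi/4)

# Assumed beam splitter convention: a real rotation by pi/4 on the two paths.
BS = [[c, -s],
      [s, c]]


def apply(matrix, vector):
    """Multiply a square matrix by a vector, both given as plain lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def output_probs(device_works: bool):
    """Return [P(trajectory B), P(trajectory A)] after the second beam splitter."""
    state = apply(BS, [1.0, 0.0])  # photon enters on path 0, first beam splitter
    if not device_works:
        # Broken device: both paths stay coherent and interfere.
        return [abs(a) ** 2 for a in apply(BS, state)]
    # Working device: the photon is found on one definite path with the Born
    # probability, and that definite path then meets the second beam splitter.
    probs = [0.0, 0.0]
    for path in range(2):
        p_path = abs(state[path]) ** 2
        collapsed = [1.0 if k == path else 0.0 for k in range(2)]
        out = apply(BS, collapsed)
        for k in range(2):
            probs[k] += p_path * abs(out[k]) ** 2
    return probs


broken = output_probs(device_works=False)
working = output_probs(device_works=True)
```

In this convention the broken device sends everything to index 1 (trajectory A), while the working device splits the probability evenly between the two trajectories.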
There is, however, a rather simple solution to this as pointed out in the literature by this paper and this paper. If the photon can leave the second beam splitter on two possible trajectories, then surely it is a binary operator and not a unary operator. It’s possible to have a photon leave on one path or the other (01 or 10), to have no photons leave the beam splitter (00), or to have photons leave the beam splitter on both trajectories (11). If the output is binary, then surely the inputs are binary as well.
In the typical mathematical formulation of the Mach-Zehnder interferometer, it is just assumed that only a single photon will ever enter the first beam splitter, so the initial cases of 01 or 10 are the only possibilities. This allows you to then map 01→0 and 10→1, where 0 represents a single photon on one path and no photon on the other, and 1 represents the reverse. You can then simplify the system by compressing the binary operator (which would be the Givens rotation operator with an angle of pi/4) into a unary operator (the typical beam splitter operator).
This assumption is not physically true. The two paths are independent of one another, so it is indeed physically possible for two photons, or neither, to enter the first beam splitter, and the simplification arises just because we are assuming that, in the specific case we’re interested in, such an event would not happen. This is akin to how it is physically possible for people to enter both doors simultaneously, but we just assume in our encoding that such a thing wouldn’t actually happen.
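The compression from a binary to a unary operator can be illustrated with a small sketch. It treats each path as a single bit of occupancy, as the text does, and writes the binary operator as a rotation by pi/4 acting on the 01 and 10 states. The basis ordering, the sign convention, and the choice to leave 00 and 11 untouched are placeholder assumptions of this example, not a full model of two-photon behavior.

```python
import math

c = s = 1 / math.sqrt(2)  # cos(pi/4) and sin(pi/4)

# Assumed basis order for the two paths: |00>, |01>, |10>, |11>.
# The rotation only mixes |01> and |10>; |00> and |11> are left alone here
# purely as a placeholder.
BINARY_BS = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, c, -s, 0.0],
    [0.0, s, c, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

SINGLE_PHOTON = [1, 2]  # indices of |01> and |10>, relabeled 0 and 1


def restrict(matrix, indices):
    """Keep only the rows and columns belonging to the chosen basis states."""
    return [[matrix[i][j] for j in indices] for i in indices]


def apply(matrix, vector):
    """Multiply a square matrix by a vector, both given as plain lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


# The familiar 2x2 beam splitter is just the single-photon block.
UNARY_BS = restrict(BINARY_BS, SINGLE_PHOTON)

# A single photon (|01>) never leaves the single-photon subspace, which is
# why dropping the other two basis states loses nothing in that special case.
out = apply(BINARY_BS, [0.0, 1.0, 0.0, 0.0])
```

The 2x2 matrix carries no trace of the two basis states that were dropped, much like the one-bit door log carries no trace of the second door.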
This, however, leads to confusion if you look at the single bit registered by the measuring device being a 0 and then conclude the measuring device measured nothing yet somehow we learn the state of the measuring device. It makes it appear as if there is an “interaction-free measurement” going on.
The fundamental unit of quantum information is not a bit but a qubit, so the state of the system would be more accurately represented by |00⟩, |01⟩, |10⟩, and |11⟩, where a photon on one path or the other is |01⟩ or |10⟩. If the measuring device registers no photon, that means it registers |0⟩, yet |0⟩ is not nothing. It just means that the measuring device registered the value of the field to be +1 for one of its observables, but each qubit also has orthogonal observables which are perturbed by the act of measurement.
That perturbed qubit then comes together with the other qubit at the second beam splitter, and the perturbation makes them out of phase with one another, thus they interact at the second beam splitter differently than when a measuring device is not present and they are in phase with each other.
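One way to picture the effect of being out of phase is a toy sketch in which the perturbation is modeled, purely as an assumption of this example, as a relative phase applied to one path between the two beam splitters. The beam splitter convention and all names are likewise assumptions.

```python
import cmath
import math

c = s = 1 / math.sqrt(2)  # cos(pi/4) and sin(pi/4)

# Assumed beam splitter convention: a real rotation by pi/4 on the two paths.
BS = [[c, -s],
      [s, c]]


def apply(matrix, vector):
    """Multiply a square matrix by a vector, both given as plain lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def output_probs(phase: float):
    """Output probabilities when path 0 picks up a relative phase mid-flight."""
    state = apply(BS, [1.0, 0.0])       # first beam splitter
    state[0] *= cmath.exp(1j * phase)   # toy stand-in for the perturbation
    out = apply(BS, state)              # second beam splitter
    return [abs(a) ** 2 for a in out]


in_phase = output_probs(0.0)        # unperturbed: everything exits on index 1
flipped = output_probs(math.pi)     # fully out of phase: everything on index 0

# If the kick is equally likely to be 0 or pi, the two outputs become 50/50.
averaged = [(a + b) / 2 for a, b in zip(in_phase, flipped)]
```

In this toy model the photon is never "missed" by anything. The only thing that changes between the runs is the relative phase of the two paths when they recombine.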
This way of representing the system is mathematically equivalent and so it gives the same empirical predictions, yet it provides a different physical picture as to what is going on. A field is something that exists everywhere, so it is impossible for your measuring device to measure nothing at all. You are always measuring something, which is the field at that location. Since the beam splitter is a binary operator, two field modes propagate from its output, and when you measure |0⟩ or |1⟩, you are not measuring “nothing” or “something.” You are measuring the field, which just so happens to have those values for its observables at that particular moment in time, and at the same time, the measuring device perturbs the field such that it alters the outcome.
The appearance of “interaction-free measurement” ultimately derives from a mathematical simplification where we map a system with a certain number of degrees of freedom into a mathematical formalism that has fewer degrees of freedom. If you then try to interpret this formalism as the direct physical description of the system without knowing what it was distilled from, it appears as if there is an interaction-free measurement. The mathematically and empirically equivalent way of expressing the system I have discussed here is more mathematically complex, yet the paradox disappears in that framework.