Quantum Mechanics is a Local Theory
There is No “Spooky Action at a Distance”
Entanglement Guarantees Locality
Quantum mechanics is a statistical theory whereby probability amplitudes are complex-valued. It operates identically to any other probabilistic theory except that it makes sense in quantum mechanics to say that events can have a negative, or even imaginary, probability amplitude associated with them, rather than a probability that is merely between 0 and 1. This gives rise to interference effects, the hallmark of quantum mechanics, whereby positive and negative amplitudes can cancel out in a way that cannot occur in traditional probability theory. I have an article here whereby I attempt to give a simple intuition for interference effects.
Not only do I explain interference in that article, but I also explain how whenever two systems are entangled, it no longer becomes valid to assign the state vector (the list of complex-valued probability amplitudes) to the individual subsystems, but only to the system as a whole. Interference effects thus only become a property of the system as a whole and disappear for the individual subsystems.
In the linked article I don’t actually go into the real mathematics, but here I will. We can demonstrate this using the program Octave (similar to MATLAB but better). First, we need to define some logic gates. The logic gates we will be using are only two: the CX gate and the Hadamard gate. Logic gates are represented by a row for each possible input and columns for the associated outputs. If it is a single qubit gate, then you will need two rows for the possible inputs of 0 and 1, and then two columns for the outcomes of 0 and 1. If it is a two qubit gate, then you will need four rows for the inputs 00, 01, 10, and 11, and four columns for the outputs 00, 01, 10, and 11. Below we define our two logic gates.
octave:1> H = 1/sqrt(2) * [ 1, 1; 1, -1 ]
H =
0.7071 0.7071
0.7071 -0.7071
octave:2> CX = [
> 1, 0, 0, 0;
> 0, 0, 0, 1;
> 0, 0, 1, 0;
> 0, 1, 0, 0
> ]
CX =
1 0 0 0
0 0 0 1
0 0 1 0
0 1 0 0
For the Hadamard gate, if the input is 0, then we look at the first row for the outputs, which is an output of 1/sqrt(2) for 0 and 1/sqrt(2) for 1. That means the Hadamard gate will place a qubit that is in an eigenstate (either 0 or 1) into a superposition of states. The CX gate negates the most significant qubit only if the least significant qubit is 1. Hence, 00 has an output of 00, 01 has an output of 11, 10 has an output of 10, and 11 has an output of 01. You can again tell this by matching the inputs to the rows and the outputs to the columns.
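The input-to-output mapping of the CX gate described above can be checked mechanically. The snippet below is only a small sketch: the names labels and targets are introduced here for illustration and are not part of the session above.

```octave
% CX as defined in the session above
CX = [ 1, 0, 0, 0;
       0, 0, 0, 1;
       0, 0, 1, 0;
       0, 1, 0, 0 ];

labels  = {'00', '01', '10', '11'};   % illustrative labels for the basis states
targets = zeros(1, 4);                % index of the output state for each input

for k = 1:4
  in = zeros(4, 1);
  in(k) = 1;                          % basis state number k as a state vector
  out = CX * in;                      % apply the gate
  [~, idx] = max(out);                % position of the single 1 in the output
  targets(k) = idx;
  printf('%s -> %s\n', labels{k}, labels{idx});
end
```

This prints 00 -> 00, 01 -> 11, 10 -> 10, and 11 -> 01, matching the description.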
Applying a logic gate is performed just by multiplying the operation by the state vector as shown below where U is some unitary logic gate. In physical terms, this would be some sort of interaction with a particle that results in changing its probability amplitudes for what you might observe.
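In conventional notation this multiplication is written as follows. The symbol ψ′ for the resulting state vector is a label chosen here for illustration.

```latex
|\psi'\rangle = U\,|\psi\rangle
```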
We can begin with a bipartite (two-qubit) system which would be represented by a state vector that would be of size 4 as it would have a list of probability amplitudes for the probabilities of measuring 00, 01, 10, and 11. We can entangle these two qubits together by first applying a Hadamard gate to the least significant qubit and then applying the CX gate. The most significant qubit will be negated based on the value of the least significant, but the least significant would be in a superposition of states. The result would be that the two qubits would be correlated with one another but both collectively in a superposition of states.
As you can see below in the final output, there are only non-zero probability amplitudes for the possible outcomes of 00 and 11, so the qubits are guaranteed to have the same value when you observe them, even if prior to observing them they are in a superposition of states.
octave:3> psi = [ 1; 0; 0; 0 ]
psi =
1
0
0
0
octave:4> psi = kron(eye(2), H) * psi
psi =
0.7071
0.7071
0
0
octave:5> psi = CX * psi
psi =
0.7071
0
0
0.7071
As a side note, you can take the Kronecker product of logic gates to form a larger gate that acts as running them in parallel. You need to do this in the example above (with the “kron” function) because the Hadamard gate is a single-qubit gate, so its 2×2 matrix cannot be multiplied directly with the size-4 state vector of a bipartite system. We therefore take its Kronecker product with the identity matrix (the “eye” function), which forms a new logic gate that applies the identity matrix (does nothing) to the most significant qubit and applies the Hadamard gate to the least significant.
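To make the ordering concrete, here is a minimal sketch comparing the two possible placements of the Hadamard gate. The variable names H_low, H_high, low, and high are made up for this illustration.

```octave
H = 1/sqrt(2) * [ 1, 1; 1, -1 ];

% Identity on the most significant qubit, Hadamard on the least significant
H_low  = kron(eye(2), H);
% Hadamard on the most significant qubit, identity on the least significant
H_high = kron(H, eye(2));

psi = [ 1; 0; 0; 0 ];    % both qubits start as 0

low  = H_low  * psi      % non-zero amplitudes for 00 and 01
high = H_high * psi      % non-zero amplitudes for 00 and 10
```

Swapping the order of the arguments to kron is what selects which qubit the gate acts on.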
Let’s say I handed you a single qubit that I applied a Hadamard gate to. You can see in the logic gate matrix that the input of 0 or 1 produces an output with equal probabilities of 0 or 1, so it would have a 50% chance of being 0 or 1. Yet, like the interference example I showed in the other article, if you apply the Hadamard gate twice it cancels itself out so you get the original value you started with, so if you started with a 1 and apply it twice then the output is a 1.
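The cancellation is easy to reproduce. The sketch below starts from a 1, applies the Hadamard gate once and then twice, and converts the amplitudes to probabilities by squaring their magnitudes (the Born rule). The variable names are only illustrative.

```octave
H = 1/sqrt(2) * [ 1, 1; 1, -1 ];
one = [ 0; 1 ];               % a qubit with the value 1

once  = H * one;              % amplitudes 0.7071 and -0.7071
twice = H * (H * one);        % the second Hadamard undoes the first

probs_once  = abs(once).^2    % 0.5 and 0.5 (up to rounding): a random outcome
probs_twice = abs(twice).^2   % 0 and 1 (up to rounding): guaranteed to be 1
```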
Classical and quantum probabilities behave differently. Imagine that I randomly, with 50% probability each, handed you a 0 or a 1 and had you apply a Hadamard gate to it. If you follow the logic gate matrix, then either way you end up with an output that has an equal probability of being 0 or 1. In both cases there is initially a 50% chance of measuring a 0 or a 1 before you apply the Hadamard gate, yet only in the first case (the qubit I had already applied a Hadamard gate to) does applying the gate give you a determined outcome, while in the second (the classically random bit) it gives you a random outcome.
The difference between quantum and classical probabilities can be expressed using density matrices. A density matrix is computed just by multiplying the state vector by its own Hermitian transpose (a column vector times a row vector, which gives a square matrix). The Hermitian transpose is computed by transposing the matrix and taking the complex conjugate of all its values.
Density matrices hold the Born rule probabilities (values 0–1) in their diagonals, which are the numbers from the top-left to the bottom-right. For any superposition of states, there are at least some off-diagonals (numbers in the matrix which are not on the diagonal) with non-zero entries. For eigenstates, however, they always have all zeroes in the off-diagonals.
You can see below after applying a Hadamard gate the probability of the output of 50% for 0 and 50% for 1. There are also non-zero values for the off-diagonals. However, for the eigenstates, there are only zeroes in the off-diagonals.
octave:7> [1; 0] * transpose(conj([ 1; 0 ]))
ans =
1 0
0 0
octave:8> [0; 1] * transpose(conj([ 0; 1 ]))
ans =
0 0
0 1
octave:9> (H * [0; 1]) * transpose(conj(H * [ 0; 1 ]))
ans =
0.5000 -0.5000
-0.5000 0.5000
You can thus represent classical probabilities with linear combinations of eigenstate density matrices each multiplied by their probability of occurring. You will still get a matrix with probability values on its diagonals but with all zero entries in its off-diagonals, which would not correspond to any possible superposition of quantum states.
If you see the example below, we take the two eigenstate density matrices for 0 and 1 and combine them with 0.5 of each, and we get a new density matrix that still has probabilities of 50% for 0 and 50% for 1 in the diagonals but all zeroes in the off-diagonals, which differs from the density matrix whereby we apply the Hadamard gate.
octave:13> 0.5 * ([0; 1] * transpose(conj([ 0; 1 ]))) + 0.5 * ([1; 0] * transpose(conj([ 1; 0 ])))
ans =
0.5000 0
0 0.5000
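The two density matrices also respond differently to a further Hadamard gate. The sketch below relies on the standard rule that a gate U acts on a density matrix ρ as UρU†, which is not stated above, and uses Octave's ' operator, which is the same as transpose(conj(...)). The variable names are illustrative.

```octave
H = 1/sqrt(2) * [ 1, 1; 1, -1 ];
zero = [ 1; 0 ];
one  = [ 0; 1 ];

% Classical 50/50 mixture of 0 and 1 (zeroes in the off-diagonals)
rho_mixed = 0.5 * (zero * zero') + 0.5 * (one * one');

% Superposition made by applying the Hadamard gate to a 1
super_state = H * one;
rho_super = super_state * super_state';

% Assumed rule: a gate U acts on a density matrix as U * rho * U'
after_mixed = H * rho_mixed * H'   % still 0.5 on the diagonal: random
after_super = H * rho_super * H'   % probability 1 for the value 1: determined
```

The classical mixture stays random after the gate, while the superposition interferes back to a definite value.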
To show why this is interesting, we then have to introduce the partial trace. Recall that whenever you have entangled particles or qubits, you cannot assign the state vector to the individual particles or qubits but only to the system as a whole. However, what if we want to know how a single particle will behave on its own? You can compute this using a partial trace whereby you can take a density matrix for an entangled system and trace out (ignore) particles you don’t care about. Below are the equations for this whereby the first traces out the least significant qubit and the second traces out the most significant qubit.
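In conventional notation, and consistent with the Octave expressions used in this article, the two partial traces can be written as follows. The subscripts "most" and "least" are labels chosen here for the qubit that remains, I is the 2×2 identity matrix, and the sum runs over the two eigenstates 0 and 1.

```latex
% Trace out the least significant qubit (the most significant remains)
\rho_{\text{most}} = \sum_{j \in \{0,1\}} \left( I \otimes \langle j | \right) \rho \left( I \otimes | j \rangle \right)

% Trace out the most significant qubit (the least significant remains)
\rho_{\text{least}} = \sum_{j \in \{0,1\}} \left( \langle j | \otimes I \right) \rho \left( | j \rangle \otimes I \right)
```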
Let’s go back to our bipartite entangled system. Remember that? Let’s first compute its density matrix to see what it looks like. You can tell that it is a density matrix with non-zeroes in two of the off-diagonals so it is a quantum probability distribution that can interfere with itself.
octave:14> psi
psi =
0.7071
0
0
0.7071
octave:15> p = psi * transpose(conj(psi))
p =
0.5000 0 0 0.5000
0 0 0 0
0 0 0 0
0.5000 0 0 0.5000
Now, what happens if we were to trace out the most significant qubit leaving us with just the density matrix for a single qubit in this entangled system? You can see the results below. What we are left with is a density matrix that is classical. That is to say, for a single qubit taken in isolation from an entangled pair, it would behave as if it were still random, but classically random. It would not exhibit interference effects.
octave:20> kron([1, 0], eye(2)) * p * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * p * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0
0 0.5000
Remember that we formed this entangled state by first applying the Hadamard gate to a single qubit and then entangling it with another using the CX gate. That means the single qubit on its own could exhibit interference effects, but after being entangled with the other qubit, it lost its ability to interfere with itself. Indeed, take a look below, where we apply just the Hadamard gate to the least significant qubit without the CX gate and trace out the most significant qubit again. We get a density matrix with non-zero values in the off-diagonals.
octave:21> kron(eye(2), H) * [1; 0; 0; 0]
ans =
0.7071
0.7071
0
0
octave:22> p = ans * transpose(conj(ans))
p =
0.5000 0.5000 0 0
0.5000 0.5000 0 0
0 0 0 0
0 0 0 0
octave:23> kron([1, 0], eye(2)) * p * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * p * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0.5000
0.5000 0.5000
Entangling a qubit or a particle with another one removes its ability to interfere with itself as only the system as a whole can exhibit interference effects. Indeed, we can repeat this whole process with a tripartite system where we entangle three qubits together and trace out just one of them. We find that we end up with two qubits that both together and separately do not interfere with themselves.
octave:24> psi = kron(CX, eye(2)) * kron(eye(2), CX) * kron(eye(4), H) * [1; 0; 0; 0; 0; 0; 0; 0 ]
psi =
0.7071
0
0
0
0
0
0
0.7071
octave:25> p = psi * transpose(conj(psi))
p =
0.5000 0 0 0 0 0 0 0.5000
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0.5000 0 0 0 0 0 0 0.5000
octave:27> kron([1, 0], eye(4)) * p * kron(transpose(conj([1, 0])), eye(4)) + kron([0, 1], eye(4)) * p * kron(transpose(conj([0, 1])), eye(4))
ans =
0.5000 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0.5000
octave:28> kron([1, 0], eye(2)) * ans * kron(transpose(conj([1, 0])), eye(2)) + kron([0, 1], eye(2)) * ans * kron(transpose(conj([0, 1])), eye(2))
ans =
0.5000 0
0 0.5000
Entanglement, rather than being nonlocal, in fact guarantees that quantum mechanics remains a local theory. Why? Well, first, to entangle the qubits, you have to have them come together to interact. That is a local interaction. Second, when they become entangled, it is further guaranteed that separately they will not exhibit interference effects. They will behave like classical particles. This means that the only way to view the interference effects is to bring the entangled particles locally together.
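This can be sketched in Octave. Reversing the entangling circuit on the whole system (the CX gate and then the Hadamard gate, both of which are their own inverse) interferes the pair back to a definite 00, whereas applying the Hadamard gate to one qubit alone leaves every outcome equally likely. The names joint and alone are illustrative.

```octave
H  = 1/sqrt(2) * [ 1, 1; 1, -1 ];
CX = [ 1, 0, 0, 0;
       0, 0, 0, 1;
       0, 0, 1, 0;
       0, 1, 0, 0 ];

% Entangled pair with equal amplitudes for 00 and 11
psi = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ];

% Both qubits brought together: undo the CX, then the Hadamard
joint = kron(eye(2), H) * CX * psi;
% One qubit handled on its own: Hadamard only, no CX
alone = kron(eye(2), H) * psi;

probs_joint = abs(joint).^2   % 1 for 00 (up to rounding): interference
probs_alone = abs(alone).^2   % 0.25 for every outcome: no interference
```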
Affecting One Does not Affect the Other
When the two particles are separate, they will, again, behave classically. There is nothing you could do to them that could not be explained classically. There is a myth that when you interact with one particle, it instantly interacts with the other, such as if you flip a particle in an entangled pair, then the other particle also gets flipped.
Let’s say we have two entangled qubits that have equal probabilities of being 00 and 11. If affecting one affects the other, then flipping one of them should flip the other. This would change the probability distribution to 11 and 00 with the same probabilities, although that is identical to the initial state, and thus nothing would change. If affecting one doesn’t affect the other, then flipping the least significant qubit would change the probability distribution for 00 and 11 to 01 and 10. It would transform the perfect correlation into a perfect anti-correlation.
Let’s say I have two envelopes, each with a coin in it. I mix the coins up such that either H (heads) is facing up in both envelopes or T (tails) is facing up in both envelopes. The possible outcomes are thus HH and TT. Now, let’s say we separate millions of miles apart before opening our envelopes, but you decide to be clever and flip yours upside down before opening it. The effect of this will be to change the probability distribution from HH and TT to HT and TH. Yours would be guaranteed to be the opposite of mine, but you did not affect mine at all.
We can carry out this experiment in Octave as shown below. Flipping one of the qubits changes the probabilities from equal probability of 00 and 11 to equal probability of 01 and 10. It does not affect the other qubit and behaves as it would in the classical case.
octave:29> X = [ 0, 1; 1, 0 ]
X =
0 1
1 0
octave:31> psi = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ]
psi =
0.7071
0
0
0.7071
octave:32> kron(eye(2), X) * psi
ans =
0
0.7071
0.7071
0
Indeed, if we compute the reduced density matrix for both qubits, you would find that the reduced density matrices are actually not altered. There is nothing you could actually do to one of the qubits to affect the reduced density matrix of the other one.
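Here is a short sketch of that check. The helper names trace_msq and trace_lsq are made up for this example; they wrap the same partial trace expression used earlier, tracing out the most significant and the least significant qubit respectively.

```octave
H  = 1/sqrt(2) * [ 1, 1; 1, -1 ];
X  = [ 0, 1; 1, 0 ];
CX = [ 1, 0, 0, 0;
       0, 0, 0, 1;
       0, 0, 1, 0;
       0, 1, 0, 0 ];

% Illustrative helpers: trace out the most / least significant qubit
trace_msq = @(p) kron([1, 0], eye(2)) * p * kron([1; 0], eye(2)) ...
               + kron([0, 1], eye(2)) * p * kron([0; 1], eye(2));
trace_lsq = @(p) kron(eye(2), [1, 0]) * p * kron(eye(2), [1; 0]) ...
               + kron(eye(2), [0, 1]) * p * kron(eye(2), [0; 1]);

psi     = CX * kron(eye(2), H) * [ 1; 0; 0; 0 ];  % equal amplitudes for 00 and 11
flipped = kron(eye(2), X) * psi;                  % flip the least significant qubit

p_before = psi * psi';
p_after  = flipped * flipped';

% Reduced density matrix of the least significant qubit, before and after
least_before = trace_msq(p_before)
least_after  = trace_msq(p_after)
% Reduced density matrix of the most significant qubit, before and after
most_before  = trace_lsq(p_before)
most_after   = trace_lsq(p_after)
```

All four results come out as 0.5 on the diagonal with zeroes in the off-diagonals.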
That is essentially what the No-Communication Theorem demonstrates. Take two density matrices: one for the two entangled qubits as they are, and another for the same two qubits after some unitary operation has been applied to, let’s say, the least significant qubit. If you perform a partial trace on each to get the reduced density matrix of the most significant qubit, you can prove that it will always be the same as if no unitary operation had been applied to the least significant qubit at all.
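A numerical spot check, which is not a proof, can be sketched as follows. Two choices here are assumptions made for this example and are not from the discussion above: the two-qubit state phi is drawn at random (so it is generically entangled), and the unitary U is taken from the QR decomposition of a random complex matrix. The helper name trace_lsq is also made up.

```octave
% Illustrative helper: trace out the least significant qubit
trace_lsq = @(p) kron(eye(2), [1, 0]) * p * kron(eye(2), [1; 0]) ...
               + kron(eye(2), [0, 1]) * p * kron(eye(2), [0; 1]);

% Assumption: a random normalized two-qubit state
phi = randn(4, 1) + 1i * randn(4, 1);
phi = phi / norm(phi);
p_before = phi * phi';

% Assumption: a random 2x2 unitary, taken as the Q factor of a QR decomposition
[U, ~] = qr(randn(2) + 1i * randn(2));

% Apply U to the least significant qubit only
G = kron(eye(2), U);
p_after = G * p_before * G';

% Compare the most significant qubit's reduced density matrix before and after
difference = norm(trace_lsq(p_before) - trace_lsq(p_after))
```

The difference stays at the level of floating-point rounding no matter which state and unitary are drawn.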
Separately, they are guaranteed not to affect one another. Only when brought back together do you observe interference between them. You thus can only ever observe interference effects locally. They simply cannot be observed nonlocally.
Therefore, if you want to claim quantum mechanics is nonlocal, you are pushed into a corner where you have to argue that this nonlocality is conspiratorial. It has to be nonlocal in a way where all that nonlocality is hidden below the surface and cannot actually be observed so that the mathematics just so happens to work out such that it is identical to as if it were local. Nature would have to be nonlocal but in a way that conspires to hide it from us.
In the real world, when Bell tests are carried out, you first entangle two particles locally, then you separate them by a vast distance and measure each one, and then you bring the measurement records back together to compare them. If the results are never brought back together, no violations of Bell inequalities can be observed. Bell tests would only require something nonlocal if you presumed the outcomes were predetermined. I talk about it in this article here.
If you treat the outcome as predetermined, you have to preassign all the values, but this leads to a mathematical contradiction unless you presume the measurement settings alter the outcome. Yet, with a Bell test, the measurements are spatially distributed between the two particles. Each individual particle would have to be affected by the measurement settings which could be at a great distance, and thus this implies nonlocality.
Yet, if you do not presume predetermination, then you do not need to preassign all the values, and so you do not run into this contradiction. Bell’s theorem rules out local hidden variable theories (setting debates about superdeterminism aside) but not locality itself.
Quantum mechanics is not probabilistic because there is some hidden variable which, if we knew it, would let us predict the outcome ahead of time. Quantum mechanics is probabilistic precisely because the outcome is not predetermined. What we are ignorant of is what the outcome will be in the future when we interact with it. Of course, if we somehow knew that information, we could predict the outcome ahead of time, but that is impossible as it would imply knowing the future. Like all statistical theories, it is statistical because we are ignorant of something, but in this case, what we are ignorant of is something that cannot be acquired ahead of time.
The EPR Argument
The No-Communication theorem proves that affecting one qubit in an entangled pair cannot in any way affect the other nonlocally. Yet, many people still believe quantum mechanics is a nonlocal theory. The reason for this is an argument put forward in the famous EPR paper which has later become associated with the concept of entanglement.
If we do not posit hidden variables, then we inevitably have to conclude that the properties of systems can pass in and out of reality at different times. For example, if you measure the momentum of a particle, you could not simply say the position becomes “unknown” or “uncertain,” as this suggests that the particle has a position and you are just unaware of it. Rather, you would have to treat its position as if it is not ontologically real. If you were to go measure its position, then the situation would reverse and the particle’s position would become realized (the passage from an indeterminate to a determinate state) whereas the momentum would cease to have reality.
It thus logically follows from this that there would not be a one-to-one correspondence between the mathematical description of the system and its ontology, as the wave function description would be a tool for predicting the future realization of properties of systems and would not describe it in the present. Naturally, this would lead you to ask what is the relationship between the ontology of the system and the mathematical description.
This is where the EPR paper comes in, by Albert Einstein, Boris Podolsky, and Nathan Rosen. The EPR paper makes a particular assumption about the relationship between the ontology of the system and the mathematics, and it uses this assumption to derive seeming nonlocal action. The postulate they put forward in the paper is quoted below.
If, without in any way disturbing a system, we can predict with certainty (i.e., with probability equal to unity) the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity.
— Einstein & Podolsky & Rosen, “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?”
The postulate here suggests that if you can predict the value of something with certainty prior to observing it, then it should be treated as having an “element of reality,” i.e. it is ontologically real. In quantum theory, you can predict with certainty the outcome of a measurement if the system is in an eigenstate, which merely means that the probability amplitude for a particular outcome is equal to 1, while it is equal to 0 for all other possibilities. If we were to adopt the EPR postulate, then a system in an eigenstate would have reality, while a system not in an eigenstate would not possess reality.
Now, consider two statistically correlated systems. In a classical case, we can think of a pair of shoes randomly separated between two boxes. Opening a single box will tell you what is in the other box because they are guaranteed to be opposite of one another, i.e. they are correlated with one another. You can establish similar correlations in quantum mechanics. For example, a particle with zero net spin could decay into one particle with a positive spin and one with a negative spin, as the conservation of angular momentum would guarantee that the total spin remains zero. Which particle has which spin is random, but they are guaranteed to be opposite of one another.
The difference between the classical and the quantum case is that in the case with the shoe box, the probabilistic nature of it is simply due to the observer’s ignorance, not due to the shoe lacking reality. In the quantum mechanical case, the two particles would genuinely not possess the property of being. Or, at least, the specific properties of the particle that are uncertain would not possess being. There would be certainty that, if they are electrons, they would have a negative charge, but there would be uncertainty as to what their spin values would be.
By observing the particle’s spin state, you could become certain of its state, which, if we adopt the EPR postulate, would equate to the realization of that particle’s spin state. However, because the two particles are guaranteed to have opposite spin states, we could simultaneously update the prediction to certainty for the other particle’s spin state, even if it is light years away. This would imply that measuring a particle local to the observer could nonlocally cause the realization of another particle’s state simultaneously and at vast distances.
By measuring either A or B, we are in a position to predict with certainty, and without in any way disturbing the second system, either the value of the quantity P or the value of the quantity Q. In accordance with our criterion of reality, in the first case we must consider the quantity P as being an element of reality; in the second case, the quantity Q is an element of reality.
— Einstein & Podolsky & Rosen, “Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?”
This argument is largely philosophical given that, as shown with the No-Communication Theorem, there cannot be any observable consequences of this. If you buy into the EPR postulate, then you would agree that at least on a philosophical level, there seems to be some kind of pseudo-nonlocality here. However, it is possible to avoid this philosophical conundrum simply by rejecting the EPR postulate.
If we reject the EPR postulate, then we need a different way to connect the mathematics to the ontology of the system. Instead of tying the realization of the properties of physical systems to certainty of their state, we can instead tie the realization of properties of systems to be something that occurs when one physical system interacts with another, but only from the context frame of the systems involved in the interaction. The term context frame as it will be used here is kind of like a reference frame. It describes one system from the “perspective” of another, whereby the latter system is used as the zero-point (the origin) of the coordinate system.
To even call an interaction an “interaction” implies a third-person perspective that is not part of the interaction. To say a tree is interacting with my eyes by reflecting light into it is something you can only observe as a third-party observer. From my own perspective, all I see is the tree itself, not my eyes. The zero-point of a context frame is always effectively non-existent in that context, so all I would describe is the realization of the properties of the tree as it stands in front of me without including my eyes into the picture.
Whenever two physical systems interact, realization occurs only from the context frames of the two systems involved in the interaction, and not from the context frame of any system that is not part of it. As I stated, even calling it an “interaction” implies you are standing outside of it, and this is true in quantum theory. Whenever two physical systems interact, you have to describe them as becoming entangled with one another, which still suggests they have to be described in a superposition of states, and does not correspond to the realization of their properties.
Consider a situation whereby Alice and Bob have possession of a particle in a superposition of states, yet only Alice is curious enough to measure its value. For Alice, from her context, since she directly interacted with the system, she would directly perceive its value, and thus it would be realized in her context. However, if Bob does not make this measurement nor has Alice revealed it to Bob, then Bob could only at best describe Alice as entangled with the particle.
If we adopt this postulate instead of the EPR postulate, in most cases they are the same. If you observe something, you have to physically interact with it, and what you “observe” is just what is realized from your context frame. If you observe something, of course, you will gain certainty as to its value, and so in most cases the EPR postulate and this postulate we can refer to as contextual realization, or simply the CR postulate, are exactly identical.
Yet, there is a subtle difference, and that is in the case of entangled systems. First, let’s take a look at the EPR postulate in more detail. Consider an entangled system of two qubits that are guaranteed to have the same value. We will call the two qubits A and B, where qubit A is sent to Alice and qubit B is sent to Bob. At time t=0, Alice and Bob receive their particles but have yet to perform any measurements on them, so they are still in an uncertain state. At time t=1, Alice measures her particle, which gives her certainty as to what Bob’s particle would be. Then, at time t=2, Alice travels to Bob to confirm her prediction.
The Bernoulli function here is just a way to express certainty and uncertainty for boolean values. Bernoulli(0) corresponds to 100% certainty that the qubit has a value of 0, Bernoulli(0.5) corresponds to a 50%/50% uncertainty as to whether the qubit has a value of 0 or 1, and Bernoulli(1) refers to 100% certainty that the qubit has a value of 1. The ontic state refers to the real state in ontological reality, whereas the epistemic state is the amount of certainty as to what the state would be if it were to be measured.
Take note of time t=1. If we accept the EPR postulate, then Alice measuring her qubit allows her to update what she believes Bob’s qubit would be to absolute certainty, and thus it must also acquire being at that moment. Both qubit A and qubit B acquire reality simultaneously, implying some sort of pseudo-nonlocality is going on. Alice traveling to Bob to confirm her prediction at t=2 is just reiterating what she already knows, and thus plays no essential role.
The case is very different if we adopt the CR postulate instead. From the CR perspective, properties of systems are only realized in particular context frames, so we cannot speak of the global ontology of the system, but only its ontology in a particular context, such as Alice’s context. Realization also only occurs when that system interacts with another system, and only from its own context frame. If we adopt this understanding, then things evolve a bit differently.
Within Alice’s own context frame, prior to interacting with either of the two qubits at time t=0, neither of them has being. When she observes her qubit at time t=1, she can indeed update her certainty as to the values of both qubits A and B simultaneously. Notice the big difference here: even though Alice now has certainty as to what qubit B would be if she were to go measure it, its properties still have not yet been realized!
The properties of qubit B can only be realized from Alice’s context at time t=2, when Alice travels to the qubit to interact with it. This is a local action and thus there is no simultaneous nonlocal realization of the states of particles under the CR postulate. While the No-Communication Theorem gets rid of nonlocality, the CR postulate also gets rid of this pseudo-nonlocality, leaving quantum theory as a theory that does not even have the appearance of nonlocality.