The theory of chaos is an extraordinarily broad mathematical topic, and we all have some intuition for what it means when a system is *chaotic*. The ideas of unpredictability, spontaneity, intractability, turbulence, and perhaps randomness, all come to mind. But deterministic chaos is somewhat different than our intuitions would have us believe. If you watch the shape of a flickering flame, the whitewater in a rocky river, or the price of crude oil in North America, you’re definitely seeing behavior which can’t be described without chaos. But you’re also most likely seeing the effects of any number of *random* influences on the system, whether a faltering breeze or some oil speculator’s whimsy. Chaos theory deals with the behavior of *deterministic* systems—that is, systems with no random inputs. All the intricacy and intrigue of chaotic behavior can arise in systems which might seem deceptively uncomplicated, like a pendulum hanging from another pendulum, or three stars orbiting each other.

But if you’ve ever heard of the “butterfly effect” (a term coined by a pioneer of chaos theory, Edward Lorenz), it’s likely your intuition is right about the central feature of deterministic chaos: **chaotic systems have high sensitivity to initial conditions.**

If chaotic systems are so unpredictable and temperamental, how can we possibly make chaos work for us? One answer is encryption.

There are lots of ways to encrypt messages for secure transmission, but the idea of using chaos in encryption is pretty new. One shocking type of chaotic encryption was invented in 1993 by Kevin Cuomo of MIT, who (along with Alan Oppenheim) published a paper [1] outlining a new method for using chaos to send private messages. To understand how Cuomo’s chaotic encryption can work, you first have to believe in a surprising phenomenon called *synchronized chaos.*

Cuomo’s method relies on this synchronized chaos, a somewhat mysterious discovery summarized in a 1990 paper [2] by Louis Pecora and Thomas Carroll at The Naval Research Laboratory. The phenomenon occurs in some situations when part of the *output* of one chaotic system is used as an *input* for a twin chaotic system. If the two systems are properly synchronized, then the second system will mimic the behavior of the first with uncanny fidelity.

Just as Pecora, Carroll, Cuomo, and Oppenheim did, we’ll look at synchronization in the chaotic Lorenz system (see my post Edward Lorenz’s Strange Attraction for a deeper dive into the Lorenz system). The system comes from Edward Lorenz’s simplification of an atmospheric convection model, and its intriguing chaotic behavior has been studied for decades. It is defined by the following system of nonlinear differential equations:

$$
\begin{aligned}
\dot{x} &= \sigma (y - x) \\
\dot{y} &= r x - y - x z \\
\dot{z} &= x y - b z
\end{aligned}
$$

where *σ*, *r*, and *b* are positive parameters related to the physics of convection. We’ll use *σ* = 10, *r* = 28, and *b* = 8/3, since those are the values Lorenz originally used to study the system. The variables *x*, *y*, and *z* make up the state of the system at each instant in time, so think of them as coordinates in state-space. For these parameter values, solutions to the Lorenz equations outline a chaotic *attractor*, shown in Figure 1.

**Figure 1.** A solution to the Lorenz equations with initial conditions (x, y, z) = (0, 1, 0), found by numerical integration. The solutions (called *trajectories*) alternate irregularly between the “wings” of the attractor without ever intersecting themselves or one another.

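Pecora and Carroll built an electrical circuit to model the behavior of the Lorenz system. I’ll call that circuit the Talker (T). The circuit T is designed to generate its output voltage signals (X_{T}, Y_{T}, Z_{T}) according to the Lorenz equations:

$$
\begin{aligned}
\dot{X}_T &= \sigma (Y_T - X_T) \\
\dot{Y}_T &= r X_T - Y_T - X_T Z_T \\
\dot{Z}_T &= X_T Y_T - b Z_T
\end{aligned}
$$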

So as we would expect, measuring the voltages (X_{T}, Y_{T}, Z_{T}) shows the chaotic signature of the Lorenz system. Next, an almost identical circuit is built as a second chaotic system, called Copycat (C), with output signals (X_{C}, Y_{C}, Z_{C}). But there’s one key difference between the Talker and the Copycat: in the circuit C, the output X_{C} is snipped and replaced with Talker’s signal X_{T} where it feeds into the parts that generate Y_{C} and Z_{C}. The resulting situation is shown in Figure 2.

**Figure 2.** The Copycat circuit may be synchronized with the Talker circuit by feeding X_{T} into the components of the Copycat which generate Y_{C} and Z_{C}. See Figures 3 and 5 in Ref. [1] for more detail.

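The effect of feeding the signal X_{T} into places where the Copycat “expects” to receive the signal X_{C} is to alter the Copycat’s governing equations:

$$
\begin{aligned}
\dot{X}_C &= \sigma (Y_C - X_C) \\
\dot{Y}_C &= r X_T - Y_C - X_T Z_C \\
\dot{Z}_C &= X_T Y_C - b Z_C
\end{aligned}
$$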

Notice that these equations, which determine the state of the Copycat, (X_{C}, Y_{C}, Z_{C}), now depend on X_{T} coming from the Talker. In this situation T and C are said to be *synchronized*, and the outputs (X_{C}, Y_{C}, Z_{C}) are approximations of the outputs (X_{T}, Y_{T}, Z_{T}). The agreement between T and C is clear from a plot of the magnitude of the Copycat’s error, $\lVert (X_C, Y_C, Z_C) - (X_T, Y_T, Z_T) \rVert$, as in Figure 3.

**Figure 3.** The magnitude of the difference between the state of C and T as a function of time. The initial separation is clipped from the Figure, and the error quickly drops to hover around 0.05. The distance between the trajectories of C and T is tiny compared to the attractor they outline (about 10^{4} times smaller), so the synchronization is working extremely well.
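The synchronization described above can be reproduced numerically. The following is a minimal sketch rather than a model of the actual circuit: it assumes a hand-written fourth-order Runge-Kutta integrator, a step size of 0.005, and an arbitrary starting state for the Copycat, and the function names (`coupled_rhs`, `rk4_step`, `run_sync`) are illustrative.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # Lorenz parameters used in the text


def coupled_rhs(state):
    """Time derivatives of the Talker (first three) and Copycat (last three)."""
    xt, yt, zt, xc, yc, zc = state
    return (
        SIGMA * (yt - xt),
        R * xt - yt - xt * zt,
        xt * yt - B * zt,
        SIGMA * (yc - xc),       # X_C is still generated from Y_C ...
        R * xt - yc - xt * zc,   # ... but Y_C and Z_C are driven by X_T
        xt * yc - B * zc,
    )


def rk4_step(f, state, dt):
    """Advance `state` by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = f(state)
    k2 = f(shifted(k1, dt / 2))
    k3 = f(shifted(k2, dt / 2))
    k4 = f(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def sync_error(state):
    """Distance between the Talker state and the Copycat state."""
    return math.dist(state[:3], state[3:])


def run_sync(t_end=20.0, dt=0.005, copycat_start=(5.0, -5.0, 20.0)):
    """Return the synchronization error at every time step."""
    state = (0.0, 1.0, 0.0) + tuple(copycat_start)
    errors = [sync_error(state)]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(coupled_rhs, state, dt)
        errors.append(sync_error(state))
    return errors


if __name__ == "__main__":
    errors = run_sync()
    print(f"initial error: {errors[0]:.3f}, final error: {errors[-1]:.3e}")
```

Integrating both systems as one six-variable system keeps the sketch short; a real link would sample X_{T} and hand it to a separate solver.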

The Copycat circuit is receiving only **partial** information about the state of the Talker circuit, but **all** of its outputs will synchronize with the Talker circuit outputs. Take a moment to think about that. It’s almost as if C has total knowledge of the state of T *and* is able to copy it almost exactly—despite being a chaotic system itself!

Things get even stranger, and this is where the possibility for chaotic encryption comes in. Cuomo realized the potential for synchronization even when a signal *other* than X_{T} is fed into the Copycat circuit.

Since we’re now talking about encrypted communication, we’ll again consider two identically prepared systems which are governed by the Lorenz equations. They’re called Talker (T), which has outputs (X_{T}, Y_{T}, Z_{T}), and Receiver (R), which has outputs (X_{R}, Y_{R}, Z_{R}). Now imagine you have some message *m(t)* that you want to encrypt and send securely—we’ll use the example of the audio clip below.

https://logicaltightrope.files.wordpress.com/2013/09/kenjamoriginal.wav

This audio signal is our message *m(t)*. In our case, the Talker and Receiver will be computer simulations which solve the Lorenz equations numerically, instead of the electrical circuits used by the inventors of this encryption method.

Now we construct a new signal *s(t)* = *m(t)* + X_{T} by adding the message to an output of Talker, being careful to scale *m(t)* and X_{T} so that *m(t)* is much smaller than X_{T} on average. The result is some unrecognizable junk, since the chaotic signal X_{T} drowns out the message. Take a listen:

Now for the prestige: if you feed the junk *s(t)* into the Receiver exactly as X_{T} was fed into the Copycat above, the synchronization **still works.** That’s almost absurd: the Receiver isn’t even being fed an output of the Talker anymore, but rather some junk signal that contains both X_{T} and our message, yet it still recovers the state of the Talker. The Receiver is now governed by the equations:

$$
\begin{aligned}
\dot{X}_R &= \sigma (Y_R - X_R) \\
\dot{Y}_R &= r\, s(t) - Y_R - s(t)\, Z_R \\
\dot{Z}_R &= s(t)\, Y_R - b Z_R
\end{aligned}
$$

The outputs (X_{R}, Y_{R}, Z_{R}) of the Receiver circuit are approximations of the outputs (X_{T}, Y_{T}, Z_{T}) of the Talker circuit, so X_{R} ≈ X_{T}. And remember that *m(t)* = *s(t)* – X_{T} from when we constructed *s(t)*. So by replacing X_{T} with its approximation X_{R} in that expression, we can create a reconstruction of the original message: $\hat{m}(t)$ = *s(t)* – X_{R} ≈ *m(t)*. Have a listen to our final reconstructed audio signal.

The Figure below compares the original audio signal to the reconstructed signal.

**Figure 4.** The message *m(t)* is shown in red, along with the reconstructed message in blue. The reconstructed message is slightly noisier than the original, but completely recognizable.

Let’s summarize:

We started with two identical Lorenz systems and a message that we wanted to encrypt. We added some chaotic noise to the message to produce an encrypted signal for transmission. There’s no fear of an interceptor deciphering our signal; the chaotic system R itself is the key to decoding the message. We used the transmitted noisy signal as an input to part of the receiving system, and then used part of the receiver’s output to reconstruct the original message. Presto!

How important is it that T and R be synchronized systems, anyway? Could the reconstruction still work if they were slightly different? Well, let’s see what happens when we change just one of the Lorenz parameters in the Receiver system by 5%. With the same message signal, we get this output:

The result is totally obscure, although if you listen closely you can hear some of a beat from the original music. As the systems become more poorly synchronized, the fidelity of the reconstructed message drops extremely rapidly. Keep in mind that even when T and R are operating with identical parameters, synchronized as well as possible, they cannot be *perfectly* synchronized. The output of R will never exactly mimic the output of T, since R is still operating under different governing equations than T is.

So the method works well with a musical sample, but let’s see how well it works with a sample of dialogue. After all, it’s probably more realistic that your secret messages will be spoken rather than strummed. Let’s use the following audio sample.

https://logicaltightrope.files.wordpress.com/2013/09/orangutansoriginal.wav

The chaotic encrypted signal is:

https://logicaltightrope.files.wordpress.com/2013/09/orangutanssignal.wav

After decrypting the message, we hear:

https://logicaltightrope.files.wordpress.com/2013/09/orangutansrecovered.wav

The result is completely comprehensible.

You can implement this method yourself without too much trouble, in case you want to send private messages to a friend. Start with simulating the Lorenz system by solving the equations numerically. Then convert your private message into any time-series, like the audio signal I used above, and add the chaotic noise from your simulation to the message. In order to decrypt the message, your friend will need to solve the receiver equations for (X_{R}, Y_{R}, Z_{R}) numerically with the encrypted signal as an input. That’s it—as long as you agree on the Lorenz parameters *σ*, *r*, and *b* ahead of time, it just takes two numerical solvers for you to communicate privately with chaos.
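The recipe above can be sketched in plain Python. This is a rough illustration under several assumptions, not the code behind the audio clips: the message is a hypothetical pair of quiet, fast tones standing in for an audio signal, the Talker and Receiver are integrated together with a hand-written Runge-Kutta step, and the names `message`, `make_rhs`, and `recovery_error` are made up for the example.

```python
import math

SIGMA, B = 10.0, 8.0 / 3.0


def message(t):
    """Hypothetical stand-in for an audio clip: two quiet, fast tones."""
    return 0.05 * math.sin(150.0 * t) + 0.05 * math.sin(230.0 * t)


def make_rhs(r_talker, r_receiver):
    """Build the derivatives for the Talker plus a Receiver driven by s(t)."""
    def rhs(t, u):
        xt, yt, zt, xr, yr, zr = u
        s = xt + message(t)  # transmitted signal s(t) = m(t) + X_T
        return (
            SIGMA * (yt - xt),
            r_talker * xt - yt - xt * zt,
            xt * yt - B * zt,
            SIGMA * (yr - xr),
            r_receiver * s - yr - s * zr,
            s * yr - B * zr,
        )
    return rhs


def rk4_step(f, t, u, dt):
    """One classical Runge-Kutta step for a time-dependent system."""
    k1 = f(t, u)
    k2 = f(t + dt / 2, tuple(a + dt / 2 * k for a, k in zip(u, k1)))
    k3 = f(t + dt / 2, tuple(a + dt / 2 * k for a, k in zip(u, k2)))
    k4 = f(t + dt, tuple(a + dt * k for a, k in zip(u, k3)))
    return tuple(
        a + dt / 6 * (p + 2 * q + 2 * w + v)
        for a, p, q, w, v in zip(u, k1, k2, k3, k4)
    )


def recovery_error(r_receiver=28.0, t_end=20.0, dt=0.002, t_skip=12.0):
    """Return (rms of m, rms of m_hat - m), measured after t_skip."""
    f = make_rhs(28.0, r_receiver)
    u = (0.0, 1.0, 0.0, 3.0, -2.0, 15.0)  # Receiver starts somewhere else
    sum_m2 = sum_e2 = 0.0
    count = 0
    for i in range(int(round(t_end / dt))):
        u = rk4_step(f, i * dt, u, dt)
        t = (i + 1) * dt
        if t >= t_skip:  # ignore the initial synchronization transient
            m = message(t)
            s = u[0] + m
            m_hat = s - u[3]  # reconstruction: s(t) - X_R
            sum_m2 += m * m
            sum_e2 += (m_hat - m) ** 2
            count += 1
    return math.sqrt(sum_m2 / count), math.sqrt(sum_e2 / count)


if __name__ == "__main__":
    print("matched parameters:   ", recovery_error())
    print("Receiver r off by 5%: ", recovery_error(r_receiver=28.0 * 1.05))
```

Passing a different `r_receiver` mimics the mismatched-parameter experiment: the reconstruction error then swamps the message.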

The possibilities for this synchronization-based chaotic encryption go far beyond what I’ve demonstrated here. The Lorenz system itself arises from certain laser systems, allowing for some optical communications to be encrypted with chaos [3]. Image encryption is also a relatively straightforward application of the method when pixel intensities are used as the message *m(t)* [4]. Further, the synchronization phenomenon has been observed in other chaotic systems, including *discrete* systems which are defined by iterative maps instead of differential equations. There are even some remarkable uses of synchronized chaos in neurophysiology, drawing analogies between chaotic systems’ “knowledge” of surrounding systems and models of the chaotic interdependent behavior in real neural networks. The strangeness of chaos can definitely work in our favor, as long as we’re careful with it.

**References and Further Reading**

[0] Edward Lorenz’s Strange Attraction.

[1] Cuomo, Kevin M.; Oppenheim, Alan V. “Circuit Implementation of Synchronized Chaos with Applications to Communications.” Phys. Rev. Lett. V. 71, No. 1, pp. 65-68. July 1993.

[2] Pecora, Louis M.; Carroll, Thomas L. “Synchronization in Chaotic Systems.” Phys. Rev. Lett. V. 64, No. 8, pp. 821-824. February 1990.

[3] Mirasso, Claudio R.; Colet, Pere; García Fernández, Priscila. “Synchronization of Chaotic Semiconductor Lasers: Application to Encoded Communications.” IEEE Photonics Technology Letters V. 8, No. 2, pp. 299-301. February 1996.

[4] Al-Maadeed, Somaya; Al-Ali, Afnan; Abdalla, Turki. “A New Chaos-Based Image-Encryption and Compression Algorithm.” Journal of Electrical and Computer Engineering Vol. 2012. January 2012.

[5] Strogatz, Stephen H. *Nonlinear Dynamics and Chaos*. 1994.

[6] Lorenz, Edward N. “Deterministic Nonperiodic Flow.” Journal of The Atmospheric Sciences V. 20, pp. 130-141. March 1963.

[7] Greene, Kate. “Encryption Using Chaos.” MIT Technology Review. January 2006.

[8] Skarda, Christine A.; Freeman, Walter J. “How brains make chaos in order to make sense of the world.” Behavioral and Brain Sciences V. 10, pp. 161-195. 1987.

If you watch a newspaper blowing in the wind, its flopping will be pretty unpredictable. The forces that govern it are constantly changing, maybe even randomly changing: the breeze fluctuates, the pages flap, the creases spin. As onlookers we would decide that the system is governed by some really complicated rules and forces, and we’d be right. It’s easy to give a system the appearance of randomness when its inputs are (or might as well be) random. But what’s the deeper nature of chaos? In chaotic dynamics, we find that the appearance of randomness, unpredictability, and intricacy of a system’s behavior can arise from a few simple *deterministic* rules. The strict definition of chaos has been somewhat controversial, but a working standard is:

Chaos is non-periodic behavior in a deterministic system with high sensitivity to initial conditions.

The nature of chaos is that simple rules can give rise to unpredictable and subtle behavior.

We’ll be looking at the Lorenz system, a famous system of differential equations which comes from a 1963 paper Edward Lorenz of MIT published in the *Journal of the Atmospheric Sciences* [1]. The publication was momentous, and the Lorenz system helped spur the development of modern chaos theory. In his paper, Lorenz derives a system of differential equations as a simplification of an existing fluid-convection model, with intent to study convection in the atmosphere:

$$
\begin{aligned}
\dot{x} &= \sigma (y - x) \\
\dot{y} &= r x - y - x z \\
\dot{z} &= x y - b z
\end{aligned}
$$

where *σ*, *r*, and *b* are positive parameters related to the physics of convection. We’ll usually use *σ* = 10, *b* = 8/3, and *r* = 28, since Lorenz originally studied the system with those values, but we’ll give a little special treatment to the value of *r* later on. Notice that the equations don’t look terribly complicated; there are only two nonlinear terms (the *xz* and *xy*), and there are only three equations. Likewise the system is clearly deterministic: it involves no random inputs. Yet we’ll see that the system exhibits subtle and chaotic behavior.

Dynamical systems are generally analyzed in terms of their *states*, which are just lists of variables that describe the system at any point in time. If your system is described by *n* variables, then you can imagine an *n*-dimensional space of all possible states—this space is referred to as *phase-space,* and a particular state of the system is just a point in the phase-space. If your system starts in a particular state, the dynamics will cause the system to follow some *trajectory* through phase-space as its state changes. A trajectory is just a solution to the system equations. (Technically, each point along a trajectory could serve as the initial conditions for another solution. But since those solutions have the same path through phase-space, they’re referred to as a single trajectory.)

For example, if the system you’re working with is a mass on a spring (in one dimension), you might describe the state by the position and the velocity of the mass. Both are functions of time, and the state of your system is some point in phase-space called *S* = (*x*, *v*). The variables *x* and *v* are functions of time, so your system will trace out a trajectory in phase-space as it moves through time. In this case, the trajectory is an ellipse, as shown in the Figure below.

**Figure 1.** A one-dimensional mass-spring system traces out an ellipse in phase-space.
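To see why the trajectory is an ellipse, here is a short worked derivation. It assumes an ideal frictionless spring; the stiffness $k$, mass $m$, amplitude $A$, and angular frequency $\omega$ are symbols introduced only for this illustration. With the mass released from rest at $x = A$,

$$x(t) = A \cos(\omega t), \qquad v(t) = \dot{x}(t) = -A \omega \sin(\omega t), \qquad \omega = \sqrt{k/m},$$

and eliminating $t$ with $\cos^2 + \sin^2 = 1$ gives

$$\frac{x^2}{A^2} + \frac{v^2}{(A\omega)^2} = 1,$$

which is an ellipse in the (*x*, *v*) plane with semi-axes $A$ and $A\omega$.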

The analysis of phase-space trajectories is central to understanding nonlinear dynamics and chaos, partially because the behavior of these systems can’t generally be “solved” completely. It’s often much more useful (and easier) to figure out the qualitative behavior of a system in terms of the footprint it leaves in phase-space, especially for chaotic systems.

Returning to the Lorenz system, our goal is to have a phase-space portrait of the solutions to the system equations, and to understand its structure in terms of chaos. If we were to find that all trajectories converged on a point, or orbited around a point, for example, that would immediately show us how the system behaves. The question is: if we start a trajectory at a given point in phase-space, what happens next?

From direct numerical integration of the Lorenz equations, the plot in Figure 2 is our answer. The trajectory shown outlines the limiting set for the system, called an *attractor*, toward which all trajectories are attracted. Remember, what you’re seeing isn’t really the attractor itself, but rather one particular solution that outlines it—similar to how iron filings will line up to outline an unseen magnetic field.

**Figure 2.** Lorenz discovered this chaotic attractor after simulating his convection model. Notice that the trajectory swings irregularly between the two “wings” of the butterfly-shaped attractor. Plot generated using *σ* = 10, *b* = 8/3, *r* = 28.

I’ve also plotted the output of just the *x*-component so you can get a feel for what the non-periodic solutions of this chaotic system really look like.

**Figure 3.** One of the three non-periodic state variables generated in the Lorenz system. The irregular oscillations in the positive- and negative-*x*-regions correspond to the trajectory lingering respectively on the right and left wings of the attractor in Figure 2. Although there are strong reasons to believe the solution is non-periodic, nobody has proven the non-periodicity totally rigorously. Plot generated using *σ* = 10, *b* = 8/3, *r* = 28.

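We can learn a whole lot about the system and its attractor with some analysis. First, it’s useful to think about what happens to *volumes* in phase-space (the following derivation follows Section 9.2 in Strogatz [2]). If we have some volume $V(t)$ in phase-space (meaning some blob of possible initial conditions) bounded by a closed surface $S(t)$, the volume will change in time as all the points in $V$ evolve under the Lorenz equations. The velocity of each point on the surface is denoted by $\mathbf{f} = (\dot{x}, \dot{y}, \dot{z})$, and let $\mathbf{n}$ be the outward unit-vector normal to $S$. Then in a time $dt$, a patch of area $dA$ on the surface will add a volume $(\mathbf{f} \cdot \mathbf{n}\, dt)\, dA$ to the volume $V$.

**Figure 4.** A volume $(\mathbf{f} \cdot \mathbf{n}\, dt)\, dA$ is added to $V$ during the interval $dt$ for each element $dA$ on the surface of $V$.

Integrating over all patches of area on the surface, we get

$$V(t + dt) = V(t) + \int_S (\mathbf{f} \cdot \mathbf{n}\, dt)\, dA.$$

Then, after subtracting $V(t)$ and dividing by $dt$, the divergence theorem gives

$$\dot{V} = \int_S \mathbf{f} \cdot \mathbf{n}\, dA = \int_V \nabla \cdot \mathbf{f}\, dV.$$

Finally, computing

$$\nabla \cdot \mathbf{f} = \frac{\partial}{\partial x}\left[\sigma (y - x)\right] + \frac{\partial}{\partial y}\left[r x - y - x z\right] + \frac{\partial}{\partial z}\left[x y - b z\right] = -(\sigma + 1 + b)$$

gives

$$\dot{V} = -(\sigma + 1 + b)\, V.$$

So we see that volumes in phase-space evolve as $V(t) = V(0)\, e^{-(\sigma + 1 + b) t}$, and draw an important conclusion: volumes in phase-space contract toward zero. That means that the volume of the attractor itself must be *zero*, since any blob of initial conditions is drawn to the attractor while its volume decreases exponentially.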
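The contraction rate is easy to check numerically. The sketch below is only an illustration: it estimates the divergence of the Lorenz vector field with central finite differences at an arbitrarily chosen point (the helper names `lorenz` and `divergence` and the step `h` are assumptions of the example) and compares it with −(σ + 1 + b).

```python
import math


def lorenz(x, y, z, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def divergence(f, point, h=1e-5):
    """Central-difference estimate of the divergence of f at `point`."""
    total = 0.0
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        # d f_i / d x_i, the i-th diagonal entry of the Jacobian
        total += (f(*plus)[i] - f(*minus)[i]) / (2 * h)
    return total


if __name__ == "__main__":
    rate = divergence(lorenz, (1.0, 2.0, 3.0))
    print("numerical divergence:", rate)
    print("-(sigma + 1 + b):    ", -(10.0 + 1.0 + 8.0 / 3.0))
    print("volume factor after one time unit:", math.exp(rate))
```

Because the divergence is the same constant everywhere, the choice of test point does not matter.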

Another tactic for understanding the system is to look for points which are fixed in phase-space for all time, called *fixed points*. If the system starts at (or reaches) a fixed point, it will stay there forever. The fixed points occur when

$$\dot{x} = \dot{y} = \dot{z} = 0.$$

For the Lorenz system, (x, y, z) = (0, 0, 0) is a fixed point, and there are two other fixed points at C^{+} = $(\sqrt{b(r-1)},\ \sqrt{b(r-1)},\ r-1)$ and C^{–} = $(-\sqrt{b(r-1)},\ -\sqrt{b(r-1)},\ r-1)$, which exist only when *r* > 1 (check for yourself).
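One way to “check for yourself” is to plug the candidate points back into the equations. A minimal sketch, assuming the standard expressions C^{±} = (±√(b(r−1)), ±√(b(r−1)), r−1); the function names are illustrative:

```python
import math


def lorenz_rhs(x, y, z, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)


def fixed_points(r, b=8.0 / 3.0):
    """Return the fixed points that exist for the given r."""
    points = [(0.0, 0.0, 0.0)]
    if r > 1:
        a = math.sqrt(b * (r - 1))
        points.append((a, a, r - 1))    # C+
        points.append((-a, -a, r - 1))  # C-
    return points


if __name__ == "__main__":
    for r in (0.5, 28.0):
        for p in fixed_points(r):
            print(r, p, lorenz_rhs(*p, r=r))
```

Every printed derivative should vanish up to floating-point rounding, and only the origin is returned when *r* ≤ 1.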

The fact that C^{+} and C^{–} only exist for *r* > 1 means that the system behaves very differently for *r* ≤ 1 and *r* > 1. If we were to tune *r* continuously starting from some tiny number, then something serious would have to happen to the system dynamics when we reached *r* = 1. That event is called a *bifurcation*, and *r* = 1 is called a *bifurcation value* of *r*. There are many types of bifurcations, but what they all have in common is this: as we change a system parameter through a bifurcation value, the system undergoes sudden qualitative changes in behavior. In our case, the system “sprouts” two new fixed points, C^{+} and C^{–}, as *r* increases through 1.

The actual effects of the fixed points on the solutions of the differential equations (that is, on the trajectories through phase-space) are *not* obvious without numerical integration. If you’re interested in how the analysis is done, take a look at Section 9.2 in Strogatz. Meanwhile, we’ll get some intuition for what these particular fixed points do from numerical integration of the Lorenz equations. It turns out that for *r* < 1, all trajectories are attracted to the origin. The case for an *r*-value not too much greater than 1 is shown below.

**Figure 5.** The Lorenz system with *r* = 10. Each color represents a different initial state. For *r* not too much greater than 1, the fixed points C^{+} and C^{–} are attractive for all trajectories. In this case the system is pretty boring (and certainly not chaotic).

The system actually has yet another bifurcation at *r* ≈ 24.74. In this bifurcation, the location of the fixed points is unchanged—what does change, though, is their effect on the trajectories. For *r* > 24.74, trajectories which approach C^{+} and C^{–} are actually repelled from those fixed points. Meanwhile the fixed point at the origin also repels trajectories. When looking for chaos in this system, we’ll want to use a value of *r* beyond this second bifurcation.
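The quoted value can be checked by hand. The closed-form expression below is not given in the text above; it is the standard result quoted in Strogatz [2] for the value of *r* at which C^{+} and C^{–} lose stability (written here as $r_H$, a label chosen for this example), valid when $\sigma > b + 1$:

$$r_H = \frac{\sigma (\sigma + b + 3)}{\sigma - b - 1} = \frac{10\,(10 + 8/3 + 3)}{10 - 8/3 - 1} = \frac{470}{19} \approx 24.74.$$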

Let’s summarize what we know so far:

- All trajectories will tend toward some limiting set in phase-space which has zero volume. This is the attractor we glimpsed from our numerical integration.
- Trajectories are repelled from all three fixed points, so they must keep moving forever.

But we haven’t actually proven that the Lorenz system is truly chaotic, according to our definition above. To show that it really is, we still need to show that the system has “sensitive” dependence on initial conditions. Figure 6 below shows what happens to two trajectories which start off with almost identical initial conditions.

**Figure 6.** Solutions to the Lorenz system with initial conditions (0, 1.00, 0) [blue] and (0, 1.01, 0) [red]. Notice that while both trajectories adhere to the attractor, their separation at any given point in time may be very large.

Below I’ve plotted the *x*-outputs for the trajectories shown above to make it easy to see that the two solutions really do have drastically different behavior.

**Figure 7.** The *x*-outputs of two trajectories varying by 1% in initial conditions. The system’s behavior clearly has a sensitive dependence on initial conditions.

For chaotic systems in general, if two points start off with a separation $\delta_0$ in phase-space, after a time *t* their separation will be $\delta(t)$. From numerical results, the magnitude of two trajectories’ separation is found to be roughly exponential in time:

$$\|\delta(t)\| \sim \|\delta_0\|\, e^{\lambda t}.$$

This is not a precise relationship, just a rough model of behavior observed in lots of chaotic systems. The true relationship between $\|\delta(t)\|$ and $\|\delta_0\|$ is complicated and depends on where the trajectories start, and for the Lorenz system is limited by the maximum separation between two points on the attractor. Anyway, the rate of separation of points in phase-space is roughly described by λ, which is called the *Lyapunov exponent* of the system. For the Lorenz system λ is found to be about 0.9. The fact that the Lyapunov exponent is positive means that the system is very sensitive to initial conditions: initially proximate solutions will diverge exponentially quickly. In fact, in our definition of chaos, what I meant by “high sensitivity to initial conditions” was actually that the system should have a positive Lyapunov exponent! The attractor in Figure 2 is called a *strange attractor* due to its sensitive dependence on initial conditions, and we can conclude that the Lorenz system really is chaotic.
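A rough numerical estimate of λ can be made by following two nearby trajectories and repeatedly rescaling their separation so it never saturates at the size of the attractor. The sketch below uses that common two-trajectory renormalization approach; it is not necessarily how the value quoted above was obtained, and the step size, run length, and initial offset `d0` are arbitrary choices for the example.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0


def lorenz(s):
    """Right-hand side of the Lorenz equations."""
    x, y, z = s
    return (SIGMA * (y - x), R * x - y - x * z, x * y - B * z)


def rk4_step(s, dt):
    """One classical Runge-Kutta step for the Lorenz system."""
    k1 = lorenz(s)
    k2 = lorenz(tuple(a + dt / 2 * k for a, k in zip(s, k1)))
    k3 = lorenz(tuple(a + dt / 2 * k for a, k in zip(s, k2)))
    k4 = lorenz(tuple(a + dt * k for a, k in zip(s, k3)))
    return tuple(
        a + dt / 6 * (p + 2 * q + 2 * w + v)
        for a, p, q, w, v in zip(s, k1, k2, k3, k4)
    )


def lyapunov_estimate(t_total=200.0, dt=0.01, d0=1e-8, t_transient=20.0):
    """Average exponential growth rate of a tiny separation."""
    a = (0.0, 1.0, 0.0)
    for _ in range(int(round(t_transient / dt))):  # settle onto the attractor
        a = rk4_step(a, dt)
    b = (a[0] + d0, a[1], a[2])  # neighbor at distance d0
    log_growth = 0.0
    n_steps = int(round(t_total / dt))
    for _ in range(n_steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        d = math.dist(a, b)
        log_growth += math.log(d / d0)
        # pull the neighbor back to distance d0 along the same direction
        b = tuple(ai + (bi - ai) * d0 / d for ai, bi in zip(a, b))
    return log_growth / (n_steps * dt)


if __name__ == "__main__":
    print("estimated Lyapunov exponent:", lyapunov_estimate())
```

The estimate fluctuates with run length and starting point, so it should be read as a ballpark figure rather than a precise value.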

As a final puzzling thought, let’s think about the long-term fate of trajectories on the Lorenz attractor. There is an important Existence and Uniqueness Theorem which tells us that each point in phase-space corresponds to a single unique solution (that is, a single unique *trajectory*) of the Lorenz system. That means that no two trajectories can ever cross, or else the point of intersection would itself serve as the initial conditions for two different solutions, violating the Theorem! Likewise, no trajectory can ever intersect itself. If it did, it would have to enter some closed loop which always brings it back to the point where it intersected itself, meaning it would be periodic. Keep in mind that every single point in the infinite phase-space can serve as the initial conditions for a trajectory. So the implications of the Existence and Uniqueness Theorem here are dramatic:

Every possible trajectory stays within a bounded attracting region in phase-space forever without *ever* reaching the same point twice *and without ever reaching any of the same points that any other trajectory ever reaches in infinite time*.

It’s like every trajectory is playing an infinite game of “Snake” with itself and every other trajectory simultaneously, and wins!

And oh yeah, as if this strange attractor weren’t already strange enough, it’s also a roughly 2.05-dimensional fractal—but more on that later! In the meantime, check out some cool attractors that arise in other chaotic systems.

**Figure 8.** Plots of chaotic attractors were generated by numerical integration of their respective governing equations.

**References and Further Reading**

[1] Lorenz, Edward N. “Deterministic Nonperiodic Flow.” Journal of The Atmospheric Sciences V. 20, pp. 130-141. March 1963.

[2] Strogatz, Stephen H. *Nonlinear Dynamics and Chaos*. Chapters 6, 9.

In developing a relativistic quantum theory, it turns out that we can’t just quantize relativistic particles the way we can quantize non-relativistic particles, as I’ll detail below. Rather, we will need to construct a quantum *field* in order to take quantum mechanics into the relativistic regime. Here, we’ll take it for granted that a quantum field can be constructed which satisfies certain fundamental requirements.

In this notation, $x$ is a 4-vector representing a point in spacetime, $\vec{x}$ is a spatial vector in $\mathbb{R}^3$, and I’ll use boldface for operators (like the Hamiltonian, $\mathbf{H}$). In the end our quantum field ($\phi$) will be operator-valued at each spacetime point; I just won’t use boldface for the fields themselves, even though they are operators. So $\phi$ associates each spacetime point $x$ with an operator called $\phi(x)$.

So, why doesn’t familiar quantum mechanics play well with relativity?

In non-relativistic quantum mechanics, the Schrödinger equation (SEq),

$$i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = \mathbf{H}\, |\psi(t)\rangle,$$

describes the evolution of a particle’s state in time. Unfortunately for physicists, the SEq is not “relativistically invariant.” This means that if a wave function satisfies the SEq in one reference frame, it will not generally satisfy the equation after we perform a Lorentz boost to a different reference frame.

There is a wave equation, however, which is relativistically invariant and does naturally arise as the description for a certain type of quantum field (although here we’ll treat it as an equation for a classical field $\phi(\vec{x}, t)$). This is the Klein-Gordon equation,

$$\left( \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 + \frac{m^2 c^2}{\hbar^2} \right) \phi = 0,$$

which looks just like a classical wave equation with an added mass term. We can solve the Klein-Gordon equation, starting with an educated guess for $\phi$:

$$\phi(\vec{x}, t) = e^{i(\vec{p} \cdot \vec{x} - E t)/\hbar}.$$

Using the Klein-Gordon equation,

$$\left( -\frac{E^2}{\hbar^2 c^2} + \frac{|\vec{p}|^2}{\hbar^2} + \frac{m^2 c^2}{\hbar^2} \right) \phi = 0.$$

Solving for our allowed energies, we see that $E = \pm\sqrt{|\vec{p}|^2 c^2 + m^2 c^4} \equiv \pm E_p$, and our solution takes the form

$$\phi(\vec{x}, t) = A\, e^{i(\vec{p} \cdot \vec{x} - E_p t)/\hbar} + B\, e^{i(\vec{p} \cdot \vec{x} + E_p t)/\hbar}$$

($A$ and $B$ are just normalization constants).

There’s something fishy going on here: the $E = -E_p$ term corresponds to solutions with *negative* energy. That means treating the Klein-Gordon equation as the equation of motion for a single relativistic quantum particle cannot be the correct interpretation. If $\phi$ is viewed as a field, the negative energy solutions can be interpreted as antiparticle solutions; take a look at Section 8.1 of Sakurai’s “Modern Quantum Mechanics” for more on this interpretation. In any case, a single-particle interpretation of $\phi$ is incorrect.

We know from Einstein’s mass-energy equivalence *E* = mc^{2} that mass can be created if a suitable cost in energy is paid. If we scatter two pions, for example, we may expect the following process:

$$\pi + \pi \to \pi + \pi.$$

That’s not too exciting on the surface. But if the incoming pions have sufficiently high energy, we will have no choice but to consider the processes

$\pi + \pi \to \pi + \pi + \pi + \pi$ and $\pi + \pi \to \pi + \pi + \pi + \pi + \pi + \pi$, and so on,

because the energy of the incoming pions may be high enough to produce numerous daughter pions.

Generally, we have to worry about pair-production of particles whenever a particle is confined to a very small space, comparable to its Compton wavelength $\lambda_C = \hbar/mc$, where *m* is the particle’s mass. From the uncertainty relation $\Delta x\, \Delta p \geq \hbar/2$, if the particle is confined so that the uncertainty in its position is $\Delta x \leq L$, then $\Delta p \geq \hbar/2L$. From *E* = mc^{2}, the threshold on momentum for creating a particle is $p \sim mc$. So if $\Delta p$ approaches $mc$ as $\Delta x$ is squeezed, pair-production is a relevant problem.

Incidentally, this problem of pair-production might make it seem silly to study the hydrogen atom in the scrutinizing detail that non-relativistic quantum mechanics affords it. After all, the electron is confined to an *awfully* small space. But it turns out that the characteristic size of hydrogen’s orbitals is around the Bohr radius, $a_0$, which can be related to the Compton wavelength of an electron by the fine-structure constant $\alpha \approx 1/137$: $a_0 = \lambda_C/\alpha$. So $a_0$ is sufficiently large that pair-production doesn’t come into play, and we can use non-relativistic quantum mechanics to study the hydrogen atom.
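To put numbers on this comparison, here is a small arithmetic sketch. It assumes the reduced Compton wavelength ħ/mc and uses rounded CODATA values for the constants; the function names are illustrative.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg
ALPHA = 7.2973525693e-3  # fine-structure constant, about 1/137


def compton_wavelength(mass):
    """Reduced Compton wavelength hbar / (m c), in meters."""
    return HBAR / (mass * C)


def bohr_radius():
    """Bohr radius from the electron's Compton wavelength and alpha."""
    return compton_wavelength(M_E) / ALPHA


if __name__ == "__main__":
    print("electron Compton wavelength (m):", compton_wavelength(M_E))
    print("Bohr radius (m):               ", bohr_radius())
    print("ratio a0 / lambda_C:           ", 1 / ALPHA)
```

The electron in hydrogen is spread over a region roughly 137 times larger than the scale at which pair-production becomes important.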

Anyway, whatever quantum field theory we use, it must be constructed to account for creation and destruction of particles.

We run into another problem in single-particle quantum mechanics when we calculate the amplitude $U(t)$ (which represents how probable it is for a process to occur) for a particle to travel from a point $\vec{x}_0$ to $\vec{x}$ in a time *t*. Because the speed of light *c* is our universal speed limit, we would expect $U(t)$ to be zero for points which are separated by a distance farther than light could travel in a time *t*. In special relativity, the boundary of points which are accessible without exceeding *c* defines a *light cone*, shown in Figure 1.

**Figure 1.** For a particle or person at the origin, the speed limit *c* means that only points within the light cone are accessible. It is impossible to travel or send information to points outside the light cone. For you here on Earth, a spot on Jupiter one second in the future is outside your light cone. You cannot influence that spacetime point—no matter what.

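So if $|\vec{x} - \vec{x}_0| > ct$, the particle cannot possibly travel from $\vec{x}_0$ to $\vec{x}$ in a time *t*.

Returning to the problem of particles traveling outside their light cone: if we actually use the free-particle Hamiltonian $\mathbf{H} = \mathbf{p}^2/2m$ to calculate the amplitude for a particle to travel from $\vec{x}_0$ to $\vec{x}$, we start with

$$U(t) = \langle \vec{x} |\, e^{-i \mathbf{H} t/\hbar}\, | \vec{x}_0 \rangle.$$

Inserting an identity operator $\int d^3p\, |\vec{p}\rangle \langle\vec{p}|$ gives

$$U(t) = \int d^3p\, \langle \vec{x} |\, e^{-i \mathbf{p}^2 t/2m\hbar}\, | \vec{p} \rangle \langle \vec{p} | \vec{x}_0 \rangle,$$

and the exponential term acts on $|\vec{p}\rangle$ to give

$$U(t) = \int d^3p\, e^{-i |\vec{p}|^2 t/2m\hbar}\, \langle \vec{x} | \vec{p} \rangle \langle \vec{p} | \vec{x}_0 \rangle.$$

The inner-product term $\langle \vec{x} | \vec{p} \rangle$ is just the wave function for a momentum eigenstate, $\langle \vec{x} | \vec{p} \rangle = (2\pi\hbar)^{-3/2}\, e^{i \vec{p} \cdot \vec{x}/\hbar}$. After replacing $\langle \vec{p} | \vec{x}_0 \rangle$ similarly, we have our result:

$$U(t) = \frac{1}{(2\pi\hbar)^3} \int d^3p\, e^{-i |\vec{p}|^2 t/2m\hbar}\, e^{i \vec{p} \cdot (\vec{x} - \vec{x}_0)/\hbar} = \left( \frac{m}{2\pi i \hbar t} \right)^{3/2} e^{i m |\vec{x} - \vec{x}_0|^2/2\hbar t},$$

which is nonzero. That means that in the familiar framework of quantum mechanics, it’s possible for a particle to travel between any two points in any amount of time. Even if you replace the Hamiltonian operator in the calculation above with its relativistic form $\mathbf{H} = \sqrt{\mathbf{p}^2 c^2 + m^2 c^4}$, you still find that $U(t)$ is nonzero outside the light cone! That’s no good; faster-than-light travel has to be prohibited by QFT.

As an aside, notice that the $U(t)$ we calculated above corresponds to the propagation of a free particle between any two points in space. Strictly speaking, $|\vec{x}\rangle$ represents the final state of interest and $e^{-i \mathbf{H} t/\hbar} |\vec{x}_0\rangle$ represents the actual final state of the system which starts out localized at $\vec{x}_0$, so taking their inner product tells us the overlap of the desired final state with the actual final state. In other words, the squared magnitude of $U(t)$ gives the probability (density) of the propagation from $\vec{x}_0$ to $\vec{x}$. Our result for $U(t)$ is a complex 3D Gaussian centered around the initial point, whose overall magnitude falls off as time moves on. That makes some sense; a free particle initially localized at $\vec{x}_0$ will have a totally undetermined momentum (from the uncertainty principle), and so its motion will not have a preferred direction.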
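To make the light-cone problem concrete, the sketch below evaluates the standard closed form for the non-relativistic free-particle amplitude, $(m/2\pi i\hbar t)^{3/2}\, e^{i m d^2/2\hbar t}$ with $d = |\vec{x} - \vec{x}_0|$, for an electron at a point that light could not reach in the allotted time. The particular distance and time are arbitrary choices for the illustration, and `free_propagator` is a hypothetical helper name.

```python
import cmath
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 299792458.0         # speed of light, m/s
M_E = 9.1093837015e-31  # electron mass, kg


def free_propagator(distance, t, mass=M_E):
    """Non-relativistic amplitude for a free particle to cover `distance` in time t."""
    prefactor = (mass / (2j * math.pi * HBAR * t)) ** 1.5
    phase = mass * distance**2 / (2 * HBAR * t)
    return prefactor * cmath.exp(1j * phase)


if __name__ == "__main__":
    t = 1e-9        # one nanosecond
    distance = 1.0  # one meter; light covers only about 0.3 m in that time
    u = free_propagator(distance, t)
    print("outside the light cone:", distance > C * t)
    print("|U(t)| =", abs(u))
```

The magnitude does not depend on `distance` at all, which is the numerical face of the statement that the amplitude is nonzero everywhere.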

As a final point in the problem of applying special relativity to quantum mechanics, we’ll take a look at causality from the point of view of *operators*.

First, remember that spacelike-separated points in spacetime cannot have a causal influence on each other.

In non-relativistic quantum mechanics, every Hermitian operator is associated with an observable quantity (say, the component of angular momentum in the z-direction). If two such operators A and B commute, then any observer can measure the quantities associated with A and B simultaneously with well-defined results. This is not so for non-commuting observables such as position and momentum.
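To make "commute" concrete, here is a minimal sketch using 2×2 matrices for a spin-1/2 system. The matrices and the helper names (`matmul`, `commutator`, `commutes`) are illustrative assumptions, not something taken from the references; the factors of ħ/2 are dropped because they don't affect whether two operators commute.

```python
# Spin-1/2 operators up to a factor of hbar/2 (illustrative choice).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]
# Another operator that is diagonal in the same basis as SIGMA_Z.
DIAGONAL = [[2, 0], [0, 5]]


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def commutes(a, b):
    """True when every entry of [a, b] vanishes."""
    return all(entry == 0 for row in commutator(a, b) for entry in row)


print("sigma_z with a diagonal operator:", commutes(SIGMA_Z, DIAGONAL))
print("sigma_z with sigma_x:", commutes(SIGMA_Z, SIGMA_X))
print("[sigma_z, sigma_x] =", commutator(SIGMA_Z, SIGMA_X))
```

The first pair shares a set of eigenvectors, so both quantities can have definite values at once. The second pair does not, which is the matrix version of the position and momentum situation.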

Now think about two observers in spacelike-separated regions R_{1} and R_{2}. The experimenter in R_{1} is trying to measure the observable A_{1}, and the experimenter in R_{2} is trying to measure the observable A_{2}. If A_{1} and A_{2} commute, the experimenters can succeed.

But what if A_{1} and A_{2} do not commute? Well, then the outcomes of the measurements will depend on the *order* in which the experimenters make their respective measurements—as is necessarily the case with non-commuting observables—meaning that a measurement taking place in R_{1} *superluminally* affects what outcomes may be seen in R_{2}.

So QFT has a new requirement:

$$[A_1, A_2] = 0$$

for any observables A_{1} in R_{1} and A_{2} in R_{2}, where R_{1} and R_{2} are spacelike-separated regions. **Spacelike-separated operators must commute.**

This requirement squares nicely with a framework of fields rather than particles. In a field theory, we can treat the fields themselves as operators which depend on spacetime points like $x$ and $y$. Any observable quantities we want to measure will be built up of the fields. Then, if we want operators corresponding to spacelike-separated observables to commute, we just have to enforce that the fields themselves commute:

$$[\phi(x), \phi(y)] = 0$$

for any spacelike-separated points $x$ and $y$.

This is one of about four fundamental requirements for constructing a quantum field. The others deal with their behavior under spacetime translations and Lorentz transformations, as well as the specifics of how they handle creation and annihilation of particles. But that’s a story for another post! Quantum mechanics has totally failed its special relativity exams.

**Recommended Readings:**

[1] Sakurai, J. J.; Napolitano, Jim. *Modern Quantum Mechanics*. Chapters 3, 8.

[2] Peskin, Michael E.; Schroeder, Daniel V. *An Introduction to Quantum Field Theory*. Chapters 1, 2.

Some of the most dramatic and unintuitive experimental results in quantum mechanics have come from dedicated physicists’ insistence on pushing the bounds of what *should* be physically possible (though not all of them, as Henri Becquerel’s serendipitous discovery of radioactivity reminds us). Dedication to probing the strangeness of quantum physics has been paying off for over a hundred years.

One of the more recent and striking successes of quantum theory, first posited in 1993 by Avshalom Elitzur and Lev Vaidman of Tel Aviv University, is the idea that we can actually detect an object without using photons or any other particles to look at it. What in the world does that mean?

Well, imagine you’re tasked with determining whether a new (highly classified) special “quantum bomb” is operational or is a dud. This quantum bomb has two characteristics that make it extremely volatile:

- If the bomb is operational, it will explode when a single photon of light strikes it;
- If the bomb is a dud, it will not interact with photons in any way.

This may be an extremely strange and dangerous bomb, but you’ve found yourself before a problem which seems impossible: any light you shine on the bomb, perhaps to test it, will just cause an explosion. How can you possibly test that the bomb works without causing it to explode?

You need to perform an **interaction-free measurement**.

All familiar forms of measurement involve some physical interaction. When your eye or a camera sees an object, it is seeing photons which have interacted with the object you are observing—they have been scattered by its surface. It seems there’s just no way around having some sort of physical interaction involved in measuring something—be it a chemical reaction allowing you to smell, or absorption and reflection of photons on a basketball’s surface. (That’s not quite true, actually: some other touch-free measurements do exist. You can locate a nucleus by shooting protons at it and observing their deflection in the electric field surrounding the nucleus, and you can prove that your lost keys are under the sofa if you check everywhere else and don’t find them. In both of these cases, though, you have to know something about the object you’re trying to observe–like the fact that the keys must exist somewhere–in order to make a “measurement.”)

But a fascinating alternative form of measurement exists which requires no prior information whatsoever about the object to be detected. **With interaction-free measurements (IFMs), you can “see” the quantum bomb without even “looking”**, in the sense that you need not cause any photons to interact with the bomb in order to detect it. I’ll sketch out a couple of thought experiments that show how an IFM works.

Suppose you’ve set up an interferometer made up of two beam-splitters (each with 50% transmittance), two mirrors, and two photon detectors D_{1} and D_{2}, arranged as in Figure 1.

**Figure 1.** An incoming photon may traverse either path 1, path 2, or both paths through the interferometer, depending on the observation scheme.

The lengths of path 1 and path 2 are tuned to be exactly equal. If paths 1 and 2 are unobstructed, the wave-nature of incoming photons will cause interference between the two paths through the interferometer—analogous to the interference seen in Young’s double-slit experiment (where paths 1 and 2 would correspond to paths through each of the two slits, respectively). The photon detectors D_{1} and D_{2} are positioned so that D_{1} detects *all* the outgoing photons (like Young’s bright fringes) and D_{2} observes *no* photons at all (like Young’s dark fringes). If you want more persuasion to see why interference causes only one of the detectors to ever fire, take a look at the technical aside.

Following the description in the original paper by Elitzur and Vaidman [1], call the quantum state of a photon traveling horizontally $|1\rangle$ and that of a photon traveling vertically $|2\rangle$. A mirror will change the direction of a photon from horizontal to vertical (or vice versa), and introduce a phase shift to the state (check out page 6 of the problem set in Ref. [2] for an explanation of why the phase shift on reflection is $\pi/2$). So a mirror acts like:

$$|1\rangle \to i|2\rangle, \qquad |2\rangle \to i|1\rangle.$$

Additionally, a beam-splitter will produce a superposition of a horizontal state and a vertical state:

$$|1\rangle \to \frac{1}{\sqrt{2}}\left(|1\rangle + i|2\rangle\right), \qquad |2\rangle \to \frac{1}{\sqrt{2}}\left(|2\rangle + i|1\rangle\right).$$

The factors of $1/\sqrt{2}$ come from unitarity, meaning essentially that the total probability carried by the resulting state should be 1. Looking at Figure 2, follow each path through the interferometer and check the operation of each element with the operational descriptions above. You’ll see that if the upper- and lower-path lengths are tuned to be equal, the upper beam-splitter operates like

$$\frac{1}{\sqrt{2}}\left(i|2\rangle - |1\rangle\right) \to \frac{1}{2}\left(i|2\rangle - |1\rangle\right) - \frac{1}{2}\left(|1\rangle + i|2\rangle\right) = -|1\rangle,$$

which is just a horizontally-traveling photon state with a phase shift of $\pi$ relative to the incoming state. So the outgoing state has no vertical component, and D_{2} won’t observe any photons.

**Figure 2.** The states corresponding to photons in each branch of the interferometer cause constructive interference at D_{1} and destructive interference at D_{2}.

So for any photon you let into the interferometer, you know detector D_{1} is going to fire.
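If you’d rather let a computer push the amplitudes around, here is a minimal sketch of the empty interferometer. It assumes the Elitzur-Vaidman conventions (each reflection contributes a factor of *i*), stores a state as a pair of complex amplitudes for the horizontal and vertical directions, and takes D_{1} to catch the horizontal output. The function names are made up for illustration.

```python
import math

S = 1 / math.sqrt(2)  # amplitude factor for a 50/50 beam-splitter


def beam_splitter(state):
    """|1> -> (|1> + i|2>)/sqrt(2),  |2> -> (|2> + i|1>)/sqrt(2)."""
    a1, a2 = state
    return (S * (a1 + 1j * a2), S * (a2 + 1j * a1))


def mirror(state):
    """|1> -> i|2>,  |2> -> i|1>  (one mirror sits in each arm)."""
    a1, a2 = state
    return (1j * a2, 1j * a1)


# One photon enters traveling horizontally: amplitude 1 for |1>, 0 for |2>.
incoming = (1, 0)
out = beam_splitter(mirror(beam_splitter(incoming)))

p_d1 = abs(out[0]) ** 2  # horizontal output, assumed to feed D1
p_d2 = abs(out[1]) ** 2  # vertical output, assumed to feed D2
print("outgoing amplitudes:", out)
print("P(D1) =", round(p_d1, 12), " P(D2) =", round(p_d2, 12))
```

Up to floating-point rounding, the outgoing state is the incoming one multiplied by −1, so all of the probability lands on D_{1}.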

Now suppose you place your quantum bomb somewhere in path 2. If the bomb is operational, no photons can possibly complete path 2 without exploding the bomb, so no interference can occur. Photons will behave more like particles. If the bomb is a dud, it doesn’t interact with the photon at all, so the situation is just like it was when the bomb wasn’t obstructing path 2.

What happens to an incoming photon if there is an operational bomb in path 2? At the first beam-splitter, a photon has a probability of ½ to take the upper path and strike the bomb. If the photon takes the lower path, it once again has a probability of ½ to be either reflected (and sent to D_{1}) or transmitted (and sent to D_{2}) by the second beam-splitter. So about 50% of the photons will take path 2 and hit the bomb, and the other 50% will be evenly divided between reaching D_{1} and D_{2}. This means that if we fire an individual photon into the setup containing a functional bomb, D_{2} will fire with a probability of ¼. If D_{2} does fire (which, remember, cannot happen unless the bomb is operational), we have determined that an operational bomb is present in the system *without exploding it*. If D_{1} fires, we haven’t learned anything, since D_{1} will always fire if the bomb is a dud and will sometimes fire if the bomb is operational. That’s great, but we’re still stuck with blowing up the bomb 50% of the time and learning nothing 25% of the time. If you repeated the experiment 100 times with 100 operational bombs, about 50 bombs would explode, about 25 operational bombs would be found to be present without exploding, and about 25 cases would be totally mysterious. We can force physics to do better than that.
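The 50/25/25 split can be checked with the same amplitude bookkeeping. This is a sketch under stated assumptions: a state is a pair of complex amplitudes for the horizontal and vertical directions, each reflection contributes a factor of *i* (the Elitzur-Vaidman convention), and the bomb sits in whichever arm carries the vertical component after the first beam-splitter. The name `run_with_bomb` is hypothetical.

```python
import math

S = 1 / math.sqrt(2)  # amplitude factor for a 50/50 beam-splitter


def beam_splitter(state):
    """|1> -> (|1> + i|2>)/sqrt(2),  |2> -> (|2> + i|1>)/sqrt(2)."""
    a1, a2 = state
    return (S * (a1 + 1j * a2), S * (a2 + 1j * a1))


def mirror(state):
    """|1> -> i|2>,  |2> -> i|1>."""
    a1, a2 = state
    return (1j * a2, 1j * a1)


def run_with_bomb(incoming=(1, 0)):
    """Outcome probabilities when an operational bomb blocks one arm."""
    a1, a2 = beam_splitter(incoming)
    p_explode = abs(a2) ** 2  # photon is found in the blocked arm
    # Keep the surviving branch unnormalized so it carries its probability.
    survivor = (a1, 0)
    out = beam_splitter(mirror(survivor))
    return {
        "explosion": p_explode,
        "D1": abs(out[0]) ** 2,
        "D2": abs(out[1]) ** 2,
    }


result = run_with_bomb()
for outcome, probability in result.items():
    print(f"{outcome}: {probability:.2f}")
```

With the bomb absorbing one branch, nothing is left to cancel the amplitude heading toward D_{2}, which is why that detector can now fire.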

Mark A. Kasevich of Stanford University proposed a method for successfully making such an interaction-free measurement much more than 25% of the time. An illustration of the idea uses two polarizing beam-splitters, which transmit horizontally-polarized light and reflect vertically-polarized light, a polarization rotator, and two switchable mirrors arranged as shown in Figure 3. A horizontally (H)-polarized photon is allowed to enter the setup through a switchable mirror at the bottom left. Let’s say the polarization rotator rotates the polarization by 15° toward the vertical.

**Figure 3.** An incoming H-polarized photon cycles through the outer mirrors in the setup, passing through a polarization rotator and a beam-splitter-interferometer (consisting of the two beam-splitters and two mirrors in the upper-right corner of the diagram) during each lap. After a predetermined number of cycles (here, N=6), the photon is allowed to leave the setup through the switchable mirror in the bottom right. The polarization of the outgoing photon is then observed.

First, what happens to an incoming H-polarized photon entering the system when there’s no bomb involved? The photon (polarization) will be rotated 15° toward the vertical, then follow both arms of the beam-splitter-interferometer, and then repeat the cycle until it’s rotated to be fully V-polarized. Here the beam-splitters separate and recombine the horizontal and vertical components of the photon, but the interference effect is exactly analogous to that of the interferometer in Figure 1. After 90°/15° = 6 cycles, we switch off the switchable mirror and let the photon exit. If we measure its polarization, we will always observe vertical polarization.
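Here is a small sketch of the no-bomb case, tracking only the photon’s polarization as a pair of (H, V) amplitudes. It assumes the unobstructed interferometer recombines the two components without changing the state (equal path lengths), so each lap is just one more rotation. `rotate` and `no_bomb_cycles` are illustrative names.

```python
import math


def rotate(polarization, angle_deg):
    """Rotate an (H, V) amplitude pair by angle_deg toward the vertical."""
    h, v = polarization
    t = math.radians(angle_deg)
    return (h * math.cos(t) - v * math.sin(t),
            h * math.sin(t) + v * math.cos(t))


def no_bomb_cycles(n_cycles, step_deg):
    """Polarization after n_cycles laps with nothing blocking either arm."""
    pol = (1.0, 0.0)  # the photon starts out H-polarized
    for _ in range(n_cycles):
        pol = rotate(pol, step_deg)
        # Assumed: the empty interferometer splits and recombines the
        # H and V components with no net effect on the polarization.
    return pol


h, v = no_bomb_cycles(6, 15)
print("P(H) =", round(h ** 2, 12), " P(V) =", round(v ** 2, 12))
```

Six rotations of 15° add up to 90°, so the probability of measuring vertical polarization comes out as 1 up to rounding.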

Now for the interesting part: what if an operational bomb is placed in the lower leg of the beam-splitter-interferometer? An incoming H-polarized photon will be rotated 15° toward the vertical before coming upon the upper beam-splitter. The photon cannot interfere with itself, since one path is obstructed, so it must choose either the upper or the lower path. What is the probability for each choice? Well, Malus’ Law tells us the photon has a probability $\cos^2(15^\circ) \approx 0.933$ of being transmitted (and thereby becoming H-polarized) and $\sin^2(15^\circ) \approx 0.067$ of being reflected. If the photon is reflected, it causes the bomb to explode. If the photon is transmitted, however, it must be H-polarized (since it passed an H-polarizing splitter), so it passes through the lower beam-splitter with certainty. The process repeats, and let’s say we let the photon exit at the switchable mirror after 6 cycles. The odds that the photon survives through 6 cycles without striking the bomb are just $[\cos^2(15^\circ)]^6 = \cos^{12}(15^\circ) \approx 0.66$. If the photon doesn’t survive 6 cycles, the whole experiment is reduced to rubble. If we measure the outgoing photon’s polarization, it must be horizontal, since it passed an H-polarizing splitter. But remember that if no bomb were involved (or if the bomb were a dud and couldn’t obstruct the lower path of the beam-splitter-interferometer), the outgoing photon’s polarization would always be vertical. So if we place a bomb in the setup and perform the experiment, we will get one of three outcomes:

1. An explosion.
2. An H-polarized outgoing photon, meaning the bomb is operational.
3. A V-polarized outgoing photon, meaning the bomb is a dud.

If we assume that the bomb is operational, the respective probabilities of these outcomes are:

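1. Explosion: $1 - \cos^{12}(15^\circ) \approx 0.34$.
2. H-polarized outgoing photon: $\cos^{12}(15^\circ) \approx 0.66$.
3. V-polarized outgoing photon: 0.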

Case 2 is the successful IFM, and its probability can theoretically be made arbitrarily close to 1 by increasing the number of cycles N (which means decreasing the polarization rotator’s rotation angle from 15° to 90°/N). If the photon could run 1000 cycles through the setup, the probability of a successful IFM climbs to over 99.75%.
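To see how quickly the odds improve, here is a short sketch that evaluates the survival probability for a few values of N. It assumes ideal, lossless optics and a rotation of exactly 90°/N per cycle; `survival_probability` is just an illustrative name.

```python
import math


def survival_probability(n_cycles):
    """Chance the photon is transmitted on every one of n_cycles laps.

    Each lap rotates the polarization by 90/N degrees, and Malus' Law
    gives cos^2 of that angle as the chance of missing the bomb.
    """
    step = math.pi / (2 * n_cycles)  # 90 degrees / N, in radians
    return math.cos(step) ** (2 * n_cycles)


for n in (6, 20, 100, 1000):
    p = survival_probability(n)
    print(f"N = {n:4d}: successful IFM = {p:.4f}, explosion = {1 - p:.4f}")
```

For N = 6 this gives roughly 0.66, and for N = 1000 it comes out just above 0.9975. For large N the explosion probability shrinks roughly like π²/4N, so every doubling of the number of cycles about halves the risk.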

The fact that nature allows us to push the probability of an interaction-free measurement arbitrarily close to 1 (in theory) is astonishing—it’s contrary to all of our everyday experience with making measurements. It means that the universe has at least one way of allowing us to glean information which was previously thought inaccessible. The Scientific American article [3] talks about potential applications of IFMs, like taking X-ray images without exposing the subject to the deleterious radiation or taking photographs of super-cold (and fragile) Bose-Einstein condensate states which would be destroyed if struck by a photon. The ability to make IFMs with X-rays is still really impractical, though, given that it’s not easy to engineer convenient optical components that work well at such high energies. But even if these ideas are still distant in practical terms, the conceptual underpinnings are there: we at least know how to *see* without even *looking*.

**References and Further Reading**

[1] Elitzur, Avshalom; Vaidman, Lev. “Quantum Mechanical Interaction-Free Measurements.” *arXiv:hep-th/9305002v2* (1993).

[2] McDonald, Kirk T. “Ph501 Electrodynamics Problem Set 6.” p. 6. Princeton University. 2001. http://puhep1.princeton.edu/~kirkmcd/examples/ph501set6.pdf.

[3] Kwiat, Paul; Weinfurter, Harald; Zeilinger, Anton. “Quantum Seeing in the Dark.” Scientific American, pp. 72-78. November 1996.

[4] Kwiat, Paul; Weinfurter, Harald; Zeilinger, Anton; Kasevich, Mark. “Interaction-Free Measurement.” Phys. Rev. Lett. V. 74, No. 24, pp. 4763-4766. 12 June 1995.

[5] DeWeerd, Alan J. “Interaction-free measurement.” Am. J. Physics 70 (3), pp. 272-275. March 2002.