You are viewing the html version of Simple Nature, by Benjamin Crowell. This version is only designed for casual browsing, and may have some formatting problems. For serious reading, you want the Adobe Acrobat version.

Table of Contents

Section 7.1 - Time Is Not Absolute
Section 7.2 - Distortion of Space and Time
Section 7.3 - Dynamics
Section 7.4 - General Relativity (optional)

Chapter 7. Relativity

7.1 Time Is Not Absolute


a / This Global Positioning System (GPS) receiver, built into a smartphone attached to a bike's handlebar, depends on Einstein's theory of relativity. Time flows at a different rate aboard a GPS satellite than it does on the bike, and the GPS software has to take this into account.


b / The clock took up two seats, and two tickets were bought for it under the name of “Mr. Clock.”

When Einstein first began to develop the theory of relativity, around 1905, the only real-world observations he could draw on were ambiguous and indirect. Today, the evidence is part of everyday life. For example, every time you use a GPS receiver, a, you're using Einstein's theory of relativity. Somewhere between 1905 and today, technology became good enough to allow conceptually simple experiments that students in the early 20th century could only discuss in terms like “Imagine that we could...” A good jumping-on point is 1971. In that year, J.C. Hafele and R.E. Keating brought atomic clocks aboard commercial airliners, b, and went around the world, once from east to west and once from west to east. Hafele and Keating observed that there was a discrepancy between the times measured by the traveling clocks and the times measured by similar clocks that stayed home at the U.S. Naval Observatory in Washington. The east-going clock lost time, ending up off by \(-59\pm10\) nanoseconds, while the west-going one gained \(273\pm7\) ns.

7.1.1 The correspondence principle

This establishes that time doesn't work the way Newton believed it did when he wrote that “Absolute, true, and mathematical time, of itself, and from its own nature flows equably without regard to anything external...” We are used to thinking of time as absolute and universal, so it is disturbing to find that it can flow at a different rate for observers in different frames of reference. Nevertheless, the effects that Hafele and Keating observed were small. This makes sense: Newton's laws have already been thoroughly tested by experiments under a wide variety of conditions, so a new theory like relativity must agree with Newton's to a good approximation, within the Newtonian theory's realm of applicability. This requirement of backward-compatibility is known as the correspondence principle.


c / Newton's laws do not distinguish past from future. The football could travel in either direction while obeying Newton's laws.

7.1.2 Causality

It's also reassuring that the effects on time were small compared to the three-day lengths of the plane trips. There was therefore no opportunity for paradoxical scenarios such as one in which the east-going experimenter arrived back in Washington before he left and then convinced himself not to take the trip. A theory that maintains this kind of orderly relationship between cause and effect is said to satisfy causality.

Causality is like a water-hungry front-yard lawn in Los Angeles: we know we want it, but it's not easy to explain why. Even in plain old Newtonian physics, there is no clear distinction between past and future. In figure c, number 18 throws the football to number 25, and the ball obeys Newton's laws of motion. If we took a video of the pass and played it backward, we would see the ball flying from 25 to 18, and Newton's laws would still be satisfied. Nevertheless, we have a strong psychological impression that there is a forward arrow of time. I can remember what the stock market did last year, but I can't remember what it will do next year. Joan of Arc's military victories against England caused the English to burn her at the stake; it's hard to accept that Newton's laws provide an equally good description of a process in which her execution in 1431 caused her to win a battle in 1429. There is no consensus at this point among physicists on the origin and significance of time's arrow, and for our present purposes we don't need to solve this mystery. Instead, we merely note the empirical fact that, regardless of what causality really means and where it really comes from, its behavior is consistent. Specifically, experiments show that if an observer in a certain frame of reference observes that event A causes event B, then observers in other frames agree that A causes B, not the other way around. This is merely a generalization about a large body of experimental results, not a logically necessary assumption. If Keating had gone around the world and arrived back in Washington before he left, it would have disproved this statement about causality.


d / All three clocks are moving to the east. Even though the west-going plane is moving to the west relative to the air, the air is moving to the east due to the earth's rotation.


f / The correspondence principle requires that the relativistic distortion of time become small for small velocities.

7.1.3 Time distortion arising from motion and gravity

Hafele and Keating were testing specific quantitative predictions of relativity, and they verified them to within their experiment's error bars. Let's work backward instead, and inspect the empirical results for clues as to how time works.

The two traveling clocks experienced effects in opposite directions, and this suggests that the rate at which time flows depends on the motion of the observer. The east-going clock was moving in the same direction as the earth's rotation, so its velocity relative to the earth's center was greater than that of the clock that remained in Washington, while the west-going clock's velocity was correspondingly reduced. The fact that the east-going clock fell behind, and the west-going one got ahead, shows that the effect of motion is to make time go more slowly. This effect of motion on time was predicted by Einstein in his original 1905 paper on relativity, written when he was 26.

If this had been the only effect in the Hafele-Keating experiment, then we would have expected to see effects on the two flying clocks that were equal in size. Making up some simple numbers to keep the arithmetic transparent, suppose that the earth rotates from west to east at 1000 km/hr, and that the planes fly at 300 km/hr. Then the speed of the clock on the ground is 1000 km/hr, the speed of the clock on the east-going plane is 1300 km/hr, and that of the west-going clock 700 km/hr. Since the speeds of 700, 1000, and 1300 km/hr have equal spacing on either side of 1000, we would expect the discrepancies of the moving clocks relative to the one in the lab to be equal in size but opposite in sign.


e / A graph showing the time difference between two atomic clocks. One clock was kept at Mitaka Observatory, at 58 m above sea level. The other was moved back and forth to a second observatory, Norikura Corona Station, at the peak of the Norikura volcano, 2876 m above sea level. The plateaus on the graph are data from the periods when the clocks were compared side by side at Mitaka. The difference between one plateau and the next shows a gravitational effect on the rate of flow of time, accumulated during the period when the mobile clock was at the top of Norikura. Cf. problem 25, p. 444.

In fact, the two effects are unequal in size: \(-59\) ns and 273 ns. This implies that there is a second effect involved, simply due to the planes' being up in the air. This was verified more directly in a 1978 experiment by Iijima and Fujiwara, figure e, in which identical atomic clocks were kept at rest at the top and bottom of a mountain near Tokyo. This experiment, unlike the Hafele-Keating one, isolates one effect on time, the gravitational one: time's rate of flow increases with height in a gravitational field. Einstein didn't figure out how to incorporate gravity into relativity until 1915, after much frustration and many false starts. The simpler version of the theory without gravity is known as special relativity, the full version as general relativity. We'll restrict ourselves to special relativity until section 7.4, and that means that what we want to focus on right now is the distortion of time due to motion, not gravity.

We can now see in more detail how to apply the correspondence principle. The behavior of the three clocks in the Hafele-Keating experiment shows that the amount of time distortion increases as the speed of the clock's motion increases. Newton lived in an era when the fastest mode of transportation was a galloping horse, and the best pendulum clocks would accumulate errors of perhaps a minute over the course of several days. A horse is much slower than a jet plane, so the distortion of time would have had a relative size of only \(\sim10^{-15}\) --- much smaller than the clocks were capable of detecting. At the speed of a passenger jet, the effect is about \(10^{-12}\), and state-of-the-art atomic clocks in 1971 were capable of measuring that. A GPS satellite travels much faster than a jet airplane, and the effect on the satellite turns out to be \(\sim10^{-10}\). The general idea here is that all physical laws are approximations, and approximations aren't simply right or wrong in different situations. Approximations are better or worse in different situations, and the question is whether a particular approximation is good enough in a given situation to serve a particular purpose. The faster the motion, the worse the Newtonian approximation of absolute time. Whether the approximation is good enough depends on what you're trying to accomplish. The correspondence principle says that the approximation must have been good enough to explain all the experiments done in the centuries before Einstein came up with relativity.
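The orders of magnitude quoted above can be checked with a rough sketch. It assumes the low-speed estimate \(\gamma-1\approx v^2/2c^2\), which follows from the formula for \(\gamma\) given in section 7.2.2, and it uses round-number speeds of my own choosing for the horse, the jet, and the satellite; none of these speeds come from the text.

```python
C = 299_792_458.0  # speed of light in m/s

def relative_time_distortion(v):
    """Low-speed estimate of gamma - 1, namely v^2 / (2 c^2)."""
    return v**2 / (2 * C**2)

# Assumed, illustrative speeds in m/s (not taken from the text).
speeds = {
    "galloping horse": 10.0,
    "passenger jet": 250.0,
    "GPS satellite": 3900.0,
}

for name, v in speeds.items():
    print(f"{name}: {relative_time_distortion(v):.1e}")
```

With these assumed speeds the three results land near \(10^{-15}\), \(10^{-12}\), and \(10^{-10}\), matching the estimates in the paragraph to within the rounding implied by the \(\sim\) signs.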

By the way, don't get an inflated idea of the importance of the Hafele-Keating experiment. Special relativity had already been confirmed by a vast and varied body of experiments decades before 1971. The only reason I'm giving such a prominent role to this experiment, which was actually more important as a test of general relativity, is that it is conceptually very direct.

7.2 Distortion of Space and Time


a / Two events are given as points on a graph of position versus time. Joan of Arc helps to restore Charles VII to the throne. At a later time and a different position, Joan of Arc is sentenced to death.


b / A change of units distorts an \(x\)-\(t\) graph. This graph depicts exactly the same events as figure a. The only change is that the \(x\) and \(t\) coordinates are measured using different units, so the grid is compressed in \(t\) and expanded in \(x\).


c / A convention we'll use to represent a distortion of time and space.


d / A Galilean version of the relationship between two frames of reference. As in all such graphs in this chapter, the original coordinates, represented by the gray rectangle, have a time axis that goes to the right, and a position axis that goes straight up.


e / A transformation that leads to disagreements about whether two events occur at the same time and place. This is not just a matter of opinion. Either the arrow hit the bull's-eye or it didn't.


f / A nonlinear transformation.


h / In the units that are most convenient for relativity, the transformation has symmetry about a 45-degree diagonal line.


i / Interpretation of the Lorentz transformation. The slope indicated in the figure gives the relative velocity of the two frames of reference. Events A and B that were simultaneous in frame 1 are not simultaneous in frame 2, where event A occurs to the right of the \(t=0\) line represented by the left edge of the grid, but event B occurs to its left.

7.2.1 The Lorentz transformation

Relativity says that when two observers are in different frames of reference, each observer considers the other one's perception of time to be distorted. We'll also see that something similar happens to their observations of distances, so both space and time are distorted. What exactly is this distortion? How do we even conceptualize it?

The idea isn't really as radical as it might seem at first. We can visualize the structure of space and time using a graph with position and time on its axes. These graphs are familiar by now, but we're going to look at them in a slightly different way. Before, we used them to describe the motion of objects. The grid underlying the graph was merely the stage on which the actors played their parts. Now the background comes to the foreground: it's time and space themselves that we're studying. We don't necessarily need to have a line or a curve drawn on top of the grid to represent a particular object. We may, for example, just want to talk about events, depicted as points on the graph as in figure a. A distortion of the Cartesian grid underlying the graph can arise for perfectly ordinary reasons that Isaac Newton would have readily accepted. For example, we can simply change the units used to measure time and position, as in figure b.

We're going to have quite a few examples of this type, so I'll adopt the convention shown in figure c for depicting them. Figure c summarizes the relationship between figures a and b in a more compact form. The gray rectangle represents the original coordinate grid of figure a, while the grid of black lines represents the new version from figure b. Omitting the grid from the gray rectangle makes the diagram easier to decode visually.

Our goal of unraveling the mysteries of special relativity amounts to nothing more than finding out how to draw a diagram like c in the case where the two different sets of coordinates represent measurements of time and space made by two different observers, each in motion relative to the other. Galileo and Newton thought they knew the answer to this question, but their answer turned out to be only approximately right. To avoid repeating the same mistakes, we need to clearly spell out what we think are the basic properties of time and space that will be a reliable foundation for our reasoning. I want to emphasize that there is no purely logical way of deciding on this list of properties. The ones I'll list are simply a summary of the patterns observed in the results from a large body of experiments. Furthermore, some of them are only approximate. For example, property 1 below is only a good approximation when the gravitational field is weak, so it is a property that applies to special relativity, not to general relativity.

Experiments show that:

  1. No point in time or space has properties that make it different from any other point.
  2. Likewise, all directions in space have the same properties.
  3. Motion is relative, i.e., all inertial frames of reference are equally valid.
  4. Causality holds, in the sense described in section 7.1.2.
  5. Time depends on the state of motion of the observer.

Most of these are not very subversive. Properties 1 and 2 date back to the time when Galileo and Newton started applying the same universal laws of motion to the solar system and to the earth; this contradicted Aristotle, who believed that, for example, a rock would naturally want to move in a certain special direction (down) in order to reach a certain special location (the earth's surface). Property 3 is the reason that Einstein called his theory “relativity,” but Galileo and Newton believed exactly the same thing to be true, as dramatized by Galileo's run-in with the Church over the question of whether the earth could really be in motion around the sun. Property 4 would probably surprise most people only because it asserts in such a weak and specialized way something that they feel deeply must be true. The only really strange item on the list is 5, but the Hafele-Keating experiment forces it upon us.

If it were not for property 5, we could imagine that figure d would give the correct transformation between frames of reference in motion relative to one another. Let's say that observer 1, whose grid coincides with the gray rectangle, is a hitch-hiker standing by the side of a road. Event A is a raindrop hitting his head, and event B is another raindrop hitting his head. He says that A and B occur at the same location in space. Observer 2 is a motorist who drives by without stopping; to him, the passenger compartment of his car is at rest, while the asphalt slides by underneath. He says that A and B occur at different points in space, because during the time between the first raindrop and the second, the hitch-hiker has moved backward. On the other hand, observer 2 says that events A and C occur in the same place, while the hitch-hiker disagrees. The slope of the grid-lines is simply the velocity of each observer relative to the other.

Figure d has familiar, comforting, and eminently sensible behavior, but it also happens to be wrong, because it violates property 5. The distortion of the coordinate grid has only moved the vertical lines up and down, so both observers agree that events like B and C are simultaneous. If this were really the way things worked, then all observers could synchronize all their clocks with one another once and for all, and the clocks would never get out of sync. This contradicts the results of the Hafele-Keating experiment, in which all three clocks were initially synchronized in Washington, but later went out of sync because of their different states of motion.
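The Galilean picture of figure d can be written out as a small sketch. The coordinates and the velocity below are made-up numbers chosen only to mimic the hitch-hiker story, and the function name is hypothetical.

```python
def galilean(t, x, v):
    """Galilean change of frame: positions shift by v*t, time is left alone."""
    return t, x - v * t

v = 1.5  # assumed velocity of the motorist relative to the hitch-hiker

# Events as (t, x) in the hitch-hiker's frame; values are illustrative.
event_a = (0.0, 0.0)    # first raindrop
event_b = (2.0, 0.0)    # second raindrop, same place for the hitch-hiker
event_c = (2.0, 2.0 * v)  # chosen to be at the motorist's location at t = 2

for name, (t, x) in {"A": event_a, "B": event_b, "C": event_c}.items():
    print(name, galilean(t, x, v))
```

In the motorist's frame, B has moved backward while C shares A's position, yet B and C still carry the same time coordinate. That unchanged time coordinate is exactly the feature that conflicts with property 5.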

It might seem as though we still had a huge amount of wiggle room available for the correct form of the distortion. It turns out, however, that properties 1-5 are sufficient to prove that there is only one answer, which is the one found by Einstein in 1905. To see why this is, let's work by a process of elimination.

Figure e shows a transformation that might seem at first glance to be as good a candidate as any other, but it violates property 3, that motion is relative, for the following reason. In observer 2's frame of reference, some of the grid lines cross one another. This means that observers 1 and 2 disagree on whether or not certain events are the same. For instance, suppose that event A marks the arrival of an arrow at the bull's-eye of a target, and event B is the location and time when the bull's-eye is punctured. Events A and B occur at the same location and at the same time. If one observer says that A and B coincide, but another says that they don't, we have a direct contradiction. Since the two frames of reference in figure e give contradictory results, one of them is right and one is wrong. This violates property 3, because all inertial frames of reference are supposed to be equally valid. To avoid problems like this, we clearly need to make sure that none of the grid lines ever cross one another.

The next type of transformation we want to kill off is shown in figure f, in which the grid lines curve, but never cross one another. The trouble with this one is that it violates property 1, the uniformity of time and space. The transformation is unusually “twisty” at A, whereas at B it's much smoother. This can't be correct, because the transformation is only supposed to depend on the relative state of motion of the two frames of reference, and that given information doesn't single out a special role for any particular point in spacetime. If, for example, we had one frame of reference rotating relative to the other, then there would be something special about the axis of rotation. But we're only talking about inertial frames of reference here, as specified in property 3, so we can't have rotation; each frame of reference has to be moving in a straight line at constant speed. For frames related in this way, there is nothing that could single out an event like A for special treatment compared to B, so transformation f violates property 1.

The examples in figures e and f show that the transformation we're looking for must be linear, meaning that it must transform lines into lines, and furthermore that it has to take parallel lines to parallel lines. Einstein wrote in his 1905 paper that “... on account of the property of homogeneity [property 1] which we ascribe to time and space, the [transformation] must be linear.”1 Applying this to our diagrams, the original gray rectangle, which is a special type of parallelogram containing right angles, must be transformed into another parallelogram. There are three types of transformations, figure g, that have this property. Case I is the Galilean transformation of figure d on page 386, which we've already ruled out.


g / Three types of transformations that preserve parallelism. Their distinguishing feature is what they do to simultaneity, as shown by what happens to the left edge of the original rectangle. In I, the left edge remains vertical, so simultaneous events remain simultaneous. In II, the left edge turns counterclockwise. In III, it turns clockwise.

Case II can also be discarded. Here every point on the grid rotates counterclockwise. What physical parameter would determine the amount of rotation? The only thing that could be relevant would be \(v\), the velocity of the two frames of reference relative to one another. But if the angle of rotation were proportional to \(v\), then for large enough velocities the grid would have left and right reversed, and this would violate property 4, causality: one observer would say that event A caused a later event B, but another observer would say that B came first and caused A.

The only remaining possibility is case III, which I've redrawn in figure h with a couple of changes. This is the one that Einstein predicted in 1905. The transformation is known as the Lorentz transformation, after Hendrik Lorentz (1853-1928), who partially anticipated Einstein's work, without arriving at the correct interpretation. The distortion is a kind of smooshing and stretching, as suggested by the hands. Also, we've already seen in figures a-c on page 385 that we're free to stretch or compress everything as much as we like in the horizontal and vertical directions, because this simply corresponds to choosing different units of measurement for time and distance. In figure h I've chosen units that give the whole drawing a convenient symmetry about a 45-degree diagonal line. Ordinarily it wouldn't make sense to talk about a 45-degree angle on a graph whose axes had different units. But in relativity, the symmetric appearance of the transformation tells us that space and time ought to be treated on the same footing, and measured in the same units.

As in our discussion of the Galilean transformation, slopes are interpreted as velocities, and the slope of the near-horizontal lines in figure i is interpreted as the relative velocity of the two observers. The difference between the Galilean version and the relativistic one is that now there is smooshing happening from the other side as well. Lines that were vertical in the original grid, representing simultaneous events, now slant over to the right. This tells us that, as required by property 5, different observers do not agree on whether events that occur in different places are simultaneous. The Hafele-Keating experiment tells us that this non-simultaneity effect is fairly small, even when the velocity is as big as that of a passenger jet, and this is what we would have anticipated by the correspondence principle. The way that this is expressed in the graph is that if we pick the time unit to be the second, then the distance unit turns out to be hundreds of thousands of miles. In these units, the velocity of a passenger jet is an extremely small number, so the slope \(v\) in figure i is extremely small, and the amount of distortion is tiny --- it would be much too small to see on this scale.

The only thing left to determine about the Lorentz transformation is the size of the transformed parallelogram relative to the size of the original one. Although the drawing of the hands in figure h may suggest that the grid deforms like a framework made of rigid coat-hanger wire, that is not the case. If you look carefully at the figure, you'll see that the edges of the smooshed parallelogram are actually a little longer than the edges of the original rectangle. In fact what stays the same is not lengths but areas, as proved in the caption to figure j.
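As a numerical illustration of the equal-area claim, here is a sketch that assumes the standard algebraic form of the Lorentz transformation in units where distance and time share the same unit, \(t'=\gamma(t-vx)\) and \(x'=\gamma(x-vt)\), with the \(\gamma\) of section 7.2.2. The text itself works graphically, so this algebraic form and its sign convention are assumptions of the sketch, as is the sample velocity.

```python
import math

def gamma(v):
    """Gamma factor in units where velocities are unitless (|v| < 1)."""
    return 1.0 / math.sqrt(1.0 - v**2)

def lorentz(t, x, v):
    """Assumed standard form: t' = gamma (t - v x), x' = gamma (x - v t)."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

def area(p, q):
    """Area of the parallelogram spanned by the 2-D vectors p and q."""
    return abs(p[0] * q[1] - p[1] * q[0])

v = 0.6  # illustrative velocity
edge_t = lorentz(1.0, 0.0, v)  # image of the unit square's edge along t
edge_x = lorentz(0.0, 1.0, v)  # image of the unit square's edge along x

print("area:", area(edge_t, edge_x))         # stays close to 1
print("edge length:", math.hypot(*edge_t))   # longer than the original 1
print("t' of two events with t = 0:", lorentz(0.0, 0.0, v)[0], edge_x[0])
```

The edges of the transformed parallelogram come out longer than 1 while the area stays at 1, and two events that share \(t=0\) in the original frame end up with different values of \(t'\).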


j / Proof that Lorentz transformations don't change area: We first subject a square to a transformation with velocity \(v\), and this increases its area by a factor \(R(v)\), which we want to prove equals 1. We chop the resulting parallelogram up into little squares and finally apply a \(-v\) transformation; this changes each little square's area by a factor \(R(-v)\), so the whole figure's area is also scaled by \(R(-v)\). The final result is to restore the square to its original shape and area, so \(R(v)R(-v)=1\). But \(R(v)=R(-v)\) by property 2 of spacetime on page 385, which states that all directions in space have the same properties, so \(R(v)=1\).


k / The \(\gamma\) factor.


l / The ruler is moving in frame 1, represented by a square, but at rest in frame 2, shown as a parallelogram. Each picture of the ruler is a snapshot taken at a certain moment as judged according to frame 2's notion of simultaneity. An observer in frame 1 judges the ruler's length instead according to frame 1's definition of simultaneity, i.e., using points that are lined up vertically on the graph. The ruler appears shorter in the frame in which it is moving. As proved in figure m, the length contracts from \(L\) to \(L/\gamma\).


o / Muons accelerated to nearly \(c\) undergo radioactive decay much more slowly than they would according to an observer at rest with respect to the muons. The first two data-points (unfilled circles) were subject to large systematic errors.

7.2.2 The \(\gamma\) factor

With a little algebra and geometry (homework problem 7), one can use the equal-area property to show that the factor \(\gamma\) (Greek letter gamma) defined in figure k is given by the equation
\[\begin{equation*} \gamma = \frac{1}{\sqrt{1-v^2}} . \end{equation*}\]
If you've had good training in physics, the first thing you probably think when you look at this equation is that it must be nonsense, because its units don't make sense. How can we take something with units of velocity squared, and subtract it from a unitless 1? But remember that this is expressed in our special relativistic units, in which the same units are used for distance and time. In this system, velocities are always unitless. This sort of thing happens frequently in physics. For instance, before James Joule discovered conservation of energy, nobody knew that heat and mechanical energy were different forms of the same thing, so instead of measuring them both in units of joules as we would do now, they measured heat in one unit (such as calories) and mechanical energy in another (such as foot-pounds). In ordinary metric units, we just need an extra conversion factor \(c\), and the equation becomes
\[\begin{equation*} \gamma = \frac{1}{\sqrt{1-\left(\frac{v}{c}\right)^2}} . \end{equation*}\]
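The two forms of the equation differ only by the conversion factor \(c\), which a short sketch can make concrete. The function names and the sample speeds are hypothetical choices for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma_natural(v):
    """Gamma for a unitless velocity v, expressed as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v**2)

def gamma_metric(v_mps):
    """Gamma for a velocity in m/s: divide by c first, then reuse the formula."""
    return gamma_natural(v_mps / C)

# Illustrative speeds as fractions of c.
for fraction in (0.1, 0.6, 0.87, 0.99):
    print(fraction, gamma_natural(fraction), gamma_metric(fraction * C))
```

Both functions return the same number for the same physical speed, which is all that the extra factor of \(c\) accomplishes.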

Here's why we care about \(\gamma\). Figure k defines it as the ratio of two times: the time between two events as expressed in one coordinate system, and the time between the same two events as measured in the other one. The interpretation is:

Time dilation

A clock runs fastest in the frame of reference of an observer who is at rest relative to the clock. An observer in motion relative to the clock at speed \(v\) perceives the clock as running more slowly by a factor of \(\gamma\).


m / This figure proves, as claimed in figure l, that the length contraction is \(x=1/\gamma\). First we slice the parallelogram vertically like a salami and slide the slices down, making the top and bottom edges horizontal. Then we do the same in the horizontal direction, forming a rectangle with sides \(\gamma\) and \(x\). Since both the Lorentz transformation and the slicing processes leave areas unchanged, the area \(\gamma x\) of the rectangle must equal the area of the original square, which is 1.

As proved in figures l and m, lengths are also distorted:

Length contraction

A meter-stick appears longest to an observer who is at rest relative to it. An observer moving relative to the meter-stick at \(v\) observes the stick to be shortened by a factor of \(\gamma\).


What is \(\gamma\) when \(v=0\)? What does this mean?

(answer in the back of the PDF version of the book)
Example 1: An interstellar road trip
Alice stays on earth while her twin Betty heads off in a spaceship for Tau Ceti, a nearby star. Tau Ceti is 12 light-years away, so even though Betty travels at 87% of the speed of light, it will take her a long time to get there: 14 years, according to Alice.


n / Example 1.

Betty experiences time dilation. At this speed, her \(\gamma\) is 2.0, so that the voyage will only seem to her to last 7 years. But there is perfect symmetry between Alice's and Betty's frames of reference, so Betty agrees with Alice on their relative speed; Betty sees herself as being at rest, while the sun and Tau Ceti both move backward at 87% of the speed of light. How, then, can she observe Tau Ceti to get to her in only 7 years, when it should take 14 years to travel 12 light-years at this speed?

We need to take into account length contraction. Betty sees the distance between the sun and Tau Ceti to be shrunk by a factor of 2. The same thing occurs for Alice, who observes Betty and her spaceship to be foreshortened.
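The arithmetic of this example can be laid out as a short sketch, using only the 12 light-year distance and the 87% speed given above. The variable names are my own.

```python
import math

v = 0.87            # Betty's speed as a fraction of c
distance_ly = 12.0  # earth to Tau Ceti in Alice's frame, in light-years

g = 1.0 / math.sqrt(1.0 - v**2)

alice_years = distance_ly / v          # trip duration according to Alice
betty_years = alice_years / g          # time dilation: Betty's clock runs slow
contracted_ly = distance_ly / g        # length contraction in Betty's frame
betty_years_check = contracted_ly / v  # Betty's own distance / speed

print(f"gamma = {g:.2f}")
print(f"Alice: {alice_years:.1f} years")
print(f"Betty: {betty_years:.1f} years, or {betty_years_check:.1f} via contraction")
```

The two routes to Betty's travel time agree: Alice attributes the shorter time to Betty's slow clock, while Betty attributes it to the shortened distance.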

Example 2: Large time dilation
The time dilation effect in the Hafele-Keating experiment was very small. If we want to see a large time dilation effect, we can't do it with something the size of the atomic clocks they used; the kinetic energy would be greater than the total megatonnage of all the world's nuclear arsenals. We can, however, accelerate subatomic particles to speeds at which \(\gamma\) is large. For experimental particle physicists, relativity is something you do all day before heading home and stopping off at the store for milk. An early, low-precision experiment of this kind was performed by Rossi and Hall in 1941, using naturally occurring cosmic rays. Figure p shows a 1974 experiment2 of a similar type which verified the time dilation predicted by relativity to a precision of about one part per thousand.


p / Apparatus used for the test of relativistic time dilation described in example 2. The prominent black and white blocks are large magnets surrounding a circular pipe with a vacuum inside.
(c) 1974 by CERN.

Particles called muons (named after the Greek letter \(\mu\), “myoo”) were produced by an accelerator at CERN, near Geneva. A muon is essentially a heavier version of the electron. Muons undergo radioactive decay, lasting an average of only 2.197 \(\mu\text{s}\) before they evaporate into an electron and two neutrinos. The 1974 experiment was actually built in order to measure the magnetic properties of muons, but it produced a high-precision test of time dilation as a byproduct. Because muons have the same electric charge as electrons, they can be trapped using magnetic fields. Muons were injected into the ring shown in figure p, circling around it until they underwent radioactive decay. In other words, they were like tiny alarm clocks that self-destructed at a randomly selected time. At the speed at which these muons were traveling, they had \(\gamma=29.33\), so on the average they lasted 29.33 times longer than the normal lifetime. Figure o shows the number of radioactive decays counted, as a function of the time elapsed after a given stream of muons was injected into the storage ring. The two dashed lines show the rates of decay predicted with and without relativity. The relativistic line is the one that agrees with experiment.
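The numbers in this example can be worked through in a few lines. The sketch assumes the usual exponential law for radioactive decay, which the text does not state explicitly, and the 100 \(\mu\text{s}\) sample time is an arbitrary choice for illustration.

```python
import math

TAU_REST_US = 2.197  # mean muon lifetime at rest, in microseconds (from the text)
GAMMA = 29.33        # gamma of the stored muons (from the text)

tau_lab_us = GAMMA * TAU_REST_US           # dilated mean lifetime in the lab
beta = math.sqrt(1.0 - 1.0 / GAMMA**2)     # speed as a fraction of c

def surviving_fraction(t_us, tau_us):
    """Assumed exponential decay law: fraction of muons left after t_us."""
    return math.exp(-t_us / tau_us)

t = 100.0  # arbitrary elapsed time in the lab, in microseconds
print(f"lab-frame lifetime: {tau_lab_us:.2f} microseconds")
print(f"speed: {beta:.5f} c")
print("surviving with relativity:   ", surviving_fraction(t, tau_lab_us))
print("surviving without relativity:", surviving_fraction(t, TAU_REST_US))
```

At the sample time, roughly a fifth of the muons would survive with time dilation, versus essentially none without it, which is why the two dashed lines in figure o are so easy to tell apart.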


q / Colliding nuclei show relativistic length contraction.

Example 3: An example of length contraction

Figure q shows an artist's rendering of the length contraction for the collision of two gold nuclei at relativistic speeds in the RHIC accelerator in Long Island, New York, which went on line in 2000. The gold nuclei would appear nearly spherical (or just slightly lengthened like an American football) in frames moving along with them, but in the laboratory's frame, they both appear drastically foreshortened as they approach the point of collision. The later pictures show the nuclei merging to form a hot soup, in which experimenters hope to observe a new form of matter.


r / Example 4: In the garage's frame of reference, the bus is moving, and can fit in the garage due to its length contraction. In the bus's frame of reference, the garage is moving, and can't hold the bus due to its length contraction.

Example 4: The garage paradox
One of the most famous of all the so-called relativity paradoxes has to do with our incorrect feeling that simultaneity is well defined. The idea is that one could take a schoolbus and drive it at relativistic speeds into a garage of ordinary size, in which it normally would not fit. Because of the length contraction, the bus would supposedly fit in the garage. The driver, however, will perceive the garage as being contracted and thus even less able to contain the bus.

The paradox is resolved when we recognize that the concept of fitting the bus in the garage “all at once” contains a hidden assumption, the assumption that it makes sense to ask whether the front and back of the bus can simultaneously be in the garage. Observers in different frames of reference moving at high relative speeds do not necessarily agree on whether things happen simultaneously. As shown in figure r, the person in the garage's frame can shut the door at an instant B he perceives to be simultaneous with the front bumper's arrival A at the back wall of the garage, but the driver would not agree about the simultaneity of these two events, and would perceive the door as having shut long after she plowed through the back wall.
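The disagreement about simultaneity can be made concrete with numbers. This sketch assumes the standard algebraic form of the Lorentz transformation, \(t'=\gamma(t-vx)\) and \(x'=\gamma(x-vt)\), in units with \(c=1\); the bus speed and garage length are hypothetical values.

```python
import math

def lorentz(t, x, v):
    """Coordinates of event (t, x) in a frame moving at velocity v (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (t - v * x), gamma * (x - v * t)

V_BUS = 0.8          # hypothetical bus speed, as a fraction of c
GARAGE_LENGTH = 1.0  # hypothetical garage length, arbitrary units

# Events in the garage's frame. Both have t = 0, so they are simultaneous there.
event_A = (0.0, GARAGE_LENGTH)  # front bumper reaches the back wall
event_B = (0.0, 0.0)            # door shuts behind the bus

tA_bus, _ = lorentz(*event_A, V_BUS)
tB_bus, _ = lorentz(*event_B, V_BUS)
print(f"In the bus's frame: t_A = {tA_bus:.3f}, t_B = {tB_bus:.3f}")
```

In the bus's frame A comes out earlier than B, matching the driver's account that the door shut only after she reached the back wall.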


s / A proof that causality imposes a universal speed limit. In the original frame of reference, represented by the square, event A happens a little before event B. In the new frame, shown by the parallelogram, A happens after \(t=0\), but B happens before \(t=0\); that is, B happens before A. The time ordering of the two events has been reversed. This can only happen because events A and B are very close together in time and fairly far apart in space. The line segment connecting A and B has a slope greater than 1, meaning that if we wanted to be present at both events, we would have to travel at a speed greater than \(c\) (which equals 1 in the units used on this graph). You will find that if you pick any two points for which the slope of the line segment connecting them is less than 1, you can never get them to straddle the new \(t=0\) line in this funny, time-reversed way. Since different observers disagree on the time order of events like A and B, causality requires that information never travel from A to B or from B to A; if it did, then we would have time-travel paradoxes. The conclusion is that \(c\) is the maximum speed of cause and effect in relativity.


u / A ring laser gyroscope.


Discussion question B

7.2.3 The universal speed \(c\)

Let's think a little more about the role of the 45-degree diagonal in the Lorentz transformation. Slopes on these graphs are interpreted as velocities. This line has a slope of 1 in relativistic units, but that slope corresponds to \(c\) in ordinary metric units. We already know that the relativistic distance unit must be extremely large compared to the relativistic time unit, so \(c\) must be extremely large. Now note what happens when we perform a Lorentz transformation: this particular line gets stretched, but the new version of the line lies right on top of the old one, and its slope stays the same. In other words, if one observer says that something has a velocity equal to \(c\), every other observer will agree on that velocity as well. (The same thing happens with \(-c\).)

Velocities don't simply add and subtract.

This is counterintuitive, since we expect velocities to add and subtract in relative motion. If a dog is running away from me at 5 m/s relative to the sidewalk, and I run after it at 3 m/s, the dog's velocity in my frame of reference is 2 m/s. According to everything we have learned about motion, the dog must have different speeds in the two frames: 5 m/s in the sidewalk's frame and 2 m/s in mine. But velocities are measured by dividing a distance by a time, and both distance and time are distorted by relativistic effects, so we actually shouldn't expect the ordinary arithmetic addition of velocities to hold in relativity; it's an approximation that's valid at velocities that are small compared to \(c\).

A universal speed limit

For example, suppose Janet takes a trip in a spaceship, and accelerates until she is moving at \(0.6c\) relative to the earth. She then launches a space probe in the forward direction at a speed relative to her ship of \(0.6c\). We might think that the probe was then moving at a velocity of \(1.2c\), but in fact the answer is still less than \(c\) (problem 1, page 439). This is an example of a more general fact about relativity, which is that \(c\) represents a universal speed limit. This is required by causality, as shown in figure s.
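Janet's probe can be worked numerically. The sketch below uses the standard relativistic rule for combining velocities along a line, \((u+v)/(1+uv)\) in units with \(c=1\). This rule is not derived in this subsection, so treat it as an assumption here; the function name is made up for the example.

```python
def combine_velocities(u, v):
    """Velocity relative to the earth of an object moving at u relative to a ship
    that moves at v relative to the earth (same line of motion, units with c = 1)."""
    return (u + v) / (1.0 + u * v)

probe = combine_velocities(0.6, 0.6)   # Janet's probe
light = combine_velocities(1.0, 0.6)   # something moving at c relative to the ship
slow = combine_velocities(1e-8, 2e-8)  # everyday speeds, a few m/s

print(f"probe: {probe:.4f} c")
print(f"light: {light} c")
print(f"slow:  {slow:.3e} c")
```

The result for the probe stays below \(c\), a velocity of exactly \(c\) stays \(c\), and at everyday speeds the rule is indistinguishable from ordinary addition.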


t / The Michelson-Morley experiment, shown in photographs, and drawings from the original 1887 paper. 1. A simplified drawing of the apparatus. A beam of light from the source, s, is partially reflected and partially transmitted by the half-silvered mirror \(\text{h}_1\). The two half-intensity parts of the beam are reflected by the mirrors at a and b, reunited, and observed in the telescope, t. If the earth's surface was supposed to be moving through the ether, then the times taken by the two light waves to pass through the moving ether would be unequal, and the resulting time lag would be detectable by observing the interference between the waves when they were reunited. 2. In the real apparatus, the light beams were reflected multiple times. The effective length of each arm was increased to 11 meters, which greatly improved its sensitivity to the small expected difference in the speed of light. 3. In an earlier version of the experiment, they had run into problems with its “extreme sensitiveness to vibration,” which was “so great that it was impossible to see the interference fringes except at brief intervals ... even at two o'clock in the morning.” They therefore mounted the whole thing on a massive stone floating in a pool of mercury, which also made it possible to rotate it easily. 4. A photo of the apparatus.

Light travels at \(c\).

Now consider a beam of light. We're used to talking casually about the “speed of light,” but what does that really mean? Motion is relative, so normally if we want to talk about a velocity, we have to specify what it's measured relative to. A sound wave has a certain speed relative to the air, and a water wave has its own speed relative to the water. If we want to measure the speed of an ocean wave, for example, we should make sure to measure it in a frame of reference at rest relative to the water. But light isn't a vibration of a physical medium; it can propagate through the near-perfect vacuum of outer space, as when rays of sunlight travel to earth. This seems like a paradox: light is supposed to have a specific speed, but there is no way to decide what frame of reference to measure it in. The way out of the paradox is that light must travel at a velocity equal to \(c\). Since all observers agree on a velocity of \(c\), regardless of their frame of reference, everything is consistent.

The Michelson-Morley experiment

The constancy of the speed of light had in fact already been observed when Einstein was an 8-year-old boy, but because nobody could figure out how to interpret it, the result was largely ignored. In 1887 Michelson and Morley set up a clever apparatus to measure any difference in the speed of light beams traveling east-west and north-south. The motion of the earth around the sun at 110,000 km/hour (about 0.01% of the speed of light) is to our west during the day. Michelson and Morley believed that light was a vibration of a mysterious medium called the ether, so they expected that the speed of light would be a fixed value relative to the ether. As the earth moved through the ether, they thought they would observe an effect on the velocity of light along an east-west line. For instance, if they released a beam of light in a westward direction during the day, they expected that it would move away from them at less than the normal speed because the earth was chasing it through the ether. They were surprised when they found that the expected 0.01% change in the speed of light did not occur.
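The size of the effect Michelson and Morley were looking for follows from simple arithmetic. This sketch just converts the orbital speed quoted above into a fraction of \(c\), using the SI value of the speed of light.

```python
C_M_PER_S = 299_792_458.0  # speed of light in m/s (exact SI value)

earth_speed_m_per_s = 110_000 * 1000 / 3600  # 110,000 km/hour, converted to m/s
fraction_of_c = earth_speed_m_per_s / C_M_PER_S

print(f"orbital speed = {earth_speed_m_per_s:.0f} m/s")
print(f"fraction of c = {100 * fraction_of_c:.4f} %")
```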

Example 5: The ring laser gyroscope

If you've flown in a jet plane, you can thank relativity for helping you to avoid crashing into a mountain or an ocean. Figure u shows a standard piece of navigational equipment called a ring laser gyroscope. A beam of light is split into two parts, sent around the perimeter of the device, and reunited. Since the speed of light is constant, we expect the two parts to come back together at the same time. If they don't, it's evidence that the device has been rotating. The plane's computer senses this and notes how much rotation has accumulated.

Example 6: No frequency-dependence

Relativity has only one universal speed, so it requires that all light waves travel at the same speed, regardless of their frequency and wavelength. Presently the best experimental tests of the invariance of the speed of light with respect to wavelength come from astronomical observations of gamma-ray bursts, which are sudden outpourings of high-frequency light, believed to originate from a supernova explosion in another galaxy. One such observation, in 2009,3 found that the times of arrival of all the different frequencies in the burst differed by no more than 2 seconds out of a total time in flight on the order of ten billion years!
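To get a feeling for how tight this limit is, the arrival-time spread can be compared with the time in flight. This is only an order-of-magnitude sketch: it takes the flight time to be exactly ten billion Julian years, which is an assumption, and it treats the fractional spread in arrival time as a rough bound on the fractional difference in speed.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year (assumed convention)

spread_s = 2.0                            # maximum spread in arrival times (from the text)
flight_time_s = 1e10 * SECONDS_PER_YEAR   # "on the order of ten billion years"

fractional_limit = spread_s / flight_time_s
print(f"fractional difference in speed: less than about {fractional_limit:.0e}")
```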

Discussion Questions

A. A person in a spaceship moving at 99.99999999% of the speed of light relative to Earth shines a flashlight forward through dusty air, so the beam is visible. What does she see? What would it look like to an observer on Earth?

B. A question that students often struggle with is whether time and space can really be distorted, or whether it just seems that way. Compare with optical illusions or magic tricks. How could you verify, for instance, that the lines in the figure are actually parallel? Are relativistic effects the same, or not?

C. On a spaceship moving at relativistic speeds, would a lecture seem even longer and more boring than normal?

D. Mechanical clocks can be affected by motion. For example, it was a significant technological achievement to build a clock that could sail aboard a ship and still keep accurate time, allowing longitude to be determined. How is this similar to or different from relativistic time dilation?

E. Figure q from page 392, depicting the collision of two nuclei at the RHIC accelerator, is reproduced below. What would the shapes of the two nuclei look like to a microscopic observer riding on the left-hand nucleus? To an observer riding on the right-hand one? Can they agree on what is happening? If not, why not? After all, shouldn't they see the same thing if they both compare the two nuclei side-by-side at the same instant in time?


v / Discussion question E: colliding nuclei show relativistic length contraction.

F. If you stick a piece of foam rubber out the window of your car while driving down the freeway, the wind may compress it a little. Does it make sense to interpret the relativistic length contraction as a type of strain that pushes an object's atoms together like this? How does this relate to discussion question E?

G. The machine-gunner in the figure sends out a spray of bullets. Suppose that the bullets are being shot into outer space, and that the distances traveled are trillions of miles (so that the human figure in the diagram is not to scale). After a long time, the bullets reach the points shown with dots which are all equally far from the gun. Their arrivals at those points are events A through E, which happen at different times. The chain of impacts extends across space at a speed greater than \(c\). Does this violate special relativity?


Discussion question G.


w / Fields carry energy.


x / Discussion question E.

7.2.4 No action at a distance

The Newtonian picture

The Newtonian picture of the universe has particles interacting with each other by exerting forces from a distance, and these forces are imagined to occur without any time delay. For example, suppose that super-powerful aliens, angered when they hear disco music in our AM radio transmissions, come to our solar system on a mission to cleanse the universe of our aesthetic contamination. They apply a force to our sun, causing it to go flying out of the solar system at a gazillion miles an hour. According to Newton's laws, the gravitational force of the sun on the earth will immediately start dropping off. This will be detectable on earth, and since sunlight takes eight minutes to get from the sun to the earth, the change in gravitational force will, according to Newton, be the first way in which earthlings learn the bad news: the sun will not visibly start receding until a little later. Although this scenario is fanciful, it shows a real feature of Newton's laws: that information can be transmitted from one place in the universe to another with zero time delay, so that transmission and reception occur at exactly the same instant. Newton was sharp enough to realize that this required a nontrivial assumption, which was that there was some completely objective and well-defined way of saying whether two things happened at exactly the same instant. He stated this assumption explicitly: “Absolute, true, and mathematical time, of itself, and from its own nature flows at a constant rate without regard to anything external...”

Time delays in forces exerted at a distance

Relativity forbids Newton's instantaneous action at a distance. For suppose that instantaneous action at a distance existed. It would then be possible to send signals from one place in the universe to another without any time lag. This would allow perfect synchronization of all clocks. But the Hafele-Keating experiment demonstrates that clocks A and B that have been initially synchronized will drift out of sync if one is in motion relative to the other. With instantaneous transmission of signals, we could determine, without having to wait for A and B to be reunited, which was ahead and which was behind. Since they don't need to be reunited, neither one needs to undergo any acceleration; each clock can fix an inertial frame of reference, with a velocity vector that changes neither its direction nor its magnitude. But this violates the principle that constant-velocity motion is relative, because each clock can be considered to be at rest, in its own frame of reference. Since no experiment has ever detected any violation of the relativity of motion, we conclude that instantaneous action at a distance is impossible.

Since forces can't be transmitted instantaneously, it becomes natural to imagine force-effects spreading outward from their source like ripples on a pond, and we then have no choice but to impute some physical reality to these ripples. We call them fields, and they have their own independent existence. Gravity is transmitted through a field called the gravitational field. Besides gravity, there are other fundamental fields of force such as electricity and magnetism (ch. 10-11). Ripples of the electric and magnetic fields turn out to be light waves. This tells us that the speed at which electric and magnetic field ripples spread must be \(c\), and by an argument similar to the one in subsection 7.2.3 the same must hold for any other fundamental field, including the gravitational field.

Fields don't have to wiggle; they can hold still as well. The earth's magnetic field, for example, is nearly constant, which is why we can use it for direction-finding.

Even empty space, then, is not perfectly featureless. It has measurable properties. For example, we can drop a rock in order to measure the direction of the gravitational field, or use a magnetic compass to find the direction of the magnetic field. This concept made a deep impression on Einstein as a child. He recalled that as a five-year-old, the gift of a magnetic compass convinced him that there was “something behind things, something deeply hidden.”

More evidence that fields of force are real: they carry energy.

The smoking-gun argument for this strange notion of traveling force ripples comes from the fact that they carry energy. In figure w/1, Alice and Betty hold balls A and B at some distance from one another. These balls exert a force on each other; it doesn't really matter for the sake of our argument whether this force is gravitational, electrical, or magnetic. Let's say it's electrical, i.e., that the balls have the kind of electrical charge that sometimes causes your socks to cling together when they come out of the clothes dryer. We'll say the force is repulsive, although again it doesn't really matter.

If Alice chooses to move her ball closer to Betty's, w/2, Alice will have to do some mechanical work against the electrical repulsion, burning off some of the calories from that chocolate cheesecake she had at lunch. This reduction in her body's chemical energy is offset by a corresponding increase in the electrical interaction energy. Not only that, but Alice feels the resistance stiffen as the balls get closer together and the repulsion strengthens. She has to do a little extra work, but this is all properly accounted for in the interaction energy.

But now suppose, w/3, that Betty decides to play a trick on Alice by tossing B far away just as Alice is getting ready to move A. We have already established that Alice can't feel B's motion instantaneously, so the electric forces must actually be propagated by an electric field. Of course this experiment is utterly impractical, but suppose for the sake of argument that the time it takes the change in the electric field to propagate across the diagram is long enough so that Alice can complete her motion before she feels the effect of B's disappearance. She is still getting stale information about B's position. As she moves A to the right, she feels a repulsion, because the field in her region of space is still the field caused by B in its old position. She has burned some chocolate cheesecake calories, and it appears that conservation of energy has been violated, because these calories can't be properly accounted for by any interaction with B, which is long gone.

If we hope to preserve the law of conservation of energy, then the only possible conclusion is that the electric field itself carries away the cheesecake energy. In fact, this example represents an impractical method of transmitting radio waves. Alice does work on charge A, and that energy goes into the radio waves. Even if B had never existed, the radio waves would still have carried energy, and Alice would still have had to do work in order to create them.

Discussion Questions

A. Amy and Bill are flying on spaceships in opposite directions at such high velocities that the relativistic effect on time's rate of flow is easily noticeable. Motion is relative, so Amy considers herself to be at rest and Bill to be in motion. She says that time is flowing normally for her, but Bill is slow. But Bill can say exactly the same thing. How can they both think the other is slow? Can they settle the disagreement by getting on the radio and seeing whose voice is normal and whose sounds slowed down and Darth-Vadery?


B. The figure shows a famous thought experiment devised by Einstein. A train is moving at constant velocity to the right when bolts of lightning strike the ground near its front and back. Alice, standing on the dirt at the midpoint of the flashes, observes that the light from the two flashes arrives simultaneously, so she says the two strikes must have occurred simultaneously. Bob, meanwhile, is sitting aboard the train, at its middle. He passes by Alice at the moment when Alice later figures out that the flashes happened. Later, he receives flash 2, and then flash 1. He infers that since both flashes traveled half the length of the train, flash 2 must have occurred first. How can this be reconciled with Alice's belief that the flashes were simultaneous? Explain using a graph.


C. Resolve the following paradox by drawing a spacetime diagram (i.e., a graph of \(x\) versus \(t\)). Andy and Beth are in motion relative to one another at a significant fraction of \(c\). As they pass by each other, they exchange greetings, and Beth tells Andy that she is going to blow up a stick of dynamite one hour later. One hour later by Andy's clock, she still hasn't exploded the dynamite, and he says to himself, “She hasn't exploded it because of time dilation. It's only been 40 minutes for her.” He now accelerates suddenly so that he's moving at the same velocity as Beth. The time dilation no longer exists. If he looks again, does he suddenly see the flash from the explosion? How can this be? Would he see her go through 20 minutes of her life in fast-motion?

D. Use a graph to resolve the following relativity paradox. Relativity says that in one frame of reference, event A could happen before event B, but in someone else's frame B would come before A. How can this be? Obviously the two people could meet up at A and talk as they cruised past each other. Wouldn't they have to agree on whether B had already happened?

E. The rod in the figure is perfectly rigid. At event A, the hammer strikes one end of the rod. At event B, the other end moves. Since the rod is perfectly rigid, it can't compress, so A and B are simultaneous. In frame 2, B happens before A. Did the motion at the right end cause the person on the left to decide to pick up the hammer and use it?


y / The light cone.


ab / Example 10.


ac / Example 11.


ad / The pattern of waves made by a point source moving to the right across the water. Note the shorter wavelength of the forward-emitted waves and the longer wavelength of the backward-going ones.


ae / A graphical representation of the Lorentz transformation for a velocity of \((3/5)c\). The long diagonal is stretched by a factor of two, the short one is half its former length, and the area is the same as before.


af / At event O, the source and the receiver are on top of each other, so as the source emits a wave crest, it is received without any time delay. At P, the source emits another wave crest, and at Q the receiver receives it.

7.2.5 The light cone

Given an event P, we can now classify all the causal relationships in which P can participate. In Newtonian physics, these relationships fell into two classes: P could potentially cause any event that lay in its future, and could have been caused by any event in its past. In relativity, we have a three-way distinction rather than a two-way one. There is a third class of events that are too far away from P in space, and too close in time, to allow any cause and effect relationship, since causality's maximum velocity is \(c\). Since we're working in units in which \(c=1\), the boundary of this set is formed by the lines with slope \(\pm1\) on a \((t,x)\) plot. This is referred to as the light cone, for reasons that become more visually obvious when we consider more than one spatial dimension, figure y.

Events lying inside one another's light cones are said to have a timelike relationship. Events outside each other's light cones are spacelike in relation to one another, and in the case where they lie on the surfaces of each other's light cones the term is lightlike.

7.2.6 The spacetime interval (optional)

The light cone is an object of central importance in both special and general relativity. It relates the geometry of spacetime to possible cause-and-effect relationships between events. This is fundamentally how relativity works: it's a geometrical theory of causality.

These ideas naturally lead us to ask what fruitful analogies we can form between the bizarre geometry of spacetime and the more familiar geometry of the Euclidean plane. The light cone cuts spacetime into different regions according to certain measurements of relationships between points (events). Similarly, a circle in Euclidean geometry cuts the plane into two parts, an interior and an exterior, according to the measurement of the distance from the circle's center. A circle stays the same when we rotate the plane. A light cone stays the same when we change frames of reference. Let's build up the analogy more explicitly.

Measurement in Euclidean geometry

We say that two line segments are congruent, \(\text{AB}\cong \text{CD}\), if the distance between points A and B is the same as the distance between C and D, as measured by a rigid ruler.

Measurement in spacetime

We define \(\text{AB}\cong \text{CD}\) if:

  1. AB and CD are both spacelike, and the two distances are equal as measured by a rigid ruler, in a frame where the two events touch the ruler simultaneously.
  2. AB and CD are both timelike, and the two time intervals are equal as measured by clocks moving inertially.
  3. AB and CD are both lightlike.

The three parts of the relativistic version each require some justification.

Case 1 has to be the way it is because space is part of spacetime. In special relativity, this space is Euclidean, so the definition of congruence has to agree with the Euclidean definition, in the case where it is possible to apply the Euclidean definition. The spacelike relation between the points is both necessary and sufficient to make this possible. If points A and B are spacelike in relation to one another, then a frame of reference exists in which they are simultaneous, so we can use a ruler that is at rest in that frame to measure their distance. If they are lightlike or timelike, then no such frame of reference exists. For example, there is no frame of reference in which Charles VII's restoration to the throne is simultaneous with Joan of Arc's execution, so we can't arrange for both of these events to touch the same ruler at the same time.

The definition in case 2 is the only sensible way to proceed if we are to respect the symmetric treatment of time and space in relativity. The timelike relation between the events is necessary and sufficient to make it possible for a clock to move from one to the other. It makes a difference that the clocks move inertially, because the twins in example 1 on p. 391 disagree on the clock time between the traveling twin's departure and return.

Case 3 may seem strange, since it says that any two lightlike intervals are congruent. But this is the only possible definition, because this case can be obtained as a limit of the timelike one. Suppose that AB is a timelike interval, but in the planet earth's frame of reference it would be necessary to travel at almost the speed of light in order to reach B from A. The required speed is less than \(c\) (i.e., less than 1) by some tiny amount \(\epsilon\). In the earth's frame, the clock referred to in the definition suffers extreme time dilation. The time elapsed on the clock is very small. As \(\epsilon\) approaches zero, and the relationship between A and B approaches a lightlike one, this clock time approaches zero. In this sense, the relativistic notion of “distance” is very different from the Euclidean one. In Euclidean geometry, the distance between two points can only be zero if they are the same point.
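The limiting process described here is easy to tabulate. The sketch below assumes the time dilation relation from earlier in the chapter, in which a clock moving at speed \(v\) records a time smaller by the factor \(\sqrt{1-v^2}\) (units with \(c=1\)); the values of \(\epsilon\) and the earth-frame time of one unit are hypothetical.

```python
import math

def clock_time(earth_time, epsilon):
    """Time elapsed on an inertial clock moving at speed 1 - epsilon (units with c = 1),
    given the time elapsed in the earth's frame."""
    v = 1.0 - epsilon
    return earth_time * math.sqrt(1.0 - v**2)

for eps in (1e-2, 1e-4, 1e-6):  # hypothetical values of epsilon
    print(f"epsilon = {eps:.0e}: clock time = {clock_time(1.0, eps):.5f}")
```

The clock time shrinks steadily as \(\epsilon\) gets smaller, which is the sense in which a lightlike interval corresponds to zero elapsed time.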

The case splitting involved in the relativistic definition is a little ugly. Having worked out the physical interpretation, we can now consolidate the definition in a nicer way by appealing to Cartesian coordinates.

Cartesian definition of distance in Euclidean geometry

Given a vector \((\Delta x,\Delta y)\) from point A to point B, the square of the distance between them is defined as \(\overline{\text{AB}}^2=\Delta x^2+\Delta y^2\).

Definition of the interval in relativity

Given points separated by coordinate differences \(\Delta x\), \(\Delta y\), \(\Delta z\), and \(\Delta t\), the spacetime interval \(\interval\) (cursive letter “I”) between them is defined as \(\interval = \Delta t^2-\Delta x^2-\Delta y^2-\Delta z^2\).

This is stated in natural units, so all four terms on the right-hand side have the same units; in metric units with \(c \ne 1\), appropriate factors of \(c\) should be inserted in order to make the units of the terms agree. The interval \(\interval\) is positive if AB is timelike (regardless of which event comes first), zero if lightlike, and negative if spacelike. Since \(\interval\) can be negative, we can't in general take its square root and define a real number \(\overline{\text{AB}}\) as in the Euclidean case. When the interval is timelike, we can interpret \(\sqrt{\interval}\) as a time, and when it's spacelike we can take \(\sqrt{-\interval}\) to be a distance.
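The sign convention can be summarized in a few lines of code. This is a minimal sketch in natural units; the function names and the sample coordinate differences are made up for illustration.

```python
def interval(dt, dx=0, dy=0, dz=0):
    """Spacetime interval in natural units (c = 1), with the sign convention of the text."""
    return dt**2 - dx**2 - dy**2 - dz**2

def classify(dt, dx=0, dy=0, dz=0):
    """Label a separation as timelike, lightlike, or spacelike by the sign of the interval."""
    i = interval(dt, dx, dy, dz)
    if i > 0:
        return "timelike"
    if i < 0:
        return "spacelike"
    return "lightlike"

# Hypothetical coordinate differences (dt, dx)
print(classify(5, 3))    # interval = 16
print(classify(-5, 3))   # same interval, opposite time order
print(classify(3, 5))    # interval = -16
print(classify(1, 1))    # interval = 0
```

With measured (floating-point) data, an exact comparison with zero is fragile, so a real application would need some tolerance for the lightlike case.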

The Euclidean definition of distance (i.e., the Pythagorean theorem) is useful because it gives the same answer regardless of how we rotate the plane. Although it is stated in terms of a certain coordinate system, its result is unambiguously defined because it is the same regardless of what coordinate system we arbitrarily pick. Similarly, \(\interval\) is useful because, as proved in example 8 below, it is the same regardless of our frame of reference, i.e., regardless of our choice of coordinates.

Example 7: Pioneer 10
\(\triangleright\) The Pioneer 10 space probe was launched in 1972, and in 1973 was the first craft to fly by the planet Jupiter. It crossed the orbit of the planet Neptune in 1983, after which telemetry data were received until 2002. The following table gives the spacecraft's position relative to the sun at exactly midnight on January 1, 1983 and January 1, 1995. The 1983 date is taken to be \(t=0\).
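\[\begin{array}{llll}
t\ (\text{s}) & x & y & z \\
0 & 1.784\times10^{12}\ \text{m} & 3.951\times10^{12}\ \text{m} & 0.237\times10^{12}\ \text{m} \\
3.7869120000\times10^{8}\ \text{s} & 2.420\times10^{12}\ \text{m} & 8.827\times10^{12}\ \text{m} & 0.488\times10^{12}\ \text{m}
\end{array}\]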

Compare the time elapsed on the spacecraft to the time in a frame of reference tied to the sun.

\(\triangleright\) We can convert these data into natural units, with the distance unit being the second (i.e., a light-second, the distance light travels in one second) and the time unit being seconds. Converting and carrying out this subtraction, we have:

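\[\begin{array}{llll}
\Delta t\ (\text{s}) & \Delta x & \Delta y & \Delta z \\
3.7869120000\times10^{8}\ \text{s} & 0.2121\times10^{4}\ \text{s} & 1.626\times10^{4}\ \text{s} & 0.084\times10^{4}\ \text{s}
\end{array}\]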

Comparing the exponents of the temporal and spatial numbers, we can see that the spacecraft was moving at a velocity on the order of \(10^{-4}\) of the speed of light, so relativistic effects should be small but not completely negligible.

Since the interval is timelike, we can take its square root and interpret it as the time elapsed on the spacecraft. The result is \(\sqrt{\interval}=3.786911996\times 10^8\ \text{s}\). This is 0.4 s less than the time elapsed in the sun's frame of reference.
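The arithmetic of this example can be reproduced as follows. This is a sketch using the tabulated positions and the SI value of the speed of light; the variable names are arbitrary. Because the time lag is a tiny difference between two large numbers, it is computed in an algebraically equivalent form that avoids subtracting them directly.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

# Times in seconds and positions relative to the sun in meters (from the table)
t1, r1 = 0.0, (1.784e12, 3.951e12, 0.237e12)
t2, r2 = 3.7869120000e8, (2.420e12, 8.827e12, 0.488e12)

dt = t2 - t1
# Spatial differences converted to natural units (light-seconds)
dx, dy, dz = ((b - a) / C for a, b in zip(r1, r2))

interval = dt**2 - dx**2 - dy**2 - dz**2
proper_time = math.sqrt(interval)  # time elapsed on the spacecraft

# dt - sqrt(interval), rewritten to avoid loss of precision
lag = (dx**2 + dy**2 + dz**2) / (dt + proper_time)

print(f"dx, dy, dz = {dx:.0f}, {dy:.0f}, {dz:.0f} light-seconds")
print(f"spacecraft time lags the sun's frame by {lag:.2f} s")
```

The lag comes out to roughly 0.36 s, consistent with the rounded figure of 0.4 s.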


z / Light-rectangles, example 8.
1. The gray light-rectangle represents the set of all events such as P that could be visited after A and before B.
2. The rectangle becomes a square in the frame in which A and B occur at the same location in space.
3. The area of the dashed square is \(\tau^2\), so the area of the gray square is \(\tau^2/2\).

Example 8: Invariance of the interval
In this example we prove that the interval is the same regardless of what frame of reference we compute it in. This is called “Lorentz invariance.” The proof is limited to the timelike case. Given events A and B, construct the light-rectangle as defined in figure z/1. On p. 389 we proved that the Lorentz transformation doesn't change the area of a shape in the \(x\)-\(t\) plane. Therefore the area of this rectangle is unchanged if we switch to the frame of reference z/2, in which A and B occurred at the same location and were separated by a time interval \(\tau\). This area equals half the interval \(\interval\) between A and B. But a straightforward calculation shows that the rectangle in z/1 also has an area equal to half the interval calculated in that frame. Since the area in any frame equals half the interval, and the area is the same in all frames, the interval is equal in all frames as well.


aa / Example 9.

Example 9: A numerical example of invariance
Figure aa shows two frames of reference in motion relative to one another at \(v=3/5\). (For this velocity, the stretching and squishing of the main diagonals are both by a factor of 2.) Events are marked at coordinates that in the frame represented by the square are
\[\begin{align*} (t,x) & = (0,0) \ \text{and} \\ (t,x) &= (13,11) . \end{align*}\]
The interval between these events is \(13^2-11^2=48\). In the frame represented by the parallelogram, the same two events lie at coordinates
\[\begin{align*} (t',x') & = (0,0) \ \text{and} \\ (t',x') &= (8,4) . \end{align*}\]
Calculating the interval using these values, the result is \(8^2-4^2=48\), which comes out the same as in the other frame.
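The numbers in example 9 can be reproduced with the Lorentz transformation written in its standard form for units where \(c=1\), \(t'=\gamma(t-vx)\) and \(x'=\gamma(x-vt)\). The following is a minimal sketch; the function names are made up for illustration.

```python
import math

def lorentz_boost(t, x, v):
    """Transform (t, x) into a frame moving at velocity v, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (t - v * x), gamma * (x - v * t)

def interval(t, x):
    """Spacetime interval between the origin and the event (t, x)."""
    return t**2 - x**2

t, x = 13.0, 11.0
t_prime, x_prime = lorentz_boost(t, x, 3 / 5)

print("primed coordinates:", t_prime, x_prime)
print("intervals:", interval(t, x), interval(t_prime, x_prime))
```

Up to floating-point rounding, the primed coordinates are \((8,4)\) and both intervals equal 48.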

Four-vectors and the inner product (optional)

Example 7 makes it natural that we define a type of vector with four components, the first one relating to time and the others being spatial. These are known as four-vectors. It's clear how we should define the equivalent of a dot product in relativity:

\[\begin{equation*} \mathbf{A}\cdot\mathbf{B} = A_t B_t - A_xB_x - A_yB_y - A_zB_z \end{equation*}\]

The term “dot product” has connotations of referring only to three-vectors, so the operation of taking the scalar product of two four-vectors is usually referred to instead as the “inner product.” The spacetime interval can then be thought of as the inner product of a four-vector with itself. We care about the relativistic inner product for exactly the same reason we care about its Euclidean version; both are scalars, so they have a fixed value regardless of what coordinate system we choose.

Example 10: The twin paradox
Alice and Betty are identical twins. Betty goes on a space voyage at relativistic speeds, traveling away from the earth and then turning around and coming back. Meanwhile, Alice stays on earth. When Betty returns, she is younger than Alice because of relativistic time dilation (example 1, p. 391).

But isn't it valid to say that Betty's spaceship is standing still and the earth moving? In that description, wouldn't Alice end up younger and Betty older? This is referred to as the “twin paradox.” It can't really be a paradox, since it's exactly what was observed in the Hafele-Keating experiment (p. 381).

Betty's track in the \(x\)-\(t\) plane (her “world-line” in relativistic jargon) consists of vectors \(\mathbf{b}\) and \(\mathbf{c}\) strung end-to-end (figure ad). We could adopt a frame of reference in which Betty was at rest during \(\mathbf{b}\) (i.e., \(b_x=0\)), but there is no frame in which \(\mathbf{b}\) and \(\mathbf{c}\) are parallel, so there is no frame in which Betty was at rest during both \(\mathbf{b}\) and \(\mathbf{c}\). This resolves the paradox.

We have already established by other methods that Betty ages less than Alice, but let's see how this plays out in a simple numerical example. Omitting units and making up simple numbers, let's say that the vectors in figure ad are

\[\begin{align*} \mathbf{a} &= (6,1) \\ \mathbf{b} &= (3,2) \\ \mathbf{c} &= (3,-1) , \end{align*}\]

where the components are given in the order \((t,x)\). The time experienced by Alice is then

\[\begin{equation*} |\mathbf{a}| = \sqrt{6^2-1^2} =5.9 , \end{equation*}\]

which is greater than Betty's elapsed time

\[\begin{equation*} |\mathbf{b}|+|\mathbf{c}| = \sqrt{3^2-2^2}+\sqrt{3^2-(-1)^2} = 5.1 . \end{equation*}\]
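As a check on this arithmetic, here is a small sketch that computes the elapsed time along each leg as the square root of the interval. The helper name `proper_time` is an assumption introduced for this illustration.

```python
import math

def proper_time(vector):
    """Elapsed time along a timelike (t, x) displacement, in units where c = 1."""
    t, x = vector
    return math.sqrt(t**2 - x**2)

a = (6, 1)    # Alice's world-line
b = (3, 2)    # Betty's outbound leg
c = (3, -1)   # Betty's return leg

# Betty's two legs strung end-to-end must join the same two events as Alice's vector.
assert (b[0] + c[0], b[1] + c[1]) == a

alice = proper_time(a)
betty = proper_time(b) + proper_time(c)
print(f"Alice: {alice:.1f}, Betty: {betty:.1f}")
```

Any other choice of \(\mathbf{b}\) and \(\mathbf{c}\) that adds up to \(\mathbf{a}\) without being parallel to it gives the same qualitative result: the bent path has the smaller elapsed time.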

Example 11: Simultaneity using inner products
Suppose that an observer O moves inertially along a vector \(\mathbf{o}\), and let the vector separating two events P and Q be \(\mathbf{s}\). O judges these events to be simultaneous if \(\mathbf{o}\cdot\mathbf{s}=0\). To see why this is true, suppose we pick a coordinate system as defined by O. In this coordinate system, O considers herself to be at rest, so she says her vector has only a time component, \(\mathbf{o}=(\Delta t,0,0,0)\). If she considers P and Q to be simultaneous, then the vector from P to Q is of the form \((0,\Delta x,\Delta y,\Delta z)\). The inner product is then zero, since each of the four terms vanishes. Since the inner product is independent of the choice of coordinate system, it doesn't matter that we chose one tied to O herself. Any other observer \(\text{O}'\) can look at O's motion, note that \(\mathbf{o}\cdot\mathbf{s}=0\), and infer that O must consider P and Q to be simultaneous, even if \(\text{O}'\) says they weren't.
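The claim in example 11 can be illustrated numerically. The sketch below starts in O's frame, boosts both vectors into the frame of a second observer using the standard Lorentz transformation along \(x\) (units where \(c=1\)), and checks that the inner product is still zero even though the separation acquires a nonzero time component. The particular numbers and function names are made up for illustration.

```python
import math

def inner(a, b):
    """Relativistic inner product of two four-vectors (t, x, y, z)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def boost_x(a, u):
    """Components of four-vector a in a frame moving at velocity u along x."""
    gamma = 1.0 / math.sqrt(1.0 - u**2)
    t, x, y, z = a
    return (gamma * (t - u * x), gamma * (x - u * t), y, z)

o = (2.0, 0.0, 0.0, 0.0)    # O at rest in her own frame
s = (0.0, 3.0, 1.0, -2.0)   # P and Q simultaneous according to O

o_prime = boost_x(o, 0.6)
s_prime = boost_x(s, 0.6)

print("time component of s in the other frame:", s_prime[0])
print("inner products:", inner(o, s), inner(o_prime, s_prime))
```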

Doppler shifts of light and addition of velocities (optional)

When Doppler shifts happen to ripples on a pond or the sound waves from an airplane, they can depend on the relative motion of three different objects: the source, the receiver, and the medium. But light waves don't have a medium. Therefore Doppler shifts of light can only depend on the relative motion of the source and observer.

One simple case is the one in which the relative motion of the source and the receiver is perpendicular to the line connecting them. That is, the motion is transverse. Nonrelativistic Doppler shifts happen because the distance between the source and receiver is changing, so in nonrelativistic physics we don't expect any Doppler shift at all when the motion is transverse, and this is what is in fact observed to high precision. For example, the photo shows shortened and lengthened wavelengths to the right and left, along the source's line of motion, but an observer above or below the source measures just the normal, unshifted wavelength and frequency. But relativistically, we have a time dilation effect, so for light waves emitted transversely, there is a Doppler shift of \(1/\gamma\) in frequency (or \(\gamma\) in wavelength).

The other simple case is the one in which the relative motion of the source and receiver is longitudinal, i.e., they are either approaching or receding from one another. For example, distant galaxies are receding from our galaxy due to the expansion of the universe, and this expansion was originally detected because Doppler shifts toward the red (low-frequency) end of the spectrum were observed.

Nonrelativistically, we would expect the light from such a galaxy to be Doppler shifted down in frequency by some factor, which would depend on the relative velocities of three different objects: the source, the wave's medium, and the receiver. Relativistically, things get simpler, because light isn't a vibration of a physical medium, so the Doppler shift can only depend on a single velocity \(v\), which is the rate at which the separation between the source and the receiver is increasing.

The square in figure ah is the “graph paper” used by someone who considers the source to be at rest, while the parallelogram plays a similar role for the receiver. The figure is drawn for the case where \(v=3/5\) (in units where \(c=1\)), and in this case the stretch factor of the long diagonal is 2. To keep the area the same, the short diagonal has to be squished to half its original size. But now it's a matter of simple geometry to show that OP equals half the width of the square, and this tells us that the Doppler shift is a factor of 1/2 in frequency. That is, the squish factor of the short diagonal is interpreted as the Doppler shift. To get this as a general equation for velocities other than 3/5, one can show by straightforward fiddling with the result of part c of problem 7 on p. 440 that the Doppler shift is

\[\begin{equation*} D(v) = \sqrt{\frac{1-v}{1+v}} . \end{equation*}\]

Here \(v>0\) is the case where the source and receiver are getting farther apart, \(v\lt0\) the case where they are approaching. (This is the opposite of the sign convention used in subsection 6.1.5. It is convenient to change conventions here so that we can use positive values of \(v\) in the case of cosmological red-shifts, which are the most important application.)
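A quick numerical sketch of this formula, using the sign convention just described (the function name `doppler` is an assumption introduced here):

```python
import math

def doppler(v):
    """Longitudinal Doppler factor D(v) for frequency, in units where c = 1.

    v > 0 means the source and receiver are receding from one another.
    """
    return math.sqrt((1.0 - v) / (1.0 + v))

for v in (-0.6, 0.0, 0.001, 0.6):
    print(f"v = {v:+.3f}  D = {doppler(v):.4f}")
```

For \(v=3/5\) this reproduces the factor of 1/2 found from the geometry of the figure, and for small \(v\) it gives \(D\approx 1-v\), the nonrelativistic result.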

Suppose that Alice stays at home on earth while her twin Betty takes off in her rocket ship at 3/5 of the speed of light. When I first learned relativity, the thing that caused me the most pain was understanding how each observer could say that the other was the one whose time was slow. It seemed to me that if I could take a pill that would speed up my mind and my body, then naturally I would see everybody else as being slow. Shouldn't the same apply to relativity? But suppose Alice and Betty get on the radio and try to settle who is the fast one and who is the slow one. Each twin's voice sounds slooooowed doooowwwwn to the other. If Alice claps her hands twice, at a time interval of one second by her clock, Betty hears the hand-claps coming over the radio two seconds apart, but the situation is exactly symmetric, and Alice hears the same thing if Betty claps. Each twin analyzes the situation using a diagram identical to ah, and attributes her sister's observations to a complicated combination of time distortion, the time taken by the radio signals to propagate, and the motion of her twin relative to her.


Turn your book upside-down and reinterpret figure ah.

(answer in the back of the PDF version of the book)
Example 12: A symmetry property of the Doppler effect
Suppose that A and B are at rest relative to one another, but C is moving along the line between A and B. A transmits a signal to C, who then retransmits it to B. The signal accumulates two Doppler shifts, and the result is their product \(D(v)D(-v)\). But A and B are at rest relative to one another, so B must receive the signal at the same frequency at which A sent it. This product must therefore equal 1, so we must have \(D(-v)D(v)=1\), which can be verified directly from the equation.

Example 13: The Ives-Stilwell experiment
The result of example 12 was the basis of one of the earliest laboratory tests of special relativity, by Ives and Stilwell in 1938. They observed the light emitted by a beam of excited \(\text{H}_2^+\) and \(\text{H}_3^+\) ions with speeds of a few tenths of a percent of \(c\). Measuring the light from both ahead of and behind the beams, they found that the product of the Doppler shifts \(D(v)D(-v)\) was equal to 1, as predicted by relativity. If relativity had been false, then one would have expected the product to differ from 1 by an amount that would have been detectable in their experiment. In 2003, Saathoff et al. carried out an extremely precise version of the Ives-Stilwell technique with \(\text{Li}^+\) ions moving at 6.4% of \(c\). The frequencies observed, in units of MHz, were:

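\[\begin{array}{rcll} f_\text{o} & = & 546466918.8\pm0.4 & \text{(unshifted frequency)} \\ f_\text{o}D(-v) & = & 582490203.44\pm0.09 & \text{(shifted frequency, forward)} \\ f_\text{o}D(v) & = & 512671442.9\pm0.5 & \text{(shifted frequency, backward)} \\ \sqrt{f_\text{o}D(-v)\cdot f_\text{o}D(v)} & = & 546466918.6\pm0.3 & \end{array}\]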

The results show incredibly precise agreement between \(f_\text{o}\) and \(\sqrt{f_\text{o}D(-v)\cdot f_\text{o} D(v)}\), as expected relativistically because \(D(v)D(-v)\) is supposed to equal 1. The agreement extends to 9 significant figures, whereas if relativity had been false there should have been a relative disagreement of about \(v^2=.004\), i.e., a discrepancy in the third significant figure. The spectacular agreement with theory has made this experiment a lightning rod for anti-relativity kooks.
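The comparison can be repeated directly from the three measured frequencies quoted above. This sketch ignores the experimental uncertainties; the variable names are made up for illustration.

```python
import math

# Frequencies in MHz, as quoted in the text for the 2003 experiment.
f_unshifted = 546466918.8
f_forward = 582490203.44
f_backward = 512671442.9

# If D(v)D(-v) = 1, the geometric mean of the shifted frequencies equals f_unshifted.
geometric_mean = math.sqrt(f_forward * f_backward)
relative_difference = abs(geometric_mean - f_unshifted) / f_unshifted

# Size of the discrepancy expected if relativity were false, about v^2.
v = 0.064
print(f"geometric mean      = {geometric_mean:.1f} MHz")
print(f"relative difference = {relative_difference:.1e}")
print(f"v^2                 = {v**2:.4f}")
```

The relative difference is many orders of magnitude smaller than \(v^2\), which is the point of the experiment.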

We saw on p. 394 that relativistic velocities should not be expected to be exactly additive, and problem 1 on p. 439 verifies this in the special case where A moves relative to B at \(0.6c\) and B relative to C at \(0.6c\) --- the result not being \(1.2c\). The relativistic Doppler shift provides a simple way of deriving a general equation for the relativistic combination of velocities; problem 17 on p. 443 guides you through the steps of this derivation, and the result is given on p. 942.
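The idea behind the Doppler-based approach can be sketched numerically. The sketch assumes that a signal relayed from A to B to C picks up the product of the two Doppler factors, and that this product must equal the Doppler factor for the combined velocity. Inverting \(D(v)=\sqrt{(1-v)/(1+v)}\) for \(v\) gives \(v=(1-D^2)/(1+D^2)\). The function names are assumptions introduced here.

```python
import math

def doppler(v):
    """Longitudinal Doppler factor, in units where c = 1 (v > 0 means receding)."""
    return math.sqrt((1.0 - v) / (1.0 + v))

def velocity_from_doppler(d):
    """Invert D(v) = sqrt((1-v)/(1+v)) to recover v."""
    return (1.0 - d**2) / (1.0 + d**2)

def combine(v1, v2):
    """Combine two velocities along the same line by multiplying Doppler factors."""
    return velocity_from_doppler(doppler(v1) * doppler(v2))

for v1, v2 in ((0.0001, 0.0002), (0.6, 0.6), (0.8, 0.8)):
    print(f"{v1} combined with {v2}: {combine(v1, v2):.5f}")
```

At low speeds the result is very nearly the simple sum, as required by the correspondence principle, while two velocities of \(0.6c\) combine to less than \(c\) rather than to \(1.2c\).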

7.3 Dynamics

So far we have said nothing about how to predict motion in relativity. Do Newton's laws still work? Do conservation laws still apply? The answer is yes, but many of the definitions need to be modified, and certain entirely new phenomena occur, such as the equivalence of energy and mass, as described by the famous equation \(E=mc^2\).


c / Example 14.

7.3.1 Momentum

Consider the following scheme for traveling faster than the speed of light. The basic idea can be demonstrated by dropping a ping-pong ball and a baseball stacked on top of each other like a snowman. They separate slightly in mid-air, and the baseball therefore has time to hit the floor and rebound before it collides with the ping-pong ball, which is still on the way down. The result is a surprise if you haven't seen it before: the ping-pong ball flies off at high speed and hits the ceiling! A similar fact is known to people who investigate the scenes of accidents involving pedestrians. If a car moving at 90 kilometers per hour hits a pedestrian, the pedestrian flies off at nearly double that speed, 180 kilometers per hour. Now suppose the car was moving at 90 percent of the speed of light. Would the pedestrian fly off at 180% of \(c\)?

To see why not, we have to back up a little and think about where this speed-doubling result comes from. For any collision, there is a special frame of reference, the center-of-mass frame, in which the two colliding objects approach each other, collide, and rebound with their velocities reversed. In the center-of-mass frame, the total momentum of the objects is zero both before and after the collision.


a / An unequal collision, viewed in the center-of-mass frame, 1, and in the frame where the small ball is initially at rest, 2. The motion is shown as it would appear on the film of an old-fashioned movie camera, with an equal amount of time separating each frame from the next. Film 1 was made by a camera that tracked the center of mass, film 2 by one that was initially tracking the small ball, and kept on moving at the same speed after the collision.

Figure a/1 shows such a frame of reference for objects of very unequal mass. Before the collision, the large ball is moving relatively slowly toward the top of the page, but because of its greater mass, its momentum cancels the momentum of the smaller ball, which is moving rapidly in the opposite direction. The total momentum is zero. After the collision, the two balls just reverse their directions of motion. We know that this is the right result for the outcome of the collision because it conserves both momentum and kinetic energy, and everything not forbidden is compulsory, i.e., in any experiment, there is only one possible outcome, which is the one that obeys all the conservation laws.


How do we know that momentum and kinetic energy are conserved in figure a/1?

(answer in the back of the PDF version of the book)

Let's make up some numbers as an example. Say the small ball has a mass of 1 kg, the big one 8 kg. In frame 1, let's make the velocities as follows:

\[\begin{array}{lcc} & \text{before the collision} & \text{after the collision} \\ \text{small ball} & -0.8 & 0.8 \\ \text{big ball} & 0.1 & -0.1 \end{array}\]

Figure a/2 shows the same collision in a frame of reference where the small ball was initially at rest. To find all the velocities in this frame, we just add 0.8 to all the ones in the previous table.

\[\begin{array}{lcc} & \text{before the collision} & \text{after the collision} \\ \text{small ball} & 0 & 1.6 \\ \text{big ball} & 0.9 & 0.7 \end{array}\]

In this frame, as expected, the small ball flies off with a velocity, 1.6, that is almost twice the initial velocity of the big ball, 0.9.

If all those velocities were in meters per second, then that's exactly what happened. But what if all these velocities were in units of the speed of light? Now it's no longer a good approximation just to add velocities. We need to combine them according to the relativistic rules. For instance, the technique used in problem 1 on p. 439 can be used to show that combining a velocity of 0.8 times the speed of light with another velocity of 0.8 results in 0.98, not 1.6. The results are very different:

\[\begin{array}{lcc} & \text{before the collision} & \text{after the collision} \\ \text{small ball} & 0 & 0.98 \\ \text{big ball} & 0.83 & 0.76 \end{array}\]


b / An 8-kg ball moving at 83% of the speed of light hits a 1-kg ball. The balls appear foreshortened due to the relativistic distortion of space.

We can interpret this as follows. Figure a/1 is one in which the big ball is moving fairly slowly. This is very nearly the way the scene would be seen by an ant standing on the big ball. According to an observer in frame b, however, both balls are moving at nearly the speed of light after the collision. Because of this, the balls appear foreshortened, but the distance between the two balls is also shortened. To this observer, it seems that the small ball isn't pulling away from the big ball very fast.
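The velocities in the two frames can be generated from the center-of-mass values with a few lines of code. The sketch below uses the standard relativistic combination rule \((u+v)/(1+uv)\) in units where \(c=1\), which this section cites only by page reference, so treat it as an assumption here. The dictionary and function names are made up for illustration.

```python
def combine_newtonian(u, v):
    """Nonrelativistic combination of velocities: simple addition."""
    return u + v

def combine_relativistic(u, v):
    """Standard relativistic combination of collinear velocities, units where c = 1."""
    return (u + v) / (1.0 + u * v)

# Velocities (before, after) in the center-of-mass frame of figure a/1.
frame_1 = {"small ball": (-0.8, 0.8), "big ball": (0.1, -0.1)}
shift = 0.8  # velocity of frame 1 relative to the frame where the small ball starts at rest

for name, (before, after) in frame_1.items():
    newton = tuple(combine_newtonian(u, shift) for u in (before, after))
    rel = tuple(combine_relativistic(u, shift) for u in (before, after))
    print(f"{name}: Newtonian {newton[0]:.2f} -> {newton[1]:.2f}, "
          f"relativistic {rel[0]:.2f} -> {rel[1]:.2f}")
```

In the relativistic case every velocity stays below 1, and the small ball ends up only slightly faster than the big one.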

Now here's what's interesting about all this. The outcome shown in figure a/2 was supposed to be the only one possible, the only one that satisfied both conservation of energy and conservation of momentum. So how can the different result shown in figure b be possible? The answer is that relativistically, momentum must not equal \(mv\). The old, familiar definition is only an approximation that's valid at low speeds. If we observe the behavior of the small ball in figure b, it looks as though it somehow had some extra inertia. It's as though a football player tried to knock another player down without realizing that the other guy had a three-hundred-pound bag full of lead shot hidden under his uniform --- he just doesn't seem to react to the collision as much as he should. As proved in section 7.3.4, this extra inertia is described by redefining momentum as

\[\begin{equation*} p = m \gamma v . \end{equation*}\]

At very low velocities, \(\gamma\) is close to 1, and the result is very nearly \(mv\), as demanded by the correspondence principle. But at very high velocities, \(\gamma\) gets very big --- the small ball in figure b has a \(\gamma\) of 5.0, and therefore has five times more inertia than we would expect nonrelativistically.

This also explains the answer to another paradox often posed by beginners at relativity. Suppose you keep on applying a steady force to an object that's already moving at \(0.9999c\). Why doesn't it just keep on speeding up past \(c\)? The answer is that force is the rate of change of momentum. At \(0.9999c\), an object already has a \(\gamma\) of 71, and therefore has already sucked up 71 times the momentum you'd expect at that speed. As its velocity gets closer and closer to \(c\), its \(\gamma\) approaches infinity. To move at \(c\), it would need an infinite momentum, which could only be caused by an infinite force.
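To get a feeling for how quickly \(\gamma\) and the momentum grow, here is a short sketch that tabulates \(\gamma\) and the ratio of \(p=m\gamma v\) to the nonrelativistic \(mv\) at a few speeds, in units where \(c=1\). The function names are assumptions introduced for this illustration.

```python
import math

def gamma(v):
    """Lorentz factor for a speed v expressed as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v**2)

def momentum(m, v):
    """Relativistic momentum p = m * gamma * v, in units where c = 1."""
    return m * gamma(v) * v

m = 1.0
for v in (0.1, 0.9, 0.99, 0.9999, 0.999999):
    ratio = momentum(m, v) / (m * v)   # equals gamma
    print(f"v = {v:<9} gamma = {gamma(v):10.2f}  p/(mv) = {ratio:10.2f}")
```

The ratio grows without bound as \(v\) approaches 1, so no finite momentum, and hence no finite force acting for a finite time, can bring the object up to \(c\).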

Example 14: Push as hard as you like ...
We don't have to depend on our imaginations to see what would happen if we kept on applying a force to an object indefinitely and tried to accelerate it past \(c\). A nice experiment of this type was done by Bertozzi in 1964. In this experiment, electrons were accelerated by an electric field \(E\) through a distance \(\ell_1\). Applying Newton's laws gives Newtonian predictions \(a_N\) for the acceleration and \(t_N\) for the time required.4

The electrons were then allowed to fly down a pipe for a further distance \(\ell_2=8.4\ \text{m}\) without being acted on by any force. The time of flight \(t_2\) for this second distance was used to find the final velocity \(v=\ell_2/t_2\) to which they had actually been accelerated.

Figure c shows the results.5 According to Newton, an acceleration \(a_N\) acting for a time \(t_N\) should produce a final velocity \(a_N t_N\). The solid line in the graph shows the prediction of Newton's laws, which is that a constant force exerted steadily over time will produce a velocity that rises linearly and without limit.

The experimental data, shown as black dots, clearly tell a different story. The velocity never goes above a certain maximum value, which we identify as \(c\). The dashed line shows the predictions of special relativity, which are in good agreement with the experimental results.


e / A New York Times headline from November 10, 1919, describing the observations discussed in example 15.


f / Top: A PET scanner. Middle: Each positron annihilates with an electron, producing two gamma-rays that fly off back-to-back. When two gamma rays are observed simultaneously in the ring of detectors, they are assumed to come from the same annihilation event, and the point at which they were emitted must lie on the line connecting the two detectors. Bottom: A scan of a person's torso. The body has concentrated the radioactive tracer around the stomach, indicating an abnormal medical condition.


g / In the \(p\)-\(E\) plane, massless particles lie on the two diagonals, while particles with mass lie to the right.

7.3.2 Equivalence of mass and energy

Now we're ready to see why mass and energy must be equivalent as claimed in the famous \(E=mc^2\). So far we've only considered collisions in which none of the kinetic energy is converted into any other form of energy, such as heat or sound. Let's consider what happens if a blob of putty moving at velocity \(v\) hits another blob that is initially at rest, sticking to it. The nonrelativistic result is that to obey conservation of momentum the two blobs must fly off together at \(v/2\). Half of the initial kinetic energy has been converted to heat.6

Relativistically, however, an interesting thing happens. A hot object has more momentum than a cold object! This is because the relativistically correct expression for momentum is \(m\gamma v\), and the more rapidly moving atoms in the hot object have higher values of \(\gamma\). In our collision, the final combined blob must therefore be moving a little more slowly than the expected \(v/2\), since otherwise the final momentum would have been a little greater than the initial momentum. To an observer who believes in conservation of momentum and knows only about the overall motion of the objects and not about their heat content, the low velocity after the collision would seem to be the result of a magical change in the mass, as if the mass of two combined, hot blobs of putty was more than the sum of their individual masses.

Now we know that the masses of all the atoms in the blobs must be the same as they always were. The change is due to the change in \(\gamma\) with heating, not to a change in mass. The heat energy, however, seems to be acting as if it was equivalent to some extra mass.

But this whole argument was based on the fact that heat is a form of kinetic energy at the atomic level. Would \(E=mc^2\) apply to other forms of energy as well? Suppose a rocket ship contains some electrical energy stored in a battery. If we believed that \(E=mc^2\) applied to forms of kinetic energy but not to electrical energy, then we would have to believe that the pilot of the rocket could slow the ship down by using the battery to run a heater! This would not only be strange, but it would violate the principle of relativity, because the result of the experiment would be different depending on whether the ship was at rest or not. The only logical conclusion is that all forms of energy are equivalent to mass. Running the heater then has no effect on the motion of the ship, because the total energy in the ship was unchanged; one form of energy (electrical) was simply converted to another (heat).

The equation \(E=mc^2\) tells us how much energy is equivalent to how much mass: the conversion factor is the square of the speed of light, \(c\). Since \(c\) is a big number, you get a really really big number when you multiply it by itself to get \(c^2\). This means that even a small amount of mass is equivalent to a very large amount of energy.


d / Example 15, page 416.

Example 15: Gravity bending light
Gravity is a universal attraction between things that have mass, and since the energy in a beam of light is equivalent to some very small amount of mass, we expect that light will be affected by gravity, although the effect should be very small. The first important experimental confirmation of relativity came in 1919 when stars next to the sun during a solar eclipse were observed to have shifted a little from their ordinary position. (If there was no eclipse, the glare of the sun would prevent the stars from being observed.) Starlight had been deflected by the sun's gravity. Figure d is a photographic negative, so the circle that appears bright is actually the dark face of the moon, and the dark area is really the bright corona of the sun. The stars, marked by lines above and below them, appeared at positions slightly different than their normal ones.

Example 16: Black holes
A star with sufficiently strong gravity can prevent light from leaving. Quite a few black holes have been detected via their gravitational forces on neighboring stars or clouds of gas and dust.

You've learned about conservation of mass and conservation of energy, but now we see that they're not even separate conservation laws. As a consequence of the theory of relativity, mass and energy are equivalent, and are not separately conserved --- one can be converted into the other. Imagine that a magician waves his wand, and changes a bowl of dirt into a bowl of lettuce. You'd be impressed, because you were expecting that both dirt and lettuce would be conserved quantities. Neither one can be made to vanish, or to appear out of thin air. However, there are processes that can change one into the other. A farmer changes dirt into lettuce, and a compost heap changes lettuce into dirt. At the most fundamental level, lettuce and dirt aren't really different things at all; they're just collections of the same kinds of atoms --- carbon, hydrogen, and so on. Because mass and energy are like two different sides of the same coin, we may speak of mass-energy, a single conserved quantity, found by adding up all the mass and energy, with the appropriate conversion factor: \(E+mc^2\).

Example 17: A rusting nail
\(\triangleright\) An iron nail is left in a cup of water until it turns entirely to rust. The energy released is about 0.5 MJ. In theory, would a sufficiently precise scale register a change in mass? If so, how much?

\(\triangleright\) The energy will appear as heat, which will be lost to the environment. The total mass-energy of the cup, water, and iron will indeed be lessened by 0.5 MJ. (If it had been perfectly insulated, there would have been no change, since the heat energy would have been trapped in the cup.) The speed of light is \(c=3\times10^8\) meters per second, so converting to mass units, we have

\[\begin{align*} m &= \frac{E}{c^2} \\ &= \frac{0.5\times10^6\ \text{J}}{\left(3\times10^8\ \text{m}/\text{s}\right)^2} \\ &= 6\times10^{-12}\ \text{kilograms} . \end{align*}\]

The change in mass is too small to measure with any practical technique. This is because the square of the speed of light is such a large number.
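The same conversion is easy to script. This is a minimal sketch using the rounded value of \(c\) from the example; the function name is made up for illustration.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the example

def mass_equivalent(energy_joules):
    """Mass in kilograms equivalent to a given energy, m = E / c^2."""
    return energy_joules / C**2

nail = mass_equivalent(0.5e6)  # 0.5 MJ released by the rusting nail
print(f"mass change = {nail:.1e} kg")
```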

Example 18: Electron-positron annihilation
Natural radioactivity in the earth produces positrons, which are like electrons but have the opposite charge. A form of antimatter, positrons annihilate with electrons to produce gamma rays, a form of high-frequency light. Such a process would have been considered impossible before Einstein, because conservation of mass and energy were believed to be separate principles, and this process eliminates 100% of the original mass. The amount of energy produced by annihilating 1 kg of matter with 1 kg of antimatter is
\[\begin{align*} E &= mc^2\\ &= (2\ \text{kg})\left(3.0\times10^8\ \text{m}/\text{s}\right)^2\\ &= 2\times10^{17}\ \text{J} , \end{align*}\]
which is on the same order of magnitude as a day's energy consumption for the entire world's population!

Positron annihilation forms the basis for the medical imaging technique called a PET (positron emission tomography) scan, in which a positron-emitting chemical is injected into the patient and mapped by the emission of gamma rays from the parts of the body where it accumulates.

One commonly hears some misinterpretations of \(E=mc^2\), one being that the equation tells us how much kinetic energy an object would have if it was moving at the speed of light. This wouldn't make much sense, both because the equation for kinetic energy has \(1/2\) in it, \(KE=(1/2)mv^2\), and because a material object can't be made to move at the speed of light. However, this naturally leads to the question of just how much mass-energy a moving object has. We know that when the object is at rest, it has no kinetic energy, so its mass-energy is simply equal to the energy-equivalent of its mass, \(mc^2\),

\[\begin{equation*} \massenergy = mc^2 \ \text{when}\ v=0 , \end{equation*}\]

where the symbol \(\massenergy\) (cursive “E”) stands for mass-energy. The point of using the new symbol is simply to remind ourselves that we're talking about relativity, so an object at rest has \(\massenergy=mc^2\), not \(E=0\) as we'd assume in nonrelativistic physics.

Suppose we start accelerating the object with a constant force. A constant force means a constant rate of transfer of momentum, but \(p=m\gamma v\) approaches infinity as \(v\) approaches \(c\), so the object will only get closer and closer to the speed of light, but never reach it. Now what about the work being done by the force? The force keeps doing work and doing work, which means that we keep on using up energy. Mass-energy is conserved, so the energy being expended must equal the increase in the object's mass-energy. We can continue this process for as long as we like, and the amount of mass-energy will increase without limit. We therefore conclude that an object's mass-energy approaches infinity as its speed approaches the speed of light,

\[\begin{equation*} \massenergy \rightarrow \infty\ \text{when}\ v \rightarrow c . \end{equation*}\]

Now that we have some idea what to expect, what is the actual equation for the mass-energy? As proved in section 7.3.4, it is

\[\begin{equation*} \massenergy =m\gamma c^2 . \end{equation*}\]


Verify that this equation has the two properties we wanted.

(answer in the back of the PDF version of the book)
Example 19: KE compared to \(mc^2\) at low speeds
\(\triangleright\) An object is moving at ordinary nonrelativistic speeds. Compare its kinetic energy to the energy \(mc^2\) it has purely because of its mass.

\(\triangleright\) The speed of light is a very big number, so \(mc^2\) is a huge number of joules. The object has a gigantic amount of energy because of its mass, and only a relatively small amount of additional kinetic energy because of its motion.

Another way of seeing this is that at low speeds, \(\gamma\) is only a tiny bit greater than 1, so \(\massenergy\) is only a tiny bit greater than \(mc^2\).

Example 20: The correspondence principle for mass-energy

\(\triangleright\) Show that the equation \(\massenergy=m\gamma c^2\) obeys the correspondence principle.

\(\triangleright\) As we accelerate an object from rest, its mass-energy becomes greater than its resting value. Classically, we interpret this excess mass-energy as the object's kinetic energy,

\[\begin{align*} KE &= \massenergy(v)-\massenergy(v=0) \\ &= m\gamma c^2 - m c^2 \\ &= m(\gamma-1)c^2 . \end{align*}\]

Expressing \(\gamma\) as \(\left(1-v^2/c^2\right)^{-1/2}\) and making use of the approximation \((1+\epsilon)^p\approx 1+p\epsilon\) for small \(\epsilon\), we have \(\gamma\approx 1+v^2/2c^2\), so

\[\begin{align*} KE &\approx m(1+\frac{v^2}{2c^2}-1)c^2 \\ &= \frac{1}{2}mv^2 , \end{align*}\]

which is the classical expression. As demanded by the correspondence principle, relativity agrees with classical physics at speeds that are small compared to the speed of light.
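The approximation can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the rearrangement \(\gamma-1=\gamma^2(v^2/c^2)/(\gamma+1)\) is used only to avoid round-off error when subtracting two nearly equal numbers at low speeds.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def ke_relativistic(m, v):
    """Kinetic energy m(gamma - 1)c^2, in joules, for mass m (kg) and speed v (m/s)."""
    beta2 = (v / C) ** 2
    g = 1.0 / math.sqrt(1.0 - beta2)
    # gamma - 1 rewritten as gamma^2 * beta^2 / (gamma + 1) for numerical stability
    return m * C ** 2 * g * g * beta2 / (g + 1.0)

def ke_classical(m, v):
    """Newtonian kinetic energy (1/2)mv^2, in joules."""
    return 0.5 * m * v ** 2

def ke_ratio(beta):
    """Ratio of relativistic to classical kinetic energy at speed beta*c."""
    v = beta * C
    return ke_relativistic(1.0, v) / ke_classical(1.0, v)

for beta in (1e-6, 1e-3, 0.1, 0.5):
    print(f"v/c = {beta:g}: KE_rel / KE_classical = {ke_ratio(beta):.9f}")
```

The ratio stays very close to 1 while \(v\) is a small fraction of \(c\) and grows noticeably larger than 1 at half the speed of light, which is the behavior the correspondence principle demands.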

The energy-momentum four-vector (optional)

Starting from \(\massenergy=m\gamma\) and \(p=m\gamma v\), a little algebra allows one to prove the identity

\[\begin{equation*} m^2 = \massenergy^2 - p^2 . \end{equation*}\]
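One way to carry out the algebra, sketched here in relativistic units (\(c=1\)) and using only \(\gamma=1/\sqrt{1-v^2}\), is

\[\begin{align*} \massenergy^2 - p^2 &= m^2\gamma^2 - m^2\gamma^2v^2 \\ &= m^2\gamma^2\left(1-v^2\right) \\ &= m^2 , \end{align*}\]

where the last step uses \(\gamma^2\left(1-v^2\right)=1\).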

We can define an energy-momentum four-vector,

\[\begin{equation*} \mathbf{p} = (\massenergy,p_x,p_y,p_z) , \end{equation*}\]

and the relation \(m^2 = \massenergy^2 - p^2\) then arises from the inner product \(\mathbf{p}\cdot\mathbf{p}\). Since \(\massenergy\) and \(p\) are separately conserved, the energy-momentum four-vector is also conserved.

Example 21: Energy and momentum of light
Light has \(m=0\) and \(\gamma=\infty\), so if we try to apply \(\massenergy=m\gamma\) and \(p=m\gamma v\) to light, or to any massless particle, we get the indeterminate form \(0\cdot\infty\), which can't be evaluated without a delicate and laborious evaluation of limits as in problem 11 on p. 441.

Applying \(m^2 = \massenergy^2 - p^2\) yields the same result, \(\massenergy=|p|\), much more easily. This example demonstrates that although we encountered the relations \(\massenergy=m\gamma\) and \(p=m\gamma v\) first, the identity \(m^2 = \massenergy^2 - p^2\) is actually more fundamental.

For the reasons given in example 21, we take \(m^2 = \massenergy^2 - p^2\) to be the definition of mass in relativity. One thing to be careful about is that this definition is not additive. Suppose that we lump two systems together and call them one big system, adding their mass-energies and momenta. When we do this, the mass of the combination is not the same as the sum of the masses. For example, suppose we have two rays of light moving in opposite directions, with energy-momentum vectors \((\massenergy,\massenergy,0,0)\) and \((\massenergy,-\massenergy,0,0)\). Adding these gives \((2\massenergy,0,0,0)\), which implies a mass equal to \(2\massenergy\), even though each ray individually has zero mass. In fact, in the early universe, where the density of light was high, the universe's ambient gravitational fields were mainly those caused by the light it contained.
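The two-ray example can be reproduced with a few lines of arithmetic. This is a minimal sketch in relativistic units (\(c=1\)); the helper names `invariant_mass` and `add_four_vectors` are illustrative and do not come from the text.

```python
import math

def invariant_mass(p4):
    """Mass defined by m^2 = E^2 - p^2 for a four-vector (E, px, py, pz), with c = 1."""
    e, px, py, pz = p4
    m_squared = e * e - (px * px + py * py + pz * pz)
    # Clamp tiny negative values caused by round-off before taking the root
    return math.sqrt(max(m_squared, 0.0))

def add_four_vectors(a, b):
    """Component-wise sum, as when two systems are lumped into one."""
    return tuple(x + y for x, y in zip(a, b))

E = 1.0  # energy of each ray, in arbitrary relativistic units
ray_right = (E, E, 0.0, 0.0)
ray_left = (E, -E, 0.0, 0.0)
total = add_four_vectors(ray_right, ray_left)

print("mass of each ray:", invariant_mass(ray_right), invariant_mass(ray_left))
print("mass of the pair:", invariant_mass(total))
```

Each ray alone has zero mass, but the pair has mass \(2\massenergy\), so the masses do not simply add.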

Example 22: Mass-energy, not energy, goes in the energy-momentum four-vector

When we say that something is a four-vector, we mean that it behaves properly under a Lorentz transformation: we can draw such a four-vector on graph paper, and then when we change frames of reference, we should be able to measure the vector in the new frame of reference by using the new version of the graph-paper grid derived from the old one by a Lorentz transformation.

If we had used the energy \(E\) rather than the mass-energy \(\massenergy\) to construct the energy-momentum four-vector, we wouldn't have gotten a valid four-vector. An easy way to see this is to consider the case where a noninteracting object is at rest in some frame of reference. Its momentum and kinetic energy are both zero. If we'd defined \(\mathbf{p}=(E,p_x,p_y,p_z)\) rather than \(\mathbf{p}=(\massenergy,p_x,p_y,p_z)\), we would have had \(\mathbf{p}=0\) in this frame. But when we draw a zero vector, we get a point, and a point remains a point regardless of how we distort the graph paper we use to measure it. That wouldn't have made sense, because in other frames of reference, we have \(E\ne 0\).

Example 23: Metric units

The relation \( m^2 = \massenergy^2 - p^2 \) is only valid in relativistic units. If we tried to apply it without modification to numbers expressed in metric units, we would have

\[\begin{equation*} \text{kg}^2 = \text{kg}^2\!\cdot\!\frac{\text{m}^4}{\text{s}^4} - \text{kg}^2\!\cdot\!\frac{\text{m}^2}{\text{s}^2} , \end{equation*}\]

which would be nonsense because the three terms all have different units. As usual, we need to insert factors of \(c\) to make a metric version, and these factors of \(c\) are determined by the need to fix the broken units:

\[\begin{equation*} m^2c^4 = \massenergy^2 - p^2c^2 \end{equation*}\]
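As a rough numerical check of the metric version, the sketch below builds the mass-energy and momentum of a hypothetical 1 kg object moving at \(0.6c\) and then recovers its mass from them. The test values and the function name are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)

def mass_from_energy_momentum(energy, momentum):
    """Solve m^2 c^4 = E^2 - p^2 c^2 for m, with E in joules and p in kg*m/s."""
    m2c4 = energy ** 2 - (momentum * C) ** 2
    return math.sqrt(m2c4) / C ** 2

# Hypothetical test object: mass 1 kg moving at 0.6c
m = 1.0
v = 0.6 * C
g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
energy = m * g * C ** 2   # metric form of the mass-energy, m*gamma*c^2
momentum = m * g * v      # p = m*gamma*v

recovered = mass_from_energy_momentum(energy, momentum)
print("recovered mass in kg:", recovered)
```

Both terms on the right-hand side now carry units of joules squared, so the subtraction is meaningful and the original mass comes back out.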

Example 24: Pair production requires matter
Example 18 on p. 417 discussed the annihilation of an electron and a positron into two gamma rays, which is an example of turning matter into pure energy. An opposite example is pair production, a process in which a gamma ray disappears, and its energy goes into creating an electron and a positron.

Pair production cannot happen in a vacuum. For example, gamma rays from distant black holes can travel through empty space for thousands of years before being detected on earth, and they don't turn into electron-positron pairs before they can get here. Pair production can only happen in the presence of matter. When lead is used as shielding against gamma rays, one of the ways the gamma rays can be stopped in the lead is by undergoing pair production.

To see why pair production is forbidden in a vacuum, consider the process in the frame of reference in which the electron-positron pair has zero total momentum. In this frame, the gamma ray would have to have had zero momentum, but a gamma ray with zero momentum must have zero energy as well (example 21). This means that conservation of four-momentum has been violated: the timelike component of the four-momentum is the mass-energy, and it has increased from 0 in the initial state to at least \(2mc^2\) in the final state.


This optional section proves some results claimed earlier.

Ultrarelativistic motion

We start by considering the case of a particle, described as “ultrarelativistic,” that travels at very close to the speed of light. A good way of thinking about such a particle is that it's one with a very small mass. For example, the subatomic particle called the neutrino has a very small mass, thousands of times smaller than that of the electron. Neutrinos are emitted in radioactive decay, and because the neutrino's mass is so small, the amount of energy available in these decays is always enough to accelerate it to very close to the speed of light. Nobody has ever succeeded in observing a neutrino that was not ultrarelativistic. When a particle's mass is very small, the mass becomes difficult to measure. For almost 70 years after the neutrino was discovered, its mass was thought to be zero. Similarly, we currently believe that a ray of light has no mass, but it is always possible that its mass will be found to be nonzero at some point in the future. A ray of light can be modeled as an ultrarelativistic particle.

Let's compare ultrarelativistic particles with train cars. A single car with kinetic energy \(E\) has different properties than a train of two cars each with kinetic energy \(E/2\). The single car has half the mass and a speed that is greater by a factor of \(\sqrt{2}\). But the same is not true for ultrarelativistic particles. Since an idealized ultrarelativistic particle has a mass too small to be detectable in any experiment, we can't detect the difference between \(m\) and \(2m\). Furthermore, ultrarelativistic particles move at close to \(c\), so there is no observable difference in speed. Thus we expect that a single ultrarelativistic particle with energy \(E\) compared with two such particles, each with energy \(E/2\), should have all the same properties as measured by a mechanical detector.

An idealized zero-mass particle also has no frame in which it can be at rest. It always travels at \(c\), and no matter how fast we chase after it, we can never catch up. We can, however, observe it in different frames of reference, and we will find that its energy is different. For example, distant galaxies are receding from us at substantial fractions of \(c\), and when we observe them through a telescope, they appear very dim not just because they are very far away but also because their light has less energy in our frame than in a frame at rest relative to the source. This effect must be such that changing frames of reference according to a specific Lorentz transformation always changes the energy of the particle by a fixed factor, regardless of the particle's original energy; for if not, then the effect of a Lorentz transformation on a single particle of energy \(E\) would be different from its effect on two particles of energy \(E/2\).

How does this energy-shift factor depend on the velocity \(v\) of the Lorentz transformation? Rather than \(v\), it becomes more convenient to express things in terms of the Doppler shift factor \(D\), which multiplies when we change frames of reference. Let's write \(f(D)\) for the energy-shift factor that results from a given Lorentz transformation. Since a Lorentz transformation \(D_1\) followed by a second transformation \(D_2\) is equivalent to a single transformation by \(D_1D_2\), we must have \(f(D_1D_2)=f(D_1)f(D_2)\). This tightly constrains the form of the function \(f\); it must be something like \(f(D)=D^n\), where \(n\) is a constant. We postpone until p. 424 the proof that \(n=1\), which is also in agreement with experiments with rays of light.
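The claim that Doppler factors multiply can be checked numerically. The sketch below assumes the standard one-dimensional expressions \(D(v)=\sqrt{(1+v)/(1-v)}\) and the velocity-combination rule \((v_1+v_2)/(1+v_1v_2)\), in units with \(c=1\); neither formula is restated in this section, and the function names are illustrative.

```python
import math

def doppler(v):
    """Doppler shift factor D for a frame velocity v, in units where c = 1."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def combine_velocities(v1, v2):
    """Relativistic combination of two collinear velocities, with c = 1."""
    return (v1 + v2) / (1.0 + v1 * v2)

def energy_shift(d, n=1):
    """Candidate energy-shift function f(D) = D^n."""
    return d ** n

v1, v2 = 0.3, 0.5
d_combined = doppler(combine_velocities(v1, v2))
d_product = doppler(v1) * doppler(v2)
print("D of the combined transformation:", d_combined)
print("product of the individual D's:   ", d_product)

# Any power law satisfies f(D1*D2) = f(D1)*f(D2); here n = 2 is tried as an example
lhs = energy_shift(doppler(v1) * doppler(v2), n=2)
rhs = energy_shift(doppler(v1), n=2) * energy_shift(doppler(v2), n=2)
print("f(D1*D2) and f(D1)*f(D2) for n = 2:", lhs, rhs)
```

The check shows only that a power law is consistent with the multiplication rule; it does not by itself fix the value of \(n\).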

Our final result is that the energy of an ultrarelativistic particle is simply proportional to its Doppler shift factor \(D\). Even in the case where the particle is truly massless, so that \(D\) doesn't have any finite value, we can still find how the energy differs according to different observers by finding the \(D\) of the Lorentz transformation between the two observers' frames of reference.


The following argument is due to Einstein. Suppose that a material object O of mass \(m\), initially at rest in a certain frame A, emits two rays of light, each with energy \(E/2\). By conservation of energy, the object must have lost an amount of energy equal to \(E\). By symmetry, O remains at rest.

We now switch to a new frame of reference moving at a certain velocity \(v\) in the \(z\) direction relative to the original frame. We assume that O's energy is different in this frame, but that the change in its energy amounts to multiplication by some unitless factor \(x\), which depends only on \(v\), since there is nothing else it could depend on that could allow us to form a unitless quantity. In this frame the light rays, taken to be emitted in opposite directions along the \(z\) axis, have energies \((E/2)D(v)\) and \((E/2)D(-v)\). If conservation of energy is to hold in the new frame as it did in the old, we must have \(2xE=ED(v)+ED(-v)\). After some algebra, we find \(x=1/\sqrt{1-v^2}\), which we recognize as \(\gamma\). This proves that \(E=m\gamma\) for a material object.
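The algebra can be sketched as follows, assuming the Doppler shift factor has the form \(D(v)=\sqrt{(1+v)/(1-v)}\) in units with \(c=1\), a formula not restated in this section. Dividing the conservation condition by \(E\) gives

\[\begin{align*} 2x &= \sqrt{\frac{1+v}{1-v}} + \sqrt{\frac{1-v}{1+v}} \\ &= \frac{(1+v)+(1-v)}{\sqrt{(1-v)(1+v)}} \\ &= \frac{2}{\sqrt{1-v^2}} , \end{align*}\]

so \(x=1/\sqrt{1-v^2}\).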


We've seen that ultrarelativistic particles are “generic,” in the sense that they have no individual mechanical properties other than an energy and a direction of motion. Therefore the relationship between energy and momentum must be linear for ultrarelativistic particles. Indeed, experiments verify that light has momentum, and doubling the energy of a ray of light doubles its momentum rather than quadrupling it. On a graph of \(p\) versus \(E\), massless particles, which have \(E\propto|p|\), lie on two diagonal lines that connect at the origin. If we like, we can pick units such that the slopes of these lines are plus and minus one. Material particles lie to the right of these lines. For example, a car sitting in a parking lot has \(p=0\) and \(E=mc^2\).

Now what happens to such a graph when we change to a different frame of reference that is in motion relative to the original frame? A massless particle still has to act like a massless particle, so the diagonals are simply stretched or contracted along their own lengths. In fact the transformation must be linear (p. 387), because conservation of energy and momentum involve addition, and we need these laws to be valid in all frames of reference. By the same reasoning as in figure j on p. 389, the transformation must be area-preserving. We then have the same three cases to consider as in figure g on p. 388. Case I is ruled out because it would imply that particles keep the same energy when we change frames. (This is what would happen if \(c\) were infinite, so that the mass-equivalent \(E/c^2\) of a given energy was zero, and therefore \(E\) would be interpreted purely as the mass.) Case II can't be right because it doesn't preserve the \(E=|p|\) diagonals. We are left with case III, which establishes the fact that the \(p\)-\(E\) plane transforms according to exactly the same kind of Lorentz transformation as the \(x\)-\(t\) plane. That is, \((E,p_x,p_y,p_z)\) is a four-vector.

The only remaining issue to settle is whether the choice of units that gives invariant 45-degree diagonals in the \(x\)-\(t\) plane is the same as the choice of units that gives such diagonals in the \(p\)-\(E\) plane. That is, we need to establish that the \(c\) that applies to \(x\) and \(t\) is equal to the \(c'\) needed for \(p\) and \(E\), i.e., that the velocity scales of the two graphs are matched up. This is true because in the Newtonian limit, the total mass-energy \(E\) is essentially just the particle's mass, and then \(p/E \approx p/m \approx v\). This establishes that the velocity scales are matched at small velocities, which implies that they coincide for all velocities, since a large velocity, even one approaching \(c\), can be built up from many small increments. (This also establishes that the exponent \(n\) defined on p. 423 equals 1 as claimed.)

Since \(m^2=E^2-p^2\), it follows that for a material particle, \(p=m\gamma v\).
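The intermediate steps, sketched in units with \(c=1\) and using the result \(E=m\gamma\) established above, are

\[\begin{align*} p^2 &= E^2 - m^2 \\ &= m^2\left(\gamma^2-1\right) \\ &= m^2\gamma^2v^2 , \end{align*}\]

where the last line uses \(\gamma^2-1=\gamma^2v^2\), which follows from \(\gamma^2\left(1-v^2\right)=1\). Taking the square root gives \(|p|=m\gamma|v|\), with the momentum pointing along the direction of motion.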

7.4 General Relativity (optional)

Postulates of Euclidean geometry:
1. Two points determine a line.
2. Line segments can be extended.
3. A unique circle can be constructed given any point as its center and any line segment as its radius.
4. All right angles are equal to one another.
5. Given a line and a point not on the line, no more than one line can be drawn through the point and parallel to the given line.


a / Noneuclidean effects, such as the discrepancy from \(180°\) in the sum of the angles of a triangle, are expected to be proportional to area. Here, a noneuclidean equilateral triangle is cut up into four smaller equilateral triangles, each with 1/4 the area. As proved in problem 22, the discrepancy is quadrupled when the area is quadrupled.

What you've learned so far about relativity is known as the special theory of relativity, which is compatible with three of the four known forces of nature: electromagnetism, the strong nuclear force, and the weak nuclear force. Gravity, however, can't be shoehorned into the special theory. In order to make gravity work, Einstein had to generalize relativity. The resulting theory is known as the general theory of relativity.


b / An Einstein's ring. The distant object is a quasar, MG1131+0456, and the one in the middle is an unknown object, possibly a supermassive black hole. The intermediate object's gravity focuses the rays of light from the distant one. Because the entire arrangement lacks perfect axial symmetry, the ring is nonuniform; most of its brightness is concentrated in two lumps on opposite sides.


d / Gravity Probe B was in a polar orbit around the earth. As in the right panel of figure c, the orientation of the gyroscope changes when it is carried around a curve and back to its starting point. Because the effect was small, it was necessary to let it accumulate over the course of 5000 orbits in order to make it detectable.


e / A triangle in a space with negative curvature has angles that add to less than \(180°\).

7.4.1 Our universe isn't Euclidean

Euclid proved thousands of years ago that the angles in a triangle add up to \(180°\). But what does it really mean to “prove” this? Euclid proved it based on certain assumptions (his five postulates), listed in the margin of this page. But how do we know that the postulates are true?

Only by observation can we tell whether any of Euclid's statements are correct characterizations of how space actually behaves in our universe. If we draw a triangle on paper with a ruler and measure its angles with a protractor, we will quickly verify to pretty good precision that the sum is close to \(180°\). But of course we already knew that space was at least approximately Euclidean. If there had been any gross error in Euclidean geometry, it would have been detected in Euclid's own lifetime. The correspondence principle tells us that if there is going to be any deviation from Euclidean geometry, it must be small under ordinary conditions.

To improve the precision of the experiment, we need to make sure that our ruler is very straight. One way to check would be to sight along it by eye, which amounts to comparing its straightness to that of a ray of light. For that matter, we might as well throw the physical ruler in the trash and construct our triangle out of three laser beams. To avoid effects from the air we should do the experiment in outer space. Doing it in space also has the advantage of allowing us to make the triangle very large; as shown in figure a, the discrepancy from \(180°\) is expected to be proportional to the area of the triangle.

But we already know that light rays are bent by gravity. We expect it based on \(E=mc^2\), which tells us that the energy of a light ray is equivalent to a certain amount of mass, and furthermore it has been verified experimentally by the deflection of starlight by the sun (example 15, p. 416). We therefore know that our universe is noneuclidean, and we gain the further insight that the level of deviation from Euclidean behavior depends on gravity.

Since the noneuclidean effects are bigger when the system being studied is larger, we expect them to be especially important in the study of cosmology, where the distance scales are very large.

Example 25: Einstein's ring

An Einstein's ring, figure b, is formed when there is a chance alignment of a distant source with a closer gravitating body. This type of gravitational lensing is direct evidence for the noneuclidean nature of space. The two light rays are lines, and they violate Euclid's first postulate, that two points determine a line.

One could protest that effects like these are just an imperfection of the light rays as physical models of straight lines. Maybe the noneuclidean effects would go away if we used something better and straighter than a light ray. But we don't know of anything straighter than a light ray. Furthermore, we observe that all measuring devices, not just optical ones, report the same noneuclidean behavior.


An example of such a non-optical measurement is the Gravity Probe B satellite, figure d, which was launched into a polar orbit in 2004 and operated until 2010. The probe carried four gyroscopes made of quartz, which were the most perfect spheres ever manufactured, varying from sphericity by no more than about 40 atoms. Each gyroscope floated weightlessly in a vacuum, so that its rotation was perfectly steady. After 5000 orbits, the gyroscopes had reoriented themselves by about \(2\times10^{-3}°\) relative to the distant stars. This effect cannot be explained by Newtonian physics, since no torques acted on them. It was, however, exactly as predicted by Einstein's theory of general relativity. It becomes easier to see why such an effect would be expected due to the noneuclidean nature of space if we characterize euclidean geometry as the geometry of a flat plane as opposed to a curved one. On a curved surface like a sphere, figure c, Euclid's fifth postulate fails, and it's not hard to see that we can get triangles for which the sum of the angles is not \(180°\). By transporting a gyroscope all the way around the edges of such a triangle and back to its starting point, we change its orientation.


c / Left: A 90-90-90 triangle. Its angles add up to more than \(180°\). Middle: The triangle “pops” off the page visually. We intuitively want to visualize it as lying on a curved surface such as the earth's. Right: A gyroscope carried smoothly around its perimeter ends up having changed its orientation when it gets back to its starting point.

The triangle in figure c has angles that add up to more than \(180°\). This type of curvature is referred to as positive. It is also possible to have negative curvature, as in figure e.

In general relativity, curvature isn't just something caused by gravity. Gravity is curvature, and the curvature involves both space and time, as may become clearer once you get to figure k. Thus the distinction between special and general relativity is that general relativity handles curved spacetime, while special relativity is restricted to the case where spacetime is flat.

Curvature doesn't require higher dimensions

Although we often visualize curvature by imagining embedding a two-dimensional surface in a three-dimensional space, that's just an aid in visualization. There is no evidence for any additional dimensions, nor is it necessary to hypothesize them in order to let spacetime be curved as described in general relativity.


f / Only measurements from within the plane define whether the plane is curved. It could look curved when drawn embedded in three dimensions, but nevertheless still be intrinsically flat.

Put yourself in the shoes of a two-dimensional being living in a two-dimensional space. Euclid's postulates all refer to constructions that can be performed using a compass and an unmarked straightedge. If this being can physically verify them all as descriptions of the space she inhabits, then she knows that her space is Euclidean, and that propositions such as the Pythagorean theorem are physically valid in her universe. But the diagram in f/1 illustrating the proof of the Pythagorean theorem in Euclid's Elements (proposition I.47) is equally valid if the page is rolled onto a cylinder, 2, or formed into a wavy corrugated shape, 3. These types of curvature, which can be achieved without tearing or crumpling the surface, are not real to her. They are simply side-effects of visualizing her two-dimensional universe as if it were embedded in a hypothetical third dimension --- which doesn't exist in any sense that is empirically verifiable to her. Of the curved surfaces in figure f, only the sphere, 4, has curvature that she can measure; the diagram can't be plastered onto the sphere without folding or cutting and pasting.

So the observation of curvature doesn't imply the existence of extra dimensions, nor does embedding a space in a higher-dimensional one so that it looks curvy always mean that there will be any curvature detectable from within the lower-dimensional space.


g / An artificial horizon.


h / 1. A ray of light is emitted upward from the floor of the elevator. The elevator accelerates upward. 2. By the time the light is detected at the ceiling, the elevator has changed its velocity, so the light is detected with a Doppler shift.


i / Pound and Rebka at the top and bottom of the tower.


j / The earth is flat --- locally.


k / Spacetime is locally flat.

7.4.2 The equivalence principle

Universality of free-fall

Although light rays and gyroscopes seem to agree that space is curved in a gravitational field, it's always conceivable that we could find something else that would disagree. For example, suppose that there is a new and improved ray called the \(\text{StraightRay}^\text{TM}\). The StraightRay is like a light ray, but when we construct a triangle out of StraightRays, we always get the Euclidean result for the sum of the angles. We would then have to throw away general relativity's whole idea of describing gravity in terms of curvature. One good way of making a StraightRay would be if we had a supply of some kind of exotic matter --- call it \(\text{FloatyStuff}^\text{TM}\) --- that had the ordinary amount of inertia, but was completely unaffected by gravity. We could then shoot a stream of FloatyStuff particles out of a nozzle at nearly the speed of light and make a StraightRay.

Normally when we release a material object in a gravitational field, it experiences a force \(mg\), and then by Newton's second law its acceleration is \(a=F/m=mg/m=g\). The \(m\)'s cancel, which is the reason that everything falls with the same acceleration (in the absence of other forces such as air resistance). The universality of this behavior is what allows us to interpret the gravity geometrically in general relativity. For example, the Gravity Probe B gyroscopes were made out of quartz, but if they had been made out of something else, it wouldn't have mattered. But if we had access to some FloatyStuff, the geometrical picture of gravity would fail, because the “\(m\)” that described its susceptibility to gravity would be a different “\(m\)” than the one describing its inertia.

The question of the existence or nonexistence of such forms of matter turns out to be related to the question of what kinds of motion are relative. Let's say that alien gangsters land in a flying saucer, kidnap you out of your back yard, konk you on the head, and take you away. When you regain consciousness, you're locked up in a sealed cabin in their spaceship. You pull your keychain out of your pocket and release it, and you observe that it accelerates toward the floor with an acceleration that seems quite a bit smaller than what you're used to on earth, perhaps a third of a gee. There are two possible explanations for this. One is that the aliens have taken you to some other planet, maybe Mars, where the strength of gravity is a third of what we have on earth. The other is that your keychain didn't really accelerate at all: you're still inside the flying saucer, which is accelerating at a third of a gee, so that it was really the deck that accelerated up and hit the keys.

There is absolutely no way to tell which of these two scenarios is actually the case --- unless you happen to have a chunk of FloatyStuff in your other pocket. If you release the FloatyStuff and it hovers above the deck, then you're on another planet and experiencing genuine gravity; your keychain responded to the gravity, but the FloatyStuff didn't. But if you release the FloatyStuff and see it hit the deck, then the flying saucer is accelerating through outer space.

The nonexistence of FloatyStuff in our universe is called the equivalence principle. If the equivalence principle holds, then an acceleration (such as the acceleration of the flying saucer) is always equivalent to a gravitational field, and no observation can ever tell the difference without reference to something external. (And suppose you did have some external reference point --- how would you know whether it was accelerating?)

Example 26: The artificial horizon

The pilot of an airplane cannot always easily tell which way is up. The horizon may not be level simply because the ground has an actual slope, and in any case the horizon may not be visible if the weather is foggy. One might imagine that the problem could be solved simply by hanging a pendulum and observing which way it pointed, but by the equivalence principle the pendulum cannot tell the difference between a gravitational field and an acceleration of the aircraft relative to the ground --- nor can any other accelerometer, such as the pilot's inner ear. For example, when the plane is turning to the right, accelerometers will be tricked into believing that “down” is down and to the left. To get around this problem, airplanes use a device called an artificial horizon, which is essentially a gyroscope. The gyroscope has to be initialized when the plane is known to be oriented in a horizontal plane. No gyroscope is perfect, so over time it will drift. For this reason the instrument also contains an accelerometer, and the gyroscope is always forced into agreement with the accelerometer's average output over the preceding several minutes. If the plane is flown in circles for several minutes, the artificial horizon will be fooled into indicating that the wrong direction is vertical.

Gravitational Doppler shifts and time dilation

An interesting application of the equivalence principle is the explanation of gravitational time dilation. As described on p. 384, experiments show that a clock at the top of a mountain runs faster than one down at its foot.

To calculate this effect, we make use of the fact that the gravitational field in the area around the mountain is equivalent to an acceleration. Suppose we're in an elevator accelerating upward with acceleration \(a\), and we shoot a ray of light from the floor up toward the ceiling, at height \(h\). The time \(\Delta t\) it takes the light ray to get to the ceiling is about \(h/c\), and by the time the light ray reaches the ceiling, the elevator has sped up by \(v=a\Delta t=ah/c\), so we'll see a red-shift in the ray's frequency. Since \(v\) is small compared to \(c\), we don't need to use the fancy Doppler shift equation from subsection 7.2.8; we can just approximate the Doppler shift factor as \(1-v/c\approx 1-ah/c^2\). By the equivalence principle, we should expect that if a ray of light starts out low down and then rises up through a gravitational field \(g\), its frequency will be Doppler shifted by a factor of \(1-gh/c^2\). This effect was observed in a famous experiment carried out by Pound and Rebka in 1959. Gamma-rays were emitted at the bottom of a 22.5-meter tower at Harvard and detected at the top with the Doppler shift predicted by general relativity. (See problem 25.)

In the mountain-valley experiment, the frequency of the clock in the valley therefore appears to be running too slowly by a factor of \(1-gh/c^2\) when it is compared via radio with the clock at the top of the mountain. We conclude that time runs more slowly when one is lower down in a gravitational field, and the slow-down factor between two points is given by \(1-gh/c^2\), where \(h\) is the difference in height.
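To get a feel for the size of the effect, the factor \(1-gh/c^2\) can be evaluated for the 22.5-meter tower of the Pound-Rebka experiment. This is a minimal sketch; the value \(g=9.8\ \text{m}/\text{s}^2\) and the function name are assumptions made for illustration.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)
G_FIELD = 9.8      # assumed gravitational field near the earth's surface, m/s^2
H_TOWER = 22.5     # height of the tower in meters, as quoted in the text

def fractional_shift(g, h):
    """Fractional frequency shift g*h/c^2 for light rising a height h in a field g."""
    return g * h / C ** 2

shift = fractional_shift(G_FIELD, H_TOWER)
seconds_per_day = 86_400.0

print(f"fractional shift g*h/c^2: {shift:.3e}")
print(f"lag of the lower clock per day: {shift * seconds_per_day:.3e} s")
```

The shift is a few parts in \(10^{15}\), which gives a sense of how sensitive the measurement had to be.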

We have built up a picture of light rays interacting with gravity. To confirm that this makes sense, recall that we have already observed in subsection 7.3.3 and in problem 11 on p. 441 that light has momentum. The equivalence principle says that whatever has inertia must also participate in gravitational interactions. Therefore light waves must have weight, and must lose energy when they rise through a gravitational field.

Local flatness

The noneuclidean nature of spacetime produces effects that grow in proportion to the area of the region being considered. Interpreting such effects as evidence of curvature, we see that this connects naturally to the idea that curvature is undetectable from close up. For example, the curvature of the earth's surface is not normally noticeable to us in everyday life. Locally, the earth's surface is flat, and the same is true for spacetime.

Local flatness turns out to be another way of stating the equivalence principle. In a variation on the alien-abduction story, suppose that you regain consciousness aboard the flying saucer and find yourself weightless. If the equivalence principle holds, then you have no way of determining from local observations, inside the saucer, whether you are actually weightless in deep space, or simply free-falling in apparent weightlessness, like the astronauts aboard the International Space Station. That means that locally, we can always adopt a free-falling frame of reference in which there is no gravitational field at all. If there is no gravity, then special relativity is valid, and we can treat our local region of spacetime as being approximately flat.

In figure k, an apple falls out of a tree. Its path is a “straight” line in spacetime, in the same sense that the equator is a “straight” line on the earth's surface.

Inertial frames

In Newtonian mechanics, we have a distinction between inertial and noninertial frames of reference. An inertial frame according to Newton is one that has a constant velocity vector relative to the stars. But what if the stars themselves are accelerating due to a gravitational force from the rest of the galaxy? We could then take the galaxy's center of mass as defining an inertial frame, but what if something else is acting on the galaxy?


l / Wouldn't it be nice if we could define the meaning of a Newtonian inertial frame of reference? Newton makes it sound easy: to define an inertial frame, just find some object that is not accelerating because it is not being acted on by any external forces. But what object would we use? The earth? The “fixed stars?” Our galaxy? Our supercluster of galaxies? All of these are accelerating --- relative to something.

If we had some FloatyStuff, we could resolve the whole question. FloatyStuff isn't affected by gravity, so if we release a sample of it in mid-air, it will continue on a trajectory that defines a perfect Newtonian inertial frame. (We'd better have it on a tether, because otherwise the earth's rotation will carry the earth out from under it.) But if the equivalence principle holds, then Newton's definition of an inertial frame is fundamentally flawed.

There is a different definition of an inertial frame that works better in relativity. A Newtonian inertial frame was defined by an object that isn't subject to any forces, gravitational or otherwise. In general relativity, we instead define an inertial frame using an object that isn't influenced by anything other than gravity. By this definition, a free-falling rock defines an inertial frame, but this book sitting on your desk does not.


m / Matter is lifted out of a Newtonian black hole with a bucket. The dashed line represents the point at which the escape velocity equals the speed of light. For a real, relativistic black hole, this is impossible.


o / In Newtonian contexts, physicists and astronomers had a correct intuition that it's hard for things to collapse gravitationally. This star cluster has been around for about 15 billion years, but it hasn't collapsed into a black hole. If any individual star happens to head toward the center, conservation of angular momentum tends to cause it to swing past and fly back out. The Penrose singularity theorem tells us that this Newtonian intuition is wrong when applied to an object that has collapsed past a certain point.

7.4.3 Black holes

The observations described so far showed only small effects from curvature. To get a big effect, we should look at regions of space in which there are strong gravitational fields. The prime example is a black hole. The best studied examples are two objects in our own galaxy: Cygnus X-1, which is believed to be a black hole with about ten times the mass of our sun, and Sagittarius A*, an object near the center of our galaxy with about four million solar masses.

Although a black hole is a relativistic object, we can gain some insight into how it works by applying Newtonian physics. A spherical body of mass \(M\) has an escape velocity \(v=\sqrt{2GM/r}\), which is the minimum velocity that we would need to give to a projectile shot from a distance \(r\) so that it would never fall back down. If \(r\) is small enough, the escape velocity will be greater than \(c\), so that even a ray of light can never escape.

We can now make an educated guess as to what this means without having to study all the mathematics of general relativity. In relativity, \(c\) isn't really the speed of light, it's really to be thought of as a restriction on how fast cause and effect can propagate through space. This suggests the correct interpretation, which is that for an object compact enough to be a black hole, there is no way for an event at a distance closer than \(r\) to have an effect on an event far away. There is an invisible, spherical boundary with radius \(r\), called the event horizon, and the region within that boundary is cut off from the rest of the universe in terms of cause and effect. If you wanted to explore that region, you could drop into it while wearing a space-suit --- but it would be a one-way trip, because you could never get back out to report on what you had seen.
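To get a feeling for the sizes involved, we can set the Newtonian escape velocity equal to \(c\) and solve for \(r\), giving \(r=2GM/c^2\). The sketch below evaluates this for the two objects mentioned above. The rounded values of the constants, the function names, and the use of exactly 10 and 4 million solar masses are assumptions made for illustration. The estimate is a Newtonian one, although it happens to agree with the radius usually quoted for a nonrotating relativistic black hole.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
c = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # mass of the sun, kg (rounded)

def escape_velocity(mass, r):
    """Newtonian escape velocity from distance r from a spherical mass."""
    return math.sqrt(2 * G * mass / r)

def newtonian_horizon_radius(mass):
    """Distance at which the Newtonian escape velocity equals c."""
    return 2 * G * mass / c**2

# Approximate masses taken from the text, in solar masses.
for name, solar_masses in (("Cygnus X-1", 10), ("Sagittarius A*", 4.0e6)):
    r = newtonian_horizon_radius(solar_masses * M_SUN)
    print(f"{name}: r = {r / 1000:.3g} km")
```

The results are roughly 30 km for Cygnus X-1 and roughly 12 million km for Sagittarius A*, the latter being several times the diameter of the sun.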

In the Newtonian description of a black hole, matter could be lifted out of a black hole, m. Would this be possible with a real-world black hole, which is relativistic rather than Newtonian? No, because the bucket is causally separated from the outside universe. No rope would be strong enough for this job (problem 12, p. 442).


n / The equivalence principle tells us that spacetime locally has the same structure as in special relativity, so we can draw the familiar parallelogram of \(x-t\) coordinates at each point near the black hole. Superimposed on each little grid is a pair of lines representing motion at the speed of light in both directions, inward and outward. Because spacetime is curved, these lines do not appear to be at 45-degree angles, but to an observer in that region, they would appear to be. When light rays are emitted inward and outward from a point outside the event horizon, one escapes and one plunges into the black hole. On this diagram, they look like they are decelerating and accelerating, but local observers comparing them to their own coordinate grids would always see them as moving at exactly \(c\). When rays are emitted from a point inside the event horizon, neither escapes; the distortion is so severe that “outward” is really inward.

One misleading aspect of the Newtonian analysis is that it encourages us to imagine that a light ray trying to escape from a black hole will slow down, stop, and then fall back in. This can't be right, because we know that any observer who sees a light ray flying by always measures its speed to be \(c\). This was true in special relativity, and by the equivalence principle we can be assured that the same is true locally in general relativity. Figure n shows what would really happen.

Although the light rays in figure n don't speed up or slow down, they do experience gravitational Doppler shifts. If a light ray is emitted from just above the event horizon, then it will escape to an infinite distance, but it will suffer an extreme Doppler shift toward low frequencies. A distant observer also has the option of interpreting this as a gravitational time dilation that greatly lowers the frequency of the oscillating electric charges that produced the ray. If the point of emission is made closer and closer to the horizon, the frequency and energy measured by a distant observer approach zero, making the ray impossible to observe.
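The approach to zero frequency can be made quantitative with a result that is not derived in this section and is quoted here as an assumption: for a nonrotating black hole, light emitted by a source hovering at rest at radius \(r\) is received far away with its frequency multiplied by \(\sqrt{1-r_h/r}\), where \(r_h\) is the radius of the event horizon. The following sketch, with a hypothetical function name, tabulates this factor for emission points closer and closer to the horizon.

```python
import math

def redshift_factor(r_over_rh):
    """Received-to-emitted frequency ratio for a source at rest at radius r.

    r_over_rh is r divided by the horizon radius. The formula is the
    standard one for a nonrotating black hole, assumed rather than derived.
    """
    if r_over_rh <= 1.0:
        raise ValueError("the emitter must be outside the event horizon")
    return math.sqrt(1.0 - 1.0 / r_over_rh)

for x in (10.0, 2.0, 1.1, 1.01, 1.0001):
    print(f"r/r_h = {x:<7}  f_far/f_emitted = {redshift_factor(x):.4f}")
```

At ten horizon radii the shift is only about 5%, while at 1.0001 horizon radii the received frequency is about 1% of the emitted one.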

Information paradox

Black holes have some disturbing implications for the kind of universe that in the Age of the Enlightenment was imagined to have been set in motion initially and then left to run forever like clockwork.

Newton's laws have built into them the implicit assumption that omniscience is possible, at least in principle. For example, Newton's definition of an inertial frame of reference leads to an infinite regress, as described on p. 431. For Newton this isn't a problem, because in principle an omniscient observer can know the location of every mass in the universe. In this conception of the cosmos, there are no theoretical limits on human knowledge, only practical ones; if we could gather sufficiently precise data about the state of the universe at one time, and if we could carry out all the calculations to extrapolate into the future, then we could know everything that would ever happen. (See the famous quote by Laplace on p. 16.)

But the existence of event horizons surrounding black holes makes it impossible for any observer to be omniscient; only an observer inside a particular horizon can see what's going on inside that horizon.

Furthermore, a black hole has at its center an infinitely dense point, called a singularity, containing all its mass, and this implies that information can be destroyed and made inaccessible to any observer at all. For example, suppose that astronaut Alice goes on a suicide mission to explore a black hole, free-falling in through the event horizon. She has a certain amount of time to collect data and satisfy her intellectual curiosity, but then she impacts the singularity and is compacted into a mathematical point. Now astronaut Betty decides that she will never be satisfied unless the secrets revealed to Alice are known to her as well --- and besides, she was Alice's best friend, and she wants to know whether Alice had any last words. Betty can jump through the horizon, but she can never know Alice's last words, nor can any other observer who jumps in after Alice does.

This destruction of information is known as the black hole information paradox, and it's referred to as a paradox because quantum physics (ch. 13) has built into its DNA the requirement that information is never lost in this sense.


Around 1960, as black holes and their strange properties began to be better understood and more widely discussed, many physicists who found these issues distressing comforted themselves with the belief that black holes would never really form from realistic initial conditions, such as the collapse of a massive star. Their skepticism was not entirely unreasonable, since it is usually very hard in astronomy to hit a gravitating target, the reason being that conservation of angular momentum tends to make the projectile swing past. (See problem 13 on p. 289 for a quantitative analysis.) For example, if we wanted to drop a space probe into the sun, we would have to cancel its sideways orbital motion extremely precisely, so that it would drop almost exactly straight in. Once a star started to collapse, the theory went, and became relatively compact, it would be such a small target that further infalling material would be unlikely to hit it, and the process of collapse would halt. According to this point of view, theorists who had calculated the collapse of a star into a black hole had been oversimplifying by assuming a star that was initially perfectly spherical and nonrotating. Remove the unrealistically perfect symmetry of the initial conditions, and a black hole would never actually form.

But Roger Penrose proved in 1964 that this was wrong. In fact, once an object collapses to a certain density, the Penrose singularity theorem guarantees mathematically that it will collapse further until a singularity is formed, and this singularity is surrounded by an event horizon. Since the brightness of an object like Sagittarius A* is far too low to be explained unless it has an event horizon (the interstellar gas flowing into it would glow due to frictional heating), we can be certain that there really is a singularity at its core.


p / An expanding universe with positive spatial curvature can be imagined as a balloon being blown up. Every galaxy's distance from every other galaxy increases, but no galaxy is the center of the expansion.


q / The angular scale of fluctuations in the cosmic microwave background can be used to infer the curvature of the universe.

7.4.4 Cosmology

The Big Bang

Subsection 6.1.5 presented the evidence, discovered by Hubble, that the universe is expanding in the aftermath of the Big Bang: when we observe the light from distant galaxies, it is always Doppler-shifted toward the red end of the spectrum, indicating that no matter what direction we look in the sky, everything is rushing away from us. This seems to go against the modern attitude, originated by Copernicus, that we and our planet do not occupy a special place in the universe. Why is everything rushing away from our planet in particular? But general relativity shows that this anti-Copernican conclusion is wrong. General relativity describes space not as a rigidly defined background but as something that can curve and stretch, like a sheet of rubber. We imagine all the galaxies as existing on the surface of such a sheet, which then expands uniformly. The space between the galaxies (but not the galaxies themselves) grows at a steady rate, so that any observer, inhabiting any galaxy, will see every other galaxy as receding. There is therefore no privileged or special location in the universe.
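The claim that every observer sees the same pattern of recession can be illustrated with a toy model. This is a nonrelativistic, one-dimensional sketch, and the galaxy positions, the expansion rate, and the function names are all hypothetical. Each galaxy sits on a line that is being stretched uniformly, so its velocity is proportional to its position. We then ask what each galaxy's inhabitants observe when they measure distances and velocities relative to themselves.

```python
H = 0.07  # hypothetical expansion rate, in arbitrary units of 1/time

# Hypothetical galaxy positions along a line, in arbitrary distance units.
positions = [-40.0, -15.0, 0.0, 10.0, 25.0, 60.0]

def velocities(positions, rate):
    """Uniform stretching: each point's velocity is proportional to its position."""
    return [rate * x for x in positions]

def view_from(observer, positions, rate):
    """(distance, recession velocity) of every other galaxy, relative to the observer."""
    v = velocities(positions, rate)
    return [(x - positions[observer], vi - v[observer])
            for i, (x, vi) in enumerate(zip(positions, v)) if i != observer]

for observer in range(len(positions)):
    ratios = [vel / dist for dist, vel in view_from(observer, positions, H)]
    print(f"observer at x = {positions[observer]:6.1f}: "
          f"velocity/distance = {min(ratios):.3f} to {max(ratios):.3f}")
```

Every observer finds the same ratio of recession velocity to distance, equal to the expansion rate, so each one could equally well imagine being at the center. The galaxy at \(x=0\) is not special.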

We might think that there would be another kind of special place, which would be the one at which the Big Bang happened. Maybe someone has put a brass plaque there? But general relativity doesn't describe the Big Bang as an explosion that suddenly occurred in a preexisting background of time and space. According to general relativity, space itself came into existence at the Big Bang, and the hot, dense matter of the early universe was uniformly distributed everywhere. The Big Bang happened everywhere at once.

Observations show that the universe is very uniform on large scales, and for ease of calculation, the first physical models of the expanding universe were constructed with perfect uniformity. In these models, the Big Bang was a singularity. This singularity can't even be included as an event in spacetime, so that time itself only exists after the Big Bang. A Big Bang singularity also creates an even more acute version of the black hole information paradox. Whereas matter and information disappear into a black hole singularity, stuff pops out of a Big Bang singularity, and there is no physical principle that could predict what it would be.

As with black holes, there was considerable skepticism about whether the existence of an initial singularity in these models was an artifact of the unrealistically perfect uniformity assumed in the models. Perhaps in the real universe, extrapolation of all the paths of the galaxies backward in time would show them missing each other by millions of light-years. But in 1972 Stephen Hawking proved a variant on the Penrose singularity theorem that applied to Big Bang singularities. By the Hawking singularity theorem, the level of uniformity we see in the present-day universe is more than sufficient to prove that a Big Bang singularity must have existed.

The cosmic censorship hypothesis

It might not be too much of a philosophical jolt to imagine that information was spontaneously created in the Big Bang. Setting up the initial conditions of the entire universe is traditionally the prerogative of God, not the laws of physics. But there is nothing fundamental in general relativity that forbids the existence of other singularities that act like the Big Bang, being information producers rather than information consumers. As John Earman of the University of Pittsburgh puts it, anything could pop out of such a singularity, including green slime or your lost socks. This would eliminate any hope of finding a universal set of laws of physics that would be able to make a prediction given any initial situation.

That would be such a devastating defeat for the enterprise of physics that in 1969 Penrose proposed an alternative, humorously named the “cosmic censorship hypothesis,” which states that every singularity in our universe, other than the Big Bang, is hidden behind an event horizon. Therefore if green slime spontaneously pops out of one, there is limited impact on the predictive ability of physics, since the slime can never have any causal effect on the outside world. A singularity that is not modestly cloaked behind an event horizon is referred to as a naked singularity. Nobody has yet been able to prove the cosmic censorship hypothesis.

The advent of high-precision cosmology

We expect that if there is matter in the universe, it should have gravitational fields, and in the rubber-sheet analogy this should be represented as a curvature of the sheet. Instead of a flat sheet, we can have a spherical balloon, so that cosmological expansion is like inflating it with more and more air. It is also possible to have negative curvature, as in figure e on p. 427. All three of these are valid, possible cosmologies according to relativity. The positive-curvature type happens if the average density of matter in the universe is above a certain critical level, the negative-curvature one if the density is below that value.

To find out which type of universe we inhabit, we could try to take a survey of the matter in the universe and determine its average density. Historically, it has been very difficult to do this, even to within an order of magnitude. Most of the matter in the universe probably doesn't emit light, making it difficult to detect. Astronomical distance scales are also very poorly calibrated against absolute units such as the SI.

Instead, we measure the universe's curvature, and infer the density of matter from that. It turns out that we can do this by observing the cosmic microwave background (CMB) radiation, which is the light left over from the brightly glowing early universe, which was dense and hot. As the universe has expanded, light waves that were in flight have expanded their wavelengths along with it. This afterglow of the big bang was originally visible light, but after billions of years of expansion it has shifted into the microwave radio part of the electromagnetic spectrum. The CMB is not perfectly uniform, and this turns out to give us a way to measure the universe's curvature. Since the CMB was emitted when the universe was only about 400,000 years old, any vibrations or disturbances in the hot hydrogen and helium gas that filled space in that era would only have had time to travel a certain distance, limited by the speed of sound. We therefore expect that no feature in the CMB should be bigger than a certain known size. In a universe with negative spatial curvature, the sum of the interior angles of a triangle is less than the Euclidean value of 180 degrees. Therefore if we observe a variation in the CMB over some angle, the distance between two points on the sky is actually greater than would have been inferred from Euclidean geometry. The opposite happens if the curvature is positive.
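The effect of curvature on apparent sizes can be sketched using the standard small-angle formulas for spaces of constant curvature, which are assumed here rather than derived in the text. If two points are seen an angle \(\theta\) apart at a distance \(d\), their separation is \(\theta d\) in flat space, \(\theta R\sin(d/R)\) for positive curvature, and \(\theta R\sinh(d/R)\) for negative curvature, where \(R\) is the radius of curvature. The function name and the numbers below are hypothetical.

```python
import math

def transverse_separation(angle, distance, radius, curvature):
    """Small-angle separation of two points seen `angle` radians apart.

    curvature is +1 (positive), 0 (flat), or -1 (negative); radius is the
    radius of curvature, ignored in the flat case.
    """
    if curvature > 0:
        return angle * radius * math.sin(distance / radius)
    if curvature < 0:
        return angle * radius * math.sinh(distance / radius)
    return angle * distance

angle, distance, radius = 0.01, 1.0, 2.0  # arbitrary illustrative values
for label, k in (("negative", -1), ("flat", 0), ("positive", +1)):
    s = transverse_separation(angle, distance, radius, k)
    print(f"{label:8s} curvature: separation = {s:.5f}")
```

For a given observed angle, the negative-curvature separation comes out larger than the Euclidean one and the positive-curvature separation smaller, in agreement with the argument above. Turning this around, a feature of known physical size subtends a smaller angle in a negatively curved universe and a larger one in a positively curved universe.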

This observation was done by the 1989-1993 COBE probe, and its 2001-2009 successor, the Wilkinson Microwave Anisotropy Probe. The result is that the angular sizes are almost exactly equal to what they should be according to Euclidean geometry. We therefore infer that the universe is very close to having zero average spatial curvature on the cosmological scale, and this tells us that its average density must be within about 0.5% of the critical value. The years since COBE and WMAP mark the advent of an era in which cosmology has gone from being a field of estimates and rough guesses to a high-precision science.

If one is inclined to be skeptical about the seemingly precise answers to the mysteries of the cosmos, there are consistency checks that can be carried out. In the bad old days of low-precision cosmology, estimates of the age of the universe ranged from 10 billion to 20 billion years, and the low end was inconsistent with the age of the oldest star clusters. This was believed to be a problem either for observational cosmology or for the astrophysical models used to estimate the ages of the clusters: “You can't be older than your ma.” Current data have shown that the low estimates of the age were incorrect, so consistency is restored. (The best figure for the age of the universe is currently \(13.8\pm0.1\) billion years.)

Dark energy and dark matter

Not everything works out so smoothly, however. One surprise is that the universe's expansion is not currently slowing down, as had been expected due to the gravitational attraction of all the matter in it. Instead, it is currently speeding up. This is attributed to a variable in Einstein's equations, long assumed to be zero, which represents a universal gravitational repulsion of space itself, occurring even when there is no matter present. The current name for this is “dark energy,” although the fancy name is just a label for our ignorance about what causes it.

Another surprise comes from attempts to model the formation of the elements during the era shortly after the Big Bang, before the formation of the first stars. The observed relative abundances of hydrogen, helium, and deuterium (\(^2\text{H}\)) cannot be reconciled with the density of low-velocity matter inferred from the observational data. If the inferred mass density were entirely due to normal matter (i.e., matter whose mass consisted mostly of protons and neutrons), then nuclear reactions in the dense early universe should have proceeded relatively efficiently, leading to a much higher ratio of helium to hydrogen, and a much lower abundance of deuterium. The conclusion is that most of the matter in the universe must be made of an unknown type of exotic matter, known as “dark matter.” We are in the ironic position of knowing that precisely 96% of the universe is something other than atoms, but knowing nothing about what that something is. As of 2013, several experiments have attempted the direct detection of dark matter particles. These are carried out at the bottom of mineshafts to eliminate background radiation. Early claims of success appear to have been statistical flukes, and the most sensitive experiments have not detected anything.

Homework Problems


a / Problem 7.


b / Problem 18.


c / Problem 25b. Redrawn from Van Baak, Physics Today 60 (2007) 16.


1. The figure illustrates a Lorentz transformation using the conventions employed in section 7.2. For simplicity, the transformation chosen is one that lengthens one diagonal by a factor of 2. Since Lorentz transformations preserve area, the other diagonal is shortened by a factor of 2. Let the original frame of reference, depicted with the square, be A, and the new one B. (a) By measuring with a ruler on the figure, show that the velocity of frame B relative to frame A is \(0.6c\). (b) Print out a copy of the page. With a ruler, draw a third parallelogram that represents a second successive Lorentz transformation, one that lengthens the long diagonal by another factor of 2. Call this third frame C. Use measurements with a ruler to determine frame C's velocity relative to frame A. Does it equal double the velocity found in part a? Explain why it should be expected to turn out the way it does.(answer check available at


2. Astronauts in three different spaceships are communicating with each other. Those aboard ships A and B agree on the rate at which time is passing, but they disagree with the ones on ship C.
(a) Alice is aboard ship A. How does she describe the motion of her own ship, in its frame of reference?
(b) Describe the motion of the other two ships according to Alice.
(c) Give the description according to Betty, whose frame of reference is ship B.
(d) Do the same for Cathy, aboard ship C.

3. What happens in the equation for \(\gamma\) when you put in a negative number for \(v\)? Explain what this means physically, and why it makes sense.

4. The Voyager 1 space probe, launched in 1977, is moving faster relative to the earth than any other human-made object, at 17,000 meters per second.
(a) Calculate the probe's \(\gamma\).
(b) Over the course of one year on earth, slightly less than one year passes on the probe. How much less? (There are 31 million seconds in a year.)(answer check available at

5. In example 2 on page 391, I remarked that accelerating a macroscopic (i.e., not microscopic) object to close to the speed of light would require an unreasonable amount of energy. Suppose that the starship Enterprise from Star Trek has a mass of \(8.0\times10^7\) kg, about the same as the Queen Elizabeth 2. Compute the kinetic energy it would have to have if it was moving at half the speed of light. Compare with the total energy content of the world's nuclear arsenals, which is about \(10^{21}\) J.(answer check available at

6. The earth is orbiting the sun, and therefore is contracted relativistically in the direction of its motion. Compute the amount by which its diameter shrinks in this direction.(answer check available at

7. In this homework problem, you'll fill in the steps of the algebra required in order to find the equation for \(\gamma\) on page 389. To keep the algebra simple, let the time \(t\) in figure k equal 1, as suggested in the figure accompanying this homework problem. The original square then has an area of 1, and the transformed parallelogram must also have an area of 1. (a) Prove that point P is at \(x=v\gamma\), so that its \((t,x)\) coordinates are \((\gamma,v\gamma)\). (b) Find the \((t,x)\) coordinates of point Q. (c) Find the length of the short diagonal connecting P and Q. (d) Average the coordinates of P and Q to find the coordinates of the midpoint C of the parallelogram, and then find distance OC. (e) Find the area of the parallelogram by computing twice the area of triangle PQO. [Hint: You can take PQ to be the base of the triangle.] (f) Set this area equal to 1 and solve for \(\gamma\) to prove \(\gamma=1/\sqrt{1-v^2}\).(answer check available at

8. (a) A free neutron (as opposed to a neutron bound into an atomic nucleus) is unstable, and undergoes beta decay (which you may want to review). The masses of the particles involved are as follows:

neutron \(1.67495\times10^{-27}\) kg
proton \(1.67265\times10^{-27}\) kg
electron \(0.00091\times10^{-27}\) kg
antineutrino \(<10^{-35}\) kg

Find the energy released in the decay of a free neutron. (answer check available at
(b) Neutrons and protons make up essentially all of the mass of the ordinary matter around us. We observe that the universe around us has no free neutrons, but lots of free protons (the nuclei of hydrogen, which is the element that 90% of the universe is made of). We find neutrons only inside nuclei along with other neutrons and protons, not on their own.

If there are processes that can convert neutrons into protons, we might imagine that there could also be proton-to-neutron conversions, and indeed such a process does occur sometimes in nuclei that contain both neutrons and protons: a proton can decay into a neutron, a positron, and a neutrino. A positron is a particle with the same properties as an electron, except that its electrical charge is positive (see chapter 7). A neutrino, like an antineutrino, has negligible mass.

Although such a process can occur within a nucleus, explain why it cannot happen to a free proton. (If it could, hydrogen would be radioactive, and you wouldn't exist!)

9. (a) Find a relativistic equation for the velocity of an object in terms of its mass and momentum (eliminating \(\gamma\)).(answer check available at
(b) Show that your result is approximately the same as the classical value, \(p/m\), at low velocities.
(c) Show that very large momenta result in speeds close to the speed of light.

10. (a) Show that for \(v=(3/5)c\), \(\gamma\) comes out to be a simple fraction.
(b) Find another value of \(v\) for which \(\gamma\) is a simple fraction.

11. An object moving at a speed very close to the speed of light is referred to as ultrarelativistic. Ordinarily (luckily) the only ultrarelativistic objects in our universe are subatomic particles, such as cosmic rays or particles that have been accelerated in a particle accelerator.
(a) What kind of number is \(\gamma\) for an ultrarelativistic particle?
(b) Repeat example 19 on page 419, but instead of very low, nonrelativistic speeds, consider ultrarelativistic speeds.
(c) Find an equation for the ratio \(\mathcal{E}/p\) of mass-energy to momentum. The speed may be relativistic, but don't assume that it's ultrarelativistic.(answer check available at
(d) Simplify your answer to part c for the case where the speed is ultrarelativistic.(answer check available at
(e) We can think of a beam of light as an ultrarelativistic object --- it certainly moves at a speed that's sufficiently close to the speed of light! Suppose you turn on a one-watt flashlight, leave it on for one second, and then turn it off. Compute the momentum of the recoiling flashlight, in units of \(\text{kg}\!\cdot\!\text{m}/\text{s}\).(answer check available at
(f) Discuss how your answer in part e relates to the correspondence principle.

12. As discussed in chapter 6, the speed at which a disturbance travels along a string under tension is given by \(v=\sqrt{T/\mu}\), where \(\mu\) is the mass per unit length, and \(T\) is the tension.
(a) Suppose a string has a density \(\rho\), and a cross-sectional area \(A\). Find an expression for the maximum tension that could possibly exist in the string without producing \(v>c\), which is impossible according to relativity. Express your answer in terms of \(\rho\), \(A\), and \(c\). The interpretation is that relativity puts a limit on how strong any material can be.(answer check available at
(b) Every substance has a tensile strength, defined as the force per unit area required to break it by pulling it apart. The tensile strength is measured in units of \(\text{N}/\text{m}^2\), which is the same as the pascal (Pa), the mks unit of pressure. Make a numerical estimate of the maximum tensile strength allowed by relativity in the case where the rope is made out of ordinary matter, with a density on the same order of magnitude as that of water. (For comparison, kevlar has a tensile strength of about \(4\times10^9\) Pa, and there is speculation that fibers made from carbon nanotubes could have values as high as \(6\times10^{10}\) Pa.)(answer check available at
(c) A black hole is a star that has collapsed and become very dense, so that its gravity is too strong for anything ever to escape from it. For instance, the escape velocity from a black hole is greater than \(c\), so a projectile can't be shot out of it. Many people, when they hear this description of a black hole in terms of an escape velocity greater than \(c\), wonder why it still wouldn't be possible to extract an object from a black hole by other means than launching it out as a projectile. For example, suppose we lower an astronaut into a black hole on a rope, and then pull him back out again. Why might this not work?

13. (a) A charged particle is surrounded by a uniform electric field. Starting from rest, it is accelerated by the field to speed \(v\) after traveling a distance \(d\). Now it is allowed to continue for a further distance \(3d\), for a total displacement from the start of \(4d\). What speed will it reach, assuming classical physics?
(b) Find the relativistic result for the case of \(v=c/2\).

14. Problem 14 has been deleted.

15. Expand the equation \(K = m(\gamma-1)\) in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the nonrelativistic expression for kinetic energy.

16. Expand the relativistic equation for momentum in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the classical expression.

17. (solution in the pdf version of the book) As promised in subsection 7.2.8, this problem will lead you through the steps of finding an equation for the combination of velocities in relativity, generalizing the numerical result found in problem 1. Suppose that A moves relative to B at velocity \(u\), and B relative to C at \(v\). We want to find A's velocity \(w\) relative to C, in terms of \(u\) and \(v\). Suppose that A emits light with a certain frequency. This will be observed by B with a Doppler shift \(D(u)\). C detects a further shift of \(D(v)\) relative to B. We therefore expect the Doppler shifts to multiply, \(D(w)=D(u)D(v)\), and this provides an implicit rule for determining \(w\) if \(u\) and \(v\) are known. (a) Using the expression for \(D\) given in section 7.2.8, write down an equation relating \(u\), \(v\), and \(w\). (b) Solve for \(w\) in terms of \(u\) and \(v\). (c) Show that your answer to part b satisfies the correspondence principle.

18. The figure shows seven four-vectors, represented in a two-dimensional plot of \(x\) versus \(t\). All the vectors have \(y\) and \(z\) components that are zero. Which of these vectors are congruent to others, i.e., which represent spacetime intervals that are equal to one another?

19. Four-vectors can be timelike, lightlike, or spacelike. What can you say about the inherent properties of particles whose momentum four-vectors fall in these various categories?

20. The following are the three most common ways in which gamma rays interact with matter:

Photoelectric effect: The gamma ray hits an electron, is annihilated, and gives all of its energy to the electron.

Compton scattering: The gamma ray bounces off of an electron, exiting in some direction with some amount of energy.

Pair production: The gamma ray is annihilated, creating an electron and a positron.

Example 24 on p. 421 shows that pair production can't occur in a vacuum due to conservation of the energy-momentum four-vector. What about the other two processes? Can the photoelectric effect occur without the presence of some third particle such as an atomic nucleus? Can Compton scattering happen without a third particle?

21. Expand the relativistic equation for the longitudinal Doppler shift of light \(D(v)\) in a Taylor series, and find the first two nonvanishing terms. Show that these two terms agree with the nonrelativistic expression, so that any relativistic effect is of higher order in \(v\).

22. Prove, as claimed in the caption of figure a on p. 425, that \(S-180°=4(s-180°)\), where \(S\) is the sum of the angles of the large equilateral triangle and \(s\) is the corresponding sum for one of the four small ones. (solution in the pdf version of the book)

23. If a two-dimensional being lived on the surface of a cone, would it say that its space was curved, or not?

24. (a) Verify that the expression \(1-gh/c^2\) for the gravitational Doppler shift and gravitational time dilation has units that make sense. (b) Does this expression satisfy the correspondence principle?

25. (a) Calculate the Doppler shift to be expected in the Pound-Rebka experiment described on p. 430. (b) In the 1978 Iijima mountain-valley experiment (p. 384), analysis was complicated by the clock's sensitivity to pressure, humidity, and temperature. A cleaner version of the experiment was done in 2005 by hobbyist Tom Van Baak. He put his kids and three of his atomic clocks in a minivan and drove from Bellevue, Washington to a lodge on Mount Rainier, 1340 meters higher in elevation. At home, he compared the clocks to others that had stayed at his house. Verify that the effect shown in the graph is as predicted by general relativity.

26. The International Space Station orbits at an altitude of about 350 km and a speed of about 8000 m/s relative to the ground. Compare the gravitational and kinematic time dilations. Over all, does time run faster on the ISS than on the ground, or more slowly?
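
One rough way to set up the comparison is sketched below. It assumes the uniform-field expression \(gh/c^2\) from problem 24 for the gravitational effect (only approximate at 350 km, since \(g\) is not constant over that height) and the low-velocity approximation \(v^2/2c^2\) for the kinematic effect. The values of \(g\) and \(c\) and the function names are assumptions, not data from the text.

```python
G_SURFACE = 9.8   # m/s^2, assumed surface value of g
C = 3.0e8         # m/s, rounded speed of light
H_ISS = 350e3     # m, altitude quoted in the problem
V_ISS = 8000.0    # m/s, speed quoted in the problem

def gravitational_shift(g, h, c=C):
    """Fractional speed-up of a clock raised by height h (uniform-field approx.)."""
    return g * h / c**2

def kinematic_shift(v, c=C):
    """Fractional slow-down of a clock moving at speed v (valid for v << c)."""
    return 0.5 * (v / c) ** 2

grav = gravitational_shift(G_SURFACE, H_ISS)
kin = kinematic_shift(V_ISS)
net = grav - kin  # positive would mean the ISS clock runs fast overall

print(f"gravitational: +{grav:.3e}")
print(f"kinematic:     -{kin:.3e}")
print(f"net fractional rate difference: {net:+.3e}")
print(f"accumulated per day: {net * 86400 * 1e6:+.1f} microseconds")
```

The sign of the net value answers the question of which clock runs faster, within the limits of these approximations.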

27. Section 7.4.3 presented a Newtonian estimate of how compact an object would have to be in order to be a black hole. Although this estimate is not really right, it turns out to give the right answer to within about a factor of 2. To roughly what size would the earth have to be compressed in order to become a black hole?

28. Clock A sits on a desk. Clock B is tossed up in the air from the same height as the desk and then comes back down. Compare the elapsed times. (solution in the pdf version of the book)

29. The angular defect \(d\) of a triangle (measured in radians) is defined as \(s-\pi\), where \(s\) is the sum of the interior angles. The angular defect is proportional to the area \(A\) of the triangle. Consider the geometry measured by a two-dimensional being who lives on the surface of a sphere of radius \(R\). First find some triangle on the sphere whose area and angular defect are easy to calculate. Then determine the general equation for \(d\) in terms of \(A\) and \(R\). (answer check available at


Exercise A: The Michelson-Morley Experiment


In this exercise you will analyze the Michelson-Morley experiment, and find what the results should have been according to Galilean relativity and Einstein's theory of relativity. A beam of light coming from the west (not shown) comes to the half-silvered mirror A. Half the light goes through to the east, is reflected by mirror C, and comes back to A. The other half is reflected north by A, is reflected by B, and also comes back to A. When the beams reunite at A, part of each ends up going south, and these parts interfere with one another. If the time taken for a round trip differs by, for example, half the period of the wave, there will be destructive interference.

The point of the experiment was to search for a difference in the experimental results between the daytime, when the laboratory was moving west relative to the sun, and the nighttime, when the laboratory was moving east relative to the sun. Galilean relativity and Einstein's theory of relativity make different predictions about the results. According to Galilean relativity, the speed of light cannot be the same in all reference frames, so it is assumed that there is one special reference frame, perhaps the sun's, in which light travels at the same speed in all directions; in other frames, Galilean relativity predicts that the speed of light will be different in different directions, e.g., slower if the observer is chasing a beam of light. There are four different ways to analyze the experiment:

Groups 1-4 work in the sun's frame of reference according to Galilean relativity.

Group 1 finds time AC. Group 2 finds time CA. Group 3 finds time AB. Group 4 finds time BA.

Groups 5 and 6 transform the lab-frame results into the sun's frame according to Einstein's theory.

Group 5 transforms the \(x\) and \(t\) when ray ACA gets back to A into the sun's frame of reference, and group 6 does the same for ray ABA.


Michelson and Morley found no change in the interference of the waves between day and night. Which version of relativity is consistent with their results?

What does each theory predict if \(v\) approaches \(c\)?

What if the arms are not exactly equal in length?

Does it matter if the “special” frame is some frame other than the sun's?
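
The Galilean half of the analysis (groups 1-4) can be sketched numerically as follows. The code works in the sun's frame, takes light to move at \(c\) in that frame, and takes C to lie ahead of A along the lab's direction of motion. The arm length of 11 m and the lab speed of \(3.0\times10^4\) m/s (roughly the earth's orbital speed) are illustrative assumptions, as are the function names. The relativistic analysis of groups 5 and 6 is not covered here.

```python
import math

C = 3.0e8  # m/s, rounded speed of light

def galilean_leg_times(L, v, c=C):
    """Light travel times for each leg, in the sun's frame, per Galilean relativity."""
    t_ac = L / (c - v)                    # light chases mirror C, which recedes at v
    t_ca = L / (c + v)                    # mirror A approaches the returning light
    t_ab = L / math.sqrt(c**2 - v**2)     # diagonal path to the transverse mirror B
    t_ba = t_ab                           # return leg is symmetric
    return t_ac, t_ca, t_ab, t_ba

def round_trip_difference(L, v, c=C):
    """Time for ray ACA minus time for ray ABA."""
    t_ac, t_ca, t_ab, t_ba = galilean_leg_times(L, v, c)
    return (t_ac + t_ca) - (t_ab + t_ba)

L_ARM = 11.0   # m, assumed arm length
V_LAB = 3.0e4  # m/s, assumed lab speed relative to the sun

dt = round_trip_difference(L_ARM, V_LAB)
print(f"Galilean round-trip time difference: {dt:.3e} s")
print(f"leading-order estimate L v^2 / c^3:  {L_ARM * V_LAB**2 / C**3:.3e} s")
```

Comparing the difference to the period of visible light (on the order of \(10^{-15}\) s) gives a sense of how large a shift in the interference the Galilean analysis would predict.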

Exercise B: Sports in Slowlightland

In Slowlightland, the speed of light is 20 mi/hr \(\approx\) 32 km/hr \(\approx\) 9 m/s. Think of an example of how relativistic effects would work in sports. Things can get very complex very quickly, so try to think of a simple example that focuses on just one of the following effects:

- relativistic momentum

- relativistic kinetic energy

- relativistic addition of velocities

- time dilation and length contraction

- Doppler shifts of light

- equivalence of mass and energy

- time it takes for light to get to an athlete's eye

- deflection of light rays by gravity
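
To get a feel for how strong the effects in the list above would be, the sketch below computes the factor \(\gamma\) for a few athletes in Slowlightland, using the value \(c\approx9\) m/s given above. The athletes' speeds and the function name are made-up illustrative values.

```python
import math

C_SLOW = 9.0  # m/s, speed of light in Slowlightland

def gamma(v, c=C_SLOW):
    """Lorentz factor for speed v, given the speed of light c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

# Hypothetical speeds, in m/s, chosen only for illustration.
athletes = (("walker", 1.5), ("jogger", 4.0), ("sprinter", 8.0))

for label, v in athletes:
    g = gamma(v)
    print(f"{label:8s} v = {v:3.1f} m/s   gamma = {g:.3f}   "
          f"a 1 m object contracts to {1.0 / g:.3f} m")
```

Time dilation and length contraction both scale with \(\gamma\), so an example built around the fastest athlete will show the most dramatic effects.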

Exercise C: Events and Spacetime




Exercise D: Misconceptions about Relativity

The following is a list of common misconceptions about relativity. The class will be split up into random groups, and each group will cooperate on developing an explanation of the misconception, and then the groups will present their explanations to the class. There may be multiple rounds, with students assigned to different randomly chosen groups in successive rounds.

  1. How can light have momentum if it has zero mass?
  2. What does the world look like in a frame of reference moving at \(c\)?
  3. Alice observes Betty coming toward her from the left at \(c/2\), and Carol from the right at \(c/2\). Therefore Betty is moving at the speed of light relative to Carol.
  4. Are relativistic effects such as length contraction and time dilation real, or do they just seem to be that way?
  5. Special relativity only matters if you're moving close to the speed of light.
  6. Special relativity says that everything is relative.
  7. There is a common misconception that relativistic length contraction is what we would actually see. Refute this by drawing a spacetime diagram for an object approaching an observer, and tracing rays of light emitted from the object's front and back that both reach the observer's eye at the same time.
  8. When you travel close to the speed of light, your time slows down.
  9. Is a light wave's wavelength relativistically length contracted by a factor of gamma?
  10. Accelerate a baseball to ultrarelativistic speeds. Does it become a black hole?
  11. Where did the Big Bang happen?
  12. The universe can't be infinite in size, because it's only had a finite amount of time to expand from the point where the Big Bang happened.

Exercise E: The sum of observer-vectors is an observer-vector.

The figure gives four pairs of four-vectors, oriented in our customary way as shown by the light-cone on the left.


1. Of the types shown in the four cases i-iv, which types of vectors could represent the world-line of an observer?

2. Suppose that \(\mathbf{U}\) and \(\mathbf{V}\) are both observer-vectors. What would it mean physically to compute \(\mathbf{U}+\mathbf{V}\)?

3. Determine the sign of each inner product \(\mathbf{A}\cdot\mathbf{B}\).

4. Given an observer whose world-line is along a four-vector \(\mathbf{O}\), suppose we want to determine whether some other four-vector \(\mathbf{P}\) is also a possible world-line of an observer. Show that knowledge of the signs of the inner products \(\mathbf{O}\cdot\mathbf{P}\) and \(\mathbf{P}\cdot\mathbf{P}\) is necessary and sufficient to determine this. Hint: Consider various possibilities like i-iv for vector \(\mathbf{P}\), and see how the signs would turn out.

5. For observer-vectors \(\mathbf{U}\) and \(\mathbf{V}\) as described in 2, and with the criterion from 4 in mind, determine the signs of

\[\begin{equation*} (\mathbf{U}+\mathbf{V})\cdot(\mathbf{U}+\mathbf{V}) \end{equation*}\]


\[\begin{equation*} (\mathbf{U}+\mathbf{V})\cdot\mathbf{U} \end{equation*}\]

by multiplying them out. Interpret the result physically.

(c) 1998-2013 Benjamin Crowell, licensed under the Creative Commons Attribution-ShareAlike license. Photo credits are given at the end of the Adobe Acrobat version.

[1] A. Einstein, “On the Electrodynamics of Moving Bodies,” Annalen der Physik 17 (1905), p. 891, tr. Saha and Bose.
[2] Bailey et al., Nucl. Phys. B150 (1979) 1
[4] Newton's second law gives \(a_N=F/m=eE/m\). The constant-acceleration equation \(\Delta x=(1/2)at^2\) then gives \(t_N=\sqrt{2m\ell_1/eE}\).
[5] To make the low-energy portion of the graph legible, Bertozzi's highest-energy data point is omitted.
[6] A double-mass object moving at half the speed does not have the same kinetic energy. Kinetic energy depends on the square of the velocity, so cutting the velocity in half reduces the energy by a factor of 1/4, which, multiplied by the doubled mass, makes 1/2 the original energy.
[7] Einstein originally described the distinction between the two theories by saying that the special theory applied to nonaccelerating frames of reference, while the general one allowed any frame at all. The modern consensus is that Einstein was misinterpreting his own theory, and that special relativity actually handles accelerating frames just fine.