You are viewing the html version of Simple Nature, by Benjamin Crowell. This version is only designed for casual browsing, and may have some formatting problems. For serious reading, you want the Adobe Acrobat version.
When Einstein first began to develop the theory of relativity, around 1905, the only real-world observations he could draw on were ambiguous and indirect. Today, the evidence is part of everyday life. For example, every time you use a GPS receiver (figure a), you're using Einstein's theory of relativity. Somewhere between 1905 and today, technology became good enough to allow conceptually simple experiments that students in the early 20th century could only discuss in terms like “Imagine that we could...” A good jumping-on point is 1971. In that year, J.C. Hafele and R.E. Keating brought atomic clocks aboard commercial airliners (figure b) and went around the world, once from east to west and once from west to east. Hafele and Keating observed that there was a discrepancy between the times measured by the traveling clocks and the times measured by similar clocks that stayed home at the U.S. Naval Observatory in Washington. The east-going clock lost time, ending up off by \(-59\pm10\) nanoseconds, while the west-going one gained \(273\pm7\) ns.
This establishes that time doesn't work the way Newton believed it did when he wrote that “Absolute, true, and mathematical time, of itself, and from its own nature flows equably without regard to anything external...” We are used to thinking of time as absolute and universal, so it is disturbing to find that it can flow at a different rate for observers in different frames of reference. Nevertheless, the effects that Hafele and Keating observed were small. This makes sense: Newton's laws have already been thoroughly tested by experiments under a wide variety of conditions, so a new theory like relativity must agree with Newton's to a good approximation, within the Newtonian theory's realm of applicability. This requirement of backward-compatibility is known as the correspondence principle.
It's also reassuring that the effects on time were small compared to the three-day lengths of the plane trips. There was therefore no opportunity for paradoxical scenarios such as one in which the east-going experimenter arrived back in Washington before he left and then convinced himself not to take the trip. A theory that maintains this kind of orderly relationship between cause and effect is said to satisfy causality.
Causality is like a water-hungry front-yard lawn in Los Angeles: we know we want it, but it's not easy to explain why. Even in plain old Newtonian physics, there is no clear distinction between past and future. In figure c, number 18 throws the football to number 25, and the ball obeys Newton's laws of motion. If we took a video of the pass and played it backward, we would see the ball flying from 25 to 18, and Newton's laws would still be satisfied. Nevertheless, we have a strong psychological impression that there is a forward arrow of time. I can remember what the stock market did last year, but I can't remember what it will do next year. Joan of Arc's military victories against England caused the English to burn her at the stake; it's hard to accept that Newton's laws provide an equally good description of a process in which her execution in 1431 caused her to win a battle in 1429. There is no consensus at this point among physicists on the origin and significance of time's arrow, and for our present purposes we don't need to solve this mystery. Instead, we merely note the empirical fact that, regardless of what causality really means and where it really comes from, its behavior is consistent. Specifically, experiments show that if an observer in a certain frame of reference observes that event A causes event B, then observers in other frames agree that A causes B, not the other way around. This is merely a generalization about a large body of experimental results, not a logically necessary assumption. If Keating had gone around the world and arrived back in Washington before he left, it would have disproved this statement about causality.
Hafele and Keating were testing specific quantitative predictions of relativity, and they verified them to within their experiment's error bars. Let's work backward instead, and inspect the empirical results for clues as to how time works.
The two traveling clocks experienced effects in opposite directions, and this suggests that the rate at which time flows depends on the motion of the observer. The east-going clock was moving in the same direction as the earth's rotation, so its velocity relative to the earth's center was greater than that of the clock that remained in Washington, while the west-going clock's velocity was correspondingly reduced. The fact that the east-going clock fell behind, and the west-going one got ahead, shows that the effect of motion is to make time go more slowly. This effect of motion on time was predicted by Einstein in his original 1905 paper on relativity, written when he was 26.
If this had been the only effect in the Hafele-Keating experiment, then we would have expected to see effects on the two flying clocks that were equal in size. Making up some simple numbers to keep the arithmetic transparent, suppose that the earth rotates from west to east at 1000 km/hr, and that the planes fly at 300 km/hr. Then the speed of the clock on the ground is 1000 km/hr, the speed of the clock on the east-going plane is 1300 km/hr, and that of the west-going clock 700 km/hr. Since the speeds of 700, 1000, and 1300 km/hr have equal spacing on either side of 1000, we would expect the discrepancies of the moving clocks relative to the one in the lab to be equal in size but opposite in sign.
In fact, the two effects are unequal in size: \(-59\) ns and 273 ns. This implies that there is a second effect involved, simply due to the planes' being up in the air. This was verified more directly in a 1978 experiment by Iijima and Fujiwara, figure e, in which identical atomic clocks were kept at rest at the top and bottom of a mountain near Tokyo. This experiment, unlike the Hafele-Keating one, isolates one effect on time, the gravitational one: time's rate of flow increases with height in a gravitational field. Einstein didn't figure out how to incorporate gravity into relativity until 1915, after much frustration and many false starts. The simpler version of the theory without gravity is known as special relativity, the full version as general relativity. We'll restrict ourselves to special relativity until section 7.4, and that means that what we want to focus on right now is the distortion of time due to motion, not gravity.
We can now see in more detail how to apply the correspondence principle. The behavior of the three clocks in the Hafele-Keating experiment shows that the amount of time distortion increases as the speed of the clock's motion increases. Newton lived in an era when the fastest mode of transportation was a galloping horse, and the best pendulum clocks would accumulate errors of perhaps a minute over the course of several days. A horse is much slower than a jet plane, so the distortion of time would have had a relative size of only \(\sim10^{-15}\) --- much smaller than the clocks were capable of detecting. At the speed of a passenger jet, the effect is about \(10^{-12}\), and state-of-the-art atomic clocks in 1971 were capable of measuring that. A GPS satellite travels much faster than a jet airplane, and the effect on the satellite turns out to be \(\sim10^{-10}\). The general idea here is that all physical laws are approximations, and approximations aren't simply right or wrong in different situations. Approximations are better or worse in different situations, and the question is whether a particular approximation is good enough in a given situation to serve a particular purpose. The faster the motion, the worse the Newtonian approximation of absolute time. Whether the approximation is good enough depends on what you're trying to accomplish. The correspondence principle says that the approximation must have been good enough to explain all the experiments done in the centuries before Einstein came up with relativity.
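The orders of magnitude quoted above can be checked with a short Python sketch. For speeds far below that of light, the fractional slowing of a moving clock is \(\gamma-1\approx v^2/2c^2\), where \(\gamma\) is the factor defined later in this section. The three speeds below are assumed round numbers chosen for illustration, not values from the text, and the gravitational effect is ignored.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def fractional_time_distortion(v):
    """Return gamma - 1 for speed v (m/s), computed stably for v << c."""
    beta_sq = (v / C) ** 2
    # gamma = (1 - beta^2)^(-1/2); expm1/log1p avoid round-off near 1
    return math.expm1(-0.5 * math.log1p(-beta_sq))

# Assumed, illustrative speeds in m/s (not taken from the text)
speeds = {
    "galloping horse": 10.0,
    "passenger jet": 250.0,
    "GPS satellite": 3900.0,
}

for name, v in speeds.items():
    effect = fractional_time_distortion(v)
    print(f"{name:16s} v = {v:7.1f} m/s   gamma - 1 = {effect:.1e}")
```

With these assumed speeds the results land within a factor of a few of the rough figures quoted above, which is all that an order-of-magnitude estimate claims.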
By the way, don't get an inflated idea of the importance of the Hafele-Keating experiment. Special relativity had already been confirmed by a vast and varied body of experiments decades before 1971. The only reason I'm giving such a prominent role to this experiment, which was actually more important as a test of general relativity, is that it is conceptually very direct.
Relativity says that when two observers are in different frames of reference, each observer considers the other one's perception of time to be distorted. We'll also see that something similar happens to their observations of distances, so both space and time are distorted. What exactly is this distortion? How do we even conceptualize it?
The idea isn't really as radical as it might seem at first. We can visualize the structure of space and time using a graph with position and time on its axes. These graphs are familiar by now, but we're going to look at them in a slightly different way. Before, we used them to describe the motion of objects. The grid underlying the graph was merely the stage on which the actors played their parts. Now the background comes to the foreground: it's time and space themselves that we're studying. We don't necessarily need to have a line or a curve drawn on top of the grid to represent a particular object. We may, for example, just want to talk about events, depicted as points on the graph as in figure a. A distortion of the Cartesian grid underlying the graph can arise for perfectly ordinary reasons that Isaac Newton would have readily accepted. For example, we can simply change the units used to measure time and position, as in figure b.
We're going to have quite a few examples of this type, so I'll adopt the convention shown in figure c for depicting them. Figure c summarizes the relationship between figures a and b in a more compact form. The gray rectangle represents the original coordinate grid of figure a, while the grid of black lines represents the new version from figure b. Omitting the grid from the gray rectangle makes the diagram easier to decode visually.
Our goal of unraveling the mysteries of special relativity amounts to nothing more than finding out how to draw a diagram like c in the case where the two different sets of coordinates represent measurements of time and space made by two different observers, each in motion relative to the other. Galileo and Newton thought they knew the answer to this question, but their answer turned out to be only approximately right. To avoid repeating the same mistakes, we need to clearly spell out what we think are the basic properties of time and space that will be a reliable foundation for our reasoning. I want to emphasize that there is no purely logical way of deciding on this list of properties. The ones I'll list are simply a summary of the patterns observed in the results from a large body of experiments. Furthermore, some of them are only approximate. For example, property 1 below is only a good approximation when the gravitational field is weak, so it is a property that applies to special relativity, not to general relativity.
Most of these are not very subversive. Properties 1 and 2 date back to the time when Galileo and Newton started applying the same universal laws of motion to the solar system and to the earth; this contradicted Aristotle, who believed that, for example, a rock would naturally want to move in a certain special direction (down) in order to reach a certain special location (the earth's surface). Property 3 is the reason that Einstein called his theory “relativity,” but Galileo and Newton believed exactly the same thing to be true, as dramatized by Galileo's run-in with the Church over the question of whether the earth could really be in motion around the sun. Property 4 would probably surprise most people only because it asserts in such a weak and specialized way something that they feel deeply must be true. The only really strange item on the list is 5, but the Hafele-Keating experiment forces it upon us.
If it were not for property 5, we could imagine that figure d would give the correct transformation between frames of reference in motion relative to one another. Let's say that observer 1, whose grid coincides with the gray rectangle, is a hitch-hiker standing by the side of a road. Event A is a raindrop hitting his head, and event B is another raindrop hitting his head. He says that A and B occur at the same location in space. Observer 2 is a motorist who drives by without stopping; to him, the passenger compartment of his car is at rest, while the asphalt slides by underneath. He says that A and B occur at different points in space, because during the time between the first raindrop and the second, the hitch-hiker has moved backward. On the other hand, observer 2 says that events A and C occur in the same place, while the hitch-hiker disagrees. The slope of the grid-lines is simply the velocity of the relative motion of each observer relative to the other.
Figure d has familiar, comforting, and eminently sensible behavior, but it also happens to be wrong, because it violates property 5. The distortion of the coordinate grid has only moved the vertical lines up and down, so both observers agree that events like B and C are simultaneous. If this were really the way things worked, then all observers could synchronize all their clocks with one another once and for all, and the clocks would never get out of sync. This contradicts the results of the Hafele-Keating experiment, in which all three clocks were initially synchronized in Washington, but later went out of sync because of their different states of motion.
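The transformation of figure d can be written out as a short Python sketch. The rule used is the standard Galilean one, \(x'=x-vt\) with \(t'=t\). The function name, the event coordinates, and the 20 m/s speed of the car are illustrative assumptions.

```python
def galilean(event, v):
    """Transform an event (t, x) into a frame moving at velocity v."""
    t, x = event
    return (t, x - v * t)  # time is left untouched: this is the flaw

v = 20.0  # assumed speed of the motorist in m/s (illustrative)

# Events in the hitch-hiker's frame, as (time in s, position in m)
A = (0.0, 0.0)      # first raindrop
B = (1.0, 0.0)      # second raindrop, same place for the hitch-hiker
C = (1.0, v * 1.0)  # simultaneous with B, at the car's position

A2, B2, C2 = (galilean(e, v) for e in (A, B, C))
print("A:", A2, " B:", B2, " C:", C2)
```

In the motorist's coordinates, A and C share a position while B has slid backward, as described above. The times of B and C are still equal, which is exactly the agreement on simultaneity that property 5 forbids.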
It might seem as though we still had a huge amount of wiggle room available for the correct form of the distortion. It turns out, however, that properties 1-5 are sufficient to prove that there is only one answer, which is the one found by Einstein in 1905. To see why this is, let's work by a process of elimination.
Figure e shows a transformation that might seem at first glance to be as good a candidate as any other, but it violates property 3, that motion is relative, for the following reason. In observer 2's frame of reference, some of the grid lines cross one another. This means that observers 1 and 2 disagree on whether or not certain events are the same. For instance, suppose that event A marks the arrival of an arrow at the bull's-eye of a target, and event B is the location and time when the bull's-eye is punctured. Events A and B occur at the same location and at the same time. If one observer says that A and B coincide, but another says that they don't, we have a direct contradiction. Since the two frames of reference in figure e give contradictory results, one of them is right and one is wrong. This violates property 3, because all inertial frames of reference are supposed to be equally valid. To avoid problems like this, we clearly need to make sure that none of the grid lines ever cross one another.
The next type of transformation we want to kill off is shown in figure f, in which the grid lines curve, but never cross one another. The trouble with this one is that it violates property 1, the uniformity of time and space. The transformation is unusually “twisty” at A, whereas at B it's much more smooth. This can't be correct, because the transformation is only supposed to depend on the relative state of motion of the two frames of reference, and that given information doesn't single out a special role for any particular point in spacetime. If, for example, we had one frame of reference rotating relative to the other, then there would be something special about the axis of rotation. But we're only talking about inertial frames of reference here, as specified in property 3, so we can't have rotation; each frame of reference has to be moving in a straight line at constant speed. For frames related in this way, there is nothing that could single out an event like A for special treatment compared to B, so transformation f violates property 1.
The examples in figures e and f show that the transformation we're looking for must be linear, meaning that it must transform lines into lines, and furthermore that it has to take parallel lines to parallel lines. Einstein wrote in his 1905 paper that “... on account of the property of homogeneity [property 1] which we ascribe to time and space, the [transformation] must be linear.”^{1} Applying this to our diagrams, the original gray rectangle, which is a special type of parallelogram containing right angles, must be transformed into another parallelogram. There are three types of transformations, figure g, that have this property. Case I is the Galilean transformation of figure d on page 386, which we've already ruled out.
Case II can also be discarded. Here every point on the grid rotates counterclockwise. What physical parameter would determine the amount of rotation? The only thing that could be relevant would be \(v\), the relative velocity of the motion of the two frames of reference with respect to one another. But if the angle of rotation were proportional to \(v\), then for large enough velocities the grid would have left and right reversed, and this would violate property 4, causality: one observer would say that event A caused a later event B, but another observer would say that B came first and caused A.
The only remaining possibility is case III, which I've redrawn in figure h with a couple of changes. This is the one that Einstein predicted in 1905. The transformation is known as the Lorentz transformation, after Hendrik Lorentz (1853-1928), who partially anticipated Einstein's work, without arriving at the correct interpretation. The distortion is a kind of smooshing and stretching, as suggested by the hands. Also, we've already seen in figures a-c on page 385 that we're free to stretch or compress everything as much as we like in the horizontal and vertical directions, because this simply corresponds to choosing different units of measurement for time and distance. In figure h I've chosen units that give the whole drawing a convenient symmetry about a 45-degree diagonal line. Ordinarily it wouldn't make sense to talk about a 45-degree angle on a graph whose axes had different units. But in relativity, the symmetric appearance of the transformation tells us that space and time ought to be treated on the same footing, and measured in the same units.
As in our discussion of the Galilean transformation, slopes are interpreted as velocities, and the slope of the near-horizontal lines in figure i is interpreted as the relative velocity of the two observers. The difference between the Galilean version and the relativistic one is that now there is smooshing happening from the other side as well. Lines that were vertical in the original grid, representing simultaneous events, now slant over to the right. This tells us that, as required by property 5, different observers do not agree on whether events that occur in different places are simultaneous. The Hafele-Keating experiment tells us that this non-simultaneity effect is fairly small, even when the velocity is as big as that of a passenger jet, and this is what we would have anticipated by the correspondence principle. The way that this is expressed in the graph is that if we pick the time unit to be the second, then the distance unit turns out to be hundreds of thousands of miles. In these units, the velocity of a passenger jet is an extremely small number, so the slope \(v\) in figure i is extremely small, and the amount of distortion is tiny --- it would be much too small to see on this scale.
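To put numbers on the non-simultaneity, here is a minimal Python sketch. The algebraic form of the transformation is not written out in this passage; the sketch assumes the standard one in units where \(c=1\), namely \(t'=\gamma(t-vx)\) and \(x'=\gamma(x-vt)\), with \(\gamma=1/\sqrt{1-v^2}\) as introduced below. The events and the jet's velocity of roughly \(8\times10^{-7}\) in these units are illustrative assumptions.

```python
import math

def lorentz(event, v):
    """Transform an event (t, x) to a frame moving at velocity v, with c = 1."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

# Two events that observer 1 calls simultaneous (t = 0), one distance unit apart
P = (0.0, 0.0)
Q = (0.0, 1.0)

for label, v in [("v = 0.6", 0.6), ("passenger jet, assumed v ~ 8e-7", 8e-7)]:
    tP, _ = lorentz(P, v)
    tQ, _ = lorentz(Q, v)
    print(f"{label}: observer 2 finds a time difference of {tQ - tP:.3g}")
```

At 60% of the speed of light the two events are separated in time by a large fraction of a time unit, while at the jet's speed the disagreement is less than a millionth of a unit.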
The only thing left to determine about the Lorentz transformation is the size of the transformed parallelogram relative to the size of the original one. Although the drawing of the hands in figure h may suggest that the grid deforms like a framework made of rigid coat-hanger wire, that is not the case. If you look carefully at the figure, you'll see that the edges of the smooshed parallelogram are actually a little longer than the edges of the original rectangle. In fact what stays the same is not lengths but areas, as proved in the caption to figure j.
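The equal-area property can be checked numerically. The sketch below is a consistency check rather than a proof: it assumes the standard algebraic form of the Lorentz transformation in units where \(c=1\), which is not written out in this passage, and transforms the two edges of a unit square of the original grid. The velocities tried are arbitrary.

```python
import math

def lorentz(event, v):
    """Transform an event (t, x) to a frame moving at velocity v, with c = 1."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def parallelogram_area(a, b):
    """Area spanned by two edge vectors (t, x), via the 2D cross product."""
    return abs(a[0] * b[1] - a[1] * b[0])

for v in (0.1, 0.6, 0.9):
    edge_t = lorentz((1.0, 0.0), v)  # image of the unit edge along the time axis
    edge_x = lorentz((0.0, 1.0), v)  # image of the unit edge along the space axis
    length = math.hypot(*edge_t)
    area = parallelogram_area(edge_t, edge_x)
    print(f"v = {v}: edge length = {length:.3f}, area = {area:.6f}")
```

The edges come out longer than 1, as the figure suggests, while the area stays at 1 to within rounding.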
With a little algebra and geometry (homework problem 7, page 439), one can use the equal-area property to show that the factor \(\gamma\) (Greek letter gamma) defined in figure k is given by the equation
\[ \gamma = \frac{1}{\sqrt{1-v^2}} . \]
If you've had good training in physics, the first thing you probably think when you look at this equation is that it must be nonsense, because its units don't make sense. How can we take something with units of velocity squared, and subtract it from a unitless 1? But remember that this is expressed in our special relativistic units, in which the same units are used for distance and time. In this system, velocities are always unitless. This sort of thing happens frequently in physics. For instance, before James Joule discovered conservation of energy, nobody knew that heat and mechanical energy were different forms of the same thing, so instead of measuring them both in units of joules as we would do now, they measured heat in one unit (such as calories) and mechanical energy in another (such as foot-pounds). In ordinary metric units, we just need an extra conversion factor \(c\), and the equation becomes
\[ \gamma = \frac{1}{\sqrt{1-v^2/c^2}} . \]
Here's why we care about \(\gamma\). Figure k defines it as the ratio of two times: the time between two events as expressed in one coordinate system, and the time between the same two events as measured in the other one. The interpretation is:
A clock runs fastest in the frame of reference of an observer who is at rest relative to the clock. An observer in motion relative to the clock at speed \(v\) perceives the clock as running more slowly by a factor of \(\gamma\).
As proved in figures l and m, lengths are also distorted:
A meter-stick appears longest to an observer who is at rest relative to it. An observer moving relative to the meter-stick at \(v\) observes the stick to be shortened by a factor of \(\gamma\).
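The two rules above translate directly into code. This is a minimal Python sketch in ordinary metric units; the function names are my own, and the speed of \(0.6c\) is just a convenient illustrative value.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v, c=C):
    """Return gamma = 1/sqrt(1 - v^2/c^2) for a speed v with |v| < c."""
    if abs(v) >= c:
        raise ValueError("speed must be less than c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def dilated_time(proper_time, v, c=C):
    """Time between two ticks, as measured by an observer moving at v relative to the clock."""
    return gamma(v, c) * proper_time

def contracted_length(rest_length, v, c=C):
    """Length of an object, as measured by an observer moving at v relative to it."""
    return rest_length / gamma(v, c)

v = 0.6 * C
print(gamma(v))                   # close to 1.25
print(dilated_time(1.0, v))       # a 1 s tick takes about 1.25 s
print(contracted_length(1.0, v))  # a meter-stick measures about 0.8 m
```

The guard against \(|v|\ge c\) is a design choice for the sketch: the square root has no real value there, so it seems better to fail loudly than to return nonsense.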
What is \(\gamma\) when \(v=0\)? What does this mean?
(answer in the back of the PDF version of the book)
Betty experiences time dilation. At this speed, her \(\gamma\) is 2.0, so that the voyage will only seem to her to last 7 years. But there is perfect symmetry between Alice's and Betty's frames of reference, so Betty agrees with Alice on their relative speed; Betty sees herself as being at rest, while the sun and Tau Ceti both move backward at 87% of the speed of light. How, then, can she observe Tau Ceti to get to her in only 7 years, when it should take 14 years to travel 12 light-years at this speed?
We need to take into account length contraction. Betty sees the distance between the sun and Tau Ceti to be shrunk by a factor of 2. The same thing occurs for Alice, who observes Betty and her spaceship to be foreshortened.
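The numbers in this example can be checked with a few lines of Python. The sketch uses only the figures quoted above (12 light-years and 87% of the speed of light), and works in years and light-years so that speeds are fractions of \(c\). The variable names are my own.

```python
import math

distance_ly = 12.0  # sun to Tau Ceti in Alice's frame, in light-years
v = 0.87            # Betty's speed as a fraction of c

gamma = 1.0 / math.sqrt(1.0 - v ** 2)

# Alice's account: a long trip, during which Betty's clock runs slow
alice_years = distance_ly / v
betty_years = alice_years / gamma

# Betty's account: a shortened distance, covered at the same relative speed
contracted_ly = distance_ly / gamma
betty_years_own = contracted_ly / v

print(f"gamma = {gamma:.2f}")
print(f"Alice: {alice_years:.1f} years")
print(f"Betty, via time dilation: {betty_years:.1f} years")
print(f"Betty, via length contraction: {contracted_ly:.1f} light-years, {betty_years_own:.1f} years")
```

Both accounts give Betty the same elapsed time of about 7 years, which is the point of the example: time dilation in one frame and length contraction in the other are two descriptions of the same result.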
Particles called muons (named after the Greek letter \(\mu\), “myoo”) were produced by an accelerator at CERN, near Geneva. A muon is essentially a heavier version of the electron. Muons undergo radioactive decay, lasting an average of only 2.197 \(\mu\text{s}\) before they evaporate into an electron and two neutrinos. The 1974 experiment was actually built in order to measure the magnetic properties of muons, but it produced a high-precision test of time dilation as a byproduct. Because muons have the same electric charge as electrons, they can be trapped using magnetic fields. Muons were injected into the ring shown in figure p, circling around it until they underwent radioactive decay. At the speed at which these muons were traveling, they had \(\gamma=29.33\), so on the average they lasted 29.33 times longer than the normal lifetime. In other words, they were like tiny alarm clocks that self-destructed at a randomly selected time. Figure o shows the number of radioactive decays counted, as a function of the time elapsed after a given stream of muons was injected into the storage ring. The two dashed lines show the rates of decay predicted with and without relativity. The relativistic line is the one that agrees with experiment.
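Here is a rough Python sketch of the comparison behind the two dashed lines. It uses the lifetime and \(\gamma\) quoted above, and assumes the usual exponential law for radioactive decay, which is not stated in this passage. The 100 microsecond observation time is an arbitrary illustrative choice.

```python
import math

tau_rest_us = 2.197  # mean muon lifetime at rest, in microseconds
gamma = 29.33        # value quoted for the stored muons

tau_lab_us = gamma * tau_rest_us          # dilated mean lifetime in the lab
beta = math.sqrt(1.0 - 1.0 / gamma ** 2)  # speed as a fraction of c

def surviving_fraction(t_us, tau_us):
    """Fraction of muons not yet decayed after time t, for mean lifetime tau."""
    return math.exp(-t_us / tau_us)

t = 100.0  # arbitrary, illustrative time after injection, in microseconds
with_rel = surviving_fraction(t, tau_lab_us)
without_rel = surviving_fraction(t, tau_rest_us)

print(f"lab lifetime = {tau_lab_us:.2f} us, speed = {beta:.5f} c")
print(f"surviving after {t:.0f} us with relativity:    {with_rel:.3f}")
print(f"surviving after {t:.0f} us without relativity: {without_rel:.1e}")
```

The two predictions differ by many orders of magnitude at this time, which is why the decay counts discriminate so sharply between them.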
Figure q shows an artist's rendering of the length contraction for the collision of two gold nuclei at relativistic speeds in the RHIC accelerator in Long Island, New York, which went on line in 2000. The gold nuclei would appear nearly spherical (or just slightly lengthened like an American football) in frames moving along with them, but in the laboratory's frame, they both appear drastically foreshortened as they approach the point of collision. The later pictures show the nuclei merging to form a hot soup, in which experimenters hope to observe a new form of matter.
The paradox is resolved when we recognize that the concept of fitting the bus in the garage “all at once” contains a hidden assumption, the assumption that it makes sense to ask whether the front and back of the bus can simultaneously be in the garage. Observers in different frames of reference moving at high relative speeds do not necessarily agree on whether things happen simultaneously. As shown in figure r, the person in the garage's frame can shut the door at an instant B he perceives to be simultaneous with the front bumper's arrival A at the back wall of the garage, but the driver would not agree about the simultaneity of these two events, and would perceive the door as having shut long after she plowed through the back wall.
Let's think a little more about the role of the 45-degree diagonal in the Lorentz transformation. Slopes on these graphs are interpreted as velocities. This line has a slope of 1 in relativistic units, but that slope corresponds to \(c\) in ordinary metric units. We already know that the relativistic distance unit must be extremely large compared to the relativistic time unit, so \(c\) must be extremely large. Now note what happens when we perform a Lorentz transformation: this particular line gets stretched, but the new version of the line lies right on top of the old one, and its slope stays the same. In other words, if one observer says that something has a velocity equal to \(c\), every other observer will agree on that velocity as well. (The same thing happens with \(-c\).)
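The invariance of the diagonal can be checked numerically. The sketch below again assumes the standard algebraic form of the Lorentz transformation in units where \(c=1\), which is not written out in this passage. The relative velocity of 0.6 and the sample points are arbitrary illustrative choices.

```python
import math

def lorentz(event, v):
    """Transform an event (t, x) to a frame moving at velocity v, with c = 1."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def slope(event):
    """Velocity of a line from the origin through the event (distance / time)."""
    t, x = event
    return x / t

v = 0.6  # assumed relative velocity of the two observers, as a fraction of c

on_diagonal = (1.0, 1.0)    # a point on the line with slope 1
slower_object = (1.0, 0.5)  # a point on a line with slope 0.5

print(slope(lorentz(on_diagonal, v)))    # still 1
print(slope(lorentz(slower_object, v)))  # no longer 0.5
```

The point on the diagonal lands at a different place on the same line, so its slope is unchanged, while the slope of any other line through the origin depends on the observer.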
This is counterintuitive, since we expect velocities to add and subtract in relative motion. If a dog is running away from me at 5 m/s relative to the sidewalk, and I run after it at 3 m/s, the dog's velocity in my frame of reference is 2 m/s. According to everything we have learned about motion, the dog must have different speeds in the two frames: 5 m/s in the sidewalk's frame and 2 m/s in mine. But velocities are measured by dividing a distance by a time, and both distance and time are distorted by relativistic effects, so we actually shouldn't expect the ordinary arithmetic addition of velocities to hold in relativity; it's an approximation that's valid at velocities that are small compared to \(c\).
For example, suppose Janet takes a trip in a spaceship, and accelerates until she is moving at \(0.6c\) relative to the earth. She then launches a space probe in the forward direction at a speed relative to her ship of \(0.6c\). We might think that the probe was then moving at a velocity of \(1.2c\), but in fact the answer is still less than \(c\) (problem 1, page 438). This is an example of a more general fact about relativity, which is that \(c\) represents a universal speed limit. This is required by causality, as shown in figure s.
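For a numerical feel, here is a Python sketch that uses the standard relativistic rule for combining two velocities along the same line, \((u+v)/(1+uv)\) in units where \(c=1\). That formula is not derived in this passage, so treat it as an assumption of the sketch. The function name is my own.

```python
def combine_velocities(u, v):
    """Relativistic combination of two parallel velocities, as fractions of c."""
    return (u + v) / (1.0 + u * v)

# Janet's probe: 0.6c relative to the ship, with the ship at 0.6c relative to the earth
probe = combine_velocities(0.6, 0.6)
print(probe)

# The dog example: at everyday speeds the result is indistinguishable from u + v
c = 299_792_458.0  # speed of light in m/s
dog = combine_velocities(5.0 / c, -3.0 / c) * c
print(dog)
```

The probe's velocity comes out below 1, that is, below \(c\), while the dog's comes out at 2 m/s to within any precision we could hope to measure.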
Now consider a beam of light. We're used to talking casually about the “speed of light,” but what does that really mean? Motion is relative, so normally if we want to talk about a velocity, we have to specify what it's measured relative to. A sound wave has a certain speed relative to the air, and a water wave has its own speed relative to the water. If we want to measure the speed of an ocean wave, for example, we should make sure to measure it in a frame of reference at rest relative to the water. But light isn't a vibration of a physical medium; it can propagate through the near-perfect vacuum of outer space, as when rays of sunlight travel to earth. This seems like a paradox: light is supposed to have a specific speed, but there is no way to decide what frame of reference to measure it in. The way out of the paradox is that light must travel at a velocity equal to \(c\). Since all observers agree on a velocity of \(c\), regardless of their frame of reference, everything is consistent.
The constancy of the speed of light had in fact already been observed when Einstein was an 8-year-old boy, but because nobody could figure out how to interpret it, the result was largely ignored. In 1887 Michelson and Morley set up a clever apparatus to measure any difference in the speed of light beams traveling east-west and north-south. The motion of the earth around the sun at 110,000 km/hour (about 0.01% of the speed of light) is to our west during the day. Michelson and Morley believed that light was a vibration of a mysterious medium called the ether, so they expected that the speed of light would be a fixed value relative to the ether. As the earth moved through the ether, they thought they would observe an effect on the velocity of light along an east-west line. For instance, if they released a beam of light in a westward direction during the day, they expected that it would move away from them at less than the normal speed because the earth was chasing it through the ether. They were surprised when they found that the expected 0.01% change in the speed of light did not occur.
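The 0.01% figure is a one-line unit conversion, sketched here in Python. The variable names are my own.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s

earth_speed_km_per_hr = 110_000.0  # orbital speed quoted in the text
earth_speed_km_per_s = earth_speed_km_per_hr / 3600.0

fraction_of_c = earth_speed_km_per_s / C_KM_PER_S
print(f"{earth_speed_km_per_s:.1f} km/s = {fraction_of_c * 100:.3f}% of c")
```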
If you've flown in a jet plane, you can thank relativity for helping you to avoid crashing into a mountain or an ocean. Figure u shows a standard piece of navigational equipment called a ring laser gyroscope. A beam of light is split into two parts, sent around the perimeter of the device, and reunited. Since the speed of light is constant, we expect the two parts to come back together at the same time. If they don't, it's evidence that the device has been rotating. The plane's computer senses this and notes how much rotation has accumulated.
Relativity has only one universal speed, so it requires that all light waves travel at the same speed, regardless of their frequency and wavelength. Presently the best experimental tests of the invariance of the speed of light with respect to wavelength come from astronomical observations of gamma-ray bursts, which are sudden outpourings of high-frequency light, believed to originate from a supernova explosion in another galaxy. One such observation, in 2009,^{3} found that the times of arrival of all the different frequencies in the burst differed by no more than 2 seconds out of a total time in flight on the order of ten billion years!
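To see how tight this limit is, we can turn it into a fractional number with a short Python sketch. It assumes that a small fractional difference in speed shows up as an equal fractional difference in travel time, and it takes "ten billion years" at face value as \(10^{10}\) years.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 s

arrival_spread_s = 2.0                     # maximum spread in arrival times
flight_time_s = 1.0e10 * SECONDS_PER_YEAR  # "on the order of ten billion years"

fractional_limit = arrival_spread_s / flight_time_s
print(f"fractional variation in speed is at most about {fractional_limit:.0e}")
```

The result is a few parts in \(10^{18}\), though the round figure for the time in flight means only the order of magnitude should be taken seriously.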
A person in a spaceship moving at 99.99999999% of the speed of light relative to Earth shines a flashlight forward through dusty air, so the beam is visible. What does she see? What would it look like to an observer on Earth?
A question that students often struggle with is whether time and space can really be distorted, or whether it just seems that way. Compare with optical illusions or magic tricks. How could you verify, for instance, that the lines in the figure are actually parallel? Are relativistic effects the same, or not?
On a spaceship moving at relativistic speeds, would a lecture seem even longer and more boring than normal?
Mechanical clocks can be affected by motion. For example, it was a significant technological achievement to build a clock that could sail aboard a ship and still keep accurate time, allowing longitude to be determined. How is this similar to or different from relativistic time dilation?
Figure q from page 392, depicting the collision of two nuclei at the RHIC accelerator, is reproduced below. What would the shapes of the two nuclei look like to a microscopic observer riding on the left-hand nucleus? To an observer riding on the right-hand one? Can they agree on what is happening? If not, why not --- after all, shouldn't they see the same thing if they both compare the two nuclei side-by-side at the same instant in time?
If you stick a piece of foam rubber out the window of your car while driving down the freeway, the wind may compress it a little. Does it make sense to interpret the relativistic length contraction as a type of strain that pushes an object's atoms together like this? How does this relate to discussion question E?
The machine-gunner in the figure sends out a spray of bullets. Suppose that the bullets are being shot into outer space, and that the distances traveled are trillions of miles (so that the human figure in the diagram is not to scale). After a long time, the bullets reach the points shown with dots which are all equally far from the gun. Their arrivals at those points are events A through E, which happen at different times. The chain of impacts extends across space at a speed greater than \(c\). Does this violate special relativity?
The Newtonian picture of the universe has particles interacting with each other by exerting forces from a distance, and these forces are imagined to occur without any time delay. For example, suppose that super-powerful aliens, angered when they hear disco music in our AM radio transmissions, come to our solar system on a mission to cleanse the universe of our aesthetic contamination. They apply a force to our sun, causing it to go flying out of the solar system at a gazillion miles an hour. According to Newton's laws, the gravitational force of the sun on the earth will immediately start dropping off. This will be detectable on earth, and since sunlight takes eight minutes to get from the sun to the earth, the change in gravitational force will, according to Newton, be the first way in which earthlings learn the bad news --- the sun will not visibly start receding until a little later. Although this scenario is fanciful, it shows a real feature of Newton's laws: that information can be transmitted from one place in the universe to another with zero time delay, so that transmission and reception occur at exactly the same instant. Newton was sharp enough to realize that this required a nontrivial assumption, which was that there was some completely objective and well-defined way of saying whether two things happened at exactly the same instant. He stated this assumption explicitly: “Absolute, true, and mathematical time, of itself, and from its own nature flows at a constant rate without regard to anything external...”
Relativity forbids Newton's instantaneous action at a distance. For suppose that instantaneous action at a distance existed. It would then be possible to send signals from one place in the universe to another without any time lag. This would allow perfect synchronization of all clocks. But the Hafele-Keating experiment demonstrates that clocks A and B that have been initially synchronized will drift out of sync if one is in motion relative to the other. With instantaneous transmission of signals, we could determine, without having to wait for A and B to be reunited, which was ahead and which was behind. Since they don't need to be reunited, neither one needs to undergo any acceleration; each clock can fix an inertial frame of reference, with a velocity vector that changes neither its direction nor its magnitude. But this violates the principle that constant-velocity motion is relative, because each clock can be considered to be at rest, in its own frame of reference. Since no experiment has ever detected any violation of the relativity of motion, we conclude that instantaneous action at a distance is impossible.
Since forces can't be transmitted instantaneously, it becomes natural to imagine force-effects spreading outward from their source like ripples on a pond, and we then have no choice but to impute some physical reality to these ripples. We call them fields, and they have their own independent existence. Gravity is transmitted through a field called the gravitational field. Besides gravity, there are other fundamental fields of force such as electricity and magnetism (ch. 10-11). Ripples of the electric and magnetic fields turn out to be light waves. This tells us that the speed at which electric and magnetic field ripples spread must be \(c\), and by an argument similar to the one in subsection 7.2.3 the same must hold for any other fundamental field, including the gravitational field.
Fields don't have to wiggle; they can hold still as well. The earth's magnetic field, for example, is nearly constant, which is why we can use it for direction-finding.
Even empty space, then, is not perfectly featureless. It has measurable properties. For example, we can drop a rock in order to measure the direction of the gravitational field, or use a magnetic compass to find the direction of the magnetic field. This concept made a deep impression on Einstein as a child. He recalled that as a five-year-old, the gift of a magnetic compass convinced him that there was “something behind things, something deeply hidden.”
The smoking-gun argument for this strange notion of traveling force ripples comes from the fact that they carry energy. In figure x/1, Alice and Betty hold balls A and B at some distance from one another. These balls make a force on each other; it doesn't really matter for the sake of our argument whether this force is gravitational, electrical, or magnetic. Let's say it's electrical, i.e., that the balls have the kind of electrical charge that sometimes causes your socks to cling together when they come out of the clothes dryer. We'll say the force is repulsive, although again it doesn't really matter.
If Alice chooses to move her ball closer to Betty's, x/2, Alice will have to do some mechanical work against the electrical repulsion, burning off some of the calories from that chocolate cheesecake she had at lunch. This reduction in her body's chemical energy is offset by a corresponding increase in the electrical interaction energy. Not only that, but Alice feels the resistance stiffen as the balls get closer together and the repulsion strengthens. She has to do a little extra work, but this is all properly accounted for in the interaction energy.
But now suppose, x/3, that Betty decides to play a trick on Alice by tossing B far away just as Alice is getting ready to move A. We have already established that Alice can't feel B's motion instantaneously, so the electric forces must actually be propagated by an electric field. Of course this experiment is utterly impractical, but suppose for the sake of argument that the time it takes the change in the electric field to propagate across the diagram is long enough so that Alice can complete her motion before she feels the effect of B's disappearance. She is still getting stale information about B's position. As she moves A to the right, she feels a repulsion, because the field in her region of space is still the field caused by B in its old position. She has burned some chocolate cheesecake calories, and it appears that conservation of energy has been violated, because these calories can't be properly accounted for by any interaction with B, which is long gone.
If we hope to preserve the law of conservation of energy, then the only possible conclusion is that the electric field itself carries away the cheesecake energy. In fact, this example represents an impractical method of transmitting radio waves. Alice does work on charge A, and that energy goes into the radio waves. Even if B had never existed, the radio waves would still have carried energy, and Alice would still have had to do work in order to create them.
Amy and Bill are flying on spaceships in opposite directions at such high velocities that the relativistic effect on time's rate of flow is easily noticeable. Motion is relative, so Amy considers herself to be at rest and Bill to be in motion. She says that time is flowing normally for her, but Bill is slow. But Bill can say exactly the same thing. How can they both think the other is slow? Can they settle the disagreement by getting on the radio and seeing whose voice is normal and whose sounds slowed down and Darth-Vadery?
The figure shows a famous thought experiment devised by Einstein. A train is moving at constant velocity to the right when bolts of lightning strike the ground near its front and back. Alice, standing on the dirt at the midpoint of the flashes, observes that the light from the two flashes arrives simultaneously, so she says the two strikes must have occurred simultaneously. Bob, meanwhile, is sitting aboard the train, at its middle. He passes by Alice at the moment when Alice later figures out that the flashes happened. Later, he receives flash 2, and then flash 1. He infers that since both flashes traveled half the length of the train, flash 2 must have occurred first. How can this be reconciled with Alice's belief that the flashes were simultaneous? Explain using a graph.
Resolve the following paradox by drawing a spacetime diagram (i.e., a graph of \(x\) versus \(t\)). Andy and Beth are in motion relative to one another at a significant fraction of \(c\). As they pass by each other, they exchange greetings, and Beth tells Andy that she is going to blow up a stick of dynamite one hour later. One hour later by Andy's clock, she still hasn't exploded the dynamite, and he says to himself, “She hasn't exploded it because of time dilation. It's only been 40 minutes for her.” He now accelerates suddenly so that he's moving at the same velocity as Beth. The time dilation no longer exists. If he looks again, does he suddenly see the flash from the explosion? How can this be? Would he see her go through 20 minutes of her life in fast-motion?
Use a graph to resolve the following relativity paradox. Relativity says that in one frame of reference, event A could happen before event B, but in someone else's frame B would come before A. How can this be? Obviously the two people could meet up at A and talk as they cruised past each other. Wouldn't they have to agree on whether B had already happened?
The rod in the figure is perfectly rigid. At event A, the hammer strikes one end of the rod. At event B, the other end moves. Since the rod is perfectly rigid, it can't compress, so A and B are simultaneous. In frame 2, B happens before A. Did the motion at the right end cause the person on the left to decide to pick up the hammer and use it?
Given an event P, we can now classify all the causal relationships in which P can participate. In Newtonian physics, these relationships fell into two classes: P could potentially cause any event that lay in its future, and could have been caused by any event in its past. In relativity, we have a three-way distinction rather than a two-way one. There is a third class of events that are too far away from P in space, and too close in time, to allow any cause and effect relationship, since causality's maximum velocity is \(c\). Since we're working in units in which \(c=1\), the boundary of this set is formed by the lines with slope \(\pm1\) on a \((t,x)\) plot. This is referred to as the light cone, for reasons that become more visually obvious when we consider more than one spatial dimension, figure aa.
Events lying inside one another's light cones are said to have a timelike relationship. Events outside each other's light cones are spacelike in relation to one another, and in the case where they lie on the surfaces of each other's light cones the term is lightlike.
The spacetime interval
The light cone is an object of central importance in both special and general relativity. It relates the geometry of spacetime to possible cause-and-effect relationships between events. This is fundamentally how relativity works: it's a geometrical theory of causality.
These ideas naturally lead us to ask what fruitful analogies we can form between the bizarre geometry of spacetime and the more familiar geometry of the Euclidean plane. The light cone cuts spacetime into different regions according to certain measurements of relationships between points (events). Similarly, a circle in Euclidean geometry cuts the plane into two parts, an interior and an exterior, according to the measurement of the distance from the circle's center. A circle stays the same when we rotate the plane. A light cone stays the same when we change frames of reference. Let's build up the analogy more explicitly.
We say that two line segments are congruent, \(\text{AB}\cong \text{CD}\), if the distance between points A and B is the same as the distance between C and D, as measured by a rigid ruler.
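We define \(\text{AB}\cong \text{CD}\) if:
1. AB and CD are both spacelike, and the two distances are equal as measured by a rigid ruler at rest in a frame in which the two events are simultaneous;
2. AB and CD are both timelike, and the two time intervals are equal as measured by clocks moving inertially from one event to the other; or
3. AB and CD are both lightlike.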
The three parts of the relativistic version each require some justification.
Case 1 has to be the way it is because space is part of spacetime. In special relativity, this space is Euclidean, so the definition of congruence has to agree with the Euclidean definition, in the case where it is possible to apply the Euclidean definition. The spacelike relation between the points is both necessary and sufficient to make this possible. If points A and B are spacelike in relation to one another, then a frame of reference exists in which they are simultaneous, so we can use a ruler that is at rest in that frame to measure their distance. If they are lightlike or timelike, then no such frame of reference exists. For example, there is no frame of reference in which Charles VII's restoration to the throne is simultaneous with Joan of Arc's execution, so we can't arrange for both of these events to touch the same ruler at the same time.
The definition in case 2 is the only sensible way to proceed if we are to respect the symmetric treatment of time and space in relativity. The timelike relation between the events is necessary and sufficient to make it possible for a clock to move from one to the other. It makes a difference that the clocks move inertially, because the twins in example 1 on p. 391 disagree on the clock time between the traveling twin's departure and return.
Case 3 may seem strange, since it says that any two lightlike intervals are congruent. But this is the only possible definition, because this case can be obtained as a limit of the timelike one. Suppose that AB is a timelike interval, but in the planet earth's frame of reference it would be necessary to travel at almost the speed of light in order to reach B from A. The required speed is less than \(c\) (i.e., less than 1) by some tiny amount \(\epsilon\). In the earth's frame, the clock referred to in the definition suffers extreme time dilation. The time elapsed on the clock is very small. As \(\epsilon\) approaches zero, and the relationship between A and B approaches a lightlike one, this clock time approaches zero. In this sense, the relativistic notion of “distance” is very different from the Euclidean one. In Euclidean geometry, the distance between two points can only be zero if they are the same point.
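The limiting process in case 3 can be made concrete with a few numbers. The following is a minimal sketch, assuming the usual time dilation factor \(\sqrt{1-v^2}\) (in units with \(c=1\)) for a clock moving inertially at speed \(v=1-\epsilon\). The function name and the sample values of \(\epsilon\) are made up for illustration.

```python
import math


def clock_time(earth_time, epsilon):
    """Time elapsed on a clock that moves inertially at v = 1 - epsilon
    (units with c = 1) while earth_time elapses in the earth's frame."""
    v = 1.0 - epsilon
    return earth_time * math.sqrt(1.0 - v * v)


# As epsilon shrinks, the interval AB approaches a lightlike one and the
# time read off the traveling clock shrinks toward zero.
for eps in (1e-2, 1e-4, 1e-6, 1e-8):
    print(f"epsilon = {eps:.0e}   clock time = {clock_time(1.0, eps):.6f}")
```

Each factor of 100 reduction in \(\epsilon\) cuts the clock time by about a factor of 10, and the time is exactly zero when \(\epsilon=0\).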
The case splitting involved in the relativistic definition is a little ugly. Having worked out the physical interpretation, we can now consolidate the definition in a nicer way by appealing to Cartesian coordinates.
Given a vector \((\Delta x,\Delta y)\) from point A to point B, the square of the distance between them is defined as \(\overline{\text{AB}}^2=\Delta x^2+\Delta y^2\).
Given points separated by coordinate differences \(\Delta x\), \(\Delta y\), \(\Delta z\), and \(\Delta t\), the spacetime interval \(\interval\) (cursive letter “I”) between them is defined as \(\interval = \Delta t^2-\Delta x^2-\Delta y^2-\Delta z^2\).
This is stated in natural units, so all four terms on the right-hand side have the same units; in metric units with \(c \ne 1\), appropriate factors of \(c\) should be inserted in order to make the units of the terms agree. The interval \(\interval\) is positive if AB is timelike (regardless of which event comes first), zero if lightlike, and negative if spacelike. Since \(\interval\) can be negative, we can't in general take its square root and define a real number \(\overline{\text{AB}}\) as in the Euclidean case. When the interval is timelike, we can interpret \(\sqrt{\interval}\) as a time, and when it's spacelike we can take \(\sqrt{-\interval}\) to be a distance.
The Euclidean definition of distance (i.e., the Pythagorean theorem) is useful because it gives the same answer regardless of how we rotate the plane. Although it is stated in terms of a certain coordinate system, its result is unambiguously defined because it is the same regardless of what coordinate system we arbitrarily pick. Similarly, \(\interval\) is useful because, as proved in example 8 below, it is the same regardless of our frame of reference, i.e., regardless of our choice of coordinates.
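The sign rule for the interval translates directly into a short routine. This is only a sketch in natural units (\(c=1\)); the function names are hypothetical, and the exact comparison with zero is only meaningful for inputs that are exactly representable, as in the examples given.

```python
def interval(dt, dx, dy=0.0, dz=0.0):
    """Spacetime interval in natural units (c = 1)."""
    return dt**2 - dx**2 - dy**2 - dz**2


def classify(dt, dx, dy=0.0, dz=0.0):
    """Classify the relationship between two events by the sign of the interval."""
    i = interval(dt, dx, dy, dz)
    if i > 0:
        return "timelike"
    if i < 0:
        return "spacelike"
    return "lightlike"


print(classify(2, 1))        # more time than distance
print(classify(1, 2))        # more distance than time
print(classify(5, 3, 4))     # 5^2 = 3^2 + 4^2, on the light cone
print(classify(-2, 1))       # order of the events doesn't matter
```

Because only squares of the coordinate differences appear, swapping the two events never changes the classification.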
t (s) | x (m) | y (m) | z (m) |
0 | \(1.784\times10^{12}\) | \(3.951\times10^{12}\) | \(0.237\times10^{12}\) |
\(3.7869120000\times10^{8}\) | \(2.420\times10^{12}\) | \(8.827\times10^{12}\) | \(0.488\times10^{12}\) |
Compare the time elapsed on the spacecraft to the time in a frame of reference tied to the sun.
\(\triangleright\) We can convert these data into natural units, with the distance unit being the second (i.e., a light-second, the distance light travels in one second) and the time unit being seconds. Converting and carrying out this subtraction, we have:
Δt (s) | Δx (s) | Δy (s) | Δz (s) |
\(3.7869120000\times10^{8}\) | \(0.2121\times10^{4}\) | \(1.626\times10^{4}\) | \(0.084\times10^{4}\) |
Comparing the exponents of the temporal and spatial numbers, we can see that the spacecraft was moving at a velocity on the order of \(10^{-4}\) of the speed of light, so relativistic effects should be small but not completely negligible.
Since the interval is timelike, we can take its square root and interpret it as the time elapsed on the spacecraft. The result is \(\sqrt{\interval}=3.786911996\times 10^8\ \text{s}\). This is 0.4 s less than the time elapsed in the sun's frame of reference.
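The arithmetic of this example is easy to check numerically. The sketch below takes the two events from the data table, converts the distances to light-seconds, and evaluates the interval. The variable names are made up, and the value of \(c\) is the defined SI value rather than anything stated in the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI defined value)

# Events as (t in s, x in m, y in m, z in m), taken from the data table.
event1 = (0.0, 1.784e12, 3.951e12, 0.237e12)
event2 = (3.7869120000e8, 2.420e12, 8.827e12, 0.488e12)

# Coordinate differences in natural units: everything in seconds.
dt = event2[0] - event1[0]
dx, dy, dz = ((b - a) / C for a, b in zip(event1[1:], event2[1:]))

interval = dt**2 - dx**2 - dy**2 - dz**2
proper_time = math.sqrt(interval)  # time elapsed on the spacecraft

# dt - proper_time subtracts two nearly equal numbers, so we use the
# algebraically equivalent form (dx^2 + dy^2 + dz^2) / (dt + proper_time).
lag = (dx**2 + dy**2 + dz**2) / (dt + proper_time)

print(f"dx, dy, dz = {dx:.0f} s, {dy:.0f} s, {dz:.0f} s")
print(f"time on spacecraft = {proper_time:.1f} s")
print(f"spacecraft clock lags by about {lag:.2f} s")
```

The lag comes out a little under 0.4 s, consistent with the figure quoted in the example once it is rounded to one significant figure.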
Four-vectors and the inner product
Example 7 makes it natural that we define a type of vector with four components, the first one relating to time and the others being spatial. These are known as four-vectors. It's clear how we should define the equivalent of a dot product in relativity:
\[ \mathbf{a}\cdot\mathbf{b} = a_t b_t - a_x b_x - a_y b_y - a_z b_z . \]
The term “dot product” has connotations of referring only to three-vectors, so the operation of taking the scalar product of two four-vectors is usually referred to instead as the “inner product.” The spacetime interval can then be thought of as the inner product of a four-vector with itself. We care about the relativistic inner product for exactly the same reason we care about its Euclidean version; both are scalars, so they have a fixed value regardless of what coordinate system we choose.
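The claim that the inner product is a scalar can be spot-checked numerically. The following sketch assumes the sign pattern of the spacetime interval (time term positive, space terms negative) and uses the standard Lorentz transformation for a boost along \(x\), which is brought in here as an assumption rather than taken from this subsection. The vectors and the velocity are arbitrary made-up values.

```python
import math


def inner(a, b):
    """Inner product of four-vectors given as (t, x, y, z), with c = 1."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def boost_x(a, v):
    """Components of four-vector a in a frame moving at velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = a
    return (g * (t - v * x), g * (x - v * t), y, z)


a = (5.0, 1.0, 2.0, 3.0)
b = (4.0, -2.0, 0.0, 1.0)
v = 0.6

print(inner(a, b))                          # original frame
print(inner(boost_x(a, v), boost_x(b, v)))  # boosted frame
print(inner(a, a), inner(boost_x(a, v), boost_x(a, v)))  # the interval
```

The individual components change under the boost, but the inner products agree to within rounding error, which is the whole point of defining them this way.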
But isn't it valid to say that Betty's spaceship is standing still and the earth moving? In that description, wouldn't Alice end up younger and Betty older? This is referred to as the “twin paradox.” It can't really be a paradox, since it's exactly what was observed in the Hafele-Keating experiment (p. 381).
Betty's track in the \(x\)-\(t\) plane (her “world-line” in relativistic jargon) consists of vectors \(\mathbf{b}\) and \(\mathbf{c}\) strung end-to-end (figure ad). We could adopt a frame of reference in which Betty was at rest during \(\mathbf{b}\) (i.e., \(b_x=0\)), but there is no frame in which \(\mathbf{b}\) and \(\mathbf{c}\) are parallel, so there is no frame in which Betty was at rest during both \(\mathbf{b}\) and \(\mathbf{c}\). This resolves the paradox.
We have already established by other methods that Betty ages less than Alice, but let's see how this plays out in a simple numerical example. Omitting units and making up simple numbers, let's say that the vectors in figure ad are
where the components are given in the order \((t,x)\). The time experienced by Alice is then
which is greater than Betty's elapsed time
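The same kind of comparison can be carried out in a few lines of code. In the sketch below, the vectors are hypothetical values chosen to give round answers; they are an assumption of this illustration and are not claimed to be the ones in figure ad. Alice's world-line is taken to be a single vector \(\mathbf{a}=\mathbf{b}+\mathbf{c}\), and each elapsed time is the square root of the inner product of a vector with itself.

```python
import math


def elapsed_time(vec):
    """Clock time along a timelike vector given as (t, x), with c = 1."""
    t, x = vec
    return math.sqrt(t * t - x * x)


# Hypothetical numbers, components in the order (t, x).
b = (5.0, 3.0)    # Betty's outbound leg, at 3/5 of c
c = (5.0, -3.0)   # Betty's return leg
a = (b[0] + c[0], b[1] + c[1])  # Alice stays home: a = b + c

alice = elapsed_time(a)
betty = elapsed_time(b) + elapsed_time(c)
print(f"Alice: {alice}, Betty: {betty}")
```

With these numbers Alice ages 10 units while Betty ages 8. In Euclidean geometry the two legs of a bent path add up to more than the straight path, but the minus sign in the interval reverses the inequality.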
Doppler shifts of light and addition of velocities
When Doppler shifts happen to ripples on a pond or the sound waves from an airplane, they can depend on the relative motion of three different objects: the source, the receiver, and the medium. But light waves don't have a medium. Therefore Doppler shifts of light can only depend on the relative motion of the source and observer.
One simple case is the one in which the relative motion of the source and the receiver is perpendicular to the line connecting them. That is, the motion is transverse. Nonrelativistic Doppler shifts happen because the distance between the source and receiver is changing, so in nonrelativistic physics we don't expect any Doppler shift at all when the motion is transverse, and this is what is in fact observed to high precision. For example, the photo shows shortened and lengthened wavelengths to the right and left, along the source's line of motion, but an observer above or below the source measures just the normal, unshifted wavelength and frequency. But relativistically, we have a time dilation effect, so for light waves emitted transversely, there is a Doppler shift of \(1/\gamma\) in frequency (or \(\gamma\) in wavelength).
The other simple case is the one in which the relative motion of the source and receiver is longitudinal, i.e., they are either approaching or receding from one another. For example, distant galaxies are receding from our galaxy due to the expansion of the universe, and this expansion was originally detected because Doppler shifts toward the red (low-frequency) end of the spectrum were observed.
Nonrelativistically, we would expect the light from such a galaxy to be Doppler shifted down in frequency by some factor, which would depend on the relative velocities of three different objects: the source, the wave's medium, and the receiver. Relativistically, things get simpler, because light isn't a vibration of a physical medium, so the Doppler shift can only depend on a single velocity \(v\), which is the rate at which the separation between the source and the receiver is increasing.
The square in figure ah is the “graph paper” used by someone who considers the source to be at rest, while the parallelogram plays a similar role for the receiver. The figure is drawn for the case where \(v=3/5\) (in units where \(c=1\)), and in this case the stretch factor of the long diagonal is 2. To keep the area the same, the short diagonal has to be squished to half its original size. But now it's a matter of simple geometry to show that OP equals half the width of the square, and this tells us that the Doppler shift is a factor of 1/2 in frequency. That is, the squish factor of the short diagonal is interpreted as the Doppler shift. To get this as a general equation for velocities other than 3/5, one can show by straightforward fiddling with the result of part c of problem 7 on p. 439 that the Doppler shift is
\[ D(v)=\sqrt{\frac{1-v}{1+v}} . \]
Here \(v>0\) is the case where the source and receiver are getting farther apart, \(v\lt0\) the case where they are approaching. (This is the opposite of the sign convention used in subsection 6.1.5. It is convenient to change conventions here so that we can use positive values of \(v\) in the case of cosmological red-shifts, which are the most important application.)
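Here is a quick numerical check of the longitudinal Doppler shift, written as a sketch that assumes the form \(D(v)=\sqrt{(1-v)/(1+v)}\) with the sign convention just described (\(v>0\) for receding, \(c=1\)). The function name is made up for illustration.

```python
import math


def doppler(v):
    """Ratio of received to emitted frequency for longitudinal motion.

    v > 0 means source and receiver are moving apart; units with c = 1.
    """
    return math.sqrt((1.0 - v) / (1.0 + v))


print(doppler(0.6))                   # the case drawn in figure ah
print(doppler(-0.6))                  # same speed, but approaching
print(doppler(0.6) * doppler(-0.6))   # product of the two shifts
```

For \(v=3/5\) the frequency is halved, as found from the geometry of the figure, and reversing the sign of \(v\) gives the reciprocal, so the product of the two shifts is 1.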
Suppose that Alice stays at home on earth while her twin Betty takes off in her rocket ship at 3/5 of the speed of light. When I first learned relativity, the thing that caused me the most pain was understanding how each observer could say that the other was the one whose time was slow. It seemed to me that if I could take a pill that would speed up my mind and my body, then naturally I would see everybody else as being slow. Shouldn't the same apply to relativity? But suppose Alice and Betty get on the radio and try to settle who is the fast one and who is the slow one. Each twin's voice sounds slooooowed doooowwwwn to the other. If Alice claps her hands twice, at a time interval of one second by her clock, Betty hears the hand-claps coming over the radio two seconds apart, but the situation is exactly symmetric, and Alice hears the same thing if Betty claps. Each twin analyzes the situation using a diagram identical to ah, and attributes her sister's observations to a complicated combination of time distortion, the time taken by the radio signals to propagate, and the motion of her twin relative to her.
Turn your book upside-down and reinterpret figure ah.
(answer in the back of the PDF version of the book)
The result of example 12 was the basis of one of the earliest laboratory tests of special relativity, by Ives and Stilwell in 1938. They observed the light emitted by beams of excited \(\text{H}_2^+\) and \(\text{H}_3^+\) ions with speeds of a few tenths of a percent of \(c\). Measuring the light from both ahead of and behind the beams, they found that the product of the Doppler shifts \(D(v)D(-v)\) was equal to 1, as predicted by relativity. If relativity had been false, then one would have expected the product to differ from 1 by an amount that would have been detectable in their experiment. In 2003, Saathoff et al. carried out an extremely precise version of the Ives-Stilwell technique with \(\text{Li}^+\) ions moving at 6.4% of \(c\). The frequencies observed, in units of MHz, were:
\(f_\text{o}\) | \(= 546466918.8\pm0.4\) | (unshifted frequency) |
\(f_\text{o}D(-v)\) | \(= 582490203.44\pm.09\) | (shifted frequency, forward) |
\(f_\text{o}D(v)\) | \(= 512671442.9\pm0.5\) | (shifted frequency, backward) |
\(\sqrt{f_\text{o}D(-v)\cdot f_\text{o}D(v)}\) | \(= 546466918.6\pm0.3\) | |
The results show incredibly precise agreement between \(f_\text{o}\) and \(\sqrt{f_\text{o}D(-v)\cdot f_\text{o} D(v)}\), as expected relativistically because \(D(v)D(-v)\) is supposed to equal 1. The agreement extends to 9 significant figures, whereas if relativity had been false there should have been a relative disagreement of about \(v^2=.004\), i.e., a discrepancy in the third significant figure. The spectacular agreement with theory has made this experiment a lightning rod for anti-relativity kooks.
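The comparison can be reproduced from the three measured frequencies. The sketch below takes the central values quoted above and ignores the error bars. It also recovers the ion velocity from the ratio of the shifted frequencies, which relies on the assumed form \(D(v)=\sqrt{(1-v)/(1+v)}\), so that \(D(-v)/D(v)=(1+v)/(1-v)\). The variable names are hypothetical.

```python
import math

# Central values in MHz, as quoted in the text (uncertainties ignored).
f_o = 546466918.8          # unshifted
f_forward = 582490203.44   # f_o * D(-v)
f_backward = 512671442.9   # f_o * D(v)

# If D(v) * D(-v) = 1, the geometric mean of the shifted frequencies
# should reproduce the unshifted frequency.
geometric_mean = math.sqrt(f_forward * f_backward)
relative_disagreement = abs(geometric_mean - f_o) / f_o

# Ion velocity from the ratio of the shifted frequencies,
# assuming D(-v) / D(v) = (1 + v) / (1 - v).
ratio = f_forward / f_backward
v = (ratio - 1.0) / (ratio + 1.0)

print(f"geometric mean        = {geometric_mean:.1f} MHz")
print(f"relative disagreement = {relative_disagreement:.1e}")
print(f"v = {v:.4f},  v^2 = {v * v:.4f}")
```

The relative disagreement is many orders of magnitude smaller than \(v^2\), which is the size of the discrepancy one would expect without the relativistic time dilation factor.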
We saw on p. 394 that relativistic velocities should not be expected to be exactly additive, and problem 1 on p. 438 verifies this in the special case where A moves relative to B at \(0.6c\) and B relative to C at \(0.6c\) --- the result not being \(1.2c\). The relativistic Doppler shift provides a simple way of deriving a general equation for the relativistic combination of velocities; problem 17 on p. 442 guides you through the steps of this derivation, and the result is given on p. 936.
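One way to see how the Doppler shift leads to a rule for combining velocities is to assume that the shift factors for collinear motion simply multiply: if B sees A's signals shifted by \(D(u)\) and C sees B's shifted by \(D(v)\), then C sees A's shifted by \(D(u)D(v)\). The sketch below makes that assumption, along with the form \(D(v)=\sqrt{(1-v)/(1+v)}\), and inverts the result to get a velocity. It is an illustration of the idea, not necessarily the sequence of steps in problem 17, and the function names are made up.

```python
import math


def doppler(v):
    """Longitudinal Doppler factor, v > 0 for receding, c = 1."""
    return math.sqrt((1.0 - v) / (1.0 + v))


def velocity_from_doppler(d):
    """Invert D = sqrt((1 - v)/(1 + v)) to recover v."""
    return (1.0 - d * d) / (1.0 + d * d)


def combine(u, v):
    """Combine collinear velocities by multiplying their Doppler factors."""
    return velocity_from_doppler(doppler(u) * doppler(v))


print(combine(0.6, 0.6))    # not 1.2
print(combine(0.9, 0.9))    # still less than 1
print(combine(1e-4, 2e-4))  # low speeds: very nearly additive
```

Two velocities of \(0.6c\) combine to about \(0.88c\), and at low speeds the result is indistinguishable from simple addition, as the correspondence principle requires.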
So far we have said nothing about how to predict motion in relativity. Do Newton's laws still work? Do conservation laws still apply? The answer is yes, but many of the definitions need to be modified, and certain entirely new phenomena occur, such as the equivalence of energy and mass, as described by the famous equation \(E=mc^2\).
Consider the following scheme for traveling faster than the speed of light. The basic idea can be demonstrated by dropping a ping-pong ball and a baseball stacked on top of each other like a snowman. They separate slightly in mid-air, and the baseball therefore has time to hit the floor and rebound before it collides with the ping-pong ball, which is still on the way down. The result is a surprise if you haven't seen it before: the ping-pong ball flies off at high speed and hits the ceiling! A similar fact is known to people who investigate the scenes of accidents involving pedestrians. If a car moving at 90 kilometers per hour hits a pedestrian, the pedestrian flies off at nearly double that speed, 180 kilometers per hour. Now suppose the car was moving at 90 percent of the speed of light. Would the pedestrian fly off at 180% of \(c\)?
To see why not, we have to back up a little and think about where this speed-doubling result comes from. For any collision, there is a special frame of reference, the center-of-mass frame, in which the two colliding objects approach each other, collide, and rebound with their velocities reversed. In the center-of-mass frame, the total momentum of the objects is zero both before and after the collision.
Figure a/1 shows such a frame of reference for objects of very unequal mass. Before the collision, the large ball is moving relatively slowly toward the top of the page, but because of its greater mass, its momentum cancels the momentum of the smaller ball, which is moving rapidly in the opposite direction. The total momentum is zero. After the collision, the two balls just reverse their directions of motion. We know that this is the right result for the outcome of the collision because it conserves both momentum and kinetic energy, and everything not forbidden is mandatory, i.e., in any experiment, there is only one possible outcome, which is the one that obeys all the conservation laws.
How do we know that momentum and kinetic energy are conserved in figure a/1?
(answer in the back of the PDF version of the book)
Let's make up some numbers as an example. Say the small ball has a mass of 1 kg, the big one 8 kg. In frame 1, let's make the velocities as follows:
| before the collision | after the collision |
small ball | -0.8 | 0.8 |
big ball | 0.1 | -0.1 |
Figure a/2 shows the same collision in a frame of reference where the small ball was initially at rest. To find all the velocities in this frame, we just add 0.8 to all the ones in the previous table.
| before the collision | after the collision |
small ball | 0 | 1.6 |
big ball | 0.9 | 0.7 |
In this frame, as expected, the small ball flies off with a velocity, 1.6, that is almost twice the initial velocity of the big ball, 0.9.
If all those velocities were in meters per second, then that's exactly what happened. But what if all these velocities were in units of the speed of light? Now it's no longer a good approximation just to add velocities. We need to combine them according to the relativistic rules. For instance, the technique used in problem 1 can be used to show that combining a velocity of 0.8 times the speed of light with another velocity of 0.8 results in 0.98, not 1.6. The results are very different:
| before the collision | after the collision |
small ball | 0 | 0.98 |
big ball | 0.83 | 0.76 |
We can interpret this as follows. Figure a/1 is one in which the big ball is moving fairly slowly. This is very nearly the way the scene would be seen by an ant standing on the big ball. According to an observer in frame b, however, both balls are moving at nearly the speed of light after the collision. Because of this, the balls appear foreshortened, but the distance between the two balls is also shortened. To this observer, it seems that the small ball isn't pulling away from the big ball very fast.
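Both versions of the second table can be generated from the center-of-mass velocities. The sketch below assumes the standard relativistic rule for combining collinear velocities, \((u+v)/(1+uv)\) in units with \(c=1\), which this section cites but does not write out. The dictionary layout and names are made up for illustration.

```python
def combine(u, v):
    """Relativistic combination of collinear velocities, with c = 1."""
    return (u + v) / (1.0 + u * v)


# Velocities in the center-of-mass frame (figure a/1): (before, after).
cm_frame = {"small ball": (-0.8, 0.8), "big ball": (0.1, -0.1)}
shift = 0.8  # velocity added to reach the frame where the small ball starts at rest

for name, (before, after) in cm_frame.items():
    simple = (before + shift, after + shift)
    relativistic = (combine(before, shift), combine(after, shift))
    print(f"{name:10s}  simple addition: {simple[0]:.2f} -> {simple[1]:.2f}"
          f"   relativistic: {relativistic[0]:.2f} -> {relativistic[1]:.2f}")
```

If the velocities are in meters per second the difference between the two rules is far too small to notice, but as fractions of \(c\) the two columns differ dramatically, and no combined velocity exceeds 1.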
Now here's what's interesting about all this. The outcome shown in figure a/2 was supposed to be the only one possible, the only one that satisfied both conservation of energy and conservation of momentum. So how can the different result shown in figure b be possible? The answer is that relativistically, momentum must not equal \(mv\). The old, familiar definition is only an approximation that's valid at low speeds. If we observe the behavior of the small ball in figure b, it looks as though it somehow had some extra inertia. It's as though a football player tried to knock another player down without realizing that the other guy had a three-hundred-pound bag full of lead shot hidden under his uniform --- he just doesn't seem to react to the collision as much as he should. As proved in section 7.3.4, this extra inertia is described by redefining momentum as
\[ p = m\gamma v . \]
At very low velocities, \(\gamma\) is close to 1, and the result is very nearly \(mv\), as demanded by the correspondence principle. But at very high velocities, \(\gamma\) gets very big --- the small ball in figure b has a \(\gamma\) of 5.0, and therefore has five times more inertia than we would expect nonrelativistically.
This also explains the answer to another paradox often posed by beginners at relativity. Suppose you keep on applying a steady force to an object that's already moving at \(0.9999c\). Why doesn't it just keep on speeding up past \(c\)? The answer is that force is the rate of change of momentum. At \(0.9999c\), an object already has a \(\gamma\) of 71, and therefore has already sucked up 71 times the momentum you'd expect at that speed. As its velocity gets closer and closer to \(c\), its \(\gamma\) approaches infinity. To move at \(c\), it would need an infinite momentum, which could only be caused by an infinite force.
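The two values of \(\gamma\) quoted above, and the way a steady force fails to push an object past \(c\), can be illustrated with a minimal sketch. It assumes units in which \(c=1\); the helper names are hypothetical, and `speed_from_momentum` simply inverts \(p=m\gamma v\) for \(v\).

```python
import math

def gamma(v):
    """Lorentz factor for speed v, with v in units of c (requires |v| < 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def speed_from_momentum(p, m=1.0):
    """Invert p = m*gamma*v for v, in units where c = 1."""
    return p / math.sqrt(m * m + p * p)

print(round(gamma(0.98), 1))   # small ball in figure b
print(round(gamma(0.9999)))    # object moving at 0.9999c

# A steady force adds momentum at a constant rate, so we step the momentum
# upward and watch the speed creep toward 1 (that is, toward c).
speeds = [speed_from_momentum(p) for p in (1.0, 10.0, 100.0, 1000.0)]
print(speeds)
```

Each tenfold increase in momentum moves the speed closer to \(c\) without reaching it. For much larger momenta the difference from 1 falls below floating-point precision, which is a limitation of the sketch rather than of the physics.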
Now we're ready to see why mass and energy must be equivalent as claimed in the famous \(E=mc^2\). So far we've only considered collisions in which none of the kinetic energy is converted into any other form of energy, such as heat or sound. Let's consider what happens if a blob of putty moving at velocity \(v\) hits another blob that is initially at rest, sticking to it. The nonrelativistic result is that to obey conservation of momentum the two blobs must fly off together at \(v/2\). Half of the initial kinetic energy has been converted to heat.^{4}
Relativistically, however, an interesting thing happens. A hot object has more momentum than a cold object! This is because the relativistically correct expression for momentum is \(m\gamma v\), and the more rapidly moving atoms in the hot object have higher values of \(\gamma\). In our collision, the final combined blob must therefore be moving a little more slowly than the expected \(v/2\), since otherwise the final momentum would have been a little greater than the initial momentum. To an observer who believes in conservation of momentum and knows only about the overall motion of the objects and not about their heat content, the low velocity after the collision would seem to be the result of a magical change in the mass, as if the mass of two combined, hot blobs of putty was more than the sum of their individual masses.
Now we know that the masses of all the atoms in the blobs must be the same as they always were. The change is due to the change in \(\gamma\) with heating, not to a change in mass. The heat energy, however, seems to be acting as if it was equivalent to some extra mass.
But this whole argument was based on the fact that heat is a form of kinetic energy at the atomic level. Would \(E=mc^2\) apply to other forms of energy as well? Suppose a rocket ship contains some electrical energy stored in a battery. If we believed that \(E=mc^2\) applied to forms of kinetic energy but not to electrical energy, then we would have to believe that the pilot of the rocket could slow the ship down by using the battery to run a heater! This would not only be strange, but it would violate the principle of relativity, because the result of the experiment would be different depending on whether the ship was at rest or not. The only logical conclusion is that all forms of energy are equivalent to mass. Running the heater then has no effect on the motion of the ship, because the total energy in the ship was unchanged; one form of energy (electrical) was simply converted to another (heat).
The equation \(E=mc^2\) tells us how much energy is equivalent to how much mass: the conversion factor is the square of the speed of light, \(c\). Since \(c\) is a big number, you get a really really big number when you multiply it by itself to get \(c^2\). This means that even a small amount of mass is equivalent to a very large amount of energy.
A star with sufficiently strong gravity can prevent light from leaving. Quite a few black holes have been detected via their gravitational forces on neighboring stars or clouds of gas and dust.
You've learned about conservation of mass and conservation of energy, but now we see that they're not even separate conservation laws. As a consequence of the theory of relativity, mass and energy are equivalent, and are not separately conserved --- one can be converted into the other. Imagine that a magician waves his wand, and changes a bowl of dirt into a bowl of lettuce. You'd be impressed, because you were expecting that both dirt and lettuce would be conserved quantities. Neither one can be made to vanish, or to appear out of thin air. However, there are processes that can change one into the other. A farmer changes dirt into lettuce, and a compost heap changes lettuce into dirt. At the most fundamental level, lettuce and dirt aren't really different things at all; they're just collections of the same kinds of atoms --- carbon, hydrogen, and so on. Because mass and energy are like two different sides of the same coin, we may speak of mass-energy, a single conserved quantity, found by adding up all the mass and energy, with the appropriate conversion factor: \(E+mc^2\).
\(\triangleright\) The energy will appear as heat, which will be lost to the environment. The total mass-energy of the cup, water, and iron will indeed be lessened by 0.5 MJ. (If it had been perfectly insulated, there would have been no change, since the heat energy would have been trapped in the cup.) The speed of light is \(c=3\times10^8\) meters per second, so converting to mass units, we have \(\Delta m=\Delta E/c^2=(-0.5\times10^6\ \text{J})/(3\times10^8\ \text{m/s})^2\approx-6\times10^{-12}\ \text{kg}\).
The change in mass is too small to measure with any practical technique. This is because the square of the speed of light is such a large number.
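As a rough numerical check of the conversion, the sketch below applies \(m=E/c^2\) to the 0.5 MJ of the example, and \(E=mc^2\) to a hypothetical one-gram mass that does not appear in the text. The function names are illustrative, and \(c\) is rounded to \(3\times10^8\) m/s as above.

```python
C = 3.0e8  # speed of light in m/s, rounded as in the text

def mass_equivalent(energy_joules):
    """Mass (kg) equivalent to a given energy (J), from m = E / c^2."""
    return energy_joules / C**2

def energy_equivalent(mass_kg):
    """Energy (J) equivalent to a given mass (kg), from E = m c^2."""
    return mass_kg * C**2

delta_m = mass_equivalent(0.5e6)        # the 0.5 MJ lost as heat
e_gram = energy_equivalent(1.0e-3)      # hypothetical 1 gram of mass

print(f"{delta_m:.1e} kg")   # a few trillionths of a kilogram
print(f"{e_gram:.1e} J")
```

The first result is of order \(10^{-12}\) kg, far below what a laboratory balance can resolve, while the second is of order \(10^{14}\) J.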
Positron annihilation forms the basis for the medical imaging technique called a PET (positron emission tomography) scan, in which a positron-emitting chemical is injected into the patient and mapped by the emission of gamma rays from the parts of the body where it accumulates.
One commonly hears some misinterpretations of \(E=mc^2\), one being that the equation tells us how much kinetic energy an object would have if it was moving at the speed of light. This wouldn't make much sense, both because the equation for kinetic energy has \(1/2\) in it, \(KE=(1/2)mv^2\), and because a material object can't be made to move at the speed of light. However, this naturally leads to the question of just how much mass-energy a moving object has. We know that when the object is at rest, it has no kinetic energy, so its mass-energy is simply equal to the energy-equivalent of its mass, \(mc^2\), \(\massenergy=mc^2\) when \(v=0\),
where the symbol \(\massenergy\) (cursive “E”) stands for mass-energy. The point of using the new symbol is simply to remind ourselves that we're talking about relativity, so an object at rest has \(\massenergy=mc^2\), not \(E=0\) as we'd assume in classical physics.
Suppose we start accelerating the object with a constant force. A constant force means a constant rate of transfer of momentum, but \(p=m\gamma v\) approaches infinity as \(v\) approaches \(c\), so the object will only get closer and closer to the speed of light, but never reach it. Now what about the work being done by the force? The force keeps doing work and doing work, which means that we keep on using up energy. Mass-energy is conserved, so the energy being expended must equal the increase in the object's mass-energy. We can continue this process for as long as we like, and the amount of mass-energy will increase without limit. We therefore conclude that an object's mass-energy approaches infinity as its speed approaches the speed of light, \(\massenergy\rightarrow\infty\) when \(v\rightarrow c\).
Now that we have some idea what to expect, what is the actual equation for the mass-energy? As proved in section 7.3.4, it is \(\massenergy=m\gamma c^2\).
Verify that this equation has the two properties we wanted. (answer in the back of the PDF version of the book)
\(\triangleright\) The speed of light is a very big number, so \(mc^2\) is a huge number of joules. The object has a gigantic amount of energy because of its mass, and only a relatively small amount of additional kinetic energy because of its motion.
Another way of seeing this is that at low speeds, \(\gamma\) is only a tiny bit greater than 1, so \(\massenergy\) is only a tiny bit greater than \(mc^2\).
\(\triangleright\) Show that the equation \(\massenergy=m\gamma c^2\) obeys the correspondence principle.
\(\triangleright\) As we accelerate an object from rest, its mass-energy becomes greater than its resting value. Classically, we interpret this excess mass-energy as the object's kinetic energy, \(KE=\massenergy(v)-\massenergy(v=0)=m\gamma c^2-mc^2=m(\gamma-1)c^2\).
Expressing \(\gamma\) as \(\left(1-v^2/c^2\right)^{-1/2}\) and making use of the approximation \((1+\epsilon)^p\approx 1+p\epsilon\) for small \(\epsilon\), we have \(\gamma\approx 1+v^2/2c^2\), so \(KE\approx m\left(1+\frac{v^2}{2c^2}-1\right)c^2=\frac{1}{2}mv^2\),
which is the classical expression. As demanded by the correspondence principle, relativity agrees with classical physics at speeds that are small compared to the speed of light.
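The agreement at low speeds, and the disagreement at high speeds, can be seen numerically in the following sketch. It compares \(m(\gamma-1)c^2\) with \((1/2)mv^2\); the function names, the one-kilogram mass, and the two sample speeds are illustrative assumptions.

```python
import math

C = 3.0e8  # speed of light in m/s, rounded

def kinetic_energy_relativistic(m, v):
    """Excess of mass-energy over the rest value: m*(gamma - 1)*c^2."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return m * (gamma - 1.0) * C**2

def kinetic_energy_classical(m, v):
    """Newtonian kinetic energy, (1/2) m v^2."""
    return 0.5 * m * v**2

m = 1.0  # hypothetical mass in kg
ratio_slow = kinetic_energy_relativistic(m, 0.01 * C) / kinetic_energy_classical(m, 0.01 * C)
ratio_fast = kinetic_energy_relativistic(m, 0.9 * C) / kinetic_energy_classical(m, 0.9 * C)

print(ratio_slow)  # very close to 1
print(ratio_fast)  # more than 3
```

At one percent of \(c\) the two expressions differ by less than one part in ten thousand, while at \(0.9c\) the classical formula underestimates the kinetic energy by more than a factor of three.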
The energy-momentum four-vector
Starting from \(\massenergy=m\gamma\) and \(p=m\gamma v\) (in units where \(c=1\)), a little algebra allows one to prove the identity \(m^2=\massenergy^2-p^2\).
We can define an energy-momentum four-vector, \(\mathbf{p}=(\massenergy,p_x,p_y,p_z)\),
and the relation \(m^2 = \massenergy^2 - p^2\) then arises from the inner product \(\mathbf{p}\cdot\mathbf{p}\). Since \(\massenergy\) and \(p\) are separately conserved, the energy-momentum four-vector is also conserved.
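The algebra behind the identity can be sketched in one line, working in units where \(c=1\) so that \(\gamma=\left(1-v^2\right)^{-1/2}\):

\[
\massenergy^2-p^2 = m^2\gamma^2-m^2\gamma^2v^2 = m^2\gamma^2\left(1-v^2\right) = m^2 .
\]

The velocity drops out entirely, which is why the combination \(\massenergy^2-p^2\) has the same value in every frame of reference.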
Applying \(m^2 = \massenergy^2 - p^2\) yields the same result, \(\massenergy=p\), much more easily. This example demonstrates that although we encountered the relations \(\massenergy=m\gamma\) and \(p=m\gamma v\) first, the identity \(m^2 = \massenergy^2 - p^2\) is actually more fundamental.
When we say that something is a four-vector, we mean that it behaves properly under a Lorentz transformation: we can draw such a four-vector on graph paper, and then when we change frames of reference, we should be able to measure the vector in the new frame of reference by using the new version of the graph-paper grid derived from the old one by a Lorentz transformation.
If we had used the energy \(E\) rather than the mass-energy \(\massenergy\) to construct the energy-momentum four-vector, we wouldn't have gotten a valid four-vector. An easy way to see this is to consider the case where a noninteracting object is at rest in some frame of reference. Its momentum and kinetic energy are both zero. If we'd defined \(\mathbf{p}=(E,p_x,p_y,p_z)\) rather than \(\mathbf{p}=(\massenergy,p_x,p_y,p_z)\), we would have had \(\mathbf{p}=0\) in this frame. But when we draw a zero vector, we get a point, and a point remains a point regardless of how we distort the graph paper we use to measure it. That wouldn't have made sense, because in other frames of reference, we have \(E\ne 0\).
The relation \( m^2 = \massenergy^2 - p^2 \) is only valid in relativistic units. If we tried to apply it without modification to numbers expressed in metric units, we would have \(\text{kg}^2=\text{kg}^2\cdot\frac{\text{m}^4}{\text{s}^4}-\text{kg}^2\cdot\frac{\text{m}^2}{\text{s}^2}\),
which would be nonsense because the three terms all have different units. As usual, we need to insert factors of \(c\) to make a metric version, and these factors of \(c\) are determined by the need to fix the broken units: \(m^2c^4=\massenergy^2-p^2c^2\).
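A quick numerical check of the metric form of the identity, \(m^2c^4=\massenergy^2-p^2c^2\), is sketched below. The mass of 2 kg and the speed of \(0.6c\) are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_gamma(v):
    """Lorentz factor for a speed v given in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def mass_energy(m, v):
    """Mass-energy in joules: m * gamma * c^2."""
    return m * lorentz_gamma(v) * C**2

def momentum(m, v):
    """Relativistic momentum in kg*m/s: m * gamma * v."""
    return m * lorentz_gamma(v) * v

m = 2.0        # hypothetical mass in kg
v = 0.6 * C    # hypothetical speed

lhs = (m * C**2) ** 2
rhs = mass_energy(m, v) ** 2 - (momentum(m, v) * C) ** 2
print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

Every term now carries units of joules squared, so the subtraction is meaningful, and the result does not depend on the speed chosen.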
Pair production cannot happen in a vacuum. For example, gamma rays from distant black holes can travel through empty space for thousands of years before being detected on earth, and they don't turn into electron-positron pairs before they can get here. Pair production can only happen in the presence of matter. When lead is used as shielding against gamma rays, one of the ways the gamma rays can be stopped in the lead is by undergoing pair production.
To see why pair production is forbidden in a vacuum, consider the process in the frame of reference in which the electron-positron pair has zero total momentum. In this frame, the gamma ray would have to have had zero momentum, but a gamma ray with zero momentum must have zero energy as well (example 20). This means that conservation of four-momentum has been violated: the timelike component of the four-momentum is the mass-energy, and it has increased from 0 in the initial state to at least \(2mc^2\) in the final state.
Proofs
This optional section proves some results claimed earlier.
We start by considering the case of a particle, described as “ultrarelativistic,” that travels at very close to the speed of light. A good way of thinking about such a particle is that it's one with a very small mass. For example, the subatomic particle called the neutrino has a very small mass, thousands of times smaller than that of the electron. Neutrinos are emitted in radioactive decay, and because the neutrino's mass is so small, the amount of energy available in these decays is always enough to accelerate it to very close to the speed of light. Nobody has ever succeeded in observing a neutrino that was not ultrarelativistic. When a particle's mass is very small, the mass becomes difficult to measure. For almost 70 years after the neutrino was discovered, its mass was thought to be zero. Similarly, we currently believe that a ray of light has no mass, but it is always possible that its mass will be found to be nonzero at some point in the future. A ray of light can be modeled as an ultrarelativistic particle.
Let's compare ultrarelativistic particles with train cars. A single car with kinetic energy \(E\) has different properties than a train of two cars each with kinetic energy \(E/2\). The single car has half the mass and a speed that is greater by a factor of \(\sqrt{2}\). But the same is not true for ultrarelativistic particles. Since an idealized ultrarelativistic particle has a mass too small to be detectable in any experiment, we can't detect the difference between \(m\) and \(2m\). Furthermore, ultrarelativistic particles move at close to \(c\), so there is no observable difference in speed. Thus we expect that a single ultrarelativistic particle with energy \(E\) compared with two such particles, each with energy \(E/2\), should have all the same properties as measured by a mechanical detector.
An idealized zero-mass particle also has no frame in which it can be at rest. It always travels at \(c\), and no matter how fast we chase after it, we can never catch up. We can, however, observe it in different frames of reference, and we will find that its energy is different. For example, distant galaxies are receding from us at substantial fractions of \(c\), and when we observe them through a telescope, they appear very dim not just because they are very far away but also because their light has less energy in our frame than in a frame at rest relative to the source. This effect must be such that changing frames of reference according to a specific Lorentz transformation always changes the energy of the particle by a fixed factor, regardless of the particle's original energy; for if not, then the effect of a Lorentz transformation on a single particle of energy \(E\) would be different from its effect on two particles of energy \(E/2\).
How does this energy-shift factor depend on the velocity \(v\) of the Lorentz transformation? Rather than \(v\), it becomes more convenient to express things in terms of the Doppler shift factor \(D\), which multiplies when we change frames of reference. Let's write \(f(D)\) for the energy-shift factor that results from a given Lorentz transformation. Since a Lorentz transformation \(D_1\) followed by a second transformation \(D_2\) is equivalent to a single transformation by \(D_1D_2\), we must have \(f(D_1D_2)=f(D_1)f(D_2)\). This tightly constrains the form of the function \(f\); it must be something like \(f(D)=D^n\), where \(n\) is a constant. We postpone until p. 423 the proof that \(n=1\), which is also in agreement with experiments with rays of light.
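The statement that Doppler factors multiply when transformations are applied in succession can be checked with a short sketch. It assumes the standard longitudinal Doppler factor \(D(v)=\sqrt{(1+v)/(1-v)}\) in units of \(c\), a formula that is not written out in this passage, and the two sample velocities are arbitrary.

```python
import math

def doppler(v):
    """Longitudinal Doppler shift factor for velocity v, in units of c."""
    return math.sqrt((1.0 + v) / (1.0 - v))

def combine_velocities(u, v):
    """Relativistic combination of two collinear velocities, in units of c."""
    return (u + v) / (1.0 + u * v)

v1, v2 = 0.5, 0.6  # arbitrary sample velocities

d_product = doppler(v1) * doppler(v2)
d_combined = doppler(combine_velocities(v1, v2))

print(math.isclose(d_product, d_combined, rel_tol=1e-12))
```

Velocities combine in a complicated way, but the corresponding \(D\) values simply multiply, which is what makes \(D\) the convenient variable for the argument about \(f\).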
Our final result is that the energy of an ultrarelativistic particle is simply proportional to its Doppler shift factor \(D\). Even in the case where the particle is truly massless, so that \(D\) doesn't have any finite value, we can still find how the energy differs according to different observers by finding the \(D\) of the Lorentz transformation between the two observers' frames of reference.
The following argument is due to Einstein. Suppose that a material object O of mass \(m\), initially at rest in a certain frame A, emits two rays of light, each with energy \(E/2\). By conservation of energy, the object must have lost an amount of energy equal to \(E\). By symmetry, O remains at rest.
We now switch to a new frame of reference moving at a certain velocity \(v\) in the \(z\) direction relative to the original frame. We assume that O's energy is different in this frame, but that the change in its energy amounts to multiplication by some unitless factor \(x\), which depends only on \(v\), since there is nothing else it could depend on that could allow us to form a unitless quantity. In this frame the light rays have energies \((E/2)D(v)\) and \((E/2)D(-v)\). If conservation of energy is to hold in the new frame as it did in the old, we must have \(2xE=ED(v)+ED(-v)\). After some algebra, we find \(x=1/\sqrt{1-v^2}\), which we recognize as \(\gamma\). This proves that \(E=m\gamma\) for a material object.
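The algebra can be sketched as follows, assuming the standard Doppler factor \(D(v)=\sqrt{(1+v)/(1-v)}\) in units where \(c=1\) (the formula itself is not written out in this passage). Dividing the conservation condition by \(2E\) gives

\[
x=\frac{D(v)+D(-v)}{2}
=\frac{1}{2}\left(\sqrt{\frac{1+v}{1-v}}+\sqrt{\frac{1-v}{1+v}}\right)
=\frac{1}{2}\cdot\frac{(1+v)+(1-v)}{\sqrt{(1-v)(1+v)}}
=\frac{1}{\sqrt{1-v^2}} .
\]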
We've seen that ultrarelativistic particles are “generic,” in the sense that they have no individual mechanical properties other than an energy and a direction of motion. Therefore the relationship between energy and momentum must be linear for ultrarelativistic particles. Indeed, experiments verify that light has momentum, and doubling the energy of a ray of light doubles its momentum rather than quadrupling it. On a graph of \(p\) versus \(E\), massless particles, which have \(E\propto|p|\), lie on two diagonal lines that connect at the origin. If we like, we can pick units such that the slopes of these lines are plus and minus one. Material particles lie to the right of these lines. For example, a car sitting in a parking lot has \(p=0\) and \(E=mc^2\).
Now what happens to such a graph when we change to a different frame of reference that is in motion relative to the original frame? A massless particle still has to act like a massless particle, so the diagonals are simply stretched or contracted along their own lengths. In fact the transformation must be linear (p. 387), because conservation of energy and momentum involve addition, and we need these laws to be valid in all frames of reference. By the same reasoning as in figure j on p. 389, the transformation must be area-preserving. We then have the same three cases to consider as in figure g on p. 388. Case I is ruled out because it would imply that particles keep the same energy when we change frames. (This is what would happen if \(c\) were infinite, so that the mass-equivalent \(E/c^2\) of a given energy was zero, and therefore \(E\) would be interpreted purely as the mass.) Case II can't be right because it doesn't preserve the \(E=|p|\) diagonals. We are left with case III, which establishes the fact that the \(p\)-\(E\) plane transforms according to exactly the same kind of Lorentz transformation as the \(x\)-\(t\) plane. That is, \((E,p_x,p_y,p_z)\) is a four-vector.
The only remaining issue to settle is whether the choice of units that gives invariant 45-degree diagonals in the \(x\)-\(t\) plane is the same as the choice of units that gives such diagonals in the \(p\)-\(E\) plane. That is, we need to establish that the \(c\) that applies to \(x\) and \(t\) is equal to the \(c'\) needed for \(p\) and \(E\), i.e., that the velocity scales of the two graphs are matched up. This is true because in the Newtonian limit, the total mass-energy \(E\) is essentially just the particle's mass, and then \(p/E \approx p/m \approx v\). This establishes that the velocity scales are matched at small velocities, which implies that they coincide for all velocities, since a large velocity, even one approaching \(c\), can be built up from many small increments. (This also establishes that the exponent \(n\) defined on p. 422 equals 1 as claimed.)
Since \(m^2=E^2-p^2\), it follows that for a material particle, \(p=m\gamma v\).
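The intermediate step can be sketched explicitly, taking \(E=m\gamma\) as established above and working in units where \(c=1\):

\[
p^2=E^2-m^2=m^2\left(\gamma^2-1\right)=m^2\gamma^2v^2 ,
\qquad\text{since}\qquad
\gamma^2-1=\frac{v^2}{1-v^2}=\gamma^2v^2 .
\]

Taking the square root, with the sign chosen so that \(p\) points along \(v\), gives \(p=m\gamma v\).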
What you've learned so far about relativity is known as the special theory of relativity, which is compatible with three of the four known forces of nature: electromagnetism, the strong nuclear force, and the weak nuclear force. Gravity, however, can't be shoehorned into the special theory. In order to make gravity work, Einstein had to generalize relativity. The resulting theory is known as the general theory of relativity.^{5}
Euclid proved thousands of years ago that the angles in a triangle add up to \(180°\). But what does it really mean to “prove” this? Euclid proved it based on certain assumptions (his five postulates), listed in the margin of this page. But how do we know that the postulates are true?
Only by observation can we tell whether any of Euclid's statements are correct characterizations of how space actually behaves in our universe. If we draw a triangle on paper with a ruler and measure its angles with a protractor, we will quickly verify to pretty good precision that the sum is close to \(180°\). But of course we already knew that space was at least approximately Euclidean. If there had been any gross error in Euclidean geometry, it would have been detected in Euclid's own lifetime. The correspondence principle tells us that if there is going to be any deviation from Euclidean geometry, it must be small under ordinary conditions.
To improve the precision of the experiment, we need to make sure that our ruler is very straight. One way to check would be to sight along it by eye, which amounts to comparing its straightness to that of a ray of light. For that matter, we might as well throw the physical ruler in the trash and construct our triangle out of three laser beams. To avoid effects from the air we should do the experiment in outer space. Doing it in space also has the advantage of allowing us to make the triangle very large; as shown in figure a, the discrepancy from \(180°\) is expected to be proportional to the area of the triangle.
But we already know that light rays are bent by gravity. We expect it based on \(E=mc^2\), which tells us that the energy of a light ray is equivalent to a certain amount of mass, and furthermore it has been verified experimentally by the deflection of starlight by the sun (example 14, p. 416). We therefore know that our universe is noneuclidean, and we gain the further insight that the level of deviation from Euclidean behavior depends on gravity.
Since the noneuclidean effects are bigger when the system being studied is larger, we expect them to be especially important in the study of cosmology, where the distance scales are very large.
An Einstein's ring, figure b, is formed when there is a chance alignment of a distant source with a closer gravitating body. This type of gravitational lensing is direct evidence for the noneuclidean nature of space. The two light rays are lines, and they violate Euclid's first postulate, that two points determine a line.
One could protest that effects like these are just an imperfection of the light rays as physical models of straight lines. Maybe the noneuclidean effects would go away if we used something better and straighter than a light ray. But we don't know of anything straighter than a light ray. Furthermore, we observe that all measuring devices, not just optical ones, report the same noneuclidean behavior.
An example of such a non-optical measurement is the Gravity Probe B satellite, figure d, which was launched into a polar orbit in 2004 and operated until 2010. The probe carried four gyroscopes made of quartz, which were the most perfect spheres ever manufactured, varying from sphericity by no more than about 40 atoms. Each gyroscope floated weightlessly in a vacuum, so that its rotation was perfectly steady. After 5000 orbits, the gyroscopes had reoriented themselves by about \(2\times10^{-3}°\) relative to the distant stars. This effect cannot be explained by Newtonian physics, since no torques acted on them. It was, however, exactly as predicted by Einstein's theory of general relativity. It becomes easier to see why such an effect would be expected due to the noneuclidean nature of space if we characterize euclidean geometry as the geometry of a flat plane as opposed to a curved one. On a curved surface like a sphere, figure c, Euclid's fifth postulate fails, and it's not hard to see that we can get triangles for which the sum of the angles is not \(180°\). By transporting a gyroscope all the way around the edges of such a triangle and back to its starting point, we change its orientation.
The triangle in figure c has angles that add up to more than \(180°\). This type of curvature is referred to as positive. It is also possible to have negative curvature, as in figure e.
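For a sphere, the size of the excess over \(180°\) can be estimated with a small sketch. It assumes Girard's theorem, a standard result of spherical geometry that is not stated in the text: the angle sum of a triangle on a sphere of radius \(R\) exceeds \(180°\) by the triangle's area divided by \(R^2\), measured in radians. The function name and the sample triangles are hypothetical, and need not match the one drawn in figure c.

```python
import math

def angle_sum_degrees(area, radius):
    """Angle sum of a spherical triangle, assuming Girard's theorem:
    sum = 180 degrees + (area / radius^2) converted from radians."""
    return 180.0 + math.degrees(area / radius**2)

R = 1.0  # sphere radius in arbitrary units

# A triangle covering one octant of the sphere (one eighth of its surface),
# bounded by the equator and two meridians 90 degrees apart.
octant_area = 4.0 * math.pi * R**2 / 8.0
print(angle_sum_degrees(octant_area, R))   # about 270: three right angles

# A triangle that is tiny compared to the sphere is very nearly Euclidean.
print(angle_sum_degrees(1.0e-8, R))        # about 180
```

The excess grows in proportion to the area, so small triangles look flat and only large ones reveal the curvature.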
In general relativity, curvature isn't just something caused by gravity. Gravity is curvature, and the curvature involves both space and time, as may become clearer once you get to figure k. Thus the distinction between special and general relativity is that general relativity handles curved spacetime, while special relativity is restricted to the case where spacetime is flat.
Although we often visualize curvature by imagining embedding a two-dimensional surface in a three-dimensional space, that's just an aid in visualization. There is no evidence for any additional dimensions, nor is it necessary to hypothesize them in order to let spacetime be curved as described in general relativity.
Put yourself in the shoes of a two-dimensional being living in a two-dimensional space. Euclid's postulates all refer to constructions that can be performed using a compass and an unmarked straightedge. If this being can physically verify them all as descriptions of the space she inhabits, then she knows that her space is Euclidean, and that propositions such as the Pythagorean theorem are physically valid in her universe. But the diagram in f/1 illustrating the proof of the Pythagorean theorem in Euclid's Elements (proposition I.47) is equally valid if the page is rolled onto a cylinder, 2, or formed into a wavy corrugated shape, 3. These types of curvature, which can be achieved without tearing or crumpling the surface, are not real to her. They are simply side-effects of visualizing her two-dimensional universe as if it were embedded in a hypothetical third dimension --- which doesn't exist in any sense that is empirically verifiable to her. Of the curved surfaces in figure f, only the sphere, 4, has curvature that she can measure; the diagram can't be plastered onto the sphere without folding or cutting and pasting.
So the observation of curvature doesn't imply the existence of extra dimensions, nor does embedding a space in a higher-dimensional one so that it looks curvy always mean that there will be any curvature detectable from within the lower-dimensional space.
Although light rays and gyroscopes seem to agree that space is curved in a gravitational field, it's always conceivable that we could find something else that would disagree. For example, suppose that there is a new and improved ray called the \(\text{StraightRay}^\text{TM}\). The StraightRay is like a light ray, but when we construct a triangle out of StraightRays, we always get the Euclidean result for the sum of the angles. We would then have to throw away general relativity's whole idea of describing gravity in terms of curvature. One good way of making a StraightRay would be if we had a supply of some kind of exotic matter --- call it \(\text{FloatyStuff}^\text{TM}\) --- that had the ordinary amount of inertia, but was completely unaffected by gravity. We could then shoot a stream of FloatyStuff particles out of a nozzle at nearly the speed of light and make a StraightRay.
Normally when we release a material object in a gravitational field, it experiences a force \(mg\), and then by Newton's second law its acceleration is \(a=F/m=mg/m=g\). The \(m\)'s cancel, which is the reason that everything falls with the same acceleration (in the absence of other forces such as air resistance). The universality of this behavior is what allows us to interpret the gravity geometrically in general relativity. For example, the Gravity Probe B gyroscopes were made out of quartz, but if they had been made out of something else, it wouldn't have mattered. But if we had access to some FloatyStuff, the geometrical picture of gravity would fail, because the “\(m\)” that described its susceptibility to gravity would be a different “\(m\)” than the one describing its inertia.
The question of the existence or nonexistence of such forms of matter turns out to be related to the question of what kinds of motion are relative. Let's say that alien gangsters land in a flying saucer, kidnap you out of your back yard, konk you on the head, and take you away. When you regain consciousness, you're locked up in a sealed cabin in their spaceship. You pull your keychain out of your pocket and release it, and you observe that it accelerates toward the floor with an acceleration that seems quite a bit slower than what you're used to on earth, perhaps a third of a gee. There are two possible explanations for this. One is that the aliens have taken you to some other planet, maybe Mars, where the strength of gravity is a third of what we have on earth. The other is that your keychain didn't really accelerate at all: you're still inside the flying saucer, which is accelerating at a third of a gee, so that it was really the deck that accelerated up and hit the keys.
There is absolutely no way to tell which of these two scenarios is actually the case --- unless you happen to have a chunk of FloatyStuff in your other pocket. If you release the FloatyStuff and it hovers above the deck, then you're on another planet and experiencing genuine gravity; your keychain responded to the gravity, but the FloatyStuff didn't. But if you release the FloatyStuff and see it hit the deck, then the flying saucer is accelerating through outer space.
The nonexistence of FloatyStuff in our universe is called the equivalence principle. If the equivalence principle holds, then an acceleration (such as the acceleration of the flying saucer) is always equivalent to a gravitational field, and no observation can ever tell the difference without reference to something external. (And suppose you did have some external reference point --- how would you know whether it was accelerating?)
The pilot of an airplane cannot always easily tell which way is up. The horizon may not be level simply because the ground has an actual slope, and in any case the horizon may not be visible if the weather is foggy. One might imagine that the problem could be solved simply by hanging a pendulum and observing which way it pointed, but by the equivalence principle the pendulum cannot tell the difference between a gravitational field and an acceleration of the aircraft relative to the ground --- nor can any other accelerometer, such as the pilot's inner ear. For example, when the plane is turning to the right, accelerometers will be tricked into believing that “down” is down and to the left. To get around this problem, airplanes use a device called an artificial horizon, which is essentially a gyroscope. The gyroscope has to be initialized when the plane is known to be oriented in a horizontal plane. No gyroscope is perfect, so over time it will drift. For this reason the instrument also contains an accelerometer, and the gyroscope is always forced into agreement with the accelerometer's average output over the preceding several minutes. If the plane is flown in circles for several minutes, the artificial horizon will be fooled into indicating that the wrong direction is vertical.
An interesting application of the equivalence principle is the explanation of gravitational time dilation. As described on p. 384, experiments show that a clock at the top of a mountain runs faster than one down at its foot.
To calculate this effect, we make use of the fact that the gravitational field in the area around the mountain is equivalent to an acceleration. Suppose we're in an elevator accelerating upward with acceleration \(a\), and we shoot a ray of light from the floor up toward the ceiling, at height \(h\). The time \(\Delta t\) it takes the light ray to get to the ceiling is about \(h/c\), and by the time the light ray reaches the ceiling, the elevator has sped up by \(v=a\Delta t=ah/c\), so we'll see a red-shift in the ray's frequency. Since \(v\) is small compared to \(c\), we don't need to use the fancy Doppler shift equation from subsection 7.2.8; we can just approximate the Doppler shift factor as \(1-v/c\approx 1-ah/c^2\). By the equivalence principle, we should expect that if a ray of light starts out low down and then rises up through a gravitational field \(g\), its frequency will be Doppler shifted by a factor of \(1-gh/c^2\). This effect was observed in a famous experiment carried out by Pound and Rebka in 1959. Gamma-rays were emitted at the bottom of a 22.5-meter tower at Harvard and detected at the top with the Doppler shift predicted by general relativity. (See problem 25.)
In the mountain-valley experiment, the frequency of the clock in the valley therefore appears to be running too slowly by a factor of \(1-gh/c^2\) when it is compared via radio with the clock at the top of the mountain. We conclude that time runs more slowly when one is lower down in a gravitational field, and the slow-down factor between two points is given by \(1-gh/c^2\), where \(h\) is the difference in height.
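To get a feel for the size of the effect, here is a minimal numerical sketch of the quantity \(gh/c^2\), which is the amount by which the factor \(1-gh/c^2\) differs from 1. The 22.5 m tower height is the one quoted above; the 1000 m mountain is a hypothetical example, and \(g\) is assumed to be constant over the height difference.

```python
G_FIELD = 9.8    # m/s^2, assumed uniform over the height difference
C = 2.998e8      # m/s, speed of light

def fractional_shift(h, g=G_FIELD, c=C):
    """Return g*h/c^2, the fractional frequency shift (and fractional
    difference in clock rates) between two points separated by height h."""
    return g * h / c**2

tower = fractional_shift(22.5)         # Pound-Rebka tower
mountain = fractional_shift(1000.0)    # hypothetical 1000 m mountain

SECONDS_PER_DAY = 86400.0
lag_ns_per_day = mountain * SECONDS_PER_DAY * 1e9

print(f"tower:    g*h/c^2 = {tower:.2e}")
print(f"mountain: g*h/c^2 = {mountain:.2e}")
print(f"valley clock falls behind by about {lag_ns_per_day:.1f} ns per day")
```

The shift for the tower is a few parts in \(10^{15}\), which gives some idea of why the experiment was considered a tour de force.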
We have built up a picture of light rays interacting with gravity. To confirm that this makes sense, recall that we have already observed in subsection 7.3.3 and in problem 11 on p. 440 that light has momentum. The equivalence principle says that whatever has inertia must also participate in gravitational interactions. Therefore light waves must have weight, and must lose energy when they rise through a gravitational field.
The noneuclidean nature of spacetime produces effects that grow in proportion to the area of the region being considered. Interpreting such effects as evidence of curvature, we see that this connects naturally to the idea that curvature is undetectable from close up. For example, the curvature of the earth's surface is not normally noticeable to us in everyday life. Locally, the earth's surface is flat, and the same is true for spacetime.
Local flatness turns out to be another way of stating the equivalence principle. In a variation on the alien-abduction story, suppose that you regain consciousness aboard the flying saucer and find yourself weightless. If the equivalence principle holds, then you have no way of determining from local observations, inside the saucer, whether you are actually weightless in deep space, or simply free-falling in apparent weightlessness, like the astronauts aboard the International Space Station. That means that locally, we can always adopt a free-falling frame of reference in which there is no gravitational field at all. If there is no gravity, then special relativity is valid, and we can treat our local region of spacetime as being approximately flat.
In figure k, an apple falls out of a tree. Its path is a “straight” line in spacetime, in the same sense that the equator is a “straight” line on the earth's surface.
In Newtonian mechanics, we have a distinction between inertial and noninertial frames of reference. An inertial frame according to Newton is one that has a constant velocity vector relative to the stars. But what if the stars themselves are accelerating due to a gravitational force from the rest of the galaxy? We could then take the galaxy's center of mass as defining an inertial frame, but what if something else is acting on the galaxy?
If we had some FloatyStuff, we could resolve the whole question. FloatyStuff isn't affected by gravity, so if we release a sample of it in mid-air, it will continue on a trajectory that defines a perfect Newtonian inertial frame. (We'd better have it on a tether, because otherwise the earth's rotation will carry the earth out from under it.) But if the equivalence principle holds, then Newton's definition of an inertial frame is fundamentally flawed.
There is a different definition of an inertial frame that works better in relativity. A Newtonian inertial frame was defined by an object that isn't subject to any forces, gravitational or otherwise. In general relativity, we instead define an inertial frame using an object that isn't influenced by anything other than gravity. By this definition, a free-falling rock defines an inertial frame, but this book sitting on your desk does not.
The observations described so far showed only small effects from curvature. To get a big effect, we should look at regions of space in which there are strong gravitational fields. The prime example is a black hole. The best studied examples are two objects in our own galaxy: Cygnus X-1, which is believed to be a black hole with about ten times the mass of our sun, and Sagittarius A*, an object near the center of our galaxy with about four million solar masses.
Although a black hole is a relativistic object, we can gain some insight into how it works by applying Newtonian physics. A spherical body of mass \(M\) has an escape velocity \(v=\sqrt{2GM/r}\), which is the minimum velocity that we would need to give to a projectile shot from a distance \(r\) so that it would never fall back down. If \(r\) is small enough, the escape velocity will be greater than \(c\), so that even a ray of light can never escape.
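Setting \(v=c\) in the escape-velocity equation and solving for \(r\) gives \(r=2GM/c^2\) as the Newtonian estimate of the critical radius. The sketch below evaluates it for the two objects mentioned above, using the approximate masses quoted in the text; the numerical constants are standard values, and the function names are just labels chosen for this illustration.

```python
import math

G = 6.674e-11      # N*m^2/kg^2, gravitational constant
C = 2.998e8        # m/s, speed of light
M_SUN = 1.989e30   # kg, mass of the sun

def escape_velocity(mass, r):
    """Newtonian escape velocity sqrt(2GM/r) from distance r."""
    return math.sqrt(2 * G * mass / r)

def critical_radius(mass):
    """Distance at which the Newtonian escape velocity equals c."""
    return 2 * G * mass / C**2

# Approximate masses quoted in the text
for name, solar_masses in [("Cygnus X-1", 10), ("Sagittarius A*", 4e6)]:
    r = critical_radius(solar_masses * M_SUN)
    print(f"{name}: r = {r:.2e} m, escape velocity there = "
          f"{escape_velocity(solar_masses * M_SUN, r) / C:.3f} c")
```

For a ten-solar-mass object the result is about 30 km, so such an object has to be extraordinarily compact before light is trapped.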
We can now make an educated guess as to what this means without having to study all the mathematics of general relativity. In relativity, \(c\) isn't really the speed of light; it is better thought of as a restriction on how fast cause and effect can propagate through space. This suggests the correct interpretation, which is that for an object compact enough to be a black hole, there is no way for an event at a distance closer than \(r\) to have an effect on an event far away. There is an invisible, spherical boundary with radius \(r\), called the event horizon, and the region within that boundary is cut off from the rest of the universe in terms of cause and effect. If you wanted to explore that region, you could drop into it while wearing a space-suit --- but it would be a one-way trip, because you could never get back out to report on what you had seen.
In the Newtonian description of a black hole, matter could be lifted out of a black hole, as in figure m. Would this be possible with a real-world black hole, which is relativistic rather than Newtonian? No, because the bucket is causally separated from the outside universe. No rope would be strong enough for this job (problem 12, p. 441).
One misleading aspect of the Newtonian analysis is that it encourages us to imagine that a light ray trying to escape from a black hole will slow down, stop, and then fall back in. This can't be right, because we know that any observer who sees a light ray flying by always measures its speed to be \(c\). This was true in special relativity, and by the equivalence principle we can be assured that the same is true locally in general relativity. Figure n shows what would really happen.
Although the light rays in figure n don't speed up or slow down, they do experience gravitational Doppler shifts. If a light ray is emitted from just above the event horizon, then it will escape to an infinite distance, but it will suffer an extreme Doppler shift toward low frequencies. A distant observer also has the option of interpreting this as a gravitational time dilation that greatly lowers the frequency of the oscillating electric charges that produced the ray. If the point of emission is made closer and closer to the horizon, the frequency and energy measured by a distant observer approach zero, making the ray impossible to observe.
Black holes have some disturbing implications for the kind of universe that in the Age of the Enlightenment was imagined to have been set in motion initially and then left to run forever like clockwork.
Newton's laws have built into them the implicit assumption that omniscience is possible, at least in principle. For example, Newton's definition of an inertial frame of reference leads to an infinite regress, as described on p. 430. For Newton this isn't a problem, because in principle an omniscient observer can know the location of every mass in the universe. In this conception of the cosmos, there are no theoretical limits on human knowledge, only practical ones; if we could gather sufficiently precise data about the state of the universe at one time, and if we could carry out all the calculations to extrapolate into the future, then we could know everything that would ever happen. (See the famous quote by Laplace on p. 16.)
But the existence of event horizons surrounding black holes makes it impossible for any observer to be omniscient; only an observer inside a particular horizon can see what's going on inside that horizon.
Furthermore, a black hole has at its center an infinitely dense point, called a singularity, containing all its mass, and this implies that information can be destroyed and made inaccessible to any observer at all. For example, suppose that astronaut Alice goes on a suicide mission to explore a black hole, free-falling in through the event horizon. She has a certain amount of time to collect data and satisfy her intellectual curiosity, but then she impacts the singularity and is compacted into a mathematical point. Now astronaut Betty decides that she will never be satisfied unless the secrets revealed to Alice are known to her as well --- and besides, she was Alice's best friend, and she wants to know whether Alice had any last words. Betty can jump through the horizon, but she can never know Alice's last words, nor can any other observer who jumps in after Alice does.
This destruction of information is known as the black hole information paradox, and it's referred to as a paradox because quantum physics (ch. 13) has built into its DNA the requirement that information is never lost in this sense.
Around 1960, as black holes and their strange properties began to be better understood and more widely discussed, many physicists who found these issues distressing comforted themselves with the belief that black holes would never really form from realistic initial conditions, such as the collapse of a massive star. Their skepticism was not entirely unreasonable, since it is usually very hard in astronomy to hit a gravitating target, the reason being that conservation of angular momentum tends to make the projectile swing past. (See problem 13 on p. 289 for a quantitative analysis.) For example, if we wanted to drop a space probe into the sun, we would have to cancel its sideways orbital motion extremely precisely, so that it would drop almost exactly straight in. Once a star started to collapse, the theory went, and became relatively compact, it would be such a small target that further infalling material would be unlikely to hit it, and the process of collapse would halt. According to this point of view, theorists who had calculated the collapse of a star into a black hole had been oversimplifying by assuming a star that was initially perfectly spherical and nonrotating. Remove the unrealistically perfect symmetry of the initial conditions, and a black hole would never actually form.
But Roger Penrose proved in 1964 that this was wrong. In fact, once an object collapses to a certain density, the Penrose singularity theorem guarantees mathematically that it will collapse further until a singularity is formed, and this singularity is surrounded by an event horizon. Since the brightness of an object like Sagittarius A* is far too low to be explained unless it has an event horizon (the interstellar gas flowing into it would glow due to frictional heating), we can be certain that there really is a singularity at its core.
Subsection 6.1.5 presented the evidence, discovered by Hubble, that the universe is expanding in the aftermath of the Big Bang: when we observe the light from distant galaxies, it is always Doppler-shifted toward the red end of the spectrum, indicating that no matter what direction we look in the sky, everything is rushing away from us. This seems to go against the modern attitude, originated by Copernicus, that we and our planet do not occupy a special place in the universe. Why is everything rushing away from our planet in particular? But general relativity shows that this anti-Copernican conclusion is wrong. General relativity describes space not as a rigidly defined background but as something that can curve and stretch, like a sheet of rubber. We imagine all the galaxies as existing on the surface of such a sheet, which then expands uniformly. The space between the galaxies (but not the galaxies themselves) grows at a steady rate, so that any observer, inhabiting any galaxy, will see every other galaxy as receding. There is therefore no privileged or special location in the universe.
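The claim that a uniform stretching looks the same from every galaxy can be checked with a toy calculation. The sketch below is a one-dimensional model with hypothetical galaxy positions in arbitrary units: every position is multiplied by the same scale factor, and we then ask, from the point of view of each galaxy in turn, by what fraction its distance to every other galaxy has grown.

```python
def separations(positions, home):
    """Signed distances from the galaxy at index `home` to every other galaxy."""
    return [x - positions[home] for i, x in enumerate(positions) if i != home]

def fractional_growth(positions, scale, home):
    """Fractional increase in each separation, as seen from galaxy `home`,
    when all positions are stretched uniformly by the factor `scale`."""
    before = separations(positions, home)
    after = separations([scale * x for x in positions], home)
    return [(a - b) / b for a, b in zip(after, before)]

# Hypothetical galaxy positions along a line, arbitrary units
galaxies = [-3.0, 0.0, 1.0, 2.5, 7.0]
SCALE = 1.1   # space stretches by 10%

for home in range(len(galaxies)):
    growth = fractional_growth(galaxies, SCALE, home)
    print(f"observer {home}: {[round(g, 6) for g in growth]}")
```

Every observer finds that every separation has grown by the same fraction, so each one sees all the others receding, with more distant galaxies receding by a proportionally larger amount. No observer in this model can claim to be at the center.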
We might think that there would be another kind of special place, which would be the one at which the Big Bang happened. Maybe someone has put a brass plaque there? But general relativity doesn't describe the Big Bang as an explosion that suddenly occurred in a preexisting background of time and space. According to general relativity, space itself came into existence at the Big Bang, and the hot, dense matter of the early universe was uniformly distributed everywhere. The Big Bang happened everywhere at once.
Observations show that the universe is very uniform on large scales, and for ease of calculation, the first physical models of the expanding universe were constructed with perfect uniformity. In these models, the Big Bang was a singularity. This singularity can't even be included as an event in spacetime, so that time itself only exists after the Big Bang. A Big Bang singularity also creates an even more acute version of the black hole information paradox. Whereas matter and information disappear into a black hole singularity, stuff pops out of a Big Bang singularity, and there is no physical principle that could predict what it would be.
As with black holes, there was considerable skepticism about whether the existence of an initial singularity in these models was an artifact of the unrealistically perfect uniformity assumed in the models. Perhaps in the real universe, extrapolation of all the paths of the galaxies backward in time would show them missing each other by millions of light-years. But in 1972 Stephen Hawking proved a variant on the Penrose singularity theorem that applied to Big Bang singularities. By the Hawking singularity theorem, the level of uniformity we see in the present-day universe is more than sufficient to prove that a Big Bang singularity must have existed.
It might not be too much of a philosophical jolt to imagine that information was spontaneously created in the Big Bang. Setting up the initial conditions of the entire universe is traditionally the prerogative of God, not the laws of physics. But there is nothing fundamental in general relativity that forbids the existence of other singularities that act like the Big Bang, being information producers rather than information consumers. As John Earman of the University of Pittsburgh puts it, anything could pop out of such a singularity, including green slime or your lost socks. This would eliminate any hope of finding a universal set of laws of physics that would be able to make a prediction given any initial situation.
That would be such a devastating defeat for the enterprise of physics that in 1969 Penrose proposed an alternative, humorously named the “cosmic censorship hypothesis,” which states that every singularity in our universe, other than the Big Bang, is hidden behind an event horizon. Therefore if green slime spontaneously pops out of one, there is limited impact on the predictive ability of physics, since the slime can never have any causal effect on the outside world. A singularity that is not modestly cloaked behind an event horizon is referred to as a naked singularity. Nobody has yet been able to prove the cosmic censorship hypothesis.
We expect that if there is matter in the universe, it should have gravitational fields, and in the rubber-sheet analogy this should be represented as a curvature of the sheet. Instead of a flat sheet, we can have a spherical balloon, so that cosmological expansion is like inflating it with more and more air. It is also possible to have negative curvature, as in figure e on p. 426. All three of these are valid, possible cosmologies according to relativity. The positive-curvature type happens if the average density of matter in the universe is above a certain critical level, the negative-curvature one if the density is below that value.
To find out which type of universe we inhabit, we could try to take a survey of the matter in the universe and determine its average density. Historically, it has been very difficult to do this, even to within an order of magnitude. Most of the matter in the universe probably doesn't emit light, making it difficult to detect. Astronomical distance scales are also very poorly calibrated against absolute units such as the SI.
Instead, we measure the universe's curvature, and infer the density of matter from that. It turns out that we can do this by observing the cosmic microwave background (CMB) radiation, which is the light left over from the brightly glowing early universe, which was dense and hot. As the universe has expanded, light waves that were in flight have expanded their wavelengths along with it. This afterglow of the big bang was originally visible light, but after billions of years of expansion it has shifted into the microwave radio part of the electromagnetic spectrum. The CMB is not perfectly uniform, and this turns out to give us a way to measure the universe's curvature. Since the CMB was emitted when the universe was only about 400,000 years old, any vibrations or disturbances in the hot hydrogen and helium gas that filled space in that era would only have had time to travel a certain distance, limited by the speed of sound. We therefore expect that no feature in the CMB should be bigger than a certain known size. In a universe with negative spatial curvature, the sum of the interior angles of a triangle is less than the Euclidean value of 180 degrees. Therefore if we observe a variation in the CMB over some angle, the distance between two points on the sky is actually greater than would have been inferred from Euclidean geometry. The opposite happens if the curvature is positive.
This observation was done by the 1989-1993 COBE probe, and its 2001-2009 successor, the Wilkinson Microwave Anisotropy Probe. The result is that the angular sizes are almost exactly equal to what they should be according to Euclidean geometry. We therefore infer that the universe is very close to having zero average spatial curvature on the cosmological scale, and this tells us that its average density must be within about 0.5% of the critical value. The years since COBE and WMAP mark the advent of an era in which cosmology has gone from being a field of estimates and rough guesses to a high-precision science.
If one is inclined to be skeptical about the seemingly precise answers to the mysteries of the cosmos, there are consistency checks that can be carried out. In the bad old days of low-precision cosmology, estimates of the age of the universe ranged from 10 billion to 20 billion years, and the low end was inconsistent with the age of the oldest star clusters. This was believed to be a problem either for observational cosmology or for the astrophysical models used to estimate the ages of the clusters: “You can't be older than your ma.” Current data have shown that the low estimates of the age were incorrect, so consistency is restored. (The best figure for the age of the universe is currently \(13.8\pm0.1\) billion years.)
Not everything works out so smoothly, however. One surprise is that the universe's expansion is not currently slowing down, as had been expected due to the gravitational attraction of all the matter in it. Instead, it is currently speeding up. This is attributed to a variable in Einstein's equations, long assumed to be zero, which represents a universal gravitational repulsion of space itself, occurring even when there is no matter present. The current name for this is “dark energy,” although the fancy name is just a label for our ignorance about what causes it.
Another surprise comes from attempts to model the formation of the elements during the era shortly after the Big Bang, before the formation of the first stars. The observed relative abundances of hydrogen, helium, and deuterium (\(^2\text{H}\)) cannot be reconciled with the density of low-velocity matter inferred from the observational data. If the inferred mass density were entirely due to normal matter (i.e., matter whose mass consisted mostly of protons and neutrons), then nuclear reactions in the dense early universe should have proceeded relatively efficiently, leading to a much higher ratio of helium to hydrogen, and a much lower abundance of deuterium. The conclusion is that most of the matter in the universe must be made of an unknown type of exotic matter, known as “dark matter.” We are in the ironic position of knowing that precisely 96% of the universe is something other than atoms, but knowing nothing about what that something is. As of 2013, several experiments have been carried out to attempt the direct detection of dark matter particles. These are carried out at the bottom of mineshafts to eliminate background radiation. Early claims of success appear to have been statistical flukes, and the most sensitive experiments have not detected anything.^{6}
1. The figure illustrates a Lorentz transformation using the conventions employed in section 7.2. For simplicity, the transformation chosen is one that lengthens one diagonal by a factor of 2. Since Lorentz transformations preserve area, the other diagonal is shortened by a factor of 2. Let the original frame of reference, depicted with the square, be A, and the new one B. (a) By measuring with a ruler on the figure, show that the velocity of frame B relative to frame A is \(0.6c\). (b) Print out a copy of the page. With a ruler, draw a third parallelogram that represents a second successive Lorentz transformation, one that lengthens the long diagonal by another factor of 2. Call this third frame C. Use measurements with a ruler to determine frame C's velocity relative to frame A. Does it equal double the velocity found in part a? Explain why it should be expected to turn out the way it does. (answer check available at lightandmatter.com)
2. Astronauts in three different spaceships are communicating with each other.
Those aboard ships A and B agree on the rate at which time is passing, but
they disagree with the ones on ship C.
(a) Alice is aboard ship A. How does she describe the motion of her own ship, in its frame of reference?
(b) Describe the motion of the other two ships according to Alice.
(c) Give the description according to Betty, whose frame of reference is ship B.
(d) Do the same for Cathy, aboard ship C.
3. What happens in the equation for \(\gamma\) when you put in a negative number for \(v\)? Explain what this means physically, and why it makes sense.
4.
The Voyager 1 space probe, launched in 1977, is moving faster relative to the earth than
any other human-made object, at 17,000 meters per second.
(a) Calculate the probe's \(\gamma\).
(b) Over the course of one year on earth, slightly less than one year passes on the probe.
How much less? (There are 31 million seconds in a year.) (answer check available at lightandmatter.com)
5. In example 2 on page 391, I remarked that accelerating a macroscopic (i.e., not microscopic) object to close to the speed of light would require an unreasonable amount of energy. Suppose that the starship Enterprise from Star Trek has a mass of \(8.0\times10^7\) kg, about the same as the Queen Elizabeth 2. Compute the kinetic energy it would have to have if it was moving at half the speed of light. Compare with the total energy content of the world's nuclear arsenals, which is about \(10^{21}\) J. (answer check available at lightandmatter.com)
6. The earth is orbiting the sun, and therefore is contracted relativistically in the
direction of its motion. Compute the amount by which its diameter shrinks in this
direction. (answer check available at lightandmatter.com)
7. In this homework problem, you'll fill in the steps of the algebra required in order to find the equation for \(\gamma\) on page 389. To keep the algebra simple, let the time \(t\) in figure k equal 1, as suggested in the figure accompanying this homework problem. The original square then has an area of 1, and the transformed parallelogram must also have an area of 1. (a) Prove that point P is at \(x=v\gamma\), so that its \((t,x)\) coordinates are \((\gamma,v\gamma)\). (b) Find the \((t,x)\) coordinates of point Q. (c) Find the length of the short diagonal connecting P and Q. (d) Average the coordinates of P and Q to find the coordinates of the midpoint C of the parallelogram, and then find distance OC. (e) Find the area of the parallelogram by computing twice the area of triangle PQO. [Hint: You can take PQ to be the base of the triangle.] (f) Set this area equal to 1 and solve for \(\gamma\) to prove \(\gamma=1/\sqrt{1-v^2}\). (answer check available at lightandmatter.com)
8. (a) A free neutron (as opposed to a neutron bound into an atomic nucleus) is unstable, and undergoes beta decay (which you may want to review). The masses of the particles involved are as follows:
neutron | \(1.67495\times10^{-27}\) kg |
proton | \(1.67265\times10^{-27}\) kg |
electron | \(0.00091\times10^{-27}\) kg |
antineutrino | \(<10^{-35}\) kg |
Find the energy released in the decay of a free neutron. (answer check available at lightandmatter.com)
(b) Neutrons and protons make up essentially all of the mass of the ordinary
matter around us. We observe that the universe around us has no free neutrons, but
lots of free protons
(the nuclei of hydrogen, which is the element that 90% of the universe
is made of). We find neutrons only inside nuclei along with other neutrons and
protons, not on their own.
If there are processes that can convert neutrons into protons, we might imagine that there could also be proton-to-neutron conversions, and indeed such a process does occur sometimes in nuclei that contain both neutrons and protons: a proton can decay into a neutron, a positron, and a neutrino. A positron is a particle with the same properties as an electron, except that its electrical charge is positive (see chapter 7). A neutrino, like an antineutrino, has negligible mass.
Although such a process can occur within a nucleus, explain why it cannot happen to a free proton. (If it could, hydrogen would be radioactive, and you wouldn't exist!)
9.
(a) Find a relativistic equation for the velocity of an
object in terms of its mass and momentum (eliminating
\(\gamma\)). (answer check available at lightandmatter.com)
(b) Show that your result
is approximately the same as the classical value, \(p/m\), at
low velocities.
(c) Show that very large momenta result in
speeds close to the speed of light.
10.
(a) Show that for \(v=(3/5)c\), \(\gamma\) comes out to be a simple fraction.
(b) Find another value of \(v\) for which \(\gamma\) is a simple fraction.
11.
An object moving at a speed very close to the speed of light is referred to as
ultrarelativistic. Ordinarily (luckily) the only ultrarelativistic objects
in our universe are subatomic particles, such as cosmic rays or particles
that have been accelerated in a particle accelerator.
(a) What kind of number is \(\gamma\) for an ultrarelativistic particle?
(b) Repeat example 18 on page 418,
but instead of very low, nonrelativistic speeds, consider ultrarelativistic speeds.
(c) Find an equation for the ratio \(\mathcal{E}/p\) of mass-energy to momentum. The speed may be relativistic, but don't
assume that it's ultrarelativistic. (answer check available at lightandmatter.com)
(d) Simplify your answer to part c for the case where the speed is ultrarelativistic. (answer check available at lightandmatter.com)
(e) We can think of a beam of light as an ultrarelativistic object --- it certainly moves at a speed
that's sufficiently close to the speed of light! Suppose you turn on a one-watt flashlight, leave it
on for one second, and then turn it off. Compute the momentum of the recoiling flashlight, in units
of \(\text{kg}\!\cdot\!\text{m}/\text{s}\). (answer check available at lightandmatter.com)
(f) Discuss how your answer in part e relates to the correspondence principle.
12. As discussed in
chapter 6, the speed at which a disturbance travels along
a string under tension is given by \(v=\sqrt{T/\mu}\), where \(\mu\) is the mass per unit
length, and \(T\) is the tension.
(a) Suppose a string has a density \(\rho\), and a cross-sectional
area \(A\). Find an expression for the maximum tension that could possibly exist in the string
without producing \(v>c\), which is impossible according to relativity. Express your answer in
terms of \(\rho\), \(A\), and \(c\). The interpretation is that relativity puts a limit on how
strong any material can be. (answer check available at lightandmatter.com)
(b) Every substance has a tensile strength, defined as the force
per unit area required to break it by pulling it apart. The tensile strength is measured in
units of \(\text{N}/\text{m}^2\), which is the same as the pascal (Pa), the mks unit of pressure.
Make a numerical estimate of the maximum tensile strength allowed by relativity in the case where
the rope is made out of ordinary matter, with a density on the same order of magnitude as
that of water. (For comparison, kevlar has a tensile strength of about \(4\times10^9\) Pa,
and there is speculation that fibers made from carbon nanotubes could have
values as high as \(6\times10^{10}\) Pa.) (answer check available at lightandmatter.com)
(c) A black hole is a star that has collapsed and become very dense, so that
its gravity is too strong for anything ever to escape from it. For instance, the escape
velocity from a black hole is greater than \(c\), so a projectile can't be shot out of it.
Many people, when they hear this description of a black hole in terms of an escape velocity
greater than \(c\), wonder why it still wouldn't be possible to extract an object from a black
hole by other means than launching it out as a projectile.
For example, suppose we lower an astronaut into a black hole on a rope, and then pull him
back out again. Why might this not work?
13. (a) A charged particle is surrounded by a uniform electric field.
Starting from rest, it is accelerated by the field to speed \(v\) after
traveling a distance \(d\). Now it is allowed to continue for a further
distance \(3d\), for a total displacement from the start of \(4d\).
What speed will it reach,
assuming classical physics?
(b) Find the relativistic result for the case of \(v=c/2\).
14. Problem 14 has been deleted.
15. Expand the equation \(K = m(\gamma-1)\) in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the nonrelativistic expression for kinetic energy.
16. Expand the relativistic equation for momentum in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the classical expression.
17. (solution in the pdf version of the book) As promised in subsection 7.2.8, this problem will lead you through the steps of finding an equation for the combination of velocities in relativity, generalizing the numerical result found in problem 1. Suppose that A moves relative to B at velocity \(u\), and B relative to C at \(v\). We want to find A's velocity \(w\) relative to C, in terms of \(u\) and \(v\). Suppose that A emits light with a certain frequency. This will be observed by B with a Doppler shift \(D(u)\). C detects a further shift of \(D(v)\) relative to B. We therefore expect the Doppler shifts to multiply, \(D(w)=D(u)D(v)\), and this provides an implicit rule for determining \(w\) if \(u\) and \(v\) are known. (a) Using the expression for \(D\) given in section 7.2.8, write down an equation relating \(u\), \(v\), and \(w\). (b) Solve for \(w\) in terms of \(u\) and \(v\). (c) Show that your answer to part b satisfies the correspondence principle.
18. The figure shows seven four-vectors, represented in a two-dimensional plot of \(x\) versus \(t\). All the vectors have \(y\) and \(z\) components that are zero. Which of these vectors are congruent to others, i.e., which represent spacetime intervals that are equal to one another?
19. Four-vectors can be timelike, lightlike, or spacelike. What can you say about the inherent properties of particles whose momentum four-vectors fall in these various categories?
20. The following are the three most common ways in which gamma rays interact with matter:
Photoelectric effect: The gamma ray hits an electron, is annihilated, and gives all of its energy to the electron.
Compton scattering: The gamma ray bounces off of an electron, exiting in some direction with some amount of energy.
Pair production: The gamma ray is annihilated, creating an electron and a positron.
Example 23 on p. 420 shows that pair production can't occur in a vacuum due to conservation of the energy-momentum four-vector. What about the other two processes? Can the photoelectric effect occur without the presence of some third particle such as an atomic nucleus? Can Compton scattering happen without a third particle?
21. Expand the relativistic equation for the longitudinal Doppler shift of light \(D(v)\) in a Taylor series, and find the first two nonvanishing terms. Show that these two terms agree with the nonrelativistic expression, so that any relativistic effect is of higher order in \(v\).
22. Prove, as claimed in the caption of figure a on p. 424, that \(S-180°=4(s-180°)\), where \(S\) is the sum of the angles of the large equilateral triangle and \(s\) is the corresponding sum for one of the four small ones. (solution in the pdf version of the book)
23. If a two-dimensional being lived on the surface of a cone, would it say that its space was curved, or not?
24. (a) Verify that the expression \(1-gh/c^2\) for the gravitational Doppler shift and gravitational time dilation has units that make sense. (b) Does this expression satisfy the correspondence principle?
25. (a) Calculate the Doppler shift to be expected in the Pound-Rebka experiment described on p. 429. (b) In the 1978 Iijima mountain-valley experiment (p. 384), analysis was complicated by the clock's sensitivity to pressure, humidity, and temperature. A cleaner version of the experiment was done in 2005 by hobbyist Tom Van Baak. He put his kids and three of his atomic clocks in a minivan and drove from Bellevue, Washington to a lodge on Mount Rainier, 1340 meters higher in elevation. At home, he compared the clocks to others that had stayed at his house. Verify that the effect shown in the graph is as predicted by general relativity.
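For part b, the size of the expected effect can be estimated from the fractional rate difference \(gh/c^2\). The sketch below assumes \(g=9.8\ \text{m/s}^2\) over the whole climb and reports the offset per day, since the length of the stay has to be read from the graph. The function names are illustrative.

```python
C = 299_792_458.0  # speed of light, m/s
G_SURFACE = 9.8    # m/s^2, assumed constant over the height difference

def fractional_shift(height_m, g=G_SURFACE):
    """Fractional rate difference g*h/c^2 between clocks separated by height_m."""
    return g * height_m / C**2

def accumulated_offset_ns(height_m, duration_s, g=G_SURFACE):
    """Time gained by the higher clock over duration_s, in nanoseconds."""
    return fractional_shift(height_m, g) * duration_s * 1e9

rate = fractional_shift(1340.0)
per_day = accumulated_offset_ns(1340.0, 86400.0)
print(f"fractional shift: {rate:.3e}")
print(f"offset per day at altitude: {per_day:.1f} ns")
```

Multiplying the per-day figure by the number of days spent at the lodge gives the number to compare with the graph. The same `fractional_shift` function applies to part a once the height of the Pound-Rebka apparatus is supplied.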
26. The International Space Station orbits at an altitude of about 350 km and a speed of about 8000 m/s relative to the ground. Compare the gravitational and kinematic time dilations. Overall, does time run faster on the ISS than on the ground, or more slowly?
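The two effects in problem 26 can be compared with a few lines of Python. This is a rough sketch: it treats \(g\) as constant at \(9.8\ \text{m/s}^2\) all the way up to 350 km, which overestimates the gravitational term somewhat, and the function names are invented for this example.

```python
import math

C = 299_792_458.0  # speed of light, m/s
G_SURFACE = 9.8    # m/s^2, assumed constant with height (an approximation)

def gravitational_rate_gain(height_m, g=G_SURFACE):
    """Fractional speed-up of a clock raised by height_m, g*h/c^2."""
    return g * height_m / C**2

def kinematic_rate_loss(speed):
    """Fractional slow-down of a moving clock, 1 - 1/gamma."""
    return 1.0 - math.sqrt(1.0 - (speed / C) ** 2)

gain = gravitational_rate_gain(350e3)
loss = kinematic_rate_loss(8000.0)
print(f"gravitational gain: {gain:.3e}")
print(f"kinematic loss:     {loss:.3e}")
print(f"net fractional rate change: {gain - loss:.3e}")
```

The sign of the net value indicates whether the ISS clock runs fast or slow relative to one on the ground.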
27. Section 7.4.3 presented a Newtonian estimate of how compact an object would have to be in order to be a black hole. Although this estimate is not really right, it turns out to give the right answer to within about a factor of 2. To roughly what size would the earth have to be compressed in order to become a black hole?
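As a rough numerical aid for problem 27, the sketch below evaluates the length scale \(GM/c^2\) for the earth. It assumes that the Newtonian estimate of section 7.4.3 has this form up to a factor of order unity; the constants are standard approximate values, and the function name is hypothetical.

```python
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
C = 299_792_458.0    # speed of light, m/s
M_EARTH = 5.972e24   # mass of the earth, kg

def compactness_scale(mass_kg):
    """Length scale G*M/c^2, assumed here to be the form of the Newtonian estimate."""
    return G * mass_kg / C**2

r = compactness_scale(M_EARTH)
print(f"G M / c^2 for the earth: {r * 1000:.2f} mm")
```

Because the estimate is only good to within about a factor of 2, the result should be read as an order of magnitude rather than a precise radius.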
28. Clock A sits on a desk. Clock B is tossed up in the air from the same height as the desk and then comes back down. Compare the elapsed times. (solution in the pdf version of the book)
29. The angular defect \(d\) of a triangle (measured in radians) is defined as \(s-\pi\), where \(s\) is the sum of the interior angles. The angular defect is proportional to the area \(A\) of the triangle. Consider the geometry measured by a two-dimensional being who lives on the surface of a sphere of radius \(R\). First find some triangle on the sphere whose area and angular defect are easy to calculate. Then determine the general equation for \(d\) in terms of \(A\) and \(R\). (answer check available at lightandmatter.com)
In this exercise you will analyze the Michelson-Morley experiment, and find what the results should have been according to Galilean relativity and Einstein's theory of relativity. A beam of light coming from the west (not shown) comes to the half-silvered mirror A. Half the light goes through to the east, is reflected by mirror C, and comes back to A. The other half is reflected north by A, is reflected by B, and also comes back to A. When the beams reunite at A, part of each ends up going south, and these parts interfere with one another. If the time taken for a round trip differs by, for example, half the period of the wave, there will be destructive interference.
The point of the experiment was to search for a difference in the experimental results between the daytime, when the laboratory was moving west relative to the sun, and the nighttime, when the laboratory was moving east relative to the sun. Galilean relativity and Einstein's theory of relativity make different predictions about the results. According to Galilean relativity, the speed of light cannot be the same in all reference frames, so it is assumed that there is one special reference frame, perhaps the sun's, in which light travels at the same speed in all directions; in other frames, Galilean relativity predicts that the speed of light will be different in different directions, e.g., slower if the observer is chasing a beam of light. There are four different ways to analyze the experiment:
Groups 1-4 work in the sun's frame of reference according to Galilean relativity.
Group 1 finds time AC. Group 2 finds time CA. Group 3 finds time AB. Group 4 finds time BA.
Groups 5 and 6 transform the lab-frame results into the sun's frame according to Einstein's theory.
Group 5 transforms the \(x\) and \(t\) when ray ACA gets back to A into the sun's frame of reference, and group 6 does the same for ray ABA.
Discussion:
Michelson and Morley found no change in the interference of the waves between day and night. Which version of relativity is consistent with their results?
What does each theory predict if \(v\) approaches \(c\)?
What if the arms are not exactly equal in length?
Does it matter if the “special” frame is some frame other than the sun's?
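The Galilean half of the Michelson-Morley analysis (groups 1-4) can be sketched numerically as follows. The sketch works in the special frame where light moves at \(c\) in all directions and assumes that the lab moves east at speed \(v\), with arm AC pointing east and arm AB pointing north. The arm length and speed used at the end are hypothetical sample values, not the historical ones.

```python
import math

def galilean_leg_times(arm_length, v, c):
    """Times for legs AC, CA, AB, BA in the frame where light moves at c.

    The lab moves east at speed v. If it moves west instead, the values
    for AC and CA simply trade places.
    """
    t_ac = arm_length / (c - v)  # light chases the receding mirror C
    t_ca = arm_length / (c + v)  # mirror A moves toward the returning light
    # On the north-south arm the light follows a diagonal path in this frame:
    # (c t)^2 = arm_length^2 + (v t)^2
    t_ab = arm_length / math.sqrt(c**2 - v**2)
    t_ba = t_ab
    return t_ac, t_ca, t_ab, t_ba

def galilean_round_trip_difference(arm_length, v, c):
    """Round-trip time for ray ACA minus round-trip time for ray ABA."""
    t_ac, t_ca, t_ab, t_ba = galilean_leg_times(arm_length, v, c)
    return (t_ac + t_ca) - (t_ab + t_ba)

# Hypothetical sample values: a 1 m arm and v = 3.0e4 m/s.
print(galilean_round_trip_difference(1.0, 3.0e4, 3.0e8))
```

A nonzero difference that depends on \(v\) is what the Galilean picture predicts, and it is this dependence that the day-night comparison was designed to detect.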
In Slowlightland, the speed of light is 20 mi/hr \(\approx\) 32 km/hr \(\approx\) 9 m/s. Think of an example of how relativistic effects would work in sports. Things can get very complex very quickly, so try to think of a simple example that focuses on just one of the following effects:
- relativistic momentum
- relativistic kinetic energy
- relativistic addition of velocities
- time dilation and length contraction
- Doppler shifts of light
- equivalence of mass and energy
- time it takes for light to get to an athlete's eye
- deflection of light rays by gravity
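To get a feel for the numbers in Slowlightland, the following sketch computes \(\gamma\) with the speed of light set to 9 m/s. The sprinter's speed and the track length are made-up values chosen only for illustration.

```python
import math

C_SLOW = 9.0  # speed of light in Slowlightland, m/s

def gamma(v, c=C_SLOW):
    """Lorentz factor for an object moving at speed v (m/s)."""
    if abs(v) >= c:
        raise ValueError("speed must be less than the speed of light")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

sprinter_speed = 8.0  # m/s, hypothetical
track_length = 100.0  # m, hypothetical, measured in the stadium's frame

g = gamma(sprinter_speed)
print(f"gamma = {g:.3f}")
print(f"track length in the sprinter's frame: {track_length / g:.1f} m")
```

The same `gamma` function can be reused for the momentum and kinetic energy items on the list.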
The following is a list of common misconceptions about relativity. The class will be split up into random groups, and each group will cooperate on developing an explanation of the misconception, and then the groups will present their explanations to the class. There may be multiple rounds, with students assigned to different randomly chosen groups in successive rounds.
The figure gives four pairs of four-vectors, oriented in our customary way as shown by the light-cone on the left.
1. Of the types shown in the four cases i-iv, which types of vectors could represent the world-line of an observer?
2. Suppose that \(\mathbf{U}\) and \(\mathbf{V}\) are both observer-vectors. What would it mean physically to compute \(\mathbf{U}+\mathbf{V}\)?
3. Determine the sign of each inner product \(\mathbf{A}\cdot\mathbf{B}\).
4. Given an observer whose world-line is along a four-vector \(\mathbf{O}\), suppose we want to determine whether some other four-vector \(\mathbf{P}\) is also a possible world-line of an observer. Show that knowledge of the signs of the inner products \(\mathbf{O}\cdot\mathbf{P}\) and \(\mathbf{P}\cdot\mathbf{P}\) is necessary and sufficient to determine this. Hint: Consider various possibilities like i-iv for vector \(\mathbf{P}\), and see how the signs would turn out.
5. For vectors as described in 4, determine the signs of
and
by multiplying them out. Interpret the result physically.
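The signs discussed in this exercise can be explored numerically. The sketch below represents a four-vector by its \((t, x)\) components only, with \(c=1\), and assumes the sign convention in which timelike vectors have a positive inner product with themselves. The function names and sample vectors are illustrative.

```python
def inner(a, b):
    """Inner product of two (t, x) vectors, assuming signature (+, -)."""
    return a[0] * b[0] - a[1] * b[1]

def classify(a):
    """Label a vector as timelike, lightlike, or spacelike."""
    s = inner(a, a)
    if s > 0:
        return "timelike"
    if s < 0:
        return "spacelike"
    return "lightlike"

def could_be_observer(o, p):
    """Test from question 4: p is timelike and points into the same
    half of the light cone as the observer-vector o."""
    return inner(p, p) > 0 and inner(o, p) > 0

O = (1, 0)  # an observer at rest
for P in [(2, 1), (-2, 1), (1, 2), (1, 1)]:
    print(P, classify(P), could_be_observer(O, P))
```

Trying pairs of vectors like those in cases i-iv shows which combinations of signs can actually occur.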
(c) 1998-2013 Benjamin Crowell, licensed under the Creative Commons Attribution-ShareAlike license. Photo credits are given at the end of the Adobe Acrobat version.