You are viewing the html version of Simple Nature, by Benjamin Crowell. This version is only designed for casual browsing, and may have some formatting problems. For serious reading, you want the printer-friendly Adobe Acrobat version.


(c) 1998-2006 Benjamin Crowell, licensed under the Creative Commons Attribution-ShareAlike license, or, at your option, the GFDL license. Photo credits are given at the end of the Adobe Acrobat version.

Contents
Section 7.1 - Basic Relativity
Section 7.2 - The Lorentz transformation
Section 7.3 - Dynamics

Chapter 7. Relativity


a / Albert Einstein.

Complaining about the educational system is a national sport among professors in the U.S., and I, like my colleagues, am often tempted to imagine a golden age of education in our country's past, or to compare our system unfavorably with foreign ones. Reality intrudes, however, when my immigrant students recount the overemphasis on rote memorization in their native countries and the philosophy that what the teacher says is always right, even when it's wrong.

Albert Einstein's education in late-nineteenth-century Germany was neither modern nor liberal. He did well in the early grades (the myth that he failed his elementary-school classes comes from a misunderstanding based on a reversal of the German numerical grading scale), but in high school and college he began to get in trouble for what today's edspeak calls “critical thinking.”

Indeed, there was much that deserved criticism in the state of physics at that time. There was a subtle contradiction between Maxwell's theory of electromagnetism and Galileo's principle that all motion is relative. Einstein began thinking about this on an intuitive basis as a teenager, trying to imagine what a light beam would look like if you could ride along beside it on a motorcycle at the speed of light. Today we remember him most of all for his radical and far-reaching solution to this contradiction, his theory of relativity, but in his student years his insights were greeted with derision from his professors. One called him a “lazy dog.” Einstein's distaste for authority was typified by his decision as a teenager to renounce his German citizenship and become a stateless person, based purely on his opposition to the militarism and repressiveness of German society. He spent his most productive scientific years in Switzerland and Berlin, first as a patent clerk but later as a university professor. He was an outspoken pacifist and a stubborn opponent of World War I, shielded from retribution by his eventual acquisition of Swiss citizenship.

As the epochal nature of his work began to become evident, some liberal Germans began to point to him as a model of the “new German,” but with the Nazi coup d'etat, staged public meetings began to be held at which Nazi scientists criticized the work of this ethnically Jewish (but spiritually nonconformist) giant of science. Einstein was on a stint as a visiting professor at Caltech when Hitler was appointed chancellor, and never returned to the Nazi state. World War II convinced Einstein to soften his strict pacifist stance, and he signed a secret letter to President Roosevelt urging research into the building of a nuclear bomb, a device that could not have been imagined without his theory of relativity. He later wrote, however, that when Hiroshima and Nagasaki were bombed, it made him wish he could burn off his own fingers for having signed the letter.

This chapter and the next are specifically about Einstein's theory of relativity, but Einstein also began a second, parallel revolution in physics known as the quantum theory, which stated, among other things, that certain processes in nature are inescapably random. Ironically, Einstein was an outspoken doubter of the new quantum ideas, being convinced that “the Old One [God] does not play dice with the universe,” but quantum and relativistic concepts are now thoroughly intertwined in physics. The remainder of this book beyond the present pair of chapters is an introduction to the quantum theory, but we will continually be led back to relativistic ideas.

The structure of this chapter

From the modern point of view, electricity and magnetism becomes much simpler and easier to understand if it is encountered after relativity. Most schools' curricula, however, place electricity and magnetism before relativity. In such a curriculum, section 7.1 should be covered before electricity and magnetism, and then later in the course one can go back and cover all of chapter 7. This chapter is also designed so that it can be read without having previously covered waves.

7.1 Basic Relativity

Absolute, true, and mathematical time ... flows at a constant rate without relation to anything external... Absolute space... without relation to anything external, remains always similar and immovable. -- Isaac Newton (tr. Andrew Motte)

The principle of relativity

Galileo's most important physical discovery was that motion is relative. With modern hindsight, we restate this in a way that shows what made the teenage Einstein suspicious:

The principle of Galilean relativity: Matter obeys the same laws of physics in any inertial frame of reference, regardless of the frame's orientation, position, or constant-velocity motion.

Note that it only refers to matter, not light.

Einstein's professors taught that light waves obeyed an entirely different set of rules than material objects. They believed that light waves were a vibration of a mysterious medium called the ether, and that the speed of light should be interpreted as a speed relative to this ether. Thus although the cornerstone of the study of matter had for two centuries been the idea that motion is relative, the science of light seemed to contain a concept that a certain frame of reference was in an absolute state of rest with respect to the ether, and was therefore to be preferred over moving frames.

Now let's think about Albert Einstein's daydream of riding a motorcycle alongside a beam of light. In cyclist Albert's frame of reference, the light wave appears to be standing still. However, James Clerk Maxwell had already constructed a highly successful mathematical description of light waves as patterns of electric and magnetic fields. Einstein on his motorcycle can stick measuring instruments into the wave to monitor the electric and magnetic fields, and they will be constant at any given point. But an electromagnetic wave pattern standing frozen in space like this violates Maxwell's equations and cannot exist. Maxwell's equations say that light waves always move with the same velocity, notated c, equal to 3.0×10^8 m/s. Einstein could not tolerate this disagreement between the treatment of relative and absolute motion in the theories of matter on the one hand and light on the other. He decided to rebuild physics with a single guiding principle:

Einstein's principle of relativity: Both light and matter obey the same laws of physics in any inertial frame of reference, regardless of the frame's orientation, position, or constant-velocity motion.


c / Albert Michelson, in 1887, the year of the Michelson-Morley experiment.


d / George FitzGerald, 1851-1901.


e / Hendrik Lorentz, 1853-1928.

Distortion of time and space

This is hard to swallow. If a dog is running away from me at 5 m/s relative to the sidewalk, and I run after it at 3 m/s, the dog's velocity in my frame of reference is 2 m/s. According to everything we have learned about motion, the dog must have different speeds in the two frames: 5 m/s in the sidewalk's frame and 2 m/s in mine. How, then, can a beam of light have the same speed as seen by someone who is chasing the beam?

a / The Michelson-Morley experiment, shown in photographs, and drawings from the original 1887 paper. 1. A simplified drawing of the apparatus. A beam of light from the source, s, is partially reflected and partially transmitted by the half-silvered mirror h1. The two half-intensity parts of the beam are reflected by the mirrors at a and b, reunited, and observed in the telescope, t. If the earth's surface was supposed to be moving through the ether, then the times taken by the two light waves to pass through the moving ether would be unequal, and the resulting time lag would be detectable by observing the interference between the waves when they were reunited. 2. In the real apparatus, the light beams were reflected multiple times. The effective length of each arm was increased to 11 meters, which greatly improved its sensitivity to the small expected difference in the speed of light. 3. In an earlier version of the experiment, they had run into problems with its “extreme sensitiveness to vibration,” which was “so great that it was impossible to see the interference fringes except at brief intervals ... even at two o'clock in the morning.” They therefore mounted the whole thing on a massive stone floating in a pool of mercury, which also made it possible to rotate it easily. 4. A photo of the apparatus. Note that it is underground, in a room with solid brick walls.

In fact the strange constancy of the speed of light had shown up in the now-famous Michelson-Morley experiment of 1887. Michelson and Morley set up a clever apparatus to measure any difference in the speed of light beams traveling east-west and north-south. The motion of the earth around the sun at 110,000 km/hour (about 0.01% of the speed of light) is to our west during the day. Michelson and Morley believed in the ether hypothesis, so they expected that the speed of light would be a fixed value relative to the ether. As the earth moved through the ether, they thought they would observe an effect on the velocity of light along an east-west line. For instance, if they released a beam of light in a westward direction during the day, they expected that it would move away from them at less than the normal speed because the earth was chasing it through the ether. They were surprised when they found that the expected 0.01% change in the speed of light did not occur.

Although the Michelson-Morley experiment was nearly two decades in the past by the time Einstein published his first paper on relativity in 1905, it's unclear how much it influenced Einstein. Michelson and Morley themselves were uncertain about whether the result was to be trusted, or whether systematic and random errors were masking a real effect from the ether. There were a variety of competing theories, each of which could claim some support from the shaky data. Some physicists believed that the ether could be dragged along by matter moving through it, which inspired variations on the experiment that were conducted on mountaintops in thin-walled buildings, b, or with one arm of the apparatus out in the open, and the other surrounded by massive lead walls. In the standard sanitized textbook version of the history of science, every scientist does his experiments without any preconceived notions about the truth, and any disagreement is quickly settled by a definitive experiment. In reality, this period of confusion about the Michelson-Morley experiment lasted for four decades, and a few reputable skeptics, including Miller, continued to believe that Einstein was wrong, and kept trying different variations of the experiment as late as the 1920's. Most of the remaining doubters were convinced by an extremely precise version of the experiment performed by Joos in 1930, although you can still find kooks on the internet who insist that Miller was right, and that there was a vast conspiracy to cover up his results.
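The 0.01% figure quoted above for the earth's orbital speed can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `fraction_of_c` is made up for this example, and the speed of light is rounded to 3.0×10^8 m/s as elsewhere in this chapter.

```python
# Compare the earth's orbital speed with the speed of light.
C_M_PER_S = 3.0e8                # speed of light, rounded as in the text
EARTH_SPEED_KM_PER_H = 110_000   # orbital speed quoted in the text


def fraction_of_c(speed_km_per_h: float) -> float:
    """Convert a speed in km/hour to a fraction of the speed of light."""
    speed_m_per_s = speed_km_per_h * 1000.0 / 3600.0
    return speed_m_per_s / C_M_PER_S


ratio = fraction_of_c(EARTH_SPEED_KM_PER_H)
print(f"v/c = {ratio:.2e}, i.e. about {100 * ratio:.2f}% of c")
```

The ratio comes out close to one part in ten thousand, which is the size of the effect Michelson and Morley expected to see.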

b / Dayton Miller thought that the result of the Michelson-Morley experiment could be explained because the ether had been pulled along by the dirt, and the walls of the laboratory. This motivated him to carry out a series of experiments at the top of Mount Wilson, in a building with thin walls.

Before Einstein, some physicists who did believe the negative result of the Michelson-Morley experiment came up with explanations that preserved the ether. In the period from 1889 to 1895, Hendrik Lorentz and George FitzGerald suggested that the negative result of the Michelson-Morley experiment could be explained if the earth, and every physical object on its surface, was contracted slightly by the strain of the earth's motion through the ether.1

How did Einstein explain this strange refusal of light waves to obey the usual rules of addition and subtraction of velocities due to relative motion? He had the originality and bravery to suggest a radical solution. He decided that space and time must be stretched and compressed as seen by observers in different frames of reference. Since velocity equals distance divided by time, an appropriate distortion of time and space could cause the speed of light to come out the same in a moving frame. This conclusion could have been reached by the physicists of two generations before, on the day after Maxwell published his theory of light, but the attitudes about absolute space and time stated by Newton were so strongly ingrained that such a radical approach did not occur to anyone before Einstein.

Time distortion

Consider the situation shown in figure f. Aboard a rocket ship we have a tube with mirrors at the ends. If we let off a flash of light at the bottom of the tube, it will be reflected back and forth between the top and bottom. It can be used as a clock: by counting the number of times the light goes back and forth we get an indication of how much time has passed. (This may not seem very practical, but a real atomic clock does work on essentially the same principle.) Now imagine that the rocket is cruising at a significant fraction of the speed of light relative to the earth. Motion is relative, so for a person inside the rocket, f/1, there is no detectable change in the behavior of the clock, just as a person on a jet plane can toss a ball up and down without noticing anything unusual. But to an observer in the earth's frame of reference, the light appears to take a zigzag path through space, f/2, increasing the distance the light has to travel.

f / A light beam bounces between two mirrors in a spaceship.

If we didn't believe in the principle of relativity, we could say that the light just goes faster according to the earthbound observer. Indeed, this would be correct if the speeds were not close to the speed of light, and if the thing traveling back and forth was, say, a ping-pong ball. But according to the principle of relativity, the speed of light must be the same in both frames of reference. We are forced to conclude that time is distorted, and the light-clock appears to run more slowly than normal as seen by the earthbound observer. In general, a clock appears to run most quickly for observers who are in the same state of motion as the clock, and runs more slowly as perceived by observers who are moving relative to the clock.

We can easily calculate the size of this time-distortion effect. In the frame of reference shown in figure f/1, moving with the spaceship, let t_1 be the time required for the beam of light to move from the bottom to the top. An observer on the earth, who sees the situation shown in figure f/2, disagrees, and says this motion took a longer time t_2. Let v be the velocity of the spaceship relative to the earth. In frame 2, the light beam travels along the hypotenuse of a right triangle whose base has length

base = vt_2 .

Observers in the two frames of reference agree on the vertical distance traveled by the beam, i.e., the height of the triangle perceived in frame 2, and an observer in frame 1 says that this height is the distance covered by a light beam in time t_1, so the height is

height = ct_1 .

The hypotenuse of this triangle is the distance the light travels in frame 2,

hypotenuse = ct_2 .

Using the Pythagorean theorem, we can relate these three quantities, and solving for t_2 we find

 t_2 = \frac{t_1}{\sqrt{1-\left(v/c\right)^2}} .
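The algebra behind this result is short enough to write out. Starting from the Pythagorean theorem applied to the base, height, and hypotenuse given above, the steps are:

```latex
\begin{align*}
(ct_2)^2 &= (vt_2)^2 + (ct_1)^2 \\
t_2^2\left(c^2 - v^2\right) &= c^2 t_1^2 \\
t_2^2 &= \frac{t_1^2}{1-\left(v/c\right)^2} \\
t_2 &= \frac{t_1}{\sqrt{1-\left(v/c\right)^2}}
\end{align*}
```

The positive square root is taken in the last step because both times are positive.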

The amount of distortion is given by the factor 1/\sqrt{1-\left(v/c\right)^2}, and this quantity appears so often that we give it a special name, γ (Greek letter gamma),

 \gamma = \frac{1}{\sqrt{1-\left(v/c\right)^2}} . \qquad \text{[definition of the $\gamma$ factor]}


g / The behavior of the γ factor.
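For readers without the graph in front of them, the behavior of γ can be tabulated numerically. This is a minimal sketch, not part of the original text: the function name `gamma` and the sample speeds are chosen only for illustration, and speeds are given as fractions of c.

```python
import math


def gamma(v_over_c: float) -> float:
    """Return the gamma factor for a speed given as a fraction of c."""
    if not abs(v_over_c) < 1.0:
        raise ValueError("this sketch only handles speeds below c")
    return 1.0 / math.sqrt(1.0 - v_over_c ** 2)


# Sample speeds, from everyday-slow (on this scale) to highly relativistic.
for beta in (0.1, 0.5, 0.9, 0.99, 0.995):
    print(f"v/c = {beta:<5}  gamma = {gamma(beta):.3f}")
```

The table shows the same qualitative features as the graph: γ stays close to 1 until the speed is a large fraction of c, and then rises steeply.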

self-check: What is γ when v=0? What does this mean? (answer in the back of the PDF version of the book)

Distortion of space

The speed of light is supposed to be the same in all frames of reference, and a speed is a distance divided by the time. We can't change time without changing distance, since then the speed couldn't come out the same. A rigorous treatment requires some delicacy, but we postpone that to section 7.2 and state for now the apparently reasonable result that if time is distorted by a factor of γ, then lengths must also be distorted according to the same ratio. An object in motion appears longest to someone who is at rest with respect to it, and is shortened along the direction of motion as seen by other observers.
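As a numerical illustration of the length distortion just stated, the sketch below divides a rest length by γ. The function name `contracted_length` and the sample numbers are assumptions made for this example, and the rule applied is the provisional one stated above, whose rigorous justification is postponed to section 7.2.

```python
import math


def contracted_length(rest_length: float, v_over_c: float) -> float:
    """Length along the direction of motion, as seen by an observer
    relative to whom the object moves at v_over_c (a fraction of c)."""
    g = 1.0 / math.sqrt(1.0 - v_over_c ** 2)
    return rest_length / g


# A 1-meter stick moving at 60% of the speed of light.
print(contracted_length(1.0, 0.6))
```

Only the dimension along the direction of motion is affected; the function says nothing about the transverse dimensions, which the light-clock argument assumed to be unchanged.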


h / Decay of muons created at rest with respect to the observer.


i / Decay of muons moving at a speed of 0.995c with respect to the observer.


Discussion question B

Applications

Nothing can go faster than the speed of light.

What happens if we want to send a rocket ship off at, say, twice the speed of light, v=2c? Then γ will be 1/\sqrt{-3}. But your math teacher has always cautioned you about the severe penalties for taking the square root of a negative number. The result would be physically meaningless, so we conclude that no object can travel faster than the speed of light. Even travel exactly at the speed of light appears to be ruled out for material objects, since then γ would be infinite.
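A calculator or programming language enforces the math teacher's warning quite literally. The following sketch, with variable names chosen only for illustration, tries to evaluate γ for v=2c, first with real arithmetic and then with complex arithmetic.

```python
import cmath
import math

v_over_c = 2.0
radicand = 1.0 - v_over_c ** 2   # equals -3 for v = 2c

# The real-valued square root refuses a negative argument.
try:
    math.sqrt(radicand)
    real_root_exists = True
except ValueError:
    real_root_exists = False

# Complex arithmetic gives an answer, but it is purely imaginary,
# which cannot be a factor relating two measured time intervals.
gamma_complex = 1.0 / cmath.sqrt(radicand)

print(real_root_exists, gamma_complex)
```

Either way, no real value of γ exists for a speed greater than c.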

Einstein had therefore found a solution to his original paradox about riding on a motorcycle alongside a beam of light, resulting in a violation of Maxwell's theory of electromagnetism. The paradox is resolved because it is impossible for the motorcycle to travel at the speed of light.

Most people, when told that nothing can go faster than the speed of light, immediately begin to imagine methods of violating the rule. For instance, it would seem that by applying a constant force to an object for a long time, we could give it a constant acceleration, which would eventually cause it to go faster than the speed of light. We will take up these issues in section 7.3.

Cosmic-ray muons

A classic experiment to demonstrate time distortion uses observations of cosmic rays. Cosmic rays are protons and other atomic nuclei from outer space. When a cosmic ray happens to come the way of our planet, the first earth-matter it encounters is an air molecule in the upper atmosphere. This collision then creates a shower of particles that cascade downward and can often be detected at the earth's surface. One of the more exotic particles created in these cosmic ray showers is the muon (named after the Greek letter mu, μ). The reason muons are not a normal part of our environment is that a muon is radioactive, lasting only 2.2 microseconds on the average before changing itself into an electron and two neutrinos. A muon can therefore be used as a sort of clock, albeit a self-destructing and somewhat random one! Figures h and i show the average rate at which a sample of muons decays, first for muons created at rest and then for high-velocity muons created in cosmic-ray showers. The second graph is found experimentally to be stretched out by a factor of about ten, which matches well with the prediction of relativity theory:

 \gamma = 1/\sqrt{1-(v/c)^2}

 = 1/\sqrt{1-(0.995)^2}

 \approx 10

Since a muon takes many microseconds to pass through the atmosphere, the result is a marked increase in the number of muons that reach the surface.
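The numbers in this example can be reproduced directly. In the sketch below, the 2.2 microsecond mean lifetime and the speed 0.995c come from the text, while the 30 microsecond flight time is a hypothetical value chosen only to stand in for "many microseconds"; the function name `surviving_fraction` is likewise an assumption of this example.

```python
import math

MUON_MEAN_LIFETIME_US = 2.2   # microseconds, from the text
V_OVER_C = 0.995              # muon speed as a fraction of c, from the text
FLIGHT_TIME_US = 30.0         # hypothetical flight time in the earth's frame

gamma = 1.0 / math.sqrt(1.0 - V_OVER_C ** 2)
dilated_lifetime_us = gamma * MUON_MEAN_LIFETIME_US


def surviving_fraction(flight_time_us: float, mean_lifetime_us: float) -> float:
    """Fraction of a sample still undecayed after flight_time_us,
    assuming exponential decay with the given mean lifetime."""
    return math.exp(-flight_time_us / mean_lifetime_us)


without_dilation = surviving_fraction(FLIGHT_TIME_US, MUON_MEAN_LIFETIME_US)
with_dilation = surviving_fraction(FLIGHT_TIME_US, dilated_lifetime_us)

print(f"gamma = {gamma:.2f}")
print(f"mean lifetime in the earth's frame = {dilated_lifetime_us:.1f} microseconds")
print(f"surviving fraction without time dilation: {without_dilation:.1e}")
print(f"surviving fraction with time dilation:    {with_dilation:.2f}")
```

For any flight time of many microseconds, the surviving fraction is far larger with time dilation than without it, which is the marked increase described above.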


j / Light curves of supernovae, showing a time-dilation effect for supernovae that are in motion relative to us.

Time dilation for objects larger than the atomic scale

Our world is (fortunately) not full of human-scale objects moving at significant speeds compared to the speed of light. For this reason, it took over 80 years after Einstein's theory was published before anyone could come up with a conclusive example of drastic time dilation that wasn't confined to cosmic rays or particle accelerators. Recently, however, astronomers have found definitive proof that entire stars undergo time dilation. The universe is expanding in the aftermath of the Big Bang, so in general everything in the universe is getting farther away from everything else. One need only find an astronomical process that takes a standard amount of time, and then observe how long it appears to take when it occurs in a part of the universe that is receding from us rapidly. A type of exploding star called a type Ia supernova fills the bill, and technology is now sufficiently advanced to allow them to be detected across vast distances. Figure j shows convincing evidence for time dilation in the brightening and dimming of two distant supernovae.

The twin paradox

A natural source of confusion in understanding the time-dilation effect is summed up in the so-called twin paradox, which is not really a paradox. Suppose there are two teenaged twins, and one stays at home on earth while the other goes on a round trip in a spaceship at relativistic speeds (i.e., speeds comparable to the speed of light, for which the effects predicted by the theory of relativity are important). When the traveling twin gets home, he has aged only a few years, while his brother is now old and gray. (Robert Heinlein even wrote a science fiction novel on this topic, although it is not one of his better stories.)

The “paradox” arises from an incorrect application of the principle of relativity to a description of the story from the traveling twin's point of view. From his point of view, the argument goes, his homebody brother is the one who travels backward on the receding earth, and then returns as the earth approaches the spaceship again, while in the frame of reference fixed to the spaceship, the astronaut twin is not moving at all. It would then seem that the twin on earth is the one whose biological clock should tick more slowly, not the one on the spaceship. The flaw in the reasoning is that the principle of relativity only applies to frames that are in motion at constant velocity relative to one another, i.e., inertial frames of reference. The astronaut twin's frame of reference, however, is noninertial, because his spaceship must accelerate when it leaves, decelerate when it reaches its destination, and then repeat the whole process again on the way home. Their experiences are not equivalent, because the astronaut twin feels accelerations and decelerations. A correct treatment requires some mathematical complication to deal with the changing velocity of the astronaut twin, but the result is indeed that it's the traveling twin who is younger when they are reunited.

The twin “paradox” really isn't a paradox at all. It may even be a part of your ordinary life. The effect was first verified experimentally by synchronizing two atomic clocks in the same room, and then sending one for a round trip on a passenger jet. (They bought the clock its own ticket and put it in its own seat.) The clocks disagreed when the traveling one got back, and the discrepancy was exactly the amount predicted by relativity. The effects are strong enough to be important for making the global positioning system (GPS) work correctly. If you've ever taken a GPS receiver with you on a hiking trip, then you've used a device that has the twin “paradox” programmed into its calculations. Your handheld GPS box talks to a system onboard a satellite, and the satellite is moving fast enough that its time dilation is an important effect. So far no astronauts have gone fast enough to make time dilation a dramatic effect in terms of the human lifetime. The effect on the Apollo astronauts, for instance, was only a fraction of a second, since their speeds were still fairly small compared to the speed of light. (As far as I know, none of the astronauts had twin siblings back on earth!)

An example of length contraction

Figure k shows an artist's rendering of the length contraction for the collision of two gold nuclei at relativistic speeds in the RHIC accelerator on Long Island, New York, which went on line in 2000. The gold nuclei would appear nearly spherical (or just slightly lengthened like an American football) in frames moving along with them, but in the laboratory's frame, they both appear drastically foreshortened as they approach the point of collision. The later pictures show the nuclei merging to form a hot soup, in which experimenters hope to observe a new form of matter.


k / Colliding nuclei show relativistic length contraction.

Discussion Questions

A. ◊ On a spaceship moving at relativistic speeds, would a lecture seem even longer and more boring than normal?

B. A question that students often struggle with is whether time and space can really be distorted, or whether it just seems that way. Compare with optical illusions or magic tricks. How could you verify, for instance, that the lines in the figure are actually parallel? Are relativistic effects the same or not?

C. ◊ If you were in a spaceship traveling at the speed of light (or extremely close to the speed of light), would you be able to see yourself in a mirror?

D. ◊ Mechanical clocks can be affected by motion. For example, it was a significant technological achievement to build a clock that could sail aboard a ship and still keep accurate time, allowing longitude to be determined. How is this similar to or different from relativistic time dilation?

E. What would the shapes of the two nuclei in the RHIC experiment look like to a microscopic observer riding on the left-hand nucleus? To an observer riding on the right-hand one? Can they agree on what is happening? If not, why not --- after all, shouldn't they see the same thing if they both compare the two nuclei side-by-side at the same instant in time?

F. If you stick a piece of foam rubber out the window of your car while driving down the freeway, the wind may compress it a little. Does it make sense to interpret the relativistic length contraction as a type of strain that pushes an object's atoms together like this? How does this relate to discussion question E?

7.2 The Lorentz transformation


a / Two observers describe the same landscape with different coordinate systems.

Coordinate transformations in general

In section 7.1 the emphasis was on demonstrating some of the fundamental relativistic phenomena, without getting tangled up in too much mathematics. However, the issues that were glossed over there would come back to bite us if we never examined them carefully, and we haven't yet seen the full extent of relativity's attack on the traditional and intuitive concepts of space and time.

Rotation

For guidance, let's look at the mathematical treatment of the part of the principle of relativity that states that the laws of physics are the same regardless of the orientation of the coordinate system. Suppose that two observers are in frames of reference that are at rest relative to each other, and they set up coordinate systems with their origins at the same point, but rotated by 90 degrees, as in figure a. To go back and forth between the two systems, we can use the equations

 x' = y

 y' = - x

A set of equations such as this one for changing from one system of coordinates to another is called a coordinate transformation, or just a transformation for short.

Similarly, if the coordinate systems differed by an angle of 5 degrees, we would have

 x' = (\cos 5^\circ) x + (\sin 5^\circ) y

 y' = (-\sin 5^\circ) x + (\cos 5^\circ) y

Since cos 5°=0.997 is very close to one, and sin 5°=0.087 is close to zero, the rotation through a small angle has only a small effect, which makes sense. The equations for rotation are always of the form

 x' = \text{(constant \#1)} x + \text{(constant \#2)} y

 y' = \text{(constant \#3)} x + \text{(constant \#4)} y .
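The rotation equations above can be tried out numerically. The sketch below is an illustration only; the function name `rotate_coordinates` is made up here, and the sign convention is the one used in the 90-degree and 5-degree examples in the text.

```python
import math


def rotate_coordinates(x: float, y: float, angle_deg: float) -> tuple[float, float]:
    """Coordinates (x', y') of a fixed point in a system rotated by
    angle_deg relative to the (x, y) system, using the text's convention."""
    theta = math.radians(angle_deg)
    x_new = math.cos(theta) * x + math.sin(theta) * y
    y_new = -math.sin(theta) * x + math.cos(theta) * y
    return x_new, y_new


# A 90-degree rotation should give x' = y and y' = -x (up to rounding).
print(rotate_coordinates(3.0, 4.0, 90.0))

# A 5-degree rotation changes the coordinates only slightly.
print(rotate_coordinates(3.0, 4.0, 5.0))
```

In both cases the four constants depend only on the angle, and the distance of the point from the origin is the same in either coordinate system.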

Galilean transformation for frames moving relative to each other

Einstein wanted to see if he could find a rule for changing between coordinate systems that were moving relative to each other. As a second warming-up example, let's look at the transformation between frames of reference in relative motion according to Galilean relativity, i.e., without any distortion of space and time. Suppose the x' axis is moving to the right at a velocity v relative to the x axis. The transformation is simple:

 x' = x - vt

 t' = t

Again we have an equation with constants multiplying the variables, but now the variables are distance and time. The interpretation of the -vt term is that the observer moving with the origin of the x' system sees a steady reduction in distance to an object on the right and at rest in the x system. In other words, the object appears to be moving according to the x' observer, but at rest according to x. The fact that the constant in front of x in the first equation equals one tells us that there is no distortion of space according to Galilean relativity, and similarly the second equation tells us there is no distortion of time.
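The interpretation of the -vt term can be seen by applying the Galilean transformation to an object at rest in the x system. This is a minimal sketch; the function name `galilean_transform` and the sample numbers are assumptions of the example, and any consistent set of units may be used.

```python
def galilean_transform(x: float, t: float, v: float) -> tuple[float, float]:
    """Transform an event (x, t) to a frame whose origin moves to the
    right at velocity v, with no distortion of space or time."""
    return x - v * t, t


# An object sitting at x = 10 in the unprimed frame, observed from a
# frame moving to the right at v = 2.
for t in (0.0, 1.0, 2.0, 3.0):
    x_prime, t_prime = galilean_transform(10.0, t, 2.0)
    print(f"t' = {t_prime}  x' = {x_prime}")
```

The primed coordinate of the object shrinks steadily while the time coordinate is untouched, so the primed observer describes the object as approaching at speed v.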


b / The x,t frame is defined from the asteroid, and the x',t' frame from the astronaut.

Derivation of the Lorentz transformation

Guided by analogy, Einstein decided to look for a transformation between frames in relative motion that would have the form

 x' = Ax + Bt

 t' = Cx + D t qquad .

(Any form more complicated than this, for example equations including x^2 or t^2 terms, would violate the part of the principle of relativity that says the laws of physics are the same in all locations.) For historical reasons, this is called a Lorentz transformation. The constants A, B, C, and D would depend only on the relative velocity, v, of the two frames. Galilean relativity had been amply verified by experiment for values of v much less than the speed of light, so at low speeds we must have A≈1, B≈ -v, C≈0, and D≈1. For high speeds, however, the constants A and D would start to become measurably different from 1, providing the distortions of time and space needed so that the speed of light would be the same in all frames of reference.

self-check: What units would the constants A, B, C, and D need to have? (answer in the back of the PDF version of the book)

Natural units

Despite the reputation for difficulty of Einstein's theories, the derivation of Einstein's transformations is fairly straightforward. The algebra, however, can appear more cumbersome than necessary unless we adopt a choice of units that is better adapted to relativity than the metric units of meters and seconds. The form of the transformation equations shows that time and space are not even entirely separate entities. Life is easier if we adopt a new set of units:

- Time is measured in seconds.
- Distance is also measured in units of seconds. A distance of one second is how far light travels in one second of time.

In these units, the speed of light equals one by definition:

 c = \frac{\text{1 second of distance}}{\text{1 second of time}} = 1

All velocities are represented by unitless numbers in this system, so for example v=0.5 would describe motion at half the speed of light.
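Converting everyday quantities into these units amounts to dividing by c. The helper names below are hypothetical, and the speed of light is rounded to 3.0×10^8 m/s as in section 7.1; this is a sketch of the bookkeeping, not a prescribed procedure.

```python
C_M_PER_S = 3.0e8   # speed of light in metric units, rounded as in the text


def distance_to_seconds(distance_m: float) -> float:
    """Express a distance in seconds: the time light needs to cover it."""
    return distance_m / C_M_PER_S


def velocity_to_natural(v_m_per_s: float) -> float:
    """Express a velocity as a unitless fraction of the speed of light."""
    return v_m_per_s / C_M_PER_S


print(distance_to_seconds(3.0e8))    # one light-second of distance
print(velocity_to_natural(1.5e8))    # half the speed of light
```

Going back to metric units is the same operation in reverse, multiplying by c.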

Example 1: Converting a formula from ordinary units to natural units

In ordinary units, the equation for the Lorentz factor γ is

 gamma = frac{1}{sqrt{1-frac{v^2}{c^2}}} qquad .

Suppose we want to reexpress this in natural units. One way of doing it would be to redo the derivation on page 372, but with the simplifying assumption of c=1. However, this would just mean eliminating any c that appears in a multiplication or division, so rather than retracing the trail of breadcrumbs, we can just eliminate the c's from the final result:

 \gamma = \frac{1}{\sqrt{1-v^2}} \qquad .

Example 2: Converting a formula from natural units to ordinary units

In reality, the reason for using natural units in the first place is to make derivations simpler. Therefore a much more common situation is that you get a formula in natural units as the result of some symbolic calculation, but then you need to convert it to ordinary units in order to plug in numbers that you have in ordinary units. Working in the opposite direction, we observe that the equation γ=1/√(1-v²) doesn't make any sense in metric units, because you can't take a unitless number like 1 and subtract from it a quantity that has units of m²/s². That would be like subtracting three gallons from seven miles! Even if we don't remember how the formula was derived, we know that the derivation in natural units and the derivation in ordinary units could only have differed by the presence or absence of factors of c in various places. Therefore, we know we can recover the result in metric units simply by inserting factors of c wherever they're needed in order to turn the nonsense into sense. One way of doing this would be to divide the v² term by c², which makes it into a unitless quantity that it's possible to subtract from 1. The result is γ=1/√(1-v²/c²). (It might seem like the result wouldn't be unique, since we could instead fix the 1 by multiplying it by c², giving c²-v² inside the square root. However, the units of the right-hand side of the equation would then be s/m, so we'd also need to change the left-hand side to γ/c, and then the result would be exactly equivalent to γ=1/√(1-v²/c²).)

Example 3: A black hole

Here's an example where you don't even have the option of rederiving the equation from scratch. A black hole of mass m has an invisible spherical boundary of radius r surrounding it, and any object that comes in closer than that can never escape. The radius is given in natural units by r=2Gm, where G is Newton's gravitational constant. All of this can be partly, but not completely, explained using special relativity. For instance, we can try calculating the distance at which escape velocity becomes greater than the speed of light. However, this would be a swindle, because special relativity doesn't include gravity --- to get a correct relativistic treatment of gravity, we'd need general relativity, which is beyond the scope of this book. (One way you can tell that the naive calculation using escape velocity isn't really correct is that it makes it sound as though an object could still be hoisted out of a black hole on a cable, but that's actually not true according to general relativity.) But even though you don't know enough physics to derive the equation correctly from scratch, you can still convert it to metric units. The units of G are m³/(kg⋅s²), so the units of Gm are m³/s². This doesn't equal meters, so the equation r=2Gm is nonsense if you interpret it directly in metric units. However, the units do work if you change it to r=2Gm/c², so that's what the equation must be in metric units.
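The three examples above can be tried out numerically. The following is a minimal sketch, not part of the original text: the function names are made up for illustration, the values of c and G are the usual rounded ones, and the solar mass is just a convenient test input.

```python
import math

C = 2.998e8        # speed of light in m/s (rounded; assumed value)
G = 6.674e-11      # Newton's gravitational constant in m^3/(kg s^2) (rounded)

def gamma_natural(v):
    """Lorentz factor with v given as a unitless fraction of c."""
    return 1.0 / math.sqrt(1.0 - v**2)

def gamma_metric(v_mps):
    """Lorentz factor with v in m/s; dividing by c^2 makes v^2 unitless."""
    return 1.0 / math.sqrt(1.0 - v_mps**2 / C**2)

def horizon_radius_metric(mass_kg):
    """r = 2Gm/c^2, the metric-unit form of r = 2Gm; result in meters."""
    return 2.0 * G * mass_kg / C**2

SOLAR_MASS = 1.989e30  # kg, illustrative input only

# The same physical speed gives the same gamma in either system of units.
print(gamma_natural(0.5), gamma_metric(0.5 * C))

# For one solar mass the radius comes out to roughly 3 km.
print(horizon_radius_metric(SOLAR_MASS))
```

Note that the only difference between the two versions of γ is the factor of c² that was inserted to make the units consistent.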

Derivation of the Lorentz transformation

We now want to find out how the constants A, B, C, and D in the transformation equations

 x' = Ax + Bt

 t' = Cx + Dt

depend on velocity. For vividness, we imagine that the x,t frame is defined by an asteroid at x=0, and the x',t' frame by a rocket ship at x'=0. The rocket ship is coasting at a constant speed v relative to the asteroid, and as it passes the asteroid they synchronize their clocks to read t=0 and t'=0.

Asteroid time as perceived by the rocket

In section 7.1, we've already found that a clock seems to run more slowly by a factor of γ to an observer in motion with respect to the clock. A clock on the asteroid has x=0, so if the rocket pilot monitors the ticking of a clock on the asteroid via radio signals, the Lorentz transformation gives t'=Dt. The idea of time running more slowly by a factor of γ is expressed by t'=γ t, so we have

D=γ .

Asteroid's motion as seen by the rocket

Straightforward algebra can be used to reverse the transformation equations so that they give x and t in terms of x' and t'. The result for x is x=(Dx'-Bt')/(AD-BC). The asteroid's frame of reference has its origin defined by the asteroid itself, so the asteroid is always at x=0. In the rocket's frame, the asteroid falls behind according to the equation x'=-vt', and substituting this into the equation for x gives 0=(-Dvt'-Bt')/(AD-BC). This requires us to have B=-vD, or

B=-vγ .
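For readers who would like the "straightforward algebra" spelled out, here is one way it can go, sketched in LaTeX. It uses only the two transformation equations. The companion expression for t is not needed in the argument above and is included only for completeness.

```latex
\begin{align*}
x' &= Ax + Bt \\
t' &= Cx + Dt \\
Dx' - Bt' &= (AD - BC)\,x \quad\Rightarrow\quad x = \frac{Dx' - Bt'}{AD - BC} \\
At' - Cx' &= (AD - BC)\,t \quad\Rightarrow\quad t = \frac{At' - Cx'}{AD - BC}
\end{align*}
```

Setting x=0 and x'=-vt' in the expression for x makes the numerator -Dvt'-Bt' vanish, which is the condition B=-vD used above.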

So far in this derivation, we've been able to avoid talking about events that happen in different places and at different times, but we won't be able to avoid that anymore. We need to compare the perception of space and time by observers on the rocket and the asteroid, but this can be a bit tricky because our usual ideas about measurement contain hidden assumptions. If, for instance, we want to measure the length of a box, we imagine we can lay a ruler down on it, take in the scene visually, and take the measurement using the ruler's scale on the right side of the box while the left side of the box is simultaneously lined up with the butt of the ruler. The assumption that we can take in the whole scene at once with our eyes is, however, based on the assumption that light travels with infinite speed to our eyes. Since we will be dealing with relative motion at speeds comparable to the speed of light, we have to spell out our methods of measuring distance.

astronaut-and-remotes

c / To discuss distances and time intervals between different events, we imagine that each frame of reference has observers in more than one place.

We will therefore imagine an explicit procedure for the asteroid and the rocket pilot to make their distance measurements: they send electromagnetic signals (light or radio waves) back and forth to their own remote stations. For instance the asteroid's station will send it a message to tell it the time at which the rocket went by. The asteroid's station is at rest with respect to the asteroid, and the rocket's is at rest with respect to the rocket (and therefore in motion with respect to the asteroid).

The measurement of time is likewise fraught with danger if we are careless, which is why we have had to spell out procedures for the synchronization of clocks between the asteroid and the rocket. The asteroid must also synchronize its clock with its remote station's clock by adjusting them until flashes of light released by both the asteroid and its station at equal clock readings are received on the opposite sides at equal clock readings. The rocket pilot must go through the same kind of synchronization procedure with her remote station.

Rocket's motion as seen by the asteroid

The origin of the rocket's x',t' frame is defined by the rocket itself, so the rocket always has x'=0. Let the asteroid's remote station be at position x in the asteroid's frame. The asteroid sees the rocket travel at speed v, so the asteroid's remote station sees the rocket pass it when x equals vt. The equation x'=Ax+Bt then becomes 0=Avt+Bt, which implies a relationship between A and B: B=-Av. (In the Galilean version, we had B=-v and A=1.) Since we have already found B=-vγ, this gives

A=γ .

This boils down to a statement that length contraction occurs in the same proportion as time dilation, as we'd already argued less rigorously.

Agreement on the speed of light

Suppose the rocket pilot releases a flash of light in the forward direction as she passes the asteroid at t=t'=0. As seen in the asteroid's frame, we might expect this pulse to travel forward faster than normal because it was emitted by the moving rocket, but the principle of relativity tells us this is not so. The flash reaches the asteroid's remote station when x equals ct, and since we are working in natural units, this is equivalent to x=t. The speed of light must be the same in the rocket's frame, so we must also have x'=t' when the flash gets there. Setting x' =Ax+Bt equal to t' = Cx+ Dt and substituting in x=t, we find A+B=C+D, so we deduce C=B+A-D=B, or

C=-vγ .

We have now arrived at the correct relativistic equation for transforming between frames in relative motion. For completeness, I will include, without proof, the trivial transformations of the y and z coordinates.

 x' = \gamma x - v \gamma t

 t' = -v \gamma x + \gamma t

 y' = y

 z' = z

These equations are valid provided that (1) the two coordinate systems coincide at t=t'=0; and (2) the observer in the x',y',z',t' frame is moving at velocity v relative to the x,y,z,t frame, and the motion is in the direction of the x axis.
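As a sanity check on the x and t equations, here is a minimal numerical sketch in natural units. The function names and the choice v=0.6 are assumptions made for illustration. It checks three things that went into the derivation: a light ray with x=t also has x'=t', the asteroid at x=0 falls behind as x'=-vt', and transforming again with -v returns the original event.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1), valid for |v| < 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def lorentz(x, t, v):
    """Map an event (x, t) to (x', t') in a frame moving at velocity v along x."""
    g = gamma(v)
    return g * x - v * g * t, -v * g * x + g * t

v = 0.6  # illustrative relative velocity, for which gamma = 1.25

# A flash of light emitted at the origin satisfies x = t in the asteroid's frame.
x_light, t_light = lorentz(2.0, 2.0, v)
print(x_light, t_light)      # the two numbers agree, so x' = t' as well

# The asteroid sits at x = 0; in the rocket's frame it falls behind as x' = -v t'.
x_ast, t_ast = lorentz(0.0, 5.0, v)
print(x_ast, -v * t_ast)     # the two numbers agree

# Transforming back with -v undoes the transformation.
print(lorentz(x_light, t_light, -v))
```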

self-check: What happens to the Lorentz transformation in the case where v equals zero? (answer in the back of the PDF version of the book)

We now turn to some subversive consequences of these equations.

earthworldline

f / The world-line of the earth as it orbits the sun.

lightcone

g / The light cone.

historian

Discussion question A.

worldlinesc

Discussion question B.

Spacetime

No absolute time

The fact that the equation for time is not just t'=t tells us we're not in Kansas anymore --- Newton's concept of absolute time is dead. One way of understanding this is to think about the steps described for synchronizing the four clocks:

(1) The asteroid's clock --- call it A1 --- was synchronized with the clock on its remote station, A2.

(2) The rocket pilot synchronized her clock, R1, with A1, at the moment when she passed the asteroid.

(3) The clock on the rocket's remote station, R2, was synchronized with R1.

Now if A2 matches A1, A1 matches R1, and R1 matches R2, we would expect A2 to match R2. This cannot be so, however. The rocket pilot released a flash of light as she passed the asteroid. In the asteroid's frame of reference, that light had to travel the full distance to the asteroid's remote station before it could be picked up there. In the rocket pilot's frame of reference, however, the asteroid's remote station is rushing at her, perhaps at a sizeable fraction of the speed of light, so the flash has less distance to travel before the asteroid's station meets it. Suppose the rocket pilot sets things up so that R2 has just enough of a head start on the light flash to reach A2 at the same time the flash of light gets there. Clocks A2 and R2 cannot agree, because the time required for the light flash to get there was different in the two frames. Thus, two clocks that were initially in agreement will disagree later on.

simultaneity

d / Different observers don't agree that the flashes of light hit the front and back of the ship simultaneously.

No simultaneity

Part of the concept of absolute time was the assumption that it was valid to say things like, “I wonder what my uncle in Beijing is doing right now.” In the nonrelativistic world-view, clocks in Los Angeles and Beijing could be synchronized and stay synchronized, so we could unambiguously define the concept of things happening simultaneously in different places. It is easy to find examples, however, where events that seem to be simultaneous in one frame of reference are not simultaneous in another frame. In figure d, a flash of light is set off in the center of the rocket's cargo hold. According to a passenger on the rocket, the flashes have equal distances to travel to reach the front and back walls, so they get there simultaneously. But an outside observer who sees the rocket cruising by at high speed will see the flash hit the back wall first, because the wall is rushing up to meet it, and the forward-going part of the flash hit the front wall later, because the wall was running away from it. Only when the relative velocity of two frames is small compared to the speed of light will observers in those frames agree on the simultaneity of events.
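The cargo-hold example can be put into numbers with the Lorentz transformation. This is only a sketch: the half-length of the hold and the speed are made-up values, and the helper function is an illustrative name rather than anything defined in the text. In the rocket's frame the flash reaches both walls at the same t', and we transform those two events into the frame of the outside observer, who moves at velocity -v relative to the rocket.

```python
import math

def lorentz(x, t, v):
    """Map an event (x, t) to a frame moving at velocity v (natural units)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * x - v * g * t, -v * g * x + g * t

half_length = 1.0   # hypothetical distance from the flash to each wall, in seconds
v = 0.5             # hypothetical speed of the rocket relative to the outside observer

# Rocket frame: the flash leaves x' = 0 at t' = 0 and reaches both walls
# at the same time t' = half_length.
back_wall = (-half_length, half_length)
front_wall = (+half_length, half_length)

# The outside observer's frame moves at -v relative to the rocket.
_, t_back = lorentz(*back_wall, -v)
_, t_front = lorentz(*front_wall, -v)

# The back wall is hit first in the outside frame; the events are no longer simultaneous.
print(t_back, t_front)
```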

The garage paradox

One of the most famous of all the so-called relativity paradoxes has to do with our incorrect feeling that simultaneity is well defined. The idea is that one could take a schoolbus and drive it at relativistic speeds into a garage of ordinary size, in which it normally would not fit. Because of the length contraction, the bus would supposedly fit in the garage. The paradox arises when we shut the door and then quickly slam on the brakes of the bus. An observer in the garage's frame of reference will claim that the bus fit in the garage because of its contracted length. The driver, however, will perceive the garage as being contracted and thus even less able to contain the bus. The paradox is resolved when we recognize that the concept of fitting the bus in the garage “all at once” contains a hidden assumption, the assumption that it makes sense to ask whether the front and back of the bus can simultaneously be in the garage. Observers in different frames of reference moving at high relative speeds do not necessarily agree on whether things happen simultaneously. The person in the garage's frame can shut the door at an instant he perceives to be simultaneous with the front bumper's arrival at the back wall of the garage, but the driver would not agree about the simultaneity of these two events, and would perceive the door as having shut long after she plowed through the back wall.
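To see the disagreement about simultaneity in numbers, here is a rough sketch with invented values: a bus of rest length 10 moving at v=0.8, and a garage chosen so that the contracted bus just fits. In the garage's frame the door shuts at the same instant the front bumper reaches the back wall. Transforming both events into the bus's frame shows the door shutting later. None of these numbers or names come from the text.

```python
import math

def lorentz(x, t, v):
    """Map an event (x, t) to a frame moving at velocity v (natural units)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * x - v * g * t, -v * g * x + g * t

v = 0.8                                # hypothetical bus speed
g = 1.0 / math.sqrt(1.0 - v * v)       # gamma = 1/0.6
bus_rest_length = 10.0                 # hypothetical, in natural units of distance
garage_length = bus_rest_length / g    # chosen so the contracted bus just fits

# Garage frame: both events happen at t = 0.
door_shuts = (0.0, 0.0)                    # at the door, x = 0
bumper_hits_wall = (garage_length, 0.0)    # at the back wall

_, t_door = lorentz(*door_shuts, v)
_, t_bumper = lorentz(*bumper_hits_wall, v)
delay = t_door - t_bumper
print(delay)   # positive: in the bus's frame the door shuts after the bumper hits

# Cross-check in the bus's frame: the garage is contracted to garage_length/g,
# so the door still has to cover the rest of the bus at speed v.
print((bus_rest_length - garage_length / g) / v)
```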

schoolbus

e / In the garage's frame of reference, 1, the bus is moving, and can fit in the garage. In the bus's frame of reference, the garage is moving, and can't hold the bus.

Spacetime

We consider x, y, and z to be three axes in space. There can be no physical distinction between them, since rotation transformations like the ones given on page 379 can interchange or mix together the three coordinates. One observer can say that two different points in space have the same value of x, but a different observer working in a differently oriented coordinate system would say they have different x values; this shows that the x axis can't be singled out from the others in any physically meaningful way.

In relativity, Lorentz transformations can mix the space variables with the time variable, and different observers will not necessarily agree on whether two events are simultaneous, i.e., on whether they have the same t. Thus there is no unique, physically meaningful way to define a time axis and set it apart from the space axes. The three space coordinates and the time coordinate are really just four coordinates that allow us to describe points in a four-dimensional space, which we call spacetime.

What does this mean? We can't visualize four dimensions. One technique for visualizing spacetime is to ignore one of the space dimensions, cutting the total number back down to three. For instance, the earth orbits the sun within a certain plane, so if we define x and y axes within the orbital plane, the z axis is not very interesting, and we can ignore it for purposes of describing the earth's motion. If we visualize x, y, and t in three dimensions, the points in spacetime visited by the earth form a helical curve, f. The earth stays “above” the circle defined by its orbit in the x-y plane. The earth will visit the same x-y point over and over, but it never visits the same spacetime point again, because t changes by one year over each orbit. Every point in spacetime is called an event.

It might seem that the mixing of space and time is so insane that virtually anything can happen, but there is a good way of bringing order to the madness. Continuing with the same mode of visualization, imagine that at a certain moment at a certain point in space (at a certain event in spacetime), a flash of light is emitted. The light pulse travels outward in all directions, forming an expanding spherical shell. If we ignore one of the space dimensions, this becomes an expanding circle, like a ripple on a pond. In x-y-t space, the ripple becomes a cone, g. If we are present at the emission of the light pulse, then events inside this light cone are ones that we may be able to observe in the future. For instance, if we just stay put, we will be present at every event that lies along the axis of the cone. If we were to move off at 99.99999% of the speed of light, we could witness a bunch of the events along a world line lying just inside the cone. Events in the region of spacetime outside the light cone, however, are ones we can never experience firsthand. For instance, consider an event that is happening right now, in a galaxy far, far away. This event lies in the spacelike region outside the light cone, directly away from the tip of the cone in the direction perpendicular to its axis. We can't travel a large distance in an instant, since it's impossible to go faster than light, so we can never get to this event. The spacelike region consists of points whose distance from us in space is greater (in natural units) than their distance from us in time, so we can never visit them without traveling faster than light.

The great thing about the light cone is that everyone has to agree on it. A Lorentz transformation will skew and distort all of spacetime, but it will leave the light cone alone, since the light cone is defined by a flash of light, and all observers agree on the speed of light.

It's also possible to make a light cone that extends backward into the past. These are events that we can remember or get information about, but that we can never visit again because they lie in our past.

The spacetime interval

The light cone is helpful because it stays the same when you do a Lorentz transformation. Is there anything else that stays the same? For guidance, consider rotations. In a rotation, distances and angles stay the same. Now if you were an ant living on a telephone wire, you'd only know about one dimension, and the only type of rotation you'd be able to understand would be a 180-degree flip, which wouldn't change the lengths of line segments. But suppose there was some unsuspected second dimension to space. A one-meter line segment, with Δx=1.0 m, that was rotated 60 degrees into this second dimension would then have Δx=0.5 m. The existence of the newly discovered dimension has broken the rule that rotations don't change values of Δx. However, an ant named Pythagoras might realize that there was a new way to redefine distance, as √(Δx²+Δy²), so distances would stay the same in two-dimensional space. For reasons that will become apparent shortly, it turns out to be more convenient to work with the square of the distance, Δx²+Δy², which we call an interval. Generalizing to three dimensions is a snap: we just define the interval as Δx²+Δy²+Δz².

Now what about the generalization to four dimensions? The quantity Δx²+Δy²+Δz² doesn't stay the same when we do a Lorentz transformation --- distances get contracted. We might try Δx²+Δy²+Δz²+Δt² (in natural units), but that wouldn't work, as you can easily verify by trying an example. We already know that the light cone stays the same under a Lorentz transformation, and the light cone is defined by the equation (distance)/(time)=(speed of light), or (distance)=(time) in natural units, which is equivalent to

 \sqrt{\Delta x^2+\Delta y^2+\Delta z^2}=\Delta t

or

Δx²+Δy²+Δz²-Δt² = 0 .

With this motivation, we define the interval between two events in spacetime as Δx²+Δy²+Δz²-Δt². With this definition, the interval stays the same under a Lorentz transformation. Events with a zero interval between them are in each other's light cones. A positive interval indicates a spacelike relationship, and a negative one shows a timelike relationship. The possibility of a negative result is the reason for working with the quantity Δx²+Δy²+Δz²-Δt² rather than √(Δx²+Δy²+Δz²-Δt²).
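Here is the kind of example the text invites us to try, as a small sketch in natural units. The separations and the velocity are arbitrary made-up numbers, and the helper names are illustrative. Because the Lorentz transformation is linear, the differences Δx and Δt transform the same way as x and t.

```python
import math

def lorentz(x, t, v):
    """Map (x, t) to a frame moving at velocity v along x (natural units)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * x - v * g * t, -v * g * x + g * t

def interval(dx, dy, dz, dt):
    """Spacetime interval: space part squared minus time part squared."""
    return dx**2 + dy**2 + dz**2 - dt**2

def plus_version(dx, dy, dz, dt):
    """The tempting but wrong candidate with a plus sign on the time term."""
    return dx**2 + dy**2 + dz**2 + dt**2

dx, dy, dz, dt = 2.0, 0.5, 0.0, 1.0   # hypothetical separation between two events
v = 0.6                               # hypothetical relative velocity

dxp, dtp = lorentz(dx, dt, v)
dyp, dzp = dy, dz                     # y and z are unaffected

print(interval(dx, dy, dz, dt), interval(dxp, dyp, dzp, dtp))          # equal
print(plus_version(dx, dy, dz, dt), plus_version(dxp, dyp, dzp, dtp))  # not equal
```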

Example 4: The interval and the correspondence principle

What happens if we try to interpret the interval in a nonrelativistic context? The equation for the interval is expressed in natural units. Suppose we take two events in the everyday world. For instance, my dog barks, and I turn around to look at why he's barking. In natural units, both the Δt and the Δx between these two events would be expressed in units of seconds. The Δt is something like a second or half a second. The Δx would be a few meters in SI units, but in natural units, it converts to the time light would take to travel a few meters, which would be on the order of 10⁻⁸ s. We find, then, that the Δx makes a negligibly small contribution to the interval compared to the Δt. In other words, we never encounter spacelike or lightlike intervals in our everyday experience; all the intervals we experience directly are timelike. Nonrelativistically, when nothing is traveling at an appreciable fraction of the speed of light, the interval is essentially just -Δt².

How are we to interpret this? The interval is the same in all frames of reference, and in nonrelativistic situations, this means that Δ t must be the same for all observers. But this is exactly what we expect in Galilean relativity. In the Galilean transformations, the absolute nature of time is expressed in the equation t'=t. This is an example of the correspondence principle, which states that when a new physical theory supersedes an old one, it must be consistent with the old one within the old one's domain of validity.

Discussion Questions

The graphs for discussion questions A and B represent spacetime with one space dimension. Each square on the graph is one light-year wide and one year tall. In A, the dots represent civilizations in different times and on different planets. The planets all happen to lie along the same line, so we don't need y or z coordinates to show their locations. None of the planets are in rapid motion relative to each other, so we don't have to worry right now about whose frame of reference the graph depicts --- they all agree.

(1) How many different planets are represented on the graph?

(2) The black dot represents a historian. Draw the historian's light cone.

(3) The historian is interested in getting information that has been preserved by civilizations in her past, and also hopes to preserve information about her own civilization for those in her future. By what methods could this be accomplished for the events shown as white circles?

Which of the world lines are possible, and which are impossible? Could they represent light? Matter? What does this have to do with the light cone?

The graph below is unlike the other ones we've been considering because it represents two dimensions of space at a certain instant. In each case, show what happens to the letter R when you do the transformation. Are the laws of physics the same in the x', y' coordinates? In other words, if the transformation is suddenly applied to your physics lab, will experiments still come out the same?

The following graphs show one space dimension and one time dimension.

(1) In each case, apply the transformation

 x' = x+(0.2)t

 t' = t

to the indicated events.

(2) How would you interpret the meaning of the transformation?

(3) In each case are there special relationships between the two events? Do observers in these two frames of reference agree on these relationships?

dq-galilean

Discussion question D.

This is similar to discussion question D, but with the following transformation:

x' = (1.67)x+(-1.34)t

t' = (-1.34)x+(1.67)t

lorentz

h / Discussion question E.

7.3 Dynamics

So far we have said nothing about how to predict motion in relativity. Do Newton's laws still work? Do conservation laws still apply? The answer is yes, but many of the definitions need to be modified, and certain entirely new phenomena occur, such as the conversion of mass to energy and energy to mass, as described by the famous equation E=mc². To cut down on the level of mathematical detail, I have relegated most of the derivations to page 850, presenting mainly the results and their physical explanations in this section.

Invariants

The discussion has the potential to become very confusing very quickly because some quantities, force for example, are perceived differently by observers in different frames, whereas in Galilean relativity they were the same in all frames of reference. To clear the smoke it will be helpful to start by identifying quantities that we can depend on not to be different in different frames. We have already seen how the principle of relativity requires that the speed of light is the same in all frames of reference. We say that c is invariant.

Another important invariant is mass. This makes sense, because the principle of relativity states that physics works the same in all reference frames. The mass of an electron, for instance, is the same everywhere in the universe, so its numerical value is one of the basic laws of physics. We should therefore expect it to be the same in all frames of reference as well. (Just to make things more confusing, about 50% of all books say mass is invariant, while 50% describe it as changing. It is possible to construct a self-consistent framework of physics according to either description. Neither way is right or wrong; the two philosophies just require different sets of definitions of quantities like momentum and so on. For what it's worth, Einstein eventually weighed in on the mass-as-an-invariant side of the argument. The main thing is just to be consistent.)

A third invariant is electrical charge. This has been verified to high precision because experiments show that an electric field does not produce any measurable force on a hydrogen atom. If charge varied with speed, then the electron, typically orbiting at about 1% of the speed of light, would not exactly cancel the charge of the proton, and the hydrogen atom would have a net charge.

Combination of velocities

The impossibility of motion faster than light is a radical difference between relativistic and nonrelativistic physics, and we can get at most of the issues in this section by considering the flaws in various plans for going faster than light. The simplest argument of this kind is as follows. Suppose Janet takes a trip in a spaceship, and accelerates until she is moving at v=0.9 (90% of the speed of light in natural units) relative to the earth. She then launches a space probe in the forward direction at a speed u=0.2 relative to her ship. Isn't the probe then moving at a velocity of 1.1 times the speed of light relative to the earth?

The problem with this line of reasoning is that the distance covered by the probe in a certain amount of time is shorter as seen by an observer in the earthbound frame of reference, due to length contraction. Velocities are therefore combined not by simple addition but by a more complex method, which we derive on page 850 by performing two transformations in a row. In our example, the first transformation would be from the earth's frame to Janet's, the second from Janet's to the probe's. The result is

 v_{combined} = \frac{u+v}{1+uv} \qquad .

Example 5: Janet's probe

Applying the equation to Janet's probe, we find

 v_{combined} = \frac{0.9+0.2}{1+(0.9)(0.2)}

 = 0.93 \qquad ,

so it's still going quite a bit slower than the speed of light.

Example 6: Combination of velocities in unnatural units

In a system of units, like the metric system, with c≠1, all our symbols for velocity should be replaced with velocities divided by c, so we have

 \frac{v_{combined}}{c} = \frac{\frac{u}{c}+\frac{v}{c}}{1+\left(\frac{u}{c}\right)\left(\frac{v}{c}\right)} \qquad ,

or

 v_{combined} = \frac{u+v}{1+uv/c^2} \qquad .

When u and v are both much less than the speed of light, the quantity uv/c² is very close to zero, and we recover the nonrelativistic approximation, v_{combined}=u+v.

The second example shows the correspondence principle at work: when a new scientific theory replaces an old one, the two theories must agree within their common realm of applicability.
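Both forms of the velocity-combination rule are easy to try out. The following is a minimal sketch: the function names are made up for illustration, and the value of c is the usual rounded one.

```python
C = 2.998e8  # speed of light in m/s (rounded; assumed value)

def combine(u, v):
    """Combine two velocities given as unitless fractions of c."""
    return (u + v) / (1.0 + u * v)

def combine_metric(u, v, c=C):
    """Same rule with velocities in m/s; uv is divided by c^2 to make it unitless."""
    return (u + v) / (1.0 + u * v / c**2)

# Janet's probe: about 0.93, not 1.1.
print(combine(0.9, 0.2))

# Combining anything with the speed of light still gives the speed of light.
print(combine(1.0, 0.2))

# Everyday speeds: the correction is far too small to notice, so we get u + v.
print(combine_metric(10.0, 20.0))
```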

Momentum and force

Momentum

We begin our discussion of relativistic momentum with another scheme for going faster than light. Imagine that a freight train moving at a velocity of 0.6 (v=0.6c in unnatural units) strikes a ping-pong ball that is initially at rest, and suppose that in this collision no kinetic energy is converted into other forms such as heat and sound. We can easily prove based on conservation of momentum that in a very unequal collision of this kind, the smaller object flies off with double the velocity with which it was hit. (This is because the center of mass frame of reference is essentially the same as the frame tied to the freight train, and in the center of mass frame both objects must reverse their initial momenta.) So doesn't the ping-pong ball fly off with a velocity of 1.2, i.e., 20% faster than the speed of light?

The answer is that since p=mv led to this contradiction with the structure of relativity, p=mv must not be the correct equation for relativistic momentum. Apparently p=mv is only a low-velocity approximation to the correct relativistic result. We need to find a new expression for momentum that agrees approximately with p=mv at low velocities, and that also agrees with the principle of relativity, so that if the law of conservation of momentum holds in one frame of reference, it also is obeyed in every other frame. A proof is given on page 850 that such an equation is

 p = m\gamma v \quad , \quad \text{[relativistic equation for momentum]}

which differs from the nonrelativistic version only by the factor of γ. At low velocities γ is very close to 1, so p=mv is approximately true, in agreement with the correspondence principle. At velocities close to the speed of light, γ approaches infinity, and so an object would need infinite momentum to reach the speed of light.
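A quick numerical comparison shows both limits. This is only a sketch in natural units, with a made-up function name and an arbitrary unit mass.

```python
import math

def momentum(m, v):
    """Relativistic momentum p = m*gamma*v in natural units, for |v| < 1."""
    return m * v / math.sqrt(1.0 - v * v)

m = 1.0  # arbitrary unit mass, for illustration
for v in (0.001, 0.1, 0.6, 0.9, 0.999):
    # Columns: velocity, relativistic momentum, nonrelativistic mv
    print(v, momentum(m, v), m * v)
```

At the low end the two columns are nearly identical, while near v=1 the relativistic momentum grows without limit.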

Force

What happens if you keep applying a constant force to an object, causing it to accelerate at a constant rate until it exceeds the speed of light? The hidden assumption here is that Newton's second law, a=F/m, is still true. It isn't. Experiments show that at speeds comparable to the speed of light, a=F/m is wrong. The equation that still is true is

 \mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad .

You could apply a constant force to an object forever, increasing its momentum at a steady rate, but as the momentum approached infinity, the velocity would approach the speed of light. In general, a force produces an acceleration significantly less than F/m at relativistic speeds.
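The approach to the speed of light can be sketched numerically. The formula below is not given in the text. It is an assumption that follows from combining p=Ft (constant force, starting from rest) with p=mγv and solving for v, which gives v=p/√(m²+p²) in natural units. The function name and the input values are illustrative.

```python
import math

def velocity_under_constant_force(force, mass, t):
    """Velocity after time t, starting from rest (natural units).

    Assumes p = force * t and inverts p = m*gamma*v to get v = p / sqrt(m^2 + p^2).
    """
    p = force * t
    return p / math.sqrt(mass**2 + p**2)

force, mass = 1.0, 1.0   # arbitrary illustrative values
for t in (0.01, 0.1, 1.0, 10.0, 100.0):
    # Columns: time, relativistic velocity, Newtonian prediction Ft/m
    print(t, velocity_under_constant_force(force, mass, t), force * t / mass)
```

The Newtonian column passes 1 and keeps growing, while the relativistic velocity creeps up toward 1 without reaching it.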

Would passengers on a spaceship moving close to the speed of light perceive every object as being more difficult to accelerate, as if it was more massive? No, because then they would be able to detect a change in the laws of physics because of their state of motion, which would violate the principle of relativity. The way out of this difficulty is to realize that force is not an invariant. What the passengers perceive as a small force causing a small change in momentum would look to a person in the earth's frame of reference like a large force causing a large change in momentum.

kecomparison

a / A comparison of the relativistic and nonrelativistic expressions for kinetic energy.

lhc

b / The Large Hadron Collider. The red circle shows the location of the underground tunnel which the LHC will share with a preexisting accelerator.

Kinetic energy

Since kinetic energy equals (1/2)mv², wouldn't a sufficient amount of energy cause v to exceed the speed of light? You're on to my methods by now, so you know this is motivation for a redefinition of kinetic energy. The work-kinetic energy theorem is derived on page 850 using the correct relativistic treatment of force. The result is

 K = m(\gamma-1) \qquad . \qquad \text{[relativistic kinetic energy]}

Since γ approaches infinity as velocity approaches the speed of light, an infinite amount of energy would be required in order to make an object move at the speed of light.

Example 7: Kinetic energy in unnatural units

How can this equation be converted back into units in which the speed of light does not equal one? One approach would be to redo the derivation on page 850 in unnatural units. A far simpler approach is simply to add factors of c where necessary to make the metric units look consistent. Suppose we decide to modify the right side in order to make its units consistent with the energy units on the left. The ordinary nonrelativistic definition of kinetic energy as (1/2)mv² shows that the units on the left are

 \text{kg} \cdot \frac{\text{m}^2}{\text{s}^2} \qquad .

The factor of γ-1 is unitless, so the mass units on the right need to be multiplied by m2/s2 to agree with the left. This means that we need to multiply the right side by c2:

K = mc2(γ-1)

This is beginning to resemble the famous E= mc2 equation, which we will soon attack head-on.

Example 8: The correspondence principle for kinetic energy

It's far from obvious that this result, even in its metric-unit form, reduces to the familiar $(1/2)mv^2$ at low speeds, as required by the correspondence principle. To show this, we need to find a low-velocity approximation for γ. In metric units, the equation for γ reads as

$$\gamma = \frac{1}{\sqrt{1-v^2/c^2}} .$$

Reexpressing this as $\left(1-v^2/c^2\right)^{-1/2}$, and making use of the approximation $(1+\epsilon)^p \approx 1+p\epsilon$ for small ε, the equation for γ becomes

$$\gamma \approx 1 + \frac{v^2}{2c^2} ,$$

which can readily be used to show $mc^2(\gamma-1) \approx (1/2)mv^2$.
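The low-speed agreement can also be checked numerically. The sketch below, in SI units with c rounded to 3.0×10^8 m/s, compares the relativistic and nonrelativistic kinetic energies at several speeds; all function names are illustrative assumptions rather than anything defined in the text.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)

def gamma_exact(v):
    """gamma = 1/sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def gamma_approx(v):
    """Low-velocity approximation gamma ~ 1 + v^2/(2 c^2)."""
    return 1.0 + v**2 / (2.0 * C**2)

def kinetic_relativistic(m, v):
    """K = m c^2 (gamma - 1)."""
    return m * C**2 * (gamma_exact(v) - 1.0)

def kinetic_newtonian(m, v):
    """K = (1/2) m v^2."""
    return 0.5 * m * v**2

if __name__ == "__main__":
    m = 1.0  # kg; the ratio below does not depend on this choice
    for fraction in (0.001, 0.01, 0.1, 0.5, 0.9):
        v = fraction * C
        ratio = kinetic_relativistic(m, v) / kinetic_newtonian(m, v)
        print(f"v = {fraction:5.3f} c   K_rel / K_newton = {ratio:.6f}")
```

The ratio stays close to 1 for speeds that are a small fraction of c and grows without bound as v approaches c.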

Example 9: The large hadron collider

◊ The Large Hadron[3] Collider (LHC), being built in Switzerland, is a ring with a radius of 4.3 km, designed to accelerate two counterrotating beams of protons to energies of $1.1\times10^{-6}$ J per proton. (A microjoule is quite a healthy energy for a subatomic particle!) The ring has to be so big because the inward force from the accelerator's magnets would not be great enough to make the protons curve more tightly at top speed.

(a) What inward force must be exerted on each proton?

(b) In a purely Newtonian world where there were no relativistic effects, how much smaller could the LHC be if it was to produce proton beams moving at speeds close to the speed of light?

◊(a) Since the protons have velocity vectors with constant magnitudes, γ is constant, so let's start by computing it. We'll work the whole problem in SI units, since none of the data are given in natural units. Looking up the mass of a proton, we have

$$\begin{aligned}
mc^2 &= (1.7\times10^{-27}\ \text{kg})(3.0\times10^8\ \text{m/s})^2 \\
 &= 1.5\times10^{-10}\ \text{J} .
\end{aligned}$$

The kinetic energy is thousands of times greater than $mc^2$, so the protons go very close to the speed of light. Under these conditions there is no significant difference between γ and γ-1, so

$$\begin{aligned}
\gamma &\approx K/mc^2 \\
 &= 7.3\times10^3
\end{aligned}$$

We analyze the circular motion in the laboratory frame of reference, since that is the frame of reference in which the LHC's magnets sit, and their fields were calibrated by instruments at rest with respect to them. The inward force required is

$$\begin{aligned}
\mathbf{F} &= d\mathbf{p}/dt \\
 &= d(m\gamma\mathbf{v})/dt \\
 &= m\gamma\, d\mathbf{v}/dt \\
 &= m\gamma\mathbf{a} .
\end{aligned}$$

Except for the factor of γ, this is the same result we would have had in Newtonian physics, where we already know the equation $a=v^2/r$ for the inward acceleration in uniform circular motion. Since the velocity is essentially the speed of light, we have $a=c^2/r$. The force required is

$$\begin{aligned}
F &= m\gamma c^2/r \\
 &= K/r . \qquad \text{[since $\gamma\approx\gamma-1$]}
\end{aligned}$$

This looks a little funny, but the units check out, since a joule is the same as a newton-meter. The result is

$$F = 2.6\times10^{-10}\ \text{N}$$

(b) In a Newtonian universe,

$$\begin{aligned}
F &= mv^2/r \\
 &= mc^2/r \\
r &= mc^2/F \\
 &= 0.59\ \text{m}
\end{aligned}$$

In a nonrelativistic world, it would be a table-top accelerator! The energies and momenta, however, would be smaller.
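The arithmetic of this example can be reproduced in a few lines. This is a rough sketch that uses the rounded inputs quoted in the example; the function name `lhc_estimates` and its argument names are illustrative. Because it does not round intermediate results, its output agrees with the numbers above only to within that rounding.

```python
M_PROTON = 1.7e-27   # kg, proton mass as rounded in the example
C = 3.0e8            # m/s, speed of light as rounded in the example
K = 1.1e-6           # J, kinetic energy per proton
R = 4.3e3            # m, radius of the ring

def lhc_estimates(m=M_PROTON, c=C, k=K, r=R):
    """Return (m c^2, gamma, inward force, Newtonian radius) for the example."""
    rest_energy = m * c**2            # m c^2 in joules
    gamma = k / rest_energy           # ultrarelativistic: gamma ~ gamma - 1
    force = k / r                     # F = m gamma c^2 / r = K / r
    r_newton = rest_energy / force    # part (b): r = m c^2 / F
    return rest_energy, gamma, force, r_newton

if __name__ == "__main__":
    rest_energy, gamma, force, r_newton = lhc_estimates()
    print(f"mc^2     = {rest_energy:.2e} J")
    print(f"gamma    = {gamma:.2e}")
    print(f"F        = {force:.2e} N")
    print(f"r_newton = {r_newton:.2f} m")
```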

Equivalence of mass and energy

The treatment of relativity so far has been purely mechanical, so the only form of energy we have discussed is kinetic. For example, the storyline for the introduction of relativistic momentum was based on collisions in which no kinetic energy was converted to other forms. We know, however, that collisions can result in the production of heat, which is a form of kinetic energy at the molecular level, or the conversion of kinetic energy into entirely different forms of energy, such as light or electrical energy.

Let's consider what happens if a blob of putty moving at velocity v hits another blob that is initially at rest, sticking to it, and as much kinetic energy as possible is converted into heat. (It is not possible for all the kinetic energy to be converted to heat, because then conservation of momentum would be violated.) The nonrelativistic result is that to obey conservation of momentum the two blobs must fly off together at v/2.

Relativistically, however, an interesting thing happens. A hot object has more momentum than a cold object! This is because the relativistically correct expression for momentum is p=mγ v, and the more rapidly moving molecules in the hot object have higher values of γ. There is no such effect in nonrelativistic physics, because the velocities of the moving molecules are all in random directions, so the random motion's contribution to momentum cancels out.

In our collision, the final combined blob must therefore be moving a little more slowly than the expected v/2, since otherwise the final momentum would have been a little greater than the initial momentum. To an observer who believes in conservation of momentum and knows only about the overall motion of the objects and not about their heat content, the low velocity after the collision would seem to be the result of a magical change in the mass, as if the mass of two combined, hot blobs of putty was more than the sum of their individual masses.

Heat energy is equivalent to mass.

Now we know that mass is invariant, and no molecules were created or destroyed, so the masses of all the molecules must be the same as they always were. The change is due to the change in γ with heating, not to a change in m. But how much does the mass appear to change? On page 852 we prove that the perceived change in mass exactly equals the change in heat energy between two temperatures, i.e., changing the heat energy by an amount E changes the effective mass of an object by E as well. This looks a bit odd because the natural units of energy and mass are the same. Converting back to ordinary units by our usual shortcut of introducing factors of c, we find that changing the heat energy by an amount E causes the apparent mass to change by $m=E/c^2$. Rearranging, we have the famous $E=mc^2$.
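A toy model makes the claim concrete. As an assumption made purely for illustration (it is not the general proof referred to above), take a "hot object" made of just two particles of mass m that move at +u and -u along the line of motion in the object's own rest frame, while the object as a whole moves at v. The sketch below works in natural units, combines velocities relativistically as (v1+v2)/(1+v1v2), and adds up the momenta mγv of the two particles; the function names are hypothetical.

```python
import math

def gamma(v):
    """Lorentz factor in natural units (c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)

def add_velocities(v1, v2):
    """Relativistic combination of two velocities along the same line."""
    return (v1 + v2) / (1.0 + v1 * v2)

def hot_blob(m, u, v):
    """Total momentum and heat energy of the two-particle toy object.

    u is the internal ("thermal") speed in the object's rest frame,
    v is the velocity of the object as a whole.
    """
    momentum = 0.0
    for internal in (+u, -u):
        w = add_velocities(v, internal)   # particle velocity in the lab frame
        momentum += m * gamma(w) * w      # p = m gamma v for each particle
    heat = 2.0 * m * (gamma(u) - 1.0)     # internal kinetic energy, K = m(gamma - 1)
    return momentum, heat

def apparent_mass(momentum, v):
    """Mass an observer would infer from p = (mass) gamma v for the whole object."""
    return momentum / (gamma(v) * v)

if __name__ == "__main__":
    m, u, v = 1.0, 0.6, 0.5
    p, heat = hot_blob(m, u, v)
    print("apparent mass:", apparent_mass(p, v))
    print("sum of particle masses plus heat:", 2.0 * m + heat)
```

In this model the apparent mass exceeds the sum of the particle masses by the heat energy, which is the one-dimensional, two-particle version of the statement in the text.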

All energy is equivalent to mass.

But this whole argument was based on the fact that heat is a form of kinetic energy at the molecular level. Would $E=mc^2$ apply to other forms of energy as well? Suppose a rocket ship contains some electrical energy stored in a battery. If we believed that $E=mc^2$ applied to forms of kinetic energy but not to electrical energy, then we would have to believe that the pilot of the rocket could slow the ship down by using the battery to run a heater! This would not only be strange, but it would violate the principle of relativity, because the result of the experiment would be different depending on whether the ship was at rest or not. The only logical conclusion is that all forms of energy are equivalent to mass. Running the heater then has no effect on the motion of the ship, because the total energy in the ship was unchanged; one form of energy was simply converted to another.

Example 10: A rusting nail

◊ A 50-gram iron nail is left in a cup of water until it turns entirely to rust. The energy released is about 0.5 MJ (megajoules). In theory, would a sufficiently precise scale register a change in mass? If so, how much?

◊ The energy will appear as heat, which will be lost to the environment. So the total mass-energy of the cup, water, and iron will indeed be lessened by 0.5 MJ. (If it had been perfectly insulated, there would have been no change, since the heat energy would have been trapped in the cup.) Converting to mass units, we have

$$\begin{aligned}
m &= E/c^2 \\
 &= (0.5\times10^6\ \text{J}) / (3.0\times10^8\ \text{m/s})^2 \\
 &= 6\times10^{-12}\ \text{J}/(\text{m}^2/\text{s}^2) \\
 &= 6\times10^{-12}\ \left(\text{kg}\cdot\text{m}^2/\text{s}^2\right) / (\text{m}^2/\text{s}^2) \\
 &= 6\times10^{-12}\ \text{kg} ,
\end{aligned}$$

so the change in mass is too small to measure with any practical technique. This is because the square of the speed of light is such a large number in metric units.
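To put the size of the effect in perspective, the same conversion can be done in code and compared with the mass of the nail itself. This is a small sketch using the rounded values from the example; `mass_equivalent` is an illustrative name.

```python
C = 3.0e8  # m/s, speed of light as rounded in the example

def mass_equivalent(energy_joules, c=C):
    """Mass in kg equivalent to an energy in joules, m = E / c^2."""
    return energy_joules / c**2

if __name__ == "__main__":
    released = 0.5e6      # J, energy released as the nail rusts
    nail_mass = 0.050     # kg, the 50-gram nail
    delta_m = mass_equivalent(released)
    print(f"change in mass: {delta_m:.1e} kg")
    print(f"fraction of the nail's mass: {delta_m / nail_mass:.1e}")
```

The fractional change comes out to roughly one part in 10^10 of the nail's mass.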

Energy participates in gravitational forces.

In the example we tacitly assumed that the change in mass would show up on a scale, i.e., that the nail's gravitational attraction to the earth would change. Strictly speaking, however, we have only proved that energy relates to inertial mass, i.e., to phenomena like momentum and the resistance of an object to a change in its state of motion. Even before Einstein, however, experiments had shown to a high degree of precision that any two objects with the same inertial mass will also exhibit the same gravitational attractions, i.e., have the same gravitational mass. For example, the only reason that all objects fall with the same acceleration is that a more massive object's inertia is exactly in proportion to the greater gravitational forces in which it participates. We therefore conclude that energy participates in gravitational forces in the same way mass does. The total gravitational attraction between two objects is proportional not just to the product of their masses, $m_1m_2$, as in Newton's law of gravity, but to the quantity $(m_1+E_1)(m_2+E_2)$. (Even this modification does not give a complete, self-consistent theory of gravity, which is only accomplished through the general theory of relativity.)

c / Example 11.

Example 11: Gravity bending light

Mass and energy are equivalent. The energy of a beam of light is equivalent to a certain amount of mass, and the beam is therefore deflected by a gravitational field. Einstein's prediction of this effect was verified in 1919 by astronomers who photographed stars in the dark sky surrounding the sun during an eclipse. (If there was no eclipse, the glare of the sun would prevent the stars from being observed.) Figure c is a photographic negative, so the circle that appears bright is actually the dark face of the moon, and the dark area is really the bright corona of the sun. The stars, marked by lines above and below them, appeared at positions slightly different than their normal ones, indicating that their light had been bent by the sun's gravity on its way to our planet.

Example 12: Black holes

A star with sufficiently strong gravity can prevent light from leaving. Quite a few black holes have been detected via their gravitational forces on neighboring stars or clouds of dust.

Creation and destruction of particles

Since mass and energy are beginning to look like two sides of the same coin, it may not be so surprising that nature displays processes in which particles are actually destroyed or created; energy and mass are then converted back and forth on a wholesale basis. This means that in relativity there are no separate laws of conservation of energy and conservation of mass. There is only a law of conservation of mass plus energy (referred to as mass-energy). In natural units, E+m is conserved, while in ordinary units the conserved quantity is $E+mc^2$.

Example 13: Electron-positron annihilation

Natural radioactivity in the earth produces positrons, which are like electrons but have the opposite charge. A form of antimatter, positrons annihilate with electrons to produce gamma rays, a form of high-frequency light. Such a process would have been considered impossible before Einstein, because conservation of mass and energy were believed to be separate principles, and the process eliminates 100% of the original mass. In metric units, the amount of energy produced by annihilating 1 kg of matter with 1 kg of antimatter is

$$\begin{aligned}
E &= mc^2 \\
 &= (2\ \text{kg})(3.0\times10^8\ \text{m/s})^2 \\
 &= 2\times10^{17}\ \text{J} ,
\end{aligned}$$

which is on the same order of magnitude as a day's energy consumption for the entire world!

Positron annihilation forms the basis for the medical imaging procedure called a PET (positron emission tomography) scan, in which a positron-emitting chemical is injected into the patient and mapped by the emission of gamma rays from the parts of the body where it accumulates.

Note that the idea of mass as an invariant is separate from the idea that mass is not separately conserved. Invariance is the statement that all observers agree on a particle's mass regardless of their motion relative to the particle. Mass may be created or destroyed if particles are created or destroyed, and in such a situation mass invariance simply says that all observers will agree on how much mass was created or destroyed.

Homework Problems

1. Astronauts in three different spaceships are communicating with each other. Those aboard ships A and B agree on the rate at which time is passing, but they disagree with the ones on ship C.
(a) Describe the motion of the other two ships according to Alice, who is aboard ship A.
(b) Give the description according to Betty, whose frame of reference is ship B.
(c) Do the same for Cathy, aboard ship C.

2. As of 2006, the best atomic clocks have accuracies of about one part in $10^{15}$. How does this compare with the time dilation effect produced if the clock takes a trip aboard a jet moving at 300 m/s? Would the effect be measurable?

3. (a) Find an expression for v in terms of γ. (answer check available at lightandmatter.com)
(b) Using your result from part a, show that for very large values of γ, v gets close to the speed of light.

4. A velocity of 4/5 the speed of light results in γ=5/3, which is a nice simple fraction: one integer divided by another. Find one or more additional examples like this (not the trivial cases v=0 or -4/5).

5. The earth is orbiting the sun, and therefore is contracted relativistically in the direction of its motion. Compute the amount by which its diameter shrinks in this direction.

6. When an object moves at a speed extremely close to the speed of light, we refer to its motion as “ultrarelativistic.” Find an approximation for the γ of an object in ultrarelativistic motion at a velocity of (1-ε)c, where ε is small. This approximation can be useful in cases where ε is so small that your calculator would round off the expression $\sqrt{1-v^2/c^2}$ to zero, giving γ=∞. (answer check available at lightandmatter.com)

7. Our sun lies at a distance of 26,000 light years from the center of the galaxy, where there are some spectacular sights to see, including a supermassive black hole that is rapidly eating up the surrounding interstellar gas and dust. Rich tourist Bill Gates IV buys a spaceship, and heads for the galactic core at a speed of 99.99999% of the speed of light.
(a) According to observers on Earth, how long does it take before he gets back? (Ignore the short time he actually spends sightseeing at the core.) (answer check available at lightandmatter.com)
(b) In Bill's frame of reference, how much time passes? (answer check available at lightandmatter.com)
(c) When you compare your answer to part b with the round-trip distance, do you conclude that Bill considers himself to be moving faster than the speed of light? If so, how do you reconcile this with relativity? If not, then resolve the apparent paradox.

8. (a) Reexpress the Lorentz transformation equations using ordinary metric units where c≠1. The point here is to practice the technique for converting any formula from natural units to metric units, by inserting factors of c wherever necessary in order to make the units make sense, as in examples 2 and 3 on page 381. That means you shouldn't go back and redo the whole derivation from scratch. (answer check available at lightandmatter.com)
(b) Show that for speeds that are small compared to the speed of light, these equations are identical to the Galilean equations.

9. (a) Make up a numerical example of two events, and show that if we defined the spacetime interval as $\Delta x^2+\Delta y^2+\Delta z^2+\Delta t^2$, we would not get consistent results when we Lorentz-transformed the events into a different frame of reference.
(b) Show that, for the particular example you chose in part a, the quantity $\Delta x^2+\Delta y^2+\Delta z^2-\Delta t^2$ does come out the same in both frames.
(c) Ignoring the y and z space dimensions, prove that $\Delta x^2-\Delta t^2$ stays the same under a Lorentz transformation for motion along the x axis. You're proving this in general now, not just checking it for one numerical example.
(d) Reexpress the definition $\Delta x^2+\Delta y^2+\Delta z^2-\Delta t^2$ of the spacetime interval in unnatural units, where c≠1.

10. Make up a numerical example, in a particular frame of reference, of two events with a spacelike interval between them. Make event 2 occur after event 1. Now show by using a Lorentz transformation that you can find another frame of reference in which event 2 occurs before event 1.

To get from event 1 to event 2, or vice versa, you would have to travel faster than light. Therefore there can't be a cause-and-effect relationship between the two events, and it doesn't really matter which one we consider to have happened first. On the other hand, if faster-than-light travel was possible, then time travel paradoxes would be possible in this kind of situation. For example, event 2 could be your birth, and event 1 could be when you kill your own grandmother before she has any children.

11. (a) A spacecraft traveling at $1.0000\times10^7$ m/s relative to the earth releases a probe in the forward direction at a relative speed of $2.0000\times10^7$ m/s. How fast is the probe moving relative to the earth? How does this compare with the nonrelativistic result? (answer check available at lightandmatter.com)
(b) Repeat the calculation, but with both velocities equal to c/2. How does this compare with the nonrelativistic result? (answer check available at lightandmatter.com)

12. (a) Show that when two velocities are combined relativistically, and one of them equals the speed of light, the result also equals the speed of light.
(b) Explain why it has to be this way based on the principle of relativity. (Note that it doesn't work to say that it has to be this way because motion faster than c is impossible. That isn't what the principle of relativity says, and it also doesn't handle the case where the velocities are in opposite directions.)

13. Cosmic-ray particles with relativistic velocities are continually bombarding the earth's atmosphere. They are protons and other atomic nuclei. Suppose a carbon nucleus (containing six protons and six neutrons) arrives with an energy of $10^{-7}$ J, which is unusually high, but not unheard of. By what factor is its length shortened as seen by an observer in the earth's frame of reference? (answer check available at lightandmatter.com)

14. (a) A free neutron (as opposed to a neutron bound into an atomic nucleus) is unstable, and decays radioactively into a proton, an electron, and a particle called a neutrino. (This process can also occur for a neutron in a nucleus, but then other forms of mass-energy are involved as well.) The masses are as follows:

neutron     $1.67495\times10^{-27}$ kg
proton      $1.67265\times10^{-27}$ kg
electron    $0.00091\times10^{-27}$ kg
neutrino    negligible


Find the energy released in the decay of a free neutron.(answer check available at lightandmatter.com)
(b) We might imagine that a proton could decay into a neutron, a positron, and a neutrino. Although such a process can occur within a nucleus, explain why it cannot happen to a free proton. (If it could, hydrogen would be radioactive!)

15. (a) Find a relativistic equation for the velocity of an object in terms of its mass and momentum (eliminating γ). Work in natural units. (answer check available at lightandmatter.com)
(b) Show that your result is approximately the same as the classical value, p/m, at low velocities.
(c) Show that very large momenta result in speeds close to the speed of light.

16. (a) Prove the equation $E^2-p^2=m^2$ for a material object, where E=mγ is the total mass-energy.
(b) Using this result, show that an object with zero mass must move at the speed of light.
(c) This equation can be applied more generally, to light for instance. Use it to find the momentum of a beam of light having energy E. (answer check available at lightandmatter.com)
(d) Convert your answer from the previous part into ordinary units.(answer check available at lightandmatter.com)

17. Starting from the equation for $v_{combined}\gamma_{combined}$ derived on page 850, complete the proof of $v_{combined} = (v_1+v_2)/(1+v_1v_2)$.

18. In the equation for the relativistic addition of velocities u and v, consider the limit in which u approaches 1, but v simultaneously approaches -1. Give both a physical and a mathematical interpretation.

19. (a) Use the result of problem 16d to show that if light with power P is reflected perpendicularly from a perfectly reflective surface, the force on the surface is 2P/c.
(b) Estimate the maximum mass of a thin film that is to be levitated by a 100-watt lightbulb. (solution in the pdf version of the book)

20. Expand the equation K = m(γ-1) in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the classical expression for kinetic energy.

21. Expand the relativistic equation for momentum in a Taylor series, and find the first two nonvanishing terms. Show that the first term is the classical expression.

22. A source of light with frequency f is moving toward an observer at velocity v (or away from the observer if v is negative). Find the relativistically correct equation for the Doppler shift of the light.

23. An antielectron collides with an electron that is at rest. (An antielectron is a form of antimatter that is just like an electron, but with the opposite charge.) The antielectron and electron annihilate each other and produce two gamma rays. (A gamma ray is a form of light. It has zero mass.) Gamma ray 1 is moving in the same direction as the antielectron was initially going, and gamma ray 2 is going in the opposite direction. Throughout this problem, you should work in natural units and use the notation E to mean the total mass-energy of a particle, i.e., its mass plus its kinetic energy. Find the energies of the two gamma rays, $E_1$ and $E_2$, in terms of m, the mass of an electron or antielectron, and $E_o$, the initial mass-energy of the antielectron. You'll need the result of problem 16a.

24. Radioactive particle a decays, annihilating itself and producing two particles b and c, of unequal mass. Consider this process in the frame of reference in which particle a was at rest before the decay.
(a) In the special case where very little energy is released in the decay, and particles b and c have nonrelativistic speeds, prove using classical physics that the particle with the lower mass must have the higher kinetic energy.
(b) Find an expression for the mass-energy $E_c$ of particle c, in terms of the masses $m_a$, $m_b$, and $m_c$. Hint: work in natural units, and make use of the result of problem 16a. (answer check available at lightandmatter.com)
(c) Show that the units of your answer make sense.
(d) Show that your expression has the correct behavior in the case of $m_b=m_c$.
(e) A process of this type is the decay of a $K^+$ particle into a $\pi^+$ and a $\pi^0$ (called pions). The masses are 493.7, 139.6, and 135.0 MeV, respectively. (MeV are a unit of energy, but in natural units, they can also be a unit of mass.) Find the mass-energies and kinetic energies of the two pions, and verify that the nonrelativistic prediction of part (a) is still correct, even in the fully relativistic case.

25. (a) Find an expression, in natural units, for the velocity of a particle having mass m and mass-energy E.
(b) Show that the units of your equation make sense.
(c) Your answer involves a square root, which could be either the positive or the negative root. Explain what this represents physically, and why it makes sense.
(d) Discuss the limit of E >> m, both mathematically and physically.
(e) Rewrite your expression in SI units. Don't rederive it from scratch. Simply determine how it needs to be altered by inserting factors of c in order to make the units work out in the SI.

Exercises

Exercise A: The Michelson-Morley Experiment

In this exercise you will analyze the Michelson-Morley experiment, and find what the results should have been according to Galilean relativity and Einstein's theory of relativity. A beam of light coming from the west (not shown) comes to the half-silvered mirror A. Half the light goes through to the east, is reflected by mirror C, and comes back to A. The other half is reflected north by A, is reflected by B, and also comes back to A. When the beams reunite at A, part of each ends up going south, and these parts interfere with one another. If the time taken for a round trip differs by, for example, half the period of the wave, there will be destructive interference.

The point of the experiment was to search for a difference in the experimental results between the daytime, when the laboratory was moving west relative to the sun, and the nighttime, when the laboratory was moving east relative to the sun. Galilean relativity and Einstein's theory of relativity make different predictions about the results. According to Galilean relativity, the speed of light cannot be the same in all reference frames, so it is assumed that there is one special reference frame, perhaps the sun's, in which light travels at the same speed in all directions; in other frames, Galilean relativity predicts that the speed of light will be different in different directions, e.g., slower if the observer is chasing a beam of light. There are four different ways to analyze the experiment:

Groups 1-4 work in the sun's frame of reference according to Galilean relativity.

Group 1 finds time AC. Group 2 finds time CA. Group 3 finds time AB. Group 4 finds time BA.

Groups 5 and 6 transform the lab-frame results into the sun's frame according to Einstein's theory.

Group 5 transforms the x and t when ray ACA gets back to A into the sun's frame of reference, and group 6 does the same for ray ABA.

Discussion:

Michelson and Morley found no change in the interference of the waves between day and night. Which version of relativity is consistent with their results?

What does each theory predict if v approaches c?

What if the arms are not exactly equal in length?

Does it matter if the “special” frame is some frame other than the sun's?

Exercise B: Sports in Slowlightland

In Slowlightland, the speed of light is 20 mi/hr ≈ 32 km/hr ≈ 9 m/s. Think of an example of how relativistic effects would work in sports. Things can get very complex very quickly, so try to think of a simple example that focuses on just one of the following effects:

- relativistic momentum

- relativistic kinetic energy

- relativistic addition of velocities

- time dilation and length contraction

- Doppler shifts of light

- equivalence of mass and energy

- time it takes for light to get to an athlete's eye

- deflection of light rays by gravity

Footnotes
[1] See discussion question F on page 378, and homework problem 5
[2] The definition $\Delta t^2-\Delta x^2-\Delta y^2-\Delta z^2$ is equally valid. It's just a matter of convention. You have to be careful when using the literature to make sure you don't mix equations that assume inconsistent choices of the signs.
[3] “Hadron” refers to particles like protons and neutrons, which participate in nuclear forces.