You are viewing the html version of General Relativity, by Benjamin Crowell. This version is only designed for casual browsing, and may have some formatting problems. For serious reading, you want the Adobe Acrobat version. (c) 1998-2011 Benjamin Crowell, licensed under the Creative Commons Attribution-ShareAlike license. Photo credits are given at the end of the Adobe Acrobat version.
We now have enough machinery to be able to calculate quite a bit of interesting physics, and to be sure that the results are actually meaningful in a relativistic context. The strategy is to identify relativistic quantities that behave as Lorentz scalars and Lorentz vectors, and then combine them in various ways. The notion of a tensor has been introduced on page 92. A Lorentz scalar is a tensor of rank 0, and a Lorentz vector is a rank-1 tensor.
A Lorentz scalar is a quantity that remains invariant under both spatial rotations and Lorentz boosts. Mass is a Lorentz scalar.1 Electric charge is also a Lorentz scalar, as demonstrated to extremely high precision by experiments measuring the electrical neutrality of atoms and molecules to a relative precision of better than $10^{-20}$; the electron in a hydrogen atom typically has a velocity of about 1/100 (in units where c=1), and those in heavier elements such as uranium are highly relativistic, so any violation of Lorentz invariance would give the atoms a nonvanishing net electric charge.
The time measured by a clock traveling along a particular world-line from one event to another is something that all observers will agree upon; they will simply note the mismatch with their own clocks. It is therefore a Lorentz scalar. This clock-time as measured by a clock attached to the moving body in question is often referred to as proper time, “proper” being used here in the somewhat archaic sense of “own” or “self,” as in “The Vatican does not lie within Italy proper.” Proper time, which we notate τ, can only be defined for timelike world-lines, since a lightlike or spacelike world-line isn't possible for a material clock.
More generally, when we express a metric as $ds^2=\ldots$, the quantity ds is a Lorentz scalar. In the special case of a timelike world-line, ds and dτ are the same thing. (In books that use a -+++ metric, one has ds=-dτ.)
Even more generally, affine parameters, which exist independent of any metric at all, are scalars. As a trivial example, if τ is a particular object's proper time, then τ is a valid affine parameter, but so is 2τ+7. Less trivially, a photon's proper time is always zero, but one can still define an affine parameter along its trajectory. We will need such an affine parameter, for example, in section 6.2.7, page 204, when we calculate the deflection of light rays by the sun, one of the early classic experimental tests of general relativity.
Another example of a Lorentz scalar is the pressure of a perfect fluid, which is often assumed as a description of matter in cosmological models.
At the beginning of chapter 3, I motivated the use of infinitesimals as useful tools for doing differential geometry in curved spacetime. Even in the context of special relativity, however, infinitesimals can be useful. One way of expressing the proper time accumulated on a moving clock is

$$\tau = \int ds = \int \sqrt{dt^2-dx^2-dy^2-dz^2} = \int dt\,\sqrt{1-\left(\frac{dx}{dt}\right)^2-\left(\frac{dy}{dt}\right)^2-\left(\frac{dz}{dt}\right)^2},$$

which only contains an explicit dependence on the clock's velocity, not its acceleration. This is an example of the clock “postulate” referred to in the remark at the end of homework problem 1 on page 76. Note that the clock postulate only applies in the limit of a small clock. This is represented in the above equation by the use of infinitesimal quantities like dx.
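As a rough numerical illustration of the statement that the accumulated proper time depends only on the clock's velocity, the sketch below integrates $\sqrt{1-v^2}\,dt$ (units with c=1, motion along one axis) with a simple midpoint rule. The function name `proper_time`, the step count, and the two sample trajectories are hypothetical choices made for this example, not anything taken from the text.

```python
import math

def proper_time(velocity, t_start, t_end, steps=100_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2) dt, with c = 1."""
    dt = (t_end - t_start) / steps
    tau = 0.0
    for n in range(steps):
        t_mid = t_start + (n + 0.5) * dt
        v = velocity(t_mid)
        tau += math.sqrt(1.0 - v * v) * dt
    return tau

# Clock moving at constant velocity 0.6 for 10 units of coordinate time.
tau_constant = proper_time(lambda t: 0.6, 0.0, 10.0)

# Clock that moves out at +0.6 and abruptly returns at -0.6.
# The speed history is the same, so the integrand is the same.
tau_round_trip = proper_time(lambda t: 0.6 if t < 5.0 else -0.6, 0.0, 10.0)

print(tau_constant, tau_round_trip)
```

Both trajectories give the same elapsed proper time, even though the second includes an abrupt reversal, because only $v^2$ enters the integrand.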
Our basic Lorentz vector is the spacetime displacement $dx^i$. Any other quantity that has the same behavior as $dx^i$ under rotations and boosts is also a valid Lorentz vector. Consider a particle moving through space, as described in a Lorentz frame. Since the particle may be subject to nongravitational forces, the Lorentz frame cannot be made to coincide (except perhaps momentarily) with the particle's rest frame. Dividing the infinitesimal displacement by an infinitesimal proper time interval, we have the four-velocity vector $v^i=dx^i/d\tau$, whose components in a Lorentz coordinate system are $(\gamma,\gamma v^1,\gamma v^2,\gamma v^3)$, where $v^\mu$, μ=1, 2, 3, is the ordinary three-component velocity vector as defined in classical mechanics. The four-velocity's squared magnitude $v^i v_i$ is always exactly 1, even if the particle is not moving at the speed of light.
When we hear something referred to as a “vector,” we usually take this as a statement that it not only transforms as a vector, but also that it adds as a vector. But we have already seen in section 2.3.1 on page 58 that even collinear velocities in relativity do not add linearly; therefore they clearly cannot add linearly when dressed in the clothing of four-vectors. We've also seen in section 2.5.3 that the combination of non-collinear boosts is noncommutative, and is generally equivalent to a boost plus a spatial rotation; this is also not consistent with linear addition of four-vectors. At the risk of beating a dead horse, a four-velocity's squared magnitude is always 1, and this is not consistent with being able to add four-velocity vectors.
◊ Suppose an object has a certain four-velocity $v^i$ in a certain frame of reference. Can we transform into a different frame in which the object is at rest, and its four-velocity is zero?
◊ No. In general, the Lorentz transformation preserves the magnitude of vectors, so it can never transform a vector with a nonzero magnitude into one with zero magnitude. We can transform into a frame in which the object is at rest, but an object at rest does not have a vanishing four-velocity. It has a four-velocity of (1,0,0,0).
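The claim that a boost preserves the four-velocity's magnitude can be checked numerically. The following is a minimal sketch, assuming units with c=1 and the +--- signature used in the text; the helper names `four_velocity`, `squared_magnitude`, and `boost_x` are invented for this illustration.

```python
import math

def four_velocity(vx, vy=0.0, vz=0.0):
    """Components (gamma, gamma*vx, gamma*vy, gamma*vz) for three-velocity (vx, vy, vz)."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def squared_magnitude(u):
    """Squared magnitude with the +--- signature."""
    t, x, y, z = u
    return t * t - x * x - y * y - z * z

def boost_x(u, w):
    """Components of u in a frame moving at velocity w along the x axis."""
    g = 1.0 / math.sqrt(1.0 - w * w)
    t, x, y, z = u
    return (g * (t - w * x), g * (x - w * t), y, z)

u_lab = four_velocity(0.8)       # object moving at 0.8 in the lab frame
u_rest = boost_x(u_lab, 0.8)     # same vector, seen from the object's rest frame

print(squared_magnitude(u_lab), u_rest)
```

In the rest frame the components come out as (1,0,0,0) to within rounding, not (0,0,0,0), and the squared magnitude is 1 in both frames.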
The four-acceleration is found by taking a second derivative with respect to proper time. Its squared magnitude is only approximately equal to minus the squared magnitude of the classical acceleration three-vector, in the limit of small velocities.
◊ Let τ stand for the ship's proper time, and let dots indicate derivatives with respect to τ. The ship's velocity has magnitude 1, so

$$\dot{t}^2-\dot{x}^2=1.$$

An observer who is instantaneously at rest with respect to the ship judges it to have a four-acceleration (0,a,0,0) (because the low-velocity limit applies). The observer in the (t,x) frame agrees on the magnitude of this vector, so

$$\ddot{t}^2-\ddot{x}^2=-a^2.$$

The solution of these differential equations is $t=\frac{1}{a}\sinh a\tau$, $x=\frac{1}{a}\cosh a\tau$, and eliminating τ gives

$$x=\frac{1}{a}\sqrt{1+a^2t^2}.$$

As t approaches infinity, dx/dt approaches the speed of light.
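The constant-acceleration example can be sanity-checked numerically. The sketch below assumes the standard hyperbolic-motion form $t=(1/a)\sinh a\tau$, $x=(1/a)\cosh a\tau$ (restated here as an assumption) and uses finite differences to test the two conditions on the four-velocity and four-acceleration. The step size, the sample values of a and τ, and all function names are arbitrary illustrative choices.

```python
import math

def t_of_tau(a, tau):
    return math.sinh(a * tau) / a

def x_of_tau(a, tau):
    return math.cosh(a * tau) / a

def first_derivative(f, tau, h=1e-3):
    """Central-difference estimate of df/dtau."""
    return (f(tau + h) - f(tau - h)) / (2 * h)

def second_derivative(f, tau, h=1e-3):
    """Central-difference estimate of the second derivative."""
    return (f(tau + h) - 2 * f(tau) + f(tau - h)) / (h * h)

a, tau = 2.0, 0.7
t_dot = first_derivative(lambda s: t_of_tau(a, s), tau)
x_dot = first_derivative(lambda s: x_of_tau(a, s), tau)
t_ddot = second_derivative(lambda s: t_of_tau(a, s), tau)
x_ddot = second_derivative(lambda s: x_of_tau(a, s), tau)

velocity_norm = t_dot ** 2 - x_dot ** 2          # should be close to 1
acceleration_norm = t_ddot ** 2 - x_ddot ** 2    # should be close to -a^2
coordinate_velocity = x_dot / t_dot              # dx/dt, always below 1

t, x = t_of_tau(a, tau), x_of_tau(a, tau)
x_from_t = math.sqrt(1 + (a * t) ** 2) / a       # x with tau eliminated

print(velocity_norm, acceleration_norm, coordinate_velocity, x - x_from_t)
```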
a / Example 10.
If we hope to find something that plays the role of momentum in relativity, then the momentum three-vector clearly needs to be generalized to some kind of four-vector, and if it is to satisfy the correspondence principle then its definition should probably look as much as possible like the nonrelativistic one. In subsection 4.2.1, we obtained the velocity four-vector. Multiplying by the particle's mass, we have the four-momentum $p^i=mv^i$, which in Lorentz coordinates is $(m\gamma,m\gamma v^1,m\gamma v^2,m\gamma v^3)$. There is no a priori guarantee that this is right, but it's the most reasonable thing to guess. It needs to be checked against experiment, and also for consistency with the other parts of our theory.
The spacelike components look like the classical momentum vector multiplied by a factor of γ, the interpretation being that to an observer in this frame, the moving particle's inertia is increased relative to its classical value. Such an effect is indeed observed experimentally. This is why particle accelerators are so big and expensive. As the particle approaches the speed of light, γ diverges, so greater and greater forces are needed in order to produce the same acceleration.
The momentum four-vector has locked within it the reason for Einstein's famous $E=mc^2$, which in our relativistic units becomes simply E=m. To see why, consider the experimentally measured inertia of a physical object made out of atoms. The subatomic particles are all moving, and many of the velocities, e.g., the velocities of the electrons, are quite relativistic. This has the effect of increasing the experimentally determined inertial mass, by a factor of γ averaged over all the particles. The same must be true for the gravitational mass, based on the equivalence principle as verified by Eötvös experiments. If the object is heated, the velocities will increase on the average, resulting in a further increase in its mass. Thus, a certain amount of heat energy is equivalent to a certain amount of mass. But if heat energy contributes to mass, then the same must be true for other forms of energy. For example, suppose that heating leads to a chemical reaction, which converts some heat into electromagnetic binding energy. If one joule of binding energy did not convert to the same amount of mass as one joule of heat, then this would allow the object to spontaneously change its own mass, and then by conservation of momentum it would have to spontaneously change its own velocity, which would clearly violate the principle of relativity. We conclude that mass and energy are equivalent, both inertially and gravitationally. In relativity, neither is separately conserved; the conserved quantity is their sum, referred to as the mass-energy, E. The timelike component of the four-momentum, mγ, is interpreted as the mass-energy of the particle, consisting of its mass m plus its kinetic energy m(γ-1). An alternative derivation, by Einstein, is given in example 12 on page 120.
Since the momentum four-vector was obtained from the magnitude-1 velocity four-vector through multiplication by m, its squared magnitude $p^i p_i$ is equal to the square of the particle's mass. Writing p for the magnitude of the momentum three-vector, and E for the mass-energy, we find the useful relation $E^2-p^2=m^2$.
A common source of confusion for beginners in relativity is the distinction between quantities that are conserved and quantities that are the same in all frames. There is nothing relativistic about this distinction. Before Einstein, physicists already knew that observers in different frames of reference would agree on the mass of a particle. That is, m was known to be frame-invariant. They also knew that energy was conserved. But just because energy was conserved, that didn't mean that it had to be the same for observers in all frames of reference. The kinetic energy of the chair you're sitting in is millions of joules in a frame of reference tied to the axis of the earth. In relativity, m is frame-invariant (i.e., a Lorentz scalar), but the conserved quantity is the momentum four-vector, which is not frame-invariant.
Applying $E^2-p^2=m^2$ to the special case of a massless particle, we have |p|=E, which demonstrates, for example, that a beam of light exerts pressure when it is absorbed or reflected by a surface.5 A massless particle must also travel at exactly the speed of light, since $|p| \rightarrow E$ requires $m\gamma v \rightarrow m\gamma$; conversely, a massive particle always has |v|<1.
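The relation $E^2-p^2=m^2$ and its frame-independence can be illustrated with a short sketch in 1+1 dimensions (c=1). The function names and the sample mass and velocities below are hypothetical, chosen only for the example.

```python
import math

def four_momentum(m, v):
    """(E, p) for a particle of mass m moving at velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (m * gamma, m * gamma * v)

def boost(p, w):
    """Components of the momentum vector in a frame moving at velocity w along x."""
    g = 1.0 / math.sqrt(1.0 - w * w)
    energy, px = p
    return (g * (energy - w * px), g * (px - w * energy))

def mass_squared(p):
    energy, px = p
    return energy * energy - px * px

p_lab = four_momentum(2.0, 0.6)      # E = 2.5, p = 1.5
p_other = boost(p_lab, -0.5)         # a different frame: E and p both change

photon_lab = (3.0, 3.0)              # massless particle: |p| = E
photon_other = boost(photon_lab, 0.9)

print(p_lab, p_other, mass_squared(p_lab), mass_squared(p_other))
print(photon_other, mass_squared(photon_other))
```

The energy and momentum individually differ between frames, while $E^2-p^2$ stays at $m^2$ (and at zero for the massless case), which is the distinction between frame-dependent and frame-invariant quantities drawn above.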
The relation $E=\pm\sqrt{p^2+m^2}$, obtained by solving $E^2-p^2=m^2$ for the energy, has two roots, a
positive one and a negative one.
The positive-energy and negative-energy states are separated by a no-man's land of width 2m, so no continuous classical
process can lead from one side to the other. But quantum-mechanically,
if an electron exists with energy
$E=+\sqrt{p^2+m^2}$, it should be able to make a quantum leap into a state with
$E=-\sqrt{p^2+m^2}$, emitting the energy difference of 2E in the form of photons. Why doesn't
this happen? One explanation is that the states with E<0 are all already occupied. This is the “Dirac sea,” which
we now interpret as being full of electrons. A vacancy in the sea manifests itself as an antielectron.
To demonstrate the consistency of the theory, we can arrive at the same conclusion by a different method. Whenever a particle has a small mass (small compared to its energy, say), it must travel at close to c. It must therefore have a very large time dilation, and will take a very long time to undergo radioactive decay. In the limit as the mass approaches zero, the time required for the decay approaches infinity. Another way of saying this is that the rate of radioactive decay must be fixed in terms of proper time, but there is no such thing as proper time for a massless particle. Thus it is not only this specific process that is forbidden, but any radioactive decay process involving a massless particle.
There are various loopholes in this argument. The question is investigated more thoroughly by Fiore and Modanese.2
In reality, such a discovery would be more of a problem for particle physicists than for relativists, as we can
see by the following sketch of an argument. Imagine two charged particles, at rest, interacting via an electrical attraction.
Quantum mechanics describes this as an exchange of photons. Since the particles are at rest, there is no source of energy,
so where do we get the energy to make the photons? The Heisenberg uncertainty principle, $\Delta E\,\Delta t \gtrsim h$, allows us to steal this energy,
provided that we give it back within a time Δ t. This time limit imposes a limit on the distance the photons can travel,
but by using photons of low enough energy, we can make this distance limit as large as we like, and there is therefore no
limit on the range of the force. But suppose that the photon has a mass. Then there is
a minimum mass-energy $mc^2$ required in order to create a photon, the maximum time is $h/mc^2$, and the maximum
range is $h/mc$. Refining these crude arguments a little, one finds that exchange of zero-mass particles gives
a force that goes like $1/r^2$, while a nonzero mass results in $e^{-\mu r}/r^2$, where
$\mu=mc/\hbar$.
For the photon, the best current mass limit corresponds to $\mu^{-1} \gtrsim 10^{11}$ m, so the deviation
from $1/r^2$ would be difficult to measure in earthbound experiments.
Now Gauss's law is a specific characteristic
of $1/r^2$ fields. It would be violated slightly if photons had mass. We would have to modify Maxwell's equations, and
it turns out4
that the necessary change to Gauss's law would be of the form
$\nabla\cdot\mathbf{E} = (\ldots)\rho - (\ldots)\mu^2\Phi$,
where Φ is the electrical potential, and (...) indicates factors that depend on the choice of units.
This tells us that Φ, which in classical electromagnetism can only be measured
in terms of differences between different points in space, can now be measured in absolute terms. Gauge symmetry has
been broken. But gauge symmetry is indispensable in creating well-behaved relativistic field theories, and
this is the reason that, in general, particle physicists have a hard time with
forces arising from the exchange of massive particles. The hypothetical Higgs particle, which may be observed at the Large Hadron
Collider in the near future, is essentially a mechanism for wriggling out of this difficulty in the case of the massive W and Z particles
that are responsible for the weak nuclear force; the mechanism cannot, however, be extended to allow a massive photon.
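To get a feel for the numbers in the range argument above, the sketch below converts a range $\mu^{-1}$ into a mass using $\mu=mc/\hbar$ (assumed here as the standard definition of the inverse range) and estimates how much the factor $e^{-\mu r}$ differs from 1 at laboratory distances. The constant names and function names are hypothetical; the values of ħ and c are the usual SI ones.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299792458.0          # speed of light, m/s

def mass_for_range(range_m):
    """Mass in kg whose exchange would give a force of range range_m (mu = m c / hbar)."""
    return HBAR / (range_m * C)

def fractional_deviation(r_m, range_m):
    """1 - exp(-r/range): fractional departure from a pure 1/r^2 law at distance r."""
    return -math.expm1(-r_m / range_m)

# Range quoted in the text for the photon mass limit: at least about 1e11 m.
photon_mass_limit_kg = mass_for_range(1e11)
deviation_at_one_meter = fractional_deviation(1.0, 1e11)

print(photon_mass_limit_kg, deviation_at_one_meter)
```

With a range of $10^{11}$ m, the deviation at a distance of one meter is of order $10^{-11}$, which is why earthbound tests are so hard.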
The early universe was dominated by radiation. A photon in a box
contributes a pressure on each wall that is proportional to $|p^\mu|$, where μ is a spacelike index.
In thermal equilibrium, each of these three degrees of freedom carries an equal amount of energy, and since
momentum and energy are equal for a massless particle, the average momentum along each axis is equal to
$\frac{1}{3}E$.
The resulting equation of state is
$P=\frac{1}{3}\rho$. As the universe expanded, the wavelengths of the
photons expanded in proportion to the stretching of the space they occupied, resulting in $\lambda \propto a$, where
a is a distance scale describing the universe's intrinsic curvature at a fixed time.
Since the number density of photons is diluted in proportion to $a^{-3}$, and the mass per photon varies as $a^{-1}$,
both ρ and P vary as $a^{-4}$.
Cosmologists refer to noninteracting, nonrelativistic materials as “dust,” which could mean many things, including hydrogen gas, actual dust, stars, galaxies, and some forms of dark matter. For dust, the momentum is negligible compared to the mass-energy, so the equation of state is P=0, regardless of ρ. The mass-energy density is dominated simply by the mass of the dust, so there is no red-shift scaling of the $a^{-1}$ type. The mass-energy density scales as $a^{-3}$. Since this is a less steep dependence on a than the $a^{-4}$ of radiation, there was a point, roughly 50,000 years after the Big Bang, when matter began to dominate over radiation. At this point, the rate of expansion of the universe made a transition to a qualitatively different behavior resulting from the change in the equation of state.
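The crossover between the two scalings can be sketched in a few lines. The code below takes radiation density proportional to $a^{-4}$ and dust density proportional to $a^{-3}$, normalizes a=1 to the present, and solves for the scale factor at which the two were equal. The present-day density ratio of 1/3000 is a made-up round number used only to make the example concrete, and the cosmological constant is ignored.

```python
def radiation_density(a, rho_r_today):
    """Radiation: number density ~ a^-3 times energy per photon ~ a^-1."""
    return rho_r_today * a ** -4

def dust_density(a, rho_m_today):
    """Dust: mass per particle fixed, number density ~ a^-3."""
    return rho_m_today * a ** -3

def equality_scale_factor(rho_r_today, rho_m_today):
    """Scale factor at which the two densities match: rho_r0 a^-4 = rho_m0 a^-3."""
    return rho_r_today / rho_m_today

RHO_M_TODAY = 1.0            # arbitrary units
RHO_R_TODAY = 1.0 / 3000.0   # hypothetical ratio, for illustration only

a_eq = equality_scale_factor(RHO_R_TODAY, RHO_M_TODAY)
print(a_eq)
print(radiation_density(a_eq / 10, RHO_R_TODAY) / dust_density(a_eq / 10, RHO_M_TODAY))
```

Because the ratio of the two densities goes like 1/a, radiation wins at any sufficiently early time no matter how small its present-day share is.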
In the present era, the universe's equation of state is dominated by neither dust nor radiation but by the cosmological constant (see page 255). Figure a shows the evolution of the size of the universe for the three different regimes. Some of the simpler cases are derived in sections 8.2.7 and 8.2.8, starting on page 276.
Frequency is to time as the wavenumber $k=1/\lambda$ is to space, so when treating waves relativistically it is
natural to conjecture that there is a four-frequency $f_a$ made by assembling
$(f,\mathbf{k})$, which behaves as a Lorentz vector. This is correct, since
we already know that $\partial_a$ transforms as a covariant vector, and for a scalar wave of the form
$A=A_o\exp[2\pi i f_a x^a]$ the partial derivative operator is identical to multiplication by $2\pi i f_a$.
As an application, consider the relativistic Doppler shift of a light wave. For simplicity, let's restrict ourselves to one spatial dimension. For a light wave, f=k, so the frequency vector in 1+1 dimensions is simply (f,f). Putting this through a Lorentz transformation, we find

$$f'=(1+v)\gamma f=\sqrt{\frac{1+v}{1-v}}\,f,$$

where the second form displays more clearly the symmetric form of the relativistic relationship, such that interchanging the roles of source and observer is equivalent to flipping the sign of v. That is, the relativistic version only depends on the relative motion of the source and the observer, whereas the Newtonian one also depends on the source's motion relative to the medium (i.e., relative to the preferred frame in which the waves have the “right” velocity). In Newtonian mechanics, we have f'=(1+v)f for a moving observer. Relativistically, there is also a time dilation of the oscillation of the source, providing an additional factor of γ.
This analysis is extended to 3+1 dimensions in problem 11.
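The one-dimensional Doppler result can be reproduced by applying a Lorentz transformation to the frequency vector (f,f). In the sketch below, positive v is taken to mean that the observer moves toward the source, which matches the factor $(1+v)\gamma$; the function names are hypothetical.

```python
import math

def transform_frequency(f, k, v):
    """Frequency vector (f, k) as seen by an observer moving at speed v toward the source."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (f + v * k), gamma * (k + v * f))

def doppler_factor(v):
    """Closed form sqrt((1+v)/(1-v)) for the same shift."""
    return math.sqrt((1.0 + v) / (1.0 - v))

f_shifted, k_shifted = transform_frequency(1.0, 1.0, 0.6)

print(f_shifted, doppler_factor(0.6))
print(doppler_factor(0.6) * doppler_factor(-0.6))
```

The transformed vector still has equal components, as it must for light, and the product of the approaching and receding factors is 1, which is the symmetry under flipping the sign of v.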
The relativistic Doppler shift differs from the nonrelativistic one by the time-dilation factor γ, so that
there is still a shift even when the relative motion of the source and the observer is perpendicular to the direction
of propagation. This is called the transverse Doppler shift. Einstein suggested this early on as a test of relativity.
However, such experiments are difficult to carry out with high precision, because they are sensitive to any error
in the alignment of the 90-degree angle. Such experiments were eventually performed, with results that confirmed
relativity,6
but one-dimensional measurements provided both the earliest tests of the relativistic Doppler shift and the most precise
ones to date. The first such test was done by Ives and Stilwell in 1938, using the following trick. The relativistic
expression
$S(v)=\sqrt{(1+v)/(1-v)}$
for the Doppler shift has the property that $S(v)S(-v)=1$, which differs from the
nonrelativistic result of $(1+v)(1-v)=1-v^2$. One can therefore accelerate an ion up to a relativistic speed, measure
both the forward Doppler shifted frequency $f_f$ and the backward one $f_b$, and compute
$\sqrt{f_f f_b}$. According
to relativity, this should exactly equal the frequency $f_o$ measured in the ion's rest frame.
In a particularly exquisite modern version of the Ives-Stilwell idea,7 Saathoff et al. circulated Li$^+$ ions at v=0.064 in a storage ring. An electron-cooler technique was used in order to reduce the variation in velocity among ions in the beam. Since the identity $S(v)S(-v)=1$ is independent of v, it was not necessary to measure v to the same incredible precision as the frequencies; it was only necessary that it be stable and well-defined. The natural line width was 7 MHz, and other experimental effects broadened it further to 11 MHz. By curve-fitting the line, it was possible to achieve results good to a few tenths of a MHz. The resulting frequencies, in units of MHz, were:
| $f_f$ | = 582490203.44 ± 0.09 |
| $f_b$ | = 512671442.9 ± 0.5 |
| $\sqrt{f_f f_b}$ | = 546466918.6 ± 0.3 |
| $f_o$ | = 546466918.8 ± 0.4 (from previous experimental work) |
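The arithmetic behind the table can be repeated directly from the quoted frequencies. The sketch below computes the geometric mean of the forward and backward frequencies and, as a cross-check, the velocity implied by their ratio, using $f_f/f_b=(1+v)/(1-v)$. The variable names are invented; the numbers are the ones listed above.

```python
import math

F_FORWARD = 582490203.44    # MHz, forward Doppler-shifted frequency
F_BACKWARD = 512671442.9    # MHz, backward Doppler-shifted frequency
F_REST = 546466918.8        # MHz, rest-frame value from previous work

geometric_mean = math.sqrt(F_FORWARD * F_BACKWARD)

# f_f / f_b = (1 + v) / (1 - v), so v = (f_f - f_b) / (f_f + f_b).
implied_velocity = (F_FORWARD - F_BACKWARD) / (F_FORWARD + F_BACKWARD)

# A nonrelativistic analysis would instead expect f_o * sqrt(1 - v^2).
nonrelativistic_expectation = F_REST * math.sqrt(1.0 - implied_velocity ** 2)

print(geometric_mean, geometric_mean - F_REST)
print(implied_velocity)
print(geometric_mean - nonrelativistic_expectation)
```

The geometric mean agrees with the rest-frame frequency to within the quoted uncertainties, whereas the nonrelativistic expectation is off by roughly a million MHz, far outside the error bars.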
The spectacular agreement with theory has made this experiment a lightning rod for anti-relativity kooks.
If one is searching for small deviations from the predictions of special relativity, a natural place to look is at high velocities. Ives-Stilwell experiments have been performed at velocities as high as 0.84, and they confirm special relativity.8
Suppose that a lantern, at rest in the lab frame, is floating weightlessly in outer space, and simultaneously emits two pulses of light in opposite directions, each with energy E/2 and frequency f. By symmetry, the momentum of the pulses cancels, and the lantern remains at rest. An observer in motion at velocity v relative to the lab sees the frequencies of the beams shifted to $f'=(1\pm v)\gamma f$. The effect on the energies of the beams can be found purely classically, by transforming the electric and magnetic fields to the moving frame, but as a shortcut we can apply the quantum-mechanical relation $E_\text{ph}=hf$ for the energies of the photons making up the beams. The result is that the moving observer finds the total energy of the beams to be not E but $(E/2)(1+v)\gamma+(E/2)(1-v)\gamma=E\gamma$.
Both observers agree that the lantern had to use up some of the energy stored in its fuel in order to make the two pulses. But the moving observer says that in addition to this energy E, there was a further energy E(γ-1). Where could this energy have come from? It must have come from the kinetic energy of the lantern. The lantern's velocity remained constant throughout the experiment, so this decrease in kinetic energy seen by the moving observer must have come from a decrease in the lantern's inertial mass --- hence the title of Einstein's paper, “Does the inertia of a body depend upon its energy content?”
To figure out how much mass the lantern has lost, we have to decide how we can even define mass in this new context. In Newtonian mechanics, we had $K=(1/2)mv^2$, and by the correspondence principle this must still hold in the low-velocity limit. Expanding E(γ-1) in a Taylor series, we find that it equals $E(v^2/2)+\ldots$, and in the low-velocity limit this must be the same as $\Delta K=(1/2)\Delta m\,v^2$, so Δm=E. Reinserting factors of c to get back to nonrelativistic units, we have $E=\Delta m\,c^2$.
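The two steps of the lantern argument, adding the Doppler-shifted beam energies and then taking the low-velocity limit, can be checked numerically. This is only an illustrative sketch in units with c=1; the function names and sample values are hypothetical.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def beam_energy_in_moving_frame(E, v):
    """Sum of the two pulse energies, each E/2 in the lab, Doppler shifted by (1 +/- v) gamma."""
    g = gamma(v)
    return (E / 2) * (1 + v) * g + (E / 2) * (1 - v) * g

def extra_energy(E, v):
    """Energy the moving observer must attribute to lost kinetic energy, E (gamma - 1)."""
    return beam_energy_in_moving_frame(E, v) - E

E = 5.0
total_fast = beam_energy_in_moving_frame(E, 0.6)   # equals E * gamma = 6.25

# Low-velocity limit: E (gamma - 1) approaches (1/2) E v^2, so the lost mass is E.
v_slow = 1e-3
ratio = extra_energy(E, v_slow) / (0.5 * E * v_slow ** 2)

print(total_fast, ratio)
```

The ratio approaches 1 as v goes to zero, which is the statement that the kinetic energy lost matches $(1/2)\Delta m\,v^2$ with Δm=E.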
It is fairly easy to see that the electric and magnetic fields cannot be the spacelike parts of two four-vectors. Consider the arrangement shown in figure b/1. We have two infinite trains of moving charges superimposed on the same line, and a single charge alongside the line. Even though the line charges formed by the two trains are moving in opposite directions, their currents don't cancel. A negative charge moving to the left makes a current that goes to the right, so in frame 1, the total current is twice that contributed by either line charge.
In frame 1 the charge densities of the two line charges cancel out, and the electric field experienced by the lone charge is therefore zero. Frame 2 shows what we'd see if we were observing all this from a frame of reference moving along with the lone charge. Both line charges are in motion in both frames of reference, but in frame 1, the line charges were moving at equal speeds, so their Lorentz contractions were equal, and their charge densities canceled out. In frame 2, however, their speeds are unequal. The positive charges are moving more slowly than in frame 1, so in frame 2 they are less contracted. The negative charges are moving more quickly, so their contraction is greater now. Since the charge densities don't cancel, there is an electric field in frame 2, which points into the wire, attracting the lone charge.
We appear to have a logical contradiction here, because an observer in frame 2 predicts that the charge will collide with the wire, whereas in frame 1 it looks as though it should move with constant velocity parallel to the wire. Experiments show that the charge does collide with the wire, so to maintain the Lorentz-invariance of electromagnetism, we are forced to invent a new kind of interaction, one between moving charges and other moving charges, which causes the acceleration in frame 2. This is the magnetic interaction, and if we hadn't known about it already, we would have been forced to invent it. That is, magnetism is a purely relativistic effect. The reason a relativistic effect can be strong enough to stick a magnet to a refrigerator is that it breaks the delicate cancellation of the extremely large electrical interactions between electrically neutral objects.
Although the example shows that the electric and magnetic fields do transform when we change from one frame to another, it is easy to show that they do not transform as the spacelike parts of a relativistic four-vector. This is because the transformation between frames 1 and 2 is along the axis parallel to the wire, but it affects the components of the fields perpendicular to the wire. The electromagnetic field actually transforms as a rank-2 tensor.
c / The charged particle follows a trajectory that extremizes $\int f_b\,dx^b$
compared to other nearby trajectories. Relativistically, the trajectory should be understood as a world-line in 3+1-dimensional spacetime.
d / The magnetic field (top) and vector potential (bottom) of a solenoid. The lower diagram is in the plane cutting through the waist of the solenoid, as indicated by the dashed line in the upper diagram. For an infinite solenoid, the magnetic field is uniform on the inside and zero on the outside, while the vector potential is proportional to r on the inside and to 1/r on the outside.
An electromagnetic quantity that does transform as a four-vector is the potential.
On page 107, I mentioned the fact, which may or may not already be familiar to you,
that whereas the Newtonian gravitational field's polarization properties allow it to be described using a single
scalar potential φ or a single vector field
$\mathbf{g}=-\nabla\phi$, the pair of electromagnetic fields $(\mathbf{E},\mathbf{B})$
needs a pair of potentials, Φ and A. It's easy to see that Φ can't be a Lorentz scalar.
Electric charge q is a scalar, so if Φ were a scalar as well, then the product qΦ would be a scalar.
But this is equal to the energy of the charged particle, which is only the timelike component of the energy-momentum
four-vector, and therefore not a Lorentz scalar itself. This is a contradiction, so Φ is not a scalar.
To see how to fit Φ into relativity, consider the nonrelativistic quantum mechanical relation qΦ=hf for a charged particle
in a potential Φ. Since f is the timelike component of a four-vector in relativity, we need Φ to be the timelike
component of some four-vector, $A_b$. For the spacelike part of this four-vector, let's write A, so that
$A_b=(\Phi,\mathbf{A})$.
We can see by the following argument that this mysterious A must have something to do with the magnetic field.
Consider the example of figure c from a quantum-mechanical point of view. The charged particle q has wave properties,
but let's say that it can be well approximated in this example as following a specific trajectory. This is like the ray
approximation to wave optics. A light ray in classical optics follows Fermat's principle, also known as the principle
of least time, which states that the ray's path from point A to point B is one that extremizes the optical path length (essentially the number of
oscillations).
The reason for this is that the ray approximation is only an approximation. The ray actually has some width, which we
can visualize as a bundle of neighboring trajectories. Only if the trajectory follows Fermat's principle will the
interference among the neighboring paths be constructive. The classical optical path length is found by integrating
$\mathbf{k}\cdot d\mathbf{s}$, where k is the wavenumber. To make this relativistic, we need to use the frequency
four-vector to form $f_b\,dx^b$, which can also be expressed as
$f_b v^b\,d\tau=\gamma(f-\mathbf{k}\cdot\mathbf{v})\,d\tau$. If the charge is at rest and there are no magnetic fields, then the quantity in parentheses
is $f=E/h=(q/h)\Phi$. The correct relativistic generalization is clearly $f_b=(q/h)A_b$.
Since $A_b$'s spacelike part, A, results in the velocity-dependent effects, we conclude that A is a kind of potential that relates to the magnetic field, in the same way that the potential Φ relates to the electric field. A is known as the vector potential, and the relation between the potentials and the fields is

$$\mathbf{E}=-\nabla\Phi-\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{B}=\nabla\times\mathbf{A}.$$
An excellent discussion of the vector potential from a purely classical point of view is given in the classic Feynman Lectures.9 Figure d shows an example.
We may wish to represent a vector in more than one coordinate system, and to convert back
and forth between the two representations. In general relativity, the transformation of
the coordinates need not be linear, as in the Lorentz transformations; it can be any smooth,
one-to-one function.
For simplicity, however, we start by considering the one-dimensional case, and by assuming the
coordinates are related in an affine manner, $x'^\mu=ax^\mu+b$.
The addition of the constant b is merely a change in the choice of origin, so it has
no effect on the components of the vector, but the dilation by the factor a
gives a change in scale, which results in $v'^\mu = a v^\mu$ for a contravariant vector.
In the special case where v is an infinitesimal displacement, this is consistent
with the result found by implicit differentiation of the coordinate transformation.
For a covariant vector,
$v'_\mu=\frac{1}{a}v_\mu$. Generalizing to more than one
dimension, and to a possibly nonlinear transformation, we have

$$v'^\mu = v^\kappa \frac{\partial x'^\mu}{\partial x^\kappa} \qquad [1]$$

$$v'_\mu = v_\kappa \frac{\partial x^\kappa}{\partial x'^\mu} \qquad [2]$$
Note the inversion of the partial derivative in one equation compared to the other. Because these equations describe a change from one coordinate system to another, they clearly depend on the coordinate system, so we use Greek indices rather than the Latin ones that would indicate a coordinate-independent equation. Note that the letter μ in these equations always appears as an index referring to the new coordinates, κ to the old ones. For this reason, we can get away with dropping the primes and writing, e.g., $v^\mu=v^\kappa\,\partial x'^\mu/\partial x^\kappa$ rather than $v'^\mu$, counting on context to show that $v^\mu$ is the vector expressed in the new coordinates, $v^\kappa$ in the old ones. This becomes especially natural if we start working in a specific coordinate system where the coordinates have names. For example, if we transform from coordinates (t,x,y,z) to (a,b,c,d), then it is clear that $v^t$ is expressed in one system and $v^c$ in the other.
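As a concrete, nonlinear instance of the two transformation laws, the sketch below converts a contravariant vector and a covariant vector from Cartesian coordinates (x,y) to polar coordinates (r,θ) in the Euclidean plane, and checks that the contraction of the two is unchanged. The plane and the polar coordinates are illustrative choices made here, and the function names are hypothetical.

```python
import math

def to_polar_contravariant(v, x, y):
    """New components v^mu = v^kappa * d(new^mu)/d(old^kappa), with new = (r, theta)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    vx, vy = v
    v_r = vx * (x / r) + vy * (y / r)            # dr/dx = x/r,  dr/dy = y/r
    v_theta = vx * (-y / r2) + vy * (x / r2)     # dtheta/dx = -y/r^2,  dtheta/dy = x/r^2
    return (v_r, v_theta)

def to_polar_covariant(w, x, y):
    """New components w_mu = w_kappa * d(old^kappa)/d(new^mu): the derivatives are inverted."""
    r = math.sqrt(x * x + y * y)
    wx, wy = w
    w_r = wx * (x / r) + wy * (y / r)            # dx/dr = x/r,  dy/dr = y/r
    w_theta = wx * (-y) + wy * x                 # dx/dtheta = -y,  dy/dtheta = x
    return (w_r, w_theta)

def contract(v, w):
    return v[0] * w[0] + v[1] * w[1]

point = (3.0, 4.0)
v_cart = (1.0, 2.0)     # contravariant components in (x, y)
w_cart = (5.0, -1.0)    # covariant components in (x, y)

v_polar = to_polar_contravariant(v_cart, *point)
w_polar = to_polar_covariant(w_cart, *point)

print(v_polar, w_polar)
print(contract(v_cart, w_cart), contract(v_polar, w_polar))
```

The individual components change, but the contraction comes out the same in both coordinate systems, which is what the opposite placement of the partial derivatives in the two laws is designed to guarantee.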
Self-check: Recall that the gauge transformations allowed in general relativity are not just any coordinate transformations; they must be (1) smooth and (2) one-to-one. Relate both of these requirements to the features of the vector transformation laws above.
In equation [2], μ appears as a subscript on the left side of the equation, but as a superscript on the right. This would appear to violate our rules of notation, but the interpretation here is that in expressions of the form $\partial/\partial x^i$ and $\partial/\partial x_i$, the superscripts and subscripts should be understood as being turned upside-down. Similarly, [1] appears to have the implied sum over κ written ungrammatically, with both κ's appearing as superscripts. Normally we only have implied sums in which the index appears once as a superscript and once as a subscript. With our new rule for interpreting indices on the bottom of derivatives, the implied sum is seen to be written correctly. This rule is similar to the one for analyzing the units of derivatives written in Leibniz notation, with, e.g., $d^2x/dt^2$ having units of meters per second squared.
A quantity v that transforms according to [1] or [2] is referred to as a rank-1 tensor, which is the same thing as a vector.
In the case of the identity transformation $x'^\mu=x^\mu$, equation [1] clearly gives v'=v, since all the mixed partial derivatives $\partial x'^\mu/\partial x^\kappa$ with μ ≠ κ are zero, and all the derivatives for κ=μ equal 1.
In equation [2], it is tempting to write

$$\frac{\partial x^\kappa}{\partial x'^\mu} = \frac{1}{\partial x'^\mu/\partial x^\kappa} \qquad \text{(wrong!)}$$

but this would give infinite results for the mixed terms! Only in the case of functions of a single variable is it possible to flip derivatives in this way; it doesn't work for partial derivatives. To evaluate these partial derivatives, we have to invert the transformation (which in this example is trivial to accomplish) and then take the partial derivatives.
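The warning about flipping partial derivatives can be made concrete with a transformation whose mixed terms are nonzero. The sketch below, again using Cartesian-to-polar coordinates in the plane as an illustrative choice not taken from the text, compares the correct $\partial x/\partial r$, obtained by inverting the whole matrix of partial derivatives, with the naive reciprocal $1/(\partial r/\partial x)$. The function names are hypothetical.

```python
import math

def jacobian_new_wrt_old(x, y):
    """Rows: new coordinate (r, theta). Columns: old coordinate (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]

J = jacobian_new_wrt_old(3.0, 4.0)
J_inv = inverse_2x2(J)      # rows: old coordinate (x, y); columns: new (r, theta)

dx_dr_correct = J_inv[0][0]     # equals cos(theta) = x/r
dx_dr_naive = 1.0 / J[0][0]     # reciprocal of dr/dx, which is 1/cos(theta)

print(dx_dr_correct, dx_dr_naive)
```

The correct set of inverted derivatives is the matrix inverse, not the element-by-element reciprocal; the two agree only in the single-variable case.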
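The metric is a rank-2 tensor, and transforms analogously:
$g_{\mu\nu} = g_{\kappa\lambda}\frac{\partial x^\kappa}{\partial x'^\mu}\frac{\partial x^\lambda}{\partial x'^\nu}$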
(writing g rather than g' on the left, because context makes the distinction clear).
Self-check: Write the similar expressions for $g^\mu{}_\nu$, $g_\mu{}^\nu$, and $g^{\mu\nu}$, which are entirely determined by the grammatical rules for writing superscripts and subscripts. Interpret the case of a rank-0 tensor.
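As a concrete instance of the rank-2 transformation law for the lower-index metric, here is a small sketch that carries the Euclidean metric of the plane from Cartesian coordinates $(x,y)$ into polar coordinates $(r,\theta)$. The choice of coordinates and the function name `polar_metric` are assumptions made for illustration; the expected outcome is the familiar $g_{rr}=1$, $g_{\theta\theta}=r^2$, with vanishing off-diagonal terms.

```python
import math

def polar_metric(r, theta):
    """Apply g'_{mu nu} = g_{kappa lambda} (dx^kappa/dx'^mu)(dx^lambda/dx'^nu)
    with old coordinates (x, y) and new coordinates (r, theta)."""
    # d[kappa][mu] = d x^kappa / d x'^mu, from x = r cos(theta), y = r sin(theta)
    d = [[math.cos(theta), -r * math.sin(theta)],
         [math.sin(theta),  r * math.cos(theta)]]
    g_old = [[1.0, 0.0],
             [0.0, 1.0]]
    return [[sum(g_old[k][l] * d[k][mu] * d[l][nu]
                 for k in range(2) for l in range(2))
             for nu in range(2)]
            for mu in range(2)]

g = polar_metric(2.0, 0.7)
print(g)   # approximately [[1, 0], [0, r**2]] with r = 2
```

Note that the partial derivatives used here are those of the old coordinates with respect to the new ones, which is why no matrix inversion is needed for the lower-index form.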









The closely related topic of a uniform gravitational field in general relativity is considered in problem 5 on page 182.
The relation between the potential A and the fields E and B given on page 123
can be written in manifestly covariant form as
$F_{ij}=\partial_{[i}A_{j]}$, where F, called the electromagnetic tensor,
is an antisymmetric rank-two tensor whose six independent components correspond in a certain way with the components of the E and B
three-vectors. If F vanishes completely at a certain point in spacetime, then the linear form of the tensor transformation laws guarantees
that it will vanish in all coordinate systems, not just one. The GPS system takes advantage of this fact
in the transmission of timing signals from the satellites to the users. The electromagnetic wave is modulated so that the bits
it transmits are represented by phase reversals of the wave. At these phase reversals, F vanishes, and this vanishing holds true
regardless of the motion of the user's unit or its position in the earth's gravitational field.
The techniques developed in this chapter allow us to make a variety of new predictions that can be tested by experiment. In general, the mathematical treatment of all observables in relativity as tensors means that all observables must obey the same transformation laws. This is an extremely strict statement, because it requires that a wide variety of physical systems show identical behavior. For example, we already mentioned on page 67 the 2007 Gravity Probe B experiment (discussed in detail on pages 153 and 196), in which four gyroscopes aboard a satellite were observed to precess due to special- and general-relativistic effects. The gyroscopes were complicated electromechanical systems, but the predicted precession was entirely independent of these complications. We argued that if two different types of gyroscopes displayed different behaviors, then the resulting discrepancy would allow us to map out some mysterious vector field. This field would be a built-in characteristic of spacetime (not produced by any physical objects nearby), and since all observables in general relativity are supposed to be tensors, the field would have to transform as a tensor. Let's say that this tensor was of rank 1. Since the tensor transformation law is linear, a nonzero tensor can never be transformed into a vanishing tensor in another coordinate system. But by the equivalence principle, any special, local property of spacetime can be made to vanish by transforming into a free-falling frame of reference, in which the spacetime has a generic Lorentzian geometry. The mysterious new field should therefore vanish in such a frame. This is a contradiction, so we conclude that different types of gyroscopes cannot differ in their behavior.
This is an example of a new way of stating the equivalence principle: there is no way to associate a preferred tensor field with spacetime.11
In a Lorentz invariant theory, we interpret c as a property of the underlying spacetime, not of the particles that inhabit it. One way in which Lorentz invariance could be violated would be if different types of particles had different maximum velocities. In 1997, Coleman and Glashow suggested a sensitive test for such an effect.12
Assuming Lorentz invariance, a photon cannot decay into an electron and a positron, $\gamma \rightarrow e^+ + e^-$ (example 8, page 116). Suppose, however, that material particles have a maximum speed $c_m=1$, while photons have a maximum speed $c_p>1$. Then the photon's momentum four-vector, $(E,E/c_p)$, is timelike, so a frame does exist in which its three-momentum is zero, and conservation of energy and momentum no longer forbids the decay once the photon is energetic enough to supply the mass of the pair. The detection of cosmic-ray gammas from distant sources with energies on the order of 10 TeV puts an upper limit on the decay rate, implying $c_p-1 \lesssim 10^{-15}$.
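The order of magnitude of this bound can be recovered from kinematics alone. The sketch below assumes that the decay becomes possible as soon as the invariant mass of the photon's four-vector $(E,E/c_p)$ reaches twice the electron mass, and it uses the small-difference approximation $1-1/c_p^2\approx 2(c_p-1)$. It is an estimate of the kinematic threshold only, not of the decay rate, and the function name is illustrative.

```python
# Units: energies in TeV, with the maximum speed of material particles c_m = 1.
M_ELECTRON_TEV = 0.511e-6   # electron mass, 0.511 MeV expressed in TeV

def max_speed_excess(photon_energy_tev, m=M_ELECTRON_TEV):
    """Largest c_p - 1 for which a photon of this energy still cannot decay.

    Threshold condition: E^2 (1 - 1/c_p^2) = (2 m)^2.
    For c_p - 1 << 1 this gives c_p - 1 ~ 2 m^2 / E^2.
    """
    return 2.0 * m**2 / photon_energy_tev**2

delta = max_speed_excess(10.0)
print(delta)   # a few times 10^-15 for a 10 TeV photon
```

The approximation is used directly, rather than computing $c_p$ and subtracting 1, because the difference is too close to floating-point rounding error to be obtained reliably that way.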
An even more stringent limit can be put on the possibility of $c_p<1$. When a charged particle moves through a medium at a speed higher than the speed of light in the medium, Cerenkov radiation results. If $c_p$ is less than 1, then Cerenkov radiation could be emitted by high-energy charged particles in a vacuum, and the particles would rapidly lose energy. The observation of cosmic-ray protons with energies $\sim 10^8$ TeV requires $c_p-1 \gtrsim -10^{-23}$.
The straightforward properties of the momentum four-vector have surprisingly far-reaching implications for matter subject to extreme pressure, as in a star that uses up all its fuel for nuclear fusion and collapses. These implications were initially considered too exotic to be taken seriously by astronomers. For historical perspective, consider that in 1916, when Einstein published the theory of general relativity, the Milky Way was believed to constitute the entire universe; the “spiral nebulae” were believed to be inside it, rather than being similar objects exterior to it. The only types of stars whose structure was understood even vaguely were those that were roughly analogous to our own sun. (It was not known that nuclear fusion was their source of energy.) The term “white dwarf” had not been invented, and neutron stars were unknown.
An ordinary, smallish star such as our own sun has enough hydrogen to sustain fusion reactions for billions of years, maintaining an equilibrium between its gravity and the pressure of its gases. When the hydrogen is used up, it has to begin fusing heavier elements. This leads to a period of relatively rapid fluctuations in structure. Nuclear fusion proceeds up until the formation of elements as heavy as oxygen (Z=8), but the temperatures are not high enough to overcome the strong electrical repulsion of these nuclei to create even heavier ones. Some matter is blown off, but finally nuclear reactions cease and the star collapses under the pull of its own gravity.
To understand what happens in such a collapse, we have to understand the behavior of gases under very high pressures. In general, a surface area A within a gas is subject to collisions in a time t from the n particles occupying the volume V=Avt, where v is the typical velocity of the particles. The resulting pressure is given by P∼ npv/V, where p is the typical momentum.
As a star with the mass of our sun collapses, it reaches a point at which the electrons begin to behave as a degenerate gas, and the collapse stops. The resulting object is called a white dwarf. A white dwarf should be an extremely compact body, about the size of the Earth. Because of its small surface area, it should emit very little light. In 1910, before the theoretical predictions had been made, Russell, Pickering, and Fleming discovered that 40 Eridani B had these characteristics. Russell recalled: “I knew enough about it, even in these paleozoic days, to realize at once that there was an extreme inconsistency between what we would then have called `possible' values of the surface brightness and density. I must have shown that I was not only puzzled but crestfallen, at this exception to what looked like a very pretty rule of stellar characteristics; but Pickering smiled upon me, and said: `It is just these exceptions that lead to an advance in our knowledge,' and so the white dwarfs entered the realm of study!”
S. Chandrasekhar showed in the 1930s that there
was an upper limit to the mass of a white dwarf. We will recapitulate his calculation briefly in condensed order-of-magnitude form.
The pressure at the core of the star is $P\sim \rho g r\sim GM^2/r^4$, where $M$
is the total mass of the star. The star contains roughly equal numbers of neutrons, protons, and electrons, so $M=Knm$, where $m$ is
the mass of the electron, $n$ is the number of electrons, and $K\approx 4000$. For stars near the limit, the electrons are relativistic.
Setting the pressure at the core equal to the degeneracy pressure of a relativistic gas, we find that the
Chandrasekhar limit
is
$\sim (hc/G)^{3/2}(Km)^{-2}=6M_\odot$. A less sloppy calculation gives something more like
$1.4M_\odot$.
The self-consistency of this solution is investigated in homework problem 15 on page 142.
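The arithmetic of the order-of-magnitude estimate can be sketched as follows. This assumes a relativistic degeneracy pressure of the form $P\sim hc(n/V)^{4/3}$ with $V\sim r^3$, which is not spelled out in this section and is taken here as an assumption; with that form the radius cancels, and balancing against $GM^2/r^4$ with $M=Knm$ fixes $n$. The constants are rounded, and the function name is illustrative.

```python
# Rounded SI constants
h = 6.626e-34      # Planck's constant, J s
c = 2.998e8        # speed of light, m/s
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
m_e = 9.109e-31    # electron mass, kg
K = 4000.0         # stellar mass per electron, in units of the electron mass
M_SUN = 1.989e30   # solar mass, kg

def crude_mass_limit(K, m):
    """Solve G (K n m)^2 / r^4 = h c n^(4/3) / r^4 for n, then return M = K n m.

    Assumes P ~ h c (n/V)^(4/3) for the degenerate relativistic gas; the
    factors of r^4 cancel, so no radius appears in the answer.
    """
    n = (h * c / G) ** 1.5 / (K * m) ** 3
    return K * n * m

limit_solar = crude_mass_limit(K, m_e) / M_SUN
print(limit_solar)   # of order a few solar masses
```

Because every numerical factor of order unity has been dropped, only the order of magnitude is meaningful; the careful calculation lands lower, near 1.4 solar masses.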
What happens to a star whose mass is above the Chandrasekhar limit? As nuclear fusion reactions flicker out, the core of the star becomes a white dwarf, but once fusion ceases completely this cannot be an equilibrium state. Now consider the nuclear reactions
$n \rightarrow p + e^- + \bar{\nu}$
$p + e^- \rightarrow n + \nu$
which happen due to the weak nuclear force. The first of these releases 0.8 MeV, and has a mean lifetime of about 15 minutes (a half-life of about 10 minutes). This explains why free neutrons are not observed in significant numbers in our universe, e.g., in cosmic rays. The second reaction requires an input of 0.8 MeV of energy, so a free hydrogen atom is stable. The white dwarf contains fairly heavy nuclei, not individual protons, but similar considerations would seem to apply. A nucleus can absorb an electron and convert a proton into a neutron, and in this context the process is called electron capture. Ordinarily this process will only occur if the nucleus is neutron-deficient; once it reaches a neutron-to-proton ratio that optimizes its binding energy, electron capture cannot proceed without a source of energy to make the reaction go. In the environment of a white dwarf, however, there is such a source. The annihilation of an electron opens up a hole in the “Fermi sea.” There is now a state into which another electron is allowed to drop without violating the exclusion principle, and the effect cascades upward. In a star with a mass above the Chandrasekhar limit, this process runs to completion, with every proton being converted into a neutron. The result is a neutron star, which is essentially an atomic nucleus (with Z=0) with the mass of a star!
Observational evidence for the existence of neutron stars came in 1967 with the detection by Bell and Hewish at Cambridge
of a mysterious radio signal with a period of 1.3373011 seconds. The signal's observability
was synchronized with the rotation of the earth relative to the stars, rather than with legal clock time
or the earth's rotation relative to the sun. This led to the conclusion that its origin was in space rather than
on earth, and Bell and Hewish originally dubbed it LGM-1 for “little green men.” The discovery of a second signal,
from a different direction in the sky, convinced them that it was not actually an artificial signal being generated
by aliens. Bell published the observation as an appendix to her PhD thesis, and it was soon interpreted as
a signal from a neutron star. Neutron stars can be highly magnetized, and because of this magnetization they
may emit a directional beam of electromagnetic radiation that sweeps across the sky once per rotational period ---
the “lighthouse effect.” If the earth lies in the plane of the beam, a periodic signal can be detected, and
the star is referred to as a pulsar. It is fairly easy to see that the short period of rotation
makes it difficult to explain a pulsar as any kind of less exotic rotating object. In the approximation of
Newtonian mechanics, a spherical body of density $\rho$, rotating with a period
$T=\sqrt{3\pi/(G\rho)}$, has zero
apparent gravity at its equator, since gravity is just strong enough to accelerate an object so that it
follows a circular trajectory above a fixed point on the surface (problem 14).
In reality, astronomical bodies of planetary size and greater are held together by their own gravity, so we have
$T \gtrsim 1/\sqrt{G\rho}$ for any body that does not fly apart spontaneously due to its own rotation.
In the case of the Bell-Hewish pulsar, this implies $\rho \gtrsim 10^{10}\ \text{kg/m}^3$, which is
far larger than the density of normal matter, and also 10-100 times greater than the typical density of
a white dwarf near the Chandrasekhar limit.
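The density bound for the Bell-Hewish pulsar is a one-line calculation. The sketch below assumes the Newtonian breakup condition for a uniform sphere, in which surface gravity $\tfrac{4}{3}\pi G\rho r$ equals the centripetal acceleration $(2\pi/T)^2 r$, giving $T=\sqrt{3\pi/(G\rho)}$, and then drops the factor of order unity to get $\rho\gtrsim 1/(GT^2)$. The function names are illustrative.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
T_PULSAR = 1.3373011   # period of the Bell-Hewish signal, s

def breakup_period(rho):
    """Rotation period at which apparent gravity vanishes at the equator
    of a uniform sphere of density rho (Newtonian approximation)."""
    return math.sqrt(3.0 * math.pi / (G * rho))

def min_density(period):
    """Order-of-magnitude lower bound on density, rho ~ 1 / (G T^2)."""
    return 1.0 / (G * period**2)

rho_min = min_density(T_PULSAR)
print(rho_min)                  # roughly 10^10 kg/m^3
print(breakup_period(1000.0))   # for comparison: water density, in seconds
```

For ordinary densities the breakup period comes out to hours, which is why a period of about a second points to something far denser than normal matter.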
An upper limit on the mass of a neutron star can be found in a manner entirely analogous to the
calculation of the Chandrasekhar limit. The only difference is that the mass of a neutron is much
greater than the mass of an electron, and the neutrons are the only particles present, so there is
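no factor of K. Assuming the more precise result of
$1.4M_\odot$ for the Chandrasekhar limit rather than our
sloppy one, and ignoring the interaction of the neutrons via the strong nuclear force,
we can infer an upper limit on the mass of a neutron star:
$(1.4M_\odot)\left(\frac{Km_e}{m_n}\right)^2 \approx 5M_\odot$, where $m_e$ and $m_n$ are the masses of the electron and the neutron.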
The theoretical uncertainties in such an estimate are fairly large. Tolman, Oppenheimer, and Volkoff
originally estimated it in 1939 as
$0.7M_\odot$, whereas modern estimates are more in the range of 1.5
to
$3M_\odot$. These are significantly lower than our crude estimate of
$5M_\odot$,
mainly because the attractive nature of the strong nuclear force tends to pull the star toward
collapse. Unambiguous results are presently impossible because of uncertainties in extrapolating the
behavior of the strong force from the regime of ordinary nuclei, where it has been relatively well parametrized, into
the exotic environment of a neutron star, where the density is significantly different and no protons
are present. There are a variety of effects that may be difficult to anticipate or to calculate. For example,
Brown and Bethe found in 1994 (footnote 13) that it might be possible for the mass limit to be drastically revised because
of the process $e^- \rightarrow K^- + \nu_e$, which is impossible in free space due to conservation of energy, but
might be possible in a neutron star. Observationally, nearly all neutron stars seem to lie in a surprisingly
small range of mass, between 1.3 and
$1.45M_\odot$, but in 2010 a neutron star with a mass of
$1.97\pm 0.04\,M_\odot$ was discovered,
ruling out most neutron-star models that included exotic matter.14
For stars with masses above the Tolman-Oppenheimer-Volkoff limit, theoretical predictions become even more speculative. A variety of bizarre objects has been proposed, including black stars, gravastars, quark stars, boson stars, Q-balls, and electroweak stars. It seems likely, however, both on theoretical and observational grounds, that objects with masses of about 3 to 20 solar masses end up as black holes; see section 6.3.3.
a / Two spacelike surfaces.
b / We define a boundary around a region whose charge we want to measure.
c / This boundary cuts the sphere into equal parts.
Some of the first tensors we discussed were mass and charge, both rank-0 tensors, and the rank-1 momentum tensor, which contains both the classical energy and the classical momentum. Physicists originally decided that mass, charge, energy, and momentum were interesting because these things were found to be conserved. This makes it natural to ask how conservation laws can be formulated in relativity. We're used to stating conservation laws casually in terms of the amount of something in the whole universe, e.g., that classically the total amount of mass in the universe stays constant. Relativity does allow us to make physical models of the universe as a whole, so it seems as though we ought to be able to talk about conservation laws in relativity.
We can't.
First, how do we define “stays constant?” Simultaneity isn't well-defined, so we can't just take two snapshots, call them initial and final, and compare the total amount of, say, electric charge in each snapshot. This difficulty isn't insurmountable. As in figure a, we can arbitrarily pick out three-dimensional spacelike surfaces --- one initial and one final --- and integrate the charge over each one. A law of conservation of charge would say that no matter what spacelike surface we picked, the total charge on each would be the same.
Next there's the issue that the integral might diverge, especially if the universe was spatially infinite. For now, let's assume a spatially finite universe. For simplicity, let's assume that it has the topology of a three-sphere (see section 8.2 for reassurance that this isn't physically unreasonable), and we can visualize it as a two-sphere.
In the case of the momentum four-vector, what coordinate system would we express it in? In general, we do not even expect to be able to define a smooth, well-behaved coordinate system that covers the entire universe, and even if we did, it would not make sense to add a vector expressed in that coordinate system at point A to another vector from point B; the best we could do would be to parallel-transport the vectors to one point and then add them, but parallel transport is path-dependent. (Similar issues occur with angular momentum.) For this reason, let's restrict ourselves to the easier case of a scalar, such as electric charge.
But now we're in real trouble. How would we go about actually measuring the total electric charge of the universe? The only way to do it is to measure electric fields, and then apply Gauss's law. This requires us to single out some surface that we can integrate the flux over, as in b. This would really be a two-dimensional surface on the three-sphere, but we can visualize it as a one-dimensional surface --- a closed curve --- on the two-sphere. But now suppose this curve is a great circle, c. If we measure a nonvanishing total flux across it, how do we know where the charge is? It could be on either side.
The conclusion is that conservation laws only make sense in relativity under very special circumstances.15 We do not have anything like over-arching, global principles of conservation. As an example of the appropriate special circumstances, section 6.2.6, p. 200 shows how to define conserved quantities, which behave like energy and momentum, for the motion of a test particle in a particular metric that has a certain symmetry. This is generalized on p. 225 to a general, global conservation law corresponding to every continuous symmetry of a spacetime.
d / A relativistic jet.
e / Two light rays travel in the earth's equatorial plane from A to B. Due to frame-dragging, the ray moving with the earth's rotation is deflected by a greater amount than the one moving contrary to it. As a result, the figure has an asymmetric banana shape. Both the deflection and its asymmetry are greatly exaggerated.
f / Gravity Probe B verified the existence of frame-dragging. The rotational axis of the gyroscope precesses in two perpendicular planes due to the two separate effects: geodetic and frame-dragging.
Another special case where conservation laws work is that if the spacetime we're studying gets very flat at large distances from a small system we're studying, then we can define a far-away boundary that surrounds the system, measure the flux through that boundary, and find the system's charge. For such asymptotically flat spacetimes, we can also get around the problems that crop up with conserved vectors, such as momentum. If the spacetime far away is nearly flat, then parallel transport loses its path-dependence, so we can unambiguously define a notion of parallel-transporting all the contributions to the flux to one arbitrarily chosen point P and then adding them. Asymptotic flatness also allows us to define an approximate notion of a global Lorentz frame, so that the choice of P doesn't matter.
As an example, figure d shows a jet of matter being ejected from the galaxy M87 at ultrarelativistic speeds. The blue color of the jet in the visible-light image comes from synchrotron radiation, which is the electromagnetic radiation emitted by relativistic charged particles accelerated by a magnetic field. The jet is believed to be coming from a supermassive black hole at the center of M87. The emission of the jet in a particular direction suggests that the black hole is not spherically symmetric. It seems to have a particular axis associated with it. How can this be? Our sun's spherical symmetry is broken by the existence of externally observable features such as sunspots and the equatorial bulge, but the only information we can get about a black hole comes from its external gravitational (and possibly electromagnetic) fields. It appears that something about the spacetime metric surrounding this black hole breaks spherical symmetry, but preserves symmetry about some preferred axis. What aspect of the initial conditions in the formation of the hole could have determined such an axis? The most likely candidate is the angular momentum. We are thus led to suspect that black holes can possess angular momentum, that angular momentum preserves information about their formation, and that angular momentum is externally detectable via its effect on the spacetime metric.
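What would the form of such a metric be? Spherical coordinates in flat spacetime give a metric like this:
$ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2$
We'll see in chapter 6 that for a non-rotating black hole, the metric is of the form
$ds^2 = (\ldots)dt^2 - (\ldots)dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2$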
where (…) represents functions of r. In fact, there is nothing special about the metric of a black hole, at least far away; the same external metric applies to any spherically symmetric, non-rotating body, such as the moon. Now what about the metric of a rotating body? We expect it to have the following properties:
Restricting our attention to the equatorial plane $\theta=\pi/2$, the simplest modification that has these three properties is to add a term of the form
$(\ldots)L\,d\phi\,dt$
where (…) again gives the r-dependence and L is a constant, interpreted as the angular momentum. A detailed treatment is beyond the scope of this book, but solutions of this form to the relativistic field equations were found by New Zealand-born physicist Roy Kerr in 1963 at the University of Texas at Austin.
The astrophysical modeling of observations like figure d is complicated, but we can see in a simplified thought experiment that if we want to determine the angular momentum of a rotating body via its gravitational field, it will be difficult unless we use a measuring process that takes advantage of the asymptotic flatness of the space. For example, suppose we send two beams of light past the earth, in its equatorial plane, one on each side, and measure their deflections, e. The deflections will be different, because the sign of $d\phi\,dt$ will be opposite for the two beams. But the entire notion of a “deflection” only makes sense if we have an asymptotically flat background, as indicated by the dashed tangent lines. Also, if spacetime were not asymptotically flat in this example, then there might be no unambiguous way to determine whether the asymmetry was due to the earth's rotation, to some external factor, or to some kind of interaction between the earth and other bodies nearby.
It also turns out that a gyroscope in such a gravitational field precesses. This effect, called frame dragging, was predicted by Lense and Thirring in 1918, and was finally verified experimentally in 2008 by analysis of data from the Gravity Probe B experiment, to a precision of about 15%. The experiment was arranged so that the relatively strong geodetic effect (6.6 arc-seconds per year) and the much weaker Lense-Thirring effect (.041 arc-sec/yr) produced precessions in perpendicular directions. Again, the presence of an asymptotically flat background was involved, because the probe measured the orientations of its gyroscopes relative to the guide star IM Pegasi.
This section can be skipped on a first reading.
We've embarked on a program of redefining every possible physical quantity as a tensor, but so far we haven't tackled area and volume. Is there, for example, an area tensor in a locally Euclidean plane? We are encouraged to hope that there is such a thing, because on p. 45 we saw that we could cook up a measure of area with no other ingredients than the axioms of affine geometry. What kind of tensor would it be? The notions of vector and scalar from freshman mechanics are distinguished from one another by the fact that one has a direction in space and the other does not. Therefore we expect that area would be a scalar, i.e., a rank-0 tensor. But this can't be right, for the following reason. Under a rescaling of coordinates by a factor $k$, area should change by a factor of $k^2$. But by the tensor transformation laws, a rank-0 tensor is supposed to be invariant under a change of coordinates. We therefore conclude that quantities like area and volume are not tensors.
In the language of ordinary vectors and scalars in Euclidean three-space, one way to express area and volume is by using
dot and cross products. The area of the parallelogram spanned by u and v is measured by the area vector
$\mathbf{A}=\mathbf{u}\times\mathbf{v}$,
and similarly the volume of the parallelepiped formed by u, v, and w can be computed as the scalar triple product
$\mathbf{u}\cdot(\mathbf{v}\times\mathbf{w})$.
Both of these quantities are defined such that interchanging two of the inputs negates the output. In differential
geometry, we do have a scalar product, which is defined by contracting the indices of two vectors, as in $u^a v_a$. If we also had
a tensorial cross product, we would be able to define area and volume tensors, so we conclude that there is no tensorial cross product, i.e.,
an operation that would multiply two rank-1 tensors to produce a rank-1 tensor. Since one of the most important physical applications of
the cross product is to calculate the angular momentum
$\mathbf{L}=\mathbf{r}\times\mathbf{p}$, we find that angular momentum in relativity is
either not a tensor or not a rank-1 tensor.
When someone tells you that it's impossible to do a seemingly straightforward thing, the typical response is to look for a way to get around the supposed limitation. In the case of a locally Euclidean plane, what is to stop us from making a small, standard square, and then sliding the square around to any desired location? If we have some figure whose area we wish to measure, we can then dissect it into squares of that size and count the number of squares.
There are two problems with this plan, neither of which is completely insurmountable. First, the area vector
is a vector, with its orientation specified by the direction of the normal to the surface. We need this orientation, for example, when we calculate
the electric flux as
$\Phi=\int \mathbf{E}\cdot d\mathbf{A}$. Figure a shows that we cannot always define such an orientation
in a consistent way. When the x-y coordinate system is slid around the Möbius strip, it ends up with the opposite orientation.
In general relativity, there is not any guarantee of orientability in space --- or even in time! But the vast majority
of spacetimes of physical interest are in fact orientable in every desired way, and even for those that aren't, orientability still holds
in any sufficiently small neighborhood.
The other problem is simply that area has the wrong scaling properties to be a rank-0 tensor. We can get around this problem simply by being willing to discuss quantities that don't transform exactly like tensors. Often we only care about transformations, such as rotations and translations, that don't involve any scaling. We saw in section 2.2 on p. 46 that Lorentz boosts also have the special property of preserving area in a space-time plane containing the boost. We therefore define a tensor density as a quantity that transforms like a tensor under rotations, translations, and boosts, but that rescales and possibly flips its sign under other types of coordinate transformations. In general, the additional factor comes from the determinant $d$ of the matrix consisting of the partial derivatives $\partial x'^\mu/\partial x^\nu$ (called the Jacobian matrix). This determinant is raised to a power $W$, known as the weight of the tensor density. Weight zero corresponds to the case of a real tensor.
In a Euclidean plane, making our rulers longer by a factor of $k$ causes the area measured in the new coordinates to shrink, being multiplied by $1/k^2$. The rescaling is represented by a matrix of partial derivatives that is simply $kI$, where $I$ is the identity matrix. The determinant is $k^2$. Therefore area is a tensor density of weight $-1$.
A piece of aluminum foil has a certain number of milligrams per square centimeter. Stretching rulers by $k$ causes this number to increase by a factor of $k^2$, so this mass density has $W=+1$.
Let a line segment of unit length be parallel to the x axis in the Euclidean plane. Then the transformation $(x,y)\rightarrow (x/k,y)$ changes its length to $1/k$, which would lead us to imagine that length was a tensor density with $W=-1/2$. But $(x,y)\rightarrow (x,y/k)$ doesn't change the length at all, suggesting $W=0$ (a pure tensor). The result is that length is not a tensor or tensor density of any kind.
Generalizing to more than two dimensions, an m-dimensional volume embedded in an n-dimensional space is a tensor density with W=-1 if and only if m=n; for m<n, the m-volume isn't a tensor density at all.
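The weights quoted in these examples can be recovered by bookkeeping. The sketch below assumes the convention used in the rescaling examples, in which stretching the rulers by $k$ in $n$ dimensions is represented by the matrix $kI$ with determinant $k^n$, and a quantity of weight $W$ picks up a factor of that determinant raised to the power $W$. The helper name `weight` is illustrative.

```python
import math

def weight(factor, det):
    """Solve factor = det**W for the weight W."""
    return math.log(factor) / math.log(det)

k = 3.0   # arbitrary stretch factor for the rulers

# Plane (n = 2): determinant of k*I is k^2.
det_2d = k**2
w_area = weight(1.0 / k**2, det_2d)      # measured area is multiplied by 1/k^2
w_surface_density = weight(k**2, det_2d) # mass per unit area is multiplied by k^2

# Three dimensions (n = 3): determinant of k*I is k^3.
det_3d = k**3
w_volume = weight(1.0 / k**3, det_3d)    # measured volume is multiplied by 1/k^3

print(w_area, w_surface_density, w_volume)   # close to -1, +1, -1
```

The result does not depend on the value chosen for `k`, which is the sense in which a weight is a property of the quantity rather than of the particular rescaling.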
In Weyl's apt characterization,16 tensors represent intensities, while tensor densities measure quantity.
Although there is no tensorial vector cross product, we can define a similar operation whose output is a tensor density. This is most easily expressed in terms of the Levi-Civita symbol ε. (See p. 83 for biographical information about Levi-Civita.)
In $n$ dimensions, the Levi-Civita symbol has $n$ indices. It is defined so as to be totally antisymmetric, in the sense that if any two of the indices are interchanged, its sign flips. This is sufficient to define the symbol completely except for an over-all scaling, which is fixed by arbitrarily taking one of the nonvanishing elements and setting it to +1. To see that this is enough to define $\epsilon$ completely, first note that it must vanish when any index is repeated. For example, in three dimensions labeled by $\kappa$, $\lambda$, and $\mu$, $\epsilon_{\kappa\lambda\lambda}$ is unchanged under an interchange of the second and third indices, but it must also flip its sign under this operation, which means that it must be zero. If we arbitrarily fix $\epsilon_{\kappa\lambda\mu}=+1$, then interchange of the second and third indices gives $\epsilon_{\kappa\mu\lambda}=-1$, and a further interchange of the first and second yields $\epsilon_{\mu\kappa\lambda}=+1$. Any permutation of the three distinct indices can be reached from any other by a series of such pairwise swaps, and the number of swaps is uniquely odd or even.17 In Cartesian coordinates in three dimensions, it is conventional to choose $\epsilon_{xyz}=+1$ when x, y, and z form a right-handed spatial coordinate system. In four dimensions, we take $\epsilon_{txyz}=+1$ when t is future-timelike and (x,y,z) are right-handed.
In Euclidean three-space, in coordinates such that g=diag(1,1,1), the vector cross product
$\mathbf{A}=\mathbf{u}\times\mathbf{v}$, where we have
in mind the interpretation of A as area, can be expressed
as $A_\mu=\epsilon_{\mu\kappa\lambda}u^\kappa v^\lambda$.
Self-check: Check that this matches up with the more familiar definition of the vector cross product.
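A numerical spot-check, which is not a substitute for working through the self-check by hand, can be sketched as follows. It assumes Cartesian coordinates with g=diag(1,1,1), so that upper and lower indices need not be distinguished, and it labels the axes x, y, z as 0, 1, 2. The symbol is built from the parity of the permutation of its indices, and the function names are illustrative.

```python
def levi_civita(*idx):
    """Levi-Civita symbol with indices 0..n-1: 0 if any index repeats,
    otherwise +1 or -1 according to the parity of the permutation."""
    n = len(idx)
    if sorted(idx) != list(range(n)):
        return 0
    # Each out-of-order pair corresponds to one pairwise swap.
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

def cross_via_epsilon(u, v):
    """A_mu = epsilon_{mu kappa lambda} u^kappa v^lambda."""
    return [sum(levi_civita(m, k, l) * u[k] * v[l]
                for k in range(3) for l in range(3))
            for m in range(3)]

def cross_familiar(u, v):
    """Component formula for the ordinary vector cross product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

u = [1, 2, 3]
v = [4, 5, 6]
print(cross_via_epsilon(u, v), cross_familiar(u, v))
```

Swapping the arguments `u` and `v` negates the output in both versions, as expected for an operation built from a totally antisymmetric symbol.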
Now suppose that we want to generalize to curved spaces, where g cannot be constant. There are two ways to proceed.
One is to let $\epsilon$ have the values 0 and $\pm 1$ at some arbitrarily chosen point, in some arbitrarily chosen coordinate system, but to let it transform like a tensor. Then $A_\mu=\epsilon_{\mu\kappa\lambda}u^\kappa v^\lambda$ needs to be modified, since the right-hand side is a tensor, and that would make A a tensor, but if A is an area we don't want it to transform like a 1-tensor. We therefore need to revise the definition of area to be $A_\mu=g^{-1/2}\epsilon_{\mu\kappa\lambda}u^\kappa v^\lambda$, where $g$ is the determinant of the lower-index form of the metric. The following two examples justify this procedure in a locally Euclidean three-space.
Then scaling of coordinates by $k$ scales all the elements of the metric by $k^{-2}$, $g$ by $k^{-6}$, $g^{-1/2}$ by $k^3$, $\epsilon_{\mu\kappa\lambda}$ by $k^{-3}$, and $u^\kappa v^\lambda$ by $k^2$. The result is to scale $A_\mu$ by $k^{+3-3+2}=k^2$, which makes sense if A is an area.
In oblique coordinates (example 8, p. 94), the two basis vectors have unit
length but are at an angle φ ≠ π/2 to one another. The determinant of the metric is g=sin2φ, so
,
which is exactly the correction factor needed in order to get the right area when u and v are the two basis vectors.
This procedure works more generally, the sole modification being that in a space such as a locally Lorentzian one where $g<0$ we need
to use
$(-g)^{-1/2}$ as the correction factor rather than
$g^{-1/2}$.
The other option is to let $\epsilon$ have the same 0 and $\pm 1$ values at all points. Then $\epsilon$ is clearly not a tensor, because
it doesn't scale by a factor of $k^n$ when the coordinates are scaled by $k$; $\epsilon$ is a tensor density with weight +1 for the upper-index version
and -1 for the lower-index one. The relation $A_\mu=\epsilon_{\mu\kappa\lambda}u^\kappa v^\lambda$ gives an area that is a tensor density,
not a tensor, because A is not written in terms of purely tensorial quantities. Scaling the coordinates by $k$ leaves $\epsilon_{\mu\kappa\lambda}$
unchanged, scales up $u^\kappa v^\lambda$ by $k^2$, and scales up the area by $k^2$, as expected.
Unfortunately, there is no consistency in the literature as to whether ε should be a tensor or a tensor density.
Some will define both a tensor and a nontensor version, with notations
like ε and
, or18 ε0123 and
.
Others avoid writing the letter ε completely.19 The tensor-density version is convenient because we always know that its value is 0 or ±1. The tensor version has the advantage that it transforms as a tensor.
As discussed above, angular momentum cannot be a rank-1 tensor. One approach is to define a rank-2 angular momentum tensor $L^{ab}=r^a p^b$.
In a frame whose origin is instantaneously moving along with a certain system's center of mass at a certain time, the time-space components of L vanish, and the components $L^{yz}$, $L^{zx}$, and $L^{xy}$ coincide in the nonrelativistic limit with the x, y, and z components of the Newtonian angular momentum vector. We can also define a three-dimensional object $L_a=\epsilon_{abc}L^{bc}$ (with tensor-density ε) that doesn't transform like a tensor.
2. The Large Hadron Collider is designed to accelerate protons to energies of 7 TeV. Find 1-v for such a proton. (solution in the pdf version of the book)
3. Prove that an electron in a vacuum cannot absorb a photon. (This is the reason that the ability of materials to absorb gamma-rays is strongly dependent on atomic number Z. The case of Z=0 corresponds to the vacuum.)
4. (a) For an object moving in a circle at constant speed, the dot product of the classical three-vectors v and a is zero. Give an interpretation in terms of the work-kinetic energy theorem. (b) In the case of relativistic four-vectors, $v_i a^i=0$ for any world-line. Give a similar interpretation. Hint: find the rate of change of the four-velocity's squared magnitude.
5. Starting from coordinates (t,x) having a Lorentzian metric g, transform the metric tensor into reflected coordinates (t',x')=(t,-x), and verify that g' is the same as g.
6. Starting from coordinates (t,x) having a Lorentzian metric g, transform the metric tensor into Lorentz-boosted coordinates (t',x'), and verify that g' is the same as g.
7. Verify the transformation of the metric given in example 15 on page 126.
8. A skeptic claims that the Hafele-Keating experiment can only be explained correctly by relativity in a frame in which the earth's axis is at rest. Prove mathematically that this is incorrect. Does it matter whether the frame is inertial? (solution in the pdf version of the book)
9. Assume the metric g=diag(+1,+1,+1). Which of the following correctly expresses the noncommutative property of ordinary matrix multiplication?


10. Example 6 on page 116 introduced the Dirac sea, whose existence is implied by the two roots of the relativistic relation $E=\pm\sqrt{p^2+m^2}$. Prove that a Lorentz boost will never transform a positive-energy state into a negative-energy state. (solution in the pdf version of the book)
11. On page 119, we found the relativistic Doppler shift in 1+1 dimensions. Extend this to 3+1 dimensions, and check your result against the one given by Einstein on page 321. (solution in the pdf version of the book)
12. Estimate the energy contained in the electric field of an electron, if the electron's radius is r. Classically (i.e., assuming relativity but no quantum mechanics), this energy contributes to the electron's rest mass, so it must be less than the rest mass. Estimate the resulting lower limit on r, which is known as the classical electron radius. (solution in the pdf version of the book)
13. For gamma-rays in the MeV range, the most frequent mode of interaction with matter is Compton scattering, in which the photon is scattered by an electron without being absorbed. Only part of the gamma's energy is deposited, and the amount is related to the angle of scattering. Use conservation of four-momentum to show that in the case of scattering at 180 degrees, the scattered photon has energy E'=E/(1+2E/m), where m is the mass of the electron.
14. Derive the equation $T=\sqrt{3\pi/(G\rho)}$ given on page 131 for the period of a rotating, spherical object that results in zero apparent gravity at its surface.
15. Section 4.4.3 presented an estimate of the upper limit on the mass of a white dwarf. Check the self-consistency of the solution in the following respects: (1) Why is it valid to ignore the contribution of the nuclei to the degeneracy pressure? (2) Although the electrons are ultrarelativistic, spacetime is approximated as being flat. As suggested in example 11 on page 58, a reasonable order-of-magnitude check on this result is that we should have $M/r \ll c^2/G$.