You are viewing the html version of General Relativity, by Benjamin Crowell. This version is only designed for casual browsing, and may have some formatting problems. For serious reading, you want the Adobe Acrobat version.

Table of Contents

Section 6.1 - Event horizons
Section 6.2 - The Schwarzschild metric
Section 6.3 - Black holes
Section 6.4 - Degenerate solutions

Chapter 6. Vacuum solutions


a / A Swiss commemorative coin shows the vacuum field equation.

In this chapter we investigate general relativity in regions of space that have no matter to act as sources of the gravitational field. We will not, however, limit ourselves to calculating spacetimes in cases in which the entire universe has no matter. For example, we will be able to calculate general-relativistic effects in the region surrounding the earth, including a full calculation of the geodetic effect, which was estimated in section 5.5.1 only to within an order of magnitude. We can have sources, but we just won't describe the metric in the regions where the sources exist, e.g., inside the earth. The advantage of accepting this limitation is that in regions of empty space, we don't have to worry about the details of the stress-energy tensor or how it relates to curvature. As should be plausible based on the physical motivation given in section 5.1, page 160, the field equations in a vacuum are simply \(R_{ab}=0\).

6.1 Event horizons

One seemingly trivial way to generate solutions to the field equations in vacuum is simply to start with a flat Lorentzian spacetime and do a change of coordinates. This might seem pointless, since it would simply give a new description (and probably a less convenient and descriptive one) of the same old, boring, flat spacetime. It turns out, however, that some very interesting things can happen when we do this.


a / A spaceship (curved world-line) moves with an acceleration perceived as constant by its passengers. The photon (straight world-line) comes closer and closer to the ship, but will never quite catch up.

6.1.1 The event horizon of an accelerated observer

Consider the uniformly accelerated observer described in examples 4 on page 126 and 19 on page 140. Recalling these earlier results, we have for the ship's equation of motion in an inertial frame

\[\begin{equation*} x = \frac{1}{a}\left(\sqrt{1+a^2t^2}-1\right) , \end{equation*}\]

and for the metric in the ship's frame

\[\begin{align*} g'_{t't'} &= (1+ax')^2 \\ g'_{x'x'} &= -1 . \end{align*}\]

Since this metric was derived by a change of coordinates from a flat-space metric, and the Ricci curvature is an intrinsic property, we expect that this one also has zero Ricci curvature. This is straightforward to verify. The nonvanishing Christoffel symbols are

\[\begin{equation*} \Gamma^{t'}_{x't'} = \frac{a}{1+ax'} \quad\text{and}\quad \Gamma^{x'}_{t't'}=a(1+ax') . \end{equation*}\]

The only elements of the Riemann tensor that look like they might be nonzero are \(R^{t'}_{t'x'x'}\) and \(R^{x'}_{t'x't'}\), but both of these in fact vanish.

Self-check: Verify these facts.

This seemingly routine exercise now leads us into some very interesting territory. Way back on page 12, we conjectured that not all events could be time-ordered: that is, that there might exist events in spacetime 1 and 2 such that 1 cannot cause 2, but neither can 2 cause 1. We now have enough mathematical tools at our disposal to see that this is indeed the case.

We observe that \(x(t)\) approaches the asymptote \(x=t-1/a\). This asymptote has a slope of 1, so it can be interpreted as the world-line of a photon that chases the ship but never quite catches up to it. Any event to the left of this line can never have a causal relationship with any event on the ship's world-line. Spacetime, as seen by an observer on the ship, has been divided by a curtain into two causally disconnected parts. This boundary is called an event horizon. Its existence is relative to the world-line of a particular observer. An observer who is not accelerating along with the ship does not consider an event horizon to exist. Although this particular example of the indefinitely accelerating spaceship has some physically implausible features (e.g., the ship would have to run out of fuel someday), event horizons are real things. In particular, we will see in section 6.3.2 that black holes have event horizons.
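The approach to the asymptote can be checked numerically. The following Python sketch evaluates the ship's \(x(t)\) from the equation of motion above and compares it with the photon world-line \(x=t-1/a\). The function names and the value \(a=1\) (in units with \(c=1\)) are illustrative choices, not anything fixed by the text.

```python
import math

def ship_position(t, a):
    """Ship's position x(t) in the inertial frame, in units with c = 1."""
    return (math.sqrt(1.0 + (a * t) ** 2) - 1.0) / a

def photon_position(t, a):
    """Photon world-line lying along the asymptote x = t - 1/a."""
    return t - 1.0 / a

a = 1.0  # proper acceleration (illustrative value)
times = (1.0, 10.0, 100.0, 1000.0)
gaps = [ship_position(t, a) - photon_position(t, a) for t in times]

for t, gap in zip(times, gaps):
    # The gap stays positive (the photon never catches up) but shrinks toward zero.
    print(f"t = {t:7.1f}   ship - photon = {gap:.6f}")
```

The gap between ship and photon is always positive and falls off roughly like \(1/2a^2t\), which is what we mean by saying that the photon never quite catches up.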

Interpreting everything in the \((t',x')\) coordinates tied to the ship, the metric's component \(g'_{t't'}\) vanishes at \(x'=-1/a\). An observer aboard the ship reasons as follows. If I start out with a head-start of \(1/a\) relative to some event, then the timelike part of the metric at that event vanishes. If the event marks the emission of a material particle, then there is no possible way for that particle's world-line to have \(ds^2>0\). If I were to detect a particle emitted at that event, it would violate the laws of physics, since material particles must have \(ds^2>0\), so I conclude that I will never observe such a particle. Since all of this applies to any material particle, regardless of its mass \(m\), it must also apply in the limit \(m\rightarrow 0\), i.e., to photons and other massless particles. Therefore I can never receive a particle emitted from this event, and in fact it appears that there is no way for that event, or any other event behind the event horizon, to have any effect on me. In my frame of reference, it appears that light cones near the horizon are tipped over so far that their future light-cones lie entirely in the direction away from me.

We've already seen in example 15 on page 65 that a naive Newtonian argument suggests the existence of black holes; if a body is sufficiently compact, light cannot escape from it. In a relativistic treatment, this should be described as an event horizon.

6.1.2 Information paradox

The existence of event horizons in general relativity has deep implications, and in particular it helps to explain why it is so difficult to reconcile general relativity with quantum mechanics, despite nearly a century of valiant attempts. Quantum mechanics has a property called unitarity. Mathematically, this says that if the state of a quantum mechanical system is given, at a certain time, in the form of a vector, then its state at some point in the future can be predicted by applying a unitary matrix to that vector. A unitary matrix is the generalization to complex numbers of the ordinary concept of an orthogonal matrix, and essentially it just represents a change of basis, in which the basis vectors have unit length and are perpendicular to one another.

To see what this means physically, consider the following nonexamples. The matrix

\[\begin{equation*} \left(\begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array}\right) \end{equation*}\]

is not unitary, because its rows and columns are not orthogonal vectors with unit lengths. If this matrix represented the time-evolution of a quantum mechanical system, then its meaning would be that any particle in state number 1 would be left alone, but any particle in state 2 would disappear. Any information carried by particles in state 2 is lost forever and can never be retrieved. This also violates the time-reversal symmetry of quantum mechanics.

Another nonunitary matrix is:

\[\begin{equation*} \left(\begin{array}{cc} 1 & 0 \\ 0 & \sqrt{2} \end{array}\right) \end{equation*}\]

Here, any particle in state 2 is increased in amplitude by a factor of \(\sqrt{2}\), meaning that it is doubled in probability. That is, the particle is cloned. This is the opposite problem compared to the one posed by the first matrix, and it is equally problematic in terms of time-reversal symmetry and conservation of information. Actually, if we could clone a particle in this way, it would violate the Heisenberg uncertainty principle. We could make two copies of the particle, and then measure the position of one copy and the momentum of the other, each with unlimited precision. This would violate the uncertainty principle, so we believe that it cannot be done. This is known as the no-cloning theorem.1
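As a concrete check, a matrix \(U\) is unitary when \(U^\dagger U\) is the identity. The Python sketch below tests the two nonexamples above, along with two matrices that really are unitary (a rotation and a phase change). The labels `absorber`, `cloner`, `rotation`, and `phase` are just names chosen here for illustration.

```python
import math

def dagger(m):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unitary(m, tol=1e-12):
    """True if (m-dagger) times m equals the identity to within tol."""
    n = len(m)
    product = matmul(dagger(m), m)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

absorber = [[1, 0], [0, 0]]             # first nonexample: state 2 disappears
cloner = [[1, 0], [0, math.sqrt(2)]]    # second nonexample: state 2 doubles in probability
angle = 0.3                             # arbitrary rotation angle
rotation = [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle), math.cos(angle)]]
phase = [[1, 0], [0, 1j]]               # changes only the phase of state 2

for name, m in (("absorber", absorber), ("cloner", cloner),
                ("rotation", rotation), ("phase", phase)):
    print(name, is_unitary(m))
```

The two nonexamples fail the test because they change the total probability, while the rotation and the phase change pass it.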

The existence of event horizons in general relativity violates unitarity, because it allows information to be destroyed. If a particle is thrown behind an event horizon, it can never be retrieved.


b / Bill Unruh (1945-).

6.1.3 Radiation from event horizons

An interesting twist on the situation was introduced by Bill Unruh in 1976. Observer B aboard the accelerating spaceship believes in the equivalence principle, so she knows that the local properties of space at the event horizon would seem entirely normal and Lorentzian to a local observer A. (The same applies to a black hole's horizon.) In particular, B knows that A would see pairs of virtual particles being spontaneously created and destroyed in the local vacuum. This is simply a manifestation of the time-energy form of the uncertainty principle, \(\Delta E \Delta t \lesssim h\). Now suppose that a pair of particles is created, but one is created in front of the horizon and one behind it. To A these are virtual particles that will have to be annihilated within the time \(\Delta t\), but according to B the one created in front of the horizon will eventually catch up with the spaceship, and can be observed there, although it will be red-shifted. The amount of redshift is given by \(\sqrt{g'_{t't'}} = \sqrt{(1+ax')^2}\). Say the pair is created right near the horizon, at \(x'=-1/a\). By the uncertainty principle, each of the two particles is spread out over a region of space of size \(\Delta x'\). Since these are photons, which travel at the speed of light, the uncertainty in position is essentially the same as the uncertainty in time. The forward-going photon's redshift comes out to be \(a\Delta x'=a\Delta t'\), which by the uncertainty principle should be at least \(ha/E\), so that when the photon is observed by B, its energy is \(E(ha/E)=ha\).

Now B sees a uniform background of photons, with energies of around \(ha\), being emitted randomly from the horizon. They are being emitted from empty space, so it seems plausible to believe that they don't encode any information at all; they are completely random. A surface emitting a completely random (i.e., maximum-entropy) hail of photons is a black-body radiator, so we expect that the photons will have a black-body spectrum, with its peak at an energy of about \(ha\). This peak is related to the temperature of the black body by \(E\sim kT\), where \(k\) is Boltzmann's constant. We conclude that the horizon acts like a black-body radiator with a temperature \(T\sim ha/k\). The more careful treatment by Unruh shows that the exact relation is \(T= ha/4\pi^2 k\), or \(ha/4\pi^2 kc\) in SI units.
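To get a feeling for the sizes involved, here is a short Python sketch that evaluates the SI form of the exact relation, \(T=ha/4\pi^2kc\), and also inverts it. The function names are illustrative, and the sample acceleration (one earth gravity) is an assumption made only for the sake of the example.

```python
import math

H = 6.62607015e-34   # Planck's constant, J s
K_B = 1.380649e-23   # Boltzmann's constant, J/K
C = 299792458.0      # speed of light, m/s

def unruh_temperature(a):
    """Horizon temperature in kelvin for a proper acceleration a in m/s^2."""
    return H * a / (4.0 * math.pi ** 2 * K_B * C)

def acceleration_for_temperature(temperature):
    """Proper acceleration in m/s^2 whose horizon has the given temperature in kelvin."""
    return 4.0 * math.pi ** 2 * K_B * C * temperature / H

# One earth gravity (assumed sample value) gives a hopelessly small temperature.
print(f"T at 9.8 m/s^2: {unruh_temperature(9.8):.2e} K")

# Accelerations corresponding to horizon temperatures of 1e-17 K and 1e-13 K.
for temperature in (1e-17, 1e-13):
    print(f"T = {temperature:.0e} K  <->  a = {acceleration_for_temperature(temperature):.1e} m/s^2")
```

The conversion factor is about \(4\times10^{-21}\) kelvin per \(\text{m}/\text{s}^2\), which is why only enormous accelerations give temperatures that anyone could hope to measure.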

An important observation here is that not only do different observers disagree about the number of quanta that are present (which is true in the case of ordinary Doppler shifts), but about the number of quanta in the vacuum as well. B sees photons that according to A do not exist.

Let's consider some real-world examples of large accelerations:

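\[\begin{array}{lcc} & \text{acceleration (m/s}^2\text{)} & \text{temperature of horizon (K)} \\ \text{bullet fired from a gun} & & 10^{-17} \\ \text{electron in a CRT} & & 10^{-13} \\ \text{plasmas produced by intense laser pulses} & & \\ \text{proton in a helium nucleus} & & \end{array}\]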

To detect Unruh radiation experimentally, we would ideally like to be able to accelerate a detector and let it detect the radiation. This is clearly impractical. The third line shows that it is possible to impart very large linear accelerations to subatomic particles, but then one can only hope to infer the effect of the Unruh radiation indirectly by its effect on the particles. As shown on the final line, examples of extremely large nonlinear accelerations are not hard to find, but the interpretation of Unruh radiation for nonlinear motion is unclear. A summary of the prospects for direct experimental detection of this effect is given by Rosu.2 This type of experiment is clearly extremely difficult, but it is one of the few ways in which one could hope to get direct empirical insight, under controlled conditions, into the interface between gravity and quantum mechanics.

6.2 The Schwarzschild metric


a / The field equations of general relativity are nonlinear.

We now set ourselves the goal of finding the metric describing the static spacetime outside a spherically symmetric, nonrotating, body of mass \(m\). This problem was first solved by Karl Schwarzschild in 1915.3 One byproduct of finding this metric will be the ability to calculate the geodetic effect exactly, but it will have more far-reaching consequences, including the existence of black holes.

The problem we are solving is similar to calculating the spherically symmetric solution to Gauss's law in a vacuum. The solution to the electrical problem is of the form \(\hat{\mathbf{r}}/r^2\), with an arbitrary constant of proportionality that turns out to be proportional to the charge creating the field. One big difference, however, is that whereas Gauss's law is linear, the equation \(R_{ab}=0\) is highly nonlinear, so that the solution cannot simply be scaled up and down in proportion to \(m\).

The reason for this nonlinearity is fundamental to general relativity. For example, when the earth condensed out of the primordial solar nebula, large amounts of heat were produced, and this energy was then gradually radiated into outer space, decreasing the total mass of the earth. If we pretend, as in figure a, that this process involved the merging of only two bodies, each with mass \(m\), then the net result was essentially to take separated masses \(m\) and \(m\) at rest, and bring them close together to form close-neighbor masses \(m\) and \(m\), again at rest. The amount of energy radiated away was proportional to \(m^2\), so the gravitational mass of the combined system has been reduced from \(2m\) to \(2m-(...)m^2\), where ... is roughly \(G/c^2r\). There is a nonlinear dependence of the gravitational field on the masses.

Self-check: The signature of a metric is defined as the list of positive and negative signs that occur when it is diagonalized.4 The equivalence principle requires that the signature be \(+---\) (or \(-+++\), depending on the choice of sign conventions). Verify that any constant metric (including a metric with the “wrong” signature, e.g., 2+2 dimensions rather than 3+1) is a solution to the Einstein field equation in vacuum.

The correspondence principle tells us that our result must have a Newtonian limit, but the only variables involved are \(m\) and \(r\), so this limit must be the one in which \(r/m\) is large. Large compared to what? There is nothing else available with which to compare, so it can only be large compared to some expression composed of the unitful constants \(G\) and \(c\). We have already chosen units such that \(c=1\), and we will now set \(G=1\) as well. Mass and distance are now comparable, with the conversion factor being \(G/c^2=7\times10^{-28}\ \text{m}/\text{kg}\), or about a mile per solar mass. Since the earth's radius is thousands of times more than a mile, and its mass hundreds of thousands of times less than the sun's, its \(r/m\) is very large, and the Newtonian approximation is good enough for all but the most precise applications, such as the GPS network or the Gravity Probe B experiment.
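The arithmetic behind these estimates is easy to reproduce. The following Python sketch converts masses to lengths using \(G/c^2\) and evaluates \(r/m\) at the earth's surface. The values used for the solar mass and for the earth's mass and radius are standard rounded figures supplied here as assumptions; they are not taken from the text.

```python
G = 6.674e-11              # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8                # speed of light, m/s
M_SUN = 1.989e30           # solar mass, kg (assumed rounded value)
M_EARTH = 5.972e24         # earth's mass, kg (assumed rounded value)
R_EARTH = 6.371e6          # earth's radius, m (assumed rounded value)
METERS_PER_MILE = 1609.344

def mass_to_length(mass_kg):
    """Mass expressed as a length in meters, using the factor G/c^2."""
    return G * mass_kg / C ** 2

conversion = G / C ** 2
sun_in_miles = mass_to_length(M_SUN) / METERS_PER_MILE
earth_r_over_m = R_EARTH / mass_to_length(M_EARTH)

print(f"G/c^2            = {conversion:.1e} m/kg")
print(f"solar mass       = {sun_in_miles:.2f} miles")
print(f"earth's r/m      = {earth_r_over_m:.1e}")
```

With these inputs the earth's \(r/m\) comes out to be of order \(10^9\), so relativistic corrections near the earth are typically parts per billion.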

6.2.1 The zero-mass case

First let's demonstrate the trivial solution with flat spacetime. In spherical coordinates, we have

\[\begin{equation*} ds^2 = dt^2 - dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2 . \end{equation*}\]

The nonvanishing Christoffel symbols (ignoring swaps of the lower indices) are:

\[\begin{align*} \Gamma^\theta_{r\theta} &= \frac{1}{r} \\ \Gamma^\phi_{r\phi} &= \frac{1}{r} \\ \Gamma^r_{\theta\theta} &= -r \\ \Gamma^r_{\phi\phi} &= -r\sin^2\theta \\ \Gamma^\theta_{\phi\phi} &= -\sin\theta\cos\theta \\ \Gamma^\phi_{\theta\phi} &= \cot\theta \end{align*}\]
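A list like this is easy to get wrong, so an independent numerical check is worthwhile. The Python sketch below estimates \(\Gamma^a_{bc}=\frac{1}{2}g^{ad}(\partial_b g_{dc}+\partial_c g_{db}-\partial_d g_{bc})\) by finite differences, specialized to a diagonal metric, and evaluates it at an arbitrarily chosen point. The function names, the step size, and the sample point are all illustrative assumptions.

```python
import math

def flat_metric(x):
    """Diagonal entries of ds^2 = dt^2 - dr^2 - r^2 dtheta^2 - r^2 sin^2(theta) dphi^2."""
    t, r, theta, phi = x
    return [1.0, -1.0, -r ** 2, -(r * math.sin(theta)) ** 2]

def christoffel(metric, x, step=1e-5):
    """Return gamma[a][b][c] = Gamma^a_{bc} for a diagonal metric, by central differences."""
    n = len(x)
    g = metric(x)
    # dg[c][a] approximates the partial derivative of g_aa with respect to x^c.
    dg = []
    for c in range(n):
        forward, backward = list(x), list(x)
        forward[c] += step
        backward[c] -= step
        g_plus, g_minus = metric(forward), metric(backward)
        dg.append([(g_plus[a] - g_minus[a]) / (2.0 * step) for a in range(n)])
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for a in range(n):
        for b in range(n):
            for c in range(n):
                total = 0.0
                if a == c:
                    total += dg[b][a]   # d_b g_ac, nonzero only when a = c
                if a == b:
                    total += dg[c][a]   # d_c g_ab, nonzero only when a = b
                if b == c:
                    total -= dg[a][b]   # d_a g_bc, nonzero only when b = c
                gamma[a][b][c] = 0.5 * total / g[a]  # g^aa = 1/g_aa for a diagonal metric
    return gamma

T, R, THETA, PHI = range(4)        # index labels for the coordinates
r, theta = 2.0, 0.7                # arbitrary sample point
gamma = christoffel(flat_metric, [0.0, r, theta, 0.3])

print("Gamma^theta_{r theta}   =", gamma[THETA][R][THETA], " expected", 1.0 / r)
print("Gamma^r_{theta theta}   =", gamma[R][THETA][THETA], " expected", -r)
print("Gamma^phi_{theta phi}   =", gamma[PHI][THETA][PHI], " expected", 1.0 / math.tan(theta))
```

Because the routine only takes a function returning the diagonal of the metric, the same check could be applied to any other diagonal metric by swapping in a different function.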

Self-check: If we'd been using the \((-+++)\) metric instead of \((+---)\), what would have been the effect on the Christoffel symbols? What if we'd expressed the metric in different units, rescaling all the coordinates by a factor \(k\)?

Use of ctensor

In fact, when I calculated the Christoffel symbols above by hand, I got one of them wrong, and missed calculating one other because I thought it was zero. I only found my mistake by comparing against a result in a textbook. The computation of the Riemann tensor is an even bigger mess. It's clearly a good idea to resort to a computer algebra system here. Cadabra, which was discussed earlier, is specifically designed for coordinate-independent calculations, so it won't help us here. A good free and open-source choice is ctensor, which is one of the standard packages distributed along with the computer algebra system Maxima, introduced on page 76.

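The following Maxima program calculates the Christoffel symbols found in section 6.2.1.

1  load(ctensor);
2  ct_coords:[t,r,theta,phi];
3  lg:matrix([1,0,0,0],
4            [0,-1,0,0],
5            [0,0,-r^2,0],
6            [0,0,0,-r^2*sin(theta)^2]);
7  cmetric();
8  christof(mcs);
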
Line 1 loads the ctensor package. Line 2 sets up the names of the coordinates. Line 3 defines the \(g_{ab}\), with lg meaning “the version of \(g\) with lower indices.” Line 7 tells Maxima to do some setup work with \(g_{ab}\), including the calculation of the inverse matrix \(g^{ab}\), which is stored in ug. Line 8 says to calculate the Christoffel symbols. The notation mcs refers to the tensor \({\Gamma'}_{bc}^a\) with the indices swapped around a little compared to the convention \(\Gamma^a_{bc}\) followed in this book. On a Linux system, we put the program in a file flat.mac and run it using the command maxima -b flat.mac. The relevant part of the output is:

                        1
(%t6)      mcs        = -
              2, 3, 3   r

                        1
(%t7)      mcs        = -
              2, 4, 4   r

(%t8)      mcs        = - r
              3, 3, 2

                        cos(theta)
(%t9)      mcs        = ----------
              3, 4, 4   sin(theta)

                               2
(%t10)     mcs        = - r sin (theta)
              4, 4, 2

(%t11)     mcs        = - cos(theta) sin(theta)
              4, 4, 3

Adding the command ricci(true); at the end of the program results in the output THIS SPACETIME IS EMPTY AND/OR FLAT, which saves us hours of tedious computation. The tensor ric (which here happens to be zero) is computed, and all its nonzero elements are printed out. There is a similar command riemann(true); to compute the Riemann tensor riem. This is stored so that riem[i,j,k,l] is what we would call \(R^l_{ikj}\). Note that \(l\) is moved to the end, and \(j\) and \(k\) are also swapped.

6.2.2 Geometrized units

If the mass creating the gravitational field isn't zero, then we need to decide what units to measure it in. It has already proved very convenient to adopt units with \(c=1\), and we will now also set the gravitational constant \(G=1\). Previously, with only \(c\) set to 1, the units of time and length were the same, \([T]=[L]\), and so were the units of mass and energy, \([M]=[E]\). With \(G=1\), all of these become the same units, \([T]=[L]=[M]=[E]\).

Self-check: Verify this statement by combining Newton's law of gravity with Newton's second law of motion.

The resulting system is referred to as geometrized, because units like mass that had formerly belonged to the province of mechanics are now measured using the same units we would use to do geometry.

6.2.3 A large-r limit

Now let's think about how to tackle the real problem of finding the non-flat metric. Although general relativity lets us pick any coordinates we like, the spherical symmetry of the problem suggests using coordinates that exploit that symmetry. The flat-space coordinates \(\theta\) and \(\phi\) can still be defined in the same way, and they have the same interpretation. For example, if we drop a test particle toward the mass from some point in space, its world-line will have constant \(\theta\) and \(\phi\). The \(r\) coordinate is a little different. In curved spacetime, the circumference of a circle is not equal to \(2\pi\) times the distance from the center to the circle; in fact, the discrepancy between these two is essentially the definition of the Ricci curvature. This gives us a choice of two logical ways to define \(r\). We'll define it as the circumference divided by \(2\pi\), which has the advantage that the last two terms of the metric are the same as in flat space: \(-r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2\). Since we're looking for static solutions, none of the elements of the metric can depend on \(t\). Also, the solution is going to be symmetric under \(t \rightarrow -t\), \(\theta \rightarrow -\theta\), and \(\phi \rightarrow -\phi\), so we can't have any off-diagonal elements.5 The result is that we have narrowed the metric down to something of the form

\[\begin{equation*} ds^2 = h(r)dt^2 - k(r)dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2 , \end{equation*}\]

where both \(h\) and \(k\) approach 1 for \(r\rightarrow\infty\), where spacetime is flat.

For guidance in how to construct \(h\) and \(k\), let's consider the acceleration of a test particle at \(r \gg m\), which we know to be \(-m/r^2\), since nonrelativistic physics applies there. We have

\[\begin{equation*} \nabla_t v^r = \partial_t v^r + \Gamma^r_{tc}v^c . \end{equation*}\]

An observer free-falling along with the particle observes its acceleration to be zero, and a tensor that is zero in one coordinate system is zero in all others. Since the covariant derivative is a tensor, we conclude that \( \nabla_t v^r=0\) in all coordinate systems, including the \((t,r,...)\) system we're using. If the particle is released from rest, then initially its velocity four-vector is \((1,0,0,0)\), so we find that its acceleration in \((t,r)\) coordinates is \(-\Gamma^r_{tt}=\frac{1}{2}g^{rr}\partial_rg_{tt}=-\frac{1}{2}h'/k\), where we have used \(g^{rr}=-1/k\). Setting this equal to \(-m/r^2\), we find \(h'/k=2m/r^2\) for \(r \gg m\). Since \(k \approx 1\) for large \(r\), we have

\[\begin{equation*} h' \approx \frac{2m}{r^2} \quad \text{for $r \gg m$} . \end{equation*}\]

The interpretation of this calculation is as follows. We assert the equivalence principle, by which the acceleration of a free-falling particle can be said to be zero. After some calculations, we find that the rate at which time flows (encoded in \(h\)) is not constant. It is different for observers at different heights in a gravitational potential well. But this is something we had already deduced, without the index gymnastics, in example 7 on page 129.

Integrating, we find that for large \(r\), \(h=1-2m/r\).
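Spelled out, the integration uses the boundary condition \(h\rightarrow 1\) as \(r\rightarrow\infty\) to fix the constant of integration. Here \(r'\) is just a dummy variable of integration, not a primed coordinate:

\[\begin{align*} h(r) &= h(\infty) - \int_r^\infty \frac{dh}{dr'}\,dr' \\ &\approx 1 - \int_r^\infty \frac{2m}{r'^2}\,dr' \\ &= 1 - \frac{2m}{r} . \end{align*}\]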

6.2.4 The complete solution

A series solution

We've learned some interesting things, but we still have an extremely nasty nonlinear differential equation to solve. One way to attack a differential equation, when you have no idea how to proceed, is to try a series solution. We have a small parameter \(m/r\) to expand around, so let's try to write \(h\) and \(k\) as series of the form

\[\begin{align*} h &= \sum_{n=0}^\infty a_n \left(\frac{m}{r}\right)^n \\ k &= \sum_{n=0}^\infty b_n \left(\frac{m}{r}\right)^n \end{align*}\]

We already know \(a_0\), \(a_1\), and \(b_0\). Let's try to find \(b_1\). In the following Maxima code I omit the factor of \(m\) in \(h_1\) for convenience. In other words, we're looking for the solution for \(m=1\).


I won't reproduce the entire output of the Ricci tensor, which is voluminous. We want all four of its nonvanishing components to vanish as quickly as possible for large values of \(r\), so I decided to fiddle with \(R_{tt}\), which looked as simple as any of them. It appears to vary as \(r^{-4}\) for large \(r\), so let's evaluate \(\lim_{r\rightarrow\infty}\left(r^4R_{tt}\right)\):


The result is \((b_1-2)/2\), so let's set \(b_1=2\). The approximate solution we've found so far (reinserting the \(m\)'s),

\[\begin{equation*} ds^2 \approx \left(1-\frac{2m}{r}\right)dt^2 - \left(1+\frac{2m}{r}\right)dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2 , \end{equation*}\]

was first derived by Einstein in 1915, and he used it to solve the problem of the non-Keplerian relativistic correction to the orbit of Mercury, which was one of the first empirical tests of general relativity.

Continuing in this fashion, the results are as follows:

\[\begin{array}{ll} a_0 = 1 & b_0 = 1 \\ a_1 = -2 & b_1 = 2 \\ a_2 = 0 & b_2 = 4 \\ a_3 = 0 & b_3 = 8 \end{array}\]

The closed-form solution

The solution is unexpectedly simple, and can be put into closed form. The approximate result we found for \(h\) was in fact exact. For \(k\) we have a geometric series \(1/(1-2/r)\), and when we reinsert the factor of \(m\) in the only way that makes the units work, we get \(1/(1-2m/r)\). The result for the metric is

\[\begin{equation*} ds^2 = \left(1-\frac{2m}{r}\right)dt^2 - \left(\frac{1}{1-2m/r}\right)dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2 . \end{equation*}\]

This is called the Schwarzschild metric. A quick calculation in Maxima demonstrates that it is an exact solution for all \(r\), i.e., the Ricci tensor vanishes everywhere, even at \(r\lt2m\), which is outside the radius of convergence of the geometric series.
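The identification of the series for \(k\) with a geometric series can be checked with exact rational arithmetic. The Python sketch below generates the coefficients from the requirement \((1-2x)k(x)=1\), with \(x=m/r\), and compares a partial sum with the closed form at a sample point. The function names and the choice \(m/r=1/10\) are illustrative assumptions.

```python
from fractions import Fraction

def k_series_coefficients(n_terms):
    """Coefficients b_n of k = sum b_n x^n, fixed by (1 - 2x) k(x) = 1."""
    coeffs = [1]
    for _ in range(1, n_terms):
        coeffs.append(2 * coeffs[-1])   # matching powers of x gives b_n = 2 b_(n-1)
    return coeffs

def k_partial_sum(x, n_terms):
    """Sum of the first n_terms terms of the series for k."""
    return sum(b * x ** n for n, b in enumerate(k_series_coefficients(n_terms)))

def k_closed_form(x):
    """Closed form 1/(1 - 2x), with x = m/r."""
    return 1 / (1 - 2 * x)

b = k_series_coefficients(4)
x = Fraction(1, 10)                     # sample point m/r = 1/10, inside the radius of convergence
error = k_closed_form(x) - k_partial_sum(x, 20)

print("b_0 .. b_3 =", b)
print("closed form - 20-term partial sum =", float(error))
```

For \(m/r<1/2\) the partial sums converge rapidly to the closed form. For \(r\lt2m\) the series diverges, and only the closed form makes sense there.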

Time-reversal symmetry

The Schwarzschild metric is invariant under time reversal, since time occurs only in the form of \(dt^2\), which stays the same under \(dt\rightarrow -dt\). This is the same time-reversal symmetry that occurs in Newtonian gravity, where the field is described by the gravitational acceleration \(\mathbf{g}\), and accelerations are time-reversal invariant.

Fundamentally, this is an example of general relativity's coordinate independence. The laws of physics provided by general relativity, such as the vacuum field equation, are invariant under any smooth coordinate transformation, and \(t\rightarrow -t\) is such a coordinate transformation, so general relativity has time-reversal symmetry. Since the Schwarzschild metric was found by imposing time-reversal-symmetric boundary conditions on a time-reversal-symmetric differential equation, it is an equally valid solution when we time-reverse it. Furthermore, we expect the metric to be invariant under time reversal, unless spontaneous symmetry breaking occurs (see p. 314).

This suggests that we ask the more fundamental question of what global symmetries general relativity has. Does it have symmetry under parity inversion, for example? Or can we take any solution such as the Schwarzschild spacetime and transform it into a frame of reference in which the source of the field is moving uniformly in a certain direction? Because general relativity is locally equivalent to special relativity, we know that these symmetries are locally valid. But it may not even be possible to define the corresponding global symmetries. For example, there are some spacetimes on which it is not even possible to define a global time coordinate. On such a spacetime, which is described as not time-orientable, there does not exist any smooth vector field that is everywhere timelike, so it is not possible to define past versus future light-cones at all points in space without having a discontinuous change in the definition occur somewhere. This is similar to the way in which a Möbius strip does not allow an orientation of its surface (an “up” direction as seen by an ant) to be defined globally.

Suppose that our spacetime is time-orientable, and we are able to define coordinates \((p,q,r,s)\) such that \(p\) is always the timelike coordinate. Because \(q\rightarrow -q\) is a smooth coordinate transformation, we are guaranteed that our spacetime remains a valid solution of the field equations under this change. But that doesn't mean that what we've found is a symmetry under parity inversion in a plane. Our coordinate \(q\) is not necessarily interpretable as distance along a particular “\(q\) axis.” Such axes don't even exist globally in general relativity. A coordinate does not even have to have units of time or distance; it could be an angle, for example, or it might not have any geometrical significance at all. Similarly, we could do a transformation \(q\rightarrow q'=q+kp\). If we think of \(q\) as measuring spatial position and \(p\) time, then this looks like a Galilean transformation, with \(k\) being the velocity. The solution to the field equations obtained after performing this transformation is still a valid solution, but that doesn't mean that relativity has Galilean symmetry rather than Lorentz symmetry. There is no sensible way to define a Galilean transformation acting on an entire spacetime, because when we talk about a Galilean transformation we assume the existence of things like global coordinate axes, which do not even exist in general relativity.

6.2.5 Geodetic effect

As promised in section 5.5.1, we now calculate the geodetic effect on Gravity Probe B, including all the niggling factors of 3 and \(\pi\). To make the physics clear, we approach the actual calculation through a series of warmups.

Flat space

As a first warmup, consider two spatial dimensions, represented by Euclidean polar coordinates \((r,\phi)\). Parallel-transport of a gyroscope's angular momentum around a circle of constant \(r\) gives

\[\begin{align*} \nabla_\phi L^\phi &= 0 \\ \nabla_\phi L^r &= 0 . \end{align*}\]

Computing the covariant derivatives, we have

\[\begin{align*} 0 &= \partial_\phi L^\phi + \Gamma^\phi_{\phi r}L^r \\ 0 &= \partial_\phi L^r + \Gamma^r_{\phi \phi}L^\phi . \end{align*}\]

The Christoffel symbols are \(\Gamma^\phi_{\phi r}=1/r\) and \(\Gamma^r_{\phi \phi}=-r\). This is all made to look needlessly complicated because \(L^\phi\) and \(L^r\) are expressed in different units. Essentially the vector is staying the same, but we're expressing it in terms of basis vectors in the \(r\) and \(\phi\) directions that are rotating. To see this more transparently, let \(r=1\), and write \(P\) for \(L^\phi\) and \(Q\) for \(L^r\), so that

\[\begin{align*} P' &= -Q \\ Q' &= P , \end{align*}\]

which have solutions such as \(P=\cos\phi\), \(Q=\sin\phi\). For each orbit (\(2\pi\) change in \(\phi\)), the basis vectors rotate by \(2\pi\), so the angular momentum vector once again has the same components. In other words, it hasn't really changed at all.

Spatial curvature only

The flat-space calculation above differs in two ways from the actual result for an orbiting gyroscope: (1) it uses a flat spatial geometry, and (2) it is purely spatial. The purely spatial nature of the calculation is manifested in the fact that there is nothing in the result relating to how quickly we've moved the vector around the circle. We know that if we whip a gyroscope around in a circle on the end of a rope, there will be a Thomas precession (section 2.5.4), which depends on the speed.

As our next warmup, let's curve the spatial geometry, but continue to omit the time dimension. Using the Schwarzschild metric, we replace the flat-space Christoffel symbol \(\Gamma^r_{\phi \phi}=-r\) with \(-r+2m\). The differential equations for the components of the \(L\) vector, again evaluated at \(r=1\) for convenience, are now

\[\begin{align*} P' &= -Q \\ Q' &= (1-\epsilon)P , \end{align*}\]

where \(\epsilon=2m\). The solutions rotate with frequency \(\omega'=\sqrt{1-\epsilon}\). The result is that when the basis vectors rotate by \(2\pi\), the components no longer return to their original values; they lag by a factor of \(\sqrt{1-\epsilon}\approx 1-m\). Putting the factors of \(r\) back in, this is \(1-m/r\). The deviation from unity shows that after one full revolution, the \(L\) vector no longer has quite the same components expressed in terms of the \((r,\phi)\) basis vectors.
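The lag can be seen directly by integrating the two differential equations numerically around one full orbit. The Python sketch below uses a fourth-order Runge-Kutta step, starting from \(P=1\), \(Q=0\), and reads off the accumulated phase. The value \(\epsilon=0.01\) is far larger than anything in the solar system and is chosen only to make the effect visible; the function name and step count are likewise illustrative.

```python
import math

def transport(eps, phi_end, steps=20000):
    """Integrate P' = -Q, Q' = (1 - eps) P from phi = 0 with P = 1, Q = 0, using RK4."""
    def deriv(p, q):
        return -q, (1.0 - eps) * p
    p, q = 1.0, 0.0
    h = phi_end / steps
    for _ in range(steps):
        k1p, k1q = deriv(p, q)
        k2p, k2q = deriv(p + 0.5 * h * k1p, q + 0.5 * h * k1q)
        k3p, k3q = deriv(p + 0.5 * h * k2p, q + 0.5 * h * k2q)
        k4p, k4q = deriv(p + h * k3p, q + h * k3q)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        q += h * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    return p, q

# Flat space: the components return to their starting values after one orbit.
p_flat, q_flat = transport(0.0, 2.0 * math.pi)

# Curved space: the components fall short of a full cycle.
eps = 0.01
omega_prime = math.sqrt(1.0 - eps)
p, q = transport(eps, 2.0 * math.pi)
phase = math.atan2(q / omega_prime, p)   # negative, i.e., slightly short of a full turn
lag = -phase

print("flat space after one orbit:", p_flat, q_flat)
print("lag per orbit (numerical)  :", lag)
print("2 pi (1 - sqrt(1 - eps))   :", 2.0 * math.pi * (1.0 - omega_prime))
print("first-order estimate 2 pi m:", 2.0 * math.pi * eps / 2.0)
```

The numerical lag agrees with \(2\pi(1-\sqrt{1-\epsilon})\), and to first order in \(\epsilon\) this is \(2\pi m\) per orbit at \(r=1\).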

To understand the sign of the effect, let's imagine a counterclockwise rotation. The \((r,\phi)\) rotate counterclockwise, so relative to them, the \(L\) vector rotates clockwise. After one revolution, it has not rotated clockwise by a full \(2\pi\), so its orientation is now slightly counterclockwise compared to what it was. Thus the contribution to the geodetic effect arising from spatial curvature is in the same direction as the orbit.

Comparing with the actual results from Gravity Probe B, we see that the direction of the effect is correct. The magnitude, however, is off. The precession accumulated over \(n\) periods is \(2\pi n m/r\), or, in SI units, \(2\pi n Gm/c^2r\). Using the data from section 2.5.4, we find \(\Delta\theta=2\times10^{-5}\) radians, which is too small compared to the data shown in figure b on page 171.
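The order of magnitude quoted here can be reproduced with a few lines of Python. Since the data of section 2.5.4 are not repeated in this chapter, the inputs below are assumptions: an orbital altitude of 642 km, an observation time of one year, and standard rounded values for the earth's mass and radius. The orbital period is estimated from the Newtonian formula for a circular orbit.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m/s
M_EARTH = 5.972e24   # earth's mass, kg (assumed rounded value)
R_EARTH = 6.371e6    # earth's radius, m (assumed rounded value)
ALTITUDE = 6.42e5    # orbital altitude, m (assumed)
DURATION = 3.156e7   # observation time, s (assumed to be one year)

r = R_EARTH + ALTITUDE
period = 2.0 * math.pi * math.sqrt(r ** 3 / (G * M_EARTH))   # Newtonian circular orbit
n = DURATION / period                                         # number of orbits
per_orbit = 2.0 * math.pi * G * M_EARTH / (C ** 2 * r)        # 2 pi G m / c^2 r
delta_theta = n * per_orbit

print(f"orbital period    = {period / 60.0:.1f} minutes")
print(f"number of orbits  = {n:.0f}")
print(f"spatial-curvature precession = {delta_theta:.1e} radians")
```

Under these assumptions the purely spatial contribution comes out close to \(2\times10^{-5}\) radians.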

2+1 Dimensions

To reproduce the experimental results correctly, we need to include the time dimension. The angular momentum vector now has components \((L^\phi,L^r,L^t)\). The physical interpretation of the \(L^t\) component is obscure at this point; we'll return to this question later.

Writing down the total derivatives of the three components, and notating \(dt/d\phi\) as \(\omega^{-1}\), we have

\[\begin{align*} \frac{dL^\phi}{d\phi} &= \partial_\phi L^\phi + \omega^{-1}\partial_t L^\phi \\ \frac{dL^r }{d\phi} &= \partial_\phi L^r + \omega^{-1}\partial_t L^r \\ \frac{dL^t }{d\phi} &= \partial_\phi L^t + \omega^{-1}\partial_t L^t \end{align*}\]

Setting the covariant derivatives equal to zero gives

\[\begin{align*} 0 &= \partial_\phi L^\phi + \Gamma^\phi_{\phi r} L^r \\ 0 &= \partial_\phi L^r + \Gamma^r _{\phi \phi} L^\phi \\ 0 &= \partial_t L^r + \Gamma^r _{t t} L^t \\ 0 &= \partial_t L^t + \Gamma^t_{t r} L^r . \end{align*}\]

Self-check: There are not just four but six covariant derivatives that could in principle have occurred, and in these six covariant derivatives we could have had a total of 18 Christoffel symbols. Of these 18, only four are nonvanishing. Explain based on symmetry arguments why the following Christoffel symbols must vanish: \(\Gamma^\phi_{\phi t}\), \(\Gamma^t_{t t}\).

Putting all this together in matrix form, we have \(L'=ML\), where

\[\begin{equation*} M = \left( \begin{array}{ccc} 0 & -1 & 0 \\ 1-\epsilon & 0 & -\epsilon(1-\epsilon)/2\omega \\ 0 & -\epsilon/2\omega(1-\epsilon) & 0 \end{array} \right) . \end{equation*}\]

The solutions of this differential equation oscillate like \(e^{i \Omega t}\), where \(i\Omega\) is an eigenvalue of the matrix.

Self-check: The frequency in the purely spatial calculation was found by inspection. Verify the result by applying the eigenvalue technique to the relevant \(2\times 2\) submatrix.

To lowest order, we can use the Newtonian relation \(\omega^2 r = Gm/r^2\) and neglect terms of order \(\epsilon^2\), so that the two new off-diagonal matrix elements are both approximated as \(-\sqrt{\epsilon/2}\). The three resulting eigenfrequencies are zero and \(\Omega=\pm[1-(3/2)m/r]\).
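The eigenfrequencies can be checked numerically without any linear-algebra library. Because of the pattern of zeros in \(M\), its characteristic polynomial factors as \(-\lambda(\lambda^2+a-bc)\), where \(a\), \(b\), and \(c\) are names introduced here (not in the text) for the three nonzero elements other than the \(-1\). The sketch below assumes \(r=1\), \(G=1\), and the Newtonian value \(\omega^2=m\); the function names are hypothetical.

```python
import math

def eigenfrequency_2plus1(m):
    """Nonzero eigenfrequency of the 3x3 matrix M at r=1, with omega**2 = m."""
    eps = 2.0 * m
    omega = math.sqrt(m)                       # Newtonian orbital frequency at r=1
    a = 1.0 - eps                              # second row, first column
    b = -eps * (1.0 - eps) / (2.0 * omega)     # second row, third column
    c = -eps / (2.0 * omega * (1.0 - eps))     # third row, second column
    # det(M - lam*I) = -lam*(lam**2 + a - b*c), so lam = 0 or lam = +/- i*sqrt(a - b*c)
    return math.sqrt(a - b * c)

def eigenfrequency_spatial(m):
    """Same calculation for the 2x2 spatial submatrix, i.e. with b = c = 0."""
    return math.sqrt(1.0 - 2.0 * m)

m = 1.0e-4                                     # assumed illustrative value of m/r
Omega_full = eigenfrequency_2plus1(m)
Omega_spatial = eigenfrequency_spatial(m)

print("2+1 dimensions:", Omega_full, " compare 1-(3/2)m:", 1.0 - 1.5 * m)
print("purely spatial:", Omega_spatial, " compare 1-m:", 1.0 - m)
```

The ratio of the two deviations from unity is 3/2, up to corrections of order \(m^2\).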

The presence of the mysterious zero-frequency solution can now be understood by recalling the earlier mystery of the physical interpretation of the angular momentum's \(L^t\) component. Our results come from calculating parallel transport, and parallel transport is a purely geometric process, so it gives the same result regardless of the physical nature of the four-vector. Suppose that we had instead chosen the velocity four-vector as our guinea pig. The definition of a geodesic is that it parallel-transports its own tangent vector, so the velocity vector has to stay constant. If we inspect the eigenvector corresponding to the zero eigenfrequency, we find a timelike vector that is parallel to the velocity four-vector. In our 2+1-dimensional space, the other two eigenvectors, which are spacelike, span the subspace of spacelike vectors, which are the ones that can physically be realized as the angular momentum of a gyroscope. These two eigenvectors, which vary as \(e^{\pm i\Omega}\), can be superposed to make real-valued spacelike solutions that match the initial conditions, and these lag the rotation of the basis vectors by \(\Delta\Omega=(3/2)m/r\). This is greater than the purely spatial result by a factor of 3/2. The resulting precession angle, over \(n\) orbits of Gravity Probe B, is \(3\pi n Gm/c^2r = 3\times10^{-5}\) radians, in excellent agreement with experiment.
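The order of magnitude of both numerical results can be reproduced with a few lines of arithmetic. The data from section 2.5.4 are not repeated here, so the sketch below assumes an orbital radius of about \(7.02\times10^6\) m (roughly 640 km altitude) and a duration of one year; both values, and the variable names, are assumptions made for illustration.

```python
import math

G = 6.67e-11          # gravitational constant, SI
c = 3.00e8            # speed of light, m/s
M_earth = 5.97e24     # mass of the earth, kg
r_orbit = 7.02e6      # ASSUMED orbital radius, m
duration = 3.156e7    # ASSUMED observation time: one year, in seconds

# Newtonian orbital period and number of orbits completed
period = 2.0 * math.pi * math.sqrt(r_orbit**3 / (G * M_earth))
n = duration / period

m_over_r = G * M_earth / (c**2 * r_orbit)
spatial_only = 2.0 * math.pi * n * m_over_r   # purely spatial estimate
full_result = 3.0 * math.pi * n * m_over_r    # 2+1-dimensional result

print("orbits per year: %.0f" % n)
print("spatial-only precession: %.2e rad" % spatial_only)
print("2+1-dimensional precession: %.2e rad" % full_result)
```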

One will see apparently contradictory statements in the literature about whether Thomas precession occurs for a satellite: “The Thomas precession comes into play for a gyroscope on the surface of the Earth ..., but not for a gyroscope in a freely moving satellite.”6 But: “The total effect, geometrical and Thomas, gives the well-known Fokker-de Sitter precession of \(3\pi m/r\), in the same sense as the orbit.”7 The second statement arises from subtracting the purely spatial result from the 2+1-dimensional result, and noting that the absolute value of this difference is the same as the Thomas precession that would have been obtained if the gyroscope had been whirled at the end of a rope. In my opinion this is an unnatural way of looking at the physics, for two reasons. (1) The signs don't match, so one is forced to say that the Thomas precession has a different sign depending on whether the rotation is the result of gravitational or nongravitational forces. (2) Referring to observation, it is clearly artificial to treat the spatial curvature and Thomas effects separately, since neither one can be disentangled from the other by varying the quantities \(n\), \(m\), and \(r\). For more discussion, see


b / Proof that if the metric's components are independent of \(t\), the geodesic of a test particle conserves \(p_t\).

6.2.6 Orbits

The main event of Newton's Principia Mathematica is his proof of Kepler's laws. Similarly, Einstein's first important application in general relativity, which he began before he even had the exact form of the Schwarzschild metric in hand, was to find the non-Newtonian behavior of the planet Mercury. The planets deviate from Keplerian behavior for a variety of Newtonian reasons, and in particular there is a long list of reasons why the major axis of a planet's elliptical orbit is expected to gradually rotate. When all of these were taken into account, however, there was a remaining discrepancy of about 40 seconds of arc per century, or \(6.6\times 10^{-7}\) radians per orbit. The direction of the effect was in the forward direction, in the sense that if we view Mercury's orbit from above the ecliptic, so that it orbits in the counterclockwise direction, then the gradual rotation of the major axis is also counterclockwise. In other words, Mercury spends more time near perihelion than it should nonrelativistically. During this time, it sweeps out a greater angle than nonrelativistically expected, so that when it flies back out and away from the sun, its orbit has rotated counterclockwise.

We can at least qualitatively understand the reason for such an effect based on the spatial part of the curvature of the spacetime surrounding the sun. This spatial curvature is positive, so a circle's circumference is less than \(2\pi\) times its radius. This causes Mercury to get back to a previously visited angular position before it has had time to complete its Newtonian cycle of radial motion.

Based on the examples in section 5.5, we also expect that the effect will be of order \(m/r\), where \(m\) is the mass of the sun and \(r\) is the radius of Mercury's orbit. This works out to be \(2.5\times10^{-8}\), which is smaller than the observed precession by a factor of about 26.

Conserved quantities

If Einstein had had a computer on his desk, he probably would simply have integrated the motion numerically using the geodesic equation. But it is possible to simplify the problem enough to attack it with pencil and paper, if we can find the relevant conserved quantities of the motion. Nonrelativistically, these are energy and angular momentum.

Consider a rock falling directly toward the sun. The Schwarzschild metric is of the special form

\[\begin{equation*} ds^2 = h(r)dt^2 - k(r)dr^2 - ... . \end{equation*}\]

The rock's trajectory is a geodesic, so it extremizes the proper time \(s\) between any two events fixed in spacetime, just as a piece of string stretched across a curved surface extremizes its length. Let the rock pass through distance \(r_1\) in coordinate time \(t_1\), and then through \(r_2\) in \(t_2\). (These should really be notated as \(\Delta r_1\), ... or \(dr_1\), ..., but we avoid the \(\Delta\)'s or \(d\)'s for convenience.) Approximating the geodesic using two line segments, the proper time is

\[\begin{align*} s &= s_1 + s_2 \\ &= \sqrt{h_1 t_1^2-k_1 r_1^2} + \sqrt{h_2 t_2^2-k_2 r_2^2} \\ &= \sqrt{h_1 t_1^2-k_1 r_1^2} + \sqrt{h_2 (T-t_1)^2-k_2 r_2^2} , \end{align*}\]

where \(T=t_1+t_2\) is fixed. If this is to be extremized with respect to \(t_1\), then \(ds/dt_1=0\), which leads to

\[\begin{equation*} 0 = \frac{h_1t_1}{s_1} - \frac{h_2t_2}{s_2} , \end{equation*}\]

which means that

\[\begin{equation*} h\frac{dt}{ds} = g_{tt}\frac{dx^t}{ds} = \frac{dx_t}{ds} \end{equation*}\]

is a constant of the motion. Except for an irrelevant factor of \(m\), this is the same as \(p_t\), the timelike component of the covariant momentum vector. We've already seen that in special relativity, the timelike component of the momentum four-vector is interpreted as the mass-energy \(E\), and the quantity \(p_t\) has a similar interpretation here. Note that no special assumption was made about the form of the functions \(h\) and \(k\). In addition, it turns out that the assumption of purely radial motion was unnecessary. All that really mattered was that \(h\) and \(k\) were independent of \(t\). Therefore we will have a similar conserved quantity \(p_\mu\) any time the metric's components, expressed in a particular coordinate system, are independent of \(x^\mu\). (This is generalized on p. 246.) In particular, the Schwarzschild metric's components are independent of \(\phi\) as well as \(t\), so we have a second conserved quantity \(p_\phi\), which is interpreted as angular momentum.
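The two-segment argument can be tried out numerically. The sketch below picks arbitrary illustrative values for \(h\), \(k\), the two radial steps, and \(T\) (none of them from the text), maximizes \(s(t_1)\) by a ternary search, and then compares \(h_1t_1/s_1\) with \(h_2t_2/s_2\). The ternary search is just one convenient way of finding the extremum; it works here because \(s\) is a concave function of \(t_1\).

```python
import math

# ASSUMED illustrative values of the metric functions on the two segments
h1, k1 = 0.90, 1.0 / 0.90
h2, k2 = 0.80, 1.0 / 0.80
r1 = r2 = 0.1      # radial coordinate steps covered in each segment
T = 2.0            # fixed total coordinate time t1 + t2

def segments(t1):
    """Proper times of the two straight segments for a given split of T."""
    s1 = math.sqrt(h1 * t1**2 - k1 * r1**2)
    s2 = math.sqrt(h2 * (T - t1)**2 - k2 * r2**2)
    return s1, s2

def proper_time(t1):
    return sum(segments(t1))

# Ternary search for the maximum of s(t1) on an interval where both
# square roots are real.
lo, hi = 0.2, 1.8
for _ in range(200):
    a = lo + (hi - lo) / 3.0
    b = hi - (hi - lo) / 3.0
    if proper_time(a) < proper_time(b):
        lo = a
    else:
        hi = b

t1 = 0.5 * (lo + hi)
t2 = T - t1
s1, s2 = segments(t1)
left = h1 * t1 / s1     # h dt/ds on the first segment
right = h2 * t2 / s2    # h dt/ds on the second segment

print("t1 at the extremum:", t1)
print("h1*t1/s1 =", left, "  h2*t2/s2 =", right)
```

At the extremum the two printed quantities should agree to within the accuracy of the search, which is the discrete version of the statement that \(h\,dt/ds\) is constant along the geodesic.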

Writing these two quantities out explicitly in terms of the contravariant coordinates, in the case of the Schwarzschild spacetime, we have

\[\begin{align*} E &= \left(1-\frac{2m}{r}\right) \frac{dt}{ds} \\ \text{and} L &= r^2 \frac{d \phi}{ds} \end{align*}\]

for the conserved energy per unit mass and angular momentum per unit mass.
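One way to build confidence in these conservation laws is to integrate the geodesic equation numerically and monitor \(E\) and \(L\) along the way. The following is a minimal sketch, assuming geometrized units with \(m=1\), motion in the plane \(\theta=\pi/2\), and an arbitrarily chosen bound orbit starting at \(r=20\). The Runge-Kutta stepper and all function names are illustrative choices, not part of the derivation above.

```python
import math

def accelerations(m, r, vt, vr, vphi):
    """Second derivatives from the geodesic equation (equatorial Schwarzschild)."""
    Gttr = m / (r**2 - 2 * m * r)
    Grtt = m / r**2 - 2 * m**2 / r**3
    Grrr = -m / (r**2 - 2 * m * r)
    Grphiphi = -r + 2 * m
    Gphirphi = 1.0 / r
    at = -2.0 * Gttr * vt * vr
    ar = -(Grtt * vt * vt + Grrr * vr * vr + Grphiphi * vphi * vphi)
    aphi = -2.0 * Gphirphi * vr * vphi
    return at, ar, aphi

def deriv(m, y):
    # y = (r, vt, vr, vphi); t and phi themselves never enter the equations
    r, vt, vr, vphi = y
    at, ar, aphi = accelerations(m, r, vt, vr, vphi)
    return (vr, at, ar, aphi)

def shifted(y, k, f):
    return tuple(yi + f * ki for yi, ki in zip(y, k))

def rk4_step(m, y, h):
    k1 = deriv(m, y)
    k2 = deriv(m, shifted(y, k1, 0.5 * h))
    k3 = deriv(m, shifted(y, k2, 0.5 * h))
    k4 = deriv(m, shifted(y, k3, h))
    return tuple(yi + h * (a + 2 * b + 2 * c + d) / 6.0
                 for yi, a, b, c, d in zip(y, k1, k2, k3, k4))

def energy(m, y):
    r, vt, vr, vphi = y
    return (1.0 - 2.0 * m / r) * vt      # E = (1-2m/r) dt/ds

def ang_mom(y):
    r, vt, vr, vphi = y
    return r**2 * vphi                   # L = r^2 dphi/ds

m = 1.0
r0 = 20.0
vphi0 = 0.012                            # ASSUMED: slightly below the circular value
# dt/ds fixed by the normalization of the velocity four-vector, with dr/ds = 0
vt0 = math.sqrt((1.0 + r0**2 * vphi0**2) / (1.0 - 2.0 * m / r0))

y = (r0, vt0, 0.0, vphi0)
E0, L0 = energy(m, y), ang_mom(y)
r_min = r_max = r0
for _ in range(20000):                   # total proper time 1000, about two orbits
    y = rk4_step(m, y, 0.05)
    r_min = min(r_min, y[0])
    r_max = max(r_max, y[0])
E1, L1 = energy(m, y), ang_mom(y)

print("r ranged from %.3f to %.3f" % (r_min, r_max))
print("change in E:", E1 - E0)
print("change in L:", L1 - L0)
```

Although \(r\), \(dt/ds\), and \(d\phi/ds\) all vary along the orbit, the printed changes in \(E\) and \(L\) should be at the level of rounding error. Note also that \(E\lt 1\) for this bound orbit.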

In interpreting the energy per unit mass \(E\), it is important to understand that in the general-relativistic context, there is no useful way of separating the rest mass, kinetic energy, and potential energy into separate terms, as we could in Newtonian mechanics. \(E\) includes contributions from all of these, and turns out to be less than the contribution due to the rest mass (i.e., less than 1) for a planet orbiting the sun. It turns out that \(E\) can be interpreted as a measure of the additional gravitational mass that the solar system possesses as measured by a distant observer, due to the presence of the planet. It then makes sense that \(E\) is conserved; by analogy with Newtonian mechanics, we would expect that any gravitational effects that depended on the detailed arrangement of the masses within the solar system would decrease as \(1/r^4\), becoming negligible at large distances and leaving a constant field varying as \(1/r^2\).

One way of seeing that it doesn't make sense to split \(E\) into parts is that although the equation given above for \(E\) involves a specific set of coordinates, \(E\) can actually be expressed as a Lorentz-invariant scalar (see p. 246). This property makes \(E\) especially interesting and useful (and different from the energy in Newtonian mechanics, which is conserved but not frame-independent). On the other hand, the kinetic and potential energies depend on the velocity and position. These are completely dependent on the coordinate system, and there is nothing physically special about the coordinate system we've used here. Suppose a particle is falling directly toward the earth, and an astronaut in a space-suit is free-falling along with it and monitoring its progress. The astronaut judges the particle's kinetic energy to be zero, but other observers say it's nonzero, so it's clearly not a Lorentz scalar. And suppose the astronaut insists on defining a potential energy to go along with this kinetic energy. The potential energy must be decreasing, since the particle is getting closer to the earth, but then there is no way that the sum of the kinetic and potential energies could be constant.

Perihelion advance

For convenience, let the mass of the orbiting rock be 1, while \(m\) stands for the mass of the gravitating body.

The unit mass of the rock is a third conserved quantity, and since the squared magnitude of the momentum vector equals the square of the mass, we have for an orbit in the plane \(\theta=\pi/2\),

\[\begin{align*} 1 &= g^{tt}p_t^2 - g^{rr}p_r^2 - g^{\phi\phi}p_\phi^2 \\ &= g^{tt}p_t^2 - g_{rr}(p^r)^2 - g^{\phi\phi}p_\phi^2 \\ &= \frac{1}{1-2m/r}E^2 - \frac{1}{1-2m/r} \left(\frac{dr}{ds}\right)^2 - \frac{1}{r^2} L^2 . \end{align*}\]

Rearranging terms and writing \(\dot{r}\) for \(dr/ds\), this becomes

\[\begin{align*} \dot{r}^2 &= E^2-(1-2m/r)(1+L^2/r^2) \\ \text{or} \dot{r}^2 &= E^2-U^2 \\ \text{where} U^2 &= (1-2m/r)(1+L^2/r^2) . \end{align*}\]

There is a varied and strange family of orbits in the Schwarzschild field, including bizarre knife-edge trajectories that take several nearly circular turns before suddenly flying off. We turn our attention instead to the case of an orbit such as Mercury's which is nearly Newtonian and nearly circular.

Nonrelativistically, a circular orbit has radius \(r=L^2/m\) and period \(T=2\pi L^3/m^2\).

Relativistically, a circular orbit occurs when there is only one turning point at which \(\dot{r}=0\). This requires that \(E^2\) equal the minimum value of \(U^2\), which occurs at

\[\begin{align*} r &= \frac{L^2}{2m}\left(1+\sqrt{1-12m^2/L^2}\right)\\ &\approx \frac{L^2}{m}(1-\epsilon) , \end{align*}\]

where \(\epsilon=3(m/L)^2\). A planet in a nearly circular orbit oscillates between perihelion and aphelion with a period that depends on the curvature of \(U^2\) at its minimum. We have

\[\begin{align*} k &= \frac{d^2(U^2)}{dr^2} \\ &= \frac{d^2}{dr^2} \left( 1 - \frac{2m}{r} + \frac{L^2}{r^2} - \frac{2mL^2}{r^3} \right) \\ &= -\frac{4m}{r^3} + \frac{6L^2}{r^4} - \frac{24mL^2}{r^5} \\ &= 2L^{-6}m^4(1+2\epsilon) \end{align*}\]

The period of the oscillations is

\[\begin{align*} \Delta s_{osc} &= 2\pi\sqrt{2/k} \\ &= 2\pi L^3m^{-2}(1-\epsilon) . \end{align*}\]

The period of the azimuthal motion is

\[\begin{align*} \Delta s_{az} &= 2\pi r^2/L \\ &= 2\pi L^3 m^{-2} (1-2\epsilon) . \end{align*}\]

The periods are slightly mismatched because of the relativistic correction terms. The period of the radial oscillations is longer, so that, as expected, the perihelion shift is in the forward direction. The mismatch is \(\epsilon\Delta s\), and because of it each orbit rotates the major axis by an angle \(2\pi\epsilon=6\pi(m/L)^2=6\pi m/r\). Plugging in the data for Mercury, we obtain \(5.8\times10^{-7}\) radians per orbit, which agrees with the observed value to within about 10%. Eliminating some of the approximations we've made brings the results in agreement to within the experimental error bars, and Einstein recalled that when the calculation came out right, “for a few days, I was beside myself with joyous excitement.”
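The chain of approximations leading to \(2\pi\epsilon\) can be checked by evaluating the unapproximated expressions for the radius of the circular orbit, the curvature \(k\), and the two periods. The sketch below does this in geometrized units for the arbitrary illustrative values \(m=1\) and \(L=100\), which are assumptions chosen only to make \(\epsilon\) small; the variable names are likewise hypothetical.

```python
import math

def U2(r, m, L):
    """The function U^2 = (1-2m/r)(1+L^2/r^2)."""
    return (1.0 - 2.0 * m / r) * (1.0 + L**2 / r**2)

m, L = 1.0, 100.0                  # ASSUMED illustrative values
eps = 3.0 * (m / L)**2

# Radius at which U^2 has its minimum (exact expression, not expanded in eps)
r_c = L**2 / (2.0 * m) * (1.0 + math.sqrt(1.0 - 12.0 * m**2 / L**2))

# Second derivative of U^2 at the minimum
k = -4.0 * m / r_c**3 + 6.0 * L**2 / r_c**4 - 24.0 * m * L**2 / r_c**5

ds_osc = 2.0 * math.pi * math.sqrt(2.0 / k)   # period of radial oscillation
ds_az = 2.0 * math.pi * r_c**2 / L            # period of azimuthal motion

# Azimuthal angle swept during one radial cycle, in excess of 2*pi
advance = 2.0 * math.pi * (ds_osc / ds_az - 1.0)

print("r_c =", r_c, "  compare (L^2/m)(1-eps) =", L**2 / m * (1.0 - eps))
print("k =", k, "  compare 2 m^4 L^-6 (1+2 eps) =", 2.0 * m**4 / L**6 * (1.0 + 2.0 * eps))
print("advance per orbit =", advance)
print("2 pi eps =", 2.0 * math.pi * eps, "  6 pi m/r =", 6.0 * math.pi * m / r_c)
```

For small \(\epsilon\), the three expressions for the advance should differ only at the next order in \(\epsilon\).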

Further attempts were made to improve on the precision of this historically crucial test of general relativity. Radar now gives the most precise orbital data for Mercury. At the level of about one part per thousand, however, an effect creeps in due to the oblateness of the sun, which is difficult to measure precisely.

In 1974, astronomers J.H. Taylor and R.A. Hulse of Princeton, working at the Arecibo radio telescope, discovered a binary star system whose members are both neutron stars. The detection of the system was made possible because one of the neutron stars is a pulsar: a neutron star that emits a strong radio pulse in the direction of the earth once per rotational period.

The orbit is highly elliptical, and the minimum separation between the two stars is very small, about the same as the radius of our sun. Both because the \(r\) is small and because the period is short (about 8 hours), the rate of perihelion advance per unit time is very large, about 4.2 degrees per year. The system has been compared in great detail with the predictions of general relativity,8 giving extremely good agreement, and as a result astronomers have been confident enough to reason in the opposite direction and infer properties of the system, such as its total mass, from the general-relativistic analysis. The system's orbit is decaying due to the radiation of energy in the form of gravitational waves, which are predicted to exist by relativity.

6.2.7 Deflection of light

As discussed on page 171, one of the first tests of general relativity was Eddington's measurement of the deflection of rays of light by the sun's gravitational field. The deflection measured by Eddington was 1.6 seconds of arc. For a light ray that grazes the sun's surface, the only physically relevant parameters are the sun's mass \(m\) and radius \(r\). Since the deflection is unitless, it can only depend on \(m/r\), the unitless ratio of the sun's mass to its radius. Expressed in SI units, this is \(Gm/c^2r\), which comes out to be about \(10^{-6}\). Roughly speaking, then, we expect the order of magnitude of the effect to be about this big, and indeed \(10^{-6}\) radians comes out to be in the same ball-park as a second of arc. We get a similar estimate in Newtonian physics by treating a photon as a (massive) particle moving at speed \(c\).

It is possible to calculate a precise value for the deflection using methods very much like those used to determine the perihelion advance in section 6.2.6. However, some of the details would have to be changed. For example, it is no longer possible to parametrize the trajectory using the proper time \(s\), since a light ray has \(ds=0\); we must use an affine parameter. Let us instead use this as an example of the numerical technique for solving the geodesic equation, first demonstrated in section 5.9.2 on page 188. Modifying our earlier program, we have the following:

import math

# constants, in SI units:
G = 6.67e-11 # gravitational constant
c = 3.00e8 # speed of light
m_kg = 1.99e30 # mass of sun
r_m = 6.96e8 # radius of sun

# From now on, all calculations are in units of the
# radius of the sun.

# mass of sun, in units of the radius of the sun:
m_sun = (G/c**2)*(m_kg/r_m)
m = 1000.*m_sun
print("m/r=",m)

# Start at point of closest approach.
# initial position:
t=0
r=1 # closest approach, grazing the sun's surface
phi=-math.pi/2
# initial derivatives of coordinates w.r.t. lambda
vr = 0
vt = 1
vphi = math.sqrt((1.-2.*m/r)/r**2)*vt # gives ds=0, lightlike

l = 0 # affine parameter lambda
l_max = 20000.
epsilon = 1e-6 # controls how fast lambda varies
while l<l_max:
  dl = epsilon*(1.+r**2) # giant steps when farther out
  l = l+dl
  # Christoffel symbols:
  Gttr = m/(r**2-2*m*r)
  Grtt = m/r**2-2*m**2/r**3
  Grrr = -m/(r**2-2*m*r)
  Grphiphi = -r+2*m
  Gphirphi = 1/r
  # second derivatives:
  # The factors of 2 are because we have, e.g., G^a_{bc}=G^a_{cb}
  at = -2.*Gttr*vt*vr
  ar = -(Grtt*vt*vt + Grrr*vr*vr + Grphiphi*vphi*vphi)
  aphi = -2.*Gphirphi*vr*vphi
  # update velocity:
  vt = vt + dl*at
  vr = vr + dl*ar
  vphi = vphi + dl*aphi
  # update position:
  r = r + vr*dl
  t = t + vt*dl
  phi = phi + vphi*dl

# Direction of propagation, approximated in asymptotically flat coords.
# First, differentiate (x,y)=(r cos phi,r sin phi) to get vx and vy:
vx = vr*math.cos(phi)-r*math.sin(phi)*vphi
vy = vr*math.sin(phi)+r*math.cos(phi)*vphi
prop = math.atan2(vy,vx) # inverse tan of vy/vx, in the proper quadrant
prop_sec = prop*180.*3600/math.pi
print("final direction of propagation = %6.2f arc-seconds" % prop_sec)

At line 14, we take the mass to be 1000 times greater than the mass of the sun. This helps to make the deflection easier to calculate accurately without running into problems with rounding errors. Lines 17-25 set up the initial conditions to be at the point of closest approach, as the photon is grazing the sun. This is easier to set up than initial conditions in which the photon approaches from far away. Because of this, the deflection angle calculated by the program is cut in half. Combining the factors of 1000 and one half, the final result from the program is to be interpreted as 500 times the actual deflection angle.

The result is that the deflection angle is predicted to be 870 seconds of arc. As a check, we can run the program again with \(m=0\); the result is a deflection of \(-8\) seconds, which is a measure of the accumulated error due to rounding and the finite increment used for \(\lambda\).

Dividing by 500, we find that the predicted deflection angle is 1.74 seconds, which, expressed in radians, is exactly \(4Gm/c^2r\). The unitless factor of 4 is in fact the correct result in the case of small deflections, i.e., for \(m/r \ll 1\).
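The weak-field value is easy to evaluate directly. This short sketch reuses the constants from the program above and converts \(4Gm/c^2r\) to seconds of arc; the variable names are illustrative.

```python
import math

# constants, in SI units (same values as in the program above)
G = 6.67e-11      # gravitational constant
c = 3.00e8        # speed of light
m_kg = 1.99e30    # mass of sun
r_m = 6.96e8      # radius of sun

m_over_r = G * m_kg / (c**2 * r_m)                 # unitless ratio m/r
deflection_rad = 4.0 * m_over_r                    # weak-field deflection
deflection_arcsec = math.degrees(deflection_rad) * 3600.0

print("m/r = %.3e" % m_over_r)
print("4Gm/c^2r = %.2f arc-seconds" % deflection_arcsec)
```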

Although the numerical technique has the disadvantage that it doesn't let us directly prove a nice formula, it has some advantages as well. For one thing, we can use it to investigate cases for which the approximation \(m/r \ll 1\) fails. For \(m/r=0.3\), the numerical technique gives a deflection of 222 degrees, whereas the weak-field approximation \(4Gm/c^2r\) gives only 69 degrees. What is happening here is that we're getting closer and closer to the event horizon of a black hole. Black holes are the topic of section 6.3, but it should be intuitively reasonable that something wildly nonlinear has to happen as we get close to the point where the light wouldn't even be able to escape.
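As an independent cross-check on both regimes, one can use a different technique from the step-by-step integration of the geodesic equation: a direct quadrature. This sketch assumes the orbit equation for light, \((du/d\phi)^2 = u_0^2(1-2mu_0) - u^2(1-2mu)\) with \(u=1/r\) and \(u_0\) its value at closest approach, which follows from the conserved quantities of section 6.2.6 when the mass term is dropped. The substitution \(u=u_0(1-x^2)\), the midpoint rule, and the function name deflection are choices made here for illustration.

```python
import math

def deflection(m, r0=1.0, n=20000):
    """Total deflection angle in radians for closest approach r0.

    Evaluates 2 * integral_0^u0 du / sqrt(f(u)) - pi, where
    f(u) = (u0 - u) * g(u), using u = u0*(1 - x**2) so that the
    integrand stays finite at the point of closest approach.
    """
    u0 = 1.0 / r0
    dx = 1.0 / n
    half_angle = 0.0
    for i in range(n):
        x = (i + 0.5) * dx                      # midpoint rule on 0 < x < 1
        u = u0 * (1.0 - x * x)
        g = (u0 + u) - 2.0 * m * (u0 * u0 + u0 * u + u * u)
        half_angle += 2.0 * math.sqrt(u0 / g) * dx
    return 2.0 * half_angle - math.pi

weak = deflection(1.0e-4)                       # weak field: expect about 4m/r
strong_deg = math.degrees(deflection(0.3))      # strong field, m/r = 0.3

print("m/r = 1e-4: deflection = %.6e rad, 4m/r = %.6e" % (weak, 4.0e-4))
print("m/r = 0.3:  deflection = %.1f degrees" % strong_deg)
print("weak-field formula at m/r = 0.3: %.1f degrees" % math.degrees(4.0 * 0.3))
```

For \(m=0\) the integral reduces to \(\pi/2\) on each side, so the deflection vanishes, which is a useful sanity check on the substitution.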

The precision of Eddington's original test was only about \(\pm\) 30%, and has never been improved on significantly with visible-light astronomy. A better technique is radio astronomy, which allows measurements to be carried out without waiting for an eclipse. One merely has to wait for the sun to pass in front of a strong, compact radio source such as a quasar. These techniques have now verified the deflection of light predicted by general relativity to a relative precision of about \(10^{-5}\).9

6.3 Black holes

6.3.1 Singularities

A provocative feature of the Schwarzschild metric is that it has elements that blow up at \(r=0\) and at \(r=2m\). If this is a description of the sun, for example, then these singularities are of no physical significance, since we only solved the Einstein field equation for the vacuum region outside the sun, whereas \(r=2m\) would lie about 3 km from the sun's center. Furthermore, it is possible that one or both of these singularities is nothing more than a spot where our coordinate system misbehaves. This would be known as a coordinate singularity. For example, the metric of ordinary polar coordinates in a Euclidean plane has \(g^{\theta\theta}\rightarrow\infty\) as \(r\rightarrow 0\).

One way to test whether a singularity is a coordinate singularity is to calculate a scalar measure of curvature, whose value is independent of the coordinate system. We can take the trace of the Ricci tensor, \(R^a_a\), known as the scalar curvature or Ricci scalar, but since the Ricci tensor is zero, it's not surprising that that is zero. A different scalar we can construct is the product \(R^{abcd}R_{abcd}\) of the Riemann tensor with itself. This is known as the Kretschmann invariant. The Maxima command lriemann(true) displays the nonvanishing components of \(R_{abcd}\). The component that misbehaves the most severely at \(r=0\) is \(R_{trrt}=2m/r^3\). Because of this, the Kretschmann invariant blows up like \(r^{-6}\) as \(r\rightarrow 0\). This shows that the singularity at \(r=0\) is a real, physical singularity.
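To get a feeling for the \(r^{-6}\) behavior, we can tabulate the invariant at a few radii. The sketch below assumes the standard closed-form result \(R^{abcd}R_{abcd}=48m^2/r^6\) for the Schwarzschild spacetime, which is quoted here rather than derived in the text; the function name is hypothetical.

```python
def kretschmann(m, r):
    """Schwarzschild value of R^{abcd} R_{abcd}, ASSUMING the form 48 m^2 / r^6."""
    return 48.0 * m**2 / r**6

m = 1.0   # geometrized units
for r in [10.0, 2.0, 1.0, 0.1, 0.01]:
    print("r = %5.2f   invariant = %.3e" % (r, kretschmann(m, r)))

at_2m = kretschmann(m, 2.0 * m)   # evaluates to 3/(4 m^4)
```

The value at \(r=2m\) is finite, \(3/4m^4\), while the values grow without bound as \(r\rightarrow0\).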

The singularity at \(r=2m\), on the other hand, turns out to be only a coordinate singularity. To prove this, we have to use some technique other than constructing scalar measures of curvature. Even if every such scalar we construct is finite at \(r=2m\), that doesn't prove that every such scalar we could construct is also well behaved. We can instead search for some other coordinate system in which to express the solution to the field equations, one in which no such singularity appears. A partially successful change of coordinates for the Schwarzschild metric, found by Eddington in 1924, is \(t \rightarrow t'=t-2m\ln(r-2m)\) (see problem 7 on page 223). This makes the covariant metric finite at \(r=2m\), although the contravariant metric still blows up there. A more complicated change of coordinates that completely eliminates the singularity at \(r=2m\) was found by Eddington and Finkelstein in 1958, establishing that the singularity was only a coordinate singularity. Thus, if an observer is so unlucky as to fall into a black hole, he will not be subjected to infinite tidal stresses --- or infinite anything --- at \(r=2m\). He may not notice anything special at all about his local environment. (Or he may already be dead because the tidal stresses at \(r>2m\), although finite, were nevertheless great enough to kill him.)

6.3.2 Event horizon

Even though \(r=2m\) isn't a real singularity, interesting things do happen there. For \(r\lt2m\), the sign of \(g_{tt}\) becomes negative, while \(g_{rr}\) is positive. In our \(+---\) signature, this has the following interpretation. For the world-line of a material particle, \(ds^2\) is supposed to be the square of the particle's proper time, and it must always be positive. If a particle had a constant value of \(r\), for \(r\lt2m\), it would have \(ds^2\lt0\), which is impossible.

The timelike and spacelike characters of the \(r\) and \(t\) coordinates have been swapped, so \(r\) acts like a time coordinate.

Thus for an object compact enough that \(r=2m\) is exterior, \(r=2m\) is an event horizon: future light cones tip over so far that they do not allow causal relationships to connect with the spacetime outside. In relativity, event horizons do not occur only in the context of black holes; their properties, and some of the implications for black holes, have already been discussed in section 6.1.

The gravitational time dilation in the Schwarzschild field, relative to a clock at infinity, is given by the square root of the \(g_{tt}\) component of the metric. This goes to zero at the event horizon, meaning that, for example, a photon emitted from the event horizon will be infinitely redshifted when it reaches an observer at infinity. This makes sense, because the photon is then undetectable, just as it would be if it had been emitted from inside the event horizon. If matter is falling into a black hole, then due to time dilation an observer at infinity “sees” that matter as slowing down more and more as it approaches the horizon. This has some counterintuitive effects. A radially infalling particle has \(d^2r/dt^2>0\) once it falls past a certain point, which could be interpreted as a gravitational repulsion. The observer at infinity may also be led to describe the black hole as consisting of an empty, spherical shell of matter that never quite made it through the horizon. If asked what holds the shell up, the observer could say that it is held up by gravitational repulsion.
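Both statements can be made concrete in one special case. The sketch below assumes a particle released from rest at infinity, so that \(E=1\) and \(L=0\) in the notation of section 6.2.6. The conserved energy then gives \(dt/ds=1/(1-2m/r)\) and \(\dot{r}^2=2m/r\), so the coordinate speed is \(|dr/dt|=(1-2m/r)\sqrt{2m/r}\). The choice of initial conditions, the grid of radii, and the function names are assumptions made for illustration.

```python
import math

def time_dilation(m, r):
    """Rate of a static clock at r relative to a clock at infinity."""
    return math.sqrt(1.0 - 2.0 * m / r)

def coordinate_speed(m, r):
    """|dr/dt| for radial infall with E = 1 (ASSUMED: released from rest at infinity)."""
    return (1.0 - 2.0 * m / r) * math.sqrt(2.0 * m / r)

m = 1.0
for r in [100.0, 10.0, 3.0, 2.1, 2.001]:
    print("r = %8.3f   time dilation = %.4f   |dr/dt| = %.4f"
          % (r, time_dilation(m, r), coordinate_speed(m, r)))

# Locate the radius where the coordinate speed peaks; inside this radius
# |dr/dt| decreases as the particle falls, i.e. d^2r/dt^2 > 0.
radii = [2.0 * m + 0.001 * i for i in range(1, 20001)]
r_peak = max(radii, key=lambda r: coordinate_speed(m, r))
print("coordinate speed peaks near r =", r_peak)
```

For these initial conditions the peak lies at \(r=6m\); for other values of \(E\) the turnover point would be different.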

There is actually nothing wrong with any of this, but one should realize that it is only one possible description in one possible coordinate system. An observer hovering just outside the event horizon sees a completely different picture, with matter falling past at velocities that approach the speed of light as it comes to the event horizon. If an atom emits a photon from the event horizon, the hovering observer sees it as being infinitely red-shifted, but explains the red-shift as a kinematic one rather than a gravitational one.

We can imagine yet a third observer, one who free-falls along with the infalling matter. According to this observer, the gravitational field is always zero, and it takes only a finite time to pass through the event horizon.

If a black hole has formed from the gravitational collapse of a cloud of matter, then some of our observers can say that “right now” the matter is located in a spherical shell at the event horizon, while others can say that it is concentrated at an infinitely dense singularity at the center. Since simultaneity isn't well defined in relativity, it's not surprising that they disagree about what's happening “right now.” Regardless of where they say the matter is, they all agree on the spacetime curvature. In fact, Birkhoff's theorem (p. 251) tells us that any spherically symmetric vacuum spacetime is Schwarzschild in form, so it doesn't matter where we say the matter is, as long as it's distributed in a spherically symmetric way and surrounded by vacuum.

6.3.3 Expected formation

Einstein and Schwarzschild did not believe, however, that any of these features of the Schwarzschild metric were more than a mathematical curiosity, and the term “black hole” was not invented until the 1960s, by John Wheeler. Although there is quite a bit of evidence these days that black holes do exist, there is also the related question of what sizes they come in.

We might expect naively that since gravity is an attractive force, there would be a tendency for any primordial cloud of gas or dust to spontaneously collapse into a black hole. But clouds of less than about \(0.1M_\odot\) (0.1 solar masses) form planets, which achieve a permanent equilibrium between gravity and internal pressure. Heavier objects initiate nuclear fusion, but those with masses above about \(100M_\odot\) are immediately torn apart by their own solar winds. In the range from 0.1 to \(100M_\odot\), stars form. As discussed in section 4.4.3, those with masses greater than about a few \(M_\odot\) are expected to form black holes when they die. We therefore expect, on theoretical grounds, that the universe should contain black holes with masses ranging from a few solar masses to a few tens of solar masses.


a / A black hole accretes matter from a companion star.

6.3.4 Observational evidence

A black hole is expected to be a very compact object, with a strong gravitational field, that does not emit any of its own light. A bare, isolated black hole would be difficult to detect, except perhaps via its lensing of light rays that happen to pass by it. But if a black hole occurs in a binary star system, it is possible for mass to be transferred onto the black hole from its companion, if the companion's evolution causes it to expand into a giant and intrude upon the black hole's gravity well. The infalling gas would then get hot and emit radiation before disappearing behind the event horizon. The object known as Cygnus X-1 is the best-studied example. This X-ray-emitting object was discovered by a rocket-based experiment in 1964. It is part of a double-star system, the other member being a blue supergiant. They orbit their common center of mass with a period of 5.6 days. The orbit is nearly circular, and has a semi-major axis of about 0.2 times the distance from the earth to the sun. Applying Kepler's law of periods to these data constrains the sum of the masses, and knowledge of stellar structure fixes the mass of the supergiant. The result is that the mass of Cygnus X-1 is greater than about 10 solar masses, and this is confirmed by multiple methods. Since this is far above the Tolman-Oppenheimer-Volkoff limit, Cygnus X-1 is believed to be a black hole, and its X-ray emissions are interpreted as the radiation from the disk of superheated material accreting onto it from its companion. It is believed to have more than 90% of the maximum possible spin for a black hole of its mass.10
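The constraint from Kepler's law of periods can be sketched in a few lines. In units of years, astronomical units, and solar masses, the Newtonian law reads \(M_1+M_2=a^3/P^2\). The sketch assumes that the quoted semi-major axis of about 0.2 AU refers to the relative orbit of the two bodies; since that figure is only approximate, the result is a rough estimate.

```python
period_days = 5.6        # orbital period quoted above
a_au = 0.2               # semi-major axis in AU (ASSUMED to be that of the relative orbit)

period_years = period_days / 365.25
# Kepler's third law in solar-system units: (M1 + M2) * P^2 = a^3
total_mass_solar = a_au**3 / period_years**2

print("sum of the masses: about %.0f solar masses" % total_mass_solar)
```

A total of a few tens of solar masses leaves room for a compact object well above 10 solar masses once the supergiant's mass is subtracted.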

Around the turn of the 21st century, new evidence was found for the prevalence of supermassive black holes near the centers of nearly all galaxies, including our own. Near our galaxy's center is an object called Sagittarius A*, detected because nearby stars orbit around it. The orbital data show that Sagittarius A* has a mass of about four million solar masses, confined within a sphere with a radius less than \(2.2\times10^7\) km. There is no known astrophysical model that could prevent the collapse of such a compact object into a black hole, nor is there any plausible model that would allow this much mass to exist in equilibrium in such a small space, without emitting enough light to be observable.
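It is instructive to compare the observational bound on the size of Sagittarius A* with the radius \(r=2m\) for its mass, which in SI units is \(2Gm/c^2\). The following sketch uses rounded constants and an illustrative function name; the same function also reproduces the figure of about 3 km for the sun.

```python
G = 6.67e-11       # gravitational constant, SI
c = 3.00e8         # speed of light, m/s
M_sun = 1.99e30    # mass of the sun, kg

def horizon_radius_km(mass_solar):
    """r = 2Gm/c^2, converted to km, for a mass given in solar masses."""
    return 2.0 * G * mass_solar * M_sun / c**2 / 1000.0

r_sun = horizon_radius_km(1.0)
r_sgr = horizon_radius_km(4.0e6)     # about four million solar masses
bound_km = 2.2e7                     # observational upper limit on the radius

print("sun: 2m = %.2f km" % r_sun)
print("Sagittarius A*: 2m = %.2e km" % r_sgr)
print("observed bound / 2m = %.1f" % (bound_km / r_sgr))
```

The observational bound is within about a factor of two of \(r=2m\).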

The existence of supermassive black holes is surprising. Gas clouds with masses greater than about 100 solar masses cannot form stable stars, so supermassive black holes cannot be the end-point of the evolution of heavy stars. Mergers of multiple stars to form more massive objects are generally statistically unlikely, since a star is such a small target in relation to the distance between the stars. Once astronomers were confronted with the empirical fact of their existence, a variety of mechanisms was proposed for their formation. Little is known about which of these mechanisms is correct, although the existence of quasars in the early universe is interpreted as evidence that mass accreted rapidly onto supermassive black holes in the early stages of the evolution of the galaxies.

A skeptic could object that although Cygnus X-1 and Sagittarius A* are more compact than is believed possible for a neutron star, this does not necessarily prove that they are black holes. Indeed, speculative theories have been proposed in which exotic objects could exist that are intermediate in compactness between black holes and neutron stars. These hypothetical creatures have names like black stars, gravastars, quark stars, boson stars, Q-balls, and electroweak stars. Although there is no evidence that these theories are right or that these objects exist, we are faced with the question of how to determine whether a given object is really a black hole or one of these other species. The defining characteristic of a black hole is that it has an event horizon rather than a physical surface. If an object is not a black hole, then by conservation of energy any matter that falls onto it must release its gravitational potential energy when it hits that surface. Cygnus X-1 has a copious supply of matter falling onto it from its supergiant companion, and Sagittarius A* likewise accretes a huge amount of gas from the stellar wind of nearby stars. By analyzing millimeter and infrared very-long-baseline-interferometry observations, Broderick, Loeb, and Narayan11 have shown that if Sagittarius A* had a surface, then the luminosity of this surface must be less than 0.3% of the luminosity of the accretion disk. But this is not physically possible, because there are fundamental limits on the efficiency with which the gas can radiate away its energy before hitting the surface. We can therefore conclude that Sagittarius A* must have an event horizon. Its event horizon may be imaged directly in the near future.12


b / A conical singularity. The cone has zero intrinsic curvature everywhere except at its tip. Geodesic 1 can be extended infinitely far, but geodesic 2 cannot; since the metric is undefined at the tip, there is no sensible way to define how geodesic 2 should be extended.

6.3.5 Singularities and cosmic censorship

Informal ideas

Since we observe that black holes really do exist, maybe we should take the singularity at \(r=0\) seriously. Physically, it says that the mass density and tidal forces blow up to infinity there.

Generally when a physical theory says that observable quantities blow up to infinity at a particular point, it means that the theory has reached the point at which it can no longer make physical predictions. For instance, Maxwell's theory of electromagnetism predicts that the electric field blows up like \(r^{-2}\) near a point charge, and this implies that infinite energy is stored in the field within a finite radius around the charge. Physically, this can't be right, because we know it only takes 511 keV of energy to create an electron out of nothing, e.g., in nuclear beta decay. The paradox is resolved by quantum electrodynamics, which modifies the description of the vacuum around the electron to include a sea of virtual particles popping into and out of existence.

In the case of the singularity at the center of a black hole, it is possible that quantum mechanical effects at the Planck scale prevent the formation of a singularity. Unfortunately, we are unlikely to find any empirical evidence about this, since black holes always seem to come clothed in event horizons, so we outside observers cannot extract any data about the singularity inside. Even if we take a suicidal trip into a black hole, we get no data about the singularity, because the singularity in the Schwarzschild metric is spacelike, not timelike, and therefore it always lies in our future light cone, never in our past.

In a way, the inaccessibility of singularities is a good thing. If a singularity exists, it is a point at which all the known laws of physics break down, and physicists therefore have no way of predicting anything about its behavior. There is likewise no great crisis for physics due to the Big Bang singularity or the Big Crunch singularity that occurs in some cosmologies in which the universe recollapses; we have no reasonable expectation of being able to make and test predictions or retrodictions that extend beyond the beginning or end of the universe.

What would be a crushing blow to the enterprise of physics would be a singularity that could sit on someone's desk. As John Earman of the University of Pittsburgh puts it, anything could pop out of such a singularity, including green slime or your lost socks. In more technical language, a singularity would constitute an extreme violation of unitarity and an acute instance of the information paradox (see page 203).

There is no obvious reason that general relativity should not allow naked singularities, but neither do we know of any real-world process by which one could be formed by gravitational collapse. Penrose's cosmic censorship hypothesis states that the laws of physics prevent the formation of naked singularities from nonsingular and generic initial conditions. “Generic” is a necessary addition to Penrose's original 1969 formulation, since Choptuik showed in 1993 that certain perfectly fine-tuned initial conditions allowed collapse to a naked singularity.13

Formal definitions

The remainder of this subsection provides a more formal exposition of the definitions relating to singularities. It can be skipped without loss of continuity.

The reason we care about singularities is that they indicate an incompleteness of the theory, and the theory's inability to make predictions. One of the simplest things we could ask any theory to do would be to predict the trajectories of test particles. For example, Maxwell's equations correctly predict the motion of an electron in a uniform magnetic field, but they fail to predict the motion of an electron that collides head-on with a positron. It might have been natural for someone in Maxwell's era (assuming they were informed about the existence of positrons and told to assume that both particles were pointlike) to guess that the two particles would scatter through one another at \(\theta=0\), their velocities momentarily becoming infinite. But it would have been equally natural for this person to refuse to make a prediction.

Similarly, if a particle hits a black hole singularity, we should not expect general relativity to make a definite prediction. It doesn't. Not only does the geodesic equation break down, but if we were to naively continue the particle's geodesic by assuming that it scatters in the forward direction, the continuation would be a world-line whose future-time direction pointed into the singularity rather than back out of it.

We would therefore like to define a singularity as a situation in which the geodesics of test particles can't be extended indefinitely. But what does “indefinitely” mean? If the test particle is a photon, then the metric length of its world-line is zero. We get around this by defining length in terms of an affine parameter.

Definition: A spacetime is said to be geodesically incomplete if there exist timelike or lightlike geodesics that cannot be extended past some finite affine parameter into the past or future.

Geodesic incompleteness defines what we mean by a singularity. A geodesically incomplete spacetime has one or more singularities in it. The Schwarzschild spacetime has a singularity at \(r=0\), but not at the event horizon, since geodesics continue smoothly past the event horizon. Cosmological spacetimes contain a Big Bang singularity which prevents geodesics from being extended beyond a certain point in the past.

There are two types of singularities, curvature singularities and conical singularities. The examples above are curvature singularities. Figure b shows an example of a conical singularity. (Cf. figure b, p. 192.) As one approaches a curvature singularity, the curvature of spacetime diverges to infinity, as measured by a curvature invariant such as the Ricci scalar. In 2+1-dimensional relativity, curvature vanishes identically, and the only kind of gravity that exists is due to conical singularities. Conical singularities are not expected to be present in our universe, since there is no known mechanism by which they could be formed by gravitational collapse.

Actual singularities involving geodesic incompleteness are to be distinguished from coordinate singularities, which are not really singularities at all. In the Schwarzschild spacetime, as described in Schwarzschild's original coordinates, some components of the metric blow up at the event horizon, but this is not an actual singularity. This coordinate system can be replaced with a different one in which the metric is well behaved.

The reason curvature scalars are useful as tests for an actual curvature singularity is that since they're scalars, they can't diverge in one coordinate system but stay finite in another. We define a singularity to be a curvature singularity if timelike or lightlike geodesics can only be extended to some finite affine parameter, and some curvature scalar (not necessarily every such scalar) approaches infinity as we approach this value of the affine parameter. Anything that is not a curvature singularity is considered a conical singularity.

A singularity is not considered to be a point in a spacetime; it's more like a hole in the topology of the manifold. For example, the Big Bang didn't occur at a point.

Because a singularity isn't a point or a point-set, we can't define its timelike or spacelike character in quite the way we would with, say, a curve. A timelike singularity is one such that an observer with a timelike world-line can have the singularity sometimes in his future light-cone and sometimes in his past light-cone.14

Schwarzschild and Big Bang singularities are spacelike. (Note that in the Schwarzschild metric, the Schwarzschild \(r\) and \(t\) coordinates swap their timelike and spacelike characters inside the event horizon.)

The definition of a timelike singularity is local. A timelike singularity would be one that you could have sitting on your desk, where you could look at it and poke it with a stick.

A naked singularity is one from which timelike or lightlike world-lines can originate and then escape to infinity. The Schwarzschild metric's singularity is not naked. This notion is global.

If either a timelike or a naked singularity can be formed by gravitational collapse from realistic initial conditions, then it would create severe difficulties for physicists wishing to make predictions using the laws of physics.

6.3.6 Hawking radiation

Radiation from black holes

Since event horizons are expected to emit blackbody radiation, a black hole should not be entirely black; it should radiate. This is called Hawking radiation. Suppose observer B just outside the event horizon blasts the engines of her rocket ship, producing enough acceleration to keep from being sucked in. By the equivalence principle, what she observes cannot depend on whether the acceleration she experiences is actually due to a gravitational field. She therefore detects radiation, which she interprets as coming from the event horizon below her. As she gets closer and closer to the horizon, the acceleration approaches infinity, so the intensity and frequency of the radiation grow without limit.

A distant observer A, however, sees a different picture. According to A, B's time is extremely dilated. A sees B's acceleration as being only \(\sim 1/m\), where \(m\) is the mass of the black hole; A does not perceive this acceleration as blowing up to infinity as B approaches the horizon. When A detects the radiation, it is extremely red-shifted, and it has the spectrum that one would expect for a horizon characterized by an acceleration \(a\sim 1/m\). The result for a 10-solar-mass black hole is \(T\sim10^{-8}\) K, which is so low that the black hole is actually absorbing more energy from the cosmic microwave background radiation than it emits.
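The order-of-magnitude estimate can be checked against the standard closed-form result for the temperature of a Schwarzschild black hole, \(T=\hbar c^3/(8\pi G M k_B)\). This formula is not derived in the text and is brought in here as an assumption; the constants are standard SI values, the CMB temperature is taken to be about 2.725 K, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_SUN = 1.98847e30      # solar mass, kg
T_CMB = 2.725           # approximate CMB temperature, K

def hawking_temperature(mass_kg):
    """Temperature T = hbar c^3 / (8 pi G M k_B), in kelvin."""
    return HBAR * C**3 / (8 * math.pi * G * mass_kg * K_B)

t_bh = hawking_temperature(10 * M_SUN)
# Mass at which the hole would be exactly as hot as the CMB.
m_equal = HBAR * C**3 / (8 * math.pi * G * K_B * T_CMB)

print(f"T for 10 solar masses: {t_bh:.1e} K")
print(f"mass with T = T_CMB:   {m_equal:.1e} kg")
```

Since the temperature scales as \(1/m\), only holes far lighter than a star (less massive than the moon, on this estimate) would be hotter than the microwave background.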

Direct observation of black-hole radiation is therefore probably only possible for black holes of very small masses. These may have been produced soon after the big bang, or it is conceivable that they could be created artificially, by advanced technology. If black-hole radiation does exist, it may help to resolve the information paradox, since it is possible that information that goes into a black hole is eventually released via subtle correlations in the black-body radiation it emits.

Particle physics

Hawking radiation has some intriguing properties from the point of view of particle physics. In a particle accelerator, the list of particles one can create in appreciable quantities is determined by coupling constants. In Hawking radiation, however, we expect to see a representative sampling of all types of particles, biased only by the fact that massless or low-mass particles are more likely to be produced than massive ones. For example, it has been speculated that some of the universe's dark matter exists in the form of “sterile” particles that do not couple to any force except for gravity. Such particles would never be produced in particle accelerators, but would be seen in Hawking radiation.

Hawking radiation would violate many cherished conservation laws of particle physics. Let a hydrogen atom fall into a black hole. We've lost a lepton and a baryon, but if we want to preserve conservation of lepton number and baryon number, we cover this up with a fig leaf by saying that the black hole has simply increased its lepton number and baryon number by \(+1\) each. But eventually the black hole evaporates, and the evaporation is probably mostly into zero-mass particles such as photons. Once the hole has evaporated completely, our fig leaf has evaporated as well. There is now no physical object to which we can attribute the \(+1\) units of lepton and baryon number.

Black-hole complementarity

A very difficult question about the relationship between quantum mechanics and general relativity occurs as follows. In our example above, observer A detects an extremely red-shifted spectrum of light from the black hole. A interprets this as evidence that the space near the event horizon is actually an intense maelstrom of radiation, with the temperature approaching infinity as one gets closer and closer to the horizon. If B returns from the region near the horizon, B will agree with this description. But suppose that observer C simply drops straight through the horizon. C does not feel any acceleration, so by the equivalence principle C does not detect any radiation at all. Passing down through the event horizon, C says, “A and B are liars! There's no radiation at all.” A and B, however, see C as having entered a region of infinitely intense radiation. “Ah,” says B, “too bad. C should have turned back before it got too hot, just as I did.” This is an example of a principle we've encountered before, that when gravity and quantum mechanics are combined, different observers disagree on the number of quanta present in the vacuum. We are presented with a paradox, because A and B believe in an entirely different version of reality than C. A and B say C was fricasseed, but C knows that that didn't happen. One suggestion is that this contradiction shows that the proper logic for describing quantum gravity is nonaristotelian, as described on page 68. This idea, suggested by Susskind et al., goes by the name of black-hole complementarity, by analogy with Niels Bohr's philosophical description of wave-particle duality as being “complementary” rather than contradictory. In this interpretation, we have to accept the fact that C experiences a qualitatively different reality than A and B, and we comfort ourselves by recognizing that the contradiction can never become too acute, since C is lost behind the event horizon and can never send information back out.

6.3.7 Black holes in \(d\) dimensions

It has been proposed that our universe might actually have not \(d=4\) dimensions but some higher number, with the \(d-4\) “extra” ones being spacelike, and curled up on some small scale \(\rho\) so that we don't see them in ordinary life. One candidate for such a scale \(\rho\) is the Planck length, and we then have to talk about theories of quantum gravity such as string theory. On the other hand, it could be the 1 TeV electroweak scale; the motivation for such an idea is that it would allow the unification of electroweak interactions with gravity. This idea goes by the name of “large extra dimensions” --- “large” because \(\rho\) is bigger than the Planck length. In fact, in such theories the Planck length is the electroweak unification scale, and the number normally referred to as the Planck length is not really the Planck length.15

In \(d\) dimensions, there are \(d-1\) spatial dimensions, and a surface of spherical symmetry has \(d-2\). In the Newtonian weak-field limit, the density of gravitational field lines falls off like \(m/r^{d-2}\) with distance from a source \(m\), and we therefore find that Newton's law of gravity has an exponent of \(-(d-2)\). If \(d\ne 3\), we can integrate to find that the gravitational potential varies as \(\Phi\sim -mr^{-(d-3)}\). Passing back to the weak-field limit of general relativity, the equivalence principle dictates that the \(g_{tt}\) term of the metric be approximately \(1+2\Phi\), so we find that the metric has the form

\[\begin{equation*} ds^2 \approx (1-2mr^{-(d-3)})dt^2 - (...)dr^2 - r^2 d\theta^2 - r^2 \sin^2\theta d\phi^2 . \end{equation*}\]

This looks like the Schwarzschild form with no other change than a generalization of the exponent, and in fact Tangherlini showed in 1963 that for \(d>4\), one obtains the exact solution simply by applying the same change of exponent to \(g_{rr}\) as well.16

If large extra dimensions do exist, then this is the actual form of any black-hole spacetime for \(r\ll \rho\), where the background curvature of the extra dimensions is negligible. Since the exponents are all changed, gravitational forces become stronger than otherwise expected at small distances, and it becomes easier to make black holes. It has been proposed that if large extra dimensions exist, microscopic black holes would be observed at the Large Hadron Collider. They would immediately evaporate into Hawking radiation (p. 230), with an experimental signature of violating the standard conservation laws of particle physics. As of 2010, the empirical results seem to be negative.17
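The Tangherlini form of \(g_{tt}\) quoted above can be explored numerically. The sketch below locates the horizon as the radius at which \(g_{tt}=1-2mr^{-(d-3)}\) vanishes. It assumes the same normalization as the text, in which dimension-dependent constants of order unity are absorbed into \(m\), and it works in units where \(m\) and \(r\) are pure numbers; the function names are hypothetical.

```python
def g_tt(r, m, d):
    """Time-time metric component 1 - 2 m r^-(d-3) in d spacetime dimensions."""
    return 1.0 - 2.0 * m * r ** (-(d - 3))

def horizon_radius(m, d):
    """Radius at which g_tt vanishes: r = (2m)^(1/(d-3)), for d >= 4."""
    if d < 4:
        raise ValueError("this form of the metric requires d >= 4")
    return (2.0 * m) ** (1.0 / (d - 3))

for d in (4, 5, 6):
    r_h = horizon_radius(1.0, d)
    print(f"d={d}: horizon at r={r_h:.4f}, g_tt there = {g_tt(r_h, 1.0, d):.1e}")
```

For \(d=4\) this reproduces the familiar Schwarzschild value \(r=2m\). The case \(d=3\) is rejected explicitly, since the exponent \(1/(d-3)\) is undefined there.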

The reasoning given above fails in the case of \(d=3\), i.e., 2+1-dimensional spacetime, both because the integral of \(r^{-1}\) is not \(r^0\) and because the Tangherlini-Schwarzschild metric is not a vacuum solution. As shown in problem 11 on p. 239, there is no counterpart of the Schwarzschild metric in 2+1 dimensions. This is essentially because for \(d=3\) mass is unitless, so given a source having a certain mass, there is no way to set the distance scale at which Newtonian weak-field behavior gives way to the relativistic strong field. Whereas for \(d \ge 4\), Newtonian gravity is the limiting case of relativity, for \(d=3\) they are unrelated theories. In fact, the relativistic theory of gravity for \(d=3\) is somewhat trivial. Spacetime does not admit curvature in vacuum solutions,18 so that the only nontrivial way to make non-Minkowski 2+1-dimensional spacetimes is by gluing together Minkowski pieces in various topologies, like gluing pieces of paper to make things like cones and Möbius strips. 2+1-dimensional gravity has conical singularities, but not Schwarzschild-style ones that are surrounded by curved spacetime.

If black-hole solutions exist in \(d\) dimensions, then one can extend such a solution to \(d+1\) dimensions with cylindrical symmetry, forming a “black string.” The nonexistence of \(d=3\) black holes implies that black string solutions do not exist in our own \(d=4\) universe. However, different considerations arise in a universe with a negative cosmological constant (p. 285). There are then 2+1-dimensional solutions known as BTZ black holes.19 Since our own universe has a positive cosmological constant, not a negative one, we still find that black strings cannot exist.

6.4 Degenerate solutions


a / The change of coordinates is degenerate at \(t=0\).

This section can be omitted on a first reading.

At the event horizon of the Schwarzschild spacetime, the timelike and spacelike roles of the Schwarzschild \(r\) and \(t\) coordinates get swapped around, so that the signs in the metric change from \(+---\) to \(-+--\). In discussing cases like this, it becomes convenient to define a new usage of the term “signature,” as \(s=p-q\), where \(p\) is the number of positive signs and \(q\) the number of negative ones. This can also be represented by the pair of numbers \((p,q)\). The example of the Schwarzschild horizon is not too disturbing, both because the funny behavior arises at a singularity that can be removed by a change of coordinates and because the signature stays the same. An observer who free-falls through the horizon observes that the local properties of spacetime stay the same, with \(|s|=2\), as required by the equivalence principle.

But this only makes us wonder whether there are other examples in which an observer would actually detect a change in the metric's signature. We are encouraged to think of the signature as something empirically observable because, for example, it has been proposed that our universe may have previously unsuspected additional spacelike dimensions, and these theories make testable predictions. Since we don't notice the extra dimensions in ordinary life, they would have to be wrapped up into a cylindrical topology. Some such theories, like string theory, are attempts to create a theory of quantum gravity, so the cylindrical radius is assumed to be on the order of the Planck length, which corresponds quantum-mechanically to an energy scale that we will not be able to probe using any foreseeable technology. But it is also possible that the radius is large --- a possibility that goes by the name of “large extra dimensions” --- so that we could see an effect at the Large Hadron Collider. Nothing in the formulation of the Einstein field equations requires a 3+1 (i.e., \((1,3)\)) signature, and they work equally well if the signature is instead 4+1, 5+1, .... Newton's inverse-square law of gravity is described by general relativity as arising from the three-dimensional nature of space, so on small scales in a theory with \(n\) large extra dimensions, the \(1/r^2\) behavior changes over to \(1/r^{2+n}\), and it becomes possible that the LHC could produce microscopic black holes, which would immediately evaporate into Hawking radiation in a characteristic way.

So it appears that the signature of spacetime is something that is not knowable a priori, and must be determined by experiment. When a thing is supposed to be experimentally observable, general relativity tells us that it had better be coordinate-independent. Is this so? A proposition from linear algebra called Sylvester's law of inertia encourages us to believe that it is. The theorem states that when a real matrix \(A\) is diagonalized by a real, nonsingular change of basis (a similarity transformation \(S^{-1}AS\)), the number of positive, negative, and zero diagonal elements is uniquely determined. Since a change of coordinates has the effect of applying a similarity transformation on the metric, it appears that the signature is coordinate-independent.

This is not quite right, however, as shown by the following paradox. The coordinate invariance of general relativity tells us that if all clocks, everywhere in the universe, were to slow down simultaneously (with simultaneity defined in any way we like), there would be no observable consequences. This implies that the spacetime \(ds^2=-tdt^2-d\ell^2\), where \(d\ell^2=dx^2+dy^2+dz^2\), is empirically indistinguishable from a flat spacetime. Starting from \(t=-\infty\), the positive \(g_{tt}\) component of the metric shrinks uniformly, which should be harmless. We can indeed verify by direct evaluation of the Riemann tensor that this is a flat spacetime (problem 9, p. 239). But for \(t>0\) the signature of the metric switches from \(+---\) to \(----\), i.e., from Lorentzian (\(|s|=2\)) to Euclidean (\(|s|=4\)). This is disquieting. For \(t\lt0\), the metric is a perfectly valid description of our own universe (which is approximately flat). Time passes, and there is no sign of any impending disaster. Then, suddenly, at some point in time, the entire structure of spacetime undergoes a horrible spasm. This is a paradox, because we could just as well have posed our initial conditions using some other coordinate system, in which the metric had the familiar form \(ds^2=dt^2-d\ell^2\). General relativity is supposed to be agnostic about coordinates, but a choice of coordinate leads to a differing prediction about the signature, which is a coordinate-independent quantity.
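The change of signature can be made concrete by counting signs of the diagonal metric components at a few values of \(t\). This is a minimal sketch that applies only to diagonal metrics; the helper names are hypothetical.

```python
def metric_diagonal(t):
    """Diagonal components (g_tt, g_xx, g_yy, g_zz) of ds^2 = -t dt^2 - dl^2."""
    return (-t, -1.0, -1.0, -1.0)

def signature(diagonal):
    """Return (p, q, zeros): counts of positive, negative, and zero entries."""
    p = sum(1 for g in diagonal if g > 0)
    q = sum(1 for g in diagonal if g < 0)
    zeros = sum(1 for g in diagonal if g == 0)
    return p, q, zeros

for t in (-1.0, 0.0, 1.0):
    p, q, zeros = signature(metric_diagonal(t))
    print(f"t={t:+.0f}: p={p}, q={q}, zeros={zeros}, |s|={abs(p - q)}")
```

The three sample times give \(|s|=2\) for \(t\lt0\) and \(|s|=4\) for \(t>0\), with a vanishing time component exactly at \(t=0\).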

We are led to the resolution of the paradox if we explicitly construct the coordinate transformation involved. In coordinates \((t,x,y,z)\), we have \(ds^2=-tdt^2-d\ell^2\). We would like to find the relationship between \(t\) and some other coordinate \(u\) such that we recover the familiar form \(ds^2=du^2-d\ell^2\) for the metric. The tensor transformation law gives

\[\begin{align*} g_{tt} &= \left(\frac{\partial u}{\partial t}\right)^2 g_{uu} \\ -t &= \left(\frac{\partial u}{\partial t}\right)^2 \\ \text{with solution}\quad u &= \pm \frac{2}{3} (-t)^{3/2} , \quad t\lt0 . \end{align*}\]

There is no solution for \(t>0\).
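As a quick numerical check, we can take the branch \(u=+\frac{2}{3}(-t)^{3/2}\), which is real for \(t\lt0\), and confirm by finite differences that \((\partial u/\partial t)^2=-t\). This is only a sketch; the step size and the sample values of \(t\) are arbitrary choices.

```python
def u_of_t(t):
    """Branch u = +(2/3) (-t)^(3/2); real only for t <= 0."""
    return (2.0 / 3.0) * (-t) ** 1.5

def du_dt(t, h=1e-6):
    """Central finite-difference estimate of du/dt."""
    return (u_of_t(t + h) - u_of_t(t - h)) / (2.0 * h)

for t in (-4.0, -1.0, -0.25, -1e-4):
    print(f"t={t:g}: (du/dt)^2 = {du_dt(t) ** 2:.6g}, -t = {-t:g}")

# For t > 0, Python returns a complex number for (-t) ** 1.5,
# reflecting the absence of a real solution there.
print(type(u_of_t(1.0)).__name__)
```

The derivative \(\partial u/\partial t\) shrinks to zero as \(t\) approaches zero from below, which is the degeneracy pictured in figure a of this section.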

If physicists living in this universe, at \(t\lt0\), for some reason choose \(t\) as their time coordinate, there is in fact a way for them to tell that the cataclysmic event at \(t=0\) is not a reliable prediction. At \(t=0\), their metric's time component vanishes, so its signature changes from \(+---\) to \(0---\). At that moment, the machinery of the standard tensor formulation of general relativity breaks down. For example, one can no longer raise indices, because \(g^{ab}\) is the matrix inverse of \(g_{ab}\), but \(g_{ab}\) is not invertible. Since the field equations are ultimately expressed in terms of the metric using machinery that includes raising and lowering of indices, there is no way to apply them at \(t=0\). They don't make a false prediction of the end of the world; they fail to make any prediction at all. Physicists accustomed to working in terms of the \(t\) coordinate can simply throw up their hands and say that they have no way to predict anything at \(t>0\). But they already know that their spacetime is one whose observables, such as curvature, are all constant with respect to time, so they should ask why this perfect symmetry is broken by singling out \(t=0\). There is physically nothing that should make one moment in time different than any other, so choosing a particular time to call \(t=0\) should be interpreted merely as an arbitrary choice of the placement of the origin of the coordinate system. This suggests to the physicists that all of the problems they've been having are not problems with any physical meaning, but merely problems arising from a poor choice of coordinates. They carry out the calculation above, and discover the \(u\) time coordinate. Expressed in terms of \(u\), the metric is well behaved, and the machinery of prediction never breaks down.

The paradox posed earlier is resolved because Sylvester's law of inertia only applies to a nonsingular transformation \(S\). If \(S\) had been singular, then the \(S^{-1}\) referred to in the theorem wouldn't even have existed. But the transformation between \(u\) and \(t\) has \(\partial u/\partial t=0\) at \(u=t=0\), so it is singular. This is all in keeping with the general philosophy of coordinate-invariance in relativity, which is that only smooth, one-to-one coordinate transformations are allowed. Someone who has found a lucky coordinate like \(u\), and who then contemplates transforming to \(t\), should realize that it isn't a good idea, because the transformation is not smooth and one-to-one. Someone who has started by working with an unlucky coordinate like \(t\) finds that the machinery breaks down at \(t=0\), and concludes that it would be a good idea to search for a more useful set of coordinates. This situation can actually arise in practical calculations.

What about our original question: could the signature of spacetime actually change at some boundary? The answer is now clear. Such a change of signature is something that could conceivably have intrinsic physical meaning, but if so, then the standard formulation of general relativity is not capable of making predictions about it. There are other formulations of general relativity, such as Ashtekar's, that are ordinarily equivalent to Einstein's, but that are capable of making predictions about changes of signature. However, there is more than one such formulation, and they do not agree on their predictions about signature changes.

Homework Problems

1. Show that in geometrized units, power is unitless. Find the equivalent in watts of a power that equals 1 in geometrized units.

2. The metric of coordinates \((\theta,\phi)\) on the unit sphere is \(ds^2=d\theta^2+\sin^2\theta\, d\phi^2\). (a) Show that there is a singular point at which \(g^{ab}\rightarrow\infty\). (b) Verify directly that the scalar curvature \(R=R^a_a\) constructed from the trace of the Ricci tensor is never infinite. (c) Prove that the singularity is a coordinate singularity.

3. (a) Space probes in our solar system often use a slingshot maneuver. In the simplest case, the probe is scattered gravitationally through an angle of 180 degrees by a planet. Show that in some other frame such as the rest frame of the sun, in which the planet has speed \(u\) toward the incoming probe, the maneuver adds \(2u\) to the speed of the probe. (b) Suppose that we replace the planet with a black hole, and the space probe with a light ray. Why doesn't this accelerate the ray to a speed greater than \(c\)? (solution in the pdf version of the book)

4. The curve given parametrically by \((\cos^3 t,\sin^3 t)\) is called an astroid. The arc length along this curve is given by \(s=(3/2)\sin^2 t\), and its curvature by \(k=-(2/3)\csc 2t\). By rotating this astroid about the \(x\) axis, we form a surface of revolution that can be described by coordinates \((t,\phi)\), where \(\phi\) is the angle of rotation. (a) Find the metric on this surface. (b) Identify any singularities, and classify them as coordinate or intrinsic singularities. (solution in the pdf version of the book)

5. (a) Section 3.5.4 (p. 108) gave a flat-spacetime metric in rotating polar coordinates,

\[\begin{equation*} ds^2=(1-\omega^2 r^2)dt^2 - dr^2 - r^2d \theta'^2 - 2\omega r^2d\theta'dt . \end{equation*}\]

Identify the two values of \(r\) at which singularities occur, and classify them as coordinate or non-coordinate singularities.
(b) The corresponding spatial metric was found to be

\[\begin{equation*} ds^2= - dr^2 - \frac{r^2}{1-\omega^2r^2}d \theta'^2 . \end{equation*}\]

Identify the two values of \(r\) at which singularities occur, and classify them as coordinate or non-coordinate singularities.
(c) Consider the following argument, which is intended to provide an answer to part b without any computation. In two dimensions, there is only one measure of curvature, which is equivalent (up to a constant of proportionality) to the Gaussian curvature. The Gaussian curvature is proportional to the angular deficit \(\epsilon\) of a triangle. Since the angular deficit of a triangle in a space with negative curvature satisfies the inequality \(-\pi \lt \epsilon \lt0\), we conclude that the Gaussian curvature can never be infinite. Since there is only one measure of curvature in a two-dimensional space, this means that there is no non-coordinate singularity. Is this argument correct, and is the claimed result consistent with your answers to part b? (solution in the pdf version of the book)

\begin{homeworkforcelabel}{sirius-b-redshift}{1}{}{6} The first experimental verification of gravitational redshifts was a measurement in 1925 by W.S. Adams of the spectrum of light emitted from the surface of the white dwarf star Sirius B. Sirius B has a mass of \(0.98M_\odot\) and a radius of \(5.9\times10^6\) m. Find the redshift. \end{homeworkforcelabel}
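A numerical answer to the Sirius B problem can be checked with a few lines of code. The sketch below rests on assumptions that are not stated in the problem: the exterior of the star is treated as a static Schwarzschild spacetime, rotation and Doppler shifts are ignored, the observer is taken to be at infinity, and the redshift is defined as \(z=\Delta\lambda/\lambda\). The function names and the values used for \(GM_\odot\) and \(c\) are supplied here for illustration.

```python
import math

G_M_SUN = 1.32712440018e20  # G times the solar mass, in m^3/s^2 (assumed value)
C = 2.99792458e8            # speed of light, in m/s

def redshift_exact(mass_solar, radius_m):
    """z for light escaping to infinity from radius r in a Schwarzschild field."""
    x = G_M_SUN * mass_solar / (radius_m * C**2)
    return 1.0 / math.sqrt(1.0 - 2.0 * x) - 1.0

def redshift_weak_field(mass_solar, radius_m):
    """First-order approximation, z ~ GM/(r c^2)."""
    return G_M_SUN * mass_solar / (radius_m * C**2)

if __name__ == "__main__":
    # Mass and radius as given in the problem statement.
    print(redshift_exact(0.98, 5.9e6))
    print(redshift_weak_field(0.98, 5.9e6))
```

Comparing the two functions shows how good the weak-field approximation is for a white dwarf.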

\begin{homeworkforcelabel}{eddington-coordinate-change}{1}{}{7} Show that, as claimed on page 223, applying the change of coordinates \(t'=t-2m\ln(r-2m)\) to the Schwarzschild metric results in a metric for which \(g_{rr}\) and \(g_{t't'}\) never blow up, but that \(g^{t't'}\) does blow up. \end{homeworkforcelabel}

\begin{homeworkforcelabel}{schwarzschild-circular-orbit}{1}{}{8} Use the geodesic equation to show that, in the case of a circular orbit in a Schwarzschild metric, \(d^2 t/ds^2=0\). Explain why this makes sense. \end{homeworkforcelabel}

\begin{homeworkforcelabel}{slowdown-curvature}{1}{}{9} Verify by direct calculation, as asserted on p. 235, that the Riemann tensor vanishes for the metric \(ds^2=-tdt^2-d\ell^2\), where \(d\ell^2=dx^2+dy^2+dz^2\). (solution in the pdf version of the book) \end{homeworkforcelabel}

\begin{homeworkforcelabel}{bogus-vacuum-field-equation}{1}{}{10} Suppose someone proposes that the vacuum field equation of general relativity isn't \(R_{ab}=0\) but rather \(R_{ab}=k\), where \(k\) is some constant that describes an innate tendency of spacetime to have tidal distortions. Explain why this is not a good proposal. (solution in the pdf version of the book) \end{homeworkforcelabel}

\begin{homeworkforcelabel}{no-schwarzschild-in-three-dimensions}{1}{}{11} Prove, as claimed on p. 232, that in 2+1 dimensions, with a vanishing cosmological constant, there is no nontrivial Schwarzschild metric. (solution in the pdf version of the book) \end{homeworkforcelabel}

\begin{homeworkforcelabel}{flip-velocity-vectors}{1}{}{12} On p. 211 I argued that there is no way to define a time-reversal operation in general relativity so that it applies to all spacetimes. Why can't we define it by picking some arbitrary spacelike surface that covers the whole universe, flipping the velocity of every particle on that surface, and evolving a new version of the spacetime backward and forward from that surface using the field equations? (solution in the pdf version of the book) \end{homeworkforcelabel}

(c) 1998-2013 Benjamin Crowell, licensed under the Creative Commons Attribution-ShareAlike license. Photo credits are given at the end of the Adobe Acrobat version.

[1] Ahn et al. have shown that the no-cloning theorem is violated in the presence of closed timelike curves:
[3] “On the gravitational field of a point mass according to Einstein's theory,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften 1 (1916) 189. An English translation is available at
[4] See p. 234 for a different but closely related use of the same term.
[5] For more about time-reversal symmetry, see p. 211.
[6] Misner, Thorne, and Wheeler, Gravitation, p. 1118
[7] Rindler, Essential Relativity, 1969, p. 141
[9] For a review article on this topic, see Clifford Will, “The Confrontation between General Relativity and Experiment,”
[10] Gou et al., “The Extreme Spin of the Black Hole in Cygnus X-1,”
[13] Phys. Rev. Lett. 70, p. 9
[14] Penrose, Gravitational radiation and gravitational collapse; Proceedings of the Symposium, Warsaw, 1973. Dordrecht, D. Reidel Publishing Co. pp. 82-91, free online at
[15] Kanti,
[16] Emparan and Reall, “Black Holes in Higher Dimensions,”