Tensor Analysis on Manifolds


Author: Samuel Peterson


Date Published
2018-08-14 (ISO 8601)
73-08-14 (Post Bomb)


This post is on a book that has been in my "to read" stack for about a decade. In the second year of my undergraduate studies1 I took an introductory course on differential geometry, which was a prelude to a course on general relativity. "Tensor Analysis on Manifolds" by Richard Bishop and Samuel Goldberg was one of the textbooks the course drew upon; however, the course itself was by no means constructed around the text, and was much more a product of the professor (Tevian Dray was his name). A consequence was that only isolated parts of the book were selected for supplementary reading, and so I read only enough to appreciate the authors' style and to realize that I should keep the book with me in case I ever got the urge to study it with any thoroughness. Fast-forward to summer 2018, and I finally gave it the attention it deserved.

Why did I pick it up after all these years? I remembered a neat trick from that differential geometry class which dealt with the generalizations of those wonderful differential operators one learns in vector calculus courses: divergence, gradient, curl, and the Laplacian. This trick handled the derivation of the curvilinear-coordinate versions of these operators automatically. The problem was that when I took the course I didn't quite understand it, which I attribute to my relative inexperience with mathematics at the time.

My original intent was to read only up to the coverage of Stokes' theorem -- I had little initial interest in reading much on Riemannian and semi-Riemannian geometry. However, after getting to Stokes' theorem I had enjoyed myself so much that I had to finish the book. I'm glad I did, because the best parts were in that last half which I had intended to leave untouched.

The book's title does a pretty good job of summarizing its content. Given two widely applicable mathematical structures, differentiable manifolds and vector spaces, what sorts of structures emerge when we combine them? The main objects this book considers as a result of this combination are tensor fields, which are tensor-valued functions on a manifold. Such entities have an enormous scope of applications in the physical sciences, perhaps most notably in general relativity. I'll summarize some of the most fascinating parts of the book below. My summary will not go into great detail; rather, the intent is to give the reader an idea of what can be learned from the text, should he try to pursue it. First, however, a clarification needs to be made.

What is a tensor?

Given a d-dimensional vector space, \(V\), and its dual, \(V^{\ast}\), a tensor, T, of type \((r, s)\) is a real-valued multi-linear function on \(V^{\ast r} \times V^s\) (that is to say, the first \(r\) arguments are from \(V^{\ast}\) and the last \(s\) arguments are from \(V\)).
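To make the definition concrete: an inner product (or a metric, more on that later) is a tensor of type \((0,2)\), since it takes two vectors and returns a number, and any linear map \(A: V \to V\) can be viewed as a tensor \(T_A\) of type \((1,1)\) via $$ T_A(\sigma, v) = \sigma(A v), \qquad \sigma \in V^{\ast},\ v \in V. $$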

Normally I like to let Wikipedia handle the bulk of the background material in my blog posts. However, when it comes to the internet's definition2 of a tensor, I find the state of affairs appalling3. Some examples, apart from the Wikipedia link above, are mathworld.wolfram and Quora. At the time of this writing, the definitions there seem to focus on the representation of tensors first and foremost as multi-dimensional arrays.

This is a bad starting point, because then you are obliged to spell out the rules for how these multi-dimensional arrays operate on vectors and their duals, and how they are also equivalent to other multi-dimensional arrays through the correct change-of-basis operations. What's more, it's very clumsy to discuss the role of dual vectors when you start from this perspective, and so many definitions just gloss over the distinction and speak of tensors of type (r): arrays with r indices. That isn't even true in general, because the basis transformations of the dual and non-dual parts work differently.
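To see why the distinction matters, consider how the arrays actually transform. If vector components transform as \(v'^i = A^i_{k} v^k\) under a change of basis, then the components of a type (1,1) tensor pick up one factor of \(A\) and one of \(A^{-1}\): $$ T'^{i}_{j} = A^{i}_{k}\, (A^{-1})^{l}_{j}\, T^{k}_{l}, $$ while a type (0,2) tensor picks up two factors of \(A^{-1}\) and a type (2,0) tensor two factors of \(A\). All three are 2-index arrays on the page, but they are different kinds of objects.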

In contrast, the definition I give at the top of the section, which is the definition from the book I'm writing about, is completely accurate and much less obfuscating than what prevails on the internet now. I felt this needed to be put down so as to increase the proportion of sensible definitions on the web.

With that out of the way, let's summarize some of the best topics in the book, starting with...

Stokes' Theorem

In the last half of the 19th century, some wonderful mathematics arose that was instrumental to the formulation of classical electromagnetic theory. Among these mathematical developments was a generalization of the fundamental theorem of calculus in the form of the Kelvin-Stokes theorem. After some maturing of the fields of analysis and geometry, this theorem reached its modern form in the first half of the 20th century under the name of the Stokes-Cartan theorem, which is quite an elegant statement: $$ \int_{\partial \Omega} \omega = \int_{\Omega} d\omega. $$ Here \(\Omega\) is an n-dimensional region, \(\partial \Omega\) is its boundary, \(\omega\) is a rank (0, n-1) tensor field called an (n-1)-form, and \(d\omega\) is its exterior derivative (an n-form).
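The one-dimensional case is already familiar: take \(\Omega = [a,b]\), whose boundary \(\partial \Omega\) is the pair of endpoints (with signs encoding orientation), and let \(\omega = f\) be a 0-form, i.e. a function. Then \(d\omega = f'\,dx\) and the theorem reduces to the fundamental theorem of calculus: $$ \int_a^b f'(x)\, dx = f(b) - f(a). $$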

What's great about the Stokes-Cartan theorem is that it generalizes its progenitor to arbitrary dimensions. The older variant of Stokes' theorem, which is what a typical engineer or physicist will encounter during his or her undergraduate studies, only relates the surface integral of a vector field's curl over a 2-dimensional surface in 3-dimensional space to the line integral of that field around the surface's boundary.

The chapter featuring Stokes' theorem also covered the generalization of the differential operators I wrote of in the introduction of this post. It relies on the exterior algebra and the exterior derivative of differential forms. In general, the exterior derivative is the n-dimensional generalization of the differential operators on vector fields, and in this setting the familiar operators are given as follows: $$ d \equiv \text{gradient} $$ $$ *d \equiv \text{curl} $$ $$ *d* \equiv \text{divergence} $$ $$ *d*d \equiv \text{Laplacian}, $$ where \(*\) is the Hodge star operator. I should note that these equivalences hold on the appropriate domains: the Laplacian and gradient operate on scalar functions, while the divergence and curl operate on vector fields modeled, in this case, by 1-forms. This identification of the classical vector fields of vector calculus with 1-forms only works in orthonormal coordinates. Provided this condition is met, the correct curvilinear formulation of these operators just pops out of the calculation.
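To illustrate (this worked example is mine, not the book's): take a scalar \(f(r,\theta)\) in the plane with polar coordinates, where the orthonormal coframe is \(e^1 = dr\), \(e^2 = r\, d\theta\) and the Hodge star acts as \(*e^1 = e^2\), \(*e^2 = -e^1\). Then $$ df = \frac{\partial f}{\partial r}\, dr + \frac{\partial f}{\partial \theta}\, d\theta, \qquad *df = r \frac{\partial f}{\partial r}\, d\theta - \frac{1}{r}\frac{\partial f}{\partial \theta}\, dr, $$ $$ d*df = \left[\frac{\partial}{\partial r}\!\left(r\frac{\partial f}{\partial r}\right) + \frac{1}{r}\frac{\partial^2 f}{\partial \theta^2}\right] dr \wedge d\theta, $$ and applying \(*\) once more (with \(*(dr \wedge d\theta) = 1/r\)) gives $$ *d*df = \frac{1}{r}\frac{\partial}{\partial r}\!\left(r \frac{\partial f}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}, $$ which is exactly the polar-coordinate Laplacian, with no memorization required.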

Riemannian and Semi-Riemannian geometry

The analytical framework of tensor fields on manifolds developed in the first half of the book has a really nice payoff when the topic of Riemannian and semi-Riemannian geometry comes up. Covariant derivatives are introduced, along with the resultant torsion and curvature tensors. We are then treated to some rather marvelous theorems:

  • In the case of no torsion, there is a unique covariant derivative consistent with a given Riemannian metric: that is, there is only one way to translate reference frames in a parallel manner that preserves angles and distances (that's a relief!)
  • A geodesic, considered as a path on the tangent bundle TM, is an integral curve of a particular vector field G on TM. G is called the geodesic spray of the semi-Riemannian manifold, and in terms of the coordinates on TM, \(y_i\), \(i = 1 \dots 2d\), induced by the coordinates on M, \(x_i\), \(i = 1 \dots d\), it is given by $$ G = y^{i+d} Y_i - y^{j+d} y^{k+d} \left(\Gamma^{i}_{jk} \circ \pi\right) Y_{i+d}, $$ where repeated indices are summed, \(d\) is the dimension of the underlying manifold, \(Y_i\) are the coordinate vector fields on TM, \(\pi\) is the projection from TM to M, and \(\Gamma^{i}_{jk}\) are the Christoffel symbols of the second kind determined by the unique covariant derivative consistent with the metric.
  • Oh and those geodesics are indeed length minimizing... at least locally.

With G we can determine geodesics in a pretty simple way: all we have to do is determine its flow. Given coordinates \(x_i\) on a Riemannian manifold M, the geodesics obey the following system of differential equations: $$ \ddot{x}_i(t) = -\dot{x}_j(t)\,\dot{x}_k(t)\,\Gamma^{i}_{jk}(x_1(t), x_2(t), \dots, x_d(t)). $$ You might notice that the right side of the equation above is almost exactly the coefficient of \(Y_{i+d}\) in the definition of G. That's what I mean by simple: the flow just falls out of the field.

As a bit of fun, and to test my understanding of the material, I thought I'd derive the geodesic spray for a 2-dimensional sphere. The real work is in determining the covariant derivative. Let us take the radius to be 1, and let \(\theta \in [0,\pi]\) be the polar angle and \(\phi \in [0, 2\pi)\) be the azimuthal angle. We start off with the metric $$ m = d\theta \otimes d\theta + \sin^2(\theta)\, d\phi \otimes d\phi, $$ from which we get the Christoffel symbols $$ \Gamma^{\theta}_{\phi\phi} = -\sin(\theta) \cos(\theta), $$ $$ \Gamma^{\phi}_{\theta\phi} = \Gamma^{\phi}_{\phi\theta} = \cot(\theta); $$ the rest are 0. Putting all that together, we find the geodesic spray for the sphere, which yields the following system of differential equations determining its flow: $$ \ddot{\theta}(t) = \dot{\phi}^2(t) \sin(\theta(t)) \cos(\theta(t)), $$ $$ \ddot{\phi}(t) = -2\, \dot{\theta}(t)\, \dot{\phi}(t) \cot(\theta(t)). $$ Now, I didn't feel like actually solving this system analytically, so I just solved it numerically with an initial heading of 45 degrees from north, starting at the equator. It looks like this:

Numerical solution to the equations for a geodesic on the sphere. A nice great-circle arc, just as we'd expect

We do indeed get something that looks like a geodesic on a sphere, as these curves form great circles on the surface. This fact about the 2-sphere is well known, and you can see solutions worked out on a bunch of sites if you search "geodesic on the sphere". Typically these are derived using a different method, the calculus of variations, so I thought this little exercise was worth putting up.
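For completeness, here is a minimal sketch of the sort of numerical integration involved, using NumPy and SciPy's `solve_ivp` (the actual script I used isn't reproduced here, and the plotting is omitted):

```python
import numpy as np
from scipy.integrate import solve_ivp

def sphere_geodesic_rhs(t, y):
    """Geodesic equations on the unit sphere, reduced to first order.

    State vector y = (theta, phi, theta_dot, phi_dot); the accelerations
    come straight from the Christoffel symbols derived above.
    """
    theta, phi, theta_dot, phi_dot = y
    theta_ddot = phi_dot ** 2 * np.sin(theta) * np.cos(theta)
    phi_ddot = -2.0 * theta_dot * phi_dot * np.cos(theta) / np.sin(theta)
    return [theta_dot, phi_dot, theta_ddot, phi_ddot]

# Start on the equator heading 45 degrees from north: equal-magnitude
# components toward the pole (decreasing theta) and toward the east,
# normalized to unit speed.
y0 = [np.pi / 2, 0.0, -np.sqrt(0.5), np.sqrt(0.5)]

# A unit-speed geodesic on the unit sphere closes after arc length 2*pi.
sol = solve_ivp(sphere_geodesic_rhs, (0.0, 2.0 * np.pi), y0,
                rtol=1e-9, atol=1e-9, dense_output=True)

theta, phi = sol.y[0], sol.y[1]
# Cartesian coordinates for plotting the resulting great-circle arc.
x = np.sin(theta) * np.cos(phi)
y_cart = np.sin(theta) * np.sin(phi)
z = np.cos(theta)
```

The only real content here is the reduction of the second-order geodesic equations to a first-order system; everything else is boilerplate.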

We could also do a similar thing to determine orbits according to general relativity. Given the simple model of a black hole with the same mass as the sun, space-time has the Schwarzschild metric $$ c^2 m = -\left(1 - \frac{r_s}{r}\right) c^2\, dt \otimes dt + \left(1 - \frac{r_s}{r}\right)^{-1} dr \otimes dr + r^2\, d\theta \otimes d\theta + r^2 \sin^2(\theta)\, d\phi \otimes d\phi, $$ where \(r_s \approx 2950\) meters is the Schwarzschild radius for a body of the sun's mass. Below is a figure demonstrating a resulting orbit, which is just a geodesic in spacetime with the above metric.


Example of a relativistic orbit. Notice the precession of the orbit, which is not predicted by the Newtonian theory of gravitation.

I chose a very elliptic and close orbit to make the relativistic effect of orbital precession clearly visible. For instance, the orbit pictured here has an aphelion of \(1.56 \times 10^6\) meters. Mercury, by comparison, has an aphelion of \(6.9 \times 10^{10}\) meters.4
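For anyone who wants to reproduce an orbit like this, it may help to note the standard reduction of the problem (a textbook result, not the geodesic-spray route itself, though both give the same curves): restricting to the equatorial plane \(\theta = \pi/2\) and writing \(\tau\) for proper time along the worldline, the time and azimuthal symmetries of the metric give two conserved quantities per unit mass, $$ \varepsilon = \left(1 - \frac{r_s}{r}\right) c^2 \frac{dt}{d\tau}, \qquad \ell = r^2 \frac{d\phi}{d\tau}, $$ and combining these with the four-velocity normalization \(c^2 m(u,u) = -c^2\) reduces the radial motion to $$ \left(\frac{dr}{d\tau}\right)^2 = \frac{\varepsilon^2}{c^2} - \left(1 - \frac{r_s}{r}\right)\left(c^2 + \frac{\ell^2}{r^2}\right), $$ a one-dimensional problem that is easy to integrate numerically.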

Overall Impressions

This book was a hoot. The elementary material at the beginning was laid out very well, and after investing the mental effort to understand the first half of the book, the fun stuff fell into place beautifully. Just make sure to at least read over and think about the exercises scattered throughout the book; some contain results that are used in later sections, but they are stated in such a way that just reading them is sufficient (for example, they might say: prove that such-and-such is true).

Highly recommended for anyone with a scientific background who wants to learn some fun refinements of vector calculus, or who wants to learn some of the mathematics underlying general relativity.

Footnotes

  1. winter 2007, while I was still a student of physics rather than math
  2. By the internet's definition, I mean the sort of answers you'll get in the top 5 or so results from a search on a given word.
  3. I should note that although Wikipedia's first (and lengthiest) definition of a tensor easily falls under the type I'm criticizing, it has other, saner definitions much further down the page.
  4. Yes I'm aware I'm not labeling my axes... if you must know, it's in meters.