Editorial note to: Erwin Schrödinger, Dirac electron in the gravitational field I

Bernard S. Kay (York)

arXiv:1906.10765·physics.hist-ph·October 7, 2022

TL;DR

This paper provides a historical and mathematical analysis of Schrödinger's 1932 work on the Dirac equation in curved spacetime, highlighting its significance and the development of related concepts like the spin connection.

Contribution

It offers the first detailed historical and mathematical commentary on Schrödinger's 1932 paper, clarifying its role in the development of the Schrödinger-Lichnerowicz formula and spin connection.

Findings

01

Schrödinger's 1932 paper first derived the Schrödinger-Lichnerowicz formula.

02

Historical analysis of the debate on spin connection development.

03

Clarification of the conflict between Schrödinger's approach and those of other physicists.

Abstract

Editorial Note with a mathematical and historical introduction to a 1932 paper by Erwin Schrödinger on the generalization of the Dirac equation to a curved spacetime -- to appear in the 'Golden Oldie' section of the Journal of General Relativity and Gravitation alongside an English translation of that paper. The Schrödinger paper is of interest as the first place that the well-known formula $g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+\frac{R}{4}$ was obtained for the 'square' of the Dirac operator in curved spacetime. This formula is known by a number of names and we explain why we favour the name 'Schrödinger-Lichnerowicz formula'. We also aim to explain how the modern notion of 'spin connection' emerged from a debate in the physics journals in the period 1929-1933. We discuss the key contributions of Weyl, Fock and Cartan and explain how and why they were partly in conflict with the…

Equations

$i\gamma^{a}\partial_{a}\psi-m\psi=0,$

$\{\gamma^{a},\gamma^{b}\}=2\eta^{ab},$

$\nabla_{\mu}\psi=\partial_{\mu}\psi+\Gamma_{\mu}\psi$

$\Gamma_{\mu}(x)=-\frac{i}{4}\omega_{ab\mu}(x)\sigma^{ab}$

$\omega^{a}_{\ b\mu}=e^{a}_{\ \nu}\,\partial e^{\nu}_{\ b}/\partial x^{\mu}+e^{a}_{\ \nu}e^{\rho}_{\ b}\Gamma^{\nu}_{\rho\mu}.$

$\sigma^{ab}=\frac{i}{2}[\gamma^{a},\gamma^{b}].$

$i\gamma^{\mu}\nabla_{\mu}\psi-m\psi=0.$

$\gamma^{\mu}=e^{\mu}_{\ a}\gamma^{a}.$

$\{\gamma^{\mu},\gamma^{\nu}\}=2g^{\mu\nu}.$

$[\nabla_{\alpha},\nabla_{\beta}]v_{\gamma}=R_{\alpha\beta\gamma}{}^{\delta}v_{\delta}$

$[\nabla_{\alpha},\nabla_{\beta}]\psi=-\frac{i}{4}R_{\alpha\beta cd}\sigma^{cd}\psi$

$\nabla_{[\alpha}\nabla_{\beta]}\psi=\frac{1}{8}R_{\alpha\beta\delta\eta}\gamma^{\delta}\gamma^{\eta}\psi$

$u_{\alpha,i}=\frac{\partial u_{\alpha}}{\partial x^{i}}+\Lambda^{\beta}_{\alpha i}u_{\beta}$

$\nabla_{\rho}\gamma_{\nu}=\frac{\partial\gamma_{\nu}}{\partial x^{\rho}}-\Gamma^{\mu}_{\nu\rho}\gamma_{\mu}-\Gamma_{\rho}\gamma_{\nu}+\gamma_{\nu}\Gamma_{\rho}.$

$(g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+\frac{R}{4})\psi=0.$

$0=(-i\gamma^{\mu}\nabla_{\mu}-m)(i\gamma^{\nu}\nabla_{\nu}\psi-m\psi)=\left(\gamma^{\mu}\gamma^{\nu}(\nabla_{(\mu}\nabla_{\nu)}+\nabla_{[\mu}\nabla_{\nu]})+m^{2}\right)\psi$

$(g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+\frac{1}{8}R_{\mu\nu\delta\eta}\gamma^{\mu}\gamma^{\nu}\gamma^{\delta}\gamma^{\eta})\psi$

$i\gamma^{\mu}(\partial_{\mu}+\Gamma_{\mu}-ieA_{\mu})\psi-m\psi=0.$

Full text

Department of Mathematics, University of York, York YO10 5DD

Editorial note to:

Erwin Schrödinger, Dirac electron in the gravitational field I

Bernard S. Kay

Abstract

We aim to give a mathematical and historical introduction to the 1932 paper “Dirac equation in the gravitational field I” by Erwin Schrödinger on the generalization of the Dirac equation to a curved spacetime and also to discuss the influence this paper had on subsequent work. The paper is of interest as the first place that the well-known formula $g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+R/4$ was obtained for the ‘square’ of the Dirac operator in curved spacetime. This formula is known by a number of names and we explain why we favour the name ‘Schrödinger-Lichnerowicz formula’. We also aim to explain how the modern notion of ‘spin connection’ emerged from a debate in the physics journals in the period 1929-1933. We discuss the key contributions of Weyl, Fock and Cartan and explain how and why they were partly in conflict with the approaches of Schrödinger and several other authors. We reference and comment on some previous historical accounts of this topic.

Keywords:

Schrödinger history Dirac equation spin connection curved spacetime Schrödinger-Lichnerowicz formula square of Dirac operator

1 Introduction

This Golden Oldie is a translation from the original German of Schr32. (Despite the numeral ‘I’ in the paper’s title, there doesn’t appear to have been a second instalment.) As we shall see, the paper is flawed in some ways but is nevertheless both mathematically and historically interesting. It seems to have been the first place that the formula for the square of the curved-spacetime Dirac operator (Equation (12) below) was obtained. It is also one of a number of papers by several authors from the period 1928-33 which embodied an interesting scientific debate and from which the important notion of spin connection emerged. It is not one of those Golden Oldies about which one could justifiably say (with, in consequence, a rather short editorial note): “If you want to learn this topic, then read this original paper.” (Arguably that cannot be said of any of the other 1928-1933 papers either although, as we shall relate, two of the 1929 papers do give a correct account of the topic.) For that purpose the reader would surely be better off with the modern textbook literature. But there seems to be more worth saying than has been said in previous accounts of the history of the notion of spin connection, and it is hoped that what follows will go some way towards filling the gap.

Added Note October 2022: Since posting v6 of this paper on the arXiv and publishing it as KayGRG it was pointed out to me by Gary Gibbons and (indirectly) by Mike Stone that aside from the rediscovery of Schrödinger’s formula for the ‘square’ of the curved-spacetime Dirac Operator in 1962/3 by André Lichnerowicz, it was also rediscovered in 1963 by Asher Peres. Gary Gibbons and I have prepared a brief addendum to the present paper which mainly discusses that latter rediscovery and which it is intended to publish separately. The present v7 is identical to v6 except that the essential content of that addendum is appended before the bibliography. The five papers newly cited in the addendum have also been added at the end of the bibliography.

2 Background

The proposal of matrix mechanics by Heisenberg in 1925, together with that of wave mechanics by Schrödinger in 1926, surely constitutes one of the biggest ever upheavals in our understanding of the laws of physics. The potential implications of these proposals for physics, and also for chemistry and mathematics, were vast.

The opportunities these proposals opened up were very apparent to physicists and mathematicians at the time, and it is remarkable how rapid progress was in the next few years. The hydrogen atom was quickly solved (by Pauli in 1926) and the basis for quantum chemistry laid, and around 1926-30 the basis was also laid for quantum field theory (see (Weinberg, Chapter 1)). On the side of the mathematical formalism, Dirac (after an earlier attempt by Schrödinger himself) showed the equivalence of Heisenberg’s and Schrödinger’s theories in 1927, later (1932) to be re-worked in a fully mathematically rigorous way by von Neumann.

And, in 1928, Dirac came up with his celebrated equation

$$i\gamma^{a}\partial_{a}\psi-m\psi=0,\qquad(1)$$

where

$$\{\gamma^{a},\gamma^{b}\}=2\eta^{ab},\qquad(2)$$

where $\eta_{ab}$ is the Minkowskian metric (which we shall take here to have signature $(+,-,-,-)$ ). Dirac’s equation, by itself, opened up a vast range of further questions. Amongst these, an interesting and important matter of principle was raised by the

Question How, even just locally, can the notion of a Dirac wavefunction $\psi$ be generalized to a general curved spacetime? And, assuming that is solved, how (again, even just locally) can one generalize Dirac’s equation and thus achieve compatibility with Einstein’s theory of general relativity (which dated back to 1915)?

The first part of our Question, in modern language, amounts to the problem of equipping our spacetime with a spin structure and, if we are only interested in doing things locally, presents no difficulty. But the second requires the generalization of the ‘ $\partial_{a}$ ’ of (1) to a notion of covariant derivative suitable to act on the spinor $\psi$ – or, in other words, the provision of a suitable notion of parallel-transport for $\psi$ , or, in yet other words, the provision of a suitable notion of ‘spin connection’.

Of course, with the mathematical concepts that have been in our possession since the 1950s [Footnote 1], this is no longer a challenge: In one way of saying things, one can start with the usual (torsion-free) metric connection of general relativity, say viewed as a notion of horizontality on the bundle, $B$, of Lorentz frames of our spacetime (whose structure group is the Lorentz group). This will induce a connection on the (always at least locally defined) bundle of spin frames (which doubly covers $B$ and has, as structure group, the double covering, $SL(2,\mathbb{C})$, of the Lorentz group) and that, in turn, will, by a standard construction, induce a connection on the associated bundle of spinors (which again always exists at least locally) in which our Dirac field, $\psi$, is a cross-section.

Footnote 1: Here is some basic information on the history of Riemannian geometry, fibre bundles, gauge theories and spinors since 1950: The reformulation of earlier notions of connection in terms of a covariant derivative operator (say in the direction of a vector $X$) $\nabla_{X}$ ($=X^{a}\nabla_{a}$) satisfying linearity, additivity, Leibniz rule etc., as explained in most modern general relativity textbooks, is based on the work of Koszul in 1950. Also in 1950, Ehresmann (a student of Cartan) reworked and generalized to what we would call general gauge (Lie) groups, the notion of connection in terms of a notion of horizontality in a principal fibre bundle – see e.g. BishCritt or Trautman and, for a brief account of how the history of gauge theories slots in to the history of the theory of fibre bundles, see Section I in Neeman. The notion of spin structure (for $SO(n)$) was introduced by Haefliger (a student of Ehresmann) in 1956 who also found a criterion (the vanishing of the second Stiefel-Whitney class) for its global existence on an orientable Riemannian manifold. For the history of spinors, see (Berger, Section 6b) and/or watch the YouTube video Atiyah.

In terms of a vierbein, $e^{a}_{\ \mu}$ (which coordinatises $B$ and which satisfies $e^{a}_{\ \mu}(x)e^{b}_{\ \nu}(x)\eta_{ab}=g_{\mu\nu}$ , $e^{\mu}_{\ a}(x)e^{\nu}_{\ b}(x)g_{\mu\nu}=\eta_{ab}$ ; and we note that we shall raise and lower Latin indices with $\eta_{ab}$ , and Greek indices with $g_{\mu\nu}$ ) the resulting connection, when re-expressed as a covariant derivative operator, may be written (see e.g. BirrellDavies ; Lawrie ; GSW ):

$$\nabla_{\mu}\psi=\partial_{\mu}\psi+\Gamma_{\mu}\psi\qquad(3)$$

where

$$\Gamma_{\mu}(x)=-\frac{i}{4}\omega_{ab\mu}(x)\sigma^{ab}\qquad(4)$$

where $\omega_{ab\mu}$ are the components of the spin connection, given in terms of the usual Christoffel symbol for the (torsion free) metric connection, $\Gamma^{\nu}_{\rho\mu}$ , by

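$$\omega^{a}_{\ b\mu}=e^{a}_{\ \nu}\,\frac{\partial e^{\nu}_{\ b}}{\partial x^{\mu}}+e^{a}_{\ \nu}e^{\rho}_{\ b}\Gamma^{\nu}_{\rho\mu}.\qquad(5)$$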

Here $\sigma^{ab}$ (defined so that, if $\lambda^{a}_{\ b}$ is the generator of Lorentz transformation on 4-vectors, then $-\frac{i}{4}\lambda_{ab}\sigma^{ab}$ is the generator of Lorentz transformations on Dirac wavefunctions) is given by

$$\sigma^{ab}=\frac{i}{2}[\gamma^{a},\gamma^{b}].\qquad(6)$$

The answer to the second part of our Question is then that the Dirac equation should be replaced, in a curved spacetime, by the equation

$$i\gamma^{\mu}\nabla_{\mu}\psi-m\psi=0.\qquad(7)$$

where $\nabla_{\mu}$ is as in (3) and

$$\gamma^{\mu}=e^{\mu}_{\ a}\gamma^{a}.\qquad(8)$$

We note, in passing, that, in view of (2), the $\gamma^{\mu}$ satisfy

$$\{\gamma^{\mu},\gamma^{\nu}\}=2g^{\mu\nu}.\qquad(9)$$
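The passage from (2) to (9) via (8) can be checked numerically at a single spacetime point. The sketch below is only an illustration under stated assumptions: it uses the Dirac representation of the $\gamma^{a}$ (one standard choice), an arbitrary made-up inverse vierbein `E_INV` standing for $e^{\mu}_{\ a}$, and the relation $g^{\mu\nu}=e^{\mu}_{\ a}e^{\nu}_{\ b}\eta^{ab}$, which is equivalent to the vierbein conditions quoted earlier. All helper names are hypothetical, and plain Python lists are used so that no third-party package is assumed.

```python
# Minkowski metric with signature (+,-,-,-); numerically eta^{ab} equals eta_{ab}.
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# Flat gamma matrices gamma^a in the Dirac representation (an assumed choice).
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]

# Hypothetical inverse vierbein e^mu_a at one point: E_INV[mu][a].
E_INV = [
    [1.0, 0.2, 0.0, 0.1],
    [0.3, 1.5, 0.0, 0.0],
    [0.0, 0.4, 0.8, 0.0],
    [0.5, 0.0, 0.1, 2.0],
]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def anticommutator(A, B):
    """Return AB + BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] + BA[i][j] for j in range(4)] for i in range(4)]


def curved_gammas(e_inv):
    """Equation (8): gamma^mu = e^mu_a gamma^a."""
    return [[[sum(e_inv[mu][a] * GAMMA[a][i][j] for a in range(4))
              for j in range(4)] for i in range(4)] for mu in range(4)]


def inverse_metric(e_inv):
    """g^{mu nu} = e^mu_a e^nu_b eta^{ab}."""
    return [[sum(e_inv[m][a] * e_inv[n][b] * ETA[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]


def max_deviation(gammas, metric):
    """Largest entry of {gamma^m, gamma^n} - 2 * metric[m][n] * identity."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            ac = anticommutator(gammas[m], gammas[n])
            for i in range(4):
                for j in range(4):
                    target = 2 * metric[m][n] if i == j else 0
                    worst = max(worst, abs(ac[i][j] - target))
    return worst


flat_dev = max_deviation(GAMMA, ETA)                        # checks (2)
curved_dev = max_deviation(curved_gammas(E_INV), inverse_metric(E_INV))  # checks (9)
print(flat_dev, curved_dev)
```

Both printed deviations should sit at rounding level. Nothing in the check depends on the particular numbers in `E_INV`, which reflects the fact that (9) follows from (2) and (8) by linearity alone.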

Let us also note, for later reference, that the familiar equation

$$[\nabla_{\alpha},\nabla_{\beta}]v_{\gamma}=R_{\alpha\beta\gamma}{}^{\delta}v_{\delta}$$

for the commutator of two covariant derivatives on (say) a covariant vector field, $v_{\alpha}$ , has, as an analogue, the equation

$$[\nabla_{\alpha},\nabla_{\beta}]\psi=-\frac{i}{4}R_{\alpha\beta cd}\sigma^{cd}\psi$$

for a spinor field, $\psi$, and this formula can be inferred just from the fact that $\nabla_{\alpha}$ has the general properties of a connection without any need to invoke an explicit formula for $\Gamma_{\mu}$. Moreover, by (6) and (8), this may be rewritten as

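$$\nabla_{[\alpha}\nabla_{\beta]}\psi=\frac{1}{8}R_{\alpha\beta\delta\eta}\gamma^{\delta}\gamma^{\eta}\psi\qquad(10)$$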

which, we notice, no longer makes any reference to a choice of vierbein.
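As a hedged unpacking of the step from the spinor commutator to (10), the worked lines below assume the normalisation $\nabla_{[\alpha}\nabla_{\beta]}=\frac{1}{2}(\nabla_{\alpha}\nabla_{\beta}-\nabla_{\beta}\nabla_{\alpha})$, which the text does not state explicitly. Beyond that they use only (6), the antisymmetry of $R_{\alpha\beta cd}$ in its last two indices, and (8) together with $R_{\alpha\beta cd}=R_{\alpha\beta\delta\eta}e^{\delta}_{\ c}e^{\eta}_{\ d}$ for the final equality.

```latex
\begin{align*}
\nabla_{[\alpha}\nabla_{\beta]}\psi
 &= \tfrac{1}{2}\,[\nabla_{\alpha},\nabla_{\beta}]\,\psi
  = -\tfrac{i}{8}\,R_{\alpha\beta cd}\,\sigma^{cd}\,\psi \\
 &= -\tfrac{i}{8}\cdot\tfrac{i}{2}\,R_{\alpha\beta cd}\,[\gamma^{c},\gamma^{d}]\,\psi
  = \tfrac{1}{16}\,R_{\alpha\beta cd}\,
    \bigl(\gamma^{c}\gamma^{d}-\gamma^{d}\gamma^{c}\bigr)\psi \\
 &= \tfrac{1}{8}\,R_{\alpha\beta cd}\,\gamma^{c}\gamma^{d}\,\psi
  = \tfrac{1}{8}\,R_{\alpha\beta\delta\eta}\,\gamma^{\delta}\gamma^{\eta}\,\psi .
\end{align*}
```

The vierbein enters only through the combination $e^{\delta}_{\ c}\gamma^{c}=\gamma^{\delta}$, which is why the final expression can be written without it.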

But back in 1928, all this was yet to come. And, even though the theory of spinors on vector spaces had been developed considerably by Cartan (who, already in 1913, had discovered what would later be understood as spinor representations of Lie Algebras) and by van der Waerden (see e.g. his article ‘Spinoranalyse’ vdWaerden29 and, for a historical account, Schneider) and others, the answer to our above Question posed a serious challenge. The answer emerged from a debate in the physics journals and is fascinating to read, involving a number of wrong turnings and unnecessary detours as well as misunderstandings and disagreements. (Other historical accounts can be found in (vdWaerden60, Section 12), Kichenassamy, Schneider, and (Goenner, Section 7.2.2) some of which, however, appear to differ from us here in their detailed conclusions [Footnote 2] on the matter of Schrödinger’s paper and the several papers that it refers to.) And, though it is now clear that the answer (i.e. essentially the content of Equations (3), (4), (5), (6) above) had appeared (in the papers of Weyl WeylNAS; WeylZ and Fock Fock) in 1929, it seems to have taken quite a few years before that answer came to be generally accepted as the standard answer in the physics literature (and for less satisfactory viewpoints such as that of Schr32 to fall out of the mainstream – see e.g. where we mention the work of Brill and Wheeler below). It took a similarly long time before that answer came to be incorporated, as an accepted part of the general folklore, in mathematics (see e.g. (Berger, Section 6b) for the history and LawMich for a modern (1989) textbook treatment).

Footnote 2: While, in (vdWaerden60, Section 12), van der Waerden (fairly) points out that the paper InfeldvdW by Infeld and van der Waerden follows and further simplifies Weyl WeylNAS; WeylZ and Fock Fock, and that the approaches of Tetrode Tetrode and of Schrödinger Schr32 and of Bargmann Bargmann are more complicated, we caution the reader that (despite what Weyl and Cartan had taught us much earlier! – see below) it misleadingly says that the approaches of Weyl and of Fock, and of Tetrode and of Schrödinger and of Bargmann are all “equivalent from the physical point of view”. We urge similar caution regarding similar misleading statements in other historical surveys – e.g. in Kichenassamy and Schneider. For more about Schrödinger’s general motivations and interests in the relevant period, which mention, but are not confined to, his 1932 paper, see Ruger, Urbantke and Dorling. And for more about the general motivations and interests of theoretical physicists of that period, we recommend the blog article Motl by Lubos Motl.

Once the solution had been digested though, the opportunities it opened up were enormous and included, in 1963, the important (Riemannian) Dirac-operator example of the Atiyah-Singer index theorem Atiyah63, followed by Lichnerowicz’ proof Lichner63 (more about which below) of the vanishing of the index (i.e. the Hirzebruch $\hat{A}$-genus) in that example for even-dimensional compact Riemannian manifolds admitting spin structures with non-negative (but not identically zero) scalar curvature – which (together with the supergravity-based arguments for positive mass due to Deser and Teitelboim and Grisaru [as cited in Witten]) was one of the acknowledged influences on Witten’s 1981 alternative proof Witten of Schoen and Yau’s positive mass theorem. Further developments include Connes’ noncommutative geometry Connes and (via InfeldvdW) the applications of spinors to General Relativity due to Newman, Penrose and others PenRind, (Wald, Chapter 13) $\dots$, not to mention the obvious relevance of the curved spacetime Dirac equation to quantum theory in curved spacetime (on which, by the way, Schrödinger also wrote a pioneering paper Schr39 in 1939) and the rôle the Dirac equation in curved spacetime plays in supersymmetry, supergravity (see also Footnote 11), string theory, and much more.

3 Tetrode and Wigner, Weyl and Fock, Schrödinger, and Cartan

We now turn to a detailed discussion of the Schrödinger paper. We will attempt to indicate what it did that was original and had influence on later work and/but also to indicate its shortcomings. To this end, it is useful to begin with a critical examination of what Schrödinger writes in his introduction:

He begins by citing a number of papers by previous authors – Wigner, Tetrode, Fock, Weyl, Zaycoff and Podolsky [Footnote 3]. (In our language) he points out that most of them use vierbeins and remarks – seemingly of all the papers that use vierbeins – that “it is a little bit difficult to recognize whether Einstein’s idea of teleparallelism, to which reference is partly made, really enters or whether one is independent of it” and he explains that anyway he prefers not to use vierbeins because they are “more complicated” (i.e. than tensors). On the other hand, he commends one of the authors, namely Tetrode, for replacing the commutation relations (2) by the curved spacetime form (9) without any mention of vierbeins.

It helps here, first, to briefly describe what Tetrode, and, after him, Wigner do: Basically, in Tetrode, Tetrode writes down Dirac’s equation in a curved spacetime as in (7), replacing the $\gamma^{a}$ of (2) by the $\gamma^{\mu}$ of (9). (Incidentally, the fact that the relations (9) are related to those of (2) by (8) seems to be the essential content of the paper, FockIvanenko, of Fock and Ivanenko.) However, while Tetrode replaces the Minkowskian partial derivative $\partial/\partial x^{a}$ by a general coordinate partial derivative $\partial/\partial x^{\mu}$, he doesn’t attempt to replace that with a covariant derivative of any sort. He simply notes that it seems difficult to see what can replace the invariance property of the flat spacetime Dirac equation under Lorentz transformations in the curved spacetime case. Then, in Wigner, Wigner argues that this difficulty could be resolved if one were to relate Tetrode’s curved spacetime gamma matrices to Dirac’s original gamma matrices using a ‘vierbein’ (as in (8)) – however not the usual notion of vierbein, but rather the notion of ‘vierbein’ used in Einstein’s teleparallelism theory [Footnote 4]. Whereas with the usual notion of vierbein we consider in general relativity, the Lorentz group acts as a local gauge group, in the Einstein teleparallel theory the vierbein consists of four globally defined vector fields, which transform under a single global action of the Lorentz group. Wigner shows that a suitably symmetrized version of Tetrode’s Dirac equation (where $\partial/\partial x^{\mu}$ is replaced by a ‘covariant derivative’ which, however, acts on spinors as if they were scalars) is invariant under this single global action of the Lorentz group.

Footnote 3: A number of these papers – namely, those Tetrode; Fock; WeylZ of Tetrode, Fock and Weyl, as well as that Schr32 of Schrödinger and that InfeldvdW of Infeld and van der Waerden – are reprinted in the recent volume Half along with some commentary on (inter alia) the work of these authors in the article by Alexander Blum which forms Chapter 5 of that volume. I thank Alexander Blum for drawing my attention to this reference and also for helpful conversations about the work of Tetrode and Wigner.

Footnote 4: Ironically, teleparallelism currently seems to be in the midst of one of its periodic revivals in work on alternative theories of gravity. For its early history, see e.g. (Goenner, Sections 6.4 and 7.2).

Schrödinger (rightly in the modern view) doesn’t want to adopt such a theory, based on teleparallelism. But he errs in seeming to imply that all the papers which use vierbeins rely on teleparallelism. Fock and Weyl don’t! [Footnote 5] In April 1929, Weyl gives in WeylNAS (see also the slightly later, fuller version WeylZ) what we would regard as the right solution to our Question for his two-component version of the massless Dirac equation (which is introduced in the same work) while in July 1929, Fock gives essentially the same solution for the original version (1) of the massive Dirac equation. (For a historical discussion of the work of Weyl and of Fock, see Scholz.) Furthermore, regarding Schrödinger’s wish to anyway avoid vierbeins because they are complicated, well it is clear to us now that one can’t avoid them! At least not if one wants to have an explicit expression for the covariant derivative of a spinor wave function.

Footnote 5: Podolsky Podolsky and Zaycoff Zaycoff both appear to be concerned with building/commenting on what Fock and Weyl had already done and so we may omit further discussion of them.

Indeed anything that pertains to the bundle of spinors (including the question of the global existence of Dirac wavefunctions themselves) requires the use of vierbeins in the sense that it requires reference to a double covering of the bundle of Lorentz frames and, because the double cover needs to be taken, we cannot revert to the bundle of general linear frames.

This is clear from the well known fact ((Cartan66, Sections 85 and 177) and e.g. (GSW, Page 272)) that, for $n>2$, (the connected component of the identity of) $GL(n,\mathbb{R})$ (even though doubly connected) has no finite-dimensional multivalued (i.e. ‘spinor’) representations. (The proof is straightforward once one observes that one can replace $GL(n,\mathbb{R})$ by $SL(n,\mathbb{R})$ and that, while this is doubly connected, its complexification, $SL(n,\mathbb{C})$, is simply connected.)

In fact, this seems to have already been clear to Weyl back in 1929 (and maybe also to Fock) and, at least by 1937, to Cartan: Weyl writes, in WeylNAS ,

“We need such local cartesian axes e(a) in each point $P$ in order to be able to describe the quantity $\psi$ by means of its components $\psi_{1}^{+}$ , $\psi_{2}^{+}$ ; $\psi_{1}^{-}$ , $\psi_{2}^{-}$ , for the law of transformation of the components $\psi$ can only be given for orthogonal transformations as it corresponds to a representation of the orthogonal group which cannot be extended to the group of all linear transformations. The tensor calculus is consequently an unusable instrument for considerations involving the $\psi$ .”

Weyl then appends an endnote to this, writing: “Attempts to employ only the tensor calculus have been made by Tetrode (Z. Physik, 50, 336 (1928)); J. M. Whittaker (Proc. Camb. Phil. Soc., 25, 501 (1928)), and others; I consider them misleading.”

In Sections 85 and 177 of his 1937 book Cartan66 ‘The Theory of Spinors’, Cartan proves the ‘well known’ result we mentioned above and then (without referring to Weyl) makes precise Weyl’s sentence “The tensor calculus is consequently an unusable instrument for considerations involving the $\psi$.” (equivalently my own above sentence “$\dots$ we cannot revert to the bundle of general linear frames”) with the last theorem of his book which is worth quoting in full:

Cartan’s Theorem. With the geometric sense we have given to the word “spinor” it is impossible to introduce fields of spinors into the classical Riemannian technique; that is, having chosen an arbitrary system of co-ordinates $x^{i}$ for the space, it is impossible to represent a spinor by any finite number $N$ whatsoever, of components $u_{\alpha}$ such that the $u_{\alpha}$ have covariant derivatives of the form

$$u_{\alpha,i}=\frac{\partial u_{\alpha}}{\partial x^{i}}+\Lambda^{\beta}_{\alpha i}u_{\beta}$$

where the $\Lambda^{\beta}_{\alpha i}$ are determinate functions of $x^{h}$. [Footnote 6]

Footnote 6: We note here that it is possible, though, to consider spinors as transforming under such a formula if one allows spinors to transform at each point under a nonlinear realization of $GL(n,\mathbb{R})$. See OgPon; Pitts. I thank Brian Pitts for drawing these references to my attention.

Cartan makes his reasons for stating this theorem clear on the previous page (Page 150) where, referring specifically to Schrödinger’s paper Schr32 in a footnote [Footnote 7], he writes:

“Other physicists, not wishing to employ local Galilean reference frames, have sought to generalize Dirac’s equations by using the classical technique of Riemannian geometry. $\dots$ We shall see that if we adopt this point of view and wish to continue to regard spinors as well-defined geometric entities, which behave as tensors in the most general sense of that term, then the generalization of Dirac’s equations will become impossible.”

Footnote 7: Note that a seeming reference to the same paper of Schrödinger in an earlier footnote on the same page (Page 150) of Cartan66 seems to be a typographical error and to have been meant to refer to the paper of Fock; in fact the journal, volume and page numbers are those of Fock’s paper.

4 What Schrödinger Does

We next discuss how Schrödinger manages to partly get around these objections. (But of course there will be questions that he is unable to address.) To attempt to avoid using vierbeins, Schrödinger essentially focuses, not on the covariant derivative which acts on spinor wavefunctions, but rather on the covariant derivative which acts on the gamma matrices. In modern language, this may be understood in terms of a connection on the Clifford bundle whose fibres above each spacetime point are generated by the gamma matrices at that point. He is right that this connection does not (need to) involve vierbeins. This is because the structure group of this Clifford bundle is obviously the Lorentz group and not its double cover, and a connection on it can therefore be regarded as associated to a connection on the bundle of Lorentz frames and this, in its turn, can of course be obtained by restriction from a connection (with structure group $GL(4,\mathbb{R})$) on the bundle of general linear frames. Schrödinger (and Fock before him) essentially argue that this covariant derivative should take the form (cf. Equation (S8)) [Footnote 8]

$$\nabla_{\rho}\gamma_{\nu}=\frac{\partial\gamma_{\nu}}{\partial x^{\rho}}-\Gamma^{\mu}_{\nu\rho}\gamma_{\mu}-\Gamma_{\rho}\gamma_{\nu}+\gamma_{\nu}\Gamma_{\rho}.\qquad(11)$$

and should be required to vanish! Here, $\Gamma_{\mu}$ is (apart from an ambiguity in its sign [which is, we might say, the cause of the vierbein trouble] and up to the addition of an undetermined term of form $B_{\mu}I$ where $B_{\mu}$ is a covariant vector field and $I$ the identity operator – more about this term below) taken to be the same as the $\Gamma_{\mu}$ of Equation (4) above which gives us the covariant derivative on spinor wave functions.

Footnote 8: We use an ‘S’ in front of an equation number to indicate that it is an equation number in Schr32.
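To see why a term of the form $B_{\mu}I$ is left undetermined, the following worked lines (a sketch added for illustration, not a quotation from Schr32) track what happens to the two $\Gamma_{\rho}$ terms in the covariant derivative of $\gamma_{\nu}$ displayed in Equation (11) under the shift $\Gamma_{\rho}\to\Gamma_{\rho}+B_{\rho}I$. The only fact used is that the identity commutes with every matrix.

```latex
\begin{align*}
-\Gamma_{\rho}\gamma_{\nu}+\gamma_{\nu}\Gamma_{\rho}
 \;\longrightarrow\;&
 -(\Gamma_{\rho}+B_{\rho}I)\gamma_{\nu}+\gamma_{\nu}(\Gamma_{\rho}+B_{\rho}I) \\
 =\;& -\Gamma_{\rho}\gamma_{\nu}+\gamma_{\nu}\Gamma_{\rho}
      -B_{\rho}\,\bigl(I\gamma_{\nu}-\gamma_{\nu}I\bigr) \\
 =\;& -\Gamma_{\rho}\gamma_{\nu}+\gamma_{\nu}\Gamma_{\rho}.
\end{align*}
```

So the requirement $\nabla_{\rho}\gamma_{\nu}=0$ constrains $\Gamma_{\rho}$ only through its commutator with the gamma matrices and cannot fix $B_{\rho}$.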

What Schrödinger can’t do, however, is obtain an explicit formula for $\Gamma_{\mu}$ because, as we explained above, that would require the use of vierbeins. (The reader may verify that, indeed, nowhere in the paper is there an equation with $\Gamma_{\mu}$ standing alone on its left side!) [Footnote 9]

Footnote 9: Actually, related to Footnote 6, if one introduces a suitable notion for a preferred matrix square root, $r_{\mu\nu}$, of $g_{\mu\nu}$ (thought of as a 4 x 4 matrix in each coordinate system) then one can find, in terms of it, a vierbein-free formula for $\Gamma_{\mu}$. However, $r_{\mu\nu}$ will necessarily transform nonlinearly under general coordinate transformations. See again OgPon; Pitts. I again thank Brian Pitts for pointing this out to me.

Let us pause to make three further historical remarks here. First, in their paper InfeldvdW, Infeld and van der Waerden criticise Schrödinger’s paper for never giving an explicit formula for $\Gamma_{\mu}$, albeit they fail to make the (stronger) point that this would be impossible in principle without using vierbeins (which they themselves do, though, use). In fact essentially what InfeldvdW does is to adapt what Fock and Weyl had done to a formalism in which the Dirac equation is viewed as a pair of coupled 2-spinor equations and this was influential in (and referenced in) the later work of Penrose and Newman and others. (See e.g. Penrose60; Penrose65; PenRind and (Wald, Chapter 13).)

Secondly, an explicit formula for $\Gamma_{\mu}$ was later obtained in a sort of addendum to the Schrödinger paper, Bargmann, written by Bargmann who was then a pre-doctoral student with Schrödinger in Berlin (soon after to flee Nazi Germany to Zürich where he obtained his PhD under the supervision of Wentzel). Of course, to do this, it uses vierbeins – as it must! – and so brings us back around a circle to what Weyl and Fock had done in the first place. (By the way, related to our ‘mainstream’ remark above, Brill and Wheeler’s 1957 article BrillWheel on neutrinos in gravitational fields adopts Bargmann’s Schrödinger-inspired way of explaining spin connections [rather than Weyl’s or Fock’s] giving pride of place to the above Equation (11) [and even adopting the view of Schrödinger (and of Fock) that part of $\Gamma_{\mu}$ may be identified with an electromagnetic 4-potential, $A_{\mu}$ – as we will discuss further below].)

Lastly, it is interesting to ask what Dirac himself knew or did about generalizing his equation to curved spacetimes. In 1935, he wrote a paper Dirac35 generalizing the Dirac equation to de Sitter space and to anti de Sitter space and in 1936, a further paper Dirac36 generalizing the Dirac equation to “a four-dimensional surface of a hyperquadric in five-dimensional projective space”. These appear to have been in line with Dirac’s quest for beauty in his equations, beauty here being interpreted as a high degree of spacetime symmetry. However, it was only in 1958, with his paper Dirac58, that Dirac addressed the question of generalizing his equation to a general curved spacetime and cited the earlier work on this topic by (i.a.) Cartan (our reference Cartan66), Fock Fock, Infeld and van der Waerden InfeldvdW, Schrödinger (i.e. our main paper of interest) Schr32, Tetrode Tetrode and Weyl WeylZ – though without indicating e.g. the shortcomings that we have indicated concerning the work in Tetrode and Schr32. He then proceeds to give his own solution to the problem which, as he indicates, is on similar lines to Fock’s solution, except that it is couched in terms of the Dirac matrices, $\alpha^{i}$ and $\beta$, related to the Dirac gamma matrices (i.e. the $\gamma^{a}$ of (2)) by $\alpha^{i}=\gamma^{0}\gamma^{i}$ and $\beta=\gamma^{0}$. Dirac claims that his approach is “rather more direct than Fock’s” and that it has some other advantages.

Returning to Schrödinger’s paper, while it adopts an approach that is incapable of giving an explicit formula for $\Gamma_{\mu}$ , this doesn’t prevent it from doing a number of things which don’t require vierbeins and don’t require such an explicit formula. First it looks at quantities, such as $\bar{\psi}\gamma^{\mu}\psi$ , but including more general tensor quantities, which are sesquilinear in $\psi$ , for which the covariant derivative obviously doesn’t need to involve vierbeins; e.g. $\bar{\psi}\gamma^{\mu}\psi$ itself is just an ordinary vector field!

Secondly – and this was undoubtedly the most important new result in Schrödinger’s paper – it obtains (see Equation (S74)) the formula (in our conventions and setting the electromagnetic field to zero):

$$g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+\frac{R}{4}\qquad(12)$$

for the square of the Dirac equation (more precisely for the middle expression in (13)). [Footnote 10: The two factors of $\sqrt{g}$ in Equation (S74) appear to be unnecessary (but harmless). (Added October 2022: See however the last paragraph of the Addendum.) The difference in sign in front of the $\frac{1}{4}R$ term between (12) and (S74) is due to our different signature conventions.]

Let us remark about this, first, that thirty years later, the analogue of this formula for an even-dimensional Riemannian manifold was discovered by Lichnerowicz in Lichner63, where it is used as a mathematical tool to prove his theorem which we mentioned above.

Secondly, as we indicated above and will discuss further below, Schrödinger’s Equation (S74) is actually a generalization of (12) to include an external electromagnetic field (with $\nabla_{\mu}$ now meaning $\partial_{\mu}+\Gamma_{\mu}-ieA_{\mu}$) in which the term $R/4$ above is replaced by $R/4+\frac{1}{2}\sigma^{ab}F_{ab}$, where $F_{ab}$ is the electromagnetic field-strength tensor $\partial_{a}A_{b}-\partial_{b}A_{a}$.

Thirdly, that this formula can be derived without the use of vierbeins, and without the need for an explicit formula for $\Gamma_{\mu}$, can be seen from Equation (10) which, as we remarked around that equation, can itself be derived without either. Indeed, we could multiply the (curved spacetime) Dirac equation (7) by $(-i\gamma^{\mu}\nabla_{\mu}-m)$, thus obtaining

[TABLE]

which, by (10), is equal to

[TABLE]

The last term in the brackets above is easily seen to be the same as the displayed expression after Equation (S73) and we can, at this point, join the derivation sketched in Schrödinger’s paper to see that it is equal (with our conventions) to $R/4$ .
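The algebra behind these steps can be sketched as follows. This is a reconstruction offered only as a reading aid, not a transcription of the displayed equations of the original note. It assumes that the gamma matrices are covariantly constant, $\nabla_{\mu}\gamma^{\nu}=0$, and that they satisfy the curved-spacetime Clifford relation $\{\gamma^{\mu},\gamma^{\nu}\}=2g^{\mu\nu}$; whether these are exactly the contents of the numbered equations referred to above is an assumption.

```latex
\begin{align*}
(-i\gamma^{\mu}\nabla_{\mu}-m)(i\gamma^{\nu}\nabla_{\nu}-m)\psi
  &= \left(\gamma^{\mu}\nabla_{\mu}\,\gamma^{\nu}\nabla_{\nu}+m^{2}\right)\psi
  && \text{(the cross terms $\pm im\gamma^{\mu}\nabla_{\mu}\psi$ cancel)}\\
  &= \left(\gamma^{\mu}\gamma^{\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}\right)\psi
  && \text{(assumption: $\nabla_{\mu}\gamma^{\nu}=0$)}\\
  &= \left(g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}
     +\tfrac{1}{2}\gamma^{\mu}\gamma^{\nu}[\nabla_{\mu},\nabla_{\nu}]\right)\psi
  && \text{(assumption: $\{\gamma^{\mu},\gamma^{\nu}\}=2g^{\mu\nu}$)}
\end{align*}
```

The last step splits $\gamma^{\mu}\gamma^{\nu}$ into its symmetric part $g^{\mu\nu}$ and its antisymmetric part $\frac{1}{2}[\gamma^{\mu},\gamma^{\nu}]$, and uses the fact that contracting the antisymmetric part with $\nabla_{\mu}\nabla_{\nu}$ leaves only the commutator $[\nabla_{\mu},\nabla_{\nu}]$. None of these steps requires a vierbein or an explicit expression for $\Gamma_{\mu}$; the commutator term is the one that the text identifies with $R/4$.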

The paper of Schrödinger appears to be the first place that Equation (12) appears. In the literature, Equation (12) is often called the “Lichnerowicz formula” (see e.g. Berger) or sometimes (in recognition of an analogous equation involving the Hodge Laplacian for differential forms, found before the Dirac equation even existed) the “Bochner-Lichnerowicz” or “Lichnerowicz-Bochner-Weitzenböck formula” etc. However, some authors (see e.g. Chrysikos) call it the “Schrödinger-Lichnerowicz formula” and this is surely what everyone ought to call it!

Let us mention here that generalizations of this Schrödinger-Lichnerowicz formula and possible applications to elementary particle physics model building have been discussed e.g. in AckTolks; Tolks2001; Tolks2007, where information can also be had about more mathematically sophisticated ways of thinking about, and proving, such formulae.

Let us also mention some further aspects of Schrödinger’s paper which are of historical and/or potential scientific interest and some further connections with later work.

First of all, let us return to the parenthetical remark we made after Equation (11) about the ambiguity in $\Gamma_{\mu}$ when thought of as a solution to Equation (11). In fact, Schrödinger (and Fock and Weyl before him – see again Scholz for a historical account – and some others who followed them later, including Brill-Wheeler BrillWheel) interpreted that ambiguity as allowing for an external electromagnetic 4-potential in addition to an external gravitational field, so that the full Dirac equation takes the form

[TABLE]

with (if one pins down $\Gamma_{\mu}$ as the trace-free part of the sum $\Gamma_{\mu}+B_{\mu}I$) $-ieA_{\mu}$ identified with $B_{\mu}$. (And hence it was natural for Schrödinger to derive the electromagnetic generalization of (12).) On the other hand, as Schrödinger indicates (see around Equations (S10) and (S15)), Equation (14) (which could of course have been arrived at quite independently of consideration of Equation (S8)/(11)) suggests that the electromagnetic potential belongs to the same family of mathematical objects as the gravitational (spin-)connection $\Gamma_{\mu}$ and, in this sense, Schrödinger (together with Fock and Weyl) can be considered to have anticipated some of the ideas later explained by Utiyama (see Utiyama and also Trautman) and others about the close relation between gravity and gauge theories.

Equation (11) appears to have inspired Chisholm and Farwell (see e.g. Chisholm88, Chisholm2002) to consider analogues of this equation involving other Clifford algebras and to base on these some ideas for elementary particle theory model-building.

Lastly, let us mention that, aside from exhibiting the close analogy between gravity and electromagnetism discussed above, another motivation for Schrödinger’s work was to try to explain the origin of mass as a gravitational effect. [Footnote 11: It is noteworthy that the idea that mass might arise as a gravitational effect was also taken in Weyl’s paper WeylNAS as his excuse for considering the massless Dirac equation (so as to eliminate the negative energy solutions). Let us also note, although it is not strictly relevant to our evaluation of Schrödinger’s paper, that one finds in the same paper of Weyl perhaps the first ever statement of electromagnetic gauge invariance – i.e., in his own notation, $\psi\mapsto e^{i\lambda}\psi,\ \phi_{p}\mapsto\phi_{p}-\frac{\partial\lambda}{\partial x_{p}}$. It is also noteworthy that Weyl revisited the coupling of the Dirac equation to gravity in a further paper Weyl1950 in 1950, where he encounters the fact that in a Lagrangian formulation in which one varies the metric and the connection independently, one obtains a generalization of General Relativity (also associated, see Hehl76, with Einstein, Cartan, Sciama, Kibble and several other names) involving torsion, where the torsion couples to the spin density. This has also been influential in later work, inter alia in Supergravity DeserZumino. (See Ruger for more discussion.)] Indeed, after obtaining Equation (S8) he remarks

“The second term seems to me to be of considerable theoretical interest. It is, however, too small by many, many powers of ten to be able to replace, for example, the term on the right-hand side. For $\mu$ is the reciprocal Compton wavelength, about $10^{11}\,{\rm cm}^{-1}$. At least it seems significant that one naturally meets in the generalized theory a term at all similar to the enigmatic mass term.”

(A reference is appended to a paper by Veblen and Hoffmann on Kaluza-Klein theory presumably just to acknowledge that they get a Klein-Gordon equation with a similar $R$ correction to a mass term from different considerations.)

For some recent ideas about mass generation that appear to be related to, or perhaps inspired by, this aspect of Schrödinger’s paper, see e.g. Pollock2010.

5 Addendum

In the above main article (now published as the editorial note KayGRG to the recent (re-)publication EngSchr of an English translation [by Claus Kiefer] of the 1932 paper Schr32 of Erwin Schrödinger) it was mentioned that the latter paper of Schrödinger had been (by thirty years!) the first place where the formula $g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}+m^{2}+R/4$ had been obtained for the ‘square’ $(-i\gamma^{\mu}\nabla_{\mu}-m)(i\gamma^{\nu}\nabla_{\nu}-m)$ of the Dirac operator.

It was also mentioned in KayGRG that, around 30 years later, the corresponding formula was rediscovered by André Lichnerowicz. (As stated in KayGRG, this was done in the special case of zero mass and on a Riemannian manifold rather than a spacetime in Lichner63, but actually the formula for the square of the massless Dirac operator also appeared in the slightly earlier paper Lichner62 in a Lorentzian context.) Lichnerowicz used the result in Lichner63 to prove the vanishing of the index – i.e. the Hirzebruch $\hat{A}$-genus – for even-dimensional compact Riemannian manifolds admitting spin structures with non-negative (but not identically zero) scalar curvature. (This was an early application of the Atiyah-Singer index theorem, when applied to the case of Riemannian manifolds with spin structures.) And it was argued in KayGRG that, of the several names (including “Bochner-Lichnerowicz formula”, “Lichnerowicz-Bochner-Weitzenböck formula” etc. as well as “Schrödinger-Lichnerowicz formula”) that had been used by other authors, the name Schrödinger-Lichnerowicz formula seemed the most appropriate.

However, it was unfortunately overlooked in KayGRG that, also in 1963, Asher Peres rediscovered the same formula in Peres63 in aid of making the point that the ‘squared’ Dirac equation in an external gravitational field doesn’t contain a gyro-gravitational term analogous to the gyro-magnetic term (see the top of Page 10 in KayGRG) $\tfrac{1}{2}\sigma^{ab}F_{ab}$ in the ‘squared’ Dirac equation in an external electromagnetic field. Also, Peres pointed out that it is not analogous to the equation for a classical spinning particle in a curved spacetime where there is a term of form $\frac{1}{2}R_{\alpha\beta\gamma\delta}v^{\beta}S^{\gamma\delta}$, where $v^{\alpha}$ is the velocity vector and $S^{\alpha\beta}$ the angular momentum tensor of the classical spinning particle. [Footnote 12: It is not really correct, though, to say, as Peres does, that “Dirac particles $\dots$ cannot be used to test general relativity.” since of course, say in a coordinate system, the term $g^{\mu\nu}\nabla_{\mu}\nabla_{\nu}$ in the ‘squared’ Dirac equation is not the same as the scalar Laplacian.] Actually, in obtaining the formula for the ‘square’ of the Dirac operator, Peres63 makes reference to the equations in Peres’s earlier paper Peres62 which also discusses the theory associated with the names of Einstein, Cartan, Kibble, Sciama and Weyl and others (see Footnote 11 in KayGRG) of gravity with torsion and thus, arguably, Peres deserves to have his name included in that list of names too. (Somewhat ironically though, Peres63 does not mention that the theory involving torsion discussed in Peres62 would lead to a gravitational counterpart of spin-orbit coupling.)

This unfortunate omission leads us to reflect on the often-noted fact that the naming of theorems in mathematics and of results in physics is commonly the result of historical quirks and accidents and is frequently unsatisfactory or unfair in one way or another. In the case of the formula for the ‘square’ of the Dirac operator, we feel that in an ideal world, in which Lichnerowicz and Peres and everyone else would have been aware of Schrödinger’s prior contribution, it would have deserved to be called, simply, the ‘Schrödinger formula’. Indeed not only did Schrödinger obtain it 30 years earlier but it is all the more creditable that, as explained in KayGRG, he managed to do so without a formula for the spin connection. In contrast, both Lichnerowicz and Peres benefited from the fact that, by 1963, the notion of spin connection had (as also discussed in KayGRG) penetrated the realm of the routine in both physics and mathematics. However, there are other reasons for naming formulae than historical priority. And perhaps it is good to append some other names so as to distinguish this particular formula of Schrödinger from the several other formulae that bear his name. If so, though, perhaps it should more justly be known as the Schrödinger-Lichnerowicz-Peres formula.

Let us also take the opportunity of this addendum to correct Footnote 10 which suggested that the square-roots of $g$ in Equation (74) in Schr32; EngSchr were unnecessary. The square roots are in fact needed since, in Schr32, and unlike in modern usage, the symbol ‘$\nabla_{k}$’ was always taken to mean (Schr32, Eq. (74)) $\frac{\partial}{\partial x_{k}}-\Gamma_{k}$ even when it acts on a spinor vector.

6 Acknowledgment

I thank Gary Gibbons for permission to include the essential content of our addendum to KayGRG as Section 5 above.

Bibliography

(1) Schrödinger, E.: Sitzungsberichte der Preußischen Akademie der Wissenschaften. Physikalisch-mathematische Klasse, 105 (1932)
(2) Weinberg, S.: The Quantum Theory of Fields, Volume I. Cambridge University Press, Cambridge (1995)
(3) Bishop, R.L., Crittenden, R.J.: Geometry of Manifolds. Academic Press, New York (1964)
(4) Trautman, A.: Reports on Mathematical Physics 1, 29-62 (1970)
(5) Ne’eman, Y.: Acta Physica Polonica B 29, 827 (1998)
(6) Berger, M.: Riemannian Geometry During the Second Half of the Twentieth Century (University Lecture Series, Volume 17). American Mathematical Society, Providence, R.I., USA (2000)
(7) Atiyah, M.: What is a Spinor? IHES, Paris (2013) https://www.youtube.com/watch?v=SBdW978Ii_E&t=5s
(8) Birrell, N.D., Davies, P.C.W.: Quantum Fields in Curved Space. Cambridge University Press, Cambridge (1982)