Sturm's theorem on zeros of linear combinations of eigenfunctions

Pierre Bérard (IF); Bernard Helffer (LMJL)

arXiv:1706.08247·math.SP·January 4, 2022

TL;DR

This paper revisits Sturm's 1836 theorem on zeros of linear combinations of eigenfunctions, highlighting its historical context and potential relevance to modern spectral theory and Courant's nodal domain theorem.

Contribution

It brings renewed attention to Sturm's classical theorem, emphasizing its historical significance and potential applications in current spectral analysis.

Findings

1. Historical analysis of Sturm's theorem
2. Discussion on relevance to Courant's nodal domain theorem
3. Highlighting overlooked classical results

Abstract

Motivated by recent questions about the extension of Courant's nodal domain theorem, we revisit a theorem published by C. Sturm in 1836, which deals with zeros of linear combination of eigenfunctions of Sturm-Liouville problems. Although well known in the nineteenth century, this theorem seems to have been ignored or forgotten by some of the specialists in spectral theory since the second half of the twentieth-century. Although not specialists in History of Sciences, we have tried to put these theorems into the context of nineteenth century mathematics.

Equations

$$\frac{d}{dx}\Big(K\frac{dV}{dx}\Big)+(r\,G-L)\,V=0\,,\quad\text{for } x\in\,]\alpha,\beta[\,,$$

$$\Big(K\frac{dV}{dx}-hV\Big)(\alpha)=0\,,$$

$$\Big(K\frac{dV}{dx}+HV\Big)(\beta)=0\,.$$

$$K,G,L:[\alpha,\beta]\to\mathbb{R}\text{ are positive functions},$$

$$h,H\in[0,\infty]\text{ are non-negative constants, possibly infinite}.$$

$$\left\{\begin{array}{l}[\alpha,\beta]\subset\,]\alpha_{0},\beta_{0}[\,,\\[5pt] K,G,L\in C^{\infty}(]\alpha_{0},\beta_{0}[)\,,\\[5pt] K,G,L>0\text{ on }]\alpha_{0},\beta_{0}[\,.\end{array}\right.$$

$$Y=\sum_{j=m}^{n}A_{j}V_{j}\,,$$

$$Y_{k}=(-1)^{k}\sum_{j=m}^{n}\rho_{j}^{k}A_{j}V_{j}\,.$$

$$\frac{d}{dx}\Big(K\frac{dV_{p}}{dx}\Big)+(\rho_{p}\,G-L)\,V_{p}=0\,,$$

$$\Big(K\frac{dV_{p}}{dx}-hV_{p}\Big)(\alpha)=0\,,$$

$$\Big(K\frac{dV_{p}}{dx}+HV_{p}\Big)(\beta)=0\,,$$

$$G\,Y_{k+1}=K\frac{d^{2}Y_{k}}{dx^{2}}+\frac{dK}{dx}\frac{dY_{k}}{dx}-L\,Y_{k}\,.$$

$$\frac{d^{p}V_{n}}{dx^{p}}(\xi)+\sum_{j=m}^{n-1}\Big(\frac{\rho_{j}}{\rho_{n}}\Big)^{\ell}\frac{A_{j}}{A_{n}}\frac{d^{p}V_{j}}{dx^{p}}(\xi)=0\,.$$

$$\left\{\begin{array}{l}U(x)=B_{\xi}(x-\xi)^{p}+(x-\xi)^{p+1}R_{\xi}(x)\,,\\[5pt] U_{1}(x)=B_{1,\xi}(x-\xi)^{p-2}+(x-\xi)^{p-1}R_{1,\xi}(x)\,,\\[5pt] \text{with }B_{\xi}\,B_{1,\xi}>0\,.\end{array}\right.$$

$$U(\xi)=\dots=\frac{d^{p-1}U}{dx^{p-1}}(\xi)=0$$

$$\frac{d^{p}U}{dx^{p}}(\xi)\neq 0\,.$$

$$U(x)=B_{\xi}(x-\xi)^{p}+(x-\xi)^{p+1}R_{\xi}(x)\,,$$

$$B_{\xi}=\frac{1}{p!}\frac{d^{p}U}{dx^{p}}(\xi)\neq 0\,.$$

$$(G\,U_{1})(x)=p(p-1)B_{\xi}(x-\xi)^{p-2}K(x)+(x-\xi)^{p-1}S_{\xi}(x)\,,$$

$$U_{1}(x)=B_{1,\xi}(x-\xi)^{p-2}+(x-\xi)^{p-1}R_{1,\xi}(x)\,,$$

$$U(\alpha)=\dots=\frac{d^{p-1}U}{dx^{p-1}}(\alpha)=0$$

$$\frac{d^{p}U}{dx^{p}}(\alpha)\neq 0\,.$$

$$U(x)=B_{\alpha}(x-\alpha)^{p}+(x-\alpha)^{p+1}R_{\alpha}(x)\,,$$

$$U_{1}(x)=B_{1,\alpha}(x-\alpha)^{p-2}+(x-\alpha)^{p-1}R_{1,\alpha}(x)\,,$$

$$Y_{k+q}(x)=B_{k+q,\alpha}+(x-\alpha)R_{k+q,\alpha}(x)\,,$$

$$Y_{k+q}(x)=B_{k+q,\alpha}(x-\alpha)+(x-\alpha)^{2}R_{k+q,\alpha}(x)\,,$$

$$m(U,\xi)=\min\Big\{p\ \Big|\ \frac{d^{p}U}{dx^{p}}(\xi)\neq 0\Big\}\,.$$

$$\overline{m}(U,\alpha)=\left\{\begin{array}{ll}\frac{1}{2}m(U,\alpha)&\text{if }h\in[0,\infty[\,,\\[5pt] 0&\text{if }h=\infty\,,\end{array}\right.$$

$$N_{m}(U,]\alpha,\beta[)=\sum_{j=1}^{p}m(U,\xi_{j}(U))\,,$$

$$\overline{N}_{m}(U,[\alpha,\beta])=\sum_{j=1}^{p}m(U,\xi_{j}(U))+\overline{m}(U,\alpha)+\overline{m}(U,\beta)\,,$$

$$N(U,]\alpha,\beta[)=p\,,$$

$$N_{v}(U,]\alpha,\beta[)=\sum_{j=1}^{p}\frac{1}{2}\Big[1-(-1)^{m(U,\xi_{j}(U))}\Big]\,.$$

$$\varepsilon_{0}U(a)>0\,,\quad\frac{dU}{dx}(a)=0\quad\text{and}\quad\varepsilon_{0}\frac{d^{2}U}{dx^{2}}(a)\leq 0\,.$$

Taxonomy

Topics: History and Theory of Mathematics · Electrical and Electromagnetic Research · History of Computing Technologies

Full text

Sturm’s theorem on zeros of linear combinations of eigenfunctions

Pierre Bérard

and

Bernard Helffer

PB: Institut Fourier, Université Grenoble Alpes and CNRS, B.P.74

F38402 Saint Martin d’Hères Cedex, France.

BH: Laboratoire Jean Leray, Université de Nantes and CNRS

F44322 Nantes Cedex, France.

(Date: October 3, 2017. Revised July 31, 2018)

Abstract.

Motivated by recent questions about the extension of Courant’s nodal domain theorem, we revisit a theorem published by C. Sturm in 1836, which deals with zeros of linear combination of eigenfunctions of Sturm-Liouville problems. Although well known in the nineteenth century, this theorem seems to have been ignored or forgotten by some of the specialists in spectral theory since the second half of the twentieth-century. Although not specialists in History of Sciences, we have tried to put this theorem into the context of nineteenth century mathematics.

Key words and phrases:

Sturm-Liouville eigenvalue problem, Sturm’s theorems.

2010 Mathematics Subject Classification:

34B24, 34L10, 34L99.

*To appear in Expositiones Mathematicae 2018,

except for the Appendices C to E (in blue).

1. Introduction

In this paper, we are interested in the following one-dimensional eigenvalue problem, where $r$ denotes the spectral parameter.

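$$\frac{d}{dx}\Big(K\frac{dV}{dx}\Big)+(r\,G-L)\,V=0\,,\quad\text{for } x\in\,]\alpha,\beta[\,, \tag{1.1}$$

$$\Big(K\frac{dV}{dx}-hV\Big)(\alpha)=0\,, \tag{1.2}$$

$$\Big(K\frac{dV}{dx}+HV\Big)(\beta)=0\,. \tag{1.3}$$

Here,

$$K,G,L:[\alpha,\beta]\to\mathbb{R}\text{ are positive functions}, \tag{1.4}$$

$$h,H\in[0,\infty]\text{ are non-negative constants, possibly infinite}. \tag{1.5}$$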

*Remark 1.1**.*

When $h=\infty$ (resp. $H=\infty$ ), the boundary condition should be understood as the Dirichlet boundary condition $V(\alpha)=0$ (resp. as the Dirichlet boundary condition $V(\beta)=0$ ).

Precise assumptions on $K,G,L$ are given below.

Note that when $K=G\equiv 1$ , (1.1)–(1.3) is an eigenvalue problem for the classical operator $-\frac{d^{2}{V}}{d{x}^{2}}+LV$ .

This eigenvalue problem, in the above generality ( $K,G,L$ functions of $x$ ), was first studied by Charles Sturm in a Memoir presented to the Paris Academy of sciences in September 1833, summarized in [37, 38], and published in [39, 40].

*Remark 1.2**.*

In this paper, we have mainly retained the notation of [39], except that we use $[\alpha,\beta]$ for the interval, instead of Sturm’s notation $[\mathrm{x},\mathrm{X}]$ . We otherwise use today’s notation and vocabulary. Note that in [40], Sturm uses lower case letters for the functions $K,G,L$ , the same notation as Joseph Fourier in [13].

As far as the eigenvalue problem (1.1)–(1.3) is concerned, Sturm’s results can be roughly summarized in the following theorems.

Theorem 1.3 (Sturm, 1836).

Under the assumptions (1.4)–(1.5), the eigenvalue problem (1.1)–(1.3) admits an increasing infinite sequence $\{\rho_{i},i\geq 1\}$ of positive simple eigenvalues, tending to infinity. Furthermore, the associated eigenfunctions $V_{i}$ have the following remarkable property: the function $V_{i}$ vanishes, and changes sign, precisely $(i-1)$ times in the open interval $]\alpha,\beta[\,$ .
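As a numerical illustration of Theorem 1.3 (not part of Sturm’s argument), the following sketch discretizes the problem (1.1) with Dirichlet conditions ($h=H=\infty$) by finite differences and counts the sign changes of the first few discrete eigenvectors. The coefficient functions, the interval $[0,\pi]$, the grid size and the helper names `sign_changes` and `dirichlet_eigenpairs` are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def sign_changes(values):
    """Count sign changes along a 1-D array, ignoring exact zeros."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))


def dirichlet_eigenpairs(K, G, L, alpha, beta, n_interior=400):
    """Finite-difference eigenpairs of -(K V')' + L V = rho G V, V(alpha) = V(beta) = 0."""
    x = np.linspace(alpha, beta, n_interior + 2)
    h = x[1] - x[0]
    xi = x[1:-1]                          # interior grid points
    k_half = K(0.5 * (x[:-1] + x[1:]))    # K evaluated at the cell midpoints
    main = (k_half[:-1] + k_half[1:]) / h**2 + L(xi)
    off = -k_half[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    # Symmetrize the generalized problem A v = rho G v via w = sqrt(G) v.
    d = 1.0 / np.sqrt(G(xi))
    rho, W = np.linalg.eigh(d[:, None] * A * d[None, :])
    return xi, rho, d[:, None] * W        # eigenvalues ascending, eigenvectors as columns


# Hypothetical smooth positive coefficients on [0, pi].
K = lambda x: 1.0 + x
G = lambda x: 2.0 + np.cos(x)
L = lambda x: 1.0 + x**2

xi, rho, V = dirichlet_eigenpairs(K, G, L, 0.0, np.pi)
counts = [sign_changes(V[:, i]) for i in range(6)]
print(rho[:6])
print(counts)
```

Under these assumptions, the $i$-th discrete eigenvector is expected to change sign $(i-1)$ times, mirroring the statement of the theorem for the continuous problem.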

Theorem 1.4 (Sturm, 1836).

Let $Y=A_{m}V_{m}+\cdots+A_{n}V_{n}$ be a non trivial linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), with $1\leq m\leq n$ , and $\{A_{j},m\leq j\leq n\}$ real constants such that $A_{m}^{2}+\cdots+A_{n}^{2}\not=0$ . Then, the function $Y$ has at least $(m-1)$ , and at most $(n-1)$ zeros in the open interval $]\alpha,\beta[$ .
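The bounds of Theorem 1.4 can be checked numerically on a model case. The sketch below assumes $K=G\equiv 1$ on $]0,\pi[$ with Dirichlet conditions and constant $L$, so that $V_{j}(x)=\sin(jx)$; the coefficients $A_{j}$, the grid and the helper names are hypothetical choices made for illustration only.

```python
import numpy as np


def sign_changes(values):
    """Count sign changes along a 1-D array, ignoring exact zeros."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))


def combination(coeffs, m, x):
    """Y(x) = sum_{j=m}^{n} A_j sin(j x), with coeffs = [A_m, ..., A_n]."""
    y = np.zeros_like(x)
    for i, a in enumerate(coeffs):
        y += a * np.sin((m + i) * x)
    return y


m, coeffs = 3, [1.0, 0.5, -0.8]            # Y = sin 3x + 0.5 sin 4x - 0.8 sin 5x
n = m + len(coeffs) - 1
x = np.linspace(0.0, np.pi, 20001)[1:-1]   # open interval ]0, pi[
count = sign_changes(combination(coeffs, m, x))
print(m - 1, count, n - 1)                 # lower bound, observed sign changes, upper bound
```

A grid-based count can only detect sign changes, so it gives a lower estimate of the number of zeros; it should nevertheless fall between $m-1$ and $n-1$.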

The first theorem appears today in most textbooks on Sturm-Liouville theory. Although well known in the nineteenth century, the second theorem (as well as the more precise Theorem 2.15) seems to have been ignored or forgotten by some of the specialists in spectral theory since the second half of the twentieth century, as the following chronology indicates.

**1833: **

Sturm’s Memoir presented to the Paris Academy of sciences in September, summarized in [37, 38].

**1836: **

Sturm’s papers [39, 40] published. Joseph Liouville summarizes Sturm’s results in [23, $\S$ III, p. 257], and uses them to study the expansion of a given function $f$ into a series of eigenfunctions of (1.1).

**1877: **

Lord Rayleigh writes “a beautiful theorem has been discovered by Sturm” as he mentions Theorem 1.4 in [34, Section 142].

**1891: **

F. Pockels [30, pp. 68-73] gives a summary of Sturm’s results, including Theorem 1.4, and mentions the different proofs provided by Sturm, Liouville and Rayleigh. On the basis of a note of Sturm in Férussac’s Bulletin [36], Pockels (p. 71, lines 12-17) also suggests that Sturm may have looked for a statement in higher dimension as well, without success. Sturm indeed mentions studying an example with spherical symmetry in dimension $3$ (leading to an ordinary differential equation with singularity), to which he may have applied Theorem 1.4.

**1903: **

Hurwitz [19] gives a lower bound for the number of zeros of the sum of a trigonometric series with a spectral gap and refers, somewhat inaccurately, to Sturm’s Theorems. This result, known as the Sturm-Hurwitz theorem, already appears in a more general framework in Liouville’s paper [23].

See [12, $\S$ 2] for a generalization of the Sturm-Hurwitz theorem to Fourier integrals with a spectral gap, [28] for geometric applications, and the recent paper [35] which quantifies the Sturm-Hurwitz theorem.

**1931: **

Courant & Hilbert [10, 11] extensively mention the Sturm-Liouville problem. They do not refer to the original papers of Sturm, but to Bôcher’s book [8], which does not include Theorem 1.4. They then state an extension of Courant’s nodal domain theorem to linear combinations of eigenfunctions, [10, footnote, p. 394] and [11, footnote, p. 454], and refer to the dissertation of H. Herrmann [18]. It turns out that neither Herrmann’s dissertation, nor his later papers, consider this extension of Courant’s Theorem.

**1950: **

The book [15] by F. Gantmacher and M. Krein contains several notes on Sturm’s contributions. One result (Corollary, Chap. III.5, p. 138), stated in the context of Chebyshev systems, is stronger than Theorem 1.4, yet weaker than Theorem 2.15. The book does not, however, mention [40].

**1956: **

Pleijel mentions Sturm’s Theorem 1.4, somewhat inaccurately, in [29, p. 543 and 550].

**1973: **

V. Arnold [1] points out that an extension of Courant’s theorem to linear combinations of eigenfunctions cannot be true in general. Counterexamples were first given by O. Viro for the $3$ -sphere (with the canonical metric) [41] and, more recently in the papers [4, 5], see also [17].

It seems to us that Arnold may not have been aware of Theorem 1.4. Indeed, in [3], see also the Supplementary problem 9 in [2, p. 327], he mentions a proof, suggested by I. Gelfand, of the upper bound in Theorem 1.4. Gelfand’s idea is to “use fermions rather than bosons”, and to apply Courant’s nodal domain theorem in the fermionic context. However, Arnold concludes by writing [3, p. 30], “the arguments [given by Gelfand] do not yet provide a proof”. It is interesting to note that Liouville’s and Rayleigh’s proofs of the lower bound in Theorem 1.4 use an idea similar to Gelfand’s, see the proof of Claim 3.5.

As far as we know, the first implementation of Gelfand’s idea into a complete proof of Theorem 1.4 is given in [5, 6].

*Remark 1.5**.*

In [40], Theorem 1.4 first appears as a corollary to a much deeper theorem [40, $\S$ XXIV], in which Sturm describes the time evolution of the $x$ -zeros of a solution $u(x,t)$ of the heat equation. We shall not consider this topic here, and we refer to [14, 26] for modern formulations and a historical analysis.

Our interest in Theorem 1.4 arose from reading [20], and investigating Courant’s nodal domain theorem and its extension to linear combinations of eigenfunctions.

The main purpose of this paper is to popularize Theorems 1.4 and 2.15, as well as Sturm’s originality and ideas. Sturm’s results are clearly stated in the summaries [37, 38]. Unfortunately, Sturm’s detailed papers [39, 40] are written linearly, and contain very few tagged statements. Our second purpose is to provide an accessible proof of Theorems 1.4 and 2.15, meeting today’s standards of rigor. In particular, we state precise assumptions, clarify some technical points, and provide some alternative proofs. We otherwise closely follow the original proofs, and we provide precise cross-references to Sturm’s papers.

In this paper, we make the following strong assumptions.

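$$\left\{\begin{array}{l}[\alpha,\beta]\subset\,]\alpha_{0},\beta_{0}[\,,\\[5pt] K,G,L\in C^{\infty}(]\alpha_{0},\beta_{0}[)\,,\\[5pt] K,G,L>0\text{ on }]\alpha_{0},\beta_{0}[\,.\end{array}\right. \tag{1.6}$$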

*Remark 1.6**.*

Neither Sturm nor Liouville make any explicit regularity assumptions, see Subsection 4.3 and Remark 3.7 for more details.

Organization of the paper

In Section 2, we prove Theorem 2.15, Sturm’s refined version of Theorem 1.4, following the ideas of [40, $\S$ XXVI]. In Section 3, we prove Theorem 3.2, Liouville’s version of Theorem 1.4, following [23, 24]. In Section 4, we describe the context of Sturm’s papers and his ideas. Appendix A provides the detailed proof of a technical argument. Appendix B contains the citations from Sturm’s papers in their original French formulation. Appendix C considers Sturm’s theorem under weaker assumptions. Appendices D and E provide cross-references between our paper and the papers of Sturm and Liouville.

Acknowledgements

The authors would like to thank N. Kuznetsov and J. Lützen for their comments on an earlier version of this paper. They also thank the anonymous referees for their constructive remarks.

2. Sturm’s o.d.e. proof of Theorem 1.4

2.1. Preliminary lemmas and notation

2.1.1.

Recall that $\{(\rho_{j},V_{j}),j\geq 1\}$ are the eigenvalues and eigenfunctions of the eigenvalue problem (1.1)–(1.3).

By our assumption $L>0$ , the eigenvalues are positive, $\rho_{j}>0$ . Under the Assumptions (1.6), the functions $V_{j}$ are $C^{\infty}$ on $]\alpha_{0},\beta_{0}[\,$ . This follows from Cauchy’s existence and uniqueness theorem, or from Liouville’s existence proof [23]. Note that the assumption $L>0$ is convenient, but not necessary. It actually suffices that $\frac{L}{G}$ be bounded from below.

In this section, we fix

$$Y=\sum_{j=m}^{n}A_{j}V_{j}\,,$$

a linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), where $1\leq m\leq n\,$ , and where the $A_{j}$ are real constants.

*Remark 2.1**.*

We shall always assume that $Y\not\equiv 0\,$ , which is equivalent to assuming that $\sum_{m}^{n}A_{j}^{2}\neq 0\,$ . As far as the statement of Theorem 1.4 is concerned, and without loss of generality, it is simpler to assume that $A_{m}\,A_{n}\neq 0\,$ .

We also introduce the associated family of functions, $\{Y_{k},k\in\mathbb{Z}\}$ , where

$$Y_{k}=(-1)^{k}\sum_{j=m}^{n}\rho_{j}^{k}A_{j}V_{j}\,.$$

Note that $Y_{0}$ is the original linear combination $Y$ , and that $Y_{k}\equiv 0$ if and only if $Y\equiv 0\,$ .

Roughly speaking, Sturm’s idea is to show that the number of zeros of $Y_{k}$ , in the interval $]\alpha,\beta[$ , is non-decreasing with respect to $k$ , and then to take the limit when $k$ tends to infinity, see Subsection 2.3. Up to changing the constants $A_{j}$ , it suffices to compare the numbers of zeros of $Y$ and $Y_{1}$ . For this purpose, Sturm compares the signs of $Y$ and $Y_{1}$ near the zeros of $Y$ (Lemma 2.4), and at the non-zero local extrema of $Y$ (Lemmas 2.7 and 2.9). The main ingredient for this purpose is the differential relation (2.6). In the sequel, we indicate the pages in Sturm’s papers corresponding to the different steps of the proof.
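The monotonicity in $k$ can be observed numerically. The sketch below is an illustration under assumed data: $K=G=L\equiv 1$ on $]0,\pi[$ with Dirichlet conditions, so that $V_{j}=\sin(jx)$ and $\rho_{j}=j^{2}+1$. The coefficients and helper names are hypothetical, and each $Y_{k}$ is divided by the positive constant $\rho_{n}^{k}$, which does not move its zeros.

```python
import numpy as np


def sign_changes(values):
    """Count sign changes along a 1-D array, ignoring exact zeros."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.sum(signs[1:] != signs[:-1]))


def y_k(coeffs, m, k, x):
    """Y_k / rho_n^k = (-1)^k sum_j (rho_j / rho_n)^k A_j sin(j x), with rho_j = j^2 + 1."""
    n = m + len(coeffs) - 1
    total = np.zeros_like(x)
    for i, a in enumerate(coeffs):
        j = m + i
        total += ((j**2 + 1) / (n**2 + 1)) ** k * a * np.sin(j * x)
    return (-1) ** k * total


m, coeffs = 3, [1.0, 0.5, -0.8]
n = m + len(coeffs) - 1
x = np.linspace(0.0, np.pi, 20001)[1:-1]   # open interval ]0, pi[
counts = [sign_changes(y_k(coeffs, m, k, x)) for k in range(8)]
print(counts)
```

As $k$ grows, the term in $V_{n}$ dominates, so the count should stabilize at $n-1$, the number of interior zeros of $V_{n}$.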

2.1.2.

For $m\leq p\leq n$ , write the equations satisfied by the eigenfunction $V_{p}$ ,

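$$\frac{d}{dx}\Big(K\frac{dV_{p}}{dx}\Big)+(\rho_{p}\,G-L)\,V_{p}=0\,,$$

$$\Big(K\frac{dV_{p}}{dx}-hV_{p}\Big)(\alpha)=0\,,$$

$$\Big(K\frac{dV_{p}}{dx}+HV_{p}\Big)(\beta)=0\,,$$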

and multiply the $p$ -th equation by $\rho_{p}^{k}\,A_{p}$ . Summing up from $p=m$ to $n$ , yields the following lemma.

Lemma 2.2.

Assume that (1.6) holds. Let $k\in\mathbb{Z}$ .

(1) The function $Y_{k}$ satisfies the boundary conditions (1.2) and (1.3).

(2) The functions $Y_{k}$ and $Y_{k+1}$ satisfy the differential relation

$$G\,Y_{k+1}=K\frac{d^{2}Y_{k}}{dx^{2}}+\frac{dK}{dx}\frac{dY_{k}}{dx}-L\,Y_{k}\,. \tag{2.6}$$

(3) Under the Assumptions (1.6), the function $Y_{k}$ cannot vanish at infinite order at a point $\xi\in[\alpha,\beta]$ , unless $Y\equiv 0\,$ .

Proof. [40, p. 437] Assertions (1) and (2) are clear by linearity.
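For Assertion (2), the linearity argument can be written out in one chain of equalities. The computation below is a sketch that uses only the definition of $Y_{k}$ and the eigenvalue equation satisfied by each $V_{p}$ with $r=\rho_{p}$.

```latex
\begin{aligned}
K\frac{d^{2}Y_{k}}{dx^{2}}+\frac{dK}{dx}\frac{dY_{k}}{dx}-L\,Y_{k}
&=(-1)^{k}\sum_{p=m}^{n}\rho_{p}^{k}A_{p}\Big[\frac{d}{dx}\Big(K\frac{dV_{p}}{dx}\Big)-L\,V_{p}\Big]\\
&=(-1)^{k}\sum_{p=m}^{n}\rho_{p}^{k}A_{p}\,\big(-\rho_{p}\,G\,V_{p}\big)\\
&=G\,(-1)^{k+1}\sum_{p=m}^{n}\rho_{p}^{k+1}A_{p}V_{p}
=G\,Y_{k+1}\,.
\end{aligned}
```

The factor $(-1)^{k}$ in the definition of $Y_{k}$ is what absorbs the minus sign produced by the eigenvalue equation at each step.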

For Assertion (3), assume that $Y_{k}\not\equiv 0\,$ , and that it vanishes at infinite order at some $\xi$ . Then, according to (2.6) and its successive derivatives, the function $Y_{k+1}$ also vanishes at infinite order at $\xi$ , and so does $Y_{\ell}$ for any $\ell\geq k$ . Assume, as indicated in Remark 2.1, that $A_{n}\not=0$ . Fixing some $p\geq 0$ , we can write, for any $\ell\geq k$ ,

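$$\frac{d^{p}V_{n}}{dx^{p}}(\xi)+\sum_{j=m}^{n-1}\Big(\frac{\rho_{j}}{\rho_{n}}\Big)^{\ell}\frac{A_{j}}{A_{n}}\frac{d^{p}V_{j}}{dx^{p}}(\xi)=0\,.$$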

Since $\rho_{n}>\rho_{j}$ for $m\leq j\leq n-1$ , letting $\ell$ tend to infinity, we conclude that $\frac{d^{p}V_{n}}{dx^{p}}(\xi)=0$ . This would be true for all $p$ , which is impossible by Cauchy’s uniqueness theorem, or by Sturm’s argument [39, $\S$ II]. ∎

*Remark 2.3**.*

Assertion (3), and the fact that the zeros of $Y$ are isolated, with finite multiplicities, are implicit in [40].

Lemma 2.4.

Assume that (1.6) holds. Let $U$ denote any $Y_{k}$ , and $U_{1}=Y_{k+1}$ . Let $\xi\in[\alpha,\beta]$ be a zero of $U$ , of order $p\geq 2\,$ . Then, there exist constants $B_{\xi}$ and $B_{1,\xi}\,$ , and smooth functions $R_{\xi}$ and $R_{1,\xi}\,$ , such that

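$$\left\{\begin{array}{l}U(x)=B_{\xi}(x-\xi)^{p}+(x-\xi)^{p+1}R_{\xi}(x)\,,\\[5pt] U_{1}(x)=B_{1,\xi}(x-\xi)^{p-2}+(x-\xi)^{p-1}R_{1,\xi}(x)\,,\\[5pt] \text{with }B_{\xi}\,B_{1,\xi}>0\,.\end{array}\right.$$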

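Proof. [40, p. 439] Assume that $\xi$ is a zero of order $p\geq 2$ of $U$ , so that

$$U(\xi)=\dots=\frac{d^{p-1}U}{dx^{p-1}}(\xi)=0$$

and

$$\frac{d^{p}U}{dx^{p}}(\xi)\neq 0\,.$$

Taylor’s formula with integral remainder term, see Laplace [21, p. 179] (in Livre premier, Partie 2, Chap. 3, § 44), gives the existence of some function $R_{\xi}$ such that

$$U(x)=B_{\xi}(x-\xi)^{p}+(x-\xi)^{p+1}R_{\xi}(x)\,,$$

where

$$B_{\xi}=\frac{1}{p!}\frac{d^{p}U}{dx^{p}}(\xi)\neq 0\,.$$

Equation (2.6) implies that

$$(G\,U_{1})(x)=p(p-1)B_{\xi}(x-\xi)^{p-2}K(x)+(x-\xi)^{p-1}S_{\xi}(x)\,,$$

for some smooth function $S_{\xi}$ . It follows that

$$U_{1}(x)=B_{1,\xi}(x-\xi)^{p-2}+(x-\xi)^{p-1}R_{1,\xi}(x)\,,$$

for some function $R_{1,\xi}$ , with $B_{1,\xi}=p(p-1)\frac{K(\xi)}{G(\xi)}B_{\xi}\,$ .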

In particular, $B_{1,\xi}\,B_{\xi}>0\,$ and this proves the lemma. ∎

Lemma 2.5.

*Assume that (1.6) holds. Assume that $h\in[0,\infty[\,$ , i.e., that the boundary condition at $\alpha$ is not the Dirichlet boundary condition. Let $U$ denote any $Y_{k}$ , $U_{1}=Y_{k+1}$ , and assume that $U(\alpha)=0\,$ . Then, $\alpha$ is a zero of $U$ of even order, i.e., there exists $n_{U}\in\mathbb{N}\setminus\{0\}$ such that $\frac{d^{p}U}{dx^{p}}(\alpha)=0$ for $0\leq p\leq 2n_{U}-1$ and $\not=0$ for $p=2n_{U}$ .

When $H\in[0,\infty[\,$ , a similar statement holds at the boundary $\beta$ .*

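Proof. [40, p. 440-441] Assume that $U(\alpha)=0$ . By Lemma 2.2, $U$ does not vanish at infinite order at $\alpha$ , so that there exists $p\geq 1$ with

$$U(\alpha)=\dots=\frac{d^{p-1}U}{dx^{p-1}}(\alpha)=0$$

and

$$\frac{d^{p}U}{dx^{p}}(\alpha)\neq 0\,.$$

Taylor’s formula with integral remainder term gives

$$U(x)=B_{\alpha}(x-\alpha)^{p}+(x-\alpha)^{p+1}R_{\alpha}(x)\,,$$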

where $B_{\alpha}=\frac{1}{p!}\frac{d^{p}U}{dx^{p}}(\alpha)\not=0\,$ .

The boundary condition at $\alpha$ implies that $\frac{dU}{dx}(\alpha)=0$ , and hence that $p\geq 2$ . By Lemma 2.4, we can write

$$U_{1}(x)=B_{1,\alpha}(x-\alpha)^{p-2}+(x-\alpha)^{p-1}R_{1,\alpha}(x)\,,$$

with $B_{1,\alpha}\,B_{\alpha}>0\,$ .

If $p=2$ , then $U_{1}(\alpha)\not=0$ . If $p>2$ , one can continue.

If $p=2q$ , one arrives at

$$Y_{k+q}(x)=B_{k+q,\alpha}+(x-\alpha)R_{k+q,\alpha}(x)\,,$$

with $Y_{k+q}(\alpha)=B_{k+q,\alpha}$ and $B_{k+q,\alpha}\,B_{k,\alpha}>0\,$ .

If $p=2q+1$ , one arrives at

$$Y_{k+q}(x)=B_{k+q,\alpha}(x-\alpha)+(x-\alpha)^{2}R_{k+q,\alpha}(x)\,,$$

with $B_{k+q,\alpha}\,B_{k,\alpha}>0$ and $\frac{dY_{k+q}}{dx}(\alpha)=B_{k+q,\alpha}\not=0\,$ . On the other hand, since $Y_{k+q}$ satisfies (1.2) and $Y_{k+q}(\alpha)=0\,$ , we must have $\frac{dY_{k+q}}{dx}(\alpha)=0\,$ , because $h\in[0,\infty[\,$ . This yields a contradiction and proves that the case $p=2q+1$ cannot occur. The lemma is proved. ∎

2.2. Counting zeros

Assume that (1.6) holds. Let $U$ denote any $Y_{k}$ , and $U_{1}=Y_{k+1}$ . They satisfy the relation (2.6).

From Lemma 2.2, we know that $U$ cannot vanish at infinite order at a point $\xi\in[\alpha,\beta]$ . If $\xi\in]\alpha,\beta[$ and $U(\xi)=0\,$ , we define the multiplicity $m(U,\xi)$ of the zero $\xi$ by

$$m(U,\xi)=\min\Big\{p\ \Big|\ \frac{d^{p}U}{dx^{p}}(\xi)\neq 0\Big\}\,.$$

From Lemma 2.5, we know that the multiplicity $m(U,\alpha)$ is even if $h\in[0,\infty[$ , and that the multiplicity $m(U,\beta)$ is even if $H\in[0,\infty[\,$ . We define the reduced multiplicity of $\alpha$ by

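$$\overline{m}(U,\alpha)=\left\{\begin{array}{ll}\frac{1}{2}m(U,\alpha)&\text{if }h\in[0,\infty[\,,\\[5pt] 0&\text{if }h=\infty\,,\end{array}\right.$$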

and a similar formula for the reduced multiplicity of $\beta$ .

By Lemma 2.2, the function $U$ has finitely many distinct zeros $\xi_{1}(U)<\xi_{2}(U)<\cdots<\xi_{p}(U)$ in the interval $]\alpha,\beta[\,$ . We define the number of zeros of $U$ in $]\alpha,\beta[\,$ , counted with multiplicities, by

$$N_{m}(U,]\alpha,\beta[)=\sum_{j=1}^{p}m(U,\xi_{j}(U))\,,$$

and we use the notation $N_{m}(U)$ whenever the interval is clear.

We define the number of zeros of $U$ in $[\alpha,\beta]$ , counted with multiplicities, by

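$$\overline{N}_{m}(U,[\alpha,\beta])=\sum_{j=1}^{p}m(U,\xi_{j}(U))+\overline{m}(U,\alpha)+\overline{m}(U,\beta)\,,$$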

and we use the notation $\overline{N}_{m}(U)$ whenever the interval is clear.

We define the number of zeros of $U$ in $]\alpha,\beta[$ (multiplicities not accounted for) by

$$N(U,]\alpha,\beta[)=p\,,$$

and we use the notation $N(U)$ whenever the interval is clear.

Finally, we define the number of sign changes of $U$ in the interval $]\alpha,\beta[$ by

$$N_{v}(U,]\alpha,\beta[)=\sum_{j=1}^{p}\frac{1}{2}\Big[1-(-1)^{m(U,\xi_{j}(U))}\Big]\,.$$

*Remark 2.6**.*

Note that sign changes of the function $Y$ correspond to zeros with odd multiplicity.
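The counting functions of this subsection depend only on the multiplicities of the zeros, so they can be sketched in a few lines. The function names and the input convention below (a list of interior multiplicities, plus an endpoint multiplicity with a Dirichlet flag) are assumptions made for this illustration.

```python
def zero_counts(multiplicities):
    """Return (N, N_m, N_v) for the interior zeros of U, given their multiplicities."""
    n_distinct = len(multiplicities)                            # N: distinct zeros
    n_mult = sum(multiplicities)                                # N_m: with multiplicities
    n_sign = sum((1 - (-1) ** p) // 2 for p in multiplicities)  # N_v: odd multiplicities only
    return n_distinct, n_mult, n_sign


def reduced_multiplicity(multiplicity, dirichlet):
    """Reduced multiplicity at an endpoint; multiplicity 0 means U does not vanish there."""
    if dirichlet:
        return 0
    if multiplicity % 2:
        raise ValueError("endpoint multiplicity is even when the boundary constant is finite")
    return multiplicity // 2


# Hypothetical data: interior zeros of multiplicities 1, 2, 3; a zero of order 2 at alpha
# with h finite; Dirichlet condition at beta.
N, N_m, N_v = zero_counts([1, 2, 3])
N_bar = N_m + reduced_multiplicity(2, dirichlet=False) + reduced_multiplicity(1, dirichlet=True)
print(N, N_m, N_v, N_bar)
```

The sign-change count keeps only the zeros of odd multiplicity, which is the content of Remark 2.6.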

2.3. Comparing the numbers of zeros of $Y_{k}$ and $Y_{k+1}$

Assume that (1.6) holds. Let $U$ be some $Y_{k}$ and $U_{1}=Y_{k+1}$ . In this subsection, we show that the number of zeros of $U_{1}$ is not smaller than the number of zeros of $U$ .

Lemma 2.7.

Let $\xi<\eta$ be two zeros of $U$ in $[\alpha,\beta]$ . Then, there exists some $a_{\xi,\eta}\in]\xi,\eta[$ such that $U(a_{\xi,\eta})\,U_{1}(a_{\xi,\eta})<0\,$ .

*Remark 2.8**.*

We do not assume that $\xi,\eta$ are consecutive zeros.

Proof. [40, p. 437] Since $U$ cannot vanish identically in $]\xi,\eta[$ (see Lemma 2.2), there exists some $x_{0}\in]\xi,\eta[$ such that $U(x_{0})\neq 0$ . Let $\varepsilon_{0}=\operatorname{sign}(U(x_{0}))$ . Then $\varepsilon_{0}U$ takes a positive value at $x_{0}$ , and hence $M:=\sup\{\varepsilon_{0}U(x)\leavevmode\nobreak\ |\leavevmode\nobreak\ x\in[\xi,\eta]\}$ is positive and achieved at some $a_{\xi,\eta}\in]\xi,\eta[$ . Denote this point by $a$ for short, then,

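$$\varepsilon_{0}U(a)>0\,,\quad\frac{dU}{dx}(a)=0\quad\text{and}\quad\varepsilon_{0}\frac{d^{2}U}{dx^{2}}(a)\leq 0\,.$$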

It follows from (2.6) that $\varepsilon_{0}U_{1}(a)<0\,$ , or equivalently, that $U(a)U_{1}(a)<0\,$ . The lemma is proved. ∎
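The sign argument of Lemma 2.7 can be checked on a model case. The sketch below assumes $K=G=L\equiv 1$ on $]0,\pi[$ with Dirichlet conditions, so that $V_{j}=\sin(jx)$, $\rho_{j}=j^{2}+1$, and the relation (2.6) reads $U_{1}=U''-U$ for $U=Y$. The coefficients, the grid and the helper name `u_and_u1` are hypothetical.

```python
import numpy as np


def u_and_u1(coeffs, m, x):
    """U = sum A_j sin(j x) and U_1 = U'' - U = -sum (j^2 + 1) A_j sin(j x)."""
    u = np.zeros_like(x)
    u1 = np.zeros_like(x)
    for i, a in enumerate(coeffs):
        j = m + i
        u += a * np.sin(j * x)
        u1 -= (j**2 + 1) * a * np.sin(j * x)
    return u, u1


m, coeffs = 3, [1.0, 0.5, -0.8]
x = np.linspace(0.0, np.pi, 20001)
u, u1 = u_and_u1(coeffs, m, x)

# Grid indices i such that U changes sign between x[i] and x[i + 1] (interior zeros).
inner = u[1:-1]
zero_idx = np.nonzero(inner[:-1] * inner[1:] < 0)[0] + 1

products = []
for lo, hi in zip(zero_idx[:-1], zero_idx[1:]):
    segment = u[lo + 1:hi + 1]                  # grid points between two consecutive zeros
    a = lo + 1 + int(np.argmax(np.abs(segment)))
    products.append(float(u[a] * u1[a]))        # Lemma 2.7 predicts a negative product
print(products)
```

At a point where $|U|$ is maximal between two zeros, the term $-L\,U$ in (2.6) makes the inequality strict, which is why the sketch takes $L\equiv 1$ rather than $L\equiv 0$.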

Lemma 2.9.

*Let $\xi\in]\alpha,\beta]$ . Assume that $U(\xi)=0\,$ , and that $U$ does not change sign in $]\alpha,\xi[$ . Then, there exists some $a_{\xi}\in[\alpha,\xi[$ such that $U(a_{\xi})U_{1}(a_{\xi})<0$ .

Let $\eta\in[\alpha,\beta[$ . Assume that $U(\eta)=0$ , and that $U$ does not change sign in $]\eta,\beta[$ . Then, there exists some $b_{\eta}\in]\eta,\beta]$ such that $U(b_{\eta})U_{1}(b_{\eta})<0$ .*

Proof. [40, p. 438] Since $U$ cannot vanish identically in $]\alpha,\xi[$ (see Lemma 2.2), there exists $x_{0}\in]\alpha,\xi[$ such that $U(x_{0})\neq 0\,$ . Let $\varepsilon_{\xi}=\operatorname{sign}(U(x_{0}))\,$ . Since $U$ does not change sign in $]\alpha,\xi[\,$ , $\varepsilon_{\xi}U(x)\geq 0$ in $]\alpha,\xi[\,$ . Then,

[TABLE]

Let

[TABLE]

Then $a_{\xi}\in[\alpha,\xi[$ .

If $a_{\xi}\in]\alpha,\xi[$ , then $\varepsilon_{\xi}U(a_{\xi})>0\,$ , $\frac{dU}{dx}(a_{\xi})=0\,$ , and $\varepsilon_{\xi}\frac{d^{2}{U}}{d{x}^{2}}(a_{\xi})\leq 0\,$ . By (2.6), this implies that $\varepsilon_{\xi}U_{1}(a_{\xi})<0$ . Equivalently, $U(a_{\xi})U_{1}(a_{\xi})<0$ .

Claim 2.10.

If $a_{\xi}=\alpha\,$ , then $\varepsilon_{\xi}\,U(\alpha)>0\,$ , $h=0\,$ , $\frac{dU}{dx}(\alpha)=0\,$ , and $\varepsilon_{\xi}\,\frac{d^{2}{U}}{d{x}^{2}}(\alpha)\leq 0\,$ .

*Proof of the claim. * Assume that $a_{\xi}=\alpha\,$ , then $\varepsilon_{\xi}\,U(\alpha)>0\,$ , and hence $h\neq\infty\,$ . If $h$ were in $]0,\infty[$ , we would have $\varepsilon_{\xi}\,\frac{dU}{dx}(\alpha)=h\varepsilon_{\xi}U(\alpha)>0\,$ , and hence $a_{\xi}>\alpha\,$ . It follows that the assumption $a_{\xi}=\alpha$ implies that $h=0$ and $\frac{dU}{dx}(\alpha)=0$ . If $\varepsilon_{\xi}\frac{d^{2}{U}}{d{x}^{2}}(\alpha)$ were positive, we would have $a_{\xi}>\alpha$ . Therefore, the assumption $a_{\xi}=\alpha$ also implies that $\varepsilon_{\xi}\frac{d^{2}{U}}{d{x}^{2}}(\alpha)\leq 0\,$ . The claim is proved.

If $a_{\xi}=\alpha$ , then by Claim 2.10 and (2.6), we have $\varepsilon_{\xi}U_{1}(\alpha)<0$ . Equivalently, $U(\alpha)U_{1}(\alpha)<0$ . The first assertion of the lemma is proved. The proof of the second assertion is similar. ∎

Proposition 2.11.

Assume that (1.6) holds, and let $k\in\mathbb{Z}$ . Then,

$$N_{v}(Y_{k+1},]\alpha,\beta[)\geq N_{v}(Y_{k},]\alpha,\beta[)\,,$$

i.e., in the interval $]\alpha,\beta[$ , the function $Y_{k+1}$ changes sign at least as many times as the function $Y_{k}$ .

Proof. [40, p. 437-439] We keep the notation $U=Y_{k}$ and $U_{1}=Y_{k+1}$ . By Lemma 2.2, the functions $U$ and $U_{1}$ have finitely many zeros in $]\alpha,\beta[$ , with finite multiplicities. Since $\alpha$ and $\beta$ are fixed, we omit the interval $]\alpha,\beta[$ from the notation in the proof, and we examine several cases.

Case 1. If $N_{v}(U)=0$ , there is nothing to prove.

Case 2. Assume that $N_{v}(U)=1$ . Then $U$ admits a unique zero $\xi\in]\alpha,\beta[$ having odd multiplicity. Without loss of generality, we may assume that $U\geq 0$ in $]\alpha,\xi[$ and $U\leq 0$ in $]\xi,\beta[\,$ . By Lemma 2.9, there exist $a\in[\alpha,\xi[$ and $b\in]\xi,\beta]$ such that $U_{1}(a)<0$ and $U_{1}(b)>0\,$ .

It follows that the function $U_{1}$ vanishes and changes sign at least once in $]\alpha,\beta[\,$ , so that $N_{v}(U_{1})\geq 1=N_{v}(U)$ , which proves the proposition in Case 2.

Case 3. If $N_{v}(U)=2$ , the function $U$ has exactly two zeros, having odd multiplicities, $\xi$ and $\eta$ in $]\alpha,\beta[\,$ , $\alpha<\xi<\eta<\beta\,$ , and we may assume that $U|_{]\alpha,\xi[}\geq 0\,$ , $U|_{]\xi,\eta[}\leq 0$ , and $U|_{]\eta,\beta[}\geq 0$ . The arguments given in Case 2 imply that there exist $a\in[\alpha,\xi[$ such that $U_{1}(a)<0$ and $b\in]\eta,\beta]$ such that $U_{1}(b)<0\,$ . In $]\xi,\eta[$ the function $U$ does not vanish identically and therefore achieves a global minimum at a point $c$ such that $U(c)<0\,$ , $\frac{dU}{dx}(c)=0\,$ , and $\frac{d^{2}{U}}{d{x}^{2}}(c)\geq 0\,$ . Equation (2.6) then implies that $U_{1}(c)>0\,$ .

We can conclude that the function $U_{1}$ vanishes and changes sign at least twice in $]\alpha,\beta[$ , so that $N_{v}(U_{1})\geq 2=N_{v}(U)$ .

Case 4. Assume that $N_{v}(U)=p\geq 3\,$ . Then, $U$ has exactly $p$ zeros, with odd multiplicities, in $]\alpha,\beta[\,$ , $\alpha<\xi_{1}<\xi_{2}<\cdots<\xi_{p}<\beta$ , and one can assume that

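$$U|_{]\alpha,\xi_{1}[}\geq 0\,,\quad(-1)^{i}\,U|_{]\xi_{i},\xi_{i+1}[}\geq 0\ \text{ for }1\leq i\leq p-1\,,\quad(-1)^{p}\,U|_{]\xi_{p},\beta[}\geq 0\,.$$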

One can repeat the arguments given in the Cases 2 and 3, and conclude that there exist $a_{0},\ldots,a_{p}$ with $a_{0}\in[\alpha,\xi_{1}[\,$ , $a_{i}\in]\xi_{i},\xi_{i+1}[$ for $1\leq i\leq p-1$ , and $a_{p}\in]\xi_{p},\beta]$ such that $(-1)^{i}U_{1}(a_{i})<0\,$ .

We can then conclude that the function $U_{1}$ vanishes and changes sign at least $p$ times in $]\alpha,\beta[\,$ , i.e. that $N_{v}(U_{1})\geq p=N_{v}(U)\,$ .

This concludes the proof of Proposition 2.11. ∎

Proposition 2.12.

Assume that (1.6) holds. For any $k\in\mathbb{Z}$ ,

$$N_{m}(Y_{k+1},]\alpha,\beta[)\geq N_{m}(Y_{k},]\alpha,\beta[)\,,$$

i.e., in the interval $]\alpha,\beta[\,$ , counting multiplicities of zeros, the function $Y_{k+1}$ vanishes at least as many times as the function $Y_{k}$ .

Proof. [40, p. 439-442] Let $U=Y_{k}$ and $U_{1}=Y_{k+1}\,$ . If $U$ does not vanish in $]\alpha,\beta[\,$ , there is nothing to prove. We now assume that $U$ has at least one zero in $]\alpha,\beta[\,$ . By Lemma 2.2, $U$ and $U_{1}$ have finitely many zeros in $]\alpha,\beta[\,$ . Let

$$\alpha<\xi_{1}<\xi_{2}<\cdots<\xi_{k}<\beta$$

be the distinct zeros of $U$ , with multiplicities $p_{i}=m(U,\xi_{i})$ for $1\leq i\leq k\,$ . Let $\sigma_{0}$ be the sign of $U$ in $]\alpha,\xi_{1}[\,$ , $\sigma_{i}$ the sign of $U$ in $]\xi_{i},\xi_{i+1}[$ for $1\leq i\leq k-1\,$ , and $\sigma_{k}$ the sign of $U$ in $]\xi_{k},\beta[\,$ . Note that

[TABLE]

By Lemma 2.9, there exist $a_{0}\in[\alpha,\xi_{1}[$ and $a_{k}\in]\xi_{k},\beta]$ such that $U(a_{0})U_{1}(a_{0})<0$ and $U(a_{k})U_{1}(a_{k})<0\,$ . By Lemma 2.7, there exists $a_{i}\in]\xi_{i},\xi_{i+1}[\,$ , $1\leq i\leq k-1\,$ , such that $U(a_{i})U_{1}(a_{i})<0\,$ .

Summarizing, we have obtained:

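$$a_{0}\in[\alpha,\xi_{1}[\,,\quad a_{i}\in\,]\xi_{i},\xi_{i+1}[\ \text{ for }1\leq i\leq k-1\,,\quad a_{k}\in\,]\xi_{k},\beta]\,,\quad U(a_{i})\,U_{1}(a_{i})<0\ \text{ for }0\leq i\leq k\,. \tag{2.16}$$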

We have the relation

$$N_{m}(U,]\alpha,\beta[)=\sum_{i=1}^{k}N_{m}(U,]a_{i-1},a_{i}[)=\sum_{i=1}^{k}p_{i}\,.$$

Indeed, for $1\leq i\leq k$ , the interval $]a_{i-1},a_{i}[$ contains precisely one zero $\xi_{i}$ of $U$ , with multiplicity $p_{i}\,$ .

For $U_{1}$ , we have the inequality

$$N_{m}(U_{1},]\alpha,\beta[)\geq\sum_{i=1}^{k}N_{m}(U_{1},]a_{i-1},a_{i}[)\,,$$

because $U_{1}$ might have zeros in the interval $]\alpha,a_{0}[$ if $a_{0}>\alpha$ (resp. in the interval $]a_{k},\beta[$ if $a_{k}<\beta$ ).

Claim 2.13.

For $1\leq i\leq k$ ,

$$N_{m}(U_{1},]a_{i-1},a_{i}[)\geq N_{m}(U,]a_{i-1},a_{i}[)=p_{i}\,.$$

To prove the claim, we consider several cases.

$\bullet$ If $p_{i}=1$ , then $U(a_{i-1})\,U(a_{i})<0$ and, by (2.16), $U_{1}(a_{i-1})U_{1}(a_{i})<0\,$ , so that $N_{m}(U_{1},]a_{i-1},a_{i}[)\geq 1\,$ .

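$\bullet$ If $p_{i}\geq 2$ , we apply Lemma 2.4 at $\xi_{i}\,$ : there exist real numbers $B,B_{1}$ and smooth functions $R$ and $R_{1}$ , such that, in a neighborhood of $\xi_{i}\,$ ,

$$\left\{\begin{array}{l}U(x)=B\,(x-\xi_{i})^{p_{i}}+(x-\xi_{i})^{p_{i}+1}R(x)\,,\\[5pt] U_{1}(x)=B_{1}\,(x-\xi_{i})^{p_{i}-2}+(x-\xi_{i})^{p_{i}-1}R_{1}(x)\,,\end{array}\right. \tag{2.19}$$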

where $\operatorname{sign}(B)=\operatorname{sign}(B_{1})=\sigma_{i}\,$ .

We now use (2.16) and the fact that $\operatorname{sign}(U(a_{i}))=\sigma_{i}\,$ .

$\diamond$ If $p_{i}\geq 2$ is odd, then $\sigma_{i-1}\sigma_{i}=-1\,$ . It follows that

[TABLE]

By (2.19), for $\varepsilon$ small enough, we also have

[TABLE]

This means that $U_{1}$ vanishes at order $p_{i}-2$ at $\xi_{i}\,$ , and at least once in the intervals $]a_{i-1},\xi_{i}-\varepsilon[$ and $]\xi_{i}+\varepsilon,a_{i}[$ , so that

[TABLE]

$\diamond$ If $p_{i}\geq 2$ is even, then $\sigma_{i-1}\sigma_{i}=1\,$ . It follows that

[TABLE]

By (2.19), for $\varepsilon$ small enough, we also have

[TABLE]

This means that $U_{1}$ vanishes at order $p_{i}-2$ at $\xi_{i}\,$ , and at least once in each of the intervals $]a_{i-1},\xi_{i}-\varepsilon[$ and $]\xi_{i}+\varepsilon,a_{i}[\,$ , so that

[TABLE]

The claim is proved, and the proposition as well. ∎

Proposition 2.14.

Assume that (1.6) holds. For any $k\in\mathbb{Z}$ ,

[TABLE]

i.e., in the interval $[\alpha,\beta]$ , counting multiplicities of interior zeros, and reduced multiplicities of $\alpha$ and $\beta$ , the function $Y_{k+1}$ vanishes at least as many times as the function $Y_{k}$ .

Proof. [40, p. 440-442] Recall that the reduced multiplicity of $\alpha$ (resp. $\beta$ ) is zero if the Dirichlet condition holds at $\alpha$ (resp. at $\beta$ ) or if $U(\alpha)\neq 0$ (resp. $U(\beta)\neq 0$ ). Furthermore, according to Lemma 2.5, if $h\in[0,\infty[$ and $U(\alpha)=0$ (resp. if $H\in[0,\infty[$ and $U(\beta)=0$ ), then $m(U,\alpha)=2p$ (resp. $m(U,\beta)=2q$ ).

*Case 1.* Assume that $N_{m}(U,]\alpha,\beta[)=0\,$ . Without loss of generality, we may assume that $U>0$ in $]\alpha,\beta[\,$ .

$\bullet$ If $U(\alpha)\neq 0$ and $U(\beta)\neq 0$ , there is nothing to prove.

$\bullet$ Assume that $U(\alpha)=U(\beta)=0\,$ . Then, there exists $a\in]\alpha,\beta[$ such that

[TABLE]

with

[TABLE]

It follows from (2.6) that $U_{1}(a)<0\,$ , and that

[TABLE]

It now suffices to look separately at the intervals $[\alpha,a]$ and $[a,\beta]\,$ .

$\diamond$ Interval $[\alpha,a]$ . If the Dirichlet condition holds at $\alpha$ , there is nothing to prove. If $h\in[0,\infty[$ , $m(U,\alpha)=2p\geq 2$ and, by Lemma 2.4,

[TABLE]

It follows that $U_{1}(\alpha+\varepsilon)>0$ for any positive $\varepsilon$ small enough so that $N_{m}(U_{1},]\alpha,a[)\geq 1$ . It follows that

[TABLE]

$\diamond$ Interval $[a,\beta]$ . The proof is similar.

$\bullet$ Assume that $U(\alpha)=0$ and $U(\beta)\neq 0\,$ . The proof is similar to the previous one with $a\in]\alpha,\beta]\,$ .

$\bullet$ Assume that $U(\alpha)\neq 0$ and $U(\beta)=0\,$ . The proof is similar to the previous one with $a\in[\alpha,\beta[\,$ .

*Case 2.* Assume that $N_{m}(U,]\alpha,\beta[)\geq 1\,$ .

$\bullet$ If $U(\alpha)\neq 0$ (resp. $U(\beta)\neq 0$ ), there is nothing to prove for the boundary $\alpha$ (resp. $\beta$ ).

$\bullet$ If $U(\alpha)=0$ (resp. $U(\beta)=0$ ), the number $a_{0}$ (resp. $a_{k}$ ) which appears in the proof of Proposition 2.12 belongs to the open interval $]\alpha,\xi_{1}[$ (resp. to the open interval $]\xi_{k},\beta[$ ), where $\xi_{1}$ (resp. $\xi_{k}$ ) is the smallest (resp. largest) zero of $U$ in $]\alpha,\beta[$ . We can then apply the proof of Case 1 to the interval $[\alpha,a_{0}]$ (resp. to the interval $[a_{k},\beta]$ ) to prove that $\overline{N}_{m}(U_{1},[\alpha,a_{0}])\geq\overline{N}_{m}(U,[\alpha,a_{0}])$ (resp. to prove that $\overline{N}_{m}(U_{1},[a_{k},\beta])\geq\overline{N}_{m}(U,[a_{k},\beta])$ ). This proves Proposition 2.14. ∎

We can now state Sturm’s refined version of Theorem 1.4.

Theorem 2.15.

Assume that (1.6) holds, and let $Y$ be the non trivial linear combination

[TABLE]

where $1\leq m\leq n$ , and where $\{A_{p},m\leq p\leq n\}$ are real constants such that $A_{m}^{2}+\cdots+A_{n}^{2}\neq 0\,$ . Then, with the notation of Subsection 2.2,

[TABLE]

Proof. [40, p. 442] Let $N(V)$ be any of the above functions. We may of course assume that $A_{m}\not=0$ and $A_{n}\not=0$ . In the preceding lemmas, we have proved that $N(Y_{k+1})\geq N(Y_{k})$ for any $k\in\mathbb{Z}$ . This inequality can also be rewritten as

[TABLE]

Letting $k$ tend to infinity, we conclude that

[TABLE]

and we can apply Theorem 1.3. ∎

Remark. For a complete proof of the limiting argument when $k$ tends to infinity, we refer to Appendix A.

3. Liouville’s approach to Theorem 1.4

3.1. Main statement

We keep the notation of Section 2. Starting from a linear combination $Y$ as in (2.1), Liouville also considers the family $Y_{k}$ given by (2.2), and shows that the number of zeros of $Y_{k+1}$ is not smaller than the number of zeros of $Y_{k}$ . His proof is based on a generalization of Rolle’s theorem.

Remark 3.1.

In his proof, Liouville [24] only considers the zeros in the open interval $]\alpha,\beta[\,$ .

As in Section 2, for $1\leq m\leq n$ , we fix $Y=\sum_{j=m}^{n}A_{j}V_{j}$ , a linear combination of eigenfunctions of the eigenvalue problem (1.1)–(1.3), and we assume that $A_{m}A_{n}\neq 0\,$ , see Remark 2.1.

Theorem 3.2.

Counting zeros with multiplicities in the interval $]\alpha,\beta[\,$ , the function $Y$ (1) has at most $(n-1)$ zeros and, (2) has at least $(m-1)$ zeros.

Proof. Liouville uses the following version of Rolle’s theorem (Michel Rolle (1652-1719) was a French mathematician). This version of Rolle’s theorem seems to go back to Cauchy and Lagrange.

Lemma 3.3.

Let $f$ be a function in $]\alpha_{0},\beta_{0}[\,$ . Assume that

[TABLE]

(1) If the function $f$ is differentiable, and has $\nu-1$ distinct zeros in the interval $]x^{\prime},x^{\prime\prime}[\,$ , then the derivative $f^{\prime}$ has at least $\nu$ distinct zeros in $]x^{\prime},x^{\prime\prime}[\,$ .

(2) If the function $f$ is smooth, and has $\mu-1$ zeros counted with multiplicities in the interval $]x^{\prime},x^{\prime\prime}[\,$ , then the derivative $f^{\prime}$ has at least $\mu$ zeros counted with multiplicities in $]x^{\prime},x^{\prime\prime}[\,$ .

Proof of the lemma. Call $x_{1}<x_{2}<\cdots<x_{\nu-1}$ the distinct zeros of $f$ in $]x^{\prime},x^{\prime\prime}[\,$ . Since $f(x^{\prime})=f(x^{\prime\prime})=0\,$ , by Rolle’s theorem [32], the function $f^{\prime}$ vanishes at least once in each open interval determined by the $x_{j}\,$ , $1\leq j\leq\nu-1\,$ , as well as in the intervals $]x^{\prime},x_{1}[$ and $]x_{\nu-1},x^{\prime\prime}[\,$ . It follows that $f^{\prime}$ has at least $\nu$ distinct zeros in $]x^{\prime},x^{\prime\prime}[\,$ , which proves the first assertion.

Call $m_{j}$ the multiplicity of the zero $x_{j}$ , $1\leq j\leq\nu-1\,$ . Then $f^{\prime}$ has at least $\nu$ zeros, one in each of the open intervals determined by $x^{\prime},x^{\prime\prime}$ and the $x_{j}$ ’s, and has a zero at each $x_{j}$ with multiplicity $m_{j}-1\,$ , provided that $m_{j}>1$ . It follows that the number of zeros of $f^{\prime}$ in $]x^{\prime},x^{\prime\prime}[\,$ , counting multiplicities, is at least

[TABLE]

which proves the second assertion. ∎
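The count used in the second assertion can be spelled out as follows. The displayed formula is missing from this copy, so the expression below is a reconstruction from the surrounding proof (with $\nu-1$ distinct zeros $x_{j}$ of multiplicities $m_{j}$ ), not a quotation of the original display.

```latex
\[
  \underbrace{\nu}_{\text{one zero of } f' \text{ per open interval}}
  \;+\; \underbrace{\sum_{j=1}^{\nu-1} (m_{j}-1)}_{\text{zeros of } f' \text{ at the } x_{j}}
  \;=\; \nu + (\mu-1) - (\nu-1) \;=\; \mu ,
  \qquad\text{since } \sum_{j=1}^{\nu-1} m_{j} = \mu-1 .
\]
```

As an illustration (our own choice of example), take $f(x)=x(x-1)^{2}(x-2)^{3}(x-3)$ with $x^{\prime}=0$ and $x^{\prime\prime}=3$ . Here $\nu-1=2$ and $\mu-1=2+3=5$ , and the degree-six polynomial $f^{\prime}$ has a simple zero at $1$ , a double zero at $2$ , and one zero in each of $]0,1[$ , $]1,2[$ , $]2,3[$ , that is $3+1+2=6=\mu$ zeros counted with multiplicities.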

3.2. Proof of the assertion “ $Y$ has at most $(n-1)$ zeros in $]\alpha,\beta[$ , counting multiplicities”

Write (1.1) for $V_{1}$ and for $V_{p}$ , for some $p$ with $m\leq p\leq n$ . Multiply the first equation by $-V_{p}$ , the second by $V_{1}$ , and add the resulting equations. Then

[TABLE]

Use the identity

[TABLE]

and integrate from $\alpha$ to $t$ to get the identity

[TABLE]

Here we have used the boundary condition (1.2) which implies that

[TABLE]

Multiplying the identity (3.3) by $A_{p}$ , and summing for $p$ from $m$ to $n$ , we obtain

[TABLE]

or

[TABLE]

where we have used the fact that the function $V_{1}$ does not vanish in the interval $]\alpha,\beta[\,$ .

Let $\Psi(x)=\frac{Y}{V_{1}}(x)$ . The zeros of $Y$ in $]\alpha,\beta[$ are the same as the zeros of $\Psi$ , with the same multiplicities. Let $\mu$ be the number of zeros of $Y$ , counted with multiplicities. Using Lemma 3.3, Assertion (2), one can show that $\frac{d\Psi}{dx}$ has at least $\mu-1$ zeros in $]\alpha,\beta[\,$ , and hence so does the left-hand side of (3.5),

[TABLE]

On the other hand, this function vanishes at $\alpha$ and $\beta$ (because of the boundary condition (1.3) or orthogonality). By Lemma 3.3, its derivative,

[TABLE]

has at least $\mu$ zeros counted with multiplicities in $]\alpha,\beta[\,$ . We have proved the following

Lemma 3.4.

If the function $Y=\sum_{p=m}^{n}A_{p}V_{p}$ has at least $\mu$ zeros counted with multiplicities in the interval $]\alpha,\beta[\,$ , then the function $Y_{1}=\sum_{p=m}^{n}(\rho_{1}-\rho_{p})A_{p}V_{p}$ has at least $\mu$ zeros, counted with multiplicities, in $]\alpha,\beta[\,$ .
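The displayed identities behind this lemma are missing from this copy. The following is a hedged reconstruction of the computation from equation (1.1), the boundary conditions (1.2)–(1.3), and the definition of $Y_{1}$ in Lemma 3.4; here $\rho_{p}$ denotes the eigenvalue associated with $V_{p}$ , and the signs and normalisations may differ from those of the original displays.

```latex
\begin{align*}
  % (1.1) for V_1 times (-V_p), plus (1.1) for V_p times V_1:
  V_{1}\frac{d}{dx}\Big(K\frac{dV_{p}}{dx}\Big)
    - V_{p}\frac{d}{dx}\Big(K\frac{dV_{1}}{dx}\Big)
    + (\rho_{p}-\rho_{1})\,G\,V_{1}V_{p} &= 0 , \\
  % the left-hand difference is an exact derivative:
  \frac{d}{dx}\Big[K\Big(V_{1}\frac{dV_{p}}{dx}-V_{p}\frac{dV_{1}}{dx}\Big)\Big]
    &= (\rho_{1}-\rho_{p})\,G\,V_{1}V_{p} , \\
  % integrate from alpha to t; the boundary term at alpha vanishes by (1.2):
  K\,V_{1}^{2}\,\frac{d}{dx}\Big(\frac{V_{p}}{V_{1}}\Big)(t)
    &= (\rho_{1}-\rho_{p})\int_{\alpha}^{t} G\,V_{1}V_{p}\,dx , \\
  % multiply by A_p and sum over p = m, ..., n, with Psi = Y / V_1:
  K\,V_{1}^{2}\,\frac{d\Psi}{dx}(t)
    &= \int_{\alpha}^{t} G\,V_{1}\,Y_{1}\,dx ,
  \qquad Y_{1}=\sum_{p=m}^{n}(\rho_{1}-\rho_{p})A_{p}V_{p} .
\end{align*}
```

Read this way, the right-hand side vanishes at $\alpha$ and at $\beta$ and has at least $\mu-1$ zeros in $]\alpha,\beta[$ , so by Lemma 3.3 its derivative $G\,V_{1}\,Y_{1}$ has at least $\mu$ zeros there; since $G\,V_{1}$ does not vanish in $]\alpha,\beta[$ , these are zeros of $Y_{1}$ .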

Applying this lemma iteratively, we deduce that if $Y$ has at least $\mu$ zeros counted with multiplicities in $]\alpha,\beta[\,$ , then, for any $k\geq 1$ , the function

[TABLE]

has at least $\mu$ zeros, counted with multiplicities, in $]\alpha,\beta[\,$ .

We may of course assume that the coefficient $A_{n}$ is non-zero. The above assertion can be rewritten as the statement:

For all $k\geq 0\,$ , the equation

[TABLE]

has at least $\mu$ solutions in $]\alpha,\beta[$ , counting multiplicities.

Letting $k$ tend to infinity, and using the fact that $V_{n}$ has exactly $(n-1)$ zeros in $]\alpha,\beta[\,$ , this implies that $\mu\leq(n-1)$ . This proves the first assertion. ∎
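The iteration and the limiting argument can be illustrated numerically on a model problem. The sketch below assumes $K=G=1$ , $L=c>0$ constant, and Dirichlet conditions on $]0,\pi[$ , so that $V_{p}(x)=\sin(px)$ and $\rho_{1}-\rho_{p}=1-p^{2}$ ; the coefficients, the grid size, and the helper names `sign_changes` and `normalized_Y_k` are our own illustrative choices. It counts sign changes on a grid, which for simple zeros agrees with the number of zeros but is only a lower bound in general.

```python
import math

def sign_changes(f, a, b, n_grid=20000):
    """Count sign changes of f over a uniform grid of interior points of ]a, b[."""
    count, prev = 0, 0
    for i in range(1, n_grid):
        v = f(a + (b - a) * i / n_grid)
        s = (v > 0) - (v < 0)
        if s == 0:
            continue  # ignore sampled values that are exactly zero
        if prev != 0 and s != prev:
            count += 1
        prev = s
    return count

def normalized_Y_k(coeffs, k):
    """x -> sum_p A_p * ((rho_1 - rho_p) / (rho_1 - rho_n))**k * sin(p x).

    Model assumption: V_p = sin(p x), rho_p = p**2 + c, hence rho_1 - rho_p = 1 - p**2.
    Dividing by the nonzero constant (rho_1 - rho_n)**k does not move the zeros.
    """
    n = max(coeffs)
    weights = {p: A * ((1 - p * p) / (1 - n * n)) ** k for p, A in coeffs.items()}
    return lambda x: sum(w * math.sin(p * x) for p, w in weights.items())

coeffs = {3: 1.0, 6: 0.1}  # Y = sin(3x) + 0.1 sin(6x), so m = 3 and n = 6
counts = [sign_changes(normalized_Y_k(coeffs, k), 0.0, math.pi) for k in range(6)]
print(counts)
```

For this choice one has $Y_{k}\propto\sin(3x)\,\big((8/35)^{k}+0.2\cos(3x)\big)$ , so the printed list should be `[2, 2, 5, 5, 5, 5]`: the count never decreases, starts at $m-1=2$ , and stabilises at $n-1=5$ , the number of zeros of $V_{n}$ .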

3.3. Proof of the assertion “ $Y$ has at least $(m-1)$ zeros in $]\alpha,\beta[$ , counting multiplicities”

We have seen that the number of zeros of $Y_{k}$ is less than or equal to the number of zeros of the function $Y_{k+1}$ . This assertion actually holds for any $k\in\mathbb{Z}$ , and can also be rewritten as,

[TABLE]

for any $k\geq 0$ , where

[TABLE]

and we can again let $k$ tend to infinity. The second assertion is proved and Theorem 3.2 as well. ∎

3.4. Liouville’s second approach to the second part of Theorem 3.2

If the function $Y$ has $\mu_{1}$ distinct zeros, and $\mu\leq\mu_{1}$ sign changes, we call $a_{i}$ , $\alpha<a_{1}<\cdots<a_{\mu}<\beta$ , the points at which $Y$ changes sign.

Claim 3.5.

The function $Y$ changes sign at least $(m-1)$ times in the interval $]\alpha,\beta[\,$ .

Proof of the claim. Assume, by contradiction, that $\mu\leq(m-2)$ . Consider the function

[TABLE]

where the function $\Delta$ is defined as the determinant

[TABLE]

The function $W$ vanishes at the points $a_{i}\,,1\leq i\leq\mu\,$ . According to the first part in Theorem 3.2, $W$ being a linear combination of the first $\mu+1$ eigenfunctions, vanishes at most $\mu$ times in $]\alpha,\beta[\,$ , counting multiplicities. This implies that each zero $a_{i}$ of $W$ has order one, and that $W$ does not have any other zero in $]\alpha,\beta[\,$ . It follows that the function $YW$ vanishes only at the points $\{a_{i}\}$ , $1\leq i\leq\mu$ , and that it does not change sign. We can assume that $YW\geq 0\,$ . On the other hand, we have

[TABLE]

because $Y$ involves the functions $V_{p}$ with $p\geq m$ and $W$ the functions $V_{q}$ with $q\leq\mu+1\leq m-1\,$ . This gives a contradiction. ∎
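The role of the determinant can be checked on a small example. The display defining $\Delta$ is missing from this copy, so the sketch below assumes that $W(x)$ is the $(\mu+1)\times(\mu+1)$ determinant with entries $V_{i}(a_{j})$ in the first $\mu$ columns and $V_{i}(x)$ in the last one; this assumption, the model $V_{i}(x)=\sin(ix)$ on $]0,\pi[$ with $G=1$ , the points $a_{1}=1$ , $a_{2}=2$ , and all helper names are ours.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def make_W(a1, a2):
    """W(x) = det[ V_i(a1), V_i(a2), V_i(x) ], i = 1, 2, 3, with V_i = sin(i x) (assumed form)."""
    def W(x):
        rows = [[math.sin(i * a1), math.sin(i * a2), math.sin(i * x)] for i in (1, 2, 3)]
        return det3(rows)
    return W

def sign_changes(f, a, b, n_grid=20000):
    """Count sign changes of f over a uniform grid of interior points of ]a, b[."""
    count, prev = 0, 0
    for i in range(1, n_grid):
        v = f(a + (b - a) * i / n_grid)
        s = (v > 0) - (v < 0)
        if s == 0:
            continue
        if prev != 0 and s != prev:
            count += 1
        prev = s
    return count

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

a1, a2 = 1.0, 2.0                      # mu = 2 prescribed sign-change points
W = make_W(a1, a2)
n_changes = sign_changes(W, 0.0, math.pi)
# Y = sin(4x) only involves modes p >= 4 > mu + 1, so it is orthogonal to W.
ortho = midpoint_integral(lambda x: math.sin(4 * x) * W(x), 0.0, math.pi)
print(W(a1), W(a2), n_changes, ortho)
```

In this model $W$ is a combination of $\sin x$ , $\sin 2x$ , $\sin 3x$ that vanishes at $a_{1}$ and $a_{2}$ and changes sign nowhere else, while its integral against $\sin 4x$ vanishes up to rounding. A function $Y$ built from modes $p\geq 4$ with sign changes only at $a_{1},a_{2}$ would make $YW$ of constant sign, which is incompatible with this orthogonality.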

Remark 3.6.

Liouville does not actually use the determinant (3.12), but a similar approach, see [23, p. 259], Lemme $1^{\text{er}}$ . The determinant $\Delta$ appears in [34, Section 142]. The paper [6] is based on a careful analysis of this determinant.

Remark 3.7.

The arguments in Subsection 3.2, using Assertion (1) of Lemma 3.3, instead of Assertion (2), yield an upper bound on the number of zeros of $Y$ , multiplicities not accounted for. This estimate holds under weaker regularity assumptions, namely only assuming that the functions $G,L$ are continuous, and that the function $K$ is $C^{1}$ , see Appendix C, and compare with [15], Chap. III.5.

4. Mathematical context of Sturm’s papers.

Sturm’s motivations and ideas

4.1. On Sturm’s style

Sturm’s papers [39, 40] are written in French, and quite long, about 80 pages each. One difficulty in reading them is the lack of layout structure. The papers are written linearly, and divided into sequences of sections, without any title. Most results are stated without tags, “Theorem” and the like, and only appear in the body of the text. For example, [39] only contains one theorem stated as such, see $\S$ XII, p. 125. In order to have an overview of the results contained in [39], the reader should look at the announcement [37]. Theorem 1.4 is stated in [38].

For a more thorough analysis of Sturm’s papers on differential equations, we refer to [26, 14]. We refer to [7, 33] for the relationships between Theorem 1.3 and Sturm’s theorem on the number of real roots of real polynomials.

4.2. Sturm’s motivations

Sturm’s motivations come from mathematical physics, and more precisely, from the problem of heat diffusion in a non-homogeneous bar. He considers the heat equation,

[TABLE]

with boundary conditions

[TABLE]

for all $t>0$ , and with the initial condition

[TABLE]

where $f$ is a given function.

The functions $K,G,L$ and the constants $h,H$ describe the physical properties of the bar, see [40, Introduction, p. 376]. Sturm refers to the book of Siméon Denis Poisson [31], rather than to Fourier’s book [13], because Poisson’s equations are more general, see [33, Chap. III].

The boundary conditions (1.2)-(1.3) and (4.2) first appeared in the work of Fourier [13] but are called “Robin’s condition” in the recent literature. Victor Gustave Robin (1855-1897) was a French mathematician.

As was popularized by Fourier and Poisson, in order to solve (4.1), Sturm uses the method of separation of variables, and is therefore led to the eigenvalue problem (1.1)–(1.3).

4.3. Sturm’s assumptions

In [39, 40], Sturm implicitly assumes that the functions $K,G,L$ are $C^{\infty}$ and, explicitly, that $K$ is positive, see [39, p. 108]. For the eigenvalue problem, he also assumes that $G,L$ are positive, see [40, p. 381]. In [40, p. 394], he mentions that $L$ could take negative values, and implicitly assumes, in this case, that $\frac{L}{G}$ is bounded from below.

In [24], Liouville does not mention any regularity assumption on the functions $G,K,L$ . He however indicates a regularity assumption (piecewise $C^{2}$ functions) in a previous paper, [23, Footnote $(*)$ , p. 256].

4.4. Sturm’s originality

Before explaining Sturm’s proofs, we would like to insist on the originality of his approach. Indeed, unlike his predecessors, Sturm does not look for explicit solutions of the differential equation (4.4) (i.e., solutions in closed form, or given as sums of series or as integrals), but he rather looks for qualitative properties of the solutions, properties which can be deduced directly from the differential equation itself. The following excerpts are translated from [39, Introduction] (see Appendix B for the original citations in French).

*One only knows how to integrate these equations in a very small number of particular cases, and one can otherwise not even obtain a first integral; even when one knows the expression of the function which satisfies such an equation, in finite form, as a series, as integrals either definite or indefinite, it is most generally difficult to recognize in this expression the behaviour and the characteristic properties of this function. …

Although it is important to be able to determine the value of the unknown function for an isolated value of the variable it depends upon, it is not less necessary to discuss the behaviour of this function, or otherwise stated, the form and the twists and turns of the curve whose ordinate would be the function, and the abscissa the independent variable. It turns out that one can achieve this goal by the sole consideration of the differential equations themselves, without having to integrate them. This is the purpose of the present memoir. …*

4.5. Sturm and the existence and uniqueness theorem for ordinary differential equations

In [39, p. 108], Sturm considers the differential equation

[TABLE]

and takes the existence and uniqueness theorem for granted. More precisely, he claims [39, p. 108], without any reference whatsoever,

The complete integral of equation (I) must contain two arbitrary constants, for which one can take the values of $V$ and of $\frac{dV}{dx}$ corresponding to some particular value of $x$ . Once these values are fixed, the function $V$ is fully determined by equation (I), it has a uniquely determined value for each value of $x$ .

On the other hand, he gives two arguments for the fact that a solution of (I) and its derivative cannot vanish simultaneously at a point without vanishing identically, see [39, $\S$ II]. When the coefficients $K,G$ of the differential equation depend upon a parameter $m$ , e.g. continuously, Sturm also takes for granted the fact that the solution $V(x,m)$ , and its zeros, depend continuously on $m$ .

In [40, $\S$ II], Sturm mentions the existence proof given by Liouville in [23], see also [22]. According to [16], Augustin-Louis Cauchy may have presented the existence and uniqueness theorem for ordinary differential equations in his course at École polytechnique as early as in the year 1817-1818. Following a recommendation of the administration of the school, Cauchy delivered the notes of his lectures in 1824, see [9] and, in particular, the introduction by Christian Gilain who discovered these notes in 1974. These notes apparently had a limited distribution. Liouville entered the École polytechnique in 1825, and there attended the mathematics course given by Ampère (as a matter of fact, Ampère and Cauchy gave the course every other year, alternately); we are grateful to J. Lützen for providing this information. Liouville’s proof of the existence theorem for differential equations in [22], à la Picard but before Picard, though limited to the particular case of 2nd order linear equations, might be the first well circulated proof of an existence theorem for differential equations, see [25, $\S$ 34]. Cauchy’s theorem was later popularized in the second volume of Moigno’s book, published in 1844, see [27], “Vingt-sixième Leçon” $\S$ 159, pp. 385–396.

4.6. Sturm’s proof of Theorem 1.3

Theorem 1.3 is proved in [40]. For the first assertion, see $\S$ III (p. 384) to VII; for the second assertion, see $\S$ VIII (p. 396) to X.

The proof is based on the paper [39] in which Sturm studies the zeros of the solution of the initial value problem,

[TABLE]

Here $K,G$ are assumed to be functions of $x$ depending on a real parameter $m$ , with $K$ positive (the constants $h$ and $H$ may also depend on the parameter $m$ ). The solution $V(x,m)$ is well defined up to a scaling factor. The main part of [39] is devoted to studying how the zeros of the function $V(x,m)$ (and other related functions) depend on the parameter $m$ , see [39, $\S$ XII, p. 125]. While developing this program, Sturm proves the oscillation, separation and comparison theorems which nowadays bear his name, [39, $\S$ XV, XVI and XXXVII].

The eigenvalue problem (1.1)–(1.3) itself is studied in [40]. For this purpose, Sturm considers the functions

[TABLE]

the solution $V(x,r)$ of the corresponding initial value problem (4.4)–(4.5), and applies the results and methods of [39].

The spectral data of the eigenvalue problem (1.1)–(1.3) are determined by the following transcendental equation in the spectral parameter $r$ ,

[TABLE]

see [40, $\S$ III, p. 383, line 8 from the bottom].

4.7. Sturm’s two proofs of Theorem 1.4

Theorem 1.4 appears in [40, $\S$ XXV, p. 431], see also the announcement [38].

Sturm’s general motivation, see the introductions to [39] and [40], was the investigation of heat diffusion in a (non-homogeneous) bar, whose physical properties are described by the functions $K,G,L$ . He first obtained Theorem 1.4 as a corollary of a much deeper theorem which describes the behaviour, as time varies, of the $x$ -zeros of a solution $u(x,t)$ of the heat equation (4.1)-(4.3). When the initial temperature $u(x,0)$ is given by a linear combination of simple states,

[TABLE]

the function $u(x,t)$ is given by

[TABLE]

When $t$ tends to infinity, the $x$ -zeros of $u(x,t)$ approach those of $V_{p}$ , where $p$ is the least integer $j,m\leq j\leq n$ such that $A_{j}\not=0$ .

J. Liouville, who was aware of Theorem 1.4, made use of it in [23], and provided a purely “ordinary differential equation” proof in [24], a few months before the actual publication of [40]. This induced Sturm to provide two proofs of Theorem 1.4 in [40], his initial proof using the heat equation, and another proof based on the sole ordinary differential equation. The proofs of Sturm actually give a more precise result. In [40, p. 379], Sturm writes,

*M. Liouville gave a direct proof of this theorem, which for me was a mere corollary of the preceding one, without taking care of the particular case in which the function vanishes at one of the extremities of the bar. I have also found, after him, another direct proof which I give in this memoir. M. Liouville made use of the same theorem in a very nice memoir which he published in the July issue of his journal, and which deals with the expansion of an arbitrary function into a series made of the functions $V$ which we have considered.*

The time independent analog to studying the behaviour of the $x$ -zeros of (4.8) is to study the behaviour of the zeros of the family of functions $\{Y_{k}\}_{k\in\mathbb{Z}}$ , where

[TABLE]

as $k$ tends to infinity.

Appendix A The limiting argument in (3.8)

Recall that we assume that $A_{n}\neq 0$ . Define

[TABLE]

One can rewrite (3.8) as

[TABLE]

where

[TABLE]

It follows that $\Pi$ is uniformly bounded by

[TABLE]

Similarly,

[TABLE]

Call $\xi_{1}<\xi_{2}<\cdots<\xi_{n-1}$ the zeros of the function $V_{n}$ in the interval $]\alpha,\beta[$ .

$\bullet$ Assume that $V_{n}(\alpha)\not=0$ and $V_{n}(\beta)\not=0\,$ .

Since $\frac{dV_{n}}{dx}(\xi_{i})\not=0\,$ , there exist $\delta_{1},\varepsilon_{1}>0$ such that $|\frac{dV_{n}}{dx}(x)|\geq\varepsilon_{1}$ for $x\in[\xi_{i}-\delta_{1},\xi_{i}+\delta_{1}]\,$ , and $|V_{n}(x)|\geq\varepsilon_{1}$ in $[\alpha,\beta]\setminus\cup\,]\xi_{i}-\delta_{1},\xi_{i}+\delta_{1}[\,$ .

For $k$ large enough, we have $\omega M,\omega N\leq\varepsilon_{1}/2$ . It follows that in the interval $[\xi_{i}-\delta_{1},\xi_{i}+\delta_{1}]\,$ ,

[TABLE]

Furthermore,

[TABLE]

Since $V_{n}(\xi_{i}+\delta_{1})V_{n}(\xi_{i}-\delta_{1})<0\,$ , we can conclude that the function $V_{n}+\omega\,\Pi$ has exactly one zero in each interval $]\xi_{i}-\delta_{1},\xi_{i}+\delta_{1}[\,$ .

In $[\alpha,\beta]\setminus\cup\,]\xi_{i}-\delta_{1},\xi_{i}+\delta_{1}[\,$ , we have

[TABLE]

which implies that $V_{n}(x)+\omega\,\Pi(x)\not=0\,$ .

$\bullet$ Assume that $V_{n}(\alpha)=0$ and $V_{n}(\beta)\neq 0\,$ . This corresponds to the case $h=+\infty$ and $H\neq+\infty$ . Hence the functions $V_{j}$ , and therefore $\Pi$ , satisfy the Dirichlet condition at $\alpha$ . Since $V^{\prime}_{n}(\alpha)\neq 0$ , there exists $\delta_{1}>0$ such that, for $k$ large enough, $\alpha$ is the only zero of $V_{n}(x)+\omega\,\Pi(x)$ in $[\alpha,\alpha+\delta_{1}]$ .

$\bullet$ The other cases are treated in the same way. ∎

Appendix B Citations from Sturm’s papers

French original and English translation

Citation from [39, Introduction].

On ne sait [ces équations] les intégrer que dans un très petit nombre de cas particuliers hors desquels on ne peut pas même en obtenir une intégrale première ; et lors même qu’on possède l’expression de la fonction qui vérifie une telle équation, soit sous forme finie, soit en série, soit en intégrales définies ou indéfinies, il est le plus souvent difficile de reconnaître dans cette expression la marche et les propriétés caractéristiques de cette fonction. …

S’il importe de pouvoir déterminer la valeur de la fonction inconnue pour une valeur isolée quelconque de la variable dont elle dépend, il n’est pas moins nécessaire de discuter la marche de cette fonction, ou en d’autres termes, d’examiner la forme et les sinuosités de la courbe dont cette fonction serait l’ordonnée variable, en prenant pour abscisse la variable indépendante. Or on peut arriver à ce but par la seule considération des équations différentielles elles-mêmes, sans qu’on ait besoin de leur intégration. Tel est l’objet du présent mémoire. …

One only knows how to integrate these equations in a very small number of particular cases, and one can otherwise not even obtain a first integral; even when one knows the expression of the function which satisfies such an equation, in finite form, as a series, as integrals either definite or indefinite, it is most generally difficult to recognize in this expression the behaviour and the characteristic properties of this function. …

Although it is important to be able to determine the value of the unknown function for an isolated value of the variable it depends upon, it is not less necessary to discuss the behaviour of this function, or otherwise stated, the form and the twists and turns of the curve whose ordinate would be the function, and the abscissa the independent variable. It turns out that one can achieve this goal by the sole consideration of the differential equations themselves, without having to integrate them. This is the purpose of the present memoir. …

Citation from [39, p. 108].

L’intégrale complète de l’équation (I) doit contenir deux constantes arbitraires, pour lesquelles on peut prendre les valeurs de $V$ et de $\frac{dV}{dx}$ correspondantes à une valeur particulière de $x$ . Lorsque ces valeurs sont fixées, la fonction $V$ est entièrement définie par l’équation (I), elle a une valeur déterminée et unique pour chaque valeur de $x$ .

The complete integral of equation (I) must contain two arbitrary constants, for which one can take the values of $V$ and of $\frac{dV}{dx}$ corresponding to some particular value of $x$ . Once these values are fixed, the function $V$ is fully determined by equation (I), it has a uniquely determined value for each value of $x$ .

Citation from [40, p. 379].

M. Liouville a démontré directement ce théorème, qui n’était pour moi qu’un corollaire du précédent, sans s’occuper du cas particulier où la fonction serait nulle à l’une des extrémités de la barre. J’en ai aussi trouvé après lui une autre démonstration directe que je donne dans ce mémoire. M. Liouville a fait usage du même théorème dans un très beau Mémoire qu’il a publié dans le numéro de juillet de son journal et qui a pour objet le développement d’une fonction arbitraire en une série composée de fonctions $V$ que nous avons considérées.

M. Liouville gave a direct proof of this theorem, which for me was a mere corollary of the preceding one, without taking care of the particular case in which the function vanishes at one of the extremities of the bar. I have also found, after him, another direct proof which I give in this memoir. M. Liouville made use of the same theorem in a very nice memoir which he published in the July issue of his journal, and which deals with the expansion of an arbitrary function into a series made of the functions $V$ which we have considered.

Appendix C Sturm’s results under weaker assumptions

We proved Theorems 2.15 and 3.2 under the Assumptions (1.6). In this section, we consider the weaker assumptions

[TABLE]

Under these assumptions, the functions $V_{j}$ are $C^{2}$ on $]\alpha_{0},\beta_{0}[$ . This follows easily, for example, from Liouville’s existence proof [23], and we have the following lemma, whose proof is analogous to the proof of Lemma 2.2.

Lemma C.1.

Let $k\in\mathbb{Z}\,$ .

(1) The function $Y_{k}$ satisfies the boundary conditions (1.2) and (1.3).

(2) The functions $Y_{k}$ and $Y_{k+1}$ satisfy the relation

[TABLE]

(3) Under the Assumptions (C.1), the function $Y_{k}$ cannot vanish identically on an open interval $]\alpha_{1},\beta_{1}[\subset]\alpha_{0},\beta_{0}[$ , unless $Y\equiv 0\,$ .

In Subsection 3.2, we have used Lemma 3.3 (2) which relies on the fact that the functions $V_{j}$ are $C^{\infty}$ . If the functions $V_{j}$ are only $C^{2}$ , we can apply Lemma 3.3 (1). It is easy to conclude that Liouville’s proofs in Subsections 3.2 and 3.3 go through, under the weaker Assumptions (C.1), if we only count distinct zeros, see (2.11). More precisely, we can prove the following claim.

Claim C.2.

Under the Assumptions (C.1), for any $k\in\mathbb{Z}\,$ , if the function $Y_{k}$ has at least $\mu$ distinct zeros in the interval $]\alpha,\beta[\,$ , then the function $Y_{k+1}$ has at least $\mu$ distinct zeros in the interval $]\alpha,\beta[\,$ .

We can then deduce from this claim, as in Section 3, that a linear combination $Y=\sum_{j=m}^{n}A_{j}V_{j}$ has at most $(n-1)$ distinct zeros (in particular it has finitely many zeros).

Once this result is secured, we can define zeros at which $Y$ changes sign (without using the multiplicity), and apply Sturm’s lower bound argument to conclude that the function $Y$ must change sign at least $(m-1)$ times.

Appendix D Sturm’s original o.d.e. proof

The first proof of Theorem 1.4 appears in [40, $\S$ XXV, p. 431], as a corollary of a more profound theorem ( $\S$ XXIV) which describes the behaviour, as $t$ grows from $0$ to infinity, of the zeros of $x\mapsto u(x,t)$ , where $u$ is a solution of the heat equation (4.1)-(4.3).

Sturm proves that the number $N(t)$ of zeros of the function $x\mapsto u(x,t)$ is piecewise constant, non-increasing in $t$ , and that jumps occur precisely for values of $t$ such that $u(x,t)$ and $\frac{\partial u}{\partial t}(x,t)$ have common zeros. We refer to [14] for an analysis of this aspect of Sturm’s paper [40].
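This monotonicity can be observed on a model problem. The sketch below assumes $K=G=1$ , $L=c$ constant, and Dirichlet conditions on $]0,\pi[$ , so that $V_{j}(x)=\sin(jx)$ and $\rho_{j}=j^{2}+c$ , and it assumes the usual separated form $u(x,t)=\sum_{j}A_{j}e^{-\rho_{j}t}V_{j}(x)$ for the solution; the coefficients, the sample times, and the helper names are our own illustrative choices. Sign changes are counted on a grid, which agrees with the number of zeros only when the zeros are simple.

```python
import math

def sign_changes(f, a, b, n_grid=20000):
    """Count sign changes of f over a uniform grid of interior points of ]a, b[."""
    count, prev = 0, 0
    for i in range(1, n_grid):
        v = f(a + (b - a) * i / n_grid)
        s = (v > 0) - (v < 0)
        if s == 0:
            continue
        if prev != 0 and s != prev:
            count += 1
        prev = s
    return count

def u(x, t, coeffs, c=1.0):
    """Model solution u(x, t) = sum_j A_j exp(-(j**2 + c) t) sin(j x) (assumed separated form)."""
    return sum(A * math.exp(-(j * j + c) * t) * math.sin(j * x) for j, A in coeffs.items())

coeffs = {3: 1.0, 6: 5.0}          # initial state u(x, 0) = sin(3x) + 5 sin(6x)
times = [0.0, 0.05, 0.1, 0.2]
counts = [sign_changes(lambda x, t=t: u(x, t, coeffs), 0.0, math.pi) for t in times]
print(counts)
```

Here $u(x,t)\propto\sin(3x)\,\big(1+10\,e^{-27t}\cos(3x)\big)$ , so the printed list should be `[5, 5, 2, 2]`: the count drops from the $5$ zeros of $V_{6}$ to the $2$ zeros of $V_{3}$ when $10\,e^{-27t}$ crosses $1$ , at which time $u$ has multiple zeros in $x$ .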

The second proof, purely o.d.e., is developed in [40, $\S$ XXVI, p. 436 ff]. In this section, we give the main steps of this proof (with page numbers and number of line from top $\ell\downarrow$ , resp. from bottom $\ell\uparrow$ ).

p. 436 $\ell\uparrow 13$ , Sturm mentions Liouville’s proof [24].

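*M. Liouville a démontré directement le théorème du numéro précédent (dans le cahier d’août de son journal) sans employer la considération de la variable auxiliaire $t$ qui entre dans la fonction $u$ (42) dont j’ai fait usage. Il n’a pas tenu compte toutefois de la racine $\mathrm{x}$ ou $\mathrm{X}$ lorsqu’elle existe. Je vais donner ici une autre démonstration directe du même théorème, indépendante de celui du n° XXIV.*

[In English: M. Liouville gave a direct proof of the theorem of the preceding section (in the August issue of his journal) without using the auxiliary variable $t$ which enters the function $u$ (42) that I used. However, he did not take into account the root $\mathrm{x}$ or $\mathrm{X}$ when it exists. I shall give here another direct proof of the same theorem, independent of that of no. XXIV. Here $\mathrm{x}$ and $\mathrm{X}$ are respectively $\alpha$ and $\beta$ with our notation.]

He introduces the linear combination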

[TABLE]

and, p. 436 $\ell\uparrow 1$ , its companion

[TABLE]

p. 437, Sturm establishes the differential relation

[TABLE]

He also notes, $\ell\downarrow 5$ , that the function $Y$ satisfies the boundary conditions (1.2)-(1.3). Sturm’s idea (“Je vais prouver …”) is to prove that the function $Y_{1}$ has at least as many zeros in $]\alpha,\beta[$ , counted with multiplicities, as the function $Y$ in the same circumstances.

p. 437 $\ell\uparrow 10$ , Sturm makes the implicit assumption that the zeros of $Y$ are isolated.

p. 439 $\ell\downarrow 5$ , Sturm states that the number of sign changes of $Y_{1}$ in $]\alpha,\beta[$ is not smaller than the number of sign changes of $Y$ . He then considers the zeros with multiplicities, and implicitly assumes that the function $Y$ (assumed not to be identically zero) does not vanish at infinite order at some point.

p. 440 $\ell\uparrow 13$ , Sturm states that the number of zeros of $Y_{1}$ in $]\alpha,\beta[$ , counted with multiplicities, is not smaller than the number of zeros of $Y$ . He then examines ( $\ell\uparrow 6$ ) the possible zeros of $Y$ at $\alpha$ or $\beta$ .

p. 442 $\ell\downarrow 7$ , Sturm states that the number of zeros of $Y_{1}$ in $[\alpha,\beta]$ , counted with multiplicities (with a special rule for counting multiplicities at $\alpha$ , $\beta$ ), is not smaller than the number of zeros of $Y$ .

p. 442 $\ell\downarrow 13$ , Sturm iterates the procedure (with $Y_{k}$ ), and uses a limiting argument to conclude that the number of zeros of $Y$ in $[\alpha,\beta]$ , counting multiplicities, is at most $p-1$ .

p. 443, Sturm proves the lower bound for the number of zeros and ( $\ell\uparrow 6$ ) compares the present proof with the heat equation proof: the functions $Y_{k}$ are equal to $\frac{d^{k}u}{dt^{k}}(x,0)$ . Finally, in a footnote, he mentions that $Y$ cannot vanish identically unless all the coefficients $C_{j}$ are zero. He does not mention the fact that $Y$ cannot actually vanish at infinite order at any point.

p. 444 $\ell\downarrow 3$ , Sturm explains what to do when no assumption is made on the sign of the function $\ell$ . Taking $Y$ as above, and defining

[TABLE]

where $c$ is a constant, he obtains

[TABLE]

It suffices to assume that the constant $c$ is such that $gc+\ell>0$ and to follow the previous proof with this new definition of $Y_{1}$ .
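The computation behind this remark can be sketched as follows. This is a reconstruction consistent with equation (1.1), not a transcription of Sturm's displayed formulas: it assumes the lowercase notation $g,k,\ell$ for the coefficients, eigenfunctions $V_{j}$ with eigenvalues $\rho_{j}$, and $Y=\sum_{j}C_{j}V_{j}$.

```latex
% Assumed notation: (k V_j')' + (\rho_j g - \ell) V_j = 0, and Y = \sum_j C_j V_j.
\begin{align*}
  Y_{1} &:= \sum_{j} (\rho_{j} + c)\, C_{j} V_{j}, \\
  g\, Y_{1} &= \sum_{j} C_{j}\, \rho_{j}\, g\, V_{j} + c\, g\, Y
            = \sum_{j} C_{j} \bigl( \ell\, V_{j} - (k V_{j}')' \bigr) + c\, g\, Y, \\
  (k\, Y')' &= (g c + \ell)\, Y - g\, Y_{1}.
\end{align*}
```

With $gc+\ell>0$, the coefficient of $Y$ in the last relation has the same sign as $\ell$ had under the positivity assumption (the case $c=0$), which is why the earlier argument carries over.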

Appendix E Cross references to Sturm’s and Liouville’s papers

In this Appendix, we give the references to pages in Sturm’s paper [40, $\S$ XXVI] for the results in our paper.

• Lemma 2.2: p. 437. Note that the third assertion does not appear in Sturm’s paper. He indeed implicitly assumes that the zeros of $Y$ are isolated.
• Lemma 2.4: p. 439.
• Lemma 2.5: p. 440-441.
• Lemma 2.7: p. 437.
• Lemma 2.9: p. 438.
• Proposition 2.11: p. 437-439.
• Proposition 2.12: p. 439-442.
• Proposition 2.14: p. 440-442.
• Theorem 2.15: p. 442.

Here are the pages in Liouville’s paper [24].

• Theorem 3.2: p. 272.
• Lemma 3.3: Mentioned p. 272. No precise statement, no proof provided by Liouville.
• Proof of first assertion. Lemma 3.4: p. 274.
• Proof of second assertion: p. 276 and reference to [23].

Claim 3.5: We use the determinant $\Delta$ to simplify Liouville’s [23, Lemme 1 ${}^{\text{er}}$ , p. 259].

Numbers inserted after a reference indicate the pages where it is cited.

Bibliography

[1] V. Arnold. Topology of real algebraic curves (works of I.G. Petrovsky and their development) [Russian]. Usp. Mat. Nauk 28:5 (1973), 260–262. Translated by O. Viro, in V.I. Arnold, Collected works, Vol. 2, pp. 251–254. Springer 2014.
[2] V. Arnold. Ordinary differential equations. Translated from the 3rd Russian edition by Roger Cooke. Springer-Verlag 1992.
[3] V. Arnold. Topological properties of eigenoscillations in mathematical physics. Proc. Steklov Inst. Math., 273 (2011), 25–34.
[4] P. Bérard and B. Helffer. On Courant’s nodal domain property for linear combinations of eigenfunctions, Part I. arXiv:1705.03731.
[5] P. Bérard and B. Helffer. On Courant’s nodal domain property for linear combinations of eigenfunctions, Part II. arXiv:1803.00449.
[6] P. Bérard and B. Helffer. Sturm’s theorem on the zeros of sums of eigenfunctions: Gelfand’s strategy implemented. arXiv:1807.03990.
[7] M. Bôcher. The published and unpublished work of Charles Sturm on algebraic and differential equations. Bull. Amer. Math. Soc., 18 (1911), 1–18.
[8] M. Bôcher. Leçons sur les méthodes de Sturm dans la théorie des équations différentielles linéaires et leurs développements modernes. Gauthier-Villars et Cie, Éditeurs. Paris 1917.


