This paper investigates the logical strength of Ekeland's variational principle within reverse mathematics, revealing its equivalence to strong systems like $ ext{ACA}_0$ and $ ext{Pi}^1_1$-${ ext{CA}}_0$, especially in localized forms.
Contribution
It establishes the precise logical equivalences of Ekeland's variational principle and its restrictions in the framework of reverse mathematics.
Findings
01
Full variational principle equivalent to $ ext{Pi}^1_1$-${ ext{CA}}_0$
02
Restrictions correspond to $ ext{WKL}_0$ and $ ext{ACA}_0$
03
Localized version with continuous functions also equivalent to $ ext{Pi}^1_1$-${ ext{CA}}_0$
Abstract
We analyze Ekeland's variational principle in the context of reverse mathematics. We find that that the full variational principle is equivalent to Π11-CA0, a strong theory of second-order arithmetic, while natural restrictions (e.g.~to compact spaces or continuous functions) yield statements equivalent to weak K\"onig's lemma (WKL0) and to arithmetical comprehension (ACA0). We also find that the localized version of Ekeland's variational principle is equivalent to Π11-CA0 even when restricting to continuous functions. This is a rare example of a statement about continuous functions having great logical strength.
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsComputability, Logic, AI Algorithms · Advanced Topology and Set Theory · Mathematical and Theoretical Analysis
Full text
Ekeland’s variational principle in weak and strong systems of arithmetic
We analyze Ekeland’s variational principle in the context of reverse mathematics. We find that that the full variational principle is equivalent to Π11\mbox−CA0, a strong theory of second-order arithmetic, while natural restrictions (e.g. to compact spaces or to continuous functions) yield statements equivalent to weak König’s lemma (WKL0) and to arithmetical comprehension (ACA0). We also find that the localized version of Ekeland’s variational principle is equivalent to Π11\mbox−CA0, even when restricted to continuous functions. This is a rare example of a statement about continuous functions having great logical strength.
The field of reverse mathematics, introduced by H. Friedman [Friedman], aims to identify the minimal foundational assumptions required to prove specific results from several mathematical fields, including mathematical analysis.
There are many advantages to determining these minimal assumptions, among them aiding in extracting computational content from theorems whose original proofs were non-constructive.
It is also desirable from a methodological perspective to avoid strong set-theoretic assumptions when possible, and indeed advances in reverse mathematics have shown us that a large portion of known mathematics can be carried out within a relatively small fragment of second-order arithmetic [SimpsonSOSOA].
Second-order arithmetic extends the language of Peano arithmetic by adding variables and quantifiers ranging over sets of natural numbers (see §2); this is sufficient to formalize many familiar concepts from analysis, including real numbers (and, more generally, points in a complete separable metric space, also called a Polish space), continuous functions, and open and closed sets (see §3).
One can then ask which axioms are needed to prove (say) the Stone–Weierstrass theorem in this framework.
Although there are many exceptions, a surprisingly large portion of the theorems analyzed are equivalent to one of a handful of systems of second-order arithmetic. These axiomatic systems are always assumed to extend the ‘base theory’ of reverse mathematics, recursive comprehension (RCA0), and the ones we consider are weak König’s lemma (WKL0), arithmetical comprehension (ACA0), and Π11-comprehension (Π11\mbox−CA0), listed in strictly increasing order of strength.
Each of RCA0, ACA0 and Π11\mbox−CA0 include axioms asserting that sets of the form {n∈N:φ(n)} exist, where in the case of RCA0, φ must express a computable predicate; in the case of ACA0, φ may contain arbitrary quantifiers over natural numbers (but not over sets of natural numbers); and in the case of Π11\mbox−CA0, φ is of the form ∀Xψ(n,X), where X is a second-order variable and ψ(n,X) contains no additional second-order quantifiers. Meanwhile, the theory WKL0 asserts that any infinite binary tree has an infinite branch.
It is known, for example, that the Baire category theorem is provable in RCA0, that the Heine–Borel theorem is equivalent to WKL0, and that the Stone–Weierstrass theorem is equivalent to ACA0. As we will see throughout the text, there are many more examples of theorems equivalent to these theories. On the other hand, statements equivalent to Π11\mbox−CA0 are much more difficult to come by. One example is the Cantor–Bendixson theorem, stating that any closed subset of a Polish space can be written as the union of a countable set and a perfect set.
From a logical perspective, there is a vast gap in strength between the systems ACA0 and Π11\mbox−CA0. ACA0 may be thought of as the second-order analog of the familiar first-order system of Peano arithmetic (PA). Indeed, ACA0 is conservative over PA, meaning that for every first-order statement φ (i.e., for every φ that may contain quantifiers over natural numbers but not over sets of natural numbers), φ is provable in ACA0 if and only if it is provable in PA. Both ACA0 and PA have proof-theoretic ordinal ε0. In computational terms, ACA0 is characterized by the existence of Turing jumps. That is, over RCA0, ACA0 is equivalent to the statement “for every set X, the Turing jump of X is also a set.” Π11\mbox−CA0, on the other hand, is a squarely impredicative system whose proof-theoretic ordinal is far above Γ0. In fact, its proof-theoretic ordinal is far above even the Bachmann–Howard ordinal. In computational terms, Π11\mbox−CA0 is characterized by the existence of hyperjumps. That is, over RCA0, Π11\mbox−CA0 is equivalent to the statement “for every set X, the hyperjump of X is also a set.” See [SimpsonSOSOA] for analyses of the Heine–Borel, Stone–Weierstrass, and Cantor–Bendixson theorems and for characterizations of ACA0 and Π11\mbox−CA0 in terms of jumps and hyperjumps, and see [Pohlers1998] for an ordinal analysis of Π11\mbox−CA0.
Our goal in this article is to study Ekeland’s variational principle [ekeland1974] in the context of reverse mathematics. This principle states that under certain conditions, lower semi-continuous functions on complete metric spaces always attain ‘approximate minima,’ which we call critical points. In his original paper [ekeland1974] and the survey [ekeland1979], Ekeland provides many applications of his variational principle, centered around optimization problems concerning minimal surfaces, partial differential equations, geodesics, the geometry of Banach spaces, control theory, and more. Ekeland’s variational principle has been studied extensively, leading to many variants and extensions (see e.g. [Georgiev1988, bosch2007]). The variational principle can also be used to give an easy proof of Caristi’s fixed point theorem [caristi1976], whose logical strength we analyze in forthcoming work [LC17Talk, PLS2017Talk, ASL18Talk].
A priori, an analysis of Ekeland’s variational principle in reverse mathematics is a natural and interesting project.
Ekeland’s variational principle is a well-known and important result, and understanding its computational content could lead to algorithms for approximating critical points, or at least to determining that no such algorithm exists.
From a technical perspective, lower semi-continuous functions have not yet received much attention in reverse mathematics, and developing a theory of these functions in second-order arithmetic is an interesting endeavor on its own (see §4).
However, a posteriori the analysis of Ekeland’s variational principle is even more interesting and quite surprising: as we will see, natural restrictions of the result (e.g. to compact spaces, to continuous f) yield statements equivalent over RCA0 to each of WKL0, ACA0, and Π11\mbox−CA0, including what is, to the best of our knowledge, the first statement about continuous functions stemming from analysis that is equivalent to Π11\mbox−CA0.
Before diving into formal systems, let us discuss the variational principle in more detail and sketch a proof.
Let R=R∪{±∞}, where we stipulate that sup∅=−∞ and inf∅=∞.
If X is a metric space, recall that a function f:X→R is lower semi-continuous if for every x∈X and every λ<f(x), there is a δ>0 such that whenever d(x,y)<δ, it follows that f(y)>λ. The notion of an upper semi-continuous function is defined dually. Clearly, a function that is both upper and lower semi-continuous is continuous. Additionally, recall the following well-known characterization of semi-continuity.
Lemma 1.1**.**
Let X be a metric space.
A function f:X→R is
(a)
lower semi-continuous if and only if {(x,y)∈X×R:f(x)≤y} is closed;
2. (b)
upper semi-continuous if and only if {(x,y)∈X×R:f(x)≥y} is closed.
If f:X→R≥0 is continuous but X is not compact, then f may not attain its infimum, and since every continuous function is also lower semi-continuous, the same holds of lower semi-continuous functions. Nevertheless, Ekeland’s variational principle states that f has points that are in a sense approximate local minima, and we call these points ε-critical points. For the sake of brevity, we will often refer to lower semi-continuous functions f:X→R≥0 as potentials.
Definition 1.2**.**
Let X be a metric space, let ε>0, and let f:X→R≥0 be a potential. A point x∗∈X is an ε-critical point of f if
[TABLE]
Ekeland’s variational principle then reads as follows.
Theorem 1.3** (Ekeland [ekeland1974]).**
If X is a complete metric space and f:X→R≥0 is any potential, then for every ε>0, f has an ε-critical point x∗.
Moreover, for any x0∈X, we can choose x∗ so that
[TABLE]
In the literature, it is often assumed that f(x0)≤ε+inf(f), in which case (1) is replaced by d(x0,x∗)≤1.
We refer to Theorem 1.3 without (1) as the free variational principle (FVP) and with (1) as the localized variational principle (LVP); note that Ekeland instead calls these the weak and strong principles, respectively.
Let us sketch a proof similar to that presented by Du [Du16], which is based on one given by Brézis and Browder [Brezis1976] in a more general order-theoretic setting.
Suppose that X is a complete metric space, let x0∈X, and let f:X→R≥0 be any potential. For ε>0, define a partial order ≼ε on X given by y≼εx if and only if εd(x,y)≤f(x)−f(y). For x∈X, define S(x)={y∈X:y≼εx}. Then we construct a sequence (xn:n<ω) by letting xn+1∈S(xn) be such that f(xn+1)<inff[S(xn)]+2−n.
It is not too hard to check that (xn:n<ω) is Cauchy and hence converges to some x∗∈X, and, using the fact that f is lower semi-continuous, that f(x∗)≤f(xn) for all n<ω. From this it readily follows that x∗ is ε-critical and satisfies (1); we leave the details to the reader.
As it turns out, there are some issues when attempting to formalize this argument in second-order arithmetic. The sets S(xn) are closed, and finding infima of lower semi-continuous functions on closed sets requires Π11\mbox−CA0 in general. It is easier to formalize the argument when S(xn) is open andf is continuous, and we can achieve this by replacing S(xn) by a suitable open set. However, this modification typically causes (1) to fail.
In fact, we obtain that the FVP for continuous functions is equivalent to ACA0 (§7), while the LVP is equivalent to Π11\mbox−CA0, even when f is assumed to be continuous (§10). This last result is particularly interesting because it gives an example of a statement about continuous functions that is equivalent to Π11\mbox−CA0; typically the mathematics of continuous functions can be carried out in ACA0 and below.
By tweaking other parameters in the statement, for example taking X to be compact or even to be a specific space such as the unit interval, we obtain several variants of the variational principle, each equivalent to one of WKL0, ACA0 or Π11\mbox−CA0.
The equivalence between these theories and variational principles is established over RCA0, and hinges on known equivalences, e.g. the equivalence between ACA0 and the fact that every increasing sequence of rationals bounded above has a supremum. The main reversals are found in §9.
A full summary of our main results is found in §11.
2. Subsystems of second-order arithmetic
In this section we review the subsystems of second-order arithmetic we will be working with, loosely following the presentation of Simpson [SimpsonSOSOA].
Let us first settle conventions regarding syntax. The language of second-order arithmetic consists of first-order variables intended to range over natural numbers; second-order variables intended to range over sets of natural numbers; constant symbols 0 and 1; 2-ary function symbols + and ×; and 2-ary relation symbols =, <, and ∈. Terms and formulas are built from variables, constant symbols, function symbols, relation symbols, propositional connectives (¬, ∧, ∨, etc.), and the quantifiers ∀ and ∃. The equality relation symbol is restricted to first-order objects. Equality for second-order objects is defined via ∈, and we write X=Y as an abbreviation for ∀n(n∈X↔n∈Y). Similarly, we write X⊆Y as an abbreviation for ∀n(n∈X→n∈Y).
We use Δ00 to denote the class of all formulas, possibly with parameters, where no second-order quantifiers appear and all first-order quantifiers are bounded; that is, of the form ∀x(x<t→φ) or ∃x(x<t∧φ). We simultaneously define Σ00=Π00=Δ00, and we recursively define Σn+10 to be the set of all formulas of the form ∃xφ with φ∈Πn0 and similarly define Πn+10 to be the set of all formulas of the form ∀xφ with φ∈Σn0. We denote by Πω0 the union of all Πn0; these are the arithmetical formulas.
The classes Σn1 and Πn1 are defined analogously by counting alternations of second-order quantifiers and setting Σ01=Π01=Δ01=Πω0.
We use Robinson arithmetic Q as our background theory, which essentially consists of the axioms of PA without induction (see [HajekPudlak:1993:Metamathematics, Definition I.1.1]). When added to the basic axioms of Q, the following schemes axiomatize many of the theories we consider. Below, Γ denotes a class of formulas.
•
\Gamma\mbox{-}{\rm CA}:\ \exists X\forall x\ \big{(}x\in X\leftrightarrow\varphi(x)\big{)}, where φ∈Γ and X is not free in φ;
•
{\Delta}^{0}_{1}\mbox{-}{\rm CA}:\ \forall x\big{(}\pi(x)\leftrightarrow\sigma(x)\big{)}\rightarrow\exists X\forall x\ \big{(}x\in X\leftrightarrow\sigma(x)\big{)}, where σ∈Σ10, π∈Π10, and X is not free in σ;
Recursive comprehension, the base theory of second-order arithmetic, is defined as
[TABLE]
We also define the theory of arithmetical comprehension ACA0:RCA0+Σ10-CA and the much stronger theory Π11\mbox−CA0:ACA0+Π11-CA. This gives us three of the four important theories that we consider; the fourth, WKL0, requires formalizing trees in second-order arithmetic.
RCA0 suffices to define a bijective pairing function ⟨⋅,⋅⟩:N2→N that is increasing in both coordinates, such as ⟨x,y⟩=(x+y)(x+y+1)/2. We also use ⟨⋅,⋅⟩ to denote a pairing function on sets, which may also be defined in a standard way. With this, a binary relation is a pair of sets ⟨A,R⟩ with A,R⊆N, where the elements of R are of the form ⟨n,m⟩ with n,m∈A. When a binary relation F is meant to represent a function, we write y=F(x) instead of ⟨x,y⟩∈F.
RCA0 also suffices to implement the typical codings of finite sequences of natural numbers as natural numbers (see [SimpsonSOSOA, Section II.2], for example). The set of all finite sequences of natural numbers is denoted N<N. We write σ⊑τ if σ is an initial segment of τ, σ⊏τ if σ is a proper initial segment of τ, and define ↓σ={τ∈N<N:τ⊑σ}. If x:N→N and n∈N, we write x↾n for the finite sequence (x(i))i<n. We extend the use of ⊏ by writing σ⊏x whenever σ=x↾n for some n.
Definition 2.1**.**
A tree is a set T⊆N<N such that ∀σ(σ∈T→↓σ⊆T). We say that T is a binary tree if T⊆{0,1}<N, that is, if all entries appearing in elements of T are either [math] or 1. When T⊆N<N is a tree, we say that an infinite sequence x is a path through T if ∀n(x↾n∈T). The collection of all paths through T is denoted [T].
Say that a set X⊆N is finite if it is bounded above, and say that X is infinite otherwise. With this, we define the axiom WKL to be the natural formalization of the following theorem.
Theorem 2.2** (König).**
Every infinite binary tree has an infinite path.
For reference, we list the theories we have defined in increasing order of strength:
[TABLE]
The standard reference for second-order arithmetic is Simpson’s [SimpsonSOSOA], and we refer the reader to it for a complete treatment of all the material mentioned above.
3. Metric spaces in second-order arithmetic
Part of the appeal of second-order arithmetic as a foundational system for mathematics is that it suffices to develop a large part of mathematical analysis, particularly when dealing with complete separable metric spaces. However, this requires some coding machinery. In this section, we review this machinery and establish notation that will be used throughout.
First, we assume that Q is represented in some standard way using e.g. pairs of natural numbers and that R is represented by rapidly converging Cauchy sequences of rationals as in [SimpsonSOSOA, Section II.4]. We use the notation Q>0={q∈Q:q>0}, and we define Q≥0, R≥0, etc. analogously.
Definition 3.1** (RCA0; see [SimpsonSOSOA, Definition II.5.1]).**
A (code for a) complete separable metric space X=X is a non-empty set X⊆N together with a sequence of real numbers d:X×X→R≥0 such that d(a,a)=0, d(a,b)=d(b,a)≥0, and d(a,b)+d(b,c)≥d(a,c) for all a,b,c∈X. A point of X is a sequence x=(xi)i∈N of elements of X such that for all i≤j, d(xi,xj)≤2−i. We write x∈X to mean that
x is a point of X.
We identify a∈X with the sequence (a)i∈N and consider X as a dense subset of X.
We set d(x,y)=limn→∞d(xn,yn), and write x=Xy if d(x,y)=0 (subscripts will be omitted if there is no confusion).
We use either notation X or X to denote complete separable metric spaces. The symbol X emphasizes that the space is coded by the dense set X (and the metric on X). The following spaces appear throughout.
(1)
The real line, R, with dense set the rational numbers, Q, equipped with the usual metric.
Closed subintervals of R may be represented similarly, where the dense set for [a,b] is Q∩[a,b].
2. (2)
The Baire space, with dense set the set of sequences x∈NN that are eventually zero and
[TABLE]
3. (3)
The Cantor space, which is {0,1}N (also denoted 2N) seen as a subspace of the Baire space.
4. (4)
\mathcal{C}\big{(}[a,b]\big{)} with a<b rational numbers. The dense set is given by piecewise linear continuous functions f:[a,b]→R with rational breakpoints, each represented by finitely many pairs ⟨x,f(x)⟩∈Q×Q. The metric is d(f,g)=maxx∈[a,b]∣f(x)−g(x)∣.111We emphasize that \mathcal{C}\big{(}[a,b]\big{)} is the space of uniformly continuous functions f:[a,b]→R that have moduli of uniform continuity in the sense of [SimpsonSOSOA, Definition IV.2.1]. The statement “every continuous function f:[a,b]→R has a modulus of uniform continuity” is equivalent to WKL0 over RCA0.
In the cases of the Baire space and the Cantor space, a sequence that is eventually zero may be represented via an appropriate finite initial segment and hence as a natural number. If σ is a finite sequence, then where convenient we identify σ with σ⌢0N, which is the infinite sequence beginning with σ and is then identically zero. We remark also that the official dense set for \mathcal{C}\big{(}[a,b]\big{)} in [SimpsonSOSOA] is the set of polynomial functions f:[a,b]→Q with rational coefficients, but piecewise linear functions are more convenient for us. The two presentations are equivalent over RCA0 (see [SimpsonSOSOA, Example II.10.3 and Lemma IV.2.4]).
Definition 3.2** (RCA0; see [SimpsonSOSOA, Definition II.5.6]).**
Let X be a complete separable metric space. A (code for a) rational open ball Br(a) is an ordered pair ⟨a,r⟩, with a∈X and r∈Q>0. We define Br(a)\subsetplusBq(b) if d(a,b)+r<q.
A (code for an) open setU in X is a set U⊆N×X×Q>0, where a point x∈X is said to belong to U (abbreviated x∈U) if it satisfies the Σ10 condition ∃n∃a∃r(d(x,a)<r∧(n,a,r)∈U). A (code for a) closed set C is the same as a code for its complement X∖C, except we define x∈C if x∈X∖C.
The intuition behind the above definition is that a code for an open set is an enumeration of rational open balls. If U⊆N×X×Q>0 codes an open set, then the rational open ball Br(a) appears in the enumeration if there is an n with (n,a,r)∈U.
We remark that for X⊆R, we may represent basic open sets in the form (a−r,a+r) rather than Br(a), and we will often prefer this representation. On occasion we will regard closed balls (or, more precisely, closures of balls) as metric spaces in their own right, in which case if Br(a) is a ball in X, then Br(a) is the subspace with dense set Br(a)∩X.
We may also reason about compactness within second-order arithmetic.
Definition 3.3** (RCA0; see [SimpsonSOSOA, Definition III.2.3]).**
A complete separable metric space X is compact if there is a sequence of finite sequences ((xi,j:j<ni):i∈N) of points in X such that
[TABLE]
We say that X is Heine–Borel compact if for every sequence (Brk(ak):k∈N) of rational open balls such that (∀x∈X)(∃k∈N)(x∈Brk(ak)), there is an N∈N such that (∀x∈X)(∃k<N)(x∈Brk(ak)).
What we call ‘compact’ might be more properly called ‘uniformly (or effectively) totally bounded,’ but when working with complete separable metric spaces in second-order arithmetic, the convention is to use ‘compact’ for this notion, and we do not wish to deviate. WKL0 is required to show that every compact (in the above sense) complete separable metric space is Heine–Borel compact.
Theorem 3.4** (see [SimpsonSOSOA, Theorem IV.1.2 and Theorem IV.1.5]).**
The following are equivalent over RCA0.
(1)
WKL0.
2. (2)
Every compact complete separable metric space is Heine–Borel compact.
3. (3)
The unit interval [0,1] is Heine–Borel compact.
Sequential compactness is an even stronger notion. ACA0 is required to show that every compact space is sequentially compact.
Theorem 3.5** (see [SimpsonSOSOA, Theorem III.2.2 and Theorem III.2.7]).**
The following are equivalent over RCA0.
(1)
ACA0.
2. (2)
Every infinite sequence of points in a compact complete separable metric space has a convergent subsequence.
3. (3)
Every infinite sequence of points in [0,1] has a convergent subsequence.
4. (4)
Every increasing (or decreasing) sequence of points in [0,1] converges.
5. (5)
Every increasing (or decreasing) sequence of rational points in [0,1] converges.
Note that the equivalence with (5) is not stated explicitly in [SimpsonSOSOA], but it follows from the proof of Theorem III.2.2 given there. We conclude this section by defining continuous functions between complete separable metric spaces. The idea is to code a continuous function f:X→Y by an enumeration of (codes for) pairs of open balls ⟨Br(a),Bq(b)⟩, where Br(a)⊆X and Bq(b)⊆Y. If the pair ⟨Br(a),Bq(b)⟩ appears in the enumeration, then this means that f maps Br(a) into the closure of Bq(b).
Let X=X and Y=Y be complete separable metric spaces. A continuous partial functionf:X→Y is coded by a set Φ⊆N×X×Q>0×Y×Q>0 that satisfies the properties below. Let us write Br(a)→ΦBs(b) for \exists n\ \big{(}(n,a,r,b,s)\in\Phi\big{)}. Then, for all a,a′∈X, all b,b′∈Y, and all r,r′,s,s′∈Q>0, Φ must satisfy:
(cf1)
if Br(a)→ΦBs(b) and Br(a)→ΦBs′(b′), then d(b,b′)≤s+s′;
2. (cf2)
if Br(a)→ΦBs(b) and Br′(a′)\subsetplusBr(a), then Br′(a′)→ΦBs(b);
3. (cf3)
if Br(a)→ΦBs(b) and Bs(b)\subsetplusBs′(b′), then Br(a)→ΦBs′(b′).
A point x∈X is in the domain of the function f coded by Φ if, for every ε>0, there are Br(a)→ΦBs(b) such that d(x,a)<r and s<ε. If x∈dom(f), we define the value f(x) to be the unique point y∈Y such that d(y,b)≤s for all Br(a)→ΦBs(b) with d(x,a)<r.
A continuous function is a continuous partial function f:X→Y with dom(f)=X.
In case Y=R, we often write Br(a)→Φ(u,v) for Br(a)→ΦBs(b) with u=b−s and v=b+s.
To reason about the values of coded continuous functions, it often helps to think in the following way. Suppose that Φ codes a continuous function f:X→Y. If x∈X and y∈Y are such that for every ε>0 there are Br(a)→ΦBs(b) with x∈Br(a), y∈Bs(b), and s<ε, then f(x)=y.
The following lemmas are useful for constructing codes of open sets, closed sets, and continuous functions.
Lemma 3.7** ([SimpsonSOSOA, Lemma II.5.7]).**
For a given Σ10 formula φ(x), the following is provable within RCA0.
Let X=X be a complete separable metric space.
If x=Xy implies φ(x)↔φ(y), then there exists (a code for) an open set U⊆X such that φ(x)↔x∈U.
This lemma guarantees that a Σ10-definable subset of a complete separable metric space is an open set, and thus a Π10-definable subset is a closed set.
Note that this lemma holds uniformly. In other words, if φ(x,i) is a Σ10 formula that defines a subset of X for each i∈N, then there exists a sequence of codes for open sets (Ui)i∈N such that φ(x,i)↔x∈Ui.
Intuitively, the following lemma states that if a function is uniformly continuous in an effective way and the values of the function can be computed on a dense set (e.g., if f(x) is provided by elementary functions, by power series, etc.), then there is a code for the function. This property also holds restricted to any open set.
Lemma 3.8** (RCA0).**
Let X=X and Y=Y be complete separable metric spaces, and let U⊆X be an open set.
Assume that (⟨ai,yi⟩:i∈N) is a sequence of points in X×Y and that h:N→N is a function such that
•
(ai)i∈N* enumerates all points in U∩X,*
•
dX(ai,aj)<2−h(n)* implies dY(yi,yj)<2−n for all i,j,n∈N.*
Then, there exists (a code for) a continuous function f:U→Y such that f(ai)=yi for all i∈N. (In fact, h is a modulus of uniform continuity for f; see [SimpsonSOSOA, Definition IV.2.1].)
Proof.
Let U be the code for U.
Define a code Φ for a continuous partial function f:X→R so that Br(a)→ΦBs(b) if and only if the quadruple (a,r,b,s)∈X×Q>0×Y×Q>0 satisfies
•
Br(a)\subsetplusBr′(a′) for some (n,a′,r′)∈U, and
•
Br(a)\subsetplusB2−h(n)−1(ai) and B2−n(yi)\subsetplusBs(b) for some n,i∈N.
The code Φ exists because the above conditions can be described by a Σ10 formula.
It is easy to see that Φ satisfies (cf2) and (cf3) of Definition 3.6.
We show that Φ also satisfies (cf1).
Assume that Br(a)→ΦBs(b) and Br(a)→ΦBs′(b′), where in each case the second condition is satisfied by n,ai,yi and m,aj,yj, respectively.
Without loss of generality, we may assume that n≤m and that h is non-decreasing.
Since a∈B2−h(n)−1(ai)∩B2−h(m)−1(aj), we have dX(ai,aj)<2−h(n).
Then dY(yi,yj)<2−n, and thus yj∈B2−n(yi)⊆Bs(b).
Hence yj∈Bs(b)∩Bs′(b′), which implies that dY(b,b′)≤s+s′.
Finally, we check that U=dom(f) and f(ai)=yi.
If x∈/U, then x∈/dom(f) by the first condition, thus we have dom(f)⊆U.
For the converse, assume that x∈U and ε>0.
Take (n′,a′,r′)∈U so that x∈Br′(a′) and n∈N so that 2−n+1<ε, and then take r∈Q>0 so that B2r(x)\subsetplusBr′(a′) and r<2−h(n)−1.
Pick a point a=ai∈U∩X so that dX(x,a)<r and a point b∈Y so that dY(yi,b)<2−n, and put s=2−n+1. Then we have Br(a)→ΦBs(b) and x∈Br(a). Since ε was arbitrary and s<ε, this means that x∈dom(f).
If x=aj for some j∈N in the above, then i can be always chosen to be equal to j, and thus one can find Br(a)→ΦBs(b) with aj∈Br(a), yj∈Bs(b) and s<ε for all ε>0. This means that f(aj)=yj for all j∈N.
∎
We also remark that continuous partial functions can be patched together, provided they are mutually compatible.
Lemma 3.9** (RCA0; [Y-TMP, Lemma 3.31]).**
Let X=X and Y=Y be complete separable metric spaces.
Assume that (Ui,fi)i∈N is a sequence of pairs of (codes for) open sets Ui⊆X and continuous functions fi:Ui→Y such that fi(x)=fj(x) for any i,j∈N and x∈Ui∩Uj.
Then, there exists (a code for) a continuous function fˉ:⋃i∈NUi→Y such that fˉ(x)=fi(x) for all i∈N and x∈Ui.
Notice that the above lemmas also hold uniformly. For Lemma 3.8, if we are given a sequence (⟨ai,j,yi,j⟩i∈N,Uj,hj:j∈N), where ⟨ai,j,yi,j⟩i∈N, Uj and hj satisfy the conditions of Lemma 3.8 for all j∈N, then we may obtain a sequence of continuous functions (fj:Uj→Y)j∈N.
Finally, we consider products of finitely many metric spaces (Xi:i<n). In [SimpsonSOSOA, Example II.5.4], product spaces are defined via the Euclidean metric, but for our purposes it is more convenient to use the ‘max’ metric instead.
Definition 3.10** (RCA0; see [SimpsonSOSOA, Example II.5.4]).**
Let (Xi:i<n) be complete separable metric spaces coded by dense sets (Xi:i<n) and metrics (di:i<n). The product spaceX=∏i<nXi is the complete separable metric space coded by the dense set X=∏i<nXi and the metric d:X×X→R where
[TABLE]
Products of metric spaces will be very useful to us.
For example, we may use them to define Banach spaces in the context of second-order arithmetic: these are simply metric spaces X equipped with a designated element 0∈X and total, continuous functions +:X×X→X and ⋅:R×X→X such that if we define ∥x∥=d(x,0), then (X,0,+,⋅,∥⋅∥) satisfies the standard definition of a normed vector space.
On occasion we will make free use of the fact that some of the spaces we consider, such as {\mathcal{C}}\big{(}[0,1]\big{)}, come equipped with a standard Banach space structure.
If X=∏i<nXi is the product of the complete separable metric spaces (Xi:i<n), then, officially, the basic open sets in X are those of the form Br(a), where a∈∏i<nXi and r∈Q>0. However, when working with X, it is often more convenient to think of the basic open sets as being the sets of the form ∏i<nBri(ai), where each Bri(ai) is a basic open set in Xi. The following lemma says that in RCA0 we can uniformly translate between the two styles of basic open sets when coding open sets. Therefore we may always use the style of open set that is most convenient.
Lemma 3.11** (RCA0).**
Let (Xi:i<n) be complete separable metric spaces coded by dense sets (Xi:i<n) and metrics (di:i<n). Let X=∏i<nXi and d code X=∏i<nXi as in Definition 3.10.
(i)
There is a function f:∏i<n(Xi×Q>0)×N→X×Q>0 such that, for every
[TABLE]
f(a0,r0,…,an−1,rn−1,⋅)* enumerates an official code for the unofficial basic open set*
[TABLE]
(ii)
For every ⟨a,r⟩∈X×Q>0, Br(a)=∏i<nBr(ai). Thus the function g:X×Q>0→∏i<n(Xi×Q>0) given by g(a,r)=(a0,r,…,an−1,r) translates the code ⟨a,r⟩ for the official basic open set Br(a) to the code (a0,r,…,an−1,r) for the equivalent unofficial basic open set ∏i<nBr(ai).
Proof.
(i) The function f exists by Lemma 3.7 because the unofficial basic open set ∏i<nBri(ai) is Σ10 uniformly in (a0,r0,…,an−1,rn−1).
(ii) Let x∈X. Then x∈Br(a) if and only if d(a,x)<r if and only if maxi<ndi(ai,xi)<r, if and only if (∀i<n)(di(ai,xi)<r), if and only if x∈∏i<nBr(ai). Thus Br(a)=∏i<nBr(ai).
∎
4. Semi-continuous functions in second-order arithmetic
Although continuous functions have been extensively studied in the context of second-order arithmetic, lower semi-continuous functions have received less attention.
Fortunately, they admit a natural representation in the spirit of Definition 3.6.
Definition 4.1** (RCA0).**
Let X=X be a complete separable metric space. A code for a lower semi-continuous functionf:X→R is a set Ψ⊆N×X×Q>0×Q that satisfies the properties below. Let us write Br(a)⇁Ψq for ∃n((n,a,r,q)∈Ψ). Then, for all a,a′∈X, all q,q′∈Q, and all r,r′∈Q>0, Ψ must satisfy:
(lsc1)
if Br(a)⇁Ψq and Br′(a′)\subsetplusBr(a), then Br′(a′)⇁Ψq, and
2. (lsc2)
if Br(a)⇁Ψq and q′<q, then Br(a)⇁Ψq′.
For x∈X and y∈R we define f(x)=y if
[TABLE]
The relation f thus defined is a lower semi-continuous function if (∀x∈X)(∃y∈R)(f(x)=y).
The support of f, denoted supp(f), is the collection {x∈X:f(x)∈R}. We do not assume that supp(f) exists as any kind of coded set in RCA0. An assertion of the form x∈supp(f) is to be taken as an abbreviation for (∃y∈R)(f(x)=y).
If b∈Q is such that for every a∈X and r∈Q>0 there is m∈N such that (m,a,r,b)∈Ψ, we say that Ψ is a code of a lower semi-continuous function f:X→R≥b.
If b=0, we call the function coded by Ψ a potential.
Suppose that X is a complete separable metric space. As with Definition 3.6, the idea behind Definition 4.1 is that Ψ enumerates pairs ⟨Br(a),q⟩ with the property that if f is the function being coded by Ψ, then f maps Br(a)∩dom(f) into [q,∞]. One may define upper semi-continuous partial functions from X to R by appropriately dualizing Definition 4.1, in which case we write Br(a)⇀Ψq for the dual of Br(a)⇁Ψq.
We remark that according to our definition, any code for a lower semi-continuous function defines a function f:X→R, as, recalling that sup∅=−∞, we see that every subset of R has a supremum in R.
However, this fact is not provable in RCA0.
For this reason, lower semi-continuous functions are defined with the explicit assumption that f is defined everywhere.
See Remark 4.4 for further discussion.
Next we show that RCA0 proves a version of Lemma 1.1, which we take as evidence that Definition 4.1 is a reasonable definition of semi-continuity for use in RCA0.
Indeed, we prove an RCA0 version of the fact that a function f:X→R is lower semi-continuous if and only if {⟨x,y⟩∈X×R:f(x)≤y} is closed.
Proposition 4.2** (RCA0).**
Let X be a complete separable metric space.
(i)
If f:X→R is lower semi-continuous, then {⟨x,y⟩∈X×R:f(x)≤y} is closed.
2. (ii)
If C⊆X×R is a closed set such that
•
for every x∈X and y,z∈R, if ⟨x,y⟩∈C and z≥y, then ⟨x,z⟩∈C, and
•
for every x∈X, inf{y∈R:⟨x,y⟩∈C} exists
then there is a lower semi-continuous function f:X→R such that C={⟨x,y⟩∈X×R:f(x)≤y}.
Proof.
(i)
Let f be coded by Φ.
Then f(x)>y if and only if there exists (a,r,q)∈X×Q>0×Q such that Br(a)⇁Φq∧d(x,a)<r∧y<q. Thus, {⟨x,y⟩∈X×R:f(x)>y} is Σ10-definable, hence it is open by Lemma 3.7.
(ii) Let C be a closed set as in the statement of (ii), and let U be the complement of C. Enumerate a code Φ for a lower semi-continuous function f by enumerating Br(a)⇁Φq whenever U enumerates an open set Bs(b)×(u,v) with Br(a)\subsetplusBs(b) and q<v. We show that, for every x∈X, f(x) exists and equals inf{z∈R:⟨x,z⟩∈C} (which exists by the assumptions on C). So let x∈X, and let y=inf{z∈R:⟨x,z⟩∈C}. We need to show that
[TABLE]
Suppose that q∈Q is such that there is a ball Br(a)⊆X with x∈Br(a) and Br(a)⇁Φq. By the definition of Φ, there is a Bs(b)×(u,v)⊆U where Br(a)\subsetplusBs(b) and q<v. From this we can conclude that if z∈R is such that ⟨x,z⟩∈C, then q<z. If not, then by the first assumption on C, there would be a z such that ⟨x,z⟩∈C and z∈(u,v). This implies that ⟨x,z⟩∈Bs(b)×(u,v)⊆U, which contradicts that U is the complement of C. Thus q≤y because y=inf{z∈R:⟨x,z⟩∈C}.
We have shown that y is an upper bound on the q∈Q for which there is a Br(a)⊆X with x∈Br(a) and Br(a)⇁Φq. We need to show that y is least. So consider a z<y. Then ⟨x,z⟩∈/C because y=inf{z∈R:⟨x,z⟩∈C}. Then there is a Bs(a)×(u,v) enumerated into U that contains ⟨x,z⟩. Let r<s be such that x∈Br(a)\subsetplusBs(a), and let q∈Q∩(z,v). Then Br(a)⇁Φq by the definition of Φ. Thus x∈Br(a) and Br(a)⇁Φq, but z<q. So z is not an upper bound on the q∈Q for which there is a Br(a)⊆X with x∈Br(a) and Br(a)⇁Φq. Thus y is indeed the required supremum.
We have shown that, for all x∈X, f(x)=inf{z∈R:⟨x,z⟩∈C}. It is now easy to see that, for ⟨x,y⟩∈X×R, f(x)≤y if and only if inf{z∈R:⟨x,z⟩∈C}≤y if and only if ⟨x,y⟩∈C.
∎
The analogous properties of upper semi-continuous functions can be proved by dualizing Proposition 4.2.
The next proposition is an RCA0 version of the fact that a function f:X→R is continuous if and only if it is upper semi-continuous and lower semi-continuous.
Proposition 4.3** (RCA0).**
Let X be a complete separable metric space.
(i)
If f:X→R is continuous, then there are a lower semi-continuous g:X→R and an upper semi-continuous h:X→R such that (∀x∈X)(f(x)=g(x)=h(x)).
2. (ii)
If g:X→R is lower semi-continuous, h:X→R is upper semi-continuous, and (∀x∈X)(g(x)=h(x)), then there is a continuous partial f:X→R with dom(f)=supp(g)=supp(h) such that (∀x∈dom(f))(f(x)=g(x)=h(x)).
Proof.
(i) Let Φ be a code for a continuous f:X→R. Define a code Γ for a lower semi-continuous g:X→R by enumerating Br(a)⇁Γq whenever Bs(b)→Φ(u,v) for a Bs(b) with Br(a)\subsetplusBs(b) and a (u,v) with q<u. It is then easy to check that g(x) is defined and equal to f(x) for all x∈X. The upper semi-continuous partial function h:X→R such that (∀x∈dom(f))(f(x)=h(x)) is defined dually.
(ii) Let Γ be a code for a lower semi-continuous g:X→R and let Ψ be a code for an upper semi-continuous h:X→R such that (∀x∈X)(g(x)=h(x)). Define a code Φ for a continuous partial function f:X→R by enumerating Br(a)→Φ(u,v) whenever there are Bs(b),Bt(c)⊆X and p,q∈Q such that Br(a)\subsetplusBs(b), Br(a)\subsetplusBt(c), [p,q]⊆(u,v), Bs(b)⇁Γp, and Bt(c)⇀Ψq. It is easy to see that Φ satisfies (cf2) and (cf3) of Definition 3.6. To see that Φ satisfies (cf1) of Definition 3.6, first observe that if Br(a)→Φ(u,v), then g(a)>u and h(a)<v. However, g(a)=h(a), so their common value must be finite and in the interval (u,v). Therefore, if Br(a)→Φ(u,v) and Br(a)→Φ(u′,v′), then (u,v) and (u′,v′) both contain g(a), and therefore (u,v) and (u′,v′) intersect.
We show that dom(f)=supp(g)=supp(h) and that (∀x∈dom(f))(f(x)=g(x)=h(x)). First, suppose that x∈/supp(g). Then g(x)=±∞. Suppose that g(x)=−∞. The only way this can happen is if there is no Bs(b)⇁Γp with x∈Bs(b). But if there is no Bs(b)⇁Γp with x∈Bs(b), then there is also no Br(a)→Φ(u,v) with x∈Br(a). Thus x∈/dom(f). If instead g(x)=∞, then also h(x)=∞. In this case, there can be no Bt(c)⇀Ψq with x∈Bt(c), which similarly implies that x∈/dom(f).
Now suppose that x∈supp(g), and let ε∈Q>0. Let (u,v)⊆R be such that g(x)∈(u,v) and v−u<ε. By the definition of g(x), there are a Bs(b)⊆X and a p∈(u,v) such that x∈Bs(b) and Bs(b)⇁Γp. Likewise, as h(x)=g(x)∈(u,v), there are a Bt(c)⊆X and a q∈(u,v) such that x∈Bt(c) and Bt(c)⇀Ψq. Notice then that p≤g(x)≤q. Let Br(a)⊆X be such that x∈Br(a), Br(a)\subsetplusBs(b), and Br(a)\subsetplusBt(c). Then Bs(b), Bt(c), p, and q witness that Br(a)→Φ(u,v). Thus for every ε∈Q>0, there are a Br(a)⊆X with x∈Br(a) and a (u,v)⊆R with v−u<ε and g(x)∈(u,v). This means that x∈dom(f) and f(x)=g(x)=h(x).
∎
Remark 4.4**.**
Our definition of the values of lower semi-continuous functions by suprema of rationals is similar to the definition of Borel measures within RCA0 in [SimpsonSOSOA, Section X.1]. As with Borel measures, one could understand the values of lower semi-continuous functions in a comparative way instead of requiring that the defining suprema exist. That is, if f is lower semi-continuous, then it is still possible to make sense of inequalities like f(x)≤r in RCA0 even if the supremum defining f(x) does not exist. Indeed, in ω-models, the values of lower semi-continuous functions are defined as (relative) left-c.e. reals. With this perspective, Proposition 4.2 is still true without the existence of infima. The proof of Proposition 4.3 shows that a code Ψ for a partial continuous function g:X→R can be considered as a pair of codes (Ψ−,Ψ+) for lower and upper semi-continuous functions by putting Br(a)⇁Ψ−u and Br(a)⇀Ψ+v if Br(a)→Ψ(u,v), and then x∈dom(g) if and only if the values at x defined by Ψ− and Ψ+ coincide. If the values coincide in a comparative sense, then it exists as a real number within RCA0. Similar modifications may be available for other theorems presented here.
5. Honestly-coded potentials
Recall that we refer to lower semi-continuous functions f:X→R≥0 as potentials. According to Definition 4.1, for X a complete separable metric space, Φ a code for a lower semi-continuous function f:X→R, Br(a)⊆X, and q∈Q, we know that f(x)≥q for all x∈Br(a) if it is the case that Br(a)⇁Φq. We call a code Φ for fhonest if it contains all of the information of this sort. That is, if the ‘if’ is an ‘if and only if.’
Definition 5.1** (RCA0).**
Let X be a complete separable metric space and let f:X→R be lower semi-continuous. A code Φ for f is called honest if, for every Br(a)⊆X and every q∈Q, Br(a)⇁Φq if and only if (∀x∈Br(a))(f(x)≥q). If f has an honest code, then we say that f is honestly-coded.
Notice that if f:X→R is lower semi-continuous and Φ⊆N×X×Q>0×Q is any set such that, for every Br(a)⊆X and every q∈Q, Br(a)⇁Φq if and only if (∀x∈Br(a))(f(x)≥q), then Φ is automatically a code (and hence an honest code) for f as in Definition 4.1.
Every lower semi-continuous function admits an honest code, although such a code cannot always be constructed in a weak theory.
Lemma 5.2**.**
Let X be a complete separable metric space.
(i)
(WKL0)* If X is compact and f:X→R is lower semi-continuous, then there is a code Φ for f such that, for every Br(a)⊆X and every q∈Q, Br(a)⇁Φq if and only if (∀x∈Br(a))(f(x)>q).*
2. (ii)
(ACA0)* If X is compact and f:X→R is lower semi-continuous, then f has an honest code.*
3. (iii)
(ACA0)* If f:X→R is continuous, then there is an honestly-coded lower semi-continuous g:X→R such that (∀x∈X)(g(x)=f(x)).*
4. (iv)
(Π11\mbox−CA0)* If f:X→R is lower semi-continuous, then f has an honest code.*
Proof.
(i) Work in WKL0. For ⟨a,r⟩∈X×Q and a sequence (⟨bi,si⟩:i<n) of elements of X×Q, the assertion that Br(a)⊆⋃i<nBsi(bi) is Σ10 uniformly in ⟨a,r⟩ and (⟨bi,si⟩:i<n), essentially by a uniform version of [SimpsonSOSOA, Theorem IV.1.7].
Let Ψ be a code for f. Define a new code Φ by enumerating Br(a)⇁Φq whenever there are balls (Bsi(bi):i<n) and rationals ⟨ti:i<n⟩ such that ti>q and Bsi(bi)⇁Ψti for each i<n and Br(a)⊆⋃i<nBsi(bi).
Suppose that Br(a)⇁Φq. Then Br(a) is covered by balls (Bsi(bi):i<n) where f(x)>q for all x∈Bsi(bi) and all i<n. Thus f(x)>q for all x∈Br(a). Conversely, suppose that f(x)>q for all x∈Br(a). Then any enumeration (Bsi(bi):i∈N) of every ball Bsi(bi) for which there is a ti∈Q with ti>q and Bsi(bi)⇁Ψti is an open cover of Br(a) and hence (essentially by [SimpsonSOSOA, Theorem IV.1.6]) has a finite subcover. Thus there are a sequence (Bsi(bi):i<n) of balls and a sequence ⟨ti:i<n⟩ of rationals witnessing that Br(a)⇁Φq.
We have shown that, for every Br(a)⊆X and every q∈Q, Br(a)⇁Φq if and only if (∀x∈Br(a))(f(x)>q). Using this fact, it is straightforward to verify that Φ is also a code for f in the sense of Definition 4.1.
(ii) Work in ACA0. Let Φ be a code for f as in (i). Then, define a code Γ by setting Br(a)⇁Γq whenever it is the case that Bs(b)⇁Φt for every t<q and every Bs(b)⊆X with Bs(b)\subsetplusBr(a).
Suppose that x∈Br(a) and Br(a)⇁Γq. Let Bs(b) be such that x∈Bs(b) and Bs(b)\subsetplusBr(a). Then Bs(b)⇁Φt for every t<q, which means that f(x)>t for every t<q. Hence f(x)≥q. Conversely, suppose that f(x)≥q for all x∈Br(a). If Bs(b)⊆X and t∈Q satisfy t<q and Bs(b)\subsetplusBr(a), then f(x)≥q>t for all x∈Bs(b) because Bs(b)⊆Br(a). Thus Bs(b)⇁Φt. Hence Bs(b)⇁Φt whenever t<q and Bs(b)\subsetplusBr(a), so Br(a)⇁Γq.
We have shown that Br(a)⇁Γq if and only if (∀x∈Br(a))(f(x)≥q). So Γ is an honest code for f.
(iii) In ACA0, we can define a code Φ by Br(a)⇁Φq if and only if f(b)≥q for all b∈Br(a)∩X. One readily checks that Φ is an honest code for a lower semi-continuous g that is equal to f.
(iv) In Π11\mbox−CA0, we can directly define an honest code Φ for f by setting Br(a)⇁Φq if and only if (∀x∈Br(a))(f(x)≥q).
∎
6. Continuous envelopes
Honestly-coded lower semi-continuous functions facilitate the calculation of infima, which helps us approximate lower semi-continuous functions by continuous ones.
To be precise, lower semi-continuous functions bounded below can be written as the increasing limit of continuous functions using the following construction.
Definition 6.1**.**
Given a potential f on a complete separable metric space X and an α∈R>0, define the lower α-envelope of f by
[TABLE]
Although not needed for our purposes, it is instructive to observe that if f:X→R≥0 is a potential, then fn converges pointwise to f as n→∞.
What we do need to prove is that continuous envelopes exist, and this can typically not be done within a weak theory. The construction of envelopes hinges on the following more general lemma.
Lemma 6.2** (ACA0).**
Let X and Y be complete separable metric spaces, let h:X×Y→R≥0 be uniformly continuous, and let f:Y→R≥0 be lower semi-continuous and honestly-coded with non-empty support. Then there is a uniformly continuous function g:X→R≥0 such that
[TABLE]
Proof.
Let Φ be a code for h, and let Ψ be an honest code for f. We first show that if x∈X, then infy∈Y(h(x,y)+f(y)) indeed exists, which is a consequence of the following Claim.
Claim**.**
Let x∈X and q∈Q. Then there is a y∈Y such that h(x,y)+f(y)<q if and only if there are Br(⟨a,b⟩)⊆X×Y, (u,v)⊆R, and p∈Q such that x∈Br(a), Br(⟨a,b⟩)→Φ(u,v), ¬(Br(b)⇁Ψp), and v+p<q.
Proof of Claim.
Be aware that Br(⟨a,b⟩)=Br(a)×Br(b) by Definition 3.10.
Suppose that y∈Y is such that h(x,y)+f(y)<q. Let u,v,p∈Q be such that h(x,y)∈(u,v), p>f(y), and v+p<q. Then there must be a Br(⟨a,b⟩)⊆X×Y such that ⟨x,y⟩∈Br(⟨a,b⟩) and Br(⟨a,b⟩)→Φ(u,v). Furthermore, since Ψ is a code for f, it must be that ¬(Br(b)⇁Ψp) because y∈Br(b) but f(y)<p. Thus Br(⟨a,b⟩), (u,v), and p are as required.
Conversely, suppose that there are such Br(⟨a,b⟩), (u,v), and p. Because Ψ is honest and ¬(Br(b)⇁Ψp), there must be a y∈Br(b) such that f(y)<p. Also, h(x,y)≤v because ⟨x,y⟩∈Br(a)×Br(b)=Br(⟨a,b⟩) and Br(⟨a,b⟩)→Φ(u,v). Hence h(x,y)+f(y)<v+p<q.
∎
Thus, given x∈X, let Q be the set of all q∈Q for which there are ⟨a,b,r⟩∈X×Y×Q and u,v,p∈Q such that Br(⟨a,b⟩), (u,v), and p witness that there is a y∈Y such that h(x,y)+f(y)<q as in the Claim. The set Q is bounded below by [math] and is non-empty since supp(f)=∅, so infQ exists (essentially by [SimpsonSOSOA, Theorem III.2.2]), and, by the Claim, infQ=infy∈Y(h(x,y)+f(y)). Henceforth, for each x∈X, let αx denote infy∈Y(h(x,y)+f(y)).
We can make the above argument uniformly for all a∈X by letting A⊆X×Q be the set of all ⟨a,q⟩ for which there are r∈Q, b∈Y, and u,v,p∈Q such that Br(⟨a,b⟩), (u,v), and p witness that there is a y∈Y such that h(a,y)+f(y)<q. Then from A we can define the sequence (αa:a∈X) because, given a sequence of sets of rationals all bounded from below, ACA0 suffices to produce the corresponding sequence of infima (that is ACA0 proves item 4 of [SimpsonSOSOA, Theorem III.2.2] uniformly).
Define a code Γ for a continuous partial function g:X→R by defining Br(a)→Γ(u,v) if and only if (∀c∈Br(a)∩X)[αc∈(u,v)]. It is easy to see that Γ satisfies the requirements of Definition 3.6. We need to show that g is indeed uniformly continuous on all of X and that (∀x∈X)(g(x)=αx).
Claim**.**
Let ε,δ∈Q>0 be such that
[TABLE]
Let x0,x1∈X. Then dX(x0,x1)<δ→∣αx0−αx1∣≤ε.
Proof of Claim.
We show that (∀η∈Q>0)(αx1≤αx0+ε+η), which implies that αx1≤αx0+ε. By a symmetric argument, we also have that αx0≤αx1+ε, which gives the desired ∣αx0−αx1∣≤ε.
Thus let η∈Q>0. Let y∈Y be such that h(x0,y)+f(y)≤αx0+η. Then
[TABLE]
where the last inequality is by the choice of y and the fact that dX×Y(⟨x0,y⟩,⟨x1,y⟩)=dX(x0,x1)<δ.
∎
Given x∈X and ε∈Q>0, let δ∈Q>0 be such that
[TABLE]
and let a∈X be such that x∈Bδ(a). Let (u,v)⊆R be such that [αa−\nicefracε4,αa+\nicefracε4]⊆(u,v) and v−u<ε. The Claim tells us that if c∈Bδ(a)∩X, then αc∈[αa−\nicefracε4,αa+\nicefracε4]⊆(u,v), which means that Bδ(a)→Γ(u,v). Likewise, the Claim implies that αx∈(u,v). Thus for every x∈X and every ε∈Q>0, there are a ball Bδ(a)⊆X and a (u,v)⊆R such that x∈Bδ(a), αx∈(u,v), v−u<ε, and Bδ(a)→Γ(u,v). Thus g(x) is defined and equals αx for all x∈X. Furthermore, the Claim clearly implies that g is uniformly continuous.
∎
Lemma 6.3** (ACA0).**
Let X be a complete separable metric space, let f:X→R≥0 be lower semi-continuous and honestly coded with non-empty support, and let α∈R≥0. Then there is a uniformly continuous function fα:X→R≥0 such that, for all x∈X,
[TABLE]
Proof.
This follows from Lemma 6.2 because the function h:X×X→R≥0 given by h(x,y)=αdX(x,y) is uniformly continuous and bounded below by [math].
∎
7. Critical points of continuous functions
In this section we formalize Ekeland’s variational principle and show that it has natural restrictions provable in weak systems. Recall that a potential on a complete separable metric space X is a lower semi-continuous function f:X→R≥0. Recall also that ε-critical points were defined in Definition 1.2 and that this definition may readily be formalized in RCA0 using coded lower semi-continuous functions.
With this, we are ready to define our formalized version (or, versions) of Ekeland’s variational principle.
Definition 7.1**.**
Given definable classes X of coded complete separable metric spaces and F of coded potentials, the (formalized) free variational principle (FVP) for X∈X and f∈F is the statement that, if X∈X and f:X→R≥0 in F is such that supp(f)=∅, then for every ε>0 there is an x∗∈supp(f) such that
[TABLE]
The localized variational principle (LVP) is defined similarly, except that x0∈supp(f) is given and x∗ is chosen so that εd(x0,x∗)≤f(x0)−f(x∗).
When not mentioned, we assume that X is the class of all coded complete separable metric spaces and F is the class of all coded potentials. Given c>0, to indicate that f(x)≤c for all x∈dom(f), we will say that f is c-bounded, and if we want to fix the value of ε to c, we call the statement the free/localized variational principle for ε=c. The free/localized variational principles for ε=c and fc-bounded will be denoted the c-FVP and the c-LVP, respectively.
Critical points are sometimes also called pseudo-minima, as they may serve as approximate minima of functions that do not actually attain their minimum value.
Lemma 7.2** (RCA0).**
If X is any complete separable metric space and f:X→R≥0 is lower semi-continuous with non-empty support, then for any x∗∈X, f attains its minimum at x∗ if and only if x∗ is an ε-critical point of f for all ε>0.
Proof.
It is straightforward to see that if x=x∗ and εd(x,x∗)≤f(x∗)−f(x) then f(x)<f(x∗), hence if x∗ is not ε-critical it cannot be a minimum. Conversely, if f does not attain its minimum at x∗, then we may choose x∈X with f(x)<f(x∗). By choosing ε small enough, we obtain εd(x,x∗)≤f(x∗)−f(x).
∎
The class F will contain either continuous or lower semi-continuous functions.
Important special cases for us are the 1-FVP and the 1-LVP, which on occasion are equivalent to the FVP and LVP, respectively.
Lemma 7.3** (RCA0).**
Let X be a class of complete separable metric spaces and F be either the class of continuous or of lower semi-continuous potentials.
(1)
The FVP and the LVP with X∈X and f∈F are equivalent, respectively, to the FVP and the LVP with ε=1, X∈X, and f∈F.
2. (2)
If X is either the class of all complete separable metric spaces or of all compact metric spaces then the FVP and the LVP with X∈X and f∈F bounded are equivalent, respectively, to the 1-FVP* and the 1-LVP with X∈X and f∈F.*
3. (3)
*If X is
*
(a) a Banach space,
*
(b) the Baire space,
*
(c) the Cantor space, or
*
(d) a closed ball in Rn (with any norm),
*
*
then the FVP and the LVP for X and bounded, lower semi-continuous potentials are equivalent, respectively, to the 1-FVP and the 1-LVP for X and lower semi-continuous potentials.*
Proof.
(1) Replace f by f′=\nicefracfε and note that any 1-critical point for f′ is ε-critical for f.
(2) First replace f by f′=\nicefracfb, so that f′ is bounded by 1 and any \nicefracεb-critical point for f′ is an ε-critical point for f.
As we can no longer freely scale f′, we instead scale the metric and consider d′=\nicefracεdb.
Then the reader can verify that any 1-critical point for f′ with respect to d′ is an ε-critical point for f with respect to d.
(3)
The general idea is that in all of these cases, the proof of (2) can be simulated (in a possibly ad-hoc way) without modifying the space X.
For the sake of illustration we sketch the proof in the case where X is a closed ball in Rn—we give an informal argument and let the reader verify that it can be carried out in RCA0.
The FVP and LVP cases are similar, so we focus on the FVP.
Assume that the 1-FVP holds for X.
Without loss of generality we may assume that X=Bρ(0) for some ρ>0.
Let f be a bounded lower semi-continuous function on X with non-empty domain, and let ε>0. Without loss of generality we may assume that f is bounded by \nicefrac12 (adjusting ε if needed).
Let A⊆Bρ(0)∩Qn be finite and such that every point in Bρ(0) is within distance \nicefrac14 of some point in A. Let δ>0 be small enough so that δ<min{ερ,\nicefrac14} and d(a,a′)>2δ whenever a,a′∈A are distinct. Define g:X→R≥0 by
[TABLE]
By the 1-FVP, let x∗ be a 1-critical point of g.
We claim that x∗∈Bδ(a) for some a∈A.
If not, then g(x∗)=1 by the definition of g.
Let a∈A minimize d(x∗,a), and note that d(x∗,a)<\nicefrac14 by our choice of A.
Let y0∈supp(f).
Then x0:=ρδy0+a∈Bδ(a) and g(x0)=f(y0)≤\nicefrac12, so that
[TABLE]
This contradicts that x∗ is a 1-critical point of g.
Finally, we claim that y∗:=δρ(x∗−a) is an ε-critical point of f.
Let y∈supp(f) with y=x∗ be arbitrary, and let x:=ρδy+a∈Bδ(a). Then since x∗ is a 1-critical point of g,
[TABLE]
which by the definition of g becomes
[TABLE]
as needed.
The proof that the 1-LVP implies the bounded LVP is similar, except that it is more convenient to replace A by the singleton {0}.
The proofs of the other items are analogous, with the main difference being the choice of g.
Let us assume that f is bounded by 1.
For (3a) we set g(x)=f(ε1x).
For items (3b) and (3c) we use the observation that d(σ⌢x,σ⌢y)=2−∣σ∣d(x,y) for every σ, x, and y.
Let n be large enough so that 2−n<ε. For the FVP define
g(σ⌢y)=f(y), where ∣σ∣=n. That is, given x, g(x) chops off x↾n and applies f to the remaining sequence y.
For the LVP, we define g((0n)⌢y)=f(y) and g(x)=1 if x does not begin with n zeroes.
We leave to the reader to check that in each of this cases, any 1-critical point for g induces an ε-critical point for f, satisfying the localization condition when appropriate.
∎
With Lemma 7.3 in mind, we often take ε=1 or, in the bounded case, f bounded by 1, and we write critical point instead of 1-critical point.
Note however that if we fix X beforehand, the FVP or LVP for X with bounded f are not necessarily equivalent to the 1-FVP or the 1-LVP for X, respectively, since we may not be able to perform the required scaling transformations without modifying the space.
Claim (3) gives a few examples where such transformations may be performed, and while the list is by no means meant to be exhaustive, we conjecture that a space X can be found so that the 1-FVP and the FVP for X are not provably equivalent in RCA0.
Note also that in the cases (3a)-(3c), if the function f is continuous, then so is g. Thus in these cases, the claim also holds for the respective variational principles restricted to continuous functions.
Our construction for (3d) does not preserve continuity, but it should also be possible to define some g~ which is continuous if f was (similar to the 2-envelope of g). Note however that some care would be required to construct g~ in RCA0.
In any case, the current formulation of the lemma suffices for our purposes.
Let us now begin our analysis by showing that WKL0 suffices to construct critical points of continuous functions on compact spaces.
Proposition 7.4** (WKL0).**
The FVP holds for compact X and continuous f; in fact, in this setting f attains its minimum.
Proof.
WKL0 proves that every continuous function from a compact complete separable metric space to R attains a minimum value (see [SimpsonSOSOA, Theorem IV.2.2] for the version of this fact with ‘maximum’ in place of ‘minimum’). Thus let x∗∈X be a point at which f attains its minimum value. Then x∗ is a critical point of f by Lemma 7.2.
∎
When working over ACA0, we may drop the assumption that X is compact. Our proof is based on those in [Brezis1976, Du16].
Theorem 7.5** (ACA0).**
The FVP holds for arbitrary X and continuous f.
Proof.
Let X=X be a complete separable metric space with metric d, and in view of Lemma 7.3.1, assume that ε=1. First we use ACA0 to collect some information in order to implement the proof. Define S⊆X×X×Q>0 by
[TABLE]
and for a∈X and q∈Q>0, write S(a,q) for {b∈X:(a,b,q)∈S}. Define a sequence of reals (ua,q:a∈X∧q∈Q>0) by ua,q=infb∈S(a,q)f(b) for each a∈X and q∈Q>0, which always exists because f is bounded from below and S(a,q) is non-empty (as it contains a). Next, define a sequence of reals (ra:a∈X) by ra=supq∈Q>0ua,q for each a∈X, which always exists because (∀q∈Q>0)(ua,q≤f(a)). That the sequences (ua,q:a∈X∧q∈Q>0) and (ra:a∈X) can be defined in ACA0 follows from the appropriately uniformized version of the (1)⇒(4) direction of [SimpsonSOSOA, Theorem III.2.2] (and the analogous implication with ‘greatest lower bound’ in place of ‘least upper bound’ in (4)). Observe that (∀a∈X)(∀q0,q1∈Q>0)(q0≤q1→ua,q1≤ua,q0), which means that (∀a∈X)(ra=limq→0+ua,q). Finally, define a function R:X×Q>0→Q describing the rates of convergence of the sequences (ua,q:q∈Q>0) for each a∈X by letting R(a,p)=2−n, where n is least such that (∀q∈Q>0)(q≤2−n→∣ra−ua,q∣<p).
Now we define a sequence (an:n∈N) of points in X converging to a critical point of f and a helper sequence (qn:n∈N) of points in Q>0. Let a0 be the first point (that is, the point with the least code) in X, and let q0=2−2R(a0,1). Given (ai:i≤n) and (qi:i≤n), let an+1 be the first point in S(an,qn) such that f(an+1)<uan,qn+2−(n+1), and let qn+1=2−(n+3)∏i≤n+1R(ai,2−i). The important property of (qn:n∈N) is that, for every n∈N,
[TABLE]
which holds because (∀n)(∀k≥n)[qk≤2−(k+2)R(an,2−n)].
As the sequence (f(an):n∈N) is bounded from below, let ℓ=liminfn∈Nf(an) (which exists by [SimpsonSOSOA, Lemma III.2.1] with ‘liminf’ in place of ‘limsup’). Observe that liminfn∈Nuan,qn=ℓ as well because ∀n(uan,qn≤f(an+1)<uan,qn+2−(n+1)). Let i0<i1<i2<… be an increasing sequence of indices such that limn∈Nf(ain)=ℓ. We show that the corresponding sequence (ain:n∈N) is Cauchy. Given M∈N, let N>M be large enough so that ∀n,m(N<n<m→∣f(ain)−f(aim)∣≤2−(M+1)). Then, for every n,m∈N with N<n<m, we have that
[TABLE]
In the above expression, the first inequality is the triangle inequality, the second inequality is because ak+1 is chosen from S(ak,qk) for every k∈N, the third inequality is obtained by canceling the appropriate terms in the telescoping sum, and the fourth inequality is by the choice of N and by recalling that ∑k≥inqk<2−in (and that in≥n>N>M). Thus (ain:n∈N) is Cauchy. By the (1)⇒(3) direction of [SimpsonSOSOA, Theorem III.2.2], (ain:n∈N) converges to some x∗∈X.
We show that x∗ is a critical point of f. By the continuity of f, f(x∗)=limn∈Nf(ain)=ℓ. We show that if y∈X is such that d(x∗,y)≤f(x∗)−f(y), then f(y)=ℓ as well. Thus consider a y∈X such that d(x∗,y)≤f(x∗)−f(y). Clearly f(y)≤ℓ because otherwise we would have the contradiction d(x∗,y)<0. So we show that f(y)≥ℓ. To do this, fix n∈N. We see that d(ain,x∗)≤f(ain)−f(x∗)+∑k≥inqk by considering the inequality (2) as m→∞. Therefore,
[TABLE]
Thus d(ain,y)<f(ain)−f(y)+R(ain,2−in) because ∑k≥inqk<R(ain,2−in). It follows that f(y)≥uain,R(ain,2−in) because y is in the open set
[TABLE]
and S(ain,R(ain,2−in)) is dense in U. Observe also that qin≤R(ain,2−in), which implies that uain,qin−uain,R(ain,2−in)<2−in. Therefore f(y)>uain,qin−2−in. So we have shown that ∀n(f(y)>uain,qin−2−in). As (uain,qin:n∈N) is a subsequence of (uan,qn:n∈N), it follows that liminfn∈Nuain,qin≥liminfn∈Nuan,qn=ℓ. Thus we conclude that f(y)≥ℓ, as desired. This completes the proof that if d(x∗,y)≤f(x∗)−f(y), then f(y)=f(x∗)=ℓ. So if y∈X satisfies d(x∗,y)≤f(x∗)−f(y), then y=x∗. Hence x∗ is a critical point of f.
∎
8. Critical points of arbitrary potentials
Now let us consider the case where f is a possibly discontinuous potential.
As it turns out, this general case can be reduced to the continuous case by using envelopes.
To be precise, in order to find an α-critical point of a lower semi-continuous potential f, it suffices to find an α-critical point of the β-envelope fβ for any β>α.
Lemma 8.1** (RCA0).**
Let X be a complete separable metric space, 0<α<β, f:X→R≥0 be a potential with supp(f)=∅, and suppose that there is a continuous function fβ:X→R≥0 such that, for every x∈X,
[TABLE]
Then, for any x∗∈X,
(1)
if x∗ is an α-critical point of fβ, then f(x∗)=fβ(x∗) and x∗ is an α-critical point of f.
2. (2)
If fβ attains its minimum at x∗, then f attains its minimum at x∗.
Proof.
In view of Lemma 7.2, the second item is a consequence of the first: indeed, if fβ attains its minimum at x∗∈X, then for every α<β we have that x∗ is α-critical for fβ, so that by the first item it is α-critical for f. Since α is arbitrary, f attains its minimum at x∗.
Thus we focus on the first item.
Suppose that x∗ is an α-critical point of fβ; we begin by showing that f(x∗)=fβ(x∗).
Clearly fβ(x∗)≤f(x∗), so suppose for a contradiction that fβ(x∗)<f(x∗).
Let ε=1 if f(x∗)=∞ and ε=2f(x∗)−fβ(x∗)>0 otherwise, and let δ∈Q>0 be such that f(y)>fβ(x∗)+ε whenever d(x∗,y)<δ. Such a δ can be obtained by letting Φ be the code for f; letting Br(a)⊆X and q∈Q>0 be such that Br(a)⇁Φq, x∗∈Br(a), and q>fβ(x∗)+ε; and then taking δ<r−d(a,x∗).
Now let y∈X be such that
[TABLE]
Note that y=x∗, for otherwise fβ(x∗)>f(x∗)−min{(β−α)δ,ε}≥fβ(x∗)+ε, a contradiction. Moreover, we cannot have that d(x∗,y)<δ, for otherwise
[TABLE]
where the first inequality is by the choice of y and the second inequality is by the choice of δ and the assumption d(x∗,y)<δ. Thus it must be that d(x∗,y)≥δ. Therefore,
[TABLE]
where the first inequality is by the choice of y and the second inequality is because fβ(y)≤f(y), δ≤d(x∗,y), and min{(β−α)δ,ε}≤(β−α)δ.
This means that x∗ is not an α-critical point of fβ, which is a contradiction.
We have established that f(x∗)=fβ(x∗). We now use this to show that x∗ is an α-critical point of f. Assume for a contradiction that this is not the case. Then there is a y∈X such that αd(x∗,y)≤f(x∗)−f(y) but y=x∗. Then,
[TABLE]
because fβ(x∗)=f(x∗) and fβ(y)≤f(y). This contradicts that x∗ is an α-critical point of fβ. Thus x∗ is indeed an α-critical point of f.
∎
Theorem 8.2**.**
(i)
(ACA0)* The FVP holds for arbitrary X and any honestly-coded potential f.*
2. (ii)
(ACA0)* The FVP holds for compact X and any potential f; in fact, such an f attains its minimum.*
3. (iii)
(Π11\mbox−CA0)* The FVP holds for arbitrary X and any potential f.*
Proof.
In view of Lemma 7.2, we may assume that ε=1; note that an honest code remains honest after scaling f.
(i) As f is assumed to be honestly-coded, the envelope f2 exists and is continuous by Lemma 6.3. The function f2 has a critical point by Theorem 7.5, and this critical point is also a critical point of f by Lemma 8.1.
(ii) As X is compact, f can be honestly coded by Lemma 5.2 item (ii). Thus f has a critical point by item (i) of this theorem. In fact, f2 is defined and attains its minimum by Proposition 7.4, so that f also attains its minimum.
(iii) Working in Π11\mbox−CA0, we can assume that f is honestly-coded by Lemma 5.2 item (iv). Thus f has a critical point by item (i).
∎
Below we will show that the points in Theorem 8.2 are optimal.
9. Reversals of the variational principle
In this section we show that many of the results of §7 and §8 reverse. Let us begin with the weakest version of the FVP we have considered. Recall from Proposition 7.4 that WKL0 suffices to prove that every continuous potential over a compact space has a critical point. We now show that WKL0 is also necessary to prove this version of the FVP.
Proposition 9.1**.**
The FVP for continuous, bounded f on the Cantor space or on [0,1] implies WKL0 over RCA0.
Proof.
We work in RCA0 and prove the contrapositive. If WKL0 fails, then there is an infinite binary tree T⊆2<N that has no infinite path.
Let T∘ be the set of leaves of T: T∘={σ∈T:σ⌢0,σ⌢1∈/T}.
Since T is infinite and has no infinite path, T∘ is also infinite.
For each σ∈T∘, define
[TABLE]
For each σ∈T, define σ~∈2<N with ∣σ~∣=2∣σ∣ as σ~(2i)=0 and σ~(2i+1)=σ(i) if i<∣σ∣.
Put T~={σ~:σ∈T}, T~∘={σ~:σ∈T∘}, and put
[TABLE]
Here, S is the set of all binary strings τ which move away from T~ before reaching a member of T~∘. S can be defined in RCA0 because each τ need only be checked against strings σ~ of length at most ∣τ∣. The elements of T~∘∪S are pairwise incomparable, and for every x∈2N there is a σ∈T~∘∪S such that x⊒σ. Indeed, each x must either reach a leaf of T~ or move away from T~ before that because T~ has no infinite path.
Now, define a continuous function f:2N→[0,3] as follows:
[TABLE]
One may easily obtain a code for f by Lemma 3.8 and Lemma 3.9.
We show that f has no critical point, which gives the desired contradiction.
If x⊒τ for some τ∈S, then take any σ∈T∘ and y⊒σ~, and observe that
f(x)−f(y)≥3−2≥d(x,y). Thus x is not a critical point.
Assume instead that x⊒σ~ for some σ∈T∘.
Let i0 be the greatest i<∣σ∣−1 such that i∈/Aσ, which exists because T is infinite. Then there is a σ′∈T∘ such that σ′⊒(σ↾i0)⌢(1−σ(i0+1)) and ∣σ′∣>∣σ∣.
By the maximality of i0, any τ∈T which extends σ↾(i0+1) is shorter than σ′.
Thus, we have that i0∈Aσ′ and that j∈Aσ implies that j∈Aσ′ for every j<i0 since σ↾i0=σ′↾i0. Take y⊒σ~′.
Then d(x,y)≤2−2i0−1, and therefore
[TABLE]
Thus x is not a critical point.
We can simulate the above construction on the unit interval in order to obtain a continuous function on [0,1] that has no critical point.
For a given σ∈2<N, let lσ=0.σ=∑i<∣σ∣2−σ(i),rσ=lσ+2−∣σ∣, and Iσ=[lσ,rσ]⊆[0,1].
Now, for each σ∈T∘, let gσ~:Iσ~→[0,3] be a piecewise linear function such that gσ~(lσ~)=gσ~(rσ~)=3 and gσ~((lσ~+rσ~)/2)=2−∑i∈Aσ2−2i,
and, for each τ∈S, define gτ:Iτ→[0,3] as gτ(x)=3.
Put g=⋃τ∈T~∘∪Sgτ.
Then, g is a continuous function on [0,1] by Lemma 3.8 and Lemma 3.9: for this, observe that for each τ∈T~∘∪S, gτ can be extended to an open subset of [0,1], and thus g can be decomposed into piecewise linear functions on open subsets of [0,1]. Indeed, if lτ,rτ∈/{0,1}, then one can effectively find σ0,σ1∈T~∘∪S such that rσ0=lτ and rτ=lσ1. Then g is still piecewise linear on an interval (lσ0,rσ1).
One can check that g has no critical point, as we have seen above.
If x∈Iτ for some τ∈S, then take any σ∈T∘, and observe that y=(lσ~+rσ~)/2 witnesses that x is not a critical point.
If x∈Iσ~ for some σ∈T∘ then take σ′∈T∘ as in the Cantor space case. Then y=(lσ~′+rσ~′)/2 witnesses that x is not a critical point.
∎
Next we consider versions of the variational principle provable in ACA0. Recall from Theorem 7.5 and Theorem 8.2 that ACA0 is able to prove the FVP when either f is continuous or X is compact. Let us see that each of these cases implies ACA0, beginning with the former.
Proposition 9.2**.**
The FVP for continuous, bounded f on the Baire space implies ACA0 over RCA0.
Proof.
As is well-known, ACA0 is equivalent to the statement “for every injection h:N→N, the range of h exists as a set.” (see [SimpsonSOSOA, Lemma III.1.3]).
Let h:N→N be an injection. For the purposes of this proof, we view the natural numbers as coding the finite sets, with [math] coding ∅. For each n∈N and finite set D, let vn(D) denote the number of a∈D with h(a)<n: vn(D)=∣{a∈D:h(a)<n}∣. The fact that h is an injection implies that vn(D)≤n.
Define a continuous function g:NN→R≥0 by
[TABLE]
We may use Lemma 3.8 to show that g can indeed be coded as a continuous function in RCA0, as it is straightforward to produce the sequence (⟨σ⌢0N,g(σ⌢0N)⟩:σ∈N<N) and, for all n∈N and σ,τ∈N<N, to check that
[TABLE]
To see the above implication, observe that if d(x,y)<2−2n+1, then x↾(2n+1+1)=y↾(2n+1+1). This means that the terms 2−2k−1+vk(x(2k+1)) and 2−2k−1+vk(y(2k+1)) agree for k≤n and therefore that ∣g(x)−g(y)∣<∑k=n+1∞2−k−1=2−n−1<2−n.
We see that g(x)≤1 for all x∈NN, thus the function f:NN→R≥0 given by f(x)=1−g(x) is a continuous potential on the Baire space that is bounded by 1.
By the FVP on the Baire space, let x∗ be a critical point of f. For each n, let Dn denote the set coded by x∗(2n+1). We claim that Dn must contain every a for which h(a)<n. Suppose not, and let a be such that h(a)<n but a∈/Dn. Let y∈NN be such that y(2n+1) is a code for Dn∪{a} and y(m)=x∗(m) for all m=2n+1. Then
[TABLE]
contradicting that x∗ is a critical point. Thus Dn contains every a for which h(a)<n. We may then extract the range of h from x∗ by taking ran(h)={n:(∃a∈Dn+1)(h(a)=n)}.
∎
Proposition 9.3**.**
The FVP for honestly-coded f on [0,1] implies ACA0 over RCA0.
Proof.
We work in RCA0 and prove the contrapositive. By Theorem 3.5, if ACA0 fails, then there is a strictly increasing sequence c=(cn:n∈N) of rationals in [0,1] with no supremum.
To define f, enumerate a code Φ by enumerating (u,v)⇁Φq if q≤u or if q≤2 and there is an n such that v<cn. One readily checks that Φ is a code for the potential f:[0,1]→R≥0 given by
[TABLE]
To ensure that Φ is honest, additionally enumerate (u,v)⇁Φq whenever there is an n such that q≤cn<v. To see that the resulting code is honest, consider an open interval (u,v), and suppose that (∀x∈(u,v))(f(x)≥q). Note that f is bounded above by 2, so q≤2. If there is an n such that v<cn, then (u,v)⇁Φq. If cn<u for all n, then f(x)=x on (u,v). Thus q≤u, so (u,v)⇁Φq. Finally, suppose that there is an n such that cn∈(u,v), but there is no n such that v<cn. As v is not the supremum of (cn:n∈N) (because we assumed that there is no such supremum), there is a y∈(u,v) such that ∀n(cn<y). Then f(y)=y, which implies that q≤y<v. By a similar argument, it cannot be that ∀n(cn<q) because if this were true, then there would be a z∈(u,v) with z<q such that ∀n(cn<z). As also f(z)=z, this contradicts the assumption that (∀x∈(u,v))(f(x)≥q). Thus it must be that q≤cn<v for some n, which implies that (u,v)⇁Φq. Hence Φ is honest.
We claim that f has no critical point. Assume towards a contradiction that c∗ is a critical point of f. We show that c∗ is the supremum of c. Indeed, it is readily checked that if c∗<cn for some n, then
[TABLE]
and hence c∗ cannot be a critical point. It follows that c∗≥cn for all n. Now, if c′<c∗ were also an upper bound of c, then we would have f(c′)=c′, so that
[TABLE]
and thus c∗ cannot be a critical point. Hence c∗ is the supremum of c, contradicting our initial assumption.
∎
Finally, we show that the unrestricted FVP proves Π11\mbox−CA0 by appealing to the following characterization of Π11\mbox−CA0.
Lemma 9.4** ([SimpsonSOSOA, Lemma VI.1.1]).**
The following are equivalent over RCA0:
(1)
Π11\mbox−CA0**
2. (2)
for any sequence (Ti)i∈N of subtrees of N<N, there is a set X such that for all i∈N, i∈X if and only if Ti has an infinite path.
Proposition 9.5**.**
The FVP on the Baire space implies Π11\mbox−CA0 over RCA0.
Proof.
By Proposition 9.2, the FVP on the Baire space implies ACA0 over RCA0, so we may work over ACA0.
Recall that we assume that the pairing function ⟨⋅,⋅⟩:N×N→N is increasing in both coordinates. For the purposes of this proof, if x∈NN and i∈N, then (x)i∈NN is the function defined by (x)i(n)=x(⟨i,n⟩). Similarly, if σ∈N<N and i∈N, then (σ)i∈N<N is the longest sequence such that (∀n<∣(σ)i∣)[⟨i,n⟩∈domσ∧(σ)i(n)=σ(⟨i,n⟩)].
Let (Ti)i∈N be a sequence of subtrees of N<N. We first define a code for the lower semi-continuous potential f:NN→[0,1] given by
[TABLE]
To do this, define Br(σ)⇁Φq if there is a τ∈N<N such that Br(σ)\subsetplusB2−∣τ∣(τ) and q≤∑i=0∞{2−i:(τ)i∈/Ti}. For a given x∈NN, one readily checks that v=∑i=0∞{2−i:(x)i∈/[Ti]} (which ACA0 proves exists) is indeed the supremum of
[TABLE]
This shows that Φ correctly codes the desired potential f and that f is provably total in ACA0.
By the FVP, let x∗ be a critical point of f. Let X={i:(x∗)i∈[Ti]}. We show that (\forall i\geq 0)(i\in X\leftrightarrow\text{T_{i} has a path}). Clearly, if i∈X, then Ti has a path. Suppose for a contradiction that there is a j such that Tj has a path, but j∈/X. Let h∈[Tj], and let y∈NN be such that for all i,n∈N,
[TABLE]
so that (y)j=h and ∀i[i=j→(y)i=(x∗)i]. Then
[TABLE]
This contradicts that x∗ is a critical point of f, which completes the proof.
∎
10. The localized variational principle
In this section we compare the strength of the free and localized variational principles in different contexts.
First, we show that, in contrast to the FVP, the LVP is not affected by the boundedness of f.
We also show that the two principles are equivalent when f is not assumed to be continuous.
For the latter, it is clear that the LVP implies the FVP, so we focus on the other direction.
The following two lemmas aid the proof of the above facts.
Lemma 10.1** (RCA0).**
Let X be a complete separable metric space, let f:X→R≥0 be lower semi-continuous, and let C⊆X be closed. Then there is a lower semi-continuous function g:X→R≥0 such that
[TABLE]
Proof.
Let Ψ be a code for f, and let U be a code for the complement of C. Enumerate a code Φ for g by enumerating Br(a)⇁Φq if either
•
q≤0 or
•
Br(a)⇁Ψq and there is an open ball Bs(b) enumerated in U with Br(a)\subsetplusBs(b).
If x∈C then there are balls Br(a) with x∈Br(a) and Br(a)⇁Φ0, but there are no balls Br(a) and q>0 such that x∈Br(a) and Br(a)⇁Φq. Hence g(x)=0.
If x∈/C, given q∈Q>0, it is clear from the definition of Φ that if there is a ball Br(a) with x∈Br(a) and Br(a)⇁Φq then also Br(a)⇁Ψq.
Conversely, suppose that there is a ball Br(a) with x∈Br(a) and Br(a)⇁Ψq.
Since x∈/C, x∈Bs(b) for some ball Bs(b) enumerated in U, and hence there are a′,r′ such that x∈Br′(a′)\subsetplusBr(a),Bs(b). But then by condition (lsc1) of Definition 4.1, Br′(a′)⇁Ψq, so that also Br′(a′)⇁Φq. Since q was arbitrary we conclude that g(x)=f(x) given that both are the supremum of such q in R.
∎
Lemma 10.2** (RCA0).**
Let X be a complete separable metric space, and let f,g:X→R≥0 be lower semi-continuous. Then (f+g),max{f,g},min{f,g}:X→R≥0 (defined in the usual way) are lower semi-continuous.
Proof.
Let Ψ0 be a code for f and let Ψ1 be a code for g. Enumerate a code for f+g by enumerating Br(a)⇁Φq if there are q0,q1 such that Br(a)⇁Ψ0q0, Br(a)⇁Ψ1q1, and q≤q0+q1.
Enumerate a code Φ for max{f,g} by enumerating Br(a)⇁Φq if either Br(a)⇁Ψ0q or Br(a)⇁Ψ1q.
Enumerate a code Φ for min{f,g} by enumerating Br(a)⇁Φq if both Br(a)⇁Ψ0q and Br(a)⇁Ψ1q.
∎
Lemma 10.3** (RCA0).**
For every complete separable metric space X, the LVP for bounded potentials on X implies the LVP on X, and the LVP for bounded continuous potentials on X implies the LVP for continuous potentials on X.
Proof.
Given a complete separable metric space X, a potential f:X→R≥0, and an x0∈supp(f), define f~:X→R≥0 by setting f~(x)=min{f(x),f(x0)}. Then f~ is bounded, is a potential by Lemma 10.2, and is continuous if f is continuous. Suppose that ε>0 and that x∗∈X is an ε-critical point of f~ with εd(x0,x∗)≤f~(x0)−f~(x∗). Then either f~(x∗)=f(x∗) or f~(x∗)=f(x0). However, if f~(x∗)=f(x0), then εd(x0,x∗)≤f~(x0)−f~(x∗)=f(x0)−f(x0)=0. Thus d(x0,x∗)=0 and x∗=x0, in which case again f~(x∗)=f(x∗). Thus in either case f~(x∗)=f(x∗). Therefore εd(x0,x∗)≤f~(x0)−f~(x∗)=f(x0)−f(x∗). Furthermore, if x∈X and εd(x∗,x)≤f(x∗)−f(x), then also εd(x∗,x)≤f~(x∗)−f~(x), so x=x∗ because x∗ is an ε-critical point of f~. Thus x∗ is also an ε-critical point of f with εd(x0,x∗)≤f(x0)−f(x∗).
∎
Lemma 10.4** (RCA0).**
For every complete separable metric space X, the FVP on X implies the LVP on X.
Proof.
Assume the FVP. Let f:X→R≥0 be any potential, ε>0, x0∈supp(f), and let C be the closed set
[TABLE]
To see that C is closed, rewrite C as C={x∈X:f(x)+εd(x,x0)≤f(x0)}, and observe that the function g(x)=f(x)+εd(x,x0) is lower semi-continuous by Proposition 4.3 and Lemma 10.2. As in the proof of Proposition 4.2, g(x)≤f(x0) is a Π10-definable property of x, so C is closed.
By Lemma 10.1 and Lemma 10.2, the function f~:X→R≥0 given by
[TABLE]
is lower semi-continuous. Let x∗ be an ε-critical point of f~. It must be that x∗∈C, for otherwise we would have that x∗=x0 (as x0∈C), but f~(x∗)≥εd(x∗,x0)+f(x0)=εd(x∗,x0)+f~(x0), meaning that x0 witnesses that x∗ is not an ε-critical point of f~.
We have that f(x∗)≤f(x0)−εd(x∗,x0) because x∗∈C. Moreover, if x∈X and f(x)≤f(x∗)−εd(x∗,x), then
[TABLE]
(where the last inequality is by the triangle inequality), so x∈C and therefore f~(x)=f(x). Thus for an x∈X with f(x)≤f(x∗)−εd(x∗,x),
[TABLE]
and therefore x=x∗ because x∗ is an ε-critical point of f~. So x∗ is also an ε-critical point of f.
We conclude that x∗ is an ε-critical point of f and that f(x∗)≤f(x0)−εd(x∗,x0).
∎
Thus the FVP and the LVP for arbitrary f are equivalent.
In contrast, the LVP for continuous functions suffices to prove the full FVP.
To make this precise, we introduce the notion of pseudo-fibrations.
Below, by a closed isometry we mean a continuous function f:X→Y such that dX(x0,x1)=dY(f(x0),f(x1)) for all x0,x1∈X and such that f[X] is closed. We check that if f is a closed isometry and C⊆X is closed, then RCA0 proves that f[C] is closed as well.
Lemma 10.5** (RCA0).**
Let X be a complete separable metric space, let C⊆X be closed, and let f:X→Y be a closed isometry. Then f[C]⊆Y is closed.
Proof.
Let U be a code for the complement of C⊆X, and let V0 be a code for the complement of f[X]⊆Y. Enumerate a code V for an open V⊆Y by enumerating the ball Bs(b) in V if either
•
Bs(b) is enumerated in V0 or
•
there is a ball Br(a) enumerated in U such that dY(f(a),b)+s<r.
We show that a y∈Y is in V if and only if there is no x∈C such that f(x)=y. This shows that V is the complement of f[C] and hence that f[C] is closed.
For y∈Y, we know that there is no x∈X such that y=f(x) if and only if there is a ball Bs(b) containing y that is enumerated in V0. Thus it suffices to assume that there is an x∈X such that y=f(x) and show that y∈V if and only if x∈/C. Note that x is unique because f is an isometry.
First suppose that x∈/C. Then x∈Br(a) for some ball Br(a) enumerated in U. We have that dY(f(a),y)=dX(a,x)<r because y=f(x) and f is an isometry. Thus by choosing b sufficiently close to y, we see that there is a ball Bs(b) containing y with dY(f(a),b)+s<r. Thus y∈V.
Conversely, suppose that y∈V. Then there are balls Bs(b)⊆Y and Br(a)⊆X such that y∈Bs(b), Br(a) is enumerated in U, and dY(f(a),b)+s<r. Therefore dY(f(a),f(x))<r because f(x)=y∈Bs(b). Thus dX(a,x)<r because f is an isometry. Therefore x∈Br(a), so x∈/C.
∎
Definition 10.6**.**
Let X,Y be complete separable metric spaces. Say that a space Z is an X-pseudofibration of Y if there are a closed isometry ι:X×Y→Z and a continuous function π:Z→Y such that
(1)
for all z,z′∈Z, dY(π(z),π(z′))≤dZ(z,z′) and
2. (2)
for all ⟨x,y⟩∈X×Y, πι(x,y)=y.
Of course the typical example of an X-pseudofibration of Y is X×Y, but later we will see that there are others.
Pseudofibrations are useful due to the following.
Lemma 10.7** (RCA0).**
(1)
If X and Z are complete separable metric spaces such that Z is an X-pseudofibration of R≥0, then the LVP for continuous potentials on Z implies the FVP on X.
2. (2)
If X and Z are complete separable metric spaces such that Z is an X-pseudofibration of [0,1], then the LVP for continuous potentials on Z implies the 1-FVP* on X.*
Proof.
We give a uniform proof for the two claims.
Note that by Lemma 7.3.1 we can also assume for the conclusion of the first claim that ε=1.
Let X be a complete separable metric space, let Y be either R≥0 for the first claim or [0,1] for the second, and let Z be an X-pseudofibration of Y with associated functions ι:X×Y→Z and π:Z→Y. For the first claim, let f:X→R≥0 be lower semi-continuous; and for the second claim, let f:X→[0,1] be lower semi-continuous. By Proposition 4.2, the set
[TABLE]
is closed, hence so is Γ=ι[Δ]⊆Z by Lemma 10.5 because ι is a closed isometry. By Urysohn’s lemma, which is provable in RCA0 by [SimpsonSOSOA, Lemma II.7.3], there is a continuous function g:Z→[0,1] such that g(z)=0 if and only if z∈Γ. Define a continuous function f~:Z→R≥0 by
[TABLE]
Every z∈Γ is of the form z=ι(x,y) for some ⟨x,y⟩∈X×Y. For such a z, we then have that
[TABLE]
where the second equality is because z∈Γ (and therefore g(z)=0), and the last equality is by Definition 10.6.2.
Now, fix any x0∈supp(f), let y0=f(x0), and let z0=ι(x0,y0). By the LVP for continuous potentials on Z, let z∗∈Z be a critical point of f~ that satisfies f~(z∗)≤f~(z0)−dZ(z0,z∗). We first claim that z∗∈Γ. To see this, observe that z0∈Γ by definition and therefore f~(z0)=π(z0)=y0. We then have that
[TABLE]
Therefore we must have that g(z∗)=0, which means that z∗∈Γ.
As z∗∈Γ, it must be that z∗=ι(x∗,y∗) for some ⟨x∗,y∗⟩∈Δ. Thus x∗∈X and y∗≥f(x∗). In fact, it must be that y∗=f(x∗). To see this, suppose for a contradiction that y∗>f(x∗), and let z=ι(x∗,f(x∗)). We then have that
[TABLE]
where the first equality is because z∗ and z are in Γ, and the second equality is because ι is an isometry. Thus z witnesses that z∗ is not a critical point of f~, which is a contradiction. Therefore y∗=f(x∗), so z∗=ι(x∗,f(x∗)).
We now show that x∗ is a critical point of the original f, which completes the proof. Let x∈X, and suppose that f(x∗)≥f(x)+dX(x,x∗). Let z=ι(x,f(x)). Then
[TABLE]
where the first equality is because ι is an isometry, and the second equality is because f(x∗)−f(x)≥dX(x,x∗) and we use the max norm on X×Y. As z∗ and z are in Γ, we have that
[TABLE]
Therefore z=z∗ because z∗ is a critical point of f~, and since ι is injective, x=x∗. We have shown that if x∈X satisfies f(x∗)≥f(x)+dX(x,x∗), then x=x∗. Thus x∗ is a critical point of f.
∎
Of course it follows from Lemma 10.7 that the LVP for continuous functions on X×R≥0 implies the FVP on X.
However, the notion of pseudofibrations will allow us to replace X×R≥0 by a more familiar space.
In what follows, we will mainly deal with \mathcal{C}\big{(}[0,1]\big{)}.
For given h\in\mathcal{C}\big{(}[0,1]\big{)} and a closed interval I⊆[0,1], we write ∥h∥I=supt∈I∣h(t)∣.
By the definition (coding) of \mathcal{C}\big{(}[0,1]\big{)}, statements of the form ∥h∥I≤r are always Π10.
Lemma 10.8** (RCA0).**
Let X be a complete separable metric space, and let Y=[a,b]∩R, where a<b are elements of R. If X embeds by a closed isometry into \mathcal{C}\big{(}[0,1]\big{)}, then \mathcal{C}\big{(}[0,1]\big{)} is an X-pseudofibration of Y.
Proof.
Let \mathcal{Z}=\mathcal{C}\big{(}[0,1]\big{)} and let f:X→Z be a closed isometry.
Let us write fx instead of f(x).
Define ι:X×Y→Z by
[TABLE]
and define π:Z→Y by
[TABLE]
Intuitively, ι(x,y) represents y by its value on [math] and represents f(x) by its values on [\nicefrac12,1].
It is easy to check that a code for π can be constructed using Lemma 3.8.
We may also use Lemma 3.8 to construct a code for ι.
To do this, let X and Y be the dense sets of X and Y, respectively. It is straightforward to produce the sequence (⟨⟨a,q⟩,ι(a,q)⟩:⟨a,q⟩∈X×Y) and observe that
[TABLE]
Thus the hypotheses of Lemma 3.8 are satisfied, so there exists a code for ι.
Equation (3) implies that ι is an isometry. To see that ι is closed, consider an h\in\mathcal{C}\big{(}[0,1]\big{)}, put h♭(t)=(2t)h(1/2)−(1−2t)h(0), and put h♯(t)=h(\nicefract2+\nicefrac12).
Then h∈ι[X×Y] if and only if ∥h−h♭∥[0,1/2]=0 (i.e., h is linear on [0,1/2]), h(0)∈Y, and h♯∈f[X], which is a Π10 condition because f[X] is closed by assumption.
Thus, ι[X×Y] is closed by Lemma 3.7. It is then easy to check that ι and π make Z an X-pseudofibration of Y.
∎
In order to apply Lemma 10.8, we need to show that many of the spaces we are interested in isometrically embed into \mathcal{C}\big{(}[0,1]\big{)}. Let us begin with the unit interval.
Lemma 10.9** (RCA0).**
There exists a closed isometry f\colon[0,1]\to\mathcal{C}\big{(}[0,1]\big{)}.
Proof.
Let X=[0,1] and \mathcal{Y}=\mathcal{C}\big{(}[0,1]\big{)}.
For r∈[0,1], write Ir for the constant function with value r.
Define f:X→Y by f(x)=Ix.
It is obvious that f is an isometry and thus a code for f exists by arguing in the style of the proof of Lemma 10.8 and appealing to Lemma 3.8.
We may observe that h∈f[X] if and only if ∥h−Ih(0)∥[0,1]=0, which is a Π10 condition, thus f[X] is closed by Lemma 3.7.
∎
Next, let us see that there is also a closed isometry from the Baire space into \mathcal{C}\big{(}[0,1]\big{)}.
Definition 10.10**.**
If I=[a,b] is an interval with 0≤a<b≤1, define I^:[0,1]→R to be the piecewise linear function with I^(0)=I^(a)=I^(b)=I^(1)=0, \hat{I}\big{(}\frac{a+b}{2}\big{)}=1, and I^ linear elsewhere.
For n∈N, define In=[1−2−n,1−2−(n+1)] and for m,n∈N define
[TABLE]
Let X be the Baire space and Y be \mathcal{C}\big{(}[0,1]\big{)}.
Then, define F:X→Y by
[TABLE]
Lemma 10.11** (RCA0).**
The function F of Definition 10.10 is a closed isometry from the Baire space to {\mathcal{C}}\big{(}[0,1]\big{)}.
Proof.
A code for F exists by checking that F is an isometry, arguing as in the proof of Lemma 10.8, and appealing to Lemma 3.8.
For h\in{\mathcal{C}}\big{(}[0,1]\big{)}, h∈F[NN] if and only if for every n∈N, ∥h∥In=2−n, and for every n,m∈N, either ∥h∥Jnm=0 or ∥h−2−nJ^nm∥In=0, which is a Π10 statement.
It is easy to check the “only if” direction, so we check the “if” direction. For a given h\in{\mathcal{C}}\big{(}[0,1]\big{)} which satisfies the assumption, define hN:N→N by letting hN(n) be the unique m∈N such that ∥h∥Jnm=0 (which can be found effectively). Then F(hN)=h. Thus the image of F is closed by Lemma 3.7.
∎
Proposition 10.12** (RCA0).**
(1)
The LVP for bounded, continuous potentials on [0,1]×[0,1] with the max metric implies ACA0.
2. (2)
The LVP for bounded, continuous potentials on \mathcal{C}\big{(}[0,1]\big{)} implies Π11\mbox−CA0.
Proof.
Note that in view of Lemma 10.3 the bounded LVP implies the full LVP, so we may remove ‘bounded’ from both items.
For the first item, we note by Proposition 9.3 that the FVP on [0,1] implies ACA0.
Moreover, by Lemma 7.3.3d, since [0,1] is a closed ball in R1, we may replace the FVP by the 1-FVP.
Since [0,1]×[0,1] is clearly a [0,1]-pseudofibration of [0,1], by Lemma 10.7 we have that the LVP for bounded, continuous potentials on [0,1]×[0,1] implies ACA0 as well.
For the second, let X be the Baire space and Z be {\mathcal{C}}\big{(}[0,1]\big{)}. By Lemma 7.3.3b and Proposition 9.5, the 1-FVP on X implies Π11\mbox−CA0. By Lemma 10.11, X embeds by a closed isometry into Z, so that by Lemma 10.8, Z is an X-pseudofibration of [0,1]. Hence, reasoning as above, the LVP for bounded, continuous functions on Z implies the 1-FVP on X, from which we obtain Π11\mbox−CA0.
∎
Remark 10.13**.**
Let B be the closed unit ball of \mathcal{C}\big{(}[0,1]\big{)}. It is easy to see that the isometries used in Lemmas 10.9 and 10.11 map into B, and the proof of Lemma 10.8 can be modified to show that B is an X-pseudofibration of [0,1]. With this we may replace \mathcal{C}\big{(}[0,1]\big{)} by B in Proposition 10.12.2.
11. Conclusion
We have considered formalizations of Ekeland’s variational principle with various natural restrictions and have shown that they are equivalent to well-known theories of second-order arithmetic. In this final section we synthesize these results to produce the definitive versions of our main results.
There are three main weakenings of the LVP we considered: the free, rather than localized, statement; the case where X is compact; and the case where f is continuous. When the LVP is weakened in the three ways simultaneously, we obtain a statement equivalent to WKL0.
Theorem 11.1**.**
Over RCA0, the following are equivalent:
(a)
WKL0;
2. (b)
the FVP for continuous f and compact X;
3. (c)
the FVP for continuous f on the Cantor space or on [0,1].
Proof.
That (a) implies the other items is Proposition 7.4, while (c) implies (a) by Proposition 9.1.
∎
If we choose two, but not three, of the weakenings, we instead obtain statements equivalent to ACA0. Moreover, if we impose only one of the three weakenings, we typically obtain statements that are not provable in ACA0. The only exception to this is the case where X is compact.
Theorem 11.2**.**
Over RCA0, the following are equivalent:
(a)
ACA0;
2. (b)
the FVP for continuous f;
3. (c)
the FVP for compact X;
4. (d)
the LVP for compact X;
5. (e)
the LVP for continuous f and compact X.
Moreover in item (b) we may take X to be the Baire space, in items (c) and (d) we may take X=[0,1], and in (d) and (e) we may take X=[0,1]×[0,1] with the max metric.
Proof.
The equivalence between (a) and (b) is given by Theorem 7.5 and Proposition 9.2, while the equivalence between (a) and (c) is given by Theorem 8.2 and Proposition 9.3. By Lemma 10.4, (c) implies (d). Clearly, (d) implies (e), which by Proposition 10.12 implies (c).
∎
Other than compactness, if we impose at most one weakening to the LVP we obtain statements equivalent to Π11\mbox−CA0.
Theorem 11.3**.**
The following are equivalent:
(a)
Π11\mbox−CA0;
2. (b)
the FVP;
3. (c)
the LVP for continuous f;
4. (d)
the LVP.
Moreover, in (b) and (d) we may take X to be the Baire space, and in any of (b)-(d) we may take X to be (the closed unit ball of) \mathcal{C}\big{(}[0,1]\big{)}.
Proof.
The equivalence between (a) and (b) is given by Theorem 8.2 and Proposition 9.5. That (b) implies (d) is Lemma 10.4. Clearly, (d) implies (c), and the latter implies (a) by Proposition 10.12 (with Remark 10.13).
∎
We have also considered variants of the variational principle where f is bounded, but this restriction has not affected the proof-theoretic strength of the theorem; indeed, in each of these theorems we may additionally assume that f is bounded.
Acknowledgments
We thank our anonymous reviewers for their many helpful suggestions for improving the clarity of this work. This project was partially supported by the Fonds voor Wetenschappelijk Onderzoek – Vlaanderen Pegasus program, a grant from the John Templeton Foundation (“A new dawn of intuitionism: mathematical and philosophical advances” ID 60842), JSPS KAKENHI (grant numbers 16K17640 and 15H03634), JSPS Core-to-Core Program (A. Advanced Research Networks) and JAIST Research Grant 2018(Houga). The opinions expressed in this work are those of the authors and do not necessarily reflect the views of the John Templeton Foundation.