The Epsilon Calculus with Equality and Herbrand Complexity

Kenji Miyamoto; Georg Moser

arXiv:1904.11304·math.LO·July 2, 2019


TL;DR

This paper analyzes the complexity bounds of Herbrand disjunctions in epsilon calculus with equality, extending previous results and providing new upper and lower bounds for Herbrand's theorem.

Contribution

It offers the first detailed complexity analysis of Herbrand disjunctions in epsilon calculus with equality, building on and extending prior work without equality.

Findings

01 Established upper bounds for Herbrand disjunction length with equality.

02 Derived lower bounds matching the upper bounds, showing tight complexity estimates.

03 Extended the complexity analysis from epsilon calculus without equality to include equality.


Equations

$A(t)\to A(\varepsilon_{x}A(x))$

$\vec{u}=\vec{v}\to\varepsilon_{x}B(x,\vec{u})=\varepsilon_{x}B(x,\vec{v})$

$t::=x\mid a\mid f\vec{t}\mid\varepsilon_{x}A$

$A,B::=P\vec{t}\mid t=t^{\prime}\mid\neg A\mid A\to B\mid A\land B\mid A\lor B\mid\exists xA\mid\forall xA$

$\mathrm{FV}(z):=\{z\},\quad\mathrm{FV}(f\vec{t}\,):=\mathrm{FV}(P\vec{t}\,):=\bigcup_{i<|\vec{t}\,|}\mathrm{FV}(t_{i}),\quad\mathrm{FV}(\neg A):=\mathrm{FV}(A),$

$\mathrm{FV}(\mathsf{Q}xA):=\mathrm{FV}(\varepsilon_{x}A):=\mathrm{FV}(A)\setminus\{x\},\quad\mathrm{FV}(A\circ B):=\mathrm{FV}(A)\cup\mathrm{FV}(B).$

$w\{z/s\}:=s$ if $w\equiv z$, $\quad w\{z/s\}:=w$ if $w\not\equiv z$,

$(f\vec{t}\,)\{z/s\}:=f(\vec{t}\{z/s\}),\quad(P\vec{t}\,)\{z/s\}:=P(\vec{t}\{z/s\}),$

$(\neg A)\{z/s\}:=\neg(A\{z/s\}),\quad(A\circ B)\{z/s\}:=A\{z/s\}\circ B\{z/s\},$

$(\mathsf{Q}xA)\{z/s\}:=\mathsf{Q}xA$ and $(\varepsilon_{x}A)\{z/s\}:=\varepsilon_{x}A$ if $x\equiv z$,

$(\mathsf{Q}xA)\{z/s\}:=\mathsf{Q}x^{\prime}A\{x/x^{\prime}\}\{z/s\}$ and $(\varepsilon_{x}A)\{z/s\}:=\varepsilon_{x^{\prime}}(A\{x/x^{\prime}\}\{z/s\})$ otherwise,

$x\equiv_{\alpha}x,\quad a\equiv_{\alpha}a,\quad f\vec{s}\equiv_{\alpha}f\vec{t}:=\bigwedge_{i<|\vec{s}\,|}s_{i}\equiv_{\alpha}t_{i},$

$P\vec{s}\equiv_{\alpha}P\vec{t}:=\bigwedge_{i<|\vec{s}\,|}s_{i}\equiv_{\alpha}t_{i},\quad\neg A\equiv_{\alpha}\neg B:=A\equiv_{\alpha}B,$

$A\circ B\equiv_{\alpha}A^{\prime}\circ B^{\prime}:=A\equiv_{\alpha}A^{\prime}$ and $B\equiv_{\alpha}B^{\prime}$ for $\circ\in\{\to,\land,\lor\}$,

$\mathsf{Q}xA(x)\equiv_{\alpha}\mathsf{Q}yB(y):=\varepsilon_{x}A(x)\equiv_{\alpha}\varepsilon_{y}B(y):=A(z)\equiv_{\alpha}B(z)$ for a fresh $z$.

$t=t,$

$\vec{s}=\vec{t}\to P\vec{s}\to P\vec{t},$

$\frac{\Gamma\vdash A\qquad\Gamma\vdash A\to B}{\Gamma\vdash B}$

$\forall xA(x)\to A(t)$

$A(t)\to\exists xA(x)$

$\frac{\Gamma\vdash A\to B(a)}{\Gamma\vdash A\to\forall xB(x)}\ (\forall^{+})\qquad\frac{\Gamma\vdash A(a)\to B}{\Gamma\vdash\exists xA(x)\to B}\ (\exists^{-})$

$A(t)\to A(\varepsilon_{x}A(x)),$

$(A\to B(\varepsilon_{x}B(x)))\to A\to B(\varepsilon_{x}(A\to B(x))).$

$(A\to B(\varepsilon_{x}B(x)))\to A\to B(\varepsilon_{x}(A\to B(x)))$

$A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y)).$

$(A(\varepsilon_{y}A(y))\to A(\varepsilon_{y}A(y)))\to A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y))$

$A(\varepsilon_{y}A(y))\to A(\varepsilon_{y}A(y))$

$A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y))$

$\exists xA(x):=A(\varepsilon_{x}A(x)),\quad\forall xA(x):=A(\varepsilon_{x}\neg A(x)).$

$x^{\varepsilon}:=x,\quad a^{\varepsilon}:=a,\quad(f\vec{t}\,)^{\varepsilon}:=f\vec{t^{\varepsilon}},\quad(\varepsilon_{x}A)^{\varepsilon}:=\varepsilon_{x}A^{\varepsilon},\quad(P\vec{t}\,)^{\varepsilon}:=P\vec{t^{\varepsilon}},$

$(A\to B)^{\varepsilon}:=A^{\varepsilon}\to B^{\varepsilon},\quad(A\land B)^{\varepsilon}:=A^{\varepsilon}\land B^{\varepsilon},\quad(A\lor B)^{\varepsilon}:=A^{\varepsilon}\lor B^{\varepsilon},$

$(\neg A)^{\varepsilon}:=\neg A^{\varepsilon},\quad(\exists xA(x))^{\varepsilon}:=A^{\varepsilon}(\varepsilon_{x}A^{\varepsilon}(x)),\quad(\forall xA(x))^{\varepsilon}:=A^{\varepsilon}(\varepsilon_{x}\neg A^{\varepsilon}(x)).$

$(A\to\exists xB(x))\to\exists x.\,A\to B(x),$

$\exists x.\,A(x)\to\forall yA(y),$

$\mathbf{PC}^{=}\vdash\exists\vec{x}E(\vec{x}).$

$\mathbf{EC}^{=}\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i}).$

$u=v\to\varepsilon_{x}A(x,u)=\varepsilon_{x}A(x,v)$


Taxonomy

Topics: Logic, programming, and type systems · Polynomial and algebraic computation · Advanced Combinatorial Mathematics

Full text

The Epsilon Calculus with Equality and Herbrand Complexity

Kenji Miyamoto and Georg Moser


University of Innsbruck

Abstract

Hilbert’s epsilon calculus is an extension of elementary or predicate calculus by a term-forming operator $\varepsilon$ and initial formulas involving such terms. The fundamental results about the epsilon calculus are so-called epsilon theorems, which have been proven by means of the epsilon elimination method. It is a procedure of transforming a proof in epsilon calculus into a proof in elementary or predicate calculus through getting rid of those initial formulas. One remarkable consequence is a proof of Herbrand’s theorem due to Bernays and Hilbert which comes as a corollary of extended first epsilon theorem. The contribution of this paper is the upper and lower bounds analysis of the length of Herbrand disjunctions in extended first epsilon theorem for epsilon calculus with equality. We also show that the complexity analysis for Herbrand’s theorem with equality is a straightforward consequence of the one for extended first epsilon theorem without equality due to Moser and Zach.

Keywords. Hilbert’s epsilon calculus, epsilon theorems, Herbrand complexity, proof complexity

1 Introduction

Hilbert’s epsilon calculus is an extension of predicate calculus by the $\varepsilon$ -operator which forms for a formula $A(x)$ a term $\varepsilon_{x}A(x)$ . This operator is governed by the following two initial formulas: One is the critical formula

$$A(t)\to A(\varepsilon_{x}A(x))$$

where $t$ is an arbitrary term, and the other is the $\varepsilon$ -equality formula

$$\vec{u}=\vec{v}\to\varepsilon_{x}B(x,\vec{u})=\varepsilon_{x}B(x,\vec{v})$$

where $\vec{u}$ and $\vec{v}$ are sequences of terms $u_{0},u_{1},\ldots,u_{n-1}$ and $v_{0},v_{1},\ldots,v_{n-1}$ and $\vec{u}=\vec{v}$ stands for the conjunction of $u_{0}=v_{0}$ , $u_{1}=v_{1}$ , $\ldots$ , and $u_{n-1}=v_{n-1}$ for an arbitrary positive natural number $n$ , and the proper subterms of $\varepsilon_{x}B(x,\vec{a})$ are only variables $\vec{a}$ . Pure epsilon calculus is an extension of elementary calculus by the $\varepsilon$ -operator and the critical formula. The $\varepsilon$ -operator is expressive enough to encode the existential and universal quantifiers, so that they are definable as $\exists xA(x):=A(\varepsilon_{x}A(x))$ and $\forall xA(x):=A(\varepsilon_{x}\neg A(x))$ within the epsilon calculus.

The epsilon calculus was originally developed in the context of Hilbert’s program. Early work in proof theory (before Gentzen) concentrated on the epsilon calculus, the $\varepsilon$-elimination method, and the $\varepsilon$-substitution method, and those results were obtained by Bernays [HB39] (see also [Zac03, Zac04, MZ06]), Ackermann [Ack25, Ack40] (see also [Mos06]), and von Neumann [vN27]. The first correct proof of Herbrand’s theorem was given by means of the epsilon calculus [Bus94]. The theorem is commonly stated in a form less general than the original, as follows: If there is a proof of a prenex existential formula $\exists{\vec{x}}A(\vec{x})$ for quantifier-free $A(\vec{x})$ in predicate calculus, then there is a proof of $A(\vec{t}_{0})\lor A(\vec{t}_{1})\lor\ldots\lor A(\vec{t}_{k-1})$ for some terms $\vec{t}_{0},\vec{t}_{1},\ldots,\vec{t}_{k-1}$ in elementary calculus. The epsilon calculus is of independent and lasting interest, however, and a study from a computational and proof-theoretic point of view is particularly worthwhile.

In the course of proving the epsilon theorems and Herbrand’s theorem, the $\varepsilon$-elimination method is used to transform a proof in epsilon calculus into a proof which is free from the above-mentioned initial formulas. Assume there is a proof of $A(\vec{t}\,)$ in pure epsilon calculus, where $\vec{t}$ is a finite sequence of terms possibly with occurrences of $\varepsilon$-terms. Then the $\varepsilon$-elimination method generates another proof of the disjunction $A(\vec{s}_{0})\lor A(\vec{s}_{1})\lor\ldots\lor A(\vec{s}_{k-1})$ in elementary calculus, where $\vec{s}_{0},\vec{s}_{1},\ldots,\vec{s}_{k-1}$ are terms without the $\varepsilon$-operator. The disjunction is a so-called Herbrand disjunction for the formula $A(\vec{t}\,)$, and the aim of this paper is to analyse the Herbrand complexity, which is the length $k$ of the shortest Herbrand disjunction for the original formula. This paper extends the Herbrand complexity analysis by Moser and Zach [MZ06]. Their result tells us that the Herbrand complexity of a formula $A$ is based on a proof measure that speaks only about the first-order counterpart of a proof of $A$. While they dealt with systems of epsilon calculus without the $\varepsilon$-equality formula, we target epsilon calculus with the $\varepsilon$-equality formula and study upper and lower bounds of the Herbrand complexity for this system. Our contribution is divided into two parts. The first is a complexity analysis for Herbrand’s theorem in first-order logic with equality. In this case, we can avoid relying on epsilon calculus with the $\varepsilon$-equality formula, hence the result by Moser and Zach is directly applicable. The second is the upper and lower bound analyses for the extended first epsilon theorem with the $\varepsilon$-equality formula, where the upper bound analysis depends on a measure concerning the structure of critical formulas as well as the measure for first-order ingredients of a proof.

Hilbert’s epsilon calculus is primarily a classical formalism, and we will restrict our attention to classical first-order logic. For non-classical approaches to epsilon calculus, see the work of Bell [Bel93a, Bel93b], DeVidi [DeV95], Fitting [Fit75], Mostowski [Mos63], and Shirai [Shi71]. Our study is also motivated by the recent renewed interest in the epsilon calculus and the $\varepsilon$ -substitution method in, e.g., the work of Arai [Ara03, Ara05], Avigad [Avi02], Baaz et al. [BLL18], and Mints et al., [MT99, Min03]. The epsilon calculus also allows the incorporation of choice construction into logic [BG00]. The treatment of eigenvariables in the context of unsound proofs and its relation to the epsilon calculus is studied by Aguilera and Baaz [AB16]. On the semantics of epsilon calculus, see the work of Zach [Zac17].

The rest of this paper is organized as follows. Section 2 describes the syntax of epsilon calculus without the $\varepsilon$-equality formula, and Section 3 shows the embedding lemma, which states that predicate calculus is a subset of pure epsilon calculus without the $\varepsilon$-equality formula. A complexity analysis of Herbrand’s theorem for a prenex existential formula comes as a simple consequence of the lemma. In Section 4, the system is extended by the $\varepsilon$-equality formula, which makes the identity schema true within the system. Section 5 clarifies the subtlety of complexity analyses of a system with equality through Yukami’s trick [Yuk84]. In Section 6 we review the first and second epsilon theorems following the proof by Bernays. Section 7 and Section 8 are devoted to analysing the upper and the lower bounds, respectively, where Section 7 describes our complexity analysis for the extended first epsilon theorem and the upper bound of the Herbrand complexity. Section 9 concludes this paper.

2 Epsilon Calculus

We start by defining terms and formulas of our logic. As a convention we assume $x,y,z$ range over a set $\mathcal{BV}$ of bound variables, $a,b,c$ over a set $\mathcal{FV}$ of free variables, $f$ over a set $\mathcal{F}$ of function symbols, and $P,Q,R$ over a set $\mathcal{P}$ of predicate symbols. The symbol $=$ is reserved for the equality predicate. Each function symbol and predicate symbol has an arity, and $\mathcal{BV}$, $\mathcal{FV}$, $\mathcal{F}$, $\{=\}$, and $\mathcal{P}$ are disjoint. We abbreviate $t_{0},t_{1},\ldots,t_{k-1}$ as $\vec{t}$ and let $|\vec{t}\,|$ denote its length $k$. We define terms, formulas, and free variable occurrences. Notice the difference between free variables $\mathcal{FV}$ and free variable occurrences.

Definition 2.1 (Term and formula).

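Raw terms $t$ and raw formulas $A$, $B$ are simultaneously defined as follows.

$$\begin{array}{rcl}t&::=&x\mid a\mid f\vec{t}\mid\varepsilon_{x}A\\ A,B&::=&P\vec{t}\mid t=t^{\prime}\mid\neg A\mid A\to B\mid A\land B\mid A\lor B\mid\exists xA\mid\forall xA\end{array}$$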

Sets of free variable occurrences $\mathrm{FV}(t)$ and $\mathrm{FV}(A)$ are simultaneously defined, assuming $\mathrm{\circ}\in\{\to,\land,\lor\}$ , $\mathsf{Q}\in\{\exists,\forall\}$ , and $z\in\mathcal{FV}\cup\mathcal{BV}$ .

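$$\begin{array}{l}\mathrm{FV}(z):=\{z\},\quad\mathrm{FV}(f\vec{t}\,):=\mathrm{FV}(P\vec{t}\,):=\bigcup_{i<|\vec{t}\,|}\mathrm{FV}(t_{i}),\quad\mathrm{FV}(\neg A):=\mathrm{FV}(A),\\ \mathrm{FV}(\mathsf{Q}xA):=\mathrm{FV}(\varepsilon_{x}A):=\mathrm{FV}(A)\setminus\{x\},\quad\mathrm{FV}(A\circ B):=\mathrm{FV}(A)\cup\mathrm{FV}(B).\end{array}$$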

A raw term $t$ is a semiterm if $\mathrm{FV}(t)\cap\mathcal{BV}\neq\emptyset$ , and $t$ is a term if $\mathrm{FV}(t)\cap\mathcal{BV}=\emptyset$ . A raw formula $A$ is a semiformula if $\mathrm{FV}(A)\cap\mathcal{BV}\neq\emptyset$ , and $A$ is a formula if $\mathrm{FV}(A)\cap\mathcal{BV}=\emptyset$ . A (semi)formula and a (semi)term are quantifier free in case neither $\forall,\exists$ , nor $\varepsilon$ occurs in them.

We abbreviate $A_{0}\wedge A_{1}\wedge\ldots\wedge A_{n}$ as $\bigwedge_{i=0}^{n}A_{i}$ and also as $\bigwedge_{i<n+1}A_{i}$, and the same convention applies to $\bigvee$. Terms of the form $\varepsilon_{x}A$ are called $\varepsilon$-terms.
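As a concrete reading of Definition 2.1, the following Python sketch computes free variable occurrences and separates terms from semiterms. The tuple encoding, the names `free_vars` and `is_proper`, and the treatment of an equation like a binary predicate are assumptions made for illustration; none of them is taken from the paper.

```python
# Hypothetical tuple encoding (an assumption of this sketch, not the paper's):
#   ("bv", "x")  bound variable      ("fv", "a")  free variable
#   ("fn", "f", (t0, ..., tk))       ("eps", "x", A)
#   ("pred", "P", (t0, ..., tk))     ("eq", s, t)
#   ("not", A)   ("bin", op, A, B)   ("q", "exists" | "forall", "x", A)

def free_vars(e):
    """Set of variables (from BV or FV) occurring free in a raw term or formula."""
    tag = e[0]
    if tag in ("bv", "fv"):
        return {e}
    if tag in ("fn", "pred"):
        result = set()
        for arg in e[2]:
            result |= free_vars(arg)
        return result
    if tag == "eq":  # assumption: handled like a binary predicate
        return free_vars(e[1]) | free_vars(e[2])
    if tag == "not":
        return free_vars(e[1])
    if tag == "bin":
        return free_vars(e[2]) | free_vars(e[3])
    if tag == "eps":
        return free_vars(e[2]) - {("bv", e[1])}
    if tag == "q":
        return free_vars(e[3]) - {("bv", e[2])}
    raise ValueError(f"unknown tag: {tag!r}")


def is_proper(e):
    """True for terms and formulas, False for semiterms and semiformulas."""
    return all(v[0] != "bv" for v in free_vars(e))


body = ("pred", "P", (("bv", "x"), ("fv", "a")))  # P(x, a): a semiformula
eps_term = ("eps", "x", body)                     # eps_x P(x, a): a term
print(is_proper(body), is_proper(eps_term))
```

In this encoding the body $P(x,a)$ is a semiformula because the bound variable $x$ occurs free in it, while binding $x$ with $\varepsilon$ yields a term.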

Definition 2.2 (Substitution).

Assume $\mathrm{\circ}\in\{\to,\land,\lor\}$ and $\mathsf{Q}\in\{\exists,\forall\}$ . For (semi)terms $s,t$ , (semi)formulas $A,B$ , and variables $w,z\in\mathcal{FV}\cup\mathcal{BV}$ , the substitution $t\{z/s\}$ is defined as follows.

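$$\begin{array}{l}w\{z/s\}:=s\ \text{ if }w\equiv z,\qquad w\{z/s\}:=w\ \text{ if }w\not\equiv z,\\ (f\vec{t}\,)\{z/s\}:=f(\vec{t}\{z/s\}),\qquad(P\vec{t}\,)\{z/s\}:=P(\vec{t}\{z/s\}),\\ (\neg A)\{z/s\}:=\neg(A\{z/s\}),\qquad(A\circ B)\{z/s\}:=A\{z/s\}\circ B\{z/s\},\\ (\mathsf{Q}xA)\{z/s\}:=\mathsf{Q}xA\ \text{ and }\ (\varepsilon_{x}A)\{z/s\}:=\varepsilon_{x}A\ \text{ if }x\equiv z,\\ (\mathsf{Q}xA)\{z/s\}:=\mathsf{Q}x^{\prime}A\{x/x^{\prime}\}\{z/s\}\ \text{ and }\ (\varepsilon_{x}A)\{z/s\}:=\varepsilon_{x^{\prime}}(A\{x/x^{\prime}\}\{z/s\})\ \text{ otherwise,}\end{array}$$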

where $\vec{t}\{w/s\}:=t_{0}\{w/s\},t_{1}\{w/s\},\ldots,t_{n-1}\{w/s\}$ and $x^{\prime}$ is fresh.

We can write $A(a)$ for a formula $A$ with a free variable $a\in\mathrm{FV}(A)$ , and then $A\{a/t\}$ is abbreviated as $A(t)$ . This notation is extended through the vector notation and the simultaneous substitution. We employ the same for terms.
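The clauses of Definition 2.2 can be sketched in Python as follows. The tuple encoding (`("bv", x)`, `("fv", a)`, `("fn", f, args)`, `("pred", P, args)`, `("eq", s, t)`, `("not", A)`, `("bin", op, A, B)`, `("eps", x, A)`, `("q", Q, x, A)`), the counter-based supply of fresh names, and the handling of equations are assumptions of this sketch rather than part of the paper.

```python
import itertools

_counter = itertools.count()


def fresh_bound_name():
    """Return a bound-variable name assumed not to occur anywhere else."""
    return f"_x{next(_counter)}"


def subst(e, z, s):
    """Compute e{z/s} for a variable z = ("bv", name) or ("fv", name)."""
    tag = e[0]
    if tag in ("bv", "fv"):
        return s if e == z else e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(subst(t, z, s) for t in e[2]))
    if tag == "eq":  # assumption: handled like a binary predicate
        return ("eq", subst(e[1], z, s), subst(e[2], z, s))
    if tag == "not":
        return ("not", subst(e[1], z, s))
    if tag == "bin":
        return ("bin", e[1], subst(e[2], z, s), subst(e[3], z, s))
    if tag in ("eps", "q"):
        # ("eps", x, A) and ("q", Q, x, A) both end with the binder and body.
        head, x, body = e[:-2], e[-2], e[-1]
        if ("bv", x) == z:
            return e  # z is bound here, so nothing is replaced
        x_new = fresh_bound_name()  # rename the binder first, as in the definition
        renamed = subst(body, ("bv", x), ("bv", x_new))
        return head + (x_new, subst(renamed, z, s))
    raise ValueError(f"unknown tag: {tag!r}")


# eps_x P(x, y) with y replaced by x: the binder is renamed, so x stays free.
e = ("eps", "x", ("pred", "P", (("bv", "x"), ("bv", "y"))))
print(subst(e, ("bv", "y"), ("bv", "x")))
```

Renaming the binder unconditionally mirrors the last clause of the definition; an implementation could rename only when a capture would otherwise occur, at the cost of a free-variable check.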

Definition 2.3 ( $\alpha$ -equivalence).

We define the $\alpha$ -equivalence for (semi)terms and (semi)formulas as follows.

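$$\begin{array}{l}x\equiv_{\alpha}x,\qquad a\equiv_{\alpha}a,\qquad f\vec{s}\equiv_{\alpha}f\vec{t}:=\bigwedge_{i<|\vec{s}\,|}s_{i}\equiv_{\alpha}t_{i},\\ P\vec{s}\equiv_{\alpha}P\vec{t}:=\bigwedge_{i<|\vec{s}\,|}s_{i}\equiv_{\alpha}t_{i},\qquad\neg A\equiv_{\alpha}\neg B:=A\equiv_{\alpha}B,\\ A\circ B\equiv_{\alpha}A^{\prime}\circ B^{\prime}:=A\equiv_{\alpha}A^{\prime}\text{ and }B\equiv_{\alpha}B^{\prime}\text{ for }\circ\in\{\to,\land,\lor\},\\ \mathsf{Q}xA(x)\equiv_{\alpha}\mathsf{Q}yB(y):=\varepsilon_{x}A(x)\equiv_{\alpha}\varepsilon_{y}B(y):=A(z)\equiv_{\alpha}B(z)\text{ for a fresh }z.\end{array}$$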

We also define the term substitution $t\{s/u\}$ for (semi)terms $t,s,u$ through the $\alpha$ -equivalence instead of the equality on variables, and the simultaneous substitution.
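A possible executable reading of Definition 2.3 is sketched below in Python. Instead of substituting a fresh variable $z$ into both bodies, the sketch records the nesting depth at which each bound name was introduced and compares depths; this is a swapped-in but standard technique that decides the same relation. The tuple encoding (`("bv", x)`, `("fv", a)`, `("fn", f, args)`, `("pred", P, args)`, `("eq", s, t)`, `("not", A)`, `("bin", op, A, B)`, `("eps", x, A)`, `("q", Q, x, A)`) and the name `alpha_eq` are assumptions of this sketch.

```python
def alpha_eq(s, t, env_s=None, env_t=None, depth=0):
    """Decide s ≡α t; env_s/env_t map bound names to the depth of their binder."""
    env_s = {} if env_s is None else env_s
    env_t = {} if env_t is None else env_t
    if s[0] != t[0]:
        return False
    tag = s[0]

    def rec(a, b):
        return alpha_eq(a, b, env_s, env_t, depth)

    if tag == "bv":
        ds, dt = env_s.get(s[1]), env_t.get(t[1])
        if ds is None and dt is None:
            return s[1] == t[1]  # both occurrences are free: names must agree
        return ds == dt          # both must be bound by corresponding binders
    if tag == "fv":
        return s[1] == t[1]
    if tag in ("fn", "pred"):
        return (s[1] == t[1] and len(s[2]) == len(t[2])
                and all(rec(a, b) for a, b in zip(s[2], t[2])))
    if tag == "eq":  # assumption: handled like a binary predicate
        return rec(s[1], t[1]) and rec(s[2], t[2])
    if tag == "not":
        return rec(s[1], t[1])
    if tag == "bin":
        return s[1] == t[1] and rec(s[2], t[2]) and rec(s[3], t[3])
    if tag in ("eps", "q"):
        if s[:-2] != t[:-2]:  # same binder kind and, for "q", same quantifier
            return False
        return alpha_eq(s[-1], t[-1],
                        {**env_s, s[-2]: depth}, {**env_t, t[-2]: depth},
                        depth + 1)
    raise ValueError(f"unknown tag: {tag!r}")


def P(u, v):
    return ("pred", "P", (u, v))


x, y, a, b = ("bv", "x"), ("bv", "y"), ("fv", "a"), ("fv", "b")
print(alpha_eq(("eps", "x", P(x, a)), ("eps", "y", P(y, a))))
```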

Definition 2.4 (Set induced by vector).

For any vector $\vec{t}$ , a set $\{\vec{t}\,\}$ is defined to be $\bigcup_{i<|\vec{t}\,|}\{t_{i}\}$ via $\equiv_{\alpha}$ . We say a list of vectors $\vec{t}_{0},\ldots,\vec{t}_{k-1}$ is a split of $\vec{t}$ if $\{\vec{t}_{0}\}\uplus\cdots\uplus\{\vec{t}_{k-1}\}=\{\vec{t}\,\}$ and $\{\vec{t}_{i}\}\neq\emptyset$ for $0\leq i<k$ .
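The condition for a split in Definition 2.4 can be checked mechanically. The Python sketch below is illustrative only: the name `is_split` is hypothetical, and elements are compared by plain equality, whereas the definition forms the sets modulo $\equiv_{\alpha}$.

```python
def is_split(parts, whole):
    """Check that the sets induced by `parts` are non-empty, pairwise disjoint,
    and together equal to the set induced by `whole`."""
    covered = set()
    for part in parts:
        members = set(part)
        if not members or members & covered:
            return False  # an empty part, or an overlap with an earlier part
        covered |= members
    return covered == set(whole)


print(is_split([["s", "t"], ["u"]], ["s", "t", "u", "s"]))
```

Repeated entries in a vector do not matter here, since only the induced sets are compared.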

Definition 2.5 (Equality).

The following formulas are referred to by $\mathbf{EQ}$ .

$$t=t,\qquad\vec{s}=\vec{t}\to P\vec{s}\to P\vec{t},$$

Definition 2.6 (Elementary calculus and predicate calculus).

The system of elementary calculus is denoted by $\mathbf{EC}$ , where its initial formulas are propositional tautologies and its inference rule is modus ponens given as follows.

$$\frac{\Gamma\vdash A\qquad\Gamma\vdash A\to B}{\Gamma\vdash B}$$

The system of first-order predicate calculus is denoted by $\mathbf{PC}$ , where the initial formulas are propositional tautologies and the following formulas $(\forall^{-})$ and $(\exists^{+})$ .

$$\forall xA(x)\to A(t)\quad(\forall^{-})\qquad\qquad A(t)\to\exists xA(x)\quad(\exists^{+})$$

The inference rules of $\mathbf{PC}$ are modus ponens and the following $(\forall^{+})$ and $(\exists^{-})$ , where the eigenvariable $a$ may not occur in any formula in the axiom $\Gamma$ .

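$$\frac{\Gamma\vdash A\to B(a)}{\Gamma\vdash A\to\forall xB(x)}\ (\forall^{+})\qquad\qquad\frac{\Gamma\vdash A(a)\to B}{\Gamma\vdash\exists xA(x)\to B}\ (\exists^{-})$$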

$\mathbf{EC}$ and $\mathbf{PC}$ extended by the initial formulas $\mathbf{EQ}$ are called $\mathbf{EC}+\mathbf{EQ}$ and $\mathbf{PC}+\mathbf{EQ}$, respectively. We alternatively write $\mathbf{EC}^{=}$ and $\mathbf{PC}^{=}$ for them.

Definition 2.7 (Epsilon calculus).

Let a formula of the form

$$A(t)\to A(\varepsilon_{x}A(x)),$$

where $t$ is an arbitrary term and $A(a)$ is a formula containing $a$ , be a critical formula, and we define the systems $\mathbf{EC}_{\varepsilon}$ and $\mathbf{PC}_{\varepsilon}$ by extending $\mathbf{EC}$ and $\mathbf{PC}$ by taking such critical formulas as initial formulas. We say $\varepsilon_{x}A(x)$ is the critical $\varepsilon$ -term of the critical formula and the critical formula belongs to $\varepsilon_{x}A(x)$ .

Definition 2.8 (Proof).

Let $\mathbf{S}$ be a system which consists of initial formulas and inference rules, and assume a set $\Gamma$ of formulas which we call axioms. A list of formulas is a proof in $\mathbf{S}$ from $\Gamma$ , if each formula is an initial formula of $\mathbf{S}$ , a formula in $\Gamma$ , or a consequence of an inference rule of $\mathbf{S}$ referring to preceding formulas in the proof. We write $\mathbf{S},\Gamma\vdash_{\pi}A$ if and only if a formula $A$ is the last formula of the proof $\pi$ in system $\mathbf{S}$ from $\Gamma$ . We omit $\Gamma$ if it is empty and $\mathbf{S}$ if there is no confusion. An inference rule consists of one consequence and assumptions, and may be displayed using a horizontal line.
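To make the list-based notion of proof in Definition 2.8 concrete, here is a minimal Python sketch for systems whose only inference rule is modus ponens (as in $\mathbf{EC}$). The function name `is_proof`, the caller-supplied predicate `is_initial`, and the encoding of implications as `("bin", "->", A, B)` are assumptions of this sketch; recognising initial formulas is deliberately left to the caller.

```python
def is_proof(lines, is_initial, axioms=frozenset()):
    """Check that each line is an initial formula, an axiom, or follows from
    two preceding lines by modus ponens."""
    earlier = []
    for formula in lines:
        by_modus_ponens = any(
            ("bin", "->", premise, formula) in earlier for premise in earlier
        )
        if not (is_initial(formula) or formula in axioms or by_modus_ponens):
            return False
        earlier.append(formula)
    return True


A = ("pred", "A", ())
B = ("pred", "B", ())
A_implies_B = ("bin", "->", A, B)


def no_initial_formulas(formula):
    return False


print(is_proof([A, A_implies_B, B], no_initial_formulas, {A, A_implies_B}))
```

The check is quadratic in the number of lines, which is enough for small examples such as the two proofs given in this section.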

Definition 2.9 (Languages).

Let the language $L(\mathbf{PC}_{\varepsilon}^{=})$ be formulas and terms in Definition 2.1 and the language $L(\mathbf{EC}_{\varepsilon}^{=})$ be $L(\mathbf{PC}_{\varepsilon}^{=})$ without the universal and existential quantifiers. We denote by $L(\mathbf{PC}^{=})$ and $L(\mathbf{EC}^{=})$ the sublanguages of $L(\mathbf{PC}_{\varepsilon}^{=})$ and $L(\mathbf{EC}_{\varepsilon}^{=})$ without the $\varepsilon$ -operator, respectively. Also, $L(\mathbf{PC}_{\varepsilon})$ and $L(\mathbf{EC}_{\varepsilon})$ are the sublanguages of $L(\mathbf{PC}_{\varepsilon}^{=})$ and $L(\mathbf{EC}_{\varepsilon}^{=})$ without the equality symbol, respectively. Finally, $L(\mathbf{PC})$ and $L(\mathbf{EC})$ are the sublanguages of $L(\mathbf{PC}^{=})$ and $L(\mathbf{EC}^{=})$ without the equality symbol, respectively.

We give two examples of $\mathbf{EC}_{\varepsilon}$-proofs. The formulas in these examples are meant to be $\varepsilon$-calculus versions of the independence of premise and the drinker’s formula. See also Example 3.2 and Example 3.3.

Example 2.10.

Consider the following formula in $L(\mathbf{EC}_{\varepsilon})$ .

$$(A\to B(\varepsilon_{x}B(x)))\to A\to B(\varepsilon_{x}(A\to B(x))).\tag{1}$$

This formula (1) is an instance of the critical formula, hence a proof of (1) is given as follows.

$$(A\to B(\varepsilon_{x}B(x)))\to A\to B(\varepsilon_{x}(A\to B(x)))\qquad\text{(critical formula)}$$

Example 2.11.

Consider the following formula in $L(\mathbf{EC}_{\varepsilon})$ .

$$A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y)).\tag{2}$$

An $\mathbf{EC}_{\varepsilon}$ -proof of this formula (2) is given as follows.

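$$\begin{array}{ll}(A(\varepsilon_{y}A(y))\to A(\varepsilon_{y}A(y)))\to A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y))&\text{(critical formula)}\\ A(\varepsilon_{y}A(y))\to A(\varepsilon_{y}A(y))&\text{(tautology)}\\ A(\varepsilon_{x}(A(x)\to A(\varepsilon_{y}A(y))))\to A(\varepsilon_{y}A(y))&\text{(modus ponens)}\end{array}$$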

We conclude this section with the following basic results.

Theorem 2.12 (Deduction theorem).

Assume $A$ is a closed formula. $\Gamma\vdash A\to B$ iff $\Gamma,A\vdash B$ in $\mathbf{PC}_{\varepsilon}$ and in $\mathbf{EC}_{\varepsilon}$ .

Lemma 2.13 (Identity schema).

For any formula $A(a)$ and terms $s,t$ in $L(\mathbf{S})$ , $\mathbf{S}\vdash s=t\to A(s)\to A(t)$ holds for $\mathbf{S}\in\{\mathbf{PC}^{=},\mathbf{EC}^{=}\}$ .

Proof.

By induction on the size of $A(a)$ . ∎

Note that the above identity schema is not available in $\mathbf{PC}^{=}$ and $\mathbf{EC}^{=}$ , if the language is extended to $L(\mathbf{PC}_{\varepsilon}^{=})$ and $L(\mathbf{EC}_{\varepsilon}^{=})$ , respectively. In Section 4, we deal with epsilon calculus with the $\varepsilon$ -equality formula, within which the identity schema is recovered for $L(\mathbf{PC}_{\varepsilon}^{=})$ and $L(\mathbf{EC}_{\varepsilon}^{=})$ .

3 Embedding Lemma

Hilbert introduced the epsilon operator to encode quantifiers, so that predicate calculus goes to elementary calculus extended with the critical formula. This section describes this encoding of $\mathbf{PC}$ within $\mathbf{EC}_{\varepsilon}$ . The idea is to define the quantifiers by $\varepsilon$ -operator as follows, and recursively apply them.

$$\exists xA(x):=A(\varepsilon_{x}A(x)),\qquad\forall xA(x):=A(\varepsilon_{x}\neg A(x)).$$

Definition 3.1 ( $\varepsilon$ -translation).

For a (semi)term $t$ and a (semi)formula $A$ we define its $\varepsilon$ -translation $t^{\varepsilon}$ and $A^{\varepsilon}$ . Let $\vec{t^{\varepsilon}}$ stand for $t_{0}^{\varepsilon},\ldots,t_{|\vec{t}\,|-1}^{\varepsilon}$ .

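$$\begin{array}{l}x^{\varepsilon}:=x,\quad a^{\varepsilon}:=a,\quad(f\vec{t}\,)^{\varepsilon}:=f\vec{t^{\varepsilon}},\quad(\varepsilon_{x}A)^{\varepsilon}:=\varepsilon_{x}A^{\varepsilon},\quad(P\vec{t}\,)^{\varepsilon}:=P\vec{t^{\varepsilon}},\\ (A\to B)^{\varepsilon}:=A^{\varepsilon}\to B^{\varepsilon},\quad(A\land B)^{\varepsilon}:=A^{\varepsilon}\land B^{\varepsilon},\quad(A\lor B)^{\varepsilon}:=A^{\varepsilon}\lor B^{\varepsilon},\\ (\neg A)^{\varepsilon}:=\neg A^{\varepsilon},\quad(\exists xA(x))^{\varepsilon}:=A^{\varepsilon}(\varepsilon_{x}A^{\varepsilon}(x)),\quad(\forall xA(x))^{\varepsilon}:=A^{\varepsilon}(\varepsilon_{x}\neg A^{\varepsilon}(x)).\end{array}$$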
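The recursive character of the $\varepsilon$-translation, in which $\exists xA(x)$ becomes $A^{\varepsilon}(\varepsilon_{x}A^{\varepsilon}(x))$ and $\forall xA(x)$ becomes $A^{\varepsilon}(\varepsilon_{x}\neg A^{\varepsilon}(x))$, can be illustrated by a short Python sketch. The tuple encoding, the function names, and the componentwise treatment of equations are assumptions of this sketch. It also assumes that nested binders use distinct bound-variable names, so that a plain replacement without renaming suffices.

```python
# Hypothetical encoding: ("bv", x), ("fv", a), ("fn", f, args), ("eps", x, A),
# ("pred", P, args), ("eq", s, t), ("not", A), ("bin", op, A, B), ("q", Q, x, A).

def subst_bv(e, x, s):
    """Replace free occurrences of bound variable x in a quantifier-free e by s.
    No renaming is done: nested binders are assumed to have distinct names."""
    tag = e[0]
    if tag == "bv":
        return s if e[1] == x else e
    if tag == "fv":
        return e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(subst_bv(t, x, s) for t in e[2]))
    if tag == "eq":
        return ("eq", subst_bv(e[1], x, s), subst_bv(e[2], x, s))
    if tag == "not":
        return ("not", subst_bv(e[1], x, s))
    if tag == "bin":
        return ("bin", e[1], subst_bv(e[2], x, s), subst_bv(e[3], x, s))
    if tag == "eps":
        return e if e[1] == x else ("eps", e[1], subst_bv(e[2], x, s))
    raise ValueError(f"unexpected tag: {tag!r}")


def eps_translate(e):
    """Map a term or formula to its epsilon-translation (no quantifiers remain)."""
    tag = e[0]
    if tag in ("bv", "fv"):
        return e
    if tag in ("fn", "pred"):
        return (tag, e[1], tuple(eps_translate(t) for t in e[2]))
    if tag == "eq":  # assumption: translated componentwise
        return ("eq", eps_translate(e[1]), eps_translate(e[2]))
    if tag == "not":
        return ("not", eps_translate(e[1]))
    if tag == "bin":
        return ("bin", e[1], eps_translate(e[2]), eps_translate(e[3]))
    if tag == "eps":
        return ("eps", e[1], eps_translate(e[2]))
    if tag == "q":
        quantifier, x, body = e[1], e[2], eps_translate(e[3])
        witness_body = body if quantifier == "exists" else ("not", body)
        return subst_bv(body, x, ("eps", x, witness_body))
    raise ValueError(f"unknown tag: {tag!r}")


A = ("pred", "A", ())
Bx = ("pred", "B", (("bv", "x"),))
# (A -> exists x B(x)) -> exists x (A -> B(x))
formula = ("bin", "->",
           ("bin", "->", A, ("q", "exists", "x", Bx)),
           ("q", "exists", "x", ("bin", "->", A, Bx)))
translated = eps_translate(formula)
print(translated)
```

Because inner quantifiers are translated before the enclosing one is replaced, the resulting $\varepsilon$-terms can nest, which is where the growth in term size under the translation comes from.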

Example 3.2.

Here is the formula of independence of premise in $L(\mathbf{PC})$ ,

$$(A\to\exists xB(x))\to\exists x.\,A\to B(x),$$

whose $\varepsilon$ -translation is the formula (1) in Example 2.10.
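To spell out the computation behind this example, the following derivation applies Definition 3.1 to the independence of premise, $(A\to\exists xB(x))\to\exists x.\,A\to B(x)$, step by step. It assumes, for illustration, that $A$ and $B(x)$ contain neither quantifiers nor $\varepsilon$-terms, so that $A^{\varepsilon}=A$ and $B^{\varepsilon}=B$.

```latex
\begin{align*}
(A\to\exists xB(x))^{\varepsilon}
  &= A\to(\exists xB(x))^{\varepsilon}
   = A\to B(\varepsilon_{x}B(x)),\\
(\exists x.\,A\to B(x))^{\varepsilon}
  &= A\to B(\varepsilon_{x}(A\to B(x))),\\
\bigl((A\to\exists xB(x))\to\exists x.\,A\to B(x)\bigr)^{\varepsilon}
  &= (A\to B(\varepsilon_{x}B(x)))\to A\to B(\varepsilon_{x}(A\to B(x))).
\end{align*}
```

In the second line the whole matrix $A\to B(x)$ is placed under the $\varepsilon$-operator, and the resulting term is substituted for $x$ in that same matrix.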

Example 3.3.

Here is the drinker’s formula in $L(\mathbf{PC})$ ,

$$\exists x.\,A(x)\to\forall yA(y),$$

whose $\varepsilon$ -translation is the formula (2) in Example 2.11.

Remark 3.4.

The above two examples also show that the $\varepsilon$ -translation of a formula which is not provable in intuitionistic logic can be provable in $\mathbf{EC}_{\varepsilon}$ without using any classical propositional tautology.

Definition 3.5 (Regular proof).

A proof is regular if each eigenvariable in the proof is used by at most one $\forall^{+}$ or $\exists^{-}$ .

Definition 3.6 (Proof size).

The size $|\pi|$ of a proof $\pi$ is the length of the list.

If there is a proof, then a regular proof is always available, and its size is polynomially bounded in the size of the original non-regular proof. This fact follows from the following theorem due to Krajíček [Kra94].

Theorem 3.7.

Let $\left\lVert\phi\right\rVert^{s}$ and $\left\lVert\phi\right\rVert^{t}$ be the size of the smallest sequence-proof and tree-proof of a provable first-order formula $\phi$ in the Hilbert style calculus, respectively. Then there exists a polynomial $p(x)$ such that $\left\lVert\phi\right\rVert^{t}\leq p(\left\lVert\phi\right\rVert^{s})$ for every provable first-order formula $\phi$ .

In the rest of this paper we implicitly assume the regularity of proofs.

Definition 3.8 (Critical count).

For a proof $\pi$ , we let $\mathsf{cc}(\pi)$ be the number of critical formulas, $\forall^{-}$ , and $\exists^{+}$ in $\pi$ .

Lemma 3.9 (Embedding).

Assume $\mathbf{PC}+\mathbf{EQ}\vdash_{\pi}A$ for a formula $A\in L(\mathbf{PC}^{=})$ , then $\mathbf{EC}_{\varepsilon}+\mathbf{EQ}\vdash_{\rho}A^{\varepsilon}$ for some $\rho$ with $\mathsf{cc}(\rho)\leq\mathsf{cc}(\pi)$ .

Proof.

We refer to the formula at line $k$ in the proof by $A_{k}$. The proof is by induction on the length $l:=|\pi|$. If $l=0$ it is trivial. We prove the case $l+1$ by case analysis on how the formula $A_{l+1}$ at line $l+1$ is derived. In case it comes by modus ponens using $A_{i}$ and $A_{j}$, where $A_{j}$ is of the form $A_{i}\to A_{l+1}$, for $i,j\leq l$, by the induction hypotheses there are proofs $\rho_{1}$ and $\rho_{2}$ concluding ${A_{i}}^{\varepsilon}$ and ${A_{i}}^{\varepsilon}\to{A_{l+1}}^{\varepsilon}$, respectively, hence ${A_{l+1}}^{\varepsilon}$ follows by modus ponens. In case $A_{l+1}$ is derived by $\forall^{+}$, $A_{l+1}$ is of the form $\forall xA_{i}(x)$ for $i\leq l$. As $(\forall xA_{i}(x))^{\varepsilon}={A_{i}}^{\varepsilon}(\varepsilon_{x}\neg{A_{i}}^{\varepsilon}(x))$, it suffices to substitute $\varepsilon_{x}\neg{A_{i}}^{\varepsilon}(x)$ for $x$ throughout the proof of ${A_{i}}^{\varepsilon}(x)$, which exists by the induction hypothesis. Here we use the regularity of the proof. In case $A_{l+1}$ is derived by $\exists^{-}$, $A_{l+1}$ is of the form $\exists x.B(x)\to C$ and $A_{i}=B(t)\to C$ for $i\leq l$. As $(\exists x.B(x)\to C)^{\varepsilon}={B}^{\varepsilon}(\varepsilon_{x}({B}^{\varepsilon}(x)\to C^{\varepsilon}))\to C^{\varepsilon}$, it suffices to use modus ponens with a critical formula and $B^{\varepsilon}(t^{\varepsilon})\to C^{\varepsilon}$, which comes by the induction hypothesis. In case $A_{l+1}$ is by $\forall^{-}$, $A_{l+1}$ is of the form $\forall xB(x)\to B(t)$ and hence we prove $B^{\varepsilon}(\varepsilon_{x}(\neg B^{\varepsilon}(x)))\to B^{\varepsilon}(t^{\varepsilon})$, whose contrapositive is a critical formula. In case $A_{l+1}$ is by $\exists^{+}$, $A_{l+1}$ is of the form $B(t)\to\exists xB(x)$ and hence we prove $B^{\varepsilon}(t^{\varepsilon})\to B^{\varepsilon}(\varepsilon_{x}B^{\varepsilon}(x))$, which is immediate as it is a critical formula. The remaining cases are propositional tautologies and $\mathbf{EQ}$, which are all trivial. ∎

Theorem 3.10 (Herbrand’s theorem).

Assume $\exists\vec{x}E(\vec{x})$ is a prenex existential formula in $L(\mathbf{PC}^{=})$ , namely, $E(\vec{x})$ is quantifier free, and

$$\mathbf{PC}^{=}\vdash_{\pi}\exists\vec{x}E(\vec{x}).$$

Then there are $\varepsilon$ -free terms $\vec{t}_{0},\vec{t}_{1},\ldots,\vec{t}_{n}$ for $n\leq 2_{2\cdot\mathsf{cc}(\pi)}^{3\cdot\mathsf{cc}(\pi)}$ such that

$$\mathbf{EC}^{=}\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i}).$$

Proof.

Assume $\pi$ is the $\mathbf{PC}^{=}$-proof of $\exists\vec{x}E(\vec{x})$. By Lemma 3.9, there is an $\mathbf{EC}_{\varepsilon}$-proof $\rho$ of $(\exists\vec{x}E(\vec{x}))^{\varepsilon}$, namely of $E(\vec{e})$ for some $\varepsilon$-terms $\vec{e}$. The conclusion then follows from the extended first epsilon theorem for $\mathbf{EC}_{\varepsilon}$ (cf. Theorem 16 in [MZ06]) with $\mathbf{EQ}$ being propositional tautologies. ∎
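To give a feeling for the size of the bound $2_{2\cdot\mathsf{cc}(\pi)}^{3\cdot\mathsf{cc}(\pi)}$ in Theorem 3.10, the following Python sketch evaluates it for very small critical counts. It assumes the usual hyperexponential reading of the notation, $2_{0}^{x}:=x$ and $2_{i+1}^{x}:=2^{2_{i}^{x}}$, which is not spelled out in this part of the text; the function names are hypothetical.

```python
def hyperexp(height, top):
    """Tower of `height` twos topped by `top`, under the assumed reading
    2_0^x = x and 2_{i+1}^x = 2 ** (2_i^x)."""
    value = top
    for _ in range(height):
        value = 2 ** value
    return value


def herbrand_bound(cc):
    """The bound 2_{2*cc}^{3*cc}; only feasible to evaluate for cc <= 1."""
    return hyperexp(2 * cc, 3 * cc)


print(herbrand_bound(1))  # 2 ** (2 ** 3)
```

Already for $\mathsf{cc}(\pi)=2$ the tower has height four above the exponent $6$, so the value is far beyond anything that can be written out; the bound is meaningful as an asymptotic statement rather than as a practical estimate.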

4 Epsilon Calculus with the $\varepsilon$ -Equality Formula

Epsilon calculus with equality was originally introduced by Hilbert [HB39]. Assuming $\varepsilon_{x}A(x,a)$ is an $\varepsilon$ -matrix, he formulated the $\varepsilon$ -equality formula as follows.

$$u=v\to\varepsilon_{x}A(x,u)=\varepsilon_{x}A(x,v)\tag{3}$$

In this section we adopt a variant of the $\varepsilon$ -equality formula which is given as follows via the vector notation.

$$\vec{u}=\vec{v}\to\varepsilon_{x}A(x,\vec{u})=\varepsilon_{x}A(x,\vec{v})\tag{4}$$

Then we define our system of epsilon calculus with equality $\mathbf{EC}_{\varepsilon}^{=}$ to be $\mathbf{EC}_{\varepsilon}+\mathbf{EQ}$ extended with the initial formula (4). The $\varepsilon$ -elimination method and the proofs of epsilon theorems for $\mathbf{EC}_{\varepsilon}^{=}$ can be simpler than the ones for the original system by Hilbert. While the notion of closures is crucial in Hilbert and Bernays’ work, we do not need this notion in $\mathbf{EC}_{\varepsilon}^{=}$ . Moreover, concerning the hyperexponential part of the upper bound analysis of the Herbrand complexity, our result for $\mathbf{EC}_{\varepsilon}^{=}$ is better than the one for the system with (3), as it will be shown in Section 7.3.

Definition 4.1 ( $\varepsilon$ -matrix and semicolon notation).

An $\varepsilon$ -term $e$ is an $\varepsilon$ -matrix iff each proper subterm of $e$ is a free variable and each free variable in $e$ occurs exactly once. The $\varepsilon$ -matrix and its immediate subsemiformula can be denoted as $\varepsilon_{x}A(x;\vec{a})$ and as $A(x;\vec{a})$ , respectively, if and only if $\varepsilon_{x}A(x,\vec{a})$ is an $\varepsilon$ -matrix with its free variables $\vec{a}$ . We call the free variables $\vec{a}$ of an $\varepsilon$ -matrix $\varepsilon_{x}A(x;\vec{a})$ its parameters.

Conventionally, we let $g$ range over $\varepsilon$ -matrices, possibly with its parameters $\vec{a}$ explicitly denoted as $g(\vec{a})$ . For any $\varepsilon$ -term, its $\varepsilon$ -matrix is uniquely determined modulo free variable names. If $e$ is a critical $\varepsilon$ -term, the $\varepsilon$ -matrix of $e$ is called a critical $\varepsilon$ -matrix.

Definition 4.2 (Arity of $\varepsilon$ -matrix).

For an $\varepsilon$ -matrix $\varepsilon_{x}A(x;\vec{a})$ , we define its arity $\mathsf{a}(\varepsilon_{x}A(x;\vec{a}))$ to be $|\vec{a}|$ . Let $\mathsf{ma}(\pi,r)$ be the maximal arity of critical $\varepsilon$ -matrices of rank $r$ in $\pi$ , and $\mathsf{ma}(\pi)$ be $\max\{\mathsf{ma}(\pi,r)\mid r\leq\mathsf{rk}(\pi)\}$ .

Lemma 4.3.

If $e$ is an $\varepsilon$ -term, then $e\equiv_{\alpha}g(\vec{t}\,)$ for some $\varepsilon$ -matrix $g(\vec{a})$ and $\vec{t}$ .

Proof.

Let $\vec{t}$ be all the immediate subterms of $e$ and $\vec{a}$ be fresh variables, so that $e\equiv_{\alpha}e(\vec{t}\,)$ . Then, $e(\vec{a})$ is the $\varepsilon$ -matrix $g(\vec{a})$ . ∎
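The construction in this proof, replacing every immediate subterm of $e$ by a fresh free variable, can be sketched in Python as follows. The tuple encoding, the names `has_free_bv` and `matrix_of`, and the fresh parameter names `_p0`, `_p1`, ... (assumed not to occur in the input) are assumptions made for illustration, and the input is assumed to be free of quantifiers.

```python
# Hypothetical encoding: ("bv", x), ("fv", a), ("fn", f, args), ("eps", x, A),
# ("pred", P, args), ("eq", s, t), ("not", A), ("bin", op, A, B).
TERM_TAGS = {"bv", "fv", "fn", "eps"}


def has_free_bv(e, bound=frozenset()):
    """True if some bound variable occurs free in e, i.e. e is a semiterm."""
    tag = e[0]
    if tag == "bv":
        return e[1] not in bound
    if tag == "fv":
        return False
    if tag in ("fn", "pred"):
        return any(has_free_bv(t, bound) for t in e[2])
    if tag == "eq":
        return has_free_bv(e[1], bound) or has_free_bv(e[2], bound)
    if tag == "not":
        return has_free_bv(e[1], bound)
    if tag == "bin":
        return has_free_bv(e[2], bound) or has_free_bv(e[3], bound)
    if tag == "eps":
        return has_free_bv(e[2], bound | {e[1]})
    raise ValueError(f"unexpected tag: {tag!r}")


def matrix_of(eps_term):
    """Return (matrix, args): the epsilon-matrix of eps_term and the terms that
    were replaced by its parameters, in left-to-right order."""
    if eps_term[0] != "eps":
        raise ValueError("expected an epsilon-term")
    args = []

    def abstract(e):
        tag = e[0]
        if tag in TERM_TAGS and not has_free_bv(e):
            args.append(e)  # a maximal proper subterm that is a term
            return ("fv", f"_p{len(args) - 1}")
        if tag == "bv":
            return e
        if tag in ("fn", "pred"):
            return (tag, e[1], tuple(abstract(t) for t in e[2]))
        if tag == "eq":
            return ("eq", abstract(e[1]), abstract(e[2]))
        if tag == "not":
            return ("not", abstract(e[1]))
        if tag == "bin":
            return ("bin", e[1], abstract(e[2]), abstract(e[3]))
        if tag == "eps":
            return ("eps", e[1], abstract(e[2]))
        raise ValueError(f"unexpected tag: {tag!r}")

    return ("eps", eps_term[1], abstract(eps_term[2])), args


x, y, a, b = ("bv", "x"), ("bv", "y"), ("fv", "a"), ("fv", "b")
inner = ("eps", "y", ("pred", "Q", (x, y, b)))           # a semiterm: x is free
e = ("eps", "x", ("pred", "P", (x, ("fn", "f", (a,)), inner)))
matrix, args = matrix_of(e)
print(matrix, args, len(args))
```

For $e=\varepsilon_{x}P(x,f(a),\varepsilon_{y}Q(x,y,b))$ the sketch abstracts $f(a)$ and $b$ but not the semiterm $\varepsilon_{y}Q(x,y,b)$, giving a matrix with two parameters.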

The epsilon calculus with equality by Hilbert and Bernays also employs the $\varepsilon$ -equality formula as an initial formula.

Definition 4.4 (Epsilon calculus with the $\varepsilon$ -equality formula).

Let $\mathbf{PC}_{\varepsilon}^{=}$ and $\mathbf{EC}_{\varepsilon}^{=}$ be $\mathbf{PC}_{\varepsilon}+\mathbf{EQ}$ and $\mathbf{EC}_{\varepsilon}+\mathbf{EQ}$ extended with the following additional initial formula, respectively,

$$\vec{u}=\vec{v}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v}),$$

where $\vec{u}$ and $\vec{v}$ are term vectors of the same length as of the parameters $\vec{a}$ of $\varepsilon$ -matrix $\varepsilon_{x}A(x;\vec{a})$ . A formula of the form $\vec{u}=\vec{v}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v})$ is an $\varepsilon$ -equality formula, where $\varepsilon_{x}A(x;\vec{u})$ and $\varepsilon_{x}A(x;\vec{v})$ are called the critical $\varepsilon$ -terms of the $\varepsilon$ -equality formula. We also say that the $\varepsilon$ -equality formula belongs to $\varepsilon_{x}A(x;\vec{u})$ and to $\varepsilon_{x}A(x;\vec{v})$ .

According to the semicolon notation, the $\varepsilon$ -equality formula always belongs to critical $\varepsilon$ -terms $\varepsilon_{x}A(x;\vec{u}),\varepsilon_{x}A(x;\vec{v})$ which were formed by applying substitutions $\{\vec{a}/\vec{u}\},\{\vec{a}/\vec{v}\}$ to an $\varepsilon$ -matrix $\varepsilon_{x}A(x;\vec{a})$ . The next section details this constraint from a perspective of complexity analysis. Due to the $\varepsilon$ -equality formula, the identity schema is available in $L(\mathbf{PC}_{\varepsilon}^{=})$ .

Lemma 4.5 (Identity schema).

Let a formula $A(a)$ and terms $s,t$ be in $L(\mathbf{PC}_{\varepsilon}^{=})$ , then $\mathbf{PC}_{\varepsilon}^{=}\vdash s=t\to A(s)\to A(t)$ . The same holds for $L(\mathbf{EC}_{\varepsilon}^{=})$ in $\mathbf{EC}_{\varepsilon}^{=}$ .

Proof.

By induction. ∎

We further define means of measuring complexity of terms and proofs, which are used in the next sections to study procedures of eliminating critical $\varepsilon$ -terms. The rank counts the depth of nesting $\varepsilon$ -semiterms, while the degree counts the depth of nesting $\varepsilon$ -terms. Here we suppose that $\max\{\}=0$ .

Definition 4.6 (Rank).

We define the rank $\mathsf{rk}(t)$ for a (semi)term $t$ .

[TABLE]

where $t$ subordinates $\varepsilon_{x}A(x)$ iff $x\in\mathrm{FV}(t)$ and $t$ is a subsemiterm of $A(x)$. We define $\mathsf{rk}(\pi)$ to be $\max\{\mathsf{rk}(e_{0}),\ldots,\mathsf{rk}(e_{n-1})\}$, where $e_{0},\ldots,e_{n-1}$ are the critical $\varepsilon$-terms in $\pi$.

The rank is stable against substitutions.

Lemma 4.7.

For any terms $t(\vec{a})$ and $\vec{u}$ , $\mathsf{rk}(t(\vec{a}))=\mathsf{rk}(t(\vec{u}))$ .

Proof.

Comparing with $t(\vec{a})$ , nothing new is subordinating in $t(\vec{u})$ due to the substitution of $\vec{u}$ for $\vec{a}$ , hence it is obvious from Definition 4.6. ∎

Lemma 4.8.

For an $\varepsilon$ -matrix $\varepsilon_{x}A(x,\vec{b})$ , $\mathsf{rk}(\varepsilon_{x}A(x,\vec{b}))=\mathsf{rk}(A(a,\vec{b}))+1$ .

Proof.

By induction on the construction of $\varepsilon_{x}A(x,\vec{b})$ . ∎

Definition 4.9 (Degree).

For a (semi)term $t$ , we define its degree $\mathsf{deg}(t)$ .

[TABLE]
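
In contrast to the rank, the degree also registers nested $\varepsilon$ -terms that do not subordinate. As a sketch, assume the usual reading that an $\varepsilon$ -term without proper $\varepsilon$ -subterms has degree $1$ and that each further level of nested $\varepsilon$ -subterms adds $1$ (an assumption here, since the defining table is not reproduced), and let $Q,R,c$ be hypothetical symbols. Then

$$\mathsf{deg}(\varepsilon_{y}R(c,y))=1,\qquad\mathsf{deg}(\varepsilon_{x}Q(x,\varepsilon_{y}R(c,y)))=2,\qquad\mathsf{rk}(\varepsilon_{x}Q(x,\varepsilon_{y}R(c,y)))=\mathsf{rk}(\varepsilon_{x}Q(x,a))=1,$$

where the last equation is an instance of Lemma 4.7.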

Definition 4.10 (Maximal critical $\varepsilon$ -term).

The maximal critical $\varepsilon$ -terms of a proof $\pi$ are the critical $\varepsilon$ -terms of the greatest degree among the critical $\varepsilon$ -terms of the greatest rank in $\pi$ .

We conclude this section by defining measures for the proof complexity based on critical $\varepsilon$ -terms, $\varepsilon$ -matrices, critical formulas and $\varepsilon$ -equality formulas.

Definition 4.11 (Order).

For a proof $\pi$ , the number of distinct critical $\varepsilon$ -terms of rank $r$ in $\pi$ is denoted by $\mathsf{o}(\pi,r)$ which we call the order, and the number of distinct $\varepsilon$ -matrices is defined in the same manner and denoted by $\mathsf{m}(\pi,r)$ which we call the matrix order.

Definition 4.12 (Width).

Define $\mathsf{wd}^{\varepsilon}(\pi,e)$ and $\mathsf{wd}^{=}(\pi,e)$ to be the number of distinct critical formulas belonging to $e$ in $\pi$ and the number of distinct $\varepsilon$ -equality formulas belonging to $e$ in $\pi$ , respectively. The width $\mathsf{wd}(\pi,e)$ is defined to be $\mathsf{wd}^{\varepsilon}(\pi,e)+\mathsf{wd}^{=}(\pi,e)$ . Let $\vec{e}$ be the critical $\varepsilon$ -terms in $\pi$ ; then the maximal width $\mathsf{mwd}(\pi,r)$ is defined to be $\max\{\mathsf{wd}(\pi,e_{i})\mid\mathsf{rk}(e_{i})=r\}$ .

In order to measure the number of $\varepsilon$ -equality formulas, we replace the notion of the critical count in Definition 3.8 by the following one.

Definition 4.13 (Critical count).

Assume $\pi$ is a proof in $\mathbf{PC}_{\varepsilon}^{=}$ or $\mathbf{EC}_{\varepsilon}^{=}$ . The critical count $\mathsf{cc}(\pi)$ of $\pi$ is defined to be the sum of the numbers of critical formulas, $\varepsilon$ -equality formulas, $\forall^{-}$ , and $\exists^{+}$ in $\pi$ . We let $\mathsf{cc}^{\varepsilon}(\pi)$ and $\mathsf{cc}^{=}(\pi)$ be the numbers of critical formulas and of $\varepsilon$ -equality formulas in $\pi$ , respectively.
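
The measures of Definitions 4.11 to 4.13 can be read off a toy example. Suppose, purely for illustration, that an $\mathbf{EC}_{\varepsilon}^{=}$ -proof $\pi$ uses, each exactly once, the following two critical formulas and one $\varepsilon$ -equality formula and no $\forall^{-}$ or $\exists^{+}$ , where $P,Q$ are hypothetical predicate symbols and $s,t,u,v$ are pairwise distinct $\varepsilon$ -free terms:

$$P(s)\to P(\varepsilon_{x}P(x)),\qquad P(t)\to P(\varepsilon_{x}P(x)),\qquad u=v\to\varepsilon_{x}Q(x;u)=\varepsilon_{x}Q(x;v).$$

The critical $\varepsilon$ -terms are $\varepsilon_{x}P(x)$ , $\varepsilon_{x}Q(x;u)$ and $\varepsilon_{x}Q(x;v)$ , all of rank $1$ under the usual reading of Definition 4.6, and the $\varepsilon$ -matrices are $\varepsilon_{x}P(x)$ and $\varepsilon_{x}Q(x;a)$ . Writing $\mathsf{wd}$ for the width of Definition 4.12, this gives

$$\begin{aligned}
\mathsf{o}(\pi,1)&=3, & \mathsf{m}(\pi,1)&=2,\\
\mathsf{wd}(\pi,\varepsilon_{x}P(x))&=2+0=2, & \mathsf{wd}(\pi,\varepsilon_{x}Q(x;u))&=\mathsf{wd}(\pi,\varepsilon_{x}Q(x;v))=0+1=1,\\
\mathsf{mwd}(\pi,1)&=2, & \mathsf{cc}(\pi)&=\mathsf{cc}^{\varepsilon}(\pi)+\mathsf{cc}^{=}(\pi)=2+1=3.
\end{aligned}$$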

5 Yukami’s Trick

In this short section, we clarify the need for the restriction to $\varepsilon$ -matrices in the definition of $\varepsilon$ -equality axioms, cf. (3) and (4) (see also Definition 4.4).

For the sake of the argument we assume, for the duration of this section only, that the restriction to $\varepsilon$ -matrices is dropped. We focus on the above formulation of $\varepsilon$ -equality axioms, using vector notation, as expressed in (4). However, the argument given below is equally valid for Hilbert’s original definition (if we drop the restriction to $\varepsilon$ -matrices).

We will employ Yukami’s trick [Yuk84] together with folklore results in structural proof theory [Bus98, Pud98]. For additional insight into the proof-theoretic strength of applications of the identity schema, see [BF01].

Theorem 5.1 (cf. [Yuk84]).

Using two instances of the following restricted scheme of identity

[TABLE]

we can uniformly derive $0^{k}\mathrel{:=}\overbrace{0+(0+\cdots(0+0))}^{k{\rm\ times}}=0$ , from (i) $0+0=0$ , (ii) $\forall x,y,z\ x=y\land y=z\mathrel{\rightarrow}x=z$ , and (iii) $\forall x,y\ x+y=y\mathrel{\rightarrow}x=0$ .

Proof.

Let $r_{1}(0+0)\equiv 0^{n}+(0^{n-1}+\cdots+(0^{2}+0)\ldots)$ where $0+0$ is fully indicated. Let $r_{2}[0+0]\equiv 0^{n-1}+(0^{n-2}+\cdots+(0^{2}+(0+0))\ldots)$ , where $0+0$ in $r_{2}[0+0]$ refers only to the innermost occurring term $0+0$ . The following equalities can be easily derived (employing in addition suitable instances of the transitivity axiom (ii) and axiom (i))

[TABLE]

if we employ the instances of (5)

[TABLE]

and

[TABLE]

Hence we have derived $r_{1}(0+0)=r_{2}[0]$ . Eventually, to obtain the desired result, we apply axiom (iii), as $r_{1}(0+0)=r_{2}[0]$ is nothing else than $0^{n}+r_{2}[0]=r_{2}[0]$ ( $r_{2}[0]$ is indicated by $A$ above).

Note that the derivation is uniform for any $k\geq 0$ : while for any $k$ the proof slightly differs, the number of steps and in particular the critical count is constant. ∎

Remark 5.2.

Using induction one can derive (5) uniformly from (i) $\forall x\ s(x)\not=0$ and (ii) $\forall x\ x=x$ . That is, Yukami’s trick is available in any suitably rich arithmetical theory.

The next result clarifies that the restricted identity axioms employed in Yukami’s trick are uniformly derivable if no additional restriction on the form of the $\varepsilon$ -terms is enforced in (4). Let ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ denote the extension of the $\varepsilon$ -calculus $\mathbf{EC}_{\varepsilon}$ with the following axioms to cover $\varepsilon$ -equality:

[TABLE]

where $\varepsilon_{x}A(x;\vec{u})$ , $\varepsilon_{x}A(x;\vec{v})$ denote (arbitrary) $\varepsilon$ -terms.

Lemma 5.3.

The following identity schema, generalising (5), is derivable in ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ :

[TABLE]

where $g$ is an arbitrary term in $L(\mathbf{EC}_{\varepsilon}^{=})$ .

Proof.

Let $s,t\in L(\mathbf{EC}_{\varepsilon}^{=})$ . Consider the following two critical axioms:

[TABLE]

Thus ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ derives (i) $\varepsilon_{x}(x=g(s))=g(s)$ as well as (ii) $\varepsilon_{x}(x=g(t))=g(t)$ . We exploit the following $\varepsilon$ -equality axiom in ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$

[TABLE]

Assuming $s=t$ , we can thus derive (within ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ ) $\varepsilon_{x}(x=g(s))=\varepsilon_{x}(x=g(t))$ . Due to (i), (ii) and the equality axioms $\mathbf{EQ}$ , we thus obtain $g(s)=g(t)$ in ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ as claimed. It is important to emphasise that the $\varepsilon$ -term employed in (6) is not an $\varepsilon$ -matrix. ∎
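
The proof above can be summarised by the following chain, given here as a sketch reconstructed from the proof text. The first two lines are the critical formulas for $A(x)\equiv x=g(s)$ and $A(x)\equiv x=g(t)$ with witnesses $g(s)$ and $g(t)$ , the third is the unrestricted $\varepsilon$ -equality axiom, and the last follows from the first three by the equality axioms $\mathbf{EQ}$ :

$$\begin{aligned}
& g(s)=g(s)\to\varepsilon_{x}(x=g(s))=g(s),\\
& g(t)=g(t)\to\varepsilon_{x}(x=g(t))=g(t),\\
& s=t\to\varepsilon_{x}(x=g(s))=\varepsilon_{x}(x=g(t)),\\
& s=t\to g(s)=g(t).
\end{aligned}$$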

Before we can employ Yukami’s trick and the above lemma, we need some preparatory definitions and results.

Let $T$ be a theory. We say $T$ admits Herbrand’s theorem if whenever $T\vdash\exists\vec{x}\ E(\vec{x})$ , with $E(\vec{x})$ quantifier-free, then there exists a finite sequence of terms $\vec{t}_{0},\vec{t}_{1},\ldots,\vec{t}_{n}$ such that $T\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i})$ .
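
A standard illustration, not taken from the text: let $P$ be a hypothetical unary predicate symbol, $f$ a unary function symbol and $c$ a constant. The formula $\exists x\,(P(x)\to P(f(x)))$ is provable in pure predicate logic, and although no single instance $P(t)\to P(f(t))$ is a tautology, the terms $t_{0}=c$ and $t_{1}=f(c)$ (so $n=1$ ) yield the disjunction

$$(P(c)\to P(f(c)))\lor(P(f(c))\to P(f(f(c)))),$$

which is a propositional tautology: if $P(f(c))$ holds the first disjunct is true, and otherwise the second one is.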

Let $T$ be axiomatised by purely universal formulas. Then it is well-known that $T$ admits Herbrand’s theorem, cf. [Bus98]. Due to Theorem 3.10 we can even conclude the existence of a function $f\colon\mathbb{N}\to\mathbb{N}$ such that $n\leq f(k)$ , where $k$ denotes the critical count of the proof of $T\vdash\exists\vec{x}\ E(\vec{x})$ .

The next result improves upon this, in the sense that we also bound the term complexity of the sequence of terms $\vec{t}_{i}$ in the critical count. Let $\operatorname{\mathsf{dp}}(t)$ denote the depth of any term $t\in L(T)$ , defined in the usual way. Further, let $\operatorname{\mathsf{comp}}(F)$ denote the formula complexity of a formula $F$ in the language of $T$ . A variant of the following result is due to Krajíček and Pudlák (see [KP88]).

Theorem 5.4.

Suppose $T$ is a universal theory with $T\vdash_{\pi}\exists\vec{x}E(\vec{x})$ whose underlying equational theory (if any) has positive unification type. Then there exist a primitive recursive function $g$ and a finite sequence of terms $\vec{t}_{i}$ such that $T\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i})$ , where $n,\operatorname{\mathsf{dp}}(t_{i})\leq g(\mathsf{cc}(\pi),\operatorname{\mathsf{comp}}(E(\vec{x})))$ .

Proof.

Wlog. we assume that $T$ is axiomatised by quantifier-free formulas. As $\exists\vec{x}\ E(\vec{x})$ is provable in $T$ , there exists a conjunction of (quantifier-free) axioms $Ax$ in $T$ such that $\mathbf{PC}^{=}\vdash Ax\mathrel{\rightarrow}\exists\vec{x}E(\vec{x})$ . By the above, we conclude the existence of terms $\vec{t}_{i}$ and a primitive recursive function $f\colon\mathbb{N}\to\mathbb{N}$ with $n\leq f(k)$ , such that

[TABLE]

is a consequence of the underlying equational theory of $T$ , if any. As above, $k$ denotes the critical count of $\pi$ . It remains to prove the existence of the bounding function $g$ , bounding not only the number of terms $t_{i}$ , but also their term depth.

It is not difficult to formulate the property that the formula (7) is a quasi-tautology as a unification problem $U$ over the corresponding equational theory, cf. [KP88, Pud98]. As $U$ is solvable, there exists a most general unifier $\rho$ such that $(A\mathrel{\rightarrow}U)\rho$ is a quasi-tautology, that is, it follows from the equational theory of $T$ . The existence of $\rho$ follows as the equational theory has positive unification type by assumption. For each term $s$ in the range of $\rho$ , $\operatorname{\mathsf{dp}}(s)$ is bounded by a (monotone) function $h\colon\mathbb{N}\times\mathbb{N}\to\mathbb{N}$ , depending only on the input to the unification problem, that is, the length $n$ of the term sequence $\vec{t}_{0},\vec{t}_{1},\ldots,\vec{t}_{n}$ and the formula complexity of $E(\vec{x})$ .

Finally, it is easy to see how to define a bounding function $g$ such that (i) $n\leq f(\mathsf{cc}(\pi))\leq g(\mathsf{cc}(\pi),\operatorname{\mathsf{comp}}(E(\vec{x})))$ and (ii) $\operatorname{\mathsf{dp}}(s)\leq h(f(\mathsf{cc}(\pi)),\operatorname{\mathsf{comp}}(E(\vec{x})))\leq g(\mathsf{cc}(\pi),\operatorname{\mathsf{comp}}(E(\vec{x})))$ . ∎
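
One possible choice of the bounding function in the last step of the proof, given only as a sketch in terms of the functions $f$ and $h$ introduced above, is

$$g(c,m):=\max\{f(c),\,h(f(c),m)\}.$$

Condition (i) holds since $g(c,m)\geq f(c)$ , and condition (ii) uses the monotonicity of $h$ together with $n\leq f(\mathsf{cc}(\pi))$ to pass from $h(n,\operatorname{\mathsf{comp}}(E(\vec{x})))$ to $h(f(\mathsf{cc}(\pi)),\operatorname{\mathsf{comp}}(E(\vec{x})))$ . Provided $f$ and $h$ are primitive recursive, so is $g$ .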

We say that a theory $T$ admits bounded Herbrand complexity if whenever $T\vdash_{\pi}\exists\vec{x}\ E(\vec{x})$ , with $E(\vec{x})$ quantifier-free, there exist terms $\vec{t}_{i}$ such that $T\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i})$ and $n$ is bounded by a function depending only on the formula complexity of $E$ and the number of steps in $\pi$ . Note that, according to Theorem 5.4, any universal theory $T$ whose underlying equational theory (if any) has positive unification type admits bounded Herbrand complexity.

The next result clarifies that no theory $T$ can exist that admits bounded Herbrand complexity while at the same time deriving the assumptions of Theorem 5.1.

Corollary 5.5.

Let $T$ be a universal theory whose axioms include (i) $0+0=0$ , (ii) $\forall x,y,z\ x=y\land y=z\mathrel{\rightarrow}x=z$ , and (iii) $\forall x,y\ x+y=y\mathrel{\rightarrow}x=0$ and let the equational theory of $T$ (if any) be of positive unification type.

Such a theory $T$ cannot both admit bounded Herbrand complexity and derive the restricted identity schema (5).

Proof.

Suppose to the contrary that a theory $T$ exists whose underlying equational theory has positive unification type. Moreover, $T$ admits bounded Herbrand complexity and derives the assumed axioms. Thus, due to Theorem 5.1, $T$ uniformly derives $0^{k}=0$ for all $k\in\mathbb{N}$ . Further, for any $k$ , there exists a finite set $\{A_{1},\dots,A_{\ell}\}$ of (universally quantified) axioms of $T$ , in particular including axioms (i) to (iii), such that

[TABLE]

is provable in $\mathbf{PC}^{=}$ with proofs of constant critical count. Arguing as in the proof of Theorem 5.4, employing the assumption that $T$ admits bounded Herbrand complexity, we conclude the existence of terms $\vec{t}_{i}$ of bounded depth such that

[TABLE]

is a quasi-tautology. However, the depth of the terms $\vec{t}_{i}$ is bounded, while $k$ is unbounded, which is absurd. This contradicts the assumption that $T$ admits bounded Herbrand complexity. ∎

Finally, we arrive at the main result of this section, emphasising the need for the restriction to $\varepsilon$ -matrices in the epsilon calculus with equality, cf. Definition 4.4.

Theorem 5.6.

Let $T$ be finitely axiomatised by the axioms (i) $0+0=0$ , (ii) $\forall x,y,z\ x=y\land y=z\mathrel{\rightarrow}x=z$ , and (iii) $\forall x,y\ x+y=y\mathrel{\rightarrow}x=0$ and formalised over ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ . Then $T$ cannot admit bounded Herbrand complexity.

Proof.

Suppose to the contrary that $T$ admits bounded Herbrand complexity, that is, whenever $T\vdash_{\pi}\exists\vec{x}\ E(\vec{x})$ , with $E(\vec{x})$ quantifier-free, there exist terms $\vec{t}_{i}$ such that $T\vdash\bigvee_{i=0}^{n}E(\vec{t}_{i})$ and $n$ is bounded by a function depending only on the formula complexity of $E$ and the number of steps in $\pi$ .

Further, as $T$ is axiomatised over ${\mathbf{EC}^{\prime}}_{\varepsilon}^{=}$ , $T$ derives the restricted identity schema (5), cf. Lemma 5.3.

Finally, as the equational theory of $T$ is restricted to syntactic equality, the corresponding unification type is $1$ and the most general unifier of any unification problem is uniquely defined. But this contradicts Corollary 5.5, stating that no such theory can exist. ∎

6 First and Second Epsilon Theorems

Assume there is a proof in $\mathbf{EC}_{\varepsilon}^{=}$ of a formula $E$ which is free from bound variables. The first epsilon theorem states that there is an $\mathbf{EC}^{=}$ -proof of the same formula $E$ . The proof of the theorem proceeds by the epsilon elimination method, which replaces critical $\varepsilon$ -terms in a given $\mathbf{EC}_{\varepsilon}^{=}$ -proof by other terms while maintaining the correctness of the proof, so that all critical formulas and $\varepsilon$ -equality formulas in the proof are eliminated. Before we go into the general case, we sketch how epsilon elimination works through two very simple examples. The first covers the case where only a critical formula belongs to a critical $\varepsilon$ -term, and the second the case where both a critical formula and an $\varepsilon$ -equality formula belong to a critical $\varepsilon$ -term.

Example 6.1.

Consider an $\mathbf{EC}_{\varepsilon}$ -proof $\pi$ of the bound variable free formula $E$ involving only one critical formula of the following form.

$$A(t)\to A(\varepsilon_{x}A(x))$$

We eliminate this critical formula by generating two proofs of $A(t)\to E$ and of $\neg A(t)\to E$ as follows. Assume $A(t)$ is an axiom and replace all occurrences of $\varepsilon_{x}A(x)$ throughout the proof by $t$ ; then the above formula goes to $A(t\{\varepsilon_{x}A(x)/t\})\to A(t)$ , which is provable as $A(t)$ is our axiom. All the other formulas in $\pi$ are either propositional tautologies or obtained by modus ponens, hence we get a proof of $A(t)\to E$ without using any axiom. On the other hand, assuming $\neg A(t)$ is an axiom, the above critical formula is proved by ex falso quodlibet, hence we get a proof of $\neg A(t)\to E$ . Composing the above two proofs by means of excluded middle $A(t)\lor\neg A(t)$ , we obtain a proof of $E$ without a critical formula, hence it is an $\mathbf{EC}$ -proof of $E$ .
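
Schematically, and only as a summary of the steps just described, the elimination in this example combines three ingredients:

$$\begin{aligned}
&\vdash A(t)\to E &&\text{(replace }\varepsilon_{x}A(x)\text{ by }t\text{ throughout }\pi\text{)},\\
&\vdash\neg A(t)\to E &&\text{(the critical formula follows from }\neg A(t)\text{)},\\
&\vdash A(t)\lor\neg A(t) &&\text{(excluded middle)},
\end{aligned}$$

from which $E$ follows by propositional reasoning.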

Example 6.2.

Consider another $\mathbf{EC}_{\varepsilon}^{=}$ -proof of the bound variable free formula $E$ involving one critical formula and one $\varepsilon$ -equality formula of the following form, assuming $u$ and $v$ are $\varepsilon$ -free terms.

$$P(t;u)\to P(\varepsilon_{x}P(x;u);u),\qquad u=v\to\varepsilon_{x}P(x;u)=\varepsilon_{x}P(x;v)$$

We first eliminate this $\varepsilon$ -equality formula by generating two proofs concluding $u=v\to E$ and $\neg u=v\to E$ , where we still use a critical formula which is eliminated due to the above argument. Assume $u=v$ is an axiom and replace $\varepsilon_{x}P(x;u)$ throughout the proof by $\varepsilon_{x}P(x;v)$ . The critical formula goes to $P(t^{\prime};u)\to P(\varepsilon_{x}P(x;v);u)$ , where $t^{\prime}:=t\{\varepsilon_{x}P(x;u)/\varepsilon_{x}P(x;v)\}$ , which is proved by means of another critical formula $P(t^{\prime};v)\to P(\varepsilon_{x}P(x;v);v)$ and the identity formulas $P(t^{\prime};u)\to P(t^{\prime};v)$ and $P(\varepsilon_{x}P(x;v);v)\to P(\varepsilon_{x}P(x;v);u)$ from the axiom $u=v$ . The $\varepsilon$ -equality formula goes to $u=v\to\varepsilon_{x}P(x;v)=\varepsilon_{x}P(x;v)$ which is trivially true, and the identity formulas are trivially true, too. As all the other formulas are either propositional tautologies or obtained by modus ponens, we can then eliminate the critical formula to obtain an $\mathbf{EC}^{=}$ -proof of $u=v\to E$ . On the other hand, assuming $\neg u=v$ is an axiom, the above $\varepsilon$ -equality formula becomes trivially provable, hence by the previous argument we get an $\mathbf{EC}^{=}$ -proof of $\neg u=v\to E$ . Composing those two proofs using excluded middle $u=v\lor\neg u=v$ , we get a proof of $E$ involving a critical formula, which is eliminated by the previous argument.
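
Under the assumption $u=v$ , the transformed critical formula of this example is obtained by composing three implications, read off from the description above with $t^{\prime}$ as defined there:

$$\begin{aligned}
P(t^{\prime};u)&\to P(t^{\prime};v) &&\text{(identity formula from }u=v\text{)},\\
P(t^{\prime};v)&\to P(\varepsilon_{x}P(x;v);v) &&\text{(critical formula belonging to }\varepsilon_{x}P(x;v)\text{)},\\
P(\varepsilon_{x}P(x;v);v)&\to P(\varepsilon_{x}P(x;v);u) &&\text{(identity formula from }u=v\text{)}.
\end{aligned}$$

Composing the three lines yields $P(t^{\prime};u)\to P(\varepsilon_{x}P(x;v);u)$ .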

In the rest of this section, we address the general case, namely, the epsilon elimination method for proofs arbitrarily involving critical formulas and $\varepsilon$ -equality formulas. In case there are at least two different critical $\varepsilon$ -terms in a proof, we have to choose the right order in which to eliminate the critical $\varepsilon$ -terms, so that the above strategy works successfully. The following scenario illustrates an unsuccessful case. Consider a proof involving two different critical formulas.

[TABLE]

If we first try to eliminate $\varepsilon_{x}A(x,s)$ by substituting $t$ , the second formula goes to the following non-critical formula which is in general not provable.

[TABLE]

The following lemmas tell us about substitutions for a non-critical $\varepsilon$ -term in critical and $\varepsilon$ -equality formulas. As a consequence, when we eliminate a critical $\varepsilon$ -term $e$ of the greatest rank, critical and $\varepsilon$ -equality formulas not belonging to $e$ remain critical and $\varepsilon$ -equality formulas after replacing $e$ by any term.

Lemma 6.3.

Assume $e$ is an $\varepsilon$ -term, $\eta$ is a substitution $\{e/t\}$ where $t$ is some term, and $A$ is a critical formula with $\mathsf{rk}(A)\leq\mathsf{rk}(e)$ . If $A$ does not belong to $e$ , $A\eta$ is a critical formula of the rank $\mathsf{rk}(A)$ .

Proof.

Let $A$ be $B(t;\vec{u})\to B(\varepsilon_{x}B(x;\vec{u});\vec{u})$ . We first show that neither $t$ nor $\varepsilon_{x}B(x;\vec{u})$ is a proper subterm of $e$ . If $e\equiv_{\alpha}e^{\prime}(t)$ for some $\varepsilon$ -term $e^{\prime}(a)$ , $B(t;\vec{u})\equiv_{\alpha}B^{\prime}(e^{\prime}(t);\vec{u})$ for some $B^{\prime}(a;\vec{b})$ . Then, $e$ is subordinating the critical $\varepsilon$ -term $\varepsilon_{x}B(x;\vec{u})\equiv_{\alpha}\varepsilon_{x}B^{\prime}(e^{\prime}(x);\vec{u})$ and hence $\mathsf{rk}(e)<\mathsf{rk}(\varepsilon_{x}B(x;\vec{u}))=\mathsf{rk}(A)$ , which is contradictory. If $e\equiv_{\alpha}e^{\prime}(\varepsilon_{x}B(x;\vec{u}))$ for some $\varepsilon$ -term $e^{\prime}(a)$ , $\mathsf{deg}(e)>\mathsf{deg}(\varepsilon_{x}B(x;\vec{u}))$ holds, which is contradictory. As the occurrence of $e$ in $A$ is as subterms among $t$ and $\vec{u}$ , $A\eta$ is $B(t\eta;\vec{u}\eta)\to B(\varepsilon_{x}B(x;\vec{u}\eta);\vec{u}\eta)$ which is a critical formula of the rank $\mathsf{rk}(\varepsilon_{x}B(x;\vec{u}\eta))=\mathsf{rk}(\varepsilon_{x}B(x;\vec{u}))$ . ∎

Lemma 6.4.

Assume $e$ is an $\varepsilon$ -term and $A$ is an $\varepsilon$ -equality formula. If $A$ does not belong to $e$ , the formula $A\{e/t\}$ for any term $t$ is an $\varepsilon$ -equality formula of rank $\mathsf{rk}(A)$ .

Proof.

Let $A$ be $\vec{u}=\vec{v}\to\varepsilon_{x}B(x;\vec{u})=\varepsilon_{x}B(x;\vec{v})$ and let $\eta$ be the substitution $\{e/t\}$ . As $e\not\equiv_{\alpha}\varepsilon_{x}B(x;\vec{u})$ and $e\not\equiv_{\alpha}\varepsilon_{x}B(x;\vec{v})$ , the substitution can change only $\vec{u},\vec{v}$ , hence $A\eta$ is $\vec{u}\eta=\vec{v}\eta\to\varepsilon_{x}B(x;\vec{u}\eta)=\varepsilon_{x}B(x;\vec{v}\eta)$ , which is an $\varepsilon$ -equality formula. ∎

On the other hand, elimination of one critical $\varepsilon$ -term may increase the number of different critical $\varepsilon$ -terms. It would be a problem if it increased, in particular, the number of those of the greatest rank, because then the termination of our procedure would be in doubt. Consider a proof involving the following critical formulas belonging to the two different critical $\varepsilon$ -terms of the greatest rank.

[TABLE]

If we try to eliminate $\varepsilon_{x}A(x)$ first, we afterwards have three different critical $\varepsilon$ -terms, $\varepsilon_{y}B(y,\varepsilon_{x}A(x))$ , $\varepsilon_{y}B(y,s)$ , and $\varepsilon_{y}B(y,t)$ , which is more than the number we had. We eliminate a critical $\varepsilon$ -term of the greatest degree among $\varepsilon$ -terms of the greatest rank, in order not to change any subterm of critical $\varepsilon$ -terms of the greatest rank.

Lemma 6.5.

Let $A$ be a critical formula belonging to $e$ and $\eta$ be a substitution $\{e^{\prime}/t\}$ for an $\varepsilon$ -term $e^{\prime}$ with $e\not\equiv_{\alpha}e^{\prime}$ and a term $t$ . If $\mathsf{deg}(e)\leq\mathsf{deg}(e^{\prime})$ , then $A\eta$ is a critical formula belonging to $e$ . The same holds for an $\varepsilon$ -equality formula $A$ belonging to $e_{0},e_{1}$ and a substitution $\{e^{\prime}/t\}$ for an $\varepsilon$ -term $e^{\prime}$ with $e_{i}\not\equiv_{\alpha}e^{\prime}$ and $\mathsf{deg}(e_{i})\leq\mathsf{deg}(e^{\prime})$ for $i\in\{0,1\}$ .

Proof.

We prove that $e^{\prime}$ does not have an occurrence in $e$ . Suppose to the contrary that $e^{\prime}$ has an occurrence in $e$ . Then $\mathsf{deg}(e)>\mathsf{deg}(e^{\prime})$ , which contradicts the assumption $\mathsf{deg}(e)\leq\mathsf{deg}(e^{\prime})$ . The case of $\varepsilon$ -equality formulas is trivial. ∎

By eliminating a maximal critical $\varepsilon$ -term, the $\varepsilon$ -elimination method illustrated so far successfully decreases the number of different critical $\varepsilon$ -terms. By repeating this procedure, we can eliminate all critical $\varepsilon$ -terms of the greatest rank and decrease the rank of the proof. Finally, we eliminate all critical $\varepsilon$ -terms to obtain an $\mathbf{EC}^{=}$ -proof.

Lemma 6.6.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E$ for an $\varepsilon$ -free formula $E$ and let $r$ be $\mathsf{rk}(\pi)$ . If $e$ is a maximal critical $\varepsilon$ -term in $\pi$ and no $\varepsilon$ -equality formula belongs to $e$ , then $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi_{e}}E$ for some $\pi_{e}$ such that $\mathsf{rk}(\pi_{e})\leq r$ and $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ .

Proof.

Let $e$ be $\varepsilon_{x}A(x;\vec{v})$ and $w$ be $\mathsf{wd}(\pi,e)$ . All the critical formulas belonging to $e$ in $\pi$ can be listed as $A(t_{i};\vec{v})\to A(\varepsilon_{x}A(x;\vec{v});\vec{v})$ for $i<w$ . We form the proofs $\bar{\pi}$ and $\pi_{i}$ for $i<w$ such that $\vdash_{\bar{\pi}}\bigwedge_{i<w}\neg A(t_{i};\vec{v})\to E$ and $\vdash_{\pi_{i}}A(t_{i};\vec{v})\to E$ , from which $E$ is derivable by propositional calculus. In order to make $\bar{\pi}$ , we assume $\bigwedge_{i<w}\neg A(t_{i};\vec{v})$ ; then $A(t_{i};\vec{v})\to A(\varepsilon_{x}A(x;\vec{v});\vec{v})$ is provable from $\neg A(t_{i};\vec{v})$ without those critical formulas belonging to $e$ , hence we get $\bar{\pi}$ by Theorem 2.12. In order to make $\pi_{j}$ , we first replace $e$ in $\pi$ by $t_{j}$ , then prove $A(t_{i}\{e/t_{j}\};\vec{v})\to A(t_{j};\vec{v})$ for each $i<w$ by assuming $A(t_{j};\vec{v})$ . We apply modus ponens to the tautology $A(t_{j};\vec{v})\to A(t_{i}\{e/t_{j}\};\vec{v})\to A(t_{j};\vec{v})$ and $A(t_{j};\vec{v})$ . Assume $\pi_{e}$ is a proof of $E$ obtained by the above procedure; then it remains to prove that $\mathsf{rk}(\pi_{e})\leq\mathsf{rk}(\pi)$ and $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ . By the construction, critical formulas belonging to $e$ in $\pi$ do not remain in $\pi_{e}$ . For any critical $\varepsilon$ -term $e^{\prime}$ in $\pi$ , $\mathsf{rk}(e^{\prime})<\mathsf{rk}(e)$ implies that critical and $\varepsilon$ -equality formulas belonging to $e^{\prime}$ in $\pi$ are critical and $\varepsilon$ -equality formulas of the same rank in $\pi_{e}$ due to Lemma 6.3 and Lemma 6.4. For any critical $\varepsilon$ -term $e^{\prime}$ in $\pi$ , $\mathsf{rk}(e^{\prime})=\mathsf{rk}(e)$ and $e\not\equiv_{\alpha}e^{\prime}$ imply that critical and $\varepsilon$ -equality formulas belonging to $e^{\prime}$ are not affected by the substitutions for $e$ due to Lemma 6.5.
Therefore, the critical $\varepsilon$ -term $e$ does not occur in $\pi_{e}$ anymore, $\mathsf{rk}(\pi_{e})\leq r$ , and $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ . ∎
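
For instance, in the case $w=2$ and with the abbreviation $A_{i}:=A(t_{i};\vec{v})$ (introduced here only for readability), the final propositional step of the proof above combining $\bar{\pi}$ , $\pi_{0}$ and $\pi_{1}$ is an application of modus ponens to the tautology

$$\bigl((\neg A_{0}\land\neg A_{1})\to E\bigr)\to(A_{0}\to E)\to(A_{1}\to E)\to E.$$

If $E$ is false, then either some $A_{i}$ is true and the premise $A_{i}\to E$ fails, or both are false and the first premise fails.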

Note that for $r^{\prime}<\mathsf{rk}(\pi)$ , $\mathsf{o}(\pi_{e},r^{\prime})$ may be larger than $\mathsf{o}(\pi,r^{\prime})$ and also maximal degrees of critical and $\varepsilon$ -equality formulas of rank below $\mathsf{rk}(\pi)$ in $\pi_{e}$ may be strictly larger than the corresponding original ones in $\pi$ . Also, $\mathsf{wd}(\pi_{e},\mathsf{rk}(\pi))$ may be larger than $\mathsf{wd}(\pi,\mathsf{rk}(\pi))$ , because each premise of critical formulas may be changed by the substitutions. As we constantly decrease the order at the greatest rank, the termination is still guaranteed. We now study the epsilon elimination method for $\mathbf{EC}_{\varepsilon}^{=}$ . We use the following lemma on the identity formula.

Lemma 6.7.

Assume $A(a;\vec{b})$ has exactly one occurrence of each $b_{i}$ and for each $b_{i}$ , if it is a subterm of some term $t$ , $a\in\mathrm{FV}(t)$ holds. For any term $s$ there exists $\pi$ and $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}\vec{u}=\vec{v}\to A(s;\vec{u})\to A(s;\vec{v})$ , such that $\mathsf{cc}(\pi)\leq|\vec{b}|\cdot\mathsf{deg}(A(a;\vec{b}))$ and $\mathsf{rk}(\pi)\leq\mathsf{rk}(A(a;\vec{b}))$ .

Proof.

Let $r_{i}(a,\vec{b}_{i})$ for $0\leq i<k$ be the immediate subterms of $A(a;\vec{b})$ where the list of vectors $\vec{b}_{0},\ldots,\vec{b}_{k-1}$ is a split of $\vec{b}$ and $\{\vec{b}_{i}\}\subseteq\mathrm{FV}(r_{i}(a,\vec{b}_{i}))$ . For some $A^{*}(\vec{c}\,)$ , $A(a;\vec{b})\equiv_{\alpha}A^{*}(r_{0}(a,\vec{b}_{0}),\ldots,r_{k-1}(a,\vec{b}_{k-1}))$ . Assume $\vec{u}_{i}$ and $\vec{v}_{i}$ are the subvectors of $\vec{u}$ and $\vec{v}$ corresponding to $\vec{b}_{i}$ . By induction on the construction of $r_{i}(a,\vec{b}_{i})$ , there is a proof $\pi_{i}$ of $\vec{u}_{i}=\vec{v}_{i}\to r_{i}(s,\vec{u}_{i})=r_{i}(s,\vec{v}_{i})$ such that $\mathsf{cc}(\pi_{i})\leq|\vec{b}_{i}|\cdot\mathsf{deg}(r_{i}(a,\vec{b}_{i}))$ and $\mathsf{rk}(\pi_{i})\leq\mathsf{rk}(r_{i}(a,\vec{b}_{i}))$ for each $i$ . In case $r_{i}(a,\vec{b}_{i})$ is a variable $b_{ij}$ , it is trivial. In case $r_{i}(a,\vec{b}_{i})$ is a function term $f(r_{i0}(a,\vec{b}_{i0}),\ldots,r_{i(l-1)}(a,\vec{b}_{i(l-1)}))$ where the list $\vec{b}_{i0},\ldots,\vec{b}_{i(l-1)}$ is a split of $\vec{b}_{i}$ , for each $0\leq j<|\vec{b}_{i}|=:l$ , there is a proof $\pi_{ij}$ of $\vec{u}_{ij}=\vec{v}_{ij}\to r_{ij}(s,\vec{u}_{ij})=r_{ij}(s,\vec{v}_{ij})$ such that $\mathsf{cc}(\pi_{ij})\leq|\vec{b}_{ij}|\cdot\mathsf{deg}(r_{ij}(a,\vec{b}_{ij}))$ and $\mathsf{rk}(\pi_{ij})\leq\mathsf{rk}(r_{ij}(a,\vec{b}_{ij}))$ by induction hypothesis. The claim follows by $\mathbf{EQ}$ as $\mathsf{cc}(\pi_{i})\leq\sum_{j=0}^{l-1}|\vec{b}_{ij}|\cdot\mathsf{deg}(r_{ij}(a,\vec{b}_{ij}))\leq l\cdot\mathsf{deg}(r_{i}(a,\vec{b}_{i}))$ and $\mathsf{rk}(\pi_{i})\leq\max\{\mathsf{rk}(r_{ij}(a,\vec{b}_{ij}))\mid 0\leq j<l\}\leq\mathsf{rk}(r_{i}(a,\vec{b}_{i}))$ .
In case $r_{i}(a,\vec{b}_{i})$ is an $\varepsilon$ -term $r_{i}(a,\vec{b}_{i})\equiv_{\alpha}\varepsilon_{y}A_{i}(y;r_{i0}(a,\vec{b}_{i0}),\ldots,r_{i(l-1)}(a,\vec{b}_{i(l-1)}))$ where $\varepsilon_{y}A_{i}(y;\vec{c}_{i})$ is an $\varepsilon$ -matrix, for each $0\leq j<|\vec{b}_{i}|=:l$ , there is a proof $\pi_{ij}$ of $\vec{u}_{ij}=\vec{v}_{ij}\to r_{ij}(s,\vec{u}_{ij})=r_{ij}(s,\vec{v}_{ij})$ such that $\mathsf{cc}(\pi_{ij})\leq|\vec{b}_{ij}|\cdot\mathsf{deg}(r_{ij}(a,\vec{b}_{ij}))$ and $\mathsf{rk}(\pi_{ij})\leq\mathsf{rk}(r_{ij}(a,\vec{b}_{ij}))$ by induction hypothesis. The claim follows by $\varepsilon$ -equality formulas as $\mathsf{cc}(\pi_{i})\leq\left(\sum_{j=0}^{l-1}|\vec{b}_{ij}|\cdot\mathsf{deg}(r_{ij}(a,\vec{b}_{ij}))\right)+1\leq l\cdot\mathsf{deg}(r_{i}(a,\vec{b}_{i}))$ and $\mathsf{rk}(\pi_{i})\leq\max\{\mathsf{rk}(r_{ij}(a,\vec{b}_{ij}))\mid 0\leq j<l\}+1=\mathsf{rk}(r_{i}(a,\vec{b}_{i}))$ . From the proofs $\vec{\pi}$ we obtain $\pi$ by induction on $A^{*}(\vec{c})$ without further use of critical or $\varepsilon$ -equality formulas, so that $\mathsf{cc}(\pi)\leq\sum_{i=0}^{k-1}|\vec{b}_{i}|\cdot\mathsf{deg}(r_{i}(a,\vec{b}_{i}))\leq|\vec{b}|\cdot\mathsf{deg}(A(a;\vec{b}))$ and $\mathsf{rk}(\pi)\leq\max\{\mathsf{rk}(r_{i}(a,\vec{b}_{i}))\mid 0\leq i<k\}\leq\mathsf{rk}(A(a;\vec{b}))$ . ∎

We now deal with the case where both a critical formula and an $\varepsilon$ -equality formula belong to the same critical $\varepsilon$ -term. By the next lemmas, we replace such a critical formula and an $\varepsilon$ -equality formula, so that we obtain a proof whose order at the greatest rank is strictly smaller than the original one.

Lemma 6.8.

Let $\varepsilon_{x}A(x;\vec{b})=:g(\vec{b})$ be an $\varepsilon$ -matrix, and assume $\mathsf{deg}(g(\vec{v}))\leq\mathsf{deg}(g(\vec{u}))$ . There is an $\mathbf{EC}_{\varepsilon}^{=}$ -proof $\pi$ from the assumption $\vec{u}=\vec{v}$ such that it concludes $A(t_{i};\vec{u})\to A(\varepsilon_{x}A(x;\vec{v});\vec{u})$ for arbitrary terms $\vec{t}$ for each $i<|\vec{t}\,|$ , and moreover the following conditions hold: $\mathsf{rk}(\pi)=\mathsf{rk}(\varepsilon_{x}A(x;\vec{b}))$ , $\mathsf{cc}(\pi)\leq|\vec{b}|\cdot(|\vec{t}\,|+1)\cdot\mathsf{deg}(A(a;\vec{b}))+|\vec{t}\,|$ , and $\mathsf{wd}(\pi,\mathsf{rk}(\pi))=|\vec{t}\,|$ .

Proof.

For each $i$ , $\vec{u}=\vec{v}\vdash A(t_{i};\vec{u})\to A(t_{i};\vec{v})$ by Lemma 6.7. On the other hand, $\vdash A(t_{i};\vec{v})\to A(\varepsilon_{x}A(x;\vec{v});\vec{v})$ as it is a critical formula, and also $\vec{u}=\vec{v}\vdash A(\varepsilon_{x}A(x;\vec{v});\vec{v})\to A(\varepsilon_{x}A(x;\vec{v});\vec{u})$ by Lemma 6.7, hence we obtain $\pi$ using the deduction theorem. As $\mathsf{rk}(A(a;\vec{b}))\leq\mathsf{rk}(\varepsilon_{x}A(x;\vec{v}))$ , $\mathsf{rk}(\pi)=\mathsf{rk}(\varepsilon_{x}A(x;\vec{v}))$ , $\mathsf{cc}(\pi)\leq|\vec{b}|\cdot\mathsf{deg}(A(a;\vec{b}))\cdot|\vec{t}\,|+|\vec{t}\,|+|\vec{b}|\cdot\mathsf{deg}(A(a;\vec{b}))$ , and $\mathsf{wd}(\pi,\mathsf{rk}(\pi))=|\vec{t}\,|$ which comes from the number of critical formulas belonging to $g(\vec{b})$ . ∎
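
The bound on $\mathsf{cc}(\pi)$ in the statement of Lemma 6.8 is the sum from the proof regrouped. Writing $d$ for $\mathsf{deg}(A(a;\vec{b}))$ (an abbreviation used only here), and reading the last summand as a single application of Lemma 6.7 that does not depend on $i$ :

$$\underbrace{|\vec{b}|\cdot d\cdot|\vec{t}\,|}_{\text{Lemma 6.7, once per }t_{i}}+\underbrace{|\vec{t}\,|}_{\text{critical formulas}}+\underbrace{|\vec{b}|\cdot d}_{\text{Lemma 6.7, once}}=|\vec{b}|\cdot(|\vec{t}\,|+1)\cdot d+|\vec{t}\,|.$$

For example, with $|\vec{b}|=2$ , $|\vec{t}\,|=3$ and $d=1$ both sides evaluate to $6+3+2=2\cdot 4\cdot 1+3=11$ .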

Lemma 6.9.

Let $\varepsilon_{x}A(x;\vec{b})=:g(\vec{b})$ be an $\varepsilon$ -matrix, and assume $\mathsf{deg}(g(\vec{v}))\leq\mathsf{deg}(g(\vec{u}))$ and $\mathsf{deg}(g(\vec{w}))\leq\mathsf{deg}(g(\vec{u}))$ . There is an $\mathbf{EC}_{\varepsilon}^{=}$ -proof $\pi$ such that $\vec{v}=\vec{u}\vdash_{\pi}\vec{w}=\vec{u}\to\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{w})$ , $\mathsf{rk}(\pi)=\mathsf{rk}(\varepsilon_{x}A(x;\vec{b}))$ , $\mathsf{deg}(\pi)\leq\mathsf{deg}(\varepsilon_{x}A(x;\vec{u}))$ , and $\mathsf{cc}(\pi)=1$ .

Proof.

Assuming $\vec{v}=\vec{u}$ and $\vec{w}=\vec{u}$ , $\vec{v}=\vec{w}$ holds. By $\varepsilon$ -equality formula and modus ponens, $\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{w})$ . Then we use Theorem 2.12. ∎
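
Spelled out as a sketch, the proof above consists of the following steps:

$$\begin{aligned}
&\vec{v}=\vec{u},\ \vec{w}=\vec{u}\vdash\vec{v}=\vec{w} &&\text{(equality axioms)},\\
&\vdash\vec{v}=\vec{w}\to\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{w}) &&(\varepsilon\text{-equality formula}),\\
&\vec{v}=\vec{u},\ \vec{w}=\vec{u}\vdash\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{w}) &&\text{(modus ponens)},\\
&\vec{v}=\vec{u}\vdash\vec{w}=\vec{u}\to\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{w}) &&\text{(Theorem 2.12)}.
\end{aligned}$$

Only the second line contributes to the critical count, which gives $\mathsf{cc}(\pi)=1$ .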

Lemma 6.10.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E$ for a quantifier-free $E$ and $e$ is a maximal critical $\varepsilon$ -term in $\pi$ . Then $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi_{e}}E$ for some $\pi_{e}$ where $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ for $r=\mathsf{rk}(\pi)$ and $\mathsf{rk}(\pi_{e})\leq r$ .

Proof.

If there is no $\varepsilon$ -equality formula belonging to $e$ , we apply Lemma 6.6. Otherwise, assume $e$ is of the form $\varepsilon_{x}A(x;\vec{u})$ , and let $B_{i}$ be an $\varepsilon$ -equality formula $\vec{u}=\vec{v}_{i}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v}_{i})$ in $\pi$ for $i<w$ where $w:=\mathsf{wd}^{=}(\pi,e)$ . We form the proofs $\bar{\pi}$ of $(\bigwedge_{i<w}\neg\vec{u}=\vec{v}_{i})\to E$ and $\pi_{i}$ of $\vec{u}=\vec{v}_{i}\to E$ for $i<w$ , from which $E$ is derived, by the following procedure. Assuming $\bigwedge_{i<w}\neg\vec{u}=\vec{v}_{i}$ , all the $\varepsilon$ -equality formulas belonging to $e$ are provable in propositional calculus. If there are no critical formulas belonging to $e$ , we already have $\bar{\pi}$ , and otherwise we further apply Lemma 6.6 to get $\bar{\pi}$ . Concerning $\pi_{i}$ , assume $\vec{u}=\vec{v}_{i}$ and substitute $\varepsilon_{x}A(x;\vec{v}_{i})$ for $e$ throughout $\pi$ . Then, critical formulas of the form $A(t;\vec{u})\to A(\varepsilon_{x}A(x;\vec{u});\vec{u})$ in $\pi$ go to $A(t;\vec{u})\to A(\varepsilon_{x}A(x;\vec{v}_{i});\vec{u})$ , which is provable by Lemma 6.8. On the other hand, the $\varepsilon$ -equality formula $B_{i}$ is provable in propositional calculus, and the other $\varepsilon$ -equality formulas $B_{j}$ for $j\neq i$ are provable by Lemma 6.9. Let $\pi_{e}$ be a proof obtained by means of propositional calculus from $\bar{\pi}$ and $\vec{\pi}$ . Since all occurrences of the critical $\varepsilon$ -term $e$ in $\pi$ have been removed and the substitutions do not turn any other critical $\varepsilon$ -term $e^{\prime}$ different from $e$ into $e$ , there is no critical $\varepsilon$ -term $e$ in $\pi_{e}$ . It remains to prove that the obtained proof $\pi_{e}$ satisfies $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ for $r=\mathsf{rk}(\pi)$ and $\mathsf{rk}(\pi_{e})\leq r$ . As it is apparent for $\bar{\pi}$ , we consider each $\pi_{i}$ .
Although Lemma 6.8 introduces $\varepsilon$ -equality formulas belonging to a new $\varepsilon$ -term, those terms are of rank strictly below $r$ . Any critical formula of rank $r$ in each $\pi_{i}$ belongs to $\varepsilon_{x}A(x;\vec{v}_{i})$ , which is of the same rank $r$ , occurs in $\pi$ , and is distinct from $e$ . The $\varepsilon$ -equality formulas of rank $r$ used in Lemma 6.9 belong to some critical $\varepsilon$ -term of rank $r$ in $\pi$ which is different from $e$ . Therefore, $\mathsf{o}(\pi_{e},r)=\mathsf{o}(\pi,r)-1$ and $\mathsf{rk}(\pi_{e})\leq r$ hold. ∎

By repeatedly applying the results so far, we can lower the rank of the proof.

Lemma 6.11.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E$ for a quantifier-free $E$ and $e$ is a maximum critical $\varepsilon$ -term in $\pi$ , then $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\rho}E$ for some $\rho$ where $\mathsf{rk}(\rho)<\mathsf{rk}(\pi)$ .

Proof.

We make a sequence of proofs $\pi_{0},\pi_{1},\ldots,\pi_{n}$ for $n=\mathsf{o}(\pi,r)$ , where $\pi_{0}:=\pi$ and $r:=\mathsf{rk}(\pi)$ . If no $\varepsilon$ -equality formula belongs to the maximal critical $\varepsilon$ -term of $\pi_{i}$ , let $\pi_{i+1}$ be a proof obtained by applying Lemma 6.6 to $\pi_{i}$ , and otherwise let $\pi_{i+1}$ be a result of applying Lemma 6.10 to $\pi_{i}$ . As in any case the order is decreasing, $\mathsf{o}(\pi_{n},r)=0$ and hence $\mathsf{rk}(\pi_{n})<r$ , therefore we let $\rho$ be $\pi_{n}$ . ∎

Theorem 6.12 (First epsilon theorem).

If $E$ is a formula in $L(\mathbf{EC}^{=})$ and $\mathbf{EC}_{\varepsilon}^{=}\vdash E$ , then $\mathbf{EC}^{=}\vdash E$ .

Proof.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E$ . We make a sequence of proofs $\pi_{0},\pi_{1},\ldots,\pi_{r}$ for $r=\mathsf{rk}(\pi)$ , where $\pi_{0}:=\pi$ . In case $\mathsf{rk}(\pi_{i})=r-i$ , let $\pi_{i+1}$ be a proof obtained by applying Lemma 6.11 to $\pi_{i}$ , and otherwise let $\pi_{i+1}$ be $\pi_{i}$ . Then $\pi_{r}$ is the $\mathbf{EC}^{=}$ -proof of $E$ . ∎
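The nested iteration behind Lemma 6.11 and Theorem 6.12 can be sketched as a loop over an abstract measure. The Python sketch below is only an illustration under strong assumptions: a proof is modelled as a hypothetical dictionary from ranks to orders (the number of critical $\varepsilon$-terms of that rank), and `eliminate_maximal_term` merely stands in for Lemma 6.6 or Lemma 6.10 by decrementing the order at the top rank. It does not manipulate formulas, and the possible growth of orders at lower ranks is modelled by an assumed parameter `growth`.

```python
def rank(proof):
    """Rank of an abstract proof: the largest rank with a positive order, or 0."""
    ranks = [r for r, order in proof.items() if order > 0]
    return max(ranks, default=0)


def eliminate_maximal_term(proof, growth=1):
    """Stand-in for Lemma 6.6 / 6.10 (hypothetical model, not the real procedure).

    Removes one critical epsilon-term of maximal rank r, so the order at r
    drops by one. Orders at lower ranks may grow; `growth` is an assumed amount.
    """
    r = rank(proof)
    result = dict(proof)
    result[r] -= 1
    for lower in result:
        if 0 < lower < r:
            result[lower] += growth
    return result


def lower_rank(proof):
    """Stand-in for Lemma 6.11: repeat until no term of the top rank remains."""
    r = rank(proof)
    steps = 0
    while r > 0 and rank(proof) == r:
        proof = eliminate_maximal_term(proof)
        steps += 1
    return proof, steps


def eliminate_all(proof):
    """Stand-in for Theorem 6.12: lower the rank until it reaches 0."""
    total = 0
    while rank(proof) > 0:
        proof, steps = lower_rank(proof)
        total += steps
    return proof, total
```

The loop terminates because the order at the current top rank strictly decreases in every step, even though lower ranks may temporarily gain terms.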

We conclude this section by stating the second epsilon theorem [HB39]. Its proof follows by the $\varepsilon$-elimination method and requires nothing new.

Theorem 6.13 (Second epsilon theorem).

If $A$ is a formula in $L(\mathbf{PC}^{=})$ and $\mathbf{PC}_{\varepsilon}^{=}\vdash A$ , then $\mathbf{PC}^{=}\vdash A$ .

7 Extended First Epsilon Theorem

In contrast to Section 6, we now consider the case where the goal formula involves $\varepsilon$-terms, which leads us to the extended first epsilon theorem. Assume an $\mathbf{EC}_{\varepsilon}^{=}$-proof of a formula $E(\vec{s}\,)$ is given, where $\vec{s}$ may contain $\varepsilon$-terms. Eliminating critical formulas and $\varepsilon$-equality formulas, we obtain an $\mathbf{EC}^{=}$-proof of a quantifier-free formula $\bigvee_{i<n}E(\vec{t}_{i})$ for some terms $\vec{t}_{0},\vec{t}_{1},\ldots,\vec{t}_{n-1}$, which is called a Herbrand disjunction. The Herbrand complexity of $E(\vec{s}\,)$, denoted by $\mathrm{HC}(E(\vec{s}\,))$, is the smallest length $n$ of such a Herbrand disjunction. We first study the $\varepsilon$-elimination method to prove the extended first epsilon theorem without considering the Herbrand complexity; the complexity analysis then follows and yields an upper bound on the Herbrand complexity.
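A standard small example, not taken from the paper and not involving equality, may help to fix the notion of a Herbrand disjunction. Take $E(a):=P(a)\to P(f(a))$ with a constant $c$. No single instance $E(t)$ is a tautology, because $P(t)$ and $P(f(t))$ are distinct atoms, whereas $E(c)\vee E(f(c))$ is one. The Python sketch below checks this by brute force over truth assignments; the names `is_tautology`, `instance_0`, `instance_1` and `disjunction` are ours and purely illustrative.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


def is_tautology(formula, atoms):
    """Evaluate a propositional formula (a function of a valuation dict)
    under every truth assignment to the given atoms."""
    return all(
        formula(dict(zip(atoms, values)))
        for values in product([False, True], repeat=len(atoms))
    )


# Ground atoms treated as propositional variables.
ATOMS = ["P(c)", "P(f(c))", "P(f(f(c)))"]


def instance_0(v):
    # E(c): P(c) -> P(f(c))
    return implies(v["P(c)"], v["P(f(c))"])


def instance_1(v):
    # E(f(c)): P(f(c)) -> P(f(f(c)))
    return implies(v["P(f(c))"], v["P(f(f(c)))"])


def disjunction(v):
    # Candidate Herbrand disjunction of length 2: E(c) or E(f(c))
    return instance_0(v) or instance_1(v)
```

Here `is_tautology(instance_0, ATOMS)` is false while `is_tautology(disjunction, ATOMS)` is true, so for this formula a Herbrand disjunction needs length at least 2.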

7.1 Proof of extended first epsilon theorem

We start by describing how to eliminate maximal critical $\varepsilon$-terms. If there is a critical formula belonging to a critical $\varepsilon$-term of the maximal rank, we follow the $\varepsilon$-elimination method described in Section 6 for $\mathbf{EC}_{\varepsilon}^{=}$ [HB39]. Otherwise, only $\varepsilon$-equality formulas belong to critical $\varepsilon$-terms of the maximal rank. In that case we substitute a function symbol for the $\varepsilon$-matrix corresponding to the $\varepsilon$-term, in order to find a better upper bound of the Herbrand complexity.

The following lemma is for the first case, i.e. a critical formula is involved.

Lemma 7.1.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E(\vec{s}\,)$ for a formula $E(\vec{a})\in L(\mathbf{EC}^{=})$ and terms $\vec{s}\in L(\mathbf{EC}_{\varepsilon}^{=})$. If $e$ is the maximal $\varepsilon$-term of $\pi$, there is a proof $\pi_{e}$ with $\mathsf{rk}(\pi_{e})\leq\mathsf{rk}(\pi)$ and $\mathsf{o}(\pi_{e},r)<\mathsf{o}(\pi,r)$ for $r=\mathsf{rk}(\pi)$ such that $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi_{e}}\bigvee_{i=0}^{\mathsf{wd}_{\pi}(e)}E(\vec{s}_{i})$ for $\vec{s}_{i}\in L(\mathbf{EC}_{\varepsilon}^{=})$.

Proof.

Assume $\varepsilon_{x}A(x;\vec{u})=:e$ is the maximal critical $\varepsilon$-term in $\pi$, and that $\vec{v}_{i}=\vec{u}\to\varepsilon_{x}A(x;\vec{v}_{i})=\varepsilon_{x}A(x;\vec{u})$ for $0\leq i<\mathsf{wd}^{=}_{\pi}(e)=:l$ and $A(t_{j};\vec{u})\to A(\varepsilon_{x}A(x;\vec{u});\vec{u})$ for $0\leq j<\mathsf{wd}^{\varepsilon}_{\pi}(e)=:m$ occur in $\pi$ as $\varepsilon$-equality and critical formulas, respectively. We form a proof $\bar{\pi}$ of $(\bigwedge_{i=0}^{l-1}\vec{v}_{i}\neq\vec{u})\to\bigvee_{i=0}^{m}E(\vec{s}_{i})$, where $\vec{s}_{0}:=\vec{s}$ and $\vec{s}_{j}$ for $0<j\leq m$ is the result of replacing $e$ in $\vec{s}$ by $t_{j-1}$, and proofs $\pi_{i}$ of $\vec{v}_{i}=\vec{u}\to E(\vec{s}_{m+i+1})$, where $\vec{s}_{m+i+1}$ is the result of replacing $e$ in $\vec{s}$ by $\varepsilon_{x}A(x;\vec{v}_{i})$. To get the former proof, assume $\bigwedge_{i=0}^{l-1}\vec{v}_{i}\neq\vec{u}$; then all the $\varepsilon$-equality formulas belonging to $e$ are eliminated by ex falso quodlibet. By $\varepsilon$-elimination for $\mathbf{EC}_{\varepsilon}$, the remaining critical formulas belonging to $e$ are eliminated and we get a proof of $\bigvee_{j=0}^{m}(\bigwedge_{i=0}^{l-1}\vec{v}_{i}\neq\vec{u}\to E(\vec{s}_{j}))$. As $e$ is a maximal $\varepsilon$-term, $\vec{u}$ and $\vec{v}_{i}$ for $0\leq i<l$ are not affected by the substitution, hence $\bigwedge_{i=0}^{l-1}\vec{v}_{i}\neq\vec{u}\to\bigvee_{j=0}^{m}E(\vec{s}_{j})$ follows. To get the latter proof for $j$ such that $0\leq j<l$, assume $\vec{v}_{j}=\vec{u}$ and substitute $\varepsilon_{x}A(x;\vec{v}_{j})$ in $\pi$ for $e$. By Lemma 6.9, the modified $\varepsilon$-equality formulas in $\pi$ are provable after the substitution. By Lemma 6.8, the changed critical formulas which belonged to $e$ in $\pi$ are provable after the substitution.
We obtain $\pi_{e}$ by combining $\bar{\pi}$ and $\vec{\pi}$. In $\pi_{e}$ neither a critical formula nor an $\varepsilon$-equality formula belongs to $e$, $\mathsf{rk}(\pi_{e})\leq\mathsf{rk}(\pi)$ holds, and no $\varepsilon$-equality formula belongs to a new critical $\varepsilon$-term of rank $\mathsf{rk}(\pi)$. ∎

By repeatedly applying the above lemma, we obtain a proof of strictly smaller rank whose conclusion is a Herbrand disjunction.

Definition 7.2 (Critical ranks).

For a proof $\pi$, define the set $\mathcal{CR}(\pi)$ of the ranks of critical formulas by $\{\mathsf{rk}(e)\mid\text{a critical formula belongs to }e\}$.

Lemma 7.3.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E(\vec{s}\,)$ for $E(\vec{a})\in L(\mathbf{EC}^{=})$ and $\vec{s}\in L(\mathbf{EC}_{\varepsilon}^{=})$ . If there is a critical formula of the rank $\mathsf{rk}(\pi)=:r$ , there is a proof $\pi^{\prime}$ such that $\mathsf{rk}(\pi^{\prime})<r$ , $\mathcal{CR}(\pi^{\prime})=\mathcal{CR}(\pi)\setminus\{r\}$ , and $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi^{\prime}}\bigvee_{i=0}^{n}E(\vec{s}_{i})$ where $\vec{s}_{0},\ldots,\vec{s}_{n}\in L(\mathbf{EC}_{\varepsilon}^{=})$ for some $n$ .

Proof.

At most $\mathsf{o}(\pi,r)$ applications of Lemma 7.1 eliminate all the critical $\varepsilon$-terms of rank $\mathsf{rk}(\pi)$ in $\pi$. ∎

The function symbol substitution is applicable for eliminating $\varepsilon$ -equality formulas, provided no critical formula belongs to any $\varepsilon$ -term of those $\varepsilon$ -equality formulas.

Definition 7.4 (Function symbol substitution).

Assume $e$ is a critical $\varepsilon$ -term of the form $g(\vec{u})$ , where $g$ is the $\varepsilon$ -matrix of $e$ , and let $f$ be a function symbol of the arity $|\vec{u}|$ which is uniquely assigned to $g$ . The substitution $\{g(\vec{u})/f(\vec{u})\}$ is the function symbol substitution for $e$ .
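As a minimal sketch of Definition 7.4, the following Python code applies the substitution $\{g(\vec{u})/f(\vec{u})\}$ to a term. The term encoding is our own assumption for illustration: an $\varepsilon$-term with matrix `g` is the tuple `("eps", "g", args)`, a function application is `("fn", name, args)`, and variables or constants are plain strings. The helper names are hypothetical.

```python
def substitute(term, target, replacement):
    """Replace every occurrence of `target` inside `term` by `replacement`."""
    if term == target:
        return replacement
    if isinstance(term, tuple):
        head, name, args = term
        new_args = tuple(substitute(a, target, replacement) for a in args)
        return (head, name, new_args)
    return term  # variables and constants are plain strings


def function_symbol_substitution(term, eps_term, fresh_symbol):
    """Apply {g(u)/f(u)} for the epsilon-term eps_term = g(u).

    `fresh_symbol` plays the role of the function symbol f that is
    uniquely assigned to the epsilon-matrix g.
    """
    head, _matrix, args = eps_term
    assert head == "eps", "the substituted term must be an epsilon-term"
    return substitute(term, eps_term, ("fn", fresh_symbol, args))
```

For example, with `e = ("eps", "g", ("c",))`, the term `("fn", "h", (e, ("eps", "g", ("d",))))` is mapped to `("fn", "h", (("fn", "f", ("c",)), ("eps", "g", ("d",))))`: only the occurrence of `e` itself is replaced, and the other $\varepsilon$-term of the same matrix is handled by a later substitution.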

We are particularly interested in using the function symbol substitution in the case where only $\varepsilon$-equality formulas are of the maximal rank.

Lemma 7.5.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E(\vec{e}\,)$ for $E(\vec{a})\in L(\mathbf{EC}^{=})$ and for $\varepsilon$-terms $\vec{e}$. If only $\varepsilon$-equality formulas are of rank $\mathsf{rk}(\pi)$, there is an $\mathbf{EC}_{\varepsilon}^{=}$-proof $\pi^{\prime}$ of $E(\vec{t}\,)$ for some terms $\vec{t}$ such that $\mathsf{rk}(\pi^{\prime})<\mathsf{rk}(\pi)$, $\mathcal{CR}(\pi^{\prime})=\mathcal{CR}(\pi)$, and $\mathsf{cc}(\pi^{\prime})<\mathsf{cc}(\pi)$. Moreover, for any $r^{\prime}<\mathsf{rk}(\pi)$, $\mathsf{o}(\pi^{\prime},r^{\prime})\leq\mathsf{o}(\pi,r^{\prime})$ and $\mathsf{m}(\pi^{\prime},r^{\prime})=\mathsf{m}(\pi,r^{\prime})$.

Proof.

Let $r$ be the rank $\mathsf{rk}(\pi)$. We repeatedly replace the maximal critical $\varepsilon$-term of rank $r$ through the corresponding function symbol substitution. After $\mathsf{o}(\pi,r)$ replacements, there is no more critical $\varepsilon$-term of rank $r$ and the process terminates. After the substitutions, each $\varepsilon$-equality formula of rank $r$ in $\pi$ is an identity formula for a function symbol $f$. By Lemma 6.5, each critical formula (respectively, $\varepsilon$-equality formula) of rank strictly smaller than $\mathsf{rk}(\pi)$ is, after the substitution, again a critical formula (respectively, $\varepsilon$-equality formula) belonging to the same critical $\varepsilon$-term. Hence what we obtain is, for some terms $\vec{t}$, an $\mathbf{EC}_{\varepsilon}^{=}$-proof $\pi^{\prime}$ of $E(\vec{t}\,)$ with the stated conditions. ∎

Lemma 7.6.

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E(\vec{s}\,)$ for $E(\vec{a})\in L(\mathbf{EC}^{=})$ and $\vec{s}\in L(\mathbf{EC}_{\varepsilon}^{=})$ . There is an $\mathbf{EC}_{\varepsilon}^{=}$ -proof $\pi^{\prime}$ of the formula $E(\vec{t}\,)$ for $\vec{t}\in L(\mathbf{EC}_{\varepsilon}^{=})$ such that $\mathsf{o}(\pi^{\prime})\leq\mathsf{o}(\pi)$ and $\mathcal{CR}(\pi^{\prime})=\mathcal{CR}(\pi)$ . Moreover, $\mathcal{CR}(\pi)$ is empty or $\mathsf{rk}(\pi^{\prime})\in\mathcal{CR}(\pi^{\prime})$ .

Proof.

Assume $\mathsf{rk}(\pi)\not\in\mathcal{CR}(\pi)$, namely, there are $r_{1},r_{2},\ldots,r_{n}>\max(\mathcal{CR}(\pi))$ and for each $r_{i}$ there are $\varepsilon$-equality formulas of rank $r_{i}$ in $\pi$. We apply Lemma 7.5 $n$ times. ∎

Remark 7.7.

Even when there are both critical and $\varepsilon$-equality formulas of the maximal rank, it is still possible to use the function symbol substitution to eliminate $\varepsilon$-equality formulas, provided there is no critical formula belonging to any critical $\varepsilon$-term of those $\varepsilon$-equality formulas. During the substitution process, intermediate objects may fail to be proofs because formulas of the form $\vec{u}=\vec{v}\to f(\vec{u})=\varepsilon_{x}A(x;\vec{v})$ are present, but eventually these formulas take the form $\vec{u}=\vec{v}\to f(\vec{u})=f(\vec{v})$, an instance of the identity formula for a function symbol.

Theorem 7.8 (Extended first epsilon theorem for $\mathbf{EC}_{\varepsilon}^{=}$ ).

Assume $\mathbf{EC}_{\varepsilon}^{=}\vdash_{\pi}E(\vec{s}\,)$ for $E(\vec{a})\in L(\mathbf{EC}^{=})$ and $\vec{s}\in L(\mathbf{EC}_{\varepsilon}^{=})$ . There is a proof $\pi^{\prime}$ such that $\mathbf{EC}^{=}\vdash_{\pi^{\prime}}\bigvee_{i=0}^{n}E(\vec{s}_{i})$ for some $n$ and $\vec{s}_{0},\ldots,\vec{s}_{n}\in L(\mathbf{EC})$ .

Proof.

We make a sequence of proofs $\pi_{0},\pi_{1},\ldots,\pi_{m}$ for $m=|\mathcal{CR}(\pi)|$ in the following way. Let $\pi_{0}$ be a result of applying Lemma 7.6 to $\pi$ . If $\mathcal{CR}(\pi_{i})$ is not empty, let $\pi_{i+1}$ be a proof obtained by applying Lemma 7.3 and then Lemma 7.6 to $\pi_{i}$ . For each $i$ , $\mathcal{CR}(\pi_{i+1})=\mathcal{CR}(\pi_{i})\setminus\{\mathsf{rk}(\pi_{i})\}$ . Since $\mathsf{rk}(\pi_{m})=0$ , $\pi_{m}$ is an $\mathbf{EC}^{=}$ -proof. Remaining occurrences of $\varepsilon$ -terms may be replaced by free variables. ∎

7.2 Complexity analysis

We make a detailed analysis of the proofs of the previous subsection in order to compute a numerical bound on the length of the disjunction in Theorem 7.8. To do so, we consider the property degree as a means of measuring the complexity of a critical formula. Given a critical formula $A(t)\to A(\varepsilon_{x}A(x))$, we can determine the formula $A(a)$. The property degree counts the number of $\varepsilon$-terms with a free variable occurrence of $a$. If $A(a)$ is of the form $A^{\prime}(a;b)$, this number tells us at most how many $\varepsilon$-equality formulas are needed to prove the identity formula $u=v\to A^{\prime}(a;u)\to A^{\prime}(a;v)$. We define the property degree and the maximal property degree for an $\varepsilon$-term as follows.

Definition 7.9 (Property degree).

For an $\varepsilon$-term $e$, the property degree $\mathsf{pd}(e)$ is defined to be $\max\{\mathsf{deg}(t)\mid t\text{ subordinates }e\}$. The maximal property degree $\mathsf{mpd}(\pi,r)$ of a proof $\pi$ of rank $r$ is defined to be

[TABLE]

Also $\mathsf{mpd}(\pi)$ is defined to be

[TABLE]

We give the following results concerning the property degree.

Lemma 7.10.

For any $\vec{u},\vec{v}$ and $\varepsilon$ -matrix $g(\vec{b})$ , $\mathsf{pd}(g(\vec{u}))=\mathsf{pd}(g(\vec{v}))$ . For a given proof $\pi$ , the set of critical $\varepsilon$ -matrices which belong to critical formulas does not increase through the $\varepsilon$ -elimination.

Proof.

The first part is trivial. Concerning the second part, we consider just Lemma 6.8, because a new critical formula is introduced only if both critical formulas and $\varepsilon$-equality formulas belong to a critical term at the same time. The critical formulas introduced in that lemma belong to an $\varepsilon$-term of an $\varepsilon$-matrix in the original proof, hence the claim holds. The third part is trivial since the epsilon elimination method does not add a new $\varepsilon$-matrix. ∎

Applying the epsilon elimination method, the maximal property degree and the maximal arity are weakly decreasing. Therefore, we can compute them at the very beginning and keep referring to them as the upper bounds of the property degrees and the arities through the whole elimination procedure.

Lemma 7.11.

The epsilon elimination method does not increase the maximal property degree nor the maximal arity.

Proof.

Notice that the method does not introduce any new $\varepsilon$ -matrix. ∎

As a consequence, the upper bounds of the critical counts in Lemma 6.7 and Lemma 6.8 depend only on an initial proof.

Lemma 7.12 (Elimination for critical formulas).

Let $E(\vec{a})$ be a formula in $L(\mathbf{EC}^{=})$, $\vec{s}$ terms of $L(\mathbf{EC}_{\varepsilon}^{=})$, and $\pi$ an $\mathbf{EC}_{\varepsilon}^{=}$-proof of $E(\vec{s})$ in which only critical formulas belong to its maximal $\varepsilon$-term $e:=\varepsilon_{x}A(x;\vec{v})$. There are terms $\vec{s}_{i}$ in $L(\mathbf{EC}_{\varepsilon}^{=})$ and an $\mathbf{EC}_{\varepsilon}^{=}$-proof $\pi_{e}$ of $\bigvee_{i=0}^{w}E(\vec{s}_{i})$ for $w=\mathsf{wd}_{\pi}(e)$. Moreover, $\mathsf{cc}(\pi_{e})\leq\mathsf{cc}(\pi)\cdot(w+1)$ and $\mathsf{mwd}(\pi_{e},r)\leq\mathsf{mwd}(\pi,r)\cdot(w+1)\leq(\mathsf{mwd}(\pi,r))^{2}$ hold for $r=\mathsf{rk}(\pi)$.

Proof.

Straightforward from Lemma 21 by Moser and Zach [MZ06]. ∎

Lemma 7.13 (Elimination for $\varepsilon$ -equality formulas).

Let $E(\vec{a})$ be a formula in $L(\mathbf{EC}^{=})$, $\vec{s}$ terms of $L(\mathbf{EC}_{\varepsilon}^{=})$, and $\pi$ an $\mathbf{EC}_{\varepsilon}^{=}$-proof of $E(\vec{s})$ in which only $\varepsilon$-equality formulas belong to its maximal $\varepsilon$-term $e:=\varepsilon_{x}A(x;\vec{v})$. There are terms $\vec{s}_{i}$ in $L(\mathbf{EC}_{\varepsilon}^{=})$ and an $\mathbf{EC}_{\varepsilon}^{=}$-proof $\pi_{e}$ of $\bigvee_{i=0}^{w}E(\vec{s}_{i})$ for $w=\mathsf{wd}_{\pi}(e)$. Moreover, $\mathsf{cc}(\pi_{e})\leq k\cdot(w+1)-2\cdot w$ and $\mathsf{mwd}(\pi_{e},r)\leq\mathsf{mwd}(\pi,r)\cdot(w+1)-2\cdot w$ hold for $r=\mathsf{rk}(\pi)$ and $k=\mathsf{cc}(\pi)$.

Proof.

Assume the $\varepsilon$-equality formulas belonging to $e$ are $\vec{u}_{i}=\vec{v}\to\varepsilon_{x}A(x;\vec{u}_{i})=\varepsilon_{x}A(x;\vec{v}\,)$ for $0\leq i<w$. There is a proof $\bar{\pi}$ of $(\bigwedge_{i=0}^{w-1}\vec{u}_{i}\neq\vec{v})\to E(\vec{s}\,)$ satisfying $\mathsf{cc}(\bar{\pi})\leq k-w$ and $\mathsf{mwd}(\bar{\pi},r)\leq\mathsf{mwd}(\pi,r)-w$. On the other hand, there are proofs $\pi_{i}$ of $\vec{u}_{i}=\vec{v}\to\varepsilon_{x}A(x;\vec{u}_{i})=\varepsilon_{x}A(x;\vec{v})$ for $0\leq i<w$ with $\mathsf{cc}(\pi_{i})\leq k-1$ and $\mathsf{mwd}(\pi_{i},r)\leq\mathsf{mwd}(\pi,r)-1$. In order to get $\pi_{i}$ we replace all occurrences of $e$ in $\pi$ by $\varepsilon_{x}A(x;\vec{u}_{i})$, and give a proof of $\vec{u}_{j}=\vec{v}\to\varepsilon_{x}A(x;\vec{u}_{j})=\varepsilon_{x}A(x;\vec{u}_{i})$. It is trivial if $i=j$, and otherwise we apply Lemma 6.9. We obtain $\pi_{e}$ by combining $\vec{\pi}$ and $\bar{\pi}$; it satisfies $\mathsf{cc}(\pi_{e})\leq(w+1)\cdot k-2\cdot w$ and $\mathsf{mwd}(\pi_{e},r)\leq(w+1)\cdot\mathsf{mwd}(\pi,r)-2\cdot w$. ∎
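The final bound in the proof above is the sum of the bounds for the $w+1$ sub-proofs: $(k-w)$ for $\bar{\pi}$ plus $w$ times $(k-1)$ for the $\pi_{i}$. The short Python check below confirms that this sum equals $(w+1)\cdot k-2\cdot w$. It assumes, as our reading of the argument, that the critical count of the combined proof is bounded by the sum of the critical counts of its parts; the function names are illustrative.

```python
def combined_bound(k, w):
    """Sum of the critical-count bounds of the w + 1 sub-proofs."""
    bar_pi = k - w       # bound for the proof under the negated premises
    each_pi_i = k - 1    # bound for each of the w proofs pi_i
    return bar_pi + w * each_pi_i


def stated_bound(k, w):
    """The bound stated in Lemma 7.13: (w + 1) * k - 2 * w."""
    return (w + 1) * k - 2 * w
```

The same computation with $\mathsf{mwd}(\pi,r)$ in place of $k$ gives the bound on the maximal width.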

The following lemma concerns the case where both critical and $\varepsilon$-equality formulas belong to the maximal critical $\varepsilon$-term at the same time.

Lemma 7.14 (Eliminating a maximal critical $\varepsilon$ -term).

We deal with the case where both critical and $\varepsilon$-equality formulas belong to the maximal critical $\varepsilon$-term at the same time. Assume $\pi$ is an $\mathbf{EC}_{\varepsilon}^{=}$-proof, $e$ is the maximal $\varepsilon$-term of $\pi$, and $g$ is the $\varepsilon$-matrix of $e$. Let $a$ be the arity of $g$, $p$ the property degree of $e$, and $r$ the rank of $\pi$. The critical count and the maximal width of $\pi_{e}$ at $r$ obtained in Lemma 7.1 are bounded as follows: $\mathsf{mwd}(\pi_{e},r)\leq 2\cdot(\mathsf{mwd}(\pi,r))^{2}$ and $\mathsf{cc}(\pi_{e})\leq(\mathsf{cc}(\pi)+a\cdot p)\cdot(\mathsf{mwd}(\pi,r))^{2}$.

Proof.

Assume $e$ is of the form $\varepsilon_{x}A(x;\vec{u})$. The proof $\bar{\pi}$ of $(\bigwedge_{j<\mathsf{wd}^{=}_{\pi}(e)}\vec{v}_{j}\neq\vec{u})\to E$ is obtained from $\bar{\sigma}$ of $(\bigwedge_{i<\mathsf{wd}^{\varepsilon}_{\pi}(e)}\neg A(t_{i};\vec{u}))\to(\bigwedge_{j<\mathsf{wd}^{=}_{\pi}(e)}\vec{v}_{j}\neq\vec{u})\to E$ and $\sigma_{i}$ of $A(t_{i};\vec{u})\to(\bigwedge_{j<\mathsf{wd}^{=}_{\pi}(e)}\vec{v}_{j}\neq\vec{u})\to E$ for $i<\mathsf{wd}^{\varepsilon}_{\pi}(e)$. By removing all critical and $\varepsilon$-equality formulas belonging to $e$ using ex falso quodlibet we obtain $\bar{\sigma}$, hence $\mathsf{cc}(\bar{\sigma})\leq\mathsf{cc}(\pi)-\mathsf{wd}_{\pi}(e)$. On the other hand, since each critical formula belonging to $e$ in $\pi$ is gone due to $\varepsilon$-elimination, $\mathsf{cc}(\sigma_{i})\leq\mathsf{cc}(\pi)-\mathsf{wd}_{\pi}(e)$. Note that each $\varepsilon$-equality formula belonging to $e$ is removed due to the premise $\bigwedge_{j<\mathsf{wd}^{=}_{\pi}(e)}\vec{v}_{j}\neq\vec{u}$. Therefore,

[TABLE]

On the other hand concerning the width, the following condition holds.

[TABLE]

The proof $\pi_{j}$ of $\vec{v}_{j}=\vec{u}\to E$ is obtained by $\varepsilon$-elimination, replacing $e$ by $\varepsilon_{x}A(x;\vec{v}_{j})$. All $\varepsilon$-equality formulas belonging to $e$ are gone, each critical formula belonging to $e$ is replaced by a critical formula belonging to $\varepsilon_{x}A(x;\vec{v}_{j})$, and there are additional $\varepsilon$-equality formulas due to Lemma 6.8. Therefore,

[TABLE]

and

[TABLE]

Considering the construction of $\pi_{e}$ ,

[TABLE]

and

[TABLE]

Note that $\mathsf{wd}_{\pi}(e)\geq 2$, $\mathsf{wd}_{\pi}(e)>\mathsf{wd}^{\varepsilon}_{\pi}(e)$, and $\mathsf{wd}_{\pi}(e)>\mathsf{wd}^{=}_{\pi}(e)$ hold. We conclude the claimed bounds due to Lemma 7.12 and Lemma 7.13. ∎

The notation of hyperexponentiation is useful for representing large numbers. We define hyperexponentiation and state some arithmetic facts about exponentiation and hyperexponentiation.

Definition 7.15 (Hyperexponentiation).

For natural numbers $k,n,m$ , $k_{n}^{m}$ is defined to be

[TABLE]
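As a hedged illustration of the notation $k_{n}^{m}$, the Python sketch below assumes the usual reading suggested by later uses such as $2^{2^{w+n+1}}\leq 2_{2}^{3k+1}$: a tower of $n$ copies of $k$ topped by the exponent $m$, that is, $k_{0}^{m}=m$ and $k_{n+1}^{m}=k^{k_{n}^{m}}$. The function name `hyperexp` is ours.

```python
def hyperexp(k, n, m):
    """k_n^m under the assumed reading: a tower of n copies of k topped by m.

    hyperexp(k, 0, m) == m and hyperexp(k, n + 1, m) == k ** hyperexp(k, n, m).
    """
    value = m
    for _ in range(n):
        value = k ** value
    return value
```

Under this reading `hyperexp(2, 2, 3)` is $2^{2^{3}}=256$. The values grow very quickly, so the function is only practical for small arguments.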

Lemma 7.16.

For natural numbers $x,y$ , the following formulas hold.

[TABLE]

Lemma 7.17 (Eliminating critical $\varepsilon$ -terms of maximal rank).

Assume $\pi$ is an $\mathbf{EC}$ -proof of $E$ in Lemma 7.3. Let $r$ , $n$ , $w$ , $a$ , and $p$ be $\mathsf{rk}(\pi)$ , $\mathsf{o}(\pi,r)$ , $\mathsf{mwd}(\pi,r)$ , the maximal arity $\mathsf{ma}(\pi,r)$ , and the maximal property degree $\mathsf{mpd}(\pi,r)$ , respectively. The critical count of $\pi^{\prime}$ and the length of disjunction of $E^{\prime}$ obtained in Lemma 7.3, such that $\mathsf{rk}(\pi^{\prime})<\mathsf{rk}(\pi)$ , are bounded as follows: $\mathsf{cc}(\pi^{\prime})\leq(k+a\cdot p)\cdot 2^{2^{(w+n)\cdot n}}\leq(k+a\cdot p)\cdot 2_{2}^{3k^{2}}$ and $\mathsf{len}(E,E^{\prime})\leq 2^{2^{w+n+1}}\leq 2_{2}^{3k+1}$ for $k=\mathsf{cc}(\pi)$ .

Proof.

Consider a list of proofs $\rho_{0},\rho_{1},\rho_{2},\ldots,\rho_{n}$ where $\rho_{0}=\pi$ , $\rho_{n}=\pi^{\prime}$ , and each $\rho_{j+1}$ is obtained by means of Lemma 7.14 from $\rho_{j}$ , eliminating a critical $\varepsilon$ -term of rank $r$ . We prove by induction on $j$ that $\mathsf{mwd}(\rho_{j},r)\leq 2^{2^{j}-1}w^{2^{j}}$ . In case $j=0$ , $\mathsf{mwd}(\rho_{j},r)=w$ and $2^{2^{0}-1}w^{2^{0}}=w$ . In case $j\mapsto j+1$ ,

[TABLE]

Also, $2^{2^{j}-1}\cdot w^{2^{j}}\leq 2^{2^{j}-1}\cdot 2^{w\cdot 2^{j}}\leq 2^{(w+1)\cdot 2^{j}}\leq 2^{w+j}_{2}$; therefore, taking $j=n$, $\mathsf{mwd}(\pi^{\prime},r)\leq 2_{2}^{w+n}$ holds. Let $k=\mathsf{cc}(\pi)$. We also prove by induction on $j$ that

[TABLE]

hence $\mathsf{cc}(\pi^{\prime})\leq(k+a\cdot p)\cdot 2^{2^{n(w+n)}}$ . Since $w=\mathsf{mwd}(\pi)\leq\mathsf{cc}(\pi)$ and $n=\mathsf{o}(\pi,r)\leq 2\cdot\mathsf{cc}(\pi)$ , $\mathsf{cc}(\pi^{\prime})\leq(k+a\cdot p)\cdot 2_{2}^{3k^{2}}$ holds. The bound of the length of disjunction $\mathsf{len}(E,E^{\prime})$ is $2^{2^{n+w+1}}$ because of the following calculation, hence $\mathsf{len}(E,E^{\prime})\leq 2^{2^{3k+1}}$ .

[TABLE]

∎
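The induction on $j$ for the maximal width in the proof above amounts to iterating the worst case of the bound $\mathsf{mwd}(\rho_{j+1},r)\leq 2\cdot(\mathsf{mwd}(\rho_{j},r))^{2}$ from Lemma 7.14, starting at $w$. The following Python check, which is an illustration with names of our own choosing, compares the iterated value with the closed form $2^{2^{j}-1}w^{2^{j}}$ and with the coarser double exponential $2^{2^{w+j}}$.

```python
def width_after(j, w):
    """Iterate value -> 2 * value**2 exactly j times, starting from w."""
    value = w
    for _ in range(j):
        value = 2 * value * value
    return value


def closed_form(j, w):
    """The closed form 2^(2^j - 1) * w^(2^j) used in the induction."""
    return 2 ** (2 ** j - 1) * w ** (2 ** j)


def double_exponential(j, w):
    """The coarser bound 2^(2^(w + j))."""
    return 2 ** (2 ** (w + j))
```

For small values of `j` and `w` the iterated value agrees with the closed form, and the closed form stays below the double exponential.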

Theorem 7.18 (Extended first epsilon theorem).

Assume $\pi$ is an $\mathbf{EC}_{\varepsilon}^{=}$-proof of $E(\vec{s}\,)$, an $\mathbf{EC}_{\varepsilon}^{=}$-formula where terms other than $\vec{s}$ are $\varepsilon$-free. There is an $\mathbf{EC}^{=}$-proof $\pi^{\prime}$ of the $\mathbf{EC}$-formula $\bigvee_{i<n}E(\vec{t}_{i})$ for some $\varepsilon$-free terms $\vec{t}_{0},\ldots,\vec{t}_{n-1}$ such that $n\leq 2_{2k}^{6k^{2}+2k+a\cdot p}$, where $k$ is the critical count $\mathsf{cc}(\pi)$, $a$ is the maximal arity $\mathsf{ma}(\pi)$, and $p$ is the maximal property degree $\mathsf{mpd}(\pi)$.

Proof.

We make a sequence of proofs $\pi_{0},\pi_{1},\ldots,\pi_{m}$ where $\pi_{0}$ is obtained by applying Lemma 7.6 to $\pi$, $m$ is $|\mathcal{CR}(\pi)|$, and $\pi_{i+1}$ is obtained by applying Lemma 7.17 and Lemma 7.6 in this order to $\pi_{i}$. We prove that $\mathsf{cc}(\pi_{i})\leq 2_{2i}^{6k^{2}+a\cdot p+2i}$ by induction on $i$. It is trivial in case $i=0$. In case $i\mapsto i+1$,

[TABLE]

We calculate the length $\mathsf{len}(E,E_{m})=\mathsf{len}(E,E_{1})\cdot\mathsf{len}(E_{1},E_{2})\cdot\cdots\cdot\mathsf{len}(E_{m-1},E_{m})$ .

[TABLE]

Since $|\mathcal{CR}(\pi)|\leq\mathsf{cc}(\pi)=k$ , $n\leq 2_{2k}^{6k^{2}+3k+a\cdot p}$ . ∎

7.3 Alternative $\varepsilon$ -equality formula and closure

We study $\mathbf{EC}_{\varepsilon}^{=_{1}}$ , the system $\mathbf{EC}_{\varepsilon}$ with $\mathbf{EQ}$ and the following $\varepsilon$ -equality formula,

[TABLE]

where $\vec{u}_{i\mapsto v}$ is $u_{1},u_{2},\ldots,u_{i-1},v,u_{i+1},\ldots,u_{n}$ ; obtained by replacing the $i$ -th element of $\vec{u}$ by $v$ . We say that $i$ is the position of the $\varepsilon$ -equality formula, and call the above formula the $\varepsilon$ -equality formula of position $i$ . Assuming $\varepsilon_{x}A(x,a)$ is an $\varepsilon$ -matrix, Hilbert and Bernays employed the following $\varepsilon$ -equality formula

[TABLE]

and presented the $\varepsilon$ -elimination method [HB39]. The formula (9) in $\mathbf{EC}_{\varepsilon}^{=_{1}}$ explicitly expresses the notion of position. An upper bound of the Herbrand complexity for $\mathbf{EC}_{\varepsilon}^{=_{1}}$ can be independent from the maximal arity of critical $\varepsilon$ -matrices in the given proof, in contrast to the case for $\mathbf{EC}_{\varepsilon}^{=}$ . The following lemma tells us that $\mathbf{EC}_{\varepsilon}^{=_{1}}$ is as strong as $\mathbf{EC}_{\varepsilon}^{=}$ .

Lemma 7.19.

$\mathbf{EC}_{\varepsilon}^{=}\vdash A$ if and only if $\mathbf{EC}_{\varepsilon}^{=_{1}}\vdash A$.

Proof.

Assume $\pi$ is an $\mathbf{EC}_{\varepsilon}^{=}$-proof of $A$. We prove $\vec{u}=\vec{v}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v})$ in $\mathbf{EC}_{\varepsilon}^{=_{1}}$ by induction on the number of differences between $\vec{u}$ and $\vec{v}$. If there is one difference, say between $u_{i}$ and $v_{i}$, the formula is an instance of the alternative $\varepsilon$-equality formula $u_{i}=v_{i}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v})$. For the induction step, assume $\pi$ is an $\mathbf{EC}_{\varepsilon}^{=_{1}}$-proof of the formula $\vec{u}=\vec{v}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v})$ where there are $n$ differences between $\vec{u}$ and $\vec{v}$, and $u_{k}=v_{k}$. For any $w$, $v_{k}=w\to\varepsilon_{x}A(x;\vec{v})=\varepsilon_{x}A(x;\vec{v}_{k\mapsto w})$ is an instance of the alternative $\varepsilon$-equality formula, hence by the transitivity of equality and the deduction theorem we obtain $\vec{u}=\vec{v}_{k\mapsto w}\to\varepsilon_{x}A(x;\vec{u})=\varepsilon_{x}A(x;\vec{v}_{k\mapsto w})$. The other direction is trivial. ∎
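The induction in the proof above can be pictured as walking from $\vec{u}$ to $\vec{v}$ one position at a time. The Python sketch below, with a hypothetical function name and with terms represented as plain strings, lists the intermediate argument vectors. Each consecutive pair differs in exactly one position $i$, so the corresponding step is an instance of the $\varepsilon$-equality formula of position $i$, and transitivity of equality links the steps.

```python
def one_position_chain(u, v):
    """Return the argument vectors leading from u to v,
    changing exactly one differing position per step (left to right)."""
    assert len(u) == len(v), "both vectors must have the same arity"
    chain = [tuple(u)]
    current = list(u)
    for i, (u_i, v_i) in enumerate(zip(u, v)):
        if u_i != v_i:
            current[i] = v_i
            chain.append(tuple(current))
    return chain
```

The number of steps, `len(chain) - 1`, equals the number of positions in which the two vectors differ, which is the induction measure used in the proof.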

We define the closure $\mathcal{CL}(\pi,g)$ of the $\varepsilon$ -matrix $g$ in the proof $\pi$ , which is to enumerate all the critical $\varepsilon$ -terms of $g$ which may occur during the $\varepsilon$ -elimination process.

Definition 7.20 (Closure).

Let $\pi$ be an $\mathbf{EC}_{\varepsilon}^{=_{1}}$ -proof and $g$ the maximal $\varepsilon$ -matrix of $\pi$ . We define a set $\mathcal{T}_{\pi}(g,i)$ of terms which occur at the premise of the $\varepsilon$ -equality formulas of the position $i$ in $\pi$ , and a strict order $\prec_{g,i}$ such that for $u,v\in\mathcal{T}_{\pi}(g,i)$ , $u\prec_{g,i}v$ iff $\mathsf{deg}(u)\leq\mathsf{deg}(v)~{}\text{and}~{}u\not\equiv_{\alpha}v$ . We define a partially ordered set $\mathcal{CL}(\pi,g)$ to be $\{g(\vec{u})\mid u_{i}\in\mathcal{T}_{\pi}(g,i)\}$ with the transitive closure of the strict order $\prec_{g}$ such that for any position $i$ , $g(\vec{u})\prec_{g}g(\vec{u}_{i\mapsto v})$ iff $u_{i}\prec_{g,i}v$ .
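To make the underlying set of the closure concrete, the Python sketch below enumerates $\{g(\vec{u})\mid u_{i}\in\mathcal{T}_{\pi}(g,i)\}$ as a Cartesian product. It is a simplification under stated assumptions: each $\mathcal{T}_{\pi}(g,i)$ is given as a non-empty list of pairwise distinct terms (strings here), $\alpha$-equivalence is ignored, and the order $\prec_{g}$ is not modelled. The function names are ours.

```python
from itertools import product
from math import prod


def closure_terms(matrix, candidates):
    """Enumerate all g(u) with u_i drawn from candidates[i].

    `candidates[i]` plays the role of the set T_pi(g, i) for position i.
    """
    return [(matrix, args) for args in product(*candidates)]


def closure_size_bound(candidates):
    """Product of the sizes of the per-position candidate sets."""
    return prod(len(c) for c in candidates)
```

With `candidates = [["u1", "v1"], ["u2", "v2", "w2"]]` the enumeration yields six terms, matching the product $2\cdot 3$ of the sizes of the candidate sets.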

It is easy to see that a closure is a lattice.

Definition 7.21 (Strict partial order on closures).

Let $M,N$ be sublattices of $\mathcal{CL}(\pi,g)$. Define a strict partial order as follows: $M\prec N$ if and only if the set of upper bounds of $N$ is a proper subset of the set of upper bounds of $M$.

Using the above strict partial order, we can choose, among maximal $\varepsilon$-terms, those that are also maximal with respect to $\prec$.

Definition 7.22 ( $\prec$ -maximal critical $\varepsilon$ -term).

Assume $\pi$ is a proof and $e$ is a critical $\varepsilon$-term of $\varepsilon$-matrix $g$ in $\pi$. If $e$ is maximal and $e^{\prime}\prec e$ holds for every other critical $\varepsilon$-term $e^{\prime}$ of $g$ in $\pi$, then $e$ is a $\prec$-maximal critical $\varepsilon$-term in $\pi$.

In the rest of this section, we simply say maximality to mean the $\prec$ -maximality. We introduce an order on proofs.

Definition 7.23 (Strict partial order on proofs due to a closure).

Let $\pi$ and $\pi^{\prime}$ be proofs. We define $\pi\prec\pi^{\prime}$ if and only if either $\mathsf{rk}(\pi)<\mathsf{rk}(\pi^{\prime})$ or $\mathsf{rk}(\pi)=\mathsf{rk}(\pi^{\prime})$ and $M\prec_{g}M^{\prime}$ for all $g$ of the rank $\mathsf{rk}(\pi)$ , where $M$ and $M^{\prime}$ are given by the sets of critical $\varepsilon$ -terms of $g$ in $\pi$ and $\pi^{\prime}$ , respectively.

Instead of Lemma 6.7, we use the following identity lemma in our current setting.

Lemma 7.24.

Assume $A(a;\vec{b})$ has exactly one occurrence of each $b_{i}$ and for each $b_{i}$ , if it is a subterm of some term $t$ , $a\in\mathrm{FV}(t)$ holds. For any term $s$ there exists $\pi$ and $\mathbf{EC}_{\varepsilon}^{=_{1}}\vdash_{\pi}u_{i}=v\to A(s;\vec{u})\to A(s;\vec{u}_{i\mapsto v})$ , such that $\mathsf{cc}(\pi)\leq\mathsf{deg}(A(a;\vec{b}))$ and $\mathsf{rk}(\pi)\leq\mathsf{rk}(A(a;\vec{b}))$ . Moreover, concerning the $\varepsilon$ -equality formulas $B_{1},\ldots,B_{n}$ used in $\pi$ , if $B_{i}$ belongs to $e_{i}$ and $e_{i}^{\prime}$ , $e_{i}$ and $e_{i}^{\prime}$ are proper subterms of $e_{i+1}$ and $e_{i+1}^{\prime}$ , $e_{i}$ and $e_{i}^{\prime}$ occur just once in $e_{i+1}$ and $e_{i+1}^{\prime}$ , respectively, $\mathsf{deg}(e_{i+1})=\mathsf{deg}(e_{i})+1$ and $\mathsf{deg}(e_{i+1}^{\prime})=\mathsf{deg}(e_{i}^{\prime})+1$ .

Proof.

Trivial due to the proof of Lemma 6.7. ∎

If no $\varepsilon$ -equality formula belongs to the maximal critical $\varepsilon$ -term to eliminate, we apply Lemma 7.12. Otherwise the following lemma applies.

Lemma 7.25 (Eliminating a maximal critical $\varepsilon$ -term involving $\varepsilon$ -equality).

Let $\pi$ be an $\mathbf{EC}_{\varepsilon}^{=_{1}}$ -proof of $E(\vec{e})$ where $E(\vec{a})$ is an $\varepsilon$ -free formula in $\mathbf{EC}_{\varepsilon}^{=_{1}}$ and $\vec{e}$ is a list of $\varepsilon$ -terms. Assume the maximal critical $\varepsilon$ -term of $\pi$ is $\varepsilon_{x}A(x;\vec{u})=:e$ and in $\pi$ there are critical formulas including ones of the form $A(t_{l},\vec{u})\to A(\varepsilon_{x}A(x;\vec{u}),\vec{u})$ for $1\leq l\leq m$ as well as $\varepsilon$ -equality formulas as follows.

[TABLE]

There is a proof $\pi_{e}$ of $\bigvee_{i\leq\mathsf{wd}(\pi,e)}E(\vec{r}_{i})$ for some $\vec{r}_{i}$ where there is no critical occurrence of $\varepsilon_{x}A(x;\vec{u})$ in $\pi_{e}$, $\pi_{e}\prec\pi$, $\mathsf{cc}(\pi_{e})\leq 2\cdot(p+1)\cdot\mathsf{cc}(\pi)^{2}$, $\mathsf{mwd}(\pi_{e},r)\leq 2\cdot\mathsf{mwd}(\pi,r)^{2}$, and $|\mathcal{CL}(\pi_{e},g)|\leq|\mathcal{CL}(\pi,g)|$ for any critical $\varepsilon$-matrix $g$ of rank $r$ in $\pi_{e}$, where $p:=\mathsf{pd}(e)$, $r:=\mathsf{rk}(\pi)$ and $w:=\mathsf{wd}(\pi,e)$.

Proof.

We give proofs $\rho_{j}$ of $u_{i_{j}}=v_{j}\to E(\vec{s}\{e/\varepsilon_{x}A(x;\vec{u}_{i_{j}\mapsto v_{j}})\})$ for $1\leq j\leq n$ and also a proof $\sigma$ of $(\bigwedge_{1\leq j\leq n}\neg u_{i_{j}}=v_{j})\to E(\vec{e})$, such that they are all free from the critical $\varepsilon$-term $e$. For the former ones, we apply the substitution $\{e/\varepsilon_{x}A(x;\vec{u}_{i_{j}\mapsto v_{j}})\}$ throughout the proof $\pi$. Then each critical formula in $\pi$ belonging to $e$ becomes $A(t_{l},\vec{u})\to A(\varepsilon_{x}A(x;\vec{u}_{i_{j}\mapsto v_{j}}),\vec{u})$, which is provable by the following three steps

[TABLE]

where the first and the third formulas are provable by Lemma 7.24, and the second one is a critical formula. On the other hand, the $k$-th $\varepsilon$-equality formula in $\pi$ belonging to $e$ becomes

[TABLE]

which is provable as follows. If the positions of the $j$-th and $k$-th $\varepsilon$-equality formulas are the same, namely $i_{j}=i_{k}$, the assumption $u_{i_{j}}=v_{j}$ and the premise $u_{i_{k}}=v_{k}$ imply $v_{j}=v_{k}$, hence the above formula is proven by means of an $\varepsilon$-equality formula $v_{j}=v_{k}\to\varepsilon_{x}A(x;\vec{u}_{i_{j}\mapsto v_{j}})=\varepsilon_{x}A(x;\vec{u}_{i_{k}\mapsto v_{k}})$. Otherwise, we prove the above formula by using the two $\varepsilon$-equality formulas.

[TABLE]

In any case, the critical $\varepsilon$ -term $e$ is gone. In the second case, although the critical $\varepsilon$ -term $\varepsilon_{x}A(x;\vec{u}_{i_{j}\mapsto v_{j},i_{k}\mapsto v_{k}})$ may be new in the sense that it has no critical occurrence in $\pi$ , the obtained proof is strictly smaller than $\pi$ with respect to $\prec$ in $\mathcal{CL}(\pi,\varepsilon_{x}A(x;\vec{a}))$ . For the proof $\sigma$ , on the other hand, these $\varepsilon$ -equality formulas are provable by ex falso quodlibet, hence we use Lemma 7.12 to get rid of the critical formulas.

Finally, we discuss the bounds. After the elimination process, each $\rho_{j}$ may have an increased number for $\mathsf{mwd}(\rho_{j},r)$ , which is bounded by $2\cdot\mathsf{mwd}(\pi,r)$ . Due to Lemma 7.12, $\mathsf{mwd}(\sigma,r)\leq\mathsf{mwd}(\pi,r)\cdot(m+1)$ . Therefore, $\mathsf{mwd}(\pi_{e},r)\leq 2\cdot\mathsf{mwd}(\pi,r)\cdot n+\mathsf{mwd}(\pi,r)\cdot(m+1)\leq 2\cdot\mathsf{mwd}(\pi,r)^{2}$ . The critical counts for $\vec{\rho},\sigma$ are bounded as $\mathsf{cc}(\rho_{j})\leq\mathsf{cc}(\pi)+2\cdot(p\cdot m+n-1)$ and $\mathsf{cc}(\sigma)\leq(\mathsf{cc}(\pi)-n)\cdot(m+1)$ , hence $\mathsf{cc}(\pi_{e})\leq n\cdot(\mathsf{cc}(\pi)+2\cdot(p\cdot m+n-1))+(\mathsf{cc}(\pi)-n)\cdot(m+1)\leq(\mathsf{cc}(\pi)+2\cdot\mathsf{mwd}(\pi,r)\cdot p)\cdot(\mathsf{mwd}(\pi,r)+1)\leq 2\cdot(p+1)\cdot\mathsf{cc}(\pi)^{2}$ . ∎

In case we deal with $\varepsilon$ -equality formulas with different positions, we may introduce a critical $\varepsilon$ -term which is not a critical $\varepsilon$ -term in the original proof $\pi$ . The closure covers all the potential new critical $\varepsilon$ -terms beforehand. Also note that the closure won’t be expanded through the $\varepsilon$ -elimination procedure, because the maximal critical $\varepsilon$ -term never matches terms in the premise of $\varepsilon$ -equality formulas, hence we don’t get anything new for $\mathcal{T}_{\pi}(g,i)$ .

By Lemma 7.25, we get a proof without using a maximal $\varepsilon$ -term in the corresponding closure, hence the size of the closure is an upper bound for the number of applications of the lemma needed to eliminate the $\varepsilon$ -matrix. The size of the closure $|\mathcal{CL}(\pi,g)|$ is bounded by $\prod_{i\in I_{g}}|\mathcal{T}_{\pi}(g,i)|$ , where $I_{g}$ is the set of positions of $\varepsilon$ -equality formulas belonging to $g$ in $\pi$ . In the following proof the upper bound of this size is estimated only by the number of $\varepsilon$ -equality formulas.

Lemma 7.26.

For any proof $\pi$ and $\varepsilon$ -matrix $g$ in $\pi$ , $|\mathcal{CL}(\pi,g)|\leq 2^{\mathsf{wd}^{=}(\pi,g)^{2}}$ .

Proof.

The number of different positions of $\varepsilon$ -equality formulas belonging to $g$ is at most $w:=\mathsf{wd}^{=}(\pi,g)$ , and $|\mathcal{T}_{\pi}(g,i)|\leq w+1$ holds for each $i$ , hence $|\mathcal{CL}(\pi,g)|$ is bounded by $(\mathsf{wd}^{=}(\pi,g)+1)^{\mathsf{wd}^{=}(\pi,g)}$ , which is at most $2^{\mathsf{wd}^{=}(\pi,g)^{2}}$ . ∎
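The last step of this estimate can be spelled out as follows. Here $w$ is our shorthand for the width bound appearing in the lemma, and the only fact used is $w+1\leq 2^{w}$ for every natural number $w$ .

```latex
\[
  (w+1)^{w} \;\leq\; \bigl(2^{w}\bigr)^{w} \;=\; 2^{w^{2}} .
\]
```

Equality holds for $w=0$ and $w=1$ , so the bound is not strict in general.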

By repeatedly eliminating maximal critical $\varepsilon$ -terms, we diminish the rank of the proof.

Lemma 7.27 (Eliminating critical $\varepsilon$ -terms of the maximal rank).

Let $\pi$ be an $\mathbf{EC}_{\varepsilon}^{=_{1}}$ -proof of $E(\vec{e}\,)$ where $E(\vec{a})$ is an $\varepsilon$ -free formula in $\mathbf{EC}_{\varepsilon}^{=_{1}}$ and $\vec{e}$ is a list of $\varepsilon$ -terms. There is an $\mathbf{EC}_{\varepsilon}^{=_{1}}$ -proof $\pi^{\prime}$ of $\bigvee_{i\leq N}E(\vec{r}_{i})$ for some quantifier free terms $\vec{r}_{i}$ and $N$ bounded by $2_{3}^{2\mathsf{cc}(\pi)^{2}+3\mathsf{cc}(\pi)}$ such that $\mathsf{rk}(\pi^{\prime})<\mathsf{rk}(\pi)$ and $\mathsf{cc}(\pi^{\prime})\leq 2_{3}^{(\mathsf{cc}(\pi)+1)^{2}+p}$ , where $p$ is $\mathsf{pd}(\pi)$ .

Proof.

We repeatedly apply Lemma 7.25 in order to eliminate maximal critical $\varepsilon$ -terms. As Lemma 7.25 eliminates an $\varepsilon$ -term, which is a least upper bound of a closure, without expanding the closures of rank $r:=\mathsf{rk}(\pi)$ , this process terminates. Let $g_{1},g_{2},\ldots,g_{\mathsf{m}(\pi,r)}$ be the critical $\varepsilon$ -matrices of rank $r$ in $\pi$ . Then the number of applications of the lemma is bounded by the number of critical formulas of rank $r$ plus the number of elements in the closures $\mathcal{CL}(\pi,g_{i})$ , namely, by $\mathsf{cc}(\pi)+\sum_{1\leq i\leq\mathsf{m}(\pi,r)}|\mathcal{CL}(\pi,g_{i})|\leq 2^{\mathsf{cc}(\pi)\cdot(\mathsf{cc}(\pi)+1)}$ . Let $k$ be $\mathsf{cc}(\pi)$ . Solving the recurrence relations $w_{0}=\mathsf{mwd}(\pi,r)$ and $w_{n+1}=2\cdot w_{n}^{2}$ , we find $w_{n}=2^{(2^{n}-1)}\cdot(w_{0})^{2^{n}}$ , hence $w_{2^{k(k+1)}}=2^{(2_{2}^{k(k+1)}-1)}\cdot(w_{0})^{2_{2}^{k(k+1)}}<2_{3}^{2k^{2}+3k}$ is a bound for the length $N$ of the Herbrand disjunction. Similarly, solving the recurrence relations $k_{0}=k$ and $k_{n+1}=2\cdot(p+1)\cdot k_{n}^{2}$ , we find $k_{n}=(2\cdot(p+1))^{2^{n}-1}\cdot k^{2^{n}}\leq 2_{2}^{k+p+n}$ , hence $\mathsf{cc}(\pi^{\prime})\leq 2_{2}^{k+p+2^{k(k+1)}}\leq 2_{3}^{(k+1)^{2}+p}$ , excluding the trivial case $k=0$ . ∎
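The two closed forms used in this proof can be sanity-checked numerically. The following is a minimal sketch, not part of the paper: the function names are ours, and it only compares the iterated recurrences $w_{n+1}=2\cdot w_{n}^{2}$ and $k_{n+1}=2\cdot(p+1)\cdot k_{n}^{2}$ with the stated solutions for small inputs.

```python
def iterate_width(w0: int, n: int) -> int:
    """Apply w -> 2 * w**2 exactly n times, starting from w0."""
    w = w0
    for _ in range(n):
        w = 2 * w * w
    return w


def closed_form_width(w0: int, n: int) -> int:
    """Closed form w_n = 2^(2^n - 1) * w0^(2^n)."""
    return 2 ** (2 ** n - 1) * w0 ** (2 ** n)


def iterate_cc(k0: int, p: int, n: int) -> int:
    """Apply k -> 2 * (p + 1) * k**2 exactly n times, starting from k0."""
    k = k0
    for _ in range(n):
        k = 2 * (p + 1) * k * k
    return k


def closed_form_cc(k0: int, p: int, n: int) -> int:
    """Closed form k_n = (2 * (p + 1))^(2^n - 1) * k0^(2^n)."""
    return (2 * (p + 1)) ** (2 ** n - 1) * k0 ** (2 ** n)


if __name__ == "__main__":
    # Compare both recurrences with their closed forms on a small grid.
    for n in range(5):
        for start in range(1, 4):
            assert iterate_width(start, n) == closed_form_width(start, n)
            for p in range(3):
                assert iterate_cc(start, p, n) == closed_form_cc(start, p, n)
    print("closed forms agree on all tested inputs")
```

Both recurrences have the shape $x_{n+1}=c\cdot x_{n}^{2}$ , whose solution is $x_{n}=c^{2^{n}-1}\cdot x_{0}^{2^{n}}$ ; the check confirms this for $c=2$ and $c=2\cdot(p+1)$ .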

Theorem 7.28 (Extended first epsilon theorem for $\mathbf{EC}_{\varepsilon}^{=_{1}}$ ).

Assume $\mathbf{EC}_{\varepsilon}^{=_{1}}\vdash_{\pi}E(\vec{s}\,)$ for $E(\vec{a})\in L(\mathbf{EC}^{=})$ and $\vec{s}\in L(\mathbf{EC}_{\varepsilon}^{=})$ , then $\mathbf{EC}^{=}\vdash\bigvee_{i=0}^{N}E(\vec{s}_{i})$ for some $\vec{s}_{i}\in L(\mathbf{EC})$ , where $N$ is bounded by $2_{3k}^{k^{2}+3k+p+2}\leq 2_{3k+2}^{k+p}$ for $p:=\mathsf{mpd}(\pi)$ .

Proof.

As long as the maximal rank of critical formulas in $\pi$ is smaller than $\mathsf{rk}(\pi)$ , we repeatedly apply Lemma 7.5. Our proof proceeds by induction on the number of different ranks of critical formulas in $\pi$ , namely, the size of $S:=\{\mathsf{rk}(e)\mid e\text{ belongs to a critical formula in }\pi\}$ . In each step we use Lemma 7.27, and then Lemma 7.5 in the same way as mentioned above. Any $\varepsilon$ -terms that still remain can be replaced by nullary constants. The number of applications of Lemma 7.27 is $|S|$ , which is bounded by $\mathsf{cc}(\pi)$ . Solving the recurrence relations $k_{0}=k$ and $k_{n+1}=2_{2}^{2_{3}^{k_{n}}+2^{p}}$ , we find $k_{n}\leq 2_{3n}^{k^{2}+2k+p+n}$ . The length $N$ of the Herbrand disjunction is bounded by $2_{3k}^{k^{2}+3k+p+2}$ , taking $\mathsf{cc}(\pi)=k$ for the bound of $n$ , because $\prod_{i<n}w_{i}\leq(w_{n-1})^{n}=(2_{3}^{2(k_{n-1})^{2}+3k_{n-1}})^{n}\leq 2_{3n}^{k^{2}+2k+p+n+2}$ . ∎

The property degree $p$ is independent of the critical count, and its appearance in the upper bound on the length of the Herbrand disjunction reflects the potential complexity due to the presence of the $\varepsilon$ -equality formula.

8 Lower Bounds on Herbrand Disjunctions

In this section, we adapt the result by Statman [Sta79] to give lower bounds on Herbrand disjunctions. The case without equality was studied by Moser and Zach, where Orevkov’s result was used to give the lower bound of Herbrand complexity [Ore82, MZ06]. We first define a combinatory logic, which allows us to define a term whose normal form is hyperexponentially long. Then we describe a sequence of formulas such that the critical counts of their proofs in $\mathbf{PC}^{=}$ grow linearly. Finally, we show that the Herbrand complexity of those formulas is hyperexponential.

Definition 8.1.

Let $\circ$ be a left associative binary function symbol. The combinators $S$ , $B$ , $C$ , and $I$ are defined as nullary constants satisfying

[TABLE]

Let $\mathbf{\lambda I}$ denote the set of the above semi-formulas and $\forall\mathbf{\lambda I}$ the set of universally quantified closed formulas of $\mathbf{\lambda I}$ .

From now on, we omit $\circ$ and write terms in the usual way, e.g. $Sxyz$ .

Definition 8.2.

Define $T:=SB(CBI)$ and $T_{1}:=T$ , $T_{n+1}:=T_{n}T$ .
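To see why $T_{n}$ produces hyperexponentially long normal forms, the combinators can be modelled as curried functions. The sketch below is ours and rests on two assumptions not taken from the text: the combinators satisfy the standard equations $Sxyz=xz(yz)$ , $Bxyz=x(yz)$ , $Cxyz=xzy$ , $Ix=x$ , and hyperexponentiation is given by $2_{0}^{m}=m$ , $2_{h+1}^{m}=2^{2_{h}^{m}}$ . Under these assumptions $T_{n}$ applies its first argument $2_{n}^{1}$ times.

```python
# Curried models of the combinators. The defining equations in the comments
# are the standard ones and are an assumption of this sketch.
S = lambda x: lambda y: lambda z: x(z)(y(z))  # Sxyz = xz(yz)
B = lambda x: lambda y: lambda z: x(y(z))     # Bxyz = x(yz)
C = lambda x: lambda y: lambda z: x(z)(y)     # Cxyz = xzy
I = lambda x: x                               # Ix = x

T = S(B)(C(B)(I))  # T := SB(CBI), so that T t u = t (t u)


def T_n(n: int):
    """T_1 := T and T_{n+1} := T_n T, as in Definition 8.2."""
    term = T
    for _ in range(n - 1):
        term = term(T)
    return term


def tower(height: int, top: int) -> int:
    """Assumed hyperexponentiation: tower(0, m) = m, tower(h+1, m) = 2**tower(h, m)."""
    value = top
    for _ in range(height):
        value = 2 ** value
    return value


def count_applications(n: int) -> int:
    """Count how often T_n applies its first argument, using the successor on 0."""
    successor = lambda k: k + 1
    return T_n(n)(successor)(0)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, count_applications(n), tower(n, 1))
```

In the notation of the paper, this corresponds to $T_{n}qq$ reducing to a term with $2_{n}^{1}+1$ occurrences of $q$ , which is the count used later in the proof of Theorem 8.10.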

For a finite set of formulas $S:=\{A_{1},A_{2},\ldots,A_{n}\}$ , let $S\to B$ denote $A_{1}\to A_{2}\to\ldots\to A_{n}\to B$ .

Lemma 8.3.

There is a $\mathbf{PC}^{=}$ -proof $\pi$ such that $\vdash_{\pi}\forall\mathbf{\lambda I}\to Ttu=t(tu)$ .

Proof.

The proof $\pi$ involves the equality axioms, the combinatory logic axioms, and the quantifier axiom, so that the following equations are proven.

[TABLE]

It is easy to find such a $\pi$ whose critical count $\mathsf{cc}(\pi)$ is 12, which is a constant. ∎
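For orientation, the equational reasoning behind Lemma 8.3 can be reconstructed as below. This is our reconstruction and assumes the standard combinator equations $Sxyz=xz(yz)$ , $Bxyz=x(yz)$ , $Cxyz=xzy$ and $Ix=x$ ; the intermediate steps in the paper's own proof may be arranged differently.

```latex
\begin{align*}
  Ttu &= SB(CBI)tu \\
      &= Bt(CBIt)u  && \text{by } Sxyz = xz(yz) \\
      &= t(CBItu)   && \text{by } Bxyz = x(yz) \\
      &= t(BtIu)    && \text{by } Cxyz = xzy \\
      &= t(t(Iu))   && \text{by } Bxyz = x(yz) \\
      &= t(tu)      && \text{by } Ix = x
\end{align*}
```

Each step instantiates one universally quantified axiom of $\forall\mathbf{\lambda I}$ and is chained by the equality axioms, which is consistent with a critical count that does not depend on $t$ and $u$ .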

Definition 8.4.

Let $p$ and $q$ be nullary constants. For any $n\geq 1$ we define comprehension terms $H_{n}$ and a formula $E_{n}$ .

[TABLE]
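Judging from the unfoldings used in the proofs of Lemma 8.5 and Theorem 8.8, the definitions presumably read as follows. This is our reading rather than a verbatim copy of the definition, and $\forall x\in H_{m}.\,A$ is taken to abbreviate $\forall x.(x\in H_{m}\to A)$ .

```latex
\begin{align*}
  t \in H_{1}   &:\equiv \forall x.\; px = p(tx), \\
  t \in H_{m+1} &:\equiv \forall x \in H_{m}.\; tx \in H_{m}, \\
  E_{n}         &:\equiv pq = p(T_{n}qq).
\end{align*}
```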

Lemma 8.5.

In $\mathbf{PC}^{=}$ , $\vdash\forall\mathbf{\lambda I}\to z(zy)\vec{t}\in H_{m}$ implies $\vdash\forall\mathbf{\lambda I}\to Tzy\vec{t}\in H_{m}$ for any terms $\vec{t}$ of arbitrary length and for any $m\geq 1$ .

Proof.

By induction on $m$ . In the base case, we assume that there exists a proof of $z(zy)\vec{t}\in H_{1}$ , that is, $\forall x.px=p(z(zy)\vec{t}x)$ , and prove $Tzy\vec{t}\in H_{1}$ , namely, $\forall x.px=p(Tzy\vec{t}x)$ . We apply Lemma 8.3 once and equality axioms to make a proof. In the step case, we prove that if $\vdash z(zy)\vec{t}\in H_{m+1}$ then $\vdash Tzy\vec{t}\in H_{m+1}$ for any $\vec{t}$ of any length. Assume $\vec{t}$ and $\vdash z(zy)\vec{t}\in H_{m+1}$ , and we prove $\vdash Tzy\vec{t}\in H_{m+1}$ , which unfolds into $\forall x\in H_{m}.Tzy\vec{t}x\in H_{m}$ . Further assume $x\in H_{m}$ , then the induction hypothesis, $\vdash z(zy)\vec{t}\in H_{m}$ implies $\vdash Tzy\vec{t}\in H_{m}$ for any $\vec{t}$ of any length, is applicable with terms $\vec{t},x$ , hence it suffices to show $\vdash z(zy)\vec{t}x\in H_{m}$ which is trivial due to the assumptions $\vdash z(zy)\vec{t}\in H_{m+1}$ , which unfolds into $\vdash\forall x\in H_{m}.z(zy)\vec{t}x\in H_{m}$ , and $x\in H_{m}$ . ∎

Lemma 8.6.

In $\mathbf{PC}^{=}$ , $\vdash\forall\mathbf{\lambda I}\to T\in H_{m+1}$ for any $m\geq 1$ .

Proof.

In case $m=1$ , $T\in H_{2}$ holds trivially. Assume $m>1$ . Unfolding $T\in H_{m+2}$ gives the following.

[TABLE]

Due to its premise, $z(zy)\in H_{m}$ follows. Lemma 8.5 implies $Tzy\in H_{m}$ . ∎

Lemma 8.7.

For any $n\geq 1$ there is a $\mathbf{PC}^{=}$ -proof $\pi$ such that $\vdash_{\pi}\forall\mathbf{\lambda I}\to T_{n}\in H_{2}$ and $\mathsf{cc}(\pi)$ is linear in $n$ .

Proof.

We prove that in $\mathbf{PC}^{=}$ , $\vdash\forall\mathbf{\lambda I}\to T_{n}\in H_{m+1}$ for any $n,m\geq 1$ by induction on $n$ . The base case, $T_{1}\in H_{m+1}$ for any $m$ , is proven by Lemma 8.6. In order to prove the step case, $T_{n+1}\in H_{m+1}$ for any $m$ , it suffices to assume $m$ and to prove both $T_{n}\in H_{m+2}$ and $T\in H_{m+1}$ . They are straightforward by induction hypothesis and by Lemma 8.6, respectively. ∎

Assuming $\forall\mathbf{\lambda I}$ and $\forall x.px=p(qx)$ , there is a proof of $E_{n}$ in $\mathbf{PC}^{=}$ whose critical count is linear in $n$ .

Theorem 8.8.

There is a $\mathbf{PC}^{=}$ -proof $\pi_{n}$ such that $\vdash_{\pi_{n}}\forall\mathbf{\lambda I}\to(\forall x.px=p(qx))\to E_{n}$ and $\mathsf{cc}(\pi_{n})$ is linear in $n$ .

Proof.

Assume $\forall\mathbf{\lambda I}$ and $\forall x.px=p(qx)$ . By Lemma 8.7, there is a proof $\pi$ of $T_{n}\in H_{2}$ with $\mathsf{cc}(\pi)$ being linear in $n$ . The formula unfolds into $\forall y.(\forall z.pz=p(yz))\to\forall z.pz=p(T_{n}yz)$ . Instantiating $y$ with $q$ , discharging the premise by the assumption $\forall x.px=p(qx)$ , and then instantiating $z$ with $q$ , we obtain $pq=p(T_{n}qq)$ . ∎

For a set $X$ of semi-formulas, let $X^{*}$ denote the set of closed formulas obtained by instantiating each formula in $X$ . We use the following lemma, whose proof is found in Statman [Sta79], to prove the main theorem.

Lemma 8.9 (Statman [Sta79]).

Suppose that $X$ is a finite subset of $\{px=p(qx)\}^{*}$ such that $\vdash\mathbf{\lambda I}^{*}\to X\to E_{n}$ ; then there is a finite subset $Y$ of $\{px=p(qx)\}^{*}$ such that $\vdash\mathbf{\lambda I}^{*}\to Y\to E_{n}$ , $|Y|\leq|X|$ , and each term occurring in $Y$ is closed and in normal form.

Theorem 8.10 (Statman [Sta79]).

Suppose $X$ is a finite subset of $\{px=p(qx)\}^{*}$ such that $\vdash\mathbf{\lambda I}^{*}\to X\to E_{n}$ and each term occurring in $X$ is in normal form; then $|X|\geq 2_{n}^{1}/2$ .

Proof.

Suppose to the contrary that the size of $X$ is less than $2_{n}^{1}/2$ . Assume $X=\{pM_{i}=p(qM_{i})\mid 1\leq i\leq m\}$ for some $m<2_{n}^{1}/2$ and $M_{i}$ . Each of the $m$ instances rules out at most two terms of the form $q^{k}$ , namely $M_{i}$ and $qM_{i}$ , so at most $2m<2_{n}^{1}$ of the $2_{n}^{1}$ values of $k$ with $1<k\leq 2_{n}^{1}+1$ are ruled out. Hence for some $k$ with $1<k\leq 2_{n}^{1}+1$ , for any $i$ , $1\leq i\leq m$ , neither $M_{i}$ nor $qM_{i}$ is the same as $q^{k}$ . Let $0,1$ be new constants. We define $\forall\mathbf{\lambda I}^{+}$ to be $\forall\mathbf{\lambda I}$ extended with infinitely many equations, so that the following reduction rules for closed $M$ of normal form without $p$ in function position are available:

[TABLE]

We prove that $\not\vdash\forall\mathbf{\lambda I}^{+}\to E_{n}$ . The normal form of formula $pq=pq^{2_{n}^{1}+1}$ is $0=1$ , which is not provable in $\forall\mathbf{\lambda I}^{+}$ because of the Church-Rosser property, cf. Theorem 3 in [Hin74]. Notice that $pq^{2_{n}^{1}+1}$ is the normal form of $p(T_{n}qq)$ .

Finally, we prove that $\vdash\forall\mathbf{\lambda I}^{+}\to pM_{i}=p(qM_{i})$ for each $i$ . If $M_{i}$ does not contain $p$ in function position, we consider the following three cases. If $M_{i}$ is not of the form $q^{j}$ for any $j$ , $pM_{i},p(qM_{i})\rhd 1$ , and if $M_{i}$ is $q^{j}$ and $j,j+1<k$ , $pM_{i},p(qM_{i})\rhd 0$ , otherwise $k<j,j+1$ and then $pM_{i},p(qM_{i})\rhd 1$ . If $M_{i}$ contains $p$ in function position, $M_{i}$ has a normal form containing $0$ or $1$ and without $p$ in function position, so $pM_{i},p(qM_{i})\rhd 1$ under $\forall\mathbf{\lambda I}^{+}$ .

As $\forall\mathbf{\lambda I}^{+}$ extends $\mathbf{\lambda I}^{*}$ , $\not\vdash\forall\mathbf{\lambda I}^{+}\to X\to E_{n}$ implies $\not\vdash\mathbf{\lambda I}^{*}\to X\to E_{n}$ . ∎

This theorem implies the following corollary, which tells us that the length of the Herbrand disjunction of the formula in Theorem 8.8 is hyperexponential. For a set of semi-formulas $\{A_{0}(\vec{x}_{0}),\ldots,A_{n}(\vec{x}_{n})\}=:X$ , let $X(\vec{t}_{0},\ldots,\vec{t}_{n}\,)$ denote the conjunction $A_{0}(\vec{t}_{0})\land\ldots\land A_{n}(\vec{t}_{n})$ .

Corollary 8.11.

$\mathrm{HC}(\forall\mathbf{\lambda I}\to(\forall x.px=p(qx))\to E_{n})\geq 2_{n}^{1}/2$ .

Proof.

Let $G_{n}$ be $\forall\mathbf{\lambda I}\to(\forall x.px=p(qx))\to E_{n}$ . Assume $\mathrm{HC}(G_{n})<2_{n}^{1}/2$ , then there exists $N<2_{n}^{1}/2$ and a Herbrand disjunction $\bigvee_{i<N}(\mathbf{\lambda I}(\vec{t}_{i})\to pu_{i}=p(qu_{i})\to pq=p(T_{n}qq))$ , namely, $(\bigwedge_{i<N}\mathbf{\lambda I}(\vec{t}_{i}))\to(\bigwedge_{i<N}pu_{i}=p(qu_{i}))\to pq=p(T_{n}qq)$ , where $\varepsilon$ -free terms $\vec{t}_{i},u_{i}$ are in normal form. Due to Theorem 8.10, the number of instances of $px=p(qx)$ is at least $2_{n}^{1}/2$ , hence $\mathrm{HC}(G_{n})\geq 2_{n}^{1}/2$ . ∎

The analysis in this section tells us that the lower bound of the Herbrand complexity is hyperexponential. Further technical investigation may lead to a better lower bound, narrowing the gap between the upper and lower bounds, and may also clarify whether the property degree is unnecessary.

9 Conclusion

We studied the Herbrand complexity for epsilon calculus with the $\varepsilon$ -equality axiom. For Herbrand’s theorem, which concerns prenex existential formulas, the Herbrand complexity is the same as in the case without the $\varepsilon$ -equality formula, because the embedding result does not rely on the $\varepsilon$ -equality formula. The formulation of the $\varepsilon$ -equality formula has to be restricted through the notion of $\varepsilon$ -matrices, since otherwise, as Yukami’s trick explains, we would fail to find a complexity bound. Our proof of the first epsilon theorem is simpler than the original one by Bernays, as our formulation of the $\varepsilon$ -equality formula due to the vector notation allows us to get rid of the notion of closures from the proof. Using this formulation, we computed the upper bound of the Herbrand complexity for the extended first epsilon theorem with the $\varepsilon$ -equality axiom. While the hyperexponential part of the result, namely, the height of the tower, is the same as in the case without the $\varepsilon$ -equality axiom, the exponential part is quadratic with the additional parameters, the property degree and the maximal arity, rather than linear as in the case without the $\varepsilon$ -equality axiom. Employing the original formulation of the $\varepsilon$ -equality axiom by Bernays and Hilbert, the parameter for the maximal arity can be eliminated, although the height of the hyperexponential tower grows faster than in the original result. We also gave a lower bound analysis which tells us that the Herbrand complexity has to be at least hyperexponential, namely, non-elementary.

Future work

There are open problems for future work. The notion of property degree was introduced to compute the upper bounds of the complexity in Section 7.2. On the other hand, our lower bound analysis does not take the property degree into account, hence it is still unclear whether the property degree is necessary for complexity analyses. Moreover, it is important to explore a better proof representation suitable for epsilon calculus, because the syntactic complication of $\varepsilon$ -terms is a real practical obstacle to studying epsilon calculus. Some modern formalizations of epsilon calculus are known, for example a sequent calculus with function variables by Baaz, Leitsch, and Lolic [BLL18] and another formulation based on Miller’s expansion trees by Aschieri, Hetzl, and Weller [AHW18].

Bibliography


[AB16] Juan P. Aguilera and Matthias Baaz. Unsound inferences make proofs shorter. CoRR, abs/1608.07703, 2016.
[Ack25] W. Ackermann. Begründung des Tertium non datur mittels der Hilbertschen Theorie der Widerspruchsfreiheit. Mathematische Annalen, 93:1–36, 1925.
[Ack40] W. Ackermann. Zur Widerspruchsfreiheit der Zahlentheorie. Mathematische Annalen, 117:162–194, 1940.
[AHW18] Federico Aschieri, Stefan Hetzl, and Daniel Weller. Expansion trees with cut. The Oberwolfach Preprints, 2018. Preprint on webpage at https://publications.mfo.de/bitstream/handle/mfo/1333/OWP 2018_01.pdf .
[Ara03] T. Arai. Epsilon substitution method for $ID_{1}(\Pi^{0}_{1}\lor\Sigma^{0}_{1})$ . Annals of Pure and Applied Logic, 121:163–208, 2003.
[Ara05] T. Arai. Ideas in the epsilon substitution method for $\Pi^{0}_{1}$ -FIX. Annals of Pure and Applied Logic, 136:3–21, 2005.
[Avi02] J. Avigad. Update procedures and the $1$ -consistency of arithmetic. Mathematical Logic Quarterly, 48:3–13, 2002.
[Bel93a] J. L. Bell. Hilbert’s epsilon-operator and classical logic. Journal of Philosophical Logic, 22:1–18, 1993.

