Intersection Types and Counting††thanks: Work supported by the National Science Center (decision DEC-2012/07/D/ST6/02443).
Paweł Parys
University of Warsaw, Poland
[email protected]
Abstract
We present a new approach to the following meta-problem: given a quantitative property of trees,
design a type system such that the desired property for the tree generated by an infinitary ground λ-term corresponds to some property of a derivation of a type for this λ-term, in this type system.
Our approach is presented in the particular case of the language finiteness problem for nondeterministic higher-order recursion schemes (HORSes):
given a nondeterministic HORS, decide whether the set of all finite trees generated by this HORS is finite.
We give a type system such that the HORS can generate a tree of an arbitrarily large finite size if and only if in the type system we can obtain derivations that are arbitrarily large, in an appropriate sense;
the latter condition can be easily decided.
1 Introduction
In this paper we consider λY-calculus, which is an extension of the simply typed λ-calculus by a fixed-point operator Y.
A term P of λY-calculus that is of sort111We use the word “sort” instead of the usual “type” to avoid confusion with intersection types introduced in this paper.
o can be used to generate an infinite tree BT(P), called the Böhm tree of P.
Trees generated by terms of λY-calculus can be used to faithfully represent the control flow of programs in languages with higher-order functions.
Traditionally, Higher Order Recursive Schemes (HORSes) are used for this purpose [9, 13, 18, 17]; this formalism is equivalent to λY-calculus,
and the translation between them is rather straightforward [22].
Collapsible Pushdown Systems [11] and Ordered Tree-Pushdown Systems [8] are other equivalent formalisms.
Intersection type systems were intensively used in the context of HORSes, for several purposes like
model-checking [14, 17, 6, 21],
pumping [15],
transformations of HORSes [16, 7], etc.
Interestingly, constructions very similar to intersection types were used also on the side of collapsible pushdown systems; they were alternating stack automata [5], and types of stacks [20, 12].
In this paper we show how intersection types can be used for deciding quantitative properties of trees generated by λY-terms.
We concentrate on the language finiteness problem for nondeterministic HORSes:
given a nondeterministic HORS, decide whether the set of all finite trees generated by this HORS is finite.
This problem can be restated in the world of λY-terms (or standard, deterministic HORSes), generating a single infinite tree.
Here, instead of resolving nondeterministic choices during the generation process, we leave them in the resulting tree.
Those nondeterministic choices are denoted by a distinguished br (“branch”) symbol, below which we put options that could be chosen.
Then to obtain a finite tree generated by the original HORS we just need to recursively choose in every br-labeled node which of the two subtrees we want to consider.
Thus, in this setting, the language finiteness problem asks whether the set of all finite trees obtained this way is finite.
The difficulty of this problem lies in the fact that sometimes the same finite tree may be found in infinitely many different places of BT(P) (i.e., generated by a nondeterministic HORS in many ways);
thus the actual property to decide is whether there is a common bound on the size of each of these trees.
This makes the problem inaccessible for standard methods used for analyzing HORSes, as they usually concern only regular properties of the Böhm tree, while boundedness is a problem of different kind.
The same difficulty was observed in [15], where they prove a pumping lemma for deterministic HORSes, while admitting (Remark 2.2) that their method is too weak to reason about nondeterministic HORSes.
In order to solve the language finiteness problem, we present an appropriate intersection type system, where derivations are annotated by flags and markers of multiple kinds.
The key property of this type system is that the number of flags in a type derivation for a λY-term P
approximates the size of some finite tree obtained by resolving nondeterministic choices in the infinite tree BT(P).
In consequence, there are type derivations using arbitrarily many flags if, and only if, the answer to the language finiteness problem is “no”.
The language finiteness problem was first attacked in [2] (for safe HORSes only), but their algorithm turned out to be incorrect [3].
To our knowledge, the only known solution of this problem follows from a recent decidability result for the diagonal problem [10, 7].
This problem asks, given a nondeterministic HORS and a set of letters Σ, whether for every n∈N the HORS generates a finite tree in which every letter from Σ appears at least n times.
Clearly, a nondeterministic HORS generates arbitrarily large trees exactly when for some letter a it generates trees having arbitrarily many a letters, i.e., when the answer to the diagonal problem for Σ={a} is “yes”.
Our type system is, to some extent, motivated by the algorithm of [7] solving the diagonal problem.
This algorithm works by repeating two kinds of transformations of HORSes.
The first of them turns the HORS into a HORS generating trees having only a fixed number of branches, one per each letter from Σ (i.e., one branch in our case of ∣Σ∣=1).
The branches are chosen nondeterministically out of some tree generated by the original HORS; for every a∈Σ there is a choice witnessing that a appeared many times in the original tree. Then such a HORS of the special form is turned into a HORS that is of order lower by one,
and generates trees having the same nodes as trees generated by the original HORS, but arranged differently (in particular, the new trees may have again arbitrarily many branches).
After finitely many repetitions of this procedure, a HORS of order [math] is obtained, and the diagonal problem becomes easily decidable.
In some sense we want to do the same, but instead of applying all these transformations one by one, we simulate all of them simultaneously in a single type derivation.
In this derivation, for each order n, we allow to place arbitrarily one marker “of order n”; this corresponds to the nondeterministic choice of one branch in the n-th step of the previous algorithm.
We also place some flags “of order n”, in places that correspond to nodes remaining after the n-th step of the previous algorithm.
The idea of using intersection types for counting is not completely new.
Paper [19] presents a type system that, essentially, allows to estimate the size of the β-normal form of a λ-term just by looking at (the number of some flags in) a derivation of a type for this term.
A similar idea, but for higher-order pushdown automata, is present in [20], where we can estimate the number of ♯ symbols appearing on a particular, deterministically chosen branch of the generated tree.
This previous approach also uses intersection types, where the derivations are marked with just one kind of flags, denoting “productive” places of a λ-term
(oppositely to our approach, where we have different flags for different orders, and we also have markers).
The trouble with the “one-flag” approach is that it works well only in a completely deterministic setting, where looking independently at each node of the Böhm tree we know how it contributes to the result;
the method stops working (or at least we do not know how to prove that it works) in our situation, where we first nondeterministically perform some guesses in the Böhm tree, and only after that we want to count something that depends on the chosen values.
Acknowledgements.
I would like to thank Szymon Toruńczyk for stimulating discussions, and anonymous reviewers for useful comments.
2 Preliminaries
Trees.
Let Σ be a ranked alphabet, i.e., a set of symbols together with a rank function assigning a nonnegative integer to each of the symbols.
We assume that Σ contains a distinguished symbol br of rank 2, used to denote nondeterministic choices.
A Σ-labeled tree is a tree that is rooted (there is a distinguished root node),
node-labeled (every node has a label from Σ),
ranked (a node with label of rank n has exactly n children),
and ordered (children of a node of rank n are numbered from 1 to n).
When t is a Σ-labeled tree t, by L(t) we denote the set of all finite trees that can be obtaining by choosing in every br-labeled node of t which of the two subtrees we want to consider.
More formally, we consider the following relation →br: we have t→bru if u can be obtained from t by choosing in t a br-labeled node x and its child y,
and replacing the subtree starting in x by the subtree starting in y (which removes x and the other subtree of x).
Let →br∗ be the reflexive transitive closure of →br.
Then L(t) contains all trees u that do not use the br label, are finite, and such that t→br∗u.
Infinitary λ-calculus.
The set of sorts (a.k.a. simple types), constructed from a unique basic sort o using a binary operation →, is defined as usual.
The order of a sort is defined by: ord(o)=0, and ord(α→β)=max(1+ord(α),ord(β)).
We consider infinitary, sorted λ-calculus.
Infinitary λ-terms (or just λ-terms) are defined by coinduction, according to the following rules:
if a∈Σ is a symbol of rank r, and P1o,…,Pro are λ-terms, then (aP1o…Pro)o is a λ-term,
for every sort α there are infinitely many variables xα,yα,zα,…; each of them is a λ-term,
if Pα→β and Qα are λ-terms, then (Pα→βQα)β is a λ-term, and
if Pβ is a λ-term and xα is a variable, then (λxα.Pβ)α→β is a λ-term.
We naturally identify λ-terms differing only in names of bound variables.
We often omit the sort annotations of λ-terms, but we keep in mind that every λ-term (and every variable) has a particular sort.
A λ-term P is closed if it has no free variables.
Notice that, for technical convenience, a symbol of positive rank is not a λ-term itself, but always comes with arguments.
This is not a restriction, since e.g. instead of a unary symbol a one may use the term λx.ax.
The order of a λ-term is just the order of its sort.
The complexity of a λ-term P is the smallest number m such that the order of every subterm of P is at most m.
We restrict ourselves to λ-terms that have finite complexity.
A β-reduction is defined as usual.
We say that a β-reduction P→βQ is of order n if it concerns a redex (λx.R)S such that ord(λx.R)=n.
In this situation the order of x is at most n−1, but may be smaller (when other arguments of R are of order n−1).
Böhm Trees.
We consider Böhm trees only for closed λ-terms of sort o.
For such a term P, its Böhm tree BT(P) is constructed by coinduction, as follows:
if there is a sequence of β-reductions from P to a λ-term of the form aP1…Pr (where a is a symbol),
then the root of the tree t has label a and r children, and the subtree starting in the i-th child is BT(Pi).
If there is no sequence of β-reductions from P to a λ-term of the above form, then BT(P) is the full binary tree with all nodes labeled by br.222Usually one uses a special label ⊥ of rank [math] for this purpose, but from the perspective of our problem both definitions are equivalent.
By L(P) we denote L(BT(P)).
λY-calculus.
The syntax of λY-calculus is the same as that of finite λ-calculus, extended by symbols Y(α→α)→α, for each sort α.
A term of λY-calculus is seen as a term of infinitary λ-calculus
if we replace each symbol Y(α→α)→α by the unique infinite λ-term Z such that Z is syntactically the same as λxα→α.x(Zx).
In this way, we view λY-calculus as a fragment of infinitary λ-calculus.
It is standard to convert a nondeterministic HORS G into a closed λY-term Po such that L(P) is exactly the set of all finite trees generated by G.
The following theorem, which is our main result, states that the language finiteness problem is decidable.
Theorem 1**.**
Given a closed λY-term P of sort o, one can decide whether L(P) is finite.
3 Intersection Type System
In this section we introduce a type system that allows to determine the desired property: whether in L(P) there is an arbitrarily large tree.
Intuitions.
The main novelty of our type system is in using flags and markers, which may label nodes of derivation trees.
To every flag and marker we assign a number, called an order.
While deriving a type for a λ-term of complexity m, we may place in every derivation tree at most one marker of each order n∈{0,…,m−1}, and arbitrarily many flags of each order n∈{0,…,m}.
Consider first a λ-term M0 of complexity [math].
Such a term actually equals its Böhm tree.
Our aim is to describe some finite tree t in L(M0), i.e., obtained from M0 by resolving nondeterministic choices in some way.
We thus just put flags of order [math] in all those (appearances of) symbols in M0 that contribute to this tree t;
the type system ensures that indeed all symbols of some finite tree in L(M0) are labeled by a flag.
Then clearly we have the desired property that there is a derivation with arbitrarily many flags if, and only if, there are arbitrarily large trees in L(M0).
Next, consider a λ-term M1 that is of complexity 1, and reduces to M0.
Of course every finite tree from L(M0) is composed of symbols appearing already in M1;
we can thus already in M1 label (by order-[math] flags) all symbols that contribute to some tree t∈L(M0) (and an intersection type system can easily check correctness of such labeling).
There is, however, one problem: a single appearance of a symbol in M1 may result in many appearances in M0 (since a function may use its argument many times).
Due to this, the number of order-[math] flags in M1 does not correspond to the size of t.
We rescue ourselves in the following way.
In t we choose one leaf, we label it by an order-[math] marker, and on the path leading from the root to this marker we place order-1 flags.
On the one hand, L(M0) contains arbitrarily large trees if, and only if, it contains trees with arbitrarily long paths, i.e., trees with arbitrarily many order-1 flags.
On the other hand, we can perform the whole labeling (and the type system can check its correctness) already in M1, and the number of order-1 flags in M1 will be precisely the same as it would be in M0.
Indeed, in M1 we have only order-1 functions, i.e., functions that take trees and use them as subtrees of larger trees;
although a tree coming as an argument may be duplicated, the order-[math] marker can be placed in at most one copy.
This means that, while reducing M1 to M0, every symbol of M1 can result in at most one symbol of M0 lying on the selected path to the order-[math] marker
(beside of arbitrarily many symbols outside of this path).
This procedure can be repeated for M2 of complexity 2 that reduces to M1 via β-reductions of order 2 (and so on for higher orders).
We now place a marker of order 1 in some leaf of M1;
afterwards, we place an order-2 flag in every node that is on the path to the marked leaf, and that has a child outside of this path whose some descendant is labeled by an order-1 flag.
In effect, for some choice of a leaf to be marked, the number of order-2 flags approximates the number of order-1 flags, up to logarithm.
Moreover, the whole labeling can be done in M2 instead of in M1, without changing the number of order-2 flags.
In this intuitive description we have talked about labeling “nodes of a λ-term”, but formally we label nodes of a derivation tree that derives a type for the term, in our type system.
Every such node contains a type judgment for some subterm of the term.
Type Judgments.
For every sort α we define the set Tα of types of sort α,
and the set Fα of full types of sort α.
This is done as follows, where P denotes the powerset:
[TABLE]
Notice that the sets Tα and Fkα are finite (unlike Fα).
A type (T,τ)∈Tα→β is denoted as T→τ.
A full type τ^=(k,F,M,τ)∈Fkα consists of its order k, a set F of flag orders, a set M of marker orders, and a type τ;
we write ord(τ^)=k.
In order to distinguish types from full types, the latter are denoted by letters with a hat, like τ^.
A type judgment is of the form Γ⊢P:τ^▹c, where Γ, called a type environment,
is a function that maps every variable xα to a subset of Fα,
P is a λ-term, τ^ is a full type of the same sort as P (i.e., τ^∈Fβ when P is of sort β), and c∈N.
As usual for intersection types, the intuitive meaning of a type T→τ is that a λ-term having this type can return a λ-term having type τ, while taking an argument for which we can derive all full types from T.
Moreover, in To there is just one type o, which can be assigned to every λ-term of sort o.
Suppose that we have derived a type judgment Γ⊢P:τ^▹c with τ^=(m,F,M,τ).
Then
τ is the type derived for P;
Γ contains full types that could be used for free variables of P in the derivation;
m bounds the order of flags and markers that could be used in the derivation: flags could be of order at most m, and markers of order at most m−1;
M⊆{0,…,m−1} contains the orders of markers used in the derivation, together with those provided by free variables
(i.e., we imagine that some derivations, specified by the type environment, are already substituted in our derivation for free variables);
we, however, do not include markers provided by arguments of the term (i.e., coming from the sets Ti when τ=T1→⋯→Tk→o);
F contains those numbers n∈{0,…,m−1} (excluding n=m) for which a flag of order n is placed in the derivation itself, or provided by a free variable, or provided by an argument;
for technical convenience we, however, remove n from F whenever n∈M
(when n∈M, the information about order-n flags results in placing an order-(n+1) flag, and need not to be further propagated);
c, called a flag counter, counts the number of order-m flags present in the derivation.
Type System.
Before giving rules of the type system, we need a few definitions.
We use the symbol ⊎ to denote disjoint union.
When A⊆N and n∈N, we write A↾<n for {k∈A∣k<n}, and similarly A↾≥n for {k∈A∣k≥n}.
By ε we denote the type environment mapping every variable to ∅,
and by Γ[x↦T] the type environment mapping x to T and every other variable y to Γ(y).
Let us now say how a type environment Γ from the conclusion of a rule may be split into type environments (Γi)i∈I used in premisses of the rule:
we say that Split(Γ∣(Γi)i∈I) holds if and only if for every variable x it holds Γi(x)⊆Γ(x) for every i∈I,
and every full type from Γ(x) providing some markers (i.e., (k,F,M,τ) with M=∅) appears in some Γi(x).
Full types with empty M may be discarded and duplicated freely.
This definition forbids to discard full types with nonempty M, and from elsewhere it will follow that they cannot be duplicated.
As a special case Split(Γ∣Γ′) describes how a type environment can be weakened.
All type derivations are assumed to be finite
(although we derive types mostly for infinite λ-terms, each type derivation analyzes only a finite part of a term).
Rules of the type system will guarantee that the order m of derived full types will be the same in the whole derivation (although in type environments there may be full types of different orders).
We are ready to give the first three rules of our type system:
[TABLE]
[TABLE]
We see that to derive a type for the nondeterministic choice brP1P2, we need to derive it either for P1 or for P2.
The (Var) rule allows to have in the resulting set M some numbers that do not come from the set M′ assigned to x by the type environment; these are the orders of markers placed in the leaf using this rule.
Notice, however, that we allow here only orders not smaller than k (which is the order of the superterm λx.P binding this variable x).
This is consistent with the intuitive description of the type system (page 3),
which says that a marker of order n can be put in a place that will be a leaf after performing all β-reductions of orders greater than n.
Indeed, the variable x remains a leaf after performing β-reductions of orders greater than k, but while performing β-reductions of order k this leaf will be replaced by a subterm substituted for x.
Recall also that, by definition of a type judgment, we require that (k,F,M′,τ)∈Fkα and (m,F,M,τ)∈Fmα, for appropriate sort α;
this introduces a bound on maximal numbers that may appear in the sets F and M.
Example 1**.**
Denoting ρ^1=(1,∅,{0},o) we can derive:
[TABLE]
In the derivation on the right, the marker of order 1 is placed in the conclusion of the rule.
The (λ) rule allows to use (in a subderivation concerning the λ-term P) the variable x with all full types given in the set T.
When the sort of λx.P is α→β, by definition of Tα→β we have that all full types in T have the same order k=ord(α→β) (since (T→τ)∈Tα→β).
Recall that we intend to store in the set M the markers contained in the derivation itself and those provided by free variables, but not those provided by arguments.
Because of this, in the conclusion of the rule we remove from M the markers provided by x.
This operation makes sense only because there is at most one marker of each order, so markers provided by x cannot be provided by any other free variable nor placed in the derivation itself.
The set F, unlike M, stores also flags provided by arguments, so we do not need to remove anything from F.
Example 2**.**
The (λ) rule can be used, e.g., in the following way (where a is a symbol of rank 1):
[TABLE]
Notice that in the conclusion of the rule, in both examples, we remove [math] from the set of marker orders, because the order-[math] marker is provided by x.
The next two rules use a predicate Compm, saying how flags and markers from premisses contribute to the conclusion.
It takes “as input” pairs (Fi,ci) for i∈I; each of them consists of the set of flag orders Fi and of the flag counter ci from some premiss.
Moreover, the predicate takes a set of marker orders M from the current type judgment (it contains orders of markers used in the derivation, including those provided by free variables).
The goal is to compute the set of flag orders F and the flag counter c that should be placed in the current type judgment.
First, for each n∈{1,…,m} consecutively, we decide whether a flag of order n should be placed on the current type judgment.
We follow here the rules mentioned in the intuitive description.
Namely, we place a flag of order n if we are on the path leading to the marker of order n−1 (i.e., if n−1∈M), and simultaneously we receive an information about a flag of order n−1.
By receiving this information we mean that either a flag of order n−1 was placed on the current type judgment, or n−1 belongs to some set Fi.
Actually, we place multiple flags of order n: one per each flag of order n−1 placed on the current type judgment, and one per each set Fi containing n−1.
Then, we compute F and c.
In c we store the number of flags of the maximal order m: we sum all the numbers ci, and we add the number of order-m flags placed on the current type judgment.
In F we keep elements of all Fi, and we add the orders n of flags that were placed on the current type judgment.
We, however, remove from F all elements of M.
This is because every flag of some order n−1 should result in creating at most one flag of order n, in the closest ancestor that lies on the path leading to the marker of order n−1.
If we have created an order-n flag on the current type judgment, i.e., if n−1∈M, we do not want to do this again in the parent.
Below we give a formal definition, in which fn′ contains the number of order-n flags placed on the current type judgment,
while fn additionally counts the number of premisses for which n∈Fi.
We say that Compm(M;((Fi,ci))i∈I)=(F,c) when
[TABLE]
We now present a rule for constants other than br:
[TABLE]
Here, the conditions in the second line say that in a node using the (Con) rule we always place a flag of order [math] (via F′ or via c′, depending on m),
and that if the node is a leaf (i.e., r=0), then we are allowed to place markers of arbitrary order (via M′).
Then to the Compm predicate, beside of pairs (Fi,ci) coming from premisses, we also pass the information (F′,c′) about the order-[math] flag placed in the current node;
this predicate decides whether we should place also some flags of positive orders.
Let us emphasize that in this rule (and similarly in the next rule) we have a disjoint union M′⊎M1⊎⋯⊎Mr,
which ensures that a marker of any order may be placed only in one node of a derivation.
Example 3**.**
The (Con) rule may be instantiated in the following way:
[TABLE]
In the left example, flags of order [math] and 1 are placed in the conclusion of the rule
(a flag of order [math] is created because we are in a constant; since the marker of order [math] is visible, we do not put [math] into the set of flag orders, but instead we create a flag of order 1).
In the right example, a marker of order 1 is visible, which causes that this time flags of order [math], 1, and 2 are placed in the conclusion of the (Con) rule
(again, we do not put [math] nor 1 into the set of flag orders, because of [math] and 1 in the set of marker orders).
The next rule describes application:
[TABLE]
In this rule, it is allowed (but in fact useless) that for two different i∈I the full types (m,Fi,Mi,τi) are equal.
It is also allowed that I=∅, in which case no type needs to be derived for Q.
Observe how flags and markers coming from premisses concerning Q are propagated: only flags and markers of order n<ord(P) are visible to P, while only flags of order n≥ord(P) are passed to the Compm predicate.
This can be justified if we recall the intuitions staying behind the type system (see page 3).
Indeed, while considering flags and markers of order n, we should imagine the λ-term obtained from the current λ-term by performing all β-reductions of all orders greater than n;
the distribution of flags and markers of order n in the current λ-term actually simulates their distribution in this imaginary λ-term.
Thus, if n<ord(P), then our application will disappear in this imaginary λ-term, and Q will be already substituted somewhere in P;
for this reason we need to pass the information about flags and markers of order n from Q to P.
Conversely, if n≥ord(P), then in the imaginary λ-term the considered application will be still present,
and in consequence the subterm corresponding to P will not see flags and markers of order n placed in the subterm corresponding to Q.
Example 4**.**
Denote by τ^f and τ^m the types derived in Example 2:
[TABLE]
Then, using the (@) rule, we can derive (where e is a symbol of rank [math], and f a variable):
[TABLE]
Recall that ρ^1=(1,∅,{0},o).
In the conclusion of the (@) rule the information about a flag of order 1 (from the second premiss) meets the information about the marker of order 1 (from the first premiss),
and thus a flag of order 2 is placed, which increases the flag counter.
Notice that we have discarded the full type τ^f assigned to f in the type environment;
this is allowed because τ^f provides no markers (equally well τ^f could be assigned to f also in one or two of the premisses, and discarded there).
On the other hand, the full type τ^m provides markers, so it cannot be discarded nor duplicated (in particular, we could not pass it to the conclusion of the (Con) rule).
The key property of the type system is described by the following theorem.
Theorem 2**.**
Let P be a closed λ-term of sort o and complexity m.
Then L(P) is infinite if and only if for arbitrarily large c we can derive ε⊢P:ρ^m▹c, where ρ^m=(m,∅,{0,…,m−1},o).
The left-to-right implication of Theorem 2 (completeness of the type system) is shown in Section 4, while the opposite implication (soundness of the type system) in Section 5.
In Section 6 we discuss how Theorem 1 follows from Theorem 2.
Before all that, we give a few more examples of derivations, illustrating the type system and Theorem 2.
Example 5**.**
In this example we analyze the λ-term P1=R(λx.ax), where R is defined by coinduction as R=(λf.br(fe)(R(λx.f(fx)))).
As previously, a and e are symbols of rank 1 and [math], respectively.
In L(P1) there are trees that consist of a branch of a symbols ended with an e symbol, but only those where the number of a symbols is 2k for some k∈N.
Notice that the complexity of P1 is 2.
Continuing Example 4, we derive the full type σ^R=(2,∅,{0},{τ^f,τ^m}→o) for R:
[TABLE]
Next, we derive the same full type for R, but using the second argument of the br symbol; this results in greater values of the flag counter.
We start by deriving the full type τ^f for the subterm λx.f(fx):
[TABLE]
In the above derivation there are no flags nor markers.
Next, we derive τ^m for the same subterm:
[TABLE]
Below the lower (@) rule the information about a flag of order 1 meets the information about the marker of order 1, and thus a flag of order 2 is placed, which increases the flag counter.
We continue with the λ-term R:
[TABLE]
In this fragment of a derivation no flag nor marker is placed.
In particular, there is no order-2 flag in conclusion of the (@) rule, although its second premiss provides a flag of order 1 while the third premiss provides the marker of order 1.
We recall from the definition of the (@) rule that the information about flags and markers coming from the arguments is divided into two parts.
Numbers smaller than the order of the operator (ord(R)=2 in our case) are passed to the operator, while only greater numbers (≥2 in our case) contribute in creating new flags via the Comp predicate.
By composing the above fragments of a derivation, we can derive ε⊢R:σ^R▹c for every c≥1.
Recall that in Examples 1-3 we have derived ε⊢λx.ax:τ^f▹0 and ε⊢λx.ax:τ^m▹1.
Together with the above, this allows to derive for P1 the full type ρ^2=(2,∅,{0,1},o) (appearing in Theorem 2):
[TABLE]
We can notice a correspondence between a derivation with flag counter c+1 and a tree in L(P) of size 2c−1+1.
We remark that in every of these derivations only three flags of order [math] and only three flags of order 1 are present, in the three nodes using the (Con) rule.
Example 6**.**
Consider a similar λ-term P2=R(λx.bxx), where R is as previously, and b is a symbol of rank 2.
In L(P2) we have, for every k∈N, a full binary tree in which every branch consist of 2k symbols b and ends with an e symbol.
This time for the subterm λx.bxx we need to derive three full types:
[TABLE]
The last one is derived with flag counter 1.
Notice that τ^f′ and τ^m′ need now two full types for the argument x; the new one (1,{0},∅,o) describes the subtree that is not on the path to the order-[math] marker.
We also have a new full type τ^0′ that describes the use of λx.bxx outside of the path to the order-[math] marker.
Then, similarly as in the previous example, for every c≥1 we can derive ε⊢R:σ^R′▹c,
where σ^R′=(2,∅,{0},{τ^0′,τ^f′,τ^m′}→o).
Again, this allows to derive ε⊢P2:ρ^2▹c+1.
This time a derivation with flag counter c+1 corresponds to a tree in L(P) of size 2h−1 with h=2c−1+1.
Example 7**.**
Next, consider the λ-term P3=R(λx.x).
The only tree in L(P3) consists of a single e node.
Let us see how the derivation from Example 5 has to be modified.
The full type τ^m can still be derived for λx.x (although with flag counter [math] now),
but instead of τ^f we have to use τ^f′′=(2,∅,∅,{ρ^1}→o) that provides no flag of order 1:
[TABLE]
Next, for R we want to derive the full type σ^R′′=(2,∅,{0},{τ^f′′,τ^m}→o).
We can easily adopt every of the previous derivations for ε⊢R:σ^R▹c: we basically replace every τ^f by τ^f′′.
The key point is that while deriving the full type τ^m for the subterm λx.f(fx), previously in the lower (@) rule we have received information about an order-1 flag,
and thus we have created an order-2 flag and increased the flag counter;
this time there is no information about an order-1 flag, and thus we do not create an order-2 flag and do not increase the flag counter.
In consequence, even if this part of the derivation is repeated arbitrarily many times, the value of the flag counter of the whole derivation remains 1.
Example 8**.**
Finally, consider the λ-term P4=(λg.P3)(λx.a(a(…(ax)…)), which β-reduces to P3.
Notice that we can create the following derivation:
[TABLE]
Every (Con) rule used in this derivation places in its conclusion an order-[math] flag and an order-1 flag.
This derivation can be used as a part of a derivation for P4:
[TABLE]
Because τ^f provides no markers, it can be removed from the type environment and thus for P3 we can use the derivation from the previous example.
We thus obtain a derivation for P4 in which there are many order-[math] and order-1 flags (but only one flag of order 2).
This shows that in the flag counter we indeed need to count only the number of flags of the maximal order (not, say, the total number of flags of all orders).
4 Completeness
The proof of the left-to-right implication of Theorem 2 is divided into the following three lemmata.
Recall that a β-reduction P→βQ is of order n if it concerns a redex (λx.R)S such that ord(λx.R)=n.
The number of nodes of a tree t is denoted ∣t∣.
As in Theorem 2, we denote ρ^m=(m,∅,{0,…,m−1},o).
Lemma 3**.**
Let P be a closed λ-term of sort o and complexity m, and let t∈L(P).
Then there exist λ-terms Qm,Qm−1,…,Q0 such that P=Qm, and for every k∈{1,…,m} the term Qk−1 can be reached from Qk using only β-reductions of order k,
and we can derive ε⊢Q0:ρ^0▹∣t∣.
Lemma 4**.**
Suppose that we can derive ε⊢P:ρ^m▹c.
Then we can also derive ε⊢P:ρ^m+1▹c′ for some c′≥log2c.
Lemma 5**.**
Suppose that P→βQ is a β-reduction of order m, and we can derive Γ⊢Q:τ^▹c with ord(τ^)=m.
Then we can also derive Γ⊢P:τ^▹c.
Now the left-to-right implication of Theorem 2 easily follows.
Indeed, take a closed λ-term P of sort o and complexity m such that L(P) is infinite, and take any c∈N.
By log2k we denote the k-fold application of the logarithm: log20x=x and log2k+1x=log2(log2kx).
Since L(P) is infinite, it contains a tree t so big that log2m∣t∣≥c.
We apply Lemma 3 to this tree, obtaining λ-terms Qm,Qm−1,…,Q0 and a derivation of ε⊢Q0:ρ^0▹∣t∣.
Then repeatedly for every k∈{1,…,m} we apply Lemma 4, obtaining a derivation of ε⊢Qk−1:ρ^k▹ck for some ck≥log2k∣t∣,
and Lemma 5 for every β-reduction (of order k) between Qk and Qk−1, obtaining a derivation of ε⊢Qk:ρ^k▹ck.
We end with a derivation of ε⊢P:ρ^m▹cm, where cm≥log2m∣t∣≥c, as needed.
In the remaining part of this section we prove the three lemmata.
Proof of Lemma 3 (sketch).
Recall that t∈L(P) is a finite tree, thus it can be found in some finite prefix of the Böhm tree of P.
By definition, this prefix will be already expanded after performing some finite number of β-reductions from P.
We need to observe that these β-reductions can be rearranged, so that those of higher order are performed first.
The key point is to observe that when we perform a β-reduction of some order k, then no new β-redexes of higher order appear in the term.
Indeed, suppose that (λx.R)S is changed into R[S/x] somewhere in a term, where ord(λx.R)=k.
One new redex that may appear is when R starts with a λ, and to the whole R[S/x] some argument is applied; this redex is of order ord(R)≤k.
Some other redexes may appear when S starts with a λ, and is substituted for such appearance of x to which some argument is applied; but this redex is of order ord(S)<k.
We can thus find a sequence of β-reductions in which β-reductions are arranged according to their order, that leads from P to some Q0 such that t can be found in the prefix of Q0 that is already expanded to a tree.
It is now a routine to use the rules of our type system and derive ε⊢Q0:ρ^0▹∣t∣:
in every br-labeled node we choose the subtree in which t continues, and this effects in counting the number of nodes of t in the flag counter.
∎
Proof of Lemma 4.
Consider some derivation of ε⊢P:ρ^m▹c.
In this derivation we choose a leaf in which we will put the order-m marker, as follows.
Starting from the root of the derivation, we repeatedly go to this premiss in which the flag counter is the greatest (arbitrarily in the case of a tie).
In every node that is not on the path to the selected leaf, we replace the current type judgment Γ⊢Q:(m,F,M,τ)▹d by Γ⊢Q:(m+1,F′,M,τ)▹0,
where F′=F∪{m} if d>0, and F′=F otherwise.
In the selected leaf and all its ancestors, we change the order from m to m+1, we add m to the set of marker orders, and we recalculate the flag counter.
Let us see how such transformation changes the flag counter on the path to the selected leaf.
We will prove (by induction) that the previous value d and the new value d′ of the flag counter in every node on this path satisfy d′≥log2d.
In the selected leaf itself, the flag counter (being either [math] or 1) remains unchanged; we have d′=d≥log2d.
Next, consider any proper ancestor of the selected node.
Let k be the number of those of its children in which the flag counter was positive, plus the number of order-m flags placed in the considered node itself.
Let also dmax and dmax′ be the previous value and the new value of the flag counter in this child that is in the direction of the selected leaf.
By construction, the flag counter in this child was maximal, which implies k⋅dmax≥d, while by the induction assumption dmax′≥log2dmax.
To d′ we take the flag counter only from the special child, while for other children with positive flag counter we add 1, i.e., d′=k−1+dmax′.
Altogether we obtain d′=k−1+dmax′≥k−1+log2dmax≥log2(k⋅dmax)≥log2d, as required.
∎
Proof of Lemma 5.
We consider the base case when P=(λx.R)S and Q=R[S/x]; the general situation (redex being deeper in P) is easily reduced to this one.
In the derivation of Γ⊢Q:τ^▹c we identify the set I of places (nodes) where we derive a type for S substituted for x.
For i∈I, let Σi⊢S:σ^i▹di be the type judgment in i.
We change the nodes in I into leaves, where we instead derive ε[x↦{σ^i}]⊢x:σ^i▹0.
It should be clear that we can repair the rest of the derivation, by changing type environments, replacing S by x in λ-terms, and decreasing flag counters.
In this way we obtain derivations of Σi⊢S:σ^i▹di for every i∈I, and a derivation of Σ′⊢R:τ^▹d,
where Σ′=Σ[x↦{σ^i∣i∈I}] with Σ(x)=∅,
and Split(Γ∣Σ,(Σi)i∈I), and c=d+Σi∈Idi.
To the latter type judgment we apply the (λ) rule, and then we merge it with the type judgments for S using the (@) rule, which results in a derivation for Γ⊢P:τ^▹c.
We remark that different i∈I may give identical type judgments for S (as long as the set of markers in σ^i is empty); this is not a problem.
The (@) rule requires that ord(σ^i)=ord(λx.R); we have that ord(σ^i)=ord(τ^), and ord(τ^)=m=ord(λx.R) by assumption.
∎
5 Soundness
In this section we sketch the proof of the right-to-left implication of Theorem 2.
We, basically, need to reverse the proof from the previous section.
The following new fact is now needed.
Lemma 6**.**
If we can derive Γ⊢P:(m,F,M,τ)▹c with m−1∈M and ord(P)≤m−1, then c=0.
A simple inductive proof is based on the following idea:
flags of order m are created only when a marker of order m−1 is visible;
the derivation itself (together with free variables) does not provide it (m−1∈M), and the arguments, i.e. sets T1,…,Tk in τ=T1→⋯→Tk→o,
may provide only markers of order at most ord(P)−1≤m−2 (see the definition of a type), thus no flags of order m can be created.
We say that a λ-term of the form PQ is an application of order n when ord(P)=n, and that an (@) rule is of order n if it derives a type for an application of order n.
We can successively remove applications of the maximal order from a type derivation.
Lemma 7**.**
Suppose that ε⊢P:ρ^m▹c for m>0 is derived by a derivation D in which the (@) rule of order m is used n times.
Then there exists Q such that P→βQ and ε⊢Q:ρ^m▹c can be derived by a derivation D′ in which the (@) rule of order m is used less than n times.
Recall from the definition of the type system that the (@) rule of orders higher than m cannot be used while deriving a full type of order m.
Thus in D we have type judgments only for subterms of P of order at most m (although P may also have subterms of higher orders),
and in type environments we only have variables of order at most m−1.
In order to prove Lemma 7 we choose in P a subterm RS with ord(R)=m such that there is a type judgment for RS in some nodes of D (at least one),
but no descendants of those nodes use the (@) rule of order m.
Since R is of order m, it cannot be an application (then we would choose it instead of RS) nor a variable; thus R=λx.R′.
We obtain Q by reducing the redex (λx.R′)S; the derivation D′ is obtained by performing a surgery on D similar to that in the proof of Lemma 5 (but in the opposite direction).
Notice that every full type (m,F,M,τ) (derived for S) with nonempty M is used for exactly one appearance of x in the derivation for R′;
full types with empty M may be used many times, or not used at all, but thanks to Lemma 6 duplicating or removing the corresponding derivations for S does not change the flag counter.
In the derivations for R′[S/x] no (@) rule of order m may appear, and the application RS disappears, so the total number of (@) rules of order m decreases.
When all (@) rules of order m are eliminated, we can decrease m.
Lemma 8**.**
Suppose that ε⊢P:ρ^m▹c for m>0 is derived by a derivation D in which the (@) rule of order m is not used.
Then we can also derive ε⊢P:ρ^m−1▹c′ for some c′≥c.
The proof is easy; we simply decrease the order m of all derived full types by 1, and we ignore flags of order m and markers of order m−1.
To obtain the inequality c′≥c we observe that when no (@) rule of order m is used, the information about flags of order m−1 goes only from descendants to ancestors,
and thus every flag of order m is created because of a different flag of order m−1.
By repeatedly applying the two above lemmata, out of a derivation of ε⊢P:ρ^m▹c we obtain a derivation of ε⊢Q:ρ^0▹c′, where P→β∗Q and c′≥c.
Since ρ^0 is of order [math], using the latter derivation it is easy to find in the already expanded part of Q (and thus in L(Q)=L(P)) a tree t such that ∣t∣=c′≥c.
6 Effectiveness
Finally, we show how Theorem 1 follows from Theorem 2, i.e., how given a λY-term P of complexity m we can check whether ε⊢P:ρ^m▹c can be derived for arbitrarily large c.
We say that two type judgments are equivalent if they differ only in the value of the flag counter.
Let us consider a set D of all derivations of ε⊢P:ρ^m▹c in which on each branch (i.e., each root-leaf path) there are at most three type judgments from every equivalence class,
and among premisses of each (@) rule there is at most one type judgment from every equivalence class.
These derivations use only type judgments Γ⊢Q:τ^▹d with Q being a subterm of P and with Γ(x)=∅ only for variables x appearing in P.
Since a finite λY-term, even when seen as an infinitary λ-term, has only finitely many subterms,
this introduces a common bound on the height of all derivations in D, and on their degree (i.e., on the maximal number of premisses of a rule).
It follows that there are only finitely many derivations in D, and thus we can compute all of them.
We claim that ε⊢P:ρ^m▹c can be derived for arbitrarily large c if and only if in D there is a derivation in which on some branch
there are two equivalent type judgments with different values of the flag counter (and the latter condition can be easily checked).
Indeed, having such a derivation, we can repeat its fragment between the two equivalent type judgments,
obtaining derivations of ε⊢P:ρ^m▹c with arbitrarily large c.
We use here an additivity property of our type system: if out of Γ⊢Q:τ^▹d we can derive Γ′⊢Q′:τ^′▹d′,
then out of Γ⊢Q:τ^▹d+k we can derive Γ′⊢Q′:τ^′▹d′+k, for every k≥−d.
Conversely, take a derivation of ε⊢P:ρ^m▹c for some large enough c.
Suppose that some of its (@) rules uses two equivalent premisses.
These premisses concern the argument subterm, which is of smaller order than the operator subterm, and thus of order at most m−1.
The set of marker orders in these premisses has to be empty, as the sets of marker orders from all premisses have to be disjoint.
Thus, by Lemma 6, the flag counter in our two premisses is [math].
In consequence, we can remove one of the premisses, without changing anything in the remaining part of the derivation, even the flag counters.
In this way we clean the whole derivation, so that at the end among premisses of each (@) rule there is at most one type judgment from every equivalence class.
The degree is now bounded, and at each node the flag counter grows only by a constant above the sum of flag counters from the children.
Thus, if c is large enough, we can find on some branch two equivalent type judgments with different values of the flag counter.
Then, for some pairs of equivalent type judgments, we remove the part of the derivation between these type judgments (and we adopt appropriately the flag counters in the remaining part).
It it not difficult to perform this cleaning so that the resulting derivation will be in D, and simultaneously on some branch there will remain two equivalent type judgments with different values of the flag counter.
7 Conclusions
In this paper, we have shown an approach for expressing quantitative properties of Böhm trees using an intersection type system, on the example of the finiteness problem.
It is an ongoing work to apply this approach to the diagonal problem, which should give a better complexity than that of the algorithm from [7].
Another ongoing work is to obtain an algorithm for model checking Böhm trees with respect to the Weak MSO+U logic [4].
This logic extends Weak MSO by a new quantifier U, expressing that a subformula holds for arbitrarily large finite sets.
Furthermore, it seems feasible that our methods may help in proving a pumping lemma for nondeterministic HORSes.
Appendix A Proof of Lemma 3
Let us write P≈nP′ if the λ-terms P and P′ agree up to depth n∈N.
Formally, ≈n is defined by induction on n as the smallest equivalence relation such that:
P≈0Q for all λ-terms P,Q,
aP1…Pr≈naP1′…Pr′ if Pi≈n−1Pi′ for all i∈{1,…,r},
PQ≈nP′Q′ if P≈n−1P′ and Q≈n−1Q′, and
λx.P≈nλx.P′ if P≈n−1P′.
Observe that P≈nP′ implies P≈kP′ for k<n.
We split the proof of Lemma 3 into several lemmata.
Lemma 9**.**
Let P be a closed λ-term of sort o, and let t∈L(P).
Then there exists a number n∈N and a λ-term Q such that P→β∗Q, and whenever Q≈nQ′ and Q′→β∗Q′′ for some λ-terms Q′, Q′′, then one can derive
ε⊢Q′′:ρ^0▹∣t∣.
Proof.
Let →brk be the relation obtained by composing →br with itself k times.
By definition of L(P) we have that BT(P)→br∗t, and thus BT(P)→brkt for some k∈N.
We prove the lemma by induction on ∣t∣+k, where k is the smallest number such that BT(P)→brkt.
Because L(P)=∅, we have that P→β∗aP1…Pr (where possibly a=br).
Then BT(P) has a in its root, and the subtrees starting in root’s children are BT(P1),…,BT(Pr).
We have two cases.
Suppose first that a=br (this case serves as the induction base when r=0).
Then also t has a in its root, and the subtrees t1,…,tr starting in root’s children are such that BT(Pi)→brkiti for all i∈{1,…,r},
where k1+⋯+kr=k.
We have ∣ti∣+ki<∣t∣+k, since ∣ti∣<∣t∣.
By the induction assumption, for every i∈{1,…,r} we obtain a number ni and a λ-term Qi
such that Pi→β∗Qi, and whenever Qi≈niQi′ and Qi′→β∗Qi′′ for some λ-terms Qi′, Qi′′, then one can derive ε⊢Qi′′:ρ^0▹∣ti∣.
Taking Q=aQ1…Qr we have P→β∗aP1…Pr→β∗aQ1…Qr=Q.
As n we take 1+n1+⋯+nr.
Let now Q′ and Q′′ be such that Q≈nQ′ and Q′→β∗Q′′.
Then Q′=aQ1′…Qr′ and Q′′=aQ1′′…Qr′′, where Qi≈n−1Qi′ (thus also Qi≈niQi′) and Qi′→β∗Qi′′ for i∈{1,…,r}.
From the induction assumption we obtain derivations of ε⊢Qi′′:ρ^0▹∣ti∣.
Recall that ρ^0=(0,∅,∅,o).
We apply the (Con) rule to these derivations.
Since ord(ρ^)=0, the pair (F′,c′) appearing in the rule’s definition equals (∅,1).
The Comp0 predicate simply adds flag counters from its arguments, and hence the resulting flag counter is 1+∣t1∣+⋯+∣tr∣, which equals ∣t∣.
Thus the resulting type judgment is ε⊢Q′′:ρ^0▹∣t∣, as required.
Next, suppose that a=br (and hence r=2).
It should be clear that in the shortest reduction sequence BT(P)→brkt we can rearrange reductions (without increasing their number)
so that we first eliminate the br symbol from the root of BT(P).
In other words, we have BT(P)→brBT(Pi)→brk−1t, for some i∈{1,2}.
Let us focus our attention on the case of i=1; the case of i=2 is completely symmetric.
Since ∣t∣+k−1<∣t∣+k, from the induction assumption we obtain a number n1 and a λ-term Q1
such that P1→β∗Q1, and whenever Q1≈nQ1′ and Q1′→β∗Q1′′ for some λ-terms Q1′, Q1′′, then one can derive ε⊢Q1′′:ρ^0▹∣t∣.
Taking Q=brQ1P2 we have P→β∗brP1P2→β∗brQ1P2=Q.
As n we take 1+n1.
Let now Q′, Q′′ be such that Q≈nQ′ and Q′→β∗Q′′.
Then Q′=brQ1′Q2′ and Q′′=brQ1′′Q2′′, where Q1≈n−1Q1′ and Q1′→β∗Q1′′.
By the induction assumption we can derive ε⊢Q1′′:ρ^0▹∣t∣, which after applying the (Br) rule gives ε⊢Q′′:ρ^0▹∣t∣.
∎
Lemma 10**.**
If P≈nP′ and Q≈nQ′ for some n∈N, then also P[Q/x]≈nP′[Q′/x].
Proof.
Induction on n.
For n=0 the lemma is obvious: ≈0 always holds.
When n>0 and P=RS, then P′=R′S′ with R≈n−1R′ and S≈n−1S′.
By the induction assumption we have R[Q/x]≈n−1R′[Q′/x] and S[Q/x]≈n−1S′[Q′/x], and thus P[Q/x]≈nP′[Q′/x].
The cases when P=aP1…Pr or P=λy.Q are similar.
Finally, when P=P′ is a variable, the thesis follows immediately from Q≈nQ′.
∎
Lemma 11**.**
If P≈n+2P′ and P→βQ, then for some Q′ we have P′→β∗Q′ and Q≈nQ′.
Proof.
Induction on n.
If n=0, the thesis holds for Q′=P′.
Suppose that n>0 and P=(λx.R)S and Q=R[S/x].
Then P′=(λx.R′)S′, where R≈nR′ and S≈n+1S′.
Taking Q′=R′[S′/x] we have P′→βQ′, and by Lemma 10 Q≈nQ′.
The remaining case is that n>0 and the redex involved in the β-reduction P→βQ is not located on the front of P.
Then the thesis follows from the induction assumption.
Let us consider only a representative example: suppose that P=RS, and Q=TS, and R→βT.
In this case P′=R′S′ with R≈n+1R′ and S≈n+1S′.
The induction assumption gives us T′ such that R′→β∗T′ and T≈n−1T′.
Thus for Q′=T′S′ we have P′→β∗Q′ and Q≈nQ′.
∎
Lemma 12**.**
For every n∈N, we can represent every λ-term P as P=P′[S1/x1,…,Ss/xs] so that P′ is finite and P≈nP′.
Proof.
Induction on n.
For n=0 we represent P=x[S/x], and clearly x≈0P.
For n>0 we consider the representative case of P=QR; for other forms of P the proof is similar.
The induction assumption gives us representations Q=Q′[S1/x1,…,Sr/xr] and R=R′[Sr+1/xr+1,…,Ss/xs]
with Q′,R′ finite and such that Q≈n−1Q′ and R≈n−1R′.
W.l.o.g. we can assume that the fresh variables x1,…,xr are not free in R′, and the fresh variables xr+1,…,xs are not free in Q′.
Then, for P′=Q′R′ we have P≈nP′ and P=P′[S1/x1,…,Ss/xs].
∎
Corollary 13**.**
Let n∈N, and let P, Q be λ-terms such that P→β∗Q.
Then we can represent P as P=P′[S1/x1,…,Ss/xs] so that P′ is finite, and for some λ-term Q′ it holds P′→β∗Q′ and Q≈nQ′.
Proof.
Let us write P=P0→βP1→β⋯→βPl=Q.
By Lemma 12 we can write P=P′[S1/x1,…,Ss/xs] so that P′ is finite and P≈n+2lP′.
Take P0′=P′.
Consecutively for all i∈{1,…,l} Lemma 11 gives us a λ-term Pi′ such that Pi−1′→β∗Pi′ and Pi≈n+2(l−i)Pi′.
At the end, for Q′=Pl′ we have P′→β∗Q′ and Q≈nQ′.
∎
Below, a λ-term (λx.P)Q is called a β-redex of order k if k=ord(λx.P).
Lemma 14**.**
Let k∈N, and let P be a λ-term without β-redexes of orders higher than k (as subterms).
If P→βQ is a β-reduction of order k, then also in Q there are no β-redexes of order higher than k.
Proof.
This lemma was already justified on page 4, but let us repeat.
Let P′ be obtained from P by replacing by y the β-redex (λx.R)S involved in the β-reduction P→βQ.
Then we have P=P′[(λx.R)S/y] and Q=P′[R[S/x]/y].
Suppose that Q has a β-redex of order higher than k, i.e., a subterm (λz.T)U with ord(λz.T)>k; we want to prove that this is impossible.
The subterm (λz.T)U, like every subterm of Q, can be found in one of the following three places:
Possibly (λz.T)U is a subterm of S.
This is impossible, because by assumption P (and thus also its subterm S) contains no β-redexes of orders higher than k.
Possibly (λz.T)U=V[S/x] for a subterm V of R, where V=x.
Then V has to be an application V=WX, with ord(W)=ord(λz.T)>k.
We have W=x, because ord(x)<ord(λx.R)=k.
Thus W is a λ-abstraction, with V being itself a β-redex of order higher than k, which is again impossible by assumption.
Otherwise (λz.T)U=V[R[S/x]/y] for a subterm V of P′, where V=y.
Again, V has to be an application V=WX, with ord(W)=ord(λz.T)>k.
We have W=y, because ord(y)=ord(R)≤ord(λx.R)=k.
Thus W is a λ-abstraction, with V being itself a β-redex of order higher than k;
this is impossible, since V[(λx.R)S/y] is a subterm of P and also a β-redex of order higher than k.∎
Lemma 15**.**
Let P′ be a finite λ-term of complexity at most m.
Then there exist λ-terms Qm′,Qm−1′,…,Q0′ such that P′=Qm′, and for every k∈{1,…,m} the term Qk−1′ can be reached from Qk′ using only β-reductions of order k,
and Q0′ is in β-normal form.
Proof.
Take Qm′=P′.
Then, for k=m,…,1, consecutively, out of Qk′ we perform β-reductions of order k as long as possible, and as Qk−1′ we take the resulting λ-term,
from which no more β-reductions of order k are possible.
Every such sequence of β-reductions finally ends, because P′ is finite.
For every k∈{1,…,m} Lemma 14 ensures that in Q0′ there are no β-redexes of order k,
because all β-reductions between Qk−1′ and Q0′ were of orders smaller than k.
Moreover, because the complexity of a λ-term cannot grow during β-reductions, the complexity of Q0′ is at most m, and hence it has no β-redexes of order higher than m.
Thus Q0′ is in β-normal form.
∎
Proof of Lemma 3.
Recall that in this lemma we are given a closed λ-term P of sort o and complexity m, and some t∈L(P),
and our goal is to exhibit λ-terms Qm,Qm−1,…,Q0 such that P=Qm, and for every k∈{1,…,m} the term Qk−1 can be reached from Qk using only β-reductions of order k,
and we can derive ε⊢Q0:ρ^0▹∣t∣.
We first apply Lemma 9 to P and t, obtaining a number n and a λ-term Q such that P→β∗Q.
Then, we apply to them Corollary 13, obtaining a representation P=P′[S1/x1,…,Ss/xs] for finite P′, and a λ-term Q′.
Notice that the complexity of P′ cannot be higher than that of P, as for every subterm R of P′, the term R[S1/x1,…,Ss/xs] is a subterm of P, and has the same order as R.
We then apply Lemma 15 to P′, obtaining λ-terms Qm′,Qm−1′,…,Q0′.
As Qk we take Qk′[S1/x1,…,Ss/xs], for k∈{0,…,m}.
We have Qm=P since Qm′=P′.
For every k∈{1,…,m} Lemma 15 gives us a sequence of β-reduction of order k from Qk′ to Qk−1′.
After performing the substitution [S1/x1,…,Ss/xs] to every λ-term in this sequence, all β-reductions in this sequence remain correct β-reductions of order k,
and now the sequence leads from Qk to Qk−1.
Finally, by Corollary 13 we have P′→β∗Q′ and Q≈nQ′.
Since, by Lemma 15, the β-normal form of P′ is Q0′, we also have Q′→β∗Q0′.
Thus by Lemma 9 (where we take Q0′ as Q′′) we obtain a derivation of ε⊢Q0′:ρ^0▹∣t∣.
To every λ-term in this derivation we apply the substitution [S1/x1,…,Ss/xs], obtaining a derivation of ε⊢Q0:ρ^0▹∣t∣.
The derivation remains correct: the problem could only appear in a (Var) rule used for some of the variables x1,…,xs,
but since the type environment of the resulting type judgment is empty, the derivation uses the (Var) rule only for bound variables of Q0′, not for x1,…,xs.
∎
Appendix B Proof of Lemma 4
Lemma 16**.**
If Compm(M;((Fi,ci))i∈I)=(F,c), where M and Fi are subsets of {0,…,m−1},
then it holds that Compm+1(M;((Gi,0))i∈I)=(G,0), where G=F∪{m∣c>0} and Gi=Fi∪{m∣ci>0}.
In the above, by {m∣c>0} we mean the set {m} when c>0, and ∅ otherwise.
Proof.
In the definition of the Comp predicate some numbers fn and fn′ are computed.
Denote by fn,m and fn,m′ the values taken by fn and fn′ in the above instantiation of the Compm predicate,
and by fn,m+1 and fn,m+1′ their values in the above instantiation of the Compm+1 predicate.
Since the sets Fi and Gi are the same when restricted to numbers smaller than m,
we have fn,m=fn,m+1 for n<m, and fn,m′=fn,m+1′ for n≤m.
In particular, for n<m the definition of Comp says that n∈F⇔n∈G.
We also have fm+1,m+1′=0, since m∈M;
thus the flag counter computed by the Compm+1 predicate (as the sum of fm+1,m+1′=0 and of the zeroes given as arguments to Compm+1) is actually [math].
Finally, we recall that c=fm,m′+∑i∈Ici;
thus c>0 if and only if fm,m′>0 or ci>0 for some i∈I.
On the other hand, we have fm,m+1=fm,m+1′+∑i∈I∣Gi∩{m}∣;
thus fm,m+1>0 if and only if fm,m+1′>0 or m∈Gi for some i∈I.
Since fm,m′=fm,m+1′ and ci>0⇔m∈Gi, we obtain c>0⇔fm,m+1>0.
The Compm+1 predicate wants to put m to the set G if only if fm,m+1>0 (since m∈M), thus if and only if c>0;
this agrees with our definition of G.
∎
Lemma 17**.**
If we can derive Γ⊢R:(m,F,M,τ)▹c, then we can also derive Γ⊢R:(m+1,G,M,τ)▹0,
where G=F∪{m∣c>0}.
Proof.
Denote τ^=(m,F,M,τ) and τ^′=(m+1,G,M,τ).
The proof is by induction on the structure of the derivation of Γ⊢R:τ^▹c.
We have several cases, depending on the shape of R.
Suppose that R=brP1P2.
Then the last rule of the derivation is (Br) with premiss Γ⊢Pi:τ^▹c for some i∈{1,2}.
The induction assumption gives us a derivation of Γ⊢Pi:τ^′▹0.
Applying back the (Br) rule we derive Γ⊢R:τ^′▹0.
Next, suppose that R=x is a variable.
Then the (Var) rule requires that c=0, and thus G=F.
We just need to change in this rule m to m+1 (which is not problematic at all), and we obtain a derivation of Γ⊢R:τ^′▹0.
Next, suppose that R=λx.P.
Then the derivation ends with the (λ) rule, whose premiss is Γ′[x↦T]⊢P:(m,F,Mλ,τλ)▹c,
where τ=T→τλ, and M=Mλ∖⋃(k,F′,M′,σ)∈TM′, and Split(Γ∣Γ′), and Γ′(x)=∅.
The induction assumption gives us a derivation of Γ′[x↦T]⊢P:(m+1,G,Mλ,τλ)▹0.
We can apply back the (λ) rule, and derive Γ⊢R:τ^′▹0.
Next, suppose that R=aP1…Pr for a=br.
We have τ=o.
Let Γi⊢Pi:(m,Fi,Mi,o)▹ci for i∈{1,…,r} be the premisses of the final (Con) rule.
Using the induction assumption we derive Γi⊢Pi:(m+1,Gi,Mi,o)▹0, where Gi=Fi∪{m∣ci>0}.
We want to apply back the (Con) rule.
Some of the conditions of the (Con) rule remain unchanged: M=M′⊎M1⊎⋯⊎Mr, and M′=∅ if r>0, and Split(Γ∣Γ1,…,Γr).
We also know that (F,c)=Compm(M;(F′,c′),(F1,c1),…,(Fr,cr)), where F′=∅ and c′=1 if m=0, and F′={0} and c′=0 if m>0.
Notice that F′∪{m∣c′>0}={0}.
Thus from Lemma 16 we obtain that Compm+1(M;({0},0),(G1,0),…,(Gr,0))=(G,0), as required.
Finally, suppose that R=PQ.
Let Γ′⊢P:(m,F′,M′,T→τ)▹c′ and Γi⊢Q:(m,Fi,Mi,τi)▹ci for each i∈I be the premisses of the final (@) rule,
where T={(ord(P),Fi↾<ord(P),Mi↾<ord(P),τi)∣i∈I}.
Using the induction assumption we derive Γ′⊢P:(m+1,G′,M′,T→τ)▹0, where G′=F′∪{m∣c′>0}, and
Γi⊢Q:(m+1,Gi,Mi,τi)▹ci, where Gi=Fi∪{m∣ci>0}, for each i∈I.
We want to apply back the (@) rule.
The conditions M=M′⊎⨄i∈IMi and Split(Γ∣Γ′,(Γi)i∈I) required by the (@) rule remain unchanged,
while ord(P)≤m+1 holds because previously we had ord(P)≤m.
Also due to ord(P)≤m we have Gi↾<ord(P)=Fi↾<ord(P),
and thus the type needed now for P, that is {(ord(P),Gi↾<ord(P),Mi↾<ord(P),τi)∣i∈I}→o, is actually equal T→o,
The condition Compm(M;(F′,c′),((Fi↾≥ord(P),ci))i∈I)=(F,c) implies that Compm+1(M;(G′,0),((Gi↾≥ord(P),0))i∈I)=(G,0) by Lemma 16
(notice that Gi↾≥ord(P)=Fi↾≥ord(P)∪{m∣ci>0} because m≥ord(P)).
∎
Lemma 18**.**
Suppose that Compm(M;((Fi,ci))i∈I)=(F,c), where M and Fi are subsets of {0,…,m−1}.
If s∈I is such that ci≤cs for all i∈I, and ds≥log2cs, and Gi=Fi∪{m∣ci>0} for i∈I,
then Compm+1(M∪{m};(Fs,ds),((Gi,0))i∈I∖{s})=(F,d) for some d such that d≥log2c.
Proof.
As in the proof of Lemma 16,
denote by fn,m and fn,m′ the values taken by the variables fn and fn′ in the above instantiation of the Compm predicate,
and by fn,m+1 and fn,m+1′ their values in the above instantiation of the Compm+1 predicate.
As previously we have fn,m=fn,m+1 for n<m, and fn,m′=fn,m+1′ for n≤m,
and thus Compm+1 correctly says which numbers smaller than m should belong to F.
Moreover, it says that m∈F, since m∈M∪{m}, which is also correct.
It remains to check that the flag counter d computed by Compm+1 actually satisfies d≥log2c.
Because m∈M∪{m} we have fm+1,m+1′=fm,m+1, and thus
[TABLE]
On the other hand, we have
[TABLE]
In the degenerate case of cs=0 we have that ci=0 for all i∈I, and by the above d=c, thus even more d≥log2c.
Next, suppose that cs>0.
Denote k=fm,m′+∣{i∈I∣ci>0}∣.
Then (1) gives d=k−1+ds, while continuing (2) we have
c≤fm,m′⋅cs+∣{i∈I∣ci>0}∣⋅cs=k⋅cs.
Altogether we obtain as required:
[TABLE]
Corollary 19**.**
Suppose that Compm(M;(F′,c′))=(F,c), where M⊆{0,…,m−1}, and F′=∅∧c′=1 if m=0, and F′={0}∧c′=0 if m>0.
Then Compm+1(M∪{m};({0},0))=(F,d) for some d such that d≥log2c.
Proof.
For m>0 this is a direct consequence of Lemma 18 (where the assumption ds≥log2cs is instantiated as 0≥log20).
For m=0 we have M=∅, and Comp0(M;(∅,1))=(∅,1), while Comp1(M∪{0};({0},0))=(∅,1);
this is fine, since d=1≥log21=log2c
(for m=0 we could not apply Lemma 18, because the set F′=∅ in Compm changes into {0} in Compm+1).
∎
Corollary 20**.**
Suppose that Compm(M;(F′,c′),((Fi,ci))i∈I)=(F,c), where M and Fi are subsets of {0,…,m−1}, and F′=∅∧c′=1 if m=0, and F′={0}∧c′=0 if m>0.
If s∈I is such that ci≤cs for all i∈I, and ds≥log2cs, and Gi=Fi∪{m∣ci>0} for i∈I,
then Compm+1(M∪{m};({0},0),(Fs,ds),((Gi,0))i∈I∖{s})=(F,d) for some d such that d≥log2c.
Proof.
For m>0 this is a direct consequence of Lemma 18 (we notice that F′∪{m∣c′>0}={0} and c′=0≤cs).
If m=0 and cs≥1 we can use Lemma 18 as well.
Suppose that m=0 and cs=0.
Then M=∅, and Fi=Gi=∅∧ci=0 for all i∈I,
thus Comp0(M;(∅,1),((Fi,ci))i∈I)=(∅,1),
while Comp1(M∪{0};({0},0),(Fs,ds),((Gi,0))i∈I∖{s})=(∅,1+ds); this is fine, since d=1+ds≥log21=log2c.
∎
Lemma 21**.**
Suppose that we can derive Γ⊢R:(m,F,M,τ)▹c, where ord(R)≤m and that every full type assigned by Γ to a variable is of order at most m.
Then we can also derive Γ⊢R:(m+1,F,M∪{m},τ)▹d for some d such that d≥log2c.
Proof.
Denote τ^=(m,F,M,τ) and τ^′=(m+1,F,M∪{m},τ).
The proof is by induction on the structure of the derivation of Γ⊢R:τ^▹c.
We have several cases, depending on the shape of R.
The case of R=brP1P2 follows immediately from the induction assumption, as in Lemma 17.
Suppose that R=x is a variable.
The (Var) rule used in the derivation ensures that c=0 and that Γ(x) contains a full type (k,F,M′,τ) with M↾<k=M′.
By assumptions of the lemma, k≤m.
Then (M∪{m})↾<k=M′, and hence the (Var) rule can equally well derive Γ⊢R:τ^′▹0;
we have 0≥log20.
Next, suppose that R=λx.P.
Then the derivation ends with the (λ) rule, whose premiss is Γ′[x↦T]⊢P:(m,F,Mλ,τλ)▹c,
where τ=T→τλ, and M=Mλ∖⋃(k,F′,M′,σ)∈TM′, and Split(Γ∣Γ′), and Γ′(x)=∅.
Because τ is a type, the definition of a type ensures that all full types in T are of order ord(R)≤m.
Additionally ord(P)≤ord(R)≤m, so assumptions of the lemma are satisfied for the premiss;
the induction assumption gives us a derivation of Γ′[x↦T]⊢P:(m+1,F,Mλ∪{m},τλ)▹d, where d≥log2c.
Again because all full types in T are of order ord(R)≤m (and hence M′⊆{0,…,m−1} for all (k,F′,M′,σ)∈T)
we have M∪{m}=Mλ∪{m}∖⋃(k,F′,M′,σ)∈TM′.
Thus after applying back the (λ) rule we obtain a derivation of Γ⊢R:τ^′▹d.
Next, suppose that R=a (where a is a symbol of rank [math]).
We have τ=o.
The conditions of the (Con) rule are Split(Γ∣ε) and Compm(M;(F′,c′))=(F,c), where F′=∅ and c′=1 if m=0, and F′={0} and c′=0 if m>0.
By Corollary 19, Compm+1(M∪{m};({0},0))=(F,d) for some d such that d≥log2c.
We can use the (Con) rule to derive Γ⊢R:τ^′▹d.
Next, suppose that R=aP1…Pr for a=br and r>0.
We have τ=o.
Denote I={1,…,r}.
Let Γi⊢Pi:(m,Fi,Mi,o)▹ci for i∈I be the premisses of the final (Con) rule,
and let s∈I be such that cs≥ci for all i∈I.
Using the induction assumption we derive Γs⊢Ps:(m+1,Fs,Ms∪{m},o)▹ds for some ds such that ds≥log2cs,
while for i∈I∖{s} we use Lemma 17 to derive Γi⊢Pi:(m+1,Gi,Mi,o)▹0, where Gi=Fi∪{m∣ci>0}.
We want to apply back the (Con) rule.
The condition M=M′⊎M1⊎⋯⊎Mr updates accordingly: m is added to M and to Ms.
The conditions (r>0)⇒(M′=∅) and Split(Γ∣Γ1,…,Γr) remain unchanged.
We know that Compm(M;(F′,c′),((Fi,ci))i∈I)=(F,c), where F′=∅ and c′=1 if m=0, and F′={0} and c′=0 if m>0;
by Corollary 20 this implies Compm+1(M∪{m};({0},0),(Fs,ds),((Gi,0))i∈I∖{s})=(F,d) for some d such that d≥log2c, as required.
Finally, suppose that R=PQ.
Let Γ′⊢P:(m,F′,M′,T→τ)▹c′ and Γi⊢Q:(m,Fi,Mi,τi)▹ci for each i∈I be the premisses of the final (@) rule,
where T={(ord(P),Fi↾<ord(P),Mi↾<ord(P),τi)∣i∈I}.
A condition of the (@) rule implies that ord(Q)<ord(P)≤m, so we can use the induction assumption for these premisses.
We know that Compm(M;(F′,c′),((Fi↾≥ord(P),ci))i∈I)=(F,c).
Denote G′=F′∪{m∣c′>0} and Gi=Fi∪{m∣ci>0} for i∈I.
Notice that Gi↾≥ord(P)=Fi↾≥ord(P)∪{m∣ci>0}.
We have two subcases.
If c′≥ci for all i∈I, then using the induction assumption we derive Γ′⊢P:(m+1,F′,M′∪{m},T→τ)▹d′ for some d′ such that d′≥log2c′,
and using Lemma 17 we derive Γi⊢Q:(m+1,Gi,Mi,τi)▹0 for each i∈I.
Lemma 18 then implies that Compm+1(M∪{m};(F′,d′),((Gi↾≥ord(P),0))i∈I)=(F,d) for some d such that d≥log2c.
Otherwise, we choose s∈I such that cs≥ci for all i∈I (and cs≥c′);
using the induction assumption we derive Γs⊢Q:(m+1,Fs,Ms∪{m},τs)▹ds for some ds such that ds≥log2cs,
and using Lemma 17 we derive Γ′⊢P:(m+1,G′,M′,T→τ)▹0 and Γi⊢Q:(m+1,Gi,Mi,τi)▹0 for each i∈I∖{s}.
Lemma 18 then implies that Compm+1(M∪{m};(Fs↾≥ord(P),ds),(G′,0),((Gi↾≥ord(P),0))i∈I∖{s})=(F,d) for some d such that d≥log2c.
In both cases we apply back the (@) rule to the obtained type judgments.
The condition M=M′⊎⨄i∈IMi is updated accordingly: m is added to M and either to M′ or to Ms.
The condition Split(Γ∣Γ′,(Γi)i∈I) remains unchanged, and ord(P)≤m+1 holds since we even have ord(P)≤m.
We also have that Gi↾<ord(P)=Fi↾<ord(P) and (Ms∪{m})↾<ord(P)=Ms, so the type required in a premiss for P is indeed T→o.
∎
Lemma 4 says that if we can derive ε⊢R:(m,∅,{0,…,m−1},o)▹c,
then we can also derive ε⊢R:(m+1,∅,{0,…,m},o)▹d for some d such that d≥log2c.
This is just a special case of Lemma 21.
Appendix C Proof of Lemma 5
Below, by Mk(τ^) we denote the set of marker orders of the full type τ^, i.e., Mk(τ^)=M if τ^=(m,F,M,τ).
We extend this notation to sets of full types: Mk(T)=⋃τ^∈TMk(τ^),
and to type environments: Mk(Γ)=⋃xMk(Γ(x)), where x ranges over all variables.
Lemma 22**.**
Suppose that we can derive Γ⊢R:τ^▹c, and x is not free in R.
Then for Σ=Γ[x↦∅] we can also derive Σ⊢R:τ^▹c, and Split(Γ∣Σ) holds.
Proof.
Because x is not free in R, all full types assigned to x by Γ are discarded somewhere in the derivation of Γ⊢R:τ^▹c.
On the one hand, this means that the set of marker orders in all these full types is empty, and thus Split(Γ∣Σ) holds.
On the other hand, we can remove these full types from type environments in the derivation, which results in a derivation of Σ⊢R:τ^▹c.
∎
Lemma 23**.**
Suppose that we can derive Γ⊢P:τ^▹c.
If Split(Γ′∣Γ) holds, then we can also derive Γ′⊢P:τ^▹c.
Proof.
Induction on the structure of a fixed derivation of Γ⊢P:τ^▹c.
If the last rule is (Br), we change Γ to Γ′ in the premiss of the rule using the induction assumption, and we obtain a derivation of Γ′⊢P:τ^▹c.
In every other rule we can change Γ to Γ′ only in the conclusion.
∎
Lemma 24**.**
Suppose that we can derive Γ⊢R:τ^▹c.
Then Mk(Γ)⊆Mk(τ^), and Mk(Γ(x))∩Mk(Γ(y))=∅ for all variables x,y with x=y.
Proof.
Fix some derivation of Γ⊢R:τ^▹c; the proof is by induction on the structure of this derivation.
We analyze the shape of R.
Suppose first that R=x.
The (Var) rule says that Split(Γ,ε[x↦T]) holds for T such that Mk(T)⊆Mk(τ^).
Thus Mk(Γ(x))⊆Mk(τ^) and Mk(Γ(y))=∅ for all variables y with y=x.
In the case when R=brP1P2 the thesis follows immediately from the induction assumption applied to the premiss of the final (Br) rule.
Next, suppose that R=λz.P.
Let Γ′[z↦T]⊢P:τ^′▹c with Γ′(z)=∅ be the premiss of the final (λ) rule.
By the conditions of the rule we have Mk(τ^)=Mk(τ^′)∖Mk(T) and Split(Γ∣Γ′).
The latter condition implies that Mk(Γ(z))=∅.
For every variable x other than z the induction assumption ensures that Mk(Γ′[z↦T](x))⊆Mk(τ^′)
and that Mk(Γ′[z↦T](x))∩Mk(Γ′[z↦T](z))=∅.
Since Mk(Γ(x))=Mk(Γ′[z↦T](x)) and Mk(Γ′[z↦T](z))=Mk(T) we obtain Mk(Γ(x))⊆Mk(τ^′)∖Mk(T)=Mk(τ^).
For any two variables x,y with z=x=y=z we have Mk(Γ′[z↦T](x))∩Mk(Γ′[z↦T](y))=∅ by the induction assumption,
and thus Mk(Γ(x))∩Mk(Γ(y))=∅.
Next, suppose that R=aP1…Pr.
Let Γi⊢Pi:τ^i▹ci for i∈{1,…,r} be the premisses of the final (Con) rule.
Consider some variable x, and some k∈Mk(Γ(x)).
Because of the condition Split(Γ∣Γ1,…,Γr) of the (Con) rule we have k∈Mk(Γi(x)) for some i∈{1,…,r};
then k∈Mk(τ^i) by the induction assumption Mk(Γi(x))⊆Mk(τ^i),
and thus k∈Mk(τ^) by the condition Mk(τ^)=Mk(τ^1)⊎⋯⊎Mk(τ^r) of the (Con) rule.
Suppose that we also have k∈Mk(Γ(y)) for some variable y other than x.
Then k∈Mk(Γj(y))⊆Mk(τ^j) for some j∈{1,…,r};
we cannot have j=i by the induction assumption (Mk(Γi(x)) and Mk(Γi(y)) are disjoint),
and we cannot have j=i because Mk(τ^i) and Mk(τ^j) are disjoint.
The case when R=PQ is completely analogous to the previous one.
∎
Lemma 25**.**
Suppose that we can derive Γ⊢R[S/x]:τ^▹c.
Then, for some finite set I, we can derive Σi⊢S:σ^i▹di for every i∈I, and Λx⊢R:τ^▹e,
where Λx=Λ[x↦{σ^i∣i∈I}] with Λ(x)=∅, and Split(Γ∣Λ,(Σi)i∈I) holds,
and c=e+∑i∈Idi, and Mk(σ^i)∩Mk(σ^j)=∅ for i,j∈I if i=j.
Proof.
Fix some derivation of Γ⊢R[S/x]:τ^▹c; the proof is by induction on the structure of this derivation.
One possibility is that x is not free in R.
Then we can take I=∅, and Λx=Λ=Γ[x↦∅], and e=c.
Because x is not free in R=R[S/x], by Lemma 22 applied to the original derivation,
we know that Λx⊢R:τ^▹e can be derived, and that Split(Γ∣Λ) holds.
Because I=∅, the condition concerning disjointness of Mk(σ^i) becomes trivial.
In the sequel we assume that x is free in R.
We analyze the shape of R.
Suppose first that R=x.
Then we take I={1}, and (Σ1,σ^1,d1)=(Γ,τ^,c), and Λ=ε, and e=0.
Obviously Λ(x)=∅, and Split(Γ∣Λ,Σ1) holds, and c=e+d1.
We can derive Σ1⊢S:σ^1▹d1 by assumption, and Λx⊢R:τ^▹e using the (Var) rule, where Λx=Λ[x↦{σ^1}].
Next, suppose that R=brP1P2.
Then our derivation ends with the (Br) rule, whose premiss is Γ⊢Pk[S/x]:τ^▹c, for some k∈{1,2}.
The induction assumption applied to this premiss gives us a derivation of Σi⊢S:σ^i▹di for every i∈I,
and of Λx⊢Pk:τ^▹e, where appropriate conditions hold.
By applying back the (Br) rule to the latter type judgment, we obtain a derivation of Λx⊢R:τ^▹e as required.
Next, suppose that R=λy.P.
We have y=x, and, as always during a substitution, we assume (by performing α-conversion) that y is not free in S.
The original derivation ends with the (λ) rule, whose premiss is Γ′[y↦T]⊢P[S/x]:τ^′▹c with Γ′(y)=∅.
We apply the induction assumption to this premiss, and we obtain a derivation of Σi[y↦Ti]⊢S:σ^i▹di for every i∈I,
and of Λx[y↦T′]⊢P:τ^′▹e,
where Λx=Λ[x↦{σ^i∣i∈I}] with Λ(x)=Λ(y)=Σi(y)=∅, and Split(Γ′[y↦T]∣Λ[y↦T′],(Σi[y↦Ti])i∈I) holds,
and c=e+∑i∈Idi, and Mk(σ^i)∩Mk(σ^j)=∅ for i,j∈I if i=j.
Together with Split(Γ∣Γ′) the above implies that Split(Γ∣Λ,(Σi)i∈I) holds.
Because y is not free in S, Lemma 22 implies that for every i∈I we can derive Σi⊢S:σ^i▹di
(instead of Σi[y↦Ti]⊢S:σ^i▹di) and that Split(Σi[y↦Ti]∣Σi) holds.
Thus Mk(Ti)=∅ for all i∈I, and hence Mk(T∖T′)=∅; simultaneously T′⊆T, which implies that Split(Λx[y↦T]∣Λx[y↦T′]) holds.
In consequence, by Lemma 23 applied to Λx[y↦T′]⊢P:τ^′▹e we can derive Λx[y↦T]⊢P:τ^′▹e.
To the latter type judgment we apply again the (λ) rule, which gives Λx⊢R:τ^▹e.
Another possibility is that R=aP1…Pr.
Then the original derivation ends with the (Con) rule, whose premisses are Γj⊢Pj[S/x]:τ^j▹cj for j∈{1,…,r}.
We apply the induction assumption to these premisses.
Assuming w.l.o.g. that the resulting sets Ij are disjoint, and taking I=⋃j=1rIj,
we obtain a derivation of Σi⊢S:σ^i▹di for every i∈I, and of Λjx⊢Pj:τ^j▹ej for every j∈{1,…,r},
where, for every j∈{1,…,r}, we have Λjx=Λj[x↦{σ^i∣i∈Ij}] with Λj(x)=∅,
and Split(Γj∣Λj,(Σi)i∈Ij) holds,
and cj=ej+∑i∈Ijdi,
and Mk(σ^i)∩Mk(σ^i′)=∅ for i,i′∈Ij if i=i′.
By Lemma 24 we have Mk(σ^i)⊆Mk(τ^j) for i∈Ij, j∈J.
Since Mk(τ^j)∩Mk(τ^j′)=∅ for j,j′∈J with j=j′ by a side condition of the (Con) rule,
this implies that Mk(σ^i)∩Mk(σ^i′)=∅ for all i,i′∈I with i=i′.
Recalling the side condition Split(Γ∣(Γj)j∈{1,…,r}) of the (Con) rule,
we observe that Split(Γ∣(Λj)j∈{1,…,r},(Σi)i∈I) holds.
Define Λ by taking Λ(z)=⋃j=1rΛj(z) for every variable z.
We then have Split(Γ∣Λ,(Σi)i∈I) and Split(Λ∣(Λj)j∈{1,…,r}), as well as Split(Λx∣(Λjx)j∈{1,…,r}).
Another side condition of the (Con) rule says that Compm(M;(F′,c′),(F1,c1),…,(Fr,cr))=(F,c) for appropriate arguments M,F,F′,c′,Fj.
Taking e=c+∑j=1r(ej−cj) we also have that Compm(M;(F′,c′),(F1,e1),…,(Fr,er))=(F,e).
Having all this, we can apply the (Con) rule again, deriving Λx⊢R:τ^▹e.
Simultaneously we observe that c=e+∑i∈Idi.
Finally, suppose that R=PQ.
This case is very similar to the previous one.
The original derivation ends with the (@) rule, whose premisses are Γ0⊢P[S/x]:τ^0▹c0 and Γj⊢Q[S/x]:τ^j▹cj for j∈J,
where we assume that 0∈J.
We apply the induction assumption to all these premisses.
Assuming w.l.o.g. that the resulting sets Ij are disjoint, and taking I=⋃j∈{0}∪JIj, we obtain a derivation of
Σi⊢S:σ^i▹di for every i∈I,
and of Λ0x⊢P:τ^0▹e0, and of Λjx⊢Q:τ^j▹ej for every j∈J,
where, for every j∈{0}∪J, we have Λjx=Λj[x↦{σ^i∣i∈Ij}] with Λj(x)=∅,
and Split(Γj∣Λj,(Σi)i∈Ij) holds,
and cj=ej+∑i∈Ijdi,
and Mk(σ^i)∩Mk(σ^i′)=∅ for i,i′∈Ij if i=i′.
By Lemma 24 we have Mk(σ^i)⊆Mk(τ^j) for i∈Ij, j∈{0}∪J.
Since Mk(τ^j)∩Mk(τ^j′)=∅ for j,j′∈{0}∪J with j=j′ by a side condition of the (Con) rule,
this implies that Mk(σ^i)∩Mk(σ^i′)=∅ for all i,i′∈I with i=i′.
Recalling the side condition Split(Γ∣(Γj)i∈{0}∪J) of the (@) rule,
we observe that Split(Γ∣(Λj)j∈{0}∪J,(Σi)i∈I) holds.
Define Λ by taking Λ(z)=⋃j∈{0}∪JΛj(z) for every variable z.
We then have Split(Γ∣Λ,(Σi)i∈I) and Split(Λ∣(Λj)j∈{0}∪J), as well as Split(Λx∣(Λjx)j∈{0}∪J).
Another side condition of the (Con) rule says that Compm(M;((Gj,cj))j∈{0}∪J)=(F,c) for appropriate sets M,F,Gj.
Taking e=c+∑j∈{0}∪J(ej−cj) we also have that Compm(M;((Gj,cj))j∈{0}∪J)=(F,e).
Thus we can apply the (@) rule again, deriving Λx⊢R:τ^▹e.
Simultaneously we observe that c=e+∑i∈Idi.
∎
Lemma 26**.**
Suppose that Compm(M;(F,e),((∅,di))i∈I)=(G,c), where F∩M=∅ and F,M⊆{0,…,m−1}.
Then G=F and c=e+∑i∈Idi.
Proof.
Consider the numbers fn and fn′ appearing in the definition of the Compm predicate.
Looking at them consecutively for n=0,…,m we notice that fn′=0 and fn=∣F∩{n}∣.
Indeed, fn′=0 implies fn=∣F∩{n}∣, and if n−1∈M, we have fn′=0,
while if n−1∈M, we have fn′=fn−1=∣F∩{n−1}∣=0, because F∩M=∅.
Then G={n∈{0,…,m−1}∣fn>0∧n∈M}=F (again because F∩M=∅),
and c=fm′+e+∑i∈Idi=e+∑i∈Idi.
∎
Proof of Lemma 5.
Recall that we are given a derivation of Γ⊢Q:τ^▹c with ord(τ^)=m, and a β-reduction P→βQ that is of order m,
and our goal is to derive Γ⊢P:τ^▹c.
Suppose first that P=(λx.R)S and Q=R[S/x], where ord(λx.R)=m.
From Lemma 25 we obtain a derivation of Σi⊢S:σ^i▹di for every i∈I (for some set I),
and a derivation of Λ[x↦T]⊢R:τ^▹e,
where T={σ^i∣i∈I}, and Λ(x)=∅, and Split(Γ∣Λ,(Σi)i∈I) holds, and c=e+∑i∈Idi,
and Mk(σ^i)∩Mk(σ^j)=∅ for i,j∈I if i=j.
Let us write τ^=(m,F,M,τ), and σ^i=(m,Fi,Mi,σi).
To the type judgment Λ[x↦T]⊢R:τ^▹e we apply the (λ) rule,
deriving Λ⊢λx.R:(m,F,M∖⋃i∈IMi,T→τ)▹e.
To this type judgment, and to Σi⊢S:σ^i▹di for i∈I, we want to apply the (@) rule.
By definition of a full type, the sets Fi and Mi may contain only numbers smaller than m (since σ^i is a full type).
Recalling that ord(λx.R)=m we have that the type {(ord(λx.R),Fi↾<ord(λx.R),Mi↾<ord(λx.R),σi)∣i∈I}→τ
that we have to derive for λx.R is indeed T→τ.
The conditions M=(M∖⋃i∈IMi)⊎⨄i∈IMi,
and ord(λx.R)≤m, and Split(Γ∣Λ,(Σi)i∈I) follow from what we have.
Notice that the sets Fi↾≥ord(λx.R) are empty, and that F∩M=∅ by definition of a full type (τ^=(m,F,M,τ) is a full type),
and hence Compm(M;(F,e),((Fi↾≥ord(λx.R),di))i∈I)=(F,c) by Lemma 26.
Thus the (@) rule can be applied; it derives Γ⊢P:τ^▹c.
It remains to consider the general situation: the redex involved in the β-reduction P→βQ is located somewhere deeper in P.
Then the proof is by a trivial induction on the depth of this redex.
Formally, we have several cases depending on the shape of P, but let us consider only a representative example:
suppose that P=TU and Q=TV with U→βV.
In the derivation Γ⊢Q:τ^▹c we apply the induction assumption to those premisses of the final (@) rule that concern the subterm V,
and we obtain type judgments in which V is replaced by U.
We can apply the (@) rule to them, and to the premiss talking about T, and derive Γ⊢P:τ^▹c.
∎
Appendix D Proof of Lemma 6
In order to enable an inductive proof of Lemma 6, we need to strengthen slightly its statement, and consider also λ-terms of order m.
We say that a full type (m,F,M,T1→⋯→Tk→o) is (m−1)-clear if m−1∈M∪⋃i=1kMk(Ti).
Lemma 27**.**
If τ^∈Fmα for ord(α)<m, and m−1∈Mk(τ^), then τ^ is (m−1)-clear.
Proof.
If α=α1→⋯→αk→o, we can write τ^=(m,F,M,T1→⋯→Tk→o).
For i∈{1,…,k} by definition we have Ti⊆Fmiαi for mi=ord(αi→⋯→αk→o),
and hence Mk(Ti)⊆{0,…,mi−1};
because ord(αi→⋯→αk→o)≤ord(α)<m, we have m−1∈Ti.
Thus τ^ is (m−1)-clear.
∎
Lemma 28**.**
If we can derive Γ⊢R:τ^▹c, where ord(τ^)=m>0 and τ^ is (m−1)-clear, then c=0.
Proof.
Fix some derivation of Γ⊢R:τ^▹c.
The proof is by induction on the structure of this derivation.
Let us write τ^=(m,F,M,τ).
We have several cases depending on the shape of R.
If R=x, the (Var) rule ensures that the flag counter c is [math].
If R=brP1P2, then above the final (Br) rule we have a premiss Γ⊢Pi:τ^▹c for some i∈{1,2};
the induction assumption used for this premiss implies c=0.
Suppose that R=λx.P.
Then above the final (λ) rule we have a premiss Γ′⊢P:(m,F,M′,τ′)▹c, where τ=T→τ′ and M=M′∖Mk(T).
Because τ^ is (m−1)-clear, we have m−1∈M∪Mk(T), hence also m−1∈M′; thus (m,F,M′,τ′) is (m−1)-clear.
The induction assumption applied to our premiss implies c=0.
Next, suppose that R=aP1…Pr.
Let Γi⊢Pi:(m,Fi,Mi,o)▹ci for i∈{1,…,r} be the premisses of the final (Con) rule.
For all i∈{1,…,r} a side condition of this rule says that Mi⊆M, so the full type (m,Fi,Mi,o) is (m−1)-clear, and hence ci=0 by the induction assumption.
We know that Compm(M;({0},0),(F1,c1),…,(Fr,cr))=(F,c) (recall that m>0).
The number fm′ considered in the definition of Compm is [math] in our case of m−1∈M, and thus we have c=fm′+∑i=1rci=0.
Finally, suppose that R=PQ.
Let Γ′⊢P:(m,F′,M′,T→τ)▹c′ and Γi⊢Q:(m,Fi,Mi,τi)▹ci for i∈I be the premisses of the final (@) rule.
A side condition of the rule says that M′⊆M and Mi⊆M for i∈I, so m−1 does not belong to these sets.
By definition the marker sets in full types in T are subsets of some Mi, so m−1∈Mk(T), and hence (m,F′,M′,T→τ) is (m−1)-clear.
On the other hand ord(Q)<ord(P)≤m, so (m,F,Mi,τi) for i∈I are (m−1)-clear by Lemma 27.
Thus the induction assumption can be used for all premisses; it says that c′=0 and ci=0 for all i∈I.
Because Comp(M;(F′,c′),((Fi,ci))i∈I)=(F,c), and m−1∈M, we obtain c=0 (as in the previous case).
∎
Proof of Lemma 6.
Recall that in this lemma we are given a derivation of Γ⊢P:τ^▹c, where m−1∈Mk(τ^) and ord(P)≤m−1 for m=ord(τ^), and we have to prove that c=0.
Lemma 27 implies that τ^ is (m−1)-clear, and thus c=0 by Lemma 28 (we have m>0 because ord(P)≤m−1).
∎
Appendix E Proof of Lemma 7
Let us formalize the notion of counting (@) rules of order m in a derivation.
We use here extended type judgments of the form Γ⊢P:τ^▹c≻n, where Γ⊢P:τ^▹c is a type judgment, and n∈N.
The number n is called application counter.
The meaning is that Γ⊢P:τ^▹c can be derived by a derivation in which the (@) rule of order ord(τ^) is used n times.
Formally, we lift our type system so that it can derive extended type judgments, as follows.
If, in the original type system, using premisses Γi⊢Pi:τ^i▹ci for i∈I some rule could derive Γ⊢R:τ^▹c, then
if R=PQ with ord(P)=ord(τ^),
then using premisses Γi⊢Pi:τ^i▹ci≻ni for i∈I we can derive Γ⊢R:τ^▹c≻1+∑i∈Ini;
otherwise (i.e., when R=PQ with ord(P)=ord(τ^), or R does not start with an application)
using premisses Γi⊢Pi:τ^i▹ci≻ni for i∈I we can derive Γ⊢R:τ^▹c≻∑i∈Ini.
It should be clear that we can derive Γ⊢R:τ^▹c if and only if we can derive Γ⊢R:τ^▹c≻n for some n∈N.
We also notice that in proofs of Lemmata 22 and 23 the number of (@) rules of the maximal order (and, more generally, the shape of a derivation) remain unchanged.
Thus we can restate these lemmata as follows.
Lemma 29**.**
Suppose that we can derive Γ⊢R:τ^▹c≻0, and x is not free in R.
Then for Σ=Γ[x↦∅] we can also derive Σ⊢R:τ^▹c≻0, and Split(Γ∣Σ) holds.
Lemma 30**.**
Suppose that we can derive Γ⊢P:τ^▹c≻0.
If Split(Γ′∣Γ) holds, then we can also derive Γ′⊢P:τ^▹c≻0.
As in the proof of Lemma 5 we start with a lemma describing substitution.
Lemma 31**.**
Suppose that we can derive Σi⊢S:σ^i▹di≻0 for every i∈I (for some finite set I), and Λx⊢R:τ^▹e≻0,
where Λx=Λ[x↦{σ^i∣i∈I}] with Λ(x)=∅, and Split(Γ∣Λ,(Σi)i∈I) holds,
and ord(S)<ord(τ^), and ord(σ^i)=ord(τ^) for all i∈I.
Then we can derive Γ⊢R[S/x]:τ^▹e+∑i∈Idi≻0.
Proof.
We start by observing the following property, denoted (♢):
If Mk(σ^i)=∅ for some i∈I, then di=0 and Mk(Σi)=∅.
Indeed, Mk(Σi)⊆Mk(σi)=∅ follows from Lemma 24, while di=0 follows from Lemma 6 if we recall that ord(S)<ord(τ^)=ord(σ^i).
The proof of the lemma is by induction on the structure of some fixed derivation of Λx⊢R:τ^▹e≻0.
One possibility is that x is not free in R.
In such a situation by Lemma 29 we can derive Λ⊢R:τ^▹e≻0 and Split(Λx∣Λ) holds,
which means that Mk(σ^i)=∅ for all i∈I.
By (♢) we have di=0 and Mk(Σi)=∅ for all i∈I.
Thus Split(Γ∣Λ,(Σi)i∈I) implies Split(Γ∣Λ), and thus by Lemma 30 applied to Λ⊢R:τ^▹e≻0
we can derive Γ⊢R:τ^▹e≻0.
This is the desired type judgment since R[S/x]=R and e+∑i∈Idi=e.
In the sequel we assume that x is free in R.
We analyze the shape of R.
Suppose first that R=x.
Then the derivation for R consists of a single rule, and thus e=0 and for some s∈I we have Split(Λx∣ε[x↦{σ^s}]) and τ^=σ^s
(this holds because all σ^i are of the same order as τ^).
It follows that Mk(Λ(y))=∅ for every variable y, and that Mk(σ^i)=∅ for every i∈I∖{s}.
By (♢) we have di=0 and Mk(Σi)=∅ for all i∈I∖{s}.
It follows that Split(Γ∣Λ,(Σi)i∈I) implies Split(Γ∣Σs);
we can thus derive Γ⊢S:σ^s▹ds≻0 by Lemma 30.
This is what we need, since R[S/x]=S, and τ^=σ^s, and e+∑i∈Idi=ds.
Next, suppose that R=brP1P2.
Then our derivation ends with the (Br) rule, whose premiss is Λx⊢Pk:τ^▹e≻0, for some k∈{1,2}.
The induction assumption applied to this premiss gives us a derivation of Γ⊢Pk[S/x]:τ^▹e+∑i∈Idi≻0.
By applying back the (Br) rule we derive Γ⊢R[S/x]:τ^▹e+∑i∈Idi≻0, as required.
Next, suppose that R=λy.P.
We have y=x, and, as always during a substitution, we assume (by performing α-conversion) that y is not free in S.
The derivation for R ends with the (λ) rule, whose premiss is Λλx[y↦T]⊢P:τ^′▹e≻0, where Λλx(y)=∅ and Split(Λx∣Λλx) holds.
Denote Λλ=Λλx[x↦∅].
We then have Λλx=Λλ[x↦{σ^i∣i∈J}] for some J⊆I.
The condition Split(Λx∣Λλx) implies that Split(Λ∣Λλ) holds, and that Mk(σ^i)=∅ for all i∈I∖J;
then di=0 and Mk(Σi)=∅ for i∈I∖J by (♢).
In the light of Split(Γ∣Λ,(Σi)i∈I) this implies that Split(Γ∣Λλ,(Σi)i∈J) holds.
Because Γ(y) may be nonempty, we have to define Γλ=Γ[y↦∅] and Σiλ=Σi[y↦∅] for i∈J.
By assumption y is not free in S, so Lemma 29 says that we can derive Σiλ⊢S:σ^i▹di≻0 and that Split(Σi∣Σiλ) holds for i∈J.
Recall that Λλ(y)=∅.
Due to Split(Γ∣Λλ,(Σi)i∈J) we also have Split(Γ∣Γλ) and Split(Γλ∣Λλ,(Σiλ)i∈J),
thus also Split(Γλ[y↦T]∣Λλ[y↦T],(Σiλ)i∈J).
We are ready to apply the induction assumption to our premiss.
We obtain a derivation of Γλ[y↦T]⊢P[S/x]:τ^′▹e+∑i∈Jdi≻0.
Because Split(Γ∣Γλ) holds and Γλ(y)=∅, we can apply the (λ) rule, obtaining Γ⊢R[S/x]:τ^▹e+∑i∈Jdi≻0,
where the full type is indeed τ^, as in the original derivation.
Because di=0 for i∈J∖I, this is what we need.
Another possibility is that R=aP1…Pr.
Then the derivation for R ends with the (Con) rule, whose premisses are Λjx⊢Pj:τ^j▹ej≻0 for j∈{1,…,r}.
For j∈{1,…,r} denote Λj=Λjx[x↦∅].
A side condition of the (Con) rule says that Split(Λx∣(Λjx)j∈{1,…,r}) holds.
On the on one hand, this implies Split(Λ∣(Λj)j∈{1,…,r}).
On the other hand, for j∈{1,…,r} we have Λjx=Λj[x↦{σ^i∣i∈Ij}] for some Ij⊆I,
and for i∈I∖⋃j=1rIj we have Mk(σ^i)=∅, and thus di=0 and Mk(Σi)=∅ by (♢).
Knowing that Split(Γ∣Λ,(Σi)i∈I) holds, we obtain Split(Γ∣(Λj)j∈{1,…,r},(Σi)i∈Ij,j∈{1,…,r}).
For j∈{1,…,r} we define Γj by taking Γj(y)=Λj(y)∪⋃i∈IjΣi(y) for all variables y.
Then we have Split(Γ∣(Γj)j∈{1,…,r}) and Split(Γj∣Λj,(Σi)i∈Ij) for j∈{1,…,r}.
Using the induction assumption for every premiss, we obtain a derivation of Γj⊢Pj[S/x]:τ^j▹ej+∑i∈Ijdi≻0 for j∈{1,…,r}.
Because Split(Γ∣(Γj)j∈{1,…,r}) holds, and the derived full types are the same as in the derivation for R, and the flag counters are higher by ∑i∈Ijdi,
we can apply the (Con) rule and derive Γ⊢R[S/x]:τ^▹e+∑j=1r∑i∈Ijdi≻0.
It remains to observe that ∑j=1r∑i∈Ijdi=∑i∈Idi.
As already said, di=0 when i∈I∖⋃j=1rIj.
Consider some i∈Ij∩Ij′ for j,j′∈{1,…,r} with j=j′.
We have Mk(σ^i)⊆Mk(Λjx), so Lemma 24 applied to the type judgment Λjx⊢Pj:τ^j▹ej says that Mk(σ^i)⊆Mk(τ^j),
and similarly Mk(σ^i)⊆Mk(τ^j′).
But, by a side condition of the (Con) rule, Mk(τ^j) and Mk(τ^j′) are disjoint, so Mk(σ^i)=∅, and hence di=0 by (♢).
In consequence the two sums are indeed equal.
Finally, suppose that R=PQ.
The proof is almost the same as in the previous case.
The derivation for R ends with the (@) rule, whose premisses are Λ0x⊢P:τ^0▹e0≻0 and Λjx⊢Q:τ^j▹ej≻0 for j∈J,
where we assume that 0∈J.
For j∈{0}∪J denote Λj=Λjx[x↦∅].
A side condition of the (@) rule says that Split(Λx∣(Λjx)j∈{0}∪J) holds.
On the on one hand, this implies Split(Λ∣(Λj)j∈{0}∪J).
On the other hand, for j∈{0}∪J we have Λjx=Λj[x↦{σ^i∣i∈Ij}] for some Ij⊆I,
and for i∈I∖⋃j∈{0}∪JIj we have Mk(σ^i)=∅, and thus di=0 and Mk(Σi)=∅ by (♢).
Knowing that Split(Γ∣Λ,(Σi)i∈I) holds, we obtain Split(Γ∣(Λj)j∈{0}∪J,(Σi)i∈Ij,j∈{0}∪J).
For j∈{0}∪J we define Γj by taking Γj(y)=Λj(y)∪⋃i∈IjΣi(y) for all variables y.
Then we have Split(Γ∣(Γj)j∈{0}∪J) and Split(Γj∣Λj,(Σi)i∈Ij) for j∈{0}∪J.
Using the induction assumption for every premiss, we obtain a derivation of Γ0⊢P[S/x]:τ^0▹e0+∑i∈I0di≻0,
and of Γj⊢Q[S/x]:τ^j▹ej+∑i∈Ijdi≻0 for all j∈J.
Because Split(Γ∣(Γj)j∈{0}∪J) holds, and the derived full types are the same as in the derivation for R, and the flag counters are higher by ∑i∈Ijdi,
we can apply the (@) rule and derive Γ⊢R[S/x]:τ^▹e+∑j∈{0}∪J∑i∈Ijdi≻0.
It remains to observe that ∑j∈{0}∪J∑i∈Ijdi=∑i∈Idi.
As already said, di=0 when i∈I∖⋃j∈{0}∪JIj.
Consider some i∈Ij∩Ij′ for j,j′∈{0}∪J with j=j′.
We have Mk(σ^i)⊆Mk(Λjx),
so Lemma 24 applied to the type judgment Λjx⊢T:τ^j▹ej (where T=P or T=Q, depending on j) says that Mk(σ^i)⊆Mk(τ^j),
and similarly Mk(σ^i)⊆Mk(τ^j′).
But, by a side condition of the (@) rule, Mk(τ^j) and Mk(τ^j′) are disjoint, so Mk(σ^i)=∅, and hence di=0 by (♢).
In consequence the two sums are indeed equal.
∎
Lemma 32**.**
Suppose that we can derive Γ⊢RS:τ^▹c≻n so that all premisses of the final (@) rule have [math] in the application counter.
If ord(R)=ord(τ^), and Γ(y)=∅ only for variables y of order at most ord(τ^)−1,
then R=λx.R′, and we can derive Γ⊢R′[S/x]:τ^▹c≻0.
Proof.
Let Γλ⊢R:τ^λ▹e≻0 and Σi⊢S:σ^i▹di≻0 for i∈I be the premisses of the final (@) rule.
Let us write τ^=(m,F,M,τ), and τ^λ=(m,F′,Mλ,T→τ), and σ^i=(m,Fi,Mi,σi) for i∈I.
We have T={(ord(R),Fi↾<ord(R),Mi↾<ord(R),σi)∣i∈I}={σ^i∣i∈I}, because ord(R)=m and sets Fi and Mi contain only numbers smaller than m.
We start by determining the shape of R, by looking at the premiss concerning it, i.e., Γλ⊢R:τ^λ▹e≻0.
If R was a variable, then the derivation of this premiss would consist of the (Var) rule requiring that Γλ(R)=∅, hence also Γ(R)=∅;
this is impossible by assumption since R is of order m.
If R is an application, R=UV, then the derivation of the premiss concerning R starts with the (@) rule, requiring that ord(U)≤m.
However ord(U)≥ord(R)=m, so actually ord(U)=m, and hence the application counter in this premiss should be at least 1, violating our assumption.
It follows that R cannot be an application.
Moreover, R takes an argument, so it cannot start with a symbol.
Thus R starts with a λ-abstraction, R=λx.R′.
The type judgment concerning R is derived using the (λ) rule out of a premiss Λ[x↦T]⊢R′:τ^′▹e≻0,
where τ^′=(m,F′,M′,τ) and Λ(x)=∅.
The two rules imply that Mλ=M′∖Mk(T) and M=Mλ∪Mk(T).
Lemma 24 applied to the type judgment Λ[x↦T]⊢R′:τ^′▹e implies that Mk(T)⊆M′, and thus we obtain M=M′.
Next, we notice that the sets Fi↾≥ord(R) are empty, and that F′∩M=F′∩M′=∅ (because τ^′=(m,F′,M′,τ) is a full type),
and that Compm(M;(F′,e),((Fi↾≥ord(R),di))i∈I)=(F,c) by a side condition of the (@) rule.
In such a situation Lemma 26 implies that F=F′ (thus actually τ^′=τ^) and c=e+∑i∈Idi.
Finally, due to side conditions of the (@) and (λ) rules we have Split(Γ∣Γλ,(Σi)i∈I) and Split(Γλ∣Λ),
hence also Split(Γ∣Λ,(Σi)i∈I).
We also have ord(S)<ord(R)=m.
Thus we can apply Lemma 31 to type judgments Σi⊢S:σ^i▹di≻0 for i∈I
and Λ[x↦T]⊢R′:τ^▹e≻0.
We obtain a derivation of Γ⊢R′[S/x]:τ^▹c≻0, as required.
∎
We now give a generalization of Lemma 7 suitable for induction.
Notice that a subterm of a λ-term may be involved in multiple subtrees of a derivation tree for the whole λ-term.
Because of that, we have to handle multiple derivations for the same λ-term at once.
Lemma 33**.**
Suppose that for a finite set I, a number m, and a λ-term P we can derive Γi⊢P:τ^i▹ci≻ni for i∈I,
where ord(P)≤m, and ∑i∈Ini>0, and for all i∈I we have ord(τ^i)=m and Γi(x)=∅ only for variables x of order at most m−1.
Then there is a λ-term Q such that P→βQ, and we can derive Γi⊢Q:τ^i▹ci≻ni′ for all i∈I,
for some numbers ni′ such that ∑i∈Ini′<∑i∈Ini.
Proof.
The proof is by induction on the smallest total size of derivations needed to derive Γi⊢P:τ^i▹ci≻ni for all i∈I.
Because ∑i∈Ini>0, we have ∣I∣≥1.
Suppose first that for every i∈I all premisses of the last rule used to derive Γi⊢P:τ^i▹ci≻ni have [math] in the application counter.
This is the base case, in which we perform a β-reduction.
Because ∑i∈Ini>0, necessarily P=RS with ord(R)=m, because only in such a situation the application counter in a derived type judgment can be higher than the sum of application counters in premisses.
Then, for every i∈I separately we apply Lemma 32 to our type judgment,
and we obtain that R=λx.R′, and that we can derive Γi⊢R′[S/x]:τ^i▹ci≻0.
Since P=(λx.R′)S→βR′[S/x], this gives the thesis.
Let us now consider the opposite case, when we have a premiss with positive application counter.
We have multiple cases depending on the shape of P, but all of them are similar, and boil down to a use of the induction assumption.
Suppose, for example, that P=RS.
Then for every i∈I the type judgment Γi⊢P:τ^i▹ci≻ni is derived by the (@) rule out of premisses
Γi′⊢R:τ^i′▹ci′≻ni′ and Γi,j⊢S:τ^i,j▹ci,j≻ni,j for j∈Ji, for some finite set Ji.
We have ord(S)<ord(R)≤m by a side condition of the (@) rule.
For every variable x of order at least m we have Γi′(x)⊆Γi(x)=∅ for all i∈I, and Γi,j(x)⊆Γi(x)=∅ for all i∈I, j∈Ji.
When ni,j>0 for some i∈I, j∈Ji, we apply the induction assumption to S and to the collection of all premisses concerning it in all our derivations.
We obtain a λ-term S′ such that S→βS′, and derivations of Γi,j⊢S′:τ^i,j▹ci,j≻ni,j′ for all i∈I, j∈Ji,
where ∑i∈I∑j∈Jini,j′<∑i∈I∑j∈Jini,j.
By applying the (@) rule to these type judgments and to the premisses concerning R, we obtain derivations of Γi⊢Q:τ^i▹ci≻ni′ for all i∈I,
where Q=RS′ and ∑i∈Ini′=∑i∈I(ni+∑j∈Ji(ni,j′−ni,j))<∑i∈Ini.
When ni,j=0 for all i∈I, j∈Ji, we necessarily have ni′>0 for some i (as we are in the case in which some premiss has a positive application counter),
and we apply the induction assumption to the premisses concerning R.
We proceed similarly when P=aP1…Pr for a=br, when P=brP1P2, and when P=λx.R (we cannot have P=x, as then the application counter in all the type judgments would be [math]).
In the case of P=λx.R we use the assumption ord(P)≤m to deduce that the full types assigned to x in type environments of premisses are not assigned to a variable of order higher than m−1: we have ord(x)<ord(P)≤m.
∎
Finally, we observe that Lemma 7 is a special case of Lemma 33, where ∣I∣=1 and the type judgment is of the form ε⊢P:ρ^m▹c.
Appendix F Proof of Lemma 8
Lemma 34**.**
Suppose that Compm(M;(Fi,ci)i∈I)=(F,c), where m≥1.
If ci′≥ci+∣Fi∩{m−1}∣ for i∈I, then Compm−1(M↾<m−1;(Fi↾<m−1,ci′)i∈I)=(F↾<m−1,c′) for some c′≥c+∣F∩{m−1}∣.
Proof.
The definition of the Comp predicate specifies variables fn and fn′.
Let fn,m and fn,m′ be values of these variables in the above instantiation of the Compm predicate, while fn,m−1 and fn,m−1′—in the above instantiation of the Compm−1 predicate.
Notice that fn,m−1=fn,m for n<m−1, and fn,m−1′=fn,m′ for n≤m−1.
In consequence the requirements given by Compm−1 on the set F are satisfied, since they are the same as the requirements given by Compm.
Next, let us observe that fm−1,m≥fm,m′+∣F∩{m−1}∣.
Indeed, if m−1∈M, we have fm,m′=fm−1,m and m−1∈F.
Conversely, if m−1∈M, we have fm,m′=0, and if fm−1,m=0 then also m−1∈F.
Finally, because fm−1,m−1′=fm−1,m′, we have
[TABLE]
We now generalize Lemma 8 to arbitrary type judgments.
Lemma 35**.**
Suppose that we can derive Γ⊢P:(m,F,M,τ)▹c≻0, where ord(P)≤m−1, and for every variable x and every η^∈Γ(x) we have ord(η^)≤m−1.
Then we can also derive Γ⊢P:(m−1,F↾<m−1,M↾<m−1,τ)▹c′ with c′≥c+∣F∩{m−1}∣.
Proof.
Denote τ^=(m,F,M,τ) and σ^=(m−1,F↾<m−1,M↾<m−1,τ).
The proof is by induction on the structure of some fixed derivation of Γ⊢P:τ^▹c≻0.
We have m≥1, since ord(P)≤m−1.
We distinguish several cases depending on the shape of R.
Suppose first that P is a variable, P=x.
Then the (Var) rule used in the derivation implies that Split(Γ∣ε[x↦(k,F,M↾<k,τ)]) holds, and c=0.
By assumption of the lemma we have k≤m−1, so (M↾<m−1)↾<k=M↾<k and F↾<m−1=F (because F⊆{0,…,k−1}).
In consequence, we can use the (Var) rule to derive Γ⊢P:σ^▹0.
Next, suppose that P=brP1P2.
Then the final (Br) rule has a premiss Γ⊢Pk:τ^▹c≻0 for some k∈{1,2}.
Surely ord(Pk)=0≤m−1.
The induction assumption applied to this premiss gives us a derivation of Γ⊢Pk:σ^▹c′ with c′≥c+∣F∩{m−1}∣.
We apply back the (Br) rule, obtaining Γ⊢P:σ^▹c′.
Next, suppose that P=λx.Q.
Then the final (λ) rule has a premiss Γ′[x↦T]⊢Q:(m,F,M′,τ′)▹c≻0, where τ=T→τ^′, and M=M′∖Mk(T), and Split(Γ∣Γ′) holds.
Clearly ord(Q)≤ord(P)≤m−1.
We also have T⊆Ford(P)α, where α is the sort of x.
In consequence, for every variable y and every η^∈(Γ′[x↦T])(y) we have ord(η^)≤m−1.
Using the induction assumption for our premiss we obtain a derivation of Γ′[x↦T]⊢Q:(m−1,F↾<m−1,M′↾<m−1,τ′)▹c′ with c′≥c+∣F∩{m−1}∣.
Because M′↾<m−1∖Mk(T)=(M′∖Mk(T))↾<m−1=M↾<m−1, by applying back the (λ) rule we derive Γ⊢P:σ^▹c′.
Next, suppose that P=aP1…Pr with a=br.
Then τ=o, and the final (Con) rule has premisses Γi⊢Pi:(m,Fi,Mi,o)▹ci≻0 for i∈{1,…,r}.
For every i∈{1,…,r} we have ord(Pi)=0≤m−1, and Γi(x)⊆Γ(x) for every variable x,
so we can use the induction assumption for the i-th premiss and obtain a derivation of Γi⊢Pi:(m−1,Fi↾<m−1,Mi↾<m−1,o)▹ci′ with ci′≥ci+∣Fi∩{m−1}∣.
If r>0, we have a side condition M=⨄i=1rMi, which implies M↾<m−1=⨄i=1rMi↾<m−1.
Another side condition says that Compm(M;({0},0),(Fi,ci)i∈{1,…,r})=(F,c),
and we need to see that Compm−1(M↾<m−1;(F0,c0),(Fi↾<m−1,ci′)i∈{1,…,r})=(F↾<m−1,c′) for some c′≥c+∣F∩{m−1}∣,
where (F0,c0)=({0},0) if m−1>0, and (F0,c0)=(∅,1) if m−1=0.
This follows from Lemma 34, where we notice that {0}↾<m−1=F0, and c0≥0+∣{0}∩{m−1}∣.
Thus we can apply back the (Con) rule deriving Γ⊢P:σ^▹c′.
Finally, suppose that P=QR.
Then the final (@) rule has premisses Γ′⊢Q:(m,F′,M′,T→τ)▹e≻0 and Γi⊢R:(m,Fi,Mi,τi)▹di≻0 for i∈I,
where T={(ord(Q),Fi↾<ord(Q),Mi↾<ord(Q),τi)∣i∈I}.
Because the application counter is [math] in the conclusion, we have ord(Q)=m, so actually ord(Q)≤m−1 by a side condition of the (@) rule.
Simultaneously ord(R)<ord(Q)≤m−1, and the type environments Γ′ and (Γi)i∈I store only full types stored already in Γ.
The induction assumption applied to all premisses gives us derivations of Γ′⊢Q:(m−1,F′↾<m−1,M′↾<m−1,T→τ)▹e′ with e′≥e+∣F′∩{m−1}∣,
and of Γi⊢R:(m−1,Fi↾<m−1,Mi↾<m−1,τi)▹di′ with di′≥di+∣Fi∩{m−1}∣ for i∈I.
The side condition M=M′⊎⨄i∈IMi implies M↾<m−1=M′↾<m−1⊎⨄i∈IMi↾<m−1.
Another side condition says that Compm(M;(F′,e),(Fi,di)i∈I)=(F,c),
which by Lemma 34 implies that Compm−1(M↾<m−1;(F′↾<m−1,e′),(Fi↾<m−1,di′)i∈I)=(F↾<m−1,c′) for some c′≥c+∣F∩{m−1}∣.
We also have that T={(ord(Q),(Fi↾<m−1)↾<ord(Q),(Mi↾<m−1)↾<ord(Q),τi)∣i∈I}, because ord(Q)≤m−1.
Having all this, we can apply back the (@) rule, and derive Γ⊢P:σ^▹c′.
∎
Lemma 8 says that if we can derive ε⊢P:(m,∅,{0,…,m−1},o)▹c≻0 with m>0,
then we can also derive ε⊢P:(m−1,∅,{0,…,m−2},o)▹c′ for some c′≥c.
Here ord(P)=0, so this is just a special case of Lemma 35.
Appendix G Remaining proofs for Section 5
In the final part of Section 5 we have implicitly used the following lemma, which we now prove.
Lemma 36**.**
If we can derive ε⊢P:ρ^0▹c,
then there exists a tree t∈L(P) such that ∣t∣=c.
Proof.
Recall that ρ^0=(0,∅,∅,o).
The proof is by induction on the structure of some fixed derivation of ε⊢P:ρ^0▹c.
Let us analyze the shape of P.
Because the type environment is empty, P cannot be a variable.
The sort of ρ^0, and hence of P, is o, and thus P cannot start with a λ-abstraction.
Moreover, P cannot be an application QR, because the (@) rule requires that ord(Q)≤ord(ρ^0)=0.
Thus P starts with a symbol.
We have two cases.
Suppose first that P=aP1…Pr with a=br.
We notice that ρ^0 is the only full type in F0o, and that Split(ε∣Γ1,…,Γr) implies Γ1=⋯=Γr=ε.
Thus the premisses of the final (Con) rule are ε⊢Pi:ρ^0▹ci, for i∈{1,…,r}.
Because Comp0(∅;(∅,1),(∅,c1),…,(∅,cr))=(∅,c), we have c=1+c1+⋯+cr.
The induction assumption gives us, for i∈{1,…,r}, trees ti such that ∣ti∣=ci and ti∈L(Pi), which means that BT(Pi)→br∗ti.
As t we take the tree having a in its root, and t1,…,tr as subtrees starting in the root’s children.
Because BT(P) has a in its root, and BT(P1),…,BT(Pr) as subtrees starting in the root’s children, it should be clear that BT(P)→br∗t.
Moreover, ∣t∣=1+∣t1∣+⋯+∣tr∣=1+c1+⋯+cr=c.
Another possibility is that P=brP1P2.
Then the final (Br) rule has one premiss ε⊢Pi:ρ^0▹c for some i∈{1,2}.
The induction assumption gives us a tree t such that ∣t∣=c and BT(Pi)→br∗t.
Recalling that BT(P) has br in its root, and BT(Pi) as its subtree starting in the i-th child of the root, we see that BT(P)→brBT(Pi), and thus t∈L(P).
∎
Appendix H Proofs for Section 6
Let us recall two definitions from page 6.
We say that two type judgments are equivalent if they differ only in the value of the flag counter.
Given a λY-term P and a number m,
we have also defined a set D of all derivations of ε⊢P:ρ^m▹c in which on each branch there are at most three type judgments from every equivalence class,
and among premisses of each (@) rule there is at most one type judgment from every equivalence class.
We complete the proof of effectiveness contained in Section 6 by a formal proof of the following lemma.
Lemma 37**.**
Suppose that for some λY-term P we can derive ε⊢P:ρ^m▹c for arbitrarily large numbers c∈N.
Then in the set D there is a derivation in which on some branch there are two equivalent type judgments with different values of the flag counter.
Proof.
In this lemma it is convenient to see derivations as trees: type judgments of a derivation constitute nodes of a tree; premisses of a type judgment are located in its children.
We consider P and m to be fixed.
A derivation is called narrow if among premisses of each its (@) rule there is at most one type judgment from every equivalence class.
We have already justified on page 6 that if a type judgment ε⊢P:ρ^m▹c can be derived,
then it has a narrow derivation.
Moreover, we have justified that there are only finitely many equivalence classes of type judgments that can be used in any derivation of ε⊢P:ρ^m▹c, for any c;
let E be their number.
This gives a bound on the number of premisses of (@) rules in narrow derivations.
Rules (Var), (Br), and (λ) always have at most one premiss.
The number of premisses of a (Con) rule is specified by the rank of the symbol involved;
when this rule is used in a derivation of ε⊢P:ρ^m▹c, this symbol has to appear in P, which gives a bound on its rank.
All this gives us a bound D on the degree of nodes appearing in the considered narrow derivations of ε⊢P:ρ^m▹c for arbitrarily large c.
Looking at the definition of the Compm predicate it is easy to see that if Compm(M;(Fi,ci)i∈I)=(F,c), then c≤∣I∣⋅m+∑i∈Ici.
In consequence, the flag counter in a conclusion of a (@) rule or a (Con) rule, having at most D premisses, can be higher than the sum of flag counters in the premisses at most by (D+1)⋅m+1
(these “+1” appear here, because in the (Con) rule, beside of pairs (Fi,ci) coming from premisses, we pass to the Compm predicate an additional pair (F′,c′) with c′≤1).
The conclusion of every (Br) and (λ) rule has the same flag counter as the only premiss, and the conclusion of the (Var) rule always has [math] in its flag counter.
This means that there is constant C such that in any node of any narrow derivation of ε⊢P:ρ^m▹c
the flag counter can be higher than the sum of flag counters in its premisses at most by C (and simultaneously it cannot be smaller than this sum).
We define a level of a node in a derivation, by induction on the depth of the node.
Leaves and all nodes with flag counter [math] have level [math].
If an internal node has its flag counter positive and equal to the flag counter in some child of this node, then the level of this node is equal to the level of this child
(notice that there is at most one such child, and the flag counter in all other children is [math]).
The level of every other internal node is defined as one plus the maximum of levels of its children.
Next, we prove that for every level i there is a bound Ci on the value of the flag counter among nodes of this level.
Indeed, the value of the flag counter in leaves, and thus in all nodes of level [math], is bounded by C0=C.
Take now a node of level i>0 having only children of levels smaller than i.
In each of these (at most D) children the flag counter is at most Ci−1, so so the flag counter in our node is at most Ci=C+D⋅Ci−1.
If a node of level i has a child of level i, then their flag counter is equal, thus is also at most Ci (trivial induction on the depth of the node).
Let us now take a narrow derivation of ε⊢P:ρ^m▹c for some c>CE−1.
It necessarily contains a node of level greater than E−1.
Moreover, every node of some level i>0 has a child of level at least ≥i−1.
Thus in the considered derivation there exists a branch having nodes of at least E+1 different levels.
Among them we can find two nodes with equivalent type judgments and being on different levels (as the type judgments come from at most E equivalence classes).
If the two nodes had the same flag counters, then also all nodes on the path between them would have the same flag counter, and thus the two nodes would have the same level, which is not the case.
Thus we have found two nodes having equivalent type judgments, different values of the flag counter, and such that one of them is a descendant of the other.
Let x and y be nodes of a derivation of ε⊢P:ρ^m▹c, where y is a descendant of x, and they contain equivalent type judgments.
In such a situation, we can cut out the fragment between nodes x and y, in the following sense:
we decrease by cx−cy the flag counter in every ancestor of x, we remove x and all its descendants not being in the subtree starting in y, and we attach y in the place of x.
This results in some (correct) narrow derivation of ε⊢P:ρ^m▹c′ for some c′.
Take now the smallest (in the sense of the number of nodes) narrow derivation of a type judgment ε⊢P:ρ^m▹c, for any c,
in which there are two nodes u, v having equivalent type judgments, different values of the flag counter, and such that v is a descendant of u.
If this derivation is in D, we are done.
If not, we can find four nodes x1,x2,x3,x4 with equivalent type judgments, such that xi+1 is a descendant of xi for i∈{1,2,3}.
Let ci be the value of the flag counter in xi, for i∈{1,2,3,4}.
We have two cases.
Suppose first that ci>ci+1 for some i∈{1,2,3}.
We then take some j∈{1,2,3}∖{i}, and we cut off the fragment between xj and xj+1;
we obtain a smaller narrow derivation of a type judgment ε⊢P:ρ^m▹c′, for some c′,
in which there are two nodes, namely xi and xi+1, having equivalent type judgments, different values of the flag counter, and such that one of them is a descendant of the other.
This contradicts minimality of our derivation.
Next, suppose that c1=c2=c3=c4.
Then we recall that we already have two nodes u, v having equivalent type judgments, different values of the flag counter, and such that v is a descendant of u.
For i∈{1,2,3} consider the set Vi containing xi and all its descendants not being in the subtree starting in xi+1.
These sets are disjoint, so for some j∈{1,2,3} we have u∈Vj and v∈Vj.
We cut off the fragment between xj and xj+1.
This removes exactly the nodes from Vj, so the nodes u and v are still present in the derivation.
Moreover, equality cj=cj+1 implies that we have not changed flag counters in any node, in particular in u and v, so u and v in the new derivation again have different values of the flag counter.
Thus also in this case we have obtained a contradiction with minimality of our derivation.
∎