Linear $\beta$-reduction

Stefano Guerrini (LIPN, Institut Galilée, Université Paris Nord 13, Sorbonne Paris Cité)

arXiv:1701.04918·cs.LO·January 19, 2017·LINEARITY

TL;DR

This paper introduces a comprehensive analysis of linear beta-reduction at a distance in lambda-calculus, clarifying its syntactic structure and properties, and extends strong normalization proofs to this setting.

Contribution

It provides a general notion of beta-reduction at a distance and linear reduction, along with their properties and relations, using a refined sigma-equivalence for better analysis.

Findings

1. Defined a general notion of beta-reduction at a distance

2. Analyzed relations and properties of linear reduction

3. Extended strong normalization proof to linear reduction in simply typed case

Abstract

Linear head reduction is a key tool for the analysis of reduction machines for lambda-calculus and for game semantics. Its definition requires a notion of redex at a distance named primary redex in the literature. Nevertheless, a clear and complete syntactic analysis of this rule is missing. We present here a general notion of beta-reduction at a distance and of linear reduction (i.e., not restricted to the head variable), and we analyse their relations and properties. This analysis rests on a variant of the so-called sigma-equivalence that is more suitable for the analysis of reduction machines, since the position along the spine of primary redexes is not permuted. We finally show that, in the simply typed case, the proof of strong normalisation of linear reduction can be obtained by a trivial tuning of Gandy's proof for strong normalisation of beta-reduction.

Equations

H ::= □ ∣ λ x . H ∣ H t

E_{1, 2} ::= □ ∣ E_{1} [λ x . E_{2}] t

η (□) = ϵ and η (E_{1} [λ x . E_{2}] t) = η (E_{2}), t / x, η (E_{1})

E [λ x . t] s \to_{β_{d}} E [t {s / x}]

t_{1} / x_{1}, t_{2} / x_{2}, \dots, t_{n} / x_{n} ⟼ E (λ x_{n} . \dots (λ x_{2} . (λ x_{1} . □) t_{1}) t_{2} \dots) t_{n}

E_{1} [λ x . E_{2}] t

E_{1} [E_{2}]

H_{λ} [E [λ x . s]]

H_{λ} [E [s] t]

H_{λ} [E_{1} [s]]

H = H_{λ} [E_{c} [H_{@}]]

t = λ x_{1} . \dots λ x_{n} . (λ y_{1} . (\dots (λ y_{p} . z t_{1} \dots t_{m}) s_{p}) \dots) s_{1}

H

H_{i}

H_{n + j}

H_{n + m + 1}

H = E_{0} [λ x_{1} . E_{1} [λ x_{2} . E_{2} [\dots [λ x_{n} . E_{n} [E_{n + 1} [\dots [E_{n + m - 1} [E_{n + m} t_{1}] t_{2}] \dots] t_{m}]] \dots]]]

H_{λ}

H_{@}

E_{c}

H \sim_{E} H_{c} = λ x_{1} . \dots λ x_{n} . E_{c} [□ t_{1} \dots t_{m}]

((λ x . u) v) w

(λ x . λ y . u) v

E_{1} [(λ x_{1} . (λ x_{2} . E_{2}) t_{2}) t_{1}] \sim E_{1} [(λ x_{2} . (λ x_{1} . E_{2}) t_{1}) t_{2}]

E [λ x . C [x]] s ⊸ E [λ x . C [s^{'}]] s

E [λ x . t] s \to_{g} E [t] if x ∉ FV (t)

E [λ x . H [x]] s ⊸_{h} E [λ x . H [s^{'}]] s

[τ \to σ] = {f \in [τ] \to [σ] ∣ \forall v, w \in [τ] : v ≺_{τ} w \Rightarrow f (v) ≺_{σ} f (w)}

\forall f, g \in [τ \to σ] : f ≺_{τ \to σ} g iff \forall v \in [τ] : f (v) ≺_{σ} g (v)

n +_{o} k = n + k

f +_{[τ \to σ]} k = (λ v \in [τ] . f (v) +_{σ} k)

o_{*}

Full text

Linear $\beta$-reduction

Partially supported by the Project ELICA (ref. ANR-14-CE25-0005), of the ANR program “Fondements du numérique (DS0705) 2014”.

Stefano Guerrini

LIPN

Institut Galilée

Université Paris Nord 13

Sorbonne Paris Cité

[email protected]

1 Introduction

Linear head reduction is a key tool for the analysis of reduction machines for $\lambda$ -calculus and for game semantics. A detailed analysis of it, and more generally of a notion of reduction at a distance, has been given by Accattoli [2] in terms of proof nets and explicit substitutions. Linear head reduction is usually presented in terms of the so-called $\sigma$ -equivalence introduced by Regnier in [9]. In the following, we introduce a variant of the $\sigma$ -equivalence, which has the main advantage of leaving unchanged the order of primary redexes (a notion of $\beta$ -redex that will be discussed later). Such a new equivalence is more suitable for the analysis of abstract reduction machines based on linear head reduction, as for instance Danos and Regnier’s Pointer Abstract Machine (PAM) [4], which has been analysed in detail by the author and Pellitta in [6]. Indeed, most of the material that we shall present in this paper has been developed for formalising the results in [6].

The key tool of our approach is a notion of context which is indeed an implicit representation of environments mapping variables to their values. By means of these contexts, one can define a $\beta$ -reduction at a distance and its linearised version. Both of these reduction rules preserve $\beta$ -equivalence, and both of them are strongly normalising in the case of simply typed $\lambda$ -calculus. However, the proof of strong normalisation is not at all evident. In fact, linear reduction does not erase any term, it just replaces one of the occurrences of a variable with a (larger) $\lambda$ -term; in other words, the size of the reducing term always increases along the reduction. Surprisingly, this apparent difficulty can be trivially overcome by a small tuning of Gandy’s proof for strong normalisation of $\beta$ -reduction [5]. Just by changing a detail in the interpretation of variable occurrences—it suffices to increase by 1 their measure—we can adapt the measure used in Gandy’s proof to the case of linear reduction. Moreover, the new measure obtained in this way simultaneously proves strong normalisation of $\beta$ -reduction and of its linearised version.

As already remarked, linear reduction has been studied in detail by Accattoli [2] by means of linear logic proof nets. Such an approach has been inspired by the structural calculus introduced by Accattoli and Kesner [3], a calculus with explicit substitutions and reduction rules at a distance. In the present paper, our goal is to analyse linear reduction directly on $\lambda$-calculus, without introducing explicit substitutions and without going down to the low-level analysis of reduction that can be achieved by means of proof nets. As we shall see later, such a goal is achieved by introducing a variant of $\sigma$-equivalence, named E-equivalence, which is more suitable for investigating reduction machines based on pointers, as for instance the PAM. Moreover, the proof of strong normalisation that we shall give is much simpler than the one based on reducibility candidates required in the case of proof nets. Recently, Pedrot and Saurin [8] have proposed a call-by-need variant of $\lambda\mu$-calculus defined in terms of a notion of closure contexts. Such closure contexts correspond to the E-contexts introduced in the following by Definition 2, but extended to $\lambda\mu$-calculus too. We remark that most of the material that we shall present below is a by-product of the study of the PAM started in [6], and that one of the further developments mentioned in [6] is the extension of the PAM to the $\lambda\mu$-calculus; Pedrot and Saurin’s call-by-need closure contexts seem to be the right tool for formalising such an extension.

2 Preliminaries

The set of the $\lambda$ -terms $\Lambda$ is defined by the abstract grammar $s,t::=x\mid\lambda x.t\mid st$ , where $x\in\mathcal{V}$ and $\lambda x$ is a binder for the variable $x$ . The set of the free variables of a term $t$ is denoted by $\mathsf{FV}(t)$ . The key computational step of $\lambda$ -calculus is $\beta$ -contraction $(\lambda x.t)s\to_{\beta}t\{s/x\}$ , where $t\{s/x\}$ denotes that every free occurrence of the variable $x$ in $t$ is replaced by $s$ , provided that such a replacement does not cause any name clash of some free variable of $s$ ; otherwise, if this is not the case, one has to preliminarily apply a suitable sequence of variable renamings, or $\alpha$ -rules, to $t$ . (The $\alpha$ -congruence is the least congruence induced by the $\alpha$ -rule $\lambda x.t=\lambda y.t\{y/x\}$ , in which $y$ replaces all the free occurrences of $x$ in $t$ and $y$ does not occur in $t$ .) As usual, $\to_{\beta}^{*}$ denotes the reflexive and transitive closure of the binary relation defined by the $\beta$ -rule, and $=_{\beta}$ denotes the corresponding equivalence (closing by symmetry also). Such notations will extend to the other rewriting rules that we shall see in the paper.

In order to avoid the bureaucratic problems connected to $\alpha$-congruence, we work modulo it, and we assume that all the terms that we shall consider have the distinct names property (sometimes referred to as Barendregt's variable convention). A term has the distinct names property if no free variable in it has the same name as a bound variable, and all the bound variables have distinct names. Remarkably, for every $\lambda$-term there is an $\alpha$-congruent one which has the distinct names property. In this way, no name clash can arise by replacing $s$ for $x$ in $t$ in the $\beta$-reduction of $(\lambda x.t)s$. However, even if correct, the resulting term $t\{s/x\}$ might not have the distinct names property. In order to guarantee that $t\{s/x\}$ preserves the distinct names property of $(\lambda x.t)s$, we assume that each occurrence of $x$ is replaced with a fresh copy of $s$, in which every bound variable has a fresh name which has not been already used in the term or in another copy of $s$.
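The definitions of this section can be made concrete with a small sketch. The tuple encoding of terms and the names `var`, `lam`, `app`, `free_vars`, `subst`, and `beta_contract` are illustrative assumptions, not notation from the paper. The substitution is deliberately naive: it relies on the distinct names property and does not create fresh copies of the argument.

```python
# Terms as nested tuples (an illustrative encoding, not the paper's notation):
#   ("var", x)   ("lam", x, body)   ("app", fun, arg)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def free_vars(t):
    """FV(t): the set of names of the free variables of t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """t{s/x}, assuming the distinct names property (so no capture can occur)."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lam":
        # a binder for x would shadow it; under distinct names this never happens
        return t if t[1] == x else lam(t[1], subst(t[2], x, s))
    return app(subst(t[1], x, s), subst(t[2], x, s))

def beta_contract(redex):
    """(\\x.t) s  ->  t{s/x}; raises ValueError if the argument is not a beta-redex."""
    if redex[0] == "app" and redex[1][0] == "lam":
        _, (_, x, body), s = redex
        return subst(body, x, s)
    raise ValueError("not a beta-redex")
```

For instance, `beta_contract(app(lam("x", app(var("x"), var("x"))), var("y")))` yields the encoding of $y\,y$. Renaming the bound variables of each copy of $s$, as required to preserve the distinct names property, is left out of this sketch.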

In the simply typed $\lambda$-calculus, every term has a type. The set of types is given by the abstract grammar $\tau,\sigma::=o\mid\tau\to\sigma$, where the constant $o$ is the unique base type and any type $\tau\to\sigma$ is called a functional type. The set $\Lambda^{\to}$ of the simply typed terms is the subset of $\Lambda$ whose terms respect the following typing rules:

(i) each variable $x$ has a given type $\tau$ ;

(ii) if the variable $x$ has type $\tau$ , and the term $t$ has type $\sigma$ , then $\lambda x.t$ has type $\tau\to\sigma$ ;

(iii) if the term $s$ has type $\tau\to\sigma$ and $t$ has type $\tau$ , then $st$ has type $\sigma$ .

We shall write $t:\tau$ or $t^{\tau}$ to denote that a term $t$ has type $\tau$ . The $\beta$ -rule preserves typing; namely, if $t\to_{\beta}^{*}s$ and $t:\tau$ , then $s:\tau$ .
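The three typing rules can be read as a recursive procedure. The sketch below assumes the same tuple encoding of terms as an illustration (`("var", x)`, `("lam", x, body)`, `("app", fun, arg)`), encodes the base type as the string `"o"` and $\tau\to\sigma$ as `("->", tau, sigma)`, and takes the given type of each variable from a hypothetical dictionary `var_types`.

```python
def type_of(t, var_types):
    """Return the simple type of t, or raise TypeError if t is ill-typed.

    Types: "o" for the base type, ("->", tau, sigma) for a functional type.
    var_types maps every variable name (free or bound) to its given type.
    """
    if t[0] == "var":
        return var_types[t[1]]                                    # rule (i)
    if t[0] == "lam":
        return ("->", var_types[t[1]], type_of(t[2], var_types))  # rule (ii)
    fun = type_of(t[1], var_types)
    arg = type_of(t[2], var_types)
    if fun != "o" and fun[1] == arg:                              # rule (iii)
        return fun[2]
    raise TypeError("ill-typed application")
```

With `var_types = {"x": "o", "f": ("->", "o", "o")}`, the term $\lambda f.\lambda x.f\,x$ receives the type $(o\to o)\to(o\to o)$.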

A reduction strategy is a set of rules specifying how to reduce a $\lambda$ -term. Roughly speaking, given a reducible term $t$ , a reduction strategy is a function that selects the redex (or the redexes) of $t$ that must (or among which we can choose the redex to) be reduced at the next step. A reduction strategy defines a sub-rewriting system of $\beta$ -reduction and, in some cases, if some $\beta$ -reducible term $t$ contains no valid redex for the given reduction strategy, it introduces new normal forms.

2.1 Head reduction

Let us say that a $\beta$-redex $(\lambda x.t)s$ is in outermost head position in $v$ when $v=\lambda y_{1}.\ldots\lambda y_{k}.(\lambda x.t)su_{1}\ldots u_{h}$, and that $v$ head reduces to $v^{\prime}=\lambda y_{1}.\ldots\lambda y_{k}.t\{s/x\}u_{1}\ldots u_{h}$, written $v\to_{h}v^{\prime}$, by reducing its outermost head redex. (Usually this is simply referred to as head position. In the following we shall however present a larger notion of head position, in which a $\beta$-redex may be in head position even if it is inside the body of a $\beta$-redex in head position. According to such a new notion of head position, the redex reduced by the head reduction is the outermost $\beta$-redex in head position.) A term $v$ is in head normal form when $v=\lambda y_{1}.\ldots\lambda y_{k}.xu_{1}\ldots u_{h}$, which in general is not a $\beta$-normal form, since $u_{1},\ldots,u_{h}$ may contain $\beta$-redexes. Indeed, the $\beta$-normal form of a term $t$, if it exists, can be found by first head reducing $t$ to its head normal form $\lambda y_{1}.\ldots\lambda y_{k}.xu_{1}\ldots u_{h}$ (if any), and then by recursively applying the head reduction strategy to every $u_{i}$ and to the subterms of their head normal forms.
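A minimal sketch of head reduction follows, under the same illustrative tuple encoding of terms. The names `head_step` and `head_normal_form` and the `fuel` bound are assumptions of this sketch; substitution is naive and relies on the distinct names property, so it is only meant for small examples.

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(t, x, s):
    """t{s/x}, assuming distinct names (no capture, no fresh copies)."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else lam(t[1], subst(t[2], x, s))
    return app(subst(t[1], x, s), subst(t[2], x, s))

def head_step(v):
    """One head reduction step, or None when v is in head normal form."""
    if v[0] == "lam":                      # skip the leading abstractions
        body = head_step(v[2])
        return None if body is None else lam(v[1], body)
    args, head = [], v
    while head[0] == "app":                # peel v = head s u1 ... uh
        args.append(head[2])
        head = head[1]
    if head[0] != "lam":                   # head variable: nothing to reduce
        return None
    result = subst(head[2], head[1], args.pop())   # contract (\x.t) s
    while args:                            # re-apply u1 ... uh in order
        result = app(result, args.pop())
    return result

def head_normal_form(v, fuel=1000):
    """Iterate head_step; give up after `fuel` steps (the term may diverge)."""
    for _ in range(fuel):
        nxt = head_step(v)
        if nxt is None:
            return v
        v = nxt
    raise RuntimeError("no head normal form found within the step budget")
```

Applied to the encoding of $\lambda y.(\lambda x.x)\,a\,b$, `head_step` returns the encoding of $\lambda y.a\,b$, which is a head normal form.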

2.2 Head contexts

As usual, a context $C$ is a term with a hole $\square$ (a sort of dummy free variable occurring exactly once in the term) $C::=\square\mid\lambda x.C\mid Ct\mid tC$ . Given any term $t$ , by $C[t]$ we denote the term obtained by replacing the hole of the context $C$ with the term $t$ , without performing any variable renaming; therefore, when the hole is under the scope of a $\lambda$ -abstraction binding the variable $x$ , any free occurrence of $x$ in $t$ is captured in $C[t]$ , and becomes bound.

Definition 1 (H-context, head variable)

A head context, or H-context, is a context whose hole appears in head position. More precisely, H-contexts are defined by the following grammar

$$H::=\square\mid\lambda x.H\mid Ht$$

A head context of a term $t$ is any H-context $H$ s.t. $t=H[s]$ , for some term $s$ , that we shall say to be in head position in $t$ . In particular, for every $\lambda$ -term $t$ , there is a unique head context $H$ of $t$ (the maximal head context of $t$ ) and a unique variable $x=\operatorname{\mathsf{hv}}(t)$ (the head variable of $t$ ) s.t. $t=H[x]$ . □

2.3 Spine

A spine $\lambda$-abstraction/application of a term $t$ is any $\lambda$-abstraction/application in head position in $t$. The spine of $t=H[x]$, and of its head context $H$, is the sequence of its spine $\lambda$-abstractions/applications ordered from the head variable of $t$ (the hole of $H$) to its root. A variable $x$ bound by a spine abstraction is a spine variable, while the right subterm of a spine application is a spine argument of $t$. By $\mathsf{SV}(t)$ and $\mathsf{SV}(H)$ we denote the set of the spine variables of a term $t$ and of an H-context $H$, respectively.

An H-context $H_{\lambda}$ is a $\lambda$-context if its spine is formed of $\lambda$-abstractions only (equivalently, $H_{\lambda}$ has no spine arguments). An H-context $H_{@}$ is a $@$-context if its spine is formed of applications only (equivalently, $H_{@}$ has no spine variables).

3 $\beta$ -reduction at a distance

3.1 Environment contexts

Definition 2 (E-context)

An environment context, or E-context, is a particular H-context in which spine $\lambda$ -abstractions and spine applications are balanced. E-contexts are defined by the grammar

$$E,E_{1},E_{2}::=\square\mid E_{1}[\lambda x.E_{2}]t$$

□

An E-context $E$ contains an equal number $\texttt{\#}_{p}E$ of spine variables and of spine arguments. For every E-context $E\neq\square$ , there is a unique pair $(x,t)$ s.t. $E=E_{1}[\lambda x.E_{2}]t$ , for some pair of E-contexts $E_{1},E_{2}$ . Therefore, every E-context defines a unique bijection between its spine variables and its spine arguments. Such a correspondence can be formalised in terms of environments. An environment $\eta=t_{1}/x_{1},\ldots,t_{k}/x_{k}$ is an ordered sequence of variable substitutions $t_{i}/x_{i}$ (where $t_{i}$ is a term replacing the variable $x_{i}$ ). Given an environment $\eta$ , we define $t\{\eta\}=t\{t_{1}/x_{1},\ldots,t_{k}/x_{k}\}=t\{t_{1}/x_{1}\}\ldots\{t_{k}/x_{k}\}$ .

Definition 3

The environment $\eta(E)$ associated to an E-context is inductively defined by

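$$\eta(\square)=\epsilon\qquad\text{and}\qquad\eta(E_{1}[\lambda x.E_{2}]t)=\eta(E_{2}),\,t/x,\,\eta(E_{1})$$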

□

According to the above definition, every pair of matching spine argument/variable corresponds to a substitution $t/x$ in $\eta(E)$ . We remark that the order of the substitutions in an environment is relevant, since for $i<j$ , the occurrences of $x_{j}$ in the term $t_{i}$ are replaced by the term $t_{j}$ , while this is not the case for any occurrence of $x_{j}$ in a term $t_{k}$ with $k\geq j$ . In particular, the order of the spine variables in $\eta(E)$ corresponds to the order in which they appear in $E$ , assuming to move from the inner head position to the root. In other words, $x$ precedes $y$ in $\eta(E)$ iff the binder of $x$ is in the scope of the binder of $y$ .

Lemma 1

Let $E$ be an E-context. For every $\lambda$ -term t, $E[t]\to_{\beta}^{*}t\{\eta(E)\}$ . □
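The environment of an E-context can be computed by walking its spine from the root to the hole with a stack of pending spine arguments. The sketch below is one possible reading of Definition 3, not the paper's own formulation: the tuple encoding of terms, the marker `HOLE`, and the names `environment` and `apply_env` are assumptions made for illustration.

```python
HOLE = ("hole",)                       # the hole of a context (illustrative marker)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(t, x, s):
    """t{s/x}, assuming distinct names."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else lam(t[1], subst(t[2], x, s))
    return app(subst(t[1], x, s), subst(t[2], x, s))

def environment(E):
    """eta(E) as a list of (term, variable) pairs, innermost binder first."""
    stack, pairs = [], []
    while E != HOLE:
        if E[0] == "app":              # spine argument: wait for its binder
            stack.append(E[2])
            E = E[1]
        elif E[0] == "lam" and stack:  # spine variable: match the latest argument
            pairs.append((stack.pop(), E[1]))
            E = E[2]
        else:
            raise ValueError("not an E-context")
    if stack:
        raise ValueError("not an E-context: unmatched spine argument")
    return pairs[::-1]                 # binders were met from the root inwards

def apply_env(t, env):
    """t{eta} = t{t1/x1}...{tk/xk}, substitutions applied left to right."""
    for s, x in env:
        t = subst(t, x, s)
    return t
```

For $E=(\lambda y.\lambda x.\square)\,s\,t$ the sketch returns $t/x,\,s/y$. For $E=(\lambda y.(\lambda x.\square)\,y)\,s$ it returns $y/x,\,s/y$, and applying this environment to $x$ gives $s$, which shows why the order of the substitutions matters.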

3.2 Primary redexes and $\beta$ -contraction at a distance

Any pair of matching spine argument/variable in an E-context is a sort of redex at a distance.

Definition 4 (Primary $\beta$ -redex)

A $\beta$ -redex at a distance is a term $E[\lambda x.t]s$ , where $E$ is an E-context. A primary $\beta$ -redex is a $\beta$ -redex at a distance occurring in a head position. □

As a particular case, for $E=\square$ , any $\beta$ -redex is a $\beta$ -redex at a distance. $\beta$ -redexes at a distance can be reduced as usual $\beta$ -redexes, by defining the following generalisation at a distance of the $\beta$ -rule

$$E[\lambda x.t]s\to_{\beta_{d}}E[t\{s/x\}]$$

and by taking the $\beta_{d}$ -reduction as the closure by contexts of the above rule. The H-context $E[\lambda x.\square]s$ of a $\beta$ -redex at a distance is an E-context. Then, every pair $t/x$ of matching spine argument/variable of an $E$ -context (and therefore every substitution in $\eta(E)$ ) forms a primary redex. As a consequence, it is readily seen that $E[t]\to_{\beta_{d}}^{*}t\{\eta(E)\}$ for every E-context $E$ and every term $t$ . More generally, $\beta$ -reduction at a distance is sound w.r.t. the usual $\beta$ -reduction.

Proposition 1

Let $t\to_{\beta_{d}}^{*}s$ , then $t=_{\beta}s$ . Moreover, $s$ is a normal form for $\to_{\beta_{d}}$ iff it is a $\beta$ -normal form. □
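The $\beta_{d}$-rule can be sketched as follows for a redex at a distance sitting at the root of a term. The function name `beta_d_root`, the tuple encoding, and the use of a depth counter to recognise the E-context are assumptions of this sketch; closure by contexts is not implemented.

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def subst(t, x, s):
    """t{s/x}, assuming distinct names."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else lam(t[1], subst(t[2], x, s))
    return app(subst(t[1], x, s), subst(t[2], x, s))

def beta_d_root(term):
    """E[\\x.t] s -> E[t{s/x}] at the root; None if the root is not such a redex."""
    if term[0] != "app":
        return None
    s = term[2]

    def go(f, depth):
        # depth = number of spine arguments of E still waiting for their binder
        if f[0] == "app":
            inner = go(f[1], depth + 1)
            return None if inner is None else app(inner, f[2])
        if f[0] == "lam":
            if depth == 0:             # this abstraction is the one matched by s
                return subst(f[2], f[1], s)
            inner = go(f[2], depth - 1)
            return None if inner is None else lam(f[1], inner)
        return None                    # reached a variable: no matching binder

    return go(term[1], 0)
```

On $((\lambda y.\lambda x.x\,y)\,a)\,b$ the argument $b$ is matched with $\lambda x$ through the E-context $(\lambda y.\square)\,a$, and the result is $(\lambda y.b\,y)\,a$. With $E=\square$ the function behaves as ordinary $\beta$-contraction.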

4 Spine permutation equivalence of $\lambda$ -terms

The head canonical E-contexts are a particular case of E-contexts in which every redex at a distance is also a $\beta$-redex. Head canonical E-contexts are defined by the grammar $E_{c}::=\square\mid(\lambda x.E_{c})t$, and any head canonical E-context has the shape $(\lambda x_{n}.\ldots(\lambda x_{2}.(\lambda x_{1}.\square)t_{1})t_{2}\ldots)t_{n}$. An environment $\eta$ can be seen as the explicit representation of a head canonical E-context $\mathcal{E}(\eta)$ in which the order of the $\beta$-redexes along the spine is the inverse of the order of the substitution pairs in the environment

$$t_{1}/x_{1},t_{2}/x_{2},\ldots,t_{n}/x_{n}\ \overset{\mathcal{E}}{\longmapsto}\ (\lambda x_{n}.\ldots(\lambda x_{2}.(\lambda x_{1}.\square)t_{1})t_{2}\ldots)t_{n}$$

which corresponds to the inductive definition $\mathcal{E}(\epsilon)=\square$, and $\mathcal{E}(t/x,\eta)=\mathcal{E}(\eta)[(\lambda x.\square)t]$.

4.1 Surface E-equivalence

By Lemma 1, we have that $E_{1}=_{\beta}E_{2}$ , for every pair of E-contexts $E_{1}$ and $E_{2}$ s.t. $\eta(E_{1})=\eta(E_{2})$ . We can then define the following equivalence.

Definition 5 (Surface E-equivalence on E-contexts)

The surface E-equivalence on E-contexts is the least equivalence $\sim_{\!\scriptscriptstyle E}$ defined by

$$E_{1}[\lambda x.E_{2}]t\sim_{\!\scriptscriptstyle E}E_{1}[(\lambda x.E_{2})t]$$

□

Such an equivalence captures exactly the equivalence classes of E-contexts $\{E\mid\mathcal{E}(\eta(E))=E_{c}\}$ , where $E_{c}$ is head canonical, as formally stated by the following lemma.

Lemma 2

For every E-context $E$ , there is a unique canonical E-context $E_{c}\sim_{\!\scriptscriptstyle E}E$ , which is also the unique normal form of the terminating rewriting system $\to_{E}$ obtained by orienting the $E$ -equivalence rules of Definition 5 from the left to the right

$$E_{1}[\lambda x.E_{2}]t\to_{E}E_{1}[(\lambda x.E_{2})t]\qquad\text{(with }E_{1}\neq\square\text{)}$$

Moreover, $E_{c}=\mathcal{E}(\eta(E))$ , and therefore $E\sim_{\!\scriptscriptstyle E}E^{\prime}$ iff $\eta(E)=\eta(E^{\prime})$ . □

Example 1

Let $E=E_{1}[\lambda x.E_{2}]t$ with $E_{1}=(\lambda y.\square)s$ and $E_{2}=\square$ . The E-context $E_{c}=(\lambda y.(\lambda x.\square)t)s$ is the unique canonical E-context $\sim_{\!\scriptscriptstyle E}$ -equivalent to $E=(\lambda y.\lambda x.\square)st$ . □

4.2 Canonical $\lambda$ -terms

The E-equivalence can be extended to terms. In the corresponding head canonical forms, along the spine, one finds first all the unmatched spine abstractions, then the E-context formed of the primary redexes, and finally the unmatched spine arguments.

Definition 6 (Surface E-equivalence on terms)

The surface E-equivalence on terms is the least equivalence defined by the E-equivalence rules on E-contexts of Definition 5, plus

[TABLE]

where $H_{\lambda}$ is a $\lambda$ -context, and $E,E_{1},E_{2}$ are E-contexts. The equivalence naturally extends to H-contexts, by replacing $\square$ for $s$ in the above equations. □

Definition 7 (head canonical $\lambda$ -term)

Let us say that $H$ is a head canonical H-context when

$$H=H_{\lambda}[E_{c}[H_{@}]]$$

where $H_{\lambda}$ is a $\lambda$ -context, $H_{@}$ is an $@$ -context, and $E_{c}$ is a head canonical E-context. The spine $\lambda$ -abstractions of $H_{\lambda}$ are the head $\lambda$ -abstractions of $H$ , while the spine arguments of $H_{@}$ are the head arguments of $H$ . The $\lambda$ -term $t$ is head canonical when its maximal head context is head canonical. □

Summing up, any head canonical $\lambda$ -term $t$ has the shape

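$$t=\lambda x_{1}.\ldots\lambda x_{n}.(\lambda y_{1}.(\ldots(\lambda y_{p}.z\,t_{1}\ldots t_{m})s_{p})\ldots)s_{1}$$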

and we can define $\texttt{\#}_{\lambda}t=n$ , $\texttt{\#}_{@}t=m$ , $\eta(t)=E_{c}$ , and $\texttt{\#}_{p}t=\texttt{\#}_{p}E_{c}=p$ .

Every H-context $H$ , and then every $\lambda$ -term $t=H[x]$ , has a unique E-equivalent head canonical form $H_{c}$ , or $H_{c}[x]$ for terms. Moreover, as shown by Theorem 1 below, $H_{c}$ preserves the same relative positions of unmatched spine $\lambda$ -abstractions, unmatched spine arguments, and primary redexes of $H$ . (A spine $\lambda$ -abstraction/argument is unmatched when it is not involved in a primary redex.) More precisely, the $i$ -th head $\lambda$ -abstraction of $H_{c}$ is the $i$ -th unmatched $\lambda$ -abstraction on the spine of $H$ , the $i$ -th head argument of the head canonical form is the $i$ -th unmatched spine argument on the spine of $H$ , the $i$ -th primary redex of $H_{c}$ is the $i$ -th primary redex on the spine of $H$ .

Theorem 1

For any H-context $H$, there is a unique head canonical context $H_{c}\sim_{\!\scriptscriptstyle E}H$. More precisely,

1. for every H-context $H$, there is a unique sequence of spine variables $x_{1},\ldots,x_{n}$, a unique sequence of spine arguments $t_{1},\ldots,t_{m}$, and a unique sequence of E-contexts $E_{0},E_{1},\ldots,E_{n+m}$ s.t.

[TABLE]

that is

$$H=E_{0}[\lambda x_{1}.E_{1}[\lambda x_{2}.E_{2}[\ldots[\lambda x_{n}.E_{n}[E_{n+1}[\ldots[E_{n+m-1}[E_{n+m}t_{1}]t_{2}]\ldots]t_{m}]]\ldots]]]$$

2. there is a unique head canonical context $H_{c}\sim_{\!\scriptscriptstyle E}H$, and $H_{c}=H_{\lambda}[E_{c}[H_{@}]]$ is equal to

[TABLE]

that is

$$H\sim_{\!\scriptscriptstyle E}H_{c}=\lambda x_{1}.\ldots\lambda x_{n}.E_{c}[\square\,t_{1}\ldots t_{m}]$$

where $\widetilde{E}_{i}=\mathcal{E}(\eta(E_{i}))\sim_{\!\scriptscriptstyle E}E_{i}$ is the unique head canonical E-context equivalent to $E_{i}$;

3. the canonical context $H_{c}$ of $H$ is the unique normal form of the rewriting system $\to_{E}$ obtained by orienting from the left to the right the surface E-equivalences on terms of Definition 6. Namely,

[TABLE]

plus the rules for E-contexts in Lemma 2.

□
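The head canonical form can be computed directly by a single walk along the spine, without applying the rewriting rules one by one. The sketch below is an illustration under assumptions: terms use a tuple encoding, the name `head_canonical` is hypothetical, and primary redexes are ordered by the position of their binders, which is how this sketch reads Definition 3 and Theorem 1. It relies on the distinct names property, since arguments and abstractions are moved across binders.

```python
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

def head_canonical(t):
    """Rebuild t as  \\x1...\\xn. E_c[z t1 ... tm]  (surface canonical form)."""
    stack, pairs, head_lams = [], [], []
    while t[0] != "var":
        if t[0] == "app":              # spine argument, pending
            stack.append(t[2])
            t = t[1]
        elif stack:                    # matched abstraction: a primary redex
            pairs.append((t[1], stack.pop()))
            t = t[2]
        else:                          # unmatched spine abstraction
            head_lams.append(t[1])
            t = t[2]
    body = t                           # the head variable z
    for arg in reversed(stack):        # unmatched arguments, innermost first
        body = app(body, arg)
    for x, s in reversed(pairs):       # primary redexes, innermost binder first
        body = app(lam(x, body), s)
    for x in reversed(head_lams):      # unmatched abstractions go to the front
        body = lam(x, body)
    return body
```

On $(\lambda y.\lambda x.v)\,s\,t$ the result is $(\lambda y.(\lambda x.v)\,t)\,s$, in agreement with Example 1, and applying `head_canonical` twice gives the same term as applying it once.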

4.3 E-equivalence

The surface E-equivalence permutes the arguments on the spine of a term without modifying them. The E-equivalence is obtained by recursively applying the surface E-equivalence to spine arguments too. If we denote by $\operatorname{\mathsf{arg}}(t,i)$ the $i$ -th head spine argument of the term $t$ (which corresponds to the $i$ -th spine argument in the head $@$ -context of its head canonical form) and by $\operatorname{\mathsf{arg}}(t,-i)$ the spine argument of the $i$ -th primary redex of $t$ (which corresponds to the $i$ -th spine argument in the head canonical E-context $E_{c}$ of the head canonical form of $t$ ), we define $\simeq_{\!\scriptscriptstyle E}$ as the least equivalence s.t. $t_{1}\simeq_{\!\scriptscriptstyle E}t_{2}$ if $t_{1}\sim_{\!\scriptscriptstyle E}t_{2}$ , and $\operatorname{\mathsf{arg}}(t_{1},i)\simeq_{\!\scriptscriptstyle E}\operatorname{\mathsf{arg}}(t_{2},i)$ , for $1\leq i\leq\texttt{\#}_{@}t_{1}=\texttt{\#}_{@}t_{2}$ or $-\texttt{\#}_{p}t_{2}=-\texttt{\#}_{p}t_{1}\leq i\leq-1$ .

4.4 $\sigma$ -equivalence

The $\sigma$ -equivalence [9] is the least congruence induced by

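$$((\lambda x.u)v)w\sim_{\sigma}(\lambda x.uw)v\qquad\qquad(\lambda x.\lambda y.u)v\sim_{\sigma}\lambda y.(\lambda x.u)v$$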

The rewriting system obtained by orienting the latter $\sigma$ -equivalences from the left to the right is terminating—its head canonical forms are the same already defined for the E-equivalence—but is not confluent. Indeed, the $\sigma$ -equivalence contains the E-equivalence, but it equates head canonical forms $E_{1}$ and $E_{2}$ s.t. the environments $\eta(E_{1})$ and $\eta(E_{2})$ are equivalent modulo the following permutation rule $t_{1}/x_{1},t_{2}/x_{2}\sim t_{2}/x_{2},t_{1}/x_{1}$ if $x_{1}\not\in\mathsf{FV}(t_{2})$ and $x_{2}\not\in\mathsf{FV}(t_{1})$ .

Example 2

Let us take the $\lambda$ -term $u=E[v]=(\lambda y.\lambda x.v)st$ , where $E$ is the $E$ -context of Example 1. Its unique head E-canonical form is $(\lambda y.(\lambda x.v)t)s$ , which can be also obtained by applying the first $\sigma$ -rule. However, since by applying the second $\sigma$ -rule, $u\to_{\sigma}(\lambda x.(\lambda y.v)s)t$ too, the $\lambda$ -term $u$ has two $\sigma$ -equivalent canonical forms. □

Summing up, the E-equivalence is a variant of the $\sigma$-equivalence which equates fewer terms than the latter. The definition of the $\sigma$-equivalence is simpler and more elegant, and has a direct and nice interpretation in terms of linear logic proof nets. However, the better rewriting properties of the E-equivalence (uniqueness of canonical forms and preservation of the relative positions of primary redexes) make it more suitable for a finer analysis of reduction machines requiring a reduction at a distance based on $\sigma$-equivalence, as for instance the PAM. The $\sigma$-equivalence can be recovered from the E-equivalence by adding the following permutation equivalence of primary redexes

$$E_{1}[(\lambda x_{1}.(\lambda x_{2}.E_{2})t_{2})t_{1}]\sim E_{1}[(\lambda x_{2}.(\lambda x_{1}.E_{2})t_{1})t_{2}]$$

if $x_{1}\not\in\mathsf{FV}(t_{2})$ and $x_{2}\not\in\mathsf{FV}(t_{1})$, to the E-equivalence of E-contexts.

5 Linear head reduction

5.1 Linear reduction

Let $(\lambda x.t)s$ be a redex s.t. the term $t$ contains at least one occurrence of $x$ . For any occurrence of $x$ in $t$ , we can take the context $C$ obtained by replacing such an occurrence of $x$ with $\square$ . The following reduction rule $(\lambda x.C[x])s\multimap_{\beta}(\lambda x.C[s^{\prime}])s,$ where $s^{\prime}$ is a fresh copy of $s$ , is a linearised variant of the usual $\beta$ -rule in which, instead of removing the redex after replacing all the occurrences of the bound variable $x$ , the redex is kept and only one occurrence of $x$ is replaced by a fresh copy of the argument $s$ . Such a linear $\beta$ -reduction can be extended to be applied at a distance too. We obtain then the linear reduction rule (at a distance)

$$E[\lambda x.C[x]]s\multimap E[\lambda x.C[s^{\prime}]]s$$

where $s^{\prime}$ is a fresh copy of $s$. When the term $t$ in $E[\lambda x.t]s$ does not contain any occurrence of $x$, we can instead take the following garbage rule (which is just a degenerate case of $\beta$-reduction at a distance)

$$E[\lambda x.t]s\to_{g}E[t]\quad\text{if }x\not\in\mathsf{FV}(t)$$

Given a $\beta$-redex (at a distance), by iterating the linear $\beta$-reduction (at a distance), we can eventually obtain a redex (at a distance) to which the garbage rule applies. Therefore, $\beta$-reduction (at a distance) can be simulated by a sort of affine reduction $\to_{a}$ which is the union of linear and garbage reduction.
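As a small worked illustration of this simulation (the term is chosen here for the example and does not come from the paper), take the redex $(\lambda x.x\,x)\,s$ with $E=\square$, and write $s_{1},s_{2}$ for two fresh copies of $s$. Two linear steps replace the two occurrences of $x$ one at a time, and a final garbage step removes the redex, whose bound variable no longer occurs:

```latex
\begin{align*}
(\lambda x.x\,x)\,s
  &\multimap (\lambda x.s_{1}\,x)\,s \\
  &\multimap (\lambda x.s_{1}\,s_{2})\,s \\
  &\to_{g} s_{1}\,s_{2}
\end{align*}
```

The result $s_{1}\,s_{2}$ is $\alpha$-congruent to $(x\,x)\{s/x\}=s\,s$, the contractum of the single $\beta$-step. Note that the term grows at each linear step and only shrinks at the garbage step.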

Proposition 2

Let $\to_{a}=\multimap\cup\to_{g}$.

1. If $t\to_{\beta}^{*}s$, then $t\to_{a}^{*}s$. Moreover, there is $s^{\prime}$ s.t. $t\multimap^{*}s^{\prime}\to_{g}^{*}s$.

2. If $t\to_{a}^{*}u$, then $u=_{\beta}t$. Therefore, there is $s$ s.t. $t\to_{\beta}^{*}s$ and $u\to_{a}^{*}s$.

□

As a consequence of the above proposition, a term has a normal form for $\to_{a}$ iff it has a $\beta$ -normal form; moreover, the two normal forms coincide. We also remark the second part of the first item of Proposition 2. This is a particular case of a more general property stating that garbage reductions can be always postponed; that is, for every $t\to_{a}^{*}s$ , there is $s^{\prime}$ s.t. $t\multimap^{*}s^{\prime}\to_{g}^{*}s$ .

5.2 Linear head $\beta$ -rule

A particular case of linear reduction arises when the occurrence to be replaced is the head variable.

Definition 8 (Linear head reduction)

The linear head reduction is the least reduction which contains the linear head $\beta$ -rule

$$E[\lambda x.H[x]]s\multimap_{h}E[\lambda x.H[s^{\prime}]]s$$

where $s^{\prime}$ is a fresh copy of $s$ , and which is closed by head contexts. □
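One linear head step can be sketched by walking the spine with a stack, recording which spine variable is matched with which spine argument, and replacing the head variable with a fresh copy of its argument when it is bound by a primary redex. The names `linear_head_step` and `fresh_copy`, the tuple encoding, and the scheme used to generate fresh names are assumptions of this sketch.

```python
import itertools

def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, a): return ("app", f, a)

_fresh = itertools.count()             # global supply of fresh name suffixes

def fresh_copy(t, ren=None):
    """Copy t, giving a new name to every bound variable of the copy."""
    ren = ren or {}
    if t[0] == "var":
        return var(ren.get(t[1], t[1]))
    if t[0] == "lam":
        new = f"{t[1]}_{next(_fresh)}"
        return lam(new, fresh_copy(t[2], {**ren, t[1]: new}))
    return app(fresh_copy(t[1], ren), fresh_copy(t[2], ren))

def linear_head_step(t):
    """One linear head step, or None if the head variable has no primary redex."""
    stack, env = [], {}                # pending arguments; matched variable -> argument

    def go(u):
        if u[0] == "app":
            stack.append(u[2])
            r = go(u[1])
            return None if r is None else app(r, u[2])
        if u[0] == "lam":
            if stack:                  # this abstraction forms a primary redex
                env[u[1]] = stack.pop()
            r = go(u[2])
            return None if r is None else lam(u[1], r)
        # u is the head variable: replace it only if it is bound by a primary redex
        return fresh_copy(env[u[1]]) if u[1] in env else None

    return go(t)
```

On $(\lambda x.x\,a)(\lambda z.z)$ the step yields $(\lambda x.(\lambda z^{\prime}.z^{\prime})\,a)(\lambda z.z)$ for some fresh name $z^{\prime}$: the redex is kept and only the head occurrence of $x$ is replaced.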

Linear head reduction is strongly related to head $\beta$ -reduction, as shown by the following statements.

Proposition 3

Let $t\multimap_{h}^{*}s$ . There is $t\to_{h}^{*}s^{\prime}$ s.t. $s\to_{h}^{*}s^{\prime}$ . □

Corollary 1

A term $t$ has a linear head normal form iff it has a head normal form. Moreover, let $s$ be the linear head normal form of $t$.

1. The head normal form of $s$ is obtained by $\beta$-reducing all the primary redexes in $s$.

2. The head normal form of $s$ is also the head normal form of $t$.

□

6 Strong normalisation

All the rewriting systems defined above are strong normalising on simply typed $\lambda$ -terms. The proof of strong normalisation is however not at all evident. In fact, since linear reduction does not erase the reducing redex—it just replaces the occurrence of a variable by a (larger) $\lambda$ -term—the size of the reducing term increases at each step. Accattoli [2], in its analysis of proof nets linear reduction, proved strong normalisation by applying reducibility candidates. Here, we show that, surprisingly, the proof of strong normalisation of linear reduction is simpler then one might have thought, as it can be easily obtained by a trivial tuning of the proof of strong normalisation originally proposed by Gandy for $\beta$ -reduction [5]. In Gandy’s proof, each type $\tau$ is interpreted as a well-founded ordered set $[\tau]$ . In particular, any functional type $\tau\to\sigma$ is mapped into a set of increasing functions from $[\tau]$ to $[\sigma]$ . A measure is then associated to every term by interpreting any $t:\tau$ as an element $[t]\in[\tau]$ . Strong normalisation is a consequence of the fact that any $\beta$ -reduction $t\to_{\beta}s$ sends $[t]$ to a lower element $[s]$ .

The original measure defined for the analysis of $\beta$-reduction does not directly work for the case of linear reduction, since such a measure does not change along linear reduction (i.e., $[t]=[s]$, when $t\multimap s$). Indeed, Gandy’s measure just counts the number of $\lambda$-abstractions erased along a $\beta$-reduction. However, by taking the successor of the usual interpretation of a variable occurrence, one obtains a new measure which counts the number of variable occurrences replaced by some $\lambda$-term. Such a new measure decreases along linear reduction, and allows us to prove at the same time the strong normalisation of all the rewriting systems described in the present paper.

In the following, we shall follow the presentation of Gandy’s proof given by Miquel [7]. Let us interpret the base type $o$ as the strict partial order $(\mathbb{N},<)$, and every functional type $\tau\to\sigma$ as the strict partial order of the increasing functions from the interpretation of $\tau$ to the interpretation of $\sigma$. Formally, for every type $\tau$, let us inductively define $([\tau],\prec_{\tau})$ by

[TABLE]

with $[o]=\mathbb{N}$ and ${\prec_{o}}={<}$. We then define the binary operation $+_{\tau}:[\tau]\times\mathbb{N}\to[\tau]$ as

[TABLE]

for $n,k\in\mathbb{N}$ and $f\in[\tau\to\sigma]$. It is readily seen that $v+_{\tau}0=v$, that $(v+_{\tau}k)+_{\tau}h=v+_{\tau}(k+h)$, and that $k<h$ implies $v+_{\tau}k\prec_{\tau}v+_{\tau}h$, for every $v\in[\tau]$ and $k,h\in\mathbb{N}$.
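
As a sketch of why these properties hold, assume that $+_{\tau}$ is defined pointwise, which is the usual choice in presentations of Gandy’s proof (an assumption here, not a quotation of the definition). Associativity at an arrow type then follows from the induction hypothesis at the result type $\sigma$:

```latex
\begin{align*}
% Assumed clauses for +_\tau (pointwise lifting of addition on \mathbb{N}).
n +_{o} k &= n + k, \\
f +_{\tau\to\sigma} k &= v \mapsto f(v) +_{\sigma} k, \\[4pt]
% Associativity at \tau\to\sigma, using the induction hypothesis at \sigma.
\big((f +_{\tau\to\sigma} k) +_{\tau\to\sigma} h\big)(v)
  &= \big(f(v) +_{\sigma} k\big) +_{\sigma} h \\
  &= f(v) +_{\sigma} (k+h) \\
  &= \big(f +_{\tau\to\sigma} (k+h)\big)(v).
\end{align*}
```

The other two properties follow by the same induction, reading $\prec_{\tau\to\sigma}$ pointwise.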

For every type $\tau$ , let us define $\tau_{*}\in[\tau]$ and $\tau^{*}:[\tau]\to\mathbb{N}$ by

[TABLE]

for $n\in\mathbb{N}$ and $f\in[\tau\to\sigma]$ . By induction, we can see that $\tau^{*}$ is increasing (that is, $\tau^{*}(v)<\tau^{*}(w)$ , for all $v,w\in[\tau]$ s.t. $v\prec_{\tau}w$ ).
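
For a concrete instance, assume the following clauses for $\tau_{*}$ and $\tau^{*}$ (one standard choice in presentations of Gandy’s proof, taken here as an assumption) and unfold them at the type $o\to o$:

```latex
\begin{align*}
% Assumed clauses.
o_{*} &= 0, &
(\tau\to\sigma)_{*} &= v \mapsto \sigma_{*} +_{\sigma} \tau^{*}(v), \\
o^{*}(n) &= n, &
(\tau\to\sigma)^{*}(f) &= \sigma^{*}\big(f(\tau_{*})\big), \\[4pt]
% Unfolding at o \to o.
(o\to o)_{*} &= n \mapsto 0 + n, &
(o\to o)^{*}(f) &= f(0).
\end{align*}
```

Under these assumptions, if $f\prec_{o\to o}g$ is read pointwise then $f(0)<g(0)$, which is exactly the increasing property of $(o\to o)^{*}$.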

A valuation is a function $\phi$ associating an element of $[\tau]$ to every variable $x:\tau$ . Given a valuation $\phi$ , a variable $x:\tau$ , and a value $v\in[\tau]$ , we shall denote by $\phi[x\mapsto v]$ a new valuation s.t. $\phi[x\mapsto v](x)=v$ , and $\phi[x\mapsto v](y)=\phi(y)$ , when $y\neq x$ .

Given a valuation $\phi$ , any typed $\lambda$ -term $t^{\tau}$ can be interpreted as an element $[t]_{\phi}\in[\tau]$ by application of the following inductive definition

[TABLE]

For every valuation $\phi$, we can also define the measure $\mu_{\phi}:\Lambda^{\to}\to\mathbb{N}$, by $\mu_{\phi}(t^{\tau})=\tau^{*}([t]_{\phi})$.

Remark 1

The only difference w.r.t. the usual interpretation used in the proof of strong normalisation of $\beta$ -reduction is the interpretation of variables. Indeed, one usually takes $[x:\tau]_{\phi}=\phi(x)$ (see [7]). With this choice, however, we would get $[t]=[s]$ when $t\multimap s$ . □
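
The effect of the modified variable clause can be checked on a small example. The following Python sketch is only illustrative: the names `plus`, `lower`, `upper`, `interp`, `measure` and the parameter `bump` are hypothetical, and the clauses for $+_{\tau}$, $\tau_{*}$, $\tau^{*}$ and for abstraction are assumed to be the standard ones of Gandy’s proof. Only the variable clause, $\phi(x)$ plus one, is taken from the text.

```python
# Types: "o" is the base type; ("->", a, b) is the arrow type a -> b.
O = "o"

def arrow(a, b):
    return ("->", a, b)

def plus(tau, v, k):
    """v +_tau k (assumed pointwise definition)."""
    if tau == O:
        return v + k
    _, _, sigma = tau
    return lambda w: plus(sigma, v(w), k)

def lower(tau):
    """tau_* (assumed): o_* = 0, (a -> b)_* = v |-> b_* +_b a^*(v)."""
    if tau == O:
        return 0
    _, a, b = tau
    return lambda v: plus(b, lower(b), upper(a, v))

def upper(tau, v):
    """tau^* (assumed): o^*(n) = n, (a -> b)^*(f) = b^*(f(a_*))."""
    if tau == O:
        return v
    _, a, b = tau
    return upper(b, v(lower(a)))

# Terms: ("var", x, tau), ("lam", x, tau, body), ("app", t, s).
def type_of(t):
    if t[0] == "var":
        return t[2]
    if t[0] == "lam":
        return arrow(t[2], type_of(t[3]))
    return type_of(t[1])[2]  # result type of the function part

def interp(t, phi, bump=1):
    """[t]_phi; bump=1 is the modified variable clause, bump=0 the usual one."""
    if t[0] == "var":
        return plus(t[2], phi[t[1]], bump)
    if t[0] == "lam":
        _, x, tau, body = t
        sigma = type_of(body)
        # Assumed abstraction clause: v |-> [body] +_sigma (tau^*(v) + 1).
        return lambda v: plus(sigma, interp(body, {**phi, x: v}, bump),
                              upper(tau, v) + 1)
    return interp(t[1], phi, bump)(interp(t[2], phi, bump))

def measure(t, phi, bump=1):
    """mu_phi(t) = tau^*([t]_phi) for t of type tau."""
    return upper(type_of(t), interp(t, phi, bump))

x, y = ("var", "x", O), ("var", "y", O)
before = ("app", ("lam", "x", O, x), y)  # (\x.x) y
after = ("app", ("lam", "x", O, y), y)   # (\x.y) y: x replaced by a copy of y
phi0 = {"y": lower(O)}                   # phi_0(y) = o_*

print(measure(before, phi0), measure(after, phi0))
print(measure(before, phi0, bump=0), measure(after, phi0, bump=0))
```

Under these assumed clauses the script prints `4 3` and then `1 1`: the modified measure strictly decreases along the linear step, while the usual variable clause leaves it unchanged.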

Lemma 3

For every valuation $\phi$, every $C[x^{\tau}]:\sigma$, and every $t:\tau$, we have that

1. $[C[t]]_{\phi}\prec_{\sigma}[C[x^{\tau}]]_{\phi[x\mapsto[t]_{\phi}]}$;

2. if $t\to_{a}s$, then $[s]_{\phi}\prec_{\sigma}[t]_{\phi}$ and $\mu_{\phi}(s)<\mu_{\phi}(t)$.

□

By the previous lemma, and the fact that there is at least one valuation (for instance, the valuation $\phi_{0}$ defined by $\phi_{0}(x^{\tau})=\tau_{*}$), we finally get the strong normalisation result.

Theorem 2

The rewriting systems $\to_{a}$ , $\multimap$ , $\to_{\beta_{d}}$ , $\to_{\beta}$ , $\to_{h}$ , and $\multimap_{h}$ are strongly normalising. □

7 Conclusions

In this paper we have analysed linear $\beta$-reduction in terms of a notion of evaluation context, and we have seen how a simple adaptation of the semantical proof of strong normalisation for the simply typed $\lambda$-calculus allows one to prove the same result for the linear case. The proof is surprisingly simple and its idea might be adapted to prove strong normalisation of other $\lambda$-calculi in which the $\beta$-rule is decomposed into more elementary steps, as for instance in the case of explicit substitution $\lambda$-calculi.

Bibliography

[1]
[2] Beniamino Accattoli (2013): Linear Logic and Strong Normalization. In Femke van Raamsdonk, editor: 24th International Conference on Rewriting Techniques and Applications (RTA 2013), LIPIcs 21, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, pp. 39–54, doi:10.4230/LIPIcs.RTA.2013.39.
[3] Beniamino Accattoli & Delia Kesner (2010): The Structural $\lambda$-Calculus. In Anuj Dawar & Helmut Veith, editors: Computer Science Logic, LNCS 6247, Springer Berlin Heidelberg, pp. 381–395, doi:10.1007/978-3-642-15205-4_30.
[4] Vincent Danos & Laurent Regnier (2004): Head Linear Reduction. Http://iml.univ-mrs.fr/ regnier/articles/pam.ps.gz.
[5] R. O. Gandy (1980): Proofs of strong normalisation. In J. P. Seldin & J. R. Hindley, editors: To H. B. Curry: Essays in Combinatory Logic, Lambda Calculus, and Formalism, Academic Press, pp. 457–477.
[6] Stefano Guerrini & Giulio Pellitta (2016): Dissecting the PAM. Submitted.
[7] Alexandre Miquel: A combinatorial proof of strong normalisation for the simply typed $\lambda$-calculus. Unpublished draft.
[8] Pierre-Marie Pédrot & Alexis Saurin (2016): Classical By-Need. In Peter Thiemann, editor: Programming Languages and Systems. 25th European Symposium on Programming, ESOP 2016, LNCS 9632, Springer, pp. 616–643, doi:10.1007/978-3-662-49498-1_24.

Linear $\beta$-reduction. Thanks: Partially supported by the Project ELICA (ref. ANR-14-CE25-0005), of