TL;DR
This paper introduces a graph-based, local reasoning approach to proving observational equivalence, including its fragility, using hypergraph-rewriting and a new robustness concept, applicable to complex language features.
Contribution
It presents a novel step-wise proof methodology leveraging local graph reasoning and hypergraph-rewriting to establish a generalized, quantitative notion of observational equivalence.
Findings
The approach effectively proves observational equivalence in lambda-calculus with state.
It introduces a formal robustness concept as a key condition for equivalence.
The methodology can handle syntactically restricted and quantitatively constrained contexts.
Abstract
We propose a new step-wise approach to proving observational equivalence, and in particular reasoning about fragility of observational equivalence. Our approach is based on what we call local reasoning. The local reasoning exploits the graphical concept of neighbourhood, and it extracts a new, formal, concept of robustness as a key sufficient condition of observational equivalence. Moreover, our proof methodology is capable of proving a generalised notion of observational equivalence. The generalised notion can be quantified over syntactically restricted contexts instead of all contexts, and also quantitatively constrained in terms of the number of reduction steps. The operational machinery we use is given by a hypergraph-rewriting abstract machine inspired by Girard's Geometry of Interaction. The behaviour of language features, including function abstraction and application, is…
| template | robustness | ||||
| (input-safety) | of | of | dependency | implication of | |
| — | |||||
| — | |||||
| — | |||||
| — | |||||
| — | |||||
| — | |||||
| — | |||||
| Weakening | |||||||||||
| Parametricity | , (W) | , (W) |
| dependency | implication of | used for | |
|---|---|---|---|
| local rule | labels of path | label of edge |
|---|---|---|
| contraction | box | |
| box | ||
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
A robust graph-based approach to observational equivalence
Dan R. Ghica
University of Birmingham, UK
,
Koko Muroya
RIMS, Kyoto University, Japan
and
Todd Waugh Ambridge
University of Birmingham, UK
Abstract.
We propose a new approach to defining programming languages with effects, and proving observational equivalence. Operational machinery is given by a hypergraph-rewriting abstract machine inspired by Girard’s Geometry of Interaction. It interprets a graph calculus of only three intrinsic constructs: variable binding, atom binding, and thunking. Everything else, including features which are commonly thought of as intrinsic, such as arithmetic or function abstraction and application, must be provided as extrinsic operations, with associated rewrite rules. The graph representation naturally leads to two new principles about equational reasoning, which we call locality and robustness. These concepts enable a novel flexible and powerful reasoning methodology about (type-free) languages with effects. The methodology is additionally capable of proving a generalised notion of observational equivalence that can be quantified over syntactically restricted contexts instead of all contexts, and also can be quantitatively constrained in terms of the number of reduction steps. We illustrate the methodology using the call-by-value lambda-calculus extended with (higher-order) state.
1. Introduction
1.1. Context and motivation
Program equivalence is an old central question in the study of programming languages. Establishing whether two program fragments are equivalent has immediate consequences regarding compiler optimisation and correctness, program verification and validation, and other key applications. The standard formal definition of the concept is that of observational equivalence [MJ69]. Two executable programs are observationally equivalent if they have the same input-output behaviour. On terms (program fragments) observational equivalence is the smallest congruence induced by using executable programs as contexts, a definition which makes direct reasoning difficult.
The mathematical challenge of observational equivalence is two-fold. On the one hand, quantifying over all contexts is unwieldy, which led to the development of indirect methods. As an extremal case, denotational semantics provides a model-theoretic route to proving program equivalence [SS71], while the interplay of syntactic (operational) and denotational techniques led to the development of hybrid methods such as Kripke logical relations [Sta85] or trace semantics [JR05]. On the other hand, program equivalences are notoriously fragile. Adding new semantic features to a language can break equivalences and, more significantly, can invalidate reasoning techniques wholesale. Increasing the expressiveness of a language goes hand in hand with increasing the discriminating power of contexts, and new techniques based on bisimulation have been devised to meet the challenge of increasingly realistic languages [KW06].
A methodological point that became increasingly clear is that the richness of features of modern programming languages demands a systematic study, ideally a classification, from the point of view of the consequences such features have on the equational properties of a language. The development of game semantics made it possible to give combinatorial, syntax-independent, orthogonal characterisations for entire classes of features such as state and control, e.g. the so-called “Abramsky cube” [Abr97], or to replace the syntactic notion of context by an abstracted adversary [GT12]. A classification [DNB12] and characterisation [DNRB10] of language features and their impact on reasoning has also been undertaken using logical relations.
1.2. Overview and contribution
In this paper we present a radically new approach to defining effectful programming languages and proving observational equivalence, which builds on operational machinery inspired by the Geometry of Interaction (GoI) [DR96, Mac95, HMH14, MHH16], modified so that the underlying net can be rewritten during execution [MG17]. The ability to rewrite its own net makes the machine efficient, in the sense of Accattoli et al. [ABM14], for common reduction strategies (namely, call-by-value and call-by-need). It also gives the machine the ability to model exotic effects, e.g. transforming stateful into pure computation by abstracting the state [MCG18]. However, our focus here will be on new reasoning principles and methods that arise out of this new perspective.
We revise and generalise the formulation of the rewriting machine from graphs to what we call hypernets — hierarchical hypergraphs, in which other hypergraphs can be used as edge labels, recursively. The hypernet-rewriting abstract machine is used to interpret Spartan, a graph calculus of only four fundamental constructs: variable binding, atom binding, thunking, and (extrinsic) operations. Execution is traversal of the hypernet representing the program, according to some prescribed strategy that stipulates when the rewrite rules for operations are to be triggered. We call this ‘focussed graph rewriting’.
The main contribution of the paper is conceptual, showing how graph-related concepts can be used to reason about observational equivalence in a new and advantageous way. The key new concepts are:
**Locality: **
When reasoning about a term, it is important to understand how the term interacts with its context, the rest of the program. Not necessarily all the program is relevant, because each interaction happens around a specific part of the program that is under evaluation, which may involve the term of interest. In other words, the interactions can be analysed locally by looking at neighbourhoods of where evaluation happens. The textual locality, i.e. whether parts of the program next to the term of interest are rewritten, is largely irrelevant. Instead, graph locality, or the graphical concept of neighbourhood, will be shown to be a powerful tool111The graphical locality is not to be confused with memory locality, a property which it can subsume.. The interactions between the term of interest and the context can be analysed by looking at the trajectory of graph traversal and its graph neighbourhoods, namely: rewrite rules that can be triggered along the trajectory, and parts of the graph that are reachable from, or that can reach, where the rewrite rules can be applied.
**Robustness: **
In reasoning about observational equivalence, it is crucial to establish that two terms to be equated interact with contexts in the same way. In light of the graph locality discussed above, it boils down to ensuring that the two terms (as sub-graphs) interfere with the traversal trajectory and also with rewrite rules triggered along the trajectory, in the same manner. Of particular importance is the latter kind of interference, because application of a rewrite rule may change part of the sub-graphs that we wish to equate and produce new sub-graphs that cannot be equated. Such unsafe interference can imply existence of a context that distinguishes the two terms of interest, and hence it can witness violation of the observational equivalence of interest.
This observation leads to a new concept, which we call robustness, that instead characterises safe interference. The characterisation concerns rewrite rules and a “template” of an observational equivalence, that is, a relation on terms (as sub-graphs) that we wish to become the observational equivalence. We say that the template is robust relative to a rewrite rule, if the rewrite rule interferes with each related pair of sub-graphs in such a way that the rewrite rule produces sub-graphs that are either identical, or can be again related by the template. Intuitively, our concept of robustness characterises a situation where a rewrite rule preserves, or simply eliminates, the difference between sub-graphs of interest.
The Spartan graph calculus can be augmented with arbitrary operations, including extravagant ones. Some of the operations are local, in the sense that they come with rewrite rules that involve a smaller graph neighbourhood: e.g. arithmetic operations will be seen to only involve their immediate graph neighbours. Some operations, however, can be less local and come with rewrite rules that involve a larger graph neighbourhood: e.g. assignment operations for state involve their immediate graph neighbours and also their neighbours. The more local an operation is, the more likely a template of an observational equivalence is robust relative to the operation’s rewrite rule.
Our most important technical result is the partial characterisation theorem (Thm. 6.20) that identifies robustness as a key sufficient condition of observational equivalence. To be an observational equivalence in a given language, a template needs to be robust relative to all operations (more precisely, their rewrite rules) that are allowed in the language.
Using this theorem, for the first time, we can begin to address one of the most pernicious problems of observational equivalence, its fragility. All the effort expended on proving a particular equivalence, in a particular language, is at serious risk to be wasted if the language were to be extended with a new semantic feature. How do we know what equivalences still hold and which are invalidated without re-proving them all over again? And what do we do if the new features invalidate not just the equivalences, but the proof techniques used in the proving of the equivalences? We expect that the partial characterisation theorem can come to the rescue.
Let us illustrate this fragility with a strikingly simple class of equivalences: arithmetic. It is usually the case in a programming language that , i.e. wherever we write in a program we could just write . This justifies basic compiler optimisation such as constant folding. A serious problem here is that, perhaps surprisingly, one can extend a language so that this equivalence fails. Operations that examine the syntactic structure of the program at run-time, such as reflection, invalidate such equivalences. Also operations that collect profiling information about the size of the program may also invalidate such equivalences. On the other hand, adding state, jumps, recursion, higher-order functions, etc. will not invalidate arithmetically induced equivalences.
While conventional operational reasoning would struggle to establish both the conservativity and the non-conservativity of equivalences, the partial characterisation theorem gives some help. Reflection and profiling require highly non-local operations which operate on very large graph neighbourhoods. Their lack of locality increases a chance that robustness fails for these operations, so very few equivalences are likely to hold in the presence of them. Note that, although failure of robustness does not necessarily mean violation of equivalences, it can suggest potential counterexamples of equivalences. On the other hand, operations involving state, jumps, and so on can be seen to be largely local. The interactions between them and arithmetic, which is very local, are hence limited, and it is often easy to find a common pattern of the limited interactions. This will lead to the conclusion that arithmetic equivalences are robust relative to these operations, with the robustness proofs sharing a common structure.
Additionally, we propose a generalised notion of observational equivalence that has two parameters: a class of contexts and a preorder on natural numbers. The first parameter enables us to quantify over syntactically restricted contexts, instead of all contexts as in the standard notion. This can be used to identify a shape of contexts that respects or violates certain observational equivalences, given that not necessarily all arbitrarily generated contexts arise in program execution. The second parameter, a preorder on natural numbers, deals with numbers of steps it takes for program execution to terminate. Taking the universal relation recovers the standard notion of observational equivalence. Another instance is observational equivalence with respect to the greater-than-equal relation on natural numbers, which resembles the notion of improvement [ADV20] that is used to establish equivalence and also to compare efficiency of abstract machines. This instance of observational equivalence is useful to establish that two programs have the same observable execution result, and also that one program terminates with fewer steps than the other.
1.3. Organisation of the paper
Sec. 2 gives a gentle introduction to our graph-based approach to modelling program execution, and reasoning about observational equivalence with the key concepts of locality and robustness. Our graph-based approach to operational semantics is then formalised in the subsequent sections. Sec. 3 defines the graphs we use, dubbed hypernets. Sec. 4 presents operational semantics based on strategic rewriting of hypernets, and Sec. 5 provides technical details of the semantics.
Sec. 6 presents our main technical contributions. It first defines our generalised notion of contextual equivalence, which is capable of focusing on syntactically restricted contexts, instead of always dealing with arbitrary contexts, and also capable of comparing the number of steps of program execution. The section then formalises the concept of robustness, and presents our main technical result which is the partial characterisation theorem (Thm. 6.20). Sec. 7 shows a proof of the theorem, which is centred around a variant of weak simulation combined with the so-called up-to technique.
The next few sections demonstrate how our graph-based approach works in proving observational equivalence. Sec. 8 shows formalisation of observational equivalence for an extended call-by-value lambda-calculus, by means of the contextual equivalence in the strategic rewriting of hypernets. Sec. 9 demonstrates how one can use the partial characterisation theorem to prove an observational equivalence, using a Parametricity law for the call-by-value lambda-calculus extended with state, as an example. Sec. 10 presents technical details of the use of the theorem, and in particular a common principle of checking robustness.
Finally, Sec. 11 discusses related and future work, concluding the paper. Some details of proofs are presented in Appendices.
2. A gentle introduction
2.1. From terms to graphs
Compilers and interpreters deal with programs mainly in the form of an abstract syntax tree (AST) rather than text. It is broadly accepted that such a data structure is easier to manipulate algorithmically. Somewhat curiously perhaps, operational semantics, essentially a list of rules for program manipulation, is expressed using text rather than the tree form. As we shall see, expressing syntax as a data structure rather than text, and operational semantics as algorithmic manipulations of that data structure will turn out to be useful and intuitive.
The first observation is that ASTs are not good enough, as they pack in too much syntactic detail. Consider for example the following (contrived) term in the lambda calculus, \lambda y.\bigl{(}\lambda x.(\lambda y.x)(\lambda x.x)\bigr{)}(\lambda x.y). Its AST (see Fig. 1, left hand side) records variables (, ) and their binders () as vertices in the graph. This representation is inconventient on two accounts. First, terms should be identified up to systematic renaming of bound variables (alpha equivalence), which is not the case with an AST. Second, various operations on terms (primarily substitution) need to be aware of the scope of variables, which is not immediately obvious in the AST representation. It takes a certain amount of work to match each variable occurrence to its binder, so in the diagram we use colour to indicate just that.
We propose an alternative representation which solves these problems in one fell swoop, by treating variables as edges in the graph, rather than nodes. This keeps them anonymous. Binding structure and scope are made explicit by using boxes around sub-graphs. We call this representation an abstract syntax graph (ASG) or a hypernet, shown in Fig. 2 (left hand side). ASGs may seem more difficult to understand due to their unfamiliar nature, but upon closer inspection their advantages, particularly in how they treat variables, binders, and scope, will hopefully become apparent.
The advantage of the ASG over the AST is best illustrated by examining how an equation is implemented in both representations — namely, the beta law. The AST is not much better than the textual representation for this purpose. We need to search through the graph to find variable occurrences and replace them with the argument, while systematically renaming bound variables (Fig. 1, with renamed to in the scope of the affected binder). In contrast, the application of the beta law to hypernets (Fig. 2) is a strictly local affair, which requires no inspection or search once the redex has been identified. The rewrites involved, namely the removal of the application node, the opening of the box, and the reattachment of the bound variable to the argument can all be considered elementary operations, efficient (in principle), and a suitable basis for execution. But more on this in the next section.
2.2. An abstract machine
The main difference between a ‘law’ (an equation) and a ‘reduction’ is that the former can be applied in any context, whereas the other must be applied in contexts determined by a strategy. Different reduction strategies, for instance, make different programming languages out of the same calculus.
In the previous section we saw an example, the beta law, and how hypernets as a data structure simplify it. The question to be addressed here is how to define strategies for determining redexes in hypernets. In conventional (small-step) operational semantics or in abstract machine semantics (see [Pit00] for a comprehensive introduction) the strategy is implicit in the logical structure of the deduction rules for the former, and explicit in the manipulation of contexts for the latter. In this sense, operational semantics with hypernets resemble more an abstract machine, in that the determination of a redex is governed by explicit ‘search’ rules. These rules are nothing more than a depth-first-search traversal of the graph along with special rules for copying shared nodes and triggering rewrites. In this paper we will be concerned with post-order reduction of operations, i.e. rewrite rules triggered after all the operands have been evaluated.
To illustrate the strategic reduction based on graph traversal, let us consider the first few steps of the call-by-value execution of the term , as seen in Fig. 3. The thicker green line is not part of the graph but it shows the flow of control, i.e. the trajectory of the token triggering the reductions.
- (1)
The depth-first traversal witnesses that the two lambda-forms are values, and hence the outermost application node is ready to be beta-reduced. The reduction follows. First the application node, the matching lambda node on the left, and the box boundary associated with the lambda node are removed. Then the argument is connected to the bound variable, thus yielding the second graph. 2. (2)
Since the variable is used twice, the two occurrences are linked by the () contraction node. Following any rewrite the traversal starts again from the current point in the graph. When the contraction node is reached it creates a copy of its argument, resulting in the third graph. 3. (3)
The traversal and reduction continues on the third graph, in the same way as the first graph. A new outermost application node can be reduced now, which yields the forth graph. 4. (4)
It is apparent that this particular reducing sequence diverges since we cycled back to a graph isomorphic to (2).
2.3. Local reasoning
Combination of graph traversal representing the flow of control, and graph transformations representing actual computation, yields a graph-based abstract machine. Notably, graph transformations are always performed along the flow of control, and thus controlled by the selected edge (“focus”) that realises the flow. Due to this control, the process of program execution can be described locally in terms of the focus and its neighbourhood.
This locality around the focus turns out to be particularly useful to directly inspect interaction between contexts and program fragments during execution, in terms of the focus. Given a context and a program fragment , their possible interactions during execution of the composite program can be analysed in a case-by-case manner, by local inspection around the focus. At the beginning of the execution, the focus enters the hypernet that represents , which starts the flow of control. The hypernet can be split into two parts, one corresponding to the context and the other corresponding to the fragment . As the focus navigates through the hypernet , its position will therefore be either inside , inside , or on the border between the two sub-graphs. At some point, a graph transformation may be triggered by and around the focus. By inspecting the position of the focus and triggered graph transformations, possible interactions between and can be classified into the following three, as illustrated in Fig. 4.
Case A: move inside the context. The focus (‘’) moves within the sub-graph , as indicated by magenta in Fig. 4a. The sub-graph is not involved in, and hence is irrelevant to, the move. This means that there is no actual interaction between and .
Case B: visit to the fragment. The focus enters the sub-graph . It will navigate through the sub-graph , as indicated by magenta in Fig. 4b, and may trigger some graph transformation. The transformation possibly involves a part of the sub-graph , in which case interacts with .
Case C: graph transformation. The focus is in the sub-graph and triggers a graph transformation. The transformation may involve a part of the sub-graph , as indicated by magenta in Fig. 4c, in which case interacts with .
The above case-by-case analysis, based on locality, paves the way for proving observational equivalence. Namely, it leads to a direct, case-by-case, reasoning principle to establish that two program fragments and interact with any context in the same manner. Sufficient conditions of observational equivalence can be identified by examining interactions between and , and those between and at the same time, according to the three cases described above. One ideal situation that yields a suficient condition is when and have the same behavoiur on their own, and have no meaningful interaction with the common context . Fig. 5 illustrates example scenarios of such situation, one for each case in Fig. 4, which are explained below.
Any scenario in Case A: move inside the common context. The focus moves within the common context , as indicated by magenta in Fig. 5a. This move is regardless of the fragments and (denoted by in the figure). The context indeed has no interaction with and .
An ideal scenario in Case B: visit to the fragments. Once the focus visits the fragment , it triggers some graph transformations that change the fragment actually to the other fragment , as illustrated in Fig. 5b. In this scenario, visiting yields the same result as visiting , without affecting the common contexts . The fragments and hence do not interact with . This typically happens when the fragments and are taken from reduction, e.g. and .
An ideal scenario in Case C: graph transformation. The focus in the common context triggers a graph transformation that only affects a part of the context , resulting in another context as in Fig. 5c. The transformation hence preserves the fragments and (denoted by in the figure), which means that does not intaract with or .
The last scenario is of particular importance, giving rise to a new concept of robustness. It characterises safe involvement of program fragments in graph transformations, namely where the fragments are respected in the same manner by the transformations. Measuring robustness of fragments reveals when, namely with which graph transformations allowed, the fragments can be observationally equivalent. Robustness provides a key sufficient condition of observational equivalence.
Robustness of program fragments are relative to graph transformations, but it can be seen as being relative to language features as well, because behaviour of language features are modelled by graph transformations. Therefore, by measuring robustness of program fragments, one can examine which language features can enable the fragments to be observationally equivalent. In other words, robustness provides a way to analyse which observational equivalences are respected by which language features.
3. Preliminaries: hypernets
We now formalise the ideas introduced in the previous section. The Spartan graph calculus will be formulated in terms of the following mathematical structures.
3.1. Monoidal hypergraphs and hypernets
Given a set we write by the set of elements of the free monoid over . Given a function we write for the pointwise application (map) of to the elements of .
Definition 3.1**.**
A monoidal hypergraph is a pair of finite sets, vertices and (hyper)edges along with a pair of functions , defining the source list and target list, respectively, of an edge.
Definition 3.2**.**
A labelled monoidal hypergraph consists of a monoidal hypergraph, a set of vertex labels , a set of edge labels , and labelling functions such that:
- •
For any edge , its source list consists of distinct vertices, and its target list also consists of distinct vertices.
- •
For any vertex there exists at most one edge such that and at most one edge such that .
- •
For any edges if then f_{\mathsf{L}}^{*}\bigl{(}S(e_{1})\bigr{)}=f_{\mathsf{L}}^{*}\bigl{(}S(e_{2})\bigr{)}, and f_{\mathsf{L}}^{*}\bigl{(}T(e_{1})\bigr{)}=f_{\mathsf{L}}^{*}\bigl{(}T(e_{2})\bigr{)}.
In words, the label of an edge is always consistent with the number and labelling of its endpoints. This makes it possible to use the vertex labels as types for edge labels. Given we can write for such that x=f_{\mathsf{L}}^{*}\bigl{(}S(e)\bigr{)} and x^{\prime}=f_{\mathsf{L}}^{*}\bigl{(}T(e)\bigr{)} for any such that .
If a vertex belongs to the target (resp. source) list of no edge we call it an input (resp. output).
Definition 3.3**.**
A labelled monoidal hypergraph is interfaced if inputs and outputs are respectively ordered, and no vertex is both an input and an output.
We can type a hypergraph where are the lists of inputs and outputs, respectively. In the syntax for lists of inputs and outputs we use to denote concatenation and define to be the empty list, and for any label . In the sequel, when we say hypergraphs we always mean interfaced labelled monoidal hypergraphs.
Definition 3.4** (Interface permutation).**
The permutation of the interface of a hypergraph yields another hypergraph. Let be a hypergraph with an input list and an output list . Given two bijections and on sets and , respectively, we write to denote the hypergraph that is defined by the same data as except for the input list and the output list .
Definition 3.5** (Hypernets).**
Given a set of vertex labels and edge labels we write for the set of hypergraphs with these labels; we also call these level-0 hypernets . We call level- hypernets the set of hypergraphs
[TABLE]
We call (labelled monoidal) hypernets the set .
Informally, hypernets are nested hypergraphs, up to some finite depth, using the same sets of labels. An edge labelled with a hypergraph is called “box” edge, and a hypergraph labelling a box edge is called “content”. Edges of a hypernet are said to be “shallow”. Edges of nesting hypernets of , i.e. edges of hypernets that recursively appear as edge labels, are said to be “deep” edges of . Shallow edges and deep edges of a hypernet are altogether referred to as edges “at any depth”.
3.2. Graphical conventions
A monoidal hypergraph with vertices and edges such that
[TABLE]
is normally represented as
.
However, this style of representing hypergraphs is awkward for understanding their structure. We will often graphically represent hypergraphs as graphs, with both vertices and edges drawn as nodes marked by their label and connecting input vertices with edges, and edges with output vertices using arrows. To distinguish them, the edge labels are circled.
Since the node labels are often determined in our typed graphs by the edges we can omit them to avoid clutter, showing only the edges and the way they link. The graph above would be drawn like this:
.
Sometimes we will draw a hypergraph so that to identify sub-graphs of interest. In this case we may draw interface nodes twice and connect them with arrows, with intention that the nodes at either ends are two graphical representations of the same node. If it is obvious from context we may omit node labels and just draw the arrows between the sub-graphs, or just one line if the entire input interface of a sub-graph is identified with the entire output interface of another sub-graph. For example, below we consider as the sub-graphs consisting just of edge and , respectively:
The final conventions are that a bunch of parallel arrows can be drawn as a single arrow with a dash across, and that a hypergraph-labelled edge is indicated with a dotted box, and decorated with its type.
4. The Spartan hypernet calculus
Spartan hypernets have vertex labels indicating either regular arguments (), thunks () or references (), ranged over by . Edge labels have unique types, and they are as follows:
- •
operations with regular arguments and thunked arguments,
- •
instances , linking terms to references,
- •
atoms , linking references to terms,
- •
a family of contractions \{{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\ell}_{\mathsf{W}}:\epsilon\Rightarrow\ell},\ {\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\ell}_{\mathsf{C}}:\ell^{\otimes 2}\Rightarrow\ell}\mid\ell\in\{\star,\diamond\}\},
- •
a family of tokens , indicating flow of control.
When a hypergraph is used as an edge label it must always have type ; the box edge is assigned type for some .
The hypernet is said to be focussed if there exist only one token-labelled edge and it is shallow, plus some other basic well-formedness conditions discussed later.
To graphically represent multiple occurrences of a single variable or a single atom, we will define and use a family of forests of contractions with which we call distributors (Def. 5.7). Intuitively, they are the -interleaving of bunches of -contraction edges, For instance, D^{*}_{2,3}=\vbox{\hbox{\includegraphics{d23a}}} and D^{\diamond}_{3,0}=\vbox{\hbox{\includegraphics{d30}}}.
4.1. Focussed rewriting
In this section we define an abstract machine for rewriting Spartan hypernets, thus giving a notion of evaluation for the calculus. The machine is based on the Dynamic Geometry of Interaction Machine [MG17], which it generalises for the modelling of effects. In this section we give a brief introduction to the machine, with the full technical details presented later, in Sec. 5.
In a conventional reduction semantics a subtle and often complex aspect of the reduction bureaucracy is the identification and isolation of a sub-term which is a redex from the context [FH92]. Similarly, in an abstract machine much of the work consists of moving information (representing the context) on and off the stack in order to reach redexes [WF94]. In both cases the formal machinery identifies an implicit focus of action which either moves around the term or is acted upon via a substitution. In the case of focussed graph rewriting the focus is represented by the token edge. The redex search is defined by the local interaction between the token and the neighbouring edges, which can cause the token to navigate the graph. The same local interactions can determine a rewrite action.
The state of the token edge determines the possible actions. If the token is then the applicable rules (interaction rules) are search rules, which do not change the underlying graph; the token always travels in the direction of the edges and it is searching for a redex. In comparison to a conventional reduction semantics this is a narrowing of the term-in-context. If the token is then the applicable rules are also search (interaction) rules but the token travels in the opposite direction of the edges. In comparison to a reduction semantics this is a widening of the term-in-context normally following the detection of a value. Finally, if the token is then a rewrite is about to be performed at the very next step. There are two kinds of rewrites, intrinsic to the Spartan calculus and extrinsic, defined by the operations. The intrinsic rewrites are only copying of subgraphs which are triggered when a token interacts with contraction .
Interaction rules are in Fig. 6. These rules capture the intuition behind redex search discussed earlier:
Rule 1. When encountering a contraction () the search token () becomes a rewrite token () as a copying action will follow. This rule has two sub-rules depending on whether the focus is on the left or right branch.
Rule 2. When encountering an instance edge () the search token () changes to a value token () as atoms block both evaluation and copying.
Rule 3. When encountering an operation () with at least one eager argument the search token () will inspect the first argument.
Rule 4. After resolving eager argument of an operation () the value token () changes to a search token () which proceeds to inspect the next eager argument, if it exists.
Rule 5a. After all eager arguments of an operation (), the token will change to a -token depending on the kind of operation is, value or rewrite. (Rule 5b. is a degenerate version for operations with no eager arguments).
The interaction rules essentially specify the control flow through the hypernet, and as given correspond to a left-to-right call-by-value evaluation. The other general set of rules concerns copying and ar given in Fig. 7. Whenever the focus reaches a contraction tree leading up to a value-like hypernet (a ‘copyable hypernet’, Def. 5.9), a copy of will be created, the current entry into the contraction tree replaced by a nullary contraction (which still keeps the contraction tree in legal form), and the upstream dependences of connected via appropriate distributors.
4.2. Example operations
The five interaction rules of the preceding sections along with the copying rules induced by contraction nodes are the general structural framework for the behaviour of programs expressed as hypernets. The final required ingredient are the rewrite rules specifying the behaviour of operations. We give here a selection of such operations. This is to illustrate the key fact that, unlike conventional term-based operational semantics, it suffices to specify just the rewrite rules that are associated with operations and triggered by a token, without changing the structural framework. We only give the rules in their graphical form, as the sets-and-relations form used in the formal definitions can be unambiguously recovered.
4.2.1. Arithmetic.
Arithmetic presuposes the existence of passive operations representing data (natural numbers, integers, floats, etc) of type and -ary operations . The rule for an arithmetic operation connected to two (passive) values replaces it with the value of the operation . This is illustrated in Fig. 8a. Note that since is itself a passive operation, the next step of the machine will be to change the token from to .
4.2.2. Abstraction and application.
The lambda calculus requires a unary passive operation which has a thunk as its unique argument and a binary active operation, function application, . The (left-to-right, call-by-value) rewrite rule when the token reaches the application hyperedge is given in Fig. 8b. The rule corresponds closely to what we would write in a conventional operational semantics for a lambda calculus with explicit substitutions as : the abstraction-application pair is removed, the body of the function is exposed, and the bound variable bound to the argument. Note that the next step of the machine will be to advance to token into the body of the function, the hypernet .
4.2.3. Recursion.
Recursion requires the addition of an active unary operation which replicates its argument (i.e. a thunk) as seen in Fig. 8c. The upstream dependencies of of are shared via distributor sub-graphs. Note that from the point of view of expressiveness recursion is not needed as the untyped lambda calculus can encode it. However, a recursion primitive has its own value. It can be used as part of the interpretation of a typed language and it requires a single execution step as opposed to, for example, the Y combinator which requires 4 steps of execution for each single step of the ‘native’ recursion operator.
4.2.4. Store.
We further add to the language state via reference creation, assignment and dereferencing. There are no restrictions on what the store can hold, so this corresponds to a model of higher-order untyped state. Store is interesting because it makes use of the atom node. A contraction node always triggers a copying of the hypernet it points to, so it cannot be used to represent the kind of sharing that happens in references. The role of the atom is to block copying and enforce permanent sharing.
We give the following rewrite rules:
- •
Reference creation via active operation in Fig. 9a: The operation is eliminated and replaced by an atom node connected to the execution context by an instance adaptor and a distributor. The interpretation of the rule is that the hypernet is now ‘stored’ at the newly created atom. Computationally, the distributor does not contribute anything, but it will make proofs of observational equivalence easier, as we shall see in the sequel.
- •
Equality testing via active operation in Fig. 9b: The two arguments of the equality testing are instance nodes, connected via (a) tree(s) of contractions to either distinct or the same atoms. In the former case the test becomes (false), and in the latter (true). In both cases the adaptor nodes are disconnected via a nullary contraction node, becoming unreachable in the graph (‘garbage’).
- •
Assignment via active operation in Fig. 9c: The left-hand-side of the assignment reaches an atom via an adaptor node (required by typing rules) and a tree of contractions. Whatever that atom node was linked to is changed via the assignment to , the hypernet on its right-hand-side. The role of the atom node is essential. Without it, whatever data was shared by the first and second argument of the assignment operator would be copied as the focus moves into the tree of contractions. The atom node blocks that, preserving sharing.
- •
Dereferencing via active operation in Fig. 9d: The argument reaches an atom, also via an adaptor and a tree of contractions. Our definition of dereferencing is ‘efficient’ in the sense that no fetching or copying of data from the atom is performed, but a distributor is inserted between the atom, the execution context, and the value stored in the atom. The actual copying will happen in the subsequent steps as the token advances up the distributor.
4.2.5. Control.
The Spartan framework is particularly handy in modelling control operators, also without modifying the general framework. We give labelled break and continue statement in the style of C as an example. The cont(inue) operation takes as arguments an atom, used as the label of the block, and a thunk with a bound variable. When executed, the cont operation uses two copies of the thunk, one of which is forced, with the atom linked to its bound variable. Without a break statement, discussed just below, the thunk of a cont will be executed in an infinite look. The break statement takes aas argument an atom, which is the label of the block. The rewrite rule finds the cont statement which points to the atom and the seq statement pointing to it and ‘jumps’ directly into that scope, discarding both the current context and the unforced thunk. The complex navigation required for the latter rule makes the graph representation of the term particularly convenient, eliding the need for stack frames or other such workarounds.
4.2.6. Stat.
We finally introduce a kind of operation that is usually ignored by programming language semantics because it is both difficult to formalise and reason about. We call this operation stat, and it provides some kind of ‘statistical’ information about the hypernet state – for example the number of nodes it has. A similar function, also called can be found in the OCaml garbage-collection (Gc) module, giving the size of the program at that point (“profiling” information). The rewrite rule simply replaces the node stat with a constant value , which depends on the global size of the hypernet. Such operations have terrible equational properties, but they provide a stress-test for the semantic framework, so we can and will consider them.
Finally, it can be easily checked by inspection that all rewrite rules given in this section preserve the well-formedness of a hypernet.
An interpreter and visualiser of Spartan execution can be accessed online222https://tnttodda.github.io/Spartan-Visualiser/.
Remark 4.1** (Term syntax for Spartan hypernets).**
The reasoning techniques introduced in this paper are essential graph-based, relying crucially on graph-specific concepts such as reachability. However, the structure of hypernets makes it possible to always translate them into a term calculus. In relating graphs and terms, string diagrams play the role of a stepping stone, completing a useful triad of syntactic representations [AGSZ21]. Sec. 4.2 (loc. cit.) gives soundness and completeness results for the correspondence between string diagrams and graphs. Since string diagrams are the internal language of a syntactic category, the further connection between them and terms is standard. ∎
5. Focussed graph rewriting – further technical details
We now give a formal definition of the abstract machine introduced informally in Sec. 4.1. We call it a universal abstract machine because it can be seen as a universal algebra (the operations) combined with a mechanism for sharing or copying resources and scheduling evaluation. The machine, and hence the definitions below, are all globally parametrised by the operation set and its behaviour .
5.1. Auxiliary definitions
We use the terms incoming and outgoing to characterise the incidence relation between neighbouring edges. Conventionally incidence is defined relative to nodes, but we find it helpful to extend this notion to edges.
Definition 5.1** (Incoming and outgoing edges).**
An incoming edge of an edge has a target that is a source of the edge . An outgoing edge of the edge has a source that is a target of the edge .
Definition 5.2** (Path).**
A path in a hypergraph is given by a non-empty sequence of edges, where an edge is followed by an edge if the edge is an incoming edge of the edge .
Note that, in general, the first edge (resp. the last edge) of a path may have no source (resp. target). A path is said to be from a vertex , if is a source of the first edge of the path. Similarly, a path is said to be to a vertex , if is a target of the last edge of the path. A hypergraph is itself said to be a path, if all edges of comprise a path from an input (if any) and an output (if any) and every vertice is an endpoint of an edge.
Definition 5.3** (Reachability).**
A vertex is reachable from a vertex if holds, or there exists a path from the vertex to the vertex .
To represent Spartan terms, we fix the vertex label set and the edge label set as described in Sec. 4, using the given operation set .
Definition 5.4** (Spartan hypernets).**
Spartan hypernets for an operation set are hypernets whose the label sets and are defined as follows.
[TABLE]
We will simply refer to Spartan hypernets as hypernets.
Definition 5.5** (Operation path).**
A path whose edges are all labelled with operations is called operation path.
Definition 5.6** (Contraction tree).**
For each , a contraction tree is a hypernet (C:\ell^{\otimes k}\Rightarrow\ell)\in\mathcal{H}(\{\ell\},\{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\ell}_{\mathsf{W}},\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\ell}_{\mathsf{C}}\}), such that the unique output is reachable from each vertex.
It can be observed that, for any contraction tree, an input (if any) is not an output but a source of a contraction edge.
Definition 5.7** (Distributor).**
We define a family , with , of hypernets which we call distributors, inductively as follows:
[TABLE]
where denotes the empty hypernet, is the identity map, and is a bijection such that, for each and , and .
When , a distributor is a contraction tree that includes one weakening edge. It is indeed a normal form with respect to the laws on contraction trees that we introduce later. For instance, D^{*}_{2,3}=\vbox{\hbox{\includegraphics{d23a}}} and D^{\diamond}_{3,0}=\vbox{\hbox{\includegraphics{d30}}}.
Definition 5.8** (Box/stable hypernets).**
If a hypernet is a path of only one box edge, it is called box hypernet. A stable hypernet is a hypernet , such that and each vertex is reachable from the unique input.
Definition 5.9** (Copyable hypernets).**
A hypernet is called copyable if it is
or
, where , and each is a box hypernet.
Definition 5.10** (One-way hypernets).**
A hypernet is one-way if, for any pair of an input and an output of such that and both have type , any path from to is not an operation path.
Remark 5.11** (Distributors).**
To the reader familiar with diagrammatic languages based on monoidal categories equipped with “sharing” (co)monoid operators, such as the ZX-calculus of Coecke and Duncan [CD11], the distributor nets may seem an awkward alternative to quotienting the nets by the equational properties of the (co)monoid operator. Indeed a formulation of Spartan semantics in which distributor nets are collapsed into -ary contractions would be quite accessible.
However, the structural laws of Spartan including certain desirable equational properties of contraction, can be invalidated by certain ill-behaved but definable operations. Forcing these properties into the framework does not seem to be practically possible, as it leads to intractable interactions between such complex -ary contractions and operations in the context as required by the key notion of robustness which will be introduced in Sec. 6.3.
We will later introduce the equational properties of contraction that are validated by “well behaved” subsets of operations, in Ex. 6.10. These equational properties do not make contractions and weakenings form a (co)monoid, but they enable us to identify contraction trees so long as the trees contain at least one weakening. If we see the equations on contraction trees as rewrite rules from left to right, distributors are indeed normal forms with respect to these rules. ∎
5.2. Focussed hypernets
Definition 5.12**.**
A token edge in a hypergraph is said to be exposed if its source is an input and its target is an output, and self-acyclic if its source and its target are different vertices.
Definition 5.13** (Focussed hypernets).**
Focussed hypernets (typically ranged over by ) are those that contain only one token edge and the edge is shallow, self-acyclic and not exposed.
Focus-free hypernets are given by , i.e. hypernets without token edges. A focussed hypernet can be turned into an underlying focus-free hypernet with the same type, by removing its unique token edge and identifying the source and the target of the edge. When a focussed hypernet has a -token, then changing the token label to another one yields a focussed hypernet denoted by . The source (resp. target) of a token is called “token source” (resp. “token target”) in short.
Given a focus-free hypernet , a focussed hypernet with the same type can be yielded by connecting a -token to the -th input of if the input has type . Similarly, a focussed hypernet with the same type can be yielded by connecting a -token to the -th output of if the output has type . If it is not ambiguous, we omit the index in the notation .
5.3. Contexts
The set of holed hypernets (typically ranged over by ) is given by , where the edge label set is extended by a set of hole labels. Hole labels are typed, and typically ranged over by .
Definition 5.14** (Contexts).**
A holed hypernet is said to be context if each hole label appears at most once (at any depth) in .
Definition 5.15**.**
A simple context is a context that contains a single hole, which is shallow.
When gives a list of all and only hole labels that appear in a context , the context can be also written as ; a hypernet in can be seen as a “context without a hole”, .
Let and be contexts, such that the hole and the latter context have the same type and . A new context can be obtained by plugging into : namely, by replacing the (possibly deep) hole edge of that has label with the context , and by identifying each input (resp. output) of with its corresponding source (resp. target) of the hole edge (Def. B.1). Each edge of the new context is inherited from either or , keeping the type; this implies that the new context is indeed a context with hole labels . Inputs and outputs of the new context coincide with those of the original context , and hence these two contexts have the same type. The plugging is associative in two senses: plugging two contexts into two holes of a context yields the same result regardless of the order, i.e. is well-defined; and nested plugging yields the same result regardless of the order, i.e. .
The notions of focussed and focus-free hypernets can be naturally extended to contexts. In a focussed context , the token is said to be entering if it is an incoming edge of a hole, and exiting if it is an outgoing edge of a hole. The token may be both entering and exiting.
5.4. States and transitions
Given the two parameters and , the universal abstract machine is defined as a state transition system. It is namely given by data as follows, each of which we will describe in the sequel.
- •
is a set of states,
- •
is a set of intrinsic transitions, and
- •
is a set of extrinsic transitions.
A focussed hypernet of type in is said to be a state. A state is called initial if , and final if . A state is said to be stuck if it is not final and cannot be followed by any transition. An execution on a focus-free hypernet is a sequence of transitions starting from an initial state . The following will be apparent once transitions are defined: initial states are indeed initial in the sense that no search transition results in an initial state; and final states are indeed final in the sense that no transition is possible from a final state.
The interaction rules in Fig. 6 specify the first class of intrinsic transitions, search transitions, and the contraction rules in Fig. 7 specify the second class of intrinsic transitions, copy transitions. These intrinsic transitions are defined as follows: for each interaction rule (or resp. contraction rule {\dot{G}}\overset{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}{\mapsto}{\dot{G^{\prime}}}), if there exists a focus-free simple context such that and are states, is a search transition (or resp. copy transition).
Search transitions are deterministic, because at most one interaction rule can be applied at any state. Although two different contraction rules may be possible at a state, copy transitions are still deterministic. Namely, if two different contraction rules and can be applied to the same state, i.e. there exist focus-free simple contexts and such that , then these two rules yield the same transition, by satisfying . Informally, in Fig. 7, is determined uniquely and the choice of does not affect the result.
Intrinsic transitions are therefore all deterministic, and moreover, search transitions are reversible because the inverse of the interaction rules is again deterministic. When a sequence of transitions consists of search transitions only, it is annotated by the symbol as {\dot{G}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{G^{\prime}}}.
An execution on any stable net, or on representation of any value, terminates successfully at a final state with only search transitions (by Lem. D.2, Lem. D.4 and Lem. D.6(1)).
The behaviour , which is a parameter of the machine, specifies a set of extrinsic transitions. Extrinsic transitions are also dubbed “compute” transitions, and each of them must target an active operation. Namely, a transition is a compute transition if: the first state has a rewrite token () that is an incoming edge of an active operation edge; and the second state has a search token (). Copy transitions or compute transitions are possible if and only if a state has a rewrite token (), and they always change the token to a search token (). We refer to copy transitions and compute transitions altogher as “rewrite” transitions.
Compute transitions may be specified locally, by rewrite rules, in the same manner as the intrinsic transitions. We leave it entirely open what the actual rewrite associated to some operation is, by having the behaviour as parameter as well as the operation set . This is part of the semantic flexibility of our framework. We do not specify a meta-language for encoding effects as particular transitions. Any algorithmic state transformation is acceptable.
Remark 5.16**.**
This abstract machine is “universal” in a sense of the word similar to the way it is used in “universal algebra” rather than in “universal Turing machine”. It is a general abstract framework in which a very wide range of concrete abstract machines can be instantiated by providing operations and their rewrite rules. ∎
6. A partial characterisation theorem
We can now formalise a proof method for contextual refinement and equivalence. All the technical development in this section is with respect to the universal abstract machine , parametrised by an operation set and its behaviour , that satisfies some conditions including determinism.
Firstly in Sec. 6.1, we state the condition of the machine, to which the proof method applies. Sec. 6.2 proposes our generalised notions of contextual refinement and equivalence. Sec. 6.3 formalises the proof method as a partial characterisation theorem (Thm. 6.20), introducing the key concept of robustness. Finally, in Sec. 6.4, we list some useful lemmas that can be used in robustness check.
6.1. Determinism and refocusing
We will focus on the situation where the universal abstract machine is both deterministic as a state transition system, and refocusing in the following sense.
Definition 6.1** (Rooted states and refocusing machine).**
- •
A state is rooted if ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{G}}.
- •
The universal abstract machine is refocusing if every transition preserves the rooted property.
Note that any initial state is trivially rooted, and search transitions preserve the rooted property. The stationary property below gives a sufficient condition for a rewrite transition to preserve the rooted property (Lem. C.8).
Definition 6.2** (Stationary rewrite transitions).**
A rewrite transition is stationary if there exist a focus-free simple context , focus-free hypernets and , and a number , such that is one-way, and , and the following holds. For any , such that is a state, there exists a state with a rewrite token, such that \mathcal{C}[?;_{j}H]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{N}}.
Determinism and refocusing of the universal abstract machine boil down to corresponding properties of compute transitions, which are specified by the behaviour .
Lemma 6.3**.**
- •
The universal abstract machine is deterministic if its compute transitions are deterministic.
- •
The universal abstract machine is refocusing if its compute transitions preserve the rooted property.
Proof.
Intrinsic transitions and compute transitions (i.e. extrinsic transitions) are mutually exclusive, and intrinsic transitions are all deterministic. Therefore, if compute transitions are deterministic, all transitions become deterministic.
Search transitions trivially preserve the rooted property. Copy transitions also preserve the rooted property, because they are stationary. This is mainly due to the fact that each input of the contraction tree in Fig. 7 is a source of a contraction edge. Therefore, all transitions but compute transitions already preserve the rooted property. ∎
Remark 6.4** (Refocusing).**
When a rewrite transition results in a rooted state , starting the search process (i.e. search transitions) from the state can be seen as resuming the search process ?;|{\dot{N^{\prime}}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{N^{\prime}}}, from an initial state, on the underlying hypernet . Resuming redex search after a rewrite, rather than starting from scratch, is an important aspect of abstract machines. In the case of the lambda-calculus, enabling the resumption is identified as one of the key steps (called “refocusing”) to synthesise abstract machines from reduction semantics by Danvy et al. [DMMZ12]. In our setting, it is preservation of the rooted property that justifies the resumption.
The stationary property, as a sufficient condition of preservation of the rooted property, characterises many operations with local behaviour. Compute transitions of operations that involve non-local change of a token position, like label jumping, could preserve the rooted property without being stationary. ∎
6.2. Contextual refinement and equivalence
We propose notions of contextual refinement and equivalence that check for successful termination of execution. These notions generalise the standard notions, by additionally taking into account a class of contexts to quantify over, and also the number of transitions. They are namely with respect to the universal abstract machine with some operation set and its behaviour , and parametrised by the following: a set of focus-free contexts that is closed under plugging (i.e. for any contexts such that is defined, ); and a preorder on natural numbers.
Definition 6.5** (State refinement and equivalence).**
Let be a preorder on , and and be two states.
- •
is said to refine up to , written as , if for any number and any final state such that , there exist a number and a final state such that and .
- •
and are said to be equivalent up to , written as , if and .
Definition 6.6** (Contextual refinement and equivalence).**
Let be a set of contexts that is closed under plugging, be a preorder on , and and be focus-free hypernets of the same type.
- •
is said to contextually refine in up to , written as , if any focus-free context , such that and are states, yields refinement .
- •
and are said to be contextually equivalent in up to , written as , if and .
In the sequel, we simply write etc., making the parameter implicit.
Because is a preorder, and are indeed preorders, and accordingly, equivalences and are indeed equivalences (Lem E.2). Examples of preorder include: the universal relation , the “greater-than-or-equal” order , and the equality .
When the relation is the universal relation , the notions concern successful termination, and the number of transitions is irrelevant. If all compute transitions are deterministic, contextual equivalences and coincide for any (as a consequence of Lem. E.3).
Because is closed under plugging, the contextual notions and indeed become congruences. Namely, for any and such that and are defined, , where .
As the parameter , we will particularly use the set of any focus-free contexts, and its subset of binding-free contexts.
Definition 6.7** (Binding-free contexts).**
A focus-free context is said to be binding-free if there exists no path, at any depth, from a source of a contraction, atom, box or hole edge, to a source of a hole edge.
The set is closed under plugging, and so is the set (Lem. E.1). Restriction to binding-free contexts is useful to focus on call-by-value languages, because only values will be bound during evaluation in these languages. Syntactically, the restriction would mean forbidding the hole of contexts from appearing in the bound positions.
The standard notions of contextual refinement and equivalence can be recovered as and , by taking the set of all focus-free contexts, and the universal relation .
6.3. Templates and robustness
6.3.1. Pre-templates and specimens
A candidate for contextual refinement, called pre-template, is given by a family of pairs of hypernets, indexed by types.
Definition 6.8** (Pre-template).**
A pre-template is a union of a family of binary relations on focus-free hypernets indexed by a set of types, such that for any where , and are focus-free hypernets with type and .
Obviously, if is a pre-template, its converse is also a pre-template.
Let be a set of contexts, and , and be binary relations on . Given a pre-template , our goal is to prove that implies contextual refinement , possibly with the help of state refinement up to or . As will be apparent in Sec. 9, the use of state refinement is particularly convenient in reasoning, allowing us to identify different contraction trees of the same type.
Pre-templates do not necessarily relate hypernets that represent terms, nor hypernets that arise from rewrite rules of the operations. However, the rewrite rules are indeed natural candidates for contextual refinements, and therefore natural sources of pre-templates.
Example 6.9** (Beta pre-template).**
As a leading example, we consider the beta pre-template derived from the beta rewrite rule (Fig. 8b), by forgetting the token: namely, if is a beta rewrite rule. These hypernets and have the same type , where and the sequence of types can be arbitrary. ∎
Example 6.10** (Structural pre-templates).**
As discussed in Rem. 5.11, we opt to introduce equational properties of contractions and weakenings as contextual equivalences, rather than built-in equations on hypernets. We call pre-templates for these equational properties structural pre-templates, and present eight examples of these here. These can be defined generally for any operation sets.
The first example is for duplication that is enabled by contractions. It is derived from the contraction rules (Fig. 7) in the same manner as the beta pre-template: namely, |{\dot{G_{1}}}|\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}|{\dot{G_{2}}}| if is a contraction rule.
The other seven examples are presented in Fig. 11, for which the definition of box-permutation pair is provided later in Def. 6.11.
Contextual equivalences induced by \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Assoc}}, \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Comm}} and \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Idem}} would enable the so-called idempotent completion (aka. Karoubi envelope or Cauchy completion) on contractions and weakenings. This means that contraction trees with the same type can be identified, so long as they contain at least one weakening edge.
The pre-template represents the so-called garbage collection, which allows us to remove copyable hypernets that are connected to weakening edges and hence not reachable from any input of a whole hypernet. While the pre-template is to enable permutation of outputs of a box edge, the pre-templates and are to allow contraction edges and weakening edges in a particular position to be pulled across the boundary of a box. ∎
Definition 6.11** (Box-permutation pair).**
For any , let and be bijections on sets and , respectively. These bijectons form a box-permutation pair if, for each , the following holds:
(A) if ,
(B) if ,
(C) if ,
(D) if .
At the core of the proof is comparison between hypernets related by the pre-template , in any possible contexts specified by the set . In other words, the comparison is between a pair of states whose only difference is given by the pre-template . This pair is given by data called specimen. The comparison can be relaxed by allowing state refinements and , in addition to the pre-template , to specify the difference between states. The relaxed comparison is targeted at a pair of states, which is intuitively a specimen up to state refinements, called quasi-specimen.
Definition 6.12** ((Quasi-)specimens).**
Let be a pre-template, and and be binary relations on states.
- (1)
A triple is a -specimen of if the following hold:
(A) , and the three sequences have the same length .
(B) for each .
(C) is a state for each . 2. (2)
A pair of states is a quasi--specimen of up to , if there exists a -specimen of such that the following hold:
(A) The tokens of , and all have the same label.
(B) If and are rooted, then and are also rooted, , and .
We can refer to the token label of a -specimen and a quasi--specimen. Any -specimen gives a quasi--specimen up to . Note that the focussed context of each -specimen may have multiple holes. We say a -specimen is single if the sequence only has one element, i.e. the context has exactly one hole edge (at any depth).
6.3.2. Robust templates
For the pre-template to imply contextual refinement, hypernets related by should intuitively induce the same behaviour of the token, regardless of contexts. We use specimens to analyse the token behaviour, in a case-by-case manner. Namely, given any -specimen of a pre-template , we analyse how the token in the context interacts with the hypernets , and , that are contributed by . According to the actual position and the label of the token, the possible interaction can be classified as follows:
**Case I: move inside the common context.: **
The token is a search token () or a value token (), and it interacts with an edge of the context . It does not interact with the fragments at all.
**Case II: visit to the fragments.: **
The token is a search token () or a value token (), and it is set to interact with edges of the fragments. The token is necessarily next to a hole edge in the context .
**Case III: rewrite.: **
The token in the context is a rewrite token (). It triggers a compute transition, in which a part of the fragments may be rewritten.
**Case IV: termination.: **
The token is a value token () and does not have an edge to interact, which is an incoming edge of the token. In this case, both states and are necessarily final.
Our objective is to establish that the token always interacts with the hypernets in the same manner as with the hypernets . By step-wise reasoning whose details will appear in Sec. 7, it suffices to show that the interaction in the -specimen , described above, always results in another (quasi-)-specimen of , unless it results in stuck states or final states.
Apart from Case IV which actually involves no interaction, it is only Case I that always results in another -specimen with a different context . For the other cases, i.e. Case II and Case III, we identify sufficient conditions for the pre-template , as well as for the operations , to yield a (quasi-)-specimen as a result.
There are two sufficient conditions for Case II, according to the token: input-safety for a search token (), and output-closure for a value token (). Input-safety enumerates some situations of interaction that yield a (quasi-)specimen or a pair of stuck states. On the other hand, output-closure characterises a situation where a value token cannot interact with the hypernets contributed by the pre-template , under the assumption that the machine is refocusing.
Definition 6.13** (Input-safety).**
A pre-template is -input-safe if, for any -specimen of such that has an entering search token, one of the following holds.
(I) There exist two stuck states and such that for each .
(II) There exist a -specimen of and two numbers , such that the token of is not a rewrite token and not entering, , , and .
(III) There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and .
Definition 6.14** (Output-closure).**
A pre-template is output-closed if, for any hypernets , either or is one-way.
Definition 6.15** (Templates).**
A pre-template is a -template, if it is -input-safe and also output-closed.
It is possible to allow a value token to interact with the hypernets contributed by the pre-template , and define a counterpart of input-safety for a value token. However, we here opt for a simple sufficient condition, output-closure, that excludes interaction in the refocusing machine. It is simple yet powerful enough to prove interesting contextual refinements, as we will see in Sec. 9.
Example 6.16** (Beta template).**
We continue with the beta pre-template from Ex. 6.9. It is natural to expect that the beta pre-template is input-safe, because it is derived from the beta rewrite rule. Given hypernets , whenever a search token enters , it should eventually become a rewrite token and trigger the beta rewrite of to . If this is the case, Def. 6.13(II) is fulfilled, where and the new context has one less hole than the original context .
However, the actual behaviour of an entering search token is not necessarily as expected. When a search token visits the hypernet that is the second argument of the application operation (see Fig. 8b), there is no guarantee that the token returns to the application edge () and hence triggers the beta rewrite of the hypernet . Even if it returns, it may have triggered some rewrites that change the hypernet to something else. Consequently, the pre-template is not necessarily input-safe as it is.
One possible solution is to restrict the pre-template , by requiring the hypernet to be stable. This restriction ensures that a search token that enters eventually triggers the beta rewrite of to , because a search token that visits the stable hypernet is guaranteed to return without triggering any rewrites (by Lem. D.4 and Lem. D.6(1)). Moreover, this restriction makes the pre-template output-closed, because of the type of . The restriction in fact corresponds to restriction of the standard beta law to the call-by-value one , where the argument is required to be a value. ∎
For Case III of the possible interaction between the token , where the token in the context triggers a rewrite, we identify a sufficient condition called robustness. It is in a similar style as input-safety, but additionally relative to the triggered rewrite, whose target is a contraction or an active operation .
Definition 6.17** (Robustness).**
A pre-template is -robust relative to a rewrite transition if, for any -specimen of , such that and the token of is a rewrite token and not entering, one of the following holds.
(I) or is not rooted.
(II) There exists a stuck state such that .
(III) There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and .
Example 6.18** (Robustness of beta pre-template).**
Let us see informally how robust the beta pre-template , from Ex. 6.9, can be.
The beta pre-template is robust relative to the arithmetic rewrite rule in Fig. 8a, because application of the rewrite rule does not interfere with any hypernets contributed by the pre-template . Any specimen is therefore turned into another specimen with a different context .
The beta pre-template is also robust relative to the beta rewrite rule. Hypernets contributed by the pre-template may appear as a part of the redex of the rewrite rule, namely, as a part of and in Fig. 8b. However, the rewrite does not manipulate what is inside and , and hence it preserves the contributed hypernets. Any specimen is again turned into another specimen with a different context.
When it comes to a copy transition, which applies a contraction rule (Fig. 7), robustness is not guaranteed in general. Starting from a pair of states given by a specimen , it may be the case that some copy transitions are possible without reaching another (quasi-)specimen. An example scenario is when the specimen yields the following two states, where the context is indicated by magenta:
[TABLE]
Transitions from the state eventually duplicate the application edge (), the abstraction edge () and also the entire box connected to the abstraction edge. In particular, these transitions duplicate the atom edge contained in the box. However, transitions from the state can never duplicate the atom edge, because the edge is shallow in this state. This mismatch of duplication prevents the beta pre-template from being robust relative to a copy transition.
This is why we might prefer to restrict contexts to be binding-free. If the context is binding-free, the situation explained above would never happen. Application of a contraction rule can involve the hypernets and only as a part of box contents that are duplicated as a whole. Any specimen is therefore turned into another specimen whose context possibly has more holes, by a single copy transition. Note that this explains why we allow the context of a specimen to have multiple holes.
Another interesting example is given by the operation mentioned earlier, whose rewrite transition measures the size of the whole hypernet. The transition introduces an edge labelled by an integer, which can be seen as a passive operation with no arguments, that represents the measured size. Consequently, the rewrite transition on states and may introduce different integer edges, which cannot be related by the pre-template . Robustness of the beta pre-template fails relative to the operation .
Since robustness is only a sufficient condition, this failure does not necessarily mean that the beta pre-template cannot imply contextual refinement. Nevertheless, the failure suggests how the operation could violate the contextual refinement. It namely indicates that such a violation would rely on equality testing of integers, such as an active operation whose rewrite transition is only defined when the two eager arguments are the same integer. ∎
6.3.3. The partial characterisation theorem
Finally, the partial characterisation theorem (Thm. 6.20 below) enables us to prove contextual refinement using state refinements and , under the assumption that the machine is deterministic and refocusing, and that the triple is “reasonable” in the following sense.
Definition 6.19** (Reasonable triples).**
A triple of preorders on is reasonable if the following hold:
(A) is closed under addition, i.e. any and satisfy .
(B) , and .
(C) , where denotes composition of binary relations.
Examples of a reasonable triple include: , , , , , , , , .
Theorem 6.20**.**
If an universal abstract machine is deterministic and refocusing, it satisfies the following property. For any set of contexts that is closed under plugging, any reasonable triple , and any pre-template on focus-free hypernets :
- (1)
If is a -template and -robust relative to all rewrite transitions, then implies contextual refinement in up to , i.e. any implies . 2. (2)
If is a -template and the converse is -robust relative to all rewrite transitions, then implies contextual refinement in up to , i.e. any implies .
Proof.
This is a consequence of Prop. 7.6, Prop. 7.2 and Prop. 7.4 in Sec. 7. ∎
Remark 6.21** (Coinductive reasoning behind the scenes).**
The proof of Thm. 6.20, which will be given in Sec. 7, crucially uses coinductive reasoning. It enables step-wise reasoning, or transition-wise reasoning, to compare sequences of transitions that end with final states. The proof is namely centred around a variant of weak simulation combined with the so-called up-to technique. The definitions of input-safety (Def. 6.13), output-closure (Def. 6.14), and robustness (Def. 6.17) are in fact designed so that a robust template induces the variant of weak simulation up-to.
We can observe in Ex. 6.16 that, when a pre-template is derived from a local rule of an active operation, a pair of hypernets related by the pre-template can result in the same hypernet after different numbers of transitions: namely, some number of transitions on results in precisely , which can be seen as the result of zero transition on . This motivates our use of the notion of weak simulation, instead of the standard notion of simulation that only allows comparison between single transitions.
Recall that, as discussed in Rem. 5.11, we do not opt to quotient hypernets with equatinoal properties. This is because such quotienting seems to make local reasoning, by means of templates and specimens, ineffective. We instead opt to incorporate the equational properties as contextual equivalences, and use them to check input-safety and robustness of a template, via the notion of quasi specimen (Def. 6.12(2)). In coinductive reasoning, this amounts to the use of contextual equivalences (and hence state equivalence) in establishing a weak simulation, which is possible with the up-to technique. More precisely, we will use a variant of weak simulation up to state refinement in the proof of Thm. 6.20.
It is however a well-known challenge to combine weak simulations with the up-to technique, as a naive combination does not give a sound proof technique (see e.g. [Pou05, Pou07]). Here we overcome the challenge by introducing quantitative constraints by means of preorders on natural numbers, namely reasonable triples (Def. 6.19). Thanks to reasonable triples, our variant of weak simulation up-to provides a sound proof technique for Thm. 6.20.
The use of reasonable triples has a byproduct, that is, the weak simulation up-to enables us to prove not only the standard notion of contextual equivalence, but also its generalisation parametrised by preorders on natural numbers. It is the choice of not quotienting hypernets, which is crucial for local reasoning, that leads us to impose the quantitative constraints, and ultimately enables us to generalise contextual equivalence with respect to the preorders. ∎
Remark 6.22** (Monotonicity).**
Contextual/state refinement and equivalence are monotonic with respect to , in the sense that implies for each . Contextual refinement and equivalence are anti-monotonic with respect to , in the sense that implies for each . This means, in particular, .
Given that , , , , and , the following holds. Any -specimen is a -specimen, and any quasi--specimen up to is a quasi--specimen up to . Any -template is a -template. If is -robust relative to a rewrite transition, then it is also -robust relative to the same transition. Note that the notions of template and robustness are not monotonic nor anti-monotonic with respect to .
To prove that a pre-template induces contextual equivalence, one can use Thm. 6.20(1) twice with respect to and . One can alternatively use Thm. 6.20(1) and Thm. 6.20(2), both with respect to . This alternative approach is often more economical. The reason is that the approach involves proving input-safety of with respect to two parameters and , which typically boils down to a proof for one parameter, thanks to the monotonicity. ∎
6.4. Sufficient conditions for robustness
A proof of robustness becomes trivial for a specimen with a rewrite token that gives a non-rooted state. Thanks to the lemma below, we can show that a state is not rooted, by checking paths from the token target.
Definition 6.23** (Accessible paths).**
- •
A path of a hypernet is said to be accessible if it consists of edges whose all sources have type .
- •
An accessible path is called stable if the labels of its edges are included in .
- •
An accessible path is called active if it starts with one active operation edge and possibly followed by a stable path.
Note that box edges and atom edges never appear in an accessible path.
Lemma 6.24**.**
*If a state has a rewrite token that is not an incoming edge of a contraction edge, then the state satisfies the following property: If there exists an accessible, but not active, path from the token target, then the state is not rooted. *
Proof.
This is a contraposition of a consequence of Lem. C.5 and Lem. D.6(2). ∎
Checking the condition (III) of robustness (see Def. 6.17) involves finding a quasi--specimen of up to , namely checking the condition (B) of Def. 6.12(2). The following lemma enables us to use contextual refinement to yield state refinement , via single -specimens of a certain pre-template .
Definition 6.25**.**
A pre-template is a trigger if it satisfies the following:
(A) For any single -specimen of , such that has an entering search token, for each .
(B) For any hypernets , both and are one-way.
Lemma 6.26**.**
Let be a set of contexts, and be a binary relation on such that, for any , implies . Let be a pre-template that is a trigger and implies contextual refinement . For any single -specimen of , if compute transitions are all deterministic, and one of states and is rooted, then the other state is also rooted, and moreover, .
Proof.
This is a corollary of Lem. F.1. ∎
Remark 6.27**.**
The notion of contextual refinement concerns initial states, and therefore, only enables us to safely replace a part of a hypernet before execution. Because any initial state is rooted, if all transitions preserve the rooted property, we can safely assume that any state that arises in an execution is rooted. If all transitions are also deterministic, Lem. 6.26 enables us to use some contextual refinement and safely replace a part of a hypernet during execution. This can validate run-time garbage collection, for example. ∎
7. Proof of the partial characterisation theorem
This section details the proof of Thm. 6.20, with respect to the machine parametrised by and .
At the core of the proof is step-wise reasoning, or transition-wise reasoning, using a lax variation of simulation. Providing a simulation boils down to case analysis on transitions, namely on possible interactions between the token and parts of states contributed by a pre-template. While output-closure helps us disprove some cases, input-safety and robustness give the cases that are specific to a pre-template and an operation set.
We also employ the so-called up-to technique in the use of quasi-specimens. Our variation of simulation is up to state refinements, with a quantitative restriction implemented by the notion of reasonable triple. This restriction is essential to make this particular up-to technique work, in combination with our lax variation of simulation. A similar form of up-to technique is studied categorically by Bonchi et al. [BPPR17], but for the ordinary notion of (weak) simulation, without this quantitative restriction.
The lax variation of simulation we use is namely -simulation, parametrised by a triple . This provides a sound approach to prove state refinement , using and , given that all transitions are deterministic and forms a reasonable triple.
Definition 7.1** (-simulations).**
Let be a binary relation on states, and be a triple of preorders on . The binary relation is a -simulation if, for any two related states , the following (A) and (B) hold:
(A) If is final, is also final.
(B) If there exists a state such that , one of the following (I) and (II) holds:
(I) There exists a stuck state such that .
(II) There exist two states and , and numbers , such that , , , and .
Proposition 7.2**.**
When the universal abstract machine is deterministic, it satisfies the following.
For any binary relation on states, and any reasonable triple , if is a -simulation, then implies refinement up to , i.e. any implies .
Proof.
Our goal is to show the following: for any states , any number and any final state , such that , there exist a number and a final state such that and . The proof is by induction on .
In the base case, when , the state is itself final because . Because is a -simulation, is also a final state, which means we can take [math] as and itself as . Because is a reasonable triple, is a preorder and holds.
In the inductive case, when , we assume the induction hypothesis on any such that . Now that , there exists a state such that . Because all intrinsic transitions are deterministic, the assumption that compute transitions are all deterministic implies that states and transitions comprise a deterministic abstract rewriting system, in which final states and stuck states are normal forms. By Lem. C.1, we can conclude that there exists no stuck state such that .
Therefore, by being a -simulation, there exist two states and , and numbers , such that , , , and . By the determinism, must hold; if is a final state, must coincide with ; otherwise, must be a suffix of . There exist two states and , and we have the following situation, where the relations , and are represented by vertical dotted lines from top to bottom.
[TABLE]
We expand the above diagram as below (indicated by magenta), in three steps.
[TABLE]
Firstly, by definition of state refinement, there exist a number and a final state such that and . Because is a reasonable triple, , and hence . Therefore, secondly, by induction hypothesis on , there exist a number and a final state such that and . Thirdly, by definition of state refinement, there exist a number and a final state such that and .
Now we have , and , which means . Because is a reasonable triple, this implies , and moreover, . We can take as . ∎
The token in a focussed context is said to be remote, if it is a search token, a value token, or not entering. The procedure of contextual lifting reduces a proof of contextual refinement down to that of state refinement.
Definition 7.3** (Contextual lifting).**
Let be a set of contexts. Given a pre-template on focus-free hypernets , its -contextual lifting is a binary relation on states defined by: if there exists a -specimen of , such that the token of is remote, , and is rooted, for each .
The contextual lifting is by definition a binary relation on rooted states.
Proposition 7.4**.**
For any set of contexts that is closed under plugging, any preorder on , and any pre-template on focus-free hypernets , if the -contextual lifting implies refinement (resp. equivalence ), then implies contextual refinement (resp. contextual equivalence ).
Proof of refinement case.
Our goal is to show that, for any and any focus-free context such that and are states, we have refinement .
Because for , and , the triple is a -specimen of with a search token. Moreover the states and are trivially rooted. Therefore, , and by the assumption, . ∎
Proof of equivalence case.
It suffices to show that, for any and any focus-free context such that and are states, we have refinements and , i.e. equivalence .
Because for , and , the triple is a -specimen of with a search token. Moreover the states and are trivially rooted. Therefore, , and by the assumption, . ∎
Lemma 7.5**.**
For any set of contexts that is closed under plugging, any pre-template on focus-free hypernets , and any -specimen of , the following holds.
- (1)
The state is final (resp. initial) if and only if the state is final (resp. initial). 2. (2)
If is output-closed, and and are both rooted states, then the token of is not exiting. 3. (3)
If is output-closed, and are both rooted states, the token of is a value token or a non-entering search token, and a transition is possible from or , then there exists a focussed context with a remote token such that and for each .
Proof of point (1).
Let be an arbitrary element of a set . If is final (resp. initial), the token source is an input in . Because input lists of , and all coincide, the token source must be an input in , and in too. This means is also a final (resp. initial) state. ∎
Proof of point (2).
This is a consequence of the contraposition of Lem. C.7(3). ∎
Proof of the
point (3).
The transition possible from or is necessarily a search transition. By case analysis on the token of , we can confirm that the search transition applies an interaction rule to the token and an edge from .
- •
When the token of is a value token, the transition can only change the token and its incoming operation edge. Because is output-closed, by the point (2), the token of is not exiting. This implies that the incoming operation edge of the token is from in both states and .
- •
When the token of is a non-entering search token, the transition can only change the token and its outgoing edge. Because the token is not entering in , the outgoing edge is from in both states and .
Therefore, there exist a focus-free simple context and an interaction rule , such that , and is a focussed context.
Examining interaction rules confirms , and hence . By definition of search transitions, we have:
[TABLE]
for each .
The rest of the proof is to check that has a remote token, namely that, if its token is a rewrite token, the token is not entering. This is done by inspecting interaction rules.
- •
When the interaction rule changes a value token to a rewrite token, this must be the interaction rule (5a), which means consists of the rewrite token and its outgoing operation edge. The operation edge remains to be a (unique) outgoing edge of the token in , and hence the token is not entering in .
- •
When the interaction rule changes a search token to a rewrite token, this must be the interaction rule (1a), (1b) or (5b), which means . Because the token is not entering in , the token is also not entering in .
∎
Proposition 7.6**.**
When the universal abstract machine is deterministic and refocusing, it satisfies the following, for any set of contexts that is closed under plugging, any reasonable triple , and any pre-template on focus-free hypernets .
- (1)
If is a -template and -robust relative to all rewrite transitions, then the -contextual lifting is a -simulation. 2. (2)
If is a -template and the converse is -robust relative to all rewrite transitions, then the -contextual lifting of the converse is a -simulation.
Proof prelude.
Let be an arbitrary -specimen of , such that the token of is remote, and is a rooted state for each . By definition of contextual lifting, , and equivalently, . Note that .
Because is output-closed, by Lem. 7.5(2), the token is not exiting in . This implies that, if the token has an incoming edge in or , the incoming edge must be from .
Because the machine is deterministic and refocusing, rooted states and transitions comprise a deterministic abstract rewriting system, in which final states and stuck states are normal forms. By Lem. C.1, from any state, a sequence of transitions that result in a final state or a stuck state is unique, if any.
Because is a reasonable triple, and are reflexive. By Lem. E.2, this implies that and are reflexive, and hence , and . ∎
Proof of the point (1).
Our goal is to check conditions (A) and (B) of Def. 7.1 for the states .
If is final, by Lem. 7.5(1), is also final. The condition (A) of Def. 7.1 is fulfilled.
If there exists a state such that , we show that one of the conditions (I) and (II) of Def. 7.1 is fulfilled, by case analysis of the token in .
- •
When the token is a value token, or a search token that is not entering, by Lem. 7.5(3), there exists a focussed context with a remote token, such that and for each . We have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II) of Def. 7.1 is fulfilled.
[TABLE]
By the determinism, . Because is a reasonable triple, is a preorder and . The context satisfies , so is a -specimen of . The context has a remote token, and the states and are both rooted. Therefore, we have .
- •
When the token is a search token that is entering in , because is -input-safe, we have one of the following three situations corresponding to (I), (II) and (III) of Def. 6.13.
- –
There exist two stuck states and such that for each . By the determinism of transitions, we have , which means the condition (I) of Def. 7.1 is satisfied.
- –
There exist a -specimen of and two numbers , such that the token of is not a rewrite token and not entering, , , and . By the determinism of transitions, we have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II) of Def. 7.1 is fulfilled.
[TABLE]
The context has a remote token, and states and are rooted. Therefore, .
- –
There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and . By the determinism of transitions, we have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II) of Def. 7.1 is fulfilled.
[TABLE]
Because is a quasi--specimen of up to , and states and are rooted, there exists a -specimen of with a non-rewrite token, such that and are also rooted, , and . Because is a reasonable triple, , and hence . Therefore, we have:
[TABLE]
- •
When the token is a rewrite token, is a rewrite transition, and by definition of contextual lifting, the token is not entering in . Because is -robust relative to all rewrite transitions, and and are rooted, we have one of the following two situations corresponding to (II) and (III) of Def. 6.17.
- –
There exists a stuck state such that . The condition (I) of Def. 7.1 is satisfied.
- –
There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and . We have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II) of Def. 7.1 is fulfilled.
[TABLE]
Because is a quasi--specimen of up to , and states and are rooted, there exists a -specimen of with a non-rewrite token, such that and are also rooted, , and . This means , and hence:
[TABLE]
∎
Proof of the point (2).
It suffices to check the “reverse” of conditions (A) and (B) of Def. 7.1 for the states , namely the following conditions (A’) and (B’).
(A’) If is final, is also final.
(B’) If there exists a state such that , one of the following (I’) and (II’) holds.
(I’) There exists a stuck state such that .
(II’) There exist two states and , and numbers , such that , , , and .
The proof is mostly symmetric to the point (1). Note that there is a one-to-one correspondence between -specimens of and -specimens of ; any -specimen of gives a -specimen of . Because is output-closed, its converse is also output-closed.
If is final, by Lem. 7.5(1), is also final. The condition (A’) is fulfilled.
If there exists a state such that , we show that one of the conditions (I’) and (II’) above is fulfilled, by case analysis of the token in .
- •
When the token is a value token, or a search token that is not entering, by Lem. 7.5(3), there exists a focussed context with a remote token, such that and for each . We have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II’) is fulfilled.
[TABLE]
By the determinism, . Because is a reasonable triple, is a preorder and . The context satisfies , so is a -specimen of . The context has a remote token, and the states and are both rooted. Therefore, we have .
- •
When the token is a search token that is entering in , because is -input-safe, we have one of the following three situations corresponding to (I), (II) and (III) of Def. 6.13.
- –
There exist two stuck states and such that for each . By the determinism of transitions, we have , which means the condition (I’) is satisfied.
- –
There exist a -specimen of and two numbers , such that the token of is not a rewrite token and not entering, , , and . We have the following situation, namely the black part of the diagram below.
[TABLE]
The magenta part holds, because the token of is not a rewrite token and not entering, and because states and are rooted. We check the condition (II’) by case analysis on the number .
When , by the determinism of transitions, we have the following diagram, which means the condition (II’) is fulfilled.
[TABLE]
- *
When , , and we have the following situation, namely the black part of the diagram below.
[TABLE]
Because , and the token of is a value token, or a non-entering search token, by Lem. 7.5(3), there exists a focussed context with a remote token, such that and for each . By the determinism of transitions, . Because is a reasonable triple, is a preorder and . The context satisfies , so is a -specimen of . The context has a remote token, and the states and are both rooted. Therefore, we have . Finally, because is a reasonable triple, is closed under addition, and hence . The condition (II’) is fulfilled.
- –
There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and . By the determinism of transitions, we have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II’) is fulfilled.
[TABLE]
Because is a quasi--specimen of up to , and states and are rooted, there exists a -specimen of with a non-rewrite token, such that and are also rooted, , and . Because is a reasonable triple, , and hence . Therefore, we have:
[TABLE]
- •
When the token is a rewrite token, is a rewrite transition, and by definition of contextual lifting, the token is not entering in . Because is -robust relative to all rewrite transitions, and and are rooted, we have one of the following two situations corresponding to (II) and (III) of Def. 6.17.
- –
There exists a stuck state such that . The condition (I’) is satisfied.
- –
There exist a quasi--specimen of up to , whose token is not a rewrite token, and two numbers , such that , , and . We have the following situation, namely the black part of the diagram below. Showing the magenta part confirms that the condition (II’) is fulfilled.
[TABLE]
Because is a quasi--specimen of up to , and states and are rooted, there exists a -specimen of with a non-rewrite token, such that and are also rooted, , and . This means , and hence:
[TABLE]
∎
8. Observational equivalences of an extended call-by-value lambda-calculus
The partial characterisation theorem (Thm. 6.20) is to prove contextual equivalences on hypernets. By encoding terms into hypernets, observational equivalences on terms can also be proved using the theorem. This section demonstrates how observational equivalences of an extension of the call-by-value lambda-calculus can be formalised in terms of focussed hypernet rewriting.
8.1. An extended call-by-value lambda-calculus
Fig. 12 presents an extention of the untyped call-by-value lambda-calculus that additionally includes atoms , constants , unary operations \{1}\in\mathbf{UnOpr}${2}\in\mathbf{BinOpr}\Lambda\mathbf{Const}\mathbf{UnOpr}\mathbf{BinOpr}$. Operations with higher arities could be added in a straightforward manner.
Formation rules derive a judgement that consists of a finite sequence of distinct variables, a finite sequence of distinct atoms, and a -term . For a -term , the sets of bound variables, free variables, free atoms, respectively, can be defined as usual. In the sequel, we assume that bound variables of -terms are distinct, i.e. each variable is bound at most once in a -term.
Fig. 12 also introduces call-by-value let-binding and sequential composition as syntactic sugar, for later use.
8.2. Encoding into hypernets
Well-defined -terms can be directly encoded into hypernets. The encoding is defined inductively on (unique) derivations of well-defined -terms, as shown in Fig. 13. The resultant hypernets use the operation set that is induced by the sets , and as follows:
[TABLE]
Term constructors, namely function abstraction, application, constants and unary/binary operations, are encoded simply into edges that have corresponding labels. In particular, function application () and binary operations (\{2}(\Gamma\mid\Delta\vdash{\Lambda}t\mathbin{$}u)^{\dagger}in Fig. [13](#S8.F13). Note that the type\mathop{\overset{\rightarrow}{@}}\colon\star\Rightarrow\star^{\otimes 2}$ of function application encodes, and also ensures, the call-by-value evaluation of function application.
Thunks are used to encode function abstractions, which introduce a bound variable. Each variable is encoded into a contraction tree (i.e. a tree that consists of contraction edges and weakening edges) that has as many inputs (leaves) as the variable appears in a term. Atoms are encoded similarly to variables, except for instance edges (labelled with ‘’) connected to inputs of contraction trees. Informally speaking, encoding a term yields a hypernet where term constructors appear towards the sole input and contraction trees appear towards the outputs which correspond to free variables and free atoms.
In fact, there can be many different contraction trees that have a specific number of inputs, and consequently, there is no canonical choice of contraction trees that can be used to encode a given variable (or atom). It is desirable to identify these contraction trees, and we opt to do so by means of contextual equivalences induced by certain pre-templates (to be presented in Sec. 9.2.2). Via the so-called idempotent completion, these contextual equivalences allow us to identify certain contraction trees that (i) have the same number of inputs and (ii) contain at least one weakening edge. The encoding is designed so that variables and atoms are always encoded into those contraction trees that can be identified using the contextual equivalences. Distributors from Def. 5.7, e.g. , and in Fig. 13, are specifically used as standard ones among these contraction trees.
For example, in the encoding of a function application , free variables and free atoms are distributed to the two sub-terms and using distributors and . In this case, each free variable (and atom) is encoded not simply into one contraction edge which already has two inputs, but into a contraction tree that consists of two contraction edges and, importantly, one weakening edge. In the encoding of a function abstraction , distributors and are inserted, while the term simply inherits the free variables/atoms from the sub-term .
8.3. Observational equivalence
The notion of observational equivalence on terms can now be defined using the contextual equivalence on hypernets.
Definition 8.1** (Observational equivalence on -terms).**
Let and be two derivable judgements. The -terms and are said to be observationally equivalent, written as , if holds.
This notion of observational equivalence only concerns the coincidence of termination, which is standard given that the language is untyped. The use of the universal relation in Def. 8.1 makes the number of transitions until termination irrelevant.
The observational equivalence is defined in terms of the contextual equivalence that is with respect to binding-free contexts for the operation set . This contextual equivalence is a larger relation than the contextual equivalence with respect to all contexts for the operation set thanks to the anti-monotonicity (Remark 6.22). The restriction to binding-free contexts can be justified by the fact that the observational equivalence is a congruence relation with respect to -contexts defined by the following grammar:
[TABLE]
where \mathord{\{1}}\in{\mathtt{ref},\mathord{!},\mathord{-{1}}}\mathord{$_{2}}\in{\mathord{=},\mathord{:=},\mathord{+},\mathord{-}}$. This congruence property will be formalised as Lem. 8.2 below.
The formation rules of -terms in Fig. 12 can be adapted to -contexts, by annotating the hole ‘’ as ‘’ and adding a formation rule . We write when the hole of is annotated with . The encoding of -terms into hypernets in Fig. 13 can be extended accordingly, by encoding the additional formation rule into a path hypernet , where and are the length of and respectively. The encoding of -contexts yields hypernets that are binding-free contexts, and consequently, the observational equivalence on -terms is indeed a congruence relation with respect to -contexts.
Lemma 8.2**.**
Let and be two derivable judgements. If holds, then for any -context such that is derivable, holds.
Proof outline.
The proof is a combination of the congruence property of the contextual equivalence with respect to binding-free contexts, with two key properties of the encoding of -contexts. The two properties, stated below, can be proved by induction on -contexts.
The first property is that the encoding is a binding-free context (as a hypernet). To check this property, one needs to examine paths to the sole source of the unique hole edge that appears in the encoding. These paths are in fact always operation paths, noting that paths never go across the boundary of boxes (by Def. 5.2).
The second property is that and are both derivable, and moreover, their encoding can be decomposed as follows:
[TABLE]
∎
9. Example: a Parametricity law
In this section we use the partial characterisation theorem (Thm. 6.20) and prove a Parametricity law. The law originates in the Idealised Algol literature [OT97], a call-by-name language with ground-type local variables. The law we use here is for the untyped call-by-value lambda-calculus extended primarily with store, whose operations are already introduced in Sec. 4.2. The Parametricity law states that a term
[TABLE]
is observationally equivalent to a term
[TABLE]
These terms use the standard call-by-value variable binding ‘’ and sequential composition ‘;’, which will be both defined by syntactic sugar. In the first term, a store is created by ‘’, and any access to the store simply fetches the stored value, due to ‘’, without modifying it. As a consequence, the fetched value is always expected to be the original stored value ‘’, and hence the whole computation involving the particular state is expected to have the same result as just having the value ‘’ in the first place as in the second term above.
The proof of the Parametricity law consists of the following four steps.
**Step 1: **
design necessary pre-templates.
**Step 2: **
prove that the pre-templates are robust templates.
**Step 3: **
apply the partial characterisation theorem (Thm. 6.20) to the pre-templates, choosing appropriate parameters (i.e. a set of contexts and a reasonable triple).
**Step 4: **
combine resultant contextual equivalences and yield the Parametricity law.
In this section we will explain all the steps except for Step 2, which will be discussed separately in Sec. 10.
The rest of this section is organised as follows. Sec. 9.1 formalises the Parametricity law in a particular extension of the untyped call-by-value lambda-calculus. Sec. 9.2 explains Step 1 of the proof. It first illustrates the design process of pre-templates, and then presents the necessary pre-templates. Sec. 9.3 describes Step 3 of the proof, namely how the partial characterisation theorem can be applied to each of the pre-templates. Sec. 9.4 presents Step 4 of the proof, showing how the contextual equivalences that are induced by the pre-templates can be combined to yield the Parametricity law.
9.1. The Parametricity law
We formalise the Parametricity law in an instance of the extended (untyped) call-by-value lambda-calculus presented in Sec. 8.1. This particular extension involves store and arithmetic, which are already introduced in Sec. 4.2. The extension, referred to as , is specified by the following operations and constants.
- •
Store operations: reference creation ‘’, equality testing ‘’, assignment ‘’, and dereferencing ‘’.
- •
Arithmetic operations: addition ‘’, subtraction ‘’, and negation ‘’.
- •
Constants: natural numbers, boolean values, and the unit value ‘’.
Terms of the extended calculus can be encoded into hypernets, as described in Sec. 8.2. The encoding consequently uses the following passive operations and active operations :
[TABLE]
Their union is denoted by .
The Parametricity law can be formally stated as an observational equivalence.
Proposition 9.1** (Parametricity law).**
For any finite sequence of distinct variables and any finite sequence of distinct variables ,
[TABLE]
9.2. Designing pre-templates
9.2.1. Design process
The proof of the Parametricity law starts with designing necessary pre-templates. Instead of turning the law directly into a single pre-template, we will decompose the law into several, more primitive, pre-templates. Once we apply the partial characterisation theorem to the pre-templates, the obtained contextual equivalences can be composed to yield the Parametricity law. This approach increases the possibility of reusing parts of a proof of one law in a proof of another law. For example, most of the pre-templates that will be used to prove the Parametricity law can be reused for proving the call-by-value Beta law (see [Mur20, Sec. 4.5] for details).
The idea of the pre-templates is that they can describe all the possible differences that may arise during execution of any two programs whose differences is given precisely by the Parametricity law. As an illustration of the design process, we compare informal reduction sequences of two example programs, as summarised in Fig. 14.
We choose the context , which expects a function in the hole and uses it twice. It generates two programs, by receiving the terms and that are a de-sugared version of the two sides of the Parametricity law (i.e. (1) and (2)).
Each informal reduction step updates a term and its associated store. The step is either the standard call-by-value beta-reduction, addition of numbers (for ‘’), reference creation (for ‘’), or dereferencing (for ‘’). Reference creation is the only step that modifies store. It replaces a (sub-)term of the form ‘’ with a fresh name, say ‘’, and extends the store with ‘’. The empty store is denoted by ‘’. Each reduction step in Fig. 14 is given a tag, such as L0, L0’, R1. We use the tags for referring to a corresponding term and store, and also to the reduction step, if any, from the term and store.
Before explaining the colouring scheme of Fig. 14, let us observe the (in-)correspondence between the reduction sequences of and . The reduction sequence of consists of seven beta-reduction steps (R1–R7) and one addition step (R8). The other reduction sequence, of , in fact contains steps that correspond to these seven beta-reduction steps and the addition step, as suggested by the tags (L1, L2, L3, L4, L5, L6, L7, L8). The sequence has four additional steps, namely: one reference-creation step (L0), one application step (L0’) of a function to the created name ‘’, and two dereferencing steps (L4’, L7’). The two sequences result in the same term, but in different store (L9, R9).
The colouring scheme is as follows. In the eight matching steps and the final result, differences between two sides (e.g. L1 and R1) are highlighted in magenta. Note that it is not the minimum difference that are highlighted. Highlighted parts are chosen in such a way that they capture the smallest difference “on the surface”. Sub-terms on the surface are those that are outside of any lambda-abstraction, which can be graphically represented by sub-hypernets that are outside of any boxes. For example, after the name creation (i.e. in L1, R1), the function abstractions (‘’, ‘’) are highlighted, instead of the minimum difference (‘’, ‘’).
In summary, the reduction sequences of and have eight steps corresponding with each other (L1–L8, R1–R8) where the differences of their results can be described by means of store and sub-terms on the surface. The extracted differences are namely function abstractions (‘’ and ‘’, against ‘’ and ‘’) and store (‘’ against ‘’). By simply collecting these differences, we can obtain our first pre-template: parametricity pre-template . It is the essential, key, pre-template for the Parametricity law. Textually, and intuitively, it looks like the following.
[TABLE]
Graphically, the parametricity pre-template is simply a relation between hypernets. In particular, the left-hand side of the pre-template can be represented as a single hypernet, thanks to the graphical representation where mentions of a name (e.g. ‘’ above) become connection. Hypernets of the function abstractions are all connected to the hypernet of the store ‘’. This graphical representation naturally entails the crucial piece of information, which is invisible in the informal textual representation (3), that the function abstractions are the all and only parts of a program that have access to the name ‘’.
The choice of smallest differences on the surface, instead of absolute minimum difference, also plays a key role here. Should we choose minimum differences that may be inside lambda-abstraction, their hypernet representations cannot be directly combined to yield a single valid hypernet. This is due to the box structure of hypernets, which will be used to represent function abstraction. In other words, to connect the hypernet of the store ‘’ to the hypernet of the sub-term ‘’ that appears inside the function abstraction ‘’, we must first make a connection to the hypernet of the whole function abstraction, which contains the hypernet of ‘’ inside a box.
Recall that the parametricity pre-template collects differences in matching steps (eight steps each, i.e. L1–L8, R1–R8). From the first two unmatched steps (L0, L0’), we extract other two pre-templates and . These are dubbed operational pre-templates, because they are both induced by the reductions, namely: by reference creation (L0) and by beta-reduction (L0’). For instance, these pre-templates relate the sub-terms that are yielded by the two reductions (L0, L0’), informally as follows.
[TABLE]
The three pre-templates , , are all the key pre-templates we will need for the Parametricity law. Once the partial characterisation theorem is applied to these pre-templates, the law can be obtained as a chain of the induced contextual equivalences. The chain roughly looks as follows for the particular programs and , where we use to denote the informal textual counterpart of contextual equivalence that is between terms accompanied by store.
[TABLE]
Finally, in the formal proof that uses hypernets and focussed graph rewriting rather than terms and reductions, we will additionally need some auxiliary pre-templates, namely the structural pre-templates introduced in Ex. 6.10. These will enable us to simplify or identify certain contractions and weakenings, which are not present in the textual representation but important in the graphical representation as hypernets. Contextual equivalences induced by the structural pre-templates enable simplification of certain hypernets that involve contractions and weakanings. The simplification will appear in the formal counterpart of the chain (4), which will be presented in Sec. 9.4. It primarily applies to the hypernets produced by the encoding . Additionally, the pre-templates will help us prove input-safety and robustness of other pre-templates.
Remark 9.2**.**
In the reduction sequences in Fig. 14, we can find various situations of robustness for the parametricity pre-template . These can be classified into three.
The first kind of situations is that the highlighted parts (in magenta), which are the parts related by the pre-template, are not involved in a reduction step and hence preserved. This is the case for the third and sixth matching beta-reduction steps (L3, R3, L6, R6), and the addition step (L8, R8).
The second kind of situations is that the highlighted parts are involved in beta reduction as an argument. This is the case for the first beta-reduction step (L1, R1). The involved highlighted parts (i.e. the function abstractions ‘’ and ‘’) are simply duplicated, without any modification. This mere duplication of highlighted parts is typical in checking robustness of a pre-template. It is often caused by duplication of some function abstraction that contains the highlighted parts. This case will be formalised in Sec. 10.
The last kind of situations is that the highlighted parts are involved in beta reduction as a function, which is the case for the rest of the matching beta-reduction steps. These situations can further be classified into two. Firstly, involved highlighted parts (i.e. the function abstractions ‘’ and ‘’) are modified, and the results are again highlighted (i.e. ‘’ and ‘’). This is the case for the second and fifth beta-reduction steps (L2, R2, L5, R5). Secondly, involved highlighted parts (i.e. ‘’ and ‘’) are modified, and the modification leads to an identical sub-term (i.e. ‘’). This is the case for the fourth and seventh beta-reduction steps (L4, R4, L7, R7). Although their direct results are not identical (i.e. ‘’ vs. ‘’), the same result ‘’ can be reached after the dereferencing steps on one side (L4’, L7’).
Note that there may be another kind of situations, depending on contexts. For example, closed function abstractions that are related by may be involved in beta reduction only as a part of a function, in which case these are preserved.
In general, checking robustness of any pre-template boils down to enumerating and examining all situations where parts related by a pre-template is involved in a reduction step. Situations where related parts actually get modified, such as those of the third kind described above, are the primary situations to examine. ∎
9.2.2. Pre-templates
Now we define the actual pre-templates on hypernets, which we informally introduced in Sec. 9.2.1. These are the parametricity pre-template , and the two operational pre-templates and . The three pre-templates, together with the structural pre-templates presented in Ex. 6.10, will be used to prove the Parametricity law.
Fig. 15 shows the parametricity pre-template , which formalises the textual representation (3). The right-hand side is straightforward. It simply consists of a bunch of encodings of two function abstractions, namely ‘’ and ‘’. The empty store becomes absent in the graphical representation.
The left-hand side contains a bunch of encodings of two function abstractions, i.e. ‘’ and ‘’, and also the graphical representation of the store ‘’ that consists of an atom edge and an edge labelled with the value ‘’. The variable ‘’ refers to the name ‘’, and therefore, the encodings of function abstractions are all connected to the atom edge via a contraction tree. Note that the variable ‘’ is distributed to the function abstractions, instead of the name ‘’, which is a subtle difference between the actual definition (Fig. 15) and the informal textual representation (3). The difference is due to the way the universal abstract machine handles substitution. It delays substitution for the variable ‘’, whereas the term reduction we used in Sec. 9.2.1 eagerly substitutes the name ‘’ for the variable ‘’.
The two operational pre-templates and are derived from local rewrite rules of active operations, with an extra condition. The one is for reference creation: if is a reference-creation rewrite rule (Fig. 9a) where (see the figure) is not an arbitrary hypernet but a stable hypernet. The other is for function application, defined in the same manner: if is a beta rewrite rule (Fig. 8b) where (see the figure) is not an arbitrary hypernet but a stable hypernet. The extra requirement of stable hypernets reflect the call-by-value nature of the language .
9.3. Using the partial characterisation theorem
9.3.1. Properties of compute transitions
The behaviour of the operation set is given by compute transitions for the active operations . These are all defined locally via the rewrite rules presented in Sec. 4.2: for function application in Fig. 8b, reference manipulation in Fig. 9, and arithmetic in Fig. 8a. We impose the same restriction on the rewrite rules for function application and reference creation as we introduced for the operational pre-templates in Sec. 9.2.2, that is, the hypernet in Fig. 8b and Fig. 9a is required to be stable.
To use the partial characterisation theorem (Thm. 6.20), the particular machine needs to be deterministic and refocusing. This boils down to determinism, and preservation of the rooted property, of the compute transitions for .
Compute transitions of operations are deterministic, because at most one rewrite rule can be applied to each state. In particular, the stable hypernet in the figures is uniquely determined (by Lem. D.1(3)).
As discussed in Sec. 5, copy transitions are all deterministic, because different contraction rules applied to a single state result in the same state. Compute transitions of name-accessing operations are deterministic for the same reason.
Compute transitions of all the operations are stationary, and hence they preserve the rooted property. The stationary property can be checked using local rewrite rules. Namely, in each rewrite rule of the operations, only one input of has type , and and . Moreover, any output of with type is a target of an atom edge or a box edge (by definition of stable hypernets), which implies is one-way.
Because any initial state is rooted, given that all transitions preserve the rooted property, we can safely assume that any state that arises in an execution is rooted. This means that the additional requirement of stable hypernets in local rewrite rules is in fact guaranteed to be satisfied in any execution (by Lem. C.5, Lem. D.6 and Lem. D.4).
9.3.2. Usage of the partial characterisation theorem
For all the pre-templates introduced so far, their output-closure can be easily checked, typically by spotting that an input or an output, of type , is a source or a target of a contraction, atom or box edge.
Table 1 summarises how we use Thm. 6.20 on the pre-templates. For example, \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} is a -template, as shown in the “template” column, and both itself and its converse are -robust relative to all rewrite transitions, as shown in the “robustness” columns. Thanks to the monotonicity (Remark 6.22), we can use Thm. 6.20(1) with a reasonable triple , and Thm. 6.20(2) with a reasonable triple . Consequently, H_{1}\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}H_{2} implies and , which is shown in the “implication of ” column.
Pre-templates that relate hypernets with no input of type are trivially a -template for any , and . The table uses ‘’ to represent this situation.
Typically, a reasonable triple for a pre-template can be found by selecting “bigger” parameters from those of input-safety and robustness, thanks to the monotonicity. However, the parametricity pre-template requires non-trivial use of the monotonicity. This is because the parameter that makes the pre-template robust, as the upper row in the “robustness” column shows, is not itself a reasonable triple. The lower row shows the alternative, bigger, parameter to which Thm. 6.20 can be applied.
In the table, cyan symbols indicate where a proof of input-safety or robustness relies on contextual refinement. The “dependency” column indicates which pre-templates can be used to prove the necessary contextual refinement, given that these pre-templates imply contextual refinement as shown elsewhere in the table. This reliance specifically happens in finding a quasi-specimen, using contextual refinements/equivalences via Lem. 6.26. In the case of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}, its input-safety and robustness are proved under the assumption that \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Assoc}} and \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Comm}} imply contextual equivalence .
The restriction to binding-free contexts plays a crucial role only in robustness regarding the operational pre-templates and . In fact, these pre-templates are input-safe with respect to both and . This gap reflects duplication behaviour on atom edges, which is only encountered in a proof of robustness. A shallow atom edge is never duplicated, whereas a deep one can be duplicated as a part of a box (which represents a thunk).
Later in Sec. 10.2 and Sec. 10.3, we will present how input-safety and robustness of pre-templates can be established, as indicated in Table 1.
9.4. Combining robust templates
Assuming that pre-templates are robust templates and hence imply contextual equivalences, in the way shown in Table 1, we combine the contextual equivalences and yield the Parametricity law. We start with combining the structural templates \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Idem}} and , and prove the following Weakening laws.
Proposition 9.3** (Weakening laws).**
- •
Given a derivable judgement such that ,
- •
Given a derivable judgement such that ,
where are the lengths of , respectively. (1) instead of , (2) instead of
Proof outline.
The proof is by induction on derivations. Base cases are for variables, atoms and constants. The proof for these cases are trivial, because any distributor with no inputs is simply a bunch of weakening edges (see Def. 5.7).
In inductive cases, we need to identify a single weakening edge with a certain (sub-)hypernet, namely: (i) a distributor whose sole input is connected to a weakening edge, (ii) a distributor whose two inputs are connected to weakening edges, and (iii) a distributor whose sole input is connected to a box edge, in which a weakening edge is connected to the corresponding output. The first two situations are for unary/binary operations and function application. These can be handled with the contextual equivalence implied by \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Idem}}. The third situation is for function abstraction, and it boils down to the first situation, thanks to the contextual equivalence implied by . ∎
The Parametricity law (Prop. 9.1) can now be proved as follows.
Proof of Prop. 9.1.
Let and be a de-sugared version of the left-hand side and the right-hand side of the Parametricity law, i.e.:
[TABLE]
The Parametricity law can be obtained as a chain of contextual equivalences whose outline is as follows.
[TABLE]
The leftmost and rightmost contextual equivalences are consequences of the Weakening laws (Prop. 9.3), because the terms and have no free variables nor free atoms. The middle contextual equivalence follows from the special case of the Parametricity law where the environment is empty, i.e. . The contextual equivalence is namely derived from another chain of contextual equivalences that is shown in Fig. 16, via the binding-free context that consists of a hole of type and weakening edges (i.e. and ). In Fig. 16, each contextual equivalence is accompanied by relevant templates, and preorders on natural numbers are omitted.
The contextual equivalences used in the whole chain are with respect to various preorders on natural numbers, and with respect to either all contexts or binding-free contexts for the operation set . However, all these contextual equivalences are smaller relations than the contextual equivalence , which defines the Parametricity law, and therefore they can be combined together to imply . This is thanks to the monotonicity of contextual equivalence with respect to preorders and anti-monotonicity with respect to sets of contexts (Remark 6.22). ∎
Table 2 summarises dependency of the Weakening laws and the Parametricity law on templates, which can be observed in the above proofs of the laws. The symbol ‘’ indicates direct dependency on templates, in the sense that a law can be proved by combining contextual equivalences implied by these templates. For example, the Weakening laws are obtained by combining two templates \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Idem}} and . Because the proof of the Parametricity law uses the Weakening law, the Parametricity law also depends on these two templates, which is indicated by ‘(W)’ in the table.
Note that two templates \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} and do not directly appear in the proof of the Parametricity law, in particular in the chain shown in Fig. 16. They are however necessary for robustness of the parametricity template (see Table 1). The Parametricity law depends on the two templates only indirectly, which is indicated by the symbol ‘’ in Table 2.
Finally, let us recall the textual chain (4) from Sec. 9.2.1. It can be seen as an informal presentation of the chain (5) of contextual equivalences, in particular the subsumed chain in Fig. 16. The textual chain extracts the core part of the graphical chain that uses key templates , and . Comparison between the textual and graphical chains helps us observe how the graphical chain is augmented by structural templates. In particular, the first and the last contextual equivalences of the chain in Fig. 16 simplify contraction trees that result from the encoding .
10. Local rewrite rules and transfer properties
The partial characterisation theorem reduces a proof of an observational equivalence down to establishing robust templates. As illustrated in Sec. 9.3.2, this typically boils down to checking input-safety of pre-templates, and checking robustness of pre-templates relative to rewrite transitions.
The key part of checking input-safety or robustness of a pre-template is to analyse how a rewrite transition involves edges (at any depth) of a state that are contributed by the pre-template. In this section, we focus on rewrite transitions that are locally specified by means of the contraction rules or rewrite rules (e.g. the beta rewrite rules), and identify some situation where these transitions involve the edges contributed by a pre-template in a safe manner. These situations can be formalised for arbitrary instances of the universal abstract machine, including the particular instance that is used in Sec. 9 to prove the Parametricity law.
This section proceeds as follows. Firstly, Sec. 10.1 formalises the safe involvement in terms of transfer properties. Sec. 10.2 establishes transfer properties for the particular pre-templates and local rewrite rules used in Sec. 9 to prove the Parametricity law. Finally, Sec. 10.3 demonstrates the use of the transfer properties, by providing details of checking input-safety and robustness to prove the Parametricity law.
10.1. Transfer properties
We refer to the contraction rules which locally specify copy transitions, and rewrite rules that locally specify rewrite transitions for active operations, altogether as -rules. To analyse how a -rule involves edges contributed by a pre-template, one would first need to check all possible overlaps between the local rule and the edges, and then observe how these overlaps are affected by application of the local rule. We identify safe involvement of the pre-template in the -rule, as the situation where the overlaps get only eliminated or duplicated without any internal modification.
We will first formalise safe involvement for a single application of -rules, and then for a pair of applications of -rules. The latter can capture safe involvement of edges contributed by a pre-template, which can be exploited to check input-safety and robustness of pre-templates.
Notation 10.1**.**
Let and . Given a sequence of length and a function , a sequence of length is given by for each .
Definition 10.2** (Transfer of hypernets).**
Let and be two sets of focus-free contexts, and be a set of focus-free hypernets. A -rule of a universal abstract machine transfers from to if, for any , any focussed context such that , and any focus-free hypernets () such that , there exist some , some focussed context , and some function , and the following holds.
- •
.
- •
.
- •
is a -rule, for any focus-free hypernets ().
This transfer property enjoys monotonicity in the following sense: if a -rule transfers from to , and , then the -rule transfers from to as well. If a -rule transfers from to the same , we say the -rule preserves in .
Given an operation set , we will be particularly interested in the following sets of hypernets and contexts for , some of which have already been introduced elsewhere: the set of all focus-free hypernets, the set of contraction trees, the set of all focus-free contexts, the set of all binding-free contexts, the set of all deep contexts, i.e. focus-free contexts whose holes are all deep.
Example 10.3** (Transfer/preservation of hypernets in contexts).**
- •
When a -rule preserves in , any deep edge of also appears as a deep edge in , and it also retains its neighbours. This is trivially the case if the -rule involves no box edges (and hence deep edges) at all. It is also the case if the -rule only eliminates or duplicates box edges without modifying deep edges. The contraction rules are an example of duplicating boxes.
- •
Preservation of deep edges can be restricted to binding-free positions, which are specified by binding-free contexts. When a -rule preserves in , any deep edge of in a binding-free position also appears as a deep edge in a binding-free position in .
- •
When a -rule transfers from to , any deep edge of also appears as an edge in , retaining its neighbours, but not necessarily as a deep edge. This is preservation of deep edges in a weak sense. It is the case when a -rule replaces a box edge with its contents, turning some deep edges into shallow edges without modifying their connection. The beta rewrite rules are an example of this situation.
- •
When a -rule preserves in , any contraction tree in also appears in . The contraction rules are designed to satisfy this preservation property. ∎
Definition 10.4** (Transfer of (rooted) specimens).**
Let and be two sets of focus-free contexts, and be a pre-template.
- •
A -rule of a universal abstract machine transfers specimens of from to if, for any -specimen of the form such that , there exist some focussed context and two sequences and of focus-free hypernets, and the following holds.
- –
.
- –
is a -rule.
- –
is a -specimen of .
- •
The -rule is said to transfer rooted specimens of from to if, in the above definition, the -specimen is restricted to yield two rooted states and .
If a -rule transfers specimens of from to the same , we say the -rule preserves specimens of in .
We can prove that certain transfer properties of hypernets imply the corresponding transfer properties of specimens, as stated in Prop. 10.6 below. These are primarily transfer of deep edges, and preservation of contraction trees. Prop. 10.6 below will simplify some part of establishing input-safety and robustness of a pre-template, because it enables us to analyse a single application of a -rule on a state, instead of a pair of applications of a -rule on two states induced by a specimen of the pre-template.
Definition 10.5** (Root-focussed -rules).**
A -rule is said to be root-focussed if it satisfies the following.
- •
has only one input.
- •
holds, i.e. the sole input of coincides with the source of the token.
- •
Every output of is reachable from the sole input of .
Proposition 10.6**.**
For any -rule of a universal abstract machine , the following holds.
- (1)
If it transfers from to , it transfers specimens of any pre-template from to . 2. (2)
If it preserves in , it preserves specimens of any pre-template in . 3. (3)
If it is root-focussed and transfers from to , it transfers rooted specimens of any output-closed pre-template from to . 4. (4)
If it preserves in , it preserves specimens of any pre-template , which is on contraction trees, in .
Proof of the point (1).
We take an arbitrary -specimen of the form
[TABLE]
of the pre-template , such that . Because the context of the specimen must be focussed, the token in the context is shallow. This means that the hole labelled with in the context must be shallow. On the other hand, the specimen satisfies , and hence . As a consequence, we have . Now, by the assumption, there exist some focussed context and some function , such that and hold, and moreover, is also a -rule. We obtain a triple . It satisfies , and is a -specimen of the pre-template . ∎
Proof of the point (2).
We take an arbitrary -specimen of the form
[TABLE]
of the pre-template , such that .
We first check that follows from , as follows.
- •
Because the context must be focussed, the token in the context is shallow. This means that the hole labelled with in the context must be shallow. This, combined with , implies .
- •
If the context contains a path that makes it not binding-free, the path is also a path in the context and makes the context not binding-free. Therefore, because holds, the context is without any path that makes the context not binding-free. This means .
By the assumption, there exist some focussed context and some function , such that and hold, and moreover, is also a -rule. We obtain a triple .
To conclude the proof, it suffices to prove that this triple is a -specimen of the pre-template , which boils down to showing .
We firstly prove . Because holds, the holes labelled with of the context must be all deep. This, together with , implies .
We then prove by contradiction, which will conclude the whole proof. Assume that the context is not binding-free. It has a path from a source of an edge that is either a contraction edge, an atom edge, a box edge or a hole edge, to a source of an edge that is a hole edge. Thanks to and , the hole edge of the context must be deep. We will infer a contradiction by case analysis on the hole edge . There are two cases.
- •
One case is when the edge of the context comes from the context . In this case, the edge is one of the deep hole edges labelled with . This means that the path in the context must consist of deep edges only, and these deep edges, together with the edge , must be contained in a box of the context . Therefore the path is also a path in the context , and it makes the context not binding-free. This contradicts .
- •
The other case is when the edge of the context comes from the context . In this case, the edge is a hole edge of the context , and hence a deep edge. This means that the path is also a path in the context , consisting of deep edges only. The path therefore makes the context not binding-free, which contradicts .
∎
Proof of the point (3).
We take an arbitrary -specimen of the form
[TABLE]
of the pre-template , such that holds, and two states and are both rooted.
We can first check , in the same way as the proof of the point (2). By the assumption, there exist some focussed context and some function , such that and hold, and moreover, is also a -rule. We obtain a triple .
To conclude the proof, it suffices to prove that this triple is a -specimen of the pre-template , which boils down to showing . We prove this by contradiction, as follows.
Assume that the context is not binding-free. It has a path from a source of an edge that is either a contraction edge, an atom edge, a box edge or a hole edge, to a source of an edge that is a hole edge. We will infer a contradiction by case analysis on the edge . There are two cases.
- •
One case is when the edge of the context comes from the context . In this case, the edge is one of the hole edges labelled with . Because of , the hole edge must be deep. This means that the path must consist of deep edges contained in a box of the context . The path is therefore a path in the context , and also in the context . This means , which is a contradiction.
- •
The other case is when the edge of the context comes from the context . In this case, we will infer a contradiction by further case analysis on the edge and the path . There are three (sub-)cases.
- –
The first case is when the edge comes from the context and is a path in the context. In this case, the path makes the context not binding-free, which contradicts .
- –
The second case is when the edge comes from the context and does not give a single path in the context. In this case, the edges and both come from the context , but is a valid path only in the whole context . This means that, in the context , a source of the hole edge labelled with is reachable from a target of the same hole edge.
Because the -rule is root-focussed, the focussed hypernet has only one input, the input coincides with the source of the token, and every output of the hypernet is reachable from the sole input. Moreover, because of , the same holds for the focussed context too, namely: the context has only one input, the input coincides with the source of the token, and every output of the context is reachable from the sole onput.
As a conseqence, in the focussed context , the token source is reachable from itself, via a cyclic path that contains some edges coming from the context including the token edge. This path is not an operation path. Therefore, by Lem. C.7(3), at least one of the states and is not rooted. This is a contradiction.
- –
The last case is when the edge comes from the context . Recall that the edge comes from the context . In this case, the path in the context has a prefix that gives a path in the context , from the same source of the edge as the path , to a source of the hole edge labelled with . Because the path is given as a part of the path in the context , the path in the context does not itself contain the hole edge labelled with .
Because the -rule is root-focussed, the focussed hypernet has only one input, and the input coincides with the source of the token. Moreover, because of , the same holds for the focussed context too, namely: the context has only one input, and the input coincides with the source of the token.
As a consequence, the path in turn gives a path in the focussed context , from the same source of the edge as the path , to the source of the token. The first edge of this path is not an operation edge, and therefore the path is not an operation path. By Lem. C.7(3), at least one of the states and is not rooted. This is a contradiction.
∎
Proof of the point (4).
We take an arbitrary -specimen of the form
[TABLE]
of the pre-template , such that . All the hypernets in are elements of , i.e. contraction trees. It trivially holds that . Therefore, by the assumption, there exist some focussed context and some function , such that and hold, and moreover, is also a -rule. We obtain a triple , and this is a -specimen of the pre-template . ∎
10.2. Transfer properties for the Parametricity law
We can now establish transfer properties of deep edges and contraction trees for the particular machine which is used to prove the Parametricity law.
Proposition 10.7**.**
The universal abstract machine satisfies the following.
- (1)
The contraction rules and all rewrite rules transfer from to . 2. (2)
The contraction rules, and all rewrite rules except for the beta rewite rules, preserve in . 3. (3)
The beta rewrite rules transfer from to . 4. (4)
The contraction rules and all rewrite rules preserve in .
Sketch of the proof.
We can prove the four points by analysing each -rule, i.e. a contraction rule or a local rewrite rule for an active operation, of the universal abstract machine .
Firstly, the only way in which a contraction rule involves deep edges is to have them inside the hypernet to be duplicated ( in Fig. 7). The deep edges and their connection are all preserved, and replacing these edges with arbitrary deep edges still enables the contraction rule. The point (1) therefore holds. Additionally, any path to a source of a deep edge must consist of deep edges only, and if such a path appears in the result of a contraction rule , the path necessarily appears in the original hypernet too. Therefore, if the contraction rule moves deep edges out of binding-free positions, these edges must not be at binding-free positions beforehand. This is a contradiction, and the point (2) holds. As for contraction trees, whenever a contraction rule involves a contraction tree, the tree is either deep and gets duplicated, or shallow and left unmodified. Replacing the contraction tree with another contraction tree still enables the contraction rule that duplicates the same hypernet. The point (4) therefore holds.
Secondly, we analyse the beta rewrite rules. Whenever deep edges are involved in a beta rewrite rule, they must be inside the box edge that gets opened (i.e. in Fig. 8b). These deep edges may be turned into shallow edges, but their connection is unchanged. The difference of deep edges does not affect application of the rule, and hence the point (1) holds. If these deep edges are at binding-free positions, they remain at binding-free positions after applying the beta rewrite rule, for a similar reason as the contraction rules. The point (3) therefore holds. As for contraction trees, the only way in which contraction trees get involved in a beta rewrite rule is for them to be deep. The point (4) reduces to the point (1) for beta rewrite rules.
The rest of the local rewrite rules involve no deep edges at all, and therefore points (1) and (2) trivially hold. These rules either involve no contraction trees, or involve shallow contraction trees without any modification. The difference of contraction trees does not affect application of the rules. The point (4) therefore holds. ∎
Corollary 10.8**.**
In the universal abstract machine , the contraction rules and all rewrite rules satisfies the following.
- (1)
The rules transfer specimens of any pre-template from to . 2. (2)
The rules transfer rooted specimens of any output-closed pre-template from to . 3. (3)
The rules preserve specimens of any pre-template , which is on contraction trees, in .
Proof.
This is a consequence of Prop. 10.7 and Prop. 10.6, noting that the beta rewrite rules are root-focussed and preserving in implies transferring from to . ∎
10.3. Input-safety and robustness for the Parametricity law
In this section we give some details of proving input-safety and robustness of the pre-templates for the Parametricity law, as indicated in Table 1. The proofs exploit the transfer properties established in Cor. 10.8.
Fig. 17 lists triggers that we use to prove input-safety and robustness of some of the pre-templates333The numbering of triggers is according to the one used in [Mur20, Sec. 4.5.5]. Some of triggers in loc. cit. are for observational equivalences that we do not consider in this paper, and hence not presented here.. Table 3 shows contextual refinements/equivalences implied by these triggers (in the “implication” column), given that some pre-templates (shown in the “dependency” column) imply contextual refinement as shown in Table 1. All the implications can be proved simply using the congruence property and transitivity of contextual refinement. Table 3 shows which pre-template requires each trigger in its proof of input-safety or robustness (in the “used for” column). Note that the converse of any trigger is again a trigger.
Recall that there is a choice of contraction trees upon applying a contraction rule and some of the local rewrite rules. The minimum choice is to collect only contraction edges whose target is reachable from the token target. The maximum choice is to take the contraction tree(s) so that no contraction or weakening edge is incoming to the unique hole edge in a context.
10.3.1. Pre-templates on contraction trees
First we check input-safety and robustness of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Assoc}}, \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Comm}} and \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Idem}}, which are all on contraction trees.
Input-safety of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Assoc}} and \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}\mathrm{Comm}} can be checked as follows. Given a -specimen with an entering search token, because any input of a contraction tree is a source of a contraction edge, we have:
[TABLE]
It can be observed that a rewrite transition is possible in if and only if a rewrite transition is possible in . When a rewrite transition is possible in both states, we can use Cor. 10.8(3), by considering a maximal possible contraction rule. The results of the rewrite transition can be given by a new quasi--specimen up to (here denotes equality on states). When no rewrite transition is possible, both of the states are not final but stuck.
Robustness of the three pre-templates and their converse can also be proved using Cor. 10.8(3), by considering a maximal possible local (contraction or rewrite) rule in each case.
10.3.2. Input-safety of pre-templates not on
contraction trees
As mentioned in Sec. 9.2.2, pre-templates that relate hypernets with no input of type are trivially input-safety for any parameter . This leaves us pre-templates \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}, , and to check.
As for \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}, note that the pre-template \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} relates hypernets with at least one input. Any -specimen of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} with an entering search token can be turned into the form where is a positive number. The proof is by case analysis on the number .
- •
When , we have:
[TABLE]
We can take as a -specimen, and the token in is not entering.
- •
When , the token target must be a source of a contraction edge. There exist a focus-free context , two focus-free hypernets H^{\prime 1}\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}H^{\prime 2} and a focus-free hypernet , such that
[TABLE]
and given by the trigger via Lem. 6.26. The results of these sequences give a quasi--specimen up to .
A proof of input-safety of the operational pre-templates and is a simpler version of that of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}, because the operational pre-templates relate hypernets with only one input.
Let be either or . Any -specimen of an operational pre-template with an entering search token can be turned into the form ; note that the parameter that we had for \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} is redundant in . We have:
[TABLE]
We can take as a -specimen, and the token in is not entering. This data gives a -specimen when , which follows from the closedness of with respect to plugging (Lem. E.1). Note that can be seen as a context with no holes, which is trivially binding-free.
Finally, we look at the parametricity pre-template . Any -specimen of this pre-template, with an entering search token, can be turned into the form where is a positive number. The token target is a source of an edge labelled with , so we have:
[TABLE]
The results of these sequences give a quasi--specimen up to .
10.3.3. Robustness of pre-templates not on
contraction trees: a principle
Robustness can be checked by inspecting rewrite transition from the state given by a specimen of a pre-template, where the token of is not entering. We in particular consider the minimum local (contraction or rewrite) rule applied in this transition. This means that, in the hypernet , every vertex is reachable from the token target.
The inspection boils down to analyse how the minimum local rule involves edges that come from the hypernets . If all the involvement is deep, i.e. only deep edges from are involved in the local rule, these deep edges must come via deep holes in the context . We can use Cor. 10.8(1).
If the minimum local rule involves shallow edges that are from , endpoints of these edges are reachable from the token target. This means that, in the context , some holes are shallow and their sources are reachable from the token target. Moreover, given that the token is not entering in , the context has a path from the token target to a source of a hole edge.
For example, in checking robustness of with respect to copy transitions, one situation of shallow overlaps is when is in the form in Fig. 18, and some of the box edges are from . Taking the minimum contraction rule means that in the graph is a contraction tree that gives a path from the token target. This path followed by the operation edge corresponds to paths from the token target to hole sources in the context .
So, if the minimum local rule involves shallow edges that are from , the context necessarily has a path from the token target to a hole source. The path becomes a path in the state , from the token target to a source of an edge that is from . The edge is necessarily shallow, and also involved in the application of the minimum local rule, because of the connectivity of . Moreover, a source of the edge is an input, in the relevant hypernet of . By inspecting minimum local rules, we can enumerate possible labelling of the path and the edge , as summarised in Table 4. Explanation on the notation used in the table is to follow.
We use the regular-expression like notation in Table 4. For example, (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+}\cdot\mathbb{O}^{\mathrm{ex}} represents finite sequences of edge labels, where more than one occurrences of the label \mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}} is followed by one operation . This characterises paths that inhabit the overlap shown in Fig. 18, i.e. the contraction tree followed by the operation edge . Note that this regular-expression like notation is not a proper regular expression, because it is over the infinite alphabet , the edge label set, and it accordingly admits infinite alternation (aka. union) implicitly.
To wrap up, checking robustness of each pre-template that is not on contraction trees boils down to using Cor. 10.8(1) and/or analysing the cases enumerated in the table above.
10.3.4. Robustness of and its converse
Robustness check of the pre-template \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} with respect to copy transitions has two cases. The first case is when one shallow overlap is caused by a path characterised by (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+}, and the second case is when no shallow overlaps are present and Cor. 10.8(1) can be used.
In the first case, namely, a -specimen with a non-entering rewrite token can be turned into the form where is a positive number, and is a focussed context in the form of Fig. 19. A rewrite transition is possible on both states given by the specimen, in which a contraction rule is applied to and . Results of the rewrite transition give a new quasi--specimen. When , this quasi-specimen is up to , using the trigger . When , the quasi-specimen is also up to , but using the trigger .
Robustness check of the pre-template \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}} with respect to rewrite transitions always boils down to Cor. 10.8(1). This is intuitively because no local rewrite rule of operations involves any shallow contraction edge of type .
Robustness of (\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}})^{-1} can be checked in a similar manner. Namely, using Table 4, shallow overlaps are caused by paths:
[TABLE]
from the token target. All paths but (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+} gives rise to a state that is not rooted, which can be checked using Lem. 6.24. This reduces the robustness check of (\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}})^{-1} to that of \vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}}.
10.3.5. Robustness of
and its converse
These two pre-templates both relate hypernets with no inputs. Proofs of their robustness always boil down to the use of Cor. 10.8(1), following the discussion in Sec. 10.3.3. Namely, it is impossible to find the path in the context from the token target to a hole source.
10.3.6. Robustness of
, and , and their converse
These six pre-templates all concern boxes. Using Table 4, shallow overlaps are caused by paths (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+}\cdot\mathbb{O}^{\mathrm{ex}} and from the token target.
Robustness check with respect to compute transitions of operations always boil down to Cor. 10.8(1).
As for compute transitions of the operation ‘’, either one path causes one shallow overlap, or all overlaps are deep. The latter situation boils down to Cor. 10.8(1). In the former situation, a beta rule involves one box that is contributed by a pre-template, and states given by a -specimen are turned into a quasi--specimen up to , by one rewrite transition.
As for copy transitions, there are two possible situations.
- •
Paths (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+}\cdot\mathbb{O}^{\mathrm{ex}} cause some shallow overlaps and there are some deep overlaps too.
- •
All overlaps are deep, which boils down to Cor. 10.8(1).
In the first situation, some of the shallow boxes duplicated by a contraction rule are contributed by a pre-template, and other duplicated boxes may have deep edges contributed by the pre-template. By tracking these shallow and deep contributions in a contraction rule, it can be checked that one rewrite transition turns states given by a -specimen into a quasi--specimen. This quasi-specimen is up to the following, depending on pre-templates:
- •
for and its converse,
- •
for , and for its converse, using the trigger , and
- •
for , and for its converse, using the trigger .
10.3.7. Robustness of operational pre-templates and their
converse
For the operational pre-templates and their converse, we use the class of binding-free contexts. This restriction is crucial to rule out some shallow overlaps.
Using Table 4, shallow overlaps with the operational pre-templates and are caused by paths (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+} from the token context. However, the restriction to binding-free contexts makes this situation impossible, which means the robustness check always boils down to Cor. 10.8(1) and Cor. 10.8(2).
In checking robustness of the converse and , shallow overlaps are caused by paths:
[TABLE]
from the token target. Like the case of (\vartriangleleft^{\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}})^{-1}, all paths but (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+} give rise to a state that is not rooted, which can be checked using Lem. 6.24. The paths (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+} are impossible because of the binding-free restriction. As a result, this robustness check also boils down to Cor. 10.8(1) and Cor. 10.8(2).
10.3.8. Robustness of the parametricity pre-template
and its converse
These two pre-templates concern lambda-abstractions, and they give rather rare examples of robustness check where we compare different numbers of transitions.
Using Table 4, shallow overlaps with these pre-templates are caused by paths:
[TABLE]
from the token target.
As for compute transitions of operations , there are two possible situations.
- •
Shallow overlaps are caused by paths or .
- •
There is no overlap at all, which boils down to Cor. 10.8(1).
In the first situation, a stable hypernet of a local rewrite rule (see e.g. Fig. 9a) contains shallow edges, labelled with , that are contributed by a pre-template. The overlapped shallow contributions are not modified at all by the rewrite rule, and consequently, one rewrite transition results in a quasi--specimen up to .
As for copy transitions, either one path (\mathrel{\raisebox{0.2pt}{\scalebox{0.6}{\boldmath\otimes}}}^{\star}_{\mathsf{C}})^{+} causes one shallow overlap, or all overlaps are deep. The latter situation boils down to Cor. 10.8(1). In the former situation, one lambda-abstraction contributed by a pre-template gets duplicated. Namely, a -specimen with a non-entering rewrite token can be turned into the form where is a focussed context in the form of Fig. 19. There exist a focussed context and two hypernets such that:
[TABLE]
Results of these rewrite transitions give a new quasi--specimen up to .
As for compute transitions of the operation ‘’, there are two possible situations.
- •
One path causes a shallow overlap of the edge that has label and gets eliminated by a beta rewrite rule, and possibly some other paths cause shallow overlaps in the stable hypernet (see Fig. 8b).
- •
There are possibly deep overlaps, and paths may cause shallow overlaps in the stable hypernet .
In the second situation, all overlaps are not modified at all by the beta rewrite rule, except for some deep overlaps turned shallow. Consequently, one rewrite transition results in a quasi--specimen up to .
In the first situation, one lambda-abstraction contributed by the pre-template is modified, while all the other shallow overlaps (if any) are not. We can focus on the lambda-abstraction. The beta rewrite acts on the lambda-abstraction, an edge labelled with ‘’, and the stable hypernet .
The involved lambda-abstraction can be in two forms (see Fig. 15). Firstly, it contains funcion application in its body. Application of the beta rule discloses the inner function application, whose function side is another lambda-abstraction that can be related by the pre-template again. As a result, one rewrite transition yields a quasi--specimen up to . Secondly, the involved lambda-abstraction consists of dereferencing ‘’ or constant ‘’. Application of the beta rule discloses the dereferencing, or constant, edge. When it is the dereferencing edge that is disclosed, the beta rule is followed by a few transitions to perform dereferencing and produce the same constant ‘’. As a result, we compare nine transitions with one transition, and obtain a quasi--specimen up to , using triggers and .
The case of the converse of is similar. The only difference is that, in the last situation described above where we compare nine transitions with one transition and obtain a quasi--specimen up to , we obtain a quasi--specimen up to as a result of the symmetrical comparison of transitions.
On a final note, let us recall Rem. 9.2, where we observed some situations of robustness for using informal reduction semantics on terms. We namely observed situations where parts related by the pre-template is subject to the standard call-by-value beta reduction, either as an argument or a function. The involvement as a function corresponds to one of the robustness situations described in this section, namely: a shallow overlap with a beta rewrite rule that is caused by a path and modified by the rewrite rule. The other involvement, which is as an argument, corresponds to a combination of two situations described in this section, namely: any overlaps with a contraction rule, and shallow overlaps with a beta rule that are caused by paths and preserved. The combination is due to the fact that the universal abstraction machine decomposes the beta reduction into the beta rewrite rule and contraction rules, making substitution explicit and not eager.
11. Related and future work
Our Spartan UAM is motivated by a need for a flexible and expressive framework in which a wide variety of effects can be given a cost-accurate model. As discussed, this opens the door to a uniform study of operations and their interactions. Defining new styles of abstract machines is a rich and attractive vein of research. The “monoidal computer” of Pavlovic [Pav13] or the “evolving algebras” of Gurevich [Gur18] are such examples. What sets the Spartan UAM apart is the fact that it can be used, rather conveniently, for reasoning robustly about observational equivalence.
It is well known that observational equivalence is fragile, in the sense that the validity of an equation depends not only on the operations involved in the operation itself but also on the operations used in the context. Therefore, the addition of any new operation to a language may invalidate some or all of its equations even if they do not concern the operation itself. Consequently, reasoning about “real-life” languages must be constrained to a syntactically idealised subset which cannot support exotic extrinsics (e.g. , discussed earlier) or, more significantly, foreign function calls, which can be seen as a very general, programmable form of extrinsic operation. This is, in some sense, unavoidable.
An even more troubling aspect of observational equivalence is that the proof methods themselves are fragile, in the sense that language additions may violate the very principles on which proofs are constructed. This issue is addressed by Dreyer et al. [DNB12] by carefully distinguishing between various kinds of operations (state vs. control). Using Kripke relations, Dreyer et al. [DNRB10] goes beyond an enumerative classification of effects to using characterisations of effects in the aid of reasoning; the notion of island introduced in loc. cit. has similar intuitions to our robustness property. More radical approaches are down to replacing the concept of a syntactic context with an “epistemic” context, akin to a Dolev-Yao-style attacker by Ghica and Tzevelekos [GT12], or by characterising combinatorially the interaction between a term and its context as is the case with the game semantics [AJM13, HO00] or the trace semantics [JR05]. Our paper is a new approach to this problem which, by using a uniform graph representation of the the program and its configuration, can take the useful notion of locality beyond state, the overall program.
We argue that our approach is both flexible and elementary. A specific version [MCG18] of this formalism has been used to prove, for example, the soundness of exotic operations involved in (a functional version of) Google’s TensorFlow machine learning library. Even though the proofs can seem complicated, this is in part due to the graph-based formalism being new, and in part due to the fact that proofs of equivalences are lengthy case-based analyses. But herein lies the simplicity and the robustness of the approach, avoiding the discovery of clever-but-fragile language invariants which can be used to streamline proofs. Our tedious-but-elementary proofs on the other hand seem highly suitable for automation – this is for future work.
The original motivation of the DGoI machine [MG17] was to produce an abstract machine that expresses the computational intuitions of the GoI while correctly modelling the cost of evaluation, particularly for call-by-value and call-by-need. Although the Universal Abstract Machine framework does not aim at efficiency, one can think of a cost model of the machine in a similar way as the DGoI machine. Moreover, the indexing of observational equivalence with a preorder representing the number of steps gives a direct avenue for modelling and comparing computation costs. For example, the beta law (Tbl. 1) is indexed by the normal order on , which indicates that one side always requires fewer steps than the other in the evaluation process. The only details to be resolved are associating costs (time and space) with steps, in particular different costs for different operations.
Another strand of research related to ours is work on defining programming languages with effects. We are not so much interested in “simulated effects”, which are essentially the encoding of effectful behaviour into pure languages, and which can be achieved via monads [Wad98], but we are interested in genuine “native” effects which happen outside of the language. Semantically this has been introduced by Plotkin and Power [PP08] and more recently developed by Møgelberg and Staton [MS11]. Spartan in some sense takes the idea to the extreme by situating all operations (pure or effectful) outside of the program, in “the world”, and keeping as intrinsic to the language only the structural aspects of copying vs. sharing, and scheduling of computation via thunking.
Nested (hyper)graphs are inspired by the exponential boxes of proof nets, a graphical representation of linear logic proofs [Gir87] and have an informal connection to Milner’s bigraphs [Mil01]. Exponential boxes can be formalised by parametrising an agent (which corresponds to an edge in our setting) by a net, as indicated by Lafont [Laf95]. In the framework of interaction nets [Laf90] that subsume proof nets, agents can be coordinated to represent a boundary of a box, as suggested by Mackie [Mac98]. An alternative representation of boxes that use extra links between agents is proposed by Accattoli and Guerrini [AG09]. Our graphical formulation of boxes shares the idea with the first parametrising approach, but we have flexibility regarding types of a box edge itself and its content (i.e. the hypernet that labels it). We use box edges to represent thunks, and a box edge can have less targets than outputs of its contents, reflecting the number of bound variables a thunk has. This generalised box structure is also studied by Drewes et al. [DHP02] as hierarchical graphs, in the context of double-pushout graph transformation (DPO) [Roz97], an well-established algebraic approach to graph rewriting. More recently, Alvarez-Picallo et al. have formulated DPO rewriting for a class of hypernets similar to those used here [AGSZ21, Sec. 4]; this paper further relates hypernets with string diagrams with “functorial boxes” in the style of Melliès [Mel06].
Interaction nets are another established framework of graph rewriting, in which various evaluations of pure lambda-calculus can be implemented [Sin05, Sin06]. The idea of having the token to represent an evaluation strategy can be found in loc. cit., which suggests that our focussed rewriting on hypernets could be implemented using interaction nets. However, the local reasoning we are aiming at with focussed rewriting does not seem easy in the setting of interaction nets, because of technical subtleties observed in loc. cit.; namely, a status of evaluation is remembered by not only the token but also some other agents around an interaction net.
Another future work is the introduction of a more meaningful type system for Spartan hypernets. The current type system of Spartan hypernets is very weak, and we consider it a strength of the approach that equivalences can be proved without the aid of a powerful type infrastructure. On the other hand, in order to avoid stuck configurations and ensure safety of evaluation, types are required. The usage of more expressive types is perfectly compatible with Spartan, and is something we intend to explore. In particular we would like to study notions of typing which are germane to Spartan, capturing its concepts of locality and robustness.
Beyond types, if we look at logics there are some appealing similarities between Spartan and separation logic [Rey02]. The division of nodes into copying nodes via variables and sharing nodes via atoms is not accidental, and their different contraction properties match those from bunched implications [OP99]. On a deeper level, the concepts of locality and in particular robustness developed here are related to the “frame” rule of separation logic.
Finally, our formulation of equivalence has some self-imposed limitations needed to limit the complexity of the technical presentation. We are hereby concerned with sequential and deterministic computation. Future work will show how these restrictions can be relaxed. Parallelism and concurrency can be naturally simulated using multi-token reductions, as inspired by the multi-token GoI machine of Dal Lago et al. [DTY17], whereas nondeterminism (or probabilistic evaluation) requires no changes to the machinery but rather a new definition of observational equivalence. This is work we are aiming to undertake in the future.
Appendix A Equivalent definitions of hypernets
Informally, hypernets are nested hypergraphs, and one hypernet can contain nested hypergraphs up to different depths. This intuition is reflected by Def. 3.5 of hypernets, in particular the big union in \mathcal{H}_{k+1}(L,M)=\mathcal{H}\Bigl{(}L,M\cup\bigcup_{i\leq k}\mathcal{H}_{i}(L,M)\Bigr{)}. In fact, the definition can be replaced by a simpler, but possibly less intuitive, definition below that does not explicitly deal with the different depths of nesting.
Definition A.1**.**
Given sets and , a set is defined by induction on :
[TABLE]
and hence a set .
Lemma A.2**.**
Given arbitrary sets and , any two numbers satisfy .
Proof.
If , the inclusion trivially holds. If not, i.e. , it can be proved by induction on . The key reasoning principle we use is that implies .
In the base case, when (and ), we have
[TABLE]
In the inductive case, when (and ), we have
[TABLE]
where the inclusion is by induction hypothesis on . ∎
Proposition A.3**.**
Any sets and satisfy for any , and hence .
Proof.
We first prove by induction on . The base case, when , is trivial. In the inductive case, when , we have
[TABLE]
The other direction, i.e. , can be also proved by induction on . The base case, when , is again trivial. In the inductive case, we have
[TABLE]
∎
Given a hypernet , by Lem. A.2 and Prop. A.3, there exists a minimum number such that , which we call the “minimum level” of .
Lemma A.4**.**
Any hypernet has a finite number of shallow edges, and a finite number of deep edges.
Proof.
Any hypernet has a finite number of shallow edges by definition. We prove that any hypernet has a finite number of deep edges, by induction on minimum level of the hypernet.
When , the hypernet has ho deep edges.
When , each hypernet that labels a shallow edge of belongs to , and therefore its minimum level is less than . By induction hypothesis, the labelling hypernet has a finite number of deep edges, and also a finite number of shallow edges. Deep edges of are given by edges, at any depth, of any hypernet that labels a shallow edge of . Because there is a finite number of the hypernets that label the shallow edges of , the number of deep edges of is finite. ∎
Remark A.5**.**
Another, slightly different, definition of hypernets is given in [AGSZ21, Sec. 4]. ∎
Appendix B Plugging
An interfaced labelled monoidal hypergraph can be given by data of the following form: where is the input list, is the output list, is the set of all the other vertices, is the set of edges, defines source and target lists, and is labelling functions.
Definition B.1** (Plugging).**
Let and be contexts, such that the hole and the latter context have the same type and . The plugging is a hypernet given by data such that:
[TABLE]
where is the hole edge labelled with , and denotes the -th element of a list.
In the resulting context , each edge comes from either or . If a path in does not contain the hole edge , the path gives a path in . Conversely, if a path in consists of edges from only, the path gives a path in .
Any path in gives a path in . However, if a path in consists of edges from only, the path does not necessarily give a path in . The path indeed gives a path in , if sources and targets of the hole edge are distinct in (i.e. the hole edge is not a self-loop).
Appendix C Rooted states
Lemma C.1**.**
Let is an abstract rewriting system that is deterministic.
- (1)
For any such that and are normal forms, and for any , if there exist two sequences and , then these sequences are exactly the same. 2. (2)
For any such that is a normal form, and for any such that and , if there exists a sequence , then its -th rewrite and -th rewrite satisfy .
Proof.
The point (1) is proved by induction on . In the base case, when (i.e. ), the two sequences are both the empty sequence, and . The inductive case, when , falls into one of the following two situations. The first situation, where or , boils down to the base case, because must be a normal form itself, which means . In the second situation, where and , there exist elements such that and . Because is deterministic, follows, and hence by induction hypothesis on , these two sequences are the same.
The point (2) is proved by contradiction. The sequence from to the normal form is unique, by the point (1). If its -th rewrite and -th rewrite satisfy , determinism of the system implies that these two rewrites are the same. This means that the sequence has a cyclic sub-sequence, and by repeating the cycle different times, one can yield different sequences of rewrites from to . This contradicts the uniqueness of the original sequence . ∎
Lemma C.2**.**
If a state is rooted, a search sequence ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{G}} from the initial state to the state is unique. Moreover, for any -th search transition and -th search transition in the sequence such that , these transitions do not result in the same state.
Proof.
Let be the set of states with search or value token. We can define an abstract rewriting system of “reverse search” by: if {\dot{H^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{H}}. Any search sequence corresponds to a sequence of rewrites in this rewriting system.
The rewriting system is deterministic, i.e. if {\dot{H^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{H}} and {\dot{H^{\prime\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{H}} then , because the inverse of the interaction rules (Fig. 6) is deterministic.
If a search transition changes a token to a search token, the resulting search token always has an incoming operation edge. This means that, in the rewriting system , initial states are normal forms. Therefore, by Lem. C.1(1), if there exist two search sequences from the initial state to the state , these search sequences are exactly the same. The rest is a consequence of Lem. C.1(2). ∎
Lemma C.3**.**
For any hypernet , if there exists an operation path from an input to a vertex, the path is unique. Moreover, no edge appears twice in the operation path.
Proof.
Given the hypernet whose set of (shallow) vertices is , we can define an abstract rewriting system of “reverse connection” by: if there exists an operation edge whose unique source is and targets include . Any operation path from an input to a vertex in corresponds to a sequence of rewrites in this rewriting system.
This rewriting system is deterministic, because each vertex can have at most one incoming edge in a hypergraph (Def. 3.2) and each operation edge has exactly one source. Because inputs of the hypernet have no incoming edges, they are normal forms in this rewriting system. Therefore, by Lem. C.1(1), an operation path from any input to any vertex is unique.
The rest is proved by contradiction. We assume that, in an operation path from an input to a vertex, the same operation edge appears twice. The edge has one source, which either is an input of the hypernet or has an incoming edge. In the former case, the edge can only appear as the first edge of the operation path , which is a contradiction. In the latter case, the operation edge has exactly one incoming edge in the hypernet . In the operation path , each appearance of the operation edge must be preceded by this edge via the same vertex. This contradicts Lem. C.1(2). ∎
Lemma C.4**.**
For any rooted state , if its token source (i.e. the source of the token) does not coincide with the unique input, then there exists an operation path from the input to the token source.
Proof.
By Lem. C.2, the rooted state has a unique search sequence ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{G}}. The proof is by the length of this sequence.
In the base case, where , the state itself is an initial state, which means the input and token source coincide in .
In the inductive case, where , there exists a state such that ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k-1}{\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}}. The proof here is by case analysis on the interaction rule used in {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}}.
- •
When the interaction rules (1a), (1b), (2) or (5b) is used (see Fig. 6), the transition {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}} only changes a token label.
- •
When the interaction rule (3) is used, the transition {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}} turns the token and its outgoing operation edge into an operation edge and its outgoing token. By induction hypothesis on , the token source coincides with its input, or there exists an operation path from the input to the token source, in .
In the former case, in , the source of the operation edge coincides with the input. The edge itself gives the desired operation path in .
In the latter case, the operation path from the input to the token source in does not contain the outgoing operation edge of the token; otherwise, the edge must be preceded by the token edge in the operation path , which is a contradiction. Therefore, the operation path in is inherited in , becoming a path from the input to the source of the incoming operation edge of the token. In the state , the path followed by the edge yields the desired operation path.
- •
When the interaction rule (4) is used, the transition {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}} changes the token from a -th outgoing edge of an operation edge to a -th outgoing edge of the same operation edge , for some . In , the token source is not an input, and therefore, there exists an operation path from the input to the token source, by induction hypothesis.
The operation path ends with the operation edge , and no outgoing edge of the edge is involved in the path ; otherwise, the edge must appear more than once in the path , which is a contradiction by Lem. C.3. Therefore, the path is inherited exactly as it is in , and it gives the desired operation path.
- •
When the interaction rule (5a) is used, by the same reasoning as in the case of rule (4), has an operation path from the input to the token source, where the incoming operation edge of the token appears exactly once, at the end. Removing the edge from the path yields another operation path from the input in , and it also gives an operation path from the input to the token source in .
∎
Lemma C.5**.**
For any state with a -token such that , if is rooted, then there exists a search sequence ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{G}}\rangle_{?/\mathsf{t}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{+}{\dot{G}}.
Proof.
By Lem. C.2, the rooted state has a unique search sequence ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{G}}. The proof is to show that a transition from the state appears in this search seqeunce, and it is by the length of the search sequence.
Because does not have a search token, is impossible, and therefore the base case is when . The search transition ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}} must use one of the interaction rules (1a), (1b), (2) and (5b). This means .
In the inductive case, where , there exists a state such that ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k-1}{\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}}. The proof here is by case analysis on the interaction rule used in {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}}.
- •
When the interaction rule (1a), (1b), (2) or (5b) is used, .
- •
Because does not have a search token, the interaction rules (3) and (4) can be never used in {\dot{G^{\prime}}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{G}}.
- •
When the interaction rule (5a) is used, has a value token, which is a -th outgoing edge of an operation edge , for some . The operation edge becomes the outgoing edge of the token in . By induction hypothesis on , we have
[TABLE]
If , in , the token is the only outgoing edge of the operation edge . Because is not an initial state, it must be a result of the interaction rule (3), which means the search sequence (A) is factored through as:
[TABLE]
If , for each , let be a state with a search token, such that and the token is an -th outgoing edge of the operation edge . This means . The proof concludes by combining the following internal lemma with (A), taking as .
Lemma C.6**.**
For any , if there exists such that ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{h}{\dot{N_{m}}}, then it is factored through as ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{G}}\rangle_{?/\mathsf{t}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{+}{\dot{N_{m}}}.
Proof.
By induction on . In the base case, when , the token of is the first outgoing edge of the operation edge . This state is not initial, and therefore must be a result of the interaction rule (3), which means
[TABLE]
In the inductive case, when , the state is not an initial state and must be a result of the interaction rule (4), which means
[TABLE]
The first half of this search sequence, namely ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{N_{m-1}}}\rangle_{\checkmark/?}, consists of transitions. Therefore, by (outer) induction hypothesis on , we have
[TABLE]
The first part, namely ?;|{\dot{G}}|\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}{\dot{N_{m-1}}}, consists of less than transitions. Therefore, by (inner) induction hypothesis on , we have
[TABLE]
∎
∎
Lemma C.7**.**
- (1)
For any state , if it has a path to the token source that is not an operation path, then it is not rooted. 2. (2)
For any focus-free hypernet and any focussed context with one hole edge, such that is a state, if the hypernet is one-way and the context has a path to the token source that is not an operation path, then the state is not rooted. 3. (3)
For any -specimen of an output-closed pre-tepmlate , if the context has a path to the token source that is not an operation path, then at least one of the states and is not rooted.
Proof of the point (1).
Let be the path in to the token source that is not an operation path. The proof is by contradiction; we assume that is a rooted state.
Because of , the token source is not an input. Therefore by Lem. C.4, the state has an operation path from its unique input to the token source. This operation path contradicts the path , which is not an operation path, because each operation edge has only one source and each vertex has at most one incoming edge. ∎
Proof of the point (2).
Let be the path in to the token source that is not an operation path.
If the path contains no hole edge, it gives a path in the state to the token source that is not an operation path. By the point (1), the state is not rooted.
Otherwise, i.e. if the path contains a hole edge, we give a proof by contradiction; we assume that the state is rooted. We can take a suffix of the path , so that it gives a path from a target of a hole edge to the token source in , and moreover, gives a path from a source of an edge from to the token source in . This implies the token source is not an input, and therefore by Lem. C.4, the state has an operation path from its unique input to the token source. This operation path must have has a suffix, meaning is also an operation path, because each operation edge has only one source and each vertex has at most one incoming edge. Moreover, must have an operation path from an input to an output, such that the input and the output have type and the path ends with the first edge of the path . This contradicts being one-way. ∎
Proof of the point (3).
Let be the path in to the token source that is not an operation path.
If the path contains no hole edge, it gives a path in the states and to the token source that is not an operation path. By the point (1), the states are not rooted.
Otherwise, i.e. if the path contains a hole edge, we can take a suffix of that gives a path from a source of a hole edge to the token source in , so that the path does not contain any hole edge. We can assume that the hole edge is labelled with , without loss of generality. The path gives paths and to the token source, in contexts and , respectively. The paths and are not an operation path, because they start with the hole edge labelled with .
Because is output-closed, or is one-way. By the point (2), at least one of the states and is not rooted. ∎
Lemma C.8**.**
If a rewrite transition is stationary, it preserves the rooted property, i.e. being rooted implies is also rooted.
Proof.
The stationary rewrite transition is in the form of , where is a focus-free simple context, is a focus-free one-way hypernet, is a focus-free hypernet and . We assume is rooted, and prove that is rooted, i.e. {?;\mathcal{C}[H^{\prime}]}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\mathcal{C}[?;_{i}H^{\prime}]. By Lem. C.5, there exists a number such that:
[TABLE]
The rest of the proof is by case analysis on the number .
- •
When , i.e. , the unique input and the -th source of the hole coincide in the simple context . Therefore, , which means is rooted.
- •
When , there exists a state such that {?;\mathcal{C}[H]}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k-1}{\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}\mathcal{C}[?;_{i}H]. By the following internal lemma (Lem. C.9), there exists a focussed simple context , whose token is not entering nor exiting, and we have two search sequences:
[TABLE]
The last search transition {\dot{\mathcal{C}_{N}}}[H]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}\mathcal{C}[?;_{i}H], which yields a search token, must use the interaction rule (3) or (4). Because the token is not entering nor exiting in the simple context , either of the two interaction rules acts on the token and an edge of the context. This means that the same interaction is possible in the state , yielding:
[TABLE]
which means is rooted.
Lemma C.9**.**
For any and any state such that {?;\mathcal{C}[H]}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{m}{\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k-m}\mathcal{C}[?;_{i}H], the following holds.
(A) If there exists a focussed simple context such that , the token of the context is not entering.
(B) If there exists a focussed simple context such that , the token of the context is not exiting.
(C) There exists a focussed simple context such that , and {?;\mathcal{C}[H^{\prime}]}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{m}{\dot{\mathcal{C}_{N}}}[H^{\prime}] holds.
Proof.
Firstly, because search transitions do not change an underlying hypernet, if there exists a focussed simple context such that , necessarily holds.
The point (A) is proved by contradiction; we assume that the context has an entering token. This means that there exist a number and a token label such that . By Lem. C.5, there exists a number such that and:
[TABLE]
We derive a contradiction by case analysis on the numbers and .
- –
If and , the state must be initial, but it is a result of a search transition because . This is a contradiction.
- –
If and , two different transitions in the search sequence (\$$) result in the same state, because of h>0k-h>0$, which contradicts Lem. C.2.
- –
If , by Def. 6.2, there exists a state with a rewrite token such that \mathcal{C}[?;_{p}H]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{N^{\prime}}}. This contradicts the search sequence (\$$), because k-h>0$ and search transitions are deterministic.
The point (B) follows from the contraposition of Lem. C.7(2), because is one-way and is rooted. The rooted property of follows from the fact that search transitions do not change underlying hypernets.
The point (C) is proved by induction on . In the base case, when , we have , and therefore the context can be taken as . This means .
In the inductive case, when , there exists a state such that
[TABLE]
By the induction hypothesis, there exists a focussed simple context such that and
[TABLE]
Our goal here is to find a focussed simple context , such that and {\dot{\mathcal{C}_{N^{\prime}}}}[H^{\prime}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{\mathcal{C}_{N}}}[H^{\prime}].
In the search transition {\dot{\mathcal{C}_{N^{\prime}}}}[H]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{N}}, the only change happens to the token and its incoming or outgoing edge in the state . By the points (A) and (B), the token is not entering nor exiting in the context , which means the edge must be from the context, not from .
Now that no edge from is changed in {\dot{\mathcal{C}_{N^{\prime}}}}[H]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{N}}, there exists a focussed simple context such that , and moreover, {\dot{\mathcal{C}_{N^{\prime}}}}[H^{\prime}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}{\dot{\mathcal{C}_{N}}}[H^{\prime}]. ∎
∎
Appendix D Accessible paths and stable hypernets
A stable hypernet always has at least one edge, and any non-output vertex is labelled with . It has a tree-like shape.
Lemma D.1** (Shape of Stable Hypernets).**
- (1)
In any stable hypernet, if a vertex is reachable from another vertex such that , there exists a unique path from the vertex to the vertex . 2. (2)
Any stable hypernet has no cyclic path, i.e. a path from a vertex to itself. 3. (3)
Let be a simple context such that: its hole has one source and at least one outgoing edge; and its unique input is the hole’s source. There are no two stable hypernets and that satisfy .
Proof.
To prove the point (1), assume there are two different paths from the vertex to the vertex . These paths, i.e. non-empty sequences of edges, have to involve an edge with more than one source, or two different edges that share the same target. However, neither of these is possible in a stable hypernet, because both a passive operation edge and an instance edge have only one source and vertices can have at most one incoming edge. The point (1) follows from this by contradiction.
If a stable hypernet has a cyclic path from a vertex to itself, there must be infinitely many paths from the input to the vertex , depending on how many times the cycle is included. This contradicts the point (1).
The point (3) is also proved by contradiction. Assume that there exist two stable hypernets and that satisfy for the simple context . In the stable hypernet , a vertex is always labelled with if it is not an output. However, in the simple context , there exists at least one target of the hole that is not an output of the context but not labelled with either. This contradicts being a stable hypernet. ∎
A stable hypernet can be found as a part of representation of a value.
Lemma D.2**.**
Let be a sequence of variables and be a sequence of atoms. For any derivable type judgement where is a value, its representation can be decomposed as using a stable hypernet and a simple context whose unique input coincides with a (unique) source of its hole.
Proof.
By induction on the definition of value.
When the value is an atom, in the representation , only an instance edge can comprise a stable hypernet.
When the value is , by induction hypothesis, a stable hypernet can be extracted from (a bottom part of) representation of each eager argument . The stable hypernet that decomposes the representation can be given by all these stable hypernets together with the passive operation edge that is introduced in the representation.
When the value is , or , by induction hypothesis, representation of the value includes a stable hypernet . The stable hypernet itself decomposes the representation in the required way. ∎
Lemma D.3**.**
For any state , and its vertex , such that the vertex is not a target of an instance edge or a passive operation edge, if an accessible path from the vertex is stable or active, then the path has no multiple occurences of a single edge.
Proof.
Any stable or active path consists of edges that has only one source. As a consequence, except for the first edge, no edge appears twice in the stable path. If the stable path is from the vertex , its first edge also does not appear twice, because is not a target of an instance edge or a passive operation edge. ∎
Lemma D.4**.**
For any state , and its vertex , such that the vertex is not a target of an instance edge or a passive operation edge, the following are equivalent.
(A) There exist a focussed simple context and a stable hypernet , such that , where the vertex of corresponds to a unique source of the hole edge in .
(B) Any accessible path from the vertex in is a stable path.
Proof of (A) (B).
Because no output of a stable hypernet has type , any path from the vertex in gives a path from the unique input in . In the stable hypernet , any path from the unique input is a stable path. ∎
Proof of (B) (A).
In the state , the token target has to be a source of an edge, which forms an accessible path itself. By Lem. D.3, in the state , we can take maximal stable paths from the vertex , in the sense that appending any edge to these paths, if possible, does not give a stable path.
If any of these maximal stable paths is to some vertex, the vertex does not have type ; this can be confirmed as follows. If the vertex has type , it is not an output, so it is a source of an instance, token, operation or contraction edge. The case of an instance or passive operation edge contradicts the maximality. The other case yields a non-stable accessible path that contradicts the assumption (B).
Collecting all edges contained by the maximal stable paths, therefore, gives the desired hypernet . These edges are necessarily all shallow, because of the vertex of . The focussed context , whose hole is shallow, can be made of all the other edges (at any depth) of the state . ∎
Lemma D.5**.**
Let be a state, where the token is an incoming edge of an operation edge , whose label takes at least one eager arguments. Let denote the number of eager arguments of .
For each , let be a state such that: both states and have the same token label and the same underlying hypernet, and the token in is the -th outgoing edge of the operation edge .
For each , the following are equivalent.
(A) In , any accessible path from an -th target of the operation edge is a stable (resp. active) path.
(B) In , any accessible path from the token target is a stable (resp. active) path.
Proof.
The only difference between and is the swap of the token with the operation edge , and these two edges form an accessible path in the states and , individually or together (in an appropriate order). Therefore, there is one-to-one correspondence between accessible paths from an -th target of the edge in , and accessible paths from the token target in .
When (A) is the case, in , any accessible paths from an -th target of the edge does not contain the token nor the edge ; otherwise there would be an accessible path that contains the token and hence not stable nor active, which is a contradiction. This means that, in , any accessible path from the token target also does not contain the token nor the edge , and the path must be a stable (resp. active) path.
When (B) is the case, the proof takes the same reasoning in the reverse way. ∎
Lemma D.6**.**
Let be a rooted state with a search token, such that the token is not an incoming edge of a contraction edge.
- (1)
{\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{+}\langle{\dot{N}}\rangle_{\checkmark/?}, if and only if any accessible path from the token target in is a stable path. 2. (2)
{\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{+}\langle{\dot{N}}\rangle_{\lightning/?}, if and only if any accessible path from the token target in is an active path.
Proof of the forward direction.
Let be either ‘’ or ‘’. The assumption is {\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{N}}\rangle_{\mathsf{t}/?}. We prove the following, by induction on the length of this search sequence:
- •
any accessible path from the token target in is a stable path, when , and
- •
any accessible path from the token target in is an active path, when .
In the base case, where , because the token is not an incoming edge of a contraction edge, the token target is a source of an instance edge, or an operation edge labelled with that takes no eager argument. In either situation, the outgoing edge of the token gives the only possible accessible path from the token target. The path is stable when , and active when .
In the inductive case, where , the token target is a source of an operation edge labelled with an operation that takes at least one eager argument.
Let denote the number of eager arguments of , and be an arbitrary number in . Let be the state as defined in Lem. D.5. Because is rooted, by Lem. C.5, the given search sequence gives the following search sequence (proof by induction on ):
[TABLE]
By induction hypothesis on the intermediate sequence \mathit{sw}_{i}({\dot{N}})\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{+}\langle\mathit{sw}_{i}({\dot{N}})\rangle_{\checkmark/?}, any accessible path from the token target in is a stable path. By Lem. D.5, any accessible path from an -th target of the operation edge in is a stable path.
In , any accessible path from the token target is given by the operation edge followed by an accessible path, which is proved to be stable above, from a target of . Any accessible path from the token target is therefore stable when , and active when . ∎
Proof of the backward direction.
Let be either ‘’ or ‘’. The assumption is the following:
- •
any accessible path from the token target in is a stable path, when , and
- •
any accessible path from the token target in is an active path, when .
Our goal is to show {\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{N}}\rangle_{\mathsf{t}/?}.
In the state , the token target has to be a source of an edge, which forms an accessible path itself. By Lem. D.3, we can define by the maximum length of stable paths from the token target. This number is well-defined and positive. We prove {\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{*}\langle{\dot{N}}\rangle_{\mathsf{t}/?} by induction on .
In the base case, where , the outgoing edge of the token is the only possible accessible path from the token target. The outgoing edge is not a contraction edge by the assumption, and hence it is an instance edge, or an operation edge labelled with that takes no eager argument. We have {\dot{N}}\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}\langle{\dot{N}}\rangle_{\mathsf{t}/?}.
In the inductive case, where , the outgoing edge of the token is an operation edge labelled with that takes at least one eager argument. Any accessible path from the token target in is given by the edge followed by a stable path from a target of .
Let denote the number of eager arguments of , and be an arbitrary number in . Let be the state as defined in Lem. D.5.
By the assumption, any accessible path from an -th target of the operation edge in is a stable path. Therefore by Lem. D.5, in , any accessible path from the token target is a stable path. Moreover, these paths in and correspond to each other. By Lem. D.3, we can define by the maximum length of stable paths from the token target. This number is well-defined, and satisfies . By induction hypothesis on this number, we have:
[TABLE]
Combining this search sequence with the following possible search transitions concludes the proof:
[TABLE]
∎
Appendix E Parametrised (contextual) refinement and equivalence
Lemma E.1**.**
For any focus-free contexts and such that is defined, if both and are binding-free, then is also binding-free.
Proof.
Let denote , and denote the hole edge of labelled with .
The proof is by contradiction. We assume that there exists a path in , from a source of a contraction, atom, box or hole edge , to a source of a hole edge . We derive a contradiction by case analysis on the path .
- •
When comes from , and the path consists of edges from only, the path gives a path in that contradicts being binding-free.
- •
When comes from , and the path contains an edge from , by finding the last edge from in , we can take a suffix of that gives a path from a target of the hole edge to a source of a hole edge, in . Adding the hole edge at the beginning yields a path in that contradicts being binding-free.
- •
When both and come from , and the path gives a path in , this contradicts being binding-free.
- •
When both and come from , and the path does not give a single path in , there exists a path from a source of the hole edge to a source of the hole edge , in . This path contradicts being binding-free.
- •
When comes from and comes from , by finding the first edge from in , we can take a prefix of that gives a path from a source of a contraction, atom, box or hole edge to a source of the hole edge , in . This path contradicts being binding-free.
∎
Lemma E.2**.**
For any set of contexts that is closed under plugging, and any preorder on natural numbers, the following holds.
- •
* and are reflexive.*
- •
* and are transitive.*
- •
* and are equivalences.*
Proof.
Because and are defined as a symmetric subset of and , respectively, and are equivalences if and are preorders.
Reflexivity and transitivity of is a direct consequence of those of the preorder .
For any focus-free hypernet , and any focus-free context such that is a state, because of reflexivity of .
For any focus-free hypernets , and , and any focus-free context , such that , , and both and are states, our goal is to show . Because and , all three hypernets , and have the same type, and hence is also a state. Therefore, we have and , and the transitivity of implies . ∎
Lemma E.3**.**
For any set of contexts that is closed under plugging, and any preorder on natural numbers, the following holds.
- (1)
For any hypernets and , implies . 2. (2)
If all compute transitions are deterministic, for any hypernets and , implies .
Proof.
Because , the point (1) follows from the monotonicity of contextual equivalence.
For the point (2), means that any focus-free context , such that and are states, yields and . If the state terminates at a final state after transitions, there exists such that and the state terminates at a final state after transitions. Moreover, there exists such that and the state terminates at a final state after transitions.
Because search transitions and copy transitions are deterministic, if all compute transitions are deterministic, states and transitions comprise a deterministic abstract rewriting system, in which final states are normal forms. By Lem. C.1, must hold. This means , and . Similarly, we can infer , and hence . ∎
Appendix F Proof for Sec. 6.4
Lemma F.1**.**
Let be a set of contexts, and be a binary relation on such that, for any , implies . Let be a pre-template that is a trigger and implies contextual refinement . For any single -specimen of , the following holds.
- (1)
For any , ?;|{\dot{\mathcal{C}}}|[H^{1}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{1}] if and only if ?;|{\dot{\mathcal{C}}}|[H^{2}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{2}]. 2. (2)
If compute transitions are all deterministic, and one of states and is rooted, then the other state is also rooted, and moreover, .
Proof of the point (1).
Let be an arbitrary element of a set . We prove that, for any , ?;|{\dot{\mathcal{C}}}|[H^{p}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{p}] implies ?;|{\dot{\mathcal{C}}}|[H^{q}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{q}]. The proof is by case analysis on the number .
- •
When , is initial, and by Lem. 7.5(1), is also initial. Note that is a trigger and hence output-closed.
- •
When , by the following internal lemma, ?;|{\dot{\mathcal{C}}}|[H^{q}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{q}] follows from ?;|{\dot{\mathcal{C}}}|[H^{p}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{p}].
Lemma F.2**.**
For any , there exists a focussed context such that and the following holds:
[TABLE]
Proof.
By induction on . In the base case, when , we can take as .
In the inductive case, when , by induction hypothesis, there exists a focussed context such that and the following holds:
[TABLE]
Because , is a single -specimen of , which yields rooted states. Because , cannot have a rewrite token. The rest of the proof is by case analysis on the token of .
- –
When has an entering search token, because is a trigger, for each . Because , and search transitions are deterministic, we have the following:
[TABLE]
We also have .
- –
When has a value token, or a non-entering search token, because is output-closed, by Lem. 7.5(3), there exists a focussed context such that and for each . The transition , for each , is a search transition, and by the determinism of search transitions, we have the following:
[TABLE]
∎
∎
Proof of the point (2).
If one of states and is rooted, by the point (1), the other state is also rooted, and moreover, there exists such that ?;|{\dot{\mathcal{C}}}|[H^{r}]\mathrel{\stackrel{{\scriptstyle\raisebox{1.50694pt}{\scriptstyle\bullet,}}}{{\raisebox{0.0pt}[0.0pt][0.0pt]{\rightarrow}}}}^{k}{\dot{\mathcal{C}}}[H^{r}] for each .
Our goal is to prove that, for any and any final state such that , there exist and a final state such that and . Assuming , we have the following:
[TABLE]
Because implies contextual refinement , and , we have state refinement . Therefore, there exist and a final state such that and .
The assumption that compute transitions are all deterministic implies that all transitions, including intrinsic ones, are deterministic. Following from this are and the following:
[TABLE]
By the assumption on , implies . ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[ABM 14] Beniamino Accattoli, Pablo Barenbaum, and Damiano Mazza. Distilling abstract machines. In 19th ACM SIGPLAN International Conference on Functional Programming, ICFP 2014, September 1-3 2014, Gothenburg, Sweden , pages 363–376, 2014.
- 2[Abr 97] Samson Abramsky. Game semantics for programming languages. In 22nd International Symposium on Mathematical Foundations of Computer Science, MFCS 1997, August 25-29 1997, Bratislava, Slovakia , pages 3–4, 1997.
- 3[ADV 20] Beniamino Accattoli, Ugo Dal Lago, and Gabriele Vanoni. The machinery of interaction. In PPDP 2020 , pages 4:1–4:15. ACM, 2020.
- 4[AG 09] Beniamino Accattoli and Stefano Guerrini. Jumping boxes. In CSL 2009 , volume 5771 of Lect. Notes Comp. Sci. , pages 55–70. Springer, 2009.
- 5[AGSZ 21] Mario Alvarez-Picallo, Dan R. Ghica, David Sprunger, and Fabio Zanasi. Functorial string diagrams for reverse-mode automatic differentiation. Co RR , abs/2107.13433, 2021.
- 6[AJM 13] Samson Abramsky, Radha Jagadeesan, and Pasquale Malacaria. Full abstraction for PCF. Co RR , abs/1311.6125, 2013.
- 7[BGK + 16] Filippo Bonchi, Fabio Gadducci, Aleks Kissinger, Pawel Soboci’nski, and Fabio Zanasi. Rewriting modulo symmetric monoidal structure. In Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, LICS ’16, New York, NY, USA, July 5-8, 2016 , pages 710–719, 2016.
- 8[BPPR 17] Filippo Bonchi, Daniela Petrisan, Damien Pous, and Jurriaan Rot. A general account of coinduction up-to. Acta Inf. , 54(2):127–190, 2017.
