C-SHORe: Higher-Order Verification via Collapsible Pushdown System Saturation
Christopher Broadbent, Arnaud Carayol, Matthew Hague, and Olivier, Serre

TL;DR
C-SHORe introduces a scalable, automata-theoretic model-checking approach for higher-order recursion schemes using collapsible pushdown systems, enhancing verification efficiency and providing counter-examples.
Contribution
It presents the first practical model-checking algorithm for CPDS, integrating backward saturation, forward guidance, pruning, and counter-example extraction.
Findings
Effective backward saturation algorithm for CPDS
Guided model-checking using forward reachability information
Competitive performance compared to state-of-the-art tools
Abstract
Higher-order recursion schemes (HORS) have received much attention as a useful abstraction of higher-order functional programs with a number of new verification techniques employing HORS model-checking as their centrepiece. We give an account of the C-SHORe tool, which contributed to the ongoing quest for a truly scalable model-checker for HORS by offering a different, automata theoretic perspective. C-SHORe implements the first practical model-checking algorithm that acts on a generalisation of pushdown automata equi-expressive with HORS called collapsible pushdown systems (CPDS). At its core is a backwards saturation algorithm for CPDS. Additionally, it is able to use information gathered from an approximate forward reachability analysis to guide its backward search. Moreover, it uses an algorithm that prunes the CPDS prior to model-checking and a method for extracting…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification · Logic, programming, and type systems · Security and Verification in Computing
††thanks:
\titlecomment
This article gives a full account of the C-SHORe tool [cshore] whose algorithms and implementation have been published in ICALP 2012 [BCHS12] and ICFP 2013 [BCHS13].
C-SHORe: Higher-Order Verification via Collapsible Pushdown System Saturation
Christopher Broadbent
Institut für Informatik (I7), Technische Universität München
Arnaud Carayol
Laboratoire d’informatique de l’Institut Gaspard Monge, Université Paris-Est, and CNRS
Matthew Hague
Department of Computer Science, Royal Holloway, University of London
Olivier Serre
IRIF, Université Paris Diderot - Paris 7, and CNRS
Abstract.
Higher-order recursion schemes (HORS) have received much attention as a useful abstraction of higher-order functional programs with a number of new verification techniques employing HORS model-checking as their centrepiece. We give an account of the C-SHORe tool, which contributed to the ongoing quest for a truly scalable model-checker for HORS by offering a different, automata theoretic perspective. C-SHORe implements the first practical model-checking algorithm that acts on a generalisation of pushdown automata equi-expressive with HORS called collapsible pushdown systems (CPDS). At its core is a backwards saturation algorithm for CPDS. Additionally, it is able to use information gathered from an approximate forward reachability analysis to guide its backward search. Moreover, it uses an algorithm that prunes the CPDS prior to model-checking and a method for extracting counter-examples in negative instances. We provide an up-to-date comparison of C-SHORe with the state-of-the-art verification tools for HORS. The tool and additional material are available from http://cshore.cs.rhul.ac.uk.
keywords:
Higher-Order; Verification; Model-Checking; Recursion Schemes; Collapsible Pushdown Systems; Saturation; Automata
1991 Mathematics Subject Classification:
F.1.1; Models of Computation; Automata
1. Introduction
Functional languages such as Haskell, OCaML and Scala strongly encourage the use of higher-order functions. This represents a challenge for software verification, which usually does not model recursion accurately, or models only first-order calls (e.g. SLAM [BR02] and Moped [S02]). However, there has recently been much interest in a model called higher-order recursion schemes (HORS) (see e.g. [O06]), which offers a way of abstracting functional programs in a manner that precisely models higher-order control-flow.
The execution trees of HORS enjoy decidable -calculus theories [O06]. Even ‘reachability’ properties (subsumed by the -calculus) are very useful in practice. As a simple example, the safety of incomplete pattern matching clauses could be checked by asking whether the program can ‘reach a state’ where a pattern match failure occurs. More complex ‘reachability’ properties can be expressed using a finite automaton and could, for example, specify that the program respects a certain discipline when accessing a particular resource (see [K09]). Despite even reachability being -EXPTIME complete, recent research has revealed that useful properties of HORS can be checked in practice.
Kobayashi’s TRecS [K09b] tool, which checks properties expressible by a deterministic trivial Büchi automaton (all states accepting), was the first to achieve this. It works by determining whether a HORS is typable in an intersection-type system characterising the property to be checked [K09]. In a bid to improve scalability, a number of other algorithms have subsequently been designed and implemented such as Kobayashi et al.’s GTRecS(2) [K11b, gtrecs2] and Neatherway et al.’s TravMC [NRO12] tools, all based on intersection type inference. A recent overview of HORS model-checking was given by Ong [O15].
This work is the basis of various techniques for verifying functional programs. In particular, Kobayashi*et al.*have developed MoCHi [KSU11] that checks safety properties of (OCaML) programs, and EHMTT Verifier [Unno10APLAS] for tree processing programs. Both use a recursion schemes model-checker as a central component. Similarly, Ong and Ramsay [OR11] analyse programs with pattern matching employing recursion schemes as an abstraction.
Achieving scalability while accurately tracking higher-order control-flow is a challenging problem. This article offers an automata-theoretic perspective on this challenge, providing a fresh set of tools that contrast with previous intersection-type approaches.
Collapsible pushdown systems (CPDS) [HMOS08, HMOS17] are an alternative representation of the class of execution trees that can be generated by recursion schemes (with linear-time mutual-translations between the two formalisms [HMOS08, HMOS17, CS12]). While pushdown systems augment a finite-state machine with a stack and provide an ideal model for first-order programs [JM77], collapsible pushdown systems model higher-order programs by extending the stack of a pushdown system to a nested “stack-of-stacks” structure. The nested stack structure enables one to represent closures. Indeed the reader might find it helpful to view a CPDS as being a Krivine’s Abstract Machine in a guise making it amenable to the generalisation of techniques for pushdown model-checking. Salvati and Walukiewicz have studied in detail the connection with the Krivine abstract machine [SW12, SW16].
For ordinary (‘order-’) pushdown systems, a model-checking approach called saturation has been successfully implemented by tools such as Moped [S02] and PDSolver [HO10b]. Given a regular set of configurations of the pushdown system (represented by a finite automaton acting on stacks), saturation can solve the ‘backward reachability problem’ by computing another finite automaton recognising a set of configurations from which a configuration in can be reached. This is a fixed-point computation that gradually adds transitions to until it is ‘saturated’. If recognises a set of error configurations, one can determine whether the pushdown system is ‘safe’ by checking if its initial configuration is recognised by the automaton computed by saturation.
The first contribution of this article was first presented in ICALP 2012. We extend the saturation method to a backward reachability analysis of collapsible pushdown systems [BCHS12]. This runs in PTIME when the number of control states is bounded. Crucially, this condition is satisfied when translating from recursion schemes of bounded arity with properties represented by automata of bounded size [HMOS08, HMOS17]. Whilst the HORS/intersection-type based tool GTRecS(2) also enjoys this fixed-parameter tractability, it times out on many benchmarks that our tool solves quickly. We remark also Ramsay*et al.*introduced a third fixed-parameter tractable algorithm in 2014 underlying their tool [RNO14].
In this work, we revisited the foundations of higher-order verification tools and introduced C-SHORe [cshore] — the first model-checking tool for the (direct) analysis of collapsible pushdown systems. This work was presented in ICFP 2013 [BCHS13]. To achieve an efficient implementation, some substantial modifications and additions were made to the algorithm, leading to several novel practical and theoretical contributions:
- (1)
An approximate forward reachability algorithm providing data
- (a)
…allowing the CPDS to be pruned so that saturation receives a smaller input. 2. (b)
…employed by a modified saturation algorithm to guide its backward search.
This is essential for termination on most of our benchmarks. 2. (2)
A method for extracting witnesses to reachability. 3. (3)
A complete rework of the saturation algorithm to speed up fixed-point computation. 4. (4)
Experimental results comparing our approach with other tools.
We remark that the tools mentioned above propagate information forwards WRT the evaluation of the model. In contrast, the raw saturation algorithm works backwards, but we also show how forward and backward propagation can be combined.
Here we give a full account of the C-SHORe tool. This covers the saturation algorithm presented at ICALP 2012 as well as efficient algorithms implemented by C-SHORe in ICFP 2013. To prove soundness, we diverge from the ICALP 2012 proof, and instead base our proof on the witness generation algorithm presented in ICFP 2013. In particular, we present novel generalisations of witness generation, the forwards analysis, and the efficient fixed-point calculation to alternating CPDSs. These were only given for non-alternating CPDSs in ICFP 2013. The tool is available at http://cshore.cs.rhul.ac.uk.
Since C-SHORe was released, two new tools were released. Broadbent*et al.introduced HorSat, which is an application of the saturation technique and initial forward analysis directly to intersection type analysis of HORS [BK13]. Recently HorSat2 improved the forwards analysis and made other algorithmic improvements [horsat2]. Secondly, in POPL 2014, Ramsayet al.*introduced [RNO14]. This is a type-based abstraction-refinement algorithm that attempts to simultaneously prove and disprove the property of interest. Both HorSat2 and perform significantly better than previous tools.
Even though both and HorSat2 both outperform C-SHORe, we consider the CPDS approach to offer a different perspective by providing a link between successful pushdown model-checking tools and higher-order model-checking. Moreover, CPDS have been instrumental in proving a number of results about higher-order languages. Hence, it is very natural to consider the implementation of a model-checker using these automata techniques and the challenges and opportunities therein. This article provides an account of a significant effort to extend pushdown model-checking to the higher-order case, and therefore we hope it will be instructive to readers interested in building verification tools for higher-order programming languages.
Section 2 is an informal introduction to HORS and CPDS. In Section 3 we describe CPDS and how to represent sets of their configurations. The basic saturation algorithm introduced in ICALP 2012 is presented in Section 4 and proven correct in Section 5. Section LABEL:sec:counter-egs gives our generalised witness generation algorithm (that also implies soundness of saturation). We describe two optimisations to the saturation algorithm used by C-SHORe: an initial forwards analysis in Section LABEL:section:optimisations and an efficient fixed point computation in Section LABEL:sec:fastalgorithm. Experimental results are in Section LABEL:sec:experiments.
2. Modelling Higher-Order Programs
In this section we give an informal introduction to the process of modelling higher-order programs for verification. In particular, we show how a simple example program can be modelled using a higher-order recursion scheme, and then we show how this scheme is evaluated using a collapsible pushdown system. For a more systematic approach to modelling higher-order programs with recursion schemes, we refer the reader to work by Kobayashiet al. [KSU11]. This section is for background only, and can be safely skipped.
For this section, consider the toy example below.
Main = MakeReport Nil MakeReport x = if * (Commit x) else (AddData x MakeReport) AddData y f = if * (f Error) else (f Cons(_, y))
In this example, * represents a non-deterministic choice (that may, for example, be a result of some input by the user). Execution begins at Main which aims to make a report which is a list. It sends an empty report to MakeReport. Either MakeReport finishes and commits the report somehow, or it adds an item to the head of the list using AddData, which takes the report so far, and a continuation. AddData either detects a problem with the new data (maybe it is inconsistent with the rest of the report) and flags an error by passing Error to the continuation, or extends the report with some item. In this case, since there is no error handling in MakeReport, an Error may be committed.
2.1. Higher-Order Recursion Schemes
We introduce, informally, higher-order recursion schemes. These are rewrite systems that generate the computation tree of a functional program. A rewrite rule takes the form
[TABLE]
where is a (simply) typed non-terminal with (possibly higher-order) arguments and . A term rewrites to with substituted for and substituted for . Note that recursion schemes require to be of ground type. We illustrate recursion schemes and their use in analysis using the toy example from above. We can directly model our example with the scheme
[TABLE]
where is the non-terminal associated with the MakeReport function, and is the non-terminal associated with the AddData function; , , , and are terminal symbols of arity 0, 2, 1, 0 and 1 respectively (e.g. in the second rule, takes the two arguments and ). The scheme above begins with the non-terminal and, through a sequence of rewrite steps, generates a tree representation of the evolution of the program. Figure 1, described below, shows such a sequence.
Beginning with the non-terminal , we apply the first rewrite rule to obtain the tree representing the term . We then apply the second rewrite rule, instantiating with to obtain the next tree in the sequence. This continues ad infinitum to produce a possibly infinite tree labelled only by terminals.
We aim to show the correctness of the program. I.e. the program never tries to an . The rightmost tree in Figure 1, has a branch labelled . Note, is being called with an report. In general we define the regular language . If the tree generated by the HORS contains a branch labelled by a word appearing in , then we have identified an error in the program.
2.2. Collapsible Pushdown Automata
Previous research into the verification of HORS has used intersection types (e.g. [K11, NRO12]). Here we investigate a radically different approach exploiting the connection between HORS and an automata model called collapsible pushdown automata (CPDA). These two formalisms are, in fact, equivalent.
Theorem 1** (Equi-expressivity [HMOS08, HMOS17]).**
For each order- recursion scheme, there is an order- collapsible pushdown automaton generating the same tree, and vice-versa. Furthermore, the translation from recursion schemes to collapsible pushdown automata is linear, and the opposite translation is polynomial. ∎
We describe at a high level the structure of a CPDA and how they can be used to evaluate recursion schemes. In our case, this means outputting a sequence of non-terminals representing each path in the tree. More formal definitions are given in Section 3. At any moment, a CPDA is in a configuration , where is a control state taken from a finite set , and is a higher-order collapsible stack. In the following we will focus on the stack. Control states are only needed to ensure that sequences of stack operations occur in the correct order and are thus elided for clarity.
In our toy example, we have an order- HORS and hence an order- stack. An order- stack is a stack of characters from a finite alphabet . An order- stack is a stack of order- stacks. Thus denotes the order- stack containing only the order- stack ; is an order- stack containing only the character . In general will contain all subterms appearing in the original statement of our toy example recursion scheme. The evolution of the CPDA stack is given in Figure 2 and explained below.
The first step is to rewrite using . Since is a subterm of our recursion scheme, we have and we rewrite the stack to . Next, we call . As usual, a function call necessitates a new stack frame. In particular, we push the body of (that is ) onto the stack, giving the third stack in Figure 2. Note, we do not instantiate the variable , hence we use only the subterms appearing in the recursion scheme.
Recall that we want to obtain a CPDA that outputs a sequence of terminals representing each path in the tree. To evaluate we output the terminal and then (non-deterministically) choose a branch of the tree to follow. Let us choose . Hence, the CPDA outputs and rewrites the top term to . Next we call , pushing its body to the stack, then pick out the branch of the terminal. This takes us to the beginning of the second row of Figure 2.
To proceed, we evaluate . To do this, we have to know the value of . We can obtain this information by inspecting the stack and seeing that the second argument of the call of is . However, since we can only see the top of a stack, we would have to remove the character to determine that , thus losing our place in the computation.
However, an order- stack is able — via a operation — to create a copy of its topmost order- stack. After this copy (note that the top of the stack is written on the left) we delve into the copy of the stack to find the value of . Simultaneously we create a collapse link, pictured as an arrow from to the stack with the term on top. This collapse link points from to the context in which will be evaluated. In particular, if we need to know the value of in the body of , we need to know that was called with the argument, within the term ; the collapse link points to this information (i.e. encodes a closure in the stack). We can access this information via a collapse operation. These are the two main features of a higher-order collapsible stack, described formally in the next section.
To continue, we push the body of on to the stack, output the symbol and choose the branch. Since is a terminal, we output it and evaluate . To compute , we look into the stack and follow the collapse link from to the stack with on top. We do not create a copy of the stack here because is an order-[math] variable and thus represents a self-contained execution. Since has value , we output it and terminate. This completes the execution corresponding to the error branch identified in Figure 1.
2.3. Collapsible Pushdown Systems
The CPDA output in the execution above. This is an error sequence in and should be flagged. In general, we take the finite automaton representing the regular language and form a synchronised product with the CPDA. This results in a CPDA that does not output any symbols, but instead keeps in its control state the progression of . Thus we are interested in whether the CPDA is able to reach an accepting state of , not the language it generates. We call a CPDA without output symbols a collapsible pushdown system (CPDS), and the question of whether a CPDS can reach a given state is the reachability problem. This is the subject of the remainder of the paper.
3. Preliminaries
3.1. Collapsible Pushdown Systems
We give the definition of higher-order collapsible stacks and their operations, before giving the definition of collapsible pushdown systems.
3.1.1. Higher-Order Collapsible Stacks
Higher-order collapsible stacks are a nested “stack-of-stacks” structure over a stack alphabet . Each stack character contains a pointer — called a “link” — to a position lower down in the stack. The stack operations, defined below, create copies of sub-stacks. The link is intuitively a pointer to the context in which the stack character was first created. These links will be defined as tuples, the meaning of which is expanded upon after the following definition. Let the natural numbers be . We will write stacks with the top of the stack appearing on the left.
{defi}
[Order- Collapsible Stacks] An order- link is a tuple where and are natural numbers. If we say the link is up-to order-. Given a finite set of stack characters , an order-[math] stack with an up-to order- link is where and is an up-to order- link. An order- stack with up-to order- links is a sequence such that each is an order- stack with up-to order- links. Moreover, for each and each order- link appearing on a character in , we have . Let denote the set of order- stacks over with up-to order- links.
In the sequel we will refer to order- stacks with up-to order- links simply as order- stacks. We will use order- stack to mean an order- stack with up-to order- links, where is clear from the context. We define the interpretation of the collapse links formally below. Intuitively, the collapse links point to a position lower down in the stack. In a link the first component indicates that the link points to a location inside the order- stack where the link is contained. The second component gives the distance from the bottom of the stack of the targeted position. For example, a link in an order- stack would point to the bottom of the order- stack. That is, after . Hence, we can represent collapse links informally with arrows as shown below.
Example 3.1**.**
An example order- stack is where the topmost character is . This could be written
[TABLE]
The collapse operation, defined below, will remove all parts of the stack above the destination of the topmost collapse link. Collapse on the stack in the example above gives . Note, we will often omit the collapse link annotations for readability. In particular, we will often write instead of when we are not interested in the link.
Given an order- stack , we define
[TABLE]
noting that is undefined if is empty for any . For technical reasons, we also define when is an order- stack. We remove the top portion of a stack using, where ,
[TABLE]
For , the destination of the link is .
When is an order- stack and is an -stack with , we define as the stack obtained by adding on top of the topmost -stack of . Formally, we let
[TABLE]
3.1.2. Operations on Order- Collapsible Stacks
The following operations may be performed on an order- collapsible stack.
[TABLE]
We say is of order- when is minimal such that . E.g., is of order .
The operation is non-standard in the sense of Hagueet al. [HMOS08, HMOS17] and has the semantics of a normal collapse, with the additional constraint that the top character has an order- link. The standard version of collapse can be simulated with a non-deterministic choice on the order of the stack link. In the other direction, we can store in the stack alphabet the order of the collapse link attached to each character on the stack. Note, we do not allow order- links to be created or used. In effect, these links are “null”.
We define each stack operation in turn for an order- stack . Collapse links are created by the operations, which add a character to the top of a given stack with a link pointing to .
- (1)
We set when decomposes into . 2. (2)
We set when . 3. (3)
We set when for some . 4. (4)
We set where . 5. (5)
We set where .
Note that, for a operation, links outside of point to the same destination in both copies of , while links pointing within point within the respective copies of . For full introduction, we refer the reader to Hagueet al. [HMOS08, HMOS17]. In Section 4.3 we give several example stacks and show how the stack operations affect them.
3.1.3. Collapsible Pushdown Systems
We define alternating collapsible pushdown systems.
{defi}
[Collapsible Pushdown Systems] An alternating order- collapsible pushdown system (collapsible PDS) is a tuple where is a finite set of control states, is a finite stack alphabet, and is a set of rules.
We write configurations of a collapsible PDS as a pair where and . We write to denote a transition from a rule with and . Furthermore, we have a transition whenever we have a rule . A non-alternating collapsible PDS has no rules of this second form. We write to denote a set of configurations.
We will be interested in the configurations that may reach a particular target set. That is, given a set of target configurations, we define the set of configurations which can eventually reach . A configuration can reach if it is contained in or there is a transition to a configuration that can reach . In the case of alternating transitions to a set of configurations , we require all configurations in to be able to reach . This is formally defined as where
[TABLE]
3.2. Regularity of Collapsible Stacks
We will present an algorithm that operates on sets of configurations.
For this we use order- stack automata, thus defining a notion of regular sets of stacks. These have a nested structure based on a similar automata model by Bouajjani and Meyer [BM04]. The handling of collapse links is similar to automata introduced by Broadbentet al. [BCOS10], except we read stacks top-down rather than bottom-up. Note, the second condition in the definition below is a uniqueness condition that will be technically convenient throughout this article. It can be shown that it does not restrict the expressive power of the automata.
{defi}
[Order- Stack Automata] An order- stack automaton
[TABLE]
is a tuple where is a finite stack alphabet, are finite disjoint statesets, and
- (1)
for all , we have that is a transition relation, and is a set of accepting states, 2. (2)
for all and , if and then and , and 3. (3)
is a transition relation, and a set of accepting states.
Stack automata are alternating automata that read the stack in a nested fashion. Order- stacks are recognised from states in . A transition from to for some can be fired when the topmost order- stack is accepted from . The remainder of the stack must be accepted from all states in . At order-, a transition is a standard alternating -transition with the additional requirement that the stack pointed to by the collapse link of is accepted from all states in . A stack is accepted if a subset of is reached at the end of each order- stack. In Section 3.2.1, we formally define the runs of a stack automaton. We write whenever is accepted from a state . For ease of presentation, we write instead of and instead of . Note that a transition to the empty set is distinct from having no transition.
We give two informal examples of runs below. The first is more schematic, while the second is concrete. Further examples can be found in Section 4.3.
Example 3.2**.**
A (partial) run is informally pictured in Figure LABEL:fig:schematic-run, reading an order- stack using and . Note, the transition reads the topmost order- stack, with the remainder of the stack being read from . The node labelled begins a run on the stack pointed to by the collapse link of . Note that the label of this node may contain other elements apart from . These additional elements come from the part of the run coming from the previous node (and other collapse links).
