Faster Algorithms for Weighted Recursive State Machines
Krishnendu Chatterjee, Bernhard Kragl, Samarth Mishra, Andreas, Pavlogiannis

TL;DR
This paper introduces faster algorithms for analyzing recursive state machines using semiring-labeled transitions, improving complexity bounds and efficiency for interprocedural analysis, including concurrent RSMs, with practical speed-ups demonstrated.
Contribution
It presents novel algorithms that improve complexity bounds for RSM analysis, especially for finite-height semirings, and extends these improvements to concurrent RSMs, with practical implementation results.
Findings
Improved complexity bounds for RSM analysis algorithms.
Efficient algorithms for extracting distance values and handling queries.
Prototype implementation shows significant speed-up on benchmarks.
Abstract
Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems. Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the…
| # | Trans. | D | TC | jMop | Ours | SpUp |
|---|---|---|---|---|---|---|
| 1 | 246,101 | 1.9 | ||||
| 2 | 216,021 | 0.8 | ||||
| 3 | 593,041 | 1.5 | ||||
| 4 | 1,043,217 | 1.2 | ||||
| 5 | 329,088 | 1.4 | ||||
| 6 | 10,281,149 | 3.0 | ||||
| 7 | 908,092 | 1.7 | ||||
| 8 | 969,388 | 2.2 | ||||
| 9 | 298,126 | 1.5 | ||||
| 10 | 1,780,776 | 1.3 | ||||
| 11 | 163,853 | 1.4 | ||||
| 12 | 205,608 | 1.0 | ||||
| 13 | 28,568,561 | 1.7 | ||||
| 14 | 21,911,277 | 1.8 | ||||
| 15 | 2,453,881 | 1.5 | ||||
| 16 | 5,833,574 | 1.8 | ||||
| 17 | 332,768 | 0.8 | ||||
| 18 | 1,782,697 | 1.3 | ||||
| 19 | 246,127 | 1.9 | ||||
| 20 | 21,648,560 | 1.8 | ||||
| 21 | 7,033,834 | 2.1 | ||||
| 22 | 28,944,391 | 1.7 | ||||
| 23 | 464,004 | 1.7 | ||||
| 24 | 424,916 | 1.6 | ||||
| 25 | 22,186,326 | 1.6 | ||||
| 26 | 11,719,007 | 5.2 | ||||
| 27 | 2,989,001 | 1.4 | ||||
| 28 | 1,952,647 | 1.3 | ||||
| 29 | 7,970,359 | 3.2 | ||||
| 30 | 682,435 | 2.1 | ||||
| 31 | 9,480,799 | 4.9 | ||||
| 32 | 845,867 | 2.4 | ||||
| 33 | 953,420 | 3.1 | ||||
| 34 | 1,205,731 | 2.0 | ||||
| 35 | 754,270 | 1.7 | ||||
| 36 | 1,463,749 | 2.0 | ||||
| 37 | 434,884 | 5.8 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
MnLargeSymbols’164 MnLargeSymbols’171
11institutetext: IST Austria, Klosterneuburg, Austria 22institutetext: IIT Bombay, Mumbai, India
Faster Algorithms for
Weighted Recursive State Machines
Krishnendu Chatterjee 11
Bernhard Kragl 11
Samarth Mishra 22
Andreas Pavlogiannis 11
Abstract
Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems.
Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project.
1 Introduction
Interprocedural analysis. One of the classical algorithmic analysis problems in programming languages is the interprocedural analysis. The problem is at the heart of several key applications, ranging from alias analysis, to data dependencies (modification and reference side effect), to constant propagation, to live and use analysis [32, 35, 10, 15, 23, 18, 13, 14, 17]. In seminal works [32, 35] it was shown that a large class of interprocedural dataflow analysis problems can be solved in polynomial time.
Models for interprocedural analysis. Two standard models for interprocedural analysis are pushdown systems (or finite automata with stacks) and recursive state machines (RSMs) [4, 5]. An RSM is a formal model for control flow graphs of programs with recursion. We consider RSMs that consist of modules, one for each method or function that has a number of entry nodes and a number of exit nodes, and each module contains boxes that represent calls to other modules. A special case of RSMs with a single entry and a single exit node for every module (SESE RSMs, aka supergraph in [32]) has also been considered. While pushdown systems and RSMs are linearly equivalent (i.e., there is a linear translation from one model to the other and vice versa), there are two distinct advantages of RSMs. First, the model of RSMs closely resembles the problems of programming languages with explicit function calls and returns, and hence even its special cases such as SESE RSMs has been considered to model many applications. Second, the model of RSMs provides many parameters, such as the number of entry and exit nodes, and the number of modules, and better algorithms can be developed by considering that some parameters are small. Typically the SESE RSMs can model data-independent interprocedural analysis, whereas general RSMs can model data dependency as well. For most applications, the number of entries and exits of a module, usually represents the input parameters of the module.
Semiring framework. We consider a general framework to express computation properties of RSMs where the transitions of an RSM are labeled from a semiring. The labels are referred to as weights. A computation of an RSM executes transitions between configurations consisting of a node (representing the current control state) and a stack of boxes (representing the current calling context). To express properties of interest we need to define how to assign weights to computations, i.e., to accumulate weights along a computation, and how to assign weights to sets of computations, i.e., to combine weights across a set of computations. The weight of a given computation is the semiring product of the weights on the individual transitions of the computation, and the weight of a given set of computations is the semiring plus of the weights of the individual computations in the set. For example, (i) with the Boolean semiring (with semiring product as AND, and semiring plus as OR) we express the reachability property; (ii) with a Dataflow semiring we can express problems from dataflow analysis. One class of such problems is given by the IFDS/IDE framework [32, 35] that considers the propagation of dataflow facts along distributive dataflow functions (note that the IFDS/IDE framework only considers SESE RSMs). Hence the large and important class of dataflow analysis problems that can be expressed in the IFDS/IDE framework can also be expressed in our framework. Pushdown systems with semiring weights have also been extensively considered in the literature [34, 21, 33, 19].
Problems considered. We consider the following basic distance problems.
- •
Configuration distance. Given a set of source configurations and a target configuration, the configuration distance is the weight of the set of computations that start at some source configuration and end in the target configuration. In the configuration distance problem the input is a set of source configurations and the output is the configuration distance to all reachable configurations.
- •
Superconfiguration distance. We also consider a related problem of superconfiguration distance. A superconfiguration represents a sequence of modules, rather than a sequence of invocations. Intuitively, it does not consider the sequence of function calls, but only which functions were invoked. This is a coarser notion than configurations and allows for fast overapproximation. The superconfiguration distance problem is then similar to the configuration distance problem, with configurations replaced by superconfigurations.
- •
Node distance. Given a set of source configurations and a target node, the node distance is the weight of the set of computations that start at some source configuration and end in a configuration with the target node (with arbitrary stack). In the node distance problem the input is a set of source configurations and the output is the node distance to all reachable nodes.
Symbolic representation. A core ingredient for solving distance problems is the symbolic representation of sets of RSM configurations and their manipulation. Given a symbolic representation of the set of initial configurations, we provide a two step approach to solve the distance problems. In step one we compute a symbolic representation of the set of all configurations reachable from the initial configurations. Furthermore, the transitions in the representation are annotated with appropriate semiring weights to capture the various distances described above. In step two we query the computed representation for the required distances. Thus we make the important distinction between the complexity of a one-time preprocessing and the complexity of every individual query.
Concurrent RSMs. While reachability is the most basic property, the study of pushdown systems and RSMs with the semiring framework is the fundamental quantitative extension of the basic problem. An orthogonal fundamental extension is to study the reachability property in a concurrent setting, rather than the sequential setting. However, the reachability problem in concurrent RSMs (equivalently concurrent pushdown systems) is undecidable [31]. A very relevant problem to study in the concurrent setting is to consider context-bounded reachability, where at most context switches are allowed. The context-bounded reachability problem is both decidable [28] and practically relevant [25, 26].
Previous results. Many previous results have been established for pushdown systems, and the translation of RSMs to pushdown systems implies that similar results carry over to RSMs as well. We describe the most relevant previous results with respect to our results. For an RSM , let denote its size, and the maximum number of entries and exits, respectively, and the number of modules. The existing results for weighted pushdown systems over semirings of height [36, 34] along with the linear translation of RSMs to pushdown systems [4] gives an -time algorithm for the configuration and node distance problems for RSMs. The previous results for context-bounded reachability of concurrent pushdown systems [28] applied to concurrent RSMs gives the following complexity bound: , where is the size of the concurrent RSM, is the number of exit nodes, is the number of component RSMs, is the global part of the concurrent RSM, and is the bound on the number of context switches.
Our contributions. Our main contributions are as follows:
Finite-height semirings. We present an algorithm for computing configuration and node distance problems for RSMs over semirings with finite height with running time , where is the number of call nodes. The algorithm we present constructs the symbolic representations from which the distances can be extracted. Thus our algorithm improves the current best-known algorithms by a factor of (Table 1) (also see Remark 3 for details). 2. 2.
Distance queries. Once a symbolic representation is constructed, it can be used for extracting distances. We present algorithms which given a configuration query of size , return the distance in time. Furthermore, we present several improvements for the case when the semiring has a small domain. Finally, we show that when the RSM has a sparse call graph, we can obtain a range of tradeoffs between preprocessing and querying times. Our results on distance queries are summarized in Table 2. 3. 3.
Concurrent RSMs. For the context-bounded reachability of concurrent RSMs we present an algorithm with time bound . Thus our algorithm significantly improves the current best-known algorithm (Table 1). 4. 4.
Experimental results. We experiment with a basic prototype implementation for our algorithms. Our implementation is an explicit (rather than symbolic) one. We compare our implementation with jMoped [1], which is a leading and mature tool for weighted pushdown systems, on several real-world benchmarks coming from the SLAM/SDV project [6, 7]. We consider the basic reachability property (representative for finite-height semirings) for the sequential setting. Our experimental results show that our algorithm provides significant improvements on the benchmarks compared to jMoped.
Technical contribution. The main technical contributions are as follows:
- •
We show how to combine (i) the notion of configuration automata as a symbolic representation structure for sets of configurations, and (ii) entry-to-exit summaries to avoid redundant computations, and obtain an efficient dynamic programming algorithm for various distance problems in RSMs over finite-height semirings.
- •
Configuration and superconfiguration distances are extracted using graph traversal of configuration automata. When the semiring has small domain, we obtain several speedups by exploiting advances in matrix-vector multiplication. Finally, the speedup of superconfiguration distance extraction on sparse RSMs is achieved by devising a Four-Russians type of algorithm, which spends some polynomial preprocessing time in order to allow compressing the query input in blocks of logarithmic length.
All proofs are provided in Appendix 0.A and 0.C.
2 Preliminaries
In this section we present the necessary definitions of recursive state machines (RSMs) where every transition is labeled with a value (or weight) from an appropriate domain (semiring). Then we formally state the problems we study on weighted RSMs.
Semirings. An idempotent semiring is a quintuple , where is a set called the domain, and are elements of , and (the combine operation) and (the extend operation) are binary operators on such that
is an idempotent commutative monoid with neutral element , 2. 2.
is a monoid with neutral element , 3. 3.
distributes over , 4. 4.
is an annihilator for , i.e., for all .
An idempotent semiring has a canonical partial order , defined by
[TABLE]
Furthermore, this partial order is monotonic, i.e., for all
[TABLE]
The height of an idempotent semiring is the length of the longest descending chain in . In the rest of the paper we will only write semiring to mean an idempotent finite-height semiring.
Remark 1
Instead of finite height, the more general descending chain condition would be sufficient for our purposes. This only requires that there are no infinite descending chains in , but there is not necessarily a finite height .
Recursive State Machines (informally). Intuitively, an RSM is a collection of finite automata, called modules, such that computations consist of ordinary local transitions within a module as well as calls to other modules, and returns from other modules. For this, every module has a well-defined interface of entry and exit nodes. Calls to other modules are represented by boxes, which have call and return nodes corresponding to the respective entry and exit nodes of the called module.
Unlike pushdown automata (PDAs), there is no explicit stack manipulation in RSMs. Instead a call stack is maintained implicitly along computations as follows. When a call node of a box is reached, the control is passed to the respective entry node of the called module and the box is pushed onto the top of the stack. When an exit node of a module is reached, a box is popped off from the top of the stack and the control is passed to the corresponding return node of the box. Hence, the stack is a sequence of boxes representing the current calling context and a configuration in a computation of an RSM is a node together with a sequence of boxes.
Recursive State Machines (formally). A recursive state machine (RSM) over a semiring is a tuple , where every module is given by
- •
a finite set of boxes,
- •
a mapping ,
- •
a finite set of nodes, partitioned into
- –
internal nodes ,
- –
entry nodes ,
- –
exit nodes ,
- –
call nodes ,
- –
return nodes ,
- •
a transition relation ,
- •
a weight function , with for every exit node .
We write for , and similarly for , , , , , , , . To measure the size of an RSM we let . A major source of complexity in analysis problems for RSMs is the number of entry and exit nodes of the modules. Throughout the paper we express complexity with respect to the entry bound and the exit bound , i.e., the maximum number of entries and exits, respectively, over all modules. Note that the restriction on the weight function to assign weight to every transition to an exit node is wlog, as any weighted RSM that does not respect this can be turned into an equivalent one that does, with only a constant factor increase in its size.
Stacks. A stack is a sequence of boxes , where the first box denotes the top of the stack; and is the empty stack. The height of is , i.e, the number of boxes it contains. For a box and a stack , we denote with the concatenation of and , i.e., a push of onto the top of .
Configurations and transitions. A configuration of an RSM is a tuple , where is an internal, entry, or return node, and is a stack. For , where for and some , we require that for , as well as . This corresponds to the case where the control is inside the module of node , which was entered via box from module , which was entered via box from module , and so on.
We define a transition relation over configurations and a corresponding weight function , such that with if and only if there exists a transition in with and one of the following holds:
Internal transition: , , and . 2. 2.
Call transition: for some box , , and . 3. 3.
Return transition: for some box and exit node , , and .
Note that we follow the convention that a call immediately enters the called module and a return immediately returns to the calling module. Hence, the node of a configuration can be an internal node, an entry node, or a return node, but not a call node or an exit node.
Computations. A computation of an RSM is a sequence of configurations , such that for every . We say that is a computation from to , of length , and of weight (the empty extend is ). We write to denote that is a computation from to of any length. A computation is called non-decreasing if the stack height of every configuration of is at least as large as that of (in other words, the top stack symbol of is never popped in ). The computation is called same-context if it is non-decreasing, and and have the same stack height. A computation that cannot be extended by any transition is called a halting computation. For a set of computations we define its weight as (the empty combine is ). For a configuration and a set of configurations we denote by the set of all computations from any configuration in to . Here, and for similar purposes below, we will use the convention to write instead of .
Example 1
Figure 1 shows an RSM that consists of two modules and . The modules are mutually recursive, since box of module calls module , and box of module calls module . A possible computation of is
[TABLE]
Distance problems. Given a set of configurations , the set of configurations that are reachable from is
[TABLE]
Instead of mere reachability, we are interested in the following distance metrics that aggregate over computations from using the semiring combine and hence are expressed as semiring values.
- •
Configuration distance. The configuration distance from to is defined as
[TABLE]
That is, we take the combine over the weights of all computations from a configuration in to . Naturally, for configurations not reachable from we have .
- •
Superconfiguration distance. A superstack is a sequence of modules . A stack refines if for all , i.e., the -th box of belongs to the -th module of . A superconfiguration of is a tuple . Let . The superconfiguration distance from to a superconfiguration is defined as
[TABLE]
The superconfiguration distance is only concerned with the sequence of modules that have been used to reach the node , rather than the concrete sequence of boxes as in the configuration distance. This is a coarser notion than configuration and allows for fast overapproximation.
- •
Node and same-context distance. The node distance of a node from is defined as
[TABLE]
where ranges over stacks of . Finally, the same-context node distance of a node in module is defined as
[TABLE]
Intuitively, the node distance minimizes over all possible ways (i.e., stack sequences) to reach a node, and the same-context problem considers nodes in the same module that can be reached with empty stack.
Relevance. We discuss the relevance of the model and the problems we consider in program analysis. A prime application area of our framework is the analysis of procedural programs. Computations in an RSM correspond to the interprocedurally valid paths of a program. The distance values defined above allow to obtain information at different levels of granularity, depending on the requirement for a particular analysis. MEME (multi-entry, multi-exit) RSMs naturally arise in the model checking of procedural programs, where every node represents a combination of control location and data. Checking for reachability, usually of an error state, requires only the simple Boolean semiring. On the other hand, interprocedural data flow analysis problems, like in IFDS/IDE, are usually cast on SESE (single-entry, single-exit) RSMs (the control flow graph of the program) using richer semirings. Our framework captures both of these important applications, and furthermore allows a hybrid approach of modeling program information both in the state space of the RSM as well as in the semiring.
3 Configuration Distance Algorithm
In this section we present an algorithm which takes as input an RSM and a representation of a regular set of configurations , and computes a representation of the set of reachable configurations that allows the extraction of the distance metrics defined above. In Section 3.1 we introduce configuration automata as representation structures for regular sets of configurations. In Section 3.2 we present an algorithm for RSMs over finite-height semirings. The algorithm saturates the input configuration automaton with additional transitions and assigns the correct weights via a dynamic programming approach that gradually relaxes transition weights from an initial overapproximation. We exploit the monotonicity property in idempotent semirings which allows to factor the computation into subproblems, and hence corresponds to the optimal substructure property of dynamic programming. Although a transition might have to be processed multiple times, the finite height of the semiring prevents a transition from being relaxed indefinitely. Here we show that the final configuration automata constructed by our algorithms correctly capture configuration distances. The extraction of distance values is considered in Section 4.
3.1 Configuration Automata
In general, like , the set is infinite. Hence we make use of a representation of regular sets of configurations as the language accepted by configuration automata, defined below. The main feature of a regular set of configurations is its closure under . That is, is also a regular set of configurations and can be represented by a configuration automaton.
Intuition. Every state in a configuration automaton corresponds to a node in the RSM. In order to represent arbitrary regular sets of configurations we must allow the replication of states with the same node. Therefore we annotate every state with a mark (see Remark 2 for details). Transitions are of two types: (i) -transitions pointing from a node to an entry node and labeled with , denoting that a computation reaching entered the module of via entry , and (ii) b-transitions pointing from an entry node to another entry node and labeled with a box , corresponding to a call transition in the module of in the RSM. Reading the labels along a path in the automaton yields a stack.
In addition to the labeling with boxes we label every transition of a configuration automaton with a semiring value. In the final configuration automata constructed by our algorithms, every run generates a configuration and thereby captures a certain subset of computations from the initial set of configurations to . The weight of the run equals the combine over the weight of the computations in . The combine over the weights of all runs in the automaton that generate equals the combine over the weights of all computations from to , i.e., the configuration distance . Since the transitions in a configuration automaton are essentially reversed transitions of the RSM (and the extend operation is not commutative), the weight of a run is given by the extend of the transitions in reversed order.
Configuration automata. Let be a countably infinite set of marks. A configuration automaton for an RSM , also called an -automaton, is a weighted finite automaton , where
- •
is a finite set of states,
- •
(the boxes of ) is the transition alphabet,
- •
is a transition relation, such that every transition has one of the following forms:
- –
b-transition: , where for some , , and ,
- –
-transition: , where for some , and either , or ,
- •
is a set of initial states,
- •
and is a set of final states,
- •
is a weight function that assigns a semiring weight to every transition.
Remark 2 (Marks)
The marks in the states of a configuration automaton are introduced to support the general setting of representing an arbitrary set of configurations, e.g., with stacks that are not even reachable in the RSM. Since every state is tied to an RSM node, the marks allow to have multiple “copies” of the same node in unrelated parts of the automaton. Furthermore, our algorithm (Section 3.2) introduces a fresh mark to recognize when it can safely store entry-to-exit summaries. For the common setting of starting the analysis from the entry nodes of a main module with empty stack, marks are not necessary and can be elided.
Runs and regular sets of configurations. A run of a configuration automaton is a sequence , such that there are states and each is a transition of labeled with . We say that is a run from to , of length , labeled by , and of weight (note that the weights of the transitions are extended in reverse order). We write to denote that is a run from to of any length labeled by and of weight . We will also use the notation without if we are not interested in the weight. The run is accepting if and . A configuration is accepted by if there is an accepting run for some mark , and additionally . We say that two runs are equivalent if they accept the same configuration with the same weight. For technical convenience we consider that for every state with entry node there is an -self-loop with weight .
The set of all configurations accepted by is denoted by . A set of configurations is called regular if there exists an -automaton such that . For a configuration let be the set of all accepting runs of and define the weight that assigns to .
We note that, despite the imposed syntactic restrictions, our definition of configuration automata is most general in the following sense.
Proposition 1
Let be a set of configurations such that their string representations is a regular language. Then there exists a configuration automaton such that .
3.2 Algorithm for Finite-Height Semirings
In the following we present algorithm for computing the set of a regular set of configurations . The algorithm operates on an -automaton with . In the end, it has constructed an -automaton such that . Moreover, the configuration distance from to any configuration can be obtained from the labels of as . A computation is called initialized, if its first configuration is accepted by the initial configuration automation .
Key technical contribution. In this work we consider the configuration distance computation. Using the notion of configuration automata as a symbolic representation structure for regular sets of configurations, the solution of the configuration distance problem has been previously studied in the setting of (weighted) pushdown systems [36, 34, 9]. One of the main algorithmic ideas for the efficient RSM reachability algorithm of [4] is to expand RSM transitions and use entry-to-exit summaries to avoid traversing a module more than once. However, the algorithm in [4] is limited to the node reachability problem. We combine the symbolic representation of configuration automata, along with the summarization principle, to obtain an efficient algorithm for the general configuration distance problem on RSMs.
Intuitive description of . The intuition behind our algorithm is very simple: it performs a forward search in the RSM. In every iteration it picks a frontier node and extends the already discovered computations to with the outgoing transitions from . Depending on the type of outgoing transitions, a new node discovered and added to the frontier can be (a) an internal node by following an internal transition, (b) the entry node of another module by following a call transition, and (c) a return node corresponding to a previously discovered call by following an exit transition.
In summary, the algorithm simply follows interprocedural paths. However, the crux to achieve our complexity is to keep summaries of paths through a module. Whenever we discovered a full (interprocedural) path from an entry to an exit , we keep its weight as an upper bound. Now any subsequently discovered call reaching does not need to continue the search from , but short-circuits to by using the stored summary.
Preprocessing. In order to ease the formal presentation of the algorithm, we consider the following preprocessing on the initial configuration automaton . Let be the set of marks in the initial automaton and a fresh mark.
For every node , we add a new state marked with the fresh mark. Additionally, all these new states are declared initial. 2. 2.
For every initial state such that there is a call transition in , for every state where is an entry node of the same module as , we add a b-transition with weight . 3. 3.
For every state with entry node and every internal or return node in the same module as , we add an -transition with weight .
Essentially the preprocessing a priori adds to all possible states and transitions, so that the algorithm only has to relax those transitions (i.e., without adding them first). Note that the preprocessing only provides for an easier presentation of our algorithm. Indeed, in practice it would be impractical to do the full preprocessing and thus our implementation adds states and transitions to the automaton on the fly.
Technical description of . We present a detailed explanation of the algorithm supporting the formal description given in Algorithm 1. We require that every transition in the input configuration automaton has weight , since the configurations in should not contribute any initial weight to the configuration distance. The algorithm maintains a worklist of weighted transitions either of the form or , and a summary function . Initially, the worklist contains all such transitions where the source state is an initial state in , and is all . In every iteration a transition is extracted from the worklist and processed as follows. Since every accepting run starting with corresponds to a reachable configuration (where varies over different runs), every transition in gives rise to another reachable configuration. More precisely, the run corresponds to a set of computations reaching from the initial set of configurations, and allows to extend these computations by one step. The algorithm incorporates the newly discovered computations by relaxing a transition as follows, illustrated in Figure 2.
If is of the form , then:
- (a)
If is an internal node then the algorithm captures the internal transition by relaxing the transition using the weights and . 2. (b)
If is a call node then the transition is relaxed with the new weight . Furthermore, an -self-loop is stored in the worklist to continue exploration from the called entry node . 3. (c)
If is an exit node then the algorithm relaxes if a smaller computation to has been discovered. Note that for this corresponds to valid entry-to-exit computations from to . If another call to is discovered later, the summary is used to avoid traversing the module again. For the summary does not necessarily correspond to valid entry-to-exit computations (e.g., because node was provided as an initial configuration) and is only stored to avoid redundant work.
For a return transition from the stack has to be non-empty. The algorithm looks for all possible boxes at the top of by going along a -transition from to a state . Then for any , relaxing the transition captures the return transition . Note that here we make use of the fact that the return transition itself has weight . 2. 2.
If is of the form , then:
- (d)
for every exit node in the module of the summary function is used to relax the weight of the transition to the value .
The initial states of are the initial states of together with all states with the fresh mark added in the preprocessing. The final states of are the unmodified final states of .
Example 2
In Figure 3 we illustrate an execution of for the reachability problem in the RSM from Figure 1. The reader can verify that every configuration in the example computation (1) is accepted by a run of the constructed automaton.
Correctness. In the following we outline the correctness of the algorithm. We start with a simple observation about the shape of runs in the constructed configuration automaton.
Proposition 2
For every accepting run there exists an equivalent accepting run that starts with an -transition followed by only b-transitions. Furthermore, all but the first state contain an entry node.
The following three lemmas capture the correctness of . We start with completeness, namely that the distance computed for any configuration is at most the actual distance from the initial set of configurations to . The proof relies on showing that for any initialized computation there is a run accepting such that , and follows an induction on the length .
Lemma 1 (Completeness)
For every configuration we have .
We now turn our attention to soundness, namely that the distance computed for any configuration is at least the actual distance from the initial set of configurations to . The proof is established via a set of interdependent invariants that state that the algorithm maintains sound entry-to-exit summaries and any run in the automaton has a weight that is witnessed by a set of computations.
Lemma 2 (Soundness)
For every configuration we have .
Complexity. Finally, we turn our attention to the complexity analysis of the algorithm, which is done by bounding the number of times the algorithm can perform a relaxation step. The complexity bound is based on the height of the semiring , which implies that every transition can be relaxed at most times. The contribution of the size of the initial automaton in the complexity is captured by the number of initial marks .
Lemma 3 (Complexity)
Let be the number of distinct marks of the initial automaton . Algorithm constructs in time , and has transitions.
We summarize the results of this section in the following theorem.
Theorem 3.1
Let be an RSM over a semiring of height , and an -automaton with marks. Algorithm constructs in time an -automaton with marks, such that for every configuration .
Remark 3 (Comparison with existing work)
We now relate Theorem 3.1 with the existing work for computing configuration distance (often called generalized reachability in the literature) in weighted pushdown systems (WPDS) [36, 34]. For simplicity we assume that the initial automaton is of constant size. A formal description of WPDS is omitted; the reader can refer to [4, 34]. Let be a WPDS where:
is the number of states 2. 2.
is the size of the transition relation 3. 3.
is the number of different pairs such that there is a transition of the form (i.e., from some state with on the top of the stack, the WPDS (i) transitions to state , (ii) swaps and , and (iii) pushes on the top of the stack).
As shown in [34], given a WPDS with weights from a semiring with height , together with a corresponding automaton that encodes configurations of , an automaton can be constructed as a solution to the configuration distance problem for . For ease of presentation we focus on the common case where has constant size (e.g., for encoding an initial configuration of with empty stack). Then the time required to construct is [36, 34].
A direct consequence of [4, Theorem 1] is that an RSM and a configuration automaton can be converted to an equivalent PDS and configuration automaton , and vice versa, such that the following equalities hold:
[TABLE]
where represents the number of modules. Hence, the bound we obtain by translating the input RSM to a WPDS and using the algorithm of [36, 34] is . Our complexity bound on Theorem 3.1 is better by a factor . Moreover, to verify such improvements, we have also constructed a family of dense RSMs, and apply our algorithm, and compare against the jMoped implementation of the existing algorithms, and observe a linear speed-up (see Section 6.1 for details).
The above analysis considers an explicit model, where comprises two parts, a program control-flow graph and the set of all data valuations , where . Hence, . In a symbolic model, where all the data valuations are tracked on the semiring, the input RSM is a factor smaller (i.e., the contribution of the data valuation to ), and . However, now each semiring operation incurs a factor increase in time cost, and the height of the semiring increases by a factor as well, in the worst case. Hence, existing symbolic approaches for PDSs have the same worst-case time complexity as the explicit one, and our comparison applies to these as well. For further discussion on symbolic extensions of our algorithm we refer to Appendix 0.B.
4 Distance Extraction
The algorithm presented in Section 3 takes as input a weighted RSM over a semiring and a configuration automaton that represents a regular set of configurations of , and outputs an automaton that encodes the distance to every configuration . We now discuss the algorithmic problem of extracting such distances from , and present fast algorithms for this problem. First we will consider the general case for RSMs over an arbitrary semiring. Then we present several improvements for special cases, like RSMs over a semiring with small domain, or sparse RSMs. As the correctness of the constructions is straightforward, our attention will be on the complexity.
4.1 Distances over General Semirings
Configuration distances. Given a configuration , , the task is to extract This is done by a dynamic-programming style algorithm, which computes iteratively for every prefix of and state with and , the weight
[TABLE]
Since there are transitions labeled with , every iteration requires time, and the total time for computing is .
Superconfiguration distances. Given a superconfiguration , , the task is to extract To handle such queries, we perform a one-time preprocessing of , so that the transitions are labeled with modules instead of boxes. That is, we create an automaton , initially identical to . Then we add a transition , with being the module of , if there exists a b-transition in . The weight function of is such that the weight of the transition is
[TABLE]
where ranges over transitions of . This construction requires linear time in the number of b-transitions of , i.e., . It is straightforward to see that
[TABLE]
where and range over accepting runs of and respectively, and refines . Then, given a superconfiguration , the extraction of is done similarly to the configuration distance extraction, in time.
Node distances. For node distances, the task is to compute for every node of . This reduces to treating the automaton as a graph , and solving a traditional single-source distance problem, where the source set contains all states with old marks (i.e., old states that appear in the initial automaton ). This requires time for semirings of height . An informal argument for these bounds is to observe that can be itself encoded by a SESE RSM with a single module, where the entry represents the source set of nodes with old marks. Then, running for the corresponding semiring, we obtain a solution to the single-source distance problem in the aforementioned times, as established in Theorem 3.1. Finally, computing same-context node distances requires time in total (i.e., for all nodes). Hence, regardless of the semiring, all node distances can be computed with no overhead, i.e., within the time bounds required for constructing the respective configuration automaton . The following theorem summarizes the complexity bounds that we obtain for the various distance extraction problems.
Theorem 4.1 (Distance extraction)
Let be an RSM over a semiring of height and an -automaton with marks. After preprocessing time
configuration and superconfiguration distance queries are answered in time; 2. 2.
node distance queries are answered in time.
4.2 Distances over Semirings with Small Domain
We now turn our attention to configuration and superconfiguration distance extraction for the case of semirings with small domains . Such semirings express a range of important problems, with reachability being the most well-known (expressed on the Boolean semiring with ). We harness algorithmic advancements on the matrix-vector multiplication problem and Four-Russians-style algorithms to obtain better bounds on the distance extraction problem.
Recall that given a box , the configuration automaton has at most transitions labeled with . Such transitions can be represented by a matrix . Additionally, for every internal node we have one matrix that captures the weights of all transitions of the form . Then, answering a configuration distance query with amounts to evaluating the expression
[TABLE]
where is a row vector of s and size , denotes the transpose, and matrix multiplication is taken over the semiring. The situation is similar in the case of superconfiguration distances, where we have one matrix for each pair of modules , such that invokes .
Evaluating equation (2) from left to right (or right to left) yields a sequence of matrix-vector multiplications. The following two theorems use the results of [24] and [38] on matrix-vector multiplications to provide a speedup on the distance extraction problem when the semiring has constant size .
Theorem 4.2 (Mailman’s speedup [24])
Let be an RSM over a semiring of constant size, and an -automaton with marks. After preprocessing time, configuration and superconfiguration distance queries are answered in time.
Theorem 4.3 (Williams’s speedup [38])
Let be an RSM over a semiring of size , and an -automaton with marks. For any fixed , let and . After preprocessing time, configuration and superconfiguration distance queries are answered in time.
Finally, using the Four-Russians technique for parsing on non-deterministic automata [27], we obtain the following speedup for the case of reachability. We note that although the alphabet is not of constant size (i.e., the number of boxes is generally non-constant) this poses no overhead, as long as comparing two boxes for equality requires constant time (which is the case in the standard RAM model).
Theorem 4.4 (Four-Russians speedup [27])
Let be an RSM over a binary semiring, and an -automaton with marks. After preprocessing time, configuration and superconfiguration distance queries are answered in time.
4.3 A Speedup for Sparse RSMs
We call an RSM sparse if there is a constant bound such that for all modules we have i.e., every module invokes at most other modules (although can have arbitrarily many boxes). Typical call-graphs of most programs are very sparse, e.g., typical call graphs of thousands of nodes have average degree at most eight [8, 30]. Hence, an RSM modeling a typical program is expected to comprise thousands of modules, while the average module invokes a small number of other modules. Although this does not imply a constant bound on the number of invoked modules, such an assumption provides a good theoretical basis for the analysis of typical programs.
Our goal is to provide a speedup for extracting superconfiguration distances w.r.t. a sparse RSM. This is achieved by an additional polynomial-time preprocessing, which then allows to process a distance query in blocks of logarithmic size, and thus offers a speedup of the same order.
Given an RSM of modules and an integer , there exist at most valid module sequences which can appear as a substring in a module sequence which is refined by some stack . Recall the definition of the matrices from Section 4.2. For every valid sequence of modules , we construct a matrix in total time
[TABLE]
where is time require to multiply two matrices (currently , due to [39]).
Observe that as long as , there are polynomially many such sequences , and thus each one can be indexed in time on the standard RAM model. Then a superconfiguration distance query can be answered by grouping in blocks of size each, and for each such block multiply with matrix .
Theorem 4.5 (Sparsity speedup)
Let be a sparse RSM over a semiring of height , and an -automaton with marks. Let , and given an integer parameter , let . After preprocessing time, superconfiguration distance queries are answered in time.
By varying the parameter , Theorem 4.5 provides a tradeoff between preprocessing and query times. Finally, the presented method can be combined with the preprocessing on constant-size semirings of Section 4.2 which leads to a factor improvement on the query times of Theorem 4.2, Theorem 4.3, and Theorem 4.4.
5 Context-Bounded Reachability in Concurrent Recursive State Machines
Context bounding, i.e., limiting the number of context switches considered during state space exploration, is an effective technique for systematic analysis of concurrent programs. The context-bounded reachability problem in concurrent pushdown systems has been studied in [28]. In this section we phrase the context-bounded reachability problem over concurrent RSMs and show that the procedure of [28] using our algorithm together with the results of the previous sections give a better time complexity for the problem. As the section follows closely the well-known framework of concurrent pushdown systems [28], we keep the description brief.
Concurrent RSMs. A concurrent RSM (CRSM) is a collection of RSMs equipped with a finite set of global states used for communication between the RSMs. To this end, the semantics of RSMs is lifted to -configurations of the form , carrying an additional global state . Then, a global configuration of is a tuple , where are configurations of , respectively. The semantics of over global configurations is the standard interleaving semantics, i.e., in each step some RSM modifies the global state and its local configuration, while the local configuration of every other RSM remains unchanged.
Context-bounded reachability. For a positive natural number and a fixed initial global configuration , the -bounded reachability problem asks for all global configurations such that there is a computation from to that switches control between RSMs at most times.
An algorithm for context-bounded reachability. The procedure of [28] for solving the -bounded reachability problem for concurrent pushdown systems (CPDSs) systematically performs operations on the reachable configuration set of every constituent PDS, while capturing all possible interleavings within context switches. The -bounded reachability problem for CRSMs can be solved with an almost identical procedure, replacing the black-box invocations of the PDS reachability algorithm of [36] with our algorithm . However, using our algorithm for each operation, we obtain a complexity improvement over the method of [28].
Key complexity improvement. The key advantage of our algorithm as compared to [28] is as follows: in the algorithm of [28], in each iteration the configuration automata, used to represent the reachable configurations of each component RSM, grows by a cubic term; in contrast, replacing with our algorithm the configuration automata grows only by a linear term in each iteration. This comes from the fact that in our configuration automata every state corresponds to a node of the RSM, whereas such strong correspondence does not hold for the configuration automata of [28].
Theorem 5.1 ()
For a concurrent RSM , and a bound , the procedure of [28, Figure 2] using for performing operations correctly solves the -bounded reachability problem and requires time.
Compared to Theorem 5.1, solving the CRSM problem by translation to a CPDS and using the algorithm of [28] gives the bound . Conversely, solving the CPDS problem by translation to a CRSM and using our algorithm gives an improvement by a factor . We refer to Appendix 0.D for a detailed discussion.
6 Experimental Results
In this section we empirically demonstrate the algorithmic improvements achieved by our RSM-based algorithm over existing PDS-based algorithms on interprocedural program analysis problems. The main goal is to demonstrate the improvements in algorithmic ideas rather than implementation details and engineering aspects. In particular, we implemented our algorithm in a prototype tool and compared its efficiency against jMoped [1], which implements the algorithms of [36, 34] and is a leading tool for the analysis of weighted pushdown systems. In all cases we used an explicit representation of data valuations on the nodes of RSMs, as opposed to a symbolic semiring representation. All experiments were run on a machine with an Intel Xeon CPU and a memory limit of 80GB. We first present our result on a synthetic example to verify the algorithmic improvements on a constructed family, and then present results on real-world benchmarks.
6.1 A Family of Dense RSMs
For our first experiments we constructed a family of dense RSMs that can be scaled in size. The purpose of this experiments is to verify that (i) our algorithm indeed achieves a speedup over the algorithms of [36, 34], and (ii) the speedup scales with the size of the input to ensure that improvements on real-world benchmarks are not due to implementation details, such as the used data types. Let be a single-module RSM that consists of entries and exits, and a single box which makes a recursive call. The transition relation is , i.e., every entry node connects to every call and exit node, and every return node connects to every exit node. Hence . The transition weights are irrelevant, as we will focus on reachability. The initial configuration automaton contains a single entry state. We considered with in the range from 10 to 200. For each RSM, we used the standard translation to a PDS [4], and then applied our tool and jMoped to compute a configuration automaton that represents . Figure 4 depicts the obtained speedup, which scales linearly with . We have also experimented with other similar synthetic RSMs with different means of scaling; and in all cases the obtained speedups have the same qualitative behavior. This confirms the theoretical algorithmic improvements of our algorithm on the synthetic benchmarks.
6.2 Boolean Programs from SLAM/SDV
Benchmarks. For our second experiments we used the collection of Boolean programs distributed as part of the SLAM/SDV project [6, 7]. These programs are the final abstractions in the verification of Windows device drivers, and thus they represent RSMs obtained from real-world programs. From the Boolean programs we obtained RSMs where every node represents a control location together with a valuation of Boolean variables, and call/entry and exit/return nodes model the parameter passing between functions. Thus, the RSMs are naturally multi-entry-multi-exit. Overall we obtained 73 RSMs, which correspond to the largest Boolean programs possible to handle explicitly.
Evaluation. To ensure a fair performance comparison, we applied two preprocessing steps to the benchmark RSMs.
- •
First, to ensure that both tools compute the same result without any potential unnecessary work, we restricted the state space of the RSMs to the interprocedurally reachable states.
- •
Second, to focus on the performance of interprocedural analysis, we eliminated all internal nodes by computing the intraprocedural transitive closure within every RSM module.
The above two transformations ensure preprocessing steps like removal of unreachable states and intraprocedural analysis is already done, and we compare the interprocedural algorithmic aspects of the algorithms. For each RSM, we used the standard translation to a PDS [4], and then applied our tool and jMoped to compute a configuration automaton that represents , where is an initial configuration automaton that contains the entry states of the main module. Table 3 shows for every benchmark the number of RSM transitions (Trans.), their ratio to nodes (D), the runtime for computing the intraprocedural transitive closure (TC), the runtime of jMoped (jMop), the runtime of our tool (Ours), and the speedup our tool achieved over jMoped (SpUp).
Out tool clearly outperforms jMoped on every benchmark, with speedups from 3.94 up to 28.48. The runtimes of our tool range from 0.13 to 33.96 seconds, while the runtimes of jMoped range from 1.03 to 950.82 seconds. Thus, our experiments show that also for real-world examples our algorithm successfully exploits the structure of procedural programs preserved in RSMs. This shows the potential of our algorithm for building program analysis tools.
Note that the benchmark RSMs are quite large, with millions of nodes and transitions, which even a basic implementation of our algorithm handled quite efficiently. Moreover, in our experiments we observed that our tool uses considerably less memory than jMoped. While we set 80GB as the memory limit, the peak memory consumption of jMoped was 72GB, whereas our tool solved all benchmarks with less than 32GB memory.
6.3 Discussion
In our experiments we compared the implementation of our algorithm with jMoped on sequential RSM analysis in an explicit setting. While our algorithm can be made symbolic in a straightforward way (see Appendix 0.B), a symbolic implementation and efficiency for large symbolic domains involve significant engineering efforts. Moreover, the main goal of our work is to compare the algorithmic improvements over the existing approaches, which is best demonstrated in an explicit setting, since in the explicit setting the improvements are algorithmic rather than due to implementation details of symbolic data-structures. Our experimental results show the potential of the new algorithmic ideas, and investigating the applicability of them with a symbolic implementation is a subject of future work.
7 Related Work
Sequential setting. Pushdown systems are very well studied for interprocedural analysis [32, 35, 10]. While the most basic problem is reachability, the weighted pushdown systems (i.e., pushdown systems enriched with semiring) can express several basic dataflow properties, and other relevant problems in interprocedural program analysis [34, 21, 33, 19]. Hence weighted pushdown systems have been studied in many different contexts, such as [35, 32, 16, 12], and tools have been developed, such as Moped [2], jMoped [1], and WALi [3]. The more convenient model of RSMs was introduced and studied in [4], which on the one hand explicitly models the function calls and returns, and on the other hand specifies many natural parameters for algorithmic analysis. In this work, we improve the fundamental algorithms for RSMs over finite-height semirings, as compared to the bounds obtained by translating RSMs to pushdown systems and applying the best-known bounds for the pushdown case. Along with general RSMs, special cases of SESE RSMs have also been considered, such as RSMs with constant treewidth, and only same context queries [11] (i.e., computation of node distances between nodes of the same module). Our results apply to the general case of all RSMs and are not restricted to any special types of queries.
Concurrent setting. The problem of reachability in concurrent pushdown systems (or concurrent RSMs) is again a fundamental problem in program analysis, which allows for the interprocedural analysis in a concurrent setting. However, the problem is undecidable [31]. Motivated by practical problems, where bugs are discovered with few context switches, the context-bounded reachability problem, where there can be at most context switches have been considered for concurrent pushdown systems [28, 25, 26, 22, 20] as well as related models of asynchronous pushdown networks [9]. We present a new algorithm for concurrent pushdown systems and concurrent RSMs which improves the existing complexity when the size of the global component is small.
8 Conclusion
In this work we consider RSMs, a fundamental model for interprocedural analysis, with path properties expressed over finite-height semirings, that can express a large class of properties for program analysis. We present algorithms that improve the previous algorithms, both in the sequential as well as in the concurrent setting. Moreover, along with our algorithm, we present new methods to extract distances from the data-structure (configuration automata) that the algorithm constructs. We present a prototype implementation for sequential RSMs in an explicit setting that provides significant improvements for real-world programs obtained from SLAM/SDV benchmarks. Our results show the potential of the new algorithmic ideas. There are several interesting directions of future work. A symbolic implementation is a direction for future work. Another direction of future work is to explore the new algorithmic ideas in the concurrent setting in practice.
Acknowledgments.
This research was supported in part by the Austrian Science Fund (FWF) under grants S11402-N23, S11407-N23, P23499-N23, and Z211-N23, and by the European Research Council (ERC) under grant 279307.
Appendix 0.A Proofs of Section 3
Proposition 3
At all times, every run in the automaton contains at most one transition switching from the fresh mark to an old mark, and no transition switching from an old mark to the fresh mark. Furthermore, every accepting run has to end in a state with an old mark.
See 1
Proof
First we show that for every initialized computation there is a run
[TABLE]
accepting such that (1) , and (2) was added to the worklist. We proceed by induction on the length of . Part (2) of the induction hypothesis is used to argue that will be extracted with its final weight at some point in the algorithm. We do not explicitly prove this part below since it is an obvious consequence of the steps in the algorithm we refer to in order to prove part (1).
As a base case, if then , , and . Since there must be a -run accepting , and since all transitions in the initial automaton have weight we have .
For the induction step, if there is a configuration such that
[TABLE]
By applying the induction hypothesis to we obtain an accepting run
[TABLE]
such that
[TABLE]
Let be the module of . We split cases according to the type of the last transition of .
Internal transition: , , and . We consider the iteration of the main loop where is extracted from with its final weight. Line 1 relaxes the transition with and hence
[TABLE]
By combining and we obtain the accepting run
[TABLE]
and we derive
[TABLE] 2. 2.
Call transition: for some box , , and .
Again, we consider the iteration of the main loop where is extracted from with its final weight. Line 1 relaxes the transition with and hence
[TABLE]
By combining and we obtain the accepting run
[TABLE]
and we derive
[TABLE] 3. 3.
Return transition: for some and , , and . First note that
[TABLE]
We consider the iteration of the main loop, where is the transition extracted from such that the if-condition in line 1 holds for the last time. Then and since line 1 relaxes with we have
[TABLE]
Now observe that we must have
[TABLE]
We distinguish whether the transition already had weight in the current iteration of processing or not. If yes, then line 1 in the current iteration, if no, then line 1 in the later iteration where is extracted with weight from , relaxes the transition with and hence
[TABLE]
By combining and we obtain the accepting run
[TABLE]
and we derive
[TABLE]
In all cases we obtain the desired run accepting with . Now the claim of the lemma follows, as
[TABLE]
were the inequality holds since, as shown above, the weight of any is bounded from below by the weight of a .∎
Lemma 4 (Soundness invariants)
Algorithm maintains the following loop invariants:
- I1
The function maintains sound summaries, i.e., for every entry and exit of the same module , and every box with , there exists a set of computations such that . 2. I2
For every run , there exists a set of computations , such that . 3. I3
For every run accepting a configuration there exists a set of initialized computations ending in , such that .
Proof
Note that every accepting run has to end in a final state with an old mark since we do not add final states to the automaton. Moreover, we recall that Proposition 3 implies that every run contains at most one transition switching from the fresh mark to an old mark, and no transition switching from an old mark to the fresh mark.
Initially, the invariants hold due to the initialization steps of the algorithm.
Now we need to show that I1 is preserved by updates to the summary function in line 1, and that I2 and I3 are preserved by all possible relaxations performed in line 1, 1, 1, and 1.
Since all cases follow the similar pattern of applying the invariants to sub-runs and combining the obtained sets of computations suitably, we only give the details of one case for I3.
Consider the run
[TABLE]
with and an iteration of the main loop where transition is extracted from and transition is relaxed in line 1 due to a call transition .
We apply I3 to to obtain a set of initialized computations ending in . We extended each computation by to obtain the set of initialized computations ending in . We apply I3 to to obtain a set of initialized computations ending in . We apply I2 to to obtain a set of computations from to . We lift the stack of each computation by to obtain the set of computations from to . Let be the set of computations obtained by combining every computation in with every computation in . Then is the desired set of initialized computations ending in , such that after the relaxation of .∎
See 2
Proof
Conversely to the inequality of equation (4) in the proof of Lemma 1we derive
[TABLE]
were the inequality holds since, by invariant I3 from Lemma 4, the weight of any is bounded from below by the weight of a .∎
See 3
Proof
We bound the number of times that each loop will be executed.
For a given node , line 1 will be executed at most times. We denote by , , the number of internal, call, and exit transitions from node in of module . For every iteration of line 1, the following upper bounds on each inner loop are straightforward:
- (a)
Line 1: times. 2. (b)
Line 1: times. 3. (c)
Line 1: times.
Hence for a given pair the algorithm spends time in the above loops, and summing over all we obtain time. 2. 2.
Given a pair of a state and an exit , Line 1 will hold true at most times. Summing over all possible such pairs, we obtain that Line 1 will be executed times in total. 3. 3.
Finally, line 1 will be executed times in total, since the total number of different edges of the form added in the worklist is bounded by the number of call nodes that were used in Line 1, times the maximum number of entries and exits in any module of the RSM.
The desired result follows.∎
Appendix 0.B Symbolic Extensions
Note that in our framework we deal with explicit RSMs, and our algorithm is also explicit. However, our results carry over to symbolic extensions of RSMs, similar to symbolic PDS [5]. For the symbolic extension we describe the symbolic extension of the model and our algorithm.
- •
Symbolic model. We consider the RSM to represent the control-flow structure of a program (represented explicitly), and the semiring capturing valuations on the variables (represented symbolically). The symbolic semiring operations express the value changes along the program execution. In general the MEME RSMs can represent explicitly a combination of control-flow and some Boolean variables.
- •
Symbolic algorithm. We observe that our algorithm for computation on the semiring uses the basic semiring operations. Hence our algorithm can be straightforwardly made symbolic on the semiring, and is only explicit on the RSM structure.
Appendix 0.C Proofs of Section 5
See 5.1
Proof
The algorithm of [28, Figure 3] for concurrent RSMs basically calls algorithm for sequential RSMs as a black-box procedure. Using our algorithm we obtain an algorithm for concurrent RSM, and the correctness follows from [28] and Theorem 3.1.
We now sketch the complexity analysis. By Theorem 3.1, every execution of the algorithm increases the number of marks of the input configuration automaton by . In the -th iteration of [28, Figure 3] the algorithm will perform a operation on a configuration automaton of marks, which by Theorem 3.1 will require time. Each such iteration will spawn iterations in the inner loop of [28, Figure 3], one for each component RSM (among components) and state of the global component (among possible states). Then the total time is (up to constant factors)
[TABLE]
The desired result follows.∎
Appendix 0.D Comparison with existing work on -bounded reachability
We compare our results for CPDSs, CRSMs, and the related model of asynchronous pushdown networks (APNs).
Comparison for CPDSs. As shown in [28], the -bounded reachability problem in a CPDS can be solved in time
[TABLE]
The bisimulation relation of [4, Theorem 1] between PDSs and RSMs has a straightforward extension to CPDSs and CRSMs. In particular:
Given a CPDS with components and global set , the -bounded reachability problem for can be reduced to the -bounded reachability problem for a CRSM with components and global set . Additionally,
[TABLE] 2. 2.
Given a CRSM with components and global set , the -bounded reachability problem for can be reduced to the -bounded reachability problem for a CPDS with components and global set . Additionally,
[TABLE]
It follows that our approach can solve the -bounded reachability problem of a CPDS in time
[TABLE]
Note that typically is very small, (e.g. in [29], in [37]). However, in real applications the size of is typically smaller than , e.g., when encodes only the synchronization variables among threads. Our algorithm gives an improvement by a factor .
Comparison for CRSMs. The naive upper bound for the -bounded reachability problem of a CRSM obtained using a modification of the standard method of [4] to reduce it to the CPDS case, and then apply the algorithm of [28], is . In contrast, our bound is .
Comparison for APNs. The problem of -bounded reachability has also been studied in the closely related model of asynchronous pushdown networks (APNs) [9]. Informally, the main difference of an APN from a CPDS is that in the former case, the stacks have an additional set of local control states, different from the common global finite control . Hence APNs are more general than CPDS. As shown in [9], the -bounded reachability problem for an APN of components can be solved essentially in time , where is the set of local control states. Since APNs are more general than CPDSs, our previous analysis implies that the algorithm of [9] can be used to solve the -bounded reachability problem for a CRSM in time . This time is incomparable with what we obtain from Theorem 5.1. When the number of entries and the number of components is constant, the algorithm presented in this work has a better complexity.
Appendix 0.E Names of the Boolean programs used as benchmarks
All listed benchmarks belong to the “bebop-itp” collection.
src_7600_general_toaster_kmdf_filter_generic__InvalidReqAccess 2. 2.
src_7600_general_toaster_wdm_filter_devupper__PnpSurpriseRemove 3. 3.
src_7600_general_event_wdm__MarkIrpPending2 4. 4.
src_7600_storage_sfloppy__PagedCode 5. 5.
src_7600_input_kbfiltr_sys__InvalidReqAccess 6. 6.
src_7600_general_toaster_wdm_toastmon__PendedCompletedRequest 7. 7.
src_7600_general_toaster_wdm_filter_devupper__CriticalRegions 8. 8.
src_7600_network_ndis_athwifi_driver_atheros__Irql_SendRcv_Function 9. 9.
src_7600_general_event_wdm__IrpProcessingComplete 10. 10.
src_7600_general_toaster_wdm_func_featured1__WmiForward 11. 11.
src_7600_general_toaster_wdm_func_featured1__IrqlKeWaitForSingleObject 12. 12.
src_7600_network_ndis_athwifi_driver_atheros__Irql_Timer_Function 13. 13.
src_7600_general_toaster_wdm_func_featured1__IrqlIoPassive3 14. 14.
src_7600_general_toaster_wdm_func_featured2__IrqlIoApcLte 15. 15.
src_7600_general_toaster_wdm_func_featured2__WmiForward 16. 16.
src_7600_general_ioctl_kmdf_sys__InitFreeDeviceCreateType4 17. 17.
src_7600_general_toaster_wdm_func_incomplete2__IrqlReturn 18. 18.
src_7600_general_pcidrv_wdm_hw__WmiForward 19. 19.
src_7600_input_moufiltr__InvalidReqAccess 20. 20.
src_7600_general_toaster_wdm_func_featured2__IrqlIoPassive3 21. 21.
src_7600_general_toaster_kmdf_filter_sideband__ControlDeviceInitAPI 22. 22.
src_7600_general_toaster_wdm_func_featured1__IrqlIoApcLte 23. 23.
src_7600_general_toaster_wdm_func_featured2__PnpSurpriseRemove 24. 24.
src_7600_general_amcc5933_sys__KmdfIrql 25. 25.
src_7600_general_toaster_wdm_func_incomplete1__IrpProcessingComplete 26. 26.
src_7600_general_toaster_wdm_filter_devupper__PendedCompletedRequest 27. 27.
src_7600_general_toaster_wdm_func_incomplete1__TargetRelationNeedsRef 28. 28.
src_7600_general_toaster_wdm_func_incomplete2__TargetRelationNeedsRef 29. 29.
src_7600_general_toaster_wdm_func_featured1__PnpSurpriseRemove 30. 30.
src_7600_bth_bthecho_bthcli_sys__RequestFormattedValid 31. 31.
src_7600_general_toaster_wdm_filter_buslower__PendedCompletedRequest 32. 32.
src_7600_input_hiddigi_wacompen__MarkIrpPending 33. 33.
src_7600_general_toaster_wdm_bus__PnpSurpriseRemove 34. 34.
src_7600_general_toaster_kmdf_bus_static__PdoInitFreeDeviceCreateType4 35. 35.
src_7600_network_ndis_athwifi_driver_atheros__Irql_IrqlSetting_Function 36. 36.
src_7600_serial_serenum__IrqlIoApcLte 37. 37.
src_7600_hid_hidusbfx2_sys__SyncReqSend2 38. 38.
src_7600_general_toaster_wdm_toastmon__IrpProcessingComplete 39. 39.
src_7600_general_toaster_kmdf_filter_sideband__ControlDeviceDeleted 40. 40.
src_7600_general_cancel_startio__MarkIrpPending2 41. 41.
src_7600_general_toaster_wdm_func_featured2__IrqlKeSetEvent 42. 42.
src_7600_general_toaster_wdm_func_incomplete2__IrqlKeSetEvent 43. 43.
src_7600_serial_serenum__IrqlIoPassive3 44. 44.
src_7600_general_toaster_wdm_filter_devlower__IrpProcessingComplete 45. 45.
src_7600_general_toaster_wdm_func_featured1__IrqlExAllocatePool 46. 46.
src_7600_general_toaster_wdm_func_featured2__CriticalRegions 47. 47.
src_7600_general_toaster_wdm_bus__MarkIrpPending2 48. 48.
src_7600_storage_class_cdrom__WdfSpinlockRelease 49. 49.
src_7600_general_toaster_wdm_func_featured2__IrqlKeWaitForSingleObject 50. 50.
src_7600_network_ndis_xframeii_sys_ndis6__MandatoryOid 51. 51.
src_7600_general_toaster_kmdf_filter_sideband__DeviceInitAllocate 52. 52.
src_7600_network_ndis_xframeii_sys_ndis6__NdisStallExecution_Delay 53. 53.
src_7600_bth_bthecho_bthsrv_sys__InvalidReqAccessLocal 54. 54.
src_7600_general_toaster_wdm_func_featured2__IrqlKeApcLte2 55. 55.
src_7600_general_toaster_wdm_bus__IrqlIoPassive3 56. 56.
src_7600_network_ndis_xframeii_sys_ndis6__SpinlockRelease 57. 57.
src_7600_storage_filters_diskperf__PnpIrpCompletion 58. 58.
src_7600_storage_filters_diskperf__TargetRelationNeedsRef 59. 59.
src_7600_general_toaster_wdm_func_featured2__IrqlZwPassive 60. 60.
src_7600_hid_hidusbfx2_hidmapper__ForwardedAtBadIrql 61. 61.
src_7600_general_pcidrv_wdm_hw__DoubleCompletion 62. 62.
src_7600_serial_serenum__MarkIrpPending2 63. 63.
src_7600_general_toaster_wdm_func_incomplete2__MarkIrpPending2 64. 64.
src_7600_input_moufiltr__SyncReqSend2 65. 65.
src_7600_general_toaster_kmdf_filter_generic__SyncReqSend2 66. 66.
src_7600_input_kbfiltr_sys__SyncReqSend2 67. 67.
src_7600_input_hiddigi_wacompen__SpinLock 68. 68.
src_7600_general_toaster_wdm_filter_devupper__IrpProcessingComplete 69. 69.
src_7600_general_pcidrv_wdm_hw__IrqlIoPassive1 70. 70.
src_7600_general_toaster_wdm_func_incomplete1__ForwardedAtBadIrql 71. 71.
src_7600_general_toaster_wdm_filter_devlower__ForwardedAtBadIrql 72. 72.
src_7600_general_toaster_wdm_toastmon__ForwardedAtBadIrql 73. 73.
src_7600_general_toaster_wdm_filter_devupper__ForwardedAtBadIrql
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] j Moped 2.0. https://www 7.in.tum.de/tools/jmoped/ .
- 2[2] Moped. http://www 2.informatik.uni-stuttgart.de/fmi/szs/tools/moped/ .
- 3[3] WA Li. https://research.cs.wisc.edu/wpis/wpds/ .
- 4[4] Rajeev Alur, Michael Benedikt, Kousha Etessami, Patrice Godefroid, Thomas W. Reps, and Mihalis Yannakakis. Analysis of Recursive State Machines. ACM Trans. Program. Lang. Syst. , 27(4), 2005.
- 5[5] Rajeev Alur, Ahmed Bouajjani, and Javier Esparza. Model Checking Procedural Programs. In Handbook of Model Checking . Springer, 2016.
- 6[6] Thomas Ball, Ella Bounimova, Vladimir Levin, Rahul Kumar, and Jakob Lichtenberg. The static driver verifier research platform. In CAV , 2010.
- 7[7] Thomas Ball and Sriram K. Rajamani. Bebop: A symbolic model checker for boolean programs. In SPIN , 2000.
- 8[8] Pamela Bhattacharya, Marios Iliofotou, Iulian Neamtiu, and Michalis Faloutsos. Graph-based analysis and prediction for software evolution. In ICSE , 2012.
