Checking Robustness Against Snapshot Isolation
Sidi Mohamed Beillahi, Ahmed Bouajjani, Constantin Enea

TL;DR
This paper presents a method to verify if programs maintain correctness under snapshot isolation by reducing the problem to a state reachability analysis, enabling reuse of existing verification techniques.
Contribution
It introduces a polynomial-time reduction from robustness verification against snapshot isolation to state reachability, facilitating the application of classic verification methods.
Findings
Verification problem is polynomial-time reducible to state reachability.
Reduction enables reuse of classic verification techniques.
Proof technique based on Lipton's reduction theory for robustness.
Abstract
Transactional access to databases is an important abstraction allowing programmers to consider blocks of actions (transactions) as executing in isolation. The strongest consistency model is {\em serializability}, which ensures the atomicity abstraction of transactions executing over a sequentially consistent memory. Since ensuring serializability carries a significant penalty on availability, modern databases provide weaker consistency models, one of the most prominent being \emph{snapshot isolation}. In general, the correctness of a program relying on serializable transactions may be broken when using weaker models. However, certain programs may also be insensitive to consistency relaxations, i.e., all their properties holding under serializability are preserved even when they are executed over a weak consistent database and without additional synchronization. In this paper, we…
| Application | #Transactions | Robustness | Reachability Analysis | CDG Analysis | ||
| PO | PT | PO | PT | |||
| Auction | 4 | ✓ | 70 | 0.3 | 20 | 0.5 |
| Courseware | 5 | ✗ | 59 | 0.37 | na | na |
| FusionTicket | 4 | ✓ | 72 | 0.3 | 34 | 0.5 |
| SmallBank | 5 | ✗ | 48 | 0.28 | na | na |
| TPC-C | 5 | ✓ | 54 | 0.7 | 82 | 3.7 |
| Cassieq-Core | 8 | ✓ | 173 | 0.55 | 104 | 2.9 |
| Currency-Exchange | 6 | ✓ | 88 | 0.35 | 26 | 3.5 |
| PlayList | 14 | ✓ | 99 | 4.63 | 236 | 7.3 |
| RoomStore | 5 | ✓ | 85 | 0.3 | 22 | 0.5 |
| Shopping-Cart | 4 | ✓ | 58 | 0.25 | T | T |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
11institutetext: Université de Paris, IRIF, CNRS, Paris, France, 11email: {beillahi,abou,cenea}@irif.fr
Checking Robustness Against Snapshot Isolation††thanks: This work is supported in part by the European Research Council (ERC) under the Horizon 2020 research and innovation programme (grant agreement No 678177).
Sidi Mohamed Beillahi
Ahmed Bouajjani
Constantin Enea
Abstract
Transactional access to databases is an important abstraction allowing programmers to consider blocks of actions (transactions) as executing in isolation. The strongest consistency model is serializability, which ensures the atomicity abstraction of transactions executing over a sequentially consistent memory. Since ensuring serializability carries a significant penalty on availability, modern databases provide weaker consistency models, one of the most prominent being snapshot isolation. In general, the correctness of a program relying on serializable transactions may be broken when using weaker models. However, certain programs may also be insensitive to consistency relaxations, i.e., all their properties holding under serializability are preserved even when they are executed over a weak consistent database and without additional synchronization.
In this paper, we address the issue of verifying if a given program is robust against snapshot isolation, i.e., all its behaviors are serializable even if it is executed over a database ensuring snapshot isolation. We show that this verification problem is polynomial time reducible to a state reachability problem in transactional programs over a sequentially consistent shared memory. This reduction opens the door to the reuse of the classic verification technology for reasoning about weakly-consistent programs. In particular, we show that it can be used to derive a proof technique based on Lipton’s reduction theory that allows to prove programs robust.
1 Introduction
Transactions simplify concurrent programming by enabling computations on shared data that are isolated from other concurrent computations and resilient to failures. Modern databases provide transactions in various forms corresponding to different tradeoffs between consistency and availability. The strongest consistency level is achieved with serializable transactions [20] whose outcome in concurrent executions is the same as if the transactions were executed atomically in some order. Since serializability carries a significant penalty on availability, modern databases often provide weaker consistency models, one of the most prominent being snapshot isolation (SI) [4]. Then, an important issue is to ensure that the level of consistency needed by a given program coincides with the one that is guaranteed by its infrastructure, i.e., the database it uses. One way to tackle this issue is to investigate the problem of checking robustness of programs against consistency relaxations: Given a program and two consistency models and such that is stronger than , we say that is robust for against if for every two implementations and of and respectively, the set of computations of when running with is the same as its set of computations when running with . This means that is not sensitive to the consistency relaxation from to , and therefore it is possible to reason about the behaviors of assuming that it is running over , and no additional synchronization is required when runs over the weak model such that it maintains all its properties satisfied with .
In this paper, we address the problem of verifying robustness of transactional programs for serializability, against snapshot isolation. Under snapshot isolation, any transaction reads values from a snapshot of the database taken at its start and can commit only if no other committed transaction has written to a location that wrote to, since started. Robustness is a form of program equivalence between two versions of the same program, obtained using two semantics, one more permissive than the other. It ensures that this permissiveness has no effect on the program under consideration. The difficulty in checking robustness is to apprehend the extra behaviors due to the relaxed model w.r.t. the strong model. This requires a priori reasoning about complex order constraints between operations in arbitrarily long computations, which may need maintaining unbounded ordered structures, and make robustness checking hard or even undecidable.
Our first contribution is to show that verifying robustness of transactional programs against snapshot isolation can be reduced in polynomial time to the reachability problem in concurrent programs under sequential consistency (SC). This allows (1) to avoid explicit handling of the snapshots from where transactions read along computations (since this may imply memorizing unbounded information), and (2) to leverage available tools for verifying invariants/reachability problems on concurrent programs. This also implies that the robustness problem is decidable for finite-state programs, PSPACE-complete when the number of sites is fixed, and EXPSPACE-complete otherwise. This is the first result on the decidability and complexity of the problem of verifying robustness in the context of transactional programs. The problem of verifying robustness has been considered in the literature for several models, including eventual and causal consistency [5, 9, 10, 11, 19]. These works provide (over- or under-)approximate analyses for checking robustness, but none of them provides precise (sound and complete) algorithmic verification methods for solving this problem.
Based on this reduction, our second contribution is a proof methodology for establishing robustness which builds on Lipton’s reduction theory [17]. We use the theory of movers to establish whether the relaxations allowed by SI are harmless, i.e., they don’t introduce new behaviors compared to serializability.
We applied the proposed verification techniques on 10 challenging applications extracted from previous work [10, 23, 2, 15, 5, 13, 18]. These techniques were enough for proving or disproving the robustness of these applications.
2 Overview
In this section, we give an overview of our approach for checking robustness against snapshot isolation. While serializability enforces that transactions are atomic and conflicting transactions, i.e., which read or write to a common location, cannot commit concurrently, SI [4] allows that conflicting transactions commit in parallel as long as they don’t contain a write-write conflict, i.e., write on a common location. Moreover, under SI, each transaction reads from a snapshot of the database taken at its start. These relaxations permit the “anomaly” known as Write Skew (WS) shown in Figure 1(a), where an anomaly is a program execution which is allowed by SI, but not by serializability. The execution of Write Skew under SI allows the reads of and to return although this cannot happen under serializability. These values are possible since each transaction is executed locally (starting from the initial snapshot) without observing the writes of the other transaction.
Execution trace. Our notion of program robustness is based on an abstract representation of executions called trace. Informally, an execution trace is a set of events, i.e., accesses to shared variables and transaction begin/commit events, along with several standard dependency relations between events recording the data-flow. The transitive closure of the union of all these dependency relations is called happens-before. An execution is an anomaly if the happens-before of its trace is cyclic. Figure 1(b) shows the happens-before of the Write Skew anomaly. Notice that the happens-before order is cyclic in both cases.
Semantically, every transaction execution involves two main events, the issue and the commit. The issue event corresponds to a sequence of reads and/or writes where the writes are visible only to the current transaction. We interpret it as a single event since a transaction starts with a database snapshot that it updates in isolation, without observing other concurrently executing transactions. The commit event is where the writes are propagated and made visible to all processes. Under serializability, the two events coincide, i.e., they are adjacent in the execution. Under SI, this is not the case and in between the issue and the commit of the same transaction, we may have issue/commit events from concurrent transactions. When a transaction commit does not occur immediately after its issue, we say that the underlying transaction is delayed. For example, the following execution of WS corresponds to the happens-before cycle in Figure 1(b) where the write to was committed after finished, hence, was delayed:
[TABLE]
Above, stands for starting a new transaction by process , ld represents read (load) actions, while isu denotes write actions that are visible only to the current transaction (not yet committed). The writes performed during become visible to all processes once the commit event takes place.
Reducing robustness to SC reachability. The above SI execution can be mimicked by an execution of the same program under serializability modulo an instrumentation that simulates the delayed transaction. The local writes in the issue event are simulated by writes to auxiliary registers and the commit event is replaced by copying the values from the auxiliary registers to the shared variables (actually, it is not necessary to simulate the commit event; we include it here for presentation reasons). The auxiliary registers are visible only to the delayed transaction. In order that the execution be an anomaly (i.e., not possible under serializability without the instrumentation) it is required that the issue and the commit events of the delayed transaction are linked by a chain of happens-before dependencies. For instance, the above execution for WS can be simulated by:
[TABLE]
The write to was delayed by storing the value in the auxiliary register and the happens-before chain exists because the read on that was done by is conflicting with the write on from and the read on by is conflicting with the write of in the simulation of ’s commit event. On the other hand, the following execution of Write-Skew without the read on in :
[TABLE]
misses the conflict (happens-before dependency) between the issue event of and . Therefore, the events of can be reordered to the left of and obtain an equivalent execution where occurs immediately after . In this case, is not anymore delayed and this execution is possible under serializability (without the instrumentation).
If the number of transactions to be delayed in order to expose an anomaly is unbounded, the instrumentation described above may need an unbounded number of auxiliary registers. This would make the verification problem hard or even undecidable. However, we show that it is actually enough to delay a single transaction, i.e., a program admits an anomaly under SI iff it admits an anomaly containing a single delayed transaction. This result implies that the number of auxiliary registers needed by the instrumentation is bounded by the number of program variables, and that checking robustness against SI can be reduced in linear time to a reachability problem under serializability (the reachability problem encodes the existence of the chain of happens-before dependencies mentioned above). The proof of this reduction relies on a non-trivial characterization of anomalies.
Proving robustness using commutativity dependency graphs. Based on the reduction above, we also devise an approximated method for checking robustness based on the concept of mover in Lipton’s reduction theory [17]. An event is a left (resp., right) mover if it commutes to the left (resp., right) of another event (from a different process) while preserving the computation. We use the notion of mover to characterize happens-before dependencies between transactions. Roughly, there exists a happens-before dependency between two transactions in some execution if one doesn’t commute to the left/right of the other one. We define a commutativity dependency graph which summarizes the happens-before dependencies in all executions of a given program between transactions as they appear in the program, transactions where the writes of are
deactivated (i.e., their effects are not visible outside the transaction), and transactions where the reads of obtain non-deterministic values. The transactions are used to simulate issue events of delayed transactions (where writes are not yet visible) while the transactions are used to simulate commit events of delayed transactions (which only write to the shared memory). Two transactions and are linked by an edge iff cannot move to the right of (or cannot move to the left of ), or if they are related by the program order (i.e., issued in some order in the same process). Then a program is robust if for every transaction , this graph doesn’t contain a path from to formed of transactions that don’t write to a variable that writes to (the latter condition is enforced by SI since two concurrent transactions cannot commit at the same time when they write to a common variable). For example, Figure 2 shows the commutativity dependency graph of the modified WS program where the read of is removed from . The fact that it doesn’t contain any path like above implies that it is robust.
3 Programs
A program is parallel composition of processes distinguished using a set of identifiers . Each process is a sequence of transactions and each transaction is a sequence of labeled instructions. Each transaction starts with a instruction and finishes with a instruction. Each other instruction is either an assignment to a process-local register from a set or to a shared variable from a set , or an statement. The read/write assignments use values from a data domain . An assignment to a register is called a read of the shared-variable and an assignment to a shared variable is called a write to ( is an expression over registers whose syntax we leave unspecified since it is irrelevant for our development). The blocks the process if the Boolean expression over registers is false. They are used to model conditionals as usual. We use statements to model an arbitrary control-flow where the same label can be assigned to multiple instructions and multiple statements can direct the control to the same label which allows to mimic imperative constructs like loops and conditionals. To simplify the technical exposition, our syntax includes simple read/write instructions. However, our results apply as well to instructions that include SQL (select/update) queries. The experiments reported in Section 7 consider programs with SQL based transactions.
The semantics of a program under SI is defined as follows. The shared variables are stored in a central memory and each process keeps a replicated copy of the central memory. A process starts a transaction by discarding its local copy and fetching the values of the shared variables from the central memory. When a process commits a transaction, it merges its local copy of the shared variables with the one stored in the central memory in order to make its updates visible to all processes. During the execution of a transaction, the process stores the writes to shared variables only in its local copy and reads only from its local copy. When a process merges its local copy with the centralized one, it is required that there were no concurrent updates that occurred after the last fetch from the central memory to a shared variable that was updated by the current transaction. Otherwise, the transaction is aborted and its effects discarded.
More precisely, the semantics of a program under SI is defined as a labeled transition system where transactions are labeled by the set of events
[TABLE]
where begin and com label transitions corresponding to the start and the commit of a transaction, respectively. isu and ld label transitions corresponding to writing, resp., reading, a shared variable during some transaction.
An execution of program , under snapshot isolation, is a sequence of events corresponding to a run of . The set of executions of under SI is denoted by .
4 Robustness Against SI
A trace abstracts the order in which shared-variables are accessed inside a transaction and the order between transactions accessing different variables. Formally, the trace of an execution is obtained by (1) replacing each sub-sequence of transitions in corresponding to the same transaction, but excluding the com transition, with a single “macro-event” , and (2) adding several standard relations between these macro-events and commit events to record the data-flow in , e.g. which transaction wrote the value read by another transaction. The sequence of and events obtained in the first step is called a summary of . We say that a transaction in performs an external read of a variable if contains an event which is not preceded by a write on of , i.e., an event . Also, we say that a transaction writes a variable if contains an event , for some .
The trace of an execution consists of the summary of along with the program order , which relates any two issue events and that occur in this order in , write-read relation (also called read-from), which relates any two events and that occur in this order in such that performs an external read of , and is the last event in before that writes to (to mark the variable , we may use ), the write-write order (also called store-order), which relates any two store events and that occur in this order in and write to the same variable (to mark the variable , we may use ), the read-write relation (also called conflict), which relates any two events and that occur in this order in such that reads a value that is overwritten by , and the same-transaction relation , which relates the issue event with the commit event of the same transaction. The read-write relation is formally defined as (we use ; to denote the standard composition of relations) and . If a transaction reads the initial value of then relates to of any other transaction which writes to (i.e., ) (note that in the above relations, and might designate the same process).
Since we reason about only one trace at a time, to simplify the writing, we may say that a trace is simply a sequence as above, keeping the relations , , , , and implicit. The set of traces of executions of a program under SI is denoted by .
Serializability semantics. The semantics of a program under serializability can be defined using a transition system where the configurations keep a single shared-variable valuation (accessed by all processes) with the standard interpretation of read and write statements. Each transaction executes in isolation. Alternatively, the serializability semantics can be defined as a restriction of to the set of executions where each transaction is immediately delivered when it starts, i.e., the start and commit time of transaction coincide . Such executions are called serializable and the set of serializable executions of a program is denoted by . The latter definition is easier to reason about when relating executions under snapshot isolation and serializability, respectively.
Serializable trace. A trace is called serializable if it is the trace of a serializable execution. Let denote the set of serializable traces. Given a serializable trace we have that every event in is immediately followed by the corresponding event.
Happens before order. Since multiple executions may have the same trace, it is possible that an execution produced by snapshot isolation has a serializable trace even though events may not be immediately followed by actions. However, would be equivalent, up to reordering of “independent” (or commutative) transitions, to a serializable execution. To check whether the trace of an execution is serializable, we introduce the happens-before relation on the events of a given trace as the transitive closure of the union of all the relations in the trace, i.e., .
Finally, the happens-before relation between events is extended to transactions as follows: a transaction happens-before another transaction if the trace contains an event of transaction which happens-before an event of . The happens-before relation between transactions is denoted by and called transactional happens-before. The following characterizes serializable traces.
Theorem 4.1 ([1, 22])
A trace is serializable iff is acyclic.
A program is called robust if it produces the same set of traces as the serializability semantics.
Definition 1
A program is called robust against SI iff .
Since , the problem of checking robustness of a program is reduced to checking whether there exists a trace
5 Reducing Robustness against SI to SC Reachability
A trace which is not serializable must contain at least an issue and a commit event of the same transaction that don’t occur one after the other even after reordering of “independent” events. Thus, there must exist an event that occur between the two which is related to both events via the happens-before relation, forbidding the issue and commit to be adjacent. Otherwise, we can build another trace with the same happens-before where events are reordered such that the issue is immediately followed by the corresponding commit. The latter is a serializable trace which contradicts the initial assumption. We define a program instrumentation which mimics the delay of transactions by doing the writes on auxiliary variables which are not visible to other transactions. After the delay of a transaction, we track happens-before dependencies until we execute a transaction that does a “read” on one of the variables that the delayed transaction writes to (this would expose a read-write dependency to the commit event of the delayed transaction). While tracking happens-before dependencies we cannot execute a transaction that writes to a variable that the delayed transaction writes to since SI forbids write-write conflicts between concurrent transactions.
Concretely, given a program , we define an instrumentation of such that is not robust against SI iff the instrumentation reaches an error state under serializability. The instrumentation uses auxiliary variables in order to simulate a single delayed transaction which we prove that it is enough for deciding robustness. Let be the issue event of the only delayed transaction. The process that delayed is called the Attacker. When the attacker finishes executing the delayed transaction it stops. Other processes that execute transactions afterwards are called Happens-Before Helpers.
The instrumentation uses two copies of the set of shared variables in the original program to simulate the delayed transaction. We use primed variables to denote the second copy. Thus, when a process becomes the attacker, it will only write to the second copy that is not visible to other processes including the happens-before helpers. The writes made by the other processes including the happens-before helpers are made visible to all processes.
When the attacker delays the transaction , it keeps track of the variables it accessed, in particular, it stores the name of one of the variables it writes to, , it tracks every variable that it reads from and every variable that it writes to. When the attacker finishes executing , and some other process wants to execute some other transaction, the underlying transaction must contain a write to a variable that the attacker reads from. Also, the underlying transaction must not write to a variable that writes to. We say that this process has joined happens-before helpers through the underlying transaction. While executing this transaction, we keep track of each variable that was accessed and the type of operation, wheather it is a read or write. Afterward, in order for some other transaction to “join” the happens-before path, it must not write to a variable that writes to so it does not violate the fact that SI forbids write-write conflicts, and it has to satisfy one of the following conditions in order to ensure the continuity of the happens-before dependencies: (1) the transaction is issued by a process that has already another transaction in the happens-before dependency (program order dependency), (2) the transaction is reading from a shared variable that was updated by a previous transaction in the happens-before dependency (write-read dependency), (3) the transaction writes to a shared variable that was read by a previous transaction in the happens-before dependency (read-write dependency), or (4) the transaction writes to a shared variable that was updated by a previous transaction in the happens-before dependency (write-write dependency). We introduce a flag for each shared variable to mark the fact that the variable was read or written by a previous transaction.
Processes continue executing transactions as part of the chain of happens-before dependencies, until a transaction does a read on the variable that wrote to. In this case, we reached an error state which signals that we found a cycle in the transactional happens-before relation.
The instrumentation uses four varieties of flags: a) global flags (i.e., , , ), b) flags local to a process (i.e., and ), and c) flags per shared variable (i.e., , , and ). We will explain the meaning of these flags along with the instrumentation. At the start of the execution, all flags are initialized to null ().
Whether a process is an attacker or happens-before helper is not enforced syntactically by the instrumentation. It is set non-deterministically during the execution using some additional process-local flags. Each process chooses to set to at most one of the flags and , implying that the process becomes an attacker or happens-before helper, respectively. At most one process can be an attacker, i.e., set to . In the following, we detail the instrumentation for read and write instructions of the attacker and happens-before helpers.
5.1 Instrumentation of the Attacker
Figure 3 lists the instrumentation of the write and read instructions of the attacker. Each process passes through an initial phase where it executes transactions that are visible immediately to all the other processes (i.e., they are not delayed), and then non-deterministically it can choose to delay a transaction at which point it sets the flag to . During the delayed transaction it chooses non-deterministically a write instruction to a variable and stores the name of this variable in the flag (line (5)). The values written during the delayed transaction are stored in the primed variables and are visible only to the current transaction, in case the transaction reads its own writes. For example, given a variable , all writes to from the original program are transformed into writes to the primed version (line (3)). Each time, the attacker writes to , it sets the flag . This flag is used later by transactions from happens-before helpers to avoid writing to variables that the delayed transaction writes to.
A read on a variable, , in the delayed transaction takes her value from the primed version, . In every read in the delayed transaction, we set the flag to ld (line (1)) to be used latter in order for a process to join the happens-before helpers. Afterward, the attacker starts the happens-before path, and it sets the variable to (line (2)) to mark the start of the happens. When the flag is set to the attacker stops executing new transactions.
5.2 Instrumentation of the Happens-Before Helpers
The remaining processes, which are not the attacker, can become a happens-before helper. Figure 4 lists the instrumentation of write and read instructions of a happens-before helper. In a first phase, each process executes the original code until the flag is set to by the attacker. This flag signals the “creation” of the secondary copy of the shared-variables, which can be observed only by the attacker. At this point, the flag is set to , and the happens-before helper process chooses non-deterministically a first transaction through which it wants to join the set of happens-before helpers, i.e., continue the happens-before dependency created by the existing happens-before helpers. When a process chooses a transaction, it makes a pledge (while executing the instruction) that during this transaction it will either read from a variable that was written to by another happens-before helper, write to a variable that was accessed (read or written) by another happens-before helper, or write to a variable that was read from in the delayed transaction. When the pledge is met, the process sets the flag to true (lines (7) and (11)). The execution is blocked if a process does not keep its pledge (i.e., the flag is null) at the end of the transaction. Note that the first process to join the happens-before helper has to execute a transaction which writes to a variable that was read from in the delayed transaction since this is the only way to build a happens-before between , and the delayed transaction ( is not possible since is not from the attacker, is not possible since does not see the writes of the delayed transaction, and is not possible since cannot write to a variable that the delayed transaction writes to). We use a flag for each variable to record the type (read ld or write st) of the last access made by a happens-before helper (lines (8) and (10)). During the execution of a transaction that is part of the happens-before dependency, we must ensure that the transaction does not write to variable where is set to . Otherwise, the execution is blocked (line 9).
The happens-before helpers continue executing their instructions, until one of them reads from the shared variable whose name was stored in . This establishes a happens-before dependency between the delayed transaction and a “fictitious” store event corresponding to the delayed transaction that could be executed just after this read of . The execution doesn’t have to contain this store event explicitly since it is always enabled. Therefore, at the end of every transaction, the instrumentation checks whether the transaction read . If it is the case, then the execution stops and goes to an error state to indicate that this is a robustness violation. Notice that after the attacker stops, the only processes that are executing transactions are happens-before helpers, which is justified since when a process is not from a happens-before helper it implies that we cannot construct a happens-before dependency between a transaction of this process and the delayed transaction which means that the two transactions commute which in turn implies that this process’s transactions can be executed before executing the delayed transaction of the attacker.
5.3 Correctness
The role of a process in an execution is chosen non-deterministically at runtime. Therefore, the final instrumentation of a given program , denoted by , is obtained by replacing each labeled instruction with the concatenation of the instrumentations corresponding to the attacker and the happens-before helpers, i.e.,
The following theorem states the correctness of the instrumentation.
Theorem 5.1
is not robust against SI iff reaches the error state.**
If a program is not robust, this implies that the execution of the program under SI results in a trace where the happens-before is cyclic. Which is possible only if the program contains at least one delayed transaction. In the proof of this theorem, we show that is sufficient to search for executions that contain a single delayed transaction.
Notice that in the instrumentation of the attacker, the delayed transaction must contain a read and write instructions on different variables. Also, the transactions of the happens-before helpers must not contain a write to a variable that the delayed transaction writes to. The following corollary states the complexity of checking robustness for finite-state programs 111Programs with a bounded number of variables taking values from a bounded domain. against snapshot isolation. It is a direct consequence of Theorem 5.1 and of previous results concerning the reachability problem in concurrent programs running over a sequentially-consistent memory, with a fixed [16] or parametric number of processes [21].
Corollary 1
*Checking robustness of finite-state programs against snapshot isolation is PSPACE-complete when the number of processes is fixed and
EXPSPACE-complete, otherwise.*
The instrumentation can be extended to SQL (select/update) queries where a statement may include expressions over a finite/infinite set of variables, e.g., by manipulating a set of flags x.event for each statement instead of only one.
6 Proving Program Robustness
As a more pragmatic alternative to the reduction in the previous section, we define an approximated method for proving robustness which is inspired by Lipton’s reduction theory [17].
Movers. Given an execution of a program under serializability (where each event corresponds to executing an entire transaction), we say that the event moves right (resp., left) in if (resp., ) is also a valid execution of , the process of is different from the process of (resp., ), and both executions reach to the same end state . For an execution , let denote the transaction that generated the event . A transaction of a program is a right (resp., left) mover if for all executions of under serializability, the event with moves right (resp., left) in .
If a transaction is not a right mover, then there must exist an execution of under serializability and an event of with that does not move right. This implies that there must exist another of which caused to not be a right mover. Since and do not commute, then this must be because of either a write-read, write-write, or a read-write dependency. If , we say that is not a right mover because of and some dependency that is either write-read, write-write, or read-write. Notice that when is not a right mover because of then is not a left mover because of .
We define as a binary relation between transactions such that when is not a right mover because of and a write-read dependency. We define the relations and corresponding to write-write and read-write dependencies in a similar way.
Read/Write-free transactions. Given a transaction , we define as a variation of where all the reads from shared variables are replaced with non-deterministic reads, i.e., statements are replaced with where denotes non-deterministic choice. We also define as a variation of where all the writes to shared variables in are disabled. Intuitively, recalling the reduction to SC reachability in Section 5, simulates the delay of a transaction by the Attacker, i.e., the writes are not made visible to other processes, and approximates the commit of the delayed transaction which only applies a set of writes.
Commutativity dependency graph. Given a program , we define the commutativity dependency graph as a graph where vertices represent transactions and their read/write-free variations. Two vertices which correspond to the original transactions in are related by a program order edge, if they belong to the same process. The other edges in this graph represent the “non-mover” relations , , and .
Given a program , we say that the commutativity dependency graph of contains a non-mover cycle if there exist a set of transactions of such that the following hold:
- (a)
where is the write-free variation of and does not write to a variable that writes to; 2. (b)
for all , , and do not write to a shared variable that writes to; 3. (c)
where is the read-free variation of and does not write to a variable that writes to.
A non-mover cycle approximates an execution of the instrumentation defined in Section 5 in between the moment that the Attacker delays a transaction (which here corresponds to the write-free variation ) and the moment where gets committed (the read-free variation ).
The following theorem shows that the acyclicity of the commutativity dependency graph of a program implies the robustness of the program. Actually, the notion of robustness in this theorem relies on a slightly different notion of trace where store-order and write-order dependencies take into account values, i.e., store-order relates only writes writing different values and the write-order relates a read to the oldest write (w.r.t. execution order) writing its value. This relaxation helps in avoiding some harmless robustness violations due to for instance, two transactions writing the same value to some variable.
Theorem 6.1
For a program , if the commutativity dependency graph of does not contain non-mover cycles, then is robust.
7 Experiments
To test the applicability of our robustness checking algorithms, we have considered a benchmark of 10 applications extracted from the literature related to weakly consistent databases in general. A first set of applications are open source projects that were implemented to be run over the Cassandra database, extracted from [10]. The second set of applications is composed of: TPC-C [23], an on-line transaction processing benchmark widely used in the database community, SmallBank, a simplified representation of a banking application [2], FusionTicket, a movie ticketing application [15], Auction, an online auction application [5], and Courseware, a course registration service extracted from [13, 18].
A first experiment concerns the reduction of robustness checking to SC reachability. For each application, we have constructed a client (i.e., a program composed of transactions defined within that application) with a fixed number of processes (at most 3) and a fixed number of transactions (between 3 and 7 transactions per process). We have encoded the instrumentation of this client, defined in Section 5, in the Boogie programming language [3] and used the Civl verifier [14] in order to check whether the assertions introduced by the instrumentation are violated (which would represent a robustness violation). Note that since clients are of fixed size, this requires no additional assertions/invariants (it is an instance of bounded model checking). The results are reported in Table 1. We have found two of the applications, Courseware and SmallBank, to not be robust against snapshot isolation. The violation in Courseware is caused by transactions RemoveCourse and EnrollStudent that execute concurrently, RemoveCourse removing a course that has no registered student and EnrollStudent registering a new student to the same course. We get an invalid state where a student is registered for a course that was removed. SmallBank’s violation contains transactions Balance, TransactSaving, and WriteCheck. One process executes WriteCheck where it withdraws an amount from the checking account after checking that the sum of the checking and savings accounts is bigger than this amount. Concurrently, a second process executes TransactSaving where it withdraws an amount from the saving account after checking that it is smaller than the amount in the savings account. Afterwards, the second process checks the contents of both the checking and saving accounts. We get an invalid state where the sum of the checking and savings accounts is negative.
Since in the first experiment we consider fixed clients, the lack of assertion violations doesn’t imply that the application is robust (this instantiation of our reduction can only be used to reveal robustness violations). Thus, a second experiment concerns the robustness proof method based on commutativity dependency graphs (Section 6). For the applications that were not identified as non-robust by the previous method, we have used Civl to construct their commutativity dependency graphs, i.e., identify the “non-mover” relations , , and (Civl allows to check whether some code fragment is a left/right mover). In all cases, the graph didn’t contain non-mover cycles, which allows to conclude that the applications are robust.
The experiments show that our results can be used for finding violations and proving robustness, and that they apply to a large set of interesting examples. Note that the reduction to SC and the proof method based on commutativity dependency graphs are valid for programs with SQL (select/update) queries.
8 Related Work
Decidability and complexity of robustness has been investigated in the context of relaxed memory models such as TSO and Power [8, 6, 12]. Our work borrows some high-level principles from [6] which addresses the robustness against TSO. We reuse the high-level methodology of characterizing minimal violations according to some measure and defining reductions to SC reachability using a program instrumentation. Instantiating this methodology in our context is however very different, several fundamental differences being:
- –
SI and TSO admit different sets of relaxations and SI is a model of transactional databases. 2. –
We use a different notion of measure: the measure in [6] counts the number of events between a write issue and a write commit while our notion of measure counts the number of delayed transactions. This is a first reason for which the proof techniques in [6] don’t extend to our context. 3. –
Transactions induce more complex traces: two transactions might be related by several dependency relations since each transaction may contain multiple reads and writes to different locations. In TSO, each action is a read or a write to some location, and two events are related by a single dependency relation. Also, the number of dependencies between two transactions depends on the execution since the set of reads/writes in a transaction evolves dynamically.
Other works [8, 12] define decision procedures which are based on the theory of regular languages and do not extend to infinite-state programs like in our case.
As far as we know, our work provides the first results concerning the decidability and the complexity of robustness checking in the context of transactions. The existing work on the verification of robustness for transactional programs provide either over- or under-approximate analyses. Our commutativity dependency graphs are similar to the static dependency graphs used in [5, 9, 10, 11], but they are more precise, i.e., reducing the number of false alarms. The static dependency graphs record happens-before dependencies between transactions based on a syntactic approximation of the variables accessed by a transaction. For example, our techniques are able to prove that the program in Figure 5 is robust, while this is not possible using static dependency graphs. The latter would contain a dependency from transaction to and one from to just because syntactically, each of the two transactions reads both variables and may write to one of them. Our dependency graphs take into account the semantics of these transactions and do not include this happens-before cycle. Other over- and under-approximate analyses have been proposed in [19]. They are based on encoding executions into first order logic, bounded-model checking for the under-approximate analysis, and a sound check for proving a cut-off bound on the size of the happens-before cycles possible in the executions of a program, for the over-approximate analysis. The latter is strictly less precise than our method based on commutativity dependency graphs. For instance, extending the TPC-C application with additional transactions will make the method in [19] fail while our method will succeed in proving robustness (the three transactions are for adding a new product, adding a new warehouse based on the number of customers and warehouses, and adding a new customer, respectively).
Finally, the idea of using Lipton’s reduction theory for checking robustness has been also used in the context of the TSO memory model [7], but the techniques are completely different, e.g., the TSO technique considers each update in isolation and doesn’t consider non-mover cycles like in our commutativity dependency graphs.
Appendix 0.A Programs
0.A.1 Program Syntax
We consider a simple programming language grammar which is defined in Figure 6. A program is parallel composition of processes distinguished using a set of identifiers . Each process is a sequence of transactions and each transaction is a sequence of labeled instructions. Each transaction starts with a instruction and finishes with a instruction. Each other instruction is either an assignment to a process-local register from a set or to a shared variable from a set , or an statement. The read/write assignments use values from a data domain . An assignment to a register is called a read of and an assignment to a shared variable is called a write to ( is an expression over registers whose syntax we leave unspecified since it is irrelevant for our development). The blocks the process if the Boolean expression over registers is false.
0.A.2 Program Semantics Under SnapShot Isolation
The semantics of a program under SI is defined as follows. The shared variables are stored in a central memory and each process keeps a replicated copy of the central memory. A process starts a transaction by discarding its local copy and fetching the values of the shared variables from the central memory. When a process commits a transaction, it merges its local copy of the shared variables with the one stored in the central memory in order to make its updates visible to all processes. During the execution of a transaction, the process stores the writes to shared variables only in its local copy and reads only from its local copy. When a process merges its local copy with the centralized one, it is required that there were no concurrent updates that occurred after the last fetch from the central memory to a shared variable that was updated by the current transaction. Otherwise, the transaction is aborted and its effects discarded.
Thus, a program configuration is a tuple where associates a local state in to each process in , stores the largest timestamp for each shared variable, and holds the global valuation of shared variables. A local state is a tuple where is the program counter, i.e., the label of the next instruction to be executed, is the local valuation of the shared variables, is a local log which marks shared variables which were updated in a transaction, and is the valuation of the local registers. For a local state , we use to denote the program counter component of , and similarly for all the other components of . Given a transaction , we use to denote the start time of transaction and to denote the commit time of . Before merging store with Log, after executing a transaction , we check that for every variable that writes to () we have that (i.e., there were no concurrent write to ). Then, we store the value of for every variable that writes to () in . Also, for every variable that writes to, we store in .
Then, the semantics of a program under snapshot isolation consistency model is defined using a labeled transition system (LTS) where is the set of program configurations, is a set of transition labels called events, is the initial program configuration, and is the transition relation. The set of events under SI is defined as follow.
[TABLE]
where begin and com label transitions corresponding to the start and the commit of a transaction, respectively. isu and ld label transitions corresponding to writing, resp., reading, a shared variable during some transaction.
The transition relation is defined in Figure 7. For readability, the events labeling a transition are written on top of . A begin transition resets the local valuation of the shared variables and fetches their values from the central memory. A com transition applies the writes performed in a transaction to the central memory by merging the contents of the local copy store with the central memory Log. An ld transition reads the value of a shared-variable from the local copy store while an isu transition applies a new write to the local copy store.
An execution of program , under snapshot isolation, is a sequence of events labeling the transitions, such that there exists a sequence of configurations where is the initial configuration before starts execution and is a valid transition for . of transitions. The set of executions of under SI is denoted by .
Appendix 0.B Proofs of Section 5: Characterizations of SI Trace-Robustness
In this section, we describe the proof of Theorem 5.1. We first reduce the robustness of a program against SI to the existence of some execution trace that has a specific shape. We call a trace , an anomaly. Then, we show that of the anomaly of a particular shape is equivalent to an execution of the instrumented program reaching an error state.
First, we give an auxiliary lemma about the happens-before relation (between events). In the remaining of this section, we use to denote the happens-before without the transitive closure, i.e., . To decide if two events in a trace are “independent” (or commutative) we use the information about the existence of a happens-before relation between the events. If two events are not related by happens-before then they can be swapped while preserving the same happens-before. Thus, we extend the happens-before relation to obtain the happens-before through relation as follows:
Definition 2 ([6])
Let be a trace where and are events (or atomic macro events), and , , and are sequences of events (or atomic macro events) under a semantics . We say that happens-before through if there is a non empty sub-sequence of that satisfies:
[TABLE]
where , .
We deduce from the definition of serializable traces, that a anomaly trace must contain at least an issue and a commit events of the same transaction that are related via the happens-before through relation. Otherwise, we can build another trace with the same happens-before where events are reordered such that every issue is immediately followed by the corresponding commit . The latter is a serializable trace which contradicts the initial assumption.
Lemma 1
Given an anomaly , there must exist a transaction such that and happens before through .
Given an anomaly of the from , we call a delayed transaction in the trace when happens before through .
For an anomaly , the number of delays, denoted by , in is the total number of delayed transactions in the trace.
[TABLE]
Definition 3 (Minimal anomaly)
An anomaly is called minimal if it has the least number of delays among all possible anomalies (for a given program ).
Given an anomaly , and assuming that is the first delayed transaction in (w.r.t. the order between issue events of delayed transactions) and that is a minimal anomaly, the following lemma shows that we can assume w.l.o.g. that is empty.
Lemma 2
Let be a minimal anomaly such that happens-before through . Then, is also a minimal anomaly.
Proof
We can notice that after executing the event , we obtain a cycle in the relation. Thus, is already an anomaly not serializable.
The following result relates SI robustness problem to finding certain anomaly trace which is a minimal anomaly where only a single transaction is delayed.
Theorem 0.B.1
A program is not robust under SI iff there exists an anomaly under SI such that the following must hold:
[TABLE]
- (a)
* is the issue of the only delayed transaction in ; (Lemmas 6 and 5);* 2. (b)
* happens before through (Lemma 6);* 3. (c)
for any event , we have that and (Lemma 6); 4. (d)
there exist events and in such that and with (Lemma 6); 5. (e)
all delayed transactions in don’t write to shared variables that writes to (Lemma 3).
Note that in certain cases the events and can be identical and . Figure 8 shows two examples of anomalies of the form given in Theorem 0.B.1.
In the following, we give the lemmas that constitute Theorem 0.B.1.
An important property in SI semantics is that of conflict-free, and since the event is successfully executed only if there are no concurrent writes that were committed after . Thus, for every event in , don’t write to a shared variable that writes to.
Lemma 3
Let be a minimal anomaly such that happens-before through . Then, for every , does not write to a shared variable that writes to.
The following lemma shows that we must have an event such that happens before .
Lemma 4
Let be a minimal anomaly such that happens-before through . Then, there must exist an event such that happens before , i.e., .
Proof
Suppose by contradiction that does not contain an event such that happens before . Then, can be swapped with every event in . Thus, we obtain that the two events and are adjacent which means that is no longer a delayed transaction which is a contradiction. Therefore, does contain an event such that .
Next lemma shows that we can always obtain a minimal anomaly trace where contains no delayed transaction. We show that if it were to have a delayed transaction in , then it is possible to obtain a new anomaly where either is not delayed or is not delayed, and obtain a new anomaly with a smaller number of delayed transactions which contradicts the minimality assumption.
Lemma 5
Let be a minimal anomaly. Then, does not contain delayed transactions.
Proof
We suppose by contradiction that contains a delayed transaction issued by a process .
It is important to notice that there must exist and such that happens before through . Otherwise, we can commute the events until occurs just after and in this case transaction is not delayed by . Thus, is of the form .
Notice that we can build happens-before cycle from to . Also, since in no transaction depends on . Thus, we can safely remove and its associated commit event and obtains: . which is an anomaly because of the cycle formed as happens-before through . In , transaction was not delayed, therefore, has less number of delays than . Thus, is not a minimal anomaly, a contradiction to our hypothesis.
The next lemma characterizes the relation between the first delayed transaction and the commit of the underlying transaction. It shows the type of the first and last happens-before relations in the happens-before path between the issue of the only delayed transaction and its corresponding commit.
Lemma 6
Let be a minimal anomaly under SI. Then, the following must hold: There exist where , and .
Proof
We have that since happens-before through and does not contain a delayed transaction. Then, there must exist and in such that , , and . It is important to note that and might be identical in which case .
Notice that every event in (including and ) cannot write to a variable that writes to under SI semantics, thus store order relation is not possible. Also, since is not visible to any event in thus the read-from and program order are not possible. Thus, the only possibility is that , and .
Appendix 0.C Proofs of Section 5: The Complete Instrumentation
In this section, we present the instrumentation for the remaining instructions which are, begin and commit for the attacker and happens-before helper.
0.C.1 Instrumentation of the Attacker
We provide in Figure 9, the instrumentation of the code for the attacker process. When the attacker randomly chooses a transaction to delay, it sets the flag to in the instruction begin (line (15)). Then, it sets the flag to 1 to indicate that the current process is the attacker. It copies the values from every variable to its primed version .
In the case the attacker starts the happens-before chain, it has to set the variable to to mark the start of the happens-before chain and the end of the visibility chain and set the flag to ld (line (2) in Figure 3). We can notice that when the is set to , we can no longer execute new transactions from the attacker (all conditions in lines (13) and (14) become ).
0.C.2 Instrumentation of the Happens-Before Helpers
In Figure 10, we provide the instrumentation of the remaining instructions of a happens-before helper.
When the flag is set to , a process (which cannot be the attacker, i.e., the flag is null) starts the attempts to join the set of happens-before helpers. Thus, it randomly chooses a first transaction (the begin of this transaction is shown in line (16)) through which the process will join the set of happens-before helpers. When a process chose the transaction to join happens-before helpers, that means it has made pledge that during this transaction it will either do read from a variable that was updated by a another delayed transaction from some other process in happens-before helpers or write to a variable that was accessed with a read or write from another process in happens-before helpers. When either one of these criteria are satisfied the flag will be set to true. If a process does not keep its pledge (the flag is null) then before executing the com instruction of the first transaction we block the execution (line (20)).
The happens-before helpers processes continue executing their instructions, until one of them executes a load that reads from the shared variable that was stored in which implies the existence of a happens-before cycle. When executing the instruction com at the end of every transaction, we have a conditional check to detect if we have a load or a write accessing the variable (lines (17), (18), and (19)). When the check detects that the variable was accessed, the execution goes to the error state (line (19)) to indicate that the execution has produced an anomaly and we denote the reached state of the instrumented program’s execution, the error state.
As a direct consequence of Theorem 5.1, the next corollary states that some programs which have certain characteristics are robust against SI.
Corollary 2
Given a program , if one of the following holds:
- (a)
every transaction of contains a single instruction either a read or a write; 2. (b)
every transaction of contains only read/write events that access a single variable (different transactions might read/write to different variables); 3. (c)
given a variable , every transaction of contains a write to the variable .
then is robust under SI.
Appendix 0.D Proofs of Section 5: Soundness and Completeness of the Instrumentation
The aim of the instrumentation procedure is to reduce the problem of checking the existence of the anomaly described in Theorem 0.B.1 to reachability under serialisability of en error state by the instrumented version of a program. The instrumentation procedure is considered sound and complete iff if an error state is reachable, then we can reconstruct an anomaly, and every anomaly ensures that the error state is reachable by the instrumented version of the program.
Theorem 0.D.1 (Soundness and Completeness)
A program is not robust iff the instrumented version of it, , reaches an error state under SER.
Proof
Soundness. Suppose that the instrumented program reaches an error state. Then, the execution’s trace of the instrumented program is of the form:
[TABLE]
The last transaction, performed by a process that has a read accessing the variable and is part of the happens-before helpers. Because the conditional check can be performed only by a process () that is one of the happens-before helpers and is currently executing.
In order for , to join the set of happens-before helpers, it must have found that the valuation of the flag is not null which means there exists some process that is the attacker that sets the flag to . In , the attacker, happens-before helpers, and other processes start executing the original instructions without setting any flags or delaying any transactions. Afterwards, the attacker issues the delayed transaction and it starts populating the primed variables and reading from them and setting the flags to for every variable that it writes to and to ld for every variable that it reads from. During the execution of , the attacker sets the flag to . Hence, the happens-before helpers start checking at every instruction whether the flags are set to either st or ld. If so, they start populating the flags and as well. When is set to , the attacker stop issuing new transactions. Therefore, all transaction in are from the happens-before helpers.
We now transform into the following execution trace:
[TABLE]
Here, is the subsequence of all events that are produced by instructions from without the conditionals checking (i.e., the assume statements). The transaction which is executed by the attacker represents the delayed transactions in with the removal of the conditionals checking and the flags setting. is the subsequence of all events of produced by transactions from which are executed only by the happens-before helpers except the conditionals checking and the flags setting. We add the commit of transaction to describe the commit of the delayed transaction that was delayed by the attacker. is a possible execution’s trace of the program because is result of an execution of the instrumented version of and we have removed from all the effects of the instrumentation, and replaced the stores to auxiliary variables by issues of stores without changing the dependency between all the events in the execution.
All transactions in are from the happens-before helpers. Transactions in form a happens-before path between and . Also, we have such that and . No transaction in writes to a variable that writes to. Hence, indeed holds all the properties of the anomaly described in Theorem 0.B.1.
Completeness. Suppose we have an anomaly of a given program :
[TABLE]
such that maintains all the properties given in Theorem 0.B.1. We demonstrate that there is a possible serializable execution based on of the instrumented version of the program that reaches the error state. Next, we show how to build the instrumented program execution.
At the start of the execution, , the attacker, happens-before helpers, and other processes execute the original transactions with just conditional checks.
Afterwards, the attacker delays the transaction and starts populating the flags. In , the attacker issues a store to the shared variable and such that . All writes that were executed in by the attacker are invisible to the remaining processes which includes the happens-before helpers. While executing , the attacker sets the content of the flag to ld for every variable that it reads from and it sets the flag to .
On the other hand, the processes which are executing their transactions without delaying them will attempt to join the happens-before helpers by checking if the flag is set to . If so, they start the attempt of joining the happens-before helpers and when it succeed they joining the happens-before helpers and start executing their transactions which constitute . The first executed transaction by the happens-before helpers is described above which signals the start of and the happen before dependency. Thus, in , we have only transactions form the happens-before helpers (because the attacker stop when the flag is set to ) such that they are related by the happen before dependency that started from until it reaches through . We know that there must exist such that . is equivalent to the last executed transaction by the happens-before helpers that accesses the shared variable . Thus, the underlying happens-before helper will set the content of the flag to ld. Hence, when the underlying process executes the com instruction of this transaction, it will go to the error state (lines (17), (18), and (19)) and in this case the instrumented version of the program has reached the desired error state.
Appendix 0.E Proofs of Section 6
The following theorem shows that the acyclicity of the commutativity dependency graph of a program implies the robustness of the program. Actually, the notion of robustness in this theorem relies on a slightly different notion of trace where store-order and write-order dependencies take into account values, i.e., store-order relates only transactions writing different values and the write-order relates a reading transaction to the oldest transaction (w.r.t. execution order) writing its value. In more details, we assume that two transactions are -related iff they write different values. Notice that since is defined using then when two transactions are -related iff they write different values then when two transactions are -related this implies that the value that is read is different than the one that is written. For example, in Figure 11(a), if we don’t have this weakening of we get an execution trace, where the happens-before is cyclic, of the form because of the relation that links and . However, with our weakening of there will be no relation between and and this the above execution can be equivalently rewritten as which is serializable. Similarly, we assume that two transactions and are related by iff when we swap the two transactions does not read the same value that is writing. For instance, in Figure 11(b), if we don’t have this assumption we get an execution trace, where the happens-before is cyclic, of the form because of the relation that links and . However, with our assumption on there will be no relation between and and this the above execution can be equivalently rewritten as which is serializable. This approach helps in avoiding some of the harmless anomalies, where the happens before cycle might be caused by a write-write dependency between two transactions that writes the same values.
Theorem 0.E.1
For a program , if (1) the commutativity dependency graph of does not contain non-mover cycles, then (2) is robust.
Proof
We prove the contrapositive, i.e., . For the proof, we use the result of Theorem 5.1.
Assuming that the program is not robust. Then, based on Theorem 5.1 there must exist an execution of the instrumentation of that reaches the error state. We suppose that is the delayed transaction, is the instrumentation of (writes are stored in auxiliary registers), and is the attacker process. Therefore, the execution of the instrumentation of that reaches the error state is of the form where writes to a variable that reads from and reads from a variable that writes to. We assume that is the first event that does read that accesses a variable that writes to. In the following we show that the commutativity dependency graph of contains a non-mover cycle where is . We consider two cases, first case when and , and second case is when .
First case: where writes to a variable that reads from, reads from a variable that writes to, and does not write to a variable that writes to. Assume that . Thus, we can safely obtain a serializable execution trace of where is the reads free instantiation of . Since then reads a value that is overwriting with a different value. Therefore, is either a serializable execution with a different end state than has or it is not an serializable execution. Thus, and does not write to a variable that writes to. Similarly, we can safely obtain a serializable execution trace of where is the writes free instantiation of . Since then reads a value that is overwriting with a different value. Therefore, is either a serializable execution with a different end state than has or it is not an serializable execution. Thus, and does not write to a variable that writes to.
Second case: where writes to a variable that reads from, reads from a variable that writes to, and every transaction in does not write to a variable that writes to. Assume that and . Since the transactions in constitute the happens-before path in the execution trace . Then, for every we have that . In the case , we can safely obtain which is a serializable execution trace of where either empty (i.e., ) or . Since, , then swapping and will result in either reordering of writes or write overwrites a read, or read obtains a different value. Therefore, is either a serializable execution trace with a different end state than has or it is not an serializable execution trace. Thus, . Also, we have that and do not write to a variable that writes to. Similar to the first case, we can safely obtain a serializable execution trace of where is the reads free instantiation of . Since then reads a value that is overwriting with a different one. Therefore, is either a serializable execution with a different end state than has or it is not an serializable execution trace. Thus, and does not write to a variable that writes to. Furthermore, we can safely obtain a serializable execution trace of where is the writes free instantiation of . Since then reads a value that is overwriting with a different one. Then, is either a serializable execution trace with a different end state than has or it is not an serializable execution trace. Thus, and does not write to a variable that writes to.
Appendix 0.F Experiments Appendix
In this section we describe the applications we used to evaluate our techniques. For every table in the original application program, we added a boolean annotation in our formalization of the table in Boogie, in order, to inspect whether a given record does exist in the table or not. For instance, if we consider the following table Customer(CustomerId, CustomerName), in Boogie, we formalize the table as two maps: CustomerAlive which takes a CustomerId and returns true if there is a customer with underlying id and else otherwise, CustomerTable which takes a CustomerId and returns the corresponding CustomerName. Also, in certain cases we formalize a Table as multiple maps. Below, we describe each application.
Auction [5]: Which has five transactions that manipulate three tables: BIDS, ITEMS, and USERS. Transaction RegBid is for placing a bid on an item. Transaction RegUser is for registration a user. Transaction ViewItem is for viewing the number of bids for an item. Transaction ViewUser is for looking at a user name. Transaction ViewUsers is for looking at all registered users.
Cassieq-Core222https://github.com/paradoxical-io/cassieq: A core unit of a distributed queue. It has eight transactions that manipulate a single table: USERACCOUNTS. Transaction AddNewAccount is for adding a new account in USERACCOUNTS. Transaction DeleteAnAccount is for deleting an account from USERACCOUNTS. Transaction AddNewKey is for adding a new key to an existing account in USERACCOUNTS. Transaction DeleteAKey is removing a key from an existing account in USERACCOUNTS. Transaction GetAnAccount is to check whether there exist an account with a given id in the table USERACCOUNTS. Transaction GetAccounts is to return all existing accounts in USERACCOUNTS. Transaction GetAccountKeys is for getting all the keys of certain account in USERACCOUNTS. Transaction GetAccountKey is to check whether a certain account does hold a certain key in USERACCOUNTS.
Courseware [13, 18]: Which has five transactions that manipulate three tables: STUDENT, COURSE, and ENROLED. Transaction RegisterStudent is for registering a new student in the table STUDENT. Transaction AddCourse is for adding a new course in the table COURSE. Transaction EnrollStudent is for enrolling a given registered student in a given course. Transaction RemoveCourse is for removing a given course from the table COURSE. Transaction QueryCourses is for inspecting courses in the table COURSE.
Currency-Exchange333https://github.com/Haiyan2/Trade: A trading service. It has six transactions that manipulate a single table: TRADES. Transaction SaveTrade is for registering a new trade. Transaction ViewListTrades is for viewing the trades that occurred before a given instance of time. Transaction ViewTrade is for inspecting a given trade. Transaction ViewTradeUser is for looking for a user who carried out a given trade. Transaction GetNbTrades is for inspecting the number of trades. Transaction GetTradeTimeStamp is for looking for the time stamp of a given trade.
FusionTicket [15]: Which has four transactions that manipulate a single table: EVENTS. Transaction AddEvent is adding new event in some given venue. Transaction ViewEvent is for looking at given a event and the number of tickets available at this event. Transaction Browse is for looking at events that are planned in some given venue. Transaction Purchase is for buying a ticket at a certain event.
Shopping-Cart444https://github.com/nikhilswagle/Shopping_Cart_Angular_Cassandra: An on-line shop service implemented over Cassandra. It has four transactions that query two tables: USERS and PRODUCTS. Transaction GetUser is for querying the existence of a user in the table USERS. Transaction GetProductsByCategory is for finding products that are in a given category. Transaction GetProductByUPC is for finding a product through its UPC. Transaction GetCategories is for finding the categories.
Playlist555https://github.com/DataStaxDocs/playlist: An on-line music service. It has fourteen transactions that manipulate three tables: USERS, TRACKS, and ARTISTS. Transaction AddTrack is for adding a new track in the table TRACKS. Transaction GetTrack is for getting a certain track from the table TRACKS. Transaction AddUser is for adding a new user in the table USERS. Transaction GetUser is for looking for a certain user in the table USERS. Transaction CreatePlayList is for creating a playlist of certain user in the table USERS. Transaction ListArtistByLetter is for listing artists by their first letters of their names. Transaction ListSongsByArtist is for listing tracks produced by certain artist. Transaction ListSongsByGenre is for listing tracks of certain genre type. Transaction AddTrackToPlaylist is for adding an existing track (in TRACKS) to an existing user play list in the table USERS. Transaction DeleteTrackFromPlaylist is for removing a track from user play list in the table USERS. Transaction GetPlaylistForUser is for getting the contents of certain play list of certain user. Transaction GetPlaylistNames is for getting all the play lists of certain user. Transaction DeletePlayListForUser is for deleting a certain user’s play list. Transaction DeleteUser is deleting a user from the table USERS.
RoomStore666https://github.com/mebigfatguy/roomstore: A messages bot service. It has five transactions that manipulate a single table: MESSAGES. Transaction AddMessage is for adding a new message to the table MESSAGES. Transaction GetLastMessage is for getting the messages of given user. Transaction GetMessages is for looking for messages that were added in a certain date. Transaction GetSpecificMessage is for getting specific message that was added in a certain date and time. Transaction GetTopicMessages is for getting messages that are of certain topic.
SmallBank [2]: Which has five transactions that manipulate three tables: ACCOUNT, SAVING, and CHECKING. Transaction Balance is for looking at both the saving and checking balances of a given user account. Transaction DepositChecking is for depositing a certain amount into the checking balance. Transaction TransactSaving is for depositing or withdrawing into/form the saving balance. Transaction Amalgamate (Amg) is for moving the saving and checking balances of an account to another account checking balance and resetting the saving and checking balances of the first account to zero. Transaction WriteCheck is for withdrawing from a given account’s checking balance.
TPC-C [23]: Which has five transactions that manipulate nine tables: WAREHOUSE, DISTRICT, STOCK, ITEMS, CUSTOMERS, HISTORY, ORDER, NEWORDER, and ORDERLINE. Transaction NewOrder is for placing a new order on a set of items. Transaction Delivery is for delivering a withstanding order at certain warehouse. Transaction Payment is for a given customer paying a withstanding amount of credit. Transaction OrderStatus is for inspecting certain orders and the associated order lines. Transaction StockLevel is for inspecting stocks at certain warehouse and the withstanding orders at this warehouse.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Adya, A.: Weak consistency: A generalized theory and optimistic implementations for distributed transactions. Ph.D. thesis (1999)
- 2[2] Alomari, M., Cahill, M.J., Fekete, A., Röhm, U.: The cost of serializability on platforms that use snapshot isolation. In: Alonso, G., Blakeley, J.A., Chen, A.L.P. (eds.) Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, April 7-12, 2008, Cancún, Mexico. pp. 576–585. IEEE Computer Society (2008)
- 3[3] Barnett, M., Chang, B.E., De Line, R., Jacobs, B., Leino, K.R.M.: Boogie: A modular reusable verifier for object-oriented programs. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.P. (eds.) Formal Methods for Components and Objects, 4th International Symposium, FMCO 2005, Amsterdam, The Netherlands, November 1-4, 2005, Revised Lectures. Lecture Notes in Computer Science, vol. 4111, pp. 364–387. Springer (2005)
- 4[4] Berenson, H., Bernstein, P.A., Gray, J., Melton, J., O’Neil, E.J., O’Neil, P.E.: A critique of ANSI SQL isolation levels. In: Carey, M.J., Schneider, D.A. (eds.) Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, San Jose, California, USA, May 22-25, 1995. pp. 1–10. ACM Press (1995)
- 5[5] Bernardi, G., Gotsman, A.: Robustness against consistency models with atomic visibility. In: Desharnais, J., Jagadeesan, R. (eds.) 27th International Conference on Concurrency Theory, CONCUR 2016, August 23-26, 2016, Québec City, Canada. LIP Ics, vol. 59, pp. 7:1–7:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik (2016)
- 6[6] Bouajjani, A., Derevenetc, E., Meyer, R.: Checking and enforcing robustness against TSO. In: Felleisen, M., Gardner, P. (eds.) Programming Languages and Systems - 22nd European Symposium on Programming, ESOP 2013, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2013, Rome, Italy, March 16-24, 2013. Proceedings. Lecture Notes in Computer Science, vol. 7792, pp. 533–553. Springer (2013)
- 7[7] Bouajjani, A., Enea, C., Mutluergil, S.O., Tasiran, S.: Reasoning about TSO programs using reduction and abstraction. In: Chockler, H., Weissenbacher, G. (eds.) Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, Flo C 2018, Oxford, UK, July 14-17, 2018, Proceedings, Part II. Lecture Notes in Computer Science, vol. 10982, pp. 336–353. Springer (2018)
- 8[8] Bouajjani, A., Meyer, R., Möhlmann, E.: Deciding robustness against total store ordering. In: Aceto, L., Henzinger, M., Sgall, J. (eds.) Automata, Languages and Programming - 38th International Colloquium, ICALP 2011, Zurich, Switzerland, July 4-8, 2011, Proceedings, Part II. Lecture Notes in Computer Science, vol. 6756, pp. 428–440. Springer (2011)
