Verifying Relational Properties using Trace Logic

Gilles Barthe; Renate Eilers; Pamina Georgiou; Bernhard Gleiss; Laura Kovács; Matteo Maffei

arXiv:1906.09899·cs.LO·August 13, 2019

TL;DR

This paper introduces a logical framework using trace logic for verifying relational properties in imperative programs, especially useful for security applications involving complex quantifier structures.

Contribution

It develops a method to reduce relational property verification to first-order validity problems via trace logic, enabling reasoning about program traces and intermediate steps.

Findings

1. Framework effectively verifies security-related relational properties.

2. Trace logic supports reasoning about loop iterations and intermediate states.

3. Implementation in Rapid demonstrates practical applicability.

Abstract

We present a logical framework for the verification of relational properties in imperative programs. Our work is motivated by relational properties which come from security applications and often require reasoning about formulas with quantifier alternations. Our framework reduces verification of relational properties of imperative programs to a validity problem in trace logic, an expressive instance of first-order predicate logic. Trace logic draws its expressiveness from its syntax, which allows expressing properties over computation traces. Its axiomatization supports fine-grained reasoning about intermediate steps in program execution, notably loop iterations. We present an algorithm to encode the semantics of programs as well as their relational properties in trace logic, and then show how first-order theorem proving can be used to reason about the resulting trace logic formulas.…

Tables

Table I: Rapid results with Vampire, Z3 and CVC4.

| Benchmarks | Vampire S | Vampire S+A | Vampire F | Vampire F+A | CVC4 | Z3 |
|---|---|---|---|---|---|---|
| 1-hw-equal-arrays | ✓ | ✓ | - | ✓ | ✓ | ✓ |
| 2-hw-last-position-swapped | - | ✓ | - | - | ✓ | ✓ |
| 3-hw-swap-and-two-arrays | - | ✓ | - | - | - | - |
| 4-hw-swap-in-array-lemma | - | ✓ | - | - | - | - |
| 4-hw-swap-in-array-full | - | ✓ | - | - | - | - |
| 1-ni-assign-to-high | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 2-ni-branch-on-high-twice | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 3-ni-high-guard-equal-branches | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 4-ni-branch-on-high-twice-prop2 | ✓ | ✓ | - | - | ✓ | ✓ |
| 5-ni-temp-impl-flow | - | - | ✓ | ✓ | ✓ | ✓ |
| 6-ni-branch-assign-equal-val | - | - | ✓ | ✓ | ✓ | ✓ |
| 7-ni-explicit-flow | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 8-ni-explicit-flow-while | ✓ | ✓ | - | ✓ | ✓ | ✓ |
| 9-ni-equal-output | ✓ | - | - | - | - | ✓ |
| 10-ni-rsa-exponentiation | ✓ | ✓ | ✓ | ✓ | ✓ | - |
| 1-sens-equal-sums | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| 2-sens-equal-sums-two-arrays | ✓ | ✓ | ✓ | ✓ | - | - |
| 3-sens-abs-diff-up-to-k | - | - | - | - | ✓ | ✓ |
| 4-sens-abs-diff-up-to-k-two-arrays | - | - | - | - | - | - |
| 5-sens-two-arrays-equal-k | ✓ | ✓ | ✓ | ✓ | - | - |
| 6-sens-diff-up-to-explicit-k | ✓ | ✓ | ✓ | ✓ | - | - |
| 7-sens-diff-up-to-explicit-k-sum | - | - | ✓ | ✓ | - | - |
| 8-sens-explicit-swap | - | - | ✓ | ✓ | - | - |
| 9-sens-explicit-swap-prop2 | - | - | ✓ | ✓ | - | - |
| 10-sens-equal-k | ✓ | ✓ | ✓ | ✓ | - | - |
| 11-sens-equal-k-twice | ✓ | ✓ | ✓ | ✓ | - | - |
| 12-sens-diff-up-to-forall-k | - | - | ✓ | ✓ | ✓ | - |
| Total Vampire | 15 | 18 | 17 | 19 | | |
| Unique Vampire | 1 | 4 | 0 | 0 | | |
| Total | 25 (all Vampire configurations) | | | | 14 | 13 |

Equations

\begin{array}[]{l}\forall k_{\mathbb{I}}.\Big{(}\big{(}\forall\mathit{pos}_{\mathbb{I}}.((\mathit{pos}\not\simeq k\land\mathit{pos}\not\simeq k+1)\rightarrow\\ a(\mathit{pos},t_{1}){\,\simeq\,}a(\mathit{pos},t_{2}))\;\land\;a(k,t_{1}){\,\simeq\,}a(k+1,t_{2})\\ \land\;a(k,t_{2}){\,\simeq\,}a(k+1,t_{1})\land 0\leq k+1<\mathit{alength}\big{)}\\ \qquad\rightarrow hw(\textit{end},t_{1}){\,\simeq\,}hw(\textit{end},t_{2})\Big{)},\end{array}

Eq_{v}(it) := v(l_{9}(it),t_{1}) \simeq v(l_{9}(it),t_{2}).

Eq_{hw}(it) := hw(l_{9}(it),t_{1}) \simeq hw(l_{9}(it),t_{2}).

\begin{array}[]{l}\forall itB_{\mathbb{N}}.\Big{(}\big{(}Eq_{v}({\tt 0})\land\forall it_{\mathbb{N}}.((it<itB\land Eq_{v}(it))\\ \qquad\quad\rightarrow Eq_{v}({\tt succ}(it)))\big{)}\rightarrow Eq_{v}(itB)\Big{)},\end{array}

\begin{array}[]{l}\forall itB_{\mathbb{N}}.\Big{(}\big{(}Eq_{hw}({\tt 0})\land\forall it_{\mathbb{N}}.((it<itB\land Eq_{hw}(it))\\ \qquad\quad\rightarrow Eq_{hw}({\tt succ}(it)))\big{)}\rightarrow Eq_{hw}(itB)\Big{)}.\end{array}

tp_{s}

tp_{s}(it)

\mathit{lastIt}_{s}

\mathit{start}_{s} := \begin{cases} tp_{s}({\tt 0}) & \text{if s is while-statement}\\ tp_{s} & \text{otherwise}\end{cases}

\mathit{end}_{s} := \begin{cases} \mathit{start}_{s'} & \text{if } s' \text{ occurs after s in a context}\\ \mathit{end}_{s'} & \text{if s is last st. in if-branch of } s'\\ \mathit{end}_{s'} & \text{if s is last st. in else-branch of } s'\\ tp_{w}({\tt succ}(it^{w})) & \text{if s is last st. in body of } w\\ l_{\mathit{end}} & \text{otherwise}\end{cases}

\begin{cases} \forall \mathit{pos}_{\mathbb{I}}.\ v(tp_{1},\mathit{pos},tr) \simeq v(tp_{2},\mathit{pos},tr), & \text{if v is array}\\ v(tp_{1},tr) \simeq v(tp_{2},tr), & \text{otherwise}\end{cases}

EqAll(tp_{1},tp_{2}) := \bigwedge_{v\in S_{V}} Eq(v,tp_{1},tp_{2}),

\llbracket P\rrbracket := \bigwedge_{i=1}^{k} \llbracket s_{i}\rrbracket.

\llbracket s\rrbracket := \bigwedge_{v\in S_{V}} Eq(v,\mathit{end}_{s},tp_{s})

\llbracket s\rrbracket := v(\mathit{end}_{s}) \simeq \llbracket e\rrbracket(tp_{s},tr) \land \bigwedge_{v'\in S_{V}\setminus\{v\}} Eq(v',\mathit{end}_{s},tp_{s})

a(\mathit{end}_{s},\mathit{pos},tr) \simeq a(tp_{s},\mathit{pos},tr)

\forall it^{s}_{\mathbb{N}}.\ (it^{s} < \mathit{lastIt}_{s} \rightarrow \llbracket Cond\rrbracket(tp_{s}(it^{s})))

\mathit{Sig}(\mathcal{L}) := (S_{\mathbb{N}}\cup S_{\mathbb{I}}) \cup (S_{Tp}\cup S_{n}\cup S_{V}\cup S_{Tr}).

\llbracket P\rrbracket \vDash_{\mathbb{N}\cup\mathbb{I}} F.

\begin{cases} \forall \mathit{pos}_{\mathbb{I}}.\ v(tp,\mathit{pos},t_{1}) \simeq v(tp,\mathit{pos},t_{2}) & \text{if v is mutable array}\\ \forall \mathit{pos}_{\mathbb{I}}.\ v(\mathit{pos},t_{1}) \simeq v(\mathit{pos},t_{2}) & \text{if v is constant array}\\ v(tp,t_{1}) \simeq v(tp,t_{2}) & \text{if v is mutable var.}\\ v(t_{1}) \simeq v(t_{2}) & \text{if v is constant var.}\end{cases}

\Big(\bigwedge_{v\in L} EqTr(v,l_{0})\Big) \rightarrow \Big(\bigwedge_{v\in L} EqTr(v,l_{\mathit{end}})\Big).

EqTr(\mathit{lo},l_{0}) \rightarrow EqTr(\mathit{lo},l_{\mathit{end}}).

s; p

if(C)then{p_{1}}else{p_{2}}; p

while^{i}(C)do{p_{1}}; p

p_{1}; while^{i+1}(C)do{p_{1}}; p

Full text

Gilles Barthe$^{1,2}$, Renate Eilers$^{3}$, Pamina Georgiou$^{3}$, Bernhard Gleiss$^{3}$, Laura Kovács$^{3,4}$, Matteo Maffei$^{3}$

$^{1}$Max Planck Institute for Security and Privacy, Germany

$^{2}$IMDEA Software Institute, Spain

$^{3}$TU Wien, Austria

$^{4}$Chalmers University of Technology, Sweden

Abstract

We present a logical framework for the verification of relational properties in imperative programs. Our framework reduces verification of relational properties of imperative programs to a validity problem in trace logic, an expressive instance of first-order predicate logic. Trace logic draws its expressiveness from its syntax, which allows expressing properties over computation traces. Its axiomatization supports fine-grained reasoning about intermediate steps in program execution, notably loop iterations. We present an algorithm to encode the semantics of programs as well as their relational properties in trace logic, and then show how first-order theorem proving can be used to reason about the resulting trace logic formulas. Our work is implemented in the tool Rapid and evaluated with examples coming from the security field.

I Introduction

Program verification generally focuses on proving that all executions of a program lie within a specified set of executions; that is, properties are seen as sets of traces. However, this approach is not general enough to capture various fundamental properties, such as non-interference [1] and robustness [2]. These notions are naturally modelled as relational properties, that is, as properties over sets of pairs of traces. Relational properties are special instances of hyperproperties [3], which are formally defined as sets of sets of traces.

Verification of relational properties can be achieved in different ways. One approach is by reduction to program verification: given a program $P$ and a hyperproperty $\phi$ , construct a program $Q$ and a property $\psi$ , such that: (i) $Q$ verifies $\psi$ and (ii) $Q$ verifies $\psi$ implies $P$ verifies $\phi$ . The main advantage of this approach is that (i) can be verified using standard verification tools, whereas (ii) is proved generically for the method used for constructing $Q$ , for instance self-composition [4, 5] and product programs [6, 7]. Another approach to verify relational properties is to use relational Hoare logic [8] or specialized logics that target specific properties [9]. While both approaches have been applied successfully in several use cases, they suffer from fundamental limitations: (i) they are typically not efficient enough to scale to large programs and (ii) they are only partly automated and tailored to specific properties.

**Contributions.** In this paper, we develop a new approach based on reduction to first-order reasoning, with the intent of reconciling expressiveness and automation.

(1) We introduce and formally characterize trace logic $\mathcal{L}$ , an instance of many-sorted first-order logic with equality, which allows expressing properties over program locations, loop iterations, and computation traces (Section IV).

(2) We encode the semantics of programs as well as relational program properties in $\mathcal{L}$ (Section IV). Specifically, given a program $P$ and a relational property $F$ , we construct a first-order formula $\xi$ in $\mathcal{L}$ such that validity of $\xi$ entails that $P$ satisfies $F$ . Note that this semantic characterization stands in contrast with methods based on product programs, Hoare logics, and relational Hoare logics, where verification is syntax-directed.

(3) We show that relational properties, such as non-interference, can naturally be encoded in trace logic (Section V).

(4) We implemented our approach in the Rapid tool, which relies on the first-order theorem prover Vampire [10]. We conducted experiments on security-relevant hyperproperties, such as non-interference and sensitivity. Our results show that Rapid is more expressive than state-of-the-art non-interference verification tools and that Vampire is better suited to the verification of security-relevant hyperproperties than state-of-the-art SMT-solvers like Z3 and CVC4.

II Motivating Example

We motivate our work with the simple program of Figure 1. This program iterates over an integer-valued array a and stores in the variable hw the sum of array elements. If a is a bitstring, then this program leaks the so-called Hamming weight of a in the variable hw. Our aim is to prove the following relational property over two arbitrary computation traces $t_{1}$ and $t_{2}$ of Figure 1: if the elements of the array variable a in $t_{1}$ are component-wise equal to the elements of a in $t_{2}$ except for two consecutive positions $k$ and $k+1$ , for some $k$ , and the elements of a in $t_{1}$ at positions $k,k+1$ are swapped versions of the elements of a in $t_{2}$ (that is, the $k$ -th element of a in $t_{1}$ is the $(k+1)$ -th element of a in $t_{2}$ and vice-versa), then the program variable hw is the same at the end of $t_{1}$ and $t_{2}$ . We formalize this property as

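$$\begin{array}[]{l}\forall k_{\mathbb{I}}.\Big{(}\big{(}\forall\mathit{pos}_{\mathbb{I}}.((\mathit{pos}\not\simeq k\land\mathit{pos}\not\simeq k+1)\rightarrow\\ a(\mathit{pos},t_{1}){\,\simeq\,}a(\mathit{pos},t_{2}))\;\land\;a(k,t_{1}){\,\simeq\,}a(k+1,t_{2})\\ \land\;a(k,t_{2}){\,\simeq\,}a(k+1,t_{1})\land 0\leq k+1<\mathit{alength}\big{)}\\ \qquad\rightarrow hw(\textit{end},t_{1}){\,\simeq\,}hw(\textit{end},t_{2})\Big{)},\end{array}\qquad(1)$$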

where $k_{\mathbb{I}}$ and $\mathit{pos}_{\mathbb{I}}$ respectively specify that $k$ and $\mathit{pos}$ are of sort integer $\mathbb{I}$ . Further, $a(\mathit{pos},t_{i})$ denotes the value of the element at position $\mathit{pos}$ of a in trace $t_{i}$ , whereas end refers to the last program location of Figure 1 (that is, line 14).

Property (1) is challenging to verify, since it requires theory-specific reasoning over integers and it involves alternation of quantifiers, as the length of the array a is unbounded and the $k$-th position (corresponding to the swap) is arbitrary. To understand the difficulty in automating this kind of reasoning, let us first illustrate how humans would naturally prove property (1). First, split the iterations of the loop of Figure 1 into three intervals: (i) the interval from the first iteration of the loop to the iteration where i has value $k$, (ii) the interval from the iteration where i has value $k$ to the iteration where i has value $k+2$, and (iii) the interval from the iteration where i has value $k+2$ to the last iteration of the loop. Next, for each of the intervals above, one proves that the equality of the value of hw in traces $t_{1}$ and $t_{2}$ is preserved; that is, if hw has the same value in $t_{1}$ and $t_{2}$ at the beginning of the interval, then hw also has the same value in $t_{1}$ and $t_{2}$ at the end of the interval. In particular, for the first and third intervals one uses inductive reasoning to conclude, from the step-wise preservation of the equality of hw in traces $t_{1}$ and $t_{2}$, that the equality is preserved across the whole interval. Further, for the second interval, one uses commutativity of addition to prove that the equality of hw in traces $t_{1}$ and $t_{2}$ is preserved. By combining the preservation results for the three intervals, one finally concludes that property (1) is valid.
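To make property (1) concrete, the following is a minimal Python sketch that tests it on random inputs. Figure 1 is not reproduced here, so `hamming_weight` is an assumed rendering of the summation loop described above, and `swap_adjacent` and `check_swap_property` are hypothetical helper names. Random testing only samples the property; it does not replace the proof.

```python
import random

def hamming_weight(a):
    """Sum the elements of a, mirroring the loop described for Figure 1 (assumed)."""
    hw = 0
    i = 0
    while i < len(a):
        hw = hw + a[i]
        i = i + 1
    return hw

def swap_adjacent(a, k):
    """Return a copy of a with the elements at positions k and k+1 exchanged."""
    b = list(a)
    b[k], b[k + 1] = b[k + 1], b[k]
    return b

def check_swap_property(trials=1000, seed=0):
    """Sample pairs of inputs related as in property (1) and compare hw."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 12)
        a1 = [rng.randint(-5, 5) for _ in range(n)]
        k = rng.randrange(n - 1)          # ensures k + 1 < n
        a2 = swap_adjacent(a1, k)
        if hamming_weight(a1) != hamming_weight(a2):
            return False
    return True
```

Calling `check_swap_property()` exercises many choices of array length and swap position `k`, which is exactly the quantification that makes the property hard to prove automatically.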

While the above proof might be natural for humans, it is challenging for automated reasoners for the following reasons: (i) one needs to express and relate different iterations in the execution of the loop in Figure 1 and use these iterations to split the reasoning about loop intervals; (ii) one needs to automatically synthesize the loop intervals whose boundaries depend on values of program variables; and (iii) one needs to combine theory-specific reasoning with induction for proving quantified properties, possibly with alternations of quantifiers. In our work we address these challenges: we introduce trace logic, allowing us to express and automatically prove relational properties, including property (1). The key advantages of trace logic are as follows.

(i) In trace logic, program variables are encoded as unary and binary functions over program execution timepoints. This way, we can precisely express the value of each program variable at any program execution timepoint, without introducing abstractions. For Figure 1, for example, we write $hw(\textit{end},t_{1})$ to denote the value of hw in trace $t_{1}$ at timepoint end.

(ii) Trace logic further allows arbitrary quantification over iterations and values of program variables. In particular, we can express and reason about iterations that depend on (possibly non-ground) expressions involving program variables. We use superposition-based first-order reasoning to automate static analysis with trace logic and derive first-order properties about loop iterations, possibly with quantifier alternations. For Figure 1, we generate for example the property $\exists it_{\mathbb{N}}.\big{(}it<n_{9}\land i(l_{9}(it),t_{1}){\,\simeq\,}k\big{)},$ where $l_{9}$ denotes the location where the loop condition is tested and $n_{9}$ denotes the first iteration of the loop upon which the loop condition does not hold anymore.

(iii) We guide superposition reasoning in trace logic by using a set of lemmas statically inferred from the program semantics. These lemmas express inductive properties about the program behavior. To illustrate such lemmas, we first introduce the following notation. For an arbitrary program variable v, let $Eq_{v}(it)$ denote that v has the same value in both traces at iteration $it$ of the loop. For example, for every program variable v of Figure 1, we introduce the following definition:

$$Eq_{v}(it) := v(l_{9}(it),t_{1}){\,\simeq\,}v(l_{9}(it),t_{2}).$$

In particular, for variable hw, we introduce:

$$Eq_{hw}(it) := hw(l_{9}(it),t_{1}){\,\simeq\,}hw(l_{9}(it),t_{2}).$$

We then derive the following inductive lemma for each program variable v:

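$$\begin{array}[]{l}\forall itB_{\mathbb{N}}.\Big{(}\big{(}Eq_{v}({\tt 0})\land\forall it_{\mathbb{N}}.((it<itB\land Eq_{v}(it))\\ \qquad\quad\rightarrow Eq_{v}({\tt succ}(it)))\big{)}\rightarrow Eq_{v}(itB)\Big{)},\end{array}\qquad(2)$$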

where $itB_{\mathbb{N}}$ and $it_{\mathbb{N}}$ denote iterations $itB,it$ and ${\tt succ}(it)$ denotes the successor of $it$ . Lemma (2) asserts that if v has the same value in traces $t_{1}$ and $t_{2}$ at the beginning of the loop (that is, at iteration ${\tt 0}$ ) and if the values of v are step-wise equal in traces $t_{1}$ and $t_{2}$ up to an arbitrary iteration $itB$ , then the values of v are equal in traces $t_{1}$ and $t_{2}$ at iteration $itB$ (and hence the values of v are preserved in $t_{1}$ and $t_{2}$ for the entire interval up to $itB$ ). For Figure 1, we generate lemma (2) for hw as:

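$$\begin{array}[]{l}\forall itB_{\mathbb{N}}.\Big{(}\big{(}Eq_{hw}({\tt 0})\land\forall it_{\mathbb{N}}.((it<itB\land Eq_{hw}(it))\\ \qquad\quad\rightarrow Eq_{hw}({\tt succ}(it)))\big{)}\rightarrow Eq_{hw}(itB)\Big{)}.\end{array}\qquad(3)$$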

Note that lemma (2), and in particular lemma (3) for hw, is crucial for proving that the values of hw in traces $t_{1}$ and $t_{2}$ are the same up to iteration $k$ , as considered in the relational property of (1). With this lemma at hand, we automatically prove property (1) of Figure 1, using superposition reasoning in trace logic.
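The shape of lemma (2) can be sanity-checked on finite data. The sketch below is an illustration only: it represents the values of a variable v at the loop-condition location as two Python sequences (one per trace), and the names `eq_at`, `lemma_instance` and `lemma_holds_exhaustively` are hypothetical. It evaluates the bounded-induction implication for every pair of short sequences; this is a brute-force check on small cases, not the superposition proof used in the paper.

```python
from itertools import product

def eq_at(v1, v2, it):
    """Eq_v(it): v has the same value in both traces at iteration it."""
    return v1[it] == v2[it]

def lemma_instance(v1, v2, it_b):
    """Evaluate lemma (2) for one bound it_b on two finite value sequences."""
    base = eq_at(v1, v2, 0)
    # Step case, restricted to iterations strictly below the bound it_b.
    step = all((not eq_at(v1, v2, it)) or eq_at(v1, v2, it + 1)
               for it in range(it_b))
    premise = base and step
    return (not premise) or eq_at(v1, v2, it_b)

def lemma_holds_exhaustively(length=4, values=(0, 1)):
    """Check every pair of sequences of the given length and every bound."""
    for v1 in product(values, repeat=length):
        for v2 in product(values, repeat=length):
            for it_b in range(length):
                if not lemma_instance(v1, v2, it_b):
                    return False
    return True
```

The restriction `it < itB` in the step case is what makes the lemma usable for intervals: equality only has to be preserved up to the chosen bound, not for the whole loop.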

III Preliminaries

This section fixes our terminology and programming model.

III-A First-order logic

We consider standard many-sorted first-order logic with equality, where equality is denoted by ${\,\simeq\,}$ . We allow all standard boolean connectives and quantifiers in the language and write $s\not\simeq t$ instead of $\neg(s{\,\simeq\,}t)$ , for two arbitrary first-order terms $s$ and $t$ . A signature is any finite set of symbols. We consider equality ${\,\simeq\,}$ as part of the language; hence, ${\,\simeq\,}$ is not a symbol. We write $F_{1},\ldots,F_{n}\vDash F$ to denote that the formula $F_{1}\land\ldots\land F_{n}\rightarrow F$ is a tautology. In particular, we write $\vDash F$ , if $F$ is valid.

By a first-order theory, or simply just theory, we mean the set of all formulas valid on a class of first-order structures. When we discuss a theory, we call symbols occurring in the signature of the theory interpreted, and all other symbols uninterpreted. In our work, we consider the combination (union) ${\mathbb{N}\cup\mathbb{I}}$ of the theory $\mathbb{N}$ of natural numbers and the one $\mathbb{I}$ of integers. The signature of $\mathbb{N}$ consists of standard symbols ${\tt 0}$ , ${\tt succ}$ , ${\tt pred}$ and $<$ , respectively interpreted as zero, successor, predecessor and less. Note that $\mathbb{N}$ does not contain interpreted symbols for (arbitrary) addition and multiplication. We use the theory $\mathbb{N}$ to represent and reason about loop iterations (see Section IV). The signature of $\mathbb{I}$ consists of the standard integer constants $0,1,2,\ldots$ and integer operators $+$ , $*$ and $<$ . We use the theory $\mathbb{I}$ to represent and reason about integer-valued program variables (see Section IV). Additionally we use two (uninterpreted) sorts as two sets of uninterpreted symbols: (i) the sort Timepoint, written as $\mathbb{L}$ , for denoting (unique) timepoints in the execution of the program and (ii) the sort Trace, written as $\mathbb{T}$ , for denoting computation traces of a program.

Given a logical variable $x$ and sort $S$, we write $x_{S}$ to denote that the sort of $x$ is $S$. We use standard first-order interpretations/models modulo a theory $T$, for example modulo ${\mathbb{N}\cup\mathbb{I}}$. We write $\vDash_{T}F$ to denote that $F$ holds in all models of $T$ (and hence $F$ is valid modulo $T$). If $I$ is a model of $T$, we write $I\vDash_{T}F$ if $F$ holds in the interpretation $I$.

III-B Programming Model $\mathcal{W}$

We consider programs written in a standard while-like programming language, denoted as $\mathcal{W}$, with mutable and constant integer- and integer-array-variables. The language $\mathcal{W}$ includes standard side-effect free expressions over booleans and integers. Each program in $\mathcal{W}$ consists of a single top-level function main, with arbitrary nestings of if-then-else and while-statements. For simplicity, whenever we refer to loops, we mean while-loops. For each statement s, we refer to the while-statements in which s is nested as the enclosing loops of s. The semantics of $\mathcal{W}$ is formalized in Section IV-C.

IV Trace Logic

We now introduce the concept of trace logic for expressing both the semantics and (relational) properties of $\mathcal{W}$ -programs.

IV-A Locations and Timepoints

We consider a program in $\mathcal{W}$ as a set of locations, where each location intuitively corresponds to a point in the program at which an interpreter can stop. That is, for each program statement s, we introduce a program location $l_{s}$ . We denote by $l_{\mathit{end}}$ the location corresponding to the end of the program.

As program locations can be revisited during program executions, for example due to the presence of loops, we model locations as follows. For each location $l_{s}$ corresponding to a program statement s, we introduce a function symbol $l_{s}$ with target sort $\mathbb{L}$ in our language, denoting the timepoint where the interpreter visits the location. For each enclosing loop of the statement s, the function symbol $l_{s}$ has an argument of sort $\mathbb{N}$; this way, we distinguish between different iterations of the enclosing loop of s. We denote the set of all such function symbols $l_{s}$ as $S_{Tp}$. When s is a loop, we additionally include a function symbol $n_{s}$ with target sort $\mathbb{N}$ and an argument of sort $\mathbb{N}$ for each enclosing loop of s. This way, $n_{s}$ denotes the iteration in which s terminates for given iterations of the enclosing loops of s. We denote the set of all such function symbols $n_{s}$ as $S_{n}$.

Example 1

Consider Figure 1. We abbreviate each statement s by the line number of the first line of s. We use $l_{6}$ to refer to the timepoint corresponding to the first assignment of i in the program. We denote by $l_{9}({\tt 0})$ and $l_{9}(n_{9})$ the timepoints corresponding to evaluating the loop condition in the first and, respectively, last loop iteration. Further, we write $l_{11}(it)$ and $l_{11}({\tt succ}({\tt 0}))$ for the timepoint corresponding to the beginning of the loop body in the $it$ -th and, respectively, second iteration of the loop. Note that ${\tt succ}({\tt 0})$ is a term algebra expression of $\mathbb{N}$ .

∎

For simplicity, let us define terms over the most commonly used timepoints. First, define $it^{s}$ to be a function, which returns for each while-statement s a unique variable of sort $\mathbb{N}$ . Second, let s be a statement, let $w_{1},\dots,w_{k}$ be the enclosing loops of s and let $it$ be an arbitrary term of sort $\mathbb{N}$ .

[TABLE]

Third, let s be an arbitrary statement. We refer to the timepoint where the execution of s has started (parameterized by the enclosing iterators) by

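$$\mathit{start}_{\texttt{s}} := \begin{cases} tp_{\texttt{s}}({\tt 0}) & \text{if s is while-statement}\\ tp_{\texttt{s}} & \text{otherwise}\end{cases}$$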

Fourth, for an arbitrary statement s, let $\mathit{end}_{\texttt{s}}$ denote the timepoint which follows immediately after s has been evaluated completely (including the evaluation of substatements of s):

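$$\mathit{end}_{\texttt{s}} := \begin{cases} \mathit{start}_{\texttt{s}'} & \text{if } \texttt{s}' \text{ occurs after s in a context}\\ \mathit{end}_{\texttt{s}'} & \text{if s is last st. in if-branch of } \texttt{s}'\\ \mathit{end}_{\texttt{s}'} & \text{if s is last st. in else-branch of } \texttt{s}'\\ tp_{\texttt{w}}({\tt succ}(it^{\texttt{w}})) & \text{if s is last st. in body of } \texttt{w}\\ l_{\mathit{end}} & \text{otherwise}\end{cases}$$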

IV-B Program Variables and Expressions

In our setting, we reason about program behavior by expressing properties over program variables v. To do so, we capture the value of program variables v at timepoints (from $\mathbb{L}$) in arbitrary program execution traces (from $\mathbb{T}$). Hence, we model program variables v as functions $v:(\mathbb{L}\times\mathbb{T})\mapsto\mathbb{I}$, where $v(tp,tr)$ gives the value of v at timepoint $tp$, in trace $tr$. If the program variable v is an array, we add an additional argument of sort $\mathbb{I}$, which corresponds to the position at which the array is accessed. We denote by $S_{V}$ the set of such introduced function symbols denoting program variables. We finally model arithmetic constants and program expressions using integer functions.

Note that our setting can be simplified for (i) non-mutable variables – in this case we omit the timepoint argument in the function representation of the variable; (ii) for non-relational properties about programs – in this case, we only focus on one computation trace and hence the trace argument in the function from $S_{V}$ can be omitted.

Example 2

Consider again Figure 1. By $i(l_{6},tr)$ we refer to the value of program variable i in trace $tr$ at the moment before i is first assigned. We use $\textit{alength}(tr)$ to refer to the value of variable alength in trace $tr$. As a is unchanged in the program, we write $a(i(l_{11}(it),tr),tr)$ for the value of array a in trace $tr$ at position $\mathit{pos}$, where $\mathit{pos}$ is the value of i in trace $tr$ at timepoint $l_{11}(it)$. If a were modified during the loop, we would write $a(l_{11}(it),i(l_{11}(it),tr),tr)$ instead. We denote by $i(l_{12}(it),tr)+1$ the value of the expression i+1 in trace $tr$ at timepoint $l_{12}(it)$.

∎
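The view of program variables as functions over timepoints and traces can be illustrated with a small recording interpreter. The Python sketch below is an assumption-laden illustration, not the encoding produced by Rapid: it hard-codes the summation loop described for Figure 1 (with i and hw assumed to start at 0), encodes timepoints as tuples such as `("l9", it)`, and uses the hypothetical names `run_hw` and `var`.

```python
def run_hw(a):
    """Execute the summation loop and record variable values per timepoint.

    Timepoints are encoded as tuples: ("l9", it) is the loop-condition
    check in iteration it; ("end",) is the final program location.
    """
    values = {}          # maps (variable, timepoint) -> integer value
    i, hw, it = 0, 0, 0
    while True:
        values[("i", ("l9", it))] = i
        values[("hw", ("l9", it))] = hw
        if not i < len(a):
            break
        hw = hw + a[i]
        i = i + 1
        it = it + 1
    n9 = it              # first iteration in which the loop condition fails
    values[("i", ("end",))] = i
    values[("hw", ("end",))] = hw
    return values, n9

def var(trace, name, timepoint):
    """Look up name(timepoint, trace), for example hw(l9(it), t1)."""
    values, _ = trace
    return values[(name, timepoint)]
```

For a trace `t1 = run_hw([1, 0, 1])`, the call `var(t1, "hw", ("end",))` plays the role of $hw(\textit{end},t_{1})$, and the second component of `t1` plays the role of $n_{9}$. A property such as $\exists it.\,(it<n_{9}\land i(l_{9}(it),t_{1})\simeq k)$ then becomes a search over `range(n9)`.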

Consider now an arbitrary program expression e. We write $\llbracket\texttt{e}\rrbracket(tp,tr)$ to denote the value of e at timepoint $tp$ , in trace $tr$ . With these notations at hand, we introduce two definitions expressing properties about values of expressions e at arbitrary timepoints and traces. Consider now $v\in S_{V}$ , that is a function denoting a program variable v, and let $tp_{1},tp_{2}$ denote two timepoints. We define: $Eq(v,tp_{1},tp_{2}):=$

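$$\begin{cases} \forall \mathit{pos}_{\mathbb{I}}.\ v(tp_{1},\mathit{pos},tr){\,\simeq\,}v(tp_{2},\mathit{pos},tr), & \text{if v is array}\\ v(tp_{1},tr){\,\simeq\,}v(tp_{2},tr), & \text{otherwise}\end{cases}\qquad(4)$$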

That is, $Eq(v,tp_{1},tp_{2})$ in (4) states that the program variable v has the same values at $tp_{1}$ and $tp_{2}$ . We also define:

$$EqAll(tp_{1},tp_{2}) := \bigwedge_{v\in S_{V}} Eq(v,tp_{1},tp_{2}),\qquad(5)$$

asserting that all program variables have the same values at the two timepoints $tp_{1}$ and $tp_{2}$ .

IV-C Semantics of $\mathcal{W}$

We now describe the semantics of $\mathcal{W}$ expressed in our trace logic $\mathcal{L}$ . To do so, we state trace axioms of $\mathcal{L}$ capturing the behavior of possible program computation traces and then define $\mathcal{L}$ .

In what follows, we consider an arbitrary but fixed program $P$ in $\mathcal{W}$ , and give all definitions relative to $P$ . Note that our semantics defines arbitrary executions, which are modeled by a free variable $tr$ of sort $\mathbb{T}{}$ .

Main-function

Let $\texttt{s}_{1},\dots,\texttt{s}_{k}$ be statements and $P$ be a program with top-level function func main {$\texttt{s}_{1};\ldots;\texttt{s}_{k}$}. The semantics of $P$ is defined by the conjunction of the semantics of the statements $\texttt{s}_{i}$ in the top-level function and is the same for each trace. That is:

$$\llbracket P\rrbracket := \bigwedge_{i=1}^{k} \llbracket\texttt{s}_{i}\rrbracket.\qquad(6)$$

The semantics of $P$ is then defined by structural induction, by asserting trace axioms for each program statement s, as follows.

Skip

Let s be a statement skip. The evaluation of $s$ has no effect on the value of the program variables. Hence:

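$$\llbracket\texttt{s}\rrbracket := \bigwedge_{v\in S_{V}} Eq(v,\mathit{end}_{\texttt{s}},tp_{\texttt{s}})\qquad(7)$$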

Integer assignments

Let s be an assignment v = e, where v is an integer program variable and e is an expression. We reason as follows. The assignment s is evaluated in one step. After the evaluation of s, the variable v has the same value as e before the evaluation, and all other variables remain unchanged. Hence:

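$$\llbracket\texttt{s}\rrbracket := v(\mathit{end}_{\texttt{s}}){\,\simeq\,}\llbracket\texttt{e}\rrbracket(tp_{\texttt{s}},tr)\;\land\;\bigwedge_{v'\in S_{V}\setminus\{v\}} Eq(v',\mathit{end}_{\texttt{s}},tp_{\texttt{s}})\qquad(8)$$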

Array assignments

Let s be an assignment a[$\texttt{e}_{1}$] = $\texttt{e}_{2}$, where a is an array variable and $\texttt{e}_{1},\texttt{e}_{2}$ are expressions. We consider that the assignment is evaluated in one step. After the evaluation of s, the array a has the same value as before the evaluation, except for the position $\mathit{pos}$ corresponding to the value of $\texttt{e}_{1}$ before the evaluation, where the array now has the value of $\texttt{e}_{2}$ before the evaluation. All other program variables remain unchanged and we have:

[TABLE]

Conditional if-then-else Statements

Let s be the statement if(Cond){$\texttt{s}_{1};\ldots;\texttt{s}_{k}$} else {$\texttt{s}'_{1};\ldots;\texttt{s}'_{k'}$}. The semantics of s is defined by the following two properties: (i) entering the if-branch or the else-branch does not change the values of the variables, (ii) the evaluation in the branches proceeds according to the semantics of the statements in each of the branches. Thus:

[TABLE]

While-Loops

Let s be the while-statement while(Cond){$\texttt{s}_{1};\ldots;\texttt{s}_{k}$}. We refer to Cond as the loop condition. We use the following four properties to define the semantics of s: (i) the iteration $\mathit{lastIt}_{\texttt{s}}$ is the first iteration where the loop condition does not hold, (ii) entering the loop body does not change the values of the variables, (iii) the evaluation in the body proceeds according to the semantics of the statements in the body, (iv) the values of the variables at the end of evaluating s are the same as the variable values at the loop condition location in iteration $\mathit{lastIt}_{\texttt{s}}$. We then have:

$\llbracket\texttt{s}\rrbracket:=$

[TABLE]

IV-D Trace Logic $\mathcal{L}$

We now have all ingredients to define our trace logic $\mathcal{L}$ , allowing us to reason about both relational and non-relational properties of programs.

Let $S_{Tr}$ be a set $\{t_{1},t_{2},\dots\}$ of nullary function symbols of sort $\mathbb{T}$. Intuitively, these symbols denote traces and allow us to express relational properties. The signature of $\mathcal{L}$ contains the symbols of the theories $\mathbb{N}$ and $\mathbb{I}$ together with the symbols introduced in Sections IV-A and IV-B, that is, symbols denoting timepoints, last iterations in loops, program variables and traces. Formally,

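$$\mathit{Sig}(\mathcal{L}) := (S_{\mathbb{N}}\cup S_{\mathbb{I}})\cup(S_{Tp}\cup S_{n}\cup S_{V}\cup S_{Tr}).$$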

Recall that the semantics of $\mathcal{W}$ is defined by the trace axioms (7)-(11). By extending standard small-step operational semantics with timepoints and traces, we obtain the small-step semantics of $\mathcal{W}$. For proving soundness of this semantics, we rely on the so-called execution-interpretation of a program execution $E$: such an interpretation is a model in which, for every (array) variable v, the term $v(tp_{i})$ (respectively $v(tp_{i},pos)$) is interpreted as the value of v at the execution step in $E$ corresponding to timepoint $tp_{i}$; see our Appendix for more details. We then introduce $\mathcal{W}$-soundness, defining the soundness of the semantics of $\mathcal{W}$, as follows:

Definition 1 ( $\mathcal{W}$ -Soundness)

Let $p$ be a program and let $A$ be a trace logic property. We say that $A$ is $\mathcal{W}$ -sound, if for any execution-interpretation $M$ we have $M\vDash A$ .

By using structural induction over program statements, we derive $\mathcal{W}$ -soundness of the semantics of $\mathcal{W}$ . That is:

Theorem 1 ( $\mathcal{W}$ -Soundness of Semantics of $\mathcal{W}$ )

For a given terminating program $p$ , the trace axioms (7)-(11) are $\mathcal{W}$ -sound.

As a consequence, the semantics of any terminating program $p$ expressed in $\mathcal{L}$ , as defined in (6), is $\mathcal{W}$ -sound.

IV-E Program Correctness in Trace Logic $\mathcal{L}$

Let $P$ be a program and $F$ be a first-order property of $P$ , with $F$ expressed in $\mathcal{L}$ . We use $\mathcal{L}$ to express and prove that $P$ “satisfies” $F$ , that is $P$ is partially correct w.r.t. $F$ , as follows:

1. We express $\llbracket P\rrbracket$ in $\mathcal{L}$, as discussed in Section IV-C;

2. We prove the partial correctness of $P$ with respect to $F$; that is, we prove

[TABLE]

In what follows, we first discuss (relational) properties $F$ expressed in $\mathcal{L}$ (Section V) and then focus on proving partial correctness using $\mathcal{L}$ (Section VI).

V Hyperproperties in Trace Logic

We demonstrate the expressiveness of trace logic $\mathcal{L}$ by encoding non-interference [11] and sensitivity [12], two fundamental security properties. This section also showcases the generic lemmas, similar to property (2), introduced by our work to automate the verification of hyperproperties. The examples considered in this section are deemed insecure by existing syntax-driven, non-interference verification techniques, such as [11, 13].

Non-interference

Non-interference [1] is a security property that prevents information flow from confidential data to public channels. It is a so-called $2$ -safety property expressing that, given two runs of a program containing high and low confidentiality variables, denoted by $H$ and $L$ respectively, if the input for all $L$ variables is the same in both runs, the output of the computation should result in the same values for $L$ variables in both traces regardless of the initial value of any $H$ variable. Intuitively, this means that no private input leaks to any public sink. In what follows, we let lo denote an $L$ variable and hi an $H$ variable.

We formalize non-interference in trace logic $\mathcal{L}$ as follows. Let $l_{0}$ denote the first timepoint of the execution and let $\mathit{EqTr}(v,tp)$ denote that $v$ has the same value(s) in both traces at timepoint $tp$ , that is:

$\mathit{EqTr}(v,tp):=$

[TABLE]

We then express non-interference as:

[TABLE]
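The shape of this property can be sketched on concrete data. The following is an illustrative sketch only, assuming that a trace is represented as a mapping from timepoint names to variable valuations; the helper names `eq_tr` and `non_interference` and the sample traces are hypothetical and not part of the paper's formalization.

```python
def eq_tr(v, tp, t1, t2):
    """v has the same value(s) at timepoint tp in both traces.
    For an array variable (a Python list here), == compares every position."""
    return t1[tp][v] == t2[tp][v]


def non_interference(low_vars, first_tp, end_tp, t1, t2):
    """If all low variables agree at first_tp, they must also agree at end_tp."""
    low_in = all(eq_tr(v, first_tp, t1, t2) for v in low_vars)
    low_out = all(eq_tr(v, end_tp, t1, t2) for v in low_vars)
    return (not low_in) or low_out


# Hypothetical traces: same low inputs (lo, a), different high input (hi).
t1 = {"l0": {"lo": 1, "hi": 5, "a": [0, 0]}, "end": {"lo": 2, "hi": 5, "a": [1, 0]}}
t2 = {"l0": {"lo": 1, "hi": 9, "a": [0, 0]}, "end": {"lo": 2, "hi": 9, "a": [1, 0]}}
# A trace whose low output differs although its low inputs agree with t1.
t3 = {"l0": {"lo": 1, "hi": 9, "a": [0, 0]}, "end": {"lo": 9, "hi": 9, "a": [1, 0]}}

ok_pair = non_interference(["lo", "a"], "l0", "end", t1, t2)
bad_pair = non_interference(["lo", "a"], "l0", "end", t1, t3)
assert ok_pair and not bad_pair
```

In trace logic the two traces are symbolic constants and the implication is proved for all executions, whereas this sketch merely evaluates it on one fixed pair of traces.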

Example 3

Consider the program illustrated in Figure LABEL:fig:noninterference3, which branches on an $H$ guard. In the two branches, however, the $L$ variable is updated in the same way, thereby not leaking anything about the guard. The non-interference property for this program is a special instance of property (12), as follows:

[TABLE]

By adjusting superposition reasoning to trace logic $\mathcal{L}$ (see Section VI), we can automatically verify the property above. Traditional information-flow type systems [11] would however fail to prove this program secure, as they prevent any branching on $H$ guards. More permissive static analysis techniques based on program dependency graphs, such as Joana [13], would also classify this program as insecure.

∎
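To make the situation of Example 3 concrete, the sketch below uses a hypothetical stand-in program (the actual program of the figure is not reproduced here) that branches on a high guard but updates the low variable identically in both branches. It searches a small input domain for two runs with equal low input and different low output. This is bounded testing under stated assumptions, not the superposition-based proof described in the paper; the names `program`, `leaky` and `find_violation` are illustrative.

```python
from itertools import product


def program(lo, hi):
    # Hypothetical stand-in: the guard depends on hi,
    # but both branches perform the same update of lo.
    if hi > 0:
        lo = lo + 1
    else:
        lo = lo + 1
    return lo


def leaky(lo, hi):
    # Contrast case: lo is updated in one branch only, so hi leaks into lo.
    if hi > 0:
        lo = lo + 1
    return lo


def find_violation(prog, domain):
    """Return (lo, hi1, hi2) with equal low input but different low output,
    or None if no such pair of runs exists within the domain."""
    for lo, hi1, hi2 in product(domain, repeat=3):
        if prog(lo, hi1) != prog(lo, hi2):
            return (lo, hi1, hi2)
    return None


domain = range(-2, 3)
secure_result = find_violation(program, domain)
leaky_result = find_violation(leaky, domain)
assert secure_result is None
assert leaky_result is not None
```

A syntax-driven analysis would reject both functions because each branches on `hi`; only a semantic comparison of the two runs separates them.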

Bibliography

[1] J. A. Goguen and J. Meseguer, “Security Policies and Security Models,” in IEEE Symposium on Security and Privacy, 1982, pp. 11–20.
[2] S. Chaudhuri, S. Gulwani, and R. Lublinerman, “Continuity analysis of programs,” in POPL, 2010, pp. 57–70.
[3] M. R. Clarkson and F. B. Schneider, “Hyperproperties,” in CSF, 2008, pp. 51–65.
[4] G. Barthe, P. R. D’Argenio, and T. Rezk, “Secure Information Flow by Self-Composition,” in CSFW, 2004, pp. 100–114.
[5] Á. Darvas, R. Hähnle, and D. Sands, “A Theorem Proving Approach to Analysis of Secure Information Flow,” in SPC, 2005, pp. 193–209.
[6] G. Barthe, J. M. Crespo, and C. Kunz, “Relational Verification Using Product Programs,” in FM, 2011, pp. 200–214.
[7] B. Churchill, O. Padon, R. Sharma, and A. Aiken, “Semantic Program Alignment for Equivalence Checking,” in PLDI, 2019, pp. 1027–1040.
[8] N. Benton, “Simple Relational Correctness Proofs for Static Analyses and Program Transformations,” in POPL, 2004, pp. 14–25.
