This paper explores the extension of algebraic specifications from trace properties to hyperproperties, providing a compositional semantics for imperative programs with loops and examining their implications for program correctness and confidentiality.
Contribution
It introduces a hyperproperty-level algebraic framework and proves a compositional semantics for imperative programs with loops, extending traditional trace-based approaches.
Findings
01
Hyperproperties can be lifted to algebraic structures.
02
A compositional semantics for programs with loops is developed.
03
The semantics aligns with standard semantics for subset-closed hyperproperties.
Abstract
Unifying theories distil common features of programming languages and design methods by means of algebraic operators and their laws. Several practical concerns --- e.g., improvement of a program, conformance of code with design, correctness with respect to specified requirements --- are subsumed by the beautiful notion that programs and designs are special forms of specification and their relationships are instances of logical implication between specifications. Mathematical development of this idea has been fruitful but limited to an impoverished notion of specification: trace properties. Some mathematically precise properties of programs, dubbed hyperproperties, refer to traces collectively. For example, confidentiality involves knowledge of possible traces. This article reports on both obvious and surprising results about lifting algebras of programming to hyperproperties, especially…
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Formal Methods in Verification · Security and Verification in Computing
Full text
11institutetext: Stevens Institute of Technology, USA
††thanks: The authors were partially supported by NSF award 1718713
David A. Naumann and Minh Ngo
Abstract
Unifying theories distil common features of programming languages and design methods by means of algebraic operators and their laws. Several practical concerns — e.g., improvement of a program, conformance of code with design, correctness with respect to specified requirements — are subsumed by the beautiful notion that programs and designs are special forms of specification and their relationships are instances of logical implication between specifications. Mathematical development of this idea has been fruitful but limited to an impoverished notion of specification: trace properties. Some mathematically precise properties of programs, dubbed hyperproperties, refer to traces collectively. For example, confidentiality involves knowledge of possible traces. This article reports on both obvious and surprising results about lifting algebras of programming to hyperproperties, especially in connection with loops, and suggests directions for further research. The technical results are: a compositional semantics, at the hyper level, of imperative programs with loops, and proof that this semantics coincides with the direct image of a standard semantics, for subset closed hyperproperties.
1 Introduction
A book has proper spelling provided that each of its sentences does.
For a book to be captivating and suspenseful — that is not a property that can be reduced to a property of its individual sentences. Indeed, few interesting properties of a book are simply a property of all its sentences.
By contrast, many interesting requirements of a program can be specified as so-called trace properties: there is some property of traces (i.e., observable behaviors) which must be satisfied by all the program’s traces.
The unruly mess of contemporary programming languages, design tools, and approaches to formal specification has been given a scientific basis through unifying theories that abstract commonalities by means of algebraic operators and laws.
Algebra abstracts from computational notions like partiality and nondeterminacy by means of operators that are interpreted as total functions and which enable equational reasoning.
Several practical concerns — such as improving a program’s resource usage while not altering its observable behavior, checking conformance of code with design architecture, checking satisfaction of requirements,
and equivalence of two differently presented designs — are subsumed by the beautiful notion that programs and designs111This paper was written with the UTP [19] community in mind,
but our use of the term “design” is informal and does not refer to the technical notion in UTP.
are just kinds of specification and their relationships are instances of logical implication between specifications.
Transitivity of implication yields the primary relationship: the traces of a program are included in the traces allowed by its specification.
The mathematical development of this idea has been very successful — for trace properties.
Not all requirements are trace properties. A program should be easy to read,
consistent with dictates of style, and amenable to revision for adapting to changed requirements.
Some though not all such requirements may be addressed by mathematics; e.g., parametric polymorphism is a form of modularity that facilitates revision through reuse. In this paper we are concerned with requirements that are extensional in the sense that they pertain directly to observable behavior.
For a simple example, consider a program acting on variables hi,lo where the initial value of hi is meant to be a secret, on which the final value of lo must not depend. Consider this simple notion of program behavior:
a state assigns values to variables, and a trace is a pair: the initial and final states.
The requirement cannot be specified as a trace property, but it can be specified as follows:
for any two traces (σ,σ′) and (τ,τ′), if the initial states σ and τ have the same value for lo then so do the final states.
In symbols: σ(lo)=τ(lo)⇒σ′(lo)=τ′(lo).
Some requirements involve more than two traces, e.g., “the average response time is under a millisecond” can be made precise by averaging the response time of each trace, over all traces, perhaps weighted by a distribution that represents likelihood of different requests.
For a non-quantitative example, consider the requirement that a process in a distributed system should know which process is the leader: something is known in a given execution if it is true in all possible traces that are consistent with what the process can observe of the given execution (such as a subset of the network messages).
In the security literature, some information flow properties are defined by closure conditions on the program’s traces, such as: for any two traces, there exists a trace with the high (confidential) events of the first and the low (public) events of the second.
This paper explores the notion that just as a property of books is a set of books, not necessarily defined simply in terms of their sentences,
so too a property of programs is a set of programs, not merely a set of traces.
The goal is to investigate how the algebra of programming can be adapted for reasoning about non-trace properties.
To this end, we focus on the most rudimentary notion of trace, i.e., pre/post pairs, and rudimentary program constructs.
We conjecture that the phenomena and ideas are relevant to a range of models, perhaps even the rich notions of trace abstracted by variations on concurrent Kleene algebra [20].
It is unfortunate that the importance of trace properties in programming has led to well established use of the term “property” for trace property,
and recent escalation in terminology to “hyperproperty” to designate the general notion of program property — sets of programs rather than sets of traces [10, 9].
Some distinction is needed, so for clarity and succinctness we follow the crowd.
The technical contribution of this paper can now be described as follows: we give a lifting of the fixpoint semantics of loops to the “hyper level”, and show anomalies that occur with other liftings.
This enables reasoning at the hyper level with usual fixpoint laws for loops,
while retaining consistency with standard relational semantics.
Rather than working directly with sets of trace sets, our lifting uses a simpler model,
sets of state sets; this serves to illustrate the issues and make connections with other models that may be familiar.
The conceptual contribution of the paper is to call attention to the challenge of unifying theories of programming that encompass requirements beyond trace properties.
Outline.
Section 2 describes a relational semantics of imperative programs and defines an example program property that is not a trace property.
Relational semantics is connected, in Section 3, with semantics mapping sets to sets, like forward predicate transformers.
Section 4 considers semantics mapping sets of sets to the same, this being the level at which hyperproperties are formulated. Anomalies with obvious definitions motivate a more nuanced semantics of loops.
The main technical result of the paper is Theorem 4.1 in this section, connecting the semantics of
Section 4 with that of Section 3.
Section 5 connects the preceding results with the intrinsic notion of satisfaction for hyperproperties,
and sketches challenges in realizing the dream of reasoning about hyperproperties using only refinement chains.
The semantics and theorem are new, but similar to results in prior work discussed in Section 6.
Section 7 concludes.
2 Programs and specifications as binary relations
Preliminaries.
We review some standard notions, to fix notation and set the stage.
Throughout the paper we assume Σ is a nonempty set, which stands for the set of program states, or data values, on which programs act.
For any sets A, B,
let A⊸B denote the binary relations from A to B; that is,
A⊸B is ℘(A×B) where ℘ means powerset.
Unless otherwise mentioned, we consider powersets, including Σ⊸Σ, to be ordered by inclusion (⊆).
We write A→B for the set of functions from A to B.
For composition of relations, and in particular composition of functions, we use infix symbol \mbox; in the forward direction.
Thus for relations R,S and elements x,y we have
x(R\mbox;S)y iff \exists z\>\mbox{\small\bullet}\>xRz\land zSy.
For a function f:A→B and element x∈A we write application as fx and let it associate to the left.
Composition with g:B→C is written f\mbox;g, as functions are treated as special relations, so (f\mbox;g)x=g(fx).
The symbol \mbox; binds tighter than \mathbin{\mbox{\small\cup}} and other operators.
For a relation R:A⊸B, the direct image
⟨R⟩ is a total function ℘A→℘B
defined by
y∈⟨R⟩p iff \exists x\in p\>\mbox{\small\bullet}\>xRy.
It faithfully reflects ordering of relations:
[TABLE]
where ⊑ means pointwise order (i.e.,
φ⊑ψ iff \forall p\in\wp A\>\mbox{\small\bullet}\>\varphi\,p\subseteq\psi\,p).
We write ⊔ for pointwise union,
defined by (\varphi\mathbin{\sqcup}\psi)\,p=\varphi\,p\mathbin{\mbox{\small\cup}}\psi\,p.
The ⊑-least element is the function \lambda p\>\mbox{\small\bullet}\>\emptyset, abbreviated as ⊥.
A relation can be recovered from its direct image:
[TABLE]
where sglt:A→℘A maps element a to singleton set {a} and ∋:℘A⊸A is the converse of the membership relation.
Note that ⊥ is the direct image of the empty relation.
Direct image is functorial and distributes over union:
[TABLE]
We write id for identity function on the set indicated.
In fact ⟨−⟩ distributes over arbitrary union,
i.e., sends any union of relations to the pointwise join of their images.
Also, ⟨R⟩ is universally disjunctive, and (1) forms a bijection
between universally disjunctive functions
℘A→℘B and relations A⊸B.
In this paper we use the term transformer for monotonic functions of type
℘A→℘B. For φ:℘A→℘B to be monotonic is equivalent to
(⊇\mbox;φ)⊆(φ\mbox;⊇).
We write lfp for the least-fixpoint operator.
For monotonic functions f:A→A and g:B→B where A,B are sufficiently complete
posets that lfpf and lfpg exist,
the fixpoint fusion rule says that for strict and continuous h:A→B,
[TABLE]
Inequational forms, such as f\mbox;h≤h\mbox;g⇒h(lfpf)≤lfpg,
are also important.222Fusion rules, also called fixpoint transfer,
can be found in many sources,
e.g., [1, 4].
We need the form in Theorem 3 of [12],
for Kleene approximation of fixpoints.
Relational semantics.
The relational model suffices for reasoning about terminating executions.
If we write x+2≤x′ to specify a program that increases x by at least two,
we can write this simple refinement chain:
[TABLE]
to express that the nondeterministic choice (⊕) between adding 3 or adding 5 refines the specification and is refined in turn by the first alternative.
Relations model a good range of operations including relational converse and intersection which are not implementable in general but are useful for expressing specifications.
Their algebraic laws facilitate reasoning.
For example, choice is modeled as union, so the second step is from a law of set theory: R\mathbin{\mbox{\small\cup}}S\supseteq R.
Equations and inequations may serve as specifications.
For example, to express that relation R is deterministic we can write R∪\mbox;R⊆id,
where R∪ is the converse of R.
Note that this uses two occurrences of R.
Returning to the example in the introduction, suppose R relates
states with variables hi,lo.
To formulate the noninterference property that the final value of lo is independent of the initial value of hi,
it is convenient to define a relation on states that says they have the same value for lo:
define ∼˚ by σ∼˚τ iff σ(lo)=τ(lo).
The property is
[TABLE]
This is a form of determinacy. A weaker notion allows multiple outcomes for lo but the set of possibilities should be independent from the initial value of hi.
[TABLE]
This is known as possibilistic noninterference.
It can be expressed without quantifiers, by the usual simulation inequality:
[TABLE]
Another equivalent form is
∼˚\mbox;R\mbox;∼˚=R\mbox;∼˚,
which again uses two occurrences of R.
The algebraic formulations are attractive, but recall the beautiful idea of correctness proof as a chain of refinements
[TABLE]
This requires the specification to itself be a term in the algebra, rather than an (in)equation between terms.
Before proceeding to investigate this issue, we recall the well known fact that
possibilistic noninterference is not closed under refinement of trace sets [21].
Consider hi,lo ranging over bits, so we can write pairs compactly, and consider the set of traces
{(00,00),(00,01),(01,00),(01,01),(10,10),(10,11),(11,10),(11,11)}
It satisfies possibilistic noninterference, but if we remove the underlined pairs the result does not; in fact
the result copies hi to lo.
In the rest of this paper, we focus on deterministic noninterference, NI for short.
It has been advocated as a good notion for security [35] and it serves our purposes as an example.
A signature and its relational model.
To investigate how NI and other non-trace properties may be expressed and used in refinement chains,
it is convenient to focus on a specific signature, the simple imperative language over given atoms (ranged over by atm) and boolean expressions (ranged over by b).
[TABLE]
For expository purposes we refrain from decomposing the conditional and iteration constructs in terms of choice (⊕) and assertions.
That decomposition would be preferred in a more thorough investigation of algebraic laws, and it is evident in the semantic definitions to follow.
Assume that for each atm is given a relation [[atm]]:Σ⊸Σ,
and for each boolean expression b is given a coreflexive relation
[[b]]:Σ⊸Σ. That is, [[b]] is a subset of the identity relation idΣ on Σ.
For non-atom commands c the relational semantics [[c]] is defined in Fig. 1.
The fixpoint for loops333
It is well known that loops are expressible in terms of recursion:
whilebdoc can be expressed as \mu X.(b;c;X\mathbin{\mbox{\small\cup}}\neg b)
and this is the form we use in semantics.
A well known law is \mu X.(b;c;X\mathbin{\mbox{\small\cup}}\neg b)=\mu X.(b;c;X\mathbin{\mbox{\small\cup}}\mathsf{skip});\neg b
which factors out the termination condition.
is in Σ⊸Σ, ordered by ⊆ with least element ∅.
The language goes beyond ordinary programs, in the sense that atoms are allowed to be unboundedly nondeterministic. They are also allowed to be partial; coreflexive atoms serve as assume and assert statements.
Other ingredients are needed for a full calculus of specifications, but here our aim is to sketch ideas that merit elaboration in a more comprehensive theory.
3 Programs as forward predicate transformers
Here is yet another way to specify NI for a relation R:
[TABLE]
where Agrl says that all elements of p agree on lo:
[TABLE]
As with the preceding (in)equational formulations, like (3), this is not directly applicable as the specification in a refinement chain, but it does hint that escalating to sets of states may be helpful.
Note that R occurs just once in the condition.
Weakest-precondition predicate transformers are a good model for programming algebra:
Monotonic functions ℘Σ→℘Σ can model total correctness specifications with both angelic and demonic nondeterminacy.
In this paper we use transformers to model programs in the forward direction.
For boolean expression b we define
\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp=⟨[[b]]⟩
so that \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp is a filter:
x is in \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp iff x∈p and b is true of x.
The transformer semantics is in Fig. 2.
For loops, the fixpoint is for the aforementioned ⊥ and ⊑.
Linking transformer with relational.
The transformer model may support a richer range of operators than the relational one,
but for several reasons it is important to establish their mutual consistency on a common set of operators [18, 19].
A relation can be recovered from its direct image, see (1), so the following is a strong link.
Proposition 1
For all c in the signature, ⟨[[c]]⟩=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp.
Proof
By induction on c.
•
skip: ⟨[[skip]]⟩=⟨idΣ⟩=id℘Σ→℘Σ=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppskip\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp
by definitions and ⟨−⟩ law.
•
atm: ⟨[[atm]]⟩=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppatm\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp by definition.
•
c;d: ⟨[[c;d]]⟩=⟨[[c]]\mbox;[[d]]⟩=⟨[[c]]⟩\mbox;⟨[[d]]⟩=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc;d\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp
by definitions, ⟨−⟩ laws, and induction hypothesis.
whilebdoc: To prove
⟨[[whilebdoc]]⟩=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppwhilebdoc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp,
unfold the definitions to
⟨lfpF⟩=lfpG, where F,G are defined in Figs 1 and 2.
This follows by fixpoint fusion, taking h in (2) to be ⟨−⟩
so the antecedent to be proved is
\forall R\>\mbox{\small\bullet}\>\langle\mathsf{F}R\rangle=\mathsf{G}\langle R\rangle.
Observe for any R:
[TABLE]
∎
Subsets of ℘Σ→℘Σ, such as transformers satisfying Dijkstra’s healthiness conditions,
validate stronger laws than the full set of (monotonic) transformers.
Healthiness conditions can be expressed by inequations,
such as the determinacy inequation R∪\mbox;R⊆id,
and used as antecedents in algebraic laws.
Care must be taken with joins: not all subsets are closed under pointwise union.
Pointwise union does provide joins in the set of all transformers and also in the set of all universally disjunctive transformers.
In addition to transformers as weakest preconditions [4],
another similar model is multirelations which are attractive in maintaining a pre-to-post direction [25].
These are all limited to trace properties, though, so we proceed in a different direction.
4 Programs as h-transformers
Given R:A⊸B, the image ⟨R⟩ is a function and functions are relations, so the direct image can be taken:
⟨⟨R⟩⟩:℘2A→℘2B
where ℘2A abbreviates ℘(℘A).
In this paper, monotonic functions of this type are called h-transformers, in a nod to hyper terminology.
The underlying relation can be recovered by two applications of (1):
[TABLE]
More to the point, a quantifier-free formulation of NI is now in reach.
Recall that we have R∈NI iff
\forall p\in\wp\Sigma\>\mbox{\small\bullet}\>\mathsf{Agrl}\,(p)\Rightarrow\mathsf{Agrl}\,(\langle R\rangle p).
This is equivalent to
[TABLE]
where the set of sets A is defined by A={p∣Agrl(p)}.
This is one motivation to investigate ℘2Σ→℘2Σ as a model, rather than
℘(Σ⊸Σ) which is the obvious way to embody the idea that a program is a trace set and a property is a set of programs.
In the following we continue to write ⊔ and ⊑ for the pointwise join and pointwise order on ℘2Σ→℘2Σ. Please note the order is defined in terms of set inclusion at the outer layer of sets and is independent of the order on ℘Σ.
Define ⊥=⟨⊥⟩ and note that
⊥∅=∅ and ⊥Q={∅} for Q=∅.
Surprises.
For semantics using h-transformers, some obvious guesses work fine but others do not.
The semantics in Fig. 3
uses operators \ovee, ◃b▹ and ℘˘ which will be explained in due course.
For boolean expressions we simply lift by direct image, defining
\llparenthesisb\rrparenthesis=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩.
The same for command atoms, so the semantics of atm is derived from the given [[atm]].
The analog of Proposition 1 is that
for all c in the signature, ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=\llparenthesisc\rrparenthesis,
allowing laws valid in relational semantics to be lifted to h-transformers.
Considering some cases suggests that this could be proved by induction on c:
•
skip: ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppskip\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=⟨id℘Σ⟩=\llparenthesisskip\rrparenthesis
by definitions and using that ⟨−⟩ preserves identity.
•
atm: ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppatm\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=\llparenthesisatm\rrparenthesis by definition.
•
c;d: ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\mbox;d\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩\mbox;⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=\llparenthesisc\rrparenthesis\mbox;\llparenthesisd\rrparenthesis=\llparenthesisc;d\rrparenthesis
by definitions, distribution of ⟨−⟩ over \mbox;, and putative induction hypothesis.
These calculations suggest we may succeed with this obvious guess:
[TABLE]
The induction hypothesis would give
\llparenthesisifbthencelsed\rrparenthesis=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩⊔⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩.
On the other hand,
⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppifbthencelsed\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⊔\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩.
Unfortunately these are quite different because the joins are at different levels.
In general, for φ and ψ of type ℘Σ→℘Σ and Q∈℘2Σ we have
\langle\varphi\mathbin{\sqcup}\psi\rangle\mathbb{Q}=\{\varphi\,p\mathbin{\mbox{\small\cup}}\psi\,p\mid p\in\mathbb{Q}\}
whereas
(\langle\varphi\rangle\mathbin{\sqcup}\langle\psi\rangle)\mathbb{Q}=\{\varphi\,p\mid p\in\mathbb{Q}\}\mathbin{\mbox{\small\cup}}\{\psi\,p\mid p\in\mathbb{Q}\}.
Indeed, the same discrepancy would arise if we define
\llparenthesisc⊕d\rrparenthesis=\llparenthesisc\rrparenthesis⊔\llparenthesisd\rrparenthesis.
At this point one may investigate notions of “inner join”, but for expository purposes we proceed to consider a putative definition for loops.
Following the pattern for relational and transformer semantics, an obvious guess is
[TABLE]
Consider this program: whilex<4dox:=x+1. We can safely assume
\llparenthesisx<4\rrparenthesis is ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppx<4\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩ and
\llparenthesisx:=x+1\rrparenthesis is ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppx:=x+1\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩.
As there is a single variable, we can represent a state by its value, for example {2,5} is a set of two states.
Let us work out \llparenthesiswhilex<4dox:=x+1\rrparenthesis{{2,5}}.
Now \llparenthesiswhilex<4dox:=x+1\rrparenthesis is the limit of the chain Ki⊥ where Ki means i applications of K.
Note that for any Φ and i>0,
[TABLE]
Writing Qi for Ki⊥{{2,5}} one can derive
[TABLE]
at which point the sequence remains fixed.
As in the case of conditional (6), the result is not consistent with the underlying semantics:
[TABLE]
The result should be {{4,5}} if we are to have the analog of
Proposition 1.
A plausible inner join is ⊗ defined
by (\Phi\otimes\Psi)\mathbb{Q}=\{r\mathbin{\mbox{\small\cup}}s\mid\exists q\in\mathbb{Q}\>\mbox{\small\bullet}\>r\in\Phi\{q\}\land s\in\Psi\{q\}\}.
This can be used to define a semantics of ⊕
as well as semantics of conditional and loop;
the resulting constructs are ⊑-monotonic and enjoy other nice properties.
Indeed, using ⊗ in place of ⊔ in (7),
we get K3⊥{{2,5}}={{4,5}}, which is exactly the lift of the transformer semantics.
There is one serious problem: K fails to be increasing.
In particular, ⊥⊑K⊥; for example ⊥{{2,5}}={∅} but
H⊥{{2,5}}={{5}}.
While this semantics merits further study, we leave it aside because we aim to use fixpoint fusion results that rely on Kleene approximation: This requires ⊥⊑K⊥ in order to have an ascending chain, and the use of ⊥ so that ⟨−⟩ is strict.
A viable solution.
Replacing singleton by powerset in the definition of ⊗, for
any h-transformers Φ,Ψ:℘2Σ→℘2Σ we define the inner join \ovee by
[TABLE]
For semantics of conditionals, it is convenient to define, for boolean expression b,
this operator on h-transformers:
Φ◃b▹Ψ=\llparenthesisb\rrparenthesis\mbox;Φ\ovee\llparenthesis¬b\rrparenthesis\mbox;Ψ.
It satisfies
[TABLE]
because \llparenthesisb\rrparenthesis(℘p)=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩(℘p)=℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)).
These operators are used in Fig. 3 for semantics of conditional and loop.
It is straightforward to prove \ovee is monotonic:
Φ⊑Φ′ and Ψ⊑Ψ′ imply Φ\oveeΨ⊑Φ′\oveeΨ′.
It is also straightforward to prove
[TABLE]
but in general equality does not hold, so we focus on ◃−▹.
Lemma 1
For any b, ◃b▹ is monotonic:
Φ⊑Φ′ and Ψ⊑Ψ′ imply
Φ◃b▹Ψ⊑Φ′◃b▹Ψ′.
Proof
Keep in mind this is ⊑ at the outer level:
Φ⊑Φ′ means \forall\mathbb{Q}\>\mbox{\small\bullet}\>\Phi\mathbb{Q}\subseteq\Phi^{\prime}\mathbb{Q} (more sets, not bigger sets, if you will).
This follows by monotonicity of \ovee, or using characterization (8)
we have
[TABLE]
which implies
\exists q\in\mathbb{Q}\>\mbox{\small\bullet}\>r\in\Phi^{\prime}(\wp(\BeginAccSupp{method=hex,unicode,ActualText=2983}\textnormal{\char 102\relax}\mathchoice{\mkern-4.05mu}{\mkern-4.05mu}{\mkern-4.3mu}{\mkern-4.8mu}\textnormal{\char 106\relax}\EndAccSupp{}\,b\,\BeginAccSupp{method=hex,unicode,ActualText=2984}\textnormal{\char 106\relax}\mathchoice{\mkern-4.05mu}{\mkern-4.05mu}{\mkern-4.3mu}{\mkern-4.8mu}\textnormal{\char 103\relax}\EndAccSupp{}q))\land s\in\Psi^{\prime}(\wp(\BeginAccSupp{method=hex,unicode,ActualText=2983}\textnormal{\char 102\relax}\mathchoice{\mkern-4.05mu}{\mkern-4.05mu}{\mkern-4.3mu}{\mkern-4.8mu}\textnormal{\char 106\relax}\EndAccSupp{}\,\neg b\,\BeginAccSupp{method=hex,unicode,ActualText=2984}\textnormal{\char 106\relax}\mathchoice{\mkern-4.05mu}{\mkern-4.05mu}{\mkern-4.3mu}{\mkern-4.8mu}\textnormal{\char 103\relax}\EndAccSupp{}q))
by Φ⊑Φ′ and Ψ⊑Ψ′.
∎
With \llparenthesisifbthencelsed\rrparenthesis defined as in Fig. 3 we have the following refinement.
Lemma 2
⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppifbthencelsed\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩⊑\llparenthesisifbthencelsed\rrparenthesis*
provided that
⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩⊑\llparenthesisc\rrparenthesis and ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppd\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩⊑\llparenthesisd\rrparenthesis.*
Proof
[TABLE]
∎
This result suggests that we might be able to prove ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩⊑\llparenthesisc\rrparenthesis for all c, but that would be a weak link between the transformer and h-transformer semantics. A stronger link can be forged as follows.
We say Q∈℘2Σ is subset closed iff Q=sscQ
where the subset closure operator ssc is defined by p∈sscQ iff \exists q\in\mathbb{Q}\>\mbox{\small\bullet}\>p\subseteq q.
For example, the set A used in (5) is subset closed.
Observe that ssc=⟨⊇⟩.
Lemma 3
For transformers φ,ψ:℘Σ→℘Σ and condition b,
if Q=sscQ then
⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;φ⊔\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp\mbox;ψ⟩Q=(⟨φ⟩◃b▹⟨ψ⟩)Q.
Proof
For the LHS, by definitions:
[TABLE]
For the RHS, again by definitions:
[TABLE]
Now (∗)⊆(†) by instantiating t:=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppq and u:=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppq, so LHS ⊆ RHS is proved—as expected, given (9).
If Q is subset closed, we get (†)⊆(∗) as follows.
Given q,t,u in (†), let q^{\prime}:=t\mathbin{\mbox{\small\cup}}u.
Then t=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppq′ and u=\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppq because \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp and \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp are filters.
And q′∈Q by subset closure.
Taking q:=q′ in (∗) completes the proof of RHS ⊆ LHS.
∎
Preservation of subset closure.
In light of Lemma 3, we aim to restrict attention to h-transformers on subset closed sets.
To this end we introduce a few notations.
The subset-closed powerset operator ℘˘ is defined on powersets ℘A,
by
[TABLE]
To restrict attention to h-transformers of type ℘˘(℘Σ)→℘˘(℘Σ) we must show that subset closure is preserved by the semantic constructs.
For any transformer φ,
define PSCφ iff (⊇\mbox;φ)=(φ\mbox;⊇).
The acronym is explained by the lemma to follow.
By definitions, the inclusion (⊇\mbox;φ)⊇(φ\mbox;⊇) is equivalent to
[TABLE]
Recall from Section 2 that the reverse,
(⊇\mbox;φ)⊆(φ\mbox;⊇), is monotonicity of φ.
Lemma 4
PSCφ* implies ⟨φ⟩ preserves subset closure.*
Proof
For any subset closed Q, ⟨φ⟩Q is subset closed because
⟨⊇⟩(⟨φ⟩Q)=⟨φ\mbox;⊇⟩Q=⟨⊇\mbox;φ⟩Q=⟨φ⟩(⟨⊇⟩Q)=⟨φ⟩Q
using functoriality of ⟨−⟩, PSCφ, and sscQ=Q.
∎
It is straightforward to show PSC⊥.
The following is a key fact, but also a disappointment that leads us
away from nondeterminacy.
Lemma 5
If R is a partial function (i.e., R∪\mbox;R⊆id)
then PSC⟨R⟩.
Proof
In accord with (11) we show for any q,r that
r\subseteq\langle R\rangle q\Rightarrow\exists s\subseteq q\>\mbox{\small\bullet}\>\varphi s=r.
Suppose r⊆⟨R⟩q.
Let s=(\langle R^{\cup}\rangle r)\mathbin{\mbox{\small\cap}}q, so for any x we have
x∈s iff x∈q and \exists y\in r\>\mbox{\small\bullet}\>xRy.
We have s⊆q and it remains to show ⟨R⟩s=r, which holds because for any y
[TABLE]
Using dots to show domain and range elements, the diagram on the left
is an example R such that PSC⟨R⟩ but R is not a partial function.
The diagram on the right is a relation, the image of which does not satisfy PSC.
As a consequence of Lemmas 4 and 5 we have the following.
Lemma 6
If R is a partial function then ⟨⟨R⟩⟩:℘2A→℘2B preserves subset closure.
The theorem.
To prove \llparenthesisc\rrparenthesis=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩,
we want to identify a subset of ℘Σ→℘Σ
satisfying two criteria.
First, \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp can be defined within it, so in particular it is closed under G in Figure 2.
Second, on the subset, ⟨−⟩ is strict and continuous
into ℘˘(℘Σ)→℘˘(℘Σ), to enable the use of fixpoint fusion.
Strictness is the reason444In [10],
other reasons are given for using
{∅} rather than ∅ as the false hyperproperty.
to disallow the empty set in (10);
it makes ⊥ (which equals ⟨⊥⟩) the least element, whereas otherwise the least element
would be \lambda\mathbb{Q}\>\mbox{\small\bullet}\>\emptyset.
We need the subset to be closed under pointwise union, at least for chains, so that ⟨−⟩ is continuous.
Given that ⟨R⟩ is universally disjunctive for any R,
Proposition 1 suggests restricting to universally disjunctive transformers.
Lemma 4 suggests restricting to transformers satisfying PSC.
But we were not able to show the universally disjunctive transformers satisfying PSC are closed under limits.
We proceed as follows.
Define Domφ={x∣φ{x}=∅} and note that Dom⟨R⟩=domR where domR is the usual domain of a relation.
By a straightforward proof we have:
Lemma 7
For universally disjunctive φ and any r we have \varphi\,r=\varphi(r\mathbin{\mbox{\small\cap}}\mathsf{Dom}\,\varphi).
Lemma 8
For universally disjunctive φ,ψ with \mathsf{Dom}\,\varphi\mathbin{\mbox{\small\cap}}\mathsf{Dom}\,\psi=\emptyset,
if PSCφ and PSCψ then PSC(φ⊔ψ).
Proof
For any q,r with r⊆(φ⊔ψ)q we need to show
\exists s\subseteq q\>\mbox{\small\bullet}\>(\varphi\mathbin{\sqcup}\psi)s=r.
First observe
[TABLE]
We use (∗) to show that t\mathbin{\mbox{\small\cup}}t^{\prime} witnesses PSC(φ⊔ψ), as follows:
(\varphi\mathbin{\sqcup}\psi)(t\mathbin{\mbox{\small\cup}}t^{\prime})=\varphi(t\mathbin{\mbox{\small\cup}}t^{\prime})\mathbin{\mbox{\small\cup}}\psi(t\mathbin{\mbox{\small\cup}}t^{\prime})=\varphi t\mathbin{\mbox{\small\cup}}\psi t^{\prime}=s\mathbin{\mbox{\small\cup}}s^{\prime}=r
using also the definition of ⊔, and
\varphi(t\mathbin{\mbox{\small\cup}}t^{\prime})=\varphi\,t and
\psi(t\mathbin{\mbox{\small\cup}}t^{\prime})=\psi\,t^{\prime} from Lemma 7 and (∗).
∎
Lemma 9
If Φ and Ψ preserve subset closure then
(Φ◃b▹Ψ)Q is subset closed (regardless of whether Q is).
Proof
Suppose
q is in (Φ◃b▹Ψ)Q and
q′⊆q.
So according to (8)
there are r,s,p with
p∈Q,
q=r\mathbin{\mbox{\small\cup}}s,
r∈Φ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)), and
s∈Ψ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)).
Let r^{\prime}=r\mathbin{\mbox{\small\cap}}q^{\prime} and
s^{\prime}=s\mathbin{\mbox{\small\cap}}q^{\prime}, so r′⊆r and s′⊆s.
Because powersets are subset closed,
Φ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)) and
Ψ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)) are subset closed, hence
r′∈Φ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppb\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)) and
s′∈Ψ(℘(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp¬b\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppp)).
As q^{\prime}=r^{\prime}\mathbin{\mbox{\small\cup}}s^{\prime},
we have q′∈(Φ◃b▹Ψ)Q.
∎
It is straightforward to prove that Φ\oveeΨ preserves subset closure if Φ,Ψ do, similar to the proof of Lemma 9.
By contrast, Φ⊗Ψ does not preserve subset closure even if Φ and Ψ do.
Next, we confirm that \llparenthesis−\rrparenthesis can be defined within the monotonic functions
℘˘(℘Σ)→℘˘(℘Σ).
Lemma 10
For all c,Q, if Q is subset closed then so is \llparenthesisc\rrparenthesisQ, provided that
PSC⟨[[atm]]⟩ for every atm.
Proof
By induction on c.
•
atm:
\llparenthesisatm\rrparenthesis is ⟨⟨[[atm]]⟩⟩ so by assumption
PSC⟨[[atm]]⟩ and Lemma 4.
•
skip: immediate.
•
c;d: by definitions and induction hypothesis.
•
c⊕d: by induction hypothesis and observation above about \ovee.
•
ifbthencelsed: by Lemma 9 and induction hypothesis.
•
whilebdoc: Because ⊥ is least in ℘˘(℘Σ)→℘˘(℘Σ), we have ⊥⊑H⊥,
so using monotonicity of H we have Kleene iterates.
Suppose Q is subset closed.
To show lfpHQ is subset closed, note that lfpH=HγQ where γ is some ordinal. We show that HαQ is subset closed, for every α up to γ,
by ordinal induction.
–
H0Q=Q which is subset closed.
–
Hα+1Q=(\llparenthesisc\rrparenthesis\mbox;Hα◃b▹\llparenthesisskip\rrparenthesis)Q by definition of H.
Now Hα preserves subset closure by the ordinal induction hypothesis,
and \llparenthesisc\rrparenthesis preserves subset closure by the main induction hypothesis.
So \llparenthesisc\rrparenthesis\mbox;Hα preserves subset closure, as does \llparenthesisskip\rrparenthesis.
Hence \llparenthesisc\rrparenthesis\mbox;Hα◃b▹\llparenthesisskip\rrparenthesis preserves subset closure
by Lemma 9.
–
HβQ=(⊔α<βHα)Q (for non-0 limit ordinal β),
which in turn equals
\mathbin{\mbox{\small\cup}}_{\alpha<\beta}(H^{\alpha}\,\mathbb{Q}) because ⊔ is pointwise.
By induction, each HαQ is subset closed, and closure is preserved by union,
so we are done.
∎
Returning to the two criteria for a subset of ℘Σ→℘Σ,
suppose [[atm]] is a partial function, for all atm — in short,
atoms are deterministic.
If in addition c is ⊕-free, then [[c]] is a partial function.
Under these conditions, by Proposition 1, \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp is the direct image of a partial function.
Let IPF be the subset of ℘Σ→℘Σ that are direct images of partial functions,
i.e.,
\mathsf{IPF}=\{\varphi\in\wp\Sigma\to\wp\Sigma\mid\exists R\>\mbox{\small\bullet}\>\varphi=\langle R\rangle\mbox{ and }R^{\cup}\mathbin{\mbox{\small{;}}}R\subseteq id\}.
Observe that IPF is closed under G,
because for φ∈IPF with φ=⟨R⟩ we have \mathsf{G}\langle R\rangle=\langle\llbracket\,b\,\rrbracket\rangle\mathbin{\mbox{\small{;}}}\langle\llbracket\,c\,\rrbracket\rangle\mathbin{\mbox{\small{;}}}\langle R\rangle\mathbin{\sqcup}\langle\llbracket\,\neg b\,\rrbracket\rangle=\langle\llbracket\,b\,\rrbracket\mathbin{\mbox{\small{;}}}\llbracket\,c\,\rrbracket\mathbin{\mbox{\small{;}}}R\rangle\mathbin{\sqcup}\langle\llbracket\,\neg b\,\rrbracket\rangle=\langle\llbracket\,b\,\rrbracket\mathbin{\mbox{\small{;}}}\llbracket\,c\,\rrbracket\mathbin{\mbox{\small{;}}}R\mathbin{\mbox{\small\cup}}\llbracket\,\neg b\,\rrbracket\rangle
and the union is of partial functions with disjoint domains
so it is a partial function.
We have ⊥⊑G⊥ because ⊥ is the least element
in ℘Σ→℘Σ.
By Lemma 6, when ⟨−⟩ is restricted to IPF, its range is included in
℘˘(℘Σ)→℘˘(℘Σ).
In IPF, lubs of chains are given by pointwise union,
so ⟨−⟩ is a strict and continuous function
from IPF to ℘˘(℘Σ)→℘˘(℘Σ).
To state the theorem,
we write =˙ for extensional equality
on h-transformers of type ℘˘(℘Σ)→℘˘(℘Σ),
i.e., equal results on all subset closed Q.
Theorem 4.1
⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩=˙\llparenthesisc\rrparenthesis, provided atoms are deterministic and c is ⊕-free.
Proof
By induction on c.
For the cases of skip, atoms, and \mbox; the arguments preceding (6) are still valid.
For conditional, observe
[TABLE]
Finally, the loop:
[TABLE]
The antecedent for fusion is
\forall\varphi\>\mbox{\small\bullet}\>\langle\mathsf{G}\varphi\rangle=\mathsf{H}\langle\varphi\rangle
and it holds because for any φ:
[TABLE]
5 Specifications and refinement
We wish to conceive of specifications as miraculous programs that can
achieve by refusing to do, can choose the best angelically, and can compute the uncomputable.
We wish to establish rigorous connections between programs and specifications,
perhaps by deriving a program that can be automatically compiled for execution,
perhaps by deriving a specification that can be inspected to determine the usefulness or trustworthiness of the program.
A good theory may enable automatic derivation in one direction or the other,
but should also account for ad hoc construction of proofs.
Simple reasons should be expressed simply, so algebraic laws and transitive refinement chains are important.
In this inconclusive section, we return to the general notion of hyperproperty and consider how the h-transformer semantics sheds light on refinement for hyperproperties. Initially we leave aside the signature/semantics notations.
Let R:Σ⊸Σ be considered as a program, and H be a hyperproperty, that is, H is a set of programs. Formally: H∈℘(Σ⊸Σ). For R to satisfy H means, by definition, that R∈H.
The example of possibilistic noninterference shows that in general trace refinement is unsound: R∈H does not follow from
S∈H and S⊇R.
It does follow in the case that H is subset closed. Given that ℘(Σ⊸Σ) is a huge space, one may hope that specifications of practice interest may lie in relatively tame subsets. Let us focus on subset closed hyperproperties,
for which one form of chain looks like
[TABLE]
Although this is a sound way to prove R∈H, it does not seem sufficient, at least for examples like NI which require some degree of determinacy. The problem is that for
intermediate steps of trace refinement it is helpful to use nondeterminacy for the sake of abstraction and underspecification, so finding suitable S and T may be difficult.
One approach to this problem is to use a more nuanced notion of refinement, that preserves a hyperproperty of interest.
For confidentiality, Banks and Jacob explore this approach in the setting of UTP [6].
Another form of chain looks like
[TABLE]
where most intermediate terms are at the hyper level, i.e., S and T are, like H, elements of
℘(Σ⊸Σ).
The chain is a sound way to prove R∈H, even if H is not subset closed.
But in what way are the intermediates S,T,… expressed, and by what reasoning are the containments established? What is the relevant algebra, beyond elementary set theory?
The development in Section 4 is meant to suggest a third form of chain:
[TABLE]
Here the intermediate terms are of type ℘˘(℘Σ)→℘˘(℘Σ) and ⊑ is the pointwise ordering (used already in Section 4).
The good news is that if the intermediate terms are expressed using program notations, they may be amenable to familiar laws such as those of Kleene algebra with
tests [23, 38], for which relations are a standard model.
A corollary of Theorem 4.1 is that the laws hold for deterministic terms expressed in the signature (4).
To make this claim precise one might spell out the healthiness conditions of elements in the range of \llparenthesis−\rrparenthesis,
but more interesting would be to extend the language with specification constructs, using (in)equational conditions like
R∪\mbox;R⊆id as antecedents in conditional laws of healthy fragments.
We leave this aside in order to focus on a gap in our story so far.
The third form of chain is displayed with elipses on the left because we lack an account of specifications!
Our leading example, NI, is defined as a set of relations, whereas
in (14)
the displayed chain needs the specification, say Ψ, to have type ℘˘(℘Σ)→℘˘(℘Σ).
The closest we have come is the characterization R∈NI iff ⟨⟨R⟩⟩A⊆A, see (5).
But this is a set containment, whereas we seek Ψ with
R∈NI iff Ψ⊒⟨⟨R⟩⟩.
In the rest of this section we sketch two ways to proceed.
On the face of it, Ψ⊒⟨⟨R⟩⟩ seems problematic because ⊒ is an ordering on functions.
Given a particular set p∈℘Σ with Agrlp, a noninterfering R makes specific choice of value for lo whereas specification Ψ should allow any value for lo provided that the choice does not depend on the initial value for hi.
One possibility is to escalate further and allow the specification to be a relation ℘˘(℘Σ)⊸℘˘(℘Σ).
To define such a relation, first lift the predicate Agrl on sets to the filter
Agrl:℘(℘Σ)⊸℘(℘Σ) defined by
[TABLE]
Note that A=Agrl(℘Σ).
More to the point, AgrlQ=Q just if each p∈Q satisfies Agrl.
Now define NI,
as a relation
NI:℘˘(℘Σ)⊸℘˘(℘Σ),
by
[TABLE]
This achieves the following: R∈NI iff NI⊇⟨⟨R⟩⟩.
But this inclusion does not compose transitively with ⊒ in the third form of chain,
so we proceed no further in this direction.
The second way to proceed can be described using a variation on the h-transformer semantics
of Section 4.
It will lead us back to
the second form of chain, (13), in particular for NI as H.
The idea resembles UTP models of reactive processes [19, Chapt. 8],
in which an event history is related to its possible extension.
Here we use just pre-post traces, as follows.
Let Trc=Σ×Σ.
Consider a semantics \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′ such that
\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′ has type ℘Trc→℘Trc.
Instead of transforming an initial state to a final one
(or rather, state set as in \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp),
an initial trace (σ,τ) is mapped to traces (σ,υ) for υ with τ[[c]]υ.
The semantics \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′ is not difficult to define (or see [3, sec. 2]).
The upshot is that for S∈℘Trc,
the trace set \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′S is the relation S\mbox;[[c]].
In particular, let init be idΣ, viewed as an element of ℘Trc.
We get \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′init=[[c]], the relation denoted by c.
Lifting, we obtain a semantics \llparenthesis−\rrparenthesis′,
at the level ℘2Trc→℘2Trc,
such that
\llparenthesisc\rrparenthesis′{init} contains [[c]].
(In light of Theorem 4.1 and the discussion preceding it, we do not expect
\llparenthesisc\rrparenthesis′{init} to be just the singleton {[[c]]}, as {init} is not subset closed.)
This suggests a chain of the form
NI⊇…⊇Ψ(ssc{init})⊇\llparenthesisc\rrparenthesis′(ssc{init})∋[[c]]
which proves that [[c]] satisfies NI — and which may be derived from a subsidiary chain of refinements
like Ψ⊒\llparenthesisc\rrparenthesis′ as in the third form of chain, independent of the argument ssc{init}.
This approach has been explored in the setting of abstract interpretation, where the intermediate terms are obtained as a computable approximation of a given program’s semantics. To sketch the the idea we first review abstract interpretation for trace properties.
Mathematically, abstract interpretation is very close to data refinement, where intermediate steps involve changes of data representation.
For example, the state space Σ of R:Σ⊸Σ would be connected with another, say Δ,
by a coupling relation ρ:Δ⊸Σ subject to a simulation condition such as
S\mbox;ρ⊇ρ\mbox;R, recall (3).
With a functional coupling, the connection could be ρ(⟨S⟩Δ)⊒⟨R⟩(ρΔ).
Let T∈Σ⊸Σ be a trace set intended as a trace property specification.
In terms of the trace-computing semantics \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′ above, c satisfies T provided that
T⊇\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′init.
It can be proved by the following chain, ingredients of which are to be explained.
[TABLE]
Here γ:A→℘Trc is like ρ above, mapping some convenient domain A to traces.
The element a∈A is supposed to be an approximation of the initial traces, i.e.,
γa⊇init.
Thus the containment \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′(γa)⊇\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′init is by monotonicity of
semantics.
The next containment, γ(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp♯a)⊇\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′(γa), involves
an “abstract” semantics \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp♯:A→A.
Indeed, the containment is the soundness requirement for such semantics.
What remains is the containment
T⊇γ(\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp♯a) which needs to be checked somehow.
Typically, γ is part of a Galois connection, i.e., it has a lower adjoint α:℘Trc→A
and the latter check is equivalent to
αT≥\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp♯a where ≥ is the order on A.
Ideally it is amenable to automation, but that is beside the point.
The point is to escalate this story to the hyper level, in a chain of this form:
[TABLE]
Now γ has type A→℘2Trc.
Again, the abstract semantics should be sound — condition γ(\llparenthesisc\rrparenthesis♯a)⊇\llparenthesisc\rrparenthesis′(γa) — now with respect to a set-of-trace-set semantics \llparenthesis−\rrparenthesis′ that corresponds to our Fig. 3.
The element a∈A now approximates the set ssc{init}
and \llparenthesisc\rrparenthesis′(γa)⊇\llparenthesisc\rrparenthesis′(ssc{init}) is by monotonicity.
The step H⊇γ(\llparenthesisc\rrparenthesis♯a) may again be checked at the abstract level
as αH≥\llparenthesisc\rrparenthesis♯a.
Of course H is a hyperproperty so the goal is to prove the program is an element of H.
This follows provided that
\llparenthesisc\rrparenthesis′(ssc{init})∋\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′init,
a connection like our Theorem 4.1 except for moving from ℘Σ to ℘Trc.
6 Related work
The use of algebra in unifying theories of programming has been explored in many works including the book that led to the UTP meetings [19, 37, 20].
Methodologically oriented works include the books by Morgan [28] and by Bird and de
Moor [8].
The term hyperproperty was introduced by Clarkson and Schneider who among other things mention that refinement at the level of trace properties is admissible for proving subset closed hyperproperties [10].
They point out that the topological classification of trace properties, i.e., safety and liveness, corresponds to similar notions dubbed hypersafety and hyperliveness. Subset closed hyperproperties strictly subsume hypersafety.
As it happens, NI is in the class called 2-safety that specifies a property as holding for every pair of traces.
For fixed k, one can encode k-safety by a product program, each trace of which represents k traces of the original.
Some interesting requirements, such as quantitative information flow, are not in k-safety for any k.
Epistemic logic is the topic of a textbook [16]
and has been explored in the security literature [5].
Mantel considers a range of security properties via closure operators [24].
The limited usefulness of trace refinement for proving NI even for deterministic programs, as in the chain (12), is discussed by Assaf and Pasqua [3, 27].
The formulation of possibilistic noninterference as
∼˚\mbox;R\mbox;∼˚=R\mbox;∼˚ is due to Joshi and Leino [22]
and resembles the formulation of Roscoe et al. [35].
The textbook of Back and von Wright [4] explores predicate transformer semantics and refinement calculus.
The use of sglt and ∋ in (1) is part of the extensive algebra connecting predicate transformers and relations using categorical notions [14].
Upward closed sets of predicates play an important role in that algebra [33],
which should be explored in connection with the present investigation and its potential application to higher order programs.
The extension of functional programming calculus to imperative refinement is one setting in which strong laws (Cartesian closure) for a well-behaved subset are expressed
as implications with inequational antecedents [32, 34].
These works use backward predicate transformers in order to model general specifications and in particular the combination of angelic and demonic nondeterminacy.
Alternative models with similar aims can be found in
Martin et al [25] and Morris et al [31].
A primary precursor to this paper is the dissertion work of Assaf, which targets refinement chains in the style of abstract interpretation [13, 11].
Assaf’s work [3]
introduced a set-of-sets lifted semantics
from which our h-transformer semantics is adapted.
In keeping with the focus on static analysis, Assaf shows the lifted semantics is an approximation of the underlying one.555Assaf et al use fixpoint fusion in the inequational form mentioned following (2),
to prove soundness of the derived abstract semantics.
Their inequational result corresponding to our Theorem is proved, in the loop case, using explicit induction on approximation chains.
See the proof of Theorem 1 in [3].
Assaf derives an abstraction \llparenthesisc\rrparenthesis♯ for dependency from \llparenthesisc\rrparenthesis (for every c), by calculation, following Cousot [11] and similar to data refinement by calculation [17, 30].
For this purpose and others, it is essential that loops be interpreted by fixpoint at the level of sets-of-sets, so standard fixpoint reasoning is applicable,
as opposed to using ⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppwhilebdoc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩ which is not a fixed point per se.
In fact Assaf derives two abstract semantics \llparenthesisc\rrparenthesis♯:
one for dependency (NI) and one that computes cardinality of low-variation, for quantitative information flow properties. The cardinality abstraction is not in k-safety for any k.
Pasqua and Mastroeni have aims similar to Assaf et al, and investigate several variations on set-of-set semantics of loops [27].
Our example in Section 4 is adapted from their work,
which uses examples to suggest that Assaf’s definition (called “mixed” in [27]) is preferable.
They also point out that, for subset closed hyperproperties,
these variations are precise in the sense of our Theorem 4.1,
strengthening the inequation in Assaf et al.666Displayed formula {\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppc\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSuppT∣T∈T}⊆\llparenthesisc\rrparenthesisT
following Theorem 1 of [3].
A peculiarity of these works is the treatment of conditionals.
Assaf et al take the equation \llparenthesisifbthencelsed\rrparenthesis=⟨\BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSuppifbthencelsed\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp⟩ as the definition of \llparenthesisifbthencelsed\rrparenthesis, but this makes the definition of \llparenthesis−\rrparenthesis non-compositional and thus the inductive proofs a little sketchy.
We do not discern an explicit definition in [27],
but do find remarks like this: “The definition of the collecting hypersemantics is just the additive lift…for every statement, except for the while case.”
As a description of fact, it is true for subset closed hyperproperties (see our Theorem 4.1).
But it is unsuitable as a definition.
We show that a proper definition is possible.
Non-compositionality for sequence and conditional seems difficult to reconcile with proofs by induction on program structure.
It also results in anomalies, e.g., the formulation in Assaf et al means that in case c is a loop, the semantics of c
is different from the semantics of iftruethencelseskip.
The obscurity is rectified when the semantics is restricted to subset closed sets,
as spelled out in detail in an unpublished note [15],
confirming remarks by Pasqua and Mastroeni [26]
elaborated in [27] and in the thesis of Pasqua.
Although similar to results in the preceding work, our results are novel in a couple of ways.
Our semantics is for a language with nondeterminacy, unlike theirs.
Of course, nondeterministic programs typically fail to satisfy NI and related properties,
and our theorem is restricted to deterministic programs.
A minor difference is that they formulate loop semantics in a standard form mentioned in Footnote 3 that asserts the negated guard following the fixpoint rather than as part of it.
Those works use semantics mapping traces to traces (or sets of sets thereof),
like our \BeginAccSuppmethod=hex,unicode,ActualText=2983fj\EndAccSupp−\BeginAccSuppmethod=hex,unicode,ActualText=2984jg\EndAccSupp′ and \llparenthesis−\rrparenthesis′,
said to be needed in order to express dependency.
We have shown that states-to-states is sufficient to exhibit both the anomaly and its resolution.
It suffices for specifications of the form (5)
and may facilitate further investigation owing to its similarity to many variations on relational and transformer semantics.
Apropos NI, the formulation (5) is robust in the sense that it generalizes to more nuanced notions of dependency:
⟨⟨R⟩⟩P⊆Q where P expresses agreement on some projections of the input (e.g., agreement on whether a password guess is correct, or agreement on some aggregate value derived from a sensitive database)
and Q expresses agreement on the observable output values.
Such policies are the subject of [36].
Banks and Jacob [6] formalize general confidentiality policies in UTP
and introduce a family of confidentiality preserving refinement relations.
The ideas are developed further in subsequent work
where confidentiality-violating refinements are represented as miracles [7]
and knowledge is explicitly represented by sets encoding alternate executions,
an idea that has appeared in other guises [29, 2].
7 Conclusion
We have given what, to the best of our knowledge, is the first
compositional definition of semantics at the hyper level.
Moreover, we proved that it is the lift of a standard semantics when restricted to subset closed hyperproperties. The latter is a “forward collecting semantics”
in the terminology of abstract interpretation.
The new semantics includes nondeterministic constructs, although the lifting equivalence is only proved for the deterministic fragment.
Although deterministic noninterference is a motivating example, there are other interesting hyperproperties such as quantitative information flow that can be expressed in subset closed form
and which are meaningful for nondeterministic programs.
This is one motivation for further investigation including the following questions.
∙
Restricting to subset closed hyperproperties is sufficient to make possible a compositional fixpoint semantics at the hyper level that accurately represents the underlying semantics — is it necessary?
∙ The h-transformer semantics allows nondeterministic choice and nondeterministic atoms that satisfy PSC (in light of Lemma 10), and in fact the definitions
can be used for a semantics in ℘(℘Σ)→℘(℘Σ), into which
℘˘(℘Σ)→℘˘(℘Σ) embeds nicely owing to joins being pointwise —
but what exactly is the significance of determinacy?
∙ Are disjunctive transformers satisfying PSC closed under join?
What is a good characterization of transformers that are images of relations?
A person not familiar with unifying theories of programming may wonder whether programs are specifications.
Indeed, the author was once criticized by a famous computer scientist who objected to refinement calculi on the grounds that underspecification and nondeterminacy are distinct notions that ought not be confused — though years later he published a soundness proof for a program logic, in which that confusion is exploited to good effect.
A positive answer to the question can be justified by embedding programs in a larger space of specifications, in a way that faithfully reflects a given semantics of programs.
Our theorem is a result of this kind. The larger space makes it possible to express important requirements such as noninterference.
However, we do not put forward a compelling notion of specification that encompasses hyperproperties and supports a notion of refinement analogous to existing notions for trace properties. Rather, we hope the paper inspires or annoys the reader enough to provoke further research.
Acknowledgements.
Anonymous reviewers offered helpful suggestions
and pointed out errors, omissions, and infelicities in an earlier version.
The authors were partially supported by NSF award 1718713.
Bibliography38
The reference list from the paper itself. Each links out to its DOI / PubMed record.
1[1] Aarts, C., Backhouse, R.C., Boiten, E.A., Doornbos, H., van Gasteren, N., van Geldrop, R., Hoogendijk, P.F., Voermans, E., van der Woude, J.: Fixed-point calculus. Inf. Process. Lett. 53 (3), 131–136 (1995)
2[2] Assaf, M., Naumann, D.A.: Calculational design of information flow monitors. In: Computer Security Foundations (2016)
3[3] Assaf, M., Naumann, D.A., Signoles, J., Totel, É., Tronel, F.: Hypercollecting semantics and its application to static analysis of information flow. In: POPL (2017)
4[4] Back, R.J., von Wright, J.: Refinement Calculus: A Systematic Introduction. Springer-Verlag (1998)
5[5] Balliu, M., Dam, M., Guernic, G.L.: Epistemic temporal logic for information flow security. In: Programming Languages and Analysis for Security (2011)
6[6] Banks, M.J., Jacob, J.L.: Unifying theories of confidentiality. In: Unifying Theories of Programming (2010)
7[7] Banks, M.J., Jacob, J.L.: On integrating confidentiality and functionality in a formal method. Formal Asp. Comput. 26 (5), 963–992 (2014)
8[8] Bird, R., de Moor, O.: Algebra of Programming. Prentice-Hall (1996)