Strand Spaces with Choice via a Process Algebra Semantics
Fan Yang, Santiago Escobar, Catherine Meadows, Jos\'e, Meseguer

TL;DR
This paper introduces a formal process algebra semantics for cryptographic protocols with complex choice points, enhancing the expressiveness and analysis capabilities of the Maude-NPA tool.
Contribution
It develops a formal semantics for a process algebra supporting rich choice primitives, integrating it into Maude-NPA for improved protocol modeling and analysis.
Findings
Formal semantics for choice in cryptographic protocols.
Bisimulation between process algebra and Maude-NPA semantics.
Enhanced protocol analysis with expressive choice constructs.
Abstract
Roles in cryptographic protocols do not always have a linear execution, but may include choice points causing the protocol to continue along different paths. In this paper we address the problem of representing choice in the strand space model of cryptographic protocols, particularly as it is used in the Maude-NPA cryptographic protocol analysis tool. To achieve this goal, we develop and give formal semantics to a process algebra for cryptographic protocols that supports a rich taxonomy of choice primitives for composing strand spaces. In our taxonomy, deterministic and non-deterministic choices are broken down further. Non-deterministic choice can be either explicit, i.e., one of two paths is chosen, or implicit, i.e., the value of a variable is chosen non-deterministically. Likewise, deterministic choice can be either an explicit if-then-else choice, i.e., one path is chosen if a…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Authentication Protocols Security · User Authentication and Security Systems · Cryptographic Implementations and Security
Strand Spaces with Choice
via a Process Algebra Semantics
Fan Yang
University of Illinois at Urbana-Champaign, USA
Santiago Escobar
Universitat Politècnica de València, Spain
Catherine Meadows
Naval Research Laboratory, Washington DC, USA
José Meseguer
University of Illinois at Urbana-Champaign, USA
Abstract
Roles in cryptographic protocols do not always have a linear execution, but may include choice points causing the protocol to continue along different paths. In this paper we address the problem of representing choice in the strand space model of cryptographic protocols, particularly as it is used in the Maude-NPA cryptographic protocol analysis tool.
To achieve this goal, we develop and give formal semantics to a process algebra for cryptographic protocols that supports a rich taxonomy of choice primitives for composing strand spaces. In our taxonomy, deterministic and non-deterministic choices are broken down further. Non-deterministic choice can be either explicit, i.e., one of two paths is chosen, or implicit, i.e., the value of a variable is chosen non-deterministically. Likewise, deterministic choice can be either an explicit if-then-else choice, i.e., one path is chosen if a predicate is satisfied, while the other is chosen if it is not, or implicit deterministic choice, i.e., execution continues only if a certain pattern is matched. We have identified a class of choices which includes finite branching and some cases of infinite branching, which we address in this paper.
We provide a bisimulation result between the expected forwards execution semantics of the new process algebra and the original symbolic backwards semantics of Maude-NPA that preserves attack reachability. We have fully integrated the process algebra syntax and its transformation into strands in Maude-NPA. We illustrate its expressive power and naturalness with various examples, and show how it can be effectively used in formal analysis. This allows users to write protocols from now on using the process syntax, which is more convenient for expressing choice than the strand space syntax, in which choice can only be specified implicitly, via two or more strands that are identical until the choice point.
1 Introduction
Formal analysis of cryptographic protocols has become one of the most successful applications of formal methods to security, with a number of tools available and many successful applications to the analysis of protocol standards. In the course of developing these tools it has become clear that there are certain universal features that can best be handled by accounting for them directly in syntax and semantics of the formal specification language, e.g., unguessable nonces, communication across a network controlled by an attacker, and support for the equational properties of cryptographic primitives. Thus a number of different languages have been developed that include these features.
At the same time, it is necessary to provide support for more commonly used constructs, such as choice points that cause the protocol to continue in different ways, and to do so in such a way that they are well integrated with the more specifically cryptographic features of the language. However, in their original form most of these languages do not support choice, or support it only in a limited way.
In particular the strand space model [Fabrega et al(1999)], one of the most popular models designed for use in cryptographic protocol analysis, does not support choice in its original form; strands describe linear sequences of input and output messages, without any branching. One response to dealing with this limitation, and to formalizing strand spaces in general has been to embed the strand space model in some other formal system that supports choice, e.g., event-based models for concurrency [Crazzolara and Winskel(2002)], Petri nets [Fröschle(2009)], or multi-set rewriting [Cervesato et al(2000)]. However, we believe that there is an advantage in introducing choice in the strand space model itself, while proving soundness and completeness with another formal system in order to validate the augmented model. This allows us to concentrate on handling the types of choice that commonly arise in cryptographic protocols. A detailed discussion of related work can be found at Section 11.
1.1 Contributions
This paper is an extended version of the conference paper [Yang et al(2016)]. We address the problem of representing choice in the strand space model, particularly as it is used in the Maude-NPA cryptographic protocol analysis tool. We have identified a class of choices which includes finite branching and some cases of infinite branching. At the theoretical level, we provide a bisimulation result between the expected forwards execution semantics of the new process algebra and the original symbolic backwards semantics of Maude-NPA. This requires extra intermediate forwards and backwards semantics that are included in this paper, together with all the proofs, but were not included in the conference paper [Yang et al(2016)]. What these results make possible is a sound and complete symbolic reachability analysis method for cryptographic protocols with choice modulo equational properties of the cryptographic functions satisfying the finite variant property (FVP) (see [Escobar et al(2017)] for a detailed explanation of how FVP theories are supported in Maude-NPA). At the tool level, we have fully integrated the process algebra syntax, and its transformation into strands, and have developed new methods to specify attack states using the process notation in the recent release of Maude-NPA 3.0 (see Section 10.1, and [Escobar et al(2017)]). None of this was available at the time of the conference paper [Yang et al(2016)]. Furthermore, we illustrate the expressive power and naturalness of adding choice to strand spaces with various examples, and show how it can be effectively used in formal analysis.
1.2 Choice in Maude-NPA
Previous to this work, Maude-NPA offered some ways of handling choice, but its scope was limited, and a uniform semantics of choice was lacking. Several kinds of branching could be handled by a protocol composition method in which a single parent strand is composed with one or more child strands. Although protocol composition is intended for modular construction of protocols, with suitable restrictions it can also be used to express both non-deterministic branching and deterministic branching predicates on pattern matching of output parameters of the parent with the input parameters of the child. However, repurposing composition to branching has its limitations. First of all, it is possible to inadvertently introduce non-deterministic choice into what was intended to be deterministic choice by unwise choice of input and output parameters. Secondly, the limitation to pattern matching rules out certain types of deterministic choice conditioned on predicates that cannot be expressed this way, e.g., disequality predicates. Finally, implementation of choice via composition can be inefficient, since Maude-NPA must evaluate all possible child strands that match a parent strand.
Maude-NPA, in common with many other cryptographic protocol analysis tools, also offers a type of implicit choice that does not involve branching: non-deterministic choice of the values of certain variables. For example, a strand that describes an initiator communicating with a responder generally uses variables for both the initiator and responder names; this represents a non-deterministic choice of initiator responder identities. However, the semantic implications of this kind of choice were not that well understood, which made it difficult to determine where it could safely be used. Clearly, a more unified treatment of choice was necessary, together with a formal semantics of choice.
In support of this work we have developed a taxonomy of choice in which the categories of deterministic and non-deterministic choice are further subdivided. First of all, non-deterministic choice is subdivided into explicit and implicit non-deterministic choice. In explicit non-deterministic choice a role111As further explained later, the behaviors of protocol participants, e.g., sender, receiver, server, etc., are described by their respective roles. Since a protocol may have multiple sessions, various participants, called principals, may play the same role in different sessions. chooses either one branch or another at a choice point non- deterministically. In implicit non-deterministic choice a logical choice variable is introduced which may be non-deterministically instantiated by the role. Deterministic choice is subdivided into (explicit) if-then-else choice and implicit deterministic choice. In if-then-else choice a predicate is evaluated. If the predicate evaluates to true one branch is chosen, and if it evaluates to false another branch is chosen. Deterministic choice with more than two choices can be modeled by nesting of if-then-else choices. In implicit deterministic choice, a term pattern is used as an implicit guard, so that only messages matching such pattern can be chosen i.e., accepted, by the role. Although implicit deterministic choice can be considered a special case of if-then-else choice in which the second branch is empty, it is often simpler to treat it separately. Classifying choice in this way allows us to represent all possible behaviors of a protocol by a finite number of strands modeling possible executions, while still allowing the variables used in implicit non-deterministic and deterministic choice to be instantiated in an infinite number of ways.
1.3 A Motivating Example
In this section we introduce a protocol that we will use as a running example. It is a simplified version of the handshake protocol in TLS 1.3 [Rescorla(2016)], a proposed update to the TLS standard for client-server authentication. This protocol, like most other protocol standards, offers a number of different choices that are applied in different situations. In order to make the presentation and discussion manageable, we present only a subset here: the client chooses a Diffie-Hellman group, and proposes it to the server. The server can either accept it or request that the client proposes a different group. In addition, the server has the option of requesting that the client authenticates itself. We present the protocol at a high level similar to the style used in [Rescorla(2016)].
Example 1.1**.**
We let a dashed arrow denote an optional message, and an asterisk * denote an optional field.
* ClientHello, Key_Share *
The client sends a Hello message containing a nonce and the Diffie-Hellman group it wants to use. It also sends a Diffie-Hellman key share.
- •
*1.1 HelloRetryRequest *
The server may optionally reject the Diffie-Hellman group proposed by the client and request a new one.
- •
*1.2 DHGroup, Key_Share *
The client proposes a new group and sends a new key share. 2. 2.
* ServerHello, Key_Share, *
* {AuthReq},{CertificateVerify}*, *{Finished}
The server sends its own Hello message and a Diffie-Hellman key share. It may optionally send an AuthReq to the client to authenticate itself with a public key signature from its public key certificate. It then signs the entire handshake using its own public key in the CertificateVerify field. Finally, in the Finished field it computes a MAC over the entire handshake using the shared Diffie-Hellman key. The notation denotes a field encrypted using the shared Diffie-Hellman key. 3. 3.
* CertificateVerify*, Finished *
If the client received an AuthReq from the server it returns its own CertificateVerify and Finished fields.
1.4 Plan of the paper
The rest of the paper is organized as follows. After some preliminaries in Section 2 and a high level introduction of the Maude-NPA tool in Section 3, we first define the process algebra syntax and operational semantics in Section 4. In Section 5 we extend Maude-NPA’s strand space syntax to include choice operators. The main bisimulation results between the expected forwards semantics of the process algebra in Section 4 and the original symbolic backwards strand semantics of Maude-NPA of Section 3 are Theorems 4 and 5. They are proved by introducing an intermediate semantics, a forward strand space semantics originally introduced in [Escobar et al(2014)]. First, in Section 5 we extend the strand space model with constraints, since strands are the basis of both the forwards semantics and the backwards semantics of Maude-NPA. In Section 6 we augment the forwards strand space semantics of [Escobar et al(2014)] with choice operators and operational semantic rules to produce a constrained forwards semantics. In Section 7 we prove bisimilarity of the process algebra semantics of Section 4 and the constrained forwards semantics of Section 6. In [Escobar et al(2014)] the forwards strand space semantics was proved sound and complete w.r.t. the original symbolic backwards semantics of Maude-NPA and, therefore, such proofs had to be extended to handling constraints. In Section 8 we augment the original symbolic backwards semantics of Maude-NPA with choice operators and operational semantic rules to produce a constrained backwards semantics. In Section 9 we then prove that the constrained backwards semantics is sound and complete with respect to the constrained forwards semantics. By combining the bisimulation between the process algebra and the constrained forwards semantics on the one hand (Theorem 1) and the bisimulation between the constrained forwards semantics and the constrained backwards semantics on the other hand (Theorems 3 and 2) we obtain the main bisimulation results (Theorems 4 and 5). Finally, in Section 10 we describe how the process algebra has been fully integrated into Maude-NPA and show some experiments we have run using Maude-NPA on various protocols exhibiting both deterministic and non-deterministic choice. In Section 11 we discuss related and future work, in particular the potential of using the process algebra syntax as a specification language. Finally, we conclude in Section 12.
2 Preliminaries
We follow the classical notation and terminology for term rewriting and for rewriting logic and order-sorted notions, see [Meseguer(1992)]. We assume an order-sorted signature with poset of sorts . We also assume an -sorted family of disjoint variable sets with each countably infinite. is the set of terms of sort , and is the set of ground terms of sort . We write and for the corresponding order-sorted term algebras. For a term , denotes the set of variables in .
A substitution is a sorted mapping from a finite subset of to . Substitutions are written as where the domain of is and the set of variables introduced by terms is written . The identity substitution is denoted id. Substitutions are homomorphically extended to . The application of a substitution to a term is denoted by . For simplicity, we assume that every substitution is idempotent, i.e., satisfies . This ensures . The restriction of to a set of variables is . Composition of two substitutions and is denoted by . A substitution is a ground substitution if .
A -equation is an unoriented pair , where for some sort . Given and a set of -equations, order-sorted equational logic induces a congruence relation on terms . The -equivalence class of a term is denoted by and and denote the corresponding order-sorted term algebras modulo . Throughout this paper we assume that for every sort , because this affords a simpler deduction system. An equational theory is a pair with an order-sorted signature and a set of -equations. The -subsumption preorder (or just if is understood) holds between , denoted (meaning that is more general than modulo ), if there is a substitution such that ; such a substitution is said to be an -match from to .
An -unifier for a -equation is a substitution such that . For , a set of substitutions is said to be a complete set of unifiers for the equality modulo away from iff: (i) each is an -unifier of ; (ii) for any -unifier of there is a such that ; (iii) for all , and . If the set of variables is irrelevant or is understood from the context, we write instead of . An -unification algorithm is complete if for any equation it generates a complete set of -unifiers. A unification algorithm is said to be finitary and complete if it always terminates after generating a finite and complete set of solutions.
A rewrite rule is an oriented pair , where222We do not impose the requirement , since extra variables (e.g., choice variables) may be introduced in the righthand side of a rule. Rewriting with extra variables in righthand sides is handled by allowing the matching substitution to instantiate these extra variables in any possible way. Although this may produce an infinite number of one-step concrete rewrites from a term due to the infinite number of possible instantiations, the symbolic, narrowing-based analysis used by Maude-NPA and explained below can cover all those infinite possibilities in a finitary way.
and for some sort . An (unconditional) order-sorted rewrite theory is a triple with an order-sorted signature, a set of -equations, and a set of rewrite rules.
The rewriting relation on , written or holds between and iff there exist , and a substitution , such that , and . The subterm is called a redex. The relation on is , i.e., iff there exists s.t. . Note that on induces a relation on the free -algebra by iff . The transitive (resp. transitive and reflexive) closure of is denoted (resp. ).
The relation can be difficult to compute. However, under the appropriate conditions it is equivalent to the relation in which it is enough to compute the relationship on any representatives of two -equivalence classes. A relation on is defined as: (or just ) iff there exist , a rule in , and a substitution such that and .
Let be a term and be a set of variables such that , the -narrowing relation on is defined as ( if is understood, if are also understood, and if is also understood) if there is a non-variable position , a rule properly renamed s.t. , and a unifier for , such that . For convenience, in each narrowing step we only specify the part of that binds variables of . The transitive (resp. transitive and reflexive) closure of is denoted by (resp. ). We may write if there are and substitutions such that , , and .
Maude-NPA uses backwards narrowing (i.e., uses protocol rules “in reverse” as rules ) modulo the algebraic properties of cryptographic functions as a sound and complete reachability analysis method. Section 9 gives a detailed proof of the soundness and completeness of this method for strands with choice.
3 Overview of Maude-NPA
Here we give a high-level summary of Maude-NPA. For further details please see [Escobar et al(2017)].
Given a protocol , we define its specification in the strand space model as a rewrite theory of the form , where (i) the signature is split into predefined symbols for strand syntax and user-definable symbols based on a parametric sort of messages, (ii) the algebraic properties are also split into the algebraic properties of the strand notation and the user-definable algebraic properties for the cryptographic functions, and (iii) the transition rules are defined on states, i.e., terms of a predefined sort . They are reversed for backwards execution.
In Maude-NPA states are modeled as elements of an initial algebra , i.e., an -equivalence class with a ground -term. A state has the form where is an associative-commutative union operator with identity symbol . Each element in the set is either a strand or the intruder knowledge at that state.
The intruder knowledge belongs to the state and is represented as a set of facts using the comma as an associative-commutative union operator with identity element . There are two kinds of intruder facts: positive knowledge facts (the intruder knows message , i.e., ), and negative knowledge facts (the intruder does not yet know but will know it in a future state, i.e., ), where is a message expression.
A strand [Fabrega et al(1999)] specifies the sequence of messages sent and received by a principal executing a given role in the protocol and is represented as a sequence of messages with either (also written ) representing an input message, or (also written ) representing an output message. Note that each is a term of a special sort Msg.
Strands are used to represent both the actions of honest principals (with a strand specified for each protocol role) and the actions of an intruder (with a strand for each action an intruder is able to perform on messages). In Maude-NPA strands evolve over time; the symbol is used to divide past and future. That is, given a strand , messages are the past messages, and messages are the future messages ( is the immediate future message). A strand is shorthand for . An initial state is a state where the bar is at the beginning for all strands in the state, and the intruder knowledge has no fact of the form . A final state is a state where the bar is at the end for all strands in the state and there is no intruder fact of the form .
Since the number of states is in general infinite, rather than exploring concrete protocol states Maude-NPA explores symbolic strand state patterns on the free -algebra over a set of variables . In this way, a state pattern represents not a single concrete state but a possibly infinite set of such states, namely all the instances of the pattern where the variables have been instantiated by concrete ground terms.
The semantics of Maude-NPA is expressed in terms of the following forward rewrite rules that describe how a protocol moves from one state to another via the intruder’s interaction with it.
[TABLE]
[TABLE]
where and are variables denoting a list of strand messages, IK is a variable for a set of intruder facts ( or ), SS is a variable denoting a set of strands, and , denote a list of strand messages. The set of backwards state transition rules is defined by reversing the direction of the above set of rules . In the backwards executions of , marks when the intruder learnt .
One uses Maude-NPA to find an attack by specifying an insecure state pattern called an attack pattern. Maude-NPA attempts to find a path from an initial state to the attack pattern via backwards narrowing (narrowing using the rewrite rules with the orientation reversed). That is, a narrowing sequence from an initial state to an attack state is searched in reverse as a backwards path from the attack state to the initial state. Maude-NPA attempts to find paths until it can no longer form any backwards narrowing steps, at which point it terminates. If at that point it has not found an initial state, the attack pattern is shown to be unreachable modulo . (Section 9 gives a detailed proof of the soundness and completeness of this symbolic method for the Maude-NPA extension supporting strands with choice). Note that Maude-NPA places no bound on the number of sessions, so reachability is undecidable in general. Note also that Maude-NPA does not perform any data abstraction such as a bounded number of nonces. However, the tool makes use of a number of sound and complete state space reduction techniques that help to identify unreachable and redundant states, and thus make termination more likely.
4 A Process Algebra for Protocols with Choice
In this section we define a process algebra that extends the strand space model to naturally specify protocols exhibiting choice points. Throughout the paper we refer to this process algebra as the protocol process algebra.
The rest of this section is organized as follows. First, in Section 4.1 we define the syntax of the protocol process algebra and state the requirements that a well-formed process must satisfy. Then in Section 4.2, we explain how protocol specifications can be defined in this process algebra. In Section 4.3 we then define the operational semantics of the protocol process algebra. Note that the operational semantics of Maude-NPA given in Section 3 corresponds to a symbolic backwards semantics while in Section 4.3 we give a rewriting-based forwards semantics for process algebra. Sections 7 and 9 will relate these two semantics using bisimulations.
4.1 Syntax of the Protocol Process Algebra
In the protocol process algebra the behaviors of both honest principals and the intruder are represented by labeled processes. Therefore, a protocol is specified as a set of labeled processes. Each process performs a sequence of actions, namely, sending or receiving a message, and may perform deterministic or non-deterministic choices. The protocol process algebra’s syntax is parameterized333More precisely, as explained in Section 4.2, they are parameterized by a user-definable equational theory having a sort of messages. by a sort of messages and has the following syntax:
[TABLE]
- •
stands for a process configuration, that is, a set of labeled processes. The symbol & is used to denote set union for sets of labeled processes.
- •
stands for a labeled process, that is, a process with a label . refers to the role of the process in the protocol (e.g., initiator or responder). is a natural number denoting the identity of the process, which distinguishes different instances(sessions) of a process specification. indicates that the action at stage of the process specification will be the next one to be executed, that is, the first actions of the process for role have already been executed. Note that we omit and in the protocol specification when both and are [math].
- •
defines the actions that can be executed within a process. , and respectively denote sending out or receiving a message . We assume a single channel, through which all messages are sent or received by the intruder. “” denotes sequential composition of processes, where symbol . is associative and has the empty process as identity. “” denotes an explicit nondeterministic choice, whereas “” denotes an explicit deterministic choice, whose continuation depends on the satisfaction of the constraint .
- •
denotes a constraint that will be evaluated in explicit deterministic choices. In this work we only consider constraints that are either equalities () or disequalities () between message expressions.
Let , and be process configurations, and , and be protocol processes. Our protocol syntax satisfies the following structural axioms:
[TABLE]
[TABLE]
The specification of the processes defining a protocol’s behavior may contain some variables denoting information that the principal executing the process does not yet know, or that will be different in different executions. In all protocol specifications we assume three disjoint kinds of variables:
- •
fresh variables: these are not really variables in the standard sense, but names for constant values in a data type of unguessable values such as nonces. A fresh variable is always associated with a role in the protocol. For each protocol session , we associate to a unique name for a constant in the data type . What is assumed is that if (including the case ), the values interpreting and in are both different and unguessable. In particular, for role , the interpretation mapping is injective and random. In our semantics, a constant denotes its (unguessable) interpretation . Throughout this paper we will denote this kind of variables as .
- •
choice variables: variables first appearing in a sent message , which can be substituted by any value arbitrarily chosen from a possibly infinite domain. A choice variable indicates an implicit non-deterministic choice. Given a protocol with choice variables, each possible substitution of these variables denotes a possible continuation of the protocol. We always denote choice variables by uppercase letters postfixed with the symbol “?” as a subscript, e.g., .
- •
pattern variables: variables first appearing in a received message . These variables will be instantiated when matching sent and received messages. Implicit deterministic choices are indicated by terms containing pattern variables, since failing to match a pattern term may lead to the rejection of a message. A pattern term plays the implicit role of a guard, so that, depending on the different ways of matching it, the protocol can have different continuations. This kind of variables will be written with uppercase letters, e.g., .
Note that fresh variables are distinguished from other variables by having a specific sort . Choice variables or pattern variables can never have sort .
To guarantee the requirements on different kinds of variables that can appear in a given process, we consider only well-formed processes. We make this notion precise by defining a function checking whether a given process is well-formed. A labeled process is well-formed if the process it labels is well-formed. A process configuration is well-formed if all the labeled process in it are well-formed. The definition of uses an auxiliary function , retrieving the “shared variables” of a process, i.e., the set of variables that show up in all branches. Below we define both functions, where and are processes, is a message, and is a constraint.
[TABLE]
[TABLE]
Remark 1**.**
Note that the well-formedness property implies that if a process begins with a deterministic choice action if T then Q else R, then all variables in must be instantiated, and thus only one branch may be taken. For this reason, it is undesirable to specify processes that begin with such an action. Furthermore, note that the well-formedness property avoids explicit choices where both possibilities are the process. That is, processes containing either (if T then nil else nil), or (nil ? nil), respectively.
We illustrate the notion of well-formed process below.
Example 4.1**.**
The behavior of a Client initiating an instance of the handshake protocol from Example 1.1 with the Server, where the Server may or may not request the Client to authenticate itself, may be specified by the well-formed process shown below:
[TABLE]
where , and are macros used to construct messages sent in the protocol. The variables and are choice variables denoting the client and Diffie-Hellman group respectively, and the variables and are fresh variables. All other variables are pattern variables. In particular, the variable is a pattern variable that can be instantiated to either or . The Client makes a deterministic choice whether or not to sign its next message with its digital signature, depending on which value of it receives.
Example 4.2**.**
The behavior of a Server who may or may not request a retry from a Client in an instance of the handshake protocol from Example 1.1 may be specified as follows:
[TABLE]
In this case the server nondeterministically chooses to request or not to request a retry. In the case of a retry it waits for the retry message from the client, and then proceeds with the handshake message using the new key information from the client. In the case when it does not request a retry, it sends the handshake message immediately after receiving the client’s Hello message. The variable is a fresh variable, while and are choice variables. denotes the name of the server, and is nondeterministically instantiated to or .
Example 4.3**.**
The following process does not satisfy the well-formedness property.
[TABLE]
The problem with this process is the fresh variable appearing in message , since
[TABLE]
*more specifically, because it does not appear in message , but *
4.2 Protocol Specification in Process Algebra
Given a protocol , we define its specification in the protocol process algebra, written , as a pair of the form , where is an equational theory explained below, and is a term denoting a well-formed process configuration representing the behavior of the honest principals as well as the capabilities of the attacker. That is, , where each , , is either the role of an honest principal or identifies one of the capabilities of the attacker. cannot contain two processes with the same label, i.e., the behavior of each honest principal, and each attacker capability are represented by a unique process. is a set of equations with denoting the protocol’s cryptographic properties and denoting the properties of process constructors. The set of equations is user-definable and can vary for different protocols. Instead, the set of equations is always the same for all protocols. is the signature defining the sorts and function symbols as follows:
- •
is an order-sorted signature defining the sorts and function symbols for the messages that can be exchanged in protocol . However, independently of protocol , must always have a sort as the top sort in one of its connected components. We call a sort a data sort iff it is either a subsort of , or there is a message constructor , with a subsort of . The specific sort for fresh variables is an example of data sort. Choice and pattern variables have sort or any of its subsorts.
- •
is an order-sorted signature defining the sorts and function symbols of the process algebra infrastructure. corresponds exactly to the BNF definition of the protocol process algebra’s syntax in Section 4.1. Although it has a sort for messages, it leaves this sort totally unspecified, so that different protocols may use completely different message constructors and may satisfy different equational properties . Therefore, will be the same signature for any protocol specified in the process algebra. More specifically, contains the sorts for process configurations (), labeled processes (), processes (), constraints (), and messages(), as well as the subsort relations . Furthermore, the function symbols in are also defined according to the BNF definition.
Therefore, the syntax of processes for will be in the union signature , consisting of the protocol-specific syntax , and the generic process syntax through the shared sort .
4.3 Operational Semantics of the Protocol Process Algebra
Given a protocol , a state of consists of a set of (possibly partially executed) labeled processes, and a set of terms in the intruder’s knowledge . That is, a state is a term of the form . Given a state of this form, we abuse notation and write if is a labeled process in the set .
The intruder knowledge models the single channel through which all messages are sent and received. We consider an active attacker who has complete control of the channel, i.e, can read, alter, redirect, and delete traffic as well as create its own messages by means of intruder processes. That is, the purpose of some is to perform message-manipulation actions for the intruder.
State changes are defined by a set of rewrite rules, such that the rewrite theory characterizes the behavior of protocol , where extends by adding state constructor symbols. We assume that a protocol’s execution begins with an empty state, i.e., a state with an empty set of labeled processes, and an empty intruder knowledge. That is, the initial state is always of the form . Each transition rule in is labeled with a tuple of the form , where:
- •
is the role of the labeled process being executed in the transition.
- •
denotes the identifier of the labeled process being executed in the transition. Since there can be more than one process instance of the same role in a process state, is used to distinguish different instances, i.e., and together uniquely identify a process in a state.
- •
denotes the process’ step number since its beginning.
- •
is a ground term identifying the action that is being performed in the transition. It has different possible values: “” or “” if the message was sent (and added to the intruder’s knowledge) or received, respectively; “” if the message was sent but did not increase the intruder’s knowledge, “” if the transition performs an explicit non-deterministic choice, or “” if the transition performs a explicit deterministic choice.
- •
is a number that, if the action that is being executed is an explicit choice, indicates which branch has been chosen as the process continuation. In this case takes the value of either or . If the transition does not perform any explicit choice, then .
Below we describe the set of transition rules that define a protocol’s execution in the protocol process algebra, that is, the set of rules . Note that in the transition rules shown below, denotes the rest of labeled processes of the state (which can be the empty set ).
- •
The action of sending a message is represented by the two transition rules below. Since we assume the intruder has complete control of the network, it can learn any message sent by other principals. Rule (PA++) denotes the case in which the sent message is added to the intruder’s knowledge. Note that this rule can only be applied if the intruder has not already learnt that message. Rule (PA+) denotes the case in which the intruder chooses not to learn the message, i.e., the intruder’s knowledge is not modified, and, thus, no condition needs to be checked. Since choice variables denote messages that are nondeterministically chosen, all (possibly infinitely many) admissible ground substitutions for the choice variables are possible behaviors.
[TABLE]
[TABLE]
- •
As shown in the rule below, a process can receive a message matching a pattern if there is a message in the intruder’s knowledge, i.e., a message previously sent either by some honest principal or by some intruder process, that matches the pattern message . After receiving this message the process will continue with its variables instantiated by the matching substitution, which takes place modulo the equations . Note that the intruder can “delete” a message via choosing not to learn it (executing Rule PA+ instead of Rule PA++) or not to deliver it (failing to execute Rule PA-).
[TABLE]
- •
The two transition rules shown below define the operational semantics of explicit deterministic choices. That is, the operational semantics of an expression. More specifically, rule (PAif1) describes the then case, i.e., if the constraint is satisfied, the process will continue as . Rule (PAif2) describes the else case, that is, if the constraint is not satisfied, the process will continue as . Note that, since we only consider well-formed processes, these transition rules will only be applied if . Note also that since has been fully substituted by the time the if-then-else is executed, and the constraints that we considered in this paper are of the form or , the satisfiability of can be checked by checking whether the corresponding ground equality or disequality holds.
[TABLE]
[TABLE]
- •
The two transition rules below define the semantics of explicit non-deterministic choice . In this case, the process can continue either as , denoted by rule (PA?1), or as , denoted by rule (PA?2). Note that this decision is made non-deterministically.
[TABLE]
- •
The transition rules shown below describe the introduction of a new process from the specification into the state, which allows us to support an unbounded session model. Recall that fresh variables are associated with a role and an identifier. Therefore, whenever a new process is introduced: (a) the largest process identifier will be increased by 1, and (b) new names will be assigned to the fresh variables in the new process. The function in the transition rule below is used to get the largest process identifier of role in the process configuration . The substitution in the transition rule below takes a labeled process and assigns new names to the fresh variables according to the label. More specifically, . In a process state, a role name together with an identifier uniquely identifies a process. Therefore, there is a unique subset of fresh names for each process in the state. In the rest of this paper we will refer to this kind of substitutions as fresh substitutions.
[TABLE]
Note that denotes the action of the state transition, and can be of any of the forms explained above. The function MaxProcId is defined as follows:
[TABLE]
where denotes a process configuration, denotes a process, and denote role names.
Therefore, the behavior of a protocol in the process algebra is defined by the set of transition rules .
Our main result is a bisimulation between the state space generated by the transition rules , associated to the symbolic backwards semantics of Section 3, and the transition rules above, associated to the forwards semantics for process algebra. This is nontrivial, since there are three major ways in which the two semantics differ. The first is that processes “forget” their past, while strands “remember” theirs. The second is that Maude-NPA uses backwards search, while the process algebra proceeds forwards. The third is that Maude-NPA performs symbolic reachability analysis using terms with variables, while the process algebra considers only ground terms.
We systematically relate these different semantics by introducing an intermediate semantics, a forward strand space semantics extending that in [Escobar et al(2014)]. First, in Section 5 we extend the strand space model with constraints, since strands are the basis of both the forwards semantics and the backwards semantics of Maude-NPA. In Section 6 we augment the forwards strand space semantics of [Escobar et al(2014)] with choice operators and operational semantic rules to produce a constrained forwards semantics. In Section 7 we prove bisimilarity of the process algebra semantics of Section 4 and the constrained forwards semantics of Section 6. In [Escobar et al(2014)] the forwards strand space semantics was proved sound and complete w.r.t. the original symbolic backwards semantics of Maude-NPA. But now such proofs had to be extended to handle constraints. In Section 8 we also augment the original symbolic backwards semantics of Maude-NPA with choice operators and operational semantic rules to produce a constrained backwards semantics. In Section 9, we then prove that the constrained backwards semantics is sound and complete with respect to the constrained forwards semantics. By combining the bisimulation between the process algebra and the constrained forwards semantics on the one hand, and the bisimulation between the constrained forwards semantics and the constrained backwards semantics on the other hand, we obtain the main bisimulation result.
Besides providing a detailed semantic account of how the strand model can be extended with choice features, the key practical importance of these bisimulation results is that, with the relatively modest extensions to Maude-NPA described in Section 10.1 and supported by its recent 3.0 release, sound and complete analysis of protocols with choice features specified in process algebra is made possible.
5 Constrained Protocol Strands with Choice
To specify and analyze protocols with choices in Maude-NPA, in this section we extend Maude-NPA’s strand notation by adding new symbols to support explicit choices. We refer to the strands in this extended syntax as constrained protocol strands.
In Section 5.1 we describe the syntax for constrained protocol strands. Then, in Section 5.2 we define a mapping from a protocol specification in the protocol process algebra, as described in Section 4.2, to a specification based on constrained protocol strands.
5.1 Constrained Protocol Strands Syntax
In this section we extend Maude-NPA’s syntax by adding constrained messages, which are terms of the form , where is a constraint, and is a natural number that identifies the continuation of the protocol’s execution, among the two possibilities after an explicit choice point. More specifically, we extend the part of the signature of the Maude-NPA’s syntax we defined in Section 3 as follows:
- •
A new sort represents the constraints allowed in constrained messages, containing three symbols: (i) , (ii) , and (iii) .
- •
A new sort for constrained messages, such that , where is an existing Maude-NPA sort denoting signed messages (i.e., messages with + or -). Therefore, now a strand is a sequence of output, input and constrained messages.
- •
A new operator \{\_,\_\}:\mathsf{Cstr}\ \mathsf{Nat}\to\mathsf{CstrMsg}\ constructs constrained messages.
We refer to this extended signature as . Note that the protocol signature is contained in , and therefore in . Furthermore, in the constrained semantics we allow each honest principal or intruder capability strand to be labeled by the “role” of that strand in the protocol (e.g., (Client) or (Server)). Therefore, strands are now terms of the form , where denotes the role of the strand in the protocol, is a unique identifier distinguishing different instances of strands of the same role, and each can be a sent or received message, i.e., a term of the form , or a constraint message of the form . We often omit , or both and for clarity when they are not relevant.
5.2 Protocol Specification using Constrained Protocol Strands
The behavior of a protocol involving choices can be specified using the syntax presented in Section 5.1 as described below.
Definition 1** (Constrained protocol strand specification).**
Given a protocol , we define its specification by means of constrained protocol strands, written , as a tuple of the form , where is the protocol’s signature (see Section 5.1), and is a set of equations as we defined in Section 3, where denotes the protocol’s cryptographic properties and denotes the protocol-independent properties of constructors of strands. That is, the set of equations may vary depending on different protocols, but the set of equations is always the same for all protocols. is a set of constrained protocol strands as defined in Section 5.1, representing the behavior of the honest principals as well as the capabilities of the attacker. That is, is a set of labeled strands of the form: , where, for each such that , is either the role of an honest principal, or identifies one of the capabilities of the attacker. We note that may contain several strands with the same label, each defining one of the possible paths of such a principal.
The protocol specification described above can be obtained by transforming a specification in the process algebra of Section 4.2 as follows. Given a protocol , its specification in the process algebra , consists of a set of well-formed labeled processes. We transform a term denoting a set of labeled processes into a term denoting a set of constrained protocol strands by the mapping toCstrSS. The intuitive idea is that, since our process contains no recursion, each process can be “deconstructed” as a set of constrained protocol strands, where each such strand represent a possible execution path of the process.
The mapping toCstrSS is specified in Definition 2 below.
Definition 2** (Mapping labeled processes toCstrSS).**
Given a labeled process and a process configuration , we define the mapping recursively as follows:
[TABLE]
where is the empty set of strands. toCstrSS is an auxiliary mapping that maps a term denoting a labeled process to a term that denotes a set of constrained protocol strands. It takes two arguments: a labeled process, and a temporary store that keeps a sequence of messages. More specifically, is defined as follows:*
[TABLE]
where , , and denote processes, is a message, is a constraint, and denotes a list of messages, i.e., input, output or constraint messages.
Note that toCstrSS does not modify output and input messages, since messages are actually terms in in both the protocol process algebra, and the constrained forwards semantics. toCstrSS can be used both as a map between specifications, and as a map from process configurations and strand sets appearing in states.
We illustrate the toCstrSS transformation with the example below.
Example 5.1**.**
If we apply the mapping toCstrSS to the process in Example 4.2 we obtain the following term which denotes a set of strands:
[TABLE]
A protocol specification in the protocol process algebra can then be transformed into a specification of that protocol in the constrained protocol strands described below using toCstrSS.
Definition 3** (Specification transformation).**
Given a protocol and its protocol process algebra specification , with , its specification by means of constrained protocol strands is with .
6 Constrained Forwards Strand Semantics
In this section we extend Maude-NPA’s rewriting-based forwards semantics in [Escobar et al(2014)] by adding new transition rules for constrained messages. We refer to this extended forwards semantics as constrained forwards strand semantics. We show that the process algebra semantics and the constrained forwards strand semantics are label bisimilar. Therefore, protocols exhibiting choices can be specified and executed in an equivalent way in both semantics.
In the constrained forwards strand semantics, state changes are defined by a set of rewrite rules, so that the rewrite theory characterizes the behaviors of protocol .
The set of transition rules is an extension of the transition rules in [Escobar et al(2014)]. The transition rules are generated from the protocol specification. A state consists of a multiset of partially executed strands and a set of terms denoting the intruder’s knowledge. The main differences between the sets and are: (i) new transition rules are added in to appropriately deal with constraint messages, (ii) strands are labeled with the role name, together with the identifier for distinguishing different instances, as explained in Section 5.1, (iii) transitions are also labeled, similarly as in the protocol process algebra, (iv) the global counter for generating fresh variables is deleted from the state. Instead, special unique names are assigned to fresh variable, which simplifies our notation.
In the constrained forwards strand semantics we label each transition rule similarly as in Section 4.3, that is, using labels of the form , where , , , and are as explained in Section 4.3, and in this case is the position of the message that is being exchanged in the state transition. Also, similar to Section 4.3, for transitions that send out messages containing choice variables, all (possibly infinitely many) admissible ground substitutions for the choice variables are possible behaviors. A similar mechanism for distinguishing different fresh variables is used as that explained in Section 4.3. Since messages are introduced into strands in the state incrementally, we instantiate the fresh variables incrementally as well. Recall that fresh variables always first show up in a sent message. Therefore, each time a sent message is introduced into a strand in the state, we assign new names to the fresh variables in the message being introduced. The function MaxStrId for getting the max identifier for a constrained strand of a certain role is similar to MaxProcId in Section 4.3.
Since now messages in a strand can be sent or received messages, i.e., terms of the form or , as well as constraint messages , we represent them in the rules below simply as terms of the form when their exact form is not relevant. We will use the precise form of the message when disambiguation is needed.
Before explaining the new transition rules for constraint messages, we show how the transition rules in [Escobar et al(2014)] are labeled.
The constrained forwards strand semantics extends Maude-NPA’s forwards semantics in [Escobar et al(2014)] by adding transition rules to handle constraint messages, i.e, messages of the form , where can be either or . First, we add the two transition rules below for the cases when such a constrained message comes from explicit choices. Note that, as a consequence of well-formedness, the constraints introduce no new variables, and since the constraints that we consider are of the form or , the satisfiability of can be checked by checking whether the corresponding ground equality or disequality holds.
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
The following set of transition rules adds to the state a new strand whose first message is a constraint message of the form :
[TABLE]
Definition 4**.**
Let be a protocol with signature and equational theory . We define the constrained forwards rewrite theory characterizing as where .
7 Bisimulation between Constrained Forwards Strand Semantics and Process Algebra Semantics
In this section we show that the process algebra semantics and the constrained forwards strand semantics are label bisimilar. We first define PA-State and FW-State, the respective notions of state in each semantics.
Definition 5** (PA-State).**
Given a protocol , a PA-State of is a state in the protocol process algebra semantics that is reachable from the initial state. The initial PA-State is .
Definition 6** (FW-State).**
Given a protocol , a FW-State of is a state in the constrained forwards strand semantics that is reachable from the initial state. The initial FW-State is .
The bisimulation relation is defined based on reachability, i.e., if a PA-State and a FW-State are in the relation , then they both can be reached from their corresponding initial states by the same label sequence. Note that we only consider states that are reachable from the initial states.
Let us first define the notation of label sequence that we will use throughout.
Definition 7** (Label Sequence).**
An ordered sequence of transition labels is defined by using as an associative concatenation operator with as an identity. The length of a label sequence is denoted by . Given a label sequence , we denote by the sub-sequence of labels in that have as role name, and as identifier, i.e., labels of the form ( is a shorthand for denoting any term).
Definition 8** (Relation ).**
Given a protocol , the relation is defined as: .
Recall that a process can be “deconstructed” by the mapping toCstrSS into a set of constrained protocol strands, each representing a possible execution path. If a PA-State and a FW-State are related by , then an important observation is that there is a duality between individual processes in and strands in : if there is a process in the describing a role’s continuation in the future, there will be a corresponding strand in describing the part of the process that has already been executed, and vice versa. Another observation is that, since the intruder’s knowledge is extracted from the communication history, following the definition of , the states and have the same communication history, therefore they have the same intruder’s knowledge. We formalize these observations in Lemmas 1 and 2. These lemmas then lead us to the main result that is a bisimulation relation.
We now define the relation , which relates a possibly partially executed labeled process and a constrained strand. This relation defines the duality relation between a labeled process and a constrained strands. If a labeled process is related to a constrained strand by the relation , then: (i) and denote the behavior of the same role with the same identity in the same protocol, and (ii) for any strand , denotes a possible execution path of iff followed by forms a valid possible execution path of the protocol.
Definition 9** (Relation ).**
Given a protocol , and a possibly partially executed labeled process of , a possibly partially executed constrained strand of , then iff
**
where is a shorthand for a term denoting a set of strands. for fresh variables in .
Example 7.1**.**
*Following Examples 4.2 and 5.1, we show a process and a strand that are related by the relation . (resp. ) is the labeled process (resp. constrained strand) of the Server role after making the first explicit nondeterministic choice.
[TABLE]
where is a ground substitution to the pattern variables , , and .
We then lift the duality relation between individual processes and strands to a duality relation between PA-State and FW-State.
Definition 10** (Relation ).**
Let be a PA-State and be a FW-State, if , then:
- (i)
For each labeled process , , there exists a strand , , such that .
- (ii)
For each strand , , there exists a labeled process , , such that .
The lemma below states that the relation induces the duality relation .
Lemma 1**.**
Let be a PA-State and be a FW-State, if , i.e., exists a label sequence such that , and , then .
Proof.
We first prove property (i). If , since both the strand set and the process configuration are empty, the statement is vacuously true.
Now suppose that . Then, without loss of generality, assume there exists a labeled process in , with . Then there is at least one label in of the form ( is a short hand for any content), therefore, there is a strand in of the form .
We then show that the above-mentioned and are related by , i.e., . Since the state is reachable from the initial state by the label sequence , and , denotes exactly the sequence of messages in the unique sequence of labels . Moreover, .
Since the process state is reachable from the initial state by label sequence , there exists a unique process in the specification , and represents all possible behaviors of after the sequence of transitions . Therefore,
[TABLE]
By the correspondence between protocol specifications defined in definition 3 , . Also note that is the only process in that has as its role name, therefore, . Therefore,
[TABLE]
Therefore, . The proof for property (ii) is similar to the one for property (i). ∎
Lemma 2 below formalizes the observation that the equivalence of label sequence implies the same intruder knowledge.
Lemma 2**.**
Given a PA-State and a FW-State such that , i.e., there exists a label sequence such that and , then the contents of intruder knowledge in and in are syntactically equal.
Proof.
In both semantics the only transition rules that add new elements to the intruder’s knowledge are the ones whose label is of the form . Therefore, given the two states and as described above, their intruder’s knowledge can be computed from the sequence of labeled transitions as . ∎
Based on the lemmas above, we can now show that the relation is a bisimulation. Since the proof of Theorem 1 requires a somewhat lengthy case analysis, it has been moved to Appendix A.
Theorem 1** (Bisimulation).**
* is a bisimulation.*
8 Constrained Backwards Strand Semantics
In this section we extend Maude-NPA’s symbolic backwards semantics with rules for constrained messages of the form described in Section 5.1, so that it can analyze protocols exhibiting explicit choices. We refer to this extended backwards semantics as constrained backwards strand semantics. We then show that the constrained backwards strand semantics is sound and complete with respect to the constrained forwards strand semantics presented in Section 6, and the process algebra semantics presented in Section 4. This result allows us to use Maude-NPA for analyzing protocols exhibiting choice, including both implicit and explicit choices, and in particular any protocol specified using the protocol process algebra.
The strand space model used in the constrained backwards strand semantics is the same as the one already used in Maude-NPA [Escobar et al(2009)], except for the following differences:
- •
Maude-NPA explores constrained states as defined in [Escobar et al(2015)], that is, states that have an associated constraint store. More specifically, a constrained state is a pair consisting of a state expression and a constraint, i.e., a set understood as a conjunction of disequality constraints.
- •
Strands are now of the form , where each can be of one of these forms: (i) if it is a sent message, (ii) if it is a received message, or (iii) if it is a constrained message.
State changes are described by a set of rewrite rules, so that the rewrite theory characterizes the behavior of protocol modulo the equations for backwards execution. The set of rules is obtained as follows. First, we adapt the set of rules in Section 3 to constrained states, which is an embedding of rules in . Their forwards version is shown below:
[TABLE]
[TABLE]
where and are variables denoting a list of strand messages, IK is a variable for a set of intruder facts ( or ), SS is a variable denoting a set of strands, and , denote a list of strand messages.
Then, we define new transition rules for constrained messages. That is, we add the reversed version of the following rules:
[TABLE]
[TABLE]
[TABLE]
Rule (B?) processes a constraint message denoting an explicit non-deterministic choice with constant “”. The constraint store is not changed and no satisfiability check is required.
Rules (Bif=) and (Bif) deal with constrained messages associated to explicit deterministic choices. Since the only constraints we allow in explicit deterministic choices are equalities and disequalities, rule (Bif=) is for the case when the constraint is an equality, rule (Bif) is for the case when the constraint is a disequality. The equality constraint is solved by -unification. The constraint in a constrained state is therefore a disequality constraint, i.e., . The semantics of such a constrained state, written is the set of all ground substitution instances of the form:
[TABLE]
The disequality constraints are then solved the same way as in [Escobar et al(2015)].
Definition 11**.**
Let be a protocol with signature and equational theory . We define the constrained backwards rewrite theory characterizing to be where is same as explained in Section 3. is the result of reversing the rewrite rules .
9 Soundness and Completeness of Constrained Backwards Strand Semantics
The soundness and completeness proofs generalize the proofs in [Escobar et al(2014)]. Recall that the state in the constrained states of constrained backwards strand semantics is a symbolic strand state, i.e., a state with variables. A state in the forwards strand semantics is a ground strand state, i.e., a state without variables. The lifting relation defines the instantiation relation between symbolic and ground states.
We define a symbolic state and a ground state as follows.
Definition 12** (Symbolic Strand State).**
Given a protocol , a symbolic strand state of is a term of the form:
[TABLE]
where for each , there exists a strand and a substitution such that , …, , is a variable denoting a (possibly empty) set of strands, and is a variable denoting a (possibly empty) set of intruder’s knowledge facts.
Definition 13** (Ground Strand State).**
Given a protocol , a ground strand state of is a term without variables of the form:
[TABLE]
where for each , there exists a strand and a substitution such that , …, .
The lifting relation in [Escobar et al(2014)] is extended with constraints and constrained messages. Note that the in the definition below can be sent messages, received messages, or constrained messages.
Definition 14** (Lifting Relation).**
Given a protocol , a constrained symbolic strand state and a ground strand state , we say that lifts to , or that instantiates to with a ground substitution , written iff
- •
for each strand in , there exists a strand in such that , .
- •
for each positive intruder fact in , there exists a positive intruder fact in such that , and
- •
for each negative intruder fact in , there is no positive intruder fact in such that .
- •
.
In the following we show the soundness and completeness of transitions in constrained backwards strand semantics w.r.t. the constrained forwards strand semantics by proving two lemmas stating the completeness and soundness of one-step transition in the constrained backwards strand semantics w.r.t. the constrained forwards strand semantics. The soundness and completeness result directly follows these two lemmas. In the proofs we consider only transition rules added in both semantics to deal with explicit choices, that is, rules in the constrained forwards strand semantics and rules in the constrained backwards strand semantics. The proof of the soundness and completeness of one-step transitions performed in the constrained backwards strand semantics using rules w.r.t to one-step transitions performed in the constrained forwards strand semantics using rules is the same as in [Escobar et al(2014)], since in these transitions no constraint is involved. Note that although in [Escobar et al(2014)], Choice Variables were not defined explicitly, the proof extends to strands with choice variables naturally, since the lifting relation between a ground state and a symbolic state does not need to be changed to cover choice variables. Since the strand labels are irrelevant for the result of this section, we will omit the strand labels to simplify the notation from now on. Also, we include the fresh substitution in the substitutions and do not separate the fresh substitutions explicitly.
Extending the proofs in [Escobar et al(2014)], we first prove how the lifting of a ground state to a symbolic state induces a lifting of a forwards rewriting step in the forwards semantics to a backwards narrowing step in the backwards semantics, i.e., the completeness of one-step transition. The lemma below extends the lifting lemma in [Escobar et al(2014)] to strands with constrained messages.
Lemma 3** (Lifting Lemma).**
Given a protocol , two ground strand states and , a constrained symbolic strand state and a substitution s.t. and , then there exists a constrained symbolic strand state and a substitution s.t. and either or .
The Lifting Lemma is illustrated by Figure 1.
Proof.
As has been explained before, we only need to consider the new rules: (Fif), (F?), (F?&). The proof in [Escobar et al(2014)] is structured by cases, some of which having specific requirements on intruder knowledge, or involve changes made to the intruder knowledge. Since all the new rules we are considering do not have specific requirements on the intruder knowledge, and do not change the intruder knowledge either, the cases that we need to consider are the following (cases and in the proof in [Escobar et al(2014)]), which involve the appearance or non-appearance of certain strand(s):
- e:
There is a strand in , , , and a substitution such that is a strand in and is a strand in .
- f:
There is a strand in , , , and a substitution such that is a strand in but is not a strand in .
Now we consider for the forward rewrite rule application in the step .
- •
Given ground states and s.t. using a rule in set (Fif), then there exists a ground substitution , variables SS’ and IK’, and strand in , such that , and and . Since there exists a substitution s. t. , we consider the following two cases:
- –
Case e) The strand appears in . More specifically, is a strand in s.t. . If the constraint is an equality constraint, since , and by the lifting relation, , rule (Bif=) can be applied for the backwards narrowing , and such that . If the constraint is a disequality constraint, since , and by the lifting relation, , we have . Therefore, rule (Bif) can be applied for the backwards narrowing, and .
- –
Case f) The strand does not appear in . Then makes as a valid symbolic strand state of , i.e., and .
- •
Given ground strand states and s.t. using a rule in set (F?), then we consider the following two applicable cases:
- –
Case e) The strand appears in and thus we can perform a backwards narrowing step from with rule (B?), i.e., , and .
- –
Case f) The strand does not appear in . Then makes as a valid constraint symbolic state of , i.e., and .
- •
Given states and s.t. using a rule in set (F?&), the proof is similar with using a rule in the set (F?).
∎
Theorem 2 below then follows straightforwardly.
Theorem 2** (Completeness).**
Given a protocol , two ground strand states , a constrained symbolic strand state and a substitution s.t. (i) is an initial state, (ii) , and (iii) . Then there exists a constrained symbolic initial strand state , two substitutions and , and , s.t. , and .
The Soundness Theorem from [Escobar et al(2014)] can also be extended to constrained backwards and forwards strand semantics. We first show that Lemma 2 in [Escobar et al(2014)], which states the soundness of one-step transition, still holds after extending to constrained states. The Soundness Theorem then follows straightforwardly.
Lemma 4**.**
Given a protocol , two constrained symbolic states and , a ground strand state and a ground substitution , if and , then there exists a ground strand state and a ground substitution such that , and .
Lemma 4 is illustrated by the Figure 2.
Proof.
We only need to consider the new rules: rule (Bif=), (Bif) and (B?).
-
If using rule (B?), then there are associated rules in the sets (F?) and (F?&).
-
If using rule (Bif=), there is a strand in , in s.t. , and , where is a strand in . Since , there is a ground strand in s, and . Therefore, and . By rule (Fif), , and .
If using rule (Bif), there is a strand in , in s.t. and , where is a strand in . Since , there is a ground strand in s, and . Therefore, . By rule (Fif), , and . ∎
The Soundness Theorem below shows that the backwards symbolic reachability analysis is sound with respect to the forwards rewriting-based strand semantics.
Theorem 3** (Soundness).**
Given a protocol , two constrained symbolic strand states , an initial ground strand state and a substitution s.t. (i) is a symbolic initial state, and (ii) , and (iii) . Then there exists a ground strand state and a substitution , s.t. (i) , and (ii) .
The soundness and completeness results in Theorems 3 and 2 together with the bisimulation proved in Theorem 1 show that the backwards symbolic reachability analysis is sound and complete with respect to the process algebra semantics.
Theorem 4** (Soundness).**
Given a protocol , two constrained symbolic strand states , the initial FW-State , a substitution , and the initial PA-State s.t. (i) is a symbolic initial strand state, and (ii) , and (iii) . Then there exists a FW-State such that , and therefore, there is a PA-State such that .
Theorem 5** (Completeness).**
Given a protocol , a PA-State , a FW-State , a constrained symbolic strand state s.t. (i) , (ii) . Then there is a backwards symbolic execution s.t. is a symbolic initial strand state as defined in Section 3, and .
10 Protocol Experiments
In this section we describe some experiments444Available at http://personales.upv.es/sanesro/Maude-NPA-choice/choice.html that we have performed on protocols with choice. We have fully integrated the process algebra syntax, and its transformation into strands, and have developed new methods to specify attack states using the process notation in the recent release of Maude-NPA 3.0 (see [Escobar et al(2017)]).
10.1 Integration of the Protocol Process Algebra in Maude-NPA
We have fully implemented the process algebra notation in Maude-NPA. Strands represent each role behavior as a linear sequence of message outputs and inputs but processes represent each role behavior as a possibly non-linear sequence of message outputs and inputs. The honest principal specification is specified in the process algebra syntax. In order for Maude-NPA to accept process specifications, we have replaced the section STRANDS-PROTOCOL from the protocol template by a new section PROCESSES-PROTOCOL; see [Escobar et al(2017)] for details. The intruder capabilities as well as the states generated by the tool still use the strand syntax.
Attack patterns may be specified using the process algebra syntax, under the label ATTACK-PROCESS, or strand syntax, under the label ATTACK-STATE. We describe how they are specified in the process algebra syntax below. An attack pattern describes a state consisting of zero or more processes that must have executed, and zero or more terms in the intruder knowledge. It may also contain never patterns, that is, descriptions of processes that must not be executed at the time the state is reached. Never patterns can be used to reason about authentication properties, e.g., can Alice execute an instance of the protocol, apparently with Bob, without Bob executing an instance of the protocol with Alice.
Note that processes in an attack pattern cannot contain explicit nondeterminism (?) or explicit deterministic choice (if), since one and only one behavior is provided in an attack pattern. This is achieve by requiring that any constraint appearing in an attack pattern must be strongly irreducible, that is, it must not only be irreducible, but for any irreducible substitution to the variables of , must be irreducible as well.
That is, imagine a process i the form
[TABLE]
where each of the expressions and can evaluate to or depending on the substitutions made to them.
Then in the attack pattern one must specify one and only one of the following possibilities
[TABLE]
Finally, never patterns must satisfy a stronger condition: the entire never pattern must be strongly irreducible. This condition is inherited from the original Maude-NPA.
10.2 Choice of Encryption Type
This protocol allows either public key encryption or shared key encryption to be used by Alice to communicate with Bob. Alice initiates the conversation by sending out a message containing the chosen encryption mode, then Bob replies by sending an encrypted message containing his session key. The encryption mode is chosen nondeterministically by Alice. Therefore, it exhibits an explicit nondeterministic choice. Below we show the protocol description: the first one reflects the case in which public key encryption (denoted by ) is chosen.
2. 2.
3. 3.
4. 4.
The second one reflects the case in which a shared key encryption (denoted by ) is chosen.
2. 2.
3. 3.
4. 4.
Note that and are names of principals, denotes the session key generated by , and denotes a nonce generated by .
There are different ways of encoding this protocol as two process expressions. We have chosen to treat the encryption mode as a choice variable which can be either public key encryption or shared key encryption, and then the receiver will perform an explicit deterministic choice depending on the value of this choice variable. The process specification is as follows:
[TABLE]
We analyzed whether the intruder can learn the session key generated by Bob, when either the public key encryption or shared key encryption is chosen, assuming both principals are honest.
--- initiator accepts session key for shared key encryption and --- intruder learns it eq ATTACK-PROCESS(2) = -(a ; b ; mode) . (mode neq pubkey) . +(she(key(a, b), skey(b,r))) . -(she(key(a, b), skey(b,r) ; N)) . +(she(key(a, b), N)) || skey(b,r) inI || nil [nonexec] .
--- initiator accepts session key for public key encryption and --- intruder learns it eq ATTACK-PROCESS(3) = -(a ; b ; mode) . (mode eq pubkey) . +(pk(a, b ; skey(b, r))) . -(pk(b, a ; skey(b,r) ; N)) . +(pk(a, b ; N)) || skey(b,r) inI || nil [nonexec] .
For this property, Maude-NPA terminated without any attack being found for any of the two attack states.
10.3 Rock-Paper-Scissors
To evaluate our approach on protocols with explicit deterministic choices, we have used a simple protocol which simulates the famous Rock-Paper-Scissors game, in which Alice and Bob are the two players of the game. In this game, Alice and Bob commit to each other their hand shapes, which are later on revealed to each other after both players committed their hand shapes. The result of the game is then agreed upon between the two players according to the rule: rock beats scissors, scissors beats paper and paper beats rock. They finish by verifying with each other that they both reached the same conclusion. Thus, at the end of the protocol each party should know the outcome of the game and whether or not the other party agrees to the outcome. This protocol exhibits explicit deterministic choice, because the result of the game depends on the evaluation of the committed hand shapes according to the game’s rule. Note that this protocol also exhibits implicit nondeterministic choice, since the hand shape of the players are chosen by the players during the game.
The protocol proceeds as follows. First, both initiator and responder choose their hand shapes and send them to each other using a secure commitment scheme. Next, they both send each other the nonces that are necessary to open the commitments. Each of them then compares the two hand shapes and decides if the initiator wins, the responder wins, or there is a tie. The initiator then sends the responder the outcome. When the responder receives the initiator’s verdict, it compares it against its own. It responds with “finished” if it agrees with the initiator and “cheater” if it doesn’t. All messages are signed and encrypted, and the initiator’s and responder’s nonces are included in the messages concerning the outcome of the game. The actual messages sent and choices made are described in more detail below.
2. 2.
3. 3.
4. 4.
5. 5.
if\ (X_{A}\ beats\ X_{B})\
6. 6.
7. 7.
One interesting feature of the Rock-Scissors-Paper protocol, is that, in order to verify that the commitment has been opened successfully, i.e., that the nonce received is the nonce used to create the commitment, one must verify that the result of opening it is well-typed, i.e., that it is equal to “rock”, “scissors”, or “paper”. This can be done via the evaluation of predicates. First, we create a sort Item and declare the constants “rock”, “scissors”, and “paper” to be of sort Item. Then we create a variable of sort Item. We then define a predicate such that evaluates to true. Since only terms of sort Item can be unified with , this predicate can be used to check whether or not a term is of sort Item. The process specification for the initiator and the responder is as follows.
[TABLE]
[TABLE]
We first tried to see whether the protocol can simulate the game successfully, so we asked for different scenarios in which the player Alice or Bob can win in a round of the game. Maude-NPA was able to generate the expected scenarios, and it did not generate any others. We then gave Maude-NPA a secrecy attack state, in which the intruder, playing the role of initiator against an honest responder, attempts to guess its nonce before the responder receives its commitment.
eq ATTACK-PROCESS(1) = -(pk(b,sig(i, ComXA:ComMsg))) . +(pk(i,sig(b, com(n(b, r:Fresh), XB:Item)))) . -(pk(b,sig(i, NA:Nonce))) || n(b, r:Fresh) inI || nil [nonexec] .
Finally we specified an authentication attack state in which we asked if a responder could complete a session with an honest initiator with the conclusion that the initiator had carried out its rule faithfully, without that actually having happened.
eq ATTACK-PROCESS(2) = -(pk(b, sig(a, ComXA))) . +(pk(a, sig(b, com(n(b,r), XB)))) . -(pk(b, sig(a, NA))) . +(pk(a, sig(b, n(b, r)))) . -(pk(b, sig(a, NA ; win))) . ((item? open(NA, ComXA)) eq ok) . (win eq win) . ((open(NA, ComXA) beats XB) eq ok) . +(pk(a, sig(b, NA ; n(b,r))) ; finished) || empty || never( +(pk(b, sig(a, ComXA))) . -(pk(a, sig(b, com(n(b,r), XB)))) . +(pk(b, sig(a, NA))) . -(pk(a, sig(b, n(b,r)))) . (ok eq ok) . (ok eq ok) . +(pk(b, sig(a, NA ; win))) . -(pk(a, sig(b, NA ; n(b,r))) ; finished) || empty ) [nonexec] .
For both of these attack states Maude-NPA finished its search without finding any attacks.
10.4 TLS
In Section 1.3 we introduced a simplified version of the handshake protocol in TLS 1.3 [Rescorla(2016)]. Even this simplified version produced a very large search space, because of the long list of messages and the concurrent interactions of a big amount of choices. We are however able to check the correctness of our specification by producing legal executions in Maude-NPA. Unlike TLS 1.3, we intentionally introduced a “downgrade attack” in our version in which the attacker can trick the principals into using a weaker crypto system. However, we have not yet been able to produce this attack because of the very deep and wide analysis tree (i.e., long reachability sequences with many branches) that is produced. We are currently investigating more efficient ways of managing list processing.
11 Related Work
As we mentioned in the introduction, there is a considerable amount of work on adding choice to the strand space model that involves embedding it into other formal systems, including event-based models for concurrency [Crazzolara and Winskel(2002)], Petri nets [Fröschle(2009)], or multi-set rewriting [Cervesato et al(2000)]. Crazzolara and Winskel model nondeterministic choice as a form of composition, where a conflict relation is defined between possible child strands so that the parent can compose with only one potential child. In [Fröschle(2009)] Fröschle uses a Petri net model to add branching to strand space bundles, which represent the concurrent execution of strand space roles. Note that we have taken the opposite approach of representing bundles as traces of non-branching strands, where a different trace is generated for each choice taken. Although this results in more bundles during forward execution, it makes little difference in backwards execution, and is more straightforward to implement in an already existing analysis tool.
We also note that deterministic choice has been included in the applied pi calculus for cryptographic protocols [Abadi and Fournet(2001)], another widely used formal model, based on Milner’s pi calculus [Milner(1999)]. The applied pi calculus includes the rule , where and are terms. This is similar to our syntax for deterministic choice. However our long-term plan is to add other types or predicates as well (e.g., ) ; indeed our approach extends to any type of predicate that can be evaluated on a ground state. Although the applied pi calculus in its original form does not include nondeterministic choice, both nondeterministic and probabilistic choice have been added in subsequent work [Goubault-Larrecq et al(2007)].
In addition, Olarte and Valencia show in [Olarte and Valencia(2008)] how a cryptographic protocol modeling language can be expressed in their universal timed concurrent constraint programming (utcc) model, a framework that extends the timed concurrent constraint programming model to express mobility. The language does not support choice, but utcc does. It seems that it would not be difficult to extend the language to incorporate the utcc choice mechanisms.
The Tamarin protocol analysis tool [Meier et al(2013)] includes deterministic branching, which was used extensively in the analysis of TLS 1.3 [Cremers et al(2016)]. In particular, it includes an optimization for roles of the form ; when backwards search is used, it is sometimes possible to capture such an execution in terms of just one strand until the conditional is encountered, thus reducing the state space. Our approach produces two strands, but since the process algebra semantics makes it easy to tell whether or not behaves “essentially” the same no matter if or is chosen, we believe that we have a pathway for including such a feature if desired.
12 Conclusions
We have provided an extension to the strand space model that allows for both deterministic and nondeterministic choice, together with an operational semantics for choice in strand spaces that not only provides a formal foundation for choice, but allows us to implement it directly in the Maude-NPA cryptographic protocol analysis tool. In particular, we have applied Maude-NPA to several protocols that rely on choice in order to validate our approach.
This work not only provides a choice extension to strand spaces, but extends them in other ways as well. First of all, it provides a process algebra for strand spaces. This potentially allows us to relate the strand space model to other formal systems (e.g., the applied pi calculus [Abadi(2001)]) giving a better understanding of how it compares with other formal models. In addition, the process algebra semantics provides a new specification language for Maude-NPA that we believe is more natural for users than the current strand-space language.
Another contribution of this work is that it provides a means for evaluating both equality and disequality predicates in the strand space model and in Maude-NPA. This allows us to implement features such as type checking in Maude-NPA, via predicates such as , where , that is, succeeds only if is of sort . This proved to be very helpful, for example, in our specification of the Rock-Scissors-Paper protocol as we described earlier. We believe the expressiveness of Maude-NPA can be further increased at little cost by extending the types of predicates that can be evaluated, e.g., by including predicates for subsumption and their negations. This is another subject for further investigation.
Appendix A Proof of Theorem 1
Proof.
Since and , therefore, . We then prove that: for all PA-State , and FW-State , if , and there exists a PA-State such that , then there exists a FW-State such that and .. If , by definition of the relation , there exists a label sequence s.t. and . Suppose there exists state such that . We prove by case analysis on label that there exists such that . The fact that then follows this by the definition of relation .
In the rest of this proof, and denote lists of messages, and denote messages, and denote processes, denotes a process configuration, denotes a set of constrained protocol strands, and denote the set of messages in the intruder’s knowledge.
if , according to the semantics, by applying rule (PA++), the state is of the form s.t. there exists a ground substitution binding the choice variables in and , the state and . Since , by Lemmas 1 and 2, is of the form s.t. . Let be a constrained strand in s.t. there exists a ground substituion s.t. . By the definition of relation and mapping toCstrSS, the first message of is , s.t. . Then since and , the rule (F++) can be applied for the rewrite , where .
If , by applying rule (PA&), there exists a process in and a ground substitution s.t. . Since , by the definition of toCstrSS, for all strands of role in , the first message is . Without loss of generality, let be , and be . Since the rule (PA&) can be applied, . By Lemma 2, . Moreover, by Lemma 1, , and since , by applying the rule (F++&) we get .
- 2)
: similar to case 1.
- 3)
: if , according to the semantics, by applying rule (PA-), is of the form s.t. for some ground substitution and . Since , by Lemmas 1 and 2, s.t. . Let s.t. there exists a ground substitution s.t. , then by definition of and toCstrSS, the first message of is s.t. . Since , rule (F-) can be applied to get the transition , where .
If , by applying rule (PA&), there exists a process in and a ground substitution s.t. . Without loss of generality, let be . Then . Since , by the definition of toCstrSS, for all strands of role in , the first message is . By Lemma 2, is in the intruder knowledge of . Moreover, by Lemma 1, , and since , by applying the rule (F-&) we get .
- 4)
: according to the transition rules, by applying rule (PAif1). Therefore is of the form , and . Since , by Lemma 1, s.t. . By the definition of the relation and the mapping toCstrSS, there exists and a ground substitution s.t. , and . Since , the rule (Fif) can be applied for the rewrite , where
- 5)
: similar to case 4.
- 6)
: if , by applying rule (PA?1). Therefore is of the form and . Since , by Lemma 1, s.t. . By the definition of and toCstrSS, there is a strand s.t. . Therefore, rule (F?) can be applied for the rewrite , and .
If , by applying rule (PA&). Therefore, there exists a process in . Since , by the definition of toCstrSS, there is a strand of role whose first message is in . Moreover, by Lemma 1, , and since , by applying the rule (F?&) we get .
- 7)
similar to case 6.
Similarly, we can prove that for all PA-State , and FW-State , if , and there exists a FW-State such that , then there exists a PA-State such that and ∎
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[Abadi(2001)] Abadi M (2001) Leslie lamport’s properties and actions. In: Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing, PODC 2001, p 15
- 2[Abadi and Fournet(2001)] Abadi M, Fournet C (2001) Mobile values, new names, and secure communication. In: Conference Record of POPL 2001: The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp 104–115
- 3[Cervesato et al(2000)] Cervesato I, Durgin NA, Mitchell JC, Lincoln P, Scedrov A (2000) Relating strands and multiset rewriting for security protocol analysis. In: Proceedings of the 13th IEEE Computer Security Foundations Workshop, CSFW ’00, pp 35–51
- 4[Crazzolara and Winskel(2002)] Crazzolara F, Winskel G (2002) Composing strand spaces. In: FST TCS 2002: Foundations of Software Technology and Theoretical Computer Science, pp 97–108
- 5[Cremers et al(2016)] Cremers C, Horvat M, Scott S, van der Merwe T (2016) Automated analysis and verification of TLS 1.3: 0-rtt, resumption and delayed authentication. In: IEEE Symposium on Security and Privacy, SP 2016, San Jose, CA, USA, May 22-26, 2016, IEEE Computer Society, pp 470–485, DOI 10.1109/SP.2016.35 , URL http://dx.doi.org/10.1109/SP.2016.35 · doi ↗
- 6[Escobar et al(2009)] Escobar S, Meadows C, Meseguer J (2009) Maude-NPA: Cryptographic protocol analysis modulo equational properties. In: Foundations of Security Analysis and Design V, FOSAD 2007/2008/2009 Tutorial Lectures, Springer, LNCS vol. 5705, pp 1–50
- 7[Escobar et al(2014)] Escobar S, Meadows C, Meseguer J, Santiago S (2014) A rewriting-based forwards semantics for Maude-NPA. In: Proceedings of the 2014 Symposium and Bootcamp on the Science of Security, Hot So S 2014, ACM
- 8[Escobar et al(2015)] Escobar S, Meadows C, Meseguer J, Santiago S (2015) Symbolic protocol analysis with disequality constraints modulo equational theories. In: Programming Languages with Applications to Biology and Security, pp 238–261
