Scalable Information-Flow Analysis of Secure Three-Party Affine Computations

Patrick Ah-Fat; Michael Huth

arXiv:1901.00798·cs.CR·January 4, 2019

TL;DR

This paper develops a scalable method to quantify information flow in secure three-party affine computations using min-entropy, enabling practical privacy analysis in large input scenarios.

Contribution

It derives a closed-form formula for min-entropy in three-party affine computations, scalable to large inputs, and provides bounds for non-uniform priors.

Findings

1. Explicit formula for min-entropy under uniform priors

2. Constant-time computation relative to input size

3. Logarithmic complexity in affine coefficients

Equations

$$A=\{\beta y+\gamma z\mid(y,z)\in\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket\}$$

$$A_{i}=\{\beta i+\gamma z\mid z\in\llbracket{0};{m}\rrbracket\}$$

$$A=\bigcup_{i=0}^{n}A_{i}$$

$$o\in A\iff\beta n+\gamma m-o\in A$$

$$|A\cap\llbracket{0};{\beta\gamma-1}\rrbracket|=\sum_{i=0}^{n}|A_{i}\cap\llbracket{0};{\beta\gamma-1}\rrbracket|$$

$$|A\cap\llbracket{\beta\gamma};{\beta n+\gamma m-\beta\gamma}\rrbracket|=\beta n+\gamma m-2\beta\gamma+1$$

$$B_{k}=\bigcup_{i=k\mod\gamma}A_{i}$$

$$\operatorname{H}(X_{\mathbb{T}}\mid\mathbf{x_{\mathbb{A}}},O)$$

$$p(o\mid\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}})=\sum_{\substack{\mathbf{x_{\mathbb{S}}}\\ f(\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}},\mathbf{x_{\mathbb{S}}})=o}}p(\mathbf{x_{\mathbb{S}}})$$

$$Y\in I_{Y},\quad Z\in I_{Z}$$

$$f(x,y,z)=\alpha+\beta y+\gamma z$$

$$o=f(y,z)=\alpha+\beta y+\gamma z$$

$$D_{O}=\{f(y,z)\mid(y,z)\in I_{Y}\times I_{Z}\}$$

$$\operatorname{H}(Y\mid O)=-\log\operatorname{V}(Y\mid O)$$

$$\operatorname{V}(Y\mid O)$$

$$\operatorname{V}(Y\mid O)=\sum_{o}p(o)\cdot\max_{y}p(y)=1\cdot\max_{y}p(y)$$

$$\operatorname{V}(Y\mid O)=\sum_{o}p(o)\cdot 1=1$$

$$\operatorname{V}(Y\mid O)$$

$$p(o\mid y)$$

$$\operatorname{V}(Y\mid O)$$

$$A=\{\beta y+\gamma z\mid(y,z)\in\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket\}$$

$$-x\leq\lceil-x\rceil<-x+1$$

$$x-1<-\lceil-x\rceil\leq x$$

$$-\lceil-x\rceil=\lfloor x\rfloor$$

$$\sum_{k=1}^{q-1}\left\lfloor\frac{kp}{q}\right\rfloor=\frac{(p-1)(q-1)}{2}$$

$$\sum_{k=1}^{q-1}\left\lfloor\frac{kp}{q}\right\rfloor$$

$$2\sum_{k=1}^{q-1}\left\lfloor\frac{kp}{q}\right\rfloor=(q-1)p+\sum_{k=1}^{q-1}\left(\left\lfloor\frac{kp}{q}\right\rfloor+\left\lfloor-\frac{kp}{q}\right\rfloor\right)$$

$$\sum_{k=1}^{q-1}\left(\left\lfloor\frac{kp}{q}\right\rfloor+\left\lfloor-\frac{kp}{q}\right\rfloor\right)$$

$$2\sum_{k=1}^{q-1}\left\lfloor\frac{kp}{q}\right\rfloor$$

$$\bigcup_{j\in\llbracket{0};{q-1}\rrbracket}[pj]_{q}=\mathbb{Z}$$


Full text

Abstract

Elaborate protocols in Secure Multi-party Computation enable several participants to compute a public function of their own private inputs while ensuring that no undesired information leaks about the private inputs, and without resorting to any trusted third party. However, the public output of the computation inevitably leaks some information about the private inputs. Recent works have introduced a framework and proposed some techniques for quantifying such information flow. Yet, owing to their complexity, those methods do not scale to practical situations that may involve large input spaces. The main contribution of the work reported here is to formally investigate the information flow captured by the min-entropy in the particular case of secure three-party computations of affine functions in order to make its quantification scalable to realistic scenarios. To this end, we mathematically derive an explicit formula for this entropy under uniform prior beliefs about the inputs. We show that this closed-form expression can be computed in time constant in the inputs sizes and logarithmic in the coefficients of the affine function. Finally, we formulate some theoretical bounds for this privacy leak in the presence of non-uniform prior beliefs.

**Scalable Information-Flow Analysis of Secure Three-Party Affine Computations**

Patrick Ah-Fat and Michael Huth

Department of Computing, Imperial College London

London, SW7 2AZ, United Kingdom

{patrick.ah-fat14, m.huth}@imperial.ac.uk

Keywords: Computational Privacy, Min-entropy, Combinatorics.

1 Introduction

Secure Multi-party Computation (SMC) is a domain of cryptography that aims at enabling several parties to compute a public function of their own private inputs, while keeping the inputs secret and without resorting to any trusted third party [1, 2, 3, 4, 5, 6]. Multi-party secure protocols typically require the parties to engage in a series of rounds of communication in order to exchange their information so as to be able to collaboratively compute the intended output. Such protocols provide the guarantee that none of the parties will be able to infer any information about the other parties’ input, other than the information conveyed by the public output itself.

Paradoxically, as a function of the inputs, the public output inevitably leaks some information about those private inputs. This leakage is considered as an inherent consequence of the primary objective of SMC: it is commonly qualified as the “acceptable leakage” and its study is thus largely ignored in the SMC literature [7, 8, 9, 10]. Recent works have been undertaken with the aim of quantifying such information flows [11, 12, 13]. By adapting techniques from Quantitative Information Flow (QIF) and applying concepts from Information Theory (IT) to the context of SMC, they introduce an attack model and a general notion of entropy that enable us not only to reason about the acceptable leakage in SMC, but also to construct bespoke privacy-enhancing mechanisms aimed at protecting the inputs’ secrecy. In this attack model, the entropy of a targeted input reflects the amount of information that is gained by an attacker once the output is revealed.

Although these techniques offer a rich framework designed for analysing information flows in SMC, their computation is essentially combinatorial, and their application in practice is thus impeded by the cost of evaluating these combinatorics. Indeed, in the general case, the time complexity of computing such entropy measures is quadratic in the product of the input sizes, making them inadequate for examining real-world applications of SMC that may involve large input spaces. We believe, however, that developing techniques that can perform such analyses efficiently would benefit and complement the extensive research [14, 15, 16, 17] that is being conducted on efficient SMC protocols: potential participants of an SMC would not only have efficient cryptographic protocols at their disposal, but they could also effectively run privacy analyses in order to precisely estimate the risk that they would run by entering the computation.

In this paper, our objective is to focus our efforts on a particular class of functions for which we further investigate those analyses in order to make them applicable to arbitrarily large input spaces. More precisely, we focus on secure three-party computations, and we study the class of functions that are affine in the target’s and the spectator’s inputs, while the amount of information that an attacker gains on a targeted input is measured by means of conditional min-entropy. In this setting, the main contribution of this work is to reduce the combinatorial essence of this information measure to a closed-form expression that has time complexity constant in the input sizes, and logarithmic in the coefficients of the affine function. More specifically, we show that under uniform prior beliefs, the conditional min-entropy can be reduced to a simple function of the size of the output domain, for which we then derive an explicit expression. Finally, as this reduction is valid under uniform prior beliefs on the inputs, we also exhibit some explicit bounds for this information measure in the presence of non-uniform prior beliefs.

Outline of Paper. We present an intuitive overview of our main contributions and of the key technical aspects of our work in Section 2. We discuss some related works in Section 3. The mathematical formalisation required for analysing information flows in secure three-party affine computations is introduced in Section 4. In Section 5, we show that the information gained by an attacker under uniform prior beliefs is entirely determined by the size of the output domain, for which we derive a closed-form expression. Explicit bounds for the information flow under non-uniform prior beliefs are presented in Section 6. We illustrate those theoretical results in Section 7 and Section 8 concludes the paper.

Notations. Let $D$ be a discrete set. We denote by $|D|$ the cardinality of set $D$. Let $\Omega(D)$ be the set of all probability distributions whose support is contained in $D$. Throughout, we present distributions as Python dictionaries with domain values as keys and associated probabilities as values. For example, $\{4\colon\frac{1}{2},8\colon\frac{1}{2}\}$ represents the uniform distribution over $\{4,8\}$. For any integers $a$ and $b$, we will write $\llbracket{a};{b}\rrbracket$ for the set of consecutive integers ranging from $a$ to $b$, namely $\{a,a+1,\cdots,b\}$. The greatest common divisor of $a$ and $b$ will be denoted as $\gcd(a,b)$. The fact that two integers $i$ and $j$ have the same residue modulo another integer $k$ will be denoted as $i=j\mod k$. Given random variable $X$ and value $x$, the event “$X=x$” will be abbreviated by “$x$” when there is no ambiguity, and its probability will be denoted by $p(x)$. Similarly, we will abbreviate $\sum_{x\in D}$ by $\sum_{x}$ when the domain $D$ is obvious from context. Finally, the logarithm in base $2$ will be denoted as $\log$.

2 Methodology

In this section, we present an overview of our main contributions and we highlight the key technical components of our work intuitively. Although the aim of this section is to illustrate and summarise our results, the detailed and rigorous approach is developed in Sections 4, 5 and 6. This work is motivated by Secure Multi-party Computation, which requires all the manipulated values to belong to finite spaces. Thus, we will focus on integer values ranged in finite intervals.

The main contribution of this work is to introduce an efficient and scalable way of quantifying the acceptable leakage in three-party affine computations. To this effect, we consider the secure computation of a public function $f$ performed on three private inputs $x$, $y$ and $z$. We wish to quantify the amount of information that an attacker, who controls or is able to eavesdrop on the value of $x$, would gain on input $y$ once the output of $f$ is revealed. We focus on the functions $f$ whose output $o$ can, once the input $x$ controlled by the attacker is fixed, be expressed as a function of $y$ and $z$, in its simplest form, as $o=f(y,z)=\beta y+\gamma z$ where $\beta$ and $\gamma$ are constant integers. We quantify the information gained by the attacker from this computation by $\operatorname{H}(Y\mid O)$, the min-entropy of input $Y$ given output $O$, both considered as random variables. When inputs $Y$ and $Z$ are considered as random variables uniformly distributed on some intervals, we show in Section 5.1 that this entropy can be reduced to an explicit formula involving $N_{O}$, the number of possible values that output $O$ can take. The main difficulty now resides in deriving a closed-form formula for $N_{O}$, the focus of Section 5.2, and for which we sketch an intuitive explanation now. Given that $Y$ and $Z$ are uniformly distributed on respective intervals $I_{Y}$ and $I_{Z}$, we show that those intervals can be assumed to be of the form $I_{Y}=\llbracket{0};{n}\rrbracket$ and $I_{Z}=\llbracket{0};{m}\rrbracket$ respectively, where $n$ and $m$ are positive integers. We also show that a simple transformation enables us to assume that constants $\beta$ and $\gamma$ are positive and coprime. The number of outputs $N_{O}$ can now be expressed as the cardinality $N_{O}=|A|$ where we define set $A$ as:

$$A=\{\beta y+\gamma z\mid(y,z)\in\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket\}$$

For the sake of our later explanation, we will define for all $i$ in $\llbracket{0};{n}\rrbracket$ , the set $A_{i}$ as:

$$A_{i}=\{\beta i+\gamma z\mid z\in\llbracket{0};{m}\rrbracket\}$$

so that set $A$ can now be expressed as:

$$A=\bigcup_{i=0}^{n}A_{i}$$

In order to illustrate our method and theorems for computing the cardinal $|A|$ , we will construct some graphical representations of set $A$ under different configurations in the following examples.

Example 1.

Let $\beta=3$ , $\gamma=4$ , $n=7$ and $m=6$ . The graphical representation of the corresponding set $A$ is shown in Figure 1. The $y$ -axis corresponds to the values of $y$ , which is ranged in $\llbracket{0};{7}\rrbracket$ , while the $x$ -axis corresponds to the possible values that the output can take. For each row, indexed by $i\in\llbracket{0};{7}\rrbracket$ , we mark by a cross the possible values that the output $o=\beta i+\gamma z$ can take. In other words, each row $i$ will represent the elements contained in set $A_{i}$ . As $A$ is defined as the union of all the $A_{i}$ , the set $A$ corresponds to the projection of all the crosses onto the $x$ -axis. In other words, value $o$ belongs to set $A$ if and only if there is at least one cross in column $o$ .
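The construction of Example 1 can be reproduced by direct enumeration. The following is a minimal sketch only, not the closed-form method developed in this paper; the helper names `row` and `output_set` are hypothetical and introduced purely for illustration.

```python
def row(beta, gamma, i, m):
    """Return A_i = {beta*i + gamma*z | z in [[0; m]]}, i.e. row i of the figure."""
    return {beta * i + gamma * z for z in range(m + 1)}


def output_set(beta, gamma, n, m):
    """Return A as the union of the rows A_0, ..., A_n."""
    result = set()
    for i in range(n + 1):
        result |= row(beta, gamma, i, m)
    return result


# Parameters of Example 1.
beta, gamma, n, m = 3, 4, 7, 6
A = output_set(beta, gamma, n, m)

print(sorted(row(beta, gamma, 1, m)))  # the crosses of row 1
print(len(A), "distinct outputs for", (n + 1) * (m + 1), "input pairs")
```

For these parameters the enumeration finds 40 distinct outputs among the 56 input pairs, which shows that several columns must contain more than one cross.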

In order to tally the number of feasible outputs, we will highlight some intuitive results, which we will formalise and prove in Section 5.2. We first notice that the number of outputs is upper bounded by $(n+1)(m+1)$ , and may be strictly lower than this bound since one column may contain several crosses. We will refer to such columns containing more than $1$ cross as intersections. We make the following observations:

1. The “first” intersection occurs at column $\beta\gamma=12$ and is highlighted in red in Figure 1. In other words, the lowest output $o$ whose column contains at least two crosses is $o=\beta\gamma$.

2. We indicate in blue the first cross of the last row, indexed at column $\beta n=21$, and in orange the last cross of the first row, indexed at column $\gamma m=24$. We notice that the first intersection can occur if and only if both of those crosses, highlighted in blue and orange, do not stand before the column highlighted in red. In other words, the first intersection can occur if and only if $\beta n\geq\beta\gamma$ and $\gamma m\geq\beta\gamma$, i.e. if $n\geq\gamma$ and $m\geq\beta$, which we claim in Lemma 3. If one of those conditions is not satisfied, then there is no intersection, and the number of outputs is $(n+1)(m+1)$, which we claim in Corollary 1 and illustrate in the next example.

3. Set $A$ is “symmetrical”, i.e. for all outputs $o$ in $\llbracket{0};{\beta n+\gamma m}\rrbracket$, we have:

$$o\in A\iff\beta n+\gamma m-o\in A$$

where $\beta n+\gamma m$ is the largest output obtained for maximal values of $y$ and $z$. This is proved in Lemma 5.

4. Thus, the last intersection occurs at column $\beta n+\gamma m-\beta\gamma$ and is highlighted in purple. Together with observation 1, this constitutes the content of Lemma 4. Moreover, by symmetry, there is the same number of outputs contained in $\llbracket{0};{\beta\gamma-1}\rrbracket$ and $\llbracket{\beta n+\gamma m-\beta\gamma+1};{\beta n+\gamma m}\rrbracket$, as claimed in Corollary 2.

5. As there is no intersection before the red column $\beta\gamma$, the number of outputs contained in $\llbracket{0};{\beta\gamma-1}\rrbracket$ can be obtained by summing the number of elements of all $A_{i}$ contained in this interval. More formally, we have:

$$|A\cap\llbracket{0};{\beta\gamma-1}\rrbracket|=\sum_{i=0}^{n}|A_{i}\cap\llbracket{0};{\beta\gamma-1}\rrbracket|$$

This corresponds to the total number of crosses that stand before the column highlighted in red. We develop its computation in Theorem 1.

6. Finally, we observe that all the columns lying between the red and purple ones, i.e. ranging in the interval $\llbracket{\beta\gamma};{\beta n+\gamma m-\beta\gamma}\rrbracket$, contain at least one cross. This result is formalised in Theorem 2 and implies that:

$$|A\cap\llbracket{\beta\gamma};{\beta n+\gamma m-\beta\gamma}\rrbracket|=\beta n+\gamma m-2\beta\gamma+1$$

In order to prove that all such columns contain at least one cross, we reason as follows.

(a) Two sets $A_{i}$ whose indices are separated by a multiple of $\gamma$ will only contain outputs that have the same residue modulo $\gamma$. More precisely, if $i$ and $j$ are both congruent to some $k$ modulo $\gamma$, then the elements of $A_{i}$ and $A_{j}$ will be congruent to $\beta k$ modulo $\gamma$. We illustrate this fact in Figure 1, where we color in green the elements of sets $A_{1}$ and $A_{5}$. We can notice that all those elements are congruent to $1\beta$ modulo $\gamma$.

(b) For all $k$ in $\llbracket{0};{\gamma-1}\rrbracket$, let us define $B_{k}$ as the union of all the $A_{i}$ whose index $i$ is congruent to $k$ modulo $\gamma$:

$$B_{k}=\bigcup_{i=k\mod\gamma}A_{i}$$

For example, $B_{1}$ can be represented as the projection of all the green crosses on the $x$-axis. Then, we can see that each $B_{k}$ includes all the outputs that are ranged between the red and purple columns and that are congruent to $\beta k$ modulo $\gamma$. This observation is formalised in Lemmas 6 and 7. We can indeed see in the figure that $\{15,19,23,27,31\}\subseteq A_{1}\cup A_{5}=B_{1}$. We notice that this may not be the case outside of the domain delimited by the red and purple columns as for example $47\notin A_{1}\cup A_{5}$.

(c) Finally, as formally explained in Theorem 2, we claim that for every output $o$ ranged between the red and purple columns, there exists a $k$ in $\llbracket{0};{\gamma-1}\rrbracket$ such that $o\in B_{k}$. Indeed, if we denote by $r$ the residue of $o$ modulo $\gamma$, it suffices to choose $k=\beta^{-1}r$ to ensure that $\beta k=r\mod\gamma$, which then implies $o\in B_{k}$. Here $\beta^{-1}$ refers to the inverse of $\beta$ modulo $\gamma$, which exists since $\beta$ and $\gamma$ are coprime. For example, output $o=17$ has residue $r=1$ modulo $\gamma=4$. In this case, we can choose $k=\beta^{-1}r=3^{-1}\cdot 1=3$ and we can verify that $o\in B_{3}$. This concludes our intuition and ensures that every column ranged between the red and purple ones will contain at least one cross.

Finally, Theorem 4 and Corollary 4 derive some lower and upper bounds for $\operatorname{H}(Y\mid O)$ when prior beliefs on the inputs are not uniform.
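The observations made on Example 1 can be checked numerically. The sketch below is an illustration under the parameters of Example 1 only; it proves nothing in general, and all variable names are hypothetical. It assumes Python 3.8 or later, where the built-in `pow(beta, -1, gamma)` returns a modular inverse.

```python
beta, gamma, n, m = 3, 4, 7, 6

# rows[i] is the set A_i, and A is the union of all rows.
rows = [{beta * i + gamma * z for z in range(m + 1)} for i in range(n + 1)]
A = set().union(*rows)
top = beta * n + gamma * m  # largest output

# Observations 1 and 4: first and last columns holding at least two crosses.
crosses = {o: sum(o in r for r in rows) for o in A}
first_intersection = min(o for o, c in crosses.items() if c > 1)
last_intersection = max(o for o, c in crosses.items() if c > 1)

# Observation 3: o is an output if and only if top - o is an output.
symmetric = all((o in A) == ((top - o) in A) for o in range(top + 1))

# Observation 5: below beta*gamma, counting crosses row by row counts outputs.
crosses_below = sum(len({o for o in r if o < beta * gamma}) for r in rows)
outputs_below = len({o for o in A if o < beta * gamma})

# Observation 6: every column between the two intersections is occupied.
middle_full = all(o in A for o in range(beta * gamma, top - beta * gamma + 1))

# Steps (b) and (c): B_k gathers the rows whose index is congruent to k.
B = {k: set().union(*(rows[i] for i in range(k, n + 1, gamma)))
     for k in range(gamma)}
o = 17
k = (pow(beta, -1, gamma) * (o % gamma)) % gamma

print(first_intersection, last_intersection, symmetric, middle_full, k, o in B[k])
```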

Example 2.

Let $\beta=3$ , $\gamma=4$ , $n=3$ and $m=6$ . In the graphical representation of the corresponding set $A$ that is displayed in Figure 2, we can notice that the blue cell representing $\beta n$ appears before the red column $\beta\gamma$ , meaning that the condition $n\geq\gamma\wedge m\geq\beta$ is not satisfied. This implies that there is no intersection in this setting and that $|A|=(n+1)(m+1)=28$ .
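The condition used in Example 2 can be compared against a brute-force count. This is a hedged sketch that assumes $\beta$ and $\gamma$ are positive and coprime; the function names `count_outputs` and `has_intersection` are hypothetical.

```python
from itertools import product


def count_outputs(beta, gamma, n, m):
    """Brute-force |A|: enumerate every pair (y, z) and count distinct outputs."""
    return len({beta * y + gamma * z
                for y, z in product(range(n + 1), range(m + 1))})


def has_intersection(beta, gamma, n, m):
    """Condition stated for Lemma 3 (beta and gamma positive and coprime)."""
    return n >= gamma and m >= beta


# Example 2: the blue cell beta*n = 9 stands before the red column beta*gamma = 12.
print(has_intersection(3, 4, 3, 6), count_outputs(3, 4, 3, 6))
```

In this setting the condition fails and the enumeration returns $(n+1)(m+1)=28$ outputs, in agreement with the example.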

3 Related Works

In this section, we discuss related work that constitutes the foundations and the motivations of our present work.

3.1 Secure Multi-party Computation

Secure Multi-party Computation [1, 2, 3, 4, 5, 6] is a domain of cryptography that provides advanced protocols which enable several participants to compute a public function of their own private inputs without having to rely on any other trusted third party or any external authority. Those protocols enable the participants to compute a function in a decentralised manner, while ensuring that no information leaks about the private inputs, other than that which can be inferred from the public output. The so-called “acceptable leakage”, which is further studied in this paper, is the information that an attacker can infer about the other inputs given the knowledge of the public output. Secure multi-party computation is not the only domain that is subject to an acceptable leakage. In particular, the results of our work are also applicable to other fields or scenarios that aim at protecting the inputs’ privacy and that involve the opening of a public output, such as outsourced computation, where a trusted third party is privately sent all the inputs and returns the public output as the only piece of information, or trusted computing, where the parties input their secret data into hardware security modules, which then ensure that no unintended information will be accessible to the other parties.

3.2 Quantitative Information Flow

The purpose of Quantitative Information Flow (QIF) [18, 19] is to provide frameworks and techniques based on information theory and probability theory for measuring the amount of information that leaks from a secret. Different mathematical concepts have emerged in order to convey varied and precise information about a secret: Shannon entropy [20] reflects the minimum number of binary questions required to recover a secret on average, while the min-entropy is an indicator of the probability to guess a secret in one try [21, 22, 18]. Richer measures such as Rényi entropy [23] and the $g$ -entropy [24] have been introduced in order to quantify some specific properties of a secret, and more general entropies have been proposed in order to unify those different concepts [12, 25]. In this work, we will measure the information gained by an attacker by means of min-entropy, which is used extensively in cryptography in order to quantify the vulnerability of a secret.

3.3 Differential Privacy

Differential Privacy (DP) [26, 27] formalises privacy concerns and introduces techniques that provide users of a database with the assurance that their personal details will not have a significant impact on the output of the queries performed on the database. More precisely, it proposes mechanisms which ensure that the outcomes of the queries performed on two databases differing in at most one element will be statistically indistinguishable. Moreover, minimising the distortion of the outcome of the queries while ensuring privacy is an important trade-off that governs DP. Although DP is particularly well suited to guaranteeing privacy in statistical computations involving a large number of parties, its effectiveness diminishes when a small number of parties are involved in the computation. For example, in a two-party computation, a DP mechanism would ensure that the output would not be significantly affected when half of the data is changed. In this case, the utility of the computed function is thus drastically hindered by the low number of parties. Unlike DP and other works that have been conducted on trading off privacy and utility in SMC, this work does not intend to enhance the inputs’ privacy. Instead, our objective is to propose an efficient method for quantifying the privacy risks that a certain kind of computation presents.

3.4 Information Flow in Secure Multi-party Computation

Recent works [11, 12, 13] have adapted techniques stemming from QIF to the setting of SMC in order to propose a model that allows us to reason about the acceptable leakage. In this model, the set of parties willing to compute a public function $f$ is partitioned into three sets: a set of attackers, a set of targets and a set of spectators, holding the respective input vectors $\mathbf{x_{\mathbb{A}}}$, $\mathbf{x_{\mathbb{T}}}$ and $\mathbf{x_{\mathbb{S}}}$. The attackers are those parties willing to share the value of their inputs and to take advantage of the public output of the computation $f(\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}},\mathbf{x_{\mathbb{S}}})$ in order to learn as much information as possible on their targets’ inputs, while the remaining parties are called spectators. From the point of view of the attackers, the inputs $\mathbf{x_{\mathbb{T}}}$ and $\mathbf{x_{\mathbb{S}}}$ are unknown values and are thus modelled as random variables $X_{\mathbb{T}}$ and $X_{\mathbb{S}}$, further deemed to be independent since targets and spectators are supposed to be honest parties who provide their inputs without being influenced by any other information. The attackers’ prior beliefs on those inputs are represented by the prior distributions $\pi_{\mathbb{T}}$ and $\pi_{\mathbb{S}}$ of those random variables. The output of the function $f$ is then also considered as a random variable defined as $O=f(\mathbf{x_{\mathbb{A}}},X_{\mathbb{T}},X_{\mathbb{S}})$. The privacy of the targeted parties is then expressed as the conditional entropy $\operatorname{H}(X_{\mathbb{T}}\mid\mathbf{x_{\mathbb{A}}},O)$ of the targeted inputs given knowledge of the attackers’ inputs and of the output. The choice of the entropy measure $\operatorname{H}$ depends on the users’ privacy concerns and is left general in [12] to this end.

In this work and for the sake of clarity, we will choose to convey the inputs’ privacy by means of min-entropy, as in many cryptographic scenarios, although our analyses can be adapted to more general entropy measures. Under this assumption, the privacy of the targeted parties becomes:

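$$\operatorname{H}(X_{\mathbb{T}}\mid\mathbf{x_{\mathbb{A}}},O)=-\log\sum_{o}p(o\mid\mathbf{x_{\mathbb{A}}})\cdot\max_{\mathbf{x_{\mathbb{T}}}}p(\mathbf{x_{\mathbb{T}}}\mid\mathbf{x_{\mathbb{A}}},o)=-\log\sum_{o}\max_{\mathbf{x_{\mathbb{T}}}}p(\mathbf{x_{\mathbb{T}}})\cdot p(o\mid\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}})$$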

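by virtue of Bayes’ theorem. Moreover, as $X_{\mathbb{T}}$ and $X_{\mathbb{S}}$ are independent, we know that:

$$p(o\mid\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}})=\sum_{\substack{\mathbf{x_{\mathbb{S}}}\\ f(\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}},\mathbf{x_{\mathbb{S}}})=o}}p(\mathbf{x_{\mathbb{S}}})$$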

If we denote by $n$ and $m$ the sizes of the domains of $X_{\mathbb{T}}$ and $X_{\mathbb{S}}$ respectively, we know that computing one $p(o\mid\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}})$ has complexity $\mathcal{O}(m)$ and thus computing each $\max_{\mathbf{x_{\mathbb{T}}}}p(\mathbf{x_{\mathbb{T}}})\cdot p(o\mid\mathbf{x_{\mathbb{A}}},\mathbf{x_{\mathbb{T}}})$ has complexity $\mathcal{O}(nm)$. Moreover, in the worst case, i.e. if $f$ is injective, the output domain will have size $nm$, which yields an overall complexity in $\mathcal{O}(n^{2}m^{2})$ for the computation of $\operatorname{H}(X_{\mathbb{T}}\mid\mathbf{x_{\mathbb{A}}},O)$. In conclusion, although recent works have introduced a framework for characterising and quantifying the acceptable leakage, its computation cost is in general quadratic in the product of the input sizes, which prevents those privacy analyses from being applicable in practice. This major complexity issue constitutes the focus of this paper.
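To make the cost discussed in this section concrete, the following sketch evaluates the conditional min-entropy by literally nesting the three enumerations counted above (outputs, target inputs, spectator inputs). It is an illustrative assumption of how such a naive evaluation could look, not the implementation used in [11, 12, 13]; the names `conditional_min_entropy` and `uniform` are hypothetical, and priors are given as Python dictionaries as in our notations.

```python
from fractions import Fraction
from math import log2


def conditional_min_entropy(f, x_a, prior_t, prior_s):
    """Naive evaluation of H(X_T | x_A, O) for a single target and spectator."""
    outputs = {f(x_a, x_t, x_s) for x_t in prior_t for x_s in prior_s}
    vulnerability = Fraction(0)
    for o in outputs:                      # up to n * m outputs
        best = Fraction(0)
        for x_t, p_t in prior_t.items():   # n target inputs
            # p(o | x_A, x_T): sum over the m spectator inputs that produce o.
            p_o = sum((p_s for x_s, p_s in prior_s.items()
                       if f(x_a, x_t, x_s) == o), Fraction(0))
            best = max(best, p_t * p_o)
        vulnerability += best
    return -log2(float(vulnerability))


def uniform(values):
    """Uniform distribution over the given values, as a dictionary."""
    values = list(values)
    return {v: Fraction(1, len(values)) for v in values}


# Affine function of Example 1; the attacker's input is unused here.
h = conditional_min_entropy(lambda x, y, z: 3 * y + 4 * z, 0,
                            uniform(range(8)), uniform(range(7)))
print(h)
```

The three nested enumerations are what make this approach impractical once the input domains become large.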

4 Information Flow Analysis in Secure Three-Party Affine Computations

Let us consider three parties $\mathcal{X}$ , $\mathcal{Y}$ and $\mathcal{Z}$ holding the respective private inputs $x$ , $y$ and $z$ . Let $f$ be a public function of three variables. We assume that the parties wish to enter the secure computation of $f(x,y,z)$ and that $\mathcal{X}$ is attacking $\mathcal{Y}$ under spectator $\mathcal{Z}$ . From the point of view of attacker $\mathcal{X}$ , although $x$ is a known and constant value, the inputs $y$ and $z$ appear as unknown values and will be modelled as random variables $Y$ and $Z$ . Parties $\mathcal{Y}$ and $\mathcal{Z}$ are supposed to be honest parties who will not collaborate. Thus, random variables $Y$ and $Z$ are deemed to be independent. We further assume that the target’s and spectator’s inputs are from finite intervals $I_{Y}$ and $I_{Z}$ :

$$Y\in I_{Y},\quad Z\in I_{Z}$$

Their prior probability distributions $\pi_{Y}$ and $\pi_{Z}$ will represent the prior beliefs that $\mathcal{X}$ may have on those values, such that $\pi_{Y}\in\Omega(I_{Y})$ and $\pi_{Z}\in\Omega(I_{Z})$ . We note that the absence of prior belief may be represented as uniform prior distributions. Finally, we assume that function $f$ is affine in the target’s and spectator’s inputs, i.e. that we can choose three constant integers $\alpha$ , $\beta$ and $\gamma$ so as to express the output of $f$ as:

$$f(x,y,z)=\alpha+\beta y+\gamma z$$

Note that constants $\alpha$, $\beta$ and $\gamma$ may be functions of input $x$, which is also considered as a constant. Admissible candidates for such affine functions $f$ can for example be defined as $f(x,y,z)=3y+4z$ or $f(x,y,z)=x^{2}+xy+(x^{3}-2)z$.
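As a small illustration of how the coefficients depend on the attacker's input, the sketch below fixes $x$ in the second candidate function and reads off $\alpha$, $\beta$ and $\gamma$. The function name `affine_coefficients` is hypothetical and the value $x=2$ is an arbitrary choice.

```python
def f(x, y, z):
    """Second admissible candidate from the text."""
    return x**2 + x * y + (x**3 - 2) * z


def affine_coefficients(x):
    """Coefficients (alpha, beta, gamma) of f once the attacker's input x is fixed."""
    return x**2, x, x**3 - 2


x = 2
alpha, beta, gamma = affine_coefficients(x)

# f(x, ., .) coincides with the affine map alpha + beta*y + gamma*z.
agrees = all(f(x, y, z) == alpha + beta * y + gamma * z
             for y in range(5) for z in range(5))
print(alpha, beta, gamma, agrees)
```

Note that for $x=0$ this function has $\beta=0$, which is one of the degenerate cases treated in Lemma 1.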

Assumption 1.

From the attacker’s point of view, input $x$ is a known value and will thus be considered as a constant throughout this paper.

Thus, abusing notation by omitting the first argument of $f$, we refer to its output $o$ as:

$$o=f(y,z)=\alpha+\beta y+\gamma z$$

while we define the corresponding random variable $O$ for the output as $O=f(Y,Z)=\alpha+\beta Y+\gamma Z$ . We also introduce the output domain $D_{O}$ as:

$$D_{O}=\{f(y,z)\mid(y,z)\in I_{Y}\times I_{Z}\}$$

By denoting the min-entropy by $\operatorname{H}$ , the amount of information that the attacker gains on the targeted input once the output is revealed will be quantified by $\operatorname{H}(Y\mid x,O)$ . Since the value of $x$ will also be considered as a public constant in the present privacy analyses, we will refer to this quantity as $\operatorname{H}(Y\mid O)$ , which develops as:

$$\operatorname{H}(Y\mid O)=-\log\operatorname{V}(Y\mid O)$$

where the Bayes vulnerability of $Y$ given $O$ is defined as:

$$\operatorname{V}(Y\mid O)=\sum_{o\in D_{O}}p(o)\cdot\max_{y\in I_{Y}}p(y\mid o)\qquad(4)$$

To conclude this section and in order to simplify the following development, we will examine the particular case when $\beta$ or $\gamma$ is zero.

Lemma 1.

1. If $\beta=0$ then $\operatorname{H}(Y\mid O)=\operatorname{H}(Y)$.

2. If $\beta\neq 0$ and $\gamma=0$ then $\operatorname{H}(Y\mid O)=0$.

Proof.

1. If $\beta=0$ then clearly no information about $Y$ leaks from $O$ and thus $\operatorname{H}(Y\mid O)=\operatorname{H}(Y)$. More formally, in this case, $Y$ and $O$ are independent and thus Equation (4) becomes:

$$\operatorname{V}(Y\mid O)=\sum_{o}p(o)\cdot\max_{y}p(y)=1\cdot\max_{y}p(y)$$

so that $\operatorname{H}(Y\mid O)=-\log\max_{y}p(y)=\operatorname{H}(Y)$.

2. If $\beta\neq 0$ and $\gamma=0$ then $Y$ is entirely determined by $O$ given the relation $y=\frac{o-\alpha}{\beta}$ and thus $\operatorname{H}(Y\mid O)=0$. More formally, for all $o$ in $D_{O}$, there exists one $y$ in $I_{Y}$ such that $p(y\mid o)=1$ and thus Equation (4) becomes:

$$\operatorname{V}(Y\mid O)=\sum_{o}p(o)\cdot 1=1$$

and hence $\operatorname{H}(Y\mid O)=-\log 1=0$.

∎
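Both cases of Lemma 1 can be observed numerically on a small non-uniform example. The sketch below is only a sanity check under arbitrarily chosen priors; it computes the Bayes vulnerability from the joint distribution of $(Y,O)$, and the names `min_entropy` and `conditional_min_entropy` are hypothetical.

```python
from math import log2


def min_entropy(prior):
    """H(Y) = -log max_y p(y) for a prior given as a dictionary."""
    return -log2(max(prior.values()))


def conditional_min_entropy(alpha, beta, gamma, prior_y, prior_z):
    """H(Y | O) for O = alpha + beta*Y + gamma*Z with independent Y and Z."""
    joint = {}  # joint[o][y] = p(y, o)
    for y, p_y in prior_y.items():
        for z, p_z in prior_z.items():
            o = alpha + beta * y + gamma * z
            column = joint.setdefault(o, {})
            column[y] = column.get(y, 0.0) + p_y * p_z
    # V(Y | O) = sum over o of max over y of p(y, o).
    vulnerability = sum(max(column.values()) for column in joint.values())
    return -log2(vulnerability)


# Arbitrary priors, chosen so that all floating-point sums are exact.
prior_y = {0: 0.5, 1: 0.25, 2: 0.25}
prior_z = {0: 0.5, 1: 0.5}

case_1 = conditional_min_entropy(1, 0, 4, prior_y, prior_z)  # beta = 0
case_2 = conditional_min_entropy(1, 3, 0, prior_y, prior_z)  # gamma = 0
print(case_1, min_entropy(prior_y), case_2)
```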

Assumption 2.

In the rest of the paper, we will assume that $\beta$ and $\gamma$ are non-zero.

5 Privacy under uniform prior beliefs

5.1 Reducing the entropy expression

In this section, we study the case where the attacker has no prior belief on the target’s and spectator’s inputs, i.e. when $\pi_{Y}$ and $\pi_{Z}$ are uniform on $I_{Y}$ and $I_{Z}$ respectively. In other words, we assume that for all $y$ in $I_{Y}$ and $z$ in $I_{Z}$ , we have $p(y)=\frac{1}{|I_{Y}|}$ and $p(z)=\frac{1}{|I_{Z}|}$ . As $\pi_{Y}$ is uniform, we have:

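$$\operatorname{V}(Y\mid O)=\sum_{o\in D_{O}}\max_{y\in I_{Y}}p(y)\cdot p(o\mid y)=\frac{1}{|I_{Y}|}\sum_{o\in D_{O}}\max_{y\in I_{Y}}p(o\mid y)$$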

However, by definition, we know that for all output $o$ in $D_{O}$ , there exists at least one pair $(y,z)$ in $I_{Y}\times I_{Z}$ that satisfies $f(y,z)=o$ . For all such pairs, as $Y$ and $Z$ are independent, we have:

[TABLE]

since for a given $o$ and $y$ , there is at most one $z^{\prime}$ that satisfies $f(y,z^{\prime})=o$ as $f$ is affine and $\gamma$ is non-zero. Consequently, $p(o\mid y)=\frac{1}{|I_{Z}|}$ since $\pi_{Z}$ is uniform, and thus:

[TABLE]

where $N_{O}$ denotes the cardinal of $D_{O}$ . Our aim will now be to compute $N_{O}$ .
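The reduction described above can be summarised in one chain of equalities. This is a sketch reconstructed from the surrounding text, assuming that the Bayes vulnerability takes the standard form $V(Y\mid O)=\sum_{o\in D_{O}}\max_{y\in I_{Y}}p(y)\,p(o\mid y)$:

$$
V(Y\mid O)=\sum_{o\in D_{O}}\max_{y\in I_{Y}}p(y)\,p(o\mid y)=\frac{1}{|I_{Y}|}\sum_{o\in D_{O}}\max_{y\in I_{Y}}p(o\mid y)=\frac{1}{|I_{Y}|}\sum_{o\in D_{O}}\frac{1}{|I_{Z}|}=\frac{N_{O}}{|I_{Y}|\,|I_{Z}|},
\qquad
\operatorname{H}(Y\mid O)=\log\frac{|I_{Y}|\,|I_{Z}|}{N_{O}}.
$$

For instance, with $f(y,z)=y+z$ on $\llbracket{0};{1}\rrbracket\times\llbracket{0};{1}\rrbracket$, the outputs are $\{0,1,2\}$, so $N_{O}=3$ and $\operatorname{H}(Y\mid O)=\log\frac{4}{3}$.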

We mention four simplifications before analysing this problem in more detail.

Assumption 3.

1. We first notice that subtracting the constant $\alpha$ from the output of $f$ does not affect the number of distinct outputs, which enables us to simplify $f$ as $f(y,z)=\beta y+\gamma z$.

2. Now, let us assume that the interval $I_{Y}$ is of the form $I_{Y}=\llbracket{a};{b}\rrbracket$. By substituting $y^{\prime}=y-a$ for $y$, we can rewrite the expression of $f$ as $f(y^{\prime},z)=\beta a+\beta y^{\prime}+\gamma z$. The new variable $y^{\prime}$ ranges over $\llbracket{0};{b-a}\rrbracket$ and we can again subtract the constant $\beta a$ from the output. The same reasoning applies to variable $z$, which enables us to assume without loss of generality that inputs $y$ and $z$ belong to intervals of the form $\llbracket{0};{n}\rrbracket$ and $\llbracket{0};{m}\rrbracket$.

3. We now show that integers $\beta$ and $\gamma$ can be assumed to be positive without loss of generality. If both $\beta$ and $\gamma$ are negative, then we can equivalently compute the number of outputs of function $f^{\prime}(y,z)=-f(y,z)=-\beta y-\gamma z$, which has positive coefficients. If $\beta<0$ and $\gamma>0$, we can write $f$ as $f(y,z)=\beta n-\beta(n-y)+\gamma z$. However, the input space $\llbracket{0};{n}\rrbracket$ of variable $Y$ is equal to that of $n-Y$, and the constant $\beta n$ can be subtracted without changing the number of outputs, so we can equivalently study function $f^{\prime}(y,z)=-\beta y+\gamma z$, whose coefficients are positive. Conversely, if $\beta>0$ and $\gamma<0$, we can again equivalently study $f^{\prime}(y,z)=-f(y,z)$, which falls under the previous case.

4. Let us denote by $d$ the greatest common divisor of $\beta$ and $\gamma$. We know that $d$ can be computed in $\mathcal{O}(\log(\beta+\gamma))$. Function $f$ can be factorised as $f(y,z)=d\cdot f^{\prime}(y,z)$ with $f^{\prime}(y,z)=\beta^{\prime}y+\gamma^{\prime}z$, where $\beta^{\prime}=\frac{\beta}{d}$ and $\gamma^{\prime}=\frac{\gamma}{d}$ are coprime. Since multiplication by $d$ is injective, functions $f$ and $f^{\prime}$ have the same number of outputs. We can thus assume that the coefficients of the affine function are coprime provided that we have computed their greatest common divisor.
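The four simplifications can be sketched as a single normalisation step. The function names `normalise` and `count_outputs` are hypothetical helpers introduced for illustration, and intervals are passed as `(low, high)` tuples by assumption; the brute-force count is only there to confirm that the number of outputs is preserved.

```python
from math import gcd

def normalise(beta, gamma, i_y, i_z):
    """Reduce f(y, z) = alpha + beta*y + gamma*z on integer intervals
    i_y = (a, b), i_z = (c, d) to positive coprime coefficients on
    [0, n] x [0, m]. The constant alpha plays no role (simplification 1)."""
    if beta == 0 or gamma == 0:
        raise ValueError("Assumption 2 requires non-zero coefficients")
    n = i_y[1] - i_y[0]              # simplification 2: shift intervals to 0
    m = i_z[1] - i_z[0]
    d = gcd(beta, gamma)             # simplification 4: divide by the gcd
    # simplification 3: signs can be dropped, hence the absolute values
    return abs(beta) // d, abs(gamma) // d, n, m

def count_outputs(alpha, beta, gamma, i_y, i_z):
    """Number of distinct outputs, by exhaustive enumeration."""
    return len({alpha + beta * y + gamma * z
                for y in range(i_y[0], i_y[1] + 1)
                for z in range(i_z[0], i_z[1] + 1)})

# Illustrative instance: f(y, z) = 7 - 6y + 4z on [3, 9] x [-2, 5].
b, g, n, m = normalise(-6, 4, (3, 9), (-2, 5))
original = count_outputs(7, -6, 4, (3, 9), (-2, 5))
reduced = count_outputs(0, b, g, (0, n), (0, m))
```

Here the reduced problem is $3y+2z$ on $\llbracket{0};{6}\rrbracket\times\llbracket{0};{7}\rrbracket$, and both enumerations return the same count.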

5.2 Measuring the size of the output domain

For the sake of clarity, this technical subsection will be developed so as to be self-contained.

Let $n$ , $m$ be two non-negative integers and $\beta$ and $\gamma$ be two positive integers. Let us also assume that $\beta$ and $\gamma$ are coprime. The aim is to calculate the cardinal $N_{O}$ of the set $A$ defined as follows:

$$A=\{\beta y+\gamma z\mid(y,z)\in\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket\}$$

We can first notice that $|A|$ is positive and upper bounded by $(n+1)(m+1)$ . The difficulty is that two different pairs $(y,z)$ and $(y^{\prime},z^{\prime})$ in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ can satisfy $\beta y+\gamma z=\beta y^{\prime}+\gamma z^{\prime}$ , and thus $|A|$ will often be lower than $(n+1)(m+1)$ . We also notice that $A\subseteq\llbracket{0};{\beta n+\gamma m}\rrbracket$ , and thus we also have $|A|\leq\beta n+\gamma m+1$ .

Notation. For any real $x$, the floor of $x$ will be denoted by $\lfloor x\rfloor$ while $\lceil x\rceil$ will denote its ceiling. The fact that two integers $i$ and $j$ have the same residue modulo another integer $k$ will be denoted as $i=j\mod k$. For all integers $i$ and $k$, we will denote the equivalence class of $i$ modulo $k$ by $[{i}]_{{k}}=\{j\in\mathbb{Z}\mid j=i\mod k\}$.

Recall 1.

For all real numbers $x$ , we have $\lceil-x\rceil=-\lfloor x\rfloor$ .

Proof.

Let $x$ be a real number. We have:

[TABLE]

and so:

[TABLE]

and thus, as $-\lceil-x\rceil$ is integral:

[TABLE]

∎

Recall 2.

Let $p$ and $q$ be two coprime natural numbers. We have:

[TABLE]

Proof.

If $q=1$ then the result is immediate since the sum adds up to $0$. Let us now assume that $q>1$. We can notice that we have:

[TABLE]

and thus:

[TABLE]

However, because $p$ and $q$ are coprime we know that for all $k$ in $\llbracket{1};{q-1}\rrbracket$ , we have $gcd(k,q)=1$ . As $q>1$ , we thus know that for all $k$ in $\llbracket{1};{q-1}\rrbracket$ , we have $\frac{kq}{p}\notin\mathbb{Z}$ and thus:

[TABLE]

and thus Equation (4) becomes:

[TABLE]

∎

Recall 3.

Let $p$ and $q$ be two coprime natural numbers. We have:

[TABLE]

Proof.

Let $i$ and $j$ be in $\llbracket{0};{q-1}\rrbracket$ . We have:

[TABLE]

since $p$ and $q$ are coprime. Thus, for all distinct $i$ and $j$ in $\llbracket{0};{q-1}\rrbracket$ , we have $[{pi}]_{{q}}\neq[{pj}]_{{q}}$ and thus $|\{pj\mod q\mid j\in\llbracket{0};{q-1}\rrbracket\}|=q$ and therefore:

[TABLE]

∎

Lemma 2.

Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ . We have:

[TABLE]

Proof.

Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ . As $\beta$ and $\gamma$ are coprime, we have:

[TABLE]

∎

Lemma 3.

Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be two distinct pairs in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ . We have:

[TABLE]

Proof.

Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be two distinct pairs in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ such that:

[TABLE]

By virtue of Lemma 2, we can take $k$ in $\mathbb{Z}$ such that:

[TABLE]

As $(y,z)$ and $(y^{\prime},z^{\prime})$ are different, we further know that $k$ is different from $0$. This implies that:

[TABLE]

But as $y$ and $y^{\prime}$ belong to $\llbracket{0};{n}\rrbracket$ and $z$ and $z^{\prime}$ belong to $\llbracket{0};{m}\rrbracket$ , we also know that:

[TABLE]

and thus:

[TABLE]

∎

Corollary 1.

If $n<\gamma\vee m<\beta$ , then $|A|=(n+1)(m+1)$ .

Proof.

Let us assume that $n<\gamma\vee m<\beta$. Let us define the function $g\colon\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket\longrightarrow\llbracket{0};{\beta n+\gamma m}\rrbracket$ as $g\colon(y,z)\longmapsto\beta y+\gamma z$. By virtue of Lemma 3, we know that the function $g$ is injective. Thus, we have:

[TABLE]

∎
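Corollary 1 can be illustrated by enumeration. The following sketch uses hypothetical parameters $\beta=5$ and $\gamma=7$ (coprime) and a helper `count_outputs` that is not part of the paper; it compares two injective configurations with one in which both $n\geq\gamma$ and $m\geq\beta$ hold.

```python
def count_outputs(beta, gamma, n, m):
    """|A| for A = {beta*y + gamma*z : 0 <= y <= n, 0 <= z <= m}."""
    return len({beta * y + gamma * z
                for y in range(n + 1)
                for z in range(m + 1)})

small_n = count_outputs(5, 7, 6, 10)    # n < gamma: no two pairs collide
small_m = count_outputs(5, 7, 20, 4)    # m < beta: no two pairs collide
colliding = count_outputs(5, 7, 7, 5)   # n >= gamma and m >= beta
```

In the last configuration the pairs $(7,0)$ and $(0,5)$ both map to $35$, so the count falls below $(n+1)(m+1)=48$.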

Assumption 4.

In the remainder of this section, we will assume that $n\geq\gamma\wedge m\geq\beta$.

Lemma 4.

Let $o$ be in $A$ . Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be two distinct pairs in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ . We have:

[TABLE]

Proof.

Let $o$ be in $A$ . Let $(y,z)$ and $(y^{\prime},z^{\prime})$ be two distinct pairs in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$ such that:

[TABLE]

By virtue of Lemma 2, we can take $k$ in $\mathbb{Z}$ such that:

[TABLE]

We further know that both pairs are distinct, and thus that $k$ is different from $0$. Without loss of generality, let us assume that $(y,z)>_{2}(y^{\prime},z^{\prime})$ where $>_{2}$ refers to the lexicographic order on integer pairs. In other words, let us assume that $k>0$.

We know that $z^{\prime}\geq 0$ and Equation (8) ensures that $y^{\prime}\geq\gamma$ since $k>0$ . As $o=\beta y^{\prime}+\gamma z^{\prime}$ , we thus have $o\geq\beta\gamma$ .

Conversely, we know that $y^{\prime}\leq n$ and Equation (8) ensures that $z^{\prime}\leq m-\beta$ since $k>0$ . As $o=\beta y^{\prime}+\gamma z^{\prime}$ , we thus have:

[TABLE]

∎
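The two bounds obtained in this proof can be checked by enumeration: any output produced by two distinct pairs should lie between $\beta\gamma$ and $\beta n+\gamma m-\beta\gamma$. A minimal sketch with illustrative parameters (the helper name `collision_outputs` is hypothetical):

```python
from collections import Counter

def collision_outputs(beta, gamma, n, m):
    """Sorted outputs reached by at least two distinct pairs (y, z)."""
    counts = Counter(beta * y + gamma * z
                     for y in range(n + 1)
                     for z in range(m + 1))
    return sorted(o for o, c in counts.items() if c > 1)

beta, gamma, n, m = 3, 4, 9, 7               # coprime, n >= gamma, m >= beta
collisions = collision_outputs(beta, gamma, n, m)
low = beta * gamma                           # smallest possible collision
high = beta * n + gamma * m - beta * gamma   # largest possible collision
```

For these parameters both bounds are attained: $12=3\cdot 4+4\cdot 0=3\cdot 0+4\cdot 3$ and $43=3\cdot 5+4\cdot 7=3\cdot 9+4\cdot 4$.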

Theorem 1.

We have:

[TABLE]

Proof.

By virtue of Lemma 4, we know that for all $o$ in $\llbracket{0};{\beta\gamma-1}\rrbracket$, and for all pairs $(y,z)$ and $(y^{\prime},z^{\prime})$ in $\llbracket{0};{n}\rrbracket\times\llbracket{0};{m}\rrbracket$, we have:

[TABLE]

Thus:

[TABLE]

since $\beta y+\gamma z\geq\beta\gamma$ for all $z$ in $\llbracket{0};{m}\rrbracket$ when $y\geq\gamma$, and since $n\geq\gamma$. Moreover, since $m\geq\beta$, for all $y$ in $\llbracket{0};{\gamma-1}\rrbracket$, we have:

[TABLE]

by virtue of Recall 1. We thus have:

[TABLE]

by virtue of Recall 2 since $\beta$ and $\gamma$ are coprime. ∎

Lemma 5.

Let $o$ be in $\llbracket{0};{\beta n+\gamma m}\rrbracket$ . We have:

$$o\in A\iff\beta n+\gamma m-o\in A$$

Proof.

Let $o$ be in $\llbracket{0};{\beta n+\gamma m}\rrbracket$ . We know that $\beta$ and $\gamma$ are coprime, so we can take two integers $y$ and $z$ in $\mathbb{Z}$ such that:

[TABLE]

and we have:

[TABLE]

Now, we have:

[TABLE]

and thus:

[TABLE]

∎

Corollary 2.

We have:

[TABLE]

Proof.

This is an immediate consequence of Theorem 1 and Lemma 5. ∎
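The symmetry of Lemma 5 implies that $A$ has as many elements among the top $\beta\gamma$ values of $\llbracket{0};{\beta n+\gamma m}\rrbracket$ as among the bottom $\beta\gamma$ values. The sketch below checks this by enumeration. The expected count $\frac{\beta\gamma+\beta+\gamma-1}{2}$ used in the comment is an assumption taken from the classical count of integers representable with coprime coefficients, not a formula quoted from the text above.

```python
def output_set(beta, gamma, n, m):
    """All values beta*y + gamma*z on [0, n] x [0, m]."""
    return {beta * y + gamma * z
            for y in range(n + 1)
            for z in range(m + 1)}

beta, gamma, n, m = 3, 4, 9, 7           # coprime, n >= gamma, m >= beta
A = output_set(beta, gamma, n, m)
top = beta * n + gamma * m

# Lemma 5: o is in A exactly when (top - o) is in A.
symmetric = all((top - o) in A for o in A)

# Elements of A in [0, beta*gamma - 1] and in [top - beta*gamma + 1, top].
bottom_count = sum(1 for o in A if o < beta * gamma)
top_count = sum(1 for o in A if o > top - beta * gamma)

# Classical count (assumption): (beta*gamma + beta + gamma - 1) / 2.
expected = (beta * gamma + beta + gamma - 1) // 2
```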

Definition 1.

1. For all $i$ in $\llbracket{0};{n}\rrbracket$, we define:

$$A_{i}=\{\beta i+\gamma z\mid z\in\llbracket{0};{m}\rrbracket\}$$

so that $A=\bigcup_{i=0}^{n}A_{i}$.

2. For all $j$ in $\llbracket{0};{\gamma-1}\rrbracket$, we define:

[TABLE]

so that $A$ can be rewritten:

[TABLE]

Lemma 6.

For all $j$ in $\llbracket{0};{\gamma-1}\rrbracket$ , we have:

[TABLE]

Proof.

Let $j$ be in $\llbracket{0};{\gamma-1}\rrbracket$ . For all $q$ in $\llbracket{0};{\lfloor\frac{n-j}{\gamma}\rfloor}\rrbracket$ , let us define the predicate $P_{q}$ as follows:

[TABLE]

and let us prove by induction that $P_{q}$ holds for all $q$ in $\llbracket{0};{\lfloor\frac{n-j}{\gamma}\rfloor}\rrbracket$ .

By definition, we have:

[TABLE]

and thus $P_{0}$ holds.

Let $q$ be in $\llbracket{0};{\lfloor\frac{n-j}{\gamma}\rfloor-1}\rrbracket$ and let us assume that $P_{q}$ holds.

By definition, we have:

[TABLE]

This equation, combined with the assumption that $P_{q}$ holds, yields us:

[TABLE]

However, as $m\geq\beta$ , we know that:

[TABLE]

And so Equation (14) becomes:

[TABLE]

which means that $P_{q+1}$ holds, which enables us to conclude the induction.

∎

Lemma 7.

For all $j$ in $\llbracket{0};{\gamma-1}\rrbracket$ , we have:

[TABLE]

Proof.

Let $j$ be in $\llbracket{0};{\gamma-1}\rrbracket$ . Lemma 6 ensures that we have:

[TABLE]

First, we know that $\beta j\leq\beta\gamma$ . Let us now define $S$ as the following statement:

[TABLE]

We have the following equivalences:

[TABLE]

By definition of the floor function, Equation (17) holds and by equivalence, Equation (16) thus also holds, and so:

[TABLE]

and intersecting with $[{\beta j}]_{{\gamma}}$ yields us the expected result. ∎

Theorem 2.

We have:

[TABLE]

Proof.

By definition, we have:

[TABLE]

Lemma 7 thus implies:

[TABLE]

But as $\beta$ and $\gamma$ are coprime, Recall 3 ensures that:

[TABLE]

which concludes the proof of the Theorem. ∎

Theorem 3.

We have:

[TABLE]

Proof.

Let us define the intervals $I_{1},I_{2}$ and $I_{3}$ as follows:

[TABLE]

We notice that $(I_{1},I_{2},I_{3})$ forms a partition of $\llbracket{0};{\beta n+\gamma m}\rrbracket$ and thus we can partition $A$ into $(A\cap I_{1},A\cap I_{2},A\cap I_{3})$ , which enables us to express the cardinal of $A$ as the following sum:

[TABLE]

We already know by Theorem 1 and Corollary 2 that:

[TABLE]

Moreover, Theorem 2 ensures that:

[TABLE]

and thus:

[TABLE]

and finally Equation (20) becomes:

[TABLE]

∎

In the following corollary, we can now synthesise the previous results and realize one of our main objectives: a closed-form expression for $\operatorname{H}(Y\mid O)$ under uniform prior beliefs.

Corollary 3.

We consider a function $f$ defined as $f(y,z)=\alpha+\beta y+\gamma z$ with $\alpha$ , $\beta$ and $\gamma$ being three constant integer values, with non-zero $\beta$ and $\gamma$ . We assume that $Y$ and $Z$ are ranged in the respective intervals $I_{Y}$ and $I_{Z}$ of size $(n+1)$ and $(m+1)$ respectively, and we assume that $Y$ and $Z$ are uniformly distributed on those intervals. Let $d$ be the greatest common divisor of $\beta$ and $\gamma$ , and let us define $\beta^{\prime}=|\frac{\beta}{d}|$ and $\gamma^{\prime}=|\frac{\gamma}{d}|$ where here $|x|$ represents the absolute value of integer $x$ . Then, we have:

[TABLE]

Proof.

This is an immediate consequence of Theorem 3 and Equation (3). ∎

This gives us a method for quantifying the information leaked about a targeted party by the public output of an SMC under uniform prior beliefs. This method requires a computational time that is constant in the size of the input spaces and logarithmic in the coefficients of the affine function, owing to the greatest common divisor computation. In the next section, we show how we can reason about $\operatorname{H}(Y\mid O)$ when an attacker has some prior beliefs about $Y$ and $Z$.
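A constant-time computation of this kind can be sketched as follows. The expression $N_{O}=\beta^{\prime}(n+1)+\gamma^{\prime}(m+1)-\beta^{\prime}\gamma^{\prime}$ used in the non-injective case is our own reconstruction from the three-interval partition of the proof of Theorem 3; it is an assumption about the form of the result rather than a quotation, and it is validated against exhaustive enumeration on small instances. Function names are hypothetical and logarithms are taken in base 2.

```python
from math import gcd, log2

def count_outputs_closed_form(beta, gamma, n, m):
    """Number of distinct values of beta*y + gamma*z on [0, n] x [0, m]."""
    if beta == 0 or gamma == 0:
        raise ValueError("coefficients must be non-zero (Assumption 2)")
    d = gcd(beta, gamma)                 # logarithmic in the coefficients
    b, g = abs(beta) // d, abs(gamma) // d
    if n < g or m < b:                   # Corollary 1: f is injective
        return (n + 1) * (m + 1)
    # Assumed closed form for n >= g and m >= b (reconstructed, see lead-in).
    return b * (n + 1) + g * (m + 1) - b * g

def min_entropy_uniform(beta, gamma, n, m):
    """H(Y | O) in bits under uniform priors: log2(|I_Y| |I_Z| / N_O)."""
    n_o = count_outputs_closed_form(beta, gamma, n, m)
    return log2((n + 1) * (m + 1)) - log2(n_o)

def count_outputs_brute_force(beta, gamma, n, m):
    """Reference count by enumeration, quadratic in the input sizes."""
    return len({beta * y + gamma * z
                for y in range(n + 1)
                for z in range(m + 1)})
```

For example, `min_entropy_uniform(1, 1, 50, 50)` evaluates $\log_{2}\frac{51^{2}}{101}$ without enumerating any output.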

6 Privacy bounds under non-uniform prior beliefs

In this section, we present some lower and upper bounds for $\operatorname{H}(Y\mid O)$ under non-uniform prior beliefs on the target’s and the spectator’s input. The following theorem first imposes a lower bound.

Theorem 4.

We have:

[TABLE]

with equality when $Y$ and $Z$ have uniform prior distributions.

Proof.

We denote by $N_{O}$ the number of possible outputs, for which we recall that Theorem 3 yields an explicit formula. By comparison between the $1$ -norm and the infinity-norm, we have:

[TABLE]

But we know that:

[TABLE]

and thus:

[TABLE]

Taking the negative logarithm concludes the proof, and we can verify, with our bespoke explicit formula from Equation (3), that we indeed have equality when $Y$ and $Z$ are uniformly distributed since then $\max_{y}p(y)=\frac{1}{|I_{Y}|}$ and $\max_{z}p(z)=\frac{1}{|I_{Z}|}$ . ∎
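The norm comparison used in this proof can be sketched in one line. This is a reconstruction under the assumption that the bound of Theorem 4 has the form $\operatorname{H}(Y\mid O)\geq-\log\big(N_{O}\cdot\max_{y}p(y)\cdot\max_{z}p(z)\big)$, which is consistent with the equality case discussed above. Since each pair $(o,y)$ is reached by at most one $z$, we have $p(o\mid y)\leq\max_{z}p(z)$, and hence:

$$
V(Y\mid O)=\sum_{o\in D_{O}}\max_{y}p(y)\,p(o\mid y)\leq\sum_{o\in D_{O}}\max_{y}p(y)\cdot\max_{z}p(z)=N_{O}\cdot\max_{y}p(y)\cdot\max_{z}p(z).
$$

Under uniform priors the right-hand side equals $\frac{N_{O}}{|I_{Y}|\,|I_{Z}|}$, which is exactly the value of $V(Y\mid O)$ obtained in Section 5.1.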

We will now study some upper bounds for $\operatorname{H}(Y\mid O)$. It is a known property of the min-entropy that $\operatorname{H}(Y\mid O)\leq\operatorname{H}(Y)$, i.e. that knowledge of the public output cannot increase the targeted input’s entropy. We will now prove that $\operatorname{H}(Y\mid O)\leq\operatorname{H}(Z)$, i.e. the remaining entropy of $Y$ given knowledge of $O$ cannot be larger than the prior entropy of the spectator’s input $Z$. To this end, we first state in the next theorem that an attacker eavesdropping on the value of $x$ and learning the public output will gain the same amount of information on the targeted input $Y$ as on the spectator’s input $Z$.

Theorem 5.

We have:

[TABLE]

Proof.

We have:

[TABLE]

For all $o$ in $D_{O}$ , we define the set $S^{o}$ of pairs that result in output $o$ as:

[TABLE]

We define its projections $S^{o}_{Y}$ and $S^{o}_{Z}$ on its first and second components respectively as follows:

[TABLE]

Moreover, for all $o$ in $D_{O}$ and $y$ in $D_{Y}$, we know that $p(y\mid o)$ is non-zero only if there exists a $z$ in $D_{Z}$ such that $(y,z)$ is in $S^{o}$. Thus:

[TABLE]

Now, for all $o$ in $D_{O}$ and $y$ in $S^{o}_{Y}$ , there exists a unique $z$ in $D_{Z}$ such that $\beta y+\gamma z=o$ , determined by $z=\frac{o-\beta y}{\gamma}$ and we have $p(o\mid y)=p(z)$ . Thus:

[TABLE]

Conversely, for all $o$ in $D_{O}$ and $z$ in $S^{o}_{Z}$, there exists a unique $y$ satisfying $\beta y+\gamma z=o$ and we have $p(o\mid z)=p(y)$. Thus:

[TABLE]

And finally for all $o$ in $D_{O}$ , we know that $p(o\mid z)$ can only be non zero if $z$ is in $S^{o}_{Z}$ and thus:

[TABLE]

∎

Corollary 4.

We have:

[TABLE]

Proof.

We have $\operatorname{H}(Z\mid O)\leq\operatorname{H}(Z)$ and Theorem 5 concludes the proof. ∎
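Theorem 5 and Corollary 4 can be checked exactly on a small instance using rational arithmetic. The sketch assumes the standard Bayes vulnerability $V(A\mid O)=\sum_{o}\max_{a}p(a,o)$, so that equal vulnerabilities mean equal conditional min-entropies; the priors, the function $f(y,z)=y+2z$ and the helper name are illustrative choices.

```python
from fractions import Fraction

def vulnerability_of_first(f, prior_a, prior_b):
    """Bayes vulnerability V(A | O) for O = f(A, B), A and B independent."""
    joint = {}                                   # joint[(o, a)] = p(a, o)
    for a, pa in prior_a.items():
        for b, pb in prior_b.items():
            key = (f(a, b), a)
            joint[key] = joint.get(key, 0) + pa * pb
    best = {}                                    # best[o] = max_a p(a, o)
    for (o, a), p in joint.items():
        best[o] = max(best.get(o, 0), p)
    return sum(best.values())

prior_y = {0: Fraction(1, 2), 1: Fraction(1, 6), 2: Fraction(1, 3)}
prior_z = {0: Fraction(1, 10), 1: Fraction(3, 5),
           2: Fraction(1, 5), 3: Fraction(1, 10)}

def f(y, z):
    return y + 2 * z

v_y = vulnerability_of_first(f, prior_y, prior_z)                     # V(Y | O)
v_z = vulnerability_of_first(lambda z, y: f(y, z), prior_z, prior_y)  # V(Z | O)
```

Since $\operatorname{H}=-\log V$, the inequality $V(Y\mid O)\geq\max_{z}p(z)$ is the vulnerability form of $\operatorname{H}(Y\mid O)\leq\operatorname{H}(Z)$.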

7 Examples

In this section, we illustrate the theoretical results previously obtained. We begin this section by presenting an example which deepens our understanding of the behaviour of $\operatorname{H}(Y\mid O)$ under non-uniform prior beliefs. We could intuitively posit that $\operatorname{H}(Y\mid O)$ is maximal when the prior distributions for $Y$ and $Z$ are uniform. However, we refute this hypothesis in the following example.

Example 3.

Let us consider the function $f(y,z)=o=2y+3z$, and let us assume that $y$ ranges over $I_{Y}=\llbracket{0};{2}\rrbracket$ and $z$ ranges over $I_{Z}=\llbracket{0};{1}\rrbracket$. Let us consider $\pi^{u}_{Y}=\{0:\frac{1}{3},1:\frac{1}{3},2:\frac{1}{3}\}$ and $\pi^{u}_{Z}=\{0:\frac{1}{2},1:\frac{1}{2}\}$, the uniform distributions for $Y$ and $Z$ on their respective domains. We also consider the following particular distribution $\pi_{Y}^{*}=\{0:\frac{1}{2},1:0,2:\frac{1}{2}\}$.

Then, when $Y$ and $Z$ respectively follow the prior distributions $\pi_{Y}^{u}$ and $\pi_{Z}^{u}$ , we have $\operatorname{H}^{u}(Y\mid O)=-\log(\frac{5}{6})$ . On the other hand, when $Y$ and $Z$ respectively follow the prior distributions $\pi_{Y}^{*}$ and $\pi_{Z}^{u}$ , we get $\operatorname{H}^{*}(Y\mid O)=-\log(\frac{3}{4})$ .

We thus have $\operatorname{H}^{*}(Y\mid O)>\operatorname{H}^{u}(Y\mid O)$ which contradicts the intuitive hypothesis.

The next example presents a use case of Corollary 3.

Example 4.

Let us consider three parties $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ holding respective private inputs $x$, $y$ and $z$ and wishing to engage in the secure computation of a public function $f$ defined as $f(x,y,z)=(3x-6)y+(x^{2}-2x+6)z$. We suppose that party $\mathcal{X}$ is attacking input $y$ under spectator $\mathcal{Z}$. We notice that when input $x$ is fixed, the function $f$ is affine in $y$ and $z$ and we can thus apply our privacy analysis. We assume that $Y$ and $Z$ range over the input domain $I=\llbracket{0};{{5}\times 10^{{12}}}\rrbracket$ and that $\mathcal{X}$’s prior beliefs $\pi_{Y}$ and $\pi_{Z}$ on those inputs are uniform over $I$. We plot in Figure 3 the values of $\operatorname{H}(Y\mid O)$ computed via Corollary 3, for the values of input $x$ in $\llbracket{0};{30}\rrbracket$. Note that although a small interval for the values of $x$ has been chosen for readability, the entropy $\operatorname{H}(Y\mid O)$ can be computed for any value of $x$. An attacker $\mathcal{X}$ who is willing to deviate from his honest, intended input in order to learn as much as possible about his targeted input $Y$ has an incentive to enter a value of $x$ that produces low entropy. For example, he would rather enter value $x=25$ than $x=2$. Conversely, targeted parties can use this information to evaluate the risk that they face by entering the computation, in the worst case or on average.
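The scan over $x$ described in this example can be sketched as follows. The closed form used for $N_{O}$ when the function is not injective, $\beta^{\prime}(n+1)+\gamma^{\prime}(m+1)-\beta^{\prime}\gamma^{\prime}$, is an assumed reconstruction of the result of Section 5.2 rather than a quotation of Corollary 3, and the case $\beta=0$ (reached at $x=2$) is handled through Lemma 1. Entropies are expressed in bits, and the function name is hypothetical.

```python
from math import gcd, log2

def entropy_for_x(x, n, m):
    """H(Y | O) in bits for f(x, y, z) = (3x-6)y + (x^2-2x+6)z with x fixed,
    Y uniform on [0, n] and Z uniform on [0, m]."""
    beta, gamma = 3 * x - 6, x * x - 2 * x + 6
    if beta == 0:                 # Lemma 1: the output is independent of Y
        return log2(n + 1)
    if gamma == 0:                # Lemma 1: the output determines Y
        return 0.0
    d = gcd(beta, gamma)
    b, g = abs(beta) // d, abs(gamma) // d
    if n < g or m < b:            # Corollary 1: injective case
        n_o = (n + 1) * (m + 1)
    else:                         # assumed closed form (see lead-in)
        n_o = b * (n + 1) + g * (m + 1) - b * g
    return log2((n + 1) * (m + 1)) - log2(n_o)

n = m = 5 * 10**12
entropies = {x: entropy_for_x(x, n, m) for x in range(31)}
worst_for_attacker = max(entropies, key=entropies.get)
```

Under these assumptions the entropy is largest at $x=2$, where the coefficient of $y$ vanishes, which agrees with the observation that the attacker would prefer $x=25$ to $x=2$.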

Example 5.

In order to evaluate the effectiveness of our approach, we repeated the computations of the previous example while varying the size of the input spaces $I_{Y}$ and $I_{Z}$, and compared the computational time that different methods require to perform such analyses. More precisely, we computed the $31$ values of $\operatorname{H}(Y\mid O)$ in the same scenario as in Example 4, but we let inputs $Y$ and $Z$ range over the intervals $\llbracket{0};{{5}\times 10^{{p}}}\rrbracket$ for different values of $p$. We compared the time taken by the following three methods, which we display in Figure 4.

1. Naive method: We use the combinatorial formula given in Equation (1) that has complexity $\mathcal{O}(n^{2}m^{2})$.

2. Simplified method: We use the simplified formula of Equation (3) for affine functions under uniform distributions, where $N_{O}$ is computed naively by enumerating the set of outputs, which yields complexity $\mathcal{O}(nm)$.

3. Explicit method: We use the result of Corollary 3 which provides a constant time formula.

The variables $n$ and $m$ represent the size of the input spaces (minus $1$ ) and are both set to ${5}\times 10^{{p}}$ for varying values of $p$ . We set a time limit of $5$ minutes and we mark by an infinity sign the computations that timed out. We can notice that both naive methods rapidly time out as the input space grows whereas our explicit formula enables us to perform privacy analyses in constant time for arbitrarily large input spaces such as the one performed in Example 4. Those computations have been performed on an Intel(R) Core(TM) i3-2350M CPU @ 2.30GHz, but are aimed at estimating the order of magnitude of those methods rather than precisely assessing them individually.

In the following example, we now illustrate the lower and upper bounds that have been derived for $\operatorname{H}(Y\mid O)$ under non-uniform prior beliefs.

Example 6.

We consider the computation of $f$ whose simplification, once $x$ is fixed, is defined as $f(y,z)=y+z$ . We assume that $Y$ and $Z$ are ranged in the domain $I=\llbracket{0};{50}\rrbracket$ .

We define a spiked distribution $d_{s}$ parametrised by a domain $D$ , a center $c\in D$ and a weight $w\in[0,1]$ as:

[TABLE]

In other words, distribution $d_{s}(D,c,w)$ allocates a probability $w$ to the value $c$ and distributes the remaining probability uniformly amongst the other values of the domain $D$ .

We suppose that $Y$ is uniformly distributed over $I$ and that $Z$ follows distribution $d_{s}(I,0,w)$ for different weights $w$ . We divide the interval $[0,1]$ into $50$ values. For each $w$ in those $50$ values, we compute the exact values of $\operatorname{H}(Y\mid O)$ and that of its bounds derived in the previous section. The value of $\operatorname{H}(Y\mid O)$ appears in blue in Figure 5. Its lower bound stemming from Theorem 4 is drawn in red and its upper bound derived from Corollary 4 is traced in green. Note that we considered small input spaces since $\operatorname{H}(Y\mid O)$ is here calculated with a naive method, although its bounds can be computed efficiently for arbitrarily large input spaces.
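The experiment of this example can be sketched as follows. The spiked distribution follows the verbal description above; the grid of $50$ weights $w=\frac{k}{49}$ is an assumption, as is the form of the lower bound, $-\log_{2}\big(N_{O}\max_{y}p(y)\max_{z}p(z)\big)$, taken here as the content of Theorem 4. The upper bound is $\operatorname{H}(Z)$ from Corollary 4. All function names are hypothetical.

```python
from fractions import Fraction
from math import log2

def spiked(domain, centre, weight):
    """d_s(D, c, w): probability w on c, the rest spread uniformly on D minus c."""
    rest = (1 - weight) / (len(domain) - 1)
    return {v: (weight if v == centre else rest) for v in domain}

def analyse(weight, size=51):
    """Return (lower bound, exact H(Y | O), upper bound) in bits for f = y + z."""
    domain = range(size)
    prior_y = {y: Fraction(1, size) for y in domain}
    prior_z = spiked(domain, 0, weight)
    best = {}                        # best[o] = max_y p(y, o)
    for y, py in prior_y.items():
        for z, pz in prior_z.items():
            # each (o, y) is reached by exactly one z, so p(y, o) = p(y) p(z)
            o = y + z
            best[o] = max(best.get(o, 0), py * pz)
    max_y, max_z = max(prior_y.values()), max(prior_z.values())
    exact = -log2(sum(best.values()))
    lower = -log2(len(best) * max_y * max_z)   # assumed form of Theorem 4
    upper = -log2(max_z)                       # H(Z), Corollary 4
    return lower, exact, upper

weights = [Fraction(k, 49) for k in range(50)]
curves = [analyse(w) for w in weights]
```

Exact rationals are used for the probabilities so that the extreme weights $w=0$ and $w=1$ do not suffer from rounding before the logarithm is taken.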

8 Conclusion

Although extensive research in Secure Multi-party Computation has considerably improved the efficiency of cryptographic protocols, the quantification of the acceptable leakage is a problem that still requires deeper investigation. Indeed, the computational complexity of the recently introduced privacy analyses does not yet allow their application in practical situations that involve large input spaces. In this work, we focused our attention on secure three-party computations of affine functions. We have formally investigated the behaviour of the acceptable leakage under uniform prior beliefs in order to obtain an explicit formula for the min-entropy of the targeted input given knowledge of the output. The calculation of this closed-form expression requires a computational time that is constant in the input sizes and logarithmic in the coefficients of the function, which enables the privacy analysis of such computations in practice. Finally, we have derived some theoretical bounds for this acceptable leakage when the input prior distributions are non-uniform in order to accommodate the potential prior beliefs that an attacker may have.

In the future, we would like to enlarge our understanding of the acceptable leakage in more general settings. First, as our work is motivated by the privacy leaks that occur during SMC protocols, we tailored our analyses for finite input spaces. However, it would be interesting to adapt our model and to design some methods that can accommodate continuous input and output spaces. Moreover, although our current analysis considers the computation of affine functions for three parties, it would be of interest to explore the computation of affine functions for any number of parties.

We also intend to investigate more general functions that involve non-linear terms. It would be particularly interesting to study the composition of our analyses of affine functions in order to use them as building blocks for studying more complex functions. Finally, efficient and exact quantification of the acceptable leakage for general functions may be hard to obtain simultaneously, and we would thus also be interested in providing efficient methods for approximating input privacy in general scenarios.

