Scalable Information-Flow Analysis of Secure Three-Party Affine Computations
Patrick Ah-Fat, Michael Huth

Abstract
Elaborate protocols in Secure Multi-party Computation enable several participants to compute a public function of their own private inputs while ensuring that no undesired information leaks about the private inputs, and without resorting to any trusted third party. However, the public output of the computation inevitably leaks some information about the private inputs. Recent works have introduced a framework and proposed some techniques for quantifying such information flow. Yet, owing to their complexity, those methods do not scale to practical situations that may involve large input spaces. The main contribution of the work reported here is to formally investigate the information flow captured by the min-entropy in the particular case of secure three-party computations of affine functions in order to make its quantification scalable to realistic scenarios. To this end, we mathematically derive an explicit formula for this entropy under uniform prior beliefs about the inputs. We show that this closed-form expression can be computed in time constant in the input sizes and logarithmic in the coefficients of the affine function. Finally, we formulate some theoretical bounds for this privacy leak in the presence of non-uniform prior beliefs.
Patrick Ah-Fat and Michael Huth
Department of Computing, Imperial College London
London, SW7 2AZ, United Kingdom
{patrick.ah-fat14, m.huth}@imperial.ac.uk
Keywords: Computational Privacy, Min-entropy, Combinatorics.
1 Introduction
Secure Multi-party Computation (SMC) is a domain of cryptography that aims at enabling several parties to compute a public function of their own private inputs, while keeping the inputs secret and without resorting to any trusted third party [1, 2, 3, 4, 5, 6]. Multi-party secure protocols typically require the parties to engage in a series of rounds of communication in order to exchange their information so as to be able to collaboratively compute the intended output. Such protocols provide the guarantee that none of the parties will be able to infer any information about the other parties’ input, other than the information conveyed by the public output itself.
Paradoxically, as a function of the inputs, the public output inevitably leaks some information about those private inputs. This leakage is considered as an inherent consequence of the primary objective of SMC: it is commonly qualified as the “acceptable leakage” and its study is thus largely ignored in the SMC literature [7, 8, 9, 10]. Recent works have been undertaken with the aim of quantifying such information flows [11, 12, 13]. By adapting techniques from Quantitative Information Flow (QIF) and applying concepts from Information Theory (IT) to the context of SMC, they introduce an attack model and a general notion of entropy that enable us not only to reason about the acceptable leakage in SMC, but also to construct bespoke privacy-enhancing mechanisms aimed at protecting the inputs’ secrecy. In this attack model, the entropy of a targeted input reflects the amount of information that is gained by an attacker once the output is revealed.
Although these techniques offer a rich framework designed for analysing information flows in SMC, their computation is essentially combinatorial, and their application in practice is thus impeded by the poor scalability of this combinatorial computation. Indeed, in the general case, the time complexity of computing such entropy measures is quadratic in the product of the input sizes, making them inadequate for examining real-world applications of SMC that may involve large input spaces. We believe, however, that developing techniques that can perform such analyses efficiently would benefit and complement the extensive research [14, 15, 16, 17] that is being conducted on efficient SMC protocols: potential participants in an SMC would not only have efficient cryptographic protocols at their disposal, but could also effectively run privacy analyses in order to precisely estimate the risk that they would incur by entering the computation.
In this paper, our objective is to focus our efforts on a particular class of functions for which we further investigate those analyses in order to make them applicable to arbitrarily large input spaces. More precisely, we focus on secure three-party computations, and we study the class of functions that are affine in the target’s and the spectator’s inputs, while the amount of information that an attacker gains on a targeted input will be measured by means of conditional min-entropy. In this setting, the main contribution of this work is to reduce the combinatorial essence of this information measure to a closed-form expression that has time complexity constant in the input sizes, and logarithmic in the coefficients of the affine function. More specifically, we show that under uniform prior beliefs, the conditional min-entropy can be reduced to a simple function of the size of the output domain, for which we then derive an explicit expression. Finally, as this reduction is valid under uniform prior beliefs on the inputs, we also exhibit some explicit bounds for this information measure in the presence of non-uniform prior beliefs.
Outline of Paper. We present an intuitive overview of our main contributions and of the key technical aspects of our work in Section 2. We discuss some related works in Section 3. The mathematical formalisation required for analysing information flows in secure three-party affine computations is introduced in Section 4. In Section 5, we show that the information gained by an attacker under uniform prior beliefs is entirely determined by the size of the output domain, for which we derive a closed-form expression. Explicit bounds for the information flow under non-uniform prior beliefs are presented in Section 6. We illustrate those theoretical results in Section 7 and Section 8 concludes the paper.
Notations. Let be a discrete set. We denote by the cardinality of set . Let be the set of all probability distributions whose support is contained in . Throughout, we present distributions as Python dictionaries with domain values as keys and associated probabilities as values. For example, represents the uniform distribution over . For any integers and , we will write for the set of consecutive integers ranging from to , namely . The greatest common divisor of and will be denoted as . The fact that two integers and have same residue modulo another integer will be denoted as . Given random variable and value , the event “” will be abbreviated by “” when there is no ambiguity, and its probability will be denoted by . Similarly, we will abbreviate by when the domain is obvious from context. Finally, the logarithm in base will be denoted as .
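As a small illustration of the dictionary convention for distributions and of integer intervals, the sketch below builds a uniform prior over a range of consecutive integers. The helper names `interval` and `uniform` and the example range 0 to 3 are hypothetical choices made for this illustration only.

```python
from fractions import Fraction


def interval(lo, hi):
    """Consecutive integers lo, lo + 1, ..., hi (both ends included)."""
    return range(lo, hi + 1)


def uniform(domain):
    """Uniform distribution over a finite domain, as {value: probability}."""
    values = list(domain)
    return {v: Fraction(1, len(values)) for v in values}


# Hypothetical example: a uniform prior over the integers 0, 1, 2, 3.
prior = uniform(interval(0, 3))
```

Exact fractions are used here so that probabilities sum to exactly one; floats would work equally well for the numerical experiments.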
2 Methodology
In this section, we present an overview of our main contributions and we highlight the key technical components of our work intuitively. Although the aim of this section is to illustrate and summarise our results, the detailed and rigorous approach is developed in Sections 4, 5 and 6. This work is motivated by Secure Multi-party Computation, which requires all the manipulated values to belong to finite spaces. Thus, we will focus on integer values ranged in finite intervals.
The main contribution of this work is to introduce an efficient and scalable way of quantifying the acceptable leakage in three-party affine computations. To this effect, we consider the secure computation of a public function performed on three private inputs , and . We wish to quantify the amount of information that an attacker, who controls or is able to eavesdrop on the value of , would gain on input once the output of is revealed. We focus on the functions whose output can, once the input controlled by the attacker is fixed, be expressed as a function of and , in its simplest form, as where and are constant integers. We quantify the information gained by the attacker from this computation by , the min-entropy of input given output , considered as random variables. When inputs and are considered as random variables uniformly distributed on some intervals, we show in Section 5.1 that this entropy can be reduced to an explicit formula involving , the number of possible values that output can take. The main difficulty now resides in deriving a closed-form formula for , which is the focus of Section 5.2 and for which we now sketch an intuitive explanation. Given that and are uniformly distributed on respective intervals and , we show that those intervals can be assumed to be of the form and respectively, where and are positive integers. We also show that a simple transformation enables us to assume that constants and are positive and coprime. The number of outputs can now be expressed as the following cardinal where we define set as:
[TABLE]
For the sake of our later explanation, we will define for all in , the set as:
[TABLE]
so that set can now be expressed as:
[TABLE]
In order to illustrate our method and theorems for computing the cardinal , we will construct some graphical representations of set under different configurations in the following examples.
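For small parameters, a grid of this kind can be produced by brute force. The sketch below assumes the output has the form a*y + b*z with y ranging over 0..m and z over 0..n, and indexes rows by z; the symbol names, the row orientation and the sample values (a, b, m, n) = (2, 3, 3, 2) are assumptions made for illustration, not the values used in the figures.

```python
def rows(a, b, m, n):
    """Map each row index z in 0..n to the set {a*y + b*z : y in 0..m}."""
    return {z: {a * y + b * z for y in range(m + 1)} for z in range(n + 1)}


def output_set(a, b, m, n):
    """Union of all rows: every value that a*y + b*z can take."""
    return set().union(*rows(a, b, m, n).values())


def render(a, b, m, n):
    """Text grid: one string per row, 'x' marks a feasible output column."""
    width = a * m + b * n + 1
    grid = rows(a, b, m, n)
    return [
        "".join("x" if o in grid[z] else "." for o in range(width))
        for z in range(n + 1)
    ]


# Hypothetical parameters, chosen only to keep the grid small.
for line in render(2, 3, 3, 2):
    print(line)
```

A column belongs to the output set exactly when at least one printed row has an `x` in it, which is the projection described above.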
Example 1.
Let , , and . The graphical representation of the corresponding set is shown in Figure 1. The -axis corresponds to the values of , which is ranged in , while the -axis corresponds to the possible values that the output can take. For each row, indexed by , we mark by a cross the possible values that the output can take. In other words, each row will represent the elements contained in set . As is defined as the union of all the , the set corresponds to the projection of all the crosses onto the -axis. In other words, value belongs to set if and only if there is at least one cross in column .
In order to tally the number of feasible outputs, we will highlight some intuitive results, which we will formalise and prove in Section 5.2. We first notice that the number of outputs is upper bounded by , and may be strictly lower than this bound since one column may contain several crosses. We will refer to such columns containing more than one cross as intersections. We make the following observations:
1. The “first” intersection occurs at column and is highlighted in red in Figure 1. In other words, the lowest output whose column contains at least two crosses is .
2. We indicate in blue the first cross of the last row, indexed at column , and in orange the last cross of the first row, indexed at column . We notice that the first intersection can occur if and only if both of those crosses, highlighted in blue and orange, do not stand before the column highlighted in red. In other words, the first intersection can occur if and only if and , i.e. if and , which we claim in Lemma 3. If one of those conditions is not satisfied, then there is no intersection, and the number of outputs is , which we claim in Corollary 1 and illustrate in the next example.
3. Set is “symmetrical”, i.e. for all output in , we have:
[TABLE]
where is the largest output obtained for maximal values of and . This is proved in Lemma 5.
4. Thus, the last intersection occurs at column and is highlighted in purple. Together with observation 1, this constitutes the content of Lemma 4. Moreover, by symmetry, there is the same number of outputs contained in and , as claimed in Corollary 2.
5. As there is no intersection before the red column , the number of outputs contained in can be obtained by summing the number of elements of all contained in this interval. More formally, we have:
[TABLE]
This corresponds to the total number of crosses that stand before the column highlighted in red. We develop its computation in Theorem 1.
6. Finally, we observe that all the columns lying between the red and purple ones, i.e. ranging in the interval , contain at least one cross. This result is formalised in Theorem 2 and implies that:
[TABLE]
In order to prove that all such columns contain at least one cross, we make the following reasoning.
(a) Two sets whose indices are separated by a multiple of will only contain some outputs that have the same residue modulo . More precisely, if and are both congruent to some modulo , then the elements of and will be congruent to modulo . We illustrate this fact in Figure 1, where we color in green the elements of sets and . We can notice that all those elements are congruent to modulo .
(b) For all in , let us define as the union of all the whose index is congruent to modulo :
[TABLE]
For example, can be represented as the projection of all the green crosses on the -axis. Then, we can see that each includes all the outputs that are ranged between the red and purple columns and that are congruent to modulo . This observation is formalised in Lemmas 6 and 7. We can indeed see in the figure that . We notice that this may not be the case outside of the domain delimited by the red and purple columns as for example .
(c) Finally, as formally explained in Theorem 2, we claim that for all output ranged between the red and purple columns, there exists a in such that . Indeed, if we denote by the residue of modulo , it suffices to choose to ensure that , which then implies . While refers to the inverse of modulo , such an operation is allowed since and are coprime. For example, output has residue modulo . In this case, we can choose and we can verify that . This concludes our intuition and ensures that every column ranged between the red and purple ones will contain at least one cross.
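The observations above can be checked numerically on a small instance. The sketch below assumes outputs of the form a*y + b*z with y in 0..m, z in 0..n and coprime positive a, b; under that reading, the first intersection is expected at column a*b and the last one at a*m + b*n - a*b. The symbol names and the sample parameters (3, 4, 6, 5) are illustrative assumptions.

```python
from collections import Counter


def column_counts(a, b, m, n):
    """Number of crosses in each column, i.e. pairs (y, z) per output value."""
    return Counter(a * y + b * z for y in range(m + 1) for z in range(n + 1))


def check_observations(a, b, m, n):
    """Brute-force summary of the intersection, symmetry and coverage claims."""
    counts = column_counts(a, b, m, n)
    top = a * m + b * n  # largest output, reached for y = m and z = n
    intersections = sorted(o for o, c in counts.items() if c >= 2)
    return {
        # lowest and highest columns holding at least two crosses (None if none)
        "first_intersection": intersections[0] if intersections else None,
        "last_intersection": intersections[-1] if intersections else None,
        # o is feasible exactly when top - o is feasible
        "symmetric": all((top - o) in counts for o in counts),
        # every column from a*b to top - a*b holds at least one cross
        "middle_full": all(o in counts for o in range(a * b, top - a * b + 1)),
    }


# Hypothetical parameters with m >= b and n >= a, so intersections exist.
print(check_observations(3, 4, 6, 5))
```

When m is smaller than b (or n smaller than a), the same function reports no intersection at all, which corresponds to the injective case.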
Finally, Theorem 4 and Corollary 4 derive some lower and upper bounds for when prior beliefs on the inputs are not uniform.
Example 2.
Let , , and . In the graphical representation of the corresponding set that is displayed in Figure 2, we can notice that the blue cell representing appears before the red column , meaning that the condition is not satisfied. This implies that there is no intersection in this setting and that .
3 Related Works
In this section, we discuss related work that constitutes the foundations and the motivations of our present work.
3.1 Secure Multi-party Computation
Secure Multi-party Computation [1, 2, 3, 4, 5, 6] is a domain of cryptography that provides advanced protocols which enable several participants to compute a public function of their own private inputs without having to rely on any other trusted third party or any external authority. Those protocols enable the participants to compute a function in a decentralised manner, while ensuring that no information leaks about the private inputs, other than that which can be inferred from the public output. The so-called “acceptable leakage”, which is further studied in this paper, is the information that an attacker can infer about the other inputs given knowledge of the public output. Secure multi-party computation is not the only domain that is subject to an acceptable leakage. In particular, the results of our work are also applicable to other fields or scenarios that aim at protecting the inputs’ privacy and that involve the opening of a public output, such as outsourced computation, where a trusted third party is privately sent all the inputs and returns the public output as the only piece of information, or trusted computing, where the parties input their secret data into hardware security modules, which then ensure that no unintended information will be accessible to the other parties.
3.2 Quantitative Information Flow
The purpose of Quantitative Information Flow (QIF) [18, 19] is to provide frameworks and techniques based on information theory and probability theory for measuring the amount of information that leaks from a secret. Different mathematical concepts have emerged in order to convey varied and precise information about a secret: Shannon entropy [20] reflects the minimum number of binary questions required to recover a secret on average, while the min-entropy is an indicator of the probability to guess a secret in one try [21, 22, 18]. Richer measures such as Rényi entropy [23] and the -entropy [24] have been introduced in order to quantify some specific properties of a secret, and more general entropies have been proposed in order to unify those different concepts [12, 25]. In this work, we will measure the information gained by an attacker by means of min-entropy, which is used extensively in cryptography in order to quantify the vulnerability of a secret.
3.3 Differential Privacy
Differential Privacy (DP) [26, 27] formalises privacy concerns and introduces techniques that provide users of a database with the assurance that their personal details will not have a significant impact on the output of the queries performed on the database. More precisely, it proposes mechanisms which ensure that the outcome of the queries performed on two databases differing in at most one element will be statistically indistinguishable. Moreover, minimising the distortion of the outcome of the queries while ensuring privacy is an important trade-off that governs DP. Although DP is particularly well suited to guaranteeing privacy in statistical computations involving a large number of parties, its effectiveness diminishes when a small number of parties are involved in the computation. For example, in a two-party computation, a DP mechanism would ensure that the output would not be noticeably affected when half of the data is changed. In this case, the utility of the computed function is thus drastically hindered by the low number of parties. Unlike DP and other works that have been conducted on trading off privacy and utility in SMC, this work does not intend to enhance the inputs’ privacy. Instead, our objective is to propose an efficient method for quantifying the privacy risks that a certain kind of computation presents.
3.4 Information Flow in Secure Multi-party Computation
Recent works [11, 12, 13] have adapted techniques stemming from QIF to the setting of SMC in order to propose a model that allows us to reason about the acceptable leakage. In this model, the set of parties willing to compute a public function is partitioned into three sets: a set of attackers, a set of targets and a set of spectators, holding the respective input vectors , and . The attackers are those parties willing to share the value of their inputs and to take advantage of the public output of the computation in order to learn as much information as possible on their targets’ inputs, while the remaining parties are called spectators. From the point of view of the attackers, the inputs and are unknown values and are thus modelled as random variables and , further deemed to be independent since targets and spectators are supposed to be honest parties who provide their inputs without being influenced by any other information. The attackers’ prior belief on those inputs will represent the prior distributions and of those random variables. The output of the function is then also considered as a random variable defined as . The privacy of the targeted parties is then expressed as the conditional entropy of the targeted inputs given knowledge of the attackers’ inputs and the conditional knowledge of the output. The choice of the entropy measure depends on the users’ privacy concerns and is left general in [12] to this end. In this work and for clarity purposes, we will choose to convey the inputs’ privacy by means of min-entropy, as in many cryptographic scenarios, although our analyses can be adapted to more general entropy measures. Under this assumption, the privacy of the targeted parties becomes:
[TABLE]
by virtue of Bayes theorem. Moreover, as and are independent, we know that:
[TABLE]
If we denote by and the size of the domains of and respectively, we know that computing one has complexity and thus computing each has complexity . Moreover, in the worst case, i.e. if is injective, the output domain will have a size of , which yields an overall complexity in for the computation of . In conclusion, although recent works have introduced a framework for characterising and quantifying the acceptable leakage, its computation cost is quadratic in the product of the input sizes in general, which prevents those privacy analyses from being applicable in practice; this major complexity issue constitutes the focus of this paper.
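To make the cost of the general computation concrete, here is a minimal brute-force sketch of the conditional min-entropy of a target input given the output, for an arbitrary function once the attacker's input is fixed. It assumes independent priors given as dictionaries, base-2 logarithms, and the standard definition of conditional min-entropy as the negative logarithm of the sum over outputs of the largest joint probability; the function and parameter names are hypothetical.

```python
from collections import defaultdict
from math import log2


def conditional_min_entropy(f, prior_y, prior_z):
    """H_inf(Y | O) for O = f(Y, Z) with independent Y and Z (brute force)."""
    # joint[o][y] accumulates P(Y = y, O = o) over all spectator inputs z.
    joint = defaultdict(lambda: defaultdict(float))
    for y, py in prior_y.items():
        for z, pz in prior_z.items():
            joint[f(y, z)][y] += py * pz
    # Bayes vulnerability: for each output, the attacker bets on the best y.
    vulnerability = sum(max(per_y.values()) for per_y in joint.values())
    return -log2(vulnerability)


# Hypothetical example: sum of two uniform bits.
bits = {0: 0.5, 1: 0.5}
print(conditional_min_entropy(lambda y, z: y + z, bits, bits))
```

Even this single-pass variant has to visit every pair of target and spectator inputs, so its running time grows at least with the product of the two domain sizes, which is what becomes impractical for large input spaces.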
4 Information Flow Analysis in Secure Three-Party Affine Computations
Let us consider three parties , and holding the respective private inputs , and . Let be a public function of three variables. We assume that the parties wish to enter the secure computation of and that is attacking under spectator . From the point of view of attacker , although is a known and constant value, the inputs and appear as unknown values and will be modelled as random variables and . Parties and are supposed to be honest parties who will not collaborate. Thus, random variables and are deemed to be independent. We further assume that the target’s and spectator’s inputs are from finite intervals and :
[TABLE]
Their prior probability distributions and will represent the prior beliefs that may have on those values, such that and . We note that the absence of prior belief may be represented as uniform prior distributions. Finally, we assume that function is affine in the target’s and spectator’s inputs, i.e. that we can choose three constant integers , and so as to express the output of as:
[TABLE]
Note that constants , and may be functions of input , which is also considered as a constant. Admissible candidates for such affine functions can for example be defined as or .
Assumption 1.
From the attacker’s point of view, input is a known value and will thus be considered as a constant throughout this paper.
Thus, we may abuse notation by omitting the first argument of , and we refer to its output as:
[TABLE]
while we define the corresponding random variable for the output as . We also introduce the output domain as:
[TABLE]
By denoting the min-entropy by , the amount of information that the attacker gains on the targeted input once the output is revealed will be quantified by . Since the value of will also be considered as a public constant in the present privacy analyses, we will refer to this quantity as , which develops as:
[TABLE]
where the Bayes vulnerability of given is defined as:
[TABLE]
To conclude this section and in order to simplify the following development, we will examine the particular case when or is zero.
Lemma 1.
1. If then .
2. If and then .
Proof.
1. If then clearly no information about leaks from and thus . More formally, in this case, and are independent and thus Equation (4) becomes:
[TABLE]
2. If and then is entirely determined by given the relation and thus . More formally, for all in , there exists one in such that and thus Equation (4) becomes:
[TABLE]
∎
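Both cases of the lemma can be spelled out in one line each. The sketch below assumes the notation $O = aY + bZ + c$ for the output, $\pi_Y$ for the target's prior, and $V(Y \mid O) = \sum_o \max_y P(Y = y, O = o)$ for the Bayes vulnerability, with $H_\infty(Y \mid O) = -\log V(Y \mid O)$; these symbol names are illustrative assumptions.

```latex
% Case a = 0: O does not depend on Y, so P(Y = y, O = o) = \pi_Y(y) \, P(O = o).
V(Y \mid O) = \sum_{o} P(O = o) \, \max_{y} \pi_Y(y) = \max_{y} \pi_Y(y)
\quad\Longrightarrow\quad
H_\infty(Y \mid O) = H_\infty(Y).

% Case a \neq 0 and b = 0: each output o has a unique preimage y_o = (o - c)/a,
% and P(Y = y_o, O = o) = \pi_Y(y_o).
V(Y \mid O) = \sum_{o} \pi_Y(y_o) = \sum_{y} \pi_Y(y) = 1
\quad\Longrightarrow\quad
H_\infty(Y \mid O) = 0.
```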
Assumption 2.
In the rest of the paper, we will assume that and are non-zero.
5 Privacy under uniform prior beliefs
5.1 Reducing the entropy expression
In this section, we study the case where the attacker has no prior belief on the target’s and spectator’s inputs, i.e. when and are uniform on and respectively. In other words, we assume that for all in and in , we have and . As is uniform, we have:
[TABLE]
However, by definition, we know that for all output in , there exists at least one pair in that satisfies . For all such pairs, as and are independent, we have:
[TABLE]
since for a given and , there is at most one that satisfies as is affine and is non-zero. Consequently, since is uniform, and thus:
[TABLE]
where denotes the cardinal of . Our aim will now be to compute .
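The reduction can be summarised as a short derivation. The sketch below assumes the notation $D_Y$, $D_Z$ and $D_O$ for the target, spectator and output domains, $\pi_Y$ for the target's prior, and an output of the form $ay + bz + c$ with $b \neq 0$; these symbol names are illustrative assumptions.

```latex
% Uniform priors: \pi_Y(y) = 1/|D_Y| and \pi_Z(z) = 1/|D_Z|.
% For a feasible output o and a fixed y, at most one z satisfies a y + b z + c = o,
% so P(O = o \mid Y = y) \in \{0,\ 1/|D_Z|\}, and 1/|D_Z| is reached for some y.
V(Y \mid O)
  = \sum_{o \in D_O} \max_{y \in D_Y} \pi_Y(y) \, P(O = o \mid Y = y)
  = \sum_{o \in D_O} \frac{1}{|D_Y| \, |D_Z|}
  = \frac{|D_O|}{|D_Y| \, |D_Z|},
\qquad
H_\infty(Y \mid O) = -\log V(Y \mid O) = \log \frac{|D_Y| \, |D_Z|}{|D_O|}.
```

Under these assumptions the only quantity left to compute is the number of distinct outputs.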
We mention four simplifications before analysing this problem in more detail.
Assumption 3.
1. We first notice that deducting constant from the output of does not affect the number of different outputs, which enables us to simplify as .
2. Now, let us assume that interval is of the form . By substituting variable to variable , we can rewrite the expression of as . The new variable is ranged in and we can again deduct constant from the output. We can perform the same reasoning with variable , which enables us to assume without loss of generality that inputs and belong to some intervals of the form , .
3. We now show that integers and can be assumed to be positive without loss of generality. If both and are negative, then we can equivalently compute the number of outputs of function which has positive coefficients. If and , we can write as . However, the input space of variable is equal to that of , and we can thus equivalently study function whose coefficients are positive. Conversely, if and , we can again equivalently study , which is tackled in the previous case.
4. Let us denote by the greatest common divisor of and . We know that can be computed in . Function can be factorised as where and are coprime. However, functions and have the same number of outputs. We can thus assume that the coefficients of the affine function are coprime provided that we have computed their greatest common divisor.
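The four simplifications can be sketched as one normalisation step. The code below assumes an output of the form a*y + b*z + c with non-zero integers a and b and inputs in closed integer intervals given as (low, high) pairs; the function names and the sample values are hypothetical. A brute-force count is included only to confirm that the number of outputs is unchanged.

```python
from math import gcd


def normalise(a, b, y_range, z_range):
    """Reduce to coprime positive coefficients and intervals 0..m and 0..n."""
    if a == 0 or b == 0:
        raise ValueError("both coefficients are assumed to be non-zero")
    d = gcd(a, b)  # math.gcd ignores signs and returns a non-negative value
    (y_lo, y_hi), (z_lo, z_hi) = y_range, z_range
    # Dropping c, shifting intervals, flipping signs and dividing by d
    # all preserve the number of distinct outputs.
    return abs(a) // d, abs(b) // d, y_hi - y_lo, z_hi - z_lo


def count_outputs(a, b, c, y_range, z_range):
    """Brute-force number of distinct values of a*y + b*z + c."""
    (y_lo, y_hi), (z_lo, z_hi) = y_range, z_range
    return len({
        a * y + b * z + c
        for y in range(y_lo, y_hi + 1)
        for z in range(z_lo, z_hi + 1)
    })


# Hypothetical example: -6*y + 4*z + 7 with y in 2..5 and z in -1..3.
a1, b1, m, n = normalise(-6, 4, (2, 5), (-1, 3))
print(a1, b1, m, n)
```

The greatest common divisor is the only step whose cost depends on the magnitude of the coefficients.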
5.2 Measuring the size of the output domain
For the sake of clarity, this technical subsection will be developed so as to be self-contained.
Let , be two non-negative integers and and be two positive integers. Let us also assume that and are coprime. The aim is to calculate the cardinal of the set defined as follows:
[TABLE]
We can first notice that is positive and upper bounded by . The difficulty is that two different pairs and in can satisfy , and thus will often be lower than . We also notice that , and thus we also have .
Notations. For any real , the floor of will be denoted by while will denote its ceiling. The fact that two integers and have same residue modulo another integer will be denoted as . For all integers and , we will denote the equivalence class of modulo by .
Recall 1.
For all real numbers , we have .
Proof.
Let be a real number. We have:
[TABLE]
and so:
[TABLE]
and thus, as is integral:
[TABLE]
∎
Recall 2.
Let and be two coprime natural numbers. We have:
[TABLE]
Proof.
If then the result is immediate since the sum adds up to 0. Let us now assume that . We can notice that we have:
[TABLE]
and thus:
[TABLE]
However, because and are coprime we know that for all in , we have . As , we thus know that for all in , we have and thus:
[TABLE]
and thus Equation (4) becomes:
[TABLE]
∎
Recall 3.
Let and be two coprime natural numbers. We have:
[TABLE]
Proof.
Let and be in . We have:
[TABLE]
since and are coprime. Thus, for all distinct and in , we have and thus and therefore:
[TABLE]
∎
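Two classical facts about coprime integers that are consistent with the proof sketches above can be checked numerically: the sum of the floors of a*i/b over i = 0..b-1 equals (a-1)(b-1)/2, and multiplication by a permutes the residues modulo b. Whether these are stated in exactly this form here is an assumption, as are the names a and b used in the sketch.

```python
from math import gcd


def floor_sum(a, b):
    """Sum of floor(a*i / b) for i = 0, 1, ..., b - 1."""
    return sum((a * i) // b for i in range(b))


def residues(a, b):
    """Sorted residues of a*i modulo b for i = 0, 1, ..., b - 1."""
    return sorted((a * i) % b for i in range(b))


# Check both facts on every coprime pair with small entries.
for a in range(1, 13):
    for b in range(1, 13):
        if gcd(a, b) == 1:
            assert floor_sum(a, b) == (a - 1) * (b - 1) // 2
            assert residues(a, b) == list(range(b))
```

The floor-sum identity follows from pairing index i with b - i, and the permutation property from the invertibility of a modulo b.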
Lemma 2.
Let and be in . We have:
[TABLE]
Proof.
Let and be in . As and are coprime, we have:
[TABLE]
∎
Lemma 3.
Let and be two distinct pairs in . We have:
[TABLE]
Proof.
Let and be two distinct pairs in such that:
[TABLE]
By virtue of Lemma 2, we can take in such that:
[TABLE]
As and are different, we further know that is different from 0. This implies that:
[TABLE]
But as and belong to and and belong to , we also know that:
[TABLE]
and thus:
[TABLE]
∎
Corollary 1.
If , then .
Proof.
Let us assume that . Let us define the function as with domain . By virtue of Lemma 3, we know that the function is injective. Thus, we have:
[TABLE]
∎
Assumption 4.
In the remainder of this section, we will assume that .
Lemma 4.
Let be in . Let and be two distinct pairs in . We have:
[TABLE]
Proof.
Let be in . Let and be two distinct pairs in such that:
[TABLE]
By virtue of Lemma 2, we can take in such that:
[TABLE]
We further know that both pairs are distinct and we can thus choose different from 0. Without loss of generality, let us assume that where refers to the lexicographic order on integer pairs. In other words, let us assume that .
We know that and Equation (8) ensures that since . As , we thus have .
Conversely, we know that and Equation (8) ensures that since . As , we thus have:
[TABLE]
∎
Theorem 1.
We have:
[TABLE]
Proof.
By virtue of Lemma 4, we know that for all in , and for all pairs and in , we have:
[TABLE]
Thus:
[TABLE]
since for all in when , and since . Moreover, since , for all in , we have:
[TABLE]
by virtue of Recall 1. We thus have:
[TABLE]
by virtue of Recall 2 since and are coprime. ∎
Lemma 5.
Let be in . We have:
[TABLE]
Proof.
Let be in . We know that and are coprime, so we can take two integers and in such that:
[TABLE]
and we have:
[TABLE]
Now, we have:
[TABLE]
and thus:
[TABLE]
∎
Corollary 2.
We have:
[TABLE]
Proof.
This is an immediate consequence of Theorem 1 and Lemma 5. ∎
Definition 1.
1. For all in , we define:
[TABLE]
so that .
2. For all in , we define:
[TABLE]
so that can be rewritten:
[TABLE]
Lemma 6.
For all in , we have:
[TABLE]
Proof.
Let be in . For all in , let us define the predicate as follows:
[TABLE]
and let us prove by induction that holds for all in .
By definition, we have:
[TABLE]
and thus holds.
Let be in and let us assume that holds.
By definition, we have:
[TABLE]
This equation, combined with the assumption that holds, yields us:
[TABLE]
However, as , we know that:
[TABLE]
And so Equation (14) becomes:
[TABLE]
which means that holds, which enables us to conclude the induction.
∎
Lemma 7.
For all in , we have:
[TABLE]
Proof.
Let be in . Lemma 6 ensures that we have:
[TABLE]
First, we know that . Let us now define as the following statement:
[TABLE]
We have the following equivalences:
[TABLE]
By definition of the floor function, Equation (17) holds and by equivalence, Equation (16) thus also holds, and so:
[TABLE]
and intersecting with yields us the expected result. ∎
Theorem 2.
We have:
[TABLE]
Proof.
By definition, we have:
[TABLE]
Lemma 7 thus implies:
[TABLE]
But as and are coprime, Recall 3 ensures that:
[TABLE]
which concludes the proof of the Theorem. ∎
Theorem 3.
We have:
[TABLE]
Proof.
Let us define the intervals and as follows:
[TABLE]
We notice that forms a partition of and thus we can partition into , which enables us to express the cardinal of as the following sum:
[TABLE]
We already know by Theorem 1 and Corollary 2 that:
[TABLE]
Moreover, Theorem 2 ensures that:
[TABLE]
and thus:
[TABLE]
and finally Equation (20) becomes:
[TABLE]
∎
The following corollary synthesises the previous results and realises one of our main objectives: a closed-form expression for under uniform prior beliefs.
Corollary 3.
We consider a function defined as with , and being three constant integer values, with non-zero and . We assume that and are ranged in the respective intervals and of size and respectively, and we assume that and are uniformly distributed on those intervals. Let be the greatest common divisor of and , and let us define and where here represents the absolute value of integer . Then, we have:
[TABLE]
Proof.
This is an immediate consequence of Theorem 3 and Equation (3). ∎
This gives us a method for quantifying the information that the public output of an SMC leaks about a targeted party under uniform prior beliefs. Its running time is constant in the size of the inputs and logarithmic in the coefficients of the affine function, the latter being due to the greatest common divisor operation. In the next section, we show how we can reason about when an attacker has some prior beliefs about and .
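The contrast between enumeration and a constant-time computation can be illustrated with a minimal sketch. It does not reproduce the expression of Corollary 3. It assumes the usual definition of conditional min-entropy (the negative logarithm of the sum, over outputs, of the largest joint probability), inputs indexed from 0 over domains of sizes `m` and `n`, and two non-zero coefficients. The counting shortcut in `count_outputs_fast` is an illustration introduced here and is only checked against brute force; all function names are hypothetical.

```python
from fractions import Fraction
from math import gcd, log2


def outputs_brute_force(a, b, c, m, n):
    """Distinct values of a*x + b*y + c for 0 <= x < m and 0 <= y < n."""
    return {a * x + b * y + c for x in range(m) for y in range(n)}


def min_entropy_from_definition(a, b, c, m, n):
    """-log2 of sum_o max_x p(x, o), with x and y uniform (cost O(m * n))."""
    p_pair = Fraction(1, m * n)
    best = {}  # output o -> largest p(x, o) seen so far
    for x in range(m):
        joint = {}  # output o -> p(x, o) for this particular x
        for y in range(n):
            o = a * x + b * y + c
            joint[o] = joint.get(o, 0) + p_pair
        for o, p in joint.items():
            best[o] = max(best.get(o, 0), p)
    return -log2(sum(best.values()))


def count_outputs_fast(a, b, m, n):
    """Number of distinct outputs without enumeration (illustrative shortcut).

    Two input pairs collide exactly when they differ by a multiple of
    (b', -a'), where a' = |a| / gcd and b' = |b| / gcd. Each output is
    therefore a chain of pairs, and the number of chains equals the number
    of pairs minus the number of pairs whose successor is still in range.
    """
    d = gcd(a, b)
    a_red, b_red = abs(a) // d, abs(b) // d
    return m * n - max(0, m - b_red) * max(0, n - a_red)


def min_entropy_fast(a, b, m, n):
    """Under uniform priors every reachable (x, o) has probability 1/(m*n)."""
    return log2(m * n) - log2(count_outputs_fast(a, b, m, n))
```

The brute-force functions become impractical as `m * n` grows, whereas `min_entropy_fast(2, 3, 10**9, 10**9)` only needs one greatest common divisor and a few arithmetic operations.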
6 Privacy bounds under non-uniform prior beliefs
In this section, we present some lower and upper bounds for under non-uniform prior beliefs on the target’s and the spectator’s input. The following theorem first imposes a lower bound.
Theorem 4.
We have:
[TABLE]
with equality when and have uniform prior distributions.
Proof.
We denote by the number of possible outputs, for which we recall that Theorem 3 yields an explicit formula. By comparison between the -norm and the infinity-norm, we have:
[TABLE]
But we know that:
[TABLE]
and thus:
[TABLE]
Taking the negative logarithm concludes the proof, and we can verify, with our bespoke explicit formula from Equation (3), that we indeed have equality when and are uniformly distributed since then and . ∎
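The lower bound can be checked numerically on small domains. The sketch below assumes that the bound has the form "prior min-entropy of the target, plus prior min-entropy of the spectator, minus the logarithm of the number of possible outputs", which is what the norm comparison in the proof suggests; the exact statement of the theorem is not reproduced here. Inputs are assumed to be indexed from 0, and `random_prior`, `cond_min_entropy` and `lower_bound` are hypothetical helper names.

```python
from math import log2
import random


def random_prior(size, rng):
    """A strictly positive random probability vector of the given length."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def cond_min_entropy(a, b, c, px, py):
    """Return (-log2 sum_o max_x p(x, o), number of possible outputs)."""
    best = {}  # output o -> largest p(x, o) seen so far
    for x, p_x in enumerate(px):
        joint = {}  # output o -> p(x, o) for this particular x
        for y, p_y in enumerate(py):
            o = a * x + b * y + c
            joint[o] = joint.get(o, 0.0) + p_x * p_y
        for o, p in joint.items():
            best[o] = max(best.get(o, 0.0), p)
    return -log2(sum(best.values())), len(best)


def lower_bound(px, py, num_outputs):
    """Assumed form of the bound: H_inf(X) + H_inf(Y) - log2 |O|."""
    return -log2(max(px)) - log2(max(py)) - log2(num_outputs)
```

Drawing many priors with `random_prior` and comparing `lower_bound` with the first component of `cond_min_entropy` gives a quick sanity check, with the two values coinciding for uniform priors.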
We will now study some upper bounds for . It is a known result on the min-entropy that , i.e. that knowledge of the public output cannot increase the targeted input’s entropy. We will now prove that , i.e. the remaining entropy of given knowledge of cannot be larger than the prior entropy of the spectator’s input . To this end, we first state in the next theorem that an attacker eavesdropping on the value of and learning the public output will gain the same amount of information about the targeted input as about the spectator’s input .
Theorem 5.
We have:
[TABLE]
Proof.
We have:
[TABLE]
For all in , we define the set of pairs that result in output as:
[TABLE]
We define its projections and on its first and second components respectively as follows:
[TABLE]
Moreover, for all in and in , we know that is non zero only if there exists a in such that is in . Thus:
[TABLE]
Now, for all in and in , there exists a unique in such that , determined by and we have . Thus:
[TABLE]
Conversely, for all in and in , there exists a unique satisfying and we have . Thus:
[TABLE]
And finally for all in , we know that can only be non zero if is in and thus:
[TABLE]
∎
Corollary 4.
We have:
[TABLE]
Proof.
We have and Theorem 5 concludes the proof. ∎
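The symmetry between the targeted and the spectator's input, and the resulting upper bound, can be observed on a small instance. This is a minimal sketch assuming independent priors given as lists indexed from 0, an affine output `a*x + b*y + c` with both coefficients non-zero, and the usual definition of conditional min-entropy; `posterior_min_entropies` is a hypothetical name.

```python
from math import log2


def posterior_min_entropies(a, b, c, px, py):
    """Return the conditional min-entropies of x and of y given the output."""
    joint_x, joint_y = {}, {}  # (x, o) -> p(x, o) and (y, o) -> p(y, o)
    for x, p_x in enumerate(px):
        for y, p_y in enumerate(py):
            o = a * x + b * y + c
            joint_x[(x, o)] = joint_x.get((x, o), 0.0) + p_x * p_y
            joint_y[(y, o)] = joint_y.get((y, o), 0.0) + p_x * p_y

    def entropy(joint):
        best = {}  # output o -> largest joint probability for that output
        for (_, o), p in joint.items():
            best[o] = max(best.get(o, 0.0), p)
        return -log2(sum(best.values()))

    return entropy(joint_x), entropy(joint_y)


# Example priors chosen arbitrarily for illustration.
px = [0.5, 0.1, 0.1, 0.3]
py = [0.2, 0.7, 0.1]
h_x_given_o, h_y_given_o = posterior_min_entropies(2, 3, 0, px, py)
```

For each output, the pairs that produce it determine one another, so the largest joint probability is the same whether it is indexed by the first or the second component. This is why the two returned values agree, and why neither exceeds the prior min-entropy of either input.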
7 Examples
In this section, we illustrate the theoretical results previously obtained. We begin this section by presenting an example which deepens our understanding of the behaviour of under non-uniform prior beliefs. We could intuitively posit that is maximal when the prior distributions for and are uniform. However, we refute this hypothesis in the following example.
Example 3.
Let us consider the function , and let us assume that is ranged in and is ranged in . Let us consider and the uniform distributions for and on their respective domains. We also consider the following particular distribution .
Then, when and respectively follow the prior distributions and , we have . On the other hand, when and respectively follow the prior distributions and , we get .
We thus have which contradicts the intuitive hypothesis.
The next example presents a use case of Corollary 3.
Example 4.
Let us consider three parties , and holding respective private inputs , and and willing to enter the secure computation of a public function defined as . We suppose that party is attacking input under spectator . We notice that when input is fixed, the function is affine in and and we can thus apply our privacy analysis. We assume that and are ranged in the input domain and we assume that ’s prior beliefs and on those inputs are uniform over . We plot in Figure 3 the values of computed via Corollary 3, for the values of input ranged in . Note that although a small interval for the values of has been chosen for readability purposes, entropy can be computed for any value of . An attacker who is willing to lie about his honest and intended input in order to learn as much information as possible about his targeted input would have more incentive to enter some value that produces low entropy. For example, he would rather enter value than . Conversely, targeted parties could consider such information so as to evaluate the risk that they would face by entering the computation in the worst case, or on average.
Example 5.
In order to evaluate the effectiveness of our approach, we repeated the operations of the previous example while letting the size of the input spaces and vary, and compared the computational time that different methods require to perform such analyses. More precisely, we computed the values of in the same scenario as in Example 4, but we let inputs and be ranged in the intervals for different values of . We compared the time taken by the three following methods, which we display in Figure 4.
1. Naive method: We use the combinatorial formula given in Equation (1) that has complexity .
2. Simplified method: We use the simplified formula of Equation (3) for affine functions under uniform distributions, where is computed naively by enumerating the set of outputs, which yields complexity .
3. Explicit method: We use the result of Corollary 3 which provides a constant time formula.
The variables and represent the size of the input spaces (minus ) and are both set to for varying values of . We set a time limit of minutes and we mark by an infinity sign the computations that timed out. We can notice that both naive methods rapidly time out as the input space grows whereas our explicit formula enables us to perform privacy analyses in constant time for arbitrarily large input spaces such as the one performed in Example 4. Those computations have been performed on an Intel(R) Core(TM) i3-2350M CPU @ 2.30GHz, but are aimed at estimating the order of magnitude of those methods rather than precisely assessing them individually.
In the following example, we now illustrate the lower and upper bounds that have been derived for under non-uniform prior beliefs.
Example 6.
We consider the computation of whose simplification, once is fixed, is defined as . We assume that and are ranged in the domain .
We define a spiked distribution parametrised by a domain , a center and a weight as:
[TABLE]
In other words, distribution allocates a probability to the value and distributes the remaining probability uniformly amongst the other values of the domain .
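The spiked distribution described above can be sketched as a small helper. The function name `spiked` and the dictionary representation are assumptions made for illustration; the sketch also assumes that the domain holds at least two values, so that the remaining probability has somewhere to go.

```python
def spiked(domain, center, weight):
    """Put probability `weight` on `center`; share the rest uniformly.

    Returns a dict mapping each value of `domain` to its probability.
    """
    values = list(domain)
    if center not in values:
        raise ValueError("center must belong to the domain")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(values) < 2:
        raise ValueError("domain must contain at least two values")
    rest = (1.0 - weight) / (len(values) - 1)  # mass given to every other value
    return {v: (weight if v == center else rest) for v in values}
```

Choosing `weight` equal to one over the size of the domain recovers the uniform distribution, and `weight = 1` concentrates all the mass on `center`.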
We suppose that is uniformly distributed over and that follows distribution for different weights . We divide the interval into values. For each in those values, we compute the exact values of and that of its bounds derived in the previous section. The value of appears in blue in Figure 5. Its lower bound stemming from Theorem 4 is drawn in red and its upper bound derived from Corollary 4 is traced in green. Note that we considered small input spaces since is here calculated with a naive method, although its bounds can be computed efficiently for arbitrarily large input spaces.
8 Conclusion
Although extensive research in Secure Multi-party Computation has considerably improved the efficiency of cryptographic protocols, the quantification of the acceptable leakage is a problem that still requires deeper investigation. Indeed, the computational complexity of the recently introduced privacy analyses does not yet allow their application in practical situations that involve large input spaces. In this work, we focused our attention on secure three-party computations of affine functions. We have formally investigated the behaviour of the acceptable leakage under uniform prior beliefs in order to obtain an explicit formula for the min-entropy of the targeted input given conditional knowledge of the output. The calculation of this closed-form expression requires a computational time that is constant in the input sizes and logarithmic in the coefficients of the function, which enables the privacy analysis of such computations in practice. Finally, we have derived some theoretical bounds for this acceptable leakage when the input prior distributions are non-uniform in order to accommodate the potential prior beliefs that an attacker may have.
In the future, we would like to broaden our understanding of the acceptable leakage in more general settings. First, as our work is motivated by the privacy leaks that occur during SMC protocols, we tailored our analyses to finite input spaces. However, it would be interesting to adapt our model and to design some methods that can accommodate continuous input and output spaces. Moreover, although our current analysis considers the computation of affine functions for three parties, it would be of interest to explore the computation of affine functions for any number of parties.
We also intend to investigate more general functions that involve non-linear terms. It would be particularly interesting to study the composition of our analyses of affine functions in order to use them as building blocks for studying more complex functions. Finally, efficient and exact quantification of the acceptable leakage for general functions may be hard to obtain simultaneously, and we would thus also be interested in providing efficient methods for approximating the privacy of inputs in general scenarios.
