Non-Uniform Attacks Against Pseudoentropy
Krzysztof Pietrzak, Maciej Skorski

TL;DR
This paper extends known non-uniform distinguishability results from pseudorandom distributions to those with limited min-entropy, showing they are similarly vulnerable to efficient distinguishing circuits.
Contribution
It generalizes previous results to distributions with bounded min-entropy, demonstrating they can be distinguished from higher min-entropy distributions with comparable circuit complexity.
Findings
Distributions with less than k bits of min-entropy can be distinguished from those with k bits of δ-smooth min-entropy, with advantage ε, using circuits of size O(2^k ε^2/δ^2).
Distributions supported on at most 2^k elements can be distinguished from distributions with min-entropy k+1, with advantage ε, using circuits of size O(2^k ε^2).
Pseudoentropy distributions are vulnerable to the same non-uniform attacks as pseudorandom distributions.
Copyright: Krzysztof Pietrzak and Maciej Skorski.
Published in: 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017), July 10–14, 2017, Warsaw, Poland. Editors: Ioannis Chatzigiannakis, Piotr Indyk, Fabian Kuhn, and Anca Muscholl. Series volume 80.
The full version is available at https://arxiv.org/abs/1704.08678
Krzysztof Pietrzak, IST Austria. Supported by the European Research Council, ERC consolidator grant (682815 - TOCNeT).
Maciej Skorski, IST Austria. Supported by the European Research Council, ERC consolidator grant (682815 - TOCNeT).
Abstract.
De, Trevisan and Tulsiani [CRYPTO 2010] show that every distribution over $n$-bit strings which has constant statistical distance to uniform (e.g., the output of a pseudorandom generator mapping $n-1$ to $n$ bit strings), can be distinguished from the uniform distribution with advantage $\epsilon$ by a circuit of size $O(2^n\epsilon^2)$.
We generalize this result, showing that a distribution which has less than $k$ bits of min-entropy, can be distinguished from any distribution with $k$ bits of $\delta$-smooth min-entropy with advantage $\epsilon$ by a circuit of size $O(2^k\epsilon^2/\delta^2)$. As a special case, this implies that any distribution with support at most $2^k$ (e.g., the output of a pseudoentropy generator mapping $k$ to $n$ bit strings) can be distinguished from any given distribution with min-entropy $k+1$ with advantage $\epsilon$ by a circuit of size $O(2^k\epsilon^2)$.
Our result thus shows that pseudoentropy distributions face basically the same non-uniform attacks as pseudorandom distributions.
Key words and phrases:
pseudoentropy, non-uniform attacks
1991 Mathematics Subject Classification:
F.1.3 Complexity Measures and Classes
1. Introduction
De, Trevisan and Tulsiani [2] show a non-uniform attack against any pseudorandom generator (PRG) which maps $\{0,1\}^{n-1} \to \{0,1\}^{n}$. For any $\epsilon > 0$, their attack achieves distinguishing advantage $\epsilon$ and can be realized by a circuit of size $O(2^n\epsilon^2)$. Their attack doesn’t even need the PRG to be efficiently computable.
In this work we consider a more general question, where we ask for attacks distinguishing a distribution from any distribution with slightly higher min-entropy. We generalize [2], showing a non-uniform attack which, for any $\epsilon, \delta > 0$, distinguishes any distribution with at least $k$ bits of min-entropy from any distribution with fewer than $k$ bits of $\delta$-smooth min-entropy with advantage $\epsilon$, and where the distinguisher is of size $O(2^k\epsilon^2/\delta^2)$. As a corollary we recover the [2] result, showing that the output of any pseudoentropy generator $\{0,1\}^k \to \{0,1\}^n$ can be distinguished from any variable with min-entropy $k+1$ with advantage $\epsilon$ by circuits of size $O(2^k\epsilon^2)$.
- From a theoretical perspective, we prove where the separation between pseudoentropy and smooth min-entropy lies, by classifying how powerful computationally bounded adversaries can be so they can still be fooled to “see” more entropy than there really is.
- From a more practical perspective, our result shows that using pseudoentropy instead of pseudorandomness (which for many applications is sufficient and allows for saving in entropy quantity [3]), will not give improvements in terms of quality (i.e., the size and advantage of distinguishers considered), at least not against generic non-uniform attacks.
1.1. Notation and Basic Definitions
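Two variables $X$ and $Y$ are $(s,\epsilon)$-indistinguishable, denoted $X \sim_{s,\epsilon} Y$, if for all boolean circuits $D$ of size $s$ we have $|\Pr[D(X)=1] - \Pr[D(Y)=1]| \le \epsilon$. The statistical distance of $X$ and $Y$ is $d_1(X;Y) = \frac{1}{2}\sum_x |P_X(x) - P_Y(x)|$ (where $P_X(x) = \Pr[X=x]$); the Euclidean distance of $X$ and $Y$ is $d_2(X;Y) = \sqrt{\sum_x (P_X(x) - P_Y(x))^2}$. A variable $X$ has min-entropy $k$ if it doesn’t take any particular outcome with probability greater than $2^{-k}$; it has $\delta$-smooth min-entropy $k$ [6] if it is $\delta$-close to some distribution with min-entropy $k$. $X$ has $k$ bits of HILL pseudoentropy of quality $(s,\epsilon)$ if there exists a $Y$ with min-entropy $k$ that is $(s,\epsilon)$-indistinguishable from $X$. We use the following standard notation for these notions:
min-entropy: $H_\infty(X) \ge k \iff \forall x:\ P_X(x) \le 2^{-k}$
smooth min-entropy: $H_\infty^{\delta}(X) \ge k \iff \exists Y:\ H_\infty(Y) \ge k,\ d_1(X;Y) \le \delta$
HILL pseudoentropy: $H^{\mathrm{HILL}}_{s,\epsilon}(X) \ge k \iff \exists Y:\ H_\infty(Y) \ge k,\ X \sim_{s,\epsilon} Y$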
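The information-theoretic notions above are easy to compute for small explicit distributions. The following is a minimal sketch, assuming distributions are given as Python dictionaries mapping outcomes to probabilities and assuming the convention that statistical distance is half the L1 distance; the function names are illustrative and do not come from the paper.

```python
from math import log2, sqrt

def min_entropy(p):
    """Min-entropy in bits: -log2 of the largest point probability."""
    return -log2(max(p.values()))

def statistical_distance(p, q):
    """Half the L1 distance between two probability mass functions (assumed convention)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in keys)

def euclidean_distance(p, q):
    """L2 distance between two probability mass functions."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))

# Toy example: X is flat on 4 of the 8 three-bit strings, Y is uniform on all 8.
X = {x: 0.25 for x in range(4)}
Y = {x: 0.125 for x in range(8)}
```

In this toy example `X` has 2 bits of min-entropy and `Y` has 3, and the two distances can be read off directly from the helper functions.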
1.2. Our Contribution
In this work we give generic non-uniform attacks on pseudoentropy distributions. A seemingly natural goal is to consider a distribution $X$ with $k$ bits of min-entropy and strictly more than $k$ bits of HILL entropy of quality $(s,\epsilon)$, and then give an upper bound on $s$ in terms of $\epsilon$. This does not work as there are $X$ where $H_\infty^{\delta}(X) \gg H_\infty(X)$ (consider an $X$ which is basically uniform over $\{0,1\}^n$, but has mass $\delta$ on one particular point; then $H_\infty(X) = \log(1/\delta)$ while $H_\infty^{\delta}(X) \approx n$), and as by definition $H^{\mathrm{HILL}}_{\infty,\delta}(X) = H_\infty^{\delta}(X)$, we can have a large entropy gap even when considering unbounded adversaries against HILL entropy. For this reason, in our main technical result Lemma 1 below, we must consider distributions with bounded smooth min-entropy. This makes the statement of the lemma somewhat technical. In practice, the distributions considered often have bounded support, for example because they were generated from a short seed by a deterministic process (like a pseudorandom generator). In this case we can drop the smoothness requirement as stated in Theorem 1.1 below.
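**Lemma 1 (Nonuniform attacks against pseudoentropy).**
Suppose that $X \in \{0,1\}^n$ does not have $k$ bits of $\delta$-smooth min-entropy, i.e., $H_\infty^{\delta}(X) < k$; then for any $\epsilon$ we have
$$H^{\mathrm{HILL}}_{s,\epsilon}(X) < k \quad \text{where } s = \tilde{O}\left(2^k\epsilon^2\delta^{-2}\right),$$
where $\tilde{O}$ hides a factor linear in $n$.
**Theorem 1.1.**
Let $f:\{0,1\}^k \to \{0,1\}^n$ be a deterministic (not necessarily efficient) function. Then we have
$$H^{\mathrm{HILL}}_{s,\epsilon}(f(U_k)) \le k+1 \quad \text{where } s = \tilde{O}\left(2^k\epsilon^2\right);$$
more generally, for any $X$ over $\{0,1\}^n$ with support of size $2^k$
$$H^{\mathrm{HILL}}_{s,\epsilon}(X) \le k+1 \quad \text{where } s = \tilde{O}\left(2^k\epsilon^2\right).$$
**Remark 1.2 (Concluding best attacks against PRGs).**
For the special case $k = n-1$ we recover the bound $\tilde{O}(2^n\epsilon^2)$ for pseudorandom generators from [2].
**Proof 1.3 (Proof of Theorem 1.1).**
The theorem follows from Lemma 1 when $\delta = \frac{1}{2}$; consider any $X$ with support of size $2^k$, then $H_\infty^{1/2}(X) \le k+1$, as no matter how we cut probability mass $\frac{1}{2}$ of $X$ over $2^k$ elements, one element will have the weight at least $2^{-(k+1)}$.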
1.3. Proof Outline
1.3.1. A Weaker Result as a Ball-Bins Problem
We outline the proof of a somewhat weakened version of Theorem 1.1 in the language of balls and bins. For every of min-entropy we want to distinguish from . Suppose for simplicity that is flat and is injective, so that is also flat. Our strategy will be to hash the points randomly into two bins and take advantage of the fact that the average maximum load is closer to when we sample from than when drawing from . The reason is that has more balls, so by the law of large numbers, we expect the load to be “more concentrated” around the mean.
Think of throwing balls (inputs ) into two bins (labeled by and ). If the balls come from the support of , the expected maximum load (over two bins) equals . Similarly, if the balls come from the support of , then maximum load is . In terms of the average load (the load normalized by the total number of balls)
[TABLE]
As we obtain (with good probability)
[TABLE]
Letting $D$ be one of these bin assignments we obtain a distinguisher with advantage $\Omega(2^{-k/2})$. To generate the assignments efficiently we relax the assumption about choosing bins and assume only that the choices of bins are independent for any group of $4$ balls. The fourth moment method allows us to keep sufficiently good probabilistic guarantees on the maximum load.
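The balls-and-bins intuition can be checked with a small simulation. The sketch below is only an illustration under simplifying assumptions (fully independent fair coin flips for the bin choices, a hypothetical function name, and arbitrary ball counts); it estimates the average fraction of balls in the fuller of two bins and shows that it sits closer to 1/2 when there are more balls.

```python
import random

def mean_max_load_fraction(n_balls, trials=2000, seed=0):
    """Estimate E[max load / n_balls] when n_balls are thrown into two bins uniformly."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each bit of a random n_balls-bit integer is the bin choice of one ball.
        ones = bin(rng.getrandbits(n_balls)).count("1")
        total += max(ones, n_balls - ones) / n_balls
    return total / trials

few_balls = mean_max_load_fraction(64)    # smaller support: load further from 1/2
many_balls = mean_max_load_fraction(256)  # larger support: load closer to 1/2
```

With N balls the excess over 1/2 shrinks roughly like N^{-1/2}, which is the gap a bin assignment exploits as a distinguisher.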
1.3.2. The General Case by Random Walk Techniques
A high-level outline and comparison to [2]
Below in Figure 1 we sketch the flow of our argument.
Our starting point is the proof from [2]. They use the fact that a random mapping $D:\{0,1\}^n \to \{-1,1\}$ likely distinguishes any two distributions $X$ and $Y$ over $\{0,1\}^n$ with advantage being the Euclidean distance $d_2(X;Y)$.
For any $X$ and $Y$ with constant statistical distance (which is the case for the PRG setting, where $X$ is the PRG output and $Y$ is uniform) this yields a bound $d_2(X;Y) = \Omega(2^{-n/2})$. This bound can then be amplified, at the cost of extra advice, by partitioning the domain and combining the corresponding advantages (the advice basically encodes whether the output needs to be flipped). Finally one can show that $4$-wise independence provides enough randomness for this argument, which makes sampling efficient. Our argument deviates from this approach in two important aspects.
The first difference is that in the pseudoentropy case we can improve the advantage from $2^{-n/2}$, where $n$ is the logarithm of the support of the variables considered, to $2^{-k/2}$, where $k$ is the min-entropy of the variable we want to distinguish from. The reason is that being statistically far from any $k$-bit min-entropy distribution implies a large bias on already $2^k$ elements. This fact (see Lemma 3.1 and Corollary 3.2, and also Figure 3) is a new characterization of smooth min-entropy of independent interest.
The second subtlety arises when it comes to amplifying the advantage over the partition slices. For the pseudorandomness case it is enough to split the domain in a deterministic way, for example by fixing prefixes of $n$-bit strings; in our case this is not sufficient. For us a “good” partition must shatter the $2^k$-element high-biased set, which can be arbitrary. Our solution is to use random partitions; in fact, we show that using $4$-universal hashing is sufficient. Generating base distinguishers and partitions at the same time makes probability calculations more involved.
Technical calculations are based on the fourth moment method, similarly as in [2]. The basic idea is that for settings where the second and fourth moment are easy to compute (e.g. sums of independent symmetric random variables) we can obtain good upper and lower bounds on the first moment. In the context of algorithmic applications these techniques are usually credited to [1]. Interestingly, exploiting natural relations to random walks, we show that the calculations immediately follow by adapting classical (almost one century old) tools and results [5, 4]. Our technical novelty is an application of moment inequalities due to Marcinkiewicz-Zygmund and Paley-Zygmund, which allow us to prove slightly more than just the existence of an attack. Namely, we generate it with constant success probability.
Advantage
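Consider any $X$ with $\delta$-smooth min-entropy smaller than $k$. This requirement can be seen as a statement about the “shape” of the distribution. Namely, the mass of $X$ that is above the threshold $2^{-k}$ equals at least $\delta$, that is
$$\sum_{x} \max\left(P_X(x) - 2^{-k},\, 0\right) \ge \delta.$$
For an illustration see Figure 2.
We construct our attack based on this observation. Define the advantage of a function $D$ for distributions $X$ and $Y$ as
$$\mathrm{Adv}^{D}(X;Y) = \Big|\sum_{x} D(x)\left(P_X(x) - P_Y(x)\right)\Big|$$
(writing also $\mathrm{Adv}^{D}(X;Y;S)$ when the summation is restricted to a subset $S$). Consider a random distinguisher $D:\{0,1\}^n \to \{-1,1\}$. Random variables $D(x)$ for different $x$ are independent, have zero-mean and second moment equal to 1. Therefore the expected square of the advantage, over the choice of $D$, equals
$$\mathbb{E}_{D}\left[\mathrm{Adv}^{D}(X;Y)^2\right] = \sum_{x}\left(P_X(x) - P_Y(x)\right)^2 = d_2(X;Y)^2.$$
Let $S$ be the set of $x$ such that $P_X(x) > 2^{-k}$. For any $Y$ of min-entropy at least $k$ we obtain
$$\sum_{x\in S}\left(P_X(x) - P_Y(x)\right)^2 \ge \sum_{x\in S}\left(P_X(x) - 2^{-k}\right)^2 \ge \frac{1}{|S|}\Big(\sum_{x\in S}\left(P_X(x) - 2^{-k}\right)\Big)^2 \ge 2^{-k}\delta^2,$$
where the first inequality follows because $P_Y(x) \le 2^{-k} < P_X(x)$ for $x \in S$, the second inequality is by the standard inequality between the first and second norm, and the third inequality follows because we showed that $\sum_{x\in S}\left(P_X(x) - 2^{-k}\right) \ge \delta$ (illustrated in Figure 2), which also implies $|S| \le 2^k$.
By the previous formula on the expected squared advantage this means that
$$\mathrm{Adv}^{D}(X;Y) \ge 2^{-k/2}\delta$$
for at least one choice of $D$. This implies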
[TABLE]
A random $D$ as defined would be of size exponential in $n$, but since we used only the second moment in calculations, it suffices to generate the values $D(x)$ as pairwise independent random variables. By assuming $4$-wise independence, which can be computed by small circuits (of size polynomial in $n$), we can prove slightly more, namely that a constant fraction of the generated $D$’s are good distinguishers. This property will be important for the next step, where we amplify the advantage assuming larger distinguishers.
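The second-moment identity used in this step can be verified exactly on a tiny domain. The sketch below is an illustration only: it assumes a domain of 8 points, fully independent uniform signs (rather than the limited independence discussed above), and hypothetical function names. It enumerates all sign assignments $D$ and compares the mean squared advantage with the squared Euclidean distance.

```python
from itertools import product

def signed_advantage(D, p, q):
    """Sum of D(x) * (P_X(x) - P_Y(x)) over the whole domain."""
    return sum(D[x] * (p[x] - q[x]) for x in range(len(p)))

def mean_squared_advantage(p, q):
    """Average of the squared advantage over all 2^n sign assignments."""
    n = len(p)
    total = sum(signed_advantage(D, p, q) ** 2 for D in product((-1, 1), repeat=n))
    return total / 2 ** n

def best_advantage(p, q):
    """Largest absolute advantage over all sign assignments."""
    return max(abs(signed_advantage(D, p, q)) for D in product((-1, 1), repeat=len(p)))

p = [0.5, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]  # low min-entropy toy distribution
q = [0.125] * 8                                  # uniform on 8 points
d2_squared = sum((a - b) ** 2 for a, b in zip(p, q))
```

Since the mean of the squared advantage equals `d2_squared`, at least one assignment must reach advantage at least the Euclidean distance, which is the averaging argument in the text.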
Leveraging the advantage by slicing the domain
Consider a random and equitable partition $\{S_i\}_{i=1}^{T}$ of the set $\{0,1\}^n$. From the previous analysis we know that a random distinguisher achieves advantage $d_2(X;Y)$ over the whole domain. Note that (for any, not necessarily random, partition $\{S_i\}_i$) we have
$$d_2(X;Y)^2 = \sum_{i=1}^{T} d_2(X;Y;S_i)^2,$$
where $d_2(X;Y;S_i)$ is the restriction of the distance to the set $S_i$ (by restricting the summation to $S_i$). From a random partition we expect the mass difference between $X$ and $Y$ to be distributed evenly among the partition slices (see Figure 3(b)). Based on the last equation, we expect
$$d_2(X;Y;S_i) \approx \frac{d_2(X;Y)}{\sqrt{T}}$$
to hold with high probability over $\{S_i\}_i$.
In fact, if the mass difference is not well balanced amongst the slices (in the extreme case, concentrated on one slice) our argument will not offer any gain over the previous construction (see Figure 3(a)).
By applying the previous argument to individual slices, for every $i$ we can obtain an advantage $\Omega\left(d_2(X;Y)/\sqrt{T}\right)$ when restricted to the set $S_i$ (with high probability over the choice of $D$ and $\{S_i\}_i$). Now if the sets $S_i$ are efficiently recognizable, we can combine them into a better distinguisher. Namely, for every $i$ we choose a value $\beta_i \in \{-1,1\}$ such that $D$’s advantage (before taking the absolute value) restricted to $S_i$ has sign $\beta_i$, and set
$$D'(x) = \beta_i \cdot D(x) \quad \text{for } x \in S_i;$$
then the advantage equals (with high probability over $D$ and the $S_i$’s)
$$\mathrm{Adv}^{D'}(X;Y) = \sum_{i=1}^{T} \mathrm{Adv}^{D}(X;Y;S_i) = \Omega\left(\sqrt{T}\cdot d_2(X;Y)\right).$$
We need to specify a $4$-wise independent hash for $D$, another $4$-wise independent hash for deciding in which of the $T$ slices an element lies, and $T$ bits to encode the $\beta_i$’s. Thus for a given $T$ the size of $D'$ will be $\tilde{O}(T)$. Using the above equation, we then get a smooth tradeoff $s = \tilde{O}\left(2^k\epsilon^2\delta^{-2}\right)$ between the advantage $\epsilon$ and the circuit size $s$. This discussion shows that to complete the argument we need the following two properties of the partition: (a) the mass difference between $X$ and $Y$ is (roughly) equidistributed among slices and (b) the membership in partition slices can be efficiently decided.
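The sign-flipping step can be illustrated on a toy instance. The sketch below rests on assumptions for illustration only (a fixed 8-point domain, a hand-picked base distinguisher `D`, a hand-picked partition into 4 slices, and hypothetical function names). It shows that flipping the output on slices with negative signed advantage yields a distinguisher whose advantage is the sum of the absolute slice advantages.

```python
def slice_advantages(D, p, q, slice_of, n_slices):
    """Signed advantage of D restricted to each slice of the partition."""
    adv = [0.0] * n_slices
    for x in range(len(p)):
        adv[slice_of[x]] += D[x] * (p[x] - q[x])
    return adv

def flip_per_slice(D, p, q, slice_of, n_slices):
    """Multiply D by the sign of its advantage on the slice containing each point."""
    adv = slice_advantages(D, p, q, slice_of, n_slices)
    beta = [1 if a >= 0 else -1 for a in adv]  # one advice bit per slice
    return [beta[slice_of[x]] * D[x] for x in range(len(D))]

def total_advantage(D, p, q):
    """Absolute advantage of D over the whole domain."""
    return abs(sum(D[x] * (p[x] - q[x]) for x in range(len(p))))

p = [0.5, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
q = [0.125] * 8
D = [1, -1, -1, 1, 1, -1, 1, -1]          # base distinguisher (illustrative)
slice_of = [0, 1, 2, 3, 0, 1, 2, 3]       # partition into 4 slices (illustrative)
D_flipped = flip_per_slice(D, p, q, slice_of, 4)
```

In this instance the slice advantages of `D` cancel each other on the whole domain, while `D_flipped` collects all of them with the same sign.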
Slicing using $4$-wise independence
To complete the argument, we assume that $T$ is a power of $2$, and generate the slicing by using a $4$-universal hash function $h:\{0,1\}^n \to \{0,1\}^{\log T}$. The $i$-th slice is defined as $S_i = \{x : h(x) = i\}$. These assumptions are enough to prove that
$$\mathbb{E}\Big[\sum_{i=1}^{T} \mathrm{Adv}^{D}(X;Y;S_i)\Big] = \Omega\left(\sqrt{T}\cdot d_2(X;Y)\right).$$
Interestingly, the expected advantage (left-hand side) cannot be computed directly. The trick here is to bound it in terms of the second and fourth moment. The above inequality, coupled with bounds on second moments of the advantage (obtained directly), allows us to prove that
$$\Pr\Big[\sum_{i=1}^{T} \mathrm{Adv}^{D}(X;Y;S_i) = \Omega\left(\sqrt{T}\cdot d_2(X;Y)\right)\Big] = \Omega(1).$$
This shows that there exists the claimed distinguisher $D'$. In fact, a constant fraction of the generated distinguishers $D'$ (over the choice of $D$ and $h$) works.
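One standard textbook way to obtain $4$-wise independent values is to evaluate a random polynomial of degree at most 3 over a finite field. The sketch below is not necessarily the construction used in the paper: it assumes a small prime field (the paper's setting would need a field indexed by $n$-bit strings and a final map to signs or slice labels), and all names are hypothetical. It checks exhaustively that the values at any 4 fixed distinct points are jointly uniform.

```python
from collections import Counter
from itertools import product

def poly_hash(coeffs, x, p):
    """Evaluate the polynomial with the given coefficients at x modulo p (Horner's rule)."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % p
    return acc

def joint_value_counts(points, p):
    """Count how often each tuple of hash values at `points` occurs over all cubic polynomials."""
    counts = Counter()
    for coeffs in product(range(p), repeat=4):  # 4 coefficients: degree at most 3
        counts[tuple(poly_hash(coeffs, x, p) for x in points)] += 1
    return counts

counts = joint_value_counts((0, 1, 2, 3), 5)
```

Every one of the $5^4$ possible value tuples appears exactly once, which is exactly $4$-wise independence for this family; a seed of 4 field elements therefore suffices to describe the hash.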
Random walks
From a technical point of view, our method involves computing higher moments of the advantages to obtain concentration and anti-concentration results. The key observation is that the advantage can be written as
$$\mathrm{Adv}^{D}(X;Y) = \Big|\sum_{x} D(x)\left(P_X(x) - P_Y(x)\right)\Big|,$$
which can then be studied as a random walk
$$\sum_{x} \xi(x)$$
with zero-mean increments $\xi(x) = D(x)\left(P_X(x) - P_Y(x)\right)$. The difference with respect to the classical model is that the increments are only $\ell$-wise independent (for $\ell = 4$). However, the classical moment bounds still apply (see Sections 2.2 and 2.3 for more details).
2. Preliminaries
2.1. Interpolation Inequalities
Interpolation inequalities show how to bound the $p$-th moment of a random variable if we know bounds on one smaller and one higher moment. The following result is known also as log-convexity of $L_p$ norms, and can be proved by the Hölder Inequality.
**Lemma 2.1 (Moments interpolation).**
For any $p_1 < p < p_2$ and any bounded random variable $Z$ we have
$$\|Z\|_{p} \le \|Z\|_{p_1}^{\theta}\, \|Z\|_{p_2}^{1-\theta},$$
where $\theta$ is such that $\frac{\theta}{p_1} + \frac{1-\theta}{p_2} = \frac{1}{p}$, and for any $r$ we define $\|Z\|_r = \left(\mathbb{E}|Z|^r\right)^{1/r}$.
Alternatively, we can lower bound a moment given two higher moments. This is very useful when higher moments are easier to compute. In this work we will bound first moments from below when we know the second and the fourth moment (which are easier to compute as they are even-order moments).
**Corollary 2.2.**
For any bounded $Z$ we have $\mathbb{E}|Z| \ge \dfrac{\left(\mathbb{E} Z^2\right)^{3/2}}{\left(\mathbb{E} Z^4\right)^{1/2}}$.
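As a worked instance, and assuming Lemma 2.1 is the usual log-convexity bound $\|Z\|_p \le \|Z\|_{p_1}^{\theta}\|Z\|_{p_2}^{1-\theta}$, the lower bound on the first moment in terms of the second and fourth moments follows by choosing $p_1 = 1$, $p = 2$, $p_2 = 4$:

```latex
\frac{\theta}{1} + \frac{1-\theta}{4} = \frac{1}{2}
\;\Longrightarrow\; \theta = \frac{1}{3},
\qquad
\|Z\|_2 \le \|Z\|_1^{1/3}\,\|Z\|_4^{2/3}
\;\Longrightarrow\;
\mathbb{E}|Z| = \|Z\|_1 \ge \frac{\|Z\|_2^{3}}{\|Z\|_4^{2}}
= \frac{\left(\mathbb{E} Z^2\right)^{3/2}}{\left(\mathbb{E} Z^4\right)^{1/2}} .
```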
2.2. Moments of random walks
For a random walk $\sum_{x}\xi(x)$, where the increments $\xi(x)$ are independent with zero mean, we have good control over the moments, namely $\mathbb{E}\left|\sum_{x}\xi(x)\right|^{p} = \Theta(1)\cdot \mathbb{E}\left(\sum_{x}\xi(x)^2\right)^{p/2}$, where the constants depend on $p$. This result is due to Marcinkiewicz and Zygmund [5], who extended an earlier result of Khintchine [4]. Below we notice that for small moments it suffices to assume only $4$-wise independence (the most often used versions assume full independence).
Lemma 2.3** (Strengthening of Marcinkiewicz-Zygmund’s Inequality for ).**
Suppose that are -wise independent, with zero mean. Then we have
[TABLE]
The proof appears in Section 4.1.
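2.3. Anticoncentration bounds
**Lemma 2.4 (Paley-Zygmund Inequality).**
For any positive random variable $Z$ and a parameter $\theta \in (0,1)$ we have
$$\Pr\left[Z > \theta\, \mathbb{E} Z\right] \ge (1-\theta)^2\, \frac{\left(\mathbb{E} Z\right)^2}{\mathbb{E} Z^2}.$$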
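The Paley-Zygmund inequality, in its standard form $\Pr[Z > \theta\,\mathbb{E}Z] \ge (1-\theta)^2 (\mathbb{E}Z)^2/\mathbb{E}Z^2$, can be sanity-checked on a small discrete distribution. The sketch below is an illustration with made-up values and a hypothetical function name; it returns both sides of the inequality.

```python
def paley_zygmund_sides(values, probs, theta):
    """Return (Pr[Z > theta * E Z], (1 - theta)^2 * (E Z)^2 / E Z^2) for a discrete Z >= 0."""
    mean = sum(v * w for v, w in zip(values, probs))
    second = sum(v * v * w for v, w in zip(values, probs))
    tail = sum(w for v, w in zip(values, probs) if v > theta * mean)
    bound = (1 - theta) ** 2 * mean ** 2 / second
    return tail, bound

# Toy nonnegative variable: Z = 0, 1, 4 with probabilities 1/2, 1/4, 1/4.
tail, bound = paley_zygmund_sides([0.0, 1.0, 4.0], [0.5, 0.25, 0.25], theta=0.5)
```

The bound is far from tight here, but it only needs the first two moments of $Z$, which is what makes it usable when $Z$ is a squared advantage.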
By applying Lemma 2.4 to the setting of Lemma 2.3, and choosing we obtain
**Corollary 2.5 (Anticoncentration for walks with $4$-wise independent increments).**
Suppose that are -wise independent with zero-mean, then we have
[TABLE]
where the summation is over .
3. Proof of Lemma 1
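**Lemma 3.1 (Characterizing smooth min-entropy).**
For any random variable $X$ with values in a finite set $\mathcal{X}$, any $\delta$ and $k$ we have the following equivalence
$$H_\infty^{\delta}(X) \ge k \iff \sum_{x} \max\left(P_X(x) - 2^{-k},\, 0\right) \le \delta.$$
The proof appears in Section 4.2. We will work with the following equivalent statement.
**Corollary 3.2 (No smooth min-entropy implies bias w.r.t. distributions of min-entropy $k$ over at most $2^k$ elements).**
We have $H_\infty^{\delta}(X) < k$ if and only if there exists a set $S$ of at most $2^k$ elements such that
$$\sum_{x\in S}\left(P_X(x) - P_Y(x)\right) > \delta$$
for all $Y$ of min-entropy at least $k$.
**Proof 3.3 (Proof of Corollary 3.2).**
The direction $\Longleftarrow$ trivially follows by the definition of smooth min-entropy. Now assume $H_\infty^{\delta}(X) < k$. Let $S$ be the set of all $x$ such that $P_X(x) > 2^{-k}$; then $|S| < 2^k$, and moreover by Lemma 3.1 we have $\sum_{x\in S}\left(P_X(x) - 2^{-k}\right) > \delta$. In particular for any $Y$ of min-entropy $k$ (i.e., $P_Y(x) \le 2^{-k}$ for all $x$)
$$\sum_{x\in S}\left(P_X(x) - P_Y(x)\right) \ge \sum_{x\in S}\left(P_X(x) - 2^{-k}\right) > \delta.$$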
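The outline in Section 1.3.2 describes lacking $k$ bits of $\delta$-smooth min-entropy as having at least $\delta$ of probability mass above the threshold $2^{-k}$. Taking that threshold test as the characterization (an assumption of this sketch, as are the function names and the requirement that the domain has at least $2^k$ points), smooth min-entropy can be decided for small explicit distributions. The example is the one from Section 1.2: a nearly uniform distribution with one heavy point.

```python
from math import log2

def mass_above_threshold(p, k):
    """Total probability mass lying above the level 2^-k."""
    t = 2.0 ** -k
    return sum(max(px - t, 0.0) for px in p)

def has_smooth_min_entropy(p, k, delta):
    """Threshold test: at most delta of the mass lies above 2^-k (domain size >= 2^k assumed)."""
    return mass_above_threshold(p, k) <= delta

# 16 outcomes: one point of mass 1/4, the remaining mass spread evenly over 15 points.
p = [0.25] + [0.05] * 15
plain_min_entropy = -log2(max(p))
```

Here the plain min-entropy is only 2 bits, yet moving 0.1875 of the mass off the heavy point gives a distribution with 4 bits of min-entropy, so the smoothed entropy is much larger.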
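**Lemma 3.4 (Bias implies Euclidean distance).**
For any distributions $X, Y$ on $\mathcal{X}$ and any subset $S$ of $\mathcal{X}$ we have
$$\Big(\sum_{x\in S}\left(P_X(x) - P_Y(x)\right)^2\Big)^{1/2} \ge |S|^{-1/2} \sum_{x\in S}\left|P_X(x) - P_Y(x)\right|.$$
**Proof 3.5.**
By the Jensen Inequality we have
$$\frac{1}{|S|}\sum_{x\in S}\left(P_X(x) - P_Y(x)\right)^2 \ge \Big(\frac{1}{|S|}\sum_{x\in S}\left|P_X(x) - P_Y(x)\right|\Big)^2,$$
which is equivalent to the statement.
**Corollary 3.6 (No smooth min-entropy implies Euclidean distance to min-entropy distributions).**
Suppose that $H_\infty^{\delta}(X) < k$. Then for any $Y$ of min-entropy at least $k$ we have $d_2(X;Y) \ge 2^{-k/2}\delta$.
**Proof 3.7 (Proof of Corollary 3.6).**
It suffices to combine Lemma 3.4 and Corollary 3.2.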
By Corollary 2.5 we conclude that the advantage of a random distinguisher for any two measures (in our case and ) equals the Euclidean distance.
Lemma 3.8** (The advantage of a random distinguisher equals the Euclidean distance).**
Let be -wise independent as indexed by and such that outputs a random element from . Then for any set we have
[TABLE]
with probability over the choice of (the result actually holds for any measures in place of ).
For our case, that is the setting in Lemma 3.4, we obtain
Corollary 3.9** (A random attack achieves with significant probability).**
For as in Corollary 3.6, and as in Lemma 3.8 we have w.p. over .
3.1. Partitioning the domain into slices
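Let $h:\{0,1\}^n \to [T]$, where $T = 2^t$, be a $4$-universal hash function. Define $S_i = h^{-1}(i)$, and consider advantages on slices
$$\mathrm{Adv}^{D}(X;Y;S_i) = \Big|\sum_{x\in S_i} D(x)\left(P_X(x) - P_Y(x)\right)\Big|.$$
The following corollary shows that on each of our $T$ slices, we get the advantage $\Omega\left(d_2(X;Y)/\sqrt{T}\right)$. The proof appears in Section 4.3.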
Corollary 3.10** ((Mixed) moments of slice advantages).**
For , as above and every
[TABLE]
(the statement is valid for arbitrary measures in place of ).
Denote . Using Lemma 2.4 with where we compute and according to Corollary 3.10 we obtain . Bounding once again as in Corollary 3.10 we get
**Corollary 3.11 (Total advantage on all partition slices).**
For as in Corollary 3.6, and defined above we have
[TABLE]
(for general the lower bound is ).
The corollary shows that the total absolute advantage over all partition slices is as expected. Since $\{S_i\}_i$ is a partition we have
$$\sum_{i=1}^{T} \mathrm{Adv}^{D}(X;Y;S_i) = \mathrm{Adv}^{D'}(X;Y),$$
where for $\beta_i \in \{-1,1\}$ (the sign of the advantage on the $i$-th slice) we define $D'(x) = \beta_i \cdot D(x)$, where $S_i$ contains $x$. This shows that by “flipping” the distinguisher output on the slices we achieve the sum of individual advantages. Since the bit $\beta_i$ can be computed with $O(T)$ advice (the complexity of the function $h$ plus the complexity of finding $\beta_i$ for a given $i$) we obtain
Corollary 3.12** (Computing total advantage by one distinguisher).**
For as in Corollary 3.6, and defined above there exists a modification to which in time and advice achieves advantage with probability .
Finally by setting and manipulating we arrive at
**Corollary 3.13 (Continuous tradeoff).**
For any $\epsilon$ there exists $T$ such that the distinguisher in Corollary 3.12 has advantage $\epsilon$ and circuit complexity $\tilde{O}\left(2^k\epsilon^2\delta^{-2}\right)$.
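A back-of-the-envelope version of this tradeoff can be written out as follows. It assumes, as the outline in Section 1.3.2 suggests, that with $T$ slices the combined advantage is of order $\sqrt{T}\cdot 2^{-k/2}\delta$ and that the circuit size is of order $T$ up to factors depending on $n$; all constants are ignored.

```latex
\epsilon \approx \sqrt{T}\cdot 2^{-k/2}\,\delta
\;\Longrightarrow\;
T \approx \frac{2^{k}\epsilon^{2}}{\delta^{2}},
\qquad
s = \tilde{O}(T) = \tilde{O}\!\left(2^{k}\epsilon^{2}\delta^{-2}\right).
```

For constant $\delta$ (for instance $\delta = \frac{1}{2}$, as in the proof of Theorem 1.1) this is of the form $\tilde{O}(2^k\epsilon^2)$.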
4. Omitted Proofs
4.1. Proof of Lemma 2.3 (Strengthening of Marcinkiewicz-Zygmund’s Inequality for )
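Let $Z = \sum_{x}\xi(x)$. Since the $\xi(x)$ are (in particular) $2$-wise independent with zero mean, we get
$$\mathbb{E} Z^2 = \sum_{x} \mathbb{E}\,\xi(x)^2$$
(the summation taken over all indices $x$). The fourth moment is somewhat more complicated
$$\mathbb{E} Z^4 = \sum_{x_1,x_2,x_3,x_4} \mathbb{E}\left[\xi(x_1)\xi(x_2)\xi(x_3)\xi(x_4)\right] = \sum_{x} \mathbb{E}\,\xi(x)^4 + 3\sum_{x\ne y} \mathbb{E}\,\xi(x)^2\, \mathbb{E}\,\xi(y)^2.$$
The second equality follows because whenever some $\xi(x)$ occurs in an odd power, for example $\xi(x_1)\xi(x_2)^3$, the expectation is zero (this way one can simplify and bound also higher moments, see [7]). It remains to estimate the first moment. By Corollary 2.2 and the bounds on the second and fourth moment we have just computed we obtain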
[TABLE]
and the upper bound follows by Jensen’s Inequality (with constant 1).
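The moment computations can be checked exactly for a small walk. The sketch below is an illustration under simplifying assumptions (increments $\xi(x) = \pm a_x$ with fully independent uniform signs and arbitrary integer weights $a_x$; function names are hypothetical). It computes the first, second and fourth moments by enumeration and compares them with the lower bound $(\mathbb{E}Z^2)^{3/2}/(\mathbb{E}Z^4)^{1/2}$ and the Jensen upper bound $(\mathbb{E}Z^2)^{1/2}$.

```python
from itertools import product
from math import sqrt

def exact_moments(weights):
    """Exact E|Z|, E Z^2, E Z^4 for Z = sum of s_x * a_x over uniform signs s_x."""
    n = len(weights)
    m1 = m2 = m4 = 0.0
    for signs in product((-1, 1), repeat=n):
        z = sum(s * w for s, w in zip(signs, weights))
        m1 += abs(z)
        m2 += z ** 2
        m4 += z ** 4
    scale = 2 ** n
    return m1 / scale, m2 / scale, m4 / scale

weights = [1, 2, 3, 4]
m1, m2, m4 = exact_moments(weights)
lower = m2 ** 1.5 / sqrt(m4)  # first-moment lower bound from second and fourth moments
upper = sqrt(m2)              # first-moment upper bound from Jensen's inequality
```

For sign increments the fourth moment is at most three times the squared second moment, so the lower bound is within a constant factor of the upper bound.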
4.2. Proof of Lemma 3.1 (Characterizing smooth min-entropy)
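Suppose that $H_\infty^{\delta}(X) \ge k$. Then, by definition, there is $Y$ such that $H_\infty(Y) \ge k$ and $d_1(X;Y) \le \delta$. Since all the summands are positive and since $\sum_{x}\max\left(P_X(x) - P_Y(x),\, 0\right) = d_1(X;Y)$, ignoring those $x$ for which $P_X(x) \le 2^{-k}$ yields
$$\sum_{x:\, P_X(x) > 2^{-k}} \max\left(P_X(x) - P_Y(x),\, 0\right) \le \delta.$$
Again, since $P_Y(x) \le 2^{-k}$ we obtain
$$\sum_{x} \max\left(P_X(x) - 2^{-k},\, 0\right) \le \delta,$$
which finishes the proof of the “$\Longrightarrow$” part.
Assume now that $\sum_{x}\max\left(P_X(x) - 2^{-k},\, 0\right) \le \delta$. Note that (assuming $|\mathcal{X}| \ge 2^k$)
$$\sum_{x}\max\left(P_X(x) - 2^{-k},\, 0\right) - \sum_{x}\max\left(2^{-k} - P_X(x),\, 0\right) = \sum_{x}\left(P_X(x) - 2^{-k}\right) = 1 - |\mathcal{X}|\,2^{-k} \le 0$$
and therefore we have $\sum_{x}\max\left(P_X(x) - 2^{-k},\, 0\right) \le \sum_{x}\max\left(2^{-k} - P_X(x),\, 0\right)$. By this observation we can construct a distribution $Y$ by shifting $\sum_{x}\max\left(P_X(x) - 2^{-k},\, 0\right)$ of the mass of $X$ from the set $\{x : P_X(x) > 2^{-k}\}$ to the set $\{x : P_X(x) < 2^{-k}\}$ in such a way that we have $P_Y(x) \le 2^{-k}$ for all $x$. Thus $H_\infty(Y) \ge k$, and since at most a $\delta$ fraction of the mass is shifted and redistributed we have $d_1(X;Y) \le \delta$. This finishes the proof of the “$\Longleftarrow$” part.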
4.3. Proof of Corollary 3.10 ((Mixed) moments of slice advantages)
For shortness denote and .
Note that by Lemma 2.3, applied to the family (which is -wise independent) we have
[TABLE]
which is the first inequality claimed in the corollary. In turn, again by Lemma 2.3, we have
[TABLE]
Since this holds for any , by Cauchy-Schwarz we get for any
[TABLE]
which proves the second inequality in the corollary.
References
- [1] Bonnie Berger. “The Fourth Moment Method”. In SIAM J. Comput. 26.4, 1997, pp. 1188–1207. DOI: 10.1137/S0097539792240005
- [2] Anindya De, Luca Trevisan and Madhur Tulsiani. “Time Space Tradeoffs for Attacks against One-Way Functions and PRGs”. In Advances in Cryptology - CRYPTO 2010, 30th Annual Cryptology Conference, Santa Barbara, CA, USA, August 15-19, 2010. Proceedings, 2010, pp. 649–665. DOI: 10.1007/978-3-642-14623-7_35
- [3] Yevgeniy Dodis, Krzysztof Pietrzak and Daniel Wichs. “Key Derivation without Entropy Waste”. In Advances in Cryptology – EUROCRYPT 2014, Lecture Notes in Computer Science 8441, Springer Berlin Heidelberg, 2014, pp. 93–110. DOI: 10.1007/978-3-642-55220-5_6
- [4] Aleksandr Khintchine. “Über einen Satz der Wahrscheinlichkeitsrechnung”. In Fundamenta Mathematicae 6.1, 1924, pp. 9–20. URL: http://eudml.org/doc/214283
- [5] J. Marcinkiewicz and A. Zygmund. “Quelques théorèmes sur les fonctions indépendantes”. In Studia Mathematica 7.1, 1938, pp. 104–120. URL: http://eudml.org/doc/218615
- [6] Renato Renner and Stefan Wolf. “Simple and Tight Bounds for Information Reconciliation and Privacy Amplification”. In Advances in Cryptology - ASIACRYPT 2005, 11th International Conference on the Theory and Application of Cryptology and Information Security, Chennai, India, December 4-8, 2005, Proceedings, 2005, pp. 199–216. DOI: 10.1007/11593447_11
- [7] Jeanette P. Schmidt, Alan Siegel and Aravind Srinivasan. “Chernoff-Hoeffding Bounds for Applications with Limited Independence”. In Proceedings of the Fourth Annual ACM/SIGACT-SIAM Symposium on Discrete Algorithms, 25-27 January 1993, Austin, Texas, 1993, pp. 331–340. URL: http://dl.acm.org/citation.cfm?id=313559.313797
