Revisiting Shared Data Protection Against Key Exposure
Katarzyna Kapusta, Gerard Memmi, and Matthieu Rambaud

TL;DR
This paper introduces a new security model and encryption scheme for distributed data storage that remains secure even if the encryption key is exposed, reducing overhead and providing cryptographic proofs of security.
Contribution
It defines a novel security model for key exposure resilience, proposes a new encryption-then-sharing scheme with reduced overhead, and offers cryptographic proofs within this context.
Findings
New security model for key exposure scenarios
Encryption scheme with half the overhead of existing methods
Cryptographic proofs based on blockcipher resilience assumptions
Abstract
This paper sheds new light on secure data storage inside distributed systems. Specifically, it revisits computational secret sharing in a situation where the encryption key is exposed to an attacker. It comes with several contributions. First, it defines a security model for encryption schemes, where we ask for additional resilience against exposure of the encryption key. Precisely, we ask for (1) indistinguishability of plaintexts under full ciphertext knowledge, and (2) indistinguishability for an adversary who learns the encryption key plus all but one share of the ciphertext. (2) relaxes the "all-or-nothing" property to a more realistic setting, where the ciphertext is transformed into a number of shares such that the adversary cannot access one of them. (1) asks that, unless the user's key is disclosed, no one other than the user can retrieve information about the plaintext. Second, it…
Abstract
This paper sheds new light on computational secret sharing with a view towards distributed storage environments. It starts by revisiting the security model for encrypted data protection against key exposure. The goal is to take advantage of the characteristics of distributed storage in order to design faster key-leakage-resistant schemes, with the same security properties as the existing ones in this context of distributed storage.
We then introduce two novel schemes that match our “all storage places or nothing” security level of secret sharing under key exposure. The first one is based on standard block cipher encryption. The second one fits either in the random oracle model (e.g. Keccak) or in the idealized blockcipher model with a key twice as large as the security parameter (e.g. an idealized AES256 would achieve 128-bit security). The first one halves the amount of processing required in addition to data encryption, compared with the fastest state-of-the-art solution, whereas the second one eliminates this additional processing entirely. We confirm the complexity results by presenting a performance evaluation.
A non-negligible part of our contribution lies in the provided security analysis. In addition to giving security proofs for our schemes, we revisit those of previous work and point out structural weaknesses in the context of key exposure.
Keywords: Distributed storage security, computational secret sharing, cloud storage security, data protection, all-or-nothing encryption, confidentiality under key exposure.
1 Introduction
Ensuring confidentiality of data at rest is mostly achieved by encrypting it with a symmetric cipher. However, even the strongest algorithm protects data only as long as the key remains secret from attackers. Secure key management is the obvious countermeasure to that problem, but implementing it well is not straightforward and may well be costly. The problem of reinforcing data confidentiality against weak or leaked keys has been present in the literature for almost three decades Rivest (1997b). Nowadays, the emergence of new powerful attackers puts this problem into a new light Karame et al. (2018).
As data are often outsourced and managed externally, key exposure may have many origins. First, it may result from bad key generation, for instance because of hard-coded keys or backdoors in the key generation software Cohney et al. (2018); Karame et al. (2018). Second, the key may be revealed through poor management (the risk of bad key management increases when multiple users share the same data and the same key). Third, as time passes, the key length may become insufficient against ever more powerful adversaries with growing computational capabilities. Last but not least, an encryption key may simply be lost, stolen, or obtained as the result of bribery or coercion.
A classical way of reinforcing the confidentiality of encrypted data and protecting it against key exposure consists in encrypting the data in such a way that an adversary obtains no information unless she has the totality of the ciphertext, plus a secret encryption key: this property is denoted as all-or-nothing (AON). Key exposure resilient secret sharing schemes can thus be derived from schemes with this property. However, we observe that earlier all-or-nothing works (such as the classical schemes of Rivest or Desai Rivest (1997b)) are not efficient because they require two rounds of encryption: the first one to achieve the all-or-nothing property, and the other one to encrypt. A more recent work Karame et al. (2018) gets rid of this double-encryption requirement, at the cost of a linear post-processing of the encrypted file. But it appears that its security proofs are based on the assumption that an adversary cannot distinguish between a piece of a ciphertext, of which she chose the plaintext, and pure randomness. We will detail why this assumption does not hold for the mainstream encryption schemes that Karame et al. (2018) uses, such as blockciphers in counter mode. Thus, we are left with an important question:
How can we efficiently protect encrypted data against key exposure, since we believe that the risk of key exposure is impossible to completely eradicate?
By efficiently we mean with no additional encryption layer. We discuss the general technical challenges raised by this problem throughout our paper: in §4.3, at the end of §4.5, in §4.7 and in §6.2. The good news is that distributed storage provides us with an additional security ingredient, data fragmentation and dispersal, enabling us to address the key exposure threat. On the other hand, it seemed to us that all the previous schemes address an overly constraining security model, where the adversary selects all the ciphertext blocks that she wants to see (up to a certain quantity). Guided by a practical secure storage purpose, we think that the relevant parameter is instead the number of share containers or storage sites that an adversary is able to access. Hence another question: In a threat model where an adversary fully accesses a certain number of storage sites, can we achieve the same security level as the existing AON schemes with more efficiency?
As will become clear, we are facing a tradeoff. Either we could follow the approach of Bastion Karame et al. (2018) and use mainstream encryption, then fix the issue of non-pseudorandomness by an additional post-processing. Or we could strive to design encryption schemes that directly have a pseudorandom behavior under key exposure:
Can we find efficient encryption schemes where any large fraction of the ciphertext looks pseudo-random even in a situation of key exposure? And for mainstream encryption, how far can we reduce the necessary post-processing overhead to achieve this goal?
We address the second question by revisiting the security model for encrypted data protection against key exposure in a distributed environment, such as a cloud composed of several storage sites or a multi-cloud with independent storage providers. Instead of taking the previous AON threat model (described in Karame et al. (2018)), in which an adversary is able to compromise a given number of arbitrary ciphertext blocks, we present a new model taking as parameter the number of compromised servers or sites. This change is motivated by the fact that an adversary able to compromise one ciphertext block in a given storage site has necessarily acquired enough access rights to also compromise the other data stored in the same site by the same user, which contain the said compromised ciphertext block. In particular, we match the same security level as the recently introduced CAKE Karame et al. (2018) model, in our context where the adversary accesses full shares.
This readjustment of the security model allows us to propose new secret sharing schemes with the all-or-nothing property that are faster than the state-of-the-art proposals. First, we introduce a scheme based on classical symmetric encryption with a block cipher in counter mode. It halves the complexity of the post-processing compared with the fastest relevant work, which is the Bastion scheme Karame et al. (2018). We then show how to get rid of the post-processing entirely by switching to a scheme based on less mainstream encryption with random oracles. We give two efficient instantiations of it: from ideal blockciphers with keys twice as large (e.g. AES256 would provide a security parameter of 128), and from ideal sponge functions (e.g. Keccak). The price to pay for the improvements of our new schemes is a slight storage overhead, of times the key size (e.g. 128 bits, for an overall 128-bit level of security), where is the number of storage sites.
Apart from complexity and performance improvements, the main contribution of this work is a deep security analysis of computational secret sharing schemes in a situation of key exposure. We revisit the previously proposed security proof of Karame et al. (2018) and single out a sufficient hypothesis under which its security is guaranteed. At the same time, we show that the classical proof structure for indistinguishability of plaintext encryptions, which proceeds by reduction to a distinguisher between an implemented keyed function and an idealized one, breaks down in the presence of key leakage. This motivates the completely different approach in our proofs, which crucially rely on the fact that an idealized blockcipher with a leaked key still behaves as an ideal permutation. To match this ideal behavior in practice, it is important that the leaked key was chosen sufficiently at random in the first place. The proof of our alternative scheme in the random oracle model is actually much simpler, and provides a more efficient scheme.
In conclusion, our work provides both two scalable and fast secret sharing schemes and a complement to the key management recommendations provided, for instance, by NIST Barker (2016).
Detailed outline. In Section 2 we present basic definitions, data structures, and notations used throughout this paper. In Section 3, we present related works from the domains of secret sharing and all-or-nothing encryption and point out their limitations. In Section 4, we introduce a new security model (denoted as SAKE) and show that, under our storage site-by-storage site threat model, it is as secure as the more constraining one introduced recently in Karame et al. (2018). We finally highlight the main technical challenges of one-round encryption under key exposure, which to our knowledge we are the first to properly address. In Section 5, we illustrate the relevance of our approach by describing a new scheme (denoted as ) providing protection against key exposure that halves the complexity of the post-encryption processing with regard to the fastest state-of-the-art scheme. In Section 6, we not only present its security proof but also revisit the security analysis of a recent relevant scheme. As an alternative to our scheme, in Section 7, we introduce a scheme (denoted as ) that gets rid of the additional post-processing overhead entirely, and describe both an instantiation using blockciphers and an implementation with Keccak. Before concluding in Section 9, Section 8 compares our two new proposals with the state-of-the-art techniques in terms of complexity, memory occupation, and verification of security properties. Complexity results are confirmed by a performance evaluation.
2 Basic Definitions and Notations
We identify the following definitions and basic data structures, which mostly correspond to classical concepts of block cipher encryption and computational secret sharing. Our main unit of length is a block, which is a sequence of bits of size $|B|$, where $|B|$ is our security parameter. All secret keys , headers, initialization vectors etc. will be one block long, as will the inputs/outputs / of the blockciphers that we consider. [Footnote 1: Notwithstanding, our blockcipher instantiation of will use a blockcipher with a two-blocks-long key , where is the actual one-block-long key of our final scheme. But the blockcipher itself will still process one-block-long inputs and outputs and : e.g. AES256, which processes 128-bit blocks.] Thus, a block is necessarily small compared to shares (see Dworkin (2001) and below), which can be much bigger (we focus on large data protection, since perfect secret sharing can be applied to small data instead of symmetric encryption, solving the problem of key exposure). We note or the XOR operation between two blocks, which is the sum of binary vectors in $\mathbf{F}_2^{|B|}$. We will sometimes add a block with a number (typically the counter of CTR mode). By this abuse of notation, we simply mean the binary writing of the number , seen as a vector in $\mathbf{F}_2^{|B|}$.
- Plaintext : the data to be protected. It is a sequence of bits and (in the context of block cipher encryption) of blocklength blocks , where vary from to . [Footnote 2: In Karame et al. (2018), indices are from to ().]
- Secret , , and : we distinguish three secret values. A secret key of size one block, which will never be stored with the data and is given only to the persons entitled to read the data (this will be abstracted out in our model). By contrast, the two other secret quantities or (corresponding to a pseudorandom header/nonce or to an initialization vector) are stored in the storage sites, but in a secret-shared manner: they are “mixed within the shares”.
- Ciphertext : plaintext encrypted with a randomized keyed encryption scheme . has a total bitsize of bits. If encryption was done using a blockcipher, then is a sequence of blocks. Indices of the vary from [math] to . [Footnote 3: In Karame et al. (2018), they are noted to , where denotes the number of blocks.]
- Transformed ciphertext : when a linear transform is applied to the ciphertext , the notation is used to distinguish between the ciphertext and the ciphertext after the transformation. [Footnote 4: In our scheme, indices of the vary from to , whereas in Karame et al. (2018) they are noted to . In our scheme the transformed ciphertext just consists of the ciphertext where the initialization vector has been removed. Moreover, when compared to Karame et al. (2018), we reversed the primes between the and the transformed ; our is one block smaller; and our are of length instead of (the letter being reserved for the number of shares).]
- Fragment for : we partition the transformed ciphertext into fragments . For simplicity, if the transformed ciphertext is composed of blocks (or bits), then each is a group of consecutive blocks (or bits). We assume for convenience that and are divisible by the number of shares .
- Share for : the shares are by definition exactly what is stored in the storage sites, no more and no less. They typically consist of the above fragments, plus a share of some secret quantity . In any case, the shares never contain (to be made precise) the secret key . [Footnote 5: In Karame et al. (2018) shares are equal to the fragments.]
- “All-or-nothing transform” : randomized transformation mapping the plaintext into a set of shares . It possibly uses a secret key as additional input. It also always uses a random input , that we do not make explicit in the input variables. It uses a public symmetric encryption scheme as an underlying mechanism. In our new security model, “all-or-nothing” should be understood as “all storage sites/shares”.
- Random oracle (RO): a function $\mathcal{O}: \mathbf{F}_2^{2|B|} \longrightarrow \bigl(\mathbf{F}_2^{|B|}\bigr)^{\infty}$ of arbitrarily long output, taking small seeds as input (for example two blocks: ). We note for the oracle with output truncated to blocks. A key property of random oracles that we will use is that, when fixing the first input block , the function is still a random oracle, this time with a one-block input, and moreover independent from all the other oracles .
- Ideal keyed blockcipher : a symmetric encryption function that maps blocks to blocks.
Detailed definition of a random oracle. We refer to the definition presented in (Katz and Lindell, 2014, §13). When queried on a new input, the oracle outputs a string of independent uniformly distributed bits, and when queried on a previously queried input, it returns the same output. We will use the fact that a RO satisfies in particular the strictly weaker property of being a pseudorandom generator, also known as an (idealized) seeded pseudorandom number generator (PRNG) (Katz and Lindell, 2014, Definition 3.15). See §4.7 for a discussion of why the random-looking property of the output of a pseudorandom generator is in general not preserved when part of the seed is leaked, while the random-looking property of the output of a random oracle is preserved even if part of the seed is leaked. This is the main technical subtlety of this paper, and the reason why we take great care in the security analyses under key exposure.
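The behavior described above (fresh uniform bits on a new input, the same answer on a repeated input, and truncation to a chosen number of blocks) can be sketched by lazy sampling. This is only an illustrative model, not an instantiation from the paper: the class name, the 16-byte block length, and the use of Python's `secrets` module as the source of uniform bits are all assumptions.

```python
import secrets


class LazyRandomOracle:
    """Lazily sampled random oracle with arbitrarily long output.

    Hypothetical helper for illustration: answers are drawn on demand
    and memoised, so repeated queries are consistent.
    """

    def __init__(self, block_len=16):
        self.block_len = block_len  # block size in bytes (assumed: 128 bits)
        self._table = {}            # seed -> output bytes sampled so far

    def query(self, seed, n_blocks):
        """Return the oracle output on `seed`, truncated to `n_blocks` blocks."""
        needed = n_blocks * self.block_len
        known = self._table.get(seed, b"")
        if len(known) < needed:
            # Extend the stored answer with fresh uniform bytes, so that
            # shorter outputs are always prefixes of longer ones.
            known += secrets.token_bytes(needed - len(known))
            self._table[seed] = known
        return known[:needed]
```

A real deployment cannot use such a table, since all parties must be able to evaluate the same function; the sketch only makes the idealized behavior concrete.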
Keyed pseudorandom function/permutation (PRF/PRP). We refer to (Katz and Lindell, 2014, Definitions 3.24 & 3.26). A PRF is a function: , such that when the first input component is chosen at random and fixed once and for all, the output of on several randomly chosen secret is indistinguishable from independent uniform random blocks.
Definitions of ideal keyed blockcipher and ideal random permutation. Shannon’s perfect blockcipher model is an oracle such that, on every query with a new key (the first input component), it outputs a fresh new random permutation oracle Black et al. (2002). In turn, let us recall informally that a random permutation oracle (as assumed by the random permutation model: RPM) is a tuple of functions operating on blocks and inverse of each other. They have the following ideal behavior: on every input to {not queried to nor output by before}, then outputs a value chosen uniformly at random among all values {not output by nor input to before}; and the inverse oracle behaves accordingly in the other direction. These three ideal primitives (ideal BC, RO, and RPM) were recently proven to be equivalent to each other. [Footnote 6: Ideal BC is efficiently instantiable from RPM Lampe and Seurin (2013) and also from RO Holenstein et al. (2011). Finally, a RO with arbitrarily long variable input and output can itself be instantiated from a sponge function calling an ideal random permutation Bertoni et al. (2008).]
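The ideal behavior just described can be modelled by lazy sampling in both directions: each new point receives a value drawn uniformly among the blocks not yet used on the other side, and an ideal keyed blockcipher is then one independent permutation per key. The sketch below is a toy model under assumed names and an assumed 16-byte block; it is not one of the instantiations cited in the text.

```python
import secrets


class LazyRandomPermutation:
    """Random permutation oracle on fixed-size blocks, sampled lazily."""

    def __init__(self, block_len=16):
        self.block_len = block_len  # block size in bytes (hypothetical default)
        self._fwd = {}              # input  -> output, for points fixed so far
        self._inv = {}              # output -> input

    def _fresh(self, used):
        # Rejection sampling: uniform among the blocks not yet in `used`.
        while True:
            candidate = secrets.token_bytes(self.block_len)
            if candidate not in used:
                return candidate

    def forward(self, x):
        if x not in self._fwd:
            y = self._fresh(self._inv)  # y must not already be an output
            self._fwd[x], self._inv[y] = y, x
        return self._fwd[x]

    def inverse(self, y):
        if y not in self._inv:
            x = self._fresh(self._fwd)  # x must not already be an input
            self._inv[y], self._fwd[x] = x, y
        return self._inv[y]


class IdealBlockCipher:
    """Ideal keyed blockcipher: one independent random permutation per key."""

    def __init__(self, block_len=16):
        self.block_len = block_len
        self._perms = {}

    def permutation(self, key):
        if key not in self._perms:
            self._perms[key] = LazyRandomPermutation(self.block_len)
        return self._perms[key]
```

Note that revealing a key in this model reveals nothing beyond oracle access to the corresponding permutation, which is the property the security discussion relies on.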
Chosen plaintexts indistinguishability ind1 and ind-CPA, polynomial adversary and negligible functions. We consider an adversary who can make a total number of queries which we note . Let us define a negligible function as any fixed polynomial in , , and divided by . Implicitly, this means that our adversary is polynomial. Thus what we will call a “negligible event” should be understood as something which occurs with probability equal to a negligible function. Likewise, when we say that the adversary has “negligible advantage” in a game, we mean that she wins it half of the time, plus or minus a negligible percentage. With respect to these conventions, an encryption scheme with plaintext indistinguishability under queries with one-shot keys, a.k.a. ind1, is defined in (Katz and Lindell, 2014, Definition 3.9). Indistinguishability under multiple queries with the same key, which we simply note “ind-CPA”, is defined in (Katz and Lindell, 2014, Definition 3.22).
3 Related Work
This section presents an overview of methods reinforcing confidentiality by dispersing data over different storage sites.
Perfect secret sharing (PSS). Perfect secret sharing Blakley (1979); Shamir (1979) with threshold transforms data into a set of shares, of which are needed for data reconstruction. Strictly fewer than shares provide no information whatsoever about the initial data, so the information is protected against adversaries unable to collect the required threshold of shares. Shamir’s perfect secret sharing scheme (PSS) Blakley (1979); Shamir (1979) is information-theoretically secure. This unconditional security comes at the cost of a -fold increase in the volume of the data to be stored, as each of the shares is of the size of the data itself.
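As a concrete illustration of threshold sharing, here is a minimal sketch of Shamir's polynomial scheme over a prime field. The field size (the Mersenne prime 2^127 - 1), the function names, and the encoding of the secret as a single integer are assumptions made for this example; requires Python 3.8+ for the modular inverse via `pow`.

```python
import secrets

PRIME = 2**127 - 1  # Mersenne prime; field size chosen for illustration only


def shamir_share(secret, threshold, n_shares):
    """Split an integer secret (< PRIME) into n_shares points of a random
    polynomial of degree threshold - 1 whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n_shares + 1)]


def shamir_reconstruct(points):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Each share is as large as the secret itself, which is the storage blow-up mentioned above and the reason computational schemes are preferred for large data.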
Computational secret sharing (CSS). In its vanilla version, Krawczyk’s Secret Sharing Made Short Krawczyk (1994) consists in encrypting the plaintext, then dividing the ciphertext into fragments. Each storage site receives a “share”, consisting of one fragment plus a share of the encryption key under a PSS with threshold . In (Krawczyk, 1994, Theorem 3) this scheme is proven to be a secret sharing scheme with threshold , under computational assumptions. As we detail later in §4.3, Krawczyk’s CSS breaks down under key exposure.
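The encrypt-then-fragment structure can be sketched as follows. Two simplifications are assumed and should not be read as Krawczyk's actual construction: the cipher is a stand-in keystream derived from SHAKE-256, and the key is shared with an all-shares-required XOR sharing rather than a general threshold PSS (and without the information dispersal step). All function names are hypothetical.

```python
import hashlib
import secrets

KEY_LEN = 16  # bytes; assumed key and nonce size


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key, nonce, length):
    # Stand-in stream cipher (assumption): SHAKE-256 over key || nonce.
    return hashlib.shake_256(key + nonce).digest(length)


def xor_share(secret, n):
    """n-of-n sharing: n - 1 random strings plus one correcting string."""
    parts = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = secret
    for p in parts:
        last = xor_bytes(last, p)
    return parts + [last]


def css_split(plaintext, n):
    """Encrypt, cut the ciphertext into n fragments, attach a key share to each."""
    key = secrets.token_bytes(KEY_LEN)
    nonce = secrets.token_bytes(KEY_LEN)
    ciphertext = nonce + xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))
    size = -(-len(ciphertext) // n)  # ceiling division
    fragments = [ciphertext[i * size:(i + 1) * size] for i in range(n)]
    return list(zip(fragments, xor_share(key, n)))


def css_recover(shares):
    """Anyone holding all shares rebuilds the key, then decrypts."""
    ciphertext = b"".join(fragment for fragment, _ in shares)
    key = bytes(KEY_LEN)
    for _, key_share in shares:
        key = xor_bytes(key, key_share)
    nonce, body = ciphertext[:KEY_LEN], ciphertext[KEY_LEN:]
    return xor_bytes(body, keystream(key, nonce, len(body)))
```

Because the key travels inside the shares, no secret outside the storage sites is needed for recovery, which is exactly what makes the construction short and also what the later sections revisit.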
All-or-nothing encryption (AON). An all-or-nothing (AON) encryption makes the ciphertext decryptable only when complete. This security property can be achieved using a pre- or post-processing of encrypted data denoted as an all-or-nothing transform (AONT). A detailed overview of AON schemes can be found in Qiu et al. (2019).
Due to space constraints, we defer to Appendix C the details of Rivest’s AONT scheme and of Boyko & Desai’s AONT schemes, as well as the formalization of what an AONT is. Although the definitions differ, one can roughly understand it as an efficiently invertible transformation such that, given two chosen plaintexts, an adversary knowing all but one block of their transforms cannot distinguish between them. The transformation is composed of two encryption rounds using two different keys, only one of which will be exposed.
Stinson Stinson (2001) formalized a linear all-or-nothing transform as an invertible matrix of size , that maps an “unknown” input vector of length to a “partially known” output vector , such that for each index , and every missing coordinate , an attacker who manages to learn all coordinates of but will learn no more information on than what she knew a priori. However, we would like to emphasize that the standalone property of a linear AONT is not enough to obtain a cryptosystem. Indeed, as emphasized by Boyko in Boyko (1999), an attacker may still learn correlations between unknown coordinates , potentially ruining the ind-CPA security property of the ciphertext. [Footnote 7: “However, the linear constructions of Stinson would definitely not be secure in that model, since it is easy to come up with linear relations among the elements of by looking at just a few elements of (in fact, since is linear and deterministic, every output of gives a linear relation on elements of ).”]
The recently introduced Bastion scheme Karame et al. (2018) builds on a variant of Stinson’s linear AONT. The plaintext is first encrypted, then the ciphertext is transformed using a square matrix such that: (i) all diagonal elements are set to 0, and (ii) all off-diagonal elements are set to 1.
As a result, each transformed block is the XOR of all the other ciphertext blocks. The transformed ciphertext is then claimed to remain protected when the key and all but two ciphertext blocks are exposed (one more block than in the case of Stinson’s AONT). The advantage of Bastion’s approach is that it requires only a single encryption round and is thus much faster than Rivest’s initial proposal (see Figure 3).
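The matrix described above (zeros on the diagonal, ones elsewhere) can be applied without any matrix arithmetic: XOR all blocks together once, then XOR that total into each block, which cancels the block's own contribution. The sketch below covers only this linear step, not the encryption that precedes it, and assumes an even number of blocks, in which case the matrix is its own inverse over the binary field. Function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def linear_transform(blocks):
    """Replace each block by the XOR of all the *other* blocks.

    Equivalent to multiplying by the 0-diagonal, all-ones-elsewhere matrix.
    With an even number of blocks, applying it twice returns the input,
    so the same function also inverts the transform.
    """
    if len(blocks) % 2 != 0:
        raise ValueError("this sketch assumes an even number of blocks")
    total = reduce(xor_bytes, blocks)           # XOR of every block
    return [xor_bytes(total, block) for block in blocks]
```

The cost is two XOR passes over the ciphertext, which is the linear post-processing overhead discussed in the text.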
4 Encryption under Key Exposure: Why a New Model?
We motivate in §4.1 our alteration of existing security models in order to address resiliency of shared data under key exposure, that we present in §4.2: -Shares Access under Key Exposure, which takes as parameter the number of shares (=storage sites) instead of the amount of compromised blocks. Then in §4.3 we explain why previous schemes, such as Krawczyk’s SSMS, break down under this scenario of key exposure. We give - a more formal definition in §4.4, that we relate to the classical ind-CPA security. In §4.5, we give a soft overview of the challenges to overcome to achieve efficient -SAKE schemes, and discuss previous schemes achieving SAKE security. In §4.6 we compare SAKE to CAKE, and emphasize that they are equivalent against an adversary corrupting all but one of the storage sites. Finally, in §4.7 we nail down the main technical subtlety that makes the difference between encryptions which are directly SAKE, and those which are not (and which thus require extra post-processing).
4.1 Motivations
The state-of-the-art security model relevant for evaluating the security of a computational secret sharing scheme under key exposure was introduced in Karame et al. (2018) and is denoted as Ciphertext Access under Key Exposure (CAKE). A scheme is said to be CAKE secure when it resists an attacker able to access any blocks of the dispersed ciphertext as well as the encryption key. This model, inspired by all-or-nothing encryption, still does not take into account the fact that, once transformed, the ciphertext will be cut into shares containing a fraction of e.g. blocks each. [Footnote 8: We have different notations than (Karame et al., 2018, §3.1).] These shares will then be distributed over different storage sites or servers, preferably belonging to independent storage providers. Thus, the power given to the adversary to obtain any ciphertext blocks that she wants may be overkill in this context. The actual difficulty for an attacker lies in looking for and acquiring the totality of these dispersed shares, which is equivalent to compromising all the storage sites. Indeed, once an attacker compromises a storage site, she will most probably be able to acquire all the blocks of the share stored at this site. [Footnote 9: A rare exception to this scenario would be in the case of a memory leak attack.]
Motivated by the observation described above, we propose a relaxed security model, in which we want to protect the data shares against an attacker able to compromise all but one of the storage sites. This alteration in perspective allows us to design faster secret sharing schemes without in practice compromising on security.
4.2 Shares Access under Key Exposure: -SAKE
We now introduce our threat model, which addresses a scenario where data is transformed into shares , and these shares are then dispersed over storage sites. In this model, we can have two types of attackers. The first type of attacker (1) will be able to compromise all the storage sites and gather all the shares but will not be able to access the symmetric encryption key . They are typically honest-but-curious cloud providers who would have access to all the shares. The second type of attacker (2) will be able to obtain the key, e.g. due to bad key management, but will not be able to compromise the totality of the storage sites.
We now formulate two security properties that characterize a scheme resisting the presented threat model:
1. Security property (1) - classical ind-CPA property: In the scenario where the adversary possesses all the shares but has no information about the key, we ask for the classical indistinguishability under Chosen Plaintext Attack (ind-CPA) as defined for instance in (Katz and Lindell, 2014, p. 74).
2. Security property (2) - ind-CPA property of all-but-one shares under key exposure: In the scenario where the attacker could get the key and get to all but one of the storage sites, we ask for the indistinguishability under chosen plaintext attack and under key exposure of the shares in possession of the attacker.
4.3 Comments on the -SAKE definition
Property (1) is easy to match since it suffices to encrypt data using any scheme having the ind-CPA property before transforming them into shares and, most importantly, not to hide the encryption key within the shares. Indeed, if the key is secret shared and the key shares are attached or appended to the data shares, as in the case of Krawczyk’s SSMS in its original form, an attacker getting the shares will automatically get the encryption key. Similarly, in the case of Rivest’s or Desai’s AONT, the key can be recovered once the ciphertext is complete. So in all these schemes (1) does not hold.
By contrast, property (2) will be much harder to achieve. Let us give a taste of why. Consider Krawczyk’s CSS scheme where the data owner does not “mix” the encryption key within the shares at all, but instead keeps it for himself (and for the other accredited persons). Let us note the baseline encryption scheme used in Krawczyk’s and the secret key. Let us suppose informally that operates as a stream cipher, so that, given the key, the first blocks of the ciphertext can be decrypted into the first blocks of the plaintext. Now consider an adversary who manages to obtain this key , and who chooses two plaintexts of the same length which differ in their initial block. Then, this adversary can distinguish between these plaintexts as soon as she is given the share containing the initial blocks of the ciphertexts.
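The distinguishing attack sketched in this paragraph can be made concrete. The example assumes a stand-in stream cipher built from SHAKE-256 with the nonce stored in clear as a header of the ciphertext, and assumes the first share is simply the leading chunk of the ciphertext; none of these choices comes from the paper, and all names are hypothetical.

```python
import hashlib
import secrets

BLOCK = 16  # bytes; assumed block, key and nonce size


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key, nonce, length):
    # Stand-in stream cipher (assumption): SHAKE-256 over key || nonce.
    return hashlib.shake_256(key + nonce).digest(length)


def encrypt(key, plaintext):
    nonce = secrets.token_bytes(BLOCK)
    body = xor_bytes(plaintext, keystream(key, nonce, len(plaintext)))
    return nonce + body  # the nonce sits in clear at the start


def first_share(ciphertext, n):
    size = -(-len(ciphertext) // n)  # ceiling division
    return ciphertext[:size]


def distinguish(leaked_key, share, m0, m1):
    """Decrypt the start of the ciphertext with the leaked key and compare."""
    nonce, body = share[:BLOCK], share[BLOCK:]
    prefix = xor_bytes(body, keystream(leaked_key, nonce, len(body)))
    return 0 if m0.startswith(prefix) else 1
```

A single share suffices here as long as it extends past the nonce and covers a position where the two chosen plaintexts differ.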
Notice that we guarantee nothing against an adversary who has access to all the shares and to the decryption key . On the contrary, we ask for the plaintext to be efficiently reconstructible from these two ingredients. Additional means of protecting the plaintext are outside the scope of this paper.
4.4 Formal definition of -SAKE security
We consider a public randomized keyed invertible transformation ”” that maps a plaintext to bit strings, that we call ”shares”:
[TABLE]
and such that the inverse is efficiently computable given all the shares and the key . We don’t explicitly note here the random component of the input: of used during the transform (e.g. the Initialization Vector in block cipher encryption, or the sponge header Bertoni et al. (2012) - at this point, we do not specify the encryption details). We say that is a - transformation if and only if:
(1) The following encryption scheme is ind-CPA secure: select once and for all a secret key at random, then on every plaintext , output all the shares .
(2) We now give a formal definition of the -SAKE property (2) based on the classical ind-CPA game. Let us fix an encryption key that will be used by the ”” transform and that will be supposed from now on known to the adversary (as we want to deal with the case of a key leakage).
The polynomial adversary has access at any time ( is said ”adaptive”) to an oracle which performs the encryption scheme applied inside of ”” (and its inverse ) on any (or ciphertext) of ’s choice. It also performs (or its inverse) on any plaintext (or set of shares) of ’s choice.
Here is the -SAKE security game (2) between and , that defines the ind-CPA security in the context of key exposure.
1. The adversary outputs to the oracle a pair of plaintexts and of the same length.
2. The oracle chooses randomly a bit $b$.
3. The oracle performs the randomized transformation that gives shares .
4. The adversary adaptively queries indices of shares and the oracle gives them to her.
5. The adversary outputs to the oracle a bit $b'$.
6. The output of the experiment is defined to be $1$ if $b' = b$ and $0$ otherwise. In the former case, we say that the adversary succeeds.
If the adversary has no non-negligible advantage in this game, with respect to the length of one block as security parameter, then we say that the transformation fulfills the property (2) of -SAKE.
A careful reader will notice that if the shares are badly chosen in some scheme, then this scheme cannot satisfy our security requirement (2) (e.g. if a share is always empty, or if two shares are always equal, then an adversary can recover everything from a strict subset of the shares). In this paper we only care about designing schemes that do satisfy it.
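The game above can be written as a small test harness. This is only a sketch under stated assumptions: the adversary is revealed all but one share (as in the informal statement of property (2)), the adversary interface (`choose_plaintexts`, `next_index`, `guess`) is hypothetical, and the oracle access to the transform is left implicit since the adversary holds the key. The deliberately broken `leaky_transform` only serves to exercise the harness.

```python
import secrets
from typing import Callable, Dict, List


class ChunkAdversary:
    """Toy adversary against a transform whose shares leak plaintext chunks."""

    def choose_plaintexts(self, key: bytes):
        self.m0, self.m1 = bytes(32), bytes([255]) * 32
        return self.m0, self.m1

    def next_index(self, seen: Dict[int, bytes]) -> int:
        # Adaptive choice: here simply the smallest index not yet seen.
        return len(seen)

    def guess(self, seen: Dict[int, bytes]) -> int:
        return 0 if seen[0] == self.m0[:len(seen[0])] else 1


def sake_game(transform: Callable[[bytes, bytes], List[bytes]],
              adversary, key: bytes) -> bool:
    m0, m1 = adversary.choose_plaintexts(key)   # step 1 (the key is exposed)
    assert len(m0) == len(m1)
    b = secrets.randbelow(2)                     # step 2
    shares = transform(key, (m0, m1)[b])         # step 3
    seen: Dict[int, bytes] = {}
    for _ in range(len(shares) - 1):             # step 4: all but one share
        idx = adversary.next_index(seen)
        seen[idx] = shares[idx]
    return adversary.guess(seen) == b            # steps 5 and 6


def leaky_transform(key: bytes, plaintext: bytes) -> List[bytes]:
    # Deliberately insecure: ignores the key and outputs plaintext chunks.
    return [plaintext[i:i + 8] for i in range(0, len(plaintext), 8)]
```

Running `sake_game(leaky_transform, ChunkAdversary(), key)` many times and counting successes gives an empirical estimate of the adversary's success rate; a transform satisfying property (2) should keep that rate close to one half.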
4.5 How to design more efficient schemes?
As regards the random parameter, although it is classically placed as a header of the ciphertext in classical ind-CPA encryption schemes, we stressed in §4.3 that the security property (2) completely breaks down if the adversary recovers it. Indeed, she could then decipher the beginning of the ciphertext with it. So, in particular, we have to hide it in such a manner that the shares available to the adversary give no information about it.
Note that all previous all-or-nothing schemes are based on the same pattern: perform a first transform ”AONT”, which is an encryption where the encryption key is hidden within the ciphertext (in a more size-efficient way than in Krawczyk’s SSMS). Then encrypt a second time, this time with a key that is kept by the user. In particular, Rivest’s, Boyko’s and Desai’s AON verify the - property with respect to the second key. In the Bastion Karame et al. (2018) scheme, as in our schemes, the Initialization Vector/random nonce used during the encryption plays the role of the ”first key” that is hidden within the ciphertext (Bastion) or shares (us). The real challenge in designing efficient schemes matching this double security objective (1) and (2), is to prevent attacks such as in §4.3. But just hiding the within the shares is not enough, since parts of ciphertexts can be distinguished if the encryption has bad randomness properties under key exposure: see §6.2. This is why we will still need a (small) post-processing of the ciphertext in .
4.6 Difference with the CAKE model
We point out here some remarks about the differences between the CAKE definition from Karame et al. (2018) and SAKE security properties. We discuss as well different possibilities concerning the size of the shares and their impact on the security of a secret sharing scheme.
The SAKE security property is clearly inspired by the CAKE property. We formulate therefore the two following correspondences between the two properties:
Remark 1 (From CAKE to SAKE).
To make it simple: consider a -CAKE scheme, such as the one of Karame et al. (2018), which outputs a ciphertext of length . Then in the highly typical cases where , splitting this ciphertext into shares of size at least two blocks each yields a -SAKE scheme. [Footnote 10: In full generality, a CAKE secure scheme is SAKE secure if the sum of blocks in any subsets of its shares does not exceed .]
Remark 2 (From SAKE to CAKE).
A -SAKE secure scheme is CAKE secure where denotes the minimum of all possible sums of combinations of of the shares. To make it simple: in the highly typical cases where , then a -SAKE scheme is only -CAKE. Indeed, if a CAKE adversary chooses ciphertext blocks in distinct shares, then SAKE doesn't guarantee anything.
For the sake of simplicity, we will define all shares in our schemes as having the same size. This definition could be modified in order to fit a non-uniform distribution of ciphertext among the shares. This would include cases like having two shares where one share contains blocks of the ciphertext and the other just one block (that could be useful for instance when wanting to have a large outsourced fragment of data in the cloud and one small fragment kept at the user’s device).
In a multi-cloud storage scenario, the number of shares is rather small (Bessani et al., 2013) due to the burden that comes with subscribing to a new storage provider. In a case where data is dispersed over multiple servers, it is rarely greater than 20-30 (Resch and Plank, 2011).
4.7 Nailing down the main technical difficulties
Let us consider a keyed permutation , which is at least a PRP as in §2 (or refer to (Katz and Lindell, 2014, Definition 3.24)), or even which satisfies the stronger notion of an ideal blockcipher. Then consider the following classical ”counter-mode” strings generator: fix a secret key once for all, then on every query, sample at random then output:
[TABLE]
These outputs are indistinguishable from uniform independent random bit strings. Said otherwise, construction (2) is a pseudorandom generator in the sense of (Katz and Lindell, 2014, Definition 3.24).
On the other hand, if the key is leaked to the adversary, then this is false (see Appendix A for a detailed explanation). In §6.2 we explain how a previous scheme fell into this trap in its proof under key exposure. Notice that a similar breakdown occurs if (2) is replaced by the classical cipher block chaining (CBC) mode of operation and, worse, this is then the case even if the keyed function is just a PRF (not a PRP anymore).
We will fix this problem with post-processing of the output in §5-§6, whereas in §7 we will fix it with the following trick: if we replace the construction (2) using a random oracle as follows: , then it is true that this string generator remains pseudorandom even when is leaked to the adversary. We will use this crucially in the proof of Main Theorem 2. Interestingly, we show in 7.3 that the construction (2) switched upside down [ becomes the secret key and the variable random input] is secure under leakage of the secret , when assuming an ideal random behavior (ideal blockcipher model) on .
5 SSAKE: fast block-cipher based CSS protecting against key exposure
We introduce a new computational secret sharing scheme denoted as SSAKE: Secret Sharing Against Key Exposure. In the next section we will prove that it satisfies the previously introduced security notion:
Main Theorem 1.
In the ideal blockcipher model, the scheme SSAKE satisfies SAKE security with respect to the key parameter used for the blockcipher encryption.
The advantage of this scheme is that it is versatile because it uses a standard blockcipher in the CTR mode as the encryption scheme inside the transformation (for instance the standard AES), so that the transformation can be applied on already encrypted data. The transformation is composed of four steps:
1. Encryption of the plaintext into a ciphertext using a blockcipher (like AES) in the CTR mode.
2. Transformation of the ciphertext using a linear transform creating dependencies between the blocks.
3. Splitting the transformed ciphertext into fragments composed of consecutive blocks.
4. Applying a perfect secret sharing scheme to the initialization vector used during the encryption to produce shares of it, and attaching these shares to the fragments: an output share is then the concatenation of a transformed ciphertext fragment and one share of the IV.
We consider a keyed blockcipher , which is a publicly known keyed function operating on blocks of size bits and which outputs bits (when designing the scheme we think of the most common symmetric encryption block cipher AES with blocks of 128 bits). Here, the secret key of the blockcipher is of size one block.
The algorithm starts with plaintext encryption using the blockcipher in Counter Mode (we chose this mode instead of CBC for parallelizability, and for the simplicity of Proposition 1). This results in a ciphertext composed of blocks. For the sake of completeness, let us recall that such encryption consists in one-time padding the vector (the plaintext with a zero block appended) with the following vector (which is not pseudorandom: see the explanation in Section 4.5, as well as §6.2 and Appendix A):
[TABLE]
generated from a "seed" block sampled at random. The output of this operation is the ciphertext [Footnote 11: We ask for a specific first block of ciphertext in order to be in the setting of Hypothesis 1. We could instead have stuck with the classical choice and modified Proposition 1 accordingly.]
[TABLE]
of blocks.
A noninvertible linear transform is then applied to the ciphertext: each ciphertext block (except the initial one) is XOR-ed with its predecessor. Additionally, the initialization vector block is split into shares using a perfect secret sharing scheme (PSS) with a given adversary threshold: for example, an additive secret sharing. The original ciphertext can thus be recovered from both the transformed ciphertext and the shared initialization vector.
The linear transformation of the into can be shown as right multiplication by the following noninvertible binary matrix of size :
[TABLE]
Transformed ciphertext is split into fragments. This can be done in various ways. The simplest way is to just create the fragments from large chunks of consecutive blocks. Consecutive blocks can be also dispersed over different fragments - this would reveal less information to an attacker that somehow managed to obtain the but has no knowledge about the plaintext.
For reconstruction: recover the initialization vector from all of its shares; then, for someone who knows the key, deduce the first ciphertext block, then deduce the following ciphertext blocks sequentially, and finally decipher them.
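The four steps and the reconstruction can be sketched end to end as follows. This is a minimal illustration under loudly stated assumptions, not the authors' implementation: the blockcipher is replaced by a toy keyed function `toy_prf` (HMAC-SHA256 truncated to one block, which suffices because counter mode only evaluates the cipher forwards), the initial ciphertext block is taken to be the cipher applied to the IV (our reading of the reconstruction procedure), fragments are consecutive blocks, and the PSS is an XOR-based additive sharing. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # block size in bytes, as for AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_prf(key: bytes, counter_value: int) -> bytes:
    # Stand-in for the blockcipher under `key` (NOT AES; demo only).
    data = (counter_value % (1 << (8 * BLOCK))).to_bytes(BLOCK, "big")
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]


def share_iv(iv: bytes, n: int) -> list:
    # Additive (XOR) sharing: all n shares are needed to recover the IV.
    shares = [secrets.token_bytes(BLOCK) for _ in range(n - 1)]
    last = iv
    for s in shares:
        last = xor(last, s)
    return shares + [last]


def ssake_split(key: bytes, plaintext: bytes, n_shares: int) -> list:
    blocks = [plaintext[i:i + BLOCK] for i in range(0, len(plaintext), BLOCK)]
    if len(plaintext) % BLOCK or len(blocks) % n_shares:
        raise ValueError("sketch assumes whole blocks, evenly divided among shares")
    iv = secrets.token_bytes(BLOCK)
    base = int.from_bytes(iv, "big")
    # Step 1: counter-mode encryption; c[0] is the assumed initial block.
    c = [toy_prf(key, base)]
    c += [xor(m, toy_prf(key, base + i)) for i, m in enumerate(blocks, start=1)]
    # Step 2: XOR each block with its predecessor (drops one block).
    c_prime = [xor(c[i], c[i - 1]) for i in range(1, len(c))]
    # Steps 3 and 4: consecutive fragments, each paired with one IV share.
    per = len(c_prime) // n_shares
    iv_shares = share_iv(iv, n_shares)
    return [(b"".join(c_prime[j * per:(j + 1) * per]), iv_shares[j])
            for j in range(n_shares)]


def ssake_join(key: bytes, shares: list) -> bytes:
    iv = bytes(BLOCK)
    for _, s in shares:
        iv = xor(iv, s)
    base = int.from_bytes(iv, "big")
    data = b"".join(frag for frag, _ in shares)
    prev, out = toy_prf(key, base), []
    for i in range(len(data) // BLOCK):
        cur = xor(data[i * BLOCK:(i + 1) * BLOCK], prev)  # undo the chaining
        out.append(xor(cur, toy_prf(key, base + i + 1)))  # undo the pad
        prev = cur
    return b"".join(out)
```

Note that the linear step costs one XOR per block, and that the per-share overhead is a single block (the IV share), independent of the plaintext length.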
6 Security Analysis of the SSAKE scheme (and of previous work)
We first prove in §6.1 that SSAKE achieves SAKE security. Then, in §6.2, we revisit the security proofs of Karame et al. (2018) and highlight, one more time, the technical issues raised by the setting of key exposure. Finally, we single out in §6.3 a sufficient hypothesis under which the scheme of Karame et al. (2018) is secure.
6.1 Proof of Main Theorem 1
(1) The scheme is trivially ind-CPA secure against an adversary who does not know the secret key and is given all the shares. Indeed, the shares are the result of a public transformation (the linear transform-then-sharing) whose sole input is the ciphertext. But this ciphertext itself is the plaintext encrypted with the classical counter-mode blockcipher encryption scheme under the secret key. Thus any two ciphertexts, and hence any two transforms of them by a public transformation, are well known to be indistinguishable under chosen plaintext attacks (see e.g. (Katz and Lindell, 2014, Theorem 3.30)).
The proof of (2) is based on the following indistinguishability property:
6.1.1 Main technical result
We place ourselves in the ideal permutation model (RPM) recalled in §2.
Proposition 1 (Uniformity of differentials of a random value in the RPM).
Let be a public fixed ideal random permutation of size , a fixed public integer and be a fixed sequence of numbers, then consider the following procedure executed by an oracle : generate uniformly at random, then output:
[TABLE]
Then the output of this procedure is indistinguishable from the output of a generator of random strings of length blocks, for a polynomial adversary having access to the public resource , with respect to the security parameter .
The core idea of the proof is the following: the classical padding vector of the counter mode of encryption is not safe when the permutation is made accessible to the adversary (= key exposure), since she can then invert any block of this vector by querying the inverse permutation and recover the initialization vector, which lets her distinguish a padding vector from pure randomness. So what we do in SSAKE is translate this vector by a random block. The issue is that we sample this random block by querying the permutation itself on the initialization vector. But recall that the permutation behaves as a random oracle (modulo avoiding collisions). Thus, as long as this input was not queried before (nor reached through an inverse query), the permutation does output a value uniformly at random, so this gives no advantage to the adversary.
Proof.
We consider a cascade of games, where the view of adversary is the same between two consecutive games up to negligible events. The first game Game1 is where faces a true random string generator, whereas the last one Game6 is the actual oracle. Let us say that, in each game, our bounded adversary makes a total of queries to and , and of queries to the challenging oracle .
Game1 On each query to the challenging oracle , say the -th query, sends to a sequence of blocks sampled uniformly at random.
Game2 On each query to the challenging oracle , say the -th query, generates a sequence of random blocks and returns to the sequence . The view of is the same as in the previous game.
Game3 The challenging oracle has the same behavior as in the previous game, but the permutation oracle becomes ”lazy”: it doesn’t check anymore for collisions between two outputs of different inputs, so outputs uniformly at random when queried on a new input. Such a collision event happens in the game with probability , so is negligible.
Game4 The challenging oracle has the same behavior as in the previous game, but it sometimes ”crashes”, i.e. returns no answer. Namely, on each query, in addition to sampling the values , also samples ”for herself” a random block and crashes if: either one of the elements in the sequence (i) was already queried by the adversary to , or (i’) overlaps with a previously sampled sequence; or, if one of the elements in the sequence (ii) was already queried by the adversary to , or (ii’) overlaps with a previously sampled sequence. It is straightforward that (i) and (i’) happen with probabilities , resp. roughly (see (Katz and Lindell, 2014, pf of Thm 3.30) for thinner birthday paradox estimations) because the are not related in any manner with the outputs given to the adversary. On the other hand, one can actually see that (ii) and (ii’) also happen with probabilities resp. roughly , because the set of ”bad” values are translated by a uniform random before being given to the adversary, thus the possible sets of ”bad” values are equidistributed from the point of view of the adversary. This is the core idea of the proof.
Game5 The challenging oracle has the same behavior as in the previous game except that, this time, the values are sampled by querying the uniform oracle: . The values are thus still sampled uniformly at random, since, outside of ”crash” events, they were not queried before to (neither by nor ). Thus, the view of the adversary is the same as in the previous game.
Game6 Finally, the permutation oracle checks again for collisions, thus its outputs are not completely uniform anymore, introducing a negligible difference of views for the adversary compared to the previous game. ∎
6.1.2 End of the proof of Main theorem 1
(2) Let us consider an adversary who is given the key. Let us consider the SAKE game of §4.4: she chooses two plaintexts and is given shares of one of them, as produced by the scheme. Her view consists of shares of the initialization vector and fragments of the transformed ciphertext. First, one can reason as if she did not receive the shares of the initialization vector under the PSS. [Footnote 12: Because these shares are indistinguishable from uniformly distributed random values, as can be seen e.g. from Shamir's scheme or an additive secret-sharing scheme. One can also see this more formally, since a PSS is universally composable (see (Cramer et al., 2015, chap. 4)), so it can be formally replaced in any protocol by a black box that gives no information to an adversary accessing up to the threshold number of shares.]
Then, consider a more advantageous Game2 where the adversary is given all of
[TABLE]
Since she has strictly more information in this game, her guessing advantage is at least as large.
Consider finally Game3, where the adversary is instead given , obtained from with the following successive operations (that one can also see as arising from elementary columns operations on ):
[TABLE]
Since these operations are reversible, the adversary in Game3 can recover the view of the adversary in Game2, so her advantage is at least as large (actually equal). But, with the notations of Proposition 1, an adversary in Game3 receives exactly
[TABLE]
so whatever the value of the bit, she has negligible advantage in distinguishing what she receives from random, by Proposition 1. A fortiori, she cannot distinguish between the two possibilities.
6.2 Revisiting the security analysis of previous works and the issue of pseudorandomness under key exposure
The proof in Karame et al. (2018) follows the template of the proof of ind-CPA security for CTR mode, as done e.g. in (Katz and Lindell, 2014, Theorem 3.32). It considers the ind-CPA CAKE game as defined in (Karame et al., 2018, §3.2), where the adversary knows the key and plays against an oracle which uses a blockcipher with this fixed key. Then it shows that, if such an adversary could win the ind-CPA CAKE game with non-negligible advantage, then she could distinguish between the actual "pseudorandom" permutation and a truly random one, just as in (Katz and Lindell, 2014, p. 91). It then derives a contradiction, and concludes that the adversary cannot win the CAKE game.
Although this strategy works in (Katz and Lindell, 2014, Theorem 3.32) [Footnote 13: "Intuitively, such a "gap" (if present) would enable us to distinguish the pseudorandom function from a truly random function. Formally, we prove this via reduction."], the problem here is that there is no longer any contradiction in this context of key exposure. Indeed, our adversary knows the permutation perfectly, because she knows the key, so she can trivially distinguish it from a truly random one. Hence this does not disprove anything about the adversary's ability to win the CAKE game. This is why, in our proof of Main Theorem 1, we needed to choose a completely different strategy and crucially relied on the RPM assumption for the blockcipher with a fixed public key.
The proof in Karame et al. (2018) relies on the statement that the output of the CTR is pseudorandom. However, this statement is false. On the face of it, it actually has no precise meaning. [Footnote 14: As Katz-Lindell notice under their (Katz and Lindell, 2014, Definition 3.25), "it is meaningless to say that [the function] is pseudo-random if the key is known".] But the problem is actually deeper: to exemplify it, we show in Appendix A that, when the key of the blockcipher in counter mode is known, the output is not random-looking even if the adversary doesn't know the counter. We also sketched this problem in the idea of the proof of Proposition 1. The random-looking property is broken even from the point of view of an adversary who knows only any two output blocks of CTR. The same argument holds for any two consecutive output blocks of CBC.
6.3 Minimal sufficient hypotheses for Bastion’s security
As we just saw, public knowledge of the inverse makes the output of CTR appear nonrandom. Nevertheless, we prove in Appendix B the following Proposition 2, which states that the security of Bastion holds under a certain pseudorandomness hypothesis.
Unlike the notations of Karame et al. (2018), we keep our indices of ciphertext blocks which run from [math] to , in particular we stick to our more traditional convention that is the initialization vector (instead of in loc. cit.). We also keep our convention where is the transformed ciphertext.
Proposition 2 (Under the following hypothesis, Bastion satisfies CAKE security).
Sample once for all a key which will be used in the blockcipher and give it to the adversary, so that she has access to the blockcipher and its inverse under this key. Then, on every query of the adversary, sample a (secret) initialization vector uniformly at random, and for all possible tuples of distinct indices chosen by the adversary [Footnote 15: The two columns that she cannot see; the number of possibilities depends on her choice.], output to her the following. Choose arbitrarily an index [Footnote 16: The pivot column, as in the example in the proof of Appendix B.] distinct from these, which we give to her, then output to her the vector defined by:
[TABLE]
(where one should read, instead of , just ). Then Bastion’s security is guaranteed if the adversary cannot distinguish this output from a random bit string of the same size.
7 Key exposure resistance with no overhead on top of encryption
We first borrow in §7.1 a very simple ind-CPA encryption scheme based on random oracles from the paper Bertoni et al. (2012). Then we tweak it for our purpose of secret sharing under key exposure in §7.2, where we prove that the resulting scheme is SAKE secure. We describe an instantiation with blockciphers of two-blocks-long keys (such as AES256 for a security parameter of 128) in §7.3, and finally describe in 7.4 the instantiation which we implemented with Keccak, along with the choice of consistent security parameters.
7.1 Baseline encryption
We consider the most elementary case of the encryption scheme defined in (Bertoni et al., 2012, §2). [Footnote 17: Technically we revisit the ideal vanilla case of their construction in (Bertoni et al., 2012, §2), under the simplest setting, where we consider just one single secret body. Let us recall for the interested reader that this scheme is then tweaked into authenticated Duplex mode: see their (Bertoni et al., 2012, Lemma 6), with equivalent indistinguishability, then instantiated with an ideal sponge.] Recall from §2 that we consider a public random oracle $\mathcal{O}\colon \mathbf{F}_{2}^{2|B|} \longrightarrow \bigl(\mathbf{F}_{2}^{|B|}\bigr)^{\infty}$, together with its restriction to a prescribed number of first output blocks.
Definition 1.
Sample once for all a secret key uniformly at random. Then, on an input plaintext of given length, sample a header uniformly at random and output the "one-time-padding":
[TABLE]
Proposition 1.
The scheme of Definition 1 is an ind-CPA secure encryption scheme with respect to its secret key.
This actually follows from (Katz and Lindell, 2014, Theorem 3.26), because the oracle is in particular a pseudorandom function with respect to the key and the header taken as input.
7.2 ROSSake scheme
Our scheme ROSSake is described in the following figure. Although the fragments could be chosen completely arbitrarily, as long as they don’t overlap and form a partition of the ciphertext, for simplicity we will assume as in §2 that is divisible by , and that the fragments are formed by gathering consecutive blocks.
Main Theorem 2.
Protocol ROSSake is a SAKE scheme with respect to the secret key.
Proof.
(1) Let us consider an adversary who ignores the key but has all shares. Then ind-CPA of the scheme follows from Proposition 1. (2) Let us consider an adversary who knows the key and who plays the ind-CPA game of requirement (2) for -SAKE security. On every query to the challenge oracle on two plaintexts , , the adversary obtains shares of one of the chosen plaintexts generated by with the same key . [Throughout the game, the adversary can also query on any plaintext she wants.] By the property of information theoretical secret sharing, the shares given to the adversary are indistinguishable from random strings independent from all the rest of the experiment, so she can ignore them and concentrate on the fragments that she receives. To make the argument clearer, let us give more power to the adversary and also give to her the last missing fragment but not the corresponding missing share of the (otherwise she could reconstruct the secret).
The core idea of the proof is that, in this context where the key is exposed, the publicly known function obtained by fixing the key in the oracle is again a random oracle with respect to its remaining input (for the reader who is not convinced, we refer to the overkill domain-separation argument for random oracles done in (Coron et al., 2005, §5)).
So, from the point of view of this more powerful adversary, the ind-CPA game (2) of the definition of SAKE, with multiple queries on the same public key, consists exactly in the following one-query encryption game "ind1" with the following encryption challenger oracle. The oracle selects a random oracle once for all and gives it to the adversary. Then, on every query of two plaintexts from the adversary, the challenge oracle selects a bit, samples a new secret header uniformly at random (which plays the role of a secret key in an ind1 game!), then returns to the adversary the corresponding ciphertext. But in particular, this function of the header is a pseudorandom generator in the sense of (Katz and Lindell, 2014, Definition 3.15) [Footnote 18: As we notice in §4.7, this core argument completely breaks down in general if we replace the random oracle by the weaker notion of a mere pseudorandom function.], so that we fall back on the scheme of (Katz and Lindell, 2014, Theorem 17), which is proven to be (more than) ind1 secure. Thus the adversary has negligible advantage in the game. ∎
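A minimal sketch of the scheme as it can be read from Definition 1 and the proof above: one-time pad the plaintext with the oracle output on the key and a fresh header, cut the ciphertext into fragments, and share the header with a perfect secret sharing. The figure describing ROSSake is not reproduced here, so this structure is our assumption. SHAKE256 from the Python standard library stands in for the random oracle (the paper's own instantiation uses Keccak through Lake Keyak), the sharing is XOR-based additive, and all function names are hypothetical.

```python
import hashlib
import secrets

BLOCK = 16  # header size in bytes (one block)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def oracle(key: bytes, header: bytes, n_bytes: int) -> bytes:
    # SHAKE256 as a stand-in for the random oracle evaluated on key || header.
    return hashlib.shake_256(key + header).digest(n_bytes)


def share_header(header: bytes, n: int) -> list:
    # Additive (XOR) sharing: fewer than n shares reveal nothing about the header.
    shares = [secrets.token_bytes(BLOCK) for _ in range(n - 1)]
    last = header
    for s in shares:
        last = xor(last, s)
    return shares + [last]


def rossake_split(key: bytes, plaintext: bytes, n_shares: int) -> list:
    if len(plaintext) % n_shares:
        raise ValueError("sketch assumes the length divides evenly among shares")
    header = secrets.token_bytes(BLOCK)
    ciphertext = xor(plaintext, oracle(key, header, len(plaintext)))
    size = len(ciphertext) // n_shares
    header_shares = share_header(header, n_shares)
    return [(ciphertext[j * size:(j + 1) * size], header_shares[j])
            for j in range(n_shares)]


def rossake_join(key: bytes, shares: list) -> bytes:
    header = bytes(BLOCK)
    for _, s in shares:
        header = xor(header, s)
    ciphertext = b"".join(frag for frag, _ in shares)
    return xor(ciphertext, oracle(key, header, len(ciphertext)))
```

Compared with plain encryption, the only extra work is the sharing of one header block, which matches the claim that no transform over the whole ciphertext is needed.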
7.3 Can we instantiate this scheme from AES ?
A convenient way to build a random oracle with input of constant size, two blocks in our use-case, consists morally in switching upside-down the construction of equation (2) [see the comments at the end of §4.7]. [Footnote 19: This construction is suggested by the NIST: https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-90Ar1.pdf] Let us formalize this in the ideal blockcipher model.
Proposition 2.
Consider an ideal blockcipher . Let us consider a fixed public counter, e.g. . Then the following function is a random oracle with input size the keysize of :
[TABLE]
Importantly, for our purpose we thus use an ideal blockcipher with key of size two blocks: twice larger than the key of the final scheme obtained. Thus the exponent , to differentiate it from the of the previous scheme of keysize one block.
Proof.
The proposition results from the definition of an ideal blockcipher. Indeed for each new input , then a fresh new random permutation oracle is created, so that the outputs on when queried on the counter are uniform and independent from the outputs of previously queried oracles with other keys . ∎
We swept under the rug the fact that the distribution of outputs is not totally uniform, since the permutation oracle avoids collisions (e.g. under a given key, it will select each new output outside of its previous outputs). But this bias is by definition negligible. For example, if the oracle of Proposition 2 were used in place of the construction (2) to produce random-looking strings, then this bias would give no more than a negligible advantage to an adversary that makes a suitably bounded number of queries. A formal argument for this detail can be found along the lines of the proof of the PRF/PRP Lemma, see e.g. (Maurer et al., 2004, §7).
In conclusion, for our purpose in the case where the security parameter is one block of size 128 bits, then AES256 has indeed key of size two blocks (256), thus by the previous proposition it does the job in the ideal blockcipher model .
7.4 Our instantiation with Keccak
Assuming the existence of a fixed random permutation, the authors of Keccak proved the existence of a random oracle of arbitrary input and output size in Bertoni et al. (2008) (adopted for SHA3). From it they built an encryption scheme "SpongeWrap" which satisfies a pseudorandomness property. Namely, (Bertoni et al., 2012, Theorem 1) can be read in the light of (Katz and Lindell, 2014, Definition 3.28) as stating that, under the RPM, SpongeWrap is indistinguishable from a very large keyed pseudorandom permutation (from its input domain onto its image).
For mere privacy (not authentication) concerns, the security parameters of Bertoni et al. (2012) are the length of the key and the "capacity" of the sponge (the symbol used there for the capacity does not denote the ciphertext's size!). The first formula of (Bertoni et al., 2012, Theorem 1) then bounds the distinguishing advantage of an adversary allowed to do a negligible number of queries with respect to these two parameters. For the sake of comparison with the schemes Bastion and SSAKE that we instantiated with AES128, we thus chose matching security parameters (and bit rate) for Keccak in our simulations in §8.2. We set the number of rounds of the permutation as recommended by the guidelines. [Footnote 20: Note on Keccak parameters and usage, NIST hash forum, 2010.]
In conclusion: Keccak might seem overkill to instantiate a random oracle with constant small input size, but it is our preferred choice since it performed as well as blockciphers, and the Duplex mode also enables authentication of the ciphertext with no additional round of encryption, which is interesting for the integrity/robustness issues of secret sharing.
8 Comparison with Relevant Works
In Table 1 we compare the SSAKE and ROSSake schemes with relevant works in terms of complexity and memory size, and with regard to their compliance with our security properties. After commenting on Table 1 and detailing how the selected schemes were implemented, we present their experimental throughput in Figure 3.
8.1 Complexity and storage
We compare the SSAKE and ROSSake schemes in terms of complexity with relevant works: Krawczyk's SSMS, Rivest's and Desai's AONT and AON, and Bastion. The CTR encryption is used as a baseline. The important difference between the new generation of schemes protecting against key exposure and former algorithms is the number of encryption rounds. SSAKE, ROSSake, and Bastion use only a single encryption round, while Rivest's and Desai's AON apply two or more rounds.
Note that Krawczyk's SSMS and Desai's AONT use a single round but do not protect against key exposure: see §4.3 for a discussion. By contrast, Rivest's and Desai's AON are CAKE, but require two encryption rounds: see §4.5. Finally, see §4.6 for how a CAKE scheme is SAKE for free, provided the condition on the share sizes stated there holds, which explains our last column. See also §4.6 for the direction from SAKE to CAKE, which explains the CAKE entries for our schemes.
Both Bastion and SSAKE use a linear transform post-processing on top of encryption. The difference is that the SSAKE transform uses half as many XORs as Bastion's. The Bastion scheme applies a linear transform over the encrypted data, whereas SSAKE applies only a single pass of XORs over the ciphertext and an additive PSS over the initialization vector (which has a negligible impact on the scheme performance when the data is large).
ROSSake does not require applying a transform over the whole ciphertext. In addition to data encryption, it needs only a number of XORs that grows with the number of shares, which is negligible since this number is no more than 20-30 when shares are distributed over multiple servers inside a data center (in a multi-cloud scenario it would be no more than 4).
The schemes of Rivest, Desai, and Karame et al. (2018) produce a total stored data (= all the shares) of a size equal to the ciphertext's. By contrast, the schemes SSMS, SSAKE, and ROSSake require a few additional blocks of storage space (= secret sharing of the key in SSMS, or of the initialization vector or header in our schemes), which is constant with regard to the size of the original data. Therefore, this overhead is negligible for large files.
8.2 Performance evaluation
Relevant algorithms were implemented using the same programming style in Java with JDK 1.8 on a DELL Latitude E6540, an x64-based PC with an Intel® Core™ i7-4800MQ CPU @ 2.70 GHz and 8 GB RAM, running Windows 7. The standard library was used, and the official Keccak implementation was used for Lake Keyak [Footnote 21: https://keccak.team/keyak.html] to instantiate ROSSake: we refer to §7.4 for our choices of consistent security parameters. A random data sample was used for each measurement and each presented result is an average of 30 measurements. AES-CTR-128 was used as the algorithm for symmetric encryption. AES-NI was enabled. Results are somewhat consistent with those presented in Karame et al. (2018) when taking into account the difference between AES and AES-NI (a factor of 3 in performance was observed in our implementations) as well as differences between hardware platforms.
As shown in Figure 3, SSAKE is the fastest among the schemes protecting encrypted data against key exposure that are based on mainstream symmetric encryption with AES. Protection against key leakage is achieved with an overhead of only 7% compared to simple data encryption. The second fastest scheme in this respect, Bastion, results in an overhead of around 19% in comparison to data encryption. Our scheme ROSSake, implemented with the less conventional Keccak-based encryption, turns out to be the fastest of all schemes.
9 Conclusions
In this paper we revisited computational secret sharing aiming at (1) protecting data against an adversary possessing all the shares but not the encryption key, (2) protecting data against an adversary possessing the encryption key and all but one share, and at the same time (3) respecting performance, storage occupation, and scalability requirements. We introduced a new security model inspired by the CAKE model and adapted to take into account the dispersion of shares in a distributed environment. The privacy threat under key exposure is defined in terms of the number of revealed shares and not, as so far in the all-or-nothing literature, the number of revealed ciphertext blocks. We presented a versatile computational secret sharing scheme, SSAKE, that satisfies both security properties (1) and (2) of this new model. Complexity and empirical evaluations show that it performs faster than the fastest known relevant scheme, Bastion, since the linear overhead is reduced by half, achieving point (3) above. We provided a detailed security analysis of not only the presented scheme but also of previous relevant work. Finally, we presented and implemented an alternative random-oracle based scheme, which can be instantiated with blockciphers (such as AES256 for a security parameter of 128) or e.g. with Keccak, and which requires no processing overhead on top of encryption and turned out to be the fastest of all.
Acknowledgements
We would like to thank Prof. Srdjan Capkun for inviting us to a seminar at ETH Zurich and for a helpful discussion on the topic. We thank Marc Stevens, Mohamed Tahar Hammi and Han Qiu for very fruitful discussions. We would also like to thank the journalist Ingrid Fadelli for inviting us to do an interview for TechXplore (https://techxplore.com/news/2019-02-circular-all-or-nothing-approach-key-exposure.html). The second author was (partially) funded by ANR under Grant ANR-15-CE39-0013-01 “manta”. Some of the results in this paper were presented in December 2018 at a seminar at LINCS Paris.
Appendix A Non-pseudorandomness of any two fragments of CTR output when the key is public
Here we restrict ourselves to the output of CTR from the point of view of an adversary who has only two ciphertext blocks and the encryption key, and show that it is not pseudorandom.
Consider an adversary who chooses a plaintext and whose goal is to distinguish between CTR encryption and a random string generator. To make the challenge more difficult, suppose that the adversary is given only two output blocks, of indices and (if one of them is zero, then the game is even easier). Call and the output blocks given by the challenging oracle to . If and are actual outputs of CTR, then we have that there exists a certain such that
[TABLE]
The winning strategy of is now clear: compute and compare with : if equal then return ”CTR”, otherwise return ”random”.
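The distinguisher described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blockcipher is a toy 4-round Feistel network built from SHA-256 (a hypothetical stand-in for AES, chosen so the example needs no third-party library), the counter block at index i is assumed to be (IV + i) mod 2^128, and all function names are invented for the example. With the key public, the adversary inverts the blockcipher on the first block to recover the counter, recomputes what the second block must be, and compares.

```python
import hashlib
import os

BLOCK = 16          # toy block size in bytes (assumption, mirrors AES)
HALF = BLOCK // 2
MODULUS = 1 << (8 * BLOCK)


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, r: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (NOT a real blockcipher).
    return hashlib.sha256(key + bytes([r]) + half).digest()[:HALF]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in range(4):
        left, right = right, _xor(left, _round(key, r, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(4)):
        left, right = _xor(right, _round(key, r, left)), left
    return left + right


def counter_block(iv: int, i: int) -> bytes:
    # Assumed counter convention: block i is encrypted under IV + i.
    return ((iv + i) % MODULUS).to_bytes(BLOCK, "big")


def ctr_encrypt(key: bytes, iv: int, blocks: list) -> list:
    return [_xor(m, encrypt_block(key, counter_block(iv, i)))
            for i, m in enumerate(blocks)]


def distinguish(key, i, j, m_i, m_j, y_i, y_j) -> str:
    # Step 1: the key is public, so invert the cipher on y_i XOR m_i
    # to obtain the counter used at index i, hence the candidate IV.
    ctr_i = int.from_bytes(decrypt_block(key, _xor(y_i, m_i)), "big")
    iv = (ctr_i - i) % MODULUS
    # Step 2: recompute the block expected at index j and compare.
    expected = _xor(m_j, encrypt_block(key, counter_block(iv, j)))
    return "CTR" if expected == y_j else "random"
```

On genuine CTR output the check always succeeds, whereas two uniformly random blocks pass it only with probability 2^-128, so two blocks and the key are enough to tell the two cases apart.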
Appendix B Proof of Proposition 2
Let us recall the matrix of the linear transform of Bastion, here for a transformed of size :
[TABLE]
The last columns are separated from the first two to illustrate the view of an adversary who would always ask to see during the CAKE game. Performing elementary column operations inside the columns of the adversary, as in the proof of (§6.1):
[TABLE]
we obtain the adversary’s view on the right of the vertical separator:
[TABLE]
which is equivalent to the former, from the point of view of an adversary seeing only the last columns: . (Let us repeat the argument of §6.1: from this new view one can deduce the former one, so the guessing advantage of the adversary is at least as large as, and in fact equal to, that with the previous view.) As in §6.1, is equal to a value known to the adversary (deduced from or by transforming them according to the previous matrix), plus , which is assumed indistinguishable from a random uniform vector. Finally, we must loop over all the possible tuples of columns among that chooses to see. We thus end up asking for the distinct indistinguishability assumptions of vectors , as formulated in Proposition 2.
Appendix C More on AONT’s
Rivest [1997a] introduced the first known AONT, which is meant to be a pre-processing step applied before data encryption. During this AONT, input data is encrypted into ciphertext using a first random key . A hash of the encrypted data is then computed, XOR-ed with , and appended as the last block of the ciphertext . After such a data transform, it is not possible to obtain the right hash, and consequently the encryption key, without possessing the whole ciphertext. However, an attacker knowing can obviously decrypt a fragment of the ciphertext in her possession. Therefore, Rivest suggests re-encrypting the transformed data one more time with a different encryption key . The data will then be protected against the exposure of the first or second key (but not both of them) unless the whole ciphertext is revealed.
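The pre-processing step described in this paragraph can be illustrated with a minimal Python sketch. It follows the description given here (encrypt under a fresh random key, hash the result, XOR the hash with the key, append it as the last block) rather than the exact blockcipher-based construction of the original paper. The stream cipher is a hypothetical stand-in built from SHA-256 in counter mode, and the names `aont_transform` and `aont_invert` are invented for the example.

```python
import hashlib
import os

KEY_LEN = 32  # length of the inner random key, equal to the SHA-256 digest size


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _keystream(key: bytes, n: int) -> bytes:
    # Stand-in stream cipher (assumption): SHA-256 over key || counter.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def aont_transform(data: bytes) -> bytes:
    inner_key = os.urandom(KEY_LEN)                      # first random key
    body = _xor(data, _keystream(inner_key, len(data)))  # encrypted data
    digest = hashlib.sha256(body).digest()               # hash of encrypted data
    return body + _xor(inner_key, digest)                # key masked by the hash


def aont_invert(package: bytes) -> bytes:
    body, last = package[:-KEY_LEN], package[-KEY_LEN:]
    # The whole body is needed to recompute the hash and unmask the key.
    inner_key = _xor(last, hashlib.sha256(body).digest())
    return _xor(body, _keystream(inner_key, len(body)))
```

Inverting the transform requires hashing the complete body, so a holder of only part of the output cannot unmask the inner key. As the paragraph notes, this alone does not help once the inner key itself leaks, which is why a second encryption layer is suggested.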
Later, Boyko formalized the all-or-nothing transform [Boyko, 1999, Definition 2]. The numerous subsequent variants of AONT are encompassed by the following loose definition: it consists of a randomized map , which maps an input of fixed size bits to an output of fixed size bits, such that a polynomial adversary has negligible advantage in the following indistinguishability game, where is a set of missing bit positions in . "Negligible" is to be understood with respect to the security parameters: the length of a certain symmetric key (underlying ) and the number of missing bits .
- has access to and is given (or possibly adaptively chooses ). She outputs two plaintexts and .
- is given , with bit positions missing, and is a random index unknown to the adversary.
- To win the game, has to successfully guess the value of .
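The three steps of this game can be written as a small Python harness. This is a sketch under simplifying assumptions: missing positions are counted in bytes rather than bits, the set of missing positions is fixed in advance, and the names `run_game` and `FirstVisibleByteAdversary` are hypothetical. The example adversary is only meant to show how a transform that is not all-or-nothing (here, the identity map) is broken with certainty.

```python
import secrets


def run_game(transform, adversary, missing, trials=200):
    """Play the indistinguishability game and return the adversary's win rate."""
    wins = 0
    for _ in range(trials):
        m0, m1 = adversary.choose()           # adversary outputs two plaintexts
        b = secrets.randbelow(2)              # random index, hidden from adversary
        out = transform((m0, m1)[b])
        # The adversary sees the output with the chosen positions erased.
        view = [None if i in missing else byte for i, byte in enumerate(out)]
        wins += int(adversary.guess(view) == b)
    return wins / trials


class FirstVisibleByteAdversary:
    """Distinguishes all-zero from all-one inputs using any single visible byte."""

    def __init__(self, size: int):
        self.size = size

    def choose(self):
        return bytes(self.size), b"\xff" * self.size

    def guess(self, view):
        visible = next(v for v in view if v is not None)
        return 0 if visible == 0 else 1
```

For instance, `run_game(lambda m: m, FirstVisibleByteAdversary(16), missing={0, 1, 2})` pits this adversary against the identity map, which leaks every visible byte. For a transform meeting the definition above, the win rate of any efficient adversary should instead stay close to 1/2.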
References
- Barker [2016] Elaine Barker. 2016. Recommendation for Key Management. NIST Special Publication 800-57, Part 1 rev 4 (2016), 1–147. http://dx.doi.org/10.6028/NIST.SP.800-57pt1r4
- Bertoni et al. [2008] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. 2008. On the Indifferentiability of the Sponge Construction. In Advances in Cryptology – EUROCRYPT 2008, Nigel Smart (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 181–197.
- Bertoni et al. [2012] Guido Bertoni, Joan Daemen, Michaël Peeters, and Gilles Van Assche. 2012. Duplexing the Sponge: Single-Pass Authenticated Encryption and Other Applications. In Selected Areas in Cryptography, Ali Miri and Serge Vaudenay (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 320–337.
- Bessani et al. [2013] Alysson Bessani, Miguel Correia, Bruno Quaresma, Fernando André, and Paulo Sousa. 2013. DepSky: Dependable and Secure Storage in a Cloud-of-Clouds. Trans. Storage 9, 4, Article 12 (Nov. 2013), 33 pages. https://doi.org/10.1145/2535929
- Black et al. [2002] John Black, Phillip Rogaway, and Thomas Shrimpton. 2002. Black-Box Analysis of the Block-Cipher-Based Hash-Function Constructions from PGV. In Proceedings of the 22nd Annual International Cryptology Conference on Advances in Cryptology (CRYPTO ’02). Springer-Verlag, Berlin, Heidelberg, 320–335.
- Blakley [1979] George R. Blakley. 1979. Safeguarding Cryptographic Keys. In Proceedings of the 1979 AFIPS National Computer Conference, Vol. 48. 313–317.
- Boyko [1999] Victor Boyko. 1999. On the Security Properties of OAEP As an All-or-Nothing Transform. In Proceedings of the 19th Annual International Cryptology Conference on Advances in Cryptology (CRYPTO ’99). Springer-Verlag, Berlin, Heidelberg, 503–518. http://dl.acm.org/citation.cfm?id=646764.703962
