Hierarchical Coding to Enable Scalability and Flexibility in Heterogeneous Cloud Storage
Siyi Yang, Ahmed Hareedy, Robert Calderbank, Lara Dolecek

TL;DR
This paper introduces hierarchical coding schemes for heterogeneous cloud storage that enhance scalability and flexibility while maintaining small field sizes, using novel multi-level constructions based on Cauchy Reed-Solomon codes.
Contribution
It presents the first hierarchical locality codes that enable scalable and flexible cloud storage with small field sizes through innovative multi-level code constructions.
Findings
First hierarchical locality codes for scalable cloud storage.
Double and triple-level constructions based on Cauchy Reed-Solomon codes.
Scalable, flexible coding schemes adaptable to multiple layers.
Siyi Yang1, Ahmed Hareedy2, Robert Calderbank2, and Lara Dolecek1
1 Electrical and Computer Engineering Department, University of California, Los Angeles, Los Angeles, CA 90095 USA
2 Electrical and Computer Engineering Department, Duke University, Durham, NC 27705 USA
Abstract
In order to accommodate the ever-growing data from various, possibly independent, sources and the dynamic nature of data usage rates in practical applications, modern cloud data storage systems are required to be scalable, flexible, and heterogeneous. Codes with hierarchical locality have been intensively studied due to their effectiveness in reducing the average reading time in cloud storage. In this paper, we present the first codes with hierarchical locality that achieve scalability and flexibility in heterogeneous cloud storage using small field size. We propose a double-level construction utilizing so-called Cauchy Reed-Solomon codes. We then develop a triple-level construction based on this double-level code; this construction can be easily generalized into any hierarchical structure with a greater number of layers since it naturally achieves scalability in the cloud storage systems.
I Introduction
Codes offering hierarchical locality have been intensely studied because of their ability to reduce the average reading time in various erasure-resilient data storage applications including Flash storage, redundant array of independent disks (RAID) storage, cloud storage, etc. [1, 2, 3]. Codes with shorter block lengths offer lower latency, but they provide limited erasure-correction capability in a cloud storage system. To deal with more erasures, longer codes can be employed. However, since a simultaneous occurrence of a large number of erasures is a rare event, longer codes result in unnecessary extra reading cost, and are on average inefficient. Therefore, maintaining low latency while simultaneously recovering from a potentially large number of erasures is one of the major challenges in cloud storage. Codes with hierarchical locality have been shown to address this issue by providing multi-level access in cloud storage, which enables the data to be read through a chain of network components with increasing data lengths from top to bottom; this architecture is exploited to increase the overall erasure-correction capability[4].
In the literature, codes offering double-level access have been intensely studied[3, 4, 5, 6, 7, 8]; these codes are applicable in double-level cloud storage. In this configuration, consecutive local messages are jointly encoded into correlated local codewords. Each local codeword is stored at the neighboring servers of the corresponding local cloud. The codes are designed such that each local message can be successfully decoded from the corresponding local codeword when there are fewer than local erasures, and the global codeword provides extra protection against unexpected errors in a local codeword, for some . An example having is in Fig. 1. Suppose and . When there is at most server failure, accessing the servers connected to cloud is sufficient to successfully decode the data stored in cloud . If the number of server failures in cloud is , the data can still be obtained through accessing all the servers. Codes with hierarchical locality are a generalized extension of double-level accessible codes, in which more than two levels of access are allowed and are naturally suitable for cloud storage with multiple layers.
Along with the hierarchical locality discussed previously, it is also important for the coding schemes to support scalable, heterogeneous, and flexible cloud storage [9]. Scalability enables expanding the backbone network to accommodate additional workload, i.e., additional clouds, without rebuilding the entire infrastructure. Heterogeneity refers to the property of allowing nonidentical local data lengths and providing unequal local protection, which is important for cloud storage with heterogeneous structures. A heterogeneous structure arises in networks consisting of geographically separated components, which often store data from different sources. Flexibility was first investigated for dynamic data storage systems in [8], and it refers to the property that a local cloud can be split into two smaller local clouds without worsening the global erasure-correction capability or changing the remaining components. This splitting, for example, is applied when cold data stored at a local cloud become hot unexpectedly.
Various codes offering hierarchical locality have been studied. Cassuto et al. [3] presented so-called multi-block interleaved codes that provide double-level access; this work introduced the concept of multi-level access. The family of integrated-interleaved (I-I) codes, including generalized integrated-interleaved (GII) codes and extended integrated-interleaved (EII) codes, has been a major prototype for codes with multi-level access [4, 7, 6, 5]. GII codes have the advantage of correcting a large set of error patterns, but the distribution of the data symbols is highly restricted, and all the local codewords are equally protected. EII codes are extensions of GII codes with double-level access, where specific arrangements of data symbols have been investigated, mitigating the aforementioned restriction. However, no similar study has been proposed for GII codes with hierarchical locality. Therefore, I-I codes are more suitable for applications where heterogeneity and flexibility are less important. Sum-rank codes are another family of codes proposed for dynamic distributed storage offering double-level access [8]. These codes are maximally recoverable, flexible, and allow unequal protection for local data. However, sum-rank codes require a finite field size that grows exponentially with the maximum local block length, which is a major obstacle to implementing them in real-world applications.
In this paper, we introduce code constructions with hierarchical locality and a small field size that achieve scalability, heterogeneity, and flexibility. The paper is organized as follows. In Section II, we introduce the notation and preliminaries. In Section III, we present a new construction of codes offering hierarchical locality that is based on Cauchy Reed-Solomon (CRS) codes. This construction requires a field size that grows linearly with the maximum local codelength. In Section IV, we then show that our coding scheme is scalable, heterogeneous, and flexible. Finally, we summarize our results in Section V.
II Notation and Preliminaries
Throughout the rest of this paper, refers to , and refers to . Denote the all zero vector of length by . Similarly, the all zero matrix of size is denoted by . The alphabet field, denoted by , is a Galois field of size , where is a power of a prime. For a vector of length , , , represents the -th component of , and . For a matrix of size , represents the sub-matrix of such that , , . All indices start from .
II-A Notation and Definitions
Let and represent messages and codewords, respectively. A set is called an -code if , , and , where refers to the Hamming distance. We next define a family of codes with double-level access. Note that our discussion is restricted to linear block codes.
Definition 1.
Let . Let , , , , where , , for all .
Let . Let and , . Let denote and let denote the message corresponding to , for . A set is called an -code if the following conditions are satisfied:
1. Let , . Each is an -code.
2. Let , . Each is an -code.
Example 1.
Let and . Let and . Then, . Suppose is specified as follows:
[TABLE]
Then, one can construct an -code with the parameters specified previously.
Any -code specified according to Definition 1 corrects erasures in the -th local codeword via local access, and corrects additional erasures through global access when all other local codewords are correctable via local access. Following this notation, Definition 2 extends Definition 1 to the triple-level case.
Definition 2.
Let , , . Let , , where , , for all .
Let , , . Suppose . Let , so that , for and . Let , . Let for all . Let . Let , , . Let , , for all and . Let denote and let denote the message corresponding to , for , . A set is called an -code if the following conditions are satisfied:
1. Let , . Each is an -code.
2. Let . Each is an -code.
Example 2.
Let , , and . Let , where , , and . Let , where , , and . Then, , where , , . Suppose is specified as follows:
[TABLE]
Then, one can construct an -code with the parameters specified previously.
This definition can be easily generalized into codes with more than three levels of access. For simplicity, we constrain our discussion to the triple-level case.
II-B Cauchy Matrices
Cauchy matrices are the key component in the construction that we will introduce shortly.
Definition 3.
(Cauchy matrix) Let and be a finite field of size . Suppose are pairwise distinct elements in . The following matrix is known as a Cauchy matrix,
[TABLE]
We denote this matrix by .
Cauchy matrices are totally invertible, i.e., every square sub-matrix of a Cauchy matrix is invertible. The inverse of a given Cauchy matrix can be explicitly computed using algorithms of lower complexity than those for inverting Vandermonde matrices. These properties make Cauchy matrices promising in designing systematic maximum distance separable (MDS) codes. Lemma 1 presents a useful result about Cauchy matrices that will be used repeatedly in this paper.
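The total-invertibility property can be checked numerically on a small instance. The following is a minimal sketch, not the construction used in this paper: it assumes a prime field GF(17) for simplicity (the paper allows any prime-power field size), the standard Cauchy entries 1/(a_i - b_j), and hypothetical helper names `cauchy_matrix`, `det_mod`, and `is_totally_invertible`.

```python
from itertools import combinations

P = 17  # assumed prime modulus, chosen only to keep the arithmetic simple


def cauchy_matrix(a, b, p=P):
    """Return the len(a) x len(b) matrix with entries 1 / (a_i - b_j) mod p."""
    if len(set(a) | set(b)) != len(a) + len(b):
        raise ValueError("all elements of a and b must be pairwise distinct")
    # pow(x, -1, p) is the modular inverse (Python 3.8+).
    return [[pow((ai - bj) % p, -1, p) for bj in b] for ai in a]


def det_mod(m, p=P):
    """Determinant of a square matrix modulo a prime p (Gaussian elimination)."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p


def is_totally_invertible(m, p=P):
    """Brute-force check that every square sub-matrix is nonsingular mod p."""
    rows, cols = len(m), len(m[0])
    for size in range(1, min(rows, cols) + 1):
        for rs in combinations(range(rows), size):
            for cs in combinations(range(cols), size):
                sub = [[m[r][c] for c in cs] for r in rs]
                if det_mod(sub, p) == 0:
                    return False
    return True


A = cauchy_matrix([0, 1, 2], [3, 4, 5, 6])
print(is_totally_invertible(A))
```

The brute-force check is exponential in the matrix size and is meant only for small illustrative instances.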
Lemma 1.
Let such that , . If is a Cauchy matrix, then the following matrix is a parity-check matrix of an -code.
[TABLE]
Proof.
The parity-check matrix of an -code satisfies the property that every columns of this matrix are linearly independent. Therefore, we only need to prove that every rows of are linearly independent. We prove Lemma 1 by contradiction. Suppose there exist rows from that are linearly dependent. Suppose of these linearly dependent rows are from , and the other rows are from , where . Suppose the entries with in are located in the -th columns of , then for all . Observe that is the set of indices of all columns in . Suppose . Then the sub-matrix of the intersection of the rows and the -th columns of is singular. A contradiction. ∎
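The column-independence property used in this proof can be verified by brute force on a toy instance. The sketch below is an illustration under stated assumptions, not a reproduction of the matrix in Lemma 1: it assumes a prime field GF(17), a k x r Cauchy matrix A, the common systematic pairing G = [I_k | A] and H = [-A^T | I_r], and hypothetical helper names.

```python
from itertools import combinations

P = 17  # assumed prime modulus


def cauchy_matrix(a, b, p=P):
    """Matrix with entries 1 / (a_i - b_j) mod p; a and b pairwise distinct."""
    return [[pow((ai - bj) % p, -1, p) for bj in b] for ai in a]


def rank_mod(m, p=P):
    """Rank of a matrix over GF(p) via Gauss-Jordan elimination."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    rank = 0
    for col in range(cols):
        pivot = next((r for r in range(rank, rows) if m[r][col] % p), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [x * inv % p for x in m[rank]]
        for r in range(rows):
            if r != rank and m[r][col] % p:
                f = m[r][col]
                m[r] = [(x - f * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def parity_check_from_cauchy(A, p=P):
    """Assumed form H = [-A^T | I_r], which pairs with G = [I_k | A]."""
    k, r = len(A), len(A[0])
    return [[(-A[i][j]) % p for i in range(k)] + [int(j == t) for t in range(r)]
            for j in range(r)]


def every_r_columns_independent(H, p=P):
    """True if every choice of r columns of the r x n matrix H has rank r."""
    r, n = len(H), len(H[0])
    return all(
        rank_mod([[H[i][c] for c in cols] for i in range(r)], p) == r
        for cols in combinations(range(n), r)
    )


k, r = 4, 3  # toy sizes: 4 data symbols, 3 parity symbols
A = cauchy_matrix([0, 1, 2, 3], [4, 5, 6])
G = [[int(i == t) for t in range(k)] + A[i] for i in range(k)]
H = parity_check_from_cauchy(A)
print(every_r_columns_independent(H))
```

When every r columns of H are linearly independent, any r erasures are correctable, which is the MDS property the lemma relies on.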
III Codes for Multi-Level Access
Following the definitions and notation introduced in Section II, we present a CRS-based code with double-level access in Section III-A. Then, we extend our construction into a triple-level case in Section III-B.
III-A Codes with Double-Level Access
In this subsection, we provide a construction of codes offering double-level access based on the CRS codes. Note that the generator matrix of any systematic code with double-level access has the following structure:
[TABLE]
Construction 1.
(CRS-based code) Let , , , and , with for all . Let be a finite field such that .
For each , let , , , , be distinct elements of . Consider the Cauchy matrix such that . For each , we obtain , , , according to the following partition of ,
[TABLE]
where , , . Moreover, , for .
Matrices and are substituted in specified in (3), for all , . Let represent the code with generator matrix .
Lemma 2.
Following the notation in Construction 1, let , , for . Then, code specified in Construction 1 is an -code.
Sketch of the proof.
For each , define . It follows from and (3) that for , . Define the local parity-check matrix and the global parity-check matrix , for each , as follows:
[TABLE]
We next prove the equations of the local distance and the global distance using and , .
To prove the equation of the local distance, let . Then, one can show that belongs to a code with the local parity-check matrix . From Lemma 1, is an -code. Therefore, any erasures in are correctable. Provided that has length , we can consider the entries of as erasures and thus any erasures in the remaining part of , i.e., , can be corrected. Therefore, .
To prove the equation of the global distance, assume all the local codewords except for are successfully decodable locally. Then, for each , and are computable. Let , then one can show that . From Lemma 1 and from the construction of , any erasures in are correctable, thus erasures in are also correctable. Therefore, . ∎
We next provide a working example for codes in Construction 1. For simplicity, we let all the local codeword lengths and local data lengths be equal. However, the construction itself allows them to be unequal.
Example 3.
Let , , , , , , . Then, , . Choose a primitive polynomial over : . Let be a root of , then is a primitive element of . The binary representation of all the symbols in is specified in Table I.
Let , , , and as specified in (5). Therefore,
[TABLE]
Then, the generator matrix is specified as follows,
[TABLE]
Suppose , , then and . Moreover, and are specified as follows,
[TABLE]
According to 1, is the generator matrix of a double-level accessible code that corrects local erasures by local access and corrects extra erasures within a single local cloud by global access. In the following, we denote the erased version of by , and erased symbols by , .
As an example of decoding by local access, suppose . Then, the erased elements of can be retrieved using as the parity-check matrix. In particular, we solve for and obtain . We have decoded successfully.
As an example of decoding by global access, suppose , and has been locally decoded successfully. Then, implies that . Since , we obtain . Moreover, we compute . Let . Then, we solve and obtain . Therefore, , , , , and we have decoded successfully.
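The decoding steps in this example amount to solving a small linear system given by the parity-check matrix restricted to the erased positions. The following sketch shows that generic procedure for a single Cauchy-based systematic code. It does not reproduce the matrices of Example 3: the prime field GF(17), the form H = [-A^T | I_r], the toy parameters, and all function names are assumptions made for illustration.

```python
P = 17  # assumed prime modulus; Example 3 uses an extension field instead


def cauchy_matrix(a, b, p=P):
    """Matrix with entries 1 / (a_i - b_j) mod p; a and b pairwise distinct."""
    return [[pow((ai - bj) % p, -1, p) for bj in b] for ai in a]


def encode(msg, A, p=P):
    """Systematic encoding: codeword = (msg, msg * A)."""
    k, r = len(A), len(A[0])
    parity = [sum(msg[i] * A[i][j] for i in range(k)) % p for j in range(r)]
    return list(msg) + parity


def solve_mod(M, rhs, p=P):
    """Solve M x = rhs over GF(p); M needs full column rank, system consistent."""
    rows, cols = len(M), len(M[0])
    aug = [M[i][:] + [rhs[i]] for i in range(rows)]
    for col in range(cols):
        pivot = next(r for r in range(col, rows) if aug[r][col] % p)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)
        aug[col] = [x * inv % p for x in aug[col]]
        for r in range(rows):
            if r != col and aug[r][col] % p:
                f = aug[r][col]
                aug[r] = [(x - f * y) % p for x, y in zip(aug[r], aug[col])]
    return [aug[i][cols] for i in range(cols)]


def decode(received, A, p=P):
    """Fill erased symbols (marked None) using H = [-A^T | I_r]."""
    k, r = len(A), len(A[0])
    n = k + r
    H = [[(-A[i][j]) % p for i in range(k)] + [int(j == t) for t in range(r)]
         for j in range(r)]
    erased = [i for i, x in enumerate(received) if x is None]
    if not erased:
        return list(received)
    if len(erased) > r:
        raise ValueError("too many erasures for this code")
    # H c = 0, so H_erased x = -(H_known c_known).
    rhs = [(-sum(H[j][c] * received[c] for c in range(n)
                 if received[c] is not None)) % p for j in range(r)]
    M = [[H[j][e] for e in erased] for j in range(r)]
    x = solve_mod(M, rhs, p)
    out = list(received)
    for pos, val in zip(erased, x):
        out[pos] = val
    return out


A = cauchy_matrix([0, 1, 2, 3], [4, 5, 6])  # k = 4, r = 3
codeword = encode([3, 14, 15, 9], A)
damaged = list(codeword)
for pos in (0, 2, 5):
    damaged[pos] = None  # three erasures, the maximum for r = 3
print(decode(damaged, A) == codeword)
```

In the double-level setting, global decoding first removes the contribution of the other local codewords and then solves a system of the same kind with a taller parity-check matrix.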
III-B Codes with Hierarchical Locality
Based on the double-level accessible codes presented in Section III-A, we present a class of codes with hierarchical locality in Construction 2. For simplicity, we present only a construction with triple-level access. Note that the coding scheme itself can be naturally extended to have more than three levels.
As described in Definition 2, in the triple-level structure, the set of local clouds is partitioned into groups that are indexed by the first-level index . These groups are further divided into local clouds, respectively, and the local clouds within group are indexed by the second-level index . Therefore, each local cloud is indexed by the pair . In the following discussion, the parameters with subscript are determined via the two local clouds indexed by and . The subscript is an abbreviated version of , and the parameters with subscript are determined via the local cloud and all the local clouds in the -th group. Lastly, we define a new notation, , that indexes the parameters determined via the local cloud and some other local clouds in the -th group (not necessarily all of them). Note that this notation bears similarity to . However, they are different notations: the index indexes a subgroup of local clouds, not a single one as done by .
A generator matrix of such a code is as follows:
[TABLE]
where for any ,
[TABLE]
is a generator matrix of a code offering double-level access, and
[TABLE]
Properties of are to be discussed later.
Construction 2.
Let , . Let , for and , such that and . Let , , for all . Let be a finite field such that .
Let , , for , . For each , , let , , , be distinct elements of .
Consider the Cauchy matrix on such that , for , . Then, we obtain , , , , , , , , , according to the following partition of ,
[TABLE]
[TABLE]
[TABLE]
such that , , , , . Moreover, . Suppose ; let .
Matrices and are substituted in and to construct as specified in (6), (7), and (8). Let represent the code with generator matrix .
Theorem 1.
Following the notation in Construction 2, let , , , for , . Then, the code defined in Construction 2 is an -code.
Sketch of the proof.
For each and , define the local cross parity , and the global cross parities . Let . Then, it follows from that for some .
The local erasure-correction capability and the global erasure-correction capability can be easily derived by following the same logic used in the proof of Lemma 2. Therefore, we only need to prove that .
To prove this statement, suppose all the local codewords in the -th group except for are successfully decodable locally, for some , . In other words, for all , there are at most erasures in the corrupted version of the local codeword. From the construction, we know that the row spaces of any two matrices from , , and have no common elements except for the all zero vector. Therefore, for all , , , , can all be derived from . This implies that is known and thus, the entire contribution of global cross parities can be removed. Namely, let , for all , then the message , where . Thus, from Lemma 2, erasures in are correctable. Therefore, . ∎
Remark 1.
Note that the constraint of in 1 can be relaxed to if is even. In this case, we have . Moreover, we need to modify the equation of to be , and .
The following is a working example of Construction 2. For simplicity, we let the middle code be the code presented in Example 3. However, the construction itself does not impose any constraints on , , and , except for .
Example 4.
Here, we build on Example 3 using the same . Let , , , . Let of Example 3. Then, , , , as in Example 3. Therefore, , , . We assume , , are all identical, then so are and , , . Let these matrices be defined as follows:
[TABLE]
[TABLE]
For simplicity, we abbreviate as . Note that here , are even; thus, the construction follows the modification described in Remark 1. The components are therefore all identical for , , and are described as follows:
[TABLE]
Then, the generator matrix is given in (12).
Note that the decoding processes based on local access and global access have already been introduced in Example 3. Thus, we focus only on decoding based on the middle-level access in this example. Suppose , , , . Then, , .
Suppose there are erasures in so that , where represent the three erased symbols. Suppose is successfully corrected by local access. Then, codeword is correctable through middle-level access, i.e., by operating on and .
First, from , we know that . Following the proof of Theorem 1, we know that . Here, , . Then, and can be computed as , . Therefore, .
Let . We obtain by solving , where is specified in Example 3. Therefore, , , . We have successfully decoded .
IV Scalability, Heterogeneity, and Flexibility
In Section III, we have presented a construction of codes with hierarchical locality for cloud storage, which enables the system to offer multi-level access. However, multi-level accessibility is not the only property that is considered in practical cloud storage applications. In this section, we therefore discuss scalability, heterogeneity, and flexibility of our construction, which are pivotal particularly in dynamic cloud storage. Although our discussion is restricted to cloud storage, the properties of heterogeneity and flexibility are also of practical importance in non-volatile memories.
IV-A Scalability
As discussed in Section I, scalability refers to the capability of expanding the backbone network to accommodate additional workload without rebuilding the entire infrastructure. More specifically, when a new local cloud is added to the existing configuration, computing a completely different generator matrix, and thereby changing all the encoding-decoding components in the system, is very costly. Ideally, adding a new local cloud does not change the encoding-decoding components of the already existing local clouds.
We show that our construction naturally achieves this goal. Observe that in Construction 1, the components , , , are built locally. Suppose cloud is added into a double-level configuration adopting Construction 1. The following steps only add some columns and rows to the original without changing the existing ones:
1. Parameter Selection: Local cloud chooses its local parameters , , , , and local cloud chooses the additional local parameters ;
2. Information Exchange: Local cloud sends to the central cloud, and local cloud sends to the central cloud;
3. Information Exchange: The central cloud forwards to local cloud , and sends to local cloud ;
4. Update: Local cloud computes its finalized parity-check symbols , and local cloud adds to its current parity symbols.
Note that although the local erasure-correction capability of a local cloud does not change, the global erasure-correction capability of each local cloud increases by after adding the new local cloud into the system.
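The update step above relies on linearity: an existing cloud only needs to add one cross term to its stored parity symbols. The following sketch illustrates that idea on a simplified model. All matrices here are random placeholders rather than the Cauchy-derived blocks of Construction 1, the field is assumed to be GF(17), the intermediate quantities exchanged through the central cloud are not modeled, and the names `encode_all`, `A`, and `B` are hypothetical.

```python
import random

P = 17  # assumed prime modulus
rng = random.Random(0)


def rand_matrix(rows, cols):
    """Random placeholder matrix over GF(P); not a Cauchy-derived block."""
    return [[rng.randrange(P) for _ in range(cols)] for _ in range(rows)]


def vec_mat(v, M):
    """Row vector times matrix over GF(P)."""
    return [sum(x * M[i][j] for i, x in enumerate(v)) % P
            for j in range(len(M[0]))]


def vec_add(u, v):
    return [(x + y) % P for x, y in zip(u, v)]


def encode_all(msgs, A, B):
    """Parity of cloud i = m_i A[i] + sum over j != i of m_j B[(j, i)]."""
    parities = []
    for i, m in enumerate(msgs):
        s = vec_mat(m, A[i])
        for j, mj in enumerate(msgs):
            if j != i:
                s = vec_add(s, vec_mat(mj, B[(j, i)]))
        parities.append(s)
    return parities


k, r = 3, 2  # toy local data length and parity length
msgs = [[rng.randrange(P) for _ in range(k)] for _ in range(2)]
A = [rand_matrix(k, r) for _ in range(2)]
B = {(0, 1): rand_matrix(k, r), (1, 0): rand_matrix(k, r)}
old_parities = encode_all(msgs, A, B)

# A third cloud joins: only new blocks are created, existing ones are untouched.
new_msg = [rng.randrange(P) for _ in range(k)]
A.append(rand_matrix(k, r))
for i in range(2):
    B[(i, 2)] = rand_matrix(k, r)
    B[(2, i)] = rand_matrix(k, r)

# Existing clouds add one cross term to the parity symbols they already store.
updated = [vec_add(old_parities[i], vec_mat(new_msg, B[(2, i)]))
           for i in range(2)]
# The new cloud computes its parity from its own data plus cross terms.
new_parity = vec_mat(new_msg, A[2])
for i in range(2):
    new_parity = vec_add(new_parity, vec_mat(msgs[i], B[(i, 2)]))

incremental = updated + [new_parity]
full = encode_all(msgs + [new_msg], A, B)
print(incremental == full)
```

The incremental result matches a full re-encoding with the enlarged set of blocks, so no existing block has to be recomputed in this model.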
IV-B Heterogeneity
While codes with identical data length and locality have been intensively studied, heterogeneity has become increasingly important in real-world applications, especially in cloud storage. There are typically two forms of heterogeneity: the heterogeneity of the network structure, and unequal usage rates (according to how hot the stored data are) of local components. It is reasonable to assume a heterogeneous structure since components connected to a larger network are typically geographically separated and they often store data from unrelated sources. Heterogeneous networks naturally require codes with different local code lengths and nonidentical data lengths, corresponding to flexible and in our construction, respectively. Unequal protection of data, corresponding to flexible and , has also received increasing attention in recent years. This observation is reasonable since the usage rate of the data is not necessarily identical. Clouds storing hot data (data with higher usage rate and more time urgency) should receive more local protection than those storing cold data.
Although the examples we presented in Section III have identical local parameters among all the clouds for simplicity, Construction 1 and Construction 2 do not impose such restrictions, and they are in fact suitable for heterogeneous configurations.
Example 5.
Here, we build on Example 2 and we use the same parameters. In this example, , , are not identical for all . Let ; thus, . Let ; thus, . Let ; thus, .
Let and ; thus, .
Then, ; ; . The rest of the parameters can be obtained in a similar fashion, and we then specify as follows:
[TABLE]
According to Construction 2, one can construct an -code with the parameters specified previously.
IV-C Flexibility
The concept of flexibility was originally proposed and investigated for dynamic cloud storage in [8]. In a dynamic cloud storage system, the usage rate of a piece of data is not likely to remain unchanged. When the data stored in a local cloud become hot, splitting the local cloud into two smaller clouds effectively reduces the latency. However, this action should be done without reducing the erasure-correction capability of the rest of the system or changing the remaining components.
Take Construction 1 as an example. If the data stored in local cloud become unexpectedly hot, then the following procedure splits it into two separate clouds and :
1. Select the desired local parameters and for clouds and , respectively, such that , , , and
[TABLE]
2. Compute by solving the equation , where , , are described in the proof of Lemma 2. Find , such that ;
3. Compute , and .
Note that the matrix is vertically split into and , while is horizontally split into and , for all . Therefore, it is obvious that and one can prove that the local codeword does not change for . Moreover, since both the local and the global parity-check matrices for each non-split local cloud remain unchanged, the local and global erasure-correction capabilities are not affected according to Lemma 2. Furthermore, one can prove that the local codewords stored in the new clouds and are capable of correcting and local erasures, respectively.
V Conclusion
Multi-level accessible codes have been shown to be beneficial for cloud storage. While previous work has typically focused on double-level accessible codes and their erasure-correction capabilities, in this paper we focus on codes with hierarchical locality and additional properties motivated by their practical importance. We proposed a CRS-based code over a finite field whose size grows linearly with the maximum local codelength. We showed that our construction achieves scalability, heterogeneity, and flexibility, which are important in dynamic cloud storage.
Acknowledgment
This work has received funding from NSF under the grants CCF-BSF 1718389 and CCF 1717602.
References
[1] P. Huang, E. Yaakobi, and P. H. Siegel, “Multi-erasure locally recoverable codes over small fields,” in 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2017, pp. 1123–1130.
[2] S. Ballentine, A. Barg, and S. Vlăduţ, “Codes with hierarchical locality from covering maps of curves,” arXiv preprint arXiv:1807.05473, 2018.
[3] Y. Cassuto, E. Hemo, S. Puchinger, and M. Bossert, “Multi-block interleaved codes for local and global read access,” in Proc. IEEE Int. Symp. Inf. Theory, 2017, pp. 1758–1762.
[4] M. Hassner, K. Abdel-Ghaffar, A. Patel, R. Koetter, and B. Trager, “Integrated interleaving-a novel ECC architecture,” IEEE Transactions on Magnetics, vol. 37, no. 2, pp. 773–775, 2001.
[5] M. Blaum and S. R. Hetzler, “Extended product and integrated interleaved codes,” IEEE Trans. Inf. Theory, vol. 64, no. 3, pp. 1497–1513, 2018.
[6] X. Zhang, “Generalized three-layer integrated interleaved codes,” IEEE Communications Letters, vol. 22, no. 3, pp. 442–445, 2018.
[7] Y. Wu, “Generalized integrated interleaved codes,” IEEE Transactions on Information Theory, vol. 63, no. 2, pp. 1102–1119, Nov. 2017.
[8] U. Martínez-Peñas and F. R. Kschischang, “Universal and dynamic locally repairable codes with maximal recoverability via sum-rank codes,” in 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2018, pp. 792–799.
