On One Generalization of LRC Codes with Availability
Stanislav Kruglik, Marina Dudina, Valeriya Potapova, Alexey Frolov

TL;DR
This paper explores a generalized class of locally recoverable codes with intersecting recovery sets, enabling higher code rates and load balancing, supported by theoretical bounds and explicit constructions.
Contribution
It introduces a new generalization of LRC codes with intersecting recovery sets, providing bounds and explicit constructions that improve code rate and load balancing capabilities.
Findings
Derived an upper bound for the code rate.
Provided explicit constructions of the generalized LRC codes.
Showed the benefits of intersecting recovery sets in code performance.
Abstract
We investigate one possible generalization of locally recoverable codes (LRC) with all-symbol locality and availability when recovering sets can intersect in a small number of coordinates. This feature allows us to increase the achievable code rate and still meet load balancing requirements. In this paper we derive an upper bound for the rate of such codes and give explicit constructions of codes with such a property. These constructions utilize LRC codes developed by Wang et al.
Table I: Upper bounds on the rate of $(r, t, x)$-LRC codes for different values of $x$

| $(r, t)$ | $x = 0$ | $x = 1$ | $x = 2$ | $x = 3$ |
|---|---|---|---|---|
| (4, 2) | 0.7111 | 0.7250 | 0.7429 | 0.7667 |
| (5, 2) | 0.7576 | 0.7667 | 0.7778 | 0.7917 |
| (6, 2) | 0.7912 | 0.7976 | 0.8052 | 0.8143 |
| (7, 2) | 0.8167 | 0.8214 | 0.8269 | 0.8333 |
| (4, 3) | 0.6564 | 0.6981 | 0.7516 | 0.8231 |
| (5, 3) | 0.7102 | 0.7375 | 0.7708 | 0.8125 |
| (6, 3) | 0.7496 | 0.7688 | 0.7915 | 0.8188 |
| (7, 3) | 0.7795 | 0.7938 | 0.8103 | 0.8295 |

Table II: Code rates and upper bounds for WZL codes and the proposed construction ($x = 1$)

| $(r, t)$ | Rate (WZL) | Upper bound from [10] | Rate (proposed, $x = 1$) | Upper bound (proposed, $x = 1$) |
|---|---|---|---|---|
| (3, 2) | 0.6000 | 0.6429 | 0.6667 | 0.6667 |
| (5, 2) | 0.7143 | 0.7576 | 0.7500 | 0.7667 |
| (7, 2) | 0.7778 | 0.8167 | 0.8000 | 0.8214 |
| (3, 3) | 0.5000 | 0.5786 | 0.6250 | 0.6500 |
| (5, 3) | 0.6250 | 0.7102 | 0.7000 | 0.7375 |
| (7, 3) | 0.7000 | 0.7795 | 0.7500 | 0.7938 |
Stanislav Kruglik^{1,2,3}, Marina Dudina^{1}, Valeriya Potapova^{1,2} and Alexey Frolov^{1,2}
1 Skolkovo Institute of Science and Technology
Moscow, Russia
2 Institute for Information Transmission Problems
Russian Academy of Sciences
Moscow, Russia
3 Moscow Institute of Physics and Technology
Moscow, Russia
I Introduction
A locally recoverable code (LRC) is a code over a finite alphabet such that each symbol is a function of a small number of other symbols that form a recovering set [1, 2, 3, 4, 5]. These codes are important due to their applications in distributed and cloud storage systems. LRC codes are well investigated in the literature. Bounds on the rate and minimum distance are given in [1, 3] for the case of large alphabet size. An alphabet-dependent shortening bound (see [6] for an explanation of the method) is proposed in [7]. Optimal code constructions are given in [8] based on rank-metric codes (for a large alphabet, whose size is exponential in the code length) and in [9] based on Reed-Solomon codes (for a small alphabet, whose size is linear in the code length).
The natural generalization of an LRC code is an LRC code with availability (or multiple disjoint recovering sets). Availability allows us to handle multiple simultaneous requests to an erased symbol in parallel. This property is very important for hot data that is simultaneously requested by a large number of users. LRC codes with availability are much less investigated. Bounds on the parameters of such codes and constructions are given in [4, 10, 11, 12]. Most papers focus on information-symbol locality and availability. In what follows we are interested in all-symbol locality and availability, which is preferable in applications as it permits a uniform approach to system design.
The property of availability decreases the maximum achievable code rate [10]. In this paper we propose a new generalization of LRC codes with availability. Namely, we assume that recovering sets can intersect in a small number of coordinates. This feature allows us to increase the achievable code rate and still meet load balancing requirements.
Our contribution is as follows. We investigate one possible generalization of locally recoverable codes (LRC) with all-symbol locality and availability when recovering sets can intersect in a small number of coordinates. We derive an upper bound for the rate of such codes and give explicit constructions of codes with such a property. These constructions utilize LRC codes developed in [13].
II Preliminaries
II-A LRC codes
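Let us denote by $\mathbb{F}_q$ a field with $q$ elements. Let $\mathcal{C} \subset \mathbb{F}_q^n$. The code $\mathcal{C}$ has locality $r$ if every symbol of the codeword $c \in \mathcal{C}$ can be recovered from a subset of $r$ other symbols of $c$ [1]. In other words, given $c \in \mathcal{C}$ and $i \in [n]$, there exists a subset of coordinates $R_i \subset [n] \setminus i$, $|R_i| \le r$, such that the restriction of $c$ to the coordinates in $R_i$ enables one to find the value of $c_i$. The subset $R_i$ is called a recovering set for the symbol $c_i$.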
II-B LRC codes with availability
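Generalizing this concept, assume that every symbol of the code $\mathcal{C}$ can be recovered from $t$ disjoint subsets of symbols of size $r$. More formally, denote by $\mathcal{C}_I$ the restriction of the code $\mathcal{C}$ to a subset of coordinates $I \subset [n]$. Given $a \in \mathbb{F}_q$, define the set of codewords $\mathcal{C}(i, a) = \{c \in \mathcal{C} : c_i = a\}$, $i \in [n]$.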
Definition 1
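A code $\mathcal{C}$ is said to have $t$ disjoint recovering sets if for every $i \in [n]$ there are pairwise disjoint subsets $R_i^1, \ldots, R_i^t \subset [n] \setminus i$ such that for all $j = 1, \ldots, t$ and every pair of symbols $a, a' \in \mathbb{F}_q$, $a \neq a'$,
$$\mathcal{C}(i, a)_{R_i^j} \cap \mathcal{C}(i, a')_{R_i^j} = \emptyset.$$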
In what follows we refer to these codes as $(r, t)$-LRC codes. We briefly list the existing results below. The first bound for $(r, t)$-LRC codes was given in [14, 15]
[TABLE]
An improvement of this bound was obtained in [10]
[TABLE]
An alphabet-dependent bound was proposed in [12] and has the form
[TABLE]
where , and denote the largest possible minimum distance of a code over .
The bound on the rate of $(r, t)$-LRC codes was given in [10]
$$\frac{k}{n} \le \prod_{j=1}^{t} \frac{1}{1 + \frac{1}{jr}}.$$
This bound was improved in [11] for .
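The rate bound for disjoint recovering sets can be evaluated numerically. The following is a minimal Python sketch, assuming the bound from [10] has the product form $\prod_{j=1}^{t} (1 + 1/(jr))^{-1}$; this form agrees with the values listed for [10] in Table II. The function name `rate_bound_disjoint` is hypothetical and introduced only for illustration.

```python
def rate_bound_disjoint(r, t):
    """Upper bound on the rate of an (r, t)-LRC code with disjoint recovering sets.

    Assumes the product form prod_{j=1..t} 1 / (1 + 1/(j*r)).
    """
    bound = 1.0
    for j in range(1, t + 1):
        bound *= 1.0 / (1.0 + 1.0 / (j * r))
    return bound


if __name__ == "__main__":
    for r, t in [(3, 2), (5, 2), (7, 3)]:
        print((r, t), round(rate_bound_disjoint(r, t), 4))
```

For $(r, t) = (3, 2)$ the product is $\frac{3}{4} \cdot \frac{6}{7} \approx 0.6429$, the value reported for the bound from [10] in Table II.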
In [13] a recursive construction of binary $(r, t)$-LRC codes was proposed. The parameters of these codes are as follows: $n = \binom{r+t}{t}$, $R = \frac{r}{r+t}$ and $d = t + 1$. We refer to these codes as WZL codes. A WZL code is defined by its parity-check matrix. Let $m = r + t$. Let us define the matrix $H(m, t)$ as follows. Each row of $H(m, t)$ is associated with a $(t-1)$-subset of $[m]$, sorted in lexicographical order, and each column with a $t$-subset of $[m]$, also sorted in lexicographical order. The element $h_{ij}$ of $H(m, t)$ is equal to $1$ if $E_i \subset F_j$, where $E_i$ is the $(t-1)$-subset of $[m]$ associated with the $i$-th row and $F_j$ is the $t$-subset of $[m]$ associated with the $j$-th column. Note that $H(m, t)$ has $\binom{m}{t-1}$ rows and $\binom{m}{t}$ columns and has the following structure:
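$$H(m, t) = \begin{pmatrix} H(m-1, t-1) & \mathbf{0} \\ I & H(m-1, t) \end{pmatrix},$$
where $H(m, 1) = (1, 1, \ldots, 1)$ is the all-ones row of length $m$ and $H(t, t) = (1, 1, \ldots, 1)^T$ is the all-ones column of length $t$.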
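The subset-inclusion rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the construction of [13] verbatim: rows are taken to be indexed by $(t-1)$-subsets and columns by $t$-subsets of $\{1, \ldots, r+t\}$, and the function name `wzl_parity_check` is hypothetical.

```python
from itertools import combinations


def wzl_parity_check(r, t):
    """Binary matrix whose entry is 1 when the row subset is contained in the column subset.

    Assumption: rows are (t-1)-subsets and columns are t-subsets of {1, ..., r+t},
    both enumerated in lexicographic order.
    """
    m = r + t
    ground = range(1, m + 1)
    rows = list(combinations(ground, t - 1))  # combinations() yields lexicographic order
    cols = list(combinations(ground, t))
    return [[1 if set(e) <= set(f) else 0 for f in cols] for e in rows]


if __name__ == "__main__":
    H = wzl_parity_check(2, 2)
    for row in H:
        print("".join(map(str, row)))
```

Under this reading every row has weight $r + 1$ (one check involves a symbol and its $r$ helpers) and every column has weight $t$ (each symbol takes part in $t$ checks), which is what locality $r$ and availability $t$ require.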
II-C LRC codes with availability and intersection of recovering sets
Let us allow the recovering sets to intersect in at most $x$ positions and call such a code an $(r, t, x)$-LRC code. More formally,
Definition 2
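A code $\mathcal{C}$ is said to be $(r, t, x)$-LRC if for every $i \in [n]$ there are subsets $R_i^1, \ldots, R_i^t \subset [n] \setminus i$ such that the following conditions hold:
1. for every pair $j \neq l$, $j, l \in [t]$,
$$|R_i^j \cap R_i^l| \le x;$$
2. for all $j = 1, \ldots, t$ and every pair of symbols $a, a' \in \mathbb{F}_q$, $a \neq a'$,
$$\mathcal{C}(i, a)_{R_i^j} \cap \mathcal{C}(i, a')_{R_i^j} = \emptyset.$$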
In what follows we investigate the parameters of such codes.
III An upper bound on the rate of $(r, t, x)$-LRC codes
III-A The recovery graph
Based on the original idea from [10], we represent locally recoverable codes with locality and availability as a graph in the following way. In accordance with Definition 2, a coordinate $i$ has $t$ recovering sets $R_i^1, \ldots, R_i^t$, each of size $r$, where $R_i^j \subset [n] \setminus i$. Define a directed graph $G$ as follows. The set of vertices $V = [n]$ corresponds to the set of $n$ coordinates of the LRC code. The ordered pair of vertices $(i, j)$ forms a directed edge $i \to j$ if $j \in R_i^l$ for some $l \in [t]$. We color the edges of the graph with $t$ distinct colors in order to differentiate between the recovering sets of each coordinate. Note that since the recovering sets can intersect, some edges may have several colors. We call $G$ the recovery graph of the code $\mathcal{C}$.
In what follows we need the following lemma.
Lemma 1
Let $S \subseteq [t]$ and $|S| = j$, then
$$jr - \binom{j}{2} x \le \Big| \bigcup_{l \in S} R_i^l \Big| \le jr.$$
Proof:
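The upper bound is trivial and corresponds to the case when the recovering sets do not intersect. To prove the lower bound, assume that any two recovering sets intersect in exactly $x$ positions. Then we have
$$\Big| \bigcup_{l \in S} R_i^l \Big| \ge \sum_{l \in S} |R_i^l| - \sum_{\{l, l'\} \subseteq S} |R_i^l \cap R_i^{l'}| = jr - \binom{j}{2} x.$$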
∎
Corollary 1
The out-degree of each vertex is upper bounded by $tr$ and lower bounded by $tr - \binom{t}{2} x$.
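The recovery graph and the out-degree bounds can be illustrated on a toy input. The Python sketch below is a hypothetical example: the recovering sets of coordinate 0 are made up for illustration (two sets of size 3 sharing one coordinate, so $r = 3$, $t = 2$, $x = 1$), and the helper names `recovery_graph` and `out_degree` are not from the paper.

```python
from math import comb


def recovery_graph(recovering_sets):
    """Map each directed edge (i, j) to the set of colors l with j in R_i^l."""
    edges = {}
    for i, sets_i in recovering_sets.items():
        for color, rec in enumerate(sets_i, start=1):
            for j in rec:
                edges.setdefault((i, j), set()).add(color)
    return edges


def out_degree(edges, v):
    """Number of distinct vertices reachable from v by one edge."""
    return sum(1 for (i, _j) in edges if i == v)


# Hypothetical recovering sets of coordinate 0: they intersect in coordinate 3.
r, t, x = 3, 2, 1
recovering_sets = {0: [{1, 2, 3}, {3, 4, 5}]}

edges = recovery_graph(recovering_sets)
deg = out_degree(edges, 0)
lower, upper = t * r - comb(t, 2) * x, t * r

if __name__ == "__main__":
    print("edge (0, 3) colors:", sorted(edges[(0, 3)]))
    print("out-degree:", deg, "bounds:", (lower, upper))
```

The edge $0 \to 3$ carries both colors because coordinate 3 lies in both recovering sets, and the out-degree of vertex 0 equals the lower bound $tr - \binom{t}{2} x = 5$.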
III-B Upper bound on the rate
The proof is very similar to the proof from [10]. For the reader's convenience we present the proof here in full detail. Let us introduce the following function
$$f(j) = \begin{cases} jr, & j \text{ is odd}, \\ jr - \binom{j}{2} x, & j \text{ is even}. \end{cases}$$
The following lemma will be used in the proof.
Lemma 2
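There exists a subset of vertices $U \subseteq V$ of size at least
$$|U| \ge n \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} \frac{1}{f(j) + 1}$$
such that for any $U' \subseteq U$, the induced subgraph $G_{U'}$ on the vertices $U'$ has at least one vertex $v \in U'$ such that its set of outgoing edges is missing at least one color.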
Proof:
For a given permutation $\sigma$ of the set of vertices $V$, we define the coloring of some of the vertices as follows: the color $j \in [t]$ is assigned to the vertex $v$ if
$$\sigma(v) > \sigma(u) \quad \text{for all } u \in R_v^j. \tag{2}$$
If this condition is satisfied for several recovering sets $R_v^j$, the vertex $v$ is assigned any of the colors corresponding to these sets. Finally, if this condition is not satisfied at all, then the vertex $v$ is not colored.
Let $U$ be the set of colored vertices, and consider one of its subsets $U' \subseteq U$.
Let $G_{U'}$ be the induced subgraph on $U'$. We claim that there exists $v \in U'$ such that its set of outgoing edges is missing at least one color in $[t]$. Assume toward a contradiction that every vertex of $G_{U'}$ has outgoing edges of all $t$ colors. Choose a vertex $v \in U'$ and construct a walk through the vertices of $G_{U'}$ according to the following rule. If the path constructed so far ends at some vertex with color $j$, choose one of its outgoing edges also colored in $j$ and leave the vertex moving along this edge. By assumption, every vertex has outgoing edges of all $t$ colors, so this process, and hence this path, can be extended indefinitely. Since the graph is finite, there will be a vertex, call it $v_1$, that is encountered twice. The segment of the path that begins at $v_1$ and returns to it has the form
$$v_1 \to v_2 \to \cdots \to v_l,$$
where $v_l = v_1$. For any $i = 1, \ldots, l-1$ the vertex $v_i$ and the edge $(v_i, v_{i+1})$ are colored with the same color. Hence by the definition of the set $U$ we conclude that $\sigma(v_i) > \sigma(v_{i+1})$ for all $i = 1, \ldots, l-1$, a contradiction.
In order to show that there exists such a set $U$ of large cardinality, we choose the permutation $\sigma$ randomly and uniformly among all the $n!$ possibilities and compute the expected cardinality of the set $U$.
Let $A_{v,j}$ be the event that (2) holds for the vertex $v$ and the color $j$. Since $\Pr(A_{v,j})$ does not depend on $v$, we suppress the subscript $v$, and write
[TABLE]
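Let us compute the probability of the event $\bigcup_{j=1}^{t} A_j$. Note that for any set $S \subseteq [t]$ the probability of the event that all the $A_j$, $j \in S$, occur simultaneously can be estimated as follows
$$\frac{1}{|S| r + 1} \le \Pr\Big( \bigcap_{j \in S} A_j \Big) \le \frac{1}{|S| r - \binom{|S|}{2} x + 1}.$$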
Hence by the inclusion-exclusion formula we get
$$\Pr\Big( \bigcup_{j=1}^{t} A_j \Big) \ge \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} \frac{1}{f(j) + 1}.$$
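Now let $X_v$ be the indicator random variable for the event that $v \in U$, then
$$\mathbb{E}|U| = \sum_{v \in V} \mathbb{E} X_v = n \Pr\Big( \bigcup_{j=1}^{t} A_j \Big) \ge n \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} \frac{1}{f(j) + 1}.$$
The proof is completed by observing that there exists at least one choice of $\sigma$ for which $|U| \ge \mathbb{E}|U|$. ∎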
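The permutation argument can be checked exhaustively on a very small code. The Python sketch below assumes the coloring rule "a vertex receives color $j$ when every vertex of its $j$-th recovering set precedes it in the permutation" and uses a hypothetical toy input: a length-3 code in which each coordinate has $t = 2$ disjoint recovering sets of size $r = 1$ (so $x = 0$). The helper name `colored_vertices` is introduced only for illustration.

```python
from fractions import Fraction
from itertools import permutations


def colored_vertices(rec_sets, sigma):
    """Return {vertex: color} for vertices whose recovering set precedes them in sigma."""
    colored = {}
    for v, sets_v in rec_sets.items():
        for color, rec in enumerate(sets_v, start=1):
            if all(sigma[u] < sigma[v] for u in rec):
                colored[v] = color  # any satisfied color may be used; take the first
                break
    return colored


# Toy input (assumption): each coordinate is recovered by either of the other two.
rec_sets = {0: [{1}, {2}], 1: [{0}, {2}], 2: [{0}, {1}]}
n = len(rec_sets)

sizes = []
for perm in permutations(range(n)):
    sigma = {v: position for position, v in enumerate(perm)}
    sizes.append(len(colored_vertices(rec_sets, sigma)))

expected_size = Fraction(sum(sizes), len(sizes))

if __name__ == "__main__":
    print("E|U| =", expected_size)
```

In this toy case the first vertex of any permutation is never colored and the other two always are, so $\mathbb{E}|U| = 2$, which coincides with $n \left( \frac{2}{r+1} - \frac{1}{2r+1} \right) = 3 \cdot \frac{2}{3}$.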
Theorem 1
The rate $R$ of an $(r, t, x)$-LRC code satisfies
$$R \le 1 - \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} \frac{1}{f(j) + 1}.$$
Proof:
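The colored vertices can be viewed as check symbols, as they can be recovered from the remaining symbols. Thus, the number of information symbols $k$ can be estimated as follows
$$k \le n - |U| \le n \Big( 1 - \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} \frac{1}{f(j) + 1} \Big).$$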
∎
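The bound is easy to evaluate numerically. The Python sketch below assumes the inclusion-exclusion form $1 - \sum_{j=1}^{t} (-1)^{j+1} \binom{t}{j} / (f(j) + 1)$ with $f(j) = jr$ for odd $j$ and $f(j) = jr - \binom{j}{2} x$ for even $j$; this reading reproduces the values of Table I. The function names are hypothetical.

```python
from math import comb


def f(j, r, x):
    """Size estimate for a union of j recovering sets: upper for odd j, lower for even j."""
    return j * r if j % 2 == 1 else j * r - comb(j, 2) * x


def rate_upper_bound(r, t, x):
    """Upper bound on the rate of an (r, t, x)-LRC code (assumed inclusion-exclusion form)."""
    colored_fraction = sum(
        (-1) ** (j + 1) * comb(t, j) / (f(j, r, x) + 1) for j in range(1, t + 1)
    )
    return 1 - colored_fraction


if __name__ == "__main__":
    for r, t in [(4, 2), (4, 3), (7, 3)]:
        print((r, t), [round(rate_upper_bound(r, t, x), 4) for x in range(4)])
```

For example, for $(r, t, x) = (4, 2, 1)$ the sum equals $\frac{2}{5} - \frac{1}{8} = 0.275$, giving the bound $0.7250$.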
IV Lower bounds on the rate of $(r, t, x)$-LRC codes
In this section we derive a lower bound on the rate of codes with all-symbol locality and availability in which recovering sets can intersect. To find a lower bound we propose the following rather simple code construction. In what follows we explain how to construct a parity-check matrix of a linear $(r, t, x)$-LRC code. We start with a parity-check matrix of a WZL code. Let us denote this matrix by $H$. The parity-check matrix of the $(r, t, x)$-LRC code is constructed as follows
[TABLE]
where $\otimes$ denotes the Kronecker product of matrices.
As a result, we have a matrix of length . It is obvious, that each row of the new matrix will have ones and the number of positions, in which two recovering sets intersects is equal to . This construction will have the same availability as it was for the standard WZL code. Thus, the parameters of the resulting code are as follows
[TABLE]
The matrix has exactly the same rank as the matrix , the rank is equal to . Thus, the rate of the resulting code can be calculated as follows
[TABLE]
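The construction can be sketched in Python under two explicit assumptions, neither of which is confirmed by the text as extracted here: the WZL parity-check matrix is the subset-inclusion matrix of $(t-1)$-subsets versus $t$-subsets, and the second factor of the Kronecker product is an all-ones row of length $x + 1$, so that every column of $H$ is repeated $x + 1$ times. With $x = 1$ this reading reproduces the rates of the proposed construction in Table II. All function names are hypothetical.

```python
from itertools import combinations


def wzl_parity_check(r, t):
    """Assumed WZL parity-check matrix: (t-1)-subsets (rows) vs t-subsets (columns) of [r+t]."""
    ground = range(1, r + t + 1)
    rows = list(combinations(ground, t - 1))
    cols = list(combinations(ground, t))
    return [[1 if set(e) <= set(f) else 0 for f in cols] for e in rows]


def kron_ones(H, copies):
    """Kronecker product of H with an all-ones row of length `copies` (repeats each column)."""
    return [[bit for bit in row for _ in range(copies)] for row in H]


def gf2_rank(matrix):
    """Rank over GF(2) by Gaussian elimination on rows packed into integers."""
    masks = [int("".join(map(str, row)), 2) for row in matrix]
    rank = 0
    while masks:
        pivot = masks.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            masks = [m ^ pivot if m & low else m for m in masks]
    return rank


def rate(H):
    """Rate of the binary code with parity-check matrix H."""
    return 1 - gf2_rank(H) / len(H[0])


if __name__ == "__main__":
    H = wzl_parity_check(1, 2)   # base code with locality 1, availability 2
    H_new = kron_ones(H, 2)      # x = 1: every column is duplicated
    print("row weight:", sum(H_new[0]), "rate:", round(rate(H_new), 4))
```

Repeating the columns leaves the rank unchanged while multiplying the length by $x + 1$, which is why the rate increases; in the example the row weight becomes 4, so the locality is 3 and the rate is $2/3$.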
Example 1
Let us start from an -WZL code
[TABLE]
and construct a parity-check matrix of an -LRC code. The matrix has a rate and shown below
[TABLE]
V Numerical results
In Table I we present the comparison of upper bounds on the rate of $(r, t, x)$-LRC codes for different values of the parameter $x$. We see that the value of the upper bound increases with the parameter $x$. In Table II we present the comparison of the code rate obtained by the proposed code construction ($x = 1$) and the code rate of WZL codes with the same locality and availability. In addition, we include the values of the upper bounds for the code rate from [10] and the upper bounds for the code rate proposed in this paper. We see that, e.g., for $r = 3$, $t = 2$ and $x = 1$ the lower bound is tight (0.6667), and it is better than the upper bound (0.6429) for the case of $r = 3$, $t = 2$ and $x = 0$.
VI Conclusion
We investigated one possible generalization of locally recoverable codes (LRC) with all-symbol locality and availability when recovering sets can intersect in a small number of coordinates. This feature allows us to increase the achievable code rate and still meet load balancing requirements. In this paper we derived an upper bound for the rate of such codes and gave explicit constructions of codes with such a property.
Acknowledgment
A. Frolov thanks A. Barg for introducing this problem to him and for numerous fruitful discussions during his stay at the University of Maryland.
References
- [1] P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the locality of codeword symbols,” IEEE Trans. Inf. Theory, vol. 58, no. 11, pp. 6925–6934, Nov. 2012.
- [2] P. Gopalan, C. Huang, B. Jenkins, and S. Yekhanin, “Explicit maximally recoverable codes with locality,” IEEE Trans. Inf. Theory, vol. 60, no. 9, pp. 5245–5256, Sep. 2014.
- [3] D. S. Papailiopoulos and A. G. Dimakis, “Locally repairable codes,” IEEE Trans. Inf. Theory, vol. 60, no. 10, pp. 5843–5855, Oct. 2014.
- [4] A. S. Rawat, O. O. Koyluoglu, N. Silberstein, and S. Vishwanath, “Optimal locally repairable and secure codes for distributed storage systems,” IEEE Trans. Inf. Theory, vol. 60, no. 1, pp. 212–236, Jan. 2014.
- [5] S. Yekhanin, “Locally decodable codes,” Found. Trends Theoretical Comput. Sci., vol. 6, no. 3, pp. 139–255, 2012.
- [6] Y. Ben-Haim and S. Litsyn, “Upper bounds on the rate of LDPC codes as a function of minimum distance,” IEEE Trans. Inf. Theory, vol. 52, no. 5, pp. 2092–2100, May 2006.
- [7] V. R. Cadambe and A. Mazumdar, “Bounds on the size of locally recoverable codes,” IEEE Trans. Inf. Theory, vol. 61, no. 11, pp. 5787–5794, Nov. 2015.
- [8] N. Silberstein, A. S. Rawat, O. O. Koyluoglu, and S. Vishwanath, “Optimal locally repairable codes via rank-metric codes,” in Proceedings IEEE International Symposium on Information Theory (ISIT), Jul. 2013, pp. 1819–1823.
