Array Codes with Local Properties
Mario Blaum, Steven R. Hetzler

TL;DR
This paper introduces enhanced array codes with added column parity, enabling local symbol recovery, which improves data repair efficiency in RAID-like storage systems.
Contribution
The paper extends traditional array codes by incorporating column parity, facilitating local recovery and advancing the design of locally recoverable codes.
Findings
Added column parity enables local symbol recovery.
Enhanced codes maintain array code properties with improved repair.
Applications to Locally Recoverable (LRC) codes are discussed.
Abstract
In general, array codes consist of arrays and in many cases, the arrays satisfy parity constraints along lines of different slopes (generally with a toroidal topology). Such codes are useful for RAID type of architectures, since they allow to replace finite field operations by XORs. We present expansions to traditional array codes of this type, like Blaum-Roth (BR) and extended EVENODD codes, by adding parity on columns. This vertical parity allows for recovery of one or more symbols in a column locally, i.e., by using the remaining symbols in the column without invoking the rest of the array. Properties and applications of the new codes are discussed, in particular to Locally Recoverable (LRC) codes.
Table I: Number of XORs required at the encoding, for the algorithm from [5] and for the EIP encoding algorithm.

| $p$ | Data columns | Encoding algorithm from [5] (XORs) | Encoding algorithm for EIP codes (XORs) | Improvement % |
|---|---|---|---|---|
| 17 | 8 | 398 | 358 | 10.1% |
| 17 | 15 | 748 | 701 | 6.3% |
| 127 | 8 | 3038 | 2778 | 8.6% |
| 127 | 50 | 18998 | 18696 | 1.6% |
| 127 | 125 | 47498 | 47121 | 0.8% |
| 257 | 8 | 6158 | 5638 | 8.4% |
| 257 | 50 | 38498 | 37936 | 1.5% |
| 257 | 255 | 196348 | 195581 | 0.4% |
Array Codes with Local Properties
Mario Blaum and Steven R. Hetzler
IBM Research Division
Almaden Research Center
San Jose, CA 95120, USA
Keywords: Erasure-correcting codes, product codes, Blaum-Roth (BR) codes, Reed-Solomon (RS) codes, EVENODD code, MDS codes, local and global parities, locally recoverable (LRC) codes.
I Introduction
Throughout this paper, $p$ will denote a prime number and $q$ a power of 2, i.e., $q = 2^{b}$. We will consider finite fields $GF(q)$. The results can be extended to other prime powers, but for simplicity, we consider finite fields of characteristic 2 only.
Given $p$ and an integer $m$, let $\langle m\rangle_{p}$ be the unique integer in $\{0, 1, \ldots, p-1\}$ that is congruent to $m$ modulo $p$. Unless there is confusion, we denote $\langle m\rangle_{p}$ as $\langle m\rangle$. We will consider arrays with entries $c_{u,v}$. We next define formally a line in an array.
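**Definition 1.** Given a $p\times p$ array with entries $c_{u,v}$, $0\leqslant u,v\leqslant p-1$, a line of slope $i$, $0\leqslant i\leqslant p-1$, through entry $c_{u_{0},0}$, $0\leqslant u_{0}\leqslant p-1$, is the set of entries $\{c_{\langle u_{0}-iv\rangle,v}\,:\,0\leqslant v\leqslant p-1\}$. A line of slope $\infty$ (vertical line) through entry $c_{0,v_{0}}$, $0\leqslant v_{0}\leqslant p-1$, is the set of entries $\{c_{u,v_{0}}\,:\,0\leqslant u\leqslant p-1\}$. ∎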
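The index arithmetic in Definition 1 can be sketched in a few lines of Python. This is only an illustration: the function names `line_of_slope` and `vertical_line` are hypothetical, and the array is assumed to be $p\times p$ with rows reduced modulo $p$ (the toroidal topology).

```python
def line_of_slope(p, slope, u0):
    """Positions (row, column) of the line of the given slope through
    entry (u0, 0): row <u0 - slope * v> in column v, for v = 0..p-1."""
    return [((u0 - slope * v) % p, v) for v in range(p)]


def vertical_line(p, v0):
    """Positions of the vertical line, i.e. all of column v0."""
    return [(u, v0) for u in range(p)]


if __name__ == "__main__":
    # The five lines of slope 1 in a 5 x 5 array; together they cover
    # every entry of the array exactly once.
    for u0 in range(5):
        print(u0, line_of_slope(5, 1, u0))
```

For a fixed slope, the $p$ lines obtained by varying `u0` are disjoint, so they partition the array.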
Next we give the definition of Blaum-Roth (BR) codes [6]:
Definition** 2**
.
A BR code with parity columns, denoted , consists of all possible arrays over such that, when a zero row is appended to an array in the code, each line of slope for as given by Definition 1 has even parity. ∎
Example** 3**
.
The first 4 rows of the following array are in :
[TABLE]
*We can see that every horizontal line (i.e., line of slope 0), every line of slope 1 and every line of slope 2 have even parity (we mark in red the lines of slope 0, 1 and 2 through entry respectively).
*
The following algebraic definition [6] of a BR code is equivalent to Definition 2. This definition is convenient for decoding purposes.
**Definition 4.** A BR code is the code over the ring of polynomials modulo $M_{p,q}(x) = 1\oplus x\oplus x^{2}\oplus\cdots\oplus x^{p-1}$ with coefficients in $GF(q)$ whose parity-check matrix is given by the Reed-Solomon type of matrix
[TABLE]
where M_{p,q}(\alpha)\mbox{,=,}0 and . ∎
From Definition 4, each codeword in a BR code can be considered as a $(p-1)\times p$ array such that each column represents an element in the ring of polynomials modulo $M_{p,q}(x)$. It can be verified that such an array satisfies Definition 2 [6]. At this point the field $GF(q)$ is not significant. In fact, since $q = 2^{b}$, Definition 4 uses $b$ binary BR codes in parallel, so studying BR codes over $GF(q)$ is equivalent to studying BR codes over $GF(2)$. This will change with the expansion of BR codes to be presented next, which incorporates vertical parities in each column of the array.
Definition** 5**
.
Let be a polynomial with coefficients in such that divides and \gcd(1\oplus x,g(x))\mbox{,=,}1. Denote by \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d) the cyclic code of length over whose generator polynomial is having minimum distance . Let \mbox{{\cal R}}_{p}(q) be the ring of polynomials modulo with coefficients in , and let \alpha\in\mbox{{\cal R}}_{p}(q) such that \alpha^{p}\mbox{,=,}1 and . Then an Expanded Blaum-Roth code is the code with coefficients in \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d) whose parity-check matrix is given by (6). ∎
Notice that $\mathcal{C}(p,g(x)(1\oplus x),q,d)$ is an ideal in $\mathcal{R}_{p}(q)$. The definition of an EBR code looks similar to the one of a BR code. They both share the parity-check matrix (6), but while the entries of a codeword in a BR code are in the ring of polynomials modulo $M_{p,q}(x)$, the entries of a codeword in an EBR code are in the ideal $\mathcal{C}(p,g(x)(1\oplus x),q,d)$. Thus, a codeword in an EBR code can be represented by a $p\times p$ array, where each column is in the cyclic code $\mathcal{C}(p,g(x)(1\oplus x),q,d)$. We also observe that multiplying a column by $\alpha^{j}$ represents a rotation of the column $j$ times. Hence, multiplying by $\alpha^{j}$ does not involve finite field arithmetic or XOR operations. Let us point out that codes over the ring $\mathcal{R}_{m}(q)$, $m$ an integer (not necessarily prime), have been used in the literature, for example, in [14] to provide a unification between EVENODD and RDP codes, in [15] in the context of Regenerating Codes, and in [16] to give efficient encoding and decoding algorithms for a family of MDS array codes.
From Definition 5 we obtain a geometrical description of the EBR codes. As stated above, the codewords in an EBR code can be represented as $p\times p$ arrays over $GF(q)$ such that each column (i.e., line of slope $\infty$) is in the code $\mathcal{C}(p,g(x)(1\oplus x),q,d)$, and in addition, any line of slope $i$ for $0\leqslant i\leqslant r-1$ has even parity (recall that the arrays in a BR code are $(p-1)\times p$ arrays).
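The geometrical description can be turned into a membership test. The sketch below is restricted to the binary special case $g(x)=1$, in which the column code is simply the even-parity code; the names `line_parity` and `in_ebr_even_columns` are hypothetical, and `r` is taken to be the number of slopes $0, 1, \ldots, r-1$ that must have even parity.

```python
def line_parity(array, slope, u0):
    """XOR of the entries on the line of the given slope through (u0, 0)."""
    p = len(array)
    parity = 0
    for v in range(p):
        parity ^= array[(u0 - slope * v) % p][v]
    return parity


def in_ebr_even_columns(array, r):
    """True if every column of the p x p binary array has even parity and
    every line of slope 0..r-1 has even parity (case g(x) = 1, q = 2)."""
    p = len(array)
    columns_ok = all(
        sum(array[u][v] for u in range(p)) % 2 == 0 for v in range(p)
    )
    lines_ok = all(
        line_parity(array, s, u0) == 0 for s in range(r) for u0 in range(p)
    )
    return columns_ok and lines_ok


if __name__ == "__main__":
    a = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]   # rows, columns and diagonals even
    b = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]   # rows and columns even only
    print(in_ebr_even_columns(a, 2), in_ebr_even_columns(b, 2))
```

The two small $3\times 3$ arrays are hand-made test cases, not arrays taken from the paper.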
An important special case is given by code , as described in [5], i.e., g(x)\mbox{,=,}1 and each column in an array has even parity, as illustrated in the next example.
Example** 6**
.
The following is an array in :
[TABLE]
*We can see that each column has even parity (i.e., each column is in \mbox{{\cal C}}(p,1\oplus x,2,2)), as well as each line of slope 0, 1 and 2 (illustrated in red for the lines through entry ).
*
Example** 7**
.
The code consists of all the arrays having even parity on lines of slopes 0, 1 and 2 and whose columns are in the code \mbox{{\cal C}}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4), which is the subcode of the cyclic Hamming code generated by whose codewords have even weight [18].
The following are two arrays in :
[TABLE]
*We will see in Theorem 24 that, like a BR code, an EBR code has minimum distance on columns. As a consequence, the minimum Hamming distance of this code when considered as a code over is D\mbox{,=,}16. In effect, taking a non-zero array in the code, each non-zero column has weight at least four and there are at least four non-zero columns, so , while the array in the right above has weight exactly 16. As a comparison, the product code consisting of the product of \mbox{{\cal C}}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4) with the Hamming code generated by has the same rate as
but minimum Hamming distance 12.
*
Example** 8**
.
*Let p\mbox{,=,}q-1 be a prime number (in fact, a Mersenne prime), q\mbox{,=,}2^{b}, a primitive element in ,
g_{D-2}(x)\mbox{,=,}\prod_{i=1}^{D-2}(\beta^{i}\oplus x) and consider the RS code \mbox{{\cal C}}(p,g_{D-2}(x)(1\oplus x),q,D) [18]. Then the code as given by Definition 5 consists of all the arrays over having even parity on lines of slopes , , and whose columns are in the RS code \mbox{{\cal C}}(p,g_{D-2}(x)(1\oplus x),q,D).*
For example, we can take p\mbox{,=,}7\mbox{,=,}8-1, a primitive element in and g_{1}(x)\mbox{,=,}\beta\oplus x. Then, according to Definition 5, the code consists of all the arrays over having even parity on lines of slope 0, 1 and 2, and whose columns are in the RS code \mbox{{\cal C}}(7,(1\oplus x)(\beta\oplus x),8,3).
More concretely, if is a zero of the primitive polynomial , the reader can verify that the following is an array in :
[TABLE]
*We may assume that the first 5 symbols in the first 4 columns are a data array, while the remaining symbols are parity symbols.
*
The next lemma establishes the connection between EBR and BR codes.
**Lemma 9.** *There is a 1-1 relationship between EBR and BR codes preserving the (column) weight of each array in the code.*
**Proof: **Consider an array in of (column) weight . Denote the array as C\mbox{,=,}(\mbox{\underline{c}}_{0},\mbox{\underline{c}}_{1},\ldots,\mbox{\underline{c}}_{p-1}), where
\mbox{\underline{c}}_{j}\mbox{,=,}(c_{0,j},c_{1,j},\ldots,c_{p-1,j}) is a (column) vector of length for . For each \mbox{\underline{c}}_{j}, define a (column) vector of length \hat{\mbox{\underline{c}}}_{j}\mbox{,=,}(\hat{c}_{0,j},\hat{c}_{1,j},\ldots,\hat{c}_{p-2,j}) such that \hat{c}_{i,j}\mbox{,=,}c_{i,j}\oplus c_{p-1,j} for . Then, consider the transformation from to given by
[TABLE]
First we need to prove that in (7) is in . Since an array in (resp. ) consists of independent arrays in (resp. ), without loss of generality, we assume q\mbox{,=,}2. By Definition 2, if and only if every line of slope in the array consisting of with a zero row appended at the bottom has even parity for . Notice that, by (7), such an array is equal to , where is a array such that column of is an all-zero vector if c_{p-1,j}\mbox{,=,}0, otherwise it is an all-one vector. Since, in particular the weight of row in is even, the number of all-one columns in is even. Hence, any line of slope , , has even parity in and .
Next we have to show that $C$ and $\hat{C}$ have the same (column) weight. This is true because $\underline{c}_{j}$ is non-zero if and only if $\hat{\underline{c}}_{j}$ is non-zero for $0\leqslant j\leqslant p-1$. In effect, if $\underline{c}_{j}$ is non-zero and $c_{p-1,j} = 0$, then $c_{i,j} = \hat{c}_{i,j}$ for $0\leqslant i\leqslant p-2$ and $\hat{\underline{c}}_{j}$ is non-zero. If $\underline{c}_{j}$ is non-zero and $c_{p-1,j} = 1$, then, since $\underline{c}_{j}$ has even weight, the number of 1s among the $c_{i,j}$ for $0\leqslant i\leqslant p-2$ is odd, and so is the number of zeros, because $p-1$ is even. Hence, the number of 1s among the $\hat{c}_{i,j} = 1\oplus c_{i,j}$ for $0\leqslant i\leqslant p-2$ is odd, thus $\hat{\underline{c}}_{j}$ has odd weight and it cannot be a zero vector. ∎
**Corollary 10.** *The EBR code over $\mathcal{C}(p,g(x)(1\oplus x),q,d)$ given by Definition 5 is MDS.*
**Proof:** Let $C$ be an array in the EBR code of (column) weight $w$. Applying transformation (7) to $C$, we obtain an array $\hat{C}$ in the BR code. Since, by Lemma 9, $\hat{C}$ has also (column) weight $w$, and since the BR code is MDS [6], then $w\geqslant r+1$ and also the EBR code is MDS. ∎
Example** 11**
.
Consider the array in given in Example 6. Then the following is the transformation of this array into an array in as given by (7), where we add a row of zeros to the array in :
[TABLE]
*We can see that the parity along all the lines of slope 0, 1 and 2 is preserved.
*
Since it is well known how to encode and decode BR codes [6, 14], the same can be done for EBR codes. In effect, the first step in the decoding consists of recovering up to $d-1$ erasures in each column (i.e., locally) whenever possible using the cyclic code $\mathcal{C}(p,g(x)(1\oplus x),q,d)$ (since the code is cyclic, a burst of up to $\deg(g)+1$ erasures can also be recovered, as is the case in the encoding). Once this is done, the array in the EBR code is transformed into an array in the BR code using transformation (7). This array is decoded in the BR code using, for example, the method in [6] or in [14]. Once the decoding is complete, the inverse transformation is applied to obtain the original array in the EBR code. More efficient encoding and decoding methods, though, will be presented in Section II.
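A minimal sketch of transformation (7) and of its inverse follows, based on the rule $\hat{c}_{i,j} = c_{i,j}\oplus c_{p-1,j}$ from the proof of Lemma 9. The function names are hypothetical. Symbols are represented as Python integers and addition in $GF(2^{b})$ as bitwise XOR. The inverse uses an observation that is not spelled out in the text: since $p-1$ is even and each original column XORs to zero, the XOR of the $p-1$ entries of a transformed column equals the dropped symbol $c_{p-1,j}$.

```python
def ebr_to_br(array):
    """Map a p x p array to the (p-1) x p array with entries
    c[i][j] ^ c[p-1][j], dropping the last row."""
    p = len(array)
    return [
        [array[i][j] ^ array[p - 1][j] for j in range(p)]
        for i in range(p - 1)
    ]


def br_to_ebr(hat):
    """Inverse map, assuming every original column XORs to zero: the
    dropped symbol of column j is the XOR of the entries of hat column j."""
    rows, p = len(hat), len(hat[0])
    last = [0] * p
    for i in range(rows):
        for j in range(p):
            last[j] ^= hat[i][j]
    restored = [[hat[i][j] ^ last[j] for j in range(p)] for i in range(rows)]
    return restored + [last]


if __name__ == "__main__":
    a = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]  # hand-made array, columns of even parity
    hat = ebr_to_br(a)
    print(hat, br_to_ebr(hat) == a)
```

The number of non-zero columns is the same before and after the map, in line with Lemma 9.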
Quite often, it is desirable that the parity columns in an array code are independent, like in [2, 4, 7], since this property allows for a minimization of the number of parity updates when a data symbol is updated. Next we will do the same for array codes with local properties. We start with the definition of Independent Parity (IP) codes [2, 4].
Definition** 12**
.
An IP code with parity columns, denoted , consists of all possible arrays over such that, when a zero row is appended to the array, for each , , the sets
[TABLE]
for have all the same parity, either even or odd. ∎
IP codes are also known as Generalized EVENODD codes [3] and Blaum–Bruck–Vardy codes [17] in the literature.
Example** 13**
.
The array consisting of the first 4 rows of the following array is in :
[TABLE]
*We can see that given , , all the lines of slope together with the corresponding independent parity in column (illustrated in red for the lines through entry ) have the same parity: even for lines of slope 0, odd for lines of slope 1 and even for lines of slope 2.
*
Similarly to codes, there is an equivalent algebraic description of codes [2, 4]. Explicitly:
Definition** 14**
.
An code is the code over the ring of polynomials modulo whose parity-check matrix is given by the matrix
[TABLE]
where M_{p,q}(\alpha)\mbox{,=,}0 and . ∎
By Definition 14, each array in can be considered as a array such that each column represents an element in the ring of polynomials modulo .
Next we define Expanded Independent Parity (EIP) codes:
Definition** 15**
.
*Consider the cyclic code \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d) and \alpha\in\mbox{{\cal R}}_{p}(q) as in Definition 5. Then an EIP code
is the code over \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d) whose parity-check matrix is given by (14). ∎*
Contrary to IP codes, the parities in EIP codes are always even.
The next example illustrates Definition 15.
Example** 16**
.
The following is an array in an EIP code (the data, corresponding to the first 4 rows and the first 5 columns, is the same as in the array given in Example 13):
[TABLE]
*Each column has even parity (i.e., each column is in \mbox{{\cal C}}(p,1\oplus x,2,2)), as well as each line of slope 0, 1 and 2 together with the corresponding independent parity (illustrated in red for the lines through entry ).
*
The next lemma, similarly to Lemma 9, establishes the connection between the codes and :
Lemma** 17**
.
*There is a 1-1 relationship between and preserving the (column) weight of each array in the code. *
**Proof: **Proceeding as in Lemma 9, denote an array as C\mbox{,=,}(\mbox{\underline{c}}_{0},\mbox{\underline{c}}_{1},\ldots,\mbox{\underline{c}}_{p+r-1}), where each \mbox{\underline{c}}_{j} is a (column) vector of length for and \hat{\mbox{\underline{c}}}_{j} is defined as in the proof of Lemma 9. Then, consider the transformation from to given by
[TABLE]
We have to prove that in (15) is in . As in Lemma 9, without loss of generality we may assume that q\mbox{,=,}2. Consider the array consisting of with a zero-row appended. By Definition 12 of , we have to prove that in such array, the lines of slope in the first columns of the array, , starting in entry , , together with entry , have all the same parity, either even or odd. As in Lemma 9, by (15), such an array is equal to , where is a array such that column of is an all-zero vector if c_{p-1,j}\mbox{,=,}0, otherwise it is an all-one vector. Consider vector \mbox{\underline{v}}_{s}\mbox{,=,}(c_{p-1,0},c_{p-1,1},\ldots,c_{p-1,p-1},c_{p-1,p+s}), where , and let be the matrix consisting of columns of . If the weight of \mbox{\underline{v}}_{s} is even, then the number of all-one columns in is even and any line of slope in the first columns of the array through entry , (as given by Definition 1), together with entry , has even parity. Otherwise, all such lines together with entry have odd parity.
Regarding the weight preservation, the argument is the same as in Lemma 9. ∎
Example** 18**
.
Consider the array in given in Example 16. The transformation from to given by (15) is (by appending a row of zeros to the array in )
[TABLE]
*We can see that in the array in the right, every line of slope 0 and 1 with its corresponding independent parity bit has even parity, while every line of slope 2 with its corresponding independent parity bit has odd parity (this case illustrated in red for the line through entry ).
*
Corollary** 19**
*If code is MDS, then code given by Definition 15 is also MDS. *
**Proof: **Similar to Corollary 10. ∎
Contrary to codes, codes are not always MDS. In particular, codes, and hence, by Corollary 19,
codes, are MDS for [2, 4]. For the codes are MDS depending on the prime number chosen. A list of prime numbers for which is MDS and is given in [4]. See also [17].
In the definitions of codes and , it is assumed that each column in an array in the code is in the cyclic code \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d). It is certainly possible to extend the definition such that each data column is in a cyclic code \mbox{{\cal C}}(p,g_{j}(x)(1\oplus x),q,d_{j}), where each divides and \gcd(g_{j}(x),1\oplus x)\mbox{,=,}1. In this case, up to erasures can be corrected in data column , giving unequal erasure correction for the data columns. The parity columns are in \mbox{{\cal C}}(p,g^{\prime}(x)(1\oplus x),q,d^{\prime}), where is the greatest common divisor of the s.
Before proceeding further, we briefly discuss the applications and advantages of expanded array codes over other array codes. The applications of expanded array codes are the same as the ones of traditional array codes, like BR [6], EVENODD [2], IP [3, 4], RDP [8], generalized RDP [1, 9] and codes with distributed parities [19, 23, 24, 25]. Mainly, these array codes can be used in RAID type of architectures [10], like RAID 6, which requires two parity columns. In addition, expanded array codes contain vertical parities, providing for local recovery [12]. Array codes are an alternative to RS codes [18], which require finite field operations. Since array codes are based on XOR operations, in general their implementation has less complexity than the one of codes based on finite fields.
Another application of array codes is the cloud. In this case, each entry may correspond to a whole device. Erasure codes involving local and global parities may be invoked [11, 12, 13]. Array codes naturally provide horizontal locality as well as column recovery, so they can be used as Locally Recoverable (LRC) Codes. Traditional array codes do not have vertical parity. In some applications, a column may represent a device, and each entry in the column a sector or a page in the device. It may be desirable to have vertical parities in an array code, so, if a number of sectors or pages fail (for instance, if their internal ECCs are exceeded and such a situation is detected by the CRCs), the failed sectors or pages can be recovered locally without invoking other devices (a first responder type of approach). A way to achieve this goal is by using an array code like a BR or an IP code and then encoding each column with vertical parities. A problem with this approach is that the column parities are not protected by the other parities. This problem is overcome by the expanded array codes described above.
codes have another interesting property. In addition to being able to recover any erased columns, they can recover also from a number of erased lines of slope , . We will study this property in some detail in Section IV, but in the meantime we illustrate this property with a simple example.
Example** 20**
.
Assume that p\mbox{,=,}5, and we have a array such that we encode a data array consisting of zeros into a array in code , and then we append a parity row. The result is a 0-array. Next, assume that two lines of slope 1 are erased, say, lines 1 (in blue) and 4 (in red) as follows, where the symbol corresponds to an erasure:
[TABLE]
This situation admits two solutions as shown below, since the top 4 rows in the array in the right are in :
[TABLE]
*Hence, the two erased lines of slope 1 are uncorrectable. If we encode the data array of zeros into a array in , we also obtain the zero array, but there is a unique decoding to the two erased lines of slope 1, given by the zero array in the left. Since the two colored diagonals in the array in the right have odd parity, this array cannot be in and the solution is unique.
*
If each entry represents a page or a sector, for example, a 64K sector, the parity sectors of an expanded array code, being obtained as XORs of data sectors, inherit the CRC bits by linearity, i.e., the CRC of the parity sectors does not need to be computed. This is an important advantage in implementation. Finally, expanded array codes naturally provide multiple localities to recover a single failed symbol, a problem that has attracted attention in the recent literature [21, 22, 26].
The paper is structured as follows: in Section II, we present efficient encoding and decoding algorithms for the array codes we have defined above (i.e., EBR and EIP codes). In Section III, we examine the problem of the minimum (symbol, as opposed to column) distance of such array codes. In Section IV, we study conditions under which erased lines of slope , , can be recovered in EBR codes (as illustrated in Example 20 for s\mbox{,=,}1). Section V discusses the puncturing of EBR and EIP codes to obtain MDS codes consisting of or arrays, where for certain values of . We end the paper by drawing some conclusions.
II Encoding, Decoding and Updating of a Data Symbol in EBR and EIP Codes
We start with a technical lemma.
Lemma** 21**
.
*Let be an irreducible polynomial on such that divides , prime, and \gcd(g(x),x\oplus 1)\mbox{,=,}1. Then, for each such that , \gcd(g(x),x^{i}\oplus 1)\mbox{,=,}1. *
**Proof: **Assume that the lemma is not true, hence, since is irreducible, there is an , , such that divides . Moreover, assume that is minimal with this property. Since \gcd(g(x),x\oplus 1)\mbox{,=,}1, then . Let p\mbox{,=,}ci+r, where, since is prime, . We can easily verify that
[TABLE]
Since divides both and , then, by (17), it also divides , contradicting the minimality of . ∎
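Lemma 21 can be checked numerically for a small case. The sketch below uses $p=7$ and $g(x)=1\oplus x\oplus x^{3}$, the polynomial of Example 7, and stores polynomials over $GF(2)$ as integer bitmasks (bit $i$ is the coefficient of $x^{i}$). The helpers `gf2_mod` and `gf2_gcd` are illustrative names, not part of the paper.

```python
def gf2_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2); b must be non-zero."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a


def gf2_gcd(a, b):
    """Greatest common divisor of two polynomials over GF(2) (Euclid)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


p = 7
g = 0b1011  # 1 + x + x^3

# g(x) divides x^p + 1 ...
divides = gf2_mod((1 << p) | 1, g) == 0
# ... and is coprime to x^i + 1 for every 1 <= i <= p - 1.
gcds = [gf2_gcd(g, (1 << i) | 1) for i in range(1, p)]

if __name__ == "__main__":
    print(divides, gcds)
```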
The following lemma gives a recursion that will be used in the decoding of EBR codes.
Lemma** 22**
.
Let \mbox{\underline{v}}(\alpha)\mbox{,=,}\bigoplus_{i=0}^{p-1}v_{i}\alpha^{i}\in\mbox{{\cal C}}(p,g(x)(1\oplus x),q,d), where \alpha\in\mbox{{\cal R}}_{p}(q), and \alpha^{p}\mbox{,=,}1. Then, for each such that , the recursion (1\oplus\alpha^{j})\underline{z}(\alpha)\mbox{,=,}\mbox{\underline{v}}(\alpha) has a unique solution in \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d). Specifically, if \underline{z}(\alpha)\mbox{,=,}\bigoplus_{i=0}^{p-1}z_{i}\alpha^{i}, then
[TABLE]
∎
**Proof: **If is known, by solving recursively (1\oplus\alpha^{j})\underline{z}(\alpha)\mbox{,=,}\mbox{\underline{v}}(\alpha), (19) is obtained. Since is prime, all the entries for are covered by this recursion. In particular, if i\mbox{,=,}1, z_{j}\mbox{,=,}z_{0}\oplus v_{j}. Using this result as our starting point, we obtain
[TABLE]
XORing both sides of (20) from i\mbox{,=,}1 to i\mbox{,=,}p-1, we have
[TABLE]
Since, in particular, must have even weight, then \bigoplus_{i=1}^{p-1}z_{i}\mbox{,=,}z_{0}. Also, since is odd, (p-1)z_{0}\mbox{,=,}0. Finally,
[TABLE]
Replacing these values in (21), we obtain (18).
It remains to be proven that \underline{z}(\alpha)\in\mbox{{\cal C}}(p,g(x)(1\oplus x),q,d). Hence, we have to prove that divides . Certainly, divides since has even weight. Without loss of generality, assume that is irreducible (otherwise, take an irreducible factor of ). Since divides \mbox{\underline{v}}(x), (1\oplus\alpha^{j})\underline{z}(\alpha)\mbox{,=,}\mbox{\underline{v}}(\alpha) and, by Lemma 21, \gcd(g(x),1\oplus x^{j})\mbox{,=,}1 for , divides . ∎
Lemma 22 was proven in [14], Lemma 7, and in [15], Lemma 13, for the special case \mbox{{\cal C}}(p,1\oplus x,2,2).
The next example illustrates Lemma 22:
Example** 23**
.
Let p\mbox{,=,}7, \mbox{\underline{v}}(\alpha)\mbox{,=,}1\oplus\alpha\oplus\alpha^{4}\oplus\alpha^{6}\in\mbox{{\cal C}}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4) (see Example 7), i.e., v_{0}\mbox{,=,}1, v_{1}\mbox{,=,}1, v_{2}\mbox{,=,}0, v_{3}\mbox{,=,}0, v_{4}\mbox{,=,}1, v_{5}\mbox{,=,}0 and v_{6}\mbox{,=,}1. Assume that we want to solve the recursion (1\oplus\alpha^{3})\underline{z}(\alpha)\mbox{,=,}\mbox{\underline{v}}(\alpha). According to (18) and (19), since j\mbox{,=,}3, \mbox{\langle}2j\mbox{\rangle}_{7}\mbox{,=,}6, so
[TABLE]
*so \underline{z}(\alpha)\mbox{,=,}\alpha^{2}\oplus\alpha^{4}\oplus\alpha^{5}\oplus\alpha^{6}. In particular, we can see that \underline{z}\in\mbox{{\cal C}}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4).
*
Observe that the recursion in Lemma 22 involves XORs.
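The following sketch solves $(1\oplus\alpha^{j})\underline{z}(\alpha) = \underline{v}(\alpha)$ and reproduces the numbers of Example 23. It is not a transcription of (18) and (19): it starts from the relation $z_{j} = z_{0}\oplus v_{j}$ used in the proof, walks the indices $j, 2j, 3j, \ldots$ modulo $p$ with a tentative $z_{0}=0$, and then fixes $z_{0}$ from the condition that the entries of $\underline{z}$ XOR to zero. The function names are hypothetical, and symbols are integers with XOR as addition.

```python
def solve_recursion(v, j):
    """Solve (1 + alpha^j) z = v modulo x^p + 1, p = len(v) an odd prime,
    1 <= j <= p - 1, returning the solution whose entries XOR to zero."""
    p = len(v)
    z = [0] * p                      # tentative choice z[0] = 0
    prev, m = 0, j % p
    while m != 0:
        z[m] = z[prev] ^ v[m]        # z_m = z_{m-j} + v_m
        prev, m = m, (m + j) % p
    # Adding the same constant to every entry leaves (1 + alpha^j) z
    # unchanged; since p is odd, this constant makes the total XOR zero.
    c = 0
    for symbol in z:
        c ^= symbol
    return [symbol ^ c for symbol in z]


def apply_one_plus_alpha_j(z, j):
    """Coefficients of (1 + alpha^j) z, where alpha^j rotates z by j."""
    p = len(z)
    return [z[m] ^ z[(m - j) % p] for m in range(p)]


if __name__ == "__main__":
    v = [1, 1, 0, 0, 1, 0, 1]        # 1 + a + a^4 + a^6, as in Example 23
    z = solve_recursion(v, 3)
    print(z, apply_one_plus_alpha_j(z, 3) == v)
```

With the input of Example 23 the solver returns the coefficients of $\alpha^{2}\oplus\alpha^{4}\oplus\alpha^{5}\oplus\alpha^{6}$.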
Next we will show how to correct up to erased columns in by adapting the method in [6]. The next theorem was proven in [5] for . The proof is analogous, but we give it for the sake of completeness.
Theorem** 24**
.
*Code given by Definition 5 can correct up to erasures or a burst of up to (consecutive) erasures in each column and up to erased columns. *
**Proof: **Given an array in , since columns are in the cyclic code \mbox{{\cal C}}(p,g(x)(1\oplus x),q,d), up to erasures can be corrected in each column, and also a burst of up to the degree of the generator polynomial , i.e.,
(systematic encoding is a special case of recovering such a burst of erasures).
Next assume that columns have been erased, where , and we denote by the (erased) value of column . Consider the polynomial of degree
[TABLE]
Notice that
[TABLE]
Denote the columns of the array by \mbox{\underline{c}}_{u}, where . Assuming that the erased columns are zero, compute the syndromes
[TABLE]
Hence, from (25), we also have
[TABLE]
From (22), (23), (24) and (26),
[TABLE]
After computing , can be obtained by applying the recursion given by (18) and (19) in Lemma 22 times. Once is obtained, we are left with erasures, and we proceed by induction. ∎
The next example illustrates the decoding procedure given in Theorem 24.
Example** 25**
.
Consider the code of Example 7 and assume that we want to decode the following array, where the blank spaces denote erasures:
[TABLE]
We can see that columns 1, 3 and 6 are erased, columns 0 and 4 contain three erasures each and columns 2 and 5 contain a burst of length four (in particular, the burst in column 2 is an all-around burst, but it can be corrected also since the code is cyclic). Since the columns are in the cyclic code \mbox{{\cal C}}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4), the first step is obtaining the erasures in columns 0, 2, 4 and 5. Once this is done, we obtain
[TABLE]
By (22),
[TABLE]
Assuming that the erased columns are zero when computing the syndromes, by (25), we obtain
[TABLE]
Using (28), we compute
[TABLE]
By (27) and (29), we have to solve the double recursion
[TABLE]
or, multiplying both sides by ,
[TABLE]
Let (1\oplus\alpha^{5})\underline{e}_{0}\mbox{,=,}\mbox{\underline{v}}_{0}, then we have to solve first
[TABLE]
Applying the recursion given by (18) and (19) as illustrated in Example 23, we obtain
[TABLE]
Next we have to solve
[TABLE]
This gives,
[TABLE]
Recomputing the syndromes,
[TABLE]
Repeating the procedure for two erasures, we now have
[TABLE]
which gives
[TABLE]
We have to solve the recursion
[TABLE]
hence,
[TABLE]
Finally, we recompute
[TABLE]
The final decoded array is then
[TABLE]
*which coincides with the first array given in Example 7.
*
The encoding is a special case of the decoding. For example, we may use the last rows and the last columns to store the parities. We start by encoding systematically [18] the first columns using the generator polynomial . Next we compute the last columns using the decoding procedure as described in Theorem 24. Since at the encoding the erasures are always in the last columns, we may precompute the coefficients of G(x)\mbox{,=,}\prod_{j=1}^{r-1}(x\oplus\alpha^{p-r+j}), making the process faster.
Let us examine next codes. The encoding of codes is very simple and is a direct consequence of Definition 15: given a data array that we denote as \mbox{\underline{v}}_{0},\mbox{\underline{v}}_{1},\ldots,\mbox{\underline{v}}_{p-1}, each \mbox{\underline{v}}_{j} a (vertical) vector of length over , we start by encoding systematically each \mbox{\underline{v}}_{j} into \mbox{\underline{c}}_{j}\in\mbox{{\cal C}}(p,g(x)(1\oplus x),q,d). The result is a array. Then we obtain the parity columns \mbox{\underline{c}}_{p+s}, , as \mbox{\underline{c}}_{p+s}\mbox{,=,}\bigoplus_{j=0}^{p-1}\alpha^{sj}\mbox{\underline{c}}_{j}. Thus, once the \mbox{\underline{c}}_{j}s have been obtained, each \mbox{\underline{c}}_{p+s} requires XORs.
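The encoding just described can be sketched for the simplest vertical code, the binary even-parity code (the case $g(x)=1$, $q=2$). The names `rotate` and `eip_encode` are hypothetical; the array is stored as a list of columns, and multiplication by $\alpha^{k}$ is implemented as a rotation that moves entry $i$ to position $\langle i+k\rangle$.

```python
def rotate(col, k):
    """Multiply a column by alpha^k: entry i moves to position (i + k) mod p."""
    p = len(col)
    return [col[(i - k) % p] for i in range(p)]


def eip_encode(data, r):
    """data: p columns of p - 1 bits each. Returns p + r columns of p bits:
    each data column with its vertical parity bit appended, followed by
    the r parity columns c_{p+s} = XOR_j alpha^{s j} c_j, s = 0..r-1."""
    p = len(data)
    cols = []
    for d in data:
        parity = 0
        for bit in d:
            parity ^= bit
        cols.append(list(d) + [parity])
    for s in range(r):
        acc = [0] * p
        for j in range(p):
            acc = [a ^ b for a, b in zip(acc, rotate(cols[j], s * j))]
        cols.append(acc)
    return cols


if __name__ == "__main__":
    for column in eip_encode([[0, 0], [1, 0], [0, 0]], 2):
        print(column)
```

Because rotations and XORs preserve even parity, the parity columns automatically have even parity as well, so no separate vertical encoding of the parity columns is needed.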
Let us consider next the special case of codes. In a first step, we need to obtain symbols for
as the XOR of symbols for , which takes XORs for each . Hence, the total number of XORs required by the encoding algorithm is . If we shorten the code to columns, where , by assuming that of the columns are zero, then the total number of XORs required at the encoding is . In particular, if r\mbox{,=,}2 and , the total number of XORs at the encoding is . The number of XORs according to the optimized encoding algorithm for with data columns given in [5] is , so the encoding algorithm for also with data columns is more efficient.
Table I compares the number of XORs required by the encoding algorithm of EIP codes and by the encoding algorithm of EBR codes given in [5], with both codes shortened to the same number of data columns. The encoding algorithm of EIP codes always requires fewer XORs than the optimized encoding algorithm of EBR codes in [5]. Table I shows that the savings are more dramatic when the number of data columns is small.
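The entries of Table I can be reproduced with two closed forms, stated here as assumptions rather than as the paper's own expressions. For the EIP code with two parity columns and `k` data columns (a hypothetical name for the number of data columns), counting $p-2$ XORs per vertical parity and $(k-1)p$ XORs per parity column gives $k(p-2)+2(k-1)p$. For the algorithm from [5], the expression $3kp-k-2$ is simply fitted to the rows of the table and is not taken from [5].

```python
# Rows of Table I: (p, data columns, XORs for [5], XORs for EIP).
TABLE_I = [
    (17, 8, 398, 358), (17, 15, 748, 701),
    (127, 8, 3038, 2778), (127, 50, 18998, 18696), (127, 125, 47498, 47121),
    (257, 8, 6158, 5638), (257, 50, 38498, 37936), (257, 255, 196348, 195581),
]


def xors_eip(p, k):
    """k vertical parities of p - 2 XORs each, plus two parity columns,
    each the XOR of k columns of length p (assumed counting)."""
    return k * (p - 2) + 2 * (k - 1) * p


def xors_ref5(p, k):
    """Closed form fitted to the table (assumption, not quoted from [5])."""
    return 3 * k * p - k - 2


def improvement_percent(p, k):
    """Relative saving of the EIP count over the count for [5], in percent."""
    a, b = xors_ref5(p, k), xors_eip(p, k)
    return 100.0 * (a - b) / a


if __name__ == "__main__":
    for p, k, ref5, eip in TABLE_I:
        print(p, k, xors_ref5(p, k), xors_eip(p, k),
              f"{improvement_percent(p, k):.1f}%")
```

The difference between the two counts is $2p-k-2$, which does not grow with the number of data columns, while both counts do; this is consistent with the larger relative savings for few data columns.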
Regarding the decoding of codes, the first step is always correcting up to erasures in each column or a burst of up to erasures wherever this is feasible. Once this is done, assuming that the code is MDS and up to columns are erased, we can apply transform (15) and decode the array in . Then the inverse transformation will give the desired array in .
If the erased columns correspond to data columns, i.e., they are among the first columns in the array, the array can be decoded directly in applying the same method as for the decoding of . If some of the erased columns are among the parity columns, the decoding is more complicated since the recursion of Lemma 22 cannot be applied. The case r\mbox{,=,}2 is simple to handle though, since when one of the two parity columns is erased, it is corrected as a special case.
We end this section with the problem of updating EBR and EIP codes. The question is, when updating one data symbol, how to minimize the number of parity symbols that need to be updated, a problem that has been treated repeatedly in the literature on array codes [2, 4, 7, 19, 23, 24]. Actually, EBR codes have bad updating properties, since the parities are not independent and updating one data symbol causes the updating of most of the parity symbols. The same is true for BR codes, and the creation of codes like EVENODD [2] arises from the need of optimizing the number of updates by making the parities independent. Hence, in what follows, we concentrate on EIP codes only.
As usual, denote an array in as $(\underline{c}_{0},\underline{c}_{1},\ldots,\underline{c}_{p-1},\underline{c}_{p},\underline{c}_{p+1},\ldots,\underline{c}_{p+r-1})$, where each $\underline{c}_{j}$ is a (column) vector of length . Each time a data symbol , , , is updated, first we need to update the parity symbols in column . In effect, if data symbol is replaced by symbol , consider the (vertical) vector $\underline{v}_{j}$ of length that is zero everywhere except in location , where it is equal to . Encoding (systematically) $\underline{v}_{j}$ into ${\cal C}(p,g(x)(1\oplus x),q,d)$, we obtain a (vertical) vector that we denote $\underline{c}^{\prime}_{j}$. Once $\underline{c}^{\prime}_{j}$ is obtained, we update $\underline{c}_{j}$ as $\underline{c}_{j}\oplus\underline{c}^{\prime}_{j}$ and the parity vectors $\underline{c}_{p+s}$, , as $\underline{c}_{p+s}=\underline{c}_{p+s}\oplus\alpha^{sj}\underline{c}^{\prime}_{j}$. Let us illustrate the process in the next example.
**Example 26.**
Consider the following array in code :
[TABLE]
Assume that we want to update symbol . The first step is encoding (systematically) $\underline{v}_{1}=(0,0,1)$ in ${\cal C}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$. Doing so, we obtain $\underline{c}_{1}^{\prime}=(0,0,1,0,1,1,1)$ and $\underline{c}_{1}\oplus\underline{c}^{\prime}_{1}=(0,1,0,1,1,1,0)$.
Then, $\underline{c}_{7}\oplus\underline{c}^{\prime}_{1}=(1,0,1,1,1,0,0)$, $\underline{c}_{8}\oplus\alpha\underline{c}^{\prime}_{1}=(1,0,0,1,0,1,1)$ and $\underline{c}_{9}\oplus\alpha^{2}\underline{c}^{\prime}_{1}=(1,1,1,0,0,1,0)$. The updated array is then
[TABLE]
We can see that the lowest number of updates in the parity symbols that an code as given by Definition 14 can make is . This is the case in Example 26, where we update $(3)(4)-1=11$ parity symbols. The reason is that the systematic encodings of the three vectors of weight 1 and length 3 in the vertical code ${\cal C}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ all have weight 4, the minimum distance of the code. Let us state this observation as a lemma.
**Lemma 27.**
Consider an with vertical code ${\cal C}(p,g(x)(1\oplus x),q,d)$. Then the number of parity updates when a data symbol is updated reaches the optimal value if and only if the systematic encoding of each vector of weight one and length has weight . ∎
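The weight condition of Lemma 27 can be checked numerically for the vertical code of Example 26. The sketch below is only an illustration: the helper names `poly_mul_gf2` and `systematic_encode` are hypothetical, and the brute-force search over all messages is a convenience for this tiny code, not the encoder used in the paper. It assumes, as in Example 26, that the data occupy the first entries of each column.

```python
from itertools import product

def poly_mul_gf2(a, b):
    """Multiply two GF(2) polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

def systematic_encode(data, gen, p):
    """Return the length-p codeword whose first len(data) entries equal data.

    Brute force over all message polynomials (assumption: fine for tiny codes only).
    """
    k = p - (len(gen) - 1)              # dimension of the cyclic code
    assert len(data) == k
    for msg in product([0, 1], repeat=k):
        word = poly_mul_gf2(list(msg), gen)
        word += [0] * (p - len(word))   # pad to length p
        if word[:k] == list(data):
            return word
    raise ValueError("no systematic codeword found")

# Vertical code of Example 26: generator (1 + x + x^3)(1 + x) over GF(2), p = 7.
p = 7
g = poly_mul_gf2([1, 1, 0, 1], [1, 1])
k = p - (len(g) - 1)

unit_codewords = []
for loc in range(k):
    v = [0] * k
    v[loc] = 1                          # weight-one data vector
    unit_codewords.append(systematic_encode(v, g, p))

weights = [sum(c) for c in unit_codewords]
for c, w in zip(unit_codewords, weights):
    print(c, "weight", w)
```

Each of the three weight-one data vectors encodes to a column of weight 4, and the third one reproduces the vector $(0,0,1,0,1,1,1)$ of Example 26.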
**Corollary 28.**
*Consider an code and assume that the vertical code ${\cal C}(p,g(x)(1\oplus x),q,d)$ is MDS. Then the number of parity updates when a data symbol is updated reaches the optimal value .*
**Proof:** Simply observe that if ${\cal C}(p,g(x)(1\oplus x),q,d)$ is MDS then the systematic encoding of each vector of weight 1 and length has weight $2+\deg(g(x))=d$ and the result follows from Lemma 27. ∎
**Corollary 29.**
*Consider an code. Then the number of parity updates when a data symbol is updated reaches the optimal value .*
**Proof:** This is the special case of Corollary 28 in which the field is binary and the vertical code is the ${\cal C}(p,1\oplus x,2,2)$ parity code, which in particular is MDS. ∎
III Minimum Hamming Distance of Array Codes with Local Properties
In this section we consider a problem that has not received much attention in the literature on array codes. In general, when we talk about the distance of an array code, we refer to the column distance. In this section we want to consider the symbol distance. Having a high symbol distance may be important when erased columns co-exist with erased symbols. We will consider both and codes. We have already seen that codes are MDS, i.e., their column distance is , while are MDS depending on the prime number considered and the value of . We will simply call the Hamming distance the symbol distance of a code, otherwise we refer to the column distance.
Let us start with a lower bound.
**Lemma 30.**
*Let $D$ be the Hamming distance of an or an MDS code. Then, $D\geqslant d(r+1)$.*
**Proof:** Take a non-zero array in or in . Then, since the code is MDS, at least columns in the array are non-zero. Since each non-zero column has weight at least , the result follows. ∎
Finding for an MDS code is easy, as shown in the next corollary.
**Corollary 31.**
*Consider an MDS code with minimum Hamming distance . Then, $D=d(r+1)$.*
**Proof:** Take a array consisting of a column of weight in ${\cal C}(p,g(x)(1\oplus x),q,d)$, while the remaining columns are zero. Encoding this array in , since the parity columns are rotations of the non-zero data column, they also have weight , so we obtain an array with columns of weight , hence . The result then follows from Lemma 30. ∎
From now on, we consider codes only.
**Lemma 32.**
*Consider the code , where either or . Then the minimum Hamming distance of is $D=2(r+1)$.*
**Proof:** Since $d=2$, by Lemma 30, . So, it is enough to exhibit an array of weight in when or . The case $r=1$ is trivial.
Denoting the entries of an array by , where , consider the array that is 1 in entries , , , , and and 0 elsewhere. This array is in and it has Hamming weight 6.
Similarly, consider the array that is 1 in entries , , , , , , and and 0 elsewhere. This is an array in and it has Hamming weight 8.
Next take $r=p-2$. Consider the array , , of Hamming weight , such that:
[TABLE]
By construction, since $i+1\neq\langle (i+1)/2\rangle$ for , each of the first rows has exactly two 1s, while the last row is zero. Similarly, the first column is zero while the last columns have two 1s each, hence, the array has even parity on rows and columns and weight . It remains to be proven that each line of slope , , has even parity.
In effect, for each such that , take the line of slope , , through entry . By Definition 1, the entries in this line are $c_{\langle w-jv\rangle,v}$ for . Take first , then there is a unique such that $w-jv=i$ and $v=i+1$, namely, $i=\langle (w-j)/(j+1)\rangle$ (notice that this is possible since ). Similarly, there is a unique such that $w-jv=i$ and $v=\langle (i+1)/2\rangle$, namely, $i=\langle (2w-j)/(j+2)\rangle$ (notice that this is possible since ). Hence, any line of slope , , starting at for , has exactly two 1s and thus even parity. If $w=p-1$, the lines of slope starting at entry can be shown to contain no 1s, hence they also have even parity, so the array is in and has Hamming weight .
If $r=p-1$, consider the following array , , of Hamming weight :
[TABLE]
Using the same methods as above, we can see that each line of slope , , contains exactly two 1s, and hence it has even parity, so the array is in and has Hamming weight . ∎
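The $r=p-2$ case in the proof of Lemma 32 lends itself to a quick numerical check. The following sketch assumes the array has 1s exactly in entries $(i,i+1)$ and $(i,\langle (i+1)/2\rangle)$ for $0\leqslant i\leqslant p-2$; this is a reading reconstructed from the proof text, since the table defining the array is not reproduced here. Lines follow the convention quoted from Definition 1, i.e., the line of slope $j$ through $(w,0)$ consists of the entries $c_{\langle w-jv\rangle,v}$. The function names are hypothetical.

```python
def lemma32_array(p):
    """Array with ones at (i, i+1) and (i, (i+1)/2 mod p) for 0 <= i <= p-2.

    Assumption: this is the r = p-2 array described in the proof of Lemma 32.
    """
    inv2 = pow(2, -1, p)                 # inverse of 2 modulo p (Python 3.8+)
    c = [[0] * p for _ in range(p)]
    for i in range(p - 1):
        c[i][(i + 1) % p] = 1
        c[i][((i + 1) * inv2) % p] = 1
    return c

def line_parity(c, slope, w):
    """Parity of the line of the given slope through (w, 0): entries c[<w - slope*v>][v]."""
    p = len(c)
    return sum(c[(w - slope * v) % p][v] for v in range(p)) % 2

def column_parity(c, v):
    """Parity of column v (the line of slope infinity through (0, v))."""
    return sum(row[v] for row in c) % 2

def check(p):
    """Return (Hamming weight, columns even?, lines of slope 0..p-3 even?)."""
    c = lemma32_array(p)
    weight = sum(map(sum, c))
    cols_ok = all(column_parity(c, v) == 0 for v in range(p))
    lines_ok = all(line_parity(c, j, w) == 0
                   for j in range(p - 2) for w in range(p))
    return weight, cols_ok, lines_ok

for p in (5, 7, 11):
    print(p, check(p))
```

For $p=7$ the check reports weight 12, matching the value $2(r+1)$ with $r=5$, with even parity on all columns and on all lines of slope 0 through 4.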
**Example 33.**
Consider $p=7$. The following are arrays of weight 4, 6, 8, 10 and 12 in , , , and respectively according to the proof of Lemma 32:
[TABLE]
[TABLE]
Lemma 32 shows that the bound of Lemma 30 is tight for when and . What happens for ? Going back to Example 33, consider . An exhaustive search shows that the minimum Hamming distance of is $D=12$, so in this case the bound is not tight. The following is an array in of weight 12:
[TABLE]
Let us point out that a product code of an MDS horizontal code (like RS) with parities with a vertical parity code has minimum Hamming distance .
A possible competitor for an code is a code consisting of arrays such that their first rows are in and their last row is the XOR of such first rows. Let us call such a code ( was illustrated in Example 20). For example, if $p=7$, the following is an array in :
[TABLE]
We can easily see that both Lemmas 30 and 32 apply to codes. However, the minimum Hamming distance of is 10, which is less than the minimum Hamming distance 12 of . In effect, notice that the following is an array of weight 10 in :
[TABLE]
Finding the minimum Hamming distance of an code for is an open problem.
IV Recovery of Erased Lines of slope in an Code
In Section II, we have seen how to encode and decode an code. In particular, we have shown how to recover up to erased columns. Interestingly, an code can also recover a number of erased lines of slope , where . We say that an code is MDS on lines of slope , if the code can recover up to erased lines of such slope. We have shown in Theorem 24 that an code is MDS on lines of slope . What happens with the other slopes , ?
A trivial case corresponds to $r=1$: in this case, an code is a product code with parity on rows and columns. Any erased row or column can be recovered, hence, an code is MDS on lines of slope and on lines of slope 0.
The next case corresponds to $r=2$. We saw in Example 20 that the code can recover an erased pair of lines of slope 1. The following lemma proves that this is true for any .
**Lemma 34.**
*The code can recover any erased pair of lines of slope for .*
**Proof:** Assume that , , is an array in , and assume that two rows have been erased. Consider the transposed array $(b_{u,v})=(c_{u,v})^{\rm T}$, , i.e., $b_{u,v}=c_{v,u}$. Then, the array has two erased columns. It is enough to show that . Certainly, each line of slope 0 has even parity. Take a line of slope 1, i.e., according to Definition 1, entries such that $u+v=j$ for some , . Since $b_{u,v}=c_{v,u}$, the lines of slope 1 coincide for the array and for the transposed array , so and the two erased columns can be recovered.
Assume next that two lines of slope 1 are erased in the array . Take an array defined from the array as $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for . We claim, . Notice first that every line of slope in corresponds to a line of slope 1 in . In effect, take such that . According to Definition 1, a line of slope (vertical) in through is given by the set , which is equal to the set . This last set corresponds to the line of slope 1 in through , also according to Definition 1.
Similarly, it can be shown that each line of slope 0 in corresponds to a line of slope 0 in and that each line of slope 1 in corresponds to a line of slope in . Hence, is in since it has even parity on lines of slope 0, 1 and , so, by Theorem 24, it can recover the two columns corresponding to the two erased lines of slope 1 in . ∎
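The two correspondences left to the reader in the proof of Lemma 34 can be written out directly. The derivation below is a sketch that uses the transformation $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ and the line convention $c_{\langle w-jv\rangle,v}$ of Definition 1; the symbols $u_{0}$ and $w$ are introduced here for illustration only.

```latex
% Line of slope 0 in b (row u_0): it is row <-u_0> of c, again a line of slope 0.
\{ b_{u_0,v} : 0 \leqslant v \leqslant p-1 \}
  = \{ c_{\langle -u_0 \rangle,\langle u_0+v \rangle} : 0 \leqslant v \leqslant p-1 \}
  = \{ c_{\langle -u_0 \rangle,v'} : 0 \leqslant v' \leqslant p-1 \}.

% Line of slope 1 in b through (w,0): the column index <(w-v)+v> = w is constant,
% so the image is column w of c, i.e., a line of slope infinity.
\{ b_{\langle w-v \rangle,v} : 0 \leqslant v \leqslant p-1 \}
  = \{ c_{\langle v-w \rangle,\langle w \rangle} : 0 \leqslant v \leqslant p-1 \}
  = \{ c_{u',w} : 0 \leqslant u' \leqslant p-1 \}.
```

In both cases the index substitution is a bijection modulo $p$, so each line is mapped onto a full line and parity is preserved.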
**Corollary 35.**
*The code is MDS on lines of slope 0, 1 and .*
**Example 36.**
Consider the code . The transpose transformation as described in Lemma 34 gives
[TABLE]
We can see that columns become rows, rows become columns, and the lines of slope 1 are preserved by this transformation, so the transposed arrays are also in and any pair of erased horizontal lines can be recovered.
The second transformation in Lemma 34, i.e., $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$, gives the following correspondence:
[TABLE]
We can see that this transformation maps lines of slope 1 in into lines of slope in , lines of slope 0 in into lines of slope 0 in and lines of slope in into lines of slope 1 in . For example, the line of slope 1 through (in bold) is mapped into the third vertical line. So, and any pair of erased lines of slope 1 can be recovered.
Consider next the case $r=3$.
**Lemma 37.**
*The code can recover any three erased lines of slope , where .*
**Proof:** Assume that , , is an array in , and assume that three rows (lines of slope 0) have been erased. Consider the array , , such that $b_{u,v}=c_{\langle 2v\rangle,u}$. Then lines of slope in are mapped into lines of slope 0 in and lines of slope 0 in are mapped into lines of slope in , as in the case of the transpose transformation. Lines of slope 2 in are mapped into lines of slope 1 in . In effect, consider the line of slope 1 in through entry , where . According to Definition 1, this line corresponds to the set $\{b_{\langle u_{0}-v\rangle,v}\,:\,0\leqslant v\leqslant p-1\}$, which is equal to the set $\{c_{\langle 2v\rangle,\langle u_{0}-v\rangle}\,:\,0\leqslant v\leqslant p-1\}$. Since $\langle 2v\rangle+\langle 2(u_{0}-v)\rangle=\langle 2u_{0}\rangle$, then this last set corresponds to the line of slope 2 in through entry $(\langle 2u_{0}\rangle,0)$.
Similarly, proceeding as in the previous cases, it can be shown that lines of slope 1 in are mapped into lines of slope 2 in . Thus, is in , hence, by Theorem 24, it can correct any three erased columns (which correspond to three erased rows in ).
Next assume that three lines of slope 1 have been erased in . We can consider the same transformation as in Lemma 34, i.e., consider a array such that $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for . As in Lemma 34, every line of slope in corresponds to a line of slope 1 in , each line of slope 0 in corresponds to a line of slope 0 in and each line of slope 1 in corresponds to a line of slope in . In addition, proceeding as in the previous cases, it can be shown that each line of slope 2 in corresponds to a line of slope 2 in . Thus, and any 3 columns in , which correspond to 3 lines of slope 1 in , can be corrected.
Finally, assume that three lines of slope 2 have been erased in . Consider a array such that $b_{u,v}=c_{\langle -2(u+v)\rangle,\langle u+2v\rangle}$ for . Now, it can be shown that every line of slope in corresponds to a line of slope 2 in , every line of slope 0 in corresponds to a line of slope 1 in , every line of slope 1 in corresponds to a line of slope 0 in and every line of slope 2 in corresponds to a line of slope in . Thus, and any 3 columns in , which correspond to 3 lines of slope 2 in , can be corrected. ∎
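The slope correspondences used in the proof of Lemma 37 can be confirmed by brute force on the index sets alone, without building any codeword. The sketch below takes $p=7$ as an illustrative choice and represents slope $\infty$ by the string `"inf"`; the helpers `line_cells`, `slope_of` and `slope_map` are hypothetical names. For each of the three transformations $b_{u,v}=c_{T(u,v)}$ it reports which slope in the original array each line of the transformed array comes from.

```python
def line_cells(p, slope, w):
    """Cells (row, col) of the line with the given slope and index w on a p x p torus.

    slope == "inf" is column w; otherwise the line through (w, 0) with
    entries (<w - slope*v>, v), following Definition 1.
    """
    if slope == "inf":
        return {(u, w) for u in range(p)}
    return {((w - slope * v) % p, v) for v in range(p)}

def slope_of(cells, p):
    """Return the slope of a set of p cells if it is a line, else None."""
    for slope in ["inf"] + list(range(p)):
        for w in range(p):
            if cells == line_cells(p, slope, w):
                return slope
    return None

def slope_map(p, transform, slopes):
    """For b[u][v] = c[transform(u, v)], map each slope in b to the slope in c."""
    result = {}
    for s in slopes:
        images = set()
        for w in range(p):
            image = {transform(u, v) for (u, v) in line_cells(p, s, w)}
            images.add(slope_of(image, p))
        assert len(images) == 1          # every line of slope s lands on one slope
        result[s] = images.pop()
    return result

p = 7
slopes = ["inf", 0, 1, 2]
# Erased rows: b[u][v] = c[<2v>][u]
rows_map = slope_map(p, lambda u, v: ((2 * v) % p, u), slopes)
# Erased lines of slope 1: b[u][v] = c[<-u>][<u+v>]
slope1_map = slope_map(p, lambda u, v: ((-u) % p, (u + v) % p), slopes)
# Erased lines of slope 2: b[u][v] = c[<-2(u+v)>][<u+2v>]
slope2_map = slope_map(p, lambda u, v: ((-2 * (u + v)) % p, (u + 2 * v) % p), slopes)

for name, m in [("rows", rows_map), ("slope 1", slope1_map), ("slope 2", slope2_map)]:
    print(name, m)
```

In each case the columns of the transformed array correspond to the erased lines (slope 0, 1 or 2 respectively), and the set of slopes $\{\infty,0,1,2\}$ is mapped onto itself, which is what the proof needs.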
**Corollary 38.**
*The code is MDS on lines of slope 0, 1, 2 and .*
**Example 39.**
Consider the code . The first transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle 2v\rangle,u}$, gives
[TABLE]
We can see that lines of slope $\infty$ in become lines of slope 0 in , lines of slope 0 in become lines of slope $\infty$ in , lines of slope 1 in become lines of slope 2 in and lines of slope 2 in become lines of slope 1 in , so and any three erased horizontal lines can be recovered. For example, the line of slope 1 starting in (in bold) is mapped into the line of slope 2 in the right array, also in bold.
The second transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$, is the same as the second transformation in Lemma 34 and has been illustrated in Example 36. As in Example 36, this transformation maps lines of slope 1 in into lines of slope in , lines of slope 0 in into lines of slope 0 in and lines of slope in into lines of slope 1 in . In addition, lines of slope 2 in are mapped into lines of slope 2 in , hence, and any three erased lines of slope 1 can be recovered.
Finally, the last transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle -2(u+v)\rangle,\langle u+2v\rangle}$, gives
[TABLE]
We can see that this transformation maps lines of slope 2 in into lines of slope in , lines of slope 0 in into lines of slope 1 in , lines of slope in into lines of slope 2 in and lines of slope 1 in into lines of slope 0 in . For example, the line of slope 2 starting in (in bold) is mapped into the second vertical line.
In particular, and any three erased lines of slope 2 can be recovered.
We have seen that an code is MDS on lines of slope , where and . The next lemma shows that this is also the case for .
**Lemma 40.**
*The code with is MDS on lines of slope , where .*
**Proof:** Consider an array . Take such that and assume that lines of slope have been erased. Consider the array , , such that $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$. Lines of slope in are mapped into lines of slope in . In effect, consider the line of slope in through entry , where . This line corresponds to the set , which is equal to the set $\{c_{\langle -ju+(3j+2)v_{0}\rangle,\langle u+jv_{0}\rangle}\,:\,0\leqslant u\leqslant p-1\}$. Since $\langle -ju+(3j+2)v_{0}\rangle+j\langle u+jv_{0}\rangle=\langle (j^{2}+3j+2)v_{0}\rangle$, then, according to Definition 1, this last set corresponds to the line of slope in through entry $(\langle (j^{2}+3j+2)v_{0}\rangle,0)$. Since $\langle j^{2}+3j+2\rangle=\langle (j+1)(j+2)\rangle$ and , $\langle j^{2}+3j+2\rangle\neq 0$, so to each choice of corresponds a unique line of slope .
Proceeding similarly, we can show that lines of slope in are mapped into lines of slope in and that lines of slope in are mapped into lines of slope in . Thus, a line of slope in , , $i_{0}=\infty$ or , is mapped into a line of slope in , . Since these lines have even parity, is in and it can correct any erased columns, which correspond to erased lines of slope in .
Next let $r=p-1$. Consider the array , , such that $b_{u,v}=c_{\langle -ju+(j+1)v\rangle,\langle u\rangle}$. Proceeding as above, we can verify that lines of slope in are mapped into lines of slope in and that lines of slope in are mapped into lines of slope in . Thus, a line of slope , , will be mapped to a line of slope , . All these lines have even parity, so is in and it can correct any erased columns, which correspond to erased lines of slope in . ∎
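The first transformation in the proof of Lemma 40 can be spot-checked in the same spirit for several small primes. This sketch assumes the case $r=p-2$, so that the lines with even parity are those of slope $\infty,0,1,\ldots,p-3$, and it takes $j$ in the range $0\leqslant j\leqslant p-3$. It verifies two things: that column $v_{0}$ of the transformed array is the line of slope $j$ with index $\langle (j+1)(j+2)v_{0}\rangle$ in the original array, and that the set of parity slopes is mapped onto itself. The function names and the choice of primes are illustrative assumptions.

```python
def line(p, slope, w):
    """Line of the given slope and index w as a frozenset of (row, col) cells."""
    if slope == "inf":
        return frozenset((u, w) for u in range(p))
    return frozenset(((w - slope * v) % p, v) for v in range(p))

def check_lemma40_map(p, j):
    """Check b[u][v] = c[<-j*u + (3j+2)*v>][<u + j*v>] for the case r = p - 2."""
    allowed = ["inf"] + list(range(p - 2))        # slopes carrying even parity
    # Table from every line of the p x p torus to its (slope, index) label.
    lookup = {line(p, s, w): (s, w)
              for s in ["inf"] + list(range(p)) for w in range(p)}

    def transform(u, v):
        return ((-j * u + (3 * j + 2) * v) % p, (u + j * v) % p)

    image_slopes = set()
    for s in allowed:
        for w in range(p):
            img = frozenset(transform(u, v) for (u, v) in line(p, s, w))
            c_slope, c_index = lookup[img]        # KeyError if img is not a line
            image_slopes.add(c_slope)
            if s == "inf":
                # Column w of b is the slope-j line of c with index (j+1)(j+2)w.
                assert (c_slope, c_index) == (j, ((j + 1) * (j + 2) * w) % p)
    return image_slopes == set(allowed)

results = {(p, j): check_lemma40_map(p, j)
           for p in (5, 7, 11, 13) for j in range(p - 2)}
print(all(results.values()))
```

The multiplier $(j+1)(j+2)$ is nonzero modulo $p$ for every $j$ in this range, which is why distinct columns of the transformed array correspond to distinct lines of slope $j$.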
**Example 41.**
As in Example 39, take $p=7$ and consider the code . If $j=0$, the transformation in Lemma 40, i.e., $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$, becomes $b_{u,v}=c_{\langle 2v\rangle,u}$, giving the first transformation illustrated in Example 39. We have seen in this example that the resulting array is in and hence any three rows can be corrected. In addition, we observe that lines of slope 3 in are transformed into lines of slope 3 in , and that lines of slope 4 in are transformed into lines of slope 4 in . Hence, if is in also is in and any 5 erased rows can be recovered.
Let us take next $j=3$, then the transformation $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$ becomes $b_{u,v}=c_{\langle 4u+4v\rangle,\langle u+3v\rangle}$, giving
[TABLE]
We can see that this transformation maps lines of slope 3 in into lines of slope in , lines of slope in into lines of slope 3 in , lines of slope 0 in into lines of slope 1 in , lines of slope 1 in into lines of slope 0 in , lines of slope 2 in into lines of slope 4 in and lines of slope 4 in into lines of slope 2 in . In particular, and any 5 erased columns, which correspond to 5 lines of slope 3 in , can be recovered.
Similar transformations can be obtained for $j=1$, 2 and 4.
Let us unify Theorem 24, Corollaries 35 and 38 and Lemma 40 in the following theorem:
**Theorem 42.**
*The code with or is MDS on lines of slope , where $j=\infty$ or .* ∎
Theorem 42 gives five values of for which the code is MDS on lines of slope , where $j=\infty$ or . For we do not think that this is the case, but the problem is open.
V Puncturing EBR and EIP Codes
A puncturing of a code of length in specified locations is a code of length such that the specified entries in each codeword of the original code are deleted [18]. For codes we will puncture some rows as follows:
**Definition 43.**
*Consider an code (resp., an code) and assume that has degree . A punctured EBR (resp. EIP) code (resp. ) consists of all (resp. ) arrays obtained by deleting the last rows of each array in (resp., in ).* ∎
The next lemma is immediate.
**Lemma 44.**
*The code as given by Definition 43 is MDS (on columns), while the code is MDS if and only if the code is MDS.*
**Proof:** Simply observe that a column in (resp., in ) is a zero column if and only if the corresponding column in is a zero column (resp. in ). In effect, using the notation of Definition 43, a vector in ${\cal C}(p,g(x)(1\oplus x),q,d)$ is the zero vector if and only if the first entries of the vector are zero, since encoding systematically such zero entries in ${\cal C}(p,g(x)(1\oplus x),q,d)$ gives the zero vector. ∎
Notice that in a or in a code the vertical erasure-correction of a code is lost. Let us examine next the simplest case of puncturing, that is, when $g(x)=1$.
**Example 45.**
Consider a code, which, according to Definition 43, consists of all the arrays obtained by taking the first rows of each array in .
A code can be compared with a regular code: they both consist of arrays and can recover up to erased columns. For example, the array in the left below is in , while the one in the right is in . The last row is not written. Notice that both arrays share the same data array.
[TABLE]
However, the two codes and are not equivalent. The code can correct up to lines of slope for if or . For example, consider the array in the left above in . The two lines of slope 1 colored in red and in blue can be recovered. In effect, assuming that the array is in with the last row corresponding to erasures, applying the transformation $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for of Lemma 34, the lines of slope 1 become columns and the resulting array is in . This transformed array has two erased columns, while the remaining columns have one erasure each and so it is recoverable by Theorem 24. Once such array is recovered, by applying the inverse transformation to it, the original array in is obtained, and by deleting the last row, so is the array in . Codes on the other hand cannot recover two or more erased lines that are not vertical, as shown in Example 20.
The construction of a code is also given in [16]. In Theorem 5 of this reference, it is proven that the code is MDS when 2 is primitive in , and using the Erdős-Heilbronn conjecture. However, this result is a special case of the combination of Corollary 19, Lemma 44 and Theorem 2.6 in [4].
**Example 46.**
Consider as in Example 7. Then, consists of all the arrays obtained by taking the first three rows of each array in .
Taking the first 3 rows of the two arrays in of Example 7, we obtain:
[TABLE]
Any three columns in the code can be recovered, so, in particular, this code is an MDS code over vectors of length 3 (the columns), like an RS code over . In fact, it can be shown that this code is equivalent to an RS code over . In effect, if we permute the last two rows of the arrays above, we obtain
[TABLE]
Assuming that each column is a symbol in the finite field with a zero of the primitive polynomial , the first array above corresponds to the polynomial in $f_{0}(X)=\beta^{6}\oplus\beta^{4}X\oplus\beta^{5}X^{2}\oplus X^{4}\oplus\beta X^{5}\oplus\beta^{5}X^{6}$, while the second one corresponds to $f_{1}(X)=\beta^{4}X^{3}\oplus\beta^{2}X^{4}\oplus\beta^{3}X^{5}\oplus X^{6}$. We can verify that $f_{i}(\beta^{-j})=0$ for and , i.e., both codewords are in the RS code with generator polynomial . It can also be verified more generally that with rows 1 and 2 permuted corresponds to an RS code over defined by the primitive polynomial with generator polynomial $g(x)=(1\oplus x)\prod_{i=1}^{r-1}(\beta^{-i}\oplus x)$. In the example above, $r=3$.
On the other hand, if is a zero of the primitive polynomial , then, applying the permutation to the three rows of the arrays in corresponds to an RS code over with generator polynomial $g(x)=(1\oplus x)\prod_{i=1}^{r-1}(\beta^{i}\oplus x)$. The two original arrays with this permutation are
[TABLE]
**Example 47.**
Generalizing Example 46, consider a Mersenne prime $p=2^{b}-1$, where is also prime and . The first four such Mersenne primes are $7=2^{3}-1$, $31=2^{5}-1$, $127=2^{7}-1$ and $8191=2^{13}-1$.
Consider an RS code over , let be a primitive element in and let be a cyclotomic polynomial [18] such that $h(\beta)=0$. Let $g(x)=(1\oplus x^{p})/(h(x)(1\oplus x))$. Then, the code is an MDS code on columns consisting of arrays. We believe that a permutation of the rows of such arrays gives a code that is equivalent to an RS code over as we showed for the case $p=7$ in Example 46, but we cannot find a proof.
VI Conclusions
We have expanded codes like Blaum-Roth codes and generalized EVENODD codes to array codes such that each column has a certain erasure-correcting capability. We have shown the connection of the new codes to traditional array codes. We have presented efficient encoding, decoding and updating algorithms. We have observed that the new codes can recover erased lines of different slopes. We have also shown a method for puncturing the codes such that the resulting arrays constitute an MDS code.
- [1] M. Blaum, “A family of MDS array codes with minimal number of encoding operations,” IEEE International Symposium on Information Theory (ISIT’06), Seattle, USA, pp. 2784-88, July 2006.
- [2] M. Blaum, J. Brady, J. Bruck, and J. Menon, “EVENODD: an efficient scheme for tolerating double disk failures in RAID architectures,” IEEE Trans. on Computers, vol. C-44, pp. 192-202, February 1995.
- [3] M. Blaum, J. Brady, J. Bruck, J. Menon, and A. Vardy, “The EVENODD code and its generalization,” in “High Performance Mass Storage and Parallel I/O: Technologies and Applications,” edited by H. Jin, T. Cortes, and R. Buyya, IEEE & Wiley Press, New York, Chapter 14, pp. 187-208, 2001.
- [4] M. Blaum, J. Bruck, and A. Vardy, “MDS array codes with independent parity symbols,” IEEE Trans. on Information Theory, vol. IT-42, pp. 529-42, March 1996.
- [5] M. Blaum, V. Deenadhayalan, and S. R. Hetzler, “Expanded Blaum-Roth codes with efficient encoding and decoding algorithms,” IEEE Communications Letters, vol. 23, no. 6, pp. 954-7, June 2019.
- [6] M. Blaum and R. M. Roth, “New array codes for multiple phased burst correction,” IEEE Trans. on Information Theory, vol. IT-39, pp. 66-77, January 1993.
- [7] M. Blaum and R. M. Roth, “On lowest-density MDS codes,” IEEE Trans. on Information Theory, vol. IT-45, pp. 46-59, January 1999.
- [8] P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar, “Row-diagonal parity for double disk failure correction,” Proc. 3rd Conf. File and Storage Technologies - FAST’04, San Francisco, CA, March/April 2004.
