Array Codes with Local Properties

Mario Blaum; Steven R. Hetzler

arXiv:1906.11731·cs.IT·July 25, 2019

TL;DR

This paper introduces enhanced array codes with added column parity, enabling local symbol recovery, which improves data repair efficiency in RAID-like storage systems.

Contribution

The paper extends traditional array codes by incorporating column parity, facilitating local recovery and advancing the design of locally recoverable codes.

Findings

1. Added column parity enables local symbol recovery.
2. Enhanced codes maintain array code properties with improved repair.
3. Applications to Locally Recoverable (LRC) codes are discussed.

Abstract

In general, array codes consist of $m \times n$ arrays, and in many cases the arrays satisfy parity constraints along lines of different slopes (generally with a toroidal topology). Such codes are useful for RAID-type architectures, since they allow finite field operations to be replaced by XORs. We present expansions of traditional array codes of this type, like Blaum-Roth (BR) and extended EVENODD codes, obtained by adding parity on columns. This vertical parity allows for recovery of one or more symbols in a column locally, i.e., by using the remaining symbols in the column without invoking the rest of the array. Properties and applications of the new codes are discussed, in particular to Locally Recoverable (LRC) codes.

Tables

Table 1. Comparison between the number of XORs of the optimized encoding algorithm for $EBR(p,2,q,1)$ in [5] and the Encoding Algorithm for $EIP(p,2,q,1)$ with $k$ data columns, $1\leqslant k\leqslant p-2$.

$p$	$k$	Encoding Algorithm from [5]: $3kp-(k+2)$	Encoding Algorithm for $EIP(p,2,q,1)$: $3kp-2(k+p)$	Improvement %
17	8	398	358	10.1%
17	15	748	701	6.3%
127	8	3038	2778	8.6%
127	50	18998	18696	1.6%
127	125	47498	47121	0.8%
257	8	6158	5638	8.4%
257	50	38498	37936	1.5%
257	255	196348	195581	0.4%
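The entries of Table 1 follow directly from the two closed-form XOR counts in its header. The sketch below recomputes them; the function names are illustrative and not taken from the paper.

```python
def xors_ebr(p: int, k: int) -> int:
    """XOR count listed for the encoding algorithm from [5]: 3kp - (k + 2)."""
    return 3 * k * p - (k + 2)


def xors_eip(p: int, k: int) -> int:
    """XOR count listed for the EIP(p,2,q,1) encoding algorithm: 3kp - 2(k + p)."""
    return 3 * k * p - 2 * (k + p)


def improvement_percent(p: int, k: int) -> float:
    """Relative saving of the second count over the first, in percent."""
    old, new = xors_ebr(p, k), xors_eip(p, k)
    return 100.0 * (old - new) / old


for p, k in [(17, 8), (17, 15), (127, 8), (127, 50), (127, 125),
             (257, 8), (257, 50), (257, 255)]:
    print(p, k, xors_ebr(p, k), xors_eip(p, k), f"{improvement_percent(p, k):.1f}%")
```

The absolute saving is $k+2p-2$ XORs, so the relative improvement shrinks as $k$ grows for a fixed $p$, which matches the trend in the table.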

Equations

\begin{array}[]{ccc}\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&0&0&0&1\\ \hline\cr{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}\\ \hline\cr 0&1&0&0&1\\ \hline\cr 0&1&0&0&1\\ \hline\cr\hline\cr 0&0&0&0&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&{\color[rgb]{1,0,0}0}&0&0&1\\ \hline\cr{\color[rgb]{1,0,0}1}&1&1&0&1\\ \hline\cr 0&1&0&0&{\color[rgb]{1,0,0}1}\\ \hline\cr 0&1&0&{\color[rgb]{1,0,0}0}&1\\ \hline\cr\hline\cr 0&0&{\color[rgb]{1,0,0}0}&0&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&0&0&{\color[rgb]{1,0,0}0}&1\\ \hline\cr{\color[rgb]{1,0,0}1}&1&1&0&1\\ \hline\cr 0&1&{\color[rgb]{1,0,0}0}&0&1\\ \hline\cr 0&1&0&0&{\color[rgb]{1,0,0}1}\\ \hline\cr\hline\cr 0&{\color[rgb]{1,0,0}0}&0&0&0\\ \hline\cr\end{array}\end{array}

H_{p, r}

\begin{array}[]{cccc}\begin{array}[]{|c|c|c|c|c|}\hline\cr{\color[rgb]{1,0,0}1}&0&0&1&0\\ \hline\cr{\color[rgb]{1,0,0}1}&1&1&0&1\\ \hline\cr{\color[rgb]{1,0,0}0}&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&1&1&1&1\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&0&0&1&0\\ \hline\cr{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}\\ \hline\cr 0&1&1&0&0\\ \hline\cr 0&1&1&0&0\\ \hline\cr 0&1&1&1&1\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&{\color[rgb]{1,0,0}0}&0&1&0\\ \hline\cr{\color[rgb]{1,0,0}1}&1&1&0&1\\ \hline\cr 0&1&1&0&{\color[rgb]{1,0,0}0}\\ \hline\cr 0&1&1&{\color[rgb]{1,0,0}0}&0\\ \hline\cr 0&1&{\color[rgb]{1,0,0}1}&1&1\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&0&0&{\color[rgb]{1,0,0}1}&0\\ \hline\cr{\color[rgb]{1,0,0}1}&1&1&0&1\\ \hline\cr 0&1&{\color[rgb]{1,0,0}1}&0&0\\ \hline\cr 0&1&1&0&{\color[rgb]{1,0,0}0}\\ \hline\cr 0&{\color[rgb]{1,0,0}1}&1&1&1\\ \hline\cr\end{array}\end{array}

\begin{array}[]{cc}\begin{array}[]{|c|c|c|c||c|c|c|}\hline\cr 1&0&1&0&1&0&1\\ \hline\cr 1&1&1&0&0&0&1\\ \hline\cr 0&1&1&0&0&1&1\\ \hline\cr\hline\cr 0&1&0&0&1&0&0\\ \hline\cr 1&0&0&0&0&1&0\\ \hline\cr 0&0&1&0&1&1&1\\ \hline\cr 1&1&0&0&1&1&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c||c|c|c|}\hline\cr 0&0&0&0&0&1&1\\ \hline\cr 0&0&0&1&1&0&0\\ \hline\cr 0&0&0&1&0&1&0\\ \hline\cr\hline\cr 0&0&0&1&1&1&1\\ \hline\cr 0&0&0&0&1&1&0\\ \hline\cr 0&0&0&0&1&0&1\\ \hline\cr 0&0&0&1&0&0&1\\ \hline\cr\end{array}\end{array}

\begin{array}[]{|c|c|c|c||c|c|c|}\hline\cr 0&\beta^{4}&1&0&1&\beta&\beta^{2}\\ \hline\cr\beta^{5}&\beta^{6}&\beta^{5}&\beta^{4}&\beta&\beta^{6}&\beta^{2}\\ \hline\cr\beta^{5}&\beta^{5}&\beta^{4}&0&\beta&\beta^{3}&\beta^{5}\\ \hline\cr\beta^{2}&\beta^{4}&0&\beta^{2}&0&\beta^{4}&0\\ \hline\cr\beta^{2}&\beta^{4}&\beta&\beta^{3}&\beta^{4}&1&\beta^{2}\\ \hline\cr\hline\cr\beta^{3}&\beta^{4}&\beta^{4}&\beta&1&\beta^{4}&\beta^{4}\\ \hline\cr\beta^{3}&\beta&\beta^{2}&\beta^{3}&\beta^{4}&\beta^{6}&\beta^{6}\\ \hline\cr\end{array}

\displaystyle C\;\mbox{$\,=\,$}\;(\mbox{$\underline{c}$}_{0},\mbox{$\underline{c}$}_{1},\ldots,\mbox{$\underline{c}$}_{p-1})

\displaystyle\begin{array}[]{|c|c|c|c|c|}\hline\cr 1&0&0&1&0\\ \hline\cr 1&1&1&0&1\\ \hline\cr 0&1&1&0&0\\ \hline\cr 0&1&1&0&0\\ \hline\cr 0&1&1&1&1\\ \hline\cr\end{array}

\{(u,v)\,:\,u+sv\mbox{$\,=\,$}i\}\cup\{(i,p+s)\}

\begin{array}[]{ccc}\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&0&0&1&1&1&1&1\\ \hline\cr{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&1&0\\ \hline\cr 0&0&0&0&1&1&1&1\\ \hline\cr 1&1&0&1&1&0&1&1\\ \hline\cr\hline\cr 0&0&0&0&0&0&0&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&{\color[rgb]{1,0,0}0}&0&1&1&1&1&1\\ \hline\cr{\color[rgb]{1,0,0}0}&1&0&1&1&1&{\color[rgb]{1,0,0}1}&0\\ \hline\cr 0&0&0&0&{\color[rgb]{1,0,0}1}&1&1&1\\ \hline\cr 1&1&0&{\color[rgb]{1,0,0}1}&1&0&1&1\\ \hline\cr\hline\cr 0&0&{\color[rgb]{1,0,0}0}&0&0&0&0&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&0&0&{\color[rgb]{1,0,0}1}&1&1&1&1\\ \hline\cr{\color[rgb]{1,0,0}0}&1&0&1&1&1&1&{\color[rgb]{1,0,0}0}\\ \hline\cr 0&0&{\color[rgb]{1,0,0}0}&0&1&1&1&1\\ \hline\cr 1&1&0&1&{\color[rgb]{1,0,0}1}&0&1&1\\ \hline\cr\hline\cr 0&{\color[rgb]{1,0,0}0}&0&0&0&0&0&0\\ \hline\cr\end{array}\end{array}

\tilde{H}_{p, r}

\begin{array}[]{cccc}\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr{\color[rgb]{1,0,0}1}&0&0&1&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&1&0&1&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&0&0&0&1&1&1&1\\ \hline\cr{\color[rgb]{1,0,0}1}&1&0&1&1&0&0&1\\ \hline\cr\hline\cr{\color[rgb]{1,0,0}0}&0&0&1&0&1&1&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&0&0&1&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}0}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&{\color[rgb]{1,0,0}1}&0&0\\ \hline\cr 0&0&0&0&1&1&1&1\\ \hline\cr 1&1&0&1&1&0&0&1\\ \hline\cr\hline\cr 0&0&0&1&0&1&1&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&{\color[rgb]{1,0,0}0}&0&1&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&1&0&1&1&1&{\color[rgb]{1,0,0}0}&0\\ \hline\cr 0&0&0&0&{\color[rgb]{1,0,0}1}&1&1&1\\ \hline\cr 1&1&0&{\color[rgb]{1,0,0}1}&1&0&0&1\\ \hline\cr\hline\cr 0&0&{\color[rgb]{1,0,0}0}&1&0&1&1&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&0&0&{\color[rgb]{1,0,0}1}&1&1&0&0\\ \hline\cr{\color[rgb]{1,0,0}0}&1&0&1&1&1&0&{\color[rgb]{1,0,0}0}\\ \hline\cr 0&0&{\color[rgb]{1,0,0}0}&0&1&1&1&1\\ \hline\cr 1&1&0&1&{\color[rgb]{1,0,0}1}&0&0&1\\ \hline\cr\hline\cr 0&{\color[rgb]{1,0,0}0}&0&1&0&1&1&0\\ \hline\cr\end{array}\end{array}

\displaystyle C\;\mbox{$\,=\,$}\;(\mbox{$\underline{c}$}_{0},\mbox{$\underline{c}$}_{1},\ldots,\mbox{$\underline{c}$}_{p+r-1})

\displaystyle\begin{array}[]{|c|c|c|c|c||c|c|c|}\hline\cr 1&0&0&1&1&1&0&0\\ \hline\cr 0&1&0&1&1&1&0&0\\ \hline\cr 0&0&0&0&1&1&1&1\\ \hline\cr 1&1&0&1&1&0&0&1\\ \hline\cr 0&0&0&1&0&1&1&0\\ \hline\cr\end{array}

\begin{array}[]{|c|c|c|c|c|}\hline\cr 0&{\color[rgb]{0,0,1}E}&0&0&{\color[rgb]{1,0,0}E}\\ \hline\cr{\color[rgb]{0,0,1}E}&0&0&{\color[rgb]{1,0,0}E}&0\\ \hline\cr 0&0&{\color[rgb]{1,0,0}E}&0&{\color[rgb]{0,0,1}E}\\ \hline\cr 0&{\color[rgb]{1,0,0}E}&0&{\color[rgb]{0,0,1}E}&0\\ \hline\cr{\color[rgb]{1,0,0}E}&0&{\color[rgb]{0,0,1}E}&0&0\\ \hline\cr\end{array}

\begin{array}[]{cc}\begin{array}[]{|c|c|c|c|c|}\hline\cr 0&{\color[rgb]{0,0,1}0}&0&0&{\color[rgb]{1,0,0}0}\\ \hline\cr{\color[rgb]{0,0,1}0}&0&0&{\color[rgb]{1,0,0}0}&0\\ \hline\cr 0&0&{\color[rgb]{1,0,0}0}&0&{\color[rgb]{0,0,1}0}\\ \hline\cr 0&{\color[rgb]{1,0,0}0}&0&{\color[rgb]{0,0,1}0}&0\\ \hline\cr{\color[rgb]{1,0,0}0}&0&{\color[rgb]{0,0,1}0}&0&0\\ \hline\cr\end{array}&\begin{array}[]{|c|c|c|c|c|}\hline\cr 0&{\color[rgb]{0,0,1}1}&0&0&{\color[rgb]{1,0,0}1}\\ \hline\cr{\color[rgb]{0,0,1}1}&0&0&{\color[rgb]{1,0,0}1}&0\\ \hline\cr 0&0&{\color[rgb]{1,0,0}1}&0&{\color[rgb]{0,0,1}1}\\ \hline\cr 0&{\color[rgb]{1,0,0}1}&0&{\color[rgb]{0,0,1}1}&0\\ \hline\cr{\color[rgb]{1,0,0}1}&0&{\color[rgb]{0,0,1}1}&0&0\\ \hline\cr\end{array}\end{array}

x^{p} \oplus 1

z_{0}

z_{\langle ij\rangle}

\bigoplus_{i=1}^{p-1} z_{\langle ij\rangle} \;=\; \bigoplus_{i=1}^{p-1} z_{i}

\bigoplus_{i=1}^{p-1}\bigoplus_{u=1}^{i} v_{\langle uj\rangle}

z_{0}

z_{3}

z_{6}

z_{2}

z_{5}

z_{1}

z_{4}

G(x)

G(\alpha^{i_{0}})

G(\alpha^{i_{j}})

S_{j}

\bigoplus_{j=0}^{\rho-1} g_{j} S_{j}

Full text

Array Codes with Local Properties

Mario Blaum and Steven R. Hetzler

IBM Research Division

Almaden Research Center

San Jose, CA 95120, USA

Abstract

In general, array codes consist of $m\times n$ arrays, and in many cases the arrays satisfy parity constraints along lines of different slopes (generally with a toroidal topology). Such codes are useful for RAID-type architectures, since they allow finite field operations to be replaced by XORs. We present expansions of traditional array codes of this type, like Blaum-Roth (BR) and extended EVENODD codes, obtained by adding parity on columns. This vertical parity allows for recovery of one or more symbols in a column locally, i.e., by using the remaining symbols in the column without invoking the rest of the array. Properties and applications of the new codes are discussed, in particular to Locally Recoverable (LRC) codes.

Keywords: Erasure-correcting codes, product codes, Blaum-Roth (BR) codes, Reed-Solomon (RS) codes, EVENODD code, MDS codes, local and global parities, locally recoverable (LRC) codes.

I Introduction

Throughout this paper $p$ will denote a prime number and $q$ a power of 2, i.e., $q=2^{b}$. We will consider finite fields $GF(q)$. The results can be extended to other prime powers, but for simplicity, we consider finite fields of characteristic 2 only.

Given $p$ and an integer $m$, let $\langle m\rangle_{p}$ be the unique number $j$, $0\leqslant j\leqslant p-1$, such that $j\equiv m\;(\bmod\;p)$. Unless there is confusion, we denote $\langle m\rangle_{p}$ as $\langle m\rangle$. We will consider $p\times p$ arrays with entries $c_{u,v}\in GF(q)$, $0\leqslant u,v\leqslant p-1$. We next formally define a line in an array.

Definition 1.

Given a $p\times p$ array with entries $c_{u,v}\in GF(q)$, $0\leqslant u,v\leqslant p-1$, a line of slope $i$, $0\leqslant i\leqslant p-1$, through entry $c_{u_{0},0}$, $0\leqslant u_{0}\leqslant p-1$, is the set of $p$ entries $\{c_{\langle u_{0}-iv\rangle,v}\,:\,0\leqslant v\leqslant p-1\}$. A line of slope $\infty$ (vertical line) through entry $(0,v_{0})$, $0\leqslant v_{0}\leqslant p-1$, is the set of $p$ entries $\{c_{u,v_{0}}\,:\,0\leqslant u\leqslant p-1\}$. ∎
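As a small illustration of Definition 1, the following sketch lists the positions $(u,v)$ that make up a line. The function names are illustrative only.

```python
def line(p: int, slope: int, u0: int) -> list[tuple[int, int]]:
    """Positions (u, v) of the line of the given slope through entry (u0, 0)."""
    # Python's % returns a value in 0..p-1 for p > 0, which plays the role of <m>_p.
    return [((u0 - slope * v) % p, v) for v in range(p)]


def vertical_line(p: int, v0: int) -> list[tuple[int, int]]:
    """Positions of the line of slope infinity, i.e., column v0."""
    return [(u, v0) for u in range(p)]


# Lines of slope 1 and slope 2 through c_{1,0} in a 5 x 5 array.
print(line(5, 1, 1))
print(line(5, 2, 1))
```

For a finite slope, each line contains exactly one entry from every column, a fact used repeatedly in the parity arguments of this section.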

Next we give the definition of Blaum-Roth (BR) codes [6]:

Definition 2.

A BR code with $r$ parity columns, denoted $BR(p,r,q)$ , consists of all possible $(p-1)\times p$ arrays over $GF(q)$ such that, when a zero row is appended to an array in the code, each line of slope $i$ for $0\leqslant i\leqslant r-1$ as given by Definition 1 has even parity. ∎

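Example 3.

The first 4 rows of the following $5\times 5$ array are in $BR(5,3,2)$:

$$\begin{array}{ccc}
\begin{array}{|c|c|c|c|c|}\hline 1&0&0&0&1\\ \hline {\color{red}1}&{\color{red}1}&{\color{red}1}&{\color{red}0}&{\color{red}1}\\ \hline 0&1&0&0&1\\ \hline 0&1&0&0&1\\ \hline\hline 0&0&0&0&0\\ \hline\end{array}
&
\begin{array}{|c|c|c|c|c|}\hline 1&{\color{red}0}&0&0&1\\ \hline {\color{red}1}&1&1&0&1\\ \hline 0&1&0&0&{\color{red}1}\\ \hline 0&1&0&{\color{red}0}&1\\ \hline\hline 0&0&{\color{red}0}&0&0\\ \hline\end{array}
&
\begin{array}{|c|c|c|c|c|}\hline 1&0&0&{\color{red}0}&1\\ \hline {\color{red}1}&1&1&0&1\\ \hline 0&1&{\color{red}0}&0&1\\ \hline 0&1&0&0&{\color{red}1}\\ \hline\hline 0&{\color{red}0}&0&0&0\\ \hline\end{array}
\end{array}$$

We can see that every horizontal line (i.e., line of slope 0), every line of slope 1 and every line of slope 2 has even parity (the three copies of the array mark in red the lines of slope 0, 1 and 2 through entry $c_{1,0}$, respectively). $\Box$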
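Definition 2 can be checked mechanically in the binary case. The sketch below is a brute-force membership test written for this purpose, not an algorithm from the paper; `EXAMPLE_3` holds the first 4 rows of the array of Example 3 as it appears in the equation list of this page.

```python
def in_br(rows: list[list[int]], p: int, r: int) -> bool:
    """True if the (p-1) x p binary array, with a zero row appended,
    has even parity on every line of slope 0..r-1 (Definition 2, q = 2)."""
    full = [list(row) for row in rows] + [[0] * p]
    for slope in range(r):
        for u0 in range(p):
            parity = 0
            for v in range(p):
                parity ^= full[(u0 - slope * v) % p][v]
            if parity:
                return False
    return True


EXAMPLE_3 = [
    [1, 0, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
]

print(in_br(EXAMPLE_3, 5, 3))
```

Flipping any single bit of the array breaks the horizontal parity of its row, so the check then fails.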

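The following algebraic definition [6] of a $BR(p,r,q)$ code is equivalent to Definition 2. This definition is convenient for decoding purposes.

Definition 4.

A $BR(p,r,q)$ code is the code over the ring of polynomials modulo $M_{p,q}(x)=1\oplus x\oplus x^{2}\oplus\cdots\oplus x^{p-1}$ with coefficients in $GF(q)$ whose parity-check matrix is given by the $r\times p$ Reed-Solomon type of matrix

$$H_{p,r}=\left(\begin{array}{ccccc}
1&1&1&\ldots&1\\
1&\alpha&\alpha^{2}&\ldots&\alpha^{p-1}\\
1&\alpha^{2}&\alpha^{4}&\ldots&\alpha^{2(p-1)}\\
\vdots&\vdots&\vdots&\ddots&\vdots\\
1&\alpha^{r-1}&\alpha^{2(r-1)}&\ldots&\alpha^{(r-1)(p-1)}
\end{array}\right)\qquad(6)$$

where $M_{p,q}(\alpha)=0$ and $\alpha\neq 1$. ∎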

From Definition 4, each codeword in $BR(p,r,q)$ can be considered as a $(p-1)\times p$ array such that each column represents an element in the ring of polynomials modulo $M_{p,q}(x)$. It can be verified that such an array satisfies Definition 2 [6]. At this point the field $GF(q)$ is not significant. In fact, since $q=2^{b}$, Definition 4 uses $b$ codes $BR(p,r,2)$ in parallel, so studying $BR(p,r,q)$ codes is equivalent to studying $BR(p,r,2)$ codes. This will change with the expansion of $BR(p,r,q)$ codes to be presented next, which incorporates vertical parities in each column of the array.

Definition 5.

Let $g(x)$ be a polynomial with coefficients in $GF(q)$ such that $g(x)$ divides $1\oplus x^{p}$ and $\gcd(1\oplus x,g(x))=1$. Denote by ${\cal C}(p,g(x)(1\oplus x),q,d)$ the cyclic code of length $p$ over $GF(q)$ whose generator polynomial is $g(x)(1\oplus x)$ and whose minimum distance is $d$. Let ${\cal R}_{p}(q)$ be the ring of polynomials modulo $1\oplus x^{p}$ with coefficients in $GF(q)$, and let $\alpha\in{\cal R}_{p}(q)$ be such that $\alpha^{p}=1$ and $\alpha\neq 1$. Then an Expanded Blaum-Roth code $EBR(p,r,q,g(x))$ is the $[p,p-r]$ code with coefficients in ${\cal C}(p,g(x)(1\oplus x),q,d)$ whose parity-check matrix is given by (6). ∎

Notice that ${\cal C}(p,g(x)(1\oplus x),q,d)$ is an ideal in ${\cal R}_{p}(q)$. The definition of an $EBR(p,r,q,g(x))$ code looks similar to the one of a $BR(p,r,q)$ code. They both share the parity-check matrix (6), but while the entries of a codeword in a $BR(p,r,q)$ code are in the ring of polynomials modulo $M_{p,q}(x)$, the entries of a codeword in an $EBR(p,r,q,g(x))$ code are in the ideal ${\cal C}(p,g(x)(1\oplus x),q,d)$. Thus, a codeword in an $EBR(p,r,q,g(x))$ code can be represented by a $p\times p$ array, where each column is in the cyclic code ${\cal C}(p,g(x)(1\oplus x),q,d)$. We also observe that multiplication by $\alpha^{i}$ represents a rotation by $i$ positions. Hence, multiplying by $\alpha^{i}$ does not involve finite field arithmetic or XOR operations. Let us point out that codes over the ring ${\cal R}_{m}(q)$, $m$ an integer (not necessarily prime), have been used in the literature, for example, in [14] to provide a unification between EVENODD and RDP codes, in [15] in the context of Regenerating Codes, and in [16] to give efficient encoding and decoding algorithms for a family of MDS array codes.
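The rotation property can be seen in a few lines of code. The sketch below works in the binary ring ${\cal R}_{p}(2)$ and assumes that $\alpha$ is the class of $x$, which is a natural choice but is not stated explicitly in the text above; the function names are illustrative.

```python
def mul_mod_xp_plus_1(a: list[int], b: list[int]) -> list[int]:
    """Product of two binary polynomials (coefficient lists of length p,
    lowest degree first) modulo 1 + x^p."""
    p = len(a)
    out = [0] * p
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[(i + j) % p] ^= bj
    return out


def rotate(c: list[int], i: int) -> list[int]:
    """Cyclic rotation: the entry at position u moves to position (u + i) mod p."""
    p = len(c)
    return [c[(u - i) % p] for u in range(p)]


p = 5
column = [1, 1, 0, 1, 0]                                # 1 + x + x^3
alpha_2 = [1 if u == 2 else 0 for u in range(p)]        # x^2, i.e. alpha^2 when alpha = x
print(mul_mod_xp_plus_1(alpha_2, column), rotate(column, 2))
```

Both calls return the same vector, so in an implementation the product by $\alpha^{i}$ reduces to an index shift.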

From Definition 5 we obtain a geometrical description of the codes. As stated above, the codewords in an $EBR(p,r,q,g(x))$ code can be represented as $p\times p$ arrays over $GF(q)$ such that each column (i.e., line of slope $\infty$) is in the code ${\cal C}(p,g(x)(1\oplus x),q,d)$, and in addition, any line of slope $j$ for $0\leqslant j\leqslant r-1$ has even parity (recall that the arrays in a $BR(p,r,q)$ code are $(p-1)\times p$ arrays).

An important special case is given by code $EBR(p,r,2,1)$, as described in [5], i.e., $g(x)=1$ and each column in an array has even parity, as illustrated in the next example.

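Example 6.

The following is an array in $EBR(5,3,2,1)$:

$$\begin{array}{cccc}
\begin{array}{|c|c|c|c|c|}\hline {\color{red}1}&0&0&1&0\\ \hline {\color{red}1}&1&1&0&1\\ \hline {\color{red}0}&1&1&0&0\\ \hline {\color{red}0}&1&1&0&0\\ \hline {\color{red}0}&1&1&1&1\\ \hline\end{array}
&
\begin{array}{|c|c|c|c|c|}\hline 1&0&0&1&0\\ \hline {\color{red}1}&{\color{red}1}&{\color{red}1}&{\color{red}0}&{\color{red}1}\\ \hline 0&1&1&0&0\\ \hline 0&1&1&0&0\\ \hline 0&1&1&1&1\\ \hline\end{array}
&
\begin{array}{|c|c|c|c|c|}\hline 1&{\color{red}0}&0&1&0\\ \hline {\color{red}1}&1&1&0&1\\ \hline 0&1&1&0&{\color{red}0}\\ \hline 0&1&1&{\color{red}0}&0\\ \hline 0&1&{\color{red}1}&1&1\\ \hline\end{array}
&
\begin{array}{|c|c|c|c|c|}\hline 1&0&0&{\color{red}1}&0\\ \hline {\color{red}1}&1&1&0&1\\ \hline 0&1&{\color{red}1}&0&0\\ \hline 0&1&1&0&{\color{red}0}\\ \hline 0&{\color{red}1}&1&1&1\\ \hline\end{array}
\end{array}$$

We can see that each column has even parity (i.e., each column is in ${\cal C}(p,1\oplus x,2,2)$), as well as each line of slope 0, 1 and 2 (illustrated in red for the column and the lines through entry $c_{1,0}$). $\Box$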

Example 7.

The code $EBR(7,3,2,1\oplus x\oplus x^{3})$ consists of all the $7\times 7$ arrays having even parity on lines of slopes 0, 1 and 2 and whose columns are in the code ${\cal C}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$, which is the subcode of the $[7,4,3]$ cyclic Hamming code generated by $1\oplus x\oplus x^{3}$ whose codewords have even weight [18].

The following are two arrays in $EBR(7,3,2,1\oplus x\oplus x^{3})$:

$$\begin{array}{cc}
\begin{array}{|c|c|c|c||c|c|c|}\hline 1&0&1&0&1&0&1\\ \hline 1&1&1&0&0&0&1\\ \hline 0&1&1&0&0&1&1\\ \hline\hline 0&1&0&0&1&0&0\\ \hline 1&0&0&0&0&1&0\\ \hline 0&0&1&0&1&1&1\\ \hline 1&1&0&0&1&1&0\\ \hline\end{array}
&
\begin{array}{|c|c|c|c||c|c|c|}\hline 0&0&0&0&0&1&1\\ \hline 0&0&0&1&1&0&0\\ \hline 0&0&0&1&0&1&0\\ \hline\hline 0&0&0&1&1&1&1\\ \hline 0&0&0&0&1&1&0\\ \hline 0&0&0&0&1&0&1\\ \hline 0&0&0&1&0&0&1\\ \hline\end{array}
\end{array}$$

We will see in Theorem 24 that, like a BR code, an EBR code has minimum distance $r+1$ on columns. As a consequence, the minimum Hamming distance of this code when considered as a code over $GF(2)$ is $D=16$. In effect, taking a non-zero array in the code, each non-zero column has weight at least four and there are at least four non-zero columns, so $D\geqslant 16$, while the array on the right above has weight exactly 16. As a comparison, the product code consisting of the product of ${\cal C}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ with the $[7,4,3]$ Hamming code generated by $1\oplus x\oplus x^{3}$ has the same rate as $EBR(7,3,2,1\oplus x\oplus x^{3})$ but minimum Hamming distance 12. $\Box$
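The claims of Example 7 can be verified by brute force. The sketch below assumes that the top entry of a column is the coefficient of $x^{0}$ (a convention not stated in the text) and uses the two arrays as they appear in the equation list of this page; all names are illustrative.

```python
P = 7
# (1 + x + x^3)(1 + x) = 1 + x^2 + x^3 + x^4, coefficients from degree 0 upward.
G = [1, 0, 1, 1, 1, 0, 0]


def column_code() -> set[tuple[int, ...]]:
    """All words of the binary cyclic code of length 7 generated by G."""
    basis = [[G[(u - s) % P] for u in range(P)] for s in range(3)]  # g, x*g, x^2*g
    words = set()
    for mask in range(2 ** 3):
        word = [0] * P
        for s in range(3):
            if (mask >> s) & 1:
                word = [a ^ b for a, b in zip(word, basis[s])]
        words.add(tuple(word))
    return words


def in_ebr(array: list[list[int]], r: int) -> bool:
    """Columns must lie in the cyclic code; lines of slope 0..r-1 must be even."""
    code = column_code()
    columns = [tuple(array[u][v] for u in range(P)) for v in range(P)]
    if any(col not in code for col in columns):
        return False
    return all(
        sum(array[(u0 - s * v) % P][v] for v in range(P)) % 2 == 0
        for s in range(r)
        for u0 in range(P)
    )


def parse(rows: str) -> list[list[int]]:
    return [[int(ch) for ch in row] for row in rows.split()]


LEFT = parse("1010101 1110001 0110011 0100100 1000010 0010111 1100110")
RIGHT = parse("0000011 0001100 0001010 0001111 0000110 0000101 0001001")

print(in_ebr(LEFT, 3), in_ebr(RIGHT, 3), sum(map(sum, RIGHT)))
```

The column code has 8 words and every non-zero word has weight 4, which is the fact the distance argument above relies on.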

Example 8.

Let $p=q-1$ be a prime number (in fact, a Mersenne prime), $q=2^{b}$, $\beta$ a primitive element in $GF(q)$, $g_{D-2}(x)=\prod_{i=1}^{D-2}(\beta^{i}\oplus x)$, and consider the RS code ${\cal C}(p,g_{D-2}(x)(1\oplus x),q,D)$ [18]. Then the $EBR(p,r,q,g_{D-2}(x))$ code as given by Definition 5 consists of all the $p\times p$ arrays over $GF(q)$ having even parity on lines of slopes $j$, $0\leqslant j\leqslant r-1$, and whose columns are in the RS code ${\cal C}(p,g_{D-2}(x)(1\oplus x),q,D)$.

For example, we can take $p=7=8-1$, $\beta$ a primitive element in $GF(8)$ and $g_{1}(x)=\beta\oplus x$. Then, according to Definition 5, the code $EBR(7,3,8,g_{1}(x))$ consists of all the $7\times 7$ arrays over $GF(8)$ having even parity on lines of slope 0, 1 and 2, and whose columns are in the $[7,5]$ RS code ${\cal C}(7,(1\oplus x)(\beta\oplus x),8,3)$.

More concretely, if $\beta$ is a zero of the primitive polynomial $1\oplus x\oplus x^{3}$, the reader can verify that the following is an array in $EBR(7,3,8,g_{1}(x))$:

$$\begin{array}{|c|c|c|c||c|c|c|}\hline
0&\beta^{4}&1&0&1&\beta&\beta^{2}\\ \hline
\beta^{5}&\beta^{6}&\beta^{5}&\beta^{4}&\beta&\beta^{6}&\beta^{2}\\ \hline
\beta^{5}&\beta^{5}&\beta^{4}&0&\beta&\beta^{3}&\beta^{5}\\ \hline
\beta^{2}&\beta^{4}&0&\beta^{2}&0&\beta^{4}&0\\ \hline
\beta^{2}&\beta^{4}&\beta&\beta^{3}&\beta^{4}&1&\beta^{2}\\ \hline\hline
\beta^{3}&\beta^{4}&\beta^{4}&\beta&1&\beta^{4}&\beta^{4}\\ \hline
\beta^{3}&\beta&\beta^{2}&\beta^{3}&\beta^{4}&\beta^{6}&\beta^{6}\\ \hline
\end{array}$$

We may assume that the first 5 symbols in the first 4 columns are a $5\times 4$ data array, while the remaining symbols are parity symbols. $\Box$

The next lemma establishes the connection between $BR(p,r,q)$ and $EBR(p,r,q,1)$ codes.

Lemma 9.

*There is a 1-1 relationship between $BR(p,r,q)$ and $EBR(p,r,q,1)$ codes preserving the (column) weight of each array in the code.*

**Proof:** Consider an array in $EBR(p,r,q,1)$ of (column) weight $w$. Denote the array as $C=(\underline{c}_{0},\underline{c}_{1},\ldots,\underline{c}_{p-1})$, where $\underline{c}_{j}=(c_{0,j},c_{1,j},\ldots,c_{p-1,j})$ is a (column) vector of length $p$ for $0\leqslant j\leqslant p-1$. For each $\underline{c}_{j}$, define a (column) vector of length $p-1$, $\hat{\underline{c}}_{j}=(\hat{c}_{0,j},\hat{c}_{1,j},\ldots,\hat{c}_{p-2,j})$, such that $\hat{c}_{i,j}=c_{i,j}\oplus c_{p-1,j}$ for $0\leqslant i\leqslant p-2$. Then, consider the transformation from $EBR(p,r,q,1)$ to $BR(p,r,q)$ given by

$$C=(\underline{c}_{0},\underline{c}_{1},\ldots,\underline{c}_{p-1})\;\longrightarrow\;\hat{C}=(\hat{\underline{c}}_{0},\hat{\underline{c}}_{1},\ldots,\hat{\underline{c}}_{p-1}).\qquad(7)$$

First we need to prove that $\hat{C}$ in (7) is in $BR(p,r,q)$. Since an array in $EBR(p,r,q,1)$ (resp. $BR(p,r,q)$) consists of $b$ independent arrays in $EBR(p,r,2,1)$ (resp. $BR(p,r,2)$), where $q=2^{b}$, without loss of generality we assume $q=2$. By Definition 2, $\hat{C}\in BR(p,r,2)$ if and only if every line of slope $s$ in the $p\times p$ array consisting of $\hat{C}$ with a zero row appended at the bottom has even parity for $0\leqslant s\leqslant r-1$. Notice that, by (7), such an array is equal to $C\oplus W$, where $W$ is a $p\times p$ array such that column $j$ of $W$ is an all-zero vector if $c_{p-1,j}=0$, otherwise it is an all-one vector. Since, in particular, the weight of row $p-1$ in $C$ is even (it is a line of slope 0), the number of all-one columns in $W$ is even. Each line of slope $s$ meets every column exactly once, so it contains an even number of 1s of $W$. Hence, any line of slope $s$, $0\leqslant s\leqslant r-1$, has even parity in $C\oplus W$ and $\hat{C}\in BR(p,r,2)$.

Next we have to show that $C$ and $\hat{C}$ have the same (column) weight. This is true because $\underline{c}_{j}$ is non-zero if and only if $\hat{\underline{c}}_{j}$ is non-zero for $0\leqslant j\leqslant p-1$. In effect, if $\underline{c}_{j}$ is non-zero and $c_{p-1,j}=0$, then $c_{i,j}=\hat{c}_{i,j}$ for $0\leqslant i\leqslant p-2$ and $\hat{\underline{c}}_{j}$ is non-zero. If $\underline{c}_{j}$ is non-zero and $c_{p-1,j}=1$, then the number of 1s among the $c_{i,j}$ for $0\leqslant i\leqslant p-2$ is odd, as well as the number of zeros. Hence, the number of 1s among the $\hat{c}_{i,j}=1\oplus c_{i,j}$ for $0\leqslant i\leqslant p-2$ is odd, thus $\hat{\underline{c}}_{j}$ has odd weight and it cannot be a zero vector. ∎

Corollary 10.

*The code $EBR(p,r,q,g(x))$ over ${\cal C}(p,g(x)(1\oplus x),q,d)$ given by Definition 5 is MDS.*

**Proof:** Let $C$ be a non-zero array in $EBR(p,r,q,g(x))$ of (column) weight $w$. Applying transformation (7) to $C$, we obtain $\hat{C}\in BR(p,r,q)$. Since, by Lemma 9, $\hat{C}$ also has (column) weight $w$, and since $BR(p,r,q)$ is MDS [6], we have $w\geqslant r+1$, so $EBR(p,r,q,g(x))$ is also MDS. ∎

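Example 11.

Consider the array in $EBR(5,3,2,1)$ given in Example 6. Then the following is the transformation of this array into an array in $BR(5,3,2)$ as given by (7), where we add a row of zeros to the $4\times 5$ array in $BR(5,3,2)$:

$$\begin{array}{|c|c|c|c|c|}\hline 1&0&0&1&0\\ \hline 1&1&1&0&1\\ \hline 0&1&1&0&0\\ \hline 0&1&1&0&0\\ \hline 0&1&1&1&1\\ \hline\end{array}
\;\longrightarrow\;
\begin{array}{|c|c|c|c|c|}\hline 1&1&1&0&1\\ \hline 1&0&0&1&0\\ \hline 0&0&0&1&1\\ \hline 0&0&0&1&1\\ \hline\hline 0&0&0&0&0\\ \hline\end{array}$$

We can see that the parity along all the lines of slope 0, 1 and 2 is preserved. $\Box$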
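Transformation (7) is easy to state in code for $q=2$ and $g(x)=1$. In the sketch below, `ebr_to_br` follows the definition $\hat{c}_{i,j}=c_{i,j}\oplus c_{p-1,j}$; the inverse `br_to_ebr` is not spelled out in the text and is derived here under the assumption that $p$ is odd, so that XORing a column of $\hat{C}$ returns $c_{p-1,j}$. The array is the one of Example 6 as it appears in the equation list of this page.

```python
def ebr_to_br(array: list[list[int]]) -> list[list[int]]:
    """Transformation (7): XOR the last row into every other row, then drop it."""
    last = array[-1]
    return [[c ^ l for c, l in zip(row, last)] for row in array[:-1]]


def br_to_ebr(hat: list[list[int]]) -> list[list[int]]:
    """Inverse map (assumes p odd): the dropped row is the XOR of each column of hat."""
    last = [0] * len(hat[0])
    for row in hat:
        last = [a ^ b for a, b in zip(last, row)]
    return [[c ^ l for c, l in zip(row, last)] for row in hat] + [last]


def column_weight(array: list[list[int]]) -> int:
    """Number of non-zero columns."""
    return sum(any(row[v] for row in array) for v in range(len(array[0])))


EXAMPLE_6 = [
    [1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1],
]

hat = ebr_to_br(EXAMPLE_6)
print(hat, column_weight(EXAMPLE_6), column_weight(hat))
```

The round trip `br_to_ebr(ebr_to_br(C))` returns `C` whenever every column of `C` has even parity, which is the situation of Lemma 9.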

Since it is well known how to encode and decode $BR(p,r,q)$ codes [6, 14], the same can be done for $EBR(p,r,q,g(x))$ codes. In effect, the first step in the decoding consists of recovering up to $d-1$ erasures in each column (i.e., locally) whenever possible using the cyclic code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ (since the code is cyclic, also a burst of up to $1+\deg(g)$ erasures can be recovered, as is the case in the encoding). Once this is done, the array in $EBR(p,r,q,g(x))$ is transformed into an array in $BR(p,r,q)$ using transformation (7). This array is decoded in $BR(p,r,q)$ using, for example, the method in [6] or in [14]. Once the decoding is complete, the inverse transformation is applied to obtain the original array in $EBR(p,r,q,g(x))$ . More efficient encoding and decoding methods, though, will be presented in Section II.
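The first, local step of this decoding procedure is simplest when $g(x)=1$, where the column code is the even-parity code with $d=2$ and a single erasure per column is repaired from the rest of the column. The sketch below covers only that special case for $q=2$; the function name and the use of `None` to mark an erasure are illustrative choices, not notation from the paper.

```python
def recover_column(column: list) -> list[int]:
    """Repair one erased entry (marked None) of a binary column with even parity."""
    missing = [u for u, c in enumerate(column) if c is None]
    if len(missing) != 1:
        raise ValueError("with d = 2 only a single erasure per column is repaired locally")
    value = 0
    for c in column:
        if c is not None:
            value ^= c  # the erased bit equals the XOR of the surviving bits
    repaired = list(column)
    repaired[missing[0]] = value
    return repaired


# Column 3 of the array in Example 6 is (1, 0, 0, 0, 1); erase its last entry.
print(recover_column([1, 0, 0, 0, None]))
```

Columns with more erasures than the local code can handle are left to the global step, i.e., the transformation to $BR(p,r,q)$ and its decoder.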

Quite often, it is desirable that the parity columns in an array code are independent, like in [2, 4, 7], since this property allows for a minimization of the number of parity updates when a data symbol is updated. Next we will do the same for array codes with local properties. We start with the definition of Independent Parity (IP) codes [2, 4].

Definition 12.

An IP code with $r$ parity columns, denoted $IP(p,r,q)$, consists of all possible $(p-1)\times(p+r)$ arrays over $GF(q)$ such that, when a zero row is appended to the array, for each $s$, $0\leqslant s\leqslant r-1$, the $p$ sets

$$\{(u,v)\,:\,u+sv=i\}\cup\{(i,p+s)\}$$

for $0\leqslant i\leqslant p-1$ (where $0\leqslant u,v\leqslant p-1$ and $u+sv$ is taken modulo $p$) all have the same parity, either even or odd. ∎

$IP(p,r,q)$ codes are also known as Generalized EVENODD codes [3] and Blaum–Bruck–Vardy codes [17] in the literature.

Example 13

The $4\times 8$ array consisting of the first 4 rows of the following $5\times 8$ array is in $IP(5,3,2)$ :

[TABLE]

*We can see that given $s$ , $0\leqslant s\leqslant 2$ , all the lines of slope $s$ together with the corresponding independent parity in column $5+s$ (illustrated in red for the lines through entry $c_{1,0}$ ) have the same parity: even for lines of slope 0, odd for lines of slope 1 and even for lines of slope 2.

$\Box$ *

Similarly to $BR(p,r,q)$ codes, there is an equivalent algebraic description of $IP(p,r,q)$ codes [2, 4]. Explicitly:

Definition 14

An $IP(p,r,q)$ code is the code over the ring of polynomials modulo $M_{p,q}(x)$ whose parity-check matrix is given by the $r\times(p+r)$ matrix

[TABLE]

where $M_{p,q}(\alpha)=0$ and $\alpha\neq 1$ . ∎

By Definition 14, each array in $IP(p,r,q)$ can be considered as a $(p-1)\times(p+r)$ array such that each column represents an element in the ring of polynomials modulo $M_{p,q}(x)$ .

Next we define Expanded Independent Parity (EIP) codes:

Definition 15

*Consider the cyclic code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ and $\alpha\in\mbox{$ {\cal R} $}_{p}(q)$ as in Definition 5. Then an EIP code $EIP(p,r,q,g(x))$ is the $[p+r,p]$ code over $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ whose parity-check matrix $\tilde{H}_{p,r}$ is given by (14). ∎*

Contrary to $IP(p,r,q)$ codes, the parities in $EIP(p,r,q,g(x))$ codes are always even.

The next example illustrates Definition 15.

Example 16

The following is an array in $EIP(5,3,2,1)$ (the data, corresponding to the first 4 rows and the first 5 columns, is the same as in the array given in Example 13):

[TABLE]

*Each column has even parity (i.e., each column is in $\mbox{$ {\cal C} $}(p,1\oplus x,2,2)$ ), as well as each line of slope 0, 1 and 2 together with the corresponding independent parity (illustrated in red for the lines through entry $c_{1,0}$ ).

$\Box$ *

The next lemma, similarly to Lemma 9, establishes the connection between the codes $IP(p,r,q)$ and $EIP(p,r,q,1)$ :

Lemma 17

*There is a 1-1 relationship between $IP(p,r,q)$ and $EIP(p,r,q,1)$ preserving the (column) weight of each array in the code. *

**Proof:** Proceeding as in Lemma 9, denote an array $C\in EIP(p,r,q,1)$ as $C=(\mbox{$ \underline{c} $}_{0},\mbox{$ \underline{c} $}_{1},\ldots,\mbox{$ \underline{c} $}_{p+r-1})$ , where each $\mbox{$ \underline{c} $}_{j}$ is a (column) vector of length $p$ for $0\leqslant j\leqslant p+r-1$ and $\hat{\mbox{$ \underline{c} $}}_{j}$ is defined as in the proof of Lemma 9. Then, consider the transformation from $EIP(p,r,q,1)$ to $IP(p,r,q)$ given by

[TABLE]

We have to prove that $\hat{C}$ in (15) is in $IP(p,r,q)$ . As in Lemma 9, without loss of generality we may assume that $q=2$ . Consider the $p\times(p+r)$ array consisting of $\hat{C}$ with a zero-row appended. By Definition 12 of $IP(p,r,2)$ , we have to prove that in such $p\times(p+r)$ array, the lines of slope $s$ in the first $p$ columns of the array, $0\leqslant s\leqslant r-1$ , starting in entry $\hat{c}_{i,0}$ , $0\leqslant i\leqslant p-1$ , together with entry $\hat{c}_{i,p+s}$ , have all the same parity, either even or odd. As in Lemma 9, by (15), such an array is equal to $C\oplus W$ , where $W$ is a $p\times(p+r)$ array such that column $j$ of $W$ is an all-zero vector if $c_{p-1,j}=0$ , otherwise it is an all-one vector. Consider vector $\mbox{$ \underline{v} $}_{s}=(c_{p-1,0},c_{p-1,1},\ldots,c_{p-1,p-1},c_{p-1,p+s})$ , where $0\leqslant s\leqslant r-1$ , and let $W_{s}$ be the $p\times(p+1)$ matrix consisting of columns $0,1,\ldots,p-1,p+s$ of $W$ . If the weight of $\mbox{$ \underline{v} $}_{s}$ is even, then the number of all-one columns in $W_{s}$ is even and any line of slope $s$ in the first $p$ columns of the array through entry $\hat{c}_{i,0}$ , $0\leqslant i\leqslant p-2$ (as given by Definition 1), together with entry $\hat{c}_{i,p+s}$ , has even parity. Otherwise, all such lines together with entry $\hat{c}_{i,p+s}$ have odd parity.

Regarding the weight preservation, the argument is the same as in Lemma 9. ∎

Example 18

Consider the array in $EIP(5,3,2,1)$ given in Example 16. The transformation from $EIP(5,3,2,1)$ to $IP(5,3,2)$ given by (15) is (by appending a row of zeros to the $4\times 8$ array in $IP(5,3,2)$ )

[TABLE]

*We can see that in the array on the right, every line of slope 0 and 1 with its corresponding independent parity bit has even parity, while every line of slope 2 with its corresponding independent parity bit has odd parity (this case illustrated in red for the line through entry $c_{1,0}$ ).

$\Box$ *
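The transformation of Lemma 17 can be checked numerically. The following Python sketch is illustrative only: it assumes binary symbols ($q=2$), builds an $EIP(p,r,2,1)$ array with the rule $\underline{c}_{p+s}=\bigoplus_{j}\alpha^{sj}\underline{c}_{j}$ (multiplication by $\alpha$ taken as a cyclic rotation of the column), and then XORs the last symbol of each column into the whole column, which is how the proof describes $C\oplus W$. The function names are hypothetical and do not come from the paper.

```python
def rotate(col, k):
    """Multiply a column (coefficients of alpha^0..alpha^(p-1)) by alpha^k."""
    p = len(col)
    return [col[(i - k) % p] for i in range(p)]

def encode_eip(data_rows, r):
    """Assumed encoder for EIP(p, r, 2, 1): data_rows is a (p-1) x p bit array."""
    p = len(data_rows[0])
    cols = []
    for j in range(p):
        col = [row[j] for row in data_rows]
        col.append(sum(col) % 2)            # vertical parity symbol
        cols.append(col)
    for s in range(r):
        par = [0] * p
        for j in range(p):                  # c_{p+s} = XOR_j alpha^{sj} c_j
            par = [a ^ b for a, b in zip(par, rotate(cols[j], s * j))]
        cols.append(par)
    return cols

def eip_to_ip(cols):
    """XOR the last symbol of every column into that column (C xor W)."""
    return [[bit ^ col[-1] for bit in col] for col in cols]

def line_parities(cols, p, s):
    """Parity of each line of slope s through (w, 0), plus entry (w, p+s)."""
    out = []
    for w in range(p):
        bit = cols[p + s][w]
        for v in range(p):
            bit ^= cols[v][(w - s * v) % p]
        out.append(bit)
    return out

p, r = 5, 3
data = [[(3 * i + j * j + i * j) % 2 for j in range(p)] for i in range(p - 1)]
eip = encode_eip(data, r)
ip = eip_to_ip(eip)
for s in range(r):
    # In the EIP array every line is even; in the IP array the parity is
    # constant along each slope, but may be even or odd.
    print(s, line_parities(eip, p, s), line_parities(ip, p, s))
```

For each slope, the second list printed should be constant (all 0s or all 1s), and the last row of `ip` should be zero, in agreement with Definition 12.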

Corollary 19

*If code $IP(p,r,q)$ is MDS, then code $EIP(p,r,q,g(x))$ given by Definition 15 is also MDS. *

**Proof: **Similar to Corollary 10. ∎

Contrary to $BR$ codes, $IP$ codes are not always MDS. In particular, $IP(p,r,q)$ codes, and hence, by Corollary 19, $EIP(p,r,q,g(x))$ codes, are MDS for $1\leqslant r\leqslant 3$ [2, 4]. For $r>3$ the codes are MDS depending on the prime number chosen. A list of prime numbers for which $IP(p,r,q)$ is MDS and $r\geqslant 4$ is given in [4]. See also [17].

In the definitions of codes $EBR(p,r,q,g(x))$ and $EIP(p,r,q,g(x))$ , it is assumed that each column in an array in the code is in the cyclic code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ . It is certainly possible to extend the definition such that each data column $j$ is in a cyclic code $\mbox{$ {\cal C} $}(p,g_{j}(x)(1\oplus x),q,d_{j})$ , where each $g_{j}(x)$ divides $1\oplus x^{p}$ and $\gcd(g_{j}(x),1\oplus x)=1$ . In this case, up to $d_{j}-1$ erasures can be corrected in data column $j$ , giving unequal erasure correction for the data columns. The $r$ parity columns are in $\mbox{$ {\cal C} $}(p,g^{\prime}(x)(1\oplus x),q,d^{\prime})$ , where $g^{\prime}(x)$ is the greatest common divisor of the polynomials $g_{j}(x)$ .

Before proceeding further, we briefly discuss the applications and advantages of expanded array codes over other array codes. The applications of expanded array codes are the same as those of traditional array codes, like BR [6], EVENODD [2], IP [3, 4], RDP [8], generalized RDP [1, 9] and codes with distributed parities [19, 23, 24, 25]. Mainly, these array codes can be used in RAID type of architectures [10], like RAID 6, which requires two parity columns. In addition, expanded array codes contain vertical parities, providing for local recovery [12]. Array codes are an alternative to RS codes [18], which require finite field operations. Since array codes are based on XOR operations, their implementation is in general less complex than that of codes based on finite field arithmetic.

Another application of array codes is the cloud. In this case, each entry may correspond to a whole device. Erasure codes involving local and global parities may be invoked [11, 12, 13]. Array codes naturally provide horizontal locality as well as column recovery, so they can be used as Locally Recoverable (LRC) Codes. Traditional array codes do not have vertical parity. In some applications, a column may represent a device, and each entry in the column a sector or a page in the device. It may be desirable to have vertical parities in an array code, so that, if a number of sectors or pages fail (for instance, if their internal ECCs are exceeded and such a situation is detected by the CRCs), the failed sectors or pages can be recovered locally without invoking other devices (a first responder type of approach). A way to achieve this goal is by using an array code like a BR or an IP code and then encoding each column with vertical parities. A problem with this approach is that the column parities are not protected by the other parities. This problem is overcome by the expanded array codes described above.

$EBR(p,r,q,g(x))$ codes have another interesting property. In addition to being able to recover any $r$ erased columns, they can recover also from a number of erased lines of slope $j$ , $1\leqslant j\leqslant r-1$ . We will study this property in some detail in Section IV, but in the meantime we illustrate this property with a simple example.

Example 20

Assume that $p=5$ , and we have a $5\times 5$ array such that we encode a $4\times 3$ data array consisting of zeros into a $4\times 5$ array in code $BR(5,2,2)$ , and then we append a parity row. The result is a $5\times 5$ 0-array. Next, assume that two lines of slope 1 are erased, say, lines 1 (in blue) and 4 (in red) as follows, where the symbol $E$ corresponds to an erasure:

[TABLE]

This situation admits two solutions as shown below, since the top 4 rows in the array on the right are in $BR(5,2,2)$ :

[TABLE]

*Hence, the two erased lines of slope 1 are uncorrectable. If we encode the $4\times 3$ data array of zeros into a $5\times 5$ array in $EBR(5,2,2,1)$ , we also obtain the zero array, but there is a unique decoding to the two erased lines of slope 1, given by the zero array on the left. Since the two colored diagonals in the array on the right have odd parity, this array cannot be in $EBR(5,2,2,1)$ and the solution is unique.

$\Box$ *

If each entry represents a page or a sector, for example, a 64K sector, the parity sectors of an expanded array code, being obtained as XORs of data sectors, inherit the CRC bits by linearity, i.e., the CRC of the parity sectors does not need to be computed. This is an important advantage in implementation. Finally, expanded array codes naturally provide multiple localities to recover a single failed symbol, a problem that has attracted attention in the recent literature [21, 22, 26].

The paper is structured as follows: in Section II, we present efficient encoding and decoding algorithms for the array codes we have defined above (i.e., EBR and EIP codes). In Section III, we examine the problem of the minimum (symbol, as opposed to column) distance of such array codes. In Section IV, we study conditions under which erased lines of slope $s$ , $0\leqslant s\leqslant r-1$ , can be recovered in EBR codes (as illustrated in Example 20 for $s=1$ ). Section V discusses the puncturing of EBR and EIP codes to obtain MDS codes consisting of $m\times p$ or $m\times(p+r)$ arrays, where $m<p$ for certain values of $m$ . We end the paper by drawing some conclusions.

II Encoding, Decoding and Updating of a Data Symbol in EBR and EIP Codes

We start with a technical lemma.

Lemma 21

*Let $g(x)$ be an irreducible polynomial over $GF(q)$ such that $g(x)$ divides $x^{p}\oplus 1$ , $p$ prime, and $\gcd(g(x),x\oplus 1)=1$ . Then, for each $i$ such that $1\leqslant i<p$ , $\gcd(g(x),x^{i}\oplus 1)=1$ . *

**Proof:** Assume that the lemma is not true, hence, since $g(x)$ is irreducible, there is an $i$ , $1\leqslant i<p$ , such that $g(x)$ divides $x^{i}\oplus 1$ . Moreover, assume that $i$ is minimal with this property. Since $\gcd(g(x),x\oplus 1)=1$ , then $2\leqslant i<p$ . Let $p=ci+r$ , where, since $p$ is prime, $1\leqslant r<i$ . We can easily verify that

[TABLE]

Since $g(x)$ divides both $x^{p}\oplus 1$ and $x^{i}\oplus 1$ , then, by (17), it also divides $x^{r}\oplus 1$ , contradicting the minimality of $i$ . ∎
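Lemma 21 can be checked on small cases. The sketch below is only an illustration under the assumption $q=2$: binary polynomials are stored as Python integers (bit $i$ is the coefficient of $x^{i}$), and `gf2_mod` and `gf2_gcd` are hypothetical helper names. It checks $g(x)=1\oplus x\oplus x^{3}$ with $p=7$, and also shows that the conclusion can fail when the length is not prime (length 15 with $g(x)=1\oplus x\oplus x^{2}\oplus x^{3}\oplus x^{4}$).

```python
def gf2_mod(a, b):
    """Remainder of a(x) divided by b(x) over GF(2); b must be nonzero."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)   # cancel the leading term of a
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm for binary polynomials."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def x_pow_plus_one(i):
    """The polynomial x^i + 1 as an integer bit mask."""
    return (1 << i) | 1

# Prime length p = 7, g(x) = 1 + x + x^3 (irreducible, divides x^7 + 1).
g = 0b1011
print(gf2_mod(x_pow_plus_one(7), g))                       # 0: g divides x^7 + 1
print([gf2_gcd(g, x_pow_plus_one(i)) for i in range(1, 7)])

# Composite length 15: h(x) = 1 + x + x^2 + x^3 + x^4 divides x^15 + 1 and is
# coprime to x + 1, yet it divides x^5 + 1, so the lemma needs p prime.
h = 0b11111
print(gf2_mod(x_pow_plus_one(15), h), gf2_gcd(h, 0b11), gf2_gcd(h, x_pow_plus_one(5)))
```

In the prime case every gcd in the printed list equals 1; in the composite case the last gcd equals $h(x)$ itself.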

The following lemma gives a recursion that will be used in the decoding of EBR codes.

Lemma 22

Let $\mbox{$ \underline{v} $}(\alpha)=\bigoplus_{i=0}^{p-1}v_{i}\alpha^{i}\in\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ , where $\alpha\in\mbox{$ {\cal R} $}_{p}(q)$ , $\alpha\neq 1$ and $\alpha^{p}=1$ . Then, for each $j$ such that $1\leqslant j\leqslant p-1$ , the recursion $(1\oplus\alpha^{j})\underline{z}(\alpha)=\mbox{$ \underline{v} $}(\alpha)$ has a unique solution in $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ . Specifically, if $\underline{z}(\alpha)=\bigoplus_{i=0}^{p-1}z_{i}\alpha^{i}$ , then

[TABLE]

∎

**Proof:** If $z_{0}$ is known, by solving recursively $(1\oplus\alpha^{j})\underline{z}(\alpha)=\mbox{$ \underline{v} $}(\alpha)$ , (19) is obtained. Since $p$ is prime, all the entries $z_{i}$ for $1\leqslant i\leqslant p-1$ are covered by this recursion. In particular, if $i=1$ , $z_{j}=z_{0}\oplus v_{j}$ . Using this result as our starting point, we obtain

[TABLE]

XORing both sides of (20) from $i=1$ to $i=p-1$ , we have

[TABLE]

Since, in particular, $\underline{z}(\alpha)$ must have even weight, then $\bigoplus_{i=1}^{p-1}z_{i}=z_{0}$ . Also, since $p$ is odd, $(p-1)z_{0}=0$ . Finally,

[TABLE]

Replacing these values in (21), we obtain (18).

It remains to be proven that $\underline{z}(\alpha)\in\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ . Hence, we have to prove that $g(x)(1\oplus x)$ divides $\underline{z}(x)$ . Certainly, $1\oplus x$ divides $\underline{z}(x)$ since $\underline{z}(x)$ has even weight. Without loss of generality, assume that $g(x)$ is irreducible (otherwise, take an irreducible factor of $g(x)$ ). Since $g(x)$ divides $\mbox{$ \underline{v} $}(x)$ , $(1\oplus\alpha^{j})\underline{z}(\alpha)=\mbox{$ \underline{v} $}(\alpha)$ and, by Lemma 21, $\gcd(g(x),1\oplus x^{j})=1$ for $1\leqslant j\leqslant p-1$ , $g(x)$ divides $\underline{z}(x)$ . ∎

Lemma 22 was proven in [14], Lemma 7, and in [15], Lemma 13, for the special case $\mbox{$ {\cal C} $}(p,1\oplus x,2,2)$ .

The next example illustrates Lemma 22:

Example 23

Let $p=7$ , $\mbox{$ \underline{v} $}(\alpha)=1\oplus\alpha\oplus\alpha^{4}\oplus\alpha^{6}\in\mbox{$ {\cal C} $}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ (see Example 7), i.e., $v_{0}=1$ , $v_{1}=1$ , $v_{2}=0$ , $v_{3}=0$ , $v_{4}=1$ , $v_{5}=0$ and $v_{6}=1$ . Assume that we want to solve the recursion $(1\oplus\alpha^{3})\underline{z}(\alpha)=\mbox{$ \underline{v} $}(\alpha)$ . According to (18) and (19), since $j=3$ , $\mbox{$ \langle $}2j\mbox{$ \rangle $}_{7}=6$ , so

[TABLE]

*so $\underline{z}(\alpha)=\alpha^{2}\oplus\alpha^{4}\oplus\alpha^{5}\oplus\alpha^{6}$ . In particular, we can see that $\underline{z}\in\mbox{$ {\cal C} $}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ .

$\Box$ *

Observe that the recursion in Lemma 22 involves $\frac{3p-5}{2}$ XORs.
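As an illustration, the recursion of Lemma 22 can be sketched in Python for the binary case. This is not the optimized procedure behind the $\frac{3p-5}{2}$ XOR count: it is a minimal sketch that assumes $q=2$, starts from the guess $z_{0}=0$, propagates $z_{\langle m+j\rangle}=z_{m}\oplus v_{\langle m+j\rangle}$, and then complements the result if its weight is odd (complementing adds the all-one vector, which is annihilated by $1\oplus\alpha^{j}$ and, since $p$ is odd, changes the parity of the weight). The names `solve_recursion` and `times_one_plus_alpha_j` are hypothetical.

```python
def solve_recursion(v, j):
    """Even-weight solution z of (1 + alpha^j) z = v over GF(2), alpha^p = 1.

    v is a list of p bits of even weight, p an odd prime, 1 <= j <= p-1.
    """
    p = len(v)
    z = [0] * p                  # tentative choice z_0 = 0
    idx = 0
    for _ in range(p - 1):       # since p is prime, idx visits every index
        nxt = (idx + j) % p
        z[nxt] = z[idx] ^ v[nxt]
        idx = nxt
    if sum(z) % 2 == 1:          # wrong guess for z_0: take the complement
        z = [bit ^ 1 for bit in z]
    return z

def times_one_plus_alpha_j(z, j):
    """Compute (1 + alpha^j) z, used here only to check the solution."""
    p = len(z)
    return [z[i] ^ z[(i - j) % p] for i in range(p)]

v = [1, 1, 0, 0, 1, 0, 1]        # 1 + alpha + alpha^4 + alpha^6 (Example 23)
z = solve_recursion(v, 3)
print(z)                          # [0, 0, 1, 0, 1, 1, 1]
print(times_one_plus_alpha_j(z, 3) == v)
```

The output corresponds to $\underline{z}(\alpha)=\alpha^{2}\oplus\alpha^{4}\oplus\alpha^{5}\oplus\alpha^{6}$, the solution found in Example 23.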

Next we will show how to correct up to $r$ erased columns in $EBR(p,r,q,g(x))$ by adapting the method in [6]. The next theorem was proven in [5] for $EBR(p,r,q,1)$ . The proof is analogous, but we give it for the sake of completeness.

Theorem 24

*Code $EBR(p,r,q,g(x))$ given by Definition 5 can correct up to $d-1$ erasures or a burst of up to $1+\deg(g(x))$ (consecutive) erasures in each column and up to $r$ erased columns. *

**Proof:** Given an array in $EBR(p,r,q,g(x))$ , since columns are in the cyclic code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ , up to $d-1$ erasures can be corrected in each column, and also a burst of length up to the degree of the generator polynomial $g(x)(1\oplus x)$ , i.e., $1+\deg(g(x))$ (systematic encoding is a special case of recovering such a burst of erasures).

Next assume that columns $i_{0},i_{1},\ldots,i_{\rho-1}$ have been erased, where $\rho\leqslant r$ , and we denote by $\underline{e}_{s}$ the (erased) value of column $i_{s}$ . Consider the polynomial $G(x)$ of degree $\rho-1$

[TABLE]

Notice that

[TABLE]

Denote the columns of the array by $\mbox{$ \underline{c} $}_{u}$ , where $0\leqslant u\leqslant p-1$ . Assuming that the erased columns are zero, compute the syndromes

[TABLE]

Hence, from (25), we also have

[TABLE]

From (22), (23), (24) and (26),

[TABLE]

After computing $\bigoplus_{j=0}^{\rho-1}g_{j}S_{j}$ , $\underline{e}_{0}$ can be obtained by applying the recursion given by (18) and (19) in Lemma 22 $\rho-1$ times. Once $\underline{e}_{0}$ is obtained, we are left with $\rho-1$ erasures, and we proceed by induction. ∎

The next example illustrates the decoding procedure given in Theorem 24.

Example 25

Consider the code $EBR(7,3,2,1\oplus x\oplus x^{3})$ of Example 7 and assume that we want to decode the following array, where the blank spaces denote erasures:

[TABLE]

We can see that columns 1, 3 and 6 are erased, columns 0 and 4 contain three erasures each and columns 2 and 5 contain a burst of length four (in particular, the burst in column 2 is an all-around burst, but it can be corrected also since the code is cyclic). Since the columns are in the cyclic code $\mbox{$ {\cal C} $}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ , the first step is obtaining the erasures in columns 0, 2, 4 and 5. Once this is done, we obtain

[TABLE]

By (22),

[TABLE]

Assuming that the erased columns are zero when computing the syndromes, by (25), we obtain

[TABLE]

Using (28), we compute

[TABLE]

By (27) and (29), we have to solve the double recursion

[TABLE]

or, multiplying both sides by $\alpha^{-2}$ ,

[TABLE]

Let $(1\oplus\alpha^{5})\underline{e}_{0}=\mbox{$ \underline{v} $}_{0}$ , then we have to solve first

[TABLE]

Applying the recursion given by (18) and (19) as illustrated in Example 23, we obtain

[TABLE]

Next we have to solve

[TABLE]

This gives,

[TABLE]

Recomputing the syndromes,

[TABLE]

Repeating the procedure for two erasures, we now have

[TABLE]

which gives

[TABLE]

We have to solve the recursion

[TABLE]

hence,

[TABLE]

Finally, we recompute

[TABLE]

The final decoded array is then

[TABLE]

*which coincides with the first array given in Example 7.

$\Box$ *

The encoding is a special case of the decoding. For example, we may use the last $1+\deg(g(x))$ rows and the last $r$ columns to store the parities. We start by encoding systematically [18] the first $p-r$ columns using the generator polynomial $g(x)(1\oplus x)$ . Next we compute the last $r$ columns using the decoding procedure as described in Theorem 24. Since at the encoding the erasures are always in the last $r$ columns, we may precompute the coefficients of $G(x)=\prod_{j=1}^{r-1}(x\oplus\alpha^{p-r+j})$ , making the process faster.

Let us examine next $EIP(p,r,q,g(x))$ codes. The encoding of $EIP(p,r,q,g(x))$ codes is very simple and is a direct consequence of Definition 15: given a $(p-(1+\deg(g(x))))\times p$ data array that we denote as $\mbox{$ \underline{v} $}_{0},\mbox{$ \underline{v} $}_{1},\ldots,\mbox{$ \underline{v} $}_{p-1}$ , each $\mbox{$ \underline{v} $}_{j}$ a (vertical) vector of length $p-(1+\deg(g(x)))$ over $GF(q)$ , we start by encoding systematically each $\mbox{$ \underline{v} $}_{j}$ into $\mbox{$ \underline{c} $}_{j}\in\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ . The result is a $p\times p$ array. Then we obtain the $r$ parity columns $\mbox{$ \underline{c} $}_{p+s}$ , $0\leqslant s\leqslant r-1$ , as $\mbox{$ \underline{c} $}_{p+s}=\bigoplus_{j=0}^{p-1}\alpha^{sj}\mbox{$ \underline{c} $}_{j}$ . Thus, once the columns $\mbox{$ \underline{c} $}_{j}$ have been obtained, each $\mbox{$ \underline{c} $}_{p+s}$ requires $(p-1)p$ XORs.

Let us consider next the special case of $EIP(p,r,q,1)$ codes. In a first step, we need to obtain symbols $c_{p-1,j}$ for $0\leqslant j\leqslant p-1$ as the XOR of symbols $c_{i,j}$ for $0\leqslant i\leqslant p-2$ , which takes $p-2$ XORs for each $c_{p-1,j}$ . Hence, the total number of XORs required by the encoding algorithm is $(p-2)p+r(p-1)p$ . If we shorten the code to $k$ columns, where $1\leqslant k\leqslant p$ , by assuming that $p-k$ of the columns are zero, then the total number of XORs required at the encoding is $k(p-2)+r(k-1)p$ . In particular, if $r=2$ and $k\leqslant p-2$ , the total number of XORs at the encoding is $3kp-2(k+p)$ . The number of XORs according to the optimized encoding algorithm for $EBR(p,2,q,1)$ with $k$ data columns given in [5] is $3kp-(k+2)$ , so the encoding algorithm for $EIP(p,2,q,1)$ also with $k$ data columns is more efficient.
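The XOR count $k(p-2)+r(k-1)p$ can be confirmed with a small instrumented encoder. The sketch below is an illustration under the assumptions $q=2$ and $g(x)=1$; it treats multiplication by $\alpha^{sj}$ as a free cyclic rotation and counts only symbol XORs. The function names are hypothetical.

```python
def rotate(col, k):
    """Multiply a column (coefficients of alpha^0..alpha^(p-1)) by alpha^k."""
    p = len(col)
    return [col[(i - k) % p] for i in range(p)]

def encode_eip_counting(data_rows, p, r):
    """Encode k <= p data columns (the other p-k columns are taken as zero).

    data_rows has p-1 rows and k columns of bits. Returns the k encoded data
    columns, the r parity columns, and the number of symbol XORs performed.
    """
    k = len(data_rows[0])
    xors = 0
    cols = []
    for j in range(k):
        col = [row[j] for row in data_rows]
        parity = col[0]
        for bit in col[1:]:          # p-2 XORs per vertical parity symbol
            parity ^= bit
            xors += 1
        cols.append(col + [parity])
    parities = []
    for s in range(r):
        acc = list(cols[0])          # alpha^(s*0) c_0 = c_0
        for j in range(1, k):        # k-1 vector XORs of p symbols each
            acc = [a ^ b for a, b in zip(acc, rotate(cols[j], s * j))]
            xors += p
        parities.append(acc)
    return cols, parities, xors

p, k, r = 7, 4, 2
data = [[(i + 2 * j + i * j) % 2 for j in range(k)] for i in range(p - 1)]
cols, parities, xors = encode_eip_counting(data, p, r)
print(xors, k * (p - 2) + r * (k - 1) * p, 3 * k * p - 2 * (k + p))   # 62 62 62
```

The count does not depend on the data, so any $(p-1)\times k$ bit array gives the same three numbers.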

Table I compares the number of XORs required by the encoding algorithms of $EIP(p,2,q,1)$ and of $EBR(p,2,q,1)$ as given in [5], both codes shortened to $k$ data columns for $1\leqslant k\leqslant p-2$ . The encoding algorithm of $EIP(p,2,q,1)$ always requires fewer XORs than the optimized encoding algorithm of $EBR(p,2,q,1)$ in [5]. Table I shows that the relative savings are largest when $k\ll p$ .
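The entries of Table I follow directly from the two closed-form counts. The short Python sketch below (function names are illustrative) evaluates $3kp-(k+2)$ and $3kp-2(k+p)$ for the four $(p,k)$ pairs of the table and the relative improvement.

```python
def xors_ebr(p, k):
    """XOR count of the optimized EBR(p,2,q,1) encoder of [5], k data columns."""
    return 3 * k * p - (k + 2)

def xors_eip(p, k):
    """XOR count of the EIP(p,2,q,1) encoder, k data columns."""
    return 3 * k * p - 2 * (k + p)

for p, k in [(17, 8), (17, 15), (127, 8), (127, 50)]:
    a, b = xors_ebr(p, k), xors_eip(p, k)
    # The difference a - b equals k + 2p - 2, which is always positive.
    print(p, k, a, b, f"{100 * (a - b) / a:.1f}%")
```

The difference between the two counts is $k+2p-2$, so for fixed $p$ the relative saving shrinks as $k$ grows, which is the trend visible in the table.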

Regarding the decoding of $EIP(p,r,q,g(x))$ codes, the first step is always correcting up to $d-1$ erasures in each column or a burst of up to $1+\deg(g(x))$ erasures wherever this is feasible. Once this is done, assuming that the code is MDS and up to $r$ columns are erased, we can apply transform (15) and decode the array in $IP(p,r,q)$ . Then the inverse transformation will give the desired array in $EIP(p,r,q,g(x))$ .

If the erased columns correspond to data columns, i.e., they are among the first $p$ columns in the array, the array can be decoded directly in $EIP(p,r,q,g(x))$ applying the same method as for the decoding of $EBR(p,r,q,g(x))$ . If some of the erased columns are among the $r$ parity columns, the decoding is more complicated since the recursion of Lemma 22 cannot be applied. The case $r=2$ is simple to handle though, since when one of the two parity columns is erased, it is corrected as a special case.

We end this section with the problem of updating $EBR(p,r,q,g(x))$ and $EIP(p,r,q,g(x))$ codes. The idea is, when updating one data symbol, how to minimize the number of parity symbols that need to be updated, a problem that has been treated repeatedly in the literature on array codes [2, 4, 7, 19, 23, 24]. Actually, $EBR(p,r,q,g(x))$ codes have bad updating properties, since the parities are not independent and updating one data symbol causes the updating of most of the parity symbols. The same is true for $BR(p,r,q)$ codes, and the creation of codes like EVENODD [2] arises from the need of optimizing the number of updates by making the parities independent. Hence, in what follows, we concentrate on $EIP(p,r,q,g(x))$ codes only.

As usual, denote an array in $EIP(p,r,q,g(x))$ as $(\mbox{$ \underline{c} $}_{0},\mbox{$ \underline{c} $}_{1},\ldots,\mbox{$ \underline{c} $}_{p-1},\mbox{$ \underline{c} $}_{p},\mbox{$ \underline{c} $}_{p+1},\ldots,\mbox{$ \underline{c} $}_{p+r-1})$ , where each $\mbox{$ \underline{c} $}_{j}$ is a (column) vector of length $p$ . Each time a data symbol $c_{i,j}$ , $0\leqslant i\leqslant p-(2+\deg(g(x)))$ , $0\leqslant j\leqslant p-1$ , is updated, first we need to update the parity symbols in column $j$ . In effect, if data symbol $c_{i,j}$ is replaced by symbol $c^{\prime}_{i,j}$ , consider the (vertical) vector $\mbox{$ \underline{v} $}_{j}$ of length $p-(1+\deg(g(x)))$ that is zero everywhere except in location $i$ , where it is equal to $c_{i,j}\oplus c^{\prime}_{i,j}$ . Encoding (systematically) $\mbox{$ \underline{v} $}_{j}$ into $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ , we obtain a (vertical) vector that we denote $\mbox{$ \underline{c} $}^{\prime}_{j}$ . Once $\mbox{$ \underline{c} $}^{\prime}_{j}$ is obtained, we update $\mbox{$ \underline{c} $}_{j}$ as $\mbox{$ \underline{c} $}_{j}\oplus\mbox{$ \underline{c} $}^{\prime}_{j}$ and the parity vectors $\mbox{$ \underline{c} $}_{p+s}$ , $0\leqslant s\leqslant r-1$ , as $\mbox{$ \underline{c} $}_{p+s}=\mbox{$ \underline{c} $}_{p+s}\oplus\alpha^{sj}\mbox{$ \underline{c} $}^{\prime}_{j}$ . Let us illustrate the process in the next example.

Example 26

Consider the following array in code $EIP(7,3,2,1\oplus x\oplus x^{3})$ :

[TABLE]

*Assume that we want to update symbol $c_{2,1}$ . The first step is encoding (systematically) $\mbox{$ \underline{v} $}_{1}=(0,0,1)$ in $\mbox{$ {\cal C} $}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ . Doing so, we obtain $\mbox{$ \underline{c} $}_{1}^{\prime}=(0,0,1,0,1,1,1)$ and $\mbox{$ \underline{c} $}_{1}\oplus\mbox{$ \underline{c} $}^{\prime}_{1}=(0,1,0,1,1,1,0)$ .*

Then, $\mbox{$ \underline{c} $}_{7}\oplus\mbox{$ \underline{c} $}^{\prime}_{1}=(1,0,1,1,1,0,0)$ , $\mbox{$ \underline{c} $}_{8}\oplus\alpha\mbox{$ \underline{c} $}^{\prime}_{1}=(1,0,0,1,0,1,1)$ and $\mbox{$ \underline{c} $}_{9}\oplus\alpha^{2}\mbox{$ \underline{c} $}^{\prime}_{1}=(1,1,1,0,0,1,0)$ . The updated array is then

[TABLE]

$\Box$
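The update step of Example 26 can be reproduced with a short Python sketch. It is illustrative only: it assumes $q=2$, stores polynomials as bit lists with the lowest degree first, and places the data symbols in the first $p-(1+\deg(g(x)))$ positions of each column, as the example suggests. The systematic encoder used here (classical remainder encoding followed by a cyclic shift) is one possible choice, and all function names are hypothetical.

```python
GEN = [1, 0, 1, 1, 1]   # 1 + x^2 + x^3 + x^4 = (1 + x + x^3)(1 + x)

def poly_mod(a, g):
    """Remainder of a(x) modulo g(x) over GF(2); g must have leading coefficient 1."""
    a = list(a)
    dg = len(g) - 1
    for i in range(len(a) - 1, dg - 1, -1):
        if a[i]:
            for t in range(dg + 1):
                a[i - dg + t] ^= g[t]
    return a[:dg]

def systematic_encode(data, gen):
    """Cyclic codeword whose first len(data) symbols are the data symbols."""
    deg = len(gen) - 1
    rem = poly_mod([0] * deg + list(data), gen)   # x^deg * m(x) mod g(x)
    word = rem + list(data)                       # parity low, data high
    return word[deg:] + word[:deg]                # cyclic shift: data first

def rotate(col, k):
    """Multiply a column by alpha^k (cyclic rotation by k positions)."""
    p = len(col)
    return [col[(i - k) % p] for i in range(p)]

def update_deltas(i, j, p, r, gen):
    """Vectors to XOR into column j and into the r parity columns when
    the data symbol c_{i,j} is flipped."""
    unit = [0] * (p - (len(gen) - 1))
    unit[i] = 1
    delta = systematic_encode(unit, gen)
    return delta, [rotate(delta, s * j) for s in range(r)]

delta, parity_deltas = update_deltas(2, 1, 7, 3, GEN)
changed = (sum(delta) - 1) + sum(sum(d) for d in parity_deltas)
print(delta)      # [0, 0, 1, 0, 1, 1, 1]
print(changed)    # 15 parity symbols change
```

The first printed vector agrees with $\underline{c}^{\prime}_{1}$ in Example 26. The count of changed parity symbols is 3 in column 1 plus 4 in each of the three parity columns.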

We can see that the lowest number of updates in the parity symbols that an $EIP(p,r,q,g(x))$ code as given by Definition 15 can make is $(r+1)d-1$ : $d-1$ parity symbols in the column of the updated data symbol and $d$ symbols in each of the $r$ parity columns. Example 26 attains this value, since we are updating $(3+1)\cdot 4-1=15$ parity symbols. The reason is that the systematic encodings of the three vectors of weight 1 and length 3 in the vertical code $\mbox{$ {\cal C} $}(7,(1\oplus x\oplus x^{3})(1\oplus x),2,4)$ all have weight 4, the minimum distance of the code. Let us state this observation as a lemma.

Lemma 27

Consider an $EIP(p,r,q,g(x))$ code with vertical code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ . Then the number of parity updates when a data symbol is updated reaches the optimal value $(r+1)d-1$ if and only if the systematic encoding of each vector of weight one and length $p-(1+\deg(g(x)))$ has weight $d$ . ∎

Corollary 28

*Consider an $EIP(p,r,q,g(x))$ code and assume that the vertical code $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ is MDS. Then the number of parity updates when a data symbol is updated reaches the optimal value $(r+1)d-1$ . *

**Proof:** Simply observe that if $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ is MDS then the systematic encoding of each vector of weight 1 and length $p-(1+\deg(g(x)))$ has weight $2+\deg(g(x))=d$ and the result follows from Lemma 27. ∎

Corollary 29

*Consider an $EIP(p,r,2,1)$ code. Then the number of parity updates when a data symbol is updated reaches the optimal value $2r+1$ . *

**Proof:** This is the special case of Corollary 28 over the binary field $GF(2)$ in which the vertical code is the parity code $\mbox{$ {\cal C} $}(p,1\oplus x,2,2)$ , which in particular is MDS. ∎

III Minimum Hamming Distance of Array Codes with Local Properties

In this section we consider a problem that has not received much attention in the literature on array codes. In general, when we talk about the distance of an array code, we refer to the column distance. In this section we want to consider the symbol distance. Having a high symbol distance may be important when erased columns co-exist with erased symbols. We will consider both $EBR(p,r,q,g(x))$ and $EIP(p,r,q,g(x))$ codes. We have already seen that $EBR(p,r,q,g(x))$ codes are MDS, i.e., their column distance is $r+1$ , while $EIP(p,r,q,g(x))$ codes are MDS depending on the prime number considered and the value of $r$ . From now on, the Hamming distance of a code means its symbol distance; when we mean the column distance, we say so explicitly.

Let us start with a lower bound.

Lemma 30

*Let $D$ be the Hamming distance of an $EBR(p,r,q,g(x))$ or an $EIP(p,r,q,g(x))$ MDS code. Then, $D\geqslant d(r+1)$ . *

**Proof: **Take a non-zero array in $EBR(p,r,q,g(x))$ or in $EIP(p,r,q,g(x))$ . Then, since the code is MDS, at least $r+1$ columns in the array are non-zero. Since each non-zero column has weight at least $d$ , the result follows. ∎

Finding $D$ for an $EIP(p,r,q,g(x))$ MDS code is easy, as shown in the next corollary.

Corollary 31

*Consider an $EIP(p,r,q,g(x))$ MDS code with minimum Hamming distance $D$ . Then, $D=d(r+1)$ . *

**Proof: **Take a $p\times p$ array consisting of a column of weight $d$ in $\mbox{$ {\cal C} $}(p,g(x)(1\oplus x),q,d)$ , while the remaining $p-1$ columns are zero. Encoding this array in $EIP(p,r,q,g(x))$ , since the parity columns are rotations of the non-zero data column, they also have weight $d$ , so we obtain an array with $r+1$ columns of weight $d$ , hence $D\leqslant d(r+1)$ . The result then follows from Lemma 30. ∎

From now on, we consider $EBR(p,r,q,1)$ codes only.

Lemma 32

*Consider the code $EBR(p,r,q,1)$ , where either $1\leqslant r\leqslant 3$ or $p-2\leqslant r\leqslant p-1$ . Then the minimum Hamming distance of $EBR(p,r,q,1)$ is $D=2(r+1)$ . *

**Proof:** Since $d=2$ , by Lemma 30, $D\geqslant 2(r+1)$ . So, it is enough to exhibit an array of weight $2(r+1)$ in $EBR(p,r,q,1)$ when $1\leqslant r\leqslant 3$ or $p-2\leqslant r\leqslant p-1$ . The case $r=1$ is trivial.

Denoting the entries of an array by $(i,j)$ , where $0\leqslant i,j\leqslant p-1$ , consider the array that is 1 in entries $(p-3,p-2)$ , $(p-3,p-1)$ , $(p-2,p-3)$ , $(p-2,p-1)$ , $(p-1,p-3)$ and $(p-1,p-2)$ and 0 elsewhere. This array is in $EBR(p,2,q,1)$ and it has Hamming weight 6.

Similarly, consider the array that is 1 in entries $(p-5,p-2)$ , $(p-5,p-1)$ , $(p-4,p-3)$ , $(p-4,p-1)$ , $(p-2,p-4)$ , $(p-2,p-2)$ , $(p-1,p-4)$ and $(p-1,p-3)$ and 0 elsewhere. This is an array in $EBR(p,3,q,1)$ and it has Hamming weight 8.

Next take $r=p-2$ . Consider the array $(c_{i,j})$ , $0\leqslant i,j\leqslant p-1$ , of Hamming weight $2(p-1)$ , such that:

[TABLE]

By construction, since $i+1\neq\mbox{$ \langle $}(i+1)/2\mbox{$ \rangle $}$ for $0\leqslant i\leqslant p-2$ , each of the first $p-1$ rows has exactly two 1s, while the last row is zero. Similarly, the first column is zero while the last $p-1$ columns have two 1s each, hence, the array has even parity on rows and columns and weight $2(p-1)$ . It remains to be proven that each line of slope $j$ , $1\leqslant j\leqslant p-3$ , has even parity.

In effect, for each $w$ such that $0\leqslant w\leqslant p-1$ , take the line of slope $j$ , $0\leqslant j\leqslant p-3$ , through entry $c_{w,0}$ . By Definition 1, the entries in this line are $c_{\mbox{$ \langle $}w-jv\mbox{$ \rangle $},v}$ for $0\leqslant v\leqslant p-1$ . Take first $w<p-1$ , then there is a unique $i$ such that $w-jv=i$ and $v=i+1$ , namely, $i=\mbox{$ \langle $}(w-j)/(j+1)\mbox{$ \rangle $}$ (notice that this is possible since $j\neq p-1$ ). Similarly, there is a unique $i$ such that $w-jv=i$ and $v=\mbox{$ \langle $}(i+1)/2\mbox{$ \rangle $}$ , namely, $i=\mbox{$ \langle $}(2w-j)/(j+2)\mbox{$ \rangle $}$ (notice that this is possible since $j\neq p-2$ ). Hence, any line of slope $j$ , $1\leqslant j\leqslant p-3$ , starting at $c_{w,0}$ for $0\leqslant w\leqslant p-2$ , has exactly two 1s and thus even parity. If $w=p-1$ , the lines of slope $j$ starting at entry $c_{p-1,0}$ can be shown to contain no 1s, hence they also have even parity, so the array is in $EBR(p,p-2,q,1)$ and has Hamming weight $2(p-1)$ .

If $r=p-1$ , consider the following array $(c_{i,j})$ , $0\leqslant i,j\leqslant p-1$ , of Hamming weight $2p$ :

[TABLE]

Using the same methods as above, we can see that each line of slope $j$ , $0\leqslant j\leqslant p-2$ , contains exactly two 1s, and hence it has even parity, so the array is in $EBR(p,p-1,q,1)$ and has Hamming weight $2p$ . ∎

Example 33

Consider $p=7$ . The following are arrays of weight 4, 6, 8, 10 and 12 in $EBR(7,1,2,1)$ , $EBR(7,2,2,1)$ , $EBR(7,3,2,1)$ , $EBR(7,5,2,1)$ and $EBR(7,6,2,1)$ respectively according to the proof of Lemma 32:

[TABLE]

$\Box$
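The explicit arrays of weight 6 and 8 described in the proof of Lemma 32 can be verified mechanically. The sketch below is an illustration that assumes $q=2$ and characterises $EBR(p,r,2,1)$ as the set of $p\times p$ binary arrays in which every column and every line of slope $s$, $0\leqslant s\leqslant r-1$, has even parity, with the line of slope $s$ through $c_{w,0}$ consisting of entries $c_{\langle w-sv\rangle,v}$. The helper names are hypothetical.

```python
def in_ebr(a, r):
    """Check even parity on columns and on all lines of slopes 0..r-1."""
    p = len(a)
    for v in range(p):
        if sum(a[i][v] for i in range(p)) % 2:
            return False
    for s in range(r):
        for w in range(p):
            if sum(a[(w - s * v) % p][v] for v in range(p)) % 2:
                return False
    return True

def from_ones(p, ones):
    """Build a p x p array that is 1 exactly at the listed (row, column) entries."""
    a = [[0] * p for _ in range(p)]
    for i, j in ones:
        a[i][j] = 1
    return a

p = 7
weight6 = from_ones(p, [(p - 3, p - 2), (p - 3, p - 1), (p - 2, p - 3),
                        (p - 2, p - 1), (p - 1, p - 3), (p - 1, p - 2)])
weight8 = from_ones(p, [(p - 5, p - 2), (p - 5, p - 1), (p - 4, p - 3),
                        (p - 4, p - 1), (p - 2, p - 4), (p - 2, p - 2),
                        (p - 1, p - 4), (p - 1, p - 3)])

print(in_ebr(weight6, 2), in_ebr(weight6, 3))   # True False
print(in_ebr(weight8, 3))                        # True
```

The weight-6 array satisfies the constraints for $r=2$ but violates the slope-2 parities, so a different array (the weight-8 one) is needed for $r=3$.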

Lemma 32 shows that the bound of Lemma 30 is tight for $EBR(p,r,q,1)$ when $1\leqslant r\leqslant 3$ or $p-2\leqslant r\leqslant p-1$ . What happens for $4\leqslant r\leqslant p-3$ ? Going back to Example 33, consider $EBR(7,4,2,1)$ . An exhaustive search shows that the minimum Hamming distance of $EBR(7,4,2,1)$ is $D=12$ , so in this case the bound is not tight. The following is an array in $EBR(7,4,2,1)$ of weight 12:

[TABLE]

Let us point out that a product code of an MDS horizontal code (like RS) with $r$ parities with a vertical parity code has minimum Hamming distance $2(r+1)$ .

A possible competitor for an $EBR(p,r,q,1)$ code is a code consisting of $p\times p$ arrays such that their first $p-1$ rows are in $BR(p,r,q)$ and their last row is the XOR of such first $p-1$ rows. Let us call $BRVP(p,r,q)$ such a code ($BRVP(5,2,2)$ was illustrated in Example 20). For example, if $p=7$, the following is an array in $BRVP(7,3,2)$:

[TABLE]

We can easily see that both Lemmas 30 and 32 apply to $BRVP(p,r,q)$ codes. However, the minimum Hamming distance of $BRVP(7,4,2)$ is 10, which is less than the minimum Hamming distance 12 of $EBR(7,4,2,1)$. In effect, notice that the following is an array of weight 10 in $BRVP(7,4,2)$:

[TABLE]

Finding the minimum Hamming distance of an $EBR(p,r,q,1)$ code for $4\leqslant r\leqslant p-3$ is an open problem.

IV Recovery of Erased Lines of slope $i$ in an $EBR(p,r,q,1)$ Code

In Section II, we have seen how to encode and decode an $EBR(p,r,q,g(x))$ code. In particular, we have shown how to recover up to $r$ erased columns. Interestingly, an $EBR(p,r,q,1)$ code can also recover a number of erased lines of slope $i$ , where $0\leqslant i\leqslant r-1$ . We say that an $EBR(p,r,q,1)$ code is MDS on lines of slope $i$ , if the code can recover up to $r$ erased lines of such slope. We have shown in Theorem 24 that an $EBR(p,r,q,1)$ code is MDS on lines of slope $\infty$ . What happens with the other slopes $i$ , $0\leqslant i\leqslant r-1$ ?

A trivial case corresponds to $r=1$: in this case, an $EBR(p,1,q,1)$ code is a product code with parity on rows and columns. Any erased row or column can be recovered, hence, an $EBR(p,1,q,1)$ code is MDS on lines of slope $\infty$ and on lines of slope 0.
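The $r=1$ case can be sketched in a few lines. The sketch below assumes binary symbols ($q=2$) and $p=5$; the helper names (`xor_rows`, `make_product_codeword`, `recover_row`, `recover_column`) and the data bits are illustrative and do not come from the paper. It builds a $5\times 5$ array with even parity on every row and every column, then rebuilds an erased row or column by XORing the surviving ones.

```python
from functools import reduce
from operator import xor

def xor_rows(rows):
    """Bitwise XOR of a list of equal-length 0/1 rows."""
    return [reduce(xor, column) for column in zip(*rows)]

def make_product_codeword(data):
    """Extend a (p-1) x (p-1) bit array to p x p with even parity on rows and columns."""
    rows = [row + [reduce(xor, row)] for row in data]  # horizontal parity bit
    rows.append(xor_rows(rows))                        # vertical parity row
    return rows

def recover_row(array, erased):
    """Rebuild row `erased` from the remaining rows (uses column parity only)."""
    return xor_rows([row for i, row in enumerate(array) if i != erased])

def recover_column(array, erased):
    """Rebuild column `erased` from the remaining columns (uses row parity only)."""
    return [reduce(xor, [bit for j, bit in enumerate(row) if j != erased])
            for row in array]

# Hypothetical data bits for p = 5 (assumption: any 4 x 4 binary array works).
data = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
codeword = make_product_codeword(data)
```

Calling `recover_row(codeword, i)` or `recover_column(codeword, i)` for any single index `i` returns the erased line, which is the sense in which the code is MDS on lines of slope 0 and $\infty$ when $r=1$.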

The next case corresponds to $r=2$. We saw in Example 20 that the code $EBR(5,2,2,1)$ can recover an erased pair of lines of slope 1. The following lemma proves that this is true for any $p$.

Lemma 34.

*The code $EBR(p,2,q,1)$ can recover any erased pair of lines of slope $i$ for $0\leqslant i\leqslant 1$.*

**Proof:** Assume that $(c_{u,v})$, $0\leqslant u,v\leqslant p-1$, is an array in $EBR(p,2,q,1)$, and assume that two rows have been erased. Consider the transposed array $(b_{u,v})=(c_{u,v})^{\rm T}$, $0\leqslant u,v\leqslant p-1$, i.e., $b_{u,v}=c_{v,u}$. Then, the array $(b_{u,v})$ has two erased columns. It is enough to show that $(b_{u,v})\in EBR(p,2,q,1)$. Certainly, each line of slope 0 has even parity. Take a line of slope 1, i.e., according to Definition 1, $p$ entries $b_{u,v}$ such that $\langle u+v\rangle=j$ for some $j$, $0\leqslant j\leqslant p-1$. Since $b_{u,v}=c_{v,u}$, the lines of slope 1 coincide for the array $(c_{u,v})$ and for the transposed array $(b_{u,v})$, so $(b_{u,v})\in EBR(p,2,q,1)$ and the two erased columns can be recovered.

Assume next that two lines of slope 1 are erased in the array $(c_{u,v})$. Take an array $(b_{u,v})$ defined from the array $(c_{u,v})$ as $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for $0\leqslant u,v\leqslant p-1$. We claim that $(b_{u,v})\in EBR(p,2,q,1)$. Notice first that every line of slope $\infty$ in $(b_{u,v})$ corresponds to a line of slope 1 in $(c_{u,v})$. In effect, take $v_{0}$ such that $0\leqslant v_{0}\leqslant p-1$. According to Definition 1, a line of slope $\infty$ (vertical) in $(b_{u,v})$ through $b_{0,v_{0}}$ is given by the set $\{b_{u,v_{0}}\,:\,0\leqslant u\leqslant p-1\}$, which is equal to the set $\{c_{\langle -u\rangle,\langle u+v_{0}\rangle}\,:\,0\leqslant u\leqslant p-1\}$. This last set corresponds to the line of slope 1 in $(c_{u,v})$ through $c_{v_{0},0}$, also according to Definition 1.

Similarly, it can be shown that each line of slope 0 in $(b_{u,v})$ corresponds to a line of slope 0 in $(c_{u,v})$ and that each line of slope 1 in $(b_{u,v})$ corresponds to a line of slope $\infty$ in $(c_{u,v})$ . Hence, $(b_{u,v})$ is in $EBR(p,2,q,1)$ since it has even parity on lines of slope 0, 1 and $\infty$ , so, by Theorem 24, it can recover the two columns corresponding to the two erased lines of slope 1 in $(c_{u,v})$ . ∎
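The slope correspondences used in this proof can be checked by brute force. The following is a minimal sketch for $p=7$, assuming the line convention quoted from Definition 1 (the line of slope $j$ through $c_{w,0}$ consists of the entries $c_{\langle w-jv\rangle,v}$); the names `INF`, `line`, `lines`, `slope_in_c`, `transpose` and `shear` are illustrative and not from the paper. For each slope of $(b_{u,v})$, it finds the slope of $(c_{u,v})$ whose family of lines is hit by the index map.

```python
INF = "inf"  # label for vertical lines (slope infinity)

def line(p, slope, w):
    """Cells (row, col) of the slope-`slope` line through (w, 0); for INF, column w."""
    if slope == INF:
        return frozenset((u, w) for u in range(p))
    return frozenset(((w - slope * v) % p, v) for v in range(p))

def lines(p, slope):
    """All p lines of a given slope in a p x p toroidal array."""
    return {line(p, slope, w) for w in range(p)}

def slope_in_c(p, phi, slope_b):
    """Slope t such that phi sends the slope_b lines of (b) onto the slope-t lines of (c)."""
    images = {frozenset(phi(u, v) for u, v in cells) for cells in lines(p, slope_b)}
    for t in [INF] + list(range(p)):
        if lines(p, t) == images:
            return t
    return None  # the images are not a family of parallel lines

p = 7

def transpose(u, v):
    """First map of Lemma 34: b[u][v] = c[v][u]."""
    return (v, u)

def shear(u, v):
    """Second map of Lemma 34: b[u][v] = c[<-u>][<u+v>]."""
    return ((-u) % p, (u + v) % p)

transpose_map = {s: slope_in_c(p, transpose, s) for s in (INF, 0, 1)}
shear_map = {s: slope_in_c(p, shear, s) for s in (INF, 0, 1)}
```

Here `transpose_map` comes out as `{INF: 0, 0: INF, 1: 1}` and `shear_map` as `{INF: 1, 0: 0, 1: INF}`, so in both cases the slope set $\{\infty,0,1\}$ is permuted and the erased lines land on columns.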

Corollary 35.

*The code $EBR(p,2,q,1)$ is MDS on lines of slope 0, 1 and $\infty$.*

Example 36.

Consider the code $EBR(7,2,q,1)$ . The transpose transformation as described in Lemma 34 gives

[TABLE]

We can see that columns become rows, rows become columns, and the lines of slope 1 are preserved by this transformation, so the transposed arrays are also in $EBR(7,2,q,1)$ and any pair of erased horizontal lines can be recovered.

The second transformation in Lemma 34, i.e., $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$, gives the following correspondence:

[TABLE]

We can see that this transformation maps lines of slope 1 in $(c_{u,v})$ into lines of slope $\infty$ in $(b_{u,v})$, lines of slope 0 in $(c_{u,v})$ into lines of slope 0 in $(b_{u,v})$ and lines of slope $\infty$ in $(c_{u,v})$ into lines of slope 1 in $(b_{u,v})$. For example, the line of slope 1 through $c_{2,0}$ (in bold) is mapped into the third vertical line. So, $(b_{u,v})\in EBR(7,2,q,1)$ and any pair of erased lines of slope 1 can be recovered.

$\Box$

Consider next the case $r=3$.

Lemma 37.

*The code $EBR(p,3,q,1)$ can recover any three erased lines of slope $j$, where $0\leqslant j\leqslant 2$.*

**Proof:** Assume that $(c_{u,v})$, $0\leqslant u,v\leqslant p-1$, is an array in $EBR(p,3,q,1)$, and assume that three rows (lines of slope 0) have been erased. Consider the array $(b_{u,v})$, $0\leqslant u,v\leqslant p-1$, such that $b_{u,v}=c_{\langle 2v\rangle,u}$. Then lines of slope $\infty$ in $(c_{u,v})$ are mapped into lines of slope 0 in $(b_{u,v})$ and lines of slope 0 in $(c_{u,v})$ are mapped into lines of slope $\infty$ in $(b_{u,v})$, as in the case of the transpose transformation. Lines of slope 2 in $(c_{u,v})$ are mapped into lines of slope 1 in $(b_{u,v})$. In effect, consider the line of slope 1 in $(b_{u,v})$ through entry $b_{u_{0},0}$, where $0\leqslant u_{0}\leqslant p-1$. According to Definition 1, this line corresponds to the set $\{b_{\langle u_{0}-v\rangle,v}\,:\,0\leqslant v\leqslant p-1\}$, which is equal to the set $\{c_{\langle 2v\rangle,\langle u_{0}-v\rangle}\,:\,0\leqslant v\leqslant p-1\}$. Since $\langle 2v\rangle+\langle 2(u_{0}-v)\rangle=\langle 2u_{0}\rangle$, this last set corresponds to the line of slope 2 in $(c_{u,v})$ through entry $(\langle 2u_{0}\rangle,0)$.

Similarly, proceeding as in the previous cases, it can be shown that lines of slope 1 in $(c_{u,v})$ are mapped into lines of slope 2 in $(b_{u,v})$ . Thus, $(b_{u,v})$ is in $EBR(p,3,q,1)$ , hence, by Theorem 24, it can correct any three erased columns (which correspond to three erased rows in $(c_{u,v})$ ).

Next assume that three lines of slope 1 have been erased in $(c_{u,v})$. We can consider the same transformation as in Lemma 34, i.e., consider a $p\times p$ array $(b_{u,v})$ such that $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for $0\leqslant u,v\leqslant p-1$. As in Lemma 34, every line of slope $\infty$ in $(b_{u,v})$ corresponds to a line of slope 1 in $(c_{u,v})$, each line of slope 0 in $(b_{u,v})$ corresponds to a line of slope 0 in $(c_{u,v})$ and each line of slope 1 in $(b_{u,v})$ corresponds to a line of slope $\infty$ in $(c_{u,v})$. In addition, proceeding as in the previous cases, it can be shown that each line of slope 2 in $(b_{u,v})$ corresponds to a line of slope 2 in $(c_{u,v})$. Thus, $(b_{u,v})\in EBR(p,3,q,1)$ and any 3 columns in $(b_{u,v})$, which correspond to 3 lines of slope 1 in $(c_{u,v})$, can be corrected.

Finally, assume that three lines of slope 2 have been erased in $(c_{u,v})$. Consider a $p\times p$ array $(b_{u,v})$ such that $b_{u,v}=c_{\langle -2(u+v)\rangle,\langle u+2v\rangle}$ for $0\leqslant u,v\leqslant p-1$. Now, it can be shown that every line of slope $\infty$ in $(b_{u,v})$ corresponds to a line of slope 2 in $(c_{u,v})$, every line of slope 0 in $(b_{u,v})$ corresponds to a line of slope 1 in $(c_{u,v})$, every line of slope 1 in $(b_{u,v})$ corresponds to a line of slope 0 in $(c_{u,v})$ and every line of slope 2 in $(b_{u,v})$ corresponds to a line of slope $\infty$ in $(c_{u,v})$. Thus, $(b_{u,v})\in EBR(p,3,q,1)$ and any 3 columns in $(b_{u,v})$, which correspond to 3 lines of slope 2 in $(c_{u,v})$, can be corrected. ∎
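Since each of the three index maps in this proof is linear modulo $p$, the slope correspondences can also be read off from direction vectors instead of enumerating cells. The sketch below assumes the line convention of Definition 1 (a step along a line of slope $s$ changes the indices by $(-s,1)$, and a step along a column by $(1,0)$); the names `direction`, `slope_of`, `image_slope`, `LEMMA_37_MAPS` and `slope_table` are illustrative and not from the paper. Each map is written as a $2\times 2$ integer matrix acting on $(u,v)$, and its determinant is nonzero modulo any odd prime, so it permutes the cells.

```python
INF = "inf"  # label for vertical lines (slope infinity)

def direction(slope):
    """Step (d_row, d_col) along a line c_{<w - slope*v>, v}; columns step by (1, 0)."""
    return (1, 0) if slope == INF else (-slope, 1)

def slope_of(d_row, d_col, p):
    """Slope of a line whose step is (d_row, d_col) modulo the prime p."""
    if d_col % p == 0:
        return INF
    # pow(x, -1, p) is the modular inverse (Python 3.8+).
    return (-d_row * pow(d_col % p, -1, p)) % p

def image_slope(matrix, slope, p):
    """Slope in (c) of a slope-`slope` line of (b), for b[u][v] = c[a*u+b*v][c*u+d*v]."""
    (a, b), (c, d) = matrix
    d_row, d_col = direction(slope)
    return slope_of(a * d_row + b * d_col, c * d_row + d * d_col, p)

# Index maps of Lemma 37, keyed by the slope of the erased lines in (c).
LEMMA_37_MAPS = {
    0: ((0, 2), (1, 0)),     # b[u][v] = c[<2v>][u]
    1: ((-1, 0), (1, 1)),    # b[u][v] = c[<-u>][<u+v>]
    2: ((-2, -2), (1, 2)),   # b[u][v] = c[<-2(u+v)>][<u+2v>]
}

def slope_table(erased_slope, p):
    """Map each slope of (b) in {INF, 0, 1, 2} to the corresponding slope of (c)."""
    matrix = LEMMA_37_MAPS[erased_slope]
    return {s: image_slope(matrix, s, p) for s in (INF, 0, 1, 2)}
```

For $p=7$, `slope_table(0, 7)` gives `{INF: 0, 0: INF, 1: 2, 2: 1}` and `slope_table(2, 7)` gives `{INF: 2, 0: 1, 1: 0, 2: INF}`, matching the correspondences stated in the proof.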

Corollary 38.

*The code $EBR(p,3,q,1)$ is MDS on lines of slope 0, 1, 2 and $\infty$.*

Example 39.

Consider the code $EBR(7,3,q,1)$. The first transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle 2v\rangle,u}$, gives

[TABLE]

We can see that lines of slope 0 in $(c_{u,v})$ become lines of slope $\infty$ in $(b_{u,v})$, lines of slope $\infty$ in $(c_{u,v})$ become lines of slope 0 in $(b_{u,v})$, lines of slope 1 in $(c_{u,v})$ become lines of slope 2 in $(b_{u,v})$ and lines of slope 2 in $(c_{u,v})$ become lines of slope 1 in $(b_{u,v})$, so $(b_{u,v})\in EBR(7,3,q,1)$ and any three erased horizontal lines can be recovered. For example, the line of slope 1 starting in $c_{2,0}$ (in bold) is mapped into the line of slope 2 in the right array, also in bold.

The second transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$, is the same as the second transformation in Lemma 34 and has been illustrated in Example 36. As in Example 36, this transformation maps lines of slope 1 in $(c_{u,v})$ into lines of slope $\infty$ in $(b_{u,v})$, lines of slope 0 in $(c_{u,v})$ into lines of slope 0 in $(b_{u,v})$ and lines of slope $\infty$ in $(c_{u,v})$ into lines of slope 1 in $(b_{u,v})$. In addition, lines of slope 2 in $(c_{u,v})$ are mapped into lines of slope 2 in $(b_{u,v})$, hence, $(b_{u,v})\in EBR(7,3,q,1)$ and any three erased lines of slope 1 can be recovered.

Finally, the last transformation in Lemma 37, i.e., $b_{u,v}=c_{\langle -2(u+v)\rangle,\langle u+2v\rangle}$, gives

[TABLE]

We can see that this transformation maps lines of slope 2 in $(c_{u,v})$ into lines of slope $\infty$ in $(b_{u,v})$ , lines of slope 0 in $(c_{u,v})$ into lines of slope 1 in $(b_{u,v})$ , lines of slope $\infty$ in $(c_{u,v})$ into lines of slope 2 in $(b_{u,v})$ and lines of slope 1 in $(c_{u,v})$ into lines of slope 0 in $(b_{u,v})$ . For example, the line of slope 2 starting in $c_{2,0}$ (in bold) is mapped into the second vertical line.

In particular, $(b_{u,v})\in EBR(7,3,q,1)$ and any three erased lines of slope 2 can be recovered.

$\Box$

We have seen that an $EBR(p,r,q,1)$ code is MDS on lines of slope $j$ , where $0\leqslant j\leqslant r-1$ and $1\leqslant r\leqslant 3$ . The next lemma shows that this is also the case for $p-2\leqslant r\leqslant p-1$ .

Lemma 40.

*The code $EBR(p,r,q,1)$ with $p-2\leqslant r\leqslant p-1$ is MDS on lines of slope $j$, where $0\leqslant j\leqslant r-1$.*

**Proof:** Consider an array $(c_{u,v})\in EBR(p,p-2,q,1)$. Take $j$ such that $0\leqslant j\leqslant p-3$ and assume that $r$ lines of slope $j$ have been erased. Consider the array $(b_{u,v})$, $0\leqslant u,v\leqslant p-1$, such that $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$. Lines of slope $j$ in $(c_{u,v})$ are mapped into lines of slope $\infty$ in $(b_{u,v})$. In effect, consider the line of slope $\infty$ in $(b_{u,v})$ through entry $b_{0,v_{0}}$, where $0\leqslant v_{0}\leqslant p-1$. This line corresponds to the set $\{b_{u,v_{0}}\,:\,0\leqslant u\leqslant p-1\}$, which is equal to the set $\{c_{\langle -ju+(3j+2)v_{0}\rangle,\langle u+jv_{0}\rangle}\,:\,0\leqslant u\leqslant p-1\}$. Since $\langle -ju+(3j+2)v_{0}\rangle+j\langle u+jv_{0}\rangle=\langle (j^{2}+3j+2)v_{0}\rangle$, then, according to Definition 1, this last set corresponds to the line of slope $j$ in $(c_{u,v})$ through entry $(\langle (j^{2}+3j+2)v_{0}\rangle,0)$. Since $\langle j^{2}+3j+2\rangle=\langle (j+1)(j+2)\rangle$ and $0\leqslant j\leqslant p-3$, $\langle j^{2}+3j+2\rangle\neq 0$, so to each choice of $v_{0}$ corresponds a unique line of slope $j$.

Proceeding similarly, we can show that lines of slope $p-1$ in $(c_{u,v})$ are mapped into lines of slope $p-2$ in $(b_{u,v})$ and that lines of slope $p-2$ in $(c_{u,v})$ are mapped into lines of slope $p-1$ in $(b_{u,v})$. Thus, a line of slope $i_{0}$ in $(c_{u,v})$, $i_{0}\neq j$, $i_{0}=\infty$ or $0\leqslant i_{0}\leqslant r-1$, is mapped into a line of slope $i_{1}$ in $(b_{u,v})$, $i_{1}\not\in\{\infty,p-2,p-1\}$. Since these lines have even parity, $(b_{u,v})$ is in $EBR(p,p-2,q,1)$ and it can correct any $p-2$ erased columns, which correspond to $p-2$ erased lines of slope $j$ in $(c_{u,v})$.

Next let $r=p-1$. Consider the array $(b_{u,v})$, $0\leqslant u,v\leqslant p-1$, such that $b_{u,v}=c_{\langle -ju+(j+1)v\rangle,\langle u\rangle}$. Proceeding as above, we can verify that lines of slope $j$ in $(c_{u,v})$ are mapped into lines of slope $\infty$ in $(b_{u,v})$ and that lines of slope $p-1$ in $(c_{u,v})$ are mapped into lines of slope $p-1$ in $(b_{u,v})$. Thus, a line of slope $i_{0}$, $i_{0}\neq p-1$, will be mapped to a line of slope $i_{1}$, $i_{1}\neq p-1$. All these lines have even parity, so $(b_{u,v})$ is in $EBR(p,p-1,q,1)$ and it can correct any $p-1$ erased columns, which correspond to $p-1$ erased lines of slope $j$ in $(c_{u,v})$. ∎
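The column-to-line step of this proof can be checked numerically. The sketch below assumes the line convention of Definition 1, under which a cell $(\mbox{row},\mbox{col})$ lies on the line of slope $j$ through $c_{w,0}$ exactly when $\langle \mbox{row}+j\cdot\mbox{col}\rangle=w$; the function names are illustrative and not from the paper. For each admissible $j$ it confirms that column $v_{0}$ of $(b_{u,v})$ is the line of slope $j$ with intercept $\langle (j+1)(j+2)v_{0}\rangle$ when $r=p-2$, and $\langle (j+1)v_{0}\rangle$ when $r=p-1$, and that distinct columns give distinct lines.

```python
def column_intercept(p, index_map, j, v0):
    """Return w if column v0 of (b) is the slope-j line of (c) through c_{w,0}, else None."""
    cells = {index_map(p, j, u, v0) for u in range(p)}
    intercepts = {(row + j * col) % p for row, col in cells}
    # p distinct cells sharing one intercept make up the whole line.
    if len(cells) == p and len(intercepts) == 1:
        return intercepts.pop()
    return None

def map_r_p_minus_2(p, j, u, v):
    """Index map for r = p - 2: b[u][v] = c[<-ju + (3j+2)v>][<u + jv>]."""
    return ((-j * u + (3 * j + 2) * v) % p, (u + j * v) % p)

def map_r_p_minus_1(p, j, u, v):
    """Index map for r = p - 1: b[u][v] = c[<-ju + (j+1)v>][u]."""
    return ((-j * u + (j + 1) * v) % p, u % p)

def check(p):
    """True if, for every admissible j, the columns of (b) are p distinct slope-j lines of (c)."""
    for j in range(p - 2):   # r = p - 2: slopes 0..p-3
        ws = [column_intercept(p, map_r_p_minus_2, j, v0) for v0 in range(p)]
        expected = [((j + 1) * (j + 2) * v0) % p for v0 in range(p)]
        if ws != expected or len(set(ws)) != p:
            return False
    for j in range(p - 1):   # r = p - 1: slopes 0..p-2
        ws = [column_intercept(p, map_r_p_minus_1, j, v0) for v0 in range(p)]
        expected = [((j + 1) * v0) % p for v0 in range(p)]
        if ws != expected or len(set(ws)) != p:
            return False
    return True
```

Under these assumptions `check(p)` should hold for any odd prime `p`, since $(j+1)(j+2)$ and $j+1$ are nonzero modulo $p$ in the respective ranges of $j$.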

Example 41.

As in Example 39, take $p=7$ and consider the code $EBR(7,5,q,1)$. If $j=0$, the transformation in Lemma 40, i.e., $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$, becomes $b_{u,v}=c_{\langle 2v\rangle,u}$, giving the first transformation illustrated in Example 39. We have seen in this example that the resulting array $(b_{u,v})$ is in $EBR(7,3,q,1)$ and hence any three rows can be corrected. In addition, we observe that lines of slope 3 in $(c_{u,v})$ are transformed into lines of slope 3 in $(b_{u,v})$, and that lines of slope 4 in $(c_{u,v})$ are transformed into lines of slope 4 in $(b_{u,v})$. Hence, if $(c_{u,v})$ is in $EBR(7,5,q,1)$, also $(b_{u,v})$ is in $EBR(7,5,q,1)$ and any 5 erased rows can be recovered.

Let us take next $j=3$; then the transformation $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$ becomes $b_{u,v}=c_{\langle 4u+4v\rangle,\langle u+3v\rangle}$, giving

[TABLE]

We can see that this transformation maps lines of slope 3 in $(c_{u,v})$ into lines of slope $\infty$ in $(b_{u,v})$ , lines of slope $\infty$ in $(c_{u,v})$ into lines of slope 3 in $(b_{u,v})$ , lines of slope 0 in $(c_{u,v})$ into lines of slope 1 in $(b_{u,v})$ , lines of slope 1 in $(c_{u,v})$ into lines of slope 0 in $(b_{u,v})$ , lines of slope 2 in $(c_{u,v})$ into lines of slope 4 in $(b_{u,v})$ and lines of slope 4 in $(c_{u,v})$ into lines of slope 2 in $(b_{u,v})$ . In particular, $(b_{u,v})\in EBR(7,5,q,1)$ and any 5 erased columns, which correspond to 5 lines of slope 3 in $(c_{u,v})$ , can be recovered.

Similar transformations can be obtained for $j=1$, 2 and 4.

$\Box$
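The slope correspondences listed in this example for $j=3$ can be reproduced from a closed form. Along a line of slope $s$ in $(b_{u,v})$, the indices of $c$ under $b_{u,v}=c_{\langle -ju+(3j+2)v\rangle,\langle u+jv\rangle}$ are $\mbox{row}=-jw+(js+3j+2)v$ and $\mbox{col}=w+(j-s)v$, so requiring $\mbox{row}+t\cdot\mbox{col}$ to be constant gives $t=\langle (js+3j+2)/(s-j)\rangle$. This formula is derived here from the index map rather than quoted from the paper, so it should be read as an assumption of the sketch; the name `c_slope` is likewise illustrative.

```python
INF = "inf"  # label for vertical lines (slope infinity)

def c_slope(p, j, s):
    """Slope in (c) of a slope-s line of (b) under b[u][v] = c[<-ju+(3j+2)v>][<u+jv>]."""
    if s == INF:
        return j % p          # columns of (b) are lines of slope j in (c)
    if (s - j) % p == 0:
        return INF            # lines of slope j in (b) are columns of (c)
    # pow(x, -1, p) is the modular inverse (Python 3.8+).
    return ((j * s + 3 * j + 2) * pow((s - j) % p, -1, p)) % p

p, j = 7, 3
table = {s: c_slope(p, j, s) for s in [INF] + list(range(p))}
```

For $p=7$ and $j=3$ the table pairs $\infty$ with 3, 0 with 1 and 2 with 4, as listed in the example, and in addition pairs slope 5 with slope 6, in line with the exchange of slopes $p-2$ and $p-1$ stated in the proof of Lemma 40.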

Let us unify Theorem 24, Corollaries 35 and 38 and Lemma 40 in the following theorem:

Theorem 42.

The code $EBR(p,r,q,1)$ with $1\leqslant r\leqslant 3$ or $p-2\leqslant r\leqslant p-1$ is MDS on lines of slope $j$, where $j=\infty$ or $0\leqslant j\leqslant r-1$. ∎

Theorem 42 gives five values of $r$ for which the code $EBR(p,r,q,1)$ is MDS on lines of slope $j$, where $j=\infty$ or $0\leqslant j\leqslant r-1$. For $4\leqslant r\leqslant p-3$ we do not think that this is the case, but the problem is open.

V Puncturing EBR and EIP Codes

A puncturing of a code of length $n$ in $s$ specified locations is a code of length $n-s$ such that the $s$ specified entries in each codeword of the original code are deleted [18]. For $EBR(p,r,q,g(x))$ codes we will puncture some rows as follows:

Definition 43.

*Consider an $EBR(p,r,q,g(x))$ code (resp., an $EIP(p,r,q,g(x))$ code) and assume that $g(x)$ has degree $t$. A punctured EBR (resp., EIP) code $PEBR(p,r,q,g(x))$ (resp., $PEIP(p,r,q,g(x))$) consists of all $(p-t-1)\times p$ (resp., $(p-t-1)\times(p+r)$) arrays obtained by deleting the last $t+1$ rows of each array in $EBR(p,r,q,g(x))$ (resp., in $EIP(p,r,q,g(x))$).* ∎
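As a minimal sketch of the row deletion in Definition 43, the helper below (the name `puncture` and the placeholder array are illustrative, not from the paper) drops the last $t+1$ rows of an array stored as a list of rows. With $p=7$ and $\deg g=3$ it leaves a $3\times 7$ array, and with $g(x)=1$ (so $t=0$) it leaves a $6\times 7$ array.

```python
def puncture(array, t):
    """Delete the last t + 1 rows of an array, where t is the degree of g(x)."""
    return [row[:] for row in array[:len(array) - (t + 1)]]

# Placeholder 7 x 7 array of (row, col) labels; a real codeword would hold symbols.
p = 7
array = [[(u, v) for v in range(p)] for u in range(p)]

punctured = puncture(array, t=3)        # e.g. deg g = 3 leaves p - t - 1 = 3 rows
punctured_g1 = puncture(array, t=0)     # g(x) = 1 leaves p - 1 = 6 rows
```

The sketch only performs the deletion; it does not encode or decode, which would require the algorithms of Section II.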

The next lemma is immediate.

Lemma 44.

*The code $PEBR(p,r,q,g(x))$ as given by Definition 43 is MDS (on columns), while the code $PEIP(p,r,q,g(x))$ is MDS if and only if the code $EIP(p,r,q,g(x))$ is MDS.*

**Proof:** Simply observe that a column in $EBR(p,r,q,g(x))$ (resp., in $EIP(p,r,q,g(x))$) is a zero column if and only if the corresponding column in $PEBR(p,r,q,g(x))$ (resp., in $PEIP(p,r,q,g(x))$) is a zero column. In effect, using the notation of Definition 43, a vector in ${\cal C}(p,g(x)(1\oplus x),q,d)$ is the zero vector if and only if the first $p-t-1$ entries of the vector are zero, since encoding systematically such $p-t-1$ zero entries in ${\cal C}(p,g(x)(1\oplus x),q,d)$ gives the zero vector. ∎

Notice that in a $PEBR(p,r,q,g(x))$ or in a $PEIP(p,r,q,g(x))$ code the vertical erasure-correction capability of the code is lost. Let us examine next the simplest case of puncturing, that is, when $g(x)=1$.

Example 45.

Consider a $PEBR(p,r,q,1)$ code, which, according to Definition 43, consists of all the $(p-1)\times p$ arrays obtained by taking the first $p-1$ rows of each $p\times p$ array in $EBR(p,r,q,1)$.

A $PEBR(p,r,q,1)$ code can be compared with a regular $BR(p,r,q)$ code: they both consist of $(p-1)\times p$ arrays and can recover up to $r$ erased columns. For example, the array on the left below is in $PEBR(5,2,2,1)$, while the one on the right is in $BR(5,2,2)$. The last row is not written. Notice that both arrays share the same data array.

[TABLE]

However, the two codes $PEBR(p,r,q,1)$ and $BR(p,r,q)$ are not equivalent. The code $PEBR(p,r,q,1)$ can correct up to $r$ lines of slope $j$ for $1\leqslant j\leqslant r-1$ if $2\leqslant r\leqslant 3$ or $p-2\leqslant r\leqslant p-1$. For example, consider the array on the left above in $PEBR(5,2,2,1)$. The two lines of slope 1 colored in red and in blue can be recovered. In effect, assuming that the array is in $EBR(5,2,2,1)$ with the last row corresponding to erasures, applying the transformation $b_{u,v}=c_{\langle -u\rangle,\langle u+v\rangle}$ for $0\leqslant u,v\leqslant 4$ of Lemma 34, the lines of slope 1 become columns and the resulting array is in $EBR(5,2,2,1)$. This transformed array has two erased columns, while the remaining columns have one erasure each, and so it is recoverable by Theorem 24. Once such array is recovered, by applying the inverse transformation to it, the original array in $EBR(5,2,2,1)$ is obtained, and by deleting the last row, so is the array in $PEBR(5,2,2,1)$. Codes $BR(p,r,q)$, on the other hand, cannot recover two or more erased lines that are not vertical, as shown in Example 20.

$\Box$

The construction of a $PEIP(p,r,2,1)$ code is also given in [16]. In Theorem 5 of this reference, it is proven, using the Erdős-Heilbronn conjecture, that the code $PEIP(p,r,2,1)$ is MDS when 2 is primitive in $GF(p)$, $p>5$ and $r\leqslant 5$. However, this result is a special case of the combination of Corollary 19, Lemma 44 and Theorem 2.6 in [4].

Example 46.

Consider $EBR(7,3,2,1\oplus x\oplus x^{3})$ as in Example 7. Then, $PEBR(7,3,2,1\oplus x\oplus x^{3})$ consists of all the $3\times 7$ arrays obtained by taking the first three rows of each $7\times 7$ array in $EBR(7,3,2,1\oplus x\oplus x^{3})$ .

Taking the first 3 rows of the two arrays in $EBR(7,3,2,1\oplus x\oplus x^{3})$ of Example 7, we obtain:

[TABLE]

Any three columns in the code can be recovered, so, in particular, this code is a $[7,4,4]$ MDS code over vectors of length 3 (the columns), like a RS code over $GF(8)$ . In fact, it can be shown that this code is equivalent to a $[7,4,4]$ RS code over $GF(8)$ . In effect, if we permute the last two rows of the arrays above, we obtain

[TABLE]

Assuming that each column is a symbol in the finite field $GF(8)$ with $\beta$ a zero of the primitive polynomial $1\oplus x\oplus x^{3}$, the first array above corresponds to the polynomial over $GF(8)$ $f_{0}(X)=\beta^{6}\oplus\beta^{4}X\oplus\beta^{5}X^{2}\oplus X^{4}\oplus\beta X^{5}\oplus\beta^{5}X^{6}$, while the second one corresponds to $f_{1}(X)=\beta^{4}X^{3}\oplus\beta^{2}X^{4}\oplus\beta^{3}X^{5}\oplus X^{6}$. We can verify that $f_{i}(\beta^{-j})=0$ for $0\leqslant i\leqslant 1$ and $0\leqslant j\leqslant 2$, i.e., both codewords are in the RS code with generator polynomial $(1\oplus x)(\beta^{-1}\oplus x)(\beta^{-2}\oplus x)$. It can also be verified more generally that $PEBR(7,r,2,1\oplus x\oplus x^{3})$ with rows 1 and 2 permuted corresponds to a RS code over $GF(8)$ defined by the primitive polynomial $1\oplus x\oplus x^{3}$ with generator polynomial $g(x)=(1\oplus x)\prod_{i=1}^{r-1}(\beta^{-i}\oplus x)$. In the example above, $r=3$.
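The root check $f_{i}(\beta^{-j})=0$ can be carried out mechanically. The sketch below assumes the usual polynomial-basis representation of $GF(8)$ as 3-bit integers with $\beta$ a root of $1\oplus x\oplus x^{3}$; the names `build_exp_table`, `EXP`, `evaluate`, `F0` and `F1` are illustrative and not from the paper. Each polynomial is stored as a map from the power of $X$ to the exponent of $\beta$ in its coefficient.

```python
def build_exp_table(modulus=0b1011, degree=3):
    """Powers beta^0..beta^6 in GF(8) as 3-bit integers; beta is a root of 1 + x + x^3."""
    table, x = [], 1
    for _ in range((1 << degree) - 1):
        table.append(x)
        x <<= 1                  # multiply by beta
        if x >> degree:          # reduce modulo the primitive polynomial
            x ^= modulus
    return table

EXP = build_exp_table()

def evaluate(coeff_exponents, point_exponent):
    """Evaluate sum_k beta^e_k X^k at X = beta^point_exponent (addition is XOR)."""
    value = 0
    for k, e in coeff_exponents.items():
        value ^= EXP[(e + k * point_exponent) % 7]
    return value

# f0(X) = b^6 + b^4 X + b^5 X^2 + X^4 + b X^5 + b^5 X^6
F0 = {0: 6, 1: 4, 2: 5, 4: 0, 5: 1, 6: 5}
# f1(X) = b^4 X^3 + b^2 X^4 + b^3 X^5 + X^6
F1 = {3: 4, 4: 2, 5: 3, 6: 0}

roots_ok = all(evaluate(f, -j) == 0 for f in (F0, F1) for j in range(3))
```

With these coefficient tables `roots_ok` evaluates to `True`, i.e., $1$, $\beta^{-1}$ and $\beta^{-2}$ are roots of both polynomials.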

On the other hand, if $\beta$ is a zero of the primitive polynomial $1\oplus x^{2}\oplus x^{3}$, then applying the permutation $(1\;2\;0)$ to the three rows of the arrays in $PEBR(7,r,2,1\oplus x\oplus x^{3})$ gives a code that corresponds to a RS code over $GF(8)$ with generator polynomial $g(x)=(1\oplus x)\prod_{i=1}^{r-1}(\beta^{i}\oplus x)$. The two original arrays with this permutation are

[TABLE]

$\Box$

Example 47.

Generalizing Example 46, consider a Mersenne prime $p=2^{b}-1$, where $b$ is also prime and $b\geqslant 3$. The first four such Mersenne primes are $7=2^{3}-1$, $31=2^{5}-1$, $127=2^{7}-1$ and $8191=2^{13}-1$.
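The list of admissible primes can be reproduced with a short search. The sketch below uses plain trial division, which is adequate only for small exponents; the function names are illustrative and not from the paper.

```python
def is_prime(n):
    """Trial-division primality test (adequate for the small values used here)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def mersenne_primes(count):
    """First `count` pairs (b, 2**b - 1) with b an odd prime and 2**b - 1 prime."""
    found, b = [], 3
    while len(found) < count:
        if is_prime(b) and is_prime(2 ** b - 1):
            found.append((b, 2 ** b - 1))
        b += 2
    return found

first_four = mersenne_primes(4)
```

The search skips $b=11$, since $2^{11}-1=2047=23\cdot 89$ is not prime, and returns the exponents 3, 5, 7 and 13 quoted above.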

Consider a RS code over $GF(2^{b})$, let $\beta$ be a primitive element in $GF(2^{b})$ and let $h(x)$ be a cyclotomic polynomial [18] such that $h(\beta)=0$. Let $g(x)=(1\oplus x^{p})/(h(x)(1\oplus x))$. Then, the code $PEBR(p,r,2,g(x))$ is an MDS code on columns consisting of $b\times p$ arrays. We believe that a permutation of the rows of such arrays gives a code that is equivalent to a RS code over $GF(p+1)$, as we showed for the case $p=7$ in Example 46, but we cannot find a proof.

$\Box$

VI Conclusions

We have expanded codes like Blaum-Roth codes and generalized EVENODD codes to array codes such that each column has a certain erasure-correcting capability. We have shown the connection of the new codes to traditional array codes. We have presented efficient encoding, decoding and updating algorithms. We have observed that the new codes can recover erased lines of different slopes. We have also shown a method for puncturing the codes such that the resulting arrays constitute an MDS code.

Bibliography

[1] M. Blaum, “A family of MDS array codes with minimal number of encoding operations,” IEEE International Symposium on Information Theory (ISIT’06), Seattle, USA, pp. 2784-88, July 2006.
[2] M. Blaum, J. Brady, J. Bruck, and J. Menon, “EVENODD: an efficient scheme for tolerating double disk failures in RAID architectures,” IEEE Trans. on Computers, vol. C-44, pp. 192-202, February 1995.
[3] M. Blaum, J. Brady, J. Bruck, J. Menon, and A. Vardy, “The EVENODD code and its generalization,” in “High Performance Mass Storage and Parallel I/O: Technologies and Applications,” edited by H. Jin, T. Cortes, and R. Buyya, IEEE & Wiley Press, New York, Chapter 14, pp. 187-208, 2001.
[4] M. Blaum, J. Bruck, and A. Vardy, “MDS array codes with independent parity symbols,” IEEE Trans. on Information Theory, vol. IT-42, pp. 529-42, March 1996.
[5] M. Blaum, V. Deenadhayalan, and S. R. Hetzler, “Expanded Blaum-Roth codes with efficient encoding and decoding algorithms,” IEEE Communications Letters, vol. 23, no. 6, pp. 954-7, June 2019.
[6] M. Blaum and R. M. Roth, “New array codes for multiple phased burst correction,” IEEE Trans. on Information Theory, vol. IT-39, pp. 66-77, January 1993.
[7] M. Blaum and R. M. Roth, “On lowest-density MDS codes,” IEEE Trans. on Information Theory, vol. IT-45, pp. 46-59, January 1999.
[8] P. Corbett, B. English, A. Goel, T. Grcanac, S. Kleiman, J. Leong, and S. Sankar, “Row-diagonal parity for double disk failure correction,” Proc. 3rd Conf. File and Storage Technologies - FAST’04, San Francisco, CA, March/April 2004.
