On PGZ decoding of alternant codes
R. Farré, N. Sayols, and S. Xambó-Descamps

Universitat Politècnica de Catalunya
Abstract
In this note we first review the classical Peterson-Gorenstein-Zierler decoding algorithm for the class of alternant codes (which includes Reed-Solomon, Bose-Chaudhuri-Hocquenghem and classical Goppa codes), then we present an improvement of the method to find the number of errors and the error-locator polynomial, and finally we illustrate the procedure with several examples. In two appendices we sketch the main features of the system [3] we have designed and developed for the computations.
keywords:
Alternant codes, RS codes, BCH codes, classical Goppa codes
MSC:
[2010] 11T71, 94B05, 94B35, 94B15
Introduction
The Peterson-Gorenstein-Zierler decoding algorithm (PGZ for short) was first developed for Reed-Solomon codes (RS), and later applied to Bose-Chaudhuri-Hocquenghem codes (BCH). In [4], two flavours of it were presented for alternant codes, with due attention to the computational aspects. The main interest of working with the class of alternant codes is that it includes many interesting subclasses, like RS codes, BCH codes (the most relevant class of cyclic codes), and classical Goppa codes. The practical bonus of this realization is that all these families of codes can be constructed by specializing the general constructor of alternant codes and, most fundamentally, that any effective decoding algorithm for alternant codes is sufficient (and effective) for all those subclasses.
In this note we present a natural improvement, both conceptual and computational, of the PGZ algorithm. The key point is that the output of the Gauss-Jordan reduction of a (Hankel-like) matrix constructed from the syndrome vector gives directly and at the same time the number of errors and the error-locator polynomial.
The organization is as follows. In the first section we briefly review alternant codes. This includes details about how the classes of codes just mentioned can be constructed with special calls to the main constructor. The second section is devoted to presenting the mathematical basis of the PGZ approach for the decoding of alternant codes. Our improvement of PGZ is explained in detail in the third section, and in the fourth we provide several examples. Finally, Appendix A contains listings of the key Python functions that we have designed and coded to get clear implementations of the computations, and in Appendix B we sketch the main features of the Python package [3] used to script the examples.
Notations and conventions
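If $q$ is a prime power, the finite field of $q$ elements (unique up to isomorphism) is denoted $\mathbb{F}_q$. It is a subfield of $\mathbb{F}_{q^m}$ for all positive integers $m$. The field $\mathbb{F}_{q^m}$ can be constructed as the quotient $\mathbb{F}_q[X]/(f)$, where $f$ is any irreducible polynomial over $\mathbb{F}_q$ of degree $m$.
Given elements $\alpha_1,\dots,\alpha_n$ in a ring, we write $V_r(\alpha_1,\dots,\alpha_n)$ to denote the Vandermonde matrix of $r$ rows associated to $\alpha_1,\dots,\alpha_n$. In other words, its rows have the form $(\alpha_1^j,\dots,\alpha_n^j)$ for $j=0,\dots,r-1$. The determinant of the matrix $V_n(\alpha_1,\dots,\alpha_n)$, which is called the Vandermonde determinant of $\alpha_1,\dots,\alpha_n$, is equal to $\prod_{i<j}(\alpha_j-\alpha_i)$. In particular, it is non-zero when the $\alpha_i$ are distinct elements of a field.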
Let be a finite field. A linear code of length defined over is a vector subspace . If has dimension , we say that is an code. The quotient is called the rate of . The Hamming distance of is the number of indices such that . The minimum distance of a linear code , denoted , is the minimum of the distances for , . The number of non-zero entries of is called the weight of and is denoted . It is easy to see that is the minimum of the weights of non-zero elements of . An code of minimum distance is said to be an code, or an if we need to write the base field explicitly.
1 Essentials on alternant codes
Let and . Let and be elements of such that for all and for all . Consider the matrix
[TABLE]
that is,
[TABLE]
We say that is the alternant control matrix of order associated with the vectors
[TABLE]
To make explicit that the entries of and (and hence of ) lie in , we will say that is defined over .
The -code defined by the control matrix is the subspace of whose elements are the vectors such that . Such codes will be called alternant codes. If we define the -syndrome of a vector as , then is just the subspace of whose elements are the vectors with zero -syndrome.
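As an illustration of the definitions above, here is a minimal Python sketch of the alternant control matrix and of the syndrome map, restricted for simplicity to a prime field Z_p (so that the code field and the field of the entries coincide). The function names `alternant_matrix` and `syndrome` are chosen for this sketch and are not claimed to reproduce the package [3]. The data (powers of 2 modulo 13, with h taken equal to the vector of locators) are an assumption chosen so that the result agrees with the control matrix printed in the RS example of Section 4.

```python
def alternant_matrix(h, alpha, r, p):
    """Matrix with rows (h_1*alpha_1**j, ..., h_n*alpha_n**j), j = 0..r-1, over Z_p."""
    return [[(hi * pow(ai, j, p)) % p for hi, ai in zip(h, alpha)]
            for j in range(r)]


def syndrome(y, H, p):
    """Syndrome y * H^T of the vector y with respect to H, over Z_p."""
    return [sum(yi * hi for yi, hi in zip(y, row)) % p for row in H]


p = 13
alpha = [pow(2, i, p) for i in range(12)]  # the non-zero elements of Z_13 as powers of 2
h = alpha[:]                               # assumption made for this illustration
H = alternant_matrix(h, alpha, 4, p)

y = [0] * 12
y[4] = 3                                   # one error of value 3 at index 4
s = syndrome(y, H, p)

for row in H:
    print(row)
print("syndrome:", s)
```

A vector belongs to the code exactly when `syndrome` returns the zero vector, so the same two functions suffice to test membership.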
1.1 Proposition (Alternant bounds).
If , then
[TABLE]
and
[TABLE]
(minimum distance alternant bound).
Proof.
See, for example, [4], p. 183. ∎
For the proofs of the statements in the remainder of this section, we refer to [4], Section 4.1.
Reed-Solomon codes
Given a list or vector of distinct non-zero elements , the Reed–Solomon code
[TABLE]
is the subspace of generated by the rows of the Vandermonde matrix . It turns out that
[TABLE]
where is given by
[TABLE]
Note that in this case , hence , and that the alternant bounds are sharp. Indeed, we have , hence , while (by the Singleton bound) and by the minimum distance alternant bound. In other words, is MDS (maximum distance separable).
An RS code is called primitive if the are all non-zero elements of . In that case, a natural way to proceed is to generate those elements as the powers of a primitive element of and so the code is, if its dimension is , , where .
Generalized Reed-Solomon codes
The vector in the definition of the code as an alternant code is obtained from by the formula (3). If we allow that can be chosen possibly unrelated to , but still with components in , the resulting codes are called Generalized Reed–Solomon (GRS) codes, and we will write to denote them. An argument as above shows that such codes have type . Notice that the code is the intersection of the GRS code with .
BCH codes
These codes are denoted , where and , are integers (called the design minimum distance and the offset, respectively). When , we simply write and say that it is a strict BCH code. The good news here is that
[TABLE]
where , , .
If is a primitive element of , and hence , we have the equality
[TABLE]
Classical Goppa codes
Let be a polynomial of degree and let be distinct non-zero elements such that for all . Then the classical Goppa code over associated with and , which will be denoted , can be defined as , where is the vector . Thus the minimum distance of is and its dimension satisfies . The minimum distance bound can be improved to in the case that and the roots of are distinct.
2 The PGZ decoding approach
Let be an alternant code. Let , that is, the highest integer such that . For reasons that will become apparent below, is called the error-correction capacity of .
Let (using a transmission channel terminology, we say that it is the sent vector) and (error vector, or error pattern). Let (received vector). The goal of a decoder is to obtain from and when . Henceforth we will assume that .
If , we say that is an error position. Let be the error positions and the corresponding error values. The error locators are defined by . Since are distinct, the knowledge of the is equivalent to the knowledge of the error positions.
The monic polynomial whose roots are the error locators is called the error-locator polynomial. Notice that
[TABLE]
where is the -th elementary symmetric polynomial in the .
The syndrome of is the vector , say . Since , we have . Inserting the definitions, we easily find that
[TABLE]
We will use the following notations:
[TABLE]
and the vector
[TABLE]
The next proposition establishes the key relation for computing the error-locator polynomial.
2.2 Proposition.
If (see the formula (5)), then
[TABLE]
Proof.
Substituting by in the identity
[TABLE]
we obtain the relations
[TABLE]
where . Multiplying by and adding with respect to , we obtain (using (6)) the relations
[TABLE]
where , and these relations are equivalent to the stated matrix relation. ∎
2.3 Remark.
In the equation (9), the matrix turns out to be non-singular and hence it determines (and ) uniquely, namely . In the next section we are going to establish this fact as a corollary of our Theorem 3.5, whose main outcome is a fast solution of that equation.
The roots of only tell us in what positions the errors occur. To find the actual value of the errors, we need the syndrome polynomial, and the polynomial . Notice that the roots of are .
2.4 Theorem (Forney’s formula).
Let . Then for any we have
[TABLE]
where denotes the derivative of .
Proof.
See, for example, [4], sections 4.2 and 4.4. ∎
Because of this result, the polynomial is called the error-evaluator polynomial.
3 Fast PGZ computations
The main object considered in this Section is the matrix (cf. [1])
[TABLE]
Note that , so that all components are well defined. Note also that the submatrix at the upper left corner is the matrix defined by Eq. (7) and that the column to its right is the vector defined by Eq. (8).
In the next theorem we use the following notation: . Thus the -th row of , for , is the vector . We also write .
3.5 Theorem.
.
Proof.
Let and . Then the -th column of is the column vector . It follows that the element in row column of is (by Eq. (6)). ∎
3.6 Corollary.
The rank of is and the matrix is non-singular.
Proof.
Since has rank , the rank of is at most . On the other hand, the theorem shows that and therefore
[TABLE]
Note that is the Vandermonde determinant of , which is non-zero because the error locators are distinct. ∎
3.7 Corollary.
The Gauss-Jordan algorithm applied to the matrix returns a matrix that has the form
[TABLE]
where denotes unneeded values (if any) and the vertical dots below the horizontal line denote that all its elements (if any) are zero. This matrix gives at the same time , the number of errors, and the coefficients of the error-locator polynomial.∎
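The content of this corollary can be sketched in a few lines of Python for codes over a prime field Z_p. The helper names (`gauss_jordan_mod_p`, `hankel_matrix`, `error_locator`, `locator_positions`) are hypothetical and written only for this illustration; the coefficient convention (negate the column to the right of the pivot block and reverse it) is taken from the listing in Appendix A. The two syndrome vectors are those of the RS examples in Section 4.

```python
def gauss_jordan_mod_p(M, p):
    """Reduced row echelon form of M over Z_p (p prime) and the list of pivot columns."""
    A = [[x % p for x in row] for row in M]
    pivots, row = [], 0
    for col in range(len(A[0])):
        if row == len(A):
            break
        # look for a non-zero entry in this column, at or below the current row
        pr = next((i for i in range(row, len(A)) if A[i][col]), None)
        if pr is None:
            continue
        A[row], A[pr] = A[pr], A[row]
        inv = pow(A[row][col], -1, p)  # modular inverse, Python >= 3.8
        A[row] = [(x * inv) % p for x in A[row]]
        for i in range(len(A)):
            if i != row and A[i][col]:
                f = A[i][col]
                A[i] = [(x - f * z) % p for x, z in zip(A[i], A[row])]
        pivots.append(col)
        row += 1
    return A, pivots


def hankel_matrix(s):
    """Matrix with t = len(s)//2 rows and t+1 columns whose (i, j) entry is s[i+j]."""
    t = len(s) // 2
    return [s[i:i + t + 1] for i in range(t)]


def error_locator(s, p):
    """Return (l, [1, a_1, ..., a_l]): number of errors and monic locator coefficients."""
    R, pivots = gauss_jordan_mod_p(hankel_matrix(s), p)
    l = len(pivots)
    if pivots != list(range(l)):
        raise ValueError("unexpected pivot pattern (more errors than can be corrected?)")
    x = [R[i][l] for i in range(l)]            # column to the right of the pivot block
    return l, [1] + [(-c) % p for c in reversed(x)]


def locator_positions(coeffs, alpha, p):
    """Indices i such that alpha[i] is a root of the polynomial (Horner evaluation)."""
    def value(x):
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % p
        return acc
    return [i for i, a in enumerate(alpha) if value(a) == 0]


p = 13
alpha = [pow(2, i, p) for i in range(12)]
for s in ([9, 1, 3, 9], [5, 7, 7, 3]):
    l, coeffs = error_locator(s, p)
    print(l, coeffs, locator_positions(coeffs, alpha, p))
```

For the first syndrome vector the reduction stops after one pivot, and for the second after two, so a single pass yields both the number of errors and the locator coefficients.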
Putting together what we have learned in the last two sections, we obtain two algorithms to decode alternant codes, or rather two variants of an algorithm. We call them PGZ and PGZm, for in essence they are due to Peterson, Gorenstein and Zierler (see [2]). They share the same scheme for finding the location of the errors, but differ in how the error values are computed. PGZm is the simpler of the two, as it relies mainly on linear algebra, whereas PGZ relies on finding the error-evaluator polynomial and using Forney’s formula.
In the descriptions that follow, Error means “a suitable decoding-error message” and the function GJ(S) returns the values of the matrix (12) as a column vector (this is a slightly modified form of the Gauss-Jordan procedure). In detail, it works as follows:
Improved PGZ
1. Get the syndrome vector, . If , return .
2. Form the matrix as in Eq. (11).
3. Set (Eq. (12)). After this we have , hence also the error-locator polynomial .
4. Find the elements that are roots of the polynomial . If the number of these roots is , return Error. Otherwise let be the error-locators corresponding to the roots and set , where .
5. Let , and compute the error-evaluator polynomial by the formula
[TABLE]
6. Find the errors , for all , using Forney’s formula (equation (10)). If any of the values of is not in , return Error. Otherwise return .
3.8 Theorem.
The algorithm PGZ corrects up to errors.
Proof.
It is an immediate consequence of what we have seen so far. ∎
3.9 Remark.
In step 5 of the algorithm we could use the alternative syndrome polynomial , find the alternative error evaluator as the remainder of the division of by and then, in step 6, use the following alternative Forney formula ([4], P.4.9):
[TABLE]
Algorithm PGZm
The steps 5 and 6 of the PGZ algorithm can be compressed into a single step consisting in solving for the following system of linear equations:
[TABLE]
which is equivalent to the matrix equation
[TABLE]
and then return (or Error if one or more of the components of is not in ).
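The PGZm variant can be sketched end to end for codes over a prime field Z_p, where the check that the error values lie in the base field is automatic. This is a simplified illustration modelled on the listing in Appendix A, not the package function itself: the names `rref_mod_p` and `pgzm_prime_field` are hypothetical, and the code data (locators equal to the powers of 2 modulo 13, h equal to the locators, r = 4) are assumed so as to match the RS example of Section 4.

```python
def rref_mod_p(M, p):
    """Reduced row echelon form of M over Z_p (p prime) and its pivot columns."""
    A = [[x % p for x in row] for row in M]
    pivots, row = [], 0
    for col in range(len(A[0])):
        if row == len(A):
            break
        pr = next((i for i in range(row, len(A)) if A[i][col]), None)
        if pr is None:
            continue
        A[row], A[pr] = A[pr], A[row]
        inv = pow(A[row][col], -1, p)  # modular inverse, Python >= 3.8
        A[row] = [(x * inv) % p for x in A[row]]
        for i in range(len(A)):
            if i != row and A[i][col]:
                f = A[i][col]
                A[i] = [(x - f * z) % p for x, z in zip(A[i], A[row])]
        pivots.append(col)
        row += 1
    return A, pivots


def pgzm_prime_field(y, h, alpha, r, p):
    """Return (decoded vector, error positions, error values) for an alternant code over Z_p."""
    n = len(y)
    # syndrome s_j = sum_i y_i * h_i * alpha_i**j, j = 0..r-1
    s = [sum(y[i] * h[i] * pow(alpha[i], j, p) for i in range(n)) % p
         for j in range(r)]
    if not any(s):
        return list(y), [], []
    t = r // 2
    S = [s[i:i + t + 1] for i in range(t)]      # Hankel-like matrix of the syndrome
    R, pivots = rref_mod_p(S, p)
    l = len(pivots)
    if pivots != list(range(l)):
        raise ValueError("decoding error: unexpected pivot pattern")
    # monic error-locator polynomial, highest coefficient first
    coeffs = [1] + [(-R[i][l]) % p for i in reversed(range(l))]

    def value(x):
        acc = 0
        for c in coeffs:
            acc = (acc * x + c) % p
        return acc

    positions = [i for i in range(n) if value(alpha[i]) == 0]
    if len(positions) < l:
        raise ValueError("decoding error: defective error location")
    # linear system sum_k h_{m_k} * eta_k**j * e_k = s_j for j = 0..l-1
    V = [[(h[m] * pow(alpha[m], j, p)) % p for m in positions] + [s[j]]
         for j in range(l)]
    W, _ = rref_mod_p(V, p)
    values = [W[k][l] for k in range(l)]
    x = list(y)
    for m, e in zip(positions, values):
        x[m] = (x[m] - e) % p
    return x, positions, values


p = 13
alpha = [pow(2, i, p) for i in range(12)]
h = alpha[:]
y = [0, 0, 0, 0, 3, 0, 0, 0, 0, 7, 0, 0]
print(pgzm_prime_field(y, h, alpha, 4, p))
```

The only linear algebra involved is two Gauss-Jordan reductions, one for the locator and one for the error values, which is what makes this variant convenient for teaching.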
3.10 Remark.
Even with the improvements advanced in this note, in theory the PGZ and PGZm algorithms cannot beat, for very large alternant codes, the Berlekamp-Massey-Sugiyama (BMS) algorithm (cf. [4], Section 4.3). But they are comparable for the codes that are feasible in practice. Indeed, the very construction of the alternant matrix is costly in time and space, and within the range of parameters that can usually be afforded, the efficiency of PGZ or PGZm is comparable to that of BMS. Let us also say that in some contexts, as for example in teaching, PGZm has the advantage that it is more straightforward to explain and to implement, the easiest case being RS codes over , prime.
4 Examples
Here we are going to discuss the implementation of the algorithms using [3], and how it works, by considering some examples for each of the following classes: RS, GRS, BCH and (classical) Goppa codes.
In the code constructors described below, h and a stand for variables bound to vectors of the same length with entries in a finite field; K and F, to finite fields and ; r, k, d and l, to integers , and used as in the first two sections; and g to a univariate polynomial with coefficients in a finite field.
1. AC(h,a,r,K): This constructs the alternant code . In the context of this note, it is our main constructor, as the others (described below) are in fact defined as special calls to AC (cf. Section 1).
2. RS(a,k): This yields the RS code , an code defined over the field to which the elements of belong.
3. GRS(h,a,k): As RS, but we have to supply as a first argument.
4. PRS(F,k): The primitive RS code of the finite field . It is defined as RS(a,k), but taking as the list of non-zero elements of .
5. BCH(a,d,l): Supplies the code , where here a stands for an element in a finite field.
6. Goppa(g,a): The Goppa code .
The code C obtained by any one of these constructors is a record-like structure with fields that allow one to get data from the code or to store new information about it. The labels of those fields end with an underscore, but otherwise tend to mimic the mathematical symbols. For example, a = a_(C) and H = H_(C) bind the variable a to the vector and the variable H to the alternant control matrix of .
4.11 Remark.
Except for RS and GRS codes, for which the parameters can be deduced immediately from the data supplied to their constructors, in general there is some work to be done to determine those not yet known. This work is rather straightforward when it comes to computing . To that end, we need to construct a control matrix of defined over . This can be done in two steps: replace each entry of by the column of its components in the natural linear basis of over (this yields an -control matrix, but it usually has redundant rows) and then suppress all the rows that are linear combinations of the previous ones. We have implemented these steps by means of the functions blow(H,K) and prune(M). The bottom line is that the dimension of is , where is the number of rows of prune(blow(H,K)) or just rank(blow(H,K)). This is the method used to determine the dimension and the rate when we quote them.
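The two-step procedure of this remark can be illustrated with a self-contained Python sketch for a binary BCH code of length 31 and design distance 7, the first code considered in the BCH examples below. Everything here is an assumption made for the illustration: elements of the field of 32 elements are represented as 5-bit integers, the modulus is taken to be x^5 + x^2 + 1, the offset is taken to be 1, and the helpers `gf_mul`, `gf_pow`, `blow` and `rank_gf2` are hypothetical stand-ins that only mimic the names used in the text.

```python
MOD = 0b100101   # x^5 + x^2 + 1, irreducible over the binary field (assumed modulus)
DEG = 5


def gf_mul(a, b):
    """Product of two elements of the field of 32 elements, given as 5-bit integers."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << DEG):
            a ^= MOD
    return res


def gf_pow(a, e):
    """a raised to the non-negative integer power e."""
    res = 1
    for _ in range(e):
        res = gf_mul(res, a)
    return res


def blow(H):
    """Replace each entry by the column of its DEG binary coordinates."""
    return [[(x >> b) & 1 for x in row] for row in H for b in range(DEG)]


def rank_gf2(rows):
    """Rank over the binary field, with each row packed into an integer bitmask."""
    basis = []
    for row in rows:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)   # clears the leading bit of b in v when it is set
        if v:
            basis.append(v)
    return len(basis)


n, r = 31, 6                     # length and number of rows of the control matrix
a = 0b00010                      # the class of x; every element other than 0, 1 is primitive
# entry in row j, column i: h_i * alpha_i**j with h_i = a**i and alpha_i = a**i
H = [[gf_pow(a, (i * (j + 1)) % n) for i in range(n)] for j in range(r)]

rank = rank_gf2(blow(H))
print("dimension:", n - rank)
```

Pruning is not needed when only the dimension is wanted, since the rank of the blown-up matrix already counts its independent rows.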
A final comment before getting into the examples is that we can assume that the received vector is an error vector such that . The reason is the linearity of the code, which implies that only the error vector is involved in the computations of the error positions and values.
RS
Take and construct C = PRS(K,8), the primitive RS of of dimension . It has length , so its rate is , and the minimum distance is , so that it corrects at least two errors. First let us consider the case of one error. Suppose the received vector is
e = [0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0]
Then the decoder call PGZ(e,C) (or PGZm(e,C)) yields
PGZ: Error positions [4], error values [3] :: Vector[Z13]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] :: Vector[Z13]
This means that PGZ finds that there is a single error at the index 4 (5th element of the vector) and that its value is 3, and then outputs (correctly) the decoded vector. Let us go over the steps followed by PGZ in detail. The control matrix is H = H_(C):
[[1 2 4 8 3 6 12 11 9 5 10 7]
 [1 4 3 12 9 10 1 4 3 12 9 10]
 [1 8 12 5 1 8 12 5 1 8 12 5]
 [1 3 9 1 3 9 1 3 9 1 3 9]] :: Matrix[Z13]
Then the syndrome vector is given by s = y*transpose(H),
[9, 1, 3, 9],
and the matrix S = hankel_matrix(s) is
[[9 1 3]
 [1 3 9]] :: Matrix[Z13]
This matrix has rank 1, as the second row is 3 times the first. This means that there is one error and that the error-locator polynomial is z - 3. To find the error position, we have to look at the position of 3 in the vector used to construct C, which is a_(C):
[1, 2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7] :: Vector[Z13]
Thus the position is indeed the one in which the error occurred. To find the error value, first we have to calculate the error evaluator
[TABLE]
which turns out to be the constant . Forney’s formula for the error value is (for in this case ), which is the error value.
Now we are going to repeat, with less detail, the case of 2 errors. Suppose the received vector is
y = [0, 0, 0, 0, 3, 0, 0, 0, 0, 7, 0, 0]
Then the decoder call PGZ(y,C) (or PGZm(y,C)) yields
PGZ: Error positions [4, 9], error values [3, 7] :: Vector[Z13]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] :: Vector[Z13]
The syndrome vector is
[5, 7, 7, 3],
and the matrix S = hankel_matrix(s) is
[[5 7 7]
 [7 7 3]] :: Matrix[Z13]
Since it has rank 2, there have been 2 errors. In this case the Gauss-Jordan reduction produces . Since the roots 3 and 5 occupy the positions 5 and 10 in , we see that the indices of the computed error positions are correct. For the error values we have to apply Forney’s formula to 3 and 5 ( and ). We have , , , and . Then the error corresponding to, for example, the second root is
[TABLE]
For another example, if we take F = Zn(31) and k = 20, then C = PRS(F,k) is a code. This corrects up to 5 errors and its rate is . This capability is illustrated in the following listing:
e = rd_error_vector(F,n,5) # this creates a random 5-error pattern
[0,0,0,0,0,0,0,0,0,14,0,0,0,28,26,0,0,0,0,23,0,0,16,0,0,0,0,0,0,0] :: Vector[Z31]
PGZ(e,C)
PGZ: Error positions [9,13,14,19,22], error values [14,28,26,23,16] :: Vector[Z31]
BCH
Take and , generated by such that . Let . This is a binary code of length (the order of ) that corrects up to 3 errors. In our system it can be constructed as follows:
K = Zn(2)
[F,a] = extension(K,[1,0,0,1,0,1],'a','F')
C = BCH(a,7)
Its dimension is 16 (so its rate is ), as shown by the following command:
n - rank(blow(H_(C),K))
16
Now consider, for example, the weight 3 error pattern e:
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
Then the call PGZ(e,C) outputs
PGZ: Error positions [5, 19, 28], error values [1, 1, 1] :: Vector[K]
(Note that the only possible error value over is 1, and that therefore in this case the decoder only needs to care about error location.) The matrix computed in this case is (instead of we write ):
[[22, 13, 14, 26]
[13, 14, 26, 19]
[14, 26, 19, 28]]
which gives .
The code also corrects up to 3 errors when considered as an -code (which is a GRS code over ). For example, if
e=[0,0,0,0,0,0,0,0, , 1, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, , 0,0,0,0]
then the output contains
PGZ: Error positions [8, 9, 26], error values [a5, 1, a19] :: Vector[F]
together with the correct decoded vector. With the same conventions as before, the matrix gives :
[[16, 0, 30, 14]
[ 0, 30, 14, 25]
[30, 14, 25, 28]]
Here is a more involved example. Take and , generated by such that . The element is primitive and has order . It follows that the code has length and that it corrects at least 5 errors. In the PyECC system it can be constructed as follows:
K = Zn(3)
f = get_irreducible_polynomial(K,5,'X') # X**5 - X + 1
[F,a] = extension(K,f,'a','F')
C = BCH(a**2,11)
In addition, the command n - rank(blow(H_(C),K)) yields that its dimension is , so that its rate is .
Consider the error pattern of weight 5 (where 0^k denotes 0 repeated k times)
e = [0^2, 1, 0^7, 1, 0^22, 2, 0^6, 2, 0^72, 1, 0^7] :: Vector[Z3]
Then we have:
PGZ(e,C)
PGZ: Error positions [2, 10, 33, 40, 113], error values [1, 1, 2, 2, 1] :: Vector[F5]
Classical Goppa
Consider the field generated over by such that . Let and make a list of the elements such that . Then it turns out that has length ( has four simple roots and one double root in ) and that corrects up to 3 errors.
F5 = Zn(5)
# Creation of F25, with generator x
[F25,x] = extension(F5,[1,0,-2],'x','F25')
# Creation of the polynomial ring F25[T]
[A,T] = polynomial_ring(F25,'T')
g = T**6 + T**3 + T + 1
a = Set(F25)[1:] # The non-zero elements of F25
a = [t for t in a if evaluate(g,t)!=0]
C = Goppa(g,a)
# generate a random error pattern of weight 3
e = rd_error_vector(Z5,n,3)
e = [0,1,0,0,0,3,0,4,0,0,0,0,0,0,0,0,0,0,0] :: Vector[Z5]
# Use the PGZ decoder for C
PGZ(e,C)
PGZ: Error positions [1,5,7], error values [1,3,4] :: Vector[K]
[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] :: Vector[Z5]
The dimension of (given by n - rank(blow(H_(C),F5))) is , and its rate .
Here is another illustration, with :
F3 = Zn(3)
f = get_irreducible_polynomial(F3,4,'X') # X**4 + X + 2
[F81,x] = extension(F3,f,'x','F81') # x is primitive
g = X**2 * (X-1)**4 * (X-2)**4
a = Set(F81)[3:] # g does not vanish on any of these values
n = len(a) # 81 - 3 = 78
C = Goppa(g,a) # code of length 78
k = n - rank(blow(H_(C),F3)) # k = 78-32 = 46
e = rd_error_vector(F3,n,5)
[0^10, 2, 0^35, 2, 0^9, 1, 0^6, 1, 0^3, 2, 0^10]
PGZm(e,C)
PGZm: Error positions [10, 46, 56, 63, 67], error values [2, 2, 1, 1, 2] :: Vector[F3]
Appendix A The function PGZm
For the sake of brevity, here we list and comment the PGZm(y,C) function. Its code, together with the code of PGZ(y,C), can be accessed by browsing the text file Listings-FSX.pyecc. The parameter y is supposed to be the received vector in a transmission using the alternant code C. We have seen that the value of expressions a_(C) and H_(C) is the vector and the control matrix . Similarly, the values of the expressions K_(C), h_(C), r_(C) are the field over which C is defined, the vector and the number of rows of , respectively.
def PGZm(y,C):
    if isinstance(y,list): y = vector(K_(C),y)
    if not isinstance(y,Vector_Element):
        return "PGZm: Argument is not a vector"
    h = h_(C)
    if len(y) != len(h):
        return "PGZm: Vector argument has wrong length"
    r = r_(C); alpha = a_(C); H = H_(C); K = K_(C)
    s = y*H.transpose()
    if is_zero(s):
        print("PGZm: Input is a code vector")
        return y
    S = hankel_matrix(s)
    c0 = S[:,0] # keep the first column of S
    a = -GJ(S); l = len(a)
    a = reverse(a.to_list())
    K1 = K_(H)
    [_,z] = polynomial_ring(K1,'z','K1[z]')
    L = hohner([1]+a,z)
    R = [s for s in alpha if evaluate(L,s)==0]
    if len(R) < l:
        return "PGZm: Defective error location"
    M = [alpha.to_list().index(r) for r in R]
    h1 = [h[m] for m in M]
    V = alternant_matrix(h1,R,l)
    v = c0[:l]
    V1 = splice(V,v)
    w = transpose(GJ(V1))
    for t in w:
        t = pull(t,K)
        if not belongs(t,K):
            return "PGZ: error value not in base field"
    show("PGZm: Error positions {}, error values {}".format(M, w))
    for j in range(len(M)): y[M[j]]-=w[j]
    return pull(y,K)
Appendix B The PyECC system
Initially (October 2015) the idea that launched [3] was to match the functionality of the CC system developed to deal with the computational tasks related to the book [4], but it soon became clear that we could go beyond that system in several directions. The aim of the undertaking is to produce a Python package (PyECC) enabling the construction, coding and decoding of error-correcting codes and to make it freely available for teachers and researchers. The current state of the project is documented at PyECC.
References
[1] R. Farré. Notes on information theory and coding theory, 2003. In Catalan.
[2] W. W. Peterson and E. J. Weldon. Error-Correcting Codes. MIT Press (2nd edition), 1972.
[3] N. Sayols and S. Xambó-Descamps. A Python package for the construction, coding and decoding of error-correcting codes. PyECC, 2017.
[4] S. Xambó-Descamps. Block error-correcting codes: a computational primer. Universitext. Springer, 2003.
