# Berlekamp–Welch algorithm

The Berlekamp–Welch algorithm, also known as the Welch–Berlekamp algorithm, is named for Elwyn R. Berlekamp and Lloyd R. Welch. The algorithm efficiently corrects errors in BCH codes and Reed–Solomon codes (which are a subset of BCH codes). Unlike many other decoding algorithms, such as the Berlekamp–Massey algorithm, which relies on syndrome decoding and the dual of the code, the Berlekamp–Welch algorithm decodes Reed–Solomon codes using just the generator matrix and not syndromes.

## History on decoding Reed–Solomon codes

1. In 1960, Peterson came up with an algorithm for decoding BCH codes.[1][2] His algorithm solves the important second stage of the generalized BCH decoding procedure and is used to calculate the error locator polynomial coefficients that in turn provide the error locator polynomial. This is crucial to the decoding of BCH codes.
2. In 1963, Gorenstein and Zierler saw that BCH codes and Reed–Solomon codes have a common generalization and that the decoding algorithm extends to a more general situation.
3. In 1968–69, Elwyn Berlekamp invented an algorithm for decoding BCH codes. James Massey recognized its application to linear feedback shift registers and simplified the algorithm.[3][4] Massey termed the algorithm the LFSR Synthesis Algorithm (Berlekamp Iterative Algorithm), but it is now known as the Berlekamp–Massey algorithm.
4. In 1986, the Welch–Berlekamp algorithm was developed to solve the decoding equation of Reed–Solomon codes, using a fast method to solve a certain polynomial equation. The Berlekamp–Welch algorithm has a running time complexity of ${\displaystyle {\mathcal {O}}(n^{3})}$, where ${\displaystyle n}$ is the block length. The following sections follow Gemmell and Sudan's exposition of the Berlekamp–Welch algorithm.[5]

## Error locator polynomial of Reed–Solomon codes

In the problem of decoding Reed–Solomon codes, the inputs are pairwise distinct evaluation points ${\displaystyle \alpha _{i}\in \mathbb {F} }$ (${\displaystyle i=1,\ldots ,n}$) for a code of dimension ${\displaystyle k}$ and distance ${\displaystyle d=n-k+1}$, together with a received word ${\displaystyle y=(y_{1},\ldots ,y_{n})\in \mathbb {F} ^{n}}$. Our goal is to describe an algorithm that can correct ${\displaystyle e<{n-k+1 \over 2}}$ errors in polynomial time. To do so, we have to find a polynomial ${\displaystyle P}$ over ${\displaystyle \mathbb {F} }$ of degree at most ${\displaystyle k-1}$ such that the number of indices ${\displaystyle i}$ with ${\displaystyle P(\alpha _{i})\neq y_{i}}$ is at most ${\displaystyle e}$. We assume that such a polynomial ${\displaystyle P(X)}$ exists, that is, ${\displaystyle \Delta (y,(P(\alpha _{i}))_{i=1}^{n})\leq e<{d \over 2}={n-k+1 \over 2}}$, where ${\displaystyle \Delta }$ denotes the Hamming distance.

Note that the coefficients of ${\displaystyle P}$ are the encoded information. To find ${\displaystyle P}$, we use an indicator for those ${\displaystyle i}$ where an error may have occurred. We define ${\displaystyle E(X)}$, an error locator polynomial over ${\displaystyle \mathbb {F} }$, such that ${\displaystyle E(\alpha _{i})=0}$ whenever ${\displaystyle y_{i}\neq P(\alpha _{i})}$. Its degree satisfies ${\displaystyle \deg(E)\leq e\leq {n-k \over 2}}$, and it is given by:

${\displaystyle E(X)=\prod _{\alpha _{i}\in S}(X-\alpha _{i})}$ where ${\displaystyle S=\{\alpha _{i}|P(\alpha _{i})\neq y_{i}\}}$

We can also claim that for every ${\displaystyle 1\leq i\leq n}$, ${\displaystyle y_{i}E(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})}$. If ${\displaystyle y_{i}=P(\alpha _{i})}$, the two sides are trivially equal. If ${\displaystyle y_{i}\neq P(\alpha _{i})}$, both sides of the equation are ${\displaystyle 0}$ because ${\displaystyle E(\alpha _{i})=0}$.
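The identity above can be checked numerically. The following is a minimal sketch, assuming the integer points of the Example section (the line ${\displaystyle P(x)=5-x}$ with one corrupted value) and polynomials stored as coefficient lists, lowest degree first. The helper names `poly_eval` and `error_locator` are illustrative and not part of any library.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial [c0, c1, ...] (lowest degree first) by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def error_locator(alphas, ys, p_coeffs):
    """Build E(X) as the product of (X - alpha_i) over all i with y_i != P(alpha_i)."""
    e_coeffs = [1]
    for a, y in zip(alphas, ys):
        if poly_eval(p_coeffs, a) != y:
            shifted = [0] + e_coeffs                   # X * E(X)
            scaled = [-a * c for c in e_coeffs] + [0]  # -alpha_i * E(X)
            e_coeffs = [s + t for s, t in zip(shifted, scaled)]
    return e_coeffs


# Assumed data: evaluation points, received values, and the true polynomial.
alphas = [1, 2, 3, 4]
ys = [4, 3, 4, 1]      # the value at x = 3 is corrupted (should be 2)
P = [5, -1]            # P(x) = 5 - x

E = error_locator(alphas, ys, P)   # coefficients of X - 3
identity_holds = all(
    y * poly_eval(E, a) == poly_eval(P, a) * poly_eval(E, a)
    for a, y in zip(alphas, ys)
)
print(E, identity_holds)
```

Note that this sketch needs ${\displaystyle P}$ to build ${\displaystyle E}$, which a decoder does not have; it only illustrates why the identity holds at every point.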

However, since both ${\displaystyle E(X)}$ and ${\displaystyle P(X)}$ are unknown, the condition above cannot be solved directly: the ${\displaystyle n}$ equations in the ${\displaystyle e+k}$ unknown coefficients are quadratic, because they contain products of a coefficient of ${\displaystyle P}$ with a coefficient of ${\displaystyle E}$. To get around this, we use a seemingly useless yet very powerful method and define another polynomial ${\displaystyle Q(X)=P(X)E(X)}$. By treating each product of two unknowns that gives rise to a quadratic term as a single new unknown, we increase the number of unknowns but make the equations linear. This method is called linearization[6] and is a very powerful tool.
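As a small worked illustration of linearization, assume ${\displaystyle k=2}$ and ${\displaystyle e=1}$, and write ${\displaystyle P(X)=p_{0}+p_{1}X}$ and ${\displaystyle E(X)=X+e_{0}}$ (these coefficient names are chosen here for illustration). Expanding the product gives:

${\displaystyle P(X)E(X)=p_{0}e_{0}+(p_{0}+p_{1}e_{0})X+p_{1}X^{2}}$

The terms ${\displaystyle p_{0}e_{0}}$ and ${\displaystyle p_{1}e_{0}}$ are quadratic in the unknowns. Introducing ${\displaystyle Q(X)=q_{0}+q_{1}X+q_{2}X^{2}}$ with

${\displaystyle q_{0}=p_{0}e_{0},\quad q_{1}=p_{0}+p_{1}e_{0},\quad q_{2}=p_{1}}$

turns each constraint ${\displaystyle y_{i}E(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})}$ into ${\displaystyle y_{i}(\alpha _{i}+e_{0})=q_{0}+q_{1}\alpha _{i}+q_{2}\alpha _{i}^{2}}$, which is linear in the four unknowns ${\displaystyle q_{0},q_{1},q_{2},e_{0}}$. For instance, with ${\displaystyle p_{0}=5}$, ${\displaystyle p_{1}=-1}$ and ${\displaystyle e_{0}=-3}$, this gives ${\displaystyle q_{0}=-15}$, ${\displaystyle q_{1}=8}$ and ${\displaystyle q_{2}=-1}$.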

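Thus ${\displaystyle Q(X)}$ is a polynomial over ${\displaystyle \mathbb {F} }$ having the properties:

1. ${\displaystyle \deg(Q)\leq e+k-1}$
2. ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ for every ${\displaystyle 1\leq i\leq n}$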

This helps because if we now manage to find ${\displaystyle Q(X)}$ and ${\displaystyle E(X)}$, we can easily find ${\displaystyle P(X)}$ using ${\displaystyle P(X)={Q(X) \over E(X)}}$. The main purpose of the Berlekamp–Welch algorithm is to find ${\displaystyle P(X)}$ using the degree-bounded polynomials ${\displaystyle Q(X)}$ and ${\displaystyle E(X)}$ and the properties of ${\displaystyle E}$ and ${\displaystyle Q}$.

Computing ${\displaystyle E(X)}$ is as hard as finding the end solution, the polynomial ${\displaystyle P(X)}$: once ${\displaystyle E(X)}$ is computed, we can easily recover ${\displaystyle P(X)}$ using erasure decoding for Reed–Solomon codes. In some cases, even the polynomial ${\displaystyle Q(X)}$ is as hard to find as ${\displaystyle E(X)}$. For example, given ${\displaystyle Q(X)}$ and ${\displaystyle y}$ (such that ${\displaystyle y_{i}\neq 0}$ for ${\displaystyle 1\leq i\leq n}$), we can find the error locations by checking the positions where ${\displaystyle Q(\alpha _{i})=0}$. The algorithm thus works on the principle that while each of the polynomials ${\displaystyle E(X)}$ and ${\displaystyle Q(X)}$ is hard to find individually, computing them together is much easier.

## The Berlekamp–Welch decoder and algorithm

The Welch–Berlekamp decoder for Reed–Solomon codes consists of the Welch–Berlekamp algorithm augmented by some additional steps that prepare the received word for the algorithm and interpret the result of the algorithm.

The inputs to the Berlekamp–Welch decoder are the block length ${\displaystyle n}$, the number of errors ${\displaystyle e}$ such that ${\displaystyle e<{n-k+1 \over 2}}$, and the received word ${\displaystyle (y_{i},\alpha _{i})_{i=1}^{n}}$, satisfying the condition that there exists at most one polynomial ${\displaystyle P(X)}$ with ${\displaystyle \deg(P(X))\leq k-1}$ and ${\displaystyle \Delta (y,(P(\alpha _{i}))_{i})\leq e}$.

The output of the decoder is either the polynomial ${\displaystyle P(X)}$, or in some cases, a failure. This decoder functions in two steps as follows:

1. Interpolation step: the decoder computes a non-zero polynomial ${\displaystyle E(X)}$ of degree ${\displaystyle e}$, normalized so that the coefficient of ${\displaystyle X^{e}}$ is 1,[7] and another polynomial ${\displaystyle Q(X)}$ with ${\displaystyle \deg(Q(X))\leq e+k-1}$, such that ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ holds for all ${\displaystyle 1\leq i\leq n}$. If no polynomials satisfying this condition can be computed, the decoder outputs a failure.
2. If ${\displaystyle E(X)}$ divides ${\displaystyle Q(X)}$, define ${\displaystyle P(X)={Q(X) \over E(X)}}$. If ${\displaystyle \Delta (y,(P(\alpha _{i}))_{i})\leq e}$, the decoder outputs ${\displaystyle P(X)}$. If ${\displaystyle E(X)}$ does not divide ${\displaystyle Q(X)}$, or if the distance check fails, the decoder returns a failure.
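The two steps above can be sketched in code. The following is a minimal illustration, assuming the field is a prime field GF(p) represented by Python integers modulo a prime `p` (a simplifying assumption; practical Reed–Solomon codes usually work over GF(2^m)). Polynomials are coefficient lists, lowest degree first. The function names `solve_mod`, `poly_eval`, `poly_divmod_monic` and `berlekamp_welch` are hypothetical, and Gaussian elimination is used only because it is the simplest way to solve the linear system. Requires Python 3.8 or later for `pow(x, -1, p)`.

```python
def solve_mod(A, b, p):
    """Solve A x = b over GF(p) by Gauss-Jordan elimination.
    Returns one solution (free variables set to 0), or None if inconsistent."""
    rows, cols = len(A), len(A[0])
    M = [[v % p for v in row] + [bv % p] for row, bv in zip(A, b)]
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        pr = next((i for i in range(r, rows) if M[i][c]), None)
        if pr is None:
            continue                      # no pivot in this column
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], -1, p)         # modular inverse of the pivot
        M[r] = [v * inv % p for v in M[r]]
        for i in range(rows):
            if i != r and M[i][c]:
                f = M[i][c]
                M[i] = [(vi - f * vr) % p for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in M[r:]):     # a row reading 0 = nonzero
        return None
    x = [0] * cols
    for i, c in enumerate(pivots):
        x[c] = M[i][-1]
    return x


def poly_eval(coeffs, x, p):
    """Evaluate a polynomial over GF(p) by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result


def poly_divmod_monic(num, den, p):
    """Long division of num by a monic den over GF(p); returns (quotient, remainder)."""
    num = [c % p for c in num]
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(num) - len(den), -1, -1):
        coef = num[i + len(den) - 1]
        quot[i] = coef
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - coef * d) % p
    return quot, num[:len(den) - 1]


def berlekamp_welch(alphas, ys, k, e, p):
    """Return the coefficients of P (degree <= k-1), or None on failure."""
    # Step 1: unknowns are e_0..e_{e-1} (E is monic) followed by q_0..q_{e+k-1}.
    # Each point gives Q(a) - y * sum_j e_j a^j = y * a^e.
    A, b = [], []
    for a, y in zip(alphas, ys):
        row = [(-y * pow(a, j, p)) % p for j in range(e)]
        row += [pow(a, j, p) for j in range(e + k)]
        A.append(row)
        b.append(y * pow(a, e, p) % p)
    sol = solve_mod(A, b, p)
    if sol is None:
        return None
    E, Q = sol[:e] + [1], sol[e:]
    # Step 2: divide, then check the distance to the received word.
    P, rem = poly_divmod_monic(Q, E, p)
    if any(rem):
        return None
    errors = sum(poly_eval(P, a, p) != y % p for a, y in zip(alphas, ys))
    return P if errors <= e else None


# Assumed toy data: P(x) = 5 - x over GF(7), with the value at x = 3 corrupted.
decoded = berlekamp_welch([1, 2, 3, 4], [4, 3, 4, 1], k=2, e=1, p=7)
print(decoded)  # coefficients of 5 + 6x, i.e. 5 - x modulo 7
```

Fixing the leading coefficient of ${\displaystyle E}$ to 1 is what makes the system inhomogeneous in this sketch; an alternative is to solve the homogeneous system for any non-zero solution and normalize afterwards.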

Whenever the algorithm does not output a failure, it outputs a ${\displaystyle P(X)}$ that is the correct and desired polynomial. To prove that the algorithm always outputs the desired polynomial, we need to prove a few claims made while describing the algorithm.

Claim 1: There exists a pair of polynomials ${\displaystyle E(X)}$ and ${\displaystyle Q(X)}$ that satisfy Step 1 of the BW algorithm such that ${\displaystyle {Q(X) \over E(X)}=P(X)}$.

Let ${\displaystyle E(X)}$ be the error-locating polynomial for ${\displaystyle P(X)}$, defined as ${\displaystyle E(X)=X^{e-\Delta (y,(P(\alpha _{i}))_{i})}\prod _{1\leq i\leq n,\ y_{i}\neq P(\alpha _{i})}(X-\alpha _{i})}$, and let ${\displaystyle Q(X)=P(X)E(X)}$. The factor ${\displaystyle X^{e-\Delta (y,(P(\alpha _{i}))_{i})}}$ pads ${\displaystyle E(X)}$ so that it is a monic polynomial of degree exactly ${\displaystyle e}$. Note that ${\displaystyle \deg(Q(X))\leq \deg(P(X))+\deg(E(X))\leq e+k-1}$. By construction, ${\displaystyle E(\alpha _{i})=0}$ whenever ${\displaystyle y_{i}\neq P(\alpha _{i})}$. We can now verify that ${\displaystyle E(X)}$ and ${\displaystyle Q(X)}$ satisfy the equation ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ from the first step of the BW algorithm. If ${\displaystyle E(\alpha _{i})=0}$, then ${\displaystyle Q(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})=0=y_{i}E(\alpha _{i})}$. If ${\displaystyle E(\alpha _{i})\neq 0}$, then ${\displaystyle P(\alpha _{i})=y_{i}}$, and therefore ${\displaystyle Q(\alpha _{i})=P(\alpha _{i})E(\alpha _{i})=y_{i}E(\alpha _{i})}$, as claimed.

This claim only proves that there exists a pair of polynomials ${\displaystyle E(X)}$ and ${\displaystyle Q(X)}$ such that ${\displaystyle P(X)=Q(X)/E(X)}$. It does not guarantee that the algorithm discussed above actually outputs such a pair. We therefore turn to a second claim which, together with the first, establishes the correctness of the algorithm.

Claim 2: Any two distinct solutions ${\displaystyle (E_{1}(X),Q_{1}(X))\neq (E_{2}(X),Q_{2}(X))}$ that satisfy the first step of the Berlekamp–Welch algorithm given above also satisfy the equation ${\displaystyle {Q_{1}(X) \over E_{1}(X)}={Q_{2}(X) \over E_{2}(X)}}$.

The polynomials ${\displaystyle Q_{1}(X)E_{2}(X)}$ and ${\displaystyle Q_{2}(X)E_{1}(X)}$ each have degree at most ${\displaystyle 2e+k-1}$. We define another polynomial:

${\displaystyle R(X)=Q_{1}(X)E_{2}(X)-Q_{2}(X)E_{1}(X)}$ (i)

Note that ${\displaystyle \deg(R(X))\leq 2e+k-1}$. From step 1 of the Berlekamp–Welch algorithm we also know that, for every ${\displaystyle 1\leq i\leq n}$:

${\displaystyle y_{i}E_{1}(\alpha _{i})=Q_{1}(\alpha _{i})}$ and ${\displaystyle y_{i}E_{2}(\alpha _{i})=Q_{2}(\alpha _{i})}$ (ii)

Now, substituting the values of ${\displaystyle Q_{1}(\alpha _{i})}$ and ${\displaystyle Q_{2}(\alpha _{i})}$ from equation (ii) into equation (i), we get: ${\displaystyle R(\alpha _{i})=y_{i}E_{1}(\alpha _{i})E_{2}(\alpha _{i})-y_{i}E_{2}(\alpha _{i})E_{1}(\alpha _{i})=0}$ for ${\displaystyle 1\leq i\leq n}$.

Thus, the polynomial ${\displaystyle R(X)}$ has ${\displaystyle n}$ distinct roots, while the upper bound on ${\displaystyle e}$ gives ${\displaystyle \deg(R(X))\leq 2e+k-1<n}$. A non-zero polynomial cannot have more roots than its degree, so ${\displaystyle R(X)}$ is identically zero; that is, the polynomials ${\displaystyle Q_{1}(X)E_{2}(X)}$ and ${\displaystyle Q_{2}(X)E_{1}(X)}$ are identical. Since ${\displaystyle E_{1}(X)\neq 0}$ and ${\displaystyle E_{2}(X)\neq 0}$, dividing by ${\displaystyle E_{1}(X)E_{2}(X)}$ gives ${\displaystyle {Q_{1}(X) \over E_{1}(X)}={Q_{2}(X) \over E_{2}(X)}}$, as claimed.

Thus, based on the above claims, we can safely state that whenever the Berlekamp–Welch algorithm outputs a polynomial ${\displaystyle P(X)}$, that output is correct.
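The degree bound used in this step can be spelled out as follows. Since ${\displaystyle e}$, ${\displaystyle n}$ and ${\displaystyle k}$ are integers:

${\displaystyle e<{n-k+1 \over 2}\;\Rightarrow \;2e<n-k+1\;\Rightarrow \;2e\leq n-k\;\Rightarrow \;2e+k-1\leq n-1<n}$

For instance, with the illustrative parameters ${\displaystyle n=4}$, ${\displaystyle k=2}$ and ${\displaystyle e=1}$, the bound gives ${\displaystyle \deg(R(X))\leq 3}$, so a non-zero ${\displaystyle R(X)}$ could have at most 3 roots and cannot vanish at all 4 evaluation points.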

We can now claim that the algorithm can be implemented such that it has a running time of ${\displaystyle O(n^{3})}$. This can be proved as follows. In Step 1 of the algorithm, the polynomials ${\displaystyle Q(X)}$ and ${\displaystyle E(X)}$ have ${\displaystyle e+k}$ and ${\displaystyle e+1}$ unknown coefficients respectively, and each constraint ${\displaystyle y_{i}E(\alpha _{i})=Q(\alpha _{i})}$ for ${\displaystyle 1\leq i\leq n}$ is a linear equation in these unknowns. We therefore get a system of ${\displaystyle n}$ linear equations in ${\displaystyle 2e+k+1<n+2}$ unknowns. By our first claim, this system has a solution in which ${\displaystyle E(X)}$ has degree ${\displaystyle e}$. The system can be solved in ${\displaystyle O(n^{3})}$ time, for example by Gaussian elimination. Finally, Step 2 of the algorithm can also be implemented within ${\displaystyle O(n^{3})}$ time by polynomial long division. Hence the Berlekamp–Welch algorithm can be used to uniquely decode any ${\displaystyle [n,k]_{q}}$ Reed–Solomon code in ${\displaystyle O(n^{3})}$ time for fewer than ${\displaystyle {n-k+1 \over 2}}$ errors.
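To make the size of the linear system concrete, here is a small sketch that counts equations and unknowns for given code parameters. The function name `bw_system_size` is hypothetical, and the choice ${\displaystyle e=\lfloor (n-k)/2\rfloor }$ is simply the largest integer satisfying ${\displaystyle e<{n-k+1 \over 2}}$.

```python
def bw_system_size(n, k):
    """Return (e, equations, unknowns) for the Step 1 linear system.

    e is the largest integer with e < (n - k + 1) / 2.
    Q contributes e + k unknown coefficients and E contributes e + 1.
    """
    e = (n - k) // 2
    unknowns = (e + k) + (e + 1)
    return e, n, unknowns


# Toy parameters matching a 4-point, degree-1 example.
print(bw_system_size(4, 2))
# Parameters of a [255, 223] code, used here only as an illustration.
print(bw_system_size(255, 223))
```

In both cases the number of unknowns stays below ${\displaystyle n+2}$, so Gaussian elimination on the resulting matrix takes a number of field operations cubic in ${\displaystyle n}$.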

## Example

The error locator polynomial serves to "neutralize" errors in the received values by making ${\displaystyle Q}$ zero at those points, so that the system of linear equations is not affected by the inaccuracy in the input.

Consider a simple example where a redundant set of points is used to represent the line ${\displaystyle y=5-x}$, and one of the points is incorrect. The points that the algorithm gets as an input are ${\displaystyle (1,4),(2,3),(3,4),(4,1)}$, where ${\displaystyle (3,4)}$ is the defective point. The algorithm must solve the following system of equations:

${\displaystyle {\begin{alignedat}{1}Q(1)&=4\cdot E(1)\\Q(2)&=3\cdot E(2)\\Q(3)&=4\cdot E(3)\\Q(4)&=1\cdot E(4)\end{alignedat}}}$

Given a solution ${\displaystyle Q}$ and ${\displaystyle E}$ to this system of equations, at each of the points ${\displaystyle x_{i}\in \{1,2,3,4\}}$ one of the following must be true: either ${\displaystyle Q(x_{i})=E(x_{i})=0}$, or ${\displaystyle P(x_{i})={Q(x_{i}) \over E(x_{i})}=y_{i}}$. Since ${\displaystyle E}$ is defined as having degree one, the former can be true at only one point. Therefore, ${\displaystyle P(x_{i})}$ must equal ${\displaystyle y_{i}}$ at the three other points.

Letting ${\displaystyle E(x)=x+e_{0}}$ and ${\displaystyle Q(x)=q_{0}+q_{1}x+q_{2}x^{2}}$ and bringing ${\displaystyle E(x)}$ to the left, we can rewrite the system thus:

${\displaystyle {\begin{alignedat}{10}q_{0}&+&q_{1}&+&q_{2}&-&4e_{0}&-&4&=&0\\q_{0}&+&2q_{1}&+&4q_{2}&-&3e_{0}&-&6&=&0\\q_{0}&+&3q_{1}&+&9q_{2}&-&4e_{0}&-&12&=&0\\q_{0}&+&4q_{1}&+&16q_{2}&-&e_{0}&-&4&=&0\end{alignedat}}}$

This system can be solved through Gaussian elimination, and gives the values:

${\displaystyle q_{0}=-15,q_{1}=8,q_{2}=-1,e_{0}=-3}$

Thus, ${\displaystyle Q(x)=-x^{2}+8x-15,E(x)=x-3}$. Dividing the two gives:

${\displaystyle {Q(x) \over E(x)}=P(x)=5-x}$

The polynomial ${\displaystyle 5-x}$ fits three of the four given points, so it is the most likely to be the original polynomial.
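The arithmetic of this example can be reproduced with exact rational numbers. The sketch below is only an illustration: it assumes the unknown ordering ${\displaystyle (q_{0},q_{1},q_{2},e_{0})}$, uses Python's standard `fractions` module, and performs plain Gauss–Jordan elimination on the four equations above, written as ${\displaystyle q_{0}+q_{1}x+q_{2}x^{2}-y\,e_{0}=y\,x}$.

```python
from fractions import Fraction

points = [(1, 4), (2, 3), (3, 4), (4, 1)]

# Augmented matrix: columns are q0, q1, q2, e0, then the right-hand side y*x.
M = [[Fraction(v) for v in (1, x, x * x, -y, y * x)] for x, y in points]

size = len(M)
for col in range(size):
    # Find a row with a non-zero entry in this column and move it into place.
    pivot = next(r for r in range(col, size) if M[r][col] != 0)
    M[col], M[pivot] = M[pivot], M[col]
    pivot_value = M[col][col]
    M[col] = [v / pivot_value for v in M[col]]
    # Eliminate this column from every other row.
    for r in range(size):
        if r != col:
            factor = M[r][col]
            M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]

q0, q1, q2, e0 = (row[-1] for row in M)
print(q0, q1, q2, e0)

# Count how many input points the recovered line P(x) = 5 - x fits.
matches = sum(1 for x, y in points if 5 - x == y)
print(matches)
```

Exact fractions are used here to avoid floating-point rounding; over a finite field the same elimination would be carried out with modular arithmetic instead.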