# Zemor's decoding algorithm

In coding theory, Zemor's algorithm, designed and developed by Gilles Zemor,[1] is a recursive low-complexity approach to code construction. It is an improvement over the algorithm of Sipser and Spielman.

Zemor considered a typical class of Sipser–Spielman construction of expander codes, where the underlying graph is bipartite graph. Sipser and Spielman introduced a constructive family of asymptotically good linear-error codes together with a simple parallel algorithm that will always remove a constant fraction of errors. The article is based on Dr. Venkatesan Guruswami's course notes [2]

## Code construction

Zemor's algorithm is based on a type of expander graphs called Tanner graph. The construction of code was first proposed by Tanner.[3] The codes are based on double cover ${\displaystyle d}$, regular expander ${\displaystyle G}$, which is a bipartite graph. ${\displaystyle G}$ =${\displaystyle \left(V,E\right)}$, where ${\displaystyle V}$ is the set of vertices and ${\displaystyle E}$ is the set of edges and ${\displaystyle V}$ = ${\displaystyle A}$ ${\displaystyle \cup }$ ${\displaystyle B}$ and ${\displaystyle A}$ ${\displaystyle \cap }$ ${\displaystyle B}$ = ${\displaystyle \emptyset }$, where ${\displaystyle A}$ and ${\displaystyle B}$ denotes the set of 2 vertices. Let ${\displaystyle n}$ be the number of vertices in each group, i.e, ${\displaystyle |A|=|B|=n}$. The edge set ${\displaystyle E}$ be of size ${\displaystyle N}$ =${\displaystyle nd}$ and every edge in ${\displaystyle E}$ has one endpoint in both ${\displaystyle A}$ and ${\displaystyle B}$. ${\displaystyle E(v)}$ denotes the set of edges containing ${\displaystyle v}$.

Assume an ordering on ${\displaystyle V}$, therefore ordering will be done on every edges of ${\displaystyle E(v)}$ for every ${\displaystyle v\in V}$. Let finite field ${\displaystyle \mathbb {F} =GF(2)}$, and for a word ${\displaystyle x=(x_{e}),e\in E}$ in ${\displaystyle \mathbb {F} ^{N}}$, let the subword of the word will be indexed by ${\displaystyle E(v)}$. Let that word be denoted by ${\displaystyle (x)_{v}}$. The subset of vertices ${\displaystyle A}$ and ${\displaystyle B}$ induces every word ${\displaystyle x\in \mathbb {F} ^{N}}$ a partition into ${\displaystyle n}$ non-overlapping sub-words ${\displaystyle \left(x\right)_{v}\in \mathbb {F} ^{d}}$, where ${\displaystyle v}$ ranges over the elements of ${\displaystyle A}$. For constructing a code ${\displaystyle C}$, consider a linear subcode ${\displaystyle C_{o}}$, which is a ${\displaystyle [d,r_{o}d,\delta ]}$ code, where ${\displaystyle q}$, the size of the alphabet is ${\displaystyle 2}$. For any vertex ${\displaystyle v\in V}$, let ${\displaystyle v(1),v(2),\ldots ,v(d)}$ be some ordering of the ${\displaystyle d}$ vertices of ${\displaystyle E}$ adjacent to ${\displaystyle v}$. In this code, each bit ${\displaystyle x_{e}}$ is linked with an edge ${\displaystyle e}$ of ${\displaystyle E}$.

We can define the code ${\displaystyle C}$ to be the set of binary vectors ${\displaystyle x=\left(x_{1},x_{2},\ldots ,x_{N}\right)}$ of ${\displaystyle \{0,1\}^{N}}$ such that, for every vertex ${\displaystyle v}$ of ${\displaystyle V}$, ${\displaystyle \left(x_{v(1)},x_{v(2)},\ldots ,x_{v(d)}\right)}$ is a code word of ${\displaystyle C_{o}}$. In this case, we can consider a special case when every vertex of ${\displaystyle E}$ is adjacent to exactly ${\displaystyle 2}$ vertices of ${\displaystyle V}$. It means that ${\displaystyle V}$ and ${\displaystyle E}$ make up, respectively, the vertex set and edge set of ${\displaystyle d}$ regular graph ${\displaystyle G}$.

Let us call the code ${\displaystyle C}$ constructed in this way as ${\displaystyle \left(G,C_{o}\right)}$ code. For a given graph ${\displaystyle G}$ and a given code ${\displaystyle C_{o}}$, there are several ${\displaystyle \left(G,C_{o}\right)}$ codes as there are different ways of ordering edges incident to a given vertex ${\displaystyle v}$, i.e., ${\displaystyle v(1),v(2),\ldots ,v(d)}$. In fact our code ${\displaystyle C}$ consist of all codewords such that ${\displaystyle x_{v}\in C_{o}}$ for all ${\displaystyle v\in A,B}$. The code ${\displaystyle C}$ is linear ${\displaystyle [N,K,D]}$ in ${\displaystyle \mathbb {F} }$ as it is generated from a subcode ${\displaystyle C_{o}}$, which is linear. The code ${\displaystyle C}$ is defined as ${\displaystyle C=\{c\in \mathbb {F} ^{N}:(c)_{v}\in C_{o}\}}$ for every ${\displaystyle v\in V}$.

Graph G and code C

In matrix ${\displaystyle G}$, let ${\displaystyle \lambda }$ is equal to the second largest eigen value of adjacency matrix of ${\displaystyle G}$. Here the largest eigen value is ${\displaystyle d}$. Two important claims are made:

### Claim 1

${\displaystyle \left({\dfrac {K}{N}}\right)\geq 2r_{o}-1}$
. Let ${\displaystyle R}$ be the rate of a linear code constructed from a bipartite graph whose digit nodes have degree ${\displaystyle m}$ and whose subcode nodes have degree ${\displaystyle n}$. If a single linear code with parameters ${\displaystyle \left(n,k\right)}$ and rate ${\displaystyle r=\left({\dfrac {k}{n}}\right)}$ is associated with each of the subcode nodes, then ${\displaystyle k\geq 1-\left(1-r\right)m}$.

#### Proof

Let ${\displaystyle R}$ be the rate of the linear code, which is equal to ${\displaystyle K/N}$ Let there are ${\displaystyle S}$ subcode nodes in the graph. If the degree of the subcode is ${\displaystyle n}$, then the code must have ${\displaystyle \left({\dfrac {n}{m}}\right)S}$ digits, as each digit node is connected to ${\displaystyle m}$ of the ${\displaystyle \left(n\right)S}$ edges in the graph. Each subcode node contributes ${\displaystyle (n-k)}$ equations to parity check matrix for a total of ${\displaystyle \left(n-k\right)S}$. These equations may not be linearly independent. Therefore, ${\displaystyle \left({\dfrac {K}{N}}\right)\geq \left({\dfrac {({\dfrac {n}{m}})S-(n-k)S}{({\dfrac {n}{m}})S}}\right)}$
${\displaystyle \geq 1-m\left({\dfrac {n-k}{n}}\right)}$
${\displaystyle \geq 1-m\left(1-r\right)}$, Since the value of ${\displaystyle m}$,i.e., the digit node of this bipartite graph is ${\displaystyle 2}$ and here ${\displaystyle r=r_{o}}$, we can write as:
${\displaystyle \left({\dfrac {K}{N}}\right)\geq 2r_{o}-1}$

### Claim 2

${\displaystyle D\geq N\left({\dfrac {(\delta -({\dfrac {\lambda }{d}}))}{(1-({\dfrac {\lambda }{d}})}})\right)^{2}}$
${\displaystyle =N\left(\delta ^{2}-O\left({\dfrac {\lambda }{d}}\right)\right)}$ ${\displaystyle \rightarrow (1)}$

If ${\displaystyle S}$ is linear code of rate ${\displaystyle r}$, block code length ${\displaystyle d}$, and minimum relative distance ${\displaystyle \delta }$, and if ${\displaystyle B}$ is the edge vertex incidence graph of a ${\displaystyle d}$ – regular graph with second largest eigen value ${\displaystyle \lambda }$, then the code ${\displaystyle C(B,S)}$ has rate at least ${\displaystyle 2r_{o}-1}$ and minimum relative distance at least ${\displaystyle \left(\left({\dfrac {\delta -\left({\dfrac {\lambda }{d}}\right)}{1-\left({\dfrac {\lambda }{d}}\right)}}\right)\right)^{2}}$.

#### Proof

Let ${\displaystyle B}$ be derived from the ${\displaystyle d}$ regular graph ${\displaystyle G}$. So, the number of variables of ${\displaystyle C(B,S)}$ is ${\displaystyle \left({\dfrac {dn}{2}}\right)}$ and the number of constraints is ${\displaystyle n}$. According to Alon - Chung,[4] if ${\displaystyle X}$ is a subset of vertices of ${\displaystyle G}$ of size ${\displaystyle \gamma n}$, then the number of edges contained in the subgraph is induced by ${\displaystyle X}$ in ${\displaystyle G}$ is at most ${\displaystyle \left({\dfrac {dn}{2}}\right)\left(\gamma ^{2}+({\dfrac {\lambda }{d}})\gamma \left(1-\gamma \right)\right)}$.

In matrix ${\displaystyle G}$, we can assume that ${\displaystyle \lambda /d}$ is bounded away from ${\displaystyle 1}$. For those values of ${\displaystyle d}$ in which ${\displaystyle d-1}$ is odd prime, there are explicit constructions of sequences of ${\displaystyle d}$ - regular bipartite graphs with arbitrarily large number of vertices such that each graph ${\displaystyle G}$ in the sequence is a Ramanujan graph. It is called Ramanujan graph as it satisfies the inequality ${\displaystyle \lambda (G)\leq 2{\sqrt {d-1}}}$. Certain expansion properties are visible in graph ${\displaystyle G}$ as the separation between the eigen values ${\displaystyle d}$ and ${\displaystyle \lambda }$. If the graph ${\displaystyle G}$ is Ramanujan graph, then that expression ${\displaystyle (1)}$ will become ${\displaystyle 0}$ eventually as ${\displaystyle d}$ becomes large.

## Zemor's algorithm

The iterative decoding algorithm written below alternates between the vertices ${\displaystyle A}$ and ${\displaystyle B}$ in ${\displaystyle G}$ and corrects the codeword of ${\displaystyle C_{o}}$ in ${\displaystyle A}$ and then it switches to correct the codeword ${\displaystyle C_{o}}$ in ${\displaystyle B}$. Here edges associated with a vertex on one side of a graph are not incident to other vertex on that side. In fact, it doesn't matter in which order, the set of nodes ${\displaystyle A}$ and ${\displaystyle B}$ are processed. The vertex processing can also be done in parallel.

The decoder ${\displaystyle \mathbb {D} :\mathbb {F} ^{d}\rightarrow C_{o}}$stands for a decoder for ${\displaystyle C_{o}}$ that recovers correctly with any codewords with less than ${\displaystyle \left({\dfrac {d}{2}}\right)}$ errors.

### Decoder algorithm

Received word : ${\displaystyle w=(w_{e}),e\in E}$
 ${\displaystyle z\leftarrow w}$ For ${\displaystyle t\leftarrow 1}$ to ${\displaystyle m}$ do //${\displaystyle m}$ is the number of iterations { if (${\displaystyle t}$ is odd) // Here the algorithm will alternate between its two vertex sets. ${\displaystyle X\leftarrow A}$ else ${\displaystyle X\leftarrow B}$ Iteration ${\displaystyle t}$: For every ${\displaystyle v\in X}$, let ${\displaystyle (z)_{v}\leftarrow \mathbb {D} ((z)_{v})}$ // Decoding ${\displaystyle z_{v}}$ to its nearest codeword. }  Output: ${\displaystyle z}$

### Explanation of the algorithm

Since ${\displaystyle G}$ is bipartite, the set ${\displaystyle A}$ of vertices induces the partition of the edge set ${\displaystyle E}$ = ${\displaystyle \cup _{v\in A}E_{v}}$ . The set ${\displaystyle B}$ induces another partition, ${\displaystyle E}$ = ${\displaystyle \cup _{v\in B}E_{v}}$ .

Let ${\displaystyle w\in \{0,1\}^{N}}$ be the received vector, and recall that ${\displaystyle N=dn}$. The first iteration of the algorithm consists of applying the complete decoding for the code induced by ${\displaystyle E_{v}}$ for every ${\displaystyle v\in A}$ . This means that for replacing, for every ${\displaystyle v\in A}$, the vector ${\displaystyle \left(w_{v(1)},w_{v(2)},\ldots ,w_{v(d)}\right)}$ by one of the closest codewords of ${\displaystyle C_{o}}$. Since the subsets of edges ${\displaystyle E_{v}}$ are disjoint for ${\displaystyle v\in A}$, the decoding of these ${\displaystyle n}$ subvectors of ${\displaystyle w}$ may be done in parallel.

The iteration will yield a new vector ${\displaystyle z}$. The next iteration consists of applying the preceding procedure to ${\displaystyle z}$ but with ${\displaystyle A}$ replaced by ${\displaystyle B}$. In other words, it consists of decoding all the subvectors induced by the vertices of ${\displaystyle B}$. The coming iterations repeat those two steps alternately applying parallel decoding to the subvectors induced by the vertices of ${\displaystyle A}$ and to the subvectors induced by the vertices of ${\displaystyle B}$.
Note: [If ${\displaystyle d=n}$ and ${\displaystyle G}$ is the complete bipartite graph, then ${\displaystyle C}$ is a product code of ${\displaystyle C_{o}}$ with itself and the above algorithm reduces to the natural hard iterative decoding of product codes].

Here, the number of iterations, ${\displaystyle m}$ is ${\displaystyle \left({\dfrac {(\log {n})}{\log(2-\alpha )}}\right)}$. In general, the above algorithm can correct a code word whose Hamming weight is no more than ${\displaystyle ({\dfrac {1}{2}}).\alpha N\delta \left(({\dfrac {\delta }{2}})-({\dfrac {\lambda }{d}})\right)=\left(({\dfrac {1}{4}}).\alpha N(\delta ^{2}-O({\dfrac {\lambda }{d}})\right)}$ for values of ${\displaystyle \alpha <1}$. Here, the decoding algorithm is implemented as a circuit of size ${\displaystyle O(N\log {N})}$ and depth ${\displaystyle O(\log {N})}$ that returns the codeword given that error vector has weight less than ${\displaystyle \alpha N\delta ^{2}(1-\epsilon )/4}$ .

### Theorem

If ${\displaystyle G}$ is a Ramanujan graph of sufficiently high degree, for any ${\displaystyle \alpha <1}$, the decoding algorithm can correct ${\displaystyle ({\dfrac {\alpha \delta _{o}^{2}}{4}})(1-\in )N}$ errors, in ${\displaystyle O(\log {n})}$ rounds ( where the big- ${\displaystyle O}$ notation hides a dependence on ${\displaystyle \alpha }$). This can be implemented in linear time on a single processor; on ${\displaystyle n}$ processors each round can be implemented in constant time.

#### Proof

Since the decoding algorithm is insensitive to the value of the edges and by linearity, we can assume that the transmitted codeword is the all zeros - vector. Let the received codeword be ${\displaystyle w}$. The set of edges which has an incorrect value while decoding is considered. Here by incorrect value, we mean ${\displaystyle 1}$ in any of the bits. Let ${\displaystyle w=w^{0}}$ be the initial value of the codeword, ${\displaystyle w^{1},w^{2},\ldots ,w^{t}}$ be the values after first, second . . . ${\displaystyle t}$ stages of decoding. Here, ${\displaystyle X^{i}={e\in E|x_{e}^{i}=1}}$, and ${\displaystyle S^{i}={v\in V^{i}|E_{v}\cap X^{i+1}!=\emptyset }}$. Here ${\displaystyle S^{i}}$ corresponds to those set of vertices that was not able to successfully decode their codeword in the ${\displaystyle i^{th}}$ round. From the above algorithm ${\displaystyle S^{1} as number of unsuccessful vertices will be corrected in every iteration. We can prove that ${\displaystyle S^{0}>S^{1}>S^{2}>\cdots }$is a decreasing sequence. In fact, ${\displaystyle |S_{i+1}|<=({\dfrac {1}{2-\alpha }})|S_{i}|}$. As we are assuming, ${\displaystyle \alpha <1}$, the above equation is in a geometric decreasing sequence. So, when ${\displaystyle |S_{i}|, more than ${\displaystyle log_{2-\alpha }n}$ rounds are necessary. Furthermore, ${\displaystyle \sum |S_{i}|=n\sum ({\dfrac {1}{(2-\alpha )^{i}}})=O(n)}$, and if we implement the ${\displaystyle i^{th}}$ round in ${\displaystyle O(|S_{i}|)}$ time, then the total sequential running time will be linear.

## Drawbacks of Zemor's algorithm

1. It is lengthy process as the number of iterations ${\displaystyle m}$ in decoder algorithm takes is ${\displaystyle [(\log {n})/(\log(2-\alpha ))]}$
2. Zemor's decoding algorithm finds it difficult to decode erasures. A detailed way of how we can improve the algorithm is

given in.[5]