# Decoding methods

In coding theory, decoding is the process of translating received messages into codewords of a given code. There are many common methods of mapping messages to codewords. These methods are often used to recover messages sent over a noisy channel, such as a binary symmetric channel.

## Notation

Henceforth, ${\displaystyle C\subset \mathbb {F} _{2}^{n}}$ shall be a code of length ${\displaystyle n}$; ${\displaystyle x,y}$ shall be elements of ${\displaystyle \mathbb {F} _{2}^{n}}$; and ${\displaystyle d(x,y)}$ shall be the Hamming distance between ${\displaystyle x}$ and ${\displaystyle y}$.

## Ideal observer decoding

Given a received message ${\displaystyle x\in \mathbb {F} _{2}^{n}}$, ideal observer decoding picks the codeword ${\displaystyle y\in C}$ that maximises:

${\displaystyle \mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})}$

i.e. choose the codeword ${\displaystyle y}$ that is most likely to have been sent, given that the message ${\displaystyle x}$ was received.
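This rule can be sketched in Python for a binary symmetric channel. The function names, the toy code, the prior, and the crossover probability below are all illustrative assumptions, not part of the original text; the posterior is computed up to a constant factor via Bayes's theorem as likelihood times prior.

```python
def bsc_likelihood(x, y, p):
    """P(x received | y sent) on a binary symmetric channel with crossover p."""
    d = sum(xi != yi for xi, yi in zip(x, y))
    return (p ** d) * ((1 - p) ** (len(x) - d))

def ideal_observer_decode(x, code, prior, p):
    """Pick the codeword maximising P(y sent | x received),
    which is proportional to P(x received | y sent) * P(y sent)."""
    return max(code, key=lambda y: bsc_likelihood(x, y, p) * prior[y])

# Toy two-word code with a non-uniform prior (illustrative values).
code = [(0, 0, 0), (1, 1, 1)]
prior = {(0, 0, 0): 0.9, (1, 1, 1): 0.1}

# (1, 1, 0) is closer to (1, 1, 1), but the strong prior on (0, 0, 0) wins.
print(ideal_observer_decode((1, 1, 0), code, prior, p=0.2))  # → (0, 0, 0)
```

Note how the prior can override proximity: a pure nearest-neighbour rule would have returned `(1, 1, 1)` here.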

### Decoding conventions

Decoding need not be unique: there may be more than one codeword with an equal likelihood of mutating into the received message. In such a case, the sender and receiver(s) must agree ahead of time on a decoding convention. Popular conventions include:

1. Request that the codeword be resent (automatic repeat-request).
2. Choose any codeword from the set of most likely codewords which is nearest to ${\displaystyle x}$.

## Maximum likelihood decoding

Given a received codeword ${\displaystyle x\in \mathbb {F} _{2}^{n}}$ maximum likelihood decoding picks a codeword ${\displaystyle y\in C}$ to maximize:

${\displaystyle \mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})}$

i.e. choose the codeword ${\displaystyle y}$ that maximizes the probability that ${\displaystyle x}$ was received, given that ${\displaystyle y}$ was sent. Note that if all codewords are equally likely to be sent then this scheme is equivalent to ideal observer decoding. In fact, by Bayes's theorem we have

${\displaystyle {\begin{aligned}\mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})&{}={\frac {\mathbb {P} (x{\mbox{ received}},y{\mbox{ sent}})}{\mathbb {P} (y{\mbox{ sent}})}}\\&{}=\mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})\cdot {\frac {\mathbb {P} (x{\mbox{ received}})}{\mathbb {P} (y{\mbox{ sent}})}}.\end{aligned}}}$

Since ${\displaystyle x}$ is fixed (it is the received word), ${\displaystyle \mathbb {P} (x{\mbox{ received}})}$ is fixed, and ${\displaystyle \mathbb {P} (y{\mbox{ sent}})}$ is constant because all codewords are equally likely to be sent. Therefore ${\displaystyle \mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})}$ is maximised as a function of the variable ${\displaystyle y}$ precisely when ${\displaystyle \mathbb {P} (y{\mbox{ sent}}\mid x{\mbox{ received}})}$ is maximised, and the claim follows.

As with ideal observer decoding, a convention must be agreed to for non-unique decoding.
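On a binary symmetric channel, maximum likelihood decoding can be sketched as a brute-force search over the code. The function names, the toy code, and the crossover probability are illustrative assumptions; the likelihood ${\displaystyle p^{d}(1-p)^{n-d}}$ follows from ${\displaystyle d}$ independent bit flips.

```python
def hamming_distance(x, y):
    """d(x, y) = number of positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def ml_decode(x, code, p):
    """Maximum likelihood decoding on a BSC with crossover probability p:
    maximise P(x received | y sent) = p**d * (1 - p)**(n - d), d = d(x, y)."""
    n = len(x)
    def likelihood(y):
        d = hamming_distance(x, y)
        return p ** d * (1 - p) ** (n - d)
    return max(code, key=likelihood)

code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 1, 1)]
print(ml_decode((0, 1, 1, 1), code, p=0.1))  # → (1, 1, 1, 1)
```

When several codewords attain the maximum likelihood, `max` silently returns the first one in list order; a real implementation would apply the agreed decoding convention instead.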

The ML decoding problem can also be modeled as an integer programming problem.[1]

The ML decoding algorithm has been found to be an instance of the marginalize a product function (MPF) problem, which is solved by applying the generalized distributive law.[2]

## Minimum distance decoding

Given a received codeword ${\displaystyle x\in \mathbb {F} _{2}^{n}}$, minimum distance decoding picks a codeword ${\displaystyle y\in C}$ to minimise the Hamming distance:

${\displaystyle d(x,y)=\#\{i:x_{i}\not =y_{i}\}}$

i.e. choose the codeword ${\displaystyle y}$ that is as close as possible to ${\displaystyle x}$.
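The nearest-neighbour rule is a one-line search. The function names and the example code below are illustrative assumptions; the length-5 repetition code is used only because its behaviour is easy to check by hand.

```python
def hamming_distance(x, y):
    """d(x, y) = #{i : x_i != y_i}."""
    return sum(a != b for a, b in zip(x, y))

def minimum_distance_decode(x, code):
    """Nearest-neighbour decoding: pick the codeword closest to x."""
    return min(code, key=lambda y: hamming_distance(x, y))

# The length-5 binary repetition code has minimum distance 5,
# so it corrects up to floor((5 - 1) / 2) = 2 bit errors.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
print(minimum_distance_decode((1, 0, 1, 1, 0), code))  # → (1, 1, 1, 1, 1)
```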

Note that if the probability of error on a discrete memoryless channel ${\displaystyle p}$ is strictly less than one half, then minimum distance decoding is equivalent to maximum likelihood decoding, since if

${\displaystyle d(x,y)=d,\,}$

then:

${\displaystyle {\begin{aligned}\mathbb {P} (x{\mbox{ received}}\mid y{\mbox{ sent}})&{}=(1-p)^{n-d}\cdot p^{d}\\&{}=(1-p)^{n}\cdot \left({\frac {p}{1-p}}\right)^{d}\\\end{aligned}}}$

which (since p is less than one half) is maximised by minimising d.
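The equivalence can be checked exhaustively on a small example. The helper names and the toy code are illustrative assumptions; the ML likelihood is written in the factored form ${\displaystyle (1-p)^{n}(p/(1-p))^{d}}$ from the derivation above, so that it is visibly a decreasing function of ${\displaystyle d}$ when ${\displaystyle p<1/2}$.

```python
from itertools import product

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def ml_decode(x, code, p):
    """Maximise (1-p)**n * (p/(1-p))**d over the code; decreasing in d for p < 1/2."""
    n = len(x)
    return max(code, key=lambda y: (1 - p) ** n * (p / (1 - p)) ** hamming_distance(x, y))

def min_distance_decode(x, code):
    return min(code, key=lambda y: hamming_distance(x, y))

code = [(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1)]
# For every possible received word, both rules pick the same codeword.
for x in product((0, 1), repeat=5):
    assert ml_decode(x, code, p=0.2) == min_distance_decode(x, code)
print("ML and minimum distance decoding agree for p < 1/2")
```

Both `max` and `min` break ties by returning the first optimal codeword in list order, so the two rules also agree on non-unique cases here; in general, ties are handled by the agreed decoding convention.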

Minimum distance decoding is also known as nearest neighbour decoding. It can be assisted or automated by using a standard array. Minimum distance decoding is a reasonable decoding method when the following conditions are met:

1. The probability ${\displaystyle p}$ that an error occurs is independent of the position of the symbol
2. Errors are independent events - an error at one position in the message does not affect other positions

These assumptions may be reasonable for transmissions over a binary symmetric channel. They may be unreasonable for other media, such as a DVD, where a single scratch on the disk can cause an error in many neighbouring symbols or codewords.

As with other decoding methods, a convention must be agreed to for non-unique decoding.

## Syndrome decoding

Syndrome decoding is a highly efficient method of decoding a linear code over a noisy channel - i.e. one on which errors are made. In essence, syndrome decoding is minimum distance decoding using a reduced lookup table. It is the linearity of the code which allows the lookup table to be reduced in size.

Suppose that ${\displaystyle C\subset \mathbb {F} _{2}^{n}}$ is a linear code of length ${\displaystyle n}$ and minimum distance ${\displaystyle d}$ with parity-check matrix ${\displaystyle H}$. Then clearly ${\displaystyle C}$ is capable of correcting up to

${\displaystyle t=\left\lfloor {\frac {d-1}{2}}\right\rfloor }$

errors made by the channel (since if no more than ${\displaystyle t}$ errors are made then minimum distance decoding will still correctly decode the incorrectly transmitted codeword).

Now suppose that a codeword ${\displaystyle x\in \mathbb {F} _{2}^{n}}$ is sent over the channel and the error pattern ${\displaystyle e\in \mathbb {F} _{2}^{n}}$ occurs. Then ${\displaystyle z=x+e}$ is received. Ordinary minimum distance decoding would look up the vector ${\displaystyle z}$ in a table of size ${\displaystyle |C|}$ for the nearest match - i.e. an element (not necessarily unique) ${\displaystyle c\in C}$ with

${\displaystyle d(c,z)\leq d(y,z)}$

for all ${\displaystyle y\in C}$. Syndrome decoding takes advantage of the property of the parity-check matrix that:

${\displaystyle Hx=0}$

for all ${\displaystyle x\in C}$. The syndrome of the received ${\displaystyle z=x+e}$ is defined to be:

${\displaystyle Hz=H(x+e)=Hx+He=0+He=He}$

Under the assumption that no more than ${\displaystyle t}$ errors were made during transmission, the receiver looks up the value ${\displaystyle He}$ in a table of size

${\displaystyle \sum _{i=0}^{t}{\binom {n}{i}}<|C|}$

(for a binary code) against pre-computed values of ${\displaystyle He}$ for all error patterns ${\displaystyle e\in \mathbb {F} _{2}^{n}}$ of weight at most ${\displaystyle t}$. Knowing what ${\displaystyle e}$ is, it is then trivial to decode ${\displaystyle x}$ as:

${\displaystyle x=z-e}$
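The whole procedure can be sketched for the ${\displaystyle [7,4,3]}$ Hamming code, which corrects ${\displaystyle t=1}$ error. The function names are illustrative assumptions; the parity-check matrix is the standard one whose ${\displaystyle i}$-th column is the binary expansion of ${\displaystyle i}$, so a single-bit error in position ${\displaystyle i}$ has that column as its syndrome. The table has ${\displaystyle \sum_{i=0}^{1}{\binom{7}{i}}=8}$ entries, against ${\displaystyle |C|=16}$ codewords.

```python
import numpy as np

# Parity-check matrix of the [7,4,3] Hamming code; column i (0-based)
# is the 3-bit binary expansion of i + 1.
H = np.array([[int(b) for b in format(i, "03b")] for i in range(1, 8)]).T

def syndrome(z):
    """Hz over F_2, as a hashable tuple."""
    return tuple(H @ z % 2)

# Pre-compute the syndrome table for all error patterns of weight <= t = 1.
table = {syndrome(np.zeros(7, dtype=int)): np.zeros(7, dtype=int)}
for i in range(7):
    e = np.zeros(7, dtype=int)
    e[i] = 1
    table[syndrome(e)] = e

def syndrome_decode(z):
    """Recover x = z - e, where e is looked up from the syndrome Hz = He."""
    e = table[syndrome(z)]
    return (z - e) % 2

x = np.array([1, 1, 1, 0, 0, 0, 0])   # a codeword: H @ x % 2 == 0
e = np.array([0, 0, 0, 0, 1, 0, 0])   # a single-bit channel error
z = (x + e) % 2
print(syndrome_decode(z))             # recovers x
```

The lookup costs one matrix-vector product and one table access, instead of comparing `z` against all 16 codewords.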

## Partial response maximum likelihood

Partial response maximum likelihood (PRML) is a method for converting the weak analog signal from the head of a magnetic disk or tape drive into a digital signal.

## Viterbi decoder

A Viterbi decoder uses the Viterbi algorithm for decoding a bitstream that has been encoded using forward error correction based on a convolutional code. The Hamming distance is used as a metric for hard decision Viterbi decoders. The squared Euclidean distance is used as a metric for soft decision decoders.
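A minimal hard-decision Viterbi decoder can be sketched for the common rate-1/2, constraint-length-3 convolutional code with generator polynomials (7, 5) in octal. The code below is an illustrative sketch, not a production decoder: the function names, the message, and the choice of code are assumptions, and trellis termination with tail bits is handled only implicitly by ending the message in zeros.

```python
def conv_encode(bits):
    """Rate-1/2 convolutional encoder, generators (7, 5) octal, memory 2."""
    s1 = s2 = 0
    out = []
    for u in bits:
        out += [u ^ s1 ^ s2, u ^ s2]   # g1 = 111, g2 = 101
        s1, s2 = u, s1
    return out

def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoding with the Hamming branch metric."""
    INF = float("inf")
    metrics = {(0, 0): 0, (0, 1): INF, (1, 0): INF, (1, 1): INF}
    paths = {s: [] for s in metrics}
    for t in range(n_bits):
        r = received[2 * t: 2 * t + 2]
        new_metrics, new_paths = {}, {}
        for (s1, s2) in metrics:
            for u in (0, 1):
                expected = [u ^ s1 ^ s2, u ^ s2]
                branch = sum(a != b for a, b in zip(expected, r))
                nxt = (u, s1)
                m = metrics[(s1, s2)] + branch
                if nxt not in new_metrics or m < new_metrics[nxt]:
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[(s1, s2)] + [u]
        metrics, paths = new_metrics, new_paths
    best = min(metrics, key=metrics.get)   # survivor with the best path metric
    return paths[best]

msg = [1, 0, 1, 1, 0, 0]        # trailing zeros flush the encoder state
tx = conv_encode(msg)
tx[3] ^= 1                      # flip one channel bit
print(viterbi_decode(tx, len(msg)))  # recovers the original message
```

A soft-decision decoder would replace the Hamming branch metric with the squared Euclidean distance between the received analog samples and the expected symbols, as noted above.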


## References

1. "Using linear programming to Decode Binary linear codes," J.Feldman, M.J.Wainwright and D.R.Karger, IEEE Transactions on Information Theory, 51:954-972, March 2005.
2. {{#invoke:Citation/CS1|citation |CitationClass=journal }}