# Wozencraft ensemble

{{ safesubst:#invoke:Unsubst||$N=Refimprove |date=__DATE__ |$B= {{#invoke:Message box|ambox}} }} In coding theory, the Wozencraft ensemble is a set of linear codes in which most of codes satisfy the Gilbert-Varshamov bound. It is named after John Wozencraft, who proved its existence. The ensemble is described by Template:Harvtxt, who attributes it to Wozencraft. Template:Harvtxt used the Wozencraft ensemble as the inner codes in his construction of strongly explicit asymptotically good code.

## Existence theorem

Here relative distance is the ratio of minimum distance to block length. And ${\displaystyle H_{q}}$ is the q-ary entropy function defined as follows: ${\displaystyle H_{q}(x)=xlog_{q}(q-1)-xlog_{q}x-(1-x)log_{q}(1-x)}$.

In fact, to show the existence of this set of linear codes, we will specify this ensemble explicitly as follows: for ${\displaystyle \alpha \in \mathbb {F} _{q^{k}}-\{0\}}$, the inner code ${\displaystyle C_{in}^{\alpha }:\mathbb {F} _{q}^{k}\to \mathbb {F} _{q}^{2k}}$, is defined as ${\displaystyle C_{in}^{\alpha }(x)=(x,\alpha x)}$. Here we can notice that ${\displaystyle x\in \mathbb {F} _{q}^{k}}$ and ${\displaystyle \alpha \in \mathbb {F} _{q^{k}}}$. We can do the multiplication ${\displaystyle \alpha x}$ since ${\displaystyle \mathbb {F} _{q}^{k}}$ is isomorphic to ${\displaystyle \mathbb {F} _{q^{k}}}$.

This ensemble is due to Wozencraft and is called the Wozencraft ensemble.

For any x and y in ${\displaystyle \mathbb {F} _{q}^{k}}$, we have the following facts:

Now we know that Wozencraft ensemble contains linear codes with rate ${\displaystyle {\frac {1}{2}}}$. In the following proof, we will show that there are at least ${\displaystyle \left({1-\varepsilon }\right)N}$ those linear codes having the relative distance ${\displaystyle \geq H_{q}^{-1}({\frac {1}{2}}-\varepsilon )}$, i.e. they meet the Gilbert-Varshamov bound.

## Proof

To prove that there are at least ${\displaystyle (1-\varepsilon )N}$ number of linear codes in the Wozencraft ensemble having relative distance ${\displaystyle \geq H_{q}^{-1}({\frac {1}{2}}-\varepsilon )}$, we will prove that there are at most ${\displaystyle \varepsilon N}$ number of linear codes having relative distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )}$ (i.e., having the distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$).

Notice that in a linear code, the distance is equal to the minimum weight of all codewords of that code. This fact is the property of linear code. So if one non-zero codeword has the weight less than ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$, then that code has the distance less than ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$.

So ${\displaystyle P}$ = the number of linear codes having the distance less than ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$ = the number of linear codes having some codeword that has the weight less than ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$.

Now we have the following claim:

Claim: Two linear codes ${\displaystyle C_{in}^{\alpha _{1}}}$ and ${\displaystyle C_{in}^{\alpha _{2}}}$ (here ${\displaystyle \alpha _{1}\neq \alpha _{2}\in \mathbb {F} _{q^{k}}-\{0\}}$) do not share any non-zero codeword.

Proof of Claim:

We prove the above claim by contradiction. Suppose there exist ${\displaystyle \alpha _{1}\neq \alpha _{2}\in \mathbb {F} _{q^{k}}-\{0\}}$ such that two linear codes ${\displaystyle C_{in}^{\alpha _{1}}}$ and ${\displaystyle C_{in}^{\alpha _{2}}}$ contain the same non-zero codeword y.

This implies ${\displaystyle \alpha _{1}=\alpha _{2}}$, which is a contradiction, which completes the proof of the claim.

Now we come back to the proof of the theorem.

With any linear code having distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$, it has some codeword that has the weight less than ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$.

Also due to the Claim, notice that no two linear code share the same non-zero codewords. This implies that if we have ${\displaystyle P}$ linear codes having distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$, then we have at least ${\displaystyle P}$ different ${\displaystyle y}$ such that ${\displaystyle wt(y)}$ < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$ (one such codeword ${\displaystyle y}$ for each linear code). Here ${\displaystyle wt(y)}$ denotes the weight of codeword ${\displaystyle y}$, which is the number of non-zero positions of ${\displaystyle y}$.

So ${\displaystyle P}$ (the number of linear codes having distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$) is less than or equal the number of non-zero ${\displaystyle y\in F_{q}^{2k}}$ such that wt(y) < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$.

Here ${\displaystyle Vol_{q}(r,n)}$ is the Volume of Hamming ball of radius r in ${\displaystyle [q]^{n}}$.

Recall the upper bound of the Volume of Hamming ball ${\displaystyle Vol_{q}(pn,n)\leq q^{H_{q}(p)n}}$ (check Bounds on the Volume of a Hamming ball for proof's detail), we have:

That means the number of linear codes having the relative distance < ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$ is less than ${\displaystyle \varepsilon N}$. So the number of linear codes having the relative distance at least ${\displaystyle H_{q}^{-1}({\frac {1}{2}}-\varepsilon )\cdot 2k}$ is greater than ${\displaystyle N-\varepsilon N=(1-\varepsilon )N}$, which completes the proof.