Good–Turing frequency estimation

Revision as of 20:52, 14 October 2013 by en>DarwinPeacock (trying to clarify intro sentence)

The leftover hash lemma is a lemma in cryptography first stated by Russell Impagliazzo, Leonid Levin, and Michael Luby.

Imagine that you have a secret key $X$ that has $n$ uniform random bits, and you would like to use this secret key to encrypt a message. Unfortunately, you were a bit careless with the key, and you know that an adversary was able to learn about $t < n$ bits of that key, but you do not know which bits. Can you still use your key, or do you have to throw it away and choose a new key? The leftover hash lemma tells us that we can produce a key of almost $n - t$ bits, over which the adversary has almost no knowledge. Since the adversary knows all but $n - t$ bits, this is almost optimal.

More precisely, the leftover hash lemma tells us that we can extract about $H_\infty(X)$ bits (the min-entropy of $X$) from a random variable $X$ such that the extracted bits are almost uniformly distributed. In other words, an adversary who has some partial knowledge about $X$ will have almost no knowledge about the extracted value. This is why the technique is also called privacy amplification (see the privacy amplification section of the article Quantum key distribution).

Randomness extractors achieve the same result, but use (normally) less randomness.

Leftover hash lemma

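Let $X$ be a random variable over $\mathcal{X}$ and let $m > 0$. Let $h\colon \mathcal{S} \times \mathcal{X} \to \{0,1\}^m$ be a 2-universal hash function. If

$$m \le H_\infty(X) - 2\log\left(\frac{1}{\varepsilon}\right)$$

then for $S$ uniform over $\mathcal{S}$ and independent of $X$, we have

$$\delta\left[(h(S,X),S),\,(U,S)\right] \le \varepsilon$$

where $U$ is uniform over $\{0,1\}^m$ and independent of $S$.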
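
The statement can be checked exhaustively for very small parameters. The following is a minimal sketch, not a reference implementation. It assumes one particular 2-universal family (multiplication by a random binary matrix over GF(2), with the seed being the matrix), a source that is uniform over an arbitrarily chosen 16-element subset of 6-bit strings, and $\varepsilon = 1/2$. The names `hash_gf2`, `joint_distance`, and `support` are illustrative only.

```python
import itertools
import math

def hash_gf2(rows, x):
    """Hash x with the binary matrix whose rows are the integers in `rows`.

    Output bit i is the parity of (rows[i] AND x), i.e. a matrix-vector
    product over GF(2). For x != x', a uniformly random matrix gives a
    collision with probability exactly 2**-m, so the family is 2-universal.
    """
    out = 0
    for row in rows:
        out = (out << 1) | (bin(row & x).count("1") & 1)
    return out

def joint_distance(n, m, support):
    """Exact statistical distance between (h(S,X), S) and (U, S).

    X is assumed uniform over `support`; S ranges over all m-by-n binary
    matrices. Because S is uniform and independent in both pairs, the joint
    distance equals the average over seeds of the per-seed distance.
    """
    seeds = list(itertools.product(range(2 ** n), repeat=m))
    total = 0.0
    for rows in seeds:
        counts = [0] * (2 ** m)
        for x in support:
            counts[hash_gf2(rows, x)] += 1
        total += 0.5 * sum(abs(c / len(support) - 2 ** -m) for c in counts)
    return total / len(seeds)

# Assumed toy source: uniform over 16 distinct 6-bit values, so H_inf(X) = 4.
n = 6
support = [3, 5, 6, 9, 10, 12, 17, 18, 20, 24, 33, 34, 36, 40, 48, 63]
h_min = math.log2(len(support))
eps = 0.5

# Largest output length allowed by the lemma: m <= H_inf(X) - 2*log2(1/eps).
m = math.floor(h_min - 2 * math.log2(1 / eps))
distance = joint_distance(n, m, support)
print(m, distance, distance <= eps)
```

The exhaustive loop over all $2^{mn}$ seeds is only feasible for toy sizes like these; it is meant to illustrate what the lemma quantifies over, not how extraction is done in practice.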

$H_\infty(X) = -\log \max_x \Pr[X = x]$ is the min-entropy of $X$, which measures the amount of randomness $X$ has. The min-entropy is always less than or equal to the Shannon entropy. Note that $\max_x \Pr[X = x]$ is the probability of correctly guessing $X$. (The best guess is to guess the most probable value.) Therefore, the min-entropy measures how difficult it is to guess $X$.
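
As a small illustration of the definition, the sketch below computes the min-entropy and the Shannon entropy (both in bits) of an assumed four-outcome distribution and confirms that the former does not exceed the latter. The distribution `probs` and the function names are made up for this example.

```python
import math

def min_entropy(probs):
    """H_inf(X) = -log2(max_x Pr[X = x]), in bits."""
    return -math.log2(max(probs))

def shannon_entropy(probs):
    """H(X) = -sum_x Pr[X = x] * log2(Pr[X = x]), in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example distribution; the most likely outcome has probability 1/2.
probs = [0.5, 0.25, 0.125, 0.125]

h_min = min_entropy(probs)      # 1.0: the best guess succeeds half the time
h_sh = shannon_entropy(probs)   # 1.75
print(h_min, h_sh, h_min <= h_sh)
```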

$\delta(X,Y) = \frac{1}{2} \sum_v \left| \Pr[X = v] - \Pr[Y = v] \right|$ is the statistical distance between $X$ and $Y$.
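
The formula translates directly into code. The following sketch assumes that distributions are represented as dictionaries mapping outcomes to probabilities. That representation and the two example distributions are choices made here for illustration.

```python
def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in outcomes)

# Assumed example: a biased bit compared with a uniform bit.
biased = {0: 0.75, 1: 0.25}
uniform = {0: 0.5, 1: 0.5}

d = statistical_distance(biased, uniform)   # 0.25
print(d)
```

A distance of 0 means the two distributions are identical, and a distance of 1 means their supports are disjoint.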
