Main Page: Difference between revisions

From formulasearchengine
No edit summary
 
(283 intermediate revisions by more than 100 users not shown)
Line 1: Line 1:
{{Otheruses4|the transmission of data across noisy channels|the storage of text in computers|Variable-width encoding}}
This is a preview for the new '''MathML rendering mode''' (with SVG fallback), which is available in production for registered users.
In [[coding theory]], a '''variable-length code''' (VLC) is a [[code]] that maps source symbols to a ''variable'' number of bits.


Variable-length codes can allow sources to be [[data compression|compressed]] and decompressed with ''zero'' error ([[lossless data compression]]) and still be read back symbol by symbol. With the right coding strategy an [[independent and identically-distributed random variables|independent and identically-distributed source]] may be compressed almost arbitrarily close to its [[information entropy|entropy]]. This is in contrast to fixed-length coding methods, for which data compression is only possible for large blocks of data, and any compression beyond the logarithm of the total number of possibilities comes with a finite (though perhaps arbitrarily small) probability of failure.
If you would like to use the '''MathML''' rendering mode, you need a Wikipedia user account, which can be registered here: [[https://en.wikipedia.org/wiki/Special:UserLogin/signup]]
* Only registered users will be able to execute this rendering mode.
* Note: you need not enter an email address (or any other private information). Please do not use a password that you use elsewhere.


Some examples of well-known variable-length coding strategies are [[Huffman coding]], [[Lempel–Ziv|Lempel–Ziv coding]] and [[arithmetic coding]].
Registered users will be able to choose between the following three rendering modes:


== Codes and their extensions ==
'''MathML'''
:<math forcemathmode="mathml">E=mc^2</math>


The extension of a code is the mapping of finite-length source sequences to finite-length bit strings that is obtained by concatenating, for each symbol of the source sequence, the corresponding codeword produced by the original code.
<!--'''PNG'''  (currently default in production)
:<math forcemathmode="png">E=mc^2</math>


Using terms from [[formal language theory]], the precise mathematical definition is as follows: Let <math>S</math> and <math>T</math> be two finite sets, called the source and target [[alphabet (computer science)|alphabets]], respectively. A '''code''' <math>C: S \to T^*</math> is a [[total function]] mapping each symbol from <math>S</math> to a [[Word (data type)|sequence of symbols]] over <math>T</math>, and the extension of <math>C</math> to a [[Homomorphism#Homomorphisms_and_e-free_homomorphisms_in_formal_language_theory|homomorphism]] of <math>S^*</math> into <math>T^*</math>, which naturally maps each sequence of source symbols to a sequence of target symbols, is referred to as its '''extension'''.
'''source'''
:<math forcemathmode="source">E=mc^2</math> -->


== Classes of variable-length codes ==
<span style="color: red">Follow this [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering link] to change your Math rendering settings.</span> You can also add a [https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-rendering-skin Custom CSS] to force the MathML/SVG rendering or select different font families. See [https://www.mediawiki.org/wiki/Extension:Math#CSS_for_the_MathML_with_SVG_fallback_mode these examples].


Variable-length codes can be strictly nested in order of decreasing generality as non-singular codes, uniquely decodable codes and prefix codes. Prefix codes are always uniquely decodable, and these in turn are always non-singular:
==Demos==


=== Non-singular codes ===
Here are some [https://commons.wikimedia.org/w/index.php?title=Special:ListFiles/Frederic.wang demos]:


A code is '''non-singular''' if each source symbol is mapped to a different non-empty bit string, i.e. the mapping from source symbols to bit strings is [[injective]].
* For example, the mapping <math>M_1 = \{\, a\mapsto 0, b\mapsto 0, c\mapsto 1\,\}</math> is '''not''' non-singular because both "a" and "b" map to the same bit string "0"; any extension of this mapping will generate a lossy coding. Such a singular coding may still be useful when some loss of information is acceptable (for example when such a code is used in audio or video compression, where a lossy coding becomes equivalent to source [[Quantization (signal processing)|quantization]]).
* However, the mapping <math>M_2 = \{\, a \mapsto 1, b \mapsto 011, c\mapsto 01110, d\mapsto 1110, e\mapsto 10011\,\}</math> is non-singular: no two source symbols share a codeword, which is a necessary condition for lossless coding in general data transmission (although losslessness is not always required). Note that a non-singular code need not be more compact than the source; in many applications a larger code is useful, for example as a way to detect or recover from encoding or transmission errors, or in security applications to protect a source from undetectable tampering.
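
The two mappings above can be checked mechanically. The following is a minimal sketch, not part of the original article: it assumes a code is represented as a Python dictionary from source symbols to bit strings, and the function name <code>is_non_singular</code> is chosen here purely for illustration.

```python
def is_non_singular(code):
    """Return True if every symbol maps to a distinct, non-empty bit string."""
    words = list(code.values())
    # all(words) rejects empty codewords; the set comparison rejects duplicates.
    return all(words) and len(set(words)) == len(words)


# The mappings M_1 and M_2 from the examples above.
M1 = {"a": "0", "b": "0", "c": "1"}
M2 = {"a": "1", "b": "011", "c": "01110", "d": "1110", "e": "10011"}

print(is_non_singular(M1))  # False: "a" and "b" share the codeword "0"
print(is_non_singular(M2))  # True
```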


=== Uniquely decodable codes ===
* accessibility:
** Safari + VoiceOver: [https://commons.wikimedia.org/wiki/File:VoiceOver-Mac-Safari.ogv video only], [[File:Voiceover-mathml-example-1.wav|thumb|Voiceover-mathml-example-1]], [[File:Voiceover-mathml-example-2.wav|thumb|Voiceover-mathml-example-2]], [[File:Voiceover-mathml-example-3.wav|thumb|Voiceover-mathml-example-3]], [[File:Voiceover-mathml-example-4.wav|thumb|Voiceover-mathml-example-4]], [[File:Voiceover-mathml-example-5.wav|thumb|Voiceover-mathml-example-5]], [[File:Voiceover-mathml-example-6.wav|thumb|Voiceover-mathml-example-6]], [[File:Voiceover-mathml-example-7.wav|thumb|Voiceover-mathml-example-7]]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-Audio-Windows7-InternetExplorer.ogg Internet Explorer + MathPlayer (audio)]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-SynchronizedHighlighting-WIndows7-InternetExplorer.png Internet Explorer + MathPlayer (synchronized highlighting)]
** [https://commons.wikimedia.org/wiki/File:MathPlayer-Braille-Windows7-InternetExplorer.png Internet Explorer + MathPlayer (braille)]
** NVDA+MathPlayer: [[File:Nvda-mathml-example-1.wav|thumb|Nvda-mathml-example-1]], [[File:Nvda-mathml-example-2.wav|thumb|Nvda-mathml-example-2]], [[File:Nvda-mathml-example-3.wav|thumb|Nvda-mathml-example-3]], [[File:Nvda-mathml-example-4.wav|thumb|Nvda-mathml-example-4]], [[File:Nvda-mathml-example-5.wav|thumb|Nvda-mathml-example-5]], [[File:Nvda-mathml-example-6.wav|thumb|Nvda-mathml-example-6]], [[File:Nvda-mathml-example-7.wav|thumb|Nvda-mathml-example-7]].
** Orca: There is ongoing work, but no support at all at the moment [[File:Orca-mathml-example-1.wav|thumb|Orca-mathml-example-1]], [[File:Orca-mathml-example-2.wav|thumb|Orca-mathml-example-2]], [[File:Orca-mathml-example-3.wav|thumb|Orca-mathml-example-3]], [[File:Orca-mathml-example-4.wav|thumb|Orca-mathml-example-4]], [[File:Orca-mathml-example-5.wav|thumb|Orca-mathml-example-5]], [[File:Orca-mathml-example-6.wav|thumb|Orca-mathml-example-6]], [[File:Orca-mathml-example-7.wav|thumb|Orca-mathml-example-7]].
** From our testing, ChromeVox and JAWS are not able to read the formulas generated by the MathML mode.


A code is '''uniquely decodable''' if its extension is non-singular. Whether a given code is uniquely decodable can be decided with the [[Sardinas–Patterson algorithm]].
==Test pages ==
* The mapping <math>M_3 = \{\, a\mapsto 0, b\mapsto 01, c\mapsto 011\,\}</math> is uniquely decodable. This can be demonstrated by looking at the ''follow-set'' after each target bit string in the map: every codeword begins with a 0 and contains no other 0, so a 0 bit can never extend an existing codeword into a longer valid one and instead unambiguously starts a new codeword.
* Consider again the code <math>M_2</math> from the previous section. This code, which is based on an example found in Berstel et al.,<ref>Berstel et al. (2009), Example 2.3.1, p. 63</ref> is '''not''' uniquely decodable, since the string ''011101110011'' can be interpreted as the sequence of codewords ''01110 – 1110 – 011'', but also as the sequence of codewords ''011 – 1 – 011 – 10011''. Two possible decodings of this encoded string are thus given by ''cdb'' and ''babe''. However, such a code is useful when the set of all possible source symbols is completely known and finite, or when there are restrictions (for example a formal syntax) that determine if source elements of this extension are acceptable. Such restrictions permit the decoding of the original message by checking which of the possible source sequences mapped to the same encoded string are valid under those restrictions.
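
As an illustration of how such a decision procedure can work, here is a minimal sketch in the spirit of the Sardinas–Patterson test. It is not taken from the cited sources; the helper names <code>quotient</code> and <code>is_uniquely_decodable</code> are assumptions made for this example. The idea is to collect all "dangling suffixes" (what remains when one string is a proper prefix of another) and to report ambiguity as soon as a dangling suffix is itself a codeword.

```python
def quotient(A, B):
    """Non-empty strings w such that a + w == b for some a in A and b in B."""
    return {b[len(a):] for a in A for b in B
            if len(b) > len(a) and b.startswith(a)}


def is_uniquely_decodable(codewords):
    """Sardinas-Patterson style test on an iterable of codeword strings."""
    words = list(codewords)
    C = set(words)
    # A repeated or empty codeword already rules out unique decodability.
    if len(C) != len(words) or "" in C:
        return False
    current = quotient(C, C)  # dangling suffixes between pairs of codewords
    seen = set()
    while current:
        if current & C:       # a dangling suffix is itself a codeword
            return False
        seen |= current
        # Derive new suffixes and keep only those not examined before, so the
        # loop terminates (all suffixes are suffixes of the finite code).
        current = (quotient(C, current) | quotient(current, C)) - seen
    return True


M2 = ["1", "011", "01110", "1110", "10011"]
M3 = ["0", "01", "011"]

print(is_uniquely_decodable(M3))  # True
print(is_uniquely_decodable(M2))  # False
```

For <math>M_2</math> the first round yields the suffixes 110, 0011 and 10; the second round derives 011 from 10 and the codeword 10011, and 011 is a codeword, so the test reports ambiguity, consistent with the two decodings ''cdb'' and ''babe''.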


=== Prefix codes ===
To test the '''MathML''', '''PNG''', and '''source''' rendering modes, please go to one of the following test pages:
{{Main|Prefix code}}
*[[Displaystyle]]
*[[MathAxisAlignment]]
*[[Styling]]
*[[Linebreaking]]
*[[Unique Ids]]
*[[Help:Formula]]


A code is a '''prefix code''' if no target bit string in the mapping is a prefix of the target bit string of a different source symbol in the same mapping. This means that symbols can be decoded instantaneously after their entire codeword is received. Other commonly used names for this concept are '''prefix-free code''', '''instantaneous code''', or '''context-free code'''.
*[[Inputtypes|Inputtypes (private Wikis only)]]
* The example mapping <math>M_3</math> in the previous section is '''not''' a prefix code because, after reading the bit string "0", we do not know whether it encodes an "a" source symbol or is the prefix of the encoding of a "b" or "c" symbol.
*[[Url2Image|Url2Image (private Wikis only)]]
* An example of a prefix code is shown below.
==Bug reporting==
{| class="wikitable" style="text-align:center; position: relative; left: 1in;" |
If you find any bugs, please report them at [https://bugzilla.wikimedia.org/enter_bug.cgi?product=MediaWiki%20extensions&component=Math&version=master&short_desc=Math-preview%20rendering%20problem Bugzilla], or write an email to math_bugs (at) ckurs (dot) de .
|-
! Symbol !! Codeword
|-
| a || 0
|-
| b || 10
|-
| c || 110
|-
| d || 111
|}
:: Example of encoding and decoding:
::: aabacdab → 00100110111010 → |0|0|10|0|110|111|0|10| → aabacdab
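
The encoding and decoding shown above can be reproduced with a short sketch. This is an illustrative example rather than a reference implementation; the names <code>CODE</code>, <code>is_prefix_code</code>, <code>encode</code> and <code>decode</code> are assumptions introduced here. The decoder relies on the prefix property: as soon as the buffered bits match a codeword, that codeword can be emitted immediately.

```python
# The prefix code from the table above.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def is_prefix_code(code):
    """Return True if no codeword is a prefix of another symbol's codeword."""
    words = list(code.values())
    return not any(i != j and v.startswith(u)
                   for i, u in enumerate(words)
                   for j, v in enumerate(words))


def encode(message, code=CODE):
    """Concatenate the codeword of each source symbol (the code's extension)."""
    return "".join(code[symbol] for symbol in message)


def decode(bits, code=CODE):
    """Decode a bit string symbol by symbol; assumes `code` is a prefix code."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # a complete codeword has been received
            decoded.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(decoded)


print(encode("aabacdab"))        # 00100110111010
print(decode("00100110111010"))  # aabacdab
```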
 
[[Block code]]s are a special case of prefix codes in which all codewords have the same length. They are not very useful in the context of [[data compression|source coding]], but often serve as [[forward error correction|error correcting codes]] in the context of [[channel coding]].
 
== Advantages ==
 
The advantage of a variable-length code is that unlikely source symbols can be assigned longer codewords and likely source symbols can be assigned shorter codewords, thus giving a low [[Expected_value|''expected'']] codeword length. For the above example, if the probabilities of (a, b, c, d) were <math>\textstyle\left(\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{8}\right)</math>, the expected number of bits used to represent a source symbol using the code above would be:
:: <math>1\times\frac{1}{2}+2\times\frac{1}{4}+3\times\frac{1}{8}+3\times\frac{1}{8}=\frac{7}{4}</math>.
As the entropy of this source is also 1.75 bits per symbol, the expected codeword length equals the entropy, so this code compresses the source as much as is possible while still allowing the source to be recovered with ''zero'' error.
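
The comparison between expected codeword length and entropy can be checked numerically. The sketch below is only an illustration; the variable names are chosen here, and the probabilities and codewords are those of the example above.

```python
from math import log2

# Codewords and symbol probabilities from the example above.
code = {"a": "0", "b": "10", "c": "110", "d": "111"}
probs = {"a": 1 / 2, "b": 1 / 4, "c": 1 / 8, "d": 1 / 8}

# Expected codeword length: sum over symbols of p(symbol) * len(codeword).
expected_length = sum(p * len(code[s]) for s, p in probs.items())

# Shannon entropy of the source in bits per symbol.
entropy = -sum(p * log2(p) for p in probs.values())

print(expected_length)  # 1.75
print(entropy)          # 1.75
```

The two values coincide here because every probability is a power of 1/2 and each codeword length equals the negative base-2 logarithm of its symbol's probability; for other distributions the expected length of a prefix code is in general strictly larger than the entropy.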
 
== Notes ==
<references/>
==References==
* {{cite book | last1=Berstel | first1=Jean | last2=Perrin | first2=Dominique | last3=Reutenauer | first3=Christophe | title=Codes and automata | series=Encyclopedia of Mathematics and its Applications | volume=129 | location=Cambridge | publisher=[[Cambridge University Press]] | year=2010 | isbn=978-0-521-88831-8 | zbl=1187.94001 }}  [http://www-igm.univ-mlv.fr/~berstel/LivreCodes/Codes.html Draft available online]
 
{{Compression Methods}}
 
[[Category:Coding theory]]
[[Category:Lossless compression algorithms]]

Latest revision as of 22:52, 15 September 2019

This is a preview for the new MathML rendering mode (with SVG fallback), which is available in production for registered users.

If you would like to use the MathML rendering mode, you need a Wikipedia user account, which can be registered here: [[1]]

  • Only registered users will be able to execute this rendering mode.
  • Note: you need not enter an email address (or any other private information). Please do not use a password that you use elsewhere.

Registered users will be able to choose between the following three rendering modes:

MathML

E=mc^2


Follow this link to change your Math rendering settings. You can also add a Custom CSS to force the MathML/SVG rendering or select different font families. See these examples.

Demos

Here are some demos:


Test pages

To test the MathML, PNG, and source rendering modes, please go to one of the following test pages:

Bug reporting

If you find any bugs, please report them at Bugzilla, or write an email to math_bugs (at) ckurs (dot) de .