'''Distributed source coding''' ('''DSC''') is an important problem in [[information theory]] and [[communication]]. DSC problems concern the compression of multiple correlated information sources that do not communicate with each other.<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1328091 "Distributed source coding for sensor networks" by Z. Xiong, A.D. Liveris, and S. Cheng]</ref> By modeling the correlation between multiple sources at the decoder side together with [[channel code]]s, DSC shifts the computational complexity from the encoder side to the decoder side, and therefore provides appropriate frameworks for applications with a complexity-constrained sender, such as [[sensor networks]] and video/multimedia compression (see [[distributed video coding]]<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1657820&isnumber=34703 "Distributed video coding in wireless sensor networks" by Puri, R. Majumdar, A. Ishwar, P. Ramchandran, K. ]</ref>). One of the main properties of distributed source coding is that the computational burden in the encoders is shifted to the joint decoder.
==History==
In 1973, [[David Slepian]] and [[Jack Keil Wolf]] proposed the information theoretical lossless compression bound on distributed compression of two statistically dependent [[IID|i.i.d.]] sources X and Y.<ref name=swbound>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1055037 "Noiseless coding of correlated information sources" by D. Slepian and J. Wolf]</ref> This bound was extended to cases with more than two sources by [[Thomas M. Cover]] in 1975,<ref name=swergodic>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1055356 "A proof of the data compression theorem of Slepian and Wolf for ergodic sources" by T. Cover]</ref> while the theoretical results in the lossy compression case were presented by [[Aaron D. Wyner]] and [[Jacob Ziv]] in 1976.<ref name=wzbound>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1055508 "The rate-distortion function for source coding with side information at the decoder" by A. Wyner and J. Ziv]</ref>
Although the theorems on DSC were proposed in the 1970s, attempts at practical techniques began only about 30 years later, based on the idea, proposed in 1974 by [[Aaron D. Wyner]], that DSC is closely related to channel coding.<ref name=swpractical>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=1055171 "Recent results in Shannon theory" by A. D. Wyner]</ref> The asymmetric DSC problem was addressed by S. S. Pradhan and K. Ramchandran in 1999; their work focused on statistically dependent binary and Gaussian sources and used scalar and trellis [[coset]] constructions to solve the problem.<ref name=discus>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=1055508 "Distributed source coding using syndromes (DISCUS): design and construction" by S. S. Pradhan and K. Ramchandran]</ref> They further extended the work to the symmetric DSC case.<ref name=discus2>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=838176 "Distributed source coding: symmetric rates and applications to sensor networks" by S. S. Pradhan and K. Ramchandran]</ref>
[[Syndrome decoding]] technology was first used in distributed source coding by the [[DISCUS]] system of S. S. Pradhan and K. Ramchandran (Distributed Source Coding Using Syndromes).<ref name=discus/> They compress binary block data from one source into syndromes and transmit data from the other source uncompressed as [[side information]]. This kind of DSC scheme achieves asymmetric compression rates per source and results in ''asymmetric'' DSC. This asymmetric DSC scheme can be easily extended to the case of more than two correlated information sources. There are also some DSC schemes that use [[parity bit]]s rather than syndrome bits.
The correlation between two sources in DSC has been modeled as a [[virtual channel]], which is usually taken to be a [[binary symmetric channel]].<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1281474 "Distributed code constructions for the entire Slepian–Wolf rate region for arbitrarily correlated sources" by Schonberg, D. Ramchandran, K. Pradhan, S.S.]</ref><ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1512420 "Generalized coset codes for distributed binning" by Pradhan, S.S. Ramchandran, K.]</ref>
Starting from [[DISCUS]], DSC has attracted significant research activity, and more sophisticated channel coding techniques have been adopted into DSC frameworks, such as [[Turbo Code]]s and [[LDPC]] codes.
Similar to the previous lossless coding framework based on the Slepian–Wolf theorem, efforts have been made on lossy cases based on the Wyner–Ziv theorem. Theoretical results on quantizer designs were provided by R. Zamir and S. Shamai,<ref name=wzquantize>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=706450 "Nested linear/lattice codes for Wyner–Ziv encoding" by R. Zamir and S. Shamai]</ref> and different frameworks have been proposed based on this result, including a nested lattice quantizer and a trellis-coded quantizer.
Moreover, DSC has been used in video compression for applications which require low-complexity video encoding, such as sensor networks and multiview video camcorders.<ref name=dvc>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=1369699 "Distributed Video Coding" by B. Girod et al.]</ref>
With deterministic and probabilistic models of the correlation between two information sources, DSC schemes with more general compression rates have been developed.<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1614079 "On code design for the Slepian–Wolf problem and lossless multiterminal networks" by Stankovic, V. Liveris, A.D. Zixiang Xiong Georghiades, C.N.]</ref><ref>[http://portal.acm.org/citation.cfm?id=1226544 "A general and optimal framework to achieve the entire rate region for Slepian–Wolf coding" by P. Tan and J. Li]</ref><ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4471935 "Distributed source coding using short to moderate length rate-compatible LDPC codes: the entire Slepian–Wolf rate region" by Sartipi, M. Fekri, F.]</ref> In these ''non-asymmetric'' schemes, both correlated sources are compressed.
Under a certain deterministic assumption of correlation between information sources, a DSC framework in which any number of information sources can be compressed in a distributed way has been demonstrated by X. Cao and M. Kuijper.<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=4895364&arnumber=4895396&count=299&index=31 "A distributed source coding framework for multiple sources" by Xiaomin Cao and Kuijper, M.]</ref> This method performs non-asymmetric compression with flexible rates for each source, achieving the same overall compression rate as repeatedly applying asymmetric DSC for more than two sources.
==Theoretical bounds==
The information theoretical lossless compression bound on DSC (the [[Slepian–Wolf bound]]) was first proposed by [[David Slepian]] and [[Jack Keil Wolf]] in terms of entropies of correlated information sources in 1973.<ref name=swbound/> They also showed that two isolated sources can compress data as efficiently as if they were communicating with each other. This bound was extended to the case of more than two correlated sources by [[Thomas M. Cover]] in 1975.<ref name=swergodic/>
Similar results were obtained in 1976 by [[Aaron D. Wyner]] and [[Jacob Ziv]] with regard to lossy coding of jointly Gaussian sources.<ref name=wzbound/>
===Slepian–Wolf bound===
Distributed coding is the coding of two or more dependent sources with separate encoders and a joint decoder. Given two statistically dependent i.i.d. finite-alphabet random sequences X and Y, the Slepian–Wolf theorem gives the theoretical bound on the lossless coding rates for distributed coding of the two sources as follows:<ref name=swbound/>
: <math>R_X\geq H(X|Y), \,</math>
: <math>R_Y\geq H(Y|X), \, </math>
: <math>R_X+R_Y\geq H(X,Y). \, </math>
If the encoders and decoders of the two sources are both independent, the lowest rates achievable for lossless compression are <math>H(X)</math> and <math>H(Y)</math> for <math>X</math> and <math>Y</math> respectively, where <math>H(X)</math> and <math>H(Y)</math> are the entropies of <math>X</math> and <math>Y</math>. However, with joint decoding, if a vanishing error probability for long sequences is accepted, the Slepian–Wolf theorem shows that a much better compression rate can be achieved. As long as the total rate of <math>X</math> and <math>Y</math> is larger than their joint entropy <math>H(X,Y)</math> and each source is encoded at a rate larger than its conditional entropy given the other source, distributed coding can achieve arbitrarily small error probability for long sequences.
A special case of distributed coding is compression with decoder side information, where source <math>Y</math> is available at the decoder side but not accessible at the encoder side. This can be treated as the condition that <math>R_Y=H(Y)</math> has already been used to encode <math>Y</math>, while we intend to use <math>H(X|Y)</math> to encode <math>X</math>. The whole system operates in an asymmetric way (the compression rates for the two sources are asymmetric).
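The three inequalities can be evaluated numerically for any small joint distribution. The following is a minimal sketch, not taken from the cited papers; the function names and the example source (a uniform bit observed through a binary symmetric correlation with crossover probability 0.1) are illustrative assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def slepian_wolf_corners(joint):
    """joint[x][y] = P(X=x, Y=y). Returns (H(X|Y), H(Y|X), H(X,Y))."""
    h_xy = entropy(p for row in joint for p in row)
    h_x = entropy(sum(row) for row in joint)        # marginal of X
    h_y = entropy(sum(col) for col in zip(*joint))  # marginal of Y
    return h_xy - h_y, h_xy - h_x, h_xy

def in_rate_region(r_x, r_y, joint, eps=1e-12):
    """Check the three Slepian-Wolf inequalities for a rate pair."""
    h_x_given_y, h_y_given_x, h_xy = slepian_wolf_corners(joint)
    return (r_x >= h_x_given_y - eps
            and r_y >= h_y_given_x - eps
            and r_x + r_y >= h_xy - eps)

# Illustrative source: X is a uniform bit, Y equals X flipped with probability 0.1.
p = 0.1
joint = [[(1 - p) / 2, p / 2],
         [p / 2, (1 - p) / 2]]

print(slepian_wolf_corners(joint))
print(in_rate_region(1.0, 0.5, joint))  # Y at full rate, X near H(X|Y)
print(in_rate_region(0.7, 0.7, joint))  # sum rate below H(X,Y)
```

In this example the pair (1.0, 0.5) lies inside the region, which corresponds to the side-information case described above, while (0.7, 0.7) violates the sum-rate constraint.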
===Wyner–Ziv bound===
Shortly after the Slepian–Wolf theorem on lossless distributed compression was published, the extension to lossy compression with decoder side information was proposed as the Wyner–Ziv theorem.<ref name=wzbound/> As in the lossless case, two statistically dependent i.i.d. sources <math>X</math> and <math>Y</math> are given, where <math>Y</math> is available at the decoder side but not accessible at the encoder side. Instead of lossless compression as in the Slepian–Wolf theorem, the Wyner–Ziv theorem addresses the lossy compression case.
The Wyner–Ziv theorem presents the achievable lower bound for the bit rate of <math>X</math> at a given distortion <math>D</math>. It was found that for Gaussian memoryless sources and mean-squared error distortion, the lower bound for the bit rate of <math>X</math> remains the same whether or not side information is available at the encoder.
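For the jointly Gaussian case the bound has a standard closed form, which equals the conditional rate-distortion function: half the base-2 logarithm of the ratio between the conditional variance of <math>X</math> given <math>Y</math> and the distortion <math>D</math>, clipped at zero. The sketch below evaluates it; the parameterization by a correlation coefficient <code>rho</code> and the function names are assumptions made for illustration and do not appear in the text above.

```python
import math

def conditional_variance(var_x, rho):
    """Variance of X given Y for jointly Gaussian X, Y with correlation rho."""
    return var_x * (1.0 - rho ** 2)

def wyner_ziv_rate_gaussian(var_x, rho, distortion):
    """Rate in bits per sample for mean-squared error distortion.

    For jointly Gaussian sources this coincides with the rate needed
    when the encoder also sees the side information.
    """
    var_cond = conditional_variance(var_x, rho)
    if distortion >= var_cond:
        return 0.0  # the decoder can meet the distortion from Y alone
    return 0.5 * math.log2(var_cond / distortion)

print(wyner_ziv_rate_gaussian(1.0, 0.0, 0.25))  # useless side information
print(wyner_ziv_rate_gaussian(1.0, 0.9, 0.25))  # strong side information
```

With <code>rho = 0</code> the expression reduces to the ordinary Gaussian rate-distortion function, and with strongly correlated side information the required rate can drop to zero.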
==Virtual channel==
'''Deterministic''' model
'''Probabilistic''' model
==Asymmetric DSC vs. symmetric DSC==
Asymmetric DSC means that different bitrates are used in coding the input sources, while the same bitrate is used in symmetric DSC. Take a DSC design with two sources as an example: <math>X</math> and <math>Y</math> are two discrete, memoryless, uniformly distributed sources which generate variables <math>\mathbf{x}</math> and <math>\mathbf{y}</math> of length 7 bits, and the Hamming distance between <math>\mathbf{x}</math> and <math>\mathbf{y}</math> is at most one. Given <math>\mathbf{y}</math>, there are <math>1+7=8</math> equally likely values of <math>\mathbf{x}</math>, so <math>H(X|Y)=H(Y|X)=3</math> bits and <math>H(X,Y)=7+3=10</math> bits. The Slepian–Wolf bound for them is:
:<math>R_X+R_Y \geq 10</math>
:<math>R_X \geq 3</math>
:<math>R_Y \geq 3</math>
This means that the theoretical bound is <math>R_X+R_Y=10</math>, and symmetric DSC means 5 bits for each source. Other pairs with <math>R_X+R_Y=10</math> are asymmetric cases with different bit rate distributions between <math>X</math> and <math>Y</math>, where <math>R_X=3</math>, <math>R_Y=7</math> and <math>R_Y=3</math>, <math>R_X=7</math> represent two extreme cases called decoding with side information.
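The entropies behind this example can be checked by direct enumeration. The sketch below assumes, as an interpretation of "uniformly distributed", that all admissible pairs of 7-bit words at Hamming distance at most one are equally likely; the variable names are illustrative.

```python
import math

n = 7
# All pairs (x, y) of 7-bit words with Hamming distance at most one,
# assumed equally likely.
pairs = [(x, y)
         for x in range(2 ** n)
         for y in range(2 ** n)
         if bin(x ^ y).count("1") <= 1]

h_xy = math.log2(len(pairs))  # joint entropy of a uniform pair
h_y = math.log2(2 ** n)       # each y occurs with the same number of x values
h_x_given_y = h_xy - h_y

print(len(pairs), h_xy, h_x_given_y)  # 1024 10.0 3.0
```

By symmetry the same value is obtained for the conditional entropy of <math>Y</math> given <math>X</math>.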
==Practical distributed source coding==
===Slepian–Wolf coding – lossless distributed coding===
It was understood in 1974 that [[Slepian–Wolf coding]] is closely related to channel coding,<ref name=swpractical/> and after about 30 years, practical DSC started to be implemented with different channel codes. The motivation behind the use of channel codes comes from the two-source case: the correlation between input sources can be modeled as a virtual channel which has source <math>X</math> as input and source <math>Y</math> as output. The [[DISCUS]] system proposed by S. S. Pradhan and K. Ramchandran in 1999 implemented DSC with [[syndrome decoding]], which worked for the asymmetric case and was further extended to the symmetric case.<ref name=discus/><ref name=discus2/>
The basic framework of syndrome-based DSC is that, for each source, its input space is partitioned into several cosets according to the particular channel coding method used. Every input of each source gets an output indicating which coset the input belongs to, and the joint decoder can decode all inputs from the received coset indices and the dependence between sources. The design of channel codes should consider the correlation between input sources.
A group of codes can be used to generate coset partitions,<ref>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=21245 "Coset codes. I. Introduction and geometrical classification" by G. D. Forney]</ref> such as trellis codes and lattice codes. Pradhan and Ramchandran designed rules for the construction of sub-codes for each source, and presented results of trellis-based coset constructions in DSC, which are based on [[convolution code]]s and set-partitioning rules as in [[Trellis modulation]], as well as lattice-code-based DSC.<ref name=discus/><ref name=discus2/> After this, an embedded trellis code was proposed for asymmetric coding as an improvement over their results.<ref>[http://ieeexplore.ieee.org/xpls/freeabs_all.jsp?arnumber=917167 "Design of trellis codes for source coding with side information at the decoder" by X. Wang and M. Orchard]</ref>
After the DISCUS system was proposed, more sophisticated channel codes were adapted to the DSC system, such as [[Turbo Code]]s, [[LDPC]] codes and iterative channel codes. The encoders of these codes are usually simple and easy to implement, while the decoders have much higher computational complexity and are able to achieve good performance by utilizing source statistics. With sophisticated channel codes whose performance approaches the capacity of the correlation channel, the corresponding DSC system can approach the Slepian–Wolf bound.
Although most research has focused on DSC with two dependent sources, Slepian–Wolf coding has been extended to the case of more than two input sources, and methods for generating sub-codes from one channel code were proposed by V. Stankovic, A. D. Liveris et al. for particular correlation models.<ref>[http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1281475 "Design of Slepian–Wolf codes by channel code partitioning" by V. Stankovic, A. D. Liveris, Z. Xiong and C. N. Georghiades]</ref>
====General theorem of Slepian–Wolf coding with syndromes for two sources====
'''Theorem''': Any pair of correlated uniformly distributed sources, <math>X, Y \in \left\{0,1\right\}^n</math>, with <math>\mathbf{d_H}(X, Y) \leq t</math>, can be compressed separately at a rate pair <math>(R_1, R_2)</math> such that <math> R_1, R_2 \geq n-k, R_1+R_2 \geq 2n-k</math>, where <math>R_1</math> and <math>R_2</math> are integers, and <math>k \leq n-\log(\sum_{i=0}^t{n \choose i})</math>. This can be achieved using an <math>(n,k,2t+1)</math> binary linear code.
''Proof'': The Hamming bound for an <math>(n,k,2t+1)</math> binary linear code is <math>k \leq n-\log(\sum_{i=0}^t{n \choose i})</math>, and perfect codes such as the Hamming code (for <math>t=1</math>) achieve this bound. Assume we have such a binary linear code <math>\mathbf{C}</math> with <math>k\times n</math> generator matrix <math>\mathbf{G}</math>. Next we show how to construct syndrome encoding based on this linear code.
Let <math>R_1+R_2=2n-k</math> and let <math>\mathbf{G_1}</math> be formed by taking the first <math>(n-R_1)</math> rows from <math>\mathbf{G}</math>, while <math>\mathbf{G_2}</math> is formed using the remaining <math>(n-R_2)</math> rows of <math>\mathbf{G}</math>. <math>\mathbf{C_1}</math> and <math>\mathbf{C_2}</math> are the subcodes of <math>\mathbf{C}</math> generated by <math>\mathbf{G_1}</math> and <math>\mathbf{G_2}</math> respectively, with <math>\mathbf{H_1}</math> and <math>\mathbf{H_2}</math> as their parity check matrices.
For a pair of inputs <math>\mathbf{(x, y)}</math>, the encoder is given by <math>\mathbf{s_1}=\mathbf{H_1}\mathbf{x}</math> and <math>\mathbf{s_2}=\mathbf{H_2}\mathbf{y}</math>. That means we can represent <math>\mathbf{x}</math> and <math>\mathbf{y}</math> as <math>\mathbf{x=u_1G_1+c_{s1}}</math>, <math>\mathbf{y=u_2G_2+c_{s2}}</math>, where <math>\mathbf{c_{s1}, c_{s2}}</math> are the representatives of the cosets of <math>\mathbf{s_1, s_2}</math> with regard to <math>\mathbf{C_1, C_2}</math> respectively. Since <math>\mathbf{y=x+e}</math> with <math>w(\mathbf{e}) \leq t</math>, we get <math>\mathbf{x+y=uG+c_{s}=e}</math>, where <math>\mathbf{u=\left[ u_1, u_2\right] }</math>, <math>\mathbf{c_{s}=c_{s1}+c_{s2}}</math>.
Suppose there are two different input pairs with the same syndromes; that is, there are two different strings <math>\mathbf{u^1, u^2} \in \left\{ 0,1\right\}^k</math> such that <math>\mathbf{u^1G+c_{s}=e^1}</math> and <math>\mathbf{u^2G+c_{s}=e^2}</math> with <math>w(\mathbf{e^1}) \leq t</math> and <math>w(\mathbf{e^2}) \leq t</math>. Because <math>\mathbf{(u^1-u^2)G}</math> is a nonzero codeword and the minimum Hamming weight of the code <math>\mathbf{C}</math> is <math>2t+1</math>, the distance between <math>\mathbf{u^1G}</math> and <math>\mathbf{u^2G}</math> is <math>\geq 2t+1</math>. On the other hand, the weight constraints on <math>\mathbf{e^1}</math> and <math>\mathbf{e^2}</math> give <math>d_H(\mathbf{u^1G, c_{s}}) \leq t</math> and <math>d_H(\mathbf{u^2G, c_{s}}) \leq t</math>, so by the triangle inequality <math>d_H(\mathbf{u^1G, u^2G}) \leq 2t</math>, which contradicts <math>d_H(\mathbf{u^1G, u^2G}) \geq 2t+1</math>. Therefore, we cannot have more than one input pair with the same syndromes.
Therefore, we can successfully compress the two dependent sources with constructed subcodes from an <math>(n,k,2t+1)</math> binary linear code, with rate pair <math>(R_1, R_2)</math> such that <math> R_1, R_2 \geq n-k, R_1+R_2 \geq 2n-k</math>, where <math>R_1</math> and <math>R_2</math> are integers, and <math>k \leq n-\log(\sum_{i=0}^t{n \choose i})</math>. Here ''log'' denotes ''log<sub>2</sub>''.
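The quantities in the theorem are easy to tabulate. The following sketch computes the largest <math>k</math> permitted by the Hamming bound and lists the integer rate pairs on the line <math>R_1+R_2=2n-k</math>. The function names are illustrative, and the bound is only a necessary condition: it does not by itself guarantee that a code with these parameters exists.

```python
import math

def hamming_bound_k(n, t):
    """Largest integer k with k <= n - log2(sum_{i<=t} C(n, i))."""
    volume = sum(math.comb(n, i) for i in range(t + 1))
    # (volume - 1).bit_length() equals ceil(log2(volume)) for volume >= 1
    return n - (volume - 1).bit_length()

def boundary_rate_pairs(n, k):
    """Integer pairs (R1, R2) with R1 + R2 = 2n - k and R1, R2 >= n - k."""
    total = 2 * n - k
    return [(r1, total - r1) for r1 in range(n - k, n + 1)]

print(hamming_bound_k(7, 1))      # 4, met by the (7,4,3) Hamming code
print(boundary_rate_pairs(7, 4))  # [(3, 7), (4, 6), (5, 5), (6, 4), (7, 3)]
```

For <math>n=7</math> and <math>t=1</math> the listed pairs run from one side-information extreme to the other, with the symmetric point (5, 5) in the middle.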
====Slepian–Wolf coding example====
Taking the same example as in the previous '''Asymmetric DSC vs. symmetric DSC''' part, this part presents the corresponding DSC schemes with coset codes and syndromes, including the asymmetric case and the symmetric case. The Slepian–Wolf bound for the DSC design is shown in the previous part.
=====Asymmetric case (<math>R_X=3</math>, <math>R_Y=7</math>)=====
In this case, the length of an input variable <math>\mathbf{y}</math> from source <math>Y</math> is 7 bits, therefore it can be sent losslessly with 7 bits independently of any other bits. Based on the knowledge that <math>\mathbf{x}</math> and <math>\mathbf{y}</math> have Hamming distance at most one, for input <math>\mathbf{x}</math> from source <math>X</math>, since the receiver already has <math>\mathbf{y}</math>, the only possible <math>\mathbf{x}</math> are those at distance at most 1 from <math>\mathbf{y}</math>. If we model the correlation between the two sources as a virtual channel with input <math>\mathbf{x}</math> and output <math>\mathbf{y}</math>, then once we have <math>\mathbf{y}</math>, all we need to successfully "decode" <math>\mathbf{x}</math> is "parity bits" with a particular error correction ability, taking the difference between <math>\mathbf{x}</math> and <math>\mathbf{y}</math> as the channel error. We can also model the problem with a coset partition. That is, we want to find a channel code which is able to partition the space of input <math>X</math> into several cosets, where each coset has a unique syndrome associated with it. With a given coset and <math>\mathbf{y}</math>, there is only one <math>\mathbf{x}</math> that can be the input given the correlation between the two sources.
In this example, we can use the <math>(7,4, 3)</math> binary [[Hamming Code]] <math>\mathbf{C}</math>, with parity check matrix <math>\mathbf{H}</math>. For an input <math>\mathbf{x}</math> from source <math>X</math>, only the syndrome given by <math>\mathbf{s}=\mathbf{H}\mathbf{x}</math> is transmitted, which is 3 bits. With received <math>\mathbf{y}</math> and <math>\mathbf{s}</math>, suppose there are two inputs <math>\mathbf{x_1}</math> and <math>\mathbf{x_2}</math> with the same syndrome <math>\mathbf{s}</math>. That means <math>\mathbf{H}\mathbf{x_1}=\mathbf{H}\mathbf{x_2}</math>, which is <math>\mathbf{H}(\mathbf{x_1}-\mathbf{x_2})=0</math>. Since the minimum Hamming weight of the <math>(7,4,3)</math> Hamming code is 3, <math>d_H(\mathbf{x_1}, \mathbf{x_2})\geq 3</math>, so at most one of them can lie within distance 1 of <math>\mathbf{y}</math>. Therefore the input <math>\mathbf{x}</math> can be recovered since <math>d_H(\mathbf{x}, \mathbf{y})\leq 1</math>.
Similarly, the bit allocation with <math>R_X=7</math>, <math>R_Y=3</math> can be achieved by reversing the roles of <math>X</math> and <math>Y</math>.
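The asymmetric scheme can be sketched in a few lines. The code below is an illustration rather than the DISCUS implementation: words are stored as integers, and the parity-check matrix is assumed to be the common form of the (7,4,3) Hamming code whose <math>j</math>-th column is the binary expansion of <math>j</math>. The function names are hypothetical.

```python
def syndrome(word):
    """3-bit syndrome H*x, where column j of H is the binary expansion of j."""
    s = 0
    for j in range(1, 8):
        if (word >> (j - 1)) & 1:
            s ^= j
    return s

def encode_x(x):
    """Encoder for X: transmit only the 3-bit syndrome of the 7-bit word."""
    return syndrome(x)

def decode_x(s, y):
    """Joint decoder: recover x from its syndrome and side information y."""
    diff = s ^ syndrome(y)  # syndrome of the error pattern e = x xor y
    if diff == 0:
        return y            # x equals y
    return y ^ (1 << (diff - 1))  # flip the single differing bit

x = 0b1011001
y = x ^ 0b0000100  # side information differs from x in one bit
print(decode_x(encode_x(x), y) == x)  # True
```

The decoder works because the syndrome of a single-bit error pattern identifies the position of that bit, so 3 transmitted bits plus the 7 bits of <math>\mathbf{y}</math> are enough to recover <math>\mathbf{x}</math>.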
=====Symmetric case=====
In the symmetric case, what we want is an equal bitrate for the two sources: 5 bits each, with separate encoders and a joint decoder. We still use linear codes for this system, as we did for the asymmetric case. The basic idea is similar, but in this case we need to do a coset partition for both sources, such that for a pair of received syndromes (each corresponding to one coset), only one pair of input variables is possible given the correlation between the two sources.
Suppose we have a pair of [[linear code]]s <math>\mathbf{C_1}</math> and <math>\mathbf{C_2}</math> and an encoder-decoder pair based on linear codes which can achieve symmetric coding. The encoder output is given by: <math>\mathbf{s_1}=\mathbf{H_1}\mathbf{x}</math> and <math>\mathbf{s_2}=\mathbf{H_2}\mathbf{y}</math>. If there exist two pairs of valid inputs <math>\mathbf{x_1}, \mathbf{y_1}</math> and <math>\mathbf{x_2}, \mathbf{y_2}</math> generating the same syndromes, i.e. <math>\mathbf{H_1}\mathbf{x_1} = \mathbf{H_1}\mathbf{x_2}</math> and <math>\mathbf{H_2}\mathbf{y_1} = \mathbf{H_2}\mathbf{y_2}</math>, we get the following (<math>w()</math> represents Hamming weight):
<math>\mathbf{y_1}=\mathbf{x_1}+\mathbf{e_1}</math>, where <math>w(\mathbf{e_1}) \leq 1</math>
<math>\mathbf{y_2}=\mathbf{x_2}+\mathbf{e_2}</math>, where <math>w(\mathbf{e_2}) \leq 1</math>
Thus: <math>\mathbf{x_1}+\mathbf{x_2} \in \mathbf{C_1}</math>
<math>\mathbf{y_1}+\mathbf{y_2}=\mathbf{x_1}+\mathbf{x_2}+\mathbf{e_3} \in \mathbf{C_2}</math>
where <math>\mathbf{e_3}=\mathbf{e_2}+\mathbf{e_1}</math> and <math>w(\mathbf{e_3}) \leq 2</math>. That means, as long as the minimum distance between the two codes is at least <math>3</math>, and hence larger than <math>w(\mathbf{e_3})</math>, we can achieve error-free decoding.
The two codes <math>\mathbf{C_1}</math> and <math>\mathbf{C_2}</math> can be constructed as subcodes of the <math>(7, 4, 3)</math> Hamming code and thus have a minimum distance of at least <math>3</math>. Given the [[generator matrix]] <math>\mathbf{G}</math> of the original Hamming code, the generator matrix <math>\mathbf{G_1}</math> for <math>\mathbf{C_1}</math> is constructed by taking any two rows from <math>\mathbf{G}</math>, and <math>\mathbf{G_2}</math> is constructed from the remaining two rows of <math>\mathbf{G}</math>. The corresponding <math>(5\times7)</math> [[Parity check matrix|parity-check matrix]] for each sub-code can be generated according to the generator matrix and used to generate syndrome bits.
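The symmetric construction can be checked exhaustively for this small example. The sketch below is illustrative and makes several simplifying assumptions: the Hamming code is the one whose parity-check column <math>j</math> is the binary expansion of <math>j</math>, a basis is found greedily instead of using a fixed generator matrix, each coset is labelled by its smallest member instead of by an explicit <math>5\times7</math> parity-check matrix, and the decoder is a brute-force search.

```python
def syndrome(word):
    """Syndrome for the (7,4,3) Hamming code; column j of H is j in binary."""
    s = 0
    for j in range(1, 8):
        if (word >> (j - 1)) & 1:
            s ^= j
    return s

def span(basis):
    """All XOR combinations of the given basis words."""
    words = {0}
    for b in basis:
        words |= {w ^ b for w in words}
    return words

# Greedily pick 4 linearly independent codewords (rows of a generator matrix).
basis = []
for c in range(1, 128):
    if syndrome(c) == 0 and c not in span(basis):
        basis.append(c)

C1 = span(basis[:2])  # subcode for source X: 4 codewords, 32 cosets
C2 = span(basis[2:])  # subcode for source Y: 4 codewords, 32 cosets

def coset_label(word, code):
    """Label a coset by its smallest member; 32 labels need 5 bits."""
    return min(word ^ c for c in code)

def joint_decode(s1, s2):
    """All (x, y) in the two cosets with Hamming distance at most one."""
    xs = [s1 ^ c for c in C1]
    ys = [s2 ^ c for c in C2]
    return [(x, y) for x in xs for y in ys if bin(x ^ y).count("1") <= 1]

x = 0b0110101
y = x ^ 0b1000000
print(joint_decode(coset_label(x, C1), coset_label(y, C2)) == [(x, y)])  # True
```

Each encoder sends one of 32 coset labels, that is, 5 bits, and the joint decoder finds exactly one admissible pair among the 16 candidates.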
===Wyner–Ziv coding – lossy distributed coding===
In general, a Wyner–Ziv coding scheme is obtained by adding a quantizer and a de-quantizer to the Slepian–Wolf coding scheme. Therefore, a Wyner–Ziv coder design can focus on the design of the quantizer and the corresponding reconstruction method. Several quantizer designs have been proposed, such as a nested lattice quantizer,<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1289429 "Nested quantization and Slepian–Wolf coding: a Wyner–Ziv coding paradigm for i.i.d. sources" by Z. Xiong, A. D. Liveris, S. Cheng and Z. Liu]</ref> a trellis-coded quantizer<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4784347 "Wyner–Ziv coding based on TCQ and LDPC codes" by Y. Yang, S. Cheng, Z. Xiong and W. Zhao]</ref> and a Lloyd quantization method.<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1193992 "Design of optimal quantizers for distributed source coding" by D. Rebollo-Monedero, R. Zhang and B. Girod]</ref>
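The "quantizer plus binning" structure can be illustrated with a toy one-dimensional example. This is a simplified sketch and not one of the cited designs: a uniform scalar quantizer with step size <code>step</code> plays the role of the quantizer, and reducing the quantization index modulo <code>num_cosets</code> stands in for the Slepian–Wolf stage. All names and parameter values are assumptions chosen for illustration.

```python
def wz_encode(x, step, num_cosets):
    """Quantize x uniformly, then keep only the coset of the index."""
    q = round(x / step)
    return q % num_cosets

def wz_decode(coset, y, step, num_cosets):
    """Pick the index in the given coset that lies closest to y / step."""
    k = round((y / step - coset) / num_cosets)
    q = coset + num_cosets * k
    return q * step  # de-quantize to the reconstruction point

step, num_cosets = 0.5, 4
x, y = 3.3, 3.0  # y is the decoder's noisy view of x
s = wz_encode(x, step, num_cosets)         # only log2(4) = 2 bits are sent
x_hat = wz_decode(s, y, step, num_cosets)
print(s, x_hat)  # 3 3.5
```

The reconstruction is correct as long as the side information stays within roughly half the coset spacing of the quantized value; otherwise the decoder selects the wrong member of the coset, which is the failure mode that a properly designed Slepian–Wolf code is meant to make rare.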
===Large scale distributed quantization===
Unfortunately, the above approaches do not scale (in design or operational complexity requirements) to sensor networks of large sizes, a scenario where distributed compression is most helpful. If there are N sources transmitting at R bits each (with some distributed coding scheme), the number of possible reconstructions scales as <math> 2^{NR}</math>. Even for moderate values of N and R (say N=10, R = 2), prior design schemes become impractical. Recently, an approach<ref>[http://www.scl.ece.ucsb.edu/pubs/pubs_D/d10_4.pdf "Towards large scale distributed source coding" by S. Ramaswamy, K. Viswanatha, A. Saxena and K. Rose]</ref> using ideas borrowed from Fusion Coding of Correlated Sources has been proposed, in which design and operational complexity are traded against decoder performance. This has allowed distributed quantizer design for network sizes reaching 60 sources, with substantial gains over traditional approaches.
The central idea is the presence of a bit-subset selector which maintains a certain subset of the received bits (NR bits, in the above example) for each source. Let <math> \mathcal{B}</math> be the set of all subsets of the NR bits, i.e.
:<math>\mathcal{B} = 2^{\{1,...,NR\}} </math>
Then, we define the bit-subset selector mapping to be
<br />
:<math> \mathcal{S} : \{1,...,N\} \rightarrow \mathcal{B} </math>
Note that each choice of the bit-subset selector imposes a storage requirement (C) that is exponential in the cardinality of the set of chosen bits.
<br />
:<math> C = \sum_{n=1}^N 2^{|\mathcal{S}(n)|} </math>
This allows a judicious choice of bits that minimizes the distortion, given the constraints on decoder storage. Additional limitations on the set of allowable subsets are still needed. The effective cost function that needs to be minimized is a weighted sum of distortion and decoder storage:
<br />
:<math> J = D + \lambda C </math>
The system design is performed by iteratively (and incrementally) optimizing the encoders, decoder and bit-subset selector until convergence.
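The storage term and the cost function are simple to evaluate for a given selector. The sketch below represents the selector as a dictionary from source index to a set of bit positions; this representation, the function names and the numerical values are illustrative assumptions, and the distortion is taken as a given number rather than computed.

```python
def decoder_storage(selector):
    """C = sum over sources n of 2 ** |S(n)|."""
    return sum(2 ** len(bits) for bits in selector.values())

def lagrangian_cost(distortion, selector, lam):
    """J = D + lambda * C for a given distortion and selector."""
    return distortion + lam * decoder_storage(selector)

N, R = 10, 2
all_bits = set(range(1, N * R + 1))

# Every source decoded from all NR received bits.
full = {n: all_bits for n in range(1, N + 1)}
# Every source decoded from only four of the received bits (arbitrary choice).
restricted = {n: {1, 2, 3, 4} for n in range(1, N + 1)}

print(decoder_storage(full))        # 10485760
print(decoder_storage(restricted))  # 160
```

The gap between the two storage figures shows why restricting the selector makes the design tractable, at the price of a possibly larger distortion term in the cost.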
==Non-asymmetric DSC==
{{Empty section|date=June 2010}}
==Non-asymmetric DSC for more than two sources==
The syndrome approach can still be used for more than two sources. Let us consider <math>a</math> binary sources of length <math>n</math>, <math> \mathbf{x}_1,\mathbf{x}_2,\cdots, \mathbf{x}_a \in \{0,1\}^n </math>. Let <math> \mathbf{H}_1, \mathbf{H}_2, \cdots, \mathbf{H}_a </math> be the corresponding coding matrices of sizes <math> m_1 \times n, m_2 \times n, \cdots, m_a \times n</math>. Then the input binary sources are compressed into <math> \mathbf{s}_1 = \mathbf{H}_1 \mathbf{x}_1, \mathbf{s}_2 = \mathbf{H}_2 \mathbf{x}_2, \cdots, \mathbf{s}_a = \mathbf{H}_a \mathbf{x}_a </math>, a total of <math> m= m_1 + m_2 + \cdots + m_a </math> bits. Clearly, two source tuples cannot both be recovered if they share the same syndrome. In other words, if all source tuples of interest have different syndromes, then one can recover them losslessly.
General theoretical results do not seem to exist. However, for a restricted kind of source, the so-called Hamming source,<ref name="HCMS">[http://arxiv.org/pdf/1001.4072 "Hamming Codes for Multiple Sources" by R. Ma and S. Cheng]</ref> in which at most one source differs from the rest and at most one bit location is not identical across all sources, practical lossless DSC has been shown to exist in some cases. For the case when there are more than two sources, the number of source tuples in a Hamming source is <math>2^n (a n + 1)</math>. Therefore, the packing bound <math>2^m \ge 2^n (a n + 1)</math> must be satisfied. When the packing bound is satisfied with equality, such a code may be called perfect (by analogy with perfect codes in error-correcting coding).<ref name="HCMS" />
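Parameter sets that meet the packing bound with equality can be found by a short search. The sketch below only tests the counting condition, which is necessary for a perfect code but does not guarantee that one exists; the function names are illustrative.

```python
def meets_packing_bound(a, n, m):
    """True if 2**m >= 2**n * (a*n + 1)."""
    return 2 ** m >= 2 ** n * (a * n + 1)

def equality_parameters(a, max_n):
    """All (n, m) with n <= max_n such that 2**m == 2**n * (a*n + 1)."""
    found = []
    for n in range(1, max_n + 1):
        count = a * n + 1
        if count & (count - 1) == 0:  # a*n + 1 is a power of two
            found.append((n, n + count.bit_length() - 1))
    return found

print(equality_parameters(3, 21))  # [(1, 3), (5, 9), (21, 27)]
```

For three sources, the entry <math>n=1</math>, <math>m=3</math> corresponds to sending every bit uncompressed, so the first candidates that actually compress are <math>n=5</math>, <math>m=9</math> and <math>n=21</math>, <math>m=27</math>.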
The simplest set of <math> a, n, m</math> satisfying the packing bound with equality is <math> a=3, n=5, m=9 </math>. However, it turns out that no such syndrome code exists.<ref>[http://tulsagrad.ou.edu/samuel_cheng/papers/dcc10.pdf "The Non-existence of Length-5 Slepian–Wolf Codes of Three Sources" by S. Cheng and R. Ma]</ref> The simplest (perfect) syndrome code with more than two sources has <math> n = 21 </math> and <math> m = 27 </math>. Let
<math> | |||
\mathbf{Q}_1 = | |||
\begin{pmatrix} | |||
1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \\ | |||
0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \\ | |||
0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 | |||
\end{pmatrix}, | |||
</math> | |||
<math> | |||
\mathbf{Q}_2= | |||
\begin{pmatrix} | |||
0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \\ | |||
1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \\ | |||
0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \\ | |||
1 \; 0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \\ | |||
0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \\ | |||
0 \; 0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 | |||
\end{pmatrix}, | |||
</math> | |||
<math> | |||
\mathbf{Q}_3= | |||
\begin{pmatrix} | |||
1 \; 0 \; 0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \\ | |||
1 \; 1 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \\ | |||
0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \\ | |||
1 \; 0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \\ | |||
0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \\ | |||
0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 | |||
\end{pmatrix}, | |||
</math> | |||
<math>
\mathbf{G} = [ \mathbf{0} | \mathbf{I}_9]
</math>,
and
<math>
\mathbf{G}=\begin{pmatrix}
\mathbf{G}_1 \\ \mathbf{G}_2 \\ \mathbf{G}_3
\end{pmatrix}
</math>
such that <math>
\mathbf{G}_1, \mathbf{G}_2, \mathbf{G}_3
</math>
form any partition of the rows of <math> \mathbf{G} </math>. Then
<math>
\mathbf{H}_1= \begin{pmatrix}
\mathbf{G}_1 \\ \mathbf{Q}_1
\end{pmatrix},
\mathbf{H}_2= \begin{pmatrix}
\mathbf{G}_2 \\ \mathbf{Q}_2
\end{pmatrix},
\mathbf{H}_3= \begin{pmatrix}
\mathbf{G}_3 \\ \mathbf{Q}_3
\end{pmatrix}
</math>
can compress a Hamming source (i.e., sources that have no more than one bit different will all have different syndromes).<ref name="HCMS" />
For example, for the symmetric case, a possible set of coding matrices are | |||
<math> | |||
\mathbf{H}_1 = | |||
\begin{pmatrix} | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \\ | |||
1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \\ | |||
0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \\ | |||
0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 | |||
\end{pmatrix}, | |||
</math> | |||
<math> | |||
\mathbf{H}_2= | |||
\begin{pmatrix} | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \\ | |||
1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \\ | |||
0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \\ | |||
1 \; 0 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \\ | |||
0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \\ | |||
0 \; 0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 | |||
\end{pmatrix}, | |||
</math> | |||
<math> | |||
\mathbf{H}_3= | |||
\begin{pmatrix} | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \\ | |||
0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 0 \; 0 \\ | |||
1 \; 0 \; 0 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \\ | |||
1 \; 1 \; 0 \; 0 \; 1 \; 0 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \\ | |||
0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \\ | |||
1 \; 0 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \\ | |||
0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 0 \; 0 \; 1 \; 1 \; 0 \; 1 \; 0 \; 0 \\ | |||
0 \; 0 \; 1 \; 0 \; 1 \; 1 \; 0 \; 1 \; 1 \; 1 \; 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 0 \; 0 \; 0 \; 1 \; 1 | |||
\end{pmatrix}. | |||
</math> | |||
==See also==
*[[Linear code]]
*[[Syndrome decoding]]
*[[Low-density parity-check code]]
*[[Turbo Code]]
==References==
{{Reflist}}
{{DEFAULTSORT:Distributed Source Coding}}
[[Category:Information theory]]
[[Category:Coding theory]]
[[Category:Wireless sensor network]]
[[Category:Data transmission]]
Latest revision as of 06:21, 13 April 2013
Distributed source coding (DSC) is an important problem in information theory and communication. DSC problems regard the compression of multiple correlated information sources that do not communicate with each other.[1] By modeling the correlation between multiple sources at the decoder side together with channel codes, DSC shifts the computational burden from the encoders to the joint decoder, and therefore provides appropriate frameworks for applications with a complexity-constrained sender, such as sensor networks and video/multimedia compression (see distributed video coding[2]).
History
In 1973, David Slepian and Jack Keil Wolf proposed the information-theoretic lossless compression bound on distributed compression of two statistically dependent i.i.d. sources X and Y.[3] This bound was extended to cases with more than two sources by Thomas M. Cover in 1975,[4] while the theoretical results for the lossy compression case were presented by Aaron D. Wyner and Jacob Ziv in 1976.[5]
Although the theorems on DSC were proposed in the 1970s, attempts at practical techniques started only about 30 years later, based on the idea, proposed in 1974 by Aaron D. Wyner, that DSC is closely related to channel coding.[6] The asymmetric DSC problem was addressed by S. S. Pradhan and K. Ramchandran in 1999; their work focused on statistically dependent binary and Gaussian sources and used scalar and trellis coset constructions to solve the problem.[7] They further extended the work to the symmetric DSC case.[8]
Syndrome decoding was first used in distributed source coding by the DISCUS system (Distributed Source Coding Using Syndromes) of S. S. Pradhan and K. Ramchandran.[7] They compress binary block data from one source into syndromes and transmit data from the other source uncompressed as side information. This kind of DSC scheme achieves asymmetric compression rates per source and results in asymmetric DSC. This asymmetric DSC scheme can be easily extended to the case of more than two correlated information sources. There are also some DSC schemes that use parity bits rather than syndrome bits.
The correlation between two sources in DSC has been modeled as a virtual channel, which is usually taken to be a binary symmetric channel.[9][10]
Starting from DISCUS, DSC has attracted significant research activity and more sophisticated channel coding techniques have been adopted into DSC frameworks, such as Turbo Code, LDPC Code, and so on.
Similar to the previous lossless coding framework based on the Slepian–Wolf theorem, efforts have been made on lossy cases based on the Wyner–Ziv theorem. Theoretical results on quantizer designs were provided by R. Zamir and S. Shamai,[11] and different frameworks have been proposed based on these results, including a nested lattice quantizer and a trellis-coded quantizer.
Moreover, DSC has been used in video compression for applications which require low complexity video encoding, such as sensor networks, multiview video camcorders, and so on.[12]
Based on deterministic and probabilistic models of the correlation between two information sources, DSC schemes with more general compression rates have been developed.[13][14][15] In these non-asymmetric schemes, both correlated sources are compressed.
Under a certain deterministic assumption of correlation between information sources, a DSC framework in which any number of information sources can be compressed in a distributed way has been demonstrated by X. Cao and M. Kuijper.[16] This method performs non-asymmetric compression with flexible rates for each source, achieving the same overall compression rate as repeatedly applying asymmetric DSC for more than two sources.
Theoretical bounds
The information-theoretic lossless compression bound on DSC (the Slepian–Wolf bound) was first proposed by David Slepian and Jack Keil Wolf in terms of entropies of correlated information sources in 1973.[3] They also showed that two isolated sources can compress data as efficiently as if they were communicating with each other. This bound was extended to the case of more than two correlated sources by Thomas M. Cover in 1975.[4]
Similar results were obtained in 1976 by Aaron D. Wyner and Jacob Ziv with regard to lossy coding of joint Gaussian sources.[5]
Slepian–Wolf bound
Distributed coding is the coding of two or more dependent sources with separate encoders and a joint decoder. Given two statistically dependent i.i.d. finite-alphabet random sequences <math>X</math> and <math>Y</math>, the Slepian–Wolf theorem gives the following theoretical bound on the lossless coding rates for distributed coding of the two sources:[3]
<math>R_X \geq H(X|Y),</math>
<math>R_Y \geq H(Y|X),</math>
<math>R_X + R_Y \geq H(X,Y).</math>
If both the encoder and decoder of the two sources are independent, the lowest rates we can achieve for lossless compression are <math>H(X)</math> and <math>H(Y)</math> for <math>X</math> and <math>Y</math> respectively, where <math>H(X)</math> and <math>H(Y)</math> are the entropies of <math>X</math> and <math>Y</math>. However, with joint decoding, if a vanishing error probability for long sequences is accepted, the Slepian–Wolf theorem shows that a much better compression rate can be achieved. As long as the total rate of <math>X</math> and <math>Y</math> is at least their joint entropy <math>H(X,Y)</math> and each source is encoded at a rate of at least its conditional entropy given the other source, distributed coding can achieve arbitrarily small error probability for long sequences.
A special case of distributed coding is compression with decoder side information, where source <math>Y</math> is available at the decoder side but not accessible at the encoder side. This can be treated as the condition that <math>R_Y = H(Y)</math> has already been used to encode <math>Y</math>, while we intend to use <math>H(X|Y)</math> to encode <math>X</math>. The whole system operates in an asymmetric way (the compression rates for the two sources are asymmetric).
Wyner–Ziv bound
Shortly after the Slepian–Wolf theorem on lossless distributed compression was published, the extension to lossy compression with decoder side information was proposed as the Wyner–Ziv theorem.[5] As in the lossless case, two statistically dependent i.i.d. sources <math>X</math> and <math>Y</math> are given, where <math>Y</math> is available at the decoder side but not accessible at the encoder side. Instead of the lossless compression of the Slepian–Wolf theorem, the Wyner–Ziv theorem addresses the lossy compression case.
The Wyner–Ziv theorem presents the achievable lower bound for the bit rate of <math>X</math> at a given distortion <math>D</math>. It was found that for Gaussian memoryless sources and mean-squared error distortion, the lower bound for the bit rate of <math>X</math> remains the same whether or not side information is available at the encoder.
Virtual channel
Deterministic model
Probabilistic model
Asymmetric DSC vs. symmetric DSC
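Asymmetric DSC means that different bitrates are used in coding the input sources, while the same bitrate is used in symmetric DSC. Take a DSC design with two sources as an example: <math>X</math> and <math>Y</math> are two discrete, memoryless, uniformly distributed sources which generate variables <math>\mathbf{x}</math> and <math>\mathbf{y}</math> of length 7 bits, and the Hamming distance between <math>\mathbf{x}</math> and <math>\mathbf{y}</math> is at most one. The Slepian–Wolf bound for them is:
<math>R_X + R_Y \geq 10,</math>
<math>R_X \geq 3,</math>
<math>R_Y \geq 3.</math>
This means that the theoretical bound is <math>R_X + R_Y = 10</math>, and symmetric DSC means 5 bits for each source. Other pairs with <math>R_X + R_Y = 10</math> are asymmetric cases with different bit rate distributions between <math>X</math> and <math>Y</math>, where <math>R_X = 3</math>, <math>R_Y = 7</math> and <math>R_Y = 3</math>, <math>R_X = 7</math> represent the two extreme cases called decoding with side information.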
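The numbers in this example can be checked by direct enumeration. The following is a minimal sketch, assuming that every pair of 7-bit words at Hamming distance at most one is equally likely; the names `hamming_distance` and `in_rate_region` are illustrative and do not come from the cited papers.

```python
from math import log2

N_BITS = 7  # block length used in the example


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which the integers a and b differ."""
    return bin(a ^ b).count("1")


# All (x, y) pairs allowed by the correlation model d_H(x, y) <= 1.
pairs = [(x, y)
         for x in range(2 ** N_BITS)
         for y in range(2 ** N_BITS)
         if hamming_distance(x, y) <= 1]

# Assumption: all allowed pairs are equally likely, so each entropy is the
# base-2 logarithm of a count.
h_xy = log2(len(pairs))      # joint entropy H(X, Y)
h_y = log2(2 ** N_BITS)      # marginal entropy H(Y); H(X) is the same by symmetry
h_x_given_y = h_xy - h_y     # conditional entropy H(X|Y) = H(X, Y) - H(Y)


def in_rate_region(r_x: float, r_y: float) -> bool:
    """Check the three Slepian-Wolf inequalities for this example."""
    return r_x >= h_x_given_y and r_y >= h_x_given_y and r_x + r_y >= h_xy
```

Under this assumption the enumeration finds 1024 allowed pairs, so the joint entropy is 10 bits and each conditional entropy is 3 bits, in line with the bound stated above.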
Practical distributed source coding
Slepian–Wolf coding – lossless distributed coding
In 1974 it was understood that Slepian–Wolf coding is closely related to channel coding,[6] and about 30 years later practical DSC started to be implemented with different channel codes. The motivation for using channel codes comes from the two-source case: the correlation between the input sources can be modeled as a virtual channel which has source <math>X</math> as input and source <math>Y</math> as output. The DISCUS system proposed by S. S. Pradhan and K. Ramchandran in 1999 implemented DSC with syndrome decoding, which worked for the asymmetric case and was further extended to the symmetric case.[7][8]
The basic framework of syndrome-based DSC is that, for each source, its input space is partitioned into several cosets according to the particular channel coding method used. Every input of each source gets an output indicating which coset the input belongs to, and the joint decoder can decode all inputs from the received coset indices and the dependence between sources. The design of the channel codes should take the correlation between input sources into account.
A group of codes can be used to generate coset partitions,[17] such as trellis codes and lattice codes. Pradhan and Ramchandran designed rules for the construction of sub-codes for each source, and presented results for trellis-based coset constructions in DSC, which are based on convolutional codes and set-partitioning rules as in trellis modulation, as well as for lattice-code-based DSC.[7][8] After this, an embedded trellis code was proposed for asymmetric coding as an improvement over their results.[18]
After the DISCUS system was proposed, more sophisticated channel codes were adapted to the DSC system, such as Turbo Code, LDPC Code and Iterative Channel Code. The encoders of these codes are usually simple and easy to implement, while the decoders have much higher computational complexity and are able to get good performance by utilizing source statistics. With sophisticated channel codes whose performance approaches the capacity of the correlation channel, the corresponding DSC system can approach the Slepian–Wolf bound.
Although most research has focused on DSC with two dependent sources, Slepian–Wolf coding has been extended to the case of more than two input sources, and methods for generating sub-codes from one channel code were proposed by V. Stankovic, A. D. Liveris, et al. for particular correlation models.[19]
General theorem of Slepian–Wolf coding with syndromes for two sources
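Theorem: Any pair of correlated uniformly distributed sources, <math>X, Y \in \{0,1\}^n</math>, with <math>d_H(X, Y) \leq t</math>, can be compressed separately at a rate pair <math>(R_1, R_2)</math> such that <math>R_1, R_2 \geq n-k</math> and <math>R_1 + R_2 \geq 2n-k</math>, where <math>R_1</math> and <math>R_2</math> are integers, and <math>k \leq n - \log\left(\sum_{i=0}^{t} \binom{n}{i}\right)</math>. This can be achieved using an <math>(n,k,2t+1)</math> binary linear code.
Proof: The Hamming bound for an <math>(n,k,2t+1)</math> binary linear code is <math>k \leq n - \log\left(\sum_{i=0}^{t} \binom{n}{i}\right)</math>, and we have the Hamming code achieving this bound, therefore we have such a binary linear code <math>\mathcal{C}</math> with <math>k \times n</math> generator matrix <math>\mathbf{G}</math>. Next we will show how to construct syndrome encoding based on this linear code.
Let <math>R_1 + R_2 = 2n-k</math> and let <math>\mathbf{G}_1</math> be formed by taking the first <math>(n - R_1)</math> rows from <math>\mathbf{G}</math>, while <math>\mathbf{G}_2</math> is formed using the remaining <math>(n - R_2)</math> rows of <math>\mathbf{G}</math>. <math>\mathcal{C}_1</math> and <math>\mathcal{C}_2</math> are the subcodes of the Hamming code generated by <math>\mathbf{G}_1</math> and <math>\mathbf{G}_2</math> respectively, with <math>\mathbf{H}_1</math> and <math>\mathbf{H}_2</math> as their parity check matrices.
For a pair of inputs <math>(\mathbf{x}, \mathbf{y})</math>, the encoder is given by <math>\mathbf{s}_1 = \mathbf{H}_1 \mathbf{x}</math> and <math>\mathbf{s}_2 = \mathbf{H}_2 \mathbf{y}</math>. That means we can represent <math>\mathbf{x}</math> and <math>\mathbf{y}</math> as <math>\mathbf{x} = \mathbf{u}_1 \mathbf{G}_1 + \mathbf{c}_{s_1}</math>, <math>\mathbf{y} = \mathbf{u}_2 \mathbf{G}_2 + \mathbf{c}_{s_2}</math>, where <math>\mathbf{c}_{s_1}, \mathbf{c}_{s_2}</math> are the representatives of the cosets of <math>\mathbf{s}_1, \mathbf{s}_2</math> with regard to <math>\mathcal{C}_1, \mathcal{C}_2</math> respectively. Since we have <math>\mathbf{y} = \mathbf{x} + \mathbf{e}</math> with <math>w(\mathbf{e}) \leq t</math>, we get <math>\mathbf{x} + \mathbf{y} = \mathbf{u}\mathbf{G} + \mathbf{c}_s = \mathbf{e}</math>, where <math>\mathbf{u} = [\mathbf{u}_1, \mathbf{u}_2]</math> and <math>\mathbf{c}_s = \mathbf{c}_{s_1} + \mathbf{c}_{s_2}</math>.
Suppose there are two different input pairs with the same syndromes. That means there are two different strings <math>\mathbf{u}^1, \mathbf{u}^2 \in \{0,1\}^k</math> such that <math>\mathbf{u}^1 \mathbf{G} + \mathbf{c}_s = \mathbf{e}_1</math> and <math>\mathbf{u}^2 \mathbf{G} + \mathbf{c}_s = \mathbf{e}_2</math>, with <math>w(\mathbf{e}_1) \leq t</math> and <math>w(\mathbf{e}_2) \leq t</math>. Thus we will have <math>(\mathbf{u}^1 - \mathbf{u}^2)\mathbf{G} \neq \mathbf{0}</math>. Because the minimum Hamming weight of the code <math>\mathcal{C}</math> is <math>2t+1</math>, the distance between <math>\mathbf{u}^1 \mathbf{G}</math> and <math>\mathbf{u}^2 \mathbf{G}</math> is <math>\geq 2t+1</math>. On the other hand, according to <math>w(\mathbf{e}_1) \leq t</math> and <math>w(\mathbf{e}_2) \leq t</math> together with <math>\mathbf{u}^1 \mathbf{G} + \mathbf{c}_s = \mathbf{e}_1</math> and <math>\mathbf{u}^2 \mathbf{G} + \mathbf{c}_s = \mathbf{e}_2</math>, we will have <math>d_H(\mathbf{u}^1 \mathbf{G}, \mathbf{c}_s) \leq t</math> and <math>d_H(\mathbf{u}^2 \mathbf{G}, \mathbf{c}_s) \leq t</math>, so that <math>d_H(\mathbf{u}^1 \mathbf{G}, \mathbf{u}^2 \mathbf{G}) \leq 2t</math> by the triangle inequality, which contradicts <math>d_H(\mathbf{u}^1 \mathbf{G}, \mathbf{u}^2 \mathbf{G}) \geq 2t+1</math>. Therefore, we cannot have more than one input pair with the same syndromes.
Therefore, we can successfully compress the two dependent sources with subcodes constructed from an <math>(n,k,2t+1)</math> binary linear code, with rate pair <math>(R_1, R_2)</math> such that <math>R_1, R_2 \geq n-k</math> and <math>R_1 + R_2 \geq 2n-k</math>, where <math>R_1</math> and <math>R_2</math> are integers, and <math>k \leq n - \log\left(\sum_{i=0}^{t} \binom{n}{i}\right)</math>. Here <math>\log</math> denotes <math>\log_2</math>.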
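The parameters in the theorem are easy to evaluate numerically. The sketch below assumes the usual notation (block length n, code dimension k, at most t differing bits); the helper names `max_dimension` and `corner_points` are hypothetical. It only evaluates the Hamming bound and does not show that a code meeting the bound exists for a given n and t.

```python
from math import comb, floor, log2


def max_dimension(n: int, t: int) -> int:
    """Largest integer k allowed by the Hamming bound
    k <= n - log2(sum_{i=0}^{t} C(n, i))."""
    ball = sum(comb(n, i) for i in range(t + 1))  # size of a radius-t Hamming ball
    return floor(n - log2(ball))


def corner_points(n: int, k: int):
    """The two extreme rate pairs (R1, R2) on the line R1 + R2 = 2n - k."""
    return [(n - k, n), (n, n - k)]
```

For n = 7 and t = 1 this gives k = 4, so the corner points are (3, 7) and (7, 3) and the total rate is 2n - k = 10 bits, which matches the 7-bit example used in this article.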
Slepian–Wolf coding example
Using the same example as in the previous Asymmetric DSC vs. symmetric DSC part, this part presents the corresponding DSC schemes with coset codes and syndromes, covering both the asymmetric case and the symmetric case. The Slepian–Wolf bound for this DSC design is given in the previous part.
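Asymmetric case (<math>R_X = 3</math>, <math>R_Y = 7</math>)
In this case, the length of an input variable <math>\mathbf{y}</math> from source <math>Y</math> is 7 bits, therefore it can be sent losslessly with 7 bits independent of any other bits. Based on the knowledge that <math>\mathbf{x}</math> and <math>\mathbf{y}</math> have Hamming distance at most one, for an input <math>\mathbf{x}</math> from source <math>X</math>, since the receiver already has <math>\mathbf{y}</math>, the only possible <math>\mathbf{x}</math> are those at distance at most 1 from <math>\mathbf{y}</math>. If we model the correlation between the two sources as a virtual channel, which has input <math>\mathbf{x}</math> and output <math>\mathbf{y}</math>, then as long as we get <math>\mathbf{y}</math>, all we need to successfully "decode" <math>\mathbf{x}</math> is "parity bits" with a particular error correction ability, taking the difference between <math>\mathbf{x}</math> and <math>\mathbf{y}</math> as the channel error. We can also model the problem with a coset partition. That is, we want to find a channel code which is able to partition the space of input <math>X</math> into several cosets, where each coset has a unique syndrome associated with it. With a given coset and <math>\mathbf{y}</math>, there is only one <math>\mathbf{x}</math> that can be the input given the correlation between the two sources.
In this example, we can use the <math>(7,4,3)</math> binary Hamming code <math>\mathcal{C}</math>, with parity check matrix <math>\mathbf{H}</math>. For an input <math>\mathbf{x}</math> from source <math>X</math>, only the syndrome given by <math>\mathbf{s} = \mathbf{H}\mathbf{x}</math> is transmitted, which is 3 bits. With received <math>\mathbf{y}</math> and <math>\mathbf{s}</math>, suppose there are two inputs <math>\mathbf{x}_1</math> and <math>\mathbf{x}_2</math> with the same syndrome <math>\mathbf{s}</math>. That means <math>\mathbf{H}\mathbf{x}_1 = \mathbf{H}\mathbf{x}_2</math>, which is <math>\mathbf{H}(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}</math>. Since the minimum Hamming weight of the <math>(7,4,3)</math> Hamming code is 3, <math>d_H(\mathbf{x}_1, \mathbf{x}_2) \geq 3</math>. Therefore the input <math>\mathbf{x}</math> can be recovered since <math>d_H(\mathbf{x}, \mathbf{y}) \leq 1</math>.
Similarly, the bit distribution with <math>R_X = 7</math>, <math>R_Y = 3</math> can be achieved by reversing the roles of <math>X</math> and <math>Y</math>.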
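The asymmetric scheme can be sketched in a few lines. The parity-check matrix below is one common choice for a (7,4,3) Hamming code and is an assumption, since the article does not fix a particular matrix; the function names `encode_x` and `decode_x` are likewise illustrative.

```python
# Assumed parity-check matrix of a (7,4,3) Hamming code: column j (1-based)
# is the binary expansion of j, so the syndrome of a single-bit error
# spells out the position of that bit.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(word):
    """3-bit syndrome H * word over GF(2)."""
    return tuple(sum(h * b for h, b in zip(row, word)) % 2 for row in H)


def encode_x(x):
    """Encoder for source X: transmit only the 3-bit syndrome of the 7-bit word."""
    return syndrome(x)


def decode_x(s_x, y):
    """Recover x from its syndrome s_x and the side information y,
    assuming d_H(x, y) <= 1."""
    s_y = syndrome(y)
    # Syndrome of the difference e = x + y; by linearity H e = H x + H y.
    s_e = tuple((a + b) % 2 for a, b in zip(s_x, s_y))
    pos = s_e[0] * 4 + s_e[1] * 2 + s_e[2]  # 0 means x equals y
    x = list(y)
    if pos:
        x[pos - 1] ^= 1  # flip the single differing bit
    return x
```

Source Y sends its 7 bits unchanged and source X sends 3 bits, for a total of 10 bits. Decoding relies only on the difference between the two words having weight at most one, which is exactly the correlation model of the example.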
Symmetric case
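In the symmetric case, what we want is an equal bitrate for the two sources: 5 bits each with separate encoders and a joint decoder. We still use linear codes for this system, as we did for the asymmetric case. The basic idea is similar, but in this case we need to do a coset partition for both sources, such that for a pair of received syndromes (each corresponding to one coset), only one pair of input variables is possible given the correlation between the two sources.
Suppose we have a pair of linear codes <math>\mathcal{C}_1</math> and <math>\mathcal{C}_2</math> and an encoder-decoder pair based on these linear codes which can achieve symmetric coding. The encoder output is given by <math>\mathbf{s}_1 = \mathbf{H}_1 \mathbf{x}</math> and <math>\mathbf{s}_2 = \mathbf{H}_2 \mathbf{y}</math>. If there exist two pairs of valid inputs <math>\mathbf{x}_1, \mathbf{y}_1</math> and <math>\mathbf{x}_2, \mathbf{y}_2</math> generating the same syndromes, i.e. <math>\mathbf{H}_1 \mathbf{x}_1 = \mathbf{H}_1 \mathbf{x}_2</math> and <math>\mathbf{H}_2 \mathbf{y}_1 = \mathbf{H}_2 \mathbf{y}_2</math>, we get the following (<math>w(\cdot)</math> represents Hamming weight):
<math>\mathbf{y}_1 = \mathbf{x}_1 + \mathbf{e}_1</math>, where <math>w(\mathbf{e}_1) \leq 1</math>,
<math>\mathbf{y}_2 = \mathbf{x}_2 + \mathbf{e}_2</math>, where <math>w(\mathbf{e}_2) \leq 1</math>.
Thus:
<math>\mathbf{x}_1 + \mathbf{x}_2 \in \mathcal{C}_1,</math>
<math>\mathbf{y}_1 + \mathbf{y}_2 = \mathbf{x}_1 + \mathbf{x}_2 + \mathbf{e}_3 \in \mathcal{C}_2,</math>
where <math>\mathbf{e}_3 = \mathbf{e}_2 + \mathbf{e}_1</math> and <math>w(\mathbf{e}_3) \leq 2</math>. That means, as long as the minimum distance between the two codes is larger than <math>2</math> (that is, at least <math>3</math>), we can achieve error-free decoding.
The two codes <math>\mathcal{C}_1</math> and <math>\mathcal{C}_2</math> can be constructed as subcodes of the <math>(7,4,3)</math> Hamming code and thus have a minimum distance of <math>3</math>. Given the generator matrix <math>\mathbf{G}</math> of the original Hamming code, the generator matrix <math>\mathbf{G}_1</math> for <math>\mathcal{C}_1</math> is constructed by taking any two rows from <math>\mathbf{G}</math>, and <math>\mathbf{G}_2</math> is constructed from the remaining two rows of <math>\mathbf{G}</math>. The corresponding parity-check matrix for each sub-code can be generated from its generator matrix and used to generate syndrome bits.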
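The symmetric construction can be sketched as follows. The generator matrix is an assumed systematic form of a (7,4,3) Hamming code, the split into the first two and last two rows is just one possible partition, and the parity-check matrices of the two subcodes were worked out by hand for this particular choice. The joint decoder is written as a plain lookup table for clarity, not as an efficient decoder.

```python
from itertools import product

# Assumed systematic generator matrix G = [I_4 | P] of a (7,4,3) Hamming code.
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]
G1, G2 = G[:2], G[2:]  # one possible split into two 2-row generator matrices

# 5 x 7 parity-check matrices of the subcodes generated by G1 and G2.
H1 = [
    (0, 0, 1, 0, 0, 0, 0),
    (0, 0, 0, 1, 0, 0, 0),
    (1, 1, 0, 0, 1, 0, 0),
    (1, 0, 0, 0, 0, 1, 0),
    (0, 1, 0, 0, 0, 0, 1),
]
H2 = [
    (1, 0, 0, 0, 0, 0, 0),
    (0, 1, 0, 0, 0, 0, 0),
    (0, 0, 0, 1, 1, 0, 0),
    (0, 0, 1, 1, 0, 1, 0),
    (0, 0, 1, 1, 0, 0, 1),
]


def syndrome(H, word):
    """Syndrome H * word over GF(2)."""
    return tuple(sum(h * b for h, b in zip(row, word)) % 2 for row in H)


def encode(x, y):
    """Separate encoders: each source sends only its own 5-bit syndrome."""
    return syndrome(H1, x), syndrome(H2, y)


def build_decoder():
    """Joint decoder as a lookup table over all pairs with d_H(x, y) <= 1."""
    table = {}
    for x in product((0, 1), repeat=7):
        for flip in range(-1, 7):  # -1 means y equals x
            y = tuple(b ^ (i == flip) for i, b in enumerate(x))
            key = encode(x, y)
            if key in table:
                raise ValueError("two valid pairs share a syndrome pair")
            table[key] = (x, y)
    return table


DECODER = build_decoder()
```

Each source transmits 5 bits, so the total is 10 bits as in the asymmetric case. The table has one entry for every valid input pair, so each received syndrome pair identifies exactly one pair of inputs.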
Wyner–Ziv coding – lossy distributed coding
In general, a Wyner–Ziv coding scheme is obtained by adding a quantizer and a de-quantizer to the Slepian–Wolf coding scheme. Therefore, a Wyner–Ziv coder design could focus on the quantizer and corresponding reconstruction method design. Several quantizer designs have been proposed, such as a nested lattice quantizer,[20] trellis code quantizer[21] and Lloyd quantization method.[22]
Large scale distributed quantization
Unfortunately, the above approaches do not scale (in design or operational complexity requirements) to sensor networks of large sizes, a scenario where distributed compression is most helpful. If there are N sources transmitting at R bits each (with some distributed coding scheme), the number of possible reconstructions scales as <math>O(2^{NR})</math>. Even for moderate values of N and R (say N=10, R = 2), prior design schemes become impractical. Recently, an approach,[23] using ideas borrowed from Fusion Coding of Correlated Sources, has been proposed in which design and operational complexity are traded against decoder performance. This has allowed distributed quantizer design for network sizes reaching 60 sources, with substantial gains over traditional approaches.
The central idea is the presence of a bit-subset selector which maintains a certain subset of the received bits (NR bits in the above example) for each source. Let <math>\mathcal{B}</math> be the set of all subsets of the NR bits, i.e. <math>\mathcal{B} = 2^{\{1, \ldots, NR\}}</math>.
Then, we define the bit-subset selector mapping to be <math>\mathcal{S} : \{1, \ldots, N\} \rightarrow \mathcal{B}</math>.
Note that each choice of the bit-subset selector imposes a storage requirement (C) that is exponential in the cardinality of the set of chosen bits.
This allows a judicious choice of bits that minimize the distortion, given the constraints on decoder storage. Additional limitations on the set of allowable subsets are still needed. The effective cost function that needs to be minimized is a weighted sum of distortion and decoder storage, <math>J = D + \lambda C</math>.
The system design is performed by iteratively (and incrementally) optimizing the encoders, decoder and bit-subset selector until convergence.
Non-asymmetric DSC
Non-asymmetric DSC for more than two sources
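The syndrome approach can still be used for more than two sources. Let us consider <math>a</math> binary sources of length <math>n</math>, <math>\mathbf{x}_1, \ldots, \mathbf{x}_a \in \{0,1\}^n</math>. Let <math>\mathbf{H}_1, \ldots, \mathbf{H}_a</math> be the corresponding coding matrices of sizes <math>m_1 \times n, \ldots, m_a \times n</math>. Then the input binary sources are compressed into <math>\mathbf{s}_1 = \mathbf{H}_1 \mathbf{x}_1, \ldots, \mathbf{s}_a = \mathbf{H}_a \mathbf{x}_a</math>, of total <math>m = m_1 + \cdots + m_a</math> bits. Clearly, two source tuples cannot both be recovered if they share the same syndrome. In other words, if all source tuples of interest have different syndromes, then one can recover them losslessly.
No general theoretical result seems to exist. However, for a restricted kind of source, the so-called Hamming source,[24] in which at most one source differs from the rest and at most one bit location is not identical across the sources, practical lossless DSC has been shown to exist in some cases. When there are more than two sources, the number of source tuples in a Hamming source is <math>2^n (a n + 1)</math>. Therefore, the packing bound <math>2^m \geq 2^n (a n + 1)</math> obviously has to be satisfied. When the packing bound is satisfied with equality, we may call such a code perfect (analogous to a perfect code in error-correcting codes).[24]
The simplest set of <math>a, n, m</math> satisfying the packing bound with equality is <math>a = 3, n = 5, m = 9</math>. However, it turns out that such a syndrome code does not exist.[25] The simplest (perfect) syndrome code with more than two sources has <math>n = 21</math> and <math>m = 27</math>.
Let <math>\mathbf{Q}_1, \mathbf{Q}_2, \mathbf{Q}_3</math> be the <math>6 \times 21</math> binary matrices listed above, <math>\mathbf{G} = [\mathbf{0} | \mathbf{I}_9]</math>, and <math>\mathbf{G} = \begin{pmatrix} \mathbf{G}_1 \\ \mathbf{G}_2 \\ \mathbf{G}_3 \end{pmatrix}</math> such that <math>\mathbf{G}_1, \mathbf{G}_2, \mathbf{G}_3</math> form any partition of the rows of <math>\mathbf{G}</math>.
Then <math>\mathbf{H}_i = \begin{pmatrix} \mathbf{G}_i \\ \mathbf{Q}_i \end{pmatrix}</math>, <math>i = 1, 2, 3</math>, can compress a Hamming source (i.e., sources that have no more than one bit different will all have different syndromes).[24] For example, for the symmetric case, a possible set of coding matrices is the triple <math>\mathbf{H}_1, \mathbf{H}_2, \mathbf{H}_3</math> of <math>9 \times 21</math> matrices listed above.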
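The packing bound for a Hamming source is simple to check numerically. In the sketch below the symbol names are assumptions made for illustration: `a` is the number of sources (more than two), `n` the block length and `m` the total number of syndrome bits. The count of source tuples follows the definition of a Hamming source given above. Meeting the bound with equality is only a necessary condition, as the length-5 case shows.

```python
def hamming_source_size(a: int, n: int) -> int:
    """Number of source tuples for a > 2 sources of length n: 2^n choices for
    the common word, times either no deviation or one of a * n
    (source, bit position) deviations."""
    if a < 3:
        raise ValueError("this count applies to more than two sources")
    return 2 ** n * (a * n + 1)


def packing_slack(a: int, n: int, m: int) -> int:
    """2^m minus the number of source tuples. Zero means the packing bound
    holds with equality; a negative value means m bits cannot suffice."""
    return 2 ** m - hamming_source_size(a, n)
```

For three sources, both n = 5 with m = 9 and n = 21 with m = 27 give zero slack. The second case corresponds to three 9 x 21 coding matrices, each producing 9 syndrome bits.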
See also
- Linear code
- Syndrome decoding
- Low-density parity-check code
- Turbo Code
References
- [1] "Distributed source coding for sensor networks" by Z. Xiong, A. D. Liveris, and S. Cheng
- [2] "Distributed video coding in wireless sensor networks" by R. Puri, A. Majumdar, P. Ishwar, and K. Ramchandran
- [3] "Noiseless coding of correlated information sources" by D. Slepian and J. Wolf
- [4] "A proof of the data compression theorem of Slepian and Wolf for ergodic sources" by T. Cover
- [5] "The rate-distortion function for source coding with side information at the decoder" by A. Wyner and J. Ziv
- [6] "Recent results in Shannon theory" by A. D. Wyner
- [7] "Distributed source coding using syndromes (DISCUS): design and construction" by S. S. Pradhan and K. Ramchandran
- [8] "Distributed source coding: symmetric rates and applications to sensor networks" by S. S. Pradhan and K. Ramchandran
- [9] "Distributed code constructions for the entire Slepian–Wolf rate region for arbitrarily correlated sources" by D. Schonberg, K. Ramchandran, and S. S. Pradhan
- [10] "Generalized coset codes for distributed binning" by S. S. Pradhan and K. Ramchandran
- [11] "Nested linear/lattice codes for Wyner–Ziv encoding" by R. Zamir and S. Shamai
- [12] "Distributed Video Coding" by B. Girod et al.
- [13] "On code design for the Slepian–Wolf problem and lossless multiterminal networks" by V. Stankovic, A. D. Liveris, Z. Xiong, and C. N. Georghiades
- [14] "A general and optimal framework to achieve the entire rate region for Slepian–Wolf coding" by P. Tan and J. Li
- [15] "Distributed source coding using short to moderate length rate-compatible LDPC codes: the entire Slepian–Wolf rate region" by M. Sartipi and F. Fekri
- [16] "A distributed source coding framework for multiple sources" by X. Cao and M. Kuijper
- [17] "Coset codes. I. Introduction and geometrical classification" by G. D. Forney
- [18] "Design of trellis codes for source coding with side information at the decoder" by X. Wang and M. Orchard
- [19] "Design of Slepian–Wolf codes by channel code partitioning" by V. Stankovic, A. D. Liveris, Z. Xiong, and C. N. Georghiades
- [20] "Nested quantization and Slepian–Wolf coding: a Wyner–Ziv coding paradigm for i.i.d. sources" by Z. Xiong, A. D. Liveris, S. Cheng, and Z. Liu
- [21] "Wyner–Ziv coding based on TCQ and LDPC codes" by Y. Yang, S. Cheng, Z. Xiong, and W. Zhao
- [22] "Design of optimal quantizers for distributed source coding" by D. Rebollo-Monedero, R. Zhang, and B. Girod
- [23] "Towards large scale distributed source coding" by S. Ramaswamy, K. Viswanatha, A. Saxena, and K. Rose
- [24] "Hamming Codes for Multiple Sources" by R. Ma and S. Cheng
- [25] "The Non-existence of Length-5 Slepian–Wolf Codes of Three Sources" by S. Cheng and R. Ma