{{technical|date=June 2012}}

In [[information theory]] and [[signal processing]], the '''Discrete Universal Denoiser''' (DUDE) is a [[denoising]] scheme for recovering sequences over a finite alphabet that have been corrupted by a [[Noisy-channel coding theorem|discrete memoryless channel]]. The DUDE was proposed in 2005 by Tsachy Weissman, Erik Ordentlich, Gadiel Seroussi, Sergio Verdú and Marcelo J. Weinberger.<ref name="dude-orig">T. Weissman, E. Ordentlich, G. Seroussi, S. Verdú, and M.J. Weinberger. Universal discrete denoising: Known channel. IEEE Transactions on Information Theory, 51(1):5–28, 2005.</ref>

== Overview ==

The '''Discrete Universal Denoiser'''<ref name="dude-orig" /> (DUDE) is a [[denoising]] scheme that estimates an unknown signal <math>x^n=\left( x_1 \ldots x_n \right)</math> over a finite alphabet from a noisy version <math>z^n=\left( z_1 \ldots z_n \right)</math>. While most [[denoising]] schemes in the signal processing and statistics literature deal with [[Signal (information theory)|signals]] over an infinite alphabet (notably, real-valued signals), the DUDE addresses the finite alphabet case. The noisy version <math>z^n</math> is assumed to be generated by transmitting <math>x^n</math> through a known [[Noisy-channel coding theorem|discrete memoryless channel]].

For a fixed ''context length'' parameter <math>k</math>, the DUDE counts the occurrences of all strings of length <math>2k+1</math> appearing in <math>z^n</math>. The estimated value <math>\hat{x}_i</math> is determined based on the two-sided length-<math>k</math> ''context'' <math>\left( z_{i-k}, \ldots, z_{i-1},z_{i+1}, \ldots,z_{i+k} \right)</math> of <math>z_i</math>, taking into account all the other tokens in <math>z^n</math> with the same context, as well as the known channel matrix and the loss function being used.

The idea underlying the DUDE is best illustrated when <math>x^n</math> is a realization of a random vector <math>X^n</math>. If the conditional distribution <math>X_i | Z_{i-k}, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_{i+k}</math>, namely the distribution of the noiseless symbol <math>X_i</math> conditional on its noisy context <math>\left( Z_{i-k}, \ldots, Z_{i-1},Z_{i+1}, \ldots,Z_{i+k} \right)</math>, were available, the optimal estimator <math>\hat{X}_i</math> would be the [[Bayes estimator|Bayes response]] to <math>X_i | Z_{i-k}, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_{i+k}</math>. Fortunately, when the channel matrix is known and non-degenerate, this conditional distribution can be expressed in terms of the conditional distribution <math>Z_i | Z_{i-k}, \ldots, Z_{i-1}, Z_{i+1}, \ldots, Z_{i+k}</math>, namely the distribution of the noisy symbol <math>Z_i</math> conditional on its noisy context. This conditional distribution, in turn, can be estimated from an individual observed noisy signal <math>Z^n</math> by virtue of the [[law of large numbers]], provided <math>n</math> is "large enough".

Applying the DUDE scheme with a context length <math>k</math> to a sequence of length <math>n</math> over a finite alphabet <math>\mathcal{Z}</math> requires <math>O(n)</math> operations and space <math>O\left( \min( n , |\mathcal{Z}|^{2k} ) \right)</math>.

Under certain assumptions, the DUDE is a universal scheme in the sense of asymptotically performing as well as an optimal denoiser with oracle access to the unknown sequence. More specifically, assume that the denoising performance is measured using a given single-character fidelity criterion, and consider the regime where the sequence length <math>n</math> tends to infinity and the context length <math>k=k_n</math> tends to infinity "not too fast". In the stochastic setting, where a doubly infinite noiseless sequence <math>\mathbf{x}</math> is a realization of a stationary process <math>\mathbf{X}</math>, the DUDE asymptotically performs, in expectation, as well as the best denoiser with oracle access to the source distribution <math>\mathbf{X}</math>. In the single-sequence, or "semi-stochastic", setting with a ''fixed'' doubly infinite sequence <math>\mathbf{x}</math>, the DUDE asymptotically performs as well as the best "sliding window" denoiser with oracle access to <math>\mathbf{x}</math>, namely any denoiser that determines <math>\hat{x}_i</math> from the window <math>\left( z_{i-k},\ldots,z_{i+k} \right)</math>.

== The discrete denoising problem ==

[[File:Denoise-scheme.pdf|thumb|600px|Block diagram description of the discrete denoising problem]]

Let <math>\mathcal{X}</math> be the finite alphabet of a fixed but unknown original "noiseless" sequence <math>x^n=\left( x_1, \ldots, x_n \right)\in \mathcal{X}^n</math>. The sequence is fed into a [[Noisy-channel coding theorem|discrete memoryless channel]] (DMC). The DMC operates on each symbol <math>x_i</math> independently, producing a corresponding random symbol <math>Z_i</math> in a finite alphabet <math>\mathcal{Z}</math>. The DMC is known and given as a <math>|\mathcal{X}|</math>-by-<math>|\mathcal{Z}|</math> Markov matrix <math>\Pi</math>, whose entries are <math>\pi(x,z)=\mathbb{P}\left( Z=z \,|\, X=x \right)</math>. It is convenient to write <math>\pi_z</math> for the <math>z</math>-column of <math>\Pi</math>. The DMC produces a random noisy sequence <math>Z^n=\left( Z_1,\ldots,Z_n \right)\in\mathcal{Z}^n</math>; a specific realization of this random vector is denoted by <math>z^n</math>. A denoiser is a function <math>\hat{X}^n: \mathcal{Z}^n \to \mathcal{X}^n</math> that attempts to recover the noiseless sequence <math>x^n</math> from a distorted version <math>z^n</math>. A specific denoised sequence is denoted by <math>\hat{x}^n=\hat{X}^n\left( z^n \right)=\left( \hat{X}_1 (z^n),\ldots , \hat{X}_n(z^n) \right)</math>. The problem of choosing the denoiser <math>\hat{X}^n</math> is known as signal estimation, [[Filter (signal processing)|filtering]] or [[smoothing]]. To compare candidate denoisers, we choose a single-symbol fidelity criterion <math>\Lambda:\mathcal{X}\times \mathcal{X}\to [0,\infty)</math> (for example, the Hamming loss) and define the per-symbol loss of the denoiser <math>\hat{X}^n</math> at <math>(x^n,z^n)</math> by

<math>
\begin{align}
L_{\hat{X}^n}\left( x^n,z^n \right) = \frac{1}{n}\sum_{i=1}^n\Lambda\left( x_i \, , \, \hat{X}_i(z^n) \right) \,.
\end{align}
</math>

Ordering the elements of the alphabet <math>\mathcal{X}</math> as <math>\mathcal{X}=\left( a_1 , \ldots , a_{|\mathcal{X}|} \right)</math>, the fidelity criterion can be given by a <math>|\mathcal{X}|</math>-by-<math>|\mathcal{X}|</math> matrix, with columns of the form

<math>
\begin{align}
\lambda_{\hat{x}} = \left(
\begin{array}{c}
\Lambda(a_1,\hat{x}) \\
\vdots \\
\Lambda(a_{|\mathcal{X}|},\hat{x})
\end{array}
\right) \,.
\end{align}
</math>
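
For example, under the Hamming loss, <math>\Lambda(x,\hat{x})=\mathbf{1}\left\{ x\neq\hat{x} \right\}</math>, this matrix is the all-ones matrix minus the identity, and each column <math>\lambda_{\hat{x}}</math> is the all-ones vector with a single zero in the entry corresponding to <math>\hat{x}</math>.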

== The DUDE scheme ==

=== Step 1: Calculating the empirical distribution in each context ===

The DUDE corrects symbols according to their context. The context length <math>k</math> used is a tuning parameter of the scheme. For <math>k+1\leq i\leq n-k</math>, define the left context of the <math>i</math>-th symbol in <math>z^n</math> by <math>l^k(z^n,i)=\left( z_{i-k},\ldots,z_{i-1} \right)</math> and the corresponding right context as <math>r^k(z^n,i)=\left( z_{i+1},\ldots,z_{i+k} \right)</math>. A two-sided context is a combination <math>(l^k,r^k)</math> of a left and a right context.

The first step of the DUDE scheme is to calculate the empirical distribution of symbols in each possible two-sided context along the noisy sequence <math>z^n</math>. Formally, a given two-sided context <math>(l^k,r^k)\in\mathcal{Z}^k\times \mathcal{Z}^k</math> that appears once or more along <math>z^n</math> determines an empirical probability distribution over <math>\mathcal{Z}</math>, whose value at the symbol <math>z</math> is

<math>
\begin{align}
\mu \left( z^n,l^k,r^k \right)[z] =
\frac{\Big| \left\{ k+1\leq i \leq n-k \,\,|\,\, ( z_{i-k},\ldots,z_{i+k})=l^k z r^k \right\} \Big|}
{\Big| \left\{ k+1\leq i \leq n-k \,\,|\,\, l^k(z^n,i)=l^k \text{ and } r^k(z^n,i)=r^k\right\} \Big|} \,.
\end{align}
</math>

Thus, the first step of the DUDE scheme with context length <math>k</math> is to scan the input noisy sequence <math>z^n</math> once, and store the length-<math>|\mathcal{Z}|</math> empirical distribution vector <math>\mu \left( z^n,l^k,r^k \right)</math> (or its non-normalized version, the count vector) for each two-sided context found along <math>z^n</math>. Since there are at most <math>N_{n,k}=\min\left( n,|\mathcal{Z}|^{2k} \right)</math> possible two-sided contexts along <math>z^n</math>, this step requires <math>O(n)</math> operations and storage <math>O(N_{n,k})</math>.
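
The counting pass is straightforward to implement. The following is a minimal sketch (our own illustration, not code from the original paper), assuming the noisy sequence is given as a Python list of hashable symbols:

<syntaxhighlight lang="python">
from collections import Counter, defaultdict

def context_counts(z, k):
    """One pass over the noisy sequence z: for every two-sided context
    (l, r) of length k, count how often each symbol occurs between l and r.
    Returns a dict mapping (l, r) -> Counter of center symbols, i.e. the
    non-normalized version of mu(z^n, l, r)."""
    counts = defaultdict(Counter)
    for i in range(k, len(z) - k):
        l = tuple(z[i - k:i])            # left context (z_{i-k}, ..., z_{i-1})
        r = tuple(z[i + 1:i + k + 1])    # right context (z_{i+1}, ..., z_{i+k})
        counts[(l, r)][z[i]] += 1
    return counts
</syntaxhighlight>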

=== Step 2: Calculating the Bayes response to each context ===

Denote the column of the single-symbol fidelity criterion <math>\Lambda</math> corresponding to the symbol <math>\hat{x}\in\mathcal{X}</math> by <math>\lambda_{\hat{x}}</math>. We define the ''Bayes response'' to any vector <math>\mathbf{v}</math> of length <math>|\mathcal{X}|</math> with non-negative entries as

<math>
\begin{align}
\hat{X}_{Bayes}(\mathbf{v}) =
\text{argmin}_{\hat{x}\in\mathcal{X}}\lambda_{\hat{x}}^\top\mathbf{v}\,.
\end{align}
</math>

This definition is motivated in the [[#Background|background]] below.

The second step of the DUDE scheme is to calculate, for each two-sided context <math>(l^k,r^k)</math> observed in the previous step along <math>z^n</math>, and for each symbol <math>z\in\mathcal{Z}</math> observed in that context (namely, any <math>z</math> such that <math>l^k z r^k</math> is a substring of <math>z^n</math>), the Bayes response to the vector <math>\Pi^{-\top}\mu\left( z^n\,,\,l^k\,,\,r^k \right)\odot \pi_{z}</math>, namely

<math>
\begin{align}
g(l^k,z,r^k) := \hat{X}_{Bayes} \left( \Pi^{-\top}\mu\left( z^n\,,\,l^k\,,\,r^k \right)\odot \pi_{z} \right)\,.
\end{align}
</math>

Note that the sequence <math>z^n</math> and the context length <math>k</math> are implicit. Here, <math>\pi_z</math> is the <math>z</math>-column of <math>\Pi</math>, and for vectors <math>\mathbf{a}</math> and <math>\mathbf{b}</math>, <math>\mathbf{a}\odot\mathbf{b}</math> denotes their Schur (entrywise) product, defined by <math>\left( \mathbf{a}\odot\mathbf{b}\right)_i = a_i b_i</math>. Matrix multiplication is evaluated before the Schur product, so that <math>\Pi^{-\top}\mu\odot\pi_z</math> stands for <math>(\Pi^{-\top}\mu)\odot\pi_z</math>.

This formula assumes that the channel matrix <math>\Pi</math> is square (<math>|\mathcal{X}|=|\mathcal{Z}|</math>) and invertible. When <math>|\mathcal{X}|\leq|\mathcal{Z}|</math> and <math>\Pi</math> is not invertible, under the reasonable assumption that it has full row rank, we replace <math>(\Pi^\top)^{-1}</math> above with the Moore–Penrose pseudo-inverse <math>\left( \Pi\Pi^\top \right)^{-1}\Pi</math> of <math>\Pi^\top</math> and calculate instead

<math>
\begin{align}
g(l^k,z,r^k):=\hat{X}_{Bayes}\left( (\Pi\Pi^\top)^{-1}\Pi \mu\left( z^n,l^k,r^k \right)\odot \pi_z \right)\,.
\end{align}
</math>

By caching the inverse or pseudo-inverse <math>\Pi^{-\top}</math>, and the values <math>\lambda_{\hat{x}}\odot \pi_z</math> for the relevant pairs <math>(\hat{x},z)\in\mathcal{X}\times\mathcal{Z}</math>, this step requires <math>O(N_{n,k})</math> operations and <math>O(N_{n,k})</math> storage.
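
The per-context computation of <math>g</math> can be sketched as follows (again our own illustration, assuming symbols are encoded as integer indices; <code>np.linalg.pinv(Pi.T)</code> equals <math>(\Pi\Pi^\top)^{-1}\Pi</math> when <math>\Pi</math> has full row rank, and reduces to <math>\Pi^{-\top}</math> when <math>\Pi</math> is square and invertible):

<syntaxhighlight lang="python">
import numpy as np

def bayes_response(v, Lam):
    """Bayes response to a nonnegative vector v indexed by the clean alphabet:
    the reconstruction symbol minimizing lambda_xhat^T v, where
    Lam[x, xhat] is the single-symbol loss matrix."""
    return int(np.argmin(Lam.T @ v))

def denoising_rule(mu, z, Pi, Lam, Pi_pinv_T=None):
    """g(l, z, r): Bayes response for a context whose empirical distribution
    over the noisy alphabet is mu, with observed noisy symbol index z."""
    if Pi_pinv_T is None:
        Pi_pinv_T = np.linalg.pinv(Pi.T)   # (Pi Pi^T)^{-1} Pi for full row rank
    v = (Pi_pinv_T @ mu) * Pi[:, z]        # (Pi^{-T} mu) entrywise-times pi_z
    return bayes_response(v, Lam)
</syntaxhighlight>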

=== Step 3: Estimating each symbol by the Bayes response to its context ===

The third and final step of the DUDE scheme is to scan <math>z^n</math> again and compute the actual denoised sequence <math>\hat{X}^n(z^n)=\left( \hat{X}_1(z^n), \ldots , \hat{X}_n(z^n) \right)</math>. The denoised symbol chosen to replace <math>z_i</math> is the Bayes response to the two-sided context of the symbol, namely

<math>
\begin{align}
\hat{X}_i(z^n) := g\left( l^k(z^n,i) \,,\, z_i \,,\, r^k(z^n,i)\right)\,.
\end{align}
</math>

This step requires <math>O(n)</math> operations and uses the data structure constructed in the previous step.

In summary, the entire DUDE requires <math>O(n)</math> operations and <math>O(N_{n,k})</math> storage.
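
Putting the three steps together, a compact reference sketch might look as follows (our own illustration, reusing the <code>context_counts</code> and <code>denoising_rule</code> helpers above; leaving the first and last <math>k</math> symbols unchanged is one simple boundary convention, not mandated by the scheme):

<syntaxhighlight lang="python">
import numpy as np

def dude(z, k, Pi, Lam):
    """Discrete universal denoiser (reference sketch, not optimized).
    z: list of integer symbols in {0, ..., |Z|-1}; k: context length;
    Pi: |X|-by-|Z| channel matrix; Lam: |X|-by-|X| loss matrix.
    Returns the denoised sequence as a list of integers in {0, ..., |X|-1}."""
    counts = context_counts(z, k)         # step 1: one counting pass
    Pi_pinv_T = np.linalg.pinv(Pi.T)      # cached (pseudo-)inverse of Pi^T
    rule = {}                             # step 2, memoized: g(l, z_i, r)
    x_hat = list(z)                       # first/last k symbols kept as observed
    for i in range(k, len(z) - k):        # step 3: Bayes response per position
        l, r = tuple(z[i - k:i]), tuple(z[i + 1:i + k + 1])
        if (l, z[i], r) not in rule:
            # Unnormalized counts suffice here: the Bayes response is
            # invariant to positive scaling of its argument.
            mu = np.zeros(Pi.shape[1])
            for sym, c in counts[(l, r)].items():
                mu[sym] = c
            rule[(l, z[i], r)] = denoising_rule(mu, z[i], Pi, Lam, Pi_pinv_T)
        x_hat[i] = rule[(l, z[i], r)]
    return x_hat
</syntaxhighlight>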

== Asymptotic optimality properties ==

The DUDE is designed to be universally optimal, namely optimal (in some sense, under some assumptions) regardless of the original sequence <math>x^n</math>.

Let <math>\hat{X}^n_{DUDE}:\mathcal{Z}^n\to\mathcal{X}^n</math> denote a sequence of DUDE schemes, as described above, where <math>\hat{X}^n_{DUDE}</math> uses a context length <math>k_n</math> that is implicit in the notation. We only require that <math>\lim_{n\to\infty}k_n=\infty</math> and that <math>k_n |\mathcal{Z}|^{2k_n}=o\left( \frac{n}{\log n} \right)</math>; for example, <math>k_n=\lceil c\log_{|\mathcal{Z}|} n\rceil</math> with <math>0<c<1/2</math> satisfies both conditions.

=== For a stationary source ===

Denote by <math>\mathcal{D}_n</math> the set of all <math>n</math>-block denoisers, namely all maps <math>\hat{X}^n:\mathcal{Z}^n\to\mathcal{X}^n</math>.

Let <math>\mathbf{X}</math> be an unknown stationary source and <math>\mathbf{Z}</math> be the distribution of the corresponding noisy sequence. Then

<math>
\begin{align}
\lim_{n\to\infty}\mathbf{E}\left[ L_{\hat{X}^n_{DUDE}}\left( X^n,Z^n \right) \right]=
\lim_{n\to\infty}\min_{\hat{X}^n\in\mathcal{D}_n}\mathbf{E} \left[L_{\hat{X}^n}\left( X^n,Z^n \right)\right]\,,
\end{align}
</math>

and both limits exist. If, in addition, the source <math>\mathbf{X}</math> is ergodic, then

<math>
\begin{align}
\limsup_{n\to\infty} L_{\hat{X}^n_{DUDE}}\left( X^n,Z^n \right) =
\lim_{n\to\infty}\min_{\hat{X}^n\in\mathcal{D}_n}\mathbf{E} \left[L_{\hat{X}^n}\left( X^n,Z^n \right)\right]\,,\,\text{ almost surely}\,.
\end{align}
</math>

=== For an individual sequence ===

Denote by <math>\mathcal{D}_{n,k}</math> the set of all <math>n</math>-block <math>k</math>-th order sliding window denoisers, namely all maps <math>\hat{X}^n:\mathcal{Z}^n\to\mathcal{X}^n</math> of the form <math>\hat{X}_i(z^n) = f\left( z_{i-k},\ldots,z_{i+k} \right)</math> with <math>f:\mathcal{Z}^{2k+1}\to\mathcal{X}</math> arbitrary.

Let <math>\mathbf{x}\in\mathcal{X}^\infty</math> be a fixed, unknown noiseless sequence and <math>\mathbf{Z}</math> the corresponding noisy sequence (whose randomness is due to the channel alone). Then

<math>
\begin{align}
\lim_{n\to\infty}
\left[
L_{\hat{X}^n_{DUDE}}\left( x^n,Z^n \right) -
\min_{\hat{X}^n\in\mathcal{D}_{n,k_n}} L_{\hat{X}^n}\left( x^n,Z^n \right)
\right] =0 \,,\,\text{ almost surely}\,.
\end{align}
</math>

=== Non-asymptotic performance ===

Let <math>\hat{X}^n_{k}</math> denote the DUDE with context length <math>k</math> defined on <math>n</math>-blocks. Then there exist explicit constants <math>A,C>0</math> and <math>B>1</math> that depend on <math>\left( \Pi,\Lambda \right)</math> alone, such that for any <math>n,k</math> and any <math>x^n\in\mathcal{X}^n</math> we have

<math>
\begin{align}
\frac{A}{\sqrt{n}}B^k\,\leq
\mathbf{E} \left[ L_{\hat{X}^n_{k}}\left( x^n,Z^n \right) -
\min_{\hat{X}^n\in\mathcal{D}_{n,k}} L_{\hat{X}^n}\left( x^n,Z^n \right)
\right] \leq \sqrt{k}\frac{C}{\sqrt{n}} |\mathcal{Z}|^{k} \,,
\end{align}
</math>

where <math>Z^n</math> is the noisy sequence corresponding to <math>x^n</math> (whose randomness is due to the channel alone).<ref name="lower">K. Viswanathan and E. Ordentlich. Lower limits of discrete universal denoising. IEEE Transactions on Information Theory, 55(3):1374–1386, 2009.</ref>

In fact, the lower bound holds with the same constants <math>A,B</math> as above for ''any'' <math>n</math>-block denoiser <math>\hat{X}^n\in\mathcal{D}_n</math>.<ref name="dude-orig" /> The proof of the lower bound requires that the channel matrix <math>\Pi</math> be square and that the pair <math>\left( \Pi,\Lambda \right)</math> satisfy a certain technical condition.

== Background ==

To motivate the particular definition of the DUDE using the Bayes response to a particular vector, we now find the optimal denoiser in the non-universal case, where the unknown sequence <math>x^n</math> is a realization of a random vector <math>X^n</math> whose distribution is known.

Consider first the case <math>n=1</math>. Since the joint distribution of <math>(X,Z)</math> is known, given the observed noisy symbol <math>z</math>, the unknown symbol <math>X\in\mathcal{X}</math> is distributed according to the known distribution <math>\mathbb{P}(X=x|Z=z)</math>. By ordering the elements of <math>\mathcal{X}</math>, we can describe this conditional distribution on <math>\mathcal{X}</math> using a probability vector <math>\mathbf{P}_{X|z}</math>, indexed by <math>\mathcal{X}</math>, whose <math>x</math>-entry is <math>\mathbb{P}\left( X=x|Z=z \right)</math>. Clearly the expected loss for the choice of estimated symbol <math>\hat{x}</math> is <math>\lambda_{\hat{x}}^\top \mathbf{P}_{X|z}</math>.

Define the ''Bayes envelope'' of a probability vector <math>\mathbf{v}</math>, describing a probability distribution on <math>\mathcal{X}</math>, as the minimal expected loss <math>U(\mathbf{v}) = \min_{\hat{x}\in\mathcal{X}}\mathbf{v}^\top \lambda_{\hat{x}}</math>, and the ''Bayes response'' to <math>\mathbf{v}</math> as the prediction that achieves this minimum, <math>\hat{X}_{Bayes}(\mathbf{v}) = \text{argmin}_{\hat{x}\in\mathcal{X}}\mathbf{v}^\top \lambda_{\hat{x}}</math>. Observe that the Bayes response is scale invariant in the sense that <math>\hat{X}_{Bayes}(\mathbf{v}) = \hat{X}_{Bayes}(\alpha\mathbf{v})</math> for <math>\alpha>0</math>.

For the case <math>n=1</math>, then, the optimal denoiser is <math>\hat{X}(z)=\hat{X}_{Bayes}\left( \mathbf{P}_{X|z} \right)</math>. This optimal denoiser can be expressed using the marginal distribution of <math>Z</math> alone, as follows. When the channel matrix <math>\Pi</math> is invertible, the relation <math>\mathbf{P}_Z=\Pi^\top\mathbf{P}_X</math> gives <math>\mathbf{P}_X=\Pi^{-\top}\mathbf{P}_Z</math>, and Bayes' rule then yields <math>\mathbf{P}_{X|z} \propto \mathbf{P}_X\odot\pi_z=\Pi^{-\top}\mathbf{P}_Z\odot \pi_z</math>, where <math>\pi_z</math> is the <math>z</math>-th column of <math>\Pi</math>. By the scale invariance of the Bayes response, this implies that the optimal denoiser is given equivalently by <math>\hat{X}(z)=\hat{X}_{Bayes}\left( \Pi^{-\top}\mathbf{P}_Z\odot\pi_z \right)</math>. When <math>|\mathcal{X}|\leq|\mathcal{Z}|</math> and <math>\Pi</math> is not invertible, under the reasonable assumption that it has full row rank, we can replace <math>\Pi^{-\top}</math> with the Moore–Penrose pseudo-inverse of <math>\Pi^\top</math> and obtain

<math>
\hat{X}(z)=\hat{X}_{Bayes}\left( (\Pi\Pi^\top)^{-1}\Pi\mathbf{P}_Z\odot\pi_z \right)\,.
</math>
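
As a concrete illustration (a standard special case, with the algebra filled in here rather than quoted from a source), consider a binary symmetric channel with crossover probability <math>\delta<1/2</math> and the Hamming loss. Carrying out the computation above with <math>\Pi=\begin{pmatrix} 1-\delta & \delta \\ \delta & 1-\delta \end{pmatrix}</math> shows that the optimal single-symbol denoiser is a threshold rule: for <math>z=1</math>,

<math>
\hat{X}(1)=1 \quad\text{if and only if}\quad
\left( (1-\delta)^2+\delta^2 \right)\mathbf{P}_Z(1) \geq 2\delta(1-\delta)\,\mathbf{P}_Z(0)\,,
</math>

and symmetrically for <math>z=0</math>; otherwise the observed symbol is flipped. The DUDE performs the same test with <math>\mathbf{P}_Z</math> replaced by the empirical distribution of symbols within each two-sided context.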

Turning now to arbitrary <math>n</math>, the optimal denoiser <math>\hat{X}^{opt}(z^n)</math> (with minimal expected loss) is therefore given by the Bayes response to <math>\mathbf{P}_{X_i|z^n}</math>,

<math>
\begin{align}
\hat{X}^{opt}_i(z^n) = \hat{X}_{Bayes}\left( \mathbf{P}_{X_i|z^n} \right) =
\text{argmin}_{\hat{x}\in\mathcal{X}}\lambda_{\hat{x}}^\top \mathbf{P}_{X_i|z^n}\,,
\end{align}
</math>

where <math>\mathbf{P}_{X_i|z^n}</math> is a vector indexed by <math>\mathcal{X}</math>, whose <math>x</math>-entry is <math>\mathbb{P}\left( X_i=x | Z^n=z^n \right)</math>. The conditional probability vector <math>\mathbf{P}_{X_i|z^n}</math> is hard to compute. A derivation analogous to the case <math>n=1</math> above shows that the optimal denoiser admits an alternative representation, namely <math>\hat{X}^{opt}_i(z^n)=\hat{X}_{Bayes}\left( \Pi^{-\top}\mathbf{P}_{Z_i,z^{n\backslash i}}\odot\pi_{z_i} \right)</math>, where <math>z^{n \backslash i}=\left( z_1,\ldots,z_{i-1},z_{i+1},\ldots,z_n \right)\in \mathcal{Z}^{n-1}</math> is a given vector and <math>\mathbf{P}_{Z_i,z^{n\backslash i}}</math> is the probability vector indexed by <math>\mathcal{Z}</math> whose <math>z</math>-entry is <math>\mathbb{P}\left( (Z_1,\ldots,Z_n) = (z_1,\ldots,z_{i-1},z,z_{i+1},\ldots,z_n) \right)\,.</math> Again, <math>\Pi^{-\top}</math> is replaced by a pseudo-inverse if <math>\Pi</math> is not square or not invertible.

When the distribution of <math>X^n</math> (and therefore, of <math>Z^n</math>) is not available, the DUDE replaces the unknown vector <math>\mathbf{P}_{Z_i,z^{n\backslash i}}</math> with an empirical estimate obtained along the noisy sequence <math>z^n</math> itself, namely with <math>\mu\left( z^n, l^k(z^n,i),r^k(z^n,i) \right)</math>. This leads to the above definition of the DUDE.

While the convergence arguments behind the optimality properties above are more subtle, we note that the above, combined with the [[Ergodic theory#Ergodic theorems|Birkhoff ergodic theorem]], is enough to prove that for a stationary ergodic source, the DUDE with context length <math>k</math> is asymptotically optimal among all <math>k</math>-th order sliding window denoisers.

== Extensions ==

The basic DUDE as described here assumes a signal with a one-dimensional index set over a finite alphabet, a known memoryless channel, and a context length that is fixed in advance. Relaxations of each of these assumptions have been considered in turn.<ref>{{cite journal |author=Ordentlich, E. |title=Reflections on the DUDE |url=http://www.stanford.edu/~tsachy/pdf_files/Reflections%20on%20the%20DUDE.pdf}}</ref> Specifically:
* Infinite alphabets<ref>A. Dembo and T. Weissman. Universal denoising for the finite-input-general-output channel. IEEE Trans. Inform. Theory, 51(4):1507–1517, April 2005.</ref><ref>K. Sivaramakrishnan and T. Weissman. Universal denoising of discrete-time continuous amplitude signals. In Proc. of the 2006 IEEE Intl. Symp. on Inform. Theory (ISIT'06), Seattle, WA, USA, July 2006.</ref><ref name="cont-alphabet1">G. Motta, E. Ordentlich, I. Ramírez, G. Seroussi, and M. Weinberger. "The DUDE framework for continuous tone image denoising." IEEE Transactions on Image Processing, 20(1), January 2011.</ref><ref name="cont-alphabet2">K. Sivaramakrishnan and T. Weissman. Universal denoising of continuous amplitude signals with applications to images. In Proc. of IEEE International Conference on Image Processing, Atlanta, GA, USA, October 2006, pp. 2609–2612.</ref>
* Channels with memory<ref>C. D. Giurcaneanu and B. Yu. Efficient algorithms for discrete universal denoising for channels with memory. In Proc. of the 2005 IEEE Intl. Symp. on Inform. Theory (ISIT'05), Adelaide, Australia, Sept. 2005.</ref><ref>R. Zhang and T. Weissman. Discrete denoising for channels with memory. Communications in Information and Systems (CIS), 5(2):257–288, 2005.</ref>
* Unknown channel matrix<ref>G. M. Gemelos, S. Sigurjonsson, T. Weissman. Universal minimax discrete denoising under channel uncertainty. IEEE Trans. Inform. Theory, 52:3476–3497, 2006.</ref><ref>G. M. Gemelos, S. Sigurjonsson and T. Weissman. Algorithms for discrete denoising under channel uncertainty. IEEE Trans. Signal Processing, 54(6):2263–2276, June 2006.</ref>
* Variable context and adaptive choice of context length<ref>E. Ordentlich, M.J. Weinberger, and T. Weissman. Multi-directional context sets with applications to universal denoising and compression. In Proc. of the 2005 IEEE Intl. Symp. on Inform. Theory (ISIT'05), Adelaide, Australia, Sept. 2005.</ref><ref>J. Yu and S. Verdú. Schemes for bidirectional modeling of discrete stationary sources. IEEE Trans. Inform. Theory, 52(11):4789–4807, 2006.</ref><ref>S. Chen, S. N. Diggavi, S. Dusad and S. Muthukrishnan. Efficient string matching algorithms for combinatorial universal denoising. In Proc. of IEEE Data Compression Conference (DCC), Snowbird, Utah, March 2005.</ref><ref>G. Gimel'farb. Adaptive context for a discrete universal denoiser. In Proc. Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshops, SSPR 2004 and SPR 2004, Lisbon, Portugal, August 18–20, pp. 477–485.</ref>
* Two-dimensional signals<ref name="2d-dude">E. Ordentlich, G. Seroussi, S. Verdú, M.J. Weinberger, and T. Weissman. A universal discrete image denoiser and its application to binary images. In Proc. IEEE International Conference on Image Processing, Barcelona, Catalonia, Spain, September 2003.</ref>

== Applications ==

=== Application to image denoising ===

A DUDE-based framework for grayscale [[image denoising]]<ref name="cont-alphabet1" /> achieves state-of-the-art denoising for impulse-type noise channels (e.g., "salt and pepper" or "M-ary symmetric" noise), and good performance on the Gaussian channel (comparable to the [[Non-local means]] image denoising scheme on this channel). A different DUDE variant applicable to grayscale images has also been presented.<ref name="cont-alphabet2" />

=== Application to channel decoding of uncompressed sources ===

The DUDE has led to universal algorithms for channel decoding of uncompressed sources.<ref name="uncompressed-sources">E. Ordentlich, G. Seroussi, S. Verdú, and K. Viswanathan. "Universal Algorithms for Channel Decoding of Uncompressed Sources." IEEE Trans. Information Theory, vol. 54, no. 5, pp. 2243–2262, May 2008.</ref>

== References ==
{{reflist}}

[[Category:Noise reduction]]