Manifold alignment: Difference between revisions
en>Phil Boswell m convert dodgy URL to ID using AWB |
en>Randykitty m fix |
||
Line 1: | Line 1: | ||
'''Diffusion maps''' is a [[machine learning]] algorithm introduced by R. R. Coifman and S. Lafon.<ref name="DifussionMap" /><ref name="Diffusion" /> It computes a family of embeddings of a data set into Euclidean space (often low-dimensional) whose coordinates can be computed from the eigenvectors and eigenvalues of a diffusion operator on the data. The Euclidean distance between points in the embedded space is equal to the "diffusion distance" between probability distributions centered at those points. Different from other dimensionality reduction methods such as [[principal component analysis]] (PCA) and [[multi-dimensional scaling]] (MDS), diffusion maps is a non-linear method that focuses on discovering the underlying [[manifold]] that the data has been sampled from. By integrating local similarities at different scales, diffusion maps gives a global description of the data-set. Compared with other methods, the diffusion maps algorithm is robust to noise perturbation and is computationally inexpensive. | |||
==Definition of diffusion maps== | |||
Following,<ref name="Introduction" /> diffusion maps can be defined in four steps. | |||
===Connectivity=== | |||
Diffusion maps leverages the relationship between [[heat diffusion]] and random walk [[Markov chain]]. The basic observation is that if we take a random walk on the data, walking to a nearby data-point is more likely than walking to another that is far away. Based on this, the connectivity between two data points, <math>x</math> and <math>y</math>, can be defined as the probability of walking from <math>x</math> to <math>y</math> in one step of the random walk. Usually, this probability will be defined as the kernel function on the two points, for example, the popular Gaussian kernel: | |||
: <math> | |||
p(x,y)=k(x,y)=e^{-\frac{||x-y||^2}{\alpha}} | |||
</math> | |||
More generally, the [[Integral kernel|kernel]] function has the following properties | |||
: <math>k(x,y) = k(y,x)</math> | |||
(<math>k</math> is symmetric) | |||
: <math> k(x,y) \geq 0 \,\,\forall x,y</math> | |||
(<math>k</math> is positivity preserving). | |||
The kernel constitutes the prior definition of the local geometry of data-set. Since a given kernel will capture a specific feature of the data set, its choice should be guided by the application that one has in mind. | |||
===Diffusion process=== | |||
With the definition of connectivity between two points, we can define the diffusion matrix <math>L</math> (it is also a version of graph Laplacian matrix) | |||
: <math> | |||
L_{i,j}=k(x_i,x_j) \, | |||
</math> | |||
Many works use the normalized matrix as | |||
: <math> | |||
M=D^{-1}L. \, | |||
</math> | |||
We can define the probabilities for taken from <math>i</math> to <math>j</math> in <math>t</math> steps as: | |||
: <math> | |||
p(x_j,t|x_i)=M^t_{ij} \, | |||
</math> | |||
As we calculate the probabilities<math>M^{t}</math> for increasing values of <math>t</math>, we observe the data-set at different scales. This is the diffusion process. Diffusion process reveals | |||
the global geometric structure of a data set as paths along the true geometric structure of the data set have high probability. | |||
===Diffusion distance=== | |||
The diffusion distance at time <math>t</math> between two points can be measured as the similarity of two points in the observation space with the connectivity between them. It is given by | |||
: <math> | |||
D_{t}(x_i,x_j)^2 =\sum_y (p(y,t|x_i)-p(y,t|x_j))^2 w(y) | |||
</math> | |||
here <math>w(y)</math> is a weighted function which can often treated as <math>w(y)=1</math> | |||
===Diffusion process=== | |||
Now, we can give the definition of diffusion map as: | |||
: <math> | |||
\Psi_t(x)=(\lambda_1^t\psi_1(x),\lambda_2^t\psi_2(x),\ldots,\lambda_n^t\psi_n(x)) | |||
</math> | |||
in,<ref name="Nadler05diffusionmaps" /> it is proved that | |||
: <math> | |||
D_t(x_i,x_j)^2=||\Psi_t(x_i)-\Psi_t(x_j)||^2 \, | |||
</math> | |||
if we take the first ''k'' eigenvectors and eigenvalues, we get the diffusion map from the original data to a ''k''-dimensional space which is embedded in the original space. | |||
==Algorithm== | |||
The basic algorithm framework of diffusion map is as: | |||
Step 1. Given the similarity matrix ''L'' | |||
Step 2. Form the normalized matrix <math>M = D^{-1}L</math> | |||
Step 3. Compute the ''k'' largest eigenvalues of <math>M^t</math> and the corresponding eigenvectors | |||
Step 4. Use diffusion map to get the embedding matrix ''Y'' | |||
==Application== | |||
In the paper,<ref name="Nadler05diffusionmaps" /> they showed how to design a kernel that reproduces the diffusion induced by a Fokker-Planck equation. Also, they explained that when the data approximate a manifold, then one can recover the geometry of this manifold by computing an approximation of the Laplace-Beltrami operator. This computation is completely insensitive | |||
to the distribution of the points and therefore provides a separation of the statistics and the geometry of the | |||
data. Since Diffusion map gives a global description of the data-set, it can measure the distances between pair of sample points in the manifold the data is embedded. Based on diffusion map, there are many applications, such as spectral clustering, low dimensional representation of images, image segmentation,<ref name="Farbman" /> 3D model segmentation,<ref name="sidi11" /> speaker identification,<ref name="speakerid2011" /> sampling on manifolds and so on. | |||
==See also== | |||
*[[Nonlinear dimensionality reduction]] | |||
==References== | |||
{{Reflist| | |||
refs= | |||
<ref name="Nadler05diffusionmaps">{{cite journal | |||
|last = Nadler | |||
|first = Boaz | |||
|coauthors = Stéphane Lafon ; Ronald R. Coifman ; Ioannis G. Kevrekidis | |||
|year = 2005 | |||
|title = Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker–Planck Operators | |||
|url = http://www.wisdom.weizmann.ac.il/~nadler/Publications/dm_nips05.pdf | |||
|journal = in Advances in Neural Information Processing Systems 18 | |||
}}</ref> | |||
<ref name="Farbman">{{cite journal | |||
|last = Zeev | |||
|first = Farbman | |||
|coauthors = Fattal Raanan ;Lischinski Dani | |||
|year = 2010 | |||
|title = Diffusion maps for edge-aware image editing | |||
|url = http://doi.acm.org/10.1145/1882261.1866171 | |||
|journal = ACM Trans. Graph | |||
|volume = 29 | |||
|pages = 145:1–145:10 | |||
}}</ref> | |||
<ref name="DifussionMap">{{cite journal | |||
|last = Coifman | |||
|first = R.R. | |||
|coauthors = S. Lafon. | |||
|year = 2006 | |||
|title = Diffusion maps | |||
|url = http://www.pnas.org/content/102/21/7426.long | |||
|journal = Applied and Computational Harmonic Analysis | |||
}}</ref> | |||
<ref name="Diffusion">{{cite journal | |||
|last = Lafon | |||
|first = S.S. | |||
|year = 2004 | |||
|title = Diffusion Maps and Geometric Harmonics | |||
|journal = PhD thesis, Yale University | |||
}}</ref> | |||
<!-- unused | |||
<ref name="Macrostate">{{cite journal | |||
|last = Korenblum | |||
|first = D. | |||
|coauthors = Shalloway D. | |||
|year = 2003 | |||
|title = Macrostate Data Clustering | |||
|url = http://pre.aps.org/abstract/PRE/v67/i5/e056704 | |||
|journal = Physical Review E | |||
|volume = 67 | |||
|page = 056704 | |||
}}</ref>--> | |||
<ref name="Introduction">{{cite journal | |||
|last = J De | |||
|first = Porte | |||
|coauthors = B M; Hereman, W; Walt, S J Van Der | |||
|year = 2008 | |||
|title = An Introduction to Diffusion Maps | |||
|journal =Techniques | |||
}}</ref> | |||
<ref name="sidi11">{{cite conference | |||
|last = Oana | |||
|first = Sidi | |||
|coauthors = Oliver van Kaick ; Yanir Kleiman ; Hao Zhang ; Daniel Cohen-Or | |||
|year = 2011 | |||
|title = Unsupervised Co-Segmentation of a Set of Shapes via Descriptor-Space Spectral Clustering | |||
|journal =ACM Transactions on Graphics | |||
}}</ref> | |||
<ref name="speakerid2011">{{cite journal | |||
|last = Michalevsky | |||
|first = Yan | |||
|coauthors = Ronen Talmon ; Israel Cohen | |||
|year = 2011 | |||
|title = Speaker Identification Using Diffusion Maps | |||
|conference = European signal processing conference 2011 | |||
|url = http://www.eurasip.org/Proceedings/Eusipco/Eusipco2011/papers/1569414913.pdf | |||
}}</ref> | |||
}} | |||
[[Category:Machine learning algorithms]] |
Revision as of 14:38, 17 October 2013
Diffusion maps is a machine learning algorithm introduced by R. R. Coifman and S. Lafon.[1][2] It computes a family of embeddings of a data set into Euclidean space (often low-dimensional) whose coordinates can be computed from the eigenvectors and eigenvalues of a diffusion operator on the data. The Euclidean distance between points in the embedded space is equal to the "diffusion distance" between probability distributions centered at those points. Different from other dimensionality reduction methods such as principal component analysis (PCA) and multi-dimensional scaling (MDS), diffusion maps is a non-linear method that focuses on discovering the underlying manifold that the data has been sampled from. By integrating local similarities at different scales, diffusion maps gives a global description of the data-set. Compared with other methods, the diffusion maps algorithm is robust to noise perturbation and is computationally inexpensive.
Definition of diffusion maps
Following,[3] diffusion maps can be defined in four steps.
Connectivity
Diffusion maps leverages the relationship between heat diffusion and random walk Markov chain. The basic observation is that if we take a random walk on the data, walking to a nearby data-point is more likely than walking to another that is far away. Based on this, the connectivity between two data points, and , can be defined as the probability of walking from to in one step of the random walk. Usually, this probability will be defined as the kernel function on the two points, for example, the popular Gaussian kernel:
More generally, the kernel function has the following properties
The kernel constitutes the prior definition of the local geometry of data-set. Since a given kernel will capture a specific feature of the data set, its choice should be guided by the application that one has in mind.
Diffusion process
With the definition of connectivity between two points, we can define the diffusion matrix (it is also a version of graph Laplacian matrix)
Many works use the normalized matrix as
We can define the probabilities for taken from to in steps as:
As we calculate the probabilities for increasing values of , we observe the data-set at different scales. This is the diffusion process. Diffusion process reveals the global geometric structure of a data set as paths along the true geometric structure of the data set have high probability.
Diffusion distance
The diffusion distance at time between two points can be measured as the similarity of two points in the observation space with the connectivity between them. It is given by
here is a weighted function which can often treated as
Diffusion process
Now, we can give the definition of diffusion map as:
in,[4] it is proved that
if we take the first k eigenvectors and eigenvalues, we get the diffusion map from the original data to a k-dimensional space which is embedded in the original space.
Algorithm
The basic algorithm framework of diffusion map is as:
Step 1. Given the similarity matrix L
Step 2. Form the normalized matrix
Step 3. Compute the k largest eigenvalues of and the corresponding eigenvectors
Step 4. Use diffusion map to get the embedding matrix Y
Application
In the paper,[4] they showed how to design a kernel that reproduces the diffusion induced by a Fokker-Planck equation. Also, they explained that when the data approximate a manifold, then one can recover the geometry of this manifold by computing an approximation of the Laplace-Beltrami operator. This computation is completely insensitive to the distribution of the points and therefore provides a separation of the statistics and the geometry of the data. Since Diffusion map gives a global description of the data-set, it can measure the distances between pair of sample points in the manifold the data is embedded. Based on diffusion map, there are many applications, such as spectral clustering, low dimensional representation of images, image segmentation,[5] 3D model segmentation,[6] speaker identification,[7] sampling on manifolds and so on.
See also
References
43 year old Petroleum Engineer Harry from Deep River, usually spends time with hobbies and interests like renting movies, property developers in singapore new condominium and vehicle racing. Constantly enjoys going to destinations like Camino Real de Tierra Adentro.
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedDifussionMap
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedDiffusion
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedIntroduction
- ↑ 4.0 4.1 Cite error: Invalid
<ref>
tag; no text was provided for refs namedNadler05diffusionmaps
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedFarbman
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedsidi11
- ↑ Cite error: Invalid
<ref>
tag; no text was provided for refs namedspeakerid2011