Lagrange, Euler and Kovalevskaya tops: Difference between revisions
en>Pierscoleman m Linked to Kovalevskaya Top wiki entry |
en>BG19bot m WP:CHECKWIKI error fix for #61. Punctuation goes before References. Do general fixes if a problem exists. - using AWB (8855) |
{{Redirect|Elastic net|the statistical regularization technique|Elastic net regularization}}
[[File:Elmap breastcancer wiki.png|thumb|300px| Linear PCA versus nonlinear Principal Manifolds<ref name=Handbook>A. N. Gorban, A. Y. Zinovyev, [http://arxiv.org/abs/0809.0490 Principal Graphs and Manifolds], In: Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques, Olivas E.S. et al Eds. Information Science Reference, IGI Global: Hershey, PA, USA, 2009. 28–59.</ref> for [[Scientific visualization|visualization]] of [[breast cancer]] [[microarray]] data: a) Configuration of nodes and 2D Principal Surface in the 3D PCA linear manifold. The dataset is curved and cannot be mapped adequately on a 2D principal plane; b) The distribution in the internal 2D non-linear principal surface coordinates (ELMap2D) together with an estimation of the density of points; c) The same as b), but for the linear 2D PCA manifold (PCA2D). The “basal” breast cancer subtype is visualized more adequately with ELMap2D and some features of the distribution become better resolved in comparison to PCA2D. Principal manifolds are produced by the '''elastic maps''' algorithm. Data are available for public competition.<ref>Wang, Y., Klijn, J.G., Zhang, Y., Sieuwerts, A.M., Look, M.P., Yang, F., Talantov, D., Timmermans, M., Meijer-van Gelder, M.E., Yu, J. et al.: Gene expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365, 671–679 (2005); [http://www.ihes.fr/~zinovyev/princmanif2006/ Data online]</ref> Software is available for free non-commercial use.<ref>A. Zinovyev, [http://bioinfo-out.curie.fr/projects/vidaexpert/ ViDaExpert] - Multidimensional Data Visualization Tool (free for non-commercial use). [[Curie Institute (Paris)|Institut Curie]], Paris.</ref><ref>A. Zinovyev, [http://www.ihes.fr/~zinovyev/vida/ViDaExpert/ViDaOverView.pdf ViDaExpert overview], [http://www.ihes.fr IHES] ([[Institut des Hautes Études Scientifiques]]), Bures-Sur-Yvette, Île-de-France.</ref>]]
'''Elastic maps''' provide a tool for [[nonlinear dimensionality reduction]]. By construction, they are a system of elastic [[Spring (device)|springs]] embedded in the data space.<ref name=Handbook/> This system approximates a low-dimensional manifold. The elastic coefficients of this system allow a switch from completely unstructured [[k-means]] [[Cluster analysis|clustering]] (zero elasticity) to estimators located close to linear [[Principal component analysis|PCA manifolds]] (for high bending and low stretching moduli). With intermediate values of the [[elasticity coefficient]]s, this system effectively approximates non-linear principal manifolds. This approach is based on a [[Mechanics|mechanical]] analogy between principal manifolds, which pass through "the middle" of the data distribution, and elastic membranes and plates. The method was developed by A.N. Gorban, A.Y. Zinovyev and A.A. Pitenko in 1996–1998.
== Energy of elastic map ==
Let the data set be a set of vectors <math>S</math> in a finite-dimensional [[Euclidean space]]. The ''elastic map'' is represented by a set of nodes <math>W_j</math> in the same space. Each data point <math>s \in S</math> has a ''host node'', namely the closest node <math>W_j</math> (if there are several closest nodes, the node with the smallest number is taken). The data set <math>S</math> is divided into classes <math>K_j=\{s \ | \ W_j \mbox{ is a host of } s\}</math>.
The ''approximation energy'' <math>D</math> is the distortion
: <math>D=\frac{1}{2}\sum_{j=1}^k \sum_{s \in K_j}\|s-W_j\|^2</math>,
which is the energy of springs with unit elasticity that connect each data point with its host node. It is possible to apply weighting factors to the terms of this sum, for example to reflect the [[standard deviation]] of the [[probability density function]] of any subset of data points <math>\{s_i\}</math>.
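
The host-node assignment and the approximation energy can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names <code>assign_hosts</code> and <code>approximation_energy</code> and the toy data are assumptions made for the example, and no weighting factors are applied.

```python
import numpy as np

def assign_hosts(S, W):
    """Return, for each data point, the index of its closest node."""
    # Squared distances between every data point (rows) and every node (columns).
    d2 = ((S[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    # argmin returns the first minimum, i.e. the smallest node number on ties.
    return d2.argmin(axis=1)

def approximation_energy(S, W, hosts):
    """D = 1/2 * sum over data points of ||s - W_host(s)||^2 (unit elasticity)."""
    return 0.5 * ((S - W[hosts]) ** 2).sum()

# Toy data (hypothetical): four points on a line and two nodes.
S = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [2.0, 0.0]])
W = np.array([[0.0, 0.0], [4.0, 0.0]])

hosts = assign_hosts(S, W)               # the last point is equidistant from both nodes
D = approximation_energy(S, W, hosts)
print(hosts, D)
```

The classes <math>K_j</math> are then simply the sets of points whose entry in <code>hosts</code> equals <math>j</math>.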
An additional structure is defined on the set of nodes. Some pairs of nodes, <math>(W_i,W_j)</math>, are connected by ''elastic edges''. Call this set of pairs <math>E</math>. Some triplets of nodes, <math>(W_i,W_j,W_l)</math>, form ''bending ribs''. Call this set of triplets <math>G</math>.
: The stretching energy is <math>U_{E}=\frac{1}{2}\lambda \sum_{(W_i,W_j) \in E} \|W_i -W_j\|^2 </math>,
: The bending energy is <math>U_G=\frac{1}{2}\mu \sum_{(W_i,W_j,W_l) \in G} \|W_i -2W_j+W_l\|^2 </math>,
where <math>\lambda</math> and <math>\mu</math> are the stretching and bending moduli respectively. The stretching energy is sometimes referred to as the "membrane" term, while the bending energy is referred to as the "thin plate" term.<ref>Michael Kass, Andrew Witkin, Demetri Terzopoulos, Snakes: Active contour models, Int.J. Computer Vision, 1988 vol 1-4 pp.321-331</ref>
For example, on a 2D rectangular grid the elastic edges are just the vertical and horizontal edges (pairs of closest vertices) and the bending ribs are the vertical or horizontal triplets of consecutive (closest) vertices.
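
The rectangular-grid example can be made concrete as follows. The sketch assumes that nodes are numbered row by row; this numbering and the helper name <code>grid_structure</code> are illustrative choices, not part of the method.

```python
def grid_structure(rows, cols):
    """Build the elastic edges E and bending ribs G of a rows x cols grid.

    Nodes are numbered row by row (an assumption made for this sketch).
    """
    def idx(r, c):
        return r * cols + c

    edges, ribs = [], []
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:   # horizontal edge to the right neighbour
                edges.append((idx(r, c), idx(r, c + 1)))
            if r + 1 < rows:   # vertical edge to the neighbour below
                edges.append((idx(r, c), idx(r + 1, c)))
            if c + 2 < cols:   # horizontal rib of three consecutive nodes
                ribs.append((idx(r, c), idx(r, c + 1), idx(r, c + 2)))
            if r + 2 < rows:   # vertical rib of three consecutive nodes
                ribs.append((idx(r, c), idx(r + 1, c), idx(r + 2, c)))
    return edges, ribs

edges, ribs = grid_structure(3, 3)
print(len(edges), len(ribs))
```

In each rib the middle index is the central node, matching the role of <math>W_j</math> in the bending energy.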
: The total energy of the elastic map is thus <math>U=D+U_E+U_G.</math>
The positions of the nodes <math>\{W_j\}</math> are determined by the mechanical equilibrium of the elastic map, i.e. the nodes are located so as to minimize the total energy <math>U</math>.
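
As a small worked example, the three energy terms can be evaluated directly from their definitions. The function name <code>elastic_energy</code>, the three-node chain and the moduli values below are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def elastic_energy(S, W, edges, ribs, lam, mu):
    """Return (D, U_E, U_G, U) for data S, nodes W, edges E and ribs G."""
    # Approximation energy: each point is attached to its closest (host) node.
    d2 = ((S[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    D = 0.5 * d2.min(axis=1).sum()
    # Stretching ("membrane") energy over the elastic edges.
    U_E = 0.5 * lam * sum(((W[i] - W[j]) ** 2).sum() for i, j in edges)
    # Bending ("thin plate") energy over the ribs; j is the middle node.
    U_G = 0.5 * mu * sum(((W[i] - 2 * W[j] + W[l]) ** 2).sum() for i, j, l in ribs)
    return D, U_E, U_G, D + U_E + U_G

# A bent chain of three nodes and two data points (toy values).
W = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
S = np.array([[0.0, 1.0], [2.0, 1.0]])
edges = [(0, 1), (1, 2)]
ribs = [(0, 1, 2)]

D, U_E, U_G, U = elastic_energy(S, W, edges, ribs, lam=0.5, mu=0.25)
print(D, U_E, U_G, U)
```

Here each data point lies at squared distance 1 from its host node, both edges have squared length 2, and the rib gives <math>\|W_0-2W_1+W_2\|^2=4</math>, so <math>D=1</math>, <math>U_E=1</math>, <math>U_G=0.5</math> and <math>U=2.5</math>.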
== Expectation-maximization algorithm ==
For a given splitting of the dataset <math>S</math> into classes <math>K_j</math>, minimization of the quadratic functional <math>U</math> is a linear problem with a sparse matrix of coefficients. Therefore, similarly to PCA or ''k''-means, a splitting method is used:
* For given <math>\{W_j\}</math> find <math>\{K_j\}</math>;
* For given <math>\{K_j\}</math> minimize <math>U</math> and find <math>\{W_j\}</math>;
* If no change, terminate.
This [[expectation-maximization algorithm]] guarantees a local minimum of <math>U</math>. Various additional methods have been proposed to improve the approximation. For example, the ''softening'' strategy starts with a rigid grid (small length, small bending, and large elasticity moduli <math>\lambda </math> and <math>\mu </math>) and finishes with a soft grid (small <math>\lambda </math> and <math>\mu </math>). Training proceeds in several epochs, each with its own grid rigidity. Another adaptive strategy is the ''growing net'': one starts from a small number of nodes and gradually adds new nodes. Each epoch has its own number of nodes.
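
The splitting method and the softening strategy can be sketched as below. For fixed classes, setting the gradient of <math>U</math> to zero gives the linear system <math>(N + \lambda \sum_{E} b b^T + \mu \sum_{G} c c^T)\,W = R</math>, where <math>N</math> is the diagonal matrix of class sizes, row <math>j</math> of <math>R</math> is the sum of the points in <math>K_j</math>, <math>b</math> has entries <math>(1,-1)</math> on an edge and <math>c</math> has entries <math>(1,-2,1)</math> on a rib. The code is a sketch under stated assumptions: it uses a dense solver for brevity (a practical implementation would exploit the sparsity), and the name <code>fit_elastic_map</code>, the synthetic data and the softening schedule are all hypothetical.

```python
import numpy as np

def fit_elastic_map(S, W, edges, ribs, lam, mu, max_iter=100):
    """Alternate host assignment and node update until the classes stop changing."""
    k = len(W)
    # Elastic part of the system matrix; it depends only on the grid structure.
    A_el = np.zeros((k, k))
    for i, j in edges:
        b = np.zeros(k)
        b[i], b[j] = 1.0, -1.0
        A_el += lam * np.outer(b, b)
    for i, j, l in ribs:
        c = np.zeros(k)
        c[i], c[j], c[l] = 1.0, -2.0, 1.0
        A_el += mu * np.outer(c, c)

    hosts = None
    for _ in range(max_iter):
        # Step 1: for given nodes, find the classes (closest node per point).
        d2 = ((S[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        new_hosts = d2.argmin(axis=1)
        if hosts is not None and np.array_equal(new_hosts, hosts):
            break  # Step 3: no change, terminate.
        hosts = new_hosts
        # Step 2: for given classes, minimize U by solving the linear system.
        counts = np.bincount(hosts, minlength=k).astype(float)
        rhs = np.zeros_like(W)
        np.add.at(rhs, hosts, S)  # row j accumulates the points of class K_j
        W = np.linalg.solve(np.diag(counts) + A_el, rhs)
    return W, hosts

# Synthetic noisy parabola approximated by a chain of 10 nodes (toy example).
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 200)
S = np.column_stack([t, t ** 2]) + 0.05 * rng.normal(size=(200, 2))

k = 10
W = np.column_stack([np.linspace(-1.0, 1.0, k), np.zeros(k)])
edges = [(i, i + 1) for i in range(k - 1)]
ribs = [(i, i + 1, i + 2) for i in range(k - 2)]

# Softening: start rigid (large moduli) and finish soft (small moduli).
for lam, mu in [(10.0, 10.0), (1.0, 1.0), (0.1, 0.1)]:
    W, hosts = fit_elastic_map(S, W, edges, ribs, lam, mu)
```

With <math>\lambda > 0</math>, a connected set of edges and at least one data point, the system matrix is positive definite, so the solve step is well defined even when some nodes host no data points.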
== Applications ==
[[File:SlideQualityLife.png|thumb|300px| Application of principal curves built by the elastic maps method: Nonlinear quality of life index.<ref>A. N. Gorban, A. Zinovyev, [http://arxiv.org/abs/1001.1122 Principal manifolds and graphs in practice: from molecular biology to dynamical systems], [[International Journal of Neural Systems]], Vol. 20, No. 3 (2010) 219–232.</ref> Points represent data of the [[United Nations|UN]] 171 countries in 4-dimensional space formed by the values of 4 indicators: [[Gross domestic product|gross product per capita]], [[life expectancy]], [[infant mortality]], [[tuberculosis]] incidence. Different forms and colors correspond to various geographical locations and years. Red bold line represents the '''principal curve''', approximating the dataset.]]
The most important applications are in bioinformatics,<ref>A.N. Gorban, B. Kegl, D. Wunsch, A. Zinovyev (Eds.), [http://pca.narod.ru/contentsgkwz.htm Principal Manifolds for Data Visualisation and Dimension Reduction], LNCSE 58, Springer: Berlin – Heidelberg – New York, 2007. ISBN 978-3-540-73749-0</ref><ref>M. Chacón, M. Lévano, H. Allende, H. Nowak, [http://pca.narod.ru/ElNetChakon.pdf Detection of Gene Expressions in Microarrays by Applying Iteratively Elastic Neural Net], In: B. Beliczynski et al. (Eds.), Lecture Notes in Computer Sciences, Vol. 4432, Springer: Berlin – Heidelberg 2007, 355–363.</ref> for exploratory data analysis and visualisation of multidimensional data, for data visualisation in economics, social and political sciences,<ref>A. Zinovyev, [http://arxiv.org/abs/1008.1188 Data visualization in political and social sciences], In: SAGE "International Encyclopedia of Political Science", Badie, B., Berg-Schlosser, D., Morlino, L. A. (Eds.), 2011.</ref> as an auxiliary tool for data mapping in geographic information systems and for visualisation of data of various kinds.
More recently, the method has been adapted as a support tool in the decision process underlying the selection, optimization, and management of [[financial portfolio]]s.<ref>M. Resta, [http://www.springerlink.com/content/6416210h727016t5/ Portfolio optimization through elastic maps: Some evidence from the Italian stock exchange], Knowledge-Based Intelligent Information and Engineering Systems, B. Apolloni, R.J. Howlett and L. Jain (eds.), Lecture Notes in Computer Science, Vol. 4693, Springer: Berlin – Heidelberg, 2010, 635-641.</ref>
==References==
{{reflist}}
[[Category:Data mining]]
[[Category:Multivariate statistics]]
[[Category:Dimension reduction]]
Revision as of 20:53, 12 January 2013
Elastic maps provide a tool for nonlinear dimensionality reduction. By construction, they are a system of elastic springs embedded in the data space.[1] This system approximates a low-dimensional manifold. The elastic coefficients of this system allow a switch from completely unstructured k-means clustering (zero elasticity) to estimators located close to linear PCA manifolds (for high bending and low stretching moduli). With intermediate values of the elasticity coefficients, this system effectively approximates non-linear principal manifolds. This approach is based on a mechanical analogy between principal manifolds, which pass through "the middle" of the data distribution, and elastic membranes and plates. The method was developed by A.N. Gorban, A.Y. Zinovyev and A.A. Pitenko in 1996–1998.
Energy of elastic map
Let the data set be a set of vectors <math>S</math> in a finite-dimensional Euclidean space. The elastic map is represented by a set of nodes <math>W_j</math> in the same space. Each data point <math>s \in S</math> has a host node, namely the closest node <math>W_j</math> (if there are several closest nodes, the node with the smallest number is taken). The data set <math>S</math> is divided into classes <math>K_j=\{s \ | \ W_j \mbox{ is a host of } s\}</math>.
The approximation energy <math>D</math> is the distortion
: <math>D=\frac{1}{2}\sum_{j=1}^k \sum_{s \in K_j}\|s-W_j\|^2</math>,
which is the energy of springs with unit elasticity that connect each data point with its host node. It is possible to apply weighting factors to the terms of this sum, for example to reflect the standard deviation of the probability density function of any subset of data points <math>\{s_i\}</math>.
An additional structure is defined on the set of nodes. Some pairs of nodes, <math>(W_i,W_j)</math>, are connected by elastic edges. Call this set of pairs <math>E</math>. Some triplets of nodes, <math>(W_i,W_j,W_l)</math>, form bending ribs. Call this set of triplets <math>G</math>.
: The stretching energy is <math>U_{E}=\frac{1}{2}\lambda \sum_{(W_i,W_j) \in E} \|W_i -W_j\|^2 </math>,
: The bending energy is <math>U_G=\frac{1}{2}\mu \sum_{(W_i,W_j,W_l) \in G} \|W_i -2W_j+W_l\|^2 </math>,
where <math>\lambda</math> and <math>\mu</math> are the stretching and bending moduli respectively. The stretching energy is sometimes referred to as the "membrane" term, while the bending energy is referred to as the "thin plate" term.[5]
For example, on a 2D rectangular grid the elastic edges are just the vertical and horizontal edges (pairs of closest vertices) and the bending ribs are the vertical or horizontal triplets of consecutive (closest) vertices.
: The total energy of the elastic map is thus <math>U=D+U_E+U_G.</math>
The positions of the nodes <math>\{W_j\}</math> are determined by the mechanical equilibrium of the elastic map, i.e. the nodes are located so as to minimize the total energy <math>U</math>.
Expectation-maximization algorithm
For a given splitting of the dataset <math>S</math> into classes <math>K_j</math>, minimization of the quadratic functional <math>U</math> is a linear problem with a sparse matrix of coefficients. Therefore, similarly to PCA or k-means, a splitting method is used:
* For given <math>\{W_j\}</math> find <math>\{K_j\}</math>;
* For given <math>\{K_j\}</math> minimize <math>U</math> and find <math>\{W_j\}</math>;
* If no change, terminate.
This expectation-maximization algorithm guarantees a local minimum of <math>U</math>. Various additional methods have been proposed to improve the approximation. For example, the softening strategy starts with a rigid grid (small length, small bending, and large elasticity moduli <math>\lambda</math> and <math>\mu</math>) and finishes with a soft grid (small <math>\lambda</math> and <math>\mu</math>). Training proceeds in several epochs, each with its own grid rigidity. Another adaptive strategy is the growing net: one starts from a small number of nodes and gradually adds new nodes. Each epoch has its own number of nodes.
Applications
The most important applications are in bioinformatics,[7][8] for exploratory data analysis and visualisation of multidimensional data, for data visualisation in economics, social and political sciences,[9] as an auxiliary tool for data mapping in geographic information systems and for visualisation of data of various kinds.
More recently, the method has been adapted as a support tool in the decision process underlying the selection, optimization, and management of financial portfolios.[10]
References
# A. N. Gorban, A. Y. Zinovyev, Principal Graphs and Manifolds, In: Handbook of Research on Machine Learning Applications and Trends: Algorithms, Methods and Techniques, Olivas E.S. et al Eds. Information Science Reference, IGI Global: Hershey, PA, USA, 2009. 28–59.
# Wang, Y., Klijn, J.G., Zhang, Y., Sieuwerts, A.M., Look, M.P., Yang, F., Talantov, D., Timmermans, M., Meijer-van Gelder, M.E., Yu, J. et al.: Gene expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet 365, 671–679 (2005); Data online
# A. Zinovyev, ViDaExpert - Multidimensional Data Visualization Tool (free for non-commercial use). Institut Curie, Paris.
# A. Zinovyev, ViDaExpert overview, IHES (Institut des Hautes Études Scientifiques), Bures-Sur-Yvette, Île-de-France.
# Michael Kass, Andrew Witkin, Demetri Terzopoulos, Snakes: Active contour models, Int.J. Computer Vision, 1988 vol 1-4 pp.321-331
# A. N. Gorban, A. Zinovyev, Principal manifolds and graphs in practice: from molecular biology to dynamical systems, International Journal of Neural Systems, Vol. 20, No. 3 (2010) 219–232.
# A.N. Gorban, B. Kegl, D. Wunsch, A. Zinovyev (Eds.), Principal Manifolds for Data Visualisation and Dimension Reduction, LNCSE 58, Springer: Berlin – Heidelberg – New York, 2007. ISBN 978-3-540-73749-0
# M. Chacón, M. Lévano, H. Allende, H. Nowak, Detection of Gene Expressions in Microarrays by Applying Iteratively Elastic Neural Net, In: B. Beliczynski et al. (Eds.), Lecture Notes in Computer Sciences, Vol. 4432, Springer: Berlin – Heidelberg 2007, 355–363.
# A. Zinovyev, Data visualization in political and social sciences, In: SAGE "International Encyclopedia of Political Science", Badie, B., Berg-Schlosser, D., Morlino, L. A. (Eds.), 2011.
# M. Resta, Portfolio optimization through elastic maps: Some evidence from the Italian stock exchange, Knowledge-Based Intelligent Information and Engineering Systems, B. Apolloni, R.J. Howlett and L. Jain (eds.), Lecture Notes in Computer Science, Vol. 4693, Springer: Berlin – Heidelberg, 2010, 635-641.