Periodic function: Difference between revisions

{{Cleanup|date=June 2011}}
 
A '''self-organizing map''' ('''SOM''') or '''self-organizing feature map''' ('''SOFM''') is a type of [[artificial neural network]] (ANN) that is trained using [[unsupervised learning]] to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a '''map'''. Self-organizing maps are different from other artificial neural networks in the sense that they use a neighborhood function to preserve the [[Topology|topological]] properties of the input space.
 
[[File:Synapse Self-Organizing Map.png|thumb|right|300px|A self-organizing map showing [[United States Congress|U.S. Congress]] voting patterns visualized in [[Peltarion Synapse|Synapse]]. The first two boxes show clustering and distances while the remaining ones show the component planes. Red means a yes vote while blue means a no vote in the component planes (except the party component where red is [[Republican Party (United States)|Republican]] and blue is [[Democratic Party (United States)|Democratic]]).]]<!--
-->
This makes SOMs useful for [[Scientific visualization|visualizing]] low-dimensional views of high-dimensional data, akin to [[multidimensional scaling]]. The model was first described as an artificial neural network by the [[Finland|Finnish]] professor [[Teuvo Kohonen]], and is sometimes called a '''Kohonen map''' or '''network'''.<ref name="KohonenMap">{{cite web |title=Kohonen Network |last1=Kohonen |first1=Teuvo |last2=Honkela |first2=Timo |year=2007 |work=Scholarpedia |url=http://www.scholarpedia.org/article/Kohonen_network }}</ref><ref>{{cite journal |last=Kohonen |first=Teuvo |year=1982 |title=Self-Organized Formation of Topologically Correct Feature Maps |journal=Biological Cybernetics |volume=43 |number=1 |pages=59–69 }}</ref>
 
Like most artificial neural networks, SOMs operate in two modes: training and mapping. "Training" builds the map using input examples (a [[Competitive learning|competitive process]], also called [[vector quantization]]), while "mapping" automatically classifies a new input vector.
 
A self-organizing map consists of components called nodes or neurons. Associated with each node is a weight vector of the same dimension as the input data vectors and a position in the map space. The usual arrangement of nodes is a two-dimensional regular spacing in a [[hexagonal]] or [[rectangular]] grid. The self-organizing map describes a mapping from a higher dimensional input space to a lower dimensional map space. The procedure for placing a vector from data space onto the map is to find the node with the closest (smallest distance metric) weight vector to the data space vector.
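
The placement procedure described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the dictionary layout (grid position mapped to weight vector), the function names, and the 2×2 example map are assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matching_node(weights, x):
    """Return the map position whose weight vector is closest to x.

    `weights` maps a map-space position (row, col) to a weight vector
    of the same dimension as x (an assumed data layout).
    """
    return min(weights, key=lambda pos: euclidean(weights[pos], x))

# Hypothetical 2x2 map over three-dimensional input data.
weights = {
    (0, 0): [0.0, 0.0, 0.0],
    (0, 1): [1.0, 0.0, 0.0],
    (1, 0): [0.0, 1.0, 0.0],
    (1, 1): [1.0, 1.0, 1.0],
}
position = best_matching_node(weights, [0.9, 0.1, 0.0])
print(position)  # (0, 1)
```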
 
While it is typical to consider this type of network structure as related to [[Feedforward neural networks|feedforward networks]] where the nodes are visualized as being attached, this type of architecture is fundamentally different in arrangement and motivation.
 
Useful extensions include using [[Torus|toroidal]] grids where opposite edges are connected and using large numbers of nodes.
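
On a toroidal grid, the distance between two nodes has to account for the wrap-around at the edges. The sketch below assumes a rectangular grid indexed by (row, column) and a Euclidean lattice distance; the function name is hypothetical.

```python
import math

def toroidal_distance(a, b, rows, cols):
    """Lattice distance between nodes a and b on a grid whose opposite edges are connected."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # Going "around" the torus may be shorter than going across the grid.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return math.hypot(dr, dc)

# Opposite corners of a 10x10 grid are diagonal neighbors on the torus.
print(toroidal_distance((0, 0), (9, 9), rows=10, cols=10))
```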
 
It has been shown that while self-organizing maps with a small number of nodes behave in a way that is similar to [[K-means algorithm|K-means]], larger self-organizing maps rearrange data in a way that is fundamentally topological in character. <ref name="Self-organizing map">{{cite web |title=Self-organizing map |url=http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Self-organizing_map.html }}</ref>
 
It is also common to use the [[U-Matrix]].<ref name="UltschSiemon1990">{{cite book |first=Alfred |last=Ultsch |first2=H. Peter |last2=Siemon |chapter=Kohonen's Self Organizing Feature Maps for Exploratory Data Analysis |title=Proceedings of the International Neural Network Conference (INNC-90), Paris, France, July 9–13, 1990 |pages=305–308 |editor1-first=Bernard |editor1-last=Widrow |editor2-first=Bernard |editor2-last=Angeniol |publisher=Kluwer |location=Dordrecht, Netherlands |year=1990 |volume=1 |isbn=978-0-7923-0831-7 |url=http://www.uni-marburg.de/fb12/datenbionik/pdf/pubs/1990/UltschSiemon90 }}</ref> The U-Matrix value of a particular node is the average distance between the node and its closest neighbors.<ref name="Ultsch2003" /> In a square grid, for instance, we might consider the closest 4 or 8 nodes (the [[Von Neumann neighborhood|Von Neumann]] and [[Moore neighborhood]]s, respectively), or six nodes in a hexagonal grid.
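
As a rough sketch of this definition, the following computes U-Matrix values on a rectangular grid using the four-node Von Neumann neighborhood. The nested-list layout and the handling of border nodes (averaging over the neighbors that exist) are assumptions of the example.

```python
import math

def u_matrix(weights, rows, cols):
    """Average distance from each node's weight vector to those of its grid neighbors.

    `weights[r][c]` is the weight vector of the node at row r, column c.
    """
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            # Von Neumann neighborhood: up, down, left, right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # Border nodes have fewer neighbors; a 1x1 map has none.
            result[r][c] = sum(dists) / len(dists) if dists else 0.0
    return result

# Hypothetical 1x2 map: the two nodes are 5 units apart in data space.
print(u_matrix([[[0.0, 0.0], [3.0, 4.0]]], rows=1, cols=2))  # [[5.0, 5.0]]
```

High values mark nodes whose neighbors lie far away in data space, which is why ridges in the U-Matrix are read as cluster boundaries.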
 
Large SOMs display emergent properties. In maps consisting of thousands of nodes, it is possible to perform cluster operations on the map itself.<ref name="Ultsch2007">{{cite book |first=Alfred |last=Ultsch |chapter=Emergence in Self-Organizing Feature Maps |title=Proceedings of the 6th International Workshop on Self-Organizing Maps (WSOM '07) |editor1-first=H. |editor1-last=Ritter |editor2-first=R. |editor2-last=Haschke |publisher=Neuroinformatics Group |location=Bielefeld, Germany |year=2007 |isbn=978-3-00-022473-7 }}</ref>
 
== Learning algorithm ==
The goal of learning in the self-organizing map is to cause different parts of the network to respond similarly to certain input patterns. This is partly motivated by how visual, auditory or other [[sense|sensory]] information is handled in separate parts of the [[cerebral cortex]] in the [[human brain]].<ref name="Haykin">{{cite book |first=Simon |last=Haykin |title=Neural networks - A comprehensive foundation |chapter=9. Self-organizing maps |edition=2nd |publisher=Prentice-Hall |year=1999 |isbn=0-13-908385-5 }}</ref>
 
[[Image:Somtraining.svg|thumb|500px|An illustration of the training of a self-organizing map. The blue blob is the distribution of the training data, and the small white disc is the current training datum drawn from that distribution. At first (left) the SOM nodes are arbitrarily positioned in the data space. The node (highlighted in yellow) which is nearest to the training datum is selected. It is moved towards the training datum, as (to a lesser extent) are its neighbors on the grid. After many iterations the grid tends to approximate the data distribution (right).]]<!--
-->
The weights of the neurons are initialized either to small random values or sampled evenly from the subspace spanned by the two largest [[principal component]] [[eigenvectors]]. With the latter alternative, learning is much faster because the initial weights already give a good approximation of SOM weights.<ref name="SOMIntro">{{cite web |title=Intro to SOM |first=Teuvo |last=Kohonen |work=SOM Toolbox |url=http://www.cis.hut.fi/projects/somtoolbox/theory/somalgorithm.shtml |year=2005<!-- last updated 18 March 2005 --> |accessdate=2006-06-18 }}</ref>
 
The network must be fed a large number of example vectors that represent, as closely as possible, the kinds of vectors expected during mapping. The examples are usually presented several times over successive iterations.
 
The training utilizes [[competitive learning]]. When a training example is fed to the network, its [[Euclidean distance]] to all weight vectors is computed. The neuron whose weight vector is most similar to the input is called the best matching unit (BMU). The weights of the BMU and neurons close to it in the SOM lattice are adjusted towards the input vector. The magnitude of the change decreases with time and with distance (within the lattice) from the BMU. The update formula for a neuron ''v'' with weight vector '''W<sub>v</sub>'''(s) is
:'''W<sub>v</sub>'''(s + 1) = '''W<sub>v</sub>'''(s) + Θ(u, v, s) α(s)('''D'''(t) - '''W<sub>v</sub>'''(s)),
where s is the step index, t an index into the training sample, u is the index of the BMU for '''D'''(t), α(s) is a [[monotonically decreasing]] learning coefficient and '''D'''(t) is the input vector; v is assumed to visit all neurons for every value of s and t.<ref name="Scholarpedia">{{cite web |title=Kohonen network |first1=Teuvo |last1=Kohonen |first2=Timo |last2=Honkela |work=Scholarpedia |url=http://www.scholarpedia.org/article/Kohonen_network |year=2011<!-- last approved revision 2011-11-15 --> |accessdate=2012-09-24 }}<!--
Begin Quote
Consider first data items that are n-dimensional Euclidean vectors x(t)=[ξ1(t),ξ2(t),…,ξn(t)]. Here t is the index of the data item in a given sequence. Let the ith model be mi(t)=[μi1(t),μi2(t),…,μin(t)], where now t denotes the index in the sequence in which the models are generated.
End Quote
The equation mi(t+1)=mi(t)+α(t)hci(t)[x(t)−mi(t)] thus uses the symbol t to mean *two different things*: the t of x(t) is not the t of m, α and h. This is why we use s and t here.
 
Ultsch & Siemon 1990 also use three nested loops when describing Kohonen's algorithm: the outer one is over the training steps (and controls the decay of Θ and α (called n and η, respectively, in their paper)), the middle one is over the data items, and the inner is over the neurons.
--></ref> Depending on the implementations, t can scan the training data set systematically (t is 0, 1, 2...T-1, then repeat, T being the training sample's size), be randomly drawn from the data set ([[bootstrap sampling]]), or implement some other sampling method (such as [[Resampling_(statistics)#Jackknife|jackknifing]]).
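
A single application of the update formula to one neuron can be written as below. The function and parameter names are illustrative only; `theta` and `alpha` stand for the already evaluated values Θ(u, v, s) and α(s).

```python
def update_weight(w_v, d_t, theta, alpha):
    """Return the weight vector at step s + 1, given the weight vector w_v at step s.

    Implements w_v + theta * alpha * (d_t - w_v) component by component.
    """
    return [w + theta * alpha * (d - w) for w, d in zip(w_v, d_t)]

# With theta = 1 and alpha = 0.5 the weight moves halfway towards the input.
print(update_weight([0.0, 0.0], [1.0, 2.0], theta=1.0, alpha=0.5))  # [0.5, 1.0]
```

A neuron far from the BMU has Θ close to 0 and is left almost unchanged, while Θ α = 1 would move the weight vector exactly onto the input.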
 
The neighborhood function Θ(u, v, s) depends on the lattice distance between the BMU (neuron ''u'') and neuron ''v''. In the simplest form it is 1 for all neurons close enough to the BMU and 0 for others, but a [[Gaussian function]] is also a common choice. Regardless of the functional form, the neighborhood function shrinks with time.<ref name="Haykin" /> At the beginning, when the neighborhood is broad, self-organization takes place on a global scale. When the neighborhood has shrunk to just a couple of neurons, the weights converge to local estimates. In some implementations the learning coefficient α and the neighborhood function Θ decrease steadily with increasing s; in others (in particular those where t scans the training data set) they decrease in step-wise fashion, once every T steps.
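
The two functional forms mentioned above, together with one possible shrinking schedule, might look like this. The exponential decay and all names are assumptions of this sketch; the text only requires that the neighborhood shrinks with time.

```python
import math

def bubble(lattice_dist, radius):
    """Simplest form: 1 for neurons within the radius of the BMU, 0 for all others."""
    return 1.0 if lattice_dist <= radius else 0.0

def gaussian(lattice_dist, radius):
    """Gaussian neighborhood: 1 at the BMU, falling off smoothly with lattice distance."""
    return math.exp(-(lattice_dist ** 2) / (2.0 * radius ** 2))

def decayed(initial, final, s, total_steps):
    """Assumed schedule: exponential interpolation from `initial` (s = 0) to `final` (s = total_steps)."""
    return initial * (final / initial) ** (s / total_steps)

# The same neuron (lattice distance 3 from the BMU) early and late in training.
for s in (0, 500, 1000):
    radius = decayed(10.0, 1.0, s, total_steps=1000)
    print(s, round(radius, 3), round(gaussian(3.0, radius), 3))
```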
 
This process is repeated for each input vector for a (usually large) number of cycles '''λ'''. The network winds up associating output nodes with groups or patterns in the input data set. If these patterns can be named, the names can be attached to the associated nodes in the trained net.
 
During mapping, there will be one single ''winning'' neuron: the neuron whose weight vector lies closest to the input vector. This can be simply determined by calculating the Euclidean distance between input vector and weight vector.
 
While this article emphasizes representing input data as vectors, any kind of object that can be represented digitally, has an appropriate distance measure, and supports the operations needed for training can be used to construct a self-organizing map. This includes matrices, continuous functions, or even other self-organizing maps.
 
===Preliminary definitions===
{{Unreferenced section|date=February 2010}}
[[File:SOM of RGB and eight colors.JPG|thumb|Self organizing maps (SOM) of three and eight colors with U-Matrix.]]
Consider an n×m array of nodes, each of which contains a weight vector and is aware of its location in the array. Each weight vector is of the same dimension as the node's input vector. The weights may initially be set to random values.
 
Now we need input to feed the map. The generated map and the given input exist in separate subspaces. We will create three vectors to represent colors. Colors can be represented by their red, green, and blue components. Consequently, our input vectors will have three components, each corresponding to one color channel. The input vectors will be:
:R = <255, 0, 0>
:G = <0, 255, 0>
:B = <0, 0, 255>
 
The color training vector data sets used in SOM:
:threeColors = [255, 0, 0], [0, 255, 0], [0, 0, 255]
:eightColors = [0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 0], [0, 255, 255], [255, 0, 255], [255, 255, 255]
 
The data vectors should preferably be normalized (vector length is equal to one) before training the SOM.
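
A minimal sketch of this normalization step, applied to the color data above. Leaving the all-zero vector (black) unchanged is an assumption of the example, since it has no direction and cannot be scaled to unit length.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean length; the zero vector is returned unchanged (assumed convention)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]

eight_colors = [
    [0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255],
    [255, 255, 0], [0, 255, 255], [255, 0, 255], [255, 255, 255],
]
normalized = [normalize(c) for c in eight_colors]
print(normalized[1])  # [1.0, 0.0, 0.0]
```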
[[File:SOM of Fishers Iris flower data set.JPG|thumb|Self organizing map of Fisher's Iris flower data.]]
 
Neurons (40&times;40 square grid) are trained for 250 iterations with a learning rate of 0.1 using the normalized [[Iris flower data set]] which has four-dimensional data vectors. Shown are: a color image formed by the first three dimensions of the four-dimensional SOM weight vectors (top left), a pseudo-color image of the magnitude of the SOM weight vectors (top right), a U-Matrix (Euclidean distance between weight vectors of neighboring cells) of the SOM (bottom left), and an overlay of data points (red: ''I. setosa'', green: ''I. versicolor'' and blue: ''I. virginica'') on the U-Matrix based on the minimum Euclidean distance between data vectors and SOM weight vectors (bottom right).
 
=== Variables ===
These are the variables needed, with vectors in bold:
* <math>s</math> is the current iteration
* <math>\lambda</math> is the iteration limit
* <math>t</math> is the index of the target input data vector in the input data set <math>\mathbf{D}</math>
* <math>\mathbf{D}(t)</math> is a target input data vector
* <math>v</math> is the index of the node in the map
* <math>\mathbf{W}_v</math> is the current weight vector of node ''v''
* <math>u</math> is the index of the best matching unit (BMU) in the map
* <math>\Theta (u, v, s)</math> is a restraint due to distance from BMU, usually called the neighborhood function, and
* <math>\alpha (s)</math> is a learning restraint due to iteration progress.
 
=== Algorithm ===
# Randomize the map's nodes' weight vectors
# Grab an input vector <math>\mathbf{D}(t)</math>
# Traverse each node in the map
## Use the [[Euclidean distance]] formula to find the similarity between the input vector and the map's node's weight vector
## Track the node that produces the smallest distance (this node is the best matching unit, BMU)
# Update the nodes in the neighborhood of the BMU (including the BMU itself) by pulling them closer to the input vector
## '''W<sub>v</sub>'''(s + 1) = '''W<sub>v</sub>'''(s) + Θ(u, v, s) α(s)('''D'''(t) - '''W<sub>v</sub>'''(s))
# Increase s and repeat from step 2 while <math>s < \lambda</math>
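
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: the function name `train_som`, the random choice of input vector, the linear decay of α, the Gaussian neighborhood, and the shrinking radius with a floor of 0.5 are all choices made for the example.

```python
import math
import random

def train_som(data, rows, cols, max_steps, alpha0=0.5, seed=0):
    """Train a rows x cols map on `data`; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    # Step 1: randomize the weight vector of every node.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for s in range(max_steps):
        frac = s / max_steps
        alpha = alpha0 * (1.0 - frac)              # assumed: linear decay of the learning coefficient
        radius = max(radius0 * (1.0 - frac), 0.5)  # assumed: shrinking neighborhood radius
        # Step 2: grab an input vector D(t), here drawn at random.
        d = data[rng.randrange(len(data))]
        # Step 3: traverse the nodes and keep the one with the smallest distance (the BMU).
        u = min(weights, key=lambda v: math.dist(weights[v], d))
        # Step 4: pull the BMU and its lattice neighbors towards the input vector.
        for v, w in weights.items():
            lattice_sq = (v[0] - u[0]) ** 2 + (v[1] - u[1]) ** 2
            theta = math.exp(-lattice_sq / (2.0 * radius ** 2))
            weights[v] = [wi + theta * alpha * (di - wi) for wi, di in zip(w, d)]
        # Step 5: the for loop increases s and stops once s reaches max_steps.
    return weights

# The three-color data set, rescaled to the range [0, 1].
three_colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
som = train_som(three_colors, rows=5, cols=5, max_steps=500)
```

Here `max_steps` plays the role of the iteration limit λ. A fixed seed keeps the sketch reproducible.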
 
A variant algorithm:
# Randomize the map's nodes' weight vectors
# Traverse each input vector in the input data set
## Traverse each node in the map
### Use the [[Euclidean distance]] formula to find the similarity between the input vector and the map's node's weight vector
### Track the node that produces the smallest distance (this node is the best matching unit, BMU)
## Update the nodes in the neighborhood of the BMU (including the BMU itself) by pulling them closer to the input vector
### '''W<sub>v</sub>'''(s + 1) = '''W<sub>v</sub>'''(s) + Θ(u, v, s) α(s)('''D'''(t) - '''W<sub>v</sub>'''(s))
# Increase s and repeat from step 2 while <math>s < \lambda</math>
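
In the variant, s advances once per full pass over the data set, so α and Θ stay fixed while every input vector is presented. A compact sketch follows, again with assumed decay schedules, an assumed Gaussian neighborhood, and a hypothetical function name.

```python
import math
import random

def train_som_by_pass(data, rows, cols, passes, alpha0=0.5, seed=0):
    """Variant training loop: one value of s per pass over the whole data set."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for s in range(passes):
        # alpha and the radius change only between passes (step-wise decrease, assumed linear).
        alpha = alpha0 * (1.0 - s / passes)
        radius = max(radius0 * (1.0 - s / passes), 0.5)
        for d in data:  # traverse each input vector in the data set
            u = min(weights, key=lambda v: math.dist(weights[v], d))
            for v, w in weights.items():
                lattice_sq = (v[0] - u[0]) ** 2 + (v[1] - u[1]) ** 2
                theta = math.exp(-lattice_sq / (2.0 * radius ** 2))
                weights[v] = [wi + theta * alpha * (di - wi) for wi, di in zip(w, d)]
    return weights

colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
som_by_pass = train_som_by_pass(colors, rows=4, cols=4, passes=100)
```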
 
== Interpretation ==
[[File:Self oraganizing map cartography.jpg|thumb|left|300px|Cartographical representation of a self-organizing map ([[U-Matrix]]) based on Wikipedia featured article data (word frequency). Distance is inversely proportional to similarity. The "mountains" are edges between clusters. The red lines are links between articles.]]
 
[[File:SOMsPCA.PNG|thumb|One-dimensional SOM versus principal component analysis (PCA) for data approximation. SOM is a red [[broken line]] with squares, 20 nodes. The first principal component is presented by a blue line. Data points are the small grey circles. For PCA, the [[fraction of variance unexplained]] in this example is 23.23%, for SOM it is 6.86%.<ref>Illustration is prepared using free software: Mirkes, Evgeny M.; [http://www.math.le.ac.uk/people/ag153/homepage/PCA_SOM/PCA_SOM.html ''Principal Component Analysis and Self-Organizing Maps: applet''], University of Leicester, 2011</ref>]]
 
There are two ways to interpret a SOM. Because in the training phase weights of the whole neighborhood are moved in the same direction, similar items tend to excite adjacent neurons. Therefore, SOM forms a semantic map where similar samples are mapped close together and dissimilar ones apart. This may be visualized by a [[U-Matrix]] (Euclidean distance between weight vectors of neighboring cells) of the SOM.<ref name="UltschSiemon1990" /><ref name="Ultsch2003">Ultsch, Alfred (2003); ''U*-Matrix: A tool to visualize clusters in high dimensional data'', Department of Computer Science, University of Marburg, [http://www.uni-marburg.de/fb12/datenbionik/pdf/pubs/2003/ultsch03ustar Technical Report Nr. 36:1-12]</ref><ref>Saadatdoost, Robab, Alex Tze Hiang Sim, and Jafarkarimi, Hosein. "Application of self organizing map for knowledge discovery based in higher education data." Research and Innovation in Information Systems (ICRIIS), 2011 International Conference on. IEEE, 2011.</ref>
 
The other way is to think of neuronal weights as pointers to the input space. They form a discrete approximation of the distribution of training samples. More neurons point to regions with high training sample concentration and fewer where the samples are scarce.
 
 
SOM may be considered a nonlinear generalization of [[Principal components analysis]] (PCA).<ref>Yin, Hujun; [http://pca.narod.ru/contentsgkwz.htm ''Learning Nonlinear Principal Manifolds by Self-Organising Maps''], in Gorban, Alexander N.; Kégl, Balázs; Wunsch, Donald C.; and Zinovyev, Andrei (Eds.); ''Principal Manifolds for Data Visualization and Dimension Reduction'', Lecture Notes in Computer Science and Engineering (LNCSE), vol. 58, Berlin, Germany: Springer, 2007, ISBN 978-3-540-73749-0</ref> It has been shown, using both artificial and real geophysical data, that SOM has many advantages<ref>Liu, Yonggang; and Weisberg, Robert H. (2005); [http://www.agu.org/pubs/crossref/2005/2004JC002786.shtml ''Patterns of Ocean Current Variability on the West Florida Shelf Using the Self-Organizing Map''], Journal of Geophysical Research, 110, C06003, {{doi|10.1029/2004JC002786}}</ref><ref>Liu, Yonggang; Weisberg, Robert H.; and Mooers, Christopher N. K. (2006); [http://www.agu.org/pubs/crossref/2006/2005JC003117.shtml ''Performance Evaluation of the Self-Organizing Map for Feature Extraction''], Journal of Geophysical Research, 111, C05018, {{doi|10.1029/2005jc003117}}</ref> over the conventional feature extraction methods such as Empirical Orthogonal Functions (EOF) or PCA.
 
Originally, SOM was not formulated as a solution to an optimisation problem. Nevertheless, there have been several attempts to modify the definition of SOM and to formulate an optimisation problem which gives similar results.<ref>Heskes, Tom; [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.6572 ''Energy Functions for Self-Organizing Maps''], in Oja, Erkki; and Kaski, Samuel (Eds.), ''Kohonen Maps'', Elsevier, 1999</ref> For example, [[Elastic map]]s use the mechanical metaphor of elasticity to approximate [[Nonlinear dimensionality reduction#Principal curves and manifolds|principal manifolds]]:<ref>Gorban, Alexander N.; Kégl, Balázs; Wunsch, Donald C.; and Zinovyev, Andrei (Eds.); [http://pca.narod.ru/contentsgkwz.htm ''Principal Manifolds for Data Visualization and Dimension Reduction''], Lecture Notes in Computer Science and Engineering (LNCSE), vol. 58, Berlin, Germany: Springer, 2007, ISBN 978-3-540-73749-0</ref> the analogy is an elastic membrane and plate.
 
== Alternatives ==
* The '''[[generative topographic map]]''' (GTM) is a potential alternative to SOMs. In the sense that a GTM explicitly requires a smooth and continuous mapping from the input space to the map space, it is topology preserving. However, in a practical sense, this measure of topological preservation is lacking.<ref>{{cite paper |last=Kaski |first=Samuel |title=Data Exploration Using Self-Organizing Maps |journal=Acta Polytechnica Scandinavica |series=Mathematics, Computing and Management in Engineering Series No. 82 |year=1997 |publisher=Finnish Academy of Technology |location=Espoo, Finland |isbn=952-5148-13-0}}</ref>
 
* The '''[[time adaptive self-organizing map]]''' (TASOM) network is an extension of the basic SOM. The TASOM employs adaptive learning rates and neighborhood functions. It also includes a scaling parameter to make the network invariant to scaling, translation and rotation of the input space. The TASOM and its variants have been used in several applications including adaptive clustering, multilevel thresholding, input space approximation, and active contour modeling.<ref>{{cite paper |first=Hamed |last=Shah-Hosseini |first2=Reza |last2=Safabakhsh |title=TASOM: A New Time Adaptive Self-Organizing Map |journal=IEEE Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics |volume=33 |number=2 |month=April |year=2003 |pages=271–282 |url=http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1187438&tag=1 }}</ref> Moreover, a Binary Tree TASOM (BTASOM) has been proposed. It resembles a natural binary tree whose nodes are TASOM networks, and the number of its levels and nodes adapts to its environment.<ref>{{cite paper |first=Hamed |last=Shah-Hosseini |title=Binary Tree Time Adaptive Self-Organizing Map |journal=Neurocomputing |volume=74 |number=11 |month=May |year=2011 |pages=1823–1839 |url=http://www.sciencedirect.com/science/article/pii/S0925231211000786 }}</ref>
 
* The '''[[growing self-organizing map]]''' (GSOM) is a growing variant of the self-organizing map. The GSOM was developed to address the issue of identifying a suitable map size in the SOM. It starts with a minimal number of nodes (usually four) and grows new nodes on the boundary based on a heuristic. By using a value called the ''spread factor'', the data analyst has the ability to control the growth of the GSOM.
 
* The '''[[elastic map]]s''' approach<ref>A. N. Gorban, A. Zinovyev, [http://arxiv.org/abs/1001.1122 Principal manifolds and graphs in practice: from molecular biology to dynamical systems], [[International Journal of Neural Systems]], Vol. 20, No. 3 (2010) 219–232.</ref> borrows from [[spline interpolation]] the idea of minimizing the [[elastic energy]]. In learning, it minimizes the sum of quadratic bending and stretching energy with the [[least squares]] [[approximation error]].
 
* The conformal approach<ref>{{cite journal | last=Liou | first=C.-Y. | last2=Kuo | first2=Y.-T. | title=Conformal Self-organizing Map for a Genus Zero Manifold |journal=The Visual Computer |volume=21 |issue=5 |pages=340-353 |date=2005 |doi=10.1007/s00371-005-0290-6 |url=http://link.springer.com/article/10.1007%2Fs00371-005-0290-6}}</ref><ref>{{cite journal | last=Liou | first=C.-Y. | last2=Tai | first2=W.-P. | title=Conformality in the self-organization network |journal=Artificial Intelligence |volume=116 |pages=265-286 |date=2000 |doi=10.1016/S0004-3702(99)00093-4 |url=http://www.sciencedirect.com/science/article/pii/S0004370299000934}}</ref> uses conformal mapping to interpolate each training sample between grid nodes in a continuous surface. A one-to-one smooth mapping is possible in this approach.
 
== See also ==
* [[Neural gas]]
* Large Memory Storage and Retrieval (LAMSTAR) neural networks
* [[Hybrid Kohonen SOM]]
 
== References ==
{{Reflist|2}}
 
{{Commons category}}
 
== External links ==
* [http://jsalatas.ictpro.gr/weka/ Self-organizing maps for WEKA]: An implementation of self-organizing maps in Java, for the WEKA Machine Learning Workbench.
* [http://ai4r.org/ Self-organizing maps for Ruby]: Implementation of self-organizing maps in Ruby, for the AI4R project.
* [http://github.com/LucidTechnics/som Self-organizing map for JavaScript]: An open-source implementation of a self-organizing map in JavaScript for node.js from Lucid Technics, LLC.
* [http://hackage.haskell.org/package/som Self-organizing map for Haskell]: An open-source implementation of a self-organising map in Haskell.
* [http://www.spice.ci.ritsumei.ac.jp/~thangc/programs/ Spice-SOM]: A free GUI application for self-organizing maps.
* [http://mathcs.emory.edu/~kthayer/ifcsoft/ IFCSoft]: An open-source Java platform for generating self-organizing maps
* [http://www.demogng.de DemoGNG]: Java applet implementing self-organizing maps and other network models (neural gas, growing neural gas, growing grid etc.)
* [http://cran.r-project.org/web/packages/kohonen/ kohonen]: An open-source package for supervised and unsupervised self-organising maps in R.
* [http://supfam.org/supraHex supraHex]: A supra-hexagonal map for analysing high-dimensional omics data.
 
{{DEFAULTSORT:Self-Organizing Map}}
[[Category:Neural networks]]
[[Category:Dimension reduction]]
[[Category:Data clustering algorithms]]

Revision as of 15:51, 11 February 2014
