{{Other uses}}
 
In [[machine learning]], '''pattern recognition''' is the assignment of a label to a given input value. An example of pattern recognition is [[classification (machine learning)|classification]], which attempts to assign each input value to one of a given set of ''classes'' (for example, determine whether a given email is "spam" or "non-spam"). However, pattern recognition is a more general problem that encompasses other types of output as well. Other examples are [[regression analysis|regression]], which assigns a real-valued output to each input; [[sequence labeling]], which assigns a class to each member of a sequence of values (for example, [[part of speech tagging]], which assigns a [[part of speech]] to each word in an input sentence); and [[parsing]], which assigns a [[parse tree]] to an input sentence, describing the [[syntactic structure]] of the sentence.
 
Pattern recognition algorithms generally aim to provide a reasonable answer for all possible inputs and to perform "most likely" matching of the inputs, taking into account their statistical variation. This is opposed to ''[[pattern matching]]'' algorithms, which look for exact matches in the input with pre-existing patterns. A common example of a pattern-matching algorithm is [[regular expression]] matching, which looks for patterns of a given sort in textual data and is included in the search capabilities of many [[text editor]]s and [[word processor]]s. In contrast to pattern recognition, pattern matching is generally not considered a type of machine learning, although pattern-matching algorithms (especially with fairly general, carefully tailored patterns) can sometimes succeed in providing similar-quality output to the sort provided by pattern-recognition algorithms.
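As a minimal illustration of the contrast (a sketch using Python's standard <code>re</code> module; the example strings are invented), a regular expression either matches an input or it does not, with no notion of statistical confidence:

<syntaxhighlight lang="python">
import re

# Pattern matching: an exact, rule-based test with a binary outcome.
# This pattern accepts strings consisting of exactly five digits.
pattern = re.compile(r"\d{5}")

for text in ["90210", "9021O"]:  # the second ends in the letter O, not zero
    print(text, "->", "match" if pattern.fullmatch(text) else "no match")

# A pattern-recognition system would instead score how likely each string
# is to be a postal code, tolerating noise such as OCR errors.
</syntaxhighlight>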
 
Pattern recognition is studied in many fields, including [[psychology]], [[psychiatry]], [[ethology]], [[cognitive science]], [[Three-phase traffic theory|traffic flow]] and [[computer science]].
 
==Overview==
Pattern recognition is generally categorized according to the type of learning procedure used to generate the output value. ''[[Supervised learning]]'' assumes that a set of ''training data'' (the ''[[training set]]'') has been provided, consisting of a set of instances that have been properly labeled by hand with the correct output. A learning procedure then generates a ''model'' that attempts to meet two sometimes conflicting objectives: Perform as well as possible on the training data, and generalize as well as possible to new data (usually, this means being as simple as possible, for some technical definition of "simple", in accordance with [[Occam's Razor]], discussed below). [[Unsupervised learning]], on the other hand, assumes training data that has not been hand-labeled, and attempts to find inherent patterns in the data that can then be used to determine the correct output value for new data instances. A combination of the two that has recently been explored is [[semi-supervised learning]], which uses a combination of labeled and unlabeled data (typically a small set of labeled data combined with a large amount of unlabeled data). Note that in cases of unsupervised learning, there may be no training data at all to speak of; in other words, the data to be labeled ''is'' the training data.
 
Note that sometimes different terms are used to describe the corresponding supervised and unsupervised learning procedures for the same type of output. For example, the unsupervised equivalent of classification is normally known as ''[[data clustering|clustering]]'', based on the common perception of the task as involving no training data to speak of, and of grouping the input data into ''clusters'' based on some inherent similarity measure (e.g. the [[distance]] between instances, considered as vectors in a multi-dimensional [[vector space]]), rather than assigning each input instance into one of a set of pre-defined classes. Note also that in some fields, the terminology is different: For example, in [[community ecology]], the term "classification" is used to refer to what is commonly known as "clustering".
 
The piece of input data for which an output value is generated is formally termed an ''instance''. The instance is formally described by a [[feature vector|vector]] of ''features'', which together constitute a description of all known characteristics of the instance. (These feature vectors can be seen as defining points in an appropriate [[space (mathematics)|multidimensional space]], and methods for manipulating vectors in [[vector space]]s can be correspondingly applied to them, such as computing the [[dot product]] or the angle between two vectors.) Typically, features are either [[categorical data|categorical]] (also known as [[nominal data|nominal]], i.e., consisting of one of a set of unordered items, such as a gender of "male" or "female", or a blood type of "A", "B", "AB" or "O"), [[ordinal data|ordinal]] (consisting of one of a set of ordered items, e.g., "large", "medium" or "small"), [[integer|integer-valued]] (e.g., a count of the number of occurrences of a particular word in an email) or [[real number|real-valued]] (e.g., a measurement of blood pressure). Often, categorical and ordinal data are grouped together; likewise for integer-valued and real-valued data. Furthermore, many algorithms work only in terms of categorical data and require that real-valued or integer-valued data be ''discretized'' into groups (e.g., less than 5, between 5 and 10, or greater than 10).
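As a concrete sketch (pure Python; the feature names and thresholds are hypothetical), an email instance could be described by a feature vector mixing categorical, integer-valued and real-valued features, with the real-valued one discretized into groups as described above:

<syntaxhighlight lang="python">
# A toy instance described by three features of different types.
email = {
    "sender_domain": "example.com",  # categorical
    "num_exclamations": 7,           # integer-valued
    "spam_word_ratio": 0.31,         # real-valued
}

def discretize(ratio):
    """Map a real-valued feature into ordered groups, as required by
    algorithms that work only in terms of categorical data."""
    if ratio < 0.1:
        return "low"
    elif ratio < 0.3:
        return "medium"
    return "high"

feature_vector = (
    email["sender_domain"],
    email["num_exclamations"],
    discretize(email["spam_word_ratio"]),
)
print(feature_vector)  # ('example.com', 7, 'high')
</syntaxhighlight>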
 
===Probabilistic classifiers===
Many common pattern recognition algorithms are ''probabilistic'' in nature, in that they use [[statistical inference]] to find the best label for a given instance. Unlike other algorithms, which simply output a "best" label, often probabilistic algorithms also output a [[probability]] of the instance being described by the given label. In addition, many probabilistic algorithms output a list of the ''N''-best labels with associated probabilities, for some value of ''N'', instead of simply a single best label. When the number of possible labels is fairly small (e.g., in the case of [[classification (machine learning)|classification]]), ''N'' may be set so that the probability of all possible labels is output. Probabilistic algorithms have many advantages over non-probabilistic algorithms:
*They output a confidence value associated with their choice. (Note that some other algorithms may also output confidence values, but in general, only for probabilistic algorithms is this value mathematically grounded in [[probability theory]]. Non-probabilistic confidence values can in general not be given any specific meaning, and only used to compare against other confidence values output by the same algorithm.)
*Correspondingly, they can ''abstain'' when the confidence of choosing any particular output is too low.
*Because of the probabilities output, probabilistic pattern-recognition algorithms can be more effectively incorporated into larger machine-learning tasks, in a way that partially or completely avoids the problem of ''error propagation''.
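A minimal sketch of the behaviours listed above (NumPy; the raw scores are invented): a probability for each label, an ''N''-best list, and abstention when even the best label falls below a confidence threshold:

<syntaxhighlight lang="python">
import numpy as np

def n_best(scores, labels, n=2, threshold=0.6):
    """Convert raw classifier scores into probabilities (softmax),
    return the n most probable labels, and abstain when the top
    label is not confident enough."""
    exp = np.exp(scores - np.max(scores))  # numerically stable softmax
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:n]
    ranked = [(labels[i], float(probs[i])) for i in order]
    decision = ranked[0][0] if ranked[0][1] >= threshold else "abstain"
    return decision, ranked

labels = ["spam", "non-spam"]
print(n_best(np.array([2.1, 0.3]), labels))  # confident: 'spam'
print(n_best(np.array([0.6, 0.5]), labels))  # too close: 'abstain'
</syntaxhighlight>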
 
===How many feature variables are important?===
[[Feature selection]] algorithms attempt to directly prune out redundant or irrelevant features. A general introduction to [[feature selection]], which summarizes approaches and challenges, has been given.<ref>Isabelle Guyon Clopinet, André Elisseeff (2003). ''An Introduction to Variable and Feature Selection''. The Journal of Machine Learning Research, Vol. 3, 1157-1182. [http://www-vis.lbl.gov/~romano/mlgroup/papers/guyon03a.pdf Link]</ref> Because of its non-monotonous character, feature selection is an [[optimization problem]]: given a total of <math>n</math> features, the [[powerset]] consisting of all <math>2^n-1</math> non-empty subsets of features needs to be explored. The [[Branch and bound|Branch-and-Bound algorithm]]<ref>
{{Cite journal| author=Iman Foroutan, Jack Sklansky| year=1987 |
title=Feature Selection for Automatic Classification of Non-Gaussian Data | journal=IEEE Transactions on Systems, Man and Cybernetics | volume=17 | pages=187&ndash;198 | doi = 10.1109/TSMC.1987.4309029 | issue=2
}}.</ref> reduces this complexity, but remains intractable for medium to large numbers of available features <math>n</math>. For a large-scale comparison of feature-selection algorithms, see Kudo and Sklansky.<ref>
{{Cite journal| author=Mineichi Kudo, Jack Sklansky| year=2000 |
title=Comparison of algorithms that select features for pattern classifiers | journal=Pattern Recognition | volume=33 | pages=25&ndash;41 | doi = 10.1016/S0031-3203(99)00041-2 | issue=1}}.</ref>
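To make the combinatorics concrete, the following brute-force sketch (pure Python; the scoring function is hypothetical) enumerates all <math>2^n-1</math> non-empty feature subsets; Branch-and-Bound prunes exactly this search tree:

<syntaxhighlight lang="python">
from itertools import combinations

def best_subset(features, score):
    """Exhaustively score every non-empty subset of features and return
    the best one. Feasible only for small n: there are 2**n - 1 subsets."""
    best, best_score = None, float("-inf")
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            s = score(subset)  # e.g., cross-validated accuracy
            if s > best_score:
                best, best_score = subset, s
    return best, best_score

# Hypothetical scorer: rewards the presence of one informative feature
# and penalizes subset size, mimicking a preference for small models.
features = ["cholesterol", "age", "height", "eye_color"]
score = lambda subset: ("cholesterol" in subset) - 0.1 * len(subset)
print(best_subset(features, score))  # (('cholesterol',), 0.9)
</syntaxhighlight>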
 
Techniques to transform the raw feature vectors ('''feature extraction''') are sometimes used prior to application of the pattern-recognition algorithm. For example, [[feature extraction]] algorithms attempt to reduce a large-dimensionality feature vector into a smaller-dimensionality vector that is easier to work with and encodes less redundancy, using mathematical techniques such as [[principal components analysis]] (PCA). The distinction between '''feature selection''' and '''feature extraction''' is that the features resulting from feature extraction are of a different sort than the original features and may not easily be interpretable, while the features left after feature selection are simply a subset of the original features.
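A minimal sketch of the distinction (NumPy; random data stands in for real features): PCA-based extraction yields new, derived coordinates, whereas selection merely keeps a subset of the original columns:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))  # 100 instances, 5 original features

# Feature extraction via PCA: project onto the top two principal
# components, computed from the singular value decomposition.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_extracted = Xc @ Vt[:2].T  # new features: linear mixtures, harder to interpret

# Feature selection: keep two of the original, interpretable columns.
X_selected = X[:, [0, 3]]

print(X_extracted.shape, X_selected.shape)  # (100, 2) (100, 2)
</syntaxhighlight>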
 
== Problem statement (supervised version)==
Formally, the problem of [[supervised learning|supervised]] pattern recognition can be stated as follows: Given an unknown function <math>g:\mathcal{X}\rightarrow\mathcal{Y}</math> (the ''ground truth'') that maps input instances <math>\boldsymbol{x} \in \mathcal{X}</math> to output labels <math>y \in \mathcal{Y}</math>, along with training data <math>\mathbf{D} = \{(\boldsymbol{x}_1,y_1),\dots,(\boldsymbol{x}_n, y_n)\}</math> assumed to represent accurate examples of the mapping, produce a function <math>h:\mathcal{X}\rightarrow\mathcal{Y}</math> that approximates the correct mapping <math>g</math> as closely as possible. (For example, if the problem is filtering spam, then <math>\boldsymbol{x}_i</math> is some representation of an email and <math>y</math> is either "spam" or "non-spam".) In order for this to be a well-defined problem, "approximates as closely as possible" needs to be defined rigorously. In [[decision theory]], this is done by specifying a [[loss function]] that assigns a specific value to the "loss" resulting from producing an incorrect label. The goal then is to minimize the [[expected value|expected]] loss, with the expectation taken over the [[probability distribution]] of <math>\mathcal{X}</math>. In practice, neither the distribution of <math>\mathcal{X}</math> nor the ground truth function <math>g:\mathcal{X}\rightarrow\mathcal{Y}</math> is known exactly; both can only be estimated empirically, by collecting a large number of samples from <math>\mathcal{X}</math> and hand-labeling them with the correct value of <math>\mathcal{Y}</math> (a time-consuming process, which is typically the limiting factor in the amount of data of this sort that can be collected). The particular loss function depends on the type of label being predicted. For example, in the case of [[classification (machine learning)|classification]], the simple [[zero-one loss function]] is often sufficient. This corresponds to assigning a loss of 1 to any incorrect labeling, and implies that the optimal classifier minimizes the [[Bayes error rate|error rate]] on independent test data, i.e. the fraction of instances that the learned function <math>h:\mathcal{X}\rightarrow\mathcal{Y}</math> labels wrongly (equivalently, it maximizes the number of correctly classified instances). The goal of the learning procedure is then to minimize the error rate (maximize the [[correctness]]) on a "typical" test set.
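A minimal sketch of the zero-one loss and the empirical error rate it induces (pure Python; the classifier and test set are invented):

<syntaxhighlight lang="python">
def zero_one_loss(y_true, y_pred):
    """Loss of 1 for any incorrect labeling, 0 otherwise."""
    return 0 if y_true == y_pred else 1

def error_rate(h, data):
    """Empirical error rate of hypothesis h on labeled data:
    the average zero-one loss over all instances."""
    return sum(zero_one_loss(y, h(x)) for x, y in data) / len(data)

# Hypothetical learned function h and hand-labeled test set.
h = lambda x: "spam" if "offer" in x else "non-spam"
test = [("free offer now", "spam"),
        ("meeting at noon", "non-spam"),
        ("limited offer on venue", "non-spam")]
print(error_rate(h, test))  # 0.333...: one of three instances mislabeled
</syntaxhighlight>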
 
For a probabilistic pattern recognizer, the problem is instead to estimate the probability of each possible output label given a particular input instance, i.e., to estimate a function of the form
:<math>p({\rm label}|\boldsymbol{x},\boldsymbol\theta) = f\left(\boldsymbol{x};\boldsymbol{\theta}\right)</math>
where the [[feature vector]] input is <math>\boldsymbol{x}</math>, and the function ''f'' is typically parameterized by some parameters <math>\boldsymbol{\theta}</math>.<ref>For [[linear discriminant analysis]] the parameter vector <math>\boldsymbol\theta</math> consists of the two mean vectors <math>\boldsymbol\mu_1</math> and <math>\boldsymbol\mu_2</math> and the common [[covariance matrix]] <math>\boldsymbol\Sigma</math>.</ref> In a [[discriminative model|discriminative]] approach to the problem, ''f'' is estimated directly. In a [[generative model|generative]] approach, however, the inverse probability <math>p({\boldsymbol{x}|\rm label})</math> is instead estimated and combined with the [[prior probability]] <math>p({\rm label}|\boldsymbol\theta)</math> using [[Bayes' rule]], as follows:
:<math>p({\rm label}|\boldsymbol{x},\boldsymbol\theta) = \frac{p({\boldsymbol{x}|\rm label}) p({\rm label|\boldsymbol\theta})}{\sum_{L \in \text{all labels}} p(\boldsymbol{x}|L) p(L|\boldsymbol\theta)}.</math>
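A numeric sketch of this generative route (pure Python; the class-conditional probabilities and priors are invented for illustration):

<syntaxhighlight lang="python">
# Hypothetical generative model for one binary feature x
# ("the email contains the word 'offer'") and two labels.
p_x_given = {"spam": 0.8, "non-spam": 0.1}  # p(x | label)
prior     = {"spam": 0.4, "non-spam": 0.6}  # p(label | theta)

# Bayes' rule: p(label | x) = p(x | label) p(label) / evidence,
# where the evidence sums over all labels L.
evidence = sum(p_x_given[L] * prior[L] for L in prior)
posterior = {L: p_x_given[L] * prior[L] / evidence for L in prior}
print(posterior)  # {'spam': 0.842..., 'non-spam': 0.157...}
</syntaxhighlight>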
 
When the labels are [[continuous distribution|continuously distributed]] (e.g., in [[regression analysis]]), the denominator involves [[integral|integration]] rather than summation:
 
:<math>p({\rm label}|\boldsymbol{x},\boldsymbol\theta) = \frac{p({\boldsymbol{x}|\rm label}) p({\rm label|\boldsymbol\theta})}{\int_{L \in \text{all labels}} p(\boldsymbol{x}|L) p(L|\boldsymbol\theta) \operatorname{d}L}.</math>
 
The value of <math>\boldsymbol\theta</math> is typically learned using [[maximum a posteriori]] (MAP) estimation. This finds the best value that simultaneously meets two conflicting objectives: to perform as well as possible on the training data (smallest [[Bayes error rate|error-rate]]) and to find the simplest possible model. Essentially, this combines [[maximum likelihood]] estimation with a [[regularization (mathematics)|regularization]] procedure that favors simpler models over more complex models. In a [[Bayesian inference|Bayesian]] context, the regularization procedure can be viewed as placing a [[prior probability]] <math>p(\boldsymbol\theta)</math> on different values of <math>\boldsymbol\theta</math>. Mathematically:
 
:<math>\boldsymbol\theta^* = \arg \max_{\boldsymbol\theta} p(\boldsymbol\theta|\mathbf{D})</math>
 
where <math>\boldsymbol\theta^*</math> is the value used for <math>\boldsymbol\theta</math> in the subsequent evaluation procedure, and <math>p(\boldsymbol\theta|\mathbf{D})</math>, the [[posterior probability]] of <math>\boldsymbol\theta</math>, satisfies
 
:<math>p(\boldsymbol\theta|\mathbf{D}) \propto \left[\prod_{i=1}^n p(y_i|\boldsymbol{x}_i,\boldsymbol\theta) \right] p(\boldsymbol\theta).</math>
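A minimal sketch of MAP estimation (NumPy; a Bernoulli likelihood with a hypothetical Beta(2, 2) prior, maximized over a grid of candidate parameter values):

<syntaxhighlight lang="python">
import numpy as np

# Training labels assumed drawn from an unknown Bernoulli(theta).
y = np.array([1, 1, 0, 1, 1, 1, 0, 1])

thetas = np.linspace(0.01, 0.99, 99)  # candidate parameter values
log_lik = (y.sum() * np.log(thetas)
           + (len(y) - y.sum()) * np.log(1 - thetas))
# Regularizing prior that favors less extreme models: Beta(2, 2).
log_prior = np.log(thetas) + np.log(1 - thetas)

theta_mle = thetas[np.argmax(log_lik)]              # maximum likelihood: 0.75
theta_map = thetas[np.argmax(log_lik + log_prior)]  # MAP: pulled toward 0.70
print(theta_mle, theta_map)
</syntaxhighlight>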
 
In the [[Bayesian statistics|Bayesian]] approach to this problem, instead of choosing a single parameter vector <math>\boldsymbol{\theta}^*</math>, the probability of a given label for a new instance <math>\boldsymbol{x}</math> is computed by integrating over all possible values of <math>\boldsymbol\theta</math>, weighted according to the posterior probability:
 
:<math>p({\rm label}|\boldsymbol{x}) = \int p({\rm label}|\boldsymbol{x},\boldsymbol\theta)p(\boldsymbol{\theta}|\mathbf{D}) \operatorname{d}\boldsymbol{\theta}.</math>
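This integral is rarely available in closed form; a common sketch is to approximate it by averaging predictions over samples from the posterior. Continuing the hypothetical Bernoulli example above (NumPy; the posterior is Beta(8, 4), i.e. the Beta(2, 2) prior updated with six ones and two zeros):

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
samples = rng.beta(8, 4, size=100_000)  # draws from p(theta | D)

# Bayesian prediction: average p(label | x, theta) over the posterior.
# Here p(label=1 | theta) = theta, so the integral reduces to E[theta].
print(samples.mean())  # approx 8/12 = 0.667, not the single MAP value 0.70
</syntaxhighlight>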
 
=== Frequentist or Bayesian approach to pattern recognition? ===
The first pattern classifier – the linear discriminant presented by [[Fisher discriminant analysis|Fisher]] – was developed in the [[Frequentist inference|frequentist]] tradition. The frequentist approach entails that the model parameters are considered unknown but objective. The parameters are then computed (estimated) from the collected data. For the linear discriminant, these parameters are precisely the mean vectors and the common [[covariance matrix]]. The probability of each class <math>p({\rm label}|\boldsymbol\theta)</math> is likewise estimated from the collected dataset. Note that the usage of '[[Bayes rule]]' in a pattern classifier does not make the classification approach Bayesian.
 
[[Bayesian inference|Bayesian statistics]] has its origin in Greek philosophy, where a distinction was already made between '[[A priori and a posteriori|a priori]]' and '[[A priori and a posteriori|a posteriori]]' knowledge. [[A priori and a posteriori#Immanuel Kant|Kant]] later drew his distinction between what is known a priori – before observation – and the empirical knowledge gained from observations. In a Bayesian pattern classifier, the class probabilities <math>p({\rm label}|\boldsymbol\theta)</math> can be chosen by the user; they are then a priori. Moreover, experience quantified as a priori parameter values can be weighted against empirical observations, using e.g. the [[Beta distribution|Beta]] ([[Conjugate prior distribution|conjugate prior]]) and [[Dirichlet distribution|Dirichlet distributions]]. The Bayesian approach thus facilitates a seamless intermixing of expert knowledge, in the form of subjective probabilities, with objective observations.
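A minimal sketch of weighting prior expertise against observations with a conjugate Beta prior (pure Python; all counts are invented):

<syntaxhighlight lang="python">
# Expert belief about a class probability, encoded as a Beta(a, b) prior:
# "roughly 30% spam", worth about ten observations of experience.
a, b = 3, 7

# Empirical observations: 40 spam and 60 non-spam emails.
spam, non_spam = 40, 60

# Conjugacy: the posterior is again a Beta, with the counts simply added.
a_post, b_post = a + spam, b + non_spam
print(a_post / (a_post + b_post))  # 0.390...: a compromise of prior and data
</syntaxhighlight>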
 
Probabilistic pattern classifiers can be used according to a frequentist or a Bayesian approach.
 
==Uses==
[[File:800px-Cool Kids of Death Off Festival p 146-face selected.jpg|thumb|200px|[[Face recognition|The face was automatically detected]] by special software.]]
Within medical science, pattern recognition is the basis for [[computer-aided diagnosis]] (CAD) systems. CAD describes a procedure that supports the doctor's interpretations and findings.
 
Other typical applications of pattern recognition techniques are automatic [[speech recognition]], [[document classification|classification of text into several categories]] (e.g., spam/non-spam email messages), the [[handwriting recognition|automatic recognition of handwritten postal codes]] on postal envelopes, automatic recognition of images of human faces, or handwriting image extraction from medical forms.<ref>{{cite journal|last=Milewski|first=Robert|coauthors=Govindaraju, Venu|title=Binarization and cleanup of handwritten text from carbon copy medical form images|journal=Pattern Recognition|date=31 March 2008|volume=41|issue=4|pages=1308–1315|doi=10.1016/j.patcog.2007.08.018|url=http://dl.acm.org/citation.cfm?id=1324656}}</ref> The last two examples form the subtopic [[image analysis]] of pattern recognition that deals with digital images as input to pattern recognition systems.<ref name=duda2001>Richard O. Duda, [[Peter E. Hart]], David G. Stork (2001) ''Pattern classification'' (2nd edition), Wiley, New York, ISBN 0-471-05669-3</ref><ref>R. Brunelli, ''Template Matching Techniques in Computer Vision: Theory and Practice'', Wiley, ISBN 978-0-470-51706-2, 2009</ref>
 
Optical character recognition is a classic example of the application of a pattern classifier; see the
[http://cmp.felk.cvut.cz/cmp/software/stprtool/examples/ocr_system.gif OCR-example].
The method of signing one's name was captured with stylus and overlay starting in 1990.{{citation needed|date=January 2011}} The strokes, speed, relative minima and maxima, acceleration and pressure are used to uniquely identify and confirm identity. Banks were first offered this technology, but were content to collect from the FDIC for any bank fraud and did not want to inconvenience customers.{{citation needed|date=January 2011}}
 
[[Artificial neural networks]] (neural net classifiers) and [[Deep Learning]] have many real-world applications in image processing. A few examples:
* identification and authentication: e.g., license plate recognition,<ref>[http://anpr-tutorial.com/ THE AUTOMATIC NUMBER PLATE RECOGNITION TUTORIAL] http://anpr-tutorial.com/</ref> fingerprint analysis and face detection/verification;<ref>[http://www.cs.cmu.edu/afs/cs.cmu.edu/usr/mitchell/ftp/faces.html Neural Networks for Face Recognition] Companion to Chapter 4 of the textbook Machine Learning.</ref>
* medical diagnosis: e.g., screening for cervical cancer (Papnet)<ref>[http://health-asia.org/papnet-for-cervical-screening/ PAPNET For Cervical Screening] http://health-asia.org/papnet-for-cervical-screening/</ref> or breast tumors;
* defence: various navigation and guidance systems, target recognition systems, etc.
 
For a discussion of the aforementioned applications of neural networks in image processing, see e.g.<ref>{{Cite journal| author=Egmont-Petersen, M., de Ridder, D., Handels, H. | year=2002 |
title=Image processing with neural networks - a review | journal=Pattern Recognition | volume=35 | pages=2279&ndash;2301 | doi = 10.1016/S0031-3203(01)00178-9 | issue=10
}}</ref>
 
In psychology, pattern recognition (making sense of and identifying the objects we see) is closely related to perception, which explains how the sensory inputs we receive are made meaningful. Pattern recognition can be thought of in two different ways: the first being template matching and the second being feature detection.
A template is a pattern used to produce items of the same proportions. The template-matching hypothesis suggests that incoming stimuli are compared with templates in long-term memory. If there is a match, the stimulus is identified.
Feature detection models, such as the Pandemonium system for classifying letters (Selfridge, 1959), suggest that the stimuli are broken down into their component parts for identification. For example, a capital E has three horizontal lines and one vertical line.<ref>{{cite web|url=http://www.s-cool.co.uk/a-level/psychology/attention/revise-it/pattern-recognition |title=A-level Psychology Attention Revision - Pattern recognition &#124; S-cool, the revision website |publisher=S-cool.co.uk |date= |accessdate=2012-09-17}}</ref>
 
==Algorithms==
Algorithms for pattern recognition depend on the type of label output, on whether learning is supervised or unsupervised, and on whether the algorithm is statistical or non-statistical in nature. Statistical algorithms can further be categorized as [[generative model|generative]] or [[discriminative model|discriminative]].
 
===Categorical [[sequence labeling]] algorithms (predicting sequences of [[categorical data|categorical]] labels)===
Supervised:
*[[Conditional random field]]s (CRFs)
*[[Hidden Markov model]]s (HMMs)
*[[Maximum entropy Markov model]]s (MEMMs)
*[[Recurrent neural networks]]
 
Unsupervised:
*[[Hidden Markov model]]s (HMMs)
 
===[[Classification (machine learning)|Classification]] algorithms ([[supervised learning|supervised]] algorithms predicting [[categorical data|categorical]] labels)===
Parametric:<ref>Assuming known distributional shape of feature distributions per class, such as the [[Gaussian distribution|Gaussian]] shape.</ref>
*[[Linear discriminant analysis]]
*[[Quadratic classifier|Quadratic discriminant analysis]]
*[[Maximum entropy classifier]] (aka [[logistic regression]], [[multinomial logistic regression]]): Note that logistic regression is an algorithm for classification, despite its name. (The name comes from the fact that logistic regression uses an extension of a linear regression model to model the probability of an input being in a particular class.)
Nonparametric:<ref>No distributional assumption regarding shape of feature distributions per class.</ref>
*[[Decision tree]]s, [[decision list]]s
*[[Variable kernel density estimation#Use for statistical classification|Kernel estimation]] and [[K-nearest-neighbor]] algorithms
*[[Naive Bayes classifier]]
*[[Neural network]]s (multi-layer perceptrons)
*[[Perceptron]]s
*[[Support vector machine]]s
*[[Gene expression programming]]
 
===[[Cluster analysis|Clustering]] algorithms ([[unsupervised learning|unsupervised]] algorithms predicting [[categorical data|categorical]] labels)===
*Categorical [[mixture model]]s
*[[Deep learning|Deep learning methods]]
*[[Hierarchical clustering]] (agglomerative or divisive)
*[[K-means clustering]]
*[[Correlation clustering]]
*[[Kernel principal component analysis]] (Kernel PCA)
 
===[[Ensemble learning]] algorithms (supervised [[meta-algorithm]]s for combining multiple learning algorithms together)===
*[[Boosting (meta-algorithm)]]
*[[Bootstrap aggregating]] ("bagging")
*[[Ensemble averaging]]
*[[Mixture of experts]], [[hierarchical mixture of experts]]
 
===General algorithms for predicting arbitrarily-structured (sets of) labels===
*[[Bayesian network]]s
*[[Markov random field]]s
 
===[[Multilinear subspace learning]] algorithms (predicting labels of multidimensional data using [[tensor]] representations)===
Unsupervised:
*[[Multilinear principal component analysis]] (MPCA)
 
===[[Parsing]] algorithms (predicting [[tree structure]]d labels)===
Supervised and unsupervised:
*[[Probabilistic context free grammar]]s (PCFGs)
 
===Real-valued [[sequence labeling]] algorithms (predicting sequences of [[real number|real-valued]] labels)===
Supervised (?):
*[[Kalman filter]]s
*[[Particle filter]]s
 
===[[Regression analysis|Regression]] algorithms (predicting [[real number|real-valued]] labels)===
Supervised:
*[[Gaussian process regression]] (kriging)
*[[Linear regression]] and extensions
*[[Neural network]]s and [[Deep learning|Deep learning methods]]
 
Unsupervised:
*[[Independent component analysis]] (ICA)
*[[Principal components analysis]] (PCA)
 
==Which classifier to choose for a classification task?==
This article contains an extensive list of statistical classifiers for [[Supervised learning|supervised]] and [[Unsupervised learning|unsupervised]] classification tasks, clustering, and general regression prediction. When building a classifier, e.g. for a software application, a number of different aspects influence the choice of the preferred classifier type.
 
Building or ''training'' a classifier is essentially [[statistical inference]]: an attempt is made to identify stochastic (often unknown) relations between feature variables and the categories to be predicted. For example, consider the influence of increased cholesterol on a patient's risk of a heart attack within the next year. Which other variables besides the current cholesterol level determine this risk? The two categories for a classifier to 'predict' are then 'heart attack likely' and 'heart attack unlikely'.
 
The theoretically optimal classifier is called the '''[[Bayes error rate|Bayes classifier]]'''. It minimizes the loss function or ''risk'' as defined [[Supervised learning#How supervised learning algorithms work|here]]. When all types of misclassification are associated with equal losses (mistaking outcome A for B is as undesired as mistaking B for A), the Bayes classifier with the minimal error rate (on a test set) is the optimal one for the classification task. In general, the optimal classifier type and the true parameters <math>\boldsymbol{\theta}</math> are unknown. However, bounds on the optimal Bayes error rate have been derived. For example, for the K-nearest-neighbor classifier, theoretical results [[K-nearest-neighbor#Properties|bound its error rate]] in relation to the optimal Bayes error rate.
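For instance, in the two-class case the classic asymptotic result of Cover and Hart bounds the nearest-neighbor error rate <math>R_{\rm NN}</math> in terms of the optimal Bayes error rate <math>R^*</math>:

:<math>R^* \le R_{\rm NN} \le 2R^*\left(1 - R^*\right) \le 2R^*,</math>

so the simple nearest-neighbor rule is, asymptotically, at most twice as bad as the optimal classifier.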
 
In essence, building a classifier brings model selection with it. [[Feature selection]] – using only a subset of the available feature variables to predict the most likely categorical outcome – by itself entails model selection. Choosing among the extensive set of different classifiers makes model selection even more complex. A theoretical analysis of this search problem has been presented as the '''[[No free lunch in search and optimization|no free lunch theorem]]''':<ref>David H. Wolpert (2001). ''The Supervised Learning No Free Lunch Theorems''. Technical report MS-269-1, NASA Ames Research Center. [http://www.no-free-lunch.org/Wolp01a.pdf Link]</ref> no particular classification algorithm is 'the best' for all problems. The pragmatic approach to this open problem is to combine prior knowledge of the classification task (e.g. [[Statistical distribution|distributional assumptions]]) with a search process in which different types of classifiers are developed and their performance compared.
 
The search for the 'best' model that predicts observations and the relations between them is a problem that was already recognized in ancient Greece. In medieval times, [[Occam's razor]] was formulated:
 
“plurality should not be posited without necessity”.
 
In this context, it means that if a simple classifier with only a few parameters (a small <math>\boldsymbol{\theta}</math>) does the job as well as a much more complex classification algorithm, the simpler one should be chosen. The performance of a classifier is, however, only one of the criteria to apply when choosing the best classification model. Others include the [[Statistical distribution|distributional assumptions]] made, the insight provided into the discovered relations between variables, whether the classification algorithm can cope with missing feature variables, whether a change in class [[prior probability]]<ref>Relative frequency of each class in the training and test sets.</ref> can be incorporated, the speed of the training algorithm, memory requirements, parallelization of the classification process, and resemblance to the human perceptual system. In visual pattern recognition, [[Prior knowledge for pattern recognition#Class-invariance|invariance]] to variations in color, rotation and scale is an extra property that needs to be accounted for.
 
===Supervised classification===
 
When choosing the most appropriate supervised classifier, the generally accepted heuristic, illustrated by the code sketch after this list, is to:
# Separate the available data, at random, into a training set and a test set. Use the test set only for the final performance comparison of the trained classifiers.
# Experiment by training a number of classification algorithms, including [[Parametric statistics|parametric]] ([[discriminant analysis]], [[Multinomial logistic regression|multinomial classifier]]<ref name=Glick1973>Ned Glick (1973) ''Sample-Based Multinomial Classification'', ''Biometrics'', Vol. 29, No. 2, pp. 241-256.</ref>) and [[Non-parametric statistics|non-parametric]] algorithms ([[K-nearest-neighbor|k-nearest neighbor]], a [[support vector machine]], a [[Neural networks|feed-forward neural network]], a standard [[Decision tree|decision-tree algorithm]]).
# Test distributional assumptions of the (continuous) feature-distributions per category: are they [[Gaussian distribution|Gaussian]]?
# Determine which subset of feature variables contributes most to the discriminative performance of the classifier.
# Decide whether elaborate [[confidence interval]]s are needed for the [[Bayes error rate|error-rates]] and class-predictions.<ref name=mclachlan2004>Geoffrey J. McLachlan (2004) ''Discriminant Analysis and Statistical Pattern Recognition'', Wiley Series in Probability and Statistics, New Jersey, ISBN 0-471-69115-1</ref>
# Consider whether white-box versus black-box requirements render specific classifiers unsuited for the job.
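A minimal sketch of steps 1 and 2 (assuming the scikit-learn library is available; the Iris data merely stands in for a real classification task):

<syntaxhighlight lang="python">
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Step 1: random split; the test set is used only for the final comparison.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Step 2: train one parametric and several non-parametric classifiers.
classifiers = {
    "LDA (parametric)": LinearDiscriminantAnalysis(),
    "k-nearest neighbor": KNeighborsClassifier(n_neighbors=5),
    "support vector machine": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))  # accuracy on held-out data
</syntaxhighlight>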
 
===Unsupervised classification===
 
==See also==
{{Div col|cols=2}}
* [[Adaptive resonance theory]]
* [[Artificial neural networks]]
* [[Cache language model]]
* [[Compound term processing]]
* [[Computer-aided diagnosis]]
* [[Data mining]]
* [[Deep Learning]]
* [[List of numerical analysis software]]
* [[List of numerical libraries]]
* [[Machine learning]]
* [[Multilinear subspace learning]]
* [[Neocognitron]]
* [[Pattern language]]
* [[Pattern recognition (psychology)]]
* [[Perception]]
* [[Perceptual learning]]
* [[Predictive analytics]]
* [[Prior knowledge for pattern recognition]]
* [[Sequence mining]]
* [[Template matching]]
* [[Thin-slicing]]
* [[Contextual image classification]]
{{Div col end}}
 
==References==
{{FOLDOC}}
{{reflist}}
 
==Further reading==
*{{cite book|last=Fukunaga|first=Keinosuke|title=Introduction to Statistical Pattern Recognition|edition=2nd|year=1990|publisher=Academic Press|location=Boston|isbn=0-12-269851-7}}
*{{cite book|last=Bishop|first=Christopher|authorlink=Christopher Bishop|title=Pattern Recognition and Machine Learning|year=2006|publisher=Springer|location=Berlin|isbn=0-387-31073-8}}
*{{cite book|last1=Koutroumbas|first1=Konstantinos|last2=Theodoridis|first2=Sergios|title=Pattern Recognition|edition=4th|year=2008|publisher=Academic Press|location=Boston|isbn=1-59749-272-8}}
*{{cite book|last1=Hornegger|first1=Joachim|last2=Paulus|first2=Dietrich W. R.|title=Applied Pattern Recognition: A Practical Introduction to Image and Speech Processing in C++|edition=2nd|year=1999|publisher=Morgan Kaufmann Publishers|location=San Francisco|isbn=3-528-15558-2}}
*{{cite book|last=Schuermann|first=Juergen|title=Pattern Classification: A Unified View of Statistical and Neural Approaches|year=1996|publisher=Wiley|location=New York|isbn=0-471-13534-8}}
*{{cite book|editor=Godfried T. Toussaint|title=Computational Morphology|year=1988|publisher=North-Holland Publishing Company|location=Amsterdam}}
*{{cite book|last1=Kulikowski|first1=Casimir A.|last2=Weiss|first2=Sholom M.|title=Computer Systems That Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems|series=Machine Learning|year=1991|publisher=Morgan Kaufmann Publishers|location=San Francisco|isbn=1-55860-065-5}}
*{{cite paper|last1=Jain|first1=Anil.K.|last2=Duin|first2=Robert.P.W.|last3=Mao|first3=Jianchang|title=Statistical pattern recognition: a review|year=2000|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence | volume=22 | pages=4&ndash;37 | doi = 10.1109/34.824819 | issue=1}}
*[http://www.egmont-petersen.nl/classifiers.htm An introductory tutorial to classifiers (introducing the basic terms, with numeric example)]
 
==External links==
* [http://www.iapr.org The International Association for Pattern Recognition]
* [http://cgm.cs.mcgill.ca/~godfried/teaching/pr-web.html List of Pattern Recognition web sites]
* [http://www.jprr.org Journal of Pattern Recognition Research]
* [http://www.docentes.unal.edu.co/morozcoa/docs/pr.php Pattern Recognition Info]
* [http://www.sciencedirect.com/science/journal/00313203 Pattern Recognition] (Journal of the Pattern Recognition Society)
* [http://www.worldscinet.com/ijprai/mkt/archive.shtml International Journal of Pattern Recognition and Artificial Intelligence]
* [http://www.inderscience.com/ijapr International Journal of Applied Pattern Recognition]
* [http://www.openpr.org.cn/ Open Pattern Recognition Project], intended to be an open source platform for sharing algorithms of pattern recognition
* [http://www.intelnics.com/opennn OpenNN: Open Neural Networks Library]
 
[[Category:Machine learning]]
[[Category:Formal sciences]]

Revision as of 12:28, 31 January 2014

I'm Fernando (21) from Seltjarnarnes, Iceland.
I'm learning Norwegian literature at a local college and I'm just about to graduate.
I have a part time job in a the office.

my site; wellness [continue reading this..]

In machine learning, pattern recognition is the assignment of a label to a given input value. An example of pattern recognition is classification, which attempts to assign each input value to one of a given set of classes (for example, determine whether a given email is "spam" or "non-spam"). However, pattern recognition is a more general problem that encompasses other types of output as well. Other examples are regression, which assigns a real-valued output to each input; sequence labeling, which assigns a class to each member of a sequence of values (for example, part of speech tagging, which assigns a part of speech to each word in an input sentence); and parsing, which assigns a parse tree to an input sentence, describing the syntactic structure of the sentence.

Pattern recognition algorithms generally aim to provide a reasonable answer for all possible inputs and to perform "most likely" matching of the inputs, taking into account their statistical variation. This is opposed to pattern matching algorithms, which look for exact matches in the input with pre-existing patterns. A common example of a pattern-matching algorithm is regular expression matching, which looks for patterns of a given sort in textual data and is included in the search capabilities of many text editors and word processors. In contrast to pattern recognition, pattern matching is generally not considered a type of machine learning, although pattern-matching algorithms (especially with fairly general, carefully tailored patterns) can sometimes succeed in providing similar-quality output to the sort provided by pattern-recognition algorithms.

Pattern recognition is studied in many fields, including psychology, psychiatry, ethology, cognitive science, traffic flow and computer science.

Overview

Pattern recognition is generally categorized according to the type of learning procedure used to generate the output value. Supervised learning assumes that a set of training data (the training set) has been provided, consisting of a set of instances that have been properly labeled by hand with the correct output. A learning procedure then generates a model that attempts to meet two sometimes conflicting objectives: Perform as well as possible on the training data, and generalize as well as possible to new data (usually, this means being as simple as possible, for some technical definition of "simple", in accordance with Occam's Razor, discussed below). Unsupervised learning, on the other hand, assumes training data that has not been hand-labeled, and attempts to find inherent patterns in the data that can then be used to determine the correct output value for new data instances. A combination of the two that has recently been explored is semi-supervised learning, which uses a combination of labeled and unlabeled data (typically a small set of labeled data combined with a large amount of unlabeled data). Note that in cases of unsupervised learning, there may be no training data at all to speak of; in other words, the data to be labeled is the training data.

Note that sometimes different terms are used to describe the corresponding supervised and unsupervised learning procedures for the same type of output. For example, the unsupervised equivalent of classification is normally known as clustering, based on the common perception of the task as involving no training data to speak of, and of grouping the input data into clusters based on some inherent similarity measure (e.g. the distance between instances, considered as vectors in a multi-dimensional vector space), rather than assigning each input instance into one of a set of pre-defined classes. Note also that in some fields, the terminology is different: For example, in community ecology, the term "classification" is used to refer to what is commonly known as "clustering".

The piece of input data for which an output value is generated is formally termed an instance. The instance is formally described by a vector of features, which together constitute a description of all known characteristics of the instance. (These feature vectors can be seen as defining points in an appropriate multidimensional space, and methods for manipulating vectors in vector spaces can be correspondingly applied to them, such as computing the dot product or the angle between two vectors.) Typically, features are either categorical (also known as nominal, i.e., consisting of one of a set of unordered items, such as a gender of "male" or "female", or a blood type of "A", "B", "AB" or "O"), ordinal (consisting of one of a set of ordered items, e.g., "large", "medium" or "small"), integer-valued (e.g., a count of the number of occurrences of a particular word in an email) or real-valued (e.g., a measurement of blood pressure). Often, categorical and ordinal data are grouped together; likewise for integer-valued and real-valued data. Furthermore, many algorithms work only in terms of categorical data and require that real-valued or integer-valued data be discretized into groups (e.g., less than 5, between 5 and 10, or greater than 10).

Probabilistic classifiers

Many common pattern recognition algorithms are probabilistic in nature, in that they use statistical inference to find the best label for a given instance. Unlike other algorithms, which simply output a "best" label, often probabilistic algorithms also output a probability of the instance being described by the given label. In addition, many probabilistic algorithms output a list of the N-best labels with associated probabilities, for some value of N, instead of simply a single best label. When the number of possible labels is fairly small (e.g., in the case of classification), N may be set so that the probability of all possible labels is output. Probabilistic algorithms have many advantages over non-probabilistic algorithms:

  • They output a confidence value associated with their choice. (Note that some other algorithms may also output confidence values, but in general, only for probabilistic algorithms is this value mathematically grounded in probability theory. Non-probabilistic confidence values can in general not be given any specific meaning, and only used to compare against other confidence values output by the same algorithm.)
  • Correspondingly, they can abstain when the confidence of choosing any particular output is too low.
  • Because of the probabilities output, probabilistic pattern-recognition algorithms can be more effectively incorporated into larger machine-learning tasks, in a way that partially or completely avoids the problem of error propagation.

How many feature variables are important?

Feature selection algorithms, attempt to directly prune out redundant or irrelevant features. A general introduction to feature selection which summarizes approaches and challenges, has been given.[1] The complexity of feature-selection is, because of its non-monotonous character, an optimization problem where given a total of features the powerset consisting of all subsets of features need to be explored. The Branch-and-Bound algorithm [2] does reduce this complexity but is intractable for medium to large values of the number of available features . For a large-scale comparison of feature-selection algorithms see .[3]

Techniques to transform the raw feature vectors (feature extraction) are sometimes used prior to application of the pattern-matching algorithm. For example, feature extraction algorithms attempt to reduce a large-dimensionality feature vector into a smaller-dimensionality vector that is easier to work with and encodes less redundancy, using mathematical techniques such as principal components analysis (PCA). The distinction between feature selection and feature extraction is that the resulting features after feature extraction has taken place are of a different sort than the original features and may not easily be interpretable, while the features left after feature selection are simply a subset of the original features.

Problem statement (supervised version)

Formally, the problem of supervised pattern recognition can be stated as follows: Given an unknown function (the ground truth) that maps input instances to output labels , along with training data assumed to represent accurate examples of the mapping, produce a function that approximates as closely as possible the correct mapping . (For example, if the problem is filtering spam, then is some representation of an email and is either "spam" or "non-spam"). In order for this to be a well-defined problem, "approximates as closely as possible" needs to be defined rigorously. In decision theory, this is defined by specifying a loss function that assigns a specific value to "loss" resulting from producing an incorrect label. The goal then is to minimize the expected loss, with the expectation taken over the probability distribution of . In practice, neither the distribution of nor the ground truth function are known exactly, but can be computed only empirically by collecting a large number of samples of and hand-labeling them using the correct value of (a time-consuming process, which is typically the limiting factor in the amount of data of this sort that can be collected). The particular loss function depends on the type of label being predicted. For example, in the case of classification, the simple zero-one loss function is often sufficient. This corresponds simply to assigning a loss of 1 to any incorrect labeling and implies that the optimal classifier minimizes the error rate on independent test data (i.e. counting up the fraction of instances that the learned function labels wrongly, which is equivalent to maximizing the number of correctly classified instances). The goal of the learning procedure is then to minimize the error rate (maximize the correctness) on a "typical" test set.

For a probabilistic pattern recognizer, the problem is instead to estimate the probability of each possible output label given a particular input instance, i.e., to estimate a function of the form

where the feature vector input is , and the function f is typically parameterized by some parameters .[4] In a discriminative approach to the problem, f is estimated directly. In a generative approach, however, the inverse probability is instead estimated and combined with the prior probability using Bayes' rule, as follows:

When the labels are continuously distributed (e.g., in regression analysis), the denominator involves integration rather than summation:

The value of is typically learned using maximum a posteriori (MAP) estimation. This finds the best value that simultaneously meets two conflicting objects: To perform as well as possible on the training data (smallest error-rate) and to find the simplest possible model. Essentially, this combines maximum likelihood estimation with a regularization procedure that favors simpler models over more complex models. In a Bayesian context, the regularization procedure can be viewed as placing a prior probability on different values of . Mathematically:

where is the value used for in the subsequent evaluation procedure, and , the posterior probability of , is given by

In the Bayesian approach to this problem, instead of choosing a single parameter vector , the probability of a given label for a new instance is computed by integrating over all possible values of , weighted according to the posterior probability:

Frequentist or Bayesian approach to pattern recognition?

The first pattern classifier – the linear discriminant presented by Fisher – was developed in the Frequentist tradition. The frequentist approach entails that the model parameters are considered unknown, but objective. The parameters are then computed (estimated) from the collected data. For the linear discriminant, these parameters are precisely the mean vectors and the Covariance matrix. Also the probability of each class is estimated from the collected dataset. Note that the usage of ‘Bayes rule’ in a pattern classifier does not make the classification approach Bayesian.

Bayesian statistics has its origin in Greek philosophy where a distinction was already made between the ‘a priori’ and the ‘a posteriori’ knowledge. Later Kant defined his distinction between what is a priori known – before observation – and the empirical knowledge gained from observations. In a Bayesian pattern classifier, the class probabilities can be chosen by the user, which are then a priori. Moreover, experience quantified as a priori parameter values can be weighted with empirical observations – using e.g., the Beta- (conjugate prior) and Dirichlet-distributions. The Bayesian approach facilitates a seamless intermixing between expert knowledge in the form of subjective probabilities, and objective observations.

Probabilistic pattern classifiers can be used according to a frequentist or a Bayesian approach.

Uses

The face was automatically detected by special software.

Within medical science, pattern recognition is the basis for computer-aided diagnosis (CAD) systems. CAD describes a procedure that supports the doctor's interpretations and findings.

Other typical applications of pattern recognition techniques are automatic speech recognition, classification of text into several categories (e.g., spam/non-spam email messages), the automatic recognition of handwritten postal codes on postal envelopes, automatic recognition of images of human faces, or handwriting image extraction from medical forms.[5] The last two examples form the subtopic image analysis of pattern recognition that deals with digital images as input to pattern recognition systems.[6][7]

Optical character recognition is a classic example of the application of a pattern classifier, see OCR-example. The method of signing one's name was captured with stylus and overlay starting in 1990.Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park. The strokes, speed, relative min, relative max, acceleration and pressure is used to uniquely identify and confirm identity. Banks were first offered this technology, but were content to collect from the FDIC for any bank fraud and did not want to inconvenience customers..Potter or Ceramic Artist Truman Bedell from Rexton, has interests which include ceramics, best property developers in singapore developers in singapore and scrabble. Was especially enthused after visiting Alejandro de Humboldt National Park.

  • Artificial neural networks (neural net classifiers) and Deep Learning have many real-world applications in image processing, a few examples:
  • identification and authentication: e.g., license plate recognition,[8] fingerprint analysis and face detection/verification;[9]
  • medical diagnosis: e.g., screening for cervical cancer (Papnet)[10] or breast tumors;
  • defence: various navigation and guidance systems, target recognition systems, etc.

For a discussion of the aforementioned applications of neural networks in image processing, see e.g.[11]

In psychology, pattern recognition, making sense of and identifying the objects we see is closely related to perception, which explains how the sensory inputs we receive are made meaningful. Pattern recognition can be thought of in two different ways: the first being template matching and the second being feature detection. A template is a pattern used to produce items of the same proportions. The template-matching hypothesis suggests that incoming stimuli are compared with templates in the long term memory. If there is a match, the stimulus is identified. Feature detection models, such as the Pandemonium system for classifying letters (Selfridge, 1959), suggest that the stimuli are broken down into their component parts for identification. For example, a capital E has three horizontal lines and one vertical line.[12]

Algorithms

Algorithms for pattern recognition depend on the type of label output, on whether learning is supervised or unsupervised, and on whether the algorithm is statistical or non-statistical in nature. Statistical algorithms can further be categorized as generative or discriminative.

Categorical sequence labeling algorithms (predicting sequences of categorical labels)

Supervised:

Unsupervised:

Classification algorithms (supervised algorithms predicting categorical labels)

Parametric:[13]

Nonparametric:[14]

Clustering algorithms (unsupervised algorithms predicting categorical labels)

Ensemble learning algorithms (supervised meta-algorithms for combining multiple learning algorithms together)

General algorithms for predicting arbitrarily-structured (sets of) labels

Multilinear subspace learning algorithms (predicting labels of multidimensional data using tensor representations)

Unsupervised:

Parsing algorithms (predicting tree structured labels)

Supervised and unsupervised:

Real-valued sequence labeling algorithms (predicting sequences of real-valued labels)

Supervised (?):

Regression algorithms (predicting real-valued labels)

Supervised:

Unsupervised:

Which classifier to choose for a classification task?

This article contains an extensive list of statistical classifiers for supervised and unsupervised classification tasks, clustering and general regression prediction. When considering building a classifier e.g., for a software application, a number of different aspects influence the choice of the preferred classifier type to use.

Building or training a classifier is essentially statistical inference. This means that an attempt is made to identify stochastic (often unknown) relations between feature variables and the categories to be predicted. For example, the influence of increased cholesterol on the risk of a heart attack for a patient, within the next year. Which other variables besides the current cholesterol level determine this risk? The two categories to 'predict' by a classifier are 'heart attack likely', or 'heart attack unlikely'.

The theoretically optimal classifier is called the Bayes classifier. It minimizes the loss-function or risk as defined here. When all types of misclassifications are associated with equal losses (outcome A becomes B is as undesired as when outcome B becomes A), the Bayes classifier with the minimal error rate (on a test set) is the optimal one for the classification task. In general, it is unknown what is the optimal classifier type and true parameters . However, bounds on the optimal Bayes error rate have been derived. For, for example, the K-nearest neighbor classifier theoretical results are derived that bound the error rate, in relation to the optimal Bayes error rate.

In essence, building a classifier brings model selection with it. Feature selection – using only a subset of the available feature variables to predict the most likely categorical outcome – by itself entails model selection. Choosing among the extensive set of different classifiers makes model selection even more complex. A theoretical analysis of this search problem has been presented as the no free lunch theorem:[15] No particular classification algorithm is 'the best' for all problems. The pragmatic approach to this open problem is to combine prior knowledge of the classification task (e.g. distributional assumptions) with a search process where different types of classifiers are developed and their performance compared.

The search for the 'best' model that predicts observations and relations between these is a problem that was discovered already in ancient Greece. In medieval times, Occam's razor was formulated:

“plurality should not be posited without necessity”.

In this context, it means that if a simple classifier with only a few parameters (small ) does the job as well as a much more complex classification algorithm, choose the simpler one. Only to add that the performance of a classifier is but one of the criteria to apply when choosing the best classification model. Distributional assumptions, insight provided into the discovered relations between variables, whether the classification algorithm can cope with missing feature variables, whether a change in class prior probability [16] can be incorporated, speed of the training algorithm, memory requirements, parallelization of the classification process, resemblance to the human perceptual system, and other aspects as well. In visual pattern recognition invariance to variations in color, rotation and scale are extra properties that need to be accounted for.

Supervised classification

When choosing the most appropriate supervised classifier, the generally accepted heuristic is to:

  1. Separate the available data, at random, into a training set and a test set, and use the test set only for the final performance comparison of the trained classifiers.
  2. Experiment by training a number of classification algorithms, both parametric (e.g. discriminant analysis, the multinomial classifier[17]) and non-parametric (e.g. k-nearest neighbor, a support vector machine, a feed-forward neural network, a standard decision-tree algorithm).
  3. Test the distributional assumptions about the (continuous) feature distributions per category: are they Gaussian?
  4. Determine which subset of feature variables contributes most to the discriminative performance of the classifier.
  5. Decide whether elaborate confidence intervals are needed for the error rates and class predictions.[18]
  6. Weigh white-box versus black-box considerations, which may render specific classifiers unsuited for the job.
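
A minimal sketch of steps 1–3 (the dataset and the candidate classifiers are illustrative assumptions; none of these choices are mandated by the heuristic):

    # Sketch of steps 1-3 of the heuristic above, on an illustrative dataset.
    from scipy import stats
    from sklearn.datasets import load_wine
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_wine(return_X_y=True)

    # Step 1: random split; the test set is reserved for the final comparison.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    # Step 2: train parametric and non-parametric candidates.
    for clf in (LinearDiscriminantAnalysis(), KNeighborsClassifier(),
                SVC(), DecisionTreeClassifier(random_state=0)):
        clf.fit(X_train, y_train)
        print(type(clf).__name__, clf.score(X_test, y_test))

    # Step 3: test whether feature 0 is Gaussian within class 0
    # (D'Agostino-Pearson test; a small p-value suggests non-Gaussian).
    _, p = stats.normaltest(X_train[y_train == 0][:, 0])
    print("normality p-value:", p)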

==Unsupervised classification==

==References==

  1. Isabelle Guyon, André Elisseeff (2003). An Introduction to Variable and Feature Selection. The Journal of Machine Learning Research, Vol. 3, pp. 1157–1182.
  4. For linear discriminant analysis the parameter vector θ consists of the two mean vectors μ₁ and μ₂ and the common covariance matrix Σ.
  6. Richard O. Duda, Peter E. Hart, David G. Stork (2001). Pattern Classification (2nd edition). Wiley, New York. ISBN 0-471-05669-3.
  7. R. Brunelli (2009). Template Matching Techniques in Computer Vision: Theory and Practice. Wiley. ISBN 978-0-470-51706-2.
  8. The Automatic Number Plate Recognition Tutorial, http://anpr-tutorial.com/
  9. Neural Networks for Face Recognition. Companion to Chapter 4 of the textbook Machine Learning.
  10. PAPNET for Cervical Screening, http://health-asia.org/papnet-for-cervical-screening/
  13. Assuming known distributional shape of feature distributions per class, such as the Gaussian shape.
  14. No distributional assumption regarding shape of feature distributions per class.
  15. David H. Wolpert (2001). The Supervised Learning No Free Lunch Theorems. Technical report MS-269-1, NASA Ames Research Center.
  16. Relative frequency of each class in the training and test sets.
  17. Ned Glick (1973). Sample-Based Multinomial Classification. Biometrics, Vol. 29, No. 2, pp. 241–256.
  18. Geoffrey J. McLachlan (2004). Discriminant Analysis and Statistical Pattern Recognition. Wiley Series in Probability and Statistics, New Jersey. ISBN 0-471-69115-1.