Difference between revisions of "Página de pruebas"

From Sinfronteras
  
  
{| class="wikitable"
|+
! colspan="6" |KNN is a model that classifies a new data point based on the points that are closest in distance to the new point. The principle behind nearest neighbor methods is to find a predefined number of training samples (''K'') closest in distance to the new data point. Then, the class of the new data point will be the most common class among the ''K'' training samples. <nowiki>https://scikit-learn.org/stable/modules/neighbors.html</nowiki> [Adelo]

In other words, KNN determines the class of a given unlabeled observation by identifying the most common class among the k nearest labeled observations to it.

This is a simple method, but extremely powerful.
|-
!Regression/Classification
!Applications
!Strengths
!Weaknesses
!Comments
!Improvements
|-
|KNN can be used for both classification and regression predictive problems. However, it is more widely used for classification problems in industry. <nowiki>https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/</nowiki>
|Applications of this learning method include:
* Computer vision applications:
:* Optical character recognition
:* Face recognition
* Recommendation systems
* Pattern detection in genetic data
|
|
|k-NN is ideal for classification tasks where the relationships among the attributes and the target classes are:
* numerous
* complex
* difficult to interpret, and
* where instances of a class are fairly homogeneous
|
:* Weighting training examples based on their distance
:* Alternative measures of "nearness"
:* Finding "close" examples in a large training set quickly
|}
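As a concrete illustration of the description above, the scikit-learn library cited in the table provides this model as <code>KNeighborsClassifier</code>. A minimal sketch, with made-up data (two features, two classes):

```python
# Minimal k-NN classification sketch with scikit-learn.
# The training data here is invented purely for illustration.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [4.8, 5.1]]  # training points
y_train = ["A", "A", "B", "B"]                               # their classes

knn = KNeighborsClassifier(n_neighbors=3)  # K = 3
knn.fit(X_train, y_train)                  # "training" just stores the data

# A new point near the "A" cluster is assigned the most common class
# among its 3 nearest training samples (two "A"s, one "B" -> "A").
print(knn.predict([[1.1, 1.0]]))  # -> ['A']
```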
  
  
 
 
 
'''Strengths and Weaknesses:'''
 
{| class="wikitable"
 
|+
 
!Strengths
 
!Weaknesses
 
|-
 
|The algorithm is simple and effective
 
|The method does not produce a model, which limits the ability to understand how the features are related to the class
 
|-
 
|Fast training phase
 
|Slow classification phase. Requires lots of memory
 
|-
 
|Capable of reflecting complex relationships
 
|Cannot handle nominal features or missing data without additional pre-processing
 
|-
 
|Unlike many other methods, no assumptions about the distribution of the data are made
 
|
 
|}
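One of the improvements listed earlier, weighting training examples based on their distance, can be sketched in plain Python. The toy data and the inverse-distance weighting scheme are illustrative assumptions, not part of the original notes:

```python
# Sketch of distance-weighted k-NN: closer training examples count
# more in the vote. Training data is invented for illustration.
import math
from collections import defaultdict

train = [((1.0, 1.0), "A"), ((1.5, 1.2), "A"),
         ((4.0, 4.0), "B"), ((4.2, 3.9), "B")]

def weighted_knn(x, train, k=3):
    # Keep the k training examples nearest to x.
    nearest = sorted(train, key=lambda p: math.dist(x, p[0]))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = math.dist(x, point)
        votes[label] += 1.0 / (d + 1e-9)  # inverse-distance weight
    # Return the class with the largest total weight.
    return max(votes, key=votes.get)

print(weighted_knn((1.2, 1.1), train))  # -> A
```

With plain majority voting the lone "B" neighbour would still be outvoted here, but weighting matters when the k nearest neighbours are split evenly and some are much closer than others.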
 
  
  

Revision as of 17:53, 16 January 2021

K-Nearest Neighbour

* Recorded Noel class (15/06):





'''Basic Implementation:'''

* '''Training Algorithm:'''
:* Simply store the training examples

* '''Prediction Algorithm:'''
:# Calculate the distance from <math>x</math> to all points in your data (Udemy Course)
:# Sort the points in your data by increasing distance from <math>x</math> (Udemy Course)
:# Predict the majority label of the <math>k</math> closest points (Udemy Course)
:* Find the <math>k</math> training examples <math>(x_{1},y_{1}),...(x_{k},y_{k})</math> that are '''nearest''' to the test example <math>x</math> (Noel)
:* Predict the most frequent class among those <math>y_{i}'s</math>. (Noel)
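The three prediction steps above can be sketched directly in plain Python (the toy data is invented for illustration):

```python
# k-NN prediction from scratch, following the steps above:
#   1. compute distances, 2. sort by distance, 3. majority vote.
import math
from collections import Counter

def knn_predict(x, examples, k):
    """examples is a list of (point, label) pairs; x is the point to classify."""
    # 1. Calculate the distance from x to all points in the data.
    distances = [(math.dist(x, point), label) for point, label in examples]
    # 2. Sort the points by increasing distance from x.
    distances.sort(key=lambda pair: pair[0])
    # 3. Predict the majority label of the k closest points.
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]

examples = [((0, 0), "A"), ((0, 1), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(knn_predict((1, 1), examples, k=3))  # -> A
```

Note that, as the table above says, all the work happens at prediction time: "training" stores the examples and nothing else, which is why the classification phase is the slow one.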