Difference between revisions of "Página de pruebas"
Adelo Vieira (talk | contribs) |
Adelo Vieira (talk | contribs) |
||
| Line 72: | Line 72: | ||
<img src="https://upload.wikimedia.org/wikipedia/commons/e/e7/KnnClassification.svg" style="display: block; margin-left: auto; margin-right: auto; width: 300pt;" /> | <img src="https://upload.wikimedia.org/wikipedia/commons/e/e7/KnnClassification.svg" style="display: block; margin-left: auto; margin-right: auto; width: 300pt;" /> | ||
| − | <div style="text-align: | + | <div style="text-align: left; display:block; margin-right: auto; margin-left: auto; width:500pt">Example of k-NN classification. The test sample (green dot) should be classified either to blue squares or to red triangles. If k = 3 (solid line circle) it is assigned to the red triangles because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the blue squares (3 squares vs. 2 triangles inside the outer circle). Taken from https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm</div> |
Revision as of 19:32, 16 January 2021
K-Nearest Neighbour
- Recorded Noel class (15/06):
- StatQuest: https://www.youtube.com/watch?v=HVXime0nQeI
|
KNN classifies a new data point based on the points that are closest in distance to the new point. The principle behind KNN is to find a predefined number of training samples (K) closest in distance to the new data point. Then, the class of the new data point will be the most common class in the k nearest training samples. https://scikit-learn.org/stable/modules/neighbors.html [Adelo] In other words, KNN determines the class of a given unlabeled observation by identifying the most common class among the k-nearest labeled observations to it. This is a simple method, but extremely powerful. | |||||
|---|---|---|---|---|---|
| Regression/Classification | Applications | Strengths | Weaknesses | Comments | Improvements |
|
KNN can be used for both classification and regression predictive problems. However, it is more widely used in classification problems in the industry. https://www.analyticsvidhya.com/blog/2018/03/introduction-k-neighbours-algorithm-clustering/ |
|
|
|
k-NN is ideal for classification tasks where relationships among the attributes and target classes are:
|
|
Basic Implementation:
- Training Algorithm:
- Simply store the training examples
- Prediction Algorithm:
- Calculate the distance from the new data point to all points in the data.
- Sort the points in your data by increasing the distance from the new data point.
- Determine the most frequent class among the k nearest points</math>.