Difference between revisions of "Página de pruebas"
Adelo Vieira (talk | contribs)
==K-Nearest Neighbour==

* 15/06: Recorded class - K-Nearest Neighbour
:* https://drive.google.com/drive/folders/1BaordCV9vw-gxLdJBMbWioX2NW7Ty9Lm


* StatQuest: https://www.youtube.com/watch?v=HVXime0nQeI


<br />
KNN determines the class of a given unlabeled observation by identifying the k nearest labeled observations to it. In other words, the algorithm assigns an unlabeled observation to the class with the most similar labeled instances. It is a simple but very powerful method.

[[File:KNearest_Neighbors_from_the_Udemy_course_Pierian_data1.mp4|800px|thumb|center|Udemy course, Pierian data https://www.udemy.com/course/python-for-data-science-and-machine-learning-bootcamp/]]

k-NN is ideal for classification tasks where relationships among the attributes and target classes are:
* numerous
* complex
* difficult to interpret, and
* where instances of a class are fairly homogeneous

<br />
'''Applications of this learning method include:'''
* Computer vision applications:
:* Optical character recognition
:* Face recognition
* Recommendation systems
* Pattern detection in genetic data

<br />
'''Basic Implementation:'''

* Training Algorithm:
:* Simply store the training examples

* Prediction Algorithm:
:# Calculate the distance from x to all points in your data (Udemy Course)
:# Sort the points in your data by increasing distance from x (Udemy Course)
:# Predict the majority label of the "k" closest points (Udemy Course)

:* Find the <math>k</math> training examples <math>(x_{1},y_{1}),\dots,(x_{k},y_{k})</math> that are '''nearest''' to the test example <math>x</math> (Noel)
:* Predict the most frequent class among those <math>y_{i}</math>'s (Noel)

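The three prediction steps can be sketched as a short Python function (a minimal sketch; `knn_predict`, `train`, and `query` are illustrative names, not from the course):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training examples.

    `train` is a list of (features, label) pairs.
    """
    # 1. Calculate the distance from the query point to all labeled points
    distances = [(math.dist(x, query), y) for x, y in train]
    # 2. Sort the points by increasing distance from the query
    distances.sort(key=lambda pair: pair[0])
    # 3. Predict the majority label of the k closest points
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]
```

For example, with `train = [((1, 1), "A"), ((1, 2), "A"), ((5, 5), "B"), ((6, 5), "B")]`, `knn_predict(train, (1.5, 1.5))` returns `"A"`, since two of the three nearest neighbours carry that label.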

* '''Improvements:'''
:* Weighting training examples based on their distance
:* Alternative measures of "nearness"
:* Finding "close" examples in a large training set quickly

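A minimal sketch of the first two improvements, assuming a 1/distance vote weight and a Minkowski distance parameter (the names and the weighting scheme are illustrative choices, not prescribed by the course):

```python
from collections import defaultdict

def knn_predict_weighted(train, query, k=3, p=2):
    """k-NN with distance-weighted votes and a configurable Minkowski
    distance: p=2 is Euclidean, p=1 is Manhattan (an alternative
    measure of "nearness")."""
    def minkowski(a, b):
        return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

    # Keep the k nearest neighbours, as in the basic algorithm
    nearest = sorted(((minkowski(x, query), y) for x, y in train),
                     key=lambda pair: pair[0])[:k]
    # Closer neighbours get a larger vote: weight = 1 / (distance + eps)
    votes = defaultdict(float)
    for dist, label in nearest:
        votes[label] += 1.0 / (dist + 1e-9)
    return max(votes, key=votes.get)
```

With plain majority voting and k = 3, a single nearby "A" can be outvoted by two distant "B"s; the distance weights let the close neighbour dominate instead.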
'''Strengths and Weaknesses:'''
{| class="wikitable"
!Strengths
!Weaknesses
|-
|The algorithm is simple and effective
|The method does not produce a model, which limits the insight it can give into how the features relate to the class
|-
|Fast training phase
|Slow classification phase, and requires a lot of memory
|-
|Capable of reflecting complex relationships
|Cannot handle nominal features or missing data without additional pre-processing
|-
|Unlike many other methods, it makes no assumptions about the distribution of the data
|
|}

* Classifying a new example:

<br />
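A worked illustration of classifying a new example with k = 3; the sweetness/crunchiness dataset below is made up for demonstration:

```python
from collections import Counter

# Toy labeled data: (sweetness, crunchiness) -> food type.  Made-up values.
training = [
    ((10, 9), "fruit"), ((10, 1), "fruit"),
    ((7, 10), "vegetable"), ((3, 10), "vegetable"),
    ((1, 4), "protein"), ((1, 1), "protein"),
]

new_example = (9, 5)   # the unlabeled observation to classify
k = 3

# Rank training points by (squared) Euclidean distance to the new example;
# squared distance preserves the ordering, so the square root can be skipped
by_distance = sorted(
    training,
    key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], new_example)),
)
k_labels = [label for _, label in by_distance[:k]]
prediction = Counter(k_labels).most_common(1)[0][0]   # -> "fruit"
```

Two of the three nearest neighbours are labeled <code>fruit</code>, so the majority vote assigns that class to the new example.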
Revision as of 00:39, 16 January 2021