Difference between revisions of "Página de pruebas 3"

From Sinfronteras
 
(581 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
{{Sidebar}}
  
 +
<html><button class="averte" onclick="aver()">aver</button></html>
  
 +
<html>
 +
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
 +
<script>
 +
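function aver() {
 +
  // Build the edit URL; replace() strips a stray "amp;" fragment (e.g. one left by HTML escaping of "&").
 +
  const link = "http://wiki.sinfronteras.ws/index.php?title=P%C3%A1gina_de_pruebas_3+&+action=edit";
 +
  const link2 = link.replace("amp;", "");
 +
  window.location = link2;
 +
  // JavaScript has no built-in sleep(); setTimeout schedules the color change instead.
 +
  // Note: navigating away normally unloads this page before the callback runs.
 +
  setTimeout(function () {
 +
    window.document.getElementById('firstHeading').style.color = "red";
 +
  }, 2000);
 +
}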
 +
$(document).ready( function() {
 +
    $('#totalItems, #enteredItems').keyup(function(){
 +
        window.document.getElementById('firstHeading').style.color = "red"
 +
    }); 
 +
    window.document.getElementById('firstHeading').style.color = "red"
 +
});
 +
</script>
 +
</html>
  
{| class="wikitable"
+
<br />
! rowspan="2" |
+
==Projects portfolio==
! rowspan="2" |
+
 
! rowspan="2" style="width:80px; background-color:#E6B0AA" |Values have a meaningful order
+
 
! rowspan="2" style="width:80px; background-color:#A9DFBF" |Distance between values is defined
+
<br />
! colspan="3" style="width:80px; background-color:#FDEBD0" |'''Mathematical operations make sense'''
+
==Data Analytics courses==
(Values can be used to perform '''mathematical operations''')
+
 
! rowspan="2" style="width:80px; background-color:#AED6F1" |There is a meaningful zero-point
+
 
! colspan="5" style="width:80px; background-color:#D7BDE2" |Values can be used to perform statistical computations
+
<br />
! rowspan="2" |Example
+
==Possible sources of data==
|-
+
 
! style="width:80px; background-color:#FDEBD0" | '''Comparison operators'''
+
 
! style="width:80px; background-color:#FDEBD0" | Addition and subtraction
+
<br />
! style="width:80px; background-color:#FDEBD0" | Multiplication and division
+
==What is data==
! style="width:80px; background-color:#D7BDE2" | "Counts", aka "Frequency of Distribution"
+
 
! style="width:80px; background-color:#D7BDE2" | Mode
+
 
! style="width:80px; background-color:#D7BDE2" | Median
+
<br />
! style="width:80px; background-color:#D7BDE2" | Mean
+
===Qualitative vs quantitative data===
! style="width:80px; background-color:#D7BDE2" | Standard deviation
+
 
|-
+
 
!'''Nominal'''
+
<br />
|Values serve only as labels
+
====Discrete and continuous data====
| colspan="11" style="margin: 0; padding: 0;" |
+
 
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0; padding: 0;"
+
 
|- style="vertical-align:middle;"
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
===Structured vs Unstructured data===
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
===Data Levels and Measurement===
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
===What is an example===
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px; vertical-align:top; padding-top:30px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
|- style="vertical-align:top;"
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
===What is a dataset===
Values don't have any meaningful order
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
No distance between values is defined
+
<br />
| colspan="3" style="height:100px; text-align:left; width:80px;" |
+
===What is Metadata===
Values don't carry any mathematical meaning
+
 
|
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
==What is Data Science==
| colspan="3" style="height:100px; text-align:left; width:80px;" |
+
 
Values cannot be used to perform many statistical computations, such as mean and standard deviation
+
 
|}
+
<br />
|For an attribute '''"outlook"''' from weather data, potential values could be "sunny", "overcast", and "rainy".
+
===Supervised Learning===
|-
+
 
!'''Ordinal'''
+
 
|The distinction between nominal and ordinal is not always clear (e.g., the attribute "outlook")
+
 
| colspan="11" style="margin: 0; padding: 0;" |
+
<br />
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0; padding: 0;"
+
===Unsupervised Learning===
|- style="vertical-align:middle;"
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
===Reinforcement Learning===
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
==Some real-world examples of big data analysis==
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px; vertical-align:top; padding-top:30px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
==Statistics==
|- style="vertical-align:top;"
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
Values have a meaningful order
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
==Descriptive Data Analysis==
No distance between values is defined
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
Comparison operators make sense
+
<br />
| colspan="2" |Mathematical operations such as addition, subtraction, multiplication, etc. do not make sense
+
===Central tendency===
|
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
====Mean====
|
+
 
|
+
 
|}
+
<br />
|An attribute '''"temperature"''' in weather data with potential values of: "hot" > "warm" > "cool"
+
=====When not to use the mean=====
|-
+
 
!'''Interval'''
+
 
|
+
<br />
| colspan="11" style="margin: 0; padding: 0;" |
+
====Median====
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0; padding: 0;"
+
 
|- style="vertical-align:middle;"
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
====Mode====
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: red;  font-size: 15pt; text-align: center;"><div style="text-align: center;"><span style="color: red; font-size: 15pt;  text-align: center;">✘</span></div></span></div>
+
====Skewed Distributions and the Mean and Median====
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
====Summary of when to use the mean, median and mode====
| style="height:100px; text-align:center; width:80px; vertical-align:top; padding-top:30px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
measures-central-tendency-mean-mode-median-faqs.php
|- style="vertical-align:top;"
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
Distance between values is defined. In other words, we can quantify the difference between any two values.
+
===Measures of Variation===
| style="height:100px; text-align:left; width:80px;" |
+
 
Comparison operators make sense
+
 
|Addition and subtraction make sense
+
<br />
|Multiplication and division do not make sense
+
====Range====
|Interval variables often do not have a meaningful zero-point.
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
====Quartile====
|
+
 
|(not sure)
+
 
|}
+
<br />
|An example of an interval variable would be '''temperature.'''  We can correctly assume that the difference between 70 and 80 degrees is the same as the difference between 80 and 90 degrees.  However, the mathematical operations of multiplication and division do not apply to interval variables.  For instance, we cannot accurately say that 100 degrees is twice as hot as 50 degrees. Additionally, interval variables often do not have a meaningful zero-point.  For example, a temperature of zero degrees (on Celsius and Fahrenheit scales) does not mean a complete absence of heat.
+
====Box Plots====
|-
+
 
!'''Ratio'''
+
 
|
+
 
| colspan="11" style="margin: 0; padding: 0;" |
+
<br />
{| class="mw-collapsible mw-collapsed wikitable" style="margin: 0; padding: 0;"
+
====Variance====
|- style="vertical-align:middle;"
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
====Standard Deviation====
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
====Z Score====
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
 
| style="height:100px; text-align:center; width:80px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
<br />
| style="height:100px; text-align:center; width:80px; vertical-align:top; padding-top:30px;" |<div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;"><div style="text-align: center;"><span style="color: blue; font-size: 20pt; text-align: center;">✔</span></div></span></div>
+
===Shape of Distribution===
|- style="vertical-align:top;"
+
 
| style="height:100px; text-align:left; width:80px;" |
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
| colspan="3" style="height:100px; text-align:left; width:80px;" |
+
====Probability distribution====
All arithmetic operations are possible on a ratio variable
+
 
|Ratio variables have a meaningful zero-point
+
 
| style="height:100px; text-align:left; width:80px;" |
+
<br />
| style="height:100px; text-align:left; width:80px;" |
+
=====The Normal Distribution=====
| style="height:100px; text-align:left; width:80px;" |
+
 
|
+
 
|
+
<br />
|}
+
====Histograms====
|
+
 
|}
+
 
 +
<br />
 +
====Skewness====
 +
 
 +
 
 +
<br />
 +
====Kurtosis====
 +
 
 +
 
 +
<br />
 +
====Visualization of measure of variations on a Normal distribution====
 +
 
 +
 
 +
<br />
 +
==Simple and Multiple regression==
 +
 
 +
 
 +
<br />
 +
===Correlation===
 +
 
 +
 
 +
<br />
 +
====Measuring Correlation====
 +
 
 +
 
 +
<br />
 +
=====Pearson correlation coefficient - Pearson's r=====
 +
 
 +
 
 +
<br />
 +
=====The coefficient of determination <math>R^2</math>=====
 +
 
 +
 
 +
<br />
 +
====Correlation <math>\neq</math> Causation====
 +
 
 +
 
 +
<br />
 +
====Testing the "generalizability" of the correlation====
 +
 
 +
 
 +
<br />
 +
===Simple Linear Regression===
 +
 
 +
 
 +
<br />
 +
===Multiple Linear Regression===
 +
 
 +
 
 +
<br />
 +
===RapidMiner Linear Regression examples===
 +
 
 +
 
 +
<br />
 +
==K-Nearest Neighbour==
 +
 
 +
 
 +
<br />
 +
==Decision Trees==
 +
 
 +
 
 +
<br />
 +
===The algorithm===
 +
 
 +
 
 +
<br />
 +
====Basic explanation of the algorithm====
 +
 
 +
 
 +
<br />
 +
====Algorithms addressed in Noel's Lecture====
 +
 
 +
 
 +
<br />
 +
=====The ID3 algorithm=====
 +
 
 +
 
 +
<br />
 +
=====The C5.0 algorithm=====
 +
 
 +
 
 +
<br />
 +
===Example in RapidMiner===
 +
 
 +
 
 +
<br />
 +
==Random Forests==
 +
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=4s
 +
 
 +
 
 +
<br />
 +
==Naive Bayes==
 +
 
 +
 
 +
<br />
 +
===Probability===
 +
 
 +
 
 +
<br />
 +
===Independent and dependent events===
 +
 
 +
 
 +
<br />
 +
===Mutually exclusive and collectively exhaustive===
 +
 
 +
 
 +
<br />
 +
===Marginal probability===
 +
The marginal probability is the probability of a single event occurring, independent of other events. A conditional probability, on the other hand, is the probability that an event occurs given that another specific event has already occurred. https://en.wikipedia.org/wiki/Marginal_distribution
 +
 
 +
 
 +
<br />
 +
===Joint Probability===
 +
 
 +
 
 +
<br />
 +
===Conditional probability===
 +
 
 +
 
 +
<br />
 +
====Kolmogorov definition of Conditional probability====
 +
 
 +
 
 +
<br />
 +
====Bayes' theorem====
 +
 
 +
 
 +
<br />
 +
=====Likelihood and Marginal Likelihood=====
 +
 
 +
 
 +
<br />
 +
=====Prior Probability=====
 +
 
 +
 
 +
<br />
 +
=====Posterior Probability=====
 +
 
 +
 
 +
<br />
 +
===Applying Bayes' Theorem===
 +
 
 +
 
 +
<br />
 +
====Scenario 1 - A single feature====
 +
 
 +
 
 +
<br />
 +
====Scenario 2 - Class-conditional independence====
 +
 
 +
 
 +
<br />
 +
====Scenario 3 - Laplace Estimator====
 +
 
 +
 
 +
<br />
 +
===Naïve Bayes - Numeric Features===
 +
 
 +
 
 +
<br />
 +
===RapidMiner Examples===
 +
 
 +
 
 +
<br />
 +
==Perceptrons - Neural Networks and Support Vector Machines==
 +
 
 +
 
 +
<br />
 +
==Boosting==
 +
 
 +
 
 +
<br />
 +
===Gradient boosting===
 +
 
 +
 
 +
<br />
 +
==K Means Clustering==
 +
 
 +
 
 +
<br />
 +
===Clustering class of the Noel course===
 +
 
 +
 
 +
<br />
 +
====RapidMiner example 1====
 +
 
 +
 
 +
<br />
 +
==Principal Component Analysis PCA==
 +
 
 +
 
 +
<br />
 +
==Association Rules - Market Basket Analysis==
 +
 
 +
 
 +
<br />
 +
===Association Rules example in RapidMiner===
 +
 
 +
 
 +
<br />
 +
==Time Series Analysis==
 +
 
 +
 
 +
<br />
 +
==[[Text Analytics|Text Analytics / Mining]]==
 +
 
 +
 
 +
<br />
 +
==Model Evaluation==
 +
 
 +
 
 +
<br />
 +
===Why evaluate models===
 +
 
 +
 
 +
<br />
 +
===Evaluation of regression models===
 +
 
 +
 
 +
<br />
 +
===Evaluation of classification models===
 +
 
 +
 
 +
<br />
 +
===References===
 +
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159-174. DOI: 10.2307/2529310.
 +
 
 +
 
 +
<br />
 +
==[[Python for Data Science]]==
 +
 
 +
 
 +
<br />
 +
===[[NumPy and Pandas]]===
 +
 
 +
 
 +
<br />
 +
===[[Data Visualization with Python]]===
 +
 
 +
 
 +
<br />
 +
===[[Text Analytics in Python]]===
 +
 
 +
 
 +
<br />
 +
===[[Dash - Plotly]]===
 +
 
 +
 
 +
<br />
 +
===[[Scrapy]]===
 +
 
 +
 
 +
<br />
 +
==[[R]]==
 +
 
 +
 
 +
<br />
 +
===[[R tutorial]]===
 +
 
 +
 
 +
<br />
 +
==[[RapidMiner]]==
 +
 
 +
 
 +
<br />
 +
==Assessments==
 +
 
 +
 
 +
<br />
 +
===Diploma in Predictive Data Analytics assessment===
 +
 
 +
 
 +
<br />
 +
==Notes==
 +
 
 +
 
 +
<br />
 +
==References==
 +
 
 +
 
 +
<br />

Latest revision as of 21:50, 10 March 2021



aver


Contents

Projects portfolio


Data Analytics courses


Possible sources of data


What is data


Qualitative vs quantitative data


Discrete and continuous data


Structured vs Unstructured data


Data Levels and Measurement


What is an example


What is a dataset


What is Metadata


What is Data Science


Supervised Learning


Unsupervised Learning


Reinforcement Learning


Some real-world examples of big data analysis


Statistics


Descriptive Data Analysis


Central tendency


Mean


When not to use the mean


Median


Mode


Skewed Distributions and the Mean and Median


Summary of when to use the mean, median and mode

measures-central-tendency-mean-mode-median-faqs.php



Measures of Variation


Range


Quartile


Box Plots


Variance


Standard Deviation


Z Score


Shape of Distribution


Probability distribution


The Normal Distribution


Histograms


Skewness


Kurtosis


Visualization of measure of variations on a Normal distribution


Simple and Multiple regression


Correlation


Measuring Correlation


Pearson correlation coefficient - Pearson's r


The coefficient of determination R²


Correlation ≠ Causation


Testing the "generalizability" of the correlation


Simple Linear Regression


Multiple Linear Regression


RapidMiner Linear Regression examples


K-Nearest Neighbour


Decision Trees


The algorithm


Basic explanation of the algorithm


Algorithms addressed in Noel's Lecture


The ID3 algorithm


The C5.0 algorithm


Example in RapidMiner


Random Forests

https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=4s



Naive Bayes


Probability


Independent and dependent events


Mutually exclusive and collectively exhaustive


Marginal probability

The marginal probability is the probability of a single event occurring, independent of other events. A conditional probability, on the other hand, is the probability that an event occurs given that another specific event has already occurred. https://en.wikipedia.org/wiki/Marginal_distribution
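
The difference between a marginal and a conditional probability can be sketched with a small table of joint counts. The example below is only an illustration: the (outlook, play) counts are made-up numbers, and the helper names `marginal_outlook`, `marginal_play` and `conditional_play_given_outlook` are hypothetical, not part of any library.

```python
from fractions import Fraction

# Hypothetical joint counts of (outlook, play) observations.
# These numbers are invented for illustration only.
counts = {
    ("sunny", "yes"): 2,
    ("sunny", "no"): 3,
    ("overcast", "yes"): 4,
    ("overcast", "no"): 0,
    ("rainy", "yes"): 3,
    ("rainy", "no"): 2,
}
total = sum(counts.values())


def marginal_outlook(outlook):
    """P(outlook): sum the joint counts over every value of play."""
    matching = sum(n for (o, _), n in counts.items() if o == outlook)
    return Fraction(matching, total)


def marginal_play(play):
    """P(play): sum the joint counts over every value of outlook."""
    matching = sum(n for (_, p), n in counts.items() if p == play)
    return Fraction(matching, total)


def conditional_play_given_outlook(play, outlook):
    """P(play | outlook) = P(play, outlook) / P(outlook)."""
    joint = Fraction(counts[(outlook, play)], total)
    return joint / marginal_outlook(outlook)


print(marginal_play("yes"))                            # 9/14
print(conditional_play_given_outlook("yes", "sunny"))  # 2/5
```

With these assumed counts, the marginal probability of "yes" ignores the outlook entirely, while the conditional probability restricts attention to the sunny observations, which is why the two values differ.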



Joint Probability


Conditional probability


Kolmogorov definition of Conditional probability


Bayes' theorem


Likelihood and Marginal Likelihood


Prior Probability


Posterior Probability


Applying Bayes' Theorem


Scenario 1 - A single feature


Scenario 2 - Class-conditional independence


Scenario 3 - Laplace Estimator


Naïve Bayes - Numeric Features


RapidMiner Examples


Perceptrons - Neural Networks and Support Vector Machines


Boosting


Gradient boosting


K Means Clustering


Clustering class of the Noel course


RapidMiner example 1


Principal Component Analysis PCA


Association Rules - Market Basket Analysis


Association Rules example in RapidMiner


Time Series Analysis


Text Analytics / Mining


Model Evaluation


Why evaluate models


Evaluation of regression models


Evaluation of classification models


References

Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159-174. DOI: 10.2307/2529310.



Python for Data Science


NumPy and Pandas


Data Visualization with Python


Text Analytics in Python


Dash - Plotly


Scrapy


R


R tutorial


RapidMiner


Assessments


Diploma in Predictive Data Analytics assessment


Notes


References