Difference between revisions of "Página de pruebas 3"
Adelo Vieira (talk | contribs) (→Measuring Correlation - The Correlation Coefficient) |
Adelo Vieira (talk | contribs) |
||
(655 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
− | + | {{Sidebar}} | |
− | + | <html><button class="averte" onclick="aver()">aver</button></html> | |
− | + | ||
+ | <html> | ||
+ | <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script> | ||
+ | <script> | ||
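+ | function aver() { | ||
+ | // Build the edit URL and strip any "amp;" left over from HTML escaping | ||
+ | const link = "http://wiki.sinfronteras.ws/index.php?title=P%C3%A1gina_de_pruebas_3+&+action=edit"; | ||
+ | const link2 = link.replace("amp;", ""); | ||
+ | // Recolor the heading before navigating: JavaScript has no sleep(), and nothing done after the redirect affects the page being loaded | ||
+ | window.document.getElementById('firstHeading').style.color = "red"; | ||
+ | window.location = link2; | ||
+ | } | ||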
+ | $(document).ready( function() { | ||
+ | $('#totalItems, #enteredItems').keyup(function(){ | ||
+ | window.document.getElementById('firstHeading').style.color = "red" | ||
+ | }); | ||
+ | window.document.getElementById('firstHeading').style.color = "red" | ||
+ | }); | ||
+ | </script> | ||
+ | </html> | ||
+ | |||
+ | <br /> | ||
+ | ==Projects portfolio== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Data Analytics courses== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Possible sources of data== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==What is data== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Qualitative vs quantitative data=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Discrete and continuous data==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Structured vs Unstructured data=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Data Levels and Measurement=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===What is an example=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===What is a dataset=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===What is Metadata=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==What is Data Science== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Supervised Learning=== | ||
+ | |||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Unsupervised Learning=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Reinforcement Learning=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Some real-world examples of big data analysis== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Statistics== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Descriptive Data Analysis== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Central tendency=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Mean==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====When not to use the mean===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Median==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Mode==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Skewed Distributions and the Mean and Median==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Summary of when to use the mean, median and mode==== | ||
+ | measures-central-tendency-mean-mode-median-faqs.php | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Measures of Variation=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Range==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Quartile==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Box Plots==== | ||
+ | |||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Variance==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Standard Deviation==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==== Z Score ==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Shape of Distribution=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Probability distribution==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====The Normal Distribution===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Histograms==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Skewness==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Kurtosis==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Visualization of measure of variations on a Normal distribution==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Simple and Multiple regression== | ||
<br /> | <br /> | ||
===Correlation=== | ===Correlation=== | ||
− | |||
− | + | <br /> | |
+ | ====Measuring Correlation==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====Pearson correlation coefficient - Pearson's r===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====The coefficient of determination <math>R^2</math>===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Correlation <math>\neq</math> Causation==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Testing the "generalizability" of the correlation ==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Simple Linear Regression=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Multiple Linear Regression=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===RapidMiner Linear Regression examples=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==K-Nearest Neighbour== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Decision Trees== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===The algorithm=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Basic explanation of the algorithm==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Algorithms addressed in Noel's Lecture==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====The ID3 algorithm===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | =====The C5.0 algorithm===== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Example in RapidMiner=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Random Forests== | ||
+ | https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=4s | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Naive Bayes== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Probability=== | ||
+ | |||
+ | <br /> | ||
+ | ===Independent and dependent events=== | ||
− | |||
− | + | <br /> | |
− | + | ===Mutually exclusive and collectively exhaustive=== | |
− | |||
− | |||
<br /> | <br /> | ||
− | ==== | + | ===Marginal probability=== |
− | + | The marginal probability is the probability of a single event occurring, regardless of the outcome of any other event. A conditional probability, on the other hand, is the probability that an event occurs given that another specific event has already occurred. https://en.wikipedia.org/wiki/Marginal_distribution | |
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Joint Probability=== | ||
+ | |||
− | + | <br /> | |
+ | ===Conditional probability=== | ||
<br /> | <br /> | ||
− | ==== | + | ====Kolmogorov definition of Conditional probability==== |
− | + | ||
− | + | <br /> | |
+ | ====Bayes' theorem==== | ||
− | |||
− | |||
− | |||
− | + | <br /> | |
+ | =====Likelihood and Marginal Likelihood===== | ||
− | + | <br /> | |
− | + | =====Prior Probability===== | |
− | |||
− | |||
− | + | <br /> | |
− | + | =====Posterior Probability===== | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
<br /> | <br /> | ||
+ | ===Applying Bayes' Theorem=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Scenario 1 - A single feature==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Scenario 2 - Class-conditional independence==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====Scenario 3 - Laplace Estimator==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Naïve Bayes - Numeric Features=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===RapidMiner Examples=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Perceptrons - Neural Networks and Support Vector Machines== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Boosting== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Gradient boosting=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==K Means Clustering== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===Clustering class of the Noel course=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ====RapidMiner example 1==== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ==Principal Component Analysis PCA== | ||
− | |||
− | |||
+ | <br /> | ||
+ | ==Association Rules - Market Basket Analysis== | ||
− | |||
− | + | <br /> | |
+ | ===Association Rules example in RapidMiner=== | ||
− | + | <br /> | |
+ | ==Time Series Analysis== | ||
− | |||
− | |||
− | |||
− | < | + | <br /> |
+ | ==[[Text Analytics|Text Analytics / Mining]]== | ||
<br /> | <br /> | ||
+ | ==Model Evaluation== | ||
+ | |||
− | === | + | <br /> |
− | + | ===Why evaluate models=== | |
− | |||
− | + | <br /> | |
+ | ===Evaluation of regression models=== | ||
− | |||
− | + | <br /> | |
+ | ===Evaluation of classification models=== | ||
− | |||
− | |||
− | |||
− | + | <br /> | |
+ | ===References=== | ||
+ | Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159-174. DOI: 10.2307/2529310. | ||
− | |||
− | |||
− | |||
− | + | <br /> | |
+ | ==[[Python for Data Science]]== | ||
− | |||
− | |||
− | |||
− | + | <br /> | |
+ | ===[[NumPy and Pandas]]=== | ||
<br /> | <br /> | ||
+ | ===[[Data Visualization with Python]]=== | ||
− | |||
− | + | <br /> | |
+ | ===[[Text Analytics in Python]]=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===[[Dash - Plotly]]=== | ||
+ | |||
+ | |||
+ | <br /> | ||
+ | ===[[Scrapy]]=== | ||
+ | |||
+ | <br /> | ||
+ | ==[[R]]== | ||
− | |||
− | |||
− | + | <br /> | |
+ | ===[[R tutorial]]=== | ||
− | + | <br /> | |
− | + | ==[[RapidMiner]]== | |
− | |||
− | |||
− | |||
− | </ | ||
− | + | <br /> | |
+ | ==Assessments== | ||
− | |||
− | |||
− | + | <br /> | |
+ | ===Diploma in Predictive Data Analytics assessment=== | ||
<br /> | <br /> | ||
+ | ==Notes== | ||
− | |||
− | |||
− | < | + | <br /> |
− | + | ==References== | |
− | |||
− | |||
<br /> | <br /> |
Latest revision as of 21:50, 10 March 2021
Contents
- 1 Projects portfolio
- 2 Data Analytics courses
- 3 Possible sources of data
- 4 What is data
- 5 What is Data Science
- 6 Some real-world examples of big data analysis
- 7 Statistics
- 8 Descriptive Data Analysis
- 9 Simple and Multiple regression
- 10 K-Nearest Neighbour
- 11 Decision Trees
- 12 Random Forests
- 13 Naive Bayes
- 14 Perceptrons - Neural Networks and Support Vector Machines
- 15 Boosting
- 16 K Means Clustering
- 17 Principal Component Analysis PCA
- 18 Association Rules - Market Basket Analysis
- 19 Time Series Analysis
- 20 Text Analytics / Mining
- 21 Model Evaluation
- 22 Python for Data Science
- 23 R
- 24 RapidMiner
- 25 Assessments
- 26 Notes
- 27 References
Projects portfolio
Data Analytics courses
Possible sources of data
What is data
Qualitative vs quantitative data
Discrete and continuous data
Structured vs Unstructured data
Data Levels and Measurement
What is an example
What is a dataset
What is Metadata
What is Data Science
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Some real-world examples of big data analysis
Statistics
Descriptive Data Analysis
Central tendency
Mean
When not to use the mean
Median
Mode
Skewed Distributions and the Mean and Median
Summary of when to use the mean, median and mode
measures-central-tendency-mean-mode-median-faqs.php
Measures of Variation
Range
Quartile
Box Plots
Variance
Standard Deviation
Z Score
Shape of Distribution
Probability distribution
The Normal Distribution
Histograms
Skewness
Kurtosis
Visualization of measure of variations on a Normal distribution
Simple and Multiple regression
Correlation
Measuring Correlation
Pearson correlation coefficient - Pearson's r
The coefficient of determination R²
Correlation ≠ Causation
Testing the "generalizability" of the correlation
Simple Linear Regression
Multiple Linear Regression
RapidMiner Linear Regression examples
K-Nearest Neighbour
Decision Trees
The algorithm
Basic explanation of the algorithm
Algorithms addressed in Noel's Lecture
The ID3 algorithm
The C5.0 algorithm
Example in RapidMiner
Random Forests
https://www.youtube.com/watch?v=J4Wdy0Wc_xQ&t=4s
Naive Bayes
Probability
Independent and dependent events
Mutually exclusive and collectively exhaustive
Marginal probability
The marginal probability is the probability of a single event occurring, regardless of the outcome of any other event. A conditional probability, on the other hand, is the probability that an event occurs given that another specific event has already occurred. https://en.wikipedia.org/wiki/Marginal_distribution
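The distinction can be illustrated with a small Python sketch. The joint table, its event names (weather and traffic), and the helper functions below are hypothetical and chosen only for illustration; they do not come from the course material. The marginal is obtained by summing the joint probabilities over the other variable, and the conditional by dividing a joint probability by that marginal.

```python
# Hypothetical joint probabilities P(weather, traffic); the numbers are illustrative only.
joint = {
    ("rain", "jam"): 0.25,
    ("rain", "clear"): 0.125,
    ("dry", "jam"): 0.125,
    ("dry", "clear"): 0.5,
}


def marginal_weather(joint, weather):
    """P(weather): sum the joint probabilities over every traffic outcome."""
    return sum(p for (w, _), p in joint.items() if w == weather)


def conditional_jam_given(joint, weather):
    """P(jam | weather) = P(weather and jam) / P(weather)."""
    return joint[(weather, "jam")] / marginal_weather(joint, weather)


p_rain = marginal_weather(joint, "rain")               # marginal probability of rain
p_jam_given_rain = conditional_jam_given(joint, "rain")  # conditional probability of a jam given rain
print(p_rain, p_jam_given_rain)
```

In this made-up table the marginal probability of rain ignores traffic entirely, while the conditional probability of a jam is computed only within the cases where it rains.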
Joint Probability
Conditional probability
Kolmogorov definition of Conditional probability
Bayes' theorem
Likelihood and Marginal Likelihood
Prior Probability
Posterior Probability
Applying Bayes' Theorem
Scenario 1 - A single feature
Scenario 2 - Class-conditional independence
Scenario 3 - Laplace Estimator
Naïve Bayes - Numeric Features
RapidMiner Examples
Perceptrons - Neural Networks and Support Vector Machines
Boosting
Gradient boosting
K Means Clustering
Clustering class of the Noel course
RapidMiner example 1
Principal Component Analysis PCA
Association Rules - Market Basket Analysis
Association Rules example in RapidMiner
Time Series Analysis
Text Analytics / Mining
Model Evaluation
Why evaluate models
Evaluation of regression models
Evaluation of classification models
References
Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977 Mar;33(1):159-174. DOI: 10.2307/2529310.
Python for Data Science
R
RapidMiner
Assessments
Diploma in Predictive Data Analytics assessment
Notes
References