|
|
| (178 intermediate revisions by the same user not shown) |
| Line 1: |
| − | ===Simple Linear Regression===
| |
| − | https://www.youtube.com/watch?v=nk2CQITm_eo&t=267s
| |
| | | | |
| − |
| |
| − | In general, linear regression has three main stages:
| |
| − |
| |
| − | : '''1'''. Using '''Least-squares''' to fit a line to the data
| |
| − |
| |
| − | : '''2'''. Calculating <math>R^2</math>
| |
| − |
| |
| − | : '''3'''. Calculating a <math>p</math>-value for <math>R^2</math>
| |
| − |
| |
| − |
| |
| − | <br />
| |
| − | : '''Using Least-squares to fit a line to the data'''
| |
| − | <blockquote>
| |
| − | [[File:Linear_regression1.png|400px|thumb|right|Taken from https://www.youtube.com/watch?v=nk2CQITm_eo&t=267s]]
| |
| − |
| |
| − | :* First, draw a line through the data.
| |
| − |
| |
| − | :* Second, calculate the '''Residual sum of squares''' (RSS): measure the distance from the line to each data point (the residual), square each distance, and add the squares up.
| |
| − | ::: The vertical distance from the line to a data point is called a '''residual'''.
| |
| − |
| |
| − | :* Then, rotate the line a little and recalculate the RSS. Repeat this for many rotations.
| |
| − | :* ...
| |
| − |
| |
| − | :* Finally, the regression line is the one whose rotation gives the smallest RSS. Its equation is:
| |
| − |
| |
| − | :: <math> y = a + bx </math>
| |
| − |
| |
| − | :: The equation has 2 parameters:
| |
| − |
| |
| − | ::* Slope: <math> b </math>
| |
| − | ::: The slope is the change in <math>y</math>, in units of <math>y</math>, for each one-unit change in <math>x</math>.
| |
| − |
| |
| − | ::* The <math>y</math>-axis intercept: <math> a </math>
| |
| − | </blockquote>
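The rotate-and-compare procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the video: the line is assumed to pivot around the mean point of the data (the least-squares line always passes through it), the "rotations" are a fixed grid of candidate slopes, and the helper names <code>rss</code> and <code>fit_by_rotation</code> as well as the sample data are made up for this example.

```python
def rss(xs, ys, a, b):
    """Residual sum of squares for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def fit_by_rotation(xs, ys, slopes):
    """Try each candidate slope and keep the line with the smallest RSS.

    Assumption: every candidate line pivots around the mean point
    (x_bar, y_bar), so each slope fixes its own intercept.
    """
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    best = None
    for b in slopes:
        a = y_bar - b * x_bar          # intercept that keeps the pivot fixed
        score = rss(xs, ys, a, b)
        if best is None or score < best[2]:
            best = (a, b, score)
    return best


# Made-up sample data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

# "Rotate the line a little bit, many times": slopes 0.00, 0.01, ..., 4.00.
candidate_slopes = [i / 100 for i in range(0, 401)]

a, b, best_rss = fit_by_rotation(xs, ys, candidate_slopes)
print(f"y = {a:.2f} + {b:.2f}x  (RSS = {best_rss:.4f})")
```

In practice the minimum is not searched for on a grid; the slope and intercept that minimise the RSS have a closed-form solution. The grid search only mirrors the intuition of the steps listed above.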
| |
| − |
| |
| − |
| |
| − | <br />
| |
| − | : '''Using Least-squares to fit a line to the data'''
| |
| − | <blockquote>
| |
| − | [[File:Linear_regression2.png|600px|thumb|center|Taken from https://www.youtube.com/watch?v=nk2CQITm_eo&t=267s]]
| |
| − | </blockquote>
| |
| − |
| |
| − |
| |
| − |
| |
| − | <br />
| |
| − | <br />
| |
| − |
| |
| − | <!-- [[File:SimpleLinearRegression.png|350px|thumb|center|]] -->
| |
| − |
| |
| − | [[File:SimpleLinearRegression2.png|600px|center|]]
| |
| − |
| |
| − |
| |
| − | '''The regression equation:'''
| |
| − | <math> y = a + bx </math>
| |
| − |
| |
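| − | * Dependent variable: <math> y </math>
| |
| − | * Independent variable: <math> x </math>
| |
| − | * Slope: <math> b = r \frac{S_y}{S_x}</math>, where <math>r</math> is the correlation coefficient between <math>x</math> and <math>y</math>, and <math>S_x</math>, <math>S_y</math> are their standard deviations.
| |
| − | : The slope is the change in <math>y</math>, in units of <math>y</math>, for each one-unit change in <math>x</math>.
| |
| − | * <math> y </math> intercept: <math> a = \bar{y} - b\bar{x} </math>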
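As a quick check of the two formulas above, here is a minimal Python sketch that computes the slope and the intercept directly from them. It assumes <math>S_x</math> and <math>S_y</math> are sample standard deviations and <math>r</math> is the Pearson correlation coefficient; the function names <code>pearson_r</code> and <code>regression_coefficients</code> and the sample data are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation coefficient, using sample (n - 1) statistics."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


def regression_coefficients(xs, ys):
    """Return (a, b) for y = a + b*x using b = r * S_y / S_x and a = y_bar - b * x_bar."""
    b = pearson_r(xs, ys) * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


# Made-up sample data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

a, b = regression_coefficients(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

Because the ratio <math>S_y / S_x</math> is the same for sample and population standard deviations, the slope does not depend on that choice as long as <math>r</math> is computed consistently.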
| |
| − |
| |
| − |
| |
| − | <br />
| |