
Say we want to find the RSS of the red line. First we need the equation of this line. From precalculus we know that the equation of a line can always be determined from two points on the line. For example, the points (4.5, 2.7) and (6.5, 4.3) are on the line. Each is marked with an X in the next graph.
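To find the equation of this line we need to find its slope and its intercept, and we all remember from precalculus:
slope = (y2 - y1)/(x2 - x1) = (4.3 - 2.7)/(6.5 - 4.5) = 0.8
intercept = y1 - slope·x1 = 2.7 - 0.8·4.5 = -0.9
So the equation of the red line is y = -0.9 + 0.8x.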
Note that this was for illustration only; it is not how we calculate the line in Statistics!
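The two-point calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name line_through is our own choice and is not part of MINITAB or of the course macros.

```python
def line_through(x1, y1, x2, y2):
    """Return (intercept, slope) of the line through (x1, y1) and (x2, y2)."""
    if x1 == x2:
        raise ValueError("a vertical line has no slope-intercept form")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


# The two points on the red line given in the text.
intercept, slope = line_through(4.5, 2.7, 6.5, 4.3)
print(round(intercept, 2), round(slope, 2))  # -0.9 0.8
```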
What does our line predict for the actual observations? Let's consider observation #4 with values x4=4.89 and y4=3.34.
Now using the x value in our equation we get
ŷ4 = -0.9 + 0.8·4.89 = 3.01. This is called the fitted value (or just the fit).
Notice that now we have two y values:
• y4 = 3.34 - the actual observation
• ŷ4 = 3.01 - what our line predicted we should observe for an x value of x4=4.89.
The difference between them is
e4 = y4 - ŷ4 = 3.34 - 3.01 = 0.33.
e4 is called the residual (or error).
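The same arithmetic for observation #4 can be written as a short Python sketch. All numbers come from the text; the variable names are our own.

```python
# Coefficients of the red line and observation #4, as given in the text.
b0, b1 = -0.9, 0.8
x4, y4 = 4.89, 3.34

fitted4 = b0 + b1 * x4      # what the line predicts at x4
residual4 = y4 - fitted4    # observed value minus fitted value

print(round(fitted4, 2), round(residual4, 2))  # 3.01 0.33
```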
Of course we can do the same thing for all the other observations: find the fitted values ŷ1 to ŷ10, and then find the corresponding residuals e1 to e10.
Each of these errors shows how much the actual observed y value differs from what the model predicted. Finally we can combine all these individual errors into one overall error called Residual Sum of Squares (or RSS) as follows:
RSS = ∑ei² = 1.13
RSS is a measure of how well a line fits the points: the smaller the RSS, the better the fit.
Note that this is not the most obvious choice. If all we wanted was to get rid of the minus signs of the residuals we could just use ∑|ei|. Sometimes we actually do that, but mostly we use RSS, for some good mathematical reasons (for one, RSS is a smooth function of the intercept and the slope, so it can be minimized with calculus).
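To see how the two ways of combining residuals differ, here is a small Python sketch. The function names are our own, and the three data points are toy values made up for illustration; they are NOT the Alcohol/Tobacco data.

```python
def residuals(x, y, b0, b1):
    """Observed y minus fitted value b0 + b1*x, for every observation."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def rss(x, y, b0, b1):
    """Residual sum of squares of the line y = b0 + b1*x."""
    return sum(e ** 2 for e in residuals(x, y, b0, b1))


def sum_abs(x, y, b0, b1):
    """Sum of absolute residuals of the line y = b0 + b1*x."""
    return sum(abs(e) for e in residuals(x, y, b0, b1))


# Toy values for illustration only (hypothetical, not the course data).
x = [1.0, 2.0, 3.0]
y = [1.0, 4.0, 2.0]

# For the line y = 0 + 1*x the residuals are 0, 2 and -1.
print(rss(x, y, 0.0, 1.0))      # 5.0
print(sum_abs(x, y, 0.0, 1.0))  # 3.0
```

Squaring makes the one large residual (2) count four times as much as the small one (-1), while the absolute values count it only twice as much.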
So, how about the other two lines? Repeating the above calculations we find for the blue line RSS=1.63 and for the green line RSS=2.14. So of the three the red line is in fact the best.
But that's just three lines out of infinitely many lines. Is there a "best" line, that is, a line with an RSS smaller than that of any other line? The answer is of course yes. But how can we find it? Let's do a little trial and error. With the macro error we can pick any two points and have the RSS of the line through them calculated. Let's just play around a bit to find the lowest value:
CTRL-L %k:\3102\error 'Alcohol' 'Tobacco' 4.5 2.5 6.5 4.5
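For readers without access to the macro, the trial-and-error idea can be imitated in Python. This sketch is not the macro itself: we are assuming that its four numbers are the coordinates x1 y1 x2 y2 of the two chosen points, the function name is our own, and the data are toy values, NOT the Alcohol/Tobacco data.

```python
def rss_through_points(x, y, x1, y1, x2, y2):
    """RSS of the line through (x1, y1) and (x2, y2) for the data x, y."""
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Toy values for illustration only (hypothetical, not the course data).
x = [1.0, 2.0, 3.0]
y = [1.0, 4.0, 2.0]

# A few guesses for the two points, each given as (x1, y1, x2, y2).
candidates = [
    (1.0, 1.0, 3.0, 3.0),
    (1.0, 2.0, 3.0, 3.0),
    (1.0, 3.0, 3.0, 2.0),
]

# Keep the guess with the smallest RSS.
best = min(candidates, key=lambda pts: rss_through_points(x, y, *pts))
best_rss = rss_through_points(x, y, *best)
print(best, best_rss)  # (1.0, 2.0, 3.0, 3.0) 4.25
```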
This of course is not very "intelligent". Instead we could use calculus and find the best possible line, the one with the lowest RSS of all lines. This line is called the Least Squares Regression line. Using MINITAB it is found as follows:
1) Stat > Regression > Regression
2) Response: Tobacco
3) Predictors: Alcohol
Its equation is Tobacco = 0.11 + 0.61 Alcohol and it has an RSS of 0.97.
Using another command we can see what this line looks like:
1) Stat > Regression > Fitted Line Plot
2) Response: Tobacco
3) Predictor: Alcohol
Comments
• For any line with equation y=β0+β1x we can find the corresponding RSS using the formula
RSS(β0,β1) = ∑(y-β0-β1x)²
The least squares regression line is the one with the coefficients β0 and β1 that minimize RSS, so we have a minimization problem. The formulas for β0 and β1 are the ones you learned in ESMA3101.
For more on least squares regression see page 503 of the textbook.
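The minimization problem has a well-known closed-form solution, which is presumably what the ESMA3101 formulas refer to: β1 = ∑(x-x̄)(y-ȳ) / ∑(x-x̄)² and β0 = ȳ - β1·x̄. The Python sketch below applies these formulas. The function names are our own, this is not how MINITAB is implemented, and the data are toy values, NOT the Alcohol/Tobacco data.

```python
def least_squares(x, y):
    """Return (b0, b1) of the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def rss(x, y, b0, b1):
    """Residual sum of squares of the line y = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Toy values for illustration only (hypothetical, not the course data).
x = [1.0, 2.0, 3.0]
y = [1.0, 4.0, 2.0]

b0, b1 = least_squares(x, y)
print(round(b0, 3), round(b1, 3), round(rss(x, y, b0, b1), 3))
```

Moving either coefficient away from the values returned by least_squares should only increase the RSS, which is an easy way to convince yourself that the formulas do find the minimum.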