
• That the slope is positive is explained by the fact that golfers (even pros) have long-term ups and downs: for weeks or months they play well, and then they play not so well. So a golfer who plays badly in the first round likely also plays badly in the second, and vice versa.
• That the slope is less than one is a general phenomenon, often observed in real life, called regression to the mean. Here it works like this: at any one moment in time a golfer has a given scoring ability. It will change over time, but not from one day to the next. So if somebody shot a high score one day, they are likely not playing well overall, but they also likely had an exceptionally bad day on top of that. The next day their overall ability is still the same, but they likely won't have a bad day on top of it, so their score should go down. Likewise, somebody who shot a very low score one day is not likely to be able to do it again the next day.
This is one of the most misunderstood principles of statistics. Say a player has done very well in the first round of a major tournament, but then plays not so well in the second. Almost always commentators will say that he "felt the pressure" and so played worse. In reality it is likely just regression to the mean: on the first day he played well above his natural ability, and on the second day he came back to it.
This is a phenomenon that we find everywhere. For example, it helps doctors: who goes to a doctor? People who don't feel well. But some of them would get better anyway, doctor or no doctor, and yet they will all think it was the doctor who did it! This, by the way, is also one of the explanations for the famous placebo effect.
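The "ability plus a good or bad day" argument can be checked with a small simulation. The sketch below is only an illustration, not the analysis used for the tournament data: the mean score of 72 and the two standard deviations are made-up values, and `simulate_rounds` and `ls_slope` are hypothetical helper names. Each simulated golfer has one fixed ability for both rounds, and each round adds independent daily luck.

```python
import random

def simulate_rounds(n_golfers=2000, ability_sd=2.0, luck_sd=3.0, seed=1):
    """Return (round1, round2): a fixed ability plus independent daily luck.

    All numbers here (mean 72, the two standard deviations) are assumptions
    chosen for illustration, not estimates from the tournaments.
    """
    rng = random.Random(seed)
    round1, round2 = [], []
    for _ in range(n_golfers):
        ability = rng.gauss(72, ability_sd)             # same on both days
        round1.append(ability + rng.gauss(0, luck_sd))  # day 1 luck
        round2.append(ability + rng.gauss(0, luck_sd))  # day 2 luck, independent
    return round1, round2

def ls_slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

round1, round2 = simulate_rounds()
slope = ls_slope(round1, round2)
print(f"slope of round 2 on round 1: {slope:.2f}")

# Golfers with the worst (highest) 10% of first-round scores
cutoff = sorted(round1)[int(0.9 * len(round1))]
worst = [(a, b) for a, b in zip(round1, round2) if a >= cutoff]
mean_r1 = sum(a for a, _ in worst) / len(worst)
mean_r2 = sum(b for _, b in worst) / len(worst)
print(f"worst 10% in round 1: mean {mean_r1:.1f}, then mean {mean_r2:.1f} in round 2")
```

In this model the slope lands between 0 and 1 because only the ability part of the first-round score carries over to the second round. The golfers with the worst first rounds improve on average the next day, even though nobody's ability changed and nobody "felt the pressure".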
Note: there is something special about these datasets: x and y measure the same thing (number of strokes needed for a round). In such a case it is often a good idea to draw the scatterplot with the same x and y scale.
Note: the slope for the AT&T is really small: 0.087. Is there even a relationship between the first and second rounds? Here are the correlation coefficients:
| Tournament | r | p-value |
|---|---|---|
| Sony Open | 0.305 | 0.000 |
| AT&T Pebble Beach | 0.088 | 0.261 |
| Honda Classic | 0.259 | 0.002 |
| Byron Nelson Classic | 0.138 | 0.091 |
| Shell Houston Open | 0.294 | 0.000 |
So for the AT&T and Byron Nelson tournaments there is no statistically significant relationship between the two rounds (at these sample sizes!). Any idea why these two are different?
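The remark "at these sample sizes" matters: whether a given correlation is statistically significant depends on how many golfers there are. Here is a rough sketch of that dependence. It uses the Fisher z-transform with a normal approximation, which is not necessarily the test that produced the p-values in the table, and the sample sizes in the loop are hypothetical, since the actual field sizes are not given here. `corr_p_value` is a made-up helper name.

```python
import math
from statistics import NormalDist

def corr_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0.

    Uses the Fisher z-transform: atanh(r) * sqrt(n - 3) is roughly
    standard normal under H0. This is an approximation, not an exact test.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same correlation as the AT&T (r = 0.088), hypothetical sample sizes
for n in (50, 150, 1000):
    print(n, round(corr_p_value(0.088, n), 3))
```

With a correlation this small, the p-value stays well above 0.05 for fields of a hundred or so golfers and only drops below it once n gets very large. So "not statistically significant" here does not mean there is no relationship, only that this much data cannot tell a small one apart from none.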