Today we looked at the number of wins for each team in the 2006 and 2007 seasons. Our interest was to predict a team's number of 2007 wins given its number of 2006 wins. I gave you a simple prediction formula. We talked about predicted values and residuals, where a residual is the actual number of 2007 wins minus the predicted number. A good line is one where the sum of squared residuals is small. The least-squares line makes the sum of squared residuals as small as possible.
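Here is a small sketch of that idea in Python, in case it helps to see the arithmetic. The win totals are made-up numbers for six hypothetical teams, not the data we used in class, and the "same wins as last year" line is just one arbitrary line to compare against.

```python
import numpy as np

# Made-up win totals for six hypothetical teams (NOT the class data).
wins_2006 = np.array([61, 70, 78, 85, 93, 97], dtype=float)
wins_2007 = np.array([69, 73, 83, 80, 90, 94], dtype=float)


def sum_squared_residuals(intercept, slope):
    """Sum of squared residuals for the line: predicted = intercept + slope * wins_2006."""
    predicted = intercept + slope * wins_2006
    residuals = wins_2007 - predicted  # actual minus predicted
    return float(np.sum(residuals ** 2))


# Least-squares slope and intercept from the standard formulas.
x_dev = wins_2006 - wins_2006.mean()
y_dev = wins_2007 - wins_2007.mean()
ls_slope = float(np.sum(x_dev * y_dev) / np.sum(x_dev ** 2))
ls_intercept = float(wins_2007.mean() - ls_slope * wins_2006.mean())

ssr_ls = sum_squared_residuals(ls_intercept, ls_slope)
# A competing line: predict exactly the same number of wins as last year.
ssr_naive = sum_squared_residuals(0.0, 1.0)

print(f"least-squares line: 2007 wins = {ls_intercept:.2f} + {ls_slope:.3f} * (2006 wins)")
print(f"sum of squared residuals, least-squares line: {ssr_ls:.1f}")
print(f"sum of squared residuals, same-as-last-year line: {ssr_naive:.1f}")
```

Any other intercept and slope you try in `sum_squared_residuals` should give a value at least as large as the one for the least-squares line.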
We also illustrated regression to the mean. We plotted a team's improvement (2007 wins - 2006 wins) against its 2006 wins. We saw that crummy teams in 2006 tend to show positive improvement, and good 2006 teams tend to show negative improvement. This is good news for Indians fans. The Indians are currently having a poor season, but because of the regression effect, they will likely improve next season.
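The regression effect can be checked with a few lines of Python. This is only a sketch on made-up win totals for six hypothetical teams (not the class data): it computes each team's improvement and fits a line to improvement against 2006 wins. A negative slope is the pattern we saw in the plot.

```python
import numpy as np

# Made-up win totals for six hypothetical teams (NOT the class data).
wins_2006 = np.array([61, 70, 78, 85, 93, 97], dtype=float)
wins_2007 = np.array([69, 73, 83, 80, 90, 94], dtype=float)

# Improvement is 2007 wins minus 2006 wins.
improvement = wins_2007 - wins_2006

# Fit a least-squares line to improvement against 2006 wins.
# np.polyfit with degree 1 returns the slope first, then the intercept.
imp_slope, imp_intercept = np.polyfit(wins_2006, improvement, 1)

for w, imp in zip(wins_2006, improvement):
    print(f"2006 wins = {w:.0f}, improvement = {imp:+.0f}")
print(f"slope of improvement against 2006 wins: {imp_slope:.3f}")
```

In this toy data the teams with the fewest 2006 wins improve and the teams with the most 2006 wins decline, so the fitted slope comes out negative.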
In the computer lab, we searched for the best batting measure. We used team data to look at the relationship between runs scored per game and the batting measures AVG, OBP, SLG, and OPS. For each measure, we fit a least-squares line to predict runs per game, and the best batting measure is the one with the smallest sum of squared residuals. Using this criterion, the OPS measure is best.
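If you want to repeat this comparison outside the lab, here is one way it could be sketched in Python. The team statistics below are invented for illustration (they are not the lab data), so whichever measure wins here says nothing about real baseball. The only thing taken from the game itself is that OPS is OBP plus SLG.

```python
import numpy as np

# Invented statistics for six hypothetical teams (NOT the lab data).
runs_per_game = np.array([4.1, 4.4, 4.7, 4.9, 5.2, 5.6])
avg = np.array([0.255, 0.262, 0.270, 0.275, 0.281, 0.290])
obp = np.array([0.318, 0.325, 0.340, 0.338, 0.352, 0.366])
slg = np.array([0.390, 0.410, 0.420, 0.445, 0.450, 0.463])
ops = obp + slg  # OPS is on-base percentage plus slugging percentage

measures = {"AVG": avg, "OBP": obp, "SLG": slg, "OPS": ops}


def ssr_for(measure):
    """Fit a least-squares line of runs per game on the measure; return the sum of squared residuals."""
    slope, intercept = np.polyfit(measure, runs_per_game, 1)
    residuals = runs_per_game - (intercept + slope * measure)
    return float(np.sum(residuals ** 2))


ssr_by_measure = {name: ssr_for(values) for name, values in measures.items()}
best_measure = min(ssr_by_measure, key=ssr_by_measure.get)

for name, ssr in ssr_by_measure.items():
    print(f"{name}: sum of squared residuals = {ssr:.4f}")
print("smallest sum of squared residuals:", best_measure)
```

To use real data, replace the arrays with one value per team for the season you care about.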