Wednesday, July 2, 2008

Pete Rose and the Big Red Machine

In today's class, we honored Pete Rose and one of the great Reds teams of the past. Here's what we did:

1. As a warmup, we looked at histograms of the OBPs and walk counts for the regular hitters of the 2007 baseball season. We saw that the OBPs are roughly symmetric, while the walk counts are right-skewed. (This is a general pattern: counts of things like walks, strikeouts, and home runs tend to be right-skewed, while derived stats like AVG, OBP, and OPS tend to be symmetric.)

2. Pete Rose was quite a player. He played for 24 seasons and collected a ton of hits, although most of them were singles. Your prof has a soft spot for Rose since Rose was a member of the 1980 Phillies, but he did gamble on baseball, which has kept him out of baseball's Hall of Fame.

3. We computed five-number summaries for Rose's season hit counts and his RBI counts. Rose's median season hit count was remarkably high -- many other players would be happy to have a single season with Rose's median number of hits. In contrast, Rose was not a high RBI man, but that may be because he batted first or second in the batting order.

4. We looked at the ages of the 1975 Reds, one of the greatest teams in history. Rose was an old guy (34) on this team, but he would still play 12 more seasons of baseball. The mean age of the 1975 Reds hitters was 27.5, pretty young. We used this example to illustrate the idea of a deviation (how far a value is from the mean) and a typical (or standard) deviation.

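5. When data is bell-shaped (like OBPs or AVGs), we can use the mean and standard deviation to construct intervals that contain about 68% of the data (within one standard deviation of the mean) or about 95% of the data (within two standard deviations of the mean).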
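
If you want to try items 3 through 5 at home, here is a rough Python sketch of the calculations. The ages in it are made up for illustration (they are not the actual 1975 Reds roster), the function names are my own, and the quartiles use one common convention (Python's "inclusive" method), which may differ slightly from the rule we used in class.

```python
import statistics

# Made-up ages for illustration only; NOT the actual 1975 Reds roster.
made_up_ages = [23, 25, 26, 27, 27, 28, 30, 34]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Quartiles use the 'inclusive' method; other textbooks use
    slightly different rules, so Q1 and Q3 may differ a little.
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def deviations(values):
    """Return each value's deviation from the mean (value - mean)."""
    mean = statistics.mean(values)
    return [x - mean for x in values]


def sd_interval(values, k):
    """Return (mean - k*SD, mean + k*SD) using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return mean - k * sd, mean + k * sd


def fraction_inside(values, interval):
    """Return the fraction of values that fall inside the interval."""
    low, high = interval
    return sum(low <= x <= high for x in values) / len(values)


summary = five_number_summary(made_up_ages)
devs = deviations(made_up_ages)
within_one_sd = fraction_inside(made_up_ages, sd_interval(made_up_ages, 1))
within_two_sd = fraction_inside(made_up_ages, sd_interval(made_up_ages, 2))

print("five-number summary:", summary)
print("deviations from the mean:", devs)
print("fraction within 1 SD:", within_one_sd)
print("fraction within 2 SDs:", within_two_sd)
```

With a small, somewhat skewed batch like this one, the fractions will not land exactly on 68% and 95%. The rule is an approximation that works best for larger, bell-shaped datasets.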

Have a great 4th of July weekend. Next week we'll start looking at comparisons between datasets. Is Bonds better than Ruth? Is the AL better than the NL?
