Baseball Prospectus - Small sample size (July 30, 2003)
An entire paragraph devoted to explaining the limitations of the data presented. Excellent!
The concept of comparing a team's performance with and without a given player is nothing new (hockey's plus/minus, pitchers' won/lost WAT, and scores of others). So BP does the same with balls in play (BIP): how often does the team turn a ball in play into an out when that player is anywhere on the field, and when he is not?
Small sample size plays a huge role, far more than any of the other valid limitations already cited. How can you, as a reader, tell this right away?
The range offered by the unnamed BP author (is this BP's way of creating a brand?) is +/- .05 outs / play for the team. Now, all you have to do is look at a position's ZR (Zone Rating), and you will quickly see that, for a single position, the range is around +/- .05 to .07 outs / play. That's for the plays at one position; spread over all of the team's balls in play, you divide by 7 or 8, which gives you about .01 outs / play or less. Right away, the Red Sox range shown is at least 5 times larger than the players' true ability could explain. Why? Sample size.
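Here is that back-of-envelope check as a short Python sketch. It uses only the round numbers quoted above; the variable names are mine, and dividing a position's range evenly by 7 or 8 is the same rough assumption the paragraph makes.

```python
# Round numbers from the paragraph above; nothing here is measured data.
zr_range_low, zr_range_high = 0.05, 0.07  # +/- outs per play at one position (ZR)
share_low, share_high = 7, 8              # divide by 7 or 8 to get a team-wide effect

# One fielder's range, spread over all of the team's balls in play.
team_effect_min = zr_range_low / share_high   # smallest plausible team-level range
team_effect_max = zr_range_high / share_low   # largest plausible team-level range

reported_range = 0.05  # +/- outs per play shown in the BP list

# How many times larger the reported range is than ability alone would suggest.
ratio_min = reported_range / team_effect_max
ratio_max = reported_range / team_effect_min

print(f"plausible team-level range: {team_effect_min:.4f} to {team_effect_max:.4f}")
print(f"reported range is {ratio_min:.1f}x to {ratio_max:.1f}x too large")
```

Even taking the most generous end of the ZR range, the reported spread comes out about five times bigger than one player's fielding could plausibly produce.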
Another way to look at it? The best/worst players are worth about +/- 30 outs per 162 games, or about .2 outs per game. In a game, you'll get about 30 plays, so that works out to less than .01 outs / play.
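The same arithmetic, written out as a tiny Python sketch so the conversion is easy to follow. The inputs are the approximate figures quoted above (30 outs per season, 30 plays per game), not measured values, and the names are mine.

```python
# Approximate figures from the paragraph above.
outs_per_season = 30   # best/worst fielders, +/- outs per 162 games
games_per_season = 162
plays_per_game = 30    # rough number of balls in play per game

outs_per_game = outs_per_season / games_per_season  # a bit under .2
outs_per_play = outs_per_game / plays_per_game      # well under .01

print(f"outs per game: {outs_per_game:.3f}")
print(f"outs per play: {outs_per_play:.4f}")
```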
So, I applaud the BP author for taking the time to add that huge caveat. However, with the above analysis, we can pretty much say that the list presented is not statistically significant, and therefore is worthwhile only for fun, and for a much, much larger future study.
--posted by TangoTiger at 03:38 PM EDT
Posted 12:44 a.m., August 1, 2003 (#1) - Vinay Kumar
Excellent point, Tango. Here's another way to look at it: by those numbers, the Sox's opponents raise their BA on balls in play by fifty or so points going from Damian Jackson to Millar or Giambi. Is that reasonable? Could swapping one LF be the difference between teams hitting .260 and .310? Of course not! That's a huge difference. That's the difference between the best pitching+fielding in the league and the worst.
Over the long haul, what kind of difference might we see between good and bad LFers? Even 10 points seems like a lot, but let's just go with that. Against a spread of fifty or so points in the list, that means the BP rankings are way off; the noise is so much greater than the signal that we can't read anything into it.
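To put a rough number on "noise greater than signal", here is a minimal Python sketch. It treats every ball in play as an independent coin flip with the same out probability, which is a simplification, and the sample sizes (600 balls in play with the player on the field, 2400 without) and the .700 out rate are made up for illustration; they are not from the BP article.

```python
import math

def diff_standard_error(out_rate, n_with, n_without):
    """Standard error of (out rate with player) - (out rate without),
    assuming independent binomial trials at a common out rate."""
    variance = out_rate * (1 - out_rate) * (1 / n_with + 1 / n_without)
    return math.sqrt(variance)

# Hypothetical inputs, chosen only to illustrate the scale of the problem.
out_rate = 0.70    # assumed team out rate on balls in play
n_with = 600       # assumed balls in play with the player on the field
n_without = 2400   # assumed balls in play without him

signal = 0.010     # the generous "10 points" of true difference
noise = diff_standard_error(out_rate, n_with, n_without)

print(f"standard error of the with/without difference: {noise:.4f}")
print(f"signal-to-noise ratio: {signal / noise:.2f}")
```

Under those assumptions, one standard error of pure luck is about twice the size of the 10-point effect we were being generous about, which is why a 50-point swing in a partial-season list tells us so little.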
What would we think about a batting stat based on the team's scoring while that player was in the lineup?
The more I think about it, the more disappointed I am that BP ran those numbers; they're just so far from being statistically significant, that it's like a listing of averages in particular batter-pitcher matchups.
Posted 12:49 p.m., August 1, 2003 (#2) - tangotiger
It would have been better to break the numbers down by groundballs and flyballs (and ignore pops and liners). Whether the 1B is there or not would not affect the flyball out rate, and vice versa for the CF and groundballs.
If you really wanted to do something cool, extend that for multiple years. How did Giambi's teams throughout the years do on groundballs when he was there, and when his backups were there? Problem is that you might not get enough "backup" games to get any meaningful results.
Like I said, it's a fun exercise, but limited by the sample size.
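For anyone who wants to try the groundball/flyball breakdown, here is a minimal Python sketch of the bookkeeping. The record layout (ball type, whether the player was on the field, whether an out was made) and the function name are hypothetical, and the toy plays are invented purely to show the shape of the output; real play-by-play data would need to be mapped into this form first.

```python
from collections import defaultdict

def with_without_rates(plays):
    """plays: iterable of (ball_type, player_on_field, out_made) tuples.
    Returns {(ball_type, player_on_field): (out_rate, number_of_plays)},
    keeping only groundballs ("GB") and flyballs ("FB")."""
    tallies = defaultdict(lambda: [0, 0])  # key -> [outs, plays]
    for ball_type, on_field, out_made in plays:
        if ball_type not in ("GB", "FB"):
            continue  # ignore pops and liners
        tally = tallies[(ball_type, on_field)]
        tally[0] += int(out_made)
        tally[1] += 1
    return {key: (outs / n, n) for key, (outs, n) in tallies.items()}

# Invented toy data, only to show how the function is called.
toy_plays = [
    ("GB", True, True), ("GB", True, False), ("GB", True, True), ("GB", True, True),
    ("GB", False, True), ("GB", False, False),
    ("FB", True, True),
    ("LD", True, False),  # liner: skipped
]

rates = with_without_rates(toy_plays)
for (ball_type, on_field), (rate, n) in sorted(rates.items()):
    label = "with" if on_field else "without"
    print(f"{ball_type} {label}: {rate:.3f} on {n} plays")
```

Keeping the play count next to each rate matters here: with only a handful of "backup" plays in a cell, the rate is mostly noise, which is the whole point of this thread.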