Does Clutch Hitting Exist?

Yes!

© Tangotiger

Background

These last few months have been a whirlwind for me. Thanks to some key insights from Andy Dolphin, Arvin Hsu, Walt Davis, Alan Jordan, and a few others (thank you internet!), I finally have a so-so grasp of some statistical concepts.

The first important one was:
variance (observed) = variance (true) + variance (luck)

It was this key concept that Arvin Hsu and Erik Allen used in solving DIPS.

The next important one came when I asked whether the spread of observed performance tells us anything about the underlying true talent. That is, can't I use the spread to establish my regression towards the mean, so that I don't need to run a year-to-year test? I ran a few sims, and I stumbled my way across what I thought made sense. Andy Dolphin remarked rather calmly that
regression towards the mean = variance (luck) / variance (observed).
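
To see how the two identities fit together, here is a minimal Python sketch. The function name and the sample spreads (.030 observed, .024 luck) are made up for illustration and are not figures from this study. Capping the regression at 100% when luck variance exceeds observed variance is also my assumption.

```python
def split_variance(var_observed, var_luck):
    """Return (var_true, regression) from observed and luck variance."""
    if var_observed <= 0:
        raise ValueError("observed variance must be positive")
    # variance(observed) = variance(true) + variance(luck)
    var_true = max(0.0, var_observed - var_luck)
    # regression towards the mean = variance(luck) / variance(observed)
    regression = min(1.0, var_luck / var_observed)
    return var_true, regression


# Hypothetical spreads: observed SD of .030, luck SD of .024.
var_true, regression = split_variance(0.030 ** 2, 0.024 ** 2)
print(round(var_true ** 0.5, 3), round(regression, 2))
```

With those made-up inputs, luck accounts for about 64% of the observed variance, so we would regress 64% towards the mean.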

Let's get to it

The first thing I did was classify all PAs from 1999-2002 as clutch or not clutch. Thanks to Leveraged Index (LI), that's a snap. I also discarded all IBB and bunts. I looked at all players with at least 100 clutch PAs and 800 overall PAs. I used my lwtsOBA as the metric, which lets me use binomials.

Technical Interlude

My lwtsOBA (which gives more weight to HR and less to walks to better capture the performance of the player) has a higher spread than regular OBA (1.08 times wider). Therefore, in order to determine the luck variance for this metric, I have to multiply the result of the binomial by 1.08.

Next, I normalize all the lwtsOBAs against the quality of pitchers faced. Some players face higher quality of pitching during clutch situations, and some don't. So, using each pitcher's lwtsOBA, I establish a quality of pitching faced. (I regress those pitchers with at least 800 PA 0%, and those pitchers with fewer than 800 PA 100%. It would be a lot more work to regress each pitcher separately and weight them for each hitter. However, since the standard deviation of the quality faced is .002, the likelihood is that doing it right won't buy you much.)

Some results

Ok, all that wasn't too important. The first thing would be to treat the nonclutch lwtsOBA as the "true talent" OBA. (More on that in a bit.) Having this, it's a simple matter to compute how much luck is involved, given the number of clutch PAs. For example, Miguel Tejada's nonclutch lwtsOBA was .347, and I treat that as his true talent. Given the 478 PAs he had in clutch situations, we expect that his performance would be centered around .347, with 1 standard deviation = .022. We multiply this spread by 1.08 to get a standard deviation of .024. His actual performance in those 478 clutch PAs was .418, or 3.02 standard deviations from the mean.
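
The Tejada numbers can be reproduced with a few lines of Python. This is only a sketch of the calculation described above: the function names are mine, and the inputs are the figures quoted in the text.

```python
import math

SPREAD_FACTOR = 1.08  # lwtsOBA spread relative to plain OBA, per the text


def luck_sd(true_rate, pa, spread_factor=SPREAD_FACTOR):
    """Binomial standard deviation of a rate over `pa` trials, widened for lwtsOBA."""
    return spread_factor * math.sqrt(true_rate * (1.0 - true_rate) / pa)


def clutch_z(nonclutch_rate, clutch_rate, clutch_pa):
    """Standardized score of clutch performance, treating nonclutch as true talent."""
    return (clutch_rate - nonclutch_rate) / luck_sd(nonclutch_rate, clutch_pa)


# Miguel Tejada: nonclutch .347, clutch .418 over 478 clutch PAs.
sd = luck_sd(0.347, 478)
z = clutch_z(0.347, 0.418, 478)
print(round(sd, 3), round(z, 2))
```

Passing `spread_factor=1.0` gives the plain binomial spread, which is what you would use with regular OBA.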

Repeating this process for all 340 hitters, we'd expect 232 of them to have their clutch performance fall within 1 standard deviation, if clutch hitting did not exist. Actually, 214 of them were within 1 SD. The standard deviation of these standardized scores was 1.14. Clutch hitting exists, albeit at a very small level!
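
The expected count of 232 comes straight from the normal distribution. Here is a rough Python sketch of that check. The helper names are mine, and the short list of z-scores is invented to show the counting step (the real per-player scores are not reproduced here).

```python
import math


def expected_within(n_players, z=1.0):
    """Expected number of players within +/- z SDs if only luck is at work."""
    share = math.erf(z / math.sqrt(2.0))  # normal probability of |Z| <= z
    return n_players * share


def count_within(z_scores, z=1.0):
    """Observed number of players whose standardized score is within +/- z."""
    return sum(1 for score in z_scores if abs(score) <= z)


print(round(expected_within(340)))

# Invented z-scores, only to show how the observed count is tallied.
sample_scores = [3.02, -0.4, 0.9, -1.3, 0.1]
print(count_within(sample_scores))
```

If the observed count comes in below the expected one, as 214 does against 232, the standardized scores are spread wider than luck alone would produce.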

But wait, I can't use a player's sample OBA as his true talent OBA, right? Right. If I apply a regression towards the mean, using the equation x/(x+PA), where x=209, the SD of the standardized scores is now 1.12. In fact, I can't find an "x" value where I can make the SD any lower than 1.12. (A regression towards the mean of 100%, meaning everyone has the same true talent level in nonclutch situations, yields a spread of 1.61.)
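
As a sketch of what that regression step looks like in Python: x=209 is from the text, but the league-average rate and the nonclutch PA count in the example are placeholders I picked for illustration.

```python
def regress_rate(observed_rate, pa, league_rate, x=209.0):
    """Regress an observed rate towards the league rate using x / (x + PA)."""
    weight_on_mean = x / (x + pa)  # share of the estimate taken from the mean
    return weight_on_mean * league_rate + (1.0 - weight_on_mean) * observed_rate


# Hypothetical: a .347 nonclutch hitter over 2000 PAs, league rate of .340.
print(round(regress_rate(0.347, 2000, 0.340), 4))
```

With x=209, a hitter with 209 nonclutch PAs is regressed halfway to the mean, and a hitter with thousands of PAs barely moves.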

OBA instead of lwtsOBA

My choice of lwtsOBA was good as it better captures the performance of the player than does OBA. However, I had to fudge with the 1.08 figure. That figure would have a huge impact as to how much clutch exists. If I had no fudge, the actual spread would have been 1.22, which is extremely significant. The closer my fudge is to 1.22, the less we can say that clutch hitting exists.

If I use OBA instead of lwtsOBA, what happens? In that case, I don't need the fudge for the binomial. The spread comes in at 1.14. This figure is almost identical to the lwtsOBA figure.

Now what?

Well, now we try to figure out how much we should regress the difference (between the sample clutch performance and the true talent OBA) to establish the clutch ability of our players.

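Thanks to Andy, this is easy. In standardized units the luck variance is 1 and the observed variance is 1.12^2, so regression towards the mean = (1/1.12)^2 = about .79. That's how much to regress, given the average number of clutch PAs in our sample (325). Solving x / (x + 325) = .79 gives an x of roughly 1250, so our regression towards the mean equation is: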
clutch regression = 1250 / (1250 + clutchPAs)
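
Here is a small Python sketch of applying that equation. The function names are mine. Using the raw nonclutch figure as the baseline (rather than a regressed true talent estimate) is a simplification for illustration, and the result is in lwtsOBA points, not runs.

```python
CLUTCH_X = 1250.0  # constant from the regression equation in the text


def clutch_regression(clutch_pa, x=CLUTCH_X):
    """Share of the observed clutch difference to regress away."""
    return x / (x + clutch_pa)


def estimated_clutch_skill(observed_diff, clutch_pa):
    """Portion of the clutch-minus-nonclutch difference kept as ability."""
    return observed_diff * (1.0 - clutch_regression(clutch_pa))


# At the sample average of 325 clutch PAs, about 79% is regressed away.
print(round(clutch_regression(325), 2))

# Tejada: .418 clutch vs .347 nonclutch over 478 clutch PAs.
print(round(estimated_clutch_skill(0.418 - 0.347, 478), 3))
```

The more clutch PAs a hitter has, the less we regress, but even at 478 PAs roughly 72% of the observed difference is treated as luck.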

Clutch Hitters

So, who are the clutch hitters? From 1999-2002, Jason Giambi and Miguel Tejada have shown the true talent clutch ability to add 2 runs per year. That's it. That's the effect of clutch ability.

Again, I should say that that's how much clutch ability we can detect. It's possible that there is more. After all, the clutch performance of these players over this time period suggests they added 12 runs during clutch performance each year. Our current detection process says that most of that was due to good luck, but not all of it.

Clutch hitting is there. It's an ability that does exist, and is detectable. And its effect is rather limited.


Added: Feb 4, 2004

While the effect is 2 clutch runs for Giambi and Tejada, these clutch runs come at the most opportune time. The effect of the 2 runs, which would normally contribute 0.2 wins, is 0.6 wins when considering the timing. Essentially, 2 runs in a clutch situation are equivalent to 6 random runs. Therefore, it is best to consider these 2 runs as 6 leveraged-runs. 2 runs in a clutch situation, 6 random runs, or 0.6 wins. They are all equivalent.
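
The conversion can be written out as a tiny Python sketch. The two constants are not stated outright in the text. They are backed out of its numbers: 2 runs for 0.2 wins implies 10 runs per win, and 0.6 wins against 0.2 wins implies a leverage of 3 in clutch situations.

```python
RUNS_PER_WIN = 10.0    # implied by "2 runs ... 0.2 wins"
CLUTCH_LEVERAGE = 3.0  # implied by 0.6 wins vs. 0.2 wins


def leveraged_runs(clutch_runs, leverage=CLUTCH_LEVERAGE):
    """Random-situation runs equivalent to the given clutch runs."""
    return clutch_runs * leverage


def clutch_runs_to_wins(clutch_runs, leverage=CLUTCH_LEVERAGE):
    """Wins contributed by runs added in clutch situations."""
    return leveraged_runs(clutch_runs, leverage) / RUNS_PER_WIN


print(leveraged_runs(2), clutch_runs_to_wins(2))
```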