Baseball Player Values (November 22, 2003)
Yet another system based on "Win Probability Added".
The major difference between Ed's system and mine is that I use a mathematically generated model of win expectancy for every inning/score/base/out state, while Ed uses actual data for the given year.
The problem, if you remember from Phil Birnbaum's data, is that the sample size is so small for the 1978-1990 time period that you sometimes get weird changes between states. Using only one year is extremely problematic. However, overall, you'll get fairly good results.
--posted by TangoTiger at 08:54 AM EDT
Posted 4:35 p.m.,
November 22, 2003
(#1) -
tangotiger
There are many interesting links on that page. Notice how Gagne contributed more wins above average than any other pitcher in 2002? And 4 of the 5 pitchers were relievers?
The career totals show how few pitchers can break into the +40 win class.
Bert Blyleven does NOT do well at all with this system, though I'm not sure if any park adjustments were made. Check out Goose Gossage and Trevor Hoffman's impact here.
Posted 5:18 p.m.,
November 22, 2003
(#2) -
pyrite
This is great stuff. The methodology could be refined (especially if only one year of data is used), but this is the largest amount of win probability player data I've come across.
So many cool things to note when looking at the data:
Barry has eight "MVP" seasons thru 2002. I assume 2003 would be his ninth. Wow.
Relief pitchers have a tremendous impact compared to context-free metrics.
I wonder how large the skew is between computer modeled probabilities and actual data in individual seasons, especially for high leverage situations. E.g., does the poor work of Jose Mesa, Mike Williams, Antonio Alfonseca, et al., lead to Gagne's 2002 season being overrated, or does the sub-optimal use of relievers in high-leverage situations just bring the late-inning probabilities more in line with average performance?
Posted 8:25 p.m.,
November 22, 2003
(#3) -
tangotiger
(homepage)
If, and only if, you want to do the hard work, at the above link I generated a sim of Win Expectancies using 1 million games. (I don't know how accurate it is, as I did it a few years ago.)
Do a search on this site for Phil Birnbaum, as I have a link to his work where he published the actual data from 1978-1990.
Then, just compare.
Posted 5:31 a.m.,
November 23, 2003
(#4) -
Steve Rohde
This is very interesting stuff. I think it is pretty clear that he didn't make any adjustments for park factors; otherwise Helton's 2000 season, for example, good as it was, wouldn't be so out of sight.
It would seem that to calculate these values more accurately, the best approach might be to come up with some methodology to generate separate win expectancy tables for each park (which might also take into account the differences in the League). If you did that, Bonds' career total through 2002, of 88.147 offensive wins above average, would increase substantially.
I have long thought that an approach such as this was a good way to evaluate the impact of relievers, because it takes into account the extent to which they are being used in high leverage situations. In this connection, the high career totals of Gossage and Hoffman are particularly interesting.
Of course, for pitchers, this approach doesn't take into account the impact of a team's fielding in helping or hurting a pitcher's wins contribution, but nevertheless the data are quite interesting. I am a little surprised that Maddux has career totals significantly better than Clemens by this measure, and that Clemens is not as far ahead of Randy Johnson or Pedro Martinez for a career total as I would expect.
Posted 10:45 a.m.,
November 23, 2003
(#5) -
studes
(homepage)
It's amazing how WPA is cropping up everywhere. The fact that he doesn't split responsibility between pitching and fielding pretty much undermines the pitching numbers, IMO. Also, it is certainly possible that Helton would rank first in 2000, even playing in Coors.
Tango, I imagine that park factors/run environment impact the WPA weights. Are park factors going to be part of your system?
Posted 11:00 a.m.,
November 23, 2003
(#6) -
tangotiger
Park factors, by far the most important thing to consider, will be part of my (eventual) system.
As for the pitcher's totals, the numbers are completely appropriate if, and only if, you believe that his fielders were league average.
Since the question being asked is: "what is the win probability, given average conditions, at the bottom of the 5th, down by 1, bases empty, 1 out before and after this PA", then as long as that pitcher had average fielders at every position, then you do NOT have to worry about fielding.
While you can't say this for any given season, you can pretty much figure that this is the case for a pitcher's career. I'd be a little surprised if the effect for a given pitcher would amount to much.
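The question posed above amounts to a table lookup: the credit for a plate appearance is the win expectancy after it minus the win expectancy before it. Here is a minimal sketch of that bookkeeping, assuming the whole change is charged to the pitcher with the opposite sign. The table layout, the function name, and every probability in it are invented for illustration and come from neither system discussed here.

```python
# Hypothetical win expectancy table: every probability below is invented
# for illustration and comes from neither system discussed in this thread.
# Key: (inning, half, batting team's run differential, bases, outs)
WIN_EXPECTANCY = {
    (5, "bottom", -1, "empty", 1): 0.40,
    (5, "bottom", -1, "empty", 2): 0.37,
    (5, "bottom", -1, "1st", 1): 0.44,
}


def win_probability_added(before, after, table=WIN_EXPECTANCY):
    """Change in the batting team's win expectancy across one plate appearance."""
    return table[after] - table[before]


start = (5, "bottom", -1, "empty", 1)
single = win_probability_added(start, (5, "bottom", -1, "1st", 1))
out = win_probability_added(start, (5, "bottom", -1, "empty", 2))

# The batter is credited with the change; the pitcher gets the opposite sign.
print(f"single: batter {single:+.3f}, pitcher {-single:+.3f}")
print(f"out:    batter {out:+.3f}, pitcher {-out:+.3f}")
```

Under this bookkeeping the pitcher absorbs everything his fielders do, which is why the league-average-fielders assumption matters.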
Posted 12:08 p.m.,
November 23, 2003
(#7) -
Steve Rohde
My point about Helton's 2000 was not that it is impossible that Helton could rank first even if park factors were considered, but that I think it is pretty obvious that park factors were not considered in this system, and I was using Helton's 2000 season number as a clear indicator of that. Looking at the data provided, Helton's contribution of 9.370 wins above average for 2000 ranks as the third highest offensive score of any player in any season from 1972-2002, behind only Bonds' scores in 2001 (11.278) and 2002 (10.040). The only logical explanation for this is that the effect of Coors has not been considered.
In 2000, Helton had an OPS+ of 158 and an EQA of .348, with both of these measures of course taking park factors into account. Bonds in 2000 had an OPS+ of 191 and an EQA of .362. Yet on this new version of win probability added, Helton in 2000 scores 9.370 wins above average compared to 5.519 for Bonds. Helton did play more games than Bonds in 2000 (160 compared to 143), but that difference wouldn't create this kind of discrepancy.
Another example-- In 1997, Mike Piazza, playing only one fewer game than Larry Walker, had a higher OPS+ and a higher EQA than Walker. But Walker played in an extreme hitters' park whereas Piazza played in an extreme pitchers' park (Dodger Stadium). As a result, according to the measure, Walker contributed 7.746 offensive wins above average to lead the majors, whereas Piazza didn't make the top 5, and thus Piazza contributed less than the 5.856 above average attributed to the number 5 man (Bagwell). Failure to take park factors into account appears to be the only logical explanation for this discrepancy between Walker and Piazza.
The developer of this system, Ed Oswalt, in his explanation, doesn't mention park factors, but instead describes an approach which appears to use just one table for any given year.
I think Ed Oswalt has done some fine work here, but I agree with tango's point about the importance of considering park factors. I also agree with tango's point that if you assume all fielders are League average then the pitcher numbers are a reasonable reflection of their contribution. However, I don't think that we can necessarily assume that the impact of fielders will even out over a pitcher's career.
Posted 3:52 p.m.,
November 23, 2003
(#8) -
Michael Humphreys
Great stuff.
I think the Rickey Henderson result is very interesting. In one of his Abstracts, Bill James ran some simulations--perhaps based on run rather than win expectations--to compare the impact of Rickey with Mays, and Rickey turned out to be surprisingly close in value to Mays, possibly even better.
The Win Expectation approach might provide more evidence of the greater importance of on-base percentage compared with slugging percentage, and, even more importantly, the effect of basestealing on Win Expectation. I think Rickey's "Wins" from basestealing *might* be a little higher than his Linear Weight basestealing runs/wins if you apply UZR or Tango's run weights. Perhaps Rickey was really stealing extra bases when they had the most impact on getting that one run needed for a win.
Another interesting result is that Brett came out better than Schmidt. Perhaps Schmidt received more low-Win-value quasi-IBBs than we realized.
Finally, I agree with the other posters that we need to find a way to factor in fielding for evaluating pitchers. Most pitchers with long careers have, on average, average fielders behind them. However, Maddux's rating is almost certainly improved to a non-trivial degree by Atlanta's fielding, which was the best overall in the '90s. Similarly, Palmer benefitted from outstanding Oriole fielding.
Posted 10:02 p.m.,
November 23, 2003
(#9) -
MGL
The easy way to "factor in" fielding to pitching is to simply substitute a league average or team average $H for a pitcher's sample $H. If you want to be a little more rigorous, simply adjust each pitcher's $H by a team's DER. If you want to be more rigorous than that, use team UZR ratings to adjust a pitcher's $H. If you don't have UZR ratings, you can use MAH's new DRA ratings on a team level.
How to do these adjustments in a win added probability system is a little tricky....
Posted 11:50 a.m.,
November 24, 2003
(#10) -
tangotiger
Michael, can you do the following:
1 - figure out how many runs / BIP were saved by each team that Maddux pitched for
2 - multiply the above figure by Maddux's BIP, year by year
3 - Sum the above figures
This will give us the fielding support that Maddux received, relative to average, (assuming that the GB/FB IF/OF effect is also balanced).
If you want to break it down by IF/OF to try to make it more Maddux-centric, feel free.
I have to believe that the effect for any given pitcher will be under 5 runs per season. I'd say over a career, it's probably under 50 runs. These are just guesses.
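The three steps above can be sketched as a short calculation. The `Season` record, its field names, and the two sample rows are hypothetical placeholders, not real team or Maddux figures; real inputs would come from a team fielding rating such as DRA or UZR.

```python
from dataclasses import dataclass


@dataclass
class Season:
    label: str
    team_runs_saved_per_bip: float  # step 1: team fielding runs saved per ball in play
    pitcher_bip: int                # the pitcher's own balls in play that season


def fielding_support(seasons):
    """Steps 2 and 3: scale each team rate by the pitcher's BIP, then sum."""
    return sum(s.team_runs_saved_per_bip * s.pitcher_bip for s in seasons)


# Placeholder rows for illustration only, NOT real data.
sample = [
    Season("season A", 0.010, 700),
    Season("season B", 0.002, 500),
]
print(f"fielding support: {fielding_support(sample):+.1f} runs vs. average")
```

Splitting each row into infield and outfield rates would follow the same pattern, with one product per group.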
Posted 1:19 p.m.,
November 24, 2003
(#11) -
FJM
Fascinating! But there are some strange things here, particularly among the 1972-2002 Pitcher Rankings. For example, how can Appier, Hershiser, Gooden, Franco(!!!), Saberhagen, Brown, Mussina and Glavine all score higher than Steve Carlton even though Lefty faced more batters and had a lower BAA (.238) than any of them (.243-.252)? Franco is really weird, as he faced only 29% as many batters. Assuming equal performance, he'd need an LI of 3.41 just to match him.
Posted 1:31 p.m.,
November 24, 2003
(#12) -
ColinM
I have to echo FJM here. I certainly have to respect anyone who takes the time to do this sort of thing, but as much as I hate to judge a system by its results, some of the results with the pitchers do seem a bit strange.
For example, Caldwell over Guidry in 78? And no Clemens among the top 5 in 91,92 or 98?
Posted 3:04 p.m.,
November 24, 2003
(#13) -
J Cross
yeah, 4 closers and a starter as the 5 top pitchers in 2002 is a surprising result but I figure this is b/c he's comparing players to average instead of replacement level. To look at "value" each pitcher would arguably have (Avg. Rate - Repl. rate)*IP added to their total and the starters would all rise compared to the closers, right? What I can't figure is why does Maddux rate so much better than Clemens (I figure that they're really pretty close in career value)? Does he not take league into account?
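A minimal sketch of that replacement-level adjustment follows. The rates are expressed as wins per inning relative to average, and both the -0.01 replacement rate and the two pitcher lines are assumptions chosen only to show the direction of the effect.

```python
def add_replacement_credit(wins_above_average, innings, avg_rate, repl_rate):
    """Shift a wins-above-average total to a replacement-level baseline."""
    return wins_above_average + (avg_rate - repl_rate) * innings


# Hypothetical rates in wins per inning, measured relative to average.
AVG_RATE = 0.0
REPL_RATE = -0.01  # assumption for illustration only

# Hypothetical pitchers: a closer with more wins above average but few innings,
# and a starter with fewer wins above average but many innings.
closer = add_replacement_credit(5.0, 80, AVG_RATE, REPL_RATE)
starter = add_replacement_credit(4.0, 220, AVG_RATE, REPL_RATE)
print(f"closer: {closer:.1f}, starter: {starter:.1f}")
```

With these made-up inputs the starter passes the closer once innings are credited. A single per-inning replacement rate ignores leverage, so this is only a rough picture.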
Posted 3:09 p.m.,
November 24, 2003
(#14) -
tangotiger
If someone has a little time, do they want to compare the results of my Win Advancements from 1999 to 2002 to Ed Oswalt's (at least for the pitchers)?
Since I have not yet factored in park, the differences would be entirely due to the win probability tables we would have used (mine generic/math, and his empirical for a given year). Like I said, you need hundreds of thousands of games to get a good table, and not just 2000 games.
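To put rough numbers on the sample-size point, here is a minimal sketch that treats the observed win rate for a single inning/score/base/out state as a binomial proportion. The 0.40 win expectancy and the occurrence counts are illustrative assumptions, not figures from either table.

```python
import math


def win_expectancy_std_error(win_prob: float, occurrences: int) -> float:
    """Binomial standard error of an observed win rate for one game state."""
    return math.sqrt(win_prob * (1.0 - win_prob) / occurrences)


# Hypothetical state with a true win expectancy of 0.40.
# The occurrence counts are assumptions chosen to show the scaling.
for occurrences in (20, 200, 20000):
    se = win_expectancy_std_error(0.40, occurrences)
    print(f"{occurrences:>6} occurrences: +/- {se:.3f}")
```

The error shrinks with the square root of the count, so a state seen only a handful of times in one season can easily sit several points away from its true value.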
Posted 3:37 p.m.,
November 24, 2003
(#15) -
Michael Humphreys
Tango,
This is a quick, though much "dirtier" estimate. Some major caveats also apply. In spite of all that, your estimate is amazingly similar to mine.
Maddux pitched effectively full-time from 1988 through 1992 in Chicago, and, for the years covered by DRA, from 1993 through 2001 in Atlanta.
The Chicago defense was essentially average while Greg pitched for them: -16 runs over the course of 5 seasons.
The Atlanta defense was--with one important qualification I'll address below--outstanding, the best sustained team fielding from 1974-2001 by far. Over those 9 seasons, Atlanta never had a negative rating, and only 1 rating below +33 runs saved (1994, +9). The total runs saved over those 9 seasons was +515, or +57 per season.
The per-season errors in Atlanta DRA pitching and fielding runs minus actual runs allowed were: -14, +5, -20, +12, +1, -4, -4, -6, -20. If anything, DRA very slightly underestimates the overall effectiveness of Atlanta's pitching and defense.
Greg has averaged 235 IP per 162 games over the course of his entire career. Assuming average team IP are 1445, he pitched about 16% of his team's innings. 16% of +499 team runs saved (Chi plus Atl) is 80 runs to Greg's benefit, or about 6 runs per each of Greg's 14 full-time seasons.
Now the caveats.
You're right that we should measure by BIP, not IP. We should take Greg's percentage of team BIP, season-by-season, and multiply that by the team fielding rating that season. Even though he is not known as a strikeout pitcher, he almost certainly had above-average strikeout rates and less than 16% of his team's BIP.
As mentioned in the DRA article, the biggest "kink" in the DRA system is that CF putouts and estimated infield fly outs ("IFO") have a lot of overlap, and CF putouts are assigned to team fielding, whereas IFO are assigned to pitchers. Atlanta's IFO have been very, very low. I suggest that it's probably a good idea to allocate 10 runs or so of "Andruw's" per-season runs to IFO, which "belong" to pitchers. (That's four seasons, or 40 runs.) On the other hand, Atlanta had one very low IFO number before Andruw joined the team, and Atlanta also has very high team *pitcher* fielding ratings (assists), which Charlie Saeger has observed are a good proxy for ground ball pitching. Anyway, let's haircut Atlanta's fielding runs by about 40, to about 53 per season, which is probably still a little too high. Take 16% a season for Greg, and the *most* he could be getting subsidized--by probably the best fielding team of the past quarter century--is 8.5 runs a season. Since Greg has more Gold Gloves than any other pitcher in history--and Atlanta's team pitcher fielding ratings have been outstanding--I wouldn't be surprised if at least a few of those runs per season belong to Greg, not his fielders.
If you look at his whole career with the 40 run haircut, Greg has had at most 16% of 450 team fielding runs support over 14 seasons of full-time pitching. That's five runs a season.
Together we may have put together the first reasonably good estimate of the upper bound for per-season fielding impact on a starting pitcher, though interactions between GB/FB pitching and the relative abilities of the team infield/outfield are not considered.
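The arithmetic above can be retraced in a few lines. All inputs are the figures quoted in this post; only the variable names are made up for the sketch.

```python
# Inputs quoted in the post above; variable names are illustrative only.
maddux_ip_per_season = 235
team_ip_per_season = 1445
chicago_runs_saved = -16   # 5 seasons
atlanta_runs_saved = 515   # 9 seasons
full_time_seasons = 14
share = 0.16               # rounded innings share used in the post

print(f"innings share: {maddux_ip_per_season / team_ip_per_season:.3f}")

team_total = chicago_runs_saved + atlanta_runs_saved
career_support = share * team_total
print(f"career support: {career_support:.0f} runs, "
      f"{career_support / full_time_seasons:.1f} per season")

# After the 40-run haircut the post rounds the team total down to 450.
haircut_per_season = share * 450 / full_time_seasons
print(f"after haircut: {haircut_per_season:.1f} runs per season")
```

Replacing the flat 16% with a season-by-season share of team BIP, as the first caveat suggests, would only change the `share` input.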
Posted 3:55 p.m.,
November 24, 2003
(#16) -
tangotiger
The other way to get a decent estimate is to remember the work from Allen/Hsu.
We figure that the standard deviation of hits / BIP from fielders is about .006 to .008 or so. Giving Maddux 700 BIP means that number is around 5 hits. 2 SD (95%) gives you 10 hits, or about 8 runs. Since you get some turnover in fielding, it's unlikely that a team can continue to field great fielders year-in year-out. Perhaps over 15 seasons, the standard deviation for fielding may fall down to .003 to .004. Over 10,000 BIP, that works out to about 60 runs (for 95% of the pitchers).
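The same estimate as a short calculation. The 0.8 runs per hit is not stated outright; it is inferred from "10 hits, or about 8 runs" and should be read as an assumption, as should the function name.

```python
RUNS_PER_HIT = 0.8  # assumption inferred from "10 hits, or about 8 runs"


def fielding_swing(sd_hits_per_bip, bip, n_sd=2.0):
    """Hits and runs at n_sd standard deviations of fielder-driven hits per BIP."""
    hits = n_sd * sd_hits_per_bip * bip
    return hits, hits * RUNS_PER_HIT


# One season: midpoint of the .006 to .008 range over 700 BIP.
season_hits, season_runs = fielding_swing(0.007, 700)
print(f"season: {season_hits:.1f} hits, {season_runs:.1f} runs")

# Career: .003 to .004 over 10,000 BIP.
for sd in (0.003, 0.004):
    career_hits, career_runs = fielding_swing(sd, 10_000)
    print(f"career (SD {sd}): {career_hits:.0f} hits, {career_runs:.0f} runs")
```

The career range brackets the "about 60 runs" figure quoted above.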
Posted 4:54 p.m.,
November 24, 2003
(#17) -
tangotiger
RJ is the only pitcher that appears in all 4 years in the top 5. Ed has him at +24 and I have him at +23 (though I have him at +24 if I give 100% of the BIP to RJ).
I've emailed Ed, but have not gotten a response from him.
Posted 11:38 a.m.,
November 28, 2003
(#18) -
Steve Rohde
I think Ed Oswalt's work here is exciting, but clearly, as this thread suggests, some refinements in approach are needed. Park effects, of course, are important. But another interesting factor is the impact of the League. Oswalt has a separate win expectancy table for each season (and I understand Tango's questions about his methodology for that). But from Oswalt's description it seems clear that he uses the same table for both Leagues in any given year. However, because of the DH, the American League consistently has a higher run scoring environment than the National League, and in many years the impact of that difference between Leagues is substantially greater than the typical park effect. Taking the difference in Leagues into account would narrow the difference in Oswalt's results between Clemens and Maddux. And if what we are seeking to measure is value, taking the difference in Leagues into account is also important for hitters, and would increase, for example, Bonds' numbers. It might also eliminate Brett's advantage over Schmidt.
With the unbalanced schedule, there is also the issue of different win expectancies based on differential quality of opposition. This all gets quite complex, and I am not sure what is feasible and desirable to address at this time.
I will be looking forward to Tango's book, to see how he addresses the various complexities.
Posted 12:47 p.m.,
November 28, 2003
(#19) -
Tangotiger
I mentioned this elsewhere, but WPA will NOT be part of the book.
Posted 7:57 p.m.,
November 29, 2003
(#20) -
Tangotiger
Here are the top pitchers (including active) not in the HOF, according to this system:
Glavine, Gossage, K Brown, Hoffman, Smoltz, Mussina, Schilling, Eckersley, Saberhagen, John Franco, Cone, Gooden, Hershiser, Appier, Lee Smith, Blyleven
Posted 1:57 p.m.,
November 30, 2003
(#21) -
Steve Rohde
Tango,
In your post # 20, you omitted reference to the big 4 of Maddux, Clemens, Johnson, and Martinez. I guess it is so obvious that these are future Hall of Famers that you didn't even think to include them on the list.
Posted 2:37 p.m.,
November 30, 2003
(#22) -
Tangotiger
Wow, that must have been what I was thinking. You know, you can argue that any one of those 4 pitchers was the best pitcher since 1919. And you've got them all in their primes at the same time. Damn right pitchers aren't like they used to be.
Posted 3:00 p.m.,
November 30, 2003
(#23) -
Steve Rohde
I am somewhat surprised that Blyleven didn't come out better on Oswalt's list than he did, finishing with a win contribution only 24.018 above average. Nevertheless, I do think it should be kept in mind that Oswalt's methodology understates Blyleven's career performance in several ways. First, because his data started with 1972, Blyleven gets no credit for his first two years in the majors, when he combined for 442.3 innings with an ERA+ in excess of his career ERA+. Moreover, Blyleven not only spent the great bulk of his career pitching his home games in hitters parks, he also spent most of his career in the higher run scoring environment of the DH League. Accounting for Blyleven's first two years and adjusting for park effects and League would move Blyleven measurably up the chart.
Posted 5:05 p.m.,
November 30, 2003
(#24) -
Tangotiger
Those are all great points. I can easily see an extra 10 to 15 wins because of this.
Park and league factors are critical here.
Posted 9:57 p.m.,
January 30, 2004
(#25) -
RossCW
Perhaps someone here can explain why Oswalt's total for players' "offensive wins" is negative and his total for pitchers' contributions is positive by an equal amount. It seems to have some fairly serious implications.
Posted 9:17 p.m.,
February 1, 2004
(#27) -
AED
It means there is an error of 0.0002 wins per at-bat in favor of pitching. Hardly a serious implication.
Posted 10:19 p.m.,
February 4, 2004
(#28) -
Cyril Morong
Some of you may have seen my research on this. Ed Oswalt's stat is highly correlated with OPS over the time period. Here is my site on this. Also, sorry, I have not read much of what is above here.
http://hometown.aol.com/cyrilmorong/myhomepage/totalclutch1.htm