Bruce, Lee, and the Goose
December 17, 2002 - ColinM
Really fantastic work. I have a question regarding peak value for relievers. It would seem to me that the years in which a reliever would be used with maximum leverage would be those seasons in which he was at his best. So, for example, Goose had an LI of 1.62 for his career. But Goose also had a number of mediocre years at the end of his career where he probably wasn't used in high leverage situations. So isn't it likely that his LI for his peak seasons may be quite a bit higher, maybe more like 2.0 or so? If this is true, that would make his best seasons, 75, 77, 78, extremely valuable, HOF caliber.
I guess my theory is that while we can determine which starting pitchers had a similar career value using a reliever's career LI, a top reliever such as the Goose is more likely to have a greater peak value than those starters because his LI will be greatest during his best years. So it may be that while it is almost impossible for a relief ace to compile HOF level career numbers, his peak value might be high enough to justify election along the lines of Koufax or Dean. Do the numbers support this, or am I off base here?
Bruce, Lee, and the Goose
December 17, 2002 - ColinM
That would be great Tango, to see the breakdowns by situation. I look forward to seeing the results. After this step, it would be nice to see if we could come up with a better way to estimate leverage for relievers for those seasons where PBP data isn't available. For the last 5 years or so I have used my own little method for relief aces, to estimate historical leverage-equivalent innings. Simply IP + (G-GS)/2 + SV. It seems to give results that make intuitive sense, and looks reasonably consistent with your leverage numbers. I still like it better than Bill James' save equivalent innings, which just has a hard cap. Of course it only "works" for top relievers, as the assumption is that relief games are more valuable than GS. It would be nice to use the PBP data to come up with a formula that isn't quite as rough an estimate as this.
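As a minimal sketch, the rough formula above (IP + (G-GS)/2 + SV) could be coded as follows. The function name and the sample stat line are hypothetical, chosen only for illustration, and innings are assumed to be true decimals rather than the ".1/.2" box-score notation.

```python
def leverage_equivalent_innings(ip, g, gs, sv):
    """Rough leverage-equivalent innings: IP + (G - GS)/2 + SV."""
    relief_games = g - gs  # appearances that were not starts
    return ip + relief_games / 2 + sv


# Hypothetical relief-ace season (not a real stat line):
# 100 IP, 60 games, 0 starts, 30 saves.
print(leverage_equivalent_innings(100.0, 60, 0, 30))  # 160.0
```

A pure starter (G equal to GS, no saves) simply gets his actual innings back, which is why the estimate is only meant for relief aces.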
Double-counting Replacement Level (August 25, 2003)
Posted 4:25 p.m.,
September 9, 2003
(#36) -
ColinM
I was late in finding this thread, but I want to say great job tangotiger on noticing this and letting BP know. I'm not surprised about Loaiza. WARP-3 and, even worse, Win Shares really downgrade the contributions of top pitchers versus top position players. Most historical research I've done suggests that the top pitchers are roughly as valuable as the top position players. I would guess on average maybe 5 out of the top 15 players in a given year would be pitchers. Not that you guys have any reason to just take my word for this :)
Double-counting Replacement Level (August 25, 2003)
Posted 2:55 p.m.,
September 11, 2003
(#37) -
ColinM
I posted this in clutch hits but it seems to have some relevance over here also. In the thread we noticed that Randy Johnson strangely has more career WARP3 than Greg Maddux. I then ran across this in the BP glossary:
XIP
Adjusted Innings Pitched; used for the PRAA and PRAR statistics. There are two separate adjustments:
1) Decisions. Innings are redistributed among the members of the team to favor those who took part in more decisions (wins, losses, and saves) than their innings alone would lead you to expect. The main incentive was to do a better job recognizing the value of closers than a simple runs above average approach would permit. XIPA for the team, after this adjustment, will equal team innings. First, adjust the wins and saves; let X = (team wins) / (team wins + saves). Multiply that by individual (wins + saves) to get an adjusted win total. Add losses. Multiply by team innings divided by team wins and losses.
2) Pitcher/fielder share. When I do the pitch/field breakdown for individuals, one of the stats that gets separated is innings. If an individual pitcher has more pitcher-specific innings than an average pitcher with the same total innings would have, then the difference is added to his XIPA. If a pitcher has fewer than average, the difference is subtracted. This creates a deliberate bias in favor of pitchers who are more independent of their fielders (the strikeout pitchers, basically), and against those who are highly dependent on their defenses (the Tommy John types).
Ignoring #1 for the moment and just concentrating on #2, this is...well...terrible. I mean, if you want to use DIPS theory and try to figure out the quality of defense behind a pitcher, then sure, that's fine. But to give a pitcher credit over another just because he has more defense independent plays? This tells you nothing about the quality of the defense behind him.
Look at it this way. Say RJ and Maddux both save 60 runs above replacement. Do you give Maddux less credit because he accomplished it by not giving up homers and walks? In this case the fact that he has fewer BB and HR, fewer defense independent plays, is a GOOD thing. You don't give more to RJ for having more Ks.
If I'm wrong about how this works, then I apologize to BP. But it sure seems to fit the results and the description. I also found these numbers:
Pitcher          IP     ERA+   WARP3
Nolan Ryan       5386   112    130.2
Gaylord Perry    5350   117    113.3
Nolan Ryan has over 380 XIP in 1973 and 74. Looks to me like WARP3 for pitchers is just as unreliable as WARP3 for batters.
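For reference, here is a literal sketch of the first (decisions) adjustment as worded in the glossary entry quoted above. The staff below is made up, and the glossary text does not say whether the result is blended with actual innings, so this should not be read as BP's actual XIP calculation. It only shows that the quoted steps redistribute exactly the team's innings.

```python
import math


def decision_adjusted_innings(staff, team_ip):
    """Apply the quoted steps: scale W+SV by X, add L, convert to innings."""
    team_w = sum(p["w"] for p in staff)
    team_l = sum(p["l"] for p in staff)
    team_sv = sum(p["sv"] for p in staff)
    x = team_w / (team_w + team_sv)            # X = team wins / (team wins + saves)
    ip_per_decision = team_ip / (team_w + team_l)
    return {
        p["name"]: (x * (p["w"] + p["sv"]) + p["l"]) * ip_per_decision
        for p in staff
    }


# Hypothetical 90-72 staff with 45 saves and 1458 team innings.
staff = [
    {"name": "starter", "w": 20, "l": 8, "sv": 0},
    {"name": "closer", "w": 5, "l": 4, "sv": 40},
    {"name": "everyone else", "w": 65, "l": 60, "sv": 5},
]
xip = decision_adjusted_innings(staff, team_ip=1458.0)
print(xip)
# The adjusted innings add back up to the team total, as the glossary states.
print(math.isclose(sum(xip.values()), 1458.0))
```

Under these assumptions a pitcher with many decisions per inning, such as the closer here, picks up far more adjusted innings than he actually threw.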
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 2:54 p.m.,
September 8, 2003
(#9) -
ColinM
If IBB are truly win-neutral events wouldn't that be the same thing as saying that your average IBB would lead to the same expected value as pitching to the player? In other words, if an average player produces about .125 runs/pa, then an average IBB should be worth about the same (I know I'm ignoring inning/run context right now)?
Now assuming managers have this balancing act right, wouldn't you expect that on average Barry Bonds, like anyone else, would not be walked when the win-value is neutral for an average player, but when the win value is neutral for Barry Bonds? So in other words, it seems likely to me that the value of an average IBB for Bonds is much higher than for a typical player.
Extrapolating this line of reasoning, if we assume that IBB are win-neutral for most players, then wouldn't it make sense to deal with them as such:
Set the value of IBB to 0 and calculate BsR. Then adjust BsR as
BsR = BsR + (BsR/(PA-IBB) * IBB)
The result of this would be to treat each IBB as equal in value to an average non-IBB plate appearance for each player, in essence setting the expected win-value for the IBB as neutral. Does this make sense, or am I missing something?
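A small sketch of the adjustment just described, assuming BsR has already been computed with IBB valued at 0. The function name and the sample numbers are hypothetical.

```python
def adjust_bsr_for_ibb(bsr_without_ibb, pa, ibb):
    """Credit each IBB with the player's average value per non-IBB PA.

    Implements BsR = BsR + (BsR / (PA - IBB)) * IBB.
    """
    non_ibb_pa = pa - ibb
    if non_ibb_pa <= 0:
        raise ValueError("need at least one non-IBB plate appearance")
    value_per_pa = bsr_without_ibb / non_ibb_pa
    return bsr_without_ibb + value_per_pa * ibb


# Hypothetical hitter: 100 BsR over 500 PA, 100 of them intentional walks.
print(adjust_bsr_for_ibb(100.0, 500, 100))  # 125.0
```

Algebraically this is the same as scaling BsR by PA/(PA-IBB), so a better hitter automatically gets more credit per intentional walk.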
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 4:37 p.m.,
September 8, 2003
(#14) -
ColinM
Thanks tango,
You're right of course about separating runs/wins, the concept only really makes sense in terms of wins. Although if you have a straight runs per win value, it really doesn't make any difference over the course of a season.
David, I agree with tangotiger's latest post. And I think it can be demonstrated that the average value of an IBB will increase as the quality of the batter increases (although tangotiger will have to verify whether the win-neutral hypothesis holds true). Think about it like this:
Barry Bonds is intentionally walked way more than any other player. This means that there are a number of situations where Bonds is walked but nobody else would be. In other words, there are a number of situations where the win expectancy is too high to justify putting anyone else on base. Given this, the average win expectancy of a Barry Bonds IBB must be higher than that of an average player (or any player at all for that matter).
I'd love to see some empirical data that suggests that this average value remains essentially win-neutral. It would seem to be the most logical result, given that's the case for an average player. If we can verify that this is true, we might have a nice way to deal with the intentional walk!
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 2:02 p.m.,
September 9, 2003
(#20) -
ColinM
Great point RossCW. But Tango, I think you might be misinterpreting what the impact of that might be. In fact I don't think there is really any impact at all.
Let me see if I can work through this. Say you have a hypothetical Barry Bonds who for some silly reason is never intentionally walked. Let’s take the average win-expectancy for a PA by this bizarro-world Bonds and set it to a baseline of 1.
Now let’s take the real Barry Bonds, who gets 70 IBB in a season. For real Bonds, we can divide his PA into two groups, group A (PA-IBB) and group B (IBB). Following what Ross said, group B typically occurs when Bonds' win expectancy is at its highest. So if group B has an average win expectancy of 2, then group A must have a reduced average win expectancy, maybe .85 or so. So yes, his group A PA are less valuable on average than they would be for bizarro Bonds. BUT, here's the key point: if the tradeoff in win expectancy for an IBB is neutral as hypothesized, then the average win expectancy for group B PA MUST remain the same, 2. So when you combine the two groups, win expectancy is still 1 and real Bonds is just as valuable as bizarro Bonds. So if this is true, then the method we have proposed for dealing with IBB would still be valid.
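The group A / group B arithmetic can be checked with a short sketch. The 70 IBB, the baseline of 1 and the group B value of 2 come from the example above; the 600 total PA is an assumption added here purely so the numbers can be worked through.

```python
def implied_group_a_average(total_pa, ibb, overall_avg, ibb_avg):
    """Average value group A (non-IBB PA) must have for the overall
    average to stay at overall_avg when the IBB group averages ibb_avg."""
    non_ibb_pa = total_pa - ibb
    return (total_pa * overall_avg - ibb * ibb_avg) / non_ibb_pa


total_pa, ibb = 600, 70  # 600 PA is an assumed season total
group_a = implied_group_a_average(total_pa, ibb, overall_avg=1.0, ibb_avg=2.0)
print(group_a)  # a bit under .87, in line with the ".85 or so" guess

# Recombining the two groups gets back to the baseline of 1.
combined = ((total_pa - ibb) * group_a + ibb * 2.0) / total_pa
print(combined)
```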
In fact I think the data supports this assumption. I posted earlier that if IBB are win neutral, then you would expect an average run value of about .125 if they occurred in a typical situation. But the BsR value used in the article has them closer to .17. This would suggest that the IBB are occurring at more leveraged times, as Ross pointed out, and as a result have more value than an average PA.
Long story short, Bonds' non-IBB PA are worth less than you would expect, but this is balanced out by the fact that his IBB are worth more than anyone else's. In fact I'm coming to believe that an average IBB by Bonds may be more valuable on average than a regular BB by anyone else. It's also interesting that one result of his "regular" PA having reduced value would be that he would have fewer RBI than predicted. Which is exactly the case. But now I'm rambling on...
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 4:13 p.m.,
September 9, 2003
(#22) -
ColinM
The .125 was absolute runs. I took this statement by Robert Dudek in the original blog "I'll note that from 1994-2001, it was empirically determined that an IW is worth about .178 runs and a NIW was about .33" to also be referring to absolute runs. However, I'm mixing runs and wins here and just detracted from my main point.
Let me say this, if you ignore the last two paragraphs of my last post and just concentrate on the rest, I think what I'm saying is still valid. Let me use your example of equally talented Bonds and Pujols (imagine Bonds having an equal :)). You said:
"But, Pujols gets pitched to, and as a result can create more wins than Bonds in the exact same situation.
That is, if we have that late and close situation where walking Bonds will have a win expectancy of .35 and facing him will be .37, they walk him. But Pujols, they face him, and he makes them pay... to the average win expectancy of .37."
I agree. In your situation Pujols will create more runs. But the problem is, we're working under the theory that the IBB for Bonds, like everyone else, is essentially win-neutral. Meaning the manager guesses right on the IBB about half the time. This implies that there will be another situation where the win expectancy of facing Bonds is .37, but the WE of walking him is .39 and they walk him anyway. So in this case, Bonds has actually had MORE impact than Pujols, by being walked. And if WE for IBB is neutral for Bonds, then this situation is just as likely as the one that you've given.
So in the end, Bonds' PA will be just as valuable as Pujols'. They have to be. The only way they cannot is if the win-expectancy for Bonds' IBB is negative, IOW, if the managers do a better job of guessing the break-even point of walking Bonds than they do with other batters.
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 12:14 p.m.,
September 10, 2003
(#26) -
ColinM
So it seems that we have a few competing theories about how to peg the value of an IBB. These theories basically boil down to assumptions about what a typical manager's tendencies are when issuing a walk. All of these theories work off of the data that on average, the IBB is a win neutral event over the course of a season:
Theory A - Managers guess right on the IBB most of the time with very good players, but guess wrong most of the time with other players. The win-expectancy of an IBB is generally the same for IBB to good players and bad players, so it's a net loss for the good player but a net gain for the lesser guy. This is the theory that would support setting the IBB to a common value for all batters. This seems to be the position taken by David, although I personally do not support this theory as I think good players are more likely to be walked in high leverage situations.
Theory B - Managers guess right on the IBB about half the time for every player, good or bad. In this case the win-expectancy on the IBB is neutral for each player and the IBBs can be treated as having a similar win value as the player's other PA. So the typical IBB will have more value for good players than bad players. This is the theory that I have been working off of in my previous posts.
Theory C - This would be some combination of A and B. In this case good players are walked in higher leverage situations like in theory B, but the managers do a better job of finding the break even point with the good hitters as in theory A. The value of a typical IBB for a good hitter will be worth more than for a bad hitter, but it won't be worth enough to make it win neutral, it will still be a net loss in value.
To sum it up using our Bonds-Pujols example. (The numbers are just pulled out of my ass)
Theory A - Value of an average Bonds PA ~= .95 * PujolsPA
Theory B - BondsPA = PujolsPA
Theory C - BondsPA = .975 * PujolsPA
Which one is right? We just don't know until we have some solid data. Right now I get the feeling that B is closest to the truth, but for now, that's just opinion.
Bonds, Pujols and BaseRuns (September 6, 2003)
Posted 4:44 p.m.,
September 11, 2003
(#29) -
ColinM
David, thanks for the link. It'll take me a while to get through that. If I come up with anything else to add I'll post it here.
Evaluating Catchers (October 22, 2003)
Posted 5:47 p.m.,
October 23, 2003
(#9) -
ColinM
This is brilliant work Mr. Tiger, far and away the best evaluation of catcher defense I have ever seen. It seems to pass the Bill James 80% test: notice how the top ranked catchers are mostly the guys with the best reputation to begin with.
One small suggestion... If I understand your methodology right, it looks like you could run into sample size problems in the case where a pitcher spent the great majority of his career with one catcher. Take Charlie Lea from your Gary Carter example. Lea only had 805 PA with other catchers. But this gets pro-rated to match the 3061 he had with Carter, producing large deltas of -12. Lea's "without Carter" numbers are given as much weight as Scott Sanderson's, even though Sanderson has 7 times as many PA with other catchers. To avoid this, maybe it would make sense to use the lower of the two sets of PA as the weight, and pro-rate everything at the end. So in the case of Lea, the deltas would be about -3 and the PA weight 805. Then you can add up the deltas, and multiply them by ActualPA/WeightedPA to get the total for a career. What do you think, am I insane?
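A rough sketch of the weighting suggested above. The Lea figures (3061 PA with Carter, 805 without, a delta of -12) are from the post; the second pitcher is invented, and treating "ActualPA" as the catcher's total PA with these pitchers is an assumption about the intended meaning.

```python
def capped_delta(delta_at_with_pa, with_pa, without_pa):
    """Rescale a delta that was pro-rated to with_pa down to the smaller sample."""
    weight = min(with_pa, without_pa)
    return delta_at_with_pa * weight / with_pa, weight


def career_total(rows):
    """rows: (delta pro-rated to with_pa, PA with catcher, PA without catcher)."""
    scaled = [capped_delta(delta, with_pa, without_pa)
              for delta, with_pa, without_pa in rows]
    total_delta = sum(delta for delta, _ in scaled)
    weighted_pa = sum(weight for _, weight in scaled)
    actual_pa = sum(with_pa for _, with_pa, _ in rows)
    return total_delta * actual_pa / weighted_pa  # pro-rate at the end


lea_delta, lea_weight = capped_delta(-12, 3061, 805)
print(lea_delta, lea_weight)  # roughly -3 on a weight of 805

# Lea plus one hypothetical pitcher with plenty of PA away from the catcher.
print(career_total([(-12, 3061, 805), (6, 3000, 5600)]))
```

With this weighting a pitcher's small "without" sample can no longer dominate the career total just because he threw a lot of innings to the catcher in question.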
And maybe it's just the lingering insanity, but couldn't this work for other fielders too? I mean, all of the stuff we adjust for: GB:FB, Lefty:Righty, this should all be accounted for in the selection of pitchers for each fielder. As with the catchers, the comparison environment should be the same as the fielder's actual environment EXCEPT for park factors, which is a big exception of course. But then PF could always be added afterwards.
Evaluating Catchers (October 22, 2003)
Posted 8:21 p.m.,
October 28, 2003
(#20) -
ColinM
Thanks for checking that Tango. Once again, great study.
ALCS Game 7 - MGL on Pedro and Little (November 5, 2003)
Posted 2:21 p.m.,
November 6, 2003
(#9) -
ColinM
Nice explanation Arvin. It makes me think about my strato days. Even Roger Clemens would sometimes give up 7 runs in 3 innings, but I doubt the strato card was missing its "stuff" that day. (Of course we all managed the game as if that was the case, I'd yank him for Bob Stanley real quick).
Anyway, I do agree with most of you that pitchers have bad days and I'm sure managers and coaches can probably recognize that even if the results might not show it.
What's a Ball Player Worth? (November 6, 2003)
Posted 10:32 a.m.,
November 10, 2003
(#15) -
ColinM
studes:
"Got to admit, though, that the Halladay rating surprises me. Judging from the rankings, these guys are not splitting credit between pitching and fielding."
Andrew Edwards:
"Agree that pitchers are overrated, they almost certainly underrepresented fielding if Halalday and Delgado are their top 2 AL players."
This isn't right guys, pitchers are not overrated. Say the Jays have an average defense (I doubt it's even that good). If you are looking at an average starter you would expect both him and the defense to contribute 0 wins above average. Now consider when Roy Halladay pitches, how much does the defense contribute? It's still an average defense, so it still contributes 0 WAA. All of the extra wins above average should be credited to Halladay. The defense does not get any better because Halladay is on the mound, and therefore should not get any of the extra credit. I've said it before, top pitchers are typically every bit as valuable as the top hitters.
What's a Ball Player Worth? (November 6, 2003)
Posted 4:17 p.m.,
November 10, 2003
(#19) -
ColinM
studes,
That might be an improvement but I don't think that the pitching/defense split itself is the problem with pitchers' WS. I think the real problem is that James translates marginal runs saved directly into WS in a linear fashion. In other words, a pitcher who saves 80 marginal runs will have exactly twice as many WS as a pitcher who saves 40 (ignoring the other little pieces of the formula). But if you work out their expected winning pct. using a Pythagorean type formula, you will see that the pitcher who saves 80 runs is actually MORE than twice as valuable as the other guy, because the more runs he saves, the lower the actual run environment becomes. The effect is that PWS tend to severely underrate the best pitchers when compared to the best hitters.
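The non-linearity can be illustrated with a quick sketch. Everything numeric here is an assumption made for illustration: a Pythagorean exponent of 2, a 5 runs-per-game league, a 250-inning workload, and runs saved measured against an average pitcher rather than Win Shares' marginal-runs baseline.

```python
def pythag_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage."""
    rs = runs_scored ** exponent
    return rs / (rs + runs_allowed ** exponent)


def wins_above_average(runs_saved, innings=250.0, league_rpg=5.0):
    """Pythagorean wins above average for a pitcher backed by average offense."""
    games = innings / 9                      # nine-inning game equivalents
    average_runs = league_rpg * games        # runs an average pitcher allows
    runs_allowed = average_runs - runs_saved
    wpct = pythag_wpct(average_runs, runs_allowed)
    return (wpct - 0.5) * games


waa_40 = wins_above_average(40)
waa_80 = wins_above_average(80)
print(waa_40, waa_80, waa_80 / waa_40)  # the ratio comes out above 2
```

How large the gap is depends on the exponent and the workload chosen, so the sketch shows the direction of the effect rather than its size.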
I think PWS can be best improved by using a pitcher's pyth. W% vs. the team average, as opposed to using marginal runs in order to divvy up the shares. I can go into more detail later if you're interested, but I'm running out of the office as I type this.
What's a Ball Player Worth? (November 6, 2003)
Posted 12:03 p.m.,
November 11, 2003
(#24) -
ColinM
studes,
I took the time to mock up a spreadsheet to model PWS and test my theory by playing around with different runs allowed distributions and comparing Pyth WAA vs. WS WAA. I DID find what I was looking for but the effect was much smaller than I thought it would be. Once I got to Pedro level, WS was about 1/2 a win under the Pythagorean estimate. The results were pretty much identical for pitchers closer to the average. However, I did this assuming that since the defense didn't change, all of the extra "credit" would go towards the pitching WS total. I'm not sure if this matters.
I still think that WS causes people to undervalue good pitchers. The main reason is probably that most people look at the raw WS total as opposed to above average or above replacement. But the expected WS total for a pitcher is generally going to be a fair bit lower than that of a position player. If you look at WSAA, the best pitchers start to look a lot better. This is especially important for relievers.
It may also be that there are a number of small factors that add up to have a larger effect, such as:
- Save adjusted innings. WS adjusts relievers' totals upwards to reflect higher leverage situations. But the extra WS credited to the relief ace are pulled from the general pool, when if I remember correctly, Tango showed that starters' LI should still be around 1.
- Pitching/Defense split
- The slight undervaluing of extremely good pitchers
- The "Other stuff" which may tend to pull claim points towards the average.
In general, now that I've taken the time to look at PWS, they seem to work better than I expected. But most research I've seen and done suggests that a top pitcher is just as likely to be the "best" player in the league as a position player. Until win shares reflects that, it doesn't seem like a good idea to use them to compare pitchers and hitters with each other.
What's a Ball Player Worth? (November 6, 2003)
Posted 2:59 p.m.,
November 12, 2003
(#28) -
ColinM
Tangotiger,
I understand if you don't want to give too much away before your WPA is finished, but do you in fact find that the top couple of pitchers each year are about as valuable as the top position players? I don't think it is fair to compare 4 year averages since pitchers tend to be injured more frequently. But I would expect to see Pedro, Randy and Schilling ranking right up there with the top (non Barry) players in the majors when they are healthy. In my own ranking system I usually expect to see about 2 pitchers among the top 5-6 players and 8 or 9 pitchers among the top 30 players in a typical year.
Underrating of star pitchers has been my primary criticism of pitcher win shares and I would be curious to see this validated (or not). For example, over the last 20 years I rank a pitcher as the best player in the major leagues in the following seasons:
00 - Pedro
99 - Pedro
97 - Clemens
95 - Maddux
94 - Maddux
89 - Saberhagen
86 - Scott
85 - Gooden
Win Shares certainly sees things differently.
What's a Ball Player Worth? (November 6, 2003)
Posted 8:37 p.m.,
November 12, 2003
(#33) -
ColinM
Thanks tangotiger for running through that. I think before WinShares can be taken seriously for pitchers it would have to produce numbers in line with WPA for the seasons where WPA is available. I'm sure there's a bunch of us really looking forward to seeing what you've come up with when you do roll out WPA.
What's a Ball Player Worth? (November 6, 2003)
Posted 1:49 p.m.,
November 13, 2003
(#39) -
ColinM
I dunno, I don't buy it. A pitcher who can throw 252 average innings is less valuable than an injury prone average CFer?
Look at it this way, say we use a replacement baseline of .75. For pitchers it would be about 1.22 assuming an average defense:
(2*(1/.75) + 1)/3 = 1.22
So assume 5RPG average, pitcher gives up 140 runs. Player creates 78 runs. Pitcher saves (140*1.22-140) = 31 runs. Player adds (78-78*.75) = 19.5 runs. The player gets no more credit for defense because studies show replacement defense is close to average anyway.
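The arithmetic above can be reproduced in a few lines. The inputs (.75 baseline, 5 RPG, 252 innings, 78 runs created) are from the post; reading the expression (2*(1/.75) + 1)/3 as a 2:1 weighting of pitching against an average defense is an interpretation, not something the post states.

```python
hitter_baseline = 0.75  # replacement hitter creates 75% of average runs

# Pitching counted twice at 1/.75, average defense counted once at 1.0.
pitcher_baseline = (2 * (1 / hitter_baseline) + 1) / 3

league_rpg = 5.0
pitcher_runs_allowed = 252 / 9 * league_rpg  # average pitcher, 252 IP -> 140 runs
pitcher_runs_saved = pitcher_runs_allowed * pitcher_baseline - pitcher_runs_allowed

player_runs_created = 78.0
player_runs_added = player_runs_created - player_runs_created * hitter_baseline

print(pitcher_baseline)    # about 1.22
print(pitcher_runs_saved)  # about 31
print(player_runs_added)   # 19.5
```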
Whoa..whoa..whoa. Hold on a sec. Let's think about that last sentence. Could it be that Win Shares is suffering from the same double counting effect that Tango found at BaseballProspectus?!
Think about it. As some of the guys have pointed out around here, a replacement player who gives you .75 offense should still give you about a 1 defense. But what is WinShares doing? It sets its 0 level replacement level at 1.52 for both pitching AND defense. Does this make sense?
Is it possible that what we need is something like differing 0 levels for pitching and defense? In this type of system FWS over average remain the same, but total FWS go down and PWS go up, bringing pitchers and hitters more in line with each other. I don't know, I need to think about this more...
What's a Ball Player Worth? (November 6, 2003)
Posted 2:02 p.m.,
November 13, 2003
(#40) -
ColinM
Michael,
When you say that Pedro finishes a little behind Bonds and ARod in 2000, are you comparing them in terms of runs? If so, it may be that Pedro still jumps ahead when you convert the runs to wins. You know, the old "a run saved is worth more than a run added." Just think of how low the run environment is in games that Pedro pitches.
BTW, nice job so far on the DRA articles.
Win and Loss Advancements (November 13, 2003)
Posted 12:36 p.m.,
November 13, 2003
(#5) -
ColinM
This is going to be great stuff. It will be interesting to see if the best relievers are closer to the best starters in value when looking at individual seasons rather than an aggregate. It seems to me that you would generally find more variance in an ace reliever's performance from season to season than a starter's, and it might be that the best relief seasons do compare well with the best starter seasons. Make that the best non Pedro and RJ starter seasons.
IOW, you might find that while an ace reliever is unlikely to be as valuable as an ace starter over a number of years, it is not so uncommon for a reliever to be the top pitcher in a league for a single season (like Gagne in 2003). I'm just guessing though.
Win and Loss Advancements (November 13, 2003)
Posted 3:35 p.m.,
November 14, 2003
(#12) -
ColinM
I do find the Bonds result surprising. I wonder though, how much of that difference from LW is because you are including his sub-par 1999 season? If his yearly totals are something like 4-9-9-8, then his 2000-2002 WAA numbers are actually very close to the LW estimate.
Win and Loss Advancements (November 13, 2003)
Posted 3:36 p.m.,
November 14, 2003
(#13) -
ColinM
Actually, I guess 4-7-10-9 might be a better guess but you get the point.
Win and Loss Advancements (November 13, 2003)
Posted 11:50 a.m.,
December 3, 2003
(#26) -
ColinM
I don't know, .473 looks awfully high as a replacement level. Isn't ~.400 a more commonly used level? For example, I quickly looked at all pitchers with < 25 innings in 2002. They had a collective ERA of 6.05 compared to the MLB average of 4.27. So you would expect around a .340 W% from this group. Now there are going to be some selective sampling issues at play here, but I would think this shouldn't bring the expected W% much past .400 if you wanted to use this method to set the replacement level. What am I missing with your method?
Win and Loss Advancements (November 13, 2003)
Posted 12:01 p.m.,
December 3, 2003
(#28) -
ColinM
Some of the difference is I would only be looking at about 5% of IP. If I increase it to all pitchers with < 37 IP that comes to about 10% with an ERA of 5.54 or about a .375 W%.
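The W% figures in these two posts can be roughly reproduced by treating ERA as runs allowed and the MLB average ERA as runs scored. The Pythagorean form and the exponent of 2 are assumptions; the posts do not say exactly how the .340 and .375 were derived.

```python
def expected_wpct(pitcher_era, league_era, exponent=2.0):
    """Pythagorean W% for a pitcher with average run support."""
    scored = league_era ** exponent
    return scored / (scored + pitcher_era ** exponent)


MLB_ERA_2002 = 4.27  # league average quoted in the earlier post

under_25_ip = expected_wpct(6.05, MLB_ERA_2002)  # pitchers with < 25 IP
under_37_ip = expected_wpct(5.54, MLB_ERA_2002)  # pitchers with < 37 IP
print(under_25_ip, under_37_ip)  # roughly .33 and .37
```

A slightly lower exponent moves both figures up by about a hundredth, which brackets the .340 and .375 quoted above.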
Win and Loss Advancements (November 13, 2003)
Posted 12:17 p.m.,
December 3, 2003
(#29) -
ColinM
Thanks Tango, that clears it up. I should have seen that after all the posting I did on other threads about win and loss shares.
How many GA would an entire staff have? I ask because I'm interested in seeing where you have the replacement level set in terms of W%. How many wins would a staff of replacement pitchers have in your system?
Win and Loss Advancements (November 13, 2003)
Posted 1:40 p.m.,
December 3, 2003
(#32) -
ColinM
I might lean towards 95%, just because it seems to line up better with some existing replacement levels.
However,
There still seems to be some (smaller) discrepancy between the replacements' W% using WA/LA and the expected W% using ERA. Would I be correct in assuming that these replacement pitchers would tend to have a low GA/IP ratio because they tend to pitch in low leverage situations? Is it possible that as the GA/IP ratio increases that the WA/LA ratio may change? Or should it remain static?
Also, if a team were to pitch all replacement pitchers, I would think it's possible that the total number of GAs might increase. If a game with average pitching had 1.4 GA, the replacements might have more of an overall effect on the game (negatively of course) than an average pitcher. So the total impact might be greater than the -.11 wins you came up with keeping the GA the same.
Win and Loss Advancements (November 13, 2003)
Posted 1:50 p.m.,
December 3, 2003
(#33) -
ColinM
I guess what I'm getting at in the last part of my previous post is that using "game advancements" to figure replacement value might not be a good idea. If everything a pitcher did was very close to the expected value and he pitched 270 innings, he would be a very valuable pitcher but might only be 5-5 in GA. This would not be any worse than a pitcher who is 20-20, but when comparing to replacement level, there would be a big difference.
Win and Loss Advancements (November 13, 2003)
Posted 3:13 p.m.,
December 3, 2003
(#36) -
ColinM
Point taken about the 95%. I will say though that I'm not just pulling .400 out of the air. This is about the level I would expect given the ERA of the bottom 10% of pitchers by IP. So if you use 10% as the cutoff, I would also think that this should be consistent with the level you find.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 1:46 p.m.,
November 17, 2003
(#13) -
ColinM
This is some great thinking and work being done by studes. Despite that I have to say stop, don't do it! The dark side Tangotiger speaks of is just that, and once you go down the road of negative win and loss shares, you're going to lose a lot of people because the concept of negative wins and losses just doesn't make any sense.
In wins and losses, baseball already has a debit and credit system. If you have a positive contribution you can tally that in the wins column. If you're contributing negatively, you can put that into the losses column. A negative win is just a positive loss.
I think the conceptual problem with Win Shares can be summed up in one statement from Tangotiger:
"To convert from a marginal utility to a total utility will require some weird things, like negative wins."
That is exactly what needs to be fixed. You need to get away from marginal runs and use a true 0 baseline in order to calculate absolute wins and losses. Marginal runs is just a hack to deal with the fact that it is impossible to pinpoint a true 0 level for the defensive side of things. But I think I have an idea of how a framework based on 0 level might be achieved. I won't go into the details of how everything might be worked out because frankly, I'm not actually sure about all of the details myself. But I will outline the high level concepts.
This is long, so if you want just skip to the conclusion.
For simplicity's sake I will refer to Win Shares as 1 win share per real win instead of 3.
1. Just like Win Shares currently does, you need to split the offensive and defensive credit. Offense and defense both have 50-50 responsibility. So if you win 100 games and your offense is credited with 54 win shares, then the offense will have 27 loss shares, and the defense will have a 46-35 record to work with.
2. Start with offense. Instead of using marginal runs, use a baseline of absolute 0. Calculate "Offensive Win Shares". Except what you are calculating in this system is NOT total Offensive Win Shares. Instead, what you really have is the total number of WINS ABOVE A 0 LEVEL PLAYER. This can be defined as (WinShares-LS)/2 - (0WS-0LS)/2 where 0WS = the 0 Level Player's WinShares. Let's leave offense at that for now, we can calculate Offensive Loss Shares later.
3. Defense. By which I mean Pitching+Defense of course. So how can we calculate Pitching Win Shares using 0 level when a zero level pitcher is undefined? The answer is we can't. But the solution is to look at it backwards. We do have an absolute limit for pitchers, it's just that the limit is found in the opposite direction of the offensive limit. Instead of calculating how many wins a pitcher was above a zero level pitcher, we can calculate how many LOSSES a pitcher was below a PERFECT pitcher. Using Runs Allowed (maybe combined with a Pythagorean formula) you can calculate a pitcher's "Loss Shares". But just like for Offense, these are actually LOSSES BELOW A PERFECT PITCHER.
4. Here we bring it all home. Let's go back to offense. Right now we have WinsAbove0. Let's use the 100 win team to illustrate the difference between the GameShares method and the method I'm proposing. In this example the offense was credited with 54 win shares. Say it's an AL team with 9 players who play every inning of every game. If every hitter were equal we would have 9 players with 6 Win Shares. Now say instead we have one player with 14 WS and 8 with 5 WS apiece. If these players all made the same number of outs, the Game Shares method would divvy up WS and LS as
Player WS-LS
Player1 14-(5)
Players2-9 5-4
I'm saying that this is wrong. The key is to remember that the 14 is NOT total Win Shares but WINS ABOVE 0. So let's think about Loss Shares now. I would argue that a hitter contributes towards losing by making outs, so that Loss Shares, for the most part, should be proportional to outs made. So in this example, everyone gets 27/9 = 3 Loss Shares since everyone has the same amount of outs. Then it's a simple matter to figure Win Shares. WS = (WAbove0 - LossShares)*2.
In my method, the WS-LS list would look like this:
Player WS-LS
Player1 22-3
Player2-9 4-3
Game Shares are NOT the same for each player in this case even though W above average would be. I think this is a more accurate portrayal of real life.
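The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the two bookkeeping schemes as described in this post. The function names are made up for the example, and sharing games in proportion to outs under the Game Shares method is an assumption that matches the equal-outs case used here.

```python
def game_shares_split(claims, outs, team_loss_shares):
    """Game Shares style (assumed form, for illustration only): each hitter
    gets a share of the team's games (wins + losses) in proportion to outs,
    and his losses are whatever is left after subtracting his wins."""
    total_games = sum(claims) + team_loss_shares
    total_outs = sum(outs)
    records = []
    for claim, player_outs in zip(claims, outs):
        games = total_games * player_outs / total_outs
        records.append((claim, games - claim))
    return records


def wins_above_zero_split(claims, outs, team_loss_shares):
    """Proposed style: Loss Shares are proportional to outs, and
    WS = (WAbove0 - LossShares) * 2."""
    total_outs = sum(outs)
    records = []
    for claim, player_outs in zip(claims, outs):
        loss_shares = team_loss_shares * player_outs / total_outs
        win_shares = (claim - loss_shares) * 2
        records.append((win_shares, loss_shares))
    return records


claims = [14] + [5] * 8   # one star and eight identical teammates (54 in total)
outs = [400] * 9          # hypothetical: everyone makes the same number of outs

print(game_shares_split(claims, outs, 27)[:2])
print(wins_above_zero_split(claims, outs, 27)[:2])
```

With these inputs the first call gives the 14-(5) and 5-4 records and the second gives the 22-3 and 4-3 records listed above. The team totals stay at 54 wins and 27 losses either way.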
For pitchers a similar process is performed. Randy Johnson pitches 270 innings and we've credited him 6 Losses Below Perfect. We calculate Perfect Pitcher as having 30 wins. Therefore RJ gets 30-6 = 24 WS. LossesBelowPerfect always equals Loss Shares for pitchers since Perfect Pitcher always has 0 Losses. RJ has a 24-6 WS-LS record.
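The pitcher side can be sketched the same way. The post does not say how the Perfect Pitcher's 30 wins are derived; the sketch below assumes one win per 9 innings, chosen only because it turns 270 innings into 30 wins. The function name is hypothetical.

```python
def pitcher_record(innings, losses_below_perfect, innings_per_win=9.0):
    """Win Shares = Perfect Pitcher wins - Losses Below Perfect.

    innings_per_win=9 is an assumption, picked so that 270 IP gives the
    30 Perfect Pitcher wins used in the example.
    """
    perfect_wins = innings / innings_per_win
    win_shares = perfect_wins - losses_below_perfect
    # A Perfect Pitcher has 0 losses, so Loss Shares equal Losses Below Perfect.
    return win_shares, losses_below_perfect


print(pitcher_record(270, 6))  # the Randy Johnson example: 24 wins, 6 losses
```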
I haven't even touched the pitching/fielding split. But I think if Win Shares is to move forward what it needs is a framework similar to the one I have just described, with no negative values.
Conclusions:
1. Use absolute runs instead of marginal runs
2. Using GameShares is flawed. Win Shares as we use them now are actually Wins Above 0. Loss Shares should be proportional to outs for hitters. Win Shares are (WAbove0 - LossShares)*2. For hitters, Loss Shares are limited by outs but Win Shares are unlimited. For pitchers, Win Shares are limited by IP, but Loss Shares are unlimited.
3. Start with Loss Shares for pitchers using the Perfect Pitcher as a baseline.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 2:15 p.m.,
November 17, 2003
(#16) -
ColinM
Thanks Tango,
I think that WPA is exactly what a system like this wants to be and what it needs to be verified against for those seasons where PBP data exists.
And you're correct, right now it is just a series of concepts. I would imagine that straight RC above 0 would not work so well. What might be needed is a VORP(MLVR) type system for hitters where replacement would be a guy who makes all outs. And probably using Pythagoras somehow for pitchers.
But the reason for the long post was simply to point out some key concepts that would make the Win Shares framework more palatable and mirror reality more accurately.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 8:53 p.m.,
November 17, 2003
(#28) -
ColinM
studes,
When I'm talking about a 0 level player I mean a guy who contributes absolutely 0 wins. A guy who does nothing but make outs. This is what Win Shares wants to measure, right? That's why WinShares adds up to total wins, i.e., wins above 0. I mentioned the VORP idea for exactly the reason you gave, how runs->wins becomes non-linear at the extremes. So what I was thinking is something like: Calculate the team pyth. record using team runs created. Then replace Player A's plate appearances with the 0 level player who makes all outs. Of course you'll end up having to adjust actual PA for all players as the 0 level player eats up so many outs. But then you can compare the original Pyth record with the new record using the 0 level player. This can be used as the "claim points" for offensive WS above 0.
You are absolutely right that Loss Shares in my system are just like game shares in yours. In fact WSAA should be the SAME in both systems. This is EXACTLY why negative win shares are unnecessary. My argument is this:
There is a limit to how much a hitter can negatively affect the outcome of a game. But there is no practical limit to how much a player can positively affect the game. If you want some backup for this check out Tango's WPA for hitters. Losses Advanced do not differ that much between the best hitters and the worst hitters. But there is a huge difference in Wins Advanced.
Now look at my example using Player1, who would be sort of a Sammy Sosa '98 I guess. In the Games Advanced method he is 14 and -5. Using my method, he has just as many loss shares as everyone else, but his WS jump up to 22. I think this is a much better reflection of what really happened on the field: his WS+LS are more than anyone else because he had more of an impact on the game. But in both methods he would be compared to an average player of 4.5-4.5 (WS-LS) and have the exact same WSAA total. It may be that this part of my framework just differs in semantics from Game Shares, but I really think it is an important difference that makes the numbers more meaningful.
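A very rough sketch of that claim-points idea, in Python. Everything beyond "compare the Pyth record with and without the player" is an assumption made for illustration: runs created are treated as additive, the 0 level hitter keeps the same number of PA as the player he replaces, and the extra outs he makes shrink his teammates' runs in proportion to the outs they lose. All names and inputs are hypothetical.

```python
def pyth_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from the Pythagorean formula."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def claim_points(team_rc, team_ra, team_outs, player_rc, player_outs,
                 player_pa, games=162):
    """Pyth wins lost when a hitter's PA go to a hitter who makes all outs.

    Crude assumptions (not from the post): additive runs created, the
    replacement keeps the same PA, and teammates' runs scale with the
    outs left over for them.
    """
    extra_outs = player_pa - player_outs        # PA that used to be non-outs
    others_rc = team_rc - player_rc
    others_outs = team_outs - player_outs
    scale = (others_outs - extra_outs) / others_outs
    new_team_rc = others_rc * scale
    return pyth_wins(team_rc, team_ra, games) - pyth_wins(new_team_rc, team_ra, games)


# Hypothetical team: 800 RC, 700 runs allowed, 4350 batting outs.
# Hypothetical hitter: 120 RC in 700 PA with 420 outs.
print(round(claim_points(800, 700, 4350, 120, 420, 700), 1))
```

A hitter who already makes an out every time up and creates no runs comes out at exactly 0 claim points, which is the 0 level the post is describing.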
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 11:43 a.m.,
November 18, 2003
(#30) -
ColinM
I think it can be that simple (if by simple you mean complicated :). Although you might not be able to go straight from the proportion of RC to wins, you might have to do that VORP type thing I was talking about. Or maybe you can, I don't know. It would have to be validated against WPA anyway. And remember, you'd have to adjust the hitters' Wins once you assign the Losses.
But yeah, I think that's about how you'd do it.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 3:44 p.m.,
November 19, 2003
(#32) -
ColinM
I don't think the first part is right jimd. Win Shares is supposed to divide credit for absolute wins. And absolute wins are created by absolute runs, this is absolutely true. The fact that they may not be proportional in a straight linear way is the reason I suggested some type of pyth. approach to claim points. An average player should still be an average player on any team.
As for the Loss Shares, it probably is the case that outs isn't the ideal thing to use. But I'm actually not convinced that Win and Loss shares shouldn't be different in different team settings anyway. Take the extreme example, a team that never loses. Would it make sense for any player on this team to have anything but 0 losses? The important thing is that the player's WS above average or 0 or any given baseline remain about the same from team to team. So presumably a player who gets more loss shares on a bad team will also have more win shares in my system and therefore a greater game impact. Does this make sense? I really don't know. Does it make sense that a good player might have more overall impact on a bad team but not more overall value? Again, I'm not sure.
And remember outs is just a guess at a proxy to use for real negative impact. In a loss every result of an at-bat that isn't a home run is actually contributing in some way, however small, towards the loss. So perhaps in light of this, it would make the most sense to include RC in any Loss Shares calculation.
I think the most important thing to remember is that we are trying to model reality. When we attempt to assign absolute wins and losses to a player what we are really trying to do is answer the question: In games that this player's team won, what percentage of those wins is this player responsible for? And the reverse question for losses. And the fact is, negative wins and losses don't make any sense as an answer to these questions.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 8:52 p.m.,
November 20, 2003
(#37) -
ColinM
jimd,
That's a great explanation of the problems with the pythagorean scale. It's certainly a hurdle for any absolute wins type system.
tangotiger,
This is the first time I've been able to check out this site since yesterday afternoon and you pretty much pulled the thoughts right out of my head. I think that's exactly what you have to do in an absolute system, go only positive in wins and negative in losses. It seems your discussion with David Smyth may have touched on this concept a bit. The more I think about it, the more I think that using outs as a proxy for loss shares was a bad idea. If you go the absolute route, even a triple in a loss will be worth some negative "points". I wouldn't argue that an absolute system is better than a marginal approach, just different and another source of data for comparison.
studes,
If you're done with it, I'd like to play around with the absolute concept a bit more myself. I've got a few more ideas and see some errors in my original thoughts. If nothing else, I may have some small ideas that might merge with the relational approach as it is. Keep us updated with the pitching and fielding, you've done a great job with this so far.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 1:13 p.m.,
November 24, 2003
(#40) -
ColinM
"In the "absolute model", we start with the end of the game, 1-0. We then give 1.00 wins to the only players involved in the scoring, and in this case, it's the first 2 batters, and presumably, we give each batter 0.5 wins. The pitcher who threw to those two guys gets 1.0 losses.
Does anyone really think this captures reality?"
Maybe, but not as you've described it. I would think that the pitcher(s) for the winning team would get the biggest share of the "credit". I also think any reasonable system would not simply split the credit 50-50 between the two batters. You may end up with .8 going to the pitcher, .15 to the first batter and .05 to the second batter, or something along those lines. I would also guess that the hitters on the losing team would receive most of the "blame" for the loss, with a very small amount going toward the pitcher.
I doubt there's any single "right" model that can be used to assign credit in an absolute system. Who knows, some models may even be based off of some type of win probability in order to try to capture the impact of each event.
And as much as I do like the idea of a marginal wins system, I also think that you can run into reality problems just as you can within an absolute system. Take your example with the leadoff triple, where the winning pitcher ends up with about .6 wins. Now say that instead of the triple-SF combo coming in the first inning, it happens in the bottom of the ninth. I assume that more wins added are now credited to the two hitters than in the first example, and the pitcher has less wins added. But is the pitcher's shutout really any less valuable than in the first example? Does this capture reality?
An absolute system may treat both shutouts the same. Again, I'm not saying absolute is better, just different, and may capture some things that marginal wins doesn't, just as marginal wins can capture things that an absolute system may miss.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 2:52 p.m.,
November 24, 2003
(#42) -
ColinM
David, how can I view the fanhome thread?
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 3:25 p.m.,
November 24, 2003
(#46) -
ColinM
Thanks David, that is what I was looking for, I'm just not familiar with fanhome.
Tango, I think we are actually in agreement. Although I find value in a system which sees the two pitchers as the same after the fact, using the additional information of who won, I also see a marginal wins system as valuable.
As for WinShares, it wants to be after the fact, but since it covers all of baseball history it can't split up performance in wins and losses, so it has to approximate. I don't think there's anything else it can do.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 10:48 a.m.,
November 25, 2003
(#51) -
ColinM
David,
I hadn't actually read the fanhome thread yet (just didn't have enough time at work) when I posted the response to Tango. When I said we were in agreement, I was referring to what had been said so far on this thread. I meant that I think we both understand the difference between an absolute or marginal system even though we may value them differently. (This crossover thread between fanhome and here kind of reminds me of when Buffy and Angel were both running on the WB. Can I be Spike?)
Now that I have read the fanhome thread:
Tango might be OK with it, but I have some serious issues with your system for crediting absolute wins. You really have to deal with hitters' performance in losses, and pitchers' in wins. Consider two hitters, one who goes 0-4 and the other who goes 4-4, in a loss. Now I agree that in an absolute system neither hitter can get any positive credit because the team did not win. But that doesn't mean that they shouldn't be assigned any responsibility for the loss! In reality, every player on the losing team should be assigned some share in the loss. It's just that the first player will get a lot more "loss shares" than the second.
I don't yet know HOW such credit would be assigned, but just because it is more difficult to do doesn't mean it isn't the right way to go about it.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 3:33 p.m.,
November 25, 2003
(#53) -
ColinM
Tango, good news, we disagree again! Well not completely but close enough. First, I don't understand why you need to have this hard separation between marginal and absolute. Everything is marginal, whether you start from average or 0. Absolute wins are built with marginal contributions. It's the value of these marginal contributions that differs depending on whether you look at them real-time or after the fact.
I actually think I'm OK with the concept that hitters create wins and pitchers create losses. So let's go with that. Where I may differ is in the initial conditions.
Here's another way to think about it (which I've only partially thought out). The hitters start with 0 runs and therefore 0 wins. So what happens if the initial conditions do not change, if no runs score? The team loses. So is it valid to say that the hitters START a game with an implicit loss? This is balanced by the defense, where the initial condition is 0 runs allowed, or an implicit win, so the overall state is 0.
If a game ends in a 2-1 loss, the final state is -1. The responsibility of the defense for the loss is proportional to the amount of "marginal losing" it has contributed below its initial state of 1 win. The contribution of the offense to this loss is proportional to the amount left in its implicit loss after subtracting its "marginal winning", which in this case, results in more "loss shares" credited to the offense than the defense.
Win Shares, Loss Shares, and Game Shares (November 15, 2003)
Posted 2:41 p.m.,
November 26, 2003
(#58) -
ColinM
"Colin, from that standpoint then, there should be no problems with a negative total utility (i.e., negative win shares)."
Maybe, although I'm not necessarily taking that standpoint, just trying to point out that there are some other ways to look at the reality of the situation. And I think that it is a lack of reality that is the problem with David's absolute system. I know I'm going to contradict some things I've said at the beginning of this thread, and I admit that is true, but I've learned a lot over the course of this discussion, it's really made me think. And if you can't learn and change your views then it is probably not worth reading forums like these anyway.
It seems to me that the problem with an absolute system is that there really is no absolute when measuring the value of a player. All value, all contributions are relative. This has to be true, since there will always be 9 players on the field and in the lineup. You can't start a game without any hitters, someone will always come to the plate. If a player can't play someone will replace him. So when you attempt to answer how valuable a particular player was in the context of a game, you are ALWAYS making a relative comparison against some hypothetical replacement. The concept of trying to determine a player's contribution within a particular game is the same as asking one simple question: "Given the end state of the game (W or L) what is the probability that the opposite state would have resulted had this player not played in that game?". I really can't see any other way to look at it. While the question may be simple, the answer is not because the answer depends on how the player would have been replaced had he not played.
Even David's absolute system is actually a relative estimate of value. In his system the player in question is replaced with a player of absolute 0 value. So in this context, yes, David's method is the right way to go about it. Because if the team loses anyway, taking Player A out of the game and replacing him with the 0 value player will lead to a 0% increase in the chance that the team would have won the game (assuming all other results remain the same). He can then disregard a hitter's performance in all losses.
The problem with this is that a 0 value player does not exist in reality. And if this is true then wins above 0 do not reflect any type of reality. Even Bill Bergen was able to contribute something offensively above 0. And as soon as you move that "replacement level" to anything above 0 you have to start to consider hitters' performance in losses. By taking any of the losing hitters out of the lineup there is a chance that the team could have won the game with the replacement in his place.
So, after all of this, I'm finally forced to conclude that there is no absolute value, only relative value. And there is really no "right" answer for where to set the line for relative comparison, only best guesses. And value above 0 is not a realistic one.
Baseball Player Values (November 22, 2003)
Posted 1:31 p.m.,
November 24, 2003
(#12) -
ColinM
I have to echo FJM here. I certainly have to respect anyone who takes the time to do this sort of thing, but as much as I hate to judge a system by its results, some of the results with the pitchers do seem a bit strange.
For example, Caldwell over Guidry in 78? And no Clemens among the top 5 in 91, 92 or 98?
Baseball Musings: Defense Archives (December 5, 2003)
Posted 8:25 p.m.,
January 14, 2004
(#16) -
ColinM
Hey guys,
I wasn't sure where to post this, but here seems like a good spot. I was wondering if a couple of you could help me out a bit?
One of the things that I spend a lot of time doing is evaluating historical seasons. Comparing the top players in a given year, over time, etc... Of course the most difficult part of it all is trying to come up with a good estimate for defensive value. But now it seems like there are finally a few really solid systems out there that do a pretty good job of measuring this. The problem is, most of them aren't public.
Right now I have a database with Win Shares and can get Davenport's numbers from the net. Would any of you with your own systems be willing to share the results? Not the methods, just the final numbers would be great. Charles Saeger and Michael Humphreys, I've read the descriptions of your methods and would love to see the numbers if you wouldn't mind. Same for anyone else who has a good system and is willing to share.
If any of you have this stuff already published would you be able to post a link? Or email me if you'd rather. Thanks for any help.
Baseball Musings: Defense Archives (December 5, 2003)
Posted 1:21 p.m.,
January 15, 2004
(#18) -
ColinM
studes,
I'd love to share the DB, how could I ask for other people's stuff and not share mine! The thing is, I don't know if I can. As I'm sure you know, STATS sells a digital update to the Win Shares ebook that contains WS for every player. If I share a database with the same info am I infringing on this? OTOH the formula for WS is published. If I were to use this formula to calculate all of the historical WS myself, why can't I share the work I've done?
Probably I'm just being paranoid, but if any of the legal types that hang out here know the answers, let me know.
Absolute Wins Produced (December 8, 2003)
Posted 10:20 a.m.,
December 9, 2003
(#1) -
ColinM
Here is a brief summary of my problems with David's method:
- I'm not sure what question this method is trying to answer. What is meant by "value" in the context of this system? I cannot come up with a reasonable definition of value that does not include a relative comparison.
- The only relative comparison that makes sense for the way AWP is constructed is value over 0. If this is the case, AWP does answer the question. However, I propose that 0 value does not exist in the real world. There has never been a player or team in the history of baseball that had 0 chance of making a positive contribution in a given game.
- If you then want to compare a player or team to some replacement level above 0, you absolutely HAVE to consider hitters' performance in losses and pitchers' in wins. Because there is always a chance, however slim, that the use of a replacement(s) could have resulted in a different end result. This is true for both hitters and pitchers regardless of the actual end result.
- Finally, the world is based on probabilities, it really doesn't matter if this perspective likes it or not. You can't really say that if Joe Carter hit a 2 run homer in the first inning of a 2-1 win that the Jays would have lost without it. If he didn't hit the homer the whole sequence of events changes and we don't know what would have happened. We only have probabilities. Not sure how this relates exactly to AWP, I just think it should be kept in mind before handing out absolute credit.
So if value above 0 is what you want, then I suppose this will give that to you, I just don't think it has any relevance in the real world. And you have to leave it as TOTAL value above 0. You CANNOT start introducing AWP over replacement because you would have to consider that a replacement hitter could have produced a win in games which the team has lost. And at this point you must start considering hitters' performance in losses to account for that.
Absolute Wins Produced (December 8, 2003)
Posted 12:01 p.m.,
December 9, 2003
(#5) -
ColinM
Thanks for the quick reply, if we don't want to call it value added, then I suppose I have no problem with it. Like I said, extra information is good, as long as we realize what this information doesn't cover.
Your response is exactly what I was getting at with the first point in my initial objection. Before we start discussing a "value" stat, we need to define exactly what we mean by value. And I think there are certain logical requirements a concept of value needs to meet; this is what I attempted to outline in the rest of the post.
I have to say though that I find your perspectives summary a bit misleading in relation to AWP. Perspectives 1 and 3 obviously can be related to total-value stats (WPA, Lwts), so it makes it seem as if you are offering AWP as a total-value solution for Perspective 2.
Diamond Mind Baseball - Gold Glove Winners (December 11, 2003)
Posted 12:40 p.m.,
December 12, 2003
(#19) -
ColinM
"When giving yearly awards, you don't care if a player's one-year performance was luck or skill."
I want to comment on this statement, which dlf also touched on. While I agree with this in principle, I don't think it can work in practice, at least when it comes to handing out awards for fielding. It's not that I want to give credit to a player for performance in other years, it's just that I don't think the existing fielding metrics provide enough of a confidence level to accurately judge what a player's performance actually was in a single season.
Look at batting performance. You have a solid baseline to compare to, the league average. Whether you want to adjust this to a replacement level is a matter of taste, but at least you have something tangible to compare to, a level where you can be reasonably confident the "average" player would perform at. You want to adjust for park factors of course, but you can still be pretty sure it's close to the right value.
Fielding is different. Even the best methods like UZR (which I'm a bit in awe of) make a ton of adjustments and assumptions in order to estimate an "average" baseline. The confidence level just isn't there that the average it comes up with is "right", at least compared to batting or pitching. So it only makes sense to look at multi-year data in order to add extra information, to increase your confidence that the average baseline you're using is correct.
Do Win Shares undervalue pitching? (December 15, 2003)
Posted 11:01 a.m.,
December 16, 2003
(#14) -
ColinM
I'm late to the game here, but I think AED and Guy are on to something here. Actually, I believe I posted something very similar to AED's comments regarding replacement level about a month ago on one of those never ending Win Shares threads.
I can only make a short comment right now but I want to say this:
The 70-30 split makes sense when comparing runs saved to an AVERAGE baseline. There's no reason to assume this holds true when comparing to some hypothetical 0-level baseline like Bill James uses. The split can still be 70-30 compared to an average team, but it may be 80-20 or something when compared to a marginal team. I would imagine the split moves as the replacement line moves. I feel this is actually the source of the undervaluing of pitchers, more of those marginal runs at the really low level belong to the pitcher, not the fielders.
I may elaborate later if anyone cares...
Do Win Shares undervalue pitching? (December 15, 2003)
Posted 8:45 p.m.,
December 16, 2003
(#25) -
ColinM
Damn AED, that's some great stuff. I was going to elaborate on my earlier post but all I can really say now is... yeah, what he said. This is exactly the way I was thinking when I mentioned a moving pitching/fielding split depending on the replacement level that you set. (Except I won't pretend to have been thinking in such nice, clear, statistical terms).
UZR, 2000-2003, Adjusted by Difficulty of Position (December 21, 2003)
Posted 12:54 p.m.,
December 23, 2003
(#23) -
ColinM
Tango,
OK, that regression seems reasonable. So Erstad probably has a "true" UZR of +31. But what's interesting to me is, how much more should UZR be regressed before you can combine it with an offensive measure to get a total value stat?
What I mean is, in a situation where you want to combine UZR and an offensive rating, shouldn't there be a further regression applied to UZR in order to account for the amount of confidence we have that UZR is actually measuring the right thing?
In another thread you provided a comparison between UZR and David Pinto's method. Both methods seem great, but there are still some pretty big differences there. The r you found was .69. What would the r be between RC and BaseRuns or XR or EQR, etc...? I would guess it would be quite a bit higher, .9 or more? So if you do adjust UZR to account for confidence in the method, how much extra would you regress?
UZR, 2000-2003, Adjusted by Difficulty of Position (December 21, 2003)
Posted 11:41 a.m.,
December 24, 2003
(#32) -
ColinM
No MGL, I don't agree with that. In Tango's example he says that after regression, Erstad's "true" UZR value is +31. I'm not arguing that. I'm sure that this is the best guess for Erstad's true UZR. What I'm arguing is that UZR itself is not as good an estimate of a player's REAL defensive value as most offensive measurements are of a player's REAL offensive value.
Now I'm not trying to criticize UZR here. You do an unbelievable job with it and it is the best thing I've seen for defense. However, as I've already pointed out, all of the many offensive measurements out there correlate extremely well with each other. Tango says this is mainly because they are measuring the same thing. But the reason they are measuring the same thing is because we are really damn sure that this is the best way to measure offensive production!
UZR on the other hand, does not correlate nearly as well to other defensive systems like Pinto's. There just can't be as much confidence that UZR measures real defensive value as well as LWTS measures real offensive value. And you can't just add together two numbers that you have differing levels of confidence in if you want to have the most accurate rankings. You have to further regress UZR. How much further? Well, that's what I was hoping to find an answer for here...
UZR, 2000-2003, Adjusted by Difficulty of Position (December 21, 2003)
Posted 12:39 p.m.,
December 24, 2003
(#34) -
ColinM
Tango,
I'm probably not writing very clearly because you've missed my point entirely. We're talking about two completely different measures of confidence. When you give the 95% CI this is a statistical measure estimated (I think) using UZR's year to year correlation with itself. What I'm talking about is confidence that UZR makes a useful measurement, which is a totally separate thing.
Look, here's a silly example. I created a stat called Defensive Runs Prevented. I calculate this stat by taking the amount of letters in a player's last name and subtracting 7. Erstad had -1 DRP last year. This stat has a 100% year to year correlation, so I'm 100% confident that Erstad's "true" DRP is -1. It needs no regression to the mean! But how much extra should I regress it before I combine it with Off LWTS? Why 100% of course, since I've got no confidence at all that it actually measures defensive value.
See what I mean?
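For anyone who wants the silly example spelled out, here is a small Python sketch. The stat itself is the one defined above (letters in the last name minus 7); the `total_value` function and the 10 runs of offense are hypothetical, and the extra regression is simply a number chosen by judgment rather than anything estimated from data.

```python
def defensive_runs_prevented(last_name):
    """The deliberately silly stat: letters in the last name minus 7."""
    return len(last_name) - 7


def total_value(offense_runs, defense_runs, extra_regression):
    """Combine offense with a defensive number after shrinking it toward 0.

    extra_regression is the share (0 to 1) removed for lack of confidence
    that the defensive stat measures real defensive value at all. It is a
    judgment call here, separate from any year-to-year regression.
    """
    return offense_runs + (1.0 - extra_regression) * defense_runs


drp = defensive_runs_prevented("Erstad")   # 6 letters - 7 = -1
print(drp)
print(total_value(10.0, drp, 1.0))         # 100% extra regression: offense only
```

The stat is perfectly repeatable from year to year, yet with 100% extra regression it drops out of the total entirely, which is the distinction between the two kinds of confidence.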
Valuing Starters and Relievers (December 27, 2003)
Posted 11:41 a.m.,
December 29, 2003
(#26) -
ColinM
Great discussion. This site is tough to keep up with if you go on vacation for a bit!
Got a bit of a problem with how you might set different replacement levels for starters and relievers. Say for example, Curt Schilling averages 8 innings a start. Is it right to just compare him to a replacement starter in order to find his value over replacement? There's no way that his hypothetical replacement would be expected to pitch 8 innings. So wouldn't his true replacement be something like 5 IP of starting pitching and 3 of relief?
Same thing might be a problem if you have real good middle relief (think Mark Eichhorn in '86). In a lot of cases the true replacement might have been a starter pitching an extra inning. The replacement line is fuzzy if you want to split it between starters and relievers because there is no rule for what inning to bring a reliever in.
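To put the Schilling example in numbers, here is a minimal Python sketch of a blended replacement baseline. The two replacement levels (runs allowed per 9 innings) are hypothetical placeholders, not measured values, and the function name is made up for illustration.

```python
def replacement_runs(innings, starter_innings_cap, starter_ra9, reliever_ra9):
    """Runs a replacement would allow over `innings`, split between a
    replacement starter (up to the cap) and a replacement reliever."""
    starter_ip = min(innings, starter_innings_cap)
    reliever_ip = innings - starter_ip
    return (starter_ip * starter_ra9 + reliever_ip * reliever_ra9) / 9.0


# Hypothetical replacement levels, in runs allowed per 9 innings.
STARTER_RA9 = 5.5
RELIEVER_RA9 = 5.0

# An 8-inning start replaced by 5 IP of starting plus 3 IP of relief,
# versus the same 8 innings charged entirely to a replacement starter.
blended = replacement_runs(8, 5, STARTER_RA9, RELIEVER_RA9)
starter_only = replacement_runs(8, 8, STARTER_RA9, RELIEVER_RA9)
print(blended, starter_only)
```

Whenever the replacement reliever is better per inning than the replacement starter, the blended baseline is tougher than the starter-only one, so the pitcher's value over replacement comes out smaller.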
Valuing Starters and Relievers (December 27, 2003)
Posted 3:16 p.m.,
December 29, 2003
(#31) -
ColinM
But Guy, whether it seems to penalize Schilling or not, isn't that what really happens? If you have to replace Schilling, the replacement starter won't be as durable, so his innings will be replaced by a combination of starter and relief innings. If you want to give extra value to Schilling for saving the bullpen for another day then that is a separate thing altogether. I don't think you can just compare all of his innings to a starter benchmark and hope it evens out in the end.
Valuing Starters and Relievers (December 27, 2003)
Posted 3:52 p.m.,
December 29, 2003
(#36) -
ColinM
I think Tango's line of thinking is definitely worth checking out before jumping to two different levels of replacement.
Not sure if I agree with the per game thought Studes. It might be best to start at a seasonal level. In theory, a team's starter/relief innings split will be set at whatever split will maximize the overall effectiveness of its staff, given the players available and the leverage of the situations encountered.
So for any given pitcher, you'd have to figure out how he would be replaced if he couldn't pitch. How many of his innings would be replaced by starters? How many by relievers? How would the bullpen use change?
It's a thorny issue.
Valuing Starters and Relievers (December 27, 2003)
Posted 4:48 p.m.,
December 29, 2003
(#41) -
ColinM
Oh I agree Guy, we can't just leave aside the question of the value of extra innings by a starter. But it's not answered by comparing to only replacement starter innings.
But it makes sense that there would be a negative impact on the bullpen in the Schilling scenario. Those extra 60-70 innings are going to be thrown by a replacement reliever in theory. So for sure the bullpen suffers.
But look at it this way:
What if instead of Schilling, there were two pitchers, a starter and a reliever, who were exactly as productive as Curt. And what if they worked as a tandem, the starter pitched the first 5 innings and the reliever the last 3, so that at the end of the year they had the exact same combined numbers that Schilling had.
How would you evaluate them over replacement? Shouldn't their combined value be the exact same as Schilling's? (I know that two roster spots aren't as good as one, but that's a different issue).
Valuing Starters and Relievers (December 27, 2003)
Posted 9:00 p.m.,
December 29, 2003
(#42) -
ColinM
BTW,
I think I might be coming off as a bit too critical here and wanted to mention that I do think this was a good article by Guy. Which you can tell by the amount of discussion it generated!
MLB Timeline - Best players by position (January 14, 2004)
Posted 1:03 p.m.,
January 14, 2004
(#2) -
ColinM
Yeah, nice looking list.
I've done stuff like this before just for fun, though never made it look so good. It's a good way to pass the time on a bus or train. So who would you guys change? Looking just at first base, going back to the mid 70's (far as I want to go without looking at the data), I'd change it to:
Giambi 2000-2003 (Helton wouldn't be a bad choice though)
McGwire or Bagwell 1996-1999 (Bagwell really was just as good)
Thomas 91-95
Clark 88-90
Mattingly 85-87
Murray 82-84
Hernandez 79-81 (way better than Eddie in 79-80, just as good in 81)
Carew 76-78
MLB Timeline - Best players by position (January 14, 2004)
Posted 3:17 p.m.,
January 16, 2004
(#14) -
ColinM
Without Gwynn from '82 then it might make sense to have Dave Parker replace the end of Reggie's reign, say 75-80 (maybe Winfield in 79-80) and give 81-85 to Dwight Evans. I wouldn't start Gwynn until '86.
MLB Timeline - Best players by position (January 14, 2004)
Posted 1:36 p.m.,
January 23, 2004
(#18) -
ColinM
Sam,
I like the list but I don't see why there needs to be gaps. I mean someone had to be the best (true talent) pitcher in any given year, even if they didn't end up with the best results for that particular season. Take '96 for example. Smoltz, Brown and Hentgen were all clearly above their heads and Clemens and Pedro hadn't begun their great runs yet. But what about Maddux? Even though he had a bit of a down year, his peripherals were still excellent. And he went right back up to a dominating level the next year. I feel pretty comfortable saying he was still the best pitcher in the game in '96.
My nominees for the last couple of decades:
01-02 Randy Johnson
99-00 Pedro Martinez
97-98 Roger Clemens
92-96 Greg Maddux
86-91 Roger Clemens
84-85 Dwight Gooden
83 Dave Stieb
80-82 Steve Carlton
MLB Timeline - Best players by position (January 14, 2004)
Posted 5:08 p.m.,
January 23, 2004
(#25) -
ColinM
So if I understand this right, then the difference between a really good and a really bad clutch hitter might be as much as 1 win? That sounds like a pretty big finding to me.
Back to (sort of) the original topic, who would be better than Stieb for '83? It was a pretty weak group but someone has to be the best. If you were God and could replay the 83 season a million times, some pitcher would end up the most valuable overall and I'm guessing it would be Stieb. I think you could make a case for Valenzuela or Quisenberry or Morris too.
MLB Timeline - Best players by position (January 14, 2004)
Posted 8:29 p.m.,
January 23, 2004
(#28) -
ColinM
Sam,
I'm not calling Stieb the best pitcher in 1983 because I think he had the best season. I actually think he didn't. I'm looking at a five year span from 81-85, trying to gauge his actual level of ability in '83, and saying that I think at that particular season in time, Stieb was the best pitcher in baseball from a true talent point of view. Carlton had declined from his late career peak, and Gooden didn't come along until the next year. So for one season, Stieb was the best pitcher that MLB had.
There's really no right or wrong way to do this. But I think that just because it isn't so clear cut who is the "best" for a moment in time, doesn't mean that somebody isn't. It's just harder to separate them from the rest.
Agreed about Rogers in '82 and Seaver-Carlton in '72.
MLB Timeline - Best players by position (January 14, 2004)
Posted 8:57 p.m.,
January 23, 2004
(#29) -
ColinM
And just for the hell of it,
Best Player in Baseball
1991-2004 Barry Bonds
1989-1990 Rickey Henderson
1985-1988 Wade Boggs
1983-1984 Cal Ripken
1977-1982 Mike Schmidt
1973-1976 Joe Morgan
1970-1972 Johnny Bench
1967-1969 Carl Yastrzemski
1959-1966 Willie Mays
1956-1958 Mickey Mantle
1954-1955 Willie Mays
1949-1953 Jackie Robinson
1948 Stan Musial
1941-1947 Ted Williams
1936-1941 Joe DiMaggio
1934-1935 Lou Gehrig
1932-1933 Jimmie Foxx
1930-1931 Lou Gehrig
1918-1929 Babe Ruth
1909-1917 Ty Cobb
1900-1908 Honus Wagner
Clutch Hits - Tango's 11 points to think about --- to understand why we regress towards the mean (February 12, 2004)
Posted 10:51 p.m.,
February 22, 2004
(#15) -
ColinM
Not that I disagree with you Tango, but I'm laughing my ass off right now!
I'm just picturing one of my long discussions, over a bottle of scotch, with my good buddy who's a PhD in theology. He asks how we can accept the concept of a Platonic universe given all we know about quantum physics, etc...
And I answer, "With regression". The answer to all of cosmology's problems in two words!
ARod and Soriano - Was the Trade Fair? (February 16, 2004)
Posted 9:02 a.m.,
February 17, 2004
(#11) -
ColinM
studes makes a lot of sense here. This basically follows what SABR types have been saying for years. Why waste money trying to go from a 65 win team to a 75 win team? When the young guys are ready, spend the money and go for the playoffs. I suffered through enough of Gord Ash to know that 83 wins doesn't feel so good as a fan. (Unless you liked the '73 Mets of course).
I guess what it boils down to is that the extra wins that Arod provides are probably more valuable to the Yankees than the Rangers, as they could be the difference between them and the Sox (or the Jays!!!). In some way, this kind of relates to the never ending debate over the definition of MVP.
Baseball Prospectus - : Evaluating Defense (March 1, 2004)
Posted 1:00 p.m.,
March 2, 2004
(#20) -
ColinM
"Of course it is statistically significant (pretty much anything is if you put your level low enough). I meant if it was significant at the 95% level. "
The problem with this is that you run into the old selective sampling issue. The reason why we're looking at Rolen to begin with is because someone noticed how out of line his '03 was with the rest of his career. By chance alone, there should be a few players who are this far out of their range simply by luck.
I'd say that there's a decent chance that Rolen was just one of those players.
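A quick way to see the selective sampling point is the sketch below, in Python. It assumes a hypothetical pool of 30 regulars whose single-season deviations from their own norm are pure luck, independent and normally distributed; the pool size, the 1.96 cutoff, and the function names are my assumptions, not anything taken from the Rolen data.

```python
import random


def prob_at_least_one_outlier(n_players, tail_prob):
    """Chance that at least one of n independent players lands in the tail by luck."""
    return 1.0 - (1.0 - tail_prob) ** n_players


def simulate_outlier_rate(n_players, z_cutoff, trials, seed=0):
    """Share of simulated seasons in which some player is beyond z_cutoff by luck alone."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each player's deviation is pure noise: no real change in ability.
        if any(abs(rng.gauss(0.0, 1.0)) > z_cutoff for _ in range(n_players)):
            hits += 1
    return hits / trials


N_PLAYERS = 30  # hypothetical number of regulars being eyeballed
analytic = prob_at_least_one_outlier(N_PLAYERS, 0.05)
simulated = simulate_outlier_rate(N_PLAYERS, 1.96, trials=2000)

print(f"analytic chance of at least one 'significant' season: {analytic:.3f}")
print(f"simulated chance: {simulated:.3f}")
```

The point of the design is that the test is applied after the fact to whoever looks most out of line, so the relevant probability is the one for the whole pool, not the 5% that applies to a player picked in advance.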
Baseball Prospectus - : Evaluating Defense (March 1, 2004)
Posted 1:33 p.m.,
March 2, 2004
(#23) -
ColinM
Even taking out the bold part doesn't seem like enough. If BPro wants to discuss the state of defense evaluation, they should at least mention that there are better metrics available! If they don't know about UZR, then they really aren't cutting edge anymore, are they?