OPS: Begone! Part 2
by Tangotiger
Last week, I looked at 6 players of varying degrees of profile (high-walk, low-power to low-walk, high-power) who had an equal impact in run production when surrounded by typical teammates. We learned that to equalize these players using only OBA and SLG, we need to use the best-fit equation of 1.64*OBA+SLG.
After that article came out, I received two very interesting questions.
Two great questions, and so, let's find out the answers.
Same OPS, widely differing OBA,SLG
The following table presents 6 players with the same OPS.
AVG | OBA | SLG | OPS |
0.267 | 0.29 | 0.453 | 0.743 |
0.267 | 0.313 | 0.431 | 0.743 |
0.267 | 0.333 | 0.41 | 0.743 |
0.267 | 0.353 | 0.39 | 0.743 |
0.267 | 0.371 | 0.372 | 0.743 |
0.267 | 0.389 | 0.354 | 0.743 |
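To make the comparison concrete, here is a minimal Python sketch (not part of the original study) that recomputes plain OPS and last week's best-fit 1.64*OBA+SLG for the six players in the table above. The function names `ops` and `weighted_ops` are illustrative only.

```python
# OBA and SLG for the six players, copied from the table above (top row first).
players = [
    (0.290, 0.453),
    (0.313, 0.431),
    (0.333, 0.410),
    (0.353, 0.390),
    (0.371, 0.372),
    (0.389, 0.354),
]


def ops(oba, slg):
    """Plain OPS: on-base average plus slugging."""
    return oba + slg


def weighted_ops(oba, slg, oba_weight=1.64):
    """The best-fit combination quoted in the article: 1.64*OBA + SLG."""
    return oba_weight * oba + slg


for oba, slg in players:
    print(
        f"OBA {oba:.3f}  SLG {slg:.3f}  "
        f"OPS {ops(oba, slg):.3f}  1.64*OBA+SLG {weighted_ops(oba, slg):.3f}"
    )
```

The plain sum stays essentially flat across the six players (to within rounding of the published rates), while the weighted version rises as OBA rises.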
In order to construct a typical team, I have to adhere to the following constraints: each team will generate the same number of outs, and each player on the team will have the same number of PAs. Now, I'm counting outs as AB-H, though technically that is not true. You can also get outs on base. However, given that these teams are very similar to begin with, we'd expect the outs on base to even out. Here are the 6 teams that result from using the above players.
AVG | OBA | SLG | OPS | BsR | Diff from Baseline |
0.267 | 0.329 | 0.415 | 0.744 | 675.5 | -5.7 |
0.267 | 0.331 | 0.412 | 0.743 | 677.9 | -2.9 |
0.267 | 0.333 | 0.41 | 0.743 | 680.3 | 0 |
0.267 | 0.336 | 0.408 | 0.743 | 682.8 | 3 |
0.267 | 0.338 | 0.406 | 0.744 | 685.3 | 6 |
0.267 | 0.34 | 0.404 | 0.744 | 687.8 | 9.1 |
The first 5 players in that group are the most realistic looking of the players. We see that while they have the same OPS, they do not have the same run impact. In fact, there is a 12-run swing from one realistic end to the other (-5.7 to +6.0). So, if you rely on OPS, realize that you will be off by up to +/- 6 runs. And this is for this kind of player. If you choose star players instead of average players, the swing will be even greater, perhaps double that. It's up to the reader to decide if this is an acceptable deviation within his objectives.
Now, let's look at other run measures as well. The following table presents our 6 players, with their expected PAs and outs to conform to the above guidelines (that is, this is how they would do if they each played in the same number of games).
In addition to the typical numbers, I present
PA | Outs | AVG | OBA | SLG | OPS | BsR | BsR/440 | BsR+/- | LWTS | Team diff from baseline |
655.3 | 465.1 | 0.267 | 0.29 | 0.453 | 0.743 | 74.2 | 70.2 | -5.4 | -5.7 | -5.7 |
657.7 | 452.2 | 0.267 | 0.313 | 0.431 | 0.743 | 75 | 73 | -2.6 | -2.9 | -2.9 |
660 | 440 | 0.267 | 0.333 | 0.41 | 0.743 | 75.6 | 75.6 | 0 | 0 | 0 |
662.2 | 428.5 | 0.267 | 0.353 | 0.39 | 0.743 | 75.9 | 77.9 | 2.4 | 3 | 3 |
664.2 | 417.5 | 0.267 | 0.371 | 0.372 | 0.743 | 76 | 80.1 | 4.5 | 6 | 6 |
666.2 | 407.1 | 0.267 | 0.389 | 0.354 | 0.743 | 75.9 | 82.1 | 6.5 | 9 | 9.1 |
Alright, so what do we have here? We know that BaseRuns is not a precise measure of run production for hitters (only for pitchers and teams). However, how accurate is BaseRuns? We know that static Linear Weights is not a precise measure of an individual hitter (nor for a pitcher nor a team). But, how accurate is it?
The above table shows that static Linear Weights is very accurate (a virtual match). This statement comes with a disclaimer. Because I chose 8 typical players to form my team, and because the static Linear Weights values are based on the historical typical team, we should expect it to be very accurate. If I had chosen 8 typical 1960s hitters, we would not have achieved this level of accuracy using the static LWTS model. (We would need custom LWTS values.)
Among the first 5 realistic hitters, we see that BaseRuns is accurate to within 1.5 runs. That is rather impressive, considering that the interaction effect occurs at the team level, and not the individual level.
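The accuracy claims above can be checked directly against the table. The following is a small Python sketch that uses only the BsR+/-, LWTS, and team-difference columns; it treats the team-level difference as the true value, and the variable and function names are illustrative.

```python
# Columns copied from the table above, one entry per player (top row first).
bsr_plus_minus = [-5.4, -2.6, 0.0, 2.4, 4.5, 6.5]  # individual BaseRuns vs. baseline
lwts = [-5.7, -2.9, 0.0, 3.0, 6.0, 9.0]            # static Linear Weights
team_diff = [-5.7, -2.9, 0.0, 3.0, 6.0, 9.1]       # team-level result (the reference)


def gaps(estimates, truth):
    """Absolute miss of each estimate against the team-level number."""
    return [abs(e - t) for e, t in zip(estimates, truth)]


bsr_gaps = gaps(bsr_plus_minus, team_diff)
lwts_gaps = gaps(lwts, team_diff)

# Largest miss among the five realistic hitters, and across all six.
print("BsR, first five:", round(max(bsr_gaps[:5]), 1))
print("BsR, all six:   ", round(max(bsr_gaps), 1))
print("LWTS, all six:  ", round(max(lwts_gaps), 1))
```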
Same OBA, same SLG, widely differing AVG
The following table presents 6 players with the same OBA, same SLG, but widely differing batting averages.
PA | AB | H | 2B | 3B | HR | BB | Outs | AVG | OBA | SLG | OPS | BsR | BsR/440 | BsR+/- | LWTS | Team diff from baseline |
660 | 640 | 200 | 30 | 4 | 8.1 | 20 | 440 | 0.313 | 0.333 | 0.41 | 0.743 | 74.6 | 74.6 | -1 | -1.8 | -1.7 |
660 | 620 | 180 | 30 | 4 | 12.1 | 40 | 440 | 0.29 | 0.333 | 0.41 | 0.743 | 75.1 | 75.1 | -0.5 | -0.9 | -0.8 |
660 | 600 | 160 | 30 | 4 | 16 | 60 | 440 | 0.267 | 0.333 | 0.41 | 0.743 | 75.6 | 75.6 | 0 | 0 | 0 |
660 | 580 | 140 | 30 | 4 | 19.9 | 80 | 440 | 0.241 | 0.333 | 0.41 | 0.743 | 76.1 | 76.1 | 0.5 | 0.9 | 0.9 |
660 | 560 | 120 | 30 | 4 | 23.9 | 100 | 440 | 0.214 | 0.333 | 0.41 | 0.743 | 76.6 | 76.6 | 1.1 | 1.8 | 1.8 |
660 | 540 | 100 | 30 | 4 | 27.8 | 120 | 440 | 0.185 | 0.333 | 0.41 | 0.743 | 77.2 | 77.2 | 1.6 | 2.6 | 2.7 |
Essentially, as the walks and HR go up, I decrease the hits. In this way, I force the OBA and SLG to match, while varying the batting average.
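As a sanity check on that construction, this Python sketch recomputes AVG, OBA, SLG, and outs from the counting columns of the table above. It assumes the usual definitions (AVG = H/AB, OBA = (H+BB)/PA, SLG = TB/AB, outs = AB-H); the helper name `rates` is hypothetical.

```python
# (PA, AB, H, 2B, 3B, HR, BB) for each player, copied from the table above.
lines = [
    (660, 640, 200, 30, 4, 8.1, 20),
    (660, 620, 180, 30, 4, 12.1, 40),
    (660, 600, 160, 30, 4, 16.0, 60),
    (660, 580, 140, 30, 4, 19.9, 80),
    (660, 560, 120, 30, 4, 23.9, 100),
    (660, 540, 100, 30, 4, 27.8, 120),
]


def rates(pa, ab, h, doubles, triples, hr, bb):
    """Return (AVG, OBA, SLG, outs) from a counting line."""
    # Every hit is worth one base, plus one extra per double,
    # two extra per triple, and three extra per home run.
    total_bases = h + doubles + 2 * triples + 3 * hr
    return h / ab, (h + bb) / pa, total_bases / ab, ab - h


for line in lines:
    avg, oba, slg, outs = rates(*line)
    print(f"AVG {avg:.3f}  OBA {oba:.3f}  SLG {slg:.3f}  outs {outs}")
```

Only the batting average moves from row to row; OBA, SLG (to within rounding of the fractional HR totals), and outs are held fixed.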
We see that not considering the batting average in your OPS metric will have an effect of +/- 2 runs. We again see that Linear Weights, as expected, is an almost perfect match. BaseRuns comes to within about 1 run of the true value.
Conclusion
When using OPS, be aware of its accuracy and its limitations. As a back-of-the-envelope calculation, it works reasonably well. But, don't take it too far.
May 27, 2003 - Dave Studenmund
"We know that BaseRuns is not a precise measure of run production for hitters (only for pitchers and teams)."
Tango, I've completely missed this along the way. What do you mean by the above statement? Have you covered that in some of your previous articles? And why isn't it a precise measure for hitters -- because of team context?
Sorry if I'm rehashing old ground.
May 27, 2003 - Jason
I'm confused by the players with different batting averages. As I read the results, it looks like the players more dependent on secondary skills produce more runs per OPS than a player who relies on batting average. I was under the impression that batting-average players were a bit better than walk-and-slugging types.
May 27, 2003 - Nick S
Dave-
I think Tango means that BaseRuns assumes interplay of events in a lineup. That is, e.g., making fewer outs results in more plate appearances, which in turn results in more run production. For a whole team or a single pitcher (facing a continuous lineup) this is true. For a single batter it is not - unless he is batting nine times in a row and using "ghost runners" - because he generates extra PA at (mostly) his teammates' ability levels, not at his own.
May 27, 2003 - tangotiger
Nick, very well said, and I especially liked this:
"...because he generates extra PA at (mostly) his teammates ability levels, not at his own."
It would have taken me a paragraph to explain this, but you said it perfectly in half a sentence.
As for the batting average thing, I suppose that's another myth. It's pretty clear that given two guys with the same OBA and SLG, you want the guy with the LOWER BA (though in reality, we're not talking about much difference).
I suppose if you really needed to quantify it, probably something like 3*OBA+2*SLG-BA (I really don't know, but it would be of some form like that). I'll bow out of any discussion on trying to find the best-fit equation using OBA,SLG,BA. I already don't have much use for OPS, and I know I won't like OPSMB!
May 27, 2003 - Jason
Are you sure the batting average thing isn't context dependent, i.e., dependent on the run environment?
May 27, 2003 - tangotiger
Jason, interesting thought.
I just tried with a weird environment (OBA/SLG of .393/.493), and in this case, the higher the BA, the more runs scored. I then tried the other way, with .289/.351, and this time the LOWER the BA, the more runs scored.
The "break-even" point seems to be about .360/.450. That is, at that level, the change in batting average (and I checked from .200 to .340) made zero change to the run production of the team.
Great call!
May 27, 2003 - Brian
Without running the numbers, I think intuitively BA has more value in the very short-term. OBA depends in part on the pitcher, and SLG depends on infrequent XBH. So if I have a key at-bat I'd rather have a high BA guy up there who better controls his own fate.
I've toyed with Albert Pujols vs Carlos Delgado as an example (2002):
Pujols .314, .394, .561, .955 OPS
Delgado .277, .406, .549, .955 OPS
(Tango, this seems to fall under your high case environment.)
Of course using a higher OBA multiple gives Delgado a slight edge. This still doesn't sit well with me. For my money I prefer Pujols up there in a key situation.
May 27, 2003 - tangotiger
(www)
(e-mail)
"Key" situation is another topic entirely.
Click the above link, select your "key" situation, and plug in the numbers (on a /PA or /600PA basis). That'll tell you which guy you want.
If by key you mean inning/score as well as base/out, then you need another tool to evaluate it.
May 27, 2003 - tangotiger
I just want to make it clear: do not, absolutely do NOT, rely on OBA/SLG/AVG to make game decisions.
You must break it down to your components, and you must apply those components against the context being faced (base/out states, inning/score/base/out game state, game/pitcher state, etc, etc).
OPS is quick and dirty and has no place in game decisions. Relying on it for some cases will make you rely on it for most cases, and sometimes all cases. That's a bad habit to start. OPS, begone!
May 27, 2003 - Jason Belter
Well, I've thought about it, and my guess is that what you're seeing is that the correct break-even point is tied to the percentage of OPS determined by batting average. As batting average increases, most of your offense becomes singles; conversely, for a fixed OPS, a decrease in batting average shifts it to a walk/HR-dominated offense. Introducing a single into that type of offense is hardly going to be better than getting a walk, since advancing a runner isn't going to mean much if he only scores on HRs anyway. On the other end, in an offense consisting only of base hits, a walk will be much less valuable, since in most cases it won't advance a runner who was on, say, 3rd base following two singles and the like. Maybe that makes more sense when the offense is singles and doubles.
May 27, 2003 - bigcpa
Tango, how can you warn not to use OPS in a game context? Are you talking purely in a theoretical sabermetric world?
Considering most managers appear to use nothing more than batting avg, hr, rbi in creating lineups, wouldn't OPS be a quantum leap? We can't expect field managers to run computer simulations to make lineup decisions.
May 27, 2003 - tangotiger
Every game context produces different "win potential" for H, HR, BB, outs, SB, sacs. The values between those components are not static. In a completely "run potential" world, you would never call for an IBB or a sac. But in a "win potential" world, there are many many times that you need to call for the IBB or sac.
OPS, if left to its own devices, would become the de facto mechanism to evaluate game situations, when in fact its purpose is to gloss over player evaluations. I don't believe in taking baby steps, and the long path to get the job done. I also don't believe that we should hand hold the manager for 20 years to lead him to the proper tools.
Give them the right tools for the right job, and let them decide if they want it. If Felipe Alou says that looking at OPS is b.s. to decide whether to walk Bonds, I'm going to agree with him. Should I say that OPS is less b.s. than using BA? A rose by any other name...
May 29, 2003 - kenshin
"Give them the right tools for the right job, and let them decide if they want it. If Felipe Alou says that looking at OPS is b.s. to decide whether to walk Bonds, I'm going to agree with him. Should I say that OPS is less b.s. than using BA? A rose by any other name..."
I am not certain what you intended to mean with this statement. I fully agree that OPS does not constitute an ideal metric; however, if it is superior to those tools currently used, why not employ it in game situations? Additionally, Alou's lifetime of baseball experience does not validate his point. If I owned a team would I rather have Alou manage it than I? Yes. Yet, his superior ability does not add unwarranted credence to outdated and possibly erroneous methods.
May 29, 2003 - Walt Davis
Well, let me put words in Tango's mouth.
I think part of the confusion is over "game decisions". Tango appears to mean (mainly) in-game decisions. So, is OPS a better means than many managers currently use for deciding, say, who should be their regular LF? Yes, probably so.
However, within the context of an in-game decision, OPS may be as likely to lead you astray as BA or another metric.
Take the Pujols-Delgado example offered earlier. Which one you'd rather have up there in a "key" situation depends on what kind of "key" situation we're talking about.
If you're down 1 in the 9th with 2 outs, you probably want the guy with the best HR rate.
If you're down 1 in the 9th with 2 outs and a runner on first, you probably want the guy with the best XBH rate.
If you're down 1 in the 9th with 2 outs and a runner on second, you probably want the guy with the best BA.
If you're down 1 in the 9th with 2 outs and the bases loaded, you probably want the guy with the best OBP.
If you're down 2 in the 9th with nobody on base, you want the guy with the best OBP.
I say "probably" because there are other factors involved, like how fast the runner is, how good is the next hitter, who's pitching, etc.
Still, you can see that for all five of those scenarios, OPS is pretty much useless. So are EQA or LWTS or probably any other meta-metric.
May 30, 2003 - tangotiger
Yes, what you want is win-based LWTS (or a sim). And I would guess that a manager will be able to be right (using only his experience) more often than using just OPS, in a tight in-game decision.
May 30, 2003 - Peter Keating
I appreciate the utility of Base Runs and Linear Weights, but I'm wondering if some of the examples Tango has used run into the apples-and-oranges problem in using OBP and SLG, namely that they have different denominators.
In the Clutch Hits thread that originated with the Thomas Boswell article on how sabermetric principles are taking hold in MLB, someone asked whether, given the same on-base average and slugging percentage, a higher batting average is more valuable. After some back-and-forth, a couple of posters wrote that BA doesn't matter; if Player A has a lower BA than Player B but an equivalent OBA and SLG, then he must be compensating with more walks and isolated power.
Tango gives a different answer here: he says that if we hold OBA, SLG and the number of outs constant, reducing BA increases run generation, as measured by Base Runs or Linear Weights, in a small but noticeable way.
But this is the opposite of what Runs Created says, and even though Tango has deconstructed RC, I think the discrepancy tells us something.
Suppose Player 1 bats .250, with a .400 OBA and .500 SLG. In 600 plate appearances, or 360 outs, he will create 96 runs by the basic RC method:
120 hits in 480 AB, with 30 2B, 30 HR, 120 W = 240 times on base, 240 total bases, 600 PA = 96 RC
Now suppose Player 2 has the same OBA and SLG in the same number of plate appearances and outs, but hits .350. He will create about 111 runs:
194 hits in 554 AB, with 23 2B, 20 HR, 46 W = 240 times on base, 277 total bases, 600 PA = 110.8 RC
What happens when we hold outs and OBA constant is that 'times on base' also has to stay constant. But because Player 2 has a higher batting average, he will walk less frequently and have more at-bats, more hits and more total bases than Player 1. And because the players have identical slugging percentages, Player 2 will add total bases at a rate proportional to the *at-bats* he is adding, not to his plate appearances or his outs. And that means his RC and RC/out will go up.
Note that Player 1 and Player 2 both create 0.2 runs per AB. But Player 2 is creating about 15% more runs than Player 1 on a raw or per-out basis. This is because RC = times on base x total bases / plate appearances, and has nothing to do with at-bats, while SLG takes AB as its denominator.
In other words, if we wanted Player 2 to maintain the same number of RC per out as Player 1 while increasing his BA, it's not SLG we should be holding constant, but 'advancement percentage,' or whatever you want to call total bases / plate appearances.
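The arithmetic in this comment can be reproduced with a short Python sketch. It uses the basic Runs Created form stated above (times on base x total bases / plate appearances); triples are taken as zero because none are listed for either player, and the function names are illustrative.

```python
def total_bases(hits, doubles, triples, homers):
    """One base per hit, plus the extra bases on doubles, triples, and homers."""
    return hits + doubles + 2 * triples + 3 * homers


def basic_rc(hits, walks, tb, pa):
    """Basic Runs Created: times on base * total bases / plate appearances."""
    return (hits + walks) * tb / pa


# Player 1: .250/.400/.500 in 600 PA. Player 2: .350/.400/.500 in 600 PA.
# Triples are assumed to be zero (none are given in the comment).
p1 = dict(ab=480, hits=120, doubles=30, triples=0, homers=30, walks=120, pa=600)
p2 = dict(ab=554, hits=194, doubles=23, triples=0, homers=20, walks=46, pa=600)

for name, p in (("Player 1", p1), ("Player 2", p2)):
    tb = total_bases(p["hits"], p["doubles"], p["triples"], p["homers"])
    rc = basic_rc(p["hits"], p["walks"], tb, p["pa"])
    outs = p["ab"] - p["hits"]
    print(
        f"{name}: TB {tb}  RC {rc:.1f}  "
        f"RC/AB {rc / p['ab']:.3f}  RC/out {rc / outs:.3f}"
    )
```

Both players come out at the same RC per at-bat, but Player 2 is higher per out, which is the discrepancy the comment is pointing at.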
To see why this is more than a technical argument about Runs Created, look at Tango's final table.
The bottom-line player (the guy who hits .185) is on base 220 times (100 hits, 120 walks), has 221.4 total bases (100 hits, 30 2B, 4 3B, 27.8 HR), and uses 440 outs in 660 plate appearances.
Using the same 440 outs in the same 660 PA, the top-line player (who hits .313) is on base 220 times (200 hits, 20 walks) and has 262.3 total bases (200 hits, 30 2B, 4 3B, 8.1 HR). Yet he is creating 2.6 fewer runs than the bottom-line player according to Base Runs, and 4.4 fewer according to linear weights.
My questions are:
1) How can injecting more than 40 extra bases into the same number of plate appearances or outs produce a negative result?
2) How can Base Runs and Linear Weights work if that's what they do?
May 30, 2003 - kenshin
Perhaps I am misunderstanding the argument about the relevance of BA in runs scored. However, doesn't a single generate marginally more runs than a walk (it allows runners to advance from 2nd to home)? Thus an individual with a higher batting average and identical OBP and SLG to another player should produce more runs than the lower batting average player.
May 30, 2003 - tangotiger
"1) How can injecting more than 40 extra bases into the same number of plate appearances or outs produce a negative result?"
40 extra bases on hits, but 100 less bases on walks.
May 30, 2003 - Peter Keating
'40 extra bases on hits, but 100 less bases on walks'
I'm not sure I understand this. In moving from the bottom guy to the top guy, we're not taking away 100 walks and adding 40 singles. We're taking away 100 walks, adding 100 hits and making sure that the 100 hits are configured so that the second player winds up with 40 more total bases, right?
May 30, 2003 - tangotiger
The differences between the top guy and the bottom guy: the bottom guy has
100 more walks
19.7 more HR
119.7 less singles
Everything else is the same.
Straight static LWTS says that works out to +33, +28, -56 = +5, or some such.
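That back-of-the-envelope figure can be reproduced roughly as follows. This Python sketch takes the top and bottom lines from the final table; the single and HR weights (.47 and 1.41) are the "simple LWTS" values quoted elsewhere in this thread, and the walk weight of .33 is an assumption inferred from "+33" for 100 walks.

```python
# Per-event run values. The walk weight is an assumption (see the note above).
WEIGHTS = {"BB": 0.33, "1B": 0.47, "HR": 1.41}

# Top (.313) and bottom (.185) lines from the final table in the article.
top = dict(H=200, D=30, T=4, HR=8.1, BB=20)
bottom = dict(H=100, D=30, T=4, HR=27.8, BB=120)


def singles(line):
    """Singles are the hits that are not doubles, triples, or home runs."""
    return line["H"] - line["D"] - line["T"] - line["HR"]


# Event differences, bottom guy minus top guy.
delta = {
    "BB": bottom["BB"] - top["BB"],
    "HR": bottom["HR"] - top["HR"],
    "1B": singles(bottom) - singles(top),
}

# Net run difference under static weights.
net = sum(WEIGHTS[event] * delta[event] for event in delta)

for event, change in delta.items():
    print(f"{event}: {change:+.1f} events, {WEIGHTS[event] * change:+.1f} runs")
print(f"net: {net:+.1f} runs")
```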
May 30, 2003 - Peter Keating
So then comparing them by looking at times on base and total bases points up a problem with Runs Created, rather than some problem with using OBP and SLG in combination even though they are denominated differently?
May 30, 2003 - tangotiger
RC has its own problems, magnified substantially when the HR/H or HR/PA becomes out of whack. RC does not model run scoring at all: it just got lucky that it looks like it models it. If you've got a computer, there's zero reason to use RC, when you've got BsR (unless you want to propose a model that's better).
I don't really care about the different denominators. The whole thing of OPS centers around: more good, less bad. The more walks, the more hits, the more TB, the less outs, the better the number. There's nothing inherent in OPS that ensures that the balance is proper. It's just plain old luck that for the run environment of MLB, that it works out that way.
Believe me, if the run environment was half what it is today, or double what it is, there'd be some other "quick" estimator that would get lucky to model run creation.
Sorry for the rant.
June 1, 2003 - Walt Davis
Peter, to put it in the most straightforward (but slightly misleading :-) way, think of it like this:
Player A has 240 TB and 120 BB for 360 bases produced; Player B has 277 TB and 46 BB for 323 bases produced.
In essence, B is trading 74 walks for 37 TB. That's not likely to be a good trade. Generally speaking a walk is worth 2/3 of a single. So if all 37 of those extra TB are singles, those are worth (on average) about 55 walks. That still leaves Player A about 19 walks ahead.
As Tango hints, it gets a bit more complicated than that because not all TB are created equal. Also non-intentional walks are worth more than intentional ones. Using the simple LWTS formula (as good a set of values as any):
a single is worth .47 runs, or .47 r/TB, and roughly 1.5 times a walk
a double is worth .77 runs, or .385 r/TB, and roughly 1.2 times a walk (on a per-base basis)
a triple is worth 1.05 runs, or .35 r/TB, and roughly 1.1 times a walk
a HR is worth 1.41 runs, or .35 r/TB, and roughly 1.1 times a walk
So if we were to compare 12 TB to a corresponding number of BBs, under the different scenarios:
12 singles = 18 walks
6 doubles = 14.4 walks
4 triples = 13.2 walks
3 HRs = 13.2 walks
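For anyone who wants to redo the conversion with a different number of total bases, here is a small Python sketch. It uses the run values and the rounded "times a walk" ratios exactly as quoted in this comment; the dictionary layout and the `walk_equivalent` helper are hypothetical.

```python
# Run value per event and the rounded per-base ratio to a walk, as quoted above.
EVENTS = {
    "single": {"bases": 1, "runs": 0.47, "vs_walk": 1.5},
    "double": {"bases": 2, "runs": 0.77, "vs_walk": 1.2},
    "triple": {"bases": 3, "runs": 1.05, "vs_walk": 1.1},
    "HR": {"bases": 4, "runs": 1.41, "vs_walk": 1.1},
}


def walk_equivalent(total_bases, event):
    """Walks worth about as much as `total_bases` earned entirely via `event`."""
    return round(total_bases * EVENTS[event]["vs_walk"], 1)


total_bases = 12
equivalents = {name: walk_equivalent(total_bases, name) for name in EVENTS}

for name, info in EVENTS.items():
    count = total_bases // info["bases"]      # how many of this hit make 12 TB
    per_base = info["runs"] / info["bases"]   # runs per total base
    print(
        f"{count} x {name}: {per_base:.3f} r/TB, "
        f"about {equivalents[name]} walks"
    )
```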
It brings up an interesting dilemma. You're at the plate with a 3-2 count. A pitch which is a ball but which you can hammer is coming to the plate. Do you swing? Well, to produce the same number of runs (on average), you'd better be able to get a hit with that pitch about 2/3 of the time or slug close to .800; otherwise you're better off with the walk.
Obviously the above varies with the situation. Which is Tango's other point.