Tango on Baseball Archives

© Tangotiger

Are Managers Optimizing Their Best Relievers?
by Tangotiger

Background

Recently, I looked at how much impact Bruce Sutter, Goose Gossage, and Lee Smith had as relievers. I introduced a concept called the Leverage Index (LI) which gives more weight to the innings in crucial situations, since those situations will impact the final game outcome the most.

A while ago, I also discussed the Defensive Responsibility Spectrum, which is nothing more than Voros' DIPS. I also introduced a simple measure called FIP (Fielding-Independent Pitching), though I should really call it mFIP (with m meaning mostly). That formula is

mFIP = ( 13 * HR + 3 * BB - 2 * K ) / IP

I have an extended version (xFIP) of this that also includes HBP, BK, PO, WP, and excludes IBB.

In any case, the formula itself isn't too important, other than to have a decent measure to try to select from a group of pitchers.
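As a quick illustration, the mFIP formula above can be written as a small function. This is only a sketch: the function name and the sample stat line are made up for the example, and IP is assumed to be in decimal innings (so 70.1 in box-score notation would be entered as 70.333).

```python
def mfip(hr, bb, k, ip):
    """mFIP = (13*HR + 3*BB - 2*K) / IP, as written in the article.

    ip is assumed to be decimal innings pitched, not box-score notation.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * bb - 2 * k) / ip


# Hypothetical stat line, invented purely for illustration.
sample = mfip(hr=5, bb=20, k=80, ip=70.0)
print(sample)  # (65 + 60 - 160) / 70 = -0.5
```

Lower is better here, since home runs and walks add to the total and strikeouts subtract from it.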

10 Top Relievers

Using the extended FIP version, here are the 10 top relievers (neither league- nor park-adjusted), along with their HR, NIBB (non-intentional walks), and K per 9 innings, using only performances as a reliever, with at least 800 BFP over the last 4 years:

 
Reliever          HR    NIBB   K
Robb Nen          0.6   2.2    10.7
Octavio Dotel     0.7   3.2    12.3
Mariano Rivera    0.5   1.9     7.8
Trevor Hoffman    0.8   1.9    10.1
Billy Wagner      0.9   2.9    12.0
Keith Foulke      0.8   1.8     8.9
Steve Karsay      0.6   2.1     7.7
Paul Shuey        0.5   3.7    10.3
Mike Stanton      0.6   2.5     8.0
Ugueth Urbina     1.0   3.0    11.8

A somewhat surprising list. The similarities between Billy Wagner and Octavio Dotel, and between Mariano Rivera and Steve Karsay, stand out. Paul Shuey? Yes.

Leverage Index (LI)

We have a list of, if not the top 10 relievers in the league, arguably 10 of the top 20. How did their managers use them?

Here are their LI, over the same time period. For a frame of reference, Bruce Sutter's career LI is 1.90, while an average LI would be 1.00. Bob Stanley's career LI is 1.30.

 
Reliever          LI
Robb Nen          1.85
Trevor Hoffman    1.85
Mariano Rivera    1.72
Billy Wagner      1.68
Ugueth Urbina     1.56
Keith Foulke      1.30
Steve Karsay      1.29
Paul Shuey        1.29
Octavio Dotel     1.28
Mike Stanton      1.08

The Big Boys

Robb Nen and Trevor Hoffman have been used very well. Not optimally (which is a topic for another article), but better than anyone else. Mariano Rivera's poor showing is at first surprising, and, a little later on, we'll take a more in-depth look at the New York Yankees reliever usage patterns over the last 4 years. Billy Wagner and UUU did have arm problems that may have given their managers a more conservative view in their usage patterns.

The Next Wave

Keith Foulke is the poster boy for sabermetrics relievers. His poor usage was not limited to just last year, with an LI almost 1.00. His top seasonal LI was 1.60. That the sabermetric-minded Oakland A's have traded for him is not surprising.

Octavio Dotel is stuck with Billy Wagner, and Steve Karsay is stuck with Mariano Rivera (though he hasn't been effectively used prior to that). These are two valuable pitchers who have not received all the exposure they deserve.

Paul Shuey. When I was a kid collecting cards, I knew every single player in the league, along with their teams and where they were traded to. As a teenager playing fantasy baseball, I also knew about the rookie crop, and everyone's primary position and age. As you become less and less involved in these kinds of enterprises, and as there are more teams to keep up with, some players slip right by you. Paul Shuey has slipped right by me.

Mike Stanton's very poor showing, along with Mariano Rivera's relatively low showing, demands more inspection.

New York Yankees 1999 - 2002

Here are the year-to-year breakdowns of all Yankee relievers with at least 150 BFP:

Reliever              LI
-- 1999 --
Mariano Rivera        1.50
Mike Stanton          1.11
Ramiro Mendoza        1.09
Jason Grimsley        0.82
Jeff Nelson           0.81
Dan Naulty            0.28
-- 2000 --
Mariano Rivera        1.73
Jeff Nelson           1.13
Mike Stanton          0.93
Dwight Gooden         0.55
Jason Grimsley        0.49
-- 2001 --
Mariano Rivera        1.75
Mike Stanton          1.24
Ramiro Mendoza        1.06
Brian Boehringer      0.89
Jay Witasick          0.85
Mark Wohlers          0.46
Randy Choate          0.39
-- 2002 --
Mariano Rivera        1.99
Steve Karsay          1.35
Ramiro Mendoza        1.05
Mike Stanton          1.03
Sterling Hitchcock    0.62

The team LI from 1999 to 2002 were: 0.88, 0.83, 0.98, 1.05. This is probably the price you pay (or rather earn) when you have starters that continue to pitch in high-leverage situations.

Within this context, the 1999 numbers of Rivera and Stanton look fine. In 2000, with even fewer leverage situations to work with, Rivera's numbers are excellent. Jeff Nelson (#25 over the last 4 years) had his great year, and was given the secondary role over Stanton. In 2001, things look normal. And in 2002, Rivera's LI jumps to 1.99, while Steve Karsay takes over the secondary role from Mike Stanton.

Mike Stanton should have the opportunity now to show more of his stuff in the higher leverage situations. While he now has Armando Benitez (#16) to contend with, Armando does come under a lot of scrutiny from the New York media and fan base. Scott Strickland (#26) is also around.

Conclusion

Some managers are effectively using their best relievers, and some are pitching them in more secondary roles. It's just a matter of time before the secondary relievers receive the prominence they deserve. Unfortunately, for some of these pitchers, this may come during the downward phase of their career.


December 31, 2002 - Slpashot

I wonder how the Yankees' 1996-2001 usage in the playoffs compares to their regular season patterns, and to other playoff teams?

December 31, 2002 - Bill James

Are Managers Optimizing Their Best Relievers?

No.

Hey, don't you read my books?

December 31, 2002 - Anonymous

Paul Shuey was always a good middle reliever for the Tribe. He has the nastiest stuff in baseball (Peter Gammons always says he has filthy stuff). But the Tribe couldn't rely on him as a closer. When his splitter wasn't working, he would bounce pitches in the dirt and walk guys, or leave it hanging in the strike zone and get whacked.

He had a postseason record of 1-4 with a 4.34 ERA. He couldn't be counted on in pressure situations. He only has 22 saves in 9 years. Plus, he was injury prone. He's missed a lot of games with groin injuries. The Tribe drafted him #2 overall in 1992 as their closer of the future. But he never lived up to it.

December 31, 2002 - James Withrow (e-mail)

I probably should have posted a couple of my concerns about your approach after your first articles on the subject of leverage, but I wanted to think about this subject a little longer.

But first, I'd like to suggest that the most optimal use of the best relievers would generally be as a starter. I'd have to have a very good reason as a manager not to turn Keith Foulke or Trevor Hoffman into a starter, so that I could get more innings out of him.

Here's my leverage concern, and I'm going to start with an analogy, which is always a little dangerous. When we look at how valuable players are, we usually don't weigh performance in the heat of a tough pennant race any more than we do the players' same stats in April. The usual thinking is that a win in April is just as important as a win in September.

Why don't we use the same thinking for relievers? Why is the 9th inning any more important than the 1st? If there's a one-run game, aren't each of the starters' six innings just as vital as the closer's ninth? Naturally, closers' innings are more likely to take place in one-run games than are starters, but isn't that really the question to look at here?

In fact, I would even argue that the first 7 innings of pitching are MORE important than the 9th because the score after the 7th (and often the 6th) influences the choice of relievers the opposing manager will use. I would estimate this to be worth at least one unearned per game, roughly figuring, which is major.

James

December 31, 2002 - Joe M

Is it reasonable to conclude that Rivera's LI could be somewhat deflated due to the fact that the Yankees have been consistent winners over the last few years, and therefore may have fewer late-inning high leverage situations available for their closer? I don't recall any mention in the previous article of a correlation between overall team LI and team W-L records (though I would expect really bad teams to have the lowest LI's for their relievers).

It would also be fun to see the converse of this study, i.e. what is the xFIP for pitchers with the highest LI? This would show us which pitchers are being tossed into high leverage situations without showing the performance to warrant that decision. I'm thinking our friend Six-digit Alfonseca would pop up on that list.

December 31, 2002 - tangotiger (www) (e-mail)

"But first, I'd like to suggest that the most optimal use of the best relievers would generally be as a starter."

Agreed.

"Why don't we use the same thinking for relievers? Why is the 9th inning any more important than the 1st?"

If you bring in Mariano Rivera with a 6-run lead 50 times, you won't change the outcome of the game compared with bringing in an average pitcher.

If you bring in Mo 50 times with a 1-run lead, the Yanks will win a few more games than if you brought in an average pitcher.

"If there's a one-run game, aren't each of the starters' six innings just as vital as the closer's ninth?"

I'm not taking anything away from the starters. Their LI is about 1.00.

"In fact, I would even argue that the first 7 innings of pitching are MORE important than the 9th because the score after the 7th (and often the 6th) influences the choice of relievers the opposing manager will use."

7 innings at an LI of 1.00 is 7 leveraged innings. 2 innings at an LI of 2.00 is 4 leveraged innings. Yes, the first 7 are more important, or at least, they have more impact on the final outcome of the game.

December 31, 2002 - tangotiger (www) (e-mail)

"Is it reasonable to conclude that Rivera's LI could be somewhat deflated due to the fact that the Yankees have been consistent winners over the last few years..."

I believe I mentioned that as a possibility that the Yanks pay (earn) this price.

"I don't recall any mention in the previous article of a correlation between overall team LI and team W-L records (though I would expect really bad teams to have the lowest LI's for their relievers)."

On my to do list. I should be able to come up with the LI, on a team-by-team, year-by-year basis, from 1974-1990. I expect the LI to peak with teams at .500, and slowly degrade the more the team's win% is from .500 (on either side).

"It would also be fun to see the converse of this study, i.e. what is the xFIP for pitchers with the highest LI?"

Also on my to-do list. I just ran a preliminary report for 1974-1990, and Todd Worrell actually tops the list at 1.97. Bruce Sutter is second at 1.90. The top of the list is all the usual suspects. The first name that I didn't recognize was Victor Cruz at 1.58. Next was Steve Foucault at 1.50.

Among "middle-relievers", Tim Burke was 1.54. He's a favorite of mine, and it certainly looks like he was used prominently. Paul asked earlier, and John Hiller was 1.62. Mike Marshall was 1.51.

Among pitchers with at least 2000 PA, Dave Tomlin was 0.73, and worst of the bunch.

December 31, 2002 - sleepy

Tango, I'm sorry to bring up a basic question, but I didn't quite get the math on your leverage index. If I read your past articles correctly, you take the difference between a plate appearance's win index and that of the plate appearance immediately after. Is that right?

My question: Which plate appearance immediately after? Add another out? Or something else.

Thanks. This is great work.

December 31, 2002 - tangotiger (www) (e-mail)

Thanks, I'm enjoying this as well!

The problem with the "out" is that sometimes an out increases your WE (win expectancy), say a flyball with a man on 3b in a tie game in the 9th inning. Strictly speaking, you have to look at the change in WE for every possible event, and then come up with the variance (and the frequency of those events). In essence, how much swing potential in winning does a particular game state provide? That's the question to answer.
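One way to read the description above is as a frequency-weighted spread of the WE changes. The sketch below computes that spread for two game states. The function name, the three-event breakdown, and every probability and WE value are invented for illustration; a real calculation would use the full set of events, and would be scaled so that an average situation comes out to an LI of 1.00, which this sketch does not attempt.

```python
import math


def swing_potential(we_now, outcomes):
    """Frequency-weighted standard deviation of the change in WE.

    outcomes is a list of (probability, we_after) pairs that together
    cover every event considered for this game state.
    """
    total_p = sum(p for p, _ in outcomes)
    if not math.isclose(total_p, 1.0):
        raise ValueError("event probabilities must sum to 1")
    changes = [(p, we_after - we_now) for p, we_after in outcomes]
    mean = sum(p * d for p, d in changes)
    variance = sum(p * (d - mean) ** 2 for p, d in changes)
    return math.sqrt(variance)


# Invented numbers, from the pitching team's point of view:
# (probability of event, WE after the event) for out, baserunner, home run.
tight_game = [(0.70, 0.56), (0.27, 0.40), (0.03, 0.10)]
blowout = [(0.70, 0.992), (0.27, 0.985), (0.03, 0.970)]

tight = swing_potential(0.50, tight_game)
loose = swing_potential(0.99, blowout)
print(tight > loose)  # the tight game offers far more swing potential
```

The point of using every event, rather than just the out, is that the measure describes the game state itself and not what the pitcher went on to do in it.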

I'd love a faster computer, as I'm running this on a 650 MHz (but 512 RAM). Sometimes, I have to run stuff overnight.

January 1, 2003 - Anonymous

Regarding Paul Shuey, he was labeled unfairly a failure as a closer when he was called up out of single A ball and inserted directly into the closer's role. Two years later, when he was dominating AAA ball and ready to pitch in the majors, his reputation was already tarnished. If you look at his percentage of saves + holds divided by saves + holds + blown saves, I don't see anything to suggest that he can't handle pressure.

The more substantial problem is his frequent trips to the DL. He spends half a year working his way up in the bullpen pecking order and then gets injured again about the time he starts earning a manager's confidence.
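The percentage described above is simple to compute. This is a sketch with a made-up function name and made-up counts, not Shuey's actual totals.

```python
def save_hold_rate(saves, holds, blown_saves):
    """(saves + holds) / (saves + holds + blown saves)."""
    chances = saves + holds + blown_saves
    if chances == 0:
        raise ValueError("no save or hold chances")
    return (saves + holds) / chances


# Hypothetical season, invented for illustration only.
print(save_hold_rate(saves=5, holds=15, blown_saves=5))  # 20 / 25 = 0.8
```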

January 1, 2003 - tangotiger (www) (e-mail)

I will give a performance breakdown for Shuey and Stanton, among crucial, normal, and non-crucial situations. Look for this in a few days. We'll see if they can "handle" the pressure...

January 2, 2003 - Fred Bobberts (e-mail)

The idea of "leveraged" appearances is interesting because it attempts to capture the essence of the sequential nature of bullpen decision making in the late innings. But I can't help wondering if it is leading analysts in the wrong direction. I remember some managers from the seventies and eighties, Billy Martin and Sparky Anderson come to mind, who would trot out their stoppers in the seventh inning in certain situations. Their reasoning was clear. They felt that the particular situation at hand was serious enough to be their most likely chance to lose the game RIGHT THEN. They were determined to put out the fire, and not wait to see if the situation would get better on its own. And they were both excellent bullpen managers.

If Milt Wilcox or Ed Figueroa tired in the seventh and allowed two runners in a one-run game against a good starter, well, it was Willie Hernandez time for Sparky, or the Goose was up for Billy. They weren't going to wait until they were two runs down and these men were no longer a potential factor in the game. The modern practice of only pitching your stopper in the ninth is very strange to me. I saw Buck Showalter throw away a good part of the entire second half of the 1995 season. He was just determined to use Wetteland for only the ninth, watching Perez and Wickman blow game after game. Check it: Wetteland pitched 61 innings in 60 games. You can't tell me he couldn't have had a positive effect on his team with 20-30 more innings pitched. Hey, I lived in New York and had to watch that season.

But Willie Hernandez's 140 innings in 1984 would have probably lowered his leverage figure for that year. I say this, but I haven't calculated it, so maybe I can't say squat. But I doubt the innings he pitched in the 6th and 7th would add to his figure. Does this really mean he was misused? To me the answer would be "not if his extra innings went to Sid Monge, Dave Rozema, or Glen Abbott".

If leverage is the key figure for bullpen use, then a manager should save his best reliever for use only in the ninth, with men on. Clearly, he could do some good in other situations. There has to be a way to calculate the costs and benefits of the use of certain pitchers in certain situations, and this is a step in the right direction. Finding this answer means finally understanding the value of good set-up men, and maybe publicizing their real value. Hats off to Ramiro Mendoza, Aurelio Lopez, Mike Stanton, etc.

January 2, 2003 - tangotiger (www) (e-mail)

What we are after is *not* to maximize a pitcher's LI, but rather to maximize his leveraged innings (LI x IP). For a reliever, an LI of 1.00 with 120 IP will have the same win impact as an LI of 1.50 with 80 IP. Of course, it's not that simple, as you have to take the totality of your starters and relievers, and maximize the leveraged innings for the good pitchers, and minimize the leveraged innings for the bad pitchers, such that all innings are accounted for. You have other constraints as well, with respect to the tiredness of a pitcher's arm, etc.

Mark Eichhorn, for example, had 200 leveraged innings (LI of about 1.3) in his great year. That is an excellent total.
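The leveraged-innings idea (LI x IP) is easy to check in code. This is a sketch with a made-up function name; the inputs are the figures quoted in this thread.

```python
def leveraged_innings(li, ip):
    """Leveraged innings = LI x IP."""
    return li * ip


# The two reliever workloads said to have the same win impact.
print(leveraged_innings(1.00, 120))  # 120.0
print(leveraged_innings(1.50, 80))   # 120.0

# Seven innings at an LI of 1.00 versus two innings at an LI of 2.00.
print(leveraged_innings(1.00, 7), leveraged_innings(2.00, 2))  # 7.0 4.0
```

Maximizing this product for the good pitchers, rather than LI alone, is why a heavy-workload reliever can come out ahead of a ninth-inning specialist with a higher LI.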

January 2, 2003 - Sean Smith

"But first, I'd like to suggest that the most optimal use of the best relievers would generally be as a starter"

The problem is that many (most? all?) of the top relievers would not be as effective as starters. Mariano Rivera had a 5.51 ERA when he was used as a starter. There might be another Derek Lowe out there (Foulke?) but I think most of the closers in baseball wouldn't cut it as starters. I wouldn't even consider guys like Percival, Nen, or Hoffman, who have been relieving too long. John Smoltz didn't go to the bullpen because Atlanta thinks he's more valuable saving 50 than winning 20 (although their media might). Smoltz is in the bullpen because he couldn't stay off the DL as a starter.

January 2, 2003 - tangotiger (www) (e-mail)

All things equal, you are better off having your pitcher as a starter.

Your considerations would be to take someone like Urbina and Wetteland, and determine their level of effectiveness as a starter or reliever.

Say that as a starter, their performance would be a win% of .600. And as a reliever, they would be .650. You know that you can get say 160 leveraged-innings as a reliever, or 220 leveraged-innings as a starter. What do you do?

Compared to a baseline level of .450 (the effective level of rejigging your whole pitching lineup), you get 160/9 * (.650 - .450) = +3.6 wins as a reliever, or 220/9 * (.600 - .450) = +3.7 wins as a starter. Essentially, a wash.

So, you really have to go into it deeply, determine the effectiveness level of all your pitchers based on the starter/reliever role, determine how you can best optimize your leverageable innings, and come up with your plan. It's not so easy, especially considering injuries throw a wrench in your whole plan. Unless you are the Yankees.
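The reliever-versus-starter arithmetic above can be reproduced with a few lines. This is a sketch: the function name is made up, and the .450 baseline, the win percentages, and the leveraged-innings totals are the hypothetical figures from this comment.

```python
def wins_above_baseline(leveraged_innings, win_pct, baseline=0.450):
    """Leveraged innings / 9 * (win% - baseline win%)."""
    return leveraged_innings / 9 * (win_pct - baseline)


as_reliever = wins_above_baseline(160, 0.650)
as_starter = wins_above_baseline(220, 0.600)
print(f"reliever: {as_reliever:+.1f} wins, starter: {as_starter:+.1f} wins")
# reliever: +3.6 wins, starter: +3.7 wins
```

Changing either win percentage by a few points tips the answer one way or the other, which is why the role-specific effectiveness of each pitcher matters so much.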

January 2, 2003 - tangotiger (www) (e-mail)

Shuey and Stanton breakdown

The leverage classes were broken up into high-leverage (LI of 2 or greater), low-leverage (LI of 0.5 or less), and the rest.
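The three classes can be expressed as a small bucketing function. This is a sketch; the function name and the label "normal" for "the rest" are choices made for the example.

```python
def leverage_class(li):
    """Bucket a plate appearance by its LI using the cutoffs above."""
    if li >= 2.0:
        return "high"
    if li <= 0.5:
        return "low"
    return "normal"


for li in (2.4, 2.0, 1.0, 0.5, 0.2):
    print(li, leverage_class(li))
```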

$H is non-Hr hits per ball in play. All the others should be self-explanatory.

Paul Shuey? He was at his best in high-leverage situations. Mike Stanton? He was by far at his best in high-leverage situations. Note the small sample of PAs. Note also that it's easier to get more WP in high-leverage situations, since high-leverage situations occur more often with men on base. In any case, Shuey's WP rate wasn't so high, relative to his other situations.

I think there are some interesting DIPS numbers in there as well. In the high-leverage situations, each pitcher gave up fewer hits per ball in play, and had fewer Ks as well. It is almost as if the pitcher had to bear down in the high-leverage situation, and therefore took a different pitching approach, thereby lowering his K rate and improving his $H rate. We may in fact find that pitchers DO control the hits per ball in play A LOT. And it may simply be the fact that once you reach the majors, the pitchers are similar in this regard overall.

January 2, 2003 - Ted Arrowsmith

Tango Tiger:

Great work as always. A few points:

The goals of pitching staff management are: (1) have your best pitchers pitch the most innings, (2) have your best pitchers pitch the highest leverage innings, (3) reduce the size of your staff so you'll have more room for bench players, and (4) use your pitchers in advantageous matchups (lefty-righty, curveball, flyball pitchers at Pac Bell, etc.). In modern baseball the desire for #4 has overtaken #3 to a point that those of us who grew up listening to Earl Weaver's team hit pinch-hit homers find alarming. Attempts to maximize leverage for your best players should only be made until they hurt your efforts to minimize innings and roster spots used by bad pitchers.

January 2, 2003 - FJM

I want to make sure I understand how the LI is calculated. Let's consider 2 hypothetical closers. #1's typical outing: he enters the game at the start of the 9th inning with a 1-run lead and retires the side 1-2-3. #2's typical outing: he also starts the 9th with a 1-run lead. But he gives up a couple hits and a walk before retiring the side. Would #2 have a higher LI than #1?

January 2, 2003 - tangotiger (www) (e-mail)

FJM: yes that is correct. The second guy was on a hotter seat, and that's what LI is reflecting. As I mentioned on another thread, LI is not about rewarding a player, but classifying each PA.

Note that a manager is choosing to bring back the same reliever. If he had chosen to replace the reliever after 2 hits with another reliever, we'd have no problem saying that the replacement was on a hot seat.

It doesn't matter who sets the fire. We are capturing the existence of the fire, and we are capturing that the manager is letting someone pitch in that fire.

Doug Drinen's reliever reports work based on when the reliever enters and exits the inning. This metric works great in other areas, for other purposes. Eventually, I'll probably create an LI for this as well.

January 3, 2003 - FJM

My concern is that the LI as defined now may be biased against closers who allow very few baserunners per inning (e.g., Mariano Rivera). Is there an easy way to test this?

January 3, 2003 - tangotiger (www) (e-mail)

Well, I provided the LI for 10 top relievers of 99-02, as well as the historical LI for all pitchers in the 74-90 time period (see Clutch hits).

As for biased, again, there is no bias. It's a reflection of the game state for each PA. I know what you are saying about say John Franco or Mel Rojas being arsonists.

But it's not like there is a giant in a land of pygmies, even Mariano, that we should be concerned about. In the 74-90 time period, Clemens and Gooden are probably the giants. Their LI are 0.96 and 1.03. Hershiser was 1.03. Ryan was 1.05 and Blyleven was 0.98.

January 3, 2003 - FJM

Looking at the "giants" of the 1974-90 period doesn't prove anything. As you have pointed out elsewhere, it's almost impossible for a starting pitcher to have an LI significantly different than 1.00. It really only has meaning for relievers.

Over the last 4 years, Robb Nen has a WHIP of 1.125; Mariano's is 0.969. Nen has allowed 41 more hits and 22 more walks while pitching only 18 more innings than Rivera. I suspect that goes a long way toward explaining Nen's higher LI.

January 4, 2003 - tangotiger (www) (e-mail)

FJM: Again, I don't know how much effect it has, but I suspect a little. I'll find out eventually.

But again, remember the purpose of leveraged PAs. It's about describing the level of fire during that PA, regardless of whether that fire was arson or not. The manager is bringing back Mel Rojas, the arsonist, for the next PA.

As mentioned in another article, I can also create leveraged appearances, whereby I only note the fire level when the reliever is first brought in. This I will also do eventually. (Drinen essentially did this already.)

It's important to realize that a stat is constructed to answer a specific question, and it should not necessarily be used to answer other questions. Nor is it a shortcoming of the stat if it can't answer this new question.

January 4, 2003 - FJM

I understand your point. When a pitcher gets himself in trouble and his manager leaves him in, the manager is in effect making a new decision, to let him "relieve" himself. I agree with you, except in the case of closers. The way closers are used now, the manager really doesn't have to do any thinking at all. When the closer comes in, everybody knows he will stay in until either he gets the Sv or he gets a BS. There is simply no other option. I'm not saying that's the way it should be; I'm saying that's just the way it is.

Anyway, I guess I'll just have to wait until you get a chance to rerun the data using the reliever's entry point. Don't make me wait too long, OK?

In the meantime I have another request. I was surprised your TOP 10 list didn't include several of the top closers of the last 4 years, most notably Billy Koch (144 saves) and Troy Percival (142 saves). I think I know why Koch didn't make it: he really isn't that good. But I certainly thought Troy would make any Top 10 list. Is it really fair to include guys like Shuey who rarely faced high pressure situations and exclude guys like Percival who always did? How do we know Shuey would have performed as well as he did under pressure?

Actually, what I'd like to request is that you rerun the study using the Top 10 Closers of the last 4 years, using Saves rather than mFIP as the selection criterion. Also, since most of the top closers in 2002 were relatively new to the role (e.g. Smoltz, Williams, etc.), could you look at that group as well? Thanks a lot.

January 4, 2003 - tangotiger (www) (e-mail)

If you page up to my Jan 2 comment, you will see a link to Paul Shuey and Mike Stanton, and how they performed in the various leverage situations. Paul Shuey, and especially Stanton, have excelled in high-leverage situations, when given the chance. The sample size is small, so who knows.

I was surprised with Percy too. I thought he was better, but his K,BB,HR numbers don't compare with the best, though he would have come in the 11-20 list.

As for more analysis, I would love to do it. But my time is really constrained. I want to do an analysis on a team-by-team year-by-year basis for the last 4 years, and within that, show how each pitcher performed in the high-leverage and low-leverage situations. There is really so much I want to do, I don't know where to begin.

Right now, I'm taking a break from relievers and concentrating on baserunners.

January 6, 2003 - David Smyth

Tango, if you still check this thread, I came to this and the Sutter thread late, and I just want to give you your props. Really, really good stuff. I think we should take up a "collection" to get Tango the PBP from 2002. :)

January 6, 2003 - tangotiger (www) (e-mail)

David, thanks much! I'm actually bringing a lot of different concepts into all this, so it's rewarding to me as well.

As for 2002 PBP, astrosdaily.net has it, so I'm fine there. What I need is *time*. Can you help me there?

July 2, 2003 - Brian Schnooks

In regard to the Shuey/Stanton breakdown and pitchers possibly being able to control $H: what about the fact that in a lot of very high-leverage situations, hitters are more worried about striking out and therefore take a different approach at the plate? They are just trying to put the ball in play, and so are not making contact as hard as normal, which would lead to a lower $H%. Also, I would guess there would be more SH in high-leverage situations, leading to the lower $H% as well as lower K rates.