Tango on Baseball Archives

© Tangotiger

Bonds, Pujols and BaseRuns (September 6, 2003)

Robert Dudek takes a look by using his own modified version of BaseRuns.
--posted by TangoTiger at 11:36 PM EDT


Posted 11:40 p.m., September 6, 2003 (#1) - Tangotiger
  I tried posting this at batters box, but couldn't.

****
I just came across this.

Runs are created on a game-by-game basis (getting a runner on in April won't help you win a game in June). So, your run evaluator should be based on a game-by-game basis.

As for why I didn't test on a seasonal basis, as pointed out elsewhere, because of the incredible clustering of teams to the mean, virtually any half-decent run measure will be acceptable. All that that means is that any deviations will be masked by the 90% of the teams that are close to the mean.

However, when I selected games with 3 HR each, and then grouped them together, that gives you a few hundred or whatever games. So, instead of trying to select 100 teams with 180 HR or whatever, I've given you essentially a couple of teams that hit HR at the pace of Babe Ruth! (And of course, I gave teams of HR at the 0 level, 1 level, 2, 3, 4...).

If nothing else, the one major point of BaseRuns, the one thing to keep in mind at all times, is that the HR does not generate runs the same way that the other events do. The more baserunners you have (at some point), the less valuable the HR. That's the takeaway from BaseRuns. That the HR does not have the ever-increasing value that all "multiplicative" methods say it does, or the always-stable value that all "linear" methods say it does. Its value increases to a point (up to an OBA of around .350 to .400), AND THEN, it diminishes.
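A rough way to see that rise-then-fall shape is to take a stripped-down BaseRuns (singles, home runs and outs only, with the .8 and 2.1 B-factor weights from the simple version quoted later in this thread) and ask what one extra HR adds as baserunners pile up. The function names and the inputs are made up for illustration, and where the peak lands depends on the coefficients and event mix, so this toy should not be expected to reproduce the .350 to .400 figure.

```python
def base_runs(singles, hr, outs):
    """Simplified BaseRuns using only singles, home runs and outs."""
    a = singles                    # baserunners (home runs excluded)
    b = 0.8 * singles + 2.1 * hr   # advancement factor
    c = outs
    d = hr                         # the home run hitter always scores
    if b + c == 0:
        return float(d)
    return a * b / (b + c) + d


def marginal_hr_value(singles, hr, outs):
    """Runs added by one extra home run (a "plus 1" style estimate)."""
    return base_runs(singles, hr + 1, outs) - base_runs(singles, hr, outs)


if __name__ == "__main__":
    # Hold outs and home runs fixed and keep adding baserunners.
    for singles in (0, 10, 30, 60, 100, 300):
        value = marginal_hr_value(singles, hr=1, outs=27)
        print(f"{singles:4d} singles -> extra HR worth {value:.3f} runs")
```

With nobody on base the extra HR is worth exactly 1 run; the value then climbs, peaks, and drifts back toward 1 as nearly every baserunner would have scored anyway.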

But, since no team actually exists at that level, then who cares?

But pitchers ARE their own teams, and you should care about that.

To evaluate hitters, custom-generated Linear Weights is probably the best thing to use.

Thanks for the interesting discussion!

Posted 8:20 p.m., September 7, 2003 (#2) - David Smyth
  It's always interesting to see what a different brain can do with BsR. I've probably come up with 20 decent versions, and none of them were very similar to Dudek's. I used the plus 1 method to see what the values are for the 1970-98 sample of teams:

BB (non IBB) .327
HBP .343
IBB .172
1B .460
2B .740
3B 1.019
HR 1.395
SB .181
CS -.312
GDP -.329
SF .111
SH .045
AB-H-K -.0802
K -.0866
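For anyone unfamiliar with the "plus 1" method: add one of an event to a set of totals, re-run the run estimator, and take the difference. The sketch below assumes a simple BaseRuns (the coefficients quoted later in this thread, not Dudek's modified version) and an invented team line, so its numbers will not match the list above.

```python
def base_runs(s):
    """Simple BaseRuns from a dict of team totals."""
    a = s["1B"] + s["2B"] + s["3B"] + s["BB"]
    b = (0.1 * s["BB"] + 0.8 * s["1B"] + 2.3 * s["2B"]
         + 3.6 * s["3B"] + 2.1 * s["HR"])
    c = s["OUT"]
    return a * b / (b + c) + s["HR"]


def plus_one(estimator, totals):
    """Marginal run value of each event: add one of it, re-run the estimator."""
    baseline = estimator(totals)
    values = {}
    for event in totals:
        bumped = dict(totals)
        bumped[event] += 1
        values[event] = estimator(bumped) - baseline
    return values


# Hypothetical team-season totals, invented for illustration only.
TEAM = {"1B": 1000, "2B": 270, "3B": 30, "HR": 150, "BB": 520, "OUT": 4050}

if __name__ == "__main__":
    for event, value in plus_one(base_runs, TEAM).items():
        print(f"{event:>3}: {value:+.3f}")
```

Because `plus_one` takes the estimator as an argument, the same loop works with any run formula that accepts the same dict of totals.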

The only value that I have a real problem with is the IBB. My understanding is that the .17 would be the correct value if the walked batter and the following batter(s) are avg hitters. Since that is usually not the case, the IBB is worth less, say .11 or maybe even .05 runs. And the win value is pretty much neutral from what I have read. So if you are going to put the player's BsR into some sort of win conversion, the IBB should probably be left out because they are not subject to the standard R/W converters. J Jarvis' latest research on IBB suggests that the value is heavily dependent on the batter's ability, and (therefore) that the IBBs to Bonds are probably an overall plus for the opposing team. To give him .17 runs for his IBB seems to me to be wrong. It is not Bonds' fault that he is so good that he can be neutralized in this way, but that happens to be a part of the game.

I'll also say that I don't like the partial baserunners in the A factor. Unless it results in a significantly more accurate result, I would stay away from that sort of thing. I (think I) do agree with the inclusion of CS in the A factor, because this "should" result in more accuracy without a corresponding sacrifice somewhere else (conceptually). It is always more accurate to include as much "known" information as possible, as long as it also applies similarly to individuals, and it is certainly known that baserunners who are CS or GDP have no chance of scoring. I personally would not include GDP (or SF in their correct weight) in a formula based on the official stats, because the official stats do not include the greater advancement potential on outs by GB batters. There is a bias against GB batters, because they have fewer SF and more GDP, but the greater advancement on GB outs is not included. This opinion is based on the study by Ruane.

Posted 8:20 p.m., September 7, 2003 (#3) - RossCW
  It seems to me that there are a couple parts of the analysis missing - or I missed it.

1) Runners do not score runs at the same rates even when compensating for teammates. Vince Coleman and Otis Nixon scored runs an average of 40% of the times they got on base other than by a home run. This is over 10 percentage points more than their teammates. At the other extreme is Willie McCovey, who scored only 20% of the time, almost 10 points lower than his teammates.

In fact, the variation in how often runners score when they get on base is larger than the variation in how often they get on base. So any analysis of a players value to his team which leaves this out is not complete.

2) It is quite clear that teams are greater or less than the sums of their parts. You can see this in that as a team's averages (AVG, OBP, SLG) increase, their runs scored generally increase even faster. That is the same single (or other event) on one team is worth more on another because of who the other players are.

If Tango is correct about the value of home runs dropping when OBP gets beyond a certain point, it would seem that the relative value of other contributions would have to increase.

Posted 9:50 p.m., September 7, 2003 (#4) - Tangotiger
  There are 3 reasons why Coleman, Willie, Raines, etc score more runs per time on base than McCovey and his ilk:

1 - They are faster (this adds about +/- .04 runs / time on base, if I remember my research)

2 - They have better hitters behind them (#2 through cleanup, as opposed to #5 thru 7)

3 - They lead off more, meaning they get on base with 0 outs more, meaning there are more PA opportunities to drive them in

Posted 9:52 p.m., September 7, 2003 (#5) - Tangotiger
  The average IBB is virtually win-neutral based on research I published a few months ago. I did not look to see whether the IBB to Bonds specifically are win-neutral as well, but they probably are.

Posted 11:12 p.m., September 7, 2003 (#6) - RossCW
  .04 runs per time on base.

So if Otis Nixon scores runs at a .440 rate when he gets on base, speed only accounts for .040 of that? If he had average speed, his runs per time on base would be .400. And McCovey's would rise to .240 with average speed.

2 - They have better hitters behind them (#2 through cleanup, as opposed to #5 thru 7)

Probably that is part of it - are there studies that have demonstrated that?

3 - They leadoff more, meaning they get on base with 0 outs more, meaning there are more PAs opportunities to drive them in.

What is the impact on the average leadoff hitter from this? I don't have data that breaks out by batting order.

I am doubtful that the difference is entirely attributable to batting order. If you are right about the value of speed being marginal, then any leadoff batter would be much more likely to score when they got on base than any cleanup hitter. I am sure there are some statistics by position in the batting order that would establish averages for each, but I don't have them.

Regardless of what causes the difference, it needs to be accounted for in any system of objective evaluation of their contribution to their team. The differences in how often players score when they get on base appear to be at least as great as the differences in how often they get on base.

Posted 9:58 a.m., September 8, 2003 (#7) - tangotiger (homepage)
  Ross, you can go to my site, and look for the link on "Batting Order". In there, I have MGL's run expectancy matrix by batting order and league (but only for 1999 I think). I have much better data with more years, and I'll be doing a lot more with it sometime in the upcoming months, and that work will happen to address your issues here, which are all legitimate.

In fact, the reason I started that batting order thread at fanhome was because I believed that Rickey Henderson and Tim Raines were being ripped off because their skills were optimally suited for the leadoff spot, but all run evaluation methods were not giving them that credit. I.e., they are able to leverage their particular skills more in the leadoff spot than others would. This impact, for Rickey in particular, I think amounted to almost 1 win per season. You can reasonably add a whopping 10 to 15 wins to Rickey's career simply on the fact that his skills were ideally leveraged in the leadoff spot.

Considering that a HOF is about +30 to +40 wins above average for their career, this +10/15 thing is an enormous impact that is simply not quantified by any other sabermetrician (but is probably intuitively recognized by the average fan).

Posted 10:29 a.m., September 8, 2003 (#8) - tangotiger
  Ross, by the way, MGL's superLWTS *does* take into account the "taking the extra base" performance of players. You can check it out. If I remember right, Juan Pierre and Derek Jeter do quite well.

Posted 2:54 p.m., September 8, 2003 (#9) - ColinM
  If IBB are truly win-neutral events wouldn't that be the same thing as saying that your average IBB would lead to the same expected value as pitching to the player? In other words, if an average player produces about .125 runs/pa, then an average IBB should be worth about the same (I know I'm ignoring inning/run context right now)?

Now assuming managers have this balancing act right, wouldn't you expect that on average Barry Bonds, like anyone else, would not be walked when the win-value is neutral for an average player, but when the win value is neutral for Barry Bonds? So in other words, it seems likely to me that the value of an average IBB for Bonds is much higher than for a typical player.

Extrapolating this line of reasoning, if we assume that IBB are win-neutral for most players, then wouldn't it make sense to deal with them as such:
Set the value of IBB to 0 and calculate BsR. Then adjust BsR as
BsR = BsR + (BsR/(PA-IBB) * IBB)

The result of this would be to treat each IBB as equal in value to an average non-IBB plate appearance for each player, in essence setting the expected win-value for the IBB as neutral. Does this make sense, or am I missing something?
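Written out as code, the adjustment above is a one-liner. This is only a sketch of that formula; the function name and the example numbers are invented, and `bsr_without_ibb` is assumed to be the player's BsR computed with the IBB value set to 0.

```python
def add_ibb_credit(bsr_without_ibb, pa, ibb):
    """Credit each IBB at the player's own BsR per non-IBB plate appearance."""
    pitched_to_pa = pa - ibb
    if pitched_to_pa <= 0:
        raise ValueError("need at least one non-IBB plate appearance")
    rate = bsr_without_ibb / pitched_to_pa
    return bsr_without_ibb + rate * ibb


if __name__ == "__main__":
    # Hypothetical hitter: 120 BsR in 540 pitched-to PA, plus 60 IBB.
    print(round(add_ibb_credit(120.0, pa=600, ibb=60), 2))
```

Algebraically this is the same as scaling BsR by PA/(PA-IBB), so a hitter's rate per plate appearance is unchanged by how often he is intentionally walked.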

Posted 3:07 p.m., September 8, 2003 (#10) - tangotiger
  I think you are on the right track, but we all should separate runs from wins. Since each component has its own runs-to-win conversion ratio, it makes little sense to compute a run evaluator using IBB, and then converting that overall runs to wins.

What you want to do is figure out the win value for each component.

Now, you make a great point, and I'll reiterate here: the win-neutral value of the IBB is for the GIVEN PLAYER and not a league average player.

That is, if the win expectancy in a given state is .764 with Bonds at bat and pitching to him and .761 with Bonds being IBB to face Santiago, then, the IBB is worth NEGATIVE .003 wins ... and this is the important part... relative to Bonds himself. So, if Bonds is +.010 wins / PA above average in his "pitched-to" PAs, then in this particular IBB PA, he'd be +.007 wins / PA.

If we assume that managers sometimes walk Bonds when they shouldn't and walk him when they should so that overall they are win-neutral PAs *to Bonds*, then you would do the following:

Compute Bonds' runs above average excluding IBB and convert to wins above average. Say that works out to +80 runs or +8 wins over 400 PA, or +.02 wins / PA.

Suppose that he gets 100 IBB. He gets credit for +.02 x 100 = 2 wins above the average player (or zero wins above himself).

His new wins above average is +10 wins over 500 PA (including the IBB).

Remember, the IBB is win-neutral relative to the player at bat, but not relative to the average player.
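The arithmetic in that example can be sketched as follows. The 10 runs per win converter is implied by the "+80 runs or +8 wins" figure; the function and variable names are just for illustration.

```python
RUNS_PER_WIN = 10.0  # implied by "+80 runs or +8 wins" in the example


def wins_above_average_with_ibb(runs_above_avg, pitched_to_pa, ibb):
    """Credit each IBB at the batter's own wins/PA from his pitched-to PAs."""
    wins = runs_above_avg / RUNS_PER_WIN
    wins_per_pa = wins / pitched_to_pa
    ibb_wins = wins_per_pa * ibb
    return wins_per_pa, ibb_wins, wins + ibb_wins


if __name__ == "__main__":
    per_pa, from_ibb, total = wins_above_average_with_ibb(80, 400, 100)
    print(f"{per_pa:.3f} wins/PA, {from_ibb:.1f} wins from IBB, {total:.1f} total")
```

The IBB credit is zero relative to the batter himself by construction; the 2 wins only appear when the comparison is to an average player.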

Great comment Colin!!

Posted 3:08 p.m., September 8, 2003 (#11) - tangotiger
  Colin, I'm rereading what you said, and you said it better than I did.

Posted 3:34 p.m., September 8, 2003 (#12) - David Smyth
  I'm not sure what the basis is for saying that IBBs are win-neutral for specific players. I thought the idea was that IBBs, on an overall basis, are win neutral. That doesn't imply to me that this applies to any specific player, just to the hypothetical composite player receiving avg IBBs.

Posted 3:56 p.m., September 8, 2003 (#13) - tangotiger
  Suppose you have bottom of the 9th, home team down by 1, man on 2b, 1 out. With everyone in the game an average player, the chances of the home team winning is .296. Suppose though that with Barry Bonds at the plate, the chance that the home team will win is .370. (I didn't check what it is, so let's go with that.) So, do you walk him or not?

Well, the win expectancy for bottom of 9th, with men on 1b and 2b, and 1 out, and down by 1 is .351.

So, insofar as the visiting team is concerned, walking Bonds is worth -.019 wins to the home team.

But, to Bonds himself, he turned a .296 situation if he was not the batter into a .351 situation because he was the batter after the event completed. That is worth +.055 wins for Bonds' IBB.

Because managers probably walk Bonds and INCREASE the chance that Bonds' team wins, we can guess that the win expectancy before a Bonds PA and after a Bonds PA to be virtually the same, following an IBB.

It's a win-neutral event to the visiting team, but a huge win-gaining event for Bonds himself.
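The two perspectives in this example are just two subtractions from the same post-IBB win expectancy. A small sketch, using the numbers above (the .370 is a guess in the post, and the function name is made up):

```python
def ibb_swing(we_avg_batter, we_star_pitched_to, we_after_ibb):
    """Return the IBB's win-expectancy change from two reference points."""
    vs_pitching_to_star = we_after_ibb - we_star_pitched_to
    vs_average_batter = we_after_ibb - we_avg_batter
    return vs_pitching_to_star, vs_average_batter


if __name__ == "__main__":
    offense_change, batter_credit = ibb_swing(0.296, 0.370, 0.351)
    print(f"offense vs. pitching to Bonds: {offense_change:+.3f} wins")
    print(f"Bonds vs. an average batter:   {batter_credit:+.3f} wins")
```

This reproduces the -.019 and +.055 figures: the same event is a small loss for the offense relative to letting Bonds hit, and a large gain relative to having an average batter up.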

Posted 4:37 p.m., September 8, 2003 (#14) - ColinM
  Thanks tango,

You're right of course about separating runs/wins, the concept only really makes sense in terms of wins. Although if you have a straight runs per win value, it really doesn't make any difference over the course of a season.

David, I agree with tangotiger's latest post. And I think it can be demonstrated that the average value of an IBB will increase as the quality of the batter increases (although tangotiger will have to verify whether the win-neutral hypothesis holds true). Think about it like this:
Barry Bonds is intentionally walked way more than any other player. This means that there are a number of situations where Bonds is walked but nobody else would be. In other words, there are a number of situations where the win expectancy is too high to justify putting anyone else on base. Given this, the average win expectancy of a Barry Bonds IBB must be higher than that of an average player (or any player at all for that matter).

I'd love to see some empirical data that suggests that this average value remains essentially win-neutral, it would seem to be the most logical result, given that's the case for an average player. If we can verify that this is true, we might have a nice way to deal with the intentional walk!

Posted 4:50 p.m., September 8, 2003 (#15) - tangotiger (homepage)
  Colin,

I'll be working on that in the upcoming months (my guess is that it is win-neutral by batter), but in the meantime, you may be interested in the above link.

Tom

Posted 5:35 p.m., September 8, 2003 (#16) - David Smyth
  Well, how does this jibe with the results of John Jarvis, who has done the most detailed studies of the IBB? In his latest, he seems to be claiming that, assuming that IBBs are handed out in an avg fashion (which may not apply to Bonds), if the batter has a SLG below .600 it is a bad move for the defense, and vice versa for batters with a SLG over .600.

Here are a couple quotes from his latest study: "I have shown...that the IBB creates runs (for the offense) when the batter receiving it has a SLG less than .600."

"The IBB is only justified (for the defense) for the very best players."

I realize that he is talking in terms of runs, while we are talking in terms of wins, but Jarvis is well aware that the relationship between IBB runs and wins is unique, based on his prior study, which showed that, while the (avg) IBB creates runs for the offense, the runs are distributed in a way which is overall win-neutral.

Something about this analysis by Tango and Colin does not ring true with me, based on all my prior reading on this subject. Usually when I go against Tango, I lose. But still...

Posted 6:49 p.m., September 8, 2003 (#17) - Tangotiger
  No, what he is saying IS consistent with what I am saying. The defense is better off walking Bonds (they gain say +.02 wins in the process).

But, from the perspective of Bonds, Bonds already gains +.20 wins just for being in the batter's box. By being handed 1B in that situation, his worth is now +.18 wins instead.

It's a question of which perspective you have, the offense, the defense, or the batter.

Posted 11:41 p.m., September 8, 2003 (#18) - RossCW
  But, to Bonds himself, he turned a .296 situation if he was not the batter into a .351 situation because he was the batter after the event completed. That is worth +.055 wins for Bonds' IBB.

If Bonds is walked in the situations where his hits would be most valuable, doesn't that imply that Bonds' other numbers are less valuable than their face value, since the average situation in which they were produced was less likely to create wins?

Posted 7:02 a.m., September 9, 2003 (#19) - Tangotiger
  Yes!

Just like Pedro would be less valuable if the opposing manager would be allowed to have him replaced for one batter when Pedro allows a runner to get on base, and replace him with the mop-up guy. (Not THAT bad, because Bonds does get to go to 1B.)

How much impact is this? I don't know, but it might be a bit. I did publish the "Win Probability Added" a few months ago, and Bonds' numbers were NOT out of this world (though they were pretty incredible and tops in the league), for 1999-2002.

Posted 2:02 p.m., September 9, 2003 (#20) - ColinM
  Great point RossCW. But Tango, I think you might be misinterpreting what the impact of that might be. In fact I don't think there is really any impact at all.

Let me see if I can work through this. Say you have a hypothetical Barry Bonds who for some silly reason is never intentionally walked. Let’s take the average win-expectancy for a PA by this bizarre world Bonds and set it to a baseline of 1.

Now let’s take the real Barry Bonds, who gets 70 IBB in a season. For real Bonds, we can divide his PA into two groups, group A (PA-IBB) and group B (IBB). Following what Ross said, group B typically occurs when Bonds' win expectancy is at its highest. So if group B has an average win expectancy of 2, then group A must have a reduced average win expectancy, maybe .85 or so. So yes, his group A PA are less valuable on average than they would be for bizarro Bonds. BUT, here's the key point: if the tradeoff in win expectancy for an IBB is neutral as hypothesized, then the average win expectancy for group B PA MUST remain the same, 2. So when you combine the two groups, win expectancy is still 1 and real Bonds is just as valuable as bizarro Bonds. So if this is true, then the method we have proposed for dealing with IBB would still be valid.
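The group A / group B argument is a weighted average, which a few lines can check. The 600 total PA is an assumption made here (the post gives only the 70 IBB and the group B average of 2), and the names are invented for illustration.

```python
def required_group_a_average(total_pa, ibb, group_b_avg, overall_avg=1.0):
    """Average weight the non-IBB PAs must carry to keep the overall average."""
    group_a_pa = total_pa - ibb
    return (overall_avg * total_pa - group_b_avg * ibb) / group_a_pa


def combined_average(total_pa, ibb, group_a_avg, group_b_avg):
    """PA-weighted average over both groups."""
    return ((total_pa - ibb) * group_a_avg + ibb * group_b_avg) / total_pa


if __name__ == "__main__":
    a_avg = required_group_a_average(total_pa=600, ibb=70, group_b_avg=2.0)
    print(f"group A average: {a_avg:.3f}")
    print(f"combined:        {combined_average(600, 70, a_avg, 2.0):.3f}")
```

With these inputs group A comes out a little under .87, in the neighborhood of the ".85 or so" guessed in the post, and the combined average returns to the baseline of 1.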

In fact I think the data supports this assumption. I posted earlier that if IBB are win neutral, then you would expect an average run value of about .125 if they occurred in a typical situation. But the BsR value used in the article has them closer to .17. This would suggest that the IBB are occurring at more leveraged times, as Ross pointed out, and as a result have more value than an average PA.

Long story short, Bonds' non-IBB PA are worth less than you would expect, but this is balanced out by the fact that his IBB are worth more than anyone else's. In fact I'm coming to believe that an average IBB by Bonds may be more valuable on average than a regular BB by anyone else. It's also interesting that one result of his "regular" PA having reduced value would be that he would have fewer RBI than predicted. Which is exactly the case. But now I'm rambling on...

Posted 3:28 p.m., September 9, 2003 (#21) - tangotiger
  Be careful with your use of runs. Is the .125 absolute runs, or marginal runs?

The IBB has a marginal run value of .17 runs or so for the average player from the perspective of the team, and probably including Bonds. The win-value of the IBB is win-neutral from the perspective of the player, as discussed.

The NIBB has a marginal run value of about .32 runs for the average player, and probably for Bonds as well. Though, my guess is that for Bonds, because he probably gets a lot more NIBB with 1B open, the walk is worth less to Bonds (maybe .28 runs or something).

Now, suppose you have 2 equals, and we'll call them Pujols and Bonds. But one of them gets IBB a lot, and the other, not as much. In the cases where Bonds can do a lot of damage, he gets IBB. But, Pujols gets pitched to, and as a result can create more wins than Bonds in the exact same situation.

That is, if we have that late and close situation where walking Bonds will have a win expectancy of .35 and facing him will be .37, they walk him. But Pujols, they face him, and he makes them pay... to the average win expectancy of .37.

Like it or not, Pujols has now impacted his overall PAs more than Bonds (assuming they were equals to begin with).

So, yes, it does make a difference, if the managers are approaching players in non-optimal ways.

Posted 4:13 p.m., September 9, 2003 (#22) - ColinM
  The .125 was absolute runs. I took this statement by Robert Dudek in the original blog "I'll note that from 1994-2001, it was empirically determined that an IW is worth about .178 runs and a NIW was about .33" to also be referring to absolute runs. However, I'm mixing runs and wins here and just detracted from my main point.

Let me say this, if you ignore the last two paragraphs of my last post and just concentrate on the rest, I think what I'm saying is still valid. Let me use your example of equally talented Bonds and Pujols (imagine Bonds having an equal :)). You said:

"But, Pujols gets pitched to, and as a result can create more wins than Bonds in the exact same situation.
That is, if we have that late and close situation where walking Bonds will have a win expectancy of .35 and facing him will be .37, they walk him. But Pujols, they face him, and he makes them pay... to the average win expectancy of .37."

I agree. In your situation Pujols will create more runs. But the problem is, we're working under the theory that the IBB for Bonds, like everyone else, is essentially win-neutral. Meaning the manager guesses right on the IBB about half the time. This implies that there will be another situation where the win expectancy of facing Bonds is .37, but the WE of walking him is .39 and they walk him anyway. So in this case, Bonds has actually had MORE impact than Pujols, by being walked. And if WE for IBB is neutral for Bonds, then this situation is just as likely as the one that you've given.

So in the end, Bonds' PA will be just as valuable as Pujols'. They have to be. The only way they cannot is if the win-expectancy for Bonds' IBB is negative, IOW, if the managers do a better job of guessing the break-even point of walking Bonds than they do with other batters.

Posted 4:19 p.m., September 9, 2003 (#23) - tangotiger
  Robert is reporting marginal runs. In fact, you should ALWAYS talk about marginal runs in cases like this. ALWAYS.

****

Let me think about the rest of your post.

Posted 6:54 p.m., September 9, 2003 (#24) - David Smyth
  I come back to a point I made earlier--that there is no reason to assume that every player's IBBs are win neutral relative to what he would hit if they pitched to him. If Tango or anyone else has evidence to the contrary, please post it. Until I see that, I will assume that the IBBs to Bonds, given in typical IBB situations, are negative (for Bonds) and that the IBBs to an avg batter are a positive for him.

Jarvis, in his latest IBB study, seems to be saying that the breakeven point for IBB is about a ".600 SLG batter". This implies about a +.08 lwts runs/PA player, or a player who is about .20 absolute RC/PA.

So the more I see of this stuff, the more I am tending to go simply with the .17 value for an IBB, subject to the standard R/W conversion of about 1 W = 10 runs.

Posted 7:25 a.m., September 10, 2003 (#25) - David Smyth
  I'm wrong. You would have to go with +.08 as a "standard" value for IBB, even in an "absolute" run formula. And that's because you have to account for the fact that if a player was pitched to instead of being IBBd, he would make a goodly number of outs, and that impact has to be accounted for.

So, for example, if Bonds gets 60 IBBs, that's 60*.08, or 4.8 runs, which is about half a win. Bonds gets his credit here compared to simply ignoring the IBBs. But if the opposing team had pitched to him in those 60 PAs, he would likely have produced about 1 win. So the opposition has saved about half a win by IBBing Bonds. This assumes, of course, that Bonds' IBBs are standard IBB opps, and Bonds is the one player for whom that is not necessarily true.

Posted 12:14 p.m., September 10, 2003 (#26) - ColinM
  So it seems that we have a few competing theories about how to peg the value of an IBB. These theories basically boil down to assumptions about what a typical manager's tendencies are when issuing a walk. All of these theories work off of the data that on average, the IBB is a win neutral event over the course of a season:

Theory A - Managers guess right on the IBB most of the time with very good players, but guess wrong most of the time with other players. The win-expectancy of an IBB is generally the same for IBB to good players and bad players, so it's a net loss for the good player but a net gain for the lesser guy. This is the theory that would support setting the IBB to a common value for all batters. This seems to be the position taken by David, although I personally do not support this theory as I think good players are more likely to be walked in high leverage situations.

Theory B - Managers guess right on the IBB about half the time for every player, good or bad. In this case the win-expectancy on the IBB is neutral for each player and the IBBs can be treated as having a similar win value as the player's other PA. So the typical IBB will have more value for good players than bad players. This is the theory that I have been working off of in my previous posts.

Theory C - This would be some combination of A and B. In this case good players are walked in higher leverage situations like in theory B, but the managers do a better job of finding the break even point with the good hitters as in theory A. The value of a typical IBB for a good hitter will be worth more than for a bad hitter, but it won't be worth enough to make it win neutral, it will still be a net loss in value.

To sum it up using our Bonds-Pujols example. (The numbers are just pulled out of my ass)
Theory A - Value of an average Bonds PA ~= .95 * PujolsPA
Theory B - BondsPA = PujolsPA
Theory C - BondsPA = .975 * PujolsPA

Which one is right? We just don't know until we have some solid data. Right now I get the feeling that B is closest to the truth, but for now, that's just opinion.

Posted 4:11 p.m., September 11, 2003 (#27) - David Smyth
  "This seems to be the position taken by David."

Well, I'm not trying to take any position of my own, I'm just trying to interpret what Jarvis' latest study seems to be concluding. In case you haven't seen it, I'll post a link in a minute.

And Tango, you mentioned some published research which shows that IBB are win-neutral for the given batter. Could you post a link to that?

Posted 4:14 p.m., September 11, 2003 (#28) - David Smyth
  Jarvis link

http://knology.net/johnfjarvis/baseball.html

Scroll down to the link for "Trends, Exceptions...

Posted 4:44 p.m., September 11, 2003 (#29) - ColinM
  David, thanks for the link. It'll take me a while to get through that. If I come up with anything else to add I'll post it here.

Posted 6:42 p.m., September 11, 2003 (#30) - David Smyth
  "It'll take me a while to get through that."

Join the Club. There's lots of gold in Jarvis' stuff, but for the average person it's a pretty slow slog. For that reason, perhaps his work hasn't gotten the attention it deserves, especially on the IBB.

Posted 10:31 p.m., September 11, 2003 (#31) - Robert Dudek
  David,

I must confess I don't really know what the right way to deal with the IBB in the BaseRuns formula is. The formula I devised was meant to closely resemble the absolute run values for the 1994-2001 game log sample, as reflected by the plus 1 method. The "plus 1" value for each event was meant to approximate the values that Tango observed for a similar period.

Afterwards, I take what BaseRuns says and adjust it on the basis of outs saved/consumed over/under the team average rate. For example, on the Giants, Bonds is going to save a lot of outs (which then go back into the team pot) relative to the average. Other players on the team are going to consume those outs. When you add up outs saved/consumed for every player on the team, the result is a net outs of zero.
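One way to read that outs saved/consumed step, sketched in code: charge every player the outs he would have made at the team's out rate, and compare with the outs he actually made. This is an interpretation of the description above rather than Dudek's actual calculation, and the roster is invented.

```python
def outs_saved(players):
    """players maps name -> (plate_appearances, outs_made).

    Positive means the player saved outs relative to the team rate,
    negative means he consumed extra outs.
    """
    team_pa = sum(pa for pa, _ in players.values())
    team_outs = sum(outs for _, outs in players.values())
    team_out_rate = team_outs / team_pa
    return {name: pa * team_out_rate - outs
            for name, (pa, outs) in players.items()}


# Hypothetical three-man "team", invented for illustration.
ROSTER = {"Star": (600, 300), "Regular": (600, 410), "Bench": (300, 215)}

if __name__ == "__main__":
    saved = outs_saved(ROSTER)
    for name, value in saved.items():
        print(f"{name:>8}: {value:+.1f} outs")
    print(f"team net: {sum(saved.values()):+.1f}")
```

Because every player is measured against the team's own rate, the individual figures always net to zero across the roster.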

Part of the IBB's value is captured this way, so you may be right about using a .08 run value in the formula itself.

Posted 3:45 p.m., March 16, 2004 (#32) - Silver King
  [I just posted this on Fanhome, but thought I'd post it here too. It's frustrating that BsR seems valuable yet remains hard/confusing for the very lay person to have any confidence in applying. I'd like to see that rectified in general, and I have...]

2 main questions about using BaseRuns:

Assuming I 'know' their effective true talent, I'm trying to compare various players in terms of how many runs they'd add or save relative to an average player. For hitters, I'm using LWTS. For pitchers, I gather that BaseRuns is good. (In fact, it seems to me that someone who knows what they're doing should post lists of pitchers compared by BsR, since it's apparently so good for describing pitchers due to the interactiveness thang. I'm surprised I've never seen that. If someone did that and explained it step-by-step, that'd help me a lot in appreciating/using BsR; presumably others too.)

First:
I'm looking at pitchers in terms of recent offensive context, the high-offense/high-power era of the last dozen years. I want to work with just basic stats: walks, hbp, homers, and hits (or singles, doubles, and triples, since I'm happy with assuming a league average assortment of these hit types). Also have K's, if that's useful. Near the end of the discussion following Tango's final installment of his Primer BsR-related article, David finally posted a nice & simple filled-in formula--filled in with the constants.

A = H + BB + HBP - IBB - HR
B = .1*(BB + HBP - IBB) + .8*1B + 2.3*2B + 3.6*3B + 2.1*HR + SB - CS
C = AB - H + CS
D = HR

R = A x B / (B + C) + HR
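For anyone who wants to try the filled-in formula directly, here is a minimal Python sketch of it. The argument names, the per-27-outs helper and the sample pitcher line are assumptions added for illustration; singles are derived as H minus extra-base hits.

```python
def base_runs(ab, h, doubles, triples, hr, bb, hbp=0, ibb=0, sb=0, cs=0):
    """BaseRuns with the constants quoted above: R = A*B/(B+C) + D."""
    singles = h - doubles - triples - hr
    a = h + bb + hbp - ibb - hr
    b = (0.1 * (bb + hbp - ibb) + 0.8 * singles + 2.3 * doubles
         + 3.6 * triples + 2.1 * hr + sb - cs)
    c = ab - h + cs
    d = hr
    return a * b / (b + c) + d


def runs_per_27_outs(runs, outs):
    """Express a run total as a per-game rate."""
    return runs / outs * 27


if __name__ == "__main__":
    # Hypothetical opponents' batting line against one pitcher.
    line = dict(ab=750, h=190, doubles=38, triples=4, hr=22, bb=60, hbp=5)
    runs = base_runs(**line)
    outs = line["ab"] - line["h"]
    print(f"BsR: {runs:.1f}, per 27 outs: {runs_per_27_outs(runs, outs):.2f}")
```

Dividing by outs rather than plate appearances gives a rate on the same footing as runs allowed per game.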

I'd use this, but it'd be better if I had a version of the B-term constants that was geared to the last 10 years or really any segment thereof. Yes, I know that y'all have posted ways to derive the constants from the league environment, but I don't really understand how to do it. Please, can you tell me a good set of the constants for recent years? (Robert Dudek gives a recent-years version in an article from last September, but in Pr. Studies discussion of it, David indicates he's sceptical or unclear about that version. Also that version had some other variables and changes that I don't want to use, but I don't know if I can just drop them out.)

I see that about a year ago on Fanhome, someone named rmiller posted this:
--------
Since robls needs the formula for PITCHERS, what formula of BsR should he use? This one?
A= H+BB-HR
B= (H+HR+.1BB) *X
C= IP
D= HR
The X multiplier is around .41
This gives runs. If you want it in ERA form, divide by IP and multiply by 8.2
Runs = (A*B*x)/(B*x+C)+D
x = (C*(Runs-D))/(B*(A-(Runs-D)))
--------
Tango and David picked at this, but didn't lay out what version _would be_ good. Is there a good pitcher version? How 'bout for recent levels of offense?
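The last two lines of the quoted formula are one equation solved two ways: solving Runs = A*B*x/(B*x+C) + D for x gives x = C*(Runs-D) / (B*(A-(Runs-D))). A quick round-trip check, assuming B here means the raw H+HR+.1BB sum before the multiplier is applied, and using invented league totals:

```python
def runs_from_x(a, b_raw, c, d, x):
    """Runs for a given B multiplier x."""
    return a * b_raw * x / (b_raw * x + c) + d


def solve_x(a, b_raw, c, d, runs):
    """Multiplier x that makes the formula return `runs` exactly."""
    scored = runs - d  # runs scored by baserunners, excluding the HR hitters
    return c * scored / (b_raw * (a - scored))


if __name__ == "__main__":
    # Hypothetical totals: H=1450, BB=520, HR=150, IP=1450.
    h, bb, hr, ip = 1450, 520, 150, 1450
    a, b_raw, c, d = h + bb - hr, h + hr + 0.1 * bb, ip, hr
    runs = runs_from_x(a, b_raw, c, d, x=0.41)
    print(f"runs at x=.41: {runs:.1f}")
    print(f"x recovered:   {solve_x(a, b_raw, c, d, runs):.4f}")
```

Fitting x this way forces the formula to reproduce the actual run total of whatever league or period the inputs come from.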

Second:
How d'ya compare apples to apples here? With LWTS for batters, I compare their LWTS runs-above-average given the same # of plate appearances, 650. I'm not worrying about their health or stamina here, just wanting to gauge which would be better to have in a lineup. Same thing for pitchers, and I used to again use LWTS per N opponents' plate appearances. But I gather that BsR is better and I'm trying to switch.

I'm confused as to whether I should look at BsR per N plate appearances, or per N outs. (or something else?) If I compare how pitchers would do given the same # of innings (opponent outs), is that sort of double-counting the effect of their opponent OBA? Or is it _supposed_ to be 'double counted'? With batters, we care about what they can do when they come to the plate. But with pitchers, don't we care about what they can do in N innings? Both ways seem to have good arguments. What's the 'proper' way for me to do it?

Posted 9:52 p.m., March 16, 2004 (#33) - Anonymous
  .

Posted 2:58 p.m., March 18, 2004 (#34) - tangotiger
  For pitchers, you want BaseRuns per out (akin to ERA).

For hitters, you want LWTS per PA.

The formula shouldn't change based on era, though I have not yet tested whether the best-fitted 1974-1990 BsR matches that of 1991-2003. I'm sure it would be quite close.

***

Also, be careful about using the equation with missing data. Any fudge factor you apply can ONLY be applied to the "B" component. As Patriot rightly pointed out, Clay Davenport did NOT do this for BsR, thereby making BsR look worse than it should have been.

The idea behind BsR is very simple. As the OBA approaches zero, the run value of the HR approaches 1, and the run values of all other events approach 0. As the OBA approaches 1, the run value of every non-out event approaches 1, and the run value of the out approaches negative infinity.

BsR is the only model that adheres to these known constraints. And, to boot, it's as accurate as anything out there in the "regular" MLB run environments.

Posted 6:30 p.m., March 18, 2004 (#35) - David Smyth
  ---"Any fudge factor you apply can ONLY be applied to the "B" component."

I don't understand that. When I worked out a shortened form of BsR against Tango's full form, I had separate fudge factors for both A and B, and the largest was for A. IOW, if the full A was 2100 for a 4.5 R/G avg, according to Tango's full version, and I only wanted an A factor with H+BB-HR, then this might total to only 1900, so I would multiply A by 2100/1900, or 1.105. Similarly for an abbreviated B factor, which came out to about a 1.02 fudge. So MOST of the fudge is in the A factor, not the B factor. And if you include a couple other subtle considerations which are relevant for a short formula, then ALL of the fudge should probably be in the A factor, to deliver the best real-world accuracy.
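The fudge described here is just the ratio of the full-formula factor to the short-formula factor over the same sample. A tiny sketch with the 2100 and 1900 from the example (the function names and the 180 used in the demo are made up):

```python
def fudge_factor(full_value, short_value):
    """Multiplier that scales a shortened factor up to the full one."""
    return full_value / short_value


def scaled_a(h, bb, hr, fudge):
    """Short-form A factor (H + BB - HR) with the fudge applied."""
    return (h + bb - hr) * fudge


if __name__ == "__main__":
    fudge = fudge_factor(2100, 1900)
    print(f"A fudge: {fudge:.3f}")
    # Hypothetical single line where H + BB - HR = 180.
    print(f"scaled A: {scaled_a(150, 50, 20, fudge):.1f}")
```

The same ratio approach would apply to an abbreviated B factor, with the ratio taken from the B totals instead.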

Posted 9:55 p.m., March 18, 2004 (#36) - Patriot
  I agree with David that the fudge can go on the A as well. However, Davenport also applied it to D, and that nukes the whole 1 run = 1 home run thing.

But if you had a perfect dataset, you would know A for sure, and you would know C and D for sure, and you would only be able to fudge B. But as you point out, we don't always have or use the full data set.

Posted 10:12 p.m., March 18, 2004 (#37) - tangotiger
  Those are good points. The fudge should go wherever the data is lacking. If HBP is lacking, then of course you need to fudge the A as well.

How you fudge is not clear. I fudge as a function of PAs, or estimated PAs. I don't think that the HBP fudge should be a function of H+BB. IBB might be a function of BB. I guess you'd have to run a regression to figure those things out.