BaseballBoards.com - Why Runs Produced (R+RBI-HR) is still a great stat
tangotigre@aol.com


Tango Tiger

Why does R+RBI-HR work? Let's break down the formula.
A run can be broken down as follows: .27 1B + .44 2B + .62 3B + 1.00 HR + .27 BB + constant

An RBI (as opposed to runners moved along in an earlier post of mine) can be broken down as follows: .20 1B + .40 2B + .60 3B + .60 HR + .03 BB + constant

Add up the individual components of R+RBI-HR and you get: .47 1B + .84 2B + 1.22 3B + 1.60 HR + .30BB.
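As a quick check of that arithmetic, here is a minimal Python sketch that adds the two component expressions term by term. The dictionary names are only illustrative, the constants are left out, and the .60 RBI weight for the homer is taken exactly as posted (it behaves like RBI-HR, which comes up later in the thread).

```python
# Per-event weights as posted above (constants omitted).
RUNS = {"1B": 0.27, "2B": 0.44, "3B": 0.62, "HR": 1.00, "BB": 0.27}
# As posted: the .60 for HR does not include the batter's own run.
RBI_LESS_HR = {"1B": 0.20, "2B": 0.40, "3B": 0.60, "HR": 0.60, "BB": 0.03}

# Term-by-term sum gives the implied weights for R + RBI - HR.
runs_produced = {ev: round(RUNS[ev] + RBI_LESS_HR[ev], 2) for ev in RUNS}

for ev, weight in runs_produced.items():
    print(f"{ev}: {weight:.2f}")
```

The printed weights should line up with the .47 / .84 / 1.22 / 1.60 / .30 quoted above.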

Now, these numbers are close enough to real-world values that R+RBI-HR is a very good quick proxy if you don't have access to anything else.

I know, I know: team helps, batting position helps, etc. Also, if you want to convert Runs Produced to Runs Created, take AB/10 and subtract that from Runs Produced. Voila, an excellent proxy.

David Smyth

Where do those formulas come from? A little more explanation would help others to evaluate what you are saying.

Also, unless I'm reading it wrong, if you add the two HR components together you get 1.60, but when you subtract out the HR (R + RBI - HR) you're left with only 0.60 for a homer.

This agrees with the old Bill James criticism of runs produced--that HR shouldn't be subtracted out.

CRS

I'm thinking the second linear expression is not right because there is no way the coefficients for 3B and HR are the same. Intuitively, they would differ by exactly 1.0, though (you drive in the same number of people, plus yourself), which makes me think that the second expression is actually for (RBI - HR).

Still, I'm with David Smyth. (R + RBI)/2 should be better than (R + RBI - HR).

Tangotiger, where did you get those linear formulae for RS and RBI?

Tango Tiger

Oops. I did mean to say 1.60 for the Homer for the RBI.

As for where they came from, it gets complicated, but let's see if I can articulate it.

The "runs" portion is strictly derived from the linear weights grid of base-out situations.

The "rbis" portion is derived as follows: I have calculated that for an AVERAGE plate appearance, the batter will have at least one runner on base 45% of the time, and 55% of the time the bases are empty. Of those times that a runner is on base, 70% will have a guy on 1B, 45% will have a guy on 2B, and 25% will have a guy on 3B. (It adds up to 140% because you can have more than one runner on base.) Ok, then you need to know the percentages for each type of hit that causes the runners to move an extra base. So, a 1B will cause a runner from 1B to get to 3B about 35% of the time, and a runner from 2B to score about 65% of the time. A 2B will cause a runner from 1B to score about 45% of the time. Then, it's just a matter of plugging all this into an Excel spreadsheet, and you get the RBI values I specified in my first post (with the obvious correction to the HR).
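The spreadsheet itself is not shown in the thread, so what follows is only a rough Python sketch of the kind of calculation described, not the actual workbook. It uses the quoted percentages plus a few simplifying assumptions of my own (flagged in the comments): a runner on third scores on any hit, a runner on second scores on any extra-base hit, and nobody scores from first on a single. The 35% first-to-third figure creates no RBI, so it is not used. Under those assumptions the values come out a little above the .20 / .40 / .60 / 1.60 posted, and in line with the .24 / .46 / .63 / 1.63 that CRS reports further down.

```python
# Base-occupancy figures quoted in the post.
P_RUNNER_ON = 0.45                               # PA with at least one runner on
ON_BASE = {"1B": 0.70, "2B": 0.45, "3B": 0.25}   # given that a runner is on

# Advancement rates quoted in the post.
SCORE_FROM_2B_ON_SINGLE = 0.65
SCORE_FROM_1B_ON_DOUBLE = 0.45

# Unconditional chance that each base is occupied in an average PA.
occ = {base: P_RUNNER_ON * p for base, p in ON_BASE.items()}

# ASSUMPTIONS (not from the post): a runner on 3B scores on any hit,
# a runner on 2B scores on any extra-base hit, nobody scores from 1B
# on a single, and triples and homers clear the bases.
expected_rbi = {
    "1B": occ["3B"] + SCORE_FROM_2B_ON_SINGLE * occ["2B"],
    "2B": occ["3B"] + occ["2B"] + SCORE_FROM_1B_ON_DOUBLE * occ["1B"],
    "3B": sum(occ.values()),
    "HR": sum(occ.values()) + 1.0,   # all runners plus the batter himself
}

for ev, rbi in expected_rbi.items():
    print(f"{ev}: {rbi:.3f} expected RBI")
```

The walk coefficient is left out because it depends on how often the bases are loaded, which the post does not give.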

Therefore, Bill James is completely wrong on this issue (I read that Abstract 12 years ago, too.)

As for (R+RBI)/2, it simply makes no sense. On that basis, a 1B will be worth about 0.25 runs, a 2B will be worth 0.35 runs, a 3B will be worth 0.60 runs, and a HR worth 1.3 runs.

If anyone wants it, I will do an Excel spreadsheet, and pass it on.

Now, there are limitations. First off, I keep saying "AVERAGE". This is important, since a team that NEVER homers will have widely different constants. The 80's Cardinals come to mind. The reason is that without the homer, it's not so easy to score from 1B. But at the same time, each hit now has MORE run-driving ability. So, maybe the 1B has only 0.24 run scoring ability, but maybe the 1B now has 0.22 run-driving ability.

I did look at this issue once, and only at the extremes, as you would guess, does the additive power of linear weights lose its strength. It is exactly for this reason.

David Smyth

I see two serious problems with this whole thing.

First, the values Tango generates do look like linear weights, but in order to produce a run estimate, something is missing.

It's the outs. For every 1000 runs, the weighted total for the positive outcomes is around 1500 runs. To get back to 1000, around 500 runs worth is subtracted by the negative outs adjustment.

Simply using only the positive linear values doesn't work to estimate runs. And if R+RBI-HR is a proxy for the positive linear values, it won't work, either.

The second problem relates to the statement that Bill James (and myself) are wrong that it's better not to subtract out the homers from runs produced. One way to see who is correct would be to see which version--R+RBI or R+RBI-HR--correlates better with actual runs. Logic tells me that it has to be R+RBI. If someone does this study and I'm wrong I'll eat my hat.

CRS

(R+RBI)/2 is guaranteed to correlate with runs better than (R + RBI - HR) on a team basis by definition! Add up the values for all the players and you get the runs scored for the team.

I suppose you could account for RBI-less runs by adding a coefficient to balance it out. Say, (R+C*RBI)/2, where C is simply LgR/LgRBI, which looks to be about 1.05 or so, but that removes the simplicity of the formula.

Subtracting off the HR had no theoretical basis. It was just done to count how many of a team's runs a player "was a part of" and makes as much sense as counting half-sacks and sacks the same in American football.

Tango's formulas just don't look right, and if you have the data and time, you'll use XR, RC or even OPS, anyways. They all work better than team and lineup dependent Runs Produced.

Tango Tiger

David, there is no question that R+RBI will correlate closer to team runs than what I have come up with. But that is inherent in that RBI is usually equal to about 94% of Runs scored, regardless of HR, and so you will get 99% correlation coefficient. I.e. you are comparing runs to runs!

But that is not the point. I am talking on an INDIVIDUAL basis. On an individual basis, the positive runs correlate strongly to the positive linear weights WITH THE HOME RUN ADJUSTMENT. The last thing to do is to proxy the negative runs of linear weights. One way would be to use the outs, and work out the constant so that the league totals match. The other way (and the one I prefer for its simplicity) is to take At Bats and divide by 10.

Again, my point isn't to say Runs Produced is BETTER. My point is that subtracting the Home Runs has a basis in fact. And the Runs Produced formula (with or without my adjustment) has a simple elegance to it.

P.S. The rationale for the At Bats / 10 is this: the average hitter with 600 at bats will drive in 60 RUNNERS (RBI - HR). So, you can say that a batter is presented with 600 at bats and drives in 60 runners. If a batter drives in 70 runners, he is a plus 10. Overall, the league total will be zero, which leaves the aggregate run totals matching exactly. Again, looking for simplicity, with some basis in fact.

David Smyth

OK. For some reason I overlooked the AB/10 subtraction at the end of Tango's original post.

The best way to analyze this is to work backwards from Tango's linear formula to get to runs produced.

That formula is 1B*.47, 2B*.84, 3B*1.22, HR*1.60, and BB*.30

To incorporate the AB/10 adjustment, note that AB = H + (AB-H). So we subtract .1 run for each hit and out.

The result is 1B*.37, 2B*.74, 3B*1.12, HR*1.50, BB*.30, and (AB-H)*-.10
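The step just described can be written out as a short Python sketch. The names (`POSITIVE`, `adjusted`, `estimate_runs`) are illustrative only, and "OUT" stands for AB - H.

```python
# Summed R + RBI - HR weights from the original post.
POSITIVE = {"1B": 0.47, "2B": 0.84, "3B": 1.22, "HR": 1.60, "BB": 0.30}
HITS = ("1B", "2B", "3B", "HR")
AB_CHARGE = 0.10  # AB/10 means each at bat costs a tenth of a run

# AB = H + (AB - H), so each hit loses 0.10 and each out gets -0.10.
# Walks are not at bats, so the BB weight is untouched.
adjusted = {
    ev: round(w - AB_CHARGE, 2) if ev in HITS else w
    for ev, w in POSITIVE.items()
}
adjusted["OUT"] = -AB_CHARGE


def estimate_runs(line):
    """Apply the adjusted weights to a dict of event counts."""
    return sum(adjusted[ev] * count for ev, count in line.items())


print(adjusted)
```

Calling `estimate_runs` on a batting line keyed by the same event names gives the estimate that is being evaluated here.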

At first glance this looks decent. But when this formula is applied to an actual league, it yields an estimate which is about 20% too low. This wouldn't be insurmountable if all the elements were in balance. But the .37 value for a hit is around 20-25% lower than the *correct* value of .47-.50. And the value for an extra base of .37-.38 is around 20-25% higher than the correct value of .30-.32.

This degree of imbalance is unacceptable in modern sabermetrics, even for a so-called simple quick approximator.

The next step is to convert to the run/RBI based version, which is R+RBI-HR-0.1*AB.

As we all know, the substitution of a batter's run and RBI totals for his hits and walks is a fairly substantial step down in accuracy, due to the powerful influence of situational differences.

And the final step, to wind up with runs produced, is to remove one of the four elements--0.1*AB--from the above formula.

So what we have here is an unacceptable linear formula to start with, to which another layer of inaccuracy is subsequently added, followed by the arbitrary lopping off of 25% of the calculation.

The funny thing is, if Tango had simply reported on his values for the run and RBI components of runs scored and stopped there, that would have been fine. Those values are worth knowing.

Tango Tiger

First off, let me clarify that I am not trying to supersede, replace, or in any way make a claim that runs produced is anywhere near as good as Runs Created or Linear Weights. I would put it somewhere below OPS, and maybe above OBA or SLG.

Secondly, my claim is also that the Home Run has to be subtracted from R+RBI, based strictly on the Runs/RBI run components as I described.

Finally, there is another component to the RBI formula and that is "outs". If you work it out, I agree that RBI's will fall 15-20% below actuals. To make the component-RBI more accurate, something like .03 * (AB - H - K) would be needed. Again, you work backwards using league stats and runs scored to come up with all the constants you require. (I didn't want to get into all that stuff, as well as SB/CS for the component-Runs.)

I'm also aware that I underweight the Singles, and overweight the extra base hits, but that is a product of the Runs/RBI stats themselves. The missing component would be "Base runner assists" or something to that effect. If MLB would count the number of times a runner was moved along, and eventually scored, this "Assist" would add value as well.

According to TotalBaseball.com: Babe Ruth, 2844 Runs Created, 3673 Runs Produced, 2833 Adjusted Runs Produced (i.e., remove AB/10). Ted Williams, 2538, 3116, 2345. Mike Schmidt, 1757, 2553, 1718. Tim Raines, 1592, 2311, 1455. Craig Biggio, 1041, 1494, 919.
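To make the AB/10 adjustment concrete, here is a small Python sketch that uses only the career figures quoted above. The at-bat totals are not given, so the sketch backs out the AB implied by each pair of Runs Produced and Adjusted Runs Produced; treat `implied_ab` as a derived number, not a sourced one.

```python
# Career totals as quoted above: (Runs Created, Runs Produced, Adjusted RP).
CAREERS = {
    "Babe Ruth": (2844, 3673, 2833),
    "Ted Williams": (2538, 3116, 2345),
    "Mike Schmidt": (1757, 2553, 1718),
    "Tim Raines": (1592, 2311, 1455),
    "Craig Biggio": (1041, 1494, 919),
}


def adjusted_rp(runs_produced, at_bats):
    """Runs Produced less one run for every ten at bats."""
    return runs_produced - at_bats / 10


summary = {}
for name, (rc, rp, adj) in CAREERS.items():
    implied_ab = (rp - adj) * 10          # AB implied by the quoted adjustment
    summary[name] = {"implied_ab": implied_ab, "adj_over_rc": round(adj / rc, 3)}
    print(f"{name:13s} implied AB {implied_ab:5d}  adjRP/RC {adj / rc:.3f}")
```

The last column shows how close the adjusted figure lands to Runs Created for each player.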

All I am saying is that when you look at your daily newspaper, a quick look at R+RBI-HR has a lot of value.

David Smyth

The only real question remaining is whether the best version of runs produced for individuals is R+RBI-HR or just R+RBI. Using the 1999 sample of 57 NL batters with at least 500 AB, I checked the correlation of their runs created (new version) with 3 versions of runs produced: R+RBI, R+RBI-HR, and R+RBI+HR.

For R+RBI, it was .90
For R+RBI-HR, it was .87
For R+RBI+HR, it was .84

Are these differences meaningful? Yes. Are the results definitive? Probably not. One would need to use a larger sample of hitters from different seasons, etc. But I'll go out on a limb and say that I'm pretty sure the result would be the same.

Runs produced has been around for 20 years, and is still used by sportswriters and others to make their points. They all seem to follow like sheep, subtracting out the homers without any apparent consideration as to whether it makes sense to do so. Does the run scored on a home run count any less than other runs? Does the RBI on a home run reflect lesser effort or output than other RBI? Am I the only one who is bothered by this?

Tango Tiger

As I mentioned, you have to remove the HR. Breaking the R/RBI into their hit components, by NOT removing the HR gives HR a value of 2.6 runs. Removing the HR gives a value of 1.6 runs. As I also mentioned, on a TEAM level, there is NO QUESTION that R+RBI correlates better with Runs than R+RBI-HR. The reason for this is that RBI is usually equal to 94% of Runs Scored. And this is REGARDLESS whether it is a high homer or low homer team. But the question to ask is, on an INDIVIDUAL level, what makes more sense? And it makes more sense for a HR to have 1.6 run value than 2.6 run value.

I think a better way to think about it is in basketball/hockey terms. Players score goals or score baskets. The total of the individual goals/baskets equals the team totals. Sometimes they score it on a breakaway, and sometimes they get assists. In hockey, there are 1.6 assists/goal. Meaning every goal has 2.6 points attached to it. Basketball must have like 0.5 assists per basket. I would say that an assist is equivalent to an RBI (ask Wayne Gretzky if you don't think an assist is as valuable as a goal). The point is that when you score UNASSISTED (a home run basically), only one point is credited for the goal. But if you score a goal with 2 assists, that's 3 points. The fallacy is that baseball has decided to give the batter an assist for his own run. I prefer RDI (RUNNERS driven in). This would be akin to assists, and would support my results of the R/RBI component runs being similar to Linear Weights.

Tango Tiger

David, I just re-read your post, and sorry for replying so fast. I did not realize that you did your study on individual players. I apologize again.

It is very interesting what you bring up then. What is also interesting is that not only does your study show that HR should be kept inside the Runs Produced formula, we also both agreed that by removing HR we are STILL overweighting the extra-base portion of the component parts of R/RBI. Therefore, by keeping HR, we are SEVERELY overweighting extra base hits. AND STILL, incredibly, there is higher correlation with a straight R+RBI.

Very good post, and I'll need to think about it. The only thing off the top of my head is that RC itself is invalid at the extreme level (which James kind of admitted). The other part is that the R/RBI of individual players are the result of a team context, while Runs Created assumes that it's basically a team of the same hitter. Personally, I prefer James' other adjustment of calculating runs scored on a team level with and without the player, with the difference attributed to the player.

Great post again.

David Smyth

There are a few ways to analyze why R+RBI is best for individuals. One way which doesn't require a single calculation is based on logic alone. The best version for teams is obviously R+RBI. In order for R+RBI-HR to be better for individuals, it would have to follow that HR have more significance for teams than for individuals. For any outcome other than homers, that question might require some sort of study. But homers are a unique occurrence, because the answers are all 100%. On a HR, the team scores a run and records an RBI 100% of the time. On a HR, the player scores a run and records an RBI 100% of the time. The significance for the team and the player is exactly the same.

Tango Tiger

Hey David, I agree I can't argue with your logic as it is stated. The question still remains that the runs component for a HR by using R+RBI is still 2.60 and that is completely wrong. I'll maintain that if baseball originally had an RDI (runners on base driven in) 100 years ago instead of RBI, then R+RDI would be the formula. Anyway, when I have time (next week hopefully) I hope to answer this question on the flip side. I have proved the component part that HR should be removed. Now, I will prove in practice. What I will do will be pretty straightforward: I will look for two groups of hitters (say 30 or so) who have similar batting averages, on-base averages, and slugging averages, but, one group will have far more home runs than the other group. (The second group to compensate will need lots more doubles and a few less singles.) Then we will simply compare their Runs, RBIs, and RDIs, and see which ones match up. The hypothesis is that similar valued hitters should have similar Runs Produced. Anyway, hope to get to it next weekend.

CRS

Originally posted by tangotiger "The question still remains that the runs component for a HR by using R+RBI is still 2.60 and that is completely wrong."

That's because R+RBI double counts runs. You need to use (R+RBI)/2 to compare directly to linear formulae. This puts the HR coefficient at 1.30 which is not so bad. Then the question shifts to why 2B's and 3B's are underweighted. I'm rather curious to see how this turns out, as your R & RBI formulae appear to be interesting if they are correct.
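For anyone who wants to see the three versions side by side, here is a small Python sketch built from Tango's posted run and RBI weights, with the RBI homer weight at the corrected 1.60. The variable names are illustrative only, and the constants are ignored.

```python
# Posted per-event weights (constants omitted).
RUNS = {"1B": 0.27, "2B": 0.44, "3B": 0.62, "HR": 1.00, "BB": 0.27}
RBI = {"1B": 0.20, "2B": 0.40, "3B": 0.60, "HR": 1.60, "BB": 0.03}  # corrected HR

# R + RBI counts each run twice: once for the scorer, once for the driver.
r_plus_rbi = {ev: round(RUNS[ev] + RBI[ev], 2) for ev in RUNS}

# Halving puts it on the same scale as a linear run formula.
halved = {ev: round(w / 2, 3) for ev, w in r_plus_rbi.items()}

# Subtracting HR instead only touches the homer weight.
minus_hr = {ev: round(w - 1.0, 2) if ev == "HR" else w for ev, w in r_plus_rbi.items()}

for ev in RUNS:
    print(f"{ev}: R+RBI {r_plus_rbi[ev]:.2f}  /2 {halved[ev]:.3f}  -HR {minus_hr[ev]:.2f}")
```

Halving rescales every event, while subtracting HR changes the homer alone, which is why the two versions weight extra-base hits so differently.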

I think it would be helpful to include players from all parts of the batting order in your study, not just the stars who bat in the middle of the lineup and tend to have high RP/BR ratios.

David Smyth

I think I now realize what the problem is. Tango's summed runs scored/RBI values--.47 for a single, etc.--look like linear weights. The only problem is...they're not. Well, maybe the runs scored portion is, but the RBI portion isn't. The .40 value for a double doesn't mean .40 runs, it means that there are .40 expected RBI for each double. But the actual run value for each RBI is different---driving in a runner on third with no outs has a different weight than driving in a runner from first with two outs. So, even though these values happen to resemble linear weights, they're not. And because they're not, there's no reason to alter them to make them resemble linear weights even more. There's no reason to subtract out the homers to change it from 2.60 to 1.60.

CRS

Funny David, I was just going to say that the RBI numbers aren't bad, but the RS numbers are.

First, none of this has to do with value at all. It has to do with accounting. Whether or not a single with a man on third is more valuable than a triple with a man on first is not at issue. Both result in 1 RBI and at the end of the day (RS + RBI)/2 will correlate very well with total runs scored, as will linear formulae. The two methods just get there completely differently. The (RS+RBI)/2 method will get there very circuitously, lineup dependently, etc. It will place large coefficients on things like sacrifice flies, fielder's choices and other groundouts that may not have much "value" but add to the accounting of who scores and who drives runs in.

RBI from base-situations are straightforward. I used Tangotiger's percentages (I never used the 1st-to-3rd one) and got numbers that were a bit higher than his, coefficients of .24/.46/.63/1.63/.035. I think they may be even a bit higher though, as the sum probability of the base-out situations was less than one (~.94). That would lead me to believe that the coefficients used in Total Baseball for expected RBI in Clutch Hitting Index, namely .25/.50/.75/1.75, may actually be close to correct. This would put the HR coefficient (the most trivial to consider) at 1.375, which looks even better.

The RS formula though. I see what Tango may have done. If you average over all the outs, you see that you can expect .32 runs with the bases empty and .58 runs with a man on first. That gives you .26 for a single, which is about what he had. Trouble is that the value added tells you the increased likelihood that SOME runner will score, not if THAT runner will score. Once you put that runner on first, chances are that if a run scores it will be THAT runner, so maybe the 1B coefficient should be up over .5 (a bit less perhaps due to FC's). Anyhow, I don't have the Retrosheet-type data to look at what I want to look at.
Plus, though this is an interesting puzzle (to me at least), runs produced numbers really have no basis in determining "value" and I don't know if this is all worth the trouble. This was originally posted as a way to "save time" after all. I guess what I am curious about in all this is simple accounting-type numbers. When a runner scores, how did they get on base in the first place? What percent due to singles, doubles, fielders choices, etc. Same for RBI's. How many RBI from homers, doubles, outs, etc. From this, one could construct percentages of RS and RBI for a typical event and get some linear-like formulae. It might not mean anything though.

Tango Tiger

CRS: you are absolutely correct that this is all about accounting and not about value. It just so happens that R+RBI-HR, when broken down, corresponds closely to value. But it is primarily about accounting.

R+RBI / 2: the one thing that always bothers me about stats where you divide them by 2 is that it no longer becomes a straight additive play. Going back to hockey, they have a good stat called plus/minus. Each of the 5 skaters on the ice gets a plus one when their team scores, and a minus one when the opposing team scores. The aggregate total yields a value that is 5 times larger (by definition) than the team goal differential. You may be TEMPTED to just say plus/minus divided by 5, but it doesn't work that way. The reason is that not all players participate in the plus equally. I think the R+RBI / 2 argument can work out the same way. That we can break down the R and RBI components and show how closely R+RBI-HR matches Linear Weights reinforces this notion (to me anyway).

CRS: Interesting point about the .26 meaning SOME runner but not THAT runner. You are right, and I simply used my figure as a proxy. That the 1B coefficient should be 0.52 or 0.46 doesn't really change much for my purposes. My point is simply that R+RBI-HR has some basis in fact. But your point is very well taken.

David: absolutely correct that 0.40 does not mean 0.40 runs but simply that the average double results in 0.40 RBIs. And there is no question that the average double is NOT worth 0.40 runs in "run-driving ability". It is closer to 0.30 runs. And T/HR are closer to 0.40 runs in "run-driving ability", and not the 0.60 / 1.60 that RBIs give them. I do agree with your point that an RBI in certain situations should be worth more than others. That "AB/10" thing that I do is supposed to address this in a general sense. If we consider that you get 600 at bats, and that the average hitter drives in 60 RUNNERS, then we can say that every 10 at bats yields 1 runner. However, you can try to be fancier about it, and break down his at bats into the situations you describe and get a truer picture of his run-driving ability. Before someone out there thinks this is now getting away from the simplicity of R+RBI-HR, please note that this last exercise will yield clutch performance (if not ability). By actually counting the number of runners driven in in different base-out situations, and comparing it to the average, you are getting a true picture of a batter's ability to drive in a run. But that is another thread altogether, and I invite someone to start that one.

I just got back from my vacation, and I promise to look at the R+RBI of homer hitters v non-homers hitters this week!

Tango Tiger

Ok, so I couldn't sleep, so I decided to run my study now. Here it is. The process. First off, I used Lahman's database (all thanks to him for making this easy). I created a database, with seasons of at least 300 plate appearances (AB+BB to be more accurate).

From this list, I took the 20 seasons with the biggest skew towards home runs. Consider these guys as those who contribute most with their home runs (a player like Barry Bonds, who contributes with everything, would not appear on such a list): 5 seasons of Dave Kingman, 3 seasons of Sammy Sosa, 2 seasons of Mark McGwire, 2 seasons of Matt Williams, and the rest were one-season players (1987 Andre Dawson for example). The aggregate totals of these 20 seasons (let's call them King Kongs) were: 327 OBA, 566 SLG, 262 BA. Those are basically the kind of numbers you'd expect from one-dimensional power hitters.

Then I looked at the other side. I looked for hitters who hit within 8% of the OBA and SLG average above, and since all the above hitters came from 1950 and later, I decided to limit my study to those years. I ended up with an eclectic list: 2 seasons of Cecil Cooper, and single seasons of such players as: Dave Parker, Nomar Garciaparra, Felipe Alou, and Andre Dawson (again!, this time 1983, and not 1987). The aggregate totals of these 20 seasons (let's call them Little Cecils) were: 347 OBA, 534 SLG, 307 BA. Those are the kind of numbers you think of when you think of Cecil Cooper.

So, what were the difference in Runs and RBIs between the Kongs and the Cecils? Well, first off let's look at the difference in each of the hitting components. The Kongs had 21 more home runs and 18 more walks. The Cecils had 33 more singles, 14 more doubles, and 5 more triples. (All numbers averaged against a 600 Plate appearance season for convenience.) The OPS was 893 for the Kongs and 881 for the Cecils. The positive values of Linear Weights shows that the pluses of the Kongs are slightly better (by 3 runs) over the pluses of the Cecils.

So, if R+RBI-HR is accurate then we should see the numbers of the Kongs and Cecils to be similar. If R+RBI is more accurate, then we should see those numbers to be similar. The results: For the Runs part, the Little Cecils scored 91.1 runs versus the 90.4 runs of the Kongs. A virtual wash. For the RBIs, the Kongs had 116 versus the 95 RBIs of the Cecils. That difference is 21 RBIs. If you remember above, the Kongs also had 21 more home runs. If you look at RBI-HR, BOTH groups had 68 runners driven in. So, what we have are two groups of players of roughly the same value, one of which derives most of its value from home runs, and the other of which does not. Yet, their Runs Produced (R+RBI-HR) came in at 159 for the Little Cecils and 159 for the King Kongs.
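The comparison can be redone from the quoted averages in a few lines of Python. This is only a sketch: the dictionary layout and the "RDI" label (runners driven in, RBI - HR) are illustrative, and because the inputs are already rounded, the totals can differ from the posted 159 by a run or so.

```python
# Per-600-PA group averages as quoted above. RDI = RBI - HR.
GROUPS = {
    "King Kongs": {"R": 90.4, "RBI": 116, "RDI": 68},
    "Little Cecils": {"R": 91.1, "RBI": 95, "RDI": 68},
}

results = {}
for name, g in GROUPS.items():
    results[name] = {
        "HR": g["RBI"] - g["RDI"],        # homers implied by RBI and RDI
        "R+RBI": g["R"] + g["RBI"],
        "R+RBI-HR": g["R"] + g["RDI"],
    }
    print(name, results[name])

kongs, cecils = results["King Kongs"], results["Little Cecils"]
hr_gap = kongs["HR"] - cecils["HR"]
sum_gap = kongs["R+RBI"] - cecils["R+RBI"]
rp_gap = kongs["R+RBI-HR"] - cecils["R+RBI-HR"]
print(f"HR gap {hr_gap}, R+RBI gap {sum_gap:.1f}, R+RBI-HR gap {rp_gap:.1f}")
```

The implied home run gap is a useful consistency check against the 21 extra homers mentioned for the Kongs.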

If someone wants me to run something different, my database is all set and ready to go; just give me the parameters you want me to run.

David Smyth

Tango, hope you were able to sleep afterwards. Your study suffers from the same main problem as mine--small sample size. If you have all of the batter seasons since 1950, and a computer to do all the dirty work, why not do a study involving a thousand batters instead of just a few dozen? This way, you could include bad and average hitters instead of only good ones. You could reduce or eliminate the dependence on atypical hitters with extreme HR dependence. You could eliminate the possibility of batting order contamination, which may be present in your design. Another variation would be to switch from controlling for batting performance and HR and checking for R and RBI, to controlling for R, RBI, HR, and checking for batting performance. Might be better, I'm not sure.

Tango Tiger

I expanded to 100 player-seasons, and changed the premise slightly. First off, I kept all the stats in a context of 600 PA (AB+BB) for all those players with over 300 PA.

I then looked for those players who contributed most of their offense with their Home Runs. This gave me 7 Dave Kingman seasons, 7 McGwires, 5 Juan Gonzalez, and a slew of other players. Their AB/1B/2B/3B/HR/BB are as follows: 534.95 / 75.38 / 21.87 / 1.84 / 43.89 / 65.05. Then I ran a similarity-type score, looking for players who were above the non-HR as much as possible, and were close to 0 in the HR. I ended up with 7 Wade Boggs seasons, and 6 Luke Appling seasons, and a slew of others. Their totals read: 521.54 / 128.65 / 31.65 / 4.13 / 2.24 / 78.46.

So, looking at the individual differences, we see the Wade Boggs end up with about 53 more singles, 10 more doubles, 2 more triples, and 13 more walks. The Kingmans end up with 40 more home runs. The positive values of Linear weights tells us that the Kingmans are worth about 15-20 more runs. The results. The Boggs players ended up with 82 runs and 62 RBIs. Their runs produced were 142. Their R+RBI were 144.

If Runs produced (with the home run subtracted) is more accurate, then we should see Kingmans RP at about 160. If R+RBI (keeping the homeruns intact) is more accurate, the Kingmans R+RBI should come in about 165. The Kingmans ended up with 91 runs and 113 RBIs. That total is 204, and is a whopping 60 runs above the Boggs numbers. The Kingmans RP (with HR removed) is 160, and is 18 runs above the Boggs RP, and is PRECISELY what we expected.
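Here is the same test written as a short Python sketch, using only the group averages quoted in this post. The helper names are illustrative, and the 15-20 run window is the Linear Weights edge as stated above, not something recomputed here.

```python
# Per-600-PA group averages quoted above.
KINGMANS = {"R": 91, "RBI": 113, "HR": 43.89}
BOGGS = {"R": 82, "RBI": 62, "HR": 2.24}
LW_GAP = (15, 20)  # Kingmans' edge in positive Linear Weights, as quoted


def r_plus_rbi(group):
    return group["R"] + group["RBI"]


def runs_produced(group):
    return group["R"] + group["RBI"] - group["HR"]


def within(gap, bounds):
    """True if the gap falls inside the (low, high) window."""
    low, high = bounds
    return low <= gap <= high


gap_sum = r_plus_rbi(KINGMANS) - r_plus_rbi(BOGGS)
gap_rp = runs_produced(KINGMANS) - runs_produced(BOGGS)

print(f"R+RBI gap:    {gap_sum:.1f}  inside LW window: {within(gap_sum, LW_GAP)}")
print(f"R+RBI-HR gap: {gap_rp:.1f}  inside LW window: {within(gap_rp, LW_GAP)}")
```

Whichever gap falls inside the window is the version that agrees with Linear Weights for these two groups.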

I am sure if I re-run this study with 500 players or 1000 players, I will end up with the same conclusion: the Runs Produced figure with the HR removed is a more accurate measure of a player. This has been demonstrated by looking at the individual logical components, and by looking at the players' actual numbers.

Thanks for the feedback guys, as this was a lot of fun for me. But I've got to get back to some boring work now!

P.S. I am re-running the study now, this time controlling the home runs at 10 (rather than at zero). The Boggs numbers are: 525.49 / 130.22 / 36.74 / 4.82 / 8.43 / 74.51. This gives Boggs 55 more singles, 15 more doubles, 3 more triples, 9 more walks, but 36 fewer home runs. Linear Weights tells us that the Kingmans are slightly better (by 6 runs). In effect, pretty much equal-valued players.

The RP of Boggs comes in at 160.7, while those of Kingmans comes in at 159.65. Pretty much a wash as well. Therefore, removing HR from RP is more accurate than leaving it in. Thanks....

Tango Tiger

Ok, one last study, and this one is really exhaustive. For each year from 1920 to 1999 (80 years in all), I took the 10 players that contributed the most with their home runs. That gives us 800 player seasons. This also removes any era-biases. These are the King Kong players. For each year, I then took the 10 best hitters who were not home run hitters. These are the Prince Boggs players. So, we will be comparing 800 player seasons to 800 player seasons, with the era-bias removed. The results. PAGE DOWN...don't know why it gives me the blank spaces.

Note: information has been lost by website. I'll try to reproduce it.

As you can see, the King Kongs, based on their Linear Weights, are worth about 16-17 runs more than the Prince Boggs. If Runs Produced is accurate, we should see a similar number. If R+RBI is more accurate, then we should see the King Kongs ahead by 15-20.

As it turns out, the R+RBI of the King Kongs are ahead by a whopping 51 runs. Their RP is ahead by 21 runs, which is pretty close to what we expected.

Tangotiger (added in some other thread)

The problem with such a weird profile is that for the R/RBI to come out like that, this player must have performed unusually well or poorly with runners in scoring position.

For example, runs scored is roughly equal to .27*1B+.44*2B+.61*3B+1.00*HR+.27*BB.

RBI is roughly equal to .2*1B+.4*2B+.6*3B+1.6*HR+.025*(AB-H). That last value is "forced" in to make sure that the league averages balance out, like the out constant in LW.

So, for example, a player with the following profile in 660 PA:
110 30 4 16 60 440 (1b,2b,3b,hr,bb,outs) will have 77.5 runs scored and 72.9 rbis.

Now, to generate a 100/100 guy with 10 hrs, you need UNDER NORMAL CONDITIONS (660 PA):
141 70 30 10 10 399. If you go back to say Tommy Herr, he did not have such a profile. I would guess that he hit a lot with RISP, AND he was very good at that as well.

To generate a 125/125, with 65 HRs, you need UNDER NORMAL CONDITIONS (660 PA):
17 20 0 65 173 385. Again, another "impossible" situation. But I think McGwire might have performed like this (the 125/125, 65) a couple of years ago. I will guess then that he had few RISP and performed poorly in those situations.

So, which of these 2 guys is better? Well, LWTS says: the first guy has 129 RC, and the second guy (the HR guy) has 125 RC. Their +/- (with 0 as average) is +49 for the 1st guy and +57 for the second guy.

If you incorporate my formula of R+RBI-HR-AB/10, you get: the first guy is 125 runs, and the 2nd guy is 136 runs.

So, however you slice it, these guys are within 10 runs of each other, and not 50 runs apart.
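The three profiles in this post can be run through the two rough formulas with a short Python sketch. The function and constant names are illustrative, AB is taken as PA minus walks (the only non-AB event in these profiles), and the results should match the quoted figures to within rounding.

```python
EVENTS = ("1B", "2B", "3B", "HR", "BB", "OUT")

# Rough weights quoted in this post; OUT means AB - H.
RUN_W = {"1B": 0.27, "2B": 0.44, "3B": 0.61, "HR": 1.00, "BB": 0.27, "OUT": 0.0}
RBI_W = {"1B": 0.20, "2B": 0.40, "3B": 0.60, "HR": 1.60, "BB": 0.0, "OUT": 0.025}


def expected(profile):
    """Return (runs, rbi, adjusted RP) for a (1B, 2B, 3B, HR, BB, outs) tuple."""
    line = dict(zip(EVENTS, profile))
    runs = sum(RUN_W[ev] * n for ev, n in line.items())
    rbi = sum(RBI_W[ev] * n for ev, n in line.items())
    at_bats = sum(line.values()) - line["BB"]       # AB = PA - BB here
    adj_rp = runs + rbi - line["HR"] - at_bats / 10
    return runs, rbi, adj_rp


PROFILES = {
    "typical": (110, 30, 4, 16, 60, 440),
    "100/100, 10 HR": (141, 70, 30, 10, 10, 399),
    "125/125, 65 HR": (17, 20, 0, 65, 173, 385),
}

for label, profile in PROFILES.items():
    runs, rbi, adj = expected(profile)
    print(f"{label:15s} R {runs:6.1f}  RBI {rbi:6.1f}  R+RBI-HR-AB/10 {adj:6.1f}")
```

The last column is the adjusted Runs Produced figure used in the comparison of the two extreme profiles.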

Tangotiger (added in some other thread)

I looked at all players since 1975 with over 300 PA (AB+BB). I then grouped them as one of 6 types of hitters (singles hitter, doubles, triples, homer, walk, steals). I then broke down these hitters into 7 values of hitters (RC over 100 runs, 90, 80, 70, 60, 50 and under 50). What we end up with is 42 "aggregate players", each in a very clear category. Any difference can be easily accounted for. All of this can be found at http://www.geocities.com/tmasc/RCType.xls.

The results. First thing I did was a regression analysis of the 6 hitting categories versus R+RBI. This would establish what the Linear Weights coefficients are for R+RBI.
1B = 0.58
2B = 1.09
3B = 1.31
HR = 2.88
BB = 0.27
SB = 0.20
As you can see, R+RBI overweights singles by about 0.10 runs, doubles by 0.30 runs, triples by 0.30 runs, and home runs by 1.4 runs, while underweighting walks by under .10 runs. If you use R+RBI-HR, the constant for HR becomes 1.88.
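Since HR is itself one of the regressors, subtracting HR from R+RBI lowers the fitted HR coefficient by exactly one and leaves the others alone. The Python sketch below just applies that to the coefficients listed above and, as an extra illustration of my own, expresses each weight relative to a single so the two versions can be compared on shape rather than scale.

```python
# Regression coefficients for R + RBI as listed above.
R_PLUS_RBI = {"1B": 0.58, "2B": 1.09, "3B": 1.31, "HR": 2.88, "BB": 0.27, "SB": 0.20}

# R + RBI - HR: only the homer weight moves, by exactly 1.
RUNS_PRODUCED = dict(R_PLUS_RBI)
RUNS_PRODUCED["HR"] = round(R_PLUS_RBI["HR"] - 1.0, 2)


def relative_to_single(weights):
    """Scale every weight so that a single equals 1.00."""
    return {ev: round(w / weights["1B"], 2) for ev, w in weights.items()}


print("R+RBI    :", relative_to_single(R_PLUS_RBI))
print("R+RBI-HR :", relative_to_single(RUNS_PRODUCED))
```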

The r-squared of R+RBI vs. RC is 93.7%, which is great. For adjRP it is 98.5%.

Next, since I slotted each of the 5,800 players into one of 42 categories, we can see what differences pop up. First, let's look at the very best hitters (RC > 100 runs). For each of the 6 types of hitters (singles, homers, etc), they all have a RC between 108 and 112. We can say, therefore, that these different types of hitters all have the same value, though they got there in different ways. When we look at the adjRP, they range from 110 to 118. An acceptable deviation, with an 8-run range that is a bit off from RC. But looking at R+RBI, they range from 182 to 193 for the non-HR player (11 run range), and the HR player comes in at 208! Now, remember, these 6 types of hitters are all worth about the same (RC between 108 and 112). Yet, the homerun hitter's R+RBI is 15 to 25 runs above all the other great hitters.

Let's look at the near-great hitter (RC > 90). Their RC range from 94.0 to 94.6, for a puny range of 0.6 runs. These 6 widely different types of hitters are all worth the same, and any overall stat should show them to be the same. adjRP shows them worth 95 to 100 runs (a 5-run range which is a bit off). R+RBI? Well, the non-HR hitter comes in at a range of 163 and 175 (which is a wide range to begin with). The HR hitters comes in at 186 runs! This is 11 to 23 more runs than he should have.

How about the good hitter (RC > 80)? RC comes in at 84.4 to 85.0 runs. adjRP comes in at 83 to 90 runs (range of 7 runs, which is a bit high). R+RBI? Non-HR hitters range from 150 to 160 runs (10-run range), but the HR hitter comes in at 173 runs! That is 10 to 23 runs too high.

The mediocre hitter (RC > 70) looks the same: RC between 74.5 and 75.4 runs. adjRP between 72 and 79 (7-run range). R+RBI for the non-HR hitter comes in at 137 to 146 runs (9-run range). HR hitter? 159 R+RBI, which is 13 to 22 more runs than he should get. How about the fair hitter with RC > 60? RC comes in between 65 and 66 runs. adjRP between 60 and 68 (8-run range). Non-HR R+RBI is 125 to 134 (9-run range). The HR hitter is 146! That makes him worth 12 to 21 more runs than he should get.

The bad (RC > 50) hitter? RC in at 55.2 to 56.2 runs. adjRP is 49 to 57 (8-run range). Non-HR R+RBI = 111 to 118 (a 7-run range, and the first time R+RBI does better than adjRP). But the HR hitter's R+RBI? Try 128, and 10 to 16 runs more than it should be.

How about the worst hitters in the last 25 years (RC < 50)? How do they do? RC between 42 and 45 runs. adjRP = 37 to 47 runs (10-run range). Non-HR R+RBI is 96 to 108 (12-run range). The HR hitters in this group? 119 R+RBI, which is 11 to 23 more runs than he should get.
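The tier-by-tier comparison above can be tabulated with a few lines of Python. The numbers are the R+RBI figures quoted in the preceding paragraphs; the tier labels and variable names are illustrative, and since the inputs are rounded, the computed excess can differ by a run or so from the figures in the text.

```python
# (tier, non-HR R+RBI low, non-HR R+RBI high, HR-hitter R+RBI), as quoted above.
TIERS = [
    ("RC > 100", 182, 193, 208),
    ("RC > 90", 163, 175, 186),
    ("RC > 80", 150, 160, 173),
    ("RC > 70", 137, 146, 159),
    ("RC > 60", 125, 134, 146),
    ("RC > 50", 111, 118, 128),
    ("RC < 50", 96, 108, 119),
]

# How far the HR hitter sits above the top and the bottom of the non-HR range.
excess = {tier: (hr - high, hr - low) for tier, low, high, hr in TIERS}

for tier, (smallest, largest) in excess.items():
    print(f"{tier:9s} HR hitter is {smallest} to {largest} runs above the non-HR types")
```

In every tier the HR hitter's R+RBI sits above the entire non-HR range.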

Conclusion
1 - The R+RBI of the home run hitter is consistently about 20 runs higher than that of a similarly valued, but non-home run hitter.
2 - The adjRP of all types of hitters shows no such tendency.
3 - The regression analysis shows that if the home run is to remain part of R+RBI, then the RC formulas as we know them are invalid (which they are not).