David Smyth posted November 20th, 2000 12:52 PM
Sports Guru
Member Since: Dec 1999
Location: Lake Vostok

Tango, I'm a bit confused by your CS analysis. You list 3 components--the lost BR, out value #1, and out value #2. Since your overall LW CS value is the sum of the BR and out #2, it must be that out #2 includes out #1. Yet the average difference between out #2 and out #1 is only about -.06 runs. IOW, the out value #1 seems too high. The average out value #1 is about -.124 (AL). This would seem to be the analogous value to the typical .09 to .10 value for batting outs. But instead of being higher than that, I would expect it to be lower, since there are (presumably) fewer 'other' baserunners in innings where a CS occurs.

Could you explain your procedure for figuring out components #1 and #2?

tangotiger posted November 20th, 2000 06:02 PM
Senior Member
Member Since: May 2000
Location:

Out component #1 and #2 are mutually exclusive. I will call these components pre-runner outs and post-runner outs to give a little life to these numbers.

The pre-runner out is the run value lost from the runners currently on base after the batter makes an out. This is analogous to the -.10 out that I've been using for RC. As it turns out, the value should be closer to -.12 in the 1993-1999 period, simply because each offensive event is worth more than its historical norm. Anyway, use this to figure out a team's total runs scored.

The post-runner out is the run value lost from all future batters (the team) for that inning. Basically, the average would be the RE at the beginning of the inning (.56) divided by 3 or about .19.

The total of these two out components is what you use for the plus/minus LW. It comes out to about -.31 runs, which is pretty much what MGL's LW numbers show.

(I realize that I didn't explain it too clearly last time.)

So, getting back to the CS. I am first making the assumption that the CS is only with a runner from 1B to 2B, and that there is no other runner on base. In that event, the value of the CS is equal to the lost runner on 1B (generally .27 runs), and the value of the post-runner out (generally .19 runs), for a total of -.46 runs. In actual fact, the loss of a CS is slightly more than that because my assumptions are not accurate.
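The arithmetic in this post can be checked with a minimal Python sketch. The variable names are made up for illustration; the inputs (.56 starting RE, -.12 pre-runner out, .27 for a runner on 1B) are the figures quoted above, and the CS line keeps the same simplifying assumption of a lone runner on first.

```python
# Figures quoted in the post; the variable names are illustrative only.
RE_START_OF_INNING = 0.56   # run expectancy, 0 outs, bases empty
PRE_RUNNER_OUT = -0.12      # value lost from runners already on base
RUNNER_ON_FIRST = 0.27      # run value of a runner on 1B

# Post-runner out: the inning's starting RE spread evenly over 3 outs.
post_runner_out = -RE_START_OF_INNING / 3

# Plus/minus LW value of a batting out: both components together.
lw_out = PRE_RUNNER_OUT + post_runner_out

# CS under the simplifying assumption (lone runner on 1B):
# the lost runner plus the post-runner out.
lw_cs = -RUNNER_ON_FIRST + post_runner_out

print(f"post-runner out {post_runner_out:.2f}, LW out {lw_out:.2f}, LW CS {lw_cs:.2f}")
```

The three printed values round to -.19, -.31 and -.46, matching the numbers in the post.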

tangotiger posted November 20th, 2000 06:21 PM

As for how to figure out the pre-runner out, I have the data at home, but what you do is figure out how much a runner's chance of scoring is with 0 and 1 out from 1B (say it's 44% and 28% for a leadoff hitter). Therefore, the #2 batter, causing an out, will reduce the chance of a player scoring by .16 runs in this case. Do this for the 9 combinations of outs and bases. Then figure out the frequency in which the #2 batter comes up to bat with runners in each of the 9 combinations. Total that up, and you get the pre-runner out value.
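A rough Python sketch of that procedure, under loud assumptions: only the .44 and .28 scoring chances come from the post, every other probability and every frequency below is a made-up placeholder, and the runner is assumed to stay on his base when the out is made.

```python
# Chance that a runner eventually scores, keyed by (base, outs).
# Only the 1B values for 0 and 1 out (.44, .28) come from the post;
# every other number here is a made-up placeholder.
score_prob = {
    (1, 0): 0.44, (1, 1): 0.28, (1, 2): 0.14,
    (2, 0): 0.62, (2, 1): 0.42, (2, 2): 0.23,
    (3, 0): 0.85, (3, 1): 0.67, (3, 2): 0.28,
}

# Hypothetical share of plate appearances in which a runner occupies
# that base with that many outs (placeholders, not real frequencies).
runner_freq = {
    (1, 0): 0.08, (1, 1): 0.10, (1, 2): 0.11,
    (2, 0): 0.03, (2, 1): 0.06, (2, 2): 0.07,
    (3, 0): 0.01, (3, 1): 0.03, (3, 2): 0.04,
}

def out_cost(base, outs):
    """Drop in the runner's scoring chance when one more out is made."""
    after = score_prob.get((base, outs + 1), 0.0)  # 3 outs: runner is stranded
    return score_prob[(base, outs)] - after

# Weight each of the 9 base/out costs by how often it comes up.
pre_runner_out = -sum(freq * out_cost(*state) for state, freq in runner_freq.items())

print(f"cost of an out, runner on 1B, 0 out: {out_cost(1, 0):.2f}")
print(f"pre-runner out value (placeholder data): {pre_runner_out:.3f}")
```

With real scoring chances and real frequencies for a given lineup spot in place of the placeholders, the same weighted sum would give that spot's pre-runner out value.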

Mathematically, this is great, because it eventually reduces the runners on base down to zero at the end of each inning.

For someone like Pedro, the -.12 would probably be like -.08 or something, simply because each runner on base has less chance of scoring.

As for the post-runner out, that one is a lot easier. We have the RE for each of the 9 batting spots with no one on base. You follow the same process as above, except now you only have 3 combinations (0 to 1 out, 1 to 2 outs, and 2 to 3 outs). So, if the RE are .55, .30, .10, then we know the 1st out costs .25 runs, the 2nd out costs .20 runs and the third out costs .10 runs. Figure out how often a batter makes the 1st, 2nd, or 3rd out, and that gives you the post-runner out value.
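The same weighting can be sketched in a few lines of Python. The RE values are the ones from the example in the post; the mix of 1st, 2nd and 3rd outs is a hypothetical placeholder.

```python
# Bases-empty run expectancy by outs, as in the post's example.
re_empty = [0.55, 0.30, 0.10, 0.0]   # 0, 1, 2, 3 outs

# Cost of making the 1st, 2nd and 3rd out with nobody on.
out_costs = [re_empty[i] - re_empty[i + 1] for i in range(3)]

# Hypothetical mix of how often one lineup spot makes each out
# (placeholder numbers; they must sum to 1).
out_share = [0.40, 0.32, 0.28]

post_runner_out = -sum(s * c for s, c in zip(out_share, out_costs))

print([round(c, 2) for c in out_costs])
print(round(post_runner_out, 3))
```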

Again, what we are trying to figure out is the run impact of a batter batting in a particular batting spot. Once the batters around him change, however (say, moving the pitcher from the #9 hole to the #7 hole), the whole thing changes.

If you compare the NL/AL number, you will note that the NL 7 and 8 hitters take a huge beating in LW, simply because of the "automatic out" coming up after them.

In the NL, much more than the AL, it is more important to make sure you have the right batting order. If you can get away putting a good batter in the #7 spot in the AL, you definitely cannot do that in the NL.

David Smyth posted November 20th, 2000 11:09 PM

Well, if you're making the assumption that there are no other baserunners when a CS occurs, then the (absolute or RC) value of a CS should be equal to the lost baserunner only, or about -.27 runs. Yet you state in your first post that the (absolute or RC) value of a CS is the sum of the lost BR plus the first out component, which would be around -(.27 + .12) = -.39.

Which is it?

The -.12 value, being the same as the current RC out value, implies that you believe that there are in fact an 'average' number of other baserunners when a CS occurs.

Maybe I'm dense, but there seems to be a big discrepancy there.


tangotiger posted November 21st, 2000 12:04 AM

I should never have said "first" out component in my first post. I corrected it in my second post by saying that you should use the post-runner out.

For CS, for absolute runs (Runs Created), the .27 figure is the only one you should use. For relative runs to average, it should be .27 + .19 = .46

David Smyth posted November 21st, 2000 08:22 PM

I'll make one more comment on CS, and then I'll shut up.

I have little doubt that the .46 LW CS value is correct.

But I sure wish that the sub-component analysis would add up. It doesn't, the way it is presently structured.

If you simply add the .27 and .19 to get .46, you are implying not only that there are virtually no other baserunners when a CS occurs; you are also implying that virtually no other batters reach base after the CS. In toto, you are implying that there are virtually no runners left on base in innings featuring a CS.

If there are only half the runners left on base in a CS inning as compared to an 'average' inning, then you have to add .05 runs or so to the .46, which is -.51 runs.

If the .46 is indeed correct, it means that a) the baserunner is worth less than .27, for some reason, b) the run expectancy is less than .19, for some reason, c) there are virtually no runners LOB in CS innings, or d) a combination of the above.

Since only 1 out of about 30 innings features a CS, these are indeed atypical innings. Maybe the right answer is d.

tangotiger posted November 22nd, 2000 11:25 AM

No need to shut up David, as most of my inspirations are a result of other people speaking.

Now, I did make my first assumption that when a CS happens that there are no runners on base AT THAT POINT IN TIME. This is my theory behind the "pre-runner out".

Now, the interesting thing you bring up is what happens to runners left on base AFTER the out. The way I was thinking between pre-runner outs and post-runner outs is that the runners left on base is ONLY those runners that are on base at that point in time. Any future runners left on base would not be affected by that out.

So, the first out, with no one on base, has zero runs created. If the second batter gets onto 1st base, that is worth .28 runs (chance of a guy scoring from 1B, with 1 out). If batter #3 gets out, it reduces the chance that the runner scores from .28 to .14 runs. Therefore, the 2nd out is worth -.14 runs. If the cleanup hitter also gets out, it reduces the chance of the runner scoring from .14 to .00. The 3rd out is worth -.14 runs. Add it all up and we get 0 + .28 - .14 - .14 = 0, which is how many runs scored in that inning.
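A small Python sketch of this bookkeeping for the out, single, out, out inning. It tracks only the one runner's chance of scoring, using the .28 and .14 figures from the post; the names are illustrative.

```python
# Chance the runner on 1B scores, by number of outs (figures from the post).
score_from_first = {1: 0.28, 2: 0.14, 3: 0.00}

events = []
events.append(("out", 0.00))                          # nobody on base: nothing lost
events.append(("single", score_from_first[1]))        # runner reaches with 1 out
events.append(("out", score_from_first[2] - score_from_first[1]))  # 1 out -> 2 outs
events.append(("out", score_from_first[3] - score_from_first[2]))  # 2 outs -> 3 outs

for name, value in events:
    print(f"{name:6s} {value:+.2f}")

inning_total = sum(value for _, value in events)
print(f"total  {inning_total:+.2f}")   # matches the 0 runs actually scored
```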

Now, the thing that you bring up is that it's not fair to penalize the 2nd and 3rd outs more, while the 1st out gets off scot-free. Completely valid, and I'd have to think about this.

The plus/minus process though removes this discrepancy. You start off at .56 runs each inning. The first batter gets an out, and the RE goes down to .30, thereby making the 1st out -.26 runs. The next batter gets on base, increasing the RE to .57, making the single .27 runs. The 3rd hitter gets the 2nd out, reducing the RE to .24, making the 2nd out worth -.33 runs. The cleanup hitter gets the last out, making the RE 0, and making the 3rd out -.24 runs. Add it up and you get -.26 + .27 -.33 - .24 = -.56. So, by not scoring, you have -.56 runs from expected average.
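The plus/minus version is just the change in RE from one state to the next. A minimal Python sketch, using the RE figures quoted in the paragraph above (the list and variable names are illustrative):

```python
# RE before the inning and after each event: out, single, out, out.
re_path = [0.56, 0.30, 0.57, 0.24, 0.00]
labels = ["out", "single", "out", "out"]

# Each event is credited with the change in run expectancy it caused.
deltas = [after - before for before, after in zip(re_path, re_path[1:])]

for label, delta in zip(labels, deltas):
    print(f"{label:6s} {delta:+.2f}")

# The changes telescope: final RE minus starting RE.
print(f"total  {sum(deltas):+.2f}")
```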

But the actual runs scored is 0 runs. How do you add that .56 back onto the hitters so it adds up to zero? MGL argues to add it back to ALL hitters, including the guy who got the hit, though that sounds wrong (since we argued that if a HR is worth 1.4 runs, and when we run our regression against RC it is always 1.4, why should it be different now?). Maybe we should add the .56 equally back to all 3 outs? That means adding .187 runs to each of the 3 outs, making the 1st out worth -.073 runs, the second out worth -.143 runs, and the 3rd out worth -.053 runs, for a total of -.27 runs (which cancels out the +.27 runs for the hit).

So, what is correct? My original assumption of
0, -.14, and -.14 or
-.07, -.14, and -.05?

I don't know. But I just want to bring up the point that if the inning went 1-2-3, then all the outs would be worth 0 runs. Because it so happened that a batter did get on base after the fact, does that make the 1st out more costly?

I suspect that I am wrong on this issue, and maybe the out should have THREE components: the lost value of the current runners, the lost value of the future runners, and the lost value of EXPECTED runs. The first two components should add up to around -.10 or -.12 and give us RC, and adding in the 3rd component gives us +/-LW.





tangotiger posted November 22nd, 2000 11:33 AM

Going back to:
-.26 + .27 -.33 - .24 = -.56

We see that the total value of the outs is -.83 runs. We also know that since no runs scored, there is .56 runs too much accounted for. If we do a straight percentage, and take .56/.83 = 67.5%, and multiply each of these out values by the remaining 32.5%, we get:
-.085 + .27 - .107 - .078 = 0 runs

That makes the 3 outs worth:
-.085, -.107, and -.078

These numbers make more sense, as mathematically the spreading of the -.56/3 does not make sense for the 1-2-3 situation. Those values are:
-.26 - .19 - .11 = -.56
If we add .56/3 back equally, we actually get a POSITIVE value for the 3rd out, which is obviously wrong.
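The two ways of handing the .56 back to the outs can be compared side by side. This is only a sketch of the arithmetic in the last two posts; the function names are invented for illustration.

```python
def add_back_equally(out_values, surplus):
    """Give each out the same share of the surplus."""
    share = surplus / len(out_values)
    return [v + share for v in out_values]

def scale_proportionally(out_values, surplus):
    """Shrink every out by the same percentage so the surplus is removed."""
    total = -sum(out_values)          # combined size of the outs, as a positive number
    keep = 1 - surplus / total        # e.g. 1 - .56/.83 = 32.5%
    return [v * keep for v in out_values]

single_inning = [-0.26, -0.33, -0.24]   # out, single, out, out
one_two_three = [-0.26, -0.19, -0.11]   # three up, three down

for name, outs in [("with single", single_inning), ("1-2-3", one_two_three)]:
    equal = [round(v, 3) for v in add_back_equally(outs, 0.56)]
    prop = [round(abs(v), 3) * (-1 if v < -5e-4 else 0) for v in scale_proportionally(outs, 0.56)]
    print(name, "equal:", equal, "proportional:", prop)
```

In the 1-2-3 inning the equal split leaves the 3rd out at about +.077, while the proportional version sends all three outs to zero.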

David Smyth posted November 23rd, 2000 12:22 AM

If I only knew the average number of runners left on base at the end of an inning featuring a CS, compared with the average number of runners left on base at the end of an overall inning, I'd be able to finish this.

Tango's point of using the run expectancy for each out, instead of the average .19 value is one of the missing links. If CS occur not evenly, but 20% with no outs, 35% with one out, and 45% with two outs (my unproven estimates), then applying these frequencies to the individual run expectancy out values (-.26, -.19, and -.11) results in a -.168 overall run expectancy value instead of -.19.

Similarly, if you apply those occurrence percentages to the value of baserunners on first, you will get a value less than .27; maybe .24. The added up 'savings' is about .052, which is in the estimated range of the lost value of the runners left on base at the end of a CS inning.
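A quick Python check of that weighting. The 20/35/45 split is the unproven estimate from the post, and the .24 runner value is the post's "maybe" figure rather than something computed here.

```python
# Share of CS with 0, 1 and 2 outs (the post's unproven estimates).
cs_share = [0.20, 0.35, 0.45]

# Run expectancy cost of the 1st, 2nd and 3rd out (figures from the thread).
out_cost = [0.26, 0.19, 0.11]

weighted_out = sum(s * c for s, c in zip(cs_share, out_cost))

# Savings relative to the flat .19 out and the flat .27 runner value.
out_savings = 0.19 - weighted_out
runner_savings = 0.27 - 0.24          # .24 is the post's rough guess
total_savings = out_savings + runner_savings

print(round(weighted_out, 3), round(total_savings, 3))
```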

I'm simply searching for a way to have a logical value for each CS component, and to have the relevant components add up to the RC CS value of -.32 or so, and the LW CS value of -.46.

tangotiger posted November 24th, 2000 02:40 PM

There's something else that is troubling about the out. We always talk about the RE for each inning going from .56 to 0 on the third out. And when we apply RC, we say that a 1-2-3 inning produces 0 runs for that inning.

But the 1-2-3 inning also has a future effect on later innings, as you've now reduced the number of innings that a team can score by 1. The average team scores 5 runs / 9 innings. On average, the 1-2-3 inning will end up in the 5th inning, thereby reducing the available innings from 5 to 4.

If the average team scores 2.78 runs in 5 innings, they will score 2.22 runs in 4 innings.

So, while I am always talking about RC being 0 runs, I'm thinking that should be -.56 runs for the 1-2-3 inning (as MGL always asserts).

The mistake I always make is that I treat the end of the inning as the end of the game, which is why I always want the RC at that point in time to equal the actual runs scored. But there are still innings to be played.

My current thought on outs is to always use the RE, so that you are measuring against future results. If Pedro's Runs/Game is 1.8, then his RE for each inning is .20, making each out worth -.067 runs. Every inning he goes 1-2-3 makes it a -.20 run inning. If he pitches a perfect game, then that makes it -1.8 runs. To get RC, you add back the expected RE for the game (which is 1.8), and you get zero.

Therefore the fudge factor is RE / 3 (which David talked about at some point), applied only to the outs.
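A minimal Python sketch of the per-inning RE arithmetic in this post, assuming 9-inning games and runs spread evenly across innings (both simplifications; the function names are illustrative).

```python
INNINGS_PER_GAME = 9

def inning_re(runs_per_game):
    """Average run expectancy at the start of an inning."""
    return runs_per_game / INNINGS_PER_GAME

def out_value(runs_per_game):
    """RE / 3 fudge factor: each out's share of the inning's expectancy."""
    return -inning_re(runs_per_game) / 3

# Average team, 5 runs per game: expected runs over 5 vs. 4 remaining innings.
runs_in_5 = inning_re(5.0) * 5
runs_in_4 = inning_re(5.0) * 4

# Pedro at 1.8 runs per game: 27 straight outs, then add the game RE back.
pedro_rpg = 1.8
perfect_game_lw = 27 * out_value(pedro_rpg)
perfect_game_rc = perfect_game_lw + pedro_rpg

print(round(runs_in_5, 2), round(runs_in_4, 2))
print(round(out_value(pedro_rpg), 3), round(perfect_game_lw, 2), round(abs(perfect_game_rc), 2))
```

Twenty-seven outs at -.067 each sum to the full game expectancy with the sign flipped, so adding the game RE back returns zero runs created.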




tangotiger posted November 24th, 2000 04:48 PM

I think I used some confusing examples in my last post. In the end, the fudge factors should be .26/.19/.11 for 0,1,2 outs to switch between LW and RC.

As for my theory of pre-runner and post-runner outs, etc., and the question of whether there are runners on base in the inning of a CS: those are also irrelevant. All that matters is the RE for the GAME.

David Smyth posted November 24th, 2000 08:17 PM

I'm not sure about everyone else, but I'm having trouble understanding exactly what Tango is talking about.

For me, the role of outs is clear. The only way an out has a tangible effect on runs is by the loss of baserunners. Either the runner is lost during the inning (CS, GDP, etc.), or at the end of the inning (LOB). The number of runners LOB in a game is a function of the number of innings per game. There are 27 outs per game. What gives an out its specific (absolute) value is the way the 27 outs are divided. If a game consisted of only one inning of 27 outs, the number of runners LOB per game would be only 1/9 of what it is now, and the R/G would therefore be much much higher than it is now, with no change in SLG or OBA. The (absolute) value of an out would be only about 1/9 of what it is now. But the (run expectancy) value of an out would be higher than it is now, maybe much higher.

For me, the key to an understanding of the (absolute) value of an out is the number of runners LOB per inning (and, of course, their distribution).

tangotiger posted November 25th, 2000 02:36 PM

"during the inning, at the end of the inning".... or in future innings. That's the point of going 1-2-3. There has to be a cost to the runs if you reduce the available innings by 1.

But to translate the plus/minus LW back to actual Runs scored, you have to add those out factors of .26, .19, .11 (0,1,2, outs), and you end up getting exactly your statement: "during the inning, at the end of the inning"

David Smyth posted November 26th, 2000 11:00 AM

Back to the RC out value, and your example of out, single, out, out. The way it technically works is 0, +.27, 0, -.27

The BR is not lost until the 3rd out. But it still makes sense to apportion the -.27 among more than one out. And the only way which makes sense is to apportion equally among all 3 outs which would be -.09, +.27, -.09, -.09

The first out should not get off scot-free because it was what set the BR value at +.27 instead of +.39 (value of runner with no outs). Notice that the -.09 is right around the correct overall RC out value.

For a CS, the cost with no outs is the lost BR, which is -.39. With 1 out it's also the BR value, which is -.27. For a 2-out CS, it's the BR (-.13) plus the value of any runner left at third base (-.30, I think). If CS occur equally with regard to outs in the inning, the average BR value is -.263. How often would there be a runner on 3rd on a 2 out CS? I do know that the 1,3/2 out situation occurs about 1/6 as often as the 1/2 out situation. Assuming equal propensity to steal, the answer would be 1/6 * 1/3 = .056, and .056 * -.30 is -.017. Add that to the -.263 and you have -.280 for a CS.

What about steals of 3rd? Presumably almost all of them occur with 1 out, because you're not supposed to risk the 1st or 3rd out at 3rd base. The value of a lost BR from 2nd/1 out is .43 runs. If steal atts at third make up 1/10 of total atts, then adding in the -.43 in this proportion produces an overall CS value of -.295.

How to get from -.295 to the LW value of .46? If you add .56/3 to .295, you get .482, a bit too high. But remember, the .295 is based on the assumption that CS occur equally with 0,1, or 2 outs. If you change that assumption to 30%/33%/37%, and run thru the math again you get -.46 for the LW value, and -.273 for the RC value.
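For readers following along, here is a Python sketch that reproduces only the equal-outs arithmetic as stated in this post (the -.280, -.295 and .482 figures). It does not attempt the 30%/33%/37% variant, and all names are illustrative.

```python
# Value of the lost runner on 1B with 0, 1 and 2 outs (figures from the post).
br_value = {0: 0.39, 1: 0.27, 2: 0.13}
avg_br = sum(br_value.values()) / 3                  # CS spread evenly over outs

# Runner stranded on 3rd by a 2-out CS: 1/6 as common as the lone-runner
# case, and 2-out CS are 1/3 of all CS under the equal-outs assumption.
stranded_share = (1 / 6) * (1 / 3)
cs_at_second = avg_br + stranded_share * 0.30

# Blend in steals of 3rd (assumed 1/10 of attempts, all with 1 out).
cs_overall = 0.9 * cs_at_second + 0.1 * 0.43

# Add the average out's share of the inning's .56 starting RE.
cs_lw = cs_overall + 0.56 / 3

print(round(avg_br, 3), round(cs_at_second, 3), round(cs_overall, 3), round(cs_lw, 3))
```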

[Edited by David Smyth on November 26th, 2000 at 10:08 AM]

David Smyth posted November 26th, 2000 06:13 PM

I realize I screwed up a bit in the above post. I'll be back when I figure it out.

David Smyth posted November 26th, 2000 08:02 PM

I did some refiguring. The loss of the BR on a CS is around -.250 (in a league with a .56 initial expectancy). This refiguring includes CS at 2nd, 3rd, and Home, and estimates of the frequencies of each with 0, 1, and 2 outs.

All that is missing is the value of the other runners left on base at the end of an inning featuring a CS. All that is needed to figure it out is the actual number of runners left on each base and the actual number of innings featuring a CS. The RE table will do the rest. It requires a PBP database to discern these totals, and I don't have it.

David Smyth posted November 27th, 2000 10:23 AM

So I used the shortcut trick again from my earlier post to make up for missing data. I know how often an inning ends on a CS. I can estimate the number, distribution, and therefore value of any other ROB at that point. By charging the FULL value against the CS (instead of 1/3), I am making up for not being able to count the times when there are LOB and the CS occurred earlier in the inning.

Anyway, as stated above, the BR is worth about -.252 runs. The LOB is worth about -.026 runs, for a total RC value of -.278 for a CS (plus or minus about .02, if some of my estimates are off). Add the .56/3, and you have -.464. These numbers include steals of 2nd, 3rd, and Home.
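The final tally is simple enough to check in Python. The two component values are the estimates from this post; the variable names are illustrative.

```python
lost_runner = -0.252        # value of the baserunner erased by the CS
stranded_runners = -0.026   # estimated value of other runners left on base

rc_cs = lost_runner + stranded_runners   # absolute (RC) value of a CS
lw_cs = rc_cs - 0.56 / 3                 # add the average out's share of .56

print(round(rc_cs, 3), round(lw_cs, 3))
```

The LW figure comes out within a thousandth of the -.464 quoted above, the gap being rounding of .56/3.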

tangotiger posted November 27th, 2000 01:42 PM

"Back to the RC out value, and your example of out, single, out, out. The way it technically works is 0, +.27, 0, -.27 "

Technically, that is not correct. The third hitter, with his second out, reduces the chance that the runner on 1st will score to let's say .14. Therefore, the 3rd batter would be something like -.13, and the cleanup hitter would be -.14.

As for the proportioning of the out, I go back to my 1-2-3 scenario, and how the values of those outs are -.26,-.19, -.11. Therefore, you should never use the equal distribution of -.56/3 to turn LW into RC, since you will end up with a positive value for the 3rd out.

I'll have to think about your CS method.



David Smyth posted November 27th, 2000 04:41 PM

You are right, Tango. I shouldn't have said technically correct. What I was getting to is that if you only have the data on the 3rd out--the inning-ending out--you can, as a trick or approximation, put all the value on that out. So you have only 1/3 of the outs, but you weight them 3 times as much. It was useful to make up for the lacking data on CS occurrences. As far as using .56/3, you can use that if you're applying it to an average out value, as is the -.278 CS value. I did not try to apply it to the specific RC CS values for a certain out in the inning.

That's the problem with trying to convert between RC and LW. The RC out value (of -.1 or so) is an 'average' of the 1st, 2nd, and 3rd outs. More precisely, it is not conceptually appropriate to divide up the -.1 into 3 different values. Each out is dependent on the others for its identity. The 3rd out is the 3rd out only because the 1st and 2nd outs have occurred, etc. The LW scheme is just the opposite, depending completely on the distinction between the various outs. So it probably can be stated that trying to convert between RC and LW outs is like asking how many apples are three oranges.

[Edited by David Smyth on November 27th, 2000 at 05:42 PM]

mgl posted December 3rd, 2000 06:14 AM
Senior Member
Member Since: Apr 2000
Location:

Sorry I haven't been around lately. I've got finals all next week and I'm also building a house (not actually building it myself). Anyway, I haven't forgotten about revising my sim and posting it on James' web site. It will just take a little longer than I thought. It will be slick though! Later!

tangotiger posted December 3rd, 2000 03:21 PM

Here are my "final" LW for each league.

AL      out      bb       s       d       t      hr      sb      cs     sb%
0    -0.311   0.349   0.488   0.793   1.046   1.423   0.180  -0.455   71.6%
1    -0.326   0.378   0.488   0.788   1.020   1.318   0.209  -0.538   72.0%
2    -0.326   0.388   0.509   0.806   1.046   1.362   0.193  -0.507   72.5%
3    -0.310   0.344   0.484   0.772   1.028   1.410   0.167  -0.447   72.8%
4    -0.311   0.348   0.507   0.808   1.090   1.475   0.160  -0.440   73.4%
5    -0.296   0.351   0.503   0.817   1.074   1.466   0.176  -0.428   70.9%
6    -0.283   0.331   0.476   0.789   1.048   1.462   0.175  -0.405   69.8%
7    -0.289   0.331   0.473   0.775   1.029   1.455   0.166  -0.417   71.5%
8    -0.310   0.322   0.468   0.783   1.028   1.452   0.181  -0.433   70.5%
9    -0.334   0.351   0.488   0.807   1.074   1.420   0.197  -0.469   70.5%

NL      out      bb       s       d       t      hr      sb      cs     sb%
0    -0.284   0.327   0.470   0.762   1.006   1.422   0.167  -0.425   71.8%
1    -0.309   0.353   0.470   0.746   0.972   1.302   0.189  -0.523   73.5%
2    -0.309   0.378   0.495   0.768   1.022   1.341   0.169  -0.507   75.0%
3    -0.291   0.359   0.488   0.766   0.994   1.386   0.164  -0.447   73.1%
4    -0.286   0.354   0.505   0.806   1.058   1.452   0.168  -0.424   71.6%
5    -0.263   0.331   0.489   0.799   1.061   1.465   0.174  -0.385   68.9%
6    -0.249   0.313   0.463   0.762   1.035   1.454   0.163  -0.366   69.2%
7    -0.254   0.275   0.429   0.743   1.017   1.462   0.175  -0.347   66.5%
8    -0.269   0.280   0.438   0.733   0.988   1.470   0.152  -0.374   71.0%
9    -0.318   0.332   0.480   0.779   1.036   1.461   0.160  -0.448   73.7%
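The sb% column looks consistent with a break-even success rate, i.e. the rate at which the sb gains and cs losses cancel. That reading is an inference from the numbers, not something stated in the post. A short Python sketch that checks it on a few rows copied from the tables:

```python
# (league, row): (sb, cs, listed sb%) copied from the tables above.
rows = {
    ("AL", 0): (0.180, -0.455, 71.6),
    ("AL", 1): (0.209, -0.538, 72.0),
    ("NL", 0): (0.167, -0.425, 71.8),
    ("NL", 7): (0.175, -0.347, 66.5),
}

def break_even_pct(sb, cs):
    """Success rate p (in percent) at which p*sb + (1-p)*cs = 0."""
    return 100 * -cs / (sb - cs)

for (league, row), (sb, cs, listed) in rows.items():
    print(f"{league} {row}: computed {break_even_pct(sb, cs):.1f}%, listed {listed}%")
```

The computed rates agree with the listed ones to within about a tenth of a point, which is what rounding of the sb and cs inputs would produce.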

I looked at the 1975 Reds, and I figure their optimal order as follows:
1 - Morgan (or 2)
2 - Rose (or Griffey)
3 - Griffey
4 - Foster
5 - Bench
6 - Perez
7 - Geronimo
8 - Concepcion

As I do these things for each team, it always comes down to the same thing. The #1 hitter has the biggest variance, so it's important to get that guy first. The #2 hitter is the next hardest one. After that, it basically comes down to best to worst hitter, in this order: 4,5,3,6,7,8,9.

I do not have the lefty/righty splits.

I also tried something else, using 9 equal hitters, but of differing abilities, and this is how they came out:
1 - Morgan, 2B, 1976
2 - Rickey, LF, 1990
3 - Boggs, 3B, 1987
4 - Piazza, C, 1997
5 - ARod, SS, 1996
6 - Junior, CF, 1994
7 - Thome, 1B, 1996
8 - Bonds, RF, 1994
9 - Molitor, DH, 1987

As usual, most of the variation came in the top 2 spots, meaning even with a balanced attack, it is important to get those 2 right.

Anyway, I think I'm through with all this. As promised, my worksheet can be found here:
http://www.geocities.com/tmasc/BattingOrder.xls
It comes with no documentation, but I've color coded the cells, so that it looks half-decent. You can look into the formulae to see what I did, and why.

tangotiger posted June 11th, 2001 09:35 AM

I'm bringing this thread forward in relation to the other batting order thread. I suggest to the new readers that you take a couple of hours and plod through this.

I agree that a properly constructed simulator is better. But this should be enlightening as well.
