Tango on Baseball Archives

© Tangotiger

Empirical Win Probabilities (August 28, 2003)

Because of the size of the sample, you should be careful in using this.

--posted by TangoTiger at 02:22 PM EDT


Posted 4:08 p.m., August 28, 2003 (#1) - bob mong
  Thanks for posting this, Tango.

Posted 6:17 p.m., August 28, 2003 (#2) - Rob Wood
  Do the +/- 6 run leads in the table include leads of greater than 6 runs as well? I am guessing that they do. Plus, I have fiddled around with some of the numbers and the counts do not seem to "reconcile". I will refrain from posting any specific (alleged) anomalies until I play around some more.

I'd also like to extend thanks to Phil B. for gathering this data.

Posted 7:31 p.m., August 28, 2003 (#3) - Patriot
  By "anomalies" do you mean things like, for example, with a 3-run deficit in the 6th, 2 outs, and a runner at first, you have a lesser chance of winning than you do with no one on base in the same situation? If you look at some of the sample sizes on these, I don't think it would be too surprising if some of that happened. I think to get accurate Win Probs you would have to do a simulation as Tango does. The empirical approach may work for RE, but probably not for WE.

Posted 12:23 a.m., August 29, 2003 (#4) - Alan Jordan
  I made a little approximation equation using the data provided. I have no theoretical knowledge of how exactly this model should be specified, so I just did a logistic regression. I looked at interactions, and even though they were significant, they didn't add much to the predictive power, so I dropped them. I even dropped the Inning variable because it's so correlated to the difference in runs. With this huge sample size it had a p value of only .01, and when I rounded the coefficient off to two decimal places it was .00, so why bother adding it to the model? Anyway, here it is.

WE=exp(LF)/(1+exp(LF))

LF=
0.58 +
HOME *0.5 +
DIFRUNS *0.7 +
OUTS * -0.18 +
SIT2=1 *-0.66 +
SIT2=2 *-0.5 +
SIT2=3 *-0.41 +
SIT2=4 *-0.28 +
SIT2=5 *-0.34 +
SIT2=6 *-0.21 +
SIT2=7 *-0.1

(Sit2 is situation but only goes 1-8, outs have been recoded into another variable).

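The area under the ROC curve is .83, which roughly means that if you took one situation that ended in a win and one that ended in a loss, the model would give the higher WE to the winning one about 83% of the time. Of course it's biased upwards a little, since it's measured on the same data the model was fit to. The more complicated models I tried didn't get above .84.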
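
One common way to read an area under the ROC curve is as a ranking probability: the chance that a randomly chosen winning situation gets a higher WE than a randomly chosen losing one. Here is a minimal Python sketch of that calculation. The function name `auc_by_pairs` and the toy numbers are made up for illustration; they are not from Phil's data.

```python
def auc_by_pairs(win_probs, outcomes):
    """Area under the ROC curve, computed as a pairwise ranking probability.

    win_probs: predicted WE for each situation.
    outcomes:  1 if the team went on to win, 0 otherwise.
    """
    winners = [p for p, y in zip(win_probs, outcomes) if y == 1]
    losers = [p for p, y in zip(win_probs, outcomes) if y == 0]
    score = 0.0
    for w in winners:
        for l in losers:
            if w > l:
                score += 1.0   # winner ranked above loser
            elif w == l:
                score += 0.5   # ties count as half
    return score / (len(winners) * len(losers))


# Hypothetical toy data, for illustration only.
toy_we = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
toy_won = [1, 1, 0, 1, 0, 0]
toy_auc = auc_by_pairs(toy_we, toy_won)
print(toy_auc)
```

In the toy data, 8 of the 9 winner/loser pairs are ordered correctly, so the function returns 8/9.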

If anybody has any better ideas let me know.

Posted 10:02 a.m., August 29, 2003 (#5) - tangotiger
  I understand your issue with innings, but that can't be right, especially with men on base and with the score.

I would add that interaction you did with Innings to "SIT" and "Difruns". Even if it adds little to the overall predictive power, it will add *a lot* to the predictive power for the 9th inning of a tie game with men on base.

(Can I guess SIT2=8 will be multiplied by zero?)

Posted 10:28 a.m., August 29, 2003 (#6) - Alan Jordan
  Here's a model for the ninth inning

LF=
1.2898 +
HOME *0.5231 +
Diffruns *1.4595 +
OUTS *-0.3738 +
SIT2=1 *-1.2722 +
SIT2=2 *-0.9408 +
SIT2=3 *-0.7069 +
SIT2=4 *-0.4566 +
SIT2=5 *-0.5013 +
SIT2=6 *-0.2663 +
SIT2=7 *-0.106

The interaction between sit and difruns adds nothing substantial at all to the predictive power according to this data set and the models. Both models have areas under the curve of .93. Sure the model with the interaction would have a slightly higher area under the curve, but you would have to go to the third decimal point to see it. It's not worth adding six more terms to your model. Remember that this model is essentially multiplicative, not like a linear regression which is additive.

If you did a model like this for each inning the area under the curve would be .84. Maybe you care more about the late innings, I don't know.

Yes Sit2=8 is multiplied by 0.

Posted 10:52 a.m., August 29, 2003 (#7) - tangotiger (homepage)
  Tremendous stuff Alan!

Again, doing only the work that you feel is worth your time, see if a model for any of the following interests you:

- 9th inning, score within 3 runs
- 9th inning, rest
- 8th inning, score within 3 runs
- 8th inning, rest
- 7th inning, score within 3 runs
- 7th inning, rest
- 1 thru 6, all

If you go to the homepage link, you will see that I have generated WE using a math model. Feel free to run your system against that if you want.

Again, what has been presented is excellent work, so thanks!

Posted 2:27 p.m., August 29, 2003 (#8) - studes (homepage)
  Alan, this is great. But could you interpret the table a little, please? What is LF? Just a name for the function? And how do the calculations work, exactly? Is each variable a 1 or 0 and are all the factors additive? Do you only add the one situation that applies? Thanks very much.

Posted 10:17 p.m., August 29, 2003 (#9) - Alan Jordan
  O.K. Tango, first, what do you mean by within 3 runs?

if abs(diffruns)<=3 then w3=1; else w3=0;

or

if diffruns>=0 and diffruns<=3 then w3=1; else w3=0;

The first groups games into close or not; the second groups games into (tied or small lead) vs. (big lead or behind by at least one run). Both of these lumpings of diffruns cause the predictive ability to drop off steeply.

****************************************

Studes

The basic model is the logistic function

WE=exp(LF)/(1+exp(LF))

where LF is a linear function, i.e. a straight-line equation. exp(LF)/(1+exp(LF)) bends the straight line into an S-shaped curve that can never quite hit 1 or 0. For those of you familiar with odds, the model can also be expressed as:

WE/(1-WE)=exp(LF) or

ln(WE/(1-WE))=LF

Logistic regression is a generalization of the additive linear regression model, but because all the coefficients are actually exponents of e, it's really a multiplicative model.

exp(m+n)=exp(m)*exp(n)
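
To see what "multiplicative" means here, the following Python sketch checks that adding a coefficient to LF multiplies the odds WE/(1-WE) by exp(coefficient). It uses the Diffruns coefficient from the ninth-inning model posted earlier; the helper names and the starting value of LF are arbitrary choices for illustration.

```python
import math


def logistic(lf):
    """WE = exp(LF) / (1 + exp(LF))."""
    return math.exp(lf) / (1.0 + math.exp(lf))


def odds(we):
    """Odds of winning, WE / (1 - WE)."""
    return we / (1.0 - we)


RUN_COEF_9TH = 1.4595  # Diffruns coefficient from the ninth-inning model

lf_before = -1.0                       # arbitrary starting LF, illustration only
lf_after = lf_before + RUN_COEF_9TH    # same situation, one more run of lead

odds_ratio = odds(logistic(lf_after)) / odds(logistic(lf_before))

# Both numbers agree: one extra run multiplies the odds by exp(1.4595).
print(round(odds_ratio, 2), round(math.exp(RUN_COEF_9TH), 2))
```

Under that model, each additional run of lead in the ninth multiplies the odds of winning by roughly 4.3, whatever the starting situation.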

As for the last model I posted, which was:

LF=
1.2898 +
HOME *0.5231 +
Diffruns *1.4595 +
OUTS *-0.3738 +
SIT2=1 *-1.2722 +
SIT2=2 *-0.9408 +
SIT2=3 *-0.7069 +
SIT2=4 *-0.4566 +
SIT2=5 *-0.5013 +
SIT2=6 *-0.2663 +
SIT2=7 *-0.106

Only diffruns and outs are continuous variables. All of the others are dummy variables (0,1). Home is 1 when it's the home team and 0 when it's the visiting team. SIT2=1 is 1 when situation=1 and 0 otherwise. If you have K groups then you need K-1 dummy variables. If sit2=1 through sit2=7 all =0, then logically situation=8. Therefore there is no reason to create a dummy variable for sit2=8. It actually causes problems with the matrix algebra if you do.

Now for an example. Suppose it's the ninth inning (the model only works for the ninth inning) and the home team has men on 2nd and 3rd with 1 out and they are behind by 2 runs.

Since it's the home team, home=1, and since they are behind by 2, diffruns=-2, outs=1, and "sit2=7"=1 because we have men on 2nd and 3rd. All other sit2 variables must =0.

1.2898 +
1 *0.5231 +
-2 *1.4595 +
1 *-0.3738 +
0 *-1.2722 +
0 *-0.9408 +
0 *-0.7069 +
0 *-0.4566 +
0 *-0.5013 +
0 *-0.2663 +
1 *-0.106

LF=-1.59

and

WE=.17

unless I screwed something up.
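
For anyone who wants to check the arithmetic, here is a short Python sketch of the ninth-inning model using the coefficients as posted. The function and constant names are invented for illustration, and the `sit2` argument is just the situation code (1-8) as described above, with 8 as the reference level.

```python
import math

# Ninth-inning coefficients as posted in the thread.
INTERCEPT = 1.2898
HOME_COEF = 0.5231
DIFFRUNS_COEF = 1.4595
OUTS_COEF = -0.3738
# SIT2 dummy coefficients; code 8 is the reference level (coefficient 0).
SIT2_COEF = {1: -1.2722, 2: -0.9408, 3: -0.7069, 4: -0.4566,
             5: -0.5013, 6: -0.2663, 7: -0.106, 8: 0.0}


def ninth_inning_we(home, diffruns, outs, sit2):
    """Return (LF, WE) for the ninth-inning model.

    home:     1 for the home team, 0 for the visiting team
    diffruns: run difference (negative when behind)
    outs:     0, 1 or 2
    sit2:     situation code 1-8
    """
    lf = (INTERCEPT
          + HOME_COEF * home
          + DIFFRUNS_COEF * diffruns
          + OUTS_COEF * outs
          + SIT2_COEF[sit2])
    we = math.exp(lf) / (1.0 + math.exp(lf))
    return lf, we


# The worked example: home team, down 2, 1 out, sit2=7 (men on 2nd and 3rd).
lf, we = ninth_inning_we(home=1, diffruns=-2, outs=1, sit2=7)
print(round(lf, 2), round(we, 2))  # -1.59 0.17
```

The result agrees with the hand calculation above.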

Posted 3:14 a.m., August 31, 2003 (#10) - Alan Jordan
  O.K., I see what you're doing, Tango. Here is an equation that will allow you to compare WEs for your table. I have already compared them. I merged your WEs, mine, and the actuals. I estimated the number of games won for both systems by multiplying the WE by the number of games. That way scenarios with 7,000 games got more weight than those with 50. I then calculated discrepancies as
abs(estWE-ObservedWE) for both systems. Yours had 16,973 discrepancies and mine had 12,064. This is from a base of 156,857 games played, for whatever that tells you.
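
One possible reading of that comparison, sketched in Python: weight each scenario's WE gap by its number of games, so the gap is expressed in games won, then add the gaps up. This is an assumption about how the totals were formed, not a confirmed description of the calculation, and the rows below are hypothetical placeholders rather than numbers from the table.

```python
def total_discrepancy(rows):
    """Sum of |estimated wins - observed wins| over all scenarios.

    Each row is (games, estimated_we, observed_we). Multiplying by games
    turns a WE gap into a gap in games won, so a scenario seen 7,000 times
    counts for more than one seen 50 times.
    """
    return sum(games * abs(est - obs) for games, est, obs in rows)


# Hypothetical rows, for illustration only: (games, estimated WE, observed WE)
toy_rows = [
    (7000, 0.52, 0.50),
    (50, 0.30, 0.40),
]
toy_total = total_discrepancy(toy_rows)
print(toy_total)
```

With these made-up rows, the large scenario contributes about 140 games of discrepancy and the small one about 5, even though the small one has the bigger WE gap.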

Here is the model that only works for the 7th, 8th and 9th innings.

LF= 1.0298 +
HOME* 0.5714 +
OUTS* -0.2929 +
SIT2=1* -1.0464 +
SIT2=2* -0.7885 +
SIT2=3* -0.6092 +
SIT2=4* -0.4281 +
SIT2=5* -0.5105 +
SIT2=6* -0.2745 +
SIT2=7* -0.1106 +
INN=7* -0.0138 +
INN=8* 0.0365 +
DiffRuns* 1.4561 +
Diffruns*INN=7* -0.5995 +
Diffruns*INN=8 *-0.3652 ;

If you want the table line by line, let me know. I'll probably have to email it to you.

I would print them out line by line, but your table is 334 lines long, not counting repeated headers.
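
For anyone who would rather generate the numbers than read a table, here is a minimal Python sketch of the 7th-9th inning equation above. The function and dictionary names are invented for illustration. Treating the 9th inning and sit2=8 as reference levels with coefficient 0 is an assumption, based on how the dummy variables were described for the earlier models.

```python
import math

# Coefficients as posted for the 7th-9th inning model.
INTERCEPT = 1.0298
HOME_COEF = 0.5714
OUTS_COEF = -0.2929
DIFFRUNS_COEF = 1.4561
SIT2_COEF = {1: -1.0464, 2: -0.7885, 3: -0.6092, 4: -0.4281,
             5: -0.5105, 6: -0.2745, 7: -0.1106, 8: 0.0}
# Assumption: the 9th inning is the reference level for both inning terms.
INN_COEF = {7: -0.0138, 8: 0.0365, 9: 0.0}
DIFFRUNS_BY_INN = {7: -0.5995, 8: -0.3652, 9: 0.0}


def late_inning_we(inning, home, diffruns, outs, sit2):
    """WE from the 7th-9th inning model; raises for other innings."""
    if inning not in INN_COEF:
        raise ValueError("model only covers innings 7, 8 and 9")
    lf = (INTERCEPT
          + HOME_COEF * home
          + OUTS_COEF * outs
          + SIT2_COEF[sit2]
          + INN_COEF[inning]
          + (DIFFRUNS_COEF + DIFFRUNS_BY_INN[inning]) * diffruns)
    # Algebraically the same as exp(LF) / (1 + exp(LF)).
    return 1.0 / (1.0 + math.exp(-lf))


# Home team, 9th inning, down 1, no outs, sit2=7 (men on 2nd and 3rd).
example_we = late_inning_we(inning=9, home=1, diffruns=-1, outs=0, sit2=7)
print(round(example_we, 3))
```

The interaction terms mean that a run is worth less in the 7th and 8th than in the 9th: the effective Diffruns coefficient is 1.4561 - 0.5995 in the 7th and 1.4561 - 0.3652 in the 8th.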

Posted 9:43 a.m., August 31, 2003 (#11) - Tangotiger
  Cool, thanks. No need to email, I can generate this on my own.

It's worth pointing out that mine is math generated assuming that both teams are equals at all times, with no HFA.

I would expect discrepancies, especially in the later innings where the pitching talent would change drastically.

Good stuff again!!

With your permission, I will reproduce my chart, along with yours (and the empirical provided by Phil), side-by-side-by-side, so people can see how things compare.

Posted 10:17 a.m., August 31, 2003 (#12) - Alan Jordan
  It would be better if you did it. That way you can check to see if I screwed anything up. You never need to ask my permission to post something like that. I've obviously put it out for public consumption.

I thought top and bottom of the inning reflected homefield advantage. Is that wrong? Because that's what I went off of.

As for any formal comparison of the fit of these two models, data from later years should be used. The fit from mine is biased and if your model was built off of data from these years, it's probably biased as well.

My winter project is to take the 2002-2003 play by play data and see if I can make a comparison of closers based on the number of men on, outs, inning, park, home plate umpire and strength of hitting.

It will be a logistic model like this except that it will also include terms for parks, umps, and opposing teams. The dependent variable will either be runs allowed or probability of a save. The closers can then be ranked by their coefficients in the model. This was a warmup of sorts for that so thanks to you and Phil for the free data.

P.S. If anyone else wants to tackle this project, feel free to steal the idea. The hard part is separating the closer from the defense, since closers don't rotate teams during the season. The idea of some kind of DIPS-adjusted model is daunting, and I may just stop without separating the pitcher from the defense.

Posted 10:14 a.m., September 3, 2003 (#13) - tangotiger (homepage)
  Here is the win probability chart that shows my math model, Phil's empirical data, and Alan's function.

Posted 10:20 a.m., September 3, 2003 (#14) - tangotiger
  The largest discrepancies between mine and Phil's real data are the following:
Inning  HomeAway  Score  Base     Out  Tom    Phil   Alan
7       Away      0      2nd_3rd  0    0.279  0.167  0.288
7       Away      1      3rd      0    0.517  0.619  0.587
7       Away      1      2nd_3rd  2    0.665  0.768  0.631
7       Away      1      Loaded   1    0.500  0.608  0.533
7       Home      0      Loaded   0    0.826  0.967  0.830
8       Away      1      1st_3rd  0    0.453  0.340  0.574
8       Away      1      2nd_3rd  0    0.410  0.300  0.534
9       Away      1      2nd_3rd  1    0.552  0.448  0.696
9       Home      -1     3rd      0    0.593  0.457  0.410
9       Home      -1     2nd_3rd  0    0.741  0.628  0.509
9       Home      -1     Loaded   0    0.766  0.614  0.536

You get oddball results with Phil's data because of the sample size issue. For this reason, I would not rely too much on that data.

Posted 10:09 p.m., September 3, 2003 (#15) - Alan Jordan
  O.K., it took a while to get everything straightened out, but I have been able to verify the numbers for both my estimates of WE and Phil's empirical estimates based on the raw numbers. Both check out fine by me. Given the set of scenarios here, which is different from the first, your estimates have fewer discrepancies than mine, 8,929 to 13,908.

I'm confused as to why you say that your model doesn't factor in home field advantage, yet you have home and away on your table.

Posted 10:44 p.m., September 3, 2003 (#16) - Tangotiger
  I mean that I don't build in an HFA in the sense of the home team winning about 54% of its games.

But, tied in the bottom of the 9th for the home team is far better than tied in the top of the 9th for the home team. After all, if the visiting team scores a run in the top of the 9th of a tied game, the home team can still win the game. But, the home team scoring in the bottom of the 9th guarantees the win.

Posted 10:49 p.m., September 3, 2003 (#17) - Alan Jordan
  Got it.