Forecasting Pitchers - Adjacent Seasons (January 30, 2004)
Posted 1:29 p.m.,
January 30, 2004
(#7) -
VoiceOfUnreason
Would using the various DIPS components break up the interdependences?
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 12:46 p.m.,
February 9, 2004
(#99) -
VoiceOfUnreason
"But the actual average impact of an at bat is to reduce the offensive team's chances of winning. At least if you accept the win probabilities that have been published here"
The first of these statements is, on its own, clearly incorrect. Probability is conserved, after all.
P(W) = P(A)P(W|A) + P(B)P(W|B) + ..., where A, B, etc. are mutually exclusive events that together cover every possibility.
Working in innings (because it's a bit easier conceptually), a team's chance of winning immediately prior to their next turn on offense is the probability that they score no runs times the probability that they win if they score no runs in the next inning, plus the probability that they score one run times the probability that they win if they score exactly one run in the next inning, plus....
This is mathematically identical with the weighted average of the probabilities that they will win after their next inning.
Now, the following things are true, given the nature of baseball scoring in today's game: the most common result of a team's offensive inning (the mode) decreases their chances of winning, and the middle result of the possibilities (the median) also decreases their chances of winning. But the mean (average) outcome is neutral.
You can do the math by innings, half innings, plate appearances, pitches, it doesn't matter - probability is conserved.
Also note that this does not in any way depend upon the fact that the game is zero sum. What that constraint does is ensure that the sum of the probabilities of each team winning at any moment is unity. But in a different game, where perhaps under certain circumstances both teams could win, that constraint could be violated. The conservation of probability would still hold.
Now, if the win probabilities published here are not consistent with this, then perhaps they are in error, or some subsequent calculations introduced an error. That would be useful to know.
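A minimal sketch of the formula in this post, in Python. The three outcomes and their numbers are hypothetical, picked only so the arithmetic is easy to follow; they are not taken from any published win probability table. The point is that the mode and the median outcome can both lower the win probability while the probability-weighted mean change is still zero.

```python
# Hypothetical numbers, for illustration only.
# Each entry: (outcome of the inning, P(outcome), P(win | outcome)).
outcomes = [
    ("0 runs", 0.70, 0.44),
    ("1 run", 0.20, 0.60),
    ("2+ runs", 0.10, 0.72),
]


def win_probability_before(outcomes):
    """Law of total probability: P(W) = sum over A of P(A) * P(W|A)."""
    return sum(p * w for _, p, w in outcomes)


def mode_outcome(outcomes):
    """The single most frequent outcome."""
    return max(outcomes, key=lambda o: o[1])


def median_outcome(outcomes):
    """First outcome (in run order) where cumulative probability reaches 0.5."""
    cumulative = 0.0
    for outcome in outcomes:
        cumulative += outcome[1]
        if cumulative >= 0.5:
            return outcome
    return outcomes[-1]


before = win_probability_before(outcomes)
mode_w = mode_outcome(outcomes)[2]
median_w = median_outcome(outcomes)[2]
# Probability-weighted average change in win probability over the inning.
mean_change = sum(p * (w - before) for _, p, w in outcomes)

print(f"P(win) before the inning: {before:.3f}")
print(f"P(win) after the mode outcome:   {mode_w:.3f}")
print(f"P(win) after the median outcome: {median_w:.3f}")
print(f"mean change in P(win): {mean_change:+.6f}")
```

The mean change is zero by construction, because `before` is defined as the weighted average of the after-inning probabilities. That is the whole content of the conservation argument.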
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 2:40 p.m.,
February 9, 2004
(#105) -
VoiceOfUnreason
Gah, I need to get caught up. I think I finally understand what Ross is talking about (I don't agree yet, but it no longer seems stubbornly insane), but then Tango lost his mind. Sigh.
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 5:09 p.m.,
February 9, 2004
(#113) -
VoiceOfUnreason
"What you have here is actual data that shows that the actual probability of a team winning increases more while they are fielding than when they are batting"
You've said this a number of times. Telling me doesn't seem to help. Can you show me? I'd consider a satisfactory demonstration any which illustrates any average (mean) change in a team's probability of winning when they are batting. You don't even have to do the fielding part.
Now, what I would expect to see is a probability of winning before the team bats, a list of states after the team bats, a winning probability associated with each, and a transition probability associated with each. But I'll be quite startled if you do it that way and demonstrate a change, so don't feel locked into that approach.
I expect that, seeing how you reach your conclusion, I will be able to work out what assumption you and I don't share.
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 7:22 p.m.,
February 9, 2004
(#116) -
VoiceOfUnreason
"But when the pitchers are on the mound the chances of their winning increases and the chances of the team hitting decreases - and of course the converse is true as well"
This is the bit that I don't see - can you be more specific about the evidence that leads to this conclusion?
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 8:50 p.m.,
February 9, 2004
(#119) -
VoiceOfUnreason
OK, so I add up the pitcher numbers ( 4427533 AB, 22760.56 Wins ), and the batter numbers ( 4404677 AB, 12581.43 Wins ). I assume we can gloss over the missing 23K AB. Now what?
Are you proposing that, since the pitchers are getting more credit for wins than the batters, it necessarily follows that the win expectation is going down (on average) in the offensive part of the inning?
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 12:17 a.m.,
February 10, 2004
(#125) -
VoiceOfUnreason
What I was looking at was garbage. Bugs in the parser. Now that I've fixed it, I'm also getting 4443803, and plus/minus 1067. That's over some 65K games, so a 1.5% error if it really should be "neutral". I think AED is right, but I'm not certain - I have to think about where Ed could have introduced a bias.
http://www.livewild.org/bb/wintab.html.
Here's what I was expecting to see (Tango, take note). The initial probability in favor of the home team is .546 before the top of the first. What is the probability after the top of the first?
Visitors   w%     freq   expectation
0 Runs     .593   .721   .428
1 Run      .494   .156   .077
2 Runs     .398   .072   .029
3 Runs     .309   .030   .009
4 Runs     .233   .013   .003
5 Runs     .171   .005   .001
Sum                      .547   checks
The frequencies here are specifically those associated with runs scored by the visiting team in the first inning. I'm not being terribly rigorous here (you'll have noticed the frequencies don't add up to unity, for instance), but rather am demonstrating that the probability conservation holds at the inning level, and trusting that it is then obvious that it will hold at the plate appearance or pitch level as well [certainly if you use true probabilities, and likely if your estimated probabilities are reasonable].
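The table's arithmetic can be re-run in a few lines of Python. This is only a sketch using the six rows as printed in this post. Reading the columns as "home-team win probability after the inning" and "frequency of that run total" is an interpretation of the post, not something the table labels explicitly.

```python
# Rows as printed above: (visitor runs in the top of the first,
# home-team win probability afterwards, frequency of that outcome).
rows = [
    (0, 0.593, 0.721),
    (1, 0.494, 0.156),
    (2, 0.398, 0.072),
    (3, 0.309, 0.030),
    (4, 0.233, 0.013),
    (5, 0.171, 0.005),
]

home_before = 0.546  # home win probability before the top of the first

# Weighted average of the after-inning probabilities (unrounded products).
expected_after = sum(w * f for _, w, f in rows)

# The same sum using products rounded to three places, as in the table.
expected_after_rounded = sum(round(w * f, 3) for _, w, f in rows)

# Share of first innings covered by the six listed outcomes.
covered = sum(f for _, _, f in rows)

print(f"before the inning:            {home_before:.3f}")
print(f"expected after (unrounded):   {expected_after:.4f}")
print(f"expected after (rounded sum): {expected_after_rounded:.3f}")
print(f"frequency covered by rows:    {covered:.3f}")
```

Summing the rounded products reproduces the .547 in the table, while the unrounded products come to roughly .5464, so part of the one-digit gap against .546 is rounding in the expectation column.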
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 8:11 a.m.,
February 10, 2004
(#133) -
VoiceOfUnreason
Ross, the difference you observe (34 vs 32) is rounding error combined with the fact that the frequencies I listed do not include all of the outcomes which favor the offense.
I wonder if that's the error Oswalt made - if he were cropping his analysis at 5 run differentials, then the hitting team gets short changed any time they run the lead past 5. It might not have seemed an unreasonable thing to do (sample size goes to hell at 5, especially if you are treating each season independently).
I don't see any evidence to support that beyond the fact that it appears he introduced an error somewhere, and it seems an easy one to make which still gives mostly reasonable results.
What does he mean "1/24 of the entries"?
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 11:38 a.m.,
February 10, 2004
(#136) -
VoiceOfUnreason
"the argument that some factors favoring the offense are not included is clearly wrong."
No, it isn't.
I know that it isn't because I generated the transition frequencies in #125, so I know what data was thrown away.
You know that it isn't because you can add the frequencies in that post and see that they sum to only .997. Since an offense cannot score fewer than zero runs in the top of the first, the missing datapoints must be those where the visitors scored more than 5 runs, which is the part of the probability table that favors the offense.
Now, if you made that correction, you would still find a small discrepancy in the probability conservation. There is an error that has been introduced because I'm guessing at the correct transition matrix (the frequencies) - I'm substituting the frequencies calculated from the observed transitions in an old study of mine, which are only going to be very close to the frequencies calculated by Oswalt from the observed transitions in the data from his study. Unless I overlooked it, Oswalt didn't publish his transition matrix, so this is as close as I can get without doing real work.
If we had Oswalt's transition matrix, there would still be a tiny error, from rounding.
But if you would rather believe that it is an accident that Oswalt's data, my approximation of the frequency transitions, and the formula applied to them ( from #99 above ) give an answer off by one in the last place, suit yourself.
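To put a rough size on the dropped outcomes, here is a small Python sketch using the rows printed in #125. One assumption is mine and not from the thread: after a first inning of 6 or more visitor runs, the home team's win probability is taken to lie somewhere between 0 and the .171 listed for 5 runs.

```python
# Rows as printed in #125: (visitor runs, home win probability, frequency).
rows = [
    (0, 0.593, 0.721),
    (1, 0.494, 0.156),
    (2, 0.398, 0.072),
    (3, 0.309, 0.030),
    (4, 0.233, 0.013),
    (5, 0.171, 0.005),
]

home_before = 0.546
visitors_before = 1 - home_before

covered = sum(f for _, _, f in rows)
missing = 1 - covered  # frequency of the 6+ run innings that were dropped

# Visitors' win probability accounted for by the listed rows alone.
visitors_listed = sum(f * (1 - w) for _, w, f in rows)

# ASSUMPTION (not from the thread): home win probability after 6+ visitor
# runs is between 0 and .171, the value listed for 5 runs.
w_home_low, w_home_high = 0.0, 0.171
visitors_low = visitors_listed + missing * (1 - w_home_high)
visitors_high = visitors_listed + missing * (1 - w_home_low)

print(f"missing frequency:                 {missing:.3f}")
print(f"visitors before the inning:        {visitors_before:.4f}")
print(f"visitors from listed rows only:    {visitors_listed:.4f}")
print(f"visitors with 6+ run innings back: {visitors_low:.4f} to {visitors_high:.4f}")
```

Under that assumption, restoring the dropped innings closes most of the gap on the visitors' side. What remains is on the order of the rounding in the published figures.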
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 12:15 p.m.,
February 10, 2004
(#138) -
VoiceOfUnreason
"You will have to explain how it got thrown away. What happened to the wild pitches? Was the probability of the team winning after the plate appearance calculated assuming the wild pitch didn't happen? I don't think so."
Plate appearances? I thought you were contesting the results in #125 which are based on innings, not PA.
As for how it got thrown away, Oswalt only provided win probabilities for the 0-5 runs states, so I didn't have anything to apply the other transitions to. I could have stuck them in, with big question marks to show that Oswalt didn't provide matching win probabilities, and additional footnotes remarking that the sample sizes had gone completely to pot and therefore the transition frequencies must be taken with a wife of salt.
Clutch Hitting: Fact or Fiction? (February 2, 2004)
Posted 1:53 p.m.,
February 10, 2004
(#144) -
VoiceOfUnreason
"Oswalt didn't base his results on that probability table but tables that measured the likelihood of every state before and after each play."
So you believe that I demonstrated close agreement with Oswalt's first inning result even though I used a probability table completely unrelated to his situation?
Wow. I guess I should be flattered.
On the off chance that Tango is still reading: "The change in win probability by the offense, on a league level, will always be zero. This is not true at a team, game, inning, or PA level."
I still insist that it has to be, by #99. Is there some counter example you are aware of, or perhaps we are discussing different parts of the elephant?
Poisson Distribution - Win % between two teams (Excel Spreadsheet) (February 20, 2004)
Posted 12:24 p.m.,
February 22, 2004
(#6) -
VoiceOfUnreason
So what's the right answer for sudden death?
My guess is that you reduce lambda by a factor of three, run out the goals scored, and for those cases where both teams score, the probability of winning is each team's share of the goals (i.e., in the case where Poisson suggests 3-2 in the overtime period, the team scoring three scores the first goal 60% of the time, and wins).
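The guess can be tried out numerically. The Python sketch below makes several assumptions that are not in the post: the two scoring rates (3.0 and 2.0 goals per regulation game) are hypothetical, "a factor of three" is read as an overtime one third the length of regulation, and an overtime with no goals is simply reported separately, since the post does not say what happens then.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson rate lam."""
    return exp(-lam) * lam**k / factorial(k)


def sudden_death_guess(lam_a, lam_b, overtime_fraction=1 / 3, max_goals=40):
    """Apply the post's heuristic over one overtime period.

    Returns (P(team A gets the first goal), P(no goal is scored)).
    Every overtime scoreline i-j with at least one goal is weighted by
    i / (i + j), team A's share of the goals.
    """
    a = lam_a * overtime_fraction
    b = lam_b * overtime_fraction
    p_a_first = 0.0
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            if i + j == 0:
                continue  # scoreless overtime: no winner under this rule
            p_a_first += poisson_pmf(i, a) * poisson_pmf(j, b) * i / (i + j)
    p_no_goal = exp(-(a + b))
    return p_a_first, p_no_goal


# Hypothetical regulation scoring rates for the two teams.
lam_a, lam_b = 3.0, 2.0
p_a_first, p_no_goal = sudden_death_guess(lam_a, lam_b)
p_a_given_goal = p_a_first / (1 - p_no_goal)

print(f"P(A scores first in overtime): {p_a_first:.4f}")
print(f"P(scoreless overtime):         {p_no_goal:.4f}")
print(f"P(A wins | someone scores):    {p_a_given_goal:.4f}")
print(f"lam_a / (lam_a + lam_b):       {lam_a / (lam_a + lam_b):.4f}")
```

If scoring really is two independent Poisson processes, the goal-share weighting collapses to `lam_a / (lam_a + lam_b)` once a goal is scored, because the split of any fixed number of goals between the teams is binomial. On that reading, the factor of three only changes how often the overtime ends scoreless.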
Mo and the HOF (March 25, 2004)
Posted 12:25 p.m.,
March 25, 2004
(#4) -
VoiceOfUnreason
Let's see, if you figure an average hall of fame career at 18 seasons, with three players being inducted every two seasons and one third of those being pitchers, there ought to be about nine HOF pitchers playing right now.
Clemens, Maddux, Martinez, Johnson?, one of the A's, one of the Cubs, Rivera, one of the maybes (Mussina, Schilling, Brown, Smoltz...)
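The estimate is a steady-state count: career length times the induction rate times the pitcher share. A tiny Python sketch of that arithmetic, using only the figures given in this post (the variable names are mine):

```python
from fractions import Fraction

career_seasons = 18                    # average Hall of Fame career length
inductions_per_season = Fraction(3, 2) # three inductees every two seasons
pitcher_share = Fraction(1, 3)         # one third of inductees are pitchers

# In a steady state, the number of future inductees active at any moment
# is the induction rate times the length of time each one is active.
active_hof_players = career_seasons * inductions_per_season
active_hof_pitchers = active_hof_players * pitcher_share

print(f"future HOF players active now:  {active_hof_players}")
print(f"future HOF pitchers active now: {active_hof_pitchers}")
```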
Mo and the HOF (March 25, 2004)
Posted 7:15 p.m.,
March 25, 2004
(#10) -
VoiceOfUnreason
"But even at 40 inductees, that is ~2.9 per year, almost twice your estimate."
For reasons of personal taste, I tend to ignore the various committee inductions. By my tally, there were 23 BBWAA selections in the 15 years from 1990 to 2004, so my algorithm was off by half an induction. VHOF has 22 or 23 slots during that time period as well.
I could have made clearer the standard I was using.