Pages 21 to 26, excerpted from Player Win Averages, by Mills and Mills. © 1970, A.S. Barnes and Co, Inc. ISBN: 0498076466

Interested readers can obtain the book through interlibrary loan. I obtained my copy from Virginia Tech.


Help from the Computer

The most difficult problem (and the key to the system) was to figure out how to accurately determine the chance of a team winning from any of the nearly 8000 "whens" in a game.

First off, we had to force ourselves to ignore all the normal statistics available to us today. That's because they only tell us "what". And furthermore, we don't really care how a runner reaches first, for instance. The fact is, he is there and the game has progressed to that point. What happens next from that point is what we are interested in, and from that next point, and the following point - to the end of the game.

Where could we get this kind of information? Of all the statistics on baseball today, nobody we could find kept track of a game in this manner. So we had to do it ourselves. The end result was a scorecard that not only simultaneously told us "what" and "when" a player did something, but could be preserved in such a way that the information could be transposed to computer cards - and then to a computer.

This scorecard fitted our purposes exactly. Now we could gather a history of the progress of every game in both leagues for the entire season (and all seasons to come) as it actually happened. Now we could tell, for instance, just what percent of the time any situation would follow any other situation. As an example, we know (and we don't know anybody else who does) what percent of the time a double play will occur with a runner on first and less than two outs. We also know not only what percent of the time a home run will be hit with men on second and third and one out (Bobby Thomson's situation), but also what percent of the time a home run will be hit from every combination of men on base and outs.
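The tabulation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' actual punched-card program: the function name, the way a situation is written down, and the sample plays are all hypothetical.

```python
from collections import Counter, defaultdict

def transition_percentages(plays):
    """Tabulate what percent of the time each situation is followed by each next one.

    `plays` is a list of (situation_before, situation_after) pairs, one per play.
    """
    counts = defaultdict(Counter)
    for before, after in plays:
        counts[before][after] += 1
    table = {}
    for before, nexts in counts.items():
        total = sum(nexts.values())
        table[before] = {after: 100.0 * n / total for after, n in nexts.items()}
    return table

# Hypothetical scorecard records (invented for illustration, not real data).
sample_plays = [
    ("1st, 0 out", "empty, 2 out"),    # double play
    ("1st, 0 out", "1st, 1 out"),      # batter out, runner holds
    ("1st, 0 out", "1st, 1 out"),
    ("1st, 0 out", "1st+2nd, 0 out"),  # batter reaches, runner moves up
]

table = transition_percentages(sample_plays)
print(table["1st, 0 out"])
```

A full table would key each situation on inning, half-inning, score, outs and runners, which is what produces the nearly 8000 "whens".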

Why do we need this information? Because now we can direct a computer to play baseball games just like real games, according to these percentages. We can play the games over and over, thousands of times. We can keep track of who loses and who wins, and from that we can establish a chance of winning.

In order to establish a chance of winning from each of the nearly 8000 situations, we must play out games beginning from each of those situations. When we start thousands of games from the beginning (nobody on, nobody out, top of first, score tied) we find that each team will win 50 percent of the time.

Now, if we play out the game from one of the very next possible situations (nobody on, one out, top of first, score tied), we find that the home team will win approximately 50.2 percent of the time, or just slightly more than half. Another possible next situation, from the beginning one, might be runner on second, none out, top of first, score tied (lead-off man hit a double). Now, playing out the game in the computer thousands of times from this situation, we find the visitors will win approximately 55.9 percent of the time.
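The play-out procedure described above can be sketched as a small simulation. It rests on loud assumptions: the outcome probabilities are invented for illustration and are the same in every situation (the book's percentages come from actual play and depend on the situation), runners advance exactly as many bases as the batter, and both teams are identical. All function names are hypothetical. With identical teams the estimate from the start of the game should land near 50 percent; the figures for other situations will not reproduce the book's numbers.

```python
import random

# Hypothetical per-plate-appearance probabilities (illustrative only).
EVENTS = [("out", 0.68), ("walk", 0.08), ("single", 0.16),
          ("double", 0.04), ("triple", 0.01), ("homer", 0.03)]
EMPTY = (False, False, False)

def advance(bases, event):
    """Return (new_bases, runs_scored); bases = (first, second, third)."""
    first, second, third = bases
    if event == "walk":  # only forced runners move
        if first and second and third:
            return (True, True, True), 1
        if first and second:
            return (True, True, True), 0
        if first:
            return (True, True, third), 0
        return (True, second, third), 0
    n = {"single": 1, "double": 2, "triple": 3, "homer": 4}[event]
    starts = [i + 1 for i, on in enumerate(bases) if on] + [0]  # 0 = batter
    new, runs = [False, False, False], 0
    for start in starts:
        if start + n >= 4:
            runs += 1
        else:
            new[start + n - 1] = True
    return tuple(new), runs

def play_out(inning, top, outs, bases, lead, rng):
    """Play one game to the end. lead = home runs minus visitor runs.
    Returns True if the home team wins."""
    names = [name for name, _ in EVENTS]
    weights = [w for _, w in EVENTS]
    while True:
        if not top and inning >= 9 and lead > 0:
            return True  # home team ahead in the bottom of the ninth or later
        event = rng.choices(names, weights)[0]
        if event == "out":
            outs += 1
        else:
            bases, runs = advance(bases, event)
            lead += -runs if top else runs
        if outs == 3:
            outs, bases = 0, EMPTY
            if top:
                top = False
            elif inning >= 9 and lead != 0:
                return lead > 0
            else:
                inning, top = inning + 1, True  # next inning (extra if tied)

def home_win_chance(inning, top, outs, bases, lead, games=2000, seed=0):
    """Estimate the home team's chance of winning from one situation."""
    rng = random.Random(seed)
    wins = sum(play_out(inning, top, outs, bases, lead, rng) for _ in range(games))
    return wins / games

start = home_win_chance(1, True, 0, EMPTY, 0)
# Bottom of the ninth, one out, men on second and third, home team down by two.
late = home_win_chance(9, False, 1, (False, True, True), -2)
print(start, late)
```

Repeating `home_win_chance` for every combination of inning, half-inning, score, outs and runners would yield a complete chance-of-winning table.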

And so we go, starting from every possible situation and playing it out from there to the end of the game. We even played out the game thousands of times from the situation that Bobby Thomson faced. (And the home team doesn't win from that situation very often. In the computer, as a matter of fact, the home team won only 264 games out of 1000, for a 26.4 percent chance of winning.) Of course, what we have now is a chance of a team winning, based on normal league play. In other words, if all the players were statistical robots, we could depend on these odds quite precisely in predicting the outcome of a game from any situation. But Willie McCovey (in 1969 the greatest of them all) is far from being a robot. He is also far from being average, and, to tell you the truth, most of us know that without our new statistic.

None of the other players are robots either, and they will all vary from the average to some degree. And as we measure, play by play, just how much each human player changes his team's chance of winning, we will learn, over the long run, just how much below or above average he is.

Many players will perform close to the level of our average player. Some will be further above average, some will be further below. And it is our new statistic - Player Win Average - that makes it possible to tell at a glance who is playing average ball, who is playing above average, and who is playing below average. We are also able to rank players from best to worst, as we now do with batting averages.

We can compare this whole process we have just described to another field. A life insurance company knows the life expectancy of a 55-year-old, married carpenter who lives in Milwaukee; we know the win expectancy of a team trailing by two runs in the bottom of the sixth with one out and a runner on second base. The life insurance company knows how much premium to charge from its actuarial tables, which cover every age, sex, field of work and so on. We know how much to charge every player action - every "what" - from our chance of winning tables, which cover every situation - every "when" - possible in a game.

6. Baseball Players Set Their Own Standard

Now that we have established a chance of winning for a team from any situation, the next thing is to be able to convert that chance of winning into a meaningful value so that we can award Win and Loss Points. Here's what we've come up with.

The chance of winning is, naturally, expressed in percentages. That's awkward, so we have converted them to whole numbers. Then, for reasons of simplicity, instead of a start of a game being 50-50, we set the value at 0. We set the end of a game at +1000 for a home team win, and -1000 for a visitor win.
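One way to read this conversion is as a straight-line scale running from -1000 to +1000. The sketch below assumes that linear mapping; the text does not spell out the formula, though it agrees with the three fixed points just given. The function name is hypothetical.

```python
def situation_value(home_win_chance):
    """Convert the home team's chance of winning (0.0 to 1.0) to a whole-number value.

    Assumed linear scale: 0.5 -> 0, 1.0 -> +1000, 0.0 -> -1000.
    """
    if not 0.0 <= home_win_chance <= 1.0:
        raise ValueError("chance of winning must lie between 0 and 1")
    return round((home_win_chance - 0.5) * 2000)

print(situation_value(0.5))    # start of the game
print(situation_value(0.264))  # home team wins 264 games out of 1000
```

Under this assumption, a situation in which the home team wins 26.4 percent of the time carries a value of -472.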

Now, as the game progresses, the visitors are attempting to move the game to -1000, while the home team is striving for +1000. Each player, depending on his action, is then awarded points, based entirely on how much he has increased or decreased his team's chance of winning. We already know what the chances of winning are from every situation, so all we have to do is look at the value of the situation when he came to bat, look at the new value after he is through, and award the points.

If he increased his team's chance of winning (usually by getting on base) he will receive Win Points. If he decreased his team's chance of winning (usually by making an out) he will receive Loss Points. The opposing responsible player (usually the pitcher) receives just the opposite, so that on every play a player on one team receives Win Points, and a player on the other receives exactly the same number of Loss Points.

And so on down through the game: The more clutch the situation, the larger the value of points, both Win and Loss. Average situations will generally have a value of between 25 and 75 points. Big clutch plays get up as high as 1800 points (going from probable defeat to certain victory), and small clutch plays drop to 5 to 10 points (hitting a home run in the ninth while leading by six runs). Bobby Thomson's home run? Worth 1472 Win Points. Pitcher Ralph Branca? 1472 Loss Points. Who's Branca? He threw the pitch that Thomson hit.
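The award rule can be sketched as follows. The function name and return layout are hypothetical, and the starting value of -472 for Thomson's situation is derived here from the 26.4 percent figure on the assumption that the point scale is linear (50 percent = 0, certain home win = +1000).

```python
def award_points(value_before, value_after, home_batting):
    """Return the points earned on one play by the batter and the opposing pitcher.

    Values are on the -1000 (visitor win) to +1000 (home win) scale.
    A play that changes nothing is scored here as 0 Win Points (an assumption).
    """
    swing = value_after - value_before
    gain = swing if home_batting else -swing  # change seen from the batter's side
    if gain >= 0:
        return {"batter": ("Win", gain), "pitcher": ("Loss", gain)}
    return {"batter": ("Loss", -gain), "pitcher": ("Win", -gain)}

# Thomson's home run: from -472 (assumed, see above) to +1000, home team batting.
thomson = award_points(-472, 1000, home_batting=True)
print(thomson)
```

Because both entries always carry the same number, the Win Points and Loss Points handed out on any play cancel exactly.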

So, over any period of time - weeks, months, a season - we continually award Win and Loss Points to each individual player. We award the points to a member of each team simultaneously on each play, based on just how much each player increases or decreases his team's chance of winning.

This is comparable to awarding number of hits and times at bat to a player. At any period of time we can stop and figure his batting average. It's the same with our scoring system. At any period of time we can stop and figure a Player Win Average. Everybody knows how to figure a batting average (divide number of times at bat into number of hits), but once again, here's how we figure a Player Win Average.

Add up the total of a player's Win and Loss Points. Then divide that total into the Win Points only. The resultant percentage is a win average. Example: if a player has 13,000 Win Points and 12,000 Loss Points, we divide 13,000 plus 12,000 (25,000) into 13,000. That turns out to be a .520 win average. Since it belongs to an individual player we call it a Player Win Average.
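The same arithmetic, written as a short Python function (the function name is hypothetical, and the handling of a player with no points is an assumption):

```python
def player_win_average(win_points, loss_points):
    """Win Points divided by the total of Win and Loss Points."""
    total = win_points + loss_points
    if total == 0:
        raise ValueError("player has no Win or Loss Points yet")
    return win_points / total

pwa = player_win_average(13_000, 12_000)
print(f"{pwa:.3f}")
```

A player with equal Win and Loss Points comes out at exactly .500.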

Here's something to keep in mind, and it also explains why we think this measurement system is equitable for the players.

The players are not measured against any arbitrary standard. They are measured against their own teammates and opponents on how they performed this year. Over the year, using our new scorecard, we tabulate every play of every game. We know what actually happened - how many times each situation moved to each next situation. This gives us an average of what will happen on each next play, as actually performed by the players.

So when we score each player against that average, we are really scoring him against his fellow players and opponents. The player who conforms to the average will have exactly the same number of Win and Loss Points, for a .500 Player Win Average. Those who are better than average will be above .500, and those who are less than average will be below .500, no matter what their batting average or earned run average may be.

To illustrate, if it were a common, every-day occurrence for a player to hit a game-winning home run in the ninth, then those who did not would be below average. Since this is not the case, those who do not are not necessarily below average. Also, in a year when hitters are big, and ten runs a game are commonplace, a player had better be up there getting his share, or he'll be below average. On the other hand, in a year like 1968, an average hitter needn't have done so much, since low scoring games were the rule.

In other words, we do not measure players from one era against players from another. We measure them against their own teammates and opponents. But the statistic itself - Player Win Average - can be used to compare players of any era. That's because, in any era, whether the ball be dead or rabbit-like, a .500 ball player will be average, and a .570 player will be much better than average.