
Ask your question

This page is dedicated to the authors of The Book answering questions from readers. Stay tuned. --Tangotiger 06:31, 13 May 2008 (PDT)


BaseRuns - custom version

Is there a way to theoretically create the A and B terms for BaseRuns given just the everyday statistics in hand, or do you have to have some type of simulation to find them? For example, I want to eliminate the GDP terms, but that would create a fractionally out term in the A formula. How do I find this and others like it?

When it comes to BaseRuns and the lack of some data, I would suggest that you use the frequency that you can find on my site as a stand-in for missing data. This way, the equation never changes.

I should also note that BaseRuns is most appropriate for teams and pitchers, and not really appropriate for hitters. Linear Weights is better for hitters. That said, the issue with using BaseRuns for hitters is more of a theoretical objection than anything practical.

If you are interested in different versions of BaseRuns, just follow that link.

--Tangotiger 09:53, 14 May 2008 (PDT)

BaseRuns - game-by-game or seasonal aggregate?

Given access to both sets of data, is it more accurate to find the BaseRuns based on season data, or add up individual box scores and game data for that season?

Good question. Since it is not a linear equation, there will be a difference. Your question could even be based on figuring BaseRuns at the inning level. That said, I think you'd end up with more accurate data at the smaller unit, if for no other reason than that you know at the game or inning level how many baserunners were on base during the inning, and therefore can come up with a better estimate of the % of those that scored if you knew there was a HR hit in that inning. That presumes that BaseRuns knows what it's doing at the extreme levels. We know it does at the game level. We just need to see it at the inning level.
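
To see the non-linearity in action, here is a minimal sketch. It assumes one commonly cited simple version of BaseRuns (A = H + BB - HR, B = (1.4*TB - 0.6*H - 3*HR + 0.1*BB) * 1.02, C = AB - H, D = HR), which is not necessarily the version on the site, and the two box-score lines are made up for illustration.

```python
def base_runs(ab, h, tb, hr, bb):
    """One simple BaseRuns variant (an assumption, not the site's formula)."""
    a = h + bb - hr                                   # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h                                        # outs
    d = hr                                            # automatic runs
    if b + c == 0:
        return float(d)
    return a * b / (b + c) + d


# Two hypothetical games: one big offensive day, one quiet day.
games = [
    {"ab": 40, "h": 15, "tb": 28, "hr": 3, "bb": 6},
    {"ab": 30, "h": 4, "tb": 5, "hr": 0, "bb": 1},
]

# Method 1: estimate each game separately, then add the estimates.
per_game_total = sum(base_runs(**g) for g in games)

# Method 2: add the raw stats first, then estimate once.
combined = {k: sum(g[k] for g in games) for k in games[0]}
aggregate = base_runs(**combined)

print(round(per_game_total, 2), round(aggregate, 2))
```

With these made-up lines the game-by-game total comes out noticeably higher than the aggregate estimate, because the aggregate spreads the baserunners of the big game over the outs of the quiet one.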

--Tangotiger 10:07, 14 May 2008 (PDT)


For pitchers, is there any year to year correlation on slugging percentage on balls in play or xbh per ball in play (rather than babip)?

When it comes to DIPS, which is where all this BABIP fascination stemmed from, I can't recommend enough the best paper on the subject. On the very first page, I presented this data:

Treating this as a first step, and realizing that we have parks, and switching teams to take into account, here are the year-to-year correlations of all 1687 pitchers with at least 500 PA in consecutive years (which itself may imply a selective sampling issue) from 1972-1992:

Event r
K 0.78
BB 0.66
1B 0.47
HR 0.34
XBH 0.26
1bBIP 0.25
xbhBIP 0.21

Except for the BIP rates, all are on a per-PA basis. XBH = (2b+3b)/PA, 1bBIP = 1b/(PA-HR-BB-K), and xbhBIP = (2b+3b)/(PA-HR-BB-K).
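
The rate definitions above can be sketched in a few lines. The function name and the sample pitcher line are hypothetical; only the formulas come from the text.

```python
def rate_stats(pa, k, bb, hr, singles, doubles, triples):
    """Per-PA and per-BIP rates as defined above (BIP = PA - HR - BB - K)."""
    bip = pa - hr - bb - k
    xbh = doubles + triples
    return {
        "K": k / pa,
        "BB": bb / pa,
        "1B": singles / pa,
        "HR": hr / pa,
        "XBH": xbh / pa,
        "1bBIP": singles / bip,
        "xbhBIP": xbh / bip,
    }


# Made-up season line, for illustration only.
line = rate_stats(pa=800, k=150, bb=60, hr=20, singles=120, doubles=35, triples=5)
print({name: round(value, 3) for name, value in line.items()})
```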

You will also find other great correlation data from an article by mgl called DIPS Revisited

--Tangotiger 10:16, 14 May 2008 (PDT)

BABIP - Linedrive rate

People on fangraphs frequently talk about line drive rate for both pitchers and hitters, and say that babip is for the most part a function of line drive rate, but since there is almost no year-to-year correlation of babip, I would assume that there would be no year-to-year correlation of pitcher line drive rate. Do pitchers have the ability to prevent line drives?

I put this link as the answer to another question, so let me repeat it here:

You will also find other great correlation data from an article by mgl called DIPS Revisited

You will see that the correlation for the LD rate itself is very close to zero, but that the correlation for the hit rate on line drives is not zero. So, this would imply that it is close to impossible to figure out if a pitcher has the ability to prevent the quantity of line drives, but that you can figure out if he can limit hits on line drives (quality). Note that just because something is hard to figure out doesn't mean that it doesn't exist.

--Tangotiger 10:29, 14 May 2008 (PDT)

Character traits - winning

Do you know if anyone, in any sport, has done a study on the effects of "low character" guys on team performance?

No, I don't know.

However, I would like to see any GM turn down Albert Belle. Say that someone with Belle's talent is worth 20MM a year today. What will a GM say? That his bad attitude will bring down the rest of his team by 1 win, so he'll only pay him 16MM a year? And some other GM thinks he has a 2 win negative peripheral effect, so he'll only pay him 12MM a year?

When it comes time to put their money where their mouth is, I'd find it hard to believe that they would put more than 0.5 wins of peripheral effect on it. A bad attitude could cost a player maybe 2MM$ a year (presuming 4MM per win). Even then, I doubt it's that high.

Everyone talks about it as something big. But, ask them to quantify it, all of a sudden they are as inconclusive as anyone else.

Same applies for Clutch Hitting. Even if we accept that it exists, and even if we accept that we can identify those hitters as being clutch, how much is it really worth? That is, how much more are you willing to pay Marco Scutaro or Joe Crede than an otherwise equally talented clutch-less player? 1MM more? 2MM more?

Really, it's treated as a value-less bonus, just like your boss would reward your incredible day-in, day-out hard work with a $20 gift card.

--Tangotiger 19:10, 14 May 2008 (PDT)

Pythagorean Winning Percentage - pitchers

How do you calculate a pitcher's win percentage? Pythagorean equation using average bullpen and offense? What's the utility in doing it?

I just do it very plain, and use the pitcher's runs per game, the run environment, and throw it into Pythag. The benefit is that everyone knows that .500 baseline. Otherwise, I'd have to explain what my run environment baseline is.
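
A minimal sketch of that calculation, assuming the classic Pythagorean exponent of 2 (the answer does not say which exponent is used) and made-up run figures:

```python
def pitcher_win_pct(pitcher_runs_per_game, env_runs_per_game, exponent=2.0):
    """Pythagorean win% for a pitcher backed by an average offense.

    The run environment stands in for runs scored; the pitcher's own
    runs per game stands in for runs allowed. The exponent is an assumption.
    """
    scored = env_runs_per_game ** exponent
    allowed = pitcher_runs_per_game ** exponent
    return scored / (scored + allowed)


# Hypothetical pitcher allowing 3.5 runs per game in a 4.5 run environment.
print(round(pitcher_win_pct(3.5, 4.5), 3))
```

A pitcher who allows exactly the environment's runs per game lands at .500, which is the baseline everyone already knows.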

--Tangotiger 13:22, 14 May 2008 (PDT)

Hockey - event data

I was wondering if you knew of a website that publishes more specific statistics for NHL games, things like deflections, turnovers, times scored against (man-to-man and zone, for the non-goalies), etc. Also, do you know if there's anyone who publishes a "goal tracker," showing where exactly each goal a player scored crossed the net? It seems like something like that would be valuable in seeing if certain scorers really favor certain holes or if certain goalies really struggle against goals that are stick-side and high or something of the sort.

I'll just point you to my favorite hockey blogs, and you'll probably find something that would satisfy you. An almost complete set is maintained right here: If it's not there, you won't find what you need anywhere.

--Tangotiger 11:00, 14 May 2008 (PDT)

Roster construction

Have you guys done any work on optimal 25-man roster construction? I'd like to be able to scientifically answer questions like, "Is it valuable to carry a player who functions almost solely as a pinch-runner", or "Carrying six infielders, when neither of the two backups can function as a valuable pinch-hitter or pinch-runner, is a poor allotment of roster space," or "Carrying 12 pitchers is clearly better than 11, or even 10, despite the potential advantageous platoons you could construct with 15 position players." I realize you need to know the individual team/player stats to properly do this, but are there any generalizations you could make? Thank you for your time. I really enjoy your work.

I have not done any work on that or even thought about it too much. I'll take a stab though. I don't see any problem with a player who functions only as a pinch runner and late inning defender (I think that you at least want both). I would think that you would want your pinch runner to be a good basestealer as well. Not all fast runners are good basestealers of course, especially young ones.

There are many reasons for carrying players on your 25-man that are not just to have extra players to pinch hit or pitchers to pitch when the starter gets knocked out early. You might want to have certain young players get major league experience. You might even want a veteran or two to mentor your young players. I am not necessarily saying that these things have merit - only that teams do do that.

That being said, an NL team definitely needs more bench players than an AL team. No question about it. As we have said many times, teams do not pinch hit for the pitcher nearly enough. To do that, you need plenty of pinch hitters.

I also don't think that any team needs 13 pitchers, which you don't normally see anyway. I doubt many teams need 12 pitchers either. I would probably carry 11 pitchers in the NL and 12 in the AL. Then again, you need more pitchers in the NL as well, so maybe that is not such a good idea. How about a 24 man roster in the AL and a 25 man in the NL!

Teams that carry 3 catchers for no other reason than they are afraid of two catchers being hurt in a game are being needlessly cautious (I am being kind - I could say they are being stupid).

NHL teams carry two goalies for each game, never three. The idea that an MLB catcher is more at risk for injury than an NHL goalie doesn't seem reasonable. Furthermore, you can put someone at catcher, and he could be passable. You cannot put just anyone at goalie. Catchers are not quarterbacks.--Tangotiger 07:39, 15 May 2008 (PDT)

You really want to tailor your pinch hitters and bench players to your starting lineup. By that, I mean if you have a particularly bad defender in your starting lineup, it makes sense to have a good defender who can come in the late innings when you need to preserve a lead. If you have several weak hitting lefties in your lineup, it makes sense to have a strong right-handed hitting PH or two. Etc.

It also does not make sense to have fewer than 2 lefties in your pen. There are really only 3 reasons to make pitching changes in a game: One, to get the platoon advantage. For that, you want at least 2 RHRP and 2 LHRP. Having 1 LH and 6 RH relievers makes no sense at all. Anyway, the second reason for a pitching change is to pinch hit for the pitcher in the NL. The third reason is to bring in a better or worse reliever as the leverage of the game changes. Only occasionally will you need to remove a reliever because he is tired. Managers making pitching changes based on a pitcher getting "lit up" for an inning or two is generally a waste of time, although some people would disagree with me there.

Even with the bullpen, you need to tailor it to your team. By that, I mostly mean the number of relievers. If you play in a hitters park and/or your starters tend not to go deep in games, either because they are bad, they are not durable, or both, then you obviously need more relievers. In fact, on second thought, I might suggest that a team with good/durable relievers and/or that plays in a pitcher's park, carry 11 pitchers, and the rest of the teams carry 12.

And of course you need to give starters rest now and then as well as replace players who are hurt but are not on the DL. So you need to tailor your bench to replace the types of players that often need rest or are more likely to get hurt, etc. Obviously teams know all of that though.

If it were me, and I were carrying only 11 pitchers and 14 position players, I would make sure that I had 2 excellent hitters on the bench as pinch hitters, especially, again, in the NL, where you simply have a lot more pinch hitting opportunities. Too many teams have players on the bench who can't really hit. And you have to consider the pinch hitting penalty. I forgot how much it is,

Around 25-30 points of wOBA, which is also the relief advantage over the starter. That is, the "coming in cold" for a PH compared to a regular hitter is roughly the same as the "going all out" for a reliever compared to the paced starter.--Tangotiger 07:39, 15 May 2008 (PDT)

but if you are pinch hitting for a position player with someone only slightly better, once you include the pinch hitting penalty, you are just spinning your wheels.
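
As a rough sketch of that trade-off: the only number taken from the text is the 25-30 point wOBA penalty (its midpoint is used here), and the wOBA values for the hitters are hypothetical.

```python
PH_PENALTY = 0.0275  # midpoint of the 25-30 points of wOBA quoted above


def pinch_hit_gain(bench_woba, replaced_woba, penalty=PH_PENALTY):
    """Expected wOBA gain from pinch hitting, after the 'coming in cold' penalty."""
    return (bench_woba - penalty) - replaced_woba


# A bench bat 20 points better than the position player he replaces: a net loss.
print(round(pinch_hit_gain(0.340, 0.320), 4))

# The same bench bat hitting for a pitcher (hypothetical .150 wOBA): a big gain.
print(round(pinch_hit_gain(0.340, 0.150), 4))
```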

I am also a big fan of platooning. There are many teams who have a player at a certain position who is a marginal hitter (for that position). The best solution for that is to platoon with him another marginal hitter who can play that same position. In doing that, you also get a built-in tandem pinch hitting situation. Platooning is an underused strategy and for some reason has really fallen out of favor lately. I am not sure why.

I think it's purely a shift of preferring a deep bullpen. That, and the rise of the switch hitter. Just a few decades ago, there were not a lot of switch hitters. Now, they are everywhere.--Tangotiger 07:39, 15 May 2008 (PDT)

Maybe it is the escalating salaries and GM's don't think they can "waste" roster spaces with a lot of platooning. Also, when you platoon, you potentially run into ego problems as well as the fact that certain players or maybe all players do better when they play everyday. Of course if you platoon you always try and do it with the lefty who expects to play every day. If you platoon a star or former star player who is a righty, like a Frank Thomas, you are asking for trouble.

As I said, this is not my area of expertise by any means, but it is an excellent question and an area that you don't often see addressed in the sabermetric literature.--Mgl 22:41, 14 May 2008 (PDT)

UZR - hardness of hit balls

Anyway, I was wondering if you guys knew of anyone who differentiated between "hard" balls in play and "soft" BIP, primarily in regard to defensive ratings. There is, after all, a difference between a Pujols scorcher and an Eckstein dribbler, though I guess it'd be hard to sort out the data and determine if some balls are medium-hard or medium-soft.

This is one of the parameters in UZR.

--Tangotiger 12:16, 15 May 2008 (PDT)

Yes, the data I use for UZR includes how hard a ball is hit on a scale of 1-3. In the long run of course, it doesn't matter much, as for each fielder, those things should tend to "even out," but in the short run (there is no magic line between the short and long run of course), having that additional parameter makes the output (UZR) more accurate.

As for the other advanced defensive metrics, I am not sure if they include that sort of thing. The most similar metric to UZR, I think, is Dewan's plus/minus, although he uses a different database (BIS versus STATS), which changes the inputs and outputs, even though it really shouldn't. I forgot whether he classifies batted balls by their speed.

A couple of other things that I have been meaning to add are, one, a parameter which indicates whether a play by a fielder (if he makes the play) was routine, easy, or hard (I forgot off the top of my head how many classes there are), and two, to account for or just ignore unusual defensive alignments, like the infield (and outfield) in, and a shift against batters like Delgado, Bonds, and Ortiz. The first one is included in the STATS data. The second one is not.

One of the ironies of different metrics using different databases and therefore getting different results is that certain people complain about that and point to that as a reason for not taking these defensive metrics that seriously, or at least as seriously as offensive metrics, like linear weights, EqA, VORP, etc.

Well, the reason that the different databases differ with regard to the data that go into these defensive metrics is that the data is much more granular than that which goes into offensive metrics. And that is a GOOD thing. With offensive metrics where everyone uses the same thing, singles, doubles, outs, etc., there is (obviously) no distinction between a hard out and a soft hit, or a home run that barely clears the fence and one that is hit 450 feet. So we should be happy that the defensive metrics are using data that is so specific that different people may classify it slightly differently.--Mgl 17:09, 15 May 2008 (PDT)

Betting accuracy

I have a baseball simulator that I plug ZIPS projections into. I am using it to get a win probability for each game an NL West team is playing in. I am then comparing my results to the LV Hilton Sports Book and AccuScore. I am having difficulty in coming up with a good way to measure which system is doing better. At first I tried summing the win expectancy of each game together. For example, if my simulator gave the Padres a 60% chance of winning a game and they did, my simulator got 60.00 points added to its overall tally. If they had lost that game I would add 40.00 points. It was quickly pointed out to me that that method was flawed, because a blind system weighting the favorite at 100% would outperform any of the three systems. I thought about calculating standard deviations for each system, measuring on average how far off they were from the actual result. Without getting into too much detail, do you have any suggestions for me?

First, I can guarantee you that they (Hilton, etc.) will do a lot better than your sim will! Not that your sim isn't any good or that the inputs (the projections) are not good either. It is just that there is too much collective wisdom and experience (both from the linesmaker and the people that bet into the line) in those lines for any one "system" to do better.

But that does not mean that you cannot make money betting sports, of course. Remember the axiom, "He who puts out the line first loses." In other words, you only have to recognize an occasional mistake that he makes, assuming that you have enough knowledge to be able to do that, in order to profit.

Anyway, there are many ways to compare your odds with theirs. One way is to take the result of the game, turn that into a pythagorean win percentage, and use that as a proxy for the "true" odds of that game. Once you do that for the games you are comparing, you can do a regression and see who has the best "r" or you can look at the average squared difference for you and for them, or the average difference, or whatever you want.

You can also do what you suggested, but you have it backwards. If you have the fave as 60% and they win, you get 40 points. If they lose you lose 60 points. But, you can't just tally up the points. The positives would cancel out the negatives and as long as overall you had the right percentage for all faves (whatever the average fave is in baseball), you would net zero points at the end of the season, even though your percentages for each game were blind. What you would have to do is to square each result after each game and then add up all these squared results, and whoever had the least number of points is the winner. This is essentially the same thing as the first method above, but it treats each outcome as if the winning team had a 100% chance of winning. A 3-2 game would be the same as a 10-1 game. Obviously a team that wins 10-1 is more likely to be that much better than their opponent than if they won 3-2. That is why you would much prefer to use method one above, which, as I said, is to use the score of the game as a proxy for the true odds of the outcome of a game between those two teams.
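
Method one might be sketched like this. The Pythagorean exponent of 2, the function names, and the sample scores and forecasts are all assumptions for illustration.

```python
def score_to_proxy(runs_for, runs_against, exponent=2.0):
    """Turn a final score into a Pythagorean proxy for the 'true' odds."""
    rf = runs_for ** exponent
    ra = runs_against ** exponent
    return rf / (rf + ra)


def mean_squared_diff(forecasts, scores):
    """Average squared gap between each forecast and its score-based proxy.

    forecasts: win probabilities for team A; scores: (runs A, runs B) pairs.
    Lower is better.
    """
    gaps = [(p - score_to_proxy(rf, ra)) ** 2 for p, (rf, ra) in zip(forecasts, scores)]
    return sum(gaps) / len(gaps)


# A 3-2 win and a 10-1 win are no longer treated as the same result.
print(round(score_to_proxy(3, 2), 3), round(score_to_proxy(10, 1), 3))

# Hypothetical forecasts for team A in three games, with the final scores.
scores = [(3, 2), (10, 1), (1, 4)]
print(round(mean_squared_diff([0.60, 0.55, 0.45], scores), 4))
```

Running the same scores through each system's forecasts and comparing the averages gives the "average squared difference" comparison described above.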

As I said, there are probably other equally good ways you can do it.--Mgl 17:33, 15 May 2008 (PDT)

The usual way of doing this is to use the square of the error. So, if you predict a 60% chance that a team will win and they actually do win, the error in your prediction is 40% and your "score" for the game is 0.16 (higher scores are worse). On the other hand, if they lost, the error in your prediction is 60% and your score is 0.36.

A better (but more mathematical) way is to add up the natural log (ln) of the inverse of the probabilities of the correct team winning. Again, lower scores are better. With the above example, if the team you expected to win (60% chance) wins, you get ln(1/0.6) = 0.511 points. If the 40% team wins, you get ln(1/0.4) = 0.916 points.
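
Both scoring rules fit in a few lines. This is only a sketch: the function names are made up, and the handling of a 0% forecast for the eventual winner (scored as infinity) is my own choice.

```python
import math


def squared_error(p_win, won):
    """Square of the prediction error for one game (lower is better)."""
    outcome = 1.0 if won else 0.0
    return (outcome - p_win) ** 2


def log_score(p_win, won):
    """ln(1 / probability given to the team that actually won)."""
    p_correct = p_win if won else 1.0 - p_win
    if p_correct == 0.0:
        return math.inf  # a "100% sure" forecast that misses is never forgiven
    return math.log(1.0 / p_correct)


def total(forecasts, scorer):
    """Sum a scoring rule over (p_win, won) pairs for a season."""
    return sum(scorer(p, won) for p, won in forecasts)


# The 60% example from above, first when the team wins, then when it loses.
print(round(squared_error(0.6, True), 2), round(squared_error(0.6, False), 2))
print(round(log_score(0.6, True), 3), round(log_score(0.6, False), 3))
```

Note that the blind "favorite at 100%" system mentioned in the question does badly under either rule: every upset costs it a full point of squared error, or an infinite log score.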

I would also expect that well-made projections (assuming you're taking into account home field advantage and the starting pitchers) should outperform the lines. It's not that the oddsmakers are dumb, it's that their job is to get half of the money placed on either side of the line (so that the house is guaranteed a profit, regardless of outcome). --AED 08:08, 16 May 2008 (PDT)

In the old days it was the job of the oddsmaker to just get split action. Not so anymore. Well, they still want split action, but the only way to do that in this day and age is to put out a good line - not one that the public "likes" or expects. The reason is that there are too many really smart people who CAN project with pretty good accuracy each team's respective odds, notwithstanding what the public thinks. And they have enormous bankrolls. So if the oddsmaker puts out a "public line" he is going to put the sportsbooks in grave jeopardy. While the "public line" may split the public money, the "smart" guys are going to come along and dump a huge amount of money on the "right" side and all of a sudden there is no longer split action. That is more true in some sports than others, but it is true to a large extent for all sports. In addition to that, once the line is out, the "smart money" and to some extent the public money (the collective public wisdom) will move that initial line towards one that is more correct.

My opinion is that while you can find many situations where the line put out by the oddsmaker, or even the closing line, is a bad one, if you pit any good projection system against the oddsmaker's line, on every game, the projection system will get killed. But, again, you get to choose which of their lines to bet into. They don't. And, they have to put out the line first! That is where you can get an edge, assuming that your model is good of course. Imagine the true odds on every game were in between your line and the oddsmaker's line. And imagine that it was much closer to their line than to yours. IOW, their line is much better than yours (and you would get killed in your "point system contest"). You would slaughter them, assuming that your line and theirs differed on enough games and by enough of a spread that you could still bet a lot of games and overcome the juice on each of those games.

--Mgl 22:04, 17 May 2008 (PDT)

Forecasts - Merging them

Do any of you know a good way to put together a "composite projection" if I wanted to combine the projections of, let's say, PECOTA, Bill James, ZIPS, CHONE, and Marcel? I'm not sure that simply adding them all up by raw components (i.e. every counting stat) and averaging them would work.

Why not? Seriously, that is what I would do.--Mgl 21:12, 17 May 2008 (PDT)
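
For what it's worth, a straight average of the raw components is a few lines of code. The system labels and the numbers below are placeholders, not actual projections.

```python
def composite_projection(projections):
    """Average each counting stat across systems (stats every system provides)."""
    shared = set.intersection(*(set(p) for p in projections.values()))
    n = len(projections)
    return {stat: sum(p[stat] for p in projections.values()) / n for stat in sorted(shared)}


# Placeholder lines for one hitter from three hypothetical systems.
projections = {
    "system_a": {"PA": 600, "H": 160, "HR": 30, "BB": 70},
    "system_b": {"PA": 580, "H": 150, "HR": 26, "BB": 64},
    "system_c": {"PA": 620, "H": 164, "HR": 28, "BB": 76},
}
composite = composite_projection(projections)
print(composite)
```

Rates (HR per PA and so on) can then be computed from the averaged components rather than averaged directly.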

Great pitchers - Innings per Win

Mr. Bill James, Do you have numbers on average innings per win for all 300+ winners? I would think that Greg Maddux must have the lowest average innings pitched per win in that group, probably between 6 and 7 innings per win.

Double that number. Here's the list (through 2007):

Wins	IP per W	Pitcher
373	12.8	Mathewson, Christy
300	13.1	Grove, Lefty
326	13.8	Plank, Eddie
328	13.8	Clarkson, John
347	13.9	Maddux, Greg
354	13.9	Clemens, Roger
373	13.9	Alexander, Pete
361	14	Nichols, Kid
417	14.2	Johnson, Walter
303	14.4	Glavine, Tom
511	14.4	Young, Cy
363	14.4	Spahn, Warren
309	14.7	Radbourn, Charley
342	14.8	Keefe, Tim
300	15.2	Wynn, Early
311	15.4	Seaver, Tom
307	15.6	Welch, Mickey
329	15.9	Carlton, Steve
324	16.3	Sutton, Don
364	16.5	Galvin, Pud
324	16.6	Ryan, Nolan
318	17	Niekro, Phil
314	17	Perry, Gaylord

Maddux is the same as Clemens. Is your implication that he got lucky in terms of managing himself, and getting a lot of wins? Or that he was so good that he didn't need many innings to get wins? The perfect pitcher will get 8 or 9 innings per win of course.

Let me give you a different list, which is what you may be asking, and that is innings per decision:

Wins	IP/W	IP/Decision	Pitcher
300	15.2	8.4	Wynn, Early
417	14.2	8.5	Johnson, Walter
373	12.8	8.5	Mathewson, Christy
347	13.9	8.6	Maddux, Greg
363	14.4	8.6	Spahn, Warren
326	13.8	8.6	Plank, Eddie
303	14.4	8.7	Glavine, Tom
324	16.6	8.7	Ryan, Nolan
361	14	8.9	Nichols, Kid
511	14.4	8.9	Young, Cy
342	14.8	8.9	Keefe, Tim
364	16.5	8.9	Galvin, Pud
373	13.9	8.9	Alexander, Pete
300	13.1	8.9	Grove, Lefty
328	13.8	9	Clarkson, John
309	14.7	9	Radbourn, Charley
329	15.9	9.1	Carlton, Steve
324	16.3	9.1	Sutton, Don
318	17	9.1	Niekro, Phil
354	13.9	9.1	Clemens, Roger
314	17	9.2	Perry, Gaylord
311	15.4	9.3	Seaver, Tom
307	15.6	9.3	Welch, Mickey

Now we see Maddux bubbling near the top.
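
The two rates in these tables are simple ratios. Here is a sketch with a made-up career line (not any pitcher from the lists):

```python
def innings_per_win(ip, wins):
    """Career innings pitched divided by wins."""
    return ip / wins


def innings_per_decision(ip, wins, losses):
    """Career innings pitched divided by decisions (wins plus losses)."""
    return ip / (wins + losses)


# Hypothetical career: 3000 IP, 200 wins, 150 losses.
print(round(innings_per_win(3000, 200), 1))
print(round(innings_per_decision(3000, 200, 150), 1))
```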

Finally, let's look at his contemporaries, so any pitcher born since 1955 (Jack Morris's birth year), sorted by innings per decision, min 200 wins:

Wins	IP/W	IP/Decision	Pitcher
201	12.6	8	Pettitte, Andy
250	13.4	8.5	Mussina, Mike
200	16	8.6	Finley, Chuck
347	13.9	8.6	Maddux, Greg
211	14.7	8.7	Welch, Bob
303	14.4	8.7	Glavine, Tom
239	14.4	8.7	Wells, David
254	15.1	8.7	Morris, Jack
230	15.4	8.7	Moyer, Jamie
204	15.3	8.8	Hershiser, Orel
209	12.8	8.9	Martinez, Pedro
210	14.9	8.9	Rogers, Kenny
284	13.6	8.9	Johnson, Randy
216	15.1	9	Schilling, Curt
245	16.3	9.1	Martinez, Dennis
354	13.9	9.1	Clemens, Roger
211	15.4	9.2	Brown, Kevin
207	16.3	9.6	Smoltz, John

As you can see, it's Andy Pettitte who is getting an abnormal number of decisions per inning pitched, likely because he pitched for a team that gave him leads that they didn't relinquish (either because of their bats or because of his bullpen). Smoltz of course was a reliever for a time.

--Tangotiger 08:21, 16 May 2008 (PDT)

Parks - size comparison

Hey Mr. James…are NL parks on average smaller than AL parks?

James, Schmames...

You hear occasionally that "NL parks are smaller than AL parks." Then again, you hear ALL the time that, "Parks have gotten smaller," presumably as one reason for the most recent offensive explosion starting in the early 90's. Well, I can tell you that parks in general, "Have not gotten smaller." At least not from the 80's to the 90's. Before that, I don't know. Then again, that ("Parks have gotten smaller") is only one of thousands of stupid (and incorrect) things you hear from commentators, writers, baseball insiders, and I suppose, fans. Not all of them of course.

Anyway, the average approximate area (fair territory) of NL parks today, including Coors Field, is 109,600 square feet. Without Coors, it is 109,100. I mention both of these because while Coors increases the average size of the NL park, it obviously also increases the average runs scored (not by nearly as much as in the pre-humidor and pre-mega-humidor days of course).

In the AL, it is 109,400. So basically I think you could say that they are around the same.

Now, I assume you are asking the question in reference to run scoring, or at least home run rate. And that is usually the context when you hear the talking heads speak of ballpark size.

As I am sure you know, ballpark size is only one (and maybe not even the most important) of several things that affect home run rate and run scoring in general. In fact, ballpark size can be quite misleading, as anyone can tell from the fact that Coors Field is the largest park in baseball, and one of the highest run scoring parks. Ditto for Chase Field (ARI). Another example is the fact that Texas and Oakland have around the same size fields and one is an extreme pitcher's park and the other is an extreme hitter's park.

So, park size has little to do with anything, other than how much grass the groundskeepers have to mow.

Factors, other than park size, which affect home run rates and run scoring (and other offensive components) are altitude, prevailing weather (temp and wind), height and configuration of the fences, size of foul territory, and things that affect the batter being able to see the ball well or not.

Foul territory is one of the most important factors. The reason that Oakland is such a pitchers park, other than the fact that it is cold and at sea level, is that it has an enormous amount of foul territory. I think that an increase or decrease of 30% in foul territory corresponds to something like .1 rpg, but I am not sure. I would have to check.

The average foul territory area in the NL, including Coors Field (even the foul territory there is weird, in terms of the number of foul balls caught), is 24.2 (without Coors, it is 23.9). In the AL, it is 25.75.

So based on foul territory and fair territory size only, you can infer that the NL parks may be more conducive to run scoring. To be sure (or at least MORE sure), you'd have to check average altitude, temp and wind, and the characteristics of the fences, especially the heights of course, and even research the "visibility" factors. A decent rule of thumb is that 1 foot of extra fence height adds a foot to fence distance, but don't quote me on that. You can look all of these factors up yourself and report back to us. I've done too much of your work already.

--Mgl 21:49, 17 May 2008 (PDT)

Rule Changes - DH

Bill James, Does the home team decide whether or not the DH rule is used in the game? Teams such as the Twins or the Rockies could really benefit from this whereas teams such as Boston or Cleveland could really be at a disadvantage. Now I don’t expect this to ever actually come up in a game, however isn’t it the duty of an organization to find the best way possible to win every game? It seems odd that no one would have tried to exercise this.

I'm pretty sure it's not at the discretion of the team, but I love the idea. I wrote an article based on suggestions of rule changes from readers of my blog here:

We had a good idea for changing the DH rule, but I like this one as well. I love the idea that a team can force its opponent to consider turning a one-dimensional player (Ortiz, Hafner, Edgar, Frank Thomas) into a two-dimensional one, and causing some havoc for the team. We already have this rule in place somewhat in the World Series and interleague play (no discretion, but half the time a team is forced to not use the DH). But to leave it at the discretion of the team would be fantastic.

--Tangotiger 08:37, 16 May 2008 (PDT)

General Managers - measuring effectiveness

In your estimation, what is the best tool for measuring GM efficiency? MP/MW given the limitations on financial data or something else?

You'd really have to look at the specific circumstances. How do you evaluate fire sales? Or conversely, how do you evaluate someone who's been given a blank check? My honest answer is a non-answer.

--Tangotiger 13:24, 20 May 2008 (PDT)

I'd have to ditto Tango's sentiments. Almost impossible to do. The only thing more difficult than evaluating a GM is evaluating a manager. Manager-of-the-year awards are ridiculous. Complete nonsense. It is usually given to the manager of a team who does well, but was not expected (by the talking heads) to do well. Whether the manager had ANYTHING to do with his team exceeding "someone's" expectations is another story. Not to mention the fact that the expectations of the talking heads are not necessarily well-founded. For example, all you hear about in the media is the "surprising Rays." Surprising to whom? Not to the analysts! We had them projected at 86-88 wins. No surprise there. In fact, I can say right now that if the Rays make the post-season, Maddon will probably win the MOY award! Why would a manager win an award when his team does exactly as the player talent suggested?

Anyway, do we even know what a GM does? I don't. If I owned a team, I would want my GM to make sure my team made as much money as possible. I wouldn't give a hoot about wins. (Of course, there is some connection between wins and profit.) Some owners and therefore GM's want to win even if it means making less money. How are we to know which teams are more focused on the bottom line and which ones are more focused on winning? We can't.

Plus, GM's are presumably in charge of the entire organization from top to bottom. Helping to evaluate and coordinating the evaluation of talent (and negotiating and approving contracts) is but one small part of a GM's job I presume. There is player drafting and development, marketing, etc.

Especially being an "outsider," there is simply no way to evaluate GM's. I'll go on record as saying that when you hear good and bad things about GM's from fans, from the media, and from the analysts, they are "shots in the dark" at best. About the only thing you can clearly see is when teams mis-evaluate talent and overpay for certain players. You can probably see some good and poor drafts if you know a lot about the amateur players. Other than that, I have no idea who is a good or bad GM, and I sure as heck can't tell by how many wins a team gets or how often they make the post-season.

Maybe, just maybe, as you suggest, marginal wins and marginal dollars (I assume that is what you mean), can give us some small notion of how efficient teams are at evaluating, developing, and paying for talent. Even then, you would still need to look at many years, and make sure you attribute the appropriate results to the appropriate GM.

--Mgl 17:42, 20 May 2008 (PDT)

The Book - Who wrote what

  • MGL: Sac Bunt, Game Theory, Hot Streaks
  • Andy: Walks, Platoon, Clutch, Appendix, Last few pages of Toolshed
  • Tom: Matchups, Batting Order, Pitchers, Stealing, Most of Toolshed

--Tangotiger 08:28, 21 May 2008 (PDT)

Bunting - Frequency with 2 strikes

Stemming from a stupid argument, someone tried to tell me that it was 'common sense' that a pitcher with an 0-2 count after 2 failed bunts would expect bunts and groove a pitch down the middle (specifically, this person believes that Dusty Baker was not making a phenomenal error by having Dunn try to bunt, but rather it was a wily strategic play to set up the home run). Not to dignify them with a response, but it made me wonder about bunting statistics. How often do people bunt on a potential third strike? How often are they successful? How often is the pitch following a failed bunt attempt when there are two strikes a batting practice pitch/easy strike? Do you know of anywhere this data can be found? Does anyone track bunt attempts vs. swinging or called strikes?

I believe that some of the professional stats companies track the swing type in addition to the outcome (ball/strike), so the data are there if someone wants to pay for it. I don't have that data myself, though. --AED 08:18, 27 May 2008 (PDT)

From watching thousands of games and studying the PBP data, I can't tell you whether and how often pitchers "groove" pitches when they are expecting a bunt, but I CAN tell you that position players almost never bunt with 2 strikes, even when they start out bunting, and pitchers almost always continue bunting with 2 strikes. So no pitcher is going to groove a pitch to a position player with 2 strikes when that batter was attempting a bunt before 2 strikes. No pitcher is even going to groove a pitch to a hitter like Dunn before 2 strikes, even if that hitter has already offered to bunt on a pitch or two.

As far as "grooving" pitches in general when a batter is likely to bunt, yes, it is correct to throw more fastballs and more strikes when you expect the batter to bunt, but...

You can't groove it too much, because a hitter can always switch and hit away at any count. If a batter knows that you are going to groove him a pitch, why bunt? There is game theory involved there. Also, you want to make it difficult for the batter to bunt. Some pitchers think that the curveball is more difficult to bunt. Other pitchers (more of them) think that the high (and inside) fastball is the most difficult pitch to bunt. And even if the batter is bunting, the pitcher still has to be concerned with the count. If he gets behind, he will (presumably) throw fastballs near the middle, and if he gets ahead, he is still going to use the corners, even if the batter is likely bunting.--Mgl 13:14, 2 June 2008 (PDT)

Reliability of stats - How reliable are they?

I was really surprised later when I found a quote by Tango in another post stating that (approximate quote) “after 250 at bats, a sample is reliable”. This is the question people are trying to ask, though, isn’t it? When should we totally discount far-past performance and instead base conclusions only on some recent sample?

Never! You can never totally discount anything.

The weighting I use is:

  • weight for hitting = .9994^daysAgo
  • weight for pitching = .9990^daysAgo

So, even performance from 10,000 days ago has some weight, albeit realllllly tiny.
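A minimal sketch of the decay weights above, assuming Python. The function name `recency_weight` and the sample day counts are illustrative and not part of the original answer; only the two daily factors come from the text.

```python
# Daily decay factors quoted in the answer above.
HITTING_FACTOR = 0.9994
PITCHING_FACTOR = 0.9990


def recency_weight(days_ago, daily_factor):
    """Weight given to a performance that happened `days_ago` days ago."""
    return daily_factor ** days_ago


# Illustrative day counts (hypothetical): today, one year, ten years, 10,000 days.
for days in (0, 365, 3650, 10000):
    hit = recency_weight(days, HITTING_FACTOR)
    pit = recency_weight(days, PITCHING_FACTOR)
    print(f"{days:>6} days ago: hitting {hit:.4f}, pitching {pit:.4f}")
```

Under these factors a one-year-old hitting performance keeps roughly 80% of its weight and a one-year-old pitching performance roughly 70%, while the weight at 10,000 days is small but never zero.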

As for what I’m sure I meant about the 200-250 PA being reliable is that at that point, your regression toward the mean is 50%. That is, half of that performance is real, and half is not. The amount to regress is: x/(x+PA) where x=200 for hitters and 300 for pitchers and 400 for fielders (BIP, not PA for them).

If you are going the component route, then that “300” for pitchers will be much lower for K rates and BB rates, and far higher for BABIP rates.

--Tangotiger 11:24, 22 May 2008 (PDT)

There is no “magic point.” It is a sliding scale which changes on each PA. For a batter, a good rule of thumb is that each previous year gets 80% of the weight of the following year, and for pitchers, 60%. That is a very general rule of thumb. There are many caveats, exceptions, etc. For example, every component has a different weighting, because certain components have more “talent” associated with them (and thus a player is more likely to have achieved a new level of talent with respect to them). So, if we are discounting last year’s stats 20% as compared to this year’s, we also have to discount yesterday’s PA’s a tiny amount as compared to today’s. So on and so forth. So you tell me at what point current (I don’t even know what “current” means) stats become “reliable,” because I have no idea.

If Tango ever did say “after 250 PA, you can consider a batter’s stats reliable,” what he meant was that that was the point (for whatever stats he was talking about, probably OBA or wOBA) at which you would regress it 50% towards the mean if you had no other history to work with. In my opinion, the use of the word “reliable” is a poor choice as it has no inherent meaning in that context, and as I like to say, “One man’s reliable is another man’s unreliable.”

Plus, we have two issues. One, is given no history at all, how much (percentage-wise) do we regress a player’s stats toward some mean or average (if we know nothing about the player, it is the mean of the population of all MLB players)? The other is how do we weight each PA or group of PA, with respect to recency? Two completely separate, albeit related, issues.

--Mgl 11:30, 22 May 2008 (PDT)

Sabermetrics in the media - effect of pitchers on fielders

Not so much a question but... During the 05/22/08 Texas at Minnesota game, Rangers' tv booth started talking about sabr-like analysis in the top of the fifth inning.... Tom Grieve was wondering whether anyone had researched whether or not pitchers who work quickly do, in fact, get better defense from their teammates than those who worked at a slow pace. Followed that with saying he realized it might be hard to compare pitchers on different teams, but thought pitchers on same teams could be studied.... Josh Lewin then said he wasn't sure numbers could quantify that or the like. Then decided to tell everyone errors alone don't make someone a bad defensive player and vice versa. Specifically mentioned Brian Downing having an AL record without errors but never winning a Gold Glove.... Not the most earth shattering discussion ever, but I found it unusual for the duo to discuss analysis, so I thought I'd pass it along. Usually someone in the production puts a "nerd alert" graphic on screen if Lewin even mentions what Josh Hamilton does with runners in scoring position. Do with it what you will. Have a good day.

Interesting! I actually don't mind the Texas broadcasters. Someone did a study (on BP I think) where they looked at error rate versus "elapsed time of game" I think. Time of game was used as a proxy for the speed of the pitchers, which is not a bad presumption, I don't think, as long as you control for other factors that affect the time, or use a robust enough sample that everything else should even out.

They found no correlation between error rate and time of game, which did not surprise me in the least. If a more rigorous study were done (you'd probably have to have someone make a list of fast and slow working pitchers, because there is nothing in the data that tells you that other than time of game), I would be surprised if there were any significant correlation between the speed at which a pitcher works and the quality of the defense (however you want to measure it) during that game. There might be some in Little League or even high school, but in MLB, I doubt it. It is one of those silly, stupid things that COULD be right (no one has ever bothered to check), probably isn't, but that commentators say all the time as if we just KNOW that it is correct.

If you are sensitive to sabermetrics and TV commentators, you are best off turning the sound off on the TV. The stuff that commentators say is enough to make you (me) throw up.

From yesterday:

Morgan: Beltran is a valuable player to the Mets because he can get hot at any time.

Brennaman: With Dunn's (or someone else's) stats for driving in a runner on third with 1 out listed on the screen (50%), he says, "Folks, that is NOT a good ratio." Now, I actually don't know what the average is in MLB, but I would bet a lot of money that it is around 50%.

Brantley (maybe it was Brennaman again): "McCann may be the best pure hitter on the team (ATL)." Sure, if Chipper gets hit by a car, I guess. --Mgl 19:13, 2 June 2008 (PDT)

True talent - what is it?

What do you guys mean by true talent?

"True talent" is a player's expected performance level for the rest of the season, given average context.

It is based on his career performance level after accounting for the context, weighting the most recent games the most.

As a result, "true talent" is our best estimate, an estimate which changes day-to-day (not unlike say the stock market, if interest rates were a constant), but it is fairly stable for a large portion of players.

A team's "true talent" is based on the expected players and expected playing time for those players. That of course, also can change.

As a proxy, it's whatever tells you it is, right now.

--Tangotiger 11:09, 28 May 2008 (PDT)

Component v ERA - why the gap?

Why is Carlos Zambrano's ERA always so much better than his FIP?

I am no expert on FIP (in fact, it is Tango's invention, I think), but for one thing, FIP is NOT an unbiased estimate of a pitcher's true run prevention talent. There is some bias inherent in the stat, by definition. For example, since it doesn't include non-HR hits, and we KNOW that pitchers have SOME control over their BABIP, it will not accurately reflect a pitcher's true BABIP. Maybe Carlos' true BABIP is a little lower than league average. If it is, then we would expect his FIP to overstate his expected ERA. FIP also does not include a pitcher's WP (wild pitch) rate, which is obviously reflected in his ERA. So too for the SB/CS rate that a pitcher is responsible for.

And even if FIP were an exact proxy for ERA (which it isn't), we would expect plenty of pitchers to have ERAs different from their FIPs, by chance alone. So your question does not really have an answer other than, "We expect pitchers' ERAs to be different from their FIPs by chance AND because FIP is a biased, and not nearly exact, estimate of ERA."

All FIP (or DIPS ERA) does is eliminate the noise (and a little bit of the skill) in BABIP. That allows us to get a better estimate of a pitcher's run prevention skill, in the short run. In the long run, ERA, RA or ERC is MUCH better because it captures the differences in BABIP skill among pitchers, as well as the other things I mentioned above that contribute to a pitcher's run prevention skill but are not addressed at all in FIP (like WP rate).--Mgl 13:39, 2 June 2008 (PDT)

Betting accuracy - comment

A chi-square test should be used, except not as a test, just the statistic χ² = Σ (O − E)²/E, where the smallest sum is best and the expected value E is the percent times the number of games at that percent.

This isn't really a place where a chi-square test is applicable, because the "distribution" of outcomes is binomial (the team either wins or loses). The chi-square statistic is technically only correct if the outcome is a continuous distribution (and for that matter, only if the distribution is Gaussian).

Chi-square goes as -2 times the sum of the natural log of the probability densities at every outcome being used. If you have any other sort of distribution (in this case, a binomial), you want to use the appropriate statistic. So, in this case, the probability that the outcome O comes from the expectation E is equal to E if O=1, or 1-E if O=0.

So, to "score" your predictions against someone else's, you would just have to add the natural logs of the probabilities you had down for the correct result. A higher score is better. Or, to make it more chi-square like, you would use ln(1/probability), which effectively multiplies the values by -1 (and makes them positive), and a lower score is better.
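The scoring rule described above can be sketched as follows, assuming Python. The function name `log_score` and the three sample games are hypothetical; the rule itself (sum ln(1/probability) over the probabilities assigned to what actually happened, lower is better) is the one from the text.

```python
import math


def log_score(predictions):
    """Sum of ln(1/p) over the probability assigned to each actual result.

    `predictions` is a list of (win_probability, team_won) pairs, where
    win_probability is the forecast that the team wins and team_won is
    True or False. Lower totals are better. A forecast of exactly 0 for
    something that happened would divide by zero, so probabilities are
    assumed to lie strictly between 0 and 1.
    """
    total = 0.0
    for win_probability, team_won in predictions:
        p_correct = win_probability if team_won else 1.0 - win_probability
        total += math.log(1.0 / p_correct)
    return total


# Hypothetical forecasts from two people for the same three games.
mine = [(0.60, True), (0.70, False), (0.55, True)]
theirs = [(0.50, True), (0.50, False), (0.50, True)]
print("my score:   ", round(log_score(mine), 4))
print("their score:", round(log_score(theirs), 4))
```

Comparing the two totals over the same set of games is what "scoring your predictions against someone else's" amounts to here.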

While I'm at it, I still have to disagree with MGL's assertion that you won't beat the oddsmakers. I've run extensive tests against lines, and good predictions do beat the odds. The reason you won't find many chances to actually profit is that the house skims off the top. For example, if the oddsmakers have a game as a pick-em and you put $1 on the winner, you win something like $0.90. Of course, if you lose, you lose your dollar. So, you'd have to be 53% certain that your team will win, and the lines are almost always (especially in pro sports) good enough that the bookies won't be that far off. --AED 09:16, 9 June 2008 (PDT)
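A quick check of the 53% figure above, assuming Python and the $0.90 payout on a $1 bet quoted in the post; the function name `break_even_probability` is illustrative. The bet breaks even when p × payout equals (1 − p) × stake.

```python
def break_even_probability(payout_per_dollar):
    """Win probability at which a $1 bet has zero expected profit.

    Solves p * payout_per_dollar - (1 - p) * 1 = 0 for p.
    """
    return 1.0 / (1.0 + payout_per_dollar)


def expected_profit(win_probability, payout_per_dollar):
    """Expected profit on a $1 bet at the given win probability."""
    return win_probability * payout_per_dollar - (1.0 - win_probability)


p = break_even_probability(0.90)
print(f"break-even win probability: {p:.4f}")
print(f"expected profit at 50%: {expected_profit(0.50, 0.90):+.3f}")
```

With a $0.90 payout the break-even point works out to a little under 53%, which matches the figure in the post.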

Park factors - consistency

Why are hitting park factors so similar to pitching park factors each year, while both are (relatively) inconsistent from year to year? For example, at the Ballpark at Arlington, there are years with park factors at 112 and others where the park factor is 97, but the hitting and pitching are at most 1 apart. Doesn't this imply that a park's true park factor significantly changes each year?

You are confusing what a hitting and pitching park factor are. A lot of people do that. In fact, I don't like the terminology. A hitting park factor does not only use hitting stats and a pitching park factor only uses pitching stats. They both use exactly the same stats (hitting and pitching for both teams). The only difference is that the batting park factor adjusts a batter's stats because he does not hit against his own pitchers. IOW, a batting park factor is NOT just a park factor. It is a park factor COMBINED with a "strength of opponent factor" for batters. It should not be called a "batting park factor." It should be called a "park and opponent factor for batters."

The "pitching park factor" is the same thing for pitchers (it adjusts a pitcher's stats for the fact that he does not pitch against his own batters). So the difference between a batter and pitcher park factor reflects only the difference between a team's hitters and the average hitter in the league and the difference between a team's pitchers and the average pitcher in the league. The difference is always going to be small. Even if a team has great hitting or pitching, the difference is only going to be a couple of points at most. You might as well just use one park factor. Although technically using one or the other is correct, it doesn't make that much difference. Just as long as you know what you are using. To park adjust batters (their home stats of course), use the batting park factor (for their home park) and to park adjust pitchers, use the pitching park factor of their home park. That is only for batters and pitchers on that team though. For example, don't use a batting park factor to park adjust a Mets batter for his games in Turner Field! Use the "generic" park factor for him, which is NOT, BTW, the batter and pitcher park factor combined.

A "park factor", a "batting park factor" and a "pitching park factor" are three related, but separate entities. As I said, a batting or pitching park factor is merely a "park factor" plus an "opponent factor."

As far as whether a park's "true PF" changes each year, it depends on what you mean by "true PF." If that includes weather, then yes, it changes every year, for outdoor parks at least. Not a lot in most cases, but some (I guess in some cases, a lot, if the weather conditions for an entire year are extreme). If you mean "without weather", then no, true PF's do not ever change at all. Which is why you want to use as many years of data as possible to compute a park's PF (making sure that you are adjusting for any radical changes that occur to other parks, since a park's true PF is always relative to the league).

And when you adjust a player's stats in any one year, you still want to use a long-term park factor to make that adjustment, not how the park has "played" for that year (the one-year PF). That is a mistake that a lot of people make. For example, let's say that MIL this year plays like an extreme pitchers park, say with a one-year PF of 92. Now, of course that is going to change its long term PF since it has not been around that long, but say the long term regressed PF is still 103. And let's say that Braun has hit a lot worse at home this year (after adjusting for HFA), to go along with that one year PF of 92. And let's ignore weather related issues by assuming that the weather this year is the same as in an average year in MIL. What do you use to adjust Braun's stats? You use the 1.03! You assume that if he played in a neutral park, his home stats would be even worse than they are, even though they are already worse than his road stats. That is a little bit counterintuitive to some people, but that is the way it works. --Mgl 14:05, 2 June 2008 (PDT)

BABIP - speed of pitches

We know that lefties have a slightly better BABIP than righties, and that knuckleballers have a better-still BABIP. And we know that the harder a pitch is thrown, the harder you can hit it. Wouldn't it make sense that pitch speed is correlated with BABIP, and part of what makes Moyer "crafty", and what makes Wakefield's grounders more likely to be turned into outs, is that they're coming at fielders slower? It seems like this could be studied by the pitch/fx folks pretty easily. EXCEPT that I wouldn't quite know how to control for pitcher BABIP skill. I.E. perhaps pitchers who throw slower are more apt to have the skill of inducing balls that are easier to field. Anyways just curious as to your thoughts on the theory and whether or not it could be studied with today's available data.

Sure, you could study it. For one thing, you could look at the average speed of a batted ball against the various pitchers. The STATS data, at least, includes the speed of each batted ball, on a scale of 1 to 3. If the speed of a batted ball is correlated with the speed of a pitch, it may or may not show up in the STATS data.

You could also try to control for location when looking at BABIP for the various pitchers. There really are only 2 ways to have a BABIP skill, I would think. One is the speed of the batted balls and the other is the location. If a certain class of pitchers, say those with slower pitch speeds, has a lower BABIP than another class, say, ones with faster pitch speeds, and the locations for each class are around the same, you might be able to conclude that the difference in BABIP is due to batted ball speed. I guess.

As far as my thoughts or opinions on the matter, I don't have any. Sounds plausible. I did not know that lefty pitchers had better BABIP than RHP. I'll take your word for it. I assume that the average LHP has a lower average pitch speed than the average RHP. Maybe that is the reason. Or maybe it is the fielder or the hitters, or has something to do with the platoon advantages. --Mgl 19:13, 5 June 2008 (PDT)

Pitching Styles - Which pitches longer

Is it true that pitchers who "pitch to contact" throw fewer pitches and go deeper into games than strikeout pitchers?

It certainly makes sense, doesn't it? I've taken a quick look through the season stats database for pitcher seasons with 20+ starts and no relief appearances from 2000-2007.

There is indeed a correlation between number of pitches per plate appearance and the pitcher's combined strikeout+walk rate (also true of the strikeout rate alone). So, for every extra walk or strikeout, the pitcher averages 2 extra pitches. So far, so good.

However, when we look at how far a pitcher gets into the game, it looks like the strikeout pitchers actually fare better. A pitcher whose strikeout+walk rate is 0.1 higher than average will face an extra 0.07 batters per start, and last an extra 0.10 innings per start.

So, at least from this limited data set (one could do better with stats broken down by game), I'd have to say that the strikeout pitchers seem to get deeper (albeit throwing more pitches) than contact pitchers.--AED 09:40, 9 June 2008 (PDT)

Here's a cool chart by David Gassko

--Tangotiger 12:52, 9 June 2008 (PDT)

Yeah, it looks to me like the pitchers who "pitch to contact" throw fewer pitches per batter (naturally), but they face more batters per inning (or per out - same thing), because a strikeout, by definition, requires only one batter, and a ball in play requires 1.43 batters (.7 outs per BIP). So, as David Gassko shows, it all evens out. However, since the high K pitchers are much better pitchers (lower ERA) as a group, they will pitch deeper into games (and also have a greater pitch count per game) simply because they are better pitchers. Managers take pitchers out of a game for two reasons, basically. One is how well they are pitching, such that the better pitchers will average more IP per game. Two is their pitch counts, such that if everything else is equal (which it isn't if you are comparing high and low K pitchers), you will last longer if you throw fewer pitches per out or per inning. One thing we don't know, or at least I don't know, is whether contact pitchers (or any other class of pitchers) tend to be able to throw more pitches per outing or fewer pitches per outing, as a group. We know that individual pitchers have their own thresholds, for whatever reasons. For example, Livan Hernandez or Carlos Zambrano can and do throw 120+ pitches per game, whereas Maddux, even when pitching well, threw not much more than 100 pitches per outing in his heyday, and throws maybe 85 now. Whether certain types of pitchers as a group have more endurance (per outing) than others, I don't know. I doubt it, but you never know.--Mgl 18:23, 9 June 2008 (PDT)

Regression - Equation

I once heard you mention that if you came up with 75% as the regression (r = .25) for 800 TBF per year (average for a pitcher) you would use this formula – 2400/(TBF+2400) – to find out how much you would regress for any different number of TBF, whether it’s 1200 TBF or 300 TBF. If we plug in 800, we get .75. I know you get the “constant” (in this case, 2400) from the regression of pitchers who had 800 TBF in each year and the fact that we got .25 as the “r”. My question to you is how did you specifically calculate the 2400 from 800 and .75? I have a feeling it is simple math…

reliability = PA / (PA+x) = r

If your correlation coefficient (r) is .50, and the average number of PA in your sample is 2400, then we simply plug it in the above equation, which obviously makes x = 2400.

Then, for any number of PA you have, you know how reliable the sample is. If PA=800, then r=.25.

What if your correlation coefficient is .25, and the average number of PA in the sample is 800? How do you calculate x? Set 800/(800+x)=.25. A little algebra makes x=2400.

As a shortcut, you do x=(1-r)/r*PA

So, if your correlation coefficient is .25 and the number of PA in your sample is 800, then x=.75/.25*800=2400

--Tangotiger 17:58, 10 June 2008 (PDT)
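The shortcut above can be checked with a small sketch, assuming Python. The function names `regression_constant` and `reliability` are illustrative; the formulas are the ones given in the answer.

```python
def regression_constant(r, pa):
    """Solve PA / (PA + x) = r for x, i.e. x = (1 - r) / r * PA."""
    return (1.0 - r) / r * pa


def reliability(pa, x):
    """Reliability r of a sample of `pa` plate appearances given constant x."""
    return pa / (pa + x)


# The example from the answer: r = .25 at an average of 800 PA.
x = regression_constant(0.25, 800)
print("x =", x)

# With x in hand, the reliability of any sample size follows.
for pa in (300, 800, 1200, 2400):
    print(pa, "PA -> r =", round(reliability(pa, x), 3))
```

Once x is known, the amount to regress for a given sample is simply 1 minus the reliability, which is the same thing as x/(x+PA).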

Regression - Equation part 2

1) If in a player’s first year he posts an OPS of .850 after 500 PA and the league average is .740, if you add that 500 PA into the 200/(200+PA) formula, it tells you to regress .850 only 29% towards .740. If this were true, then 500 PA (or one season’s worth of stats) would be a large enough sample size, which we know is false. What is wrong with my thought process because I know after 500 PA we need to regress a heck of a lot more than 29%.

Nope, 29% is just about right. I don't know why you say "we know is false".

--Tangotiger 18:25, 10 June 2008 (PDT)

I think I figured out what I was doing wrong...the .29 is the r, so since 1-r is the amount you regress the stat, you would regress the .850 OPS 71% towards the .740 (meaning it is only 29% reliable). You can edit that in my question if you'd like...

In order to get an r of .29, then the number of PA in your original example would not be 500, but 82 PA. And if you have 82 PA, you would regress 71% toward the mean. Regardless, I think we're on the same page here.

--Tangotiger 12:49, 11 June 2008 (PDT)
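The numbers in this exchange can be worked through in a short sketch, assuming Python and the hitter constant of 200 used in the question. The function names are illustrative, and the regressed OPS is computed here only as an illustration of the formula, not as a figure quoted in the thread.

```python
X_HITTER = 200  # regression constant for hitters, as used in the question


def regression_fraction(pa, x=X_HITTER):
    """Fraction of the way to regress toward the mean: x / (x + PA)."""
    return x / (x + pa)


def pa_for_reliability(r, x=X_HITTER):
    """PA at which reliability PA / (PA + x) equals r."""
    return x * r / (1.0 - r)


observed_ops, league_ops, pa = 0.850, 0.740, 500

fraction = regression_fraction(pa)
regressed_ops = observed_ops + fraction * (league_ops - observed_ops)
print(f"regress {fraction:.0%} of the way: {regressed_ops:.3f}")

# The mirror case: a sample that is only 29% reliable.
print(f"PA needed for r = .29: {pa_for_reliability(0.29):.0f}")
```

At 500 PA the sample is about 71% reliable and is regressed about 29%; a sample that is only 29% reliable corresponds to about 82 PA, as noted in the reply.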

Regression - Equation part 3

2) When you list the three formulas (200, 300, and 400 “constants”) what are the average PA/TBF/BIP numbers per year you’re using (I’m assuming you’re regressing to zero/average)? It would be nice to know just because of the fact that you get the constants from the average number of PA/TBF/BIP per season along with the corresponding "r" from the year-to-year correlation.

It doesn't matter what I use. That's the great thing about it. If I use an average of 600 PA, I'll get a year-to-year r of .75. This implies an "x" value of .25/.75*600=200

If I use an average of 400 PA, I'll get an r of .67. This implies an "x" value of .33/.67*400=200

If I use an average of 200 PA, I'll get an r of .50. See where I'm going here?

--Tangotiger 18:29, 10 June 2008 (PDT)

Fantasy Dollars - How to calculate

I just came across a thread containing the following question and (your) answer: "Is OPS the best way to evaluate projections?" ... "Of course OPS is not the best way. Linear Weights is (for real baseball) and Fantasy Dollars is (for fake baseball)." ... I'm a novice when it comes to the quantitative side of fantasy baseball and am wondering if there is one primary source for "fantasy dollar" information. There's an owner in my league who will state definitively that "Player X earned $28 last year" but won't divulge the source. Can you help?

I'll guess it's Ron Shandler, at and the Baseball Forecaster. But, really, it's a snap to figure out. I laid it out all here:

In that thread, at post #6, I bring you to a link that gives you the step-by-step details.

--Tangotiger 18:33, 10 June 2008 (PDT)

Run Environments - Extreme cases

I guess my main question is about run environments. ... You talk extensively about how the value of certain events is based on the run environment that it occurs in. If I understood your methodology correctly, you manipulated the variables in BaseRuns such that they fit the actual data from the 1974-1990 seasons, which had a 4.3 RPG average. So if your model is built on actual data that averages 4.3 RPG, then why would you ever expect that it would still work for the extreme cases? Wouldn't you expect that the model would be way off for a game with 20 runs scored or with 10 home runs hit since the equation is based on data for a run environment with an average of 4.3 RPG? ... I see how you created the model, and I see the results of the model. I am just not understanding why you would expect it to be correct, nor why it actually is correct when applied to a run environment that is extremely different from the one that is used to construct the model.... My next question is, if you are able to make it work for extreme cases like a game with 10 home runs, then can you consider 2008 baseball games as an "extreme case" and use the BaseRuns equation to approximate run scoring in today's game?... I guess the question that I am asking is really the point that you were trying to make in your articles, so maybe I just missed the point. However, I would appreciate any assistance you might have. Great work by the way.

The correct model is a Markov model that uses all variables. Since that's a bear to program, you either create a simulator, or you can accept a basic version of the Markov, as I have done here:

That model comes complete with source code, so it is out in the open for all to see and use.

BaseRuns comes very close to that model, regardless of run environment. So, we used the 1974-1990 data to "tune" the model, and then let 'er rip for any environment. The basic point is that as the OBP rises, the percentage of runners on base scoring also rises (almost in lock-step 1:1). And BaseRuns holds to that principle pretty strongly.

--Tangotiger 18:38, 10 June 2008 (PDT)

Intangibles - measuring

In this article the author says that Cliff Floyd makes teams better as shown by the fact that the teams he joined won. ... A quote:
Unfortunatley sabermetrics cannot nor can any kind of metrics measure the unmeasurable little things teammates do for each other to make each other better. You can't measure a conversation that instructs or motivates or takes pressure off. You can't quantify a well timed joke that eases tension in the clubhouse. Or a look that reassures. Or a skillful deflection of the media away from a player who can't handle it. Or a million other little things that go into the relationships that are bonded over the course of season together. These are the human qualities that elude objective statistical analysis. But they're crucial to creating a winning team. In order to win, especially to win a championship, it's not simply about being the best player but about making other teammates play better.
Do you agree?

I have no idea, but I doubt that it is true. Anyway, it is like a Bill O'Reilly, "Just answer the question yes or no!" thing. Lots of questions don't have yes or no answers, but bullies and dolts like to think that they do. If I weren't being politically correct, I would say that the article you reference is incredibly stupid and that the author is a dolt, at least as far as baseball goes. But I'll be politically correct and just say that since it admittedly (by the author) can't be measured, he has no idea what effect certain players have on others and what effect that has on a team "winning." We also know for a fact that a team can win enough games to make the post-season regardless of the atmosphere in the clubhouse and regardless of what influences certain players have on others. All teams will win 60 pr 70 games on marginal baseball talent alone. Certainly some teams are more talented than that no matter what the atmosphere, so there are certainly some teams who will win on the average 80 or 90 games on talent alone. We also know that the laws of mathematics guarantee that 10% of all teams will win 10 games more or less than they are supposed to by "luck" alone (and 5% will win or lose 14 games, etc.). So the author, whoever he is, if he had an ounce of sense, which apparently he does not (again, at least about baseball), would know that some teams can win almost any number of games, certainly enough to get into the post-season and even win a WS, regardless of the so called intangibles. Now, how much, if anything, so-called intangibles are worth, is a mystery to anyone. If this guy thinks he knows the answer, he ought to be on Oprah or a billionaire, or both.

Ah, it's not even worth my time discussing this sort of nonsense. Nothing wrong with your question. The "nonsense" I am referring to is the article. There is no shortage of garbage in the media and on he internet. We could fill a small planet with it. Then again, my idea of garbage is not the same as everyone else's. If this guy entertains people and they enjoy reading his stuff, all the more power to him. Or if he entertains himself, that's fine too. I assume he is not forcing anyone to read his writings. Not everyone's job is to make people smarter. Some people do a great job a making people dumber, and that is not a crime yet, as far as I know.--Mgl 14:29, 10 June 2008 (PDT)

That article sure had a lot of yapping.

What it comes down to is we all quantify intangibles. I'm sure you are a great worker at work, and I'm sure you do a lot of little things that simply aren't being noticed. They somehow all add value or contributions to your company. A lot of intangible things that we simply can't properly value, but we know are there. However, you are being paid, right? Presumably, someone with the exact same skillset as you, but without the intangibles, would get paid a bit less than you? How much less? 1% less? 20% less? That's your boss putting a value on your intangibles, and that is you putting value on your intangibles. I'll bet anything that a person's intangibles will make up less than 5% of that person's total compensation.

So, there you go, intangibles have been "tangibilized". How much does Cliff Floyd get paid? Are teams clamoring for him? Exactly how much more does he get compared to someone with similar skillsets otherwise? Is there a single MLB player that gets more than $1 million more than he would otherwise get if not for his "intangibles"? On the free agent market, teams pay $1 million for an extra 2 or 3 runs in a season.

Are we really doing all this yapping for the possibility that at the extreme level someone *might* be generating 2 or 3 runs from his intangibles?

--Tangotiger 18:47, 10 June 2008 (PDT)

You say that intangibles are controlled by the market, and that the market doesn't value intangibles that much. That was the argument from USS Mariner, and it makes a certain amount of sense in showing why most of the intangibles talk isn't backed by action. ... However, I don't think that answers whether you actually believe in the value of intangibles. As we all know, there are large inefficiencies in the baseball market: why would we trust these intangibles to be properly evaluated?... To make it short and simple, if you were the GM of a major league team, would you spend any time/resources on trying to identify if there actually is some sort of personality effect on performance? Giving the players Myers-Briggs assessments or the like, and/or having them express their mood on some sort of scale/standard before a game to see if there's a correlation?...Or would you focus on other things you felt were more worthy of the resources? We all know what the market thinks about intangibles, I want to know what you think.

I wouldn't do any more than what they are already doing. I'm sure they are doing some, and I would not allocate more resources to it. I don't know that I would allocate less resources to it, but until someone can convince me either way, I'll maintain the status quo.

There's just so many other places where you can have real benefits, that that's where the cost/benefit lies.

--Tangotiger 07:28, 11 June 2008 (PDT)

I'll add that if I were a GM, I would not go out of my way to acquire players with a bad reputation or players that I personally did not like, and that I would probably give a little more deference to players who are considered good clubhouse guys. Not a whole lot on either count, but a little. There are several reasons. One, players are fungible and there is rarely going to be a player that I either "must have" or one that is not replaceable by someone similar. So I might as well go for the one that I "like." Two, there might be "something" to a player such that his "intangibles" help or hurt my team's chances of winning, such that I will use that as a tiebreaker or even more. It's sort of like all the atheists on the airplane praying to God as it's going down, "Just in case." Three, I am only a human being, and human beings like to work with other human beings whom they deem pleasant to work with. Why would I go out of my way to hire someone who no one likes or who has a bad reputation? There would have to be a compelling reason to do so. Finally, the fans like players who are likeable, of course. That has got to have some "fannies in the seats" value, even if it is a little.

As Tango says, bottom line is that if there are intangibles that impact a team's chances of winning, the impact is likely small, at least as compared to player talent. Plus, it is only worth putting any time and energy into if it can be identified. Maybe it can and maybe it can't. So we already have two factors working against us: One, we think that whatever value it has in the absolute, it is probably small. Two, we would probably have a hard time identifying who had the good mojo and who had the bad mojo, to any degree of certainty and reliability. Does that sound like something you want to put a whole lot of time and energy (and money) into? It does not to me. The other bottom line is that as with many questions about the world and about human nature and human dynamics, we have no idea as to the answer to the above questions. No idea. And probably never will. We have some educated guesses. But no idea with any certainty. Someone, like this author, proclaiming that he has the answer, is just a talking head flapping his lips with nothing much he says making us any smarter. --Mgl 14:56, 12 June 2008 (PDT)

Agreed. Once you try to quantify things (and you have to, since you are paying with dollars), the course of action becomes self-evident: use it as a tie-breaker, and maybe do a little digging. But, that's pretty much it. Any other course of action is really just a precursor to the quantification step. That is, the yapper is intentionally NOT going to the next step, for fear that all his yapping will come against the realization that it's not worth as much as his intuition would think and that it'd be darn hard to figure out how much it was worth.

Martin Brodeur (goalie for the Devils) went through a nasty divorce during the playoffs a few years ago. His wife would call him before the game to tell him she was going out with one of his friends. She'd play head games with him, etc. So, what do you think happened? a) he was so messed up that they were eliminated in the first round or b) he was so focused, he needed extra concentration that allowed him to win the Stanley Cup or c) make up some other reason after knowing the outcome of the games?

This is my point. How the heck do we know how someone's personality is like and how it affects his game and the game of others around him? While these things are real, our knowledge of these things is severely lacking. We know sh!t, basically. How much would you pay for sh!tty knowledge?

The Devils won the Cup. I guess that means that Brodeur has supreme concentration levels.

--Tangotiger 11:37, 13 June 2008 (PDT)

OPS - how to better balance it

You say that (obp*2+slg)/3 is close to wOBA (which was a relief because the RBOE stat I can't find anywhere) yet in the blog section when you discuss "getting to know wOBA" you give league averages of: OBP: .338 SLG: .426 wOBA: .335 ... Obviously that average OBP and SLG doesn't equate to anything close to .335 when (obp*2+slg)/3 is used. What am I missing? Does the (obp*2+slg)/3 provide a much different number, but the relation between one player's actual wOBA and (obp*2+slg)/3 and another's stay about the same?

The double and divide by 3 is a rough approximation.

If you use (1.74*OBP + SLG) divided by 3, you'll get a better number.

Or (2*OBP + SLG) divided by 3.26. In The Book, it was really just a quick blurb to alert the reader of the general relationship.

If you are looking for the nuts and bolts, you can find it on my blog
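
As a quick check, the three approximations can be compared against the league averages quoted in the question (OBP .338, SLG .426, wOBA .335). This is only a sketch; the function names are made up for illustration and are not taken from The Book or the blog.

```python
# League averages quoted in the reader's question.
LEAGUE_OBP = 0.338
LEAGUE_SLG = 0.426

def rough_woba(obp, slg):
    """(2*OBP + SLG) / 3: the quick rule of thumb."""
    return (2 * obp + slg) / 3

def woba_lower_obp_weight(obp, slg):
    """(1.74*OBP + SLG) / 3: same divisor, smaller OBP weight."""
    return (1.74 * obp + slg) / 3

def woba_larger_divisor(obp, slg):
    """(2*OBP + SLG) / 3.26: same weights, larger divisor."""
    return (2 * obp + slg) / 3.26

for estimate in (rough_woba, woba_lower_obp_weight, woba_larger_divisor):
    print(estimate.__name__, round(estimate(LEAGUE_OBP, LEAGUE_SLG), 3))
```

With these inputs the rule of thumb lands in the high .360s, while both rescaled versions land within a few points of the quoted league wOBA.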

--Tangotiger 07:05, 12 June 2008 (PDT)

wOBA - Park Adjusted

Are you guys going to work on park-adjusted, league-adjusted wOBA's? It seems like the last step for the best offensive rate statistic.

My personal viewpoint is that once you start to adjust things, you lose half the audience, and the half you didn't lose will accept what you do on faith. Most people don't know how EqA is calculated, and certainly no one knows how to do it with all the park and league adjustments. People accept the results on faith, and that's not what I think people should be doing.

That said, it's a straightforward enough process to calculate park and league adjustments, and I'd like to leave the field open to anyone else who wants to provide those adjustments.

A reader also recommended , which is a great site.

--Tangotiger 09:12, 24 June 2008 (PDT)

Pitcher Batting - value

As a Mets Geek regular I read John Peterson's article on re-thinking the nature of 4th & 5th starting pitching slots. John cited you guys as claiming the offensive benefit of removing pitchers from the lineup to be .42 runs per game. That seemed remarkably dramatic to me based on nothing more than intuition and I saw in the comments thread that you were willing to consider substituting a "conservative" 0.25 runs per game. You've obviously got a pretty sophisticated model for calculating it and sadly I'm one of those guys who's good enough at arithmetic to raise questions but probably not bright enough to readily understand the models that provide the answers . . .  :) With that in mind, I'd love to walk you through my "crude layman's math" and see if there's a chance that even .25 rpg is optimistic. (Calculations omitted.)

I detailed it at length in The Book. I don't necessarily want to force you to buy it, but at the same time, I don't want to repeat everything I said there.

That said: 1. Your PA estimate is a little low for pitchers. Retrosheet will give you total PA by position. No need to "randomly" select one team. 2. You are ignoring the outs in your equation. Linear Weights, not Runs Created, is what you want.

The impact is as substantial as I said, unless you have crappy bench players.

--Tangotiger 12:29, 8 July 2008 (PDT)

Linear Weights - men on base

A few years ago, I looked at your calculation of linear weights with men on base. I can't seem to find them any more; is that work still available?

It's also in The Book.

--Tangotiger 12:29, 8 July 2008 (PDT)

Pitching with men on

I was wondering if you knew of any studies showing how pitching performance changes between pitching with the bases empty versus pitching with runners on? I looked quickly at the Book but did not find anything.

You looked too quickly: p. 113-116.

--Tangotiger 12:29, 8 July 2008 (PDT)

Runs Produced - why remove HR

I enjoy your comments on Bill James Online. In the discussion on RBI Opportunities, you mentioned that you hate RBI and prefer RBI - HR. Could you send me or point me to your article on that subject, please? Given only a moment's thought on it, I don't see the logic. Do you also hate Runs and prefer Runs - HR?... Also, I've only glanced at your site, but there's a ton of information there. I'm looking forward to spending some quality time with it.

--Tangotiger 12:29, 8 July 2008 (PDT)

Base/Out Frequency

I'm trying to figure something out, and to do it I wanted to know the probability of a two-on, two-out base/out state. It occurred to me that you must have compiled that data in order to do your run expectancy/run frequency matrices (which I refer to frequently and really appreciate, btw). Is this something that you can share with me?

Look at the PA column:

That's for 1956-2007, excluding 1999.

--Tangotiger 12:29, 8 July 2008 (PDT)

Chance of winning

What's the chance of a team winning after x innings if leading by y runs.

By inning, 1978-1990:

Including base/out, since 1956:

--Tangotiger 12:29, 8 July 2008 (PDT)

Defense - splitting credit

I came upon your 'anatomy of a collapse' post referencing the Cubs-Marlins game and it started me wondering if, in general, the responsibility for a batted ball could be quantified to the pitcher and fielders. It likely is too subjective or too individual to do, but do you know if this has been investigated? Does it seem that a batted ball would cause two separate pathways, depending on the outcome? i.e., a line drive would be either a hit (most/all of the blame going to the pitcher), or an out (most/all of the positive "blame" going to the fielder). Would a catcher get part of the credit for a strikeout?

I abhor the "splitting credit" viewpoint, because it always presumes that both sides share some positive or negative aspect.

It's possible that if there's a +.30 run play, the pitcher gets -.20 runs and the fielder gets +.50 runs. It doesn't have to be +.10/+.20 or +.15/+.15.

You should think in terms of marginal changes between intermediate states. So, if a pitcher throws a pitch and the batter hits it, AT THE POINT that the ball just leaves the batter's bat, you have to decide the likely outcome of the play, presuming average fielders at each position. Maybe the outcome would be an out 70% of the time, a single 20%, a double 10%. So, the run value would be say -.03 runs. If the fielder(s) get an out, they get credit for say -.27 runs. If the ball falls through for a single, the fielder(s) get credit for say +.50 runs.

So, an out in this case would get split as -.03/-.27 and a hit would get split as -.03/+.50. As you can see, regardless of the outcome, the pitcher gets the same credit.
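
A minimal sketch of that bookkeeping follows. The outcome probabilities are the ones from the example above; the run values per outcome are NOT real linear weights. They are back-solved here (out -.30, single +.47, double +.86) purely so that the arithmetic reproduces the -.03, -.27 and +.50 figures in the answer, and the function name is made up for illustration.

```python
def split_credit(outcome_probs, run_values, actual_outcome):
    """Split a batted ball's run value between pitcher and fielders.

    The pitcher is charged the expected run value at the moment the ball
    leaves the bat (assuming average fielders). The fielders get the
    difference between what actually happened and that expectation.
    """
    expected = sum(p * run_values[o] for o, p in outcome_probs.items())
    pitcher_credit = expected
    fielder_credit = run_values[actual_outcome] - expected
    return pitcher_credit, fielder_credit

# Probabilities from the example in the text.
probs = {"out": 0.70, "single": 0.20, "double": 0.10}
# Hypothetical run values, chosen only to match the text's numbers.
values = {"out": -0.30, "single": 0.47, "double": 0.86}

for outcome in ("out", "single"):
    pitcher, fielders = split_credit(probs, values, outcome)
    print(outcome, round(pitcher, 2), round(fielders, 2))
```

Whatever the outcome, the pitcher's number is identical and the two credits always sum to the run value of the play that actually happened.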

--Tangotiger 10:15, 11 July 2008 (PDT)

Aging Curve - by position

I have a quick question about your aging curve research that I would greatly appreciate any feedback on. 1. Have you ever broken down the aging curve by position? I am particularly interested in 2B aging curves. 2. Have you done any research on how a late start to a player's career (either by injury, platoon or late call up) prolongs the curve?

One of those things constantly on my todo list. I really hope to do it eventually, though I'd be happy if someone beats me to it.

--Tangotiger 19:30, 11 July 2008 (PDT)

Minor League data

I am interested in finding minor league statistics for the past couple decades or so and was wondering if you might know of a source for such data? I see that has minor league data, but I was wondering if you know of any where or any way to get the data in a database form similar to retrosheet or Lahman data.

Check the Websites page on this wiki. The SABR site listed there might get you what you need. Otherwise, some of those sites might be willing to sell it to you.

--Tangotiger 19:32, 11 July 2008 (PDT)

Regression toward the mean

Is it possible to regress to the mean when the stat isn't a binomial rate with some sample size associated with it (like plate appearances) but does have a standard error for each player's observed rate which measures the uncertainty? So for each player I would have an observed rate and a standard error for that observation, and for the population I would have a mean observed rate, and I could calculate the standard deviation in the observed rates (I could even calculate the standard deviation in the standard errors for all the players, if that would be of any use). Is this enough to give some estimate of how far a player's rate should be regressed to the population mean? How would one go about this?

Sure. You need four pieces of information:

 player rate = the value of this player's rate statistic
 standard error = the standard error in this player's rate statistic
 league avg = the league average for this statistic
 true SD = the standard deviation of "true player talent" for this particular stat.

You have the first three but not the fourth, so we have to calculate it first. The best way of doing this is in the book appendix. The quick-and-dirty way is to compare the standard deviation of observed rates with the root mean square average of the standard errors for the players (root mean square being the square root of the average of the squares). You'll want to require some minimum number of chances. The standard deviation of true player skill is given by

  true SD = sqrt [ ( standard deviation of rates)^2 - (rms average of standard errors)^2 ]

With this, you can regress any individual player's rate using

  regressed rate = [ (player rate)/(standard error)^2 + (league avg)/(true SD)^2 ]
                 / [ 1/(standard error)^2 + 1/(true SD)^2 ]
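
The two formulas above can be sketched in a few lines. This is only an illustration of the quick-and-dirty route, not the method from the book appendix; the function names are invented here, and using the population (divide-by-n) standard deviation of the observed rates is an assumption.

```python
import math

def estimate_true_sd(observed_rates, standard_errors):
    """true SD = sqrt(SD(observed rates)^2 - RMS(standard errors)^2)."""
    n = len(observed_rates)
    mean_rate = sum(observed_rates) / n
    # Assumption: population variance (divide by n) of the observed rates.
    var_observed = sum((r - mean_rate) ** 2 for r in observed_rates) / n
    # Mean of the squared standard errors, i.e. the RMS average squared.
    mean_sq_error = sum(se ** 2 for se in standard_errors) / n
    var_true = var_observed - mean_sq_error
    if var_true <= 0:
        raise ValueError("observed spread is no larger than the noise")
    return math.sqrt(var_true)

def regressed_rate(player_rate, standard_error, league_avg, true_sd):
    """Inverse-variance weighted average of player rate and league average."""
    w_player = 1.0 / standard_error ** 2
    w_league = 1.0 / true_sd ** 2
    return (player_rate * w_player + league_avg * w_league) / (w_player + w_league)
```

A player whose standard error equals the true SD gets regressed exactly halfway to the league average; a larger standard error pulls him further toward the mean.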

--AED 12:09, 12 July 2008 (PDT)

Calculating pitcher WAR

Do you use ERA or FIP? In your post you use ERA, but I also see some people in the comments section using FIP. Also, how do you weigh each year in terms of importance when finding a player's true talent level?

I use both, but weight the FIP more.

As for true talent, I've explained Marcel step-by-step:

Basically, for pitchers, the weight is around weight = 0.999 ^ daysAgo

So, performance that happened a year ago is weighted at 70%, two years ago is 50%, three years ago is 35% and so on.

There's also a regression toward the mean component.

For pitchers however, you'd be better off doing it by component, as the current K rate is both more indicative of current talent and requires far less regression. His BABIP rate has the opposite issue.

Weight for hitters is .9994^daysAgo
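
The decay weights can be checked directly. A minimal sketch, using only the two daily decay rates quoted above; the constant and function names are made up for illustration.

```python
PITCHER_DAILY_DECAY = 0.999   # from the answer above
HITTER_DAILY_DECAY = 0.9994   # from the answer above

def recency_weight(days_ago, daily_decay):
    """Weight given to a performance that happened `days_ago` days ago."""
    return daily_decay ** days_ago

for years in (1, 2, 3):
    days = 365 * years
    print(years,
          round(recency_weight(days, PITCHER_DAILY_DECAY), 2),
          round(recency_weight(days, HITTER_DAILY_DECAY), 2))
```

For pitchers this gives roughly the 70% / 50% / 35% sequence described above; the hitter rate decays more slowly, so older seasons keep more of their weight.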

--Tangotiger 10:19, 11 July 2008 (PDT)

Strikeouts - run value

I am trying to determine how much worse a SO with ROB is than any other type of outs. On your linear weights page , with a runner on 3B and 1 out it says a SO is -.60. An out is -.22. Does that mean that a SO in that situation is 38% worse?

If you can get away from speaking in terms of percentages, then yes: the run impact of a K in that situation is .38 runs more damaging than another out (-.60 versus -.22).

Basically, if you have runners on 3B and less than 2 outs, a K is far far more damaging than any kind of out. Then again, if you have a runner on 1B and less than 2 outs, a GB out is more damaging than a K.

I talk about it a bit in The Book, if you are interested (p. 130-132).

--Tangotiger 10:08, 11 July 2008 (PDT)

wOBA - where to find

wOBA: Best Stat Evaaaahhh! I hope it takes over the world. Is there a website with player's current and historical wOBA's listed?

wOBA is really just Linear Weights (LWTS) but expressed as a "rate" stat. Personally, I prefer LWTS, or LWTS per 700PA. But a large group of people have an enormous problem with 0 = average. So, all we did was scale it so that it looks like OBP (which has the added advantage of making it easy to figure out the standard deviation). I would show all 3 stats.

I agree, someone should show this. Our friends at Fangraphs, Hardball Times, and Baseball Reference are all great candidates to be the first to show the stat. You can press on any of those guys to move forward on it.

On a historical level, I wonder if we should try to scale it the way Baseball Prospectus does and force a common scale for all years. That is, should we force a wOBA = .333, just so that we can compare year to year. For our purposes, since we were looking at such limited years, it wasn't a worry. In the historical data, I would have that concern.

A reader also recommended , which is a great site.

--Tangotiger 13:58, 17 July 2008 (PDT)

Standard Deviations - by PA

The chart that shows the standard deviations for wOBA over different PAs makes me aroused. Do you have a similar chart for OBP, SLG, and OPS? I want to be able to tell my friend he's an idiot when he says "Player X is the greatest- look at his OPS" based on 300PAs. I want to be able to say exactly how much of his performance could be random fluctuation. Also, if you have something clever and nasty I could say about his mom when I do this, that would be greatly appreciated as well.

I believe that the SDs for all of those stats are in the appendix in The Book. I don't have a copy handy, so you'll have to shell out a few bucks if you want to impress and/or show up your friend.

As far as something clever or nasty, you'll have to come up with something yourself. Sorry. --Mgl 01:13, 17 July 2008 (PDT)

GB v GB - why the platoon split?

Do you have a possible explanation as to why the gb pitcher vs gb hitter split works like it does? This was a counterintuitive revelation to me. ...I would have thought if Chien Ming Wang is on the mound I'd prefer to have an Ichiro type of player because he wants to hit the ball on the ground and use his speed. Likewise, if Former and Future Yankee Savior Eric Milton were on the mound, the last guy I want to see up there is Jack Cust because he's gladly going to hit those fly balls and they'll land in some colonial Mexican town. ...But, if I can match Wang up against Dunn I can at least keep the ball on the ground and limit his effectiveness. And if I can put Milton in there against Ichiro I can get Ichiro to hit a harmless fly ball.... Why doesn't it work like this?

While you may have a point about "neutralizing a batter's strength" (e.g., as you say, if you can force Adam Dunn to never hit a fly ball, he is not worth much), the reason is that a fly ball batter hits more line drives and otherwise harder hit balls versus a ground ball pitcher, and thus his batted balls have more success. (The same type of thing for all other combinations.) A ground ball batter tends to swing more level or even downward and tends to hit the top of the ball more often. This is a good strategy against a fly ball pitcher who tends to pitch up in the zone and make batters hit the bottom of the ball. Against a ground ball (sinkerball) pitcher, he really pounds the ball into the ground, hitting a lot of weak ground ball outs. The fly ball batter versus the fly ball pitcher tends to hit a lot of pop-ups and lazy fly ball outs.

That is the reason. --Mgl 01:19, 17 July 2008 (PDT)

200ft fence - effect

Here's my first radical idea: Knowing that fast-pitch softball is such a low-scoring environment, where infield singles and bunting proliferate...If you had a 200-ft. fence in the outfield, which most high schools do...could you justify a two-outfielder, five-infielder approach??? One girl plays up and on the line, completely taking away the bunt (ideally she's fearless, since she's 30 feet away from the hitter. But to prevent injury, she could also be placed on the non-pull side near the line for each hitter). This allows you to never crash your first baseman or third baseman, and puts your "extra defender", pitcher and catcher in charge of all bunts. My theory is that you'd take away a ton of singles by never having to have infielders moving on the pitch to cover bags. The 2b could play a conventional MLB 2b depth at all times, on the lip of the grass. You'd also essentially have two shortstops, almost taking away the hole. Obviously you'd be vulnerable in the OF, but frankly, the ball doesn't get hit there with any kind of authority much in high school softball - especially opposite field. The key is the 200-ft. fence behind you; with two quick OFs, I think you could still prevent any triples, since the ball couldn't roll forever. In the cost-benefit analysis, I think you'd affect the other team enough psychologically, AND take away enough singles and their bunt game, to overcome the extra double you'd allow per game....Other thoughts you might ruminate on: Even in softball, I think I'd eschew giving away outs via bunting. What do you guys think?...Thank you very much for your time. I am a huge fan of your blog and it's influenced my knowledge of the game a great deal.

Sounds reasonable. Or maybe not. I know nothing about softball. I would think that in a low run environment like fast pitch softball, sac bunts and other small-ball strategies would be very effective. Of course it depends, as always, on the success rate of a sac bunt, both in terms of moving the runner over with an out, and reaching base safely on a hit or an error.

--Mgl 01:23, 17 July 2008 (PDT)

Pitcher wear patterns - Weibull

If you have access to the data, have you ever considered using Weibull Distribution to try to chart pitcher wear patterns (either by innings or pitches)? By breaking it down by month to see when the average pitcher 'breaks' it would be interesting to see.... Any chance I can con you into doing it? Or is it a waste of time that won't really provide useful data because of all the noise?

Just to let you know, I have not ignored this question. I just can't answer it. Maybe Andy can. Andy?

--Mgl 17:56, 28 July 2008 (PDT)

Talent level - AL v NL

Keith Law said in a recent chat that, in terms of talent and competitiveness, "the AL is still light-years ahead of the NL." Anecdotally this seems at least partly accurate, but is there any way we might quantify the claim?

MGL wrote a 3-part series here two years ago:

--Tangotiger 11:25, 16 July 2008 (PDT)

To summarize the methodologies in my paper, there are 2 ways to compare the average or overall talent in both leagues (hitting and then pitching and defense, leaving out base running and other small things like that):

One, look at what happens when players switch leagues. For example, let's say that 30 position players from the NL went to the AL the next year. If those 30 players were exactly average collectively in the NL and then were a little below average in the AL the next year, after adjusting for aging, then we would have to conclude that the AL in year X+1 was a little better than the NL in year X. We might also speculate that anyone who switches leagues might get a little "worse" until they get used to the league and pitching. However, if we look at the batters who went from the AL to the NL, and they got "better" (again, not in raw stats, but normalized stats, relative to the rest of the league), then we could conclude that one league was better than the other, at least comparing one year to the next.

Two, in inter-league games, we can see how the road NL batters do against NL pitchers in NL parks and compare that to how the road AL batters do against those same pitchers in those same NL parks. Again, we have to do the same for the AL batters versus AL pitchers and NL batters versus AL pitchers. And we can do the same for NL and AL pitchers.

It is not that hard. It is the same thing we do for comparing AAA (and other minor and foreign leagues) and MLB to compute MLE's. We look at players who have played in both leagues.

--Mgl 01:31, 17 July 2008 (PDT)

True Talent

Does "true talent" apply to fielding and baserunning/basestealing as it does to hitting and pitching?

I don't know what you mean by "apply." When we talk about a player's "true talent" we mean what he would do if he had exactly an infinite number of opportunities to do it in. A fair coin's true talent "batting average" is around .500.

If you timed me in a 60 yard dash an infinite number of times, my average time would be my "true talent" speed in the 60 yard dash.

If a certain batter batted an infinite number of times, whatever his BA was at the end would be his "true talent" BA.

So yes, "true talent" applies to any skill or characteristic that a player possesses. That is what we are always trying to estimate from a player's sample performance in a finite number of opportunities (such as a player's BA in 300 AB), using statistical techniques and other methods of deduction (like scouting and similar observational methods).

--Mgl 19:11, 25 July 2008 (PDT)


Are the playoffs really a "crapshoot"?

When people bet, do they lower the odds for a Billy Beane-controlled team? Get past the yapping, and follow the money. That tells you what people really think.

--Tangotiger 08:33, 25 July 2008 (PDT)

IOW, no. I don't even know what that means. That both teams in a playoff series have the same chance of winning? Would that make any sense whatsoever? Whoever said that is a moro...wait, that was the venerable Billy Beane. ;)

The team with the best personnel will win, in the long run, more than the team with the worse personnel. Obviously. It is also obvious (well, maybe not as obvious as the fact that if both teams are not of equal quality, then one team has a greater chance of winning a playoff series) that if two teams are unequal, the chance (say, in percent) that the better team wins the series increases with the length of the series. If one team is a lot better than the other, they may have a 65 or 70% chance of winning a 5-game series. If the same two teams played 19 games, the better team might have an 85% chance of winning the series. Etc.
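
The effect of series length is easy to sketch with a simple binomial model. The assumptions here are strong and are mine, not the author's: every game is independent and the better team has the same single-game win probability throughout (no home field, no pitching rotation effects). The function name is made up for illustration.

```python
from math import comb

def series_win_prob(p_game, n_games):
    """Chance of winning a best-of-n series (n odd).

    Computed as the chance of winning a majority if all n games were
    played, which is equivalent to stopping once the series is clinched.
    Assumes independent games with a constant win probability p_game.
    """
    if n_games % 2 == 0:
        raise ValueError("n_games must be odd")
    wins_needed = n_games // 2 + 1
    return sum(
        comb(n_games, k) * p_game ** k * (1 - p_game) ** (n_games - k)
        for k in range(wins_needed, n_games + 1)
    )

for n in (1, 5, 7, 19):
    print(n, round(series_win_prob(0.60, n), 3))
```

Under these assumptions, a team that wins 60% of single games wins a 5-game series a bit under 70% of the time, and its edge keeps growing as the series gets longer.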

For the record, it is my opinion that a 5 or 7-game series in the post-season is the same as any 5 or 7 games in the regular season. Obviously the way the managers manage is a little different, but generally the team with the best pitching, hitting, defense, and baserunning will be the favorite in the series.

The idea that what "works" in the regular season may not or does not "work" in the post-season, is nonsense. Obviously in the post-season, your pitching is weighted differently (e.g., your number one and two starters pitch 65 or 70% of the games instead of 40-45% like in the regular season), and so may other things be weighted or leveraged a little differently, etc. But still, the team that plays the best pitchers, hitters, defenders, and baserunners combined is going to be the favorite in the series. Period. There is no "magic" to post-season strategy or what makes a team a good post-season team (basically the same things that make it a good reg season team).

What Beane meant was that a good team might be a .550 team or better 80% of the time for 162 games but only 55 or 60% of the time over a 5 or 7-game series. He is right and that is obvious to anyone who knows anything about baseball. It is not that hard to put together many years of good, winning teams, and then be unsuccessful in 5 or 6 (or 10) post-seasons simply by chance alone. It is A LOT harder to put together good teams and be unsuccessful in the regular season for 5, 6 or 10 seasons, simply because there are a ton more games in the regular season and the "law of large numbers" tells us that the more trials (opportunities or games in this case) we have, the closer a team will come to its true winning percentage (true talent).

Odds Ratio Method - shortcut

I remember a post a while back about the odds ratio method, where a commenter wrote a "shortcut" for it. If I remember correctly it was A+B-C where "A" was the batter OBP, "B" was the pitcher OBP against, and C was the league average OBP. Plugging numbers in would reveal the expected OBP for the matchup. Is this shortcut correct? If not, what is the correct way to find the expected result of a matchup? I have somewhat limited mathematical capabilities, so please keep that in mind when responding.

Yes, the shortcut for the Odds Ratio method is close to the differential method, a method employed by Strat-O-Matic. A shortcut is not "correct". It's close enough, and the differential is close enough.
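
For comparison, here is a sketch of both calculations. The odds ratio version below is the commonly used form (multiply the batter's and pitcher's odds, divide by the league odds, convert back to a rate); the function names and the sample inputs are made up for illustration.

```python
def to_odds(rate):
    """Convert a rate such as OBP into odds: rate / (1 - rate)."""
    return rate / (1.0 - rate)

def odds_ratio_matchup(batter_obp, pitcher_obp_against, league_obp):
    """Expected OBP for the matchup using the odds ratio method."""
    matchup_odds = (
        to_odds(batter_obp) * to_odds(pitcher_obp_against) / to_odds(league_obp)
    )
    return matchup_odds / (1.0 + matchup_odds)

def differential_matchup(batter_obp, pitcher_obp_against, league_obp):
    """The A + B - C shortcut from the question."""
    return batter_obp + pitcher_obp_against - league_obp

# Hypothetical matchup: .400 OBP batter, .300 OBP-against pitcher, .330 league.
print(round(odds_ratio_matchup(0.400, 0.300, 0.330), 3))
print(round(differential_matchup(0.400, 0.300, 0.330), 3))
```

For ordinary major-league rates the two answers differ by only a few points of OBP. The gap widens for extreme inputs, where the shortcut can even produce a rate below 0 or above 1 while the odds ratio method cannot.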

--Tangotiger 08:33, 25 July 2008 (PDT)

Park Factors - wOBA

Regarding Question #33 -- how could I incorporate Baseball-reference's park factors to park adjust wOBA? I have an idea on how I could use ESPN's park factors since they are expressed in runs (not sure about B-R), but I don't trust ESPN's park factors.

I think that the basic rule of thumb is that run scoring is proportional to the square of measures like OBA or BA (or wOBA or OPS), so if you took the square root of the park run factors, you could use those to park adjust wOBA.

Try and use multi-year park factors. A minimum of 3 years is best. I use 10 (or more) year park factors if I can (if the park has not changed), although I adjust for the other parks in the league. Also, regressing any park factors towards a mean of 100 is nice too. The fewer years you have, the more you regress. For example, if I told you that there was this park in the majors (we know nothing about it) that had a park run factor of 87 last year, it would be silly to use that as your park factor. A park factor is supposed to be a "true park factor" or how much it affects an average player or team in the long run. If you know something about the park, then you can start to regress the sample (however many years it is based on) PF to something other than 100. For example, you might regress that 87 50% toward the mean since it is based on one year only (I don't know whether 50% is the correct regression amount), and call it 93.5. But if I told you that the park was located in a city which was at sea level, you might regress it towards 97 (or whatever the mean of all sea level parks is), so that your true estimate is now 92. If I also told you the outfield dimensions, you could probably come up with an even better estimate of the "population mean."
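
The two steps described above (regress the park run factor, then take its square root before applying it to wOBA) can be sketched as follows. The 50% regression amount is the placeholder from the example, not a recommended value, and dividing a player's wOBA by the square-root factor is an assumption about how the adjustment would be applied. It also assumes the factor already reflects the share of games played at home. All function names are made up for illustration.

```python
import math

def regress_park_factor(observed_pf, prior_mean=100.0, regression=0.5):
    """Move an observed park run factor part of the way toward a prior mean."""
    return observed_pf + regression * (prior_mean - observed_pf)

def woba_park_multiplier(run_pf):
    """Square root of the run factor, on a scale where 100 means neutral."""
    return math.sqrt(run_pf / 100.0)

def park_adjusted_woba(woba, run_pf):
    """Assumed usage: divide observed wOBA by the square-root factor."""
    return woba / woba_park_multiplier(run_pf)

pf_generic = regress_park_factor(87.0)                     # toward 100
pf_sea_level = regress_park_factor(87.0, prior_mean=97.0)  # toward 97
print(pf_generic, pf_sea_level)
print(round(park_adjusted_woba(0.330, pf_generic), 3))
```

The first two results reproduce the 93.5 and 92 from the example; with more years of data, the `regression` argument would shrink toward zero.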

--Mgl 17:34, 28 July 2008 (PDT)

The Book - Win Expectancy chart

"The Book" looks very interesting, but before I spend money on it, I need an explanation of the chapter-1 chart on your Web site (win expectation based on game state). I cannot make heads or tails of it. It looks like the chances of winning the game are consistently higher when, say, you're ahead by 4 runs in the 6th and the bases are loaded AND THERE ARE TWO OUTS than when there are no outs (.888 to .748). This doesn't make any intuitive sense to me. However, I do realize that I'm not seeing the explanation of how the table works. So could you please give me sufficient explanation so I can see what's going on there? Thanks very much.

Thanks for your interest. The Book makes clear (and the website does NOT... my bad) that the chart is from the perspective of the home team. You may also appreciate this chart, also from the home team perspective:

--Tangotiger 10:32, 28 July 2008 (PDT)

Regression toward the mean - weighted or simple mean

Regressing to mean - In the classic scenario, you have 20 students take a 100 question test, and get the average for the test. If you have 2 students at 90, 16 at 70, and 2 at 50 you get an average of 70. In baseball, the questions are plate appearances - 2 90s answer 100 questions, the 16 70s answer 50 questions, and the 2 50s answer 20 questions - this makes my mean 73.1, due to weighting. Everything I have seen tells me to regress to the mean of 73.1, but I would think it should be 70. - If I take a student (Player) with no experience wouldn't he be more likely to get 70, not 73.1 (Obviously fill in the major league stat of your choice)

You are always regressing a person/player toward your estimate of the mean of the population from whence he or she comes. If you think that their true mean (population mean) is a function of their experience, then by all means use something different than the weighted mean. If you don't, then use the weighted mean.
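For reference, the two candidate means from the question can be reproduced with a short sketch. The variable names are only illustrative.

```python
# (number of students, score in %, questions answered) from the question
groups = [(2, 90, 100), (16, 70, 50), (2, 50, 20)]

# Simple mean: every student counts once
simple_mean = sum(n * score for n, score, _ in groups) / sum(n for n, _, _ in groups)

# Weighted mean: every question (plate appearance) counts once
weighted_mean = (sum(n * score * q for n, score, q in groups)
                 / sum(n * q for n, _, q in groups))

print(round(simple_mean, 1), round(weighted_mean, 1))  # 70.0 73.1
```

Which of the two is the right target depends on the population you believe the new student or player comes from.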

With baseball (and other things), you have to be careful about "cause-effect relationships" in terms of inferring population means. If I have two players and one player gets a lot of PA and has a great BA (or whatever stat) and the other player has a few PA and has a poor BA, don't automatically assume that they come from two different populations because one is given a lot of PA and the other is not. They may come from exactly the same population but BECAUSE one has been doing better than the other, and only because of that, he is allowed to amass more playing time. In that case, you want to regress both players towards the same mean.

On the other hand, you might have a player who is a part time player his whole career and another player who is a starter his whole career. If you only know their most recent season's stats and one player has 100 PA and the other has 600 PA, you may want to regress those stats towards a different mean - one towards that of a part-time player and the other towards that of a full-time player. Even then, you have to be a little careful, as they may actually come from the same population of players, only one is full-time and the other is part-time simply because the full-time player happens to have done better over the years.

If I didn't answer your question, please let me know.

--Mgl 17:45, 28 July 2008 (PDT)

Game shares - fielding

I get how to figure Game Shares for offense. I don't get how to figure them for defense! You have mentioned before .16 for SS, .09 for fielders, 57-58% for players, 43% for pitchers, etc. What is the system for calculating game shares for any defensive player? ...Ideally, it would be really cool if I could come up with an individual two-number player record (236-74 or 311-(43)) in a quick and dirty fashion by using readily available data. For example, Derek Jeter has 8846 career PA. What if we just divided PA by 38.25 (4.25*9) to get quick and dirty individual games (231.3*.5=115.7)? This is divided by 2 to get 57.8 OGS. Add his LWTs (23.3) for batting from Baseball Reference and this would give an 81-35 record (243-105 by James win shares). You got something like 77-29 in another example. This process leaves out having to divide by the team plate appearances in figuring an offensive record. ...Another way to handle this is to use BFW from Retrosheet: you might take PA and divide by 38.25 shares, then by 2, and multiply by 3 to get 347 OGS. To this add defensive game shares. In the Jeter example he has played 16732 innings at SS; divide this by 1458 (162*9) and this equals 11.5 seasons. Next, using your earlier work, 486*.09*.16 for SS yields 7 GS per season. Multiply by 11.5 seasons and you get 80.5 DGS. We now have 427 total game shares. If you divide this by 2 and add his BFW (from Retrosheet) multiplied by 3, you get a won/loss shares record of 265-162. James has him at 261-124. What do you think? ...My questions after all this are: What % should each defensive player have? I know 16% for SS, but what are the others? And do you have any idea what a similar quick and dirty pitcher system might be? Can leverage be estimated using W/L/SV?

I'm not sure any of us use game shares very much. My primary gripe is that a team of replacement-caliber players would win about 30% of their games, not 0%. So, only 40% of a team's games should count (the games that take them from 30% wins to 70% wins)...

All that aside, if I were to try to quantify the number of wins a player contributed, I'd do something more or less like that. Figure out the fraction of batting plate appearances, pitched innings (or better, plate appearances), and fielding innings he saw. Then, decide how many game shares he should have -- a full season in this case would be 64.8 games, which broken down would be 31.1 batting games, 5.8 fielding games, and 27.9 pitching games. Next, divide his games into half wins and half losses, and adjust for wins relative to a league-average player using your favorite WAA sort of metric.
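A rough sketch of that procedure. It assumes each fraction is the player's share of his team's total in that category, and that "adjust for wins relative to a league-average player" means adding WAA to the wins and subtracting it from the losses. Both are interpretations, and the function names are hypothetical.

```python
# Full-season team totals from the answer: 64.8 games = 40% of 162
SEASON_GAMES = {"batting": 31.1, "fielding": 5.8, "pitching": 27.9}

def game_shares(bat_frac: float, field_frac: float, pitch_frac: float) -> float:
    """Game shares from a player's fraction of team PA, fielding innings,
    and pitching innings (or batters faced)."""
    return (bat_frac * SEASON_GAMES["batting"]
            + field_frac * SEASON_GAMES["fielding"]
            + pitch_frac * SEASON_GAMES["pitching"])

def win_loss_record(shares: float, waa: float) -> tuple[float, float]:
    """Split shares into half wins and half losses, then shift by wins
    above average (assumed: +WAA to wins, -WAA to losses)."""
    return shares / 2 + waa, shares / 2 - waa

# An everyday position player: 1/9 of team PA and fielding time, no pitching
shares = game_shares(1 / 9, 1 / 9, 0.0)
print(win_loss_record(shares, waa=1.0))
```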

--AED 08:27, 12 August 2008 (PDT)

Pinch runners - how much value do they have?

Along the lines of your recent look at when to use defensive replacements, when should pinch-runners be used? I'm thinking in particular of a situation with an elite batter but terribly slow runner, such as David Ortiz -- whether trailing by a run or tied late in the game, when is it worth it to sacrifice his bat (in possible plate appearances later in the game) for faster legs and an increased chance of scoring the tying or go-ahead run? Thanks very much.

The reader mentioned this article on late-inning fielding replacements.

The question is now about late-inning running replacements. The easiest way to describe this is to establish that each base gets you one-fourth of the way home, so that we can say that each base is worth 0.25 runs.

Also note that a fast runner will usually take an extra 0.15-0.20 bases per opportunity. So, if someone normally goes 1B to 3B 30% of the time, the fastest runner will do it 45 or 50% of the time. We can see therefore that a superfast runner will add around 0.25 runs about 15-20% of the time. So, a fast runner adds about 0.04 runs over an average one. If it's a fast runner replacing a slow runner, that ups it to, say, .08 runs. That is an enormous gain. Being a fast runner is very very important... if you can get on base.

If you can bring in a great fielder who is also a fast runner as a pinch runner, then put him on the field, that is a fantastic tradeoff.

Of course, if you have someone who doesn't hit lots of singles or doubles hitting behind him, his value becomes less as a pinch runner, but not that much less.

--Tangotiger 10:17, 28 July 2008 (PDT)

I'll add that with 25-man rosters and too many pitchers on that roster (IMO), I think that pinch runners are underused by teams and managers. One reason is that managers are so risk averse that they would rather bunt a runner over to second late in a close game than have that runner attempt a steal even if the runner is probably going to be safe 80% (or more) of the time. Since bunting is ALWAYS a marginal decision (IOW, bunting or not bunting is always a close decision), if you have a runner on first, and his likely success rate is way above the BE point (which is usually low in a high leverage situation - maybe in the 60-65% range), it is ALWAYS correct to attempt the steal rather than the bunt. Many managers eschew the steal for the bunt. I think the reason is that if the steal does not work, it makes them (the manager) look bad, but if the bunt is not successful, it makes the bunter look bad. Manager strategy is often dictated by how it makes the manager "look" if the move is successful or not. Unfortunately for the team.

--Mgl 17:52, 28 July 2008 (PDT)

CORRECTION: Ouch, did I make a boo boo. A fast runner will add 0.25 runs 15%-20% of the time... that a hitter gets a single or double. So, that will occur a bit over 20% of the time. All of a sudden, we are talking about a bit less than .01 runs. Round it up to .01 for taking extra bases on outs, above average. So, turning a bad runner into a great runner may add .02 runs. Still ok, but not the great improvement I noted yesterday.
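The corrected arithmetic can be laid out as a short sketch. The 20% chance of a following single or double and the 17.5% midpoint of the extra-base range are illustrative inputs taken from the figures quoted in this thread, and the function name is hypothetical.

```python
def extra_runs_from_speed(extra_base_rate: float,
                          runs_per_base: float = 0.25,
                          hit_behind_rate: float = 0.20) -> float:
    """Runs a fast runner adds per time on base, over an average runner.

    extra_base_rate: extra bases taken per opportunity (0.15 to 0.20 above)
    runs_per_base:   each base is worth about one quarter of a run
    hit_behind_rate: chance the next hitter supplies a single or double
                     (about 20%, the step missing from the first estimate)
    """
    return extra_base_rate * runs_per_base * hit_behind_rate

per_time_on_base = extra_runs_from_speed(0.175)
print(round(per_time_on_base, 4))  # a bit less than .01 runs
```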

I will echo MGL here that if you have a fast runner, you'd be crazy to sac bunt him over. I remember seeing Tim Raines bunted over once. It was perhaps the worst percentage play I'd ever seen at the time. I never went over that play, but perhaps I should go over it to see what the manager may have been thinking.

--Tangotiger 08:36, 29 July 2008 (PDT)

BABIP and UZR - for hitters

I don't know if it would be worth the effort, but shouldn't one be able to use the data you use for UZR to come up with some sort of expected BABIP for hitters and/or pitchers? It would obviously be misleading in some circumstances--especially, e.g., guys like Giambi or Thome who face odd defensive alignments--but it still seems like you could get some useful information out of it. You already see discussions of a pitcher's expected batted ball performance based on his GB%, FB%, LD%; it would seem that the use of more granular data could sharpen that sort of analysis considerably.

Yes, I have toyed around with taking the exact distribution of balls in play for a pitcher and then assigning s,d, and t, based on league averages (probably for all LH or RH pitchers). For example, if a pitcher allows a hard hit ground ball in zone C, and 20% of those balls league-wide are singles, and 80% are outs, he gets "credit" for .2 singles and .8 outs. We then add everything up for all of his batted balls, and we get "expected" s,d,t, and outs. In fact, the difference between those expected numbers and a pitcher's numbers are what we call "PZR" or pitcher zone rating. I could be wrong on that definition, as Tango sort of coined the term. He can correct that if necessary.
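A minimal sketch of that bookkeeping. The rate table is hypothetical: only the hard ground ball in zone C (20% singles, 80% outs) comes from the example above, and the second entry is invented to show the shape of the data.

```python
from collections import Counter

# Hypothetical league-wide outcome rates keyed by (batted-ball type, zone)
LEAGUE_RATES = {
    ("hard_grounder", "C"): {"single": 0.20, "out": 0.80},
    ("fly_ball", "M"): {"single": 0.05, "double": 0.10, "out": 0.85},
}

def expected_outcomes(batted_balls, league_rates=LEAGUE_RATES):
    """Credit each batted ball with fractional outcomes at league rates."""
    totals = Counter()
    for ball in batted_balls:
        for outcome, share in league_rates[ball].items():
            totals[outcome] += share
    return dict(totals)

def actual_minus_expected(actual, expected):
    """Per-outcome difference between what happened and the expectation."""
    keys = set(actual) | set(expected)
    return {k: actual.get(k, 0) - expected.get(k, 0.0) for k in keys}

balls = [("hard_grounder", "C")] * 10
expected = expected_outcomes(balls)          # about 2 singles and 8 outs
print(actual_minus_expected({"single": 4, "out": 6}, expected))
```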

Even if we do that, I am not sure it means a whole lot, given what we know about "DIPS." In other words, if a pitcher has a particularly non-league average distribution of "expected" s,d,t, and outs, it is likely because of "luck" at least in small samples. The only thing that this method really does is correct for defense and for odd defensive alignments.

We can do the same thing for batters, but, in my opinion, we probably shouldn't. For one thing, as you say, since each batter has his own defensive alignment, with some having a radical one (like in a "shift"), using league-average s,d,t, and outs for each batted ball is not right. For another, even though the data includes how hard a ball is hit, a "hard" line drive or ground ball by Jason Kendall is probably not the same as a hard batted ball by Giambi. For another, if, for example, Dunn hits a blooper to the short outfield, it will probably fall in for a hit more often than league average because the outfielders are playing deeper than average. To assume a league-average percentage of hits and outs for a Dunn blooper would not be correct. To some extent that is true of pitchers (the defense plays a little different for each pitcher).

As you say, we can probably use the more granular batted ball data to "correct" (remove some of the noise/luck) the performance results for pitchers and batters, but we have to be really careful and we probably should err on the side of being conservative if we do so.

--Mgl 10:08, 30 July 2008 (PDT)

Run Frequency

Thanks for doing your run expectancy work. I have seen the run expectancy matrix that gives you the odds for scoring a specific number of runs based on base-out situations, but now I can't find it. For instance, what are your odds of scoring 3 runs in an inning as opposed to 1 run, if say, you have bases loaded and nobody out. Would you happen to know where I can find this particular table?

It's in The Book.

Or here:

Or here: (all Retrosheet years)

--Tangotiger 07:08, 31 July 2008 (PDT)

WPA - win% style

I know you have already been critical of Bill James' new Loss Shares, especially in relation to the lack of negative loss shares for Mariano Rivera. Wouldn't using +WPA and -WPA be the best way to calculate Win and Loss Shares? You could basically have team independent "replacement levels" (or the level of 0 value to the team) depending on the team's +WPA and -WPA. To figure out Win Shares you could take the (team+WPA)*3 plus (team-WPA)*x where x equals ((team+WPA)*3-teamWins*3)/(-team-WPA). You can pretty much calculate Loss Shares the same way; the only thing missing would be defensive W/L shares, but as we know that is another animal unto itself.

Let's rename "+WPA" as WA (win advancement) and "-WPA" as LA.

In a typical game, the winning team will have 0.95 WA and 0.45 LA and the losing team will be 0.45 WA and 0.95 LA, natch.

The way I convert this to try to get a "win% style" of a metric (you know, to get the winning team at 100% and losing team at 0%) is to subtract 0.45 WA and LA from both teams. That would basically be the "starting point" for both teams. If you do that, the winning team has 0.50 WA and 0.00 LA, and the losing team has 0.00 WA and 0.50 LA.

Now, you really should do this on a game-by-game level, and the subtraction of 0.45 is not a static number, but whatever number gets the winning team to 0 LA for that game.

The key to remember is that WA-LA must remain constant, after whatever adjustments you make.

So, once you've decided that you need to subtract 0.45 WA and LA for the team, you have to figure out how much to subtract from each player. I would say that if the total WA+LA is 1.40 for the team, then you need to subtract 0.45/1.40 (which is 0.32) times a player's individual WA+LA from his WA and LA.

As a quick example, in 2004, Mo was 9.10 LA, and 14.32 WA. His GA (game advancement, or LA+WA) is 23.42. If we presume that we want to remove about .32 LA and WA per GA, then we are subtracting 7.5 WA and LA from his totals. That leaves Mo as being 6.82 WA and 1.6 LA.

As you can see, his WA minus LA remains constant at +5.22.

Now, at the team level, this makes the average team worth 40.5 WA and 40.5 LA. In order to get them "win% style", we double all this. Mo ends up having a 13.6 W and 3.2 L.

Note that in win% style, you have to figure it as wins above .500. And with W+L being 16.8, a .500 player would be 8.4 and 8.4.

And 13.6 minus 8.4 is +5.2 wins.
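The whole conversion can be sketched in a few lines. This uses the unrounded ratio 0.45/1.40 rather than .32, so the intermediate figures differ slightly from the rounded ones in the text; the function name is hypothetical.

```python
def win_pct_style(wa: float, la: float,
                  team_subtract: float = 0.45, team_ga: float = 1.40):
    """Convert a player's WA and LA into 'win% style' wins and losses.

    team_subtract / team_ga is the share of each unit of GA (WA + LA)
    removed from both WA and LA, so WA - LA is left unchanged.
    """
    cut = (team_subtract / team_ga) * (wa + la)
    adj_wa, adj_la = wa - cut, la - cut
    wins, losses = 2 * adj_wa, 2 * adj_la      # double to reach win% scale
    above_500 = wins - (wins + losses) / 2     # wins above a .500 player
    return wins, losses, above_500

# Rivera, 2004: 14.32 WA and 9.10 LA
wins, losses, above_500 = win_pct_style(14.32, 9.10)
print(round(wins, 1), round(losses, 1), round(above_500, 2))
```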

Hope all that made sense...

--Tangotiger 14:29, 31 July 2008 (PDT)

FIP - starters, relievers

Here's quick question. Is FIP as informative for relief pitchers as it is for starting pitchers and what is the reasoning behind your answer? Thanks!

If anything, you would prefer FIP for relievers over starters. The more PA you have, the more real all the stats become, even those where it interacts with the fielders (like BIP) and his team (like a W/L record). At those high levels, the adjustments are easier to make, and also more minor.

--Tangotiger 08:39, 8 August 2008 (PDT)

FIP - for hitters

Is there something similar to FIP for batters? It might be based on the (LD%, GB%) of a batter, which gives us expected BA and expected SLG.

Right. But better than BA and SLG, why not denominate it in runs right away? For example, the run value of a popup is the same as a strikeout, so that's around -.30 runs. The run value of a line drive is similar to a walk, around +.35 runs. The run value of a flyball (excluding HR) and of a ground ball are both very close (around -.10 runs or so).

So, you could do something like:

4*HR + BB + LD - (GB+FB)/3 - K - Pops

And then divide the whole thing by 3. That'll give you, roughly runs relative to average.
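A sketch of that estimator. It treats ground balls and non-HR fly balls as negative, in line with the roughly -.10 run value quoted in the answer; the function name is hypothetical and the weights are the rough ones given above, not fitted values.

```python
def batted_ball_runs(hr: int, bb: int, ld: int, gb: int, fb: int,
                     k: int, pops: int) -> float:
    """Rough runs relative to average from batted-ball and plate outcomes.

    fb excludes home runs. After the final division by 3, a walk or line
    drive is worth about +1/3 run, a strikeout or popup about -1/3, and a
    ground ball or fly ball about -1/9.
    """
    return (4 * hr + bb + ld - (gb + fb) / 3 - k - pops) / 3

# Illustrative season line (made-up counts)
print(round(batted_ball_runs(hr=25, bb=60, ld=90, gb=180, fb=120,
                             k=100, pops=20), 1))
```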

--Tangotiger 08:37, 8 August 2008 (PDT)

I would be careful about trying to infer too much about the difference between a batter's "FIP" and his real stats. The reason is that a line drive by Juan Pierre is not the same as a line drive by Miguel Cabrera, same with a ground ball, fly ball, etc. But, you can probably take some of the noise out of a player's "real" stats if you compute an "FIP" sort of stat. Ideally (or better, I should say), you could come up with a separate value for line drives, fly balls, etc., depending on the power of the batter. And the power of the batter could be based on HR and extra base hit rate plus maybe height and weight, or something like that.

Ryan Ludwick

Why does Ryan Ludwick bat right-handed?

Matchups - How to figure out expectations

In the 2007 NL, 2.67% of plate appearances resulted in a homerun. 1.04% of the plate appearances vs. Brad Penny resulted in a homerun -- i.e., 1.63% less or at a 39% rate of the league average. Ryan Howard hit homeruns in 7.25% of his plate appearances....If Penny is facing Howard, do I expect Howard to hit a homerun 5.62% of the time (7.25 minus 1.63) or 2.83% of the time (39% times 7.25)? The former is how, say, Strat-O-Matic would simulate it. I read page 93 of The Book to suggest the latter (which seems correct to me). Which is right? Or is it something in between?...Note -- obviously this question is not limited to homerun rates, but since the variances here are greater, it presents a starker contrast.

And someone else also asked:

I'm trying to come up with a way to properly predict the outcome of a given pitcher/batter matchup. I know the simple way is to take a properly regressed wOBA (or whatever metric you want to use) for the pitcher and the batter and then use the odds ratio method. But, what if we wanted to drill a little deeper using variables that we know have some statistical significance (platoon differentials, GB/FB differentials, etc.)? In other words, say I have a GB/RHP facing a GB/LHB. How does one combine the various matchups (GB vs GB, RHP vs. LHB, etc.) to get one true outcome? Given a perfect world and unlimited sample size, I guess you could come up with multiple figures for each pitcher & batter (i.e. against RHP with a GB tendency in Coors Field, such and such has a .325 wOBA) and then use the odds ratio. Given sample size limitations, this is probably not possible. Any suggestions on how to solve the problem?

I use the Odds Ratio method, which means you take the successes and divide it by the failures for the player, and do the same for the league, to figure out his relative success level. And multiply all those for as many parameters as you have.

So, a guy with a .400 OBP in a .350 league, facing a pitcher who allows a .200 OBP in a .300 league, and playing in a park in which the average hitter has a .380 OBP, would come out looking like:

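Batter: .400/.600 divided by .350/.650
Pitcher: .200/.800 divided by .300/.700
Matchup: p/(1-p) divided by .380/.620
Set the matchup ratio equal to the batter ratio times the pitcher ratio, and solve for p.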
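A minimal sketch of the Odds Ratio calculation, reading the example as: the matchup's odds relative to the park baseline equal the product of the batter's and pitcher's odds ratios. The function names are hypothetical.

```python
def odds(p: float) -> float:
    """Successes divided by failures."""
    return p / (1 - p)

def from_odds(o: float) -> float:
    """Inverse of odds(): convert odds back to a probability."""
    return o / (1 + o)

def matchup_probability(batter: float, batter_league: float,
                        pitcher: float, pitcher_league: float,
                        baseline: float) -> float:
    """Combine batter and pitcher odds ratios against a baseline rate."""
    ratio = (odds(batter) / odds(batter_league)) * (odds(pitcher) / odds(pitcher_league))
    return from_odds(ratio * odds(baseline))

# The OBP example: .400 in a .350 league vs .200 allowed in a .300 league,
# in a park where the average hitter has a .380 OBP
print(round(matchup_probability(0.400, 0.350, 0.200, 0.300, 0.380), 3))

# The Howard vs Penny home run question, with the 2007 NL rate as baseline
print(round(matchup_probability(0.0725, 0.0267, 0.0104, 0.0267, 0.0267), 4))
```

For small rates such as home runs, the result lands close to the multiplicative shortcut (39% of 7.25%) rather than the differential one.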

--Tangotiger 07:05, 6 August 2008 (PDT)

Let me add this:

If you want to get a batter's (or pitcher's) projection against RH or LH opponents, do not take his stats versus LHP and do a projection and then his stats versus RHP and do a separate projection. That is a common mistake. Also don't take his projection and apply his own platoon ratio or differential. Do it this way: Compute his projection, say OBP. Then compute his projected platoon ratio, using the same method as you use for computing any projection - historical ratio weighted by recency and regressed toward the league platoon ratio mean. And remember that sample platoon ratios get regressed a lot, especially for RHB. Remember also that when you get a player's platoon projection and you have his OBP (or whatever stat) projection and you want to convert those to an OBP projection versus RHP and LHP, you have to use his historical PA versus LHP and RHP to do that. --Mgl 20:10, 8 August 2008 (PDT)
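The last step, turning an overall projection plus a projected platoon ratio into two split projections, can be sketched like this. It assumes the ratio is defined as (rate vs LHP) / (rate vs RHP) and that the overall projection is the PA-weighted average of the two splits; the function name is hypothetical.

```python
def split_projection(overall: float, platoon_ratio: float,
                     pa_vs_lhp: float, pa_vs_rhp: float) -> tuple[float, float]:
    """Return (projection vs LHP, projection vs RHP).

    overall:       projected rate against all pitchers (e.g. OBP)
    platoon_ratio: projected (already regressed) rate vs LHP / rate vs RHP
    pa_vs_*:       historical plate appearances against each hand
    """
    share_l = pa_vs_lhp / (pa_vs_lhp + pa_vs_rhp)
    # overall = share_l * vs_lhp + (1 - share_l) * vs_rhp, vs_lhp = ratio * vs_rhp
    vs_rhp = overall / (share_l * platoon_ratio + (1 - share_l))
    return platoon_ratio * vs_rhp, vs_rhp

# A .350 OBP hitter projected to be 10% better against lefties,
# with one third of his historical PA against LHP
print(split_projection(0.350, 1.10, pa_vs_lhp=200, pa_vs_rhp=400))
```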

Days off - Do they help?

I don't have my copy of The Book at home (I left it in my classroom over the summer), but does it say anything about the benefit of giving a slumping player a day off? I write this while listening to a Rangers postgame show where they are arguing that Ron Washington should give Ian Kinsler a day off against the Yankees. I'm thinking there isn't a scenario where taking Kinsler out of the lineup makes sense given that it puts Travis Metcalf's bat in play instead, but I think I remember there being some concrete numbers to back up my POV. Any help would be appreciated.

Have no idea and it is not in the book. I would think that you would want to give a player a day off every once in a while and that you also need to give your reserves some playing time every once in a while as well.

--Mgl 20:12, 8 August 2008 (PDT)

wOBA - more questions

I have a question regarding wOBA, something that you wrote about in your book. I like the idea of wOBA, however I don't understand a few things. ...First, why are players rewarded for reaching base on an error? Unless you think that a batter's skill causes a significant amount of errors to be committed, I don't understand putting that in there.... Second. The same thing can be said for HBP. Now in this case I think it's possible that better players are pitched inside and therefore hit more often, however I'm not sure of this. Have you ever found any evidence of a correlation between a batter's skill and his number of HBPs? ...Third. IBB are removed from this statistic. Now I don't think this is fair. A player who is feared enough to be IBB should be rewarded. Especially if you have HBP in there you should have IBB in there. What are your thoughts on this? ...Now my last thing is regarding the use of linear weights. I like the idea, and I think it can be said that a single is a little more valuable than a walk. I just think that, although a positive step, this doesn't take into account the luck factor in baseball. If you put a ball into play, you are not guaranteed to get on base. If you walk you are....I love the concept of this statistic, I just don't consider this way of doing it as a valuation of skill. It just seems like it is more an evaluation of performance and performance is influenced significantly by luck. So what are your thoughts on this?

OK, let's go one at a time here:

1. There is indeed a correlation between a player's reaching base on error one year and the next. So, this implies that there is some amount of player ability. Primarily, this will be the type of balls hit (I haven't researched this, but I suppose grounders are more likely to go for errors) and the player's speed (which can mean the difference between safe and out on the same play).

    I'll also add: batter handedness. --Tangotiger 11:57, 12 August 2008 (PDT)

2. For HBP, I have looked into this and yes, there is definitely an "ability" to get hit. Again, the evidence is in the correlation from year-to-year -- a player hit more than average one year will, on average, get hit more than average the next as well.

3. The jury is out on how to handle IBBs (likewise for SHs). Just looking at the numbers, one is led to believe that most IBBs are a bad idea. In other words, the wOBA expectancy from the particular batter/pitcher matchup is much less than what the wOBA credit would be from the IBB; thus the pitching team is making a mistake by not just throwing to the hitter. So, if you count IBBs, you're overvaluing the better hitters and undervaluing bad pitchers (who presumably issue more IBBs). Of course, the flip side is that by omitting them, you're preferentially excluding the most favorable (for the batter) plate appearances, and by doing so putting in the opposite bias.

The other issue for IBBs and SHs is that these are strategic decisions. In other words, a really bad manager could have a great pitcher intentionally walk a bad hitter; does it make sense to credit the hitter and penalize the pitcher for the manager's choices? This is unlike errors and hit batsmen, which happen in the flow of the game and do reflect the players' abilities.

4. You're right that there is a chance element in wOBA, or for that matter every statistic. So, you need to have a handle on the dispersion of actual player abilities, and a handle on the amount of dispersion that would be expected due to the chance element. A common solution here is to use regression to estimate what the player's stat line would have been, had luck been totally average. You'll find that walks and strikeouts don't regress by much (i.e., a player's adjusted walk and strikeout rates will be close to his raw ones), while others (such as a pitcher's rate of giving up hits on balls in play) regress most of the way back to the league average.
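One common way to carry out that regression is to add a fixed number of league-average trials to the player's record. The sketch below assumes that form; the ballast values are purely illustrative, not measured constants, and the function name is hypothetical.

```python
def regress_rate(successes: float, trials: float,
                 league_rate: float, ballast: float) -> float:
    """Estimate a true rate by adding `ballast` league-average trials.

    A small ballast (skills that show up quickly, such as walks or
    strikeouts) leaves the raw rate nearly unchanged; a large ballast
    (hits on balls in play for a pitcher) pulls it most of the way to
    the league rate.
    """
    return (successes + league_rate * ballast) / (trials + ballast)

# Same raw rate of .300 in 100 trials against a .200 league rate,
# with hypothetical light and heavy ballast
print(round(regress_rate(30, 100, 0.200, ballast=20), 3))
print(round(regress_rate(30, 100, 0.200, ballast=900), 3))
```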

--AED 07:23, 12 August 2008 (PDT)

Derek Jeter - where is he at now?

I was having an argument about Derek Jeter with a not so sabermetrically oriented friend. I argued that Jeter is basically an average player at this point in his career. Crunching some quick numbers, I rated him as a +1 win hitter, -1.5 win fielder plus a .5 win bonus for playing shortstop. Do you guys concur in general with that assessment?

For batting relative to the league-average shortstop, I'll concur that he's worth about 1.5 wins. I'll defer to my coauthors for the fielding part of it. --AED 07:38, 12 August 2008 (PDT)

FIP - unFIPping FIP and making it dependent on fielders

In the discussion of fielding/pitching and the % breakdown to credit each with, I have the following thought. FIP uses the following weights: HR=13, BB-IBB+HBP=3, SO=-2. Obviously, missing is an element of Hits-HRA. Using your prior research of Balls in Play being 62% pitcher / 38% fielding, why not use an additional weight of 3 for each H-HR a pitcher allows? That is .62*.53 (the value of H-HR) = .33. This new adjusted FIP (which includes H-HR) would mimic the effect of DER. Adding this new per nine rating to about .7 would yield the league's runs allowed average. From this you could deduct the fielding runs saved by the defense. As an example I did this to the 1962 NL. What I came up with was an adjusted FIP of 3.82 compared to 4.52 RAA. This would imply that pitching gets 85% and fielding 15%. On a team level Pittsburgh's Adjusted FIP is 4.14 compared to an actual RAA of 3.93. The deduction then leads to 34 RSAA for the PIT fielders. The flip side of this is the Mets of 62: adj FIP of 5.52, actual RAA 5.97. Therefore, -.45 for the fielders and -71 runs.

It really depends where you're trying to go with this. If you want to estimate a particular pitcher's ability, then yes, you need to include H-HR into the equation somehow. Regressing the stats, or reducing the weight on H-HR to account for the fact that those plays are less indicative of skill than are BB or SO, will do the job nicely.

Where you get into trouble is combining this with fielding runs. Fielding runs are basically a different estimate of the same quantity (the rate at which balls in play are turned into outs, compared with league average). The fact that the numbers are computed different ways doesn't mean that you're really measuring anything different (except that errors are counted in fielding runs). So, you're effectively just blindly assigning some fraction of defensive performance to the pitchers and some fraction to the fielders, when in fact a good defense may be the result of great pitching and mediocre fielding or vice versa. So, no, you can't use FIP plus H-HR, combined with fielding runs, to try to determine the relative effectiveness of the pitchers and fielders.

--AED 07:54, 12 August 2008 (PDT)

Free agent dollars - allocation

I really like the work you guys do on WAR and free agent value and overall market value for players. Do you guys have any recs or guidelines as to how a team on a budget should allocate their financial resources? I've seen 57% towards position players and 43% towards pitchers but I'm talking more about how money should be allocated within the 40 man roster towards big free agents, mediocre free agents, arb eligibles and league minimum types. For example, with a team like the Braves on a $100 million payroll, how much would you recommend be spent on high priced guys in free agency years, free agents who are about league average 2 WAR guys, guys who are arb eligible and guys at the league minimum? My numbers won't be correct on this but would it be advisable for a team to perhaps have 4 or 5 high priced items averaging about 3.5 to 4 WAR type of salaries, with another 4 or 5 at about a 2 WAR salary, another 5 guys in arb years and the rest of the 25 men on the 40 man roster averaging about the league minimum? And with the high priced free agents, is it more advisable to stagger the years when you sign them? By staggering, I mean not signing them all at once so that if you have 5 guys averaging about 4 WAR salaries over 5 or 6 year contracts, you have two guys getting paid 4 WAR salaries based upon what the market price was 4 years ago, 1 guy making a 4 WAR salary based upon what the market was 3 years ago, and 2 guys with 4 WAR salaries from what the market was last year. My thinking is that by staggering it like that, the team never becomes overburdened in the short term.

Well, the problem here is that it's pretty easy to find 2 WAR players at a discount, but you can't put 25 players on the field, all of whom are 2 WAR, and go win 100 games. Because of this, any team that wants to succeed is going to have to bite the bullet and spend real money on some higher-priced players. My gut feeling is that the optimal solution is a mix of 2 WAR players paid bargain-basement prices (assuming you can find them), plus some truly good players that take up the bulk of the salary.

Regarding when to sign high-priced free agents, it's fairly common for player salaries to increase over the lifetime of the contract, meaning that this really shouldn't be a significant issue (assuming that the contract correctly estimates the average increase in player salaries).

--AED 08:10, 19 August 2008 (PDT)

If we think about constructing a realistic good baseball team, we'll have say 1 guy at 4 WAR, 2 at 3 WAR, 4 at 2.5 WAR, 6 at 2 WAR (that's 13 so far, meaning 8 regulars, 4 starters, 1 closer), 3 each at 1.5, 1.0, 0.5, 0.0.

That's a total of 41 WAR. With replacement level set at 48 or 49 wins, that's a 90 win team.

Now, to pay for that, as a free agent, you earn 4.4MM per WAR, plus 0.4 MM. The 4 WAR guy will cost you 18MM as a free agent. Your 2 WAR guy will cost you 9MM as a free agent. But, as an arb-eligible player, he'll cost you roughly half that (depending on years of service).

An all free-agent team will cost you 190MM.

An all non-free agent team will cost you one-third of that, or around 65MM.
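The roster arithmetic above can be checked with a short sketch. The roster, the 4.4MM-per-WAR-plus-0.4MM price, and the one-third ratio all come from the answer; the variable and function names are hypothetical.

```python
# WAR level -> number of players, as listed above (25 players in total)
ROSTER = {4.0: 1, 3.0: 2, 2.5: 4, 2.0: 6, 1.5: 3, 1.0: 3, 0.5: 3, 0.0: 3}

def free_agent_cost(war: float) -> float:
    """Free agent price in millions: 4.4MM per WAR plus 0.4MM."""
    return 4.4 * war + 0.4

players = sum(ROSTER.values())
total_war = sum(war * n for war, n in ROSTER.items())
all_fa_cost = sum(n * free_agent_cost(war) for war, n in ROSTER.items())
non_fa_cost = all_fa_cost / 3   # roughly one third of the free agent price

print(players, total_war)                        # 25 players, 41 WAR
print(round(all_fa_cost), round(non_fa_cost))    # about 190MM and 63MM
print(48 + total_war, 49 + total_war)            # 89 to 90 wins
```

The one-third figure lands a little under the rounded 65MM quoted in the answer.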

How can you have a non-free agent team? Player development. If you can put in 125MM in player development every year (and get the average ROI on that), then it works out to the same thing.

No team spends that much money on player development. I will guess they spend around 30-40MM on player development.

So to me, I don't see any reason to ever go to the free agent market, other than to get average or spare parts (max 2 WAR) and discounted deals (Bradley, Cameron, etc). Instead of spending the 18MM on that 4 WAR player (each year!), take that extra 9MM or 12MM that you are paying him, and put it to use in player development.

The key is to figure out how much you can get out of your player development. If we assume that the best you can reasonably do is to create a true talent 86 win team with a non-free agent team, then the most you should dip into free agency is around 4 WAR or 18MM.

So, that's my limit to free agency: dip in for one big guy, or two average guys. You should build from within otherwise. I would guess that the Twins are probably the team that best satisfies my requirements. They did lose Santana and Hunter, and last I checked, they are making a very strong run for the playoffs.

--Tangotiger 09:46, 22 August 2008 (PDT)

Reliability of Components

In The Book on page 157, you set forth the number of plate appearances needed to accurately measure platoon splits, using wOBA. My question is, do some of the observed components of wOBA become meaningful at earlier (or later) stages? For example, the ability to hit safely consists of the ability to avoid striking out as well as the ability to hit safely on a ball in play (BABIP). Are platoon splits in strikeout percentage more stable than platoon splits in BABIP? (I.e., are observed splits in the former more indicative of a batter's true differential in that category than observed splits in the latter?)

Yes, I would definitely think so. Maybe Andy can provide the numbers, but I would think that each component has a very different regression amount (the number of PA of league average platoon ratio or differential to add to the player's observed platoon ratio or differential).

As well, each component has a different true platoon ratio or differential. For example, as you might expect, and for pitchers as well as batters, singles rate has a very small platoon difference, whereas extra base hit rate, including HR, has a much larger one.

There are even large differences among the components depending on the handedness of the player. For example, lefty pitchers have almost no platoon differential in their BB rates, whereas they have a large one for extra base hit and K rates. For righty pitchers, their BB rate has a large platoon difference.

And remember that the number of PA we care most about when determining how much to regress a player's observed platoon difference is the lesser of the two. For example, if a lefty pitcher faced 1000 RHB and only 125 LHB, we care mostly about the 125 PA when doing the regression.

Also, just to remind everyone. How many PA we need before a player's "whatever" gets regressed a certain amount is a function of one thing and one thing only - the spread of true talent for that "whatever" in the population of players. For example, the reason we regress a RHB platoon rate a lot even when he has lots of PA is because we know from analysis of all RHB that the true platoon ratios of RHB are all around the same. IOW, if we were able to get all RHB to face an infinite number of RH and LH pitchers (such that their observed "anything", including platoon rates, would be equal to the true "anything"), we would find that there just wasn't a whole lot of spread in platoon rates. Most of those RH batters would be right around the mean.
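
The "add some number of PA of league average" idea can be sketched as follows. This is only an illustration: the regression amount of 1000 PA and the league differential of .020 are hypothetical placeholders, not numbers from The Book, and using the lesser of the two PA totals as the sample size is a simplification of "we care mostly about" the smaller sample.

```python
def regress_platoon(observed_diff, pa_vs_lhp, pa_vs_rhp, league_diff, regression_pa):
    """Regress an observed platoon differential toward the league average.

    regression_pa is the number of PA of league-average performance to add.
    It depends on the spread of true talent for the component and is a
    hypothetical input here.
    """
    # Simplification: treat the smaller of the two samples as the sample size.
    n = min(pa_vs_lhp, pa_vs_rhp)
    return (observed_diff * n + league_diff * regression_pa) / (n + regression_pa)


# A pitcher with 1000 PA against one side and 125 against the other
# (all rate inputs below are made up for illustration).
estimate = regress_platoon(observed_diff=0.060, pa_vs_lhp=125, pa_vs_rhp=1000,
                           league_diff=0.020, regression_pa=1000)
print(round(estimate, 3))  # about 0.024, much closer to the league's .020 than to .060
```

A component with a wide spread of true talent would get a small regression_pa, and one where everyone is bunched near the mean would get a large one.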

--Mgl 22:16, 13 August 2008 (PDT)

wOBA - Why OBP scale and not BA scale?

You adjust the weights in wOBA to make them parallel average OBA. Why not make them parallel batting average? Even the most casual fan knows a "good" BA and "bad" one. (Indeed, I suspect that even the most die-hard sabermetrician finds batting average more intuitive than OBA.) Multiply by 0.875 (or whatever the number is), not 1.15! Let's bring wOBA to the masses!

In 30 years, BA may not even exist, while OBP will always exist. Eventually, OBP will supplant BA because the old ignorant generation will die and the young people who favor OBP will thrive. In any case, BA is a horrible stat, and I'm never going to perpetuate its use.

Sometimes history trumps logic -- My sons know what a .300 hitter is (and it's not a guy with a lousy OBP). My only point is that, since adjusting upwards to match OBP is arbitrary, you might as well adjust down. wOBA is a great stat, and I'd love to see it become more widely reported - at least more so than RC and OPS.

It seems like we'll have peace in the MidEast before OBP supplants BA, if the new generation is still talking about BA. What gives? Is it Fantasy Baseball's fault?

--Tangotiger 07:54, 13 August 2008 (PDT)

Playoff offenses - best kind?

Have any of you done/read studies regarding what "type" of offense (aside from the one that scores a lot of runs) is the most successful in the playoffs? Every year, various people talk about how small ball is superior, or you need players who grind out at-bats, or you need speed on the basepaths, but no one really puts forth any worthwhile data or evidence on the matter.

The average team in the playoffs scores 4 runs per game. That is small-ball stuff. I would think that a great basestealer would have his value jump in the playoffs. While the run values of the other positive events go down, the HR hitter stays constant. So, I would guess power and speed on the hitting side.

For pitching, you don't need depth. So, top-level starters, and two great relievers ought to do it.

Otherwise, I have performed no study on the issue.

--Tangotiger 09:51, 22 August 2008 (PDT)


Have you ever used the methodology of UZR to address the question of to what degree pitchers control the run-values of the batted balls they yield?

I think MGL might have done it. I will also point you to a fantastic collection of batted ball data and run values run by studes of Hardball Times.

--Tangotiger 09:53, 22 August 2008 (PDT)

Park Factors - personal park factors

Have you ever looked into team or player specific home field advantage (besides the fact that a park may be tooled for a player's hitting style)?

There's no question that each player is affected by a park differently. Any park factor you see I would usually call Lazy Park Factors. Barry Bonds, as my famous example, hit as many road HR as home HR at his name-changing SF home park. However, when you look at all other lefty hitters, they are killed on the HR. With the sheer number of PA that Bonds has had, it's clear that he is affected differently. We know he is different. He has a fantastic swing, control of the strike zone, and power. You cannot apply a LHH park factor to Bonds as you would to other hitters. If you want to create a family of Bonds hitters, then fine. That's the population that Bonds is truly drawn from.

--Tangotiger 09:57, 22 August 2008 (PDT)

You misunderstood the question about home field advantage. I said BESIDES how a park is tooled to a player's hitting style. What I mean is that after adjusting for park and general home-field advantage, a player may do even better or worse. For example, Jack Cust's home/road splits are bigger than you'd expect given his park. Is this a coincidence or does he perform especially well in front of a home crowd?

First, I'm not sure what kind of advantage the A's home park gives a player like Jack Cust. When you look at his personal gap, you have 3 reasons for this: (1) he is particularly well-suited for his home park in that his toolset gives him more of an advantage than someone else, perhaps even someone of similar toolset, (2) he is fond of playing in front of that crowd, (3) sample size is too small to ascribe a large effect to Cust or the park.

Of all the answers, #2 is both the least likely and least provable. It certainly may well be that it is the case, but the data won't give you the answer. Even Jack Cust himself may not even give you the answer, since he may buy into his own press clippings.

--Tangotiger 11:26, 26 August 2008 (PDT)

Win Expectancy - in-game baserunning strategy

Here's a Win Expectancy question. In tonight's Brewers/Dodgers game, it appears that Matt Kemp made a crucial baserunning mistake. In the bottom of the 10th inning with 1 out and Kemp on first, Andre Ethier hit a long flyball to centerfield. Mike Cameron went back on the ball, jumped at the wall and missed the catch. Kemp, instead of being near or on second, had already gone back to first to tag (expecting the ball would be caught). So instead of runners on 1st and 3rd or 2nd and 3rd with 1 out, the Dodgers had runners on 1st and 2nd with 1 out. What's the discrepancy in Win Expectancy between those situations, as well as having a runner on first with 2 outs versus having a runner on 2nd with 2 outs in the bottom of the 10th? Thanks in advance.

You can look at the charts in The Book, or this smaller subset on my site. Presuming you had a tie game, and that the bottom of the 10th is the same as the bottom of the 9th, and all runners and batters are average, then with 1 out:

  • 2b/3b: .839
  • 1b/3b: .829
  • 1b/2b: .711

So, you are talking about a .118 win difference, which is ENORMOUS. A HR, in a random situation, is worth around .12 or .13 wins. So, his misplay is like giving up a typical HR.

I cannot say enough how important it is to get a guy to 3B with less than 2 outs. ESPECIALLY in a tie game in the bottom half of the last inning.

Now, Kemp, by your explanation, was trying to figure out whether to be 1b, 2 outs or 2b, 2 outs. That's .562 wins or .610 wins, a gain of .048 wins.

That was his decision by going back: giving up .118 wins if the ball is not caught, against gaining .048 wins if the ball is caught. The break-even is .048 / (.118 + .048), or about 29%, as the chance of the ball dropping. If there was less than a 25% chance of the ball dropping, then he was right to do what he did. If there was more than a 30% chance of the ball dropping, he was wrong. At 25-30%, it was a flip-the-coin call.
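
The break-even can be worked out directly from the win expectancies quoted above. This is a minimal sketch that assumes those four values and nothing else; the variable names are only illustrative.

```python
# Win expectancies quoted above (tie game, bottom of the last inning).
WE_1B_3B_1OUT = 0.829  # ball drops, runner was off the bag
WE_1B_2B_1OUT = 0.711  # ball drops, runner had gone back to first
WE_2B_2OUT = 0.610     # ball caught, runner tags and takes second
WE_1B_2OUT = 0.562     # ball caught, runner stays at first

cost_if_drops = WE_1B_3B_1OUT - WE_1B_2B_1OUT  # about .118
gain_if_caught = WE_2B_2OUT - WE_1B_2OUT       # about .048

# Going back pays off when p_catch * gain > (1 - p_catch) * cost.
break_even_catch = cost_if_drops / (cost_if_drops + gain_if_caught)
break_even_drop = 1.0 - break_even_catch

print(round(break_even_catch, 3))  # about 0.711
print(round(break_even_drop, 3))   # about 0.289
```

So, on these numbers, going back to tag is the right call only if the runner thinks the ball will be caught a bit more than 70% of the time, which is the same as saying the chance of it dropping is under roughly 29%.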

You tell me. What chances did superfielder Mike Cameron have of catching the long flyball, from the vantage point of Matt Kemp?

--Tangotiger 10:07, 22 August 2008 (PDT)

Observational data - enhanced from performance data

In statistics which aim to measure a player's true talent, is there a place for correcting incorrect calls by umpires? This pertains mainly to statistics like UZR or Dewan's +/-, which are reviewed by video. For example, if a ball is caught, but the batter is ruled safe, should the fielder be credited with the play?

If I had control of the data recorders, I would certainly make notations of everything they see, and so, yes, I would count some out plays as "safe" for purposes of player evaluation. But, the operations department of the baseball recording companies seems to be very stubborn about this issue, among others.

--Tangotiger 10:09, 22 August 2008 (PDT)

The Book - why the name?

Why is it called The Book? Is it for strategy that is by the book?

The preface of the book has the answer.

--Tangotiger 08:07, 19 August 2008 (PDT)

Olympic tie-breaker rule - where to start in the batting lineup?

With the Olympic extra inning rules stating you can start anywhere in your order with runners on first and second, in terms of the general lineup, is there a specific place you should start? I guess it is more of a case-by-case question but I'm curious to see if there is a specific strategy.

And someone else:

Interesting tie-breaker formula in the Olympics. USA vs. Japan, tied through 10 innings. Instead of a shoot-out or home run derby (as they use in the Israel Baseball League), they do this: Put runners on 1st and 2nd with no outs. Let the game begin. I assume that if tied after one of these innings (top and bottom), they repeat. In top of 11th, however, Davey Johnson, instead of doing the conventional thing -- bunt the two runners along -- has the first batter swing away and he hits a single to right. US is now up by four runs at end of top of 11th. We'll see what happens in the bottom half....What do you think of this tie-breaker approach as opposed to alternatives, for use in some kind of tournaments where you don't want to use up pitchers and you don't want games to go on forever?

I discuss this issue on the blog. As for what to do, in the bottom of the inning, if you are down by 3 or more, you should play it as if the runner on 2B has already scored.

I'm not sure of the best approach otherwise, but I would say in the top of the inning, you need to swing away and not bunt.

--Tangotiger 10:20, 22 August 2008 (PDT)

ERA estimators - correlations

What is the r-squared for FIP and the other ERA estimators?

Reader and analyst Colin has posted his results

--Tangotiger 13:50, 22 August 2008 (PDT)

Different evaluations of the same thing

I read MGL's article on defense and then I'm fairly sure the article on the next page is Dewan's fielding article. I'm pretty sure the numbers that struck me as odd were those of the Toronto Blue Jays, who one (I think Dewan) claims is the best fielding team in the Majors, and who MGL rates as good, but mediocre. I also could be remembering this completely wrong, but I'm fairly sure this is the case. What happened here? How could two people (in the same book) use defensive metrics and get such wildly different ratings? Doesn't someone have to be wrong?

The best answer is that they gave an unqualified conclusion. All sample data has an uncertainty level around it. You can't claim someone is the best with 100% certainty (not even that Barry Bonds is a better hitter than Ben Sheets) by using sample data. There is always an error band.

The Red Sox may be a better team than, say, the Milwaukee Brewers, but that doesn't mean that if they faced each other for 162 games the Red Sox would definitely win at least 82 games every single time. Sometimes the Brewers will end up with more wins.

So, Dewan and MGL are looking at two different data sources and are looking at the data through two different lenses. Implied in their statements is something like "I'm 60% sure that the Blue Jays have a good but not great fielding team", or some such.

--Tangotiger 11:28, 26 August 2008 (PDT)

UZR and Dewan - correlations

Have there been any correlations run between UZR and +/-? What about seeing how the stats measure up with some sort of park adjusted, pitching independent DER?

I don't know that anyone has done a regression of UZR on Dewan's plus minus, to get a correlation. I would suspect that it is going to be somewhere around .7. Depends on the min number of opps for each player in the sample of course.

Since both methods use different databases and similar but different methodologies, you are going to get different numbers, in some cases, very different numbers, but since both methodologies are VERY similar and the databases are bound to be quite similar as well, by and large the numbers should be close.

Tango addresses (correctly) the issue of who would be "right" when the numbers on a player differ. As he says, there is no "right" or "wrong" because we are dealing with sample data and trying to estimate true talent with some level of certainty. Obviously the better methodology using the better database (if there is one) would get the "more correct" number a greater portion of the time. If both methodologies were sound but each one had some strengths that the other one did not, and both databases were also sound (which they should be), then a combination of the two measures would probably be better than each one individually. That assumes that one methodology is not much better than the other. If one were substantially better than the other, then that measure alone might be better overall than combining them. OR maybe a weighted average would be better than the better one alone.

--Mgl 07:55, 7 September 2008 (PDT)

I recall taking the results for the 2007 teams given in the 2008 HBT annual. The 2 stats agreed in sign (plus or minus) 77% of the time. But I did a quick conversion of Plus/Minus into runs by multiplying by .8. Then I found the difference between UZR and Dewan for each team, squared them, added the results, divided by 30, and took the square root. The result was 29. So, UZR and Dewan differed by a root-mean-square average of 29 runs per team in 2007. That seems quite large to me.
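
The calculation described above (convert Plus/Minus to runs, take the difference, square, average, square root) can be sketched as below. The three-team sample is made up purely to show the mechanics and is not the 2007 data; the function names are illustrative.

```python
import math


def rms_difference(uzr_runs, plus_minus_plays, runs_per_play=0.8):
    """Root-mean-square difference between UZR (runs) and Plus/Minus (plays).

    Plus/Minus is converted to runs by multiplying by runs_per_play,
    the quick .8 conversion mentioned above.
    """
    if len(uzr_runs) != len(plus_minus_plays):
        raise ValueError("both lists must cover the same teams")
    diffs = [u - runs_per_play * pm for u, pm in zip(uzr_runs, plus_minus_plays)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sign_agreement(uzr_runs, plus_minus_plays):
    """Fraction of teams where both metrics have the same sign."""
    same = sum((u > 0) == (pm > 0) for u, pm in zip(uzr_runs, plus_minus_plays))
    return same / len(uzr_runs)


# Hypothetical three-team sample, NOT real data.
uzr = [10.0, -20.0, 5.0]
plus_minus = [25, -10, -5]

print(round(rms_difference(uzr, plus_minus), 1))  # about 10.4
print(round(sign_agreement(uzr, plus_minus), 2))  # 0.67
```

With all 30 teams in the lists, the division by len(diffs) is the "divided by 30" step.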

September Callups - new strategies?

With September call-ups underway, what are some of the things you would like to see teams do, now that they can utilize so many more players (potentially)? I think that the idea in the Book regarding the six-man rotation & pinch hitting for the below average pitchers would be a good place to start.

I don't know that we advocated a six-man rotation in The Book. In fact, I think we noted that the 5-man was probably optimal.

September is certainly a great time to use the "starting pitcher rarely or never bats" strategy, unless perhaps he is an ace, and even then, it depends on the inning, pitch count, etc. I doubt you will see that any time soon, although I will bet that you will see a manager using some version of that strategy some time in the next 10 years.

As well, all teams, especially those in contention should be using a "designated runner" (speedster and/or great base stealer) and perhaps some designated defensive replacements, if they have any great defenders on their 40-man.

Teams should also be doing more righty/lefty pitching switches from the bullpen, again, assuming that they have some decent relievers in the minors that they can bring up. As we sometimes mention, it makes little sense to take out a very good RHP and put in a bad LHP to face a lefty batter. Of course, you have to know what the "break-even point" is based on the pitchers' and batter's true platoon ratio. For example, at what true ERA would a righty pitcher facing a lefty batter be the same as a lefty pitcher facing that same lefty batter, assuming that everyone had a league-average platoon ratio? I would guess off the top of my head that around a half run difference in ERA makes them equal facing a LHB. In other words, a LHP with a true ERA of 4.50 (versus all players, which for a LHP is probably 50% RHB and 50% LHB) is equivalent to a RHP with a true ERA of 4.00.

--Mgl 08:07, 7 September 2008 (PDT)

Yeah, I was wrong to call it a "6-man" rotation. It's more like a "3 and 3" rotation.

Right, that's what I like to call it. You have 3 guys in the traditional role, and then 3 guys that are used for the other two slots.

--Tangotiger 12:12, 8 September 2008 (PDT)

All Star Game streaks

The thread "What model (beyond "luck") can explain the streaks in the All-Star games?" has a discussion. I suggested that the Pythagorean exponent should be higher in a competition like the All-Star game, as with there being so many substitutions, the game is essentially longer. Any thoughts?

Given the game-to-game parity in baseball, "luck" has to be the dominant factor in any streak. The best team in MLB would only win about 70% of its games against the worst team in MLB. And, that's the extreme -- taking the best players from each league, you'd get much less disparity than this. No rule changes, substitution patterns, etc. will significantly alter the fact that any game result won't be all that far away from a coin flip.

The substitutions won't have any impact on randomness. The game may last longer, as viewed by a clock, but it's still 9 innings. To reduce random effects, you'd need to lengthen the game in innings, rather than clock time. --AED 13:09, 8 September 2008 (PDT)

The results of All-Star Games will tend to be ever-so-slightly more streaky than a series of independent contests as the better team one year will tend to be the better team the next year. However, this is a small effect. In fact, it probably should not even be noticed.

In noting the streaky nature of the game over the last X years, you also have a case of "arbitrary endpoints." If we play enough games (there have been around 75 ASG's), sooner or later someone is going to notice just about any anomalous pattern you can think of by chance alone.

Although when it comes to contests involving human endeavor, especially complex ones like a baseball game, we never know for sure what causes one side to win, a basic binomial model with a "p" (chance of success) that represents the "log 5 ratio" of one team's overall strength (true "p" versus an average opponent) to the other one's seems to work over the long haul and in the aggregate.

That being said, most All-Star games favor one team or the other by no more than 55% or so, or at least we think that is the case. Given that, the streakiness of games over the last few decades is more than likely just a fluke.

Then again, you never know. That is just our best guess based on a model of baseball games that fits nicely in the aggregate and in the long run.

--Mgl 18:51, 8 September 2008 (PDT)

Break-even points - When to try for home

When should you go for it from 3B?

With a runner on 3B and 2 outs, the runner has around a 27% chance of scoring if he stays put. If he goes for it and makes it, he gains .73 runs. If he doesn't make it, the inning ends and he loses his .27 runs, plus another .11 run potential lost by the batter and guys on deck. Basically, he can gain twice as much as he loses. For that kind of payoff, you only need to be successful one-third of the time to make the right decision.

With 1 out, the dynamic changes substantially. In this case, the tradeoff is in gaining .35 runs (you have a 65% chance of eventually scoring from 3B with 1 out) against losing those .65 runs plus another .18 run potential lost for the batters and guys on deck. So, +.35 against -.83. It's your standard SB situation scenario: you gotta make it at least 70% of the time.
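
Both break-even points follow from the same formula: you need to succeed often enough that the expected gain covers the expected loss. Here is a small sketch using only the run values quoted above; the function name is illustrative.

```python
def break_even(gain, loss):
    """Minimum success rate at which p * gain equals (1 - p) * loss."""
    return loss / (gain + loss)


# First case above: being thrown out ends the inning.
# Gain .73 runs if safe; lose .27 plus .11 runs if out.
first_case = break_even(gain=0.73, loss=0.27 + 0.11)

# One-out case above: gain .35 runs if safe; lose .65 plus .18 runs if out.
one_out_case = break_even(gain=0.35, loss=0.65 + 0.18)

print(round(first_case, 3))    # about 0.342, roughly one-third
print(round(one_out_case, 3))  # about 0.703, the "at least 70%" figure
```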

--Tangotiger 12:33, 8 September 2008 (PDT)

Meaningful and meaningless games

Have any of you seen any studies on which games in the season are most "meaningful" and which are most "meaningless"? Every year, people quip about how "meaningless" games are before August or September (though they don't use those words explicitly), but it seems to me that the early games are pretty damn meaningful because of the early holes that a team can put itself into (or conversely, an early lead that a team can stake that will help later in the season). Any studies/thoughts?

I don't know how to answer that without a more specific definition of "meaningful." Obviously a win (or a loss) is a win regardless of when it occurs.

On the other hand, just like a game has leveraged situations that a player or manager can exploit, so does a season. For example, if I had a secret weapon that enabled me to definitely win a game, but I could only use it once, I would wait until the end of the season and if I were fighting for a playoff berth, I would use it in a high leverage game, such as when I was playing against a rival team that was also fighting for a post-season berth.

If you are implying that an early win or loss or particular place in the standings may affect later season play, I know of no research regarding that and have nothing to say about it, other than it is probably just more of the usual bloviating you hear from the media, the fans, and the talking heads known as sports commentators.

--Mgl 17:17, 22 September 2008 (PDT)

Batting Order - empirical results

According to many lineup simulators, the most important position in the lineup is 4th. That is, it's the best spot to put your best player. Next most important is the 2nd position in the order. This clashes with most people's intuition; they'd think 3rd in the order is the next most important....Would it be feasible to code each player with an OPS (or wOBA, or whatever) for each season... then examine all the lineups by game for all games in the last 20 years... then compare the Runs Scored amongst games where teams had their second-best hitter 2nd in the lineup versus 3rd? I'd be curious to see if the estimators are correct "in practice".

I suppose, but obviously you would have to control for the other batters in the lineup and the opposing pitchers. I am not sure you could get a large enough sample size after controlling for all of that, but it is an interesting thought.

My personal view is that without a good lineup simulator that includes things like base running, you are fighting windmills in trying to tease out nuances in a batting order by using either linear weights by BO or a Markov simulator (which does not include base running).

For example, we have shown in The Book that batting your pitcher 8th generally generates a few more runs per season. When I plug the various alternatives into a lineup simulator that models everything (tries to, at least), I don't get the same result. I get either no difference or that the pitcher batting 9th is optimal. At least I did with one particular lineup. (Obviously it depends on the pitcher and the other batters.)

On top of everything else, a theoretical lineup optimizer cannot account for things that a manager might know or suspect like batter tendencies. Even if that is a small consideration, it might be enough to affect the results. For example, what if you bat a position player 9th in your NL lineup and he feels slighted and performs just a little worse (maybe subconsciously to get back at his manager)? You could easily lose any edge you may have had in the first place. Or what if you have a batter who the manager thinks gets a little nervous leading off the game? Or batting in an RBI slot?

One of the few things I would not insist on foisting on my manager (if I were in control of a team) is lineup order. I might suggest an optimal one and go from there. Or if I found that his lineup was extremely sub-optimal, according to a good sim, I would probably make him change it under threat of taking away his sunflower seeds.

--Mgl 14:24, 2 October 2008 (PDT)

Odds Ratio - empirical results

I have a couple questions about the "odds ratio" you guys use to compare ratios and come up with likely matchup results. Supposedly this can be used to predict, say, the chance of a batter walking when you know the batter's walk rate and the pitcher's walk rate. Has it ever been tested to see if the prediction actually matches up with the real results for something like walk rates or strikeout rates?... Second... we know that there are pretty real, consistent tendencies of umpires to have an impact on walk rates and strikeout rates. How do you factor this into predicting the result of a PA? Now you'd need to compare the pitcher's K rate, the batter's K rate, and the umpire's K rate... Thanks for any insight here...

The answer to the first question is yes (of course), although I am not sure I have seen any publicly available studies as such. Why would you "need to" include the umpire in the model? How about the weather, the park, day or night, etc.? You can include anything you want.

You CAN include the umpire if you have that kind of information about the umpire's true tendencies. In this day and age of Questec and pitch f/x there are not a whole lot of differences among umpires. Certainly a lot less than the differences among pitchers and batters. That should be pretty obvious, no?

Here is a reprint from a thread on this blog that addresses including other variables in the odds ratio method of determining the outcome of a "matchup."

If you have one guy who is a true .600 facing another guy who is a true .400, the resulting win% will be .692, if the league mean is .500.

While the calculation when the mean is .500 is straightforward (log5), you can use the Odds Ratio method for any mean. For example, assume the league OBP is .333, you have a hitter who is .400 and the pitcher is .250. What’s the resulting OBP?

Odds(H) = .400/.600 = .667
Odds(P) = .250/.750 = .333
Odds(L) = .333/.667 = .500
Odds = Odds(H) * Odds(P) / Odds(L)
= .667*.333/.500=.444

If the Odds are .444 safe to 1 out, then the Rate is .444/(.444+1) = .308
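
The same steps can be written as a short sketch. The function names are illustrative only; the inputs are the ones from the example above.

```python
def to_odds(rate):
    """Convert a rate (e.g. OBP) into odds of success to failure."""
    return rate / (1.0 - rate)


def to_rate(odds):
    """Convert odds back into a rate."""
    return odds / (1.0 + odds)


def matchup_rate(hitter, pitcher, league):
    """Odds Ratio estimate when hitter and pitcher share one league mean."""
    return to_rate(to_odds(hitter) * to_odds(pitcher) / to_odds(league))


# Hitter .400, pitcher allows .250, league .333.
print(round(matchup_rate(0.400, 0.250, 0.333), 3))  # about 0.308

# The .600 vs .400 case with a .500 mean: a true .400 opponent
# "allows" a .600 win rate, so it enters as .600 on the other side.
print(round(matchup_rate(0.600, 0.600, 0.500), 3))  # about 0.692
```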

You can further extend this so that the Odds(L) for the hitter and pitcher are different.

The full equation is:

Odds(matchup)        Odds(H) * Odds(P)
----------------- = -----------------------
Odds(environment)    Odds(envH) * Odds(envP)

So, you have a hitter with an OBP of .400 in a league of .300 facing a pitcher with an OBP of .250 in a league of .350, and they are both playing in a league (or park) where the OBP is expected to be .380 for the league average player. What’s the resulting OBP?

Odds(matchup)   (.400/.600) * (.250/.750)
------------- = -------------------------
(.380/.620)     (.300/.700) * (.350/.650)
Odds(matchup) = .590
Matchup OBP = .590/1.590 = .371
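
The full equation, with separate environments for the hitter, the pitcher, and the matchup, can be sketched the same way. The function and parameter names are illustrative; the numbers are the ones from the example above.

```python
def to_odds(rate):
    """Convert a rate (e.g. OBP) into odds of success to failure."""
    return rate / (1.0 - rate)


def to_rate(odds):
    """Convert odds back into a rate."""
    return odds / (1.0 + odds)


def general_matchup_rate(hitter, hitter_env, pitcher, pitcher_env, matchup_env):
    """Odds Ratio estimate when each party comes from a different environment.

    Odds(matchup) / Odds(environment) =
        Odds(H) * Odds(P) / (Odds(envH) * Odds(envP))
    """
    relative = (to_odds(hitter) * to_odds(pitcher)
                / (to_odds(hitter_env) * to_odds(pitcher_env)))
    return to_rate(relative * to_odds(matchup_env))


obp = general_matchup_rate(hitter=0.400, hitter_env=0.300,
                           pitcher=0.250, pitcher_env=0.350,
                           matchup_env=0.380)
print(round(obp, 3))  # about 0.371
```

When all three environments are the same, this reduces to the simpler one-league version.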

And, you don’t have to limit yourself to just these variables. You can extend them to infinity.

--Mgl 14:32, 2 October 2008 (PDT)

Fielding System - nonPBP

What is the best defensive analysis system or method to use IF you do NOT have play by play or hit location data?

This would presume MLB seasons prior to 1954, and most of the lower-level leagues in their entirety. There are four systems that I am aware of, though I don't know which is necessarily better than the others:

  • Michael Humphreys's DRA
  • Charlie Saeger's CAD
  • Clay Davenport's FRAA
  • Bill James Win Shares (or Revised Range Factor)

--Tangotiger 08:44, 3 October 2008 (PDT)

There is plenty of research or commentary on this, on the web. It is definitely not 3 or 4. There are some other ones similar to 1 and 2 which are good. David Gassko of THT has one I think. You can probably find it on the web. DRA is sort of a black box I think, and we have not heard from Michael in a long time, I don't think. I am not sure if he ever released the "blueprints" for the box. I think it is based on some kind of regression analysis.

Anything that takes a team's pitchers' handedness, GF ratio, and BIP info (which you can infer from the K rate of course, and perhaps other things like GDP rate) and then uses a baseline of how many balls usually get fielded by each fielder given those parameters is going to be a decent one. And all of them that use this methodology are going to come out with essentially the same results. You also have to decide how to use the assists and putouts for each position.

Basically, you take a fielder's assists and putouts and use them to determine how many outs he actually made on grounders and fly balls. For outfielders, this is easy as all putouts are fly balls and line drives caught. For infielders, not so easy, so you have to make certain "assumptions" and approximations. Then you simply determine how many balls are caught by an average fielder given your approximation of where and how many balls are hit given the pitchers on the mound (when that fielder is on the field - or if you don't know that, you just use all of a team's innings), their estimated balls in play, ratio of fly balls to ground balls, and handedness (if you know the handedness of the opposing batters, even better).

There is your perfect non-PBP defensive system in a nutshell. I think that is what Saeger does, Gassko, and probably essentially what DRA does.

--Mgl 14:47, 2 October 2008 (PDT)

The Shift

Do you know of any studies done on the effects of the over-shift? Seeing as there are only 10 hitters or so who do get the shift, would the sample size be much too small to research? And why don't more hitters get unique shifts? Every batted ball's location is tracked now, so wouldn't it be easy to position fielders in better locations? I assume the traditional defense is not optimal in all cases.

Check out Greg Rybarczyk's article in the 2008 Hardball Times Annual.

--Tangotiger 14:04, 8 October 2008 (PDT)

Teams do use unique defensive positions for every batter versus right and left handed pitchers. During one of the recent post-season games, the camera focused in on one of the "scouting sheets" that the pitching coach or bench coach was using. It has each batter's "spray chart," among other things, which I assume they use for positioning the fielders.

The problem for teams and for researchers is how to "regress" these spray charts, given the sample size. Obviously if a hitter hit all his BIP to the right side of the field in 10 PA, you would not put all of your fielders on the right side only. The same concept applies to 50 or 100 or even 500 PA. And I would hope that the teams use more than one year's worth of data to construct those spray charts, although I doubt they do, except perhaps at the beginning of the season, when they might use last year's data.

One of the humorous things about those "cards" that they showed on TV was that they had each batter's BA versus RHP and LHP (as far as I could tell). My guess is that they were this year only. Not only is that a "mistake" to use one year's stats only, but you would not want to use BA any more than you would use BA to describe a player's offensive value or production. If nothing else they should be using OBA or OPS (ideally, something like wOBA of course). Sometimes, you mostly care about BA only when deciding which matchup you would or would not prefer, but usually it is some measure of total offensive value that you want.

And of course, ideally, you want regressed values for those numbers. In fact, using raw BA against RHP and LHP for each batter is a really bad use of data to make decisions (often batters only have 75 or 100 PA per season versus lefty pitchers). It is no wonder that analysts end up questioning 2 or 3 decisions per game that a manager makes. They rely on such poor data to make those decisions and even then, they are often incapable of processing that data correctly anyway.

As far as a shift, there are varying degrees of shifts used on various players, so it is not like it is "all or nothing." Whether they (or any other defensive alignment) are correct or not, I don't know. I don't remember Greg's article, but other than that, there is scant research that I am aware of. I suppose it can be done. There is probably room for lots of improvement as far as teams aligning their defense optimally.

For example, you often hear announcers talk about how some managers don't like to "guard the lines" at the end of a close game and that some do. Well, obviously it is not a matter of "taste" like, "some managers like blue uniforms and others like white." Given a certain situation, one is correct and the other is not (or it is a tie). So obviously some of those managers are right and some are wrong (again, depending on the situation, the batter, pitcher, etc.). You would think that a manager or GM would be smart enough to admit that he has no idea whether his "personal preference" for guarding the lines or not is right and that he would ask his front office to find someone who could figure it out. But that is not the way it works in baseball (managers are NOT smart enough to know or admit that there are MANY things that they don't know).

One more thing. If batters would drop a bunt down every once in a while away from the shift, that would pretty much end it (the shift) which would be beneficial overall to the batters, I assume (unless the shift were wrong in the first place). Why they don't do that, I don't know. Surely, some of the more athletic ones, at least, could do that if they really wanted to. Certainly in a potential sac bunt situation, a bunt away from the shift would be ideal (since even if you bunt it badly and get thrown out, say even 40% of the time, you would probably increase your team's WE over swinging away).


When you calculate your Marcel forecasts, do you use Excel or Access? I want to re-create my own Marcels, but I'm a novice at Access, and figured that if I used Excel I'd have to go through and manually do a =SUM function for every name in the spreadsheet.

I use Access, and I would not recommend Excel. I can easily do it in any SQL-compliant database, or even a programming language. I should eventually release the code one of these years, but someone is retracing my steps and already doing it. You can follow along on my blog, like here or here.

--Tangotiger 12:02, 21 October 2008 (PDT)

Regressing BABIP - changing other components

If figuring a hitter's (or pitcher's) stats with a BABIP regressed to the mean, how should you change the player's hits/HRs/strikeouts in order to regress the BABIP? Prorate each of the stats down a certain number equally?

I'm not sure I follow the question. If you separate all the data into components, then you are keeping the number of PA fixed, you are keeping all the non-BIP fixed, and you are regressing the non-HR hits only. So, the only thing that moves up and down is the total number of hits.

--Tangotiger 12:02, 21 October 2008 (PDT)

My basic question was, How should you regress BABIP to the mean--by regressing each stat used in BABIP (H/HR/K) equally, or just by regressing one stat (you recommend hits)?

Remember that BABIP does not include HR. It is non-HR hits per non-HR balls in play. I think you know this.

Personally, I regress each component separately. And I use different denominators for each component (some share the same denom). For example, I do K and BB+HBP per all PA (I don't include IBB and sac bunt attempts but that is a different story), all HR per BIP (PA-BB-HBP-K). Then I do all s,d,t per BIP-HR. If you do that you have to constantly re-adjust the denominators after the regressions.

Anyway, so I regress K and BB rates (per all PA). Then I regress HR rates (per BIP). Then you can regress s, d, t all the same (regress them separately), which is the same as regressing BABIP, and simply keep the ratios among the s,d,t, the same after the regression as before.

All of that is not exactly correct (there are inter-dependencies among the components, etc.), but it is not bad either.
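The sequence above can be sketched in Python. This is only an illustration under stated assumptions: the function names are mine, and the league rates and the constants in k (the number of league-average trials added to each component) are placeholders you would have to supply, not values from this answer. Shrinking a rate by adding k league-average trials is one common way to regress; other forms would work too.

```python
def regress(successes, trials, league_rate, k):
    """Shrink an observed rate toward the league rate by adding k league-average trials."""
    return (successes + league_rate * k) / (trials + k)


def regress_components(line, league, k):
    """Regress each component on its own denominator, then rebuild counts over the same PA.

    line   : observed counts with keys PA, K, BB, HBP, HR, 1B, 2B, 3B
    league : league rates with keys K, BB (BB+HBP per PA), HR (per BIP), BABIP
    k      : regression constants with the same keys as league (assumed values)
    """
    pa = line["PA"]

    # Step 1: K and BB+HBP rates, per PA.
    k_rate = regress(line["K"], pa, league["K"], k["K"])
    bb_rate = regress(line["BB"] + line["HBP"], pa, league["BB"], k["BB"])

    # Step 2: HR per BIP, with BIP = PA - BB - HBP - K as defined in the answer.
    bip = pa - line["BB"] - line["HBP"] - line["K"]
    hr_rate = regress(line["HR"], bip, league["HR"], k["HR"])

    # Step 3: non-HR hits per (BIP - HR), which is BABIP.
    hits = line["1B"] + line["2B"] + line["3B"]
    babip = regress(hits, bip - line["HR"], league["BABIP"], k["BABIP"])

    # Re-adjust the denominators: each later count is rebuilt from the
    # regressed counts that come before it, so PA stays fixed.
    new_k = k_rate * pa
    new_bb = bb_rate * pa
    new_bip = pa - new_k - new_bb
    new_hr = hr_rate * new_bip
    new_hits = babip * (new_bip - new_hr)

    out = {"PA": pa, "K": new_k, "BB+HBP": new_bb, "HR": new_hr}
    # Keep the 1B:2B:3B ratios the same after the regression as before.
    for key in ("1B", "2B", "3B"):
        out[key] = new_hits * line[key] / hits
    return out
```

With every constant in k set to zero nothing is regressed and the function returns the original counts, which is a handy sanity check before plugging in real constants.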

Does that sort of answer your question? If not, can you be more specific?

--Mgl 15:03, 29 October 2008 (PDT)

Teams scoring first

Hello, and first of all thank you to all of you guys for really expanding and deepening my appreciation for baseball with your amazing insight! My question regards teams scoring first. I recently heard some Boston radio personalities discussing the Sox vs. Rays, and one writer commented that the Rays scoring first was huge. He went on to assert, without statistical evidence, that the team that scores first wins more often than not. However, I remember reading in "The Diamond Appraised" that Craig Wright asserted the opposite, that teams that score first win the same amount. Who is right? Does it matter? Just for the record I submitted this question to Bill James online, as well (I don't know why I felt obligated to mention that). I would love your insight on this. Thanks again for really adding to my enjoyment of a game I love. Have a great day guys!!!!

It should go without saying that the team that scores first will win more often. The question is to what extent this is true. And the answer is right here.

--Tangotiger 18:26, 22 October 2008 (PDT)

Yes, of course (what Tango said). Look at it this way: The worst a team that scores first can do is to score only one run. Agreed? Well, look at the WE for any team that has a 1 run lead (1-0 in this case) either going into the bottom of the 1st (a small advantage) or a 1-run lead going into the top of an inning (IOW, if the home team scores first and it's only one run), which is of course an even bigger advantage. Either way, the team has increased its chances of winning, right?

But PLEASE, do not use this notion to "justify" playing small ball when there is no score early in the game (in order to "score first"). One has nothing to do with the other. It's not like you CAN'T score if you don't play small ball. And obviously scoring first with one run (which tends, ever so slightly, to happen with small ball) is WORSE than scoring first with multiple runs (which tends to occur, again, ever so slightly, when you don't play small ball, at least as opposed to playing small ball).

--Mgl 15:09, 29 October 2008 (PDT)

tRA - fielding-independent pitching stat

I was just wondering your thoughts on tRA from Graham MacAree at It seems fine to me as another kind of defense-independent pitching statistic.

Never heard of it. I went to their blog site,, and I cannot easily find their explanation of tRA and I am not going to spend more than 2-3 minutes in doing so. If you have a link to their original article explaining what it is, or you have the "formula" yourself, feel free to post one or the other here. My guess is that it is just another version of FIP or DIPS. By the way, most of those formulas are rough regressions of the individual components that go into a component ERA (ERC), and thus factor out the "luck" more than they factor out the defense. I call them LIPS (luck independent pitching stats), not DIPS. The problem of course is that they are good for small samples of data and bad for large samples, because the rough regressions are usually quite aggressive (or 100% actually) and with large samples of data you don't want to regress any of the components, even BABIP, quite so aggressively, and certainly not 100%. With DIPS for example, the question is at what size sample is regressing BABIP 100% (which DIPS does) better than not regressing at all, and at what point is not regressing at all (regular ERC) better than 100% regression. Of course, the best thing to do is to regress the appropriate amount according to the number of TBF or IP.

--Mgl 15:24, 29 October 2008 (PDT)

About. He does it exactly the way I would have done it. Basically, it's the batted-ball version of FIP. It follows the same methodology that something like UZR, or PMR follows: figure out how many hits, extrabase hits and outs a particular batted ball should produce if you had average teammates.

--Tangotiger 13:53, 30 October 2008 (PDT)

OK, that certainly takes the defense completely out of the equation. It does not, however, take all or even most of the luck out of the equation, which is the main thing you want to do in the short-term if you want to get close to an estimate of a pitcher's true talent. Plus I am not sure I like it, as I am not sure that the result of a batted ball does not tell you something that the "data" does not tell you. IOW, let's say that the "data" tells you that there was a hard hit ground ball. I guess that tRA assigns that ball a generic run value based on all hard hit ground balls, or maybe only the ground ball itself (I don't know what parameters it uses for each batted ball). Well, if the ball was actually a single, it probably suggests that it was hit harder than if it was an out. So by ignoring the actual result of a batted ball, I am not sure that you are not throwing out valuable information about the pitcher, certainly in terms of his actual performance you are, but you might also be doing so in terms of his true talent. That might be especially true with fly balls and line drives to the OF. Read my article (do a Google search), DIPS revisited, in which I talk about batted ball events that a pitcher has some control over.

Home Field Advantage - Batting first or last

How do you think home field advantage would change if the road team got to bat last? Is the advantage based on familiarity of the home field, the crowd or the final bats advantage?

It is definitely not due to the home team batting last, so to answer your question, no, it would not change (appreciably).

As to what it is due to, that is anyone's guess. Probably not the crowd though. Probably a combination of familiarity of the ballpark and the usual, "eating at home, sleeping in one's own bed, being with one's family, not traveling, etc."

--Mgl 15:28, 29 October 2008 (PDT)

Wins and Payroll - correlation

Just for the hell of it, I looked up and graphed payroll versus wins for each of the last 3 seasons. For one, I found that this season the r-squared value between wins and dollars spent was especially low (under .1) and when I did it for 2007 and 2006, I got numbers just below .3. I'm a layman when it comes to this stuff, but that seems awfully low. It suggests to me that any talk of a salary cap or a lack of parity is a little off base, isn't it?

First of all, an r-squared of, say, .25 is NOT that "bad." Because you have a lot of random fluctuation in one-year w/l records anyway, the relationship between ANYTHING and one-year win/loss records will necessarily be pretty small. If you want to get a better idea of how payroll relates to team success, I suggest using multi-year records.
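To put a rough number on that random fluctuation, here is a small Python sketch. It treats each game as a coin flip for an average team (the binomial model) and asks how much of the spread in one-year wins could be explained by anything at all. The observed standard deviation of team wins passed in below (11) is an assumed round number for illustration, not a figure from this page, and the function names are mine.

```python
import math


def luck_sd_wins(games=162, p=0.5):
    """SD of wins from chance alone for a team whose true win probability is p."""
    return math.sqrt(games * p * (1 - p))


def max_r_squared(observed_sd_wins, games=162):
    """Upper bound on r-squared between any talent measure and one-year wins.

    Assumes observed variance = talent variance + luck variance, so even a
    perfect measure of talent can only explain the talent share.
    """
    luck_var = luck_sd_wins(games) ** 2
    return 1 - luck_var / observed_sd_wins ** 2


print(round(luck_sd_wins(), 2))     # about 6.36 wins of pure noise per season
print(round(max_r_squared(11), 2))  # ceiling under the assumed spread of 11 wins
```

Payroll is a far from perfect measure of talent, so its r-squared has to sit well below that ceiling. Averaging over several seasons shrinks the luck term, which is why multi-year records give a cleaner picture.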

I don't spend much time thinking about or reading much about this topic. But, a salary cap will not change the parity issue much as it will only affect the few teams that spend A LOT of money. If they had a salary cap of 150 mil, there will still be teams that won't or can't spend more than 30 or 40 mil. On the other hand, if they had a salary cap of 50 mil, which won't ever happen of course, then that would be a different story, although I am not sure that even if they did, the Yankees (and Boston, et al.) would not spend 50 mil of course, and Oakland and Florida (et al.) might not spend 10 mil (if they could field a team for that).

As far as parity goes, that is a relative term. One man's parity is another man's non-parity. There is always going to be a big connection between money spent and games and championships won. At the same time, there will always (with the present system) be plenty of surprise teams and surprise WS winners. The reason for that is three-fold: One, sheer luck as I mentioned. Two, the more teams that are efficient and smart at spending their payroll, the less parity there will be. At the present time, there are lots of inefficiencies and stupidity that dictate the spending. And three, the draft and FA compensation systems somewhat mitigate the lack of parity because of payroll disparities.

--Mgl 16:04, 29 October 2008 (PDT)

Trading star players

I propose that trading Adrian Gonzalez is a better move than trading Jake Peavy but I’d like to know statistically and theoretically. In 2008 (or projecting for 2009), would the team have been more successful this year with a league average starter replacing Peavy this year or replacing AG? Though I realize the offense would have been beyond boring and nearly nonexistent, I still believe that they would be worse off without Peavy, I just don’t know how to mathematically prove my point using simulations.

Playing the "what if game" is a silly one (unless you are into that sort of thing). If you want to know which player benefits his team more going forward, look at their WAR (wins above replacement) projections for 09 and beyond. There are plenty of places in the net that will give you those numbers. Gotta take estimated playing time for each year into consideration of course, not just a "rate" projection.

And of course even if you do that in order to see who is the likely "better" (more valuable) player going forward, that information is meaningless without knowing contracts and salaries. If Peavy is going to make 20 mil a year and is worth 5 WAR next year and Gonzalez is going to make 5 mil and also is worth 5 WAR (I am just making up numbers for illustration purposes), obviously they would take Gonzalez and have another 15 mil to upgrade the rest of the team.

NEVER fall into the trap of comparing players without including their contracts and salary (unless you are just trying to answer the question who is likely going to be more valuable over X amount of time). Evaluating trades without both of those pieces of information (salary and likely WAR value) is meaningless, yet the media and fans do it ALL the time. Stupid. It's like I ask you, I'll trade my house for yours, and we don't discuss the mortgages on the homes. Yeah, I'll trade you my 3 million dollar home with a 2.75 mil mortgage for your $500,000 free and clear home. I'd do that any day of the week, no?
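The arithmetic behind that comparison is simple enough to write down. The Python sketch below uses the made-up Peavy and Gonzalez numbers from above, plus a price of about 5 mil per marginal win (the rough figure used elsewhere on this page). The function names are mine, and a real evaluation would cover every year of the contract, not just one.

```python
def surplus_value(war, salary_mil, dollars_per_win_mil=5.0):
    """Value of the projected wins minus what the player is paid, in millions."""
    return war * dollars_per_win_mil - salary_mil


def home_equity(value_mil, mortgage_mil):
    """The house analogy: what you really own is the value minus the mortgage."""
    return value_mil - mortgage_mil


print(surplus_value(5, 20))    # Peavy in the example: 5.0
print(surplus_value(5, 5))     # Gonzalez in the example: 20.0
print(home_equity(3.0, 2.75))  # 0.25 mil of equity in the 3 mil house
print(home_equity(0.5, 0.0))   # 0.5 mil of equity in the free-and-clear house
```

Same WAR, but the surplus differs by the 15 mil difference in salary, which is the whole point of the paragraph.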

Playoff rotation - setting it up

Excuse me if this has been addressed elsewhere, but what is the optimum playoff pitching rotation? Specifically, suppose you have pitchers ranked 1,2,3,4 (1 being best). Everything else being equal, pitching them in which order will optimize the probability of winning a 7-game series?

Isn't that simple? 1,2,3,4

Depending on how much better one pitcher is than the other, for example, if your 4th starter is really bad, it might be correct to only use 3 starters and pitch some of them on 3-days rest. However, even if one starter is not that good, you could certainly start him and yank him as soon as his turn comes to bat in an NL-park, or just throw him for 2-3 innings and then bring in relievers who are generally going to be better than him in the first place, plus you can mix and match your RH and LH relievers to try and get the platoon advantage as much as possible. --Mgl 15:43, 29 October 2008 (PDT)

Is it that simple? Under that rotation, you would have your third best pitcher starting a possible seventh game, which must be the game with the highest leverage. If the opposing manager had arranged his rotation so that his ace started game seven, you would be at a disadvantage (everything else being equal). Right?

You want to maximize your chances of your ace pitching more than one game. A series only goes to a game 7 roughly 30% of the time. You don't want your ace only pitching 1.3 games on the average, rather than 2.
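Those figures can be checked with a short Python sketch. It assumes evenly matched teams and independent games (my simplification), and the function names are mine. A game only counts as a start if the series actually reaches it.

```python
from math import comb


def series_length_probs(p=0.5, wins_needed=4):
    """P(series ends in exactly n games) when one side wins each game with probability p."""
    probs = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The series winner takes game n plus exactly wins_needed - 1 of the first n - 1.
        ways = comb(n - 1, wins_needed - 1)
        losses = n - wins_needed
        probs[n] = ways * (p ** wins_needed * (1 - p) ** losses
                           + (1 - p) ** wins_needed * p ** losses)
    return probs


def expected_starts(slots, p=0.5):
    """Expected number of starts for a pitcher lined up for the given game numbers."""
    probs = series_length_probs(p)
    # Game g is played whenever the series lasts at least g games.
    return sum(sum(pr for n, pr in probs.items() if n >= g) for g in slots)


print(series_length_probs()[7])    # 0.3125, the "roughly 30%" above
print(expected_starts([3, 7]))     # 1.3125, ace saved so that he gets game 7
print(expected_starts([1, 5]))     # 1.875, ace goes first in a four-man rotation
print(expected_starts([1, 4, 7]))  # 2.3125, ace goes first in a three-man rotation
```

Lining the ace up for game 7 costs him about half a start on average compared with simply starting him in game 1.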

--Mgl 18:32, 30 October 2008 (PDT)

Home Field Advantage - how to figure for each team

How do you figure out a team's home field advantage? Is it just the difference between home winning percentage and road winning percentage divided by 2? If a team had a rather large HFA, of let's say 8%. How much of that 8% would you determine as this team being a good home team as opposed to a bad road team? How many years/games back of data would you look at to get a reliable read, and how much would you then regress that number back towards league average?

Go to this thread:

Start on comment #40. As far as computing the HFA, it is probably some combination of the home to road ratio and difference. I don't think that either one fits a best-fit-curve real well. I would opt for the ratio method if anything and not the difference. I could be wrong though.

Fans Scouting Report - Alexei Ramirez

It looks like the White Sox are going to move Alexei Ramirez over to short for 2009. The consensus so far is that he was a fairly poor defender at second, which would make a switch to short seem like a pretty bad idea as defense goes. The Fan Scouting Report, however, seems to indicate that the transition will go much better. Alexei scores pretty well compared to good shortstops. That conclusion only holds as long as the fans do as you exhort and disregard position. So I want to know if you think they do that and, more importantly, I want to know if you think the FSR is robust enough to say with certainty that Alexei Ramirez is likely to play at least average at shortstop?

I'll let Tango answer your question, but...

"with certainty"

Are you kidding?

BTW, he had a +7 per 150 UZR (he "saved" 5 runs total) at second base this year. Not knowing anything else about him, that suggests he might play a roughly average SS. Not with absolute certainty though. ;)

--Mgl 15:51, 29 October 2008 (PDT)

One man's consensus is another man's biased circle of friends. I've got 37 Whitesox fans from around several blogs evaluating him as a "70" as a 2B, which is above-average. That would roughly correspond to an evaluation of +10 runs per 162G, a number pretty darn close to MGL's just-posted UZR. The profile of his skills, when we look at his top comps, shows comparables that are almost all SS or CF.

So, what we have here is: a) an analysis of his performance via numbers-only that suggest he is an above average 2B, b) an evaluation of 37 hardcore Whitesox fans that took time out to tell us he is an above average 2B and who has a profile similar to other very good fielding SS, and c) an evaluation from the Whitesox management that suggests he was good enough to be promoted from 2B to SS.

Against that is the reporting of some consensus that he was a poor defender.

It is far more likely that the people you are reporting from are not a representative group of fans.

--Tangotiger 12:01, 14 November 2008 (PST)

Baseball Reference

How can you get baseball-reference's league stat page ( onto a spreadsheet? I don't have Excel (I have Microsoft Works), so I can't save it and open the text file in Excel.

A reader recommended: "Scale in the OpenOffice install, its free for any one to download. "

Regression - sample and population distributions

Quick question, guys:... Tango in one of your mailbags or somewhere on the blog you mentioned that if you use an average of 600 PA you’ll get an r of .75, which implies that x=(.25/.75)*600=200 based on the formula for finding x: x=((1-r)/r)*PA. If you use an average of 200 PA, r= .50 and x= 200. ...Also, I have it written down somewhere that for 500 PA, one SD for lwts is ~ 9.7 runs. This means that 68% of the time (in 500 PA) a player will be 9.7 runs plus or minus his true lwts talent level....Can it be true both that at 200 PA, r = .50 and that in 500 PA 1 SD = 9.7 runs? If at 200 PA we know 50/50 about a player's true talent (or 50% regression towards his historical stats/50% regression toward his pop. mean), wouldn't the SD in 500 PA be lower?... Hopefully you can clear this up for me. Might be a math/data error.

I think you are mixing up two things, or I'm not understanding properly. The 9.7, if I indeed said that, is the population spread. That is, two-thirds of the players are within 0 +/- 9.7 runs.

So, it is not correct to say that 68% of the time, a player will be 9.7 runs +/- his true talent (or even true talent +/- 9.7 if that's what you really meant).

You have to distinguish between the population spread in talent, and the spread in sample performance for a single player, explained by the binomial.
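Here is a small Python sketch of the two separate ideas. The first two functions are just the reader's formula, x=((1-r)/r)*PA, and its inverse. The third is the binomial spread for a single player, shown for an on-base-type rate because that is the simplest binomial case; the .330 rate is an assumed illustration, and the function names are mine.

```python
import math


def regression_constant(r, pa):
    """x = ((1 - r) / r) * PA, the formula quoted in the question."""
    return (1 - r) / r * pa


def reliability(pa, x):
    """Invert the formula: the r you would expect at a given number of PA."""
    return pa / (pa + x)


def binomial_sd(p, n):
    """Spread of one player's observed rate around his true rate p over n trials."""
    return math.sqrt(p * (1 - p) / n)


x = regression_constant(0.75, 600)
print(round(x))                        # 200, as in the question
print(reliability(200, 200))           # 0.5, regress halfway at 200 PA
print(round(reliability(500, x), 3))   # about 0.714 at 500 PA
print(round(binomial_sd(0.330, 500), 4))  # sample spread of a .330 rate in 500 PA
```

The r values describe how much to regress an observed performance. The binomial SD describes how far one player's sample can drift from his own true talent. The population spread is a third quantity, the spread in talent across players, and it should not be read as either of the other two.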

--Tangotiger 12:01, 14 November 2008 (PST)

Hockey - broadcast deals

Tango, any thoughts on whether the NHL network's broadcasting of D-1 college hockey will have an impact? Should they also broadcast Canadian amateur hockey (college or otherwise)?,15995/NHLNetworkAnnouces200809CollegeBroadcastSchedule.html

Impact? I'm not sure on who it should have an impact. Hockey is regional, and college hockey is even more regionalized. The impact should be felt very locally. I'm fine with that. I don't think that we need to necessarily export the product to everyone in order to make a splash.

--Tangotiger 12:01, 14 November 2008 (PST)

Injuries - effect on talent level

I was wondering if any of you had seen any studies on how injuries affect a player's true talent for hitting, fielding, baserunning, pitching, etc. I specifically was thinking about what a player's expected true talent would be if he tried to play through an injury compared to what a player's true talent would be after returning to health from the injury. Factors like type and severity of injury, aging, park/league/team changes, etc. would also have to be considered. Thanks for your time

Thanks for your question. Unfortunately I am not aware of any research that has been done trying to quantify that sort of thing. There's a great project for you!

--Mgl 22:06, 22 November 2008 (PST)

Run value - batted ball trajectory, velocity

Are there any defensive metrics that evaluate the run value of a ball hit based on what happened to balls hit with similar velocity and trajectory?

Yes of course, if by "velocity" and "trajectory" you mean an estimate, like a ground ball, fly ball, line drive, pop fly, or even a "fliner," and hard, medium, and soft.

All of the play-by-play defensive metrics essentially use the run value (how often it is turned into an out by each potential fielder, and if it is not turned into an out, the run value of the hit or error, or what have you) of a batted ball based upon the characteristics I mentioned above. That is the essence of all of these metrics, like UZR, Dewan, Pinto, SAFE, etc.

I am not aware of anyone that goes beyond putting each batted ball into one of a few different "buckets" based on what kind of batted ball it was (fly, liner, etc.) and the approximate speed (soft, medium, or hard), as guesstimated by the "stringer" who works for the company that provides the data.
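The bucket idea can be sketched in a few lines of Python. This is a stripped-down illustration, not how UZR or any of the other systems is actually coded: it buckets only by batted-ball type and speed, it counts outs for the whole defense instead of for each fielder, and it stops at plays above average without converting to runs. The real systems also use location. The field names and sample data are made up.

```python
from collections import defaultdict


def bucket_out_rates(balls):
    """League out rate for each (type, speed) bucket."""
    counts = defaultdict(lambda: [0, 0])  # bucket -> [outs, balls]
    for b in balls:
        key = (b["type"], b["speed"])
        counts[key][0] += b["out"]
        counts[key][1] += 1
    return {key: outs / total for key, (outs, total) in counts.items()}


def plays_above_average(team_balls, league_rates):
    """Credit each ball with (1 if out else 0) minus the league out rate for its bucket."""
    return sum(b["out"] - league_rates[(b["type"], b["speed"])] for b in team_balls)


# Tiny made-up sample: "out" is 1 when the ball was turned into an out.
league = [
    {"type": "GB", "speed": "hard", "out": 1},
    {"type": "GB", "speed": "hard", "out": 0},
    {"type": "FB", "speed": "soft", "out": 1},
    {"type": "FB", "speed": "soft", "out": 1},
]
team = [
    {"type": "GB", "speed": "hard", "out": 1},  # tough play made: +0.5
    {"type": "FB", "speed": "soft", "out": 0},  # easy play missed: -1.0
]
rates = bucket_out_rates(league)
print(plays_above_average(team, rates))  # -0.5
```

Making a play in a bucket where outs are rare earns a lot of credit, and missing one where outs are routine costs a lot. Finer buckets (actual speed and angle) would only sharpen those league rates.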

Ideally, you would want to do the same thing with actual speed in mph, fps, or whatever, and specific trajectory, like angle of ascent or maximum height, or something like that. But I am not aware of any source of data like that. Eventually when MLB puts or allows someone to put cameras on the field that track batted balls ("hit f/x data"), we will have that data, and hence we will have much better defensive metrics.

--Mgl 22:14, 22 November 2008 (PST)

Value of a run - prevent or score

I'm guessing it depends on the run environment, but generally speaking, is it better to score a run or prevent a run?

Given the same run environment, it wouldn't matter. A 9-7 lead is no different from a 2-0 lead, given the same situation (inning, outs, men on base, teams playing).--AED 07:33, 19 November 2008 (PST)

In addition, it is generally "better" to allow one fewer run than to score one more run, because allowing fewer runs obviously lowers the run environment which makes the impact of that extra run more valuable. If you score 5 rpg and allow 4, you will win slightly more often than if you score 6 and allow 5. The difference is negligible I am afraid, especially since the impact of one or two players on a team is usually no more than .1 rpg or so.
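You can see the 5-and-4 versus 6-and-5 comparison with any runs-to-wins estimator. The answer does not say which one it has in mind, so the Python sketch below uses the familiar Pythagorean estimate with an exponent of 2 as a stand-in. Both the estimator and the exponent are my assumptions.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean estimate of winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


low = pythag_win_pct(5, 4)   # score 5 rpg, allow 4
high = pythag_win_pct(6, 5)  # score 6 rpg, allow 5
print(round(low, 3))   # about 0.610
print(round(high, 3))  # about 0.590
```

The run differential is the same one run per game in both cases, but it buys a little more in the lower run environment.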

Which is why, in general, good pitching is not really any more important than good hitting. The notion that "pitching wins" is nonsense. It just seems that way, which is probably where that notion comes from. When you win a game 10-8, it seems like a bad team just got lucky, but when you win 1-0, it seems like it was won with "skill," right?

Another similar idea, that, for example, a team with poor hitting but good or decent pitching, needs to improve their hitting, but not their pitching (or vice versa), is also nonsense. It is obviously easier to improve in an area that is not that good in the first place (although not necessarily cheaper, as a marginal win tends to cost the same no matter what your starting point is), but, for example, if you have great pitching and crummy hitting, if you improve your pitching one win or your hitting one win, it is essentially the same thing. Yet somehow a lot of people seem to think that teams "need" to improve in areas that they are lacking. Not true, although as I said, it is generally a lot easier to improve your weaknesses.

For example, let's say that you have 5 top-notch starting pitchers, but a bunch of replacement level position players. It would probably be difficult to add some wins to your starting pitchers. You would somehow have to do something like trade a 4 WAR pitcher for a 5 WAR, with the presumption being that you would have to pay for that extra win - around 5 mil. But in order to upgrade your offense, assuming you had some replacement-level players at a few positions, all you would have to do is find a halfway decent player to replace one of your replacement players. It would still cost you around 5 mil for that extra win (and in both cases, you would have improved your team by one win), but it should be quite easy to do.

In fact, if you were a smart team, you could probably find that extra offensive win a lot more cheaply than the going rate for a marginal win (although if you were a dumb team, you might overpay for that halfway decent player and end up paying 6 or 8 mil for that extra win). Stars and superstars tend to be consistently overpaid, or at least paid the "going rate". Marginal players tend to be more variable in terms of whether they are over or underpaid. IOW, smart teams can usually find lots of bargain low-impact players (and dumb teams will overpay for some of them). Almost everyone usually has to pay top dollar for a star player.

--Mgl 22:34, 22 November 2008 (PST)

Linear Weights - pitchers as hitters

I apologize for this totally noob/naive question. I am obviously no stats guru (I am a drummer...insert joke here!) My question is regarding the calculation of an ABF for developing batting runs linear weights: is it true that pitchers' batting stats should be removed from the league stats in calculating that factor? I am struggling to accurately remove pitchers' batting stats from league stats (I am not using play-by-play data, and sorting batting stats by excluding pos 'p' from the fielding table of the bdb obviously eliminates batting stats of anyone who appeared as a pitcher, not just their batting stats as a pitcher...this is my best attempt, pretty weak I know). Anyway, thank you as always for your awesome work! I truly love reading all of the stuff from all of you guys, it really adds so much enjoyment of the game to my life and I am very grateful for the work you guys all do!

The ABF that the reader refers to is the outs term in the Linear Weights equation.

Yes, I remove all pitchers hitting. I listed the primary position for each player here, and I suggest that you use that:

Babe Ruth as a pitcher/nonpitcher is its own special thing.

--Tangotiger 10:47, 26 November 2008 (PST)

OBP - with or without IBB?

On Base Percentage should be computed net of intentional walks (i.e., remove from both numerator and denominator). Discuss.

Tough call. The real answer is "it depends" really on what you are trying to do. In a lot of analytical work, it helps to treat the IBB as a non-event. However, sometimes you come across the unintentional-intentional walk, and so, how do you really know what kind of walk you are categorizing? This problem is exacerbated with a player like Bonds who got so many IBB it's not even funny.

In the end, we really need to match it to the reality, and the reality is best expressed as WPA. So, what you'd need to do is figure out how much a Bonds NIBB and IBB walk is worth. And do the same for the top hitters (and 8th place hitters). This will tell you how to treat the IBB.

--Tangotiger 10:55, 26 November 2008 (PST)

The short answer is that yes, it is generally correct to simply ignore them in any rate stat because they generally are equivalent to a generic PA by that particular batter. IOW, letting the batter hit will generally and on the average not change his value (managers like to think that the IBB lowers the offensive team's/batter's win value in that situation, but overall, it is probably a wash), so you can simply substitute a PA for every IBB. Obviously ignoring the PA will yield the same result, as far as OBP is concerned.

There are at least 2 caveats or qualifications to that: One, we can assume that a batter will actually get walked more often if allowed to have a PA ("hit away") in an IBB situation, so ignoring the PA is not exactly the same thing as substituting a PA in that situation. Two, it depends on what you want to "do" with the OBP. If all you want to do is tell someone how often the batter got on base, then by all means include the IBB in both numerator and denominator. That is probably not the answer you want to hear, but the answer to questions like that often depends on your "utility" or what the heck you are trying to do. If you are using OBP to represent some kind of value or skill for the player, then you are better off not including the IBB in the computation. If you just want to represent how often the batter got on base, period, then obviously you want to include all events, although even then, it is not 100% clear what to do with ROE's or SH and SF's.
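For anyone who wants to see the two versions side by side, here is a short Python sketch. It uses the standard OBP formula, (H + BB + HBP) / (AB + BB + HBP + SF), with BB including the intentional walks, and then takes the IBB out of both numerator and denominator. The stat line is made up for illustration and the function name is mine.

```python
def obp(h, bb, hbp, ab, sf, ibb=0, exclude_ibb=False):
    """On-base percentage; bb is total walks, including intentional ones.

    With exclude_ibb=True the intentional walks are treated as non-events and
    removed from both the numerator and the denominator.
    """
    drop = ibb if exclude_ibb else 0
    return (h + bb + hbp - drop) / (ab + bb + hbp + sf - drop)


# Made-up line for a heavily walked hitter: 100 BB, of which 30 are intentional.
stats = dict(h=150, bb=100, hbp=5, ab=500, sf=5, ibb=30)
print(round(obp(**stats), 3))                    # 0.418 with the IBB counted
print(round(obp(**stats, exclude_ibb=True), 3))  # 0.388 net of the IBB
```

The first number answers "how often did he get on base, period." The second is closer to what you want if OBP is standing in for value or skill.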

--Mgl 12:08, 29 November 2008 (PST)

Regression - component numbers

What are the year-to-year correlations of component batting stats, such as singles, doubles, triples, homers, etc.?

Did we answer this already? If we didn't, you can find one set here:

Win Expectancy - Other sports

Have you or anybody else calculated a "Win Expectancy" table for sports other than baseball? ie - Football, Hockey, Basketball, Tennis, Golf. ... In football you could look at. 1) Score differential. 2) Possession of the ball 3) Yard line (0-20, 21-79, 80-99) 4) Time Left (to the X seconds)

I did a simple one in hockey. What was interesting is how little the leverage index in hockey strays from the 0.7 to 2.0 range. That is, you are always in the game, and yet, it never gets to the point where situations are super high critical like in baseball. The downside of baseball of course is that there are A LOT of low-leverage plays. So, it depends what kind of sport you need.

As for football, I asked the guys at Football Outsiders for their event files a few years ago, so I could calculate it. It's pretty straightforward, and easy to do with a Markov program. But, I never followed up with them on it, and I've got a long todo list as it is.

--Tangotiger 10:55, 26 November 2008 (PST)

Supply and Demand - free agents

When you use WAR to calculate what a player is worth and to make a judgement on a contract that a player just signed, do you think it is important to take into consideration the supply and demand of what's available in the free agent pool? If there is a glut of players available on the open market at a certain position, it would seem logical from a supply and demand standpoint, that the player would be a good candidate for getting a contract with a lower salary than what WAR calculations came up with.

I don't really get this. There are always 30 teams, there are always 750 players on the 25-man roster and 1200 players on the 40-man roster. You can have 10 teams bidding on 10 SS one year or 3 teams bidding on 3 SS the next year, and I'm not sure it matters much. In the 10 team example, after the first 7 SS are signed, you are left with 3 teams and 3 SS. Sure, I can see some jockeying going on, but I don't see much to worry about.

--Tangotiger 10:23, 4 December 2008 (PST)

I don't know. I have not thought about it too much. But it seems to me that the salary for any professional athlete is determined by two things: One, the revenue stream. Basically, the owners of the teams and the players are entitled to "split" the profits that are generated, in some fashion (50/50, 60/40, etc.). Two, the supply of the players, as you allude to. If somehow there were 100,000 players of equal talent (which would make little sense of course) at the top of the talent heap, and only 1000 could sign contracts, then the average salary would be a lot less than if there were only 5000 or 10000. But in reality, you don't have a pool of players of equal talent that MLB teams draw from. You have a bell curve of players, and the teams take the 1000 best players from that distribution. So, there really is not much of a supply and demand factor working in professional sports. There really is no fixed talent level that defines a professional athlete. A professional athlete is merely the X best of whatever the talent pool happens to be.

If, however, you happened to have a glut of somewhat equal talent available as FA, either at one position or for all positions combined, then I can see a reduction in salaries. But, this is not likely to happen either. What Tango is trying to say is that no matter what the talent, a certain fixed number of players are hired by the teams, so to that extent, there really isn't much of a supply and demand. If there happens to be a glut of talent in any one time period, that just means that those players who are hired will be more talented, on the average, than in another time period. But they will still get the same money.

But, as I said, it is theoretically possible for there to be a glut of equal talent such that it drives the price of that talent down. Let's say that there is one more spot open on one team's 25-man roster and instead of the usual 10 players who are 5 runs above replacement that can be used to fill that spot, you have 20 players at 5 runs above replacement. Theoretically, I think, you ought to be able to get that one player at a little less cost.

--Mgl 18:05, 4 December 2008 (PST)

Quantifying heart

Is effort a quantifiable stat or does it interfere with the stats? What happens if a player ups his effort in one year and ultimately raises his stats, can it be accounted for by stats or is it an anomaly? Can passion be accounted for? Heart? I realize this may go along with your intangible theory but obviously a player who isn't trying whatsoever will perform worse than if he isn't getting his daily pep talk.

Forget about sports, and focus on you. You work in corporate America, or you are a student, or you are exercising. Whatever you want. The output of your work is a combination of your innate talent plus your desire. Is it important to be able to split that into two components? Does your boss really care that he's paying you 100K a year based on 80K of talent and 20K of desire, while your coworker is paid 50K on talent and 50K on desire?

Whatever it is that you manage to squeeze out, that's all we really care about. And a player's HR total is a combination of talent, desire, and the opportunity.

--Tangotiger 10:27, 4 December 2008 (PST)

Ditto exactly what Tango said. It always amused and bemused me when the media (or whoever) talks about a player who plays with heart, plays hard, or plays the "right way," and that therefore they should be worth more than people think they are, based on their "numbers."

As Tango says, "heart," effort, and all of that, are already IN the numbers, so what do we care if a player plays at 100% or 50%. The numbers are the numbers and reflect talent + effort (and heart). In fact, we should be more likely to want to acquire a 2 WAR player who plays WITHOUT much effort or heart than a 2 WAR player who plays with maximum effort, since the former presumably has more talent and if we could ever motivate him to play harder, we would have a better player. The latter player is already maxed out.

That being said, if you think that a player's character, effort, heart, etc., has an impact, one way or the other, on other players, or that the fans prefer to see a player who plays with heart and effort, all other things being equal, that is a different story. In those cases, if you have 2 players with the same "numbers" you might prefer and want to pay more the player with more heart, character, and effort.

--Mgl 18:12, 4 December 2008 (PST)

OBP, SLG aging curves for young players

A SI article in 2007 said: "Since taking over as G.M. in 2003, Epstein has introduced an emphasis on advanced statistical analysis; for instance, he believes so strongly in how minor league track records project to big league performance that he expects that within a batter's first two years in the majors he will lose 10% off his OBP but add 20% to his slugging percentage." Do you guys have any evidence of this?

I don't really get the connection between the two clauses in Epstein's (purported) sentence, and I don't offhand know the average aging curve for every component of a player's offense, but I will say that a player's (at least a batter's) age is much more important than his level (minor, major) in terms of an aging curve. In other words, a 25-year old player in the minors probably ages the same as a 25-year old player who has been in the majors for 3 years, with one caveat. If the player who is in the majors, made the majors because he was, for example, already hitting for power at a young age, his peak age for power may have been earlier than the player still in the minors. There are selective sampling issues when it comes to players of the same age being in the minors or majors.

So Epstein should be talking about age and not about players' first or second year in the majors. Of course he is probably speaking in generalities. In other words, he is assuming that we are talking about a typical 23-25 year old player, which is around when batters make their debut in the majors (I think). If you want to see how the various offensive components "age", take a look at Tango's aging curves:

--Mgl 18:22, 22 December 2008 (PST)

Team Forecasting 2009

My apologies if this has been asked before -- I assume it has -- but I can't seem to find a good explanation: I am interested in how one would use 2008 stats of a particular team to forecast 2009 wins and losses. (rest of question truncated)

MGL has a great article in this year's and last year's Hardball Times Annual. If you catch him in an unusually verbose mode, maybe he'll expand more here.

--Tangotiger 10:25, 22 December 2008 (PST)

I'm not in a verbose mood, although one person's verbosity is another person's laconic-ness. You can do several things, depending on how much info you have, how much work you want to do, how much "math" skill you have, etc. If you really mean 08 stats, as in 08 stats only, you can take each player's individual stats, and establish an 09 projection for each player. Since you are only using 08 stats, you would have to regress them fairly heavily and age adjust them. How much to regress depends on how many PA or TBF in 08, and what stat or stats you are using.
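
As a rough illustration of "how much to regress depends on how many PA", one common way to regress a rate toward the mean is to blend in a fixed number of league-average opportunities. This is only a sketch: the function name, the `ballast` value, and the sample rates are hypothetical and are not taken from the answer above, and the age adjustment is left out.

```python
def regress_to_mean(observed_rate, opportunities, league_rate, ballast):
    """Shrink an observed rate toward the league rate.

    ballast is the number of league-average opportunities blended in;
    the larger it is relative to opportunities, the heavier the regression.
    """
    weight = opportunities / (opportunities + ballast)
    return weight * observed_rate + (1 - weight) * league_rate


# Hypothetical: a .380 OBP in 2008, league .330, ballast of 200 PA (assumed).
full_season = regress_to_mean(0.380, 600, 0.330, ballast=200)
part_time = regress_to_mean(0.380, 150, 0.330, ballast=200)

print(f"600 PA: {full_season:.4f}   150 PA: {part_time:.4f}")
```

The same observed rate is pulled much closer to the league rate when it comes from fewer PA. The right ballast differs by stat and would have to be estimated separately.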

If you use that "individual" method, you have to figure out how much playing time each player from 08 will get in 09 and then pro-rate all of their 09 projections. Obviously there will be players who are slated to play in 09 that were not on the team in 08, so you have to use their projections as well. For example, if you use some kind of lwts stat (runs above/below average) for batters and some kind of RA or ERA stat for pitchers, you add everything up to establish each team's expected runs scored and allowed (you don't care if you are using park or non-park adjusted stats, as a team's park does not really affect its runs scored and runs allowed differential or ratio). From that expected or projected team RS and RA, you use pythagoras to convert that into a w/l percentage. From that, you can simply apply that to 162 games or incorporate the expected w/l percentage of the teams that your team is going to play (a "strength of schedule" adjustment).
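
The last step above, going from projected RS and RA to a w/l percentage, can be sketched as follows. This assumes the classic Pythagorean form with an exponent of 2; other exponents are in use, the function names and run totals are hypothetical, and no strength-of-schedule adjustment is applied.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Apply the Pythagorean percentage to a full schedule."""
    return games * pythag_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical projection: 800 runs scored, 700 runs allowed.
wins = projected_wins(800, 700)
print(f"projected wins: {wins:.1f}")
```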

If you don't want to use the individual method, you can take a team's stats from 08 and apply some kind of conglomerate regression and aging adjustment. Then you have to adjust for changes in personnel and playing time. You can take their actual record from 08 and regress and adjust that to forecast their 09 record, although that method would be coarse and not very accurate. You can be a little finer and use their 08 runs scored and runs allowed (essentially their pythag record) in order to forecast their 09 runs scored and allowed and ultimately their expected (pythag) record. That is still pretty coarse and inaccurate. You can also take their team underlying stats (like ERA and OPS, or team FIP and team lwts, or any number of stats) and use that to project the same stat in 09 (by, as usual, applying a regression toward the mean, an age adjustment, and an adjustment for changes in personnel and playing time).

I feel like I just answered a really obvious question unless you wanted me to walk you through a specific methodology. I am not going to be THAT verbose.

Fact of the day: "Verbal" means "with words" and can refer to spoken or written words. A "verbal contract" can be a contract spoken or written. An "oral contract" is one that is spoken. In common usage, "verbal" often refers to oral and not written, but technically that is not what it means.

--Mgl 18:40, 22 December 2008 (PST)

Most dominant 10-year stretches

I was going to run an exercise to look at pitching dominance over a period of years (one of your entries regarding 'best pitcher of the decade' inspired me) to see who was the most dominant in any given 10 year period, but then I thought maybe someone has already done this. Do you know of a place I might find this?

I may have seen it, but nothing sticks out at the moment.

--Tangotiger 10:25, 22 December 2008 (PST)

Fielding similarity scores

I was wondering if you had any suggestions for (or know someone who does) fielding similarity scores.

I'm partial to my list:

Merge that with UZR, and you get what you want.

--Tangotiger 10:25, 22 December 2008 (PST)

I was looking at Adrian Gonzalez's similarity scores briefly after submitting my question. I was wondering if anyone could do similarity scores using historical players. Rally's TotalZone is what made me think of this. Maybe one could work something out with TotalZone, speed rating, throwing errors and non-throwing errors, or something of the ilk?

I think that similarity scores are of limited use for hitting stats, so I have to believe they would be even more limited with fielding stats. All these things try to infer the actual toolset of the players, and so, I think other than speed stats (better found in hitting numbers), I don't know that we'll get much use. But, I'd love for someone else to spend the time to prove me wrong.

--Tangotiger 11:35, 6 January 2009 (PST)

Changing FIP

Could you comment on the stability of 2B+3B rate for pitchers and whether or not including it in FIP would make for a worthwhile overhaul or not?

FIP is not some overall/complete metric. It addresses one particular function, and that is: how does the pitcher perform when he doesn't rely on his fielders. If you want to count the number of line drives or ground balls, that's one thing. But, to look at the actual outcome of plays, that are dependent on his fielders? That goes against the "I" in FIP.

I'm not saying it's not a worthwhile effort, but not within the scope of FIP.

--Tangotiger 10:25, 22 December 2008 (PST)

Extra base hit rates (depends a little on the denominator) are more stable than singles rates and less stable than HR rates. You can use anything you want to evaluate a pitcher and to make pitcher projections. I agree with Tango that FIP is not supposed to be anything but what it is, which is a pitcher's component ERA (ERC) after taking out all plays that fielders had a chance to make. At the same time it takes out that portion of a pitcher's stats that are least stable (most unrelated to the pitcher's true talent). If you throw any singles, doubles, and triples into an FIP formula, as Tango says, it is no longer an FIP. You can include a pitcher's 2B+3B rate in any "runs" formula if you want. Just know that for small samples of performance it will not have a whole lot of predictive value and is somewhat subject to the fielders no matter what the sample size. The reason that FIP is so nice is that, especially for small samples (say, one or two years or less), it better characterizes a pitcher's true talent and is more predictive of future performance than ERC or ERA. Once you start throwing in things that have a lot of noise in them, like BABIP, you will have a less predictive measure of a pitcher's performance and it will be less reflective of true talent. On the other hand, as the samples get larger, you tend to be better off throwing in those things, since the larger the sample, the less noise there is in anything, even BABIP, 2B+3B rate, etc. But again, 2B+3B rate is probably less noisy (more predictive) than BABIP and singles rate, and is also probably positively correlated with HR rate. In fact, in an FIP formula, you could probably use a 2B+3B rate tied in to a pitcher's HR rate, rather than a league-average BABIP rate (which necessarily includes a league-average 2B+3B rate).

--Mgl 18:57, 22 December 2008 (PST)
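
For reference, a sketch of FIP in its commonly published form, which makes the point above concrete: only HR, BB, and K go in, so no ball that a fielder could have played is counted. The constant of 3.2 is an assumption here (it is normally set so that league FIP matches league ERA), and the pitching line is hypothetical.

```python
def fip(hr, bb, k, ip, constant=3.2):
    """Fielding Independent Pitching: (13*HR + 3*BB - 2*K) / IP + constant.

    constant is an assumed league adjustment that puts FIP on the ERA scale.
    """
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Hypothetical pitcher: 20 HR, 50 BB, 180 K in 200 IP.
example_fip = fip(hr=20, bb=50, k=180, ip=200)
print(f"FIP: {example_fip:.2f}")
```

Adding a 2B+3B term would turn this into a different "runs" formula, which is the distinction both answers draw.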

Odds Ratio - on other stuff

Can you use an odds ratio for football, too? Say you want to figure out the probabilities of a QB throwing for 300 yards, when his average is 325 and his opponent gives up 275 per game (in a league where 250 is the average per game)?

Well, odds is based on successes per opportunity. Yards per game would be a ratio of two numbers, and so does not qualify. However, completions per attempts would work. However, with short, medium, and long passes to contend with, you'd like to have a breakdown at that level as well.

--Tangotiger 11:44, 6 January 2009 (PST)

As a really rough approximation, you can predict that the number of yards in this situation would be 325 * 275 / 250 = 358 yards. But, this makes an awful lot of assumptions -- that the offense has faced average-quality defenses, the defense has faced average-quality offenses, that the defense has faced offenses that throw the ball an average fraction of the time, etc.

As for predicting the odds of exceeding 300, you'd have to do a LOT more work. --AED 09:59, 9 January 2009 (PST)
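
A minimal sketch of both ideas in this thread. The odds ratio part works on a successes-per-opportunity stat such as completions per attempt; the completion rates used are hypothetical. The yardage line simply reproduces the rough multiplicative estimate given above, with all of its stated assumptions.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def odds_ratio_matchup(p_offense, p_defense, p_league):
    """Expected success rate for a matchup via the odds ratio method."""
    matchup_odds = odds(p_offense) * odds(p_defense) / odds(p_league)
    return matchup_odds / (1 + matchup_odds)


# Hypothetical completion rates: QB 65%, defense allows 62%, league 60%.
completion_rate = odds_ratio_matchup(0.65, 0.62, 0.60)

# Rough yardage approximation from the answer above: 325 * 275 / 250.
rough_yards = 325 * 275 / 250

print(f"expected completion rate: {completion_rate:.3f}")
print(f"rough passing yards: {rough_yards:.1f}")
```

Neither number says anything about the chance of exceeding 300 yards; that would need a distribution around the estimate, which is the extra work mentioned above.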

Effectiveness of a manager

Has there been any study into how many individual wins a manager is responsible for? I know James invented the win shares for players, but is there any way to show exactly how effective a manager truly is?

That is a good question and something we would like to know since we talk all the time about things managers do that we think are wrong. There has been some research on manager skill, but I can't remember off the top of my head where it is/was. No one has done it from the standpoint of individual strategies, though, at least in a comprehensive fashion (for example, one article I recall looked at manager pinch hitting strategies and effectiveness), I don't think.

It is on my long list of things to do one of these days. The methodology I anticipate using is to look at all of the individual strategies used by managers, such as bunts, IBB's, pinch hitting, pitching changes, usage of the bullpen, especially closers, early or late hooks with starters, etc., and simply tallying all of the gains/costs that we think each strategy produced based on some optimal model that we think is right.

Of course another strategy and a whole different ballgame in terms of evaluating managers is to look at player performance under each manager as compared to other managers (sort of a manager "with and without you" as Tango does with fielding) or even team performance as compared to some baseline expected team performance. As I said above, I think that a few things have been done along those lines. Lots of different ways to go about it. Also difficult to do, as you can imagine.

If someone does do a comprehensive study in this regard, it would be groundbreaking, although it probably just entails some grunt work (and lots of it) although the framework would be a little tricky. --Mgl 17:02, 6 January 2009 (PST)

I've spent a fair amount of time quantifying effects of tactical decisions, but it was a while ago. My recollection is that the difference between a "perfect" manager and an average one was a few games. --AED 10:02, 9 January 2009 (PST)

The Appendix - for hockey

I was reading the appendix of The Book and if you don't mind just a few quick questions, all I've done recently is basic stats in College so bear with me....I'm trying to find out true shooting percentage for hockey players. Correct me along the way if I've done anything wrong.... I took the sum of the shots of the 07-08 season which was 71503 and the shots of one player 241. I do this knowing the league average is 9.1%. In order to find the variance for the individual player's shooting %, I use the equation at the bottom of page 371. Then for the population variance I use the equations at the top of 375, since I'm only doing 1 individual shooting %....Then I'm pretty sure I use the formula in the middle of page 380 (the one for OBA), however I am not sure what I replace .22 with and that variance symbol, is that the variance of the individual or the league?

Batting Order - sims and no sims

Also one other baseball-related question, I was reading the mailbag and MGL said he thought computer simulators for lineups aren't that reliable because he said factors like baserunning aren't taken into effect. If so, wasn't a good portion of your lineup chapter based on computer simulation?

There's no question that simulators or models are the preferred method if accuracy is the ultimate goal. But, it would be a boring and unconvincing book if I ran my simulator and reported the results.

What I did instead was try to break it down into its components, so you can see the effect of things. Basically, an "all other things equal" approach, what would happen if you swap the #2 and #3, or #4 and #9 hitters. I also considered GIDP, SB, etc, to show you when they come into play, how often, and its impact.

Now, to put it all together, you can try to do it piecemeal, and you'll get the right answer most of the time. But because of the interdependency, it's best to use a simulator. The problem is that it won't be apparent why Pujols might be a better #2 or #4 hitter. You'll just get a result. It's good for eating a fish today, but you won't be able to do so tomorrow, when you ask about Chase Utley or Hanley Ramirez.

--Tangotiger 11:44, 6 January 2009 (PST)

Ditto what Tango said. While using a comprehensive sim (not, for example, a Markov sim), which takes into consideration base running, potential sac bunts and IBB's, interactive effects of all 9 players, pitching and batting "approaches" (suited to the game situation) etc., is the best way to evaluate batting orders, doing it the way we did it in the Book is more transparent, as Tango said, generally yields roughly the same results, and enables the reader to come up with optimal lineups or slots for a particular player on his/her own.

I'll give you an example of the disadvantage of the results of a sim not being transparent, and thus, not so workable. I ran in a very comprehensive sim several scenarios whereby an average pitcher batted 8th in a typical lineup. I came up with fewer runs per game than a conventional lineup (where the pitcher bats 9th of course). This is contrary to what we found in the analysis in the Book. While I am not too surprised that the two methods yielded results on different sides of a "bright line" (because it was so close either way), I have no idea why the sim came up with the results that it did (without picking it apart). In fact, it is possible that the sim was doing something "wrong" (even though generally the sim is a better way to evaluate lineups) such that its results were wrong and the results in the Book were right.

Also, theoretically you can come up with a near perfect analysis using the kind of methodology that we did in the Book as long as all of the pertinent variables are accounted for. And, again, I would trust that kind of "piecemeal" analysis more than I would a sim because of the former's transparency.

Keep in mind something that is not discussed or even mentioned too often in the sabermetric literature with respect to lineup optimization. Generally the difference between any two reasonable lineups is fairly small, on the order of less than 10 runs (1 win) per season. It is entirely possible, if not probable, that the intangibles that go along with various lineups (for example, a player or players being more comfortable in certain slots, or the opposing manager making certain mistakes or not given certain players in certain slots) might negate or at least abate whatever we come up with "on paper." In fact, Bill James says that he does not interfere in Francona's lineups for essentially the same reasons.

Of course if I were a manager or running a team in some other capacity, I would probably want to at least know the optimal lineups on paper and then go from there. So, for example, if Soriano did not care in what slot he bats, and I had no reason to think that he would do better in one slot or another, I would put him in the slot that was technically (on paper) optimal. However, if I or he thought that he was more comfortable in the leadoff spot, for whatever reason, I might leave him there even if my analysis indicated that that was not the optimal slot for him.

Of course, in practice, managers have no idea what optimal lineups are. They have certain ideas in their head that are based on conventional wisdom, more or less. While that conventional wisdom is pretty decent, we know that some of it is wrong. We also know that it is nearly impossible for a human being (without some technical analysis or help from a computer) to optimize a lineup, other than by virtue of and accounting for the intangibles mentioned above.

--Mgl 16:48, 6 January 2009 (PST)

I'll ditto mgl back. When we are talking about a 2 or 3 run difference to figure out if a guy should bat 8th or 9th, or 3rd/5th, then it's possible that the "little things" could make up for the difference, that perhaps a full sim would pick up, that little piecemeals don't.

The largest gap I have is moving the pitcher from 8th to 4th, and that difference is 0.1 runs per game (16 runs per season). That is as bad as it gets for any single one decision (though moving Rickey/Bonds to 8th/9th might be worse). Clearly, no manager will make a decision so horrible as to do something like that. We are generally talking about 2 or 3 runs. Maybe 4 or 5 for any single decision. More important than that is how a player "feels". One would hope players will be good soldiers and do what they are told. Seeing that some are not, then you defer to that.

The one spot that a manager needs to be aware of is the 2-hole. It's beyond silly to put one of your worst hitters here. It's barely tolerable that you put an average hitter here. That's the one spot that I wish some managers would simply stop thinking about "moving runners over" as the sole justification.

--Tangotiger 07:10, 9 January 2009 (PST)

Hockey consulting

I have a couple of questions about hockey consultancy. What is your opinion of the work Alan Ryder is doing for the Toronto Globe and Mail (essentially calculating win shares by player)? And, the Canucks new (ish) GM Mike Gillis stated that he was going to be a very sabermetric GM and get consultants - have you heard anything about this? I think it is high time that teams start to use stats in a more efficient manner, but hockey is a very conservative game. As well, I think you can successfully make the argument that a lot of things that are important in hockey do not show up too well on the stat sheet (take a look at the inconsistent manner in which hits are measured, for example)....Love your work on the baseball side! And, as a lifelong Mariners fan, I am glad to have you aboard (so to speak)

Alan does great work. I haven't heard any more on the Gillis situation. NHL teams have or are trying to use more stats. It's a tough situation to get into, but eventually we'll have more of an impact. Yes, there are a lot of scorer-bias issues. That's one of the fun things to worry about. And thanks for the kind words!

--Tangotiger 08:05, 9 January 2009 (PST)

Linear runs to roster construction

We discuss WAR gains in a linear fashion for a team. Player A is worth 3 WAR, add him to team X and they should, theoretically, gain 3 wins in the standings if he's replacing a 0 WAR player. Is there some non-linearity at the ends of the win spectrum for a team? That is to say if you have a 92 win (true talent) team and you add another player who is a 3 WAR upgrade at a position is there a saturation effect where teams just aren't going to win 95 games often because of luck regardless of roster construction?

WAR = (talent minus baseline) * playing time

Since playing time is finite, you don't simply "add a 3 WAR" guy to a team. You are also removing a corresponding amount of playing time from someone else (or to him for that matter).

The WAR scenario works great on average, but for specific scenarios, you need to break it back down to its components noted above.

--Tangotiger 11:04, 12 January 2009 (PST)
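
A small sketch of why "adding a 3 WAR guy" is not a simple +3, using the formula above. The units are an assumption made for illustration: talent and baseline are expressed in wins per full season, and playing time is a fraction of a full season. All player numbers are hypothetical.

```python
def war(talent, baseline, playing_time):
    """WAR = (talent minus baseline) * playing time.

    talent, baseline: wins per full season of playing time (assumed units).
    playing_time: fraction of a full season actually played.
    """
    return (talent - baseline) * playing_time


# Hypothetical: a new starter takes 90% of a season at one position.
newcomer = war(talent=3.0, baseline=0.0, playing_time=0.9)

# The incumbent (1 win per full season) drops from 90% to 20% playing time.
incumbent_before = war(talent=1.0, baseline=0.0, playing_time=0.9)
incumbent_after = war(talent=1.0, baseline=0.0, playing_time=0.2)

net_gain = newcomer + incumbent_after - incumbent_before
print(f"net gain in wins: {net_gain:.1f}")
```

Under these made-up numbers the team gains about 2 wins, not 3, because the newcomer's playing time comes out of someone who was above the baseline.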

Not sure what you mean "by luck, they are not going to win 95 games very often." Every lineup has a certain run scoring expectancy and a certain win expectancy, given a certain run allowing expectancy. How many games they actually win "around" that win expectancy given a sample of games is indeed purely random, but it is reasonably symmetrical. A roster that is "supposed" to win 95 games (that is their win expectancy) is going to win 95 games as often as a team that is "supposed" to win 80 games wins 80 games.
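
To put a rough number on that random spread, here is a sketch that treats each of the 162 games as an independent trial with a constant win probability. That binomial model is a simplifying assumption, not something stated in the answer, and the function name is made up.

```python
import math


def win_sd(expected_wins, games=162):
    """Standard deviation of season wins under a simple binomial model."""
    p = expected_wins / games
    return math.sqrt(games * p * (1 - p))


sd_95 = win_sd(95)  # true-talent 95-win roster
sd_80 = win_sd(80)  # true-talent 80-win roster

print(f"SD for a 95-win team: {sd_95:.2f}")
print(f"SD for an 80-win team: {sd_80:.2f}")
```

Under this model the spread around the expectation is nearly the same for both teams, a bit over six wins, which is consistent with the point that a 95-win roster hits 95 about as often as an 80-win roster hits 80.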

Obviously if you don't account for chance of injury properly in a team win/loss projection, a team that is projected at 95 wins is more likely to have their actual win/loss record be pushed towards 81 wins than a team that is expected to win 85 or 75 games (since injured players tend to be replaced by less than average players, on the average). But if you properly account for the chance of injury in your team win/loss projection, a 95-win projection should average around 95 wins and an 80 win team should average around 80 wins. But that is not really your question.

As Tango says, a player's WAR or WAA is generally based on his projected stats added to an average team with him in an average lineup slot. If you have a non-average team and/or you place your player in a non-typical lineup slot given his stats profile, his "custom" lwts values will change, which will change his WAR or WAA (for that team).

I have noticed a few times in playing around with sims, that when I had a team that had a lot of offense and I added a very good offensive player, that the expected run scoring/wins addition did not materialize (it came up short), based on the WAR/WAA or RAR/RAA of the player. Whether this is a general rule or not, I don't know. I do know, again, as Tango has already said, that if you want to know the true and actual run/win combination of a player to a team, you have to use customized lwts and apply them to the batter's expected hitting stats (or use a sim). Those customized weights are based on your player's slot in the lineup and the stats of all the players in the lineup, particularly those "around" and near (in the lineup) the player in question. And don't forget that traditional stats (s,d,t,hr,bb,etc.) are not the only thing a player does that affects his run/win contribution to a team. Base running, ability to hit behind runners, GDP rates, etc., all affect team run scoring.

A generic WAR or RAR usually gives us a pretty good idea as to the actual effect a player will have on a team, but it is not exact of course. It is closest to being correct/exact when that player is added to an average lineup in a slot that is typical for his stats (since that is how lwts values are figured in the first place - using an "average" team with a typical lineup). As far as your original question, though, which I think is whether a good player added to a very good team produces "diminishing returns" (less actual runs/wins than his generic WAR/RAR would suggest) as a general rule, the answer is that I don't know, but it could be.

--Mgl 18:38, 14 January 2009 (PST)

PITCHf/x database

...what I was wanting to do was create run values by count. I know there has been some discussion of this on the book blog in the past but I was wondering if you guys could provide step by step on how to accomplish this? Any help would be greatly appreciated.

Figure out how many runs are scored from a particular state to the end of the inning. You'll end up with something like this:

The value of going from one state to another is the difference between the two states. So, going from 1b,0 outs to 2b,0 outs is 1.189 - .953.

It's the same idea for counts: how many runs are scored from this particular count to the end of the PA. That's the easy way. There are medium and hard ways to do it, but that's a lengthier post.

--Tangotiger 11:12, 12 January 2009 (PST)
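
The "easy way" described above can be sketched like this: average the runs scored from each state to the end of the inning, then take differences between states. The record layout and function names are hypothetical; the two table values are the ones quoted in the answer.

```python
from collections import defaultdict


def run_expectancy(records):
    """Average runs scored from each state to the end of the inning.

    records: iterable of (state, runs_to_end_of_inning) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in records:
        totals[state] += runs
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


def run_value(table, before, after):
    """Value of moving from one state to another: the difference in expectancy."""
    return table[after] - table[before]


# The two values quoted above: runner on 1B or on 2B, 0 outs.
table = {("1B", 0): 0.953, ("2B", 0): 1.189}
advance_value = run_value(table, ("1B", 0), ("2B", 0))

print(f"1b,0 outs -> 2b,0 outs: {advance_value:.3f} runs")
```

For counts, the same `run_expectancy` function could be fed (balls, strikes) as the state and the runs scored from that count to the end of the PA as the value, which is the substitution the answer describes.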

Fielding metric - reliability

My question relates to the reliability of defensive metrics. Offensive metrics, such as batting runs, RC, etc, can be validated by seeing if they work at the team level -- or indeed, sometimes their constants are adjusted to make them work at the team and/or league level. In this sense, we know that they work at some level....Is there any analogous "calibration" for UZR or any other pbp defensive metric? Can we evaluate these metrics somehow? The only evaluations I've seen typically assume (explicitly or implicitly) that one of the systems is best (often UZR), and then look at correlations to that best system. But how do we know that UZR (or any system) works? (Other than looking at what goes into it and thinking, "Yeah, that looks right.") ... Have you guys, or anybody else to your knowledge, tried to verify UZR (or any other defensive metric) by comparing to team DER, for example? Or maybe looking at team run prevention and factoring in pitching quality to relate team defensive ability to team UZR (for example). Thanks.

I've compared UZR with my own defensive metrics, which are based on overall season stats, so are quite a bit noisier. As best as I could tell, my own data were statistically equivalent to "UZR plus noise". --AED 10:43, 14 January 2009 (PST)

Two things: One, since pitching and defense are inextricably related, you can't really do that without controlling for the pitching, which would be difficult. You could try doing something like Tango's WOWY (with and without you) and see the difference in runs allowed with a certain player on the field and when he is not on the field (with everything else being controlled). If you group all players with a UZR of say, +10 or better, +5 to +10, 0 to +5, 0 to -5, -5 to -10 (per 150 games), and less than -10, you could get some nice sample sizes and you would be able to compare the runs allowed WOWY differences for each group with that expected by their UZR's. IOW, your group of players with a +5 to +10 UZR should allow around 7 runs less per 150 games in your WOWY analysis. If not, then your UZR may be a flawed metric.

I must say, though, that we are pretty far past (in our understanding of how runs are scored and wins are tallied in baseball) having to "verify" a relatively transparent metric like UZR to "see if it works." After all, all that UZR (or any of the other advanced or even simple PBP fielding metrics) does is to tell us how many balls of a given type get fielded by a certain fielder as compared to an average fielder. If you need to "verify" that that actually has an effect on runs scored by the opposing team, well...

I mean, if I tell you that given all of the ground balls in a season (150 games) near Jeter that could possibly be fielded by any SS, the average SS would field 100 of them, and Jeter fields 90, and therefore his UZR (or whatever defensive metric) is -8 runs, do you really need to "verify" that the Yankees will actually allow 8 runs more with Jeter than with an average SS? As I said, there are ways you can do this, but I don't think it is really necessary. If you or anyone else wants to do this, be my guest. Of course, if you want to check that your methodology is actually doing what you want it to do, then it might not be a bad thing to use some kind of a WOWY.

--Mgl 18:53, 14 January 2009 (PST)

Spring Training - as a predictor

Is there any correlation in wOBA for pitchers and hitters between good or bad spring trainings and the following first week/month of the season?

I haven't studied this, but laws of probability make it impossible to infer anything significant from spring training data. There just aren't enough plate appearances to establish a player's skill level with any reliability. --AED 10:39, 14 January 2009 (PST)

There was at least one "study" that indicated that teams with certain records in ST either did better or worse than expected during the regular season. I forgot where I read that study/article. You can probably do a Google search. In other words, although the samples are small and we don't know that much about the context (the opponents, whether players are giving 100%, whether they are working on something in particular - pitchers especially, rehabbing an injury, etc.), ST stats are to some extent the same as any other small sample of stats - they have a little predictive value.

So to answer your question, ST stats have a little bearing on a player's forecast for the season, but not much due to the small sample size and the unreliability of the data due to what I mentioned in the parentheses above.

--Mgl 19:31, 14 January 2009 (PST)

Component park factors - by batted ball type

Is there a site that lists batted ball (LD, GB, FB) park factors? I saw Brian's line drives article on FanGraphs, but I know nothing about Retrosheet databases to recreate it.

Not sure what you mean by "batted ball park factors" (for example, FB doubles, GB doubles, etc., or FB percentage, LD percentage, etc.), but I have not seen anything like that anywhere. If you familiarize yourself with the retrosheet databases and how to use them, you can easily do all that kind of work yourself.

--Mgl 19:33, 14 January 2009 (PST)

FIP for College

Hey guys, how would one go about creating a FIP-like stat for pitchers in college baseball? I looked at some per game stats from the Big 12 last year: 7.53 K/9, 3.58 BB/9, 0.77 HR/9. It's a small sample size (only 1 season for one conference--5200 innings), but it's what I'd expect to be the norm for college baseball (more K, less HR). I calculated BABIP, and I was surprised to find that it was .320. I expected much higher because of poor defenses and the aluminum bat. I'm guessing this would make strikeout pitchers more valuable in college compared to contact groundball pitchers. Sorry for the rambling question, but how would a FIP for college pitchers differ from Tango's version of FIP?

I'd need to know how many runs are scored per inning.

--Tangotiger 13:57, 20 January 2009 (PST)

Value of a draft pick

Is there any way to accurately assign a dollar value to a draft pick, based on how much value the average MLB player chosen in the draft produces at the big league level? Breaking it down by draft pick number might be difficult, but how about by round or half round, or something along those lines? The question arises from trying to evaluate a free agent signing where a team loses a first round draft pick.

--Tangotiger 08:44, 26 January 2009 (PST)

No question

I don't have a question, to be honest. I just wanted to tell you that this mailbag page was, and still is, a fantastic idea and resource. Thank you to all of you for taking the time out of your day to write the responses (I think I only have one question in there, but I enjoy reading other responses as well). Keep up the great work!

Thank you...

--Tangotiger 08:39, 29 January 2009 (PST)

Opponent OBP, SLG - why not for pitchers?

I would like to know why OPS/OBA of pitchers are not popular in baseball (both in the mainstream and in the stat community). In cricket the major stat, average (runs per out), is used to evaluate both batsman and bowler. I understand that there are proxy stats like WHIP for OBA. I hate ERA, which is too complicated and is not stable over different types of pitchers (by stability I mean starting pitchers having a worse ERA than relievers), but I think it would be nice to have symmetric stats for both batters and pitchers. Also, are pitchers' OPS and OBA stable across starting and relief pitchers?

You will sometimes find "opponent OPS" (or OBP, wOBA, BA or whatever) in a pitcher's stats (sometimes you will see it as, e.g., "OPS against"), however it is just not something that the mainstream fan associates with pitchers for whatever reasons.

I don't know what you mean by "complicated" as far as ERA is concerned. You are right that in evaluating a reliever or starter or comparing relievers to starters, you need to know the average ERA for a starter and for a reliever (and in which league), but the same is true for OBP or OPS (or any other rate stat) "against." Relievers will have lower numbers "against" because relievers can pitch "better" in shorter stints.

You are also right that something like OPS or wOBA (or linear weights, BaseRuns, or Runs Created) is a better stat for pitchers than ERA, at least in terms of predictability, and at least in the shorter term, because ERA has more noise in it.

Statheads/sabermetricians use something like it all the time. For example, a component ERA (ERC) is essentially the raw stats of a pitcher's opponents (using a linear weights, BaseRuns, or RC formula) turned into an "ERA-like" number.

Keep in mind that in the long run, ERA is a perfectly good indicator of a pitcher's talent/value/effectiveness/run prevention skill (or whatever you want to call it), because in the long run the noise in ERA is greatly reduced as the magnitude of that noise (proportionally) is a function of sample size. The qualification to the above statement is that even in the long run, ERA would have to be normalized and adjusted to league average ERA, park, opponents, etc., in order to compare one pitcher's ERA to another to determine who is/was the "better" pitcher in a neutral environment or on the same "playing field." Then again, that is true for OPS, OBP, etc. "against" as well.

In addition, in ERA, there is some bias against fly ball pitchers. Ground ball pitchers will allow more errors, on the average (which makes them less valuable, everything else being equal), which makes the RA (runs allowed per 9 innings, including "unearned runs") of a ground ball pitcher higher than that of a flyball pitcher, given the same ERA, since ERA "factors out" errors.

The other "weakness" or flaw in ERA is that it is not perfectly reflective of a pitcher's overall run prevention skill because it gets "screwed up" when a pitcher enters or exits in the middle of an inning. This is why you sometimes hear that ERA is not a good stat for relievers. For example, if a pitcher comes in at the start of an inning, he will allow around .56 runs in that inning. However, if a pitcher comes in with 2 outs and no one on, he will allow around .14 runs. If he does that 3 times, he will get credited for a full inning, but allow only .42 runs. Or if a pitcher comes in with a runner on first and less than 2 outs, he will get the benefit of a possible DP, but he will not "pay the price" (in ERA) if the runner on first scores.

--Mgl 07:26, 2 February 2009 (PST)
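
To put the mid-inning example above on a per-9-innings scale, here is a small arithmetic sketch. The constant and function names are made up for illustration; the .56 and .14 figures come from the answer, and the sketch simply treats every run as counting toward the rate.

```python
RUNS_START_OF_INNING = 0.56   # expected runs when entering at the start of an inning
RUNS_TWO_OUT_EMPTY = 0.14     # expected runs when entering with 2 outs and no one on

def runs_per_nine(runs_per_inning):
    """Scale a runs-per-inning figure to the familiar per-9-innings rate."""
    return 9 * runs_per_inning

full_inning = RUNS_START_OF_INNING
three_short_stints = 3 * RUNS_TWO_OUT_EMPTY   # three one-out appearances = 1 credited inning

print(round(runs_per_nine(full_inning), 2))         # 5.04
print(round(runs_per_nine(three_short_stints), 2))  # 3.78
```

The same pitcher, with the same skill, looks more than a run per nine innings better when his innings are assembled from two-out, bases-empty entrances.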

Regression and sample size

You guys use the formula PA/(PA+x) to show how much to regress to the mean (among other things...). Is there a way to include the population size for the PA factor? A 100-player sample with an average of 400 PAs looks the same as a 10,000-player sample with 400 PAs, which obviously is a mistake.

Nope, there's no mistake. The formula we use for estimating the population takes into account the effect of population size on the overall population estimate. --AED 18:15, 8 February 2009 (PST)
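
For readers unfamiliar with the PA/(PA+x) shorthand, here is a minimal sketch of how such a weight is typically applied to a single player. The value of x, the league mean, and the player line are all hypothetical, and the sketch says nothing about how x itself is estimated from the population, which is the step the answer above refers to.

```python
def regress_to_mean(observed, pa, x, population_mean):
    """Shrink an observed rate toward the population mean.

    The weight on the player's own data is PA / (PA + x), where x is
    the number of PA at which observed and mean get equal weight.
    """
    weight = pa / (pa + x)
    return weight * observed + (1 - weight) * population_mean

# Hypothetical inputs: a .380 rate over 400 PA, x = 200, population mean .330.
estimate = regress_to_mean(0.380, 400, 200, 0.330)
print(round(estimate, 4))  # 0.3633
```
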

Runs to Wins Conversion - which to use

I see that some other sites (BYB) are using 10.1 instead of 10.5 for their runs to wins conversion due to the lower run environment. If I am using a 3 year weighted projection system like Marcels, what conversion factor would you use when calculating WAR, 10.1 or 10.5, or something else?

Just stick with 10. First off, there's no big difference between 10 and 9 or 11. Secondly, and most importantly, everyone is in the same boat. Make life easy, and use 10.

If you insist on something a bit better, runs to win = 0.75 * Park Run Environment + 3. So, if you score 8 runs per 54 outs, then the runs to win converter is 9.

--Tangotiger 08:45, 24 February 2009 (PST)
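
The rule of thumb above is easy to wrap in a helper. The function names are illustrative; the formula and the 8-runs example are from the answer.

```python
def runs_per_win(run_environment):
    """Runs-to-wins converter: 0.75 * (runs per 54 outs) + 3."""
    return 0.75 * run_environment + 3

def runs_to_wins(runs, run_environment):
    """Convert a run total to wins in the given run environment."""
    return runs / runs_per_win(run_environment)

print(runs_per_win(8))       # 9.0
print(runs_to_wins(18, 8))   # 2.0
```
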

Replacement level - where to set it

I see that Dave Cameron is using a replacement level for team wins of around 47 (not sure the exact number), and always remembered you using 50. Again, what would you use for the replacement level team wins offset? Thanks for your time!

Good question. Common wisdom is that a team filled with replacement-level players would win 30% of its games. With a 162-game schedule, that means 48.6 wins. But, this is a really rough estimate based on some of baseball's worst teams, and anybody who claims that he knows the number more accurately than about 5 wins is deluding himself. --AED 18:19, 8 February 2009 (PST)

Baserunning - converting to runs

Thanks again for the great blog guys....I know this is pretty rudimentary, but if I wanted to convert baserunning gain into runs, do I just apply the .22 stolen base weight to the net baserunning gain (i.e. a player who contributed +10 extra bases contributed +2.2 baserunning runs)?

Pretty much, yes.

--Tangotiger 08:47, 24 February 2009 (PST)

wOBA v EqA

Thoroughly enjoyed "The Book."... So what would you say are the strengths and weaknesses of wOBA compared to EqA as far as predicting runs?... Have been reading BP for awhile and have just discovered this site, so am looking forward to learning much more. ...Thank you.

The main difference is that EqA uses a somewhat arbitrary valuation of particular game outcomes. Specifically, what we find is that the average run contribution of event types, relative to getting out, is something like:

 0.72: walk
 0.90: single
 1.24: double
 1.56: triple
 1.95: home run

In the computation of EqA, the values are simply set to 1.5, 2, 3, 4, and 5, respectively. Now, this isn't completely off the wall. If you were to multiply the EqA values by 0.42, you would end up with:

 0.63: walk
 0.84: single
 1.26: double
 1.68: triple
 2.10: home run

So, the trend isn't that bad, but EqA definitely overvalues power and undervalues on base.

I don't want to sound like I'm totally against EqA. From my observations, I've found it to be a far better estimate of run production in different run-scoring environments. In other words, wOBA's values of particular events are based on a particular average scoring environment, and thus is not necessarily going to be always correct. EqA appears to be more accurate over the history of baseball. --AED 19:12, 8 February 2009 (PST)

I concur with Andy.

Note, however, that I have published wOBA weights that change with the run environment. It's a pretty simple process, and one that has been implemented at

--Tangotiger 08:48, 24 February 2009 (PST)
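
The EqA rescaling earlier in this thread can be reproduced directly. The dictionary names are illustrative; all of the numbers are the ones quoted above.

```python
# Average run contribution of each event, relative to an out (from the answer above).
EMPIRICAL = {"walk": 0.72, "single": 0.90, "double": 1.24, "triple": 1.56, "home run": 1.95}
# Fixed event values used in the computation of EqA.
EQA_RAW = {"walk": 1.5, "single": 2, "double": 3, "triple": 4, "home run": 5}
SCALE = 0.42

scaled = {event: raw * SCALE for event, raw in EQA_RAW.items()}
for event, value in scaled.items():
    diff = value - EMPIRICAL[event]
    print(f"{event:9s} scaled EqA {value:.2f}  empirical {EMPIRICAL[event]:.2f}  diff {diff:+.2f}")
```

The differences run from negative for the walk and single to positive for the triple and home run, which is the "overvalues power and undervalues on base" pattern described above.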

Weather affecting stats

We all know that weather affects the run environment in baseball, i.e., colder weather = lower scoring games, wind blowing out = higher scoring games. Do you know of any thorough studies that have looked at which batting and pitching components are affected by weather, and by how much?

I recommend the articles linked to from these threads:

And more on park impacts can be found in threads here:

--Tangotiger 07:29, 6 February 2009 (PST)

PA or games - what to use for forecasts

With respect to the above post which shows that Ichiro is the 5th best OF in WAR over the last 3 years, I'm wondering if there is a better way to evaluate value, considering so much of the offensive value is tied directly to plate appearances, something a player has no control over. If Ichiro batted 6th in the lineup and had 60-75 (pick a number) fewer PA per year, he wouldn't be in the top 5. At the same time, if Manny and a few others batted lead-off instead of 3rd-4th-5th-6th, they would move up in the rankings....Would WAR per out or WAR per PA be better to compare players in a vacuum?... (I realize Ichiro stays healthy, which is surely valuable, but I'm more concerned with the extra PA he gets purely because he's a leadoff hitter.)

I agree with your basic concern, and we'd be better off using innings.

--Tangotiger 11:51, 23 February 2009 (PST)

Adding SB, CS to wOBA

I know there are a few shortcut formulas that convert OBP & SLG to wOBA, but I was wondering if there is a way to add a shortcut component for SBs?

Fangraphs tracks it. Use +.25 for SB and -.50 for CS.

--Tangotiger 11:53, 23 February 2009 (PST)

Uncertainty in performance metrics

I was discussing the inherent error/uncertainty in the stats we use. Particularly, I was trying to put "error bars" on the WAR value on fangraphs. It seems to me that much of the work has been done to make stats that do a good job at portraying reality and should be accurate but we don't really know HOW accurate they are. Are players who have wOBA of .352 and .355 significantly different?... It seems to me that it wouldn't be that hard to calculate the inherent uncertainty in the parameters that make up wOBA. Once the uncertainty of the linear weight parameters is known, it should be pretty easy to do some fairly standard error propagation to figure out the appropriate error to assign to each player's performances....If I can recall back to my college stats class the best way to do this may be to do a population sampling method. If I understand how the linear weights are calculated, you go to Retrosheet and figure out how much a single, double, etc affects the chance of a run scoring. To figure out the uncertainty in that measurement you can use a method where you sample the whole population, so instead of using all plate appearances possible to calculate the coefficients, you randomly take a smaller subset and calculate the linear weight coefficients. Then repeat this a ton of times and you can develop a distribution for the coefficients. If you do the sampling enough you will come out with the same mean and the variance can be assumed to be the variance in your coefficients.... This same sort of procedure can probably be used to figure out the inherent error for things like FIP too. Showing the error bars for stats isn't as sexy as coming out with something new and exciting but at some point the community should start using it because knowing what differences are actually significant can be extremely important (scientist in me coming out).

Let me offer you something else to read:

--Tangotiger 08:51, 24 February 2009 (PST)

I always quote uncertainties in my projections. However, that's uncertainty in the player's current true talent level, not uncertainty in the upcoming season results. Unless you've got a tiny amount of data on a player (or, have really hosed up your projections), the projections should be effectively truth, simply due to the fact that 5 seasons (even if weighted somewhat down) of data tell a lot more than the 1 season that you're trying to project. --AED 14:52, 9 March 2009 (PDT)
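
The resampling idea raised in the question can be sketched as follows. This is not how any published linear weights were computed. It uses the standard bootstrap (resampling with replacement at full sample size) rather than the smaller random subsets the question describes, it assumes each play has already been reduced to an (event, run value) pair, and the play log is synthetic data invented purely so the example runs.

```python
import random
import statistics
from collections import defaultdict

def event_weights(plays):
    """Mean run value per event type; plays is a list of (event, run_value)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for event, value in plays:
        sums[event] += value
        counts[event] += 1
    return {event: sums[event] / counts[event] for event in sums}

def bootstrap_errors(plays, n_resamples=200, seed=0):
    """Standard deviation of each event weight across bootstrap resamples."""
    rng = random.Random(seed)
    draws = defaultdict(list)
    for _ in range(n_resamples):
        sample = rng.choices(plays, k=len(plays))  # resample with replacement
        for event, weight in event_weights(sample).items():
            draws[event].append(weight)
    return {event: statistics.stdev(values) for event, values in draws.items()}

# Synthetic play log (made-up means and spreads, for illustration only).
data_rng = random.Random(1)
plays = [("single", data_rng.gauss(0.47, 0.4)) for _ in range(2000)]
plays += [("home run", data_rng.gauss(1.40, 0.5)) for _ in range(300)]

weights = event_weights(plays)
errors = bootstrap_errors(plays)
for event in weights:
    print(f"{event}: {weights[event]:.3f} +/- {errors[event]:.3f}")
```

The rarer event ends up with the wider error bar, simply because fewer plays go into its average.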

Groundball / Flyball pitchers - range

Hey guys, I have a question about the GB/FB section of the Mano A Mano chapter of THE BOOK.... Can you please tell me what the ratios are that determine GB/Neutral/FB pitchers and hitters? ...i.e., if the MLB AVG GB/FB ratio is 1.22, what is the range for Neutral, and what exactly constitutes a GB or FB pitcher? I am using 1.08-1.39 = Neutral, >1.39 = GB pitcher, <1.08 = FB pitcher. (I'm just loosely pro-rating off Sean Forman's GO/AO rating of .83-1.08 GO/AO = Neutral to get an approximation.) Thank you very much for taking the time to respond.

Oof. I don't remember what I used. First off, I did GB divided by all BIP. So, this includes FB, LD, Pops, etc. as "air balls". I didn't just look at outs, but hits too. I would bet I took the top 25% and bottom 25% as my GB and Airball groups. I may have used top/bottom 10%. Somewhere in there.

You can get the data from or in an easily sortable format (and at Fangraphs you can get it cumulative over a 3yr span, which I would recommend).

--Tangotiger 08:38, 24 February 2009 (PST)

Balks - who controls it

I was wondering if any of you had seen any studies regarding balks, namely, who's the most responsible for the balk (hitter, baserunner, pitcher, maybe umpire?)? Also, is drawing a balk (or for the pitcher, not balking) a skill, and if so, how much year-to-year correlation is there and how would one incorporate balks into linear weights?

I have no idea. First of all, they occur so rarely, I think, that I doubt you need to worry about them. I would think that it is mostly within the control of the pitcher; however, I would also think that it is kind of a fluke thing and therefore it would not have much predictive value. Then again, when something occurs rarely, it does not have much practical predictive value anyway, because there is not much spread in "true talent" (true balk rate) among pitchers. For example, if most pitchers have no balks in any given year and some have one, where is the "spread?" I would guess that as a class, young or inexperienced pitchers probably have a higher balk rate, but I am not sure.

Again, I have not read anything about this, nor done any work on it, but I really don't think it is going to change much of anything, especially in terms of a projection. If you want to include it in a pitcher's actual value going backwards, then by all means, do so. There are all kinds of things that we include in value going backward that are mostly "luck." I would not include them in a base runner's or a batter's value, although I suppose that they could be included in the value of the SB (a good basestealer may "draw" more balks than just an ordinary runner on base). Tango does that I think. To some extent they are already included in the lwts value of getting on base.

BTW, you can answer all of your questions yourself. The value of a balk is obviously the value of a base (or bases) advancement - around .25 runs per base. The year to year correlations you can do yourself with Excel or with a simple formula like r = (n*txy - tx*ty) / sqrt((n*tx2 - tx^2) * (n*ty2 - ty^2)),

where n is the number of (x, y) pairs, tx2 is the sum of the squared x values, ty2 is the same for the y values, tx and ty are simply the x and y values summed, and txy is the sum of the x and y values multiplied together.

--Mgl 19:42, 7 March 2009 (PST)
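
The correlation formula above translates directly into code. The variable names follow the answer; the two lists of balk totals are hypothetical numbers chosen only to exercise the function.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation via the running-sums formula given above."""
    n = len(xs)
    tx, ty = sum(xs), sum(ys)
    tx2 = sum(x * x for x in xs)               # sum of squared x values
    ty2 = sum(y * y for y in ys)               # sum of squared y values
    txy = sum(x * y for x, y in zip(xs, ys))   # sum of x*y products
    return (n * txy - tx * ty) / sqrt((n * tx2 - tx ** 2) * (n * ty2 - ty ** 2))

# Hypothetical balk totals for five pitchers in two consecutive seasons.
year1 = [0, 1, 0, 2, 1]
year2 = [1, 0, 0, 1, 1]
print(round(pearson_r(year1, year2), 3))  # 0.327
```

Note that the formula is undefined (division by zero) if every pitcher has the same total in either season, which can easily happen with an event as rare as the balk.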

What are your day jobs

If you don't mind me asking, what are your day jobs? I think Tango's a programmer/maintenance guy, MGL a lawyer, and Andy a professor.... is that right?

I do not practice and in fact am not licensed (lest I ever be accused of "practicing law without a license" when I give a legal opinion). If I told you what my actual day job was, I would have to ki..... --MGL

I used to be a postdoc in astronomy (wasn't ever a professor), and still am active in astronomical research. But, if I told you what my actual day job was, I would also have to ki..... --AED

I've been doing some sort of programming, analysis or database work since I was out of diapers. I also act as a mortician when the situation calls for it. --Tangotiger 11:48, 10 March 2009 (PDT)

Different Batting Runs

I've noticed that "batting runs" in the value section at fangraphs, "batting runs" on sean s's site, and braa at stat corner all give different numbers. My understanding is that they're all based on the same lwts, and so I'm assuming that the difference is based on different park factors and league adjustments. Is there a principled way to choose among them, or is it better to take an average?

The basic linear weights will all be close. That is to say, the run value of the walk will be .15 or .16 runs less than the single, and the double will be .30 runs higher, and the HR will be fixed at 1.40, and the out will float to make sure it all adds up.

The differences will be in park adjustments, league adjustments, and whether you consider the pitcher or not. What you should do is find out how each one handles the differences, and choose the one that appeals to you. That would be the principled way, which is what you are asking. On the other hand, if you want to be lazy, just take the average. Nothing wrong with laziness, as long as you are honest about it.

--Tangotiger 10:12, 18 March 2009 (PDT)

There are other nuances that different people and different "systems" handle differently, I think. Some lump the IBB's in with the other BB's (I think that is "non-principled"). Some ignore them, and some give them a different value. It is probably most correct to give them the same weight as an average PA for that player, so you can ignore them when giving a linear weights "rate" and consider them as another PA (with the appropriate lwts/PA value) if you are giving a lwts as a "counting stat."

Some treat the HP the same as a BB (I do that). Some give the HP a slightly different value (that is fine). And some ignore the HP completely (that is "non-principled").

Some formulas treat the ROE as simply an out with a negative value (which is wrong as that is a "skill" with a positive value, if you are better than average at coaxing ROE's). Others assign a positive lwts value to them.

And, as Tango said, they all use slightly different values for each event. You should also sum everything to zero without the pitchers batting stats included, but you don't have to I guess.

The basic point is to be consistent from one player to another in order to fairly compare them (although if two systems are similar, but not exactly the same, you can still fairly compare players - but, for example, if one system included pitcher hitting when summing the league to zero, and the other one didn't, you can't fairly compare players unless you make an adjustment to one or the other) and to know exactly what they are doing when you read someone's lwts.

"Lwts" does not mean one specific thing, like BA or OBP (and even with OBP, some people include the SF in the denominator and others do not for some somewhat silly reason). So, as Tango says, it is nice when someone who presents lwts for a player tells you how it was computed, at least the basics, if not all the values. It's not like it is a big secret.

--Mgl 19:14, 27 March 2009 (PDT)

Convert OPS to runs

Is there any formula or even rule of thumb for converting hitting stats, especially OPS to runs created or total bases?

OPS is a rate metric, and RC and TB are not. At the least, you need to divide RC by outs or PA to at least get it into a rate form.

What I do is this... wins above average = (1.7*OBP+SLG-1)*PA*.025. So, a guy with a .450 OBP, .600 SLG, and 660 PA will come in at +6.0 wins above average. Use that as your basis to do what you need.

--Tangotiger 10:12, 18 March 2009 (PDT)
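
The shortcut above as a small function, checked against the example in the answer. The function name is illustrative.

```python
def wins_above_average(obp, slg, pa):
    """Quick conversion: (1.7*OBP + SLG - 1) * PA * .025."""
    return (1.7 * obp + slg - 1) * pa * 0.025

print(round(wins_above_average(0.450, 0.600, 660), 1))  # 6.0
```
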

Playing the infield in

In the Japan/USA WBC game, Japan was the home team. In the top of the 8th with 1 out, runner on 3rd and up 6-4, they played the infield in. My first reaction was "WTF?" But maybe I'm missing something and Japan has the right answer. In the broadcast, they made it seem like this was not unusual. Is Japan right or wrong?

No idea. It would seem to be wrong, and you rarely if ever see an American team play the infield up unless they are losing or the tying or go ahead runner or runners are on base, which I'm sure you realize.

So it seems like a bad play, but you never know. It is complicated to figure out, but it can be done. Obviously you increase your chances of cutting the runner off at home on a ground ball, but increase significantly the chances of the batter getting a single or reaching on an error or fielder's choice (with the runner at home being safe as well).

You say they made it seem like it was not unusual. Unusual for whom? That is certainly unusual for American ball, as I said. You'll never see that. For Japanese ball, who knows. As smart as the Japanese are, I have always assumed that they do lots of dumb things (things that are not optimal to winning) in a baseball game because of "tradition and honor," whatever that means.

Keep in mind that if the decision to play the infield up or back is close, with everything being league average, then you would have to consider lots of other nuances, like the batter's and pitcher's propensity for the ground ball, the speed of the runner at third, the fielding ability and arms of the infielders, etc.

But, as I said, I would guess that it is clear cut to play the infield back in that situation, but I am certainly not close to 100% certain without running the numbers. I think lots of people would be surprised if that were correct.

--Mgl 19:22, 27 March 2009 (PDT)


If you guys are discussing basketball papers by Wharton authors then you might want to cover the point shaving controversy between Wolfers and Bernhardt-Heston.

What, you think we don't have day jobs or something (read above)? We can't critique everything that comes across the wire!

I think that paper has been beaten down enough, hasn't it? There does not seem to be a shortage of academic research regarding sports, where the problem is that the researchers know little or nothing about the sport and they don't have much, if any, experience in doing sports analysis. That is not to say that all academic research involving sports is bad. There are plenty of good and great ones. But there are lots of stinkers too. Personally, if I am forced to make a choice between reading (and believing) some research by a sabermetrician (or whatever you want to call it in other sports) who has a real day job, and an academician, I'll take the former in a heartbeat.

--Mgl 19:28, 27 March 2009 (PDT)

We finally started a thread on this:

--Tangotiger 04:41, 31 March 2009 (PDT)

Sabermetrics in the lower leagues

With sabermetric approaches taken down the minor leagues and into college games, it seems a clever team could try to use this to help win ballgames.... If a high school team were interested in taking advantage of this untapped information, where would you suggest they start?

Do you mean using principles gleaned from sabermetrics to help college and high school (and other amateur leagues) teams win more games? Sure, why not? I am positive that there are teams that are doing that, with the thousands of high school and college teams in existence.

Not only that, but amateur teams do lots more dumb things than in MLB. Like when I played amateur ball after college, and our coach would call for a sacrifice bunt, I would usually just ignore him. Then again, fielding in amateur ball is so much worse than in pro ball, that maybe the sac bunt attempt is correct a lot!

You asked where to start, right? Hmmm, I know this book about sabermetrics and in-game strategy..I'm not sure of the name of it, but it is something like, "The Book.."

Seriously, though, teams in leagues other than MLB should know that while the principles of sabermetrics and the principles we discuss in The Book regarding in-game strategy and the like, are the same for any baseball game, team, and league, the actual numbers and hence the ultimate optimal decisions could be completely different in amateur ball than in pro ball or MLB.

--Mgl 19:35, 27 March 2009 (PDT)

GIDP rates by handedness

I'm working on some lineup optimization stuff in relation to platooning and L/R matchups. I'm thinking about GIDP-prone players. Is there a platoon split for this, as well? Are left-handed hitters more likely to ground into a double play against lefties than they are against righties? Is this also true for right-handed hitters? I haven't been able to find anything on this anywhere (I don't remember Walsh covering it). ... [Yes, I know the answers to this and other questions can be found by anyone willing to get into Retrosheet... but I don't have 15 gigs of hard drive space available at the moment!]

Great question. Hopefully, someone else can answer it. Otherwise, I'll take a look later.

--Tangotiger 06:58, 2 April 2009 (PDT)

Yup, good question! Here is your answer (in GDP per opportunity - runner on 1st, less than 2 out), for 2008 at least:

 All RHB: .111
 RHB/RHP: .114
 RHB/LHP: .107
 All LHB: .088
 LHB/LHP: .087
 LHB/RHP: .088

So, there does not look to be much of a platoon factor, and it looks like it is all about the batter, which makes sense. RHB hit into fewer DP versus LHP and LHB hit into slightly more DP versus RHP. That is probably because opp-hand pitchers tend to pitch "away" in the strike zone.

Now, we are not controlling for the G/B tendencies of RHP and LHP as a group - for example, let's say that RHP allowed more GB per PA, and thus more GDP per PA as a group. That would make it appear as if RHB had a large GDP platoon split and that LHB had a small one, which would technically be true but it would have nothing to do with anything inherent about the matchup.

BTW, that is true of regular platoon splits. Because LHP are worse pitchers overall, it makes it look like RHB have a larger platoon split than they really do (which isn't that much in the first place) and that LHB have a smaller split than they really do (which is a lot in the first place). Of course, in order to figure out how much better RHP are than LHP, one has to adjust for whom they face and the platoon splits, so, as with pitcher/catcher SB/CS, the two are intertwined and you have to do a recursive process or a with and without you in order to unwind it...

--Mgl 17:10, 5 April 2009 (PDT)

Lineups - Bunching lefties together

I loved the sections on lineups and platooning, but as far as I can tell, you didn't discuss the relative importance of avoiding two lefties in a row with respect to other lineup choices. I know it's mentioned in passing in the Blog and in the Q&A, but I just want to see if separating lefties to avoid having a left-handed reliever leveraged against your left-handed hitters means that one should avoid having lefties hit next to each other, even if their "best spot" according to the lineup optimization guide says they should hit, say, first and second, or third and fourth, or whatever. I take it that just using a pinch-hitter isn't an option since the expected pinch-hitting "penalty" cancels out the platoon benefit.

As I've said many times, I think that it always behooves managers to split up lefties in the lineup, even if they are putting out a suboptimal lineup, not considering who they might face in relief. I am not sure of that, but I am assuming what you lose on the average is more than made up for by not giving the opposing team a chance to bring in that lefty reliever to face back to back lefty batters in a high leverage situation.

As I said, that is only a guess on my part. It could be costing more to change the lineup - I don't think so, but I don't know for sure. Some mathematical analysis would have to be done. I don't think anyone has ever done that. It could also be that it is correct on the average (to split up your lefties), but that in certain cases the actual lineup order is so critical that you don't want to split them up. That would be more likely to be true if you had lots of lefties in the lineup so that your options to split them would be limited.

Why don't you do the analysis and let us know the answer!

--Mgl 16:49, 5 April 2009 (PDT)

Regression platoon splits - historically

I found the 2200 PAs for RH hitters/1000 for LH hitters thing very insightful, and in just applying generic splits to some of my lineup stuff, found that some players regressed to the average splits almost exactly. I was wondering, though, just for the sake of it, if one could do a "Marcels" just for hitters' platoon splits. In other words, take a weighted average of a hitter's platoon ratio and regress it to the mean. If so, how many PAs of the league average for that season's platoon split ratios (I think Marcels adds about 200 per season, weighted 5-4-3, so 1200 PAs) need to be added per season for each left/right handed hitter, given the 2200/1000 rule?... Does this question make sense? I can re-ask it: Basically, it seems that right-handed hitters would need to be regressed to the mean more than lefties given the 2200/1000 guideline, so when trying to get a more precise idea of their platoon splits at any point, should different numbers of league average split ratio PAs be added in for each?

Maybe Andy can expand. But if I understand correctly, you start with the league splits (say for RH, it's 15 wOBA points). You look at a player's career split (say it's 50 wOBA points based on 4400 PA). Then you weight the 15 with 2200 PA, and figure out the weighted average. Are you asking if perhaps we should weight the more recent seasons greater, like you would in a typical forecasting system? I suppose, but I would say that rather than a 5-4-3, it'd probably be a 10-9-8-7... kind of weighting.

--Tangotiger 06:58, 2 April 2009 (PDT)
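
The weighted average described in the reply above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the numbers from the example (a 50-point career split over 4400 PA, regressed toward a 15-point league split carrying 2200 PA of weight). The function name `regressed_split` is hypothetical, and the recency weighting mentioned at the end of the reply is not included.

```python
def regressed_split(observed_split, observed_pa, league_split, ballast_pa):
    """Weighted average of a player's observed platoon split and the league split.

    ballast_pa is the number of league-average PA added in
    (2200 for RH hitters and 1000 for LH hitters, per the guideline quoted above).
    """
    total_pa = observed_pa + ballast_pa
    return (observed_split * observed_pa + league_split * ballast_pa) / total_pa


# Example from the reply: RH hitter, 50 wOBA points over 4400 PA, league split 15.
estimate = regressed_split(50, 4400, 15, 2200)
print(round(estimate, 1))  # 38.3
```

With 4400 PA against 2200 PA of ballast, the player's own split gets two-thirds of the weight, so the estimate lands about two-thirds of the way from 15 to 50.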

Not anything to do with us

This may be a better question for Baseball Prospectus and Hardball Times, but I'll ask here regardless. In their team reports, they often have numerical projections that don't match with the words they write, or numerical projections that don't match with other numerical projections. Take Hardball Times for example- in their Tampa Bay section, if you add up the W-L projections for their starting 5 pitchers, it is right around .500. Yet, they project the team to win 90 games at the beginning of the section. And yet, they project the team to win 95 games at the end of the section. Completely inconsistent. Baseball Prospectus is the same way. Is it ludicrous or am I missing something?

I presume this is an issue when you have multiple people writing. Nonetheless, feel free to contact the respective editors, and hopefully they can give you a better answer.

--Tangotiger 08:10, 2 April 2009 (PDT)

Warming up - impact

How significant of an effect does "warming up" have on pitchers? I'd assume a pitcher's first batter faced will result in a higher BB rate because the pitcher is trying to get the feel of things. How much better/worse does a pitcher do against the first few batters he faces.... (Need to exclude LOOGY sort of situations somehow where a RP is brought in to face a particular batter.)

Cut off or not?

Runner on 3B tagging up on a fly to the OF, what's quicker to home plate or which scenario has a higher percentage of success to throw the runner out at home: OF > Home? ... OF > cutoff > home? ... I understand that this is all dependent on the speed of the runner, speed and accuracy of the OF's throw, the speed and accuracy of the cutoff man's throw, and if the ball takes a bounce or two before reaching the catcher in any case. ...I'm not much use because I'm not mathematically (or scientifically) inclined to do the work myself, but all I have is an idea…and a feeling that, since this is the internet after all, this idea has already been thought of, proven/disproven, and is old news.... This came to me when overhearing comparisons of Melky Cabrera and Brett Gardner. I hear Melky has the arm and Brett has the foot speed (although, I think Gardner's arm is not as bad as people say). During Friday's exhibition game against the Cubs, Brett tried to throw a runner out tagging from 3B on a fly to CF. The ball made it to Home in time even though it bounced just right before it hit Jorge's glove (the guy was called safe, I called bullshit, but whatever). This is what sparked this thought in my head. What if Brett threw the ball to the cutoff man and the cutoff man threw to home? Would the outcome be any different?... I also know that there's an amount of time that the ball is not moving towards home plate once it reaches the cutoff man's glove. But how significant is that?...My head hurts now. I'm sorry if I'm not making any sense and I'm rambling. All you guys are great. I apologize, I do not have The Book, so if this is in The Book, please show mercy on me.

I don't think the cutoff man is there to reduce the time required to get the ball home. He has two things to offer. One is accuracy. If the outfielder's throw is off by 5 yards, the cutoff man can still easily get to it and relay the ball home. However, if the throw is off by 5 yards when the catcher gets it, he's not going to make a play. So, perhaps the ball arrives a fraction of a second later, but it'll be close enough to home to make a play almost every time.

The other advantage of the cutoff man is that, if the ball will be too late to make a play, he can throw out another runner who is trying to advance. --AED 13:08, 8 April 2009 (PDT)

I have no idea which is quicker, but I assume that it is quicker to let an online throw go through rather than have the cutoff man relay the throw. I think it is a lot quicker. For someone with a ragged arm, like Damon or Pierre, maybe it is quicker to relay the throw home. I don't really know. You rarely see a player thrown out at a base on a single on a relay from the cutoff man (obviously, on an extra-base hit into the gaps or over an outfielder's head, the outfielder usually has to throw to a relay man because it is too far to a base).

The cutoff man's job is also not to cut off (relay) a throw and then to try and get the runner out. If the cutoff man cuts the throw off, it is usually to prevent the trailing runners or the batter from moving up a base on the throw. The cutoff man's job is to cut off the throw, based on verbal instructions from the fielder covering the base that the throw is going to, under one of two conditions: If there is not going to be a play on the runner anyway (he is going to be safe no matter what, or he isn't attempting to advance an extra base), or the throw is so bad that letting it go is not going to get the runner anyway. His job is NOT to relay the throw to the same base that the outfielder was throwing to in the first place. Occasionally you will see that kind of a relay on an off-line or weak throw, but it is not the norm and not the intended play. It is assumed that major league outfielders have enough of an arm that they can reach a base (on a single) much quicker on one throw than with a throw and a relay, and that is generally the case I think. In any case, how do you think that a sabermetric analysis is supposed to answer your question?

--Mgl 23:30, 8 April 2009 (PDT)

Putting money in mouths

You guys write a lot about projections and predictions. Do you use this skill and knowledge to bet on baseball games? And win?

I haven't tried combining the individual player analysis into a full team prediction on a per-game basis. So, no, I don't do team-level predictions in baseball. I do have a pretty good team-level prediction system for other sports on my website, and while I personally don't use it for betting, others tell me it's worked out well for them. --AED 13:00, 8 April 2009 (PDT)

There are lots of projections out there, either free for the taking (like Chone, Zips, or Oliver) or pretty cheap to purchase (like THT, BP, and Bill James). I am pretty sure there are lots of folks who use them for betting purposes. I don't know how they do. It is very difficult to "beat" the bookmakers, especially at baseball.

--Mgl 23:33, 8 April 2009 (PDT)

Errata Page?

I was curious if there was an errata page for The Book. I searched around on your website as well as Potomac's but I couldn't find one.

None that I'm aware of. Note that there are errors in the Potomac version that do not appear in the original edition (which we self-edited). I don't know how that happened.

--Tangotiger 08:11, 21 April 2009 (PDT)

Actually, I am glad you asked and I never really thought of that. Other than a few typos, I am aware of only one error in either version. There actually is a mistake in one of the computations in the sac bunt chapter, which changes one of the conclusions/recommendations. It is not a major deal. I forgot exactly which one it is, but it involves pitcher sac bunting, I think with runners on first and third, but I am not exactly sure. When I get a chance I'll put the correction on the blog. Thanks for the heads up!

--Mgl 10:39, 24 April 2009 (PDT)

Log5 - why that name?

Why is it called Log5? I'm just wondering why Bill James called his various predictions formulas Log5? I'm guessing that he found a logarithmic relationship somewhere (which makes sense, given that he was working with probabilities), but does anyone know the whole backstory?

No idea, but you can probably look it up somewhere online or ask The Man himself via his Bill James Online web site.

--Mgl 10:41, 24 April 2009 (PDT)

WPA by base out state

Have you ever done linear weights by base out state, but instead of calculating runs, you calculate wpa? I think it'd be interesting to see if any of the stats occur more often in situations with low or high leverage indexes (HBP in blowouts) or at times where their run value is of less importance (IBB when first is open in the 9th) which would cause their linear weights wpa to differ from a simple multiplication of their regular linear weights times a constant.

If you check The Book (which you can do for free via Amazon's Look Inside feature), Table 11, you will see the run values and win values of each event. For example, the run value of the non-intentional walk is +.32 runs and the win value is +.028. For the intentional walk, those values are +.18, +.010, respectively.

--Tangotiger 07:16, 24 April 2009 (PDT)
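
One way to read those four numbers against the question: if win values were just run values times a constant, the runs-per-win ratio would be the same for both events. A quick check in Python, using only the values quoted above (the helper name `runs_per_win` is hypothetical, and this is a sketch of the arithmetic, not anything taken from The Book):

```python
def runs_per_win(run_value, win_value):
    """Implied number of runs it takes to buy one win for a given event."""
    return run_value / win_value


# Values quoted in the reply above (run value, win value).
events = {
    "non-intentional walk": (0.32, 0.028),
    "intentional walk": (0.18, 0.010),
}

for name, (runs, wins) in events.items():
    print(f"{name}: {runs_per_win(runs, wins):.1f} runs per win")
```

The two ratios come out quite different (roughly 11 versus 18), which is the kind of gap the question was asking about: the intentional walk's win value is not a simple constant multiple of its run value.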

Platoon splits

Given that in "The Book" the threshold for platoon splits being meaningful is: 2,000 PAs vs. LHP for righties, 1,000 vs. LHP for Lefties, and 600 PAs for switch hitters (given the relative variance of platoon skill)...then why do teams consistently employ RHH like Jeff Baily to bat against LHP instead of JD Drew? Is it a rest thing? If it takes years for a RHH's platoon splits to become meaningful then why do teams need to have RHH backups who always bat against LHP? You see this everywhere. Yes, it does make traditional sense to do this but I'm wondering your thoughts.

The issue isn't that the "best guess" of a player's true platoon split is zero split until he has 2000/1000/600 plate appearances. Rather, the "best guess" of a player's true platoon split is that it's average until he has enough plate appearances to demonstrate otherwise.

Specifically in this case, suppose that you have a .280 LHH and a .270 RHH. Even in the absence of any information about the players' platoon splits, you would want the RHH to start against LHP. --AED 10:53, 21 May 2009 (PDT)

Win Probabilities

I am trying to compute the probabilities of all the possible final scores in a particular game given two pieces of data: 1. the probability of the Visitor team winning; and ... 2. the probability the game is under a set number of runs. ... Is there a formula or a program I can use for this?

The best I can offer you is the Tango Distribution, which you can get as the last two links on my home page.

--Tangotiger 06:48, 22 May 2009 (PDT)

Platoon splits - versus power pitchers

A fantasy comment I left regarding Chris Young's splits... "he struggles mightily against power pitchers (as defined by baseball reference). His split OPS+ is 58 against power, 107 vs average, and 149 vs finesse. There's some sample size issues here, but that is somewhat counteracted by visual evidence. Young has a slow bat. Just for comparison, Chipper Jones was balanced and Randy Winn has actually been better against power pitchers. Barry Bonds was balanced as well. Somebody ought to do a study and see if there's a particular type of hitter that does better against a particular type of pitcher. The league average was 98 in 2007 and 2008." ...Has there been any research on this? Is it all just variance? Young's BABIP is .226 against power pitchers, but he really does look overmatched. His overall BABIP is .277 and it wouldn't surprise me if it stayed in that area. BABIP was lower league wide against power pitchers.

I don't know of any research on this, but I doubt that you will find much of a true platoon split for batters versus power and finesse pitchers (not sure how you define those terms either). Even if you found some, most of what you see in individual players is random noise (what you call "variance") anyway. It is always a mistake to work "backwards" as you are doing. In other words, you are pointing out a player with a large "split" and wondering if it means anything. When you do that, it is easy to come up with all kinds of plausible explanations and reasons why it DOES mean something (something like "confirmation bias"). Most of the time, it does NOT mean anything and even when it does, you generally have to regress those observed splits so much (in order to estimate a "true" effect) that you are usually left with almost nothing. The better way to investigate something like that is to first use the data to determine whether there are likely any true effects, then if there are, determine how much to regress sample data based on sample size, and THEN apply that to any particular player.

--Mgl 21:05, 22 May 2009 (PDT)

Win Expectancy by count

Have any of you done/seen any work regarding WPA, run expectancy, and/or lwts for the 24 base-out states, but also including the ball-strike count (e.g. Bottom of the 9th, 2 out, down 3 runs, bases loaded, count is 0-2)? ... Just something I was pondering over tonight after Pablo Sandoval went from down 1-2 to hitting the walk-off home run against the Nats. Thanks again for your help, work, and time guys.

I have thought about it (a lot). I just don't know if I'll learn something from it, and it takes a while to set up. If I was a gambler and I needed something at the pitch level, I would definitely find the time to code for it. It's pretty straight-forward.

--Tangotiger 06:46, 22 May 2009 (PDT)

Replacement Level - NFL

How would you figure replacement level for the NFL? It can't be .300, because that would mean a 5-11 record, and multiple teams finish below that every year. And it can't be .000 (at least I think), because the Detroit Lions just finished 0-16, and they weren't filled with only replacement players--their #1 WR had 1,300 yards and 12 touchdowns! ...Or is the replacement level really .000, because the Lions' true talent level may have been 2-14 or 3-13 and they were just unlucky?

I'm not sure how I'd approach this mathematically. However, if one defines replacement level as the quality of players on the fringe of the roster -- taxi squad players, the third-string QB, etc. -- it's hard to imagine that a team composed of such players would win more than one game. --AED 11:00, 21 May 2009 (PDT)

I think this highlights the issue regarding the non-additivity of players to teams. For example, if the replacement level was the recently cut, or if it was guys in college, or guys in high school, the win % will be .000. But, we really don't care about the team win%. That's just a stand-in for what we want, which is the wins below average at each position. You could end up with say 10 or 12 or 15 wins below average in total, for each team, even though the average team wins 8 games.

Remember, we are always asking "Given that the rest of the team is average". With that provision, it makes it hard to try to separate the absolute wins (1-15, which is -7), among all the players, seeing that the rest of the team is not average. There is overlapping wins lost that are not accounted for at the team level. You can have a team of replacement players be 0-16, and then you remove a bad player and replace him with a worse one. Guess what, they are still 0-16. This crappy player's bad value simply is irrelevant to the team.

--Tangotiger 06:44, 22 May 2009 (PDT)

Log5 shortcut

I have a question about the equation used at the beginning of Chapter 7 regarding the "expected wOBA." Why is the expected wOBA = Batter wOBA + Pitcher wOBA - league average wOBA? Further, what does the "league average [wOBA] for these pools of players" actually mean?

The correct way to figure it is to use the Odds Ratio method. For example, if you are twice as successful as the league average (2-1, or win% .667), and you are facing an opponent that is one-third as successful as the league average (1-3, or .250), the odds are that you will be six times as successful as your opponent (6-1, or .857). A quick shorthand for this is .667 + .750 - .500 = .917, where .750 is one minus the opponent's .250 and .500 is the league average. It obviously doesn't work as well at the very extremes, but it works quite well with players and teams in MLB.

--Tangotiger 06:38, 22 May 2009 (PDT)
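
For anyone who wants to see how far apart the two methods are, here is a small Python sketch comparing the Odds Ratio method with the additive shorthand. The function names are hypothetical, and it treats every input as a rate between 0 and 1; applying the same form to wOBA, which is not strictly a probability, would be a further assumption.

```python
def to_odds(p):
    """Convert a rate (0 < p < 1) to odds, e.g. .667 -> 2.0."""
    return p / (1.0 - p)


def from_odds(odds):
    """Convert odds back to a rate, e.g. 6.0 -> .857."""
    return odds / (1.0 + odds)


def odds_ratio_expected(own_rate, opp_rate_allowed, league_rate):
    """Odds Ratio method: multiply the two odds, divide by the league odds."""
    odds = to_odds(own_rate) * to_odds(opp_rate_allowed) / to_odds(league_rate)
    return from_odds(odds)


def additive_shortcut(own_rate, opp_rate_allowed, league_rate):
    """Shorthand: own rate + rate the opponent allows - league rate."""
    return own_rate + opp_rate_allowed - league_rate


# The extreme example from the reply: .667 team vs .250 opponent (who allows .750).
print(round(odds_ratio_expected(2 / 3, 0.75, 0.5), 3))  # 0.857
print(round(additive_shortcut(2 / 3, 0.75, 0.5), 3))    # 0.917

# A milder matchup, closer to typical MLB spreads: the two nearly agree.
print(round(odds_ratio_expected(0.55, 0.55, 0.5), 3))
print(round(additive_shortcut(0.55, 0.55, 0.5), 3))
```

The first pair shows the gap at the extremes (.857 versus .917); the second pair shows why the shorthand is good enough for most real matchups.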

Regression to what mean?

Why always "to the mean"? Is it because it is both easy and good enough, or is there something I am missing?... Let me start by saying that I totally (I think) get why you would add X incidents of league average performance to Y incidents of actual performance to come up with "true talent" if actual performance was 100% of the information you have. ...But, for example, if you actually knew the true talent, would you not add billions of incidents of that, rather than any number of incidents of league average. (In, fact, of course, if you could actually know true talent, adding any number of actual performance data would only reduce your accuracy. But ignore that for now.) ...If that is true, then don't we always have more information than just actual performance (such as minor league or other league performance) that would suggest regressing to something other than league average?

Yes, one should regress to the mean of the player sample that you know the player belongs to, without considering his performance. At its most basic, this is the mean of players good enough to play major league ball. One can further divide by position, handedness, and size, all of which are readily available. If you had scouting reports, you would regress to the mean of players with comparable scouting reports. The more non-performance data you have, the better a "mean" you have and the more strongly you regress to it.

As you say, if you actually knew the true talent level, you would have the best possible mean and would regress 100% to it.--AED 08:23, 1 June 2009 (PDT)

How much of a skill is hitting doubles?

Is hitting doubles a separate, repeatable skill? Or is it simply a function of how often a batter hits safely? Put another way (and backing out park factors), does the ratio 2B/H tend to systematically vary among batters, or are differences from league averages more likely random variation (plus park factors)? Is the answer different if you use 2B/(H-HR)?

I can't speak for 2B/H, but I do use 2B/(H-HR) ratios in my projections. And yes, it is repeatable, to about the same amount that hits on ball in play are repeatable.

Keep in mind that there are varying degrees of repeatability, and we can quantify the repeatability by stating how many "average" outcomes should be added to a player's stats when regressing to the mean. --AED 08:31, 1 June 2009 (PDT)

Yes, everything is a "skill" and is "repeatable" theoretically from 0 to 100%, depending upon the sample size. The fact that we have a sample of performance for EVERYTHING means that there is SOME random element to it. There are some things that can be "sampled" once or twice and you would be very close to the true mean, but in sports, not much. (For example, measuring height only once would put you very close, but not perfectly, because of measurement error and perhaps diurnal changes in height, to a person's true height.)

Certainly doubles rate, either per PA, or per whatever you want, has a large element of skill. A double is a reflection of basic hitting skill plus power and/or speed. You make contact a lot, you will tend to get your share of doubles. You are fast, you will tend to get your share of doubles. And if you can hit the ball hard and far, you will tend to get your share of doubles. Right?

One of the goals in projection methodologies is to choose the right denominators for these things. For pitchers at least (I am not sure about hitters), I have found that you can first lump in doubles, triples, and HR's and then look at doubles and triples (the same thing for pitchers) per extra base hit (including the HR). I find that works best. Other people like to first do HR per BIP and then look at doubles and triples per non HR BIP. There are other people who look at HR per fly ball (for pitchers at least), so they first project a pitcher's fly ball rate (per BF I guess).

I am not saying that it is a matter of personal taste. Probably one method is better than another, but it may be hard to figure out which one IS better than the other. Sometimes when it is difficult to figure out the best way of doing something and/or more than one method likely works almost as well as another, you might as well choose the method you are most comfortable or familiar with. I hope that helps. --Mgl 00:33, 2 June 2009 (PDT)

Success above the breakeven point

If the break-even SB% for 2nd base is ~66, does that mean the optimal SB% is higher than 66%? If a runner succeeds on 2 out of 3 attempts, he's created 0 net runs, right? So if he were able to increase his success rate by attempting fewer steals, he would increase his contribution to the team, right? Am I missing something?

Just considering the effect on the base/run situation, you're right. The break-even rate isn't always the same (it depends on the game situation), but a player should only attempt a steal if the odds of stealing successfully -- i.e., given the pitcher and catcher in question -- exceed the break-even rate. Thus, if players are stealing when they have a 66% chance or higher, but never under 66%, one should see overall base stealing rates well above that value.

That said, there is some benefit to the batting team to have the pitcher altering his delivery, the first baseman altering his positioning, and the occasional pitchout. So, if the success likelihood is 65% for the runner, and because of the above logic he will never steal, and the defense KNOWS he will never steal, then the defense will adjust accordingly. Likewise, if the probability is 67% and the defense knows he will steal, it'll be a pitchout and easy out at second. So, there's a strong element of game theory here that forces you to attempt the steal once in a while with a 50% success chance, and occasionally not go for it with a 75% chance. --AED 07:09, 22 June 2009 (PDT)
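
The break-even logic in this thread can be sketched in Python. The run values below (+0.2 for a steal, -0.4 for a caught stealing) are placeholders chosen only because they produce a break-even near the ~66% mentioned in the question; they are not taken from The Book, and all function names are hypothetical. The game-theory effects described in the last reply are not modeled.

```python
def break_even_rate(run_gain_sb, run_cost_cs):
    """Success rate at which expected gain equals expected cost."""
    return run_cost_cs / (run_gain_sb + run_cost_cs)


def net_runs(attempts, success_rate, run_gain_sb, run_cost_cs):
    """Expected net runs from a number of attempts at a given success rate."""
    per_attempt = success_rate * run_gain_sb - (1 - success_rate) * run_cost_cs
    return attempts * per_attempt


def observed_rate(success_probs, threshold):
    """Average success rate if the runner only goes when his chance is >= threshold."""
    taken = [p for p in success_probs if p >= threshold]
    return sum(taken) / len(taken) if taken else None


# ASSUMED run values, for illustration only.
GAIN, COST = 0.2, 0.4

print(round(break_even_rate(GAIN, COST), 3))      # 0.667
print(round(net_runs(30, 2 / 3, GAIN, COST), 3))  # about 0: break-even creates nothing
print(round(net_runs(30, 0.8, GAIN, COST), 3))    # positive at an 80% success rate

# A runner who only attempts when his chance is at least break-even
# ends up with an overall rate well above break-even.
chances = [0.5, 0.6, 0.7, 0.8, 0.9]
print(round(observed_rate(chances, break_even_rate(GAIN, COST)), 3))  # 0.8
```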


What career/season records are relevant to hold, in your opinion?

Just so you don't think all of us are ignoring your question, I have no interest whatsoever in career or season records. I can't speak for Andy or Tango of course.

--Mgl 13:50, 28 June 2009 (PDT)

Data - bunting

major league hitter's percentage of putting a ball in play by way of the bunt

Your question is probably answered in The Book, but I don't know what you are asking. What is the denominator? "Attempts" at a bunt per pitch, including foul balls and missed bunts for a strike? You really have to be quite a bit more specific for anyone to answer that question, plus if the answer is in The Book, you can look through it on Amazon or shell out the 15 bucks or so...

--Mgl 13:53, 28 June 2009 (PDT)

WAR on a scouting scale

You see on a lot of message boards and blogs the misperception that the sabermetricians are speaking a different language than the scouts. It's my belief though that they're not. ...I could be wrong, but if you look at the 20/80 scouting scale, it's actually scaled the same way as, say, WAR or VORP or whatever replacement level valuation metric one chooses to utilize. 40 is basically replacement level on the scouting scale. 50 is average. 60 is a star. 70 is an annual all star. 80 is a superstar MVP HOF kind of guy. Isn't WAR saying the same exact thing, except saying that every 10 points overall above or below 40 on the scouting scale is worth 2 WAR or 20 or so runs? ...Wouldn't the calculation then be 40 + (WAR/2) ?... That of course is imperfect because it doesn't break things down into their individual scouting and run valuation components, but I'm sure you fellas can figure that out way better than me. Wouldn't you then just need to figure out what the run difference is between a replacement level baserunner and the best baserunner, between a replacement level hitter and the best hitter, between a replacement level slugger and the best slugger, between a replacement level fielder and the best fielder, between a replacement level outfield arm and the best outfield arm, a replacement level walker and the best walker, and so on, and just convert that somehow to the MLB 20/80 scouting scale to quantify what kind of WAR the scouts are projecting these guys to have when they are drafted or are in the minors and how those projections performed? Maybe this has already been done by someone, but it would be interesting and fun to see the results in a readily available and accessible place. ...Hope that made sense.

Check out for an example of that.

--Tangotiger 07:41, 24 July 2009 (PDT)


If using ERA for a pitcher is flawed, then why is it OK to use runs allowed in the Pythagorean theorem? Why not use, then, wRC and wRC allowed in place of RS and RA? (I suppose you couldn't use two different estimators, like wRC and FIP or tRA runs.)

Not sure what you mean. Using pythag for what? If you know a team's RS and RA, pythag generally gives you a better estimate of their true talent than their actual w/l record. If you want to estimate a pitcher's expected w/l record going forward or his "true" w/l record, I suppose that using FIP or ERC or DIPS ERA or some such thing in a pythag formula is better than using his ERA. Still not sure what you mean though. You'll have to explain more. --Mgl 21:23, 24 July 2009 (PDT)

I'll add a little bit. You're right, it would be better to evaluate the entire defense based on wOBA. However, for an entire team's statistics, the difference between RS/RA and the wOBA-based numbers will be pretty small. So, it's not that bad to use RS/RA. For just the innings pitched for a single pitcher, the random variations are bigger and thus RA is a much less accurate estimate. --AED 21:50, 25 July 2009 (PDT)
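
For reference, the classic form of the Pythagorean estimate discussed in this thread can be written as a short Python function. The exponent of 2 is the traditional choice (other values, such as 1.83, are also commonly used), and the 800/700 run totals are made-up inputs for illustration. Substituting an estimator such as wRC, FIP-based runs, or wOBA-based runs for actual RS/RA, as the thread suggests, just means passing different numbers in.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed over a 162-game season.
pct = pythag_win_pct(800, 700)
print(round(pct, 3))        # 0.566
print(round(pct * 162, 1))  # expected wins over 162 games
```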

Two-strike hitters

I’m just a baseball Dad, but I try to follow some of the SABR stuff. ... Have you ever heard of anybody tracking some statistical measure of a hitter's ability to survive a 2-strike count and not make an out? ...For instance, it could be a simple ratio - the percentage of time a hitter gets two strikes and doesn't make an out. ...I wonder what the major league average is? Is there a large variance between different types of hitters? Are some hitters freakishly able to get two strikes and not make an out? Is this statistically independent from just measuring the strikeout ratio? That is, does it tell us anything that strikeout ratio doesn't? ...I’ve heard “He’s a good two strike hitter,” but I wonder if it is true – are some players actually consistently better than others with two strikes? It seems that this would be easier to answer than the murky question of whether there are clutch hitters, since there are no definitional problems. ... I've never heard reference to anything like this statistic, and was wondering if you were aware of anyone who has analyzed it to see if it is related to creating runs in a way that other batting statistics don’t reveal.... Just wondering. Would appreciate any input you could give me.

This chart tells you how players perform at various counts:

Look at the "Through 0-2" line in the second chart. That tells you how well players do. As for players having a "freakish" ability, here's Wade Boggs: &t=b#count

Look at the "After 0-2", he had a .258 batting average. I don't know what qualified as a freakish ability, but I presume that there is no such thing that we can measure.

--Tangotiger 07:41, 24 July 2009 (PDT)

Thanks for the response. I would imagine that Boggs' 0-2 count average of .258 is pretty extraordinary. To your knowledge has anyone ever tried to analyze whether some batters are better on two strike counts than others, and what this particular skill might add to team offense? ...So that would include all two-strike counts. I guess "freakish ability" would just be in comparison with other players. My question, I guess, is whether the standard deviation for 2-strike survival is small or large. Are some hitters really statistically much better two strike hitters than others, or is this an illusion? And how does 2-strike survival relate to other stats? ...Know if anyone has done any analysis of this kind of thing? ... It does look like the data is readily available, if what you have linked for Boggs is available for others.

I have no doubt that ability to hit with 2 strikes (of course there is BA, there is OBP, there is SA, and there is a combination like OPS or some such overall metric) is a talent and that the spread in talent among major league hitters is fairly significant. However, I am not sure what the practical significance is, unless you have a batter with an 0-2 count and he gets hurt and you have to put up a pinch hitter (I am being facetious). Other than that, a batter's value to his team is based on his overall offensive talent. If a player is projected at a .900 OPS (which is very good of course), it doesn't matter if he is good or bad at a 2 strike count.--Mgl 21:27, 24 July 2009 (PDT)

Weighting for binomial purposes

In one post, you mentioned you use the weighting:... "weight for hitting = .9994^daysAgo, weight for pitching = .9990^daysAgo" ... If you do this, how do you treat the number of PA's for the purpose of calculating the standard error of the estimate. Do you simply weight the number of PA's each day by the same weight, and then use the weighted sum as your value for "N" in the calculation of standard error?

Yes! --Tangotiger 07:41, 24 July 2009 (PDT)
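
A minimal Python sketch of what that "Yes" implies, assuming the binomial standard error sqrt(p(1-p)/N) is the one meant in the question. The function names and the sample data are hypothetical; the .9994 decay is the hitting weight quoted above (.9990 would be used for pitching).

```python
import math


def weighted_rate(daily, decay=0.9994):
    """daily: list of (days_ago, pa, successes) tuples.

    Each day's PA and successes get the same weight decay**days_ago.
    Returns (weighted success rate, weighted PA total used as N).
    """
    n = sum(pa * decay ** days_ago for days_ago, pa, _ in daily)
    x = sum(succ * decay ** days_ago for days_ago, _, succ in daily)
    return x / n, n


def binomial_se(p, n):
    """Standard error of a rate p estimated from n (weighted) trials."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical log: (days ago, PA that day, times on base that day).
log = [(1, 4, 2), (2, 5, 1), (365, 4, 1), (730, 4, 2)]
rate, n_eff = weighted_rate(log)
print(round(rate, 3), round(n_eff, 2), round(binomial_se(rate, n_eff), 3))
```

Because older days count for less, the weighted N is always smaller than the raw PA total, so the standard error comes out a bit larger than an unweighted calculation would give.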

Setting up the post-AllStar Break rotation

Why do teams not start their best pitchers after the All-Star break and use the three/four days off as in-between start rest?

I don't know. Answering a question like that is not in my job description!--Mgl 21:28, 24 July 2009 (PDT)

Weighting UZR

What kind of regression technique would you use on UZR? 50% regression after X many innings of defense, and how would you weight a player's defensive history? i.e., 5/4/3, 4.5/4/3.5? Or something different?

I add 400 balls in play of league average performance. That would mean from 80 games for a SS to 200 games for a 1B. I use these BIP per game: 5 SS, 2B; 4 3B, CF; 3 LF, RF; 2 1B

--Tangotiger 07:41, 24 July 2009 (PDT)

I would probably weight the same as I would for offense, although that is just a guess. 5/4/3 is close enough (around 80% "discount" per season). --Mgl 21:30, 24 July 2009 (PDT)
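
The "add 400 balls in play" rule can be sketched as follows. The 400 BIP and the BIP-per-game figures are taken from Tango's answer above; the function names and the +10 run example are hypothetical.

```python
# Ballast of league-average balls in play, and BIP per game by position,
# as given in the answer above.
BALLAST_BIP = 400
BIP_PER_GAME = {"SS": 5, "2B": 5, "3B": 4, "CF": 4, "LF": 3, "RF": 3, "1B": 2}

def games_for_half_regression(position):
    """Games at which the player's own BIP equal the 400 added BIP."""
    return BALLAST_BIP / BIP_PER_GAME[position]

def regressed_runs(observed_runs, games, position):
    """Shrink observed runs above average toward zero (league average)
    by mixing in 400 BIP of average performance."""
    bip = games * BIP_PER_GAME[position]
    return observed_runs * bip / (bip + BALLAST_BIP)

print(games_for_half_regression("SS"))   # 80.0
print(games_for_half_regression("1B"))   # 200.0
print(regressed_runs(10, 80, "SS"))      # 5.0
```

A shortstop who is +10 runs over 80 games has 400 BIP of his own, so half of the +10 is kept.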

More positional scarcity

Within a position's population, should offensive skillsets be valued over defensive skillsets due to scarcity? If player A is worth 2 wins on offense and 0 on defense and player B is the reverse, should we prefer player A because it's harder to find offense-oriented shortstops? (This is based off of an ongoing conversation here: but that's become something of a muddled mess.)

I don't know what you mean by "valued" but I think the answer is "no" regardless of what you mean. A player's overall value is his value, period. If I have 1 SS who is 2 wins on offense and zero on defense and 100 who are 2 wins on defense and 0 on offense, all 101 are essentially the same player and I am going to pay them exactly the same and not prefer one or the other (ignoring things like aging curves and the like). Scarcity of players at a particular position drives prices up because of bidding wars. But if all the players at a position are the same overall value, no one should care what the breakdown is.--Mgl 21:34, 24 July 2009 (PDT)

Top 10 Saber books

Could you do a top 10 list of must-read baseball books or articles? I have read The Book, Between the Numbers, The Numbers Game, and Moneyball and am still interested in more. Thanks.

Off the top of my head, gotta add Hidden Game of Baseball, Diamond Appraised, and any of the Bill James Abstracts as well as some of his full-length books. There are tons of great articles - I wouldn't know where to begin. Try starting with Tango's personal site, --Mgl 21:40, 24 July 2009 (PDT)

Fielding - Scouting or Talent?

How much of good defense in baseball is based on scouting and positioning of the players as opposed to the players' abilities? If a team is really good at positioning a player due to the pitchers'/hitters' tendencies (and much better than other teams for some reason), then maybe the good defensive stats of the fielders may be a credit to the team preparation rather than the players' true abilities. If that same player is then traded to another team that is not exceptional at positioning players, maybe he would just be an ordinary fielder?

I really have no idea and it could be. Certainly positioning is part of a player's skill set, although I am not sure how much latitude a coach or manager gives to a player in terms of positioning. It was said that Ripken was great at positioning and that is why he was such a good fielder. He sure wasn't fast, which is generally a requisite quality for a good SS.

My guess is that actual fielding talent excluding positioning is 80% of the package. But I am by no means sure about that.--Mgl 21:43, 24 July 2009 (PDT)

Different training programs

Dice-K says: "The only reason why I managed to win games during the first and second years (in the U.S.) was because I used the savings of the shoulder I built up in Japan. Since I came to the Major Leagues, I couldn't train in my own way, so now I've lost all those savings." ...Any idea what he means by "savings of the shoulder"? I've never heard of this term before. I think this quote was translated from Japanese to English, so something may have been lost in translation.

I don't know why you are asking us. I do not speak Japanese and I have no training in "savings." Seriously, I assume he means that because of his training program in Japan, whatever that was, he was able to achieve and maintain a healthy shoulder and one that could withstand the rigors of pitching even without that same training program, for a while (2 years). But that without that same training program eventually his "savings would run out" meaning that his shoulder would not be able to take the abuse (and it is abuse, according to the medical guys) of pitching after some period of time.

I assume that he is pissed off because he initially told Boston what kind of training program he wanted and needed and that they told him something like, "We do it differently over here," or they just ignored him. I would find it hard to believe that after 2 years he would say to the Red Sox, "Oh yeah, I forgot to tell you, I like to train this way.." But who knows? --Mgl 21:09, 20 August 2009 (PDT)

Where to find data

Casey Fien came into his major league debut the other day with the bases loaded. I don't like to give other people work, so can you point me in the direction of a tool that might tell me how common this is?

More to fielding than UZR

I've got a defensive statistics question. Watching Mark Teixeira play every day, I feel like he's got to be a better defender than his -.6 UZR tells us. And it's not like Cano (.7) is fielding everything in sight. One thing I do see is that he is sometimes tentative in going to his right, which may hurt his range rating. However, I think Tex is one of the best in baseball at picks (you could say that an MLB first baseman should be able to pick everything, but I think Giambi would argue with you on that) and he's also great at staying on the bag on some extreme stretches. Are any of these talents, which I would say save about 2-3 baserunners a game if not converted, recorded or factored into defensive metrics? Thanks for the time, keep up the great posts!

2-3 baserunners a game? What games are you watching? 2-3 baserunners a year maybe. Read my article on first basemen scoops on Fangraphs. My research and that of others suggests that a good scooping first baseman saves maybe 2-3 runs a year. I'll grant you 5, but no more than that. 5 runs is about 6 "plays" made that would have been errors. So you must be watching alternative universe baseball where every throw is in the dirt. What actually happens in OUR universe is that maybe once every two games or so (I am guessing) an infielder throws a ball in the dirt. Of that, the average first baseman catches it maybe 70% of the time. A good one catches it maybe 80% of the time. So that is an extra .1 saves 75 times a year, or 7.5 saves or around 6 runs. (Numbers are for illustration purposes only.)
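
The back-of-the-envelope numbers above can be written out as a short sketch. The 75 throws, the 70% and 80% scoop rates, and the "5 runs is about 6 plays" conversion are the illustrative figures from the paragraph above; the function name is hypothetical.

```python
def extra_scoop_value(throws_in_dirt, avg_rate, good_rate, runs_per_save):
    """Extra throws saved by a good scooper over an average one,
    and what those saves are worth in runs."""
    extra_saves = throws_in_dirt * (good_rate - avg_rate)
    return extra_saves, extra_saves * runs_per_save

# "5 runs is about 6 plays" implies roughly 5/6 of a run per save.
saves, runs = extra_scoop_value(75, 0.70, 0.80, 5 / 6)
print(round(saves, 1), round(runs, 2))
```

With these inputs the good first baseman saves about 7.5 extra throws a year, worth a little over 6 runs.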

The number can't be much higher than that. Think about it this way. A team only makes around 100 errors a year. Only around 40 of those are throwing errors. Some of those are balls that cannot be caught. How many do you think are balls in the dirt (or high throws which the first baseman can possibly "catch and tag")? Maybe 10? 15? 20? So if there are only 10 errors per year, how many total throws in the dirt can there be per year? 15? That is one every 10 games and not every other game which is what I originally said and was probably too high. If there were one ball thrown in the dirt per game and an average first baseman fielded it 70% of the time, there would be 50 throwing errors on balls in the dirt plus another 50 or so on balls which were air mailed or whatever. That's 100 errors on throws. No team makes 100 infield throwing errors.

As far as Teixeira's -.8 runs in UZR, "whoop de do!" Anything can happen in 120 games. Just as Teixeira batted lousy for the first month of the season, or at least his batting stats were lousy (maybe he hit a lot of hard balls that were caught and a lot of fly balls on the warning track and didn't actually "hit" lousy), even though he is an excellent hitter, so can his fielding stats be lousy even though he is an excellent fielder, which he likely is.

Maybe he actually fielded very well in those 120 games but UZR did not reflect that for HIM. How about the other 50 or 60 first basemen in the league? Did you know that even the greatest metric in the world will not accurately reflect actual performance in SOME percentage of cases, depending on the sample size of that performance? If you did, then you would immediately say, "Well [insert stat of choice] has to screw up on someone (probably more than one) in 120 games! Why not Teixeira?" You (the fans and media) get to choose anyone you want. That's like shooting fish in a barrel. It is almost guaranteed that there will be a player or two (or three) that EVERY stat messes up on in less than an infinite number of games. If that is evidence that there is something wrong with the stat then every stat will be automatically certified as BAD, because we can always find a player or two like that, especially if we limit our sample to 120 games or less!

Did you know that UZR rates Teixeira as one of the better first basemen in his career? Did you know that NO first baseman, no matter how good he is, is going to be more than 5-10 runs better than average, because first basemen get fewer than 2 chances a game to field a ball and most of those are routine? If you did, then you would realize that a good first baseman can just as easily be -.8 runs in 120 games as +4 runs in 120 games, and 4 runs in 120 games would be an excellent first baseman.

One final thing. Maybe Teixeira did actually perform around average in those 120 games (I am not saying that he did - he could easily have performed above average and a great stat would not capture that because he is only one of 50 or so first basemen and a few of them will not be captured very well by that stat). Did you know that the eye and mind combined are terrible at quantifying things? That is one reason why we have stats. And did you know that when we observe things, we often fool ourselves into thinking that we saw something that wasn't there based on our pre-conceived notions (like Teixeira is an excellent first baseman, so surely I saw him perform excellently in those 120 games)?

Nothing personal, guy...

--Mgl 21:31, 20 August 2009 (PDT)

I really wanted to sign Tango's name to this... LOL!

Betting using BaseRuns

I don't know if you have seen this, but the Covers betting site has a forum that includes System and Strategies. One person has started a thread where he uses the Base Runs formula to pick games and totals. He's up to close to 300 units and takes well over half an hour to "cap" each game. You may have to run through a couple of pages to get the full flavor but I thought it was very interesting. It's working even if the idea may sound counterintuitive based on the small sample sizes. He's not the best writer, and he seems to be saying Smith invented RC when we know that's not the case. He also seems to be linking you to BsR. I will be buying your book. I came across the subject of sabermetrics one day in the Wayne County, North Carolina library when I saw "The Hidden Game of Baseball." That led me to Bill James, and I've been hooked ever since. I do American Legion games but enjoy reading all of the work you guys do.

Hey, whatever works. Thanks for buying The Book. We hope you enjoy it. --Mgl 21:39, 20 August 2009 (PDT)

Run on defense same as on offense?

Let's say you have two league average players who play the same position, equal in terms of WAR. Let's say "Player A" is ten runs above average defensively, ten runs below average offensively. "Player B" is ten runs below average defensively, ten runs above average offensively. Would Player A be more valuable to a team with a positive run differential? The team would have the same run differential with both players, but Player A would cause fewer total runs, thus a better expected win-loss record (because the run differential is more significant when there are fewer RS/RA). Does my logic make any sense here?

Yes, you are correct although the difference is VERY small for one individual player. The nice thing about defense is that you can leverage it. If you have a contact, ground ball pitcher on the mound, put in your defensive wizard infielder. You might double his defensive contribution as compared to a fly ball, high K pitcher. Have a high K, fly ball pitcher on the mound? Put in that high offense lousy fielding infielder. --Mgl 21:43, 20 August 2009 (PDT)
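
To put a rough number on "VERY small", here is a sketch that assumes Bill James's Pythagorean expectation with an exponent of 2 as the runs-to-wins converter. The answer above does not name a formula, so that choice is an assumption, and the 800 RS / 700 RA team is hypothetical.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage (exponent 2 assumed)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team: 800 runs scored, 700 allowed before adding either player.
base_rs, base_ra = 800, 700

# Player A: -10 on offense, +10 on defense (fewer runs both ways).
pct_a = pythag_win_pct(base_rs - 10, base_ra - 10)
# Player B: +10 on offense, -10 on defense (more runs both ways).
pct_b = pythag_win_pct(base_rs + 10, base_ra + 10)

extra_wins = (pct_a - pct_b) * 162
print(round(pct_a, 4), round(pct_b, 4), round(extra_wins, 2))
```

Under these assumptions both teams are +100 in run differential, and Player A's team comes out ahead by roughly a quarter of a win over a full season.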

Authors working in MLB

Tango, I am SO HAPPY that you are working for the Mariners. Stay with them, please.

Cool, it's working out fine.

--Tangotiger 09:06, 20 August 2009 (PDT)

Slumps and streaks

Usually we ignore when players go through slumps saying it's just bad luck (and sometimes it is). Same thing can be said for hot streaks. While some of it is just luck, most people who play sports know that you do get hot or cold. ...Using statistics can we come up with some way to tell the difference between lucky streaks and hot streaks? Can we predict players who are more likely to be streaky vs. lucky? ... For example, Player A's true talent is 1 HR per 50AB. For the first 2 months of the season he hits 1 HR per 25 AB. His true talent is still 1 HR per 50 AB if he plays for an infinite amount of time as you average his hot and cold streaks. Since he is hitting the ball well it makes sense to say for the next x months he will likely hit between 1HR/20AB and 1HR/50AB (outperform his true talent). ...I know over a large number of ABs he will perform at his true talent level but is there some "hot streak" factor that is visible in the numbers which helps us predict player's performance over the short term?

"I know over a large number of ABs he will perform at his true talent level but is there some "hot streak" factor that is visible in the numbers which helps us predict player's performance over the short term?"


"Since he is hitting the ball well it makes sense to say for the next x months he will likely hit between 1HR/20AB and 1HR/50AB (outperform his true talent)."

The research which we have done strongly suggests (that is a euphemism for "we are pretty darn sure but not 100% positive") that over the next few anything (game, week, month) his HR rate will be exactly what his previous weighted career average, regressed toward some mean, says it will be, regardless of whether he has been hot, cold, or lukewarm at any point in the past. IOW, hot and cold streaks, for hitters at least, have little or no predictive value. Obviously any hot or cold streak changes a player's weighted historical average, but not usually by much unless the history is small as compared to the size of the streak. And even if it is, you are going to regress the numbers in the streak a lot if you don't have much more data than that. For example, if a player has been hot for the first month of the season, say 1 HR every 20 PA, and that is all the data we have, then even though his average so far is 1 HR per 20 PA, after regressing that A LOT toward the league average (we don't know anything else about the guy like his size) of 1 HR per 38 PA, we are going to project him to hit maybe 1 HR per 33 PA for the near future, or something like that. --Mgl 21:50, 20 August 2009 (PDT)
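
The closing example can be sketched as a simple regression toward the league mean. The 1-per-20 and 1-per-38 rates come from the answer above; the 100 PA sample and the 500 PA of league-average "ballast" are assumptions picked only so that the result lands near the 1-per-33 figure mentioned. They are not MGL's actual constants.

```python
def regress_rate(observed_rate, sample_pa, league_rate, ballast_pa):
    """Blend an observed rate with the league rate by adding
    ballast_pa plate appearances of league-average performance."""
    weight = sample_pa / (sample_pa + ballast_pa)
    return weight * observed_rate + (1 - weight) * league_rate

observed = 1 / 20   # hot first month: 1 HR per 20 PA
league = 1 / 38     # league average: 1 HR per 38 PA

# Assumed: about 100 PA in the month, regressed with 500 PA of ballast.
projected = regress_rate(observed, 100, league, 500)
pa_per_hr = 1 / projected
print(round(pa_per_hr, 1))
```

Only about one sixth of the hot month survives the regression, which is why the projection sits much closer to the league rate than to the streak.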

F/X data

I'm not sure if this is meant for the mailbag, but I was looking for your help. I'm looking to do my own in-home analysis with some friends, and am looking for the best way to get data. Is there an easy/organized way to get play by play CSV or XML data? And the same goes for Pitch f/x. Lastly, I was wondering, regarding the hit f/x, where I can get the data that's been released so far, and if there are plans to release it in a continuous or periodic fashion, where that can be acquired.

The play by play data can be gotten from in CSV format at least. There are lots of other sites where you can download player, team, and league data from, like and fangraphs.com, among many others.

Pitch f/x data I am not familiar with, but I think that MLB lets you download it somewhere in XML format. Tango can probably tell you more or you can search on this site ( for threads that link to the pitch f/x data sites. You can probably Google it as well. --Mgl 21:54, 20 August 2009 (PDT)

Yes, you can download pitch f/x data from MLB. For a single game, I've heard it's pretty easy. Building an entire database, I've heard, is quite difficult. Try this link: --AED 11:57, 25 August 2009 (PDT)

Replacement level and WPA

I was recently working on a WPA-based metric for relievers set above a replacement level and adjusted for a lineup's offensive ability. My question is on the adjustment to replacement level. Is it correct to take win expectancy as the expectancy of an average team facing the current base/out state and facing another average team? If this is the case, would the average team's chances of winning, including the replacement level reliever, be at a discount akin to Winexp*(.47/.5), given that the replacement level reliever is a 47% WP% pitcher? Or am I missing the boat here entirely?

The playoff sweet spot

In a recent blog post, you wrote: ..."Nate had a very good article a few years ago that showed the change in playoff odds, and what constitutes the sweet spot. Just last week, I posted a similarly-themed article." ... I could not find this article you wrote. Could you send me a link please?

UZR park adjustments

After reading the Dewan-chat thread, I have some questions about UZR park adjustments and Coors Field. How does UZR handle zones for a large park like Coors--are the outermost OF zones “simply” expanded outward to account for the added OF dimensions? At what point is the PF applied--after the results for the zones are summed for a player? Are the Coors OF PF’s still .93, .91, .91 left to right, as they were in the 2003 Primer article? Did the infield PF change post-humidor (was .97)? ... Finally, has MGL ever posted home/road breakdowns? I'm curious to see if Rockies OF's show big splits despite what look like substantial park factors. I suspect Brad Hawpe really is that bad (plus/minus and the fans scouting reports seem to agree), but Rockies fans seem to think UZR is not accounting enough for park, at least for RF and CF. Thank you.

Park factors are done rather simplistically for UZR, but I think they work pretty well. Not nearly perfect and there are better ways to do it, but I can't spend all my time improving UZR - seriously. I've never made a dime from them, so I don't really have any incentive.

For the OF, each park has a number for LF, CF, and RF, which is what the "catch rate" of every "bucket" in that section of the OF gets divided by in order to park adjust it. For example, left field at Fenway is something like .84. So whatever a player's catch rate is in each bucket, it gets divided by .84 to make it better than it is. I know that sounds pretty crude but it works pretty well in the long run. It is of course part of the measurement error in UZR as well as bias I would think. For Coors, I think that all 3 outfield sections are like .95 to .97. I don't treat any one zone or bucket differently than another.

To say in general that a park or section of an OF in a park is not being adjusted enough is an argument or criticism that won't fly. It might not work that well for a particular player in a particular time period, but over the long haul and across all batted balls it works just fine because those "adjustment coefficients" are based on actual data. The reason that Fenway left field is .84 is because if we compare all players at Fenway in LF over the last 10 years to all players in all other parks, balls are caught in LF at Fenway at a rate of around 84% of that in other parks (that is not exactly correct as I actually regress the sample park rates). So .84 as a park adjustment for Fenway LF is not going to be too much or too little. As I said, it might not work too well for a certain player or team or pitching staff, but like any other imperfect metric it works pretty well in the long haul and I can live with it in the short term.
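
As a rough illustration of the division described above: the .84 for Fenway LF is the only factor taken from the text, while the bucket names, the raw catch rates, and the function name are hypothetical. The regression of the sample park rates mentioned above is not modeled here.

```python
# Park factor by (park, outfield section). Only Fenway LF is quoted above.
PARK_FACTORS = {("Fenway", "LF"): 0.84}

def park_adjust(bucket_catch_rates, park, section):
    """Divide the catch rate in every bucket by the same section-wide factor.
    Parks without a listed factor are treated as neutral (1.0)."""
    factor = PARK_FACTORS.get((park, section), 1.0)
    return {bucket: rate / factor for bucket, rate in bucket_catch_rates.items()}

# Hypothetical raw catch rates for three LF buckets at Fenway.
raw = {"shallow": 0.63, "medium": 0.42, "deep": 0.21}
adjusted = park_adjust(raw, "Fenway", "LF")
print({bucket: round(rate, 3) for bucket, rate in adjusted.items()})
```

Every bucket in the section is scaled by the same number, which matches the note above that no zone or bucket is treated differently from another.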

And yes, I have adjusted the Coors park factors for the advent of the humidor and then the super-humidor.

But, to be honest, Coors is the toughest field to do adjustments on because of the altitude, the size of the park, the humidor and the different effect that it has on the home and road players. At one time, before the humidor, the home players seemed to have an enormous advantage in fielding (UZR). That seemed to have disappeared with the advent of the humidor. Plus, the hangover effect for hitters and pitchers also seems to have disappeared or at least gotten smaller. I have never looked into a hangover effect for fielding (UZR) but I would not be surprised if there was one, at least before the humidor.

So I would take the UZR's for Rockies' players with more of a grain of salt than with other players. Or at least add more parts scouting. One way to mitigate the problem of park adjustments of course, is to look at road stats only (knowing that they will be a little worse) or if you can, road stats plus 1/14 or 1/16 of home stats (so that everyone has a smattering of data from every park, even though that is impossible because of the imbalanced schedule). That is more reliable of course, if you have lots of data. Again, if there is or was a significant hangover effect on defense for Rockies players, then looking at road stats only is going to be problematic too.
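
One way to read "road stats plus 1/14 of home stats" is sketched below. The per-150-games scaling, the function name, and the sample splits are assumptions for illustration; only the 1/14 (or 1/16) weight comes from the text.

```python
def blended_uzr_per_150(road_runs, road_games, home_runs, home_games, n_parks=14):
    """Combine all road data with 1/n_parks of the home data,
    then express the result as runs per 150 games."""
    w = 1 / n_parks
    runs = road_runs + w * home_runs
    games = road_games + w * home_games
    return 150 * runs / games

# Hypothetical Rockies outfielder: +2 runs in 75 road games, +12 in 75 home games.
blend = blended_uzr_per_150(2, 75, 12, 75)
road_only = 150 * 2 / 75
print(round(road_only, 2), round(blend, 2))
```

The home park ends up counting about as much as any single road park, so the blended figure stays close to the road-only number.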

Why do you think I often just remove all Coors field data when I am doing certain research? Curse that field (the city actually).

And I don't think I have ever published home/road splits across the board. I could easily do that for FanGraphs, but to be honest, I don't want to as people would make of them WAY more than they should. --Mgl 21:47, 25 August 2009 (PDT)

Choice of analysis

Hey guys, I'm working my way through the book, and I have a few questions about your analysis choices:...When doing the analysis on hot and cold streaks, you discover that there is "residual hotness," but that it is so minimal, that it's not a significant decision-making factor. But thinking about it, I'm not even sure that there is residual hotness. Assume that you have a 6 day hot streak. This is really considered by you guys to be two 5 day hot streaks: 1-2-3-4-5, and 2-3-4-5-6 (by day). Then, considering day 6, that's going to, by definition, have a high wOBA, because it was part of our assumption in defining the second hot streak. So if we consider it as the day after our first hot streak, we're in a way double counting, I would think. Could this possibly account for the platoon split? Or is my logic on this completely wrong?...Secondly, I was wondering with Gonzo's wOBA in '02 in the chapter Mano a Mano: you're always comparing his performance to previous years', but there's never any talk about comparison to his career norm. In fact, you never do this with any of the hitters. Sure, their performance decreases, but is there any time they're still hitting above their average in that timespan (where you can define average in a number of ways, but maybe best as their average performance in the 3-5 surrounding years)?...Third, you use wERA and wOBA for splits...I was wondering if there's been any recent work with using FIP or tRA on these analyses again?...Lastly, with all the prior analyses to where you introduce pull-rate, well, you don't use percentages. Does percentage change rather than difference affect the streaks, platoon effects, or pitcher vs. hitter (specific) matchups?...Anyway, great book so far guys; I look forward to your responses.

Measurement error of UZR

I am not sure if this is the appropriate contact method for this, so I apologize if I'm misusing your contact form, or if this has been addressed on the site and I've missed it. It seems the measurement error of UZR (and other statistics) has been discussed a lot lately, so I was thinking about it today. If one of the sources of error is that not all balls in a particular zone are equal (i.e. balls on one edge of the zone may be harder to field than on the other), would something like the following help cut down on that at all:...Divide the field into x number of zones and compute UZR (basically what is already done). Then, redraw the zones so they are similarly sized and shaped, but so the boundaries are shifted (say a foot to the left or right, for example). So now two balls that were in the same zone originally may be in different zones if they were on opposite edges of the original zone. Calculate a new UZR with the new zones. Repeat several times and then average the different UZRs calculated for each player using the different zone maps. My thinking is that if you could redraw the zones in several ways, it would cover a wider range of the possible values of a ball hit to a particular location. If you draw the zones one way, a particular batted ball might have one value, while if you draw the zones another way, it might have a slightly different value. Since whichever zone map you could choose is largely arbitrary, neither one should be more right than the other in a general sense. So by taking the value of each batted ball using several different zone maps and averaging them, it sort of acts as a pseudo-smoothing function that nudges the value of balls at the edges of the current zones up or down depending on whether they tend to fall into harder to field or easier to field zones when you redraw the boundaries....I don't know if this is feasible or if it would even end up helping any, but I thought the idea seemed interesting enough to run it by you.

Sounds like a pretty good idea. I'd have to think about it some more, but a "smoothing function" would definitely be helpful in reducing the measurement error inherent in UZR. The SAFE system by Shane Jensen presumably does an excellent job of that. --Mgl 21:25, 9 October 2009 (PDT)
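
A toy, one-dimensional version of the reader's proposal is sketched below. It is not UZR or SAFE: the batted-ball locations, the zone width, and the shifts are made up, and the "value" of a ball is reduced to the out rate of the zone it falls in.

```python
from collections import defaultdict
from math import floor

def zone_out_rates(balls, width, shift):
    """Out rate per zone when zone edges sit at shift + k * width.
    balls is a list of (location, out) pairs with out equal to 0 or 1."""
    outs = defaultdict(int)
    total = defaultdict(int)
    for x, out in balls:
        zone = floor((x - shift) / width)
        total[zone] += 1
        outs[zone] += out
    return {zone: outs[zone] / total[zone] for zone in total}

def smoothed_out_rate(balls, x, width, shifts):
    """Average the zone out rate at location x over several shifted zone maps."""
    rates = []
    for shift in shifts:
        by_zone = zone_out_rates(balls, width, shift)
        zone = floor((x - shift) / width)
        if zone in by_zone:
            rates.append(by_zone[zone])
    return sum(rates) / len(rates) if rates else None

# Made-up data: balls get harder to catch the farther they are from the fielder.
balls = [(0, 1), (2, 1), (4, 1), (6, 1), (8, 0),
         (10, 1), (12, 0), (14, 0), (16, 0), (18, 0)]

one_map_near = smoothed_out_rate(balls, 1, 10, [0])
one_map_far = smoothed_out_rate(balls, 9, 10, [0])
avg_near = smoothed_out_rate(balls, 1, 10, [0, 2, 4, 6, 8])
avg_far = smoothed_out_rate(balls, 9, 10, [0, 2, 4, 6, 8])
print(one_map_near, one_map_far, round(avg_near, 2), round(avg_far, 2))
```

With a single zone map, a ball at 1 and a ball at 9 sit in the same zone and get the same out rate. Averaged over the shifted maps, the ball near the far edge gets a clearly lower rate, which is the smoothing effect the question describes.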

UZR primer tips

I am trying to sum up some "tips" for understanding how to use UZR, especially since it is now readily available at Fangraphs, and could use your help....1) If one were trying to get a rough projection/true talent estimate for fielding, but the player is a rookie or is new at a position (like Zobrist), is there a general amount of regression to add? ...That is, do you have suggestions for "X" values to add to the equation regression = x / (x + "chances")?...It looks as if Colin Wyers came up with values of 228 for OF "observations" and 190 for IF "observations" (I assume we could substitute total chances for "observations"). Do these constants seem right? (Wyers's article: )...Is it right to be regressing by chances/observations rather than defensive innings?... 2) Then there's the issue of how to use UZR in value (MVP-type) discussions, esp. since UZR is now readily available at Fangraphs and built in to their WAR totals....I know there was a long thread about this in June ("UZR in MSM"), and Rally and MGL disagreed about methods. Since a season's UZR value doesn't necessarily mean a player had a bad/good year with the glove, or saved/cost the X runs, only that he had a bad/good UZR, would a decent compromise be to regress the single/partial season's worth of UZR data if trying to incorporate UZR into MVP/all-star type discussions? ...It seems a bit much, for example, to assume Ben Zobrist is a top 5 player in WAR, when so much of his value is UZR runs based on a partial year's worth of data, and we have almost no data on Zobrist at 2nd prior to this season. And yet, we do want to find some way to credit Zobrist's defensive contributions, aside from the positional adjustment....Thanks again.

1) Those numbers sound fine to me. We don't know the exact ones - we can only approximate based on whatever set of data we used to come up with them. Of course you want to use "chances" and not innings. What if one position got 10 chances per inning and another got 1 chance every 10 innings. Would you want to weight the samples by innings? Also, if a player changes positions, you can use his UZR at one position to help with your assessment at another position, using the positional adjustments. For example, if a player has 3 full seasons at CF and has a -10 UZR per 150 and then moves to LF and in one season has a UZR of +10 per 150, I don't think you want to ignore the -10, do you? Not so easy (or reliable) doing that when moving from IF to OF or vice versa, but you want to do something along those lines (combining UZR's from different positions using positional adjustments).

2) Yes, even in value/MVP discussions, you want to regress a player's sample (e.g. one year) UZR. You want to regress less than if you were estimating true talent or doing a projection (how much less, I don't know) and you want to regress towards his true talent number. For example, let's say that a player has a weighted and regressed 5-year UZR of +10 (very good defender). And let's say that he is +20 this year (150 games). For a value/MVP type discussion, I might regress that +20 25% toward his +10, or something like that. It is an interesting concept that most people do not understand, because they don't do that for offense. They figure whatever a player's offense is, it is, for value discussions. Not so for UZR, because UZR is only an estimate of what actually happened on the field, whereas offensive statistics are more or less a photograph of what actually happened on the field, at least as far as the final result (single, walk, out, etc.) is concerned. --Mgl 21:35, 9 October 2009 (PDT)
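
Both parts of the answer can be sketched in a few lines. The constants 228 and 190 are the values the question attributes to Colin Wyers, and the +20 / +10 / 25% figures are MGL's own illustration above. The function names are hypothetical, and regressing toward 0 (league average) in the first helper is an assumption.

```python
# Regression constants quoted in the question (attributed to Colin Wyers).
REGRESSION_X = {"OF": 228, "IF": 190}

def regression_fraction(chances, group):
    """Share of the estimate that comes from the mean: x / (x + chances)."""
    x = REGRESSION_X[group]
    return x / (x + chances)

def true_talent_estimate(observed, chances, group, mean=0.0):
    """Part 1: regress an observed UZR rate toward the mean by chances."""
    r = regression_fraction(chances, group)
    return r * mean + (1 - r) * observed

def regress_for_value(season_uzr, talent_uzr, fraction):
    """Part 2: pull a single-season UZR part of the way toward true talent."""
    return season_uzr + fraction * (talent_uzr - season_uzr)

print(regression_fraction(228, "OF"))      # 0.5
print(regress_for_value(20, 10, 0.25))     # 17.5
```

In the value case the +20 season is credited as +17.5, a lighter touch than a true-talent estimate would apply to the same data.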

Current event data

I'm a Retrosheet newbie and was wondering what researchers do for play-by-play data during the current season (before the most recent Retrosheet update is produced). Is there a preliminary play-by-play database coded like Retrosheet available anywhere? Thanks

Not that I am aware of, that is publicly available. I think that some of them use the Gameday data but I'm not sure. People that work for some of the web sites like Fangraphs, THT, or BP, probably have access to the BIS or STATS PBP data during the season. --Mgl 21:39, 9 October 2009 (PDT)

Coors park

Not a question, just a thank you for your mailbag answer to the question of Coors field park effects and UZR. Yes, Coors field is a tough park to deal with, I don't envy the extra work that it must cause. I'm amazed at how it affects play (like strikeouts), even post-humidor (and super humidor). Given that the Denver Post's main beat writer still thinks Barry Larkin robbed Dante Bichette of the '95 MVP, I think plenty of Rockies fans are still working through Coors field park effects 101.

You know, we had a pretty good handle on the effects of Coors Field until they started that humidor and super-humidor thing. Now it will take us another 10 years to figure out the effect of that! --Mgl 22:22, 9 October 2009 (PDT)

Sabermetrics intro

As an 8th grade teacher, I have to teach an elective class about something I am interested in. I have decided to teach a course about popular baseball misconceptions and objective analysis. I enjoyed reading "The Book", but it's probably a little deep for 8th graders. I want to help them understand basic concepts like "RBI and W-L record should not determine the MVP and Cy Young." I wonder if you have any suggestions....Any advice would be greatly appreciated by me and my students.

I think it is fantastic that you are doing this. I can't really give you any practical advice, other than what I always say: The value in teaching sabermetrics to young people is in encouraging and teaching them how to be critical thinkers, how to embrace the scientific method when examining issues that interest them, and in not being blithering idiots like the so-called analysts they see and hear on TV! Seriously, what I mean by the last part is teaching them not to accept at face value what they hear and read from the various media outlets, and that the so-called "experts and analysts" are not necessarily what they purport and others make them out to be. How to do that is your job! Good luck! --Mgl 22:20, 9 October 2009 (PDT)

WAR totals

If one were to total the WAR of all of the players on a team or in a league in a given year, would those numbers a) track with the teams' records or b) be constant-ish (for league) year to year?

WAR is fixed every year at about 1000 WAR per year.

--Tangotiger 07:47, 9 October 2009 (PDT)

More WAR totals

Piggybacking on a previously submitted question on WAR (if summing for team or league would track wins for the team or approximate a constant for a league). If the answer to that is 'no', that the total WAR varies, perhaps wildly, annually, would a %WAR stat be more meaningful? That is, if Pujols' 8 WAR came in a year that only had 80 WAR for the NL, would that be a more impressive season than an 8 WAR in the context of 100 league WAR?

I am not a big WAR person, but it depends on what you mean by "impressive" and it depends on some other assumptions. There is always the issue of whether one should normalize individual stats to a league level (for that year or for several years). Let's say that a league gets worse one year for whatever reason. Most players will look better that year as opposed to other years. Should they get credit for that? I don't know. Let's say that the ball changes one year or the weather is hotter or cooler for the entire season. Players' raw stats will look better or worse across the board but won't change if you normalize them to league levels. Personally, I like to normalize stats to league levels for one or for several years. But that is just a personal preference. If you are pretty sure that the environment has changed you probably want to do that. If not, maybe you want to just look at raw stats or perhaps normalize stats (or use as your baseline for things like lwts or WAR) to a 5 or 10 year period. Then you have the issue of whether you want to compare or normalize individual player stats to just their league (NL or AL) or both leagues combined. I don't think there is an easy answer to your question and you would probably have to narrow the focus of the question or define the terms (like "impressive") more precisely. --Mgl 21:47, 9 October 2009 (PDT)

Batting Order - table 52

Hello everyone and thanks for the work you do, I have a question regarding the batting (dis)order chapter of the Book. Specifically, table 52 on batting order run values. Is this chart a good way to optimize a lineup in the NL or does it under-emphasize the pitcher batting 9th? If the latter, how would you adjust the table to be more accurate for NL clubs.

Baseball Academy

The Kansas City Royals Baseball Academy which taught great athletes to play baseball produced Frank White, UL Washington, Rodney Scott and Ron Washington plus 10 other players that made it to the major leagues. In all, 14 out of 77 players made it to the big leagues, an 18% success rate which is probably equal to 77 second-round draft picks.... Ewing Kauffman spent roughly $3.6 million on the project over three years. Assuming 3.5% inflation I think that comes to about $15,000,000 over 40 years which is the price for Stephen Strasburg.... The project was shelved because it cost too much money in the minds of the Royals minor league system.... What would be the modern day career value of Frank White, UL Washington, Ron Washington and Rodney Scott +10 cup of coffee major leaguers in terms of WAR, and shouldn't the Kansas City Royals be rushing to spend $15 million to restart the project in hopes of repeating this success? ... What would your advice be to an owner with this crazy idea based on this one documented case?

I don't know, but one of the problems with your logic is assuming that the value of all those players to the Royals was a direct result of the academy. I doubt that is the case. How much of it was a result of the academy, I have no idea. --Mgl 21:51, 9 October 2009 (PDT)

Smaller ballparks

Recently, the idea that ballparks have gotten smaller was ridiculed on this site. I had believed this to be true but realized it wasn't after a little thought. However, it does seem that the ball seems to "carry" better in many newer parks than in old parks. ... For example, the old Arlington park was once one of the best pitcher's parks in baseball and the new one is one of the best hitter's parks even though it is larger. The Astrodome was a great pitcher's park and Enron/Minute Maid is a good place to hit homers, old and new Comiskey play differently, old and new Yankee Stadium are reported to be different. Plus you have Coors and the Arizona parks where the ball flies pretty good. The Royals Stadium is easier to hit home runs in after the remodel. In almost every case, changes seem to favor the hitter even if unintentionally. The only reversals I can think of are Seattle and Detroit. And Detroit modified its park to make it more homer friendly.... Is it fair to say that parks are more home run/hitter friendly even if they have not become smaller?

"Is it fair to say that parks are more home run/hitter friendly even if they have not become smaller?"

I don't think so. I think that I looked at park run factors over the last 20 years or so and found that they have not gotten any larger. (Obviously every year the average park factor is 1.00 by definition, so you have to adjust for that.) Before that, I don't know. Obviously the question depends on the time frame. Are parks now (in 2009) more run friendly than parks 10 years ago? 30? 50? 70? Those questions might have different answers.

Sure, parks can get bigger but more run friendly and vice versa. When you hear from announcers that "parks have gotten smaller" I assume they mean that either runs or HR have increased and not necessarily that parks have gotten "smaller." I also assume that they have no idea whether that is true or not as is the case with most things that announcers say. And again, it depends on what time period they are talking about.

What has muddied the waters is three things: One, the strike zone changed sometime in the 70's or 80's such that the high strike was not called anymore. That produced more run scoring of course. Two, the ball likely changed in 93 and/or 94, which also produced more run scoring. And three, of course, there is the steroid issue, which may have caused more run scoring. So obviously there was more run scoring starting sometime in the 70's or 80's I think, and spiking in 93 and 94. Lots of people just assume that part of that increase in run scoring is "smaller parks." That may or may not be true. I don't think it is true at least in the last 20 years or so. Before that, I don't know.

You can do some research. When a new park enters the league (or an existing park changes), if it has a PF greater than 1.0, then the parks as a whole get "smaller." If a new park has a PF of less than 1.0, then the league as a whole now has "larger" parks. Go back in history and net out the PF's for parks that enter the league and parks that leave the league. You should get some idea as to whether parks have gotten "smaller" or "larger" (not in size but in run scoring). If you want to talk about size, you simply go back in history and add up the square footage of all the parks in the various eras and that will answer your question. Do you want us to do ALL the work for you? --Mgl 22:02, 9 October 2009 (PDT)

Total Zone

Lately I've been taking a look at Total Zone numbers for the 80s and even earlier. They seem to be exactly what I would expect in most cases with a few surprises. But I wonder where these numbers come from. There were no zones being accounted for as far as I know in those days. There are even TZ numbers for Babe Ruth. What the heck?... I know you guys are not involved with TZ, but I'm sure you can provide some insight.

I guess that's an unfortunate mislabeling. He uses as much data as Retrosheet provides, and does his best with that information. He might "infer zones" in order to get the data to work with his system.

--Tangotiger 07:45, 9 October 2009 (PDT)

Fans Scouting Report - position-weightings

For the 2007 Fan Scouting Report, you released a relative weighting, which I have been using to convert FSR results into estimated runs. With the new format for the 2009 Report, is it still viable to use these weights if the 2009 Report's scale is reconverted to the 0-100 scale, or will you be releasing a weighting that's appropriate only for the new 2009 format?

A few days ago, I updated the average weighting on the 2009 report. If you look at the position pages, you will see the weightings are different from the team and "all" pages. The former are position-specific, while the latter are position-neutral.

When I close off balloting, when the World Series ends, I will present the results just like I have in the past years. For now, the real-time reports are done as simple as possible.

--Tangotiger 07:44, 9 October 2009 (PDT)

Trading a GM for a player

I have a hypothetical situation I've been pondering. Since you're much smarter than I, I was hoping you could take a crack at it.... I'm a Twins fan, and a bit frustrated with Bill Smith's performance as a GM so far (mainly the Matt Garza deal). Watching the Red Sox - Angels game tonight, I couldn't help wishing that Theo Epstein was our GM. So it got me thinking, what would I trade to have Theo be the GM of the Twins? Is there a way to quantify who would be the best player a team should give up to acquire a GM? Thanks.

I created a thread:

--Tangotiger 08:02, 9 October 2009 (PDT)

We have a pretty good idea of the value of players. We have very little idea as to the value of a GM. And the value of a GM probably depends a lot on the context. One GM might be good on a team with a high payroll and another GM might be good on a team with a low payroll. So my answer is, "Your guess is as good as mine." Obviously Epstein and his staff are very smart and know a lot about sabermetric principles. I have no idea what Bill Smith's strengths and weaknesses are.

If someone held a gun to my head, I'd say that the difference between a good and bad GM is worth 10 mil a year, maybe more.

Adjusting OF Arms

Is there a positional difficulty adjustment for OF throwing arm?

I'm sure there is, and I think I looked at it once. Probably no more than 1 or 2 runs. I can't believe it would be more than that. Tangotiger

Hitting your forecast

Question about projections. If a player is projected to have a true talent of 10 HR and he hits 10 HR, shouldn't his next season projection be 10 HR? Aside from an age adjustment, I can't see why this wouldn't be. However, if you are to use a weighted average of three years, this won't always be the case. For example, using a 3/2/1 weight, a player who went 10/10/10 is going to project the same as someone who went 20/0/0. If they both hit 10, then the first guy is going to project for 10 the next year, while the second is going to project for 11.67. That would lead me to believe that the second guy has a higher true talent level than the first guy. For the second guy, he needs to project and hit 13.33 to have the projections line up. So, do the projections project 10 for both players or would there be a difference?

This is only because you are actually doing 3/2/1/zero. In your illustration, you are dropping off the last season when a new season comes along. For example, suppose you use weights of 1, .70, .49, .343. If he goes 22/0/0, then his forecast is 10 HR. If he hits 10 HR (now he's 10/22/0/0), then his new forecast is STILL 10 HR. Try it out! Tangotiger
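
The arithmetic in both the question and the answer can be tried out with a short sketch. The function name weighted_forecast is only illustrative, and the sketch deliberately leaves out regression to the mean and age adjustments, which a real projection system would include. Seasons are listed most recent first.

```python
def weighted_forecast(seasons, weights):
    """Weighted average of past seasons, most recent season first.

    Only as many weights as there are seasons are used.
    """
    used = weights[:len(seasons)]
    return sum(w * x for w, x in zip(used, seasons)) / sum(used)

# The 3/2/1 scheme from the question: the oldest season falls off each year.
print(round(weighted_forecast([10, 10, 10], [3, 2, 1]), 2))  # 10.0
print(round(weighted_forecast([20, 0, 0], [3, 2, 1]), 2))    # 10.0
print(round(weighted_forecast([10, 20, 0], [3, 2, 1]), 2))   # 11.67

# Geometric weights (each year back counts 70% as much), with no season dropped.
geometric = [1, 0.70, 0.49, 0.343]
print(round(weighted_forecast([22, 0, 0], geometric), 2))      # 10.05
print(round(weighted_forecast([10, 22, 0, 0], geometric), 2))  # 10.03
```

With the geometric weights the forecast stays at about 10 after the player hits 10, because the old 22 HR season is discounted rather than pushed out of the window.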

Constant Parks against floating parks

After reading MGL's assessment of park factors in the Q&A this month, I thought that a fairly simple and effective way of making an assessment of whether parks are now more hitter friendly or not would be to look at the comparative advantages of Fenway and Wrigley Field from the early 80s and then look at today's values. Most parks have either been replaced or remodeled since that time so this seemed to me to be a great way of comparing. ... Back in the early 80s, Bill James assessed Wrigley Field as improving run scoring by 22% and Fenway Park in the high teens. I went to ESPN and looked at the PFs for the last nine years for the two parks. The averages for the parks was 1.06 for Wrigley and 1.09 for Fenway. Amazingly, Fenway now seems to decrease home runs more than the league average. ... My takeaway from this exercise is that parks are much more home run and hitter friendly than 20 and 30 years ago. So much so that Wrigley and Fenway no longer stand out as extreme hitter's parks as they used to. Good or bad assessment?

Good, if Wrigley and Fenway didn't change, and if the wind patterns in Chicago are constant. None of that is true. Probably the Twins' Metrodome is the one that is the most constant, so perhaps that is the baseline to be used. But that's a lot to pin your hopes on, just one park. Tough call. Tangotiger

Right. For one thing, Fenway (apparently) changed dramatically when they added the luxury boxes in the 80's (I don't know when exactly). And Wrigley is very unreliable (because of the wind), not to mention the fact that they had construction the last 5 or 6 years as well.

Why not just look at each park that was added or subtracted over the years? It's not perfect, but pretty close to it I would think. --Mgl 18:38, 17 December 2009 (PST)


Jeff Francoeur's career wOBA over about 3,000 career PAs is .317. Over his last 308 PAs, since joining the Mets, his wOBA was .350. Two hypotheticals: (1) the .350 was just a random, small sample fluctuation and his real talent remains in the range of .317; or (2) the shift from hometown Atlanta to New York had some psychological effect that has resulted in an increase in true talent level. For purposes of evaluating the comparative likelihood of each of these hypotheticals being correct, it would presumably be useful to know the numerical probability of the random occurrence of a .350 wOBA over 308 PAs for a player with a true talent of .317 wOBA. What's your estimate of such probability, and how do you calculate it?

Treat wOBA as a binomial, which it is designed to imitate, even though it isn't, and use the binomial probability formula:

That number is going to be exceedingly small, of course, since you are computing the probability of a player with a true p of .317 (or whatever) hitting EXACTLY .350, rather than .349, .348, .317, .300, .355, etc.
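
A minimal sketch of that binomial calculation, assuming (as the answer says, only approximately) that each PA is a success or failure with success rate equal to the true wOBA. The function name binomial_pmf is illustrative. The "at least" total is included alongside the exact probability because the exact count is just one of many possible outcomes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 308, 0.317
k = round(0.350 * n)  # 108 "successes" is the closest count to a .350 rate

exact = binomial_pmf(k, n, p)
at_least = sum(binomial_pmf(j, n, p) for j in range(k, n + 1))

print(f"P(exactly {k} of {n}) = {exact:.4f}")
print(f"P({k} or more of {n}) = {at_least:.4f}")
```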

Your attempt at making this a Bayesian probability problem, which it technically is, is very complicated. You have to ask a zillion questions, such as, "What is the probability that his true talent changed (by virtue of moving to NY) by X points and he hits .350, versus the probability that his true talent did not change but that it was Y (of which .317 is but one of many possible values) in the first place. Put in many, many values of X and Y, and you have your answer. That might take you a couple of years or maybe a few hours with a good computer program - I don't know. A much simpler solution is to look at similar players in history (with similar stats) who have changed teams.

--Mgl 18:47, 17 December 2009 (PST)

The simple answer is that, for about 300 PAs, 2/3 of players will have wOBAs within about 30 points of their true talent level. So no, a 33 point increase is not statistically significant. --AED 12:31, 21 December 2009 (PST)
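
As a rough check on the "about 30 points" figure, here is a sketch using the plain binomial standard deviation. This is an assumption for illustration: wOBA is a weighted stat, so its real per-PA spread is not exactly binomial, and the helper name binomial_sd is made up.

```python
from math import sqrt

def binomial_sd(p, n):
    """Standard deviation of an observed rate over n trials with true rate p."""
    return sqrt(p * (1 - p) / n)

true_talent, observed, pa = 0.317, 0.350, 308
sd = binomial_sd(true_talent, pa)
z = (observed - true_talent) / sd

print(f"one SD = {sd * 1000:.0f} points, z = {z:.2f}")
```

Under this approximation one standard deviation is a little under 30 points, so a 33 point gap is only slightly more than one SD away from the true talent level.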

Pitchers as hitters

What is the difference in value (wins, runs etc...) between a good hitting (starting) pitcher and a poor hitting one? Is it something NL teams should take into serious consideration when signing a pitcher?

IIRC, a good hitting pitcher has a wOBA of around .250, and a poor hitting one is .100. So, that .150 difference is .13 runs per PA. Give them about 65 PA, and that gives you 8 runs, or +/- 4 runs. That's at the extreme. Tangotiger
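
The conversion above can be sketched as follows. The divisor of 1.15 (the usual wOBA-to-runs scale) is an assumption; the answer only gives the resulting .13 runs per PA, and the function name is illustrative.

```python
WOBA_SCALE = 1.15  # assumed wOBA-to-runs divisor; not stated in the answer above

def runs_from_woba_gap(woba_gap, pa, scale=WOBA_SCALE):
    """Runs implied by a wOBA difference over a number of plate appearances."""
    return woba_gap / scale * pa

good, poor, pa = 0.250, 0.100, 65
gap_runs = runs_from_woba_gap(good - poor, pa)

print(f"{(good - poor) / WOBA_SCALE:.2f} runs per PA")
print(f"{gap_runs:.1f} runs over {pa} PA, or +/- {gap_runs / 2:.1f} around the middle")
```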

I'll add that most of those 65 PA are low leverage PA, as most high leverage situations occur late in a game.

I'll also say that of course it should be a consideration. Whether it should be a "serious" consideration, that is semantics. The entirety of a player's value should be considered. Why not? Obviously if you didn't consider pitcher hitting, you are not losing much in most cases or on the average, but every little bit helps. Interestingly, my guess is that for many teams, it rarely comes up in front office discussions and considerations, which is another subtle mistake that many teams make. In fact, even in sabermetric discussions, it usually gets overlooked. When we talk about valuing offensive players, we don't usually talk about baserunning and things like that - usually only offense plus defense. But if you are running or advising a team, EVERYTHING should be considered, right? --Mgl 18:51, 17 December 2009 (PST)

Pitchers familiarity v tiring

You touch upon it in your book and in your blog about how each time through the lineup the starting pitcher fares worse. My question is, is this a byproduct of the hitters having seen previous pitches in previous at bats, or more of a byproduct of the pitcher just getting tired based on pitch count?... This segues into the main part of my question. Let's say a pitcher faces 36 batters, which amounts to the same nine batters four times. What if the rules of baseball changed and the pitcher faced 36 different batters, each one time? Would you see the same trend in the decrease of the pitcher's effectiveness in both scenarios? Or would the "make believe" scenario see a different trend due to each of the 36 batters only facing the pitcher once?

We've talked about this many times. I don't think we know the answer yet. My guess is that it is a combination of the two (tiring and familiarity), but that the bulk of the effect is familiarity.

This can be researched, but I don't know that anyone has yet. I suppose you could try and look at high and low pitch counts and control for batters faced. It will be a little tricky though. You also have the issue of batters seeing more pitches which might be impossible to separate from fatigue. So it might not be possible to fully separate the two issues, unless you conducted experiments like in your example of the 36 batters. --Mgl 18:56, 17 December 2009 (PST)

Pinch hitting early or late

As your work in your book shows, there is a rather large penalty in terms of production for a player who pinch hits vs hits as a starter. Is the drop in production consistent in every inning? Is pinch hitting earlier in a game easier than later in a game, accounting for quality of pitching? For example, it would seem like pinch hitting in the top of the first inning would have no pinch hitting penalty. Shouldn't every at bat in the first inning be akin to a pinch hitting at bat? Both at bats are coming in cold off of the bench so to speak. Thoughts?

Well, we do know that batters get better each time through the lineup, so to some extent batting in the first inning IS like a pinch hitting appearance. Also, we don't know that at least part of the pinch hitting penalty is not that the weather is colder (in general) late in the game and some pinch hitters are a little injured (and that is why they are not starting) - think Kirk Gibson.

To argue against your assertion that a PA by a starter in the first inning is the same as a PA by a pinch hitter in the first inning, that is not necessarily true. A player who starts is mentally and physically prepared to start. A pinch hitter is not necessarily mentally and physically prepared to bat in the first inning. In fact, he probably is not. That might be like asking Mo Rivera to pitch in the first or second inning. He probably wouldn't be nearly as effective.

You bring up a lot of good points and to tell you the truth much more research needs to be done on the issue of pinch hitting and DH penalties. --Mgl 19:02, 17 December 2009 (PST)

Leverage in playoffs

What's the average leverage index of each game in a 5 or 7 game series? I assume you always want your best starter pitching game 1 to maximize the chance that he gets to start twice, but it's possible the average LI of one of the games could influence how you set up your rotation before entering a series.

For the games that are guaranteed (games 1-3 of a 5-game series; 1-4 of a 7-game series), the average leverage of each game is identical. Later in the series, the average leverage index increases, but the odds that the player you're saving for that high-leverage game actually gets to play goes down. In fact, it's exactly a wash, meaning that the LI multiplied by the probability of the game happening is exactly the same for every game. --AED 13:00, 21 December 2009 (PST)
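
The "exactly a wash" claim can be checked with a small sketch. It assumes a simplified stand-in for a game's leverage: the swing in series win probability between winning and losing that game, with each game a coin flip by default. The function names are illustrative, not from any published LI code.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def series_win_prob(wins, losses, need, p):
    """Chance of winning the series from a given wins-losses state."""
    if wins == need:
        return Fraction(1)
    if losses == need:
        return Fraction(0)
    return (p * series_win_prob(wins + 1, losses, need, p)
            + (1 - p) * series_win_prob(wins, losses + 1, need, p))

def weighted_swing_by_game(need, p=Fraction(1, 2)):
    """For each game number, sum P(reaching a state) * swing over all live states.

    Swing = series win probability after a win minus after a loss, used here
    as a stand-in for that game's leverage.
    """
    totals = [Fraction(0)] * (2 * need - 1)
    for wins in range(need):
        for losses in range(need):
            reach = comb(wins + losses, wins) * p**wins * (1 - p)**losses
            swing = (series_win_prob(wins + 1, losses, need, p)
                     - series_win_prob(wins, losses + 1, need, p))
            totals[wins + losses] += reach * swing
    return totals

print(weighted_swing_by_game(3))  # best-of-5
print(weighted_swing_by_game(4))  # best-of-7
```

Under these assumptions every game in a best-of-5 comes out to 3/8 and every game in a best-of-7 to 5/16, so the higher swing of a late game is exactly offset by the chance that it is never played.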

Strikeouts old and young

I've tried asking people about this before, and never been able to express it well enough to generate much interest. I've never seen anybody write about it. ...Let's say you have two pitchers - one is 26 years old, and the other is 40 years old. Assume that their statistics over the past 5 years are very similar. So similar in fact, that when you make adjustments for the aging curve you project them to have the exact same statistics in 2010. Now assume that they each start out slow next season, striking out far fewer batters than anticipated. As far as I know, existing projection systems would give them identical 'rest of season' projections at that point. But should they? Isn't the low strikeout rate more likely to indicate a change in true talent level in the older player than the younger one? Similarly, wouldn't superior short term performance be more likely to indicate a true improvement in the younger player?

The answer to your question, which is a good one, is yes and yes. Age (as well as other things) can determine how much to regress certain stats and how much to weight past performance. Not a lot of work has been done in this area, and most projection systems do not handle this kind of thing properly, I don't think, but you are 100% correct. Remember that most of these projection systems that use weighted past seasons, regression, and age adjustments are just shortcuts for using Bayesian probability models which would incorporate the chances that a player has changed his true talent level (among other probabilities). And that part of the methodology definitely is a function of age and other things, especially for pitchers (who we think have a much greater chance of changing their true talent levels at any given time in the first place). --Mgl 19:10, 17 December 2009 (PST)

I'd caution that old players who perform below expectation are more quickly written off than young players who perform below expectation. Put differently, while it's perfectly logical to believe that the Bayesian prior should maybe be weaker for old players, teams seem to overestimate that weakness. --AED 13:07, 21 December 2009 (PST)

Explain these metrics

I am trying to understand the differences between wOBA (wRAA), RE24 (REW), WPA, RE24/boLI, WPA/LI, and WPA/LI * boLI. Could you give some examples of specific situations and what value a player would receive in all six metrics?

Yowza. I gotta find you all the threads. Tangotiger

Scouting Report

In years past, I've seen that 50 is the set average for each skillset in the Fans Scouting Report. Now that the format has changed, is the average no longer set to the equivalent score any longer? If not, does this then mean that the average position player for a given position is the average of each player's position adjusted total score?

The 0-100 scale will return. The 1-5 was just for the real-time averages. Tangotiger

Playoff strategies

A question regarding the final game of the Phils/Dodgers series. ...Top of the eighth, bases loaded, two outs, and the Dodgers trailing by five runs. Casey Blake is scheduled to bat, with the pitcher's spot due up next. Why in the world wouldn't Torre pinch-hit Thome in this situation? The Dodgers need to score a boatload of runs in the next 4 outs. It stands to reason that this single at-bat is their best opportunity to score multiple runs--and they have one of the greatest sluggers in baseball history on the bench. ... Now, Thome is not the Thome of old, certainly, and Blake does not trail Thome all that much in terms of OBP/SLG, at least not this season. But if the Dodgers weren't planning on using him in exactly such a situation as this, then why pray tell do they even have him on their roster?

It's way past the 09 post-season, so the situation is not fresh in my mind by any means, but yes, you want a home run hitter in that situation even though a HR does not get you the lead. That being said, you always have to balance using a pinch hitter now versus later. But, in this case, it is probably correct to use Thome, although I am not sure. Of course it would be nice to be able to use a pinch hitter in place of a bad hitter. Using a pinch hitter in place of a good hitter is a little bit (maybe a lot) of a waste. Keep in mind that there is probably a pinch hitting penalty for Thome, even though that is what he is mostly used for now. Plus, forget about him being "one of the greatest sluggers..." Obviously he is nowhere near that level anymore (I realize you said this, but there was also no need to tell us how good he USED to be). Plus, if the Phillies could have and likely would have brought in a lefty reliever to face Thome (again, I don't remember the situation), that pretty much puts the kibosh on Thome pinch hitting, I would think. --Mgl 19:17, 17 December 2009 (PST)

WAR Fantasy

Is there a fantasy game involving WAR? ... Last year a link was posted to a BtB-run league with salary and WAR. I am looking for a fantasy league where there are normal draft rules, but just with WAR instead of the usual 5x5 scoring.... Thanks for any direction you may be able to provide!

With luck, I'll have it. We'll see. Tangotiger

Appendix calculations

I'm having trouble following the Variance in Skill section of the appendix, specifically the second to last equation on page 375. Using that formula, I was coming up with numbers that seemed too high (Var(OBP skill excl Pitchers)~0.016 or SD=0.125) and I noticed that players with low PA had a big impact on the estimate. As a test, I added 10 players with 1 PA: 6 with a hit, 4 with an out, to my 650 players and 180,000 PAs and it increased the Var to 0.019, which is a huge difference for 10 PA. It seems the formula has no weighting based on PAs and the denominators increase greatly with each player with a single PA (increases by +1/(2*Var(Skill)^2)). Am I missing something here or is that how the calculation should operate? Does this formula treat the 0.000 OBP or 1.000 OBP a player has after a single PA as his true OBP?

Unfortunately, I'm going off our original printing, so I hope that my page 373 equals your page 375. At least, that seems to make sense based on your phrasing. However, the number of attempts ("Ni" in this equation) does factor into the equation in such a way that a player with a very small number of attempts would not affect the final result.

So, I think you probably have something wrong in whatever you're using to make the calculation. Perhaps you missed the square in the denominators (of both the numerator and the denominator)? --AED 18:48, 18 January 2010 (PST)

Thanks for the response Andy, but I'm not sure we're looking at the same equation. I flipped through the appendix again and didn't see what you must be looking at. Preceding the formula you write "we can calculate sigma-sub-OBP (the variation in OBP skills) using:". The formula I'm looking at only has Ni in the Process Var portion of the calculation (of the numerator and denominator)-- wi*OBPi*(1-OBPi)/Ni. I think that's just reducing the process variance in the OBP estimate as the sample increases. Ni does not factor into the Tot Var portion of the numerator-- (OBPi-OBPTot)^2... Unless Sigma-i is meant to be the sum over all PA, rather than all players? That would be an inconsistent use of i, though... (Oh, I double checked and I have got the squares in the num and denom.)

That actually is the one I'm referring to. Consider the example of a player with zero attempts. Obviously the 1/N terms go to infinity, and 1/infinity is zero. So the player makes no contribution to either numerator or denominator. A player with 1 attempt makes a small contribution, and it gets more significant with number of attempts. --AED 16:27, 10 April 2010 (PDT)

Meaning of split stats

Which is more meaningful? The stats of RH pitchers vs. RH & LH hitters, as well as the same for LH pitchers or flyball, groundball, and neutral hitters and pitchers versus each other?

Not sure what you mean, but platoon splits are more significant and meaningful than G/F splits.

--Mgl 14:52, 23 April 2010 (PDT)

Career OPS calculation

I was wondering about the correct way to calculate OPS+ over a career... I assume that to average the averages, so to speak, would be wrong, i.e. if a player posts a 120, then a 110 in his 2 year career, (120+110)/2 would be wrong, I assume, so do you have to add up the H, TB, PA etc. of the leagues he played in to get the lgOBP and lgSLG of his career? Thank you for being so incredibly generous with your time and expertise as always!

I don't know off the top of my head whether it would be perfectly correct or not, but certainly you can take a weighted average of the yearly OPS+, weighted by PA for each year. That should be close enough. --Mgl 21:51, 13 May 2010 (PDT)
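
A minimal sketch of that PA-weighted average. The function name and the PA totals are made up for illustration, and as the answer says, this is an approximation rather than an exact career OPS+.

```python
def pa_weighted_ops_plus(seasons):
    """seasons: list of (ops_plus, plate_appearances) pairs."""
    total_pa = sum(pa for _, pa in seasons)
    return sum(ops_plus * pa for ops_plus, pa in seasons) / total_pa

# Hypothetical two-year career: the PA totals are made up for illustration.
career = [(120, 600), (110, 300)]
print(round(pa_weighted_ops_plus(career), 1))  # 116.7
```

The simple average of the two seasons would be 115; weighting by PA pulls the result toward the season with more playing time.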

Game Score

Do you know of any metric for evaluating a starter's performance in a given game that is better than Game Score?

You may like this one: Tangotiger

New Park Factors

First off, "The Book" was awesome. I've been thinking about how park factors are done. I wonder, given the data we now have available (at least I think we do), couldn't we figure park factors from a totally different perspective? I'm thinking something nominally similar to what the people at do with home runs. ... Here's what I've been thinking about: With Pitch F/x (or some similar thing), I believe we know the angle and the speed the ball has when coming off the bat. Using a simple physics equation, I believe we can estimate the distance the ball should travel. I was thinking we figure out which zone it is estimated to go into, and then for each ball we compare it with the location it actually went to. ... I really think this could get you to the point of customizing the park factors based on the profile of the batter. For instance, if there was an issue like Barry Bonds in AT&T Park, I think we could answer that question if we had this kind of data. ...I realize part of a park factor is also simply how big the outfield is, since that makes it harder for outfielders to get in position to make plays. I don't know if you could account for that in a system like this or not. ... Just an idea I had, and since you guys are in my opinion some of the smartest sabermetric minds on the internet, I wanted to see what you thought of it, and if there is some obvious flaw that I'm not considering.

Thanks for the kind comments! To be brief, there are many ways to do park factors that have not been done yet. Yours is a good idea. That is one of the areas (park effects) in baseball/sabermetrics that we know the least about.

--Mgl 14:44, 23 April 2010 (PDT)


Do you plan on making 2008-2009 PZR data available? I've downloaded the 2000-2007 spreadsheet and I personally think this is one of the best methods out there for measuring defense-independence for pitchers in terms of actual performance (i.e. for WAR calculations, for example). I was interested in getting my hands on 2008-2009 data for some work with pitchers.

Not really in the works. Personally, I have some issues with it.

--Mgl 14:33, 23 April 2010 (PDT)


Tango & the boys, I have used your projections over the past couple years with several other formulas, and want to know how often (if at all) players will be added to the Marcel database as signings increase towards spring training. And after reading the article about how Marcel performed against other models, I might just scrap everything else and let you guys pick my players!

Marcel only uses MLB stats, and so, nothing changes in the offseason. Tangotiger

Summaries of findings

I get the gist of how most sabermetric stats work, but one thing I would be interested in is a summary of what some of the new measuring technology has discovered. About a year ago, I took some notes on pitch f/x, but I haven't kept track of what is going on with that system or some of the other ones like Hittracker Online. One thing that I do recall is that the screwball is about as rare a pitch these days as the knuckleball; if not moreso. Studes used to do "Ten Things That I didn't Know" over at THT. Has anyone done something similar on this topic?

Don't really know. With all the good search engines out there, I'm sure you can find the info you need and/or desire. --Mgl 21:52, 13 May 2010 (PDT)

UZR not 0 at the team level

Why doesn't the sum of team UZRs in a given season equal zero? Shouldn't it, if 0 is determined as league average? I'm looking specifically at the AL in 2009 (sum = 1.6), but I suspect it doesn't really matter which instance I'm concerned with. Thanks for any guidance you can provide.

I imagine that 1.6 is a rounding error. We do set the league average to zero each year, although it does not have to be that way. For example, we can set the last 5 years to zero and then each year can be more or less than that, depending on the overall quality of the fielders for that year. But to be consistent with most other metrics (which normalize to THAT year), we decided to set each year to zero, so that when you see a player's (or team's) UZR, it is relative to the average player at that position for that year. I don't think that is the best way to do it in a perfect world, as one position can easily be especially good or bad in any one year, but that is the way we decided to do it nonetheless. --Mgl 14:15, 23 April 2010 (PDT)

Does it matter where the talent level is on a player?

salary demands) and you can only sign one. One is a high slg low obp type player, the other is a high obp low slg type player. Who do you sign? Does it matter if your team is already a high obp or high slg team? In other words, do you gain more by adding a slugger to a team of sluggers or by signing a high obp type player? Is there an optimum balance of player types or are you better off going down the slugging route or the obp route? Does it matter if your team is a high scoring or low scoring one?

There are obviously interactive effects among the various components of offense. For example, common sense (and it is true) tells you that a slugging guy has more value surrounded by a bunch of OBP guys and vice versa. So technically, if you are a team and you are considering a player, you want to evaluate him not at a context neutral level, but within the context of your park, your other players, your pitchers (for defense), etc. In practice, it is not going to make much difference though.

--Mgl 14:27, 23 April 2010 (PDT)

Using PA as a regression point

About means regression: could playing time be used as a proxy for ability, to a degree? Say, if players who have around 620 PA hit .280, a player who has 620 PA would be regressed to .280; a player who has 140 PA might be regressed to .225. (I don't recall the exact numbers, but something like every PA adds like .00012/.00013/.00024 to the mean value for players who play that much.)

If playing time depends on talent, then yeah, you can use it to some extent. Tangotiger

The issue is that playing time would have to be independent from the results. That is, if someone hits .240 and gets pulled from the lineup because of it, you'd be double-penalizing him for the poor batting average. --AED 10:57, 6 April 2010 (PDT)

Retrosheet boxscores

Would it be plausible to use the Retrosheet boxscores to make a game-by-game analysis where play-by-play analysis is unavailable?

Sure, I use the boxscores generated from the retrosheet data all the time, but I don't think it has any info that is not contained in the PBP data. In fact, it is gotten directly from the PBP data. For me, it just makes some things easier to analyze on a game by game basis.

--Mgl 00:50, 27 April 2010 (PDT)

Career BABIP

Has anyone ever checked to see which teams, based on career BABIP data for pitchers, would have the lowest and highest expected hits? Say, were the '27 Yankees expected to be 5 hits better than league average based on the career data of its pitchers and their playing time.

I don't know, but you are welcome to do so, of course, as all of the requisite data is available. Of course a team's expected BABIP (DER) based on the career BABIP of its pitchers is still going to contain some "luck" as the career BABIP of the pitchers would still need to be regressed toward some mean. IOW, if the career BABIP of all the Yankee pitchers in 2007 were .305, it is likely that their "true" BABIP is closer to the league average of .300 or whatever it is, so that the expected number of hits in that year would be based on the regressed career BABIP and not the actual BABIP of all their pitchers. And then the difference between the expected BABIP and the actual BABIP of the team would reflect luck and defense and park effects. In addition, the career BABIP of the pitchers would also reflect the defense behind them during their careers, as well as the parks they played in. If a lot of those pitchers played for the Yankees for their careers, then their career BABIP would also reflect the Yankee home park and the Yankee defenses over the years. So, depending on what the heck you wanted to do with the data, it could be a mess...

--Mgl 00:56, 27 April 2010 (PDT)

HR skill for pitchers

Matthew Carruth recently posted on TMI ( that, "Research has shown that pitchers have little control over how often a fly ball actually leaves the yard." I've seen this result referenced many times, but the actual research behind it seems to be lacking. Is there some definitive study out there? I seem to recall MGL disputing Carruth's point but I can't seem to find it on the blog. Thanks.

Roughly speaking, if you give up 600 air balls, the HR per airball rate is half luck and half skill. Once you get into multi-seasons, then HR per airball is most definitely a skill.

--Tangotiger 14:01, 7 July 2010 (PDT)

I'll add this: It is difficult to answer a qualitative question that requires a quantitative answer. At what point, quantitatively, does something fall into the category of "little control"? If you regress something 90% or more, based on a certain sample of opportunities, I think most people would be comfortable saying that at that point there is little skill associated with it, but what about 30%? or 35%? Is that "some" skill, "little" skill, "significant" skill? I have no idea. That is why I prefer to answer questions like that quantitatively, as Tango did, unless it is obvious (such as if something is regressed 95%, we can say that "there is little skill").

And keep in mind that whether something is "a little" skill, "almost no skill," or "a lot of skill" always depends on the size of the sample of opportunities. If something has any skill at all (also keep in mind that we really mean "the spread of skill in the population" and not actually whether it "requires skill" or not), if the sample of opportunities gets large enough, then it is near 100% skill, if that makes any sense.

--Mgl 15:59, 7 July 2010 (PDT)
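
The dependence on sample size described in the two answers above can be sketched with the common shrinkage form r = n / (n + k). That functional form is an assumption brought in for illustration; the only number taken from the answers is that 600 air balls is roughly the half-luck, half-skill point, which corresponds to k = 600. The function name is hypothetical.

```python
def skill_share(n_opportunities, k=600.0):
    """Fraction of an observed rate attributed to skill under r = n / (n + k).

    k = 600 is chosen so that 600 air balls gives 50% skill, matching the
    rough figure quoted above. The functional form itself is an assumption.
    """
    return n_opportunities / (n_opportunities + k)


for n in (100, 600, 2400, 10000):
    share = skill_share(n)
    print(f"{n:>6} air balls: {share:.0%} skill, "
          f"regress {1 - share:.0%} toward the mean")
```

Under this sketch the skill share climbs toward, but never reaches, 100% as the sample grows, which is the sense in which any real skill ends up near 100% with enough opportunities.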

In-season forecasts

I'd like to combine a player's real 2010 stats with their projected 2010 stats. In my mind it's almost like regressing to the mean, except it's regressing to the player's projection. Do you have any thoughts (even just an educated guess) about how many projected 2010 ABs I should add to the player's real 2010 ABs?

I know I've answered this question... somewhere. You need to know how many PA the 2010 forecast was based on. Let's say for example that the PA for 2009, 2008, 2007 was 600,400,500, and you weight those as 100%, 80%, 64%. The effective PA is 1240, plus another 200 PA for regression. That's a total of 1440 PA that the 2010 forecast was based on. The current season counts as 125%. So, after 200 PA in 2010, that counts as 250 effective PA, compared to the 1440 PA for the 2010 forecast. If the 2010 forecast was a .400 wOBA, and in 2010 he has a .300 wOBA, his new in-season forecast is .400 x 1440 plus .300 x 250 divided by 1690 equals .385.

--Tangotiger 14:01, 7 July 2010 (PDT)
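
The arithmetic in the answer above can be written out as a short sketch. The weights (100%, 80%, 64%), the 200 PA of regression, and the 125% weight on the current season all come from the answer; the function and variable names are made up for illustration.

```python
def effective_pa(pa_by_season, weights, regression_pa=200):
    """Weighted PA behind a preseason forecast, plus the regression PA."""
    return sum(pa * w for pa, w in zip(pa_by_season, weights)) + regression_pa


def in_season_forecast(preseason_rate, preseason_pa, current_rate, current_pa,
                       current_weight=1.25):
    """Blend the preseason forecast with current-season results."""
    current_eff = current_pa * current_weight
    total = preseason_pa + current_eff
    return (preseason_rate * preseason_pa + current_rate * current_eff) / total


# 2009, 2008, 2007 PA with the weights from the example above.
base = effective_pa([600, 400, 500], [1.00, 0.80, 0.64])
updated = in_season_forecast(0.400, base, 0.300, 200)
print(round(base), round(updated, 3))
```

With the inputs from the example, this reproduces the 1440 effective PA and the .385 in-season wOBA forecast.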

Update park factors

Hey, I was searching the site for any topics involving ballpark factors and came across one of yours as of 2007. I was wondering if you have more accurate numbers since then. If possible, could you send them to my email? I am greatly interested in your guys' work as I have had issues using the standard ballpark factors that all the sites generally use. However, a few of your numbers seem off. Oakland, Seattle, Dodgers, and Padres all seem a little bit higher than they should be. Do you consider changes to each and every park? I know you take data from 15 years or so, and I imagine most parks have undergone changes within that time frame. Thanks a lot, really appreciate your work!

That's on my todo.

--Tangotiger 14:01, 7 July 2010 (PDT)

BABIP by count

This question is in relation to BABIP. I know the pitcher has some control over the type of ball hit - what I'm curious about is whether or not pitchers who tend to work themselves into favorable hitter counts (ie 2-0, 3-1) have a higher BABIP than pitchers who tend to work themselves into more favorable pitcher counts. I suppose the logic would be that in a pitcher's count, you'd figure the hitter would have to swing more defensively and the pitcher's options would be more numerous, and you'd figure during a hitter's count, the pitcher would have to throw a lesser value pitch, while the hitter could sit on something more easily.

The range is .280 to .320 for BABIP by count. Since no pitcher is 100% at 0-2 or 3-0, you can see how quickly that's going to coalesce around .300, regardless of how often a pitcher is in a pitcher's count (in real-life).

--Tangotiger 14:01, 7 July 2010 (PDT)
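
A rough sketch of why the mix of counts washes out: take only the two endpoints of the quoted range (.280 and .320) and blend them by how often a pitcher works in each kind of count. The shares below are hypothetical, and treating every ball in play as coming from one extreme or the other overstates the real spread.

```python
# Endpoints of the BABIP-by-count range quoted above.
BABIP_PITCHERS_COUNT = 0.280
BABIP_HITTERS_COUNT = 0.320


def blended_babip(share_pitchers_count):
    """Expected BABIP if a share of balls in play come in pitcher's counts.

    Every ball in play is assigned to one of the two extremes, which is a
    simplification that exaggerates the difference between pitchers.
    """
    share_hitters_count = 1.0 - share_pitchers_count
    return (share_pitchers_count * BABIP_PITCHERS_COUNT
            + share_hitters_count * BABIP_HITTERS_COUNT)


# Hypothetical shares, for illustration only.
for share in (0.40, 0.50, 0.60):
    print(f"{share:.0%} in pitcher's counts -> BABIP {blended_babip(share):.3f}")
```

Even with this exaggerated two-bucket model, moving from 40% to 60% pitcher's counts shifts the blended BABIP by only .008.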

WPA - why add it at the seasonal level

So I was trying to explain WPA to a friend this morning. I think I understand why someone would want to look at a game total (Choo's .533 yesterday occasioned it), but why in the world would you add up all WPAs (positive and negative) over the course of a season? That number seems meaningless to me. What does that number "mean"?

It shows who was involved in the most game-changing plays, whether by luck or design. Get enough games, and luck evens out. So, it tells a story at the game level, and over a career, it matches their win impact. That would seem a good enough reason to add them up. Why do we bother adding up the W/L of starting pitchers?

--Tangotiger 14:01, 7 July 2010 (PDT)

Hmmm. I guess I don't understand WPA as well as I thought. It's my understanding that the total WPA on a winning team (for any particular game won) adds to .500 (not 1.000). So when Choo had a WPA of .533 yesterday, he was "involved in" about half a win? Who got the "other half" (since the rest of his team added to a negative WPA)?

Yes, the total adds up to .500, since both teams start at .500, and one ends up at 1.000 and the other at .000. So, the winning team always gets +.500 and the losing team gets -.500.

The totals are here:

At the time that Choo was at bat, to the time he finished his at bat, the chances of the Indians winning increased a cumulative .533 wins. That's all it says.

Ever watch World Series of Poker when they show the odds after every flip?

--Tangotiger 14:01, 7 July 2010 (PDT)

I understand how Choo accumulated his .533 yesterday. You just add together all the WPAs from his plate appearances. That makes perfect sense to me. Before he came up to bat, his team was likely to lose; after his HR, they were likely to win.... But what would it tell you if you added up all of the odds in the World Series of Poker for one particular player *over his many rounds of play*--an analogy for multiple games over the course of a season? I can't wrap my head around what the meaning of that sum would be.... Wouldn't a season WPA figure be more appropriate if you divided it by games played? Then you'd have the average amount that a player helped or hurt his team over the course of the season rather than a counting stat that doesn't translate well to wins. ...Maybe this is my problem: what are the units on WPA over the course of season? If it's wins added (or created, or something else), then Choo got credit for "adding" half of yesterday's win. But that's obviously not the case. He won that game *in spite of his teammates* (because their combined WPA was negative).... I know I'm not being clear. I'm sorry. Thanks for trying to explain this to me.

He was involved in 0.533 more wins than an average player would have, given the same situations.

The unit is wins. And if Pujols ends up with a WPA of 8, then that means Pujols was involved in 8 more wins than an average player was involved in.

And if you look at career WPA, you get a list of the best hitters in baseball. That is, over a career, it's not a coincidence that great hitters are always involved in positive WPA.

--Tangotiger 14:01, 7 July 2010 (PDT)

Alright. That all makes good sense. Especially since I can see a baseline (in this case, the "average" player).... One last question, and I promise to leave you alone. ...Let's say, for instance that Pujols racks up 16 straight games with a WPA of .500 while his teammates combine for zero WPA. If that were to happen, his team would have won all 16 games, but his WPA over that stretch would be 8.... So isn't the unit off by a factor of one? In other words, if every .500 units of WPA equals one win, then every "full point" (1.000) of WPA would be two wins?

I don't see the problem. If he has 8 WPA after 16 games, and the rest of the Cards are at 0, that means that the Cards won 16 games. Where's the issue?

--Tangotiger 14:01, 7 July 2010 (PDT)

The problem is that in this scenario, Pujols was, in your words, "involved in" 16 wins--actually responsible for them since no one else on the Cardinals contributed (0 WPA). But WPA would only "credit" him with 8.... Ok. I may have just figured it out by asking that last one.... If everyone else on the Cardinals had a zero WPA over that stretch *and* Pujols had a WPA of zero, they'd win eight games and lose eight games (because they'd have been "average" or a .500 team--exactly 0 WPA), right? But by replacing eight of those losses with 8 wins, Pujols *added* 8 wins. Yes?... In this way, WPA counts "wins above average" whereas WAR counts wins above replacement? So WPA<WAR? (And yes, I know that WPA is context-dependent and WAR isn't, but WPA is analogous to "wins above average" rather than "wins above replacement." That's what I meant by WPA<WAR.)
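
The bookkeeping in this exchange can be checked in a few lines. Since every game starts at a .500 win expectancy, a win is worth +0.5 WPA to the team and a loss is worth -0.5, so a team's record follows from its games played and summed WPA. The function name is hypothetical.

```python
def implied_record(games, team_wpa):
    """Wins and losses implied by a team's summed WPA over `games` games.

    Each game starts at a .500 win expectancy, so a win is +0.5 WPA for the
    team and a loss is -0.5, which gives wins = games / 2 + team_wpa.
    """
    wins = games / 2 + team_wpa
    return wins, games - wins


print(implied_record(16, 0.0))   # everyone average over 16 games
print(implied_record(16, 8.0))   # Pujols at +.500 per game, teammates at zero
```

The first call gives an 8-8 record and the second gives 16-0, so the 8 WPA is the gap between the two: wins above an average (.500) baseline.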

Odds of a true 90 win team losing 10 in a row

I remember a comment that Bill James made back in the days when he was putting out his annuals. He said that good teams don't have 0-10 losing streaks. That got me to thinking. What do losing streaks say about a team? For example, the Braves have lost six in a row, yet fans still see them as a contender. Is that a reasonable assessment? More specifically, what are the odds that a team that's lost six in a row, or eight, or ten, will win 90 games on the season? I realize that any practical answer to this question will assume that the roster doesn't change over the course of the season.

The answer depends on what other information you present. First thing, how many games did the team play? For another, do you know anything else about the team, such as projections for any of the players? If you don't know anything about the players, but you know the team's record, then how many games the team has won or lost in a row either some time during the season or in the last X number of games is meaningless. For example, a team that is 30-30 but has just lost 10 in a row is essentially exactly the same as a team that is 30-30 and just won 10 in a row. The same thing is true if one team has a 10 game winning or losing streak during the season. It doesn't matter. (As you say, the assumption is that the true talent of the team is the same all season - at least the roster and playing time is the same.)

Now, if all you know is that a team either lost its last X games in a row, or that it won or lost X games in a row some time during the season, but you don't know how many games it has played, then you can estimate the true talent level of the team and it is not going to be good of course. Off the top of my head I don't know how you would do that. Maybe Andy or Tango can give you a more specific answer. I could make a wild guess that if all I knew was that a team had a 10-game losing streak sometime during the season, and the season was over, my estimate of their w/l% for the season would be on the order of around .400 or so.

Obviously (at least it should be obvious), any team CAN have a 10-game losing streak. It is just much more likely for a bad team to have one than a good team. Much more.

--Mgl 19:21, 7 July 2010 (PDT)
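
One way to put numbers on "much more likely" is to compute the chance of at least one losing streak of a given length over a season. The sketch below assumes every game is independent with the same win probability (the constant true talent assumption mentioned above) and a 162-game season; the function name and the two example teams are illustrative choices, not anything from the answer.

```python
def prob_losing_streak(n_games, streak, p_win):
    """P(at least one run of `streak` straight losses in `n_games` games).

    Assumes independent games with a constant win probability `p_win`.
    state[j] is the probability of currently sitting on exactly j straight
    losses without having reached a run of length `streak` yet.
    """
    p_loss = 1.0 - p_win
    state = [0.0] * streak
    state[0] = 1.0
    reached = 0.0
    for _ in range(n_games):
        nxt = [0.0] * streak
        nxt[0] = sum(state) * p_win          # a win resets the streak
        for j in range(streak - 1):
            nxt[j + 1] = state[j] * p_loss   # a loss extends it
        reached += state[streak - 1] * p_loss
        state = nxt
    return reached


good = prob_losing_streak(162, 10, 90 / 162)   # true 90-win team
bad = prob_losing_streak(162, 10, 0.400)       # true .400 team
print(f"90-win team: {good:.3f}, .400 team: {bad:.3f}, ratio: {bad / good:.1f}")
```

This only answers the forward question (how often a team of known talent has such a streak). Going the other way, from an observed streak to an estimate of talent, would also need a prior on team talent, which this sketch does not attempt.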

Pitch sequencing

It seems to me that pitch sequence is really an under appreciated and understudied area of the game. Is this something being looked at more so internally? Also, with Pitch FX now available this seems like it could really take off. I was curious for your take(s).

We have several threads on pitch sequencing on our blog, which mostly linked to work from Hardball Times. Check it out.

--Tangotiger 14:01, 7 July 2010 (PDT)

I don't know to what extent it is being looked at by individual teams (my guess is not much), but I think it is an area that needs much more research and analysis. Much more. If I were in charge of a team, I would be spending lots of time, energy, and manpower on that.

--Mgl 19:22, 7 July 2010 (PDT)

What don't we know?

We know that 10 or 20 years from now, some things that are believed by saberists will probably be proven to be wrong. Having rooted for teams with bad short relievers, I think saberists are probably underestimating the value of a top relief pitcher. If you were to guess, what are some of the things you think the field could potentially be wrong about at this time?

I don't know what you mean about "underestimating the value of a short reliever." We know how to figure the value of pitchers, starters and relievers, so you would have to be a little more specific if you wanted someone to comment on that comment. If I said, "In 10 or 20 years, we are going to realize that we are underestimating the value of a team's cleanup hitter," what the heck does that even mean? Or, "We are going to realize that we are underestimating the value of a #2 starter?" I am afraid, none of those statements, including yours, makes any sense, at least without clarification or more specific information.

To answer your question about what we think might prove to be wrong...

If I knew that, then it wouldn't be wrong anymore, right? Or at least it would be in the category of "We don't really know." If you want an answer like, "We think this is true, but we are not really sure, therefore, it might eventually end up being wrong," my answer is, "I don't really know."

There are, however, lots of things that we don't have a real good handle on, which we probably will in 10 or 20 years (likely a lot sooner at the pace we are going and the number of analysts out there):

Pitcher workload and chance of short and long-term injury and effectiveness. Catcher influence on the pitching game. Clutch hitting and pitching. Team chemistry and winning. Manager influence on winning. Are some batter and pitcher matchups better than others, outside of platoon (including G/F) considerations? Are some IBBs better or worse than we think? Park effects on individual batters and pitchers. Can you leverage good or bad defense in small or big parks?

And more...

--Mgl 19:38, 7 July 2010 (PDT)

ERA by rotation slot

I thought I saw recently, and I thought it was on this site, 2009 league average ERAs for each spot in the rotation (#1 starter, #2, etc. through 5). I can't seem to find that -- does this ring any bells?

Not for 2009, but you have to be careful. You must choose the rotation order without looking at the underlying data. I usually go with these win%: 0.600 0.540 0.490 0.450 0.420

Passes the sniff test. --Tangotiger 11:22, 23 July 2010 (PDT)
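
One sanity check a reader might run on those five figures: if each slot gets an equal share of starts (an assumption, not something stated above), the rotation as a whole should come out at league average.

```python
# Win% for rotation slots #1 through #5, from the answer above.
slot_win_pct = [0.600, 0.540, 0.490, 0.450, 0.420]

# Equal weighting of the five slots is an assumption for this check.
average = sum(slot_win_pct) / len(slot_win_pct)
print(f"average win% across slots: {average:.3f}")
```

The five values average to .500, so the set is at least internally consistent with a league-average rotation.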

Quantifying game excitement

Would it be possible to use the sum of the absolute values in WPA game state to come up with a number showing how exciting a game was?

Already happening on Fangraphs. They show average LI and average Win Expectancy. --Tangotiger 11:22, 23 July 2010 (PDT)
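
For readers who want to try the metric proposed in the question, here is a minimal sketch: sum the absolute play-to-play changes in one team's win expectancy. The win-expectancy paths below are invented for illustration, and this is the questioner's idea rather than a description of how any site computes its numbers.

```python
def excitement(win_expectancy):
    """Sum of absolute play-to-play changes in one team's win expectancy."""
    return sum(abs(after - before)
               for before, after in zip(win_expectancy, win_expectancy[1:]))


# Hypothetical win-expectancy paths for the home team, starting at .500.
blowout = [0.50, 0.65, 0.80, 0.92, 0.98, 1.00]
seesaw = [0.50, 0.70, 0.35, 0.60, 0.20, 1.00]

print(round(excitement(blowout), 2), round(excitement(seesaw), 2))
```

A game in which the winner's win expectancy never drops totals exactly .500, the floor for this measure, and every reversal pushes the total higher.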

Throwing handedness of an outfielder

Does the throwing handedness of an outfielder matter? My friend and I just got into an argument in a bar over this. He claims that it would be an advantage for a leftfielder to throw righthanded and vice versa, because of the ability to make a better throw after running to catch a ball going towards the line. ... His argument was that the harder-hit balls, the ones that would require a running catch, would generally go towards the line, and thus put a LH LF in a position where he'd need to spin around to make a throw.... I argued his initial logic (that balls requiring a running catch would generally go towards the line), but being unable to prove that in a bar, I argued that this was probably about as much of a factor as the added ability a LH LF would have to make a running catch going towards the line (not needing to reach across the body to get the glove on the ball).... I also pointed out that I had seen the argument about handedness for all infielders and catchers, but I can't remember seeing it about outfielders, which doesn't prove anything, but at least establishes that it's not considered a factor by most....

I like the idea behind the question. In practical purposes, whatever it could matter won't count for more than one run in a season. --Tangotiger 11:22, 23 July 2010 (PDT)

Walks score faster

" I hate walks, Toronto manager Cito Gaston said. " I just absolutely hate them because most of them score, quicker than a base hit. But his next time, hopefully, he can work it out."

A guy reaching base on a single will score as often as reaching on a walk. The batters after that guy will score just as often in either case. So, I don't see how one way could be faster than the other, without getting a subsequent slowdown on the future batters. --Tangotiger 11:22, 23 July 2010 (PDT)


Been wondering this for a while: what with the xBABIP tools out there, is there a calculator to make a rough form of xwOBA, i.e. scale the singles so it matches up BABIP-wise, as is the fashion on FanGraphs (and really everywhere else) to make BABIP adjustments? Would be great to be able to do that on your own for whichever players you choose, I think.

An xWHIP calculator just came out, so xBABIP can't be far behind. --Tangotiger 11:22, 23 July 2010 (PDT)

Using bad starter after an off day

This topic came up recently at True Blue LA during a discussion about the Dodgers starting Ramon Ortiz against the Padres and pushing back Kershaw and Billingsley who would've pitched on five full days rest without being pushed back.... One poster stated that it was a better idea to use your worse pitcher (Ramon Ortiz) after a day off, due to the bullpen being fresh, than to use him when there was a game on the day before?

I assume that the rationale is that the bad starter is likely to pitch fewer IP than the good starter and therefore the pen is likely to pitch more innings (and you want a fresh pen). I doubt it makes much difference overall and if I were a manager I would not worry about that at all.

--Mgl 19:48, 28 July 2010 (PDT)

Do we play baseball

Hi, love the site/book (what I've seen of each so far...) Just wondering, have/do any of you play baseball?

I play with my kid. I used to play as a kid with the other neighborhood kids, back when we could go to parks without parental supervision or consent. I used to play in the company softball league. Not sure why you want to know. Are you holding tryouts? --Tangotiger 11:22, 23 July 2010 (PDT)

I played lots of baseball all my (younger) life. It certainly helps some in my work as an analyst, but by no means is it a pre-requisite to being a good baseball analyst or in advising a team. Baseball "experience" is a requisite, but that experience can consist of watching many, many games on TV and in person, which all good analysts do and have done.

--Mgl 19:40, 28 July 2010 (PDT)

Relevance of batter/pitcher matchups

...I pointed him to the study done in The Book and its conclusion that even with 25 PA, you're better off looking at his last 3 years against the entire league. Anyway, he had some questions regarding your study. He wanted to know what the 2002 stats were for the middle 80% of the 300 matchups that weren't discussed in The Book. Also, he wanted to know if aging/decline in overall performance could have affected the results. One last question, what number of plate appearances against a specific pitcher would be needed for you to feel that the stats are meaningful?

The batter faced the same pitcher. So, they both got older at the same rate. Why would aging play a role?

If you have a large group of matchups, you are definitely going to get the league average. Considering that the extreme guys hit the league average, well, by definition, the non-extreme guys will also have to hit the league average. Right?

Are you asking me at what point would my decision to choose the player matchup override the non-player matchup?

So, if I had 25 PA of matchups and 1500 PA of non-matchups, I already said I'd choose the 1500 PA of non-matchups. The question then would be:

30 or 1800
40 or 2400
50 or 3000
60 or 3600
70 or 4200
80 or 4800
90 or 5400
100 or 6000

That is, would I prefer to look at 100 PA of player matchup, rather than 6000 PA of non-matchups?

It would be impossible to prefer the player matchup to the non-player matchup.

--Tangotiger 11:22, 23 July 2010 (PDT)

Thanks for the reply, I definitely would expect the middle 80% to hit about average since both extremes do, it's just my friend was wondering what the specific data was. And as for decline in performance, he thinks that maybe players having an off year could impact the study, although that seems pretty far fetched.... As far as my question relating to the number of plate appearances needed for a batter-pitcher matchup to be meaningful, I probably could have phrased it better so that you understood what I was really trying to ask, which is if it's possible for a hitter to over or under perform by a large enough margin against a specific pitcher in a large enough sample of plate appearances so that it would be statistically significant.

Certainly, a batter or pitcher can have such an impact that a NON-ZERO difference would be statistically significant.

That is, if there was a 50 PA matchup where the wOBA was, say, .500, while we expected, say, .340, then a non-zero gap would be statistically significant. But that gap, the .160 gap, would NOT be the presumptive gap. It would be like .010 or something. That is, when you OBSERVE a large gap, and you pass the statistical significance test, the ONLY conclusion is that the gap is non-zero.

So, in all practical purposes, it makes little difference other than to use it as a tie-breaker.

--Tangotiger 11:22, 23 July 2010 (PDT)

So are you saying the .010 gap in wOBA is attributable to the hitter or pitcher, such that most of the observed gap is due to random variance while a very small part of that gap is something inherent to the particular matchup?

--Tangotiger 11:22, 23 July 2010 (PDT)


Don't quote me on the .010 number. That was for illustration. I doubt it's any higher than .020 or so anyway. Unofficially.

--Tangotiger 11:22, 23 July 2010 (PDT)

Oh, I wasn't trying to quote you, I read The Book blog almost daily, and you use examples for illustration frequently, which are really for that purpose only, to help the reader understand more clearly. Thanks for taking your time to answer my questions. Whether you're busy or not, I appreciate it greatly.

Regression toward the mean on win percent

30-11...The method shows them with a 24-18 record. In conclusion, the Rays have been very lucky so far. They will come back to the pack, you heard it here first! What is your take?

Virtually no team that has a 30-11 record at some point in the season is actually a true 30-11 team.

--Tangotiger 11:22, 23 July 2010 (PDT)

Batted Ball FIP

Given that ground balls, liners, and fly balls are associated with vastly different expected BABIP and Run Values, why is it that there are no pitching metrics that account for the kinds of batted balls that a pitcher allows? Could something like FIP be adjusted, using linear-weights for batted-ball types, to more accurately reflect/project, say, an 'extreme ground-ball pitcher' profile? And would it be able to better explain the performance of pitchers who are said to routinely out- or under-perform their FIP?... It seems like such an obvious addition, to me, that I'm almost wondering whether I haven't seen it done only because someone has tried it and it didn't work.

It's been done so many times that I'm surprised you haven't seen it already.

For example, on my own blog, do a search for bbFIP (batted ball FIP). That's one of many, many incarnations. There's tRA, there's SIERA. I mean, it's a long list.

--Tangotiger 11:22, 23 July 2010 (PDT)

So I guess, then, that I'm wondering why FIP is more widely used/referenced than bbFIP, since it seems to me that bbFIP should be more accurate/useful. But maybe this is something I'll learn when I run a search for it on your blog...

Right, read and learn, and you'll get some new perspectives.

FIP has value and bbFIP has value. They each have their place. Think of how many different kinds of shoes you own.

--Tangotiger 11:22, 23 July 2010 (PDT)