Tango on Baseball Archives

© Tangotiger

SABR 301 - Talent Distributions (June 5, 2003)

The following shows the theoretical distributions of talent. It is based on estimates, some empirical data, and my intuition.
--posted by TangoTiger at 09:39 AM EDT


Posted 11:36 a.m., June 5, 2003 (#1) - PhillyBooster
  It seems from those charts that it should be possible to determine the quality of the "median" baseball player (or, the same thing, the quality of the average plate appearance).

I have wondered whether, in calculating a stat like OPS+ or ERA+, the comparison should set 100 at "league average" rather than "league median". Presumably, adding a handful of greats will not affect replacement level or median, but it could have a big impact on "average".

Value over replacement is useful for looking at what happens if Player X is injured in June, but when determining who to sign in a relatively plentiful free-agent market in December, it should be worthwhile to know how Player X (who, let us assume, is exactly 'average') will measure up in a typical plate appearance (against a pitcher who, by the definition of "typical", is exactly 'median').

Or, to look at it another way, a median player is the 375th best player in baseball (halfway point on thirty 25-man rosters). Is the average player the 350th best? The 300th best?

Posted 11:58 a.m., June 5, 2003 (#2) - tangotiger
  In a typical plate appearance, a player will not face the median pitcher, but the average pitcher. If you've got say 360 pitchers, but the BFP are spread out based on talent, with the 360th pitcher barely pitching, then the batter will face a weighted version of the 360 pitchers, which comes out to exactly average.

However, the concept of median and other things that you can gather from these charts is certainly interesting. You just have to be careful in how you apply it, and its purpose.

Posted 11:58 a.m., June 5, 2003 (#3) - Yaz
  Good Lord, Tango, that's great stuff. Very interesting. Thanks.

Posted 12:08 p.m., June 5, 2003 (#4) - RossCWXYZ
  This is a waste of time. If you need to know how baseball works, ask Syd Thrift!

Posted 12:13 p.m., June 5, 2003 (#5) - Randal
  Are you the biggest idiot ever?

Posted 12:33 p.m., June 5, 2003 (#6) - Patriot
  #4 and 5...my point exactly. Of course, I understand that it would be a big hassle for Primer to have a registration system so I can't blame them for not having one.

The idea of the median player may have some applications, but average is still very important because that is your opponent. Using median as a baseline has basically the same pitfalls as using average and doesn't change much...plus it isn't nearly as useful.

Posted 12:48 p.m., June 5, 2003 (#7) - PhillyBooster
  Perhaps I am misinterpreting Chart 4 and Chart 6 in the article.

I interpreted Chart 4 to say that average major league talent is at about 4.45 standard deviations (the x-value where the blue line crossed 1.00 on the y-axis).

But then I looked at Chart 6, and saw that 4.45 was also the high point of the skewed bell curve. From eyeballing the chart, it appeared that there were more plate appearances to the right of 4.45 than there were to the left, and that those to the right extended a greater distance from 4.45. Therefore, I concluded that the average major league plate appearance was in fact performed by a player better than 4.45.

If more than half of all plate appearances are by players who are better than 4.45 standard deviations, then the "average" plate appearance would be by a 4.50 or 4.55 player. The 1.00 in Chart 4, I therefore concluded, represented not an "average" player (a player who, for all other at bats, half are by someone better than him), but a median player (a player who, for all other players in MLB, half are better than him), who was somewhat worse than an average player.

Perhaps, though, I have misunderstood.

Posted 12:50 p.m., June 5, 2003 (#8) - PhillyBooster
  My question, I guess, is:

1. What does it mean that a 4.45 player has a talent level of "1.00"? and

2. What does it mean that more than half of all plate appearances are above the 1.00 level?

Posted 1:11 p.m., June 5, 2003 (#9) - tangotiger
  Philly, I'll answer your specific questions in post8, and if you have other questions from post7, please rephrase them. I had a hard time following it.

1 - 1.00 is just a "fictitious" number, like Pamela Anderson is a 9.5 / 10, or what have you. I say "fictitious" in quotes because there is some reason behind it, but I haven't presented it here, though I will in the future. It's a number that can be multiplied and divided. A guy with .5 talent level compared to a 1.0 talent level is the same as a 1.0 talent level compared to a 2.0 talent level. That is, if you have a 1.0 hitter against a .5 pitcher, the resultant expected matchup is exactly the same as a 2.0 hitter against a 1.0 pitcher.

Those numbers might be roughly equivalent to a single-A pitcher (.50), an avg MLB hitter or pitcher (1.00) and Barry Bonds/Pedro (2.0). (Maybe top college and not single-A... I don't know yet.)
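One way to read the ratio property described above is that only the quotient of the two talent levels matters. The following is a minimal sketch under that assumption. The function name `matchup_ratio` is hypothetical, and the post does not give the actual matchup formula, so this only illustrates the scale-invariance claim.

```python
def matchup_ratio(hitter_talent: float, pitcher_talent: float) -> float:
    """Relative strength of a hitter against a pitcher on a ratio scale.

    Assumption (not from the post): the matchup depends only on the
    quotient of the two talent levels, which is what the
    ".5 vs 1.0 equals 1.0 vs 2.0" statement implies.
    """
    if hitter_talent <= 0 or pitcher_talent <= 0:
        raise ValueError("talent levels must be positive on a ratio scale")
    return hitter_talent / pitcher_talent


# A 1.0 hitter against a .5 pitcher and a 2.0 hitter against a 1.0 pitcher
# give the same ratio, so the two matchups are treated as equivalent.
print(matchup_ratio(1.0, 0.5))
print(matchup_ratio(2.0, 1.0))
```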

2 - If there are say 680x9x30 PA in MLB, and you have say 14x30 hitters, then the avg #ofPA per average hitter is about 430 PAs. A top hitter would have say 700PAs, or 160% of average, or 1.60. Something like that. I was thinking of putting actual numbers, say from a scale of 0 to 750, instead of 0 to 1.8, or whatever. I kind of like having the average at 1.0 though.
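The arithmetic in point 2 can be checked directly. This sketch uses only the figures quoted in the post; reading "680x9x30" as plate appearances per lineup slot, times lineup slots, times teams is an assumption about what the factors mean. The average comes out near 437, in line with the post's "about 430".

```python
# Figures quoted in the post; all are rough, "say"-level estimates.
# Assumed reading: 680 PA per lineup slot, 9 slots per team, 30 teams.
PA_PER_LINEUP_SLOT = 680
LINEUP_SLOTS = 9
TEAMS = 30
HITTERS_PER_TEAM = 14

total_pa = PA_PER_LINEUP_SLOT * LINEUP_SLOTS * TEAMS   # all PA in MLB
hitters = HITTERS_PER_TEAM * TEAMS                      # all hitters in MLB
avg_pa = total_pa / hitters                             # PA per average hitter

top_hitter_pa = 700
relative_playing_time = top_hitter_pa / avg_pa          # 1.00 = average

print(round(avg_pa), round(relative_playing_time, 2))
```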

Posted 2:52 p.m., June 5, 2003 (#10) - PhillyBooster
  I will try again, since I guess I was not very clear.

Chart 6 has a Y-axis labelled "Playing Time" 1-100. Draw a vertical line on that chart so that there is exactly as much Playing Time on each side. That line would probably fall between 4.50 and 4.55 (in any event, to the right of 4.45). But 4.45 is defined as 1.00 ("MLB average") in chart 4.

Why is the mid-point of Playing Time chart (Chart 6) not identical to the MLB average (1.00) in Chart 4?

Posted 3:28 p.m., June 5, 2003 (#11) - Vinay Kumar
  Why is the mid-point of Playing Time chart (Chart 6) not identical to the MLB average (1.00) in Chart 4?

PhillyBooster, I don't think MLB average is defined as 1.00. Tango says in #9 that 1.00 "might be roughly equivalent to ... an avg MLB hitter or pitcher." In the scale for talent, the numbers are useful only relative to each other; the absolute numbers don't mean anything (well, apparently they do, but we don't know what yet).

If you think the mean in Chart 6 is 4.50 or even 4.55, that means the average talent level is about 1.05, so it's not that big a difference.

But this does have me curious now what these numbers do mean; what is 1.00, and why is it that?

Posted 4:23 p.m., June 5, 2003 (#12) - tangotiger
  Chart 6 has a Y-axis labelled "Playing Time" 1-100. Draw a vertical line on that chart so that there is exactly as much Playing Time on each side. That line would probably fall between 4.50 and 4.55 (in any event, to the right of 4.45).

Yes, that's correct.

But 4.45 is defined as 1.00 ("MLB average") in chart 4.

4.45 is defined as the talent level of 1.00. The talent level of 1.00 is the MLB average.

Why is the mid-point of Playing Time chart (Chart 6) not identical to the MLB average (1.00) in Chart 4?

Because the distribution is not normal. Chart 6 is a multiplication of Chart 5 (with a very high skew to the right) and Chart 2 (with a very high skew to the left).

The mid-point of chart 6 (Playing Time chart) would be the *median* and not the mean. Because of the skew, this is pretty much what we expected.
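To illustrate why the mid-point of a playing-time chart need not coincide with its mean, here is a small sketch. All of the numbers are invented for illustration and are not read off the charts; they only mimic the shapes described (player counts falling with talent, playing time per player rising with talent). Which way the mean and median differ depends on those invented numbers.

```python
# Invented numbers for illustration only; they are not read off the charts.
levels = [4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8]          # talent, in SDs
players = [400, 300, 220, 150, 90, 50, 20]            # shaped like Chart 2
pa_per_player = [50, 150, 300, 450, 550, 620, 680]    # shaped like Chart 5

# Playing time at each level is the product of the two (as in Chart 6).
playing_time = [n * pa for n, pa in zip(players, pa_per_player)]
total = sum(playing_time)

# Mean level, weighted by playing time.
mean_level = sum(x * w for x, w in zip(levels, playing_time)) / total

# Median level: the first level at which cumulative playing time
# reaches half of the total.
running = 0
median_level = None
for x, w in zip(levels, playing_time):
    running += w
    if running * 2 >= total:
        median_level = x
        break

print(f"mean level   = {mean_level:.3f}")
print(f"median level = {median_level}")
```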

Posted 4:31 p.m., June 5, 2003 (#13) - tangotiger
  I made an error in my article. If you multiply Chart SIX by Chart 4, you'll get 1.00. That's the same as multiplying Charts 2 (number of players at each level), 4 (talent at each level), and 5 (playing time at each level).

Vinay, you might remember I had a thread last week regarding "ERA by era" or some such. And in there, I showed that regardless of the run environment, your ERA relative to league average ERA was pretty constant. So, if you are Pedro with a 2 ERA in a league of 4, you should expect to be a 2.5 ERA in a league of 5. That makes his ERA+ as 200, or twice the league average. That gives him a talent level of 2.00, in a league of 1.00.
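The Pedro example can be written out as a short calculation. This is a sketch that ignores park adjustment, which a full ERA+ calculation would include; the helper names `era_plus` and `talent_level` are hypothetical.

```python
def era_plus(era: float, league_era: float) -> float:
    """ERA relative to league on a 100 scale (park adjustment omitted)."""
    return 100.0 * league_era / era


def talent_level(era: float, league_era: float) -> float:
    """Same ratio on the thread's scale, where the league average is 1.00."""
    return league_era / era


# A 2.00 ERA in a league of 4.00 and a 2.50 ERA in a league of 5.00
# are the same relative performance.
print(era_plus(2.0, 4.0), era_plus(2.5, 5.0))
print(talent_level(2.0, 4.0), talent_level(2.5, 5.0))
```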

If you decided to double the number of MLB teams, the talent level would drop to say .83, and Pedro maintains his 2.00 talent level. I used 1.00 as a convenient marker, but it was fixed, so that I didn't always have to redo the baseline.

Consider the 1.00 level to be the avg MLB player in 2002.

Posted 7:32 p.m., June 5, 2003 (#14) - Walt Davis
  mostly a non sequitur, but let me just jump in and say that normal distributions are anything but typical. About the only places in the real world where you find normal distributions are in physical/biological matters (e.g. height and quite possibly baseball talent) and large enough samples of something close enough to a binomial distribution. To most statisticians, the real-world normal distribution is a bit like the holy grail.

I bring this up only because this is at least the third time I've seen a sabermetrician mention how common or typical a normal distribution is.

Posted 8:35 p.m., June 5, 2003 (#15) - Tom N
  As far as I can surmise:

The talent distribution within baseball is the far right of the curve, with, in terms of talent, very few truly great players, a few very good ones, etc. There is, however, an absolute number of plate appearances/pitches required in order to play a proper game. The number of PAs or pitches that can be performed by players at the right of distribution is limited by their scarcity. Similarly, far less playing time goes to the players at the left because of their crappiness. Thus when one considers opportunities, major league talent distribution is normal.

I hope the above does not misrepresent what the graphs convey. If so, please correct me. But isn't it the case that almost all players to the left of the batting distribution have something to recommend them: defensive skill, the ability to pitch, looks good in a uniform, minor league lifer getting September call-up? Unless your point is that given the scarcity of players on the right, it's necessary that the last batter or two on the bench or perhaps all the Detroit Tigers be pretty crummy. Or perhaps the point is that no matter how large or small the red box is drawn, this distribution, when playing time is taken into account, will remain relatively stable? I'm not sure, since it seems that the plateau on the right of chart 5 will just extend further to the left, and the drop-off will remain just as steep. Feel free to disabuse me of any wrong-headed notions in the above post.

Posted 8:54 p.m., June 5, 2003 (#16) - Kevin Harlow (homepage)
  Assuming something close to a normal distribution (which may not be technically accurate) I've created ERA++ which is ERA+ adjusted based upon the league COV of ERA+.

(ERA++) = 1 / [1-RefCOV(PF*LgERA)/COV(PF*LgERA) * (1-1/(ERA+))]

When RefCOV(PF*LgERA)=COV(PF*LgERA) then ERA++ = ERA+.

Derivation is at my website.
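The formula above translates into a few lines of code. This is a sketch, not Kevin's implementation: it assumes ERA+ is expressed as a ratio (2.0 rather than 200) and that COV is the coefficient of variation, as a later comment in this thread works out. The function and parameter names are hypothetical, and the COV values in the example are made up.

```python
def era_plus_plus(era_plus: float, ref_cov: float, cov: float) -> float:
    """ERA++ = 1 / [1 - (RefCOV / COV) * (1 - 1 / ERA+)].

    era_plus: ERA+ as a ratio (assumption: 2.0 means twice league average)
    ref_cov:  coefficient of variation in the reference league
    cov:      coefficient of variation in the pitcher's own league
    """
    spread_ratio = ref_cov / cov
    return 1.0 / (1.0 - spread_ratio * (1.0 - 1.0 / era_plus))


# When the two spreads match, ERA++ equals ERA+.
print(era_plus_plus(2.0, ref_cov=0.20, cov=0.20))

# Made-up case: the pitcher's league has twice the spread of the reference
# league, so the same ERA+ is discounted.
print(era_plus_plus(2.0, ref_cov=0.20, cov=0.40))
```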

Posted 10:29 p.m., June 5, 2003 (#17) - tangotiger
  Walt: I was trying to avoid technical terms so that I wouldn't get slammed for using normal, binomial, standard normal, etc improperly.

common or typical a normal distribution is

I don't think I said that a normal distribution was typical, but rather that the distribution that does exist in Chart 6 was a typical looking distribution (in lay terms).

For my education, if the median is to the left of the mean, what is that distribution called? Does a normal distribution imply that the mean and median are equal? Does a standard normal distribution imply that 68% of the points fall within 1 SD and that the mean and median are equal?

Thanks...

Tom: yes, that's pretty much it.

Kevin: please explain the purpose of the equation. I don't know what it's trying to tell me.

Posted 1:48 a.m., June 6, 2003 (#18) - Michael
  Stat terms 101 as applied to this article:

I think one thing that must be kept in mind is that the distribution of talent in MLB (according to this study) is distributed in a normal way when considering playing time. If you don't consider PA, but instead consider by number of players, then you get the exponential distribution as in chart 2. For certain discussions the by player is a key point (like what replacement level might be). For certain discussions (like what an "average" player faced in a typical mlb situation is like) the PA weighted version is more powerful.

The distribution with a higher than normal distribution of values left of the mode [the most frequent value] is a skewed distribution (with a left-skew, or negative skew value). There is also the kurtosis of a distribution that determines how skinny or fat the distribution is (positive narrow, negative wide, 0 normal). In the normal Gaussian distribution the mean = mode = median. I've normally (no pun intended) heard a "normal like" distribution that is skewed called a, cleverly enough, skew-normal distribution.

For a finite sample you can calculate the skew by taking the (sum of ((the difference between each value and the mean), cubed) divided by the number of samples), all divided by the cube of the standard deviation.

The standard deviation being the familiar square root of ((the sum of (the difference between each value and the mean), squared) divided by the number of samples) which generally measures the spread of a distribution.

The kurtosis can be calculated similarly as the (sum of ((the difference between each value and the mean), to the fourth power) divided by the number of samples), all divided by the fourth power of the standard deviation. Subtract 3 from that result to get the version that is 0 for a normal distribution.

When the kurtosis and the skew are near zero and you have a normal like distribution then the confidence ranges rules of thumb are the familiar 68% within 1 SD of mean, etc.
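The verbal recipes above for standard deviation, skew, and kurtosis can be sketched as one function. It divides by the number of samples throughout, as described. The kurtosis returned is the raw fourth-moment ratio, which is 3 for a normal distribution; subtracting 3 gives the "excess" kurtosis that is 0 for a normal distribution. The sample data are made up.

```python
import math


def moments(values):
    """Return (mean, standard deviation, skew, kurtosis) of a finite sample."""
    n = len(values)
    mean = sum(values) / n
    # Standard deviation: root of the average squared difference from the mean.
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Skew: average cubed difference, divided by the cube of the SD.
    skew = (sum((v - mean) ** 3 for v in values) / n) / sd ** 3
    # Kurtosis: average fourth-power difference, divided by SD to the fourth.
    kurt = (sum((v - mean) ** 4 for v in values) / n) / sd ** 4
    return mean, sd, skew, kurt


symmetric = [1, 2, 3, 4, 5]        # made-up data, balanced around the mean
right_tailed = [1, 1, 1, 2, 10]    # made-up data with a long right tail

for sample in (symmetric, right_tailed):
    mean, sd, skew, kurt = moments(sample)
    print(f"mean={mean:.2f} sd={sd:.2f} skew={skew:.2f} excess kurtosis={kurt - 3:.2f}")
```

A long right tail gives a positive skew, and mirroring the data flips the sign.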

Now to try to explain what I think we need for what Kevin is getting at:

covariance. So Cov is short for covariance which is a measure of, cleverly enough, how two things co-vary. I.e., how they vary together. Covariance is the average of the product of the differences between the two values and their distributions' means. For instance imagine we were measuring the baseball talent of people residing in the US and the income of people residing in the US. If the average baseball talent is B and the average salary is $ then for each person [p] we compute their talent level [T(p)] and their income [I(p)] then for each p residing in the US we do (T(p) - B) * (I(p) - $) and take the average of all these products across all these p's to get the covariance(T, I). (Actually we divide the sum of the product from these n people by (n-1) for some technical reason I can never remember).

I think that paragraph makes sense if you read it carefully. But even if not what it implies is that if the covariance is positive then the two things we are measuring (variables) increase together (in the example above we might expect there to be a [slight?] positive covariance between baseball talent and income level). If as one variable goes up the other goes down you end up with a negative covariance. If the variables being analyzed are independent then the covariance is 0. Using some math one can figure out that the absolute value of the covariance of X and Y (where X and Y are two variables we are studying) is always less than or equal to the product of the standard deviation of X with the standard deviation of Y. Also the covariance of X and X (I.e., of a variable with itself) is always equal to the standard deviation squared (aka the variance).

Now this covariance may sound like something similar to correlation. But one thing to note is the units of covariance are in terms of what the units of X and Y are and the size of the covariance of X and Y is to a very large degree influenced by the standard deviations of X and Y (I.e., a huge standard deviation of X will lead to a larger value for the covariance of X and Y than one might expect for things that are not related). This means that if we want to tell how related X is to Y and compare that with how related A is to B we might be in trouble using just covariance.

So there we bust out correlation and move each distribution to a more standard z-score (mean of 0, standard deviation of 1) by dividing each point in X by the standard deviation of X and subtracting from each point the mean/std. deviation so the new distribution has a mean of 0 and a standard deviation of 1. This means each point is now measured in standard deviations from the mean. So if we do the average of the product of z-scores of X and Y (instead of the average of the product of the differences between each point and the mean of the respective distributions) we calculate the correlation coefficient. Note this is just a special type of covariance so the sign of the correlation should be the same as the sign of the covariance. Also note that thanks to what we know about covariance the correlation must be between -1 and 1 because the standard deviation of each set of z-scores is 1.

Once you have the correlation you can compare the correlation of X and Y with the correlation of A and B (or even with the correlation of X and A). Also you can tell how much the variance in one distribution is explained by the other distribution by squaring the correlation (the r^2 number that so many stat projects show).
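The covariance and correlation steps described above can be sketched as follows. The talent and income figures are made up, and the function names are hypothetical. The code divides by (n-1) as the covariance paragraph mentions, and computes correlation as the covariance of the z-scores.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sum of products of differences from each mean, divided by (n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def z_scores(values):
    """Rescale to mean 0 and standard deviation 1."""
    m = mean(values)
    sd = math.sqrt(covariance(values, values))  # cov(X, X) is the variance
    return [(v - m) / sd for v in values]


def correlation(xs, ys):
    """Covariance of the z-scores, which always lies between -1 and 1."""
    return covariance(z_scores(xs), z_scores(ys))


# Made-up data: baseball talent and income for five people.
talent = [1, 2, 3, 4, 5]
income = [30, 45, 50, 70, 80]

r = correlation(talent, income)
print(f"covariance  = {covariance(talent, income):.2f}")
print(f"correlation = {r:.3f}")
print(f"r^2         = {r * r:.3f}")
```

Rescaling income (say, from thousands of dollars to dollars) changes the covariance but leaves the correlation untouched, which is the reason for standardizing in the first place.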

So getting back to Kevin, the confusing thing is I don't get what the covariances are of. Are they the covariance of park factors and league ERA? Are they instead the covariance of ERA+ and league?

Think, think, think... Ah, I should RTFM and then I'd learn COV by Kevin is actually coefficient of variation (std. deviation/ average). So what he's doing is converting the ERA+ to z-scores and then comparing seasons across time on the z score axis. Which as alluded to above in my covariance digression is an attempt to put things on the same units for comparison. In other words it is an attempt to compare an ERA+ of 150 in 2002 with an ERA+ of 150 in 1960 (amongst other things). It answers the question "is an ERA+ of 200 equally good across all time" with "no, look at the ERA++".
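For reference, the coefficient of variation is a one-line calculation. This sketch uses the population standard deviation, which is an assumption since the thread does not say which version Kevin uses; the two sets of ERA+ values are made up to show leagues with the same average but different spreads.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the average (population SD assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Made-up ERA+ values: same average, different spread.
tight_league = [90, 100, 110]
wide_league = [70, 100, 130]

print(round(coefficient_of_variation(tight_league), 3))
print(round(coefficient_of_variation(wide_league), 3))
```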

Posted 9:43 a.m., June 6, 2003 (#19) - tangotiger
  Michael, thanks much for all that info.

I really don't get the use of the covariance for the ERA, insomuch as what it's trying to tell us.

I showed mathematically that an ERA of 2.00 in a 4 RPG environment is the same as a 2.5 in a 5 RPG environment. Therefore, an ERA+ type fits the bill.

However, if you are trying to use the covariance to say "how hard" is it to get an ERA+ of 200, based on the talent distribution of your opponents, or your peers or something, that's another issue. That maybe there's such diluted talent that an ERA+ of 200 in 1906 is equivalent to 160 in 1993 or something. That's really a whole other ball of wax (and really more in line with what I'm trying to do here, than what ERA+ equivalencies are normally used for).

It almost looks like Kevin is trying to work backwards by trying to infer what the talent distribution could have been to produce those results. It's a worthwhile exercise, but I must ask what is the confidence level and sampling error in doing so.

Posted 11:50 a.m., June 6, 2003 (#20) - Ted Arrowsmith
  The fact that Tango is developing here that the median player does not produce a mean performance is quite important. This fact causes a casual (i.e., non-statistical) observer to fall into error.

As Walt notes, many biological/human characteristics are normally distributed: height, outgoingness, anxiety, openness to new ideas or experiences. Thus, for many characteristics the mean, mode, and median are the same. That is, in most things we evaluate informally the "typical" person and the "average" person are the same.

But, as Tango's graphs show, this is not true of baseball players. If you take a typical ball player -- one that people describe as "average" in everyday usage-- you will not get a league average performance from that player. I suspect that this is a major cause of overvaluing players. Derek Bell seems like a typical outfielder so our usual decision making processes tell us that he should provide league average performance. But our common sense lets us down and we're left with Operation Shutdown.

As is frequently observed, many pennants are lost for lack of a couple league average performances. You can't put a typical third baseman at third and get average performance.

Posted 12:34 p.m., June 6, 2003 (#21) - tangotiger
  Well said.

Another thing that opens up now is "Regression towards the mean". What mean? The unweighted mean of all MLB players (say talent level .91)? The weighted mean by PA of all MLB players (talent level 1.00)?

The issue is with PA: is that based on a player's sample performance, or based on his "tools"? This becomes critical, especially for rookies. If I were to regenerate the charts, but only look at say 22 year olds, things are going to get skewed differently. Are 22 year olds given PAs by talent or performance? What mean do we regress them towards? What's the difference between a performance level of a 22 year old of .85 in MLB than in TripleA?

The only reason he's in MLB is because he was selected there for some reason. If that selection was based on tools, that's one thing. But if it was based on sample performance, that's quite another.

Posted 12:49 p.m., June 6, 2003 (#22) - Dave Studenmund (homepage)
  Just a comment to help me fully understand the line of reasoning. Maybe add a graph that replicates #6 (placed before current #5), but by number of major leaguers instead of plate appearance opportunities. This would make the distinction between median and average more apparent.

Also, I have an incredibly picky point: the graphs are a bit big for my 800x600 screen resolution, and the right sides get lost. Sorry for being picky, but I thought you might like this pointed out.

Posted 1:32 p.m., June 6, 2003 (#23) - tangotiger
  Dave, you want me to multiply Chart 2 (number of players, per SD) by Chart 4 (talent level, per SD)? That resultant will be a weighting of the players, by talent level, per SD.

That is, if you had 300 players at 4.2 and 150 at 4.6, and the talent level at 4.2 was .8 and the talent level at 4.6 was 1.6, then you'd get "240" at 4.2 and at 4.6.

What would this represent?

If you want to multiply them all AND add them, to come up with one number, that's a different story. This would be the unweighted average talent level in MLB. The answer to that question would be 0.92. That is, if you take the 1500 players who play in MLB in any given year, and they were to each play the same amount, their mean talent level would be 0.92.
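The difference between the unweighted mean and the playing-time-weighted mean can be sketched with the two levels from the example above. The player counts and talent levels are the ones quoted in the post; the plate appearances per player are invented for illustration, so the results do not reproduce the 0.92 and 1.00 figures from the actual charts.

```python
# Player counts and talent levels are the ones quoted in the post.
# The plate-appearance figures are invented for illustration.
levels = [
    # (talent in SDs, players, talent level, PA per player)
    (4.2, 300, 0.8, 200),
    (4.6, 150, 1.6, 600),
]

# Players times talent at each level: "240" at both 4.2 and 4.6.
for sd, players, talent, _ in levels:
    print(sd, round(players * talent))

# Unweighted mean: every player counts the same.
total_players = sum(players for _, players, _, _ in levels)
unweighted = sum(players * talent for _, players, talent, _ in levels) / total_players

# PA-weighted mean: every plate appearance counts the same.
total_pa = sum(players * pa for _, players, _, pa in levels)
weighted = sum(players * pa * talent for _, players, talent, pa in levels) / total_pa

print(round(unweighted, 3), round(weighted, 3))
```

Because better players get more plate appearances, the weighted mean sits above the unweighted one, which is the same direction as the gap between 0.92 and 1.00 in the post.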

Posted 11:29 p.m., June 6, 2003 (#24) - Dave Studenmund (homepage)
  I originally thought of your first idea, Tango, but you're right. The second idea is better. Actually, it would be interesting to impose the one distribution on top of the other so the difference would be more apparent. Just an idea.

Posted 12:36 p.m., July 7, 2003 (#25) - tangotiger
  With these distributions, we are in a good position to establish how much "talent dilution" exists with adding or subtracting teams.

If we assume that the current 2003 average player has a talent level of 100, what would happen if half the teams were to disband, leaving us with 14 or 16 teams? What would the average player look like?

I figure that the average player in such a league would have a talent level of 110, which is roughly equivalent to a player that is right now about +1 win / 162 GP over the 2003 average player.

How about if we double the number of teams from 30 to 60 teams? These talent distributions say that the average player in such a league would have a talent level of 90, or about -1 win / 162 GP from the average 2003 player. Troy O'Leary or Ricky Ledee would be an average player in such a league.

So, when we talk about adding or subtracting 4 teams, how much impact is that? This would have the effect of making the player with the 97 talent level or 103 talent level average. Effectively, this would be imperceptible to the viewer.

So, when people talk about "talent dilution", it's hard to see it, if you are talking about adding/subtracting 2 or 4 teams.

Posted 6:40 p.m., July 7, 2003 (#26) - MAH
  Tango, great article, graphics and comments. Your last comment made me wonder whether it would be possible to build a sensible model to "discount" player performances from eras in which the major league talent pool was artificially constricted (by the color bar, for example) or expanded (by the siphoning off of talent from declining non-affiliated "minor" leagues, such as the PCL). Any thoughts?

Posted 9:02 p.m., July 7, 2003 (#27) - tangotiger
  At fanhome I did a very long study on timeline adjustments. However, based on assumptions, you can make the case that Ruth's era had half the talent as today, making Ruth average today. Using a very slightly different assumption, Ruth then became much much better.

The study also suffered from my non-understanding (at the time) of regression towards the mean, and my non-understanding that a player's performance is only a sample of his true talent, and not representative of his talent. This basically invalidated everything I've just said about Ruth.

I've seen this error repeated when looking at aging patterns for pitchers, where regression towards the mean is much more important to understand. You can spot these studies when they show a pitcher's peak age to be 23.

I was going to rerun that study (eventually).

The talent distributions that I've listed here would probably be best used as an explanation after the fact, rather than to lead to a conclusion.

Posted 8:21 p.m., July 9, 2003 (#28) - FJM
  In a perfect world there would be a perfect correlation between talent level and playing time. The real world is far from perfect. I hope we can all agree that Barry Bonds was the best outfielder last year. Yet he was tied for 33rd (with Trot Nixon, no less!) in PA's. Manny Ramirez had the 2nd best OPS among outfielders, yet he ranked 55th in PA's.

Injuries are the most obvious reason that the correlation breaks down, but they are not the only one. Ichiro led the way in PA's, but his OPS was far down the list. (In fact, he barely beat the aforementioned Mr. Nixon, .813 to .808.) Darrin Erstad ranked 15th in PA's, despite a lousy .702 OPS. I hope your future studies will develop a more complete Playing Time model.