Aging Patterns
July 3, 2002 - Ed
Wading into the Ben-Gerry discussion of normality, I was curious, so I performed the Shapiro-Wilk test of normality on batting averages, by season, including in each season those players with 300+ PAs.
The table below shows each year, the number of players with 300+ PAs, and the observed p-value for that season's Shapiro-Wilk test. Whether one concludes that the seasonal results are normally distributed depends on what significance level one is willing to tolerate. [The null in the S-W test is normality.]
Year  N    p       Year  N    p       Year  N    p
1960  126  0.736   1974  216  0.353   1988  228  0.072
1961  151  0.030   1975  215  0.689   1989  231  0.072
1962  168  0.293   1976  211  0.181   1990  233  0.503
1963  166  0.233   1977  236  0.171   1991  226  0.861
1964  177  0.961   1978  224  0.623   1992  241  0.053
1965  173  0.899   1979  231  0.732   1993  244  0.346
1966  169  0.161   1980  237  0.246   1994  206  0.023
1967  168  0.271   1981  154  0.535   1995  224  0.153
1968  163  0.635   1982  225  0.920   1996  249  0.855
1969  194  0.085   1983  242  0.516   1997  244  0.187
1970  210  0.721   1984  231  0.709   1998  261  0.915
1971  206  0.183   1985  224  0.176   1999  276  0.781
1972  195  0.060   1986  234  0.013   2000  256  0.874
1973  222  0.968   1987  236  0.320   2001  260  0.581
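For anyone who wants to replicate this, the per-season test is straightforward with SciPy. A minimal sketch, assuming the per-season averages have already been loaded into a dict (the placeholder values below are made up; the real inputs are the season data described above):

    # Shapiro-Wilk normality test per season. The `seasons` dict is a
    # hypothetical stand-in for the real data (year -> batting averages
    # of all players with 300+ PAs that year).
    from scipy.stats import shapiro

    seasons = {
        1960: [0.312, 0.287, 0.254, 0.301],  # placeholder values
        # ... one entry per season ...
    }

    for year, averages in sorted(seasons.items()):
        stat, p_value = shapiro(averages)  # H0: sample drawn from a normal dist.
        print(year, len(averages), round(p_value, 3))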
Advances in Sabermetrics (August 18, 2003)
Posted 2:39 p.m.,
August 19, 2003
(#10) -
Ed
Your best guess should be that JW's true level is .380.
To guess otherwise is to misunderstand regression to the mean.
Advances in Sabermetrics (August 18, 2003)
Posted 12:32 a.m.,
August 20, 2003
(#23) -
Ed
I think we have a conflation of concepts here. The question, as originally posed, was:
"Let me ask a question then: all you know is the following 2 bits of information
- Johnny Walker has an OBA of .380, with 600 PA
- the league OBA is .340
What is your best guess as to JW's true OBA talent level? That is, if he were to have 1 million PAs, what's your single best guess as to his true OBA level? Is his chances at really being .380+ equal to, more than, or less than 50%?"
Now, we are not all being consistent in how we are using the term "true talent level." I am using it in the traditional frequentist sense. Call JW's true talent level T. I assume the 600 PA are a random sample of PA for JW. From that sample, we calculate an estimate of T, T*. Sampling theory tells us that the sample mean is an unbiased estimator of the population mean. That is, your best guess of T is T*. It is not T* plus or minus some value, depending on whether T* for JW is above or below the average OBA of a whole bunch of players. When we calculate T*, we can also calculate SE(T*), the standard error of T*, which reflects our level of uncertainty. T* is our best guess for T, whether the number of PA is 600 or 100. The standard error will be much larger for the latter, of course.
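To put a number on that uncertainty: if we treat each PA as an independent Bernoulli trial (a simplification), the standard error of the sample proportion is sqrt(p(1-p)/n). A quick sketch:

    # Standard error of a sample proportion, treating each PA as an
    # independent Bernoulli trial (a simplifying assumption).
    import math

    def se_of_proportion(p_hat, n):
        """Standard error of the sample mean of n Bernoulli trials."""
        return math.sqrt(p_hat * (1 - p_hat) / n)

    print(se_of_proportion(0.380, 600))  # ~0.020
    print(se_of_proportion(0.380, 100))  # ~0.049, much larger for fewer PAs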
But what about the league average? Let's call it MLB-T to distinguish it from JW-T. Everyone wants to factor that into our guess for JW-T, but unless one is willing to make a set of complicated dependence assumptions about the connection between the random draws for JW-T and MLB-T, MLB-T is irrelevant to the question as posed, unless one wants to go a Bayesian route and make some auxiliary assumptions. One could assume that the league OBA is always around .340 and treat the problem as an updating problem. We start with a prior for any given player, JW included, of .340. Then we update our beliefs about a player's ability (but not "true ability", which doesn't work very well in a Bayesian framework) as information (PAs) comes in. At the end of the 600 PAs at a .380 level, we will have updated our beliefs to be somewhere between .380 and .340. With more PAs at .380, we will get closer to .380. And this is not regression to the mean. It's just a way to work MLB-T into the calculations.
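For concreteness, here is what that updating looks like with a Beta prior. The prior strength of 300 pseudo-PAs is my own arbitrary choice; the .340 prior mean is the league OBA from the question:

    # Bayesian updating with a Beta prior centered at the league OBA.
    # The prior strength (300 pseudo-PAs) is an arbitrary assumption.
    prior_mean, prior_strength = 0.340, 300
    a = prior_mean * prior_strength          # prior "on-base" pseudo-count
    b = (1 - prior_mean) * prior_strength    # prior "out" pseudo-count

    pa = 600
    on_base = 0.380 * pa                     # 228 times on base
    a_post = a + on_base
    b_post = b + (pa - on_base)

    print(a_post / (a_post + b_post))        # ~0.367, between .340 and .380

With more PAs at .380 the data overwhelm the prior and the posterior mean climbs toward .380, exactly as described above.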
I think people want to work in regression to the mean because of the well known regression effect (usually credited to Galton) that leads to declines above the mean and increases below the mean, given non-perfect correlations. If you want to forecast OBA(t+1) from OBA(t), the regression forecasts for players above the mean at t will tend toward the mean (on average) and players below the mean at t will have forecasts that will tend toward the mean. But the forecasts for t+1 are not the same thing as the best guess of the true values for individual players. How could they be? If there is measurement error or other random errors that lead us to believe that one set of 600 PA is only an estimate of a player's true value, we should believe the same thing about any set of 600 PA, right?
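A toy simulation makes the distinction visible; all the distributional parameters below are invented for illustration:

    # The regression effect in simulation: players above the mean in
    # season t score lower, on average, in season t+1, even though each
    # individual season is an unbiased estimate of true talent.
    import numpy as np

    rng = np.random.default_rng(0)
    n_players, pa = 1000, 600
    talent = rng.normal(0.340, 0.020, n_players)   # hypothetical true OBAs
    season_t = rng.binomial(pa, talent) / pa       # observed season t
    season_t1 = rng.binomial(pa, talent) / pa      # observed season t+1

    above = season_t > 0.340
    print(season_t[above].mean(), season_t1[above].mean())    # t+1 avg is lower
    print(season_t[~above].mean(), season_t1[~above].mean())  # t+1 avg is higher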
To sum up a windy post, I agree with what people are saying when they say that our forecasts will be somewhere between .340 and .380 (because of the regression effect). I disagree that that is the same thing as making a best guess as to JW-T, given the way the question was posed. For that, I'll lean on sampling theory and feel quite safe.
Advances in Sabermetrics (August 18, 2003)
Posted 8:32 a.m.,
August 22, 2003
(#34) -
Ed
"A player will regress 100% towards his true mean."
I know what you, um, mean, Tango, but I think a better way to put this to avoid confusion is to say that in the long run, players converge to their true abilities. This can be justified by the law of large numbers, or what some people call the law of averages. It also provides the underpinning for estimating a population mean with a sample mean.
Our best guess for a player with 9791 PA, an OBA of .482 during a period of LgOBA of .356, say, is that his true OBA is .482, not a little less than .482. For these types of problems, think convergence, not regression.
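If you want to see the convergence, simulate it (the .482 true rate is taken from the example above; everything else is invented):

    # Law of large numbers: the observed OBA converges to the true OBA
    # as PAs accumulate.
    import numpy as np

    rng = np.random.default_rng(0)
    true_oba = 0.482
    for n in (100, 600, 9791, 1_000_000):
        print(n, rng.binomial(n, true_oba) / n)  # estimates tighten around .482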
HOOPSWORLD.com Review: Pro Basketball Prospectus 2003-04 Edition (November 18, 2003)
Posted 11:46 a.m.,
November 29, 2003
(#14) -
Ed
Does anyone know of factor-analytic studies being used to calculate similarity scores? It seems like a natural way to approach the problem, but I haven't seen it done before. It would be less ad hoc than some of the competition, especially with respect to how the various indicators are weighted. A sketch of what I mean is below.
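The stat matrix and the three-factor choice here are placeholders, not anyone's published method:

    # One possible factor-analytic similarity score: fit a factor model
    # to per-player stat lines, then measure similarity as (negative)
    # distance in factor-score space, so the weighting of indicators
    # comes out of the data instead of being chosen ad hoc.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    stats = rng.normal(size=(500, 12))   # placeholder: 500 players x 12 stats

    fa = FactorAnalysis(n_components=3)  # number of factors is a guess
    scores = fa.fit_transform(stats)     # factor scores per player

    def similarity(i, j):
        """Higher (less negative) means more similar players."""
        return -np.linalg.norm(scores[i] - scores[j])

    print(similarity(0, 1))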
Clutch Hits - True Talent Levels (January 22, 2004)
Posted 4:11 p.m.,
January 24, 2004
(#1) -
Ed
So technically this is another thread, right?
On the test score theory, one thing you should note is that the standard setup (as presented on the PSU page) makes an assumption of one-dimensionality. That is, the "latent trait" or the "true ability" or whatever is only one-dimensional. Tango is introducing another dimension, which would leave one with something like:
X_i1 and X_i2 instead of just X_i.
To deal with those types of models, psychometricians typically use the more modern technique of Item Response Theory (IRT), which can much more easily accommodate multiple dimensions. Implicit in what you are arguing, I think, is an IRT model with covariates, where the covariates capture the "context" that affects the IRT scores. But I really don't think you want more than one dimension.
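For reference, the item response functions in question look like this; parameter names follow common IRT notation, and the two-dimensional variant is the kind of extension an extra latent dimension would require:

    # Standard unidimensional 2PL item response function, plus a
    # compensatory two-dimensional variant.
    import math

    def p_2pl(theta, a, b):
        """P(success) given ability theta, discrimination a, difficulty b."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    def p_2pl_2d(theta1, theta2, a1, a2, d):
        """Two latent traits combine additively before the logistic link."""
        return 1.0 / (1.0 + math.exp(-(a1 * theta1 + a2 * theta2 + d)))

    print(p_2pl(0.5, 1.2, 0.0))                # one latent trait
    print(p_2pl_2d(0.5, -0.3, 1.2, 0.8, 0.1))  # two latent traits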
Baseball Prospectus - : Evaluating Defense (March 1, 2004)
Posted 8:33 a.m.,
March 2, 2004
(#6) -
Ed
I've started to read through BP 04 and was surprised to see Scott Rolen with negative fielding runs for 2003. Do any other metrics produce this surprising result?
Baseball Prospectus - : Evaluating Defense (March 1, 2004)
Posted 9:25 a.m.,
March 2, 2004
(#8) -
Ed
Interesting. I seem to remember Diamond Mind giving him a GG for 03, but that may be faulty memory.
Baseball Prospectus - : Evaluating Defense (March 1, 2004)
Posted 9:27 a.m.,
March 2, 2004
(#9) -
Ed
Answering my own question, Beltre beat out Rolen in 2003 (homepage).