Measuring Team Efficiency (December 15, 2003)
Posted 4:33 p.m.,
December 15, 2003
(#1) -
Arvin
Wow. The Red Sox were inefficient both offensively and defensively. Scary. I'm sure Theo is happy to see that.
-Arvin
Request for statistical assistance (December 17, 2003)
Posted 12:08 a.m.,
December 22, 2003
(#20) -
Arvin
Alright... here are some preliminary thoughts.
Your catcher's delta is a difficult random variable to deal with.
Why? You're determining it by a fairly complex method.
Ostensibly, it's similar to the sum of two binomials.
The problem is, the binomials may be vastly different.
E.g., Carter:
PB distributed as Binomial(n=8000,p=.001) (making up numbers here).
PB2 distributed as Binomial(n=2000,p=.002)
X0 = PB-PB2*8000/2000
X = X0 normalized to 162 GP.
Thus, X is a linear combination (a scaled difference) of two binomial random variables with different n and different p.
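For reference, here is a sketch of what the mean and variance of X0 work out to, assuming the two counts are independent. The symbols n_1, p_1, n_2, p_2 are my own labels for the two binomials and are not from the thread.

```latex
\begin{align*}
X_0 &= PB - \frac{n_1}{n_2}\,PB_2, \qquad PB \sim \mathrm{Bin}(n_1, p_1),\quad PB_2 \sim \mathrm{Bin}(n_2, p_2) \\
E[X_0] &= n_1 p_1 - \frac{n_1}{n_2}\, n_2 p_2 = n_1 (p_1 - p_2) \\
\mathrm{Var}(X_0) &= n_1 p_1 (1 - p_1) + \left(\frac{n_1}{n_2}\right)^{2} n_2 p_2 (1 - p_2)
\end{align*}
```

With the made-up Carter numbers (n_1=8000, p_1=.001, n_2=2000, p_2=.002) that gives a mean of -8 and a variance of about 71.9, almost all of it coming from the scaled-up second term.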
Request for statistical assistance (December 17, 2003)
Posted 1:12 p.m.,
December 22, 2003
(#23) -
Arvin
Ok, further thoughts...
1) The binomial is a funny distribution. It's very similar to the normal distribution for .3<p<.7, but as you near the edges, it becomes increasingly skewed towards .5.
EXTREMELY skewed. What to do about it? Well, you can conclude that the normal approximation will do nothing for you. You can't use it.
2) Back to the combination of binomial R.V.'s:
PB-Carter = PBC ~ Bin(8104,.0009) (µ=7.3, σ² = 7.2)
PB-others = PBO ~ Bin(3598,.0022) (µ=7.8, σ² = 8.0)
δPB = PBC - (8104/3598)*PBO
Formula: Var(aX) = a^2*Var(X)
Thus,
Var((8104/3598)*PBO) = (8104/3598)^2*Var(PBO) = (8104/3598)^2*3598*(.0022)*(1-.0022)
Thus, the second term in the combination, (8104/3598)*PBO,
has (µ=17.8, σ² = 40.8)
δPB = PBC - (8104/3598)*PBO
= (µ=7.3, σ² = 7.2) - (µ=17.8, σ² = 40.8)
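As a quick check on the arithmetic, the exact moments can be computed directly from the formulas. This is only a sketch: the function names are mine, it assumes the two counts are independent, and it uses the p values as printed (which look rounded), so the results come out slightly different from the figures quoted above.

```python
def binomial_moments(n, p):
    """Return (mean, variance) of a Binomial(n, p) count."""
    return n * p, n * p * (1 - p)


def scaled_difference_moments(n1, p1, n2, p2):
    """Moments of X1 - (n1/n2)*X2 for independent binomial X1 and X2."""
    scale = n1 / n2
    mean1, var1 = binomial_moments(n1, p1)
    mean2, var2 = binomial_moments(n2, p2)
    # Var(aX) = a^2 * Var(X); the variances of independent terms add,
    # even though the second term is subtracted.
    return mean1 - scale * mean2, var1 + scale**2 * var2


mean, var = scaled_difference_moments(8104, 0.0009, 3598, 0.0022)
print(f"mean = {mean:.1f}, variance = {var:.1f}")
```

With the printed p values this gives a mean of roughly -10.5 and a variance of roughly 47, with the scaled second term contributing about 40 of that.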
Alright, what then? Simulation results(n=10,000) give:
δPB = (µ=-10.5, σ² = 48.4)
For independent counts the variances should add exactly, whatever the distribution (7.2 + 40.8 = 48.0); the simulated 48.4 is consistent with sampling noise at n=10,000. Here's a histogram of the resultant R.V.:
center of block   count
     -40.5            4
     -35.5           16
     -30.5           83
     -25.5          345
     -20.5          960
     -15.5         2146
     -10.5         2690
      -5.5         2423
      -0.5         1085
      +4.5          230
      +9.5           18
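A simulation along these lines can be sketched with nothing but the standard library. The post doesn't say what software produced the numbers above, so everything here is an assumption on my part: the function names, the seed, the inverse-CDF sampler (which is fine when n*p is small, as it is here), and the bins of width 5 centered on the values in the table.

```python
import math
import random


def draw_binomial(n, p, rng):
    """Draw one Binomial(n, p) value by inverting the CDF (fine for small n*p)."""
    u = rng.random()
    k = 0
    pmf = (1 - p) ** n  # P(X = 0)
    cdf = pmf
    while u > cdf and k < n:
        # Recurrence: P(X = k+1) = P(X = k) * (n-k)/(k+1) * p/(1-p)
        pmf *= (n - k) / (k + 1) * p / (1 - p)
        k += 1
        cdf += pmf
    return k


def simulate_delta(n1, p1, n2, p2, trials, seed=0):
    """Simulate X1 - (n1/n2)*X2 for independent binomial X1 and X2."""
    rng = random.Random(seed)
    scale = n1 / n2
    return [draw_binomial(n1, p1, rng) - scale * draw_binomial(n2, p2, rng)
            for _ in range(trials)]


def histogram(values, width=5.0, first_center=-40.5):
    """Count values in bins of the given width, keyed by bin center."""
    left_edge = first_center - width / 2
    counts = {}
    for v in values:
        idx = math.floor((v - left_edge) / width)
        center = first_center + idx * width
        counts[center] = counts.get(center, 0) + 1
    return dict(sorted(counts.items()))


draws = simulate_delta(8104, 0.0009, 3598, 0.0022, trials=10_000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(f"mean = {mean:.1f}, variance = {var:.1f}")
for center, count in histogram(draws).items():
    print(f"{center:+6.1f}  {count}")
```

The exact counts will vary with the seed, but the mean and variance should land near the values computed from the formulas, and the histogram should show the same mild left skew as the table.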
Things I notice:
a) the variance from the small n sample dominates.
b) the simulated variance comes close to the purely additive value (48.4 vs. 48.0). For independent counts the additive formula is exact, so you shouldn't need to fudge it upwards; the small gap looks like simulation noise.
c) the distribution is skewed but not too crazily skewed.
Next:
You're using ΔPB, which is δPB normalized to 162 Games Played.
Q) How do you do this normalization?
So far, we have δPB = -11 over 8104 PA. How is this normalized to Games Played?
Request for statistical assistance (December 17, 2003)
Posted 1:14 p.m.,
December 22, 2003
(#24) -
Arvin
First line edited:
1) The binomial is a funny distribution. It's very similar to the normal distribution for .3<p<.7, but as you near the edges, it becomes increasingly skewed towards .5.
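One way to put a number on "how skewed" is the standard skewness formula for the binomial, (1 - 2p) / sqrt(n*p*(1-p)). The sketch below just evaluates it for a few cases; the function name and the particular (n, p) pairs are my own picks for illustration. Note that it depends on n as well as p, so a tiny p with a large n can come out less skewed than you might expect.

```python
import math


def binomial_skewness(n, p):
    """Skewness of Binomial(n, p): (1 - 2p) / sqrt(n * p * (1 - p))."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))


# Positive skewness means a long right tail; 0 means symmetric.
cases = [(100, 0.5), (100, 0.3), (100, 0.01), (8104, 0.0009), (3598, 0.0022)]
for n, p in cases:
    print(f"n={n:5d}  p={p:<7}  skewness={binomial_skewness(n, p):+.2f}")
```

For the two passed-ball samples discussed in the thread (using the p values as printed), this comes out around 0.35 to 0.37.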