Individual Poster Page

See copyright notice at the bottom of this page.

Measuring Team Efficiency (December 15, 2003)

Discussion Thread

Posted 4:33 p.m., December 15, 2003 (#1) - Arvin
  Wow. The Red Sox were both offensively inefficient and defensively inefficient. Scary. I'm sure Theo is happy to see that.

-Arvin



Request for statistical assistance (December 17, 2003)

Discussion Thread

Posted 12:08 a.m., December 22, 2003 (#20) - Arvin
  Alright... here are some preliminary thoughts.
Your catcher's delta is a difficult random variable to deal with.
Why? You're determining it by a fairly complex method.
Ostensibly, it's similar to the sum of two binomials.
The problem is, the binomials may be vastly different.
e.g. Carter:
PB distributed as Binomial(n=8000, p=.001) (making up numbers here).
PB2 distributed as Binomial(n=2000, p=.002)
X0 = PB - PB2*(8000/2000)
X = X0 normalized to 162 GP.
Thus, X is a scaled difference of two binomial random variables with different
N and different p.
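
To make the made-up example concrete, here is a minimal sketch of the mean and variance of X0. It assumes PB and PB2 are independent, which the post does not state, and the function names are hypothetical. The normalization to 162 GP is left out because the method isn't given.

```python
def binomial_mean_var(n, p):
    """Mean and variance of a Binomial(n, p) count."""
    mean = n * p
    return mean, mean * (1.0 - p)


def scaled_difference_mean_var(n1, p1, n2, p2):
    """Mean and variance of X0 = PB - (n1/n2) * PB2.

    Assumes PB ~ Binomial(n1, p1) and PB2 ~ Binomial(n2, p2) are independent,
    so the variances add after scaling: Var(X0) = Var(PB) + (n1/n2)^2 * Var(PB2).
    """
    scale = n1 / n2
    m1, v1 = binomial_mean_var(n1, p1)
    m2, v2 = binomial_mean_var(n2, p2)
    return m1 - scale * m2, v1 + scale ** 2 * v2


# The made-up Carter numbers from the example above.
mean_x0, var_x0 = scaled_difference_mean_var(8000, 0.001, 2000, 0.002)
print(f"X0: mean = {mean_x0:.2f}, variance = {var_x0:.2f}")
```

Note how the scale factor of 4 gets squared, so the smaller sample contributes most of the variance.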


Request for statistical assistance (December 17, 2003)

Discussion Thread

Posted 1:12 p.m., December 22, 2003 (#23) - Arvin
  Ok, further thoughts...

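1) The binomial is a funny distribution. It's very similar to the normal distribution for .3<p<.7, but as you near the edges, it becomes increasingly skewed towards .5. With p as small as it is here, it's EXTREMELY skewed. What to do about it? Well, you can conclude that the normal approximation will do nothing for you. You can't use it.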

2) Back to the scaled difference of binomial R.V.'s:
PB-Carter = PBC ~ Bin(8104,.0009) (µ=7.3, σ² = 7.2)
PB-others = PBO ~ Bin(3598,.0022) (µ=7.8, σ² = 8.0)
δPB = PBC - (8104/3598)*PBO

Formula: Var(aX) = a^2*Var(X)
Thus,
Var((8104/3598)*PBO) = (8104/3598)^2*Var(PBO) = (8104/3598)^2*3598*(.0022)*(1-.0022)
Thus, the second term in the difference, (8104/3598)*PBO,
has (µ=17.8, σ² = 40.8)

δPB = PBC - (8104/3598)*PBO
= (µ=7.3, σ² = 7.2) - (µ=17.8, σ² = 40.8)
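
The arithmetic above can be checked with a short sketch. It plugs in the rounded p values exactly as quoted (.0009 and .0022) and assumes PBC and PBO are independent; the variable names are made up for illustration.

```python
def binomial_mean_var(n, p):
    """Mean and variance of a Binomial(n, p) count."""
    mean = n * p
    return mean, mean * (1.0 - p)


carter_pa, others_pa = 8104, 3598
scale = carter_pa / others_pa                      # 8104/3598

pbc_mean, pbc_var = binomial_mean_var(carter_pa, 0.0009)
pbo_mean, pbo_var = binomial_mean_var(others_pa, 0.0022)

# Var(aX) = a^2 * Var(X), and the mean scales by a.
scaled_mean = scale * pbo_mean
scaled_var = scale ** 2 * pbo_var

# Difference of independent variables: means subtract, variances add.
delta_mean = pbc_mean - scaled_mean
delta_var = pbc_var + scaled_var

print(f"PBC:          mean = {pbc_mean:.1f}, var = {pbc_var:.1f}")
print(f"scaled PBO:   mean = {scaled_mean:.1f}, var = {scaled_var:.1f}")
print(f"delta PB:     mean = {delta_mean:.1f}, var = {delta_var:.1f}")
```

With the rounded p = .0022 the scaled variance comes out near 40.1 rather than 40.8. The gap presumably comes from rounding in the quoted p values, but that is a guess.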

Alright, what then? Simulation results (n=10,000) give:

δPB = (µ=-10.5, σ² = 48.4)

For independent variables the variances add exactly (7.2 + 40.8 = 48.0), whatever the distributions; the simulated 48.4 is within sampling noise of that. Here's a histogram chart of the resultant R.V.:
center-of-bin   count
    -40.5           4
    -35.5          16
    -30.5          83
    -25.5         345
    -20.5         960
    -15.5        2146
    -10.5        2690
     -5.5        2423
     -0.5        1085
     +4.5         230
     +9.5          18
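
The simulation code wasn't posted, so here is a rough sketch of one way to get similar numbers. Assumptions: PBC and PBO are independent, the rounded p values are used as quoted, and the bins are 5 wide and centered on -0.5, +4.5, and so on (the exact bin edges are a guess). The sampler, seed, and function names are all hypothetical.

```python
import math
import random
from collections import Counter
from statistics import fmean, pvariance


def sample_binomial(n, p, rng):
    """Draw one Binomial(n, p) value by inverting the CDF (cheap when n*p is small)."""
    u = rng.random()
    k = 0
    pmf = (1.0 - p) ** n          # P(X = 0)
    cdf = pmf
    while u > cdf and k < n:
        # P(X = k+1) = P(X = k) * (n-k)/(k+1) * p/(1-p)
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)
        k += 1
        cdf += pmf
    return k


def simulate_delta_pb(n_sims=10_000, seed=2003):
    """Simulate delta PB = PBC - (8104/3598) * PBO with independent draws."""
    rng = random.Random(seed)
    scale = 8104 / 3598
    return [
        sample_binomial(8104, 0.0009, rng) - scale * sample_binomial(3598, 0.0022, rng)
        for _ in range(n_sims)
    ]


def histogram(values, width=5.0, offset=-0.5):
    """Count values in bins of the given width, centered at offset + k*width."""
    counts = Counter(
        offset + width * math.floor((v - offset) / width + 0.5) for v in values
    )
    return dict(sorted(counts.items()))


draws = simulate_delta_pb()
print(f"mean = {fmean(draws):.1f}, variance = {pvariance(draws):.1f}")
for center, count in histogram(draws).items():
    print(f"{center:+6.1f} {count:5d}")
```

Exact counts will differ from run to run and seed to seed; only the overall shape should match the table.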

Things I notice:
a) the variance from the small n sample dominates.
b) the variances are additive, as they must be for independent variables: 7.2 + 40.8 = 48.0 against the simulated 48.4, a gap that is within simulation noise. No fudge factor is needed in the variance calculation.
c) the distribution is skewed but not too crazily skewed.
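
Point c) can be checked without simulation. For independent variables the third central moments combine like the variances do, except that the scale factor is cubed and keeps its sign. A minimal sketch, assuming independence and the rounded p values quoted above (function names are hypothetical):

```python
def binomial_var_and_third_moment(n, p):
    """Variance and third central moment of a Binomial(n, p) count."""
    var = n * p * (1.0 - p)
    return var, var * (1.0 - 2.0 * p)


def skewness_of_scaled_difference(n1, p1, n2, p2):
    """Skewness of X - a*Y with a = n1/n2, X ~ Bin(n1, p1), Y ~ Bin(n2, p2), independent."""
    a = n1 / n2
    var_x, m3_x = binomial_var_and_third_moment(n1, p1)
    var_y, m3_y = binomial_var_and_third_moment(n2, p2)
    var = var_x + a ** 2 * var_y        # variances add
    m3 = m3_x - a ** 3 * m3_y           # third moments add, with the sign of (-a)^3
    return m3 / var ** 1.5


skew = skewness_of_scaled_difference(8104, 0.0009, 3598, 0.0022)
print(f"skewness of delta PB = {skew:.2f}")
```

The result is negative, meaning the long tail is on the low side, which matches the shape of the histogram.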

Next:
You're using ΔPB, which is δPB normalized to 162 Games Played.
Q) How do you do this normalization?
So far, we have δPB = -11 over 8104 PA. How is this normalized to Games Played?


Request for statistical assistance (December 17, 2003)

Discussion Thread

Posted 1:14 p.m., December 22, 2003 (#24) - Arvin
  First line edited:
1) The binomial is a funny distribution. It's very similar to the normal distribution for .3<p<.7, but as you near the edges, it becomes increasingly skewed towards .5.
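
One way to put a number on that skew is the standard skewness formula for a binomial, (1-2p)/sqrt(n*p*(1-p)). A small sketch; n=8104 is borrowed from the Carter numbers purely for illustration, and the function name is made up.

```python
import math


def binomial_skewness(n, p):
    """Skewness of Binomial(n, p): (1 - 2p) / sqrt(n * p * (1 - p))."""
    return (1.0 - 2.0 * p) / math.sqrt(n * p * (1.0 - p))


n = 8104  # illustrative sample size
for p in (0.5, 0.3, 0.1, 0.01, 0.0009):
    print(f"p = {p:<6} skewness = {binomial_skewness(n, p):.4f}")
```

For fixed n the skewness is zero at p = .5 and grows as p moves toward 0, with the tail pointing toward larger counts.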


Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.