Individual Poster Page

See copyright notice at the bottom of this page.

SABR 301 - DIPS Bands (July 15, 2003)

Discussion Thread

Posted 3:50 p.m., July 16, 2003 (#1) - JK
  It's been a long while since I've done any stats work, so bear with me.

Taking $H* = (Teammate $H/Individual $H), the idea is to show that there is a statistically significant correlation between $H* and pitcher quality, with pitcher quality represented by BIP. The problem is that a pitcher's BIP is a function of how much a manager plays the pitcher, which in turn is a function of perceived quality, which in turn is a function of the pitcher's observed $H* to that point. So, ceteris paribus, pitchers who have exhibited better past $H* will get more BIP opps; thus, selection bias.
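To make the selection-bias worry concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: $H is read as hits allowed per ball in play, every pitcher is given the identical true rate of .300, and the "manager" is a made-up rule that hands extra BIP only to pitchers whose trial results were better than the median. None of these numbers come from the article.

```python
import random
from statistics import mean

def simulate_selection(n_pitchers=2000, true_h=0.30, trial_bip=200,
                       extra_bip=800, seed=1):
    """Return (mean career $H of kept pitchers, mean career $H of dropped ones)."""
    rng = random.Random(seed)

    def hits_allowed(n_bip):
        # Every ball in play becomes a hit with the same probability for everyone.
        return sum(rng.random() < true_h for _ in range(n_bip))

    trial_hits = [hits_allowed(trial_bip) for _ in range(n_pitchers)]
    cutoff = sorted(trial_hits)[n_pitchers // 2]  # median trial result

    kept, dropped = [], []
    for h in trial_hits:
        if h < cutoff:
            # Looked good in the trial, so the (hypothetical) manager gives him more BIP.
            career_hits = h + hits_allowed(extra_bip)
            kept.append(career_hits / (trial_bip + extra_bip))
        else:
            # Looked bad in the trial, so his career ends at trial_bip.
            dropped.append(h / trial_bip)
    return mean(kept), mean(dropped)

kept_h, dropped_h = simulate_selection()
print(f"high-BIP pitchers: {kept_h:.4f}   low-BIP pitchers: {dropped_h:.4f}")
```

Under these assumptions the high-BIP group should show the lower career $H even though no pitcher has any real skill, which is the spurious BIP-to-$H* relationship a naive regression would pick up.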

IIRC the usual way to correct for selection bias is a two-step Heckman procedure, where you specify a "selection equation" (usually a probit model) and then use the residuals of the selection equation to generate a control factor for the selection bias (in the standard setup, the inverse Mills ratio). Then in the second step you just run an OLS regression on the substantive question that you are looking at (e.g. $H* = B1*BIP + B2*X2 + ... + Bn*Xn + Constant + e), also including the selection bias control factor derived in the first step as an independent variable.
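As a rough sketch of where the control factor comes from: in the textbook two-step, the first-step probit gives each selected observation a fitted index z, and the control factor is the inverse Mills ratio pdf(z)/cdf(z) of the standard normal. The function names below and the idea of building one regressor row per pitcher are hypothetical; fitting the probit and running the second-step OLS would still be left to a stats package.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1

def inverse_mills(z):
    """Inverse Mills ratio pdf(z)/cdf(z) for a fitted probit index z."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)

def second_step_row(bip, probit_index):
    """Regressors for one selected pitcher: constant, BIP, control factor.

    probit_index is assumed to come from an already-fitted selection equation.
    """
    return [1.0, bip, inverse_mills(probit_index)]

# Usage: the control factor shrinks as selection becomes more likely.
for z in (-1.0, 0.0, 1.0):
    print(z, round(inverse_mills(z), 4))
```

The coefficient on that third column in the second-step regression is what absorbs the selection effect; the standard errors from a plain OLS on these rows would still need the usual two-step adjustment.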

Unfortunately, I just don't know how to pull this off; I'm not even sure how to configure the selection model properly. I do know most of the good commercial stats packages out there will do most of the heavy lifting for you if you get the basic specification right. Hopefully someone who knows a lot more about stats will post a solution.


SABR 301 - DIPS Bands (July 15, 2003)

Discussion Thread

Posted 4:33 p.m., July 16, 2003 (#3) - JK
  A thought:
Create dummy variables for each BIP band. Then run the Heckman where the selection model is a probit model estimating the probability of belonging to a given BIP band given $H* and any other included independent variables. Then import the control factor based on those residuals into an OLS regression of $H* against the BIP category dummies plus the other included independent variables plus the control factor. The only thing is, I'm not quite sure how this is done for multiple mutually exclusive dummy variables.
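For the dummy-variable part only, a small sketch of the coding might look like this. The band edges are made up for illustration (the actual bands are whatever the article uses), and the lowest band is left out as the reference category so the dummies are not perfectly collinear with the constant. This does not answer the open question of how to set up the selection model for several exclusive categories.

```python
# Hypothetical lower edges of the second, third, and fourth BIP bands.
BAND_EDGES = [500, 1000, 2000]

def band_dummies(bip, edges=BAND_EDGES):
    """0/1 indicators for every band above the lowest; the lowest is the reference."""
    # Band index = how many edges this BIP total meets or exceeds (0 = lowest band).
    band = sum(bip >= edge for edge in edges)
    return [1 if band == k else 0 for k in range(1, len(edges) + 1)]

# Usage: at most one dummy is switched on for any pitcher.
for bip in (300, 750, 2500):
    print(bip, band_dummies(bip))
```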



Patriot: Baselines (September 17, 2003)

Discussion Thread

Posted 11:31 a.m., September 19, 2003 (#12) - JK
  I don't agree with the replacement paradox; or, more to the point, I accept its literal truth but reject its significance. The relevant question seems to me to be: given an established level of (empirical) performance, which player would a team prefer to take? All else equal, a team may prefer to take the .499 player with 500 PA over the .510 player with 10 PA because the former has proved to have significant value whereas the latter has proved very little. The same reasoning can be used to show that the true talent level of a .345 player in 500 PA is higher than that of a .355 player in 10 PA. But that does NOT mean a team would prefer to take the former player, because all he has done is prove he is worthless to the team, whereas with the latter there is some chance (even if small) that his true talent level is much higher.

I.e., if one takes the point of using a replacement baseline to be assessing the line at which a player ceases to have value to a team relative to available substitutes, as opposed to getting at absolute true talent levels, there is no paradox.


Copyright notice

Comments on this page were made by person(s) with the same handle, in various comments areas, following Tangotiger © material, on Baseball Primer. All content on this page remains the sole copyright of the author of those comments.

If you are the author, and you wish to have these comments removed from this site, please send me an email (tangotiger@yahoo.com), along with (1) the URL of this page, and (2) a statement that you are in fact the author of all comments on this page, and I will promptly remove them.