Tango on Baseball Archives

© Tangotiger


After Sabre-School Special (June 18, 2003)

Thanks to Fred, Jeff, and Michael for their great questions and commentary over the last 2 weeks (Jun 5 to Jun 18). Let's get into it with talks about BaseRuns, Linear Weights, DIPS, Park Factors, Play-by-play systems, Win Shares, and more.




BaseRuns and Linear Weights

Q: Ever since I've been reading Bill James' Abstracts, I thought Runs Created was by far the best method. You seem to be shooting it down by claiming BaseRuns is better. What is the basis for that, and how is BaseRuns calculated?


A: Please go to my site and read the 3-part series on how Runs Are Really Created. I don't think I can add anything more significant than what I have there.

Q: I thought in your articles you were stating that BaseRuns is better than Linear Weights as an offensive measure of a hitter's overall production?


A: BaseRuns is better for teams and pitchers. Linear Weights is better for individual hitters.

Q: If LWTS is the best way to compare individual hitters, which weighted values do you use as the ideal LWTS system? I've read numerous articles on this, and it seems the values vary depending on the run environment. Is there one overall weighted system you like to use?


A: The run impact of every event is dependent on the run environment. Therefore, if you are trying to figure out how much run impact a player had on HIS team, you need the team run LWTS values. If you are trying to figure out how much run impact a player has on an average team in a league/year, you need those values. If you want it historically, you need another set.

Q: How many runs would a team of these 9 players score for the year?


A: To do this, you do the following:

1 - give each of the 9 guys the same number of PAs

2 - Use BaseRuns on this team

That'll give you team runs scored.

Now, once you have that, maybe you want to know: "OK, who's responsible for how much of that?" You need LWTS. What LWTS formula? The best one to use would be the one that you would generate from BaseRuns. How do you generate one from BaseRuns? Well, you start off with your "team of 9" stats that you have. BsR will tell you how many runs that team will score. THIS IS YOUR BASELINE.

Then, add 1 H and 1 AB to that team. How many runs would this team score? What's the difference from your baseline? Congratulations, you just calculated the linear weight value of a single (around .47). Now, start with your baseline, and add 1 H, 1 AB, 1 2B. Do the same process for every hitting event, including the out! You have just generated custom LWTS values.
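The "plus one" process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not say which BaseRuns version is meant, so the sketch assumes the widely circulated basic form (A = H + BB - HR, B = (1.4*TB - .6*H - 3*HR + .1*BB)*1.02, C = AB - H, D = HR), and the team line is a made-up example, not real data.

```python
# Assumed basic BaseRuns form; swap in whichever version you prefer.
def base_runs(ab, h, d, t, hr, bb):
    tb = h + d + 2 * t + 3 * hr          # total bases
    a = h + bb - hr                      # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement
    c = ab - h                           # outs
    return a * b / (b + c) + hr          # runners who score, plus homers

# Hypothetical "team of 9" season line (d = doubles, t = triples).
BASELINE = dict(ab=5500, h=1450, d=280, t=30, hr=170, bb=550)

# What each event adds to the stat line.
EVENTS = {
    "1B": dict(ab=1, h=1),
    "2B": dict(ab=1, h=1, d=1),
    "3B": dict(ab=1, h=1, t=1),
    "HR": dict(ab=1, h=1, hr=1),
    "BB": dict(bb=1),
    "out": dict(ab=1),
}

def custom_lwts(team):
    """Run value of each event = BsR(team + 1 event) - BsR(team)."""
    base = base_runs(**team)
    weights = {}
    for name, delta in EVENTS.items():
        bumped = dict(team)
        for stat, amount in delta.items():
            bumped[stat] += amount
        weights[name] = base_runs(**bumped) - base
    return weights

if __name__ == "__main__":
    for name, value in custom_lwts(BASELINE).items():
        print(f"{name:>3}: {value:+.3f}")
```

With this made-up line the single comes out close to half a run and the out comes out negative. A different baseline gives different weights, which is the whole point of generating them from the team's own run environment.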


Data

Q: Where can I get...


A: You can check my recently updated site, where there's a bunch of stuff that may be useful. In there, I also have links to some stuff from MGL/UZR, such as


Complete UZR downloadable file


UZR 99-02, formatted


Data Interpretation

Q: In your opinion, what's the best statistical measure for a pitcher's performance and/or future success? How do you compare DIPSera to oppOPS?

A: You do it just like hitters. You break things down by component, and you find the aging pattern for each component. If you want to get fancy, you take pairs of components, and see if you can find a pattern for that. If you really want to get fancy, you take all possible combinations of components, and try to find the aging factors that way.

You might have missed my OPS Begone series at Primer. It's worth taking a look, as I really have no analytical use for OPS.


DIPS

Q: Can you get good, accurate DIPS calculations from just IP, BB, K, HR...


A: Yes, you can check out FIP, which I explain here
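For readers without the linked explanation at hand, here is a minimal sketch of the commonly published form of FIP, which needs only the four inputs in the question. The constant is league-dependent; the 3.2 default below is an assumed placeholder, not a value taken from this post.

```python
def fip(hr, bb, k, ip, constant=3.2):
    """Fielding Independent Pitching from HR, BB, K, and IP.

    `constant` scales the result to the league ERA; 3.2 is an
    assumed placeholder and should be set per league/year.
    """
    return (13 * hr + 3 * bb - 2 * k) / ip + constant

if __name__ == "__main__":
    # Hypothetical pitcher: 20 HR, 50 BB, 200 K in 200 IP.
    print(round(fip(20, 50, 200, 200), 2))
```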


Park Factors

Q: Do you agree with Bill James' park adjustment #'s and his weighting system of using the last 5 years as 1/8 of the # and the latest year as 1/2?


A: He's wrong. The park itself does not change year-to-year (mostly). The climate/wind may change.

Now, is the park configuration in 2002 more indicative of how they play in 2003, than say 1998-2001? Probably not. So, no reason to weight the older seasons less.

Now, is the climate of 2002 more indicative of the climate in 2003 than say 1998-2001? Probably not, but I don't know.

So, use as many years as possible (10, 20, 30, whatever!), as long as you are confident that the park configuration is relatively stable.

Also note that park factors are relative to other park factors. If you have a whole bunch of new parks, you may have the exact same park go from being a 103 park to a 98 park, simply because all the parks around the league changed.
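A toy illustration of that last point, using a deliberately simplified definition (park run level divided by league-average run level, times 100) and invented numbers. Real park factors are built from home/road splits; this sketch only shows how an unchanged park can move when the league around it changes.

```python
def park_factor(park_runs_per_game, league_runs_per_game):
    """Simplified park factor: the park's run level relative to the league, x100."""
    return round(100 * park_runs_per_game / league_runs_per_game)

if __name__ == "__main__":
    same_park = 4.80  # hypothetical run level; the park itself never changes
    print(park_factor(same_park, 4.66))  # league of mostly pitcher-friendly parks
    print(park_factor(same_park, 4.90))  # after several hitter-friendly parks open
```

Under these made-up inputs the same park rates 103 against the first league and 98 against the second.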


Play-by-Play (PBP)

Q: What do you think of fielding systems by...


A: Any non-PBP system is going to try to estimate what the PBP will tell you (how many lefty batters, how many GB pitchers, the effect of the park on LF fly balls, etc.), when the PBP will tell you exactly. So, why stick with non-PBP?

Q: What is A.S.S.?


A: Ray Kerby wrote a C program that parses through the Retrosheet event files to generate incredible things.


PythagoPat

Q: What formula do you use to calculate an expected won/loss record for a team?


A: Use pythagoPat, which is

W/L = (RS/RA)^x

where x=(RS+RA)^(0.29), with RS and RA in the exponent expressed as runs per game

Historically, this x value is about 1.8, but it depends on each team (and in fact each PITCHER).
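A minimal sketch of the formula in Python. It assumes the exponent is computed from runs per game (season totals divided by games played), and converts the W/L ratio into a winning percentage; the function and parameter names are illustrative only.

```python
def pythagopat_exponent(rs, ra, games):
    """x = (runs per game, both teams combined) ^ 0.29."""
    return ((rs + ra) / games) ** 0.29

def pythagopat_win_pct(rs, ra, games):
    """Expected winning percentage from season runs scored and allowed."""
    x = pythagopat_exponent(rs, ra, games)
    wl_ratio = (rs / ra) ** x          # W/L = (RS/RA)^x
    return wl_ratio / (1 + wl_ratio)   # W% = W / (W + L)

if __name__ == "__main__":
    # Hypothetical team: 800 RS, 700 RA over 162 games.
    pct = pythagopat_win_pct(800, 700, 162)
    print(f"x = {pythagopat_exponent(800, 700, 162):.2f}, "
          f"W% = {pct:.3f}, wins = {pct * 162:.1f}")
```

Because x moves with the run environment, a low-scoring team (or a single pitcher's starts) gets a smaller exponent than a high-scoring one.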


Win Shares

Q: The Win Expectancy article made clearer to me some of the more metaphysical points you were making with Rob Wood in your Win Shares Dialogue. It's strange but true that a "real" Win Share system would have different gross win and loss shares per game, depending on how close the game was and how long the game remained close.


A: Yes, the key is to create the best, most complete model, and THEN try to cut corners for simplicity, while staying true to the model.

Going from BaseRuns to Runs Created would break the model concept completely, which is why I turn my back on RC.

Same deal for how wins are contributed by players. Win Shares breaks the model for what we know about how players contribute wins.


--posted by TangoTiger at 11:03 AM EDT