Tango on Baseball Archives

© Tangotiger


SABR 301 - PZR - Blueprint (June 17, 2003)

This is the blueprint for evaluating pitchers (PZR) and fielders (UZR, though FZR is a better name), using Play-By-Play (pbp). If there are additional variables to consider, feel free to add them in. (This process will also have the benefit of validating the degree to which DIPS is true or false.)

The biggest issues are: reliability of data, and sample size to establish the necessary baselines with respect to the variables at play. (Some variables are more significant than others. The key is establishing the degree of significance before deciding whether to use the variables.)

- Tom Tippett (proprietary engine, proprietary process, public ratings) and
- MGL (proprietary engine, public process, public ratings) are years ahead of
- Bill James, who mentioned that he is looking at fielding using play-by-play (in a fully-proprietary sense). James has the time/resources to catch up to these guys.
- Aspiring sabermetricians are encouraged to delve into the play-by-play files, the fields of gold.
--posted by TangoTiger at 02:28 PM EDT


Posted 11:51 a.m., June 19, 2003 (#1) - Mike Emeigh
  Minor point:

We don't know - for sure - that knuckleballers are different. The knuckleballers in Voros's analysis have all been fly ball pitchers as well, and we do know that fly ball pitchers as a group have a lower $H than do ground ball pitchers. So what Voros has posited as a "knuckleballer's advantage" might be nothing more than a reflection of the known fly ball pitchers' advantage.

More comments on the thoughts expressed in the linked thread as/when I get time...

-- MWE

Posted 1:15 p.m., June 19, 2003 (#2) - tangotiger
  I agree that we don't know what the knuckle advantage is, especially since we've got such a small sample to deal with. It could even be "good knucklers versus bad knucklers" have a huge gap in $H, while the "good flyballers versus bad flyballers" have a smaller gap in $H, etc.

In all cases, we are always comparing an MLB pitcher relative to other MLB pitchers, who *may* have been specifically selected to play in the majors because they can keep the $H down. Lots of work ahead of us.

Posted 1:56 p.m., January 13, 2004 (#3) - tangotiger
  I'm bringing this thread forward in conjunction with the True Talent Fielding thread.

Posted 1:23 a.m., January 14, 2004 (#4) - MGL
  I read through some of the old stuff and I'm still lost. Here is one of your equations, Tango:

Anyway, DER = UZR+Park+PZR.

Maybe I do get it. Are you saying that UZR measures how much better or worse a fielder handles line drives to zone 7S or hard hit ground balls to zone 56M, etc., without regard to the distribution of those batted balls (e.g., we don't care if fielder A got 100 hard hit balls to zone 56M only and fielder B got 100 slow hit balls to zone 56M only - if they both fielded them at the league average, they would both have a UZR of zero), but PZR only considers the distribution of those different batted balls - i.e. PZR doesn't care which ones are actually turned into outs or not, only the league average out rate for each kind of ball in each zone (as well as the other parameters)? For example, if pitcher A had 100 hard hit balls in zone 56M only and pitcher B had 100 soft hit balls hit to zone 56M only, then pitcher B has a much better PZR? And we would calculate the exact PZR based on the league average out rates of hard hit balls and soft hit balls into zone 56M?

Aha, now I think I get it! I was thinking that PZR was like UZR in that it considered the actual out rate of each type of batted ball/zone/runners, outs, etc., and compared that to the league average rates, yielding the same result as a pitcher's collective fielders. Now I see what you are doing! Brilliant! Now I also see how park adjusted UZR + park adjusted PZR = DER.
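A minimal sketch of the split described above, in Python. The bucket names, the league out rates, and the sample of batted balls are all made up for illustration; park effects are ignored, and everything is counted in outs rather than runs.

```python
# Hypothetical league out rates per (zone, speed) bucket; not real data.
LEAGUE_OUT_RATE = {
    ("56M", "hard"): 0.60,
    ("56M", "soft"): 0.75,
}
# Hypothetical league out rate over all balls in play.
LEAGUE_OVERALL = 0.70


def split_outs_above_average(balls):
    """balls: list of (bucket, was_out) pairs for one pitcher.

    Returns (uzr_part, pzr_part), both in outs above average.
    uzr_part compares actual outs with the league rate for each bucket.
    pzr_part compares the league rate for each bucket with the overall rate.
    """
    uzr_part = sum(int(was_out) - LEAGUE_OUT_RATE[b] for b, was_out in balls)
    pzr_part = sum(LEAGUE_OUT_RATE[b] - LEAGUE_OVERALL for b, _ in balls)
    return uzr_part, pzr_part


# Ten soft grounders to zone 56M, eight of them turned into outs.
balls = [(("56M", "soft"), i < 8) for i in range(10)]
uzr_part, pzr_part = split_outs_above_average(balls)
total = sum(was_out for _, was_out in balls) - LEAGUE_OVERALL * len(balls)
print(f"UZR part {uzr_part:+.2f}, PZR part {pzr_part:+.2f}, total {total:+.2f}")
```

By construction the two parts add up to the outs above the overall league rate, which is the DER-style total before any park adjustment.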

Of course, once we figure PZR, we still want to know how much of PZR is luck and how much is skill. I have a feeling that you already calculated that ahead of actually doing the individual PZR's. That must be from the team PZR's that you estimated from the team DER's minus the team UZR's. Is that right? And you came up with the fact that the pitchers and fielders have about the same amount of responsibility? Is that right? And how much of each one's value (UZR or PZR) is skill and how much luck? I guess what that question always means is that for an infinite sample size, what is the regression? I think that is what that question means.

Hmmm... PZR. Brilliant! I know Tango is now thinking, "What took that idiot (boor) MGL so long to figure this out?"

Let me know if I have this right now, and I'll do some of the preliminary work.

I assume that the only things you can hold a pitcher responsible for, and want to include in PZR, are where the balls are hit, what type, and how hard. You can't hold him responsible for the other parameters, like baserunners (well, MAYBE baserunners), outs, and handedness of batters (other than how the pitcher's handedness affects the batters' handedness), so I assume that you would want to "include" some parameters in PZR, and adjust for, but not include, other parameters. In that way, it is a little different than doing the UZR calculations. Let me give an example of how I would calculate a PZR and how I would handle the parameters issue, which is different from how they are handled in UZR (for PZR some of the parameters establish the baseline, and some of them are used to "adjust" the baseline; for UZR all the parameters are used for one and not the other). Correct me if I'm wrong here...

pitcher A

100 hard hit balls to zone 56M all with one out and by RHB's in 50 innings. That is all of his batted balls.

League averages

All hard hit ground balls to zone 56M are caught 60% of time with 0 outs and RHB, 62% 0 outs and LHB, 64% 1 out and RHB, and 66% 1 out and LHB.

All soft hit ground balls to zone 56M are caught 70% of time with 0 outs and RHB, 72% 0 outs and LHB, 74% 1 out and RHB and 76% 1 out and LHB.

All ground balls are caught 70% of time with 0 outs and RHB, 65% 0 outs and LHB, 75% 1 out and RHB and 70% 1 out and LHB.

All GB's are caught 70% of the time regardless of outs or batter hand.

If we don't want to penalize (or reward - whatever the case may be) the pitchers for the outs and the batter handedness, then we calculate for pitcher A:

If a league average pitcher gives up 100 ground balls with 1 out and a RHB at the plate, 75% are caught (see above league averages). However, pitcher A's 100 ground balls were all hit hard and were all hit to zone 56M (with 1 out and a RHB). A league average pitcher who did that would have only 64% of those kinds of GB's caught (also, see league averages above). So our pitcher A allowed 11 fewer balls to be caught (regardless of how many were actually caught - that gets into the UZR realm), for a PZR of 11*.8 or 8.8 runs per 100 BIP or 50 innings, whichever we used as our "rate."

If we want to penalize (or reward) pitcher A for the fact that all his hits were by RHB and were with 1 out, then we would have to start with:

The league average conversion rate for ALL GB's, regardless of outs and batter handedness is 70%. That is the only thing that changes in our calculations. Now we take the difference between 70% and 64% for a PZR of 6*.8 or 4.8 runs (per 100 BIP or 50 innings).

Is that right? Should the pitcher be "penalized/rewarded" for any parameter other than speed, type, and location of batted ball? I don't think so. After all, we wouldn't think of doing that for park effects. We "adjust" for park effects. Why not "adjust" for outs/baserunners/handedness, and certainly batter G/F ratio (hmm.. do I adjust for batter G/F ratio in UZR? Probably not because there are so many batters it is not worth it - they probably average to near league average), as we should for park effects?
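The two baselines in the pitcher A example can be worked through in a short Python sketch. The out rates and the .8 multiplier come straight from the example above; the function and variable names are made up here, and the result is expressed as runs cost relative to the chosen baseline.

```python
RUNS_PER_OUT = 0.8  # the .8 multiplier used in the example above

# League out rates (percent) from the example, keyed by (outs, batter_hand).
HARD_56M = {(0, "R"): 60, (0, "L"): 62, (1, "R"): 64, (1, "L"): 66}
ALL_GB = {(0, "R"): 70, (0, "L"): 65, (1, "R"): 75, (1, "L"): 70}
ALL_GB_ANY_CONTEXT = 70  # all GB's, regardless of outs or batter hand


def runs_vs_baseline(baseline_pct, expected_pct, balls=100):
    """Runs cost relative to the baseline, per `balls` batted balls."""
    outs_lost = (baseline_pct - expected_pct) * balls / 100
    return outs_lost * RUNS_PER_OUT


context = (1, "R")  # pitcher A: every ball came with one out and a RHB
adjusted = runs_vs_baseline(ALL_GB[context], HARD_56M[context])
unadjusted = runs_vs_baseline(ALL_GB_ANY_CONTEXT, HARD_56M[context])
print(f"context-adjusted: {adjusted:.1f} runs, unadjusted: {unadjusted:.1f} runs")
# prints "context-adjusted: 8.8 runs, unadjusted: 4.8 runs"
```

The only thing that changes between the two numbers is which baseline the 64% is compared against: the one-out/RHB ground ball rate (75%) or the all-context ground ball rate (70%).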

Posted 9:47 a.m., January 14, 2004 (#5) - tangotiger
  I know Tango is now thinking, "What took that idiot (boor) MGL so long to figure this out?"

Not at all. I only criticize your memory!

I'll comment on the rest of your post in a while.

Posted 10:40 a.m., January 14, 2004 (#6) - tangotiger
  Actually, this PZR thing is getting more complicated than I thought. I'm going to need some time to try to sort things out.

(For those also trying to work on this, there are two layers to consider: things that the pitcher controls and those that he doesn't. We probably also want to consider HR and not just BIP. We don't want to consider the end-result of the play, so that we can ignore the fielder's impact.)

Posted 11:50 a.m., January 14, 2004 (#7) - J Cross
  I'm missing something. Why not think about it like this:

1) you know how many runs did score
2) you know how many runs the defense was worth compared to average (team UZR)
3) you know the park effects

#1-(#2+#3) = PZR

or you could replace #1 with how many runs SHOULD have scored based on the numbers of hits, walks, doubles, etc. Basically a peripheral ERA.

Also, I think you SHOULD hold pitchers responsible for the handedness of the batters they face. For instance, managers stack their lineups with righties when facing tough left-handed pitchers and those pitchers ARE responsible for the handedness of the lineups they face.

Posted 12:50 p.m., January 14, 2004 (#8) - tangotiger
  PZR should be calculated independently. But, as a test, they should all add up.

Team UZR does not necessarily apply equally to all the pitchers on the team (if for example you have a great CF and you have a pitcher that rarely allows a ball to that CF, then he won't benefit as much, etc, etc).

The point of PZR is that we should be able to get it independent of the fielders, while, as a test, the fielding + pitching + park should get you the total of defense on contacted balls or BIP (not sure which yet).

The handedness issue is an interesting thought. After all, RJ gets to face a disproportionate number of RH batters because he is a LHP.

Posted 1:24 p.m., January 14, 2004 (#9) - MGL
  I probably agree with J. Cross on the handedness issue for most pitchers. Kind of like my explanation on the QOC article for not adjusting a pitcher's stats for opponent handedness in the QOC adjustments (but should do it for batters). There are exceptions though, like for LOOGY's, as I explained in the article. And of course, it would be a little unfair (one way or the other - good or bad), if a pitcher with not that many PA's happened to have faced more than his share of RHB's or LHB's, for no particular reason. Our concern might be somewhat baseless, as I'm sure the "adjustments" one way or another don't amount to much.

What about baserunners/outs? Should pitchers be responsible for any weird runners/outs profiles they have that significantly affect their PZR?

I'm not even sure that we are going to gain with PZR's. I don't think it will help in pitcher evaluation or projection. After all, the regressions sort of take into consideration the inherent PZR's. Plus we can infer them quite accurately, by just "subtracting" their fielders' UZR from their stats. In fact, when I do my pitcher evaluations and projections, I do a QOF adjustment which uses team regressed UZR (each fielder's multi-year regressed UZR added together, pro-rated by the distribution of that pitcher's BIP's).

Even when we get PZR, they still have to be regressed to "remove" the luck element. It might be nice to quantify what DIPS tries to ignore, but what is the purpose? Tango originally said something about using PZR to validate DIPS. I'm still not sure what that means. Say that after you get the PZR, it turns out that it is exactly on the scale of UZR and that the regressions are about the same (as we think is true). What does that say about DIPS? Only that a pitcher's BABIP is x part defense, x part pitcher, and y part luck. I think it has already been proven that: one, the pitcher does have some pretty decent control over BIP, and that two, the luck element is at least as strong as the defense element, probably much stronger...
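One common way to "remove" the luck element, sketched here in Python, is to shrink an observed rate toward the league mean by n/(n+k). This is a generic shrinkage formula, not necessarily the regression anyone in this thread actually uses, and the constant k below is a placeholder rather than an estimate.

```python
def regress_to_mean(observed, league_mean, n, k):
    """Shrink an observed per-BIP rate toward the league mean.

    n is the sample size (balls in play); k is the sample size at which
    the observed rate gets 50% weight. k is a hypothetical constant here.
    """
    weight = n / (n + k)
    return league_mean + weight * (observed - league_mean)


# Hypothetical: +0.02 runs per BIP observed over 500 BIP, with k assumed 1500.
regressed = regress_to_mean(observed=0.02, league_mean=0.0, n=500, k=1500)
print(f"{regressed:.4f} runs per BIP after regression")
```

As n grows, the weight on the observed rate approaches 1, so the amount of regression shrinks toward zero for very large samples.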

Posted 1:36 p.m., January 14, 2004 (#10) - J Cross
  when I do my pitcher evaluations and projections

When DO you do this and do you make your projections public?

Posted 2:10 p.m., January 14, 2004 (#11) - tangotiger
  MGL, you can't just blanket use the team UZR on each pitcher. If you have 3 great fielding OF, and 4 poor fielding IF, and you have 1 GB pitcher and 1 FB pitcher, you can't give them the same UZR runs / BIP impact.

What if you have a great SS,3B,RF, but poor other fielders?

If you are going to go down the path of adjusting a pitcher's stats by taking the UZR of the fielders, then you'll have to do it one position at a time.

I really have no problem with this, and it should be done.
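A rough Python sketch of the position-by-position idea, using the great-outfield, poor-infield case described above. Every number below (the per-position UZR rates and both BIP distributions) is hypothetical, and pitcher and catcher are left out for simplicity.

```python
# Hypothetical regressed UZR rates, in runs saved per ball in play
# hit toward each position (positive = better than average).
FIELDER_UZR_PER_BIP = {
    "1B": -0.02, "2B": -0.03, "3B": -0.02, "SS": -0.03,  # poor infield
    "LF": 0.03, "CF": 0.04, "RF": 0.03,                  # great outfield
}


def fielding_support_per_bip(bip_share):
    """Weight each fielder's rate by the share of the pitcher's BIP he sees.

    bip_share maps position -> fraction of the pitcher's balls in play.
    """
    return sum(FIELDER_UZR_PER_BIP[pos] * share for pos, share in bip_share.items())


# Hypothetical BIP distributions for a ground-ball and a fly-ball pitcher.
gb_pitcher = {"1B": 0.10, "2B": 0.20, "3B": 0.15, "SS": 0.25,
              "LF": 0.10, "CF": 0.10, "RF": 0.10}
fb_pitcher = {"1B": 0.05, "2B": 0.10, "3B": 0.10, "SS": 0.15,
              "LF": 0.20, "CF": 0.20, "RF": 0.20}

gb_support = fielding_support_per_bip(gb_pitcher)
fb_support = fielding_support_per_bip(fb_pitcher)
print(f"GB pitcher: {gb_support:+.4f} runs/BIP")
print(f"FB pitcher: {fb_support:+.4f} runs/BIP")
```

With the same seven fielders behind them, the ground-ball pitcher ends up with below-average support and the fly-ball pitcher with above-average support, which is why a single team UZR figure cannot be applied to both.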

I'm (trying to) offer a way to do PZR without needing to know about UZR. But, by calculating PZR, UZR and parks, you'll end up with a team's DER.