Some thoughts on Pitch Count Estimators, following the tremendous work by Steve Treder and Walt Davis
The Basic Pitch Count Estimator (bPCE) that I have is by no means "mine". I believe that the STATS Scoreboard from a few years ago published the number of pitches each event (H, HR, BB, etc.) takes. Patriot at Fanhome published those figures, and then MGL verified them with more recent data. I just noticed that all the events, except BB and K, took around 3.3 pitches. So, I just used 3.3, 4.8, 5.5 as my numbers. When I run it against actual data, I fudge the numbers slightly to calibrate against the league average. I believe 2003 had 3.75 pitches per PA.
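To make the bPCE arithmetic concrete, here is a minimal Python sketch. It assumes that 3.3 applies to every PA that is not a K or BB, 4.8 to a K, and 5.5 to a BB; that assignment is my reading of the numbers above, not something spelled out there. The function names, and the idea of expressing the "fudge" as a single league-wide multiplier, are illustrative assumptions as well.

```python
def basic_pce(pa, so, bb, per_other=3.3, per_so=4.8, per_bb=5.5):
    """Estimate pitches thrown from plate appearances, strikeouts, and walks.

    Assumption: 3.3 pitches for every PA that is neither a K nor a BB,
    4.8 per K, and 5.5 per BB.
    """
    other = pa - so - bb  # every PA that ended in something other than K or BB
    return per_other * other + per_so * so + per_bb * bb


def calibration_factor(league_pa, league_so, league_bb, league_pitches_per_pa=3.75):
    """Multiplier that makes the league-wide estimate match the league average.

    Assumption: the calibration is one scalar applied to the raw estimate.
    The 3.75 default is the 2003 figure recalled in the text.
    """
    raw = basic_pce(league_pa, league_so, league_bb)
    return league_pitches_per_pa * league_pa / raw


if __name__ == "__main__":
    # Hypothetical league and pitcher totals, for illustration only.
    factor = calibration_factor(league_pa=1000, league_so=170, league_bb=90)
    estimate = factor * basic_pce(pa=900, so=250, bb=60)
    print(round(estimate))
```

Scaling the whole estimate by one factor is the simplest way to do the calibration; nudging the three coefficients individually would work too, but needs more than one league-level number to pin down.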
The Extended Pitch Count Estimator (xPCE) is the result of work I've been doing on counts, unrelated to the number of pitches. However, I realized that my theoretical work should be able to give me the expected number of pitches. Not having, at the time, the number of pitches in electronic form, I selected two extreme types of pitcher (Randy Johnson and Brad Radke) to fit against my model. Using the league average for calibration, I came up with the xPCE.
Since Baseball Prospectus (Keith Woolner based on the handiwork) has been so kind as to compile the actual pitch counts for all pitchers over the last 15 years, I really should tweak my model to conform to reality. However, Walt Davis' analysis shows that my current xPCE theoretical model already conforms at the high .9x level. That is pretty darn good.
Since BP has published the actual data, and I've published my equation, I will leave it up to other sabermetricians to try to tweak my equation. As Michael Lewis said of Bill James: "He prefers to leave an honest mess, rather than a tidy lie". I'm no Bill James, but I like Michael's point.