Tango on Baseball Archives

© Tangotiger

Futility Infielder - 2003 DIPS (January 27, 2004)

Jay Jaffe does all the hard work, and creates a great list of links, for your 2003 DIPS.
--posted by TangoTiger at 09:49 AM EDT


Posted 10:54 a.m., January 27, 2004 (#1) - Chris from Chelsea
  Wow, the Mets were not good. Doesn't exactly get you excited about the prospects of our pitching in 2004.

Posted 10:59 a.m., January 27, 2004 (#2) - Stephen
  That HRPF for Montreal doesn't look right.

Posted 11:01 a.m., January 27, 2004 (#3) - Mike
  To the extent that Voros' work is accepted, how valid is the following formula? If a pitcher's success/failure is based largely on luck, it seems to me the best pitchers are those who control the aspects of the game they can control (Ks, BBs, HRs, HBPs). Therefore, does the following formula make sense?
$K-($BB+$HR+$HBP)=dSCORE

It seems to me it's similar to OPS in that it conflates important pitching statistics into a number that is not technically an average, but is a quick and easy way to measure success. Notice that the top 10 in dSCORE from last year is very similar to the top 10 in dERA.

dSCORE (100+ dIP)

Schilling - .207
Prior - .195
Martinez - .182
Schmidt - .166
Vazquez - .164
Mota - .163
Santana - .152
Johnson - .151
Mussina - .150
Beckett - .140


I've not run these numbers on prior years. Is it worth pursuing any further?
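
For readers who want to try the dSCORE idea on other seasons, here is a minimal Python sketch. It assumes every rate ($K, $BB, $HR, $HBP) is taken per batter faced, which is a simplification of how DIPS defines its rates, and the sample line is hypothetical rather than a real pitcher.

```python
def dscore(k, bb, hr, hbp, bfp):
    """dSCORE = $K - ($BB + $HR + $HBP), with each rate per batter faced.

    Using batters faced (bfp) as the common denominator is an assumption
    made for this sketch; DIPS itself uses more specific denominators.
    """
    if bfp <= 0:
        raise ValueError("bfp must be positive")
    return (k - (bb + hr + hbp)) / bfp


# Hypothetical season line, for illustration only.
example = dscore(k=200, bb=40, hr=15, hbp=5, bfp=800)
print(f"dSCORE = {example:.3f}")
```

Because all four rates share one denominator here, the formula reduces to a single subtraction on counting stats, which makes it easy to run over a whole league.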

Posted 11:10 a.m., January 27, 2004 (#4) - jss
  Just a quick look at Montreal team totals at ESPN shows 120 HRs in away games, and 190 in home games. Home include the PR games.

Posted 11:25 a.m., January 27, 2004 (#5) - tangotiger (homepage)
  Mike,

I suggest you use FIP:
(13*HR+3*BB-2*SO)/IP

You may find the above link somewhat informative. You can also check out baseballgraphs.com

Tom

Posted 11:25 a.m., January 27, 2004 (#6) - Pfizer
  Wow, three Jay hurlers with dERA significantly lower than actual ERA. All three no longer with the club. Doesn't that indicate there's a good probability for improvement?

Maybe I'm just the only guy in the world who wanted JP to bring Lidle back.

Posted 11:56 a.m., January 27, 2004 (#7) - Andrew Edwards
  Wow, three Jay hurlers with dERA significantly lower than actual ERA. All three no longer with the club. Doesn't that indicate there's a good probability for improvement?

And they signed Pat Hentgen, who was one of the most 'lucky' by DIPS. Does Keith Law not believe in DIPS anymore? Or did he get overruled?

Also, what the hell is wrong with Glendon Rusch? Is he just the least lucky pitcher in human history, or what?

Posted 12:15 p.m., January 27, 2004 (#8) - Jay Jaffe (homepage)
  Hi guys,

I just wanted to note that since this article's posting here I've added a significant chunk which I think will interest you, so you may want to re-read it (I had planned this as a follow-up but was able to put it together more easily than I'd suspected, so I've squeezed it in). For DIPS 1.x, Voros published season-to-season correlation data between various components ($SO, $BB, $HR, $H) and for dERA & ERA, but to the best of my knowledge did not do the same for DIPS 2.0. I have done so for the two seasons of data that I hold, and the results are pretty consistent with his findings. The formatting won't hold here, but if you want to skim it, it's here:

category            DIPS 1.x   DIPS 2.0
Years               98-99      02-03
Baseline IP         162        162
Number of P         60         56
$BB                 .681       .673
$SO                 .792       .801
$HR                 .505       .372
$H                  .153       .106

Years               93-99      02-03
Baseline IP         100        100
Number of P         503        96
ERA to ERA          .407       .378
ERA to dERA         .521       .524

Posted 12:19 p.m., January 27, 2004 (#9) - Jay Jaffe (homepage)
  Oops, those last two lines should read:
ERA to next ERA/.407/.378
dERA to next ERA/.521/.524
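
For anyone wanting to build this kind of table themselves, here is a minimal Python sketch of a year-to-year correlation with an innings baseline applied to both seasons. The row layout (`ip1`, `ip2`, `dera1`, `era2`) and the sample numbers are made up for illustration; they are not Jay's spreadsheet or his data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def year_to_year_r(rows, stat1, stat2, min_ip):
    """Correlate stat1 (season 1) with stat2 (season 2) for pitchers
    who met or exceeded min_ip in BOTH seasons."""
    kept = [r for r in rows if r["ip1"] >= min_ip and r["ip2"] >= min_ip]
    r = pearson_r([p[stat1] for p in kept], [p[stat2] for p in kept])
    return r, len(kept)


# Hypothetical pitchers, for illustration only.
rows = [
    {"ip1": 200, "ip2": 180, "dera1": 3.50, "era2": 3.60},
    {"ip1": 170, "ip2": 165, "dera1": 4.20, "era2": 4.00},
    {"ip1": 120, "ip2": 90,  "dera1": 4.60, "era2": 5.90},  # misses the cut
    {"ip1": 165, "ip2": 210, "dera1": 4.80, "era2": 5.10},
]
r, n = year_to_year_r(rows, "dera1", "era2", min_ip=162)
print(f"dERA to next ERA: r = {r:.3f} on {n} pitchers")
```

Swapping `stat1` for a season-1 ERA column gives the "ERA to next ERA" row on the same set of pitchers.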

Posted 1:16 p.m., January 27, 2004 (#10) - tj
  And they signed Pat Hentgen, who was one of the most 'lucky' by DIPS.

Being lucky by DIPS doesn't necessarily mean that a pitcher isn't good. If a guy hits 30 homers a year for five years, and suddenly hits 50, does that mean he's not good because he fluked into 20 extra homers? Of course not. You'll still take the 30. Given that Hentgen was recovering from tommy john surgery, he seems like a good bet to improve on his 2003 dERA.

Posted 1:42 p.m., January 27, 2004 (#11) - Kyle S
  I'd say the converse to #10 is also valid: just because a pitcher is unlucky by DIPS doesn't mean he isn't awful. For example, there might be a reason that Rusch allowed a .380 avg on BIP: maybe his terrible movement makes it easy to hit line drives, which turn into hits more often, or he rarely induces popouts. I don't think you can explain away his absolutely atrocious numbers last year with "bad defense."

Posted 1:57 p.m., January 27, 2004 (#12) - Andrew Edwards
  tj:

I actually agree, I think Hentgen will be fine this year. But combined with the other Jays moves, it adds up to an ostensibly anti-DIPS strategy, which is strange from this group.

Posted 2:29 p.m., January 27, 2004 (#13) - tj
  I don't get what you're saying, Andrew, but maybe it's because I'm not a regular here. Batista and Lilly both scored pretty well in Jaffe's numbers. Batista was in the top 20 in the majors and he's a groundballer. Lilly's got good strikeout rates in his history. That all seems, well maybe not pro-DIPS, but consistent with DIPS. Am I missing something? I'm not being sarcastic, I just don't see the criticism.

Posted 2:41 p.m., January 27, 2004 (#14) - Andrew Edwards
  tj:

It's not even a criticism, really. It's just an observation that they let go of some guys who DIPS says should be a bargain, and signed a guy who DIPS says is overvalued. Each individual deal was fine, but when you step back and look at the aggregate, it looks inconsistent with a lot of Jays philosophy.

So I'm wondering if they've got something going on that I can't pick up - did Keith Law re-assess DIPS in some way we don't know about? Or were these just several tactical deals, into which I shouldn't read anything about strategy?

Posted 2:54 p.m., January 27, 2004 (#15) - studes (homepage)
  Very nice job by Jay, especially in the presentation of information.

Not to be a nattering nabob of negativism, but I do think that FIP is just as powerful, and a whole lot less work.

Posted 3:02 p.m., January 27, 2004 (#16) - tangotiger
  Yes, I think a simple:

dipsERA = FIP + 3.2
(or whatever the 3.2 needs to be to best-fit to the 2003 data)

would be a good idea, to go along side Jay's data.

Posted 3:27 p.m., January 27, 2004 (#17) - Jay Jaffe (homepage)
  Andrew,

I think it's far more likely that even a saber-savvy front office such as Toronto's uses more information than just one year of DIPS 2.0 to make their assessments. To assume otherwise is to oversimplify the matter greatly.

They may have a DIPS 3.0, which leaves DIPS 2.0 in the dust but which they own entirely and have no obligation to share with us.

They may think DIPS is hogwash -- even among statheads, not everyone subscribes to the theory, after all -- and have some other means of projecting pitchers based on several years of statistics, weighted in some fashion.

They may have a proprietary system which spits out PECOTA-like projections that take into account physical characteristics as well (FWIW, PECOTA doesn't like Lilly or Hentgen much; weighted mean forecasts for '04 are 5.07 and 5.06 ERA respectively, Batista 4.19).

They may have an in-house stathead jumping up and down yelling, "Sign these guys!" who is overruled by a GM who has financial realities and a broader long-term picture of the organization in mind.

They may have scouts who say that Hentgen's picked up 10 MPH on his heater in the past six months and a pitching coach who knows Ted Lilly's best friend and so thinks he can connect with the enigmatic lefty.

Even those possibilities are oversimplifications. And it just may be that evaluating a front office based on three transactions is an example of letting a small sample size influence your conclusions.

For what it's worth, I'd say Batista was a very good signing because his performance has been steadily improving, Lilly a decent one because he's looked like he's very close to putting it all together at times (especially the end of last year), and Hentgen a kinda lousy one because he has very little upside other than munching about 150 innings at league average.

Posted 3:40 p.m., January 27, 2004 (#18) - Rob(e-mail)
  Anybody notice Rusch's season splits?
Pre All-Star
IP 84.3
H 129
HR 9
BB 36
SO 60
BA .355
Post All-Star
IP 39.0
H 42
HR 2
BB 9
SO 33
BA .273
I guess the bullpen did him wonders.

Posted 3:43 p.m., January 27, 2004 (#19) - tangotiger
  Jay, if you want to include FIP, you can do the following:

dipsERA = (13*HR + 3*(BB-IBB+HBP) - 2*SO)/IP + 3.12

That's for 2003. The constant for the last few years is:
2003: 3.12
2002: 3.06
2001: 3.15
2000: 3.22
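
As a worked example, here is a short Python sketch of the formula and constants above, applied to the Rusch splits Rob posted in #18. Two assumptions are mine: the posted "84.3" innings is read as 84 1/3, and IBB and HBP are set to zero because the splits don't list them.

```python
# Yearly constants as posted above.
FIP_CONSTANTS = {2003: 3.12, 2002: 3.06, 2001: 3.15, 2000: 3.22}


def dips_era(hr, bb, so, ip, year, ibb=0, hbp=0):
    """dipsERA = (13*HR + 3*(BB-IBB+HBP) - 2*SO)/IP + yearly constant."""
    if ip <= 0:
        raise ValueError("ip must be positive")
    fip = (13 * hr + 3 * (bb - ibb + hbp) - 2 * so) / ip
    return fip + FIP_CONSTANTS[year]


# Rusch 2003 splits from post #18; IBB/HBP unknown, so left at 0.
pre = dips_era(hr=9, bb=36, so=60, ip=84 + 1 / 3, year=2003)
post = dips_era(hr=2, bb=9, so=33, ip=39.0, year=2003)
print(f"pre All-Star:  {pre:.2f}")
print(f"post All-Star: {post:.2f}")
```

With those inputs the first half comes out around 4.37 and the second half around 2.79, though the missing IBB and HBP counts mean both figures are approximate.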

Posted 4:08 p.m., January 27, 2004 (#20) - Jay Jaffe (homepage)
  tango & studes -- I revised my page to include two links to FIP, one at each of your sites. If and when I get a chance, I'll insert a paragraph incorporating a brief description and the data above. Thanks...

Posted 4:38 p.m., January 27, 2004 (#21) - Jagermeister
  Cool stuff, Mr. Jaffe.

I’m a bit confused about the home run factors, and I’m far too dunderheaded to wade through the explanation. Can it really be that Dodger Stadium and Coors Field rank so close together in their effects on home runs? Can someone explain those to me in layman's terms?

I’m also even more worried about Ryan Franklin, what with his DIPS “luck”, and having one less Mike Cameron in center field.

Posted 9:44 p.m., January 27, 2004 (#22) - impatton
  Park Home Run Factors for 2003
NL PHRF
COL 0.88775
FLA 1.12187
LOS 0.87140
SFO 1.10124

How can it be possible? Did park factors change drastically in 2003? Or am I missing something? Or is it just a typo or something like that?

Posted 10:07 p.m., January 27, 2004 (#23) - Larry Mahnken(e-mail) (homepage)
  impatton-

These Park Factors are the opposite of the type you're used to: you multiply the number of HRs the pitcher gave up by the PF to get the adjusted number, rather than divide.

So, .88775 HRPF for Colorado means that Coors increased HRs by 11.225%.
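
To make the multiply-versus-divide point concrete, here is a small Python sketch. It assumes the factor is applied to a pitcher's whole HR total, which is my reading of the explanation above and not a statement of the DIPS 2.0 spreadsheet's internals; the 20-HR pitcher is hypothetical.

```python
def adjust_hr(hr_allowed, hrpf):
    """Park-adjust HR by multiplying, as these factors expect."""
    return hr_allowed * hrpf


def implied_inflation(hrpf):
    """How much the raw total exceeds the adjusted total, as a fraction.

    If adjusted = raw * hrpf, then raw = adjusted / hrpf.
    """
    if hrpf <= 0:
        raise ValueError("hrpf must be positive")
    return 1.0 / hrpf - 1.0


COL_HRPF = 0.88775  # from the list in #22

adjusted = adjust_hr(20, COL_HRPF)       # hypothetical 20 HR allowed
inflation = implied_inflation(COL_HRPF)
print(f"adjusted HR: {adjusted:.2f}")
print(f"raw total exceeds adjusted total by {inflation:.1%}")
```

Note the asymmetry: multiplying by .88775 removes 11.225% of the raw total, while the raw total is about 12.6% larger than the adjusted one. Which of the two figures you quote depends on which total you treat as the base.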

Posted 10:28 p.m., January 27, 2004 (#24) - Jay Jaffe (homepage)
  Re the Dodgers:

140 HR were hit by the Dodgers and their opponents in LA.

111 HR were hit by the Dodgers and their opponents away from LA, by far the lowest total in baseball (Montreal at 128 was next).

That latter figure has as much to do with the Dodgers' pathetic offense as their spectacular pitching, both of which were extreme even away from the distortion field of Chavez Ravine. No team came close to hitting fewer road HR than their 56. No team came close to allowing fewer road HR than their 55.

My hunch is that it's a single-season anomaly related to the team's personnel rather than their ballpark -- Dodger Stadium has been between .98 and 1.04 for HR from 1999-2002, and I don't recall anybody blaming "the wind blowing out" for extraordinarily high run totals in those 2-1 slugfests in LA.

Weather or not (hehehe), this imbalance shows up as Park HR factor, and the Dodger pitchers' numbers are adjusted accordingly.

Posted 11:54 p.m., January 27, 2004 (#25) - studes (homepage)
  One-year park factors are my newest pet peeve. There is no good reason to use them, in this case or any other, if you have long-term park factors. It is just as easy to have a fluke park factor year as it is to have Brady Anderson hit 50 whatever home runs in a year. Okay, maybe not that easy, but still easy. Why apply a fluke park factor to get at the "truth"?

I'm not jumping on you, Jay, cause you're just following the methodology. But one-year park factors are really not that much better than no park factors at all.

Posted 12:07 a.m., January 28, 2004 (#26) - tangotiger
  The only reason to use 1 year PF is if you think that:
1 - The climate was very nonrandom that year compared to other years
2 - The park dimensions changed enough that you are playing in practically a new park

I would cut my losses, and assume for an outdoor park that it's 50% 1 year and 50% 100 years (or however long you can go not to conflict with #2 above).

(Reminder with multi-year PF: the PF may be set to 1.0 for the league for 2003, but it won't be 1.0 for the league in any other year where the parks are not the same.)
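
The 50/50 suggestion amounts to a weighted average, sketched here in Python. The function name and the sample factors are hypothetical; only the equal weighting comes from the post.

```python
def blended_park_factor(one_year_pf, long_term_pf, one_year_weight=0.5):
    """Weighted average of a one-year and a long-term park factor."""
    if not 0.0 <= one_year_weight <= 1.0:
        raise ValueError("one_year_weight must be between 0 and 1")
    return (one_year_weight * one_year_pf
            + (1.0 - one_year_weight) * long_term_pf)


# Hypothetical park: a fluky 1.26 one-year factor, 1.00 over the long run.
blend = blended_park_factor(1.26, 1.00)
print(f"blended PF: {blend:.2f}")
```

Setting `one_year_weight` to 1.0 reproduces the pure one-year factor, and 0.0 the pure long-term factor, so the same function covers both sides of the argument.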

Posted 12:16 p.m., January 28, 2004 (#27) - Mike Green
  Andrew, Mike Emeigh has presented very persuasive evidence that Glendon Rusch consistently gives up a much higher rate of line drives than the norm. In other words, for Rusch, as well as for the knuckleballers, there is good reason to doubt the significance of DIPS as a projection tool.

Tom Tippett makes a good case that there are significant methodological problems with McCracken's DIPS research, and that indeed there is more to pitching than K, W and HR allowed rates.

Posted 1:57 p.m., January 28, 2004 (#28) - Charles Saeger(e-mail)
  140 HR were hit by the Dodgers and their opponents in LA.

111 HR were hit by the Dodgers and their opponents away from LA, by far the lowest total in baseball (Montreal at 128 was next).

Chavez Ravine has traditionally not been a bad home run park, typically average, but this is indeed out-of-line. (It cuts hitting greatly, but that's because it destroys batting average.)

Posted 2:02 p.m., January 28, 2004 (#29) - ntr Voros
  studes-

I found no park factor to be better than 1-year park factors.

I also found 1-year park factors to be better than 3-year park factors.

I am the God of Counter-Intuitiveness.

Posted 2:05 p.m., January 28, 2004 (#30) - Charles Saeger(e-mail)
  Tom Tippett makes a good case that there are significant methodological problems with McCracken's DIPS research, and that indeed there is more to pitching than K, W and HR allowed rates.

Tippett never said there are "significant methodological problems" with DIPS. He had critiques, yes, but has never said that the research has significant problems -- in fact, he has gone out of his way to say the opposite after his critique.

Tippett's article critiquing DIPS itself contained a pair of flaws:

* Even though there is variation in $H, how much of this is explained by a standard distribution? (Tangotiger has shown that the distribution of $H variance is not standard, but while there is some ability-explained variance, the distribution is fairly close to standard.) We'd expect some pitchers to be better or worse than average, even over the course of a career, just by the luck of the draw.

* While there may be (and in fact, there are) significant differences between pitchers in $H, the effects of those differences pale compared to the effects of those pitchers' different HR, BB and SO rates.

Posted 2:21 p.m., January 28, 2004 (#31) - tangotiger
  I agree with Charlie.

Our best guess as to the spread in pitchers' ability on BIP is 1 SD = .009 hits / BIP.

So, given 700 BIP, we expect 95% of our pitchers to have a true talent rate within +/- 10 runs of average.

I don't know what the +/- is on the 250 non-BIP, but I think it would be higher. (It's certainly higher on a per-play basis)
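
The arithmetic behind the +/- 10 runs can be sketched as follows. The 95% band is taken as two standard deviations, and the run value of turning an out into a hit (about 0.8 runs) is an assumed figure that the post does not state.

```python
SD_HITS_PER_BIP = 0.009      # from the post
RUNS_PER_HIT_VS_OUT = 0.8    # assumption, not stated in the post


def talent_spread(bip, n_sd=2.0):
    """Return (hits, runs) spanned by n_sd standard deviations of
    BIP talent over the given number of balls in play."""
    if bip < 0:
        raise ValueError("bip must be non-negative")
    hits = n_sd * SD_HITS_PER_BIP * bip
    return hits, hits * RUNS_PER_HIT_VS_OUT


hits, runs = talent_spread(700)
print(f"+/- {hits:.1f} hits, or about +/- {runs:.0f} runs")
```

Two SDs over 700 BIP is 12.6 hits, which lands near 10 runs under the assumed run value.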

Posted 8:19 p.m., January 28, 2004 (#32) - RossCW
  The ones between ERA and the next season's ERA are lower in my batch than they are in McCracken's, and strangely seem to get even lower when the bar is raised from 100 to 162 innings. It's a strange anomaly, but alas, I don't have similar data from McCracken to compare.

I'm curious - who here actually duplicated McCracken's work and produced identical results?

Posted 12:43 a.m., January 29, 2004 (#33) - Randall
  Ross, you ARE the biggest idiot ever.

Posted 11:03 a.m., January 30, 2004 (#34) - Craig B
  RossCW, I'm not sure what "work" you're referring to. The work does not need to be duplicated. The results should be duplicated.

If you would RTFA, you'd see that Jay Jaffe actually does replicate one of Voros's most important results. I'll quote Jay's piece for you.


I will point out one more thing. In McCracken's original work (DIPS 1.x), he published data showing how various rates (strikeout, walk, HR, BABIP) correlated from one year to the next, and he also did so showing how the dERA correlated better with the following season's ERA than the previous season's ERA did. To the best of my knowledge, he did not publish the same correlation data for DIPS 2.0. But I have taken the two years of DIPS 2.0 that I have produced and found correlations that are fairly consistent with his findings:

                    DIPS 1.x   DIPS 2.0   DIPS 2.0
Years               98-99      02-03      02-03
Baseline IP         162        162        100
Number of P         60         56         96
$BB                 .681       .673       .733
$SO                 .792       .801       .824
$HR                 .505       .372       .272
$H                  .153       .106       .132

Years               93-99      02-03      02-03
Baseline IP         100        162        100
Number of P         503        56         96
ERA to next ERA     .407       .288       .378
dERA to next ERA    .521       .513       .524

The baseline IP is the number of innings pitched in both seasons a pitcher needed to qualify for the study. McCracken used separate baselines for the two comparisons, but since I had data for both 100- and 162-inning levels, I'm running it here.

The numbers in the last two lines are the most important single result in sabermetrics over the last five years.

Posted 12:01 p.m., January 30, 2004 (#35) - Mike Green
  Tango, if you run the projection study with the Monkey, the experts and the Primer readers again this year, it would be fun if you ran Monkey1 for the pitchers being based on age-adjusted ERA and Monkey2 being age-adjusted dERA.

Posted 4:18 p.m., January 30, 2004 (#36) - mommy
  Re Glendon Rusch: i've not seen him pitch much, if ever. #27 wrote that Mike Emeigh has found he gives up a ton of linedrives. i think people often overlook the fact that DIPS applies to major league quality pitchers. you and i could not step onto a mound and throw our half-assed pitches and have the ball successfully fielded 70% of the time. my understanding of DIPS is that the pitchers who have enough skill to prevent linedrives on every pitch will reach the majors, and then it is their differences in the K, BB, HR which will separate them. the pitchers who do consistently give up a higher rate of hits on BIP will be sent back to the minors. thus they do not accumulate enough innings to be noticed in any study of the issue. i do not think voros' theory states pitchers have little or no control over BIP. i think it is more accurate to say _major league caliber_ pitchers show little variation in their ability to prevent BIP.

i know i'm not breaking new ground here, just making it more explicit because it seems sometimes people oversimplify DIPS or give it too much power.

anyway, while rusch has lasted longer than a few innings, perhaps he is just at the very bottom of pitchers who are able to (sort of) survive in the majors.

Posted 1:27 p.m., January 31, 2004 (#37) - RossCW
  If you would RTFA, you'd see that Jay Jaffe actually does replicate one of Voros's most important results.

Well no - he didn't. He did similar calculations for last year and compared them to Voros's results. The first step of any "peer review" process is to duplicate the exact methodology as described by the original research. People make mistakes. And the acceptance of un-duplicated research as the basis for comparison is a problem.

ERA to next ERA .407 .288 .378
dERA to next ERA .521 .513 .524

The baseline IP is the number of innings pitched in both seasons a pitcher needed to qualify for the study. McCracken used separate baselines for the two comparisons, but since I had data for both 100- and 162-inning levels, I'm running it here.

The numbers in the last two lines are the most important single result in sabermetrics over the last five years.

I won't argue with that for the reason I stated above. The problem is that it takes a sample based on pitchers who have pitched over 100 IP (or whatever baseline) two years in a row and treats them as if it is a random sample. They are comparing pitchers who managers have elected to pitch a certain amount two years in a row. And ignoring what happens to pitchers who don't meet the standard the second year.

There is a similar problem with evaluating any predictive system in baseball. How do you treat the people for whom you have little or no data because of playing time?

Posted 1:52 p.m., January 31, 2004 (#38) - RossCW
  I would add it's important to remember that DIPS ERA is a modification of Bill James' Component ERA using the team averages instead of individual performances for the non-DIPS stats. James uses walks, home runs, each type of hit etc. to create a theoretical ERA based on the individual components. DIPS ERA takes that one step further so that players have the team average hits and distribution of hits between singles, doubles and triples.

I don't think any conclusions can be drawn from this beyond that it correlates better for the population used. Since several adjustments were made at the same time, there is no way to know which are causing the changes.

Posted 2:32 p.m., January 31, 2004 (#39) - RossCW
  Assuming my numbers are correct, 85 pitchers pitched over 160 innings in 2001, 50 of those also pitched over 160 innings in 2002. That means well over a third of the pitchers in the initial sample are not evaluated in the comparison to the next year. The numbers for other years appear to be similar.

Here is the Lahman database SQL:

-- Count pitchers with more than 160 IP (IPOUTS > 480) in consecutive seasons
SELECT p1.YEARID, COUNT(*)
FROM PitchingAnnual p1
JOIN PitchingAnnual p2
  ON p1.PLAYERID = p2.PLAYERID
 AND p1.YEARID = p2.YEARID - 1
WHERE p1.IPOUTS > 480 AND p2.IPOUTS > 480
GROUP BY p1.YEARID
ORDER BY p1.YEARID DESC
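
For anyone without the database handy, the query can be tried against a toy table using Python's built-in sqlite3 module. This is only a sketch: the in-memory table stands in for the PitchingAnnual view, and the four pitchers are made up.

```python
import sqlite3

QUERY = """
SELECT p1.YEARID, COUNT(*)
FROM PitchingAnnual p1
JOIN PitchingAnnual p2
  ON p1.PLAYERID = p2.PLAYERID AND p1.YEARID = p2.YEARID - 1
WHERE p1.IPOUTS > 480 AND p2.IPOUTS > 480
GROUP BY p1.YEARID
ORDER BY p1.YEARID DESC
"""


def repeat_qualifiers(rows):
    """Load (playerid, yearid, ipouts) rows and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE PitchingAnnual "
            "(PLAYERID TEXT, YEARID INTEGER, IPOUTS INTEGER)"
        )
        conn.executemany("INSERT INTO PitchingAnnual VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# Made-up pitchers, for illustration only.
rows = [
    ("a", 2001, 600), ("a", 2002, 610),  # qualifies both years
    ("b", 2001, 500), ("b", 2002, 300),  # drops out in year 2
    ("c", 2001, 490), ("c", 2002, 481),  # qualifies both years
    ("d", 2001, 480), ("d", 2002, 700),  # exactly 160 IP fails the > test
]
counts = repeat_qualifiers(rows)
print(counts)
```

The same self-join shape works for any threshold; on the figures quoted above, 50 of 85 is about 59% retained, so roughly 41% of the first-year group drops out.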

"PITCHINGANNUAL" is a view that sums a player's stats for the entire year regardless of the team they played for.

Posted 5:05 p.m., February 1, 2004 (#40) - Jay Jaffe (homepage)
  I don't claim the comparisons to be anything but what they are, a back-to-back correlation between pitchers who reached a certain threshold of innings in the two years for which I held data and with the findings published by Voros via his older system. To the best of my knowledge those are the first published correlations of DIPS 2.0 data, and they appear to support the conclusions Voros reached with DIPS 1.x using the same inning thresholds.

"[T]he most important single result in sabermetrics over the last five years"? Thanks, but I think that's a bit overstated. I am glad that the work I put into this is appreciated, though. Even many "DIPS skeptics" (including an MLB front-office person) have written to tell me that they're glad to see the data because it has some uses to them.

I will concede that RossCW has a point in #37 in that I haven't made any comparison of 98-99 DIPS 1.x to 02-03 DIPS 1.x and 98-99 DIPS 2.0 to 02-03 DIPS 2.0. I am but one man with limited capabilities, and while I've made a good faith effort to do the task I set out to do with as much accuracy as possible, I don't have the time or level of interest to build a spreadsheet that would run the older formula over a new set of data and vice versa. The methodologies for both are out there, though, so if somebody else wants to do so...

And while I see Ross' point about a non-random sample, I'm not sure how meaningful a comparison of, say, pitchers who pitched 100 innings in Season 1 and at least 1 inning in Season 2 would be -- the "sample size" issues seem obvious when it comes to small amounts of playing time. I think the general consensus here would be to use a baseline that has some meaningful level of playing time.

For what it's worth, I've rerun the 2002-2003 comparisons at a lower threshold, 50 innings in each season. The results are not as strong as at higher thresholds and there's a weird "hump" by which many of the 100 inning correlations are higher than either the 162 or the 50 inning ones, but dERA still correlates better than ERA with the following season's ERA.

category            DIPS 2.0
Years               02-03
Baseline IP         162        100        50
Number of P         56         96         214
$BB                 .673       .733       .554
$SO                 .801       .824       .767
$HR                 .372       .272       .246
$H                  .106       .132       .080

Years               02-03
Baseline IP         162        100        50
Number of P         56         96         214
ERA to next ERA     .288       .378       .325
dERA to next ERA    .513       .524       .432

Alas, I don't have XBH data in my sheets so that I could compare the correlations of component ERA against dERA and ERA.

Ross, you make one point there about "team average hits" etc., which is not accurate -- DIPS 1 used team averages, but DIPS 2 uses league averages.

Posted 5:45 p.m., February 1, 2004 (#41) - RossCW
  I will concede that RossCW has a point in #37 in that I haven't made any comparison of 98-99 DIPS 1.x to 02-03 DIPS 1.x and 98-99 DIPS 2.0 to 02-03 DIPS 2.0. I am but one man with limited capabilities, and while I've made a good faith effort to do the task I set out to do with as much accuracy as possible

Jay - I wasn't taking potshots at the work you did do, just responding to the claim it duplicated Voros's original work.

Ross, you make one point there about "team average hits" etc., which is not accurate -- DIPS 1 used team averages, but DIPS 2 uses league averages.

Thanks for the correction.

And while I see Ross' point about a non-random sample, I'm not sure how meaningful a comparison of, say, pitchers who pitched 100 innings in Season 1 and at least 1 inning in Season 2 would be -- the "sample size" issues seem obvious when it comes to small amounts of playing time.

I don't think it is a solvable problem - but it is something that needs to be considered with all the predictive systems. How do you handle data points where there is no data the second year?

The problem is not really sample size. If you predict something is true for 85 people and you have a good correlation for the 50 who you have data for the next year, can you draw conclusions about the entire first group from that? I don't think so. You may be able to draw a conclusion about the second group but that isn't very useful for predictive purposes.

There is a larger question with DIPS ERA that I can't find anything on. To what extent does it predict better than a different system that regresses ERA to the mean? After all, that is the practical impact of using league (or team) averages in making the calculation. Wouldn't we expect almost any system that did that to be better with year to year correlations?

BTW - are your baseline numbers inclusive - i.e. does the over 50 IP include pitchers who pitched over 100?
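
One point that bears on the regression question above: if every pitcher's ERA is pulled toward the league mean by the same fixed fraction, that is a linear rescaling, and a Pearson correlation is unchanged by it. The Python sketch below shows this on made-up numbers; it says nothing about schemes that regress different pitchers or different components by different amounts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regress_to_mean(values, league_mean, weight):
    """Keep `weight` of each value and fill the rest with the mean."""
    return [weight * v + (1.0 - weight) * league_mean for v in values]


# Made-up ERAs for five pitchers in consecutive seasons.
era_year1 = [3.10, 3.85, 4.40, 4.95, 5.60]
era_year2 = [3.60, 3.70, 4.80, 4.50, 5.20]

regressed = regress_to_mean(era_year1, league_mean=4.40, weight=0.5)
r_raw = pearson_r(era_year1, era_year2)
r_regressed = pearson_r(regressed, era_year2)
print(f"raw r = {r_raw:.4f}, regressed r = {r_regressed:.4f}")
```

Under this simple uniform scheme the two correlations match, so a higher correlation for dERA would have to come from how the components are treated rather than from uniform shrinkage alone. A uniform regression can still reduce the size of the prediction errors even though it leaves the correlation alone.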

Posted 1:00 a.m., February 2, 2004 (#42) - Jay Jaffe (homepage)
  BTW - are your baseline numbers inclusive - i.e. does the over 50 IP include pitchers who pitched over 100?

Yes, all of those groups (50, 100, 162) include everybody who met or exceeded the number of innings pitched in both seasons.

Posted 12:12 p.m., February 2, 2004 (#43) - RossCW
  One possible explanation for the decline in correlation over 162 IP could be that the range of variation among pitchers' ERAs is a lot less. Bad pitchers don't pitch that many innings two years in a row, but they may still get over 100, so you get better correlations at the lower threshold.

Posted 2:28 p.m., February 5, 2004 (#44) - Mike Emeigh(e-mail)
  One possible explanation for the decline in correlation over 162 IP could be that the range of variation among pitchers' ERAs is a lot less.

Right. Correlations over a narrow range of performance tend to mask true performance differences; you need to broaden the comparison enough to realistically cover as much of the actual range of performance as you can, without making it so broad as to allow players with small sample sizes to skew the results. An approach based on residuals would help us expand the sample size here; we have an article that has been submitted for the Visitor's Dugout which I'm hoping that Dan S will publish in a day or two.
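
The narrow-range effect is easy to see in a simulation. The Python sketch below is a toy model, not real pitching data: each simulated pitcher has a fixed talent plus independent noise in each of two seasons, and the restricted group keeps only those with the best (lowest) season-1 results. Selecting on season-1 results alone is a simplification of an innings threshold applied to both seasons.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def restricted_range_demo(n=20000, keep_fraction=0.3, seed=0):
    """Compare the season-to-season correlation in the full simulated
    population with the correlation among the best season-1 performers."""
    rng = random.Random(seed)
    talent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    year1 = [t + rng.gauss(0.0, 1.0) for t in talent]
    year2 = [t + rng.gauss(0.0, 1.0) for t in talent]
    r_full = pearson_r(year1, year2)

    # Lower is better (think ERA), so keep the lowest season-1 values.
    cutoff = sorted(year1)[int(n * keep_fraction)]
    kept = [(a, b) for a, b in zip(year1, year2) if a <= cutoff]
    r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])
    return r_full, r_restricted


r_full, r_restricted = restricted_range_demo()
print(f"full population r = {r_full:.3f}")
print(f"restricted group r = {r_restricted:.3f}")
```

The underlying talent spread is the same in both cases; only the selection changes, and the restricted group shows the weaker correlation.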

I prefer to use balls in play as the basis for my comparisons, rather than innings pitched. It's a personal preference, but since we're trying to look at a measure of skill on balls in play it makes more sense to me to evaluate the group based on that, rather than on IP which is only indirectly related to BIP.

-- MWE

Posted 2:42 p.m., February 5, 2004 (#45) - Craig B
  Jay - the original result was the important one... yours was a nice confirmation though.

RossCW - You have a serious reading comprehension problem. Please see a specialist, or a f***ing fourth-grade teacher. I said it replicates Voros's results. With a different data set, yes. The results (you know, the findings?) are replicated.

I see the point about redoing Voros's study, the idea that he may have screwed up his study. If he did, then it's odd that everyone since has reported similar results. But of course you will never be satisfied.

Posted 12:05 a.m., February 7, 2004 (#46) - RossCW
  You have a serious reading comprehension problem. Please see a specialist, or a f***ing fourth-grade teacher. I said it replicates Voros's results. With a different data set, yes.

You need a f**cking dictionary. The "results" are not "identical" if the data is different, you idiot. Here is the exact question I asked that provoked your first response:

who here actually duplicated McCracken's work and produced identical results?

As I suspected - no one here ever duplicated Voros's results before endorsing them and spreading them to the world at large.

I see the point about redoing Voros's study, the idea that he may have screwed up his study.

Then stop complaining about my raising the fact that no one bothered to do it. Or is that just another red herring?

If he did, then it's odd that everyone since has reported similar results

Everyone being Jay. As you note - it's nice that Jay got similar results but what if he hadn't? It's the first time anyone has done the work. When Tippett tried to duplicate the results that Voros claimed about the bunching of pitchers' BABIP, he couldn't.

The fact that a single year's DIPS ERA correlates better to next year's ERA than the raw annual ERA hardly has any meaning since no one has ever believed that ERA is very predictive of next year's ERA. That's why Bill James came up with component ERA in the first place.

I prefer to use balls in play as the basis for my comparisons, rather than innings pitched. It's a personal preference, but since we're trying to look at a measure of skill on balls in play it makes more sense to me to evaluate the group based on that, rather than on IP which is only indirectly related to BIP.

I agree - but if there were substantially different results between the two you would want to figure out why.

One issue that I have never seen addressed is that we are dealing with data that is based in part on how pitchers are managed. It is related to the above since IP is a measure of how long someone lasted in the game - not how many batters they faced, but how many they got out.

Posted 12:13 a.m., February 7, 2004 (#47) - RossCW
  Actually if the only question is BABIP the proper way of choosing a cutoff is probably just balls in play.