Pitching and Defense
M1421: Sabermetrics, Scouting, and the Science
of Baseball
MIT ESP – HSSP Summer 2008
Wins and Losses
•
Wins and losses are the
measure most commonly used to judge pitchers
•
A pitcher’s record is
heavily dependent on the quality of his offense and bullpen, and on luck
•
For example, Chris Young
went 9-8 last year with a 3.12 ERA, while Tim Wakefield went 17-12 with a 4.76
ERA. The Padres scored 3.76 runs per game when Young started; the Red Sox
scored 5.37 for Wakefield.
Earned Run Average (ERA)
•
ERA is much better than
wins and losses, since it tells us only about what happens when the pitcher is
on the mound
•
However, it is still far
from perfect, as it depends heavily on how a pitcher does with runners
on base, which is something most pitchers have little control over
•
In 2005, Jarrod Washburn
allowed 184 hits, 51 walks, and 19 home runs, and struck out 94 hitters in
177.3 innings, posting a 3.20 ERA and getting a $40 million contract from the
Seattle Mariners. In 2006, he allowed 198 hits, 55
Component ERA (ERC)
•
Component ERA solves the
problem with ERA by looking at what a pitcher’s ERA should have been,
based on his component statistics, like hits, walks, and strikeouts
•
We can evaluate pitchers
just like we do hitters, using the BaseRuns formula to compute how many runs
they should have allowed
•
Washburn’s 2005 ERC was
4.18, much closer to his 2006 ERA
DIPS/FIP
•
Unfortunately, component
ERA is also far from perfect, because a pitcher’s hit total is still heavily
influenced by luck and the fielders behind him
•
Almost a decade ago,
Voros McCracken found that batting average on balls in play (BABIP) was very
inconsistent from year to year: the best pitchers in baseball allowed roughly
the same BABIP as the worst
•
He invented a formula,
DIPS,
Pitchf/x
•
A new system that has
come online in the past two years promises to provide lots of future innovation
in pitching analysis
•
Pitchf/x measures the
speed, location, and break of every pitch
•
Eventually, we will be
able to better understand what makes a good pitch and more quickly identify true
changes in talent level
•
Extensions of the system
could also help us evaluate fielding and hitting
DEFENSIVE EVALUATION
•
We want to be able to
evaluate a (position) player’s defensive skills/talent/value in such a way that
we can combine it with his offensive (and baserunning) value in order to get a:
•
Total value for a
player
•
The best format for an
evaluation of any of a player’s skills is runs, since that is the
currency of baseball, and runs can easily be converted into estimated wins.
•
So, no matter what
method or metric we use to evaluate defense, we ultimately want to present it
as runs above or below some baseline. A good baseline for defense is league average
for that position.
•
And of course when we
present a metric (a statistic or formula that measures something
tangible) in runs, we need to specify per how many games, PA, defensive chances,
etc.
“You got it! No, you got it!”
THE DEFENSIVE SPECTRUM
DH ***** 1B ***** LF/RF ***** 3B ***** CF ***** 2B ***** SS ***** C
•
Each position in the
spectrum requires more skill than all the positions to its left and less skill
than all the positions to its right.
•
Because of this, as you
move to the right in the spectrum, there are fewer and fewer players that can
play each position.
•
Because of this, and for
other reasons, as you move to the right, players generally hit worse.
•
If a player moves from
one position to a position to its left, he will be “better” at
the new position.
•
As players age, and their
defensive skills deteriorate, they tend to move to the left. Sometimes a team realizes that a young player
cannot play his position adequately, and they move him to the left.
•
Even though catchers are
at the far right of the spectrum (and very few people can play the position adequately),
moving a catcher to the left will probably not “cause” him to improve defensively. Catcher is a unique position, and you can
make the argument that catchers do not even belong on the spectrum. However, they are the worst hitters, at
least in recent times (since around 1990).
•
Players rarely move to
the right in the spectrum, and if they do, it is usually a bad move.
FIELDING METRICS
Fielding Percentage → Range Factor → Range (“Poor Man’s” ZR) → ZR → UZR
(DER, WOWY)
Player #1: So, what is your fielding percentage this year?
Player #2: .978. What is your Range Factor?
Player #1: 3.48 per 9 innings. How about your Zone Rating?
Player #2: It’s .708. How is your UZR?
Player #1: Good. +6 per 150. How’s your WOWY?
Player #2: What the he** are you talking about?
FIELDING PERCENTAGE (FP or FA)
“Oh, crud, I made an error. There goes my fielding percentage!
Maybe I should use a glove next time.”
•
Putouts + Assists
divided by Total Chances
•
Equivalently, one minus
a fielder’s errors divided by his total chances.
•
Total chances = Putouts
+ Assists + Errors
•
FP tells us how good a
fielder’s “hands” are. It tells us how
“error prone” a fielder is. It tells us
nothing about how good a fielder’s range is.
•
It is also subject to
the whims of the official scorers.
•
Range is more important
than “hands.”
•
All we really care about
is how many plays a fielder makes.
After all, there is virtually no difference between an error and a
hit. If a fielder gets to a ball and
commits an error, it is essentially the same thing as if he never got to the
ball in the first place.
•
If we wanted to (and we
probably don’t, since FP is such a bad measure of fielding talent/value), we
could convert FP into runs.
•
How could we do that?
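One possible way to approach the question above, as a minimal sketch rather than an established method: compute FP from the definition on this slide, then compare a fielder's errors with what a league-average fielder would commit in the same number of total chances. The `league_fp` input and the `runs_per_error` value of 0.75 are assumptions made for illustration only; they do not come from the slides.

```python
def fielding_percentage(putouts, assists, errors):
    """FP = (putouts + assists) / total chances, with TC = PO + A + E."""
    total_chances = putouts + assists + errors
    if total_chances == 0:
        raise ValueError("fielder has no total chances")
    return (putouts + assists) / total_chances


def fp_runs_above_average(putouts, assists, errors, league_fp,
                          runs_per_error=0.75):
    """Runs saved (+) or cost (-) versus a league-average fielder.

    runs_per_error is an ASSUMED cost of turning an out into an error;
    it is an illustrative placeholder, not a value from the slides.
    """
    total_chances = putouts + assists + errors
    # Errors an average fielder would make in the same number of chances.
    expected_errors = (1.0 - league_fp) * total_chances
    return (expected_errors - errors) * runs_per_error


# Hypothetical season line: 200 putouts, 245 assists, 10 errors.
fp = fielding_percentage(200, 245, 10)
runs = fp_runs_above_average(200, 245, 10, league_fp=0.970)
```

Because the result depends entirely on the assumed run cost of an error, it is best read as a rough scale for how little FP differences usually matter, not as a precise valuation.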
RANGE FACTOR
“Oh, yeah, I got range! That’s right!”
•
Range Factor (RF) =
(Putouts + Assists) divided by innings played, usually scaled to 9 innings.
•
Better than FP because
it gives credit to a fielder for every play he makes and does not penalize a
fielder if he makes a lot of errors but gets to a lot of balls that other
fielders might not get to.
•
It will penalize players
who do not make a lot of errors but who don’t get to a lot of balls, as well it
should.
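The Range Factor definition above can be sketched in a few lines. The scaling to 9 innings follows the "per 9 innings" figure quoted in the player dialogue; the function name and the sample totals are hypothetical.

```python
def range_factor(putouts, assists, innings, per_innings=9.0):
    """Plays made (putouts + assists) per `per_innings` innings played."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return per_innings * (putouts + assists) / innings


# Hypothetical fielder: 150 putouts and 350 assists in 1293 innings.
rf = range_factor(150, 350, 1293)
```

Note that errors never enter the calculation: a fielder is credited for every play he completes and is not otherwise charged for the ones he misses.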
Problems with RF
•
It does not take into
consideration the number of opportunities.
•
It treats putouts the
same as assists for infielders.
•
We can also convert RF
into runs saved or cost, or runs above/below average.
•
How would we do that?
DAVID GASSKO’S “RANGE,” OR A “POOR MAN’S
ZR”
“I am so poor, I can’t even afford a Zone Rating.
I got to get out of the minor leagues and start
making some money.”
•
It determines each
fielder’s estimated opportunities by the following:
•
It computes the number
of balls in play to the outfield and to the infield, based on a team’s
pitchers’ outs recorded (3 * IP) minus K minus outs on base minus DP.
•
If G/F ratios are
available for a team or for its pitchers, it separates those BIP into infield
(ground balls) and outfield opportunities (fly balls and line drives to the
OF). If not, these can be estimated
from league averages.
•
If the numbers of TBF by
LH and RH pitchers are known, we can use them to apportion the IF and OF
opportunities to each fielder. If not,
then these can be estimated as well.
•
Now that we know, or at
least can estimate, each fielder’s opportunities, we can divide each outfielder’s
successful plays (catching a fly ball or line drive) by his opps, and divide each infielder’s successful plays
(turning a ground ball into an out) by his opps.
We ignore infield pop-ups and line drives.
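The opportunity estimate described above can be sketched as follows. This follows the arithmetic on the slide literally; it is not David Gassko's actual implementation. The `gb_fraction` and `position_share` parameters are hypothetical stand-ins for the G/F split and the left/right-handed apportionment, and all sample numbers are invented.

```python
def estimate_opportunities(innings, strikeouts, outs_on_base, double_plays,
                           gb_fraction):
    """Split a team's balls in play into infield and outfield opportunities.

    Balls in play are taken, as on the slide, to be
    3 * IP - K - outs on base - DP.
    gb_fraction is the ASSUMED share that are ground balls (from team
    G/F data or a league average).
    """
    balls_in_play = 3 * innings - strikeouts - outs_on_base - double_plays
    infield = balls_in_play * gb_fraction
    outfield = balls_in_play - infield
    return infield, outfield


def fielder_opportunities(group_opportunities, position_share):
    """Apportion infield or outfield opportunities to one fielder.

    position_share is a HYPOTHETICAL fraction, e.g. derived from how many
    batters the team's left- and right-handed pitchers faced.
    """
    return group_opportunities * position_share


def range_rate(successful_plays, opportunities):
    """Range = successful plays divided by estimated opportunities."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return successful_plays / opportunities


# Invented team totals: 1440 IP, 1000 K, 50 outs on base, 140 DP.
infield, outfield = estimate_opportunities(1440.0, 1000, 50, 140, 0.45)
shortstop_opps = fielder_opportunities(infield, 0.30)
```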
How do we figure out each fielder’s
successful plays?
•
For outfielders, it is
simple and clean. All putouts are, by
definition, successful catches.
•
For infielders, other
than at 1B, we use assists and ignore putouts, although some assists are on
relay throws and some putouts are ground balls without a throw.
•
For 1B, many of his
groundball outs are putouts (no throw).
We want to make sure that we don’t miss those, so we can take his total
putouts and subtract the assists from the rest of the infielders.
•
The final number is
successful batted ball plays, not including pop ups and line drives for
infielders, divided by opportunities.
An error is treated like a hit – simply a missed opportunity. This
system penalizes a player for making an error or not getting to a ball that was
presumably hit in his general location, and gives credit for the opposite.
•
That is exactly what
a defensive evaluation system is supposed to do!
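As a small illustration of the counting rules above, here is one way they could be written down. The position labels and the function name are assumptions made for this sketch, and the sample totals are invented.

```python
OUTFIELD = {"LF", "CF", "RF"}
THROWING_INFIELD = {"2B", "3B", "SS"}


def successful_plays(position, putouts, assists, other_infield_assists=0):
    """Estimate successful batted-ball plays from standard fielding totals.

    - Outfielders: every putout counts as a catch.
    - 2B, 3B, SS: assists only; putouts are ignored.
    - 1B: his putouts minus the assists credited to the other
      infielders, to approximate unassisted ground-ball outs.
    """
    if position in OUTFIELD:
        return putouts
    if position in THROWING_INFIELD:
        return assists
    if position == "1B":
        return putouts - other_infield_assists
    raise ValueError(f"unsupported position: {position}")


# Invented totals for three fielders on the same team.
cf_plays = successful_plays("CF", putouts=380, assists=6)
ss_plays = successful_plays("SS", putouts=230, assists=450)
first_base_plays = successful_plays("1B", putouts=1300, assists=90,
                                    other_infield_assists=1150)
```

The first-base figure is only as good as the assumption that every other infielder's assist ended in a putout at first, which is the source of the uncertainty about first basemen noted under the weaknesses of Range.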
What are the weaknesses of Range?
•
It only estimates
successful plays and opportunities for all fielders, so it includes some noise.
•
It is not really sure
how many ground ball outs a first baseman makes.
•
It does not know how
hard a ball is hit, where it is hit, where the fielder might be playing (based
on the batter, outs, and baserunners), and other things.
•
These things tend to
“even out” in the long run, so over several years, Range is a very good
indicator of defensive talent/value.
•
How can we turn Range
into runs saved or cost?
Zone Rating (ZR)
“I’m in the Zone baby!
Maybe next time I’ll be out of the Zone – baby!”
•
Uses hit location data.
•
Records how many fly
balls and line drives are hit in each outfielder’s “zone” and how many of these
are turned into outs.
•
Does the same thing for
infielders and ground balls.
•
A fielder’s “zone” is
defined as that area of the field around a fielder in which that fielder makes
at least 50% of the plays.
•
Revised Zone Rating
(RZR) also keeps track of plays made outside of a fielder’s zone.
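A minimal sketch of the bookkeeping described above, assuming each batted ball has already been tagged with whether it landed in the fielder's zone and whether he turned it into an out. The record layout is hypothetical; real hit location data is far more detailed.

```python
def zone_rating(batted_balls):
    """Return (zone rating, out-of-zone plays) for one fielder.

    batted_balls is an iterable of (in_zone, out_made) boolean pairs.
    Zone rating is outs made on balls in the zone divided by balls in
    the zone. Plays made outside the zone are counted separately, in
    the spirit of Revised Zone Rating.
    """
    balls_in_zone = 0
    outs_in_zone = 0
    out_of_zone_plays = 0
    for in_zone, out_made in batted_balls:
        if in_zone:
            balls_in_zone += 1
            if out_made:
                outs_in_zone += 1
        elif out_made:
            out_of_zone_plays += 1
    if balls_in_zone == 0:
        raise ValueError("no balls hit into the fielder's zone")
    return outs_in_zone / balls_in_zone, out_of_zone_plays


# Invented sample: 500 balls in the zone (354 fielded), 40 out-of-zone plays.
sample = ([(True, True)] * 354 + [(True, False)] * 146
          + [(False, True)] * 40 + [(False, False)] * 60)
zr, extra_plays = zone_rating(sample)
```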
Weaknesses
•
It does not distinguish
between the location and speed of each ball within a zone, or even outside of a
player’s zone.
•
It arbitrarily creates a
“zone” for a player.
•
It treats an error the
same as a hit.
•
Like most of the systems
so far, it does not account for the position of the fielders.
Again, how can we turn a ZR into
“runs?”
Defensive Efficiency Rating (DER)
“There is no “I” in “teim!”
Or is there?
Anyone want to argue with me?
I didn’t think so!”
•
This is only used on a
team level.
•
We can easily figure out
how many balls in play (BIP) a team allows.
•
DER = outs on non-HR
balls in play divided by non-HR BIP.
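The DER definition above is a single ratio; a short sketch, with invented team totals, makes the inputs explicit. The function name is an assumption.

```python
def defensive_efficiency(outs_on_balls_in_play, balls_in_play):
    """DER = outs on non-HR balls in play / non-HR balls in play."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    if outs_on_balls_in_play > balls_in_play:
        raise ValueError("cannot record more outs than balls in play")
    return outs_on_balls_in_play / balls_in_play


# Two invented teams facing the same number of balls in play.
team_a = defensive_efficiency(3100, 4400)
team_b = defensive_efficiency(3000, 4400)
extra_outs = (team_a - team_b) * 4400  # outs Team A converted beyond Team B
```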
Weaknesses
•
It does not distinguish
between ground balls, line drives, and fly balls, and each of these has
different out percentages. (If you have
batted ball type data or pitcher G/F ratio, you can adjust for this.)
•
It does not account for
how hard each ball is hit, the exact location, or the position of the fielders
– similar to the weaknesses of Range.
•
It is only useful for
teams and not for individual players, so it is limited as a projection tool.
Without and With You (WOWY)
•
Compares how many plays
a fielder makes with how many plays everyone else makes at that position,
with the same pitchers, and in the same park.
•
It is “clean.”
•
It is easy for everyone
to understand.
•
You can’t blame your bad
defense on, or credit your good defense to, anyone or anything but yourself.
•
It can be used for other
things that are hard to evaluate, like catcher defense, stolen bases against
pitchers, double plays, or first basemen “scoops.”
•
It may suffer from
sample size issues. The more you try to
control for the context, the smaller the sample size.
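The with/without comparison can be sketched as below. The choice to match on (pitcher, park) contexts and to weight each context by the fielder's own opportunities is an assumption made for illustration, not a description of any published WOWY implementation; all names and numbers are invented.

```python
def wowy_plays_above(with_you, without_you):
    """Plays made above what other fielders managed in the same contexts.

    Both arguments map a context key, here a (pitcher, park) tuple, to a
    (plays_made, opportunities) pair. Contexts seen in only one of the
    two samples are skipped, which is where sample size shrinks.
    """
    plays_above = 0.0
    for context, (plays, opps) in with_you.items():
        if context not in without_you or opps == 0:
            continue
        other_plays, other_opps = without_you[context]
        if other_opps == 0:
            continue
        # Plays the other fielders would be expected to make
        # given this fielder's number of opportunities.
        expected = opps * other_plays / other_opps
        plays_above += plays - expected
    return plays_above


with_you = {
    ("Pitcher A", "Park X"): (50, 100),
    ("Pitcher B", "Park X"): (30, 50),
}
without_you = {
    ("Pitcher A", "Park X"): (90, 200),
    ("Pitcher B", "Park X"): (25, 50),
    ("Pitcher C", "Park X"): (10, 20),  # no matching "with" sample
}
result = wowy_plays_above(with_you, without_you)
```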
Ultimate Zone Rating (UZR)
“You’ve heard of ‘Ultimate Frisbee’?
(Sound of crickets.)
O.K., maybe you haven’t. Maybe I’m stuck in the 70’s.
Or maybe you guys are just too darn young!”
•
Uses detailed hit type
and location data.
•
Accounts for the
position of the fielders.
•
Uses park factors.
•
Uses the exact hit value
for each batted ball.
•
Treats errors
differently than hits.
•
Creates “buckets:”
–
Type, location, and speed
of batted ball.
–
Handedness of batters.
–
G/F tendency of pitchers.
–
Baserunners and outs.
–
Infield and outfield (3
of them) park factors.
•
For each bucket,
calculates the percentage of plays made by every fielder.
•
Uses that to compute each
fielder’s UZR in runs.
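The bucket idea above can be illustrated with a heavily simplified sketch. This is not the actual UZR calculation: the bucket labels, league out rates, and run values below are all invented, and the real system makes adjustments (park factors, fielder positioning, separate treatment of errors) that are omitted here.

```python
def bucket_runs(fielder_buckets, league_out_rate, run_value):
    """Sum runs saved (+) or cost (-) across batted-ball buckets.

    fielder_buckets: bucket -> (plays_made, chances) for one fielder.
    league_out_rate: bucket -> share of such balls that an average
        fielder at the position turns into outs.
    run_value: bucket -> ASSUMED run difference between an out and the
        typical hit in that bucket.
    """
    total = 0.0
    for bucket, (plays, chances) in fielder_buckets.items():
        expected_plays = chances * league_out_rate[bucket]
        total += (plays - expected_plays) * run_value[bucket]
    return total


# Invented buckets keyed by (type, location, speed).
fielder = {
    ("GB", "hole", "medium"): (30, 50),
    ("GB", "at fielder", "soft"): (95, 100),
}
league = {
    ("GB", "hole", "medium"): 0.50,
    ("GB", "at fielder", "soft"): 0.97,
}
values = {
    ("GB", "hole", "medium"): 0.75,
    ("GB", "at fielder", "soft"): 0.75,
}
runs_saved = bucket_runs(fielder, league, values)
```

In this invented example the fielder gains runs on the difficult balls and gives some back on the routine ones, which is the kind of distinction that fielding percentage and plain Zone Rating cannot make.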
Weaknesses
None!
2008 Red Sox
The Future of Defensive Evaluation
•
Hit f/x
•
Tracks the position of
all fielders before the ball is hit.
•
Exact landing location,
speed off the bat, and hang time (for fly balls and line drives) of all batted
balls.
•
Given the exact
characteristics of a batted ball and the exact position of a fielder, we can
calculate exactly how often an average fielder does or does not successfully
field any given batted ball.