Tango on Baseball Archives

Bases Batted Forward (December 3, 2003)

This would be the flip side to "Total Bases + Walks + SB". If you are familiar with "Yards After Catch", this is what BBF is.

My perspective is that if you've got access to the play by play, then you have to realize that not all bases are created equal (from the standpoint of player evaluation).

This was posted in the "How Runs Are Really Created" thread, and I'll post it here again:

Chance of scoring, from each base/out state

      0 outs   1 out   2 outs
1B     .38      .25     .12
2B     .61      .41     .21
3B     .86      .68     .29

As you can see, if you manage to score a runner from 1B with 2 outs, that's a lot more valuable than scoring a runner from 3B with 0 outs. So, if you are intent on using "Bases Batted Forward", you'd want to weight the bases by the base/out state.
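One way to read the weighting idea is to credit each advance with the change in the runner's chance of scoring. The sketch below uses the table above; the function name `advance_credit` and the `HOME` marker are assumptions made for illustration, not part of BBF as defined by its author.

```python
# Chance that a runner eventually scores, keyed by (base, outs), from the table above.
SCORE_CHANCE = {
    (1, 0): 0.38, (1, 1): 0.25, (1, 2): 0.12,
    (2, 0): 0.61, (2, 1): 0.41, (2, 2): 0.21,
    (3, 0): 0.86, (3, 1): 0.68, (3, 2): 0.29,
}
HOME = 4  # hypothetical marker meaning "the runner scored"


def advance_credit(start_base, start_outs, end_base, end_outs):
    """Change in one runner's chance of scoring caused by a single play."""
    before = SCORE_CHANCE[(start_base, start_outs)]
    # A runner who crosses the plate has scored with certainty.
    after = 1.0 if end_base == HOME else SCORE_CHANCE[(end_base, end_outs)]
    return after - before


print(round(advance_credit(1, 2, HOME, 2), 2))  # scored from 1B with 2 outs -> 0.88
print(round(advance_credit(3, 0, HOME, 0), 2))  # scored from 3B with 0 outs -> 0.14
```

Under this reading, both plays move a runner home, but the first adds roughly six times as much scoring chance as the second.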

If you don't care about the base/out, then you might as well use basic Linear Weights. In order of preference, you want things expressed in: win-currency, run-currency, bases-currency.

The other reason that I'm not enamoured with these bases-currency metrics is that they never give you a context (opportunities, etc). A counting stat, without context, is useless.

--posted by TangoTiger at 04:59 PM EDT

Posted 5:47 p.m., December 3, 2003 (#1) - Matt Philip
I forgot to mention that one of the advantages of the BBF is that the BBF% takes into consideration the number of opportunities to advance runners that the batter has. The percentage then provides a context for the batter's success, whether he bats leadoff, cleanup or last.

That's a great point about adding the "base value" to the BBF, one that came up at a recent SABR meeting when I presented my research on the 2003 postseason BBF stats. On the one hand, I like the idea of keeping it a simple, accessible stat that a common fan can keep track of at a ballgame. Perhaps I'll add some auxiliary stats to give it more "gravitas." ;)

Posted 8:15 p.m., December 3, 2003 (#2) - David Smyth
---"My perspective is that if you've got access to the play by play, then you have to realize that not all bases are created equal (from the standpoint of player evaluation)."

The key phrase there is the part in parentheses, whatever that is supposed to mean. But, assuming you have all of the data to work with--all of the bases--then I believe that all bases *are* created equal. It takes four consecutive bases to make a run. If any of those are removed, the run is gone. Why is 1st base worth .27 and second base only .17 additional runs in a conventional weighting system? Because that reflects the *odds* of scoring, without knowledge of whether the runner *did* score. But when we have all of the data, we don't need to worry about that. So, yes, .27+.17+.17+.39 does sum to 1, but so does .25+.25+.25+.25.
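The arithmetic in the last two sentences can be checked with a short sketch. It only compares the two weight sets quoted above; the function name `credit` and the idea of summing the first few weights for a runner who stops partway are assumptions made for illustration.

```python
# Run value credited to each of the four bases a runner covers on the way home.
CONVENTIONAL = [0.27, 0.17, 0.17, 0.39]  # weights quoted in the comment above
EQUAL = [0.25, 0.25, 0.25, 0.25]         # "all bases are created equal"


def credit(weights, bases_gained):
    """Run credit for a runner who gained the first `bases_gained` bases."""
    return sum(weights[:bases_gained])


for reached in range(1, 5):
    print(reached,
          round(credit(CONVENTIONAL, reached), 2),
          round(credit(EQUAL, reached), 2))
```

Both schemes give a full run once all four bases are gained; they differ only in how that run is split among the bases along the way.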

I disagree with Tango about the merits of "base-centered" evaluation systems. I think they are the real deal, and will eventually be the basis for sabermetrics. I think they remove a lot of the need to *estimate* things by means of probabilities. Why use probabilities when you have the actual data? Of course, there will (presumably) always be a need for such, as in the "attribution" of a base when more than 1 player is involved (on both offense and defense), but the more you can reduce the dependence on probabilities and use actual outcomes, the better off you are.

As far as the problem of not having proper opportunity factors, the proper opp factor depends on what you are trying to evaluate, but with full PBP data, it should be possible to construct appropriate opp factors for every application.

Posted 9:30 a.m., December 4, 2003 (#3) - tangotiger
David, I added the thing in parentheses exactly for this reason.

From the perspective of player evaluation, we ARE talking about odds and probabilities. It's not about getting bases, it's about increasing the odds that your team will win.

I believe that you are 100% wrong on the issue when you say "I think they are the real deal, and will eventually be the basis for sabermetrics." I say that the basis of sabermetrics is and always will be the marginal change of win probabilities (in baseball, football, hockey, basketball, soccer, and any other team sport).

Posted 2:20 p.m., December 4, 2003 (#4) - Cyril Morong
Has anyone checked to see what the correlation is between BBF divided by plate appearances and OPS over a long period of time? Has anyone run a regression for players with a large number of plate appearances that has BBF/PA as the dependent variable and maybe singles, doubles, triples, home runs, walks, GDP, outs, sacs and SFs as the independent variables?

My hunch is that BBF/PA will be highly correlated with these other stats. At a team level, how much better does BBF predict run scoring than some other stat? Who are the best players in this stat?
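A minimal sketch of the correlation check asked about above, using only the standard library. The function name `pearson` and the sample values are assumptions; the numbers are made up purely to exercise the function and are not real player data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values for illustration only; NOT real BBF/PA or OPS figures.
bbf_per_pa = [0.50, 0.55, 0.62, 0.48, 0.70]
ops = [0.720, 0.760, 0.830, 0.700, 0.910]
print(round(pearson(bbf_per_pa, ops), 3))
```

With real data, one row per player-season above some plate-appearance cutoff would replace the two invented lists.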

Posted 2:22 p.m., December 4, 2003 (#5) - tangotiger
Just to expand on my point:

What are we after? We are trying to figure the impact that a player has on a given team.

What kind of impact? The only kind that matters: win impact.

And what does win impact mean? It's the marginal impact this player has on this given team in contributing to wins.

And how do you figure marginal impact? It's the change in win probability that this player is responsible for, given this team.

How you establish this win impact is fun exercise #1.
How much this win impact will cost you is fun exercise #2.

Outs? Bases? Runs? Nope, nope, nope. Wins.
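The "change in win probability" idea can be sketched as a running total per player. This is an illustration under stated assumptions: the `Play` record, the `win_impact` function, and the probabilities are all hypothetical, and a real system would take its win probabilities from a table by inning, score, base and out.

```python
from dataclasses import dataclass


@dataclass
class Play:
    player: str
    wp_before: float  # team's win probability before the play
    wp_after: float   # team's win probability after the play


def win_impact(plays):
    """Sum each player's change in team win probability over all plays."""
    totals = {}
    for play in plays:
        delta = play.wp_after - play.wp_before
        totals[play.player] = totals.get(play.player, 0.0) + delta
    return totals


# Hypothetical plays with invented probabilities, for illustration only.
plays = [
    Play("A", 0.50, 0.58),
    Play("B", 0.58, 0.54),
    Play("A", 0.54, 0.71),
]
print({name: round(total, 2) for name, total in win_impact(plays).items()})
```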

Posted 5:04 p.m., December 4, 2003 (#6) - David Smyth
Well, this goes back to whether you are measuring in real-time or after the fact, which we have discussed quite a bit lately. I was probably a bit "optimistic" in that quote you cited, because I certainly don't have any influence on what the saber gurus of the moment are interested in. I will say, though, that 1) bases are the stuff of which runs are made, and 2) with the PBP, bases can be accurately counted instead of estimated. It seems reasonable to me, therefore, that a "base-centered" evaluation method is a natural thing to have. That doesn't mean that the abstract thinkers can't have their "marginal" frameworks. But for those who like to first take care of what is non-abstract and concrete, the scanty attention paid to bases is disappointing. Why is Tuttle's Base Production system, which is intricate and well thought-out (over 30 articles explaining it on rsbb), almost never mentioned?

Posted 5:15 p.m., December 4, 2003 (#7) - David Smyth
---"Outs? Bases? Runs? Nope, nope, nope. Wins."

Well, wins are made up of runs (scored and allowed), and runs are made of bases (gained and lost). It is misleading to suggest that wins are a "first-order" and independent entity. The decision to treat them as such and figure out the applicable math and probabilities is just that--a decision or preference. Don't try to say that it is more "fundamentally" correct to do it that way.

Posted 5:17 p.m., December 4, 2003 (#8) - Tangotiger
In terms of evaluating a player's true talent level, what you want is a marginal win probabilistic approach. And, in that sense, not all bases are created equal (from an individual player standpoint). The backward looking stuff is really just for fun, and has no real-world impact. MVP, HOF, all that stuff... just fluff really. The good stuff is estimating the future, and how much to pay for it. And, you can't do that with a base approach and think you'll get it better than a win approach.

Anyway, I really have nothing more to add to this topic, so I'll bow out at this time.

Posted 5:52 p.m., December 4, 2003 (#9) - David Smyth
---"The backward looking stuff is really just for fun, and has no real-world impact. MVP, HOF, all that stuff... just fluff really. The good stuff is the estimating the future, and how much to pay for it."

That is your opinion, and you are entitled to it. But don't state it as fact. And as we have seen with Marcel, predicting the future is not the most difficult thing in the world. Personally, I find the explaining of the past to be just as interesting as the predicting of the future.

---"Anyway, I really have nothing more to add to this topic, so I'll bow out at this time."

Why do people have to "announce" that they are bowing out? Why not just not post anymore? And what if somebody posts something really good or interesting after you say that? And it's not like anybody has been rude or anything on this thread. To me, that sort of "announcement" is just sabermetric snobbishness--Tango has his mind made up, and doesn't want to be bothered with it anymore.

Sorry Tango, but that's how it strikes me. I may be wrong, of course. You are generally very polite and accommodating. But those "announcements" bug me (even tho I have done the same thing myself). :-)

Posted 6:14 p.m., December 4, 2003 (#10) - David Smyth
I almost missed this.

---"The good stuff is the estimating the future, and how much to pay for it."

Then why do you devote your efforts to WPA? Why not just help MGL to refine Slwts?

Posted 11:15 p.m., December 4, 2003 (#11) - Tangotiger
I meant that I want to bow out, because I feel I've exhausted anything I have to say. I'm bored with it! So, it's a preemptive strike, in case someone wanted to continue this with me specifically. It's not like I said something so new, and decided to run for the hills. I just feel like a broken record at this point, like a broken record at this point. In any case, if I receive a personal email on any topic, I almost always respond.

As for WPA, the cool part is all the little things to try to figure out the components. The end-result, the list, is really a byproduct. But, it's these lists that usually excite people. I get my kicks on figuring out the win probability matrix for inning/score/base/out/park, and splitting hitting from running and pitching from fielding. As for superLWTS, WPA will one day supplant it.

Posted 11:19 p.m., December 4, 2003 (#12) - Tangotiger
As for the snob comment, I felt I was being courteous in announcing my departure, especially considering that there's a reasonable expectation that I respond to all comments on this section of Primer, seeing that I start all the threads.

Posted 6:01 p.m., December 5, 2003 (#13) - David Smyth
I apologize for my tone, Tango. I might have just been a bit frustrated that hardly anyone seems to be interested in base-centered analysis.

Posted 6:10 p.m., December 6, 2003 (#14) - David Smyth
---"As for superLWTS, WPA will one day supplant it."

I guess I'm probably beating a dead horse, but why is that so (if you will break your "announcement" and reply)?

Slwts is designed to measure ability. AWP/ALP is designed to measure value. WPA is designed to measure something in between (although I am certainly aware that none of these is a "pure" measure of anything). WPA might be as good as, or as desirable as, these other approaches, or perhaps even better in a total evaluation, but I still don't see that it is the "holy grail".

Posted 8:22 p.m., December 6, 2003 (#15) - Tangotiger
WPA measures theoretical marginal win impact based on actual performance in real-time in a context-specific environment. That, to me, is value.

WPA is easily adjustable to account for the opportunity context to measure theoretical win impact in a context-neutral setting (i.e., superLWTS). However, superLWTS doesn't account for everything that happens, while WPA does. (MGL has decided that some things are just luck, so why account for it. I give credit to the runners for WP, BK, etc, and a whole host of other things.)

To make it an ability metric, you need to regress the metric. Each of the subcomponents would have its own regression rate. (And, maybe the WP would regress 100% for the runners... I don't know. Same for clutch performance, etc.)

superLWTS is somewhere between a value and an ability metric. MGL has implicitly decided to regress 100% all those components that he doesn't include (like WP with a runner on base), and regress 0% all the other components.
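The per-component regression described above can be sketched as follows. The function name `regress`, the component names, and the rates are assumptions chosen for illustration; the comment gives no actual regression rates.

```python
def regress(observed, league_mean, regression_rate):
    """Pull an observed value toward the mean.

    regression_rate = 1.0 means "regress 100%" (return the mean);
    regression_rate = 0.0 means "regress 0%" (keep the observed value).
    """
    if not 0.0 <= regression_rate <= 1.0:
        raise ValueError("regression_rate must be between 0 and 1")
    return league_mean + (1.0 - regression_rate) * (observed - league_mean)


# Hypothetical components: (observed wins above average, assumed regression rate).
components = {
    "hitting": (2.0, 0.3),
    "running_on_wild_pitches": (0.4, 1.0),  # treated as pure luck in this sketch
}
ability = sum(regress(obs, 0.0, rate) for obs, rate in components.values())
print(round(ability, 2))  # -> 1.4
```

Leaving a component out of the metric entirely has the same effect as giving it a rate of 1.0, which is the point made about superLWTS.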