Tango on Baseball Archives

© Tangotiger

FANTASY CENTRAL (February 21, 2004)

I'm taking your requests.

At the very least, I'll provide the 2001, 2002, and 2003 data, and the 2004 Marcels.
--posted by TangoTiger at 09:54 PM EDT


Posted 3:08 a.m., February 22, 2004 (#1) - Dackle
  OK, can we have the 2001 to 2003 data, plus the 2004 projections?

Posted 4:02 a.m., February 22, 2004 (#2) - J Cross
  Well, I’ve put way too much thought into fantasy league baseball lately and here’s what I’ve come up with. You can use the following equations to determine rotovalue for 5x5 leagues:

Hitters:
AVG points = (AVG - .281) * AB * 0.061
HR points = HR * 0.088 - 1.94
SB points = SB * 0.088 - 0.94
R points = R * 0.036 - 3.05
RBI points = RBI * 0.033 - 2.74
(add them up for total value)

Pitchers:
W points = .2695*W-2.6815
S points = 0.0695*S-0.7118
ERA points = (3.88-ERA)*IP*0.0065
WHIP points= (1.28- whip)*IP*0.0382
K points= K*0.0203 - 2.517
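
A minimal Python sketch of the ten equations above, for anyone who would rather not type them into a spreadsheet. The function names and the two sample stat lines are made up for illustration; only the coefficients come from the post.

```python
def hitter_points(avg, ab, hr, sb, r, rbi):
    """Sum of the five hitting-category values from the post above."""
    return (
        (avg - 0.281) * ab * 0.061
        + (hr * 0.088 - 1.94)
        + (sb * 0.088 - 0.94)
        + (r * 0.036 - 3.05)
        + (rbi * 0.033 - 2.74)
    )


def pitcher_points(w, s, era, whip, ip, k):
    """Sum of the five pitching-category values from the post above."""
    return (
        (0.2695 * w - 2.6815)
        + (0.0695 * s - 0.7118)
        + (3.88 - era) * ip * 0.0065
        + (1.28 - whip) * ip * 0.0382
        + (k * 0.0203 - 2.517)
    )


# Hypothetical stat lines, invented for this example only.
print(round(hitter_points(avg=0.300, ab=600, hr=30, sb=10, r=100, rbi=100), 2))
print(round(pitcher_points(w=15, s=0, era=3.50, whip=1.20, ip=200, k=180), 2))
```

Note that a starter with zero saves still pays the -0.7118 save constant, so the totals are relative to an average rostered player rather than to zero production.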

I looked at 100 5x5 teams (10 leagues w/ 10 teams each) and it looks like team totals are distributed normally in each category (I’m still working on this though). I don’t have the numbers in front of me (they’re in a spreadsheet at work) but I think the teams average ~300 HR w/ a stdev of 30 and ~130 steals with a stdev of 30, so even though steals are much scarcer, either 30 more steals or 30 more homers allows you to pass 34% of the teams in that category and pick up another 3 points. Another implication of this is that the first 30 homers above average are worth a lot more than the next 30 homers (which would pass another 13.5% of teams and be worth 9*.135 points in a 10 team league). Anyway, those equations for rotopoints above are calculated for a 10 team league as rotopoints = (change in stdev)*.34*9, so they’re the value a player contributes to a team that’s average in every category. Using these formulas and Zips I figured out rotovalues for every player. I was thinking of fixing it up so that the spreadsheet would figure out how many stdev’s from average in each category your team is, based on the players you have drafted already, and re-rank all the players based on how much they could contribute to that team.

What do you think? Does this make any sense?
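
The 34% and 13.5% figures can be checked with the standard normal distribution. This is a small sketch that assumes, as the post does, that team totals in a category are normally distributed and that passing a fraction p of the other teams is worth p*(n-1) points; the function name is hypothetical.

```python
from statistics import NormalDist


def points_gained(z_from, z_to, n_teams=10):
    """Expected roto points gained in one category by moving from z_from to
    z_to standard deviations relative to the league mean, assuming team
    totals are normally distributed."""
    nd = NormalDist()
    return (nd.cdf(z_to) - nd.cdf(z_from)) * (n_teams - 1)


first_sd = points_gained(0, 1)   # average team adds one SD (about 30 HR or 30 SB)
second_sd = points_gained(1, 2)  # the next SD on top of that
print(f"first SD: {first_sd:.2f} points, second SD: {second_sd:.2f} points")
```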

Posted 12:36 p.m., February 22, 2004 (#3) - tangotiger
  SD is the most important thing to know to establish the roto values.

You also want to baseline by position, such that the "replacement" level at each position is also set to zero $. Gets more complicated when you have 1 ss, 1 2b, and 1 2b or ss.

The sum of the 200 players (or how many players are drafted) should equal the sum of the auction dollars available for the league. Those $ values are the max you should spend. Ideally, you should pick up 300$+ worth of players for 250$ of money.

That is, you should end up with 1 or 2 more star players than your opponents.
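
A rough sketch of the baselining described above: set replacement level at each position to $0, then scale so the drafted pool adds up to the league's total auction money. The function name, the player pool, and the slot counts are all hypothetical, and the sketch deliberately ignores the flex-slot complication (a "2b or ss" slot) mentioned in the post.

```python
from collections import defaultdict


def auction_dollars(players, slots, total_budget):
    """players: list of (name, position, raw_value).
    slots: league-wide number of players drafted at each position.
    Returns {name: dollars} for the drafted players only."""
    by_pos = defaultdict(list)
    for name, pos, value in players:
        by_pos[pos].append((value, name))

    above_replacement = {}
    for pos, pool in by_pos.items():
        pool.sort(reverse=True)
        drafted, rest = pool[:slots[pos]], pool[slots[pos]:]
        # Replacement level: best undrafted player at the position,
        # or the last drafted one if the pool runs out.
        replacement = rest[0][0] if rest else drafted[-1][0]
        for value, name in drafted:
            above_replacement[name] = value - replacement

    # Scale so the drafted pool sums to the money available
    # (assumes at least one player is above replacement).
    total = sum(above_replacement.values())
    return {name: v * total_budget / total for name, v in above_replacement.items()}


# Made-up pool: raw values could be roto points from any valuation method.
pool = [("SS-A", "SS", 10.0), ("SS-B", "SS", 6.0), ("SS-C", "SS", 4.0),
        ("OF-A", "OF", 12.0), ("OF-B", "OF", 9.0), ("OF-C", "OF", 8.0),
        ("OF-D", "OF", 7.0)]
dollars = auction_dollars(pool, slots={"SS": 2, "OF": 3}, total_budget=100)
print(dollars)
```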

Posted 3:16 p.m., February 22, 2004 (#4) - J Cross
  My leagues are drafts not auctions so I just need a rank order (not exact value). I do adjust positions based on replacement level.

Posted 3:49 p.m., February 22, 2004 (#5) - reno dakota(e-mail)
  For hitters, I just need basic stat projections a la ZIPS. (Hits, doubles, triples, runs, hr, rbi, sb, cs, bb, and k).

For pitchers, I need er, w-l, ip, and so.

I assume that would all be covered by Marcel's projections, but I wanted to make sure.

Thanks a lot. I'll cut you in on any profits, which will probably not be worth the postage.

Posted 6:32 p.m., February 22, 2004 (#6) - Nod Narb(e-mail)
  I've got a project in the works that is a real time draft advisor.

I have the prototype up and running successfully as a VB macro in Excel, but I would be interested in collaborating with someone who has programming skills to turn it into an actual executable file. If you have any interest in taking my idea and turning it into code, feel free to email me for more details.

Posted 6:33 p.m., February 22, 2004 (#7) - Brian P
  Have you run any data on other 5x5's with less/more team totals? It would be interesting to see how things would change and if that could be formulated reasonably. Tango's mention of replacement players is how I go into my auctions, which are made difficult by the number of variables (MI's, CI's, Util) and ability to manipulate your team makeup in terms of batting/pitching balance. Obviously, even in ranking players, understanding the level of replacement will be highly dependent on the number of teams...

Posted 7:58 p.m., February 22, 2004 (#8) - Score Bard (homepage)
  A beta version of my draft simulator is online at the homepage link.

Posted 8:46 p.m., February 22, 2004 (#9) - J Cross
  Very cool, Score Bard. I think I'm going to use it to simulate 2 drafts: 1 where I'm punting steals and one where I'm not. Then I'll look at how the two teams compare (based on ZiPS projections) to the one hundred teams I've looked at and see which team would score better.

I've looked at SABR scoring league but I haven't looked at leagues w/ a different # of teams. Maybe I will.

Nord, how did you get your equations?

Posted 9:20 p.m., February 22, 2004 (#10) - Nod Narb(e-mail)
  Nord, how did you get your equations?

I'm assuming this was directed towards me?

Anyway, I use a combination of standard deviations and linear weights based on factors such as positional scarcity and category quotas. This gives me an output of which player would be best to draft based on who has been drafted and who's left to draft. All of the values update in real time, as I tell the program who was drafted. Works fine in Excel, but would be much less cumbersome as a web applet or an executable file.

Since I'm actually in a Primer-based league, I don't want to discuss all of the details here. I'd be more than happy to exchange emails though.

Posted 9:56 p.m., February 22, 2004 (#11) - Michael
  I'm planning on doing what you describe, J. Cross, in terms of figuring out the expected distribution of each stat and updating it as my team and the other teams in my league fill in. But I'm also in a primer-based league (the same yahoo one as Nod Narb) so I also don't want to share too much. Should be fun to see a league of primers with more sabermetric type stats go at it.

Posted 11:26 p.m., February 22, 2004 (#12) - J Cross(e-mail)
  Wow, that primate league is going to be pretty darn competitive. I'd be interested to see how that plays out. I'm pretty much assuming that no one in my league is developing competing draft software but you never know.

Posted 3:22 a.m., February 23, 2004 (#13) - Joe Dimino(e-mail)
  Tango - here's our system:

Offense

adjOBP - (H+BB+HBP-CS)/(PA)
TB+SB
R
RBI

Pitching

Full Categories (14 for first, 13 for 2nd etc.)
ERA
adjOPSagainst - ((OBPagainst*2)+SLGagainst)
Strikeouts

1/2 Categories (7 for first, 6.5 for 2nd, etc.)
Quality Starts
Relief Points (2*SV+RW+H-RL-BS)

We auction and have a cap of $263 for a 24 man roster (13 hitters, 10 pitchers, one utility that can be hitter or pitcher) 14 teams, NL only.

Posted 3:25 a.m., February 23, 2004 (#14) - Joe Dimino(e-mail)
  Here's the long version of the pitcher aOPS category:

(2*((HA+BBI+HB)/PA))+(TBA/ABA)
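
For reference, the custom categories from posts #13 and #14 written out as Python functions. This is only a sketch: the function and argument names are hypothetical, and it assumes that in the Relief Points formula H means holds, RW and RL mean relief wins and losses, and BS means blown saves, which the posts do not spell out.

```python
def adj_obp(h, bb, hbp, cs, pa):
    """adjOBP = (H + BB + HBP - CS) / PA"""
    return (h + bb + hbp - cs) / pa


def adj_ops_against(ha, bbi, hb, pa, tba, aba):
    """Long form from post #14: 2 * OBP against + SLG against."""
    return 2 * ((ha + bbi + hb) / pa) + tba / aba


def relief_points(sv, rw, holds, rl, bs):
    """2*SV + RW + H - RL - BS (H assumed to be holds)."""
    return 2 * sv + rw + holds - rl - bs


def category_points(rank, n_teams=14, half=False):
    """Full categories pay 14, 13, ...; half categories pay 7, 6.5, ..."""
    points = n_teams - rank + 1
    return points / 2 if half else points


# Made-up stat lines for illustration.
print(round(adj_obp(h=150, bb=60, hbp=5, cs=5, pa=650), 3))
print(round(adj_ops_against(ha=180, bbi=50, hb=5, pa=800, tba=300, aba=740), 3))
print(relief_points(sv=30, rw=4, holds=2, rl=3, bs=5))
```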

Posted 4:34 a.m., February 23, 2004 (#15) - Sandman(e-mail)
  When is the primer fantasy draft going down? Any hopes of getting that spreadsheet formula, Nod Narb, if my league drafts after your league?

Posted 10:19 a.m., February 23, 2004 (#16) - Brian P
  Bard - the only thing that I noticed is that when I wanted to view the last team's roster in a 6-man league, it didn't show it to me. Otherwise, nice job.

Posted 12:11 p.m., February 23, 2004 (#17) - Greg Tamer(e-mail)
  Joe -- unfortunately, Yahoo! doesn't offer categories such as Quality Starts, Relief Points, and all those other fancy categories your league has. Nor the option to create one's own categories. But other than that, Yahoo! does offer an excellent fantasy baseball service.

Posted 5:05 p.m., February 23, 2004 (#18) - Score Bard
  Brian P.--Yeah, that's one of the bugs I need to fix. Often, one team won't show its roster, for some reason. It's on my list.

Posted 10:24 p.m., February 23, 2004 (#19) - Dackle
  What do you think is a better way to evaluate players: (a) adding their stats to an average team; or (b) comparing them against other players? If you choose option b, you're assuming a perfectly efficient league where the top (# of league teams * roster size) players have been snapped up. This isn't unreasonable. The information for b is easier to find too -- just dump the current stats into Excel and calculate standard deviations. In choice A you've got to go hunting for historical league data.
If you choose option B, then could you not divide the stat by the raw standard deviation? Most people would subtract the mean of the player pool, then divide by the SD to get a Z score. But why not assume that every player starts at .000, 0 HR, 0 RBI, 0 SB? Then you subtract 0 in each stat and divide by the SD. The only flaw I can see would be unfair bias for or against rate stats. Maybe there's no bias though.
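
On the question of subtracting the mean or not: for counting stats, subtracting each category's mean shifts every player's total by the same constant, so the rank order cannot change. The sketch below illustrates that with four made-up players and two categories; all names and numbers are hypothetical, and it says nothing about how rate stats should be handled.

```python
from statistics import mean, pstdev

# Made-up player pool for illustration.
players = {
    "A": {"HR": 40, "SB": 2},
    "B": {"HR": 15, "SB": 35},
    "C": {"HR": 25, "SB": 10},
    "D": {"HR": 10, "SB": 5},
}
categories = ["HR", "SB"]

avg = {c: mean([p[c] for p in players.values()]) for c in categories}
sd = {c: pstdev([p[c] for p in players.values()]) for c in categories}

# Usual Z-score total versus "start everyone at zero and divide by the SD".
z_total = {n: sum((p[c] - avg[c]) / sd[c] for c in categories) for n, p in players.items()}
raw_total = {n: sum(p[c] / sd[c] for c in categories) for n, p in players.items()}

rank_z = sorted(players, key=z_total.get, reverse=True)
rank_raw = sorted(players, key=raw_total.get, reverse=True)
print(rank_z, rank_raw)

# The gap between the two totals is the same for every player.
offsets = [raw_total[n] - z_total[n] for n in players]
print([round(o, 6) for o in offsets])
```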

Posted 12:49 a.m., February 24, 2004 (#20) - Greg Tamer(e-mail)
  just dump the current stats into Excel and calculate standard deviations

How are you calculating standard deviations across players when they have varying amounts of PA or IP?

Posted 4:14 a.m., February 24, 2004 (#21) - Dackle
  I use the raw numbers for the counting stats, and (bat. avg. - league bat. avg) * pa, or (league era - era) * ip for the rate stats. That's why I don't quite feel comfortable starting everyone at .000, 0 hr, 0 rbi, 0 sb, because the rate stats may deserve to be treated differently. Jim Cassandro's residual hits could work better here.
But think about it, the standard deviations for the top 300 players in a particular category (12 teams * 25 players) shouldn't be that much different than the standard deviations for imaginary/historical leagues. So why go to the extra work of adding to an imaginary team?

Posted 11:16 a.m., February 24, 2004 (#22) - J Cross
  Dackle, think about the stdev of saves on the individual v. team level. I don't think saves would be normally distributed on the individual level. I think method a is the way to go. I have 100 teams in a spreadsheet if anyone wants the numbers.

Posted 11:43 a.m., February 24, 2004 (#23) - tangotiger
  J, what kind of difference are we talking about here? In your post #2, are your coefficients based on the SD of the players, or is it based on putting them into teams?

Posted 12:26 p.m., February 24, 2004 (#24) - J Cross
  Well, the equations I posted above are actually from ESPN.com player rater which calculates the value of players compared to an average fantasy player. I looked at a bunch of players and a bunch of values and figured out that the values were based on those equations.

Then I went and took 100 espn 5x5 rototeams from 2003 and looked at the SD on the team level and figured the value of a stat in rotopoints should equal (.34*(n-1))/std, where n is the number of teams. This matches up very well with the coefficients from the espn player rater equations if you use an n=9, and I'm guessing they looked at a mix of 8 and 10 team leagues. I'll look at wins and saves and see how differently it turns out if you use std's on the team v. player level.

Posted 12:57 p.m., February 24, 2004 (#25) - J Cross
  okay, here's a comparison of Wins v. Saves values looked at in different ways:

1) scarcity. The average team had 88 wins and 98 saves. Just going by scarcity you might say that wins are worth 1.11 saves.

2) player std. The average pitcher (in the pool of FLB players) had 8.1 wins and 9.0 saves with a std of 5.9 wins and 13.5 saves. Going by the std's a win would be worth 2.28 saves.

3) team std. The standard dev. of team wins was 12.0 and team saves 38.0. Going by team standards a win was worth 3.17 saves.

I think method 3 is the right way to do it and it does come up with a different answer.
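
The three ratios above are just one number divided by another, using the rounded figures quoted in this post. A quick sketch (variable names are hypothetical):

```python
# Rounded figures quoted in post #25.
team_avg = {"W": 88, "S": 98}
player_sd = {"W": 5.9, "S": 13.5}
team_sd = {"W": 12.0, "S": 38.0}

# How many saves one win is "worth" under each method.
by_scarcity = team_avg["S"] / team_avg["W"]
by_player_sd = player_sd["S"] / player_sd["W"]
by_team_sd = team_sd["S"] / team_sd["W"]

print(f"scarcity: {by_scarcity:.2f}, player SD: {by_player_sd:.2f}, team SD: {by_team_sd:.2f}")
```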

Posted 1:04 p.m., February 24, 2004 (#26) - J Cross
  A little more info: Although saves looks very normally distributed on the team level I'm not so sure about wins. I'll let the statisticians decide:

w/in .5 stds (should be 38%): 38/100 in saves, 46/100 in wins

w/in 1 std (should be 68%): 69/100 in saves, 67/100 in wins

w/in 2 stds (should be 95%): 96/100 in saves, 95/100 in wins.

More teams are w/in .5 stds of the mean win number than you'd expect but I'm not sure how meaningful that is. Is there an excel function to test for "normalcy"?
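
One rough way to judge whether 46 out of 100 is meaningful is to compare it with the binomial spread you would expect if the data really were normal. The sketch below computes the expected fraction within k standard deviations and a z-score for each observed count. It treats the 100 teams as independent draws, which is an assumption (teams in the same league compete for the same wins), so read the result loosely.

```python
from math import sqrt
from statistics import NormalDist

N_TEAMS_SAMPLED = 100
# Counts reported in post #26.
observed = {
    0.5: {"saves": 38, "wins": 46},
    1.0: {"saves": 69, "wins": 67},
    2.0: {"saves": 96, "wins": 95},
}


def expected_within(k):
    """Fraction of a normal population within k SDs of the mean."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)


def binomial_z(count, p, n=N_TEAMS_SAMPLED):
    """How many binomial SDs the observed count sits from its expectation."""
    return (count - n * p) / sqrt(n * p * (1 - p))


for k, counts in observed.items():
    p = expected_within(k)
    for stat, count in counts.items():
        print(f"within {k} SD, {stat}: expected {100 * p:.1f}, "
              f"observed {count}, z = {binomial_z(count, p):+.2f}")
```

Under those assumptions the 46 for wins comes out about 1.6 binomial standard deviations above expectation, which by itself is not strong evidence against normality.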

Posted 1:16 p.m., February 24, 2004 (#27) - Nod Narb(e-mail)
  So does this say that ESPN's ratings are good?

One problem I've tried to deal with beyond figuring out who has the most value is the process of drafting based on these values. Some of the issues I am trying to improve on:

With each pick, a number of factors contribute to the decision of who to draft:

Who you drafted previously - this influences contributions to stat categories and position. For example, if you have drafted 3 HR hitters with your first three picks, your fourth pick might be better off being a guy with a lot of stolen bases, even if the higher HR guy has more overall fantasy value. Also, if you have drafted 3 OFs with your first 3 picks, drafting an OF with your 4th pick is probably inferior to drafting a good 1B, even if the OF has more overall fantasy value.

Who is left to pick from - if there are 3 good SS left and 3 good 2B left, who do you pick?

What has been picked by your opponents - ties in with above. If all of your opponents have a SS already, but 6 are looking for a 2B, you're better off drafting a 2B and waiting for a SS, even if the SS you would have picked has more fantasy value.

Picking pitchers vs. hitters - should a pick be spent on a hitter or a pitcher? Many factors play into this, including those listed above. Going strictly by fantasy value may not be productive in the whole scheme of things.

The real challenge is weighting all of these factors to make the optimal pick in each round. I have always tried to do this subjectively, with pretty good success. But there is no doubt that a system that takes all of this into account could be very beneficial. The VB macro I have created tries to do this, although I'm not sure I can test it to see if it's any better than what I would do without it.

I used Score Bard's draft simulator in conjunction with my applet to do a mock draft. I was becoming very frustrated when each and every round I was not drafting starting pitching. Halfway through the draft, I had only drafted 2 SPs. After the draft was completed, I found out why. There were not nearly as many SPs drafted as I had expected. Therefore, there were a lot of good SPs left in the latest rounds. Had I conducted the draft myself, I would have grabbed a lot more SPs earlier on, and thus lost out on all the offense I ended up getting without sacrificing my pitching. I'm sure at many points along the way I drafted someone without the highest fantasy value available.

Enough rambling...the moral here is that accurate fantasy values are good, but unless you have a good system of drafting that takes into account multiple variables, you may not draft successfully.

Thoughts?

Posted 1:29 p.m., February 24, 2004 (#28) - J Cross
  Well, I think the draft has to be a little subjective and surprising b/c that's some of the fun. My main FLB team (I now have two) I run with a friend of mine. We do the draft at his office where we have a couple computers going, a bunch of print outs and even a white board in our "war room." This year I'm going to hire a couple of guys to sit in the corner of the room, spit, and yell stuff about which players have good makeup.

Posted 1:31 p.m., February 24, 2004 (#29) - J Cross
  btw, Nod, I think ideally the draft software would tell you how far away from the mean you're projected to be in each category and either re-rank the players or just let you know to stress categories where you are close to average. The more stdev you are from the mean the less additional stdev are worth in terms of points.

Posted 1:35 p.m., February 24, 2004 (#30) - f_k_a Scoriano
  Fine, but how do you make this stuff work in the playoffs? :)

Posted 1:36 p.m., February 24, 2004 (#31) - Nod Narb (homepage)
  Well, I think the draft has to be a little subjective and surprising b/c that's some of the fun.

True, but I also have fun thinking about this kind of stuff (pathetic, yes). Anyway, you can take it or leave it, but I think at least considering being objective is useful, at least to some.

More teams are w/in .5 stds of the mean win number than you'd expect but I'm not sure how meaningful that is. Is there an excel function to test for "normalcy"?

See homepage

Posted 1:40 p.m., February 24, 2004 (#32) - Nod Narb
  btw, Nod, I think ideally the draft software would tell you how far away from the mean you're projected to be in each category and either re-rank the players

Exactly. That's part of it.

Fine, but how do you make this stuff work in the playoffs? :)

You ELIMINATE the playoffs (muahaha).

A few years ago my league's format was head-to-head, meaning that the last 4 weeks of the regular season were the fantasy playoffs. I had a good 26 game lead on 2nd place at the end of the regular season, but on the last day of the first round of the playoffs, my opponent's team hit something like .750/.850/2.000 with 9 HRs, etc. and I was knocked out. As commissioner, I made sure that next year we had no playoffs :)

Posted 1:58 p.m., February 24, 2004 (#33) - J Cross
  I agree that coming up with the system is fun. I'm just saying that I might still take the player I like better when it comes time to draft (assuming they're close).

I'm not sure how to apply that test. I suppose I could apply the Shapiro-Wilk test if I knew how to calculate ai.

Bard- I used your draft simulator yesterday but now I get the 'need Flash Player' message. I'm pretty sure I have 'Flash Player.' At least the Flash Player website tells me I have Flash Player. I reset the computer and everything.

Posted 2:01 p.m., February 24, 2004 (#34) - mathteamcoach
  Is there an excel function to test for "normalcy"?

There is a test, but I do not know if it exists in Excel.

Posted 2:10 p.m., February 24, 2004 (#35) - mathteamcoach
  Is there an excel function to test for "normalcy"?

Actually, if you have the z scores for the data set, you can determine whether the data is actually normal. For example, with wins, plot the z scores versus the wins. The closer the points on the plot are to a straight line, the stronger the evidence that the data were drawn from a normal population.

Posted 2:22 p.m., February 24, 2004 (#36) - J Cross
  z-scores? if the z-score is just (wins- mean wins)/ stdv then z-score v. wins has to be a straight line. I think I need more coaching. I'll try to figure out how to get z-scores.

Posted 2:29 p.m., February 24, 2004 (#37) - Nod Narb
  z-scores? if the z-score is just (wins- mean wins)/ stdv then z-score v. wins has to be a straight line. I think I need more coaching. I'll try to figure out how to get z-scores

not wins minus mean wins, just wins.

in excel, you can calculate z-scores as follows:

underneath the last value in the wins column, type in =average(Cx:Cy) and in the cell underneath that type =stdev(Cx:Cy) where C is the wins column and x and y are the first and last rows of data in your column.

now, in the first blank column to the right of your data, go to the topmost cell of data (usually the second row if you have data labels in the first row). type in =standardize(w,a,s)

in this case, w represents the cell in the wins column for the row you are in. a represents the cell that contains the average wins, and s represents the cell that contains the stdev of the wins column. You'll need to put a $ before the cell number in a and s and then fill down the column.

That's really hard to explain in writing.

Posted 2:31 p.m., February 24, 2004 (#38) - Nod Narb
  Heh. What a waste of time. Your calculation does the same thing. My bad.

Posted 2:33 p.m., February 24, 2004 (#39) - mathteamcoach
  if the z-score is just (wins- mean wins)/ stdv then z-score v. wins has to be a straight line.

You have the z-scores correct, but they do not have to be a straight line if the population they were drawn from is not normal. Am I wrong?

Posted 2:44 p.m., February 24, 2004 (#40) - J Cross
  well, (wins-mean wins)/std = wins/constant - constant so graphed against wins we'd just get a straight line, right?

The standardize function gives me the same #'s as (wins-mean wins)/std
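
As J Cross says, each team's own z-score plotted against its wins is a straight line by construction. The usual normal probability plot is slightly different, and may be what was meant: sort the data, then plot it against the z-scores a normal sample of that size would be expected to have at each rank. The sketch below computes the correlation of that plot; values near 1 are consistent with normality. The (i - 0.5)/n plotting position is one common convention, and the win totals are made up.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Expected z-scores for n sorted observations: inverse normal CDF
    evaluated at the plotting positions (i - 0.5) / n."""
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def probability_plot_r(data):
    """Correlation between the sorted data and its expected normal scores."""
    ordered = sorted(data)
    return pearson(normal_scores(len(ordered)), ordered)


# Hypothetical team win totals, invented for this example.
wins = [70, 75, 79, 82, 84, 86, 88, 88, 90, 92, 94, 97, 101, 106]
print(round(probability_plot_r(wins), 3))
```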

Posted 3:12 p.m., February 24, 2004 (#41) - J Cross
  something else to look at:

correlation to team rotopoints:

team w: .503
team s: .544
team K: .739
team ERA: -.482
team WHIP: -.397

team run: .728
team hr: .579
team rbi: .678
team sb: .429
team avg: .461

so, should we be trying to win the K and R categories?

Posted 3:31 p.m., February 24, 2004 (#42) - Nod Narb
  J. Cross -

Can you run the correlation between team fantasy value (using your equations) and team rotopoints?

Posted 3:44 p.m., February 24, 2004 (#43) - J Cross
  Nod, I can but there's a problem using those equations on the team level. For instance the team with the most wins of the 100 gets 7 win points above average using those equations but a team can't really get more than 4.5 win points. The problem is that this team is 2.3 standard deviations out in wins but the second standard deviation is worth less than the first in rotopoints. With players, no one player shifts any one category more than a standard deviation so while there are diminishing returns it isn't THAT big a factor. I guess I should come up with a multiplier to adjust for this... not sure how to do it off the top of my head.

Posted 4:03 p.m., February 24, 2004 (#44) - J Cross
  okay, I think if I weigh the value of each difference from average by 1 -.2*(stds from mean) it should work out. I could do this with players too to be more exact.
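
A sketch comparing three ways of turning "standard deviations above average" into points for a 10 team league: the straight-line value from the equations, the same value multiplied by 1 - .2*(stds from mean), and the value implied directly by the normal curve. Reading the multiplier as applying to the whole linear value is an assumption about what is meant here, and the function names are hypothetical.

```python
from statistics import NormalDist

N_TEAMS = 10


def linear_points(z):
    """Straight-line value: every SD is worth .34 * (n - 1) points."""
    return z * 0.34 * (N_TEAMS - 1)


def damped_points(z):
    """Linear value weighted by 1 - .2 * (stds from mean)."""
    return linear_points(z) * (1 - 0.2 * abs(z))


def normal_points(z):
    """Points above average implied by the normal CDF."""
    return (NormalDist().cdf(z) - 0.5) * (N_TEAMS - 1)


for z in (0.5, 1.0, 2.3):
    print(f"z={z}: linear {linear_points(z):.2f}, "
          f"damped {damped_points(z):.2f}, normal {normal_points(z):.2f}")
```

For the 2.3 SD team the straight line gives about 7 points, in line with the figure quoted in post #43, while the other two stay under the 4.5 point ceiling. Under this reading the weighted version tops out at 2.5 SD and would decline beyond that.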

Posted 5:05 p.m., February 24, 2004 (#45) - tangotiger
  J,

Can you print out the SD (and for the rate stats, the number of AB or IP) for your 100 team sample?

***

As for the higher correlation, it might be that the high K pitchers also have better than avg ERA and wins, etc.

Posted 5:28 p.m., February 24, 2004 (#46) - J Cross
  I don't have total IP or total AB numbers b/c I only have final rosters for these teams. But, I know that the average pitcher on these teams has 7.21 K/9 and that these teams finished with an average of 1098 K's so that gives me a best guess of 1370.8 innings/team. I'll do the same thing to figure out AB's and then post all of the numbers.

Posted 6:06 p.m., February 24, 2004 (#47) - J Cross
  ok,

Stat (team avg, team std)

Wins (88.39, 12.01)
Saves (98.19, 38.09)
K (1098.38, 149.30)
ERA (3.85, .305)
*SAVED ER (0.00, 46.58)
WHIP (1.28, .050)
**SAVED W+H (0.00, 7.67)

Runs (1118.9, 76.7)
HR (286.1, 30.17)
RBI (1073.5, 82.1)
SB (131.6, 28.81)
AVG (.282, .0067)
***EXTRA HITS (0.00, 47.58)

*SAVED ER calculated as (3.85-ERA)*1372.9
**SAVED W+H = (1.28 - WHIP)*1372.9
***EXTRA HITS = (AVG-.282)*7091

Posted 6:10 p.m., February 24, 2004 (#48) - J Cross
  Oops. those Saved ER and Saved WHIP formulas are wrong. should be:

saved ER = (3.85-ERA)*(1372.9/9)
and
saved WH =(1.28-WHIP)*(1372.9/9)

but now that I think about it I shouldn't have divided by 9 on WHIP so using

saved WH = (1.28-WHIP)*1372.9
we get
average = 0.00
std = 69.07

that makes more sense.

Posted 6:24 p.m., February 24, 2004 (#49) - J Cross
  So, finally, my empirical coefficients (assuming an average of 9 teams in a league) compared to espn player rater's coefficients.

stat (player rater, empirical)

W (.270, .226) !!! The only notable difference
S (.0695, .0714)
K (.0203, .0182)
ERA (.0065*IP, .0065*IP)
WHIP (.038*IP, .039*IP)

R (.036, .036)
HR (.088, .090)
RBI (.033, .033)
SB (.088, .094)
AVG (.061*AB, .057*AB)

So, outside of a disparity when it comes to valuing wins it looks like my empirical results back up the espn player rater equations.
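
The empirical column can be reproduced, to within rounding, from the team standard deviations in posts #47 and #48 and the (.34*(n-1))/std rule from post #24 with n=9. A short sketch; the dictionary and function names are hypothetical, and the ERA coefficient is divided by 9 because saved ER was computed per nine innings.

```python
# Team-level standard deviations from posts #47 and #48.
TEAM_SD = {
    "W": 12.01, "S": 38.09, "K": 149.30,
    "R": 76.7, "HR": 30.17, "RBI": 82.1, "SB": 28.81,
}
SAVED_ER_SD = 46.58    # SD of (3.85 - ERA) * IP / 9
SAVED_WH_SD = 69.07    # SD of (1.28 - WHIP) * IP
EXTRA_HITS_SD = 47.58  # SD of (AVG - .282) * AB


def coefficient(sd, n_teams=9):
    """Roto points per unit of a stat: .34 * (n - 1) / SD."""
    return 0.34 * (n_teams - 1) / sd


empirical = {stat: coefficient(sd) for stat, sd in TEAM_SD.items()}
empirical["ERA per IP"] = coefficient(SAVED_ER_SD) / 9
empirical["WHIP per IP"] = coefficient(SAVED_WH_SD)
empirical["AVG per AB"] = coefficient(EXTRA_HITS_SD)

for stat, value in empirical.items():
    print(f"{stat}: {value:.4f}")
```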

Posted 7:34 p.m., February 24, 2004 (#50) - J Cross
  Nod, I finally have an answer to your question.

Correlation btw calculated team fantasy value and actual rotopoints:

R = .9685

Posted 1:17 a.m., February 25, 2004 (#51) - Sandman
  I'm glad I'm not in the same leagues as you guys.

Posted 9:34 a.m., February 25, 2004 (#52) - tangotiger
  Using J. Cross' numbers from post #49 (which are based on the team results' SD), force the HR to 1, and his hitters come in at:

hitter
= HR
+ 1.05 * SB
+ 0.39 * Runs
+ 0.37 * RBI
+ 0.63 * (H-AB*.28)
+ some constant that will be the same for all players
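
Forcing HR to 1 amounts to dividing the HR standard deviation by each other category's standard deviation. A sketch that rebuilds the weights above from the team SDs in post #47; the names are hypothetical, and the dropped constant is left out since it is the same for every player.

```python
# Team-level SDs from post #47; "extra_hits" is the SD of (AVG - .282) * AB.
TEAM_SD = {"HR": 30.17, "SB": 28.81, "R": 76.7, "RBI": 82.1, "extra_hits": 47.58}

# Each weight is (SD of HR) / (SD of the stat), so HR itself comes out as 1.
weights = {stat: TEAM_SD["HR"] / sd for stat, sd in TEAM_SD.items()}


def hitter_score(hr, sb, r, rbi, h, ab, base_avg=0.28):
    """Relative hitter value in HR-equivalents, without the constant term."""
    return (
        weights["HR"] * hr
        + weights["SB"] * sb
        + weights["R"] * r
        + weights["RBI"] * rbi
        + weights["extra_hits"] * (h - ab * base_avg)
    )


print({stat: round(w, 2) for stat, w in weights.items()})
# Made-up stat line for illustration.
print(round(hitter_score(hr=30, sb=10, r=100, rbi=100, h=180, ab=600), 1))
```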

If I instead look at the SD of individual players (from 1994-2002 as a group, with at least 300 PA), I get:

hitter
= HR
+ 1.00 * SB
+ 0.45 * Runs
+ 0.39 * RBI
+ 0.83 * (H-AB*.28)
+ some constant that will be the same for all players

If I make it at least 500 PA, I get:
hitter
= HR
+ 0.96 * SB
+ 0.51 * Runs
+ 0.41 * RBI
+ 0.85 * (H-AB*.28)
+ some constant that will be the same for all players

***

We see that the biggest difference comes in the handling of the batting average.

***

I suppose what you really want to do is take each team, and replace 1/14th of the team totals with runs scored from 60 to 130, and recompute the standings. Then, come up with the best-fit function. Hopefully, you'll get something like a straight-line, so you don't have to worry about it too much. And, you repeat this for all the categories.

Anyone want to try?
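
One way the experiment might be set up for a single category, as a rough sketch: swap 1/14th of a team's runs for a player total between 60 and 130, recompute that team's standing in runs, and look at the resulting curve. The league totals below are invented, the function names are hypothetical, and a real attempt would repeat this over every team, league, and category before fitting a line.

```python
def category_points(totals, team):
    """Roto points for one team in one category:
    1 point, plus 1 per team beaten and 0.5 per tie."""
    mine = totals[team]
    return 1 + sum(
        1.0 if mine > other else 0.5 if mine == other else 0.0
        for name, other in totals.items()
        if name != team
    )


def runs_curve(totals, team, roster_share=1 / 14, run_range=range(60, 131, 10)):
    """Replace one roster slot's share of a team's runs with each value in
    run_range and record the team's points in the category."""
    curve = []
    for runs in run_range:
        swapped = dict(totals)
        swapped[team] = totals[team] * (1 - roster_share) + runs
        curve.append((runs, category_points(swapped, team)))
    return curve


# Hypothetical 10 team league, run totals made up for illustration.
league = {
    f"team{i}": total
    for i, total in enumerate([1000, 1030, 1060, 1090, 1110, 1120, 1140, 1170, 1200, 1240])
}
curve = runs_curve(league, "team4")
print(curve)
```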

Posted 11:22 a.m., February 25, 2004 (#53) - Score Bard (homepage)
  J. Cross, I hadn't changed anything, so I don't know why it stopped working for you. I found something that might be a cause, and I fixed it, but I can't know for sure.

If it still doesn't work, you can go directly to the page without passing through the flash detection page, by using the Homepage link in this post.

Posted 1:18 p.m., February 25, 2004 (#54) - Matt
  My league has 12 teams -- J. Cross, how do you think your equations would change because of that? Or, how did you get your data on 100 teams, so maybe I can do it myself for leagues with 12 teams? Does your data have other stats as well (my league is more like a 10x10 system, apparently)?

Posted 1:51 p.m., February 25, 2004 (#55) - Matt
  Disregard the part of my last post about having 12 teams. I just now found your post where you addressed that issue. Still wondering about other stats, though.

Posted 1:53 p.m., February 25, 2004 (#56) - tangotiger
  Can you tell me how many hitters are drafted, and from which league?

As you can see from my run in post #52, the SD hardly change, whether you select all players with at least 300 PA or 500 PA.

Posted 1:54 p.m., February 25, 2004 (#57) - J Cross
  Score Bard, I think it's just my wacky work computer. Sorry about that.

Matt, I think the averages would be too high for a twelve team league with talent more dispersed. I don't think the standard deviations (coefficients) would change all that much going from 10 to 12 but I don't know how they would change. I got the data by viewing 2003 espn teams and copying and pasting into an excel spreadsheet.

Right now I'm working on putting all this info into a spreadsheet that adjusts for the team you have and how much extra #'s can help you in each category. It looks like it makes a difference. It ranks Ichiro #7 overall but if you have Beltran and Soriano (two basestealers) on your team already he falls to #23 overall. I'm hoping to have this ready for my draft next week and then, after working out the kinks, I'll send it out to anyone who's interested.

Posted 1:55 p.m., February 25, 2004 (#58) - J Cross
  oh, yeah, tango's post answers how they'd change.

Posted 2:04 p.m., February 25, 2004 (#59) - Strong Bad
  J. Cross, I'm assuming your league data is from mixed leagues?

Posted 2:27 p.m., February 25, 2004 (#60) - J Cross
  Yep, maybe after I have this draft spreadsheet worked out I'll get data for AL only/NL only and other scoring systems. Needless to say, I'm not getting much of the work I'm paid to do done.

Posted 2:36 p.m., February 25, 2004 (#61) - Matt
  Actually, I guess what I need to compare is the roster requirements. Do all of the ESPN teams have the same roster requirements? I see these:
c,1b,2b,3b,ss, 5xOF, 2b/ss, 1b/3b, Util, Bench
9xP, 3xBench

Posted 2:56 p.m., February 25, 2004 (#62) - J Cross
  yeah, I think those are for all espn leagues.

Posted 5:38 p.m., February 25, 2004 (#63) - Depot
  Score Bard,

That's very cool. I'm confused, however, as to the point...other than being lots of fun. How do the other teams draft?

Posted 8:39 p.m., February 25, 2004 (#64) - Score Bard (homepage)
  Depot, nothing I do has a point. And that's how I like it. People in the sabermetric community are far too logical and pointful, and someone needs to be pointless so we can have some balance around here.

Actually, I made it so I could try out various draft strategies and see which ones I liked best, in preparation for my own draft.

The other teams start with a common rank list that I've merged from various sources. Then each team randomly adjusts the rankings of each player with varying degrees of flakiness.

One team (Partisans) picks a random favorite team and gives players from that team a boost in their rankings. One team (Arms) prefers pitchers over hitters, another (Sluggers) prefers hitters over pitchers. A couple teams give preference to the positions where top talent is scarce: 2b, 3b, ss, c.

Then each team's list is resorted, and it will draft from that reordered list. The only exception is that it won't take three players from the same position, until it has filled the other positions.
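
For readers curious how drafters like these could be put together, here is a loose sketch of the steps described above. It is not Score Bard's code: the function names, the Gaussian noise used for "flakiness", the size of the boosts, and the player pool are all assumptions made for illustration.

```python
import random


def team_rank_list(common_ranks, flakiness, boost=None, rng=random):
    """common_ranks: list of (name, position), best first.
    flakiness: SD of the random noise added to each player's rank (assumed Gaussian).
    boost: optional function (name, position) -> ranks to move the player up."""
    scored = []
    for rank, (name, pos) in enumerate(common_ranks, start=1):
        adjusted = rank + rng.gauss(0, flakiness)
        if boost:
            adjusted -= boost(name, pos)
        scored.append((adjusted, name, pos))
    scored.sort()
    return [(name, pos) for _, name, pos in scored]


def next_pick(rank_list, taken, roster, positions):
    """Best available player, except that a third player at a position
    is skipped until every position has at least one player."""
    counts = {p: sum(1 for _, pos in roster if pos == p) for p in positions}
    all_filled = all(counts[p] > 0 for p in positions)
    for name, pos in rank_list:
        if name in taken:
            continue
        if counts.get(pos, 0) >= 2 and not all_filled:
            continue
        return name, pos
    return None


# Made-up pool; a team that prefers scarce positions gets a 2-rank boost there.
pool = [("a", "OF"), ("b", "OF"), ("c", "OF"), ("d", "SS"), ("e", "C")]
scarce = lambda name, pos: 2 if pos in ("2B", "3B", "SS", "C") else 0
print(team_rank_list(pool, flakiness=1.5, boost=scarce, rng=random.Random(2004)))
```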

Posted 8:58 p.m., February 25, 2004 (#65) - Nod Narb(e-mail)
  Well, I just had my first draft of the season. This was a public 5x5 yahoo league, just to test out my draft strategy. I was pretty pleased with the draft, although midway through I noticed that I mistyped one number leading to the overvaluation of SPs, so I drafted a few SPs before I should have.

12 teams, 21 rounds, total of 252 players
Offensive stats: R, HR, RBI, SB, AVG
Pitching stats: W, SV, K, ERA, WHIP

I had the 7th pick (serpentine draft)

Results (in order):
Bonds
Pudge
Randy
Giambi
Dotel
J. Santana
Rollins
Oswalt
Walker
Beckett
Webb
Green
Contreras
Durham
M. Batista
Rhodes
Spiezio
Floyd
Bernie
Klesko
Koch

A few notes: personally, I never would have drafted Pudge, Rollins, or Spiezio as high as I did, but they were the best available at their position by a large margin at the time I selected them. I used Marcel's predictions in my draft applet. I was totally shocked to get Floyd, Bernie, and Klesko at the end of the draft. People had drafted so many terrible outfielders before them. All in all, I'm pleased, but I have had better drafts without using my applet.

Posted 9:02 p.m., February 25, 2004 (#66) - Nod Narb
  Walker is larry

Posted 11:52 a.m., February 26, 2004 (#67) - Matt
  The more I think about this thread, the more I realize how much I'm not understanding here. But I think I'm getting it.

One confusion I have is how you're handling Batting Average. That .28 figure that you're using in post #52, does it matter what you use there? That is, since I have more teams in the league, should I change that number? Because I understand that the constants for the other stats come out in the wash. The same question would go for SLG and OBP and others.

The other questions are just requests. Could anyone figure out SDs (like in post #52 or #48) for other stats? Hits, walks, and total bases for hitters, and innings pitched, CG, walks, holds (?!), and K/9 for pitchers? And will you have data or Marcels for CG and holds?

Also, do you have data and Marcels for

Posted 12:04 p.m., February 26, 2004 (#68) - tangotiger
  I think you are understanding it fine.

I doubt the .28 will change much, but that's easy enough to check with a little work.

I'll have the Marcels, but remember, Marcel is the most basic estimate that all other forecasters should improve upon. If you have access to ZiPS, PECOTA, DMB, Ken Warren, etc, use those.

Posted 12:58 p.m., February 26, 2004 (#69) - Matt
  Thanks. I know the Marcels are the dummy prediction, but I don't see these fringe stats like CG and Holds (or Saves, for that matter) in the Zips. I'm using the Zips for everything else, though. I could probably get them from DMB but I haven't decided if I want to buy that disk yet. :-) Although I now see that RotoTimes has them so I'll use those for now. But if you get around to finding the SDs for other stats that would be great (I guess from individual player stats, should be adequate). I'm pretty sure I wouldn't have any way to do that.

Posted 12:17 p.m., February 27, 2004 (#70) - Matt
  I found that I could use the Lahman database to try to get the numbers you posted in #52, tangotiger. And I'm spot-on with the numbers for PA>=300. But I'm nowhere close for PA>=500. Here's what I get, following your lead of setting HR to 1:
HR=1
SB=0.93
R=0.61
RBI=0.44

All I'm doing is finding the standard deviation for these stats, for all players, 1994-2002, with greater than the listed number of PA. Then taking the SD for HR and dividing it by the SD of the other stats. Is that right? How come we have this discrepancy?

Posted 12:30 p.m., February 27, 2004 (#71) - Matt
  This Lahman database is fun. I can probably run my own queries to answer my questions, if I make sure I understand what's happening here.

Posted 8:42 p.m., February 27, 2004 (#72) - jto
  I am doing a study comparing projection systems using 2001-2003 data. If anyone has any of these past projection years data stored on your computer or in book form please let me know. I could really use your help. The data could be from any of the fantasy baseball websites or your own personal projections. I already have DMB, Shandler, and Ken Warren's projections. Thanks for the help...

Posted 12:19 a.m., February 28, 2004 (#73) - J Cross
  jto, Nate Silver from primer did a comparison of PECOTA, Zips, BBHQ (Shandler), Diamond Mind, Rototimes, Rotowire and Warren so he must have all the data.

Nod, I like your pitching staff. I'm going to be going after strikeout pitchers almost exclusively, I think.

Posted 3:06 p.m., February 28, 2004 (#74) - Nod Narb
  MLB has a new fantasy service this year called Fantasy Ticket. It looks really damn cool.

http://mlb.mlb.com/NASApp/mlb/mlb/subscriptions/fantasy_ticket.jsp

Posted 5:31 p.m., February 28, 2004 (#75) - tangotiger
  Selecting the pitchers from 94-03 with at least 20 GS or 5 SV (that's an average of 160 pitchers per MLB season), here are the SD's for the counting categories:

W: 5
L: 4
G: 16
GS: 13
CG: 2
SHO: 1
SV: 12

I'm not sure if my thresholds are proper. If you would like different ones, please let me know.

Posted 6:28 p.m., February 29, 2004 (#76) - Bob
  Are the Marcels projections posted somewhere? I'd like to use them to do some 2004 seasons using my sim, ala Diamond Mind's projections each year.
Has nothing to do with fantasy discussion though.
Thanks.

Posted 10:31 a.m., March 1, 2004 (#77) - tangotiger
  I'll try to target devoting an hour or two on Fri Mar 5 to do the Marcels.

I'm going to assume a playing time estimate for hitters of:
.5 * 2003 PA + .1 * 2002 PA + 200

That means a guy with 500 PAs in 2003 and 2002 is expected to get 500 PA in 2004.
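The playing-time estimate above is simple enough to sketch in a few lines of Python. The function name `projected_pa` is made up for illustration; the weights (.5, .1) and the 200 PA constant are the ones stated in the post, and it is only meant for hitters.

```python
def projected_pa(pa_2003: float, pa_2002: float) -> float:
    """Hitter playing-time estimate: .5 * 2003 PA + .1 * 2002 PA + 200."""
    return 0.5 * pa_2003 + 0.1 * pa_2002 + 200


# The example from the post: 500 PA in both seasons projects to 500 PA.
print(projected_pa(500, 500))

# A player with no MLB playing time in either season falls back to the constant.
print(projected_pa(0, 0))
```

The constant term pulls everyone toward a part-time role, so a heavy-use regular projects a little below his 2003 total and a bench player a little above his.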

I haven't yet worked out what to do for starters, middle guys, and relievers.

Posted 12:29 p.m., March 1, 2004 (#78) - jto
  J. Cross...thanks for the Nate Silver reference...but I am looking to do a 3 year study.. Silver only looked at last year...

Posted 5:41 p.m., March 1, 2004 (#79) - Big Series
  In terms of drafting to fill needs vs. drafting the best available player:

There is the economic theory of comparative advantage (right?) that says you should produce as much as you can of what you can produce most effectively.

I think this applies to drafts - even if you already have A Rod, and Tejada is the best available player on the board you should still draft him, because he could theoretically get you more in a trade than you could get by drafting the next best non-SS.

Same thing applies to stats, i.e., if you already have a lot of steals, and Carl Crawford is the best available player left on the board, take him and trade him later on.

Do people agree with this line of thinking? I guess it really depends on your confidence in your trading ability...

Posted 6:00 p.m., March 1, 2004 (#80) - J Cross
  jto, nice, a 3-yr study would tell us a lot more. btw, does anyone know whether the difference in predicting pitchers for 2003 btw PECOTA and others was significant or could it have just been luck?

Big Series, and it relies on other players thinking Crawford was the best available at the time. And, if you're in a position where you HAVE to trade a player you'll probably lose a little value in the trade. I think in most cases there are several players of similar ability left remaining on the board. If one player really stands out I'd take them even if they don't fill a need.

Posted 7:42 p.m., March 1, 2004 (#81) - Kyle S
  Comparative advantage doesn't have much relevance to fantasy baseball. Basically, it hypothesizes that if trade is free, each good will be produced by the actor and in the location that affords the lowest relative cost. For example, if the US can produce both guns and butter more cheaply than can Canada, which produces butter more cheaply than guns, the US should produce only guns and trade those guns for Canadian butter. The reason for this is opportunity cost: by producing 1 butter unit, the US might forgo 3 units of guns, which themselves can "buy" butter from Canada for less. Obviously, this isn't always true - some countries can't produce ANYTHING well, and they're pretty much screwed in international trade.

However, you can't "trade" for fantasy statistics this way - you can trade for players, whose statistics become part of your own production. If there was a commodities market for RBI, SB, etc, then maybe this would be true - kind of a cool idea.

The economics lesson that applies here is marginal analysis. If you already have A-Rod as your shortstop, the marginal benefit from Tejada is much less than if A-Rod were not already on your team; instead, you might draft someone expected to perform similarly to Tejada but playing a premium position like 2b, C, or (in fantasy) RP.

Posted 8:50 p.m., March 1, 2004 (#82) - Nod Narb (homepage)
  Same thing applies to stats, i.e., if you already have a lot of steals, and Carl Crawford is the best available player left on the board, take him and trade him later on.

Do people agree with this line of thinking? I guess it really depends on your confidence in your trading ability...

I tend to agree with J. Cross and Kyle S. If Crawford was the best player available by 100%, I would definitely take him. But if he's only 5% better than a player at a position I don't already have, I'm taking the other guy. I've seen guys try to stock up on catchers at the beginning of drafts, but it never works out, because their need to fill the other positions is greater than others' desire to add another catcher. So they get antsy and trade their catchers for lesser value.

Say you took tejada with the 30th pick. if you want a better player in trade, you can't trade him for one of the first 29 players drafted, because obviously their managers chose to pass on tejada for the player they picked. if you think someone after the 30th pick is better than tejada, you're better off drafting him at 30 instead of hoping you get them in trade.

btw, does anyone know whether the difference in predicting pitchers for 2003 btw PECOTA and others was significant or could it have just been luck?

Well, not many people agree with me on this one, but IMO correlations here are pretty much useless. The best stat they report is RMSE - the average difference between projected stats and actual stats. Check out the homepage link. Because the error bars (RMSE) overlap in all cases, this indicates no statistically significant differences. This is not to say that PECOTA isn't better, it's just not significantly better.

The more statistically savvy can correct me if I'm wrong about this.

Posted 11:16 a.m., March 2, 2004 (#83) - J Cross
  Nod, I'm not sure I know what's going on in that graph. That looks like mean ERA and mean OPS with the error bars representing RMSE. Is that right? If so, it looks like pecota has the same size error bars as every other system but that's not what I remember from their report.

Posted 11:32 a.m., March 2, 2004 (#84) - Nod Narb
  That is what's going on. The error bars represent RMSE. They only look the same because the range of RMSE between groups is quite small (.85 to .98 for OPS, 1.11 to 1.24 for ERA) relative to the scale of the graph. The key point is that they overlap, which, unless I'm mistaken (and I tend to be mistaken quite a bit), means that they aren't significantly different. Maybe I can put together two separate graphs so there's better resolution.

Posted 11:44 a.m., March 2, 2004 (#85) - tangotiger
  The best stat they report is RMSE - the average difference between projected stats and actual stats.

The implication of this statement is that if you had a league OPS in 1984,85,86 of .730,.730,.730, you would expect .730 for the league in 1987. We know that 1987 was not like the others. Say it was .770.

If you had Tim Raines at .830,.830,.830 in those 3 years, you might project him at .820 or something in 1987. If he ended up being .860, you'd think you were off by .040. But, everyone in the league would be off by that.

The best way to do is to compare the player to the league (whether by differential, division, or ratio).

Posted 12:26 p.m., March 2, 2004 (#86) - Nod Narb
  The implication of this statement is that if you had a league OPS in 1984,85,86 of .730,.730,.730, you would expect .730 for the league in 1987. We know that 1987 was not like the others. Say it was .770.

If you had Tim Raines at .830,.830,.830 in those 3 years, you might project him at .820 or something in 1987. If he ended up being .860, you'd think you were off by .040. But, everyone in the league would be off by that.

But if everyone's league average predictions were off by .04, then RMSE would still show whose predictions were closest (it would just be larger than it should be- but for everyone). And if you happen to do a better job at predicting the league average, the RMSE will go down even more. You should be rewarded for that.

Posted 1:28 p.m., March 2, 2004 (#87) - tangotiger
  No, because how can you predict 1987?

The fair thing to do is to say: "this forecaster predicted the overall lg avg the best". But, after that, within that forecaster's universe of players, it should be normalized. There are two sets of predictions here.

In fact, every forecaster worth his salt would do the forecasts in this two-step process. The first thing he tries to figure out is how is the whole league average going to be affected (say like when the strike zone was changed). After that, he estimates the player relative to the league average.

Posted 1:41 p.m., March 2, 2004 (#88) - J Cross
  I agree that some credit but not too much should be given for picking the league level.

Using correlation, as I understand it, will favor the system that had players in the right rank order regardless of league level or amount of spread. Now, for the purposes of snake drafting FLB a rank order is basically what you need, but really we should be judging the RMSE of each system's OPS/league OPS and ERA/league ERA values.

Posted 2:00 p.m., March 2, 2004 (#89) - tangotiger
  J, agreed. Even the correlation is no good, because that fits the slope and the intercept. In my view, the slope should be fixed at 1, and you fit the intercept to the league average.

Afterwards, you can compute your RMSE.

Essentially, only do RMSE on OPS/lgOPS or OPS-lgOPS.
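To make the raw-versus-normalized distinction concrete, here is a rough Python sketch using the differential form (OPS-lgOPS). The function names and the idea of passing a single league level per side are assumptions for illustration; the numbers are the Tim Raines figures from post #85 (.820 projected against an expected .730 league, .860 actual against a .770 league).

```python
import math


def rmse(errors):
    """Root mean squared error of a list of per-player errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_raw(projected, actual):
    """RMSE on raw OPS: projected minus actual, no league adjustment."""
    return rmse([p - a for p, a in zip(projected, actual)])


def rmse_vs_league(projected, actual, lg_projected, lg_actual):
    """RMSE on OPS - lgOPS: each side is measured against its own league level."""
    return rmse([(p - lg_projected) - (a - lg_actual)
                 for p, a in zip(projected, actual)])


projected, actual = [0.820], [0.860]
print(rmse_raw(projected, actual))                       # roughly .040
print(rmse_vs_league(projected, actual, 0.730, 0.770))   # roughly zero
```

On raw OPS the forecast looks .040 off; relative to the league it is essentially dead on, which is the point being argued. The ratio form (OPS/lgOPS) would work the same way with division in place of subtraction.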

Posted 2:02 p.m., March 2, 2004 (#90) - Nod Narb
  No, because how can you predict 1987?

You can't (at least with a high degree of confidence). But at least this puts all projection systems on equal footing to start. As long as the starting point is equal for all systems, RMSE will be unbiased.

I think the discrepancy here is as follows:

You're arguing that a forecasting system should be graded based on the RMSE of normalized stats - that is, how far a player's predicted normalized stats are from his true normalized stats.

I'm arguing that a forecasting system should be judged based on the RMSE of raw stats - that is, how far a player's predicted raw stats are from his true raw stats.

Let's say we use your criterion. If a system does a good job predicting normalized stats, that tells us NOTHING about how close the raw stats are to matching reality.

Using my criterion, if a system does a good job predicting raw stats, then by definition it does a good job predicting normalized stats, because normalized stats (whether projected or real) are based on the raw stats from which they are derived.

My criterion sets a higher standard for success. Successful prediction of normalized stats does not guarantee successful prediction of raw stats.
However, successful prediction of raw stats DOES guarantee successful prediction of normalized stats. A system that is good at predicting both is better than a system that is only good at predicting normalized stats.

Posted 2:13 p.m., March 2, 2004 (#91) - Nod Narb
  I agree that for fantasy purposes normalized stats are perfectly acceptable, as long as the rank order is reasonably accurate. Maybe we're not agreeing because I am thinking outside the realm of fantasy.

Think about it. You open up a book from 2002 and see that Bobby Abreu was predicted to hit .260/.360/.470. You say "that prediction was terrible! It was way off!"

However, in a system relying on normalized stats, that projection could have worked out to a perfect *OPS+ prediction of 155 (depending on the league average).

Personally, I would rather see the raw stats be more accurate than the normalized stats. And I can't see how accurate raw stats would reduce the accuracy of the normalized stats by any meaningful margin.

Posted 3:07 p.m., March 2, 2004 (#92) - tangotiger
  I prefer the normalized stats, esp for pitchers who change leagues. If his ERA is 150+, does it matter what his NL ERA forecast is and what his AL ERA forecast is?

What a player's forecast is actually saying is: "given that I think the level of competition would produce a runs per game of 4.52, this is what I think his ERA is going to be". If you take that pitcher, and throw him into a different league, you will still expect that player, relatively speaking, to produce at around the same rate.

Anyway, I think we both said our points 3 different ways, so we ain't getting nowhere.

Posted 12:38 a.m., March 3, 2004 (#93) - jto
  Let's get outside the fantasy realm for a minute... what do you guys think would be the best method (to normalize or not, use correlations or RMSE, only use OPS, etc.) for a Major League front office that is comparing these projection systems? I would love to hear everyone's input because this is the topic I am doing my research paper on and I need some help.

Posted 2:48 a.m., March 3, 2004 (#94) - Michael
  The major league front offices that are currently debating if it is better to compare projection systems using rmse or correlations are, to say the least, not in the majority.

And most front offices would want to invest in what is likely to bring their particular team more money. And that may involve some non-performance things (got a good marketing niche [japan, latino, record chaser]; home town hero; big name/used to be a star; etc.) as well as performance things (winning games brings more fans, playoffs brings a lot of revenue). IMHO on the performance side you absolutely want to have not only a rank order of players (correlation) but also an accuracy on how much better certain players were than other players and than replacement players (rmse). If player A is 1 run better than player B and player C is 15 runs better than player D but there are no players in between A and B nor in between C and D and a team is able to pick either A and D or B and C you want to have the accurate predictions that correctly get the magnitude of differences between these players.

Posted 10:29 a.m., March 3, 2004 (#95) - J Cross
  jto, one thing I'd really like to see is which players/kinds of players have the widest range of predictions and which systems are the best for those players (if there's enough data to get anything but noise there).

Posted 10:59 a.m., March 3, 2004 (#96) - tangotiger
  I made a post in the PECOTA thread at Clutch, but since that takes forever to load, I'll repost here, and continue the discussion here.

***

Why project "Ice" Williams to have a 465 OPS? Better make it 680, since the only way he manages 130 PAs is by putting up an OPS over 650.

This is a great point that is not said enough. The forecast of the performance is also dependent on the number of PAs. There was an interesting article on the Primer home page that talked about this kind of thing.

I also put out something, somewhere at Primate Studies, that shows how a player's performance is tied-in to his PAs.

What I would do, if I were so inclined, would be to forecast a different set of OPS, based on the number of PAs he'd get. For example, if my best guess is that, in full-time play, Soriano's OBP is going to be .360 (with an error range), then I would make the following guess:

PA, OBP
700,.360
600,.355
500,.350
400,.345
300,.340
200,.335
100,.330

(Numbers for illustration only.)

Why would I do that? Because, as a group, this pattern exists. Why does it exist? One might be injuries, that a guy might be playing through injuries, and then it catches up to him. Another might be that when you start off slow, a manager might be tempted to start benching you (see World Series), therefore, not giving you enough PAs to catch up to your normal talent level. Another might be that something might have changed with you.

So, to graphically show this, you would do:
PA, 25th percentile, 50th percentile, 75th percentile
700, .340, .360, .380
600, .330, .355, .380
500, .320, .350, .380
etc.

(Numbers for illustration only.)

Then, within each of those, you would "click" that estimate (say the .355), and you would get another set of "percentiles" to show you the likelihood of getting that from walks or hits, etc.

***

The point here is that PAs are very germane to the issue here. Forecasters are doing their best to estimate the true talent level within that park/league context, but the only thing we can verify is their actual performance levels. And that is tied-in to the number of opps they are given/earned.

Posted 2:08 p.m., March 3, 2004 (#97) - Dark States
  ok. Excuse the noise here:
1. "explain 1987", are you talking about regular season or post season? I know that subjectively the Twins quit playing hard once they clinched the division. I think they lost their last 6 games going into post-season play. I don't know what the 'tanking' of 3.7% of the season does to their overall numbers, but I think there's something to be said for games played after 'elimination'.

2. I did a player ranking for the hockey league I'm in. I think I figured out Yahoo's ranking system, by adjusting for positional players. Basically I applied a positional factor for each position, and came up with a very close correlation. I still need to figure out the relationship between goalies and positional players. Same for pitchers and hitters.

3. Is there a spot on this site to solicit for fantasy leagues? I want to get into a roto league for this season, and I have 2-3 other teams that would be interested.

Posted 3:07 p.m., March 3, 2004 (#98) - tangotiger
  3. feel free to use this thread
1. I meant how the run scoring in 87 was out-of-line with the surrounding years

Posted 9:39 p.m., March 4, 2004 (#99) - Dark States
  Thanks Tango.

I've got a Yahoo Baseball League setup and I'm looking for members. It's a roto league with a live draft setup for Sunday March 14th at 10am CST.

Hitting Categories: AB, R, H, 3B, HR, RBI, SH, SB, BB, AVG, OBP, SLG.
Pitching Categories: IP, W, SV, HR, BB, K, HLD, ERA, WHIP, K/9.

The roster size will be finalized depending on the number of teams. 13-16 position players, 7-9 pitchers and 4-5 bench slots. 16 Teams max, and the more teams, the fewer roster spots.

League ID: 204214
League Password: bselig

If someone wants to create a formula for players, please feel free. Obviously, I'm looking for good active owners. But then again, who isn't. [sarcasm] "yes, I'd like a couple of lazy owners that want to draft only Devil Rays." [/sarcasm]

Thanks again.

Posted 10:17 p.m., March 4, 2004 (#100) - Michael
  You also want to baseline by position, such that the "replacement" level at each position is also set to zero $.

So what is the correct way to calculate the "replacement level"? Ignoring the adjusting to what players each team has and ignoring the adaptive algorithms, what if you want to come up with some static list of $ values a la a lot of roto fantasy sites and a la the BP fantasy manager? Here's my thinking to date:

Simplify for a second and imagine a fake league where each of 10 teams needs 3 utility players (and that's it - no other places) and there's only one category (say SB). Replacement value here is easy to calculate: you just look at the 30th best SB guy on your projections, and he's essentially worth $0 (ignoring the potential need for a minimum $1 bid on all players), because if you are the last team to pick, the last guy (or someone better, if one of the other teams messed up) should be yours uncontested. (Maybe it is better to use the 31st guy as replacement level, since in the general case with multiple positions it works best to use the guy who will never be selected.) Say you project him to steal 17 bases. Now you just need to figure out how many steals above replacement level there are in the top 30 guys. Say you project the top 30 guys to have 753 total steals. Now there are 243 steals above replacement available (753 - 30*17), and if there are 10 teams with $260 budgets each you'd expect that each steal over replacement is worth about $10.7 ($260*10/243). And to calculate a player's value you just know $0 = 17 * $10.7 - b; from that you get the intercept value of about -182, so a player's value in my example here is around 10.7 * projected_steals - 182.
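The arithmetic in this one-category example can be checked with a short Python sketch. The function name `value_line` and its parameters are invented for illustration; the inputs (753 total steals, 30 drafted players, replacement at 17, ten $260 budgets) come straight from the paragraph above.

```python
def value_line(top_total, n_drafted, replacement, league_budget):
    """Return (slope, intercept) so that dollars = slope * stat + intercept.

    top_total     -- projected stat total for the players who will be drafted
    n_drafted     -- how many players the whole league drafts
    replacement   -- projected stat of the last drafted player (worth $0)
    league_budget -- total dollars available across all teams
    """
    above_replacement = top_total - n_drafted * replacement
    slope = league_budget / above_replacement      # dollars per unit above replacement
    intercept = -slope * replacement               # forces the replacement player to $0
    return slope, intercept


slope, intercept = value_line(753, 30, 17, 10 * 260)
print(round(slope, 1), round(intercept))  # 10.7 -182
```

This ignores the $1 minimum bid, just as the post does.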

Ok, so that's how it works in the simplest of cases. If you were to generalize to a full field of positions you'd do the same kind of calculations to get the replacement level at each position, calculate the number of steals above position_replacement for each position, do the divide to get $/steal_above_replacement and you'd be off again. So imagine another fake simplified league that this time requires a catcher, a SS, and an OF, and again just has the steals category and again just has 10 teams. This time replacement level might be as follows: 20 for OF, 13 for SS, 2 for C, and there might be 323, 214, and 50 SB by the top 10 OF, SS, and C respectively. That means there's a total of 237 SB over replacement ((323 - 10 * 20) + (214 - 10 * 13) + (50 - 10 * 2) = 237). With $2600 total that gives about $10.97/SB over replacement. Which means a 9 SB C and a 20 SB SS and a 27 SB OF are all equally valuable (with formulas of about OF$ = 10.97 * SB - 219; SS$ = 10.97 * SB - 143; C$ = 10.97 * SB - 22).
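Here is the positional version as a Python sketch, using the numbers from the paragraph above. The dictionary layout and the function name `positional_value_lines` are assumptions for illustration only.

```python
def positional_value_lines(pools, n_teams, budget_per_team):
    """pools maps position -> (total stat of drafted players, replacement level).

    Assumes one roster slot per position per team. Returns the shared
    dollars-per-unit rate and one intercept per position.
    """
    above_replacement = sum(total - n_teams * repl for total, repl in pools.values())
    rate = n_teams * budget_per_team / above_replacement
    intercepts = {pos: -rate * repl for pos, (total, repl) in pools.items()}
    return rate, intercepts


pools = {"OF": (323, 20), "SS": (214, 13), "C": (50, 2)}
rate, intercepts = positional_value_lines(pools, n_teams=10, budget_per_team=260)

print(round(rate, 2))  # 10.97
for pos, sb in [("C", 9), ("SS", 20), ("OF", 27)]:
    # Each of these is 7 SB above his position's replacement level,
    # so all three come out to the same dollar value.
    print(pos, round(rate * sb + intercepts[pos], 2))
```

The slope is shared across positions; only the intercept moves, which is why the scarce position (C) gets paid for far fewer steals.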

OK, that's all pretty easy. But the hard part comes with multiple stats. Say we go back to the 3 util player league but this time there are two categories: SB and HR. How does one calculate replacement value now? Ideally you want some formula that says $player = x * HR + y * SB + b. But to calculate x and y you need to know what replacement is for HR and SB. And it is no longer as easy as pick the 31st best guy as you need some way to choose the 31st guy and you don't know if a guy with 20 SB and 25 HR might be better than a guy with 10 SB and 30 HR. The naive way might be to pretend you have 2 different single leagues with $130 per team. In other words to calculate the value of just HR ignoring SB and then to do the same with just SB and ignoring HR. I'm pretty sure this isn't sound.

A better way might be to calculate the Z-values for each player in HR and SB, sum the two numbers and rank the players on that as if you were playing a roto game where the single category was Z-value (in other words, find the 31st best summed Z value and use that as replacement level). But here you have a couple of problems:

1. Z values based on what pool of players? If you use average HR and stdev HR for the whole league (and same with SB) and you are only going to choose a subset of the league then why should adding scrubs to the league change your answer? Imagine HR average was 20, std dev was 10 and SB average was 10 and std dev was 5. Imagine because you are choosing only a subset of players you are sure you'll never take someone with below average HR and SB. Now all of a sudden you add a bunch of players who hit 2 HR and stole 8 bases. All of a sudden your average and stdev for HR and SB have changed even though you haven't changed the characteristics of the players you plan on picking. So clearly you'd like to be able to calculate the std and average of the players that you might reasonably consider taking. But to do that you have to be able to rank the players somehow, which is our first problem again.

2. As has been pointed out before in many cases you'd rather have a guy who was 2 std dev above the mean of players you are considering in both HR and SB than a guy who was 5 std dev above in HR but 1 std dev below in SB. This method doesn't reflect that.

So what do people think the best way to come up with replacement values is when you have two categories at once?

Posted 12:08 a.m., March 5, 2004 (#101) - J Cross
  You need to find the expected standard deviation on the team level (more on this later). Once you do that you find the 30th best middle infielder and call that replacement level for middle infielders. At least, that's what I did.

To account for the fact that one standard in HR and one standard in steals is worth more than 2 standards in HR (or steals) you can discount points by (1 - .2*abs(std)). So, 1 standard in HR's would be worth .8 and one standard in steals is worth .8 points for a total of 1.6 points. 2 standards in HR is worth 2*(1-.2*2) =2*(.6) = 1.2 points.
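The discount rule above is easy to sanity-check in Python. The function name `discounted_points` is made up; the .2 factor is the one given in the post, and the input is assumed to be a list of how many standard deviations a player adds in each category.

```python
def discounted_points(stds, discount=0.2):
    """Sum of std * (1 - discount * abs(std)) across categories."""
    return sum(s * (1 - discount * abs(s)) for s in stds)


# One standard in HR plus one standard in steals: .8 + .8
print(discounted_points([1, 1]))   # about 1.6

# Two standards in HR and nothing in steals: 2 * .6
print(discounted_points([2, 0]))   # about 1.2
```

Note that with a .2 discount the per-category value peaks at 2.5 standards and declines after that, so the rule is only sensible for modest deviations.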

I set up a spreadsheet to calculate how many standards from average the team I was in the middle of drafting was from average in each category and adjust the value of stats in each category and rerank the players accordingly. It's mildly useful.

So, how to find the expected standard deviation on the team level? Well, I looked at 10 past leagues with 10 teams each, but if you don't have that, is there any way to predict the std on the team level from a look at the std's of the players you expect to be in the league?

Posted 12:24 p.m., March 5, 2004 (#102) - Matt(e-mail)
  I think that you don't need to worry about the expected standard deviation on the team level, but you can just use the standard deviation from whatever pool of players you decide is appropriate. This is based on the assumption that the players are randomly distributed among the teams in your league (whether you think this assumption is plausible is up to you). If so, then the team standard deviation will just be (sd of players)/(sqrt (# of teams in league)). Since you are dividing every sd by the same number, and all you care about is ranking the players relative to one another, it doesn't affect anything. I think this is right, let me know if I'm wrong in my thinking.

J. Cross, I'd be interested in seeing your spreadsheet. I have a spreadsheet that ranks players based on sds, but am struggling to figure out how to make it dynamically re-rank. You could e-mail to me, thanks very much.

Posted 1:07 p.m., March 5, 2004 (#103) - Michael
  J. Cross your method is ... unsatisfying.

I understand that it may well work from a practical standpoint as a good first order approximation, but there ought to be a theoretical way to calculate it from scratch. Imagine that we were the first roto league to ever use just two categories. What would you use then, as the std and averages would be different than in a 5x5 or 6x6 or 8x8 league?

Posted 1:10 p.m., March 5, 2004 (#104) - tangotiger
  I agree that if you take a reasonable set of players, you should use the SD among those players. You could rerun it every time. That is, start off with the 250 players you think might be selected, and figure out the SD for the 500 MLB players. Rank them. Take the top 250, and redo your SDs. Rerank them. And on and on. Your SDs will stabilize very very quickly.

Posted 1:11 p.m., March 5, 2004 (#105) - tangotiger
  and figure out the SD for the 500 MLB players.

Should read as: and using the SD for the 250 players, figure out the scores for the 500 MLB players.
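A rough Python sketch of that rerun-until-stable idea, for anyone who wants to try it. Everything about the packaging is an assumption: the function name `rerank_until_stable`, the use of summed z-scores as the ranking score, the population SD (`pstdev`), starting from the full player list when no initial guess is supplied, and the tiny made-up player pool.

```python
from statistics import mean, pstdev


def rerank_until_stable(players, n_drafted, categories, initial_pool=None, max_iter=20):
    """players maps name -> {category: projection}.

    Compute mean/SD from the current pool, score ALL players with them,
    keep the top n_drafted as the new pool, and repeat until the pool stops changing.
    """
    pool = list(initial_pool) if initial_pool is not None else list(players)
    scores = {}
    for _ in range(max_iter):
        center = {c: mean(players[p][c] for p in pool) for c in categories}
        # Fall back to 1.0 if a category has zero spread, to avoid dividing by zero.
        spread = {c: pstdev([players[p][c] for p in pool]) or 1.0 for c in categories}
        scores = {
            p: sum((players[p][c] - center[c]) / spread[c] for c in categories)
            for p in players
        }
        new_pool = sorted(players, key=scores.get, reverse=True)[:n_drafted]
        if set(new_pool) == set(pool):
            break
        pool = new_pool
    return pool, scores


# Hypothetical six-player universe, three of whom will be drafted.
players = {
    "A": {"HR": 40, "SB": 5},
    "B": {"HR": 30, "SB": 20},
    "C": {"HR": 10, "SB": 40},
    "D": {"HR": 25, "SB": 10},
    "E": {"HR": 5, "SB": 8},
    "F": {"HR": 2, "SB": 2},
}
pool, scores = rerank_until_stable(players, n_drafted=3, categories=["HR", "SB"])
print(sorted(pool))
```

In this toy case the pool settles after the second pass, which fits the remark that the SDs stabilize very quickly.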

Posted 1:14 p.m., March 5, 2004 (#107) - J Cross(e-mail)
  Matt, I'll send you the spreadsheet. I should warn you that I have very little "programming" experience so this isn't exactly robust. The other problem is that it tries to adjust for the number of draft rounds that have gone by and account for the opportunity cost of not getting saves, home runs or whatever in those rounds. So, if you've drafted 2 pitchers so far your projected hitting numbers will look pretty bad. It gives projected team percentiles in each category and projected team points at the bottom of the draft page. I'll send it to whoever's interested and apologize ahead of time if it's difficult to use.

Posted 1:28 p.m., March 5, 2004 (#108) - tangotiger
  J, if you send it to me, I can post it here. Your call.

Michael/#103: I think if you stick to the player level, you should be fine.

Posted 3:49 p.m., March 5, 2004 (#109) - J Cross
  Tango and Matt,

I'm going to work on that spreadsheet later today if I get the chance and should be able to send you a copy with just ZiPS data that works more smoothly.

Posted 5:55 p.m., March 5, 2004 (#110) - Nod Narb(e-mail)
  J. Cross,

If you'd like to "trade" spreadsheets, it might be useful for each of us to get an inside look at what the other is doing. They sound very similar.

Posted 6:49 p.m., March 5, 2004 (#111) - J Cross(e-mail) (homepage)
  Sounds good. I sent you the spreadsheet. Let me know what you think. I'm thinking about emailing Score Bard and seeing if he wants to use these equations with his draft simulator program.

Posted 12:45 p.m., March 7, 2004 (#112) - Sky
  I'm a little confused about this whole standard deviation thing. Why use standard deviations instead of just raw stats above replacement position?

Sure, it's true that the standard deviation for stolen bases is similar to that for HRs, just with a much lower mean. But even accounting for replacement level, you need way more HRs just to get into the range of earning points, where the standard deviation becomes important.

Thus, even beyond replacement level, there are a certain number of HRs per player that aren't used to move past other teams, but are used just to earn the possibility of moving past teams if you get even more HRs.

I believe the standard deviation option is basically a standings gain points model, which I didn't think was as accurate as a pure value over replacement model.

Thanks in advance for clarifying.

Posted 12:37 a.m., March 8, 2004 (#113) - Kyle S
  Has anyone else bought the Prospectus annual and been, uh, underwhelmed by PECOTA? I realize the book uses weighted means instead of 50th percentile projections, and thus will project too little playing time for almost everyone (because it includes a 10th percentile <200 PA projection for regulars), but it still left me unsatisfied. I don't think I came across any regular from last year who it projected to have a "peak" year - anyone who had a good year last year was projected to regress to either their career average or below, anyone with a bad year projected to be better but still below career average, and a few random people to fall off the table. Maybe that's just the nature of the weighted mean projections. I need to cough up the cash for the site :)

Posted 9:09 a.m., March 8, 2004 (#114) - tangotiger
  Sky, the spread in RBIs is twice that of HR. If most players will have 10 to 50 HR, they will also have 60 to 140 RBIs (numbers made simple for illustration).

You use the SD to try to make them even, because RBIs and HR have the same impact to the overall Rotisserie score.

If you do:
HR score = (HR - 20)
RBI score = (RBI - 80) / 2

what you are doing is:
a) comparing players to the mean
b) standardizing the impact of the events

Since you will be subtracting 20 or 80 from everyone, all that cancels out. What you are left with is just the spread (standard deviation).

Total score = HR + RBI / 2

You figure out what the replacement level "total score" will be. That sets your dollar value for that player to zero.
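
A small sketch of why the means "cancel out", using the illustrative 20 HR / 80 RBI means and the divide-by-2 scaling from above. The player lines are hypothetical. Every player's centered score differs from the raw HR + RBI/2 score by the same constant, so the ranking and the gaps between players are unchanged.

```python
def centered_score(hr, rbi, hr_mean=20, rbi_mean=80, rbi_scale=2):
    """Score relative to the mean, with RBI scaled to the HR spread."""
    return (hr - hr_mean) + (rbi - rbi_mean) / rbi_scale


def raw_score(hr, rbi, rbi_scale=2):
    """Same scaling, without subtracting the means."""
    return hr + rbi / rbi_scale


# Hypothetical (HR, RBI) lines, for illustration only.
players = {"A": (40, 110), "B": (25, 95), "C": (15, 70)}

# The gap between the two scores is identical for every player.
gaps = {
    name: raw_score(hr, rbi) - centered_score(hr, rbi)
    for name, (hr, rbi) in players.items()
}
print(gaps)
```

Because the offset is constant, subtracting the replacement-level total score afterwards gives the same result whichever version you start from.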

***

If you have just 6 players in the whole league, and they have the following "total scores":
player1: 400
player2: 300
player3: 200
player4: 100
player5: 50
player6: 20

and if the whole league has to draft 4 players, then you know that that last guy will go for the minimum (say 1$). So, recalculate the relative total scores as:

player1: 300
player2: 200
player3: 100
player4: 0
player5: less than zero
player6: less than zero

Say that the whole league will have 64$ in salary. Everyone has a minimum 1$ in salary, so that leaves us with 60$ in marginal dollars.

The total scores above replacement is 300+200+100+0=600. So, we see that the marginal dollar per marginal total score is 1 per 10.

That gives us the following roto values:

player1: 31
player2: 21
player3: 11
player4: 1
player5: 0
player6: 0

Total$ = 64

That's how you do it.
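
The six-player example above can be sketched in a few lines. The function name and arguments are made up for illustration; the numbers are the ones from the post.

```python
def roto_dollars(total_scores, n_drafted, league_budget, min_salary=1):
    """Turn total scores into dollar values using the marginal-dollar approach."""
    ranked = sorted(total_scores.items(), key=lambda kv: kv[1], reverse=True)
    drafted = ranked[:n_drafted]
    replacement = drafted[-1][1]  # the last drafted player sets the zero point

    marginal = {name: score - replacement for name, score in drafted}
    marginal_dollars = league_budget - min_salary * n_drafted
    total_marginal = sum(marginal.values())

    values = {name: 0.0 for name in total_scores}  # undrafted players get 0
    for name, above in marginal.items():
        extra = above * marginal_dollars / total_marginal if total_marginal else 0.0
        values[name] = min_salary + extra
    return values


scores = {
    "player1": 400, "player2": 300, "player3": 200,
    "player4": 100, "player5": 50, "player6": 20,
}
values = roto_dollars(scores, n_drafted=4, league_budget=64)
print(values)
```

With 4 players drafted and 64$ in the league, this gives 60 marginal dollars spread over 600 marginal units, which is the 1-per-10 rate described above.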

Posted 9:19 a.m., March 8, 2004 (#115) - tangotiger
  As for being underwhelmed by PECOTA, I'm not sure what you are expecting. There are no soothsayers here. Anyone who can predict a "breakout" is full of it. The best you can do is establish a probability distribution for that player's true talent level, and then you can establish a probability distribution of a performance level, based on the player's true talent probability distribution.

For example, say that the average MLB player is a "100", and Bonds is a "200", and a top minor leaguer is an "80". You've got Javier Vazquez. What do you do?

Well, you try to figure out what his probable true talent level is. You figure that he's a 130. But, that's a best guess. He's more likely to be a 130, with 1 SD = 15. So, you are 68% sure that he's a 115 to 145, and 95% sure that he's a 100 to 160, etc, etc. (Not quite so symmetrical). There is a chance that he's actually a below average pitcher.

Now that you've got that, for every point on the distribution curve, you have to figure what the likelihood of him performing at various levels if given only 1000 PAs. So, at the 100 level, he's got a prob distribution of 100, with 1 SD = 20. At the 101 level, he's got a prob distribution of 101, with 1 SD = 20. (Again, not so symmetrical).

You add it all up, and you end up with a weighted performance level of 130, with 1 SD = 20. You break that out into "percentile" rankings, and you get your answers.
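
A rough simulation of this two-layer idea, assuming both layers are normal and independent (the post notes the real curves are not quite symmetrical). The 130 / 15 / 20 figures are the illustrative ones above; the function name and sample size are arbitrary. Under these simplifying assumptions the two spreads combine as sqrt(15^2 + 20^2) = 25, a little wider than either layer alone, so the round numbers in the post are best read as illustration.

```python
import random
import statistics


def simulate_performance(talent_mean=130, talent_sd=15, perf_sd=20,
                         n=50_000, seed=0):
    """Draw a true talent level, then a performance around that talent."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n):
        talent = rng.gauss(talent_mean, talent_sd)   # uncertainty about talent
        outcomes.append(rng.gauss(talent, perf_sd))  # luck over ~1000 PAs
    return outcomes


outcomes = simulate_performance()
mean = statistics.fmean(outcomes)
sd = statistics.pstdev(outcomes)
deciles = statistics.quantiles(outcomes, n=10)  # 10th, 20th, ... 90th percentile
print(round(mean, 1), round(sd, 1))
print([round(d) for d in deciles])
```

The decile cut points are the "percentile" rankings: the spread between the 10th and 90th percentile comes from both the uncertainty about talent and the luck in a finite number of PAs.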

***

You should also realize that there is another dimension: number of PAs. The fewer PAs, the lower the performance level. Why? Injuries, or just bad luck, and the manager not sticking with you. So, that's another probability distribution to consider.

***

In any case, just be happy with the true talent level. All the other stuff is interesting, but not really useful.

Posted 10:31 a.m., March 8, 2004 (#116) - Sky
  Thanks, Tango. I guess I missed the part where you subtracted out league-average stats from each player's category totals before dividing by the standard deviation. (Which makes perfect sense when normalizing a value, obviously.)

I've read a lot of criticism of the Standings Gain Points method, which is similar to what this SD method is. The main fault is that the SGP method doesn't account for a barrier threshold that's needed to start accumulating points. This SD method DOES deal with that, so I'll have to compare it to what I currently use. Anything inherently wrong with...

Subtract out replacement-level stats from each player. Add up the "useful" stats of all players and figure out $$/stat ($/HR, $/SB, etc) for each category. Multiply useful stats by $$/stat to get value.

Also, how does this model deal with the rate stats, such as batting average?

Posted 10:54 a.m., March 8, 2004 (#117) - tangotiger
  Don't do replacement level by each stat. You have replacement level players, and not replacement level HRs.

***

"I guess I missed the part where you subtracted out league-average stats from each player's category totals"

For counting stats, you don't have to do that, because it all comes out in the wash.

***

As for batting average, see posts: 47 through 52.

Posted 5:28 p.m., March 11, 2004 (#118) - Matt
  What would your (plural) suggestions be for figuring replacement level in these fantasy situations, where you have roster spots that are flexible? For instance, our league has 2 starting pitchers, 2 relief pitchers, and 3 just plain pitchers which could be anyone.

My current idea is that I would find the 20 (10 teams in the league x 2) best SPs and the 20 best RPs, then combine all the pitchers back together and find the 30 next best pitchers. That part makes sense, I guess. But then what value do I use to calculate Value-Over-Replacement for, say, Prior? The 20th best SP, or the #30 guy from the 30 next best pitchers?

Posted 5:48 p.m., March 11, 2004 (#119) - J Cross
  First I'd find the 70 best pitchers. If more than 20 are starters and more than 20 are relievers then I think you could just call the 70th best pitcher (overall) the replacement level.
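
A minimal sketch of that check, assuming each pitcher is a (name, role, value) tuple with role "SP" or "RP". The function name and data layout are hypothetical. It uses "at least 20" rather than "more than 20", since exactly 20 of each still fills the required slots, and it returns None when the condition fails because the post does not say what to do in that case.

```python
def pitcher_replacement_level(pitchers, n_teams=10, sp_slots=2,
                              rp_slots=2, flex_slots=3):
    """Return the value of the last pitcher in the overall top group, or None."""
    total_slots = n_teams * (sp_slots + rp_slots + flex_slots)  # 70 here
    top = sorted(pitchers, key=lambda p: p[2], reverse=True)[:total_slots]
    if len(top) < total_slots:
        return None  # not enough pitchers in the pool

    n_sp = sum(1 for _, role, _ in top if role == "SP")
    n_rp = sum(1 for _, role, _ in top if role == "RP")
    if n_sp >= n_teams * sp_slots and n_rp >= n_teams * rp_slots:
        return top[-1][2]  # the 70th best pitcher overall
    return None  # required SP/RP slots not covered by the overall top group


# Hypothetical pool: 50 starters and 50 relievers with interleaved values.
starters = [("sp%d" % i, "SP", 100 - 2 * i) for i in range(50)]
relievers = [("rp%d" % i, "RP", 99 - 2 * i) for i in range(50)]
level = pitcher_replacement_level(starters + relievers)
print(level)
```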

Posted 7:49 p.m., March 11, 2004 (#120) - Sky
  Still not convinced, although I'd like to be, one way or the other.

Assuming you want to use replacement players, not replacement categories, couldn't you just approximate a replacement level player by averaging the statistical categories of the worst+5 through worst-5 (a range of players around the worst) players at each position, assuming the groups are large enough? Then you would compute stats above this theoretical replacement level in each category?

Posted 8:17 p.m., March 11, 2004 (#121) - Nod Narb
  Say in your draft you have to choose between two players:
Player 1 is 22 R, 5 HR, 15 RBI, 4 SB and .002 BA above replacement
Player 2 is 16 R, 4 HR, 21 RBI, 5 SB, and .005 BA above replacement level. Who do you pick?

Just knowing value above replacement is not enough. You need to use SDs to scale each category equally. This is especially important when comparing pitchers to hitters.

Posted 10:25 p.m., March 11, 2004 (#122) - Sky
  Nod - up to now, I've added up all the Rs, HRs, RBIs, SBs, and xH above replacement level and divided by the total number of that stat above replacement level. So if everyone in the draftable player pool had a total of 1000 HRs above replacement level, Player A would have .5% of the pool. Since for auctions you usually deal with dollars, you convert the .5% to .5% of the $$ allocated to HRs. If you assume 12 teams, $260 and a 2/3 hit/pitch split, Player A would earn about $2 for his HRs. Repeat for each category.
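
The arithmetic in that post, as a sketch. It assumes the 2/3 split means two thirds of the money goes to hitters and that the hitting money is divided evenly over 5 categories (R, HR, RBI, SB, xH); both are readings of the post, and the function name is made up.

```python
def category_dollars(stat_above_repl, pool_total_above_repl, n_teams=12,
                     budget_per_team=260, hitter_share=2 / 3, n_categories=5):
    """Dollars earned in one category from a player's share of the pool."""
    category_budget = n_teams * budget_per_team * hitter_share / n_categories
    return category_budget * stat_above_repl / pool_total_above_repl


# 5 HR above replacement out of a pool of 1000 is a .5% share.
hr_value = category_dollars(5, 1000)
print(round(hr_value, 2))
```

Each category gets 12 x 260 x 2/3 / 5 = 416 dollars under these assumptions, so a .5% share is a little over $2, in line with the "about $2" figure.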

Posted 11:51 a.m., March 12, 2004 (#123) - Matt
  When I look at it, calculating replacement level works out about the same either way (post 118 or 119). The only problem arises with catchers. If I take the top 110 hitters (we start 8 position players and 3 utility guys) and call the 110th guy replacement level, then only one catcher is above this replacement level. Most catchers are well below. But I think this method is pretty reasonable anyways. Let me know if anyone else has any other thoughts.

Posted 2:12 p.m., March 12, 2004 (#124) - Paul Finch
  What I've been doing as far as catchers is just adding them into my top x hitters and taking out the same number of "utility" hitters. It does drive up the value of the other hitters, but minimally, and I think it's a correct method, because everyone has to have a catcher (i.e., several teams will be forced to draft sub-optimally, and since we know this ahead of time, we can adjust for it).
I've done the same thing for pitchers when projected closers are outside my sample, because I know those guys will be drafted.

Posted 7:10 p.m., March 12, 2004 (#125) - Sky (homepage)
  I make separate lists of catchers, middle infielders, and 3B/1B/OF/DHers. Now, 3B are almost as bad as the middle infielders these days, but at least you have the 1B to fill up the CI positions. With these three different lists, you can create a different replacement level for each group. So to determine useful SBs for middle infielders, subtract out the middle infielder SB replacement level, which is different from the catchers SB replacement level, etc. You could theoretically separate out ALL positions, but that gets to be a pain in the ass.

A slightly simpler method is to use the same replacement level for everyone and add on a certain amount (equal to $1 minus the worst catcher value) to every catcher's value. For a 12-team league-specific league, it's about $3 to $4. Adding this much doesn't really affect the total money in the pool, so you don't need to adjust the rest of the league.

Posted 5:09 p.m., March 15, 2004 (#126) - Matt
  The other thing I keep struggling with is how to value guys that are in "scarce" positions. Right now I'm thinking about Gagne. I calculate him to be worth "8" in our scoring system -- let's just take this number as truth. The next best reliever is Billy Wagner, at "4." If it's my turn to pick, and I can choose between Gagne and Vladimir Guerrero (rated at "17" with the next outfielder after him at "16"), whom do I choose, if I know that neither one will be available when it gets back to me? Because obviously Guerrero is worth more, but since everyone has to choose a reliever, Gagne is worth marginally more than the reliever that will be available the next time I pick. I keep coming back to the conclusion that I should take Gagne, but I'm wondering what y'all think.

Posted 5:16 p.m., March 15, 2004 (#127) - tangotiger
  It should be based on the last reliever to be picked. The last player at every position is worth exactly 1 dollar. How much value Gagne has above this player compared to Vlad is what decides who you should pick.

If Vlad has "150 units", Gagne has "70 units", the last OF chosen has 100 units, and the last reliever expected to be chosen is 10 units, then it is irrelevant that Wagner has 60 or 30 or 69 units. Gagne is 60 above and Vlad is 50 above. (Assuming that all units are equal).
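
The comparison in that example, written out. The unit values are the ones from the post; the function name is just for illustration.

```python
def above_replacement(player_units, last_drafted_units):
    """Value over the last player expected to be drafted at the same position."""
    return player_units - last_drafted_units


gagne_margin = above_replacement(70, 10)   # vs. last reliever chosen
vlad_margin = above_replacement(150, 100)  # vs. last OF chosen
pick = "Gagne" if gagne_margin > vlad_margin else "Vlad"
print(gagne_margin, vlad_margin, pick)
```

Wagner's units never enter the calculation, which is the point of the example.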

Posted 6:54 p.m., March 15, 2004 (#128) - Nod Narb
  An alternative method is to use standard deviations. You have to get by the assumption that the player values are normally distributed (and if they're not you can always transform them), but it's a great way to answer the question you're asking. If Gagne's an 8, the average closer is a 3, and the SD is 2, then you know Gagne is 2.5 SD above the mean. If Vlad is a 17 and the average OF is a 10 with a SD of 3, then you know Vlad is 2.333 SD above the mean. If these numbers were real, then Gagne would have slightly more value to your team (all else being equal).
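
The same comparison in code, using the made-up numbers from the paragraph above (the means and SDs are illustrative, not real data).

```python
def z_score(value, position_mean, position_sd):
    """How many standard deviations a player sits above his position's mean."""
    return (value - position_mean) / position_sd


gagne_z = z_score(8, 3, 2)   # closer: mean 3, SD 2
vlad_z = z_score(17, 10, 3)  # outfielder: mean 10, SD 3
print(gagne_z, round(vlad_z, 3))
```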

Tango, I'm not sure I agree with you (not saying you're wrong, just I don't follow your explanation) re: gagne/wagner. IMO wagner's (and other closers in between first and last) value makes a big difference.

say you are deciding between gagne(70 in your example) and vlad(150 in your example)...
lets assume if you draft one you can't get the other.

if wagner is worth 69
draft vlad - 150
draft wagner - 69
total 219
--
if wagner is worth 30
draft vlad - 150
draft wagner - 30
total 180
OR
draft gagne - 70
draft next best OF - 140
total 210

wagner's value matters. if gagne, wagner, and smoltz are all around 70, gagne's relative value decreases, regardless of the value of the last player drafted, because if you don't get gagne, you can still get a 70 point closer. but if gagne is 70 and the next best is 40, gagne is more valuable, because there's no other way to get 70 from a closer. it's probably going to be hard to get 70 from two closers!

Posted 7:46 p.m., March 15, 2004 (#129) - tangotiger
  I'm not saying to draft Wagner! I meant that it doesn't matter how much Gagne is above Wagner.

Posted 8:17 p.m., March 15, 2004 (#130) - Nod Narb
  my last paragraph:

wagner's value matters. if gagne, wagner, and smoltz are all around 70, gagne's relative value decreases, regardless of the value of the last player drafted, because if you don't get gagne, you can still get a 70 point closer. but if gagne is 70 and the next best is 40, gagne is more valuable, because there's no other way to get 70 from a closer. it's probably going to be hard to get 70 from two closers!

it does matter.

Posted 8:34 p.m., March 15, 2004 (#131) - Michael
  I agree with NN. The values of the players at the position matter.

Imagine you have the following choices at 3b:

60,30,28,26,25,24,24,23,22,10

And the following choices at 1b:

60,59,58,57,56,55,55,30,20,10

Where in both places the "10" is your replacement guy.

And again we are assuming a draft based league, not an auction league. If it is your turn to pick it is clear that the top 3b in this example is worth more than the top 1b. They aren't equally valuable even though they are both worth "60" and both have "50" value over replacement.

Posted 10:24 p.m., March 15, 2004 (#132) - Sky (homepage)
  Michael, that is an extreme, although representative example. Todd Zola over at Mastersball has pretty much shown that for typical player populations, taking the highest valued player (above replacement) every round is the best way to go.

If, in your example, we assume the entire league is made up of 10 teams each picking one 1B and one 3B and we have the first pick, it doesn't matter who we pick because we'll get a 60 and a 10. The second pick will get a 60 and a 10.

If, more realistically, there are a lot of players at other positions left and some players are already off the board and it's our turn at some random point in the draft, it still really doesn't matter which $60 player we pick. Assuming the league is somewhat rational and has left many players at other positions in the 30 to 60 range, if we take the 60 1B, we can fill up on the other positions before taking a 3B at the lower end.

There are very few decisions at a draft where thinking about the game theory side of things will prove useful.

Posted 10:38 p.m., March 15, 2004 (#133) - tangotiger
  I agree with Sky. The only way that you care about the "next best player" is with very extreme distributions of positions. Even then, it's hard to believe that this can even have much of an effect when you consider the number of positions in baseball.

Maybe in basketball, if you have your positions as guard, center, forward. And, if there's a big skew in the distributions of centers. Only then do I think I can even consider buying it. And, I'm not even sure about that either.

Posted 12:00 a.m., March 16, 2004 (#134) - Nod Narb
  Take the following example of players and values. I tried to make this realistic. There are 5 1B who are very good and pretty close in value. There are 2 3B who are very good and pretty close in value.
However, the distribution of 1B falls off in value gradually, while the rest of the 3B fall off precipitously.
The 5 best 1B are probably better than the top 2 3B.

Helton(12), Thome(11.5), Giambi(11), Delgado(10.5), Bagwell(10), 6, 5, 4, 3, 1.5.

Chavez(9), Rolen(8), 4, 3.5, 3, 3, 2, 2, 1, 1

Tango and Sky would advocate taking Helton first, Michael and I would advocate taking Chavez first.

My rationale here would be that if you took Chavez (or Rolen), you still might be able to get one of the top 1B with your 2nd pick. Even if you didn't, you'd still be able to get someone at 5 or 6. But if you took Helton first, chances are much less that Chavez or Rolen will be left for your second pick, and you'd have to settle for someone at 4 or less.
The probability of getting 2 good players is higher when selecting Chavez (or Rolen) first.

Posted 12:02 a.m., March 16, 2004 (#135) - Nod Narb
  btw, Michael - we'll probably be able to predict each others' picks with 90% accuracy on sat ;)

Posted 2:21 a.m., March 16, 2004 (#136) - Michael
  Seems likely yes. Although I'm a bit behind (as always) on the amount of functionality that my spreadsheet has. I just hope we get enough live people so that you don't get 8 picks instantaneously and have 90 seconds to enter in 8 picks and choose your next guy.

Posted 11:51 a.m., March 16, 2004 (#137) - tangotiger
  But, you are assuming only 2 positions. In the baseball scheme of things (with the 8 position players, and pitchers), I don't think your scenario has much if any impact. Just a guess on my part.

What you need to do is run Monte Carlo.
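
A very small Monte Carlo sketch of the two-position example from #134, to show the shape such a test could take. Everything about the setup is an assumption: 10 teams, a two-round snake draft, only 1B and 3B, and opponents who pick the position whose best remaining player looks most valuable above replacement, with some random noise in their judgment. It compares always taking the best value above replacement against taking a 3B first. A real test would need all positions and pitchers.

```python
import random

FIRST_BASE = [12, 11.5, 11, 10.5, 10, 6, 5, 4, 3, 1.5]  # values from #134
THIRD_BASE = [9, 8, 4, 3.5, 3, 3, 2, 2, 1, 1]
REPLACEMENT = {"1B": 1.5, "3B": 1.0}  # last player at each position


def snake_order(n_teams, n_rounds):
    """Seat order for a snake draft: 0..n-1, then n-1..0, and so on."""
    order = []
    for rnd in range(n_rounds):
        seats = list(range(n_teams))
        order.extend(seats if rnd % 2 == 0 else reversed(seats))
    return order


def simulate_once(my_seat, my_strategy, rng, noise_sd=1.0, n_teams=10):
    """Run one draft and return the total value of my two players."""
    pools = {"1B": sorted(FIRST_BASE, reverse=True),
             "3B": sorted(THIRD_BASE, reverse=True)}
    needs = {team: {"1B", "3B"} for team in range(n_teams)}
    my_total = 0.0
    for team in snake_order(n_teams, 2):
        open_positions = sorted(needs[team])

        def margin(pos):
            return pools[pos][0] - REPLACEMENT[pos]

        if team == my_seat and my_strategy == "scarce_first" and "3B" in open_positions:
            pos = "3B"
        elif team == my_seat:
            pos = max(open_positions, key=margin)
        else:
            # Opponents judge value above replacement with some noise.
            pos = max(open_positions, key=lambda p: margin(p) + rng.gauss(0, noise_sd))
        value = pools[pos].pop(0)  # best player left at that position
        needs[team].remove(pos)
        if team == my_seat:
            my_total += value
    return my_total


def compare(n_sims=2000, seed=0):
    """Average total value for each strategy over random draft seats."""
    rng = random.Random(seed)
    results = {}
    for strategy in ("best_value", "scarce_first"):
        totals = [simulate_once(rng.randrange(10), strategy, rng)
                  for _ in range(n_sims)]
        results[strategy] = sum(totals) / n_sims
    return results


print(compare())
```

The answer will depend heavily on the noise level and on how opponents are assumed to draft, which is exactly what a fuller simulation would need to vary.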

Posted 1:07 p.m., March 16, 2004 (#138) - Nod Narb
  True. It is really hard to come up with a convincing, realistic example. I haven't the slightest on how to run a Monte Carlo, but it would be interesting to see the results.

Posted 1:39 p.m., March 16, 2004 (#139) - Matt
  In re: post #128, I've already converted everything to standard deviations. So my evaluation of Gagne, Guerrero, etc. are already converted to a standard scale.

I'm being converted over to Tangotiger's method. If nothing else, it makes drafting a heckuvalot easier, since I don't have to try to figure out what other people are going to draft before I get to pick again. I will just choose the player with the highest value above replacement. If I see any weird distributions emerging, like the skewed distributions in your examples, I will be aware of them, and if players have similar VORP, perhaps I will choose the player from the skewed distribution.

One thing I realized is that, since most leagues let you have some form of "utility" player who can play any position, the VORP for a particular player can have two values. For example, I have the replacement-level 2B valued at 15, so if I am looking to draft a 2B, I should compare them all to 15. But once I have picked a 2B, I have to compare all the other 2B to a different value, 17, which is what I figure the replacement-level "generic hitter" will be at the end of the draft.

Overall, I just wanted to say thanks to everyone for posting here and answering my questions and discussing these things. I have learned a lot and will be much better prepared for my draft on Saturday, my first year in fantasy baseball since high school.

Posted 2:57 p.m., March 19, 2004 (#140) - Jim
  I must say I'm impressed with the statistical knowledge and zeal for the fantasy baseball topic of the contributors here. I'm playing a points league for the first time. It is both AL and NL - 5 players to be kept for next season - new league - scoring as follows:

Hitting - 1 pt for singles, RBIs, runs, walks
2 pt for doubles, SBs
3 pt for triples
4 pt for HR
-1 pt for Ks and Caught Stealing

Pitching - 10 pts for Wins
-6 pts for Losses
5 pts for Saves
7 pts for Complete Game
-2 pts per Earned Run Allowed
-1 pts per Walk Allowed, Hit Allowed

I have to fill a basic roster of C,1B,2B,3B,SS,CI,MI,OF,OF,OF,UTIL,SP,SP,RP,RP,RP,P,P.

Am I right in concluding that the scoring system is weighted strongly in favor of hitters? It also seems that relief pitchers are much more desirable than starting pitchers. I kind of think that I should have 5 closer types and only a couple of starters. Do the math experts agree? Thanks.

Posted 3:03 p.m., March 19, 2004 (#141) - Jim
  Re: #140

Pitchers also get 1 point per K.

Posted 3:05 p.m., March 19, 2004 (#142) - tangotiger
  I have no way of telling by your point listing. You need to:
1 - apply your point system to hitters and to pitchers
2 - draw a line at the number of hitters and pitchers who will be selected (preferably based on position, but not necessary for something Q&D)
3 - reset your points as points above replacement hitter and pitcher
4 - average your new hitter points and pitcher points

That'll tell you if there's any favoritism.
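
A quick-and-dirty sketch of those four steps, using the scoring from #140 and #141. The stat keys, function names, and sample stat lines are hypothetical; real use would feed in a full set of projections.

```python
# Point weights from posts #140 and #141.
HITTER_POINTS = {"1B": 1, "RBI": 1, "R": 1, "BB": 1, "2B": 2, "SB": 2,
                 "3B": 3, "HR": 4, "K": -1, "CS": -1}
PITCHER_POINTS = {"W": 10, "L": -6, "SV": 5, "CG": 7, "ER": -2,
                  "BB": -1, "H": -1, "K": 1}


def fantasy_points(stat_line, weights):
    """Step 1: apply the point system to one player's projected stats."""
    return sum(weights[stat] * stat_line.get(stat, 0) for stat in weights)


def mean_points_above_replacement(point_totals, n_selected):
    """Steps 2-4: draw the line, reset to points above it, then average."""
    drafted = sorted(point_totals, reverse=True)[:n_selected]
    replacement = drafted[-1]  # last player expected to be selected
    return sum(p - replacement for p in drafted) / len(drafted)


# Hypothetical stat lines, for illustration only.
hitter = {"1B": 100, "2B": 30, "3B": 2, "HR": 25, "RBI": 90, "R": 85,
          "BB": 60, "SB": 10, "K": 100, "CS": 4}
starter = {"W": 15, "L": 8, "SV": 0, "CG": 2, "ER": 80, "BB": 50,
           "H": 190, "K": 180}

hitter_pts = fantasy_points(hitter, HITTER_POINTS)
starter_pts = fantasy_points(starter, PITCHER_POINTS)
print(hitter_pts, starter_pts)
```

Running mean_points_above_replacement separately on the hitter totals and the pitcher totals, each with its own number of roster spots, gives the two averages to compare.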

Posted 3:08 p.m., March 19, 2004 (#143) - Sky
  Here's a question i'd like to propose to folks concerning points leagues...

In roto leagues, an uneven split between hitters and pitchers is commonplace, mostly because more pitcher value is available from the free agent pool throughout the season than hitter value because pitcher performance has more variability given the traditional roto categories.

In points leagues, pitchers are probably still more unpredictable, but how does that manifest itself in some sort of weighting? If you figure out how many points the pitchers and hitters are projected to earn, you should probably discount pitcher points somewhat due to their unpredictability. But how? In roto, half the categories are pitching and half are hitting, by definition. But with points leagues, one group could be given more importance because of the point values.

If your draftable player pool for hitters contains as many points as the draftable player pool for pitchers, then you might want to apply the same roto split (something like 1:2). But for anything else, how do you know?

Posted 3:27 p.m., March 19, 2004 (#144) - J Cross
  Jim,

In order to answer your question you'd have to plug projections for hitters and pitchers into those equations and find every player's value over replacement. I don't have quite that much time but here are the top 10 hitters (unadjusted for position) using Marcel projections and your scoring system:

Rodriguez, Alex
Pujols, Albert
Soriano, Alfonso
Helton, Todd
Ordonez, Magglio
Garciaparra, Nomar
Wells, Vernon
Thome, Jim
Boone, Bret
Beltran, Carlos

Posted 3:38 p.m., March 19, 2004 (#145) - Jim
  Thanks for the comments.

Posted 10:32 a.m., March 20, 2004 (#146) - Nod Narb
  We've had a few people drop out of the primer fantasy league, which drafts tonight. If anyone else wants to join, here are the details:

http://baseball.fantasysports.yahoo.com/b1
league ID: 2686
password: primer

draft is at 5:30 EST tonight

Posted 4:21 a.m., March 21, 2004 (#147) - Michael
  The aftermath is at:

http://baseball.fantasysports.yahoo.com/b1/2686/draftresults

If you check out the draft results remember this is an 8x8 league with:
R,HR,RBI,SB,TB,BB,OBP,OPS for hitters
IP,W,SV,K,ERA,WHIP,K/9,K/BB for pitchers

Posted 9:09 p.m., March 21, 2004 (#148) - Nod Narb
  Tango - any chance of getting the Marcels for pitchers sometime soon? I'd be glad to help if you're too busy with other things.

Posted 10:19 p.m., March 21, 2004 (#149) - tangotiger
  Funny thing... I was doing them while my kid was sleeping. I'm almost done.

Posted 1:26 p.m., March 23, 2004 (#150) - Matt
  I know that talking about your own fantasy team is like the most boring thing in the world to other people, but anyways. I had my draft, and I learned a lesson. I seriously underestimated the quality of outfielder that would be left at the end of the draft (i.e., the replacement level OF). This caused me to completely overvalue OFs, and I drafted 6 of them in the first 7 rounds. Now I have too much hitting and not enough pitching.

I think this was mostly because other people had different ideas about who the good outfielders were. I set the replacement level at about the 70th best OF, but at the end of the day, I should have set it at about the 40th best OF. If anyone still has to draft, you may want to take into account other people's sub-optimal drafting evaluations when setting replacement level. Perhaps it's more obvious who the top people are at the other positions, and there are fewer to choose from, but I pretty much nailed the replacement level for everything else except OF.

Posted 1:58 p.m., March 23, 2004 (#152) - tangotiger
  "I set the replacement level at about the 70th best OF, but at the end of the day, I should have set it at about the 40th best OF"

How many teams were there, was it AL+NL or one league only, and how many OF subs and "general" subs did you have to choose?

Posted 7:36 p.m., March 24, 2004 (#153) - Matt
  Ten teams in my league, choosing from both AL and NL. We need three OFs and three utility players which can be from any position. Thus, a total of 11 hitters per team. We also draft 5 bench players which can be hitters or pitchers; I figured about 3 of those would be hitters, per team. That gives us 30 outfielders to be drafted, and a total of about 140 hitters.

When I sorted my player list by "value" (calculated by standard deviations), the top 140 players included about 70 outfielders. I had to add in a few catchers to that 140, but that's minor. I still had 60-70 outfielders that I thought would be drafted. Now, it's true that 60-70 outfielders were drafted, but about a dozen of my top 70 were not drafted. And the ones that are left are somewhat distributed throughout the list, so that a couple of guys who I have ranked about 30th are still available, and so are several guys that are ranked in the low 40's.

What do you think? Did I make a mistake somewhere?

Posted 10:53 p.m., March 24, 2004 (#154) - tangotiger
  60-70 OF would be what I would have figured as well. Now, you mentioned that there were about 20% of those in your top 70 OF list not drafted. What about the other positions? Was there the same kind of sub-optimal drafting?

If it's the case that there is a particular position that you EXPECT to have sub-optimal drafting, then you would adjust your base accordingly.

While we like to say to draw a line at the 140 hitters, we're really saying to draw a line at the number of hitters, out of those 140, that would be selected. So, if only 120 of those 140 hitters would have been selected, draw a line at 120.

Posted 2:16 p.m., March 25, 2004 (#155) - Matt
  I would say that yes, 10-20% of my top players at other positions were not drafted. However, the effect was not as great for the other positions, in that I would say I was only off by a little bit when predicting replacement level for the other positions.

For most of the other positions, 10-20% is like one or two guys. I can think of the players left at the end of the draft as randomly distributed throughout my list, but obviously more likely to come from the bottom of the list and increasingly less likely to come from the top. So with 1-2 guys left at one position, it is not too likely that someone will be left that I consider "mid-range." But with 20 guys, as in the case of the OFs, that's pretty likely. Also, if just one guy is left in a position in the mid-range, I have a tough time calling that "replacement level." But with the OFs, there start to be quite a few guys left at a certain level.

It didn't happen with pitchers -- my replacement level estimate was pretty close. My hypothesis is that the mid-range guys that would have been left by other people got drafted by me later on, whereas I couldn't take any more OFs.

Posted 3:07 p.m., March 25, 2004 (#156) - tangotiger
  It's certainly possible that your group specifically overvalued OF (or undervalued their replacement level). Same thing happens in hockey with defensemen.