In the last few articles, I have shown average yards per carry (YPC), and year-to-year improvement in YPC, for running backs by age and by experience. In this article, we are going to examine the error we get if we apply these curves to predict 2009 statistics from 2008 and 2007 stats.
We will calculate error on a root-mean-squared (RMS) basis. In other words, we will take our projected 2009 yards per carry, subtract the actual 2009 yards per carry, square the result, add up the squared value for all the players, divide by the number of players, and, finally, take the square root of the result. This gives us a measure that is equally weighted for too high and too low errors, and in units of yards per carry.
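As a rough sketch of the calculation just described, here it is in Python. The function name and the sample numbers are made up for illustration and are not real player data.

```python
import math

def rms_error(projected, actual):
    """Root-mean-squared error between projected and actual YPC values."""
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two non-empty sequences of equal length")
    # Subtract, square, average over all players, then take the square root.
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up values for three hypothetical players.
projected_ypc = [4.0, 4.5, 3.5]
actual_ypc = [4.5, 4.0, 3.5]
error = rms_error(projected_ypc, actual_ypc)
print(round(error, 4))
```

Because each difference is squared, a projection that is half a yard too high counts exactly as much as one that is half a yard too low.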
As a baseline, we will figure out what the RMS error is if we use last year's value or an average of the last two years.
Using last year's YPC = 0.7597
Using avg of last 2 years' YPC = 0.6650
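For anyone who wants to reproduce baselines like these on their own data, here is a minimal sketch. The player tuples are invented, and the two-year figure is assumed to be a plain average of the two seasons' YPC, not weighted by carries.

```python
import math

def rms_error(projected, actual):
    """Root-mean-squared error between projected and actual YPC values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

# Hypothetical players as (2007 YPC, 2008 YPC, 2009 YPC); made-up values.
players = [
    (4.0, 5.0, 4.5),
    (3.5, 4.5, 4.0),
    (5.0, 4.0, 4.5),
]

actual_2009 = [p[2] for p in players]

# Baseline 1: project 2009 as a repeat of 2008.
last_year = [p[1] for p in players]

# Baseline 2: project 2009 as the simple mean of 2007 and 2008 (assumption).
two_year_avg = [(p[0] + p[1]) / 2 for p in players]

err_last_year = rms_error(last_year, actual_2009)
err_two_year = rms_error(two_year_avg, actual_2009)
print(err_last_year, err_two_year)
```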
So the system we are trying to come up with had better have an RMS error of less than 0.6650, or all this fancy cipherin' will be for nothing.
So we will start with the worst and work our way up to the best.
In last place, we have YPC improvement using the player's age, with a projection error of 0.8836. Funny, since I put in yesterday's post that this was my favorite based upon the curve shape, but it seems to not do very well.
Next is YPC improvement using the player's experience, with an error of 0.8443. So it looks like multiplying last year's stats by a player improvement average is actually less accurate than just using the previous year's stats.
Using average YPC with age results in an error of 0.6206, so we finally have something that beats both baselines, including the two-year average at 0.6650.
Finally, using average YPC with experience results in an error of 0.6166. So just using a player's experience as opposed to any prior performance is the way to go? That just didn't seem right. So I decided to use a percentage of prior performance and a percentage of experience for projecting. I ran an analysis to find which split of percentages would minimize error, and found that a split of 20% prior statistics and 80% average YPC by experience resulted in a minimum error of 0.5266.
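One way to search for a split like this is a simple grid search over the weight on prior performance. The sketch below assumes that approach and uses 5% steps; the helper names and the toy numbers are hypothetical, and the data is contrived so the answer is easy to check by hand.

```python
import math

def rms_error(projected, actual):
    """Root-mean-squared error between projected and actual YPC values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))

def blend(prior_ypc, experience_ypc, prior_weight):
    """Weighted mix of a player's prior YPC and the average YPC for his experience."""
    return prior_weight * prior_ypc + (1 - prior_weight) * experience_ypc

def best_prior_weight(priors, experience_avgs, actuals, steps=20):
    """Try weights 0, 1/steps, ..., 1 and return (weight, error) with the lowest RMS error."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        projected = [blend(p, e, w) for p, e in zip(priors, experience_avgs)]
        err = rms_error(projected, actuals)
        if best is None or err < best[1]:
            best = (w, err)
    return best

# Toy data, not real players.
priors = [5.0, 3.0, 4.0]            # each player's own prior YPC
experience_avgs = [4.0, 4.0, 4.0]   # average YPC for each player's experience level
actuals = [4.25, 3.75, 4.0]         # what each player actually did

weight, error = best_prior_weight(priors, experience_avgs, actuals)
print(weight, error)
```

With only a handful of players, the best weight found this way can swing a lot from one sample to the next, so it is worth treating the result as a starting point.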
Still, putting so much more weight on experience than prior performance does not feel right. For this analysis, I used only players with 100 or more carries in each of the 3 years of the analysis, so there were only 20 players used. This small sample size could be a source of error, so for 2010 stats, I will use 50% of the player's last 3 years' average, and 50% of the average YPC per experience of the player.
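The 50/50 projection can be written as a small function. This sketch assumes the 3-year figure is a plain average of the three seasons' YPC, not weighted by carries; the function name and the sample numbers are made up.

```python
def project_2010_ypc(last_three_ypc, experience_avg_ypc):
    """Half the player's own 3-year average YPC, half the average YPC for his experience."""
    # Simple unweighted mean of the three seasons (assumption).
    three_year_avg = sum(last_three_ypc) / len(last_three_ypc)
    return 0.5 * three_year_avg + 0.5 * experience_avg_ypc

# Hypothetical player: 4.0, 4.5 and 5.0 YPC over the last three seasons,
# with a made-up 4.1 average YPC for backs at his experience level.
projection = project_2010_ypc([4.0, 4.5, 5.0], 4.1)
print(round(projection, 2))
```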
When I'm able to get more years of stats in my database, and when I have time (which may not be until next offseason), I will revisit these numbers to see how the formula should be tweaked.
Next up: running back TDs.