A simpler look at more difficult stats

What makes a stat useful? Thirty years ago, long before Moneyball was written or sabermetrics became popular, Bill James asked this question about baseball. Now that people, most notably those associated with Football Outsiders, are developing new stats for football, it seems time to consider them closely and see how they can add to our understanding of the game. Football Outsiders uses some complicated-sounding statistics, but at their core, most of them are rather simple. My goal here is to simplify these stats to the point that their concepts are easy to understand, so that people can make an informed decision about whether they find them persuasive or, better yet, to what extent they find them helpful. To put my biases out front: though I generally like statistics, and particularly new ones, the closer I look at many of their stats, the less persuaded I am. I do, however, think that some of their ideas are quite helpful. If I have misunderstood their ideas, I hope some of you can explain where.

Because this gets a little messy, I'm going to bold the beginning of the paragraphs I would describe as broadly relevant. The others are probably more interesting if you want to argue or understand the details.

How do you measure an offensive line's impact on a run? Their basic idea is this: though both the offensive line and the running back matter on any run, the closer the play ends to the line of scrimmage, the more the blockers matter, and the further downfield it goes, the more the running back matters. In other words, when an RB loses 2 yards on a run, that's normally because the offensive line messed up. When an RB gets a 50 yard run, at some point he usually did something right. Makes sense to me.

From that idea, they develop two cool stats. If the offensive line plays a large role in rushes that end in losses, then it helps to measure how often that happens and compare teams. According to that stat, which they call "stuffed," the Jets ranked 4th best last year and were the best in the NFL in 2010. Another interesting stat is each team's success rate on clear rushing downs, namely 3rd or 4th down with 2 yards or fewer to go. According to that stat, which they call "Power Success," the Jets finished 9th last year and were 2nd in 2010. This suggests that the Jets' OL might have been well above average in helping the running game both years, though better two years ago.
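To make the two definitions concrete, here is a minimal sketch of how both rates could be computed from a list of rushing plays. The `Rush` record, the function names, and the sample plays are all made up for illustration. I am also assuming that "stuffed" means any run that loses yardage, that "success" means gaining at least the yards needed, and that the short-yardage cutoff is 2 yards (it is a parameter, so it is easy to change). Football Outsiders may define the details differently.

```python
from dataclasses import dataclass


@dataclass
class Rush:
    down: int    # 1 through 4
    to_go: int   # yards needed for a first down
    gain: int    # yards gained; negative for a loss


def stuffed_rate(rushes):
    """Fraction of rushes that lost yardage (my assumed reading of "stuffed")."""
    if not rushes:
        return 0.0
    return sum(r.gain < 0 for r in rushes) / len(rushes)


def power_success_rate(rushes, max_to_go=2):
    """Conversion rate on 3rd or 4th down with short yardage to go."""
    power = [r for r in rushes if r.down >= 3 and r.to_go <= max_to_go]
    if not power:
        return 0.0
    # Assumption: a "success" is gaining at least the yards needed.
    return sum(r.gain >= r.to_go for r in power) / len(power)


# Made-up plays, for illustration only.
sample = [
    Rush(1, 10, 4),
    Rush(2, 6, -2),
    Rush(3, 1, 2),
    Rush(3, 2, 0),
    Rush(4, 1, 1),
    Rush(1, 10, 12),
]
print(f"stuffed: {stuffed_rate(sample):.1%}")
print(f"power success: {power_success_rate(sample):.1%}")
```

Note that for the stuffed rate a lower number is better for the offense, while for power success a higher number is better, which is why the rankings talk about "avoiding" getting stuffed.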

Everywhere else, though, this stuff gets messier. Intuitively, a team whose OL is effective at keeping its backs from getting stuffed should also do well on power runs. A lot of teams are like the Jets, either good in both stats or bad in both. Quite a few, though, seem to have had weirder experiences. For example, the Eagles were 6th best in the NFL in power runs last year, but dead last in avoiding getting stuffed. The Broncos, on the other hand, were 26th in power runs, but 9th in avoiding getting stuffed. One possibility is that these teams are weird, but a more likely one is that, because there are relatively few rushes on something like 3rd or 4th down and 2, randomness plays a large role. One solution might be to measure over multiple years to get more data, though that obviously has its own problems. Even calculated as it is, though, I think it is reasonable to say that teams that do well in both categories, like the Jets, likely had an above-average OL for rushes, while teams that did badly in both likely had a below-average one. This seems to be a cool way to measure something that we hadn't known how to measure before.
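To get a feel for how much randomness a small sample allows, here is a rough sketch using the standard error of a proportion under a simple binomial model. The attempt counts and rates below are hypothetical round numbers I picked for illustration, not real team totals, and treating every rush as an independent coin flip is itself a simplifying assumption.

```python
import math


def standard_error(rate, attempts):
    """Standard error of an observed success rate under a binomial model."""
    return math.sqrt(rate * (1 - rate) / attempts)


# Hypothetical volumes: a few dozen power situations in a season
# versus several hundred rushes overall.
scenarios = [
    ("power success", 0.60, 40),
    ("stuffed", 0.20, 450),
]
for label, rate, attempts in scenarios:
    margin = 1.96 * standard_error(rate, attempts)  # rough 95% interval
    print(f"{label}: {rate:.0%} on {attempts} attempts, +/- {margin:.1%}")
```

With only a few dozen attempts, the margin on a power success rate comes out several times wider than the margin on a stuffed rate built from hundreds of rushes, so a team's power ranking can swing a lot on luck alone.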

The major stat they use, though, is called "Adjusted Line Yards." They measure this using the first thing they write on this page. What I explained above is what they mean when they say that the OL gets 120% of the "credit" for losses in runs, 100% for 0-4 yards, etc. Precisely how they come up with the 120% figure and so forth, or the 0-4 yard category (why not 0-3, or 0-5?), is unclear. (Not to get too deep into the weeds here, but it's also unclear how they take into account down and distance and other variables. One possibility is that they tie this to the idea of successful and unsuccessful plays, which has its own issues. My instinct, though, considering that they bring this back to a yards-per-carry figure, is that they compare each run to an estimate of how many yards a play should get in that situation. If that's true, it would be highly odd, since it would throw away all of the gains they had made by rejecting the idea that all yards are created equal. This is a relatively incidental point, though.)
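Here is a minimal sketch of the weighting scheme as I read it: losses count 120%, yards 0-4 count in full, yards 5-10 count at half value, and anything past 10 yards earns the line nothing. The function names are mine, and applying the weights yard by yard (so a 7 yard run is worth 4 + 1.5 = 5.5) is my assumption about how the ranges combine. The sketch also leaves out whatever adjustments they make for down, distance, and other variables, since I don't know how those work.

```python
def line_credit(gain):
    """Yards credited to the offensive line for one rush (assumed weights)."""
    if gain < 0:
        return 1.2 * gain                   # losses count 120% against the line
    first = min(gain, 4)                    # yards 0-4 count in full
    middle = max(0, min(gain, 10) - 4)      # yards 5-10 count at half value
    return first + 0.5 * middle             # yards past 10 earn no line credit


def line_yards_per_carry(gains):
    """Average line credit per rush, before any other adjustment."""
    return sum(line_credit(g) for g in gains) / len(gains)


# Made-up carries: a loss, a short gain, a medium gain, and a long run.
carries = [-2, 3, 7, 50]
for g in carries:
    print(f"{g:>3} yard run -> {line_credit(g):.1f} line yards")
print(f"raw yards per carry:  {sum(carries) / len(carries):.2f}")
print(f"line yards per carry: {line_yards_per_carry(carries):.2f}")
```

The 50 yard run is the telling case: under these weights it is worth no more to the line than any other run that gets past 10 yards.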

Here's the huge issue with Adjusted Line Yards. It assumes that the OL gets all of the credit (or more than all) for certain runs, which utterly ignores that on every single play, the RB still plays a key role. This might seem like a small oversight, but it is actually the core of the problem with Adj. Line Yards. What they're trying to give you is an estimate of how many yards per carry the team would have gotten with a back who was average at getting big gains. Yet evaluating an OL's ability to help an RB gain yards by using the total yards a run produces, without considering the RB, defies logic. They don't consider the RB's role in turning a loss into a gain, or anything else. For example, elsewhere they discuss boom-and-bust RBs and say that these backs take more risks at the line of scrimmage in order to get the occasional big gain, but the effect of this stat is simply to blame the OL for the losses while taking away all credit for the gains.

One other little problem: they seem to have gotten their math mixed up. Above, they say they would re-average the Adj. Line Yards to match the average yards per carry (don't worry about this, but it's because overweighting bad runs and underweighting good runs makes everybody's Adj. Line Yards lower than it should have been), but then they seem to have decided not to do it in the actual stats. I'm assuming that's an oversight. It would be quite easy for them to fix, but for us, it takes a lot of effort to get it precisely right. To get a particular team's 2011 stats roughly right, though, multiply their Adj. Line Yards by 1.06 (again, this doesn't really matter, but the median ALY is 4.06 and they have the average gain per rush as 4.32; 4.32/4.06 = 1.06). For the Jets, that would raise their Adj. Line Yards to 4.5.
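The rough correction is just a single scale factor, sketched below with the two league figures quoted above. The constant and function names are mine. Dividing an average by a median is a shortcut rather than a proper re-averaging, so treat the result as a ballpark.

```python
LEAGUE_YPC = 4.32   # average gain per rush, as quoted above
LEAGUE_ALY = 4.06   # median Adjusted Line Yards, as quoted above

# One multiplier that lifts the typical ALY up to the league rushing average.
SCALE = LEAGUE_YPC / LEAGUE_ALY


def rescale(aly):
    """Approximate re-averaged Adjusted Line Yards for one team."""
    return aly * SCALE


print(f"scale factor: {SCALE:.2f}")
# Sanity check: the median team should land on the league average.
print(f"median team after rescaling: {rescale(LEAGUE_ALY):.2f}")
```

Because every team is multiplied by the same number, the rankings don't change. Only the comparison between a team's Adj. Line Yards and its actual yards per carry does.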

It's tempting to check whether a team has more Adjusted Line Yards than actual yards and use that to analyze the OL. What does it mean, though, that the Jets' (Adjusted) Adjusted Line Yards are so much higher than their actual rushing results? Basically, it shows what you already know: the Jets' RBs weren't able to break many big runs, and what gains they did get depended heavily on the types of runs where the OL plays an unusually large role (aka small ones). To some extent, that is useful. My issue with this stat, however, is what it doesn't tell you: namely, whether the Jets got so many 2-4 yard runs because Shonn Greene and company couldn't break out beyond their OL, or because the OL failed them and they had to break tackles just to get that far. By simply granting all of the blame and credit to the OL for rushes of 4 yards or fewer, this stat does absolutely nothing to answer that question. Instead, it simply assumes that for runs of 4 yards or fewer the OL is the only relevant factor, for 5-10 yards the OL and RB are equally responsible, and for longer runs the RB alone deserves the credit. They present no evidence for that claim, nor any consideration that for some teams on some runs the RB might be more important, and for others the OL. For answering that question, these stats are, as of now, of very limited use; only watching the games can tell us.

This is a FanPost written by a registered member of this site. The views expressed here are those of the author alone and not those of anybody affiliated with Gang Green Nation or SB Nation.