When Should NFL Teams Go For It on 4th Down?

After conquering MLB and the NBA, advanced stats are now beginning their takeover of the NFL. Once found only on outsider blogs, advanced statistics and their corresponding tools have now permeated mainstream NFL commentary and analysis. Perhaps the most discussed application of modern analysis has been 4th down decision making by coaches. The general conclusion of this analysis is that NFL coaches overestimate the value of punts and the success of field goals and, as a result, do not go for it enough on 4th down. This conclusion is derived primarily from the work of Brian Burke at Advanced Football Analytics, and has resulted in the creation of a Twitter account that tweets out the statistically optimal decision each time an NFL team is faced with a real world 4th down decision.



For the most part, the Robot's optimal solution is the one that yields the most expected points. The expected point metric is based on the idea that for each point on the field, a team can expect to gain a certain number of points, on average, out of its drive. For instance, a team facing first and ten from its own 20 yard line can expect to earn fewer points, on average, from that drive than it would on a drive where it faced first and ten from its opponent's 10 yard line. Thus, by knowing the value of any point on the field, and by knowing the various percentages associated with converting on 4th down, versus punting, versus making a FG, you can calculate the expected point maximizing decision. This model is diagrammed as:


The above model is fine for a very top level examination of expected point values, but it doesn't account for the wide variety of outcomes that can follow seemingly simple decisions. For instance, if a team faces 4th and 20 and gains only 10 yards, it turns the ball over, but its opponent is in a worse position than a simple 4th down conversion rate would imply, because the opponent takes over the ball ten yards closer to its own goal line and therefore has a lower expected point value for its new drive. Similarly, a failed field goal attempt could involve a botched hold, a block, or a return, meaning the downside of a missed FG could be a touchdown for the other team.
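To make the 4th and 20 example concrete, here is a minimal sketch of how the value of a failed attempt changes once the yards gained are taken into account. The `expected_points` function is a hypothetical linear placeholder, not a curve fitted to any play by play data, and every name in the block is an assumption made for illustration.

```python
def expected_points(yards_from_own_goal):
    """Hypothetical placeholder: value of 1st and 10 at this field position.

    The slope and intercept are made up for illustration only.
    """
    return -1.0 + 0.07 * yards_from_own_goal


def value_after_failed_attempt(line_of_scrimmage, yards_gained):
    """Offense's value when it turns the ball over on downs.

    The opponent takes over where the play ended, so its field position,
    measured from its own goal line, is 100 minus that spot. The offense's
    value is the negative of the opponent's expected points.
    """
    spot = line_of_scrimmage + yards_gained
    opponent_position = 100 - spot
    return -expected_points(opponent_position)


# 4th and 20 from midfield.
simple_model = value_after_failed_attempt(50, 0)    # failure treated as no gain
with_distance = value_after_failed_attempt(50, 10)  # failure after a 10 yard gain

print(f"Simple conversion-rate view: {simple_model:+.2f}")
print(f"Accounting for the 10 yards: {with_distance:+.2f}")
```

Under any expected points curve that rises with field position, the second value is less negative than the first, which is the effect a single conversion rate hides.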

Using simple averages for 4th down conversion rates or FG%s is not robust enough for this model. Instead of using averages, our model will use discrete probability distributions for:

  • Expected Points for given drive starting positions

  • FG% and unsuccessful FG outcomes

  • Expected Yards Gained on 4th

  • Punt distances

  • Unsuccessful punt outcomes

These distributions will be used to generate an exact point outcome for each simulated play, and then a Monte Carlo simulation with 1,000 iterations will be run to determine the average expected points generated from a given decision.
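The simulation step can be sketched roughly as follows. This is a minimal illustration, not the model used for the results in this post: the two-outcome field goal distribution (80% make worth +3.0, 20% miss worth -1.0) and the function names are hypothetical.

```python
import random


def sample_field_goal_points(rng):
    """Draw one point outcome for a field goal attempt.

    Hypothetical two-outcome distribution: 80% make (+3.0), 20% miss (-1.0).
    """
    return 3.0 if rng.random() < 0.80 else -1.0


def monte_carlo_mean(sample_outcome, iterations=1000, seed=0):
    """Average the point outcomes of repeated simulated plays."""
    rng = random.Random(seed)
    total = sum(sample_outcome(rng) for _ in range(iterations))
    return total / iterations


estimate = monte_carlo_mean(sample_field_goal_points)
print(f"Estimated expected points: {estimate:.2f}")
```

With these made-up inputs the exact expectation is 0.8 × 3.0 + 0.2 × (-1.0) = 2.2, so the printed estimate should land close to that value. A fixed seed keeps the run reproducible.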

The distributions will represent all outcomes for a given scenario based on NFL play by play data from 2005 to 2011. For instance, NFL teams faced 230 4th and 3 scenarios during this time frame and gained 0 yards 101 times, 1 yard 23 times, 2 yards 27 times, and so on. Each distribution is a series of weighted likelihoods that mimics the exact play by play data from 2005 to 2011.
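As a rough sketch of what such a weighted distribution looks like in practice, the block below draws simulated plays in proportion to the 4th and 3 counts quoted above. Only the three counts given in the text are used. The remaining plays are lumped into a single "other" bucket because their breakdown is not listed here, and the variable names are assumptions.

```python
import random
from collections import Counter

# Counts quoted in the text for 4th and 3, 2005 to 2011.
TOTAL_PLAYS = 230
yards_gained_counts = {0: 101, 1: 23, 2: 27}

# The text does not list the remaining outcomes, so they are kept as a single
# unlabeled bucket rather than guessed at.
OTHER = "other"
yards_gained_counts[OTHER] = TOTAL_PLAYS - sum(yards_gained_counts.values())

outcomes = list(yards_gained_counts)
weights = list(yards_gained_counts.values())

# Draw 1,000 simulated plays, each outcome weighted by how often it occurred.
rng = random.Random(42)
draws = rng.choices(outcomes, weights=weights, k=1000)

simulated = Counter(draws)
for outcome in outcomes:
    observed_share = yards_gained_counts[outcome] / TOTAL_PLAYS
    simulated_share = simulated[outcome] / len(draws)
    print(f"{outcome!s:>5}: observed {observed_share:.3f}, simulated {simulated_share:.3f}")
```

The simulated shares should track the observed ones closely, which is all "mimicking the play by play data" means here.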

By factoring in a variety of possibilities, we can also calculate the tail risk of significantly bad outcomes using "Conditional Value at Risk" (CVAR). In addition to calculating expected points, we will also be calculating CVAR25%, or colloquially, the average points gained or given up in the worst 25% of outcomes.
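A minimal sketch of the CVAR25% calculation, assuming the simulated point outcomes are held in a plain list: sort them, keep the worst quarter, and average that tail. The function name, the rounding choice (round the tail size up) and the sample values are assumptions for illustration.

```python
import math


def cvar(outcomes, tail=0.25):
    """Mean of the worst `tail` share of outcomes (lowest point values)."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    ordered = sorted(outcomes)
    # Round up so the tail always contains at least one outcome.
    tail_size = max(1, math.ceil(tail * len(ordered)))
    worst = ordered[:tail_size]
    return sum(worst) / len(worst)


simulated_points = [-7.0, -3.0, 0.0, 0.0, 2.0, 3.0, 3.0, 7.0]  # made-up values
print(cvar(simulated_points))  # -5.0
```

Here the worst 25% of eight outcomes is the two lowest values, -7.0 and -3.0, whose average is -5.0, even though the average of all eight outcomes is positive.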

For our analysis, we have simulated the expected points and CVAR25% resulting from going for it, punting, and kicking a field goal on 4th and 3 from every yardage point on the field, ranging from 0 to 97 yards away from the offensive team's own goal line. We have run each scenario simulation 1,000 times. 

The results of this analysis align nicely with the results of the analysis done by the advanced stats NFL blogs. In the charts, the dark lines are the expected points, and the light lines are the CVAR25% values smoothed with a second-order (quadratic) regression. Put another way, they show the points a team should expect to get on average and the points a team should expect to give away in a worst case scenario:

Go For It

[Chart: Expected Points and CVAR by Field Position for Going For It on 4th and 3]

Attempt a FG

[Chart: Expected Points and CVAR by Field Position for Attempting a FG on 4th and 3]

Attempt a Punt

[Chart: Expected Points and CVAR by Field Position for Attempting a Punt on 4th and 3]

There are two immediate takeaways from these charts:

  1. The expected points from going for it increase roughly linearly because being closer to your opponent's end zone always increases your chances of scoring by a steady margin, while the curves for FGs and punts are logarithmic because, after a certain point, being closer to your opponent's end zone doesn't significantly increase the chances of making a FG or, in the case of a punt, of avoiding a touchback.

  2. Punting the ball almost always yields negative expected points because the opposing team is likely to score from just about any of the starting positions it might expect to get from a punt. Possessions are worth points!

There are, however, some not so obvious conclusions that become clearer when viewing the charts together:

Like the analysis from past models, these results support going for it over punting or kicking a FG anywhere on your opponent's side of the field if a team is trying to maximize its expected points. This would also support the idea that NFL coaches are too conservative, or risk averse, because they oftentimes choose to punt or kick a field goal despite the fact that the more aggressive play is expected to yield more points. However, herein lies both the single biggest flaw in past analysis and the conclusion of our study.

Past models have assumed that NFL coaches should be risk neutral. The reason for this assumption is to test NFL coaching decisions against the Efficient Market Hypothesis. While this assumption may work well, or even be necessary, for previous studies, it does not work well for the real world, where NFL coaches should decidedly have some degree of risk aversion or tolerance. For instance, a coach with a better team should be risk averse to minimize the chances that he or she gives away an easy touchdown to an inferior opponent. Conversely, a coach with an inferior team should have some risk tolerance and go for it more often in the hopes of capturing some upside volatility in expected points. This idea is referred to as a David Strategy and has been popularized by Malcolm Gladwell.

Based on our analysis, it is obvious why NFL coaches would be risk averse. Starting from around the opponent's 35 yard line, the marginal decrease in tail risk from kicking a field goal is far greater than the marginal increase in expected points a team sacrifices by not going for it. Obviously, a coach's risk aversion should be scenario specific, and he or she should maximize expected points subject to the worst CVAR he or she is willing to accept.
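That decision rule can be sketched as a small constrained choice: drop any option whose CVAR25% falls below the coach's floor, then take the highest expected points among what remains. The numbers below are hypothetical values for a single field position, not readings from our charts, and the function and field names are assumptions.

```python
def best_decision(options, cvar_floor):
    """Pick the option with the highest expected points whose CVAR25% is
    at or above `cvar_floor`. Returns None if no option qualifies."""
    allowed = {
        name: stats for name, stats in options.items()
        if stats["cvar25"] >= cvar_floor
    }
    if not allowed:
        return None
    return max(allowed, key=lambda name: allowed[name]["expected_points"])


# Hypothetical values for one field position; not taken from the charts.
options = {
    "go for it":  {"expected_points": 2.1, "cvar25": -2.5},
    "field goal": {"expected_points": 1.8, "cvar25": -0.8},
    "punt":       {"expected_points": -0.4, "cvar25": -1.5},
}

print(best_decision(options, cvar_floor=-3.0))  # go for it
print(best_decision(options, cvar_floor=-1.0))  # field goal
```

A risk tolerant coach (floor of -3.0) goes for it, while a coach who will not accept a tail worse than -1.0 kicks the field goal, giving up 0.3 expected points in this made-up example in exchange for a much milder worst case.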

Like many other advanced NFL statistics sites, we conclude that going for it on 4th down often leads to higher expected point values than simply kicking a field goal. However, for answering the question "When should teams go for it on 4th down?", we do not believe that expected points alone is the best KPI (Key Performance Indicator). We believe that coaches should make an expected point maximizing decision for a given CVAR25% constraint, or put another way, make a point maximizing decision for how much they are willing to risk. Ultimately, we believe that "going for it" is scenario specific and that NFL coaches are not overly conservative, but rather, appropriately conservative in their decision making.