by Elizabeth Fournier
For many, the results of the 2016 election came as a shock. Few expected that Hillary Clinton, a candidate with 30+ years of political experience, would ultimately fall to Donald Trump, 306 electoral votes to 232. In the days leading up to the election, many of us looked at polls to anticipate what would happen, and were largely misled. This article outlines the main factors that led polls to underestimate Donald Trump’s support.
Almost all polls were wrong about the outcome of this election. They had Clinton up by a significant margin, even at 5:00 pm on Election Day. To be fair, they did show that numbers would be close in swing states like Florida and North Carolina, and Clinton did win the popular vote. But still, Rothenberg & Gonzales, Nate Silver’s FiveThirtyEight, Sabato, Trafalgar Group, The Associated Press, The NY Times, and many other groups incorrectly predicted the outcome of this election.
Although they used different methodologies and sample sizes and produced slightly different results, these polling services made many of the same mistakes. Pollsters overestimated Clinton’s support among minorities, underestimated Trump’s support among white voters (especially college-educated women), and didn’t anticipate Trump’s ability to pull late support from Independents and undecided Republicans. In the Rust Belt in particular, which includes states like Ohio, Michigan and Pennsylvania that Trump needed to win but was expected to lose, polls underestimated Trump’s margin by at least 4 points.
To see what all these pollsters did wrong, we can look to the one poll that got it right. The USC/LA Times tracking poll had Trump as the expected winner of this election throughout the last few months of the campaign. Their methodology was based on an internet survey, and they made the unique decision to ask voters how comfortable they were with talking to people about their vote. This question may have alleviated the problem of dishonest poll responses, since it appears that Trump supporters were less likely to say that they were Trump supporters than Clinton supporters were to say that they were Clinton supporters. Female Trump supporters were “particularly less likely to say that they would be comfortable talking to a pollster about their vote.”
Another potential problem may have been the dearth of last-minute polls. Concerns about timing are especially valid considering FBI Director James Comey’s decision to review additional Clinton emails just a few days before the election. But still, polls largely showed Clinton gaining momentum in the final days, not Trump.
A more likely source of error was low response rates. Although this is not a new challenge, pollsters’ usual technique for dealing with low response rates may have been problematic in this election. Pollsters have typically weighted groups of respondents differently to try to best represent the overall population. If women or young voters or Hispanic voters pick up the phone less often, pollsters can weight the few responses they do get from these groups more heavily. However, the response rates in this election may have been determined by factors that pollsters didn’t measure, or factors that are not easy to adjust for. If large groups of voters (college-educated women, for example) were less likely to be truthful about who they were voting for, pollsters may have unintentionally weighted respondents incorrectly. They also may have under-sampled certain groups, like non-college-educated white people, to begin with, exacerbating this problem.
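To make the weighting idea concrete, here is a minimal sketch in Python of one common approach, often called post-stratification. It is an illustration, not a description of any particular pollster's method. The function names, the group labels ("college" and "non_college") and every number in it are hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Return one weight per group so the weighted sample matches
    the (assumed known) population share of each group."""
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / total
        # Under-represented groups get weights above 1, over-represented below 1.
        weights[group] = population_shares[group] / sample_share
    return weights


def weighted_support(responses, weights):
    """Weighted share of respondents backing a candidate.

    responses: list of (group, supports_candidate) pairs.
    """
    weighted_yes = sum(weights[group] for group, yes in responses if yes)
    weighted_all = sum(weights[group] for group, _ in responses)
    return weighted_yes / weighted_all


# Hypothetical poll: 80 college respondents (40 back the candidate)
# and 20 non-college respondents (5 back the candidate).
responses = (
    [("college", True)] * 40 + [("college", False)] * 40
    + [("non_college", True)] * 5 + [("non_college", False)] * 15
)
sample_counts = {"college": 80, "non_college": 20}
# Hypothetical assumption: each group is half of the electorate.
population_shares = {"college": 0.5, "non_college": 0.5}

weights = poststratification_weights(sample_counts, population_shares)
raw = sum(1 for _, yes in responses if yes) / len(responses)
adjusted = weighted_support(responses, weights)
print(f"raw: {raw:.3f}, weighted: {adjusted:.3f}")
```

The adjustment only corrects for how many people in each group responded. If the people who respond within a group differ from those who don't, or if respondents aren't truthful, the weighted estimate carries that error along, which is the concern described above.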
In addition, pollsters may have misjudged the likelihood that certain people would vote. This kind of error is particularly salient when it comes to groups that have been unlikely to vote in the past. Pollsters may have made the mistake of assuming that people who didn’t vote in 2012 wouldn’t vote this time around, and therefore under-sampled these groups or ignored them altogether.
Confirmation bias also may have played a role in how poll results were interpreted. As The Atlantic’s Vann Newkirk asks, we should consider whether we “all believe[d] Clinton would win because of bad data,” or whether we “ignore[d] bad data because we believed Clinton would win.” Pollsters may have unintentionally constructed polls to reflect their predictions, and not to reflect public opinion.
We don’t know for sure why polls were so wrong, and probably won’t know for a while. It’s likely that many factors played a role, and we can speculate as to what they were, but we don’t have definitive answers yet. Pollsters need time to analyze their mistakes, especially in key states like Michigan and Wisconsin that Clinton was expected to win easily. If they want to maintain their relevance in elections going forward, pollsters need to make sure that they don’t make the same mistakes again.