After more than 50 million votes cast in 100-plus nominating contests since early February, the U.S. presidential primary season is over and each major party finally has its presumptive nominee. Now, as the country prepares for the race between Democrat Hillary Clinton and Republican Donald Trump, and braces for a flood of polling and prognostication about who will win, how might we begin to separate the signal from the noise and avoid failed predictions, like the majority of Brexit polls that mistakenly saw Britain remaining in the European Union?
To find out which outlets did the best job of predicting the primary winners in this wild cycle, Bloomberg Politics examined hundreds of polls as well as nearly 300 final projections from four predictors across a big set of 90 nominating contests and a small set of 52 contests that every predictive source weighed in on.
The verdict: a tie, when looking at the small set, between PredictWise, an aggregator of betting-market data, and FiveThirtyEight, whose "polls-plus" prediction model includes statewide polls, national polls, and endorsements. Both were right 92 percent of the time across those 52 primaries and caucuses.
However, PredictWise's sheer comprehensiveness (covering all 90 contests in the big set, with 86 percent accuracy) means it edges out FiveThirtyEight (59 contests, with 89 percent accuracy) for the top spot in our ranking. These outlets are followed, in descending order, by Bing Predicts, which combines prediction market data, polling, internet queries, and social media posts; RealClearPolitics, an aggregator of statewide polls; and, finally, by the combined set of roughly 400 raw poll results we analyzed.
1. PredictWise
Small set accuracy rank: 1 (tied). Big set accuracy rank: 3. Comprehensiveness rank: 1 (tied).
This research project, run by Microsoft Research economist David Rothschild, is arguably the top-performing predictor in the Bloomberg Politics analysis. Not only did it match FiveThirtyEight's 92 percent track record for the small set of 52 nominating contests covered by all the predictive sources, it was also just as accurate across the slightly larger set of 59 nominating contests for which FiveThirtyEight produced a polls-plus forecast.
PredictWise nosed ahead of FiveThirtyEight in this ranking by correctly forecasting 24 out of an additional 31 contests, including many caucuses where little or no polling data was available. This ultimately gave the site the best caucus track record: 75 percent, versus 67 percent for RealClearPolitics, FiveThirtyEight and Bing. Bing was the only other site to make predictions for all 90 big set nominating contests.
2. FiveThirtyEight
Small set accuracy rank: 1 (tied). Big set accuracy rank: 1. Comprehensiveness rank: 4.
The website run by former New York Times stats guru Nate Silver produced two forecasts for the primaries: one based on just state polls, the other expanded to include national polls and endorsements.
The Bloomberg Politics analysis looked only at the "polls-plus" model, which tied with PredictWise for the highest accuracy rate, 92 percent, in the small set of 52 nominating contests. FiveThirtyEight's model was also far more accurate than the raw polls when projecting the winners' vote margins, missing by around 4.3 percentage points on average, compared with 9.5 percentage points on average across the almost 400 raw polls analyzed.
3. Bing Predicts
Small set accuracy rank: 3 (tied). Big set accuracy rank: 5. Comprehensiveness rank: 1 (tied).
The Microsoft search engine's prediction model is one of just two predictors, along with PredictWise, to cover all 90 nominating contests in the Bloomberg Politics big set analysis. It correctly called the winner in 16 of 24 caucuses and 59 of 66 primaries, with 67 percent and 89 percent accuracy, respectively. This disparity between primary-prediction accuracy and caucus-prediction accuracy is evident across all four predictors, and speaks to the relative difficulty of forecasting caucuses, with their complicated communal dynamics, esoteric rules and general lack of polling data.
Despite the breadth of its coverage, Bing is tied for the second-lowest accuracy rate in the small set, and has the lowest accuracy rate of any predictor across the big set of contests that each covered.
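For readers who want to check the arithmetic, the Bing percentages quoted above can be reproduced from the raw tallies. The following is a minimal sketch, not the method Bloomberg Politics used. The function name `accuracy_pct` is a hypothetical helper, and rounding to the nearest whole percent is an assumption that happens to match the published figures. The combined figure is derived here from the two tallies and is not stated in the analysis itself.

```python
def accuracy_pct(correct: int, total: int) -> int:
    """Share of contests called correctly, rounded to a whole percent.

    Hypothetical helper for illustration; rounding rule is an assumption.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * correct / total)


# Bing Predicts tallies as quoted in the article: (correct calls, contests).
caucuses = (16, 24)
primaries = (59, 66)

caucus_pct = accuracy_pct(*caucuses)
primary_pct = accuracy_pct(*primaries)

# Combined rate over all 90 big set contests, derived from the two tallies.
overall_pct = accuracy_pct(
    caucuses[0] + primaries[0],
    caucuses[1] + primaries[1],
)

print(caucus_pct, primary_pct, overall_pct)
```

The same helper applies to any of the other tallies in the analysis, such as PredictWise's 24 correct calls out of 31 additional contests.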
4. RealClearPolitics
Small set accuracy rank: 3 (tied). Big set accuracy rank: 2. Comprehensiveness rank: 5.
Aggregations of polls, including the ones by RealClearPolitics or FiveThirtyEight, use polls' timeliness or previous accuracy to help smooth out methodological concerns -- such as small sample sizes and historical biases -- in order to better allow trends to emerge. The final poll averages published by RealClearPolitics, for instance, correctly predicted the winner around 90 percent of the time across the small set of nominating contests, compared with the 85 percent accuracy of the 370 final polls published for those same primaries and caucuses.
The aggregated model was also more accurate than the raw polls in predicting the winner's vote margin, missing by 6.6 points on the Democratic side and by 3.8 points on the Republican side, on average.
5. Raw polls
Small set accuracy rank: 5. Big set accuracy rank: 4. Comprehensiveness rank: 3.
While polling failures have generated no shortage of headlines worldwide, most recently for inaccurately predicting that the U.K. would remain in the European Union, both our earlier examination of polls in May and this more complete study of 396 individual poll predictions show that this cycle's U.S. election polls remain largely accurate, particularly when taken in aggregate.
Looking more closely at the 12 individual pollsters who published qualifying polls in at least 10 nominating contests, we see five that accurately predicted the winner at least 90 percent of the time, beating the overall accuracy of any of the four predictors in our study. These include Gravis Marketing, Emerson College, Opinion Savvy, the Marist College/NBC News/Wall Street Journal polls, as well as online pollster SurveyMonkey, which is noteworthy, given recurring questions about the accuracy of online polling.
Four more, including YouGov and Public Policy Polling, the most prolific pollster -- covering 37 of 90 nominating contests in the analysis -- were right at least 80 percent of the time. The real under-performers in the group, at around 70 percent, were CNN/Opinion Research Corp. and the American Research Group, which have been awarded grades of B+ and C, respectively, by FiveThirtyEight.com.
One clear takeaway: In 2016, when polling missed, it missed big. Of the 61 incorrect individual poll projections (out of 396 total), more than half were clustered in just three states. In Iowa, the first-in-the-nation caucus, Ted Cruz's superior organization helped create a surprise win for the Texas senator. In Michigan and Indiana, polls' methodological issues in determining the likely primary electorate might have obscured the ultimate victories for Vermont Sen. Bernie Sanders in both states. In Michigan, in particular, surveys were designed to match the 2008 turnout, even though then-candidate Barack Obama skipped the state after it was sanctioned by the Democratic National Committee for moving its primary too early in the calendar.
When taken all together, the 396 poll projections in the analysis missed the eventual winner's vote share by 10.3 points on the Democratic side, on average, and by 8.6 points on the Republican side. They particularly underestimated Sanders (off by 13.9 points on average in states he won) and Cruz (off by 12.1 points on average in states he won).
With five months still to go until Election Day, polls and prediction markets will undoubtedly see frequent shifts and swings in reaction to breaking news events (such as Brexit or the latest Trump pronouncement) as well as the messaging and strategy of the presidential campaigns themselves. While not necessarily indicative of the eventual outcome this far out, our Bloomberg Politics analysis is a reminder that all these sources remain useful barometers of the public's present-day mood and of the longer-term demographic and thematic trends underlying this unexpectedly competitive election cycle. It also suggests that come the final stretch, some of these predictors may be worth a closer look than others.
(Note: This Bloomberg Politics analysis includes 90 nominating contests covering all 50 U.S. states and the District of Columbia, but does not include U.S. territories or Republican contests in states whose delegates weren't bound to a popular vote -- Colorado, North Dakota and Wyoming -- or that voted after May 4, when Trump's last rival dropped out -- California, Montana, Nebraska, New Jersey, New Mexico, Oregon, South Dakota, Washington and West Virginia. This Bloomberg Politics poll analysis only includes a pollster's final forecast for any of the above nominating contests if published within one month of each respective primary or caucus, as collected by RealClearPolitics and HuffPost Pollster.)