Relevant, real-time, accurate, and scalable: 2013 Oscar predictions are a win for predictive science
DavidMRothschild on February 25, 2013 @ 1:33AM
Predicting the Oscars, for me, is not about the Oscars per se, but about the science of predicting. The challenge was to make predictions in all 24 categories, when most forecasters cover only 6. The challenge was to make predictions that move in real time during the period between the nominations and the Oscars, when most predictions are static. The challenge was to make predictions that were accurate, not just in binary correctness, but in calibrated probabilities. The challenge was to make these predictions cost-effective, so that they could not only scale to 24 categories, but be useful for predictions in varying domains. Prediction market data, including Betfair, Hollywood Stock Exchange, and Intrade, combined with some user-generated data from WiseQ, allowed me to meet all of these challenges.
I was able to produce predictions for all 24 categories, expanding down the list through film editing, sound mixing, etc. I showed how these predictions moved in real time during the period between the Oscar nominations and the Oscars. For example, Argo zoomed upward in the best picture and adapted screenplay categories as Zero Dark Thirty plunged in best actress and original screenplay. The predictions were accurate, with 19 of 24 categories correct and the winners in the other 5 categories carrying reasonably high probabilities. Prediction market data and experimental prediction games harnessed the wisdom of the crowds to let me scale easily to all 24 categories. These same data and models will allow me to expand easily to all sorts of domains in the near future.
DavidMRothschild on February 23, 2013 @ 12:27PM
I created my Oscar predictions in real time because real-time movement is an important part of my basic research into predictions, not because I thought the Oscars would provide an interesting domain for movement. I was wrong about that: in category after category, significant movement in the likely winner provides a window into the power of certain events that occurred on the road to the Oscars. These events include regularly scheduled ones, such as awards shows, and idiosyncratic ones, such as prominent commentary on certain movies.
Every prediction I make is in real time, for two reasons. First, real-time predictions provide the most up-to-date forecast whenever the end user needs it. For example, with economic or financial predictions, knowing the likely outcome is an important input to major decisions that happen continuously. Movement is a good thing in predictions, because it demonstrates that the predictions are absorbing new information that affects the outcome being predicted. Second, real-time predictions provide a granular track record for exploring when and why movements occur (i.e., what actually impacts the final outcome). Granular predictions allow me to judge the value of a debate, a big advertising buy, a vice-presidential choice, or an awards show, something that cannot be isolated with less frequent indicators; a sketch of how follows below.
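To make that granular track record concrete, here is a minimal sketch of how one might isolate a single event's impact from a real-time probability series. Everything in it, including the timestamps, the probabilities, and the helper event_effect, is hypothetical for illustration, not part of the actual PredictWise pipeline:

```python
from datetime import datetime, timedelta

# Hypothetical time series of win probabilities, sampled in near real time.
# Timestamps and values are illustrative only, not actual market data.
series = [
    (datetime(2013, 1, 13, 18, 0), 0.35),  # before an awards show
    (datetime(2013, 1, 13, 23, 0), 0.52),  # shortly after the ceremony
    (datetime(2013, 1, 14, 9, 0), 0.55),
]

def event_effect(series, event_time, window=timedelta(hours=12)):
    """Estimate an event's impact as the probability change from the last
    observation before the event to the first observation after it,
    restricted to a window around the event."""
    before = [p for t, p in series if event_time - window <= t < event_time]
    after = [p for t, p in series if event_time < t <= event_time + window]
    if not before or not after:
        return None  # the series is not granular enough to isolate the event
    return after[0] - before[-1]

awards_show = datetime(2013, 1, 13, 20, 0)
print(event_effect(series, awards_show))  # 0.17: a 17-point jump
```

With only weekly or monthly indicators, the before/after window would span several other events and the effect could not be attributed cleanly; granularity is what makes the attribution possible.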
The most obvious movement has been in the best picture category, where Lincoln's original lead collapsed as award show after award show favored Argo. Shortly after the nominations were released, Argo was a distant second to Lincoln, at just 8 percent likely to win. All of those wins have since brought Argo to 93 percent.
This theme carried into the adapted screenplay category, where a commanding lead for Lincoln has become a tight proxy fight with Argo. Our data show a strong positive correlation between the outcomes of these two categories. Lincoln started off with a smaller lead here, 70 percent likely to win best adapted screenplay, and the swing has not been as dramatic, with Argo now leading slightly at 57 percent.
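For readers curious what that correlation means operationally, here is a minimal sketch using hypothetical daily probability series (the numbers are illustrative, not the actual market data). Correlating day-over-day changes, rather than levels, keeps a shared upward trend from masquerading as genuine co-movement:

```python
import statistics

# Hypothetical daily win probabilities for Argo in two categories.
best_picture = [0.08, 0.20, 0.45, 0.70, 0.88, 0.93]
adapted_screenplay = [0.15, 0.22, 0.35, 0.45, 0.52, 0.57]

def pearson(xs, ys):
    """Pearson correlation of two equal-length series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Day-over-day changes in each category's probability.
d_bp = [b - a for a, b in zip(best_picture, best_picture[1:])]
d_as = [b - a for a, b in zip(adapted_screenplay, adapted_screenplay[1:])]
print(round(pearson(d_bp, d_as), 2))  # positive: the categories move together
```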
Sources: Betfair, Hollywood Stock Exchange, Intrade, WiseQ (detailed at PredictWise.com)
Zero Dark Thirty's likelihood has fallen in nearly every one of its strongest categories, including best actress and original screenplay. The implication is that the increased scrutiny of Zero Dark Thirty's depiction of torture will hurt it with voters. Just after the nominations were released, Zero Dark Thirty's Jessica Chastain was a viable 28 percent to win best actress, but that has plummeted to 5 percent over the last few weeks. Similarly, Zero Dark Thirty was 65 percent likely to win best original screenplay, with Django Unchained and Amour distant second and third at about 17 and 13 percent. Today we have Django Unchained leading at 47 percent and Zero Dark Thirty nearly tied with Amour at around 25 percent.
By the time Oscar night concludes, we will have a much richer understanding of the value of awards shows and the cost of negative publicity.
If you think you are a better prognosticator than I, please play the new WiseQ Oscars Game and show me how smart you are!
This column is syndicated with The Huffington Post.
DavidMRothschild on February 22, 2013 @ 12:23PM
I am stunned at the confidence of my predictions for the Oscars, seen in real time here and here. Of the 24 categories in which the Academy of Motion Picture Arts and Sciences will present Oscars live this Sunday, the favorites in eight are 95 percent or more likely to win. Yet despite my concern that no nominee should be that certain of victory, I have no choice but to stick to the data and models. My data and models have proven correct over and over, while hunches and gut checks are prone to failure.
I created and tested these models using historical data and then released them to run prior to the Oscar nominations; I do not make any tweaks to my models once they are live, because I do not want to inadvertently bias my results with considerations of the current predictions. The Oscar predictions rely mainly on de-biased, aggregated prediction markets. This method has proven not just accurate; it has the added benefits of updating in real time and of being scalable enough to provide predictions in all 24 categories. Further, I incorporate user-generated data to help determine the correlations between categories within a movie, which I use to create predictions of the number of Oscars for each movie.
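The post does not spell out how the de-biasing and aggregation work. A common approach in this literature is a power-transform correction for the favorite-longshot bias (market prices understate favorites and overstate longshots) followed by a weighted average across markets. The sketch below assumes exactly that; the exponent, weights, and quotes are hypothetical, not the calibrated values behind these forecasts:

```python
def debias(price, theta=1.3):
    """Correct favorite-longshot bias by pushing prices away from 0.5
    toward the extremes. theta=1 means no correction; this theta is a
    hypothetical stand-in for a historically calibrated value."""
    p = price ** theta
    return p / (p + (1 - price) ** theta)

def aggregate(quotes, weights):
    """Weighted average of de-biased prices from several markets,
    renormalized so the category's nominees sum to one."""
    raw = {
        nominee: sum(w * debias(q[nominee]) for q, w in zip(quotes, weights))
        for nominee in quotes[0]
    }
    total = sum(raw.values())
    return {nominee: p / total for nominee, p in raw.items()}

# Hypothetical quotes for one category from two markets.
market_a = {"Argo": 0.90, "Lincoln": 0.07, "Other": 0.03}
market_b = {"Argo": 0.85, "Lincoln": 0.10, "Other": 0.05}
print(aggregate([market_a, market_b], weights=[0.6, 0.4]))
```

Because the transform and weights are fixed before the markets open, the pipeline needs no per-category judgment calls, which is what makes scaling to all 24 categories cheap.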
The biggest errors in my 2012 election forecasting were painfully obvious to me even as I published my forecasts in mid-February, but I stuck with them. The errors came from the state-by-state predictions of vote share for Massachusetts and Utah. Even as Rick Santorum dominated the national polling for the Republican nomination, I was extremely confident of a Mitt Romney nomination. My model demanded the home state of the Republican nominee, and I supplied Romney's official home state of Massachusetts. "Everyone" knew that he would get a home-state bump in Utah, where he has religious roots and was instrumental in saving the 2002 Winter Games, and not in Massachusetts. But there is no objective data for swapping out the official home state, and making arbitrary model or data changes is bad science and costly; I design my models to be easily scalable to new questions and categories of questions, and I do not want to manually review each individual prediction for extra data. So I proudly overestimated Romney's vote share in Massachusetts (although I still had him losing!) and underestimated his vote share in Utah (although I still had him winning!). While that particular hunch was correct, over time science is much more reliable than hunches. So I am sticking with my increasingly confident predictions going into Oscar night.
Further, while the eight near-certain predictions are very salient, the average prediction is at its exact historical level. The average likelihood of victory for the favorite nominee across the 24 categories is 75 percent, and in five categories the favorite is not even 50 percent likely to win. Thus, if my model is properly calibrated, I should expect to get only about 18 of the 24 categories correct (i.e., 75 percent of them). This is the same average likelihood that the market-based forecasts provided in 2011 and 2012, and in both years roughly 3 of every 4 categories landed correctly.
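The arithmetic behind that expectation: under calibration, each favorite wins with exactly its stated probability, so the expected number of correct calls is simply the sum of the favorites' probabilities. A quick check with a hypothetical distribution matching the description above (eight near-locks, five favorites under 50 percent, and the rest in between; only the 75 percent average comes from the post):

```python
# Hypothetical per-category probabilities for the 24 favorites.
favorite_probs = [0.96] * 8 + [0.45] * 5 + [0.7336] * 11

# Expected correct calls = sum of the favorites' win probabilities.
expected_correct = sum(favorite_probs)
average_prob = expected_correct / len(favorite_probs)
print(round(expected_correct, 1), round(average_prob, 2))  # 18.0 0.75
```

So getting "only" 18 of 24 right would not be a failure of the model; it is exactly what a well-calibrated forecast at these probabilities predicts about itself.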
I leave you with a question: is there any particular likelihood, in any of the 24 categories, against which you would place a large wager? What looks too high and what looks too low to you? I invite you to go on the WiseQ Oscar Game and prove me wrong and you right!
This column is syndicated with The Huffington Post.