NOTE: This post originally appeared on Scientific American’s blog network on May 12, 2013. http://blogs.scientificamerican.com/cocktail-party-physics/2013/05/12/idol-tweets-mapping-the-social-space-with-twitter/
Many viewers who tuned in to American Idol on April 4th expected the dismissal of Lazaro Arbos, a likeable young man with an endearing stutter but marginal talent and an unfortunate tendency to forget lyrics. They were stunned when Burnell Taylor was eliminated instead. Arbos inexplicably wound up in the top three of the remaining contestants, despite a reedy, squirm-inducing rendition of Queen’s “We Are the Champions” the night before.
Perhaps those viewers should have been monitoring the Twitter chatter more carefully. American Idol has never been exclusively about rewarding merit; it is the voters, not the judges, who have the final word. I call it the “Squee Factor.” Arbos’s inspiring personal story won him a rabid fan base, dubbed “Lazzies,” who choked the social networking feed each week with outpourings of devotion, urging each other to save him from elimination. According to Alessandro Vespignani, a computational physicist at Northeastern University, all those tweets collectively contain enough information to predict which contestants are likely to be eliminated.
It’s Person of Interest for the social media set. Vespignani is among those scientists specializing in what’s known as big data analytics: mining the huge amounts of personal information we reveal online to build demographic profiles, the better to target advertising or improve train scheduling, among other uses. Vespignani is harnessing the power of social networking to model the spread of disease outbreaks, stock market behavior, collective social dynamics, and election outcomes — or voting behavior for American Idol. And Twitter is his current favorite tool.
Modeling these kinds of complex systems is a bit like trying to predict the weather. It’s well nigh impossible to accurately forecast weather conditions beyond ten days, given the large number of factors that can influence the outcome. Even then, “The better the data you gather about the climate at a certain moment, the better your predictions will be about the future,” said Vespignani. The same holds true for his social networking models.
Cell phones have been a mainstay of studying social phenomena for many years. With their GPS tracking components and call logs, they make fantastic behavioral “sensors,” providing a far more accurate record than random surveys or asking people to record their own behavior in a diary—the traditional methodology for such studies.
But exploiting cell phone data for predictive models is soooo 2008. Twitter brings data collection to a whole new level. “Twitter is not just where you go; it is what you think about politics, about society, about who you think will win American Idol, what we call social phenomena,” said Vespignani. “What we can do now is map the social space.” His lab collects hundreds of millions of tweets every day, posted by several million users, giving him a vastly larger sample size.
It is no easy feat sifting through huge amounts of raw unstructured data to find those proverbial needles in haystacks. Fortunately, “physics has taught us a lot about what to do with big data,” said Vespignani. His primary filter is vocabulary. Much like physicists at the Large Hadron Collider sift through the detritus of billions of collisions between elementary particles to pick out the unique signature of a Higgs boson, Vespignani sifts through millions of tweets looking for the most relevant words to whatever system he is trying to model. That makes Twitter a kind of social collider.
“Ninety percent of the things we do, we are predictable like atoms,” said Vespignani, although he is quick to clarify that he meant this solely in a statistical sense. “When you do weather forecasting, you are not predicting the motion of a water molecule or an oxygen atom,” he said. “This is what we can predict: social collective phenomena, not the behavior of single individuals.” Ultimately, his goal is to model how consensus of opinion forms, and how ideas and viruses spread through a population (whether online or off).
Last year, during the eleventh season of American Idol, Vespignani and his students analyzed the Twitter activity during the voting period for nine episodes leading up to the finale, and found that the number of tweets that mentioned a specific contestant during each voting period was a good indicator of how many votes that contestant received. This made it possible to predict which contestants were most likely to be eliminated each week. The two finalists that season were Jessica Sanchez and Phillip Phillips.
Vespignani’s group broke down the Twitter activity even further by geographical region—a subset that proved to make a crucial difference in the predictions. The initial analysis favored Sanchez as the ultimate winner of the competition.
But the subset revealed that Sanchez had many fans outside the U.S., particularly in the Philippines, and those fans were not eligible to vote. When the group adjusted their analysis to exclude tweets from outside the U.S., the model showed Phillips in the lead. And Phillips did win the title, with Sanchez as runner-up.
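For the curious, here is a minimal Python sketch of that idea: count the tweets that mention each contestant, then repeat the count using only tweets from eligible voters. It is an illustration under stated assumptions, not the team's actual pipeline. The sample tweets, the country codes, and the helper names `count_mentions` and `leader` are all hypothetical, and simple substring matching stands in for whatever vocabulary filter the group really used.

```python
from collections import Counter

# Hypothetical toy data: (tweet text, country code). Not real tweets.
TWEETS = [
    ("Vote for Jessica Sanchez tonight!", "PH"),
    ("Jessica Sanchez was amazing", "PH"),
    ("Jessica Sanchez all the way", "PH"),
    ("Jessica Sanchez nailed it", "US"),
    ("Phillip Phillips has my vote", "US"),
    ("Phillip Phillips for the win", "US"),
    ("Team Phillip Phillips", "US"),
]

CONTESTANTS = ["Jessica Sanchez", "Phillip Phillips"]


def count_mentions(tweets, contestants, countries=None):
    """Count tweets mentioning each contestant.

    If `countries` is given, only tweets from those country codes count.
    """
    counts = Counter({name: 0 for name in contestants})
    for text, country in tweets:
        if countries is not None and country not in countries:
            continue  # skip tweets from users who cannot vote
        lowered = text.lower()
        for name in contestants:
            if name.lower() in lowered:
                counts[name] += 1
    return counts


def leader(counts):
    """Return the contestant with the most mentions."""
    return max(counts, key=counts.get)


all_counts = count_mentions(TWEETS, CONTESTANTS)
us_counts = count_mentions(TWEETS, CONTESTANTS, countries={"US"})

print("All tweets:", dict(all_counts), "leader:", leader(all_counts))
print("U.S. only: ", dict(us_counts), "leader:", leader(us_counts))
```

In this made-up sample, the worldwide count favors one finalist while the U.S.-only count favors the other, which is the kind of flip the geographic breakdown exposed.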
American Idol served as a handy test case, because it is a relatively simple model. “We used American Idol because we figured if we could not do predictions there, we would not be able to make predictions anywhere else,” said Vespignani.
Granted, even Twitter has its limitations as a predictive tool, because it represents just a fraction of potential voters. For every rabid “Lazzie” who casts multiple votes, there are a million passive viewers (like me) who never bother to vote at all. Then again, we’re unlikely to be airing our Idol views on Twitter. Vespignani argued that despite this relatively small sample set, the people tweeting about the competition are doing so because they are fans of the show, and hence are the most likely to vote for their favorite contestant. This makes it much easier to identify likely voters and classify them according to their preferences.
Doing the same for political elections is much more complicated. Case in point: the recent Italian elections held in late February. “It is a country that is going completely nuts,” said Vespignani of his native land. “It’s one of the few countries where people don’t tell the truth about their opinions.” That makes it more difficult to identify and classify users according to their political preferences, on top of the usual biases that inevitably creep in with regard to geography and so forth.
Technically, Vespignani’s team looked at the raw signal, rather than making explicit predictions. Still, with the exception of disgraced politician Silvio Berlusconi winning a shocking thirty percent of the vote (an outcome none of the official polls predicted), the group’s model matched the election results pretty well—better than the standard electoral polls, in fact, and at a fraction of the cost.
Of course, the “Squee Factor” can only take you so far on American Idol (or elsewhere). Arbos’ luck ran out the very next week, when he was sent home after butchering his cover of The Carpenters’ “Close To You,” in which he couldn’t manage a simple key change. It was so bad even the genial Randy Jackson admitted, on live TV: “You know that I love you, the person…. But all I can say is, ‘No, no, no, NO.’ That was horrible.”
Last week, Candice Glover and Kree Harrison, both of whom have given consistently solid performances this season, emerged as the top two finalists. But Vespignani’s team won’t be analyzing Twitter to predict which of them will win this season’s competition. Last year’s exercise served its academic purpose, with a published paper and a presentation in March at the American Physical Society meeting in Baltimore. Besides, “this year the show is not as fun,” he said.
Personally, I’m rooting for Candice. But I’m not getting my hopes up, for reasons that can be summed up in two words: Melinda Doolittle. She was a contestant for the sixth season of American Idol in 2007, the only other time I’ve watched the show. I happened to be flipping channels, caught her performance of “Nutbush City Limits,” and was blown away. So was the notoriously acerbic Simon Cowell, who took to calling her “My Melinda.”
Twitter was barely one year old, but even without that metric, I was sure Doolittle was a lock for the finale. Week after week, she knocked it out of the park, whether she was asked to perform rock (“Have a Nice Day”), Diana Ross (“Home”), inspirational (“There Will Come a Day”), Motown (“Since You’ve Been Gone,” “I Am a Woman”), or country (“Trouble is a Woman”). On the basis of pure talent, she should have won.
But Doolittle didn’t have the “Squee Factor” — at least, not as much as seventeen-year-old Jordin Sparks, who won the title, and budding heartthrob Blake Lewis, who was runner-up that season. So much for the wisdom of crowds. I was so disgusted I never followed American Idol again — until the last few weeks, when Vespignani’s work gave me a reason to tune in again. But I’m still nursing a grudge from 2007. Sure, Sparks was adorable and talented, but c’mon! She wasn’t in the same league! Check out Doolittle’s killer rendition of “My Funny Valentine,” easily one of the top performances in the competition’s history:
Sigh. Yep. Still bitter. The voters did better this time around.
Ciulla, Fabio et al. (2012) “Beating the News Using Social Media: The Case Study of American Idol,” EPJ Data Science 1:8.
Fumanelli, M. et al. (2012) “Inferring the Structure of Social Contacts from Demographic Data in the Analysis of Infectious Diseases Spread,” PLoS Computational Biology 8: e1002673.
Goncalves, B., Perra, N., and Vespignani, A. (2011) “Modeling Users’ Activity on Twitter Networks: Validation of Dunbar’s Number,” PLoS ONE 6(8): e22656.
Ratkiewicz, J. et al. (2010) “Characterizing and Modeling the Dynamics of Online Popularity,” Physical Review Letters 105: 158701.