
The limits of Big Data

October 10, 2012, 4:19 PM UTC

Stop me if you’ve heard this one: Three statisticians go rabbit hunting. They spot a rabbit. The first statistician shoots. He misses the rabbit’s head by a foot. The second statistician fires; misses the rabbit’s tail by a foot. The third statistician cries out, “We got him!”

Even if you don’t find this joke remotely amusing, you’ve probably worked with exactly the kind of managerial rabbit hunters it describes. Their math may be impeccable but their real-world results, alas, are rubbish. Lies, damned lies, etc. What must organizations know to improve the odds that their quants will deliver real value instead of statistical illusions? How can stochastically innumerate executives be sure they’re not being bamboozled by Big Data?

Excellent answers can be found in Samuel Arbesman’s The Half-Life of Facts and Nate Silver’s The Signal and The Noise, two distinct but complementary efforts that explore how “data” become “evidence” and why so many sophisticated mathematical models fail so spectacularly at distinguishing the two. The books embrace and extend the themes of uncertainty and quantitative self-deception articulated by Nassim Taleb’s popular and insightful Fooled By Randomness and The Black Swan, as well as Nobel laureate Daniel Kahneman’s superior Thinking, Fast and Slow. Like their precursors, Arbesman and Silver have produced entertainingly actionable books.

Both authors cite the cynically apt line (variously attributed to Mark Twain, Will Rogers and Charles Kettering) that “It ain’t so much the things we don’t know that get us into trouble. It’s the things we know that just ain’t so.” Both discuss the media and mechanisms used to distinguish between “real” knowledge and “ain’t so’s.” Arbesman and Silver both argue persuasively that the “ain’t so’s” are winning. The more data you deal with, the more attention that case deserves.

Arbesman, an applied mathematician and fellow at Harvard’s Institute for Quantitative Social Science, deconstructs what it means to be a fact. He mercifully avoids getting bogged down in post-modernist philosophy. Instead he explores how serious scientists attempt to nail down what it is they think they know about what they’re studying. This “scientometric” approach — the science of how science measures its process and progress — proves extraordinarily helpful in identifying the lifecycles and ecosystems of what scientists call “facts.” This approach allows Arbesman to ask intriguing questions, such as: How are “facts” born? How do they typically replicate, mutate and evolve? How long do they take to die?

The provocative core of Arbesman’s argument is that there is a virtual physics of facts. Depending upon how they’re defined and measured, “facts” follow defined laws and trajectories. “Every day that we read the news we have the possibility of being confronted with a fact about our world that is wildly different from what we thought we knew,” he writes. “…But it turns out that these rapid changes, while true phase transitions in our knowledge, are not unexpected or random. We understand how they behave in the aggregate, through the use of probability, but we can also predict these changes by searching for the slower, regular changes in our knowledge that underlie them. Fast changes in facts, just like everything else we’ve seen, have an order to them. One that is measurable and predictable.”

What do we mean by “measurable” and “predictable?” Arbesman is quite good at describing the institutional, individual and probabilistic biases that skew how both science and scientists assess, publish and extinguish “facts.”

“The clearest example of this is in the world of negative results,” Arbesman writes. He cites evolutionary biologist John Maynard Smith, who noted that “statistics is the science that lets you do twenty experiments a year and publish one false result in Nature.” Arbesman continues: “However, if it were one experiment being replicated by twenty separate scientists, nineteen of those would be a bust, with nineteen careers unable to move forward. Annoying, certainly … but that’s how science operates. Most ideas and experiments are unsuccessful. But crucially, unsuccessful results are rarely published.”
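The arithmetic behind Maynard Smith’s quip can be sketched in a few lines. This is an illustration only, and it rests on an assumption the quote does not state: that each experiment tests an effect that does not exist and uses the conventional 5% significance threshold (the constant `ALPHA` below). Under that assumption, twenty such experiments yield one false positive on average, and the chance of getting at least one is roughly 64%.

```python
import random

ALPHA = 0.05         # assumed significance threshold; not stated in the quote
N_EXPERIMENTS = 20   # "twenty experiments a year"


def expected_false_positives(n=N_EXPERIMENTS, alpha=ALPHA):
    """Average number of spurious 'significant' results when no effect is real."""
    return n * alpha


def prob_at_least_one(n=N_EXPERIMENTS, alpha=ALPHA):
    """Chance that at least one of n independent null experiments looks significant."""
    return 1 - (1 - alpha) ** n


def simulate(trials=20_000, n=N_EXPERIMENTS, alpha=ALPHA, seed=0):
    """Monte Carlo check: under the null, each experiment 'succeeds' with probability alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if any(rng.random() < alpha for _ in range(n)):
            hits += 1
    return hits / trials


print("expected false positives:", expected_false_positives())
print("P(at least one):", round(prob_at_least_one(), 3))
print("simulated:", round(simulate(), 3))
```

If only the “successful” run is written up, a reader sees a single significant result and has no way to know about the nineteen that failed.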

The point is not that the science of statistics or the statistics of science are pathologically flawed but that known pathologies and flaws can create incentives to rethink, revise and redesign what we measure and test. We need “facts” to help us renew our insights and understandings about “facts.” Science — and the increasingly digital technologies that both drive and support it — offers a powerful model for enterprises struggling to make sense of and add value to their growing mountains of data.

In that respect, The Half-Life of Facts offers a pop science primer on the epidemiology of epistemology — that is, the process by which ideas about the nature of knowledge and knowing spread throughout a discipline, a profession and a culture. Arbesman’s work challenges decision-makers worldwide to rethink how they want their organizations to turn intriguing data into useful facts.

Silver, a statistician who writes the FiveThirtyEight blog for the New York Times site, takes a different but compatible approach to knowledge, fact, and predictability. Almost overstuffed with detailed examples and vignettes, his book delivers a sobering portfolio of warnings about predictive hubris. “This book is less about what we know,” Silver writes, “than about the difference between what we know and what we think we know.”

From weather to earthquakes to global warming to football to subprime mortgages to the global financial crisis, Silver explains how modelers and forecasters struggle to convert yesterday’s data into tomorrow’s “you can bet on it” predictions. These miniature case studies, while necessarily superficial, don’t shy away from the math and consistently take a fair-minded view of the most important assumptions. A better editor might have pushed Silver to sacrifice quantity for keener insight, but the breadth of examples undeniably reveals a “pathology of prediction.”

Where Arbesman’s unit of analysis is the fact, Silver focuses on “predictive validity.” He has the good grace and self-awareness to accept human frailty as a design constraint. “But I’m of the view we can never achieve perfect objectivity, rationality or accuracy in our beliefs,” Silver writes. “Instead we can strive to be less subjective, less irrational and less wrong. [emphasis in original] Making predictions based on our beliefs is the best (and perhaps only) way to test ourselves. If objectivity is the concern for a greater truth beyond our personal circumstances, and prediction is the best way to examine how closely aligned our personal perceptions are with that greater truth, the most objective among us are those who make the most accurate predictions.”

I wonder, however, if Silver is fully aware of the cumulative effect his mix of cautionary tales and shocking failures might have on readers who take his reporting to heart and mind. He provides example after example of flawed and biased human beings building flawed and biased models and using them in flawed and biased ways. He provides a superb riff on “overfitting” in statistical models. By trying to get their models to fit the data a little too well, Silver explains, statisticians all too frequently end up making them far less accurate and reliable for prediction.
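A toy version of the overfitting trap might look like the sketch below. It is not taken from Silver’s book: the data are synthetic (a straight line plus random noise), and the two models, a least-squares line and a “memorizer” that simply returns the outcome of the nearest past observation, are hypothetical stand-ins chosen for illustration. The memorizer fits the historical data perfectly, yet on fresh data it tends to do worse than the plain line because it has memorized the noise along with the signal.

```python
import random


def make_data(n, rng, slope=2.0, noise=1.0):
    """Synthetic observations: y = slope * x + Gaussian noise (assumed setup)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, noise) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least-squares straight line, computed in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_memorizer(xs, ys):
    """Overfit model: predict the y of the closest training x, noise and all."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared prediction error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run(seed=0):
    rng = random.Random(seed)
    train = make_data(50, rng)     # "yesterday's data"
    test = make_data(1000, rng)    # fresh data the models never saw
    results = {}
    for name, fit in (("line", fit_line), ("memorizer", fit_memorizer)):
        model = fit(*train)
        results[name] = (mse(model, *train), mse(model, *test))
    return results


for name, (train_err, test_err) in run().items():
    print(f"{name:10s} train error {train_err:.2f}   test error {test_err:.2f}")
```

The number to watch is the gap between the two columns: a model that looks flawless on the data it was built from can still be the weaker forecaster.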

To the extent that Silver’s stories present a fair sampling of today’s predictive modelers, this book predicts neither a happy nor brave new world of statistics-driven success. In this world, average performance may prove to be quite a few standard deviations away from world class.

Silver cites Philip Tetlock’s classic study of expertise, which shows that “experts” in a disconcerting number of disciplines are disproportionately worse than chance at predicting likely outcomes. Experts also tend to be disproportionately overconfident about the quality of their predictions. In short, expertise frequently yields the worst of both worlds: wrong answers stewed in arrogance. This is not a recipe for success.

Between IBM’s Jeopardy-winning Watson, Google’s search algorithms and Amazon’s recommendation engines, there’s no doubt that data-driven computational systems can enjoy remarkable success, particularly when they focus on real-life testing rather than abstract theory. “Companies that really ‘get’ Big Data, like Google, aren’t spending a lot of time in model land,” Silver writes. “They’re running hundreds of thousands of experiments every year and testing their ideas on real customers.”

The ironic takeaway from both these fine books, however, is that the more data and facts one has, and the more predictions matter, the more important human judgment becomes. The co-evolution of human beings, datasets, and algorithms will ultimately determine whether Big Data creates new wealth or destroys old value.

A research fellow at MIT Sloan School’s Center for Digital Business, Michael Schrage is a former Fortune columnist and the author of Who Do You Want Your Customers To Become, which we reviewed this week.