All the data in the world can't eliminate uncertainty—yet.
Early on Saturday, Nate Silver, renowned poll analyst and editor of FiveThirtyEight.com, unleashed a Twitter tirade directed at the Huffington Post. His target was an article that criticized Silver’s use of a modeling technique known as trend line adjustment, which revises poll results in light of each poll’s changes over time.
HuffPo Washington Bureau Chief Ryan Grim characterized such adjustments as “merely political punditry dressed up as sophisticated mathematical thinking.” In part, Grim seemed motivated to assert the primacy of the resolutely left-leaning HuffPo’s own election projection, which gives Hillary Clinton a 98% chance of winning, over FiveThirtyEight’s, which paints the election as much closer, giving Clinton only a 64.9% shot.
Silver responded yesterday with a long series of tweets, including several that point to an important lesson for business:
And yes, even that last one is important. Because this isn’t ultimately a debate about who’s ‘right’ and who’s ‘wrong’—it’s a debate about how we think about statistical prediction itself.
In industry, big data has been a buzzword for years now, with existing and potential applications including fraud detection, securities trading automation, supply chain management, and market trend prediction. All of these, in various ways, attempt to use current data to predict the future.
But Silver, despite his position as the reigning king of predictive analytics, is in essence calling for a very cautious, even skeptical, approach to its applications. Numbers, he is arguing, are not magically correct because they’re numbers—they can be incomplete or wrong. Silver’s trend line adjustments are, at least in part, an attempt to account for errors and noise in data by smoothing them against a historical pattern.
But even that historical data is limited, with thorough, modern polls going back only to the 1970s. The same can be said for a lot of business data. As Silver points out, the housing market bets that led to the 2008 collapse were based on a very shallow pool of data about default rates on high-risk mortgages.
That limited data pool is one possible reason that Silver has come up with a model that bakes in a great deal of uncertainty—in modeling terms, it’s conservative in its predictions. Giving Donald Trump a 35% chance at the White House, even with a lot of national polls showing a big Clinton lead, is, in essence, a message that the world is very complex and unpredictable.
That seems to frustrate people who don’t like uncertainty—which, right now, includes a lot of Democrats like Grim.
And of course, uncertainty isn’t a big selling point for big data applications in business, either. The entire point of data analytics has always been to make decisions easier—or even, as in the case of trading algorithms, take decision-making out of human hands entirely. And maybe, as data pools get deeper and we have longer and more robust historical patterns, the numbers will speak to us more clearly, about everything from polling to disaster risk to people’s taste in toothpaste.
But until that day comes, both Silver and Grim are at least partly right—but mostly Silver. Numbers don’t simply speak for themselves, and interpreting them still leaves room for human fallibility, and even bias. And building models that reflect uncertainty isn’t a crime. Instead, those models merely reflect reality.
The real question is whether we’re comfortable acknowledging that.