When Big Data goes bad

By
Joshua Klein
November 5, 2013, 1:00 PM ET

FORTUNE — Big Data and the cloud are putting supercomputer capabilities into everyone’s hands. But what’s getting lost in the mix is that the tools we use to interpret and apply this tidal wave of information often have a fatal flaw. Much of the data analysis we do rests on erroneous models, meaning mistakes are inevitable. And when our outsized expectations exceed our capacity, the consequences can be dire.

This wouldn’t be such a problem if Big Data weren’t so very, very big. But the sheer amount of data we have access to lets even flawed models produce results that are often useful. The trouble is that we frequently mistake those results for omniscience. We’re falling in love with our own technology, and when the models fail it can be pretty ugly, especially when the mistakes all that data produces are correspondingly large.

Part of the issue is oversimplification in the models that computer programs are based on, rather than actual errors in their programming. For example, in early April 2011, Peter Lawrence’s The Making of a Fly, a classic work in developmental biology that many biologists consult regularly, was listed on Amazon.com as having 17 copies for sale: 15 used from $35.54, and two new from $23,698,655.93 (plus $3.99 shipping).

The book, last published in 1992, is now out of print, but that doesn’t quite explain the multimillion-dollar price tag. What had happened was that two automated programs, one run by seller “bordeebook” and one by seller “profnath,” were locked in an iterative, incremental pricing war. Once a day profnath would set its price to 0.9983 times bordeebook’s listed price. Several hours later, bordeebook would raise its price to 1.270589 times profnath’s latest amount. Each round therefore multiplied both prices by about 1.27, so the listings compounded day after day.
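
To see how quickly such a loop runs away, here is a minimal Python sketch of the two repricing rules. Only the two multipliers come from the account above; the function names, the once-a-day ordering, and the starting price are assumptions made for illustration, not the sellers’ actual code.

```python
from decimal import Decimal

# Multipliers reported in the article.
PROFNATH_FACTOR = Decimal("0.9983")
BORDEEBOOK_FACTOR = Decimal("1.270589")


def simulate_price_loop(start_price, days):
    """Return a list of (day, profnath_price, bordeebook_price) tuples.

    start_price is bordeebook's listing before the first round
    (a hypothetical value; the real starting price is not known).
    """
    bordeebook = Decimal(start_price)
    history = []
    for day in range(1, days + 1):
        # profnath slightly undercuts bordeebook's current listing.
        profnath = bordeebook * PROFNATH_FACTOR
        # bordeebook then reprices well above profnath's new listing.
        bordeebook = profnath * BORDEEBOOK_FACTOR
        history.append((day, profnath, bordeebook))
    return history


def days_until_exceeds(start_price, ceiling):
    """Count daily rounds until bordeebook's price passes the ceiling."""
    bordeebook = Decimal(start_price)
    ceiling = Decimal(ceiling)
    if bordeebook <= 0:
        raise ValueError("start_price must be positive")
    days = 0
    while bordeebook <= ceiling:
        bordeebook = bordeebook * PROFNATH_FACTOR * BORDEEBOOK_FACTOR
        days += 1
    return days


if __name__ == "__main__":
    # 35.54 is borrowed from the used-copy price purely as an example.
    for day, profnath, bordeebook in simulate_price_loop("35.54", 5):
        print(f"day {day}: profnath ${profnath:.2f}, bordeebook ${bordeebook:.2f}")
```

Neither rule is unreasonable on its own. The runaway comes from the product of the two factors being greater than one, which neither program was written to notice. A price ceiling or a human review step would be one possible safeguard, though nothing in the account says either seller used one.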

It’s a classic example of how unanticipated factors can foil even the best-prepared computer models, and it’s not an isolated incident.

Does this sound anything like the subprime mortgage crisis? Before 2008, the best minds with the best technology running the most advanced hypothetical scenarios completely missed the looming crisis and then failed to understand its severity. The more broadly a model is scoped, the more possibilities for error it includes. It sounds obvious, but we often miss the fact that those models are not, and will never be, as accurate as reality itself.

Here’s another example. One T-shirt seller on Amazon.co.uk put up a shirt for sale emblazoned with the statement, “Keep Calm and Rape a Lot.” One might wonder who thought such a shirt would be a good idea. But Solid Gold Bomb, the company that made the shirt, wasn’t necessarily aware that it was even selling it. The company apologized publicly and copiously, but in its defense the only mistake it made was a small coding error. That’s because the shirt wasn’t designed by anyone. Nor were the shirts even necessarily ever printed. Solid Gold Bomb’s business isn’t in artfully designing T-shirts. Instead, it writes code that takes libraries of words that slot into popular phrases (such as “Keep Calm and Carry On,” which enjoyed a brief memetic popularity online) to make derivations that get dropped onto a template of a T-shirt and automatically get posted as an Amazon item for sale. Its mistake was overlooking a single word in a list of 4,000 or so others (the company was lucky no other offensive words or phrases made it onto the site). The problem was context.
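
The mechanism is simple enough to sketch in a few lines of Python. Everything here is hypothetical: the template string, the tiny word lists, the blocklist, and the function names are stand-ins chosen for illustration, and nothing is known about how Solid Gold Bomb’s real code was written.

```python
from itertools import product

# Hypothetical phrase template with two slots.
TEMPLATE = "Keep Calm and {verb} {ending}"


def generate_slogans(verbs, endings):
    """Combine every verb with every ending, with no human in the loop."""
    return [TEMPLATE.format(verb=v, ending=e) for v, e in product(verbs, endings)]


def filter_slogans(slogans, blocklist):
    """Split slogans into (publishable, held_for_review) using a word blocklist."""
    blocked = {word.lower() for word in blocklist}
    publishable, held = [], []
    for slogan in slogans:
        words = {word.lower() for word in slogan.split()}
        if words & blocked:
            held.append(slogan)
        else:
            publishable.append(slogan)
    return publishable, held


if __name__ == "__main__":
    # Stand-in lists; the real one reportedly held about 4,000 words.
    verbs = ["Carry", "Dance", "Punch"]
    endings = ["On", "A Lot"]
    slogans = generate_slogans(verbs, endings)
    publishable, held = filter_slogans(slogans, blocklist={"punch"})
    print(len(slogans), "generated,", len(held), "held for review")
```

Even this sketch shows the limit of the approach: a blocklist only catches the words someone thought to put on it. The generator itself has no idea what any of its combinations mean.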

Again, a simple model, with serious social consequences. The program that made the Solid Gold Bomb T-shirt isn’t aware of how its intended audience perceives the concept of rape, let alone how the business process that rendered the T-shirt works. And yet that context turned a one-word oversight into a massively damaging event.

In both of these instances, an inability to anticipate how the program would interact with other programs, or with the broader context in which it would operate, caused significant harm. Those are just two ways in which a model on which code is based can be flawed.

Big Data still has big issues. For example, the information we’re gathering is often not properly normalized (put into a format where all data is apples-to-apples), the models we’re making are rarely peer-tested or reviewed (witness the problems with the ranking tool Klout as a standard for social media influence), and, most crucially, the information itself is usually siloed inside large corporations instead of being democratically available and verifiable.

Which isn’t to say our technology is doomed. Most of the applications we use every day work tremendously well, and some really do give us amazing capabilities that improve our lives in countless ways. But it behooves us to examine the models that underpin them. Because someday, somehow, they will fail.

Joshua Klein is a hacker, consultant, television host, and author of Reputation Economics: Why Who You Know is Worth More than What You Have (Palgrave Macmillan), from which this essay is adapted.
