FORTUNE — If software is eating the world, as described by the prominent venture capitalist Marc Andreessen in 2011, then big data is supposed to be saving it. Right?
Popular use of the term “big data,” which is used to describe technologies that help parse datasets too large for conventional tools to handle, has exploded in the last two years — leaving many business executives wondering if they need it. It is in many ways an echo of the 1960s, when large corporations saw early computers as (expensive, rudimentary, futuristic) competitive tools. To fear, or to embrace? And who, exactly, should need such a thing?
In an attempt to slash through the hype, Fortune rang up Gaurav Dhillon at his office in San Mateo, Calif. If his name sounds familiar, that’s because Dhillon is the founder and former chief executive of Informatica, the nearly $4 billion Redwood City-based software company known for managing the data warehouses of large companies.
Dhillon, who became the chief executive of the data integration company SnapLogic in 2009, believes that big data holds big promise for big businesses — but only in certain industries. He calls it the “big data barbell.” Below are his words, edited and condensed for clarity.
Fortune: Perhaps no term has been more popular in the last year or so than “big data.” It’s everywhere: in keynotes at technology conferences, in briefing materials and presentation decks, in news articles about various industries. Everybody seems to think they need it — but big data is a rather specialized type of computing, no? Is big data kind of B.S.?
Dhillon: Coming up on 22 years in the technology industry, I should have some kind of perspective. Back in 2002, I used the term “the information tsunami.” And here we are today.
I think what is true is that data under management has gotten bigger. Initially, the roots of this industry in the last century, before the web, were in retail and bar code scans and UPC codes, as you call them, to stock shelves. That was the birth of the data warehousing industry: early analytics. That industry drove marketing decisions, pricing decisions, retail forecasting, and so on.
The trend will continue; it’s not suddenly going to change. A scientist said, “Science advances one funeral at a time.” So I think the benefit of being able to use data to make decisions, and make bigger data to make more possible decisions, will continue.
The fact that data is “bigger” — well yes, my garage has more stuff in it than it did 10 years ago! Everybody has more stuff [over time].
But the interesting twist is that big data has an element of data science, which I think is more important. It first makes small data out of big data and then it looks for signals in that small data to understand what to do: Who’s going to win the election? What are the correlations between weather and language? Things that we simply didn’t have enough processing power for in the last century. And now you’ve got a democratizing aspect with Hadoop and other things. So you had a fundamental shift around price and performance around compute.
The benefits of that are in some cases pretty clear, and in some cases there is gee-whiz science for which the benefits are not. So I think this aspect of being able to get a lot of information by increasingly electronic things — the supermarket, bridges, cars, roads — so you have sensor data. More data doesn’t make you any smarter; it just means you spent a lot of money to store it. This is where the market will shake out — the benefits.
In retail it’s clear. Pricing, etc. The financial industry — that’s clear. But in certain industries, it’s not clear, putting all this effort in rather than looking at the R&D budget or spending on marketing. I’m not here to tell you it’s a panacea; I’m here to tell you that managing that data … people are going to get varying mileage from it.
Fortune: On this week’s episode of Mad Men, the ad agency Sterling Cooper & Partners replaces a meeting room with a new tool: an IBM System/360 mainframe computer. Some characters want the computer for competitive reasons; some want it because they see it as the future. Others are terrified that it will replace them. Is that how people look at big data?
Dhillon: The fear of computers has, in fact, left the building. New generations of employees, people who graduated this millennium, my kids (13 and 6). The Millennials are not afraid of computers; they may not be programmers, but they’re tech-savvy. We think of them as citizen integrators. Captain America: The Winter Soldier was all about the dark side of big data. Today, there’s more of an arms race of, “We don’t want to be left behind.” There are Orwellian concerns around big data in society, but not in business. But in business, there are issues around having the wrong data or not being able to get at information; that’s the same as it was 50, 60 years ago. At SnapLogic, we’re trying to finish some unfinished business. Why is this so hard in 2014?
I feel there is an embrace of big data in many industries — manufacturing, financial services — because people have a fluency of computing. But I think what people are anxious for is to see the benefits of big data applied in their lives. They’re somewhat concerned. They really just want to get the benefits of it. That needs work. There are too few data scientists. Hadoop is still somewhat of a unicorn — you still need a graduate degree in computer science to set things up. It has fundamentally changed storage in terms of cost per bit. It’s a tectonic shift.
What is very clear is a “barbell” strategy around big data. Services, information-rich industries with knowledge workers in them? It’s very clear there’s a big benefit of big data. Retail, hospitality, trading stocks — if you have the ability to discover trends, you can find breakpoints in your business and take care of them. If you discover how to take advantage of certain events in the market, you can certainly take that all the way to the bank. That’s one end of the barbell.
The other end of the barbell is the industrial Internet. I think that is extremely, extremely interesting. There’s a really interesting writeup by GE saying that you will not just be able to sell aircraft engines but sell value around the [operation] of that engine. Trigger actions around the data. Do preventative maintenance on the engine. That concept has enormous implications for GE, Siemens, everybody who manufactures stuff. You would think that big data would only be a business on the knowledge side, but on the industrial side, there’s a whole barbell that becomes very interesting.
But other industries . . . can you predict trends and fashions and colors in the fashion industry? What makes a particular season successful? Maybe. A better movie is a better movie. Big data doesn’t make a better movie. Sometimes you just have to create something. You know a well-made book or movie when you see it. The barbell strategy seems extremely sound.
Fortune: So should we be telling some companies, “Big data is not for you”?
Dhillon: We should be clear. Because if we’re not clear around it, people will be disgruntled. You can’t wound a big data problem; you have to kill it. People want to just step in it. But if you’re not willing to fund it at an effective level (and it is a substantial investment), to expect substantial returns by just tickling the chin, it’s not going to happen. So maybe you don’t have the budget this year, and maybe you should wait; it will get cheaper. Sit tight! You’re better off replenishing the guts of your company with SaaS and cloud applications and emancipating your marketing department.
Fundamentally, the C-suite are investors. What does an executive make? As Ben Horowitz, one of our investors, says: They don’t make things, they make decisions.
There’s nothing worse than a half-baked, half-funded big data project. That’s the worst of all. You’re creating a bad feeling about the true benefit of this.
Fortune: Where’s the slack in the market for big data? Which areas or industries could be easily conquered but are still wide open?
Dhillon: All this change is causing the negative space [between connected groups of things] to become the battleground. If things don’t talk to each other, it doesn’t matter how much you’ve spent. So we actually see a lot of negative space because there are huge changes. People are unplugging traditional data warehouses. We see a lot of business applications flying to the cloud. Salesforce did it; Workday is on a roll. And the APIs and Internet of Things and data: it’s in the early stages, but it quite likely will be the greatest source of information the world has ever seen. How many barcodes can you have? You will see the monetization of that, distinctly, on the industrial side.
Putting it together is a large problem, and you know what? It’s wide open. Boy, we have a long way to go.