The Current State of Artificial Intelligence, According to Nvidia’s CEO
As industry embraces AI, computers aren’t the only ones that have to learn new tricks. Fortune’s Andrew Nusca talks with Nvidia (NVDA) CEO Jen-Hsun Huang.
Fortune: What’s the current status of artificial intelligence?
Jen-Hsun Huang: 2015 was a big year. Artificial intelligence is moving into the commercial world. AI has been worked on for many years, largely in research. Various aspects of commercial AI, otherwise known as machine learning, are used for advertising and web searches and things like that. It wasn’t until the last few years that AI could do things that people can’t do. Several milestones were achieved in 2015 in particular that made it possible for us to use it in all kinds of areas.
There have been several recent advancements.
Yes, in an area of AI called “deep learning.” The system basically learns by itself using a lot of data and computation. If you keep showing it pictures of an orange, it eventually figures out what an orange is—or a Chihuahua versus a Labrador versus a small pony. Amazing things happened in 2015: Microsoft (MSFT) and Google (GOOG) beat the best human at image recognition for the first time. Baidu beat humans in recognizing two languages. Microsoft and the University of Science and Technology of China taught a computer network how to take an IQ test, and it scored better than a college postgraduate.
What are we seeing at the commercial level?
Google, Microsoft, and Facebook (FB) are using AI, whether it’s the voice recognition on your phone or the items displayed in your news feed. We’re using the same technology for autonomous cars because we can now achieve superhuman levels of perception. We can now recognize things better than a human. Of course there are all sorts of matters with regard to the dynamic range of our eyes—it’s quite good, far superior to a camera. But if we add enough sensors like Lidar and radar and high-dynamic range cameras, in combination we will be able to surpass human perception.
Is this just for image recognition?
We’re going to use it for everything. We’re going to use it to teach a car how to drive.
Where are we on the adoption curve?
Gosh, I wish people knew. I wish I knew. The thing that I can tell you is that AI has been plodding along for 50 years in research. And all of a sudden last year something happened. This new way of doing AI called deep learning is so tractable, so understandable—a tool you can apply so that you can create one single network to be trained to learn multiple languages and animals and things. And that you and I and some data scientists and engineers can train it. Last year AI went from research concept to engineering application. And all these engineers at Facebook and Google and others are taking this deep learning concept with all these frameworks, which is basically another word for tools, and turning these ideas into things of practical use. And now you’re seeing all these Internet companies announcing these practical uses. All of the industries are just exploding. Two years ago we were talking to 100 companies interested in using deep learning. This year we’re supporting 3,500. We’re talking about medical imaging, financial services, advertising, energy discovery, automotive applications. In two years’ time there has been 35X growth.
There was a moment when we realized, maybe we can actually pull this off. That was a lot of typing we were going to do before to detect dogs and cats and grandmas and grandpas and kids and bikes. How many things are you going to type in? How many circumstances are you going to capture? That truck is not a truck; it’s an ambulance. That bus isn’t a normal bus and you can’t just drive by it—it’s a school bus and there are kids about to jump out. How do you capture all that? We never could figure it out. We attempted it. The first thing was advanced driver assistance systems, or ADAS. Going from ADAS to driverless car is a little bit like having a microprocessor in a human brain. I actually don’t know how to go from a microprocessor to a human brain.
What does that mean for Nvidia?
Selling GPUs [graphics processing units]. The data center is the biggest market for us. Every one of them is going to be AI-powered. At the moment there are something along the lines of 10 million nodes of CPUs and servers powering the cloud everywhere. Every one of them. Anyone who has to deal with a lot of big data and customization of services to end users. If you want to customize a service for one user and you have a billion users, you’ll never be able to provide good enough recommendations unless you use something like AI. So anybody who is making recommendations of news, information, products, music—they’re all going to be powered by AI.
How big is the automotive part?
Data centers could very well be our largest market. The largest computer market someday will largely be in the cloud, using AI, using deep learning technology. Self-driving cars? That’s 100 million cars. I hope that every single car is not autonomous but drives so well that it keeps you out of harm’s way.
You should never have an accident, even if you’re driving. I like the idea of a virtual copilot that is superhuman. You’re still driving. You’re still navigating through whatever conditions you’re in. The car somehow always detects something around you—if you’re fine, you’re fine, and if you’re not, it might take over and apply evasive actions. There are a lot of things a car can do without being “self-driving.” But you need the self-driving technology to be able to do it. If you’re going to have a virtual copilot, I would hope that it’s a better driver than I am.
Your biggest competitor in AI?
The market doesn’t exist right now, so everybody’s a competitor. Intel (INTC) and Freescale (FSL) for microprocessors, Mobileye for sensors. But it’s largely an unsolved problem.
Are there different paradigms competing to solve it?
I would say there are two. One is human-coded software programs, which do not scale. “If you see a dog, apply brakes. If you see a dog and it’s running along the road, do not apply brakes.” I just don’t think you’re ever going to get there. I think you’ll end up saying, “If there are no objects in front of you, keep driving.” That particular algorithm is called adaptive cruise control. That’s largely a solvable problem. You might get trickier—“If driver flips the turn signal, change into the other lane if there are no other cars.” I can imagine things like that. But the world is too complex to go much further.
Deep learning is the other paradigm. It comes from two places. One is fully autonomous vehicles, a.k.a. driverless or self-driving cars. The other is driver assistance. If you keep making driver assistance better and better, one of these days you end up with a self-driving car.
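The hand-coded paradigm described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `Scene` fields, the action names, and the order of the rules are all hypothetical and are not drawn from any real driving system.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """Hypothetical, highly simplified view of what the car perceives."""
    object_ahead: bool = False
    turn_signal_on: bool = False
    adjacent_lane_clear: bool = False


def rule_based_action(scene: Scene) -> str:
    """Pick an action from a fixed list of hand-written rules (illustrative only)."""
    # Rule A: the driver signals and the other lane is clear, so change lanes.
    if scene.turn_signal_on and scene.adjacent_lane_clear:
        return "change_lane"
    # Rule B: nothing in front of the car, so keep driving.
    if not scene.object_ahead:
        return "keep_driving"
    # Fallback: something is ahead and no other rule applies, so brake.
    return "brake"


if __name__ == "__main__":
    print(rule_based_action(Scene()))
    print(rule_based_action(Scene(object_ahead=True)))
    print(rule_based_action(Scene(turn_signal_on=True, adjacent_lane_clear=True)))
```

Each new circumstance (a dog running along the road, a school bus with children about to step out) would need another field on `Scene` and another branch in the function, which is the scaling problem this paradigm runs into.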
Where are the automakers on this? You’re working with many of them.
We’re working with almost all of the automakers. The GPU does most of the processing for deep learning. Whereas computer programming is taking a bunch of commands and executing them one at a time, deep learning is taking a whole bunch of software “neurons” and taking a whole bunch of data, and using it to tune the neurons like our brains. GPUs are massively parallel to start. They’re quite useful for this application.
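As a rough illustration of what "using data to tune the neurons" can mean, the Python sketch below trains a single software neuron by gradient descent. It is a toy under stated assumptions, not Nvidia's method or any particular framework: the four-row dataset, the learning rate, and the function names are all made up for the example, and it runs on an ordinary CPU with no GPU involved.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs) -> float:
    """Output of one neuron: weighted sum of the inputs, then sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, epochs=2000, lr=0.5):
    """Tune one neuron's weights and bias from examples (gradient descent on log loss)."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # How far the current output is from the desired label.
            error = predict(weights, bias, inputs) - target
            # Nudge each weight, and the bias, against the error.
            weights = [w - lr * error * x for w, x in zip(weights, inputs)]
            bias -= lr * error
    return weights, bias


# Made-up toy data: the label is 1 only when both inputs are 1.
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_neuron(DATA)

if __name__ == "__main__":
    for inputs, target in DATA:
        print(inputs, target, round(predict(weights, bias, inputs), 3))
```

Nobody writes a rule here; the weights are adjusted until the outputs match the examples. Real networks repeat the same multiply-and-add arithmetic across many neurons and many data points, which is the kind of work that suits massively parallel hardware.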
What happens to coders?
You still have a lot of code. There’s still a lot of programming. You still need a GPU alongside a CPU. You could imagine building custom chips instead of using GPUs, but they’re so readily available.
Market growth for GPUs over the next 10 years?
I have no idea. I don’t think about the market growth; I think about the market opportunity. There are too many variables—adoption rate, competition. But the size of the market we’re pursuing—data centers, autonomous driving, video games and virtual reality, industrial design—our technology is being used by almost every car company, movie…so almost every data center in the world will be accelerated by GPUs. Let’s say that’s a several billion-dollar opportunity. Every car will become “smart.” You can’t imagine that it won’t. Eventually all cars built will have autonomous capability. It’s a big opportunity.
Do you mean fully autonomous capability, or just autonomous features?
I think all cars will have driver assistance. I don’t think all cars will be self-driving or have a virtual copilot. So when a car or a person is in front of you, it will stop. It’ll stay in the lane. It will follow traffic. It will anticipate almost every possible scenario, like driving in New York City.
Isn’t that the promise of vehicle-to-vehicle (V2V) technology?
The problem with V2V is that I need to know you have it. I don’t think that’s a good idea. People are rather polarized about it. I happen to think that good technology works when you can take it out on the street without assumptions for anybody else.
What are some of the challenges with automotive?
So many. Oh, my God.
You have two industries with very different cultures.
The auto industry is about assembling mechanical things that become a car. It is not as comfortable with the idea of an empty vessel for which you develop software over the next 10 years. On the day the iPhone shipped, it had a browser, a stock ticker, a clock, not much else. It had an API. The idea that this phone is a living thing that becomes more useful over time is a very alien concept in a mechanical world. That’s a real challenge for an industry that, for 100 years, shipped a product that’s complete from the day it left the lot.
The idea that you put a gigantic computer inside that you actually don’t use on the day that you ship it—that the memory is empty, the disk drive is empty, the CPUs aren’t being used, the APIs aren’t being used—you see what I’m saying? “Well, if it’s not being used, why do I have a disk drive for it? Why do I have so much memory? Why do I pay for such a large microprocessor?” Because in the past they used every last drop of that microcontroller on the day they shipped it.
This must be common in other industries.
We’ve seen them all. Digital music and Sony (SNE). People weren’t just trying to get the CD player down to the Minidisc. That wasn’t a complaint. There wasn’t any objection. As soon as the iPod came out, Sony came out with the Minidisc. “Oh, you wanted it smaller? Why didn’t you say so?” And the iPhone: “You want a digital phone with a camera? Is that what you want? Why didn’t you say so? Here’s a Nokia (NOK) phone with a camera. Oh, you wanted a web browser? We put one on there for you.” They just didn’t get it. It’s not a technical understanding problem. They just didn’t get it.
Every company is becoming a digital company.
A computer company. Nokia (NOK) was the world’s first digital phone maker. But Apple’s (AAPL) iPhone is the world’s first phone built from computers, with an operating system. Steve [Jobs] stood onstage and said, “We’ve actually got Linux in here, guys,” and people didn’t get it. His point: You can now develop applications. There are APIs. They’re both digital; they both have 1s and 0s. But that’s not the distinction.
Sony Minidisc? Digital. MP3? Digital. One is a digital media format. One is a computer format. Sony even took the two formats and said, “Look at the fidelity of the MP3 versus CD audio.” CD audio crushed it, pulverized it. To this day, MP3 sucks. Put MP3 on your big MartinLogan speakers and see how it sounds. Oh my gosh—can you really do that? Apparently yes. Convenience matters. Both are digital audio, but one is a computerized version, and one is a media version. They just didn’t get it.
Are you still running into this mindset today?
The fundamental difference between Tesla and another random car company is that one is a computerized car company and the other is a digital car company.
Isn’t there a bit of me-too behavior, though? They see this and they call you?
I don’t know. Look at Nokia. Wasn’t it obvious? And the management team kept turning over people, new board of directors…do you remember that? I don’t know why that is. Here’s my proposition. I can explain all this to the last generation of people until my face turns blue and they still won’t get it. Digital music. Digital cars. It all sounds similar. But it’s so fundamentally different. Imagine the board meetings. “We have 15 million lines of code in our car! More than what was in the Apollo [Program]!” You see what I’m saying?
Carmakers would say they have sophisticated computers.
The 100,000 lines of code in your automotive brakes—that’s not software; that’s microcode. Look, we’re not denigrating the 1s and 0s in the microcode of air brakes and airbags and seatbelts. But that’s not the software we’re talking about. We’re talking about a living, breathing capability to improve.
You’re talking about two kinds of improvement—Tesla updating your car’s software over the air at night as well as deep learning capabilities.
Right. The guy who worked on fuel injection? I admire him very much. He did a lot of good work for the world. But that’s not what I’m talking about. This is different. This new world, this software platform, this computerized platform is really about computing.
Will computer companies top mechanical companies?
Hard to say. I’ll put it this way: In another five years, if mechanical inclination is the only thing you have going, you’re dead. You are so Nokia’d. You are so Motorola’d. To think the Razr was a computer? Nonsense. Yes, 1s and 0s and a microprocessor. But it’s a different type of computing. If it requires explanation it’s beyond [salvaging]. You’re talking to a caveman.
Is IBM in a good place?
Yeah. I think Watson is absolutely the right answer. AI, deep learning, cognitive computing. We partner with them—the next generation of Watson will have Nvidia GPUs in it. We’re partnering with them to build supercomputers together.
So you don’t have to be on the West Coast to get it.
No. New York City is one of the hubs of AI. There’s Silicon Valley, of course, partly because of the universities. The University of Toronto—really big hub. Oxford University in the U.K. is another big one. China is popping up like crazy. But these are the early ones.
People still think of Nvidia as the GPU company, the gaming company.
All of the AI researchers I talk to are gamers. They’re all customers.
With all of this in the works, will people think of Nvidia as more of a software company moving forward? Do you hope they will?
We used to be a graphics technology company that you put into a computer. We’re now a computer company that specializes in several fields that include graphics. We’re a computer company no different than IBM or Cray—we have the same skills and capabilities. It’s just that our business model is really about creating a computing platform, software included. Data centers from Amazon, Facebook, Google, Baidu, Microsoft. We take some small version of that and integrate it into gaming for VR. And we take another small version of that and might add different kinds of software to it, an algorithm, and we put it in a car. But we’re a computer company that specializes in a field or type of computing. That’s a real change.
Neural networks are a fundamentally different kind of thing you’re trying to ship.
It has put us into much larger opportunities. We’re very specialized in this field—this accelerated, computerized, visual, parallel computing field. The markets that we’ve talked about don’t really exist. They’re just beginning.
It would help to have some agreement on the terminology: AI, deep learning, machine learning…
All of that’s going to happen. We’re about three months into it; it’s really early. We’re going to see a lot more writing about the capabilities of deep learning. The first textbook hasn’t emerged yet—it’s about to come out, from Yoshua Bengio at the University of Montreal.
Thirty-five years ago, [Carver] Mead and [Lynn] Conway wrote probably the single most important book for the semiconductor industry. Right before that time, chip design was considered a black art: How big should that transistor be? How should you scale it over time? How does process shrink with the transistor continuing to perform? There are all kinds of these weird questions that were embedded in the heads of 50 people. Mead and Conway came out and said actually, this problem is really easy. Here’s a CMOS transistor and this is how you scale it from generation to generation and by doing so you keep the current density exactly the same and the performance continues to scale without the power going up, otherwise known as Moore’s Law. Mead and Conway created the framework necessary for chip designers.
Deep learning is exactly that—a framework for artificial intelligence to be applied by engineers. And now it’s going to be used in all kinds of places. Deep learning is a killer technology.
A shorter version of this article appears in the March 15, 2016 issue of Fortune with the headline “Artificial Intelligence, Analyzed.”