Artificial intelligence is the next big frontier in technology. While we’re still riding the wave of hype wrought by big data and connected sensors that has led to constant coverage of the Internet of things, tech giants ranging from IBM to Facebook have been investing in ways to make the coming information overload manageable. IBM calls it cognitive computing. Facebook and Google call it machine learning or artificial intelligence.
No matter what you call it, it is inevitable.
This week Google made headlines because it released the code behind its machine learning software, so anyone with the necessary programming chops could experiment with their own version of AI. The software, called TensorFlow, is on GitHub for the world to see. Perhaps people will build something like Google’s Smart Reply service that will respond to emails for you, or maybe you can challenge a computer to Go, as Facebook apparently is doing.
However, no matter what your fancy, there’s one thing that’s often overlooked in the discussion about artificial intelligence, and that’s the hardware it runs on. When a programmer fires up TensorFlow, the best option isn’t to run it on traditional Intel x86 processors, but on graphics processors called GPUs. More often than not, these are coming from Nvidia.
Nvidia announced two new graphics accelerators Tuesday, which are aimed at helping large companies like Facebook, Baidu and Google develop new deep learning models and then deploy those models at a massive scale without requiring huge, expensive banks of servers all hooked up to their own power plant. The new processors, plus several software elements that will make the hardware easier to use, are part of a rush in the semiconductor world to build silicon that can handle what every single company in technology believes is the next big thing on the horizon.
Nvidia’s new processors include the Tesla M40, which is the powerhouse of the two and is capable of delivering an incredible amount of performance for researchers trying to train their own neural networks. The second module is the Tesla M4, which is the first-ever low-power graphics processor from Nvidia and is designed to fit into the servers of companies like Google, Facebook, or Amazon that need to deliver super fast results from an AI. For example, when you use the new Smart Reply feature from Google, the Tesla M4 might eventually be the workhorse chip that scans the words in the email and runs the natural language processing algorithms against the email so it can suggest a few replies.
The reason this takes an entirely new style of chip, as opposed to the existing Intel processors currently whirring away in hundreds of thousands of servers inside data centers the size of football fields, is that GPUs are well equipped to handle computing jobs that can be divided up into many different tasks. A GPU has hundreds of different computing cores, whereas an Intel Xeon processor has up to 18. And training a neural network requires a lot of similar, repetitive steps that are perfect for divvying up on a GPU, which makes it a more efficient chip to use for the job.
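To make the idea of "divvying up" concrete, here is a minimal Python sketch. It is an illustration only, not how TensorFlow or an Nvidia GPU is actually programmed: the function names, the toy numbers, and the use of a thread pool as a stand-in for GPU cores are all assumptions made for this example. The point it shows is that each "neuron" in a layer is the same multiply-and-add recipe applied to different numbers, and no neuron needs to wait on another, so the work can be handed to as many workers as are available.

```python
from concurrent.futures import ThreadPoolExecutor


def neuron_output(weights_row, inputs):
    """One neuron: a weighted sum of the inputs (multiply, then add)."""
    return sum(w * x for w, x in zip(weights_row, inputs))


def layer_sequential(weights, inputs):
    """Compute every neuron one after another, as a single core would."""
    return [neuron_output(row, inputs) for row in weights]


def layer_parallel(weights, inputs, workers=4):
    """Hand the neurons to a pool of workers; no neuron depends on another.

    The thread pool is only a stand-in for the many cores of a GPU.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: neuron_output(row, inputs), weights))


# Toy layer: 3 inputs feeding 4 neurons (values are made up for illustration).
inputs = [1.0, 2.0, 3.0]
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]

# Both approaches give the same answer; only the division of labor differs.
assert layer_sequential(weights, inputs) == layer_parallel(weights, inputs)
```

A real network repeats this pattern across millions of weights, which is why a chip with many simple cores can be a better fit than one with a handful of powerful ones.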
Earlier this year, Yann LeCun, director of AI research at Facebook, explained that GPUs are the preferred chip for use in training neural networks, but said that there were still some challenges. Namely, he said that it was too difficult to get more than a few GPUs to work together on a single neural network because the software to manage them wasn’t robust enough. Nvidia says it has addressed this with new software that lets users group more GPU cards together, which should help researchers like LeCun.
However, Nvidia isn’t the only company trying to make a big splash in AI with dedicated processors or software. IBM has Watson, which is a form of AI that it calls cognitive computing. Watson was originally built to run on IBM’s Power architecture, although now Watson can run on SoftLayer’s cloud, which uses x86-based and Power servers. IBM has also announced features in its next-generation Power chips related to faster networking that will benefit AI researchers, according to Guruduth Banavar, VP of cognitive computing at IBM Research.
However, IBM is also handling another challenge outside of training neural nets and then running them at large scale in a data center.
Both IBM and Qualcomm are trying to think about how to scale the awesome power of AI to fit on a battery-powered smartphone. Qualcomm’s answer is its Zeroth technology, which basically runs a neural network on a chip. For example, it offers image recognition using a digital signal processor that is part of a number of chips sold as the brains inside mobile phones. Much like some of the dedicated processors for speech recognition on the iPhone or other handsets, though, this isn’t as impressive as what IBM is attempting.
IBM’s strategy is more ambitious. It wants to create a chip that mimics the human brain and will perform similar AI-level tasks on a smartphone. It’s using the human brain as a model because the brain is the most efficient computer we know of, using only 20 watts of power. In comparison, the new Nvidia “low-power” GPU consumes between 50 and 75 watts of power, which would drain your battery faster than you could say neural network.
So far, the resulting synaptic chip is still at the early stages, but it does exist in silicon, which was a big deal when IBM launched it in 2014. We won’t see this in smartphones anytime soon. For now, the heavy lifting of AI on our battery-powered devices will be shuffled off to dedicated silicon trained on one, or maybe two, specific types of neural networks.
With IBM, Nvidia, Qualcomm, and even Micron (which has a new processor called the Automata Processor that does pattern recognition) investing in artificial intelligence and deep learning, where’s the world’s largest chipmaker?
Intel made headlines with its purchase last month of an AI startup called Saffron, but the chip giant has been remarkably quiet when it comes to discussing its efforts around machine learning. The closest it comes is when it’s justifying its $16.7 billion purchase of Altera, a company that makes programmable chips. At an event in August, Intel said it can use a combination of those programmable chips plus its own Xeon processors to run specialized algorithms such as those used to train neural networks.
This relative silence is troubling, given that silicon advancements require years of planning. Intel has surprised the community with new technologies that it has developed in secret, such as its 3-D transistors from 2011, but in the chip community its lack of machine learning products is a gaping hole in its portfolio. “In chips, we have to plan two to four years out, so you have to figure out, what are the key apps,” said Jim McGregor, principal analyst with semiconductor research firm Tirias Research. “Intel missed out on mobile. It doesn’t want to miss out on this.”
Meanwhile, as the giants in the technology world invest in machine learning, the chip world is trying to give them a silicon platform that will allow them to deliver results both in the data center and on our handsets.