IBM Claims Big Breakthrough in Deep Learning
The race to make computers smarter and more human-like continued this week with IBM claiming it has developed technology that dramatically cuts the time it takes to crunch massive amounts of data and then come up with useful insights.
Deep learning, the technique used by IBM (IBM), is a subset of artificial intelligence (AI) that mimics how the human brain works. It’s a huge focus for companies like Microsoft (MSFT), Facebook (FB), Amazon (AMZN), Google (GOOGL), and IBM.
IBM’s stated goal is to reduce the time it takes for deep learning systems to digest data from days to hours. The improvements could help radiologists get faster, more accurate reads of anomalies and masses on medical images, according to Hillery Hunter, an IBM Fellow and director of systems acceleration and memory at IBM Research.
Until now, deep learning has largely run on a single server because of the complexity of moving huge amounts of data between different computers. The problem lies in keeping data synchronized across many different servers and processors.
In its announcement early Tuesday, IBM said it has come up with software that can divvy those tasks among 64 servers running up to 256 processors total, and still reap huge benefits in speed. The company is making the technology available to customers using IBM Power System servers who want to test it.
IBM used 64 of its own Power 8 servers, each of which links IBM Power microprocessors with Nvidia graphics processors via a fast NVLink interconnect to facilitate data flow between the two types of chips.
Atop that, IBM came up with what techies call clustering technology that manages all those moving parts. Clustering technology acts as a traffic cop between multiple processors in a given server as well as to the processors in the other 63 servers.
If that traffic management is done incorrectly, some processors sit idle, waiting for something to do. Each processor works from its own slice of the data, but also needs results from the other processors to get the bigger picture. If the processors get out of sync, they can’t learn anything, Hunter tells Fortune. “The idea is to change the rate of how fast you can train a deep learning model and really boost that productivity.”
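What Hunter describes is, in essence, what distributed training practitioners call data parallelism: each processor computes an update from its own shard of the data, and those updates must be combined in lockstep before anyone can move on. The toy Python sketch below illustrates the coordination problem; the worker and averaging functions are illustrative inventions, not IBM's actual clustering software.

```python
# Toy illustration of data-parallel training: each "worker" computes a
# gradient from its own data shard, and all workers must average their
# gradients (an all-reduce) before the shared model is updated.
# Hypothetical sketch only -- NOT IBM's implementation.

def local_gradient(weight, shard):
    # Each worker sees only its own slice of the data.
    # Gradient of mean-squared error for the simple model y = weight * x.
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)

def all_reduce(gradients):
    # Average gradients across workers. Every worker must contribute
    # before anyone proceeds -- this is the synchronization point that
    # the clustering software has to manage efficiently.
    return sum(gradients) / len(gradients)

def train_step(weight, shards, lr=0.01):
    grads = [local_gradient(weight, s) for s in shards]  # parallel in reality
    return weight - lr * all_reduce(grads)

# Four workers, each holding its own shard of (x, y) pairs from y = 3x.
shards = [[(1, 3), (2, 6)], [(3, 9), (4, 12)],
          [(5, 15), (6, 18)], [(7, 21), (8, 24)]]
w = 0.0
for _ in range(200):
    w = train_step(w, shards)
print(round(w, 2))  # converges toward 3.0
```

If one worker lags or drops out of sync, the all-reduce stalls and every other processor sits idle, which is exactly the failure mode Hunter describes.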
Expanding deep learning from a single eight-processor server to 64 servers with eight processors each can boost performance some 50 to 60 times, she notes.
Analyst Charles King, founder of Pund-IT, is impressed with what he’s hearing about IBM’s project, saying that the company has found a way to “scale up” systems so that adding extra processors delivers nearly proportional gains in performance.
For example, 100% scaling would mean that every processor added to a given system contributes its full share of performance, so doubling the processor count would double the throughput. In reality, such gains never happen, because of management overhead and connectivity bottlenecks.
But IBM claims its system achieved 95% scaling efficiency across 256 processors using something called the Caffe deep learning framework, created at the University of California at Berkeley. The previous record was held by Facebook AI Research, which achieved 89% scaling.
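Scaling efficiency is simply the observed speedup divided by the ideal, linear speedup. Taking the figures in the announcement at face value, a quick back-of-the-envelope calculation (the helper function names below are my own, not IBM's) shows what 95% efficiency across 256 processors implies:

```python
# Scaling efficiency: observed speedup as a fraction of ideal linear
# speedup. With perfect (100%) scaling, 256 processors would run a job
# 256 times faster than one processor.

def scaling_efficiency(observed_speedup, num_processors):
    return observed_speedup / num_processors

def implied_speedup(efficiency, num_processors):
    return efficiency * num_processors

# IBM's reported figure: 95% efficiency across 256 processors.
print(implied_speedup(0.95, 256))  # 243.2 -- roughly a 243x speedup

# Facebook's earlier 89% record, assuming (the article doesn't say)
# it was measured across the same number of processors.
print(implied_speedup(0.89, 256))  # 227.84
```

In other words, at 95% efficiency those 256 processors behave like about 243 ideal ones, against roughly 228 at Facebook's 89% mark.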
IBM’s latest 95% figure seems almost too good to be true, noted Patrick Moorhead, president and founder of Moor Insights & Strategy, an Austin, Texas-based research firm.
In terms of image recognition, the IBM system (using a variation of the Caffe framework) claimed a 33.8% accuracy rate working with 7.5 million images over seven hours, IBM said. The previous record, set by Microsoft, was 29.8% accuracy, and that effort took 10 days.
In layman’s terms, IBM claims to have come up with technology that is both much faster and more accurate than the deep learning technologies already developed. Of course, it also requires the use of IBM’s Power Systems hardware and clustering software.
In addition to the Caffe framework, IBM said the popular Google TensorFlow framework can likewise run atop the new technology.