
The emerging disconnect between business and academic interests in A.I.

December 17, 2019, 2:51 PM UTC

This is the web version of Eye on A.I., Fortune’s weekly newsletter covering artificial intelligence and business. To get it delivered weekly to your in-box, sign up here.

Hi everyone, Jeremy Kahn here. I spent last week at NeurIPS in Vancouver.

NeurIPS is the nickname for the Conference on Neural Information Processing Systems. It has become the premier annual event for researchers working on artificial intelligence. (The conference has a long association with deep learning, the kind of machine learning that uses neural networks and is largely responsible for the current A.I. boom.)

This year’s NeurIPS was the biggest ever, with more than 13,000 attendees and more than 1,400 research papers presented. Many others participated in one of the 79 official NeurIPS meetups held in 35 different countries during the conference week.

This week’s Eye on A.I. will be devoted to my takeaways from the conference.

  • The conference’s continued growth is testament to the feverish interest in A.I. from both academics and businesses. NeurIPS is a prime recruiting ground for tech companies and financial firms—mostly big banks and hedge funds—eager to find freshly-minted PhDs to staff their research labs and build A.I.-powered applications. In the evening, there was the usual round of corporate-sponsored bashes aimed at wooing prospective applicants, with free-flowing booze and food, while during the day, companies conducted back-to-back interview sessions for job candidates behind closed doors at the convention center and in nearby hotel suites.
  • But NeurIPS is still primarily an academic conference. (Scroll on to see the papers I found most interesting.) And there was a sense this year that the field may be at an inflection point, one that could herald a divergence, at least temporarily, between the priorities of A.I. researchers and those of business.
  • Several speakers lamented that deep learning systems do not exhibit human-like learning abilities, such as the ability to master new tasks from just a handful of examples, learn concepts, and use common sense. Researchers I spoke to at NeurIPS thought the field was stumbling in the dark when it came to imbuing A.I. systems with human-like flexibility and efficiency.
  • A minority were optimistic that recent innovations will provide the building blocks for human-like A.I. Chief among them was Yoshua Bengio, who won last year’s Turing Award along with two others for his pioneering work on deep learning. He cited a mechanism called “attention,” which you can read more about here.
  • Many speakers, including Bengio, called for researchers to return to nature for inspiration, just as they had with the original neural networks. Cognitive psychologist Celeste Kidd used her keynote to urge researchers to look more closely at infant and child development. Blaise Aguera y Arcas, a Google A.I. scientist, advocated techniques drawn from natural selection, including some based on the way bacteria evolve, to create new algorithms.
  • In the near term, business is interested in equaling or exceeding human abilities on specific tasks, such as spotting manufacturing defects, predicting customer churn or optimizing delivery routes. And most business leaders are agnostic about the A.I. techniques that are used to get there (with the only caveat being a preference, in some contexts, for easily explainable machine-learning techniques).
  • That means the brief alignment of academic and business interests may be coming to an end. Industry is likely to want to continue to refine and implement data-intensive, narrow algorithms, even as the research community, which has spent much of the past decade designing such systems, turns its attention to more human-like learning software.

Read on for more A.I. news.

Jeremy Kahn


Here is a rundown of the papers and announcements I found most interesting at this year's NeurIPS:

Turning 2-D images into 3-D. Researchers have created a system that can turn two-dimensional images into three-dimensional ones, without any access to a three-dimensional training set. This is the first system that can predict not just the shape of three-dimensional objects, but also their colors, textures and lighting from a simple two-dimensional drawing. The team, which included researchers from Nvidia Research, the Vector Institute, the University of Toronto, and Aalto University, says the work may point towards systems that will be able to render whole scenes for video games or animated movies in a much faster and less expensive way. You can read their paper here.

Neural networks can forecast traffic better than you (and me). The Institute for Advanced Research in Artificial Intelligence announced the results of a competition to see what kinds of machine learning systems could best predict traffic patterns in Berlin, Istanbul and Moscow from historical geospatial data. Forty teams from around the world entered the competition. The winning teams (one from South Korea, a combined Oxford/Zurich team and one from the University of Toronto) all used neural networks and found that these deep learning systems outperformed previous forecasting methods, including other kinds of machine learning.

But humans are still better at Minecraft. Microsoft announced that none of the 660 teams that entered its MineRL competition was able to complete the challenge, which asked competitors to use reinforcement learning to create A.I. agents able to mine diamonds in the video game Minecraft. Some of the bots were able to accomplish intermediate tasks, such as creating a furnace to forge the pickaxe they would need to do the mining. But none found a diamond. One of the issues: compared to the A.I. systems that have bested human competitors at complex strategy games such as Go and the video game Starcraft 2, the MineRL competitors had access to relatively little data and computing power with which to train their bots: just 1,000 hours of pre-recorded game play and a single Nvidia GPU, as well as a maximum of four days of training time.

An algorithm that can both maximize revenue and guarantee fairness? In many contexts, there is presumed to be a trade-off between algorithms that maximize some reward and those that guarantee fair treatment. But a team of researchers from the University of Massachusetts Amherst and Stanford University created an algorithm, which they called Robinhood, that could be used to design systems that both maximize a reward, such as revenue, and guarantee fairness, within certain user-specified limits. In order for it to work, the user has to be able to specify the fairness constraint (for instance, that men and women have equal opportunity to be approved for a bank loan) in a formal, mathematical way. The authors tested their algorithm on datasets for loan applications, an educational tutoring system and criminal recidivism and found in each case Robinhood could maximize outcomes while guaranteeing fairness.
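For readers wondering what it means to state a fairness constraint "in a formal, mathematical way," here is a minimal sketch in Python. It is not the Robinhood algorithm and is not taken from the paper. It only illustrates one common formalization of equal opportunity: the approval rate among qualified applicants should differ between two groups by no more than a tolerance the user picks. The function names, the `epsilon` tolerance and the sample data are all hypothetical.

```python
from typing import Sequence


def approval_rate_among_qualified(approved: Sequence[bool],
                                  qualified: Sequence[bool]) -> float:
    """Share of qualified applicants who were approved (the true-positive rate)."""
    decisions = [a for a, q in zip(approved, qualified) if q]
    if not decisions:
        raise ValueError("no qualified applicants in this group")
    return sum(decisions) / len(decisions)


def equal_opportunity_gap(approved_a: Sequence[bool], qualified_a: Sequence[bool],
                          approved_b: Sequence[bool], qualified_b: Sequence[bool]) -> float:
    """Absolute difference in approval rates among qualified applicants."""
    rate_a = approval_rate_among_qualified(approved_a, qualified_a)
    rate_b = approval_rate_among_qualified(approved_b, qualified_b)
    return abs(rate_a - rate_b)


def satisfies_equal_opportunity(approved_a: Sequence[bool], qualified_a: Sequence[bool],
                                approved_b: Sequence[bool], qualified_b: Sequence[bool],
                                epsilon: float = 0.05) -> bool:
    """True when the gap stays within the user-specified tolerance (assumed name: epsilon)."""
    gap = equal_opportunity_gap(approved_a, qualified_a, approved_b, qualified_b)
    return gap <= epsilon


if __name__ == "__main__":
    # Made-up loan decisions for two groups of applicants.
    men_approved = [True, True, False, True]
    men_qualified = [True, True, True, False]
    women_approved = [True, False, False, False]
    women_qualified = [True, True, False, True]
    print(equal_opportunity_gap(men_approved, men_qualified,
                                women_approved, women_qualified))
```

A check like this only measures the gap on data already in hand. The guarantee described above is stronger, since it concerns how the system behaves while it is also maximizing a reward.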

Pay attention. The attention-based neural network architecture known as a Transformer has had a massive impact on natural language processing algorithms over the past two years. Now the same team from Google Brain that came up with the original Transformer is applying a similar technique to computer vision. In vision, attention mechanisms have already been used to augment convolutional neural networks, but the Google Brain team proposes that attention alone, without any convolutions at all, might produce equally good results while at the same time reducing the amount of computing power needed to train and run these vision models.
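For the curious, the core operation behind attention can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product attention, the building block popularized by the original Transformer. It is not the Google Brain vision model, and the function names and toy vectors are assumptions made for the example. Each query scores every key, the scores are turned into weights that sum to one, and the output is the weighted average of the values.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into positive weights that sum to one."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: List[Vector], keys: List[Vector],
              values: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for query in queries:
        # Similarity of this query to every key, scaled by sqrt(key dimension).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors, one output column at a time.
        output = [sum(w * value[col] for w, value in zip(weights, values))
                  for col in range(len(values[0]))]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # Three toy "tokens" (or image patches), each a 2-number vector.
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    outputs, weights = attention(tokens, tokens, tokens)
    for row in weights:
        print([round(w, 3) for w in row])
```

In self-attention, as in the demo, the queries, keys and values all come from the same sequence, so every element can draw on every other element in a single step.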

Weights weights, don't tell me. One of the trickier aspects of getting neural networks to work well is tuning the initial weights attached to the various datapoints the models ingest. But it turns out the architecture of the neural network, or the exact design of the connections between the layers of the network, probably matters far more than the weights. Adam Gaier and David Ha, both researchers at Google Brain, presented their finding that it was possible to design networks that would perform well on image classification tasks, even with no training, when every connection in the network was assigned the exact same random weight. They used a clever evolutionary technique to refine the neural network architecture so it would be best for each task, while keeping the weights constant.
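To make the shared-weight idea concrete, here is a toy sketch in Python. It covers only the scoring step: run a fixed wiring with one weight value copied onto every connection, repeat for several weight values, and average the accuracy. The evolutionary search over architectures is left out entirely, and the two tiny wirings, the task and the list of weight values are hypothetical choices made for this example, not the researchers' code.

```python
from typing import Callable, List, Sequence, Tuple

# A node is (indices of the nodes feeding it, activation function).
# Inputs occupy indices 0..n-1; each later node takes the next index.
Node = Tuple[List[int], Callable[[float], float]]


def identity(x: float) -> float:
    return x


def forward(inputs: Sequence[float], nodes: List[Node], shared_weight: float) -> float:
    """Run the network with the same weight on every connection."""
    values = list(inputs)
    for sources, activation in nodes:
        total = sum(shared_weight * values[s] for s in sources)
        values.append(activation(total))
    return values[-1]  # the last node is the output


def mean_accuracy(nodes: List[Node], samples: Sequence[Tuple[float, float]],
                  labels: Sequence[bool], weights: Sequence[float]) -> float:
    """Average accuracy of one wiring across several shared weight values."""
    scores = []
    for w in weights:
        correct = sum((forward(x, nodes, w) > 0) == y for x, y in zip(samples, labels))
        scores.append(correct / len(samples))
    return sum(scores) / len(scores)


# Toy task: is the first input positive? The second input is a distraction.
SAMPLES = [(1.0, -3.0), (-1.0, 3.0), (2.0, 0.5), (-2.0, -0.5)]
LABELS = [x1 > 0 for x1, _ in SAMPLES]
SHARED_WEIGHTS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]

# Wiring A: first input -> hidden node -> output (two hops, so the weight's sign cancels).
TWO_HOP: List[Node] = [([0], identity), ([2], identity)]
# Wiring B: both inputs connect straight to the output.
DIRECT_BOTH: List[Node] = [([0, 1], identity)]

if __name__ == "__main__":
    print("two-hop wiring:", mean_accuracy(TWO_HOP, SAMPLES, LABELS, SHARED_WEIGHTS))
    print("direct wiring:", mean_accuracy(DIRECT_BOTH, SAMPLES, LABELS, SHARED_WEIGHTS))
```

On this toy task the two-hop wiring gets every sample right at every weight value, while the direct wiring gets only half of them right, which is the point in miniature: with the weights held fixed, the wiring alone decides how well the network does.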


Intel Buys Habana Labs. Intel announced on Monday that it is buying Habana Labs, an Israel-based maker of computer chips designed specifically for A.I. workloads, for $2 billion. Habana has claimed that some of its processors can handle four times as much data per second as rival chips, a key factor for heavy A.I. training loads. Habana, which will continue to operate independently according to Intel, adds to the chip giant's stable of A.I.-oriented hardware. In 2016, it purchased Nervana Systems, a startup developing A.I.-specific chips.

Emotion-Recognition Systems Under Fire. The AI Now Institute, a center at New York University dedicated to understanding the societal impact of A.I., has called for a ban on the use of emotion-recognition software in important decisions in its latest annual report. Such systems, which purport to detect human emotions from video of people's faces, have increasingly been used in hiring as well as in some other contexts, such as patient pain studies. AI Now says that the scientific basis of the technology is "contested" and it "should not be allowed to play a role in important decisions about human lives." 

Facebook Ads Still Discriminate Against Women, Older Workers, ProPublica Finds. The investigative journalism organization ProPublica has found Facebook ad targeting can discriminate by gender, race and age even when advertisers use a new service that doesn't allow targeting by those specific characteristics. Facebook created the new ad service as part of a settlement of five civil rights lawsuits in March. The problem, according to ProPublica, is that Facebook's algorithms have learned to rely on parameters that can be proxies for prohibited characteristics. Facebook has said it has "gone above and beyond" in trying to prevent discrimination, but that advertisers determined to violate civil rights laws can always find a way to do so.

Instagram Will Use A.I. to Police Bullying and Hate Speech. Facebook-owned Instagram has begun using A.I. to police bullying and hate speech on the photo-sharing social network. The system works by reviewing captions users write before they are posted. If the software flags a caption as problematic, Instagram will notify the user and give them a chance to edit or delete the caption before the image can be posted.

Schlumberger Signs New A.I. Deal. Paris-based oilfield services company Schlumberger reached an agreement with Dataiku, a New York software firm, to jointly develop A.I. applications for Schlumberger's exploration and production companies, The Houston Chronicle reports. The move comes just a month after rival oilfield services firm Baker Hughes struck a similar deal with Microsoft and an A.I. startup. Such deals are growing, according to a recent Wall Street Journal story. But they're controversial: One anonymous Microsoft engineer recently wrote about the moral qualms of helping Big Oil in a provocative essay in the journal Logic.


Another takeaway from this year's NeurIPS: The market for A.I.-specific computer chips is real, and heating up. (And this was before Monday's announcement about Intel acquiring Habana Labs.)

On the trade floor at the conference, five of the leading A.I. chip companies faced off:

  • Graphcore (which recently signed a major deal to make its Intelligence Processing Units available through Microsoft’s Azure cloud)
  • Cerebras (which has built the world’s largest computer chip, designed to handle A.I. training loads)
  • Intel (which entered A.I.-specific hardware through its acquisition of Nervana Systems in 2016 and which has now acquired Habana)
  • Habana itself
  • Google (which has its Tensor Processing Units available in its cloud)

While trumpeting their own chips’ performance, each of these companies bashed Nvidia, whose graphics processing units (GPUs), chips originally designed to handle the heavy computing workloads of video games, have until now been the standard equipment for A.I. training tasks. Many of these companies cited the increasingly compute-intensive models used for key business tasks as the reason their hardware would be essential in the coming years.

Weirdly, Nvidia, which once had a major presence at NeurIPS, did not have a corporate stand at the conference this year. And given that the Santa Clara, California-based company has still not announced plans to offer an A.I.-specific chip of its own, it might be easy to conclude Nvidia co-founder and CEO Jensen Huang has fallen into a classic innovator’s dilemma, and Nvidia is now in danger of losing its dominance in A.I. computing.

But Konstantinos Georgatzis, global head of data science research and development for QuantumBlack, an arm of consulting firm McKinsey & Company that helps companies implement machine learning, said most corporate customers aren’t tripping over themselves to get access to A.I.-specific chips. The reason? Integrating these A.I. chips with companies' existing software and workflows is not seamless and most businesses don’t want to take the time—or run the risk—of retooling their projects to run on the new hardware, despite the promise of a training speedup. They are also wary of getting locked into an unproven chip design that may quickly become obsolete. So, Georgatzis says, they are taking a “wait-and-see” approach while continuing to run A.I. workloads on GPUs that are readily available through most cloud service providers.

For the moment, the trend of ever-larger A.I. models represents a big opportunity for A.I.-specific hardware makers. But that trend may not last. At current exponential growth rates, models will become prohibitively expensive to run, even with new hardware. And, if the push for more human-like learning efficiency (see above) does bear results, such massive models and the equally massive compute (and electricity) they require may ultimately wind up seeming like a weird, brief eddy in the flow of A.I.’s development.

Bottom line: Nvidia is under pressure, but don't count it out just yet.


IBM’s A.I. Can Now Mine People’s Collective Thoughts. Will Businesses Use This Data Thoughtfully? —By Jeremy Kahn

Hair-Brushing Robot Shows How Artificial Intelligence May Help the Disabled —By Jeremy Kahn

First Ever Pro Drone Race Gave an A.I.-Piloted Aircraft the $1 Million Grand Prize. But It Still Couldn’t Beat a Human Pilot—By Aaron Pressman

A.I. Might Be the Reason You Didn’t Get the Job—By Danielle Abril

Sexual Harassment Issues Highlighted at a Leading A.I. Conference—by Jeremy Kahn


What do state-of-the-art language models actually know?

In the past 18 months, a succession of ever-more-capable language models (algorithms that can predict the next word in a sentence) has debuted, finding their way into search engines, chat bots, question answering systems and writing generators. But the exact workings of these massive language models have been mysterious even to the computer scientists who created them. (CTRL, one of the current ones, designed by Salesforce's A.I. research team, has 1.6 billion different parameters, for instance.)

At NeurIPS, researchers showed off systems that can peer inside the large language models and give people an inkling of how they work. Ben Hoover and Hendrik Strobelt, two researchers at IBM, along with Sebastian Gehrmann from Harvard, came up with exBERT, a tool that helps users explore Google's giant BERT language model. Using it, you can see exactly which words in a sequence the model is paying attention to in order to make its predictions of the next words. You can also see how the words that it is paying attention to change as one moves through the 12 different layers of the neural network.

A somewhat similar tool has been created by the Allen Institute for Artificial Intelligence and researchers at the University of California Irvine. Their AllenNLP Interpret works on any large NLP model, not just BERT. It also shows which words a model is paying attention to at each stage of its prediction and how much weight it is giving each of those words in making its decision. It can help point out how these systems can be tricked into making incorrect predictions, a method that could eventually help improve the models and make them less vulnerable to potential attacks.

These methods work a bit like a functional MRI scan for a human brain. They help highlight what areas the model is using. What they don't show is exactly why the model functions in this way or exactly what happens in the middle layers of the network that helps move the model towards a correct decision.

It remains debatable exactly how far these massive language models can go. It also remains unclear exactly how much about the structure of language they really capture. (For instance, there is a debate among researchers about how well their language mappings correspond to grammar.) But the advent of these NLP "MRI scans" could point the way to answering those questions.