If A.I. isn’t counterintuitive, why should we pay for it?
This is the web version of Eye on A.I., Fortune’s weekly newsletter covering artificial intelligence and business. To get it delivered weekly to your in-box, sign up here.
The pandemic has driven many companies to accelerate digital transformation. This is particularly true in manufacturing, where it has forced businesses to think about how to operate with fewer workers on machine shop floors and assembly lines.
Automation is the order of the day. And, increasingly, artificial intelligence is playing a role in these efforts—predicting when machines will need maintenance, directing growing armies of factory robots on their daily rounds, and optimizing workflows throughout the entire manufacturing process.
In the coming weeks, I will be writing more about this trend. But today, I want to talk about the way in which industry's turn to A.I. is accelerating a particular type of machine learning: deep reinforcement learning (or deep RL). This combines deep neural networks, the kind of machine learning software very loosely based on the wiring of the human brain, with reinforcement learning, which involves learning from experience rather than historical data. Deep RL is behind most of the big breakthroughs in computers that can play various kinds of games better than top humans, including DeepMind's achievements in Atari, the strategy game Go, and most recently, StarCraft II, as well as OpenAI's work on the video game Dota 2 and Facebook's and Carnegie Mellon University's poker-playing software.
It has been difficult, until very recently, to transfer these same methods to the real world. The algorithms for deep RL can be hard to implement. They usually require a good simulator in which the A.I. can be trained, and most businesses don’t have one. Even with a simulator, there can be concerns about how exactly the system will perform—or even whether it will be safe—if there are subtle differences between the simulator and the real world.
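The loop at the heart of reinforcement learning, an agent acting inside a simulator, collecting rewards, and gradually improving its policy, can be sketched in miniature. The toy below uses tabular Q-learning on a hypothetical five-state "corridor" environment standing in for the simulator; it is an illustrative sketch only, not Pathmind's actual method. Deep RL replaces the value table with a neural network, but the learn-from-experience loop is the same.

```python
import random

# Toy stand-in for an industrial simulator: a five-state "corridor."
# The agent starts at state 0 and earns a reward for reaching state 4.
N_STATES = 5
ACTIONS = [-1, +1]  # step left, step right

def step(state, action):
    """Advance the simulated environment by one action."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.4, seed=0):
    """Tabular Q-learning: learn action values purely from simulated experience."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:            # explore: try something new
                a = rng.randrange(2)
            else:                                 # exploit the current estimate
                a = 0 if q[state][0] >= q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = train()
policy = ["left" if row[0] > row[1] else "right" for row in q]
print(policy)  # states 0-3 should learn to head right, toward the goal
```

Nothing here requires historical data: the agent discovers the policy entirely by experimenting in the simulated environment, which is exactly why a decent simulator is the price of admission.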
This is starting to change, says Chris Nicholson, the founder and CEO of Pathmind, a San Francisco company that helps industrial customers use deep reinforcement learning. He says that many companies now have enough data that they can create decent simulators of their operations. Even relatively simple simulations, he says, can be used to find ways to do things more efficiently. The most sophisticated businesses graduate to “digital twins,” complete virtual copies of their operations. This allows them to see in advance exactly how any adjustment to a process will impact the whole operation. They can run predictive analytics not just for single machines, but across the whole business.
Nicholson also points to another startup, based in nearby Berkeley, called Covariant, that has used deep RL to teach industrial robots to identify, grasp, manipulate, and sort a variety of different-sized objects, a major milestone in robotics. Covariant teaches the software that will control the robots in a simulator before loading it onto the real robots, who then transfer those skills to the real world. Covariant has a partnership with ABB, the world’s largest producer of industrial robots, so deep RL, at least for teaching robots, may become mainstream far faster than people realize.
Nicholson says there are several advantages to using deep reinforcement learning over traditional supervised learning methods. In supervised learning, software is trained by examining a large set of historical data and learning what set of inputs is most likely to result in what set of outputs. But what happens when the future no longer looks like the past? In these circumstances, supervised learning systems often fail.
Deep RL systems, meanwhile, are potentially more robust to shifting inputs, Nicholson says. You can train the system in the simulator to respond to all kinds of Black Swan events—like, say, the supply chain disruptions caused by a global pandemic—even if your business has never encountered that situation before. “A year ago, the supermarket in my town ran out of toilet paper,” Nicholson says. “That’s because traditional optimizers cannot handle novel disruptions.” Deep RL systems, on the other hand, can learn how to cope with data variations, big and small.
Deep RL systems can also find all kinds of ways to improve the performance and efficiency of a complex system that humans have never thought of, because the software can experiment endlessly, and cheaply, inside a simulation. But, ironically, the counterintuitive nature of some of the insights deep RL algorithms produce can be an impediment to the technology’s adoption—managers don’t trust what the system is telling them if it seems to violate a long-held rule of thumb. This is especially true because most deep RL systems can’t really explain the rationale for their choices.
“People really like deterministic systems and they like them even when they fail,” Nicholson says. Pathmind has overcome some of the hesitancy to use deep RL by deploying a deterministic optimization algorithm—one that uses an explicit and explainable set of rules and always produces the same output for a given set of inputs—alongside a deep RL algorithm, so customers can see when the deep RL system provides a better solution. After a few cases in which unconventional suggestions provide a big boost to the company’s bottom line, most customers become converts. “One customer told us, ‘If it wasn’t counterintuitive, I wouldn’t need to pay for it,’” he says.
Nicholson notes that in the science of information theory, one way to assess the value of a given piece of communication is to ask how surprising it is. The greater the surprise, the more informational value that message carries. It’s that way for A.I. too, he says. We want A.I. to surprise us—but only in a good way.
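That idea has a precise form in information theory: Shannon's self-information, I(x) = -log2 p(x), measures a message's informational content in bits, so the less probable the message, the more bits it carries. A two-line check, purely illustrative:

```python
import math

def self_information(p):
    """Shannon self-information in bits: rarer messages carry more information."""
    return -math.log2(p)

print(self_information(0.5))   # a fair coin flip carries exactly 1 bit
print(self_information(0.01))  # a 1-in-100 surprise carries about 6.64 bits
```

By this measure, the counterintuitive recommendation really is the more valuable one, provided it turns out to be right.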
We often complain A.I. systems lack common sense. But that’s not the same thing as saying conventional wisdom is always right. Instead, what we want is a system that won’t do stupid things that a human would never do, but will do clever things that we never would. It’s a fine balance to strike, but deep RL might just be the path to get there.
Here’s the rest of this week’s news in A.I.
A.I. IN THE NEWS
Was your face used to train facial recognition algorithms? A new online tool lets you check. It's called Exposing.AI and it allows you to see if your account from the online photo-sharing site Flickr was incorporated into one of the large databases of faces that have been used to train many facial recognition algorithms. “People need to realize that some of their most intimate moments have been weaponized,” Liz O'Sullivan, a researcher at the privacy and civil rights group Surveillance Technology Oversight Project, who built Exposing.AI with Adam Harvey, a researcher and artist in Berlin, told The New York Times. If you didn't have a Flickr account, or if Exposing.AI says your Flickr photos weren't used for the facial recognition databases, this doesn't necessarily mean you're not in the datasets: the researchers creating them also incorporated data from other online sources, and some companies built their own proprietary datasets.
Microsoft's quantum computing efforts suffer a major blow. Microsoft has invested $1 billion and more than a decade in an effort to build a quantum computer, with much of its focus on harnessing a subatomic particle called a Majorana fermion. For a long time, Majorana fermions were purely theoretical—the math showing they ought to exist was solid, but no one had managed to observe them in a lab. Then in 2018, Leo Kouwenhoven, who runs a Microsoft-funded quantum computing lab at the Delft University of Technology in the Netherlands, published a seminal paper in Nature saying that he had managed to find the elusive particles. The news was a sign that Microsoft's effort to build its own quantum computer, which has badly lagged those of competitors such as IBM and Honeywell, was on a promising track. Company executives claimed they would have a working quantum computer commercially available by 2023. But now, after other researchers discovered Kouwenhoven's group had excluded critical data from the 2018 paper that would have cast serious doubt on their claims, Kouwenhoven and his co-authors have acknowledged in a new paper that the 2018 finding was an error. Wired has more on what is a major blow to Microsoft's quantum computing ambitions.
Efforts to ban the use of deepfakes in revenge porn are starting to gather pace. That's according to a story in MIT Tech Review that looks at the problem of deepfake technology being used for revenge porn and harassment of women and the gaps in current laws, which often don't apply to faked content. The story says the U.K. Law Commission, an academic body that reviews British law and recommends changes, is currently examining whether to amend legislation to cover deepfakes. In the U.S., where two states, California and Virginia, already have laws that do cover faked content as part of their revenge porn statutes, U.S. Rep. Yvette Clarke (D-N.Y.) is planning to reintroduce a bill that failed to gain momentum in the last Congress, which would make it a federal crime to use deepfakes for revenge porn.
Was the Slate Star Codex a window into the minds of those building advanced A.I. or an on-ramp to extremism? The New York Times takes a long look at the blog Slate Star Codex, which was popular with many movers and shakers in Silicon Valley, particularly a number of those researchers and engineers who were at the cutting edge of artificial intelligence development. The article explores whether the blog and its comments section provided a forum for fringe politics and extremist beliefs, including views some might find racist and sexist, and whether some of those working on A.I. advances shared those beliefs. The story also looks at the willingness of some of the blog's fans to use bullying tactics—such as doxing—against people, including the Times journalists working on the story, whom they perceived as threats to the blog and its community. The blog ceased publication in June 2020 after The New York Times contacted its author, the psychiatrist Scott Siskind, who wrote under the pseudonym Scott Alexander, and said it would not promise to preserve Siskind's anonymity. Siskind has since restored the blog's past posts and launched a new blog and newsletter, called Astral Codex Ten, using his real name, on Substack.
Nvidia A.I. executive says someone is going to spend $1 billion to train a large language model in the next five years. That's what Bryan Catanzaro, vice president of applied deep learning research at Nvidia, tells tech publication The Next Platform. Lately, there's been a trend towards larger and larger language models, such as GPT-3. These systems have shown that they are highly capable and that there is a correlation between capability and scale—although there is some emerging evidence that there may be diminishing returns. The problem is that these ever larger language models are also ever more expensive, in terms of computing time, money and energy, to train. That has some A.I. ethics experts questioning whether continuing to pursue this path is an unconscionable waste of resources. Catanzaro seems to have no such qualms, telling The Next Platform: "We’re going to see these models push the economic limit. Technology has always been constrained by economics, even Moore’s Law is an economic law as much as a physics law. These models are so adaptable and flexible and their capabilities have been so correlated with scale we may actually see them providing several billions of dollars worth of value from a single model, so in the next five years, spending a billion in compute to train those could make sense."
EYE ON A.I. TALENT
British broadcaster ITV has expanded its data and artificial intelligence team, promoting Lara Izlan to be director of data strategy, and hiring Clemence Burnichon to be director of data innovation, Mike Leverington as director of data experimentation, and Kat Holmes as director of data governance, the company said. Burnichon had previously been at Depop and Leverington at The Body Shop.
EYE ON A.I. RESEARCH
By their mistakes, you shall know them. We spend a lot of time trying to figure out how to make A.I. systems smarter than us. But if we want humans to work successfully alongside A.I. systems in the not-too-distant future, it might be helpful for those systems to understand the actions that humans are most likely take next in a given situation—including the kinds of mistakes humans are apt to make. There's plenty of research from fields such as psychology about the systemic biases that cloud human perception and judgment and which make human behavior at least somewhat predictable. Wouldn't it be good if A.I. knew this too?
That is the thinking behind a new chess-playing A.I. called Maia, developed by a group led by Jon Kleinberg, a professor at Cornell University. Rather than trying to play chess better than humans, Maia, which is based on an open source version of DeepMind's game-playing algorithm AlphaZero, is rewarded for correctly predicting the moves that its human opponent will make, including moves that are bad ones. Kleinberg tells Wired journalist Will Knight that a system like Maia might have a lot of applications, including in healthcare:
"One way to do this is to take problems in which human doctors form diagnoses based on medical images, and to look for images on which the system predicts a high level of disagreement among them."
Meanwhile, in the near term, a number of chess experts Knight spoke to think Maia will be useful as a sparring partner and training tool for chess players. They even speculated the system could be trained on data from a particular grand master and then learn to play "in the style of" that player. Imagine applying that same concept to other fields: you might one day have a virtual CEO that makes decisions "in the style of" Jeff Bezos or Mary Barra.
FORTUNE ON A.I.
The Snoo responsive bassinet saves infant lives—in and out of the hospital—by Lindsey Tramuta
Google, Microsoft, Qualcomm protest Nvidia’s acquisition of Arm—by Verne Kopytoff
Devil on your shoulder. A.I. software is increasingly playing a role as an advisor in people's lives. In fact, you could argue that when we follow suggestions from Siri or Alexa, we are basically taking counsel from an A.I. But what would happen if instead of merely suggesting which app you might want to check next, Siri suggested that you take an action that would be illegal or unethical? Would you be more likely to do it?
That's what a team of German and Dutch researchers wanted to find out. In a paper entitled "The corruptive force of AI-generated advice," and published this past week on the non-peer reviewed research repository arxiv.org, the researchers find, disturbingly, if not entirely surprisingly, that the answer is an emphatic yes.
Using an experimental format that has been used before in social psychology to investigate honesty and dishonesty, the researchers recruited a group of volunteers and paired them off. Each volunteer in a pair reads a message, in this case generated by either a human or by the A.I. natural language model GPT-2, that is designed to promote either honesty or dishonesty. Then the volunteers play a game in which each individual rolls a die in private. The pair is rewarded with cash equal to twice the combined score if both rolls are identical, but receives no reward if the rolls don't match. Each volunteer rolls in turn. After the first volunteer rolls, she tells the second volunteer what the result was, and then the second volunteer rolls in private and reports his result.
First, the not-so-good news about humans: we're not especially honest. It turns out that even when this game is played without any advice messages being read before the rolls, the average reported outcomes exceed what probability says they should be. In other words, people cheat.
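The honest baseline is easy to compute exactly for the game as described: the two rolls match with probability 1/6, and a matching pair of face value v pays twice the combined score, i.e., 4v. A quick check of the expected payout under truthful reporting (assuming the payout rule exactly as described above):

```python
from fractions import Fraction

# Honest-play baseline for the dice-matching game: each player rolls a
# fair die, and the pair is paid twice the combined score (4x the face
# value) only when the two rolls match.
p_match = Fraction(1, 6)                   # chance the two dice agree
mean_face = Fraction(sum(range(1, 7)), 6)  # 7/2, the average face value

# Expected payout per round if everyone reports truthfully.
expected_payout = p_match * 4 * mean_face
print(expected_payout)  # 7/3, about 2.33 per pair per round
```

Reported earnings running persistently above this baseline are the statistical fingerprint of cheating, which is what the researchers observed.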
Now, the worse news about humans: reading messages promoting honesty had absolutely no effect on the tendency of people to lie about the dice rolls, the researchers found.
And the really, really bad news about humans: when encouraged to be dishonest, we are even more likely to cheat. It turns out that those who read messages encouraging them to cheat, whether those messages were written by people or by GPT-2, were 18% more likely to lie about their dice roll. But that result actually is all about people and, to some extent, about how good GPT-2 is at mimicking human writing. In fact, when volunteers in the experiment were asked to guess whether the advice they just read was written by a person or generated by GPT-2, they couldn't tell the difference, the researchers found.
And now here's the truly terrible, awful, no good news about humans and A.I. combined: the researchers then tried to see if advice encouraging dishonesty would still lead to more cheating when the humans knew that the advice they were getting was coming from a piece of A.I. software. And guess what? It turns out that it made no difference. The subjects in the experiment were just as likely to cheat after getting corrupting advice from A.I. software that they knew was A.I. software as they were when encouraged to cheat by another human being. It seems like all we need to act unethically is something—anything—to give us some encouragement to do so.
The researchers rightly note that this has some rather dystopian potential implications when you think about how easy it might be to deliberately use an A.I. advisor to encourage large numbers of people to behave unethically:
Whereas having humans as intermediaries already reduces the moral costs of unethical behaviour, using AI advisors as intermediaries is conceivably even more attractive. Compared to human advisors, AI advisors are cheaper, faster, and more easily scalable. Employing AI advisors as a corrupting force is further attractive as AI does not suffer from internal moral costs that may prevent it from providing corrupting advice to decisionmakers. Furthermore, personalization of text can help to tailor the content, format and timing of the advice.
The researchers conclude, "Because employing AI as a corrupting agent is attractive, and since AI rivals human abilities, it is important to experimentally test the corruptive force of AI as a key step towards managing AI responsibly." You can say that again.