How to use A.I. responsibly: lessons from DeepMind and Intel
A lot of businesses are contemplating how to govern the use of A.I. within their organizations and are concerned about deploying the technology responsibly. Last week, I had two separate conversations—one with a senior executive at the chip giant Intel, the other with a group of scientists and A.I. ethics experts from leading A.I. research company DeepMind—about how those companies try to ensure responsible A.I. development. The structures and approaches the two have employed are strikingly similar. And other companies could certainly do worse than to take inspiration from their example.
Here are some of the key takeaways:
•Ethics cannot be an afterthought. DeepMind told me how it put its “responsible A.I.” principles into practice with AlphaFold, its groundbreaking A.I. system that can take the genetic code for a protein and automatically predict the three-dimensional shape that protein will assume. This work is a massive leap forward for biology and may ultimately speed the development of new medicines. From the project’s outset, DeepMind worked with its in-house “Pioneering Responsibly” team—staff members with expertise in ethics and A.I. safety—to work through possible issues around the release of AlphaFold and its predictions. This included having one ethics researcher, Sasha Brown, essentially embedded with the AlphaFold team for the duration of the project.
•Require ethical impact assessments. Lama Nachman, an Intel fellow and director of its Intelligent Systems Research Lab, oversees the company’s approach to responsible A.I. innovation. She tells me that Intel requires all of its A.I. projects to complete a formal ethical impact assessment. The assessment is designed to pick up potential problems across a wide range of areas including security, privacy, fairness, sustainability, transparency, and human rights. The assessment, Nachman says, has two phases. The first, conducted early on, looks at the intended purpose of the A.I. system—is it an ethical use of the technology? The second looks at how the technology is built—does it use a biased data set, for instance, or is the system’s output understandable to users? In each case, Nachman says, teams building the technology need to show that they have tried hard to identify risks and put in place strategies to eliminate or at least mitigate these risks.
•Set up an institutional review committee. At Intel, these ethical impact assessments are then reviewed by a group called the AI Advisory Council. Nachman says the Council is composed of people with different expertise and backgrounds, including computer science, social science, ethics, policy, and legal affairs. The Council meets twice a week to review project proposals and impact assessments from throughout the company. She says that at first, some engineering teams treated the ethical assessments as “box-ticking” exercises and would often say that certain concerns “weren’t applicable” to the A.I. system they were building or using. But after a process of repeated questioning from the AI Advisory Council, she says, teams have gotten much more thoughtful about risks and how to mitigate them. The ethical assessments have become more detailed and robust.
At DeepMind, the Institutional Review Committee (IRC) performs a similar role to Intel’s AI Advisory Council. (At DeepMind, the IRC meets once every two weeks.) The Committee has a rotating membership drawn from “a range of senior leaders from across the organization” and representing different areas of expertise, according to Dawn Bloxwich, who leads the company’s Pioneering Responsibly team. But the IRC is always chaired by DeepMind Chief Operating Officer Lila Ibrahim. Brown says the IRC likes to engage with teams throughout their projects, and views its role as “more like a consultancy, rather than what could be seen as kind of ethics leads coming in at the end, and like reviewing something and getting feedback.”
The team building AlphaFold had to make six formal presentations to the IRC over the course of its work, says John Jumper, who leads the AlphaFold team. He says he looked at the IRC process not as a meddlesome burden, but rather as an essential safety net that actually gave the team greater confidence that it was doing the right thing. He says he almost always thought the recommendations the IRC made were reasonable and raised valid issues. In the few cases where he disagreed, Jumper says, he took it “as opportunity to go back and present new evidence.” “It’s about engaging with the IRC to get to something we are all comfortable with,” he says.
•Seek outside expertise and opinions. When working on AlphaFold, the team consulted with more than 30 experts outside of DeepMind on ethical concerns in areas such as bioethics and biosecurity. To make sure this advice was truly independent, DeepMind used an outside consultancy to find the relevant experts and the entire process was double-blinded: the experts didn’t know they were offering an opinion on a system being developed by DeepMind and DeepMind didn’t know the identity of the experts who were consulted. Rather than using a panel of outside experts to weigh in on the entire AlphaFold project, Brown says that the team often put very specific, narrow questions to one particular expert.
•Be transparent about the limitations of an A.I. system. One of the issues the AlphaFold team spent the most time grappling with, according to Jumper, was how to build confidence metrics into the A.I. system that would be useful to biologists. One of the team’s biggest fears, Jumper says, was that biologists would be overly trustful of AlphaFold and take its predictions as gospel in cases where they should actually be far more cautious about the A.I.’s output. In the end, DeepMind decided it needed to find a way for the system to produce a measure of its own confidence in its prediction for each amino acid (or residue) that makes up a protein. This way, scientists using the system could have a reasonable idea of when to trust what AlphaFold was predicting, and when to be more skeptical. Jumper says that the team has also tried to be very outspoken about cases where the confidence score itself is wrong—for instance, when there’s a single mutation in a protein’s DNA that breaks the structure, AlphaFold will often continue to have high confidence that the protein looks the same as before the mutation occurred.
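To make the idea of per-residue confidence concrete: AlphaFold reports a score called pLDDT on a 0–100 scale for each residue, and the public AlphaFold database groups scores into bands (very high above 90, confident above 70, low above 50, very low below that). The sketch below shows how a downstream user might triage a prediction using those bands; the scores and helper function names here are illustrative, not part of any DeepMind API.

```python
# Illustrative sketch of triaging AlphaFold-style per-residue
# confidence scores (pLDDT, 0-100). The threshold bands follow the
# conventions used by the public AlphaFold database; the score data
# below is made up for demonstration.

def confidence_band(plddt: float) -> str:
    """Map a per-residue pLDDT score to a qualitative trust band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"

def flag_uncertain_residues(scores, threshold=70.0):
    """Return indices of residues whose predicted positions a
    biologist should treat with extra skepticism."""
    return [i for i, s in enumerate(scores) if s <= threshold]

if __name__ == "__main__":
    # Hypothetical per-residue scores for a short stretch of protein.
    scores = [95.2, 88.1, 71.4, 62.0, 48.3, 91.7]
    print([confidence_band(s) for s in scores])
    print(flag_uncertain_residues(scores))  # indices of residues 62.0 and 48.3
```

The point of surfacing scores this way, rather than a single whole-protein number, is exactly the transparency issue Jumper describes: it tells a scientist which parts of a predicted structure to trust and which to verify experimentally.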
•Think hard about misuse. Nachman says that Intel has worked with Article One, a consulting group that specializes in helping companies understand issues around human rights, responsible innovation, and social impact. One of the exercises Article One has helped train Intel’s teams on is deliberately trying to think of all the “bad headlines” that might result from the project they are working on. She says doing this gets people to start considering what the unintended consequences of the technology they are building might be and how that technology might be misused or abused. The problem, she says, is that engineers often approach projects from the perspective of their own intentions. And as long as their own intentions are good, she says, they often think they are fine in terms of ethics. “The reality is, it actually has nothing to do with intentions,” she says.
At DeepMind, Bloxwich says, the company engages in a process of “red-teaming” its own technology—thinking about the nefarious ways someone might use or misuse A.I. that it is building, or how someone might try to break the technology. It also performs what she calls “pre-mortems,” where you assume everything goes wrong and then you have to work out why it might have gone wrong.
In the case of AlphaFold, Jumper says the team thought about whether publishing protein structures associated with pathogens might enable someone to build better bioweapons. It was one of the areas DeepMind sought outside expertise on too. Ultimately, Jumper says, the company concluded that there was already a tremendous amount of information on pathogens in the public domain and that there were far easier methods for someone to weaponize existing bacteria and viruses than trying to use AlphaFold to reverse engineer pathogen protein structures and then use that knowledge to build more deadly, transmissible diseases. (It also helped, Jumper says, that for highly-technical reasons having to do with virus biology, AlphaFold does not produce great predictions for virus proteins.) So the team concluded, Jumper says, that the benefits of publishing predicted structures for as many proteins as possible far outweighed any marginal increase in biosecurity risk.
•Think about the ethics of access to the technology. For DeepMind, concern about ensuring that there was widespread and equitable access to AlphaFold’s predictions led the company to simply publish predictions for more than 200 million proteins in a free, publicly-accessible database maintained by the European Molecular Biology Laboratory’s European Bioinformatics Institute. “We looked at accessibility from a hardware, software, and knowledge perspective, making sure that you didn’t have to have insanely expensive machinery in order to understand what AlphaFold will give you, so we weren’t exacerbating inequalities,” Brown says.
Jumper says the team even changed the color scheme it was using to display the three-dimensional protein structures to make sure someone who is color-blind can interpret them accurately.
But the team also knew that it needed to be proactive when it came to making sure those most in need of the technology had access to it. The company decided early on to partner with researchers working on neglected tropical diseases, to ensure they had access to protein structures associated with diseases, such as Chagas disease, that don’t attract much research money but impact millions of people in developing countries.
For more on how DeepMind thought about A.I. ethics around AlphaFold, you can read a company blog post on the subject here. And for more about how Intel thinks about responsible A.I. you can read an op-ed Nachman penned on the subject here.
Before we get to this week’s A.I. news, if you are interested in finding out more about how your company should think about A.I. ethics and responsible use of A.I., please consider attending a fab virtual round table discussion on A.I. “Values and Value” that I’ll be hosting on Thursday, October 6th from 12:00 to 1:00 PM Eastern Time.
The A.I. and machine-learning systems that underpin so much of digital transformation are designed to serve millions of customers yet are defined by a relatively small and homogenous group of architects. Irrefutable evidence exists that these systems are learning moral choices and prejudices from these same creators. As companies tackle the ethical problems that arise from the widespread collection, analysis, and use of massive troves of data, join us to discuss where the greatest dangers lie, and how leaders like you should think about them.
- Naba Banerjee, Head of Product, Airbnb
- Krishna Gade, Founder and CEO, Fiddler AI
- Ray Eitel-Porter, Managing Director and Global Lead for Responsible A.I., Accenture
- Raj Seshadri, President, Data and Services, Mastercard
You can register to attend by following the link from Fortune’s virtual event page.
A.I. IN THE NEWS
A hedge fund manager and his son have set out to disrupt pro football with machine learning. The billionaire hedge fund manager Paul Tudor Jones and his son Jack, who is a data scientist, have founded a company called SumerSports that is developing advanced analytics for NFL football teams, The Wall Street Journal reports. Jones tells the paper that pro football is ripe for application of the same kinds of quantitative methods that have transformed finance since the late 1980s. The new company’s CEO is Thomas Dimitroff, a former general manager of the Atlanta Falcons. The idea is to create scores that can help managers decide how to best optimize their draft picks for a team’s needs and the salary cap that they have to work within.
China’s text-to-image generation A.I. systems are being built to adhere to the country’s political censorship rules. Baidu, the Chinese Internet giant, built a highly-capable text-to-image generation A.I. it calls ERNIE-ViLG that has been praised for generating better images of Chinese celebrities and better anime art than similar systems created and trained by Western companies. But MIT Tech Review reports that Baidu’s system won’t respond to requests to generate images depicting Tiananmen Square, certain Chinese political leaders, or other words and images that the Chinese government considers politically sensitive. It seems that the Chinese government is keen that such A.I. systems not be used to subvert the limits on freedom of expression it has already put in place for humans.
Cruise expands its self-driving taxis to Texas and Arizona. The self-driving company owned by General Motors plans to launch a robotaxi service in Austin, Texas, and Phoenix, Arizona, in the next 90 days, Kyle Vogt, the company’s CEO and co-founder, tells TechCrunch. Cruise already operates driverless robotaxis in certain areas of San Francisco between 10 p.m. and 5:30 a.m. Vogt says that the service in Phoenix, which the company has already fully-mapped, and Austin, where it has never operated before in any capacity, will begin in a limited way and expand gradually.
EYE ON A.I. TALENT
Landing AI, the San Francisco-based startup that helps industrial companies implement computer vision software and which was co-founded by deep learning pioneer Andrew Ng, has hired Dan Maloney to be its chief operating officer, the company said in a statement. Maloney had previously been the CEO at data analytics company Zepl.
EYE ON A.I. RESEARCH
Scientists try to teach robots when it is appropriate to laugh. Researchers in Japan working on robots that can hold a conversation with humans are trying to teach a robot called Erica when it is appropriate to laugh, according to a story in The Guardian. The newspaper quotes Koji Inoue, who is leading the Erica project at Kyoto University, as saying he thinks laughter is important for a robot to master because it is an expression of empathy, which in turn is an important part of human conversation.
Fair enough. But Inoue is also using a strange data set to teach the robot: he recorded 80 speed dating conversations that male students had with the robot when it was actually being operated remotely by four female actors.
The team then tested how well Erica was doing by creating four dialogues for it to share with a person, including uses of its new “shared-laughter” conversational software. “These were compared to scenarios where Erica didn’t laugh at all or emitted a social laugh every time she detected laughter. The clips were played to 130 volunteers who rated the shared-laughter algorithm most favorably for empathy, naturalness, human-likeness and understanding,” the newspaper reported.
FORTUNE ON A.I.
Will A.I. destroy humanity? Long a staple of sci-fi apocalyptic fiction, the scenario has gotten an increasing amount of serious attention in the past decade as A.I. systems based on neural networks have made big leaps in performance—and efforts to think about how to stop it from happening have gotten increasing amounts of serious money from the likes of Elon Musk, Dustin Moskovitz, and Sam Bankman-Fried (all associated, to varying degrees, with the longtermism branch of the Effective Altruism movement). Now researchers from the University of Oxford, DeepMind, and the Australian National University—with grant funding from some of the ‘longtermist’ institutes backed by those same billionaires—have published a paper in AI Magazine saying that the chances that an advanced A.I. system will destroy humanity are even higher than previously assumed.
The researchers look at the problem of an advanced A.I. agent that is rewarded for maximizing some goal in an unknown environment. They say that unless specific steps are taken to prevent the A.I. system from doing so (and they propose a few possible ones towards the end of the paper), the A.I. system will likely conclude that it should tamper with humans’ ability to prevent it from receiving whatever reward signal it is getting. They say that, taken to a logical extreme—and why wouldn’t an advanced A.I. take something to a logical extreme?—the A.I. system will always conclude that the surest way to ensure it will go on receiving as many rewards as possible will be to eliminate people.
This is a kind of updating of the “paperclip” scenario—where an A.I. is told to simply produce as many paperclips as it possibly can and quickly concludes that the best way to do so is to eliminate people from the planet so that they don’t interfere with its ability to gather the resources, energy, and storage space needed for its successful paperclip manufacturing operation. The researchers conclude that “an existential catastrophe is not just possible, but likely.” Worse, they say it’s hard to think of ways to prevent this doomsday scenario from happening. (Although they do propose a few avenues that might work.)
I guess I am glad some smart people are thinking about this stuff. Certainly, it underlines the need for us to design ethics and some hard and fast rules (such as Asimov’s “First Law of Robotics”) into any advanced A.I. system we actually set loose in the world. I mean, you could argue that we already have some single goal, reward-maximizing (collective) intelligence agents out there: corporations. And we all know what kind of trouble they can get up to unless constrained by ethics and, sometimes, regulation.