DeepMind's protein-folding breakthrough holds lessons for any company implementing A.I.

This is the web version of Eye on A.I., Fortune’s weekly newsletter covering artificial intelligence and business. To get it delivered weekly to your in-box, sign up here.

The biggest news in A.I. this week is DeepMind’s breakthrough on protein-folding.

A question that had confounded scientists for more than 50 years—how to use a protein’s genetic sequence to predict the exact three-dimensional shape that a protein will take—has effectively been answered by DeepMind’s A.I. system, which can now predict the structure of a protein to within an atom’s width of accuracy in many cases.

I got exclusive access to DeepMind’s protein-folding team in the run-up to Monday’s announcement. You can read my in-depth feature on exactly how the London-based A.I. company accomplished this goal here. You can also read about how its A.I. system, called AlphaFold 2, has already contributed to the fight against the COVID-19 pandemic here.

Today, I’ll highlight some lessons that emerged from DeepMind’s work on AlphaFold 2 that could apply to any company building an A.I. system.

• “Off-the-shelf” A.I. will only get you so far. Two years ago, DeepMind created a different A.I. system to predict protein structures. That original AlphaFold—AlphaFold 1.0 if you will—was pretty good, but not good enough to be very useful for biologists and medical researchers.

John Jumper, the senior researcher who leads DeepMind’s protein folding team, tells me that the original AlphaFold used “relatively off the shelf neural network technology,” in this case a standard type of neural network architecture originally used to classify objects in images. When it came time to try to improve the system, he says, “What we found is we’ve hit a real wall in what we were able to do with these types of techniques.”

To get better performance, DeepMind had to go back to the drawing board and design a neural network that was much more bespoke to the problem it was trying to solve. It began with a first principles question, Jumper says: “What should the solution look like? And how do we put that into our neural network instead of around it?”

That’s an important lesson for companies to remember, particularly if they are considering using outside vendors and pre-built A.I. components.

• End-to-end systems are better than assemblages of components… The 2018 AlphaFold was a collection of parts: one neural network predicted the distance between amino acid pairs in a protein, another tried to determine the most likely angles between them, and a third piece refined the overall structure. By contrast, AlphaFold 2 is what’s known as an “end-to-end system”: it takes the genetic information as an input and directly outputs a three-dimensional structure. It’s a good reminder that end-to-end systems generally achieve better performance.

• …but don’t ignore the “trust” factor. A big problem with neural networks that perform a task end-to-end is that they can be highly inscrutable. And that opacity can make it difficult for humans using the software to trust it.

In fact, this is why, in 2018, when DeepMind built a different A.I. system to diagnose 50 different sight-threatening eye diseases from a particular kind of eye scan, it used a system consisting of two different neural networks: One took in the raw data from the scanner and turned that into disease features; one then made diagnoses. This allowed human doctors to have more insight into why the diagnostic system was making its decisions.

In the case of AlphaFold 2, what DeepMind has done instead is build in a confidence gauge, which asks AlphaFold 2 to say how confident it is in its own predictions for each part of the protein structure. That confidence doesn’t really explain why AlphaFold 2 is predicting the structure, but it will give biologists and medical researchers some sense of when they should trust the predictions and when to treat them with more skepticism.
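For readers who want to picture how a per-part confidence gauge might be used in practice, here is a minimal sketch. It is not DeepMind's method or output format: the 0-to-100 scale, the sample scores, the cutoff of 70, and the function name `flag_low_confidence` are all assumptions made up for illustration.

```python
# Hypothetical confidence scores, one per position in a predicted structure.
# The 0-100 scale and these values are invented for illustration only.
confidences = [92.1, 88.4, 45.0, 38.2, 71.5, 95.3]


def flag_low_confidence(scores, threshold=70.0):
    """Return the indices of positions whose confidence is below the threshold."""
    return [i for i, score in enumerate(scores) if score < threshold]


# Positions a researcher might want to treat with more skepticism.
low_confidence_positions = flag_low_confidence(confidences)
print(low_confidence_positions)  # prints [2, 3]
```

The point of the sketch is only that a confidence score tells a user where to be cautious. It does not say why the system made a given prediction.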

• Domain expertise matters. DeepMind trounced academic molecular biology labs that had been working on the protein-folding problem for a lot longer. Part of the reason is that while these academic labs are full of people who deeply understand protein structure, they are not computer scientists. DeepMind has a level of machine learning expertise and engineering resources that these academic labs lack. That said, the team required input from protein structure experts. “We are always collaborative with domain experts,” Demis Hassabis, DeepMind’s co-founder and chief executive officer, says. Eventually DeepMind even hired some of these experts, like Jumper.

• But having a diverse team matters too. DeepMind also had people on the team from a range of different science backgrounds. That diversity is helpful, Pushmeet Kohli, the head of DeepMind’s A.I. for science division, tells me, because sometimes people coming from outside the field will have an insight that people from within the field can miss.

The key to making a diverse team work? “Respect,” Kohli says. “Being respectful of all different ways that people contribute and all the different insights that all these different people have.”

But, Kohli tells me, each person on the team should never lose sight of the fact that the goal is to solve the problem—not to prove that a particular approach to solving it is the right one. “The problem is the most important thing and everyone is contributing towards it in their own different way,” he says.

• Try more than one “mode” of working. Researchers who worked on AlphaFold 2 told me that they got stuck many times and couldn’t figure out how they were ever going to make more progress. In such moments, Hassabis says, it is worth switching between two different modes of working. One, which he calls “strike mode,” involves pushing the team to wring as much performance as possible out of the existing approach. But, when this stops working, he says, it is critical to switch to a “creative mode.” In this work style, Hassabis no longer presses the team on performance (in fact, he tolerates and even expects some temporary declines) and instead encourages the team to experiment widely. “You want to encourage as many crazy ideas as possible, brainstorming,” he says.

While some people can work equally well in both modes, others are more comfortable with one work style. Hassabis says it is important to recognize this—and even be prepared to change up the team’s composition and bring in fresh people with new ideas or people better suited to a particular work mode.

Now, here’s the rest of this week’s A.I. news.

Jeremy Kahn
@jeremyakahn
jeremy.kahn@fortune.com

A.I. IN THE NEWS

Facebook's use of A.I. for content moderation under fire for failures. Last month Facebook announced that its automated content moderation systems had gotten good enough that they would take over triaging the posts that are brought to the company's 15,000 human content moderators for review. But the new system doesn't seem to have pleased many folks. Bloomberg reported that many small businesses have had their advertising accounts banned in error by the new software and that they've been unable to get the company to address the problems. Facebook has issued a statement apologizing for "any inconvenience recent disruptions may have caused" but the underlying issue does not seem to have been remedied.

For more about how Facebook is using A.I. across its business and whether it is actually making a dent in the company's massive issues with hate speech, disinformation, phony accounts and more, tune in to Web Summit for my fireside chat with Mike "Schrep" Schroepfer, Facebook's chief technology officer, on December 2nd at 7:25 p.m. GMT (2:25 p.m. EST).

ServiceNow acquires Element AI. The Montreal-based Element AI, which builds machine learning systems for industry customers, is being acquired by ServiceNow, the cloud-based IT services company, for $500 million, according to a story in TechCrunch. The acquisition represents a major push into A.I. for ServiceNow, which is now being helmed by former SAP CEO Bill McDermott, who has made a series of deals recently as he seeks to turn ServiceNow into a one-stop shop for managing companies' digital transformation efforts.

Cerebras claims its massive A.I. computer chip can map fluid dynamics faster than a supercomputer. The Silicon Valley-based A.I. computer chip startup says that its CS-1 system, which consists of a single 18-gigabyte chip that has to be kept in a cooling device about the size of a mini-refrigerator, was 200 times faster at running a complex fluid dynamics simulation than the U.S. Department of Energy's Joule supercomputer, according to a story in tech publication The Register. But the CS-1 was only racing against Joule's largest processing cluster, consisting of 16,384 cores, and not Joule's complete arsenal of 84,000 cores. Plus, "the results should be taken with a pinch of salt," The Register cautions, "as the company has yet to publicly disclose its chip performance in more typical benchmarking tests used for AI and machine learning."

FAA gets closer to approving commercial drones that can operate autonomously. In a move that brings drone delivery operations in the U.S. one step closer to reality, the U.S. Federal Aviation Administration has issued airworthiness criteria for 10 drones, some of which are designed to operate autonomously out of the line of sight of their operators, the agency said. The criteria were issued for drones made by Amazon, as well as startups Airobotics, Zipline and Wingcopter, among others.

A Supreme Court case could make it easier for researchers to find security flaws in A.I. systems. This week the U.S. Supreme Court heard oral arguments in Van Buren v. United States, which will test whether cybersecurity researchers are potentially violating the 1986 Computer Fraud and Abuse Act (CFAA) when they try to find vulnerabilities in existing software and systems. A lower court ruled that this sort of research should not run afoul of the law. If the Supreme Court agrees, it will also make it easier for researchers interested in adversarial machine learning, a field of research that deals with how A.I. systems can be tricked into incorrect classification decisions or predictions. But if the Court reverses the lower court and says that security researchers can be prosecuted for improper use of software, it is likely to have a chilling effect on the field, according to a story in VentureBeat. My colleague Aaron Pressman also has more about the law and the case in Monday's Data Sheet newsletter.

Archaeologists are using machine learning to take the grunt work out of their jobs. A.I. is starting to have a major impact on science, as the DeepMind protein-folding breakthrough shows. But so far, most uses of machine learning in science are less about these fundamental advances and more about process: automating time-consuming and tedious data-collection processes, as a story in The New York Times demonstrates. The paper looks at how A.I. is being used to spot possible Scythian burial mounds in satellite images, count and classify Roman pottery sherds, or identify human bones illegally being sold on the Internet.

EYE ON A.I. TALENT

ABBYY, a digital intelligence and robotic process automation company based in Milpitas, California, has named Weronika Niemczyk as chief people officer, the company said in a statement. Niemczyk previously led human resources at Ascential, a British media business specializing in events, exhibitions and festivals.

Orbital Insight, the Palo Alto, California-based satellite imagery analytics company, has appointed Kevin O'Brien as its new chief executive officer, the company said in a news release. O'Brien had previously been the company's chief operating officer. Company founder James Crawford, who had been CEO, is transitioning to become chairman of the board as well as chief technology officer.

EYE ON A.I. RESEARCH

Your fancy-pants sales forecasting A.I. might not be as good as you think. That's the conclusion of a study conducted by researchers from Naver, the South Korean internet company. In a paper published on the research repository arxiv.org, the Naver team looked at the performance of probabilistic time-series models that have become popular lately for sales forecasting tasks and compared them to a simpler machine learning model and to linear regression. The bad news? Both of the simpler methods outperformed three supposedly state-of-the-art probabilistic A.I. techniques.

Part of the problem, the researchers report, is that past tests of the probabilistic methods mostly evaluated them on whether they were above or below a certain limit, but not how well they did at forecasting a precise future sales figure. But having an exact forecast "is essential in industries that require specific numbers, such as the number of delivery people in a logistics company." What's more, many of the more sophisticated sales forecasting models had erratic performance on different tests, the researchers found.

They issued a fairly stinging indictment of the way research on probabilistic models has been conducted and suggest that many previous studies may have cherry-picked the data used to evaluate these systems. "Prominent probabilistic time-series models do not work effectively especially for other datasets not used in the original papers," the researchers report.
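To see why the choice of evaluation matters, consider a rough sketch with made-up numbers. It is not the Naver team's code or data: the sales figures, the two forecasts, the limit of 110, and the function names are all hypothetical. Both forecasts look perfect when judged only on whether they land on the right side of the limit, yet one misses the actual figures by far more than the other.

```python
# All figures below are invented for illustration only.
actual = [100, 120, 90, 150]        # hypothetical actual sales
forecast_a = [60, 160, 50, 190]     # hypothetical "sophisticated" model
forecast_b = [105, 115, 95, 145]    # hypothetical simple baseline
limit = 110                         # hypothetical above/below cutoff


def threshold_accuracy(actual, forecast, limit):
    """Share of periods where the forecast falls on the same side of the limit."""
    hits = sum((a > limit) == (f > limit) for a, f in zip(actual, forecast))
    return hits / len(actual)


def mean_absolute_error(actual, forecast):
    """Average size of the miss between forecast and actual figures."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


print(threshold_accuracy(actual, forecast_a, limit))  # prints 1.0
print(threshold_accuracy(actual, forecast_b, limit))  # prints 1.0
print(mean_absolute_error(actual, forecast_a))        # prints 40.0
print(mean_absolute_error(actual, forecast_b))        # prints 5.0
```

A logistics company deciding how many delivery people to schedule would care about the second pair of numbers, not the first.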

FORTUNE ON A.I.

As libraries fight for access to e-books, a new copyright champion emerges—by Jeff John Roberts

This new VR simulator helps you prepare for the most awkward office encounters—by Lee Clifford

Why India’s software startups are poised for global dominance—by Atul Jalan and Brewer Stone

Know when to fold ’em: How a company best known for playing games used A.I. to solve one of biology’s greatest mysteries—by Jeremy Kahn

BRAIN FOOD

The technological arms race between the U.S. and China over artificial intelligence is growing.

Both nations believe that A.I. could give their respective militaries a big strategic and tactical advantage. But Avi Goldfarb and Jon Lindsay take a sober look at just what kind of military advantage A.I. may convey in a new report published by The Brookings Institution. The answer, they say, has as much to do with the human strengths of the military organization using A.I. as it does the technological capabilities themselves.

"In cases where decision problems are well-defined and plentiful relevant data is available, it may indeed be possible for machines to replace humans. In the military context, however, such situations are rare. Military problems tend to be more ambiguous while reliable data is sparse. Therefore, we expect AI to enhance the need for military personnel to determine which data to collect, which predictions to make, and which decisions to take."

Goldfarb and Lindsay say that it is critical that junior military staff understand the data that is fed into automated decision systems, and how an enemy might try to target or manipulate that data. More human judgment is needed in the lower ranks, not less.

They also write that having better A.I. prediction systems may lead to a kind of analysis-paralysis on the part of human decision makers: "Ironically, however, the same organizational capacity that enables judgment, and thereby makes war fighting more predictable and controllable, also has the potential to make conflict more ambiguous and less decisive. In short, the ability to automate aspects of decisionmaking can make it harder to come to a decision within an organization or on the battlefield."

Do these same lessons apply in many business contexts? I am always wary about analogies between business and war, and between the organization of militaries and companies, but my guess is that they probably do.
