DeepMind uses A.I. to find the shape of every known protein

Welcome to July’s special edition of Eye on A.I.

A year ago, DeepMind gave a massive gift to science: the London A.I. company, which is owned by Alphabet, used an A.I. system to predict the molecular structure of every protein in the human body and published the information in a big open database, free for researchers to use.

Before that, only 17% of human proteins had known structures. Now, suddenly, there was structural information on all of them, with about 36% being “high confidence” predictions, where the A.I. software, which DeepMind calls AlphaFold, had shown it could nail a protein’s shape to within an atom’s width of accuracy. The company also published predictions for all the proteins in 20 other organisms of interest to science, including the malaria parasite and the species of rat used in most lab experiments.

(I wrote a long feature story for Fortune in November 2020 on how DeepMind developed AlphaFold and used it to solve a 50-year-old grand challenge in biology: how to take a protein’s genetic sequence and predict its structure.)

Well, today, DeepMind has topped its earlier feat: it has published a structural prediction for almost every protein known to biology—200 million in total, and more than 200 times the number available up until now.

This is a very big deal—for science, and ultimately for medicine, agriculture, and maybe the environment too. Proteins are the building blocks of all life, composed of strings of amino acids. DNA gives instructions for the order in which to string these amino acids together. But as a cell produces the protein, it spontaneously folds into a complex three-dimensional shape. That shape helps determine how a protein functions. It is, for example, what allows SARS-CoV-2, the virus that causes COVID-19, to use its famous spike protein to latch onto and penetrate the membrane of a human cell.

Knowing the structure of a protein can be critical for designing new medicines: most drugs are small molecules that bind to a particular site on a protein and change its shape, either preventing the protein from working or, more rarely, enhancing its function. “I think it is true that almost every drug that has come to market in the past few years has been designed at least in part by having access to the protein structures,” says Janet Thornton, an expert on protein structure who is the emeritus director of the European Molecular Biology Laboratory’s European Bioinformatics Institute, the non-profit institution that is hosting the database of protein structure predictions that AlphaFold has produced. She says that having structures not only helps find targets for new drugs, but can also help ensure that those drugs are safer and do not inadvertently react with human proteins in ways that cause harmful side effects.

Patrick Vallance, the U.K. government’s chief scientific adviser, said in a tweet that the new AlphaFold database was “not only another huge advancement, but a step towards insuring the world is prepared for future pandemic threats.”

DeepMind has already collaborated with scientists working on drugs for two tropical diseases, Chagas disease and leishmaniasis, and with researchers developing enzymes that can digest plastic. Other scientists have used AlphaFold’s predictions to advance work on a malaria vaccine and to investigate ways to combat antibiotic resistance in bacteria.

But the possibilities of this new database are almost endless. Demis Hassabis, DeepMind’s co-founder and CEO, said it made looking up a protein’s structure “almost as easy as doing a keyword Google search,” and that it would help usher in “a new era of digital biology where A.I. and other computational methods can help to model biological processes.”

Hassabis himself is trying to help Alphabet cash in on this new era. He has founded Isomorphic Labs, a new Alphabet company dedicated to using AlphaFold and other A.I. tools to accelerate drug discovery, and is serving as its interim CEO. DeepMind has also recently set up a partnership with the U.K.’s Francis Crick Institute to work on protein design and genomics, giving the company the ability to test A.I.-based predictions with wet lab experiments.

Meanwhile, the original AlphaFold team is continuing to work within DeepMind. John Jumper, the DeepMind senior researcher who leads that team, is circumspect about what exactly the team is up to next. But in past conversations, Jumper has indicated they may look to modify AlphaFold or create a different A.I. system that can predict how multiple proteins interact and bind with one another (the current version of AlphaFold is only intended to predict the shape of a single protein in isolation). He says they might also work on what happens after a protein is built by a cell, for instance, predicting where on a protein sugar molecules will adhere to the structure. It’s also possible they might work on a kind of reverse version of AlphaFold that can take a protein’s structure and predict the most likely genetic sequence for that shape, which would be a useful tool for those trying to engineer synthetic proteins.

People talk a lot about “foundational models” in A.I. these days. These are building block A.I. systems, often very large ones trained on vast datasets, which can then be easily fine-tuned to perform many different useful functions in a particular domain. Most often the term has been applied to large language models. But AlphaFold is a truly foundational A.I. in that it has, almost overnight, become a standard part of every molecular biologist’s toolkit—as fundamental as an electron microscope or DNA sequencing.

What’s important about something so fundamental—and suddenly ubiquitous—is that it is hard to predict its ramifications. Just how significant AlphaFold is may only become apparent years from now.

I asked Jumper what use of AlphaFold most surprised him in the past year. He said it was the group of researchers who used AlphaFold, along with a technique called CryoEM that can produce a kind of fuzzy image of a protein’s molecular structure, to build a complete model of the nuclear pore complex. That’s a very large structure, composed of about 1,000 proteins, that serves as a kind of transportation tunnel between a cell’s nucleus and the surrounding cytoplasm. Jumper had thought there was no way AlphaFold could be used, at least not so soon, to map a structure that large and complex.

Looking ahead, Jumper says that people are likely to start running their own machine learning analysis on the entire protein database DeepMind has published, looking for similarities across organisms. That could help unlock evolutionary history or lead to big breakthroughs in determining exactly what certain classes of protein shapes do functionally. Those kinds of breakthroughs were simply not possible before because there was not enough data to run that sort of analysis, Jumper says.

And protein folding is just one of many areas of basic science where A.I. is making fundamental and transformative contributions. Ultimately, those breakthroughs are likely to filter down into commercial applications too. It’s a brave new world out there.

Here are a few other things happening in A.I. this week.

Jeremy Kahn

@jeremyakahn
jeremy.kahn@fortune.com

P.S. Before you go on to this week’s news, I want to tell you about a new daily newsletter Fortune is publishing. It’s aimed at human resources executives and is called CHRO Daily. Think of it as a complement to our flagship CEO Daily newsletter, but designed to give chief talent and chief people officers, and all those who aspire to those roles, news, analysis, and tips to help them do their jobs better. You can sign up here.

A.I. IN THE NEWS

Controversial facial recognition app PimEyes can be used to find images of children, including explicit ones. That is the conclusion of an investigation by The Intercept, which used digitally generated images of fake children to conduct searches using PimEyes and found that images of many real children turned up as possible matches. In many cases, the publication said, it would have been easy to identify these children. In other cases, the publication found that the app pulled up images of children that were labelled "potentially explicit." It said use of the app in this way "could lead to further exploitation of children at a time when the dark web has sparked an explosion of images of abuse." Giorgi Gobronidze, PimEyes' owner, told the publication that he had tasked the app's engineers with creating better safeguards for children, but that he also felt parents needed to be more careful and responsible about posting images of their kids online, especially to public websites.

The war in Ukraine has complicated efforts to ban "killer robots." Deutsche Welle, the German news site, checks in with the United Nations' committee debating what to do about A.I.-enabled lethal autonomous weapons. International human rights groups, technologists, and many countries want these kinds of weapons outlawed. But so far the U.N. has made scant progress towards any kind of legally binding restrictions on their use, despite eight years of debate. Now, the war in Ukraine has further complicated progress at the U.N., the publication reports. One problem is that Russia, which feels diplomatically isolated over its invasion of Ukraine, has claimed that international sanctions make it impossible for its experts to attend the U.N. committee meetings in Geneva, and it has used its power to try to block discussions in their absence. But a bigger issue may be that the Ukraine conflict is, to many military observers, proving the value of A.I.-enabled "loitering munitions." Otherwise known as kamikaze drones, these weapons have been deployed by both sides in the conflict. The current versions have some degree of autonomy, but generally still need a human to select the target they will strike. Fully autonomous versions, however, are on the horizon.

An A.I. system learned to use an "alternative physics" to explain what it was seeing in videos. Researchers at Columbia University wanted to see if an A.I. system could learn the fundamental variables that underpin physics, such as mass, velocity, and acceleration, from watching simple videos of objects in motion (a swinging pendulum, or an inflated balloon-like "air dancer" undulating in the wind). It turned out that the system could in fact learn to predict the motion it was seeing in the videos and could identify and output the number of physical variables it was using to make the prediction. But, in a finding that surprised the scientists, when they investigated what these variables were, they often found they were seemingly different from those used by human scientists to explain and predict the kind of motion depicted in the videos. In fact, in a few cases, the researchers were stumped as to what it was the A.I. was focusing on that was allowing it to make accurate predictions. The discovery of this "alternative physics" raises all kinds of questions about whether our current model of physics is really the best one. "Perhaps some phenomena seem enigmatically complex because we are trying to understand them using the wrong set of variables," Hod Lipson, director of the Creative Machines Lab in Columbia's Department of Mechanical Engineering, told the publication SciTechDaily.

A.I. is rapidly transforming biological research—with big implications for everything from drug discovery to agriculture to sustainability
