DeepMind’s AI is figuring out the structures for every protein known to humanity

By understanding the shape of proteins in the human body, researchers can learn their function — and how to combat diseases.

DeepMind, the artificial intelligence subsidiary of Alphabet, says it will release the predicted shapes of nearly every protein found in the human body and in 20 other widely studied organisms, and eventually of nearly all proteins known to science.

Using its AlphaFold program, the company has already released 350,000 structures into the public domain and plans to predict and release the structures for an additional 100 million in the next few months.

By understanding the shape of proteins in the human body, researchers can figure out what they do, which is key to understanding the basic mechanisms of life, both when they work and when they don’t. But experiments to determine the structures of proteins can take months or longer in a lab; after decades of work, only 17 percent of the proteins in the human body have had their shape identified.

Cracking the code — To function correctly, proteins must fold into specific three-dimensional shapes, a cellular process known as protein folding. That structure determines what a protein does, and knowing its shape can help researchers recognize when problems arise and create drugs to combat them.

Efforts to develop a COVID-19 vaccine focused on the virus’s spike protein, for instance. The spike protein sits on the outside of the coronavirus and is how the virus enters human cells. The existing vaccines give instructions for human cells to make a harmless piece of this spike protein, after which the immune system recognizes that the protein doesn’t belong there and begins building a response. Knowing the shape of the spike protein was essential to creating a vaccine.

AI for good — DeepMind trained AlphaFold on approximately 170,000 protein structures from a publicly available data bank. The company says AlphaFold can now predict the outcome of protein folding with high accuracy in a matter of days, and that it has predicted the shapes of 36 percent of human proteins with a high degree of confidence. That’s huge if it means scientists no longer have to go through an exhaustive process to figure out the underlying structures of the proteins they’re trying to study.

We often think of artificial intelligence as being used mostly for pedestrian ends, like deciding what to show us in our news feeds. But technology like AlphaFold actually lives up to the greater ambition of computers studying huge swathes of data and making predictions that a human never could.