DeepMind Accurately Predicts Protein Structure, Advancing a Decades-Old Challenge




Two examples of protein targets in the free modeling category show AlphaFold’s prediction compared to the shape of proteins determined by experimental results. AlphaFold predictions are in blue and experimental results are in green. DeepMind screenshot.

DeepMind, the Google subsidiary that has been beating chess and Go players with artificial intelligence, has set its sights on solving a decades-long problem: predicting the structures of proteins.

In a biennial challenge in which participants must blindly predict the structures of about 100 proteins from their amino acid sequences, a system developed by DeepMind drew researchers’ attention by predicting the proteins’ shapes with a high level of accuracy.

Called AlphaFold, the system determined the shape of around two-thirds of the proteins with accuracy comparable to time-consuming laboratory experiments. Its accuracy on most of the other proteins was also high, according to results shared on Monday by CASP (Critical Assessment of protein Structure Prediction), the organization behind the challenge. The predictions were compared with protein shapes determined in the laboratory and were evaluated by independent scientists.

This is an important advance because a protein’s shape is closely related to its function, yet the structure is difficult to predict from the amino acid sequence alone. In theory, a protein can fold in a multitude of ways before settling into its final structure. Determining a single protein’s shape experimentally can take years of research and expensive equipment.

“Proteins are extremely complicated molecules, and their precise three-dimensional structure is key to the many functions they perform, for example the insulin that regulates blood sugar levels and the antibodies that help us fight infection. Even small rearrangements of these vital molecules can have catastrophic effects on our health, so one of the most efficient ways to understand disease and find new treatments is to study the proteins involved,” John Moult, a computational biologist at the University of Maryland, College Park, who co-founded CASP, said in a press release.

London-based DeepMind has been working on AlphaFold for four years. It also beat the other teams in the previous CASP challenge in 2018, but this time it did so by a much wider margin.

The accuracy of the model is measured by the global distance test (GDT), which roughly measures the percentage of amino acid residues within a certain distance of the correct position. On a scale of 0 to 100, DeepMind’s latest AlphaFold system scored a median of 92.4 across all targets.
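To make the scoring idea concrete, here is a minimal Python sketch of a GDT-style score. It is an illustration under stated assumptions, not CASP's evaluation code: it assumes the predicted and experimental structures are already superimposed, that each residue is represented by a single point (for example its alpha carbon), and that the score averages over the commonly used distance cutoffs of 1, 2, 4 and 8 angstroms. The function name `gdt_ts` and the sample coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-style score on a 0-100 scale.

    Assumes both structures are already superimposed and that each residue
    is given as one (x, y, z) point in angstroms. The real test also searches
    over superpositions, which this sketch skips.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")

    # Distance between each predicted residue and its experimental position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues that fall within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues, with prediction errors of 0.5, 1.5, 3 and 10 angstroms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 10.0, 0.0)]
score = gdt_ts(predicted, reference)
print(score)  # 56.25
```

In this toy case one residue is within 1 angstrom, two within 2, and three within both 4 and 8, so the average of 25%, 50%, 75% and 75% gives 56.25. A perfect prediction would score 100.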

For the latest version of AlphaFold, DeepMind designed a neural network that interprets the structure of a protein as a “spatial graph.” The team trained the system on 170,000 protein structures from the Protein Data Bank, as well as on large databases of protein sequences whose structures are unknown.
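One way to picture a “spatial graph” is to treat each amino acid residue as a node and to connect residues that sit close together in three-dimensional space. The Python sketch below builds such a graph from residue coordinates. It only illustrates the data structure: the article does not describe how AlphaFold constructs or uses its graph, and the function name `spatial_graph`, the 8-angstrom cutoff and the sample coordinates are all assumptions made for this example.

```python
import math


def spatial_graph(coords, cutoff=8.0):
    """Build an adjacency map linking residues that are close in space.

    coords: list of (x, y, z) points in angstroms, one per residue.
    cutoff: maximum distance for an edge (an illustrative assumption).
    Returns a dict mapping each residue index to a set of neighbor indices.
    """
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Add an undirected edge when two residues are within the cutoff.
            if math.dist(coords[i], coords[j]) <= cutoff:
                graph[i].add(j)
                graph[j].add(i)
    return graph


# Toy example: three residues in a row and one far away from the rest.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
graph = spatial_graph(coords)
print(graph)
```

Here the first three residues are all linked to one another, while the distant fourth residue has no neighbors. The pairwise loop is quadratic in the number of residues, which is fine for a sketch but would need a spatial index for very large structures.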

This allowed the system to determine structures in a matter of days, the team that developed it wrote in a blog post. An internal confidence measure also indicates which parts of each predicted protein structure are reliable.

What does all this mean? It could have broad implications for drug discovery and for understanding specific diseases. Andrei Lupas, a director at the Max Planck Institute for Developmental Biology and a CASP evaluator, said the system helped his team solve a protein structure they had been stuck on for nearly a decade.

Andriy Kryshtafovych, a UC Davis researcher and one of the judges, described the result as a “triumph for team science” and credited the collaborative work of researchers over the years with making this achievement possible.

“Being able to investigate the shape of proteins quickly and accurately has the potential to revolutionize the life sciences,” he said in a press release. “Now that the problem has been largely solved for individual proteins, the way is open for the development of new methods for determining the shape of protein complexes, collections of proteins that work together to form much of the machinery of life, and for other applications.”
