DeepMind ready to transform life sciences by solving the problem of protein folding




Google’s artificial intelligence division, DeepMind, has recently made significant progress toward solving one of the oldest challenges in biology: predicting the shape of a protein from its sequence of amino acids. According to Nature, the breakthrough has the potential to transform the fields of biology and chemistry, allowing scientists to determine the function of many proteins that are currently mysterious.

The shape of a protein defines its function, and most biological functions depend on proteins. “Protein folding” is the name given to the process that converts chains of amino acids into the three-dimensional structures that proteins require to carry out their functions. If scientists can determine the relationship between amino acid sequences and the shape of the proteins they make, they can determine which proteins affect different biological processes.

Scientists hypothesize that there are at least 80,000 proteins within the human proteome, but only a small fraction of these proteins have known structures. Traditional methods of determining the shape of a protein can take years of laboratory experiments, even with the help of computer algorithms and models. The work done by DeepMind can dramatically accelerate the protein structure discovery process, reliably determining protein structure in a fraction of the normal time.

The DeepMind researchers trained their algorithms on a database of approximately 170,000 protein sequences and the shapes corresponding to those sequences. Training ran on between 100 and 200 GPUs and took a few weeks to complete. The resulting model is called “AlphaFold.”

AlphaFold operates through an “attention algorithm”, starting by connecting small parts of the protein and then scaling up to connect larger and larger sections. Small groups of amino acids are joined first, and the algorithm then looks for ways to link these groups together.
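For readers who want a concrete picture, the idea of letting each part of a sequence weigh its relationship to every other part is usually implemented as an attention mechanism. The following is a minimal sketch of generic scaled dot-product attention in Python. It is not DeepMind's code, and AlphaFold's actual architecture is far more elaborate; the function names and the toy vectors standing in for amino acids are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of vectors.

    Illustrative only: each query scores every key, and the output is
    the weighted average of the values using those scores.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Hypothetical two-dimensional vectors standing in for three amino acids.
residues = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mixed = attention(residues, residues, residues)
print(len(mixed), len(mixed[0]))
```

In this toy run, each output vector is a blend of all three inputs, with similar inputs contributing more to one another. That is the general sense in which attention lets a model relate nearby and distant parts of a sequence.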

The AlphaFold researchers initially attempted to use conventional deep learning algorithms on genetic and structural data to predict the relationship between amino acids and proteins. The system then created consensus models of the proteins’ shapes. When this technique proved to have too many limitations, the researchers tried a new strategy. The team trained models on more features, and this time they had the model return predictions for the final structure of protein sequences.

The engineering team put AlphaFold to the test in a competition in which computer algorithms compete to predict the structure of a protein from its amino acid sequence. The competition is the “Critical Assessment of Protein Structure Prediction”, or CASP. Participants receive about 100 amino acid sequences, and their models must determine the structures of the corresponding proteins. AlphaFold not only outperformed the other computer models in terms of accuracy, but also performed comparably to traditional laboratory-based techniques. The final median AlphaFold score was approximately 92 out of 100, and a score of around 90 is generally considered comparable to results from experimental laboratory methods. For the most difficult proteins, the median AlphaFold score fell to 87.
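CASP's 0 to 100 scale is based on a metric known as the global distance test (GDT), which roughly measures what share of a predicted structure's amino acids sit close to their experimentally determined positions. The Python sketch below is a simplified illustration of that idea, not the official CASP scoring code: it assumes the two structures have already been superimposed, whereas the real metric searches over many superpositions. The function name and the toy coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, experimental, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score (0 to 100) for two pre-aligned structures.

    Each structure is a list of (x, y, z) positions in angstroms, one per
    amino acid. Assumption: the structures are already superimposed.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("structures must be non-empty and the same length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of amino acids within each distance cutoff, then averaged.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy four-residue example with made-up coordinates.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.5), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]
print(gdt_ts(predicted, experimental))  # prints 56.25
```

A perfect prediction scores 100 under this sketch, and the score falls as more amino acids drift beyond the distance cutoffs.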

According to DeepMind CEO and co-founder Demis Hassabis, the company is already making plans to give researchers access to AlphaFold, and scientists at the Max Planck Institute for Developmental Biology are already using the model to determine the structures of proteins they have been working on for more than a decade.

Janet Thornton, director emeritus of the European Bioinformatics Institute, was quoted by ScienceMag as saying that DeepMind’s achievements “will change the future of structural biology and protein research.” Meanwhile, University of Maryland Shady Grove biologist John Moult said he never thought that the protein folding problem would be solved in his lifetime.

While AlphaFold is highly unlikely to completely replace traditional experimental methods for discovering protein structures, it could dramatically increase the speed at which protein structures are discovered. Researchers may be able to determine the structure of a protein from lower-quality experimental data, and they already have access to a large volume of genomic data that could be translated into structures using AlphaFold’s predictions.
