DeepMind AI predicts protein structures



Programs created by Google’s artificial intelligence company, DeepMind, have beaten humans who play chess, Go, and some Atari computer games. Biologists and computer scientists say the company has done the same with the protein folding puzzle. In an international competition, the company’s program predicted how proteins fold in three dimensions given only their amino acid sequences. “A great 50-year challenge in computer science has largely been solved,” says John Moult, a structural biologist at the University of Maryland, who announced the results of the competition this week.

Credit: DeepMind

The ORF8 model of SARS-CoV-2 predicted by artificial intelligence (blue) closely matches the experimentally determined structure (green).

The shapes and functions of proteins result from how the amino acids that make up each protein interact with each other and with their environment. Even a short stretch of protein involves a large number of these interactions, which makes predicting how proteins fold a huge challenge for scientists. In the early 1990s, Moult helped establish the Critical Assessment of Protein Structure Prediction, or CASP, competition to spur researchers to overcome that challenge. But the researchers, including Moult, admit that they had given up hope of living to see a solution.
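To give a rough sense of that scale, the short Python sketch below counts only the distinct pairs of amino acids in a chain of a given length. It is a back-of-the-envelope illustration, not a folding model: the function name is hypothetical, and real proteins also involve interactions with their environment and among more than two residues at a time.

```python
def residue_pair_count(n_residues: int) -> int:
    """Count the distinct pairs of residues in a chain of n_residues amino acids."""
    # Each residue can pair with every other residue; dividing by 2 avoids
    # counting the pair (i, j) a second time as (j, i).
    return n_residues * (n_residues - 1) // 2


if __name__ == "__main__":
    # Illustrative chain lengths only; they are not taken from the article.
    for length in (10, 100, 300):
        print(f"{length} residues: {residue_pair_count(length)} possible pairs")
```

The number of pairs grows roughly with the square of the chain length, so even modest proteins present tens of thousands of pairwise relationships to weigh.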

Then in 2018, at the previous CASP conference in Cancun, investigators could be found walking around in a daze. Newcomer AlphaFold, DeepMind’s protein folding team, had just surpassed long-standing groups with many years of experience. AlphaFold not only won the competition but also put a lot of daylight between itself and the next best team. Still, its predicted structures couldn’t match actual ones obtained through structural biology experiments, such as X-ray crystallography or cryogenic electron microscopy.

In this year’s competition, two-thirds of the protein structures predicted by AlphaFold were within experimental error. Basically, these structures were as good as what the researchers could obtain through their laboratory techniques.

Many groups have turned to machine learning techniques to try to predict the structures of proteins. They train their algorithms on known protein folds, hoping the programs can find patterns that translate into specific folds. Unhappy with its 2018 results, the AlphaFold team, led by John Jumper, returned to the drawing board and completely rebuilt its machine learning approach for this year. It wasn’t smooth sailing, Jumper said at a press conference last week, but it works. The new AlphaFold approach uses different machine learning techniques, including an attention-based algorithm that solves protein structures in small fragments, a process that Jumper likens to solving a puzzle, with different “islands of solution” that you must later figure out how to join. “We really didn’t know until we saw the CASP results how far we had taken the field,” says Jumper.
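For readers unfamiliar with the term, the Python sketch below shows generic scaled dot-product attention, the basic operation behind many attention-based models. It is not DeepMind’s method, whose details have not been published. The function names and the toy “residue” vectors are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Score how strongly this query matches each key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs


if __name__ == "__main__":
    # Hypothetical 2-number features for three positions in a sequence.
    residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in attention(residues, residues, residues):
        print([round(x, 3) for x in row])
```

Each output row is a weighted mix of every input row, which is how an attention layer lets information from distant positions in a sequence influence one another.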

One of the researchers who evaluated the results of the different teams’ programs for this year’s CASP competition was Andrei Lupas from the Max Planck Institute for Developmental Biology. He says it was immediately apparent that AlphaFold had made an incredible improvement over its 2018 effort. Not only did it have a huge advantage over the other groups overall, he says, but while the accuracy of the other teams’ predictions dropped as the structures became more difficult to solve, AlphaFold barely registered a difference. “They don’t care if the goal is easy or difficult,” he explains.

To test how good AlphaFold was, Lupas picked a protein whose complete structure his research team did not know. Lupas’ group had a good data set for the protein, he said, but over the past 10 years, they had exhausted various structural biology approaches without managing to translate it into a 3-D structure. “So we set this as a goal and asked for models,” he explains. The AlphaFold model “solved our structure in half an hour.”

“The ultimate vision behind DeepMind has always been to build general AI and then use it to help us better understand the world around us,” says Demis Hassabis, CEO and co-founder of DeepMind. Hassabis says he first became interested in the problem of protein folding in college. The company has “made great strides with games like Go, Starcraft and Atari,” he adds. “But it’s important to realize that they were always a stepping stone on the road to this overall goal.”

This new AlphaFold program has not completely solved the problem of protein folding. There are still some protein structures that cannot yet be resolved by AlphaFold, such as complexes with many protein-protein interactions or proteins on cell membranes. However, Lupas says, for many biologists, the solutions will be good enough for their needs. For example, Lupas could use rapidly resolved structures to compare different proteins and find specific shapes or domains that suggest they evolved from a common ancestor protein or peptide.

In the next 10 years, Lupas believes that AI will advance to the point where biologists will need only one data set and one algorithm to solve the structure of a protein, allowing them to spend less time on experiments and more time thinking about and conceptualizing what those results mean.

“Proteins are the most beautiful structures, and being able to predict exactly how they fold in three dimensions is really very, very challenging,” says Janet Thornton, Director Emeritus and Principal Scientist at the European Bioinformatics Institute, part of the European Molecular Biology Laboratory. “This is an ideal problem for machine learning,” she adds. “But I think there are also many problems, particularly in medicine and in the environment, that will really benefit from these machine learning approaches.”

While full details of the new AlphaFold system are not yet available for review, Jumper says the team plans to submit a full article describing its work to a peer-reviewed journal, just as it did after its success in 2018 (Nature 2020, DOI: 10.1038/s41586-019-1923-7). The team has also started several collaborations with research groups to see how AlphaFold could be useful, and it is exploring how it could make its services available to industry.
