The Concerti algorithm identifies intrapatient heteroplasmy in patients with SARS-CoV-2



Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in Wuhan, China, in late 2019, has caused the coronavirus disease 2019 (COVID-19) pandemic, which has affected more than 39.8 million people worldwide and has claimed more than 1.11 million lives to date. While extensive sequencing efforts are underway to understand the evolution of this virus, several studies of SARS-CoV-2 sequencing data show different allele frequencies of the virus within the same patient, a phenomenon called heteroplasmy.

The most likely explanation for these heterogeneous intrapatient viral reads is the existence of multiple viral strains. Recombination is an unlikely explanation because the chances of the virus remaining functional after disassembly within the host cell and reassembly into a virion with a different sequence are quite low. While there is evidence of multiple SARS-CoV-2 lineages in the same COVID-19 patient, to date there is no evidence that the sublineages recombine within that patient.

Clinical implications of heteroplasmy

Infection of the same patient by multiple strains of a virus has enormous clinical implications for the epidemiology, treatment, and control of the pandemic. Variations in viral strains can indicate different levels of transmissibility, different mechanisms of drug resistance, and different responses to treatment, and may explain the wide variety of symptoms. Given the importance of this for treatment and vaccine development, it is imperative that more research focus on the heteroplasmy of SARS-CoV-2.

Researchers from IBM Research, TJ Watson Research Center, NY, USA, recently presented a common methodological framework for inferring phylogenies from genomic data for multiple diseases, including COVID-19 and cancer. Their work is published on the preprint server bioRxiv*.

In the case of cancer, tumor heterogeneity in a patient indicates intrapatient heteroplasmy, and the absence of recombination in tumor cells is an accepted assumption. The researchers hypothesize that, just as different frequencies of genomic variants in a tumor indicate multiple tumor clones and offer a means of inferring them computationally, different frequencies of variants in viral genomic reads provide the means to infer multiple co-infecting sublineages.
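The idea that variants with similar allele frequencies may belong to the same clone or sublineage can be illustrated with a minimal sketch. This is not the Concerti method itself: the function name `group_by_frequency`, the `gap` threshold, and the variant labels and frequencies are all hypothetical, chosen only to show the principle on a single sample.

```python
def group_by_frequency(variants, gap=0.1):
    """Group variants whose allele frequencies are close to each other.

    `variants` maps a variant label to its allele frequency in one sample.
    Variants are sorted by frequency; a new group starts whenever the jump
    from the previous variant exceeds `gap` (an assumed threshold).
    """
    ordered = sorted(variants.items(), key=lambda item: item[1])
    groups = []
    for name, freq in ordered:
        if groups and freq - groups[-1][-1][1] <= gap:
            groups[-1].append((name, freq))
        else:
            groups.append([(name, freq)])
    # Return only the labels, grouped from lowest to highest frequency.
    return [[name for name, _ in group] for group in groups]


# Hypothetical allele frequencies for five variants in one sample.
sample = {"v1": 0.95, "v2": 0.92, "v3": 0.31, "v4": 0.28, "v5": 0.33}
groups = group_by_frequency(sample)
print(groups)
```

In this toy input the variants fall into two frequency bands, which a reader could interpret as two candidate sublineages. Real data would need error modeling and read-depth information that this sketch ignores.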

Concerti framework schematic. Given a set of genomic samples from multiple patients (COVID-19) or from multiple sites and time points (cancer), the algorithm takes the underlying alteration frequency distribution as input and performs (1) a negative selection to filter out the alterations that appear. (2) A multidimensional clustering is performed to identify pseudoclones / lineages, which are then enriched by (3) a single-sample clustering that (4) merges in alterations that were initially negatively selected. (5) All potential phylogenies are generated and their compatibility is evaluated. Finally, the set of consolidated phylogenetic structures over time or site is obtained with probability scores.
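The first steps in the caption above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the published implementation: the function names, the `min_freq` and `tol` thresholds, the greedy grouping strategy, and the sample data are all hypothetical. The compatibility check uses a constraint commonly assumed in clonal phylogeny, namely that an ancestor's frequency is at least that of its descendant in every sample; whether Concerti uses exactly this rule is not stated in the article.

```python
def negative_selection(freqs, min_freq=0.05):
    """Set aside alterations that never reach `min_freq` in any sample."""
    kept = {k: v for k, v in freqs.items() if max(v) >= min_freq}
    held = {k: v for k, v in freqs.items() if max(v) < min_freq}
    return kept, held


def cluster(freqs, tol=0.1):
    """Greedy multi-sample grouping of alterations into pseudoclones.

    An alteration joins the first group whose centroid is within `tol`
    of its frequency in every sample; otherwise it starts a new group.
    """
    clusters = []
    for name, vec in freqs.items():
        for c in clusters:
            if all(abs(a - b) <= tol for a, b in zip(vec, c["centroid"])):
                c["members"].append(name)
                n = len(c["members"])
                # Update the running mean of the group's frequencies.
                c["centroid"] = [
                    (m * (n - 1) + a) / n for m, a in zip(c["centroid"], vec)
                ]
                break
        else:
            clusters.append({"members": [name], "centroid": list(vec)})
    return clusters


def can_be_ancestor(parent, child):
    """Assumed rule: an ancestor is at least as frequent in every sample."""
    return all(p >= c for p, c in zip(parent["centroid"], child["centroid"]))


# Hypothetical frequencies of five alterations across two samples.
freqs = {
    "m1": (0.90, 0.85),
    "m2": (0.88, 0.90),
    "m3": (0.40, 0.10),
    "m4": (0.42, 0.12),
    "m5": (0.01, 0.02),
}
kept, held = negative_selection(freqs)
clusters = cluster(kept)
for c in clusters:
    print(c["members"], c["centroid"])
```

Here `m5` is set aside by the filter, and the remaining alterations form two groups. Under the assumed rule, only the high-frequency group can sit above the other in a candidate tree, which is the kind of compatibility test that prunes the space of possible phylogenies.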

An algorithm for understanding evolutionary phylogenies

The study describes a computational framework called Concerti for inferring phylogenies in both of these scenarios. To demonstrate the accuracy of the algorithm, the researchers reproduced some previously known results in each scenario. They also identified a possible novel parallel mutation in the SARS-CoV-2 virus and, in the context of cancer, discovered new clones carrying therapy-resistant mutations.

According to the researchers, Concerti’s ability to extract and integrate information from multiple points, sites, times or samples makes it possible to discover phylogenetic trees that capture spatial and temporal heterogeneity. These phylogeny models can directly affect therapeutics, as they can highlight the “birth” of clones that may harbor mechanisms of resistance to treatment, the “death” of subclones with drug targets, and the acquisition of functionally relevant mutations in clones that may appear clinically irrelevant.

Concerti tumor evolution tree T for patient GI1. Tumor evolution tree T for multi-site data from colon cancer patient GI1. The edges of T are labeled with known cancer genes, and the colors denote the different pseudoclones estimated by Concerti. Leaf nodes represent the different lesion sites. Single-site T trees are shown at the bottom as stacked disks whose sizes are proportional to prevalence values.

The team demonstrated how Concerti could be applied to any genome sequencing dataset with different allele frequencies, be it cancer or SARS-CoV-2, and how the results provided by the algorithm can have significant disease-specific clinical implications.

“We demonstrate in this article how Concerti can be applied to any genomic sequencing dataset with varying allele frequencies, be it cancer or the novel SARS-CoV-2 virus causing the COVID-19 pandemic, and the results can have profound disease-specific clinical implications.”

Specific integration of multipoint data could improve response to treatment

Identifying the presence of many viral strains in a single host can profoundly affect treatment approaches, vaccine development efforts, and infection mitigation strategies. Concerti data for COVID-19 patients show the ability to identify viral strains based on different allele frequencies and thus discover the presence of new homoplasies. The researchers believe that the results provided by Concerti effectively address the crucial challenges facing research in the development of therapies and vaccines.

With cancer, precise monitoring of tumor progression throughout the course of the disease can help identify new drug targets and therapeutic methods that could stabilize the disease and control the pressures of treatment exposure and changes in the tumor environment. The results of the study highlight how the specific integration of multipoint data by Concerti could facilitate more optimized and localized treatment plans for a better response to treatment.

“The Concerti results address the daunting research challenges facing therapies and may help provide the key to effective vaccine development.”

*Important notice

bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be considered conclusive, guide clinical practice or health-related behavior, or be treated as established information.
