The spring of 2020 made statistical models famous. In March and April, amid efforts to raise public awareness of how dangerous the coronavirus could be, two forecasting systems rose to prominence: one developed by Imperial College London and the other by the Institute for Health Metrics and Evaluation (IHME), located in Seattle.
But the models’ predictions varied widely. Imperial warned that up to two million people could die from COVID-19 in the United States by the summer, while the IHME estimates were much more conservative, at around 60,000 deaths by August.
As it turned out, neither was close to the truth. By early August, about 160,000 people in the United States had died from COVID-19.
The huge discrepancies in the spring forecasts caught the attention of a young data scientist. At the age of 26, Youyang Gu had a master’s degree in electrical and computer engineering from the Massachusetts Institute of Technology (MIT) and an undergraduate degree in mathematics, but had no formal training in pandemic-related fields such as medicine or epidemiology. And yet he thought that his experience working with data models could be useful during a pandemic.
In mid-April, while living with his parents in Santa Clara, California, Youyang Gu spent a week building his own COVID death prediction tool and a website to display the grim information.
Soon, his model was generating much more accurate results than those published by generously funded institutions with many years of experience.
“His model was the only one that seemed rational,” said Jeremy Howard, a renowned data expert and researcher at the University of San Francisco. “Other models repeatedly showed that they did not make sense, and there was a lack of knowledge and analysis on the part of those who published the forecasts and the journalists who wrote about them. People’s lives depended on these things, and Youyang Gu was the one who really analyzed the data and got it right.”
The forecasting model developed by Youyang Gu was, in some respects, simple. He initially considered looking at the relationship between COVID testing, hospitalizations, and other factors, but soon became convinced that state and federal governments were publishing such data inconsistently.
Daily death counts proved to be the most reliable figures. “Other models used more data sources, but I decided to rely on the number of deaths already recorded to predict future deaths,” Youyang Gu said. “Limiting this to a single source helped filter out background noise.”
To refine his calculations, Youyang Gu used machine learning algorithms, which gave the model a new and more advanced twist. After graduating from MIT, he had worked in the financial industry for a couple of years, developing algorithms for high-frequency trading systems; there, he knew that if he wanted to keep his job, his predictions had to be accurate.
Working on COVID, Youyang Gu kept comparing his predictions with the death totals that were eventually published, and continually tweaked his machine learning software to make the predictions even more accurate. Although the work took as much time as a full-time job, Youyang Gu did it as a volunteer and lived off his savings. He wanted his data to be free of any conflict of interest or political bias.
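The general idea of forecasting future deaths from recorded deaths alone, and tuning the model against what actually happened, can be illustrated with a toy example. The sketch below is not Youyang Gu’s model, whose internals the article does not describe. It assumes a constant daily growth rate and chooses it with a plain grid search over squared error; the function names and the sample counts are hypothetical.

```python
import math


def fit_growth_rate(daily_deaths, candidates):
    """Return the candidate daily growth rate with the lowest squared error."""
    best_rate, best_err = None, math.inf
    for rate in candidates:
        # Predict each day from the previous day's observed count.
        err = sum(
            (prev * (1 + rate) - actual) ** 2
            for prev, actual in zip(daily_deaths, daily_deaths[1:])
        )
        if err < best_err:
            best_rate, best_err = rate, err
    return best_rate


def forecast(last_value, rate, days):
    """Project daily deaths forward by compounding the fitted rate."""
    projected = []
    value = last_value
    for _ in range(days):
        value *= 1 + rate
        projected.append(value)
    return projected


# Invented illustrative counts, not real data.
observed = [100, 105, 110, 116, 122, 128]
rates = [i / 1000 for i in range(-100, 101)]  # -10% to +10% in 0.1% steps
rate = fit_growth_rate(observed, rates)
projection = forecast(observed[-1], rate, 7)
```

Re-running the fit each time new counts are published mirrors, very loosely, the compare-and-adjust loop described above.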
Youyang Gu’s model, while far from perfect, worked well from the start. At the end of April, the young scientist predicted that by May 9 there would be 80,000 deaths in the United States. The actual number of deaths was 79,926. A comparable IHME forecast released in late April predicted that the US would not exceed 80,000 deaths in all of 2020. Youyang Gu also predicted 90,000 deaths by May 18 and 100,000 deaths by May 27, and again the numbers matched.
At a time when the IHME expected the virus to fade as a result of restrictions on social contact and other measures, Youyang Gu predicted a second major wave of infections and deaths as many states reopened after lockdown.
The IHME was criticized in March and April for forecasts that did not reflect the actual figures. Nevertheless, the influential center, which is based at the University of Washington and has received more than $500 million in funding from the Bill & Melinda Gates Foundation, was cited by members of President Donald Trump’s administration at press conferences almost daily.
In April, Anthony Fauci, director of the US National Institute of Allergy and Infectious Diseases, told reporters that the death toll from COVID-19 would be “60,000 instead of 100,000-200,000,” as had been predicted earlier, echoing the IHME forecasts.
And on April 19, the same day that Youyang Gu warned of a second wave, Trump pointed to the IHME figure of 60,000 deaths as a sign that the fight against the virus would soon be over.
IHME officials also actively promoted their numbers.
“All over the news, you saw IHME trying to convince people that the death toll would drop to zero in July,” says Youyang Gu. “Any sensible person could have predicted that we would have between 1,000 and 1,500 deaths a day for some time. In my opinion, their behavior was very unfair.”
IHME Director Christopher Murray says that when the organization was better able to understand the virus after April, its predictions dramatically improved.
But that spring, with each passing week, more and more people began to hear about Youyang Gu’s work. He told journalists about his model via Twitter and emailed epidemiologists, asking them to check his numbers. In late April, the well-known University of Washington biologist Carl Bergstrom mentioned Youyang Gu’s model on Twitter, and the US Centers for Disease Control and Prevention (CDC) added his figures to its COVID forecasting website. As the pandemic worsened, Youyang Gu, a Chinese immigrant who grew up in Illinois and California, was taking part in regular meetings with representatives from the CDC and teams of professional modelers and epidemiologists, all of whom were working hard to improve their forecasts.
Views of Youyang Gu’s website skyrocketed, with millions of people checking it every day to find out what was going on in their states and in the US as a whole. In most cases, his predicted figures matched almost exactly the actual death counts released a few weeks later.
With such strong interest in these forecasts, more models began to appear over the spring and summer of 2020. Nicholas Reich, an associate professor in the Department of Biostatistics and Epidemiology at the University of Massachusetts Amherst, collected around 50 models and tested their accuracy over many months on the COVID-19 Forecast Hub. “Youyang’s model has always been among the best,” Reich said.
In November, Youyang Gu decided to discontinue his death forecasts. Reich, who had been combining the data from the various forecasts, found that the most accurate predictions came from this “ensemble of models,” in other words, from the combined data.
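To make the notion of combining forecasts concrete, here is a minimal sketch of one simple approach: taking the median of the individual models’ predictions for each target date. This is an assumption chosen for illustration, not a description of how the COVID-19 Forecast Hub builds its ensemble; the model names, dates, and numbers are all invented.

```python
from statistics import median


def ensemble_forecast(forecasts_by_model):
    """Combine per-model forecasts by taking the median for each date."""
    dates = sorted({d for f in forecasts_by_model.values() for d in f})
    combined = {}
    for date in dates:
        # Use only the models that issued a forecast for this date.
        values = [f[date] for f in forecasts_by_model.values() if date in f]
        combined[date] = median(values)
    return combined


# Hypothetical forecasts of cumulative deaths; all values are invented.
forecasts = {
    "model_a": {"week_1": 1000, "week_2": 1300},
    "model_b": {"week_1": 1200, "week_2": 1500},
    "model_c": {"week_1": 1500, "week_2": 2100},
}
combined = ensemble_forecast(forecasts)
```

A median is less sensitive than a mean to a single model that is far off, which is one reason a combined forecast can be steadier than any individual one.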
“Youyang Gu withdrew, showing incredible humility,” Reich said. “He realized that other models were working very well and that his work here was done.”
A month before the project was halted, Youyang Gu forecast that 231,000 deaths would be recorded in the United States by November 1. On November 1, the United States reported 230,995 deaths from the coronavirus.
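How close these forecasts were can be checked with simple arithmetic. The snippet below computes the absolute percentage error for the two forecast and outcome pairs quoted in this article (May 9 and November 1); the helper name `percent_error` is just an illustrative choice.

```python
def percent_error(predicted, actual):
    """Absolute forecast error as a percentage of the actual value."""
    return abs(predicted - actual) / actual * 100


# (predicted, actual) figures as quoted in the article.
may_9 = percent_error(80_000, 79_926)
nov_1 = percent_error(231_000, 230_995)

print(f"May 9 error: {may_9:.3f}%")
print(f"Nov 1 error: {nov_1:.4f}%")
```

Both errors come out below one tenth of one percent.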
Murray of the IHME has his own take on Youyang Gu’s departure. According to him, Youyang Gu’s model would not have captured the seasonal nature of the coronavirus and would have missed the winter jump in cases and deaths.
“He had the epidemic fading in the winter, and we had already seen seasonality in May,” says Murray.
The machine learning methods used by Youyang Gu work very well for short-term predictions, Murray says, but for drawing a more general picture they are “not particularly appropriate when you want to understand what is happening.” According to Murray, algorithms based on past events cannot take into account virus strains or how well vaccines might or might not work against them. The IHME, for its part, correctly predicted the initial spike of the virus, then wrongly predicted a sudden drop in deaths, before finally adjusting its model to better reflect the real situation. “We made a mistake in April,” admits Murray. “Since then, we have been the only group to consistently provide the correct data.”
Reich, who keeps track of the main models, says that the organization’s predictions in the later phases of the pandemic were acceptable. “At an earlier stage, the IHME model didn’t do what it advertised,” says Reich. “Only recently has it become an acceptable model. I wouldn’t say it’s one of the best, but it’s acceptable.”
Youyang Gu declines to respond to Murray’s comments on his model and instead offers the data scientist’s version of an ambiguous compliment. “I am very grateful to Dr. Chris Murray and his team for their work,” he says. “Without them, I would not be in the position that I am today.”
Reich says there are lessons to be learned from this story and urges people not to lean too heavily on early individual models the next time a pandemic strikes. He also doubts that six- or eight-week forecasts are ever very accurate. It would be better, he says, if in the future the CDC and others combined the models more quickly and published the combined data.
“I hope that we will invest time, energy and money in a system that will be better prepared to respond up front with a wider range of models,” says Reich. “We need to have people ready instead of scrambling and knocking on people’s doors.”
After a short break, Youyang Gu, now 27 and living in an apartment in New York, has returned to modeling. This time around, he is crunching figures on how many people in the United States have been infected with COVID-19, how quickly vaccines are being developed and produced, and when, if ever, the country could achieve herd immunity.
According to his forecasts, about 61 percent of the population should have some form of immunity by June, whether from a vaccine or from having had the disease.
Before the pandemic, Youyang Gu hoped to start a new company, possibly in the field of sports analytics. He is now considering continuing to work in the public health sector. He would like to find a job in which he can make a significant contribution to the common cause while avoiding the politics, bias and baggage that are sometimes inseparable from large institutions.
“There are a lot of shortcomings in this area that people around me could fix,” he says. “But I still don’t know if I have found a place for myself here.”