As the pandemic hits new heights, with nearly 12 million cases and 260,000 deaths in the U.S., a glimmer of hope is emerging. Moderna and pharmaceutical company Pfizer, which are developing vaccines to fight the virus, have released preliminary data suggesting their vaccines are about 95% effective. Manufacturing and sales are expected to increase once the companies seek and receive approval from the U.S. Food and Drug Administration. Moderna and Pfizer representatives say the first doses could be available as early as December.
But even if the majority of Americans agree to be vaccinated, the pandemic won’t end suddenly. Kenneth Frazier, CEO of Merck, and others warn that drugs used to treat or prevent COVID-19, the condition caused by the virus, are not silver bullets. Most likely we will have to wear masks and practice social distancing well into 2021, not only because vaccines are unlikely to be widely available until mid-2021, but also because after each vaccine is released, studies will need to be conducted to monitor for potential side effects. Scientists will need even more time to determine the vaccines’ effectiveness, or the level of protection they offer against the coronavirus.
During this time of uncertainty, it is tempting to turn to fortune tellers for comfort. In April, researchers from the Singapore University of Technology and Design released a model they claimed could estimate the life cycle of COVID-19. After entering data – including confirmed infections, tests performed, and the total number of registered deaths – the model predicted the pandemic would end this December.
The reality is far worse. The U.S. recorded more than 2,000 deaths in a single day this week, the most since the devastating first wave in the spring. The country is now averaging over 50% more deaths per day than it was two weeks ago, on top of nearly 70% more cases per day.
It is possible – probable, even – that the data the Singapore University of Technology and Design team used to train their model was incomplete, unbalanced, or otherwise seriously flawed. They used a COVID-19 dataset compiled by the research organization Our World in Data, which included confirmed cases and deaths from the European Centre for Disease Prevention and Control, as well as testing statistics published in official reports. Hedging their bets, the model’s developers warned that forecast accuracy depends on the quality of the data, which is often unreliable and reported differently around the world.
While AI can be a useful tool when used sparingly and with reasonable judgment, blind faith in these types of predictions leads to bad decisions. In one case, a recent study by researchers from Stanford and Carnegie Mellon found that certain U.S. voter demographics, including people of color and older voters, are less likely to be represented in mobility data used by the U.S. Centers for Disease Control and Prevention, the California Governor’s Office, and numerous cities across the country to analyze the effectiveness of social distancing. This oversight means that policymakers who rely on models trained with the data may fail to set up pop-up testing sites or allocate medical equipment where they are most needed.
The fact that AI and the data it is trained on tend to be biased is not a revelation. Studies of popular image processing, natural language processing, and election-predicting algorithms have come to the same conclusion over and over again. For example, much of the data used to train AI algorithms to diagnose disease perpetuates inequalities, in part due to corporate reluctance to share code, datasets, and techniques. However, with a disease as widespread as COVID-19, the impact of these models is magnified a thousandfold, as is the impact of the government and organizational decisions they inform. Because of this, it is important to avoid putting stock in AI predictions about the end of the pandemic, especially if they lead to unjustified optimism.
“If these prejudices are not properly addressed, the spread of these prejudices under the guise of AI can exaggerate the health disparities among minorities who are already the most burdened of disease,” wrote the co-authors of a recent article published in the Journal of the American Medical Informatics Association. They argued that biased models could exacerbate the disproportionate impact of the pandemic on people of color. “These tools are based on biased data that reflect biased health systems and are therefore subject to a high risk of bias themselves – even if sensitive attributes such as race or gender are explicitly excluded.”
We would do well to heed their words.
For AI coverage, send news tips to Khari Johnson and Kyle Wiggers – and be sure to subscribe to the AI Weekly newsletter and bookmark The Machine.
Thank you for reading,
AI Staff Writer