The Promise and Peril of Data-Driven Decisions in Healthcare

Artificial intelligence and machine learning can transform healthcare in low-resource settings, but only if we’re careful, experts say

Published February 18, 2021 under Around DGHI

AI, Data Science and Global Health

Data science can reveal hidden patterns and hard truths about healthcare access and delivery, pointing to systemic inequities and ways to address them. But it can also perpetuate those inequities.

So how do data scientists deploy their tools to do the former while avoiding the latter? Three experts in data science and artificial intelligence tackled that question during a recent webinar hosted by the Duke Global Health Institute.

Moderated by Elizabeth Turner, an associate professor of biostatistics and director of DGHI’s Research Design and Analysis Core, the discussion explored how new tools for data analysis and machine learning are aiding decisions about healthcare allocation and delivery, particularly in low- and middle-income countries. While universally enthusiastic about the promise of the field, panelists emphasized the need for strong ethical boundaries and common standards to assure that data is used appropriately and does not reinforce human biases.

The panel discussion was part of DGHI’s Think Global series. A summary of comments follows.

About the speakers

Andy Tatem is a professor of spatial demography and epidemiology at the University of Southampton and the Director of WorldPop and Flowminder. His research has led to pioneering approaches in the use and integration of satellite, survey, cell phone and census data to map the distributions of vulnerable populations for disease, disaster and development applications.

João Vissoci is an assistant professor of surgery and global health at Duke. His research interests include applying data science and technology to innovate in ways to address access to care and health systems gaps in global health and remote areas. His work includes the use of geospatial analysis and geostatistics, latent variable modeling, psychometrics and machine learning.

Eric Laber recently joined Duke as a Professor of Statistical Science and Biostatistics and Bioinformatics. His research focuses on data-driven sequential decision problems with applications in precision medicine, public health, defense, and retail planning. A current focus is the development of safe decision-support systems for high-risk decision problems, such as adaptive systems for planning diet, exercise and insulin adjustments for patients with type 1 diabetes.


What is AI as it relates to healthcare?

Eric Laber:

“I define artificial intelligence in healthcare as just simply using data to make healthcare better.”


João Vissoci:

“Artificial intelligence is a way to synthesize data, making data make decisions for us, or support decisions for us, in a faster way. … But I always like to stress that artificial intelligence and data science are just ways of understanding data. Data is at the center of all of this.”


On how data can inform decisions about allocation of health resources

Andy Tatem:

“In sub-Saharan Africa, for instance, there are situations there where there hasn’t been a census for 20 or 30 years. And there are decisions being made about allocation of vaccinations and allocation of resources – a range of very important decisions – based on very uncertain data.”

“The really exciting part for us is about extracting information from satellite images. … It’s really revolutionized what we can do in terms of the accuracy of being able to identify where people are likely to live. We can link it with survey data to estimate how many people live in areas that have never even appeared on maps.”
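
For readers curious how satellite and survey data might be linked, here is a minimal sketch. It is not WorldPop's actual method. It assumes a simple approach in which small-area surveys give an average number of people per built-up hectare for each settlement type, and that density is then applied to built-up area detected in imagery. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GridCell:
    cell_id: str
    settlement_type: str  # hypothetical label, e.g. "urban" or "rural", derived from imagery
    built_area_ha: float  # built-up area detected in satellite imagery, in hectares


def density_from_surveys(surveys):
    """Average people per built-up hectare for each settlement type.

    `surveys` is a list of (settlement_type, people_counted, built_area_ha)
    tuples from small areas where a household survey was carried out.
    """
    totals = {}
    for settlement_type, people, area in surveys:
        people_sum, area_sum = totals.get(settlement_type, (0.0, 0.0))
        totals[settlement_type] = (people_sum + people, area_sum + area)
    return {t: p / a for t, (p, a) in totals.items() if a > 0}


def estimate_population(cells, density):
    """Apply the surveyed density to every cell, including unsurveyed ones."""
    return {c.cell_id: c.built_area_ha * density[c.settlement_type] for c in cells}


# Illustrative numbers only, not real data.
surveys = [("urban", 1200, 10.0), ("urban", 800, 10.0), ("rural", 150, 5.0)]
cells = [GridCell("A", "urban", 4.0), GridCell("B", "rural", 2.5)]

density = density_from_surveys(surveys)
estimates = estimate_population(cells, density)
print(estimates)
```

A real model would also carry uncertainty through to the estimates, which this sketch leaves out.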


João Vissoci:

“Identifying where people live is the first step, and the second step is identifying how can we optimize resource allocation based on where healthcare facilities are and the dynamics of the health system.”

On his recent research on COVID cases in Brazil:

“We’ve been able to show, for example, that places where we had a surge of COVID deaths and COVID-related outcomes are also the places where you have way less access to care, or worse, difficulty accessing care because of access to travel and inaccessible roadways.”

On how data systems about population and healthcare access are guiding allocation of COVID vaccines:

“Building these kinds of systems allows us to optimize how resources are distributed within a network in a way that resources that are scarce – and COVID vaccines are definitely a scarce resource globally right now – can be allocated to have the most impact.”


On how data complements human decisions in healthcare

Eric Laber:

“For many diseases, the decision should be made by a person, but how do we help that person synthesize very complex data from multiple sources to help them make the best possible decision?”


On the need for standardization in data sets

Andy Tatem:

“When we’re working with data sets from multiple different fields or different government ministries, there’s a need to check for completeness and quality. A huge part of our work is the harmonization just to be able to get to that stage where we can extract new insights from them.”
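
As a rough illustration of the kind of completeness check described here, the sketch below reports, for each required field, the share of records that actually contain a value. The field names and records are hypothetical, and real harmonization work involves far more than this single step.

```python
def completeness_report(records, required_fields):
    """Share of records with a non-missing value for each required field."""
    total = len(records)
    report = {}
    for field in required_fields:
        # Treat absent keys, None and empty strings as missing.
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = present / total if total else 0.0
    return report


# Illustrative records only, as if merged from two different ministries.
records = [
    {"district": "A", "population": 100},
    {"district": "B", "population": None},
    {"district": "", "population": 50},
    {"district": "D", "population": 70},
]

report = completeness_report(records, ["district", "population"])
print(report)
```

Fields that fall below an agreed threshold can then be flagged for follow-up before any modeling begins.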


Is bias a problem in data-driven decision-making?

Eric Laber:

“If you’re trying to build a predictive model that imitates a system where there is already bias, you have a huge problem. … There have been plenty of examples where we fit one of these systems, and it either maintains bias or makes bias worse. It’s not just how the algorithm is built, but it’s also what data we use to train the algorithm and evaluate the algorithm, where it’s deployed and who has access to it.”


Andy Tatem:

“It’s vitally important to understand these biases. If you’re just taking data and putting it straight into a model, and not taking the time to understand that data, that’s a very dangerous route to go.”


João Vissoci:

“We need to discuss equity in data generation and data quality. If most models are developed out of data coming from high-income countries like the United States, then most models are going to reflect this society. So how are these models going to reflect the nuances of health systems that have problems with access to care?”


Balancing access and privacy of health-related data

João Vissoci:

“Data is an eternal thing that can be used and should be used to leverage other questions. There’s a strong ethical component to developing data that is more than answering the question for a project, but developing something that could be answering questions for other projects.”


Eric Laber:

“A major area of research now is how do we build systems where healthcare organizations can share data to make better decisions, but still respect the privacy of individuals. And I think there’s also a need for better consenting about how people’s data are used.”