Understanding bias in medical imaging AI

BLOOMINGTON – Experts at Indiana University participated in a nationwide effort to develop artificial intelligence (AI) technology that can accurately identify self-reported race from medical images. The AI can correctly predict a patient's race (the race they report on hospital paperwork) from the medical image alone, something a human radiologist cannot do.

The research, which was recently published in The Lancet Digital Health, included experts like John Burns, director of informatics within the Department of Radiology and Imaging Sciences at IU School of Medicine and an associate faculty member at the School of Informatics and Computing at IUPUI.

“If you think about it, radiology has the ability to be one of the least biased facets of medical care due to its limits in patient interaction,” Burns said. Because so much patient data is already withheld from the reports and images radiologists receive, introducing AI trained on diverse data is critical to its future use in clinical care.

But if the AI is this accurate, could it also begin to make assumptions about a patient's race that might instill bias? Burns and his colleagues asked this question, and their research set out to understand bias in medical imaging AI and how it can be reduced.

Value in federated learning

With their federated learning platform, Burns and his collaborators hope to encourage the development of guidelines for how AI models are trained and deployed. In doing so, the research group plans to make AI data sets more diverse and more accessible to researchers across the country and around the world, which is notable given how often national research projects are slowed by HIPAA requirements and the time-consuming processes they entail. Because the platform never shares protected health information, the group can offer its partners a shared research experience unlike any other.
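
To make the idea concrete, here is a minimal sketch of federated averaging, the basic technique behind platforms like the one described, written in Python with PyTorch. The model and data loaders (global_model, site_loaders) are hypothetical stand-ins, not the group's actual system; the point is that only model weights travel between sites, never patient images.

    # Minimal federated-averaging (FedAvg) sketch. Each site trains on its
    # own private images; only weights are sent back, never patient data.
    import copy
    import torch
    import torch.nn as nn

    def local_update(model, loader, epochs=1, lr=1e-3):
        """Train a copy of the global model on one site's private data."""
        local = copy.deepcopy(model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        local.train()
        for _ in range(epochs):
            for images, labels in loader:
                opt.zero_grad()
                loss_fn(local(images), labels).backward()
                opt.step()
        return local.state_dict()

    def federated_average(state_dicts):
        """Average the weights returned by each site (equal weighting)."""
        avg = copy.deepcopy(state_dicts[0])
        for key in avg:
            avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        return avg

    # One round: every site trains locally, the coordinator averages.
    # global_model and site_loaders are assumed to be defined elsewhere:
    # updates = [local_update(global_model, dl) for dl in site_loaders]
    # global_model.load_state_dict(federated_average(updates))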

“When I look at the AI models that are sold,” Burns said, “I think, ‘Yeah, you’ve got really good accuracy or performance metrics on your data, but I wonder, how does it perform on my data? And how do you show me that it performs well on my data?’”

Through their federated learning effort, Burns and his team can build a system that efficiently answers these kinds of questions for researchers, which should help ensure that training data reflect the communities a model aims to serve.
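
As a rough illustration of how a partner site could answer “how does it perform on my data?” without exposing any records, each site can evaluate the shared model locally and report back only the resulting metric. The names shared_model and my_loader below are placeholders, not part of the group's actual platform.

    # Hedged sketch: evaluate the shared model on a site's private data
    # and report only an aggregate number, never the data itself.
    import torch

    @torch.no_grad()
    def local_accuracy(model, loader):
        """Run the shared model on this site's private images; return accuracy."""
        model.eval()
        correct = total = 0
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
        return correct / total

    # Each partner runs this locally and shares only the scalar result:
    # print(f"site accuracy: {local_accuracy(shared_model, my_loader):.3f}")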

[Image: Samples of images altered to test AI accuracy]

Challenging the system

Notably, even when the group degraded the clinical images used in this research, the AI could still accurately predict race. Whether the images had their brightness clipped, were blurred, or had noise added, the predictions remained accurate.
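
The degradations described are standard image perturbations. Here is a minimal sketch of what they might look like, assuming a grayscale image stored as a numpy array of floats in [0, 1]; the parameter values are illustrative choices, not the study's.

    # Illustrative versions of the perturbations named above: brightness
    # clipping, blurring, and added noise on a grayscale float image.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def clip_brightness(img, low=0.1, high=0.9):
        """Clip pixel intensities to a narrow band, discarding extremes."""
        return np.clip(img, low, high)

    def blur(img, sigma=3.0):
        """Gaussian blur that removes fine diagnostic detail."""
        return gaussian_filter(img, sigma=sigma)

    def add_noise(img, std=0.1, seed=0):
        """Add zero-mean Gaussian noise, keeping values in [0, 1]."""
        rng = np.random.default_rng(seed)
        return np.clip(img + rng.normal(0.0, std, img.shape), 0.0, 1.0)

    # degraded = add_noise(blur(clip_brightness(x)))  # x: float image in [0, 1]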

“Even when there’s no longer anything diagnostically valuable in the image, the AI could still predict the patient’s race,” Burns said. “The information about race is so encoded into clinical images that even when you get rid of the clinical information, it’s still there.”

As the research continues, Burns and his colleagues hope to better understand what in the image the AI uses to infer race. Understanding this aspect of clinical images could open the door to new possibilities in diagnosing patients. Additionally, there are currently no guidelines requiring that AI models be trained and tested on diverse data, so the data sets behind today's models need not reflect the populations of the communities they serve. By encouraging more diverse AI training, as sketched below, Burns and his team hope to have a lasting impact on deep learning technology and its implementation in health care.
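
For illustration only, a guideline of the kind described might ask whether a training set's self-reported race mix tracks a reference population. The check below is hypothetical; the categories, shares, and tolerance are invented for the example, not drawn from the study.

    # Hypothetical diversity audit: flag groups whose share of the training
    # set deviates from a reference population share by more than `tolerance`.
    from collections import Counter

    def audit_representation(labels, reference, tolerance=0.05):
        """Return {group: (observed share, expected share)} for outliers."""
        counts = Counter(labels)
        total = sum(counts.values())
        flags = {}
        for group, expected in reference.items():
            observed = counts.get(group, 0) / total
            if abs(observed - expected) > tolerance:
                flags[group] = (observed, expected)
        return flags

    # Invented example data: a set that over-represents one group.
    labels = ["white"] * 800 + ["Black"] * 100 + ["Asian"] * 100
    reference = {"white": 0.60, "Black": 0.20, "Asian": 0.20}
    print(audit_representation(labels, reference))
    # -> {'white': (0.8, 0.6), 'Black': (0.1, 0.2), 'Asian': (0.1, 0.2)}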

Learn more about imaging informatics at IU School of Medicine.