
MIT Study Uncovers Potential Bias in AI Medical Diagnostics Linked to Race, Gender, Age

Published on June 28, 2024

A recent study from MIT points to a troubling trend in AI models that diagnose disease from medical images: the models can predict a patient's race, gender, and age from the images alone, and they may rely on those traits, potentially leading to biased diagnostic outcomes. According to MIT News, the AI systems that are more accurate at identifying demographic information also have larger "fairness gaps" in diagnosing diseases across races and genders. Marzyeh Ghassemi, an MIT associate professor, linked high-capacity AI models' ability to predict human demographics to a previously unrecognized performance discrepancy across patient demographics.
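The "fairness gap" described above can be made concrete with a minimal sketch: the difference in diagnostic accuracy between demographic subgroups. The definition used here (largest gap between any two subgroups' accuracies) is an illustrative assumption, not the study's exact formulation, and all data below is toy data.

```python
import numpy as np

def fairness_gap(y_true, y_pred, groups):
    """Largest difference in accuracy between any two demographic subgroups.

    y_true, y_pred: arrays of binary diagnostic labels and model predictions.
    groups: array of subgroup identifiers (e.g. codes for race or gender).
    """
    accs = []
    for g in np.unique(groups):
        mask = groups == g
        accs.append(np.mean(y_true[mask] == y_pred[mask]))
    return max(accs) - min(accs)

# Toy example: the model is perfect on group 0 but only 50% accurate on group 1.
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(fairness_gap(y_true, y_pred, groups))  # → 0.5
```

A model can have high overall accuracy (here 75%) while still carrying a large gap of this kind, which is the pattern the study flags.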

These findings emerge against a backdrop where medical AI is increasingly integrated into health care, with the Food and Drug Administration (FDA) having approved hundreds of AI-enabled devices to date. The "fairness gaps" observed in the models raise questions about their real-world applications, particularly when they are applied to patient populations different from those they were trained on. Ironically, even models that retain high overall diagnostic accuracy may be undermined by their demographic prediction power: if they rely on demographic shortcuts, their results can be less accurate for some groups.

To address this bias, MIT researchers have developed methods for retraining AI models to improve their fairness. They tested strategies such as "subgroup robustness," which rewards better performance on underperforming subgroups, and "group adversarial" approaches, which remove demographic information from the analysis. Although these methods yielded positive results within the bounds of the original training dataset, their effectiveness diminished when the models were applied to patient data from different hospitals, as Haoran Zhang, an MIT graduate student and one of the paper's lead authors, explained to MIT News.
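The "subgroup robustness" idea can be sketched with one common formulation: instead of minimizing the average loss, train against the loss of the worst-performing subgroup, so the model cannot trade accuracy on a minority group for overall gains. This worst-group objective is an assumed illustration of the general approach, not the paper's exact method.

```python
import numpy as np

def worst_group_loss(losses, groups):
    """Subgroup-robust objective: return the mean loss of the
    worst-performing subgroup rather than the overall mean, so that
    improving one group cannot mask degradation on another."""
    return max(np.mean(losses[groups == g]) for g in np.unique(groups))

losses = np.array([0.1, 0.2, 0.5, 1.5])  # per-sample losses from some model
groups = np.array([0, 0, 1, 1])          # subgroup of each sample
print(worst_group_loss(losses, groups))  # → 1.0 (group 1 is doing worst)
```

During training, this value (rather than the plain average, 0.575 here) would be what gradient descent minimizes, which rewards better performance on the underperforming subgroup as the article describes.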

The study's implications are significant for health care providers who rely on externally developed AI models. The authors see a clear need to evaluate such models on local patient data before implementation to ensure diagnostic accuracy across all patient groups. They note that models developed and validated in one hospital setting might not translate well when used elsewhere, leading to possible disparities in patient care driven by the demographic skews embedded in the AI's predictive algorithms, despite high overall diagnostic accuracy.

Funding for the research came from several notable sources, including a Google Research Scholar Award and the Robert Wood Johnson Foundation, among others. With 882 FDA-approved AI devices and a clear acknowledgment of current shortcomings, the push for more equitable AI in medical image diagnostics continues as researchers explore additional methods to improve fairness in predictions across diverse patient populations.
