
Optum AI Algorithm Shows Racial Bias in Healthcare Risk Predictions

Critical

Optum's widely used healthcare risk prediction algorithm showed severe racial bias, requiring Black patients to be significantly sicker than white patients to receive the same care recommendations, and affecting an estimated 200 million patients nationwide.

Category
Bias
Industry
Healthcare
Status
Resolved
Date Occurred
Oct 1, 2019
Date Reported
Oct 25, 2019
Jurisdiction
US
AI Provider
Other/Unknown
Application Type
Embedded
Harm Type
Physical
People Affected
200,000,000
Human Review in Place
No
Litigation Filed
No
racial_bias, healthcare_algorithm, optum, unitedhealth, disparate_impact, risk_prediction, care_management, algorithmic_fairness

Full Description

In October 2019, researchers published a landmark study in Science revealing that an algorithm developed by Optum, a subsidiary of UnitedHealth Group, contained significant racial bias in its healthcare risk predictions. The algorithm was widely deployed across hospitals and health systems nationwide, affecting an estimated 200 million patients annually. The system was designed to identify patients who would benefit from high-risk care management programs, which provide additional medical attention and resources to prevent costly health complications.

The research team, led by Ziad Obermeyer of UC Berkeley, analyzed the algorithm's performance using data from a large academic hospital. They discovered that Black patients needed to be significantly sicker than white patients to receive the same risk scores and care recommendations: Black patients had 26.3% more active chronic conditions than white patients who received identical risk scores. As a result, Black patients were systematically under-referred to care management programs despite having greater medical needs.

The bias stemmed from the algorithm's use of healthcare costs as a proxy for healthcare need during training. Because of historical and ongoing systemic inequalities in healthcare access, Black patients typically incurred lower healthcare costs than white patients with similar health conditions: they faced barriers to accessing care, received less intensive treatments, or were treated at lower-cost facilities. The algorithm learned from this biased historical data, perpetuating and amplifying existing healthcare disparities.

When the researchers worked with Optum to retrain the algorithm using health outcomes rather than costs as the target variable, the bias was significantly reduced. The corrected algorithm increased the percentage of Black patients identified for high-risk care management from 17.7% to 46.5%, much closer to the 53.6% rate for white patients. At the single hospital studied alone, this represented a potential improvement in care access for tens of thousands of Black patients, implying far larger effects across all healthcare systems using the original algorithm.

The incident highlighted the critical importance of algorithmic auditing in healthcare AI systems and demonstrated how seemingly neutral technical choices, such as using cost as a proxy for need, can perpetuate systemic discrimination. Optum committed to working with the researchers to address the bias and implement the improved algorithm. The case became a seminal example in discussions of AI fairness and has influenced regulatory approaches to algorithmic bias in healthcare, though no formal regulatory action was taken against the company.
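The study's core audit technique can be illustrated with a small sketch: bucket patients by algorithmic risk score and compare the average number of active chronic conditions across groups within each bucket. The data below is synthetic and purely illustrative (the 1.26 disparity factor is an assumption loosely echoing the study's 26.3% figure), not the study's actual dataset or method in detail.

```python
import random
from collections import defaultdict

random.seed(0)

# Synthetic, illustrative records: (group, risk_score, chronic_conditions).
# We deliberately generate "black" patients with more chronic conditions
# at a given risk score, mimicking the disparity the study measured.
def make_patient(group):
    score = random.uniform(0, 100)
    base = score / 20                          # conditions roughly track score
    bias = 1.26 if group == "black" else 1.0   # assumed disparity factor
    conditions = max(0.0, random.gauss(base * bias, 0.5))
    return group, score, conditions

patients = [make_patient(g) for g in ("black", "white") for _ in range(5000)]

# Audit: within each risk-score decile, compare mean chronic conditions.
buckets = defaultdict(lambda: defaultdict(list))
for group, score, cond in patients:
    decile = min(int(score // 10), 9)
    buckets[decile][group].append(cond)

for decile in sorted(buckets):
    b = buckets[decile]
    mean_black = sum(b["black"]) / len(b["black"])
    mean_white = sum(b["white"]) / len(b["white"])
    print(decile, round(mean_black / mean_white, 2))
```

A ratio above 1.0 at the same risk decile means one group is sicker than the other at an identical score, which is exactly the calibration failure the researchers documented.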

Root Cause

The algorithm used healthcare costs as a proxy for healthcare need, but due to systemic inequalities, Black patients historically received less expensive care than white patients with similar health conditions. Because the training labels themselves encoded this disparity, the model learned to systematically underestimate Black patients' healthcare needs.
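The proxy problem can be sketched numerically: if the label a model learns is cost, and one group's cost systematically understates its need, the predictions inherit that gap even when race is never used as a feature. The access-gap factor below is an assumed, illustrative number, not a measured quantity from the study.

```python
import random

random.seed(1)

ACCESS_GAP = 0.7  # assumption: this group incurs only 70% of the cost
                  # for the same underlying need (illustrative only)

def simulate(group, n=10000):
    """Return (need, cost) pairs: cost tracks true need, but is
    suppressed for the group facing barriers to care."""
    rows = []
    for _ in range(n):
        need = random.uniform(0, 10)  # true healthcare need
        factor = ACCESS_GAP if group == "underserved" else 1.0
        cost = need * factor + random.gauss(0, 0.5)
        rows.append((need, cost))
    return rows

# A cost-trained model effectively ranks patients by expected cost.
# Compare the average proxy score (cost) at equal, high true need.
for group in ("underserved", "reference"):
    rows = simulate(group)
    high_need = [cost for need, cost in rows if need > 7]
    print(group, round(sum(high_need) / len(high_need), 2))
# The underserved group scores lower at identical need, so any fixed
# score threshold refers fewer of its genuinely high-need patients.
```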

Mitigation Analysis

Regular algorithmic auditing for disparate impact across demographic groups could have identified this bias. Using health outcomes rather than healthcare spending as training targets, implementing fairness constraints during model development, and conducting pre-deployment bias testing across racial groups would have prevented this systematic discrimination.
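One form such an audit could take is a simple selection-rate check: compare the share of each group referred to care management at the deployed threshold and flag large gaps. The threshold and the four-fifths-style cutoff below are illustrative assumptions, not Optum's actual process; the referral percentages echo the study's reported 17.7% vs. 53.6% pre-fix gap.

```python
def referral_rates(scores_by_group, threshold):
    """Share of each group whose risk score clears the referral threshold."""
    return {
        group: sum(s >= threshold for s in scores) / len(scores)
        for group, scores in scores_by_group.items()
    }

def disparate_impact_flags(rates, min_ratio=0.8):
    """Flag any group whose referral rate falls below min_ratio times
    the highest group's rate (a four-fifths-style screen)."""
    top = max(rates.values())
    return {group: rate / top < min_ratio for group, rate in rates.items()}

# Illustrative scores reproducing the study's pre-fix referral gap.
scores = {
    "black": [55] * 177 + [10] * 823,  # 17.7% clear a threshold of 50
    "white": [55] * 536 + [10] * 464,  # 53.6% clear it
}
rates = referral_rates(scores, threshold=50)
print(disparate_impact_flags(rates))  # → {'black': True, 'white': False}
```

Run routinely on live referral data, a check like this would have surfaced the disparity long before an external research team did.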

Lessons Learned

This incident demonstrated that healthcare AI systems can perpetuate and amplify existing disparities when trained on biased historical data. It emphasized the critical need for algorithmic auditing, careful selection of training targets that don't embed historical inequalities, and proactive bias testing before deployment in healthcare settings.