Score: 2

Bias Delayed is Bias Denied? Assessing the Effect of Reporting Delays on Disparity Assessments

Published: June 16, 2025 | arXiv ID: 2506.13735v1

By: Jennah Gosciak, Aparna Balagopalan, Derek Ouyang and more

BigTech Affiliations: Massachusetts Institute of Technology

Potential Business Impact:

Fixes unfairness when patient data arrives late.

Business Areas:
Facial Recognition Data and Analytics, Software

Conducting disparity assessments at regular time intervals is critical for surfacing potential biases in decision-making and improving outcomes across demographic groups. Because disparity assessments fundamentally depend on demographic information, their efficacy is limited by the availability and consistency of demographic identifiers. While prior work has considered the impact of missing data on fairness, little attention has been paid to the role of delayed demographic data. Delayed data, while eventually observed, might be missing at the critical point of monitoring and action -- and delays may be unequally distributed across groups in ways that distort disparity assessments. We characterize such impacts in healthcare, using electronic health records of over 5M patients across primary care practices in all 50 states. Our contributions are threefold. First, we document the high rate of race and ethnicity reporting delays in a healthcare setting and demonstrate widespread variation across groups in the rates at which demographics are reported. Second, through a set of retrospective analyses using real data, we find that such delays affect disparity assessments, and hence the conclusions drawn, across a range of consequential healthcare outcomes, particularly at the more granular state and practice levels. Third, we find that conventional methods for imputing missing race have limited ability to mitigate the effects of reporting delays on the accuracy of timely disparity assessments. Our insights and methods generalize to many domains of algorithmic fairness where delays in the availability of sensitive information may confound audits, and thus deserve closer attention within a pipeline-aware machine learning framework.
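
The mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' method or data: the records, group labels "A" and "B", day thresholds, and the function names `group_rates` and `disparity` are all hypothetical. It assumes a disparity is measured as a simple difference in outcome rates, and that a patient counts toward a group only once their race has been reported.

```python
from collections import defaultdict

# Hypothetical toy records, not data from the paper:
# (group, binary outcome, day on which race/ethnicity was reported).
RECORDS = [
    ("A", 1, 0), ("A", 1, 0), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 90), ("B", 0, 90), ("B", 0, 90),
]


def group_rates(records, as_of_day):
    """Outcome rate per group, using only records whose
    demographics were already reported by `as_of_day`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome, reported_day in records:
        if reported_day <= as_of_day:
            totals[group] += 1
            positives[group] += outcome
    return {g: positives[g] / totals[g] for g in totals}


def disparity(records, as_of_day, group_a="A", group_b="B"):
    """Rate difference (group_a minus group_b) as seen on `as_of_day`."""
    rates = group_rates(records, as_of_day)
    return rates[group_a] - rates[group_b]


# Assessment at monitoring time: most of group B is not yet reported.
timely = disparity(RECORDS, as_of_day=30)
# Retrospective assessment: all demographics have arrived.
retrospective = disparity(RECORDS, as_of_day=365)

print(f"timely: {timely:+.2f}, retrospective: {retrospective:+.2f}")
```

In this toy setup, group B's demographics arrive late for exactly the patients with the outcome value 0, so the timely estimate and the retrospective estimate differ in sign. Real delays need not be this extreme; the sketch only shows why delays that are unequally distributed across groups can distort an assessment made at monitoring time.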

Country of Origin
🇺🇸 United States

Page Count
19 pages

Category
Computer Science: Computers and Society