Reliable fairness auditing with semi-supervised inference
By: Jianhui Gao, Jessica Gronsbell
Potential Business Impact:
Finds unfairness in computer health helpers.
Machine learning (ML) models often exhibit bias that can exacerbate inequities in biomedical applications. Fairness auditing, the process of evaluating a model's performance across subpopulations, is critical for identifying and mitigating these biases. However, such audits typically rely on large volumes of labeled data, which are costly and labor-intensive to obtain. To address this challenge, we introduce $\textit{Infairness}$, a unified framework for auditing a wide range of fairness criteria using semi-supervised inference. Our approach combines a small labeled dataset with a large unlabeled dataset by imputing missing outcomes via regression with carefully selected nonlinear basis functions. We show that our proposed estimator is (i) consistent regardless of whether the ML or imputation models are correctly specified and (ii) more efficient than standard supervised estimation with the labeled data when the imputation model is correctly specified. Through extensive simulations, we also demonstrate that Infairness consistently achieves higher precision than supervised estimation. In a real-world application of phenotyping depression from electronic health records data, Infairness reduces variance by up to 64% compared to supervised estimation, underscoring its value for reliable fairness auditing with limited labeled data.
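The estimator described in the abstract can be illustrated with a short sketch. The snippet below is not the paper's Infairness code; it is a minimal illustration, on synthetic data, of the impute-then-correct (augmented) form the abstract describes: per-group accuracy is used as the audited metric, a cubic polynomial of the model score and group label stands in for the "carefully selected nonlinear basis functions," and all variable names and data are illustrative assumptions.

```python
# Sketch of semi-supervised fairness auditing in the spirit of the abstract.
# Assumptions (not from the paper): binary protected group, per-group accuracy
# as the audited metric, cubic polynomial basis for the imputation regression.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Synthetic population: covariates X, protected group A, true outcome Y.
N, n = 20_000, 500                       # unlabeled pool, small labeled audit set
X = rng.normal(size=(N + n, 3))
A = rng.integers(0, 2, size=N + n)
logit = X @ np.array([1.0, -0.5, 0.25]) + 0.3 * A
Y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Black-box ML model under audit: a noisy score thresholded at 0.5.
score = 1.0 / (1.0 + np.exp(-(logit + rng.normal(scale=1.0, size=N + n))))
pred = (score > 0.5).astype(int)

lab = np.zeros(N + n, dtype=bool)
lab[:n] = True                           # outcomes observed only on this subset

# Audited quantity: per-group accuracy E[1{pred = Y} | A = g]. The
# correctness indicator D is observable only where Y is labeled.
D = (pred == Y).astype(int)

# Imputation model: regress D on a nonlinear (polynomial) basis of the
# model score and group label, fit on the labeled data only.
basis = PolynomialFeatures(degree=3, include_bias=False)
Phi = basis.fit_transform(np.column_stack([score, A]))
imp = LogisticRegression(max_iter=1000).fit(Phi[lab], D[lab])
Dhat = imp.predict_proba(Phi)[:, 1]      # imputed correctness for everyone

def acc_semisup(g):
    """Augmented semi-supervised estimate of group-g accuracy: imputation
    mean over the full sample plus a labeled-residual correction, which
    keeps the estimator consistent even if the imputation model is wrong."""
    all_g, lab_g = (A == g), lab & (A == g)
    return Dhat[all_g].mean() + (D[lab_g] - Dhat[lab_g]).mean()

def acc_supervised(g):
    """Standard supervised estimate using the labeled data alone."""
    lab_g = lab & (A == g)
    return D[lab_g].mean()

for g in (0, 1):
    print(f"group {g}: supervised={acc_supervised(g):.3f}  "
          f"semi-supervised={acc_semisup(g):.3f}")
print("accuracy gap (semi-supervised):", abs(acc_semisup(1) - acc_semisup(0)))
```

The residual-correction term is what delivers the consistency property claimed in the abstract: even if the imputation regression is misspecified, the labeled residuals re-center the estimate, while a well-fitting imputation model shrinks the variance relative to the supervised estimate, mirroring the efficiency gains the paper reports.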
Similar Papers
Cost Efficient Fairness Audit Under Partial Feedback
Machine Learning (CS)
Finds unfairness in decisions, saving money.
Fairness Perceptions in Regression-based Predictive Models
Human-Computer Interaction
Makes organ transplants fairer for everyone.
One Size Fits None: Rethinking Fairness in Medical AI
Machine Learning (CS)
Checks if AI doctors treat everyone fairly.