Bias in Large Language Models Across Clinical Applications: A Systematic Review
By: Thanathip Suenghataiphorn, Narisara Tribuddharat, Pojsakorn Danpanichkul and more
Potential Business Impact:
Shows where AI language models used in clinical care treat patient groups unfairly, so that bias can be evaluated and mitigated.
Background: Large language models (LLMs) are rapidly being integrated into healthcare, promising to enhance various clinical tasks. However, they may be biased in ways that compromise patient care and exacerbate health inequities. This systematic review investigates the prevalence, sources, manifestations, and clinical implications of bias in LLMs.
Methods: We conducted a systematic search of PubMed, OVID, and EMBASE, from database inception through 2025, for studies evaluating bias in LLMs applied to clinical tasks. We extracted data on LLM type, bias source, bias manifestation, affected attributes, clinical task, evaluation methods, and outcomes. Risk of bias was assessed using a modified ROBINS-I tool.
Results: Thirty-eight studies met the inclusion criteria, revealing pervasive bias across various LLMs and clinical applications. Both data-related bias (from biased training data) and model-related bias (from model training) were significant contributors. Biases manifested as allocative harm (e.g., differential treatment recommendations), representational harm (e.g., stereotypical associations, biased image generation), and performance disparities (e.g., variable output quality). These biases affected multiple attributes, most frequently race/ethnicity and gender, but also age, disability, and language.
Conclusions: Bias in clinical LLMs is a pervasive and systemic issue with the potential to lead to misdiagnosis and inappropriate treatment, particularly for marginalized patient populations. Rigorous evaluation of these models is crucial. Furthermore, the development and implementation of effective mitigation strategies, coupled with continuous monitoring in real-world clinical settings, are essential to ensure the safe, equitable, and trustworthy deployment of LLMs in healthcare.
Similar Papers
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models
Computation and Language
Finds and fixes unfairness in AI language.
Large Language Models in Healthcare
Computers and Society
Helps doctors use smart computers for better patient care.
Large Language Models for Healthcare Text Classification: A Systematic Review
Computation and Language
Helps doctors sort patient notes using smart computers.