Disparities in Multilingual LLM-Based Healthcare Q&A
By: Ipek Baris Schlicht, Burcu Sayin, Zhixue Zhao, and more
Potential Business Impact:
Shows how to make AI health answers fairer across languages.
Equitable access to reliable health information is vital when integrating AI into healthcare. Yet information quality varies across languages, raising concerns about the reliability and consistency of multilingual Large Language Models (LLMs). We systematically examine cross-lingual disparities in pre-training sources and in the factual alignment of LLM answers for multilingual healthcare Q&A across English, German, Turkish, Chinese (Mandarin), and Italian. We (i) constructed Multilingual Wiki Health Care (MultiWikiHealthCare), a multilingual dataset drawn from Wikipedia; (ii) analyzed cross-lingual healthcare coverage; (iii) assessed how closely LLM responses align with these references; and (iv) conducted a case study on factual alignment using contextual information and Retrieval-Augmented Generation (RAG). Our findings reveal substantial cross-lingual disparities in both Wikipedia coverage and LLM factual alignment. Across LLMs, responses align more closely with English Wikipedia, even when the prompts are non-English. Providing contextual excerpts from non-English Wikipedia at inference time effectively shifts factual alignment toward culturally relevant knowledge. These results highlight practical pathways for building more equitable multilingual AI systems for healthcare.
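The last finding, that supplying a non-English Wikipedia excerpt at inference time shifts factual alignment, is in essence simple context injection. Below is a minimal Python sketch of that idea; `build_grounded_prompt`, `llm_generate`, and the German excerpt are illustrative placeholders, not code or data from the paper.

```python
# Minimal sketch of RAG-style context injection for multilingual health Q&A:
# prepend a Wikipedia excerpt in the user's language so the model grounds
# its answer in that language's reference rather than defaulting to
# English-centric knowledge.

def build_grounded_prompt(question: str, excerpt: str, lang: str) -> str:
    """Compose a prompt that asks the model to answer from the excerpt."""
    return (
        f"Context ({lang} Wikipedia excerpt):\n{excerpt}\n\n"
        f"Question: {question}\n"
        "Answer using the context above. If the context is insufficient, say so."
    )

def llm_generate(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., any chat-completion API)."""
    raise NotImplementedError("wire this to your model provider")

if __name__ == "__main__":
    # Illustrative German-language snippet standing in for a retrieved passage.
    excerpt = "Migräne ist eine neurologische Erkrankung ..."
    prompt = build_grounded_prompt("Was hilft gegen Migräne?", excerpt, "German")
    print(prompt)
```

The design point is that retrieval happens in the target language: the excerpt, not the model's pre-training mix, becomes the primary evidence the answer is asked to rely on.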
Similar Papers
Are LLMs Truly Multilingual? Exploring Zero-Shot Multilingual Capability of LLMs for Information Retrieval: An Italian Healthcare Use Case
Artificial Intelligence
Helps doctors find patient info from notes.
Dr. Bias: Social Disparities in AI-Powered Medical Guidance
Artificial Intelligence
AI gives different health advice to different people.