Comparing Large Language Models and Traditional Machine Translation Tools for Translating Medical Consultation Summaries: A Pilot Study
By: Andy Li, Wei Zhou, Rashina Hoda, and more
Potential Business Impact:
Helps doctors translate patient notes better.
This study evaluates how well large language models (LLMs) and traditional machine translation (MT) tools translate medical consultation summaries from English into Arabic, Chinese, and Vietnamese. It assesses both patient-friendly and clinician-focused texts using standard automated metrics. Results showed that traditional MT tools generally performed better, especially for complex texts, while LLMs showed promise when translating simpler summaries, particularly into Vietnamese and Chinese. Arabic translations improved with text complexity due to the language's morphology. Overall, while LLMs offer contextual flexibility, they remain inconsistent, and current evaluation metrics fail to capture clinical relevance. The study highlights the need for domain-specific training, improved evaluation methods, and human oversight in medical translation.
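To make "standard automated metrics" concrete, here is a minimal sketch of corpus-level MT scoring with the sacrebleu library, assuming BLEU and chrF as representative metrics; the summary above does not name the exact metrics used, and the example sentences are illustrative placeholders, not data from the study.

```python
# Minimal sketch of automated MT evaluation, assuming BLEU and chrF
# (via sacrebleu) stand in for the study's "standard automated metrics".
import sacrebleu

# Hypothetical system outputs and human reference translations, one per
# segment. In the study's setting the hypotheses would be Arabic, Chinese,
# or Vietnamese translations of English consultation summaries.
hypotheses = [
    "Take one tablet twice daily with food.",
    "Follow up with your doctor in two weeks.",
]
references = [
    "Take one tablet two times a day with meals.",
    "See your doctor again in two weeks.",
]

# sacrebleu expects a list of reference sets, hence the extra brackets.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references])

print(f"BLEU: {bleu.score:.1f}")
print(f"chrF: {chrf.score:.1f}")
```

Scores like these reward surface overlap with a reference translation, which is one reason the study argues such metrics miss clinical relevance and that human oversight remains necessary.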
Similar Papers
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Computation and Language
Helps computers translate rare languages better.
Performance of Large Language Models in Supporting Medical Diagnosis and Treatment
Computation and Language
AI helps doctors diagnose illnesses and plan treatments.
Generalist Large Language Models Outperform Clinical Tools on Medical Benchmarks
Computation and Language
New AI helps doctors more than old AI.