MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation
By: Khai Le-Duc, Tuyen Tran, Bach Phan Tat and more
Potential Business Impact:
Translates doctor talk between many languages.
Multilingual speech translation (ST) in the medical domain enhances patient care by enabling efficient communication across language barriers, alleviating specialized workforce shortages, and facilitating improved diagnosis and treatment, particularly during pandemics. In this work, we present what is, to the best of our knowledge, the first systematic study on medical ST by releasing MultiMed-ST, a large-scale ST dataset for the medical domain, together with the accompanying models. The dataset spans all translation directions in five languages: Vietnamese, English, German, French, and Chinese (Traditional and Simplified). With 290,000 samples, it is the largest medical machine translation (MT) dataset and the largest many-to-many multilingual ST dataset across all domains. We also present the most extensive analysis in ST research to date, including empirical baselines, a bilingual vs. multilingual comparative study, an end-to-end vs. cascaded comparative study, a task-specific vs. multi-task sequence-to-sequence (seq2seq) comparative study, a code-switching analysis, and a quantitative and qualitative error analysis. All code, data, and models are available online: https://github.com/leduckhai/MultiMed-ST.
Similar Papers
Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
Computation and Language
Translates many languages spoken at once.
MCAT: Scaling Many-to-Many Speech-to-Text Translation with MLLMs to 70 Languages
Computation and Language
Translates speech to text in 70 languages faster.
Multilingual LLM Prompting Strategies for Medical English-Vietnamese Machine Translation
Computation and Language
Helps doctors translate English medical words to Vietnamese.