Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs
By: Yingfeng Luo, Ziqiang Xu, Yuxuan Ouyang, and more
Potential Business Impact:
Translates among 60 languages with stronger quality, centered on both Chinese and English.
Large language models have significantly advanced multilingual machine translation (MMT), yet broad language coverage, consistent translation quality, and English-centric bias remain open challenges. To address these challenges, we introduce LMT, a suite of Large-scale Multilingual Translation models centered on both Chinese and English, covering 60 languages and 234 translation directions. During development, we identify a previously overlooked phenomenon of directional degeneration, in which symmetric multi-way fine-tuning data overemphasize reverse directions (X → En/Zh), leading to excessive many-to-one mappings and degraded translation quality. We propose Strategic Downsampling, a simple yet effective method to mitigate this degeneration. In addition, we design Parallel Multilingual Prompting (PMP), which leverages typologically related auxiliary languages to enhance cross-lingual transfer. Through rigorous data curation and refined adaptation strategies, LMT achieves state-of-the-art performance among models of comparable language coverage, with our 4B model (LMT-60-4B) surpassing the much larger Aya-101-13B and NLLB-54B models by a substantial margin. We release LMT in four sizes (0.6B/1.7B/4B/8B) to catalyze future research and provide strong baselines for inclusive, scalable, and high-quality MMT (https://github.com/NiuTrans/LMT).
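The abstract names Strategic Downsampling without specifying how it is applied. Below is a minimal Python sketch of the general idea, assuming it amounts to keeping only a fraction of the reverse-direction (X → En/Zh) pairs that a symmetric multi-way corpus produces; the `reverse_keep_ratio` knob, the pair schema, and the function name are hypothetical illustrations, not the paper's actual procedure.

```python
import random

def strategic_downsample(pairs, reverse_keep_ratio=0.3, seed=0):
    """Thin out reverse-direction (X -> En/Zh) pairs from a symmetric
    multi-way fine-tuning corpus to reduce many-to-one mappings.

    `reverse_keep_ratio` is an assumed knob; the abstract does not give
    the ratio the authors actually use.
    """
    rng = random.Random(seed)  # seeded for reproducible corpus builds
    kept = []
    for pair in pairs:
        # Assumed schema: {"src_lang": ..., "tgt_lang": ..., "src": ..., "tgt": ...}
        is_reverse = pair["tgt_lang"] in {"en", "zh"}  # many-to-one direction
        if not is_reverse or rng.random() < reverse_keep_ratio:
            kept.append(pair)
    return kept
```

Keeping every forward (En/Zh → X) pair while subsampling the reverse side preserves coverage of the 234 directions while rebalancing the data away from the degenerate many-to-one mappings the abstract describes.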
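PMP is likewise described only as leveraging typologically related auxiliary languages in the prompt. A hedged sketch of what such a prompt template could look like follows; the wording, the `build_pmp_prompt` name, and its arguments are assumptions for illustration, not the paper's exact prompt.

```python
def build_pmp_prompt(src_lang, tgt_lang, src_text, aux_lang, aux_text):
    """Parallel Multilingual Prompting (sketch): show the same source
    sentence in a typologically related auxiliary language so the model
    can exploit cross-lingual transfer. Template wording is illustrative.
    """
    return (
        f"Translate the following {src_lang} sentence into {tgt_lang}.\n"
        f"{src_lang}: {src_text}\n"
        f"Parallel {aux_lang} version (a related language): {aux_text}\n"
        f"{tgt_lang}:"
    )
```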
Similar Papers
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
Computation and Language
Helps computers translate rare languages better.
MCAT: Scaling Many-to-Many Speech-to-Text Translation with MLLMs to 70 Languages
Computation and Language
Translates speech to text in 70 languages faster.
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study
Computation and Language
Studies how well open models translate 28 languages.