Score: 2

LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline

Published: April 13, 2025 | arXiv ID: 2504.09570v2

By: Biao Fu, Minpeng Liao, Kai Fan, and more

BigTech Affiliations: Alibaba

Potential Business Impact:

Lets computers translate speech and streaming text as it arrives, instead of waiting for the full sentence.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

When the complete source sentence is available, Large Language Models (LLMs) perform excellently in offline machine translation, even with a simple prompt such as "Translate the following sentence from [src lang] into [tgt lang]:". However, in many real-world scenarios the source tokens arrive in a streaming fashion and simultaneous machine translation (SiMT) is required; in this setting, the efficiency and performance of decoder-only LLMs are significantly limited by their auto-regressive nature. To enable LLMs to achieve high-quality SiMT as efficiently as offline translation, we propose a novel paradigm that includes constructing supervised fine-tuning (SFT) data for SiMT, along with new training and inference strategies. To replicate the token input/output stream of SiMT, the source and target tokens are rearranged into an interleaved sequence, separated by special tokens according to varying latency requirements. This enables powerful LLMs to learn read and write operations adaptively, based on varying latency prompts, while still maintaining efficient auto-regressive decoding. Experimental results show that, even with limited SFT data, our approach achieves state-of-the-art performance across various SiMT benchmarks and preserves the original offline translation abilities. Moreover, our approach generalizes well to the document-level SiMT setting without requiring specific fine-tuning, even surpassing the offline translation model.
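The sketch below illustrates the kind of interleaved SFT sequence the abstract describes: source and target tokens rearranged into one stream and separated by special tokens that mark read and write operations. The specific special tokens (<READ>, <WRITE>) and the wait-k interleaving schedule here are illustrative assumptions, not necessarily the paper's exact choices.

```python
# Minimal sketch of building an interleaved SiMT SFT sequence.
# Assumptions: the special tokens <READ>/<WRITE> and the wait-k
# interleaving policy are hypothetical stand-ins for the paper's
# actual data-construction scheme.

READ, WRITE = "<READ>", "<WRITE>"

def interleave_wait_k(src_tokens, tgt_tokens, k=3):
    """Interleave source and target tokens with a wait-k schedule:
    first read k source tokens, then alternate one write per read."""
    seq = []
    read, written = 0, 0
    while written < len(tgt_tokens):
        # Read until the model is k source tokens ahead of the target,
        # or until the source stream is exhausted.
        while read < len(src_tokens) and read < written + k:
            seq.append(READ)
            seq.append(src_tokens[read])
            read += 1
        # Write one target token.
        seq.append(WRITE)
        seq.append(tgt_tokens[written])
        written += 1
    return seq

if __name__ == "__main__":
    src = "je suis étudiant".split()
    tgt = "I am a student".split()
    print(" ".join(interleave_wait_k(src, tgt, k=2)))
    # <READ> je <READ> suis <WRITE> I <READ> étudiant <WRITE> am <WRITE> a <WRITE> student
```

In this framing, a model fine-tuned on such sequences can, at inference time, emit a read token to request the next streamed source token and a write token to commit a target token, so the whole process stays within ordinary auto-regressive decoding; varying the latency prompt (here the wait-k value) would shift the read/write balance.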

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
24 pages

Category
Computer Science:
Computation and Language