Score: 2

Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement

Published: April 8, 2025 | arXiv ID: 2504.05614v1

By: Yichen Dong, Xinglin Lyu, Junhui Li, and more

BigTech Affiliations: Huawei

Potential Business Impact:

Makes translated documents sound more natural.

Business Areas:
Translation Services; Professional Services

Recent research has shown that large language models (LLMs) can enhance translation quality through self-refinement. In this paper, we build on this idea by extending the refinement from sentence-level to document-level translation, specifically focusing on document-to-document (Doc2Doc) translation refinement. Since sentence-to-sentence (Sent2Sent) and Doc2Doc translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement using two intermediate translations, combining the strengths of both Sent2Sent and Doc2Doc. Additionally, recognizing that the quality of intermediate translations varies, we introduce an enhanced fine-tuning method with quality awareness that assigns lower weights to easier translations and higher weights to more difficult ones, enabling the model to focus on challenging translation cases. Experimental results across ten translation tasks with LLaMA-3-8B-Instruct and Mistral-Nemo-Instruct demonstrate the effectiveness of our approach.
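The quality-aware fine-tuning idea above can be sketched minimally: given a quality score for each intermediate translation (e.g., from an automatic metric), assign larger loss weights to harder (lower-scoring) examples. The paper's exact weighting scheme is not given here, so the softmax-over-negated-scores mapping below is only one plausible instantiation, and the `temperature` parameter is an assumption.

```python
import math

def quality_aware_weights(quality_scores, temperature=1.0):
    """Map per-example quality scores (higher = easier translation) to
    normalized loss weights (higher = harder), so training focuses on
    challenging cases. Uses a softmax over negated scores; this is an
    illustrative choice, not the paper's published formula."""
    logits = [-s / temperature for s in quality_scores]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def weighted_loss(losses, quality_scores):
    """Combine per-example losses using quality-aware weights."""
    weights = quality_aware_weights(quality_scores)
    return sum(w * l for w, l in zip(weights, losses))
```

In a real fine-tuning loop the per-example losses would come from the LLM's token-level cross-entropy over both the Sent2Sent and Doc2Doc intermediate translations; here they are plain numbers for illustration.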

Country of Origin
🇨🇳 China


Page Count
17 pages

Category
Computer Science:
Computation and Language