Improving LLMs for Machine Translation Using Synthetic Preference Data
By: Dario Vajda, Domen Vreš, Marko Robnik-Šikonja
Potential Business Impact:
Makes machine translation into less-resourced languages like Slovene more accurate, using cheaply generated training data instead of costly human-labeled examples.
Large language models have emerged as effective machine translation systems. In this paper, we explore how a general instruction-tuned large language model can be improved for machine translation using relatively few, easily produced data resources. Using Slovene as a use case, we improve the GaMS-9B-Instruct model through Direct Preference Optimization (DPO) training on a programmatically curated and enhanced subset of a public dataset. Since DPO requires pairs of quality-ranked instances, we generated its training dataset by translating English Wikipedia articles with two LLMs, GaMS-9B-Instruct and EuroLLM-9B-Instruct, and ranked the resulting translations using heuristics coupled with automatic evaluation metrics such as COMET. The evaluation shows that our fine-tuned model outperforms both models involved in the dataset generation: on translating Wikipedia articles, it achieved COMET score gains of around 0.04 and 0.02 over the GaMS-9B-Instruct and EuroLLM-9B-Instruct baselines, respectively. It also avoids language and formatting errors more consistently.
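The sketch below illustrates the core of this preference-data construction: two candidate translations per source text are scored with COMET, and the higher-scoring one becomes the "chosen" response in a DPO triple. This is a minimal sketch under stated assumptions, not the authors' code: the Unbabel comet package, the reference-free wmt22-cometkiwi-da checkpoint (the abstract only says "COMET"), the prompt template, and the min_margin tie-breaking threshold are all assumptions, and the paper's additional heuristics are omitted.

```python
# Minimal sketch of the preference-pair construction described above, not the
# authors' code. Assumptions: the Unbabel `comet` package, the reference-free
# wmt22-cometkiwi-da checkpoint, the prompt template, and the `min_margin`
# threshold. The paper's extra heuristics (language and formatting checks)
# are omitted.
from comet import download_model, load_from_checkpoint


def build_preference_pairs(sources, cand_a, cand_b, min_margin=0.01):
    """Score two candidate translations per source with COMET-QE and emit
    (prompt, chosen, rejected) triples in the format DPO trainers expect."""
    model = load_from_checkpoint(download_model("Unbabel/wmt22-cometkiwi-da"))

    # Reference-free scoring: each sample is a (source, machine translation) pair.
    scores_a = model.predict(
        [{"src": s, "mt": t} for s, t in zip(sources, cand_a)], batch_size=8
    ).scores
    scores_b = model.predict(
        [{"src": s, "mt": t} for s, t in zip(sources, cand_b)], batch_size=8
    ).scores

    pairs = []
    for src, ta, tb, sa, sb in zip(sources, cand_a, cand_b, scores_a, scores_b):
        if abs(sa - sb) < min_margin:
            continue  # near-tie: too weak a preference signal, skip the pair
        chosen, rejected = (ta, tb) if sa > sb else (tb, ta)
        pairs.append(
            {
                # Hypothetical prompt template; the paper's template is not given.
                "prompt": f"Translate the following English text into Slovene:\n{src}",
                "chosen": chosen,
                "rejected": rejected,
            }
        )
    return pairs
```

The resulting list of triples can be wrapped with datasets.Dataset.from_list and fed to a standard DPO implementation such as TRL's DPOTrainer.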
Similar Papers
Multimodal Large Language Models with Adaptive Preference Optimization for Sequential Recommendation
Information Retrieval
Helps computers better predict what you'll like.
When Data is the Algorithm: A Systematic Study and Curation of Preference Optimization Datasets
Computation and Language
Helps AI better understand what you like.
DPO-Tuned Large Language Models for Segmentation in Simultaneous Speech Translation
Computation and Language
Makes real-time translation sound more natural.