Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding
By: Wafaa Mohammed, Vlad Niculae, Chrysoula Zerva
Potential Business Impact:
Helps translation systems understand whole stories, not just single sentences.
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet they still struggle to adequately handle document-level discourse phenomena, such as pronoun resolution and lexical cohesion. In this study, we thoroughly investigate the performance of LLMs on discourse phenomena in context-aware translation. We demonstrate that discourse knowledge is encoded within LLMs and propose the use of quality-aware decoding (QAD) to effectively extract this knowledge, showcasing its superiority over other decoding approaches through comprehensive analysis. Furthermore, we illustrate that QAD enhances the semantic richness of translations and aligns them more closely with human preferences.
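The abstract is terse on what quality-aware decoding involves. In this line of work, QAD typically means generating several candidate translations and using a learned quality metric to choose among them rather than trusting the model's single most likely output. Below is a minimal sketch of the N-best reranking variant, assuming a Hugging Face seq2seq model and the Unbabel COMET library; the model names and the `qad_translate` helper are illustrative choices, not the paper's exact configuration.

```python
"""Minimal sketch of quality-aware decoding (QAD) as N-best reranking."""

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from comet import download_model, load_from_checkpoint

# Illustrative model choices, not the paper's setup: a small Marian MT
# model and a reference-free COMET quality-estimation checkpoint.
MT_MODEL = "Helsinki-NLP/opus-mt-en-de"
QE_MODEL = "Unbabel/wmt22-cometkiwi-da"  # gated on the Hub; requires access

tokenizer = AutoTokenizer.from_pretrained(MT_MODEL)
mt_model = AutoModelForSeq2SeqLM.from_pretrained(MT_MODEL)
qe_model = load_from_checkpoint(download_model(QE_MODEL))


def qad_translate(source: str, num_candidates: int = 8) -> str:
    """Translate `source`, returning the candidate the QE model prefers."""
    inputs = tokenizer(source, return_tensors="pt")
    # Sample a pool of hypotheses instead of committing to the 1-best.
    outputs = mt_model.generate(
        **inputs,
        do_sample=True,
        top_p=0.9,
        num_return_sequences=num_candidates,
        max_new_tokens=256,
    )
    candidates = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    # Score each (source, hypothesis) pair with the reference-free metric.
    data = [{"src": source, "mt": mt} for mt in candidates]
    scores = qe_model.predict(data, batch_size=8, gpus=0).scores
    # Keep the hypothesis with the highest estimated quality.
    return max(zip(scores, candidates), key=lambda pair: pair[0])[1]


print(qad_translate("She left her keys on the table, and he picked them up."))
```

A minimum-Bayes-risk variant of QAD would instead score each candidate against the other candidates; the reranking form above is the simpler instance of decoding guided by a quality metric.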
Similar Papers
Multilingual Contextualization of Large Language Models for Document-Level Machine Translation
Computation and Language
Translates whole books, not just sentences.
Beyond the Sentence: A Survey on Context-Aware Machine Translation with Large Language Models
Computation and Language
Helps translation systems make use of more context.
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
Computation and Language
Computers understand how sentences connect across languages.