TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment
By: Zheng Li, Mao Zheng, Mingyang Song, and more
Potential Business Impact:
Teaches computers to translate special terms exactly right.
Recently, deep reasoning large language models (LLMs) like DeepSeek-R1 have made significant progress in tasks such as mathematics and coding. Inspired by this, several studies have employed reinforcement learning (RL) to enhance models' deep reasoning capabilities and improve machine translation (MT) quality. However, terminology translation, an essential task in MT, remains unexplored in deep reasoning LLMs. In this paper, we propose TAT-R1, a terminology-aware translation model trained with reinforcement learning and word alignment. Specifically, we first extract the keyword translation pairs using a word alignment model. Then we carefully design three types of rule-based alignment rewards with the extracted alignment relationships. With these alignment rewards, the RL-trained translation model can learn to focus on the accurate translation of key information, including terminology in the source text. Experimental results show the effectiveness of TAT-R1. Our model significantly improves terminology translation accuracy compared to the baseline models while maintaining comparable performance on general translation tasks. In addition, we conduct detailed ablation studies of the DeepSeek-R1-like training paradigm for machine translation and reveal several key findings.
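The abstract does not spell out how an alignment reward is computed, so the following is only a minimal sketch of the general idea: given keyword alignment pairs (source term mapped to its reference target term) already extracted by a word alignment model, score a candidate translation by how many of the aligned target terms it preserves. The function name, the matching rule, and the example pairs are illustrative assumptions, not the paper's published implementation.

```python
# Sketch of a rule-based alignment reward. Assumes keyword alignment
# pairs (source term -> reference target term) were already extracted
# by a word alignment model, as described in the abstract. The exact
# reward formula here is a hypothetical stand-in.

def alignment_reward(hypothesis: str, alignment_pairs: list[tuple[str, str]]) -> float:
    """Fraction of aligned target terms that appear in the model's translation."""
    if not alignment_pairs:
        return 0.0
    hyp = hypothesis.lower()
    hits = sum(1 for _src, tgt in alignment_pairs if tgt.lower() in hyp)
    return hits / len(alignment_pairs)

# Example: reward translations that preserve the aligned terminology.
pairs = [("神经网络", "neural network"), ("对齐", "alignment")]
print(alignment_reward("A neural network learns word alignment.", pairs))  # 1.0
print(alignment_reward("A model learns word mapping.", pairs))             # 0.0
```

In an RL setup like the one the paper describes, a term-coverage signal of this kind would be combined with the usual format and translation-quality rewards so the policy is pushed toward translations that keep key terms intact.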
Similar Papers
R1-T1: Fully Incentivizing Translation Capability in LLMs via Reasoning Learning
Computation and Language
Makes computer translation think like a human.
NeoAMT: Neologism-Aware Agentic Machine Translation with Reinforcement Learning
Computation and Language
Translates new words into other languages.
Audited Reasoning Refinement: Fine-Tuning Language Models via LLM-Guided Step-Wise Evaluation and Correction
Computation and Language
Teaches computers to reason better with less data.