Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation

Published: March 6, 2025 | arXiv ID: 2503.05010v1

By: Bryan Li, Jiaming Luo, Eleftheria Briakou and more

BigTech Affiliations: Google

Potential Business Impact:

Helps computers translate tricky medical and legal words.

Business Areas:

Semantic Search, Internet Services

While large language models (LLMs) have been increasingly adopted for machine translation (MT), their performance for specialist domains such as medicine and law remains an open challenge. Prior work has shown that LLMs can be domain-adapted at test time by retrieving targeted few-shot demonstrations or terminologies for inclusion in the prompt. Meanwhile, for general-purpose LLM MT, recent studies have found some success in generating similarly useful domain knowledge from an LLM itself, prior to translation. Our work studies domain-adapted MT with LLMs through a careful prompting setup, finding that demonstrations consistently outperform terminology, and retrieval consistently outperforms generation. We find that generating demonstrations with weaker models can close the gap with a larger model's zero-shot performance. Given the effectiveness of demonstrations, we perform detailed analyses to understand their value. We find that domain-specificity is particularly important, and that the popular multi-domain benchmark is testing adaptation to a particular writing style more so than to a specific domain.
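
To make the retrieval setting in the abstract concrete, the following is a minimal sketch of retrieving few-shot demonstrations and placing them in a translation prompt. Everything in it is an assumption for illustration and is not taken from the paper: the toy `MEMORY` sentence pairs, the English-to-German direction, the bag-of-words cosine similarity, the prompt template, and the helper names `tokenize`, `cosine`, `retrieve`, and `build_prompt`. The paper's actual retriever and prompt format may differ.

```python
import math
from collections import Counter

# Hypothetical in-domain translation memory of (source, target) pairs.
# These examples are made up for illustration; they are not the paper's data.
MEMORY = [
    ("The patient was given 5 mg of morphine.", "Der Patient erhielt 5 mg Morphin."),
    ("The court dismissed the appeal.", "Das Gericht wies die Berufung zurück."),
    ("The patient reported chest pain.", "Der Patient berichtete über Brustschmerzen."),
]


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (t.strip(".,;:!?").lower() for t in text.split())
    return [t for t in tokens if t]


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def retrieve(source, memory, k=2):
    """Return the k memory pairs whose source side is most similar to `source`."""
    query = tokenize(source)
    ranked = sorted(
        memory, key=lambda pair: cosine(query, tokenize(pair[0])), reverse=True
    )
    return ranked[:k]


def build_prompt(source, demos, src_lang="English", tgt_lang="German"):
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    blocks = [f"Translate from {src_lang} to {tgt_lang}."]
    for src, tgt in demos:
        blocks.append(f"{src_lang}: {src}\n{tgt_lang}: {tgt}")
    blocks.append(f"{src_lang}: {source}\n{tgt_lang}:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    sentence = "The patient was given aspirin."
    print(build_prompt(sentence, retrieve(sentence, MEMORY, k=2)))
```

In this sketch the medical query pulls in the two medical pairs and leaves out the legal one, which is the kind of domain-targeted selection the abstract refers to. The generation setting would replace `retrieve` with a call that asks an LLM to produce the demonstrations itself before translating.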

Exploring How LLMs Capture and Represent Domain-Specific Knowledge

Machine Learning (CS)

Helps computers pick the best AI for each job.

23 Apr 2025 1

90%

Assessing the Capability of Large Language Models for Domain-Specific Ontology Generation

Artificial Intelligence

Builds smart knowledge maps for any topic.

24 Apr 2025 3

90%

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation

Computation and Language

Helps computers translate rare languages better.

2 Apr 2025 0

Country of Origin

🇺🇸 United States

Page Count

15 pages
