Evaluating the Impact of Verbal Multiword Expressions on Machine Translation
By: Linfeng Liu, Saptarshi Ghosh, Tianyu Jiang
Potential Business Impact:
Improves computer translations of idioms and other tricky verb phrases.
Verbal multiword expressions (VMWEs) present significant challenges for natural language processing because they are complex and often non-compositional. Although machine translation models have improved substantially with the advent of language models in recent years, accurately translating these structures remains an open problem. In this study, we analyze the impact of three VMWE categories (verbal idioms, verb-particle constructions, and light verb constructions) on machine translation quality from English to multiple languages. Using both established multiword expression datasets and sentences containing these phenomena extracted from machine translation datasets, we evaluate how state-of-the-art translation systems handle these expressions. Our experimental results consistently show that VMWEs negatively affect translation quality. We also propose an LLM-based paraphrasing approach that replaces these expressions with their literal counterparts, demonstrating significant improvement in translation quality for verbal idioms and verb-particle constructions.
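The paraphrase-then-translate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the lookup table `TOY_LITERAL_FORMS` is a hypothetical stand-in for the LLM paraphraser, `fake_translate` is a placeholder for a real translation system, and all function names are assumptions made for illustration only.

```python
from typing import Callable

# Hypothetical stand-in for the LLM paraphraser: a tiny lookup table that maps
# a VMWE surface form to a literal counterpart. A real system would prompt an
# LLM instead of using a fixed table.
TOY_LITERAL_FORMS = {
    "kicked the bucket": "died",  # verbal idiom
    "gave up": "quit",            # verb-particle construction
}


def toy_paraphrase(sentence: str) -> str:
    """Replace each known VMWE in the sentence with its literal form."""
    for vmwe, literal in TOY_LITERAL_FORMS.items():
        sentence = sentence.replace(vmwe, literal)
    return sentence


def fake_translate(sentence: str) -> str:
    """Placeholder translator (assumption): tags the input instead of translating."""
    return f"<translated> {sentence}"


def translate_with_paraphrase(
    sentence: str,
    paraphrase: Callable[[str], str],
    translate: Callable[[str], str],
) -> str:
    """Paraphrase first, then translate the literal version of the sentence."""
    return translate(paraphrase(sentence))


if __name__ == "__main__":
    source = "He gave up smoking before he kicked the bucket."
    print(translate_with_paraphrase(source, toy_paraphrase, fake_translate))
```

Passing the paraphraser and translator as callables keeps the sketch independent of any particular LLM or translation API, so either component could be swapped for a real system without changing the pipeline function.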
Similar Papers
An Empirical Study on Chinese Character Decomposition in Multiword Expression-Aware Neural Machine Translation
Computation and Language
Helps computers understand Chinese word meanings better.
Dancing with Deer: A Constructional Perspective on MWEs in the Era of LLMs
Computation and Language
Helps computers learn new phrases like people do.
Testing the Limits of Machine Translation from One Book
Computation and Language
Helps computers translate rare languages better.