Investigating the Effect of Parallel Data in the Cross-Lingual Transfer for Vision-Language Encoders
By: Andrei-Alexandru Manea, Jindřich Libovický
Potential Business Impact:
Helps computers understand images in many languages.
Most pre-trained Vision-Language (VL) models and training data for the downstream tasks are only available in English. Therefore, multilingual VL tasks are typically solved using cross-lingual transfer: fine-tuning a multilingual pre-trained model or transferring the text encoder using parallel data. We study the latter approach: transferring an already trained encoder using parallel data. We investigate two properties of the parallel data that were out of focus in previous work: its domain and the number of languages involved. Our results show that while machine-translated task data are the best on average, caption-like authentic parallel data outperform them in some languages. Further, we show that most languages benefit from multilingual training.
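The transfer setup described above can be illustrated with a minimal sketch: a frozen "teacher" encoder (the trained VL model's text tower, which handles English) and a trainable "student" encoder for the target language, pulled together by a mean-squared-error loss over parallel sentence pairs. Everything below is an illustrative assumption, not the paper's actual architecture: the encoders are toy linear maps, and the parallel data are simulated feature vectors.

```python
import numpy as np

# Hedged sketch of encoder transfer via parallel data (toy, NOT the paper's
# method): a frozen teacher maps English inputs into the VL embedding space;
# a student is trained so that target-language translations land on the same
# embeddings, via an MSE objective over parallel pairs.

rng = np.random.default_rng(0)
d_in, d_out, n_pairs = 8, 4, 256

# Frozen teacher: English features -> VL embedding space (assumed linear here).
W_teacher = rng.normal(size=(d_in, d_out))

# Trainable student: target-language features -> same embedding space.
W_student = rng.normal(size=(d_in, d_out))

# Toy parallel corpus: for simplicity, each target-language sentence is
# simulated as having the same features as its English counterpart.
X_en = rng.normal(size=(n_pairs, d_in))
X_tgt = X_en.copy()

lr = 0.1
for step in range(500):
    T = X_en @ W_teacher                      # teacher embeddings (frozen)
    S = X_tgt @ W_student                     # student embeddings
    diff = S - T
    loss = (diff ** 2).mean()                 # MSE over parallel pairs
    grad = 2 * X_tgt.T @ diff / diff.size     # gradient w.r.t. W_student
    W_student -= lr * grad                    # plain gradient-descent step

print(f"final MSE: {loss:.2e}")
```

After training, the student places target-language inputs close to where the teacher places their English translations, so the frozen vision tower and any English-trained task head can be reused unchanged. The paper's questions about the parallel data's domain and language count would, in this picture, correspond to what goes into `X_en`/`X_tgt` and how many languages share one student.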
Similar Papers
Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Computation and Language
Adds more languages to computer translators.
Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data
Computation and Language
Helps computers understand many more languages.
Parallel Tokenizers: Rethinking Vocabulary Design for Cross-Lingual Transfer
Computation and Language
Helps computers understand many languages better.