Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
By: Wuwei Huang, Renren Jin, Wen Zhang and more
Potential Business Impact:
Translates speech into several languages at once, while the speaker is still talking.
Recent studies on end-to-end speech translation (ST) have facilitated the exploration of multilingual end-to-end ST and end-to-end simultaneous ST. In this paper, we investigate end-to-end simultaneous speech translation in a one-to-many multilingual setting, which is closer to real-world applications. We explore a separate decoder architecture and a unified architecture for joint synchronous training in this scenario. To further explore knowledge transfer across languages, we propose an asynchronous training strategy on the proposed unified decoder architecture. We curate a multi-way aligned multilingual end-to-end ST dataset as a benchmark testbed to evaluate our methods. Experimental results demonstrate the effectiveness of our models on the collected dataset. Our code and data are available at: https://github.com/XiaoMi/TED-MMST.
Similar Papers
MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation
Computation and Language
Translates doctor talk between many languages.
End-to-end Automatic Speech Recognition and Speech Translation: Integration of Speech Foundational Models and LLMs
Computation and Language
Lets computers translate spoken words to another language.
Efficient and Adaptive Simultaneous Speech Translation with Fully Unidirectional Architecture
Computation and Language
Translates talking instantly, faster and smarter.