Different Speech Translation Models Encode and Translate Speaker Gender Differently
By: Dennis Fucci, Marco Gaido, Matteo Negri, and more
Potential Business Impact:
Traditional speech translation models encode speaker gender, but newer adapter-based systems do not and default more often to masculine forms.
Recent studies on interpreting the hidden states of speech models have shown their ability to capture speaker-specific features, including gender. Does this finding also hold for speech translation (ST) models? If so, what are the implications for the speaker's gender assignment in translation? We address these questions from an interpretability perspective, using probing methods to assess gender encoding across diverse ST models. Results on three language directions (English-French/Italian/Spanish) indicate that while traditional encoder-decoder models capture gender information, newer architectures -- integrating a speech encoder with a machine translation system via adapters -- do not. We also demonstrate that low gender encoding capabilities result in systems' tendency toward a masculine default, a translation bias that is more pronounced in newer architectures.
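The abstract describes probing the hidden states of ST models for gender information. The sketch below illustrates the general idea of such a probe, under assumptions not taken from the paper: encoder states are precomputed per utterance, labels are binary, and a simple logistic-regression classifier stands in for whatever probe the authors actually used.

```python
# Minimal sketch of a linear gender probe on ST encoder hidden states.
# Hypothetical inputs: lists of per-utterance arrays (frames x dim) and labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def mean_pool(hidden_states):
    """Average frame-level encoder states (T x D) into one utterance vector (D,)."""
    return hidden_states.mean(axis=0)

def probe_gender(train_states, train_labels, test_states, test_labels):
    """Train a linear probe on pooled encoder states.
    Higher test accuracy suggests the representation encodes more
    speaker-gender information."""
    X_train = np.stack([mean_pool(h) for h in train_states])
    X_test = np.stack([mean_pool(h) for h in test_states])
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, train_labels)
    return accuracy_score(test_labels, clf.predict(X_test))

# Usage with hypothetical precomputed data:
# acc = probe_gender(train_states, train_labels, test_states, test_labels)
# print(f"Probe accuracy: {acc:.3f}")
```

Comparing probe accuracy across layers or across model families (encoder-decoder vs. adapter-based) is the standard way such analyses quantify how much gender information a representation retains.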
Similar Papers
Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation
Computation and Language
Studies how speech translation models infer speaker gender from acoustic cues beyond pitch alone.
Gender Encoding Patterns in Pretrained Language Model Representations
Computation and Language
Examines how gender is encoded in pretrained language model representations and what this means for bias.
Acoustic-based Gender Differentiation in Speech-aware Language Models
Computation and Language
Investigates how speech-aware language models differentiate gender from acoustics and tend to favor male speakers.