RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association
By: Abdul Hannan, Furqan Malik, Hina Jabbar and more
Potential Business Impact:
Helps computers match faces to voices in different languages.
The Face-Voice Association in Multilingual Environments (FAME) 2026 challenge investigates the face-voice association task in multilingual scenarios. The challenge introduces English-German face-voice pairs for use in the evaluation phase. To this end, we revisit fusion and orthogonal projection for face-voice association, focusing on the relevant semantic information within the two modalities. Our method performs favorably on the English-German data split and ranked 3rd in the FAME 2026 challenge, achieving an equal error rate (EER) of 33.1.
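For readers unfamiliar with the reported metric, the following is a minimal sketch of how an equal error rate can be computed from pairwise similarity scores. It is not the challenge's official scoring script: the function name `equal_error_rate`, the toy scores, and the convention that a higher score means a more likely face-voice match are all assumptions made for illustration. The function returns a fraction, whereas the 33.1 reported above is presumably a percentage.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER for similarity scores.

    scores: similarity of each face-voice pair (higher = more likely a match; assumed convention).
    labels: 1 for a genuine (same-identity) pair, 0 for an impostor pair.
    Returns the EER as a fraction in [0, 1].
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one genuine and one impostor pair")

    best_gap, eer = None, None
    # Sweep every observed score as a decision threshold (accept if score >= t).
    for t in sorted(set(scores)):
        far = sum(s >= t for s in neg) / len(neg)  # false acceptance rate
        frr = sum(s < t for s in pos) / len(pos)   # false rejection rate
        gap = abs(far - frr)
        # The EER is taken where the two error rates are closest to equal.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical toy data: four genuine pairs followed by four impostor pairs.
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(f"EER = {100 * equal_error_rate(toy_scores, toy_labels):.1f}%")  # prints "EER = 25.0%"
```

A lower EER is better: 0 means genuine and impostor pairs are perfectly separated by some threshold, while 0.5 corresponds to chance-level behavior on balanced data.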
Similar Papers
Linking Faces and Voices Across Languages: Insights from the FAME 2026 Challenge
CV and Pattern Recognition
Lets computers match faces to voices in any language.
Towards Language-Independent Face-Voice Association with Multimodal Foundation Models
Audio and Speech Processing
Lets computers recognize voices in new languages.
Shared Multi-modal Embedding Space for Face-Voice Association
Sound
Matches voices to faces, even in new languages.