Multimodal Federated Learning: A Survey through the Lens of Different FL Paradigms
By: Yuanzhe Peng, Jieming Bian, Lei Wang and more
Potential Business Impact:
Helps AI learn from different kinds of data safely.
Multimodal Federated Learning (MFL) lies at the intersection of two pivotal research areas: leveraging complementary information from multiple modalities to improve downstream inference performance and enabling distributed training to enhance efficiency and preserve privacy. Despite the growing interest in MFL, there is currently no comprehensive taxonomy that organizes MFL through the lens of different Federated Learning (FL) paradigms. This perspective is important because multimodal data introduces distinct challenges across various FL settings. These challenges, including modality heterogeneity, privacy heterogeneity, and communication inefficiency, are fundamentally different from those encountered in traditional unimodal or non-FL scenarios. In this paper, we systematically examine MFL within the context of three major FL paradigms: horizontal FL (HFL), vertical FL (VFL), and hybrid FL. For each paradigm, we present the problem formulation, review representative training algorithms, and highlight the most prominent challenge introduced by multimodal data in distributed settings. We also discuss open challenges and provide insights for future research. By establishing this taxonomy, we aim to uncover the novel challenges posed by multimodal data from the perspective of different FL paradigms and to offer a new lens through which to understand and advance the development of MFL.
Similar Papers
BlendFL: Blended Federated Learning for Handling Multimodal Data Heterogeneity
Machine Learning (CS)
Helps computers learn from mixed data without sharing.
Modular Federated Learning: A Meta-Framework Perspective
Machine Learning (CS)
Lets computers learn together without sharing private data.
Adaptive Prototype Knowledge Transfer for Federated Learning with Mixed Modalities and Heterogeneous Tasks
Machine Learning (CS)
Helps computers learn from different data types privately.