MHARFedLLM: Multimodal Human Activity Recognition Using Federated Large Language Model
By: Asmit Bandyopadhyay, Rohit Basu, Tanmay Sen, and more
Potential Business Impact:
Helps computers understand what people are doing.
Human Activity Recognition (HAR) plays a vital role in applications such as fitness tracking, smart homes, and healthcare monitoring. Traditional HAR systems often rely on a single modality, such as motion sensors or cameras, limiting robustness and accuracy in real-world environments. This work presents FedTime-MAGNET, a novel multimodal federated learning framework that advances HAR by combining heterogeneous data sources: depth cameras, pressure mats, and accelerometers. At its core is the Multimodal Adaptive Graph Neural Expert Transformer (MAGNET), a fusion architecture that uses graph attention and a Mixture of Experts to generate unified, discriminative embeddings across modalities. To capture complex temporal dependencies, a lightweight encoder-only T5 architecture is customized and adapted within this framework. Extensive experiments show that FedTime-MAGNET significantly improves HAR performance, achieving a centralized F1 score of 0.934 and a strong federated F1 score of 0.881. These results demonstrate the effectiveness of combining multimodal fusion, time-series LLMs, and federated learning for building accurate and robust HAR systems.
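The federated side of a framework like this typically rests on weighted averaging of client model updates (FedAvg). The abstract does not specify the aggregation rule, so the following is a minimal illustrative sketch under that assumption; all names (`federated_average`, the client lists) are hypothetical, not from the paper.

```python
# Minimal sketch of federated averaging (FedAvg-style aggregation),
# assumed here as the server-side step; names are illustrative only.

def federated_average(client_weights, client_sizes):
    """Average client model parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its data volume.
            global_weights[i] += w * (size / total)
    return global_weights

# Example: three clients, e.g. nodes holding depth-camera, pressure-mat,
# and accelerometer data, each with a 2-parameter toy model.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 20, 70]
print(federated_average(clients, sizes))  # [4.2, 5.2]
```

In practice the same weighted average is applied per tensor of the shared model (here, the fusion backbone and temporal encoder) after each local training round.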
Similar Papers
GraMFedDHAR: Graph Based Multimodal Differentially Private Federated HAR
Machine Learning (CS)
Helps computers understand actions from many sensors.
Multi-Frequency Federated Learning for Human Activity Recognition Using Head-Worn Sensors
Machine Learning (CS)
Lets earbuds learn health from you privately.
A Novel Deep Hybrid Framework with Ensemble-Based Feature Optimization for Robust Real-Time Human Activity Recognition
CV and Pattern Recognition
Helps computers understand what people are doing.