Hybrid-supervised Hypergraph-enhanced Transformer for Micro-gesture Based Emotion Recognition
By: Zhaoqiang Xia, Hexiang Huang, Haoyu Chen and more
Potential Business Impact:
Reads your hidden feelings from tiny movements.
Micro-gestures are unconsciously performed body gestures that can convey human emotional states, and they are attracting growing research attention as an emerging topic in human behavior understanding and affective computing. However, modeling human emotion from micro-gestures has not been explored sufficiently. In this work, we propose to recognize emotional states from micro-gestures by reconstructing behavior patterns with a hypergraph-enhanced Transformer in a hybrid-supervised framework. Within the framework, a hypergraph Transformer based encoder and decoder are designed separately, each by stacking hypergraph-enhanced self-attention and multiscale temporal convolution modules. In particular, to better capture the subtle motion of micro-gestures, we construct the decoder with additional upsampling operations and train it on a reconstruction task in a self-supervised manner. We further propose a hypergraph-enhanced self-attention module in which the hyperedges between skeleton joints are gradually updated to represent the relationships among body joints, so that subtle local motion can be modeled. Lastly, to exploit the relationship between emotional states and the local motion of micro-gestures, a shallow emotion recognition head is attached to the encoder output and learned in a supervised way. The end-to-end framework is trained jointly in a single stage, using both the self-reconstruction signal and the supervision signal. The proposed method is evaluated on two publicly available datasets, iMiGUE and SMG, and achieves the best performance under multiple metrics, outperforming existing methods.
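The one-stage joint training described above can be illustrated with a minimal sketch of a combined objective. This is not the authors' loss: the abstract does not state which loss terms are used or how they are balanced. The sketch assumes a mean squared error for the reconstruction branch, a cross-entropy for the emotion head, and a hypothetical weight `recon_weight` between them; all function names are illustrative.

```python
import math


def mse(pred, target):
    """Mean squared error between two equally sized flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(logits, label):
    """Cross-entropy of one sample, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def hybrid_loss(recon, original, logits, label, recon_weight=1.0):
    """Supervised term plus a weighted self-supervised reconstruction term.

    recon_weight is an assumed hyperparameter, not taken from the paper.
    """
    return cross_entropy(logits, label) + recon_weight * mse(recon, original)
```

In this sketch, `recon` would stand for the decoder output, `original` for the flattened input skeleton coordinates, and `logits` for the output of the emotion recognition head. Because both terms are summed into one scalar, a single backward pass would update the shared encoder from both signals, which is one plausible reading of "jointly trained in a one-stage way".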
Similar Papers
Towards Fine-Grained Emotion Understanding via Skeleton-Based Micro-Gesture Recognition
CV and Pattern Recognition
Reads tiny hand movements to guess hidden feelings.
MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion
CV and Pattern Recognition
Recognizes tiny hand movements from many video types.
Multi-Track Multimodal Learning on iMiGUE: Micro-Gesture and Emotion Recognition
CV and Pattern Recognition
Lets computers understand your feelings and tiny movements.