PoseGRAF: Geometric-Reinforced Adaptive Fusion for Monocular 3D Human Pose Estimation
By: Ming Xu, Xu Zhang
Potential Business Impact:
Turns ordinary single-camera video into realistic 3D body motion for animation, sports analysis, and motion capture.
Existing monocular 3D pose estimation methods rely primarily on joint positional features while overlooking the intrinsic directional and angular correlations within the skeleton. As a result, they often produce implausible poses under joint occlusion or rapid motion changes. To address these challenges, we propose the PoseGRAF framework. We first construct a dual graph convolutional structure that processes joint and bone graphs separately, effectively capturing their local dependencies. A Cross-Attention module is then introduced to model the interdependencies between bone directions and joint features. Building on this, a dynamic fusion module adaptively integrates both feature types by leveraging the relational dependencies between joints and bones. An improved Transformer encoder is further incorporated in a residual manner to generate the final output. Experimental results on the Human3.6M and MPI-INF-3DHP datasets show that our method outperforms state-of-the-art approaches. Additional evaluations on in-the-wild videos further validate its generalizability. The code is publicly available at https://github.com/iCityLab/PoseGRAF.
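To make the described pipeline concrete, here is a minimal PyTorch sketch of a PoseGRAF-style forward pass: dual graph convolutions over joint and bone graphs, cross-attention from joint features to bone-direction features, a gated "dynamic fusion" step, and a Transformer encoder applied residually. Everything below (the `GraphConv` and `PoseGRAFSketch` classes, the sigmoid gate, layer sizes, and identity-placeholder adjacencies) is an illustrative assumption rather than the authors' implementation; the real code is at the GitHub link above.

```python
import torch
import torch.nn as nn


class GraphConv(nn.Module):
    """One graph convolution over a fixed, pre-normalized adjacency matrix."""

    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)        # (N, N)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                       # x: (B, N, in_dim)
        return torch.relu(self.linear(self.adj @ x))


class PoseGRAFSketch(nn.Module):
    """Hypothetical PoseGRAF-style model: dual GCNs -> cross-attention
    -> gated fusion -> residual Transformer encoder -> 3D joints."""

    def __init__(self, n_joints=17, n_bones=16, dim=64,
                 joint_adj=None, bone_adj=None):
        super().__init__()
        # Identity adjacencies are placeholders; a real model would use
        # the (normalized) skeleton connectivity here.
        joint_adj = joint_adj if joint_adj is not None else torch.eye(n_joints)
        bone_adj = bone_adj if bone_adj is not None else torch.eye(n_bones)
        self.joint_gcn = GraphConv(2, dim, joint_adj)  # 2D joint coordinates
        self.bone_gcn = GraphConv(3, dim, bone_adj)    # 3D bone directions
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4,
                                                batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 3)                  # per-joint 3D output

    def forward(self, joints_2d, bone_dirs):
        # joints_2d: (B, n_joints, 2); bone_dirs: (B, n_bones, 3)
        j = self.joint_gcn(joints_2d)                  # joint features
        b = self.bone_gcn(bone_dirs)                   # bone features
        # Joint features query the bone-direction features.
        attended, _ = self.cross_attn(query=j, key=b, value=b)
        # Dynamic fusion: an input-dependent sigmoid gate blends the
        # joint features with the bone-attended features per joint.
        g = self.gate(torch.cat([j, attended], dim=-1))
        fused = g * j + (1 - g) * attended
        # Transformer encoder applied in a residual manner.
        return self.head(fused + self.encoder(fused))


model = PoseGRAFSketch()
pose_3d = model(torch.randn(2, 17, 2), torch.randn(2, 16, 3))
print(pose_3d.shape)  # torch.Size([2, 17, 3])
```

The sigmoid gate is one simple way to realize "adaptive fusion": the blend weight between joint and bone features is computed per joint from the features themselves rather than being fixed, so the model can lean on bone-direction cues when joint evidence is weak (e.g., under occlusion).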
Similar Papers
HGFreNet: Hop-hybrid GraphFomer for 3D Human Pose Estimation with Trajectory Consistency in Frequency Domain
CV and Pattern Recognition
Recovers people's real 3D movements from ordinary 2D video.
3D Human Pose Estimation via Spatial Graph Order Attention and Temporal Body Aware Transformer
CV and Pattern Recognition
Helps computers track body poses more accurately across video frames.
Multi-Grained Feature Pruning for Video-Based Human Pose Estimation
CV and Pattern Recognition
Makes computer movement tracking faster and more accurate.