Fish2Mesh Transformer: 3D Human Mesh Recovery from Egocentric Vision
By: David C. Jeong, Aditya Puranik, James Vong, and more
Potential Business Impact:
Helps a head-worn camera figure out your body's pose and shape.
Egocentric human body estimation infers a user's body pose and shape from a wearable camera's first-person perspective. Although research has used pose estimation techniques to overcome the self-occlusions and image distortions caused by head-mounted fisheye cameras, comparable advances in 3D human mesh recovery (HMR) have been limited. We introduce Fish2Mesh, a fisheye-aware transformer-based model designed for 3D egocentric human mesh recovery. We propose an egocentric position embedding block that generates an ego-specific position table for the Swin Transformer to reduce fisheye image distortion. Our model uses multi-task heads for SMPL parametric regression and camera translation, estimating 3D and 2D joints as auxiliary losses to support model training. To address the scarcity of egocentric camera data, we create a training dataset by employing the pre-trained 4D-Human model and third-person cameras for weak supervision. Our experiments demonstrate that Fish2Mesh outperforms previous state-of-the-art 3D HMR models.
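The abstract does not spell out how the ego-specific position table is built, but since fisheye distortion is radially symmetric about the image center, one plausible sketch is a position embedding keyed to each patch's normalized radius rather than its (x, y) grid index. The function below is purely illustrative: `ego_position_table`, its sinusoidal radial encoding, and all parameter names are assumptions, not the paper's actual method.

```python
import numpy as np

def ego_position_table(h, w, dim):
    # Hypothetical sketch: embed each patch token by its radial distance
    # from the fisheye image center, since fisheye distortion is radial.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    r = r / r.max()  # normalize radius to [0, 1]
    # Sinusoidal encoding of the radius, one frequency per channel pair.
    freqs = np.arange(dim // 2)
    angles = r[..., None] * np.pi * (2.0 ** freqs)
    table = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    # Shape (h, w, dim); would be added to patch tokens before attention.
    return table

table = ego_position_table(8, 8, 4)
print(table.shape)  # (8, 8, 4)
```

A table like this could be added to the patch embeddings of each Swin Transformer window, giving the model an explicit signal for how far each token sits from the (least distorted) center of the fisheye view.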
Similar Papers
On the Use of Hierarchical Vision Foundation Models for Low-Cost Human Mesh Recovery and Pose Estimation
CV and Pattern Recognition
Makes computer models of people smaller, faster.
Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation
CV and Pattern Recognition
Lets virtual characters copy your full body movements.
MetricHMR: Metric Human Mesh Recovery from Monocular Images
CV and Pattern Recognition
Makes 3D body models from one picture.