Monocular 3D Hand Pose Estimation with Implicit Camera Alignment
By: Christos Pantazopoulos, Spyridon Thermos, Gerasimos Potamianos
Potential Business Impact:
Lets computers estimate 3D hand poses from single photos.
Estimating the 3D hand articulation from a single color image is an important problem with applications in Augmented Reality (AR), Virtual Reality (VR), Human-Computer Interaction (HCI), and robotics. Apart from the absence of depth information, occlusions, articulation complexity, and the need for knowledge of the camera parameters pose additional challenges. In this work, we propose an optimization pipeline for estimating the 3D hand articulation from 2D keypoint input, which includes a keypoint alignment step and a fingertip loss to overcome the need to know or estimate the camera parameters. We evaluate our approach on the EgoDexter and Dexter+Object benchmarks to showcase that it performs competitively with the state of the art, while also demonstrating its robustness when processing "in-the-wild" images without any prior camera knowledge. Our quantitative analysis highlights the pipeline's sensitivity to the accuracy of the 2D keypoint estimation, despite the use of hand priors. Code is available at the project page https://cpantazop.github.io/HandRepo/
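The abstract's central idea, fitting 3D hand joints to 2D keypoints without known camera intrinsics, can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the paper's implementation: it uses an orthographic projection and a closed-form least-squares scale-plus-translation alignment as a stand-in for the paper's keypoint alignment step, so the reprojection error is computed without any camera parameters.

```python
import numpy as np

def align_2d(pred_2d, obs_2d):
    """Least-squares scale + translation aligning predicted 2D keypoints
    to observed ones (hypothetical stand-in for the paper's keypoint
    alignment step; removes the need for camera intrinsics)."""
    mu_p = pred_2d.mean(axis=0)
    mu_o = obs_2d.mean(axis=0)
    p = pred_2d - mu_p          # center predictions
    o = obs_2d - mu_o           # center observations
    s = (p * o).sum() / (p * p).sum()  # closed-form optimal scale
    return s * p + mu_o         # aligned prediction in observation frame

def reprojection_loss(joints_3d, obs_2d):
    """Orthographic projection (drop depth), align, then mean L2 error.
    In an optimization pipeline this would be minimized over the hand
    model's pose parameters."""
    proj = joints_3d[:, :2]
    aligned = align_2d(proj, obs_2d)
    return float(np.mean(np.linalg.norm(aligned - obs_2d, axis=1)))
```

Because the alignment absorbs any global 2D scale and offset, the loss is invariant to focal length and principal point under this orthographic assumption; a full pipeline would add articulation priors and the paper's fingertip loss on top.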
Similar Papers
The Invisible EgoHand: 3D Hand Forecasting through EgoBody Pose Estimation
CV and Pattern Recognition
Predicts where hands will move, even when hidden.
HOSt3R: Keypoint-free Hand-Object 3D Reconstruction from RGB images
CV and Pattern Recognition
Makes robots understand and grab any object.