PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation
By: Lihua Liu, Jiehong Lin, Zhenxin Liu and more
Potential Business Impact:
Helps robots find and grab new things.
RGB-based novel object pose estimation is critical for rapid deployment in robotic applications, yet zero-shot generalization remains a key challenge. In this paper, we introduce PicoPose, a novel framework designed to tackle this task using a three-stage pixel-to-pixel correspondence learning process. Firstly, PicoPose matches features from the RGB observation with those from rendered object templates, identifying the best-matched template and establishing coarse correspondences. Secondly, PicoPose smooths the correspondences by globally regressing a 2D affine transformation, including in-plane rotation, scale, and 2D translation, from the coarse correspondence map. Thirdly, PicoPose applies the affine transformation to the feature map of the best-matched template and learns correspondence offsets within local regions to achieve fine-grained correspondences. By progressively refining the correspondences, PicoPose significantly improves the accuracy of object poses computed via PnP/RANSAC. PicoPose achieves state-of-the-art performance on the seven core datasets of the BOP benchmark, demonstrating exceptional generalization to novel objects. Code and trained models are available at https://github.com/foollh/PicoPose.
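The second stage described above regresses a 2D affine transformation made up of in-plane rotation, scale, and 2D translation. The following is a minimal sketch of how such a transformation can be composed and applied to template pixel coordinates. It is not the authors' implementation: the function names `affine_matrix` and `apply_affine` are hypothetical, the scale is assumed to be isotropic, and the transformation is applied to point coordinates rather than to a feature map as PicoPose does.

```python
import math


def affine_matrix(theta, scale, tx, ty):
    """Build a 2x3 affine matrix from an in-plane rotation (radians),
    an isotropic scale (assumption), and a 2D translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [scale * c, -scale * s, tx],
        [scale * s, scale * c, ty],
    ]


def apply_affine(matrix, points):
    """Map each (x, y) point through the 2x3 affine matrix."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Hypothetical template pixel coordinates (illustrative values only).
template_pixels = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Rotate by 90 degrees, scale by 2, then translate by (10, 5).
M = affine_matrix(math.pi / 2, 2.0, 10.0, 5.0)
warped = apply_affine(M, template_pixels)
```

In this sketch the warped coordinates play the role of the smoothed correspondences; the third stage in the abstract would then refine them with learned local offsets before the pose is computed via PnP/RANSAC.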
Similar Papers
Co-op: Correspondence-based Novel Object Pose Estimation
CV and Pattern Recognition
Helps robots see and grab objects they've never seen.
RefPose: Leveraging Reference Geometric Correspondences for Accurate 6D Pose Estimation of Unseen Objects
CV and Pattern Recognition
Helps robots find and grab new things.
Structure-Aware Correspondence Learning for Relative Pose Estimation
CV and Pattern Recognition
Helps robots understand object shapes without seeing them.