Generalizable Articulated Object Reconstruction from Casually Captured RGBD Videos
By: Weikun Peng, Jun Lv, Cewu Lu, and more
Potential Business Impact:
Lets robots understand and manipulate everyday objects with moving parts, using only casually captured smartphone video.
Articulated objects are prevalent in daily life. Understanding their kinematic structure and reconstructing them have numerous applications in embodied AI and robotics. However, current methods require carefully captured data for training or inference, preventing practical, scalable, and generalizable reconstruction of articulated objects. We focus on reconstructing an articulated object from a casually captured RGBD video shot with a hand-held camera. A casually captured video of an interaction with an articulated object is easy to acquire at scale using smartphones. However, this setting is quite challenging, as the object and camera move simultaneously and there are significant occlusions as the person interacts with the object. To tackle these challenges, we introduce a coarse-to-fine framework that infers joint parameters and segments movable parts of the object from a dynamic RGBD video. To evaluate our method under this new setting, we build a 20× larger synthetic dataset of 784 videos containing 284 objects across 11 categories. We compare our approach with existing methods that also take video as input. Experiments show that our method can reconstruct synthetic and real articulated objects across different categories from dynamic RGBD videos, significantly outperforming existing methods.
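To make the setting concrete: a joint of the kind the abstract refers to is conventionally parameterized by a type (revolute or prismatic), a 3D axis direction, a pivot point, and a scalar joint state, and the movable-part segmentation tells you which points the joint transform applies to. The sketch below is our own illustration of that standard parameterization, not the paper's code; all function and parameter names are hypothetical.

```python
import numpy as np

def rotation_about_axis(axis: np.ndarray, angle: float) -> np.ndarray:
    """Rodrigues' formula: 3x3 rotation about a unit axis by `angle` radians."""
    a = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def joint_transform(joint_type: str, axis: np.ndarray,
                    pivot: np.ndarray, state: float) -> np.ndarray:
    """4x4 rigid transform applied to the movable part for a given joint state
    (radians for a revolute joint, meters for a prismatic joint).
    Hypothetical helper for illustration only."""
    T = np.eye(4)
    if joint_type == "revolute":
        R = rotation_about_axis(axis, state)
        T[:3, :3] = R
        T[:3, 3] = pivot - R @ pivot  # rotate about the pivot point, not the origin
    elif joint_type == "prismatic":
        T[:3, 3] = state * axis / np.linalg.norm(axis)
    else:
        raise ValueError(f"unknown joint type: {joint_type}")
    return T

# Example: a cabinet door swinging 45 degrees about a vertical hinge at x=0.3.
T = joint_transform("revolute", np.array([0.0, 0.0, 1.0]),
                    np.array([0.3, 0.0, 0.0]), np.deg2rad(45.0))
```

Under this parameterization, reconstructing the object amounts to recovering the joint type, axis, pivot, and per-frame state together with the movable-part segmentation, which is what the paper's coarse-to-fine pipeline infers from the dynamic RGBD video.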
Similar Papers
Detection Based Part-level Articulated Object Reconstruction from Single RGBD Image
CV and Pattern Recognition
Builds part-level 3D models of articulated objects from a single RGBD image.
Articulated Object Estimation in the Wild
Robotics
Estimates the moving parts of everyday articulated objects in unconstrained, real-world settings.
VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video
CV and Pattern Recognition
Builds digital twins of articulated objects from a single monocular video.