Persistent Object Gaussian Splat (POGS) for Tracking Human and Robot Manipulation of Irregularly Shaped Objects
By: Justin Yu, Kush Hari, Karim El-Refai, and more
Potential Business Impact:
Robots can now grab and move weird objects.
Tracking and manipulating irregularly-shaped, previously unseen objects in dynamic environments is important for robotic applications in manufacturing, assembly, and logistics. Recently introduced Gaussian Splats efficiently model object geometry, but lack persistent state estimation for task-oriented manipulation. We present Persistent Object Gaussian Splat (POGS), a system that embeds semantics, self-supervised visual features, and object grouping features into a compact representation that can be continuously updated to estimate the pose of scanned objects. POGS updates object states without requiring expensive rescanning or prior CAD models of objects. After an initial multi-view scene capture and training phase, POGS uses a single stereo camera to integrate depth estimates along with self-supervised vision encoder features for object pose estimation. POGS supports grasping, reorientation, and natural language-driven manipulation by refining object pose estimates, facilitating sequential object reset operations with human-induced object perturbations and tool servoing, where robots recover tool pose despite tool perturbations of up to 30°. POGS achieves up to 12 consecutive successful object resets and recovers from 80% of in-grasp tool perturbations.
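To make the "perturbations of up to 30°" figure concrete, the sketch below shows one common way to measure the angle between a reference tool orientation and an estimated one, using the standard relation between the trace of a relative rotation matrix and its rotation angle. It is an illustration only and not the POGS implementation: the function names (`rot_z`, `rotation_angle_deg`, `within_limit`) are hypothetical, and treating 30° as a pass/fail threshold is an assumption made for the example.

```python
import math

def rot_z(deg):
    """Return a 3x3 rotation matrix for a rotation of `deg` degrees about z."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_angle_deg(r_ref, r_est):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(R_ref^T R_est) equals the element-wise (Frobenius) inner product.
    trace = sum(r_ref[i][j] * r_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def within_limit(r_ref, r_est, limit_deg=30.0):
    """Hypothetical check: is the perturbation no larger than `limit_deg`?"""
    return rotation_angle_deg(r_ref, r_est) <= limit_deg
```

For example, `within_limit(rot_z(0), rot_z(25))` is true, while a 40° offset between the two orientations is not within the assumed limit.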
Similar Papers
Physics-Aware Human-Object Rendering from Sparse Views via 3D Gaussian Splatting
Graphics
Makes computer videos show people touching things realistically.
SPAGS: Sparse-View Articulated Object Reconstruction from Single State via Planar Gaussian Splatting
CV and Pattern Recognition
Builds 3D models from just a few pictures.
PEGS: Physics-Event Enhanced Large Spatiotemporal Motion Reconstruction via 3D Gaussian Splatting
CV and Pattern Recognition
Makes blurry videos clear and tracks fast movement.