Variational Shape Inference for Grasp Diffusion on SE(3)
By: S. Talha Bukhari, Kaivalya Agrawal, Zachary Kingston, and more
Potential Business Impact:
Robots learn to grab objects better, even with bad data.
Grasp synthesis is a fundamental task in robotic manipulation that typically admits multiple feasible solutions. Multimodal grasp synthesis seeks to generate diverse sets of stable grasps conditioned on object geometry, making robust learning of geometric features crucial for success. To address this challenge, we propose a framework for learning multimodal grasp distributions that leverages variational shape inference to enhance robustness against shape noise and measurement sparsity. Our approach first trains a variational autoencoder for shape inference using implicit neural representations, then uses the learned geometric features to guide a diffusion model for grasp synthesis on the SE(3) manifold. Additionally, we introduce a test-time grasp optimization technique that can be integrated as a plugin to further improve grasping performance. Experimental results show that our shape-inference-guided grasp synthesis formulation outperforms state-of-the-art multimodal grasp synthesis methods on the ACRONYM dataset by 6.3%, while remaining more robust than other approaches to degradation in point cloud density. Furthermore, our trained model achieves zero-shot transfer to real-world manipulation of household objects, generating 34% more successful grasps than baselines despite measurement noise and point cloud calibration errors.
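To make the two-stage pipeline in the abstract concrete, here is a minimal PyTorch sketch, not the authors' released code: all class and function names are hypothetical, the grasp pose is represented as a simple 9-D vector (translation plus a 6-D rotation parameterization) rather than a proper SE(3) manifold element, the reverse step is a simplified Euler update, and the paper's test-time grasp optimization plugin is omitted. It only illustrates the structure: a VAE that infers a latent shape code from a point cloud and decodes an implicit field, followed by a diffusion-style denoiser for grasps conditioned on that code.

```python
import torch
import torch.nn as nn

class ShapeVAE(nn.Module):
    """Variational shape inference: point cloud -> latent code -> implicit field."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.point_encoder = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 2 * latent_dim))
        self.field_decoder = nn.Sequential(
            nn.Linear(latent_dim + 3, 128), nn.ReLU(), nn.Linear(128, 1))

    def encode(self, points):
        # points: (N, 3); mean-pool per-point features for permutation invariance
        h = self.point_encoder(points).mean(dim=0)
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return z, mu, logvar

    def decode(self, z, queries):
        # queries: (M, 3) -> implicit occupancy/SDF value at each query point
        zq = torch.cat([z.expand(queries.shape[0], -1), queries], dim=-1)
        return self.field_decoder(zq).squeeze(-1)


class GraspDenoiser(nn.Module):
    """Predicts a denoising update for a grasp pose, conditioned on the shape code."""
    def __init__(self, latent_dim=128):
        super().__init__()
        # Grasp pose as translation (3) + 6-D rotation representation = 9 values
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 9 + 1, 256), nn.ReLU(), nn.Linear(256, 9))

    def forward(self, grasp, t, z):
        # grasp: (9,), t: scalar diffusion time, z: (latent_dim,) shape code
        x = torch.cat([grasp, t.view(1), z], dim=-1)
        return self.net(x)


@torch.no_grad()
def sample_grasp(vae, denoiser, points, steps=50):
    """Reverse diffusion from noise to a grasp pose, guided by the inferred shape."""
    z, _, _ = vae.encode(points)
    grasp = torch.randn(9)                              # start from Gaussian noise
    for i in reversed(range(steps)):
        t = torch.tensor(i / steps)
        grasp = grasp - denoiser(grasp, t, z) / steps   # simplified Euler step
    return grasp
```

In the paper's actual formulation the diffusion runs on the SE(3) manifold (e.g., via its Lie algebra) rather than on a flat pose vector, and the sampled grasps would additionally pass through the test-time optimization stage; both are simplified away here for brevity.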
Similar Papers
Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics
Robotics
Teaches robots to grab and move things better.
GAGrasp: Geometric Algebra Diffusion for Dexterous Grasping
Robotics
Robots can grab objects from any angle.
Towards a Multi-Embodied Grasping Agent
Robotics
Robots learn to grab anything with any hand.