Simultaneous Pick and Place Detection by Combining SE(3) Diffusion Models with Differential Kinematics

Published: April 28, 2025 | arXiv ID: 2504.19502v2

By: Tianyi Ko, Takuya Ikeda, Balazs Opra and more

Potential Business Impact:

Teaches robots to grab and move things better.

Business Areas:

Indoor Positioning Navigation and Mapping

Grasp detection methods typically detect a set of free-floating hand poses that can grasp the object. However, not all of the detected grasp poses are executable, because of physical constraints. Although it is straightforward to filter out invalid grasp poses in post-processing, such a two-stage approach is computationally inefficient, especially when the constraint is hard. In this work, we propose an approach that takes the following two constraints into account during the grasp detection stage: (i) the picked object must be placeable in a predefined configuration without in-hand manipulation, and (ii) it must be reachable by the robot under joint-limit and collision-avoidance constraints for both the pick and the place cases. Our key idea is to train an SE(3) grasp diffusion network to estimate the noise in the form of spatial velocity, and to constrain the denoising process by multi-target differential inverse kinematics with an inequality constraint, so that the states are guaranteed to be reachable and placement can be performed without collision. In addition to an improved success ratio, we experimentally confirmed that our approach is more efficient and more consistent in computation time than a naive two-stage approach.
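
The idea of constraining each denoising step with differential inverse kinematics can be illustrated with a toy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: a planar two-joint arm replaces the real robot, positions in the plane replace SE(3) poses, the hypothetical `predicted_velocity` function stands in for the trained diffusion network, only a single target is handled instead of simultaneous pick and place, and a simple clip to joint limits stands in for the inequality-constrained solver. It only shows the general pattern: a velocity proposed in task space is mapped to joint space, and the update is kept inside the joint limits at every step.

```python
import numpy as np

# All values below are hypothetical and chosen only for illustration.
LINKS = np.array([1.0, 1.0])          # link lengths of a toy planar arm
Q_MIN = np.array([-np.pi / 2, 0.0])   # assumed lower joint limits
Q_MAX = np.array([np.pi / 2, 2.5])    # assumed upper joint limits


def forward_kinematics(q):
    """End-effector position of the planar two-joint arm."""
    a1, a2 = q[0], q[0] + q[1]
    return np.array([
        LINKS[0] * np.cos(a1) + LINKS[1] * np.cos(a2),
        LINKS[0] * np.sin(a1) + LINKS[1] * np.sin(a2),
    ])


def jacobian(q):
    """Partial derivatives of the end-effector position w.r.t. the joints."""
    a1, a2 = q[0], q[0] + q[1]
    return np.array([
        [-LINKS[0] * np.sin(a1) - LINKS[1] * np.sin(a2), -LINKS[1] * np.sin(a2)],
        [LINKS[0] * np.cos(a1) + LINKS[1] * np.cos(a2), LINKS[1] * np.cos(a2)],
    ])


def predicted_velocity(x, target, gain=0.5):
    """Stand-in for a learned denoiser: a velocity pointing at a known target."""
    return gain * (target - x)


def constrained_step(q, v, damping=1e-2):
    """Damped least-squares IK step, then projection onto the joint limits.

    The clip is a simplification of an inequality-constrained solve.
    """
    J = jacobian(q)
    dq = J.T @ np.linalg.solve(J @ J.T + damping**2 * np.eye(2), v)
    return np.clip(q + dq, Q_MIN, Q_MAX)


def denoise(q0, target, steps=200):
    """Iterate velocity proposals while staying inside the joint limits."""
    q = np.clip(np.asarray(q0, dtype=float), Q_MIN, Q_MAX)
    for _ in range(steps):
        v = predicted_velocity(forward_kinematics(q), target)
        q = constrained_step(q, v)
    return q


q_final = denoise([0.3, 1.0], target=np.array([1.2, 0.8]))
print("joint angles:", q_final)
print("end-effector:", forward_kinematics(q_final))
```

Because every intermediate state is a joint configuration inside the limits, the final result is reachable by construction, which is the property the abstract contrasts with filtering free-floating poses after detection.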