ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping
By: Shun Iwase, Zubair Irshad, Katherine Liu, and more
Potential Business Impact:
Robots can grab things better by seeing them.
Robotic grasping is a cornerstone capability of embodied systems. Many methods directly output grasps from partial information without modeling the geometry of the scene, leading to suboptimal motion and even collisions. To address these issues, we introduce ZeroGrasp, a novel framework that simultaneously performs 3D reconstruction and grasp pose prediction in near real-time. A key insight of our method is that occlusion reasoning and modeling the spatial relationships between objects are beneficial for both accurate reconstruction and grasping. We couple our method with a novel large-scale synthetic dataset, which comprises 1M photo-realistic images, high-resolution 3D reconstructions, and 11.3B physically-valid grasp pose annotations for 12K objects from the Objaverse-LVIS dataset. We evaluate ZeroGrasp on the GraspNet-1B benchmark as well as through real-world robot experiments. ZeroGrasp achieves state-of-the-art performance and generalizes to novel real-world objects by leveraging synthetic data.
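To make the pipeline described in the abstract concrete, below is a minimal sketch of what a joint reconstruction-and-grasping interface might look like: the model takes a partial RGB-D observation, returns a completed shape plus scored grasp candidates per object, and a downstream planner picks the best grasp. The names (ObjectPrediction, predict_scene, pick_best_grasp) and the tensor shapes are hypothetical illustrations, not the paper's actual API or outputs.

```python
from dataclasses import dataclass
from typing import List, Optional
import numpy as np


@dataclass
class ObjectPrediction:
    """Hypothetical per-object output: completed shape plus grasp candidates."""
    occupancy_grid: np.ndarray   # (D, D, D) voxel occupancy of the reconstructed shape
    grasp_poses: np.ndarray      # (N, 4, 4) homogeneous grasp poses in the camera frame
    grasp_scores: np.ndarray     # (N,) predicted grasp quality scores


def predict_scene(rgb: np.ndarray, depth: np.ndarray) -> List[ObjectPrediction]:
    """Placeholder for a joint reconstruction-and-grasping model.

    A system in the spirit of the abstract would segment objects, complete
    their occluded geometry, and regress grasp poses in a single pass.
    Here we return an empty list so the surrounding code stays runnable.
    """
    return []


def pick_best_grasp(predictions: List[ObjectPrediction]) -> Optional[np.ndarray]:
    """Select the highest-scoring grasp pose across all reconstructed objects."""
    best_pose, best_score = None, -np.inf
    for obj in predictions:
        if obj.grasp_scores.size == 0:
            continue
        idx = int(np.argmax(obj.grasp_scores))
        if obj.grasp_scores[idx] > best_score:
            best_score = float(obj.grasp_scores[idx])
            best_pose = obj.grasp_poses[idx]
    return best_pose


if __name__ == "__main__":
    rgb = np.zeros((480, 640, 3), dtype=np.uint8)    # dummy RGB frame
    depth = np.zeros((480, 640), dtype=np.float32)   # dummy depth map
    grasp = pick_best_grasp(predict_scene(rgb, depth))
    print("Selected grasp:", grasp)
```

The design point this sketch illustrates is the coupling the abstract argues for: grasp selection operates over completed (occlusion-aware) geometry rather than the raw partial observation, which is what allows collision-aware, better-informed grasp choices.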
Similar Papers
ORACLE-Grasp: Zero-Shot Task-Oriented Robotic Grasping using Large Multimodal Models
Robotics
Robots learn to grab new things without practice.
VLAD-Grasp: Zero-shot Grasp Detection via Vision-Language Models
Robotics
Robots can grab new things without learning.
ZeroDexGrasp: Zero-Shot Task-Oriented Dexterous Grasp Synthesis with Prompt-Based Multi-Stage Semantic Reasoning
Robotics
Robots learn to grab things for any job.