Rethinking Transparent Object Grasping: Depth Completion with Monocular Depth Estimation and Instance Mask

Published: August 4, 2025 | arXiv ID: 2508.02507v1

By: Yaofeng Cheng, Xinkai Gao, Sen Zhang, and more

Potential Business Impact:

Lets robots accurately grasp transparent objects

Due to their optical properties, transparent objects often cause depth cameras to produce incomplete or invalid depth data, which in turn reduces the accuracy and reliability of robotic grasping. Existing approaches typically feed the RGB-D image directly into a network that outputs the completed depth, expecting the model to implicitly infer which depth values are reliable. However, while effective on training datasets, such methods often fail to generalize to real-world scenarios, where complex light interactions produce highly variable distributions of valid and invalid depth values. To address this, we propose ReMake, a novel depth completion framework guided by an instance mask and monocular depth estimation. By explicitly distinguishing transparent regions from non-transparent ones, the mask enables the model to concentrate on learning accurate depth estimation for these regions from RGB-D input during training. This targeted supervision reduces reliance on implicit reasoning and improves generalization to real-world scenarios. Additionally, monocular depth estimation provides depth context between the transparent object and its surroundings, further improving prediction accuracy. Extensive experiments show that our method outperforms existing approaches on benchmark datasets and in real-world scenarios, demonstrating superior accuracy and generalization. Code and videos are available at https://chengyaofeng.github.io/ReMake.github.io/.
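To make the idea concrete, here is a minimal PyTorch sketch of mask-guided depth completion in the spirit of the abstract. Everything in it is an assumption for illustration: the `DepthCompletionNet` architecture, the channel-stacking fusion of RGB, raw depth, mask, and monocular depth, and the `masked_depth_loss` weighting are hypothetical stand-ins, not ReMake's actual implementation (see the linked project page for the real code).

```python
# Conceptual sketch (NOT the authors' code) of mask-guided depth completion.
# The network, fusion strategy, and loss weighting are illustrative guesses.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthCompletionNet(nn.Module):
    """Hypothetical encoder-decoder mapping the stacked inputs
    (RGB, masked raw depth, transparency mask, monocular depth)
    to a dense completed depth map."""
    def __init__(self, in_channels=6):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb, raw_depth, mask, mono_depth):
        # Zero out sensor depth inside transparent regions, so the network
        # cannot rely on invalid values there (the mask makes the
        # reliable/unreliable split explicit rather than implicit).
        masked_depth = raw_depth * (1.0 - mask)
        x = torch.cat([rgb, masked_depth, mask, mono_depth], dim=1)  # B x 6 x H x W
        return self.decoder(self.encoder(x))

def masked_depth_loss(pred, gt, mask):
    # Supervise mostly inside the transparent-object mask, echoing the
    # abstract's "targeted supervision"; the 0.1 background weight is a guess.
    region = mask + 0.1 * (1.0 - mask)
    return (region * F.l1_loss(pred, gt, reduction="none")).mean()

# Toy example: batch of 2, 128x128 inputs.
rgb = torch.rand(2, 3, 128, 128)
raw_depth = torch.rand(2, 1, 128, 128)
mask = (torch.rand(2, 1, 128, 128) > 0.8).float()  # 1 = transparent pixel
mono_depth = torch.rand(2, 1, 128, 128)            # e.g. from an off-the-shelf MDE model
gt = torch.rand(2, 1, 128, 128)

net = DepthCompletionNet()
pred = net(rgb, raw_depth, mask, mono_depth)
loss = masked_depth_loss(pred, gt, mask)
```

The design choice worth noting is that the mask is used twice: once on the input, to discard unreliable sensor depth before fusion, and once in the loss, to focus supervision on the transparent regions, which matches the abstract's argument for explicit rather than implicit reliability reasoning.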

Country of Origin
🇨🇳 China

Page Count
11 pages

Category
Computer Science: Computer Vision and Pattern Recognition