Score: 1

Class-agnostic 3D Segmentation by Granularity-Consistent Automatic 2D Mask Tracking

Published: November 2, 2025 | arXiv ID: 2511.00785v1

By: Juan Wang , Yasutomo Kawanishi , Tomo Miyazaki and more

Potential Business Impact:

Makes 3D pictures from videos more accurate.

Business Areas:

Image Recognition Data and Analytics, Software

3D instance segmentation is an important task for real-world applications. To avoid costly manual annotations, existing methods have explored generating pseudo labels by transferring 2D masks from foundation models to 3D. However, this approach is often suboptimal since the video frames are processed independently. This causes inconsistent segmentation granularity and conflicting 3D pseudo labels, which degrades the accuracy of final segmentation. To address this, we introduce a Granularity-Consistent automatic 2D Mask Tracking approach that maintains temporal correspondences across frames, eliminating conflicting pseudo labels. Combined with a three-stage curriculum learning framework, our approach progressively trains from fragmented single-view data to unified multi-view annotations, ultimately globally coherent full-scene supervision. This structured learning pipeline enables the model to progressively expose to pseudo-labels of increasing consistency. Thus, we can robustly distill a consistent 3D representation from initially fragmented and contradictory 2D priors. Experimental results demonstrated that our method effectively generated consistent and accurate 3D segmentations. Furthermore, the proposed method achieved state-of-the-art results on standard benchmarks and open-vocabulary ability.

Integrating SAM Supervision for 3D Weakly Supervised Point Cloud Segmentation

CV and Pattern Recognition

Helps computers understand 3D shapes with less 3D data.

27 Aug 2025 1

89%

OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

CV and Pattern Recognition

Lets robots understand and find any object.

3 Dec 2025 2

89%

SGS-3D: High-Fidelity 3D Instance Segmentation via Reliable Semantic Mask Splitting and Growing

CV and Pattern Recognition

Makes 3D pictures show objects more clearly.

5 Sep 2025 1

View PDF Login to Bookmark

Page Count

31 pages

Class-agnostic 3D Segmentation by Granularity-Consistent Automatic 2D Mask Tracking

Makes 3D pictures from videos more accurate.

Technical Abstract

Integrating SAM Supervision for 3D Weakly Supervised Point Cloud Segmentation

OpenTrack3D: Towards Accurate and Generalizable Open-Vocabulary 3D Instance Segmentation

SGS-3D: High-Fidelity 3D Instance Segmentation via Reliable Semantic Mask Splitting and Growing