Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection
By: Jiangyi Wang, Na Zhao
Potential Business Impact:
Teaches robots to see better indoors with less data.
Active learning has emerged as a promising approach to reduce the substantial annotation burden in 3D object detection tasks, spurring several initiatives in outdoor environments. However, its application in indoor environments remains unexplored. Compared to outdoor 3D datasets, indoor datasets pose significant challenges, including fewer training samples per class, a greater number of classes, more severe class imbalance, more diverse scene types, and greater intra-class variance. This paper presents the first study on active learning for indoor 3D object detection, where we propose a novel framework tailored for this task. Our method incorporates two key criteria, uncertainty and diversity, to actively select the most ambiguous and informative unlabeled samples for annotation. The uncertainty criterion accounts for both inaccurate detections and undetected objects, ensuring that the most ambiguous samples are prioritized. Meanwhile, the diversity criterion is formulated as a joint optimization problem that maximizes the diversity of both object class distributions and scene types, using a new Class-aware Adaptive Prototype (CAP) bank. The CAP bank dynamically allocates representative prototypes to each class, helping to capture varying intra-class diversity across different categories. We evaluate our method on SUN RGB-D and ScanNetV2, where it outperforms baselines by a significant margin, achieving over 85% of fully-supervised performance with just 10% of the annotation budget.
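To make the interplay of the two criteria concrete, the following is a minimal sketch of one way an uncertainty score and a class-distribution diversity term could be combined in a greedy selection loop. It is not the paper's formulation: the abstract does not give the scoring functions, the joint optimization, or the CAP bank details, so the function name `select_samples`, the `weight` parameter, the additive score, and the use of entropy as the diversity measure are all assumptions made for illustration. Scene-type diversity and prototypes are left out.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def select_samples(uncertainty, class_hist, budget, weight=1.0):
    """Greedily pick `budget` scenes (hypothetical helper, not the paper's method).

    uncertainty: per-scene uncertainty score, shape (n,)
    class_hist:  per-scene predicted object counts per class, shape (n, k)
    weight:      assumed trade-off between uncertainty and diversity
    """
    uncertainty = np.asarray(uncertainty, dtype=float)
    class_hist = np.asarray(class_hist, dtype=float)
    remaining = set(range(len(uncertainty)))
    pooled = np.zeros(class_hist.shape[1])  # class counts of the selected set
    selected = []
    for _ in range(min(budget, len(uncertainty))):
        best, best_score = None, -np.inf
        for i in sorted(remaining):
            total = pooled + class_hist[i]
            s = total.sum()
            # Diversity: entropy of the class distribution after adding scene i.
            diversity = entropy(total / s) if s > 0 else 0.0
            score = uncertainty[i] + weight * diversity
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
        pooled += class_hist[best]
    return selected

# Toy data: four scenes, three classes (illustrative values only).
scores = [0.9, 0.8, 0.1, 0.85]
hists = [[5, 0, 0], [4, 0, 0], [0, 3, 3], [5, 1, 0]]
picked_uncertain = select_samples(scores, hists, budget=2, weight=0.0)
picked_balanced = select_samples(scores, hists, budget=2, weight=2.0)
```

With `weight=0.0` the loop reduces to picking the most uncertain scenes, which in the toy data all contain the same dominant class. Raising `weight` pulls in a low-uncertainty scene that covers the otherwise missing classes, which is the kind of trade-off the abstract attributes to combining the two criteria.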
Similar Papers
IDEAL-M3D: Instance Diversity-Enriched Active Learning for Monocular 3D Detection
CV and Pattern Recognition
Teaches AI to see 3D objects with fewer pictures.
DUAL: Diversity and Uncertainty Active Learning for Text Summarization
Computation and Language
Teaches computers to summarize text better with less data.
HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection
CV and Pattern Recognition
Teaches self-driving cars with fewer examples.