OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning
By: Xusheng Guo , Wanfa Zhang , Shijia Zhao and more
Unsupervised 3D object detection leverages heuristic algorithms to discover potential objects, offering a promising route to reduce annotation costs in autonomous driving. Existing approaches mainly generate pseudo labels and refine them through self-training iterations. However, these pseudo-labels are often incorrect at the beginning of training, resulting in misleading the optimization process. Moreover, effectively filtering and refining them remains a critical challenge. In this paper, we propose OWL for unsupervised 3D object detection by occupancy guided warm-up and large-model priors reasoning. OWL first employs an Occupancy Guided Warm-up (OGW) strategy to initialize the backbone weight with spatial perception capabilities, mitigating the interference of incorrect pseudo-labels on network convergence. Furthermore, OWL introduces an Instance-Cued Reasoning (ICR) module that leverages the prior knowledge of large models to assess pseudo-label quality, enabling precise filtering and refinement. Finally, we design a Weight-adapted Self-training (WAS) strategy to dynamically re-weight pseudo-labels, improving the performance through self-training. Extensive experiments on Waymo Open Dataset (WOD) and KITTI demonstrate that OWL outperforms state-of-the-art unsupervised methods by over 15.0% mAP, revealing the effectiveness of our method.
Similar Papers
QueryOcc: Query-based Self-Supervision for 3D Semantic Occupancy
CV and Pattern Recognition
Teaches cars to see and understand 3D worlds.
Video Spatial Reasoning with Object-Centric 3D Rollout
CV and Pattern Recognition
Teaches computers to understand 3D object locations in videos.
OpenM3D: Open Vocabulary Multi-view Indoor 3D Object Detection without Human Annotations
CV and Pattern Recognition
Finds objects in 3D rooms without human labels.