MonoCT: Overcoming Monocular 3D Detection Domain Shift with Consistent Teacher Models
By: Johannes Meier, Louis Inchingolo, Oussema Dhaouadi, and more
Potential Business Impact:
Helps cars see in 3D without extra cameras.
We tackle the problem of monocular 3D object detection across different sensors, environments, and camera setups. In this paper, we introduce a novel unsupervised domain adaptation approach, MonoCT, that generates highly accurate pseudo labels for self-supervision. Inspired by our observation that accurate depth estimation is critical to mitigating domain shifts, MonoCT introduces a novel Generalized Depth Enhancement (GDE) module with an ensemble concept to improve depth estimation accuracy. Moreover, we introduce a novel Pseudo Label Scoring (PLS) module by exploring inner-model consistency measurement and a Diversity Maximization (DM) strategy to further generate high-quality pseudo labels for self-training. Extensive experiments on six benchmarks show that MonoCT outperforms existing SOTA domain adaptation methods by large margins (at least ~21% in AP Mod.) and generalizes well to car, traffic camera, and drone views.
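The abstract's two key ingredients, ensemble-based depth refinement and consistency-based pseudo-label scoring, can be illustrated with a minimal sketch. Note this is not the paper's implementation: the function names (`ensemble_depth`, `pseudo_label_score`), the mean/std fusion rule, and the spread threshold are all assumptions chosen to convey the general idea of fusing several depth predictions and keeping pseudo labels only where the ensemble agrees.

```python
import numpy as np

def ensemble_depth(depth_maps):
    """Fuse per-model depth maps (hypothetical stand-in for the GDE module).

    depth_maps: list of HxW arrays, one per ensemble member.
    Returns the per-pixel mean depth and the per-pixel standard deviation,
    the latter serving as a simple disagreement/uncertainty cue.
    """
    stack = np.stack(depth_maps)          # shape: (num_models, H, W)
    fused = stack.mean(axis=0)            # ensemble average depth
    spread = stack.std(axis=0)            # ensemble disagreement
    return fused, spread

def pseudo_label_score(depth_spread, threshold=0.5):
    """Toy consistency score (loosely inspired by the PLS idea).

    Fraction of pixels where the ensemble members agree to within
    `threshold` meters; higher means a more trustworthy pseudo label.
    The threshold value is an illustrative assumption.
    """
    return float((depth_spread < threshold).mean())
```

A pseudo label whose region shows low ensemble spread would then score near 1.0 and be kept for self-training, while high-disagreement regions would be filtered out.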
Similar Papers
Generalizing Monocular 3D Object Detection
CV and Pattern Recognition
Helps cars see in 3D from one picture.
GATE3D: Generalized Attention-based Task-synergized Estimation in 3D
CV and Pattern Recognition
Helps robots see in 3D everywhere, not just roads.
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
CV and Pattern Recognition
Makes self-driving cars see better in 3D.