LOSC: LiDAR Open-voc Segmentation Consolidator
By: Nermin Samet, Gilles Puy, Renaud Marlet
Potential Business Impact:
Helps self-driving cars label any object in their lidar scans, not just a fixed list of categories.
We study the use of image-based Vision-Language Models (VLMs) for open-vocabulary segmentation of lidar scans in driving settings. Classically, image semantics can be back-projected onto 3D point clouds. Yet, the resulting point labels are noisy and sparse. We consolidate these labels to enforce both spatio-temporal consistency and robustness to image-level augmentations. We then train a 3D network based on these refined labels. This simple method, called LOSC, outperforms the state of the art in zero-shot open-vocabulary semantic and panoptic segmentation on both nuScenes and SemanticKITTI, by significant margins.
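The abstract does not include code, but the consolidation idea can be illustrated with a minimal sketch. Each lidar point typically receives several (possibly conflicting) labels back-projected from different camera frames or augmented views; one simple way to consolidate them, assumed here for illustration (the actual LOSC method enforces spatio-temporal consistency in a more elaborate way), is a per-point majority vote over valid observations:

```python
import numpy as np

def consolidate_labels(point_labels, num_classes, ignore_label=-1):
    """Majority-vote consolidation of noisy per-point label observations.

    point_labels: (N, V) int array; each point has V label observations
    (e.g. from different camera frames or image-level augmentations),
    with ignore_label marking missing observations.
    Returns an (N,) array of consolidated labels; points with no valid
    observation keep ignore_label.
    """
    n_points = point_labels.shape[0]
    out = np.full(n_points, ignore_label, dtype=np.int64)
    for i in range(n_points):
        votes = point_labels[i]
        votes = votes[votes != ignore_label]  # drop missing observations
        if votes.size == 0:
            continue  # point never seen by any camera view
        counts = np.bincount(votes, minlength=num_classes)
        out[i] = counts.argmax()  # most frequent class wins
    return out

# Three points, three observations each: consistent, sparse, unseen.
obs = np.array([
    [0, 0, 1],    # two views say class 0, one says class 1
    [2, -1, -1],  # only one valid observation
    [-1, -1, -1], # never observed
])
print(consolidate_labels(obs, num_classes=3))  # → [ 0  2 -1]
```

The refined labels from such a step would then serve as training targets for the 3D segmentation network mentioned in the abstract.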
Similar Papers
Label-Efficient LiDAR Panoptic Segmentation
CV and Pattern Recognition
Teaches robots to understand surroundings with less data.
Zero-Shot 4D Lidar Panoptic Segmentation
CV and Pattern Recognition
Helps robots understand moving things in 3D.
Semantic Segmentation Algorithm Based on Light Field and LiDAR Fusion
CV and Pattern Recognition
Helps self-driving cars see through obstacles.