Leveraging 3D Representation Alignment and RGB Pretrained Priors for LiDAR Scene Generation
By: Nicolas Sereyjol-Garros, Ellington Kirby, Victor Besnier and more
Potential Business Impact:
Creates realistic LiDAR scenes for self-driving cars from limited 3D data.
LiDAR scene synthesis is an emerging solution to the scarcity of 3D data for robotic tasks such as autonomous driving. Recent approaches employ diffusion or flow matching models to generate realistic scenes, but 3D data remains limited compared to RGB datasets with millions of samples. We introduce R3DPA, the first LiDAR scene generation method to unlock image-pretrained priors for LiDAR point clouds, and leverage self-supervised 3D representations for state-of-the-art results. Specifically, we (i) align intermediate features of our generative model with self-supervised 3D features, which substantially improves generation quality; (ii) transfer knowledge from large-scale image-pretrained generative models to LiDAR generation, mitigating the limited size of LiDAR datasets; and (iii) enable point cloud control at inference for object inpainting and scene mixing with solely an unconditional model. On the KITTI-360 benchmark, R3DPA achieves state-of-the-art performance. Code and pretrained models are available at https://github.com/valeoai/R3DPA.
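The representation alignment in (i) follows the general recipe of regularizing a generative model's intermediate features toward a frozen self-supervised encoder. Below is a minimal PyTorch sketch of such a loss; the names (`AlignmentHead`, `alignment_loss`), the two-layer MLP projector, and the negative-cosine-similarity objective are illustrative assumptions, not the actual R3DPA implementation.

```python
# Hypothetical REPA-style alignment loss for a LiDAR generative model.
# The frozen self-supervised 3D encoder's features are the targets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AlignmentHead(nn.Module):
    """Projects intermediate generator features into the space of a
    frozen self-supervised 3D encoder so the two can be compared."""
    def __init__(self, gen_dim: int, ssl_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(gen_dim, ssl_dim),
            nn.SiLU(),
            nn.Linear(ssl_dim, ssl_dim),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.proj(h)

def alignment_loss(gen_feats: torch.Tensor, ssl_feats: torch.Tensor,
                   head: AlignmentHead) -> torch.Tensor:
    """Negative cosine similarity between projected generator features
    and frozen self-supervised 3D features, both shaped [B, N, D]."""
    pred = F.normalize(head(gen_feats), dim=-1)
    target = F.normalize(ssl_feats.detach(), dim=-1)  # frozen encoder: no grads
    return -(pred * target).sum(dim=-1).mean()

# In training, this term would be added to the usual denoising objective:
#   loss = denoising_loss + lambda_align * alignment_loss(h_mid, z_ssl, head)
```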
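Point (iii), controlling an unconditional model at inference, is commonly achieved by replacing the unmasked region with an appropriately noised copy of the known scene at every reverse step, so that only the masked region is synthesized. A minimal sketch under that assumption follows; the helpers `q_sample` and `p_step` are hypothetical stand-ins for the forward-noising and reverse-denoising operations, not the R3DPA API.

```python
# Hypothetical masked-inpainting loop with a purely unconditional
# diffusion model, in the spirit of RePaint-style conditioning.
import torch

@torch.no_grad()
def inpaint(model, x_known, mask, timesteps, q_sample, p_step):
    """Resynthesize the masked region of a LiDAR scene (e.g. an object box)
    while keeping the rest fixed.
    x_known: clean scene tensor; mask: 1 where content is regenerated."""
    x = torch.randn_like(x_known)
    for t in timesteps:                       # reversed diffusion schedule
        x_noisy_known = q_sample(x_known, t)  # noise known region to level t
        # Overwrite the kept region with properly noised ground truth,
        # so only the masked region is freely generated.
        x = mask * x + (1 - mask) * x_noisy_known
        x = p_step(model, x, t)               # one reverse denoising step
    return mask * x + (1 - mask) * x_known
```

Scene mixing follows the same pattern: the "known" content outside the mask can come from one scene while the masked region is regenerated to blend in content consistent with another.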
Similar Papers
DriveLiDAR4D: Sequential and Controllable LiDAR Scene Generation for Autonomous Driving
CV and Pattern Recognition
Creates realistic driving scenes for self-driving cars.
UniLiPs: Unified LiDAR Pseudo-Labeling with Geometry-Grounded Dynamic Scene Decomposition
CV and Pattern Recognition
Makes self-driving cars see better without labels.
Learning to Generate 4D LiDAR Sequences
CV and Pattern Recognition
Creates 3D car sensor data from words.