Unleashing Semantic and Geometric Priors for 3D Scene Completion
By: Shiyuan Chen, Wei Sui, Bohao Zhang and more
Potential Business Impact:
Helps cars understand the 3D world around them for safer driving.
Camera-based 3D semantic scene completion (SSC) provides dense geometric and semantic perception for autonomous driving and robotic navigation. However, existing methods rely on a single coupled encoder to deliver both semantic and geometric priors, which forces the model to trade off conflicting demands and limits its overall performance. To address this limitation, we propose FoundationSSC, a novel framework that performs dual decoupling at both the source and pathway levels. At the source level, we introduce a foundation encoder that provides rich semantic feature priors for the semantic branch and high-fidelity stereo cost volumes for the geometric branch. At the pathway level, these priors are refined through specialised, decoupled pathways, yielding superior semantic context and depth distributions. Our dual-decoupling design produces disentangled and refined inputs, which are then utilised by a hybrid view transformation to generate complementary 3D features. We also introduce a novel Axis-Aware Fusion (AAF) module that addresses the often-overlooked challenge of fusing these features by anisotropically merging them into a unified representation. Extensive experiments show that FoundationSSC improves semantic and geometric metrics simultaneously, surpassing the previous best results by +0.23 mIoU and +2.03 IoU on SemanticKITTI. It also achieves state-of-the-art performance on SSCBench-KITTI-360, with 21.78 mIoU and 48.61 IoU. The code will be released upon acceptance.
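For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how IoU (geometric completion) and mIoU (semantic completion) are commonly computed on a voxel grid. It is not the authors' evaluation code. The label conventions (0 for empty space, 255 for ignored voxels), the function names, and the flattened-list input format are assumptions made for illustration. The sketch returns fractions in [0, 1], whereas benchmark tables usually report percentages accumulated over a whole dataset.

```python
from typing import Sequence

EMPTY = 0     # assumed label for free space
IGNORE = 255  # assumed label for voxels excluded from evaluation


def completion_iou(pred: Sequence[int], target: Sequence[int]) -> float:
    """Geometric IoU: occupied vs. empty, regardless of semantic class."""
    inter = union = 0
    for p, t in zip(pred, target):
        if t == IGNORE:
            continue
        p_occ, t_occ = p != EMPTY, t != EMPTY
        inter += p_occ and t_occ
        union += p_occ or t_occ
    return inter / union if union else 0.0


def semantic_miou(pred: Sequence[int], target: Sequence[int],
                  num_classes: int) -> float:
    """Mean per-class IoU over semantic classes 1..num_classes-1."""
    ious = []
    for c in range(1, num_classes):
        inter = union = 0
        for p, t in zip(pred, target):
            if t == IGNORE:
                continue
            inter += (p == c) and (t == c)
            union += (p == c) or (t == c)
        if union:  # simplification: skip classes absent from both grids
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six voxels, two semantic classes plus empty.
pred = [0, 1, 1, 2, 2, 0]
target = [0, 1, 2, 2, 255, 1]
print(completion_iou(pred, target))    # occupancy overlap only
print(semantic_miou(pred, target, 3))  # class-aware overlap
```

The toy example shows why the two numbers can move independently: a voxel predicted as the wrong class still counts as correctly occupied for IoU but lowers mIoU.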
Similar Papers
Towards 3D Object-Centric Feature Learning for Semantic Scene Completion
CV and Pattern Recognition
Helps self-driving cars see objects better.
CurriFlow: Curriculum-Guided Depth Fusion with Optical Flow-Based Temporal Alignment for 3D Semantic Scene Completion
CV and Pattern Recognition
Helps cars see and understand 3D spaces.
SplatSSC: Decoupled Depth-Guided Gaussian Splatting for Semantic Scene Completion
CV and Pattern Recognition
Makes 3D pictures from one photo better.