IDESplat: Iterative Depth Probability Estimation for Generalizable 3D Gaussian Splatting
By: Wei Long, Haifeng Wu, Shiyin Jiang and more
Potential Business Impact:
Makes 3D scenes look real with less data.
Generalizable 3D Gaussian Splatting aims to directly predict Gaussian parameters using a feed-forward network for scene reconstruction. Among these parameters, Gaussian means are particularly difficult to predict, so depth is usually estimated first and then unprojected to obtain the Gaussian sphere centers. Existing methods typically rely on a single warp to estimate depth probability, which prevents them from fully leveraging cross-view geometric cues and results in unstable, coarse depth maps. To address this limitation, we propose IDESplat, which iteratively applies warp operations to boost depth probability estimation for accurate Gaussian mean prediction. First, to eliminate the inherent instability of a single warp, we introduce a Depth Probability Boosting Unit (DPBU) that multiplicatively integrates the epipolar attention maps produced by cascaded warp operations. Next, we construct an iterative depth estimation process by stacking multiple DPBUs, progressively identifying depth candidates with high likelihood. As IDESplat iteratively boosts depth probability estimates and updates the depth candidates, the depth map is gradually refined, resulting in accurate Gaussian means. We conduct experiments on RealEstate10K (RE10K), ACID, and DL3DV. IDESplat achieves outstanding reconstruction quality and state-of-the-art performance with real-time efficiency. On RE10K, it outperforms DepthSplat by 0.33 dB in PSNR while using only 10.7% of the parameters and 70% of the memory. In cross-dataset experiments on the DTU dataset, IDESplat improves PSNR by 2.95 dB over DepthSplat, demonstrating its strong generalization ability.
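The two ideas in the abstract, multiplicative combination of per-warp depth probabilities and unprojection of depth to Gaussian centers, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the pinhole camera model, and the use of plain per-candidate score maps in place of the DPBU's epipolar attention maps are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the depth-candidate axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def boost_depth_probability(score_maps):
    """Combine several depth-probability estimates multiplicatively.

    score_maps: list of (H, W, D) arrays, one matching score per depth
    candidate for each warp (hypothetical stand-in for attention maps).
    Returns an (H, W, D) array that sums to 1 over the D candidates.
    """
    prob = np.ones_like(score_maps[0], dtype=float)
    for scores in score_maps:
        prob = prob * softmax(scores, axis=-1)
    return prob / prob.sum(axis=-1, keepdims=True)

def expected_depth(prob, candidates):
    # Probability-weighted mean of the depth candidates, shape (H, W).
    return (prob * candidates).sum(axis=-1)

def unproject(depth, K, cam_to_world):
    """Lift a depth map to world-space points (assumed pinhole model).

    depth: (H, W) depth along the camera z-axis.
    K: (3, 3) intrinsics; cam_to_world: (4, 4) pose.
    Returns (H, W, 3) points usable as Gaussian centers.
    """
    H, W = depth.shape
    # Pixel centers in homogeneous coordinates.
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)
    rays = pix @ np.linalg.inv(K).T          # camera-space rays with z = 1
    pts_cam = rays * depth[..., None]
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t

# Usage with random scores from three hypothetical warps.
rng = np.random.default_rng(0)
H, W, D = 4, 6, 8
candidates = np.linspace(1.0, 10.0, D)
scores = [rng.normal(size=(H, W, D)) for _ in range(3)]
prob = boost_depth_probability(scores)
depth = expected_depth(prob, candidates)
K = np.array([[50.0, 0.0, W / 2], [0.0, 50.0, H / 2], [0.0, 0.0, 1.0]])
centers = unproject(depth, K, np.eye(4))
```

Multiplying the normalized maps sharpens the distribution wherever the warps agree, which is one plausible reading of why a product is more stable than any single estimate. The candidate-update step of the iterative process is not shown, since the abstract does not specify how the candidates are resampled.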
Similar Papers
IDSplat: Instance-Decomposed 3D Gaussian Splatting for Driving Scenes
CV and Pattern Recognition
Lets self-driving cars learn from real-world driving.
RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS
CV and Pattern Recognition
Makes 3D pictures ignore moving things and changing light.
EcoSplat: Efficiency-controllable Feed-forward 3D Gaussian Splatting from Multi-view Images
CV and Pattern Recognition
Makes 3D pictures with fewer computer shapes.