FaithFusion: Harmonizing Reconstruction and Generation via Pixel-wise Information Gain
By: YuAn Wang, Xiaofan Li, Chi Huang and more
Potential Business Impact:
Makes 3D scenes look real from any angle.
In controllable driving-scene reconstruction and 3D scene generation, maintaining geometric fidelity while synthesizing visually plausible appearance under large viewpoint shifts is crucial. However, effective fusion of geometry-based 3DGS and appearance-driven diffusion models faces inherent challenges, as the absence of pixel-wise, 3D-consistent editing criteria often leads to over-restoration and geometric drift. To address these issues, we introduce FaithFusion, a 3DGS-diffusion fusion framework driven by pixel-wise Expected Information Gain (EIG). EIG acts as a unified policy for coherent spatio-temporal synthesis: it guides diffusion as a spatial prior to refine high-uncertainty regions, while its pixel-level weighting distills the edits back into 3DGS. The resulting plug-and-play system requires no extra prior conditions or structural modifications. Extensive experiments on the Waymo dataset demonstrate that our approach attains state-of-the-art performance across NTA-IoU, NTL-IoU, and FID, maintaining an FID of 107.47 even at a 6-meter lane shift. Our code is available at https://github.com/wangyuanbiubiubiu/FaithFusion.
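The abstract does not say how EIG is computed or how the pixel-level weighting enters the 3DGS objective, so the following is only a rough illustration of the general idea of a per-pixel weight map steering distillation. Everything in it is an assumption made for illustration: the function names (`normalize_gain`, `blend_target`, `weighted_l1`), the min-max normalization, and the weighted L1 form are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def normalize_gain(gain, eps=1e-8):
    """Rescale a per-pixel gain map (H, W) to roughly [0, 1].

    Hypothetical min-max normalization; the paper may define its weights differently.
    """
    g_min, g_max = gain.min(), gain.max()
    return (gain - g_min) / (g_max - g_min + eps)


def blend_target(rendered, refined, weight):
    """Per-pixel mix of two images (H, W, C).

    A high weight trusts the diffusion-refined image; a low weight keeps the render.
    """
    w = weight[..., None]  # broadcast the (H, W) map over colour channels
    return w * refined + (1.0 - w) * rendered


def weighted_l1(rendered, refined, weight, eps=1e-8):
    """Weighted mean absolute difference, so high-weight pixels dominate the loss."""
    w = weight[..., None]
    channels = rendered.shape[-1]
    return float(np.sum(w * np.abs(rendered - refined)) / (np.sum(w) * channels + eps))


# Toy example: a 2x2 image with 3 channels and an arbitrary gain map.
gain = np.array([[0.0, 2.0], [4.0, 4.0]])
rendered = np.zeros((2, 2, 3))  # stand-in for a 3DGS render
refined = np.ones((2, 2, 3))    # stand-in for a diffusion-refined image

weight = normalize_gain(gain)
target = blend_target(rendered, refined, weight)
loss = weighted_l1(rendered, refined, weight)
```

In this toy setup, pixels with the lowest gain keep the rendered value and pixels with the highest gain take the refined value, which mirrors the abstract's description of editing only high-uncertainty regions while leaving well-reconstructed geometry untouched.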
Similar Papers
EGG-Fusion: Efficient 3D Reconstruction with Geometry-aware Gaussian Surfel on the Fly
CV and Pattern Recognition
Creates super-accurate 3D models in real-time.
IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion
CV and Pattern Recognition
Builds full 3D worlds from many pictures.
Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
CV and Pattern Recognition
Combines pictures using words to make better images.