Detail Enhanced Gaussian Splatting for Large-Scale Volumetric Capture
By: Julien Philip, Li Ma, Pascal Clausen and more
Potential Business Impact:
Makes movie characters look super real.
We present a unique system for large-scale, multi-performer, high-resolution 4D volumetric capture providing realistic free-viewpoint video up to and including 4K-resolution facial closeups. To achieve this, we employ a novel volumetric capture, reconstruction, and rendering pipeline based on Dynamic Gaussian Splatting and Diffusion-based Detail Enhancement. We design our pipeline specifically to meet the demands of high-end media production. We employ two capture rigs: the Scene Rig, which captures multi-actor performances at a resolution that falls short of 4K production quality, and the Face Rig, which records high-fidelity single-actor facial detail to serve as a reference for detail enhancement. We first reconstruct dynamic performances from the Scene Rig using 4D Gaussian Splatting, incorporating new model designs and training strategies to improve reconstruction, dynamic range, and rendering quality. Then, to render high-quality images for facial closeups, we introduce a diffusion-based detail enhancement model. This model is fine-tuned with high-fidelity data from the same actors recorded in the Face Rig. We train on paired data generated from low- and high-quality Gaussian Splatting (GS) models, using the low-quality input to match the quality of the Scene Rig and the high-quality GS renders as ground truth. Our results demonstrate the effectiveness of this pipeline in bridging the gap between the scalable performance capture of a large-scale rig and the high-resolution standards required for film and media production.
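The paired-data idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `TrainingPair`, `build_pairs`, `fake_render_low`, and `fake_render_high` are hypothetical, and the renderers are toy stand-ins for real low- and high-quality GS models. The sketch only shows the bookkeeping of rendering both models from the same camera and frame so that each low-quality input is pixel-aligned with its high-quality target.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[List[float]]  # toy grayscale image (rows of floats), for illustration only


@dataclass
class TrainingPair:
    camera: int
    frame: int
    low_quality: Image   # enhancement-model input, meant to mimic Scene Rig quality
    high_quality: Image  # supervision target from the high-quality GS model


def build_pairs(
    render_low: Callable[[int, int], Image],
    render_high: Callable[[int, int], Image],
    cameras: Sequence[int],
    frames: Sequence[int],
) -> List[TrainingPair]:
    """Render both models from identical cameras and frames and pair the results."""
    pairs = []
    for cam in cameras:
        for frame in frames:
            lq = render_low(cam, frame)
            hq = render_high(cam, frame)
            # A pair is only usable if both renders share the same resolution.
            if len(lq) != len(hq) or any(len(a) != len(b) for a, b in zip(lq, hq)):
                raise ValueError("low- and high-quality renders must share a resolution")
            pairs.append(TrainingPair(cam, frame, lq, hq))
    return pairs


# Hypothetical stand-in renderers so the sketch runs without any GS model.
def fake_render_high(camera: int, frame: int, size: int = 4) -> Image:
    # Checkerboard pattern standing in for fine detail.
    return [[float((x + y + camera + frame) % 2) for x in range(size)] for y in range(size)]


def fake_render_low(camera: int, frame: int, size: int = 4) -> Image:
    # Assumed degradation: average each 2x2 block and repeat it, which removes detail.
    hq = fake_render_high(camera, frame, size)
    out = [[0.0] * size for _ in range(size)]
    for by in range(0, size, 2):
        for bx in range(0, size, 2):
            mean = sum(hq[by + dy][bx + dx] for dy in (0, 1) for dx in (0, 1)) / 4.0
            for dy in (0, 1):
                for dx in (0, 1):
                    out[by + dy][bx + dx] = mean
    return out


pairs = build_pairs(fake_render_low, fake_render_high, cameras=[0, 1], frames=[0, 1, 2])
```

In a real setup the two callables would wrap renders of the low- and high-quality GS models described in the abstract. The block averaging used here is only a placeholder for whatever quality gap separates the two models.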
Similar Papers
VDEGaussian: Video Diffusion Enhanced 4D Gaussian Splatting for Dynamic Urban Scenes Modeling
CV and Pattern Recognition
Makes videos of moving things look clearer.
RobustSplat++: Decoupling Densification, Dynamics, and Illumination for In-the-Wild 3DGS
CV and Pattern Recognition
Makes 3D pictures ignore moving things and changing light.
From Volume Rendering to 3D Gaussian Splatting: Theory and Applications
CV and Pattern Recognition
Creates realistic 3D worlds from photos fast.