Dynamic Avatar-Scene Rendering from Human-centric Context
By: Wenqing Wang, Haosen Yang, Josef Kittler, and more
Potential Business Impact:
Makes videos of people and their real-world surroundings look realistic.
Reconstructing dynamic humans interacting with real-world environments from monocular videos is an important and challenging task. Despite considerable progress in 4D neural rendering, existing approaches either model dynamic scenes holistically or model humans and scenes separately by introducing parametric human priors. The former neglects the distinct motion characteristics of different components in the scene, especially humans, leading to incomplete reconstructions; the latter ignores the information exchange between the separately modeled components, resulting in spatial inconsistencies and visual artifacts at human-scene boundaries. To address this, we propose a Separate-then-Map (StM) strategy that introduces a dedicated information mapping mechanism to bridge separately defined and optimized models. Our method employs a shared transformation function for each Gaussian attribute to unify the separately modeled components, improving computational efficiency by avoiding exhaustive pairwise interactions while ensuring spatial and visual coherence between humans and their surroundings. Extensive experiments on monocular video datasets demonstrate that StM significantly outperforms existing state-of-the-art methods in both visual quality and rendering accuracy, particularly at challenging human-scene interaction boundaries.
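To make the shared-transformation idea concrete, below is a minimal PyTorch-style sketch of a Separate-then-Map-like setup: human and scene Gaussians are kept as separate parameter sets, and one shared per-attribute mapping is applied to both before joint rendering. All names, dimensions, and the residual-MLP design here are illustrative assumptions for exposition, not the authors' actual implementation.

```python
# Minimal sketch of a "Separate-then-Map" style pipeline (assumed design):
# human and scene Gaussians are optimized as separate sets, then a *shared*
# per-attribute transformation maps both into one coherent space.
import torch
import torch.nn as nn

class GaussianSet(nn.Module):
    """A bag of 3D Gaussians with typical splatting attributes."""
    def __init__(self, num_points: int):
        super().__init__()
        self.means = nn.Parameter(torch.randn(num_points, 3))      # centers
        self.scales = nn.Parameter(torch.zeros(num_points, 3))     # log-scales
        self.rotations = nn.Parameter(torch.randn(num_points, 4))  # quaternions
        self.opacities = nn.Parameter(torch.zeros(num_points, 1))  # pre-sigmoid
        self.colors = nn.Parameter(torch.rand(num_points, 3))      # RGB features

class SharedAttributeMap(nn.Module):
    """One small MLP per Gaussian attribute, shared by the human and scene
    sets, so both components are expressed in a common space without any
    pairwise human-scene interactions."""
    def __init__(self, dims=(3, 3, 4, 1, 3), hidden: int = 32):
        super().__init__()
        self.maps = nn.ModuleList(
            nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, d))
            for d in dims
        )

    def forward(self, gs: GaussianSet):
        attrs = (gs.means, gs.scales, gs.rotations, gs.opacities, gs.colors)
        # Residual mapping keeps the separately optimized attributes intact
        # while nudging them toward a shared, scene-consistent representation.
        return [a + f(a) for a, f in zip(attrs, self.maps)]

human, scene = GaussianSet(2048), GaussianSet(8192)
shared_map = SharedAttributeMap()

# Apply the *same* transformation to both sets, then concatenate the mapped
# attributes for joint rasterization / rendering downstream.
unified = [torch.cat([h, s], dim=0)
           for h, s in zip(shared_map(human), shared_map(scene))]
means, scales, rotations, opacities, colors = unified
print(means.shape)  # torch.Size([10240, 3])
```

Because the mapping is shared and applied per attribute, its cost grows linearly with the number of Gaussians rather than quadratically with human-scene pairs, which is the efficiency argument the abstract makes.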
Similar Papers
Asset-Driven Semantic Reconstruction of Dynamic Scene with Multi-Human-Object Interactions
CV and Pattern Recognition
Makes 3D models of moving people and things.
AHA! Animating Human Avatars in Diverse Scenes with Gaussian Splatting
CV and Pattern Recognition
Makes animated people look real in 3D videos.
SHARE: Scene-Human Aligned Reconstruction
CV and Pattern Recognition
Puts people in 3D worlds accurately from videos.