Blur2Sharp: Human Novel Pose and View Synthesis with Generative Prior Refinement
By: Chia-Hern Lai, I-Hsuan Lo, Yen-Ku Yeh, and more
Creating lifelike human avatars that support realistic pose variation and flexible viewpoints remains a fundamental challenge in computer vision and graphics. Current approaches typically either produce geometrically inconsistent multi-view images or sacrifice photorealism, yielding blurry outputs under diverse viewing angles and complex motions. To address these issues, we propose Blur2Sharp, a novel framework that integrates 3D-aware neural rendering with diffusion models to generate sharp, geometrically consistent novel-view images from a single reference view. Our method employs a dual-conditioning architecture: first, a Human NeRF model renders geometrically coherent multi-view images for target poses, explicitly encoding 3D structural guidance; a diffusion model conditioned on these renderings then refines the generated images, preserving fine-grained details and structural fidelity. We further enhance visual quality through hierarchical feature fusion, incorporating texture, normal, and semantic priors extracted from parametric SMPL models to improve both global coherence and local detail accuracy. Extensive experiments demonstrate that Blur2Sharp consistently surpasses state-of-the-art techniques in both novel pose and novel view synthesis, particularly under challenging scenarios involving loose clothing and occlusions.
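To make the dual-conditioning pipeline concrete, the sketch below outlines the data flow in PyTorch-style pseudocode under stated assumptions: the `HumanNeRF` renderer and the conditioned diffusion model are hypothetical stand-ins passed in as callables, and `HierarchicalFusion` is an illustrative guess at how texture, normal, and semantic SMPL priors might be merged. None of the module names, shapes, or channel counts come from the paper; this is not the authors' implementation.

```python
# Illustrative sketch of the Blur2Sharp data flow described in the abstract.
# HumanNeRF / diffusion networks are hypothetical callables; only the fusion
# module is spelled out, and its design is an assumption, not the paper's.
import torch
import torch.nn as nn


class HierarchicalFusion(nn.Module):
    """Fuses texture, normal, and semantic priors into one guidance map."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # One lightweight encoder per SMPL-derived prior (assumed design).
        self.texture_enc = nn.Conv2d(3, channels, 3, padding=1)
        self.normal_enc = nn.Conv2d(3, channels, 3, padding=1)
        self.semantic_enc = nn.Conv2d(1, channels, 3, padding=1)
        # 1x1 conv merges the concatenated features into a single guidance map.
        self.merge = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, texture, normal, semantic):
        feats = torch.cat(
            [
                self.texture_enc(texture),
                self.normal_enc(normal),
                self.semantic_enc(semantic),
            ],
            dim=1,
        )
        return self.merge(feats)


def blur2sharp_step(human_nerf, diffusion, fusion, ref_image, pose, view, priors):
    # Stage 1: geometry-consistent but possibly blurry render from Human NeRF.
    coarse = human_nerf(ref_image, pose, view)
    # SMPL-derived texture / normal / semantic priors become guidance features.
    guidance = fusion(*priors)
    # Stage 2: the diffusion model refines the coarse render, conditioned on
    # both the render itself and the fused guidance.
    return diffusion(coarse, condition=guidance)


if __name__ == "__main__":
    fusion = HierarchicalFusion()
    priors = (
        torch.rand(1, 3, 64, 64),  # texture prior
        torch.rand(1, 3, 64, 64),  # normal prior
        torch.rand(1, 1, 64, 64),  # semantic prior
    )
    # Dummy stand-ins for the two real networks, for shape-checking only.
    nerf = lambda ref, pose, view: torch.rand(1, 3, 64, 64)
    diff = lambda coarse, condition: coarse  # identity placeholder
    out = blur2sharp_step(nerf, diff, fusion, torch.rand(1, 3, 64, 64), None, None, priors)
    print(out.shape)  # torch.Size([1, 3, 64, 64])
```

The key structural point the sketch captures is the two-stage conditioning: the diffusion model never sees the reference view directly, only the NeRF render plus the fused SMPL priors, which is what lets it sharpen detail without breaking geometric consistency.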
Similar Papers
TurboPortrait3D: Single-step diffusion-based fast portrait novel-view synthesis
CV and Pattern Recognition
Creates realistic 3D people from one photo.
CloseUpShot: Close-up Novel View Synthesis from Sparse-views via Point-conditioned Diffusion Model
CV and Pattern Recognition
Creates detailed 3D views from few pictures.
ViSA: 3D-Aware Video Shading for Real-Time Upper-Body Avatar Creation
CV and Pattern Recognition
Creates realistic 3D characters from photos.