Jointly Conditioned Diffusion Model for Multi-View Pose-Guided Person Image Synthesis
By: Chengyu Xie, Zhi Gong, Junchi Ren, and more
Potential Business Impact:
Creates realistic images of people from different angles.
Pose-guided human image generation is limited by incomplete textures from single reference views and the absence of explicit cross-view interaction. We present the Jointly Conditioned Diffusion Model (JCDM), a diffusion framework that exploits multi-view priors. The appearance prior module (APM) infers a holistic, identity-preserving prior from incomplete references, and the joint conditional injection (JCI) mechanism fuses multi-view cues and injects shared conditioning into the denoising backbone to align identity, color, and texture across poses. JCDM supports a variable number of reference views and integrates with standard diffusion backbones through minimal, targeted architectural modifications. Experiments demonstrate state-of-the-art fidelity and cross-view consistency.
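The abstract describes the mechanism only at a high level. Below is a minimal PyTorch sketch of the general idea it outlines: encode a variable number of reference views, fuse them into a shared condition, and inject that condition into a denoising backbone via cross-attention. Everything here (the module names MultiViewConditioner and TinyDenoiser, the dimensions, token counts, and the learned-query fusion strategy) is an illustrative assumption, not the paper's actual APM/JCI implementation.

```python
# Sketch of multi-view joint conditioning for a diffusion denoiser.
# All architecture choices are assumptions for illustration only.
import torch
import torch.nn as nn

class MultiViewConditioner(nn.Module):
    """Fuses features from a variable number of reference views (hypothetical JCI-style fusion)."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Toy per-view appearance encoder (stand-in for an APM-like module).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.SiLU(),
        )
        # Learned queries attend over all view tokens -> fixed-size shared condition.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 8, feat_dim))  # 8 shared condition tokens

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (B, V, 3, H, W) with a variable number of views V
        b, v, _, _, _ = views.shape
        feats = self.encoder(views.flatten(0, 1))            # (B*V, D, h', w')
        tokens = feats.flatten(2).transpose(1, 2)            # (B*V, h'*w', D)
        tokens = tokens.reshape(b, v * tokens.shape[1], -1)  # concat tokens across views
        q = self.query.expand(b, -1, -1)
        cond, _ = self.attn(q, tokens, tokens)               # (B, 8, D) shared conditioning
        return cond

class TinyDenoiser(nn.Module):
    """Toy denoising backbone; the shared condition is injected via cross-attention."""
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.in_conv = nn.Conv2d(3, feat_dim, 3, padding=1)
        self.cross_attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.out_conv = nn.Conv2d(feat_dim, 3, 3, padding=1)

    def forward(self, x_t: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        h = self.in_conv(x_t)                                # (B, D, H, W)
        b, d, hh, ww = h.shape
        seq = h.flatten(2).transpose(1, 2)                   # (B, H*W, D)
        attn_out, _ = self.cross_attn(seq, cond, cond)       # inject shared condition
        seq = seq + attn_out
        h = seq.transpose(1, 2).reshape(b, d, hh, ww)
        return self.out_conv(h)                              # predicted noise

if __name__ == "__main__":
    cond_net, denoiser = MultiViewConditioner(), TinyDenoiser()
    refs = torch.randn(2, 3, 3, 64, 64)                     # batch of 2, V=3 reference views
    x_t = torch.randn(2, 3, 64, 64)                         # noisy target-pose image
    eps = denoiser(x_t, cond_net(refs))
    print(eps.shape)                                        # torch.Size([2, 3, 64, 64])
```

A learned-query attention pool is one simple way to satisfy the "variable number of reference views" requirement, since the query length stays fixed regardless of how many view tokens it attends over.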
Similar Papers
Multi-focal Conditioned Latent Diffusion for Person Image Synthesis
CV and Pattern Recognition
Uses AI to create realistic pictures of people.
SyncMV4D: Synchronized Multi-view Joint Diffusion of Appearance and Motion for Hand-Object Interaction Synthesis
CV and Pattern Recognition
Creates realistic 3D animations of hands interacting with objects.
Coupled Diffusion Sampling for Training-Free Multi-View Image Editing
CV and Pattern Recognition
Edits pictures from many angles so they all stay consistent.