LADB: Latent Aligned Diffusion Bridges for Semi-Supervised Domain Translation
By: Xuqin Wang, Tao Wu, Yanfeng Zhang and more
Potential Business Impact:
Teaches computers to change pictures with less data.
Diffusion models excel at generating high-quality outputs but face challenges in data-scarce domains, where exhaustive retraining or costly paired data are often required. To address these limitations, we propose Latent Aligned Diffusion Bridges (LADB), a semi-supervised framework for sample-to-sample translation that effectively bridges domain gaps using partially paired data. By aligning source and target distributions within a shared latent space, LADB seamlessly integrates pretrained source-domain diffusion models with a target-domain Latent Aligned Diffusion Model (LADM), trained on partially paired latent representations. This approach enables deterministic domain mapping without the need for full supervision. Compared to unpaired methods, which often lack controllability, and fully paired approaches that require large, domain-specific datasets, LADB strikes a balance between fidelity and diversity by leveraging a mixture of paired and unpaired latent-target couplings. Our experimental results demonstrate superior performance in depth-to-image translation under partial supervision. Furthermore, we extend LADB to handle multi-source translation (from depth maps and segmentation masks) and multi-target translation in a class-conditioned style transfer task, showcasing its versatility in handling diverse and heterogeneous use cases. Ultimately, we present LADB as a scalable and versatile solution for real-world domain translation, particularly in scenarios where data annotation is costly or incomplete.
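The "mixture of paired and unpaired latent-target couplings" mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the couplings are built, so everything here is an assumption made for illustration: a paired target is coupled with the latent of its source sample (for example, one obtained from the pretrained source-domain model), an unpaired target is coupled with standard Gaussian noise, and latents share the shape of the targets. The function name build_couplings and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def build_couplings(targets, source_latents, paired_mask, rng):
    """Build (latent, target) couplings for one partially paired batch.

    ASSUMPTION (not from the paper): paired targets reuse the latent of
    their source sample; unpaired targets get fresh Gaussian noise.
    Rows of source_latents where paired_mask is False are ignored.
    """
    targets = np.asarray(targets, dtype=np.float64)
    source_latents = np.asarray(source_latents, dtype=np.float64)
    paired_mask = np.asarray(paired_mask, dtype=bool)

    if source_latents.shape != targets.shape:
        raise ValueError("source_latents must have the same shape as targets")
    if paired_mask.shape != targets.shape[:1]:
        raise ValueError("paired_mask must have one entry per sample")

    # Noise latents for every sample; only the unpaired rows are kept.
    noise = rng.standard_normal(targets.shape)

    # Reshape the per-sample mask so it broadcasts over feature dimensions.
    mask = paired_mask.reshape((-1,) + (1,) * (targets.ndim - 1))

    # Paired rows take the source latent, unpaired rows take the noise.
    latents = np.where(mask, source_latents, noise)
    return latents, targets


# Usage: a batch of 4 samples with 8 features, half of them paired.
rng = np.random.default_rng(0)
targets = rng.standard_normal((4, 8))
source_latents = rng.standard_normal((4, 8))
paired_mask = np.array([True, False, True, False])
latents, coupled_targets = build_couplings(targets, source_latents, paired_mask, rng)
```

Under these assumptions, the fraction of True entries in paired_mask plays the role of the supervision level: all True corresponds to fully paired training, and all False to training on noise-target couplings only.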