Improving Compositional Generation with Diffusion Models Using Lift Scores
By: Chenning Yu, Sicun Gao
Potential Business Impact:
Makes AI pictures match your exact words.
We introduce a novel resampling criterion based on lift scores for improving compositional generation in diffusion models. Using lift scores, we evaluate whether a generated sample aligns with each individual condition and then compose these results to decide whether the full composed prompt is satisfied. Our key insight is that lift scores can be efficiently approximated using only the original diffusion model, requiring no additional training or external modules. We also develop an optimized variant that reduces computational overhead during inference while maintaining effectiveness. Through extensive experiments, we demonstrate that lift scores significantly improve condition alignment for compositional generation across 2D synthetic data, CLEVR position tasks, and text-to-image synthesis. Our code is available at http://rainorangelemon.github.io/complift.
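To make the idea concrete, here is a minimal sketch of how a lift score could be approximated with only a pretrained diffusion model and then used as a per-condition accept/reject test. It assumes a hypothetical noise-prediction model `eps_model(x_t, t, cond)`, a `null_cond` input for unconditional prediction, and a precomputed `alphas_cumprod` schedule; the names and the specific error-difference estimator are illustrative assumptions, not the authors' exact implementation.

```python
import torch

@torch.no_grad()
def approx_lift_score(eps_model, x0, cond, null_cond, alphas_cumprod,
                      timesteps, n_noise_samples=4):
    """Roughly estimate log p(x0 | cond) - log p(x0) for one sample.

    The estimate is the average reduction in denoising error when the model
    is conditioned on `cond` versus the null (unconditional) input. A positive
    value suggests x0 is better explained with the condition than without it.
    """
    total = 0.0
    for t in timesteps:
        a_bar = alphas_cumprod[t]
        for _ in range(n_noise_samples):
            eps = torch.randn_like(x0)
            # Forward-noise x0 to timestep t.
            x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
            err_cond = (eps_model(x_t, t, cond) - eps).pow(2).mean()
            err_uncond = (eps_model(x_t, t, null_cond) - eps).pow(2).mean()
            total += (err_uncond - err_cond).item()
    return total / (len(timesteps) * n_noise_samples)


def accept_composed_sample(eps_model, x0, conditions, null_cond,
                           alphas_cumprod, timesteps):
    """Accept x0 only if every individual condition has a positive lift;
    otherwise the sample would be resampled."""
    return all(
        approx_lift_score(eps_model, x0, c, null_cond,
                          alphas_cumprod, timesteps) > 0
        for c in conditions
    )
```

In this sketch, a composed prompt is treated as a set of individual conditions, and a sample is kept only when each condition's estimated lift is positive; rejected samples would be regenerated, matching the resampling criterion described in the abstract at a high level.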
Similar Papers
ScoreMix: Improving Face Recognition via Score Composition in Diffusion Generators
Computer Vision and Pattern Recognition
Makes AI better at recognizing things with less data.
LUSD: Localized Update Score Distillation for Text-Guided Image Editing
Graphics
Makes AI better at changing pictures without messing up.
Composition and Alignment of Diffusion Models using Constrained Learning
Machine Learning (CS)
Makes AI create better pictures with many rules.