Beyond Binary Preference: Aligning Diffusion Models to Fine-grained Criteria by Decoupling Attributes
By: Chenye Meng, Zejian Li, Zhongni Liu and more
Potential Business Impact:
Teaches computers to judge art like experts.
Post-training alignment of diffusion models relies on simplified signals, such as scalar rewards or binary preferences. This limits alignment with complex human expertise, which is hierarchical and fine-grained. To address this, we first construct a hierarchical, fine-grained set of evaluation criteria with domain experts, which decomposes image quality into multiple positive and negative attributes organized in a tree structure. Building on this, we propose a two-stage alignment framework. First, we inject domain knowledge into an auxiliary diffusion model via Supervised Fine-Tuning. Second, we introduce Complex Preference Optimization (CPO), which extends DPO to align the target diffusion model with our non-binary, hierarchical criteria. Specifically, we reformulate the alignment problem to simultaneously maximize the probability of positive attributes and minimize the probability of negative attributes, using the auxiliary diffusion model. We instantiate our approach in the domain of painting generation and conduct CPO training on a dataset of paintings annotated with fine-grained attributes based on our criteria. Extensive experiments demonstrate that CPO significantly enhances generation quality and alignment with expertise, opening new avenues for fine-grained criteria alignment.
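As a rough illustration of the criteria described in the abstract, the sketch below models a tree whose leaves are positive or negative attributes, then splits one image's annotations into the two groups that the objective would push up or down. It is not the authors' implementation: the class `Criterion`, the function `split_by_polarity`, and every attribute name are hypothetical, and the CPO loss itself is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set, Tuple


@dataclass
class Criterion:
    """One node of a hypothetical criteria tree (names are illustrative only)."""
    name: str
    polarity: Optional[str] = None  # "positive" or "negative" on leaves, None on inner nodes
    children: List["Criterion"] = field(default_factory=list)

    def leaves(self, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], "Criterion"]]:
        """Yield (path from root, leaf node) pairs in depth-first order."""
        here = path + (self.name,)
        if not self.children:
            yield here, self
        for child in self.children:
            yield from child.leaves(here)


def split_by_polarity(root: Criterion, annotated: Set[str]) -> Tuple[List[str], List[str]]:
    """Split the annotated leaf attributes of one image into positive and negative paths."""
    positives: List[str] = []
    negatives: List[str] = []
    for path, leaf in root.leaves():
        if leaf.name not in annotated:
            continue
        label = "/".join(path)
        if leaf.polarity == "positive":
            positives.append(label)
        elif leaf.polarity == "negative":
            negatives.append(label)
    return positives, negatives


# Hypothetical criteria for paintings; the real expert-built criteria are not given in the abstract.
criteria = Criterion("painting quality", children=[
    Criterion("composition", children=[
        Criterion("balanced layout", "positive"),
        Criterion("cluttered layout", "negative"),
    ]),
    Criterion("color", children=[
        Criterion("harmonious palette", "positive"),
        Criterion("muddy tones", "negative"),
    ]),
])

# Annotations for a single hypothetical painting.
pos, neg = split_by_polarity(criteria, {"balanced layout", "muddy tones"})
print("positive:", pos)
print("negative:", neg)
```

Keeping the full path from the root in each label preserves the hierarchy, so one annotation can carry several positive and negative attributes at once instead of a single win/lose label.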
Similar Papers
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
CV and Pattern Recognition
Makes AI art better by learning from many opinions.
Multi-dimensional Preference Alignment by Conditioning Reward Itself
CV and Pattern Recognition
Teaches AI to create better pictures by understanding different feedback.
Mind the Generative Details: Direct Localized Detail Preference Optimization for Video Diffusion Models
CV and Pattern Recognition
Makes AI videos look more real and flow better.