Per-Query Visual Concept Learning
By: Ori Malca, Dvir Samuel, Gal Chechik
Potential Business Impact:
Teaches computers to draw your specific ideas.
Visual concept learning, also known as text-to-image personalization, is the process of teaching new concepts to a pretrained model. It has numerous applications, from product placement to entertainment and personalized design. Here we show that many existing methods can be substantially augmented by adding a personalization step that is (1) specific to the prompt and noise seed, and (2) driven by two loss terms, based on the self- and cross-attention, that capture the identity of the personalized concept. Specifically, we leverage PDM features - previously designed to capture identity - and show how they can be used to improve personalized semantic similarity. We evaluate the benefit that our method gains on top of six different personalization methods and several base text-to-image models (both UNet- and DiT-based). We find significant improvements even over previous per-query personalization methods.
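The abstract describes a per-query step with two attention-based loss terms. As an illustration only, the sketch below shows one plausible form of such a combined loss: a self-attention term comparing image-token attention maps between a generated pass and a reference (concept) pass, and a cross-attention term comparing image-to-text attention maps. All function names, shapes, and weights here are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_map(q, k):
    # scaled dot-product attention weights between query and key features
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))

def per_query_identity_loss(feats_gen, feats_ref, text_emb,
                            w_self=1.0, w_cross=1.0):
    """Illustrative combined self-/cross-attention loss (hypothetical).

    feats_gen, feats_ref: (num_image_tokens, dim) features from the
    generated and reference (concept) denoising passes.
    text_emb: (num_text_tokens, dim) embeddings of the concept prompt.
    """
    # self-attention maps over image tokens capture spatial identity structure
    sa_gen = attention_map(feats_gen, feats_gen)
    sa_ref = attention_map(feats_ref, feats_ref)
    # cross-attention maps tie image tokens to the concept's text tokens
    ca_gen = attention_map(feats_gen, text_emb)
    ca_ref = attention_map(feats_ref, text_emb)
    loss_self = np.mean((sa_gen - sa_ref) ** 2)
    loss_cross = np.mean((ca_gen - ca_ref) ** 2)
    return w_self * loss_self + w_cross * loss_cross
```

In this sketch the loss vanishes when the generated features match the reference features exactly and grows as the attention structure drifts, which is the general behavior an identity-preserving attention loss would need.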
Similar Papers
Salient Concept-Aware Generative Data Augmentation
CV and Pattern Recognition
Makes AI create better, more varied pictures from words.
A Comprehensive Survey on Visual Concept Mining in Text-to-image Diffusion Models
CV and Pattern Recognition
Teaches computers to draw exactly what you describe.
Personalized Image Descriptions from Attention Sequences
CV and Pattern Recognition
Helps computers describe pictures like you do.