PEO: Training-Free Aesthetic Quality Enhancement in Pre-Trained Text-to-Image Diffusion Models with Prompt Embedding Optimization
By: Hovhannes Margaryan, Bo Wan, Tinne Tuytelaars
Potential Business Impact:
Makes AI art look better from simple words.
This paper introduces a novel approach to improving aesthetic quality in pre-trained text-to-image diffusion models when given a simple prompt. Our method, dubbed Prompt Embedding Optimization (PEO), uses a pre-trained text-to-image diffusion model as a backbone and optimizes the text embedding of a simple, uncurated prompt to enhance the visual quality of the generated image. We achieve this with a tripartite objective function that improves the aesthetic fidelity of the generated image, ensures adherence to the optimized text embedding, and keeps divergence from the initial prompt minimal. The last of these is accomplished through a prompt preservation term. Additionally, PEO is training-free and backbone-independent. Quantitative and qualitative evaluations confirm the effectiveness of the proposed method, which matches or exceeds the performance of state-of-the-art text-to-image and prompt adaptation methods.
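The abstract names the three terms of the objective but not their exact form, so the following is only a toy sketch of the general idea: gradient-based optimization of a prompt embedding while the generator stays frozen. Everything in it is an assumption made for illustration and is not the authors' implementation: the linear `generate` function stands in for a diffusion backbone, `w_aes` stands in for an aesthetic scorer, cosine similarity is used for both the adherence and preservation terms, and the weights `lam_adh` and `lam_pres` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy embedding size (assumption)

# Hypothetical frozen components: a toy "generator" and a toy "aesthetic scorer".
G = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)
w_aes = rng.normal(size=DIM)


def cosine(a, b):
    """Cosine similarity with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def generate(e):
    """Toy stand-in for the frozen text-to-image backbone."""
    return np.tanh(G @ e)


def objective(e, e0, lam_adh=0.5, lam_pres=0.5):
    """Three-part loss (lower is better); the term forms are assumptions."""
    img = generate(e)
    aesthetic = float(w_aes @ img)   # aesthetic quality of the output
    adherence = cosine(img, e)       # output follows the optimized embedding
    preservation = cosine(e, e0)     # embedding stays close to the initial prompt
    return -(aesthetic + lam_adh * adherence + lam_pres * preservation)


def numerical_grad(f, e, eps=1e-5):
    """Central finite-difference gradient; fine for a tiny toy problem."""
    grad = np.zeros_like(e)
    for i in range(e.size):
        step = np.zeros_like(e)
        step[i] = eps
        grad[i] = (f(e + step) - f(e - step)) / (2 * eps)
    return grad


def optimize_embedding(e0, steps=50, lr=0.05):
    """Update only the embedding; the generator weights are never touched."""
    e = e0.copy()
    best = objective(e, e0)
    for _ in range(steps):
        grad = numerical_grad(lambda x: objective(x, e0), e)
        candidate = e - lr * grad
        value = objective(candidate, e0)
        if value < best:
            e, best = candidate, value  # accept only improving steps
        else:
            lr *= 0.5                   # otherwise shrink the step size
    return e


e0 = rng.normal(size=DIM)        # embedding of the initial simple prompt
e_opt = optimize_embedding(e0)   # optimized embedding
```

Only the embedding is updated in this sketch, which mirrors the training-free property described above: no model weights are changed. A real setup would replace the finite-difference gradient with backpropagation through the backbone and scorer.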
Similar Papers
Training-Free Generation of Diverse and High-Fidelity Images via Prompt Semantic Space Optimization
CV and Pattern Recognition
Makes AI image generators produce more varied pictures.
PromptLNet: Region-Adaptive Aesthetic Enhancement via Prompt Guidance in Low-Light Enhancement Net
CV and Pattern Recognition
Makes dark pictures look good, not just bright.
PromptEnhancer: A Simple Approach to Enhance Text-to-Image Models via Chain-of-Thought Prompt Rewriting
CV and Pattern Recognition
Makes AI pictures match your exact words.