FreeInpaint: Tuning-free Prompt Alignment and Visual Rationality Enhancement in Image Inpainting
By: Chao Gong, Dong Li, Yingwei Pan, and more
Text-guided image inpainting aims to generate new content within specified regions of an image according to user-provided textual prompts. The primary challenge is to accurately align the inpainted areas with the prompts while maintaining high visual fidelity. Although existing inpainting methods produce visually convincing results by leveraging pre-trained text-to-image diffusion models, they still struggle to uphold prompt alignment and visual rationality simultaneously. In this work, we introduce FreeInpaint, a plug-and-play, tuning-free approach that directly optimizes the diffusion latents on the fly during inference to improve the faithfulness of the generated images. Technically, we introduce a prior-guided noise optimization method that steers model attention toward valid inpainting regions by optimizing the initial noise. Furthermore, we meticulously design a composite guidance objective tailored specifically to the inpainting task. This objective efficiently directs the denoising process, enhancing prompt alignment and visual rationality by optimizing the intermediate latents at each step. Through extensive experiments spanning various inpainting diffusion models and evaluation metrics, we demonstrate the effectiveness and robustness of our proposed FreeInpaint.
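The tuning-free idea described above can be sketched in toy form: at each reverse-diffusion step, nudge the intermediate latent down the gradient of a guidance loss before passing it to the next denoising step, leaving all model weights untouched. The sketch below is purely illustrative; `guidance_loss`, `denoise_step`, and `inpaint_with_guidance` are hypothetical stand-ins (the paper's actual composite objective and diffusion model are not reproduced here), with a quadratic loss and a latent-shrinking step used only to make the control flow concrete.

```python
import numpy as np

def guidance_loss(latent, target):
    # Toy stand-in for a composite guidance objective; in FreeInpaint
    # this would combine prompt-alignment and visual-rationality terms.
    return np.sum((latent - target) ** 2)

def guidance_grad(latent, target):
    # Analytic gradient of the toy quadratic loss above.
    return 2.0 * (latent - target)

def denoise_step(latent, t):
    # Placeholder for one reverse-diffusion step of a pretrained
    # inpainting model; here it merely shrinks the latent.
    return 0.9 * latent

def inpaint_with_guidance(latent, target, steps=50, lr=0.05):
    """Test-time latent optimization: at every step, apply a gradient
    update on the intermediate latent (tuning-free: no weights change),
    then run the ordinary denoising update."""
    for t in range(steps):
        latent = latent - lr * guidance_grad(latent, target)  # guidance update
        latent = denoise_step(latent, t)                      # model update
    return latent

rng = np.random.default_rng(0)
z0 = rng.normal(size=(4, 4))          # initial noise latent
z_final = inpaint_with_guidance(z0, np.zeros((4, 4)))
```

In the real method the gradient would come from backpropagating the guidance objective through the diffusion model's attention maps, not from a closed-form expression; the loop structure, however, is the same.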