AnimalBooth: Multimodal Feature Enhancement for Animal Subject Personalization
By: Chen Liu, Haitao Wu, Kafeng Wang and more
Potential Business Impact:
Generates realistic images of a specific animal while preserving its identity.
Personalized animal image generation is challenging due to rich appearance cues and large morphological variability. Existing approaches often exhibit feature misalignment across domains, which leads to identity drift. We present AnimalBooth, a framework that strengthens identity preservation with an Animal Net and an adaptive attention module, mitigating cross-domain alignment errors. We further introduce a frequency-controlled feature integration module that applies Discrete Cosine Transform filtering in the latent space to guide the diffusion process, enabling a coarse-to-fine progression from global structure to detailed texture. To advance research in this area, we curate AnimalBench, a high-resolution dataset for animal personalization. Extensive experiments show that AnimalBooth consistently outperforms strong baselines on multiple benchmarks and improves both identity fidelity and perceptual quality.
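The abstract does not specify how the Discrete Cosine Transform filter is applied, so the following is only a minimal sketch of the general idea: low-pass filtering a latent tensor in the DCT domain, with a cutoff that widens as denoising proceeds. The function names (`dct_matrix`, `dct_lowpass`, `keep_ratio_schedule`), the linear schedule, the square low-frequency mask, and the `start=0.25` default are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def dct_lowpass(latent, keep_ratio):
    """Keep the lowest `keep_ratio` fraction of DCT frequencies on each axis.

    `latent` has shape (channels, height, width). The square mask is an
    assumed filter shape, not the paper's.
    """
    _, h, w = latent.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    coeffs = dh @ latent @ dw.T              # 2D DCT, broadcast over channels
    kh = max(1, int(round(keep_ratio * h)))
    kw = max(1, int(round(keep_ratio * w)))
    mask = np.zeros((h, w))
    mask[:kh, :kw] = 1.0                     # low frequencies sit top-left
    return dh.T @ (coeffs * mask) @ dw       # inverse 2D DCT

def keep_ratio_schedule(step, num_steps, start=0.25):
    """Hypothetical linear schedule from `start` up to the full band (1.0)."""
    t = step / max(1, num_steps - 1)
    return start + (1.0 - start) * t

# Usage: filter a random stand-in latent at each step of a 4-step schedule.
rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))
for step in range(4):
    ratio = keep_ratio_schedule(step, 4)
    filtered = dct_lowpass(latent, ratio)
    print(step, round(ratio, 2), float(np.abs(latent - filtered).mean()))
```

Early steps pass only low-frequency content (global structure), and the final step passes the full band, so the filtered latent equals the input up to floating-point error. How the filtered features are then injected into the diffusion model is not described in the abstract and is left out here.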
Similar Papers
AgeBooth: Controllable Facial Aging and Rejuvenation via Diffusion Models
CV and Pattern Recognition
Changes a person's age in a picture.
ID-Booth: Identity-consistent Face Generation with Diffusion Models
CV and Pattern Recognition
Creates realistic faces that look like specific people.
PersonaBooth: Personalized Text-to-Motion Generation
CV and Pattern Recognition
Creates unique character movements from text.