Score: 2

A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models

Published: April 4, 2025 | arXiv ID: 2504.03821v1

By: Andrew Kiruluta, Andreas Lemos

BigTech Affiliations: University of California, Berkeley

Potential Business Impact:

Makes computer pictures look more real.

Business Areas:
RFID Hardware

We present a novel generative modeling framework, Wavelet-Fourier-Diffusion, which adapts the diffusion paradigm to hybrid frequency representations in order to synthesize high-quality, high-fidelity images with improved spatial localization. In contrast to conventional diffusion models that rely exclusively on additive noise in pixel space, our approach leverages a multi-transform that combines wavelet sub-band decomposition with partial Fourier steps. This strategy progressively degrades and then reconstructs images in a hybrid spectral domain during the forward and reverse diffusion processes. By supplementing traditional Fourier-based analysis with the spatial localization capabilities of wavelets, our model can capture both global structures and fine-grained features more effectively. We further extend the approach to conditional image generation by integrating embeddings or conditional features via cross-attention. Experimental evaluations on CIFAR-10, CelebA-HQ, and a conditional ImageNet subset illustrate that our method achieves competitive or superior performance relative to baseline diffusion models and state-of-the-art GANs, as measured by Fréchet Inception Distance (FID) and Inception Score (IS). We also show how the hybrid frequency-based representation improves control over global coherence and fine texture synthesis, paving the way for new directions in multi-scale generative modeling.
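To make the idea of noising in a hybrid spectral domain concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the wavelet family, which sub-bands receive the "partial Fourier steps", or the noise schedule, so the sketch assumes a one-level Haar decomposition, an orthonormal FFT applied only to the low-frequency (LL) band, and a standard DDPM-style forward step with a single `alpha_bar` value. All function names are hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One-level orthonormal 2D Haar transform of an even-sized real array."""
    a = (x[0::2, :] + x[1::2, :]) / SQRT2   # low-pass along rows
    d = (x[0::2, :] - x[1::2, :]) / SQRT2   # high-pass along rows
    ll = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    lh = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    hl = (d[:, 0::2] + d[:, 1::2]) / SQRT2
    hh = (d[:, 0::2] - d[:, 1::2]) / SQRT2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2], a[:, 1::2] = (ll + lh) / SQRT2, (ll - lh) / SQRT2
    d[:, 0::2], d[:, 1::2] = (hl + hh) / SQRT2, (hl - hh) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[0::2, :], x[1::2, :] = (a + d) / SQRT2, (a - d) / SQRT2
    return x

def hybrid_forward(x):
    """Assumed hybrid transform: Haar sub-bands, with an FFT on the LL band only."""
    ll, lh, hl, hh = haar_dwt2(x)
    return np.fft.fft2(ll, norm="ortho"), lh, hl, hh

def hybrid_inverse(ll_hat, lh, hl, hh):
    """Undo the FFT on the LL band, then invert the Haar transform."""
    ll = np.fft.ifft2(ll_hat, norm="ortho").real
    return haar_idwt2(ll, lh, hl, hh)

def noise_step(x0, alpha_bar, rng):
    """DDPM-style forward step applied to hybrid coefficients (an assumption)."""
    ll_hat, lh, hl, hh = hybrid_forward(x0)
    s, n = np.sqrt(alpha_bar), np.sqrt(1.0 - alpha_bar)
    # Noise for the Fourier band is the FFT of real noise, so the result stays real.
    eps_ll = np.fft.fft2(rng.standard_normal(ll_hat.shape), norm="ortho")
    ll_hat_t = s * ll_hat + n * eps_ll
    bands_t = [s * b + n * rng.standard_normal(b.shape) for b in (lh, hl, hh)]
    return hybrid_inverse(ll_hat_t, *bands_t)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))          # stand-in for a grayscale image
x_t = noise_step(x0, alpha_bar=0.5, rng=rng)
```

Because both transforms here are orthonormal, using the same noise scale in every band is statistically equivalent to adding white noise in pixel space. A hybrid-domain method would presumably differ by treating bands differently, for example with per-band noise scales, but that choice is not described in the abstract and is left out of this sketch.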

Country of Origin
🇺🇸 United States

Page Count
11 pages

Category
Computer Science:
CV and Pattern Recognition