Score: 1

Diffusion Transformer meets Multi-level Wavelet Spectrum for Single Image Super-Resolution

Published: November 3, 2025 | arXiv ID: 2511.01175v2

By: Peng Du, Hui Li, Han Xu and more

BigTech Affiliations: Samsung

Potential Business Impact:

Makes blurry pictures sharp and clear.

Business Areas:
DSP Hardware

The Discrete Wavelet Transform (DWT) has been widely explored to enhance the performance of image super-resolution (SR). Although some DWT-based methods improve SR by capturing fine-grained frequency signals, most existing approaches neglect the interrelations among multiscale frequency sub-bands, resulting in inconsistencies and unnatural artifacts in the reconstructed images. To address this challenge, we propose a Diffusion Transformer model based on image Wavelet spectra for SR (DTWSR). DTWSR combines the strengths of diffusion models and transformers to capture the interrelations among multiscale frequency sub-bands, leading to more consistent and realistic SR images. Specifically, we use a Multi-level Discrete Wavelet Transform to decompose images into wavelet spectra. We propose a pyramid tokenization method that embeds the spectra into a sequence of tokens for the transformer, making it easier to capture features from both the spatial and frequency domains. A dual decoder is designed to handle the distinct variances of the low-frequency and high-frequency sub-bands while preserving their alignment during image generation. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our method, with high performance in both perceptual quality and fidelity.
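
To make the decomposition step concrete, here is a minimal sketch of a multi-level 2D DWT followed by a toy tokenization of the resulting sub-bands. It is not the authors' implementation. The abstract names neither the wavelet nor the details of the pyramid tokenization, so the Haar wavelet, the sub-band labels, and the patch-size rule below (patches double in size at each finer level so that every sub-band yields the same number of tokens) are all assumptions made for illustration. The function names `haar_dwt2`, `multilevel_dwt2`, `patchify`, and `pyramid_tokens` are hypothetical.

```python
def haar_dwt2(img):
    """One level of an orthonormal 2D Haar DWT (assumed wavelet).

    img is a list of rows with even height and width.
    Returns (LL, LH, HL, HH), each half the size of img.
    Sub-band naming conventions vary between libraries.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(img), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            r_lh.append((a + b - c - d) / 2)  # top-minus-bottom difference
            r_hl.append((a - b + c - d) / 2)  # left-minus-right difference
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def multilevel_dwt2(img, levels):
    """Apply haar_dwt2 repeatedly to the LL band.

    Returns (final LL, details), where details[k] holds the (LH, HL, HH)
    bands of level k + 1, so details[0] is the finest level.
    """
    details = []
    ll = img
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        details.append((lh, hl, hh))
    return ll, details


def patchify(band, patch):
    """Cut a band into non-overlapping patch x patch blocks, flattened row-major."""
    tokens = []
    for i in range(0, len(band), patch):
        for j in range(0, len(band[0]), patch):
            tokens.append(
                [band[i + di][j + dj] for di in range(patch) for dj in range(patch)]
            )
    return tokens


def pyramid_tokens(ll, details, base_patch=1):
    """Hypothetical pyramid tokenization: coarse-to-fine token sequence.

    The patch size doubles at each finer level, so each sub-band contributes
    the same number of tokens and token k covers the same image region in
    every band. Each token is a (band name, level, values) tuple.
    """
    levels = len(details)
    tokens = [("LL", levels, t) for t in patchify(ll, base_patch)]
    for level in range(levels, 0, -1):  # coarsest detail level first
        patch = base_patch * 2 ** (levels - level)
        for name, band in zip(("LH", "HL", "HH"), details[level - 1]):
            tokens += [(name, level, t) for t in patchify(band, patch)]
    return tokens
```

For an 8×8 input and two levels, `multilevel_dwt2` returns a 2×2 LL band, three 2×2 detail bands at level 2, and three 4×4 detail bands at level 1. In a real model each token would then be projected to a common embedding width, since tokens from finer levels carry more values than tokens from coarser ones under this scheme.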

Country of Origin
🇰🇷 South Korea

Page Count
11 pages

Category
Computer Science:
CV and Pattern Recognition