EfficienT-HDR: An Efficient Transformer-Based Framework via Multi-Exposure Fusion for HDR Reconstruction
By: Yu-Shen Huang, Tzu-Han Chen, Cheng-Yen Hsiao, and more
Potential Business Impact:
Helps phone cameras capture scenes with both very bright and very dark areas more clearly.
Achieving high-quality High Dynamic Range (HDR) imaging on resource-constrained edge devices is a critical challenge in computer vision, as its performance directly impacts downstream tasks such as intelligent surveillance and autonomous driving. Multi-Exposure Fusion (MEF) is a mainstream technique for this goal; however, existing methods generally face the dual bottlenecks of high computational cost and ghosting artifacts, hindering widespread deployment. To overcome these limitations, this study proposes a lightweight Vision Transformer architecture designed explicitly for HDR reconstruction. The architecture builds on the Context-Aware Vision Transformer and begins by converting input images to the YCbCr color space to separate luminance and chrominance information. It then employs an Intersection-Aware Adaptive Fusion (IAAF) module to suppress ghosting effectively. To further achieve a lightweight design, we introduce Inverted Residual Embedding (IRE) and Dynamic Tanh (DyT), and propose an Enhanced Multi-Scale Dilated Convolution (E-MSDC) to reduce computational complexity at multiple levels. Our study ultimately contributes two model versions: a main version for high visual quality and a lightweight version with advantages in computational efficiency, both of which achieve an excellent balance between performance and image quality. Experimental results demonstrate that, compared to the baseline, the main version reduces FLOPs by approximately 67% and increases inference speed by more than fivefold on a CPU and by 2.5 times on an edge device. These results confirm that our method provides an efficient, ghost-free HDR imaging solution for edge devices, demonstrating versatility and practicality across various dynamic scenarios.
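To make the described pipeline more concrete, below is a minimal PyTorch sketch of three ingredients named in the abstract: the RGB-to-YCbCr split, Dynamic Tanh (DyT) as a normalization-free token scaling, and a multi-scale dilated convolution block standing in for E-MSDC. The abstract does not give these layer definitions, so the dilation rates, parameter shapes, and wiring here are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch only; IAAF, IRE, and E-MSDC details are assumptions.
import torch
import torch.nn as nn


def rgb_to_ycbcr(rgb: torch.Tensor) -> torch.Tensor:
    """Convert an (N, 3, H, W) RGB tensor in [0, 1] to YCbCr (BT.601)."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 0.5
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 0.5
    return torch.cat([y, cb, cr], dim=1)


class DyT(nn.Module):
    """Dynamic Tanh: gamma * tanh(alpha * x) + beta, a learnable,
    normalization-free scaling applied to token features."""

    def __init__(self, channels: int, init_alpha: float = 0.5):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(init_alpha))
        self.gamma = nn.Parameter(torch.ones(channels))
        self.beta = nn.Parameter(torch.zeros(channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, tokens, channels); broadcast over the channel dimension
        return self.gamma * torch.tanh(self.alpha * x) + self.beta


class MultiScaleDilatedConv(nn.Module):
    """Parallel depthwise dilated convolutions at several rates, fused by a
    pointwise convolution (an assumed stand-in for the paper's E-MSDC)."""

    def __init__(self, channels: int, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d,
                      groups=channels, bias=False)
            for d in dilations
        ])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))


if __name__ == "__main__":
    frames = torch.rand(2, 3, 64, 64)         # a batch of LDR exposures
    ycbcr = rgb_to_ycbcr(frames)              # luminance/chrominance split
    feats = MultiScaleDilatedConv(3)(ycbcr)   # multi-scale local context
    tokens = feats.flatten(2).transpose(1, 2) # (N, H*W, C) token layout
    out = DyT(3)(tokens)                      # normalization-free scaling
    print(out.shape)                          # torch.Size([2, 4096, 3])
```

In this sketch, DyT replaces a per-token normalization layer with a cheap elementwise operation, and the depthwise dilated branches enlarge the receptive field without adding many FLOPs, which is consistent with the efficiency goals stated above.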
Similar Papers
Towards Unified Semantic and Controllable Image Fusion: A Diffusion Transformer Approach
CV and Pattern Recognition
Combines pictures using words to make better images.
MdaIF: Robust One-Stop Multi-Degradation-Aware Image Fusion with Language-Driven Semantics
CV and Pattern Recognition
Cleans up blurry pictures from bad weather.
Retinex-MEF: Retinex-based Glare Effects Aware Unsupervised Multi-Exposure Image Fusion
CV and Pattern Recognition
Fixes photos with too much or too little light.