Score: 2

AWM-Fuse: Multi-Modality Image Fusion for Adverse Weather via Global and Local Text Perception

Published: August 23, 2025 | arXiv ID: 2508.16881v1

By: Xilai Li, Huichun Liu, Xiaosong Li, and more

Potential Business Impact:

Uses text descriptions and AI to restore clear images from photos degraded by fog, rain, and other adverse weather.

Business Areas:
Image Recognition, Data and Analytics, Software

Multi-modality image fusion (MMIF) in adverse weather aims to address the loss of visual information caused by weather-related degradations, providing clearer scene representations. Although some studies have attempted to incorporate textual information to improve semantic perception, they often lack effective categorization and thorough analysis of the textual content. In response, we propose AWM-Fuse, a novel fusion method for adverse weather conditions, designed to handle multiple degradations through global and local text perception within a unified, shared-weight architecture. In particular, a global feature perception module leverages BLIP-produced captions to extract overall scene features and identify the primary degradation types, thus promoting generalization across various adverse weather conditions. Complementing this, the local module employs detailed scene descriptions produced by ChatGPT to concentrate on specific degradation effects through concrete textual cues, thereby capturing finer details. Furthermore, textual descriptions are used to constrain the generation of fusion images, effectively steering the network learning process toward better alignment with real semantic labels and thereby promoting the learning of more meaningful visual features. Extensive experiments demonstrate that AWM-Fuse outperforms current state-of-the-art methods in complex weather conditions and downstream tasks. Our code is available at https://github.com/Feecuin/AWM-Fuse.
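As a rough illustration of the dual text-perception design the abstract describes, here is a minimal PyTorch sketch (not the authors' implementation): a shared-weight encoder fuses two modalities, a global cross-attention branch injects a scene-level caption embedding (as BLIP would provide), and a local branch injects a fine-grained description embedding (as a ChatGPT description passed through a text encoder would provide). All module names, dimensions, and the alignment loss are illustrative assumptions; dummy tensors stand in for the real text embeddings.

```python
# Minimal sketch of text-guided multi-modality fusion (illustrative only;
# module names and dimensions are assumptions, not the paper's code).
import torch
import torch.nn as nn

class TextGuidedCrossAttention(nn.Module):
    """Injects a text embedding into image tokens via cross-attention."""
    def __init__(self, dim: int, text_dim: int, heads: int = 4):
        super().__init__()
        self.to_kv = nn.Linear(text_dim, dim * 2)  # text -> keys/values
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, text_emb):
        # img_tokens: (B, N, dim); text_emb: (B, T, text_dim)
        k, v = self.to_kv(text_emb).chunk(2, dim=-1)
        out, _ = self.attn(img_tokens, k, v)
        return self.norm(img_tokens + out)  # residual + norm

class AWMFuseSketch(nn.Module):
    """Shared-weight fusion: a global text branch steers degradation-level
    cues, a local text branch refines fine details (illustrative only)."""
    def __init__(self, dim: int = 64, text_dim: int = 512):
        super().__init__()
        self.encoder = nn.Conv2d(3, dim, 3, padding=1)  # shared across modalities
        self.global_branch = TextGuidedCrossAttention(dim, text_dim)
        self.local_branch = TextGuidedCrossAttention(dim, text_dim)
        self.decoder = nn.Conv2d(dim, 3, 3, padding=1)

    def forward(self, vis, ir, global_text, local_text):
        B, _, H, W = vis.shape
        feats = self.encoder(vis) + self.encoder(ir)      # shared weights
        tokens = feats.flatten(2).transpose(1, 2)         # (B, H*W, dim)
        tokens = self.global_branch(tokens, global_text)  # scene/degradation cues
        tokens = self.local_branch(tokens, local_text)    # fine-grained cues
        feats = tokens.transpose(1, 2).reshape(B, -1, H, W)
        return torch.sigmoid(self.decoder(feats))         # fused image in [0, 1]

def text_alignment_loss(img_emb, txt_emb):
    """Stand-in for the paper's text constraint: pull the fused image's
    embedding toward the caption embedding (1 - cosine similarity)."""
    img_emb = nn.functional.normalize(img_emb, dim=-1)
    txt_emb = nn.functional.normalize(txt_emb, dim=-1)
    return 1.0 - (img_emb * txt_emb).sum(-1).mean()

# Smoke test with dummy tensors standing in for BLIP/ChatGPT embeddings.
model = AWMFuseSketch()
vis, ir = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
g_txt, l_txt = torch.rand(1, 1, 512), torch.rand(1, 8, 512)
print(model(vis, ir, g_txt, l_txt).shape)  # torch.Size([1, 3, 64, 64])
```

The shared-weight encoder mirrors the paper's claim that one architecture handles multiple degradations; in the actual method the text embeddings would come from frozen pretrained encoders rather than random tensors.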

Country of Origin
🇨🇳 China

Repos / Data Links
https://github.com/Feecuin/AWM-Fuse

Page Count
11 pages

Category
Computer Science:
Computer Vision and Pattern Recognition