MMIF-AMIN: Adaptive Loss-Driven Multi-Scale Invertible Dense Network for Multimodal Medical Image Fusion
By: Tao Luo, Weihua Xu
Potential Business Impact:
Combines medical images from different modalities for clearer diagnoses.
Multimodal medical image fusion (MMIF) aims to integrate images from different modalities into a single comprehensive image that enhances medical diagnosis by accurately depicting organ structures, tissue textures, and metabolic information. Capturing both the unique and the complementary information across multiple modalities simultaneously is a key research challenge in MMIF. To address this challenge, this paper proposes a novel image fusion method, MMIF-AMIN, whose architecture can effectively extract both kinds of features. Specifically, an Invertible Dense Network (IDN) is employed for lossless feature extraction from each individual modality. To extract complementary information between modalities, a Multi-scale Complementary Feature Extraction Module (MCFEM) is designed, incorporating a hybrid attention mechanism, convolutional layers of varying kernel sizes, and Transformers. An adaptive loss function is introduced to guide model learning, addressing the limitations of traditional manually designed loss functions and enabling deeper mining of the data. Extensive experiments demonstrate that MMIF-AMIN outperforms nine state-of-the-art MMIF methods, delivering superior results in both quantitative and qualitative analyses. Ablation experiments confirm the effectiveness of each component of the proposed method. Extending MMIF-AMIN to other image fusion tasks also yields promising performance.
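The "lossless feature extraction" claim for the IDN rests on the general property of invertible networks: the forward transform can be undone exactly, so no modality information is discarded. A minimal sketch of that property, using a standard additive-coupling block as a stand-in (the toy function `f` and all names here are illustrative assumptions, not the paper's actual IDN architecture):

```python
import numpy as np

# Sketch of an invertible additive-coupling block, illustrating the
# "lossless" property behind invertible feature extractors such as the
# paper's IDN. The map f is a toy stand-in for a learned sub-network;
# invertibility of the block does not depend on what f computes.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))  # toy weights for the coupling function

def f(x):
    # Arbitrary (even non-invertible) nonlinear map.
    return np.tanh(x @ W)

def forward(x1, x2):
    # Additive coupling: one half passes through, the other mixes in f(x1).
    return x1, x2 + f(x1)

def inverse(y1, y2):
    # Exact inverse: subtract the same f(y1), so no information is lost.
    return y1, y2 - f(y1)

x1 = rng.standard_normal((2, 4))
x2 = rng.standard_normal((2, 4))
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)
print(np.allclose(x1, r1) and np.allclose(x2, r2))  # exact reconstruction
```

Because the inverse recomputes `f` on the unchanged half, reconstruction is exact rather than approximate, which is what distinguishes invertible extractors from ordinary encoders.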
Similar Papers
Task-Generalized Adaptive Cross-Domain Learning for Multimodal Image Fusion
CV and Pattern Recognition
Combines images of different types into one clearer image.
Spatial-Frequency Enhanced Mamba for Multi-Modal Image Fusion
CV and Pattern Recognition
Makes images clearer by combining different kinds.
Leveraging Pre-Trained Models for Multimodal Class-Incremental Learning under Adaptive Fusion
Machine Learning (CS)
Teaches AI to learn from sight, sound, and words.