Multimodal Feature Fusion Network with Text Difference Enhancement for Remote Sensing Change Detection
By: Yijun Zhou, Yikui Zhai, Zilu Ying, and more
Potential Business Impact:
Finds changes in pictures using words too.
Although deep learning has advanced remote sensing change detection (RSCD), most methods rely solely on the image modality, limiting feature representation, change-pattern modeling, and generalization, especially under illumination and noise disturbances. To address this, we propose MMChange, a multimodal RSCD method that combines image and text modalities to enhance accuracy and robustness. An Image Feature Refinement (IFR) module is introduced to highlight key regions and suppress environmental noise. To overcome the semantic limitations of image features, we employ a vision-language model (VLM) to generate semantic descriptions of bitemporal images. A Textual Difference Enhancement (TDE) module then captures fine-grained semantic shifts, guiding the model toward meaningful changes. To bridge the heterogeneity between modalities, we design an Image-Text Feature Fusion (ITFF) module that enables deep cross-modal integration. Extensive experiments on LEVIR-CD, WHU-CD, and SYSU-CD demonstrate that MMChange consistently surpasses state-of-the-art methods across multiple metrics, validating its effectiveness for multimodal RSCD. Code is available at: https://github.com/yikuizhai/MMChange.
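The core idea, per the abstract, is that a semantic shift between the two VLM-generated text descriptions can steer the image-difference features toward meaningful changes. The sketch below is a minimal, stdlib-only illustration of that gating idea (it is not the authors' ITFF/TDE implementation; the embedding sizes, the cosine-based shift measure, and the sigmoid gate are all assumptions for illustration):

```python
# Conceptual sketch, NOT the MMChange code: gate image-difference
# features by the semantic shift between two text embeddings, in the
# spirit of the paper's TDE + ITFF modules. All specifics are assumed.
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fuse(img_diff, text_t1, text_t2):
    """Scale image-difference features by how much the bitemporal
    text descriptions disagree (large shift -> gate near 1)."""
    shift = 1.0 - cosine(text_t1, text_t2)   # 0 = same caption, up to 2
    gate = sigmoid(4.0 * (shift - 0.5))      # squash to (0, 1)
    return [gate * d for d in img_diff]
```

With identical captions the gate suppresses the image differences (likely noise); with divergent captions it passes them through, which is the qualitative behavior the abstract attributes to text-difference guidance.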
Similar Papers
Referring Change Detection in Remote Sensing Imagery
CV and Pattern Recognition
Finds specific changes in pictures using words.
MGCR-Net: Multimodal Graph-Conditioned Vision-Language Reconstruction Network for Remote Sensing Change Detection
Image and Video Processing
Spots land changes faster from satellite pictures.
IRDFusion: Iterative Relation-Map Difference guided Feature Fusion for Multispectral Object Detection
CV and Pattern Recognition
Helps cameras see better in fog and darkness.