
ProMSC-MIS: Prompt-based Multimodal Semantic Communication for Multi-Spectral Image Segmentation

Published: August 27, 2025 | arXiv ID: 2508.20057v1

By: Haoshuo Zhang, Yufei Bo, Meixia Tao

Potential Business Impact:

Lets cameras see better with less data.

Business Areas:

Semantic Search, Internet Services

Multimodal semantic communication has great potential to enhance downstream task performance by integrating complementary information across modalities. This paper introduces ProMSC-MIS, a novel Prompt-based Multimodal Semantic Communication framework for Multi-Spectral Image Segmentation. It enables efficient task-oriented transmission of spatially aligned RGB and thermal images over band-limited channels. The framework has two main design novelties. First, by leveraging prompt learning and contrastive learning, unimodal semantic encoders are pre-trained to learn diverse and complementary semantic representations, using features from one modality as prompts for the other. Second, a semantic fusion module that combines a cross-attention mechanism with squeeze-and-excitation (SE) networks is designed to effectively fuse cross-modal features. Experimental results demonstrate that ProMSC-MIS substantially outperforms conventional image transmission combined with state-of-the-art segmentation methods. Notably, it reduces the required channel bandwidth by 50% to 70% at the same segmentation performance, while also decreasing the storage overhead and computational complexity by 26% and 37%, respectively. Ablation studies also validate the effectiveness of the proposed pre-training and semantic fusion strategies. The scheme is highly suitable for applications such as autonomous driving and nighttime surveillance.
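The abstract names cross-attention and squeeze-and-excitation as the two ingredients of the semantic fusion module but gives no layer-level detail. The following is a minimal sketch of how the two could be combined, not the authors' implementation. It assumes identity query/key/value projections, channel-wise concatenation of RGB tokens with their thermal context, and randomly initialized SE weights; all function names, sizes, and weights here are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query token attends over the other modality.

    Assumption: learned Q/K/V projections are omitted (identity) to keep the sketch short.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        attended.append([
            sum(w * kv[c] for w, kv in zip(weights, keys_values))
            for c in range(len(q))
        ])
    return attended


def squeeze_excite(tokens, w_down, w_up):
    """SE gating: average over tokens, bottleneck MLP, sigmoid gate per channel."""
    n = len(tokens)
    channels = len(tokens[0])
    squeezed = [sum(t[c] for t in tokens) / n for c in range(channels)]
    hidden = [max(0.0, h) for h in matvec(w_down, squeezed)]  # ReLU
    gates = [1.0 / (1.0 + math.exp(-g)) for g in matvec(w_up, hidden)]  # sigmoid
    gated = [[g * x for g, x in zip(gates, t)] for t in tokens]
    return gated, gates


def fuse(rgb_tokens, thermal_tokens, w_down, w_up):
    """Concatenate RGB tokens with thermal context, then reweight channels with SE.

    Assumption: concatenation is one plausible way to join the two streams.
    """
    thermal_context = cross_attention(rgb_tokens, thermal_tokens)
    concatenated = [r + t for r, t in zip(rgb_tokens, thermal_context)]
    return squeeze_excite(concatenated, w_down, w_up)


def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for learned weights or encoder features."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
NUM_TOKENS, CHANNELS, REDUCTION = 6, 4, 2  # illustrative sizes only
rgb = random_matrix(rng, NUM_TOKENS, CHANNELS)
thermal = random_matrix(rng, NUM_TOKENS, CHANNELS)
w_down = random_matrix(rng, (2 * CHANNELS) // REDUCTION, 2 * CHANNELS)
w_up = random_matrix(rng, 2 * CHANNELS, (2 * CHANNELS) // REDUCTION)

fused, gates = fuse(rgb, thermal, w_down, w_up)
print(len(fused), len(fused[0]))  # token count, fused channel count
```

In this sketch the RGB tokens act as queries so that each spatial location pulls in the thermal features most similar to it, and the SE gate then decides per channel how much of each modality to keep. A trained system would use learned projections and weights in place of the identity and random values assumed here.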

Prompt-based Multimodal Semantic Communication for Multi-spectral Image Segmentation

Image and Video Processing

Boosts scene splitting from multi-light images for safe driving

25 Aug 2025

90%

Perception-Enhanced Multitask Multimodal Semantic Communication for UAV-Assisted Integrated Sensing and Communication System

Information Theory

Drones share pictures and maps better for emergencies.

25 Mar 2025

90%

Multi-Modal Semantic Communication

Machine Learning (CS)

Lets computers understand pictures from your words.

17 Dec 2025


Country of Origin

🇨🇳 China

Page Count

11 pages
