
Multimodal Prototype Alignment for Semi-supervised Pathology Image Segmentation

Published: August 27, 2025 | arXiv ID: 2508.19574v1

By: Mingxi Fu, Fanglei Fu, Xitong Ling and more

Potential Business Impact:

Helps doctors find sickness in body pictures.

Business Areas:

Image Recognition, Data and Analytics, Software

Pathological image segmentation faces numerous challenges, particularly ambiguous semantic boundaries and the high cost of pixel-level annotations. Although recent semi-supervised methods based on consistency regularization (e.g., UniMatch) have made notable progress, they mainly rely on perturbation-based consistency within the image modality, which makes it difficult to capture high-level semantic priors, especially in structurally complex pathology images. To address these limitations, we propose MPAMatch, a novel segmentation framework that performs pixel-level contrastive learning under a multimodal prototype-guided supervision paradigm. The core innovation of MPAMatch is a dual contrastive learning scheme, between image prototypes and pixel labels and between text prototypes and pixel labels, that provides supervision at both the structural and semantic levels. This coarse-to-fine supervisory strategy not only enhances the discriminative capability on unlabeled samples but also introduces text prototype supervision into segmentation for the first time, significantly improving semantic boundary modeling. In addition, we rebuild the classic segmentation architecture TransUNet by replacing its ViT backbone with a pathology-pretrained foundation model (Uni), enabling more effective extraction of pathology-relevant features. Extensive experiments on GLAS, EBHI-SEG-GLAND, EBHI-SEG-CANCER, and KPI show MPAMatch's superiority over state-of-the-art methods, validating its dual advantages in structural and semantic modeling.
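
The dual contrastive scheme described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the loss formula, so the sketch assumes an InfoNCE-style cross-entropy over cosine similarities between pixel embeddings and class prototypes, assumes image prototypes are per-class means of pixel embeddings, and treats text prototypes as precomputed vectors (in practice they would presumably come from a text encoder). All function names, the temperature value, and the equal weighting of the two terms are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def class_mean_prototypes(features, labels, num_classes):
    """Assumed image prototypes: the mean pixel embedding of each class."""
    prototypes = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            prototypes[c] = features[mask].mean(axis=0)
    return prototypes


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    """InfoNCE-style loss (an assumption): pull each pixel toward the prototype
    of its (pseudo-)label and push it away from the other prototypes."""
    logits = l2_normalize(features) @ l2_normalize(prototypes).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(labels)), labels].mean())


def dual_prototype_loss(features, labels, text_prototypes, temperature=0.1):
    """Sum of the image-prototype and text-prototype terms (equal weights assumed)."""
    num_classes = text_prototypes.shape[0]
    image_prototypes = class_mean_prototypes(features, labels, num_classes)
    loss_img = prototype_contrastive_loss(features, labels, image_prototypes, temperature)
    loss_txt = prototype_contrastive_loss(features, labels, text_prototypes, temperature)
    return loss_img + loss_txt


# Toy data: four pixel embeddings in 2-D, two classes.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
aligned_text = np.array([[1.0, 0.0], [0.0, 1.0]])  # text prototypes matching the classes
swapped_text = aligned_text[::-1]                  # deliberately mismatched

loss_aligned = dual_prototype_loss(features, labels, aligned_text)
loss_swapped = dual_prototype_loss(features, labels, swapped_text)
print(loss_aligned < loss_swapped)
```

In this toy setup the loss is lower when the text prototypes agree with the pixel labels than when they are swapped, which is the behavior a prototype-guided supervision signal is meant to have. For unlabeled images, `labels` would be pseudo-labels, which is again an assumption about how the method is applied.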

Pathology-Aware Prototype Evolution via LLM-Driven Semantic Disambiguation for Multicenter Diabetic Retinopathy Diagnosis

Artificial Intelligence

Helps doctors spot eye disease earlier and better.

27 Nov 2025 2

89%

Accurate and Scalable Multimodal Pathology Retrieval via Attentive Vision-Language Alignment

CV and Pattern Recognition

Finds similar medical slides using pictures or words.

27 Oct 2025 1

88%

Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration

CV and Pattern Recognition

Helps doctors find diseases in X-rays better.

12 Jun 2025 1


Page Count

9 pages
