Score: 3

DualProtoSeg: Simple and Efficient Design with Text- and Image-Guided Prototype Learning for Weakly Supervised Histopathology Image Segmentation

Published: December 11, 2025 | arXiv ID: 2512.10314v1

By: Anh M. Vu, Khang P. Le, Trang T. K. Vo, and more

Potential Business Impact:

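Helps pathologists locate diseased tissue regions in microscope images more accurately, while needing only image-level labels instead of costly pixel-level annotations.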

Business Areas:
Image Recognition, Data and Analytics, Software

Weakly supervised semantic segmentation (WSSS) in histopathology seeks to reduce annotation cost by learning from image-level labels, yet it remains limited by inter-class homogeneity, intra-class heterogeneity, and the region-shrinkage effect of CAM-based supervision. We propose a simple and effective prototype-driven framework that leverages vision-language alignment to improve region discovery under weak supervision. Our method integrates CoOp-style learnable prompt tuning to generate text-based prototypes and combines them with learnable image prototypes, forming a dual-modal prototype bank that captures both semantic and appearance cues. To address oversmoothing in ViT representations, we incorporate a multi-scale pyramid module that enhances spatial precision and improves localization quality. Experiments on the BCSS-WSSS benchmark show that our approach surpasses existing state-of-the-art methods, and detailed analyses demonstrate the benefits of text description diversity, context length, and the complementary behavior of text and image prototypes. These results highlight the effectiveness of jointly leveraging textual semantics and visual prototype learning for WSSS in digital pathology.
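The dual-modal prototype bank described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy 2-D feature vectors, and the rule of taking the maximum cosine similarity over a class's text and image prototypes are all assumptions made for illustration. In the paper, text prototypes come from CoOp-style learned prompts and image prototypes are learned during training; here both are simply hard-coded vectors, and the multi-scale pyramid module is omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def class_scores(patch, bank):
    """Score one patch feature against every class in the prototype bank.

    bank maps class name -> {"text": [vectors], "image": [vectors]}.
    Assumption: a class score is the best match over both modalities.
    """
    scores = {}
    for cls, protos in bank.items():
        candidates = protos["text"] + protos["image"]
        scores[cls] = max(cosine(patch, p) for p in candidates)
    return scores


def segment(patches, bank):
    """Assign each patch the class whose prototypes match it best."""
    labels = []
    for patch in patches:
        scores = class_scores(patch, bank)
        labels.append(max(scores, key=scores.get))
    return labels


# Hypothetical toy bank: one text and one image prototype per class.
toy_bank = {
    "tumor": {"text": [[1.0, 0.0]], "image": [[0.9, 0.1]]},
    "stroma": {"text": [[0.0, 1.0]], "image": [[0.1, 0.9]]},
}

if __name__ == "__main__":
    print(segment([[1.0, 0.1], [0.2, 1.0]], toy_bank))
```

Pooling text and image prototypes into one candidate list per class is one simple way to let the two modalities complement each other: a patch is assigned to a class if either its semantic (text) or its appearance (image) prototype matches well.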

Country of Origin
🇦🇺 🇺🇸 United States, Australia

Repos / Data Links

Page Count
13 pages

Category
Computer Science:
CV and Pattern Recognition