HER-Seg: Holistically Efficient Segmentation for High-Resolution Medical Images

Published: April 8, 2025 | arXiv ID: 2504.06205v2

By: Qing Xu, Zhenye Lou, Chenxin Li and more

Potential Business Impact:

Helps doctors see tiny details in medical scans.

Business Areas:
Image Recognition Data and Analytics, Software

High-resolution segmentation is critical for precise disease diagnosis because it extracts fine-grained morphological details. Existing hierarchical encoder-decoder frameworks have demonstrated remarkable adaptability across diverse medical segmentation tasks. However, they usually incur substantial computation and memory costs when segmenting large images, which limits their use in foundation model building and real-world clinical scenarios. To address this limitation, we propose a holistically efficient framework for high-resolution medical image segmentation, called HER-Seg. Specifically, we first devise a computation-efficient image encoder (CE-Encoder) to model long-range dependencies with linear complexity while maintaining sufficient representations. In particular, we introduce the dual-gated linear attention (DLA) mechanism to perform cascaded token filtering, selectively retaining important tokens while ignoring irrelevant ones to make attention computation more efficient. Then, we introduce a memory-efficient mask decoder (ME-Decoder) that eliminates the need for a hierarchical structure by leveraging cross-scale segmentation decoding. Extensive experiments show that HER-Seg outperforms state-of-the-art methods in high-resolution medical 2D, 3D and video segmentation tasks. In particular, HER-Seg requires only 0.59GB of training GPU memory and 9.39G inference FLOPs per 1024$\times$1024 image, demonstrating superior memory and computation efficiency. The code is available at https://github.com/xq141839/HER-Seg.
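
The abstract mentions attention with linear complexity combined with cascaded token filtering. The sketch below illustrates the general idea only and is not the authors' DLA implementation: it uses a generic kernel-based linear attention (with an assumed elu(x) + 1 feature map) and a hypothetical sigmoid gate that keeps the top-scoring tokens at each stage. All function names, weights, and the `keep_ratio` parameter are assumptions made for illustration.

```python
import numpy as np

def feature_map(x):
    # Positive feature map elu(x) + 1 (an assumed, commonly used choice).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    # q: (N, d), k: (M, d), v: (M, d_v).
    # Keys and values are summarised once, so the cost grows linearly
    # with the number of tokens instead of quadratically.
    phi_q, phi_k = feature_map(q), feature_map(k)
    kv = phi_k.T @ v              # (d, d_v) key/value summary
    z = phi_k.sum(axis=0)         # (d,) normaliser
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def filter_tokens(x, gate_w, keep_ratio):
    # Hypothetical gate: score tokens with a sigmoid, keep the top fraction,
    # and scale the kept tokens by their gate score.
    scores = 1.0 / (1.0 + np.exp(-(x @ gate_w)))
    n_keep = max(1, int(round(keep_ratio * x.shape[0])))
    keep = np.sort(np.argsort(-scores)[:n_keep])
    return x[keep] * scores[keep, None], keep

def gated_linear_attention(x, wq, wk, wv, gates, keep_ratio=0.5):
    # Queries use every token; keys/values use only tokens that survive
    # each gate in the cascade (an illustrative design, not the paper's).
    kv_tokens = x
    for gate_w in gates:
        kv_tokens, _ = filter_tokens(kv_tokens, gate_w, keep_ratio)
    return linear_attention(x @ wq, kv_tokens @ wk, kv_tokens @ wv)

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))                      # 16 tokens, 8 channels
wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
gates = [rng.normal(size=8), rng.normal(size=8)]  # two cascaded gates
out = gated_linear_attention(x, wq, wk, wv, gates)
print(out.shape)  # (16, 8)
```

In this toy setup the two gates reduce the 16 key/value tokens to 8 and then 4, while every query token still receives an output.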

Country of Origin
🇬🇧 🇨🇳 United Kingdom, China

Repos / Data Links
https://github.com/xq141839/HER-Seg

Page Count
10 pages

Category
Electrical Engineering and Systems Science:
Image and Video Processing