SAM2-UNeXT: An Improved High-Resolution Baseline for Adapting Foundation Models to Downstream Segmentation Tasks
By: Xinyu Xiong, Zihuang Wu, Lei Zhang and more
Potential Business Impact:
Helps computers pick out objects in pictures more accurately.
Recent studies have highlighted the potential of adapting the Segment Anything Model (SAM) for various downstream tasks. However, constructing a more powerful and generalizable encoder to further enhance performance remains an open challenge. In this work, we propose SAM2-UNeXT, an advanced framework that builds upon the core principles of SAM2-UNet while extending the representational capacity of SAM2 through the integration of an auxiliary DINOv2 encoder. By incorporating a dual-resolution strategy and a dense glue layer, our approach enables more accurate segmentation with a simple architecture, relaxing the need for complex decoder designs. Extensive experiments conducted on four benchmark tasks (dichotomous image segmentation, camouflaged object detection, marine animal segmentation, and remote sensing saliency detection) demonstrate the superior performance of our proposed method. The code is available at https://github.com/WZH0120/SAM2-UNeXT.
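The dual-encoder idea in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the two "encoders" below are hypothetical stand-ins (average pooling plus a random projection) for SAM2 and DINOv2, the choice of which branch sees the lower-resolution input is an assumption, and the "glue" step is simply a per-pixel linear map over concatenated features. All function names and sizes are made up for illustration; the repository linked above holds the actual model.

```python
import numpy as np

def avg_pool_encoder(image, patch, out_channels, rng):
    """Hypothetical stand-in encoder: average-pool patches, then project channels."""
    c, h, w = image.shape
    gh, gw = h // patch, w // patch
    cropped = image[:, :gh * patch, :gw * patch]
    pooled = cropped.reshape(c, gh, patch, gw, patch).mean(axis=(2, 4))
    weight = rng.standard_normal((out_channels, c))
    return np.einsum("oc,chw->ohw", weight, pooled)

def resize_nearest(feat, size):
    """Nearest-neighbour resize of a (C, H, W) feature map to size=(H2, W2)."""
    _, h, w = feat.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return feat[:, rows][:, :, cols]

def fuse_dual_resolution(image, rng):
    """Run two branches at different input resolutions and glue their features."""
    # Main branch on the full-resolution image (assumed role: SAM2-like encoder).
    main = avg_pool_encoder(image, patch=4, out_channels=8, rng=rng)
    # Auxiliary branch on a 2x-subsampled copy (assumed role: DINOv2-like encoder).
    low_res = image[:, ::2, ::2]
    aux = avg_pool_encoder(low_res, patch=8, out_channels=6, rng=rng)
    # Align the coarser auxiliary grid with the main grid, then concatenate channels.
    aux_up = resize_nearest(aux, main.shape[1:])
    stacked = np.concatenate([main, aux_up], axis=0)
    # Toy "glue": a per-pixel linear map (equivalent to a 1x1 convolution).
    glue = rng.standard_normal((8, stacked.shape[0]))
    return np.einsum("oc,chw->ohw", glue, stacked)

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 64, 64))
fused = fuse_dual_resolution(image, rng)
print(fused.shape)
```

With a 64x64 input, the main branch yields a 16x16 grid and the auxiliary branch a 4x4 grid, so the resize step is what lets the two feature maps be combined before a lightweight decoder would take over.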
Similar Papers
SAM3-UNet: Simplified Adaptation of Segment Anything Model 3
CV and Pattern Recognition
Teaches computers to find things in pictures faster.
How Universal Are SAM2 Features?
CV and Pattern Recognition
Makes AI better at seeing, but not everything.
UniUltra: Interactive Parameter-Efficient SAM2 for Universal Ultrasound Segmentation
Image and Video Processing
Helps doctors see inside bodies better with sound.