SSNet: Saliency Prior and State Space Model-based Network for Salient Object Detection in RGB-D Images
By: Gargi Panda, Soumitra Kundu, Saumik Bhattacharya and more
Potential Business Impact:
Helps robots see important things in pictures.
Salient object detection (SOD) in RGB-D images is an essential task in computer vision, enabling applications in scene understanding, robotics, and augmented reality. However, existing methods struggle to capture global dependencies across modalities, lack comprehensive saliency priors from both RGB and depth data, and handle low-quality depth maps poorly. To address these challenges, we propose SSNet, a saliency-prior and state space model (SSM)-based network for the RGB-D SOD task. Unlike existing convolution- or transformer-based approaches, SSNet introduces an SSM-based multi-modal multi-scale decoder module that captures both intra- and inter-modal global dependencies with linear complexity. Specifically, we propose a cross-modal selective scan SSM (CM-S6) mechanism, which captures global dependencies between different modalities. Furthermore, we introduce a saliency enhancement module (SEM) that integrates three saliency priors with deep features to refine feature representation and improve the localization of salient objects. To address low-quality depth maps, we propose an adaptive contrast enhancement technique that dynamically refines depth maps, making them more suitable for the RGB-D SOD task. Extensive quantitative and qualitative experiments on seven benchmark datasets demonstrate that SSNet outperforms state-of-the-art methods.
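The linear-complexity claim for SSM-based models can be illustrated with a generic discrete state space recurrence. The sketch below is not SSNet's CM-S6 mechanism, whose details the abstract does not give; it is a minimal scalar example, assuming fixed hypothetical parameters `a`, `b`, and `c`, that shows how a single pass over the sequence lets every output depend on all earlier inputs.

```python
from typing import List


def ssm_scan(xs: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)

    The parameters a, b, c are illustrative assumptions, not values
    or names taken from the paper.
    """
    h = 0.0  # hidden state carries information from all earlier inputs
    ys: List[float] = []
    for x in xs:  # one constant-cost step per element: O(n) overall
        h = a * h + b * x
        ys.append(c * h)
    return ys


if __name__ == "__main__":
    # An impulse at the first position decays through the state over time.
    print(ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0))
```

Each step costs a constant amount of work, so the scan grows linearly with sequence length, in contrast to the pairwise comparisons of standard self-attention, which grow quadratically. Selective scan variants make the parameters depend on the input; that extension is omitted here.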
Similar Papers
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective
CV and Pattern Recognition
Makes computers see important things faster.
LEAF-Mamba: Local Emphatic and Adaptive Fusion State Space Model for RGB-D Salient Object Detection
CV and Pattern Recognition
Helps computers find important things in pictures.
DualGazeNet: A Biologically Inspired Dual-Gaze Query Network for Salient Object Detection
CV and Pattern Recognition
Finds important things in pictures faster.