S3OD: Towards Generalizable Salient Object Detection with Synthetic Data
By: Orest Kupyn, Hirokatsu Kataoka, Christian Rupprecht
Potential Business Impact:
Finds important things in pictures better.
Salient object detection exemplifies data-bounded tasks where expensive pixel-precise annotations force separate model training for related subtasks like dichotomous image segmentation (DIS) and high-resolution salient object detection (HR-SOD). We present a method that dramatically improves generalization through large-scale synthetic data generation and an ambiguity-aware architecture. We introduce S3OD, a dataset of over 139,000 high-resolution images created through our multi-modal diffusion pipeline that extracts labels from diffusion and DINO-v3 features. The iterative generation framework prioritizes challenging categories based on model performance. We propose a streamlined multi-mask decoder that naturally handles the inherent ambiguity in salient object detection by predicting multiple valid interpretations. Models trained solely on synthetic data achieve 20-50% error reduction in cross-dataset generalization, while fine-tuned versions reach state-of-the-art performance across DIS and HR-SOD benchmarks.
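The abstract does not say how the iterative framework turns model performance into a generation priority. As an illustration only, one simple way to do this is to sample categories in proportion to their current error, so that harder categories receive more synthetic images in the next round. The sketch below assumes this; the function names, the temperature parameter, and the per-category error values are hypothetical and are not taken from the paper.

```python
import random


def category_weights(errors, temperature=1.0, floor=1e-6):
    """Turn per-category error scores into sampling probabilities.

    Assumption: a higher error means the category is harder and should be
    generated more often. A temperature above 1 flattens the distribution.
    """
    raw = {c: max(e, floor) ** (1.0 / temperature) for c, e in errors.items()}
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}


def sample_categories(errors, n, seed=0, temperature=1.0):
    """Draw n category names for the next generation round."""
    weights = category_weights(errors, temperature)
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[c] for c in names], k=n)


# Hypothetical per-category validation errors (illustrative values only).
errors = {"bicycle": 0.12, "jellyfish": 0.04, "chandelier": 0.24}
weights = category_weights(errors)
batch = sample_categories(errors, n=1000)
```

With these illustrative values, "chandelier" gets the largest share of the next batch and "jellyfish" the smallest. Raising the temperature would move the shares closer to uniform.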