Towards Agnostic and Holistic Universal Image Segmentation with Bit Diffusion
By: Jakob Lønborg Christensen, Morten Rieger Hannemose, Anders Bjorholm Dahl and more
Potential Business Impact:
Lets computers outline and label everything in a picture while accounting for ambiguity.
This paper introduces a diffusion-based framework for universal image segmentation, making agnostic segmentation possible without depending on mask-based frameworks and instead predicting the full segmentation in a holistic manner. We present several key adaptations to diffusion models, which are important in this discrete setting. Notably, we show that a location-aware palette with our 2D gray code ordering improves performance. Adding a final tanh activation function is crucial for discrete data. On optimizing diffusion parameters, the sigmoid loss weighting consistently outperforms alternatives, regardless of the prediction type used, and we settle on x-prediction. While our current model does not yet surpass leading mask-based architectures, it narrows the performance gap and introduces unique capabilities, such as principled ambiguity modeling, that these models lack. All models were trained from scratch, and we believe that combining our proposed improvements with large-scale pretraining or promptable conditioning could lead to competitive models.
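To make the discrete setting concrete, here is a minimal sketch of the analog-bits encoding that Bit Diffusion builds on: each integer label is written in binary and its bits are mapped to -1 or +1, so a continuous diffusion model can regress them, and predictions are decoded by thresholding at zero. The function names (`to_gray`, `from_gray`, `encode_label`, `decode_label`) and the 8-bit width are assumptions made for illustration. The Gray code shown is the standard one-dimensional binary-reflected code, not the paper's location-aware palette or its 2D gray code ordering, which the abstract does not specify.

```python
def to_gray(n: int) -> int:
    """Standard binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Invert to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def encode_label(label: int, num_bits: int = 8, use_gray: bool = True) -> list[float]:
    """Map an integer label to analog bits in {-1.0, +1.0}, most significant bit first.

    num_bits=8 is an illustrative choice, not a value taken from the paper.
    """
    if not 0 <= label < (1 << num_bits):
        raise ValueError(f"label {label} does not fit in {num_bits} bits")
    code = to_gray(label) if use_gray else label
    return [1.0 if (code >> i) & 1 else -1.0 for i in reversed(range(num_bits))]


def decode_label(analog_bits: list[float], use_gray: bool = True) -> int:
    """Threshold real-valued predictions at 0 and rebuild the integer label."""
    code = 0
    for value in analog_bits:
        code = (code << 1) | (1 if value > 0 else 0)
    return from_gray(code) if use_gray else code


if __name__ == "__main__":
    bits = encode_label(5)
    # A prediction that is shrunk toward zero still decodes to the same label.
    shrunk = [0.3 * b for b in bits]
    print(bits, decode_label(shrunk))
```

Because the targets are -1 or +1, a tanh output layer bounds the network's predictions to the same range, which is one plausible reading of why the abstract calls the final tanh crucial for discrete data. With a Gray code, consecutive integer labels differ in exactly one bit, so a single bit error near a code boundary tends to land on a neighbouring label rather than a distant one.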
Similar Papers
Diffusion Based Ambiguous Image Segmentation
CV and Pattern Recognition
Helps doctors see all possible lung tumor shapes.
Diffusion Model in Latent Space for Medical Image Segmentation Task
CV and Pattern Recognition
Helps doctors see uncertain details in medical scans.
Generative Image Coding with Diffusion Prior
CV and Pattern Recognition
Makes pictures look good even when squeezed small.