Synthetic Data Augmentation using Pre-trained Diffusion Models for Long-tailed Food Image Classification
By: GaYeon Koh, Hyun-Jic Oh, Jeonghyun Noh and more
Potential Business Impact:
Helps computers recognize rare foods better.
Deep learning-based food image classification enables precise identification of food categories, which in turn supports accurate nutritional analysis. However, real-world food images often follow a skewed distribution, with some food types far more prevalent than others. This class imbalance causes models to favor the majority (head) classes and degrades performance on the less common (tail) classes. Recently, synthetic data augmentation using diffusion-based generative models has emerged as a promising solution to this issue. By generating high-quality synthetic images, these models can help make the data distribution more uniform, potentially improving classification performance. However, existing approaches face challenges: fine-tuning-based methods need a uniformly distributed dataset, while approaches based on pre-trained models often overlook inter-class separation in the synthetic data. In this paper, we propose a two-stage synthetic data augmentation framework that leverages pre-trained diffusion models for long-tailed food classification. In the first stage, we generate a reference set conditioned on a positive prompt for the generation target and then select a class that shares similar features with the generation target to serve as a negative prompt. In the second stage, we generate a synthetic augmentation set under both the positive and negative prompt conditions, using a combined sampling strategy that promotes intra-class diversity and inter-class separation. We demonstrate the efficacy of the proposed method on two long-tailed food benchmark datasets, where it achieves higher top-1 accuracy than previous works.
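The negative-class selection step and the idea of evening out the class distribution can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that image features are already available as plain lists of floats (for example from some pre-trained encoder), that "shares similar features" is measured by cosine similarity between mean feature vectors, and that balancing means filling every class up to the size of the largest one. The function names `select_negative_class` and `synthetic_counts` and the toy class names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_negative_class(target, reference_features, class_features):
    """Pick the class (other than the target) whose mean feature is closest
    to the mean feature of the target's reference set.

    Assumption: cosine similarity of mean features stands in for the
    paper's notion of "similar features"; the abstract does not specify it.
    """
    ref_mean = mean_vector(reference_features)
    best_name, best_sim = None, -math.inf
    for name, feats in class_features.items():
        if name == target:
            continue  # a class cannot be its own negative prompt
        sim = cosine(ref_mean, mean_vector(feats))
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name


def synthetic_counts(class_counts):
    """Number of synthetic images per class needed to match the largest class.

    Assumption: "uniform" means every class reaches the head-class count.
    """
    top = max(class_counts.values())
    return {name: top - n for name, n in class_counts.items()}


# Toy 2-D features; real features would come from an image encoder.
reference = [[1.0, 0.1], [0.9, 0.2]]  # reference set generated for "ramen"
features = {
    "ramen": [[1.0, 0.0]],
    "pho": [[0.9, 0.3]],
    "salad": [[0.0, 1.0]],
}
negative = select_negative_class("ramen", reference, features)
needed = synthetic_counts({"pizza": 500, "ramen": 40, "pho": 5})
```

In this toy setup the visually closest class would be used as the negative prompt for the target, so that the generated images are pushed away from the class they are most likely to be confused with, while `needed` gives the per-class generation budget under the stated balancing assumption.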
Similar Papers
Boosting Statistic Learning with Synthetic Data from Pretrained Large Models
Machine Learning (Stat)
Makes computer models learn better with fake data.
Efficient Long-Tail Learning in Latent Space by sampling Synthetic Data
Machine Learning (CS)
Makes computer learning fair for rare things.
Context-guided Responsible Data Augmentation with Diffusion Models
CV and Pattern Recognition
Makes AI better at recognizing pictures by adding fake ones.