Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory
By: Mingzhuo Li, Guang Li, Jiafeng Mao, and more
Potential Business Impact:
Makes AI learn faster with smaller, better data.
Dataset distillation compresses large datasets into small, representative ones, enabling deep neural networks to be trained in significantly less time with comparable performance. Although the introduction of generative models has brought great progress to this field, the distributions of their distilled datasets are not diverse enough to represent the original ones, leading to reduced downstream validation accuracy. In this paper, we present a diversity-driven generative dataset distillation method based on a diffusion model to address this problem. We introduce a self-adaptive memory that assesses representativeness by aligning the distribution of the distilled dataset with that of the real dataset. The degree of alignment then guides the diffusion model to generate more diverse samples during the distillation process. Extensive experiments show that our method outperforms existing state-of-the-art methods in most settings, demonstrating its effectiveness on dataset distillation tasks.
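The abstract describes the method only at a high level; the sketch below is one plausible reading of it, not the authors' implementation. It assumes the self-adaptive memory is a feature bank of previously distilled samples, that distribution alignment is measured with an MMD-style statistic, and that the alignment score weights a diversity-oriented objective during generation. All names here (SelfAdaptiveMemory, mmd, generate, encode) are illustrative assumptions.

```python
import torch


def mmd(x, y, sigma=1.0):
    """RBF-kernel maximum mean discrepancy: one plausible way to score
    how closely distilled features match real-data features."""
    def rbf(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))
    return rbf(x, x).mean() + rbf(y, y).mean() - 2 * rbf(x, y).mean()


class SelfAdaptiveMemory:
    """Hypothetical memory bank of distilled-sample features.

    It tracks how well the distilled set so far covers the real
    distribution and converts the mismatch into a diversity weight
    for the next generation step."""

    def __init__(self, feat_dim, capacity=256):
        self.bank = torch.empty(0, feat_dim)
        self.capacity = capacity

    def update(self, feats):
        # Keep only the most recent `capacity` feature vectors.
        self.bank = torch.cat([self.bank, feats.detach()], dim=0)[-self.capacity:]

    def diversity_weight(self, real_feats):
        if len(self.bank) == 0:
            return torch.tensor(1.0)
        # Larger mismatch with the real distribution -> push harder for diversity.
        return mmd(self.bank, real_feats).clamp(min=0.0)


def distill_step(generate, encode, real_images, memory):
    """One illustrative distillation step: generate candidate samples with the
    diffusion model, score their alignment with real data, and return an
    alignment-weighted loss that favors diverse coverage."""
    candidates = generate()                      # images from the diffusion model
    cand_feats = encode(candidates)              # e.g. a frozen feature extractor
    real_feats = encode(real_images)
    weight = memory.diversity_weight(real_feats)
    loss = weight * mmd(cand_feats, real_feats)  # alignment-guided objective
    memory.update(cand_feats)
    return candidates, loss
```

In this reading, a poorly covered region of the real distribution raises the alignment penalty, which in turn steers subsequent diffusion sampling toward more diverse distilled images; the paper's actual memory update and guidance rules may differ.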
Similar Papers
Generative Dataset Distillation Based on Self-knowledge Distillation
CV and Pattern Recognition
Makes computer learning faster and better.
Task-Specific Generative Dataset Distillation with Difficulty-Guided Sampling
CV and Pattern Recognition
Makes AI learn better with less data.
Efficient Multimodal Dataset Distillation via Generative Models
CV and Pattern Recognition
Makes AI learn from pictures and words faster.