Self-Consuming Generative Models with Adversarially Curated Data
By: Xiukun Wei, Xueru Zhang
Potential Business Impact:
Shows how adversaries can steer AI models toward learning the wrong user preferences from manipulated training data.
Recent advances in generative models have made it increasingly difficult to distinguish real data from model-generated synthetic data. Using synthetic data for successive training of future model generations creates "self-consuming loops", which may lead to model collapse or training instability. Furthermore, synthetic data is often subject to human feedback and curated by users based on their preferences. Ferbach et al. (2024) recently showed that when data is curated according to user preferences, the self-consuming retraining loop drives the model to converge toward a distribution that optimizes those preferences. However, in practice, data curation is often noisy or adversarially manipulated. For example, competing platforms may recruit malicious users to adversarially curate data and disrupt rival models. In this paper, we study how generative models evolve under self-consuming retraining loops with noisy and adversarially curated data. We theoretically analyze the impact of such noisy data curation on generative models and identify conditions for the robustness of the retraining process. Building on this analysis, we design attack algorithms for competitive adversarial scenarios, where a platform with a limited budget employs malicious users to misalign a rival's model from actual user preferences. Experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed algorithms.
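The abstract describes a self-consuming retraining loop in which model-generated samples are curated by (possibly noisy or adversarial) pairwise user preferences before being used for retraining. The toy simulation below illustrates this dynamic with a 1-D Gaussian model; the preferred value, noise rate, and curation rule are illustrative assumptions, not the paper's actual setup or attack algorithm.

```python
import random
import statistics

random.seed(0)

PREFERRED = 2.0  # hypothetical value users prefer (assumption)
NOISE = 0.3      # fraction of curation choices flipped by malicious users (assumption)

def curate(a, b, noise):
    """Pairwise curation: keep the sample closer to the preferred value,
    but an adversary flips the choice with probability `noise`."""
    better = a if abs(a - PREFERRED) < abs(b - PREFERRED) else b
    worse = b if better is a else a
    return worse if random.random() < noise else better

# Start from a "model" (a Gaussian) centered away from the preference.
mu, sigma = 0.0, 1.0
for generation in range(30):
    samples = [random.gauss(mu, sigma) for _ in range(2000)]
    # Each retained point is the winner of one noisy pairwise comparison.
    curated = [curate(samples[i], samples[i + 1], NOISE)
               for i in range(0, len(samples), 2)]
    # Retrain: refit the Gaussian to the curated synthetic data.
    mu = statistics.fmean(curated)
    sigma = max(statistics.stdev(curated), 1e-3)

# With NOISE = 0 the mean drifts cleanly toward PREFERRED; adversarial
# flips weaken that pressure, and larger NOISE can stall or misalign it.
print(round(mu, 2))
```

Varying `NOISE` makes the trade-off the paper studies visible in miniature: clean curation drives the model toward user preferences, while a budget of flipped comparisons degrades or redirects that convergence.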
Similar Papers
Convergence and Stability Analysis of Self-Consuming Generative Models with Heterogeneous Human Curation
Machine Learning (Stat)
Analyzes when models trained on their own human-curated outputs converge and stay stable.
Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop
Artificial Intelligence
Identifies and mitigates bias that language models amplify when retrained on their own outputs.
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
Machine Learning (CS)
Gives theoretical conditions for avoiding collapse when models retrain on their own outputs.