Sample-wise Adaptive Weighting for Transfer Consistency in Adversarial Distillation
By: Hongsin Lee, Hye Won Chung
Potential Business Impact:
Makes compact AI models both accurate and harder to fool with adversarial attacks.
Adversarial distillation in the standard min-max adversarial training framework aims to transfer adversarial robustness from a large, robust teacher network to a compact student. However, existing work often neglects to incorporate state-of-the-art robust teachers. Through extensive analysis, we find that stronger teachers do not necessarily yield more robust students, a phenomenon known as robust saturation. While this is typically attributed to capacity gaps, we show that such explanations are incomplete. Instead, we identify adversarial transferability, the fraction of student-crafted adversarial examples that remain effective against the teacher, as a key factor in successful robustness transfer. Based on this insight, we propose Sample-wise Adaptive Adversarial Distillation (SAAD), which reweights training examples by their measured transferability without incurring additional computational cost. Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet show that SAAD consistently improves AutoAttack robustness over prior methods. Our code is available at https://github.com/HongsinLee/saad.
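The abstract's description suggests a simple training loop: craft adversarial examples on the student, check which of them also fool the teacher (the measured transferability), and reweight each sample's distillation loss accordingly. Below is a minimal PyTorch sketch under that reading; the PGD attack, the weighting scheme (1 plus a transfer indicator, mean-normalized), and the KL distillation loss are illustrative assumptions, not the paper's exact formulation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Standard PGD sketch: craft adversarial examples on the student.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball and valid range.
        x_adv = torch.min(torch.max(x_adv.detach() + alpha * grad.sign(),
                                    x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def saad_step(student, teacher, x, y, tau=4.0):
    # Hypothetical sample-wise adaptive weighting: upweight samples whose
    # student-crafted adversarial examples also fool the teacher.
    x_adv = pgd_attack(student, x, y)
    with torch.no_grad():
        t_logits = teacher(x_adv)
        transfers = (t_logits.argmax(1) != y).float()  # 1 if the example transfers
        w = 1.0 + transfers        # illustrative weighting, not the paper's
        w = w / w.mean()           # keep the average weight at 1
    s_logits = student(x_adv)
    # Per-sample temperature-scaled KL distillation loss, then weighted mean.
    kd = F.kl_div(F.log_softmax(s_logits / tau, 1),
                  F.softmax(t_logits / tau, 1),
                  reduction='none').sum(1) * tau * tau
    return (w * kd).mean()
```

Note the teacher forward pass on `x_adv` is needed for distillation anyway, which is consistent with the abstract's claim that the weighting incurs no additional computational cost.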
Similar Papers
AdaGAT: Adaptive Guidance Adversarial Training for the Robustness of Deep Neural Networks
CV and Pattern Recognition
Makes small AI models smarter and tougher against attacks.
CIARD: Cyclic Iterative Adversarial Robustness Distillation
CV and Pattern Recognition
Makes AI smarter and safer, even when tricked.
Towards Class-wise Fair Adversarial Training via Anti-Bias Soft Label Distillation
CV and Pattern Recognition
Makes AI defenses equally strong across every class it recognizes.