ProARD: Progressive Adversarial Robustness Distillation: Provide Wide Range of Robust Students
By: Seyedhamidreza Mousavi, Seyedali Mousavi, Masoud Daneshtalab
Potential Business Impact:
Trains one smart computer to help many others.
Adversarial Robustness Distillation (ARD) has emerged as an effective method for improving the robustness of lightweight deep neural networks against adversarial attacks. Current ARD approaches use a large robust teacher network to train a single robust lightweight student. However, because edge devices and their resource constraints vary widely, these approaches must train a new student network from scratch for each set of constraints, which leads to substantial computational cost and increased CO2 emissions. This paper proposes Progressive Adversarial Robustness Distillation (ProARD), which enables efficient one-time training of a dynamic network that supports a diverse range of accurate and robust student networks without retraining. We first construct a dynamic deep neural network from dynamic layers that vary in width, depth, and expansion at each design stage, so that it supports a wide range of architectures. We then treat the largest student network as the dynamic teacher network. ProARD trains this dynamic network with a weight-sharing mechanism that jointly optimizes the dynamic teacher network and its internal student networks. However, because computing exact gradients for all students within the dynamic network is computationally expensive, a sampling mechanism is required to select a subset of students. We show that random student sampling in each iteration fails to produce accurate and robust students.
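To make the search space described in the abstract concrete, here is a minimal sketch of how student configurations could be enumerated and sampled. It is not the authors' implementation. The choice lists (`WIDTH_CHOICES`, `DEPTH_CHOICES`, `EXPANSION_CHOICES`), the stage count `NUM_STAGES`, and all function names are hypothetical; the abstract does not give the actual values. The sampler shown is the random-sampling baseline that the abstract reports as insufficient, not the ProARD sampling mechanism.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical per-stage choices; the abstract does not list the real values.
WIDTH_CHOICES = (0.65, 0.8, 1.0)       # width multipliers
DEPTH_CHOICES = (2, 3, 4)              # blocks per stage
EXPANSION_CHOICES = (0.2, 0.25, 0.35)  # expansion ratios
NUM_STAGES = 4                         # assumed number of design stages


@dataclass(frozen=True)
class StageConfig:
    """Width, depth, and expansion chosen for one design stage."""
    width: float
    depth: int
    expansion: float


SubnetConfig = Tuple[StageConfig, ...]


def largest_subnet() -> SubnetConfig:
    """Largest configuration, which the abstract uses as the dynamic teacher."""
    stage = StageConfig(
        max(WIDTH_CHOICES), max(DEPTH_CHOICES), max(EXPANSION_CHOICES)
    )
    return tuple(stage for _ in range(NUM_STAGES))


def sample_student(rng: random.Random) -> SubnetConfig:
    """Draw one student uniformly at random (the baseline sampler)."""
    return tuple(
        StageConfig(
            rng.choice(WIDTH_CHOICES),
            rng.choice(DEPTH_CHOICES),
            rng.choice(EXPANSION_CHOICES),
        )
        for _ in range(NUM_STAGES)
    )


def sample_students(rng: random.Random, k: int) -> List[SubnetConfig]:
    """Select a subset of k students for one training iteration."""
    return [sample_student(rng) for _ in range(k)]


def num_subnets() -> int:
    """Count the distinct student configurations in this toy search space."""
    per_stage = (
        len(WIDTH_CHOICES) * len(DEPTH_CHOICES) * len(EXPANSION_CHOICES)
    )
    return per_stage ** NUM_STAGES


if __name__ == "__main__":
    rng = random.Random(0)
    print("teacher:", largest_subnet())
    print("students this iteration:", sample_students(rng, k=2))
    print("search-space size:", num_subnets())
```

Even this toy space holds several hundred thousand configurations, which illustrates why exact gradients over all students are impractical and why only a subset is selected per iteration.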
Similar Papers
DARD: Dice Adversarial Robustness Distillation against Adversarial Attacks
Machine Learning (CS)
Makes AI smarter and safer from tricks.
CIARD: Cyclic Iterative Adversarial Robustness Distillation
CV and Pattern Recognition
Makes AI smarter and safer, even when tricked.
MMARD: Improving the Min-Max Optimization Process in Adversarial Robustness Distillation
CV and Pattern Recognition
Makes small computer brains smarter and safer.