Are classical deep neural networks weakly adversarially robust?
By: Nuolin Sun, Linyuan Wang, Dongyang Li, and more
Potential Business Impact:
Detects adversarially manipulated images by following their "feature paths."
Adversarial attacks have received increasing attention, and it is widely recognized that classical DNNs have weak adversarial robustness. The most commonly used defense, adversarial training, improves the adversarial accuracy of DNNs by generating adversarial examples and retraining the model, but it incurs significant computational overhead. In this paper, inspired by existing studies of the clustering properties of DNN features at each layer and the Progressive Feedforward Collapse (PFC) phenomenon, we propose a method for adversarial example detection and image recognition that uses layer-wise features to construct feature paths and computes the correlation between an example's feature path and the class-centered feature paths. Experimental results show that the recognition method achieves 82.77% clean accuracy and 44.17% adversarial accuracy on ResNet-20 trained with PFC. Compared to adversarial training, which attains 77.64% clean accuracy and 52.94% adversarial accuracy, our method offers a comparable trade-off without relying on computationally expensive defense strategies. Furthermore, on a standard ResNet-18, our method maintains this advantage with 80.01% clean accuracy and 46.1% adversarial accuracy. These results reveal inherent adversarial robustness in DNNs, challenging the conventional view that DNNs are weakly adversarially robust.
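To make the feature-path idea concrete, here is a minimal sketch (not the authors' code) of classification by comparing an example's layer-wise features against per-class mean features ("class-centered paths"). The use of average per-layer cosine similarity, the rejection threshold for flagging suspected adversarial examples, and all names and shapes below are assumptions for illustration only; the paper's exact correlation measure and detection rule may differ.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened feature vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def path_correlation(example_path, class_path) -> float:
    """Average per-layer similarity between two feature paths
    (lists of layer-wise feature vectors, one entry per layer)."""
    return float(np.mean([cosine(e, c) for e, c in zip(example_path, class_path)]))


def classify_by_path(example_path, class_center_paths, reject_threshold=0.5):
    """Predict the class whose centered path correlates most with the example's
    path; flag the example as suspect (possibly adversarial) if even the best
    correlation is low. The threshold rule here is an assumed stand-in."""
    scores = {k: path_correlation(example_path, p)
              for k, p in class_center_paths.items()}
    best_class = max(scores, key=scores.get)
    is_suspect = scores[best_class] < reject_threshold
    return best_class, scores[best_class], is_suspect


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_layers, dim = 4, 16
    # Hypothetical class-centered paths for two classes, one vector per layer.
    centers = {k: [rng.normal(size=dim) for _ in range(n_layers)] for k in (0, 1)}
    # A synthetic example whose layer-wise features lie near class 1's centers.
    example = [c + 0.1 * rng.normal(size=dim) for c in centers[1]]
    print(classify_by_path(example, centers))
```

In practice the layer-wise feature vectors would come from forward hooks on the network's blocks, and the class-centered paths from averaging those features over correctly classified training examples of each class.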
Similar Papers
The Impact of Scaling Training Data on Adversarial Robustness
CV and Pattern Recognition
Makes AI smarter and harder to trick.
C-LEAD: Contrastive Learning for Enhanced Adversarial Defense
CV and Pattern Recognition
Makes AI smarter and harder to trick.
Narrowing Class-Wise Robustness Gaps in Adversarial Training
CV and Pattern Recognition
Makes AI better at guessing, even with tricky data.