Algebraic Adversarial Attacks on Explainability Models
By: Lachlan Simpson, Federico Costanza, Kyle Millar and more
Potential Business Impact:
Shows how AI explanations can be fooled in a traceable way.
Classical adversarial attacks are phrased as a constrained optimisation problem. Although this approach is effective, one cannot trace how an adversarial point was generated. In this work, we propose an algebraic approach to adversarial attacks and study the conditions under which one can generate adversarial examples for post-hoc explainability models. Phrasing neural networks in the framework of geometric deep learning, we construct algebraic adversarial attacks through analysis of the symmetry groups of neural networks. Algebraic adversarial examples provide a mathematically tractable approach to adversarial examples. We validate our approach on two well-known datasets and one real-world dataset.
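To make the idea of a symmetry-based attack concrete, here is a minimal sketch under stated assumptions; it is not the construction from the paper. It assumes a hypothetical one-unit ReLU network and uses Integrated Gradients with a zero baseline as the post-hoc explainer; the abstract names neither. The only symmetry used is the simplest one available: translating the input by a vector in the null space of the first-layer weights leaves the output unchanged, while the attribution can change.

```python
import numpy as np

# Hypothetical toy model (an assumption, not from the paper):
# f(x) = w2 . relu(W1 x), with two inputs and one hidden unit.
W1 = np.array([[1.0, 1.0]])   # first-layer weights, shape (hidden, inputs)
w2 = np.array([1.0])          # output weights, shape (hidden,)

def f(x):
    """Scalar network output."""
    return float(w2 @ np.maximum(W1 @ x, 0.0))

def grad_f(x):
    """Gradient of f with respect to x (ReLU derivative taken as 0 at 0)."""
    active = (W1 @ x > 0).astype(float)
    return W1.T @ (w2 * active)

def integrated_gradients(x, baseline, steps=100):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([2.0, 1.0])
v = np.array([1.0, -1.0])     # W1 @ v == 0, so v lies in the null space of W1
x_adv = x + 2.0 * v           # translated input: [4, -1]

baseline = np.zeros(2)
attr = integrated_gradients(x, baseline)
attr_adv = integrated_gradients(x_adv, baseline)

print("f(x)     =", f(x), " attribution:", attr)
print("f(x_adv) =", f(x_adv), " attribution:", attr_adv)
```

In this toy case both inputs give the same output, yet the attribution of the second feature flips sign. Because the perturbation is written down in closed form rather than found by an optimiser, one can trace exactly how the adversarial point was generated.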
Similar Papers
Algorithms for Adversarially Robust Deep Learning
Machine Learning (CS)
Makes AI safer from tricks and mistakes.
Geometric origin of adversarial vulnerability in deep learning
Machine Learning (CS)
Makes AI smarter and harder to trick.
Evasion Attacks Against Bayesian Predictive Models
Machine Learning (Stat)
Makes smart programs harder to trick.