Adversarial Surrogate Risk Bounds for Binary Classification
By: Natalie S. Frank
Potential Business Impact:
Makes AI harder for hackers to trick.
A central concern in classification is the vulnerability of machine learning models to adversarial attacks. Adversarial training, one of the most popular techniques for training robust classifiers, involves minimizing an adversarial surrogate risk. Recent work characterized when a minimizing sequence of the adversarial surrogate risk is also a minimizing sequence of the adversarial classification risk for binary classification -- a property known as adversarial consistency. However, these results do not address the rate at which the adversarial classification risk converges to its optimal value along such a minimizing sequence. This paper provides surrogate risk bounds that quantify that convergence rate. Additionally, we derive distribution-dependent surrogate risk bounds in the standard (non-adversarial) learning setting, which may be of independent interest.
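For readers unfamiliar with the objects the abstract names, the LaTeX sketch below writes out the standard definitions of the classification and surrogate risks, their adversarial counterparts, and the generic form of a surrogate risk bound. The margin loss phi, the perturbation radius epsilon, and the modulus Psi are generic notation assumed here for illustration; they are the textbook setup, not lifted from the paper itself.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Standard (non-adversarial) risks for a real-valued score function f,
% labels Y in {-1,+1}, and a margin loss phi:
\begin{align*}
  R(f)      &= \mathbb{P}\bigl(\operatorname{sign} f(X) \neq Y\bigr)
               && \text{(classification risk)} \\
  R_\phi(f) &= \mathbb{E}\bigl[\phi\bigl(Y f(X)\bigr)\bigr]
               && \text{(surrogate risk)}
\end{align*}
% Adversarial versions: an attacker may move each input anywhere in an
% epsilon-ball before the loss is evaluated.
\begin{align*}
  R^\epsilon(f)      &= \mathbb{E}\Bigl[\sup_{\|x' - X\| \le \epsilon}
                        \mathbf{1}\bigl(\operatorname{sign} f(x') \neq Y\bigr)\Bigr] \\
  R_\phi^\epsilon(f) &= \mathbb{E}\Bigl[\sup_{\|x' - X\| \le \epsilon}
                        \phi\bigl(Y f(x')\bigr)\Bigr]
\end{align*}
% A surrogate risk bound is a statement of the form
\[
  R^\epsilon(f) - \inf_g R^\epsilon(g)
  \;\le\; \Psi\Bigl(R_\phi^\epsilon(f) - \inf_g R_\phi^\epsilon(g)\Bigr),
\]
% where Psi is nondecreasing with Psi(0) = 0, so driving the surrogate
% excess risk to zero forces the classification excess risk to zero at a
% rate governed by Psi.
\end{document}

Under this framing, the convergence rate the abstract refers to is exactly the shape of Psi: a quantitative bound replaces the purely qualitative consistency statement.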
Similar Papers
Risk Analysis and Design Against Adversarial Actions
Machine Learning (CS)
Makes smart programs safer from tricky data.
On the existence of consistent adversarial attacks in high-dimensional linear classification
Machine Learning (Stat)
Shows when tricky data can fool computers into mistakes.
Narrowing Class-Wise Robustness Gaps in Adversarial Training
CV and Pattern Recognition
Makes AI better at guessing, even with tricky data.