Global Adversarial Attacks for Assessing Deep Learning Robustness (1906.07920v1)

Published 19 Jun 2019 in cs.LG, cs.CR, and stat.ML

Abstract: It has been shown that deep neural networks (DNNs) may be vulnerable to adversarial attacks, raising the concern on their robustness particularly for safety-critical applications. Recognizing the local nature and limitations of existing adversarial attacks, we present a new type of global adversarial attacks for assessing global DNN robustness. More specifically, we propose a novel concept of global adversarial example pairs in which each pair of two examples are close to each other but have different class labels predicted by the DNN. We further propose two families of global attack methods and show that our methods are able to generate diverse and intriguing adversarial example pairs at locations far from the training or testing data. Moreover, we demonstrate that DNNs hardened using the strong projected gradient descent (PGD) based (local) adversarial training are vulnerable to the proposed global adversarial example pairs, suggesting that global robustness must be considered while training robust deep learning networks.

Citations (4)

View on Semantic Scholar