- The paper introduces the Robust Physical Perturbations (RP2) algorithm, which generates adversarial perturbations that remain effective under real-world physical conditions.
- The paper validates its method with lab and field tests, achieving 100% targeted misclassification in controlled lab settings and 84.8% in drive-by field tests.
- The study reveals significant risks for deep learning visual classifiers, emphasizing the need for robust safety improvements in autonomous systems.
Overview of "Robust Physical-World Attacks on Deep Learning Visual Classification"
The paper "Robust Physical-World Attacks on Deep Learning Visual Classification" addresses the vulnerability of deep neural networks (DNNs) to adversarial examples in physical environments. Focusing on road sign classification, it explores how to generate physically realizable attacks that cause misclassification in DNN-based systems while modifying only the target object itself, without altering its background or surroundings.
Research Contributions
The authors propose the Robust Physical Perturbations (RP2) algorithm, which generates adversarial perturbations that remain robust under varying environmental conditions. The key contributions of the paper include:
- RP2 Algorithm: RP2 optimizes perturbations over a distribution of inputs that models diverse physical dynamics, such as varying distances and viewing angles. It produces visible but inconspicuous perturbations that are confined, via a mask, to the object's surface, thereby addressing spatial constraints.
- Evaluation Methodology: A two-stage experimental evaluation is proposed, comprising lab (stationary) tests with fixed camera distances and angles, and field (drive-by) tests with a camera mounted on a moving vehicle. This rigorous testing establishes the perturbations' effectiveness in real-world settings.
- Empirical Validation: The paper demonstrates the success of the perturbations against standard classifier architectures (LISA-CNN and GTSRB-CNN), achieving high misclassification rates. For instance, perturbed Stop signs were consistently misclassified as Speed Limit 45 signs, highlighting the method's effectiveness.
- General Applicability: Beyond traffic sign recognition, the perturbations were applied to other objects, such as a microwave attacked against the Inception-v3 classifier, demonstrating versatility with a 90% attack success rate.
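The core idea above can be sketched as an optimization over a masked perturbation, evaluated across sampled physical transformations. The toy sketch below assumes a random linear "classifier" over small images and uses only brightness jitter as the transformation and random search as the optimizer; the paper itself uses trained CNNs (LISA-CNN/GTSRB-CNN), a richer transformation distribution, and gradient-based optimization.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy stand-in for a trained classifier: a random linear model
# over flattened 8x8 "sign" images with 3 classes (an assumption,
# not the paper's architecture).
W = rng.normal(size=(3, 64)) * 0.1

def classify(img):
    return softmax(W @ img.ravel())

def rp2_loss(delta, mask, clean_imgs, target, lam=0.05):
    """RP2-style objective: lambda * ||mask * delta||_1 plus the mean
    cross-entropy toward the target class, averaged over images under
    sampled physical transformations (here, only brightness jitter)."""
    reg = lam * np.abs(mask * delta).sum()
    ce = 0.0
    for img in clean_imgs:
        brightness = rng.uniform(0.8, 1.2)   # sampled transformation
        adv = np.clip(brightness * (img + mask * delta), 0.0, 1.0)
        ce += -np.log(classify(adv)[target] + 1e-12)
    return reg + ce / len(clean_imgs)

# Crude random-search optimization of the sticker region (illustrative only).
imgs = [rng.uniform(0.0, 1.0, size=(8, 8)) for _ in range(5)]
mask = np.zeros((8, 8))
mask[2:4, 2:6] = 1.0                         # sticker placement on the sign
delta = np.zeros((8, 8))
init_loss = best = rp2_loss(delta, mask, imgs, target=2)
for _ in range(300):
    cand = delta + rng.normal(scale=0.1, size=delta.shape)
    loss = rp2_loss(cand, mask, imgs, target=2)
    if loss < best:
        delta, best = cand, loss
```

The mask term is what makes the attack a "sticker": the perturbation only ever touches the designated region of the object's surface, and the L1 penalty keeps it inconspicuous.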
Numerical Results
Strong numerical outcomes are reported, specifically for the Stop sign attacked under different environmental conditions. Targeted misclassification was achieved at a rate of 100% in stationary lab tests and 84.8% in drive-by field tests for the poster-printing attack; sticker attacks designed to mimic graffiti also achieved high success rates.
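As a rough illustration of how such per-frame rates are computed, the targeted success rate is the fraction of sampled video frames classified as the attacker's target class. The frame counts below are hypothetical, chosen only to reproduce the reported 84.8% figure:

```python
def targeted_success_rate(frame_labels, target):
    """Fraction of frames whose predicted class equals the attack target."""
    return sum(label == target for label in frame_labels) / len(frame_labels)

# Hypothetical drive-by predictions: 28 of 33 frames fooled into "Speed Limit 45"
frames = ["Speed Limit 45"] * 28 + ["Stop"] * 5
rate = targeted_success_rate(frames, "Speed Limit 45")
print(f"{rate:.1%}")  # 84.8%
```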
Implications and Future Research
This research highlights the potential risks posed by adversarial examples in safety-critical systems such as autonomous vehicles. The methodologies developed have broader implications for improving the robustness of visual systems in dynamic environments. Possible future research directions include extending this work to explore defenses against such robust physical-world adversarial attacks or assessing the impacts on different types of sensors and perception modules.
In summary, the paper presents a comprehensive investigation into the robustness of physical-world adversarial attacks, illustrating both the vulnerability of current models and the necessity for ongoing security enhancements in visual classification systems.