Going Further: Flatness at the Rescue of Early Stopping for Adversarial Example Transferability (2304.02688v2)
Abstract: Transferability is the property of adversarial examples to be misclassified by models other than the surrogate model for which they were crafted. Previous research has shown that early stopping the training of the surrogate model substantially increases transferability. A common hypothesis to explain this is that deep neural networks (DNNs) first learn robust features, which are more generic and thus make a better surrogate; at later epochs, DNNs learn non-robust features, which are more brittle and hence make a worse surrogate. First, we provide evidence against this hypothesis, using transferability as a proxy for representation similarity. We then establish links between transferability and the exploration of the loss landscape in parameter space, focusing on sharpness, which is affected by early stopping. This leads us to evaluate surrogate models trained with seven minimizers that minimize both loss value and loss sharpness. Among them, SAM consistently outperforms early stopping, by up to 28.8 percentage points. We find that the strong regularization induced by SAM's large flat neighborhoods is tightly linked to transferability. Finally, the best sharpness-aware minimizers prove competitive with other training methods and complement existing transferability techniques.
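To make the abstract's two key ingredients concrete, here is a minimal PyTorch sketch of (i) measuring transferability with a one-step FGSM attack crafted on a surrogate model and evaluated on a target model, and (ii) a single SAM training step in the spirit of Foret et al. (2020). All names (`surrogate`, `target`, `base_opt`) and the hyperparameter values `eps` and `rho` are illustrative assumptions, not the paper's exact experimental protocol.

```python
# Illustrative sketch only: transferability measurement + one SAM step.
import torch

def fgsm(surrogate, loss_fn, x, y, eps=8 / 255):
    """Craft one-step FGSM adversarial examples on the surrogate model."""
    x_adv = x.clone().requires_grad_(True)
    loss_fn(surrogate(x_adv), y).backward()
    # Move each pixel by eps in the direction that increases the surrogate loss.
    return (x + eps * x_adv.grad.sign()).clamp(0, 1).detach()

@torch.no_grad()
def transfer_rate(target, x_adv, y):
    """Transferability: fraction of adversarial examples that fool the target."""
    return (target(x_adv).argmax(dim=1) != y).float().mean().item()

def sam_step(model, loss_fn, x, y, base_opt, rho=0.05):
    """One SAM update: ascend to an (approximately) worst nearby point, descend from there."""
    # 1) Ascent: gradient at the current weights w.
    loss = loss_fn(model(x), y)
    loss.backward()
    params = [p for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([p.grad.norm(p=2) for p in params]), p=2)
    eps_w = []
    with torch.no_grad():
        for p in params:
            e = rho * p.grad / (grad_norm + 1e-12)  # first-order worst-case direction
            p.add_(e)                               # w <- w + e, inside the rho-ball
            eps_w.append(e)
    model.zero_grad()
    # 2) Descent: gradient at the perturbed weights, applied to the original w.
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in zip(params, eps_w):
            p.sub_(e)                               # restore w
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()
```

In the paper's setting, the surrogate would be trained with a sharpness-aware step like `sam_step` above, and transferability would then be measured with something like `transfer_rate` against independently trained target models.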
- A modern look at the relationship between sharpness and generalization. February 2023.
- Pitfalls of In-Domain Uncertainty Estimation and Ensembling in Deep Learning. February 2020. URL http://arxiv.org/abs/2002.06470.
- Batch Normalization Increases Adversarial Vulnerability and Decreases Adversarial Transferability: A Non-Robust Feature Perspective. In ICCV 2021, October 2021. doi:10.1109/ICCV48922.2021.00772. URL http://arxiv.org/abs/2010.03316.
- Evasion attacks against machine learning at test time. In Lecture Notes in Computer Science, volume 8190 LNAI, pages 387–402, August 2013. ISBN 9783642409936. doi:10.1007/978-3-642-40994-3_25. URL http://arxiv.org/abs/1708.06131.
- Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape. In Gustau Camps-Valls, Francisco J. R. Ruiz, and Isabel Valera, editors, Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pages 8299–8339. PMLR, 28–30 Mar 2022. URL https://proceedings.mlr.press/v151/bisla22a.html.
- Boosting Adversarial Attacks with Momentum. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 9185–9193, October 2018. ISBN 9781538664209. doi:10.1109/CVPR.2018.00957. URL http://arxiv.org/abs/1710.06081.
- Efficient Sharpness-aware Minimization for Improved Training of Neural Networks. In International Conference on Learning Representations, October 2021.
- Robustness (Python Library), 2019. URL https://github.com/MadryLab/robustness.
- Sharpness-Aware Minimization for Efficiently Improving Generalization. October 2020. doi:10.48550/arXiv.2010.01412. URL https://arxiv.org/abs/2010.01412v3.
- Explaining and Harnessing Adversarial Examples. December 2014. URL http://arxiv.org/abs/1412.6572.
- Efficient and Transferable Adversarial Examples from Bayesian Neural Networks. In UAI 2022, 2022a. URL http://arxiv.org/abs/2011.05074.
- LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity. In ECCV 2022, 2022b.
- Adversarial Examples Are Not Bugs, They Are Features. May 2019. URL http://arxiv.org/abs/1905.02175.
- Averaging weights leads to wider optima and better generalization. In 34th Conference on Uncertainty in Artificial Intelligence 2018, UAI 2018, volume 2, pages 876–885. Association For Uncertainty in Artificial Intelligence (AUAI), March 2018. ISBN 978-1-5108-7160-1.
- When Do Flat Minima Optimizers Work? In NeurIPS 2022, February 2022. doi:10.48550/arXiv.2202.00661. URL https://arxiv.org/abs/2202.00661v5.
- On large-batch training for deep learning: Generalization gap and sharp minima. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=H1oyRlYgg.
- Hoki Kim. Torchattacks: A pytorch repository for adversarial attacks. arXiv preprint arXiv:2010.01950, 2020.
- Adversarial examples in the physical world. In 5th International Conference on Learning Representations, ICLR 2017 - Workshop Track Proceedings, July 2017. URL http://arxiv.org/abs/1607.02533.
- ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks. February 2021. doi:10.48550/arXiv.2102.11600. URL https://arxiv.org/abs/2102.11600v3.
- Learning Transferable Adversarial Examples via Ghost Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 34(07):11458–11465, 2020. ISSN 2374-3468. doi:10.1609/aaai.v34i07.6810. URL http://arxiv.org/abs/1812.03413.
- Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks. August 2019. URL http://arxiv.org/abs/1908.06281.
- Towards Efficient and Scalable Sharpness-Aware Minimization. In CVPR, pages 12360–12370, 2022.
- On Improving Adversarial Transferability of Vision Transformers. In ICLR (spotlight), March 2022.
- Vikram Nitin. SGD on Neural Networks Learns Robust Features Before Non-Robust, March 2021.
- Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation. In NeurIPS 2022, October 2022. doi:10.48550/arXiv.2210.05968.
- Cockpit: A Practical Debugging Tool for the Training of Deep Neural Networks. February 2021. doi:10.48550/arXiv.2102.06604. URL https://arxiv.org/abs/2102.06604v2.
- A Little Robustness Goes a Long Way: Leveraging Robust Features for Targeted Transfer Attacks. In Advances in Neural Information Processing Systems, volume 34, pages 9759–9773, 2021. ISSN 1049-5258. doi:10.48550/arXiv.2106.02105. URL https://arxiv.org/abs/2106.02105v2.
- Relating Adversarially Robust Generalization to Flat Minima. In Proceedings of the IEEE International Conference on Computer Vision, pages 7787–7797, April 2021. ISSN 1550-5499. doi:10.1109/ICCV48922.2021.00771. URL https://arxiv.org/abs/2104.04448v2.
- Intriguing properties of neural networks. December 2013. URL http://arxiv.org/abs/1312.6199.
- Enhancing the Transferability of Adversarial Attacks through Variance Tuning. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1924–1933, March 2021. ISSN 1063-6919. doi:10.1109/CVPR46437.2021.00196. URL https://arxiv.org/abs/2103.15571v3.
- Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization, July 2023.
- Skip Connections Matter: On the Transferability of Adversarial Examples Generated with ResNets. In ICLR, February 2020. URL http://arxiv.org/abs/2002.05990.
- Improving Transferability of Adversarial Examples with Input Diversity. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2019-June, pages 2725–2734, March 2019. ISBN 9781728132938. doi:10.1109/CVPR.2019.00284. URL http://arxiv.org/abs/1803.06978.
- PyHessian: Neural Networks Through the Lens of the Hessian. In Proceedings of the 2020 IEEE International Conference on Big Data, pages 581–590, 2020. ISSN 2331-8422. doi:10.1109/BigData50022.2020.9378171. URL https://arxiv.org/abs/1912.07145v3.
- Early Stop and Adversarial Training Yield Better Surrogate Model: Very Non-Robust Features Harm Adversarial Transferability, 2021.
- Towards Good Practices in Evaluating Transfer Adversarial Attacks. November 2022. doi:10.48550/arXiv.2211.09565. URL https://arxiv.org/abs/2211.09565v1.
- Surrogate Gap Minimization Improves Sharpness-Aware Training. In International Conference on Learning Representations, 2022. URL https://openreview.net/forum?id=edONMAnhLu-.