Transcending Adversarial Perturbations: Manifold-Aided Adversarial Examples with Legitimate Semantics (2402.03095v1)
Abstract: Deep neural networks are highly vulnerable to adversarial examples crafted with malicious, tiny perturbations. Although most conventional adversarial attacks ensure visual imperceptibility by minimizing the geometric distance between an adversarial example and its corresponding raw image, this distance constraint leads to limited attack transferability, inferior visual quality, and perturbations that offer humans little interpretable insight. In this paper, we propose a supervised semantic-transformation generative model that produces adversarial examples with real and legitimate semantics, in which an unrestricted adversarial manifold containing continuous semantic variations is constructed for the first time to realize a legitimate transition from non-adversarial examples to adversarial ones. Comprehensive experiments on MNIST and industrial defect datasets show that our adversarial examples not only exhibit better visual quality but also achieve superior attack transferability and more effectively explain model vulnerabilities, indicating their great potential as generic adversarial examples. The code and pre-trained models are available at https://github.com/shuaili1027/MAELS.git.
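The key mechanism described above, walking along a generative model's semantic manifold until a classifier's decision flips rather than adding pixel-space noise, can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration and not the paper's implementation: the decoder `G`, the victim classifier `f`, the latent dimension, and the simple latent-space line search are all placeholder assumptions; the authors' actual code lives in the repository linked above.

```python
# Minimal sketch of an on-manifold adversarial search. G and f are toy
# stand-ins (assumptions), not the paper's architecture; in practice both
# would be pretrained models.
import torch
import torch.nn as nn

torch.manual_seed(0)

LATENT_DIM, NUM_CLASSES = 16, 10

# Hypothetical decoder (latent code -> image) and victim classifier.
G = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                  nn.Linear(64, 28 * 28), nn.Tanh())
f = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(),
                  nn.Linear(64, NUM_CLASSES))

def predict(z: torch.Tensor) -> int:
    """Decode a latent code into an image and return the classifier's label."""
    with torch.no_grad():
        return f(G(z)).argmax(dim=-1).item()

def on_manifold_attack(z0: torch.Tensor, direction: torch.Tensor,
                       steps: int = 200, step_size: float = 0.05):
    """Walk along a semantic direction in latent space until the label flips.

    Every intermediate code decodes to an image on the generator's manifold,
    giving a continuous path from a non-adversarial example to an adversarial
    one instead of an off-manifold pixel perturbation.
    """
    y0 = predict(z0)
    direction = direction / direction.norm()
    for i in range(1, steps + 1):
        z = z0 + i * step_size * direction
        if predict(z) != y0:
            return z, i * step_size  # first adversarial latent code found
    return None, None  # no label flip within the search budget

z0 = torch.randn(1, LATENT_DIM)
z_adv, dist = on_manifold_attack(z0, torch.randn(1, LATENT_DIM))
print("adversarial latent found" if z_adv is not None
      else "no flip within budget")
```

Because every candidate image is produced by decoding a latent code, each intermediate point stays on the generator's manifold, which is what keeps the transition from non-adversarial to adversarial examples semantically legitimate.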
- Shuai Li
- Xiaoyu Jiang
- Xiaoguang Ma