Can Implicit Bias Imply Adversarial Robustness? (2405.15942v2)

Published 24 May 2024 in cs.LG and stat.ML

Abstract: The implicit bias of gradient-based training algorithms has been considered mostly beneficial as it leads to trained networks that often generalize well. However, Frei et al. (2023) show that such implicit bias can harm adversarial robustness. Specifically, they show that if the data consists of clusters with small inter-cluster correlation, a shallow (two-layer) ReLU network trained by gradient flow generalizes well, but it is not robust to adversarial attacks of small radius. Moreover, this phenomenon occurs despite the existence of a much more robust classifier that can be explicitly constructed from a shallow network. In this paper, we extend recent analyses of neuron alignment to show that a shallow network with a polynomial ReLU activation (pReLU) trained by gradient flow not only generalizes well but is also robust to adversarial attacks. Our results highlight the importance of the interplay between data structure and architecture design in the implicit bias and robustness of trained networks.
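The contrast drawn in the abstract is architectural: the same gradient-flow training that leaves a standard two-layer ReLU network non-robust is claimed to yield a robust classifier once the activation is replaced by a polynomial ReLU (pReLU). As a rough illustration of that architectural change, the sketch below implements such a shallow network and probes it with a one-step gradient attack. Everything here is an assumption for illustration: the functional form pReLU(z) = ReLU(z)^p, the hand-placed cluster-aligned weights, the attack radius, and all dimensions; this is not the paper's exact construction or training procedure.

```python
import numpy as np

def prelu(z, p):
    """Polynomial ReLU (pReLU). Assumed form ReLU(z)**p; reduces to ReLU at p=1."""
    return np.maximum(z, 0.0) ** p

def prelu_grad(z, p):
    """Derivative of the assumed pReLU with respect to its input."""
    return p * np.maximum(z, 0.0) ** (p - 1) if p > 1 else (z > 0).astype(float)

def forward(x, W, v, p):
    """Shallow network f(x) = v^T pReLU(W x), with first layer W (m, d), head v (m,)."""
    return v @ prelu(W @ x, p)

def input_grad(x, W, v, p):
    """Gradient of f with respect to the input x: sum_j v_j * pReLU'(w_j . x) * w_j."""
    return (v * prelu_grad(W @ x, p)) @ W

# Toy "clustered" data: a single unit-norm cluster center (illustrative setup only).
rng = np.random.default_rng(0)
d, m, p = 100, 64, 2
mu = rng.normal(size=d)
mu /= np.linalg.norm(mu)

# Hand-placed weights whose neurons align with the cluster center, mimicking the
# neuron-alignment phenomenon the paper analyzes (not a network trained by gradient flow).
W = np.outer(np.ones(m) / np.sqrt(m), mu)
v = np.ones(m) / np.sqrt(m)

x = mu                                     # a clean point from the positive cluster
eps = 0.3                                  # attack radius (arbitrary choice)
g = input_grad(x, W, v, p)
x_adv = x - eps * g / np.linalg.norm(g)    # one normalized gradient step against the label

print(f"clean output:    {forward(x, W, v, p):+.4f}")
print(f"attacked output: {forward(x_adv, W, v, p):+.4f}")
```

Setting p = 1 recovers the standard two-layer ReLU network the abstract contrasts against; the paper's actual results concern networks trained by gradient flow from small initialization, not hand-placed weights as in this sketch.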

References (48)
  1. Transformers learn through gradual rank increase. Advances in Neural Information Processing Systems, 36, 2023.
  2. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In International Conference on Machine Learning, pp. 274–283. PMLR, 2018.
  3. B-cos networks: Alignment is all we need for interpretability. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10329–10338, 2022.
  4. Early alignment in two-layer networks training is a two-edged sword. arXiv preprint arXiv:2401.10791, 2024.
  5. Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs. In Advances in Neural Information Processing Systems, volume 35, pp. 20105–20118, 2022.
  6. On evaluating adversarial robustness. arXiv preprint arXiv:1902.06705, 2019.
  7. Learning a neuron by a shallow ReLU network: Dynamics and implicit bias for correlated inputs. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  8. Implicit bias of gradient descent for wide two-layer neural networks trained with the logistic loss. In Proceedings of Thirty Third Conference on Learning Theory, volume 125, pp. 1305–1338. PMLR, 2020.
  9. Clarke, F. H. Optimization and nonsmooth analysis. SIAM, 1990.
  10. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pp. 1310–1320. PMLR, 2019.
  11. Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks. In International Conference on Machine Learning, 2020.
  12. Dohmatob, E. Generalized no free lunch theorem for adversarial robustness. In International Conference on Machine Learning, pp. 1646–1654. PMLR, 2019.
  13. Algorithmic regularization in learning deep homogeneous models: Layers are automatically balanced. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
  14. Bridging the gap between adversarial robustness and optimization bias. arXiv preprint arXiv:2102.08868, 2021.
  15. Adversarial vulnerability for any classifier. Advances in Neural Information Processing Systems, 31, 2018.
  16. Implicit bias in leaky ReLU networks trained on high-dimensional data. In The Eleventh International Conference on Learning Representations, 2022.
  17. The double-edged sword of implicit bias: Generalization vs. robustness in ReLU networks. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  18. Caltech-256 object category dataset. Technical report, California Institute of Technology, 2007.
  19. Implicit regularization in matrix factorization. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6152–6160, 2017.
  20. Countering adversarial images using input transformations. In International Conference on Learning Representations, 2018.
  21. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034, 2015.
  22. Saddle-to-saddle dynamics in deep linear networks: Small initialization training, symmetry, and sparsity. arXiv preprint arXiv:2106.15933, 2021.
  23. Gradient descent aligns the layers of deep linear networks. In International Conference on Learning Representations, 2019.
  24. Directional convergence and alignment in deep learning. Advances in Neural Information Processing Systems, 33:17176–17186, 2020.
  25. Directional convergence near small initializations and saddles in two-homogeneous neural networks. arXiv preprint arXiv:2402.09226, 2024.
  26. (De)randomized smoothing for certifiable defense against patch attacks. Advances in Neural Information Processing Systems, 33:6465–6475, 2020.
  27. Gradient descent maximizes the margin of homogeneous neural networks. In International Conference on Learning Representations, 2019.
  28. Gradient descent on two-layer nets: Margin maximization and simplicity bias. Advances in Neural Information Processing Systems, 34:12978–12991, 2021.
  29. Gradient descent quantizes ReLU network features. arXiv preprint arXiv:1803.08367, 2018.
  30. On the explicit role of initialization on the convergence and implicit bias of overparametrized linear networks. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pp. 7760–7768. PMLR, 2021.
  31. Early neuron alignment in two-layer ReLU networks with small initialization. In International Conference on Learning Representations, 2024.
  32. Adversarial examples might be avoidable: The role of data concentration in adversarial robustness. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  33. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP), pp. 582–597. IEEE, 2016.
  34. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. Advances in Neural Information Processing Systems, 29, 2016.
  35. Exact solutions to the nonlinear dynamics of learning in deep linear networks. In International Conference on Learning Representations, 2014.
  36. Are adversarial examples inevitable? In International Conference on Learning Representations, 2018.
  37. Adversarial training for free! Advances in Neural Information Processing Systems, 32, 2019.
  38. Implicit balancing and regularization: Generalization and convergence guarantees for overparameterized asymmetric matrix sensing. In Proceedings of Thirty Sixth Conference on Learning Theory, volume 195, pp. 5140–5142. PMLR, 2023.
  39. Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction. Advances in Neural Information Processing Systems, 34, 2021.
  40. Adversarial robustness of supervised sparse coding. Advances in Neural Information Processing Systems, 33:2110–2121, 2020.
  41. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
  42. Max-margin token selection in attention mechanism. Advances in Neural Information Processing Systems, 36, 2023.
  43. On margin maximization in linear and ReLU networks. Advances in Neural Information Processing Systems, 35:37024–37036, 2022.
  44. Understanding multi-phase optimization dynamics and rich nonlinear behaviors of ReLU networks. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
  45. Implicit bias of SGD in L2-regularized linear DNNs: One-way jumps from high to low rank. arXiv preprint arXiv:2305.16038, 2023.
  46. Fast is better than free: Revisiting adversarial training. In International Conference on Learning Representations, 2019.
  47. Kernel and rich regimes in overparametrized models. In Conference on Learning Theory, pp. 3635–3673. PMLR, 2020.
  48. Randomized smoothing of all shapes and sizes. In International Conference on Machine Learning, pp. 10693–10705. PMLR, 2020.