Understanding Parameter Saliency via Extreme Value Theory (2310.17951v2)
Abstract: Deep neural networks have been increasingly deployed throughout society in recent years. When diagnosing undesirable model behaviors, it is useful to identify which parameters trigger misclassification. The concept of parameter saliency has been proposed and used to diagnose convolutional neural networks (CNNs) by ranking convolution filters that may have caused misclassification on the basis of parameter saliency. It has also been shown that fine-tuning the top-ranked salient filters efficiently corrects misidentification on ImageNet. However, there is still a knowledge gap in understanding why parameter saliency ranking can find the filters that induce misidentification. In this work, we attempt to bridge this gap by analyzing parameter saliency ranking from a statistical viewpoint, namely extreme value theory. We first show that the existing work implicitly assumes that the gradient norm computed for each filter follows a normal distribution. Then, we clarify the relationship between parameter saliency and the score based on the peaks-over-threshold (POT) method, which is often used to model extreme values. Finally, we reformulate parameter saliency in terms of the POT method; this reformulation can be regarded as statistical anomaly detection and does not require the implicit assumptions of the existing parameter-saliency formulation. Our experimental results demonstrate that our reformulation can also detect malicious filters. Furthermore, we show that the existing parameter saliency method exhibits a bias with respect to the depth of layers in deep neural networks; in particular, this bias has the potential to inhibit the discovery of filters that cause misidentification under domain shift. In contrast, parameter saliency based on POT shows less of this bias.
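To make the POT-based view concrete, here is a minimal sketch of the idea the abstract describes: compute a gradient norm per convolution filter, fit a generalized Pareto distribution (GPD) to the excesses over a high empirical threshold, and score each filter by how extreme its norm is under that tail model. This assumes PyTorch and SciPy; the helper names (`filter_gradient_norms`, `pot_saliency_scores`) and the 0.9 quantile threshold are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from scipy.stats import genpareto


def filter_gradient_norms(model, x, target):
    """Per-filter L2 norms of the loss gradient w.r.t. each conv filter.

    `x` is a batched input and `target` its label; both are assumed given.
    """
    model.zero_grad()
    loss = F.cross_entropy(model(x), target)
    loss.backward()
    norms = []
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            g = module.weight.grad  # shape: (out_channels, in_channels, kH, kW)
            # One L2 norm per output filter.
            norms.extend(g.flatten(1).norm(dim=1).tolist())
    return torch.tensor(norms)


def pot_saliency_scores(norms, quantile=0.9):
    """Peaks-over-threshold anomaly scores for per-filter gradient norms.

    Fits a GPD to the excesses above a high empirical quantile and scores
    each filter by the GPD tail probability of its excess. This is a sketch
    of POT-style anomaly detection, not the paper's exact scoring rule.
    """
    u = torch.quantile(norms, quantile).item()
    excesses = norms[norms > u] - u
    # loc is fixed at 0: in the standard POT setup the GPD models the
    # excesses above the threshold, not the raw values.
    xi, _, sigma = genpareto.fit(excesses.numpy(), floc=0.0)
    tail_prob = genpareto.sf(
        (norms - u).clamp(min=0).numpy(), xi, loc=0.0, scale=sigma
    )
    # Smaller tail probability = more extreme norm = more anomalous filter.
    return 1.0 - tail_prob
```

Filters would then be ranked by this score in descending order, mirroring the saliency-ranking procedure the abstract describes; note that this score depends only on the fitted tail model, not on any normality assumption about the gradient norms.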
- Shuo Wang
- Issei Sato