
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box (2308.09381v3)

Published 18 Aug 2023 in cs.LG

Abstract: Attribution methods shed light on the explainability of data-driven approaches such as deep learning models by uncovering the most influential features in a to-be-explained decision. While determining feature attributions via gradients delivers promising results, the internal access required for acquiring gradients can be impractical under safety concerns, thus limiting the applicability of gradient-based approaches. In response to such limited flexibility, this paper presents GEEX (gradient-estimation-based explanation), an approach that produces gradient-like explanations through only query-level access. The proposed approach holds a set of fundamental properties for attribution methods, which are rigorously proved, ensuring the quality of its explanations. In addition to the theoretical analysis, with a focus on image data, the experimental results empirically demonstrate the superiority of the proposed method over state-of-the-art black-box methods and its competitive performance compared to methods with full access.
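
To make "gradient-like explanations through only query-level access" concrete, the following is a minimal sketch of a generic Monte Carlo gradient estimator that only calls the model's forward function. It is not the paper's algorithm: the function name `estimate_gradient`, the Gaussian perturbations, the mirrored (antithetic) query pairs, and the parameters `n_samples` and `sigma` are all illustrative assumptions.

```python
import numpy as np

def estimate_gradient(f, x, n_samples=2000, sigma=0.01, seed=0):
    """Estimate the gradient of a scalar black-box function f at x.

    Assumption: f can only be queried (no internal access). Each sample
    draws a Gaussian direction u and queries f at x + sigma*u and
    x - sigma*u (a mirrored pair); the scaled output difference weights u.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        # Central difference along u, using two forward queries only.
        delta = f(x + sigma * u) - f(x - sigma * u)
        grad += (delta / (2.0 * sigma)) * u
    return grad / n_samples

if __name__ == "__main__":
    # Hypothetical "model": a linear score whose true gradient is w.
    w = np.array([1.0, -2.0, 0.5])
    model = lambda z: float(w @ z)
    print(estimate_gradient(model, np.zeros(3)))
```

For an image classifier, `f` would return the score of the class being explained, and the estimate could be displayed as a saliency map. The number of queries needed grows with input dimension, so this sketch is only practical as written for small inputs.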

