Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision (2312.17285v2)

Published 28 Dec 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Understanding intermediate representations of the concepts learned by deep learning classifiers is indispensable for interpreting general model behaviors. Existing approaches to revealing learned concepts often rely on human supervision, such as pre-defined concept sets or segmentation processes. In this paper, we propose a novel unsupervised method for discovering distributed representations of concepts by selecting a principal subset of neurons. Our empirical findings demonstrate that instances with similar neuron activation states tend to share coherent concepts. Based on these observations, the proposed method selects principal neurons that construct an interpretable region, namely a Relaxed Decision Region (RDR), encompassing instances with coherent concepts in the feature space. It can be used to identify unlabeled subclasses within data and to detect the causes of misclassifications. Furthermore, applying our method at various layers reveals distinct distributed representations across layers, providing deeper insight into the internal mechanisms of the deep learning model.
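The abstract's core idea, that instances sharing neuron activation states on a principal subset of neurons tend to share concepts, can be illustrated with a minimal sketch. This is not the authors' implementation; the sign-based binarization, the variance-based neuron selection heuristic, the function name, and all parameters are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def relaxed_decision_region(features, anchor_idx, n_principal=10, tolerance=0):
    """Illustrative sketch (assumed, not the paper's algorithm): collect
    instances whose activation *signs* agree with an anchor instance on a
    principal subset of neurons, allowing up to `tolerance` mismatches
    (the 'relaxed' part of the region)."""
    # Binarize activation states: 1 if a neuron is active (positive), else 0.
    states = (features > 0).astype(int)   # shape: (n_instances, n_neurons)
    anchor = states[anchor_idx]

    # Heuristic neuron selection (an assumption): prefer neurons whose
    # on/off state varies across the dataset, so matching them is informative.
    variability = states.var(axis=0)
    principal = np.argsort(-variability)[:n_principal]

    # Admit instances that disagree with the anchor on at most
    # `tolerance` of the principal neurons.
    mismatches = (states[:, principal] != anchor[principal]).sum(axis=1)
    return np.where(mismatches <= tolerance)[0]

# Usage: `features` would come from an intermediate layer of a classifier,
# e.g. penultimate-layer activations of shape (n_instances, n_neurons).
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))
region = relaxed_decision_region(feats, anchor_idx=0, n_principal=5, tolerance=1)
```

Instances inside such a region could then be inspected for the unlabeled subclasses or misclassification causes the abstract mentions; the anchor instance always belongs to its own region, since it matches itself on every principal neuron.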

