Concept backpropagation: An Explainable AI approach for visualising learned concepts in neural network models (2307.12601v1)

Published 24 Jul 2023 in cs.LG

Abstract: Neural network models are widely used in a variety of domains, often as black-box solutions, since they are not directly interpretable for humans. The field of explainable artificial intelligence aims to develop explanation methods that address this challenge, and several approaches have emerged in recent years, including methods for investigating what type of knowledge these models internalise during the training process. Among these, the method of concept detection investigates which \emph{concepts} neural network models learn to represent in order to complete their tasks. In this work, we present an extension to the method of concept detection, named \emph{concept backpropagation}, which provides a way of analysing how the information representing a given concept is internalised in a given neural network model. In this approach, the model input is perturbed in a manner guided by a trained concept probe for that model, such that the concept of interest is maximised. This allows for the visualisation of the detected concept directly in the input space of the model, which in turn makes it possible to see what information the model depends on for representing the described concept. We present results for this method applied to a variety of input modalities, and discuss how our proposed method can be used to visualise what information trained concept probes use, and the degree to which the representation of the probed concept is entangled within the neural network model itself.
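
The abstract describes an optimisation over the model input: perturb the input, guided by a trained concept probe, until the probe's concept score is maximised. The following is a minimal sketch of that idea in PyTorch, assuming a model split into a feature extractor and a pre-trained concept probe over its internal activations; the names (feature_extractor, concept_probe) and the L2 regulariser are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of concept backpropagation: gradient ascent on the
# input to maximise a trained concept probe's output. Assumes
# feature_extractor and concept_probe are given PyTorch modules.
import torch

def concept_backpropagation(x0, feature_extractor, concept_probe,
                            steps=100, lr=0.05, reg=1e-3):
    """Perturb input x0 so that the probe's concept score is maximised."""
    x = x0.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        features = feature_extractor(x)          # internal activations
        concept_score = concept_probe(features)  # probe's concept output
        # Maximise the concept score while penalising large perturbations,
        # so the result stays close to the original input.
        loss = -concept_score.mean() + reg * (x - x0).pow(2).sum()
        loss.backward()
        optimizer.step()
    return x.detach()

# The difference (x - x0) can then be visualised in input space (e.g. as a
# heatmap for images) to show what information the probe relies on.
```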
