Emergent Mind

Integrated Gradient Correlation: a Dataset-wise Attribution Method

(2404.13910)
Published Apr 22, 2024 in cs.LG and cs.AI

Abstract

Attribution methods are primarily designed to study the distribution of input component contributions to individual model predictions. However, some research applications require a summary of attribution patterns across the entire dataset to facilitate the interpretability of the scrutinized models. In this paper, we present a new method called Integrated Gradient Correlation (IGC) that relates dataset-wise attributions to a model prediction score and enables region-specific analysis by a direct summation over associated components. We demonstrate our method on scalar predictions with the study of image feature representation in the brain from fMRI neural signals and the estimation of neural population receptive fields (NSD dataset), as well as on categorical predictions with the investigation of handwritten digit recognition (MNIST dataset). The resulting IGC attributions show selective patterns, revealing underlying model strategies coherent with their respective objectives.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.