
Dimensions underlying the representational alignment of deep neural networks with humans

(arXiv:2406.19087)
Published Jun 27, 2024 in cs.CV, cs.AI, cs.LG, and q-bio.QM

Abstract

Determining the similarities and differences between humans and artificial intelligence is an important goal both in machine learning and cognitive neuroscience. However, similarities in representations only inform us about the degree of alignment, not the factors that determine it. Drawing upon recent developments in cognitive science, we propose a generic framework for yielding comparable representations in humans and deep neural networks (DNN). Applying this framework to humans and a DNN model of natural images revealed a low-dimensional DNN embedding of both visual and semantic dimensions. In contrast to humans, DNNs exhibited a clear dominance of visual over semantic features, indicating divergent strategies for representing images. While in-silico experiments showed seemingly-consistent interpretability of DNN dimensions, a direct comparison between human and DNN representations revealed substantial differences in how they process images. By making representations directly comparable, our results reveal important challenges for representational alignment, offering a means for improving their comparability.

Figure: Comparison of human- and DNN-derived representational embeddings with semantic, visual-semantic, and visual dimensions.

Overview

  • The paper presents a framework for comparing representational embeddings of humans and Deep Neural Networks (DNNs) using the same experimental paradigms, focusing on visual and semantic dimensions.

  • Through a series of experiments including similarity judgments for object images and various interpretability techniques, it was found that human embeddings are primarily semantic, while DNN embeddings show a strong visual bias.

  • The findings highlight the need for AI models that better balance semantic and visual information to achieve more human-like cognitive capabilities, suggesting future research directions to bridge this gap.

Dimensions Underlying the Representational Alignment of Deep Neural Networks with Humans

The paper "Dimensions underlying the representational alignment of deep neural networks with humans" by Florian P. Mahner et al. provides a systematic framework for comparing the representational embeddings of humans and Deep Neural Networks (DNNs) using the same experimental and modeling paradigms. This study explore the often-discussed topic of representational alignment and contributes a detailed examination of the visual and semantic dimensions that underlie image representations in both domains.

Overview of Methodology

To accomplish their objective, the authors collected similarity judgments for object images from both humans and a pretrained VGG-16 model. Human judgments were taken from the THINGS database, while the DNN judgments were generated synthetically from the model's responses to 24,102 images of the same objects. The paper employs a triplet odd-one-out task, in which the subject (human or DNN) selects the most dissimilar object from a set of three, to create low-dimensional embeddings that can be directly compared.
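To give a concrete sense of the DNN side of this paradigm, the sketch below simulates a triplet odd-one-out choice from a pretrained VGG-16: each image is mapped to a penultimate-layer activation vector, and the odd one out is the image left over after picking the most similar pair. The specific layer, preprocessing, and cosine similarity are illustrative assumptions; the paper builds a learned embedding on top of such model responses rather than using raw activations directly, and its exact pipeline differs.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained VGG-16, truncated before the final 1000-way classification layer.
# (Layer choice is an assumption for illustration, not the paper's exact setup.)
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
feature_extractor = torch.nn.Sequential(
    vgg.features,
    vgg.avgpool,
    torch.nn.Flatten(),
    *list(vgg.classifier.children())[:-1],  # keep the penultimate 4096-d layer
)

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(image_path: str) -> torch.Tensor:
    """Return the penultimate-layer activation vector for one image."""
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    return feature_extractor(img).squeeze(0)

def odd_one_out(paths: tuple[str, str, str]) -> int:
    """Triplet odd-one-out: find the most similar pair (cosine similarity)
    and return the index of the remaining, most dissimilar image."""
    feats = [torch.nn.functional.normalize(embed(p), dim=0) for p in paths]
    sims = {(i, j): float(feats[i] @ feats[j])
            for i in range(3) for j in range(i + 1, 3)}
    i, j = max(sims, key=sims.get)        # most similar pair
    return ({0, 1, 2} - {i, j}).pop()     # the odd one out

# Hypothetical usage:
# choice = odd_one_out(("aardvark.jpg", "anteater.jpg", "accordion.jpg"))
```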

Key Findings

  1. Representational Embeddings:

    • Both human and DNN embeddings were optimized to predict the odd-one-out choices, yielding 70 stable dimensions for the DNN and 68 for humans.
    • Human embeddings captured 90.85% of the explainable variance, whereas VGG-16 embeddings captured 84.03%.
  2. Visual versus Semantic Bias:

    • The DNN showed a striking visual bias with many dimensions reflecting visual-perceptual properties.
    • Human embeddings were primarily dominated by semantic dimensions.
    • This contrast was quantitatively verified by expert ratings, which showed a higher proportion of semantic dimensions in human embeddings compared to DNNs.
  3. Interpretability and Causality:

    • Various interpretability techniques such as Grad-CAM, activation maximization with StyleGAN, and causal image manipulations were used to identify and verify the contributory image features.
    • Grad-CAM heatmaps and dimension-specific synthetic images suggested seemingly high interpretability of the DNN dimensions.
  4. Comparative Analysis:

    • Representational similarity analysis (RSA) indicated moderate global alignment between the human and DNN representational similarity matrices (RSMs), with a Pearson correlation of 0.55.
    • Even dimensions that correlated highly across the two embeddings differed in interpretation, with the apparent alignment driven by different underlying image features (see the sketch after this list).
  5. Behavioral Implications:

    • Analyzing the relevance of individual dimensions for the odd-one-out task showed that humans prioritize semantic properties, while the DNN prioritizes visual features.
    • Jackknife resampling confirmed this behavioral divergence, pointing to differing categorization strategies in humans and the DNN.
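To make findings 1 and 4 above concrete, the following sketch fits a sparse, non-negative embedding from triplet odd-one-out choices using a softmax choice model, and then compares two such embeddings via the Pearson correlation of their RSMs. This is a simplified stand-in under stated assumptions, not the paper's method: the authors' optimization handles sparsity, positivity, and the selection of stable dimensions differently, and their RSMs are derived from the triplet choice model rather than raw dot products.

```python
import torch

def triplet_log_likelihood(emb: torch.Tensor, triplets: torch.Tensor) -> torch.Tensor:
    """Softmax choice model for triplets (i, j, k) in which k was chosen as the
    odd one out, i.e. (i, j) was the most similar pair.
    emb: (n_objects, n_dims) non-negative embedding; triplets: (n, 3) indices."""
    i, j, k = triplets[:, 0], triplets[:, 1], triplets[:, 2]
    s_ij = (emb[i] * emb[j]).sum(-1)   # dot-product similarities
    s_ik = (emb[i] * emb[k]).sum(-1)
    s_jk = (emb[j] * emb[k]).sum(-1)
    logits = torch.stack([s_ij, s_ik, s_jk], dim=-1)
    return torch.log_softmax(logits, dim=-1)[:, 0]   # log P[(i, j) most similar]

def fit_embedding(triplets: torch.Tensor, n_objects: int, n_dims: int = 70,
                  l1: float = 1e-3, steps: int = 2000, lr: float = 1e-2) -> torch.Tensor:
    """Fit a sparse, non-negative embedding from odd-one-out choices.
    Hyperparameters here are illustrative placeholders."""
    w = torch.randn(n_objects, n_dims, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        emb = torch.relu(w)                          # enforce non-negativity
        nll = -triplet_log_likelihood(emb, triplets).mean()
        loss = nll + l1 * emb.abs().mean()           # sparsity-encouraging penalty
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.relu(w).detach()

def rsa_pearson(emb_a: torch.Tensor, emb_b: torch.Tensor) -> float:
    """Pearson correlation between the upper triangles of two dot-product RSMs."""
    rsm_a, rsm_b = emb_a @ emb_a.T, emb_b @ emb_b.T
    iu = torch.triu_indices(rsm_a.shape[0], rsm_a.shape[1], offset=1)
    a, b = rsm_a[iu[0], iu[1]], rsm_b[iu[0], iu[1]]
    a, b = a - a.mean(), b - b.mean()
    return float((a @ b) / (a.norm() * b.norm()))

# Hypothetical usage: triplets_human / triplets_dnn are (n, 3) index tensors
# over the same set of objects (1,854 concepts in THINGS).
# emb_human = fit_embedding(triplets_human, n_objects=1854)
# emb_dnn = fit_embedding(triplets_dnn, n_objects=1854)
# print(rsa_pearson(emb_human, emb_dnn))
```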

Implications and Future Directions

Practical Implications: The detailed comparison of representational embeddings suggests that while DNNs can approximate human-like semantic dimensions, they predominantly rely on visual features. This has direct implications for developing AI systems aimed at human-like cognitive tasks, for example in medical imaging, autonomous driving, and interactive AI, and it argues for models that better balance semantic and visual information.

Theoretical Implications: From a theoretical perspective, these findings challenge the notion that DNNs naturally align with human cognition simply by increasing the complexity of visual tasks. The observed visual bias suggests that additional explicit training on semantic associations may be necessary to achieve closer representational fidelity to human cognition.

Future Developments: Future work could explore various DNN architectures and training paradigms to see if and how they impact the representational alignment with humans. The integration of higher-level cognitive tasks and richer semantic datasets might help bridge the visual-semantic gap observed. Furthermore, expanding the framework to different sensory modalities or complex stimuli can provide a holistic understanding of human and AI representational capabilities.

Conclusion

The presented framework and findings by Mahner et al. substantially advance our understanding of the dimensions underlying representational alignment between humans and DNNs. While showing a reasonable degree of global alignment, the deep dive into visual and semantic dimensions uncovers divergent strategies, emphasizing the need for new approaches in AI development to achieve more human-like cognitive models. The study sets the stage for further exploration into optimization and interpretability techniques that can better align DNN representations with human cognition.
