Robustness to Transformations Across Categories: Is Robustness To Transformations Driven by Invariant Neural Representations? (2007.00112v4)
Abstract: Deep Convolutional Neural Networks (DCNNs) have demonstrated impressive robustness in recognizing objects under transformations (e.g., blur or noise) when these transformations are included in the training set. One hypothesis to explain such robustness is that DCNNs develop invariant neural representations that remain unaltered when the image is transformed. However, to what extent this hypothesis holds true is an outstanding question, as robustness to transformations could be achieved by properties other than invariance; for example, parts of the network could be specialized to recognize either transformed or non-transformed images. This paper investigates the conditions under which invariant neural representations emerge, leveraging the fact that such representations facilitate robustness to transformations beyond the training distribution. Concretely, we analyze a training paradigm in which only some object categories are seen transformed during training, and we evaluate whether the DCNN is robust to transformations across categories not seen transformed. Our results with state-of-the-art DCNNs indicate that invariant neural representations do not always drive robustness to transformations: networks show robustness for categories seen transformed during training even in the absence of invariant neural representations. Invariance emerges only as the number of transformed categories in the training set increases. This phenomenon is far more prominent with local transformations, such as blurring and high-pass filtering, than with geometric transformations, such as rotation and thinning, which entail changes in the spatial arrangement of the object. Our results contribute to a better understanding of invariant neural representations in deep learning and the conditions under which they spontaneously emerge.
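To make the training paradigm concrete, the sketch below illustrates one way the partial-transformation split and an invariance probe could be implemented. This is a minimal sketch, not the paper's exact protocol: the class counts, the Gaussian-blur parameters, the 50% mixing rate, and the cosine-similarity invariance measure are all illustrative assumptions.

```python
import random
import torch
import torch.nn.functional as F
from torchvision import transforms

# --- Training paradigm sketch (assumed parameters, not the paper's exact setup) ---
# Only categories in `transformed_classes` are ever seen transformed during
# training; all remaining categories are always presented untransformed.
NUM_CLASSES = 100       # illustrative; the paper's class count may differ
NUM_TRANSFORMED = 50    # varied in the paper's experiments
transformed_classes = set(range(NUM_TRANSFORMED))

# One example of a "local" transformation; kernel size and sigma are assumptions.
blur = transforms.GaussianBlur(kernel_size=9, sigma=2.0)

def training_view(image: torch.Tensor, label: int) -> torch.Tensor:
    """Return the training view under the partial-transformation paradigm:
    transform only the designated categories (here, half the time, so those
    categories are seen both clean and transformed)."""
    if label in transformed_classes and random.random() < 0.5:
        return blur(image)
    return image

# --- Invariance probe sketch ---
def invariance_score(model: torch.nn.Module, images: torch.Tensor) -> torch.Tensor:
    """One common way to quantify invariance: cosine similarity between the
    model's responses to clean and transformed versions of the same images
    (a score of 1.0 would mean the representation is unaltered by the
    transformation). Here the model's output features stand in for whatever
    intermediate layer one chooses to probe."""
    with torch.no_grad():
        clean_feats = model(images)
        blurred_feats = model(blur(images))
    return F.cosine_similarity(clean_feats, blurred_feats, dim=1).mean()
```

Under this paradigm, at evaluation time the transformation is applied to every category, including those never seen transformed; comparing accuracy on those held-out categories with the invariance score makes it possible to separate robustness driven by invariant representations from robustness achieved through other mechanisms.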