Residual Networks as Nonlinear Systems: Stability Analysis using Linearization

(1905.13386)
Published May 31, 2019 in cs.LG, cs.CV, and stat.ML

Abstract

We regard pre-trained residual networks (ResNets) as nonlinear systems and use linearization, a common method used in the qualitative analysis of nonlinear systems, to understand the behavior of the networks under small perturbations of the input images. We work with ResNet-56 and ResNet-110 trained on the CIFAR-10 data set. We linearize these networks at the level of residual units and network stages, and the singular value decomposition is used in the stability analysis of these components. It is found that most of the singular values of the linearizations of residual units are 1 and, in spite of the fact that the linearizations depend directly on the activation maps, the singular values differ only slightly for different input images. However, adjusting the scaling of the skip connection or the values of the weights in a residual unit has a significant impact on the singular value distributions. Inspection of how random and adversarial perturbations of input images propagate through the network reveals that there is a dramatic jump in the magnitude of adversarial perturbations towards the end of the final stage of the network that is not present in the case of random perturbations. We attempt to gain a better understanding of this phenomenon by projecting the perturbations onto singular vectors of the linearizations of the residual units.

Overview

  • This paper investigates the stability of pre-trained ResNets by considering them as nonlinear systems and applying linearization.

  • The analysis focuses on ResNet-56 and ResNet-110 models that have been trained on the CIFAR-10 dataset.

  • Using the Jacobian matrix and singular value decomposition, the study examines how the networks respond to small perturbations.

  • Results indicate that ResNets generally exhibit stability across layers, with singular values around 1, but adversarial perturbations increase towards the network's end.

  • The research highlights that network architecture plays a crucial role in stability and could inform the design of more robust networks against adversarial attacks.

Introduction

Researchers have long sought to understand why residual networks (ResNets) work so well, especially as systems that achieve high accuracy in tasks such as image classification. This paper takes a fresh approach by treating ResNets as nonlinear systems. Nonlinear systems are mathematical models that describe a wide range of physical phenomena, but their complexity often requires simplification before their behavior can be analyzed. Linearization, the approximation of a nonlinear function by a linear one around a specific point, is a common technique in such analyses. Here, the authors apply linearization to study the stability of pre-trained ResNets, specifically ResNet-56 and ResNet-110 models trained on the CIFAR-10 dataset.
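As a toy illustration (not from the paper), the first-order Taylor expansion behind linearization can be checked numerically. The function `f` and the finite-difference Jacobian below are illustrative assumptions, not the networks studied:

```python
import numpy as np

# A toy elementwise nonlinear map, standing in for a network layer.
def f(x):
    return np.tanh(x) + 0.5 * x**2

def jacobian_fd(func, x0, eps=1e-6):
    """Finite-difference estimate of the Jacobian of func at x0."""
    fx = func(x0)
    J = np.empty((fx.size, x0.size))
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        J[:, i] = (func(x0 + d) - fx) / eps
    return J

x0 = np.array([0.3, -0.7, 1.1])
J = jacobian_fd(f, x0)

# For a small perturbation delta, f(x0 + delta) ~ f(x0) + J @ delta.
delta = 1e-4 * np.random.default_rng(0).standard_normal(3)
exact = f(x0 + delta)
linear = f(x0) + J @ delta
err = np.max(np.abs(exact - linear))  # error is O(||delta||^2), so tiny
print(err)
```

The approximation error shrinks quadratically with the perturbation size, which is why linearization is informative precisely in the small-perturbation regime the paper studies.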

Theoretical Background

Linearization approximates how a small perturbation at the input of a nonlinear function propagates to its output, replacing the function locally with a linear map described by the Jacobian matrix evaluated at that input. By linearizing residual units (the building blocks of ResNets) and whole network stages, the researchers could infer how small perturbations, like those used in adversarial attacks, propagate through the network. The singular value decomposition (SVD) of these Jacobian matrices then reveals the extent to which perturbations can grow or shrink as they pass through successive components of the network.
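A minimal sketch of this analysis, using a hypothetical two-layer residual branch in NumPy rather than the paper's trained ResNet units. Because a residual unit computes y = x + g(x), its Jacobian is I plus the (here small) Jacobian of the branch g, so the singular values cluster near 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64

# Hypothetical residual branch g(x) = W2 @ relu(W1 @ x), with small weights.
W1 = 0.02 * rng.standard_normal((n, n))
W2 = 0.02 * rng.standard_normal((n, n))

def residual_unit(x):
    return x + W2 @ np.maximum(W1 @ x, 0.0)

def jacobian_fd(func, x0, eps=1e-6):
    """Finite-difference Jacobian of func at x0."""
    fx = func(x0)
    J = np.empty((fx.size, x0.size))
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        J[:, i] = (func(x0 + d) - fx) / eps
    return J

x0 = rng.standard_normal(n)
J = jacobian_fd(residual_unit, x0)
s = np.linalg.svd(J, compute_uv=False)

# Since J = I + (small branch Jacobian), all singular values sit near 1,
# mirroring the paper's observation for trained residual units.
print(s.min(), s.max())
```

Scaling the skip connection or enlarging the branch weights shifts this distribution away from 1, which matches the paper's finding that such adjustments significantly change the singular value spectra.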

Findings

The paper uncovers several key insights. For the most part, the singular values of the Jacobians of the residual units were found to be around 1, with small variations across different input images, suggesting a general stability of the ResNet architecture. Additionally, the authors noted that the number of singular values greater than 1 tended to decrease from the initial to the terminal layers of the network, indicating a gradual stabilization as data moves through the network. However, adversarial perturbations showed a tendency to dramatically increase towards the end of the network, an observation that warrants future exploration into the robustness of ResNets to such inputs.
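The contrast between worst-case and random perturbations can be sketched with a hypothetical stack of linearized layers, using random matrices as stand-ins for the paper's actual Jacobians. A perturbation aligned with the top right singular vector of the end-to-end Jacobian is amplified by the largest singular value, while a random direction grows far less:

```python
import numpy as np

rng = np.random.default_rng(2)
n, depth = 32, 10

# Hypothetical linearized residual layers: each acts as J_k = I + A_k.
layers = [np.eye(n) + 0.1 * rng.standard_normal((n, n)) for _ in range(depth)]

def propagate(delta):
    """Apply each layer's Jacobian in turn, recording the perturbation norm."""
    norms = [np.linalg.norm(delta)]
    for J in layers:
        delta = J @ delta
        norms.append(np.linalg.norm(delta))
    return norms

# End-to-end Jacobian of the stack (note the reversed order of application).
J_total = np.linalg.multi_dot(layers[::-1])
_, s, Vt = np.linalg.svd(J_total)

worst = Vt[0]                          # worst-case input direction
random_dir = rng.standard_normal(n)
random_dir /= np.linalg.norm(random_dir)

# The worst-case unit perturbation exits with norm s[0]; a random unit
# perturbation exits much smaller, echoing the random-vs-adversarial gap.
print(propagate(random_dir)[-1], propagate(worst)[-1])
```

This is only a caricature: in the paper the growth is localized near the end of the final stage, which the per-layer norm trace from `propagate` would reveal for the real Jacobians.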

Implications and Future Work

While the study makes a significant contribution to our understanding of the behavior of pre-trained ResNets, it opens up several avenues for further research. One of its primary observations is that the stability properties of these networks hinge more on architectural design than on the input images themselves. This has significant implications for the design of future neural networks and their potential vulnerability to adversarial attacks. The linearization technique applied in this study offers a promising tool for analyzing the robustness and reliability of ResNets, with the goal of improving their design against adversarial examples.

In summary, this work shines a light on the underlying stability of pre-trained ResNets, offering novel insights into their behavior and setting the stage for more advanced studies that could lead to the development of more robust ML systems.
