Residual Networks as Nonlinear Systems: Stability Analysis using Linearization

(1905.13386)
Published May 31, 2019 in cs.LG, cs.CV, and stat.ML

Abstract

We regard pre-trained residual networks (ResNets) as nonlinear systems and use linearization, a common method used in the qualitative analysis of nonlinear systems, to understand the behavior of the networks under small perturbations of the input images. We work with ResNet-56 and ResNet-110 trained on the CIFAR-10 data set. We linearize these networks at the level of residual units and network stages, and the singular value decomposition is used in the stability analysis of these components. It is found that most of the singular values of the linearizations of residual units are 1 and, in spite of the fact that the linearizations depend directly on the activation maps, the singular values differ only slightly for different input images. However, adjusting the scaling of the skip connection or the values of the weights in a residual unit has a significant impact on the singular value distributions. Inspection of how random and adversarial perturbations of input images propagate through the network reveals that there is a dramatic jump in the magnitude of adversarial perturbations towards the end of the final stage of the network that is not present in the case of random perturbations. We attempt to gain a better understanding of this phenomenon by projecting the perturbations onto singular vectors of the linearizations of the residual units.

Overview

  • This paper investigates the stability of pre-trained ResNets by considering them as nonlinear systems and applying linearization.

  • The analysis focuses on ResNet-56 and ResNet-110 models that have been trained on the CIFAR-10 dataset.

  • Using the Jacobian matrix and singular value decomposition, the study examines how the networks respond to small perturbations.

  • Results indicate that ResNets generally exhibit stability across layers, with singular values around 1, but adversarial perturbations increase towards the network's end.

  • The research highlights that network architecture plays a crucial role in stability and could inform the design of more robust networks against adversarial attacks.

Introduction

Researchers have long sought to understand why residual networks (ResNets) work so well, especially as systems that achieve high accuracy in tasks such as image classification. This paper takes a fresh approach by treating ResNets as nonlinear systems. Nonlinear systems are mathematical models that describe a wide range of physical phenomena, but their complexity often requires simplification before their behavior can be analyzed. Linearization, the approximation of a nonlinear function by a linear one around a specific point, is a common technique in such analyses. Here, the authors apply linearization to study the stability of pre-trained ResNets, specifically ResNet-56 and ResNet-110 models trained on the CIFAR-10 dataset.
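As a toy illustration (not from the paper), the first-order Taylor expansion behind linearization can be checked numerically. The function `f` and the finite-difference Jacobian below are illustrative assumptions, not the networks studied:

```python
import numpy as np

# A toy elementwise nonlinear map, standing in for a network layer.
def f(x):
    return np.tanh(x) + 0.5 * x**2

def jacobian_fd(func, x0, eps=1e-6):
    """Finite-difference estimate of the Jacobian of func at x0."""
    fx = func(x0)
    J = np.empty((fx.size, x0.size))
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        J[:, i] = (func(x0 + d) - fx) / eps
    return J

x0 = np.array([0.3, -0.7, 1.1])
J = jacobian_fd(f, x0)

# For a small perturbation delta, f(x0 + delta) ~ f(x0) + J @ delta.
delta = 1e-4 * np.random.default_rng(0).standard_normal(3)
exact = f(x0 + delta)
linear = f(x0) + J @ delta
err = np.max(np.abs(exact - linear))  # error is O(||delta||^2), so tiny
print(err)
```

The approximation error shrinks quadratically with the perturbation size, which is why linearization is informative precisely in the small-perturbation regime the paper studies.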

Theoretical Background

Linearization approximates how a small perturbation at the input of a nonlinear function propagates to its output, replacing the function locally with a linear map described by the Jacobian matrix evaluated at that input. By linearizing residual units (the building blocks of ResNets) and whole network stages, the researchers could infer how small perturbations, like those used in adversarial attacks, propagate through the network. The singular value decomposition (SVD) of these Jacobian matrices then reveals the extent to which perturbations can grow or shrink as they pass through successive components of the network.
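A minimal sketch of this analysis, using a hypothetical two-layer residual branch in NumPy rather than the paper's trained ResNet units. Because a residual unit computes y = x + g(x), its Jacobian is I plus the (here small) Jacobian of the branch g, so the singular values cluster near 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64

# Hypothetical residual branch g(x) = W2 @ relu(W1 @ x), with small weights.
W1 = 0.02 * rng.standard_normal((n, n))
W2 = 0.02 * rng.standard_normal((n, n))

def residual_unit(x):
    return x + W2 @ np.maximum(W1 @ x, 0.0)

def jacobian_fd(func, x0, eps=1e-6):
    """Finite-difference Jacobian of func at x0."""
    fx = func(x0)
    J = np.empty((fx.size, x0.size))
    for i in range(x0.size):
        d = np.zeros_like(x0)
        d[i] = eps
        J[:, i] = (func(x0 + d) - fx) / eps
    return J

x0 = rng.standard_normal(n)
J = jacobian_fd(residual_unit, x0)
s = np.linalg.svd(J, compute_uv=False)

# Since J = I + (small branch Jacobian), all singular values sit near 1,
# mirroring the paper's observation for trained residual units.
print(s.min(), s.max())
```

Scaling the skip connection or enlarging the branch weights shifts this distribution away from 1, which matches the paper's finding that such adjustments significantly change the singular value spectra.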

Findings

The paper uncovers several key insights. For the most part, the singular values of the Jacobians of the residual units were found to be around 1, with small variations across different input images, suggesting a general stability of the ResNet architecture. Additionally, the authors noted that the number of singular values greater than 1 tended to decrease from the initial to the terminal layers of the network, indicating a gradual stabilization as data moves through the network. However, adversarial perturbations showed a tendency to dramatically increase towards the end of the network, an observation that warrants future exploration into the robustness of ResNets to such inputs.
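The contrast between worst-case and random perturbations can be sketched with a hypothetical stack of linearized layers, using random matrices as stand-ins for the paper's actual Jacobians. A perturbation aligned with the top right singular vector of the end-to-end Jacobian is amplified by the largest singular value, while a random direction grows far less:

```python
import numpy as np

rng = np.random.default_rng(2)
n, depth = 32, 10

# Hypothetical linearized residual layers: each acts as J_k = I + A_k.
layers = [np.eye(n) + 0.1 * rng.standard_normal((n, n)) for _ in range(depth)]

def propagate(delta):
    """Apply each layer's Jacobian in turn, recording the perturbation norm."""
    norms = [np.linalg.norm(delta)]
    for J in layers:
        delta = J @ delta
        norms.append(np.linalg.norm(delta))
    return norms

# End-to-end Jacobian of the stack (note the reversed order of application).
J_total = np.linalg.multi_dot(layers[::-1])
_, s, Vt = np.linalg.svd(J_total)

worst = Vt[0]                          # worst-case input direction
random_dir = rng.standard_normal(n)
random_dir /= np.linalg.norm(random_dir)

# The worst-case unit perturbation exits with norm s[0]; a random unit
# perturbation exits much smaller, echoing the random-vs-adversarial gap.
print(propagate(random_dir)[-1], propagate(worst)[-1])
```

This is only a caricature: in the paper the growth is localized near the end of the final stage, which the per-layer norm trace from `propagate` would reveal for the real Jacobians.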

Implications and Future Work

While the study makes a significant contribution to our understanding of the behavior of pre-trained ResNets, it opens up several avenues for further research. One of its primary observations is that the stability properties of these networks hinge more on architectural design than on the input images themselves. This has significant implications for the design of future neural networks and their potential vulnerability to adversarial attacks. The linearization technique applied in this study offers a promising tool for analyzing the robustness and reliability of ResNets, with the goal of improving their design against adversarial examples.

In summary, this work shines a light on the underlying stability of pre-trained ResNets, offering novel insights into their behavior and setting the stage for more advanced studies that could lead to the development of more robust ML systems.
