
RMNet: Equivalently Removing Residual Connection from Networks (2111.00687v1)

Published 1 Nov 2021 in cs.CV and cs.LG

Abstract: Although residual connection enables training very deep neural networks, it is not friendly for online inference due to its multi-branch topology. This encourages many researchers to work on designing DNNs without residual connections at inference. For example, RepVGG re-parameterizes multi-branch topology to a VGG-like (single-branch) model when deploying, showing great performance when the network is relatively shallow. However, RepVGG can not transform ResNet to VGG equivalently because re-parameterizing methods can only be applied to linear blocks and the non-linear layers (ReLU) have to be put outside of the residual connection which results in limited representation ability, especially for deeper networks. In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on ResBlock. Specifically, the RM operation allows input feature maps to pass through the block while reserving their information and merges all the information at the end of each block, which can remove residual connections without changing the original output. As a plug-in method, RM Operation basically has three advantages: 1) its implementation makes it naturally friendly for high ratio network pruning. 2) it helps break the depth limitation of RepVGG. 3) it leads to better accuracy-speed trade-off network (RMNet) compared to ResNet and RepVGG. We believe the ideology of RM Operation can inspire many insights on model design for the community in the future. Code is available at: https://github.com/fxmeng/RMNet.

Citations (10)

Summary

  • The paper introduces an RM operation that substitutes residual connections without altering network outputs by using reserving and merging steps.
  • It demonstrates improved accuracy-speed trade-offs and enables training deeper networks compared to conventional ResNet and RepVGG models.
  • The RMNet approach optimizes neural architectures for real-time applications by reducing computational load during inference.

An Overview of RMNet: Equivalently Removing Residual Connections from Networks

The paper "RMNet: Equivalently Removing Residual Connection from Networks" presents a novel approach aimed at re-examining the necessity of residual connections in deep neural networks, predominantly in architectures like ResNet. The authors propose RMNet, a method premised on the removal of residual connections without altering the network's output by employing a combination of reserving and merging (RM) operations within the ResBlock. This approach is motivated by the observation that residual connections, while beneficial for training very deep networks, can be computationally intensive during inference due to their multi-branch topologies.

Problem Statement and Motivation

Residual connections, popularized by ResNet, have been instrumental in addressing network training challenges such as gradient vanishing. However, their deployment can consume significant memory and computational resources during inference, impacting their suitability for real-time applications. Preceding methods like RepVGG attempt to alleviate these issues by converting multi-branch networks to single-branch configurations at inference time but struggle with deeper networks due to limitations in handling non-linear layers.

Proposed RM Operation

The RM operation is the core innovation of this paper, enabling the effective removal of residual connections while preserving the network's output integrity. This operation involves two key processes:

  1. Reserving: Additional Dirac-initialized filters in the first convolution layer let the input feature maps pass through the block unchanged, so the original channel information is preserved alongside the block's own features.
  2. Merging: Corresponding identity-initialized input channels in the second convolution layer add the reserved input features back to the output at the end of the block, reproducing the effect of the residual addition.

This conversion removes the residual connection without changing the block's original output, so the resulting plain network is functionally equivalent to the residual one.
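
To make the conversion concrete, below is a minimal PyTorch sketch of the reserving-and-merging idea on a simplified residual block (stride 1, equal channel counts, no BatchNorm). The helper name `rm_convert` and the exact initialization steps are illustrative assumptions, not the paper's reference implementation, which also folds BatchNorm and handles down-sampling blocks; see the official code at https://github.com/fxmeng/RMNet for the full method.

```python
import torch
import torch.nn as nn
import torch.nn.init as init

def rm_convert(conv1: nn.Conv2d, conv2: nn.Conv2d):
    """Sketch of the RM idea: widen conv1 to reserve the block input and
    widen conv2 to merge it back, so the residual add can be dropped."""
    c = conv1.in_channels

    # Reserving: extra output channels in conv1 are Dirac-initialized so they
    # simply copy the input feature maps through the block.
    conv1_rm = nn.Conv2d(c, 2 * c, kernel_size=3, padding=1)
    conv1_rm.weight.data.zero_()
    conv1_rm.bias.data.zero_()
    conv1_rm.weight.data[:c] = conv1.weight.data
    conv1_rm.bias.data[:c] = conv1.bias.data
    init.dirac_(conv1_rm.weight.data[c:])        # identity on the reserved half

    # Merging: extra input channels in conv2 are identity-initialized so the
    # reserved features are added to the output, replacing the shortcut.
    conv2_rm = nn.Conv2d(2 * c, c, kernel_size=3, padding=1)
    conv2_rm.weight.data.zero_()
    conv2_rm.weight.data[:, :c] = conv2.weight.data
    conv2_rm.bias.data = conv2.bias.data.clone()
    init.dirac_(conv2_rm.weight.data[:, c:])     # adds the reserved half back
    return conv1_rm, conv2_rm

# Quick numerical check. The block input is assumed non-negative (as it would
# be after a preceding ReLU), so the intermediate ReLU leaves the reserved
# channels unchanged.
c = 8
conv1 = nn.Conv2d(c, c, kernel_size=3, padding=1)
conv2 = nn.Conv2d(c, c, kernel_size=3, padding=1)
x = torch.rand(1, c, 16, 16)                                # non-negative input
y_res = torch.relu(conv2(torch.relu(conv1(x))) + x)         # original ResBlock
conv1_rm, conv2_rm = rm_convert(conv1, conv2)
y_rm = torch.relu(conv2_rm(torch.relu(conv1_rm(x))))        # plain RM block
print(torch.allclose(y_res, y_rm, atol=1e-5))               # expected: True
```

Note that the equivalence in this sketch relies on the block input being non-negative (the output of a preceding ReLU), which is why the intermediate ReLU does not disturb the reserved channels; after conversion, the two wider convolutions form a plain single-branch block that can be pruned or fused further.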

Implementation and Advantages

The implementation of RMNet as a plug-in method provides several benefits:

  • Its plain, single-branch structure is naturally compatible with high-ratio network pruning, allowing more efficient model deployment and execution.
  • It breaks the depth limitation faced by RepVGG, enabling deeper networks to be trained and deployed without residual connections.
  • It offers a better accuracy-speed trade-off than ResNet and RepVGG, as validated empirically.

Evaluation and Results

The paper presents a comprehensive evaluation across CIFAR-10, CIFAR-100, and ImageNet datasets. Notably, RMNet shows improved speed without sacrificing accuracy relative to both conventional ResNet and RepVGG models. The results underline that RMNet can harness the advantages of deeper network architectures without the performance degradation typically associated with models lacking residual connections.

Implications and Future Directions

The RMNet methodology holds significant implications for the design of efficient neural network architectures, particularly in scenarios where real-time inference is critical. By demonstrating the feasibility of eliminating residual connections in a manner that maintains network performance, RMNet opens new avenues in model design whereby computational efficiency and memory usage are prioritized.

Future research might extend the RM operation to architectures beyond ResNet and MobileNetV2, such as transformers and models used in other domains. Additionally, integrating Neural Architecture Search (NAS) to automate the design of optimized RMNet variants may further enhance efficiency and performance.

In conclusion, the RMNet approach presents a significant stride in the evolution of neural network design, balancing performance with computational efficiency, and setting a precedent for future innovations in deep learning model optimization.