Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
60 tokens/sec
GPT-4o
12 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

SensLI: Sensitivity-Based Layer Insertion for Neural Networks (2311.15995v2)

Published 27 Nov 2023 in cs.LG and math.OC

Abstract: The training of neural networks requires tedious and often manual tuning of the network architecture. We propose a systematic approach to inserting new layers during the training process. Our method eliminates the need to choose a fixed network size before training, is numerically inexpensive to execute and applicable to various architectures including fully connected feedforward networks, ResNets and CNNs. Our technique borrows ideas from constrained optimization and is based on first-order sensitivity information of the loss function with respect to the virtual parameters that additional layers, if inserted, would offer. In numerical experiments, our proposed sensitivity-based layer insertion technique (SensLI) exhibits improved performance on training loss and test error, compared to training on a fixed architecture, and reduced computational effort in comparison to training the extended architecture from the beginning. Our code is available on https://github.com/mathemml/SensLI.

Summary

We haven't generated a summary for this paper yet.