Depth with Nonlinearity Creates No Bad Local Minima in ResNets (1810.09038v3)

Published 21 Oct 2018 in stat.ML, cs.AI, cs.LG, and math.OC

Abstract: In this paper, we prove that depth with nonlinearity creates no bad local minima in a type of arbitrarily deep ResNets with arbitrary nonlinear activation functions, in the sense that the values of all local minima are no worse than the global minimum value of corresponding classical machine-learning models, and are guaranteed to further improve via residual representations. As a result, this paper provides an affirmative answer to an open question stated in a paper presented at the Conference on Neural Information Processing Systems 2018. This paper advances the optimization theory of deep learning only for ResNets and not for other network architectures.

Citations (61)

Related Papers

Deep Learning without Poor Local Minima (2016)
Depth Creates No Bad Local Minima (2017)
Are deep ResNets provably better than linear predictors? (2019)
Effect of Depth and Width on Local Minima in Deep Learning (2018)
Are ResNets Provably Better than Linear Predictors? (2018)