A convergence result of a continuous model of deep learning via Łojasiewicz--Simon inequality (2311.15365v2)

Published 26 Nov 2023 in cs.LG, math.AP, math.FA, and math.PR

Abstract: This study focuses on a Wasserstein-type gradient flow, which represents an optimization process of a continuous model of a Deep Neural Network (DNN). First, we establish the existence of a minimizer for an average loss of the model under $L^2$-regularization. Subsequently, we show the existence of a curve of maximal slope of the loss. Our main result is the convergence of the flow to a critical point of the loss as time goes to infinity. An essential aspect of proving this result involves the establishment of the Łojasiewicz--Simon gradient inequality for the loss. We derive this inequality by assuming the analyticity of NNs and loss functions. Our proofs offer a new approach for analyzing the asymptotic behavior of Wasserstein-type gradient flows for nonconvex functionals.

