Emergent Mind

Convergence to minima for the continuous version of Backtracking Gradient Descent

(1911.04221)
Published Nov 11, 2019 in math.OC, cs.LG, cs.NA, math.NA, and stat.ML

Abstract

The main result of this paper is: {\bf Theorem.} Let $f:\mathbb{R}^k\rightarrow \mathbb{R}$ be a $C^1$ function, so that $\nabla f$ is locally Lipschitz continuous. Assume moreover that $f$ is $C^2$ near its generalised saddle points. Fix real numbers $\delta_0>0$ and $0<\alpha <1$. Then there is a smooth function $h:\mathbb{R}^k\rightarrow (0,\delta_0]$ so that the map $H:\mathbb{R}^k\rightarrow \mathbb{R}^k$ defined by $H(x)=x-h(x)\nabla f(x)$ has the following properties: (i) For all $x\in \mathbb{R}^k$, we have $f(H(x))-f(x)\leq -\alpha h(x)\|\nabla f(x)\|^2$. (ii) For every $x_0\in \mathbb{R}^k$, the sequence $x_{n+1}=H(x_n)$ either satisfies $\lim_{n\rightarrow\infty}\|x_{n+1}-x_n\|=0$ or $\lim_{n\rightarrow\infty}\|x_n\|=\infty$. Each cluster point of $\{x_n\}$ is a critical point of $f$. If moreover $f$ has at most countably many critical points, then $\{x_n\}$ either converges to a critical point of $f$ or $\lim_{n\rightarrow\infty}\|x_n\|=\infty$. (iii) There is a set $\mathcal{E}_1\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_1$, the sequence $x_{n+1}=H(x_n)$, {\bf if it converges}, cannot converge to a {\bf generalised} saddle point. (iv) There is a set $\mathcal{E}_2\subset \mathbb{R}^k$ of Lebesgue measure $0$ so that for all $x_0\in \mathbb{R}^k\backslash \mathcal{E}_2$, no cluster point of the sequence $x_{n+1}=H(x_n)$ is a saddle point, and more generally no cluster point is an isolated generalised saddle point. Some other results are also proven.
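The theorem concerns a variable step size $h(x)$ satisfying the Armijo-type descent condition in (i). The discrete backtracking scheme underlying it can be sketched as follows; this is a minimal illustration of the standard backtracking line search (shrink the step until the descent condition holds), not the paper's smooth construction of $h$. The function names and the quadratic test objective are illustrative choices, not from the paper.

```python
import numpy as np

def backtracking_gd(f, grad_f, x0, alpha=0.5, beta=0.5, delta0=1.0,
                    tol=1e-8, max_iter=10_000):
    """Gradient descent with backtracking line search.

    At each iterate x, the step size delta starts at delta0 and is shrunk
    geometrically (delta *= beta) until the Armijo-type condition
        f(x - delta * grad) - f(x) <= -alpha * delta * ||grad||^2
    from part (i) of the theorem holds, then x is updated.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:  # (numerical) critical point reached
            break
        delta = delta0
        # Shrink the step until the descent condition is satisfied.
        while f(x - delta * g) - f(x) > -alpha * delta * np.dot(g, g):
            delta *= beta
        x = x - delta * g
    return x

# Usage: minimize a simple quadratic f(x) = x1^2 + 4*x2^2 (minimum at 0).
x_min = backtracking_gd(lambda x: x[0]**2 + 4 * x[1]**2,
                        lambda x: np.array([2 * x[0], 8 * x[1]]),
                        x0=[3.0, -2.0])
```

For this convex quadratic the iterates converge to the unique critical point, consistent with conclusion (ii); the paper's contribution is that a single smooth $h$ can be chosen in advance so that the same guarantees (plus avoidance of saddle points for almost every start, parts (iii)-(iv)) hold without per-step searching.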
