Stochastic Multi-level Composition Optimization Algorithms with Level-Independent Convergence Rates (2008.10526v5)
Abstract: In this paper, we study smooth stochastic multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients, through a stochastic first-order oracle. For solving this class of problems, we propose two algorithms using moving-average stochastic estimates, and analyze their convergence to an $\epsilon$-stationary point of the problem. We show that the first algorithm, which is a generalization of \cite{GhaRuswan20} to the $T$ level case, can achieve a sample complexity of $\mathcal{O}(1/\epsilon6)$ by using mini-batches of samples in each iteration. By modifying this algorithm using linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}(1/\epsilon4)$. {\color{black}This modification not only removes the requirement of having a mini-batch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement}. To the best of our knowledge, this is the first time that such an online algorithm designed for the (un)constrained multi-level setting, obtains the same sample complexity of the smooth single-level setting, under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.
Collections
Sign up for free to add this paper to one or more collections.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.