The Computational Power of Optimization in Online Learning
(1504.02089)Abstract
We consider the fundamental problem of prediction with expert advice where the experts are "optimizable": there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to $N$ experts in total $\widetilde{O}(\sqrt{N})$ computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is $\widetilde{\Theta}(N)$. These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oraclei.e., an efficient empirical risk minimizerallows to learn a finite hypothesis class of size $N$ in time $O(\log{N})$. We also study the implications of our results to learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best-response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is $\widetilde{\Theta}(\sqrt{N})$, yielding again a quadratic improvement upon the oracle-free setting, where $\widetilde{\Theta}(N)$ is known to be tight.
We're not able to analyze this paper right now due to high demand.
Please check back later (sorry!).
Generate a summary of this paper on our Pro plan:
We ran into a problem analyzing this paper.