Hedging in games: Faster convergence of external and swap regrets
(2006.04953)Abstract
We consider the setting where players run the Hedge algorithm or its optimistic variant to play an $n$-action game repeatedly for $T$ rounds. 1) For two-player games, we show that the regret of optimistic Hedge decays at $\tilde{O}( 1/T {5/6} )$, improving the previous bound $O(1/T{3/4})$ by Syrgkanis, Agarwal, Luo and Schapire (NIPS'15) 2) In contrast, we show that the convergence rate of vanilla Hedge is no better than $\tilde{\Omega}(1/ \sqrt{T})$, addressing an open question posted in Syrgkanis, Agarwal, Luo and Schapire (NIPS'15). For general m-player games, we show that the swap regret of each player decays at rate $\tilde{O}(m{1/2} (n/T){3/4})$ when they combine optimistic Hedge with the classical external-to-internal reduction of Blum and Mansour (JMLR'07). The algorithm can also be modified to achieve the same rate against itself and a rate of $\tilde{O}(\sqrt{n/T})$ against adversaries. Via standard connections, our upper bounds also imply faster convergence to coarse correlated equilibria in two-player games and to correlated equilibria in multiplayer games.
We're not able to analyze this paper right now due to high demand.
Please check back later (sorry!).
Generate a summary of this paper on our Pro plan:
We ran into a problem analyzing this paper.