Emergent Mind

Abstract

We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of $\mathcal{O}\Big(U \sqrt{T \log(U \sqrt{T} \log2 T +1)}\Big)$, where $U$ is the $L_2$ norm of an arbitrary comparator and both $T$ and $U$ are unknown to the player. This bound is optimal up to $\sqrt{\log \log T}$ terms. When $T$ is known, we derive an algorithm with an optimal regret bound (up to constant factors). For both the known and unknown $T$ case, a Normal approximation to the conditional value of the game proves to be the key analysis tool.

We're not able to analyze this paper right now due to high demand.

Please check back later (sorry!).

Generate a summary of this paper on our Pro plan:

We ran into a problem analyzing this paper.

Newsletter

Get summaries of trending comp sci papers delivered straight to your inbox:

Unsubscribe anytime.