Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits (2002.00287v3)

Published 1 Feb 2020 in cs.LG and stat.ML

Abstract: We consider an adversarial variant of the classic $K$-armed linear contextual bandit problem where the sequence of loss functions associated with each arm are allowed to change without restriction over time. Under the assumption that the $d$-dimensional contexts are generated i.i.d.~at random from a known distributions, we develop computationally efficient algorithms based on the classic Exp3 algorithm. Our first algorithm, RealLinExp3, is shown to achieve a regret guarantee of $\widetilde{O}(\sqrt{KdT})$ over $T$ rounds, which matches the best available bound for this problem. Our second algorithm, RobustLinExp3, is shown to be robust to misspecification, in that it achieves a regret bound of $\widetilde{O}((Kd)^{{1/3}T^{2/3})} + \varepsilon \sqrt{d} T$ if the true reward function is linear up to an additive nonlinear error uniformly bounded in absolute value by $\varepsilon$. To our knowledge, our performance guarantees constitute the very first results on this problem setting.

Citations (42)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits (2002.00287v3)

Summary

Related Papers