Improved Regret Bounds for Online Submodular Maximization

(2106.07836)
Published Jun 15, 2021 in cs.LG, math.OC, and stat.ML

Abstract

In this paper, we consider an online optimization problem over $T$ rounds where at each step $t\in[T]$, the algorithm chooses an action $x_t$ from the fixed convex and compact domain set $\mathcal{K}$. A utility function $f_t(\cdot)$ is then revealed and the algorithm receives the payoff $f_t(x_t)$. This problem has been previously studied under the assumption that the utilities are adversarially chosen monotone DR-submodular functions, and $\mathcal{O}(\sqrt{T})$ regret bounds have been derived. We first characterize the class of strongly DR-submodular functions and then derive regret bounds for the following new online settings: $(1)$ $\{f_t\}_{t=1}^T$ are monotone strongly DR-submodular and chosen adversarially, $(2)$ $\{f_t\}_{t=1}^T$ are monotone submodular (while the average $\frac{1}{T}\sum_{t=1}^T f_t$ is strongly DR-submodular) and chosen by an adversary but arrive in a uniformly random order, $(3)$ $\{f_t\}_{t=1}^T$ are drawn i.i.d. from some unknown distribution $f_t\sim \mathcal{D}$ where the expected function $f(\cdot)=\mathbb{E}_{f_t\sim\mathcal{D}}[f_t(\cdot)]$ is monotone DR-submodular. For $(1)$, we obtain the first logarithmic regret bounds. For the second framework, we show that it is possible to obtain similar logarithmic bounds with high probability. Finally, for the i.i.d. model, we provide algorithms with a $\tilde{\mathcal{O}}(\sqrt{T})$ stochastic regret bound, both in expectation and with high probability. Experimental results demonstrate that our algorithms outperform the previous techniques in the aforementioned three settings.
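The protocol described in the abstract can be sketched on a toy instance. The sketch below is illustrative only and is not the paper's algorithm: the coverage-style function family $f_t(x) = 1 - \prod_i (1 - w_{t,i} x_i)$ (monotone DR-submodular on $[0,1]^n$), the fixed step size, and the projected-gradient update are all assumptions chosen for a minimal runnable example. For a monotone function over the box $[0,1]^n$, the all-ones point is a maximizer, which gives a simple hindsight benchmark for the $(1-1/e)$-regret.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 200
eta = 0.1  # illustrative fixed step size (the paper's settings use tuned schedules)

def draw_utility():
    """Draw a random monotone DR-submodular utility f_t on [0,1]^n.

    f_t(x) = 1 - prod_i (1 - w_i * x_i), with weights w_i in (0.3, 0.8).
    """
    w = rng.uniform(0.3, 0.8, size=n)
    def f(x):
        return 1.0 - np.prod(1.0 - w * x)
    def grad(x):
        # d/dx_i f(x) = w_i * prod_{j != i} (1 - w_j * x_j)
        full = np.prod(1.0 - w * x)
        return w * full / (1.0 - w * x)
    return f, grad

x = np.zeros(n)           # action in the domain K = [0,1]^n
payoff, benchmark = 0.0, 0.0
for t in range(T):
    f, grad = draw_utility()       # adversary/nature reveals f_t after x_t is played
    payoff += f(x)                 # algorithm receives payoff f_t(x_t)
    benchmark += f(np.ones(n))     # monotone => all-ones maximizes f_t over the box
    # projected online gradient ascent: clip projects back onto K = [0,1]^n
    x = np.clip(x + eta * grad(x), 0.0, 1.0)

# (1 - 1/e)-regret against the best fixed action in hindsight
regret = (1.0 - 1.0 / np.e) * benchmark - payoff
```

As the iterate climbs toward the all-ones point, the per-round payoff approaches the benchmark value, so the approximate regret grows sublinearly on this toy instance.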
