On Learning the $cμ$ Rule in Single and Parallel Server Networks
(1802.06723)Abstract
We consider learning-based variants of the $c \mu$ rule for scheduling in single and parallel server settings of multi-class queueing systems. In the single server setting, the $c \mu$ rule is known to minimize the expected holding-cost (weighted queue-lengths summed over classes and a fixed time horizon). We focus on the problem where the service rates $\mu$ are unknown with the holding-cost regret (regret against the $c \mu$ rule with known $\mu$) as our objective. We show that the greedy algorithm that uses empirically learned service rates results in a constant holding-cost regret (the regret is independent of the time horizon). This free exploration can be explained in the single server setting by the fact that any work-conserving policy obtains the same number of samples in a busy cycle. In the parallel server setting, we show that the $c \mu$ rule may result in unstable queues, even for arrival rates within the capacity region. We then present sufficient conditions for geometric ergodicity under the $c \mu$ rule. Using these results, we propose an almost greedy algorithm that explores only when the number of samples falls below a threshold. We show that this algorithm delivers constant holding-cost regret because a free exploration condition is eventually satisfied.
We're not able to analyze this paper right now due to high demand.
Please check back later (sorry!).
Generate a summary of this paper on our Pro plan:
We ran into a problem analyzing this paper.