Papers
Topics
Authors
Recent
2000 character limit reached

Fair Multi-Agent Bandits (2306.04498v2)

Published 7 Jun 2023 in cs.LG, cs.CY, and cs.DC

Abstract: In this paper, we study the problem of fair multi-agent multi-arm bandit learning when agents do not communicate with each other, except collision information, provided to agents accessing the same arm simultaneously. We provide an algorithm with regret $O\left(N3 \log \frac{B}{\Delta} f(\log T) \log T \right)$ (assuming bounded rewards, with unknown bound), where $f(t)$ is any function diverging to infinity with $t$. This significantly improves previous results which had the same upper bound on the regret of order $O(f(\log T) \log T )$ but an exponential dependence on the number of agents. The result is attained by using a distributed auction algorithm to learn the sample-optimal matching and a novel order-statistics-based regret analysis. Simulation results present the dependence of the regret on $\log T$.

Citations (1)

Summary

We haven't generated a summary for this paper yet.

Slide Deck Streamline Icon: https://streamlinehq.com

Whiteboard

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (1)

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.