Simultaneously Achieving Group Exposure Fairness and Within-Group Meritocracy in Stochastic Bandits (2402.05575v1)

Published 8 Feb 2024 in cs.LG, cs.AI, cs.CY, and cs.MA

Abstract: Existing approaches to fairness in stochastic multi-armed bandits (MAB) primarily focus on exposure guarantees for individual arms. When arms are naturally grouped by certain attribute(s), we propose Bi-Level Fairness, which considers two levels of fairness. At the first level, Bi-Level Fairness guarantees a certain minimum exposure to each group. To address the unbalanced allocation of pulls to individual arms within a group, we consider meritocratic fairness at the second level, which ensures that each arm is pulled according to its merit within the group. Our work shows that a UCB-based algorithm can be adapted to achieve Bi-Level Fairness by providing (i) anytime Group Exposure Fairness guarantees and (ii) individual-level Meritocratic Fairness within each group. We first show that the regret bound can be decomposed into two components: (a) regret due to anytime group exposure fairness and (b) regret due to meritocratic fairness within each group. Our proposed algorithm, BF-UCB, balances these two regrets optimally to achieve an upper bound of $O(\sqrt{T})$ on regret, where $T$ is the stopping time. Using simulated experiments, we further show that BF-UCB achieves sub-linear regret, provides better group and individual exposure guarantees than existing algorithms, and does not incur a significant drop in reward relative to the UCB algorithm, which imposes no fairness constraint.
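
The two levels described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's BF-UCB, whose exact rules are not given here; it is a minimal illustration under stated assumptions. The function name `bilevel_fair_ucb_sketch`, the deficit-based group rule, the use of a clipped UCB index as the merit score, and the Bernoulli reward model are all hypothetical choices made for this example.

```python
import math
import random


def bilevel_fair_ucb_sketch(means, groups, min_share, horizon, seed=0):
    """Illustrative two-level fair bandit loop (an assumption, not BF-UCB).

    means     : true Bernoulli mean of each arm (used only to simulate rewards)
    groups    : list of lists of arm indices, one list per group
    min_share : minimum fraction of pulls promised to each group (sum <= 1)
    horizon   : number of rounds to simulate
    """
    rng = random.Random(seed)
    pulls = [0] * len(means)
    reward_sum = [0.0] * len(means)
    group_pulls = [0] * len(groups)

    def ucb(arm, t):
        # Standard UCB1-style index; unpulled arms are maximally optimistic.
        if pulls[arm] == 0:
            return float("inf")
        bonus = math.sqrt(2.0 * math.log(t) / pulls[arm])
        return reward_sum[arm] / pulls[arm] + bonus

    for t in range(1, horizon + 1):
        # Level 1 (group exposure): if some group is behind its promised
        # share min_share[g] * t, serve the group with the largest deficit.
        deficits = [min_share[g] * t - group_pulls[g] for g in range(len(groups))]
        g = max(range(len(groups)), key=lambda i: deficits[i])
        if deficits[g] <= 0:
            # No group is behind: pick the group holding the highest UCB index.
            g = max(range(len(groups)),
                    key=lambda i: max(ucb(a, t) for a in groups[i]))

        # Level 2 (within-group merit): sample an arm with probability
        # proportional to its UCB index clipped to (0, 1] (assumed merit proxy).
        weights = [min(1.0, max(1e-6, ucb(a, t))) for a in groups[g]]
        arm = rng.choices(groups[g], weights=weights, k=1)[0]

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += reward
        group_pulls[g] += 1

    return pulls, group_pulls


if __name__ == "__main__":
    arm_pulls, grp_pulls = bilevel_fair_ucb_sketch(
        means=[0.9, 0.8, 0.3, 0.2],
        groups=[[0, 1], [2, 3]],
        min_share=[0.3, 0.3],
        horizon=2000,
    )
    print("pulls per arm:  ", arm_pulls)
    print("pulls per group:", grp_pulls)
```

In this sketch the group rule is checked every round, so each group's pull count tracks its promised share throughout the run rather than only at the end, which is the sense of "anytime" used here. Sampling proportionally to a merit score, rather than always taking the best arm, is what keeps weaker arms within a group from being starved.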

Related Papers

Calibrated Fairness in Bandits (2017)
Fairness and Privacy Guarantees in Federated Contextual Bandits (2024)
Achieving User-Side Fairness in Contextual Bandits (2020)
Fairness of Exposure in Stochastic Bandits (2021)
Combinatorial Sleeping Bandits with Fairness Constraints (2019)