Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits
By: Tianyi Xu, Jiaxin Liu, Nicholas Mattei and more
Potential Business Impact:
Fairly shares rewards, making systems work better.
We propose a multi-agent multi-armed bandit (MA-MAB) framework aimed at ensuring fair outcomes across agents while maximizing overall system performance. A key challenge in this setting is decision-making under limited information about arm rewards. To address this, we introduce a novel probing framework that strategically gathers information about selected arms before allocation. In the offline setting, where reward distributions are known, we leverage submodular properties to design a greedy probing algorithm with a provable performance bound. For the more complex online setting, we develop an algorithm that achieves sublinear regret while maintaining fairness. Extensive experiments on synthetic and real-world datasets show that our approach outperforms baseline methods, achieving better fairness and efficiency.
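To make the greedy idea in the offline setting concrete, here is a minimal sketch of the classical greedy rule for maximizing a monotone submodular set function under a probing budget. The abstract does not give the paper's actual objective or bound, so everything here is an assumption for illustration: `greedy_select`, `coverage_value`, the `COVERAGE` table, and the budget are hypothetical names and toy data, not the authors' algorithm.

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical toy data: which agents each arm would "inform" if probed.
COVERAGE = {
    0: {"a", "b", "c"},
    1: {"c", "d"},
    2: {"a", "b"},
    3: {"e"},
}


def coverage_value(probed: FrozenSet[int]) -> float:
    """Toy monotone submodular objective: number of distinct agents covered."""
    covered = set()
    for arm in probed:
        covered |= COVERAGE[arm]
    return float(len(covered))


def greedy_select(
    candidates: Sequence[int],
    value: Callable[[FrozenSet[int]], float],
    budget: int,
) -> List[int]:
    """Pick up to `budget` arms, each time adding the arm with the largest
    positive marginal gain in `value`. Ties go to the earliest candidate."""
    chosen: List[int] = []
    chosen_set: set = set()
    for _ in range(budget):
        base = value(frozenset(chosen_set))
        best_arm, best_gain = None, 0.0
        for arm in candidates:
            if arm in chosen_set:
                continue
            gain = value(frozenset(chosen_set | {arm})) - base
            if gain > best_gain:
                best_arm, best_gain = arm, gain
        if best_arm is None:  # no remaining arm improves the objective
            break
        chosen.append(best_arm)
        chosen_set.add(best_arm)
    return chosen


if __name__ == "__main__":
    picked = greedy_select([0, 1, 2, 3], coverage_value, budget=2)
    print(picked, coverage_value(frozenset(picked)))
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the usual reason submodularity yields a provable bound. Whether the paper's guarantee takes this exact form is not stated in the abstract.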
Similar Papers
Stochastic Multi-Objective Multi-Armed Bandits: Regret Definition and Algorithm
Machine Learning (CS)
Helps computers choose best options with many goals.
Distributed Algorithms for Multi-Agent Multi-Armed Bandits with Collision
Machine Learning (CS)
Helps players get more rewards without talking.
Decentralized Asynchronous Multi-player Bandits
Machine Learning (CS)
Helps devices share wireless signals without crashing.