Stochastic Bandits for Crowdsourcing and Multi-Platform Autobidding
By: François Bachoc, Nicolò Cesa-Bianchi, Tommaso Cesari, and more
Potential Business Impact:
Helps spend money fairly on many tasks.
Motivated by applications in crowdsourcing, where a fixed sum of money is split among $K$ workers, and autobidding, where a fixed budget is used to bid in $K$ simultaneous auctions, we define a stochastic bandit model where arms belong to the $K$-dimensional probability simplex and represent the fraction of budget allocated to each task/auction. The reward in each round is the sum of $K$ stochastic rewards, where each of these rewards is unlocked with a probability that varies with the fraction of the budget allocated to that task/auction. We design an algorithm whose expected regret after $T$ steps is of order $K\sqrt{T}$ (up to log factors) and prove a matching lower bound. Improved bounds of order $K (\log T)^2$ are shown when the function mapping budget to probability of unlocking the reward (i.e., terminating the task or winning the auction) satisfies additional diminishing-returns conditions.
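The model in the abstract can be illustrated with a minimal simulation sketch. Everything here is a hypothetical instantiation, not the paper's algorithm: the exponential `unlock_prob` curves (concave, so they exhibit the diminishing-returns property the abstract mentions), the per-task reward distribution, and the two example allocations are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 3  # number of tasks/auctions

# Hypothetical concave unlock curves: probability that task k terminates
# (or auction k is won) given budget fraction b. Concavity models the
# diminishing-returns regime where the improved K (log T)^2 bound applies.
RATES = np.array([2.0, 4.0, 8.0])  # assumed, for illustration only

def unlock_prob(b, k):
    return 1.0 - np.exp(-RATES[k] * b)

def play(budget):
    """One round: allocate `budget` (a point in the K-simplex) and
    observe the sum of the K stochastic rewards that get unlocked."""
    assert abs(budget.sum() - 1.0) < 1e-9 and (budget >= 0).all()
    reward = 0.0
    for k in range(K):
        if rng.random() < unlock_prob(budget[k], k):
            reward += rng.uniform(0.5, 1.0)  # assumed per-task reward law
    return reward

# Compare two fixed allocations over many rounds; a bandit algorithm
# would instead adapt the allocation from observed rewards.
uniform = np.full(K, 1.0 / K)
skewed = np.array([0.6, 0.3, 0.1])
T = 5000
avg_uniform = np.mean([play(uniform) for _ in range(T)])
avg_skewed = np.mean([play(skewed) for _ in range(T)])
print(f"uniform: {avg_uniform:.3f}, skewed: {avg_skewed:.3f}")
```

In this toy instance the learner's problem is exactly the one described above: the arm space is the continuous simplex of budget splits, and only the total unlocked reward is observed each round.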
Similar Papers
Algorithm Design and Stronger Guarantees for the Improving Multi-Armed Bandits Problem
Machine Learning (CS)
Helps computers pick the best option faster.
Batched Stochastic Matching Bandits
Machine Learning (Stat)
Helps match people to jobs faster.
Incentivized Lipschitz Bandits
Machine Learning (CS)
Helps robots learn faster with smart rewards.