Locally Private Nonparametric Contextual Multi-armed Bandits
By: Yuheng Ma, Feiyu Jiang, Zifeng Zhao and more
Potential Business Impact:
Keeps private data safe while making smart choices.
Motivated by privacy concerns in sequential decision-making on sensitive data, we address the challenge of nonparametric contextual multi-armed bandits (MAB) under local differential privacy (LDP). We develop a uniform-confidence-bound-type estimator, showing its minimax optimality supported by a matching minimax lower bound. We further consider the case where auxiliary datasets are available, subject also to (possibly heterogeneous) LDP constraints. Under the widely-used covariate shift framework, we propose a jump-start scheme to effectively utilize the auxiliary data, the minimax optimality of which is further established by a matching lower bound. Comprehensive experiments on both synthetic and real-world datasets validate our theoretical results and underscore the effectiveness of the proposed methods.
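To make the LDP constraint concrete, here is a minimal sketch. It is not the paper's estimator. Each user perturbs a reward bounded in [0, 1] with Laplace noise of scale 1/epsilon before releasing it, which is the standard Laplace mechanism for epsilon-LDP on a quantity with sensitivity 1. The function names `privatize_reward` and `private_mean`, the [0, 1] reward range, and the value epsilon = 1 are assumptions chosen for illustration.

```python
import random


def privatize_reward(reward, epsilon, rng):
    """Release a reward in [0, 1] under epsilon-LDP via the Laplace mechanism.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    clipped = min(1.0, max(0.0, reward))  # enforce the assumed [0, 1] range
    scale = 1.0 / epsilon  # sensitivity of a [0, 1] value is 1
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return clipped + noise


def private_mean(rewards, epsilon, rng):
    """Average privatized rewards; the noise has mean zero, so this is unbiased."""
    noisy = [privatize_reward(r, epsilon, rng) for r in rewards]
    return sum(noisy) / len(noisy)


# Simulated users with Bernoulli(0.7) rewards (an assumed toy setting).
rng = random.Random(0)
rewards = [1.0 if rng.random() < 0.7 else 0.0 for _ in range(20000)]
estimate = private_mean(rewards, epsilon=1.0, rng=rng)
```

The learner only ever sees the noisy values. Averaging n of them still recovers the mean reward, but the Laplace noise adds 2/(n epsilon^2) to the variance of the average, which is the kind of privacy cost that regret bounds under LDP have to account for.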
Similar Papers
Semi-Parametric Batched Global Multi-Armed Bandits with Covariates
Machine Learning (Stat)
Helps computers learn better from grouped information.
Sparse Additive Contextual Bandits: A Nonparametric Approach for Online Decision-making with High-dimensional Covariates
Machine Learning (Stat)
Helps computers learn better with lots of information.
Multi-Armed Bandits with Minimum Aggregated Revenue Constraints
Machine Learning (CS)
Helps websites show you the best ads.