Quantum-Enhanced Neural Contextual Bandit Algorithms
By: Yuqi Huang, Vincent Y. F. Tan, Sharu Theresa Jose
Potential Business Impact:
Makes quantum computers learn faster with less data.
Stochastic contextual bandits are fundamental for sequential decision-making but pose significant challenges for existing neural network-based algorithms, particularly when scaling to quantum neural networks (QNNs), due to issues such as massive over-parameterization, computational instability, and the barren plateau phenomenon. This paper introduces the Quantum Neural Tangent Kernel-Upper Confidence Bound (QNTK-UCB) algorithm, which leverages the Quantum Neural Tangent Kernel (QNTK) to address these limitations. By freezing the QNN at a random initialization and using its static QNTK as a kernel for ridge regression, QNTK-UCB bypasses the unstable training dynamics inherent in explicit parameterized quantum circuit training while fully exploiting the unique quantum inductive bias. For a time horizon $T$ and $K$ actions, our theoretical analysis reveals a significantly improved parameter scaling of $\Omega((TK)^3)$ for QNTK-UCB, a substantial reduction compared with the $\Omega((TK)^8)$ required by classical NeuralUCB algorithms for similar regret guarantees. Empirical evaluations on non-linear synthetic benchmarks and quantum-native variational quantum eigensolver tasks demonstrate QNTK-UCB's superior sample efficiency in low-data regimes. This work highlights how the inherent properties of the QNTK provide implicit regularization and a sharper spectral decay, paving the way for achieving "quantum advantage" in online learning.
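The core loop described in the abstract (a fixed kernel, ridge regression on past rewards, and an upper-confidence-bound action rule) can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel below is only a stand-in for the QNTK, which would be computed from a QNN frozen at initialization, and the function names, the reward function, and the values of `lam` and `beta` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Stand-in kernel (assumption). A real QNTK would come from a frozen QNN."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kernel_ucb_scores(X_hist, y_hist, X_cand, kernel, lam=1.0, beta=1.0):
    """Kernel ridge regression mean plus beta times the posterior std."""
    k_cc = np.diag(kernel(X_cand, X_cand))       # prior variance k(x, x)
    if X_hist.shape[0] == 0:
        return beta * np.sqrt(k_cc)              # no data yet: pure exploration
    K = kernel(X_hist, X_hist)                   # (n, n) Gram matrix
    k_star = kernel(X_hist, X_cand)              # (n, m) cross-kernel
    A = K + lam * np.eye(X_hist.shape[0])        # ridge-regularized Gram matrix
    mean = k_star.T @ np.linalg.solve(A, y_hist)
    var = k_cc - np.sum(k_star * np.linalg.solve(A, k_star), axis=0)
    return mean + beta * np.sqrt(np.maximum(var, 0.0))


def run_bandit(T=50, K=4, d=3, seed=0):
    """Toy contextual bandit with a hypothetical non-linear reward."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=d)                   # hidden parameter (assumption)
    X_hist = np.empty((0, d))
    y_hist = np.empty(0)
    for _ in range(T):
        X_cand = rng.normal(size=(K, d))         # one feature vector per action
        scores = kernel_ucb_scores(X_hist, y_hist, X_cand, rbf_kernel)
        a = int(np.argmax(scores))               # optimistic action choice
        r = np.cos(X_cand[a] @ theta) + 0.1 * rng.normal()
        X_hist = np.vstack([X_hist, X_cand[a]])
        y_hist = np.append(y_hist, r)
    return y_hist


if __name__ == "__main__":
    rewards = run_bandit()
    print("mean reward:", rewards.mean())
```

Because the kernel is fixed, no network parameters are trained during the loop; only the Gram matrix grows with each round. Swapping `rbf_kernel` for a function that evaluates a QNTK would leave the rest of the sketch unchanged.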
Similar Papers
Quantum Non-Linear Bandit Optimization
Machine Learning (CS)
Finds best solutions faster, even with many choices.
Batched Nonparametric Bandits via k-Nearest Neighbor UCB
Machine Learning (Stat)
Helps computers learn best choices with limited feedback.
Uncertainty Quantification with the Empirical Neural Tangent Kernel
Machine Learning (Stat)
Makes AI more trustworthy by showing how sure it is.