Eluder dimension: localise it!
By: Alireza Bakhtiari, Alex Ayoub, Samuel Robertson and more
We establish a lower bound on the eluder dimension of generalised linear model classes, showing that standard eluder dimension-based analysis cannot lead to first-order regret bounds. To address this, we introduce a localisation method for the eluder dimension; our analysis immediately recovers and improves on classic results for Bernoulli bandits, and allows for the first genuine first-order bounds for finite-horizon reinforcement learning tasks with bounded cumulative returns.