Online Linear Regression with Paid Stochastic Features
By: Nadav Merlis, Kyoungseok Jang, Nicolò Cesa-Bianchi
Potential Business Impact:
Learns better by choosing how much to pay for cleaner data.
We study an online linear regression setting in which the observed feature vectors are corrupted by noise and the learner can pay to reduce the noise level. In practice, this may happen for several reasons: for example, features can be measured more accurately with more expensive equipment, or data providers can be incentivized to release less private features. Assuming feature vectors are drawn i.i.d. from a fixed but unknown distribution, we measure the learner's regret against the linear predictor minimizing a notion of loss that combines the prediction error and the payment. When the mapping between payments and noise covariance is known, we prove that a regret rate of order $\sqrt{T}$ is optimal up to logarithmic factors. When the noise covariance is unknown, we show that the optimal regret rate becomes of order $T^{2/3}$, again up to logarithmic factors. Our analysis leverages matrix martingale concentration to show that the empirical loss converges uniformly to the expected loss over all payments and linear predictors.
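To make the setting concrete, the following is a minimal Python sketch, not the paper's algorithm: features are drawn i.i.d., the learner observes a noisy copy whose noise level shrinks with the payment, and per-round performance is the squared prediction error plus the payment. The payment-to-noise mapping noise_std, the label-noise level 0.1, and the ridge-regression updates are all illustrative assumptions introduced here.

import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 2000
theta_star = rng.normal(size=d) / np.sqrt(d)   # unknown linear predictor

def noise_std(c, sigma0=1.0):
    # Hypothetical payment-to-noise mapping: paying c shrinks the
    # feature-noise standard deviation. The paper's sqrt(T) regime
    # assumes such a mapping is known; this concrete form is ours.
    return sigma0 / np.sqrt(1.0 + c)

def avg_combined_loss(payment):
    A = np.eye(d)          # ridge-regularized Gram matrix of noisy features
    b = np.zeros(d)
    total = 0.0
    for t in range(T):
        x = rng.normal(size=d)                            # clean feature (unobserved)
        z = x + noise_std(payment) * rng.normal(size=d)   # paid, noisy observation
        y = x @ theta_star + 0.1 * rng.normal()           # label from clean features
        theta = np.linalg.solve(A, b)                     # current ridge estimate
        total += (z @ theta - y) ** 2 + payment           # prediction error + payment
        A += np.outer(z, z)
        b += y * z
    return total / T

for c in [0.0, 0.1, 0.5, 1.0]:
    print(f"payment {c:.1f}: avg combined loss {avg_combined_loss(c):.3f}")

Sweeping over a small grid of fixed payments mimics the known-mapping regime, where the learner can compare the combined loss of each payment level; coping with an unknown noise covariance is what drives the $T^{2/3}$ rate in the paper.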
Similar Papers
On the Rate of Gaussian Approximation for Linear Regression Problems
Machine Learning (Stat)
Helps computers guess better with more data.
Generalisation and benign over-fitting for linear regression onto random functional covariates
Machine Learning (Stat)
Helps computers learn from messy, connected data.
A Polynomial-time Algorithm for Online Sparse Linear Regression with Improved Regret Bound under Weaker Conditions
Machine Learning (CS)
Helps computers learn with less information.