Score: 1

Online Linear Regression with Paid Stochastic Features

Published: November 11, 2025 | arXiv ID: 2511.08073v1

By: Nadav Merlis, Kyoungseok Jang, Nicolò Cesa-Bianchi

Potential Business Impact:

A model learns better when it can choose how much to pay for cleaner data.

Business Areas:
A/B Testing, Data and Analytics

We study an online linear regression setting in which the observed feature vectors are corrupted by noise and the learner can pay to reduce the noise level. In practice, this may happen for several reasons: for example, because features can be measured more accurately using more expensive equipment, or because data providers can be incentivized to release less private features. Assuming feature vectors are drawn i.i.d. from a fixed but unknown distribution, we measure the learner's regret against the linear predictor minimizing a notion of loss that combines the prediction error and the payment. When the mapping between payments and noise covariance is known, we prove that a regret rate of order $\sqrt{T}$ is optimal up to logarithmic factors. When the noise covariance is unknown, we show that the optimal regret rate becomes of order $T^{2/3}$ (again up to logarithmic factors). Our analysis leverages matrix martingale concentration, showing that the empirical loss converges uniformly to the expected loss over all payments and linear predictors.
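
To make the setting concrete, here is a minimal simulation sketch in Python. Everything specific in it is an assumption for illustration: the payment-to-noise map `noise_var`, the payment weight `lam`, the discrete payment grid, and the naive explore-then-exploit payment rule are all hypothetical, and the per-round loss is modeled as squared prediction error plus the payment. The paper's actual algorithms for the known- and unknown-covariance cases, and their $\sqrt{T}$ and $T^{2/3}$ guarantees, are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

d, T = 5, 2000
theta_star = rng.normal(size=d) / np.sqrt(d)  # unknown target linear predictor
lam = 0.1                                     # assumed weight of payment in the combined loss

def noise_var(c):
    """Hypothetical payment-to-noise map: paying more buys cleaner features."""
    return 1.0 / (1.0 + c)

payments = [0.0, 0.25, 0.5, 1.0, 2.0]         # illustrative discrete payment grid

# One ridge-regression estimate per payment level, since each level
# induces a different noise covariance on the observed features.
A = {c: np.eye(d) for c in payments}          # regularized Gram matrices
b = {c: np.zeros(d) for c in payments}
loss_sum = {c: 0.0 for c in payments}         # running empirical loss per payment
count = {c: 1 for c in payments}

total_loss = 0.0
for t in range(T):
    x = rng.normal(size=d)                    # clean feature vector, drawn i.i.d.
    y = theta_star @ x + 0.1 * rng.normal()   # label

    # Simple explore-then-exploit over payment levels (not the paper's algorithm).
    if t < 50 * len(payments):
        c = payments[t % len(payments)]
    else:
        c = min(payments, key=lambda p: loss_sum[p] / count[p])

    # The learner only sees a noisy version of x, with noise set by its payment.
    x_hat = x + np.sqrt(noise_var(c)) * rng.normal(size=d)

    theta = np.linalg.solve(A[c], b[c])       # current estimate for this payment level
    loss = (y - theta @ x_hat) ** 2 + lam * c # prediction error plus payment
    total_loss += loss

    A[c] += np.outer(x_hat, x_hat)
    b[c] += y * x_hat
    loss_sum[c] += loss
    count[c] += 1

print(f"average combined loss over {T} rounds: {total_loss / T:.3f}")
print("empirically best payment level:", min(payments, key=lambda p: loss_sum[p] / count[p]))
```

The sketch highlights the core trade-off the abstract describes: paying more shrinks the feature noise and improves the regression estimate, but the payment itself is charged to the loss, so the learner must balance the two.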

Country of Origin
🇮🇱 🇮🇹 🇰🇷 Israel, Italy, South Korea

Page Count
27 pages

Category
Computer Science:
Machine Learning (CS)