Generalisation and benign over-fitting for linear regression onto random functional covariates
By: Andrew Jones, Nick Whiteley
Potential Business Impact:
Helps computers learn from messy, connected data.
We study the theoretical predictive performance of ridge and ridgeless least-squares regression when covariate vectors arise from evaluating $p$ random, mean-square continuous functions over a latent metric space at $n$ random and unobserved locations, subject to additive noise. This leads us away from the standard assumption of i.i.d. data to a setting in which the $n$ covariate vectors are exchangeable but not, in general, independent. Under assumptions of independence across dimensions, $4$-th order moments, and other regularity conditions, we obtain probabilistic bounds on a notion of predictive excess risk adapted to our random functional covariate setting, making use of recent results of Barzilai and Shamir. We derive convergence rates in regimes where $p$ grows suitably fast relative to $n$, illustrating the interplay between ingredients of the model in determining convergence behaviour and the role of additive covariate noise in benign overfitting.
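To make the setup concrete, here is a minimal simulation sketch of the random functional covariate model and the ridge / ridgeless estimators. All specific choices below (a latent space $[0,1]$, cosine-type random functions, Gaussian noise, the particular target function and regularisation values, and helper names such as `make_functions` and `sample_data`) are illustrative assumptions, not specifications taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_functions(p):
    # Draw p random, mean-square continuous functions on the latent space [0, 1];
    # here g_j(u) = a_j * cos(pi * j * u) with random amplitudes a_j (an assumed form).
    freqs = np.arange(1, p + 1)
    amps = rng.normal(size=p) / np.sqrt(freqs)
    return freqs, amps

def sample_data(n, freqs, amps, cov_noise_sd=0.5, resp_noise_sd=0.1):
    # Latent locations u_i are drawn at random and never observed;
    # covariates are noisy evaluations x_ij = g_j(u_i) + noise.
    u = rng.uniform(0.0, 1.0, size=n)
    G = amps[None, :] * np.cos(np.pi * freqs[None, :] * u[:, None])
    X = G + cov_noise_sd * rng.normal(size=G.shape)
    # Responses driven by the latent locations (this target is an arbitrary choice).
    y = np.sin(2.0 * np.pi * u) + resp_noise_sd * rng.normal(size=n)
    return X, y

def ridge_fit(X, y, lam):
    # Ridge estimator; lam = 0 gives the ridgeless minimum-norm solution via pinv.
    n, p = X.shape
    if lam == 0.0:
        return np.linalg.pinv(X) @ y
    return np.linalg.solve(X.T @ X + lam * n * np.eye(p), X.T @ y)

n, p = 200, 2000                       # overparameterised regime: p large relative to n
freqs, amps = make_functions(p)
X_tr, y_tr = sample_data(n, freqs, amps)
X_te, y_te = sample_data(2000, freqs, amps)

for lam in (0.0, 1e-3, 1e-1):
    beta = ridge_fit(X_tr, y_tr, lam)
    test_mse = np.mean((X_te @ beta - y_te) ** 2)
    print(f"lambda={lam:g}  test MSE={test_mse:.3f}")
```

Because the rows share the same random functions, they are exchangeable but not independent, matching the dependence structure the abstract describes; comparing the ridgeless ($\lambda = 0$) test error against small positive $\lambda$ values loosely mirrors the benign-overfitting question, under the simplifying assumptions above.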
Similar Papers
Online Linear Regression with Paid Stochastic Features
Machine Learning (CS)
Learns better by choosing how much to pay for cleaner data.
Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification
Machine Learning (Stat)
Makes computers learn better even with messy data.
Nonparametric local polynomial regression for functional covariates
Statistics Theory
Improves math models for complex data.