A Compositional Kernel Model for Feature Learning
By: Feng Ruan, Keli Liu, Michael Jordan
Potential Business Impact:
Finds the input variables that matter and screens out the noise.
We study a compositional variant of kernel ridge regression in which the predictor is applied to a coordinate-wise reweighting of the inputs. Formulated as a variational problem, this model provides a simple testbed for feature learning in compositional architectures. From the perspective of variable selection, we show how relevant variables are recovered while noise variables are eliminated. We establish guarantees showing that both global minimizers and stationary points discard noise coordinates when the noise variables are Gaussian-distributed. A central finding is that $\ell_1$-type kernels, such as the Laplace kernel, succeed at stationary points in recovering features that contribute nonlinear effects, whereas Gaussian kernels recover only features with linear effects.
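To make the model concrete, one natural way to write the variational problem described above (a sketch inferred from the abstract; the exact loss, regularizers, and constraints are assumptions, not taken from the paper) is

$$\min_{w \in \mathbb{R}^d_{\geq 0}} \; \min_{f \in \mathcal{H}_w} \; \frac{1}{n} \sum_{i=1}^n \big(y_i - f(w \odot x_i)\big)^2 + \lambda \|f\|_{\mathcal{H}_w}^2,$$

where $w \odot x$ denotes the coordinate-wise reweighting of the input and $\mathcal{H}_w$ is the RKHS induced on the reweighted inputs. For an $\ell_1$-type kernel such as the Laplace kernel, the weights enter as $k_w(x, x') = \exp(-\|w \odot (x - x')\|_1)$, so a coordinate with $w_j = 0$ is discarded entirely.

The NumPy script below illustrates the reweighting mechanism (a minimal sketch, not the authors' code; the sample sizes, regularization level, and oracle weights are illustrative assumptions). It fits kernel ridge regression with a weighted Laplace kernel and compares uniform weights against weights that zero out the noise coordinates; in the compositional model those weights would instead be learned by minimizing the objective above.

import numpy as np

def laplace_kernel(X1, X2, w):
    # k_w(x, x') = exp(-||w * (x - x')||_1): the Laplace (ell_1-type) kernel on reweighted inputs.
    diffs = np.abs(X1[:, None, :] - X2[None, :, :])  # pairwise coordinate gaps, shape (n1, n2, d)
    return np.exp(-(diffs * w).sum(axis=-1))

def krr_fit_predict(X_tr, y_tr, X_te, w, lam=1e-2):
    # Kernel ridge regression on reweighted inputs: alpha = (K + n*lam*I)^{-1} y.
    n = len(X_tr)
    K = laplace_kernel(X_tr, X_tr, w)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y_tr)
    return laplace_kernel(X_te, X_tr, w) @ alpha

# Toy data: y depends nonlinearly on the first two coordinates; the other eight are Gaussian noise.
rng = np.random.default_rng(0)
X, X_te = rng.normal(size=(200, 10)), rng.normal(size=(100, 10))
y, y_te = np.sin(X[:, 0]) * X[:, 1], np.sin(X_te[:, 0]) * X_te[:, 1]

w_uniform = np.ones(10)                       # no feature learning: all coordinates kept
w_oracle = np.array([1.0, 1.0] + [0.0] * 8)   # weights that discard the noise coordinates
for name, w in [("uniform", w_uniform), ("oracle", w_oracle)]:
    mse = np.mean((krr_fit_predict(X, y, X_te, w) - y_te) ** 2)
    print(f"{name} weights: test MSE = {mse:.3f}")

With the oracle weights the noise coordinates drop out of the kernel, which should noticeably reduce test error relative to uniform weights; the paper's guarantees formalize when minimizing over $w$ recovers such noise-discarding solutions.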
Similar Papers
Interpretable Kernels
Machine Learning (Stat)
Builds kernel models whose decisions can be explained in terms of the original input variables.
Learning Multi-Index Models with Hyper-Kernel Ridge Regression
Machine Learning (Stat)
Teaches computers to find the few hidden directions in the data that drive the outcome.
A Kernel-based Stochastic Approximation Framework for Nonlinear Operator Learning
Machine Learning (Stat)
Teaches computers to learn maps between entire functions, not just numbers.