Statistically guided deep learning
By: Michael Kohler, Adam Krzyzak
Potential Business Impact:
Provides a deep learning method for regression with a proven error guarantee and improved accuracy on finite samples.
We present a theoretically well-founded deep learning algorithm for nonparametric regression. It uses over-parametrized deep neural networks with logistic activation function, which are fitted to the given data via gradient descent. We propose a special topology of these networks, a special random initialization of the weights, and a data-dependent choice of the learning rate and the number of gradient descent steps. We prove a theoretical bound on the expected $L_2$ error of this estimate, and illustrate its finite sample size performance by applying it to simulated data. Our results show that a theoretical analysis of deep learning which takes into account simultaneously optimization, generalization and approximation can result in a new deep learning estimate which has an improved finite sample performance.
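The abstract specifies the estimator only at a high level; the network topology, the random initialization, and the data-dependent learning rate and number of gradient descent steps are defined in the paper itself. For reference, the expected $L_2$ error of a regression estimate $m_n$ is the usual $\mathbf{E} \int |m_n(x) - m(x)|^2 \mathbf{P}_X(dx)$. Below is a minimal sketch of the general recipe the abstract describes, assuming a generic over-parametrized one-hidden-layer network with logistic activation fitted by plain full-batch gradient descent on the empirical $L_2$ risk; the width, initialization, step size, and step count are illustrative placeholders, not the paper's prescribed choices.

# Minimal sketch (NOT the paper's exact topology, initialization, or
# data-dependent step-size rule): an over-parametrized network with
# logistic activation, fitted to simulated regression data by plain
# full-batch gradient descent on the empirical L2 risk.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = m(x) + noise, with m a smooth regression function.
n = 200
x = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal((n, 1))

# Over-parametrization: many more hidden units than samples (illustrative width).
K = 2000
eta = 1.0 / K      # step size shrinks with the width for stability; the paper chooses it data-dependently
steps = 3000       # number of gradient descent steps (likewise data-dependent in the paper)

# Random initialization (an assumption here; the paper prescribes its own scheme).
W = rng.standard_normal((1, K))
b = rng.standard_normal((1, K))
v = rng.standard_normal((K, 1)) / np.sqrt(K)

def sigmoid(z):
    # logistic activation function
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(steps):
    h = sigmoid(x @ W + b)             # hidden layer with logistic activation
    pred = h @ v                       # network output
    err = pred - y                     # residuals
    # Gradients of the empirical L2 risk (1/n) * sum_i (f(x_i) - y_i)^2.
    grad_v = 2.0 / n * h.T @ err
    grad_h = 2.0 / n * err @ v.T
    grad_pre = grad_h * h * (1.0 - h)  # chain rule through the sigmoid
    grad_W = x.T @ grad_pre
    grad_b = grad_pre.sum(axis=0, keepdims=True)
    v -= eta * grad_v
    W -= eta * grad_W
    b -= eta * grad_b

train_l2 = float(np.mean((sigmoid(x @ W + b) @ v - y) ** 2))
print(f"empirical L2 training error: {train_l2:.4f}")

The sketch uses a single hidden layer purely to keep the gradient computations explicit; the paper's construction involves a specific deeper topology, and its theoretical bound concerns the expected $L_2$ error rather than the training error printed above.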
Similar Papers
Semiparametric M-estimation with overparameterized neural networks
Statistics Theory
Studies M-estimation in semiparametric models using overparameterized neural networks.
On the rate of convergence of an over-parametrized deep neural network regression estimate learned by gradient descent
Statistics Theory
Derives the rate of convergence of an over-parametrized deep network regression estimate trained by gradient descent.
Statistical physics of deep learning: Optimal learning of a multi-layer perceptron near interpolation
Machine Learning (Stat)
Uses statistical physics to characterize optimal learning of a multi-layer perceptron near the interpolation threshold.