Gradient-free stochastic optimization for additive models
By: Arya Akhavan, Alexandre B. Tsybakov
Potential Business Impact:
Makes machine learning optimization faster when exact gradient information is unavailable.
We address the problem of zero-order optimization from noisy observations for an objective function satisfying the Polyak-{\L}ojasiewicz or the strong convexity condition. Additionally, we assume that the objective function has an additive structure and satisfies a higher-order smoothness property, characterized by the H\"older family of functions. The additive model for H\"older classes of functions is well studied in the literature on nonparametric function estimation, where it is shown that such a model benefits from a substantial improvement in estimation accuracy compared to the H\"older model without additive structure. We study this established framework in the context of gradient-free optimization. We propose a randomized gradient estimator that, when plugged into a gradient descent algorithm, allows one to achieve the minimax optimal optimization error of order $dT^{-(\beta-1)/\beta}$, where $d$ is the dimension of the problem, $T$ is the number of queries, and $\beta\ge 2$ is the H\"older degree of smoothness. We conclude that, in contrast to nonparametric estimation problems, no substantial gain in accuracy can be achieved when using additive models in gradient-free optimization.
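To make the setting concrete, below is a minimal sketch of gradient-free (zero-order) optimization with a randomized gradient estimator built from noisy function queries. It uses a generic two-point spherical-sampling estimator for illustration, not the kernel-smoothed estimator analysed in the paper; the names `objective`, `noise_std`, `step_size`, and `h` are hypothetical and chosen only for this example.

```python
import numpy as np

def noisy_query(objective, x, rng, noise_std=0.01):
    """Return a noisy evaluation of the objective at x (zero-order oracle)."""
    return objective(x) + noise_std * rng.standard_normal()

def two_point_gradient_estimate(objective, x, h, rng):
    """Randomized gradient estimate from two noisy function queries.

    Samples a direction u uniformly on the unit sphere and forms the
    symmetric finite difference (f(x + h u) - f(x - h u)) / (2 h),
    rescaled by the dimension d.
    """
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    f_plus = noisy_query(objective, x + h * u, rng)
    f_minus = noisy_query(objective, x - h * u, rng)
    return d * (f_plus - f_minus) / (2.0 * h) * u

def zero_order_gradient_descent(objective, x0, T=1000, step_size=0.1, h=0.05, seed=0):
    """Gradient descent driven by the randomized zero-order gradient estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for t in range(1, T + 1):
        g = two_point_gradient_estimate(objective, x, h, rng)
        x = x - (step_size / t) * g  # decaying step size for a strongly convex target
    return x

if __name__ == "__main__":
    # Example: an additive (separable) quadratic objective in d = 5 dimensions.
    quadratic = lambda x: float(np.sum(x ** 2))
    x_hat = zero_order_gradient_descent(quadratic, x0=np.ones(5))
    print("approximate minimizer:", x_hat)
```

The key design point, mirrored in the paper's setting, is that the optimizer never sees gradients: each iteration spends two noisy function queries to form a stochastic direction, and the achievable optimization error is governed by the query budget $T$, the dimension $d$, and the smoothness $\beta$.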
Similar Papers
Robust Variable Selection in High-dimensional Nonparametric Additive Model
Methodology
Finds important patterns even with messy data.
Convergence of a class of gradient-free optimisation schemes when the objective function is noisy, irregular, or both
Computation
Improves machine learning from messy data.
Minimisation of Submodular Functions Using Gaussian Zeroth-Order Random Oracles
Optimization and Control
Helps computers find the best choices faster.