An Exponential Averaging Process with Strong Convergence Properties
By: Frederik Köhne, Anton Schiela
Potential Business Impact:
Makes machine learning more accurate when training on noisy data.
Averaging, or smoothing, is a fundamental approach to obtain stable, de-noised estimates from noisy observations. In certain scenarios, observations made along trajectories of random dynamical systems are of particular interest. One popular smoothing technique for such a scenario is exponential moving averaging (EMA), which assigns observations a weight that decreases exponentially in their age, thus giving younger observations a larger weight. However, EMA fails to enjoy strong stochastic convergence properties, which stems from the fact that the weight assigned to the youngest observation is constant over time, preventing the noise in the averaged quantity from decreasing to zero. In this work, we consider an adaptation to EMA, which we call $p$-EMA, where the weights assigned to the last observations decrease to zero at a subharmonic rate. We provide stochastic convergence guarantees for this kind of averaging under mild assumptions on the autocorrelations of the underlying random dynamical system. We further discuss the implications of our results for a recently introduced adaptive step size control for Stochastic Gradient Descent (SGD), which uses $p$-EMA for averaging noisy observations.
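The contrast between the two schemes can be sketched in a few lines. The abstract does not state the exact weight schedule, so this is only an illustrative assumption: classical EMA keeps a constant weight $\alpha$ on the newest observation, while the p-EMA sketch below uses a time-dependent weight $\alpha_t = t^{-p}$ with $p \in (1/2, 1)$, decaying at a subharmonic rate so the noise in the average can shrink over time.

```python
import random


def ema(observations, alpha=0.1):
    """Classical EMA: the newest observation always gets constant weight alpha,
    so the noise in the average never decays below a fixed floor."""
    avg = observations[0]
    for y in observations[1:]:
        avg = (1 - alpha) * avg + alpha * y
    return avg


def p_ema(observations, p=0.75):
    """Illustrative p-EMA sketch (assumed schedule, not the paper's exact one):
    the newest observation's weight decays like t**(-p), a subharmonic rate,
    so the weight on fresh noise goes to zero as t grows."""
    avg = observations[0]
    for t, y in enumerate(observations[1:], start=2):
        alpha_t = t ** (-p)
        avg = (1 - alpha_t) * avg + alpha_t * y
    return avg


# Demo: average a noisy constant signal with mean 1.0.
random.seed(0)
noisy = [1.0 + random.gauss(0.0, 0.5) for _ in range(10_000)]
print("EMA error:  ", abs(ema(noisy) - 1.0))
print("p-EMA error:", abs(p_ema(noisy) - 1.0))
```

With constant $\alpha$, the EMA estimate keeps a stationary noise variance on the order of $\sigma^2 \alpha / (2 - \alpha)$, whereas the decaying weights let the p-EMA error continue to shrink, which is the convergence behavior the paper establishes under assumptions on the autocorrelations.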
Similar Papers
EMA Without the Lag: Bias-Corrected Iterate Averaging Schemes
Machine Learning (CS)
Makes AI learn faster and better.
Adaptive stable distribution and Hurst exponent by method of moments moving estimator for nonstationary time series
Methodology
Helps predict stock market crashes by watching changes.
High-Dimensional Model Averaging via Cross-Validation
Statistics Theory
Helps computers pick the best answers from many guesses.