Developing Training Procedures for Piecewise-linear Spline Activation Functions in Neural Networks
By: William H Patty
Potential Business Impact:
Makes computer brains learn better and faster.
Activation functions in neural networks are typically selected from a small set of empirically validated static functions such as ReLU, tanh, or sigmoid. By instead optimizing the shape of each activation function during training, we can produce models that are more parameter-efficient and more accurate, since each neuron receives an activation better suited to it. In this paper, I present and compare nine training methodologies that explore the dual-optimization dynamics of neural networks with parameterized linear B-spline activation functions. The experiments achieve up to 94% lower final error rates in feedforward neural networks (FNNs) and 51% lower rates in convolutional neural networks (CNNs) compared to traditional ReLU-based models. These gains come at the cost of additional development and training complexity, as well as increased inference latency in the final model.
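To make the core idea concrete, below is a minimal sketch (not the paper's implementation) of a parameterized linear B-spline activation in PyTorch: the activation is a piecewise-linear interpolation between fixed, evenly spaced knots, and the control-point heights are learnable parameters optimized jointly with the network weights. The class name, knot count, grid range, and ReLU-shaped initialization are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a learnable piecewise-linear (degree-1 B-spline) activation.
# Knot placement, grid range, and ReLU initialization are assumptions.
import torch
import torch.nn as nn


class LinearSplineActivation(nn.Module):
    """Piecewise-linear activation with learnable control-point heights."""

    def __init__(self, num_knots: int = 11, grid_range: float = 3.0):
        super().__init__()
        # Fixed, evenly spaced knots on [-grid_range, grid_range].
        knots = torch.linspace(-grid_range, grid_range, num_knots)
        self.register_buffer("knots", knots)
        # Learnable heights, initialized so the spline reproduces ReLU.
        self.heights = nn.Parameter(torch.clamp(knots, min=0.0).clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Clamp inputs to the knot range (the spline saturates outside it),
        # then interpolate linearly between the two neighboring control points.
        x = x.clamp(self.knots[0].item(), self.knots[-1].item())
        step = self.knots[1] - self.knots[0]
        idx = ((x - self.knots[0]) / step).floor().long()
        idx = idx.clamp(0, len(self.knots) - 2)
        t = (x - self.knots[idx]) / step
        return (1 - t) * self.heights[idx] + t * self.heights[idx + 1]


# Usage: drop the spline activation into an ordinary feedforward network;
# the control-point heights are trained jointly with the layer weights.
net = nn.Sequential(nn.Linear(784, 128), LinearSplineActivation(), nn.Linear(128, 10))
```

Because the spline heights are ordinary parameters, they can be updated by the same optimizer as the weights or by a separate schedule; the paper's nine training methodologies correspond to different ways of coordinating these two optimization problems.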
Similar Papers
SmartMixed: A Two-Phase Training Strategy for Adaptive Activation Function Learning in Neural Networks
Machine Learning (CS)
Lets computer brains learn better ways to think.
Hi-fi functional priors by learning activations
Machine Learning (CS)
Makes smart computers learn better with new brain parts.
Breaking the Conventional Forward-Backward Tie in Neural Networks: Activation Functions
Neural and Evolutionary Computing
Lets computers learn with simpler math.