SUPN: Shallow Universal Polynomial Networks
By: Zachary Morrow, Michael Penwarden, Brian Chen and more
Potential Business Impact:
Makes computer learning faster and more accurate.
Deep neural networks (DNNs) and Kolmogorov-Arnold networks (KANs) are popular methods for function approximation due to their flexibility and expressivity. However, they typically require a large number of trainable parameters to produce a suitable approximation. Beyond making the resulting network less transparent, overparameterization creates a large optimization space, likely producing local minima in training that have quite different generalization errors. In this case, network initialization can have an outsize impact on the model's out-of-sample accuracy. For these reasons, we propose shallow universal polynomial networks (SUPNs). These networks replace all but the last hidden layer with a single layer of polynomials with learnable coefficients, leveraging the strengths of DNNs and polynomials to achieve sufficient expressivity with far fewer parameters. We prove that SUPNs converge at the same rate as the best polynomial approximation of the same degree, and we derive explicit formulas for quasi-optimal SUPN parameters. We complement theory with an extensive suite of numerical experiments involving SUPNs, DNNs, KANs, and polynomial projection in one, two, and ten dimensions, consisting of over 13,000 trained models. On the target functions we numerically studied, for a given number of trainable parameters, the approximation error and variability are often lower for SUPNs than for DNNs and KANs by an order of magnitude. In our examples, SUPNs even outperform polynomial projection on non-smooth functions.
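The abstract describes the architecture only in words, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes a one-dimensional input on [-1, 1], a Chebyshev basis for the polynomial layer, a tanh activation on the single remaining hidden layer, and a linear output without bias. The names `init_supn`, `supn_forward`, and `num_params` are hypothetical and chosen for illustration.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander


def init_supn(width, degree, seed=0):
    """Random parameters for a toy SUPN-style model (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    return {
        # One learnable polynomial of the given degree per hidden unit.
        "poly_coeffs": rng.normal(scale=0.5, size=(degree + 1, width)),
        # Linear read-out weights applied to the hidden layer.
        "out_weights": rng.normal(scale=0.5, size=width),
    }


def supn_forward(params, x):
    """Evaluate y(x) = sum_j c_j * tanh(p_j(x)) with p_j in a Chebyshev basis."""
    degree = params["poly_coeffs"].shape[0] - 1
    basis = chebvander(x, degree)                    # shape (n, degree + 1)
    hidden = np.tanh(basis @ params["poly_coeffs"])  # shape (n, width)
    return hidden @ params["out_weights"]            # shape (n,)


def num_params(params):
    """Total number of trainable scalars."""
    return sum(p.size for p in params.values())


params = init_supn(width=4, degree=5)
x = np.linspace(-1.0, 1.0, 50)
y = supn_forward(params, x)
print(y.shape, num_params(params))
```

Under these assumptions the parameter count is `width * (degree + 1) + width`, which gives 28 trainable scalars for the toy configuration above. This illustrates how a single polynomial layer can stand in for a stack of dense hidden layers while keeping the model small.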
Similar Papers
MLPs and KANs for data-driven learning in physical problems: A performance comparison
Machine Learning (CS)
New computer brains solve science problems better.
QuantKAN: A Unified Quantization Framework for Kolmogorov Arnold Networks
Machine Learning (CS)
Makes smart computer brains smaller and faster.
Geometry and Optimization of Shallow Polynomial Networks
Machine Learning (CS)
Helps computers learn better by understanding math.