Neural Scaling Laws for Deep Regression
By: Tilen Cadez, Kyoung-Min Kim
Potential Business Impact:
Shows that model predictions improve predictably as more training data is used.
Neural scaling laws--power-law relationships between generalization errors and characteristics of deep learning models--are vital tools for developing reliable models while managing limited resources. Although the success of large language models highlights the importance of these laws, their application to deep regression models remains largely unexplored. Here, we empirically investigate neural scaling laws in deep regression using a parameter estimation model for twisted van der Waals magnets. We observe power-law relationships between the loss and both training dataset size and model capacity across a wide range of values, employing various architectures--including fully connected networks, residual networks, and vision transformers. Furthermore, the scaling exponents governing these relationships range from 1 to 2, with specific values depending on the regressed parameters and model details. The consistent scaling behaviors and their large scaling exponents suggest that the performance of deep regression models can improve substantially with increasing data size.
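The abstract reports power-law scaling of the loss with training dataset size N and model capacity, i.e. roughly L(N) = a * N^(-alpha). Below is a minimal, hypothetical Python sketch of how such an exponent can be estimated from loss measurements; the data points are synthetic and the values are not results from the paper.

import numpy as np

# Synthetic (dataset size, loss) pairs roughly following L = a * N^(-alpha);
# these numbers are made up for illustration only.
N = np.array([1e3, 3e3, 1e4, 3e4, 1e5])                        # training-set sizes
loss = 2.0 * N ** -1.5 * (1 + 0.05 * np.random.randn(N.size))  # noisy losses

# A power law is linear in log-log space: log L = log a - alpha * log N,
# so a least-squares line fit of log(loss) against log(N) recovers the exponent.
slope, intercept = np.polyfit(np.log(N), np.log(loss), deg=1)
alpha, a = -slope, np.exp(intercept)

print(f"estimated exponent alpha = {alpha:.2f}, prefactor a = {a:.2f}")

The same fit applies to the capacity axis by replacing N with the number of model parameters.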
Similar Papers
Scaling Laws are Redundancy Laws
Machine Learning (CS)
Explains why bigger models learn faster.
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Machine Learning (CS)
Shows how scaling behavior across regression methods guides better model training.
Scaling Laws for Uncertainty in Deep Learning
Machine Learning (Stat)
Helps models recognize when they are unsure of their answers.