Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Ability

Published: March 6, 2025 | arXiv ID: 2503.04111v1

By: Lijia Yu, Yibo Miao, Yifan Zhu, and more

Potential Business Impact:

Gives theoretical assurance that large, over-parameterized neural networks generalize well when trained on enough data, supporting confidence in deploying such models.

Business Areas:
A/B Testing, Data and Analytics

The primary objective of learning methods is generalization. Classic uniform generalization bounds, which rely on VC-dimension or Rademacher complexity, fail to explain why over-parameterized models in deep learning generalize well. On the other hand, algorithm-dependent generalization bounds, such as stability bounds, often rely on strict assumptions. To establish generalizability under less stringent assumptions, this paper investigates the generalizability of neural networks that minimize or approximately minimize empirical risk. We establish a lower bound on population accuracy based on the expressiveness of these networks, which shows that, given a sufficiently large number of training samples and a sufficiently large network size, such networks, including over-parameterized ones, generalize effectively. Additionally, we provide a necessary condition for generalization, demonstrating that, for certain data distributions, the amount of training data required to ensure generalization exceeds the network size needed to represent the corresponding data distribution. Finally, we provide theoretical insights into several phenomena in deep learning, including robust generalization, the importance of over-parameterization, and the effect of the loss function on generalization.
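
For context, a minimal sketch in standard notation (not the paper's own theorem statements) of the quantities the abstract contrasts: the empirical risk minimized during training, the population risk that generalization is measured against, and a classic uniform bound of the kind the authors argue cannot explain over-parameterized networks.

% Empirical risk over n training samples and population risk over the data distribution D
\[
\hat{R}_n(f) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr),
\qquad
R(f) \;=\; \mathbb{E}_{(x,y)\sim \mathcal{D}}\bigl[\ell\bigl(f(x), y\bigr)\bigr].
\]
% Classic uniform (algorithm-independent) bound: for a loss bounded in [0,1],
% with probability at least 1 - \delta over the draw of the n samples,
\[
\sup_{f \in \mathcal{F}} \bigl| R(f) - \hat{R}_n(f) \bigr|
\;\le\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; \sqrt{\frac{\log(2/\delta)}{2n}},
\]
% where \mathfrak{R}_n(\mathcal{F}) is the Rademacher complexity of the hypothesis class \mathcal{F}.
% For over-parameterized networks this complexity term can be large, which is why such
% bounds are uninformative; the paper instead lower-bounds the population accuracy of
% (approximate) empirical risk minimizers in terms of network expressiveness and sample size.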

Page Count
33 pages

Category
Computer Science:
Machine Learning (CS)