Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
By: Samet Demir, Zafer Dogan
In this work, we study the training and generalization performance of two-layer neural networks (NNs) after one gradient descent step under structured data modeled by Gaussian mixtures. While previous research has extensively analyzed this model under the isotropic data assumption, such simplifications overlook the complexities inherent in real-world datasets. Our work addresses this limitation by analyzing two-layer NNs under a Gaussian mixture data assumption in the asymptotically proportional limit, where the input dimension, the number of hidden neurons, and the sample size grow at finite ratios. We characterize the training and generalization errors by leveraging recent advancements in Gaussian universality. Specifically, we prove that a high-order polynomial model performs equivalently to the nonlinear neural network under certain conditions. The degree of the equivalent model is intricately linked to both the "data spread" and the learning rate used in the single gradient step. Through extensive simulations, we demonstrate the equivalence between the original model and its polynomial counterpart across various regression and classification tasks. Additionally, we explore how different properties of the Gaussian mixture affect learning outcomes. Finally, we present experimental results on Fashion-MNIST classification, indicating that our findings translate to realistic data.
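To make the setup concrete, here is a minimal simulation sketch of the regime the abstract describes: Gaussian mixture inputs, a two-layer network, one full-batch gradient step on the first-layer weights, and a ridge refit of the second layer whose training error is the kind of quantity the asymptotic theory characterizes. All dimensions, the ReLU activation, the mixture parameters, and the ridge refit are illustrative assumptions, not the authors' experimental code.

```python
import numpy as np

# Illustrative sketch only; names and hyperparameters are assumptions.
rng = np.random.default_rng(0)

d, p, n = 200, 200, 400          # input dim, hidden width, sample size (proportional regime)
eta = 1.0                        # learning rate for the single gradient step

# Structured data: a two-component Gaussian mixture with distinct means.
means = np.stack([np.ones(d) / np.sqrt(d), -np.ones(d) / np.sqrt(d)])
labels = rng.integers(0, 2, size=n)
X = means[labels] + rng.standard_normal((n, d)) / np.sqrt(d)
y = 2.0 * labels - 1.0           # +/-1 targets for a toy classification task

# Two-layer network f(x) = a^T sigma(W x), with sigma = ReLU as an example.
W = rng.standard_normal((p, d)) / np.sqrt(d)
a = rng.standard_normal(p) / np.sqrt(p)
relu = lambda z: np.maximum(z, 0.0)

# One full-batch gradient step on the first-layer weights W (squared loss).
pre = X @ W.T                    # (n, p) pre-activations
resid = relu(pre) @ a - y        # (n,) residuals
grad_W = ((resid[:, None] * (pre > 0) * a).T @ X) / n   # dL/dW, shape (p, d)
W1 = W - eta * grad_W

# Refit the second layer by ridge regression on the updated features.
feats = relu(X @ W1.T)
lam = 1e-2
a_hat = np.linalg.solve(feats.T @ feats / n + lam * np.eye(p),
                        feats.T @ y / n)
train_err = np.mean((feats @ a_hat - y) ** 2)
print(f"training error after one step + ridge refit: {train_err:.3f}")
```

The paper's equivalence result says that, in the proportional limit, the errors produced by a pipeline like this match those of a suitable polynomial (Hermite-truncated) model, with the polynomial degree governed by the data spread and the step size eta.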