A Derandomization Framework for Structure Discovery: Applications in Neural Networks and Beyond
By: Nikos Tsikouras , Yorgos Pantis , Ioannis Mitliagkas and more
Potential Business Impact:
Helps computers learn better with less data.
Understanding the dynamics of feature learning in neural networks (NNs) remains a significant challenge. The work of (Mousavi-Hosseini et al., 2023) analyzes a multiple index teacher-student setting and shows that a two-layer student attains a low-rank structure in its first-layer weights when trained with stochastic gradient descent (SGD) and a strong regularizer. This structural property is known to reduce sample complexity of generalization. Indeed, in a second step, the same authors establish algorithm-specific learning guarantees under additional assumptions. In this paper, we focus exclusively on the structure discovery aspect and study it under weaker assumptions, more specifically: we allow (a) NNs of arbitrary size and depth, (b) with all parameters trainable, (c) under any smooth loss function, (d) tiny regularization, and (e) trained by any method that attains a second-order stationary point (SOSP), e.g.\ perturbed gradient descent (PGD). At the core of our approach is a key $\textit{derandomization}$ lemma, which states that optimizing the function $\mathbb{E}_{\mathbf{x}} \left[g_{\theta}(\mathbf{W}\mathbf{x} + \mathbf{b})\right]$ converges to a point where $\mathbf{W} = \mathbf{0}$, under mild conditions. The fundamental nature of this lemma directly explains structure discovery and has immediate applications in other domains including an end-to-end approximation for MAXCUT, and computing Johnson-Lindenstrauss embeddings.
Similar Papers
Parameter-Free Structural-Diversity Message Passing for Graph Neural Networks
Machine Learning (CS)
Helps computers understand messy data better.
Parameter-Free Structural-Diversity Message Passing for Graph Neural Networks
Machine Learning (CS)
Helps computers understand complex connections better.
On the Interplay between Graph Structure and Learning Algorithms in Graph Neural Networks
Machine Learning (CS)
Helps computers learn better from connected data.