A Combinatorial Theory of Dropout: Subnetworks, Graph Geometry, and Generalization
By: Sahil Rajesh Dhayalkar
Potential Business Impact:
Could improve model generalization and efficiency by identifying and exploiting well-generalizing subnetworks within a trained network.
We propose a combinatorial and graph-theoretic theory of dropout by modeling training as a random walk over a high-dimensional graph of binary subnetworks. Each node represents a masked version of the network, and dropout induces stochastic traversal across this space. We define a subnetwork contribution score that quantifies generalization and show that it varies smoothly over the graph. Using tools from spectral graph theory, PAC-Bayes analysis, and combinatorics, we prove that generalizing subnetworks form large, connected, low-resistance clusters, and that their number grows exponentially with network width. This reveals dropout as a mechanism for sampling from a robust, structured ensemble of well-generalizing subnetworks with built-in redundancy. Extensive experiments validate every theoretical claim across diverse architectures. Together, our results offer a unified foundation for understanding dropout and suggest new directions for mask-guided regularization and subnetwork optimization.
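To make the setup concrete, here is a minimal sketch (in Python, with illustrative names; the paper's exact definitions of the mask graph and the contribution score are not reproduced here) of the subnetwork space that dropout traverses: binary masks over n units form the vertices of an n-dimensional hypercube, and neighboring masks differ in exactly one unit.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mask(n, keep_prob=0.5):
    """Each dropout step samples a binary mask, i.e. one vertex of {0,1}^n."""
    return (rng.random(n) < keep_prob).astype(np.int8)

def hamming(m_a, m_b):
    """Graph distance between two subnetworks; edges of the mask graph
    connect masks that differ in exactly one unit."""
    return int(np.sum(m_a != m_b))

def neighbors(mask):
    """All subnetworks one edge away: flip a single unit on or off."""
    flips = np.eye(len(mask), dtype=np.int8)
    return mask ^ flips  # each row is a neighboring mask

n = 8
m = sample_mask(n)
print("mask:", m)
print("degree of every vertex:", len(neighbors(m)))  # n, as in a hypercube
print("distance to a fresh dropout sample:", hamming(m, sample_mask(n)))
```

Note that under standard i.i.d. dropout, consecutive sampled masks need not be graph neighbors; the random-walk view is the paper's modeling choice rather than a property of the sampler, and this sketch only fixes the geometry that the walk lives on.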
Similar Papers
Analytic theory of dropout regularization
Machine Learning (Stat)
Develops an analytic theory of how dropout acts as a regularizer during learning.
Dropout Neural Network Training Viewed from a Percolation Perspective
Machine Learning (CS)
Analyzes dropout training as a percolation process over the network's connections.
Convergence, design and training of continuous-time dropout as a random batch method
Machine Learning (CS)
Studies the convergence, design, and training of continuous-time dropout viewed as a random batch method.