The Physics of Data and Tasks: Theories of Locality and Compositionality in Deep Learning
By: Alessandro Favero
Potential Business Impact:
Finds hidden patterns in data for smarter learning.
Deep neural networks have achieved remarkable success, yet our understanding of how they learn remains limited. These models can learn high-dimensional tasks, a feat that is generally statistically intractable because of the curse of dimensionality. This apparent paradox suggests that learnable data must have an underlying latent structure. What is the nature of this structure? How do neural networks encode and exploit it, and how does it quantitatively affect performance: for instance, how does generalization improve with the number of training examples? This thesis addresses these questions by studying the roles of locality and compositionality in data, tasks, and deep learning representations.
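As a rough illustration of the quantitative question raised above (a sketch with generic exponents, not results from the thesis): for an unstructured, e.g. merely Lipschitz, target in d dimensions the test error typically decays very slowly with the number of training examples n, whereas latent structure such as locality or compositionality can yield a decay exponent that does not vanish with the ambient dimension.

\[
  \epsilon(n) \sim n^{-1/d} \quad \text{(unstructured target: exponent vanishes as } d \text{ grows)}
  \qquad \text{vs.} \qquad
  \epsilon(n) \sim n^{-\beta}, \ \beta = O(1) \quad \text{(latent structure)}
\]

Here \(\epsilon(n)\) denotes the generalization (test) error after training on n examples; the precise exponents depend on the task and architecture and are the kind of quantity the thesis aims to characterize.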
Similar Papers
Scale leads to compositional generalization
Machine Learning (CS)
Computers learn to combine ideas to do new tasks.
Position: A Theory of Deep Learning Must Include Compositional Sparsity
Machine Learning (CS)
Lets computers learn complex things by breaking them down.
Statistical Physics of Deep Neural Networks: Generalization Capability, Beyond the Infinite Width, and Feature Learning
Disordered Systems and Neural Networks
Explains how computer "brains" learn and remember.