A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition
By: Yuanpeng Li
Potential Business Impact:
Teaches computers to understand new word combinations.
Compositional generalization is a crucial property in artificial intelligence, enabling models to handle novel combinations of known components. While most deep learning models lack this capability, certain models succeed on specific tasks, suggesting that governing conditions exist. This paper derives a necessary and sufficient condition for compositional generalization in neural networks. Conceptually, the condition requires that (i) the computational graph match the true compositional structure of the task, and (ii) each component encode just enough information during training. The condition is supported by mathematical proofs, and it ties together architecture design, regularization, and properties of the training data. A carefully designed minimal example gives an intuitive understanding of the condition. We also discuss the condition's potential for assessing compositional generalization before training. This work is a fundamental theoretical study of compositional generalization in neural networks.
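To make the two conceptual requirements concrete, the following is a minimal sketch (not the paper's construction) of a toy model whose computational graph mirrors a two-part compositional structure, in the spirit of requirement (i). The task setup, class names, and dimensions are all illustrative assumptions.

    # Minimal sketch, assuming a toy task whose inputs factor into two
    # slots (x_a, x_b). All names here are hypothetical, not the paper's.
    import torch
    import torch.nn as nn

    class CompositionalModel(nn.Module):
        """Computational graph decoder(enc_a(x_a), enc_b(x_b)):
        each component sees only its own slot, matching the true
        compositional structure of the toy task (requirement (i))."""
        def __init__(self, vocab_a, vocab_b, dim=8):
            super().__init__()
            # Small embeddings so each component can, in principle,
            # encode "just enough" information about its slot; whether
            # it actually does after training is requirement (ii).
            self.enc_a = nn.Embedding(vocab_a, dim)
            self.enc_b = nn.Embedding(vocab_b, dim)
            self.decoder = nn.Linear(2 * dim, vocab_a * vocab_b)

        def forward(self, x_a, x_b):
            # The graph composes slot-wise representations, so a novel
            # (x_a, x_b) pair reuses components trained on each part.
            z = torch.cat([self.enc_a(x_a), self.enc_b(x_b)], dim=-1)
            return self.decoder(z)

    model = CompositionalModel(vocab_a=5, vocab_b=5)
    logits = model(torch.tensor([0]), torch.tensor([3]))  # unseen combination

By contrast, a monolithic model that consumes the concatenated raw input with a single dense network has no graph structure to match the task's composition, so the condition offers no generalization guarantee for it.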
Similar Papers
Learning by Analogy: A Causal Framework for Composition Generalization
Machine Learning (CS)
Lets computers understand new ideas by breaking them down.
Scalable Evaluation and Neural Models for Compositional Generalization
Machine Learning (CS)
Teaches computers to understand new things from old.
Scale leads to compositional generalization
Machine Learning (CS)
Computers learn to combine ideas to do new tasks.