Disaggregation Reveals Hidden Training Dynamics: The Case of Agreement Attraction
By: James A. Michaelov, Catherine Arnett
Potential Business Impact:
Reveals when language models shift from shortcut heuristics to genuine grammatical rules during training.
Language models generally produce grammatical text, but they are more likely to make errors in certain contexts. Drawing on paradigms from psycholinguistics, we carry out a fine-grained analysis of those errors in different syntactic contexts. We demonstrate that by disaggregating over the conditions of carefully constructed datasets and comparing model performance on each over the course of training, it is possible to better understand the intermediate stages of grammatical learning in language models. Specifically, we identify distinct phases of training where language model behavior aligns with specific heuristics such as word frequency and local context rather than generalized grammatical rules. We argue that taking this approach to analyzing language model behavior more generally can serve as a powerful tool for understanding the intermediate learning phases, overall training dynamics, and the specific generalizations learned by language models.
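To make the method concrete, here is a minimal sketch of what condition-disaggregated evaluation across training checkpoints could look like. It is not the authors' code: the Pythia model and step revisions are an assumption about how one might access intermediate checkpoints (Pythia exposes them via the Hugging Face `revision` argument), and the agreement-attraction stimuli, the `STIMULI` dictionary, and the helper functions `continuation_logprob` and `accuracy_by_condition` are illustrative stand-ins for the paper's actual materials.

```python
# A minimal sketch of condition-disaggregated evaluation over training.
# Assumptions (not from the paper): Pythia checkpoints accessed via the
# Hugging Face `revision` argument, and toy agreement-attraction stimuli.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Toy agreement-attraction items: each condition crosses subject number with
# the number of an intervening "attractor" noun. The grammatical continuation
# agrees with the subject, never the attractor.
STIMULI = {
    "sing_subj_sing_attractor": ("The key to the cabinet", " is", " are"),
    "sing_subj_plur_attractor": ("The key to the cabinets", " is", " are"),
    "plur_subj_sing_attractor": ("The keys to the cabinet", " are", " is"),
    "plur_subj_plur_attractor": ("The keys to the cabinets", " are", " is"),
}

def continuation_logprob(model, tokenizer, prefix, continuation):
    """Summed log-probability of `continuation` given `prefix`."""
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
    cont_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    total = 0.0
    # Each continuation token is predicted from the preceding position.
    for i in range(cont_ids.shape[1]):
        pos = prefix_ids.shape[1] + i - 1
        total += log_probs[0, pos, cont_ids[0, i]].item()
    return total

def accuracy_by_condition(checkpoint, revision):
    """1.0 per condition if the grammatical verb is preferred (one toy item
    each here; a real evaluation would average over many items)."""
    tokenizer = AutoTokenizer.from_pretrained(checkpoint, revision=revision)
    model = AutoModelForCausalLM.from_pretrained(checkpoint, revision=revision)
    model.eval()
    results = {}
    for condition, (prefix, gram, ungram) in STIMULI.items():
        good = continuation_logprob(model, tokenizer, prefix, gram)
        bad = continuation_logprob(model, tokenizer, prefix, ungram)
        results[condition] = float(good > bad)
    return results

if __name__ == "__main__":
    # Report each condition separately at several training steps rather than
    # a single aggregate accuracy.
    for step in ["step1000", "step10000", "step143000"]:
        scores = accuracy_by_condition("EleutherAI/pythia-70m", revision=step)
        print(step, scores)
```

Plotting these per-condition scores across checkpoints is what makes intermediate phases visible: a model relying on the local-context heuristic the abstract describes would, for example, lag on the mismatch conditions (singular subject with plural attractor) while already succeeding on the matched ones.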
Similar Papers
Different types of syntactic agreement recruit the same units within large language models
Computation and Language
Shows that different kinds of syntactic agreement are handled by overlapping units inside large language models.
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
Computation and Language
Finds that language models pass through similar behavioral phases regardless of architecture, training data, or scale.
Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics
Computation and Language
Examines the forces that push models to unify what they learn across languages as they generalize.