BabyLM's First Constructions: Causal probing provides a signal of learning
By: Joshua Rozner, Leonie Weissweiler, Cory Shain
Potential Business Impact:
Models learn language rules from less data.
Construction grammar posits that language learners acquire constructions (form-meaning pairings) from the statistics of their environment. Recent work supports this hypothesis by showing sensitivity to constructions in pretrained language models (PLMs), including one recent study (Rozner et al., 2025) demonstrating that constructions shape RoBERTa's output distribution. However, the models under study have generally been trained on developmentally implausible amounts of data, casting doubt on their relevance to human language learning. Here we use Rozner et al.'s methods to evaluate construction learning in masked language models from the 2024 BabyLM Challenge. Our results show that even when trained on developmentally plausible quantities of data, models learn diverse constructions, including hard cases that are superficially indistinguishable. We further find correlational evidence that constructional performance may be functionally relevant: models that better represent constructions perform better on the BabyLM benchmarks.
Similar Papers
Do Construction Distributions Shape Formal Language Learning In German BabyLMs?
Computation and Language
Helps computers learn language like babies.
LLMs Learn Constructions That Humans Do Not Know
Computation and Language
Finds AI makes up fake grammar rules.
Evaluating CxG Generalisation in LLMs via Construction-Based NLI Fine Tuning
Computation and Language
Helps computers understand sentence structure better.