Score: 3

BabyLM's First Constructions: Causal probing provides a signal of learning

Published: June 2, 2025 | arXiv ID: 2506.02147v2

By: Joshua Rozner, Leonie Weissweiler, Cory Shain

BigTech Affiliations: Stanford University

Potential Business Impact:

Models learn language rules from less data.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Construction grammar posits that language learners acquire constructions (form-meaning pairings) from the statistics of their environment. Recent work supports this hypothesis by showing sensitivity to constructions in pretrained language models (PLMs), including one recent study (Rozner et al., 2025) demonstrating that constructions shape RoBERTa's output distribution. However, models under study have generally been trained on developmentally implausible amounts of data, casting doubt on their relevance to human language learning. Here we use Rozner et al.'s methods to evaluate construction learning in masked language models from the 2024 BabyLM Challenge. Our results show that even when trained on developmentally plausible quantities of data, models learn diverse constructions, even hard cases that are superficially indistinguishable. We further find correlational evidence that constructional performance may be functionally relevant: models that better represent constructions perform better on the BabyLM benchmarks.
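For readers unfamiliar with probing a masked language model's output distribution on a construction slot, below is a minimal sketch assuming the Hugging Face transformers library. The model name, sentences, and comparison are illustrative only; they are not the paper's actual stimuli, causal probing protocol, or evaluation code.

```python
# Hedged sketch: inspect a masked LM's fill-in distribution for a slot in a
# construction versus a superficially similar non-constructional control.
# Model and example sentences are assumptions for illustration.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")  # RoBERTa uses the <mask> token

# Comparative-correlative construction ("the X-er, the Y-er") vs. a paraphrase
# that does not instantiate the construction.
constructional = "The more you practice, the <mask> you get."
control = "You practice a lot, and you get very <mask>."

for text in (constructional, control):
    print(text)
    # top_k fillers with their probabilities under the model's output distribution
    for pred in fill(text, top_k=5):
        print(f"  {pred['token_str'].strip():<12} p={pred['score']:.3f}")
```

Comparing the ranked fillers and their probabilities across such minimal pairs is one simple way to see whether the constructional frame shifts the model's output distribution.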

Country of Origin
🇺🇸 🇸🇪 United States, Sweden

Repos / Data Links

Page Count
13 pages

Category
Computer Science:
Computation and Language